delphij [Thu, 18 Dec 2014 23:45:26 +0000 (23:45 +0000)]
MFV r275914:
As of r270383, the dbuf_compare comparator compares the dbuf
attributes in the following order:
db_level (indirect level)
db_blkid (block number)
db_state (current state)
the address of the element
Because db_state is being considered before the element's state,
changing of db_state would affect balancedness of the AVL tree,
even when the address of element compares differently. For
instance, in dbuf_create, db_state may be altered after the
node is inserted into the AVL tree and may break AVL tree
balancedness.
Instead of using db_state as a comparision critera (introduced
in r270383), consider it only when we are doing a lookup, that
is one of the two dbuf pointers contains DB_SEARCH.
Illumos issue:
5422 preserve AVL invariants in dn_dbufs
ngie [Thu, 18 Dec 2014 18:16:00 +0000 (18:16 +0000)]
Fix building/installing tests when TESTSBASE != /usr/tests
The work in r258233 hardcoded the assumption that tests was the last component
of the tests tree by pushing tests as an explicit prefix for the paths in
BSD.tests.dist and /usr was the prefix for all tests, per BSD.usr.dist and all
of the mtree calls used in Makefile.inc1. This assumption breaks if/when one
provides a custom TESTSBASE "prefix", e.g. TESTSBASE=/mytests .
One thing that r258233 did properly though was remove "/usr/tests" creation
from BSD.usr.dist -- that should have not been there in the first place. That
was an "oops" on my part for the work that was originally committed in r241823
jamie [Thu, 18 Dec 2014 18:10:39 +0000 (18:10 +0000)]
Setgid before running a command as a specified user. Previously only
initgroups(3) was called, what isn't quite enough. This brings jail(8)
in line with jexec(8), which was already doing the right thing.
imp [Thu, 18 Dec 2014 16:57:22 +0000 (16:57 +0000)]
Don't deselect the card too soon. To set the block size or switch the
function parameters, the card has to be in transfer state. If it is in
the idle state, the commands are ignored. This caused us not to set
the proper parameters that we later assume to be present, leading to
downstream failures of the card / interface as our state machine
mismatches the card's.
Submitted by: Svatopluk Kraus <onwahe at gmail.com>, Michal Meloun
<meloun at miracle.cz>
kib [Thu, 18 Dec 2014 10:01:12 +0000 (10:01 +0000)]
The VOP_LOOKUP() implementations for CREATE op do not put the name
into namecache, to avoid cache trashing when doing large operations.
E.g., tar archive extraction is not usually followed by access to many
of the files created.
Right now, each VOP_LOOKUP() implementation explicitely knowns about
this quirk and tests for both MAKEENTRY flag presence and op != CREATE
to make the call to cache_enter(). Centralize the handling of the
quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP.
VFS now sets NOCACHE flag for CREATE namei() calls.
Note that the change in semantic is backward-compatible and could be
merged to the stable branch, and is compatible with non-changed
third-party filesystems which correctly handle MAKEENTRY.
Suggested by: Chris Torek <torek@pi-coral.com>
Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
adrian [Thu, 18 Dec 2014 05:17:18 +0000 (05:17 +0000)]
Fix the scan handling for 11b->11g upgrades in a world where, well,
it's not just 11b/11g.
The following was happening, and it's quite .. annoyingly grr-y.
* create vap, setup wpa_supplicant with no bgscanning, etc - there's
no call to ieee80211_media_change, so vap->iv_des_mode is
IEEE80211_MODE_AUTO;
* do ifconfig wlan0 scan - same thing, media_change doesn't get called,
iv_des_mode stays as auto.
* But then, run wpa_cli and do 'scan' - it'll do a media change.
* if you're on 11ng, vap->iv_des_mode gets changed to IEEE80211_MODE_11NG
* Then makescanlist() is called. There's a block of code that gets
called if iv_des_mode != IEEE80211_MODE_AUTO, and it does this:
* .. now, iv_des_mode is not IEEE80211_MODE_11G, so it always runs
'continue'
* .. and thus the scan list stays empty and no further channel
scans occur. Ever.(1)
If you then disassociate and try associating to something, your
scan table has likely been purged / aged out and you'll never
see anything in the scan list.
(1) You need to do 'ifconfig wlan0 mode auto' or just destroy/re-create
the VAP to get working wireless again.
Tested:
* iwn(4) - intel 5300 wifi; STA mode; using wpa_supplicant; bgscan
enabled -and- wpa_supplicant scanning.
Thanks to:
* Everyone who kept poking me about this and wondering why the hell
their wifi would eventually stop seeing scan lists. Grr.
I eventually snapped this evening and dug back into this code.
dteske [Thu, 18 Dec 2014 03:51:09 +0000 (03:51 +0000)]
In bsdinstall's distextract, replace mixed_gauge() of dialog(3) with
new dpv(3) wrapper to dialog(3) dialog_gauge(). The dpv(3) library provides
a more flexible and refined interface similar to dialog_mixedgauge() however
is implemented atop the more generalized dialog_gauge() for portability.
Noticeable improvements in bsdinstall's distextract will be a status line
showing data rate information (with support for localeconv(3) to format
numbers according to $LANG or $LC_ALL conversion information), i18n support,
improved auto-sizing of gauge widget, a ``wheel barrow'' to keep the user
informed that things are moving (even if status/progress has not changed),
improved color support (mini-progress bars use the same color, if enabled,
as the main gauge bar), and several other improvements (some not visible).
dpv stands for "dialog progress view" (dpv was introduced in SVN r274116).
Differential Revision: https://reviews.freebsd.org/D714
Discussed on: -current
Reviewed by: julian
MFC after: 3 days
X-MFC-to: stable/10
Relnotes: Improved installer feedback from bsdinstall distextract
ngie [Wed, 17 Dec 2014 20:02:07 +0000 (20:02 +0000)]
Fix sporadic build failures due to race when running make installworld
when strip gets replaced at install time by adding it to ITOOLS for the
default usr.bin/xinstall STRIP_CMD
This will fix the failure noted in this Jenkins build step:
https://jenkins.freebsd.org/job/Build-UFS-image/688/
This will also fix the issue reported by alfred@ dealing with installing on
targets that differ from build hosts (e.g. installing on i386/i386 when built
on amd64/amd64)
mav [Wed, 17 Dec 2014 17:30:54 +0000 (17:30 +0000)]
Add configuration options to override physical and UNMAP blocks geometry.
While in most cases CTL should correctly fetch those values from backing
storages, there are some initiators (like MS SQL), that may not like large
physical block sizes, even if they are true. For such cases allow override
fetched values with supported ones (like 4K).
mav [Wed, 17 Dec 2014 15:13:21 +0000 (15:13 +0000)]
Make sequence numbers checks more strict.
While we don't support MCS, hole in received sequence numbers may mean
only PDU loss. While we don't support lost PDU recovery, terminate the
connection to avoid stuck commands.
While there, improve handling of sequence numbers wrap after 2^32 PDUs.
emaste [Wed, 17 Dec 2014 14:46:21 +0000 (14:46 +0000)]
Do not strip all when stripping an explicit symbol
When requested to strip specific symbols (-N flag) the default should be
to strip nothing (other than the requested symbols). This is consistent
with binutils strip(1).
PR: 196038
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1327
will [Wed, 17 Dec 2014 00:22:41 +0000 (00:22 +0000)]
Initialize an argument to NULL instead of expecting dlinfo() to do it.
dlinfo() is a weak reference that may not be initialized at the time of
execution. The default implementation (in lib/libc/gen/dlfcn.c) neither
modifies the address pointed to by the third argument nor returns an error.
kib [Tue, 16 Dec 2014 18:28:33 +0000 (18:28 +0000)]
The iret instruction may generate #np and #ss fault, besides #gp.
When returning to usermode, the handler for that exceptions is also
executed with wrong gs base. Handle all three possible faults in the
same way, checking for iret fault, and performing full iret.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
will [Tue, 16 Dec 2014 17:59:05 +0000 (17:59 +0000)]
Make NanoBSD source-able from other scripts.
Summary:
This change converts NanoBSD into a two-script bundle.
- defaults.sh contains all non-CLI code. Most NanoBSD code is moved into
this file.
- nanobsd.sh now consists just of a command line interface that calls into
functions in defaults.sh.
Test Plan: Run NanoBSD using a previously-working configuration.
ae [Tue, 16 Dec 2014 14:59:20 +0000 (14:59 +0000)]
Add ability to not specify a zone identifier twice, when both source and
destination addresses are specified.
For example:
# ping6 -S fe80::1%ix0 ff02::1
or
# ping6 -S fe80::1 fe80::2%ix0
ed [Tue, 16 Dec 2014 09:21:56 +0000 (09:21 +0000)]
Rename cpack*() to CMPLX*().
The C11 standard introduced a set of macros (CMPLX, CMPLXF, CMPLXL) that
can be used to construct complex numbers from a pair of real and
imaginary numbers. Unfortunately, they require some compiler support,
which is why we only define them for Clang and GCC>=4.7.
The cpack() function in libm performs the same task as CMPLX(), but
cannot be used to generate compile-time constants. This means that all
invocations of cpack() can safely be replaced by C11's CMPLX(). To keep
the code building with GCC 4.2, provide copies of CMPLX() that can at
least be used to generate run-time complex numbers.
This makes it easier to build some of the functions outside of libm.
neel [Tue, 16 Dec 2014 06:33:57 +0000 (06:33 +0000)]
For level triggered interrupts clear the PIC IRR bit when the interrupt pin
is deasserted. Prior to this change each assertion on a level triggered irq
pin resulted in two interrupts being delivered to the CPU.
yongari [Tue, 16 Dec 2014 06:13:30 +0000 (06:13 +0000)]
Fix a bug introdiced in r217548. According to NS DP83815 data
sheet, RX filter should be disabled before programming.
Previously it was clearing wrong bits so RX filter was not
disabled in RX filter configuration.
delphij [Mon, 15 Dec 2014 18:28:22 +0000 (18:28 +0000)]
MFV r275784:
Plug a memory leak in libzfs. In zfs_iter_bookmarks, an nvlist is allocated
before calling lzc_get_bookmarks, which allocates the nvlist again (and
overwrites the pointer to previously allocated list).
Illumos issue:
5427 memory leak in libzfs when doing rollback
delphij [Mon, 15 Dec 2014 18:22:45 +0000 (18:22 +0000)]
MFV r275783:
Convert ARC flags to use enum. Previously, public flags are defined in
arc.h and private flags are defined in arc.c which can lead to confusion
and programming errors.
Consistently use 'hdr' (when referencing arc_buf_hdr_t) instead of 'buf'
or 'ab' because arc_buf_t are often named 'buf' as well.
Illumos issue:
5369 arc flags should be an enum
5370 consistent arc_buf_hdr_t naming scheme
Calculate the segment's memory size (p_memsz) using the virtual
addresses, not the file offsets. Otherwise padding preceeding SHT_NOBITS
sections may be excluded from the calculation, resulting in a segment
that is too small.
emaste [Mon, 15 Dec 2014 14:25:42 +0000 (14:25 +0000)]
Remove empty generated file upon gperf failure
Prior to this change the build could fail as follows, if gperf is not
available (or fails):
- make(1) stops due to the gperf error, but an empty target file
(cfns.h) is still created
- the empty cfns.h is newer than the source cfns.gperf so it is not
regenerated on subsequent builds
- the gcc build fails (undefined reference to libc_name_p)
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
kib [Mon, 15 Dec 2014 12:01:42 +0000 (12:01 +0000)]
Add a facility for non-init process to declare itself the reaper of
the orphaned descendants. Base of the API is modelled after the same
feature from the DragonFlyBSD.
Requested by: bapt
Reviewed by: jilles (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
delphij [Mon, 15 Dec 2014 07:52:23 +0000 (07:52 +0000)]
MFV r275551:
Remove "dbuf phys" db->db_data pointer aliases.
Use function accessors that cast db->db_data to the appropriate
"phys" type, removing the need for clients of the dmu buf user
API to keep properly typed pointer aliases to db->db_data in order
to conveniently access their data.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c:
In zap_leaf() and zap_leaf_byteswap, now that the pointer alias
field l_phys has been removed, use the db_data field in an on
stack dmu_buf_t to point to the leaf's phys data.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:
Remove the db_user_data_ptr_ptr field from dbuf and all logic
to maintain it.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:
Modify the DMU buf user API to remove the ability to specify
a db_data aliasing pointer (db_user_data_ptr_ptr).
cddl/contrib/opensolaris/cmd/zdb/zdb.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_leaf.h:
Create and use the new "phys data" accessor functions
dsl_dir_phys(), dsl_dataset_phys(), zap_m_phys(),
zap_f_phys(), and zap_leaf_phys().
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_leaf.h:
Remove now unused "phys pointer" aliases to db->db_data
from clients of the DMU buf user API.
delphij [Mon, 15 Dec 2014 04:51:36 +0000 (04:51 +0000)]
MFV r275549:
Add a loader tunable, vfs.zfs.arc_meta_min, which controls how much metadata
ZFS should keep in ARC at minimum.
In arc_evict(), when doing recycle, take more factors into account by
applying the following policy:
1. If no evictable data, evict metadata;
2. If no evictable metadata, evict data;
3. If we hit arc_meta_limit, evict metadata;
4. If we haven't hit arc_meta_min, evict data;
5* (Illumos only, not present in new FreeBSD code, yet) evict the oldest
cached element from data and metadata.
(FreeBSD) evict the data type specified by caller, which is the
existing behavior.
Note that because of our splitted locks (implemented in r205231 to improve
scalability by reducing lock contention), implementing the fifth Illumos
behavior will not be cheap, so for now just implement the 1-4 and fall back
to current behavior for 5.
Illumos issue:
5368 ARC should cache more metadata
MFC after: 2 months (assuming we didn't found better solution)
kib [Sat, 13 Dec 2014 16:18:29 +0000 (16:18 +0000)]
Add facility to stop all userspace processes. The supposed use of the
feature is to quisce the system before suspend.
Stop is implemented by reusing the thread_single(9) with the special
mode SINGLE_ALLPROC. SINGLE_ALLPROC differs from the existing
single-threading modes by allowing (requiring) caller to operate on
other process. Interruptible sleeps for !TDF_SBDRY threads are
suspended like SIGSTOP does it, instead of aborting the sleep, like
SINGLE_NO_EXIT, to avoid spurious EINTRs on resume.
Provide debugging sysctl debug.stop_all_proc, which causes total stop
and suspends syncer, while waiting for variable reset for resume. It
is used for debugging; should be removed after the real use of the
interface is added.
In collaboration with: pho
Discussed with: avg
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
kib [Sat, 13 Dec 2014 16:07:01 +0000 (16:07 +0000)]
Only sleep interruptible while waiting for suspension end when
filesystem specified VFCF_SBDRY flag, i.e. for NFS.
There are two issues with the sleeps. First, applications may get
unexpected EINTR from the disk i/o syscalls. Second, interruptible
sleep allows the stop of the process, and since mount point is
referenced while thread sleeps, unmount cannot free mount point
structure' memory, blocking unmount indefinitely.
Even for NFS, it is probably only reasonable to enable PCATCH for intr
mounts, but this information is currently not available at VFS level.
Reported and tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Sat, 13 Dec 2014 16:02:37 +0000 (16:02 +0000)]
The vinactive() call in vgonel() may start writes for the dirty pages,
creating delayed write buffers belonging to the reclaimed vnode. Put
the buffer cleanup code after inactivation.
Add asserts that ensure that buffer queues are empty and add BO_DEAD
flag for bufobj to check that no buffers are added after the cleanup.
BO_DEAD is only used by INVARIANTS-enabled kernels.
Reported and tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
delphij [Sat, 13 Dec 2014 02:08:18 +0000 (02:08 +0000)]
MFV r275548:
Verify that the block pointer is structurally valid, before attempting to
read it in. It can only be invalid in the case of a ZFS bug, but this
change will help identify such bugs in a more transparent way, by
panic'ing with a relevant message, rather than indexing off the end of an
array or something.
Illumos issue:
5349 verify that block pointer is plausible before reading
delphij [Sat, 13 Dec 2014 01:39:24 +0000 (01:39 +0000)]
MFV r275546:
Reduce scrub activities when system there is enough dirty data, namely when
dirty data is more than zfs_vdev_async_write_active_min_dirty_percent (once
we start to increase the number of concurrent async writes).
While there also correct rounding error which would make scrub end up
pausing for (zfs_txg_timeout + 1) seconds instead of the desired
zfs_txg_timeout seconds.
Illumos issue:
5351 scrub goes for an extra second each txg
5352 scrub should pause when there is some dirty data