This fixes a bug in the llvm vectorizer, which could sometimes cause
vectorized loops to perform an additional iteration, leading to possible
buffer overruns. Symptoms of this, which are usually segfaults, were
first noticed when building gcc ports, here:
ngie [Mon, 8 Dec 2014 18:29:20 +0000 (18:29 +0000)]
Add makewhatis to ITOOLS if MK_MAN != no
This will fix installation with differing host targets in installworld, so
one can build i386/i386 on an amd64 host, then install to an i386/i386 target
Reported by: alfred
Phabric: D1280
MFC after: 1 week
kib [Mon, 8 Dec 2014 16:48:57 +0000 (16:48 +0000)]
Add functions syncer_suspend() and syncer_resume(), which are supposed
to be called before suspension and after resume, correspondingly. The
syncer_suspend() ensures that all filesystems dirty data and metadata
are saved to the permanent storage, and stops kernel threads which
might modify filesystems. The syncer_resume() restores stopped
threads.
For now, only syncer is stopped. This is needed, because each sync
loop causes superblock updates for UFS.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Mon, 8 Dec 2014 16:42:34 +0000 (16:42 +0000)]
When getnewbuf_reuse_bp() is called to reclaim some (clean) buffer,
the vnode owning the buffer is not locked. More, it cannot be locked
safely, since getnewbuf_reuse_bp() is called from newbuf(), and some
other vnode is already locked, for which reused buffer will be
reassigned.
As the consequence, reclamation of the owning vnode could go in
parallel, in particular, the call to vnode_destroy_vobject(), which
deallocates the vm object and zeroes the v_bufobj->bo_object. Note
that the pages wired by the buffer are left wired and can be safely
freed by the vfs_vmio_release() without the need for the vm object
lock. Also, seeing stale pointer to the v_object is safe due to vm
object type stability.
Check for bo_bufobj != NULL and cache the value in local variable to
avoid trying to lock NULL vm object.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Mon, 8 Dec 2014 16:33:18 +0000 (16:33 +0000)]
Current reaction of the nfsd worker threads to any signal is exit.
This is not correct at least for the stop requests. Check for stop
conditions and suspend threads if requested.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Mon, 8 Dec 2014 16:27:43 +0000 (16:27 +0000)]
Do some refactoring and minor cleanups of the thread_single() code in
preparation for the global stop commit.
Move the code to weed suspended or sleeping threads into the
appropriate state, into the helper weed_inhib(). Current code already
has deep nesting and hard to follow [1].
Add currently useless helper remain_for_mode(), which returns the
count of threads which are allowed to run, according to the
single-threading mode.
In thread_single_end(), do not save curthread into local variable, it
is unused after, except to find curproc.
Remove stray empty line.
Requested by: avg [1]
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Mon, 8 Dec 2014 16:18:05 +0000 (16:18 +0000)]
Thread waiting for the vfork(2)-ed child to exec or exit, must allow
for the suspension.
Currently, the loop performs uninterruptible cv_wait(9) call, which
prevents suspension until child allows further execution of parent.
If child is stopped, suspension or single-threading is delayed
indefinitely.
Create a helper thread_suspend_check_needed() to identify the need for
a call to thread_suspend_check(). It is required since call to the
thread_suspend_check() cannot be safely done while owning the child
(p2) process lock. Only when suspension is needed, drop p2 lock and
call thread_suspend_check(). Perform wait for cv with timeout, in
case suspend is requested after wait started; I do not see a better
way to interrupt the wait.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Mon, 8 Dec 2014 16:02:02 +0000 (16:02 +0000)]
When process is exiting, check for suspension regardless of
multithreaded status of the process.
The stopped state must be cleared before P_WEXIT is set. A stop
signal delivered just before first PROC_LOCK() block in exit1(9) would
put the process into pending stop with P_WEXIT set or assertion
triggered. Also recheck for the suspension after failed
thread_single(9) call, since process lock could be dropped.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
pfg [Mon, 8 Dec 2014 15:10:48 +0000 (15:10 +0000)]
patch(1): avoid line number overflows
Introduce strtolinenum to properly check line numbers while parsing:
no signs, no spaces, just digits, 0 <= x <= LONG_MAX
Properly validate line ranges supplied in diff file to prevent overflows.
Also fixes an out of boundary memory access because the resulting values
are used as array indices.
rodrigc [Mon, 8 Dec 2014 07:25:59 +0000 (07:25 +0000)]
Use CURVNET macros inside inet_get_local_port_range() function.
Without this fix, a kernel with VIMAGE + Infiniband will panic on bootup.
Certain necessary #include statements require LIST_HEAD.
Add these includes to ofed/include/linux/list.h, because
LIST_HEAD is specifically overridden in this file.
delphij [Mon, 8 Dec 2014 06:10:47 +0000 (06:10 +0000)]
Use calloc() instead of malloc() + bzero(). This also gets rid of a warning
because bzero is defined by strings.h which is not included in thread_pool.c.
delphij [Mon, 8 Dec 2014 06:04:42 +0000 (06:04 +0000)]
MFV r275540:
When importing a pool, don't assume that the passed pool configuration
at vdev_load is always vaild. It's possible that a stale configuration
that comes with extra vdevs, where metaslab_init() would fail because
of lower layer returns error.
Change the code to make metaslab_init() handle and return errors from
lower layer and pass it back to upper layer and handle it there.
Illumos issue:
5213 panic in metaslab_init due to space_map_open returning ENXIO
markj [Mon, 8 Dec 2014 04:44:40 +0000 (04:44 +0000)]
Add refcounting to IPv6 DAD objects and simplify the DAD code to fix a
number of races which could cause double frees or use-after-frees when
performing DAD on an address. In particular, an IPv6 address can now only be
marked as a duplicate from the DAD callout.
avg [Sun, 7 Dec 2014 11:21:41 +0000 (11:21 +0000)]
remove opensolaris cyclic code, replace with high-precision callouts
In the old days callout(9) had 1 tick precision and that was inadequate
for some uses, e.g. DTrace profile module, so we had to emulate cyclic
API and behavior. Now we can directly use callout(9) in the very few
places where cyclic was used.
mav [Sat, 6 Dec 2014 20:39:25 +0000 (20:39 +0000)]
Count consecutive read requests as blocking in CTL for files and ZVOLs.
Technically read requests can be executed in any order or simultaneously
since they are not changing any data. But ZFS prefetcher goes crasy when
it receives consecutive requests from different threads. Since prefetcher
works on level of separate blocks, instead of two consecutive 128K requests
it may receive 32 8K requests in mixed order.
This patch is more workaround then a real fix, and it does not fix all of
prefetcher problems, but it improves sequential read speed by 3-4x times
in some configurations. On the other side it may hurt performance if
some backing store has no prefetch, that is why it is disabled by default
for raw devices.
delphij [Sat, 6 Dec 2014 00:38:55 +0000 (00:38 +0000)]
4181 zfs(1m): 'zfs allow' examples in the man page are outdated
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Marcel Telka <marcel.telka@nexenta.com>
delphij [Sat, 6 Dec 2014 00:17:25 +0000 (00:17 +0000)]
5312 libzfs should be decoupled from kernel's zfs_context.h
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Will Andrews <willa@spectralogic.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Justin T. Gibbs <justing@spectralogic.com>
cxgbe(4): Allow for different pad and pack boundaries for different
adapters. Set the pack boundary for T5 cards to be the same as the
PCIe max payload size. The chip likes it this way.
In this revision the driver allocate rx buffers that align on both
boundaries. This is not a strict requirement and a followup commit
will switch the driver to a more relaxed allocation strategy.
delphij [Sat, 6 Dec 2014 00:10:47 +0000 (00:10 +0000)]
5316 allow smbadm join to use RPC
Reviewed by: Bayard Bell <bayard.bell@nexenta.com>
Reviewed by: Dan McDonald <danmcd@nexenta.com>
Reviewed by: Thomas Keiser <thomas.keiser@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Gordon Ross <gwr@nexenta.com>
delphij [Sat, 6 Dec 2014 00:01:19 +0000 (00:01 +0000)]
3363 Mark non-returning functions in ctftools
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Approved by: Albert Lee <trisk@omniti.com>
Author: Erik Cederstrand <erik@cederstrand.dk>
andrew [Fri, 5 Dec 2014 19:14:05 +0000 (19:14 +0000)]
Place the literal pool after a RET otherwise clang 3.5 tries to put it too
far away from a ldr psuedo instruction. With this clang will place the
literal value here where it's close enough to be loaded.
kargl [Fri, 5 Dec 2014 19:00:55 +0000 (19:00 +0000)]
Update the constants associated with the evaluation of j0f(x)
for |x| small.
While here, remove the explicit cast of 0.25 to float. Replace
a multiplication involving 0.25 by a division using an integer
constant 4. Make a similar change in j0() to minimize the diff.
delphij [Fri, 5 Dec 2014 18:29:01 +0000 (18:29 +0000)]
Fix a regression introduced in r274337 (large block support)
In dsl_dataset_hold_obj() we used zap_contains(.., DS_FIELD_LARGE_BLOCKS)
to determine whether the extensible (zapifyed) dataset have large blocks.
The code expects the result be either 0 (found) or ENOENT (not found),
however reused the variable 'err' which later code expects to be 0.
Fix this by adopting similar code construct that is used later for
DS_FIELD_BOOKMARK_NAMES, which uses a temporary variable zaperr to catch
errors from zap_* rountines.
Reported by: Peter J. Creath (on FreeNAS; FreeNAS bug #6848)
Illumos issue: 5393 spurious failures from dsl_dataset_hold_obj()
Reviewed by: mahrens
Sponsored by: iXsystems, Inc.
X-MFC with: r274337
jhb [Fri, 5 Dec 2014 15:24:42 +0000 (15:24 +0000)]
Always ignore the deprecated MAP_RENAME and MAP_NORESERVE flags to mmap().
Some old libraries may be used even with newer binaries (specifically the
Nvidia driver libraries).
kib [Fri, 5 Dec 2014 15:02:30 +0000 (15:02 +0000)]
When the last reference on the vnode' vm object is dropped, read the
vp->v_vflag without taking vnode lock and without bypass. We do know
that vp is the lowest level in the stack, since the pointer is
obtained from the object' handle. Stale VV_TEXT flag read can only
happen if parallel execve() is performed and not yet activated the
image, since process takes reference for text mapping. In this case,
the execve() code manages the VV_TEXT flag on its own already.
It was observed that otherwise read-only sendfile(2) requires
exclusive vnode lock and contending on it on some loads for VV_TEXT
handling.
Reported by: glebius, scottl
Tested by: glebius, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
mav [Thu, 4 Dec 2014 18:37:42 +0000 (18:37 +0000)]
Add to CTL support for threshold notifications for file-backed LUNs.
Previously it was supported only for ZVOL-backed LUNs, but now should work
for file-backed LUNs too. Used value in this case is a space occupied by
the backing file, while available value is an available space on file
system. Pool thresholds are still not implemented in this case.
kargl [Thu, 4 Dec 2014 15:57:58 +0000 (15:57 +0000)]
Fix a 20+ year bug by using an appropriate constant for
the transition from one asymptotic approximation to another
for the zeroth order Bessel and Neumann functions.
mav [Thu, 4 Dec 2014 11:34:19 +0000 (11:34 +0000)]
Add GET LBA STATUS command support to CTL.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files.
GEOM has no such API, so for LUNs backed by raw devices all LBAs will
be reported as mapped/unknown.
hselasky [Wed, 3 Dec 2014 21:55:44 +0000 (21:55 +0000)]
Optimise the bit searching loops, by quickly skipping the 16 first set
bits if all the 16 first bits are set. This way the worst case
searching time is reduced from 32 to 16 cycles.
hselasky [Wed, 3 Dec 2014 21:48:30 +0000 (21:48 +0000)]
Workaround for possible bug in the SAF1761 chip. Wait 125us before
re-using a hardware propritary transfer descriptor, PTD, in USB host
mode. If the PTD's are recycled too quickly, it has been observed that
the hardware simply fails to schedule the requested job or resets
completely disconnecting all devices.
jhb [Wed, 3 Dec 2014 15:29:53 +0000 (15:29 +0000)]
Revert device_getenv_int() for now as it duplicates resource_int_value().
We should perhaps implement a device_getenv_*() and device_setenv_*() API
as a convenience wrapper on top of resource_*_value() and resource_set_*().
mav [Wed, 3 Dec 2014 15:16:18 +0000 (15:16 +0000)]
Do not pre-allocate UNIT ATTENTIONs storage for every possible initiator.
Abusing ability of major UAs cover minor ones we may not account UAs for
inactive ports. Allocate UAs storage for port and start accounting only
after some initiator from that port fetched its first POWER ON OCCURRED.
This reduces per-LUN CTL memory usage from >1MB to less then 100K.
mav [Wed, 3 Dec 2014 09:05:53 +0000 (09:05 +0000)]
Do not pre-allocate reservation keys memory for every possible initiator.
In configurations with many ports, like iSCSI, each LUN is typically
accessed only by limited subset of ports. Allocating that memory on
demand allows to reduce CTL memory usage from 5.3MB/LUN to 1.3MB/LUN.