https://www.illumos.org/issues/7542
libshare keeps a cached copy of the sharetab listing in memory, which can
become out of date if shares are destroyed or created while leaving a libzfs
handle open. This results in a spurious unmounting failure when an NFS share
exists but isn't in the stale libshare cache.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Amdur <matt.amdur@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Chris Williamson <chris.williamson@delphix.com>
https://www.illumos.org/issues/7336
We can run into a problem where we call into zfs_mount, which in turn calls
is_dir_empty, which opens the directory to try and make sure it's empty. The
issue with the current approach is that it holds the directory open while it
traverses it with readdir, which, due to subtle interaction with the Java JVM,
vfork, and exec can cause a tricky race condition resulting in zfs_mount
failures.
The approach to resolving the issue in this patch is to drop the usage of
readdir altogether, and instead rely on the fact that ZFS stores the number of
entries contained in a directory using the st_size field of the stat structure.
Thus, if the directory in question is a ZFS directory, we can check to see if
it's empty by calling stat() and inspecting the st_size field of structure
returned.
===============================================================================
The root cause appears to be an interesting race between vfork, exec, and
zfs_mount's usage of O_CLOEXEC when calling openat. Here's what is going on:
1. We call zfs_mount, and this in turn calls openat to check if the directory
is empty, which results in opening the directory we're trying to mount onto,
and increment v_count.
2. As we're in the middle of reading the directory, vfork is called by the JVM
and proceeds to exec the jspawnhelper utility. As a result of the vfork, we
take an additional hold on the directory, which increments v_count a second
time. The semantics of vfork mean the parent process will wait for the child
process to exit or exec before the parent can continue; at this point the
parent is in the middle of zfs_mount, reading the directory to determine if
it's empty or not.
3. The child process exec-ing jspawnhelper gets to the relvm call within
exec_args (which is called by exec_common). relvm is the function that releases
the parent process, allowing the parent to proceed. The problem is, at this
point of calling relvm, the child hasn't yet called close_exec which is
responsible for closing the file descriptors inherited from the parent process
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>
https://www.illumos.org/issues/7233
This fixes a race where one thread is executing zfs_mount() while another
thread forks and execs. If the fork occurs while the directory is open, the
child process will inherit (but not necessarily close immediately) the open fd
for the directory, preventing the mount.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Alex Reece <alex@delphix.com>
https://www.illumos.org/issues/7502
Right now ztest executes zdb without -G, so when it has errors, the messages
are often not very helpful:
Executing zdb -bccsv -d -U /rpool/tmp/zpool.cache ztest
zdb: can't open 'ztest': Operation not supported
ztest: '/usr/sbin/amd64/zdb -bccsv -d -U /rpool/tmp/zpool.cache ztest' exit
code 1
With -G, we'd have:
/usr/sbin/amd64/zdb -bccsv -d -U /rpool/tmp/zpool.cache -G ztest
zdb: can't open 'ztest': Operation not supported
ZFS_DBGMSG(zdb):
spa_open_common: opening ztest
spa_load(ztest): LOADING
spa_load(ztest): FAILED: unable to parse config [error=48]
spa_load(ztest): UNLOADING
Which indicates where the error came from
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
Brooks Davis [Tue, 20 Feb 2018 18:08:57 +0000 (18:08 +0000)]
Reduce duplication in dynamic syscall registration code.
Remove the unused syscall_(de)register() functions in favor of the
better documented and easier to use syscall_helper_(un)register(9)
functions.
The default and freebsd32 versions differed in which array of struct
sysents they used and a few missing updates to the 32-bit code as
features were added to the main code.
Reviewed by: cem
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14337
Kyle Evans [Tue, 20 Feb 2018 18:04:08 +0000 (18:04 +0000)]
lualoader: Move carousel storage out into config
Carousel storage doesn't need to happen in the menu module, and indeed
storing it there introduces a circular reference between drawer and menu
that only works because of global pollution in loader.lua.
Carousel choices generally map to config entries anyways, making it as good
of place as any to store these. Move {get,set}CarouselIndex functionality
out into config so that drawer and menu may both use it. If we had more
carousel functionality, it might make sense to create a carousel module, but
this is not the case.
Kyle Evans [Tue, 20 Feb 2018 17:46:50 +0000 (17:46 +0000)]
lualoader: Add ability to intercept cli commands
If we failed to execute the input line as pure lua, run the command through
parse for consistent argument parsing. Pass the parsed arguments through to
a global "cli_execute" written in Lua, which is expected to either handle it
or pass it back through to interp_builtin_cmd (via loader.command).
lua-handled cli commands will then exist as globals in whatever module they
most belong in, and invocations at the loader prompt will magically dispatch
to them if they exist.
Kyle Evans [Tue, 20 Feb 2018 14:36:28 +0000 (14:36 +0000)]
stand/lua: Consistently declare local functions at module scope
Declare these adjacent to the local definitions at the top of the module,
and make sure they're actually declared local to pollute global namespace a
little bit less.
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass. If there is no object
associated with the wait, use curthread' policy domainset. The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.
Eliminate pagedaemon_wait(). vm_domain_clear() handles the same
operations.
Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.
Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.
Scetched and reviewed by: jeff
Tested by: pho
Sponsored by: The FreeBSD Foundation, Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D14384
Wojciech Macek [Tue, 20 Feb 2018 06:38:55 +0000 (06:38 +0000)]
PowerNV: Send SIGILL on HEA illegal instruction exception
Currently Hypervisor Emulation Assistance interrupt is unhandled.
Executing an undefined instruction in userland triggers kernel panic.
Handle this the same way as Facility Unavailable Interrupt - send
SIGILL signal to userspace.
Kyle Evans [Tue, 20 Feb 2018 05:21:58 +0000 (05:21 +0000)]
style.lua(9): Clarify local variable guideline
The intent of this guideline is to avoid creating global variables in module
scope. Its main purpose is to serve as a reminder that variables at module
scope also need to be declared.
We want to avoid global variables in general, but this is easier to mess up
when designing things in the module scope.
https://www.illumos.org/issues/7812
This change removes all gendered language that did not refer specifically
to an individual person or pet. The convention taken was to use
variations on "they" when referring to users and/or human beings, while
using "it" when referring to code, functions, and/or libraries.
Additionally, we took the liberty to fix up any whitespace issues that
were found in any files that were already being modified.
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Daniel Hoffman <dj.hoffman@delphix.com>
Kyle Evans [Tue, 20 Feb 2018 04:56:03 +0000 (04:56 +0000)]
stand/lua: Refactor logos into drawer.logodefs table
This refactor makes it straightforward to add new logos for drawing and
organizes logo definitions in a logical manner.
The graphic to be drawn for each logo may again be modified outside of
drawer, but it must be done on a case-by-case basis as a modification to the
loader_logo.
Kyle Evans [Tue, 20 Feb 2018 04:23:43 +0000 (04:23 +0000)]
stand/lua: Reduce exposure of the drawer module
As part of an effort to slowly reduce our exports overall to a set of stable
properties/functions, go ahead and reduce what drawer exposes to others.
The graphics should generally not be modified on their own, but their
position could be modified if additional grahics also need to be drawn.
Export position/shift information, but leave the actual graphic local to
the module.
The next step will be to create a 'menudef' that gets exported instead, with
each entry in the menudef table describing the graphic to be drawn along
with specific positioning information.
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Gordon Ross <gordon.ross@nexenta.com>
Author: Prakash Surya <prakash.surya@delphix.com>
Kyle Evans [Tue, 20 Feb 2018 03:58:45 +0000 (03:58 +0000)]
stand/lua: Add and use drawer.menu_name_handlers
Pull out specialized naming behavior into drawer.menu_name_handlers. This is
currently only needed for the carousel entry, where naming is based on the
current choice and the menu item purposefully does not store this state.
The setup was designed so that every type can have a special name handler,
and the default action is to simply use the result of entry.name().
Kyle Evans [Tue, 20 Feb 2018 03:40:16 +0000 (03:40 +0000)]
stand/lua: Extract menu handlers out into menu.handlers table
This is a bit cleaner than our former method of an if ... else chain of
handlers. Store handlers in the menu.handlers table so that they may be
added to or removed dynamically.
All handlers take the current menu and selected entry as parameters, and
their return value indicates whether the menu processor should continue or
not. An omitted return value or 'true' will indicate that we should
continue, while returning 'false' will indicate that we should exit the
current menu.
The omitted return value behavior is due to continuing the loop being the
more common situation.
Mateusz Guzik [Tue, 20 Feb 2018 02:18:30 +0000 (02:18 +0000)]
Reduce contention on the proctree lock during heavy package build.
There is a proctree -> allproc ordering established.
Most of the time it is either xlock -> xlock or slock -> slock.
On fork however there is a slock -> xlock pair which results in
pathological wait times due to threads keeping proctree held for
reading and all waiting on allproc. Switch this to xlock -> xlock.
Longer term fix would get rid of proctree in this place to begin with.
Right now it is necessary to walk the session/process group lists to
determine which id is free. The walk can be avoided e.g. with bitmaps.
The exit path used to have one place which dealt with allproc and
then with proctree. Move the allproc acquire into the section protected
by proctree. This reduces contention against threads waiting on proctree
in the fork codepath - the fork proctree holder does not have to wait
for allproc as often.
Finally, move tidhash manipulation outside of the area protected by
either of these locks. The removal from the hash was already unprotected.
There is no legitimate reason to look up thread ids for a process still
under construction.
This results in about 50% wait time reduction during -j 128 package build.
Jeff Roberson [Tue, 20 Feb 2018 00:06:07 +0000 (00:06 +0000)]
Further parallelize the buffer cache.
Provide multiple clean queues partitioned into 'domains'. Each domain manages
its own bufspace and has its own bufspace daemon. Each domain has a set of
subqueues indexed by the current cpuid to reduce lock contention on the cleanq.
Refine the sleep/wakeup around the bufspace daemon to use atomics as much as
possible.
Add a B_REUSE flag that is used to requeue bufs during the scan to approximate
LRU rather than locking the queue on every use of a frequently accessed buf.
Implement bufspace_reserve with only atomic_fetchadd to avoid loop restarts.
Kyle Evans [Mon, 19 Feb 2018 22:29:16 +0000 (22:29 +0000)]
stand/lua: Cache swapped menu, and don't create locals for swapping
Building the swapped welcome menu (first two items swapped) is kind of a
sluggish, because it requires a full (recrusive) shallow copy of the welcome
menu. Cache the result of that and re-use it later, instead of building it
everytime.
While here, don't create temporary locals just for swapping. The following
is just as good:
x, y = y, x;
Reported by: Alexander Nasonov <alnsn@yandex.ru> (swapping)
Alan Somers [Mon, 19 Feb 2018 22:09:49 +0000 (22:09 +0000)]
tail: fix "tail -r" for piped input that begins with '\n'
A subtle logic bug, probably introduced in r311895, caused tail to print the
first two lines of piped input in forward order, if the very first character
was a newline.
PR: 222671
Reported by: Jim Long <freebsd-bugzilla@umpquanet.com>, pprocacci@gmail.com
MFC after: 3 weeks
Sponsored by: Spectra Logic Corp
Kyle Evans [Mon, 19 Feb 2018 17:40:19 +0000 (17:40 +0000)]
stand/lua: Change boot menu items' names when swapped
[Enter] should be moved to the single user menu item when we swap them.
Define a non-standard menu entry function "alternate_name" to use for this
purpose for ultimate flexibility if we change our minds later. When we're
booting single user, make a shallow copy of the menu that we'd normally
display and swap the items and their name functions to use alternate_name
instead. Toggling single user in the options menu and going back to the main
menu will now correctly reflect the current boot setting with the first two
menu options and "[Enter]" will always be on the right one.
This shallow copy technique has the chance of being quite slow since it's
done on every redraw, but in my testing it does not seem to make any obvious
difference.
shallowCopyTable could likely belong better in a general-purpose utility
module, but this (and the key constnats) are the only candidates we have at
the moment so we'll drop it into our core stuff for the moment and consider
re-organization at a later date.
Kyle Evans [Mon, 19 Feb 2018 17:09:29 +0000 (17:09 +0000)]
stand/lua: Remove inaccurate comment after r329590
lualoader does a pretty good job of reverting any environment changes now.
It will even wipe out boot_verbose if it's set explicitly in loader.conf(5)
and overwritten in the boot options menu.
Future work will likely change this, as explicit choices made in the menu
should probably override the new loader.conf(5). I don't suspect this will
cause much grief, though, so it is not a high priority until boot
environment support actually lands.
Kyle Evans [Mon, 19 Feb 2018 16:59:28 +0000 (16:59 +0000)]
stand/lua: Track env changes that come in via loader.conf(5)
This will be used when boot environment support lands to make a good-faith
effort to apply any new loader.conf(5) environment settings atop the default
configuration that we started with.
Kyle Evans [Mon, 19 Feb 2018 16:35:46 +0000 (16:35 +0000)]
stand/lua: Call menu_entries if it's a function
If we've fetched menu.entries and it turns out it's a function, call it to
get the actual menu entries.
This will be used to swap multi-/single- user boot options if we're booting
single user by default (boot_single="YES" in loader.conf(5)). It can also be
used fairly easily for other non-standard situations.
Kyle Evans [Mon, 19 Feb 2018 15:49:27 +0000 (15:49 +0000)]
stand/lua: Remove some unused local declarations
Menus are actually defined as entries in the 'menu' table. These local
declarations have not been used in the history of our in-tree lua scripts,
so give them the boot.
Nathan Whitehorn [Mon, 19 Feb 2018 15:49:14 +0000 (15:49 +0000)]
Set internal error returns for OF_peer(), OF_child(), and OF_parent() to
zero, matching the IEEE 1275 standard. Since these internal error paths
have never, to my knowledge, been taken, behavior is unchanged.
Kyle Evans [Mon, 19 Feb 2018 14:21:56 +0000 (14:21 +0000)]
stand/lua: Defer kernel/module loading until boot or menu escape
Loading the kernel and modules can be really slow. Loading before the menu
draws and every time one changes kernel/boot environment is even more
painful.
Defer loading until we either boot, auto-boot, or escape to loader prompt.
We still need to deal with configuration changes as the boot environment
changes, but this is generally much quicker.
This commit strips all ELF loading out of config.load/config.reload so that
these are purely for configuration. config.loadelf has been created to deal
with kernel/module loads. Unloading logic has been ripped out, as we won't
need to deal with it in the menu anymore.
Andriy Gapon [Mon, 19 Feb 2018 08:55:22 +0000 (08:55 +0000)]
relax an assert in zfsctl_snapdir_lookup to match r323578
Since r323578 we may remove the last reference to a covered vnode with
vrele() instead of vput(). So, v_usecount may be decremented before
the vnode is locked and zfsctl_snapdir_lookup may "catch" the vnode
with v_usecount of zero and v_holdcnt of one.
Kyle Evans [Mon, 19 Feb 2018 03:59:26 +0000 (03:59 +0000)]
stand/lua: reload previously loaded kernel at config-load/reload
r329550 introduced config.kernel_loaded. config.load() doesn't provide a
means of overriding the kernel to load, but that likely isn't necessary as
it will not be a common case. When loading the kernel, just attempt to load
the kernel previously loaded and specified in config.kernel_loaded.
If we haven't loaded a kernel yet, config.kernel_loaded will be unset/nil
and the "default"/first kernel found will be loaded. If we've loaded a
kernel, we'll try to load that same kernel again and fallback to the default
kernel if we need to.
This in also in support of upcoming boot environment support.
Kyle Evans [Mon, 19 Feb 2018 03:52:02 +0000 (03:52 +0000)]
stand/lua: Store the loaded kernel as config.kernel_loaded
'nil' means the 'first kernel found in module_path', which is the same
interpretation as passing 'nil' to loadkernel.
Otherwise, it denotes the name of a kernel that we've successfully loaded.
When reloaded later, we will still need to do the full search again to
locate the actual kernel in case things have changed, so just the name is
good enough.
This is in support of upcoming boot environment support. vfs.root.mountfrom
and currdev will be changed, then we will reload configuration and attempt
to reload the currently chosen kernel unless we shouldn't.
Kyle Evans [Mon, 19 Feb 2018 02:09:10 +0000 (02:09 +0000)]
stand/lua: Clear the screen before prompting for passwords
In the worst case scenario, we have no passwords to prompt for and we end up
just clearing the screen twice before we draw the menu or proceed with boot.
In the best case scenario, we don't try drawing password prompts amidst a
bunch of kernel/module loading.
Kyle Evans [Mon, 19 Feb 2018 01:25:52 +0000 (01:25 +0000)]
Create style.lua(9)
This covers the lua style guidelines we've generally agreed on so far. It
will be revised as work continues and we run into more scenarios that need
specified.
Mateusz Guzik [Mon, 19 Feb 2018 00:54:08 +0000 (00:54 +0000)]
Fix process exit vs reap race introduced in r329449
The race manifested itself mostly in terms of crashes with "spin lock
held too long".
Relevant parts of respective code paths:
exit: reap:
PROC_LOCK(p);
PROC_SLOCK(p);
p->p_state == PRS_ZOMBIE
PROC_UNLOCK(p);
PROC_LOCK(p);
/* exit work */
if (p->p_state == PRS_ZOMBIE) /* true */
proc_reap()
free proc
/* more exit work */
PROC_SUNLOCK(p);
Thus a still exiting process is reaped.
Prior to the change the zombie check was followed by slock/sunlock trip
which prevented the problem.
Even code prior to this commit has a bug: the proc is still accessed for
statistic collection purposes. However, the severity is rather small and
the bug may be fixed in a future commit.
Mateusz Guzik [Mon, 19 Feb 2018 00:38:14 +0000 (00:38 +0000)]
mtx: add mtx_spin_wait_unlocked
The primitive can be used to wait for the lock to be released. Intended
usage is for locks in structures which are about to be freed.
The benefit is the avoided interrupt enable/disable trip + atomic op to
grab the lock and shorter wait if the lock is held (since there is no
worry someone will contend on the lock, re-reads can be more aggressive).
Warner Losh [Sun, 18 Feb 2018 23:16:16 +0000 (23:16 +0000)]
Print more info for -v runs and temp hack for usb vs uhub
Despite best efforts to regularize, there's a few tables in the system
that still report they are for bus usb when they are really for bus
uhub (where usb devices attach). Add a temporary workaround for this
until these places have been eliminated (likely my fault).
Second, when running verbose, describe what we're doing when
searching. This output can be quite long, but says exactly what's
going on (this output is to stdout, so it's useless for scripting).
Ian Lepore [Sun, 18 Feb 2018 23:08:43 +0000 (23:08 +0000)]
Provide a public function to acquire a gpio pin by giving the property name
and index. A private function to do exactly that already existed, so this
renames gpio_pin_get_by_ofw_impl() to gpio_pin_get_by_ofw_propidx() and
provides a declaration for it in a public header.
Previously there were functions to get a pin by property name (assuming
there would only be one pin defined for the name), or by index (asuming
the property has the standard name "gpios"). It turns out there are
devicetree bindings that describe properties with names other than "gpios"
which can describe multiple pins. Hence the need to retrieve the Nth item
from a named property.
Mateusz Guzik [Sun, 18 Feb 2018 21:07:15 +0000 (21:07 +0000)]
exit: get rid of PROC_SLOCK when checking a process to report, take #2
The suspension counter needs synchronisation through slock, but we don't
need it to check if inspecting the counter is necessary to begin with.
In the common case it is not, thus avoid the lock if possible.
Ian Lepore [Sun, 18 Feb 2018 20:08:35 +0000 (20:08 +0000)]
Give the imx_i2c driver its own name, set up its relationship to ofw_iicbus.
Previously it called itself 'iichb' to link up with the EARLY_DRIVER_MODULE
declaration in ofw_iicbus.c.
Ian Lepore [Sun, 18 Feb 2018 19:33:28 +0000 (19:33 +0000)]
Allow i2c hardware drivers to declare their own relationships to ofw_iicbus
rather than relying on a set of canned EARLY_DRIVER_MODULE() statements in
the ofw_iicbus source. This means hw drivers will no longer be required to
use one of a few predefined driver names. They will also now be able to
decide themselves if they want to use DRIVER_MODULE or EARLY_DRIVER_MODULE
and to set which pass to attach on for early modules.
Mainly, this adds extern declarations for the driver and devclass variables.
It also renames ofwiicbus_devclass to ofw_iicbus_devclass to be consistant
with the way we use ofw_ prefixes on this stuff.