https://www.illumos.org/issues/7446
Since we support whole-disk configuration for boot pool, we also will need
whole disk support with UEFI boot and for this, zpool create should create efi-
system partition.
I have borrowed the idea from oracle solaris, and introducing zpool create -
B switch to provide an way to specify that boot partition should be created.
However, there is still an question, how big should the system partition be.
For time being, I have set default size 256MB (thats minimum size for FAT32
with 4k blocks). To support custom size, the set on creation "bootsize"
property is created and so the custom size can be set as: zpool create B -
o bootsize=34MB rpool c0t0d0
After pool is created, the "bootsize" property is read only. When -B switch is
not used, the bootsize defaults to 0 and is shown in zpool get output with
value ''. Older zfs/zpool implementations are ignoring this property.
https://www.illumos.org/rb/r/219/
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Approved by: Dan McDonald <danmcd@kebe.com>
Author: Toomas Soome <tsoome@me.com>
This commit makes no sense for FreeBSD, that is why I blocked the option,
but it should be good to stay closer to upstream.
Bryan Drewery [Tue, 20 Feb 2018 21:48:16 +0000 (21:48 +0000)]
Move SVNVERSION_CMD into the one place that uses it.
This code, which is basically `svnversion || svnliteversion`, generates
2 fstatat(2) for every directory in PATH for every Makefile parsed that
includes bsd.own.mk. This can add up for things like generating a Ports
index (Poudriere) or building a dependency graph for base.
Kyle Evans [Tue, 20 Feb 2018 21:23:01 +0000 (21:23 +0000)]
lualoader: Prepare for interception of "boot" CLI cmd
core.boot and core.autoboot may both take arguments; add a helper to cleanly
append an argstring to the given loader command.
Also provide a popFrontTable() that we'll use pop the command name off of an
argv table. We don't have the table library included, and including it is
non-trivial, so we'll implement this one function that we need in lua for
the time being.
https://www.illumos.org/issues/7745
The problem is that consumers of `libZFS_Core` that forget to call
`libzfs_core_init()` before calling any other function of the library
are having a hard time realizing their mistake. The library's internal
file descriptor is declared as global static, which is ok, but it is not
initialized explicitly; therefore, it defaults to 0, which is a valid
file descriptor. If `libzfs_core_init()`, which explicitly initializes
the correct fd, is skipped, the ioctl functions return errors that do
not have anything to do with `libZFS_Core`, where the problem is
actually located.
Even though assertions for that existed within `libZFS_Core` for debug
builds, they were never enabled because the `-DDEBUG` flag was missing
from the compiler flags.
This patch applies the following changes:
1. It adds `-DDEBUG` for debug builds of `libZFS_Core` and `libzfs`,
to enable their assertions on debug builds.
2. It corrects an assertion within `libzfs`, where a function had
been spelled incorrectly (`zpool_prop_unsupported()`) and nobody
knew because the `-DDEBUG` flag was missing, and the preprocessor
was taking that part of the code away.
3. The library's internal fd is initialized to `-1` and `VERIFY`
assertions have been placed to check that the fd is not equal to
`-1` before issuing any ioctl. It is important here to note, that
the `VERIFY` assertions exist in both debug and non-debug builds.
4. In `libzfs_core_fini` we make sure to never increment the
refcount of our fd below 0, and also reset the fd to `-1` when no
one refers to it. The reason for this, is for the rare case that
the consumer closes all references but then calls one of the
library's functions without using `libzfs_core_init()` first, and
in the mean time, a previous call to `open()` decided to reuse
our previous fd. This scenario would have passed our assertion in
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
https://www.illumos.org/issues/7730
antares:root:~# mdb /usr/sbin/zpool
> ::sysbp _exit
> ::run import
pool: data
id: 2093977168778024605
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
https://www.illumos.org/issues/7604
If a zvol has the default setting for the "volblocksize" property, it is
8KB. However, it is displayed as "-" (not present), rather than "8K".
The problem was introduced by:
commit 25228e830e86924a41243343b1de9daf2d7dd43a
Author: Matthew Ahrens <mahrens@delphix.com>
Date: Thu Nov 17 14:37:24 2016 -0800
7571 non-present readonly numeric ZFS props do not have default value
which changed changed get_numeric_property() to indicate that readonly
default properties are not present. However, zfs_prop_readonly() returns
TRUE for both readonly and set-once properties (e.g. volblocksize).
Amusingly, that commit essentially reverted: 6900484 default volblocksize is no longer being reported correctly
from November 2009. However, that change was not correct either; the
correct solution is to only do this check for "truly readonly" (i.e. not
setonce) properties.
$ zfs list -t volume -o name,volblocksize
NAME
VOLBLOCK
domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/
archive -
domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/
datafile -
domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/
external -
rpool/dump
128K
rpool/swap
4K
rpool/swap1
===============================================================================
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/7542
libshare keeps a cached copy of the sharetab listing in memory, which can
become out of date if shares are destroyed or created while leaving a libzfs
handle open. This results in a spurious unmounting failure when an NFS share
exists but isn't in the stale libshare cache.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Amdur <matt.amdur@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Chris Williamson <chris.williamson@delphix.com>
https://www.illumos.org/issues/7336
We can run into a problem where we call into zfs_mount, which in turn calls
is_dir_empty, which opens the directory to try and make sure it's empty. The
issue with the current approach is that it holds the directory open while it
traverses it with readdir, which, due to subtle interaction with the Java JVM,
vfork, and exec can cause a tricky race condition resulting in zfs_mount
failures.
The approach to resolving the issue in this patch is to drop the usage of
readdir altogether, and instead rely on the fact that ZFS stores the number of
entries contained in a directory using the st_size field of the stat structure.
Thus, if the directory in question is a ZFS directory, we can check to see if
it's empty by calling stat() and inspecting the st_size field of structure
returned.
===============================================================================
The root cause appears to be an interesting race between vfork, exec, and
zfs_mount's usage of O_CLOEXEC when calling openat. Here's what is going on:
1. We call zfs_mount, and this in turn calls openat to check if the directory
is empty, which results in opening the directory we're trying to mount onto,
and increment v_count.
2. As we're in the middle of reading the directory, vfork is called by the JVM
and proceeds to exec the jspawnhelper utility. As a result of the vfork, we
take an additional hold on the directory, which increments v_count a second
time. The semantics of vfork mean the parent process will wait for the child
process to exit or exec before the parent can continue; at this point the
parent is in the middle of zfs_mount, reading the directory to determine if
it's empty or not.
3. The child process exec-ing jspawnhelper gets to the relvm call within
exec_args (which is called by exec_common). relvm is the function that releases
the parent process, allowing the parent to proceed. The problem is, at this
point of calling relvm, the child hasn't yet called close_exec which is
responsible for closing the file descriptors inherited from the parent process
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>
https://www.illumos.org/issues/7233
This fixes a race where one thread is executing zfs_mount() while another
thread forks and execs. If the fork occurs while the directory is open, the
child process will inherit (but not necessarily close immediately) the open fd
for the directory, preventing the mount.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Alex Reece <alex@delphix.com>
https://www.illumos.org/issues/7502
Right now ztest executes zdb without -G, so when it has errors, the messages
are often not very helpful:
Executing zdb -bccsv -d -U /rpool/tmp/zpool.cache ztest
zdb: can't open 'ztest': Operation not supported
ztest: '/usr/sbin/amd64/zdb -bccsv -d -U /rpool/tmp/zpool.cache ztest' exit
code 1
With -G, we'd have:
/usr/sbin/amd64/zdb -bccsv -d -U /rpool/tmp/zpool.cache -G ztest
zdb: can't open 'ztest': Operation not supported
ZFS_DBGMSG(zdb):
spa_open_common: opening ztest
spa_load(ztest): LOADING
spa_load(ztest): FAILED: unable to parse config [error=48]
spa_load(ztest): UNLOADING
Which indicates where the error came from
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
Brooks Davis [Tue, 20 Feb 2018 18:08:57 +0000 (18:08 +0000)]
Reduce duplication in dynamic syscall registration code.
Remove the unused syscall_(de)register() functions in favor of the
better documented and easier to use syscall_helper_(un)register(9)
functions.
The default and freebsd32 versions differed in which array of struct
sysents they used and a few missing updates to the 32-bit code as
features were added to the main code.
Reviewed by: cem
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14337
Kyle Evans [Tue, 20 Feb 2018 18:04:08 +0000 (18:04 +0000)]
lualoader: Move carousel storage out into config
Carousel storage doesn't need to happen in the menu module, and indeed
storing it there introduces a circular reference between drawer and menu
that only works because of global pollution in loader.lua.
Carousel choices generally map to config entries anyways, making it as good
of place as any to store these. Move {get,set}CarouselIndex functionality
out into config so that drawer and menu may both use it. If we had more
carousel functionality, it might make sense to create a carousel module, but
this is not the case.
Kyle Evans [Tue, 20 Feb 2018 17:46:50 +0000 (17:46 +0000)]
lualoader: Add ability to intercept cli commands
If we failed to execute the input line as pure lua, run the command through
parse for consistent argument parsing. Pass the parsed arguments through to
a global "cli_execute" written in Lua, which is expected to either handle it
or pass it back through to interp_builtin_cmd (via loader.command).
lua-handled cli commands will then exist as globals in whatever module they
most belong in, and invocations at the loader prompt will magically dispatch
to them if they exist.
Kyle Evans [Tue, 20 Feb 2018 14:36:28 +0000 (14:36 +0000)]
stand/lua: Consistently declare local functions at module scope
Declare these adjacent to the local definitions at the top of the module,
and make sure they're actually declared local to pollute global namespace a
little bit less.
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass. If there is no object
associated with the wait, use curthread' policy domainset. The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.
Eliminate pagedaemon_wait(). vm_domain_clear() handles the same
operations.
Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.
Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.
Scetched and reviewed by: jeff
Tested by: pho
Sponsored by: The FreeBSD Foundation, Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D14384
Wojciech Macek [Tue, 20 Feb 2018 06:38:55 +0000 (06:38 +0000)]
PowerNV: Send SIGILL on HEA illegal instruction exception
Currently Hypervisor Emulation Assistance interrupt is unhandled.
Executing an undefined instruction in userland triggers kernel panic.
Handle this the same way as Facility Unavailable Interrupt - send
SIGILL signal to userspace.
Kyle Evans [Tue, 20 Feb 2018 05:21:58 +0000 (05:21 +0000)]
style.lua(9): Clarify local variable guideline
The intent of this guideline is to avoid creating global variables in module
scope. Its main purpose is to serve as a reminder that variables at module
scope also need to be declared.
We want to avoid global variables in general, but this is easier to mess up
when designing things in the module scope.
https://www.illumos.org/issues/7812
This change removes all gendered language that did not refer specifically
to an individual person or pet. The convention taken was to use
variations on "they" when referring to users and/or human beings, while
using "it" when referring to code, functions, and/or libraries.
Additionally, we took the liberty to fix up any whitespace issues that
were found in any files that were already being modified.
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Daniel Hoffman <dj.hoffman@delphix.com>
Kyle Evans [Tue, 20 Feb 2018 04:56:03 +0000 (04:56 +0000)]
stand/lua: Refactor logos into drawer.logodefs table
This refactor makes it straightforward to add new logos for drawing and
organizes logo definitions in a logical manner.
The graphic to be drawn for each logo may again be modified outside of
drawer, but it must be done on a case-by-case basis as a modification to the
loader_logo.
Kyle Evans [Tue, 20 Feb 2018 04:23:43 +0000 (04:23 +0000)]
stand/lua: Reduce exposure of the drawer module
As part of an effort to slowly reduce our exports overall to a set of stable
properties/functions, go ahead and reduce what drawer exposes to others.
The graphics should generally not be modified on their own, but their
position could be modified if additional grahics also need to be drawn.
Export position/shift information, but leave the actual graphic local to
the module.
The next step will be to create a 'menudef' that gets exported instead, with
each entry in the menudef table describing the graphic to be drawn along
with specific positioning information.
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Gordon Ross <gordon.ross@nexenta.com>
Author: Prakash Surya <prakash.surya@delphix.com>
Kyle Evans [Tue, 20 Feb 2018 03:58:45 +0000 (03:58 +0000)]
stand/lua: Add and use drawer.menu_name_handlers
Pull out specialized naming behavior into drawer.menu_name_handlers. This is
currently only needed for the carousel entry, where naming is based on the
current choice and the menu item purposefully does not store this state.
The setup was designed so that every type can have a special name handler,
and the default action is to simply use the result of entry.name().
Kyle Evans [Tue, 20 Feb 2018 03:40:16 +0000 (03:40 +0000)]
stand/lua: Extract menu handlers out into menu.handlers table
This is a bit cleaner than our former method of an if ... else chain of
handlers. Store handlers in the menu.handlers table so that they may be
added to or removed dynamically.
All handlers take the current menu and selected entry as parameters, and
their return value indicates whether the menu processor should continue or
not. An omitted return value or 'true' will indicate that we should
continue, while returning 'false' will indicate that we should exit the
current menu.
The omitted return value behavior is due to continuing the loop being the
more common situation.
Mateusz Guzik [Tue, 20 Feb 2018 02:18:30 +0000 (02:18 +0000)]
Reduce contention on the proctree lock during heavy package build.
There is a proctree -> allproc ordering established.
Most of the time it is either xlock -> xlock or slock -> slock.
On fork however there is a slock -> xlock pair which results in
pathological wait times due to threads keeping proctree held for
reading and all waiting on allproc. Switch this to xlock -> xlock.
Longer term fix would get rid of proctree in this place to begin with.
Right now it is necessary to walk the session/process group lists to
determine which id is free. The walk can be avoided e.g. with bitmaps.
The exit path used to have one place which dealt with allproc and
then with proctree. Move the allproc acquire into the section protected
by proctree. This reduces contention against threads waiting on proctree
in the fork codepath - the fork proctree holder does not have to wait
for allproc as often.
Finally, move tidhash manipulation outside of the area protected by
either of these locks. The removal from the hash was already unprotected.
There is no legitimate reason to look up thread ids for a process still
under construction.
This results in about 50% wait time reduction during -j 128 package build.
Jeff Roberson [Tue, 20 Feb 2018 00:06:07 +0000 (00:06 +0000)]
Further parallelize the buffer cache.
Provide multiple clean queues partitioned into 'domains'. Each domain manages
its own bufspace and has its own bufspace daemon. Each domain has a set of
subqueues indexed by the current cpuid to reduce lock contention on the cleanq.
Refine the sleep/wakeup around the bufspace daemon to use atomics as much as
possible.
Add a B_REUSE flag that is used to requeue bufs during the scan to approximate
LRU rather than locking the queue on every use of a frequently accessed buf.
Implement bufspace_reserve with only atomic_fetchadd to avoid loop restarts.
Kyle Evans [Mon, 19 Feb 2018 22:29:16 +0000 (22:29 +0000)]
stand/lua: Cache swapped menu, and don't create locals for swapping
Building the swapped welcome menu (first two items swapped) is kind of a
sluggish, because it requires a full (recrusive) shallow copy of the welcome
menu. Cache the result of that and re-use it later, instead of building it
everytime.
While here, don't create temporary locals just for swapping. The following
is just as good:
x, y = y, x;
Reported by: Alexander Nasonov <alnsn@yandex.ru> (swapping)
Alan Somers [Mon, 19 Feb 2018 22:09:49 +0000 (22:09 +0000)]
tail: fix "tail -r" for piped input that begins with '\n'
A subtle logic bug, probably introduced in r311895, caused tail to print the
first two lines of piped input in forward order, if the very first character
was a newline.
PR: 222671
Reported by: Jim Long <freebsd-bugzilla@umpquanet.com>, pprocacci@gmail.com
MFC after: 3 weeks
Sponsored by: Spectra Logic Corp
Kyle Evans [Mon, 19 Feb 2018 17:40:19 +0000 (17:40 +0000)]
stand/lua: Change boot menu items' names when swapped
[Enter] should be moved to the single user menu item when we swap them.
Define a non-standard menu entry function "alternate_name" to use for this
purpose for ultimate flexibility if we change our minds later. When we're
booting single user, make a shallow copy of the menu that we'd normally
display and swap the items and their name functions to use alternate_name
instead. Toggling single user in the options menu and going back to the main
menu will now correctly reflect the current boot setting with the first two
menu options and "[Enter]" will always be on the right one.
This shallow copy technique has the chance of being quite slow since it's
done on every redraw, but in my testing it does not seem to make any obvious
difference.
shallowCopyTable could likely belong better in a general-purpose utility
module, but this (and the key constnats) are the only candidates we have at
the moment so we'll drop it into our core stuff for the moment and consider
re-organization at a later date.
Kyle Evans [Mon, 19 Feb 2018 17:09:29 +0000 (17:09 +0000)]
stand/lua: Remove inaccurate comment after r329590
lualoader does a pretty good job of reverting any environment changes now.
It will even wipe out boot_verbose if it's set explicitly in loader.conf(5)
and overwritten in the boot options menu.
Future work will likely change this, as explicit choices made in the menu
should probably override the new loader.conf(5). I don't suspect this will
cause much grief, though, so it is not a high priority until boot
environment support actually lands.
Kyle Evans [Mon, 19 Feb 2018 16:59:28 +0000 (16:59 +0000)]
stand/lua: Track env changes that come in via loader.conf(5)
This will be used when boot environment support lands to make a good-faith
effort to apply any new loader.conf(5) environment settings atop the default
configuration that we started with.
Kyle Evans [Mon, 19 Feb 2018 16:35:46 +0000 (16:35 +0000)]
stand/lua: Call menu_entries if it's a function
If we've fetched menu.entries and it turns out it's a function, call it to
get the actual menu entries.
This will be used to swap multi-/single- user boot options if we're booting
single user by default (boot_single="YES" in loader.conf(5)). It can also be
used fairly easily for other non-standard situations.
Kyle Evans [Mon, 19 Feb 2018 15:49:27 +0000 (15:49 +0000)]
stand/lua: Remove some unused local declarations
Menus are actually defined as entries in the 'menu' table. These local
declarations have not been used in the history of our in-tree lua scripts,
so give them the boot.
Nathan Whitehorn [Mon, 19 Feb 2018 15:49:14 +0000 (15:49 +0000)]
Set internal error returns for OF_peer(), OF_child(), and OF_parent() to
zero, matching the IEEE 1275 standard. Since these internal error paths
have never, to my knowledge, been taken, behavior is unchanged.
Kyle Evans [Mon, 19 Feb 2018 14:21:56 +0000 (14:21 +0000)]
stand/lua: Defer kernel/module loading until boot or menu escape
Loading the kernel and modules can be really slow. Loading before the menu
draws and every time one changes kernel/boot environment is even more
painful.
Defer loading until we either boot, auto-boot, or escape to loader prompt.
We still need to deal with configuration changes as the boot environment
changes, but this is generally much quicker.
This commit strips all ELF loading out of config.load/config.reload so that
these are purely for configuration. config.loadelf has been created to deal
with kernel/module loads. Unloading logic has been ripped out, as we won't
need to deal with it in the menu anymore.
Andriy Gapon [Mon, 19 Feb 2018 08:55:22 +0000 (08:55 +0000)]
relax an assert in zfsctl_snapdir_lookup to match r323578
Since r323578 we may remove the last reference to a covered vnode with
vrele() instead of vput(). So, v_usecount may be decremented before
the vnode is locked and zfsctl_snapdir_lookup may "catch" the vnode
with v_usecount of zero and v_holdcnt of one.
Kyle Evans [Mon, 19 Feb 2018 03:59:26 +0000 (03:59 +0000)]
stand/lua: reload previously loaded kernel at config-load/reload
r329550 introduced config.kernel_loaded. config.load() doesn't provide a
means of overriding the kernel to load, but that likely isn't necessary as
it will not be a common case. When loading the kernel, just attempt to load
the kernel previously loaded and specified in config.kernel_loaded.
If we haven't loaded a kernel yet, config.kernel_loaded will be unset/nil
and the "default"/first kernel found will be loaded. If we've loaded a
kernel, we'll try to load that same kernel again and fallback to the default
kernel if we need to.
This in also in support of upcoming boot environment support.
Kyle Evans [Mon, 19 Feb 2018 03:52:02 +0000 (03:52 +0000)]
stand/lua: Store the loaded kernel as config.kernel_loaded
'nil' means the 'first kernel found in module_path', which is the same
interpretation as passing 'nil' to loadkernel.
Otherwise, it denotes the name of a kernel that we've successfully loaded.
When reloaded later, we will still need to do the full search again to
locate the actual kernel in case things have changed, so just the name is
good enough.
This is in support of upcoming boot environment support. vfs.root.mountfrom
and currdev will be changed, then we will reload configuration and attempt
to reload the currently chosen kernel unless we shouldn't.
Kyle Evans [Mon, 19 Feb 2018 02:09:10 +0000 (02:09 +0000)]
stand/lua: Clear the screen before prompting for passwords
In the worst case scenario, we have no passwords to prompt for and we end up
just clearing the screen twice before we draw the menu or proceed with boot.
In the best case scenario, we don't try drawing password prompts amidst a
bunch of kernel/module loading.