CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

MFC r330696, r330709, r330742, r331358

r330696:
Add some functional tests for tftpd(8)

tftpd(8) is difficult to test in isolation due to its relationship with
inetd. Create a test program that mimics the behavior of tftp(1) and
inetd(8) and verifies tftpd's response in several different scenarios.

These test cases cover all of the basic TFTP protocol, but not the optional
parts.

PR: 157700
PR: 225996
PR: 226004
PR: 226005
Differential Revision: https://reviews.freebsd.org/D14310

r330709:
Commit missing file from r330696

X-MFC-With: 330696

r330742:
tftpd: fix the build of tests on i386 after 330696

It's those darn printf format specifiers again

Reported by: cy, kibab
X-MFC-With: 330696

r331358:
tftpd: misc Coverity cleanup in the tests

A bunch of unchecked return values from open(2) and read(2)

Reported by: Coverity
CID: 1386900, 1386911, 1386926, 1386928, 1386932, 1386942
CID: 1386961, 1386979
X-MFC-With: 330696

MFC r330627:

g_bio(9): fix a documentation oversight from r163870

MFC r330515:

spray: fix the spelling in an output string

MFC r330514:

rpc.sprayd: raise WARNS to 6

MFC r329874:

Add tests for lagg(4) and other cloned network interfaces

Unfortunately, most of the tests are disabled because they fairly frequently
trigger panics.

Sponsored by: Spectra Logic Corp

MFC r329845, r329872

r329845:
Fix numerous Coverity issues in mptutil

Most are memory or file descriptor leaks. Three were unannotated
fallthroughs in a switch/case statement. One was an integer overflow before
widen.

Reported by: Coverity
CID: 1007463 1007462 1007461 1007460 1007459 1007458 1007457
CID: 1006855 1006854 1006853 1006852 1006851 1006850 1006849
CID: 1006848 1006845 1006844 1006843 1006842 1006841 1006840
CID: 1006839 1006838 1006837 1006836 1006835 1006834 1006833
CID: 1006832 1006831 1006831 1006830 1006829 1008334 1008170
CID: 1008169 1008168
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D11013

r329872:
Delete copypasta

Reported by: rpokala
X-MFC-With: 329845
Sponsored by: Spectra Logic Corp

MFC r329754:

dhclient: raise WARNS to 4

Mostly const-correctness fixes. There were also some variable-shadowing,
unused variable, and a couple of sockaddr type-correctness changes. I also had
trouble with cast-align warnings. I was able to prove that one of them was a
false positive. But ultimately I had to disable the warning program-wide to
deal with the others.

Reviewed by: cem
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D14460

MFC r328341:

Add SPDX tags to iscsi(4).

MFC r329606:

tail: fix "tail -r" for piped input that begins with '\n'

A subtle logic bug, probably introduced in r311895, caused tail to print the
first two lines of piped input in forward order, if the very first character
was a newline.

PR: 222671
Reported by: Jim Long <freebsd-bugzilla@umpquanet.com>, pprocacci@gmail.com
Sponsored by: Spectra Logic Corp

MFC r328590:

Document the new hw.usb.template behaviour.

MFC r328589:

Make the handler routine for the hw.usb.template sysctl trigger the USB
host to reprobe the bus by switching the USB pull up resistors off and
back on. In other words - when FreeBSD is configured as a USB device,
changing the sysctl will be immediately noticed by the machine it's
connected to.

Relnotes: yes
Sponsored by: The FreeBSD Foundation

MFC r328338:

Add SPDX tags for automount(8) et al.

MFC r328339:

Add SPDX tags to autofs(5).

MFC r328337:

Add missing SPDX tags for ctld(8).

MFC r326430:

Add "vmaddr" ps(1) keyword.

MFC r326248:

.Xr pmcstat(8) from kgmon(8) and gprof(1).

MFC r325403:

Add missing MLINKS for disk_add_alias(9).

MFC r331546:

pf: reload and resync do the same thing

The reload and resync commands for the startup script do exactly the same
thing, so implement one as a call to the other.

MFC r329312 by eadler@:

etc: clean up trailing whitespace in autofs

MFC r326252:

Add /etc/autofs/include_nis, a non-rewriting NIS map.

MFC r326251:

Rename /etc/autofs/include_nis to /etc/autofs/include_nis_nullfs, to indicate
that this script provides nullfs map rewriting for local mounts.

MFC r326250:

Change formatting; no functional changes.

MFC r325392:

Add NIS automounter map, which supports rewriting of self-hosted locations
to make them nullfs.

PR: 221010

MFC r325400:

Make autofs(5) rc scripts run earlier, matching those for amd(8).

This helps when you have some daemons that need to access automounted shares.

PR: 221011

MFC r325390:

Use proper naming in a debug message.

MFC r325312:

Add fetchbench, a trivial HTTP benchmark based on fetch(1).

MFC r330875:

Add "usbconfig dump_all_desc", a subcommand to dump all device and config
descriptors.

MFC r327522:

Fix warnings from "mandoc -Tlint -Wwarning".

MFC r327382:

Improve usbconfig(8) manual page by adding descriptions for subcommands.

MFC r328219:

Add missing manufacturer/serial number string descriptors.

MFC r328197:

Remove unused index.

MFC r328196:

Add missing SPDX tags; the rest of the license text is the same as in other
USB templates.

MFC r328194:

Add sysctls to control device side USB identifiers. This makes it
possible to change string and numeric vendor and product identifiers,
as well as anything else there might be to change for a particular
device side template, eg the MAC address.

Relnotes: yes

MFC r324626:

Replace some magic numbers in usb_template(4) code with #defines.
There should be no functional changes.

Merge r331871:
  Handle a special case when a slab can fit only one allocation,
  and zone has a large alignment. With alignment taken into
  account uk_rsize will be greater than space in a slab. However,
  since we have only one item per slab, it is always naturally
  aligned.

  Code that will panic before this change with 4k page:

        z = uma_zcreate("test", 3984, NULL, NULL, NULL, NULL, 31, 0);
        uma_zalloc(z, M_WAITOK);

  A practical scenario to hit the panic is a machine with 56 CPUs
  and 2 NUMA domains, which yields in zone size of 3984 (on head).

PR: 227116

MFC r328861: Update blacklist-helper to not emit messages from pf during operation.

Use 'pfctl -k' when blocking a site to kill active tcp connections
from the blocked address.

Fix 'purge' operation for pf, which must dynamically determine which
filters have been created, so the filters can be flushed by name.

MFC r324512: Don't use a non-zero argument for __builtin_frame_address

Mirror the change made for powerpc64 in r323687. With this
change, gcc 6.4.0 can successfully compile and link a kernel
that runs on sparc64.

MFC r324237:

Make procstat(1) recognize process descriptors, so that it shows
"P" instead of "?" in "procstat -af" output. Note that there are
still a few more DTYPE_* kinds we don't decode yet.

Sponsored by: DARPA, AFRL

MFC r323206: Enable dtrace support for mips64 and the ERL kernel config

Turn on the required options in the ERL config file, and ensure
that the fbt module is listed as a dependency for mips in
the modules/dtrace/dtraceall/dtraceall.c file.

MFC r332483:

dtc(1): Update to upstream 006664a

Highlights:

- Passing "-" to -o will now cause output to go to stdout
- Path-based syntactic sugar for overlays is now accepted. This looks like:

/dts-v1/;
/plugin/;

&{/soc} {
    sid: eeprom@1c14000 {
        compatible = "allwinner,sun8i-h3-sid";
        reg = <0x1c14000 0x400>;
        status = "okay";
    };
};

MFC r331950: 9434 Speculative prefetch is blocked by device removal code.

Device removal code does not set spa_indirect_vdevs_loaded for pools
that never experienced device removal. At least one visual consequence
of it is completely blocked speculative prefetcher. This patch sets
the variable in such situations.

MFC r331713: MFV r331712:
9280 Assertion failure while running removal_with_ganging test with 4K devices

illumos/illumos-gate@243952c7eeef020886e3e2e3df99a513df40584a

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Matt Ahrens <Matt.Ahrens@delphix.com>

MFC r331711: MFV 331710:
9188 increase size of dbuf cache to reduce indirect block decompression

illumos/illumos-gate@268bbb2a2fa79c36d4695d13a595ba50a7754b76

With compressed ARC (6950) we use up to 25% of our CPU to decompress indirect
blocks, under a workload of random cached reads. To reduce this decompression
cost, we would like to increase the size of the dbuf cache so that more
indirect blocks can be stored uncompressed.

If we are caching entire large files of recordsize=8K, the indirect blocks
use 1/64th as much memory as the data blocks (assuming they have the same
compression ratio). We suggest making the dbuf cache be 1/32nd of all memory,
so that in this scenario we should be able to keep all the indirect blocks
decompressed in the dbuf cache. (We want it to be more than the 1/64th that
the indirect blocks would use because we need to cache other stuff in the
dbuf cache as well.)

In real world workloads, this won't help as dramatically as the example
above, but we think it's still worth it because the risk of decreasing
performance is low. The potential negative performance impact is that we
will be slightly reducing the size of the ARC (by ~3%).

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Allan Jude <allanjude@freebsd.org>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: George Wilson <george.wilson@delphix.com>

MFC r331709: MFV r331708:
9321 arc_loan_compressed_buf() can increment arc_loaned_bytes by the wrong value

illumos/illumos-gate@9be12bd737714550277bd02b0c693db560976990

arc_loan_compressed_buf() increments arc_loaned_bytes by psize unconditionally
In the case of zfs_compressed_arc_enabled=0, when the buf is returned via
arc_return_buf(), if ARC_BUF_COMPRESSED(buf) is false, then arc_loaned_bytes
is decremented by lsize, not psize.

Switch to using arc_buf_size(buf), instead of psize, which will return
psize or lsize, depending on the result of ARC_BUF_COMPRESSED(buf).

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Allan Jude <allanjude@freebsd.org>

MFC r331707: MFV r331706:
9235 rename zpool_rewind_policy_t to zpool_load_policy_t

illumos/illumos-gate@5dafeea3ebd2dd77affc802bcb90f63faf01589f

We want to be able to pass various settings during import/open of a pool,
which are not only related to rewind. Instead of adding a new policy and
duplicate a bunch of code, we should just rename rewind_policy to a more
generic term like load_policy.

For instance, we'd like to set spa->spa_import_flags from the nvlist,
rather from a flags parameter passed to spa_import as in some cases we want
those flags not only for the import case, but also for the open case. One
such flag could be ZFS_IMPORT_MISSING_LOG (as used in zdb) which would
allow zfs to open a pool when logs are missing.

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

MFC r331705: MFV 331704:
9191 dump vdev tree to zfs_dbgmsg when spa load fails due to missing log devices

illumos/illumos-gate@ccef24b493bcbd146fcd6d8946666cae081470b6

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

MFC r331703: MFV 331702:
9187 racing condition between vdev label and spa_last_synced_txg in vdev_validate

illumos/illumos-gate@d1de72cfa29ab77ff80e2bb0e668a6afa5bccaf0

ztest failed with uncorrectable IO error despite having the fix for #7163.
Both sides of the mirror have CANT_OPEN_BAD_LABEL, which also distinguishes
it from that issue.

Definitely seems like a racing condition between the vdev_validate and spa_sync:
1. Thread A (spa_sync): vdev label is updated to latest txg
2. Thread B (vdev_validate): vdev label's txg is compared to spa_last_synced_txg and is ahead.
3. Thread A (spa_sync): spa_last_synced_txg is updated to latest txg.

Solution: do not check txg in vdev_validate unless config lock is held.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

MFC r331701: MFV r331695, 331700: 9166 zfs storage pool checkpoint

illumos/illumos-gate@8671400134a11c848244896ca51a7db4d0f69da4

The idea of Storage Pool Checkpoint (aka zpool checkpoint) deals with
exactly that.  It can be thought of as a “pool-wide snapshot” (or a
variation of extreme rewind that doesn’t corrupt your data).  It remembers
the entire state of the pool at the point that it was taken and the user
can revert back to it later or discard it.  Its generic use case is an
administrator that is about to perform a set of destructive actions to ZFS
as part of a critical procedure.  She takes a checkpoint of the pool before
performing the actions, then rewinds back to it if one of them fails or puts
the pool into an unexpected state.  Otherwise, she discards it.  With the
assumption that no one else is making modifications to ZFS, she basically
wraps all these actions into a “high-level transaction”.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>

MFC r331699: Partial MFV r329753:
8809 libzpool should leverage work done in libfakekernel

illumos/illumos-gate@f06dce2c1f0f3af78581e7574f65bfba843ddb6e

Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Gordon Ross <gordon.w.ross@gmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Andrew Stormont <astormont@racktopsystems.com>

We do not have libfakekernel, but need to reduce code divergence.

MFC r331420 (by avg): zfs: fix mismatch between format specifier and type

vdev_dbgmsg_print_tree printed vdev_id of uint64_t type with %u format
specifier. That caused subsequent parameters to be incorrectly read
from the stack and lead to a crash when a wrong value was interpreted as
a string pointer.

This should be upstreamed.

MFC r331414: Reduce struct aggsum_bucket padding to fit into one cache line.

MFC r331408: MFV r331407: 9213 zfs: sytem typo

illumos/illumos-gate@edc8ef7d921c96b23969898aeb766cb24960bda7

Reviewed by: C Fraire <cfraire@me.com>
Reviewed by: Andy Fiddaman <omnios@citrus-it.co.uk>
Approved by: Joshua M. Clulow <josh@sysmgr.org>
Author: Toomas Soome <tsoome@me.com>

MFC r331406: MFV r331405: 9084 spa_*_ashift must ignore spare devices

illumos/illumos-gate@b037f3dbd69cef4a7ffd576ad33e07bfaf0b1e84

Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Prakash Surya <prakash.surya@delphix.com>

MFC r331404: MFV r331400:
8484 Implement aggregate sum and use for arc counters

In pursuit of improving performance on multi-core systems, we should
implements fanned out counters and use them to improve the performance of
some of the arc statistics. These stats are updated extremely frequently,
and can consume a significant amount of CPU time.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>

MFC r329808: MFV r329807:
8940 Sending an intra-pool resumable send stream may result in EXDEV

illumos/illumos-gate@544132fce3fa6583f01318f9559adc46614343a7

"zfs send -t <token>" for an incremental send should be able to resume
successfully when sending to the same pool: a subtle issue in
zfs_iter_children() doesn't currently allow this.

Because resuming from a token requires "guid" -> "dataset" mapping
(guid_to_name()), we have to walk the whole hierarchy to find the right
snapshots to send.
When resuming an incremental send both source and destination live in the
same pool and have the same guid: this is where zfs_iter_children() gets
confused and picks up the wrong snapshot, so we end up trying to send an
incremental "destination@snap1 -> source@snap2" stream instead of
"source@snap1 -> source@snap2": this fails with an "Invalid cross-device
link" (EXDEV) error.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
Author: loli10K <ezomori.nozomu@gmail.com>

MFC r329805: MFV r329803:
9080 recursive enter of vdev_indirect_rwlock from vdev_indirect_remap()

illumos/illumos-gate@bdfded42e66b9fc1395ff2401aa2952f7c44ae34

A scenario came up where a callback executed by vdev_indirect_remap() on a vdev, calls
vdev_indirect_remap() on the same vdev and tries to reacquire vdev_indirect_rwlock that
was already acquired from the first call to vdev_indirect_remap(). The specific scenario,
is that we want to remap a block pointer that is snapshoted but its dataset's remap_deadlist
is not cached. So in order to add it we issue a read through a vdev_indirect_remap() on the
same vdev, which brings up the aforementioned issue.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
Author: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>

MFC r329802: MFV r329799, r329800:
9079 race condition in starting and ending condesing thread for indirect vdevs

illumos/illumos-gate@667ec66f1b4f491d5e839644e0912cad1c9e7122

The timeline of the race condition is the following:
[1] Thread A is about to finish condesing the first vdev in spa_condense_indirect_thread(),
so it calls the spa_condense_indirect_complete_sync() sync task which sets the
spa_condensing_indirect field to NULL. Waiting for the sync task to finish, thread A
sleeps until the txg is done. When this happens, thread A will acquire spa_async_lock
and set spa_condense_thread to NULL.
[2] While thread A waits for the txg to finish, thread B which is running spa_sync() checks
whether it should condense the second vdev in vdev_indirect_should_condense() by checking
the spa_condensing_indirect field which was set to NULL by spa_condense_indirect_thread()
from thread A. So it goes on and tries to spawn a new condensing thread in
spa_condense_indirect_start_sync() and the aforementioned assertions fails because thread A
has not set spa_condense_thread to NULL (which is basically the last thing it does before
returning).

The main issue here is that we rely on both spa_condensing_indirect and spa_condense_thread to
signify whether a condensing thread is running. Ideally we would only use one throughout the
codebase. In addition, for managing spa_condense_thread we currently use spa_async_lock which
basically tights condensing to scrubing when it comes to pausing and resuming those actions
during spa export.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>

MFC r329798: MFV r329793, r329795:
9075 Improve ZFS pool import/load process and corrupted pool recovery

illumos/illumos-gate@6f7938128a2c5e23f4b970ea101137eadd1470a1

Some work has been done lately to improve the debugability of the ZFS pool
load (and import) process. This includes:

https://www.illumos.org/issues/7638: Refactor spa_load_impl into several functions
https://www.illumos.org/issues/8961: SPA load/import should tell us why it failed
https://www.illumos.org/issues/7277: zdb should be able to print zfs_dbgmsg's

To iterate on top of that, there's a few changes that were made to make the
import process more resilient and crash free. One of the first tasks during the
pool load process is to parse a config provided from userland that describes
what devices the pool is composed of. A vdev tree is generated from that config,
and then all the vdevs are opened.

The Meta Object Set (MOS) of the pool is accessed, and several metadata objects
that are necessary to load the pool are read. The exact configuration of the
pool is also stored inside the MOS. Since the configuration provided from
userland is external and might not accurately describe the vdev tree
of the pool at the txg that is being loaded, it cannot be relied upon to safely
operate the pool. For that reason, the configuration in the MOS is read early
on. In the past, the two configurations were compared together and if there was
a mismatch then the load process was aborted and an error was returned.

The latter was a good way to ensure a pool does not get corrupted, however it
made the pool load process needlessly fragile in cases where the vdev
configuration changed or the userland configuration was outdated. Since the MOS
is stored in 3 copies, the configuration provided by userland doesn't have to be
perfect in order to read its contents. Hence, a new approach has been adopted:
The pool is first opened with the untrusted userland configuration just so that
the real configuration can be read from the MOS. The trusted MOS configuration
is then used to generate a new vdev tree and the pool is re-opened.

When the pool is opened with an untrusted configuration, writes are disabled
to avoid accidentally damaging it. During reads, some sanity checks are
performed on block pointers to see if each DVA points to a known vdev;
when the configuration is untrusted, instead of panicking the system if those
checks fail we simply avoid issuing reads to the invalid DVAs.

This new two-step pool load process now allows rewinding pools accross
vdev tree changes such as device replacement, addition, etc. Loading a pool
from an external config file in a clustering environment also becomes much
safer now since the pool will import even if the config is outdated and didn't,
for instance, register a recent device addition.

With this code in place, it became relatively easy to implement a
long-sought-after feature: the ability to import a pool with missing top level
(i.e. non-redundant) devices. Note that since this almost guarantees some loss
Of data, this feature is for now restricted to a read-only import.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

MFC r329783: 8942 zfs promote .../%recv should be an error

illumos/illumos-gate@add927f8c8d101e16c23eb9cd270be4fd7edf7d5

Reported on the ZFSonLinux https://github.com/zfsonlinux/zfs/issues/4843,
fixed by https://github.com/zfsonlinux/zfs/pull/6339:

If we are in the middle of an incremental zfs receive, the child .../%recv
will exist. If you concurrently run zfs promote .../%recv, it will "work",
but then zfs gets confused. For example, there's no obvious way to destroy
the containing filesystem (because it is now a clone of its invisible child).

Attempting to do this promote should be an error. We could fix this by
having zfs_ioc_promote() check if zc_name contains a %, similar to
zfs_ioc_rename().

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: loli10K <ezomori.nozomu@gmail.com>

MFC r329777: MFV r329776:
8477 Assertion failed in vdev_state_dirty(): spa_writeable(spa)

illumos/illumos-gate@f4c1745bd6c9829a05ecec15759ede7757100ab5

Illumos 4080 allows "zpool clear" to work on readonly pools: i don't think
this is the intended behaviour, we shouldn't be allowed to clear readonly
pools. Probably.

A fix is already in the ZFS on Linux repository to addess this issue:
https://github.com/zfsonlinux/zfs/commit/92e43c17188d47f47b69318e4884096dec380e36

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: loli10K <ezomori.nozomu@gmail.com>

MFC r329775: MFV r329774:
8408 dsl_props_set_sync_impl() does not handle nested nvlists correctly

illumos/illumos-gate@85723e5eec42f46dbfdb4c09b9e1ed66501d1ccf

When iterating over the input nvlist in dsl_props_set_sync_impl() when we
don't preserve the nvpair name before looking up ZPROP_VALUE, so when we
later go to process it nvpair_name() is always "value" instead of the actual
property name.

This results in a couple of bugs in the recv code:

- received properties are not restored correctly when failing to receive
an incremental send stream
- received properties are not completely replaced by the new ones when
successfully receiving an incremental send stream

This was discovered on ZFS on Linux (fixed in
https://github.com/zfsonlinux/zfs/commit/5f1346c29997dd4e02acf4c19c875d5484f33b1e)

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: loli10K <ezomori.nozomu@gmail.com>

MFC r329771: MFV r329770: 9035 zfs: this statement may fall through

illumos/illumos-gate@46ac8fdfc5a1f9d8240c79a6ae5b2889cbe83553

Reviewed by: Yuri Pankov <yuripv@yuripv.net>
Reviewed by: Andy Fiddaman <omnios@citrus-it.co.uk>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Toomas Soome <tsoome@me.com>

MFC r329769: MFV r329766: 8962 zdb should work on non-idle pools

illumos/illumos-gate@e144c4e6c90e7d4dccaad6db660ee42b6e7ba04f

Currently `zdb` consistently fails to examine non-idle pools as it fails
during the `spa_load()` process. The main problem seems to be that
`spa_load_verify()` fails as can be seen below:

$ sudo zdb -d -G dcenter
    zdb: can't open 'dcenter': I/O error

ZFS_DBGMSG(zdb):
    spa_open_common: opening dcenter
    spa_load(dcenter): LOADING
    disk vdev '/dev/dsk/c4t11d0s0': best uberblock found for spa dcenter. txg 40824950
    spa_load(dcenter): using uberblock with txg=40824950
    spa_load(dcenter): UNLOADING
    spa_load(dcenter): RELOADING
    spa_load(dcenter): LOADING
    disk vdev '/dev/dsk/c3t10d0s0': best uberblock found for spa dcenter. txg 40824952
    spa_load(dcenter): using uberblock with txg=40824952
    spa_load(dcenter): FAILED: spa_load_verify failed [error=5]
    spa_load(dcenter): UNLOADING

This change makes `spa_load_verify()` a dryrun when ran from `zdb`. This is
done by creating a global flag in zfs and then setting it in `zdb`.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

MFC r329765: MFV r329762: 8961 SPA load/import should tell us why it failed

illumos/illumos-gate@3ee8c80c747c4aa3f83351a6920f30c411236e1b

When we fail to open or import a storage pool, we typically don't get any
additional diagnostic information, just "no pool found" or "can not import".

While there may be no additional user-consumable information, we should at
least make this situation easier to debug/diagnose for developers and support.
For example, we could start by using `zfs_dbgmsg()` to log each thing that we
try when importing, and which things failed. E.g. "tried uberblock of txg X
from label Y of device Z". Also, we could log each of the stages that we go
through in `spa_load_impl()`.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

MFC r329761: MFV r329760: 7638 Refactor spa_load_impl into several functions

illumos/illumos-gate@1fd3785ff6601d3e391378c2dcbf4c5f27e1fe32

spa_load_impl has grown out of proportions. It is currently over 700
lines long and makes it very hard to follow or debug the import process
even for experienced ZFS developers. The objective is to split it up
in a series of well commented functions.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

MFC r329759:
9018 Replace kmem_cache_reap_now() with kmem_cache_reap_soon()

illumos/illumos-gate@36a64e62848b51ac5a9a5216e894ec723cfef14e

To prevent kmem_cache reaping from blocking other system resources, turn
kmem_cache_reap_now() (which blocks) into kmem_cache_reap_soon(). Callers
to kmem_cache_reap_soon() should use kmem_cache_reap_active(), which
exploits #9017's new taskq_empty().

Reviewed by: Bryan Cantrill <bryan@joyent.com>
Reviewed by: Dan McDonald <danmcd@joyent.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Yuri Pankov <yuripv@yuripv.net>
Author: Tim Kordas <tim.kordas@joyent.com>

FreeBSD does not use taskqueue for kmem caches reaping, so this change
is less dramatic then it is on Illumos, just limiting reaping to 1 time
per second. It may possibly be improved later, if needed.

MFC r329755: MFV r329753:
8809 libzpool should leverage work done in libfakekernel

illumos/illumos-gate@f06dce2c1f0f3af78581e7574f65bfba843ddb6e

Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Gordon Ross <gordon.w.ross@gmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Andrew Stormont <astormont@racktopsystems.com>

MFC r329732: MFV r329502: 7614 zfs device evacuation/removal

illumos/illumos-gate@5cabbc6b49070407fb9610cfe73d4c0e0dea3e77

https://www.illumos.org/issues/7614:
This project allows top-level vdevs to be removed from the storage pool with
“zpool remove”, reducing the total amount of storage in the pool. This
operation copies all allocated regions of the device to be removed onto other
devices, recording the mapping from old to new location. After the removal is
complete, read and free operations to the removed (now “indirect”) vdev must
be remapped and performed at the new location on disk. The indirect mapping
table is kept in memory whenever the pool is loaded, so there is minimal
performance overhead when doing operations on the indirect vdev.

The size of the in-memory mapping table will be reduced when its entries
become “obsolete” because they are no longer used by any block pointers in
the pool. An entry becomes obsolete when all the blocks that use it are
freed. An entry can also become obsolete when all the snapshots that
reference it are deleted, and the block pointers that reference it have been
“remapped” in all filesystems/zvols (and clones). Whenever an indirect block
is written, all the block pointers in it will be “remapped” to their new
(concrete) locations if possible. This process can be accelerated by using
the “zfs remap” command to proactively rewrite all indirect blocks that
reference indirect (removed) vdevs.

Note that when a device is removed, we do not verify the checksum of the data
that is copied. This makes the process much faster, but if it were used on
redundant vdevs (i.e. mirror or raidz vdevs), it would be possible to copy
the wrong data, when we have the correct data on e.g. the other side of the
mirror. Therefore, mirror and raidz devices can not be removed.

Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Richard Laager <rlaager@wiktel.com>
Reviewed by: Tim Chase <tim@chase2k.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Prashanth Sreenivasa <pks@delphix.com>

MFC r307317: MFV r307313:
5120 zfs should allow large block/gzip/raidz boot pool (loader project)

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Toomas Soome <tsoome@me.com>

openzfs/openzfs@c8811bd3e2427dddbac6c05a59cfe117d8fea370

FreeBSD still does not support booting from gzip-compressed datasets,
so keep one chunk of this commit out.

MFC r308137, r316312, r332361

r308137:
Fix alignment issues on MIPS: align the pointers properly.

All the 5520 GEOM_ELI tests passed successfully on MIPS64EB.

r316312:
sys/geom/eli: Switch bzero() to explicit_bzero() for sensitive data

In GELI, anywhere we are zeroing out possibly sensitive data, like
the metadata struct, the metadata sector (both contain the encrypted
master key), the user key, or the master key, use explicit_bzero.

Didn't touch the bzero() used to initialize structs.

r332361:
Introduce dry run option for attaching the device.
This will allow us to verify if passphrase and key is valid without
decrypting whole device.

MFC r323108, r323125, r326047-r326049

r323108:
Add efimedia attribute for all GPT partitions.

r323125:
The hard drive media device path contains the size of the partition,
not its end. This makes the GEOM efimedia attribute match the
FreeBSD:Boot1Device environment variable now.

r326047:
Implement efi media tagging for MBR partitioning types.

r326048:
Remove trailing whitespace (one I just introduced and a bunch of
others in the same directory).

r326049:
While the EFI spec allows numbers to be in many forms, libefivar
produces hex numbers for the dsn. Since that come is from EDK2, change
this for symmetry, by generating the dsn as a hex number.

[Missed as part of the efivar/efibootmgr MFCs]

Reported by: Oliver Pinter <oliver.pinter@hardenedbsd.org>

geom_aes: Provide some deprecation notices

This is a direct commit to stable/11, due to having already been removed in
head.

MFC r322318-r322319

r322318:
Mark geom classes as deprecated.

geom_bsd, geom_mbr and geom_sunlabel have been obsolete since Marcel
Moolenaar's geom_part was in FreeBSD 7. They haven't been in GENERIC
since FreeBSD 8. Add warning when used.

geom_vol_ffs has been obsolete since ufs support to geom_label was
committed in FreeBSD 5. It hasn't been in GENERIC since FreeBSD 5.
Add warning when used.

geom_fox has been obsolete since gmultipath was committed in FreeBSD 7.
(no warning added, since this is a very obscure class).

These will all be removed in FreeBSD 12.

r322319:
Also provide a warning for geom_fox.

MFC r330764
Add CR2 get/set support.

MFC r325261
Emulate the "OR reg, r/m" instruction (opcode 0BH).

This is needed for the HDA emulation with FreeBSD guests.

MFC r331436:

netpfil: Introduce PFIL_FWD flag

Forwarded packets passed through PFIL_OUT, which made it difficult for
firewalls to figure out if they were forwarding or producing packets. This in
turn is an issue for pf for IPv6 fragment handling: it needs to call
ip6_output() or ip6_forward() to handle the fragments. Figuring out which was
difficult (and until now, incorrect).
Having pfil distinguish the two removes an ugly piece of code from pf.

Introduce a new variant of the netpfil callbacks with a flags variable, which
has PFIL_FWD set for forwarded packets. This allows pf to reliably work out if
a packet is forwarded.

Fix mis-merge of r329507 in r331501

sus/modules/Makefile part of r329507 just removed ffec
while r331501 also added conditional clause for bcm283x_clkman
and bcm283x_pwm. Since they're part of another revision,
remove mi-merged chunk

MFC r332182:
Handle Skylake-X errata SKZ63.

MFC r331077 (brooks): Add _IOC_NEWLEN() and _IOC_NEWTYPE() macros.

These macros take an existing ioctl(2) command and replace the length
with the specified length or length of the specified type respectively.
These can be used to define commands for 32-bit compatibility with fewer
opportunities for cut-and-paste errors then a whole new definition.

Obtained from: CheriBSD
Sponsored by: DARPA, AFRL

MFC r332142:

pf: Improve ioctl validation

Ensure that multiplications for memory allocations cannot overflow, and
that we'll not try to allocate M_WAITOK for potentially overly large
allocations.

MFC r332107:

pf: Improve ioctl validation for DIOCRGETTABLES, DIOCRGETTSTATS, DIOCRCLRTSTATS and DIOCRSETTFLAGS

These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.

Limit the allocation to required size (or the user allocation, if that's
smaller). That does mean we need to do the allocation with the rules
lock held (so the number doesn't change while we're doing this), so it
can't M_WAITOK.

MFC r332088:

Add 32-bit compat for ioctls that take struct ifgroupreq.

Use an accessor to access ifgr_group and ifgr_groups.

Use an macro CASE_IOC_IFGROUPREQ(cmd) in place of case statements such
as "case SIOCAIFGROUP:". This avoids poluting the switch statements
with large numbers of #ifdefs.

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14960

MFC r332136:

pf: Improve ioctl validation for DIOCIGETIFACES and DIOCXCOMMIT

These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.

There's no obvious limit to the request size for these, so we limit the
requests to something which won't overflow. Change the memory allocation
to M_NOWAIT so excessive requests will fail rather than stall forever.

MFC r332101:

pf: Improve ioctl validation for DIOCRADDTABLES and DIOCRDELTABLES

The DIOCRADDTABLES and DIOCRDELTABLES ioctls can process a number of
tables at a time, and as such try to allocate <number of tables> *
sizeof(struct pfr_table). This multiplication can overflow. Thanks to
mallocarray() this is not exploitable, but an overflow does panic the
system.

Arbitrarily limit this to 65535 tables. pfctl only ever processes one
table at a time, so it presents no issues there.

Remove .info debugging output that accidentally got left in for MFC commit.

This was just an artifact of my testing to ensure the option had the
desired effect on freebsd 11, both when enabled and when disabled.

Reported by: Thomas Mueller <tmueller@sysgo>
Point hat: ian@

MFC r332372-r332374: tail(1)/head(1) compatibility long options

r332372:
tail(1): Add some long options

Add --blocks, --bytes, and --lines long options for -b, -c, and -n
respectively. This improves tail(1)'s compatibility with its GNU counterpart
in a straightforward way.

r332373:
tail(1): Address mandoc concern (space before punctuation after macro)

r332374:
head(1): Provide long options

Provide long options --bytes and --lines to match -c and -n respectively.
This improves head(1)'s compatibility with its GNU counterpart in a sensible
way.

Move 1-second spin into ixgbe_netmap_reg()

This should still work around the netmap issue, but should not impact other
calls to ixgbe_stop().

PR: 221317
Sponsored by: Limelight Networks

MFC: r332075

Exit with usage when extra arguments are on command line
preventing mistakes such as "halt 0p" for "halt -p".
Approved by: bde (mentor, implicit), phk (mentor,implicit)
MFC after: 1 week

MFC r319897-r319898, r319904: Improve yes' throughput

r319897: Improve yes' throughput

On my system, this brings up the throughput from ~20 to ~600 MiB/s.

Inspired by:
https://www.reddit.com/r/unix/comments/6gxduc/how_is_gnu_yes_so_fast/

r319898: Handle partial writes

r319904: style(9) fixes.

MFC r308432, r308657: Capsicumize some trivial stdio programs

r308432: Capsicumize some trivial stdio programs

Trivially capsicumize some simple programs that just interact with
stdio. This list of programs uses 'pledge("stdio")' in OpenBSD.

r308657: fold(1): Revert incorrect r308432

As Jean-Sébastien notes, fold(1) requires handling argv-supplied files. That
will require a slightly more sophisticated approach.

MFC r306758 (emaste): locate: ANSIfy

MFC r332145: Do not fail devices just for errors in descriptor format.

Sponsored by: iXsystems, Inc.

MFC r331758: makefs: sync fragment and block size with newfs

r222319 in newfs raised the default blocksize for UFS/FFS filesystems
from 16K to 32K and the default fragment size from 2K to 4K, with a
rationale that most disks were now running with 4K sectors.

Relnotes: Yes
Sponsored by: The FreeBSD Foundation

Work around netmap issue with ixgbe

After multiple start/stop of netmap, ixgbe will get into a bad state
requiring a reboot to recover. Adding a delay before stopping the interface
appears to work around the issue.

The -CURRENT driver has diverged too far from -STABLE for an MFC.

PR: 221317
Submitted by: Sylvain Galliano <sg@efficientip.com>
Reported by: Cassiano Peixoto <peixoto.cassiano@gmail.com>
Sponsored by: Limelight Networks

MFC r332061:
Fix ERESTART for lcall $7,$0 syscalls.

MFC r332060:
Make the INTO instruction operational in 32bit mode.

MFC 328101,328911: Require SHF_ALLOC for kernel object module sections.

328101:
Require the SHF_ALLOC flag for program sections from kernel object modules.

ELF object files can contain program sections which are not supposed
to be loaded into memory (e.g. .comment).  Normally the static linker
uses these flags to decide which sections are allocated to loadable
program segments in ELF binaries and shared objects (including kernels
on all architectures and kernel modules on architectures other than
amd64).

Mapping ELF object files (such as amd64 kernel modules) into memory
directly is a bit of a grey area.  ELF object files are intended to be
used as inputs to the static linker.  As a result, there is not a
standardized definition for what the memory layout of an ELF object
should be (none of the section headers have valid virtual memory
addresses for example).

The kernel and loader were not checking the SHF_ALLOC flag but loading
any program sections with certain types such as SHT_PROGBITS.  As a
result, the kernel and loader would load into RAM some sections that
weren't marked with SHF_ALLOC such as .comment that are not loaded
into RAM for kernel modules on other architectures (which are
implemented as ELF shared objects).  Aside from possibly requiring
slightly more RAM to hold a kernel module this does not affect runtime
correctness as the kernel relocates symbols based on the layout it
uses.

Debuggers such as gdb and lldb do not extract symbol tables from a
running process or kernel.  Instead, they replicate the memory layout
of ELF executables and shared objects and use that to construct their
own symbol tables.  For executables and shared objects this works
fine.  For ELF objects the current logic in kgdb (and probably lldb
based on a simple reading) assumes that only sections with SHF_ALLOC
are memory resident when constructing a memory layout.  If the
debugger constructs a different memory layout than the kernel, then it
will compute different addresses for symbols causing symbols in the
debugger to appear to have the wrong values (though the kernel itself
is working fine).  The current port of mdb does not check SHF_ALLOC as
it replicates the kernel's logic in its existing kernel support.

The bfd linker sorts the sections in ELF object files such that all of
the allocated sections (sections with SHF_ALLOCATED) are placed first
followed by unallocated sections.  As a result, when kgdb composed a
memory layout using only the allocated sections, this layout happened
to match the layout used by the kernel and loader.  The lld linker
does not sort the sections in ELF object files and mixed allocated and
unallocated sections.  This resulted in kgdb composing a different
memory layout than the kernel and loader.

We could either patch kgdb (and possibly in the future lldb) to use
custom handling when generating memory layouts for kernel modules that
are ELF objects, or we could change the kernel and loader to check
SHF_ALLOCATED.  I chose the latter as I feel we shouldn't be loading
things into RAM that the module won't use.  This should mostly be a
NOP when linking with bfd but will allow the existing kgdb to work
with amd64 kernel modules linked with lld.

Note that we only require SHF_ALLOC for "program" sections for types
like SHT_PROGBITS and SHT_NOBITS.  Other section types such as symbol
tables, string tables, and relocations must also be loaded and are not
marked with SHF_ALLOC.

328911:
Ignore relocation tables for non-memory-resident sections.

As a followup to r328101, ignore relocation tables for ELF object
sections that are not memory resident.  For modules loaded by the
loader, ignore relocation tables whose associated section was not
loaded by the loader (sh_addr is zero).  For modules loaded at runtime
via kldload(2), ignore relocation tables whose associated section is
not marked with SHF_ALLOC.

MFC r328988,r328989:
  Rework ipfw dynamic states implementation to be lockless on fast path.

  o added struct ipfw_dyn_info that keeps all needed for ipfw_chk and
    for dynamic states implementation information;
  o added DYN_LOOKUP_NEEDED() macro that can be used to determine the
    need of new lookup of dynamic states;
  o ipfw_dyn_rule now becomes obsolete. Currently it used to pass
    information from kernel to userland only.
  o IPv4 and IPv6 states now described by different structures
    dyn_ipv4_state and dyn_ipv6_state;
  o IPv6 scope zones support is added;
  o ipfw(4) now depends from Concurrency Kit;
  o states are linked with "entry" field using CK_SLIST. This allows
    lockless lookup and protected by mutex modifications.
  o the "expired" SLIST field is used for states expiring.
  o struct dyn_data is used to keep generic information for both IPv4
    and IPv6;
  o struct dyn_parent is used to keep O_LIMIT_PARENT information;
  o IPv4 and IPv6 states are stored in different hash tables;
  o O_LIMIT_PARENT states now are kept separately from O_LIMIT and
    O_KEEP_STATE states;
  o per-cpu dyn_hp pointers are used to implement hazard pointers and they
    prevent freeing states that are locklessly used by lookup threads;
  o mutexes to protect modification of lists in hash tables now kept in
    separate arrays. 65535 limit to maximum number of hash buckets now
    removed.
  o Separate lookup and install functions added for IPv4 and IPv6 states
    and for parent states.
  o By default now is used Jenkinks hash function.

  Obtained from: Yandex LLC
  Sponsored by: Yandex LLC
  Differential Revision: https://reviews.freebsd.org/D12685

MFC r331668:
  Rework ipfw rules parsing and printing code.

  Introduce show_state structure to keep information about printed opcodes.
  Split show_static_rule() function into several smaller functions. Make
  parsing and printing opcodes into several passes. Each printed opcode
  is marked in show_state structure and will be skipped in next passes.
  Now show_static_rule() function is simple, it just prints each part
  of rule separately: action, modifiers, proto, src and dst addresses,
  options. The main goal of this change is avoiding occurrence of wrong
  result of `ifpw show` command, that can not be parsed by ipfw(8).
  Also now it is possible to make some simple static optimizations
  by reordering of opcodes in the rule.

  PR: 222705

MFC r308490 by syrinx:

Reply to a snmpEngineID discovery PDU with a Report PDU as per the
requirements of RFC 3414 section 4.

PR: 174974
Submitted by: pguyot@kallisys.net