]> CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log
FreeBSD/FreeBSD.git
3 years agoCI checkstyle: add linter + rename job + install latest flake8
George Melikov [Mon, 24 Aug 2020 04:15:25 +0000 (07:15 +0300)]
CI checkstyle: add linter + rename job + install latest flake8

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #10784

3 years agoZFS performance tests should clean up NFS mount
Tony Nguyen [Sun, 23 Aug 2020 22:14:22 +0000 (16:14 -0600)]
ZFS performance tests should clean up NFS mount

This change umounts client's NFS mount after each test so we can avoid
two sporadic issues:
1) client NFS stale mount and
2) zpool export and zpool destroy failed due to dataset busy

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Nguyen <tony.nguyen@delphix.com>
Closes #10767

3 years agoAdd seperate issue for questions
Kjeld Schouten-Lebbing [Sun, 23 Aug 2020 22:13:07 +0000 (00:13 +0200)]
Add seperate issue for questions

A big portion of issues are of "Type: Question".

This PR adds a separate issue template for those.
It also automatically adds the "Type: Question" tag.

in addition it adds "Type: Defect" to all bug reports by default

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #10779

3 years agolibzstd: Don't warn about stack frame size in userspace
Ryan Moeller [Sun, 23 Aug 2020 18:13:34 +0000 (14:13 -0400)]
libzstd: Don't warn about stack frame size in userspace

With the current way CFLAGS are modified in libzstd, CFLAGS passed on
the make command line will cause the CFLAGS in the Makefile for zstd.c
to be discarded, but not AM_CFLAGS.  This causes a smaller frame size
limit to be used, and the build fails.

We don't need to worry about stack frame sizes in userspace.  Drop the
extra flags.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10773

3 years agoPrevent zfs_acl_chmod() if aclmode restricted and ACL inherited
Andrew [Sun, 23 Aug 2020 04:49:25 +0000 (00:49 -0400)]
Prevent zfs_acl_chmod() if aclmode restricted and ACL inherited

In absence of inheriting entry for owner@, group@, or everyone@,
zfs_acl_chmod() is called to set these. This can cause confusion for Samba
admins who do not expect these entries to appear on newly created files and
directories once they have been stripped from from the parent directory.

When aclmode is set to "restricted", chmod is prevented on non-trivial ACLs.
It is not a stretch to assume that in this case the administrator does not want
ZFS to add the missing special entries. Add check for this aclmode, and if an
inherited entry is present skip zfs_acl_chmod().

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Closes #10748

3 years agoUpdate issue template
Kjeld Schouten-Lebbing [Sun, 23 Aug 2020 04:41:01 +0000 (06:41 +0200)]
Update issue template

Github has started using a new issue templating structure.
This commit moves the current template and adds one additional one.

- Moves issue template to new issue-template folder
- Adds feature request template
- removes the following warning when viewing issue template

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #10759

3 years agoZTS: Remove leftover variable names
Ryan Moeller [Sat, 22 Aug 2020 18:05:59 +0000 (14:05 -0400)]
ZTS: Remove leftover variable names

These were overlooked when use of `local` was removed to satisfy
checkbashisms.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10762

3 years agoRemove vestigial settings related to initramfs
Chris McDonough [Sat, 22 Aug 2020 18:04:49 +0000 (14:04 -0400)]
Remove vestigial settings related to initramfs

Remove ZFS_POOL_IMPORT, ZFS_INITRD_PRE_MOUNTROOT_SLEEP,
ZFS_INITRD_POST_MODPROBE_SLEEP, and ZFS_INITRD_ADDITIONAL_DATASETS
features from etc/defaults/zfs.in.  These features no longer work.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Chris McDonough <chrism@plope.com>
Closes #9126
Closes #10757

3 years agoMake formatting of dedup values string consistent
Clint Armstrong [Sat, 22 Aug 2020 17:58:07 +0000 (13:58 -0400)]
Make formatting of dedup values string consistent

All other prop values return options separated by ` | `,
dedup values do not, they are separated by `, `. This change
makes the dedup value formatting consistent with other properties.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Clint Armstrong <clint@clintarmstrong.net>
Closes #10761

3 years agoImport vdev ashift optimization from FreeBSD
Ryan Moeller [Fri, 21 Aug 2020 19:53:17 +0000 (15:53 -0400)]
Import vdev ashift optimization from FreeBSD

Many modern devices use physical allocation units that are much
larger than the minimum logical allocation size accessible by
external commands. Two prevalent examples of this are 512e disk
drives (512b logical sector, 4K physical sector) and flash devices
(512b logical sector, 4K or larger allocation block size, and 128k
or larger erase block size). Operations that modify less than the
physical sector size result in a costly read-modify-write or garbage
collection sequence on these devices.

Simply exporting the true physical sector of the device to ZFS would
yield optimal performance, but has two serious drawbacks:

 1. Existing pools created with devices that have different logical
    and physical block sizes, but were configured to use the logical
    block size (e.g. because the OS version used for pool construction
    reported the logical block size instead of the physical block
    size) will suddenly find that the vdev allocation size has
    increased. This can be easily tolerated for active members of
    the array, but ZFS would prevent replacement of a vdev with
    another identical device because it now appears that the smaller
    allocation size required by the pool is not supported by the new
    device.

 2. The device's physical block size may be too large to be supported
    by ZFS. The optimal allocation size for the vdev may be quite
    large. For example, a RAID controller may export a vdev that
    requires read-modify-write cycles unless accessed using 64k
    aligned/sized requests. ZFS currently has an 8k minimum block
    size limit.

Reporting both the logical and physical allocation sizes for vdevs
solves these problems. A device may be used so long as the logical
block size is compatible with the configuration. By comparing the
logical and physical block sizes, new configurations can be optimized
and administrators can be notified of any existing pools that are
sub-optimal.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Matthew Macy <mmacy@freebsd.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10619

3 years agoRemove hard coded "Linux" OS from manpages
Ryan Moeller [Fri, 21 Aug 2020 18:55:47 +0000 (14:55 -0400)]
Remove hard coded "Linux" OS from manpages

The recommended practice for `.Os` on FreeBSD is to not specify any
arguments.  The correct OS name is used automatically.

Oddly enough, on the Linux distro I tested this on (CentOS 7), the man
pager defaulted to displaying "BSD" as the OS rather than "Linux".  To
accommodate this, tack " Linux" back on in an install hook on Linux.
This is much simpler than removing it for FreeBSD when vendored in the
base system.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10760

3 years agoSilence 'make checkbashisms'
Brian Behlendorf [Thu, 20 Aug 2020 20:45:47 +0000 (13:45 -0700)]
Silence 'make checkbashisms'

Commit d2bce6d03 added the 'make checkbashisms' target but did not
resolve all of the bashisms in the scripts.  This commit doesn't
resolve them all either but it does fix up a few, and it excludes
the others so 'make checkstyle' no longer prints warnings.  It's
a small step in the right direction.

* Dracut is Linux specific and itself depends on bash.  Therefore
  all dracut support scripts can be bash specific, update their
  shebang accordingly.

* zed-functions.sh, zfs-import, zfs-mount, zfs-zed, smart
  paxcheck.sh, make_gitrev.sh - these scripts were excuded from
  the check until they can be updated and properly tested.

* zfsunlock - only whole values for sleep are allowed.

* vdev_id - removed unneeded locals; use && instead of -a.

* dkms.mkconf, dkms.postbuil - use || instead of -o.

Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Reviewed-by: Gabriel A. Devenyi <gdevenyi@gmail.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10755

3 years ago'zfs share -a' should clean noauto exports
Don Brady [Thu, 20 Aug 2020 20:12:12 +0000 (14:12 -0600)]
'zfs share -a' should clean noauto exports

This is a follow on to PR #10688 where `zfs share -a` allows the
sharing of canmount=noauto datasets if they are mounted.  However,
when a dataset with canmount=noauto is not mounted, the command
should also purge any existing entries from the exports file.
Otherwise, after a reboot, the nfs server attempts to export the
underlying mountpath, not the dataset. This can lead to a hard hang
for existing client mounts.

Instead of just skipping the adding of an export if not mounted
and canmount=noauto, have it also remove an existing export of the
dataset so that, after a reboot, we don't export an unmounted dataset.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #10747

3 years agoFix indentation in dnode_free_range()
Matthew Ahrens [Thu, 20 Aug 2020 18:45:20 +0000 (11:45 -0700)]
Fix indentation in dnode_free_range()

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10744

3 years agoFreeBSD: 11.x arc_stats compatibility
Matthew Macy [Thu, 20 Aug 2020 17:55:02 +0000 (10:55 -0700)]
FreeBSD: 11.x arc_stats compatibility

Removing other_size from arc_stats breaks top in 11.x jails
running on HEAD.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10745

3 years agoAdd zstd support to zfs
Michael Niewöhner [Tue, 18 Aug 2020 17:10:17 +0000 (19:10 +0200)]
Add zstd support to zfs

This PR adds two new compression types, based on ZStandard:

- zstd: A basic ZStandard compression algorithm Available compression.
  Levels for zstd are zstd-1 through zstd-19, where the compression
  increases with every level, but speed decreases.

- zstd-fast: A faster version of the ZStandard compression algorithm
  zstd-fast is basically a "negative" level of zstd. The compression
  decreases with every level, but speed increases.

  Available compression levels for zstd-fast:
   - zstd-fast-1 through zstd-fast-10
   - zstd-fast-20 through zstd-fast-100 (in increments of 10)
   - zstd-fast-500 and zstd-fast-1000

For more information check the man page.

Implementation details:

Rather than treat each level of zstd as a different algorithm (as was
done historically with gzip), the block pointer `enum zio_compress`
value is simply zstd for all levels, including zstd-fast, since they all
use the same decompression function.

The compress= property (a 64bit unsigned integer) uses the lower 7 bits
to store the compression algorithm (matching the number of bits used in
a block pointer, as the 8th bit was borrowed for embedded block
pointers).  The upper bits are used to store the compression level.

It is necessary to be able to determine what compression level was used
when later reading a block back, so the concept used in LZ4, where the
first 32bits of the on-disk value are the size of the compressed data
(since the allocation is rounded up to the nearest ashift), was
extended, and we store the version of ZSTD and the level as well as the
compressed size. This value is returned when decompressing a block, so
that if the block needs to be recompressed (L2ARC, nop-write, etc), that
the same parameters will be used to result in the matching checksum.

All of the internal ZFS code ( `arc_buf_hdr_t`, `objset_t`,
`zio_prop_t`, etc.) uses the separated _compress and _complevel
variables.  Only the properties ZAP contains the combined/bit-shifted
value. The combined value is split when the compression_changed_cb()
callback is called, and sets both objset members (os_compress and
os_complevel).

The userspace tools all use the combined/bit-shifted value.

Additional notes:

zdb can now also decode the ZSTD compression header (flag -Z) and
inspect the size, version and compression level saved in that header.
For each record, if it is ZSTD compressed, the parameters of the decoded
compression header get printed.

ZSTD is included with all current tests and new tests are added
as-needed.

Per-dataset feature flags now get activated when the property is set.
If a compression algorithm requires a feature flag, zfs activates the
feature when the property is set, rather than waiting for the first
block to be born.  This is currently only used by zstd but can be
extended as needed.

Portions-Sponsored-By: The FreeBSD Foundation
Co-authored-by: Allan Jude <allanjude@freebsd.org>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Co-authored-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Co-authored-by: Michael Niewöhner <foss@mniewoehner.de>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #6247
Closes #9024
Closes #10277
Closes #10278

3 years agoImport ZStandard v1.4.5
Michael Niewöhner [Tue, 18 Aug 2020 17:10:10 +0000 (19:10 +0200)]
Import ZStandard v1.4.5

ZStandard is a modern, high performance, general compression algorithm.
It provides similar or better compression levels to GZIP, but with much
better performance. ZStandard provides a large selection of compression
levels to allow a storage administrator to select the preferred
performance/compression trade-off.

This commit imports the unmodified ZStandard single-file library which
will be used by ZFS.

The implementation of this new library is done with future updates of
zstd in mind. For this reason we integrated the code in a way, that does
not require modifications to the library. For more details, see
`module/zstd/README.md`.

The library is excluded from codecov calculation and cppcheck as
unaltered dependencies do not need full codecov or cppcheck.

Co-authored-by: Allan Jude <allanjude@freebsd.org>
Co-authored-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Co-authored-by: Michael Niewöhner <foss@mniewoehner.de>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
3 years agoLinux 5.7 compat: Include linux/sched.h in spl/sys/mutex.h
Pavel Snajdr [Thu, 20 Aug 2020 04:37:38 +0000 (06:37 +0200)]
Linux 5.7 compat: Include linux/sched.h in spl/sys/mutex.h

struct task_struct is needed for lockdep_off() in mutex.h

This has popped up after e616cb8daadf (in linux-5.7-rc7).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10741

3 years agoFreeBSD: Add option to rewind checkpoint while importing root pool
Mariusz Zaborski [Thu, 20 Aug 2020 00:19:42 +0000 (02:19 +0200)]
FreeBSD: Add option to rewind checkpoint while importing root pool

This option is used by FreeBSD boot loader.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mariusz Zaborski <oshogbo@vexillium.org>
Closes #10738

3 years agoZED: Do not offline a missing device if no spare is available
Brian Behlendorf [Wed, 19 Aug 2020 05:13:17 +0000 (22:13 -0700)]
ZED: Do not offline a missing device if no spare is available

Due to commit d48091d a removed device is now explicitly offlined by
the ZED if no spare is available, rather than the letting ZFS detect
it as UNAVAIL. This broke auto-replacing of whole-disk devices, as
described in issue #10577.  In short, when a new device is reinserted
in the same slot, the ZED will try to ONLINE it without letting ZFS
recreate the necessary partition table.

This change simply avoids setting the device OFFLINE when removed if
no spare is available (or if spare_on_remove is false).  This change
has been left minimal to allow it to be backported to 0.8.x release.
The auto_offline_001_pos ZTS test has been updated accordingly.

Some follow up work is planned to update the ZED so it transitions
the vdev to a REMOVED state.  This is a state which has always
existed but there is no current interface the ZED can use to
accomplish this.  Therefore it's being left to a follow up PR.

Reviewed-by: Gionatan Danti <g.danti@assyoma.it>
Co-authored-by: Gionatan Danti <g.danti@assyoma.it>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10577
Closes #10730

3 years agoFix ARC aggsum access after arc_state_fini()
Brian Behlendorf [Wed, 19 Aug 2020 05:11:34 +0000 (22:11 -0700)]
Fix ARC aggsum access after arc_state_fini()

Commit 85ec5cbae updated abd_update_scatter_stats() such that it
calls arc_space_consume() and arc_space_return() when updating the
scatter stats.  This requires that the global aggsum value for the
ARC be initialized.  Normally this is not an issue, however during
module unload the l2arc_do_free_on_write() function was called in
l2arc_cleanup() after arc_state_fini() destroyed the aggsum values.
We can resolve this issue by performing l2arc_do_free_on_write()
slightly earlier in arc_fini().

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10739

3 years agolibzfs_core: Initialize fail_ioc_cmd to ZFS_IOC_LAST
Ryan Moeller [Wed, 19 Aug 2020 01:07:43 +0000 (21:07 -0400)]
libzfs_core: Initialize fail_ioc_cmd to ZFS_IOC_LAST

FreeBSD numbers `ZFS_IOC_*` starting at 0, so pick a different
sentinel value to avoid unintentionally messing with
`ZFS_IOC_POOL_CREATE` ioctls.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10729

3 years agoFreeBSD: Fix UNIX permissions checking
Matthew Macy [Tue, 18 Aug 2020 16:57:07 +0000 (09:57 -0700)]
FreeBSD: Fix UNIX permissions checking

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10727

3 years agoAdd define to enable autotrim to default to on
Matthew Macy [Tue, 18 Aug 2020 16:52:30 +0000 (09:52 -0700)]
Add define to enable autotrim to default to on

In FreeBSD trim has defaulted to on for several
years. In order to minimize POLA violations on
import it's important to maintain this default
when importing vendored openzfs in to FreeBSD
base.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10719

3 years agoMake zc_nvlist_src_size limit tunable
Ryan Moeller [Tue, 18 Aug 2020 16:33:55 +0000 (12:33 -0400)]
Make zc_nvlist_src_size limit tunable

We limit the size of nvlists passed to the kernel so a user cannot make
the kernel do an unreasonably large allocation.  On FreeBSD this limit
was 128 kiB, which turns out to be a bit too small when doing some
operations involving a large number of datasets or snapshots, for
example replication.

Make this limit tunable, with a platform-specific auto default.
Linux keeps its limit at KMALLOC_MAX_SIZE. FreeBSD uses 1/4 of the
system limit on user wired memory, which allows it to scale depending
on system configuration.

Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Issue #6572
Closes #10706

3 years agoRemove unused `zpool_is_bootable`
George Melikov [Tue, 18 Aug 2020 16:30:12 +0000 (19:30 +0300)]
Remove unused `zpool_is_bootable`

Otherwise compiler errors with:

```
libzfs_pool.c:449:1: error: 'zpool_is_bootable'
 defined but not used [-Werror=unused-function]
```

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #10734

3 years agoRemove GRUB restrictions
Richard Laager [Tue, 18 Aug 2020 06:12:39 +0000 (01:12 -0500)]
Remove GRUB restrictions

The GRUB restrictions are based around the pool's bootfs property.
Given the current situation where GRUB is not staying current with
OpenZFS pool features, having either a non-ZFS /boot or a separate
pool with limited features are pretty much the only long-term answers
for GRUB support.  Only the second case matters in this context.  For
the restrictions to be useful, the bootfs property would have to be set
on the boot pool, because that is where we need the restrictions, as
that is the pool that GRUB reads from. The documentation for bootfs
describes it as pointing to the root pool. That's also how it's used in
the initramfs. ZFS does not allow setting bootfs to point to a dataset
in another pool. (If it did, it'd be difficult-to-impossible to enforce
these restrictions cross-pool). Accordingly, bootfs is pretty much
useless for GRUB scenarios moving forward.

Even for users who have only one pool, the existing restrictions for
GRUB are incomplete. They don't prevent you from enabling the
unsupported checksums, for example. For that reason, I have ripped out
all the GRUB restrictions.

A little longer-term, I think extending the proposed features=portable
system to define a features=grub is a much more useful approach. The
user could set that on the boot pool at creation, and things would
Just Work.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #8627

3 years agoZTS: ztest may cause mmp tests failures
Brian Behlendorf [Tue, 18 Aug 2020 05:31:18 +0000 (22:31 -0700)]
ZTS: ztest may cause mmp tests failures

The mmp_exported_import and mmp_inactive_import tests depend on
ztest simulating an active pool.  If ztest unexpectedly terminates
due to an unrelated issue the test case will fail.  Since ztest is
not yet 100% reliable I've added these tests to the maybe exception
list.  They can be removed when the issues with ztest are resolved
or if the test cases are updated to handle these unexpected failures.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10726

3 years agoInclude scatter_chunk_waste in arc_size
Matthew Ahrens [Tue, 18 Aug 2020 03:04:04 +0000 (20:04 -0700)]
Include scatter_chunk_waste in arc_size

The ARC caches data in scatter ABD's, which are collections of pages,
which are typically 4K.  Therefore, the space used to cache each block
is rounded up to a multiple of 4K.  The ABD subsystem tracks this wasted
memory in the `scatter_chunk_waste` kstat.  However, the ARC's `size` is
not aware of the memory used by this round-up, it only accounts for the
size that it requested from the ABD subsystem.

Therefore, the ARC is effectively using more memory than it is aware of,
due to the `scatter_chunk_waste`.  This impacts observability, e.g.
`arcstat` will show that the ARC is using less memory than it
effectively is.  It also impacts how the ARC responds to memory
pressure.  As the amount of `scatter_chunk_waste` changes, it appears to
the ARC as memory pressure, so it needs to resize `arc_c`.

If the sector size (`1<<ashift`) is the same as the page size (or
larger), there won't be any waste.  If the (compressed) block size is
relatively large compared to the page size, the amount of
`scatter_chunk_waste` will be small, so the problematic effects are
minimal.

However, if using 512B sectors (`ashift=9`), and the (compressed) block
size is small (e.g. `compression=on` with the default `volblocksize=8k`
or a decreased `recordsize`), the amount of `scatter_chunk_waste` can be
very large.  On a production system, with `arc_size` at a constant 50%
of memory, `scatter_chunk_waste` has been been observed to be 10-30% of
memory.

This commit adds `scatter_chunk_waste` to `arc_size`, and adds a new
`waste` field to `arcstat`.  As a result, the ARC's memory usage is more
observable, and `arc_c` does not need to be adjusted as frequently.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10701

3 years agoRemove KMC_KMEM and KMC_VMEM
Matthew Ahrens [Mon, 17 Aug 2020 23:04:28 +0000 (16:04 -0700)]
Remove KMC_KMEM and KMC_VMEM

`KMC_KMEM` and `KMC_VMEM` are now unused since all SPL-implemented
caches are `KMC_KVMEM`.

KMC_KMEM: Given the default value of `spl_kmem_cache_kmem_limit`, we
don't use kmalloc to back the SPL caches, instead we use kvmalloc
(KMC_KVMEM).  The flag, module parameter, /proc entries, and associated
code are removed.

KMC_VMEM: This flag is not used, and kvmalloc() is always preferable to
vmalloc().  The flag, /proc entries, and associated code are removed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10673

3 years agoFreeBSD: fix the build with Clang 11
Ryan Moeller [Mon, 17 Aug 2020 22:40:17 +0000 (18:40 -0400)]
FreeBSD: fix the build with Clang 11

* Cast void * to uintptr_t before casting to boolean_t.

* Avoid clashing definition of __asm when not on Linux to
  prevent duplicate __volatile__. This was already done in
  some places but not all.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10723

3 years agoFreeBSD: fix merge error in zfs_acl_ids_create
Matthew Macy [Mon, 17 Aug 2020 22:28:03 +0000 (15:28 -0700)]
FreeBSD: fix merge error in zfs_acl_ids_create

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10721

3 years agoFix typo in btree.c
Serapheim Dimitropoulos [Mon, 17 Aug 2020 22:25:37 +0000 (15:25 -0700)]
Fix typo in btree.c

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #10725

3 years agoFreeBSD: fallback to /boot/ to look for zpool.cache
Matthew Macy [Mon, 17 Aug 2020 21:43:47 +0000 (14:43 -0700)]
FreeBSD: fallback to /boot/ to look for zpool.cache

Up until now zpool.cache has always lived in /boot on FreeBSD.
For the sake of compatibility fallback to /boot if zpool.cache
isn't found in /etc/zfs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10720

3 years agoFix reporting of L2ARC writes in arc_summary3
George Amanakis [Mon, 17 Aug 2020 18:04:06 +0000 (14:04 -0400)]
Fix reporting of L2ARC writes in arc_summary3

arc_summary3 reports L2ARC writes in bytes. However, the related
arc_stat is reported as hits. arc_summary2 report this correctly.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10717

3 years agoFix l2arc_dev_rebuild_start thread name
Ryan Moeller [Mon, 17 Aug 2020 18:02:32 +0000 (14:02 -0400)]
Fix l2arc_dev_rebuild_start thread name

`thread_create` on FreeBSD stringifies the argument passed as the
thread function to create a name for the thread. The thread name for
`l2arc_dev_rebuild_start` ended up with `(void (*)(void *))` in it.

Change the type signature so the function does not need to be cast
when creating the thread.  Rename the function to
`l2arc_dev_rebuild_thread` for clarity and consistency, as well.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10716

3 years agoFreeBSD: Create taskq threads in appropriate proc
Ryan Moeller [Mon, 17 Aug 2020 18:01:19 +0000 (14:01 -0400)]
FreeBSD: Create taskq threads in appropriate proc

Stepping stone toward re-enabling spa_thread on FreeBSD.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10715

3 years agoFix L2ARC reads when compressed ARC disabled
Allan Jude [Fri, 14 Aug 2020 06:31:20 +0000 (02:31 -0400)]
Fix L2ARC reads when compressed ARC disabled

When reading compressed blocks from the L2ARC, with
compressed ARC disabled, arc_hdr_size() returns
LSIZE rather than PSIZE, but the actual read is PSIZE.
This causes l2arc_read_done() to compare the checksum
against the wrong size, resulting in checksum failure.

This manifests as an increase in the kstat l2_cksum_bad
and the read being retried from the main pool, making the
L2ARC ineffective.

Add new L2ARC tests with Compressed ARC enabled/disabled

Blocks are handled differently depending on the state of the
zfs_compressed_arc_enabled tunable.

If a block is compressed on-disk, and compressed_arc is enabled:
- the block is read from disk
- It is NOT decompressed
- It is added to the ARC in its compressed form
- l2arc_write_buffers() may write it to the L2ARC (as is)
- l2arc_read_done() compares the checksum to the BP (compressed)

However, if compressed_arc is disabled:
- the block is read from disk
- It is decompressed
- It is added to the ARC (uncompressed)
- l2arc_write_buffers() will use l2arc_apply_transforms() to
  recompress the block, before writing it to the L2ARC
- l2arc_read_done() compares the checksum to the BP (compressed)
- l2arc_read_done() will use l2arc_untransform() to uncompress it

This test writes out a test file to a pool consisting of one disk
and one cache device, then randomly reads from it. Since the arc_max
in the tests is low, this will feed the L2ARC, and result in reads
from the L2ARC.

We compare the value of the kstat l2_cksum_bad before and after
to determine if any blocks failed to survive the trip through the
L2ARC.

Sponsored-by: The FreeBSD Foundation
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Closes #10693

3 years agoRelease onexit/events with any missed zfsdev_state
Jorgen Lundman [Thu, 13 Aug 2020 22:03:23 +0000 (07:03 +0900)]
Release onexit/events with any missed zfsdev_state

Linux and FreeBSD will most likely never see this issue.
On macOS when kext is unloaded, but zed is still connected, zed
will be issued ENODEV. As the cdevsw is released, the kernel
will not have zfsdev_release() called to release minor/onexit/events,
and it "leaks". This ensures it is cleaned up before unload.

Changed the for loop from zsprev, to zsnext style, for less
code duplication.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #10700

3 years agoGithub workflow: checkstyle
George Melikov [Wed, 12 Aug 2020 17:46:26 +0000 (20:46 +0300)]
Github workflow: checkstyle

Use github workflow to run checkstyle
- use free (for OS projects) resources
- starts for every commit and branch
- work on forks, contributors may use it
  before creating PRs

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #10705

3 years agocstyle.pl: echo commands for github workflow
George Melikov [Wed, 12 Aug 2020 17:45:50 +0000 (20:45 +0300)]
cstyle.pl: echo commands for github workflow

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #10705

3 years agoRemove stale .travis.yml
George Melikov [Thu, 13 Aug 2020 21:55:45 +0000 (00:55 +0300)]
Remove stale .travis.yml

- It doesn't work now.
- It has to be manually edited on tests changes.
  (even on test runtime changes!)
- Travis gives too small time to run to be useful.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #10704

3 years agoUse zfs_dbgmsg to log metaslab_load/unload
Matthew Ahrens [Wed, 12 Aug 2020 17:10:50 +0000 (10:10 -0700)]
Use zfs_dbgmsg to log metaslab_load/unload

Metaslabs are now (usually) loaded and unloaded infrequently, but when
that is not the case, it is useful to have a log of when and why these
events happened.

This commit enables the zfs_dbgmsg() in metaslab_load(), and adds a
zfs_dbgmsg() in metaslab_unload().

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10683

3 years agoRestore ARC MFU/MRU pressure
Matthew Macy [Wed, 12 Aug 2020 17:03:24 +0000 (10:03 -0700)]
Restore ARC MFU/MRU pressure

The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10548
Closes #10618

3 years ago'zfs share -a' should handle 'canmount=noauto'
George Wilson [Tue, 11 Aug 2020 20:55:04 +0000 (14:55 -0600)]
'zfs share -a' should handle 'canmount=noauto'

The 'zfs share -a' currently skips any filesystems which
have 'canmount=noauto' set. This behavior is unexpected since the
one would expect 'zfs share -a' to share any mounted filesystem
that has the 'sharenfs' property already set.

This changes the behavior of 'zfs share -a' to allow the sharing
of 'canmount=noauto' datasets if they are mounted.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-issue: DLPX-71313
Closes #10688

3 years agoFreeBSD: Fix module autoloading when built in base
Matthew Macy [Tue, 11 Aug 2020 20:49:50 +0000 (13:49 -0700)]
FreeBSD: Fix module autoloading when built in base

The KMOD name is "zfs" instead of "openzfs" when building in FreeBSD.

Define a ZFS_KMOD symbol as "zfs" when IN_BASE is defined, otherwise
"openzfs".

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10699

3 years agoLinux 5.9 compat: make_request_fn replaced with submit_bio interface
Coleman Kane [Sun, 9 Aug 2020 16:12:25 +0000 (12:12 -0400)]
Linux 5.9 compat: make_request_fn replaced with submit_bio interface

The make_request_fn and associated API was replaced recently in a
Linux 5.9 merge, to replace its functionality with a new submit_bio
member in struct block_device_operations.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #10696

3 years agoLinux 5.9 compat: Update NR_SLAB_RECLAIMABLE to NR_SLAB_RECLAIMABLE_B
Coleman Kane [Sun, 9 Aug 2020 16:07:49 +0000 (12:07 -0400)]
Linux 5.9 compat: Update NR_SLAB_RECLAIMABLE to NR_SLAB_RECLAIMABLE_B

This change appears to primarily be a name change for the enum. Had
to update the test logic so that it works so long as either one of
these is present (favoring the newer one). Additionally, as this is
newer, it only shows up in node_page_item, so this commit doesn't
test zone_page_item for the same enum.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #10696

3 years agoLinux 5.9 compat: add linux/blkdev.h include
Coleman Kane [Sun, 9 Aug 2020 16:03:03 +0000 (12:03 -0400)]
Linux 5.9 compat: add linux/blkdev.h include

Many of the block device operations (often functions with bdev in
the name) were moved into linux/blkdev.h from linux/fs.h. Seems
that this header is already included where needed in the code, but
in the autoconf tests it was missing causing false negatives. This
commit has those tests include linux/fs.h (old location) and now
also linux/blkdev.h (new locations).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #10696

3 years agoFix typo
Allan Jude [Tue, 11 Aug 2020 20:16:57 +0000 (16:16 -0400)]
Fix typo

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Closes #10694

3 years agoMove ZVOL_DIR back to zfs.h
Ryan Moeller [Tue, 11 Aug 2020 20:12:12 +0000 (16:12 -0400)]
Move ZVOL_DIR back to zfs.h

This was previously moved because nothing else in-tree uses it, but
evidently DilOS uses it out of tree.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Ryan Moeller <freqlabs@freebsd.org>
Closes #10361
Closes #10685

3 years agoFreeBSD: update vaccess signature on most recent HEAD
Matthew Macy [Fri, 7 Aug 2020 21:16:01 +0000 (14:16 -0700)]
FreeBSD: update vaccess signature on most recent HEAD

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10682

3 years agoClarify error message when a range-tree double-add occurs
Paul Dagnelie [Fri, 7 Aug 2020 21:13:13 +0000 (14:13 -0700)]
Clarify error message when a range-tree double-add occurs

In various other pieces of logic have resulted in situations where
we double-free space in ZFS. This in turn results in a double-add
to the range trees. These issues have been much more difficult to
diagnose than they should have been, because the error handling
around this case is much weaker than around the double remove case.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #10654

3 years agoZTS: Remove bashisms from zfs-tests.sh
Ryan Moeller [Fri, 7 Aug 2020 21:10:48 +0000 (17:10 -0400)]
ZTS: Remove bashisms from zfs-tests.sh

Bring zfs-tests.sh in to compliance with the other scripts
by converting it /bin/sh for to avoid a dependency on bash.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10640

3 years agoRemove commented-out code
Matthew Ahrens [Tue, 4 Aug 2020 18:36:53 +0000 (11:36 -0700)]
Remove commented-out code

Remove dead code to make the implementation easier to understand.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes #10650

3 years agoRemove KM_NODEBUG
Matthew Ahrens [Thu, 30 Jul 2020 20:59:07 +0000 (13:59 -0700)]
Remove KM_NODEBUG

Remove dead code to make the implementation easier to understand.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes #10650

3 years agoRemove KMC_NOMAGAZINE
Matthew Ahrens [Thu, 30 Jul 2020 20:56:00 +0000 (13:56 -0700)]
Remove KMC_NOMAGAZINE

Remove dead code to make the implementation easier to understand.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes #10650

3 years agoRemove KMC_QCACHE
Matthew Ahrens [Thu, 30 Jul 2020 20:51:31 +0000 (13:51 -0700)]
Remove KMC_QCACHE

Remove dead code to make the implementation easier to understand.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes #10650

3 years agoRemove KMC_NOHASH
Matthew Ahrens [Thu, 30 Jul 2020 20:46:32 +0000 (13:46 -0700)]
Remove KMC_NOHASH

Remove dead code to make the implementation easier to understand.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes #10650

3 years agoRemove KMC_NOTOUCH
Matthew Ahrens [Thu, 30 Jul 2020 20:43:18 +0000 (13:43 -0700)]
Remove KMC_NOTOUCH

Remove dead code to make the implementation easier to understand.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes #10650

3 years agoRemove KMC_OFFSLAB
Matthew Ahrens [Thu, 30 Jul 2020 05:03:23 +0000 (22:03 -0700)]
Remove KMC_OFFSLAB

Remove dead code to make the implementation easier to understand.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes #10650

3 years agoFix i/o error handling of livelists and zap iteration
Matthew Ahrens [Wed, 5 Aug 2020 17:22:09 +0000 (10:22 -0700)]
Fix i/o error handling of livelists and zap iteration

Pool-wide metadata is stored in the MOS (Meta Object Set).  This
metadata is stored in triplicate, in addition to any pool-level
reduncancy (e.g. RAIDZ).  However, if all 3+ copies of this metadata are
not available, we can still get EIO/ECKSUM when reading from the MOS.
If we encounter such an error in syncing context, we have typically
already committed to making a change that we now can't do because of the
corrupt/missing metadata.  We typically "handle" this with a `VERIFY()`
or `zfs_panic_recover()`.  This prevents the system from continuing on
in an undefined state, while minimizing the amount of error-handling
code.

However, there are some code paths that ignore these i/o errors, or
`ASSERT()` that they don't happen.  Since assertions are disabled on
non-debug builds, they effectively ignore them as well.  This can lead
to ZFS continuing on in an incorrect state, potentially leading to
on-disk inconsistencies.

This commit adds handling for these i/o errors on MOS metadata,
typically with a `VERIFY()`:

* Handle error return from `zap_cursor_retrieve()` in 4 places in
`dsl_deadlist.c`.

* Handle error return from `zap_contains()` in `dsl_dir_hold_obj()`.
Turns out this call isn't necessary because we can always call
`zap_lookup()`.

* Handle error return from `zap_lookup()` in `dsl_fs_ss_limit_check()`.

* Handle error return from `zap_remove()` in `dsl_dir_rename_sync()`.

* Handle error return from `zap_lookup()` in
`dsl_dir_remove_livelist()`.

* Handle error return from `dsl_process_sub_livelist()` in
`spa_livelist_delete_cb()`.

Additionally:

* Augment the internal history log message for `zfs destroy` to note
which method is used (e.g. bptree, livelist, or, synchronous) and the
mintxg.

* Correct a comment in `dbuf_init()`.

* Correct indentation in `dsl_dir_remove_livelist()`.

Reviewed by: Sara Hartse <sara.hartse@delphix.com>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10643

3 years agoFreeBSD: Add support for lockless lookup
Matthew Macy [Wed, 5 Aug 2020 17:19:51 +0000 (10:19 -0700)]
FreeBSD: Add support for lockless lookup

Authored-by: mjg <mjg@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10657

3 years agoAdd missed thread_exit() to vdev_{autotrim,rebuild}_thread
Matthew Macy [Wed, 5 Aug 2020 17:17:07 +0000 (10:17 -0700)]
Add missed thread_exit() to vdev_{autotrim,rebuild}_thread

Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10668

3 years agoFix arc__wait__for__eviction tracepoint
Pavel Snajdr [Tue, 4 Aug 2020 17:04:00 +0000 (19:04 +0200)]
Fix arc__wait__for__eviction tracepoint

3442c2a02d added new `arc_wait_for_eviction` tracepoint, which fails to
compile, when tracepoints are enabled.

The tracepoint definition begins with `DEFINE_ARC_WAIT_FOR_EVICTION_EVENT`
and is a multi-line definition, so this fixes the backslash
and parenthesis accordingly.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10669

3 years agoVerify zfs module loaded before starting services
Jonathon [Sun, 2 Aug 2020 00:13:15 +0000 (00:13 +0000)]
Verify zfs module loaded before starting services

This is a minor change to the systemd service templates that verifies
the zfs kernel module is loaded by the kernel prior to attempting to
import any zpool.

The services check for the presence of /sys/module/zfs which indicates
the zfs is module is loaded. This uses the systemd built-in check
ConditionPathIsDirectory.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Thode <prometheanfire@gentoo.org>
Signed-off-by: Jonathon Fernyhough <jonathon.fernyhough@york.ac.uk>
Closes #10663

3 years agoFix logging in l2arc_rebuild()
George Amanakis [Sat, 1 Aug 2020 18:17:18 +0000 (14:17 -0400)]
Fix logging in l2arc_rebuild()

In case the L2ARC rebuild was canceled, do not log to spa history
log as the pool may be in the process of being removed and a panic
may occur:

BUG: kernel NULL pointer dereference, address: 0000000000000018
RIP: 0010:spa_history_log_internal+0xb1/0x120 [zfs]
Call Trace:
 l2arc_rebuild+0x464/0x7c0 [zfs]
 l2arc_dev_rebuild_start+0x2d/0x130 [zfs]
 ? l2arc_rebuild+0x7c0/0x7c0 [zfs]
 thread_generic_wrapper+0x78/0xb0 [spl]
 kthread+0xfb/0x130
 ? IS_ERR+0x10/0x10 [spl]
 ? kthread_park+0x90/0x90
 ret_from_fork+0x35/0x40

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10659

3 years agoFreeBSD: Fix `zfs jail` and add a test
Ryan Moeller [Sat, 1 Aug 2020 15:44:54 +0000 (11:44 -0400)]
FreeBSD: Fix `zfs jail` and add a test

zfs_jail was not using zfs_ioctl so failed to map the IOC number
correctly.  Use zfs_ioctl to perform the jail ioctl and add a test
case for FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10658

3 years agoFix page fault in zfsctl_snapdir_getattr
Matthew Macy [Sat, 1 Aug 2020 15:42:55 +0000 (08:42 -0700)]
Fix page fault in zfsctl_snapdir_getattr

Must acquire the z_teardown_lock before accessing the zfsvfs_t object.
I can't reproduce this panic on demand, but this looks like the
correct solution.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Authored-by: asomers <asomers@FreeBSD.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10656

3 years agoChange the error handling for invalid property values
Allan Jude [Sat, 1 Aug 2020 15:41:31 +0000 (11:41 -0400)]
Change the error handling for invalid property values

ZFS recv should return a useful error message when an invalid index
property value is provided in the send stream properties nvlist

With a compression= property outside of the understood range:

Before:
```
receiving full stream of zof/zstd_send@send2 into testpool/recv@send2
internal error: Invalid argument
Aborted (core dumped)
```
Note: the recv completes successfully, the abort() is likely just to
make it easier to track the unexpected error code.

After:
```
receiving full stream of zof/zstd_send@send2 into testpool/recv@send2
cannot receive compression property on testpool/recv: invalid property
value received 28.9M stream in 1 seconds (28.9M/sec)
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #10631

3 years agoChanges to make openzfs build within FreeBSD buildworld
Matthew Macy [Sat, 1 Aug 2020 04:30:31 +0000 (21:30 -0700)]
Changes to make openzfs build within FreeBSD buildworld

A collection of header changes to enable FreeBSD to build
with vendored OpenZFS.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10635

3 years agoConvert Linux-isms to FreeBSD-isms in platform zfs_debug.c
Ryan Moeller [Sat, 1 Aug 2020 04:25:35 +0000 (00:25 -0400)]
Convert Linux-isms to FreeBSD-isms in platform zfs_debug.c

Change some comments copied from the Linux code to describe
the appropriate methods on FreeBSD.

Convert some tunables to ZFS_MODULE_PARAM so they get created
on FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10647

3 years agoZTS: FreeBSD does have a l2arc.trim_ahead tunable
Ryan Moeller [Sat, 1 Aug 2020 04:18:32 +0000 (00:18 -0400)]
ZTS: FreeBSD does have a l2arc.trim_ahead tunable

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10633

3 years agoRevise ARC shrinker algorithm
Matthew Ahrens [Sat, 1 Aug 2020 04:10:52 +0000 (21:10 -0700)]
Revise ARC shrinker algorithm

The ARC shrinker callback `arc_shrinker_count/_scan()` is invoked by the
kernel's shrinker mechanism when the system is running low on free
pages.  This happens via 2 code paths:

1. "direct reclaim": The system is attempting to allocate a page, but we
are low on memory.  The ARC shrinker callback is invoked from the
page-allocation code path.

2. "indirect reclaim": kswapd notices that there aren't many free pages,
so it invokes the ARC shrinker callback.

In both cases, the kernel's shrinker code requests that the ARC shrinker
callback release some of its cache, and then it measures how many pages
were released.  However, it's measurement of released pages does not
include pages that are freed via `__free_pages()`, which is how the ARC
releases memory (via `abd_free_chunks()`).  Rather, the kernel shrinker
code is looking for pages to be placed on the lists of reclaimable pages
(which is separate from actually-free pages).

Because the kernel shrinker code doesn't detect that the ARC has
released pages, it may call the ARC shrinker callback many times,
resulting in the ARC "collapsing" down to `arc_c_min`.  This has several
negative impacts:

1. ZFS doesn't use RAM to cache data effectively.

2. In the direct reclaim case, a single page allocation may wait a long
time (e.g. more than a minute) while we evict the entire ARC.

3. Even with the improvements made in 67c0f0dedc5 ("ARC shrinking blocks
reads/writes"), occasionally `arc_size` may stay above `arc_c` for the
entire time of the ARC collapse, thus blocking ZFS read/write operations
in `arc_get_data_impl()`.

To address these issues, this commit limits the ways that the ARC
shrinker callback can be used by the kernel shrinker code, and mitigates
the impact of arc_is_overflowing() on ZFS read/write operations.

With this commit:

1. We limit the amount of data that can be reclaimed from the ARC via
the "direct reclaim" shrinker.  This limits the amount of time it takes
to allocate a single page.

2. We do not allow the ARC to shrink via kswapd (indirect reclaim).
Instead we rely on `arc_evict_zthr` to monitor free memory and reduce
the ARC target size to keep sufficient free memory in the system.  Note
that we can't simply rely on limiting the amount that we reclaim at once
(as for the direct reclaim case), because kswapd's "boosted" logic can
invoke the callback an unlimited number of times (see
`balance_pgdat()`).

3. When `arc_is_overflowing()` and we want to allocate memory,
`arc_get_data_impl()` will wait only for a multiple of the requested
amount of data to be evicted, rather than waiting for the ARC to no
longer be overflowing.  This allows ZFS reads/writes to make progress
even while the ARC is overflowing, while also ensuring that the eviction
thread makes progress towards reducing the total amount of memory used
by the ARC.

4. The amount of memory that the ARC always tries to keep free for the
rest of the system, `arc_sys_free` is increased.

5. Now that the shrinker callback is able to provide feedback to the
kernel's shrinker code about our progress, we can safely enable
the kswapd hook. This will allow the arc to receive notifications
when memory pressure is first detected by the kernel. We also
re-enable the appropriate kstats to track these callbacks.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10600

3 years agoZTS: zvol_misc_volmode is flaky on FreeBSD
Ryan Moeller [Sat, 1 Aug 2020 04:05:55 +0000 (00:05 -0400)]
ZTS: zvol_misc_volmode is flaky on FreeBSD

Mark this as a known issue.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10655

3 years agoZTS: Use POSIX-compatible space character class
Ryan Moeller [Sat, 1 Aug 2020 01:11:21 +0000 (21:11 -0400)]
ZTS: Use POSIX-compatible space character class

FreeBSD recently integrated a change which causes \s in a regex to
throw an error instead of silently being misinterpreted as an s.

Change the regex in zpool_colors.ksh to use [[:space:]].

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10651

3 years agolua: Increase reserved stack space for FreeBSD in debug config
Ryan Moeller [Fri, 31 Jul 2020 16:17:37 +0000 (12:17 -0400)]
lua: Increase reserved stack space for FreeBSD in debug config

FreeBSD uses more stack space in debug configurations and can overflow
the stack while formatting the error message when the call depth limit
of 20 frames is reached.  This is readily reproduced by running the
gsub recursion test with increased kstack size.  I hit the panic with
16 pages per kstack, and noticed it go away when bumped to 17.

Reserve an additional 64 bytes on the stack when building for FreeBSD.
This is enough to avoid the panic with a deep stack while not wasting
too much space when the default stack size is used.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10634

3 years agoWhen encountering EZFS_UNKNOWN, print the error text buffer anyway
Allan Jude [Fri, 31 Jul 2020 16:07:37 +0000 (12:07 -0400)]
When encountering EZFS_UNKNOWN, print the error text buffer anyway

Rather than just saying there was an internal error, provide any
context we might have to the user to help them understand the issue.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #10632

3 years agoRemove duplicate include of sys/zfeature.h in dmu_objset.c
Allan Jude [Fri, 31 Jul 2020 16:04:45 +0000 (12:04 -0400)]
Remove duplicate include of sys/zfeature.h in dmu_objset.c

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #10636

3 years agopyzfs: Add missing entry to zfs_errno
Allan Jude [Fri, 31 Jul 2020 16:01:41 +0000 (12:01 -0400)]
pyzfs: Add missing entry to zfs_errno

This was causing all later errno's to have the incorrect value.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #10649

3 years agozfs promote does not delete livelist of origin
Matthew Ahrens [Fri, 31 Jul 2020 15:59:00 +0000 (08:59 -0700)]
zfs promote does not delete livelist of origin

When a clone is promoted, its livelist is no longer accurate, so it is
discarded.  If the clone's origin is also a clone (i.e. we are promoting
a clone of a clone), then the origin's livelist is also no longer
accurate, so it should be discarded, but the code doesn't actually do
that.

Consider a pool with:
* Filesystem A
* Clone B, a clone of A
* Clone C, a clone of B

If we promote C, it discards C's livelist.  It should discard B's
livelist, but that is not happening.  The impact is that when B is
destroyed, we use the livelist to find the blocks to free, but the
livelist is no longer correct so we end up freeing blocks that are still
in use by C.  The incorrectly-freed blocks can be reallocated causing
checksum errors.  And when C is destroyed it can double-free the
incorrectly-freed blocks.

The problem is that we remove the livelist of `origin_ds->ds_dir`, but
the origin snapshot has already been moved to the promoted dsl_dir.  So
this is actually trying to remove the livelist of the promoted dsl_dir,
which was already removed.  As explained in a comment in the beginning
of `dsl_dataset_promote_sync()`, we need to use the saved `odd` for the
origin's dsl_dir.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed by: Sara Hartse <sara.hartse@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10652

3 years agoZTS: minor improvements to alloc_class_009_pos functional test
Don Brady [Thu, 30 Jul 2020 16:11:05 +0000 (10:11 -0600)]
ZTS: minor improvements to alloc_class_009_pos functional test

* Fixed a typo that cause one of the variations to be a no-op
* Added additional coverage for adding special vdev after pool create
* Added additional coverage for using 4K sector size

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #10641

3 years agoUse correct prefix for share/pam-configs
Ryan Moeller [Thu, 30 Jul 2020 16:09:46 +0000 (12:09 -0400)]
Use correct prefix for share/pam-configs

Respect the configured install prefix.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10604

3 years agoFix error handling of vdev_top_zap
Matthew Ahrens [Thu, 30 Jul 2020 00:04:34 +0000 (17:04 -0700)]
Fix error handling of vdev_top_zap

In `vdev_load()`, we look up several entries in the `vdev_top_zap`
object.  In most cases, if we encounter an i/o error, it will be
returned to the caller.  However, when handling
`VDEV_TOP_ZAP_ALLOCATION_BIAS`, if we get an i/o error, we may continue
on, which in theory could cause us to not realize that a vdev should be
used only for `special` allocations.

In practice, if we encountered an i/o error while looking for
`VDEV_TOP_ZAP_ALLOCATION_BIAS` in the `vdev_top_zap`, we'd also get an
i/o error while looking for other entries in the same object, and thus
the zpool open/import would fail.  Therefore the impact of this problem
is negligible.

This commit adds error handling for i/o errors while accessing the
`vdev_top_zap`, so that we aren't relying on unrelated code to fail for
us.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10637

3 years agoVerify zfs module loaded before starting services
Jonathon [Wed, 29 Jul 2020 23:52:18 +0000 (23:52 +0000)]
Verify zfs module loaded before starting services

This is a minor change to the systemd service templates that verifies the zfs
kernel module is loaded by the kernel prior to attempting to import any zpool.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jonathon Fernyhough <jonathon.fernyhough@york.ac.uk>
Closes #10627

3 years agoRename refcount.h to zfs_refcount.h
Matthew Macy [Wed, 29 Jul 2020 23:35:33 +0000 (16:35 -0700)]
Rename refcount.h to zfs_refcount.h

Renamed to avoid conflicting with refcount.h when a different
implementation is already provided by the platform.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10620

3 years agoIntroduce names for ZTHRs
Serapheim Dimitropoulos [Wed, 29 Jul 2020 16:43:33 +0000 (09:43 -0700)]
Introduce names for ZTHRs

When debugging issues or generally analyzing the runtime of
a system it would be nice to be able to tell the different
ZTHRs running by name rather than having to analyze their
stack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #10630

3 years agoPrefix zfs internal endian checks with _ZFS
Matthew Macy [Tue, 28 Jul 2020 20:02:49 +0000 (13:02 -0700)]
Prefix zfs internal endian checks with _ZFS

FreeBSD defines _BIG_ENDIAN BIG_ENDIAN _LITTLE_ENDIAN
LITTLE_ENDIAN on every architecture. Trying to do
cross builds whilst hiding this from ZFS has proven
extremely cumbersome.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10621

3 years agoFix lua stack overflow on recursive call to gsub()
Matthew Ahrens [Mon, 27 Jul 2020 23:11:47 +0000 (16:11 -0700)]
Fix lua stack overflow on recursive call to gsub()

The `zfs program` subcommand invokes a LUA interpreter to run ZFS
"channel programs".  This interpreter runs in a constrained environment,
with defined memory limits.  The LUA stack (used for LUA functions that
call each other) is allocated in the kernel's heap, and is limited by
the `-m MEMORY-LIMIT` flag and the `zfs_lua_max_memlimit` module
parameter.  The C stack is used by certain LUA features that are
implemented in C.  The C stack is limited by `LUAI_MAXCCALLS=20`, which
limits call depth.

Some LUA C calls use more stack space than others, and `gsub()` uses an
unusually large amount.  With a programming trick, it can be invoked
recursively using the C stack (rather than the LUA stack).  This
overflows the 16KB Linux kernel stack after about 11 iterations, less
than the limit of 20.

One solution would be to decrease `LUAI_MAXCCALLS`.  This could be made
to work, but it has a few drawbacks:

1. The existing test suite does not pass with `LUAI_MAXCCALLS=10`.

2. There may be other LUA functions that use a lot of stack space, and
the stack space may change depending on compiler version and options.

This commit addresses the problem by adding a new limit on the amount of
free space (in bytes) remaining on the C stack while running the LUA
interpreter: `LUAI_MINCSTACK=4096`.  If there is less than this amount
of stack space remaining, a LUA runtime error is generated.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10611
Closes #10613

3 years agoRefactor ccompile.h to not include system headers
Matthew Macy [Sun, 26 Jul 2020 03:09:50 +0000 (20:09 -0700)]
Refactor ccompile.h to not include system headers

This is a step toward being able to vendor the OpenZFS code in FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10625

3 years agoMake use of ZFS_DEBUG consistent within kmod sources
Matthew Macy [Sun, 26 Jul 2020 03:07:44 +0000 (20:07 -0700)]
Make use of ZFS_DEBUG consistent within kmod sources

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10623

3 years agoFreeBSD: Fixes required to build ZFS on PowerPC
Matthew Macy [Sat, 25 Jul 2020 18:00:23 +0000 (11:00 -0700)]
FreeBSD: Fixes required to build ZFS on PowerPC

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10622

3 years agoFreeBSD: Remove accidental ARC size limiter
Ryan Moeller [Sat, 25 Jul 2020 17:49:49 +0000 (13:49 -0400)]
FreeBSD: Remove accidental ARC size limiter

i386 has some additional memory reservation logic that limits the size
of the reported available memory.  This was accidentally being used on
all arches due to a missing header.

Include machine/vmparam.h in freebsd/zfs/arc_os.c to pull in the
missing UMA_MD_SMALL_ALLOC definition.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10616

3 years agoFreeBSD: Implement arc_free_memory
Ryan Moeller [Sat, 25 Jul 2020 17:47:18 +0000 (13:47 -0400)]
FreeBSD: Implement arc_free_memory

This is only used for the kstat, but something other than 0 is nice.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10626

3 years agoAdd gang ABD child to parent gang ABD
Brian Atkinson [Sat, 25 Jul 2020 04:09:20 +0000 (22:09 -0600)]
Add gang ABD child to parent gang ABD

By design a gang ABD can not have another gang ABD as a child. This is
to make sure the logical offset in a gang ABD is consistent with the
individual ABDS it contains as children. If a gang ABD is added as a
child of a gang ABD we will add the individual children of the gang ABD
to the parent gang ABD. This allows for a consistent view of offsets
within the parent gang ABD.

Reviewed-by: Mark Maybee <mmaybee@cray.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #10430

3 years agoLimit dbuf cache sizes based only on ARC target size by default
Ryan Moeller [Sat, 25 Jul 2020 03:38:48 +0000 (23:38 -0400)]
Limit dbuf cache sizes based only on ARC target size by default

Set the initial max sizes to ULONG_MAX to allow the caches to grow
with the ARC.

Recalculate the metadata cache size on demand so it can adapt, too.

Update descriptions in zfs-module-parameters(5).

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <matt@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10563
Closes #10610

3 years agoremove kmem_cache module parameter KMC_EXPIRE_AGE
Matthew Ahrens [Fri, 24 Jul 2020 16:39:26 +0000 (09:39 -0700)]
remove kmem_cache module parameter KMC_EXPIRE_AGE

By default, `spl_kmem_cache_expire` is `KMC_EXPIRE_MEM`, meaning that
objects will be removed from kmem cache magazines by
`spl_kmem_cache_reap_now()`.

There is also a module parameter to change this to `KMC_EXPIRE_AGE`,
which establishes a maximum lifetime for objects to stay in the
magazine.  This setting has rarely, if ever, been used, and is not
regularly tested.

This commit removes the code for `KMC_EXPIRE_AGE`, and associated module
parameters.

Additionally, the unused module parameter
`spl_kmem_cache_obj_per_slab_min` is removed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10608

3 years agoAdd support to decode a resume token
tony-zfs [Fri, 24 Jul 2020 00:44:03 +0000 (20:44 -0400)]
Add support to decode a resume token

Adding a new subcommand to zstream called token. This
now allows users to decode a resume token to retrieve the toname
field. This can be useful for tools that need this information.
The syntax works as follows zstream token <resume_token>.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: Tony Perkins <tperkins@datto.com>
Closes #10558

3 years agoAnnotate unused parameters on inline definitions as such
Kyle Evans [Fri, 24 Jul 2020 00:41:48 +0000 (19:41 -0500)]
Annotate unused parameters on inline definitions as such

* libspl: umem: These are obviously and intentionally unused; annotate
  them as such to appease -Wunused-parameter builds that include this
  header.

* sys/dmu.h: In this case, clear_on_evict_dbufp is only used for
  ZFS_DEBUG builds, so annotate it as __maybe_unused to appease
  -Wunused-parameter.

Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kyle Evans <kevans@FreeBSD.org>
Closes #10606

3 years agoFreeBSD: Remove some code duplication in sysctl_os.c
Ryan Moeller [Fri, 24 Jul 2020 00:35:34 +0000 (20:35 -0400)]
FreeBSD: Remove some code duplication in sysctl_os.c

Drop unnecessary redefinition's of several arcstat values.
Put missing extern declaration of arc_no_grow_shift in arc_impl.h.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10609