FreeBSD/FreeBSD.git
2 years agoZTS: alloc_class.ksh must wait for the process to exit
Brian Behlendorf [Fri, 17 Dec 2021 20:40:34 +0000 (12:40 -0800)]
ZTS: alloc_class.ksh must wait for the process to exit

The alloc_class_* tests may fail on Linux with an EBUSY error if
`zfs destroy` is run before the `dd` process has had a chance to
terminate.  Wait on the pid after the `kill -9` to make sure.
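The kill-then-wait pattern can be sketched outside of ksh as well. The following is a minimal Python illustration, not the test's actual code: the sleeping child stands in for `dd`, and `kill_and_reap` and `demo` are hypothetical helper names.

```python
import subprocess
import sys


def kill_and_reap(proc: subprocess.Popen) -> int:
    """Send SIGKILL, then block until the child has actually exited."""
    proc.kill()         # POSIX equivalent of `kill -9 $pid`
    return proc.wait()  # equivalent of `wait $pid`; returns the exit status


def demo() -> int:
    # Stand-in for the long-running `dd`: a child that sleeps for a minute.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    status = kill_and_reap(child)
    # Only after wait() returns is it safe to destroy whatever the child
    # was still holding open (the dataset, in the test's case).
    return status


if __name__ == "__main__":
    print(demo())
```

On POSIX systems a negative return code from `wait()` encodes the terminating signal, so the caller can also confirm the process died from the kill rather than exiting on its own.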

When testing I didn't observe any failures for the alloc_class
tests.  Remove them from the exceptions list; the CI was used to
verify the tests pass on all platforms.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12873

2 years agoZTS: Avoid piping send directly to /dev/null
Rich Ercolani [Fri, 17 Dec 2021 20:39:10 +0000 (15:39 -0500)]
ZTS: Avoid piping send directly to /dev/null

Unfortunately, #11445 means while we fail gracefully now, we still
fail, unless people want to implement a complex workaround just to
support /dev/null.

So let's just use the cheap workaround in a test for now.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12872

2 years agoZTS: Fix zpool_reopen_[1-5] on Fedora 35
Tony Hutter [Fri, 17 Dec 2021 20:37:21 +0000 (12:37 -0800)]
ZTS: Fix zpool_reopen_[1-5] on Fedora 35

The zpool_reopen_[1-5] tests are failing on Fedora 35 with:

zpool_reopen_001_pos.ksh[64]: log_must[67]: log_pos[270]:
wait_for_resilver_end[98]: wait_for_action: line 71: func: is read only

Renaming 'func' -> 'funct' fixes the issue.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12871

2 years agoFix zvol_open() lock inversion
Brian Behlendorf [Fri, 17 Dec 2021 17:52:13 +0000 (09:52 -0800)]
Fix zvol_open() lock inversion

When restructuring the zvol_open() logic for the Linux 5.13 kernel
a lock inversion was accidentally introduced.  In the updated code
the spa_namespace_lock is now taken before the zv_suspend_lock
allowing the following scenario to occur:

    down_read <=== waiting for zv_suspend_lock
    zvol_open <=== holds spa_namespace_lock
    __blkdev_get
    blkdev_get_by_dev
    blkdev_open
    ...

     mutex_lock <== waiting for spa_namespace_lock
     spa_open_common
     spa_open
     dsl_pool_hold
     dmu_objset_hold_flags
     dmu_objset_hold
     dsl_prop_get
     dsl_prop_get_integer
     zvol_create_minor
     dmu_recv_end
     zfs_ioc_recv_impl <=== holds zv_suspend_lock via zvol_suspend()
     zfs_ioc_recv
     ...

This commit resolves the issue by moving the acquisition of the
spa_namespace_lock back to after the zv_suspend_lock which restores
the original ordering.
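The ordering rule can be illustrated with a small user-space sketch. It assumes plain mutexes for both locks (in the kernel the suspend lock is a read/write lock), and `open_path`, `recv_path`, and `run_concurrently` are hypothetical names; the point is only that both paths take the locks in the same order.

```python
import threading

# Simplified stand-ins for the two kernel locks named in the commit.
zv_suspend_lock = threading.Lock()
spa_namespace_lock = threading.Lock()
events = []


def open_path():
    # Restored ordering: suspend lock first, namespace lock second.
    with zv_suspend_lock:
        with spa_namespace_lock:
            events.append("open")


def recv_path():
    # The receive path already holds the suspend lock when it needs the
    # namespace lock, so it follows the same order.
    with zv_suspend_lock:
        with spa_namespace_lock:
            events.append("recv")


def run_concurrently(rounds=50):
    """Run both paths in parallel; return True if every thread finished."""
    threads = []
    for _ in range(rounds):
        threads.append(threading.Thread(target=open_path))
        threads.append(threading.Thread(target=recv_path))
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    return all(not t.is_alive() for t in threads)


if __name__ == "__main__":
    print(run_concurrently(), len(events))
```

If `open_path` instead took `spa_namespace_lock` first, each thread could end up holding one lock while waiting for the other, which is the scenario shown in the two stack traces.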

Additionally, as part of this change the error exit paths were
simplified where possible.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12863

2 years agoFreeBSD: Update argument types for VOP_READDIR
Alan Somers [Fri, 17 Dec 2021 17:50:12 +0000 (10:50 -0700)]
FreeBSD: Update argument types for VOP_READDIR

A recent commit to FreeBSD changed the type of
vop_readdir_args.a_cookies to a uint64_t**.  There is no functional
impact to ZFS because ZFS only uses 32-bit cookies, which will be
zero-extended to 64 bits by the existing code.

https://github.com/freebsd/freebsd-src/commit/b214fcceacad6b842545150664bd2695c1c2b34f

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Alan Somers <asomers@gmail.com>
Closes #12874

2 years agozcommon: pre-iterate over sysfs instead of statting every feature
наб [Fri, 17 Dec 2021 00:43:10 +0000 (01:43 +0100)]
zcommon: pre-iterate over sysfs instead of statting every feature

If sufficient memory (<2K, realistically) is available, libzfs_init()
can be significantly shortened: by iterating over the correct sysfs
directory before registrations, we can turn 168 stats into 15/18
syscalls (3 opens (6 if built in), 3 fstats, 6 getdentses, and 3
closes), a tenfoldish reduction; this is probably a bit faster, too.

The list is always optional, and registration functions (and one-off
users) can simply pass NULL, which will fall back to the previous
mechanism

Also, don't allocate in zfs_mod_supported_impl, and use access()
instead of stat(), since existence is really what we care about

Also, fix pre-prop-checking compat in fallback for built-in ZFS
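The idea of scanning a directory once and falling back to a per-name existence check can be sketched as follows. This is an illustration in Python rather than the libzfs code; `list_supported`, `is_supported`, and the feature file names are hypothetical.

```python
import os
import tempfile


def list_supported(directory):
    """Scan the directory once; return None if it cannot be read."""
    try:
        return set(os.listdir(directory))
    except OSError:
        return None


def is_supported(directory, name, listing=None):
    """Check a single name, preferring the pre-built listing."""
    if listing is not None:
        return name in listing  # no syscall per lookup
    # Fallback when no listing was passed: an access()-style existence
    # check, since existence is all that matters here.
    return os.access(os.path.join(directory, name), os.F_OK)


def demo():
    with tempfile.TemporaryDirectory() as d:
        for name in ("feature_a", "feature_b"):
            open(os.path.join(d, name), "w").close()
        listing = list_supported(d)
        return (
            is_supported(d, "feature_a", listing),
            is_supported(d, "missing", listing),
            is_supported(d, "feature_b"),  # fallback path, listing=None
            is_supported(d, "missing"),
        )


if __name__ == "__main__":
    print(demo())
```

Keeping the listing optional mirrors the commit's description: callers that do not have one simply pass nothing and get the previous per-name behaviour.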

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12089

2 years agozcommon: *_prop: make all zprop_index_t tables const
наб [Thu, 16 Dec 2021 21:26:04 +0000 (22:26 +0100)]
zcommon: *_prop: make all zprop_index_t tables const

They're already static, and there's no point in them being R/W
and living outside .rodata

Reviewed-by: RageLtMan <rageltman@sempervictus>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12836

2 years agoFreeBSD: Provide correct file generation number
Ryan Moeller [Thu, 16 Dec 2021 21:22:15 +0000 (16:22 -0500)]
FreeBSD: Provide correct file generation number

va_seq was actually a thin veil over va_gen, so z_gen is a more
appropriate value than z_seq to populate the field with.

Drop the unnecessary compat obfuscation and provide the correct
file generation number.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@freebsd.org>
Closes #12851

2 years agozfs list: Allow more fields in ZFS_ITER_SIMPLE mode
Allan Jude [Thu, 16 Dec 2021 19:56:22 +0000 (14:56 -0500)]
zfs list: Allow more fields in ZFS_ITER_SIMPLE mode

If the fields to be listed and sorted by are constrained
to those populated by dsl_dataset_fast_stat(), then
zfs list is much faster, as it does not need to open each
objset and read its properties.

A previous optimization by Pawel Dawidek
(0cee24064a79f9c01fc4521543c37acea538405f) took advantage
of this to make listing snapshot names sorted only by name
much faster.

However, it was limited to `-o name -s name`; this work
extends this optimization to work with:
  - name
  - guid
  - createtxg
  - numclones
  - inconsistent
  - redacted
  - origin
and could be further extended to any other properties
supported by dsl_dataset_fast_stat() or similar, that do
not require extra locking or reading from disk.
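The eligibility check described above amounts to a subset test. A minimal sketch, assuming the property names listed in this commit message; `FAST_STAT_FIELDS` and `can_use_simple_iteration` are hypothetical names, not identifiers from the zfs source.

```python
# Properties the commit message lists as available without opening the objset.
FAST_STAT_FIELDS = frozenset(
    {"name", "guid", "createtxg", "numclones", "inconsistent", "redacted", "origin"}
)


def can_use_simple_iteration(columns, sort_keys):
    """True when every displayed and sorted field is in the fast set."""
    return set(columns) | set(sort_keys) <= FAST_STAT_FIELDS


if __name__ == "__main__":
    print(can_use_simple_iteration(["name", "guid"], ["createtxg"]))
    print(can_use_simple_iteration(["name", "used"], ["name"]))
```

A single field outside the set, whether shown with `-o` or used with `-s`, is enough to force the slower path.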

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #11080

2 years agosystemd: add weekly and monthly scrub timers
Georgy Yakovlev [Thu, 16 Dec 2021 19:47:22 +0000 (11:47 -0800)]
systemd: add weekly and monthly scrub timers

Timers can be enabled as follows:

systemctl enable zfs-scrub-weekly@rpool.timer --now
systemctl enable zfs-scrub-monthly@datapool.timer --now

Each timer will pull in zfs-scrub@${poolname}.service, which is not
schedule-specific.

Added PERIODIC SCRUB section to zpool-scrub.8.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
Closes #12193

2 years agot/z_diff/socket, zfs: main: fix unused argument warnings, ARGSUSED tags
наб [Thu, 9 Dec 2021 23:08:19 +0000 (00:08 +0100)]
t/z_diff/socket, zfs: main: fix unused argument warnings, ARGSUSED tags

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829

2 years agolibzfs: diff: simplify superfluous stdio
наб [Thu, 9 Dec 2021 22:50:41 +0000 (23:50 +0100)]
libzfs: diff: simplify superfluous stdio

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829

2 years agolibzfs: diff: print_what() can return the symbol => get_what()
наб [Thu, 9 Dec 2021 22:44:51 +0000 (23:44 +0100)]
libzfs: diff: print_what() can return the symbol => get_what()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829

2 years agolibzfs: diff: stream_bytes: use fputc, %hho formats chars
наб [Thu, 9 Dec 2021 22:42:02 +0000 (23:42 +0100)]
libzfs: diff: stream_bytes: use fputc, %hho formats chars

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829

2 years agolibzfs: zpool_set_vdev_prop: remove unused vprop
наб [Thu, 9 Dec 2021 22:40:17 +0000 (23:40 +0100)]
libzfs: zpool_set_vdev_prop: remove unused vprop

Found by clang 14 with -Wunused-but-set-variable

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829

2 years agolinux: libspl: getmntany: remove unused argument
наб [Thu, 9 Dec 2021 22:39:36 +0000 (23:39 +0100)]
linux: libspl: getmntany: remove unused argument

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829

2 years agozfs, libzfs: diff: accept -h/ZFS_DIFF_NO_MANGLE, disabling path escaping
наб [Thu, 9 Dec 2021 23:02:52 +0000 (00:02 +0100)]
zfs, libzfs: diff: accept -h/ZFS_DIFF_NO_MANGLE, disabling path escaping

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829

2 years agoAdd init script to load keys
ogelpre [Sun, 12 Dec 2021 19:17:14 +0000 (20:17 +0100)]
Add init script to load keys

Add new init scripts which allow automatic loading of keys if
keylocation property is set to a URI.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Benedikt Neuffer <ogelpre@itfriend.de>
Closes #11659
Closes #11662

2 years agozfs-dkms rpm: Fix scriptlets dependencies
Till Maas [Sun, 12 Dec 2021 19:15:25 +0000 (20:15 +0100)]
zfs-dkms rpm: Fix scriptlets dependencies

To ensure that the necessary packages are available during the %post and
%preun scriptlets, require them properly.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Till Maas <opensource@till.name>
Closes #12822
Closes #12832

2 years agoFreeBSD: Add vop_standard_writecount_nomsync
Ryan Moeller [Fri, 10 Dec 2021 14:15:27 +0000 (14:15 +0000)]
FreeBSD: Add vop_standard_writecount_nomsync

https://cgit.freebsd.org/src/commit?id=3ffcfa599e29686cf2b3c1a6087408c37acaed78

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12828

2 years agozfs: Fix a deadlock between page busy and the teardown lock
Mark Johnston [Sat, 20 Nov 2021 16:21:25 +0000 (11:21 -0500)]
zfs: Fix a deadlock between page busy and the teardown lock

When rolling back a dataset, ZFS has to purge file data resident in the
system page cache.  To do this, it loops over all vnodes for the
mountpoint and calls vn_pages_remove() to purge pages associated with
the vnode's VM object.  Each page is thus exclusively busied while the
dataset's teardown write lock is held.

When handling a page fault on a mapped ZFS file, FreeBSD's page fault
handler busies newly allocated pages and then uses VOP_GETPAGES to fill
them.  The ZFS getpages VOP acquires the teardown read lock with vnode
pages already busied.  This represents a lock order reversal which can
lead to deadlock.

To break the deadlock, observe that zfs_rezget() need only purge those
pages marked valid, and that pages busied by the page fault handler are,
by definition, invalid.  Furthermore, ZFS pages always transition from
invalid to valid with the teardown lock held, and ZFS never creates
partially valid pages.  Thus, zfs_rezget() can use the new
vn_pages_remove_valid() to skip over pages busied by the fault handler.

PR: 258208
Tested by: pho
Reviewed by: avg, sef, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32931

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12828

2 years agoFreeBSD: Catch up with more VFS changes
Ryan Moeller [Thu, 9 Dec 2021 18:04:56 +0000 (18:04 +0000)]
FreeBSD: Catch up with more VFS changes

Unused thread argument was removed from NDINIT*

https://cgit.freebsd.org/src/commit?id=7e1d3eefd410ca0fbae5a217422821244c3eeee4

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12828

2 years agoFreeBSD supports edonr follow up
наб [Thu, 9 Dec 2021 01:01:36 +0000 (02:01 +0100)]
FreeBSD supports edonr follow up

This chases 269b5dadcfd1d5732cf763dddcd46009a332eae4 (#12735),
which touched the actual code but didn't fix the comment

Additionally, ignore the name.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12823

2 years agoLinux 5.15 compat: META (#12824)
Brian Behlendorf [Tue, 7 Dec 2021 23:35:42 +0000 (15:35 -0800)]
Linux 5.15 compat: META (#12824)

The final 5.15 kernel is available and has been tested.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>

2 years agocontrib/bash_completion.d: fix error spew from __zfs_match_snapshot()
наб [Tue, 7 Dec 2021 20:30:10 +0000 (21:30 +0100)]
contrib/bash_completion.d: fix error spew from __zfs_match_snapshot()

Given:
  /sbin/zfs list filling/a-zvol<TAB> -o space,refratio
The rest of the cmdline gets vored by:
  /sbin/zfs list filling/a-zvolcannot open 'filling/a-zvol':
  operation not applicable to datasets of this type

With -x (fragment):
  + COMPREPLY=($(compgen -W "$(__zfs_match_snapshot)" -- "$cur"))
  +++ __zfs_match_snapshot
  +++ local base_dataset=filling/dziadtop-nowe-duchy
  +++ [[ filling/dziadtop-nowe-duchy != filling/dziadtop-nowe-duchy ]]
  +++ [[ filling/dziadtop-nowe-duchy != '' ]]
  +++ __zfs_list_datasets filling/dziadtop-nowe-duchy
  +++ /sbin/zfs list -H -o name -s name -t filesystem
                     -r filling/dziadtop-nowe-duchy
  +++ tail -n +2
  cannot open 'filling/dziadtop-nowe-duchy':
  operation not applicable to datasets of this type
  +++ echo filling/dziadtop-nowe-duchy
  +++ echo filling/dziadtop-nowe-duchy@
  ++ compgen -W 'filling/dziadtop-nowe-duchy

This properly completes with:
  $ /sbin/zfs list filling/a-zvol<TAB> -o space,refratio
  filling/a-zvol   filling/a-zvol@
  $ /sbin/zfs list filling/a-zvol<cursor> -o space,refratio

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12820

2 years agoLinux 5.16: Resolve ZSTD_isError symbol collision in Linux kernel
Coleman Kane [Sun, 5 Dec 2021 20:18:46 +0000 (15:18 -0500)]
Linux 5.16: Resolve ZSTD_isError symbol collision in Linux kernel

Newer zstd code introduced in the main kernel tree now creates a symbol
collision with ZSTD_isError in our ZSTD code. This change relabels our
implementation with a ZFS-specific symbol name, and undoes some
macro-based micro-optimizations that conflict with the attempt to rename
our internal-use version.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819

2 years agoLinux 5.16: The blk-cgroup.h header is where struct blkcg_gq is defined
Coleman Kane [Sat, 4 Dec 2021 03:00:10 +0000 (22:00 -0500)]
Linux 5.16: The blk-cgroup.h header is where struct blkcg_gq is defined

The definition of struct blkcg_gq was moved into blk-cgroup.h, which is
a header that's been in Linux since 2015. This is used by
vdev_blkg_tryget() in module/os/linux/zfs/vdev_disk.c. Since the kernel
for CentOS 7 and similar-generation releases doesn't have this header,
its inclusion is guarded by a configure test.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819

2 years agoLinux 5.16: bio_set_dev is no longer a helper macro
Coleman Kane [Sat, 4 Dec 2021 02:45:28 +0000 (21:45 -0500)]
Linux 5.16: bio_set_dev is no longer a helper macro

This change adds a configure check to determine if bio_set_dev is a
helper macro or not. If not, then the attempt to override its internal
call to bio_associate_blkg(), with a macro definition to our own
version, is no longer possible, as the compiler won't use it when
compiling the new inline function replacement implemented in the header.
This change also creates a new vdev_bio_set_dev() function that performs
the same work, and also performs the work implemented in
vdev_bio_associate_blkg(), as it is the only thing calling that function
in our code. Our custom vdev_bio_associate_blkg() is now only compiled
if the bio_set_dev() is a macro in the Linux headers.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819

2 years agoLinux 5.16: type member of iov_iter renamed iter_type
Coleman Kane [Fri, 3 Dec 2021 04:25:08 +0000 (23:25 -0500)]
Linux 5.16: type member of iov_iter renamed iter_type

The iov_iter->type member was renamed iov_iter->iter_type. However,
while looking into this, I realized that in 2018 an iov_iter_type(*iov)
accessor function was introduced. So if that is present, use it,
otherwise fall back to trying the existing behavior of directly
accessing type from iov_iter.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819

2 years agoLinux 5.16: block_device_operations->submit_bio now returns void
Coleman Kane [Fri, 3 Dec 2021 03:54:05 +0000 (22:54 -0500)]
Linux 5.16: block_device_operations->submit_bio now returns void

The return type for the submit_bio member of struct
block_device_operations was changed to no longer return a value.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819

2 years agoZFS send/recv with ashift 9->12 leads to data corruption
Paul Dagnelie [Tue, 7 Dec 2021 18:27:59 +0000 (10:27 -0800)]
ZFS send/recv with ashift 9->12 leads to data corruption

Improve the ability of zfs send to determine if a block is compressed
or not by using information contained in the blkptr.

Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Matthew Ahrens <matthew.ahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12770

2 years agoUpdate "tests/README.md"
Arshad Hussain [Tue, 7 Dec 2021 16:49:25 +0000 (22:19 +0530)]
Update "tests/README.md"

This patch adds a detailed section on adding and running
test cases. It also changes the markdown numbered list to
more readable headers.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Closes #12737

2 years agoAdd `const` to nvlist functions to properly expose their real behavior
Paul Dagnelie [Tue, 7 Dec 2021 01:19:13 +0000 (17:19 -0800)]
Add `const` to nvlist functions to properly expose their real behavior

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12728

2 years agoZTS: import_rewind_device_replaced reliably fails
Brian Behlendorf [Mon, 6 Dec 2021 17:45:17 +0000 (09:45 -0800)]
ZTS: import_rewind_device_replaced reliably fails

The import_rewind_device_replaced.ksh test was never entirely reliable
because it depends on MOS data not being overwritten.  The MOS data is
not protected by the snapshot so occasional failures were always
expected.  However, this test is now failing reliably on all platforms
indicating something has changed in the code since the test was marked
"maybe".  Convert the test to a "known" failure until the root cause
is identified and resolved.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12821

2 years agoCorrected a case where we could read uninited ABD memory
Rich Ercolani [Fri, 3 Dec 2021 21:13:21 +0000 (16:13 -0500)]
Corrected a case where we could read uninited ABD memory

For my sins, I started running valgrind over ztest to try and fix
that pesky intermittent "zloop dies with malloc errors" problem.

This one seemed exciting enough to merit cutting a PR for before
the rest get polished.

Suggested-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12214

2 years agoStrip colons from all test result filenames
John Wren Kennedy [Thu, 2 Dec 2021 00:18:45 +0000 (17:18 -0700)]
Strip colons from all test result filenames

The upload artifact functionality in github can't handle colons in
filenames. The current code handles this for files under the most
recent set of results. With the ability to rerun failed tests, now
there can be multiple sets of results, and they all need to be
processed in the same way.
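Processing every result set, not only the most recent one, can be sketched as a recursive walk. This is a hypothetical Python illustration and not the project's script; whether the real code deletes colons or substitutes another character is not stated here, so the sketch simply removes them.

```python
import os
import tempfile


def strip_colons(root):
    """Rename every file and directory under root whose name has a colon."""
    renamed = 0
    # Bottom-up, so children are renamed before their parent directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            if ":" in name:
                os.rename(
                    os.path.join(dirpath, name),
                    os.path.join(dirpath, name.replace(":", "")),
                )
                renamed += 1
    return renamed


def demo():
    with tempfile.TemporaryDirectory() as root:
        # Two result sets, as produced when failed tests are rerun.
        os.makedirs(os.path.join(root, "run1"))
        os.makedirs(os.path.join(root, "run2:retry"))
        open(os.path.join(root, "run1", "a:b.log"), "w").close()
        open(os.path.join(root, "run2:retry", "c:d.log"), "w").close()
        count = strip_colons(root)
        found = []
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                found.append(rel.replace(os.sep, "/"))
        return count, sorted(found)


if __name__ == "__main__":
    print(demo())
```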

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #12815

2 years agoLinux 5.13 compat: retry zvol_open() when contended
Brian Behlendorf [Thu, 2 Dec 2021 00:07:12 +0000 (16:07 -0800)]
Linux 5.13 compat: retry zvol_open() when contended

Due to a possible lock inversion the zvol open call path on Linux
needs to be able to retry in the case where the spa_namespace_lock
cannot be acquired.

For Linux 5.12 and older kernels this was accomplished by returning
-ERESTARTSYS from zvol_open() to request that blkdev_get() drop
the bdev->bd_mutex lock, reacquire it, then call the open callback
again.  However, as of the 5.13 kernel this behavior was removed.

Therefore, for 5.12 and older kernels we preserved the existing
retry logic, but for 5.13 and newer kernels we retry internally in
zvol_open().  This should always succeed except in the case where
a pool's vdevs are layered on zvols, in which case it may fail.  To
handle this case vdev_disk_open() has been updated to retry when
opening a device when -ERESTARTSYS is returned.
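An internal retry on a contended lock can be sketched with a non-blocking acquire in a loop. This is a user-space illustration only: the timeout, the polling interval, and the `RestartRequested` exception (standing in for returning -ERESTARTSYS) are assumptions of the sketch, not values taken from the commit.

```python
import threading
import time


class RestartRequested(Exception):
    """Stand-in for returning -ERESTARTSYS to the caller."""


def open_with_retry(namespace_lock, timeout=1.0, interval=0.01):
    """Try to take the lock without blocking; retry until the deadline."""
    deadline = time.monotonic() + timeout
    while True:
        if namespace_lock.acquire(blocking=False):
            try:
                return "opened"  # the real open work would happen here
            finally:
                namespace_lock.release()
        if time.monotonic() >= deadline:
            # Still contended: let the caller decide whether to retry.
            raise RestartRequested("lock still contended")
        time.sleep(interval)


def demo():
    lock = threading.Lock()
    # Case 1: the lock is released shortly, so the retry loop succeeds.
    lock.acquire()
    threading.Timer(0.05, lock.release).start()
    first = open_with_retry(lock)
    # Case 2: the lock is never released, so the open gives up.
    lock.acquire()
    try:
        open_with_retry(lock, timeout=0.05)
        second = "opened"
    except RestartRequested:
        second = "restart"
    finally:
        lock.release()
    return first, second


if __name__ == "__main__":
    print(demo())
```

The second case corresponds to the situation the commit calls out, where the outer caller (vdev_disk_open() in the commit) is the one that retries.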

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #12301
Closes #12759

2 years agoTemporarily remove tests from sanity runfile
John Wren Kennedy [Wed, 1 Dec 2021 21:22:52 +0000 (14:22 -0700)]
Temporarily remove tests from sanity runfile

With the addition of functionality to rerun failing tests, some
tests that fail only sometimes still fail often enough to degrade
the reliability of the sanity runs. Remove them from the runfile
until they reliably pass.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #12814

2 years agoAdd zfs-test facility to automatically rerun failing tests
Paul Dagnelie [Wed, 1 Dec 2021 17:38:53 +0000 (09:38 -0800)]
Add zfs-test facility to automatically rerun failing tests

This was a project proposed as part of the Quality theme for the
hackathon for the 2021 OpenZFS Developer Summit. The idea is to improve
the usability of the automated tests that get run when a PR is created
by having failing tests automatically rerun in order to make flaky
tests less impactful.
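The rerun idea can be sketched in a few lines. This is a hypothetical illustration, not the zfs-tests implementation: tests are modelled as callables returning True on success, and `run_with_rerun` is an invented name.

```python
def run_with_rerun(tests, reruns=1):
    """Run every test once, then rerun only the failures up to `reruns` times."""
    results = {name: fn() for name, fn in tests.items()}
    for _ in range(reruns):
        failed = [name for name, ok in results.items() if not ok]
        if not failed:
            break
        for name in failed:
            results[name] = tests[name]()
    return results


def demo():
    attempts = {"flaky": 0}

    def flaky():
        # Fails on the first attempt, passes on the second.
        attempts["flaky"] += 1
        return attempts["flaky"] > 1

    tests = {
        "stable": lambda: True,
        "flaky": flaky,
        "broken": lambda: False,
    }
    return run_with_rerun(tests)


if __name__ == "__main__":
    print(demo())
```

A flaky test is absorbed by the rerun, while a test that fails consistently is still reported as a failure.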

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12740

2 years agoget_key_material: fix style
Attila Fülöp [Sun, 14 Nov 2021 17:50:49 +0000 (18:50 +0100)]
get_key_material: fix style

Reviewed-by: Felix Dörre <felix@dogcraft.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #12765

2 years agoget_key_material: skip passphrase validation when loading keys
Harald van Dijk [Tue, 19 Oct 2021 23:32:28 +0000 (00:32 +0100)]
get_key_material: skip passphrase validation when loading keys

The restriction that an encryption key must be at least
MIN_PASSPHRASE_LEN characters long makes sense when changing the
encryption key, but not when loading: as this restriction is not
enforced in the libraries, it is possible to bypass zfs change-key's
restrictions and end up with a key that becomes impossible to load with
zfs load-key, for example through pam_zfs_key.
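The distinction between setting and loading a key can be sketched as a single flag on the validation step. This is an illustration only: the value 8 for `MIN_PASSPHRASE_LEN` is an assumption of the sketch, and `passphrase_acceptable` is a hypothetical name.

```python
MIN_PASSPHRASE_LEN = 8  # assumed value, for illustration only


def passphrase_acceptable(passphrase: str, setting_new_key: bool) -> bool:
    """Enforce the minimum length only when a new key is being set."""
    if setting_new_key and len(passphrase) < MIN_PASSPHRASE_LEN:
        return False
    # When loading, any passphrase is passed through; a wrong one will
    # simply fail to unlock the dataset.
    return True


if __name__ == "__main__":
    print(passphrase_acceptable("short", setting_new_key=True))
    print(passphrase_acceptable("short", setting_new_key=False))
```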

Reviewed-by: Felix Dörre <felix@dogcraft.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Harald van Dijk <harald@gigawatt.nl>
Closes #12765

2 years agopam_zfs_key: tests: check if zfs load-key works on short passphrases
Attila Fülöp [Sun, 14 Nov 2021 17:08:45 +0000 (18:08 +0100)]
pam_zfs_key: tests: check if zfs load-key works on short passphrases

The pam_zfs_key pam module does not enforce a minimum password
length while changing the user password and thus the user's home
dataset passphrase. To not end up with a dataset `zfs load-key`
can't load the key for, `zfs load-key` should not enforce a minimum
passphrase length. This adds a test for that.

Reviewed-by: Felix Dörre <felix@dogcraft.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #12765
Closes #12651
Closes #12656

2 years agopam_zfs_key: tests: clean up the generated pam service config file
Attila Fülöp [Sun, 14 Nov 2021 16:36:12 +0000 (17:36 +0100)]
pam_zfs_key: tests: clean up the generated pam service config file

Remove the generated pam service config file
`/etc/pam.d/pam_zfs_key_test` on test cleanup, since the tests
shouldn't alter system state.

While here, move the pam service config file name into a variable.

Reviewed-by: Felix Dörre <felix@dogcraft.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #12765

2 years agoRemove REMAKE_INITRD
jokersus [Tue, 30 Nov 2021 19:09:15 +0000 (22:09 +0300)]
Remove REMAKE_INITRD

The option has been deprecated in dkms and will break packaging in
future versions. See https://github.com/dell/dkms/commit/7114c62

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: jokersus <jokersus.cava@gmail.com>
Closes #12781

2 years agoDefault to zfs_dmu_offset_next_sync=1
Brian Behlendorf [Tue, 30 Nov 2021 18:38:09 +0000 (10:38 -0800)]
Default to zfs_dmu_offset_next_sync=1

Strict hole reporting was previously disabled by default as a
performance optimization.  However, this has led to confusion
over the expected behavior and a variety of workarounds being
adopted by consumers of ZFS.  Change the default behavior to
always report holes and force the TXG sync.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12746

2 years agoStop segfaulting on unmount error case
Rich Ercolani [Tue, 30 Nov 2021 18:36:36 +0000 (13:36 -0500)]
Stop segfaulting on unmount error case

After interrupting ZTS runs that errored out, I found that
"zpool export testpool2" was segfaulting.

This seems unnecessary.

Reviewed-by: szubersk <szuberskidamian@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12804

2 years agoCode cleanups
Pawel Jakub Dawidek [Tue, 30 Nov 2021 18:32:38 +0000 (10:32 -0800)]
Code cleanups

- Allocate ve_search on the stack, so we avoid allocating memory for
  every I/O even if the VDEV cache is disabled.
- Reduce lock scope.
- Avoid locking in vdev_cache_read() when the VDEV cache is disabled.
- Sort file names properly.
- Correct comment.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #12749

2 years agoReplace wrong occurrences of `affect` by `effect` in the man pages
maxz [Tue, 30 Nov 2021 18:28:57 +0000 (19:28 +0100)]
Replace wrong occurrences of `affect` by `effect` in the man pages

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Max Zettlmeißl <max@zettlmeissl.de>
Closes #12784

2 years agoAllow printing special vdev metaslab groups
Rich Ercolani [Tue, 30 Nov 2021 18:26:45 +0000 (13:26 -0500)]
Allow printing special vdev metaslab groups

Sometimes, we'd like to know info about the metaslab groups
on special vdevs too. So let's make -MM do something useful.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12750

2 years agoPass `--enable=all` to shellcheck within contrib/
Damian Szuberski [Tue, 30 Nov 2021 18:23:10 +0000 (19:23 +0100)]
Pass `--enable=all` to shellcheck within contrib/

- Remove `SHELLCHECK_IGNORE` in favor of inline suppressions
  and more general `SHELLCHECK_OPTS`.

- Exclude `SC2250` (turned on by `--enable=all`) globally

- Pass `--enable=all` to shellcheck for scripts in contrib/: it's
  very important to catch errors early in areas that are not easily
  testable.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #12760

2 years agoetc/systemd/zfs-mount-generator: serialise, handle keylocation=http[s]://
наб [Tue, 30 Nov 2021 16:29:50 +0000 (17:29 +0100)]
etc/systemd/zfs-mount-generator: serialise, handle keylocation=http[s]://

* etc/systemd/zfs-mount-generator: serialise

The wins for a relatively normal workload are rather slim:
real 0.02119s/0.00985s=2.15029x
user 0.02130s/0.00346s=6.15560x
sys 0.03858s/0.00643s=6.00062x

wall-total 0.014518s/0.005925s=2.45009x
wall-init 0.014518s/0.002457s=5.90684x
wall-real 0.014518s/0.003467s=4.18668x

But this is a big win on machines with a lot of datasets and expensive
forks.

For example, on a VM on my work laptop with 900+ legacy-mount
Docker datasets, the original gains from the C rewrite were
only five-fold:
real    0.516s/0.102s=5.05882x
user    0.237s/0.143s=1.65734x
sys     0.287s/0.100s=2.87x

And this serial variant gains this back there as well:
real    0.102s/0.008s=12.75x
user    0.143s/0.007s=20.42857x
sys     0.100s/0.001s=100x

wall-total 0.09717s/0.00319s=30.40255x
wall-init 0.00203s/0.00200s=1.015941x
wall-real 0.09513s/0.00118s=80.02043x

For a total of
real    0.516s/0.008s=64.5x
user    0.237s/0.007s=33.85714x
sys     0.287s/0.001s=287x
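
The factors above are plain before/after ratios, and the total is the product of the two stages (5.05882 x 12.75 = 64.5 for real time). A minimal sketch that recomputes them, assuming a hypothetical helper named `speedup` that is not part of the commit:

```shell
#!/bin/sh
# Hypothetical helper: print before/after as a speed-up factor with five
# decimal places.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.5fx\n", before / after }'
}

speedup 0.516 0.102   # C rewrite vs. original, real time
speedup 0.102 0.008   # serial variant vs. C rewrite, real time
speedup 0.516 0.008   # overall, real time
```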

Suggested-by: Richard Laager <rlaager@wiktel.com>

* etc/systemd/zfs-mount-generator: pull in network for keylocation=https

Also simplify RequiresMountsFor= handling
Ref: #11956

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12138

2 years agoVdev Properties Feature
Allan Jude [Tue, 30 Nov 2021 14:46:25 +0000 (09:46 -0500)]
Vdev Properties Feature

Add properties, similar to pool properties, to each vdev.
This makes use of the existing per-vdev ZAP that was added as
part of device evacuation/removal.

A large number of read-only properties are exposed,
including many members of struct vdev_t that provide useful
statistics.

Adds support for the read-only "removing" vdev property.
Adds the "allocating" property that defaults to "on" and
can be set to "off" to prevent future allocations from that
top-level vdev.

Supports user-defined vdev properties.
Includes support for properties.vdev in SYSFS.

Co-authored-by: Allan Jude <allan@klarasystems.com>
Co-authored-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #11711

2 years agoFix typo in zpool.8
pstef [Mon, 29 Nov 2021 18:52:42 +0000 (19:52 +0100)]
Fix typo in zpool.8

Update zpool.8 to avoid parseltongue.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Piotr P. Stefaniak <pstef@freebsd.org>
Closes #12763

2 years agoLinux 5.16 compat: asm/fpu/xcr.h is new location for xgetbv/xsetbv
Coleman Kane [Tue, 16 Nov 2021 04:23:30 +0000 (23:23 -0500)]
Linux 5.16 compat: asm/fpu/xcr.h is new location for xgetbv/xsetbv

Linux 5.16 moved these functions into this new header in commit
1b4fb8545f2b00f2844c4b7619d64d98440a477c. This change adds code to look
for the presence of this header, and include it so that the code using
xgetbv & xsetbv will compile again.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12800

2 years agoLinux 5.16: wait_on_page_bit() no longer available to modules
Coleman Kane [Tue, 16 Nov 2021 05:10:35 +0000 (00:10 -0500)]
Linux 5.16: wait_on_page_bit() no longer available to modules

linux/pagemap.h now offers a number of folio-specific functions to
be called instead. In this case, module/os/linux/zfs/zfs_vnops_os.c
wants to call wait_on_page_bit(pp, PG_writeback). This gets replaced
with folio_wait_bit(page_folio(pp), PG_writeback). This change modifies
the code to conditionally compile that call if configure identifies the
presence of the folio_wait_bit() function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12800

2 years agoFix several bugs in the FreeBSD rename VOP implementation
Mark Johnston [Fri, 19 Nov 2021 22:26:39 +0000 (17:26 -0500)]
Fix several bugs in the FreeBSD rename VOP implementation

- To avoid a use-after-free, zfsvfs->z_log needs to be loaded after the
  teardown lock is acquired with ZFS_ENTER().
- Avoid leaking vnode locks in zfs_rename_relock() and zfs_rename_()
  when the ZFS_ENTER() macro forces an early return.

Refactor the rename implementation so that ZFS_ENTER() can be used
safely.  As a bonus, this lets us use the ZFS_VERIFY_ZP() macro instead
of open-coding its implementation.

Reported-by: Peter Holm <pho@FreeBSD.org>
Tested-by: Peter Holm <pho@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes #12717

2 years agoAdd notes to system_taskq
Paul Dagnelie [Fri, 19 Nov 2021 17:02:45 +0000 (09:02 -0800)]
Add notes to system_taskq

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12771

2 years agoEnable edonr in FreeBSD
Rich Ercolani [Tue, 16 Nov 2021 19:40:10 +0000 (14:40 -0500)]
Enable edonr in FreeBSD

The code is integrated, builds fine, runs fine, there's not really
any reason not to.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12735

2 years agoFreeBSD: fix world build after de198f2d9
Martin Matuška [Mon, 15 Nov 2021 16:07:39 +0000 (17:07 +0100)]
FreeBSD: fix world build after de198f2d9

The inline function vn_flush_cached_data() in vnode.h
must not be compiled when building BASE.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #12743

2 years agoFix `zfs:AUTO` autodetection in initramfs scripts
Damian Szuberski [Sat, 13 Nov 2021 15:02:50 +0000 (16:02 +0100)]
Fix `zfs:AUTO` autodetection in initramfs scripts

Don't exit early in find_rootfs() when zpool.bootfs
is set to `zfs:AUTO`.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #12658

2 years agoRemove (now unused) td argument from zfs_lookup()
Pawel Jakub Dawidek [Sat, 13 Nov 2021 01:06:44 +0000 (17:06 -0800)]
Remove (now unused) td argument from zfs_lookup()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #12748

2 years agoIntroduce a tunable to exclude special class buffers from L2ARC
George Amanakis [Thu, 11 Nov 2021 20:52:16 +0000 (21:52 +0100)]
Introduce a tunable to exclude special class buffers from L2ARC

Special allocation class or dedup vdevs may have roughly the same
performance as L2ARC vdevs. Introduce a new tunable to exclude those
buffers from being cacheable on L2ARC.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #11761
Closes #12285

2 years agoRemove basename(1). Clean up/shorten some coreutils pipelines
наб [Thu, 11 Nov 2021 20:27:37 +0000 (21:27 +0100)]
Remove basename(1). Clean up/shorten some coreutils pipelines

Basenames that remain, in cmd/zed/zed.d/statechange-led.sh:
dev=$(basename "$(echo "$therest" | awk '{print $(NF-1)}')")
vdev=$(basename "$ZEVENT_VDEV_PATH")
I don't wanna interfere with #11988

scripts/zfs-tests.sh:
SINGLETESTFILE=$(basename "$SINGLETEST")
tests/zfs-tests/tests/functional/cli_user/zfs_list/zfs_list.kshlib:
ACTUAL=$(basename $dataset)
ACTUAL=$(basename $dataset)
tests/zfs-tests/tests/functional/cli_user/zpool_iostat/
zpool_iostat_-c_homedir.ksh:
typeset USER_SCRIPT=$(basename "$USER_SCRIPT_FULL")
tests/zfs-tests/tests/functional/cli_user/zpool_iostat/
zpool_iostat_-c_searchpath.ksh:
typeset CMD_1=$(basename "$SCRIPT_1")
typeset CMD_2=$(basename "$SCRIPT_2")
tests/zfs-tests/tests/functional/cli_user/zpool_status/
zpool_status_-c_homedir.ksh:
typeset USER_SCRIPT=$(basename "$USER_SCRIPT_FULL")
tests/zfs-tests/tests/functional/cli_user/zpool_status/
zpool_status_-c_searchpath.ksh:
typeset CMD_1=$(basename "$SCRIPT_1")
typeset CMD_2=$(basename "$SCRIPT_2")
tests/zfs-tests/tests/functional/migration/migration.cfg:
export BNAME=`basename $TESTFILE`
tests/zfs-tests/tests/perf/perf.shlib:
typeset logbase="$(get_perf_output_dir)/$(basename \
tests/zfs-tests/tests/perf/perf.shlib:
typeset logbase="$(get_perf_output_dir)/$(basename \

These are potentially Of Directories, where basename is actually
useful
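
A minimal sketch of why directories are the exception, assuming the removed calls were replaced by parameter expansion (the helper name `strip_dir` is hypothetical and not from the commit). The expansion avoids a fork for plain file paths but returns an empty string for a path with a trailing slash, where basename(1) still returns the last component:

```shell
#!/bin/sh
# Hypothetical helper: strip the directory part with parameter expansion
# instead of forking basename(1).
strip_dir() {
    printf '%s\n' "${1##*/}"
}

strip_dir /usr/local/bin/zfs   # prints "zfs"
strip_dir /var/tmp/testdir/    # prints an empty line
basename /var/tmp/testdir/     # prints "testdir"
```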

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12652

2 years agoCheck l2cache vdevs pending list inside the vdev_inuse()
Fedor Uporov [Thu, 11 Nov 2021 19:54:15 +0000 (11:54 -0800)]
Check l2cache vdevs pending list inside the vdev_inuse()

The l2cache device could be added twice because vdev_inuse() does not
check spa_l2cache for added devices. Make the in-use checking logic
for l2cache vdevs closer to that for spare vdevs.

Reviewed-by: George Amanakis <gamanakis@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #9153
Closes #12689

2 years agozhack: Add repair label option
Fedor Uporov [Thu, 11 Nov 2021 19:26:18 +0000 (11:26 -0800)]
zhack: Add repair label option

If all label checksums on any vdev are invalid, the pool becomes
unimportable. zhack, with its newly added CLI options, can be
used to restore the label checksums and make the pool importable again.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #2510
Closes #12686

2 years agoZTS: zfs_list_004_neg should not check paths that belong to ZFS
Palash Gandhi [Thu, 11 Nov 2021 15:46:44 +0000 (07:46 -0800)]
ZTS: zfs_list_004_neg should not check paths that belong to ZFS

When ZFS is on root, /tmp is a ZFS. This causes zfs_list_004_neg to
fail since `zfs list` on /tmp passes when the test expects it not to.
The fix is to exclude paths that belong to ZFS.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Palash Gandhi <pbg4930@rit.edu>
Closes #12744

2 years agoRestore dirty dnode detection logic
Brian Behlendorf [Thu, 11 Nov 2021 00:14:32 +0000 (16:14 -0800)]
Restore dirty dnode detection logic

In addition to flushing memory mapped regions when checking holes,
commit de198f2d95 modified the dirty dnode detection logic to check
the dn->dn_dirty_records instead of the dn->dn_dirty_link.  Relying
on the dirty record has not been reliable, so switch back to the previous
method.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11900
Closes #12745

2 years agoExclude zfs_copies_003_pos on Linux
Brian Behlendorf [Wed, 10 Nov 2021 20:56:01 +0000 (12:56 -0800)]
Exclude zfs_copies_003_pos on Linux

This test case may fail on 5.13 and newer Linux kernels if the
/dev/zvol/ device is not created by udev.

Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #12301
Closes #12738

2 years agozdb: Report bad label checksum
Fedor Uporov [Wed, 10 Nov 2021 19:22:00 +0000 (11:22 -0800)]
zdb: Report bad label checksum

If all label checksums on any vdev are invalid, the pool becomes
unimportable. However, zdb with the -l option does not provide any
useful information about why this happened. Add notifications
about corrupted label checksums.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #2509
Closes #12685

2 years agoUpgrade to libabigail 2.0.0
Dimitri John Ledkov [Mon, 8 Nov 2021 15:44:04 +0000 (15:44 +0000)]
Upgrade to libabigail 2.0.0

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Closes #12722
Closes #12739

2 years agozed: Control NVMe fault LEDs
Tony Hutter [Wed, 10 Nov 2021 00:50:18 +0000 (16:50 -0800)]
zed: Control NVMe fault LEDs

The ZED code currently can only turn on the fault LED for
a faulted disk in a JBOD enclosure.  This extends support
for faulted NVMe disks as well.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12648
Closes #12695

2 years agoSkip spacemaps reading in case of pool readonly import
Fedor Uporov [Tue, 9 Nov 2021 20:50:39 +0000 (12:50 -0800)]
Skip spacemaps reading in case of pool readonly import

Only the zdb utility needs to read metaslab-related data during a
read-only pool import, because it validates spacemaps. Add a global
variable that allows zdb to read spacemaps in read-only
import mode.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #9095
Closes #12687

2 years agoSingle IO issue for raidz writes with skip sector
Brian Atkinson [Tue, 9 Nov 2021 19:51:33 +0000 (12:51 -0700)]
Single IO issue for raidz writes with skip sector

In order to reduce contention on the vq_lock, optional skip sectors
for Raidz writes can be placed into a single IO request. This is done by
padding out the linear ABD for a parity column to contain the skip
sector and by creating a gang ABD to contain the data and skip sector for
data columns.

The vdev_raidz_map_alloc() function now contains specific functions for
both reads and writes to allocate the ABDs that will be issued down to
the VDEV children.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-By: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #12333

2 years agoLinux 5.16 compat: submit_bio()
Brian Behlendorf [Sat, 6 Nov 2021 00:17:03 +0000 (17:17 -0700)]
Linux 5.16 compat: submit_bio()

The submit_bio() prototype has changed again.  The version in 5.16
still only expects a single argument but the return type has changed
to void.  Since we never used the returned value before, update the
configure check to detect both single arg versions.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12725

2 years agoLinux 5.16 compat: linux/elevator.h
Brian Behlendorf [Fri, 5 Nov 2021 00:03:50 +0000 (17:03 -0700)]
Linux 5.16 compat: linux/elevator.h

Commit https://github.com/torvalds/linux/commit/2e9bc346 moved
the elevator.h header under the block/ directory as part of some
refactoring.  This turns out not to be a problem since there's
no longer anything we need from the header.  This has been the
case for some time; this change removes the elevator.h include
and replaces it with a major.h include.

Reviewed-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12725

2 years agoExclude zvol_misc_volmode for now
Rich Ercolani [Tue, 9 Nov 2021 02:01:19 +0000 (21:01 -0500)]
Exclude zvol_misc_volmode for now

It keeps failing, on changes which aren't related at all.

So until someone runs down why, I'd like it to stop being the
sole reason for CI failures.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12733

2 years agoFix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
Brian Behlendorf [Sun, 7 Nov 2021 21:27:44 +0000 (13:27 -0800)]
Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency

When using lseek(2) to report data/holes, memory mapped regions of
the file were ignored.  This could produce incorrect results.
To handle this, zfs_holey_common() was updated to asynchronously
write back any dirty mmap(2) regions prior to reporting holes.

Additionally, while not strictly required, the dn_struct_rwlock is
now held over the dirty check to prevent the dnode structure from
changing.  This ensures that a clean dnode can't be dirtied before
the data/hole is located.  The range lock is now also taken to
ensure the call cannot race with zfs_write().

Furthermore, the code was refactored to provide a dnode_is_dirty()
helper function which checks the dnode for any dirty records to
determine its dirtiness.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11900
Closes #12724

2 years agoUpdate contrib/initramfs/README.initramfs.markdown
Michael Franzl [Thu, 4 Nov 2021 15:23:50 +0000 (16:23 +0100)]
Update contrib/initramfs/README.initramfs.markdown

Note that Dropbear supports ed25519 keys since version 2020.79.

See https://github.com/mkj/dropbear/pull/91

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Michael Franzl <michael@franzl.name>
Closes #12715

2 years agoRevert behavior of 59eab109 on not-Linux
Rich Ercolani [Thu, 4 Nov 2021 13:49:40 +0000 (09:49 -0400)]
Revert behavior of 59eab109 on not-Linux

It turns out that short-circuiting the EFAULT behavior on a short read
breaks things on FreeBSD. So until there's a nicer solution, let's
just revert the behavior for not-Linux.

Reference:
https://reviews.freebsd.org/R10:70f51f0e474ffe1fb74cb427423a2fba3637544d

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12698

2 years agoWorkaround issue cleaning up automounted snapshots on Linux
Rich Ercolani [Wed, 3 Nov 2021 15:00:08 +0000 (11:00 -0400)]
Workaround issue cleaning up automounted snapshots on Linux

On Linux, sometimes, when ZFS goes to unmount an automounted snap,
it fails a VERIFY check on debug builds, because taskq_cancel_id
returned ENOENT after not finding the taskq it was trying to cancel.

This presumably happens when it already died for some reason; in this
case, we don't really mind it already being dead, since we're just
going to dispatch a new task to unmount it right after.

So we just ignore it if we get back ENOENT trying to cancel here,
retry a couple times if we get back the only other possible condition
(EBUSY), and log to dbgmsg if we got anything but ENOENT or success.

(We also add some locking around taskqid, to avoid one or two cases
of two instances of trying to cancel something at once.)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #11632
Closes #12670

2 years agoAdd more explicit warning about dedup being dropped
Rich Ercolani [Tue, 2 Nov 2021 21:45:20 +0000 (17:45 -0400)]
Add more explicit warning about dedup being dropped

"has unsupported feature: [number]" seems reasonable when we can't
know what the problem was, but with the send -D removal, we know
what it was, and can explicitly tell people "don't do that; try
this if you must".

So let's.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12708

2 years agoUpdate `checkstyle` workflow env to ubuntu-20.04
Damian Szuberski [Tue, 2 Nov 2021 20:02:57 +0000 (21:02 +0100)]
Update `checkstyle` workflow env to ubuntu-20.04

- `checkstyle` workflow uses ubuntu-20.04 environment
- improved `mancheck.sh` readability

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #12713

2 years agoFix cpu hotplug atomic sleep issue
Paul Dagnelie [Tue, 2 Nov 2021 16:23:48 +0000 (09:23 -0700)]
Fix cpu hotplug atomic sleep issue

We move the spinlock unlock before the thread creation. This should be
safe because the thread creation code doesn't actually manipulate any
taskq data structures; that's done by the thread once it's created.

We also remove the assertion that the maxthreads is the current threads
plus one; that assertion could fail if multiple hotplug events come in
quick succession, and the first new taskq thread hasn't had a chance to
start processing yet.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12714

2 years agoDisable normalization implicitly when setting "utf8only=off"
Mike Swanson [Fri, 29 Oct 2021 23:59:18 +0000 (16:59 -0700)]
Disable normalization implicitly when setting "utf8only=off"

When a parent dataset has normalization set to any value other than
"none", and a file system is created with the property "utf8only=off",
implicitly also set "normalization=none" instead of overriding the
desire for a non-UTF8 enforcing file system.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mike Swanson <mikeonthecomputer@gmail.com>
Closes #11892
Closes #12038

2 years agoExit the teardown section later in rename on FreeBSD
Mark Johnston [Thu, 28 Oct 2021 17:25:26 +0000 (13:25 -0400)]
Exit the teardown section later in rename on FreeBSD

We have to hold the teardown lock while dereferencing zfsvfs->z_os and,
I believe, when committing to the ZIL.

Note that when jumping to the "out" label, "error" is always non-zero.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12704

2 years agoFix potential use-after-frees in FreeBSD getpages and setattr VOPs
Mark Johnston [Thu, 28 Oct 2021 15:58:57 +0000 (11:58 -0400)]
Fix potential use-after-frees in FreeBSD getpages and setattr VOPs

The objset object is reallocated during certain dataset operations, such
as rollbacks, so the objset pointer must be loaded after acquiring the
teardown lock.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12704

2 years agozfsprops.7: Add note about comma-separation
D. Ebdrup [Fri, 29 Oct 2021 23:30:44 +0000 (01:30 +0200)]
zfsprops.7: Add note about comma-separation

This change primarily seeks to make implicit documentation explicit, as
it is not outright stated that options should be comma-separated, nor is
there a reason given for it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Daniel Ebdrup Jensen <debdrup@FreeBSD.org>
Closes #12579

2 years agoDo not print UINT64_MAX value for some of zfs properties
Fedor Uporov [Fri, 29 Oct 2021 23:18:13 +0000 (16:18 -0700)]
Do not print UINT64_MAX value for some of zfs properties

The values of the following properties: filesystem_limit, filesystem_count,
snapshot_limit, snapshot_count were returned to the user as UINT64_MAX
integers when the -p CLI option was used. Return 'none' instead.
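
The rule itself is small: the all-ones 64-bit value is a sentinel meaning "no limit set" and should be shown as 'none'. The actual change lives in the ZFS C sources; the sketch below only restates the mapping in shell, and `format_limit` is a hypothetical name rather than anything shipped with zfs(8).

```shell
#!/bin/sh
# UINT64_MAX as a decimal string; compared as text because the value does
# not fit in the signed 64-bit arithmetic most shells use.
UINT64_MAX=18446744073709551615

# Hypothetical helper: map the sentinel to "none", pass other values through.
format_limit() {
    if [ "$1" = "$UINT64_MAX" ]; then
        echo none
    else
        echo "$1"
    fi
}

format_limit 18446744073709551615   # prints "none"
format_limit 100                    # prints "100"
```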

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #9306
Closes #12690

2 years agoAdd explicit error for device_rebuild being disabled
Rich Ercolani [Fri, 29 Oct 2021 22:55:22 +0000 (18:55 -0400)]
Add explicit error for device_rebuild being disabled

Currently, you get back "can only attach to mirrors and top-level disks"
unconditionally if zpool attach returns ENOTSUP, but that also happens
if, say, feature@device_rebuild=disabled and you tried attach -s.

So let's print an error for that case, lest people go down a rabbit hole
looking into what they did wrong.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #11414
Closes #12680

2 years agoNormalize property names for zfs receive
Rich Ercolani [Fri, 29 Oct 2021 22:38:10 +0000 (18:38 -0400)]
Normalize property names for zfs receive

It turns out, userland is much happier with aliased property
names than the kernel is.

So let's normalize those to the expected names before we pass
them off.

Added a test case hacked up from the other recv -o/-x test that fails
on unpatched git and passes here.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12607
Closes #12609

2 years agoPython 3.10 fixes, part 2
Rich Ercolani [Fri, 29 Oct 2021 22:36:01 +0000 (18:36 -0400)]
Python 3.10 fixes, part 2

There was a fallback case I overlooked in the initial patch, with
a similarly imperfect version extractor.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12045
Closes #12673

2 years agoSet DEFAULT_INIT_SHELL to /sbin/openrc-run for Gentoo and Alpine
Peter Levine [Fri, 29 Oct 2021 22:34:37 +0000 (18:34 -0400)]
Set DEFAULT_INIT_SHELL to /sbin/openrc-run for Gentoo and Alpine

Gentoo and Alpine always set the rc init scripts' shebang to
#!/sbin/openrc-run, whether or not openrc is installed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Peter Levine <plevine457@gmail.com>
Closes #12683
Closes #12692

2 years agovdev_id: Fix PHY sorting
Tony Hutter [Fri, 29 Oct 2021 22:33:34 +0000 (15:33 -0700)]
vdev_id: Fix PHY sorting

One of our developers noticed a bug in vdev_id where we were incorrectly
sorting PHYs using alphabetical sorting (which usually works) instead
of natural sorting (-v).  For example:

[port-0:0]# ls -d phy*
phy-0:10  phy-0:11  phy-0:8  phy-0:9

[port-0:0]# ls -vd phy*
phy-0:8  phy-0:9  phy-0:10  phy-0:11

This fixes the issue.
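
The same difference can be reproduced without a SAS topology. This sketch swaps in `sort -V` for the `ls -v` used above, assuming a sort(1) that supports `-V` (GNU coreutils and recent FreeBSD do); `natural_sort` is a hypothetical helper, not part of vdev_id.

```shell
#!/bin/sh
# Hypothetical helper: sort names so that embedded numbers compare by value.
natural_sort() {
    printf '%s\n' "$@" | sort -V
}

# Prints phy-0:8, phy-0:9, phy-0:10, phy-0:11, one per line.
natural_sort phy-0:10 phy-0:11 phy-0:8 phy-0:9
```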

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12699

2 years agoRemove unused function zvol_set_volblocksize()
Fedor Uporov [Wed, 27 Oct 2021 00:07:53 +0000 (17:07 -0700)]
Remove unused function zvol_set_volblocksize()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #12688

2 years agoMake dsl_scan print the pool name in dbgmsg
Rich Ercolani [Tue, 26 Oct 2021 23:24:14 +0000 (19:24 -0400)]
Make dsl_scan print the pool name in dbgmsg

If you've got multiple scrubs/resilvers going, it's rather helpful
to know which pool each scan line refers to.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12674

2 years agospa.c: Replace VERIFY(nvlist_*(...) == 0) with fnvlist_* (#12678)
Allan Jude [Tue, 26 Oct 2021 23:15:38 +0000 (19:15 -0400)]
spa.c: Replace VERIFY(nvlist_*(...) == 0) with fnvlist_* (#12678)

The fnvlist versions of the functions are fatal if they fail,
saving each call from having to include checking the result.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Allan Jude <allan@klarasystems.com>

2 years agoZTS: Standardize use of destroy_dataset in cleanup
Brian Behlendorf [Mon, 25 Oct 2021 21:13:50 +0000 (14:13 -0700)]
ZTS: Standardize use of destroy_dataset in cleanup

When cleaning up a test case standardize on using the convention:

    datasetexists $ds && destroy_dataset $ds <flags>

By using 'destroy_dataset' instead of 'log_must zfs destroy' we ensure
that the destroy is retried in the event that a ZFS volume is busy.
This helps ensure tests are fully cleaned up and prevents false
positive test failures on Linux.
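
The retry idea can be sketched generically. This is not the ZTS `destroy_dataset` implementation; `retry_cmd`, its argument order, and the retry counts are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts, and stop at the first success.
retry_cmd() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (commented out): give a busy volume a few seconds to be released.
# retry_cmd 5 1 zfs destroy -r "$ds"
```

A bare `log_must zfs destroy` fails the test on the first EBUSY, whereas a wrapper of this shape tolerates a volume that is still being released.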

Note that all of the tests which used 'zfs destroy' in cleanup have
been updated even if they don't use volumes.  This was done to
clearly establish the expected convention.

Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12663

2 years agoWorkaround cloud-init hotplug issue
Rich Ercolani [Mon, 25 Oct 2021 17:27:05 +0000 (13:27 -0400)]
Workaround cloud-init hotplug issue

cloud-init added a hook which triggers on every device add/rm
event, which results in holding open devices for a while after
they're created/destroyed.

So let's shove an exclusion rule for that into the GH workflows
until it gets fixed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12644
Closes #12669

2 years agoFreeBSD: Catch up with recent VFS changes
Ryan Moeller [Mon, 25 Oct 2021 16:46:28 +0000 (12:46 -0400)]
FreeBSD: Catch up with recent VFS changes

cn_thread is always curthread.

https://cgit.freebsd.org/src/commit?id=b4a58fbf640409a1e507d9f7b411c83a3f83a2f3
https://cgit.freebsd.org/src/commit?id=2b68eb8e1dbbdaf6a0df1c83b26f5403ca52d4c3

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12668

2 years agopam_zfs_key: malloc and mlock/munlock won't match
Attila Fülöp [Thu, 21 Oct 2021 10:17:47 +0000 (12:17 +0200)]
pam_zfs_key: malloc and mlock/munlock won't match

mlock(2) and munlock(2) operate on memory pages whereas malloc(3)
does not. So if you munlock(2) a malloced memory region, the whole
page containing it is unlocked. Since this page may contain another
malloced and mlocked memory region, used as a password buffer by a
concurrently running instance of pam_zfs_key, there is a slight chance
of leaking passwords. By using mmap(2) we avoid such problems since
it will return whole pages on page aligned addresses.

Although the above concern may be mostly academic, it is still
better to use mmap(2) for allocating memory since the FreeBSD
documentation suggests calling mlock(2) and munlock(2) on page
aligned addresses, and other implementations even require it.

While here, remove duplicate code in alloc_pw_string() by calling
alloc_pw_size().

Reviewed-by: Felix Dörre <felix@dogcraft.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #12665