FreeBSD/FreeBSD.git
21 months ago Cleanup: Switch to strlcpy from strncpy
Richard Yao [Tue, 27 Sep 2022 23:35:29 +0000 (19:35 -0400)]
Cleanup: Switch to strlcpy from strncpy

Coverity found a bug in `zfs_secpolicy_create_clone()` where it is
possible for us to pass an unterminated string when `zfs_get_parent()`
returns an error. Upon inspection, it is clear that using `strlcpy()`
would have avoided this issue.

Looking at the codebase, there are a number of other uses of `strncpy()`
that are unsafe and even when it is used safely, switching to
`strlcpy()` would make the code more readable. Therefore, we switch all
instances where we use `strncpy()` to use `strlcpy()`.

Unfortunately, we do not portably have access to `strlcpy()` in
tests/zfs-tests/cmd/zfs_diff-socket.c because it does not link to
libspl. Modifying the appropriate Makefile.am to try to link to it
resulted in an error from the naming choice used in the file. Trying to
disable the check on the file did not work on FreeBSD because Clang
ignores `#undef` when a definition is provided by `-Dstrncpy(...)=...`.
We work around that by explicitly including the C file from libspl into
the test. This makes things build correctly everywhere.

We add a deprecation warning to `config/Rules.am` and suppress it on the
remaining `strncpy()` usage. `strlcpy()` is not portably available in
tests/zfs-tests/cmd/zfs_diff-socket.c, so we use `snprintf()` there as a
substitute.
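
As a rough illustration of the `snprintf()` substitution, here is a minimal sketch. The helper name `copy_terminated()` is hypothetical and does not appear in the patch. It only shows that `snprintf()` with a `"%s"` format always NUL-terminates a non-empty buffer, whereas `strncpy()` leaves the destination unterminated whenever the source is at least as long as the bound.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): copy src into dst and
 * NUL-terminate whenever dstsize > 0. Like strlcpy(), it returns the
 * length of src, so a return value >= dstsize signals truncation.
 */
static size_t
copy_terminated(char *dst, const char *src, size_t dstsize)
{
    int n;

    assert(dst != NULL && src != NULL);
    n = snprintf(dst, dstsize, "%s", src);
    return (n < 0 ? 0 : (size_t)n);
}
```

Copying `"abcdef"` into a 4-byte buffer this way yields `"abc"` and a return value of 6, so the caller can detect truncation by comparing the return value against the buffer size.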

This patch does not tackle the related problem of `strcpy()`, which is
even less safe. Thankfully, a quick inspection found that it is used far
more correctly than `strncpy()` was used. A quick inspection did not find
any problems with `strcpy()` usage outside of zhack, but it should be
said that I only checked around 90% of them.

Lastly, some of the fields in kstat_t varied in size by 1 depending on
whether they were in userspace or in the kernel. The origin of this
discrepancy appears to be 04a479f7066ccdaa23a6546955303b172f4a6909 where
it was made for no apparent reason. It conflicts with the comment on
KSTAT_STRLEN, so we shrink the kernel field sizes to match the userspace
field sizes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13876

21 months ago Enforce "-F" flag on resuming recv of full/newfs on existing dataset
Jitendra Patidar [Tue, 27 Sep 2022 23:34:27 +0000 (05:04 +0530)]
Enforce "-F" flag on resuming recv of full/newfs on existing dataset

When receiving a full/newfs stream on an existing dataset, the receive
should be done with the "-F" flag. This is enforced for the initial
receive by checks in the zfs_receive_one function of libzfs. Similarly,
resuming a full/newfs recv on an existing dataset should be done with
the "-F" flag.

When the dataset doesn't exist, the full/new recv is done on a newly
created dataset, which is marked INCONSISTENT. But when receiving on an
existing dataset, the recv is first done on %recv, and it's %recv that
is marked INCONSISTENT. The existing dataset is not marked INCONSISTENT.
A resume of a full/newfs receive with the dataset not INCONSISTENT
indicates that it's resuming newfs on an existing dataset. So, enforce
the "-F" flag in this case.

Also return an error from dmu_recv_resume_begin_check() in the zfs
kernel module when it's resuming a full/newfs recv without force.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Jitendra Patidar <jitendra.patidar@nutanix.com>
Closes #13856
Closes #13857

21 months ago Fix bad free in skein code
Richard Yao [Tue, 27 Sep 2022 19:36:58 +0000 (15:36 -0400)]
Fix bad free in skein code

Clang's static analyzer found a bad free caused by skein_mac_atomic().
It will allocate a context on the stack and then pass it to
skein_final(), which attempts to free it. Upon inspection,
skein_digest_atomic() also has the same problem.

These functions were created to match the OpenSolaris ICP API, so I was
curious how we avoided this in other providers and looked at the SHA2
code. It appears that SHA2 has a SHA2Final() helper function that is
called by the exported sha2_mac_final()/sha2_digest_final() as well as
the sha2_mac_atomic() and sha2_digest_atomic() functions. The real work
is done in SHA2Final() while some checks and the free are done in
sha2_mac_final()/sha2_digest_final().

We fix the use after free in the skein code by taking inspiration from
the SHA2 code. We introduce a skein_final_nofree() that does most of the
work, and make skein_final() into a function that calls it and then
frees the memory.
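
The split described above can be sketched as follows. This is a minimal illustration with hypothetical `demo_*` names and a toy context, not the actual skein or ICP code: the worker never frees, the exported finalizer frees after calling the worker, and the one-shot ("atomic") path uses a stack context with the worker only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a hash context; not the real skein context. */
typedef struct demo_ctx {
    unsigned char state[8];
} demo_ctx_t;

/* Does the real work and never frees ctx, so stack contexts are safe. */
static void
demo_final_nofree(demo_ctx_t *ctx, unsigned char *out, size_t outlen)
{
    assert(ctx != NULL && out != NULL);
    if (outlen > sizeof (ctx->state))
        outlen = sizeof (ctx->state);
    memcpy(out, ctx->state, outlen);
    memset(ctx, 0, sizeof (*ctx));
}

/* Finalizer for heap-allocated contexts: work first, then free. */
static void
demo_final(demo_ctx_t *ctx, unsigned char *out, size_t outlen)
{
    demo_final_nofree(ctx, out, outlen);
    free(ctx);
}

/* One-shot path: the context lives on the stack and must not be freed. */
static void
demo_atomic(const unsigned char *in, size_t inlen,
    unsigned char *out, size_t outlen)
{
    demo_ctx_t ctx;
    size_t i;

    memset(&ctx, 0, sizeof (ctx));
    for (i = 0; i < inlen; i++)
        ctx.state[i % sizeof (ctx.state)] ^= in[i];
    demo_final_nofree(&ctx, out, outlen);
}
```

Keeping the `free()` in exactly one wrapper makes it hard for a stack-allocated context to reach it by accident.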

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13954

21 months ago Fix userspace memory leaks found by Clang Static Analyzer
Richard Yao [Tue, 27 Sep 2022 00:18:05 +0000 (20:18 -0400)]
Fix userspace memory leaks found by Clang Static Analyzer

Recently, I have been making a push to fix things that coverity found.
However, I was curious what Clang's static analyzer reported, so I ran
it and found things that coverity had missed.

* contrib/pam_zfs_key/pam_zfs_key.c: If prop_mountpoint is passed more
  than once, we leak memory.
* module/zfs/zcp_get.c: We leak memory on temporary properties in
  userspace.
* tests/zfs-tests/cmd/draid.c: On error from vdev_draid_rand(), we leak
  memory if best_map had been allocated by a prior iteration.
* tests/zfs-tests/cmd/mkfile.c: Memory used by the loop is not freed
  before program termination.

Arguably, these are all minor issues, but if we ignore them, then they
could obscure serious bugs, so we fix them.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13955

21 months ago Update zfs-mount to load before fstab, matches systemd service.
Chris Zubrzycki [Thu, 15 Sep 2022 01:38:30 +0000 (21:38 -0400)]
Update zfs-mount to load before fstab, matches systemd service.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Zubrzycki <github@mid-earth.net>
Closes #13895

21 months ago Cleanup: Remove ineffective unsigned comparisons against 0
Richard Yao [Tue, 27 Sep 2022 00:02:38 +0000 (20:02 -0400)]
Cleanup: Remove ineffective unsigned comparisons against 0

Coverity found a number of places where we either do MAX(unsigned, 0) or
do assertions that an unsigned variable is >= 0. These do nothing, so
let us drop them all.

It also found a spot where we do `if (unsigned >= 0 && ...)`. Let us
also drop the unsigned >= 0 check.
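
A small sketch of why these checks are no-ops, using hypothetical names (`DEMO_MAX`, `clamp_before()`, and so on) rather than code from the patch. The last helper goes slightly beyond what the commit describes: it shows one way to floor a subtraction at zero when that was the intent, since an unsigned subtraction wraps around before any comparison against 0 can see it.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX(a, b)  ((a) > (b) ? (a) : (b))

/* Before: the clamp can never change an unsigned value. */
static uint64_t
clamp_before(uint64_t v)
{
    return (DEMO_MAX(v, 0));
}

/* After: identical behavior with the dead comparison removed. */
static uint64_t
clamp_after(uint64_t v)
{
    return (v);
}

/* If a floor at zero is really wanted, compare before subtracting. */
static uint64_t
sub_floor_zero(uint64_t a, uint64_t b)
{
    return (a > b ? a - b : 0);
}
```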

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13871

21 months ago Linux: Fix uninitialized variable usage in zio_do_crypt_data()
Richard Yao [Mon, 26 Sep 2022 23:44:22 +0000 (19:44 -0400)]
Linux: Fix uninitialized variable usage in zio_do_crypt_data()

Coverity complained about this. An error from `hkdf_sha512()` before uio
initialization will cause pointers to uninitialized memory to be passed
to `zio_crypt_destroy_uio()`. This is a regression that was introduced
by cf63739191b6cac629d053930a4aea592bca3819. Interestingly, this never
affected FreeBSD, since the FreeBSD version never had that patch ported.
Since moving uio initialization to the top of this function would slow
down the qat_crypt() path, we only move the `memset()` calls to the top
of the function. This is sufficient to fix this problem.
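
The shape of the fix can be sketched like this. All names here (`demo_uio_t`, `demo_do_crypt()`, the `fail_early` flag) are hypothetical stand-ins, not the real `zio_do_crypt_data()` code. The point is only that zeroing the structures before the first possible failure makes the shared cleanup path safe, without moving the more expensive initialization.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a uio; not the real structure. */
typedef struct demo_uio {
    void *iov;
    size_t iovcnt;
} demo_uio_t;

/* Cleanup helper: safe only if the structure was zeroed or filled in. */
static void
demo_destroy_uio(demo_uio_t *uio)
{
    assert(uio != NULL);
    free(uio->iov);     /* free(NULL) is a no-op */
    uio->iov = NULL;
    uio->iovcnt = 0;
}

/* fail_early simulates an error that occurs before uio initialization. */
static int
demo_do_crypt(int fail_early)
{
    demo_uio_t puio, cuio;
    int ret = 0;

    /* Zero both structures before any path can reach the cleanup. */
    memset(&puio, 0, sizeof (puio));
    memset(&cuio, 0, sizeof (cuio));

    if (fail_early) {
        ret = -1;
        goto out;
    }

    /* The expensive initialization stays where it was. */
    puio.iov = malloc(16);
    cuio.iov = malloc(16);
    if (puio.iov == NULL || cuio.iov == NULL) {
        ret = -1;
        goto out;
    }
    puio.iovcnt = cuio.iovcnt = 1;

out:
    demo_destroy_uio(&puio);
    demo_destroy_uio(&cuio);
    return (ret);
}
```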

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13944

21 months ago Fix double declaration of getauxval() for FreeBSD PPC
Tino Reichardt [Mon, 26 Sep 2022 17:32:22 +0000 (19:32 +0200)]
Fix double declaration of getauxval() for FreeBSD PPC

The extern declaration is only for Linux, move this line
into the right #ifdef section.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Martin Matuska <mm@FreeBSD.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #13934
Closes #13936

21 months ago Fix userland resource leaks
Richard Yao [Fri, 23 Sep 2022 23:55:26 +0000 (19:55 -0400)]
Fix userland resource leaks

Coverity caught these. With the exception of the file descriptor leak in
tests/zfs-tests/cmd/draid.c, they are all memory leaks.

Also, there is a piece of dead code in zfs_get_enclosure_sysfs_path().
We delete it as cleanup.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13921

21 months ago Fix unchecked return values and unused return values
Richard Yao [Fri, 23 Sep 2022 23:52:03 +0000 (19:52 -0400)]
Fix unchecked return values and unused return values

Coverity complained about unchecked return values and unused values that
turned out to be unused return values.

Different approaches were used to handle the different cases of
unchecked return values:

* cmd/zdb/zdb.c: VERIFY0 was used in one place since the existing code
  had no error handling. An error message was printed in another to
  match the rest of the code.

* cmd/zed/agents/zfs_retire.c: We dismiss the return value with `(void)`
  because the value is expected to be potentially unset.

* cmd/zpool_influxdb/zpool_influxdb.c: We dismiss the return value with
  `(void)` because the values are expected to be potentially unset.

* cmd/ztest.c: VERIFY0 was used since we want failures if something goes
  wrong in ztest.

* module/zfs/dsl_dir.c: We dismiss the return value with `(void)`
  because there is no guarantee that the zap entry will always be there.
  For example, old pools imported readonly would not have it and we do
  not want to fail here because of that.

* module/zfs/zfs_fm.c: `fnvlist_add_*()` was used since the
  allocations sleep and thus can never fail.

* module/zfs/zvol.c: We dismiss the return value with `(void)` because
  we do not need it. This matches what is already done in the analogous
  `zfs_replay_write2()`.

* tests/zfs-tests/cmd/draid.c: We suppress one return value with
  `(void)` since the code handles errors already. The other return value
  is handled by switching to `fnvlist_lookup_uint8_array()`.

* tests/zfs-tests/cmd/file/file_fadvise.c: We add error handling.

* tests/zfs-tests/cmd/mmap_sync.c: We add error handling for munmap, but
  ignore failures on remove() with (void) since it is expected to be
  able to fail.

* tests/zfs-tests/cmd/mmapwrite.c: We add error handling.

As for unused return values, they were all in places where there was
error handling, so logic was added to handle the return values.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13920

21 months ago set_global_var_parse_kv() should pass the pointer from strdup()
Richard Yao [Fri, 23 Sep 2022 17:51:14 +0000 (13:51 -0400)]
set_global_var_parse_kv() should pass the pointer from strdup()

A comment says that the caller should free k_out, but the pointer passed
via k_out is not the same pointer we received from strdup(). Instead,
it is a pointer into the region we received from strdup(). The free
function should always be called with the original pointer, so this is
likely a bug.

We solve this by calling `strdup()` a second time and then freeing the
original pointer.
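
A minimal sketch of that approach, assuming a simple `key=value` argument. The parser below, `demo_parse_kv()`, is hypothetical and much simpler than the real `set_global_var_parse_kv()`. It shows the key being duplicated into its own allocation so that the caller can legitimately `free()` it, while the original `strdup()` result is freed through its original pointer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser for "key=value". On success, *k_out is a pointer
 * that the caller may pass to free(). Returns 0 on success, -1 on error.
 */
static int
demo_parse_kv(const char *arg, char **k_out, unsigned long *v_out)
{
    char *copy, *eq, *key, *dup;
    unsigned long v;

    assert(arg != NULL && k_out != NULL && v_out != NULL);
    if ((copy = strdup(arg)) == NULL)
        return (-1);
    if ((eq = strchr(copy, '=')) == NULL) {
        free(copy);
        return (-1);
    }
    *eq = '\0';

    /* Skipping leading spaces moves key into the middle of copy. */
    key = copy;
    while (*key == ' ')
        key++;

    dup = strdup(key);              /* fresh allocation for the caller */
    v = strtoul(eq + 1, NULL, 0);   /* read the value before freeing */
    free(copy);                     /* free the original pointer, never key */
    if (dup == NULL)
        return (-1);

    *k_out = dup;
    *v_out = v;
    return (0);
}
```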

Coverity reported this as a memory leak.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13867

21 months ago zpool: Don't print "repairing" on force faulted drives
Tony Hutter [Fri, 23 Sep 2022 17:24:19 +0000 (10:24 -0700)]
zpool: Don't print "repairing" on force faulted drives

If you force fault a drive that's resilvering, its scan stats can get
frozen in time, giving the false impression that it's being resilvered.
This commit checks the vdev state to see if the vdev is healthy before
reporting "resilvering" or "repairing" in zpool status.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13927
Closes #13930

21 months ago ZTS: fallocate tests fail with hard coded values
John Wren Kennedy [Thu, 22 Sep 2022 22:42:34 +0000 (16:42 -0600)]
ZTS: fallocate tests fail with hard coded values

Currently, these two tests pass on disks with 512 byte sectors. In
environments where the backing store is different, the number of
blocks allocated to write the same file may differ. This change
modifies the reported size check to detect an expected change in the
reported number of blocks without specifying a particular number.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #13931

21 months ago Dynamically size dbuf hash mutex array
Brian Behlendorf [Mon, 19 Sep 2022 19:17:11 +0000 (12:17 -0700)]
Dynamically size dbuf hash mutex array

Incorrectly sizing the array of hash locks used to protect the
dbuf hash table can lead to contention and reduce performance.
We could unconditionally allocate a larger array for the locks
but it's wasteful, particularly for a low-memory system.
Instead, dynamically allocate the array of locks and scale
it based on total system memory.

Additionally, add a new `dbuf_mutex_cache_shift` module option
which can be used to override the hash lock array size.  This is
disabled by default (dbuf_mutex_cache_shift=0) and can only be
set at module load time.  The minimum target array size is set
to 8192, which matches the current constant value.

Note that the count of the dbuf hash table and count of the
mutex array were added to the /proc/spl/kstat/zfs/dbufstats
kstat.

Finally, this change removes the _KERNEL conditional checks.
These were not required since for the user space build there
is no difference between the kmem and vmem interfaces.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13928

21 months ago Revert "Reduce dbuf_find() lock contention"
Brian Behlendorf [Mon, 19 Sep 2022 18:07:15 +0000 (11:07 -0700)]
Revert "Reduce dbuf_find() lock contention"

This reverts commit 34dbc618f50cfcd392f90af80c140398c38cbcd1.  While this
change resolved the lock contention observed for certain workloads, it
inadvertently reduced the maximum hash inserts/removes per second.  This
appears to be due to the slightly higher acquisition cost of a rwlock vs
a mutex.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>

21 months ago Cleanup: Change 1 used in bitshifts to 1ULL
Richard Yao [Thu, 22 Sep 2022 18:28:33 +0000 (14:28 -0400)]
Cleanup: Change 1 used in bitshifts to 1ULL

Coverity complains about this. It is not a bug as long as we never shift
by more than 31, but it is not terrible to change the constants from 1
to 1ULL as clean up.
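
For illustration, a minimal sketch of the difference. The helper name `bit64()` is hypothetical. A plain `1` has type `int`, so the shift is done in `int` width. Strictly speaking, ISO C already leaves a shift of a signed `1` by 31 undefined when `int` is 32 bits, because the result is not representable. Using `1ULL` makes every shift count below 64 well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Returns a 64-bit value with only bit n set; n must be below 64. */
static uint64_t
bit64(unsigned int n)
{
    assert(n < 64);
    return (1ULL << n);
}
```

With this form, `bit64(36)` is 68719476736, a value that a 32-bit `int` shift could not produce.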

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13914

21 months ago Retire ZFS_TEARDOWN_TRY_ENTER_READ
Mateusz Guzik [Tue, 20 Sep 2022 22:34:41 +0000 (00:34 +0200)]
Retire ZFS_TEARDOWN_TRY_ENTER_READ

There were never any users and it so happens the operation is not even
supported by rrm locks -- the macros were wrong for Linux and FreeBSD
when not using its RMS locks.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13906

21 months ago Add membar_sync
Mateusz Guzik [Tue, 20 Sep 2022 22:32:44 +0000 (00:32 +0200)]
Add membar_sync

Provides the missing full barrier variant to the membar primitive set.

While not used right now, this is probably going to change down the
road.

Name taken from Solaris, to follow the existing routines.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13907

21 months ago Fix minor issues in namespace delegation support
youzhongyang [Tue, 20 Sep 2022 22:25:21 +0000 (18:25 -0400)]
Fix minor issues in namespace delegation support

get_user_ns() is only done once for each namespace, so put_user_ns()
should be done once too.

Fix two typos in user_namespace/user_namespace_002.ksh and
user_namespace/user_namespace_003.ksh.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #13918

21 months ago FreeBSD: handle V_PCATCH
Mateusz Guzik [Tue, 20 Sep 2022 22:22:32 +0000 (00:22 +0200)]
FreeBSD: handle V_PCATCH

See https://cgit.FreeBSD.org/src/commit/?id=a75d1ddd74312f5dd79bc1e965f7077679659f2e

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13910

21 months ago FreeBSD: catch up to 1400068
Mateusz Guzik [Tue, 20 Sep 2022 22:21:30 +0000 (00:21 +0200)]
FreeBSD: catch up to 1400068

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13909

21 months ago Call va_end() before return in zpool_standard_error_fmt()
Richard Yao [Tue, 20 Sep 2022 22:20:56 +0000 (18:20 -0400)]
Call va_end() before return in zpool_standard_error_fmt()

Commit ecd6cf800b63704be73fb264c3f5b6e0dafc068d by marks in OpenSolaris
at Tue Jun 26 07:44:24 2007 -0700 introduced a bug where we fail to call
`va_end()` before returning.

The man page for va_start() says:

"Each invocation of va_start() must be matched by a corresponding
invocation of va_end() in the same function."

Coverity complained about this.
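
The rule quoted above can be sketched with a small variadic function. `demo_error_fmt()` is a hypothetical example, not the real `zpool_standard_error_fmt()`. Every return that happens after `va_start()` is preceded by `va_end()`, including the early error return.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical formatter: writes into buf and returns the length, or -1
 * on bad arguments or truncation.
 */
static int
demo_error_fmt(char *buf, size_t buflen, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (buf == NULL || buflen == 0)
        return (-1);    /* before va_start(): nothing to end */

    assert(fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(buf, buflen, fmt, ap);
    if (n < 0 || (size_t)n >= buflen) {
        va_end(ap);     /* the early return still needs va_end() */
        return (-1);
    }
    va_end(ap);
    return (n);
}
```

An alternative design is a single exit point after one `va_end()` call, which makes the pairing easier to audit.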

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13904

21 months ago Fix potential NULL pointer dereference in zfsdle_vdev_online()
Richard Yao [Tue, 20 Sep 2022 22:20:04 +0000 (18:20 -0400)]
Fix potential NULL pointer dereference in zfsdle_vdev_online()

Coverity complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13903

21 months ago Delay ZFS_PROP_SHARESMB property to handle it for encrypted raw receive
Ameer Hamza [Tue, 20 Sep 2022 22:19:05 +0000 (03:19 +0500)]
Delay ZFS_PROP_SHARESMB property to handle it for encrypted raw receive

For encrypted raw receive, objset creation is delayed until a call to
dmu_recv_stream(). ZFS_PROP_SHARESMB property requires objset to be
populated when calling zpl_earlier_version(). To correctly handle the
ZFS_PROP_SHARESMB property for encrypted raw receive, this change
delays setting the property.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13878

21 months ago FreeBSD: Cleanup zfs_readdir()
Richard Yao [Tue, 20 Sep 2022 21:50:16 +0000 (17:50 -0400)]
FreeBSD: Cleanup zfs_readdir()

The FreeBSD project's coverity scans found dead code in `zfs_readdir()`.
Also, the comment above `zfs_readdir()` is out of date.

I fixed the comment and deleted all of the dead code, plus additional
dead code that was found upon review.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13924

21 months ago FreeBSD: Fix uninitialized pointer read in spa_import_rootpool()
Richard Yao [Tue, 20 Sep 2022 21:43:03 +0000 (17:43 -0400)]
FreeBSD: Fix uninitialized pointer read in spa_import_rootpool()

The FreeBSD project's coverity scans found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13923

21 months ago Cleanup: Remove unused uu_pname code
Richard Yao [Tue, 20 Sep 2022 00:33:52 +0000 (20:33 -0400)]
Cleanup: Remove unused uu_pname code

Coverity caught a possible NULL pointer dereference in dead code. We can
delete it all.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13900

21 months ago Fix usage of zed_log_msg() and zfs_panic_recover()
Richard Yao [Tue, 20 Sep 2022 00:32:18 +0000 (20:32 -0400)]
Fix usage of zed_log_msg() and zfs_panic_recover()

Coverity complained about the format specifiers not matching variables.
In one case, the variable is a constant, so we fix it. In another, we
were missing an argument (about which coverity also complained).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13888

21 months ago Linux: Fix use-after-free in zfsvfs_create()
Richard Yao [Tue, 20 Sep 2022 00:30:58 +0000 (20:30 -0400)]
Linux: Fix use-after-free in zfsvfs_create()

Coverity reported that we pass a pointer to zfsvfs to
`dmu_objset_disown()` after freeing zfsvfs in zfsvfs_create_impl() after
a failure in zfsvfs_init().

We have nearly identical duplicate versions of this code for FreeBSD and
Linux, but interestingly, the FreeBSD version of this code differs in
such a way that it does not suffer from this bug. We remove the
difference from the FreeBSD version to fix this bug.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13883

21 months ago FreeBSD: fix static module build broken in 7bb707ffa
Martin Matuška [Tue, 20 Sep 2022 00:21:45 +0000 (02:21 +0200)]
FreeBSD: fix static module build broken in 7bb707ffa

param_set_arc_free_target(SYSCTL_HANDLER_ARGS) and
param_set_arc_no_grow_shift(SYSCTL_HANDLER_ARGS) defined in
sysctl_os.c must be made available to arc_os.c.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #13915

21 months ago FreeBSD: stop passing LK_INTERLOCK to VOP_LOCK
Mateusz Guzik [Tue, 20 Sep 2022 00:17:27 +0000 (02:17 +0200)]
FreeBSD: stop passing LK_INTERLOCK to VOP_LOCK

There is an ongoing effort to eliminate this feature.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13908

21 months ago Add PPC cpu feature tests for FreeBSD and Linux
Tino Reichardt [Wed, 7 Sep 2022 18:33:59 +0000 (20:33 +0200)]
Add PPC cpu feature tests for FreeBSD and Linux

Add needed cpu feature tests for powerpc architecture.

Overview:
zfs_altivec_available() - needed by RAID-Z
zfs_vsx_available()     - needed by BLAKE3
zfs_isa207_available()  - needed by SHA2

Part 1 - Userspace
- use getauxval() for Linux and elf_aux_info() for FreeBSD
- direct including <sys/auxv.h> fails with double definitions
- so we self define the needed functions and definitions

Part 2 - Kernel space FreeBSD
- use exported cpu_features of <powerpc/cpu.h>

Part 3 - Kernel space Linux
- use cpu_has_feature() function of <asm/cpufeature.h>

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #13725

21 months ago Add zfs_blake3_impl to zfs.4
Tino Reichardt [Sat, 3 Sep 2022 08:40:29 +0000 (10:40 +0200)]
Add zfs_blake3_impl to zfs.4

The zfs module parameter zfs_blake3_impl did not get a manual page entry
when BLAKE3 was added to OpenZFS. This commit adds the required notes
about the parameter to zfs.4.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Ryan Moeller <ryan@freqlabs.com>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #13725

21 months ago Fix BLAKE3 tuneable and module loading on Linux and FreeBSD
Tino Reichardt [Wed, 3 Aug 2022 16:36:41 +0000 (18:36 +0200)]
Fix BLAKE3 tuneable and module loading on Linux and FreeBSD

Apply similar options to BLAKE3 as it is done for zfs_fletcher_4_impl.

The zfs module parameter on Linux changes from icp_blake3_impl to
zfs_blake3_impl.

You can check and set it on Linux via sysfs like this:
```
[bash]# cat /sys/module/zfs/parameters/zfs_blake3_impl
cycle [fastest] generic sse2 sse41 avx2

[bash]# echo sse2 > /sys/module/zfs/parameters/zfs_blake3_impl
[bash]# cat /sys/module/zfs/parameters/zfs_blake3_impl
cycle fastest generic [sse2] sse41 avx2
```

The modprobe module parameters may also be used now:
```
[bash]# modprobe zfs zfs_blake3_impl=sse41
[bash]# cat /sys/module/zfs/parameters/zfs_blake3_impl
cycle fastest generic sse2 [sse41] avx2
```

On FreeBSD the BLAKE3 implementation can be set via sysctl like this:
```
[bsd]# sysctl vfs.zfs.blake3_impl
vfs.zfs.blake3_impl: cycle [fastest] generic sse2 sse41 avx2
[bsd]# sysctl vfs.zfs.blake3_impl=sse2
vfs.zfs.blake3_impl: cycle [fastest] generic sse2 sse41 avx2 \
  -> cycle fastest generic [sse2] sse41 avx2
```

This commit also changes some BLAKE3 internals:
- blake3_impl_ops_t was renamed to blake3_ops_t
- all functions are named blake3_impl_NAME() now

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #13725

21 months ago zfs_enter rework followup
Brian Behlendorf [Fri, 16 Sep 2022 21:22:52 +0000 (14:22 -0700)]
zfs_enter rework followup

The zpl_fadvise() function was recently added and was not included
in the initial patch.  Update it accordingly.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13831

21 months ago Fix null pointer dereferences in PAM
Richard Yao [Fri, 16 Sep 2022 21:02:54 +0000 (17:02 -0400)]
Fix null pointer dereferences in PAM

Coverity caught these.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13889

21 months ago Handle ECKSUM as new EZFS_CKSUM ‒ "insufficient replicas"
наб [Fri, 16 Sep 2022 20:59:25 +0000 (22:59 +0200)]
Handle ECKSUM as new EZFS_CKSUM ‒ "insufficient replicas"

Add a meaningful error message for ECKSUM to common error messages.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #6805
Closes #13808
Closes #13898

21 months ago zfs recv hangs if max recordsize is less than received recordsize
Ameer Hamza [Fri, 16 Sep 2022 20:52:25 +0000 (01:52 +0500)]
zfs recv hangs if max recordsize is less than received recordsize

- Some optimizations for bqueue enqueue/dequeue.
- Added a fix to prevent deadlock when both bqueue_enqueue_impl()
and bqueue_dequeue() wait for a signal to be triggered.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13855

21 months ago Update coverity model
Richard Yao [Fri, 16 Sep 2022 20:45:15 +0000 (16:45 -0400)]
Update coverity model

`uu_panic()` needs to be modelled and the definition of `vpanic()` from
the original coverity model was missing
`__coverity_format_string_sink__()`.

We also model `libspl_assertf()` as part of an attempt to eliminate
false positives.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13901

21 months ago Fix unable to export zpool without nfs-utils
Chunwei Chen [Fri, 16 Sep 2022 20:43:26 +0000 (13:43 -0700)]
Fix unable to export zpool without nfs-utils

Don't return error in nfs_disable_share when nfs is not available, since
it wouldn't have been able to share in the first place.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #13534
Closes #13800

21 months ago zfs_enter rework
Chunwei Chen [Fri, 16 Sep 2022 20:36:47 +0000 (13:36 -0700)]
zfs_enter rework

Replace ZFS_ENTER and ZFS_VERIFY_ZP, which have hidden returns, with
functions that return an error code. We want to do this because hidden
returns are not obvious and have caused some fail paths to miss their
unwinding.

This patch changes the common, linux, and freebsd parts. Also fixes
fail path unwinding in zfs_fsync, zpl_fsync, zpl_xattr_{list,get,set}, and
zfs_lookup().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #13831

21 months ago Add zfs_btree_verify_intensity kernel module parameter
Richard Yao [Thu, 15 Sep 2022 23:22:33 +0000 (19:22 -0400)]
Add zfs_btree_verify_intensity kernel module parameter

I see a few issues in the issue tracker that might be aided by being
able to turn this on. We have no module parameter for it, so I would
like to add one.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13874

21 months ago Fix incorrect size given to bqueue_enqueue() call in dmu_redact.c
Richard Yao [Thu, 15 Sep 2022 23:21:21 +0000 (19:21 -0400)]
Fix incorrect size given to bqueue_enqueue() call in dmu_redact.c

We pass sizeof (struct redact_record *) rather than sizeof (struct
redact_record). Passing the pointer size is wrong.

Coverity caught this in two places.
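
The mistake is easy to reproduce in isolation. In this sketch `demo_record_t` is a hypothetical structure, not the real `struct redact_record`. `sizeof` applied to a pointer type yields the pointer width, while `sizeof (*rec)` tracks the pointed-to structure even if the variable's type changes later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record; the real structure has different fields. */
typedef struct demo_record {
    uint64_t start;
    uint64_t end;
    uint32_t datablksz;
    uint8_t eos_marker;
} demo_record_t;

/* Wrong: the size of a pointer, regardless of the record layout. */
static size_t
wrong_size(const demo_record_t *rec)
{
    (void) rec;
    return (sizeof (demo_record_t *));
}

/* Right: the size of the object the pointer refers to. */
static size_t
right_size(const demo_record_t *rec)
{
    assert(rec != NULL);
    return (sizeof (*rec));
}
```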

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13885

21 months ago Use correct mdoc macros for arguments
Mateusz Piotrowski [Thu, 15 Sep 2022 21:22:00 +0000 (23:22 +0200)]
Use correct mdoc macros for arguments

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org>
Closes #13890

21 months ago Fix assertions in crypto reference helpers
Richard Yao [Thu, 15 Sep 2022 20:24:00 +0000 (16:24 -0400)]
Fix assertions in crypto reference helpers

The assertions are racy and the use of `membar_exit()` did nothing to
fix that.

The helpers use atomic functions, so we cleverly get values from the
atomics that we can use to ensure that the assertions operate on the
correct values.

We also use `membar_producer()` prior to decrementing reference counts
so that operations that happened prior to a decrement to 0 will be
guaranteed to happen before the decrement on architectures that reorder
atomics.

This also slightly improves performance by eliminating unnecessary
reads, although I doubt it would be measurable in any benchmark.
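
The idea of asserting on the value returned by the atomic operation, rather than on a separate read of the counter, can be sketched with C11 atomics. This is an illustration under stated assumptions, not the ICP code. The `demo_*` names are hypothetical, and `atomic_thread_fence(memory_order_release)` is used only as a rough analogue of `membar_producer()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef struct demo_obj {
    atomic_uint_fast32_t refcnt;
    int destroyed;
} demo_obj_t;

static void
demo_hold(demo_obj_t *o)
{
    /* fetch_add returns the value this thread's increment acted on. */
    uint_fast32_t old = atomic_fetch_add(&o->refcnt, 1);

    assert(old != UINT_FAST32_MAX);     /* no overflow */
    (void) old;
}

/* Returns 1 if this call dropped the last reference. */
static int
demo_rele(demo_obj_t *o)
{
    uint_fast32_t old;

    /* Assumed stand-in for membar_producer() before the decrement. */
    atomic_thread_fence(memory_order_release);
    old = atomic_fetch_sub(&o->refcnt, 1);
    assert(old != 0);                   /* no underflow */
    if (old == 1) {
        o->destroyed = 1;
        return (1);
    }
    return (0);
}
```

Because `old` comes from the atomic itself, the assertion cannot be confused by another thread changing the counter between the update and the check, and no extra read of the counter is needed.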

Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13880

21 months ago ZTS: parameter expansion in zfs_unshare_006_pos
John Wren Kennedy [Thu, 15 Sep 2022 20:14:35 +0000 (14:14 -0600)]
ZTS: parameter expansion in zfs_unshare_006_pos

zfs_unshare_006 checks to see if a dataset still has an active SMB
share after doing an NFS unshare -a. The test could fail because the
check for the SMB share does not expect dashes in a dataset name to be
converted to underscores as pathname delimiters are.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #13893

21 months agoAdd coverity model to repository
Richard Yao [Thu, 15 Sep 2022 18:50:19 +0000 (14:50 -0400)]
Add coverity model to repository

Other projects such as the python project include their coverity models
in their repositories. This provides transparency, which is beneficial
in open source projects. Therefore, it is a good idea to include the
coverity model in our repository too.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13884

21 months agoFix use-after-free bugs in icp code
Richard Yao [Thu, 15 Sep 2022 18:46:42 +0000 (14:46 -0400)]
Fix use-after-free bugs in icp code

These were reported by Coverity as "Read from pointer after free" bugs.
Presumably, it did not report it as a use-after-free bug because it does
not understand the inline assembly that implements the atomic
instruction.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13881

21 months agoCI: revert `--with-config=dist` to hotfix Ubuntu 20.04
George Melikov [Wed, 14 Sep 2022 23:26:57 +0000 (02:26 +0300)]
CI: revert `--with-config=dist` to hotfix Ubuntu 20.04

Recently Github action runners started to fail on kmod build.
Revert --with-config=dist from ./configure section of github
runners to stabilize CI for now.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #13894

21 months agoFreeBSD: Fix integer conversion for vnlru_free{,_vfsops}()
Richard Yao [Wed, 14 Sep 2022 19:51:55 +0000 (15:51 -0400)]
FreeBSD: Fix integer conversion for vnlru_free{,_vfsops}()

When reviewing #13875, I noticed that our FreeBSD code has an issue
where it converts from `int64_t` to `int` when calling
`vnlru_free{,_vfsops}()`. The result is that if the int64_t is `1 <<
36`, the int will be 0, since the low bits are 0. Even when some low
bits are set, a value such as `((1 << 36) + 1)` would truncate to 1,
which is wrong.

There is protection against this on 32-bit platforms, but on 64-bit
platforms, there is no check to protect us, so we add a check.
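
The form of the added check is not described here. One plausible
approach, shown purely as an assumption, is to clamp the value to the
range of `int` before the call; `clamp_to_int()` is a hypothetical
helper:

```c
#include <limits.h>
#include <stdint.h>

/* Clamp a 64-bit count into the range of int instead of truncating. */
static int
clamp_to_int(int64_t n)
{
	if (n > INT_MAX)
		return (INT_MAX);
	if (n < INT_MIN)
		return (INT_MIN);
	return ((int)n);
}
```

With this helper, `1 << 36` (computed in 64 bits) becomes `INT_MAX`
rather than 0.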

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13882

21 months agoAdd assertion to dsl_dataset_set_compression_sync
Richard Yao [Wed, 14 Sep 2022 19:50:03 +0000 (15:50 -0400)]
Add assertion to dsl_dataset_set_compression_sync

Coverity pointed out that if we somehow receive SPA_FEATURE_NONE, we
will use a negative number as an array index. A defensive assertion
seems appropriate.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13872

21 months agoFix theoretical "use-after-free" in dbuf_prefetch_indirect_done()
Richard Yao [Wed, 14 Sep 2022 00:58:29 +0000 (20:58 -0400)]
Fix theoretical "use-after-free" in dbuf_prefetch_indirect_done()

Coverity complains about a "use-after-free" bug in
`dbuf_prefetch_indirect_done()` because we use a pointer value after
freeing its buffer. The pointer is used for refcounting in ARC (as the
reference holder). There is a theoretical situation where the pointer
would be reused in a way that causes the refcounting to collide, so we
change the order in which we call arc_buf_destroy() and
dbuf_prefetch_fini() to match the rest of the function. This prevents
the theoretical situation from being a possibility.

Also, we have a few return statements with a value, despite this being a
void function. We clean those up while we are making changes here.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13869

21 months agoRemove incorrect free() in zfs_get_pci_slots_sys_path()
Richard Yao [Wed, 14 Sep 2022 00:00:53 +0000 (20:00 -0400)]
Remove incorrect free() in zfs_get_pci_slots_sys_path()

Coverity found this. We attempted to free tmp, which is a pointer to a
string that should be freed by the caller.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13864

21 months agoCleanup: Make memory barrier definitions consistent across kernels
Richard Yao [Tue, 13 Sep 2022 23:59:33 +0000 (19:59 -0400)]
Cleanup: Make memory barrier definitions consistent across kernels

We inherited membar_consumer() and membar_producer() from OpenSolaris,
but we had replaced membar_consumer() with Linux's smp_rmb() in
zfs_ioctl.c. The FreeBSD SPL consequently implemented a shim for the
Linux-only smp_rmb().

We reinstate membar_consumer() in platform independent code and fix the
FreeBSD SPL to implement membar_consumer() in a way analogous to Linux.

Reviewed-by: Konstantin Belousov <kib@FreeBSD.org>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13843

21 months agoFix memory leak in ztest
Richard Yao [Tue, 13 Sep 2022 23:53:21 +0000 (19:53 -0400)]
Fix memory leak in ztest

Coverity found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13863

21 months agoCleanup dead spa_boot code
Richard Yao [Tue, 13 Sep 2022 23:40:10 +0000 (19:40 -0400)]
Cleanup dead spa_boot code

Unused code detected by coverity.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13868

21 months agozpool_load_compat() should create strings of length ZFS_MAXPROPLEN
Richard Yao [Mon, 12 Sep 2022 19:54:43 +0000 (15:54 -0400)]
zpool_load_compat() should create strings of length ZFS_MAXPROPLEN

Otherwise, `strlcat()` can overflow them.

Coverity found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13866

21 months agovdev_draid_lookup_map() should not iterate outside draid_maps
Richard Yao [Mon, 12 Sep 2022 19:51:17 +0000 (15:51 -0400)]
vdev_draid_lookup_map() should not iterate outside draid_maps

Coverity reported this as an out-of-bounds read.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13865

21 months agoFix file descriptor handling in zdb_copy_object()
Richard Yao [Mon, 12 Sep 2022 19:34:10 +0000 (15:34 -0400)]
Fix file descriptor handling in zdb_copy_object()

Coverity found a file descriptor leak. Eyeballing it showed that we had
no handling for the `open()` call failing either. We can address both of
these at once.
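
A minimal sketch of the pattern, assuming a hypothetical `write_file()`
helper rather than the real `zdb_copy_object()`: check the result of
`open()`, and close the descriptor on every path once it is open.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Write len bytes from buf to path. Returns 0 on success, -1 on failure.
 * The descriptor is closed on every path after a successful open().
 */
static int
write_file(const char *path, const void *buf, size_t len)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return (-1);	/* open() failed; nothing to close */

	ssize_t n = write(fd, buf, len);
	int rc = (n == (ssize_t)len) ? 0 : -1;

	/* Always close, even when the write failed. */
	if (close(fd) != 0)
		rc = -1;
	return (rc);
}
```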

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13862

21 months agoFix use-after-free in btree code
Richard Yao [Mon, 12 Sep 2022 18:22:15 +0000 (14:22 -0400)]
Fix use-after-free in btree code

Coverity static analysis found these.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #10989
Closes #13861

21 months agoCleanup: Use OpenSolaris functions to call scheduler
Richard Yao [Mon, 12 Sep 2022 16:55:37 +0000 (12:55 -0400)]
Cleanup: Use OpenSolaris functions to call scheduler

In our codebase, `cond_resched()` and `schedule()` are Linux kernel
functions that have replaced the OpenSolaris `kpreempt()` functions in
the codebase to such an extent that `kpreempt()` in zfs_context.h was
broken. Nobody noticed because we did not actually use it. The header
had defined `kpreempt()` as `yield()`, which works on OpenSolaris and
Illumos where `sched_yield()` is a wrapper for `yield()`, but that does
not work on any other platform.

The FreeBSD platform specific code implemented shims for these, but the
shim for `schedule()` forced us to wait, which is different than merely
rescheduling to another thread as the original Linux code does, while
the shim for `cond_resched()` had the same definition as its kernel
kpreempt() shim.

After studying this, I have concluded that we should reintroduce the
kpreempt() function in platform independent code with the following
definitions:

- In the Linux kernel:
kpreempt(unused) -> cond_resched()

- In the FreeBSD kernel:
kpreempt(unused) -> kern_yield(PRI_USER)

- In userspace:
kpreempt(unused) -> sched_yield()
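
The userspace mapping above could be written roughly as follows. The
macro body is an assumption based on the list, and `spin_politely()` is
a hypothetical caller added only to show usage:

```c
#include <sched.h>

/*
 * Userspace mapping from the list above. The argument is ignored,
 * matching the "unused" parameter.
 */
#define	kpreempt(unused)	sched_yield()

/* Hypothetical loop that offers the CPU to other threads each pass. */
static int
spin_politely(int iterations)
{
	int yields = 0;

	for (int i = 0; i < iterations; i++) {
		/* sched_yield() returns 0 on success. */
		if (kpreempt(0) == 0)
			yields++;
	}
	return (yields);
}
```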

In userspace, nothing changes from this cleanup. In the kernels, the
function `fm_fini()` will now call `kern_yield(PRI_USER)` on FreeBSD and
`cond_resched()` on Linux.  This is instead of `pause("schedule", 1)` on
FreeBSD and `schedule()` on Linux. This makes our behavior consistent
across platforms.

Note that Linux's SPL continues to use `cond_resched()` and
`schedule()`.  However, those functions have been removed from both the
FreeBSD code and userspace code.

This should have the benefit of making it slightly easier to port the
code to new platforms by making how things should be mapped less
confusing.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13845

21 months agoMake zfs-share service resilient to stale exports
Don Brady [Fri, 9 Sep 2022 17:54:16 +0000 (11:54 -0600)]
Make zfs-share service resilient to stale exports

There are a few cases where stale entries in /etc/exports.d/zfs.exports
will cause the nfs-server service to fail when starting up.

Since the nfs-server startup consumes /etc/exports.d/zfs.exports, the
zfs-share service (which rebuilds the list of zfs exports) should run
before the nfs-server service.

To make the zfs-share service resilient to stale exports, this change
truncates the zfs config file as part of the zfs share -a operation.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #13775

21 months agoFreeBSD: Replace legacy make_dev() interface usage
Ryan Moeller [Thu, 8 Sep 2022 17:40:18 +0000 (13:40 -0400)]
FreeBSD: Replace legacy make_dev() interface usage

The function make_dev_s() was introduced to replace make_dev() in
FreeBSD 11.0.  It allows further specification of properties and flags
and returns an error code on failure.  Using this we can fail loading
the module more gracefully than a panic in situations such as when a
device named zfs already exists.  We already use it for zvols.

Use make_dev_s() for /dev/zfs.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13854

21 months agozed: Fix config_sync autoexpand flood
Tony Hutter [Thu, 8 Sep 2022 17:32:30 +0000 (10:32 -0700)]
zed: Fix config_sync autoexpand flood

Users were seeing floods of `config_sync` events when autoexpand was
enabled.  This happened because all "disk status change" udev events
invoke the autoexpand codepath, which calls zpool_relabel_disk(),
which in turn cause another "disk status change" event to happen,
in a feedback loop.  Note that "disk status change" happens every time
a user calls close() on a block device.

This commit breaks the feedback loop by only allowing an autoexpand
to happen if the disk actually changed size.
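
The guard can be sketched as below. `struct disk_state` and
`should_autoexpand()` are hypothetical; how ZED actually tracks the
previous size is not described here:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-disk state; ZED's real bookkeeping may differ. */
struct disk_state {
	uint64_t last_size;	/* size seen at the previous event */
};

/* Returns true only when the event carries a new size. */
static bool
should_autoexpand(struct disk_state *ds, uint64_t new_size)
{
	if (new_size == ds->last_size)
		return (false);	/* size unchanged: do not relabel again */
	ds->last_size = new_size;
	return (true);
}
```

An event triggered by the relabel itself carries the same size as the
last one, so it is dropped and the loop ends.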

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes: #7132
Closes: #7366
Closes #13729

21 months agoImprove too large physical ashift handling
Alexander Motin [Thu, 8 Sep 2022 17:30:53 +0000 (13:30 -0400)]
Improve too large physical ashift handling

When iterating through children physical ashifts for vdev, prefer
ones above the maximum logical ashift, that we can actually use,
but within the administrator defined maximum.

When selecting the top-level vdev ashift, do not set it to the defined
maximum when the physical ashift is even higher; just ignore it instead.
Using the maximum does not prevent misaligned writes, but reduces
space efficiency.  Since ZFS tries to write data sequentially and
aggregates the writes, in many cases large misaligned writes may not
be as bad as the space penalty otherwise.

Allow internal physical ashifts for vdevs higher than SHIFT_MAX.
Maybe one day the allocator or aggregation could benefit from that.

Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB),
so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB),
but only if it really has to or explicitly told to, but not as an
"optimization".

There are some read-intensive NVMe SSDs that report a Preferred Write
Alignment of 64KB, and an attempt to build a RAIDZ2 of those leads to
a space inefficiency that can't be justified.  Instead these changes
make ZFS fall back to a logical ashift of 12 (4KB) by default and
only warn the user that it may be suboptimal for performance.
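
For reference, an ashift is the base-2 logarithm of the sector size, so
the values quoted above convert as in this small sketch
(`ashift_to_bytes()` is a hypothetical helper, not a ZFS function):

```c
#include <stdint.h>

/* Sector size in bytes implied by an ashift value (size = 2^ashift). */
static uint64_t
ashift_to_bytes(unsigned int ashift)
{
	return ((uint64_t)1 << ashift);
}
```

So 12 maps to 4096 bytes, 14 to 16384 bytes, and 16 to 65536 bytes.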

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #13798

21 months agoAdd Linux posix_fadvise support
Finix1979 [Thu, 8 Sep 2022 17:29:41 +0000 (01:29 +0800)]
Add Linux posix_fadvise support

The purpose of this PR is to accept the fadvise ioctl from userland
to do read-ahead on demand.

It could dramatically improve sequential read performance especially
when primarycache is set to metadata or zfs_prefetch_disable is 1.

If the file is mmaped, generic_fadvise is also called for page cache
read-ahead besides dmu_prefetch.

Only POSIX_FADV_WILLNEED and POSIX_FADV_SEQUENTIAL are supported in
this PR currently.
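
From userland, a caller might request read-ahead roughly like this.
`posix_fadvise()` and `POSIX_FADV_SEQUENTIAL` are the standard POSIX
interface; `open_sequential()` is a hypothetical wrapper for
illustration:

```c
#ifndef _POSIX_C_SOURCE
#define	_POSIX_C_SOURCE	200112L
#endif
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a file and hint that it will be read sequentially.
 * Returns the descriptor, or -1 if open() fails. A failed hint is
 * not treated as fatal because the advice is only an optimization.
 */
static int
open_sequential(const char *path)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);

	/* Offset 0 with length 0 covers the whole file. */
	(void) posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
	return (fd);
}
```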

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Finix Yan <yancw@info2soft.com>
Closes #13694

21 months agoLinux SPL module init: Handle memory allocation failures correctly
Richard Yao [Thu, 8 Sep 2022 17:28:20 +0000 (13:28 -0400)]
Linux SPL module init: Handle memory allocation failures correctly

Upon inspection of our code, I noticed that we assume that
__alloc_percpu() cannot fail, and while it probably never has failed in
practice, technically, it can fail, so we should handle that.

Additionally, we incorrectly assume that `taskq_create()` in
spl_kmem_cache_init() cannot fail. The same remark applies to it.

Lastly, `spl_init()` failures should always return negative error
values, but in some places, we are returning positive 1, which is
incorrect. We change those values to their correct error codes.
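
The convention can be sketched in userspace C. `example_init()` is a
hypothetical routine, and `malloc()` stands in for the kernel
allocators mentioned above:

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical init routine following the Linux kernel convention:
 * 0 on success, a negative errno value on failure.
 */
static int
example_init(size_t nbytes, void **out)
{
	void *p = malloc(nbytes);

	if (p == NULL)
		return (-ENOMEM);	/* not a bare positive 1 */
	*out = p;
	return (0);
}
```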

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13847

21 months agoFix build on FreeBSD/powerpc64*
pkubaj [Thu, 8 Sep 2022 17:27:25 +0000 (17:27 +0000)]
Fix build on FreeBSD/powerpc64*

There's no VSX handler on FreeBSD for now.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Piotr Kubaj <pkubaj@FreeBSD.org>
Closes #13848

21 months agomake DMU_OT_IS_METADATA and DMU_OT_IS_ENCRYPTED return B_TRUE or B_FALSE
Christian Schwarz [Thu, 8 Sep 2022 00:04:15 +0000 (02:04 +0200)]
make DMU_OT_IS_METADATA and DMU_OT_IS_ENCRYPTED return B_TRUE or B_FALSE

Without this patch, the

    ASSERT3U(dbuf_is_metadata(db), ==, arc_is_metadata(buf));

at the beginning of dbuf_assign_arcbuf can panic
if the object type is a DMU_OT_NEWTYPE that has
DMU_OT_METADATA set.

While we're at it, fix DMU_OT_IS_ENCRYPTED as well.
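
The underlying problem is that a raw mask test yields the flag bit, not
1, so it does not compare equal to a true boolean. A sketch with
hypothetical names (`EX_FLAG_METADATA`, `ex_boolean_t`, and the `EX_*`
macros are stand-ins, not the real DMU definitions):

```c
/* Stand-in for boolean_t, B_FALSE and B_TRUE. */
typedef enum { EX_B_FALSE = 0, EX_B_TRUE = 1 } ex_boolean_t;

/* Hypothetical flag bit; the real DMU_OT_* values live in ZFS headers. */
#define	EX_FLAG_METADATA	0x40

/* Raw mask test: yields 0 or 0x40, so it never equals EX_B_TRUE. */
#define	EX_IS_METADATA_RAW(ot)	((ot) & EX_FLAG_METADATA)

/* Normalized test: always yields EX_B_TRUE or EX_B_FALSE. */
#define	EX_IS_METADATA(ot)	\
	(((ot) & EX_FLAG_METADATA) ? EX_B_TRUE : EX_B_FALSE)
```

An equality assertion between the raw form and a function that returns
a proper boolean fails whenever the flag is set, even though both sides
are "true".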

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #13842

21 months agoAdd xattr_handler support for Android kernels
Walter Huf [Tue, 6 Sep 2022 17:02:18 +0000 (10:02 -0700)]
Add xattr_handler support for Android kernels

Some ARM BSPs run the Android kernel, which has
a modified xattr_handler->get() function signature.
This adds support to compile against these kernels.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Walter Huf <hufman@gmail.com>
Closes #13824

21 months agoFreeBSD: add kqfilter support for zvol cdev
Rob Wing [Wed, 2 Feb 2022 05:00:57 +0000 (20:00 -0900)]
FreeBSD: add kqfilter support for zvol cdev

The only event hooked up is NOTE_ATTRIB, which is triggered when the
device is resized.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Rob Wing <rew@FreeBSD.org>
Closes #13773

21 months agoFreeBSD: add knlist_init_sx() for exclusive locks
Rob Wing [Sun, 14 Aug 2022 05:09:49 +0000 (21:09 -0800)]
FreeBSD: add knlist_init_sx() for exclusive locks

This will be used to implement kqfilter support for zvol cdevs.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Rob Wing <rew@FreeBSD.org>
Closes #13773

21 months agoCleanup Raid-Z Typo fixes
Richard Yao [Tue, 6 Sep 2022 16:43:21 +0000 (12:43 -0400)]
Cleanup Raid-Z Typo fixes

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13834

21 months agoFix column width in 'zpool iostat -v' and 'zpool list -v'
Samuel [Tue, 6 Sep 2022 16:37:47 +0000 (22:07 +0530)]
Fix column width in 'zpool iostat -v' and 'zpool list -v'

This commit fixes a minor spacing issue caused when
enumerating vdev names, which originated from #13031

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Samuel Wycliffe <samuelwycliffe@gmail.com>
Closes #13811

21 months agoAdd DD_FIELD string for snapshots_changed property
Umer Saleem [Fri, 2 Sep 2022 20:33:50 +0000 (01:33 +0500)]
Add DD_FIELD string for snapshots_changed property

This commit adds DD_FIELD string used in extensified dsl_dir zap object
for snapshots_changed property.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #13819

21 months agoAdd zfs.sync.snapshot_rename
Andriy Gapon [Fri, 2 Sep 2022 20:31:19 +0000 (23:31 +0300)]
Add zfs.sync.snapshot_rename

Only the single snapshot rename is provided.
The recursive or more complex rename can be scripted.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #13802

21 months agoFreeBSD: Organize sysctls
Ryan Moeller [Tue, 9 Aug 2022 09:05:47 +0000 (09:05 +0000)]
FreeBSD: Organize sysctls

FreeBSD had a few platform-specific ARC tunables in the wrong place:

- Move FreeBSD-specific ARC tunables into the same vfs.zfs.arc node as
  the rest of the ARC tunables.
- Move the handlers from arc_os.c to sysctl_os.c and add compat sysctls
  for the legacy names.

While here, some additional clean up:

- Most handlers are specific to a particular variable and don't need a
  pointer passed through the args.
- Group blocks of related variables, handlers, and sysctl declarations
  into logical sections.
- Match variable types for temporaries in handlers with the type of the
  global variable.
- Remove leftover comments.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13756

21 months agoFreeBSD: Mark ZFS_MODULE_PARAM_CALL as MPSAFE
Ryan Moeller [Tue, 9 Aug 2022 09:05:29 +0000 (09:05 +0000)]
FreeBSD: Mark ZFS_MODULE_PARAM_CALL as MPSAFE

ZFS_MODULE_PARAM_CALL handlers implement their own locking if needed
and do not require Giant.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13756

21 months agoAdd zilstat script to report zil kstats in a user friendly manner
Ameer Hamza [Fri, 2 Sep 2022 20:24:07 +0000 (01:24 +0500)]
Add zilstat script to report zil kstats in a user friendly manner

Added a python script to process both global and per dataset
zil kstats and report them in a user friendly manner similar
to arcstat and dbufstat.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13704

21 months agoApply arc_shrink_shift to ARC above arc_c_min
Alexander Motin [Fri, 2 Sep 2022 20:21:18 +0000 (16:21 -0400)]
Apply arc_shrink_shift to ARC above arc_c_min

It makes sense to free memory in smaller chunks when approaching
arc_c_min to let other kernel subsystems free more, since after
that point we can't free anything.  This also matches behavior on
Linux, where only the size above arc_c_min is reported to the shrinker.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #13794

21 months agoFreeBSD: Cleanup dead code from VFS
Richard Yao [Fri, 2 Sep 2022 20:20:10 +0000 (16:20 -0400)]
FreeBSD: Cleanup dead code from VFS

The vfs_*_feature() macros turn anything that uses them into dead code,
so we can delete all of it.

As a side effect, zfs_set_fuid_feature() is now identical in
module/os/freebsd/zfs/zfs_vnops_os.c and
module/os/linux/zfs/zfs_vnops_os.c. A few other functions are identical
too. Future cleanup could move these into a common file.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13832

21 months agoAlloc zdb_cd_t to fix stack issue
Andrew Innes [Fri, 2 Sep 2022 20:15:18 +0000 (04:15 +0800)]
Alloc zdb_cd_t to fix stack issue

Alloc zdb_cd_t since it is too large for the stack on Windows
which results in `zdb` crashing immediately.
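
The general technique, sketched with a hypothetical `big_state_t`
rather than the real `zdb_cd_t`, is to move the large object from the
stack to the heap:

```c
#include <stdlib.h>

/* Hypothetical large state object; zdb_cd_t itself is not shown here. */
typedef struct big_state {
	char buf[1 << 20];	/* 1 MiB: too large for some stacks */
} big_state_t;

static big_state_t *
big_state_create(void)
{
	/* calloc zero-fills, like a zero-initialized local would be. */
	return (calloc(1, sizeof (big_state_t)));
}
```

The caller is then responsible for calling `free()` on the returned
pointer when it is done.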

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrew Innes <andrew.c12@gmail.com>
Co-authored-by: Jorgen Lundman <lundman@lundman.net>
Closes #13807

22 months agoImporting from cachefile can trip assertion
George Wilson [Fri, 26 Aug 2022 21:04:27 +0000 (16:04 -0500)]
Importing from cachefile can trip assertion

When importing from cachefile, it is possible that the builtin retry
logic will trip an assertion because it also fails to find the pool.
This fix addresses that case and returns the correct error message to
the user.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #13781

22 months agoZTS: zvol_stress: fix race condition with zinject usage
Christian Schwarz [Thu, 25 Aug 2022 21:22:10 +0000 (23:22 +0200)]
ZTS: zvol_stress: fix race condition with zinject usage

In automated ZTS runs, I'd occasionally hit

    log_fail "Expected to see some write errors"

because there weren't any write errors.

The reason is that we're not syncing the zpool before `zinject -c`.
If the writes by `dd` aren't synced out at the time `zinject -c` runs,
they will not hit an error and we'll hit the log_fail above.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #13793

22 months agoRevert "Avoid panic with recordsize > 128k, raw sending and no large_blocks"
Brian Behlendorf [Thu, 25 Aug 2022 20:33:32 +0000 (13:33 -0700)]
Revert "Avoid panic with recordsize > 128k, raw sending and no large_blocks"

This reverts commit 80a650b7bb04bce3aef5e4cfd1d966e3599dafd4.  This change
inadvertently introduced a regression in ztest where one of the new ASSERTs
is triggered in dsl_scan_visitbp().

Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #12275
Closes #13799

22 months agoUpdates for snapshots_changed property
Umer Saleem [Wed, 24 Aug 2022 21:20:43 +0000 (02:20 +0500)]
Updates for snapshots_changed property

Currently, snapshots_changed property is stored in dd_props_zapobj, due
to which the property is assumed to be local. This causes a difference
in behavior with respect to other readonly properties.

This commit stores the snapshots_changed property in dd_object. Source
is not set to local in this case, which makes it consistent with other
readonly properties.

This commit also updates the date string format to include seconds.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #13785

22 months agoFix zpool status in case of unloaded keys
George Amanakis [Tue, 23 Aug 2022 00:42:01 +0000 (02:42 +0200)]
Fix zpool status in case of unloaded keys

When scrubbing an encrypted filesystem with an unloaded key, still
report an error in zpool status.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alek Pinchuk <apinchuk@axcient.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #13675
Closes #13717

22 months agoPrevent zevent list from consuming all of kernel memory
Paul Dagnelie [Mon, 22 Aug 2022 19:36:22 +0000 (12:36 -0700)]
Prevent zevent list from consuming all of kernel memory

There are a couple changes included here. The first is to introduce
a cap on the size the ZED will grow the zevent list to. One million
entries is more than enough for most use cases, and if you are
overflowing that value, the problem needs to be addressed another
way. The value is also tunable, for those who want the limit to be
higher or lower.

The other change is to add a kernel module parameter that allows
snapshot creation/deletion to be exempted from the history logging;
for most workloads, having these things logged is valuable, but for
some workloads it produces large quantities of log spam and isn't
especially helpful.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Issue #13374
Closes #13753

22 months agocontrib: dracut: zfs-snapshot-bootfs: exit status fix
gregory-lee-bartholomew [Fri, 12 Aug 2022 21:28:15 +0000 (16:28 -0500)]
contrib: dracut: zfs-snapshot-bootfs: exit status fix

When the zfs-snapshot-bootfs service attempts to create a snapshot
that already exists, the exit status of the command is non-zero and
the service reports failed to the systemd service manager. This is a
common occurrence if bootfs.snapshot is left set on the kernel command
line and it should not be considered a failure.

This service was originally set to ignore this error by prefixing
the command with - on the ExecStart line, but the leading - appears
to have been dropped in #13359.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
Closes #13769

22 months agoarcstat: fix -p option
r-ricci [Fri, 12 Aug 2022 21:21:52 +0000 (22:21 +0100)]
arcstat: fix -p option

When the -p option is used, a list of floats is passed to sep.join(),
which expects strings. Fix this by converting each value to a string.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Roberto Ricci <ricci@disroot.org>
Closes #12916
Closes #13767

22 months agoEnable relatime by default
George Melikov [Fri, 12 Aug 2022 21:20:25 +0000 (00:20 +0300)]
Enable relatime by default

Linux sets relatime on mount by default for any file system,
but relatime=off in ZFS disables it explicitly.

Let's be consistent with other file systems on Linux.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #13614

22 months agoZTS: Fix zpool_expand_001_pos
Tony Hutter [Tue, 9 Aug 2022 20:26:46 +0000 (13:26 -0700)]
ZTS: Fix zpool_expand_001_pos

`zpool_expand_001_pos` was often failing due to not seeing autoexpand
commands in the `zpool history`.  During testing, I found this to be
unreliable (sometimes the "online" wouldn't appear in `zpool history`)
and unnecessary, as we could simply check that the pool increased in
size.

This commit revamps the test to check for the expanded pool size
and corresponding new free space.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13743

22 months agoAdd comment on acb_zio_dummy
Christian Schwarz [Mon, 8 Aug 2022 23:55:13 +0000 (01:55 +0200)]
Add comment on acb_zio_dummy

Thanks to George Wilson for clarifying this on Slack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #13698

22 months agoLinux 6.0 compat: register_shrinker() now var-arg
Coleman Kane [Mon, 8 Aug 2022 23:18:30 +0000 (19:18 -0400)]
Linux 6.0 compat: register_shrinker() now var-arg

The 6.0 kernel added a printf-style var-arg for args > 0 to the
register_shrinker function, in order to add names to shrinkers, in
commit e33c267ab70de4249d22d7eab1cc7d68a889bac2. This enables the
shrinkers to have friendly names exposed in /sys/kernel/debug/shrinker/.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #13748

22 months agolibzfs: Remove unused zpool_get_physpath()
Ryan Moeller [Fri, 5 Aug 2022 00:04:09 +0000 (20:04 -0400)]
libzfs: Remove unused zpool_get_physpath()

This is an oddly specific function that has never had any consumers in
the history of this repo.  Get rid of it and the pile of helper
functions that exist for it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13724

22 months agozpool: fix redundancy check after vdev removal
Stéphane Lesimple [Fri, 5 Aug 2022 00:02:57 +0000 (03:02 +0300)]
zpool: fix redundancy check after vdev removal

The presence of indirect vdevs was confusing get_redundancy(), which
considered a pool with e.g. only mirror top-level vdevs and at least
one indirect vdev (due to the removal of a previous vdev) as already
having a broken redundancy, which is not the case. This led to the
possibility of compromising the redundancy of a pool by adding
mismatched vdevs without requiring the use of `-f`, and with no
visible notice or warning.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Stéphane Lesimple <speed47_github@speed47.net>
Closes #13705
Closes #13711

22 months agoLinux 5.20 compat: blk_cleanup_disk()
Brian Behlendorf [Thu, 4 Aug 2022 00:37:52 +0000 (17:37 -0700)]
Linux 5.20 compat: blk_cleanup_disk()

As of the Linux 5.20 kernel blk_cleanup_disk() has been removed,
all callers should use put_disk().

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13728

22 months agoLinux 5.20 compat: bdevname()
Brian Behlendorf [Wed, 3 Aug 2022 18:35:47 +0000 (11:35 -0700)]
Linux 5.20 compat: bdevname()

As of the Linux 5.20 kernel bdevname() has been removed, all
callers should use snprintf() and the "%pg" format specifier.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13728

22 months agoDon't double-zero buffers in fault management nvlists
Paul Dagnelie [Thu, 4 Aug 2022 23:53:47 +0000 (16:53 -0700)]
Don't double-zero buffers in fault management nvlists

This is a small cleanup for a trivial problem which happened to
be noticed while another issue was being investigated.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #13730

22 months agoAdd snapshots_changed as property
Umer Saleem [Tue, 2 Aug 2022 23:45:30 +0000 (04:45 +0500)]
Add snapshots_changed as property

Make dd_snap_cmtime property persistent across mount and unmount
operations by storing in ZAP and restore the value from ZAP on hold
into dd_snap_cmtime instead of updating it.

Expose dd_snap_cmtime as 'snapshots_changed' property that provides a
mechanism to quickly determine whether snapshot list for dataset has
changed without having to mount a dataset or iterate the snapshot list.

It specifies the time at which a snapshot for a dataset was last
created or deleted. This allows us to be more efficient in how often
we query snapshots.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #13635