]> CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log
FreeBSD/FreeBSD.git
3 years agoRemove UIO_ZEROCOPY functions structures
Matthew Macy [Fri, 30 Oct 2020 17:00:33 +0000 (10:00 -0700)]
Remove UIO_ZEROCOPY functions structures

The original xuio zero copy functionality has always been unused
on Linux and FreeBSD.  Remove this disabled code to avoid any
confusion and improve readability.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11124

3 years agoYield periodically when rebuilding L2ARC
Alexander Motin [Fri, 30 Oct 2020 15:57:54 +0000 (11:57 -0400)]
Yield periodically when rebuilding L2ARC

L2ARC devices of several terabytes filled with 4KB blocks may take 15
minutes to rebuild.  Due to the way L2ARC log reading is implemented
it is quite likely that for all that time rebuild thread will never
sleep.  At least on FreeBSD kernel threads have absolute priority and
can not be preempted by threads with lower priorities.  If some thread
is also bound to that specific CPU it may not get any CPU time for all
the 15 minutes.

Reviewed-by: Cedric Berger <cedric@precidata.com>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #11116

3 years agoUpdate references to nonexistent man pages in code
Ryan Moeller [Fri, 30 Oct 2020 15:55:59 +0000 (11:55 -0400)]
Update references to nonexistent man pages in code

Refer to the correct section or alternative for FreeBSD and Linux.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11132

3 years agoFreeBSD: Remove BIO_ORDERED flag from BIO_FLUSH
Alexander Motin [Fri, 30 Oct 2020 15:50:57 +0000 (11:50 -0400)]
FreeBSD: Remove BIO_ORDERED flag from BIO_FLUSH

ZFS always waits for the write completion before flushing the cache.
That is why it does not require explicit ordering fences around it,
which are pretty difficult to implement for NVMe, since one has no
internal concept of strict request ordering.

This was already removed from FreeBSD once, but got resurrected
by mistake during OpenZFS merge.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #11130

3 years agoZTS: Fix xattr_004_pos failure, don't use tmpfs
Tony Hutter [Fri, 30 Oct 2020 15:47:42 +0000 (08:47 -0700)]
ZTS: Fix xattr_004_pos failure, don't use tmpfs

Previously, xattr_004_pos would create files with xattrs on both
tmpfs and ext2, and then copy them to zfs to verify that their
xattrs were preserved.  However tmpfs doesn't support xattrs.

This was never noticed until Fedora 33.  In Fedora 32 and older,
/tmp was on the root partition (like ext4), whereas on Fedora 33
/tmp is actually tmpfs.  That caused this test to fail on Fedora 33.

This fix updates the test to only create the file on ext2, not tmpfs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #11133

3 years agoLinux: g/c leftover fence in zfs_znode_alloc
Mateusz Guzik [Thu, 29 Oct 2020 16:54:20 +0000 (17:54 +0100)]
Linux: g/c leftover fence in zfs_znode_alloc

The port removed provisions for zfs_znode_move but the cleanup missed
this bit. To quote the original:

[snip]
    list_insert_tail(&zfsvfs->z_all_znodes, zp);
    membar_producer();
    /*
     * Everything else must be valid before assigning z_zfsvfs makes the
     * znode eligible for zfs_znode_move().
     */
    zp->z_zfsvfs = zfsvfs;
[/snip]

In the current code it is immediately followed by unlock which issues
the same fence, thus plays no role in correctness.

Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11115

3 years agoFreeBSD: g/c unused zfs_znode_move support
Mateusz Guzik [Thu, 29 Oct 2020 16:52:50 +0000 (17:52 +0100)]
FreeBSD: g/c unused zfs_znode_move support

The allocator does not provide the functionality to begin with.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11114

3 years agoUse known license string for zlua
Brian Behlendorf [Tue, 27 Oct 2020 16:43:36 +0000 (09:43 -0700)]
Use known license string for zlua

The Linux kernel MODULE_LICENSE macro only recognizes a handful of
license strings and "MIT" is not one of the them.  Update the macro
to use "Dual MIT/GPL" which is recognized and what the kernel expects
MIT licensed modules to use.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11112
Closes #11113

3 years agoFreeBSD: Skip RAW kstat sysctls by default
Ryan Moeller [Mon, 26 Oct 2020 21:34:28 +0000 (17:34 -0400)]
FreeBSD: Skip RAW kstat sysctls by default

These kstats are often expensive to compute so we want to avoid them
unless specifically requested.

The following kstats are affected by this change:

kstat.zfs.${pool}.multihost
kstat.zfs.${pool}.misc.state
kstat.zfs.${pool}.txgs
kstat.zfs.misc.fletcher_4_bench
kstat.zfs.misc.vdev_raidz_bench
kstat.zfs.misc.dbufs
kstat.zfs.misc.dbgmsg

In FreeBSD 13, sysctl(8) has been updated to still list the
names/description/type of skipped sysctls so they are still
discoverable.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11099

3 years agoFreeBSD: catch up with 1300123 version bump
Mateusz Guzik [Mon, 26 Oct 2020 21:32:17 +0000 (22:32 +0100)]
FreeBSD: catch up with 1300123 version bump

- removed thread argument from VOP_INACTIVE
- removed cred argument from VOP_VPTOCNP

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11104

3 years agoRestore identification of VDEVs using non-native block size
Cy Schubert [Thu, 22 Oct 2020 19:15:17 +0000 (12:15 -0700)]
Restore identification of VDEVs using non-native block size

NAME         STATE     READ WRITE CKSUM
dsk02        ONLINE       0     0     0
  mirror-0   ONLINE       0     0     0
    ada1s4a  ONLINE       0     0     0
    ada2s4a  ONLINE       0     0     0  block size: 512B configured, 4096B native

Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Toomas Soome <tsoome@me.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed off by: Cy Schubert <cy@FreeBSD.org>
Closes #11088

3 years agoProperly format NAME subsection of zfs/zpool subcommands
xtouqh [Thu, 22 Oct 2020 18:28:10 +0000 (21:28 +0300)]
Properly format NAME subsection of zfs/zpool subcommands

Use proper names (i.e. zfs-allow and zpool-add) in NAME subsections
of zfs/zpool subcommands instead of current "pretty-printed" ones as
makewhatis utilities (or some implementations of it, namely the one
from mandoc suite used in FreeBSD) look not only at the document title
but also in NAME subsection, adding zfs(8)/zpool(8) to search results
which is not correct. (Common sense and other utilities splitting
subcommands in multiple man pages, e.g. git, do the same.)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: xtouqh <xtouqh@hotmail.com>
Closes #11086

3 years agoAdd missing zfs_arc_evict_batch_limit tunable
Ryan Moeller [Thu, 22 Oct 2020 17:18:26 +0000 (13:18 -0400)]
Add missing zfs_arc_evict_batch_limit tunable

It's even documented already.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11094

3 years agoarcstat: Add -a and -p options from FreeNAS
Ryan Moeller [Wed, 21 Oct 2020 21:09:14 +0000 (17:09 -0400)]
arcstat: Add -a and -p options from FreeNAS

Added -a option to automatically print all valid statistics.
Added -p option to suppress scaling of printed data.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Authored by: Nick Principe <32284693+powernap@users.noreply.github.com>
Ported-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11090

3 years agoShare zfs_fsync, zfs_read, zfs_write, et al between Linux and FreeBSD
Matthew Macy [Wed, 21 Oct 2020 21:08:06 +0000 (14:08 -0700)]
Share zfs_fsync, zfs_read, zfs_write, et al between Linux and FreeBSD

The zfs_fsync, zfs_read, and zfs_write function are almost identical
between Linux and FreeBSD.  With a little refactoring they can be
moved to the common code which is what is done by this commit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11078

3 years agoNon-l2arc pool reads shouldn't be l2arc misses
Adam D. Moss [Tue, 20 Oct 2020 18:39:52 +0000 (11:39 -0700)]
Non-l2arc pool reads shouldn't be l2arc misses

The current l2_misses accounting behavior treats all reads to pools
without a configured l2arc as an l2arc miss, IFF there is at least
one other pool on the system which does have an l2arc configured.

This makes it extremely hard to tune for an improved l2arc hit/miss
ratio because this ratio will be modulated by reads from pools which
do not (and should not) have l2arc devices; its upper limit will
depend on the ratio of reads from l2arc'd pools and non-l2arc'd pools.

This PR prevents ARC reads affecting l2arc stats (n.b. l2_misses is
the only relevant one) where the target spa doesn't have an l2arc.

Includes new test - l2arc_l2miss_pos.ksh

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Adam Moss <c@yotes.com>
Closes #10921

3 years agoMakefile.bsd: remove directory that no longer exists
Kyle Evans [Tue, 20 Oct 2020 18:34:59 +0000 (13:34 -0500)]
Makefile.bsd: remove directory that no longer exists

This was removed in a reorganization of directories preparing for the
merge of FreeBSD support, 006e9a408824 by mmacy. While llvm is perfectly
happy with the nonexistent -I directory, the gcc6 and gcc9 we can elect
to use as cross-toolchains both trip over it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Kyle Evans <kevans@FreeBSD.org>
Closes #11077

3 years agoFreeBSD: delete unreferenced file
Matthew Macy [Tue, 20 Oct 2020 15:53:16 +0000 (08:53 -0700)]
FreeBSD: delete unreferenced file

zfs_onexit_os.c was not deleted when it was removed from the build

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11079

3 years agoFix commitcheck on FreeBSD
Ryan Moeller [Tue, 20 Oct 2020 15:35:53 +0000 (11:35 -0400)]
Fix commitcheck on FreeBSD

Convert from bash to sh, avoid Perl regexes and \s, prune unused
functions.

Reviewed-by: Mateusz Piotrowski <0mp@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #11070

3 years agozed syslog entries drop important info
Don Brady [Mon, 19 Oct 2020 18:01:00 +0000 (12:01 -0600)]
zed syslog entries drop important info

ZED will log zevents summaries to the syslog, however the log entries
tend to drop event details that can be useful for diagnosis. This is
especially true for ereport events, like io, checksum, and delay.

Update the all-syslog.sh script to log additional event information.

Add an optional config option, ZED_SYSLOG_DISPLAY_GUIDS, to zed.rc
for choosing GUIDs over names for pool and vdev.

Change the default ZED_SYSLOG_SUBCLASS_EXCLUDE to exclude history_event
events. These events tend to be frequent, convey no meaningful info,
and are already logged in the zpool history.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #10967

3 years agoIgnore zpool_influxdb binary
Ryan Moeller [Fri, 16 Oct 2020 20:21:28 +0000 (16:21 -0400)]
Ignore zpool_influxdb binary

This was requested but forgotten in #10786.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #11071

3 years agoFreeBSD: add missing fplookup_vexec handler to special vop vectors
Mateusz Guzik [Thu, 15 Oct 2020 05:08:20 +0000 (05:08 +0000)]
FreeBSD: add missing fplookup_vexec handler to special vop vectors

Otherwise lookup can fail with EOPNOTSUPP or panic.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11066

3 years agoFreeBSD: g/c unused vop vector zfsctl_ops_shares_dir
Mateusz Guzik [Thu, 15 Oct 2020 05:07:02 +0000 (05:07 +0000)]
FreeBSD: g/c unused vop vector zfsctl_ops_shares_dir

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11066

3 years agoIgnore special vdev ashift for spa ashift min/max
Don Brady [Thu, 15 Oct 2020 21:45:16 +0000 (15:45 -0600)]
Ignore special vdev ashift for spa ashift min/max

The removal of a vdev in the normal class would fail if there was a
special or deup vdev that had a different ashift than the vdevs in
the normal class.

Moved the initialization of spa_min_ashift / spa_max_ashift from
vdev_open so that it occurs after the vdev allocation bias was
initialized (i.e. after vdev_load).

Caveat -- In order to remove a special/dedup vdev it must have the
same ashift as the normal pool vdevs.  This could perhaps be lifted
in the future (i.e. for the case where there is ample space in any
surviving special class vdevs)

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #9363
Closes #9364
Closes #11053

3 years agoFix crash caused by invalid snapshot names in redactnvl
Christian Schwarz [Wed, 14 Oct 2020 21:04:19 +0000 (23:04 +0200)]
Fix crash caused by invalid snapshot names in redactnvl

This is a follow up fix for commit 0fdd6106bb.  The VERIFY is
only true when we haven't hit an error code path.  See added
test case for a reproducer.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #11048

3 years agoFix incorrect deletion order in range_tree_add_impl gap case
Paul Dagnelie [Wed, 14 Oct 2020 15:59:54 +0000 (08:59 -0700)]
Fix incorrect deletion order in range_tree_add_impl gap case

After a side-effectful call like add or remove, references to range
segs stored in btrees can no longer be used safely.  We move the
remove call to just before the reinsertion call so that the seg
remains valid for as long as we need it.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #11044
Closes #11056

3 years agoFreeBSD: fix panic due to tqid overflow
Mateusz Guzik [Wed, 14 Oct 2020 15:57:03 +0000 (17:57 +0200)]
FreeBSD: fix panic due to tqid overflow

The 32-bit counter eventually wraps to 0 which is a sentinel for invalid
id.

Make it 64-bit on LP64 platforms and 0-check otherwise.

Note: Linux counterpart uses id stored per queue instead of a global.
I did not check going that way is feasible with the goal being the
minimal fix doing the job.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11059

3 years agoCross-platform acltype
Ryan Moeller [Wed, 14 Oct 2020 04:25:48 +0000 (00:25 -0400)]
Cross-platform acltype

The acltype property is currently hidden on FreeBSD and does not
reflect the NFSv4 style ZFS ACLs used on the platform.  This makes it
difficult to observe that a pool imported from FreeBSD on Linux has a
different type of ACL that is being ignored, and vice versa.

Add an nfsv4 acltype and expose the property on FreeBSD.

Make the default acltype nfsv4 on FreeBSD.

Setting acltype to an unhanded style is treated the same as setting
it to off.  The ACLs will not be removed, but they will be ignored.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10520

3 years agoFreeBSD: make adjustments for the standalone environment
Warner Losh [Wed, 14 Oct 2020 04:05:49 +0000 (22:05 -0600)]
FreeBSD: make adjustments for the standalone environment

In FreeBSD, there are three compile environments that are supported:
user land, the kernel and the bootloader / standalone. Adjust the
headers to compile in the standalone environment. Limit kernel-only
items from view when _STANDALONE is defined.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Warner Losh <imp@FreeBSD.org>
Closes #10998

3 years agodmu_zfetch: don't leak unreferenced stream when zfetch is freed
Matthew Macy [Wed, 14 Oct 2020 04:03:36 +0000 (21:03 -0700)]
dmu_zfetch: don't leak unreferenced stream when zfetch is freed

Currently streams are only freed when:
  - They have no referencing zfetch and and their I/O references
    go to zero.
  - They are more than 2s old and a new I/O request comes in on
    the same zfetch.

This means that we will leak unreferenced streams when their zfetch
structure is freed.

This change checks the reference count on a stream at zfetch free
time. If it is zero we free it immediately. If it has remaining
references we allow the prefetch callback to free it at I/O
completion time.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam Moss <c@yotes.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11052

3 years agoaarch64: Use proper guards for NEON instructions
Warner Losh [Wed, 14 Oct 2020 04:01:40 +0000 (22:01 -0600)]
aarch64: Use proper guards for NEON instructions

The zstd code assumes that if you are on aarch64, you have NEON
instructions. This is not necessarily true. In a boot loader, where
you might not have the VFP properly initialized, these instructions
may not be available. It's also an error to include arm_neon.h when
the NEON insturctions aren't enabled. Change the guards for using the
NEON instructions from __aarch64__ to __ARM_NEON which is the standard
symbol for knowing if they are available.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

Updated based on zstd pull request https://github.com/facebook/zstd/pull/2356

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Closes #11055

3 years agoAdd zfs.sh module unload error message
Adam D. Moss [Tue, 13 Oct 2020 23:51:54 +0000 (16:51 -0700)]
Add zfs.sh module unload error message

If modules fail to unload because of outstanding users, don't
consider this a success.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Adam Moss <c@yotes.com>
Closes #11042

3 years agodmu.h: remove stale declaration dmu_objset_snapshot_tmp
Christian Schwarz [Tue, 13 Oct 2020 23:46:00 +0000 (01:46 +0200)]
dmu.h: remove stale declaration dmu_objset_snapshot_tmp

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam Moss <c@yotes.com>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #11047

3 years agoFreeBSD: use cache_rename if available
Mateusz Guzik [Tue, 13 Oct 2020 23:41:26 +0000 (01:41 +0200)]
FreeBSD: use cache_rename if available

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11045

3 years agoblkg_tryget config test: initialize struct
Mathieu Velten [Tue, 13 Oct 2020 23:36:36 +0000 (01:36 +0200)]
blkg_tryget config test: initialize struct

Missing struct initialization in a config test results in the
interface being incorrectly detected.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam Moss <c@yotes.com>
Signed-off-by: Mathieu Velten <matmaul@gmail.com>
Closes #10713
Closes #11049

3 years agoIncrease Supported Linux Kernel to 5.9
Kjeld Schouten-Lebbing [Tue, 13 Oct 2020 16:51:13 +0000 (18:51 +0200)]
Increase Supported Linux Kernel to 5.9

This increases the Linux kernel version to 5.9 from 5.8
as most compatibility fixes should already be included.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Snajdr <snajpa@snajpa.net>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #11050

3 years agoFreeBSD: Improve libzfs_error_init messages
Ryan Moeller [Tue, 13 Oct 2020 16:38:40 +0000 (12:38 -0400)]
FreeBSD: Improve libzfs_error_init messages

It is a common mistake to have failed to autoload the module due to
permission issues when running a ZFS command as a user.  "Operation
not permitted" is an unhelpfully vague error message.

Use a thread-local message buffer to format a nicer error message.
We can infer that loading the kernel module failed if the module is
not loaded.  This can be extended with heuristics for other errors
in the future.

While looking at this stuff, remove an unused thread-local message
buffer found in libspl and remove some inaccurate verbiage from the
comment on libzfs_load_module.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11033

3 years agoExpose zfetch_max_idistance tunable
Ryan Moeller [Tue, 13 Oct 2020 16:32:34 +0000 (12:32 -0400)]
Expose zfetch_max_idistance tunable

FreeBSD had this value tunable before the switch to the new OpenZFS.
The tunable name has changed, breaking legacy compat.

Restore legacy compat for this tunable, properly expose the tunable
with the new name on all platforms, and document it in
zfs-module-parameters(5).

While here, clean up the documentation for zfetch_max_distance a bit.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11038

3 years agozil_parse: make callback parameters const
Christian Schwarz [Fri, 9 Oct 2020 16:34:54 +0000 (18:34 +0200)]
zil_parse: make callback parameters const

Code cleanup, a follow up commit to 4d55ea81.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Ryan Moeller <ryan@freqlabs.com>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #11020

3 years agoAdd zpool_influxdb command
Richard Elling [Fri, 9 Oct 2020 16:29:21 +0000 (09:29 -0700)]
Add zpool_influxdb command

A zpool_influxdb command is introduced to ease the collection
of zpool statistics into the InfluxDB time-series database.
Examples are given on how to integrate with the telegraf
statistics aggregator, a companion to influxdb.

Finally, a grafana dashboard template is included to show
how pool latency distributions can be visualized in a
ZFS + telegraf + influxdb  + grafana environment.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #10786

3 years agoLinux: Initialize zp in zfs_setattr_dir
Ryan Moeller [Fri, 9 Oct 2020 16:27:14 +0000 (12:27 -0400)]
Linux: Initialize zp in zfs_setattr_dir

The value of zp is used without having been initialized under some
conditions.  Initialize the pointer to NULL.

Add a regression test case using chown in acl/posix.  However, this is
not enough because the setup sets xattr=sa, which means zfs_setattr_dir
will not be called.  Create a second group of acl tests in acl/posix-sa
duplicating the acl/posix tests with symlinks, and remove xattr=sa from
the original acl/posix tests.  This provides more coverage for the
default xattr=on code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10043
Closes #11025

3 years agoReplace ZFS on Linux references with OpenZFS
Brian Behlendorf [Fri, 9 Oct 2020 03:10:13 +0000 (20:10 -0700)]
Replace ZFS on Linux references with OpenZFS

This change updates the documentation to refer to the project
as OpenZFS instead ZFS on Linux.  Web links have been updated
to refer to https://github.com/openzfs/zfs.  The extraneous
zfsonlinux.org web links in the ZED and SPL sources have been
dropped.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11007

3 years agoFix Linux modules uninstall
Jacob Adams [Fri, 9 Oct 2020 03:07:10 +0000 (03:07 +0000)]
Fix Linux modules uninstall

A missing semicolon between kmoddir variable declaration and the
uninstall for loop caused modules_uninstall-Linux to fail with:

    Syntax error: "do" unexpected

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jacob Adams <jacob@tookmund.com>
Closes #11032

3 years agoZTS: Fix path to /dev/null in nopwrite_recsize
Ryan Moeller [Thu, 8 Oct 2020 23:39:23 +0000 (19:39 -0400)]
ZTS: Fix path to /dev/null in nopwrite_recsize

Don't direct stdout and stderr of dd to $TEST_BASE_DIR/null,
direct it to /dev/null.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11026

3 years agoFix ubsan: shift exponent is too large
Chuck Tuffli [Thu, 8 Oct 2020 23:37:27 +0000 (16:37 -0700)]
Fix ubsan: shift exponent is too large

When running libzpool with the Undefined Behavior Sanitizer (ubsan)
enabled, a zpool create causes a run-time error:

    module/zfs/vdev_label.c:600:14: runtime error: shift exponent 64 is
    too large for 64-bit type 'long long unsigned int'`

in vdev_config_generate()

Fix is to convert vdev_removal_max_span to its base-2 logarithm, using
highbit64(), and then compare the "shifts".

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Chuck Tuffli <ctuffli@gmail.com>
Closes #9744
Closes #11024

3 years agolibzfs_sendrecv: zfs_send: remove unused pipefd and tid variables
Christian Schwarz [Thu, 8 Oct 2020 16:43:51 +0000 (18:43 +0200)]
libzfs_sendrecv: zfs_send: remove unused pipefd and tid variables

fixup of 196bee4

On gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1), the code removed
caused `-Wmaybe-uninitialized` errors.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #11021

3 years agoMake dbufstat work on FreeBSD
Ryan Moeller [Thu, 8 Oct 2020 16:40:23 +0000 (12:40 -0400)]
Make dbufstat work on FreeBSD

With procfs_list kstats implemented for FreeBSD, dbufs are now exposed
as kstat.zfs.misc.dbufs.

On FreeBSD, dbufstats can use the sysctl instead of procfs when no
input file has been given.

Enable the dbufstats tests on FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #11008

3 years agoFreeBSD: Sort and dedup includes in kmod_core
Ryan Moeller [Thu, 8 Oct 2020 16:37:56 +0000 (12:37 -0400)]
FreeBSD: Sort and dedup includes in kmod_core

Code cleanup. Sort includes, remove duplicates, and drop
some extra blank lines in kmod_core.c.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #11000

3 years agodocs: update README's installation link
George Melikov [Thu, 8 Oct 2020 16:33:53 +0000 (19:33 +0300)]
docs: update README's installation link

OpenZFS is a cross-OS project now.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #11022

3 years agoMake L2ARC tests more robust
George Amanakis [Mon, 5 Oct 2020 22:29:05 +0000 (18:29 -0400)]
Make L2ARC tests more robust

Instead of relying on arbitrary timers after pool export/import or cache
device off/online rely on arcstats. This makes the L2ARC tests more
robust. Also cleanup some functions related to persistent L2ARC.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam Moss <c@yotes.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10983

3 years agozdb should not output binary data on terminal
Toomas Soome [Mon, 5 Oct 2020 21:05:28 +0000 (00:05 +0300)]
zdb should not output binary data on terminal

The zdb is interpreting byte array as textual string in dump_zap,
but there are also binary arrays and we should not output binary
data on terminal.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Toomas Soome <tsoome@me.com>
External-issue: https://www.illumos.org/issues/12012
External-issue: https://www.illumos.org/issues/11713
Closes #11006

3 years agoFreeBSD: Sort out kernel FPU headers for 12.1-REL
Ryan Moeller [Sat, 3 Oct 2020 00:48:45 +0000 (20:48 -0400)]
FreeBSD: Sort out kernel FPU headers for 12.1-REL

We were missing an include for kernel FPU functions, breaking the build
on FreeBSD 12.1-RELEASE.  This was apparently being pulled in from
elsewhere on stable/12 and head.

Sorted the other includes in these files while here.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11005

3 years agoFix EIO after resuming receive of new dataset over an existing one
Alan Somers [Sat, 3 Oct 2020 00:47:09 +0000 (18:47 -0600)]
Fix EIO after resuming receive of new dataset over an existing one

When resuming an interrupted ZFS send stream that creates a new dataset
with the same name as an existing dataset, if the existing dataset is
accessed after the failed receive, then after the subsequent successful
receive it will return EIO. This happens because nothing mounts the new
dataset, leaving the old, no longer valid dataset still mounted.

This commit fixes zfs receive to always unmount and remount the
destination, regardless of whether the stream is a new stream or a
resumed stream.

Sponsored by: Axcient
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alan Somers <asomers@gmail.com>
External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249579
Closes #10995
Closes #10999

3 years agoThrow const on some strings
Ryan Moeller [Sat, 3 Oct 2020 00:44:10 +0000 (20:44 -0400)]
Throw const on some strings

In C, const indicates to the reader that mutation will not occur.
It can also serve as a hint about ownership.

Add const in a few places where it makes sense.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10997

3 years agoMismatched nvlist names in zfs_keys_send_space
John Poduska [Sat, 3 Oct 2020 00:40:46 +0000 (20:40 -0400)]
Mismatched nvlist names in zfs_keys_send_space

This causes "zfs send -vt ..." to fail with:

    cannot resume send: Unknown error 1030

It turns out that some of the name/value pairs in the verification
list for zfs_ioc_send_space(), zfs_keys_send_space, had the wrong
name, so the ioctl got kicked out in zfs_check_input_nvpairs().
Update the names accordingly.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: John Poduska <jpoduska@datto.com>
Closes #10978

3 years agoFix buggy procfs_list_seq_next warning
Brian Behlendorf [Wed, 30 Sep 2020 20:27:51 +0000 (13:27 -0700)]
Fix buggy procfs_list_seq_next warning

The kernel seq_read() helper function expects ->next() to update
the passed position even there are no more entries.  Failure to
do so results in the following warning being logged.

    seq_file: buggy .next function procfs_list_seq_next [spl]
    did not update position index

Functionally there is no issue with the way procfs_list_seq_next()
is implemented and the warning is harmless.  However, we want to
silence this some what scary incorrect warning.  This commit
updates the Linux procfs code to advance the position even for
the last entry.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10984
Closes #10996

3 years agoFreeBSD: Fix legacy compat for platform IOCs
Ryan Moeller [Wed, 30 Sep 2020 20:25:50 +0000 (16:25 -0400)]
FreeBSD: Fix legacy compat for platform IOCs

The request number is out of bounds of the platform table.

Subtract the starting offset to get the correct subscript.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10994

3 years agoEliminate gratuitous bzeroing in dbuf_stats_hash_table_data
Matthew Macy [Wed, 30 Sep 2020 20:24:38 +0000 (13:24 -0700)]
Eliminate gratuitous bzeroing in dbuf_stats_hash_table_data

`dbuf_stats_hash_table_data` can take much longer than it needs to
by repeatedly bzeroing its buffer when in fact the buffer only needs
to be NULL terminated.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10993

3 years agodo a cyclic seek for unused memory objects in pool
Sebastian Gottschall [Wed, 30 Sep 2020 20:22:34 +0000 (22:22 +0200)]
do a cyclic seek for unused memory objects in pool

In non regular use cases allocated memory might stay persistent in memory
pool. This small patch checks every minute if there are old objects which
can be released from memory pool.

Right now with regular use, the pool is checked for old objects on each
allocation attempt from this pool. so basically polling by its use. Now
consider what happens if someone writes a lot of files and stops use of
the volume or even unmounts it. So the code will no longer check if
objects can be released from the pool. Already allocated objects will
still stay in pool cache. this is no big issue for common use. But
someone discovered this issue while doing tests. personally i know this
behavior and I'm aware of it. Its no big issue. just a enhancement

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Closes #10938
Closes #10969

3 years agoDrop references when skipping dmu_send due to EXDEV
Ryan Moeller [Wed, 30 Sep 2020 20:19:49 +0000 (16:19 -0400)]
Drop references when skipping dmu_send due to EXDEV

When an invalid incremental send is requested where the "to" ds is
before the "from" ds, make sure to drop the reference to the pool
and the dataset before returning the error.

Add an assert on FreeBSD to make sure we don't hold any locks after
returning from an ioctl.

Add some test coverage.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10919

3 years agoAdd intel_QAT patches
Kjeld Schouten-Lebbing [Wed, 30 Sep 2020 20:17:30 +0000 (22:17 +0200)]
Add intel_QAT patches

Add community compatibility patches for Intel QAT
Due to incompatibility with higher kernel versions.

Also includes basic instructions.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #10961
Closes #10962

3 years agoUse known license string for zzstd
Brian Behlendorf [Tue, 29 Sep 2020 01:43:27 +0000 (18:43 -0700)]
Use known license string for zzstd

The Linux kernel MODULE_LICENSE macro only recognizes a handful of
license strings and "BSD" is not one of the them.  Update the macro
to use "Dual BSD/GPL" which is recognized and what the kernel expects
BSD licensed module to use.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10982
Closes #10992

3 years agoFix CONFIG_DEBUG_LOCK_ALLOC configure check
Brian Behlendorf [Mon, 28 Sep 2020 23:42:54 +0000 (16:42 -0700)]
Fix CONFIG_DEBUG_LOCK_ALLOC configure check

This check was accidentally broken when the kABI checks were updated
to run in parallel, commit 608f874.  The check must be for the
config_debug_lock_alloc_license name to determine if the symbol
is license compatible.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10991

3 years agoFix objtool configure check
Brian Behlendorf [Mon, 28 Sep 2020 23:40:50 +0000 (16:40 -0700)]
Fix objtool configure check

The m4 objtool configure check can incorrectly fail because of a
missing header in the test.  This appears to be the result of a
recent kernel change and was observed on the Fedora 5.8.11-200
kernel.

  In file included from /home/fedora/zfs/build/objtool/objtool.c:75:
  ./arch/x86/include/asm/frame.h:100:57: error: 'struct pt_regs'
      declared inside parameter list will not be visible outside
      of this definition or declaration [-Werror]

The consequence of this is that the "stack_frame_non_standard"
check is never run and HAVE_STACK_FRAME_NON_STANDARD is set
incorrectly which results in a build failure.  This change adds
the appropriate header to the "objtool" check so it now behaves
as intended.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10990

3 years agoNote that keys must be loaded for 'zpool remove'
grodik [Fri, 18 Sep 2020 00:19:13 +0000 (20:19 -0400)]
Note that keys must be loaded for 'zpool remove'

The error returned by `zpool remove` when the encryption keys aren't
loaded isn't very helpful.  Furthermore, the man pages make no
mention that the keys need to be loaded. This change doesn't resolve
the error message but it does update the man page to mention this
requirement.

Authored-by: grodik <pat@litke.dev>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10939
Closes #10948

3 years agoDocument branching structure
Kjeld Schouten-Lebbing [Mon, 28 Sep 2020 20:23:49 +0000 (22:23 +0200)]
Document branching structure

This change documents the currently used branching structure.
It has been cut down to not include any controversial changes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #10976

3 years agozfetch: Don't issue new streams when old have not completed
Matthew Macy [Mon, 28 Sep 2020 00:08:38 +0000 (17:08 -0700)]
zfetch: Don't issue new streams when old have not completed

The current dmu_zfetch code implicitly assumes that I/Os complete
within min_sec_reap seconds. With async dmu and a readonly workload
(and thus no exponential backoff in operations from the "write
throttle") such as L2ARC rebuild it is possible to saturate the drives
with I/O requests. These are then effectively compounded with prefetch
requests.

This change reference counts streams and prevents them from being
recycled after their min_sec_reap timeout if they still have
outstanding I/Os.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10900

3 years agozfs userspace: use zfs_path_to_zhandle so argument can be a path
Allan Jude [Fri, 25 Sep 2020 21:37:10 +0000 (17:37 -0400)]
zfs userspace: use zfs_path_to_zhandle so argument can be a path

Change zfs userspace subcommand to use zfs_path_to_zhandle() so that
the provided dataset can be a path (/usr) or a dataset (rpool/usr).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #8915

3 years agoAdd DB_RF_NOPREFETCH to dbuf_read()s in dnode.c
Adam D. Moss [Fri, 25 Sep 2020 20:49:22 +0000 (13:49 -0700)]
Add DB_RF_NOPREFETCH to dbuf_read()s in dnode.c

Prefetching of dnodes in dbuf_read() can cause significant mutex
contention for some workloads and isn't very helpful.  This is
because we already get 32 dnodes for each block read, and when
iterating over a directory we prefetch the dnodes in the directory.
Disable this prefetching to prevent the lock contention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Submitted-by: Adam Moss <c@yotes.com>
Submitted-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Adam Moss <c@yotes.com>
Closes #10877
Closes #10953

3 years agoFix PREEMPTION=y and BLK_CGROUP=y config on arm64
Brian Behlendorf [Fri, 25 Sep 2020 20:28:35 +0000 (13:28 -0700)]
Fix PREEMPTION=y and BLK_CGROUP=y config on arm64

With PREEMPTION=y and BLK_CGROUP=y preempt_schedule_notrace() is being
used on arm64 which is a GPL-only function and hence the build of the
DKMS kernel module fails.

Fix that by redefining preempt_schedule_notrace() to preempt_schedule()
which should be safe as long as tracing is not used.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Closes #8545
Closes #9948
Closes #10416
Closes #10973

3 years agoFreeBSD: update cache_purgevfs usage after 1300117 version bump
Mateusz Guzik [Fri, 25 Sep 2020 20:23:43 +0000 (22:23 +0200)]
FreeBSD: update cache_purgevfs usage after 1300117 version bump

Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Nick Wolff <darkfiberiru@gmail.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #10970

3 years agoFreeBSD: Code cleanup in zio_crypt
Ryan Moeller [Thu, 3 Sep 2020 21:15:32 +0000 (21:15 +0000)]
FreeBSD: Code cleanup in zio_crypt

Address some unused value and control flow issues flagged by Coverity.

Unreachable code is pruned and unused values are avoided.
Some scattered sections are reordered for coherence.

We can assume kmem_alloc(n, KM_SLEEP) doesn't fail, so there is no need
to check if it returned NULL.  The allocated memory doesn't need to be
zeroed, other than the last iovec (the MAC).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10884

3 years agoPrune dead branch reported by Coverity
Ryan Moeller [Thu, 3 Sep 2020 19:23:30 +0000 (19:23 +0000)]
Prune dead branch reported by Coverity

wkey is NULL at every `goto error;`.
dcp is never NULL.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10884

3 years agozpool command complains about /etc/exports.d
George Wilson [Fri, 25 Sep 2020 20:09:40 +0000 (15:09 -0500)]
zpool command complains about /etc/exports.d

If the /etc/exports.d directory does not exist, then we should only
create it when we're performing an action which already requires root
privileges.

This commit moves the directory creation to the enable/disable code
path which ensures that we have the appropriate privileges.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #10785
Closes #10934

3 years agozfs_log_write: simplify data copying code for WR_COPIED records
Christian Schwarz [Fri, 25 Sep 2020 20:06:34 +0000 (22:06 +0200)]
zfs_log_write: simplify data copying code for WR_COPIED records

lr_write_t records that are WR_COPIED have the record data directly
appended to them (see lr_write_t type definition).

The data is copied from the debuf using dmu_read_by_dnode.

This function was called, only for WR_COPIED records, as part of a
short-circuiting if-statement's if-expression.

I found this side-effectful call to dmu_read_by_dnode pretty
hard to spot.
This patch improves readability by moving the call to its own line.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #10956

3 years agoFreeBSD: Add support for procfs_list
Matthew Macy [Wed, 23 Sep 2020 23:43:51 +0000 (16:43 -0700)]
FreeBSD: Add support for procfs_list

The procfs_list interface is required by several kstats. Implement
this functionality for FreeBSD to provide access to these kstats.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10890

3 years agoFreeBSD: Don't save user FPU context in kernel threads
Matthew Macy [Wed, 23 Sep 2020 18:09:48 +0000 (11:09 -0700)]
FreeBSD: Don't save user FPU context in kernel threads

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10899

3 years agoUpdate issue templates, commitcheck and Contributing.md
Kjeld Schouten-Lebbing [Wed, 23 Sep 2020 16:53:26 +0000 (18:53 +0200)]
Update issue templates, commitcheck and Contributing.md

- Removes OpenZFS ports from commit check
- Removes OpenZFS ports from CONTRIBUTING.md
- Adds mailings lists and IRC to issue template selector
- Remove blank issue option from issue creator

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #10965

3 years agoDon't set numobjs to UINT64_MAX or near it
Paul Dagnelie [Tue, 22 Sep 2020 23:16:07 +0000 (16:16 -0700)]
Don't set numobjs to UINT64_MAX or near it

Resolves an issue with `zfs send` streams from 0.8.4 which
prevents them from being received by versions < 0.7.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #10911
Closes #10916

3 years agocontrib/initramfs: fix shellcheck and checkbashisms errors with shebang
наб [Tue, 22 Sep 2020 23:10:09 +0000 (01:10 +0200)]
contrib/initramfs: fix shellcheck and checkbashisms errors with shebang

Reviewed-by: Gabriel A. Devenyi <gdevenyi@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #10908
Closes #10917

3 years agoRestore clearing of L2CACHE flag in arc_read_done()
George Amanakis [Tue, 22 Sep 2020 23:08:05 +0000 (19:08 -0400)]
Restore clearing of L2CACHE flag in arc_read_done()

Commit 45152dc removed clearing of L2CACHE flag in arc_read_done() and
moved related code in l2arc_write_eligible(). After careful code
inspection arc_read_done() is not bypassed in the case of prefetches.
Thus restore the old behavior.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: adam moss <c@yotes.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10951

3 years agoFix a logic bug in the FreeBSD getpages VOP
Mark Johnston [Tue, 22 Sep 2020 23:05:52 +0000 (19:05 -0400)]
Fix a logic bug in the FreeBSD getpages VOP

In commit cd32b4f5b79c ("Fix a deadlock in the FreeBSD getpages VOP") I
introduced a bug while porting the patch originally committed to
FreeBSD: the rangelock pointer may be NULL if the try operation failed,
so we must avoid calling zfs_rangelock_unlock() in that case.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reported-by: Steve Wills <swills@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #10519
Closes #10960

3 years agoFreeBSD: Reduce stack usage of Lua
Ryan Moeller [Tue, 22 Sep 2020 23:03:11 +0000 (19:03 -0400)]
FreeBSD: Reduce stack usage of Lua

Use the same reduced buffer size for lauxlib that is used on Linux.

Fixes panic on HEAD in lua gsub test designed to exhaust stack space.

With this we can remove the special case to reserve more stack space
on FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kyle Evans <kevans@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10959

3 years agoAnnontate FreeBSD sysctls with CTLFLAG_MPSAFE
Mark Johnston [Fri, 18 Sep 2020 12:45:54 +0000 (08:45 -0400)]
Annontate FreeBSD sysctls with CTLFLAG_MPSAFE

Without this, the sysctl system calls will acquire a global lock before
invoking the handler.  This is noticeable in some situations when
running top(1).  The global lock is mostly vestigal but continues to see
some use and so contention is still a problem; until the default sense
of the MPSAFE flag changes, we have to annotate each and every handler.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #10836

3 years agoFix switch statement indentation in the FreeBSD kstat code
Mark Johnston [Fri, 18 Sep 2020 12:41:28 +0000 (08:41 -0400)]
Fix switch statement indentation in the FreeBSD kstat code

This is in preparation for some functional changes.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #10950

3 years agoUpdate documentation of l2arc_mfuonly
George Amanakis [Mon, 21 Sep 2020 16:26:24 +0000 (12:26 -0400)]
Update documentation of l2arc_mfuonly

with regard to evicted_l2_eligibile_mru. Even if l2arc_mfuonly is
enabled, this is not reflected in evicted_l2_eligible_mru as this
information is useful for deciding whether to toggle l2arc_mfuonly
depending on the current workload.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10945

3 years agovdev_ashift should only be set once
George Wilson [Fri, 18 Sep 2020 19:13:47 +0000 (14:13 -0500)]
vdev_ashift should only be set once

== Motivation and Context

The new vdev ashift optimization prevents the removal of devices when
a zfs configuration is comprised of disks which have different logical
and physical block sizes. This is caused because we set 'spa_min_ashift'
in vdev_open and then later call 'vdev_ashift_optimize'. This would
result in an inconsistency between spa's ashift calculations and that
of the top-level vdev.

In addition, the optimization logical ignores the overridden ashift
value that would be provided by '-o ashift=<val>'.

== Description

This change reworks the vdev ashift optimization so that it's only
set the first time the device is configured. It still allows the
physical and logical ahsift values to be set every time the device
is opened but those values are only consulted on first open.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Cedric Berger <cedric@precidata.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-Issue: DLPX-71831
Closes #10932

3 years agolibzfs: Don't leak buf if nvlist is too large
Allan Jude [Fri, 18 Sep 2020 17:23:29 +0000 (13:23 -0400)]
libzfs: Don't leak buf if nvlist is too large

Resolves FreeBSD Coverity defect:
CID 1432398:  Resource leaks  (RESOURCE_LEAK)

libzfs: don't leak hdl if there is an error reading env var

Resolves FreeBSD Coverity defect:
CID 1432395:  Resource leaks  (RESOURCE_LEAK)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Closes #10882

3 years agopool may become suspended during device expansion
George Wilson [Fri, 18 Sep 2020 03:03:10 +0000 (22:03 -0500)]
pool may become suspended during device expansion

When expanding a device zfs needs to rescan the partition table to
get the correct size. This can only happen when we're in the kernel
and requires the device to be closed. As part of the rescan, udev is
notified and the device links are removed and recreated. This leave a
window where the vdev code may try to reopen the device before udev
has recreated the link. If that happens, then the pool may end up in
a suspended state.

To correct this, we leverage the BLKPG_RESIZE_PARTITION ioctl which
allows the partition information to be modified even while it's in use.
This ioctl also does not remove the device link associated with the zfs
data partition so it eliminates the race condition that can occur in
the kernel.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #10897

3 years agozdb leak detection fails with in-progress device removal
Matthew Ahrens [Thu, 17 Sep 2020 17:55:30 +0000 (10:55 -0700)]
zdb leak detection fails with in-progress device removal

When a device removal is in progress, there are 2 locations for the data
that's already been moved: the original location, on the device that's
being removed; and the new location, which is pointed to by the indirect
mapping.  When doing leak detection, zdb needs to know about both
locations.  To determine what's already been copied, we load the
spacemaps of the removing vdev, omit the blocks that are yet to be
copied, and then use the vdev's remap op to find the new location.

The problem is with an optimization to the spacemap-loading code in zdb.
When processing the log spacemaps, we ignore entries that are not
relevant because they are past the point that's been copied.  However,
entries which span the point that's been copied (i.e. they are partly
relevant and partly irrelevant) are processed normally.  This can lead
to an illegal spacemap operation, for example if offsets up to 100KB
have been copied, and the spacemap log has the following entries:

ALLOC 50KB-150KB (partly relevant)
FREE 50KB-100KB (entirely relevant)
FREE 100KB-150KB (entirely irrlevant - ignored)
ALLOC 50KB-150KB (partly relevant)

Because the entirely irrelevant entry was ignored, its space remains in
the spacemap.  When the last entry is processed, we attempt to add it to
the spacemap, but it partially overlaps with the 100-150KB entry that
was left over.

This problem was discovered by ztest/zloop.

One solution would be to also ignore the irrelevant parts of
partially-irrelevant entries (i.e. when processing the ALLOC 50-150, to
only add 50-100 to the spacemap).  However, this commit implements a
simpler solution, which is to remove this optimization entirely.  I.e.
to process the entire spacemap log, without regard for the point that's
been copied.  After reconstructing the entire allocatable range tree,
there's already code to remove the parts that have not yet been copied.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-71820
Closes #10920

3 years agoFreeBSD: Do not copy vp into f_data for DTYPE_VNODE files
Ryan Moeller [Thu, 17 Sep 2020 17:54:14 +0000 (13:54 -0400)]
FreeBSD: Do not copy vp into f_data for DTYPE_VNODE files

https://reviews.freebsd.org/D26346

Do not copy vp into f_data for DTYPE_VNODE files.  The vnode pointer is
already stored in f_vnode.  Use that so f_data can be reused.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10929

3 years agoNeed a long hold in zpl_mount_impl
John Poduska [Thu, 17 Sep 2020 17:53:02 +0000 (13:53 -0400)]
Need a long hold in zpl_mount_impl

In zpl_mount_impl, there is:
    dmu_objset_hold ; returns with pool & ds held
    dsl_pool_rele

    sget

    dsl_dataset_rele

As spelled out in the "DSL Pool Configuration Lock" in dsl_pool.c,
this requires a long hold.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: John Poduska <jpoduska@datto.com>
Closes #10936

3 years agolibzfsbootenv: lzbe_nvlist_set needs to store bootenv version VB_NVLIST
Toomas Soome [Thu, 17 Sep 2020 17:51:09 +0000 (20:51 +0300)]
libzfsbootenv: lzbe_nvlist_set needs to store bootenv version VB_NVLIST

A small bug did slip into initial libzfsbootenv; while storing nvlist
in nvlist, we should make sure the bootenv is using VB_NVLIST format.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #10937

3 years agoRename acltype=posixacl to acltype=posix
Ryan Moeller [Wed, 16 Sep 2020 19:26:06 +0000 (15:26 -0400)]
Rename acltype=posixacl to acltype=posix

Prefer acltype=off|posix, retaining the old names as aliases.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10918

3 years agocmd/zgenhostid: replace with simple c implementation
Georgy Yakovlev [Wed, 16 Sep 2020 19:25:12 +0000 (12:25 -0700)]
cmd/zgenhostid: replace with simple c implementation

It was discovered that dracut scripts and zgenhostid
always generate little-endian /etc/hostid.

This commit provides simple endianess-aware binary
and updates the scripts to use it.

New features include:
 -f flag to force overwrite.
 -o flag to write to different file (for dracut)
 accepting both 0x01234567 and 01234567 values as input

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
Closes #10887
Closes #10925

3 years agoFix stack frame size: dnode_dirty_l1range()
Pavel Snajdr [Mon, 7 Sep 2020 15:33:34 +0000 (17:33 +0200)]
Fix stack frame size: dnode_dirty_l1range()

Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10879

3 years agodmu_redact_snap: fix possible memleak
Pavel Snajdr [Mon, 7 Sep 2020 15:27:51 +0000 (17:27 +0200)]
dmu_redact_snap: fix possible memleak

Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10879

3 years agoFix stack frame size: dmu_redact_snap()
Pavel Snajdr [Mon, 7 Sep 2020 15:12:17 +0000 (17:12 +0200)]
Fix stack frame size: dmu_redact_snap()

Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10879

3 years agoFix stack frame size: spa_livelist_delete_cb()
Pavel Snajdr [Thu, 3 Sep 2020 15:38:16 +0000 (17:38 +0200)]
Fix stack frame size: spa_livelist_delete_cb()

Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10879

3 years agozpoolprops.8: fix raidz par[i]ty typo
наб [Tue, 15 Sep 2020 22:43:42 +0000 (00:43 +0200)]
zpoolprops.8: fix raidz par[i]ty typo

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #10923