]> CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log
FreeBSD/FreeBSD.git
14 months agocontrib: dracut: fix race with root=zfs:dset when necessities required
наб [Tue, 28 Mar 2023 20:49:50 +0000 (22:49 +0200)]
contrib: dracut: fix race with root=zfs:dset when necessities required

This had always worked in my testing, but a user on hardware reported
this to happen 100%, and I reproduced it once with cold VM host caches.

dracut-zfs-generator runs as a systemd generator, i.e. at Some
Relatively Early Time; if root= is a fixed dataset, it tries to
"solve [necessities] statically at generation time".

If by that point zfs-import.target hasn't popped (because the import is
taking a non-negligible amount of time for whatever reason), it'll see
no children for the root datase, and as such generate no mounts.

This has never had any right to work. No-one caught this earlier because
it's just that much more convenient to have root=zfs:AUTO, which orders
itself properly.

To fix this, always run zfs-nonroot-necessities.service;
this additionally simplifies the implementation by:
  * making BOOTFS from zfs-env-bootfs.service be the real, canonical,
    root dataset name, not just "whatever the first bootfs is",
    and only set it if we're ZFS-booting
  * zfs-{rollback,snapshot}-bootfs.service can use this instead of
    re-implementing it
  * having zfs-env-bootfs.service also set BOOTFSFLAGS
  * this means the sysroot.mount drop-in can be fixed text
  * zfs-nonroot-necessities.service can also be constant and always
    enabled, because it's conditioned on BOOTFS being set

There is no longer any code generated at run-time
(the sysroot.mount drop-in is an unavoidable gratuitous cp).

The flow of BOOTFS{,FLAGS} from zfs-env-bootfs.service to sysroot.mount
is not noted explicitly in dracut.zfs(7), because (a) at some point it's
just visual noise and (b) it's already ordered via d-p-m.s from z-i.t.

Backport-of: 3399a30ee02d0d31ba2d43d0ce0a2fd90d5c575d
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
14 months agoTag zfs-2.1.10
Tony Hutter [Wed, 29 Mar 2023 17:41:12 +0000 (10:41 -0700)]
Tag zfs-2.1.10

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
14 months agoRemoved Python 2 and Python 3.5- support
Damian Szuberski [Thu, 13 Jan 2022 16:51:12 +0000 (17:51 +0100)]
Removed Python 2 and Python 3.5- support

Deprecation of Python versions below 3.6 gives opportunity to unify the
build and install requirements for OpenZFS packages. The minimal
supported Python version is 3.6 as this is the most recent Python
package CentOS/RHEL 7 users can get.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #12925

14 months agolinux 6.3 compat: needs REQ_PREFLUSH | REQ_OP_WRITE
youzhongyang [Fri, 31 Mar 2023 16:46:22 +0000 (12:46 -0400)]
linux 6.3 compat: needs REQ_PREFLUSH | REQ_OP_WRITE

Modify bio_set_flush() so if kernel version is >= 4.10, flags
REQ_PREFLUSH and REQ_OP_WRITE are set together.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #14695

14 months agoFix "Add colored output to zfs list"
Tino Reichardt [Wed, 5 Apr 2023 16:57:01 +0000 (18:57 +0200)]
Fix "Add colored output to zfs list"

Running `zfs list -o avail rpool` resulted in a core dump.
This commit will fix this.

Run the needed overhead only, when `use_color()` is true.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #14712

15 months agoZTS: Log test name to /dev/kmsg on Linux
Tony Hutter [Wed, 23 Mar 2022 15:15:02 +0000 (08:15 -0700)]
ZTS: Log test name to /dev/kmsg on Linux

Add a -K option to the test suite to log each test name to /dev/kmsg
(on Linux), so if there's a kernel warning we'll be able to match
it up to a particular test.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes  #13227

15 months agoAdd Linux kmemleak support to ZTS
Damian Szuberski [Thu, 24 Feb 2022 18:21:13 +0000 (19:21 +0100)]
Add Linux kmemleak support to ZTS

- Kmemleak `clear` is invoked right before every test case run.
- Kmemleak `scan` is requested right after each test case is finished.
- Kmemleak instrumentation is not used for
  setup/cleanup/pretest/posttest/failsafe stages to shorten the test
  case execution time.
- Kmemleak periodic scan is disabled (`scan=0`) before the test suite
  run to avoid interfering with the on-demand scan results.
- There are unavoidable potential false positives coming from kernel
  areas other than OpenZFS module.
- The ZTS with kmemleak enabled duration is increased by ~50%.

Example run
```
Running Time:   07:12:13
Percent passed: 98.3%

unreferenced object 0xffff9da82aea5410 (size 80):
  comm "kworker/u32:10", pid 942206, jiffies 4296749716 (age 2615.516s)
  hex dump (first 32 bytes):
    00 30 30 00 00 00 00 00 ff 8f 30 00 00 00 00 00  .00.......0.....
    51 e6 77 05 a8 9d ff ff 00 00 00 00 00 00 00 00  Q.w.............
  backtrace:
    [<000000005cf1fea2>] alloc_extent_state+0x1d/0xb0 [btrfs]
    [<0000000083f78ae5>] set_extent_bit+0x2ff/0x670 [btrfs]
    [<00000000de29249e>] lock_extent_bits+0x6b/0xa0 [btrfs]
    [<00000000b241f424>] lock_and_cleanup_extent_if_need+0xaf/0x1c0
       [btrfs]
    [<0000000093ca72b5>] btrfs_buffered_write+0x297/0x7d0 [btrfs]
    [<000000002c2938c8>] btrfs_file_write_iter+0x127/0x390 [btrfs]
    [<00000000b888f720>] do_iter_readv_writev+0x152/0x1b0
    [<00000000320f0bcc>] do_iter_write+0x7c/0x1c0
    [<000000000b5a8fe0>] lo_write_bvec+0x62/0x150 [loop]
    [<000000009aa03c73>] loop_process_work+0x250/0xbd0 [loop]
    [<00000000c7487d8a>] process_one_work+0x1f1/0x390
    [<000000000b236831>] worker_thread+0x53/0x3e0
    [<0000000023cb3e57>] kthread+0x127/0x150
    [<000000002d48676a>] ret_from_fork+0x22/0x30
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13084

15 months agoLinux 6.2 compat: META
Tony Hutter [Wed, 29 Mar 2023 22:39:36 +0000 (15:39 -0700)]
Linux 6.2 compat: META

Update the META file to reflect compatibility with the 6.2 kernel.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #14689

15 months agoFix console progress reporting for recursive send
Ameer Hamza [Thu, 2 Feb 2023 23:09:57 +0000 (04:09 +0500)]
Fix console progress reporting for recursive send

After commit 19d3961, progress reporting (-v) with replication flag
enabled does not report the progress on the console. This commit
fixes the issue by updating the logic to check for pa->progress
instead of pa_verbosity in send_progress_thread().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14448

15 months agozfs_main.c: fix unused variable error with GCC
rob-wing [Thu, 2 Feb 2023 23:16:40 +0000 (14:16 -0900)]
zfs_main.c: fix unused variable error with GCC

zfs_setproctitle_init() is stubbed out on FreeBSD.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Rob Wing <rob.fx907@gmail.com>
Closes #14441

15 months agoUse setproctitle to report progress of zfs send
Ameer Hamza [Tue, 17 Jan 2023 18:17:35 +0000 (23:17 +0500)]
Use setproctitle to report progress of zfs send

This allows parsing of zfs send progress by checking the process
title.
Doing so requires some changes to the send code in libzfs_sendrecv.c;
primarily these changes move some of the accounting around, to allow
for the code to be verbose as normal, or set the process title. Unlike
BSD, setproctitle() isn't standard in Linux; thus, borrowed it from
libbsd with slight modifications.

Authored-by: Sean Eric Fagan <sef@FreeBSD.org>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14376

15 months agoAdditional limits on hole reporting
Brian Behlendorf [Tue, 28 Mar 2023 15:19:03 +0000 (08:19 -0700)]
Additional limits on hole reporting

Holding the zp->z_rangelock as a RL_READER over the range
0-UINT64_MAX is sufficient to prevent the dnode from being
re-dirtied by concurrent writers.  To avoid potentially
looping multiple times for external caller which do not
take the rangelock holes are not reported after the first
sync.  While not optimal this is always functionally correct.

This change adds the missing rangelock calls on FreeBSD to
zvol_cdev_ioctl().

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #14512
Closes #14641

15 months agoAdd colored output to zfs list
Tino Reichardt [Mon, 13 Mar 2023 22:31:44 +0000 (23:31 +0100)]
Add colored output to zfs list

Use a bold header row and colorize the AVAIL column based on
the used space percentage of volume.

We define these colors:
- when > 80%, use yellow
- when > 90%, use red

Reviewed-by: WHR <msl0000023508@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ethan Coe-Renner <coerenner1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #14621
Closes #14350

15 months agoColorize zpool iostat output
Tino Reichardt [Mon, 13 Mar 2023 22:30:09 +0000 (23:30 +0100)]
Colorize zpool iostat output

Use a bold header and colorize the space suffixes in iostat
by order of magnitude like this:
- K is green
- M is yellow
- G is red
- T is lightblue
- P is magenta
- E is cyan
- 0 space is colored gray

Reviewed-by: WHR <msl0000023508@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ethan Coe-Renner <coerenner1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #14621
Closes #14459

15 months agoAdd more ANSI colors to libzfs
Tino Reichardt [Mon, 13 Mar 2023 22:23:04 +0000 (23:23 +0100)]
Add more ANSI colors to libzfs

Reviewed-by: WHR <msl0000023508@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ethan Coe-Renner <coerenner1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #14621

15 months agolinux 6.3 compat: add another bdev_io_acct case
Rich Ercolani [Mon, 27 Mar 2023 18:29:19 +0000 (14:29 -0400)]
linux 6.3 compat: add another bdev_io_acct case

Linux 6.3+, and backports from it (6.2.8+), changed the
signatures on bdev_io_{start,end}_acct.  Add a case for it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #14658
Closes #14668

15 months agoUpdate vdev state for spare vdev
Ameer Hamza [Fri, 24 Mar 2023 17:30:38 +0000 (22:30 +0500)]
Update vdev state for spare vdev

zfsd fetches new pool configuration through ZFS_IOC_POOL_STATS but
it does not get updated nvlist configuration for spare vdev since
the configuration is read by spa_spares->sav_config. In this commit,
updating the vdev state for spare vdev that is consumed by zfsd on
spare disk hotplug.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14653

15 months agozed: add hotplug support for spare vdevs
Ameer Hamza [Mon, 9 Jan 2023 20:43:03 +0000 (01:43 +0500)]
zed: add hotplug support for spare vdevs

This commit supports for spare vdev hotplug. The
spare vdev associated with all the pools will be
marked as "Removed" when the drive is physically
detached and will become "Available" when the
drive is reattached. Currently, the spare vdev
status does not change on the drive removal and
the same is the case with reattachment.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14295

15 months agozed: post a udev change event from spa_vdev_attach()
Ameer Hamza [Fri, 18 Nov 2022 19:39:59 +0000 (00:39 +0500)]
zed: post a udev change event from spa_vdev_attach()

In order for zed to process the removal event correctly,
udev change event needs to be posted to sync the blkid
information. spa_create() and spa_config_update() posts
the event already through spa_write_cachefile(). Doing
the same for spa_vdev_attach() that handles the case
for vdev attachment and replacement.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14172

15 months agozed: mark disks as REMOVED when they are removed
Ameer Hamza [Mon, 26 Sep 2022 23:32:42 +0000 (04:32 +0500)]
zed: mark disks as REMOVED when they are removed

ZED does not take any action for disk removal events if there is no
spare VDEV available. Added zpool_vdev_remove_wanted() in libzfs
and vdev_remove_wanted() in vdev.c to remove the VDEV through ZED
on removal event.  This means that if you are running zed and
remove a disk, it will be propertly marked as REMOVED.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
15 months agoFreeBSD: Remove extra arc_reduce_target_size() call
Alexander Motin [Sat, 18 Mar 2023 00:31:08 +0000 (20:31 -0400)]
FreeBSD: Remove extra arc_reduce_target_size() call

Remove arc_reduce_target_size() call from arc_prune_task().  The idea
of arc_prune_task() is to remove external references on ARC metadata,
such as vnodes. Since arc_prune_async() is called only from ARC itself,
it makes no sense to create a parasitic loop between ARC eviction and
the pruning, treatening to drop ARC to its minimum.  I can't guess why
it was added as part of FreeBSD to OpenZFS integration.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #14639

15 months agoImprove arc_read() error reporting
Alexander Motin [Mon, 13 Feb 2023 21:21:53 +0000 (16:21 -0500)]
Improve arc_read() error reporting

Debugging reported NULL de-reference panic in dnode_hold_impl() I found
that for certain types of errors arc_read() may only return error code,
but not properly report it via done and pio arguments.  Lack of done
calls may result in reference and/or memory leaks in higher level code.
Lack of error reporting via pio may result in unnoticed errors there.
For example, dbuf_read(), where dbuf_read_impl() ignores arc_read()
return, relies completely on the pio mechanism and missed the errors.

This patch makes arc_read() to always call done callback and always
propagate errors to parent zio, if either is provided.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.

15 months agoQAT: Fix uninitialized seed in QAT compression
naivekun [Thu, 16 Mar 2023 18:54:10 +0000 (02:54 +0800)]
QAT: Fix uninitialized seed in QAT compression

CpaDcRqResults have to be initialized with checksum=1 for adler32.
Otherwise when error CPA_DC_OVERFLOW occurred, the next compress
operation will continue on previously part-compressed data, and write
invalid checksum data. When zfs decompress the compressed data, a
invalid checksum will occurred and lead to #14463

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Weigang Li <weigang.li@intel.com>
Reviewed-by: Chengfei Zhu <chengfeix.zhu@intel.com>
Signed-off-by: naivekun <naivekun0817@gmail.com>
Closes #14632
Closes #14463

15 months agoFix for mountpoint=legacy
ofthesun9 [Tue, 14 Mar 2023 23:40:55 +0000 (00:40 +0100)]
Fix for mountpoint=legacy

We need to clear mountpoint only after checking it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: ofthesun9 <olivier@ofthesun.net>
Closes #14599
Closes #14604

15 months agoZFS_IOC_COUNT_FILLED does unnecessary txg_wait_synced()
Matthew Ahrens [Tue, 14 Mar 2023 21:30:29 +0000 (14:30 -0700)]
ZFS_IOC_COUNT_FILLED does unnecessary txg_wait_synced()

`lseek(SEEK_DATA | SEEK_HOLE)` are only accurate when the on-disk blocks
reflect all writes, i.e. when there are no dirty data blocks.  To ensure
this, if the target dnode is dirty, they wait for the open txg to be
synced, so we can call them "stabilizing operations".  If they cause
txg_wait_synced often, it can be detrimental to performance.

Typically, a group of files are all modified, and then SEEK_DATA/HOLE
are performed on them.  In this case, the first SEEK does a
txg_wait_synced(), and subsequent SEEKs don't need to wait, so
performance is good.

However, if a workload involves an interleaved metadata modification,
the subsequent SEEK may do a txg_wait_synced() unnecessarily.  For
example, if we do a `read()` syscall to each file before we do its SEEK.
This applies even with `relatime=on`, when the `read()` is the first
read after the last write.  The txg_wait_synced() is unnecessary because
the SEEK operations only care that the structure of the tree of indirect
and data blocks is up to date on disk.  They don't care about metadata
like the contents of the bonus or spill blocks.  (They also don't care
if an existing data block is modified, but this would be more involved
to filter out.)

This commit changes the behavior of SEEK_DATA/HOLE operations such that
they do not call txg_wait_synced() if there is only a pending change to
the bonus or spill block.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #13368
Issue #14594
Issue #14512
Issue #14009

16 months agoUpdate workflows
Brian Behlendorf [Fri, 3 Mar 2023 00:18:27 +0000 (16:18 -0800)]
Update workflows

Update the GitHub actions workflows using a subset of the changes
from the master branch, commit 620a977f22.  Cherry-picking each
relevant commit would have resulted in a large number of conflicts
so this change only applies a minimal set of useful updates.

- Added build-dependencies.txt and checkstyle-dependencies.txt
- Added reclaim_disk_space.sh script
- Minor changes to build steps
- Reduced ztest run time
- checkbashisms, mandoc, and cppcheck were not included to
  avoid additional backports
- Add exceptions for shellcheck

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
16 months agoWorkaround GitHub Action failure
Brian Behlendorf [Thu, 2 Mar 2023 22:58:21 +0000 (14:58 -0800)]
Workaround GitHub Action failure

Ubuntu 20.04 and 22.04 workflows are failing due to an error
which is hit when running `apt-get update`.  Until the
problematic package is fixed apply the suggested workaround
described here:

  https://github.com/orgs/community/discussions/47863

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #14530

16 months agoUbuntu 22.04 integration: GitHub workflows
Brian Behlendorf [Thu, 2 Mar 2023 22:53:34 +0000 (14:53 -0800)]
Ubuntu 22.04 integration: GitHub workflows

- GitHub workflows are run on Ubuntu 22.04
- Extract the `checkstyle` workflow dependencies to a separate file.
- Refresh the `build-dependencies.txt` list.

Notes: Partial check pick of 9e7fc5da380.  This change does not
include the build-dependencies.txt or checkstyle-dependencies.txt
changes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #14148

16 months agoinitramfs: fix zpool get argument order
q66 [Tue, 7 Mar 2023 01:07:01 +0000 (02:07 +0100)]
initramfs: fix zpool get argument order

When using the zfs initramfs scripts on my system, I get various
errors at initramfs stage, such as:

cannot open '-o': name must begin with a letter

My zfs binaries are compiled with musl libc, which may be why
this happens. In any case, fix the argument order to make the
zpool binary happy, and to match its --help output.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
Closes #14572

16 months agoTurn default_bs and default_ibs into ZFS_MODULE_PARAMs
Mateusz Piotrowski [Wed, 11 Jan 2023 17:38:20 +0000 (18:38 +0100)]
Turn default_bs and default_ibs into ZFS_MODULE_PARAMs

The default_bs and default_ibs tunables control the default block size
and indirect block size.

So far, default_bs and default_ibs were tunable only on FreeBSD, e.g.,

    sysctl vfs.zfs.default_ibs

Remove the FreeBSD-specific sysctl code and expose default_bs and
default_ibs as tunables on both Linux and FreeBSD using
ZFS_MODULE_PARAM.

One of the use cases for changing the values of those tunables is to
lower the indirect block size, which may improve performance of large
directories (as discussed during the OpenZFS Leadership Meeting
on 2022-08-16).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com>
Sponsored-by: Wasabi Technology, Inc.
Closes #14293

16 months agoAdd missing increment to dsl_deadlist_move_bpobj()
Richard Yao [Sat, 4 Mar 2023 23:42:01 +0000 (18:42 -0500)]
Add missing increment to dsl_deadlist_move_bpobj()

dc5c8006f684b1df3f2d4b6b8c121447d2db0017 was recently merged to prefetch
up to 128 deadlists. Unfortunately, a loop was missing an increment,
such that it will prefetch all deadlists. The performance properties of
that patch probably should be re-evaluated.

This was caught by CodeQL's cpp/constant-comparison check in an
experimental branch where I am testing the security-and-extended
queries. It complained about the `i < 128` part of the loop condition
always evaluating to the same thing. The standard CodeQL configuration
we use missed this because it does not include that check.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14573

16 months agoOptimize the is_l2cacheable functions
George Amanakis [Tue, 7 Mar 2023 00:13:05 +0000 (01:13 +0100)]
Optimize the is_l2cacheable functions

by placing the most common use case (no special vdevs) first and avoid
allocating new variables.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #14494
Closes #14563

16 months agoSystem-wide speculative prefetch limit.
Alexander Motin [Wed, 1 Mar 2023 23:27:40 +0000 (18:27 -0500)]
System-wide speculative prefetch limit.

With some pathological access patterns it is possible to make ZFS
accumulate almost unlimited amount of speculative prefetch ZIOs.
Combined with linear ABD allocations in RAIDZ code, it appears to
be possible to exhaust system KVA, triggering kernel panic.

Address this by introducing a system-wide counter of active prefetch
requests and blocking prefetch distance doubling per stream hits if
the number of active requests is higher that ~6% of ARC size.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #14516

16 months agoPrefetch on deadlists merge
Alexander Motin [Wed, 25 Jan 2023 19:30:24 +0000 (14:30 -0500)]
Prefetch on deadlists merge

During snapshot deletion ZFS may issue several reads for each deadlist
to merge them into next snapshot's or pool's bpobj.  Number of the dead
lists increases with number of snapshots.  On HDD pools it may take
significant time during which sync thread is blocked.

This patch introduces prescient prefetch of required blocks for up to
128 deadlists ahead.  Tests show reduction of time required to delete
dataset with 720 snapshots with randomly overwritten file on wide HDD
pool from 75-85 to 22-28 seconds.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Issue #14276
Closes #14402

16 months agoIntroduce minimal ZIL block commit delay
Alexander Motin [Tue, 24 Jan 2023 17:20:32 +0000 (12:20 -0500)]
Introduce minimal ZIL block commit delay

Despite all optimizations, tests on actual hardware show that FreeBSD
kernel can't sleep for less then ~2us.  Similar tests on Linux show
~50us delay at least from nanosleep() (haven't tested inside kernel).
It means that on very fast log device ZIL may not be able to satisfy
zfs_commit_timeout_pct block commit timeout, increasing log latency
more than desired.

Handle that by introduction of zil_min_commit_timeout parameter,
specifying minimal timeout value where additional delays to aggregate
writes may be skipped.  Also skip delays if the LWB is more than 7/8
full, that often happens if I/O sizes are constant and match one of
LWB sizes.  Both things are applied only if there were no already
outstanding log blocks, that may indicate single-threaded workload,
that by definition can not benefit from the commit delays.

While there, add short time moving average to zl_last_lwb_latency to
make it more stable.

Tests of single-threaded 4KB writes to NVDIMM SLOG on FreeBSD show IOPS
increase by 9% instead of expected 5%.  For zfs_commit_timeout_pct of
1 there IOPS increase by 5.5% instead of expected 1%.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #14418

16 months agoPack zrlock_t by 8 bytes
Alexander Motin [Thu, 5 Jan 2023 17:31:55 +0000 (12:31 -0500)]
Pack zrlock_t by 8 bytes

On FreeBSD this reduces this structure size from 64 to 56 bytes.
dnode_handle_t respectively reduces from 72 to 64 bytes. It sounds
like a waste to need 72 bytes to be able to relocate 808 bytes of
dnode_t, which relocation on FreeBSD is not even supported.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by:   iXsystems, Inc.
Closes #14317

16 months agoRemove few pointer dereferences in dbuf_read()
Alexander Motin [Tue, 29 Nov 2022 17:49:02 +0000 (12:49 -0500)]
Remove few pointer dereferences in dbuf_read()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #14199

16 months agoSwitch dnode stats to wmsums
Alexander Motin [Tue, 29 Nov 2022 17:33:45 +0000 (12:33 -0500)]
Switch dnode stats to wmsums

I've noticed that some of those counters are used in hot paths like
dnode_hold_impl(), and results of this change is visible in profiler.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #14198

16 months agoMicro-optimize zrl_remove()
Alexander Motin [Tue, 29 Nov 2022 17:26:03 +0000 (12:26 -0500)]
Micro-optimize zrl_remove()

atomic_dec_32() should be a bit lighter than atomic_dec_32_nv().

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #14200

16 months agoRemove atomics from zh_refcount
Alexander Motin [Mon, 28 Nov 2022 19:36:53 +0000 (14:36 -0500)]
Remove atomics from zh_refcount

It is protected by z_hold_locks, so we do not need more serialization,
simple integer math should be fine.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #14196

16 months agoOptimize microzaps
Alexander Motin [Thu, 20 Oct 2022 18:57:15 +0000 (14:57 -0400)]
Optimize microzaps

Microzap on-disk format does not include a hash tree, expecting one to
be built in RAM during mzap_open().  The built tree is linked to DMU
user buffer, freed when original DMU buffer is dropped from cache. I've
found that workloads accessing many large directories and having active
eviction from DMU cache spend significant amount of time building and
then destroying the trees.  I've also found that for each 64 byte mzap
element additional 64 byte tree element is allocated, that is a waste
of memory and CPU caches.

Improve memory efficiency of the hash tree by switching from AVL-tree
to B-tree.  It allows to save 24 bytes per element just on pointers.
Save 32 bits on mze_hash by storing only upper 32 bits since lower 32
bits are always zero for microzaps.  Save 16 bits on mze_chunkid, since
microzap can never have so many elements.  Respectively with the 16 bits
there can be no more than 16 bits of collision differentiators.  As
result, struct mzap_ent now drops from 48 (rounded to 64) to 8 bytes.

Tune B-trees for small data.  Reduce BTREE_CORE_ELEMS from 128 to 126
to allow struct zfs_btree_core in case of 8 byte elements to pack into
2KB instead of 4KB.  Aside of the microzaps it should also help 32bit
range trees.  Allow custom B-tree leaf size to reduce memmove() time.

Split zap_name_alloc() into zap_name_alloc() and zap_name_init_str().
It allows to not waste time allocating/freeing memory when processing
multiple names in a loop during mzap_open().

Together on a pool with 10K directories of 1800 files each and DMU
cache limited to 128MB this reduces time of `find . -name zzz` by 41%
from 7.63s to 4.47s, and saves additional ~30% of CPU time on the DMU
cache reclamation.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #14039

(cherry picked from commit 9dcdee788985b4aa9bbf250af3e018056402ba9f)

16 months agoautoconf: add support for openEuler
Xinliang Liu [Sat, 3 Dec 2022 01:39:48 +0000 (09:39 +0800)]
autoconf: add support for openEuler

Add config support for openEuler, so that it set the right sysconfig
dir for openEuler.

And DEFAULT_INIT_SCRIPT is no longer needed since commit "2a34db1bd
Base init scripts for SYSV systems".

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Closes #14241

16 months agoSet DEFAULT_INIT_SHELL to /sbin/openrc-run for Gentoo and Alpine
Peter Levine [Fri, 29 Oct 2021 22:34:37 +0000 (18:34 -0400)]
Set DEFAULT_INIT_SHELL to /sbin/openrc-run for Gentoo and Alpine

Gentoo and Alpine always set the rc init scripts' shebang to
#!/sbin/openrc-run, whether or not openrc is installed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Peter Levine <plevine457@gmail.com>
Closes #12683
Closes #12692

16 months agorpm: add support for openEuler
Xinliang Liu [Tue, 29 Nov 2022 17:27:22 +0000 (01:27 +0800)]
rpm: add support for openEuler

OpenEuler uses the same package manager DNF as RHEL/Fedora. And
it is similar to RHEL/Fedora.

OpenEuler Linux is becoming the mainstream Linux distro in China.
So adding support for it makes sense for the users. For more
details about it see: https://www.openeuler.org/en/.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Closes #14222
 Conflicts:
rpm/generic/zfs.spec.in

16 months agoRevert zfeature_active() to static
George Amanakis [Tue, 28 Feb 2023 07:36:25 +0000 (08:36 +0100)]
Revert zfeature_active() to static

Signed-off-by: George Amanakis <gamanakis@gmail.com>
16 months agoMove dmu_buf_rele() after dsl_dataset_sync_done()
George Amanakis [Fri, 24 Feb 2023 01:14:52 +0000 (02:14 +0100)]
Move dmu_buf_rele() after dsl_dataset_sync_done()

Otherwise the dataset may be freed after the last dmu_buf_rele() leading
to a panic.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #14522
Closes #14523

16 months agoPartially revert eee9362a7
George Amanakis [Tue, 21 Feb 2023 17:36:22 +0000 (18:36 +0100)]
Partially revert eee9362a7

With commit 34ce4c42f applied, there is no need for eee9362a7.
Revert that aside from the test. All tests introduced in those commits
pass.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #14502

16 months agoFix a race condition in dsl_dataset_sync() when activating features
George Amanakis [Tue, 14 Feb 2023 00:37:46 +0000 (01:37 +0100)]
Fix a race condition in dsl_dataset_sync() when activating features

The zio returned from arc_write() in dmu_objset_sync() uses
zio_nowait(). However we may reach the end of dsl_dataset_sync()
which checks if we need to activate features in the filesystem
without knowing if that zio has even run through the ZIO pipeline yet.
In that case we will flag features to be activated in
dsl_dataset_block_born() but dsl_dataset_sync() has already
completed its run and those features will not actually be activated.
Mitigate this by moving the feature activation code in
dsl_dataset_sync_done(). Also add new ASSERTs in
dsl_scan_visitbp() checking if a block contradicts any filesystem
flags.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #13816

16 months agoinitramfs: Make mountpoint=none work
Ryan Moeller [Mon, 6 Feb 2023 19:16:01 +0000 (14:16 -0500)]
initramfs: Make mountpoint=none work

In initramfs, mount.zfs fails to mount a dataset with mountpoint=none,
but mount.zfs -o zfsutil works.  Use -o zfsutil when mountpoint=none.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #14455
(cherry picked from commit eb823cbc76d28a7cafdf6a7aafdefe7e74fe26bc)

17 months agoAvoid a null pointer dereference in zfs_mount() on FreeBSD
Allan Jude [Mon, 28 Nov 2022 21:40:49 +0000 (16:40 -0500)]
Avoid a null pointer dereference in zfs_mount() on FreeBSD

When mounting the root filesystem, vfs_t->mnt_vnodecovered is null

This will cause zfsctl_is_node() to dereference a null pointer when
mounting, or updating the mount flags, on the root filesystem, both
of which happen during the boot process.

Reported-by: Martin Matuska <mm@FreeBSD.org>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #14218

17 months agoAllow mounting snapshots in .zfs/snapshot as a regular user
Allan Jude [Thu, 3 Nov 2022 18:53:24 +0000 (14:53 -0400)]
Allow mounting snapshots in .zfs/snapshot as a regular user

Rather than doing a terrible credential swapping hack, we just
check that the thing being mounted is a snapshot, and the mountpoint
is the zfsctl directory, then we allow it.

If the mount attempt is from inside a jail, on an unjailed dataset
(mounted from the host, not by the jail), the ability to mount the
snapshot is controlled by a new per-jail parameter: zfs.mount_snapshot

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Sponsored-by: Modirum MDPay
Sponsored-by: Klara Inc.
Closes #13758

17 months agoTag zfs-2.1.9
Tony Hutter [Tue, 24 Jan 2023 23:41:54 +0000 (15:41 -0800)]
Tag zfs-2.1.9

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
17 months agolinux 6.2 compat: zpl_set_acl arg2 is now struct dentry
Coleman Kane [Tue, 24 Jan 2023 19:20:50 +0000 (14:20 -0500)]
linux 6.2 compat:  zpl_set_acl arg2 is now struct dentry

Linux 6.2 changes the second argument of the set_acl operation to be a
"struct dentry *" rather than a "struct inode *". The inode* parameter
is still available as dentry->d_inode, so adjust the call to the _impl
function call to dereference and pass that pointer to it.

Also document that the get_acl -> get_inode_acl member name change from
commit 884a693 was an API change also introduced in Linux 6.2.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14415

17 months agoRevert "ztest fails assertion in zio_write_gang_member_ready()"
Tony Hutter [Tue, 24 Jan 2023 23:31:09 +0000 (15:31 -0800)]
Revert "ztest fails assertion in zio_write_gang_member_ready()"

This reverts commit 0156253d29a303bdcca3e535958e754d8f086e33.

That commit was identified as causing IO errors on a user's
encrypted dataset:
https://github.com/openzfs/zfs/issues/14413

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
17 months agoTag zfs-2.1.8
Tony Hutter [Thu, 19 Jan 2023 21:00:56 +0000 (13:00 -0800)]
Tag zfs-2.1.8

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
17 months agochange how d_alias is replaced by du.d_alias
Gian-Carlo DeFazio [Thu, 12 Jan 2023 18:14:04 +0000 (10:14 -0800)]
change how d_alias is replaced by du.d_alias

d_alias may need to be converted to du.d_alias
depending on the kernel version.
d_alias is currently in only one place in the code which
changes
"hlist_for_each_entry(dentry, &inode->i_dentry, d_alias)"
to
"hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias)"
as neccesary.

This effectively results in a double macro expansion
for code that uses the zfs headers but already has its
own macro for just d_alias (lustre in this case).

Remove the conditional code for hlist_for_each_entry
and have a macro for "d_alias -> du.d_alias" instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Closes #14377

17 months agoLinux ppc64le ieee128 compat: Do not redefine __asm on external headers
Richard Yao [Fri, 13 Jan 2023 18:58:58 +0000 (13:58 -0500)]
Linux ppc64le ieee128 compat: Do not redefine __asm on external headers

There is an external assembly declaration extension in GNU C that glibc
uses when building with ieee128 floating point support on ppc64le.
Marking that as volatile makes no sense, so the build breaks.

It does not make sense to only mark this as volatile on Linux, since if
do not want the compiler reordering things on Linux, we do not want the
compiler reordering things on any other platform, so we stop treating
Linux specially and just manually inline the CPP macro so that we can
eliminate it. This should fix the build on ppc64le.

Tested-by: @gyakovlev
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14308
Closes #14384

17 months agoinclude systemd overrides to zfs-dracut module
Vince van Oosten [Sun, 23 Oct 2022 08:55:46 +0000 (10:55 +0200)]
include systemd overrides to zfs-dracut module

If a user that uses systemd and dracut wants to overide certain
settings, they typically use `systemctl edit [unit]` or place a file in
`/etc/systemd/system/[unit].d/override.conf` directly.

The zfs-dracut module did not include those overrides however, so this
did not have any effect at boot time.

For zfs-import-scan.service and zfs-import-cache.service, overrides are
now included in the dracut initramfs image.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Vince van Oosten <techhazard@codeforyouand.me>
Closes #14075
Closes #14076

17 months agoActivate filesystem features only in syncing context
George Amanakis [Thu, 12 Jan 2023 02:00:39 +0000 (03:00 +0100)]
Activate filesystem features only in syncing context

When activating filesystem features after receiving a snapshot, do
so only in syncing context.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #14304
Closes #14252

17 months agoIllumos #15286: do_composition() needs sign awareness
Richard Yao [Thu, 5 Jan 2023 19:16:21 +0000 (14:16 -0500)]
Illumos #15286: do_composition() needs sign awareness

Authored by: Dan McDonald <danmcd@mnx.io>
Reviewed by: Patrick Mooney <pmooney@pfmooney.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Joshua M. Clulow <josh@sysmgr.org>
Ported-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Illumos-issue: https://www.illumos.org/issues/15286
Illumos-commit: https://github.com/illumos/illumos-gate/commit/f137b22e734e85642da3e56e8b94da3f5f027c73

Porting Notes:

The patch in illumos did not have much of a commit message, and did not
provide attribution to the reporter, while original patch proposed to
OpenZFS did, so I am listing the reporter (myself) and original patch
author (also myself) below while including the original commit message
with some minor corrections as part of the porting notes:

In do_composition(), we have:

size = u8_number_of_bytes[*p];
if (size <= 1 || (p + size) > oslast)
break;

There, we have type promotion from int8_t to size_t, which is unsigned.
C will sign extend the value as part of the widening before treating the
value as unsigned and the negative values we can counter are error
values from U8_ILLEGAL_CHAR and U8_OUT_OF_RANGE_CHAR, which are -1 and
-2 respectively. The unsigned versions of these under two's complement
are SIZE_MAX and SIZE_MAX-1 respectively.

The bounds check is written under the assumption that `size <= 1` does a
signed comparison. This is followed by a pointer comparison to see if
the string has the correct length, which is fine.

A little further down we have:

for (i = 0; i < size; i++)
tc[i] = *p++;

When an error condition is encountered, this will attempt to iterate at
least SIZE_MAX-1 times, which will massively overflow the buffer, which
is not fine.

The kernel will kill the loop as soon as it hits the kernel stack guard
on Linux systems built with CONFIG_VMAP_STACK=y, which should be just
about all of them. That prevents arbitrary code execution and just about
any other bad thing that a black hat attacker might attempt with
knowledge of this buffer overflow. Other systems' kernels have
mitigations for unbounded in-kernel buffer overflows that will catch
this too.

Also, the patch in illumos-gate made an effort to fix C style issues
that had been fixed in the OpenZFS/ZFSOnLinux repository. Those issues
had been mentioned in the email that I originally sent them about this
issue. One of the fixes had not been already done, so it is included.
Another to collect_a_seq()'s arguments was handled differently in
OpenZFS. For the sake of avoiding unnecessary differences, it has been
adopted. This has the interesting effect that if you correct the paths
in the illumos-gate patch to match the current OpenZFS repository, you
can reverse apply it cleanly.

Original-patch-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reported-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Co-authored-by: Dan McDonald <danmcd@mnx.io>
Closes #14318
Closes #14342

17 months agodracut: fix typo in mount-zfs.sh.in
Brian Behlendorf [Wed, 29 Jun 2022 22:33:38 +0000 (15:33 -0700)]
dracut: fix typo in mount-zfs.sh.in

Format the `zpool get` command correctly.  The -o option must
be followed by "all" or the requested field name.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13602

17 months agoremoval of LegacyVersion broke ax_python_dev.m4
Matthew Ahrens [Thu, 5 Jan 2023 19:04:24 +0000 (11:04 -0800)]
removal of LegacyVersion broke ax_python_dev.m4

The 22.0 release of the python `packaging` package removed the
`LegacyVersion` trait, causing ZFS to no longer compile.

This commit replaces the sections of `ax_python_dev.m4` that rely on
`LegacyVersion` with updated implementations from the upstream
`autoconf-archive`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #14297

17 months agoFreeBSD: catch up to 1400077
Mateusz Guzik [Thu, 5 Jan 2023 18:56:40 +0000 (19:56 +0100)]
FreeBSD: catch up to 1400077

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #14328

17 months agoFix shebang for helper script of deb-utils
Martin Rüegg [Thu, 29 Dec 2022 09:59:12 +0000 (10:59 +0100)]
Fix shebang for helper script of deb-utils

Shebang was missing the `!` between `#` and the actual path.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Martin Rüegg <martin.rueegg@metaworx.ch>
Closes #14339

17 months agoAdd quotation marks around `$PATH` for deb-utils
Martin Rüegg [Thu, 29 Dec 2022 09:56:39 +0000 (10:56 +0100)]
Add quotation marks around `$PATH` for deb-utils

Fix #14338, failing to build deb-utils if existing `$PATH` variable
would include a whitespace.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Martin Rüegg <martin.rueegg@metaworx.ch>
Closes #14339

17 months agoDocumentation corrections
Brian Behlendorf [Thu, 22 Dec 2022 19:34:28 +0000 (11:34 -0800)]
Documentation corrections

- Update the link to the OpenZFS Code of Conduct.
- Remove extra "the" from contrib/initramfs/scripts/zfs

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #14298
Closes #14307

17 months agosystemd: set restart=always for zfs-zed.service
George Melikov [Tue, 20 Dec 2022 00:08:15 +0000 (03:08 +0300)]
systemd: set restart=always for zfs-zed.service

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Co-authored-by: Attila Fülöp <attila@fueloep.org>
Closes #14294

17 months agoAdd color output to zfs diff.
Ethan Coe-Renner [Mon, 12 Dec 2022 23:30:51 +0000 (15:30 -0800)]
Add color output to zfs diff.

This adds support to color zfs diff (in the style of git diff)
conditional on the ZFS_COLOR environment variable.

Signed-off-by: Ethan Coe-Renner <coerenner1@llnl.gov>
17 months agolibzfs: diff: simplify superfluous stdio
наб [Thu, 9 Dec 2021 22:50:41 +0000 (23:50 +0100)]
libzfs: diff: simplify superfluous stdio

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829

17 months agolibzfs: diff: print_what() can return the symbol => get_what()
наб [Thu, 9 Dec 2021 22:44:51 +0000 (23:44 +0100)]
libzfs: diff: print_what() can return the symbol => get_what()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829

17 months agoFreeBSD: Remove stray debug printf
Doug Rabson [Wed, 14 Dec 2022 01:35:07 +0000 (01:35 +0000)]
FreeBSD: Remove stray debug printf

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Doug Rabson <dfr@rabson.org>
Closes #14286
Closes #14287

17 months agoZero end of embedded block buffer in dump_write_embedded()
Richard Yao [Wed, 14 Dec 2022 01:31:47 +0000 (20:31 -0500)]
Zero end of embedded block buffer in dump_write_embedded()

This fixes a kernel stack leak.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Nicholas Sherlock <n.sherlock@gmail.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13778
Closes #14255

17 months agoChange ZEVENT_POOL_GUID to ZEVENT_POOL to display pool names
Marcel Menzel [Wed, 14 Dec 2022 01:26:10 +0000 (02:26 +0100)]
Change ZEVENT_POOL_GUID to ZEVENT_POOL to display pool names

Outgoing mails for ZFS pool events include the pool GUID,
but not the actual pool name. Let's change this for better
readability, as it is already done in the mails for finished
pool resilvers.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Marcel Menzel <mail@mcl.gg>
Closes #14272

17 months agoRestrict visibility of per-dataset kstats inside FreeBSD jails
Allan Jude [Fri, 9 Dec 2022 19:04:29 +0000 (14:04 -0500)]
Restrict visibility of per-dataset kstats inside FreeBSD jails

When inside a jail, visibility on datasets not "jailed" to the
jail is restricted. However, it was possible to enumerate all
datasets in the pool by looking at the kstats sysctl MIB.

Only the kstats corresponding to datasets that the user has
visibility on are accessible now.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #14254

17 months agoFix dereference after null check in enqueue_range
Richard Yao [Sun, 4 Dec 2022 21:31:28 +0000 (16:31 -0500)]
Fix dereference after null check in enqueue_range

If the bp is NULL, we have a hole. However, when we build with
assertions, we will dereference bp when `blkid == DMU_SPILL_BLKID`. When
this happens on a hole, we will have a NULL pointer dereference.

Reported-by: Coverity (CID-1524670)
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14264

17 months agoFix potential buffer overflow in zpool command
Richard Yao [Sun, 4 Dec 2022 02:43:33 +0000 (21:43 -0500)]
Fix potential buffer overflow in zpool command

The ZPOOL_SCRIPTS_PATH environment variable can be passed here. This
allows for arbitrarily long strings to be passed to sprintf(), which can
overflow the buffer.

I missed this in my earlier audit of the codebase. CodeQL's
cpp/unbounded-write check caught this.

Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14264

17 months agoFreeBSD: zfs_register_callbacks() must implement error check correctly
Richard Yao [Mon, 5 Dec 2022 18:16:50 +0000 (13:16 -0500)]
FreeBSD: zfs_register_callbacks() must implement error check correctly

I read the following article and noticed a couple of ZFS bugs mentioned:

https://pvs-studio.com/en/blog/posts/cpp/0377/

I decided to search for them in the modern OpenZFS codebase and then
found one that matched the description of the first one:

V593 Consider reviewing the expression of the 'A = B != C' kind. The
expression is calculated as following: 'A = (B != C)'. zfs_vfsops.c 498

The consequence of this is that the error value is replaced with `1`
when there is an error. When there is no error, 0 is correctly passed.
This is a very minor issue that is unlikely to cause any real problems.

The incorrect error code would either be returned to the mount command
on a failure or any of `zfs receive`, `zfs recv`, `zfs rollback` or `zfs
upgrade`.

The second one has already been fixed.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14261

17 months agofgrep -> grep -F
наб [Fri, 11 Mar 2022 23:26:46 +0000 (00:26 +0100)]
fgrep -> grep -F

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13259

17 months agoegrep -> grep -E
наб [Fri, 11 Mar 2022 23:25:47 +0000 (00:25 +0100)]
egrep -> grep -E

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13259

17 months agoUpdate META to 6.1 kernel
Tony Hutter [Tue, 10 Jan 2023 23:53:33 +0000 (15:53 -0800)]
Update META to 6.1 kernel

ZFS successfully builds against the 6.1.4 kernel.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #14371

17 months agoztest fails assertion in zio_write_gang_member_ready()
Matthew Ahrens [Tue, 10 Jan 2023 00:43:45 +0000 (16:43 -0800)]
ztest fails assertion in zio_write_gang_member_ready()

Encrypted blocks can have up to 2 DVA's, as the third DVA is reserved
for the salt+IV.  However, dmu_write_policy() allows non-encrypted
blocks (e.g. DMU_OT_OBJSET) inside encrypted datasets to request and
allocate 3 DVA's, since they don't need a salt+IV (they are merely
authenicated).

However, if such a block becomes a gang block, the gang code incorrectly
limits the gang block header to 2 DVA's.  This leads to a "NDVAs
inversion", where a parent block (the gang block header) has less DVA's
than its children (the gang members), causing an assertion failure in
zio_write_gang_member_ready().

This commit addresses the problem by only restricting the gang block
header to 2 DVA's if the block is actually encrypted (and thus its gang
block members can have at most 2 DVA's).

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #14250
Closes #14356

17 months agoIntroduce ZFS_LINUX_REQUIRE_API autoconf macro
Antonio Russo [Thu, 5 Jan 2023 04:41:16 +0000 (21:41 -0700)]
Introduce ZFS_LINUX_REQUIRE_API autoconf macro

Currently, if API tests fail, we either ignore the failures, or
unconditionally halt the kernel build.  This leads to situations where
incompatibilities with existing APIs may develop, but not trip the
configure compatibility checks.

This introduces a new mechanism to require APIs for kernels above a
particular version.  While not perfect, this at least guarantees
mainline kernels do not break existing APIs without at least providing
some warning.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14343

17 months agolinux 6.2 compat: bio->bi_rw was renamed bio->bi_opf
Coleman Kane [Tue, 27 Dec 2022 03:59:13 +0000 (22:59 -0500)]
linux 6.2 compat: bio->bi_rw was renamed bio->bi_opf

The bi_rw member of struct bio was renamed to bi_opf in Linux 6.2.
As well, Linux's implementation of bio_set_op_attrs(...) has been
removed.

The HAVE_BIO_BI_OPF macro already appears to be defined, but the
removal of the bio_set_op_attrs(...) implementation makes the build
fall back on the locally-defined implementation, which isn't updated
for the bio->bi_opf change. This commit adds that update.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14324
Closes #14331

17 months agolinux 6.2 compat: get_acl() got moved to get_inode_acl() in 6.2
Coleman Kane [Tue, 27 Dec 2022 03:04:34 +0000 (22:04 -0500)]
linux 6.2 compat: get_acl() got moved to get_inode_acl() in 6.2

Linux 6.2 renamed the get_acl() operation to get_inode_acl() in
the inode_operations struct. This should fix Issue #14323.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14323
Closes #14331

17 months agoLinux 6.1 compat: open inside tmpfile()
Antonio Russo [Sat, 31 Dec 2022 14:51:32 +0000 (07:51 -0700)]
Linux 6.1 compat: open inside tmpfile()

commit d27c81847b43584483b5509ff352e7e727b0ce87 upstream

Linux 863f144 modified the .tmpfile interface to pass a struct file,
rather than a struct dentry, and expect the tmpfile implementation to
open inside of tmpfile().

This patch implements a configuration test that checks for this new API
and appropriately sets a HAVE_TMPFILE_DENTRY flag that tracks this old
API.  Contingent on this flag, the appropriate API is implemented.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14301
Closes #14343

17 months agoZTS: close in mmapwrite.c
Antonio Russo [Fri, 6 Jan 2023 18:52:08 +0000 (11:52 -0700)]
ZTS: close in mmapwrite.c

commit a7304ab9c1ea62c556aa7d007821322afc75ff79 upstream

mmapwrite is used during the ZTS to identify issues with mmap-ed files.
This helper program exercises this pathway by continuously writing to a
file.  ee6bf97c7 modified the writing threads to terminate after a set
amount of total data is written.  This change allows standard program
execution to reach the end of a writer thread without closing the file
descriptor, introducing a resource "leak."

This patch appeases resource leak analyses by close()-ing the file at
the end of the thread.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14353

17 months agoZTS: limit mmapwrite file size
Antonio Russo [Thu, 5 Jan 2023 20:50:30 +0000 (13:50 -0700)]
ZTS: limit mmapwrite file size

commit ee6bf97c7727c118f7bbb8be4bb0eedcaca2ad35 upstream

mmapwrite spawns several threads, all of which perform writes on a file
for the purpose of testing the behavior of mmap(2)-ed files.  One
thread performs an mmap and a write to the beginning of that region,
while the others perform regular writes after lseek(2)-ing the end of
the file.

Because these regular writes are set in a while (1) loop, they will
write an unbounded amount of data to disk.  The mmap_write_001_pos test
script SIGKILLs them after 30 seconds, but on fast testbeds, this may
be enough time to exhaust the available space in the filesystem,
leading to spurious test failures.

Instead, limit the total file size by checking that the lseek return
value is no greater than 250 * 1024*1024 bytes, which is less than the
default minimum vdev size defined in includes/default.cfg .

This also includes part of 2a493a4c7127258b14c39e8c71a9d6f01167c5cd,
which checks the return value of lseek.

Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14277
Closes #14345

18 months agoskip permission checks for extended attributes
Ameer Hamza [Tue, 22 Nov 2022 20:28:13 +0000 (01:28 +0500)]
skip permission checks for extended attributes

zfs_zaccess_trivial() calls the generic_permission() to read
xattr attributes. This causes deadlock if called from
zpl_xattr_set_dir() context as xattr and the dent locks are
already held in this scenario. This commit skips the permissions
checks for extended attributes since the Linux VFS stack already
checks it before passing us the control.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
18 months agoAllow receiver to override encryption properties in case of replication
Ameer Hamza [Thu, 1 Dec 2022 20:29:49 +0000 (01:29 +0500)]
Allow receiver to override encryption properties in case of replication

Currently, the receiver fails to override the encryption
property for the plain replicated dataset with the error:
"cannot receive incremental stream: encryption property
'encryption' cannot be set for incremental streams.". The
problem is resolved by allowing the receiver to override
the encryption property for plain replicated send.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
18 months agozed: unclean disk attachment faults the vdev
Ameer Hamza [Tue, 15 Nov 2022 00:59:03 +0000 (05:59 +0500)]
zed: unclean disk attachment faults the vdev

If the attached disk already contains a vdev GUID, it
means the disk is not clean. In such a scenario, the
physical path would be a match that makes the disk
faulted when trying to online it. So, we would only
want to proceed if either GUID matches with the last
attached disk or the disk is in a clean state.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
18 months agoFreeBSD: Fix potential boot panic with bad label
Ryan Moeller [Thu, 22 Dec 2022 19:50:09 +0000 (14:50 -0500)]
FreeBSD: Fix potential boot panic with bad label

vdev_geom_read_pool_label() can leave NULL in configs.  Check for it
and skip consistently when generating rootconf.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #14291
(cherry picked from commit dc8c2f615852cb79d3c4cae6c1fb738c7f4a793c)

18 months agoAdd workaround for broken Linux pipes
Rich Ercolani [Mon, 9 May 2022 23:33:55 +0000 (19:33 -0400)]
Add workaround for broken Linux pipes

Linux has an unresolved hang if you resize a pipe with bytes
in it.

Since there's no obvious way to detect this happening, added a
workaround to disable resizing the pipe buffer if you set an
environment variable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13309

18 months agoinitramfs: Fix legacy mountpoint rootfs
Ryan Moeller [Mon, 12 Dec 2022 18:23:06 +0000 (13:23 -0500)]
initramfs: Fix legacy mountpoint rootfs

Legacy mountpoint datasets should not pass `-o zfsutil` to `mount.zfs`.
Fix the logic in `mount_fs()` to not forget we have a legacy mountpoint
when checking for an `org.zol:mountpoint` userprop.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #14274
(cherry picked from commit 786ff6a6cb33226b4f4292c7569b9093286f74d9)

18 months agovdev_raidz_math_aarch64_neonx2.c: suppress diagnostic only for GCC
szubersk [Wed, 7 Dec 2022 08:30:11 +0000 (18:30 +1000)]
vdev_raidz_math_aarch64_neonx2.c: suppress diagnostic only for GCC

Signed-off-by: szubersk <szuberskidamian@gmail.com>
18 months agotests: mkfile: usage: () -> (void)
szubersk [Wed, 7 Dec 2022 06:16:11 +0000 (16:16 +1000)]
tests: mkfile: usage: () -> (void)

Signed-off-by: szubersk <szuberskidamian@gmail.com>
18 months agoUse Ubuntu 20.04 and remove Ubuntu 18.04 from workflows
szubersk [Wed, 7 Dec 2022 06:13:10 +0000 (16:13 +1000)]
Use Ubuntu 20.04 and remove Ubuntu 18.04 from workflows

- `ubuntu-latest` now resolves to `ubuntu-22.04`. Explicit pinning
  is needed.

- cherry-pick #14238

Signed-off-by: szubersk <szuberskidamian@gmail.com>
18 months agodracut: skip zfsexpandknoweldge when zfs_devs is present in dracut
Savyasachee Jha [Mon, 28 Feb 2022 22:35:25 +0000 (04:05 +0530)]
dracut: skip zfsexpandknoweldge when zfs_devs is present in dracut

PR 1711 (https://github.com/dracutdevs/dracut/pull/1711) adds a zfs_devs
function to dracut to detect the physical devices backing zfs pools. If
this function exists in the version of dracut this module is being
called from, then it does not need to run.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13121

19 months agoTag zfs-2.1.7
Tony Hutter [Wed, 30 Nov 2022 21:09:04 +0000 (13:09 -0800)]
Tag zfs-2.1.7

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
19 months agozfs-2.1.7: Use ubuntu-20.04 for zloop and sanity builders
Tony Hutter [Thu, 1 Dec 2022 18:21:12 +0000 (10:21 -0800)]
zfs-2.1.7: Use ubuntu-20.04 for zloop and sanity builders

The zfs-2.1.7 branch is still using the older 'python-dev'
package names rather than the newer 'python3-dev' packages that
are required for 'ubuntu-latest'.  Use 'ubuntu-20.04' instead of
'ubuntu-latest' to get around this.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
19 months agoFix setting the large_block feature after receiving a snapshot
George Amanakis [Fri, 18 Nov 2022 19:38:37 +0000 (20:38 +0100)]
Fix setting the large_block feature after receiving a snapshot

We are not allowed to dirty a filesystem when done receiving
a snapshot. In this case the flag SPA_FEATURE_LARGE_BLOCKS will
not be set on that filesystem since the filesystem is not on
dp_dirty_datasets, and a subsequent encrypted raw send will fail.
Fix this by checking in dsl_dataset_snapshot_sync_impl() if the feature
needs to be activated and do so if appropriate.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #13699
Closes #13782