CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

tests/netgraph: Inital framework for testing libnetgraph

Provide a framework of functions to test various netgraph modules.
Tests contain:
- creating, renaming, and destroying nodes
- connecting and removing hooks
- sending and receiving data
- sending ASCII messages and receiving binary responses
- errors can be passed for indiviual inspection or fail the test

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D30629
Differential Revision: https://reviews.freebsd.org/D30657
Differential Revision: https://reviews.freebsd.org/D30671
Differential Revision: https://reviews.freebsd.org/D30699

(cherry picked from commit 24ea1dbf257aa6757f469bcd859f90e9ad851e59)
(cherry picked from commit 09307dbfb888a98232096c751a96ecb3344aa77c)
(cherry picked from commit 9021c46603bf29b9700f24b8dce8796b434d7c8f)
(cherry picked from commit 5554abd9cc9702af30af90925b33c5efff4e7d88)

Also contains some fixups:
- indent all files correctly
- finish factoring out
- remove debugging code
- check for renaming issues reported in PR241954

PR: 241954
Differential Revision: https://reviews.freebsd.org/D30692
Differential Revision: https://reviews.freebsd.org/D30714
Differential Revision: https://reviews.freebsd.org/D30713

(cherry picked from commit a664ade93972ce617f0888ff79e715dff9cf0f87)
(cherry picked from commit 0afa9be03937d60cb5aeba64c81e3e2165bd3737)
(cherry picked from commit 43e4821315c31db067e23564b9bfafb519e77b2b)

tcp: Missing mfree in rack and bbr

Recently (Nov) we added logic that protects against a peer negotiating a timestamp, and
then not including a timestamp. This involved in the input path doing a goto done_with_input
label. Now I suspect the code was cribbed from one in Rack that has to do with the SYN.
This had a bug, i.e. it should have a m_freem(m) before going to the label (bbr had this
missing m_freem() but rack did not). This then caused the missing m_freem to show
up in both BBR and Rack. Also looking at the code referencing m->m_pkthdr.lro_nsegs
later (after processing) is not a good idea, even though its only for logging. Best to
copy that off before any frees can take place.

Reviewed by: mtuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30727

(cherry picked from commit ba1b3e48f5be320f0590bc357ea53fdc3e4edc65)

tcp: Mbuf leak while holding a socket buffer lock.

When running at NF the current Rack and BBR changes with the recent
commits from Richard that cause the socket buffer lock to be held over
the ip_output() call and then finally culminating in a call to tcp_handle_wakeup()
we get a lot of leaked mbufs. I don't think that this leak is actually caused
by holding the lock or what Richard has done, but is exposing some other
bug that has probably been lying dormant for a long time. I will continue to
look (using his changes) at what is going on to try to root cause out the issue.

In the meantime I can't leave the leaks out for everyone else. So this commit
will revert all of Richards changes and move both Rack and BBR back to just
doing the old sorwakeup_locked() calls after messing with the so_rcv buffer.

We may want to look at adding back in Richards changes after I have pinpointed
the root cause of the mbuf leak and fixed it.

Reviewed by: mtuexen,rscheff
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30704

(cherry picked from commit 67e892819b26c198e4232c7586ead7f854f848c5)

tcp: LRO timestamps have lost their previous precision

Recently we had a rewrite to tcp_lro.c that was tested but one subtle change
was the move to a less precise timestamp. This causes all kinds of chaos
in tcp's that do pacing and needs to be fixed to use the more precise
time that was there before.

Reviewed by: mtuexen, gallatin, hselasky
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30695

(cherry picked from commit b45daaea95abd8bda52caaacf120f9197caab3e7)

arm64: Fix pmap_copy()'s handling of 2MB mappings

When copying mappings from parent to child, we clear the accessed and
dirty bits.  This is done for both 4KB and 2MB PTEs.  However,
pmap_demote_l2() asserts that writable superpages must be dirty.  This
is to avoid races with the MMU setting the dirty bit during promotion
and demotion.  pmap_copy() can create clean, writable superpage
mappings, so it violates this assertion.

Modify pmap_copy() to preserve the accessed and dirty bits when copying
2MB mappings, like we do on amd64.

Fixes: ca2cae0b4dd
Reported by: Jenkins via mhorne
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30643

(cherry picked from commit 4e4035ef1fb5e2f9da6b658ffae8a54862b4d018)

Fix handling of D_GIANTOK

It was meant to suppress only the printf(), not the subsequent injection
of Giant-protected thunks for various file operations.

Fixes: fbeb4ccac9
Reported by: pho
Tested by: pho
Pointy hat: markj

(cherry picked from commit 887c753c9f451322cae3efbf9b63f53f3d9011c8)

riscv: Rename pmap_fault_fixup() to pmap_fault()

This is consistent with other platforms, specifically arm and arm64. No
functional change intended.

Reviewed by: jrtc27
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 317113bb125166f6ba3035a29408339af38cca54)

arm: Remove last_fault_code

It is unused since the removal of pmap-v4.c in commit b88b275145.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 62ba0def5584bc1fc84fc7df6d7a1256e4a34fb8)

riscv: Handle hardware-managed dirty bit updates in pmap_promote_l2()

pmap_promote_l2() failed to handle implementations which set the
accessed and dirty flags. In particular, when comparing the attributes
of a run of 512 PTEs, we must handle the possibility that the hardware
will set PTE_D on a clean, writable mapping.

Following the example of amd64 and arm64, change riscv's
pmap_promote_l2() to downgrade clean, writable mappings to read-only, so
that updates are synchronized by the pmap lock.

Fixes: f6893f09d
Reported by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Tested by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: jrtc27, alc, Nathaniel Filardo
Sponsored by: The FreeBSD Foundation

(cherry picked from commit c05748e028b84c216d0161e70418f8cb09e074e4)

Suppress D_NEEDGIANT warnings for some drivers

During boot we warn that the kbd and openfirm drivers are Giant-locked
and may be deleted. Generally, the warning helps signal that certain
old drivers are not being maintained and are subject to removal, but
this doesn't really apply to certain drivers which are harder to
detangle from Giant.

Add a flag, D_GIANTOK, that devices can specify to suppress the
misleading warning. Use it in the kbd and openfirm drivers.

Reviewed by: imp, jhb
Sponsored by: The FreeBSD Foundation

(cherry picked from commit fbeb4ccac990fdb3bc26ab925a3ca8e7d2f89721)

iwn: adjust EEPROM read timeout for Intel 4965AGN M2

Reading EEPROM from Intel 4965AGN M2 takes 60 us which was causing panic
on system startup.

PR: 255465
Reviewed by: markj

(cherry picked from commit 03d4b58feee396d392668f192ecdde08ecc8036c)

ngatm: Handle errors from uni_msg_extend()

uni_msg_extend() may fail due to a memory allocation failure. In this
case, though, the message is freed, so callers shouldn't touch it.

PR: 255861
Reviewed by: harti
Sponsored by: The FreeBSD Foundation

(cherry picked from commit e755e2776ddff729ae4102f3273473aa33b00077)

arm64: Use the right PTE when downgrading perms in pmap_promote_l2()

When promoting a run of small mappings to a superpage, we have to
downgrade clean, writable mappings to read-only, to handle the
possibility that the MMU will concurrently mark one of the mappings as
dirty.

The code which performed this operation for the first PTE in the run
used the wrong PTE pointer. As a result, the comparison would always
fail, aborting the promotion. This only occurs when promoting writable,
clean mappings.

Fixes: ca2cae0b4dd
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit a48f51b3d396664f9b0a91f016159f4e4324da85)

linuxkpi: Add list_for_each_entry_lockless() macro

This is needed by the drm-kmod 5.7 update.

Approved by: hselasky (src)
Differential Revision: https://reviews.freebsd.org/D30708

(cherry picked from commit b47f461c8e67253fdb394968428b760e880baa08)

pf: Convenience function for optional (numeric) arguments

Add _opt() variants for the uint* functions. These functions set the
provided default value if the nvlist doesn't contain the relevant value.
This is helpful for optional values (e.g. when the API is extended to
add new fields).

While here simplify the header by also using macros to create the
prototypes for the macro-generated function implementations.

Reviewed by: scottl
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30510

(cherry picked from commit 7c4342890bf17b72f0d79ada1326d9cbf34e736c)

LinuxKPI: add pr_err_once

Reviewed by: hselasky, emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30672

(cherry picked from commit 05c2d94a081d5948560a01c26c7f432960cde606)

tcp: fix two bugs in new reno

* Completely initialise the CC module specific data
* Use beta_ecn in case of an ECN event whenever ABE is enabled
or it is requested by the stack.

Reviewed by: rscheff, rrs
Sponsored by: Netflix, Inc.

(cherry picked from commit fa3746be4203fc9a3414afb21d964eec8bad74f8)

tcp: remove debug output from RACK

Reported by: iron.udjin@gmail.com, Marek Zarychta
Reviewed by: rrs
PR: 256538
Differential Revision: https://reviews.freebsd.org/D30723
Sponsored by: Netflix, Inc.

(cherry picked from commit f1536bb53898b12e2d19938f8fe2d04b5e5d12a6)

tcp: fix compilation of IPv4-only builds

PR: 256538
Reported by: iron.udjin@gmail.com
Sponsored by: Netflix, Inc.

(cherry picked from commit 224cf7b35b9bbe8d075f6004249d850c620b7855)

iwmbtfw(8): Improve Intel 7260/7265 adaptors handling

- Allow firmware downloading for hw_variant #8;
- Enter manufacturer mode for setting of event mask;
- Handle multi-event response on HCI commands for 7260;
This allows to remove kludge with skipping of 0xfc2f opcode.
- Disable patch and exit manufacturer mode on downloading failure;
- Use default firmware if correct firmware file is not found;

Reviewed by: Philippe Michaud-Boudreault <pitwuu_AT_gmail_DOT_com>
Tested by: arrowd
Differential revision: https://reviews.freebsd.org/D30543

(cherry picked from commit da93a73f834612b659b37b513c8296e1178d249b)

ums(4): Do not stop USB xfers on FIFO close when evdev is still active

This fixes lose of evdev events after moused has been killed.

While here use bitwise operations for UMS_EVDEV_OPENED flag.

Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D30342

(cherry picked from commit 05ab03a31798d4cc96c22a8f30b1d9a0d7a3dd35)

ums(4): Start USB xfers on opening of evdev node unconditionally.

This fixes inability to start USB xfers in a case when FIFO has been
already open()-ed but no read() or poll() calls has been issued yet.

MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D30343

(cherry picked from commit 47791339f0cfe3282a6f64b5607047f7bca968ad)

usbhid(4): Add second set of USB transfers to work in polled mode.

The second set of USB transfer is requested by hkbd(4) and
should improve HID keyboard handling in kdb and panic contexts.

Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D30486

(cherry picked from commit 9aa0e5af75d033aa2dff763dd2886daaa7213612)

usbhid(4): Fix NULL pointer dereference in usbd_xfer_max_len()

Which happens when USB transfer setup is failed.

PR: 254974
Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D30485

(cherry picked from commit e889a462d878675551b227a382764c3879e6c2b3)

Create VM_MEMATTR_DEVICE on all architectures

This is intended to be used with memory mapped IO, e.g. from
bus_space_map with no flags, or pmap_mapdev.

Use this new memory type in the map request configured by
resource_init_map_request, and in pciconf.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D29692

(cherry picked from commit 5d2d599d3f3494d813e51e1bcd1c9693eb9c098b)

pciconf: Use VM_MEMATTR_DEVICE on supported architectures

Some architectures - armv7, armv8 and riscv use VM_MEMATTR_DEVICE
when mapping device registers in kernel. Do the same in pciconf.
On armada8k SoC all reads from BARs mapped with hitherto attribute
(VM_MEMATTR_UNCACHEABLE) return 0xff's.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: kib
Obtained from: Semihalf
Sponsored by: Marvell
Differential revision: https://reviews.freebsd.org/D29603

(cherry picked from commit 1c1ead9b94a1a731646327ec3b09e8f3acd577b8)

zfs: merge openzfs/zfs@c3b60eded (zfs-2.1-release) into stable/13

Notable upstream pull request merges:
  #12015 Replace zstreamdump with zstream link
  #12046 Improve scrub maxinflight_bytes math.
  #12052 FreeBSD: incorporate changes to the VFS_QUOTACTL(9) KPI
  #12072 Let zfs diff be more permissive
  #12091 libzfs: On FreeBSD, use MNT_NOWAIT with getfsstat
  #12104 Reminder to update boot code after zpool upgrade
  #12114 Introduce write-mostly sums
  #12125 Modernise all (most) remaining .TH manpages
  #12145 More aggsum optimizations
  #12149 Multiple man-pages: Move to appropriate section
  #12158 Re-embed multilist_t storage
  #12177 Livelist logic should handle dedup blkptrs
  #12196 Unify manpage makefiles, move pages to better sexions, revisit some
  #12212 Remove pool io kstats

Obtained from: OpenZFS
OpenZFS commit: c3b60ededa6e6ce36a457a54451ca153c4c630dc
OpenZFS tag: zfs-2.1.0-rc7

vfs: slightly rework vn_rlimit_fsize

(cherry picked from commit 478c52f1e3654213c7d79096a2bc7f908e0b501d)

ktrace: Remove vrele() at the end of ktr_writerequest()

(cherry picked from commit 6f6cd1e8e8aa3a48b35598992f1b6c21003d35cd)

ktrace: fix a race between writes and close

(cherry picked from commit fc369a353b5b5e0f8046687fcbd78a7cd9ad1810)

Fix limit testing after 1762f674ccb571e6 ktrace commit.

(cherry picked from commit e71d5c7331700504e58cf1a35dca529381723e02)

Fix a braino in previous.

(cherry picked from commit 48235c377f960050e9129aa847d7d73019561c82)

Fix tinderbox build after 1762f674ccb571e6 ktrace commit.

(cherry picked from commit 154f0ecc10abdd3c23d233bf85e292011a130583)

libkvm: Fix build after removal of p_tracevp

(cherry picked from commit e67ef6ce667d42a235a70914159048e10039145d)

ktrace: add a kern.ktrace.filesize_limit_signal knob

(cherry picked from commit ea2b64c2413355ac0d5fc6ff597342e9437a34d4)

ktrace: use the limit of the trace initiator for file size limit on writes

(cherry picked from commit 02645b886bc62dfd8a998fd51d2e6c1bbca03ecb)

ktrace: pack all ktrace parameters into allocated structure ktr_io_params

(cherry picked from commit 1762f674ccb571e6b03c009906dd1af3c6343f9b)

ktrace: do not stop tracing other processes if our cannot write to this vnode

(cherry picked from commit a6144f713cee8f522150b1398b225eedbf4cfef1)

accounting: explicitly mark the exiting thread as doing accounting

(cherry picked from commit 9bb84c23e762e7d1b6154ef4afdcc80662692e76)

Change the return type of sv__setid_allowed from bool to int

(cherry picked from commit 62b8258a7e43f3c774f13eab758b2cfdf353073e)

linuxolator: Add compat.linux.setid_allowed knob

PR: 21463

(cherry picked from commit 598f6fb49c9ca688029b79de0a44227ab79c608c)

sysent: allow ABI to disable setid on exec.

(cherry picked from commit 2d423f7671fe452486932c8e41e7d3547afe82aa)

kern_exec.c: Add execve_nosetid() helper

(cherry picked from commit 19e6043a443ea51207786b85c8d62d070ec36005)

hyperv: register intr handler as usermode-mapped if loaded as module

(cherry picked from commit fe7d7ac40881c9d01a54bf57fff71a3af199f237)

Clean up early arm64 pmap code

Early in the arm64 pmap code we need to translate between a virtual
address and a physical address. Rather than manually walking the page
table we can ask the hardware to do it for us.

Reviewed by: kib, markj
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D30357

(cherry picked from commit e779604f1d4e5fd0cdf3a9d1bb756b168f97b39c)

Update the EFI timer to be called once a second

There is no need to call it evert 10ms when we need 1s granularity.
Update to update the time every second.

Reviewed by: imp, manu, tsoome
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D30227

(cherry picked from commit 93f7be080f3ad0bd71190d87aa2043d714270206)

Use '.arch_extension crc' in the arm64 crc32 code

We don't care about the base architecture here, just that the crc
extension is enabled.

Sponsored by: Innovate UK

(cherry picked from commit 0ec3e991112d85a790ca3bbb4175652f37f4bd15)

Implement bus_map_resource on arm64

This will allow us to allocate an unmapped memory resource, then
later map it with a specific memory attribute.

This is also needed for virtio with the modern PCI attachment.

Reviewed by: kib (via D29723)
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D29694

(cherry picked from commit fe3822497726ab84a1e3753be41e43e4d51aab0b)

Use if ... else when printing memory attributes

In vmstat there is a switch statement that converts these attributes to
a string. As some values can be duplicate we have to hide these from
userspace.

Replace this switch statement with an if ... else macro that lets us
repeat values without a compiler error.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D29703

(cherry picked from commit 15221c552b3cabcbf26613246e855010b176805a)

Clean up the style in the arm64 bus.h

MFC after: 2 weeks
Sponsored by: Innovate UK

(cherry picked from commit 5998328e55f8850718a6b48842823eb0a6524ae6)

nfscl: Use hash lists to improve expected search performance for opens

A problem was reported via email, where a large (130000+) accumulation
of NFSv4 opens on an NFSv4 mount caused significant lock contention
on the mutex used to protect the client mount's open/lock state.
Although the root cause for the accumulation of opens was not
resolved, it is obvious that the NFSv4 client is not designed to
handle 100000+ opens efficiently. When searching for an open,
usually for a match by file handle, a linear search of all opens
is done.

Commit 3f7e14ad9345 added a hash table of lists hashed on file handle
for the opens. This patch uses the hash lists for searching for
a matching open based of file handle instead of an exhaustive
linear search of all opens.
This change appears to be performance neutral for a small number
of opens, but should improve expected performance for a large
number of opens.

This commit should not affect the high level semantics of open
handling.

(cherry picked from commit 96b40b896772bbce5f93d62eaa640e1eb51b4a30)

nfscl: Use hash lists to improve expected search performance for opens

A problem was reported via email, where a large (130000+) accumulation
of NFSv4 opens on an NFSv4 mount caused significant lock contention
on the mutex used to protect the client mount's open/lock state.
Although the root cause for the accumulation of opens was not
resolved, it is obvious that the NFSv4 client is not designed to
handle 100000+ opens efficiently.  When searching for an open,
usually for a match by file handle, a linear search of all opens
is done.

Commit 3f7e14ad9345 added a hash table of lists hashed on file handle
for the opens.  This patch uses the hash lists for searching for
a matching open based of file handle instead of an exhaustive
linear search of all opens.
This change appears to be performance neutral for a small number
of opens, but should improve expected performance for a large
number of opens.  This patch also moves any found match to the front
of the hash list, to try and maintain the hash lists in recently
used ordering (least recently used at the end of the list).

This commit should not affect the high level semantics of open
handling.

(cherry picked from commit 724072ab1d588677a83a5a5011b5ad9ff5d56538)

linuxkpi: Add macros for might_lock_nested() and lockdep_(re/un/)pin_lock()

In Linux, these are macros to locks in the kernel for scheduling purposes.
But as with other macros in this header, we aren't doing anything with them
so we are doing `do {} while (0)` for now.

This is needed by the drm-kmod 5.7 update.

Approved by: hselasky (src)
Differential Revision: https://reviews.freebsd.org/D30710

(cherry picked from commit 8a1a42b2a7a428fb97fda9f19fd0d67a4eec7535)

linuxkpi: Add _RET_IP_ macro in kernel.h

This is needed by the drm-kmod 5.7 update.

Approved by: hselasky (src)
Differential Revision: https://reviews.freebsd.org/D30707

(cherry picked from commit fee0d486ef34c6bd113ed743e33357ff626f2495)

linuxkpi: Add rom and romlen to struct pci_dev

Approved by: bz (src), hselasky (src)
Differential Reivison: https://reviews.freebsd.org/D30686

(cherry picked from commit 096104e790fb182d230f25f169792cc610b8f41c)

Also enable IPIs on 32-bit arm

This was missed in 2420f6a

Reported by: tuexen, imp

(cherry picked from commit 0ec205197b56b9257cf0fdc1a5b268fef3e3f2dc)

Enable IPIs on CPU 0 on arm and arm64

Not all interrupt controllers enable IPIs by default as the Arm
GIC specs make it an implementation defined option. As at least two
hypervisors have also previously masked the IPIs on boot.

As we already enable these IPIs on the non-boot CPUs it is expected
this is a safe operation.

Differential Revision: https://reviews.freebsd.org/D26975

(cherry picked from commit 2420f6aed9e355ff65377152ba977b3a5ac441d1)

bectl(8): don't allow creation of boot environments with spaces

Boot environment datasets that contain spaces are not bootable.

When a user attempts to create a boot environment with a space, abort
the creation and print an error message.

PR: 254441
Reviewed by: allanjude
Differential Revision: https://reviews.freebsd.org/D30194

(cherry picked from commit 0e6549c8745049e3d6fba3ad748034de2d5cbd2a)

fsck_ufs: fix segfault with gjournal

The segfault was being hit in ckfini() (sbin/fsck_ffs/fsutil.c) while
attempting to traverse the buffer cache. The tail queue used for the
buffer cache was not initialized before dropping into gjournal_check().

Initialize the buffer cache before calling gjournal_check().

PR:             245907
Reviewed by:    jhb, mckusick
Differential Revision:  https://reviews.freebsd.org/D30537

(cherry picked from commit 441e69e419effac0225a45f4cdb948280b8ce5ab)

fsck_ffs(8): fix divide by zero when debug messages are enabled

Only print buffer cache debug message when a cache lookup has been done.

When running `fsck_ffs -d` on a gjournal'ed filesystem, it's possible
that totalreads is greater than zero when no cache lookup has been
done - causing a divide by zero. This commit fixes the following error:

    Floating point exception (core dumped)

Reviewed by:    mckusick
Differential Revision:  https://reviews.freebsd.org/D30370

(cherry picked from commit 20123b25ee51e9b122676a30d45c21a506bf1461)

Add missing chunks after cherry-picking 9ca874cf740ee68c5742df8b5f9e20910085c011
from main to stable/13: Add TCP LRO support for VLAN and VxLAN.

Make sure all counters are allocated.

This is a direct commit.

Reported by: Herbert J. Skuhra <herbert@gojira.at>
Sponsored by: Mellanox Technologies // NVIDIA Networking

fdescfs: add an option to return underlying file vnode on lookup

(cherry picked from commit f9b1e711f0d8c27f2d29e7a8e6276947d37a6a22)

Tag 2.1.0-rc7

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

Re-embed multilist_t storage

This commit partially reverts changes to multilists in PR 7968
(multi-threaded spa-sync()) and adds some cache line alignments to
separate read-only multilists and heavily modified refcount's to
different cache lines.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-by: iXsystems, Inc.
Closes #12158

dracut: 90zfs: respect zfs_force=1 on systemd systems

On systemd systems provide an environment generator in order
to respect the zfs_force=1 kernel command line option.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11403
Closes #12195

Remove pool io kstats

This mostly reverts "3537 want pool io kstats" commit of 8 years ago.

From one side this code using pool-wide locks became pretty bad for
performance, creating significant lock contention in I/O pipeline.
From another, there are more efficient ways now to obtain detailed
statistics, while this statistics is illumos-specific and much less
usable on Linux and FreeBSD, reported only via procfs/sysctls.

This commit does not remove KSTAT_TYPE_IO implementation, that may
be removed later together with already unused KSTAT_TYPE_INTR and
KSTAT_TYPE_TIMER.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12212

Added error for writing to /dev/ on Linux

Starting in Linux 5.10, trying to write to /dev/{null,zero} errors out.
Prefer to inform people when this happens rather than hoping they guess
what's wrong.

Reviewed-by: Antonio Russo <aerusso@aerusso.net>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes: #11991

libzfs: format safety

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12116

zgenhostid.8: revisit

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12212

Consistentify miscellaneous style on remaining manpages

Most notably this fixes the vdev_id(8) non-.Xrs in vdev_id.conf.5

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12212

Move properties, parameters, events, and concepts around manual sections

The pages moved as follows:
  zpool-features.{5 => 7}
  spl{-module-parameters.5 => .4}
  zfs{-module-parameters.5 => .4}
  zfs-events.5 => into zpool-events.8
  zfsconcepts.{8 => 7}
  zfsprops.{8 => 7}
  zpoolconcepts.{8 => 7}
  zpoolprops.{8 => 7}

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Co-authored-by: Daniel Ebdrup Jensen <debdrup@FreeBSD.org>
Closes #12149
Closes #12212

man: use one Makefile, use OpenZFS for .Os

The prevailing style is to use either nothing, or the originating
organisational umbrella (here: OpenZFS), and these aren't Linux manpages

This also deduplicates the substitution code, and makes adding/removing
sexions simpler in future

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12212

Fix minor shellcheck 0.7.2 warnings

The first warning of a misspelling is a false positive, so we annotate
the script accordingly.  As for the x-prefix warnings update the check
to use the conventional '[ -z <string> ]' syntax.

all-syslog.sh:46:47: warning: Possible misspelling: ZEVENT_ZIO_OBJECT
    may not be assigned, but ZEVENT_ZIO_OBJSET is. [SC2153]
make_gitrev.sh:53:6: note: Avoid x-prefix in comparisons as it no
    longer serves a purpose [SC2268]
man-dates.sh:10:7: note: Avoid x-prefix in comparisons as it no
    longer serves a purpose [SC2268]

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12208

Clarify that the new STABLE branch is branched off CURRENT, not renamed

Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D29317

(cherry picked from commit 53844d3ea4aa504c37c8bd679cfea9966ade1400)

Add freeze/thaw description to devctl(8)

This is a follow-up to 5fa29797910346fc0c54829bd979856e83b9b7ea .

PR: 256311
Reviewed by: imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29867

(cherry picked from commit 315674fb6acc4fa54cf82de3863c349c5a71f498)

Fix test case header function name

This restores the expected behavior (skip) when running with non-root user

MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D30584

(cherry picked from commit 847b7d505490ae407a5c876e14e7788a4add7737)

Cirrus-CI: retry pkg installation on failure

Pkg installation failed somewhat frequently, always at:

[62/104] Fetching jpeg-turbo-2.0.6.txz: .......... done
pkg: http://pkgmir.geo.freebsd.org/FreeBSD:13:amd64/quarterly/All/jbigkit-2.1_1.txz: No route to host

Move pkg installation to a script and retry once upon failure as a
(hopefully temporary) workaround.

Reviewed by: imp
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30613

(cherry picked from commit dd41de95a84d979615a2ef11df6850622bf6184e)

linux_common: retire extra module version.

The second 'linuxcommon' line was added by c66f5b079d2a259c3a65b1efe0f2143cd030dc52
but Linuxulator's modules dependend on 'linux_common'.
To avoid such mistakes in the future rename moduledata name and module
name to 'linux_common' and retire 'linuxcommon' line.

Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D30409

(cherry picked from commit 5184e2da41921dfec5a3668756890b5c073fbad9)

netgraph/ng_base: Renaming a node to the same name is a noop

Detailed analysis in https://github.com/genneko/freebsd-vimage-jails/issues/2
brought the problem down to a double call of ng_node_name() before and
after a vnet move. Because the name of the node is already known
(occupied by itself), the second call fails.

PR: 241954
Reported by: Paul Armstrong
Differential Revision: https://reviews.freebsd.org/D30110

(cherry picked from commit 0345fd891fe13a191fc0fae9463ea9458bfaff5a)

rtwn_usb(4): Add a USB ID for the TP-Link Archer T2U v3.

PR: 256203
Submitted by: Steve Kargl sgk at troutmask.apl.washington.edu

(cherry picked from commit 434c46c00602e16115cfb19344e4c45135b68b09)

Disable x2APIC for SandyBridge laptops with Samsung BIOS

PR: 256389

(cherry picked from commit 37f780d3e0a2e8e4c64c526b6e7dc77ff6b91057)

madt_setup_local: extract special case checks into a helper

(cherry picked from commit e9e00cc0c98738f0ce7bface221c920c36a1356e)

madt_setup_local: convert series of strcmp to iteration over the array

(cherry picked from commit 92adf00d0512b674ce18eb1a542ae42e85dd5bca)

madt_setup_local: skip further checks if ACPI DMAR table already disabled x2APIC

(cherry picked from commit a603d41aca48ff21df59967c55ddef181e16ec14)

src.conf.5: Regen.

Add a description for WITH_SVNLITE.

Reviewed by: emaste
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D30106

(cherry picked from commit 0ac711e07ece9ac260023a62250d50fb2f37e975)

Disable building svnlite(1) by default.

Now that all repositories have switched to git, initiate the de-orbit
burn for svnlite(1).

Reviewed by: emaste
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D30105

(cherry picked from commit a2bc17474b962ba9e29c4526356203fe48a549eb)

libcrypto: Add symbol versions for symbols added since 1.1.1d.

While here, trim a spurious local: I missed when added SSL_sendfile.

PR: 255277
Reported by: yuri
Reviewed by: jkim
Differential Revision: https://reviews.freebsd.org/D30483

(cherry picked from commit 7ad70d22c667173586c04fc13dd315995d78fbbf)

etcupdate: Add -D destdir to usage for 'extract'.

Reported by: Mark Millard <marklmi@yahoo.com>

(cherry picked from commit 5eb9c93a20d7320b24635ed09eba4908951bdeb2)

etcupdate: Add a revert mode to restore one or more stock files.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D29846

(cherry picked from commit ba30215ae0efeb49e5e9ca2469d95edaea78680d)

etcupdate: Trim trailing whitespace.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D29845

(cherry picked from commit ada7fd17d57fac3dbafff5a1b3268afb872c8b0b)

etcupdate: Gracefully handle SIGINT when building trees.

Run the 'build_tree' function inside of a subshell and trap SIGINT to
return an error to the caller. This allows callers to gracefully
cleanup a partially created tree.

While here, redirect stdout/stderr of the subshell to the log file
instead of applying redirections individually to each command executed
while building the tree.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D29844

(cherry picked from commit 1f7afa9364805a912270c9d6a70dc4a889d47a4e)

etcupdate: Always extract to a temporary tree.

etcupdate has had a somewhat nasty race condition since its creation
in that its state machine can get very confused if it is interrupted
while building the tree to compare against. This is exacerbated by
the fact that etcupdate doesn't emit any output while building the
tree which can take several seconds (especially in recent years with
the addition of the tree-wide buildconfig/installconfig passes).

To mitigate this, always install a new tree into a temporary directory
created via mktemp as was previously done only for dry-runs via -n.
The existing trees are only rotated and the new tree installed as
/var/db/etcupdate/current after the update command has completed.

Reported by: dim, np (and many others)
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D29843

(cherry picked from commit 0611aec3cf3a373e6a06f103699dbc91c3d6d472)
(cherry picked from commit b0df36580d5b0df67a0f58ded8f6356b268f7f71)

zed.d/history_event-zfs-list-cacher.sh.in: parallelise, simplify

This:
  (a) improves the error log message,
  (b) locks per pool instead of globally,
  (c) locks the actual output file instead of /var/lock/zfs-list,
      which would otherwise linger there forever (well, still will,
      but you can remove it and it won't come back), and
  (d) preserves attributes of the output file
      instead of reverting them to 0:0 644

It is imperative that the previous commit
("zed-functions.sh: zed_lock(): don't truncate lock")
be included in any series that contains this one

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042

zed.d/all-debug.sh: simplify

By locking the log file itself, we can omit arduous rebinding and
explicit umask setting, but, perhaps more importantly, avoid permanently
littering /var/lock/ with zed.debug.log.lock we will never delete

It is imperative that the previous commit
("zed-functions.sh: zed_lock(): don't truncate lock")
be included in any series that contains this one

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042

zed-functions.sh: zed_lock(): don't truncate lock

By appending instead of truncating, we can lock on any file (with write
permissions) instead of only dedicated lock files, since the locking
process itself no longer alters the file in any way

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042

libzfs: On FreeBSD, use MNT_NOWAIT with getfsstat

`getfsstat(2)` is used to retrieve the list of mounted file systems,
which libzfs uses when fetching properties like mountpoint, atime,
setuid, etc.  The `mode` parameter may be `MNT_NOWAIT`, which uses
information in the VFS's cache, or `MNT_WAIT`, which effectively does a
`statfs` on every single mounted file system in order to fetch the most
up-to-date information.  As far as I can tell, the only fields that
libzfs cares about are the filesystem's name, mountpoint, fstypename,
and mount flags.  Those things are always updated on mount and unmount,
so they will always be accurate in the VFS's mount cache except in two
circumstances:

1) When a file system is busy unmounting
2) When a ZFS file system changes the value of a mount-overridable
   property like atime or setuid, but doesn't remount the file system.
   Right now that only happens when the property is changed by an
   unprivileged user who has delegated authority to change the property
   but not to mount the dataset.  But perhaps libzfs could choose to do
   it for other reasons in the future.

Switching to `MNT_NOWAIT` will greatly improve speed with no downside,
as long as we explicitly update the mount cache whenever we change a
mount-overridable property.

For comparison, Illumos gets this information using the native
`getmntany` and `getmntent` functions, which also use cached
information.  The illumos function that would refresh the cache,
`resetmnttab`, is never called by libzfs.

And on GNU/Linux, `getmntany` and `getmntent` don't even communicate
with the kernel directly.  They simply parse the file they are given,
which is usually /etc/mtab or /proc/mounts.  Perhaps the implementation
of /proc/mounts is synchronous, ala MNT_WAIT; I don't know.

Sponsored-by: Axcient
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alan Somers <asomers@gmail.com>
Closes: #12091

Modernise/fix/rewrite unlinted manpages

zpool-destroy.8: flatten, fix description
zfs-wait.8: flatten, fix description, use list for events
zpool-reguid.8: flatten, fix description
zpool-history.8: flatten, fix description
zpool-export.8: flatten, fix description, remove -f "unmount" reference
AFAICT no such command exists even in Illumos (as of today, anyway),
and we definitely don't call it
zpool-labelclear.8: flatten, fix description
zpool-features.5: modernise
spl-module-parameters.5: modernise
zfs-mount-generator.8: rewrite
zfs-module-parameters.5: modernise

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12169

Force --enable-debug on FreeBSD if INVARIANTS is set

There's already logic to force INVARIANTS on for building if it's
present in the running kernel; however, not having DEBUG enabled
when DEBUG and INVARIANTS are can cause strange panics.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12185
Closes #12163

Livelist logic should handle dedup blkptrs

Update the logic to handle the dedup-case of consecutive
FREEs in the livelist code. The logic still ensures that
all the FREE entries are matched up with a respective
ALLOC by keeping a refcount for each FREE blkptr that we
encounter and ensuring that this refcount gets to zero
by the time we are done processing the livelist.

zdb -y no longer panics when encountering double frees

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #11480
Closes #12177