FreeBSD/FreeBSD.git
Richard Scheffenegger [Sat, 2 Dec 2023 11:15:37 +0000 (12:15 +0100)]
tcp: properly initialize LRD while accepting session in syncache

Inherit the setting from the listener socket in syncache_socket.

MFC after:             2 weeks
Reviewed By:           tuexen, #transport
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D42874

Ka Ho Ng [Sat, 2 Dec 2023 09:15:53 +0000 (04:15 -0500)]
libclang_rt: Update Makefile.depend

MFC after: 3 days

Ka Ho Ng [Sat, 2 Dec 2023 05:55:56 +0000 (00:55 -0500)]
termcap.small: Include xterm-256color

MFC after: 7 days
Reviewed by: imp, jhb
Differential Revision: https://reviews.freebsd.org/D42784

Jessica Clarke [Fri, 1 Dec 2023 23:59:07 +0000 (23:59 +0000)]
armv8rng: Don't require toolchain to support FEAT_RNG

We have the mechanism in place to support encoding system registers
explicitly, so use that rather than requiring LLVM 13+, which breaks our
current set of GitHub CI builds.

Fixes: 9eecef052155 ("Add an Armv8 rndr random number provider")

Gleb Smirnoff [Fri, 1 Dec 2023 23:37:29 +0000 (15:37 -0800)]
unix/dgram: bump maximum datagram size limit to 8k

This is important for wpa_supplicant operation on a crowded network.

Note: we actually need an API to increase maximum datagram size on a
socket.  Previously SO_SNDBUF magically acted like that, but that was
an undocumented "feature".

Also move the comment to the proper line.  Previously it was the receive
buffer that imposed the limit.  Now the notions of buffer size and
maximum datagram size are separate.

Reviewed by: bz, tuexen, karels
Differential Revision: https://reviews.freebsd.org/D42830
PR: 274990

Bjoern A. Zeeb [Thu, 26 Oct 2023 21:14:44 +0000 (21:14 +0000)]
LinuxKPI: 802.11: bring in some HT code

Fix defines and structures to use proper types.

Bring in basic ni->sta synchronization, some channel width handling,
and overload the net80211 functions so that we can talk to
driver/firmware to set up parameters.  We will likely not need one
or two of those but it is good for tracing currently.

Cover HT and bits of VHT code in LinuxKPI behind appropriate #ifdef
which are currently not enabled (like LKPI_80211_HW_CRYPTO) until
confirmed to work.
Last, IEEE80211_AMPDU_RX_START made some firmware unhappy.

This will allow others to work on it and test as well.

Sponsored by: The FreeBSD Foundation
MFC after: 10 days

Brooks Davis [Fri, 1 Dec 2023 21:45:42 +0000 (21:45 +0000)]
sysproto.h: regen after c1c8afd04e34d

Brooks Davis [Fri, 1 Dec 2023 20:48:29 +0000 (20:48 +0000)]
sysvipc: Fix 32-bit compat on !i386

The various time fields are time_t's which are only 32-bit on i386.

Fixing the old versions is probably of little use, but it's more correct
and in theory there could be powerpc binaries from 6.x.

PR: 240035
Fixes: fbb273bc05bef Properly support for FreeBSD 4 32bit System V shared memory.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D42870

Brooks Davis [Fri, 1 Dec 2023 20:00:39 +0000 (20:00 +0000)]
makesyscalls: add COMPAT14 support

Reviewed by: kevans, imp
Fixes: 84d12f887c91f Add a COMPAT_FREEBSD14 kernel option
Differential Revision: https://reviews.freebsd.org/D42861

Mark Johnston [Fri, 1 Dec 2023 18:28:58 +0000 (13:28 -0500)]
arm64: Add register definitions for MDCR_EL2

This is needed to support the bhyve gdb stub implementation on arm64.

Reviewed by: andrew
MFC after: 1 week
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D42867

Alan Somers [Fri, 1 Dec 2023 15:19:24 +0000 (08:19 -0700)]
sigaction.2: clarify that fork isn't async-signal-safe, but _Fork is

[skip ci]

MFC after: 2 weeks
Sponsored by: Axcient
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D42865

Mark Johnston [Fri, 1 Dec 2023 14:46:31 +0000 (09:46 -0500)]
bhyve: Fix a leak that happens when we fail to load a hostfwd rule

Reported by: Coverity
Fixes: c5359e2af5ab ("bhyve: Add a slirp network backend")

Peter Holm [Fri, 1 Dec 2023 09:37:13 +0000 (10:37 +0100)]
stress2: Handle a define with comments

Gleb Smirnoff [Fri, 1 Dec 2023 05:50:16 +0000 (21:50 -0800)]
ofed: garbage collect now unused sdp_sockaddr()

Submitted by: zlei

Bjoern A. Zeeb [Fri, 1 Dec 2023 01:37:25 +0000 (01:37 +0000)]
tools/net80211: add mlme_assoc

mlme_assoc is a tool to trigger net80211::ieee80211_sta_join1() calls
which in certain conditions cause problems to the LinuxKPI 802.11 compat
code (but also believed to possibly cause problems in case of race to
other firmware based drivers).  This has proven to be a good reproducer
for the problem even on setups which otherwise could run for days without
hitting it.

Sponsored by: The FreeBSD Foundation
PR: 271979

Stephen J. Kiernan [Wed, 29 Nov 2023 19:20:45 +0000 (14:20 -0500)]
iicbus: add compat32 support for I2C ioctls

Some of the I2C ioctl request structures contain pointers and need to
handle requests from 32-bit applications on 64-bit kernels.
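
As a rough illustration of why embedded pointers force a translation step, the Python sketch below models a request structure with a hypothetical layout (three 16-bit fields followed by a buffer pointer).  The layout and the names MSG32, MSG64 and widen_msg are assumptions made for illustration; they are not the actual I2C ioctl definitions or the kernel's compat32 code.

```python
import struct

# Hypothetical layouts, for illustration only: three 16-bit fields,
# two bytes of padding, then a buffer pointer (4 bytes as seen by a
# 32-bit process, 8 bytes as seen by a 64-bit kernel).
MSG32 = struct.Struct("<HHHxxI")
MSG64 = struct.Struct("<HHHxxQ")


def widen_msg(raw32: bytes) -> bytes:
    """Convert a 32-bit request image into the 64-bit layout."""
    slave, flags, length, buf32 = MSG32.unpack(raw32)
    # The 32-bit user pointer is zero-extended to the wider pointer field.
    return MSG64.pack(slave, flags, length, buf32)
```

The two layouts differ in size (12 versus 16 bytes here), which is why a 64-bit kernel cannot simply reinterpret a structure passed by a 32-bit application.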

Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D42836

Warner Losh [Fri, 1 Dec 2023 01:18:59 +0000 (18:18 -0700)]
cam: Make cam.h self-contained for userland

We reference FILE * here, but don't include stdio.h. Do so (both of
these are in !_KERNEL blocks).

Sponsored by: Netflix

Warner Losh [Fri, 1 Dec 2023 01:17:30 +0000 (18:17 -0700)]
cam: Remove prototype for cam_sim_alloc_dev

The implementation was removed in dcd5dea96509, but the prototype was
not. Correct that oversight.

Fixes: dcd5dea96509
Sponsored by: Netflix

Kyle Evans [Fri, 1 Dec 2023 01:26:09 +0000 (19:26 -0600)]
rtld: add a test for RTLD_DEEPBIND

This tests that with RTLD_DEEPBIND, symbols are looked up in all of the
object's needed objects before the global object.

PR: 275393
Reviewed by: kib
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D42843

Bjoern A. Zeeb [Fri, 3 Nov 2023 21:52:35 +0000 (21:52 +0000)]
ath: Revert "Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process"

This reverts commit 6c3e93cb5a4aa4b8a2d8d4d326f2a7c34d3a4458 for
sys/dev/ath/if_ath.c only.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days

Bjoern A. Zeeb [Fri, 3 Nov 2023 21:50:31 +0000 (21:50 +0000)]
Revert "[ath] Attempt to fix epoch handling."

This reverts commit af2441fbc7fa9e522e7f8697e5a181bdd4ff9e00.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days

Bjoern A. Zeeb [Fri, 3 Nov 2023 21:31:29 +0000 (21:31 +0000)]
Revert "Enter the network epoch in USB WiFi drivers when processing input"

This reverts commit 17c328b6aebfa03cd1c2cbfbbc617e3b341bf1e4.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days

Bjoern A. Zeeb [Fri, 3 Nov 2023 21:27:15 +0000 (21:27 +0000)]
Revert "Widen EPOCH(9) usage in USB WLAN drivers."

This reverts commit 21c4082de9e2cf9a0fd81a9a981ab06022956847.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days

Bjoern A. Zeeb [Fri, 3 Nov 2023 21:19:26 +0000 (21:19 +0000)]
Revert "Widen EPOCH(9) usage in PCI WLAN drivers."

This reverts commit b65f813c1ab99448278961c5ca80dc422b1eae29.
As a side effect this also seems to fix wtap which seems to have
lost the epoch over the input path in between.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days

Bjoern A. Zeeb [Sun, 29 Oct 2023 14:25:23 +0000 (14:25 +0000)]
net80211: move net_epoch into net80211

Move the net_epoch into net80211 around the if_input calls and out of
the driver (in this first case LinuxKPI).  This reduces coverage but
also allows us to alloc in calls like (*ampdu_rx_start) which do not
actually pass data up the stack.

The follow-up commits will revert b65f813c1ab99448278961c5ca80dc422b1eae29,
21c4082de9e2cf9a0fd81a9a981ab06022956847,
17c328b6aebfa03cd1c2cbfbbc617e3b341bf1e4,
af2441fbc7fa9e522e7f8697e5a181bdd4ff9e00,
and 6c3e93cb5a4aa4b8a2d8d4d326f2a7c34d3a4458 for ath.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Tested by: few (rtwn, ath, iwlwifi, ...)
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D42427

Christos Margiolis [Fri, 1 Dec 2023 00:30:10 +0000 (02:30 +0200)]
sort: test against all month formats in month-sort

The CLDR specification [1] defines three possible month formats:

- Abbreviation (e.g. Jan, Ιαν)
- Full (e.g. January, Ιανουαρίου)
- Standalone (e.g. January, Ιανουάριος)

Many languages use different case endings depending on whether the month
is referenced as a standalone word (nominative case), or in date context
(genitive, partitive, etc.). sort(1)'s -M option currently sorts months
by testing input against only the abbreviation format, which is
essentially a substring of the full format. While this works fine for
languages like English, where there are no cases, for languages where
there is a different case ending between the abbreviation/full and
standalone formats, it is not sufficient.

For example, in Greek, "May" can take the following forms:

Abbreviation: Μαΐ (genitive case)
Full: Μαΐου (genitive case)
Standalone: Μάιος (nominative case)

If we use the standalone format in Greek, sort(1) will not be able to match
"Μαΐ" to "Μάιος" and the sort will fail.

This change makes sort(1) test against all three formats. It also works
when the input contains mixed formats.

[1] https://cldr.unicode.org/translation/date-time/date-time-patterns
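
The idea of testing against all three formats can be sketched in Python as follows.  This is a simplified model, not sort(1)'s C implementation: the MONTHS table is a hand-written stand-in for locale data and covers only the Greek forms quoted above, and month_number is a hypothetical helper name.

```python
# Illustrative table only: (abbreviation, full, standalone) per month,
# using the Greek forms quoted in the commit message.
MONTHS = {
    1: ("Ιαν", "Ιανουαρίου", "Ιανουάριος"),
    5: ("Μαΐ", "Μαΐου", "Μάιος"),
}


def month_number(word):
    """Return the month number if word starts with any known form, else None."""
    word = word.strip()
    for number, forms in MONTHS.items():
        # Checking every form, not just the abbreviation, is the point:
        # the standalone form need not share a prefix with the abbreviation.
        if any(word.startswith(form) for form in forms):
            return number
    return None
```

With this table, sorted(words, key=month_number) orders mixed-format input by month, provided every word is recognized.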

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42847

Richard Scheffenegger [Thu, 30 Nov 2023 20:10:14 +0000 (21:10 +0100)]
tcp: for LRD move sysctl from tcp.do_lrd to tcp.sack.lrd, remove sockopt

Moving lrd sysctl to the tcp.sack branch, since LRD only works with SACK.
Remove the sockopt to programmatically control LRD per session.

Reviewed By:           #transport, tuexen, rrs
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D42851

Konstantin Belousov [Wed, 29 Nov 2023 18:30:59 +0000 (20:30 +0200)]
RTLD_DEEPBIND: make lookup not just symbolic, but walk all refobj' DAGs

before starting the walk over the global list.  Effectively we visit
needed objects first as well, instead of just the object itself.
This seems to better match the semantic offered by the glibc flag.
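
The resulting search order can be modelled with a short Python sketch.  It is only an approximation of the behaviour described here: the real rtld walks its own DAG structures in C, while this model uses a plain dictionary of needed objects, and the names deepbind_order, needed and global_list are hypothetical.

```python
def deepbind_order(obj, needed, global_list):
    """Return the order in which objects would be searched for a symbol.

    obj         -- the object opened with the deep-bind flag
    needed      -- dict mapping an object to the objects it needs
    global_list -- the global search list
    """
    order = []
    queue = [obj]
    # First walk the object and everything it needs, breadth first.
    while queue:
        current = queue.pop(0)
        if current in order:
            continue
        order.append(current)
        queue.extend(needed.get(current, []))
    # Only then fall back to the global list, skipping objects already seen.
    for entry in global_list:
        if entry not in order:
            order.append(entry)
    return order
```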

Reported by: kevans
PR: 275393
Reviewed by: kevans
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42841

Bjoern A. Zeeb [Thu, 30 Nov 2023 18:20:22 +0000 (18:20 +0000)]
net80211: ieee80211_dump_node() check for channel to be set

Avoid panics in case ieee80211_dump_node() gets called before a
channel context is set.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days

Mark Johnston [Thu, 30 Nov 2023 17:46:54 +0000 (12:46 -0500)]
ossl: Add AES-GCM support for NEON-enabled armv7

This provides substantially higher throughput than the fallback
implementation.

Reviewed by: jhb
MFC after: 3 months
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D41305

Mark Johnston [Thu, 30 Nov 2023 17:46:08 +0000 (12:46 -0500)]
ossl: Add support for armv7

OpenSSL provides implementations of several AES modes which use
bitslicing and can be accelerated on CPUs which support the NEON
extension.  This patch adds arm platform support to ossl(4) and provides
an AES-CBC implementation, though bsaes_cbc_encrypt() only implements
decryption.  The real goal is to provide an accelerated AES-GCM
implementation; this will be added in a subsequent patch.

Initially derived from https://reviews.freebsd.org/D37420.

Reviewed by: jhb
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D41304

Mark Johnston [Wed, 29 Nov 2023 20:08:12 +0000 (15:08 -0500)]
ossl: Fix some bugs in the fallback AES-GCM implementation

gcm_*_aesni() are used when the AVX512 implementation is not available.
Fix two bugs which manifest when handling operations spanning multiple
segments:
- Avoid underflow when the length of the input is smaller than the
  residual.
- In gcm_decrypt_aesni(), ensure that we begin the operation at the
  right offset into the input and output buffers.
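
The first bug is the classic unsigned-subtraction underflow.  The Python sketch below models it with 64-bit wraparound; the meaning of "residual" (bytes still needed to finish a partially processed block) and the helper names are assumptions for illustration and do not mirror the actual gcm_*_aesni() code.

```python
MASK64 = (1 << 64) - 1  # models an unsigned 64-bit length type


def naive_remaining(length, residual):
    """Unsigned subtraction with no bounds check: wraps when length < residual."""
    return (length - residual) & MASK64


def safe_split(length, residual):
    """Clamp the residual to the input length before subtracting."""
    take = min(length, residual)
    return take, length - take
```

For a 4-byte segment and a 10-byte residual, the naive form yields a huge bogus length, while the clamped form consumes the 4 bytes and leaves 0.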

Reviewed by: jhb
Fixes: 9b1d87286c78 ("ossl: Add a fallback AES-GCM implementation using AES-NI")
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42838

Gleb Smirnoff [Thu, 30 Nov 2023 16:30:55 +0000 (08:30 -0800)]
sockets: don't malloc/free sockaddr memory on getpeername/getsockname

Just like it was done for accept(2) in cfb1e92912b4, use the same approach
for two simpler syscalls that return socket addresses.  Although
these two syscalls aren't performance critical, this change generalizes
some code between 3 syscalls, trimming code size.

Following example of accept(2), provide VNET-aware and INVARIANT-checking
wrappers sopeeraddr() and sosockaddr() around protosw methods.

Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D42694

Gleb Smirnoff [Thu, 30 Nov 2023 16:30:55 +0000 (08:30 -0800)]
sockets: don't malloc/free sockaddr memory on accept(2)

Let the accept functions provide stack memory for protocols to fill it in.
Generic code should provide sockaddr_storage, specialized code may provide
smaller structure.

While rewriting accept(2), make 'addrlen' a true in/out parameter, reporting
the required length if the provided length was insufficient.  Our manual
page accept(2) and POSIX don't explicitly require that, but one can read
the text as they do.  Linux also does that. Update tests accordingly.
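
The in/out behaviour of 'addrlen' can be modelled in a few lines of Python.  This is a sketch of the semantics only; copy_out_addr is a hypothetical helper, not a kernel function.

```python
def copy_out_addr(addr: bytes, buflen: int):
    """Return (bytes copied to the caller, length reported back).

    addr   -- the full socket address produced by the protocol
    buflen -- the buffer size the caller passed in via addrlen
    """
    copied = addr[:buflen]    # never write past the caller's buffer
    return copied, len(addr)  # report the length that was actually required
```

A caller that passed a 16-byte buffer for a 28-byte address gets 16 bytes back together with a reported length of 28, and can tell that the address was truncated.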

Reviewed by: rscheff, tuexen, zlei, dchagin
Differential Revision: https://reviews.freebsd.org/D42635

Richard Scheffenegger [Thu, 30 Nov 2023 04:33:50 +0000 (05:33 +0100)]
tcp: enable LRD by default

Lost Retransmission Detection was added as a
feature in May 2021, but disabled by default.

Enabling the feature by default to reduce the
flow completion time by avoiding RTOs when
retransmissions get lost too.

Reviewed By:           tuexen, #transport, zlei
MFC after:             10 weeks
Sponsored by:          NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D42845

Andrew Turner [Wed, 29 Nov 2023 12:54:49 +0000 (12:54 +0000)]
vm: Add kva_alloc_aligned

Add a function like kva_alloc that allows us to specify the alignment
of the virtual address space returned.

Reviewed by: alc, kib, markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D42788

Andrew Turner [Wed, 29 Nov 2023 12:11:37 +0000 (12:11 +0000)]
vm: Use vmem_xalloc in kva_alloc

The kernel_arena used in kva_alloc has the qcache disabled. vmem_alloc
will first try to use the qcache before falling back to vmem_xalloc.

Rather than trying to use the qcache in vmem_alloc just call
vmem_xalloc directly.

Reviewed by: alc, kib, markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D42831

Jamie Gritton [Thu, 30 Nov 2023 01:27:37 +0000 (17:27 -0800)]
Unbreak build from ed31b3f4a146 (misapplied diff).

Differential Revision: https://reviews.freebsd.org/D28150

Jamie Gritton [Thu, 30 Nov 2023 00:12:13 +0000 (16:12 -0800)]
jail: Don't allow jail_set(2) to resurrect dying jails.

Currently, a prison in "dying" state (removed but still holding
resources) can be brought back to alive state via "jail -d", or
the JAIL_DYING flag to jail_set(2).  This seemed like a good idea
at the time.

Its main use was to improve support for specifying the jid when
creating a jail, which also seemed like a good idea at the time.
But resurrecting a jail that was partway through the process of
shutting down is trouble waiting to happen.

This patch deprecates that flag, leaving it as a no-op for creating
jails (but still useful for looking at dying jails).  It still allows
creating a new jail with the same jid as a dying one, but will renumber
the old one in that case.  That's imperfect, but allows for current
behavior.

Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D28150

Stephen J. Kiernan [Wed, 29 Nov 2023 19:33:59 +0000 (14:33 -0500)]
smbus: add compat32 support for SMB ioctls

Some of the SMB ioctl request structures contain pointers and need to
handle requests from 32-bit applications on 64-bit kernels.

Obtained from: Juniper Networks, Inc.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D42837

Martin Matuska [Wed, 29 Nov 2023 22:07:33 +0000 (23:07 +0100)]
zfs: merge openzfs/zfs@a03ebd9be

Notable upstream pull request merges:
 #15517 2a27fd411 ZIL: Assert record sizes in different places
 #15557 b94ce4e17 module/icp/asm-arm/sha2: fix compiling on armv5/6
 #15557 4340f69be module/icp/asm-arm/sha2: auto detect __ARM_ARCH
 #15603 a03ebd9be ZIL: Call brt_pending_add() replaying TX_CLONE_RANGE
 #15606 1c38cdfe9 zdb: fix printf() length for uint64_t devid

Obtained from: OpenZFS
OpenZFS commit: a03ebd9beec6243682557fa692c12b1061fc58bd

Dag-Erling Smørgrav [Wed, 29 Nov 2023 21:48:57 +0000 (22:48 +0100)]
tail: Clean up error messages.

MFC after: 1 week
Sponsored by: Klara, Inc.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D42842

Dag-Erling Smørgrav [Wed, 29 Nov 2023 21:48:50 +0000 (22:48 +0100)]
tail: Fix heap overflow in -F case.

The number of events we track can vary over time, but we only allocate
enough space for the exact number of events we are tracking when we
first begin, resulting in a trivially reproducible heap overflow.  Fix
this by allocating enough space for the greatest possible number of
events (two per file) and clean up the code a bit.
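
A small Python model shows why sizing the array for the initial event count is not enough.  The event kinds ("read", "vnode") and the helper names are assumptions for illustration; the real code is C and manages a fixed-size allocation rather than a Python list.

```python
def event_capacity(nfiles):
    """Worst case from the commit message: two events per file."""
    return 2 * nfiles


def build_events(files, capacity):
    """Collect (name, kind) events, refusing to exceed the allocated capacity."""
    events = []
    for name, kinds in files:
        for kind in kinds:
            if len(events) >= capacity:
                # In C this is where the write past the end of the array happens.
                raise OverflowError("event list would overflow its allocation")
            events.append((name, kind))
    return events
```

If two files start with one event each, an allocation of 2 overflows as soon as either file needs its second event, whereas event_capacity(2) leaves room for all 4.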

Also add a test case which triggers the aforementioned heap overflow,
although we don't currently have a way to detect it.

MFC after: 1 week
Sponsored by: Klara, Inc.
Reviewed by: allanjude, markj
Differential Revision: https://reviews.freebsd.org/D42839

Bjoern A. Zeeb [Wed, 29 Nov 2023 21:33:23 +0000 (21:33 +0000)]
iwlwififw: add firmware for the Bz/B200 chipset

The iwlwifi driver already supports the chipset as "Bz TBD"
(also in 14.0).  Add the firmware for it.  Successfully tested
for 0x8086/0x272b/0x8086/0x00f4 on arm64 thanks to donated
hardware [1].

    Firmware was obtained from linux-firmware at
    9552083a783e5e48b90de674d4e3bf23bb855ab0 .

Sponsored by: The FreeBSD Foundation
Sponsored by: Martin Hoehne / minipci.biz (B200 card) [1]
MFC after: 3 days

Jean-Sébastien Pédron [Wed, 29 Nov 2023 18:38:54 +0000 (19:38 +0100)]
linuxkpi: Include <linux/rbtree.h> from <linux/hrtimer.h> and <linux/mm_types.h>

[Why]
Some files in DRM rely on this indirect include to use `struct rb_*`.

Reviewed by: manu
Approved by: manu
Differential Revision: https://reviews.freebsd.org/D42835

Jean-Sébastien Pédron [Wed, 29 Nov 2023 18:34:48 +0000 (19:34 +0100)]
vt(4): Call post-switch callback after replacing the backend

[Why]
For instance, it gives a chance to the new backend to refresh the
screen. This is needed by the vt_drmfb backend and `drm_fb_helper`.

This change was lost when I posted changes to reviews.freebsd.org and it
broke the amdgpu driver... Thanks to manu@ for reporting the problem
and wulf@ for finding the missing change!

Tested by: manu
Reviewed by: manu
Approved by: manu
Differential Revision: https://reviews.freebsd.org/D42834

Alexander Motin [Wed, 29 Nov 2023 18:51:34 +0000 (13:51 -0500)]
ZIL: Call brt_pending_add() replaying TX_CLONE_RANGE

zil_claim_clone_range() takes references on cloned blocks before ZIL
replay.  Later zil_free_clone_range() drops them after replay or on
dataset destroy.  The total balance is neutral.  It means on actual
replay we must take additional references, which would stay in BRT.

Without this, blocks could be freed prematurely when either the original
file or its clone is destroyed.  I've observed BRT being emptied
and the feature being deactivated after ZIL replay completion, which
should not have happened.  With the patch I see expected stats.

Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15603

John Baldwin [Wed, 29 Nov 2023 18:32:39 +0000 (10:32 -0800)]
x86: Support multiple PCI MCFG regions

In particular, this enables support for PCI config access for domains
(segments) other than 0.

Reported by: cperciva
Tested by: cperciva (m7i.metal-48xl AWS instance)
Reviewed by: imp
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D42828

John Baldwin [Wed, 29 Nov 2023 18:32:16 +0000 (10:32 -0800)]
x86: Refactor pcie_cfgregopen

Split out some bits of pcie_cfgregopen that only need to be executed
once into helper functions in preparation for supporting multiple MCFG
entries.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D42829

John Baldwin [Wed, 29 Nov 2023 18:31:47 +0000 (10:31 -0800)]
pci_cfgreg: Add a PCI domain argument to the low-level register API

This commit changes the API of pci_cfgreg(read|write) to add a domain
argument (referred to as a segment in ACPI parlance) (note that this
is not the same as a NUMA domain, but something PCI-specific).  This
does not yet enable access to domains other than 0, but updates the
API to support domains.

Places that use hard-coded bus/slot/function addresses have been
updated to hardcode a domain of 0.  A few places that have the PCI
domain (segment) available such as the acpi_pcib_acpi.c Host-PCI
bridge driver pass the PCI domain.

The hpt27xx(4) and hptnr(4) drivers fail to attach to a device not on
domain 0 since they provide APIs to their binary blobs that only
permit bus/slot/function addressing.

The x86 non-ACPI PCI bus drivers all hardcode a domain of 0 as they do
not support multiple domains.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D42827

John Baldwin [Wed, 29 Nov 2023 18:31:16 +0000 (10:31 -0800)]
agp_amd64: Use <machine/pci_cfgreg.h> rather than bare prototypes

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D42826

Wraithh [Wed, 29 Nov 2023 17:55:17 +0000 (19:55 +0200)]
Fix zoneid when USER_NS is disabled

getzoneid() should return GLOBAL_ZONEID instead of 0 when USER_NS is disabled.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ilkka Sovanto <github@ilkka.kapsi.fi>
Closes #15560

VaibhavB [Wed, 29 Nov 2023 17:34:29 +0000 (23:04 +0530)]
ZTS: get_persistent_disk_name can return truncated names

Instead of using only the 3rd element return the entire string after
the split to handle device names with dashes.
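
The difference between taking one field and taking the remainder can be sketched in Python.  The sample entry format ("a-b-my-disk-01") and the helper name are hypothetical, and this sketch assumes the name starts at the third dash-separated field; the test suite itself is shell code.

```python
def name_after_prefix(entry: str) -> str:
    """Return everything after the second dash, keeping later dashes intact."""
    parts = entry.split("-", 2)  # split at most twice
    return parts[2] if len(parts) == 3 else entry
```

For "a-b-my-disk-01", picking only the third field of a full split gives "my", while the function above returns "my-disk-01".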

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Vaibhav Bhanawat <vaibhav.bhanawat@delphix.com>
Closes #15567

Martin Matuška [Wed, 29 Nov 2023 17:18:30 +0000 (18:18 +0100)]
zdb: fix printf() length for uint64_t devid

Bug introduced in 213d6829673.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Warner Losh <imp@FreeBSD.org>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #15606

Igor Ostapenko [Wed, 29 Nov 2023 12:35:41 +0000 (13:35 +0100)]
pf: fix mem leaks upon vnet destroy

Add missing cleanup actions:
- remove user defined anchor rulesets
- remove user defined ether anchor rulesets
- remove tables linked to user defined anchors
- deal with wildcard anchor peculiarities to get them removed correctly

PR: 274310
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42747

Mark Johnston [Wed, 29 Nov 2023 17:51:55 +0000 (12:51 -0500)]
ossl: Keep mutable AES-GCM state on the stack

ossl(4)'s AES-GCM implementation keeps mutable state in the session
structure, together with the key schedule.  This was done for
convenience, as both are initialized together.  However, some OCF
consumers, particularly ZFS, assume that requests may be dispatched to
the same session in parallel.  Without serialization, this results in
incorrect output.

Fix the problem by explicitly copying per-session state onto the stack
at the beginning of each operation.
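
The pattern can be illustrated with a Python model in which each request works on a private copy of the session state.  GcmContext and process_request are hypothetical names, and the fields are placeholders; the real fix copies a C structure onto the kernel stack.

```python
import copy
from dataclasses import dataclass


@dataclass
class GcmContext:
    key_schedule: tuple  # set once at session setup, never modified afterwards
    counter: int = 0     # mutable: advances as blocks are processed
    processed: int = 0   # mutable: bytes handled so far


def process_request(session: GcmContext, nblocks: int) -> GcmContext:
    """Process one request without touching the shared session state."""
    ctx = copy.copy(session)  # private per-request copy (the "stack" copy)
    for _ in range(nblocks):
        ctx.counter += 1
        ctx.processed += 16
    return ctx
```

Because every request mutates only its own copy, two requests dispatched to the same session cannot corrupt each other's counters.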

PR: 275306
Reviewed by: jhb
Fixes: 9a3444d91c70 ("ossl: Add a VAES-based AES-GCM implementation for amd64")
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42783

Warner Losh [Wed, 29 Nov 2023 15:26:29 +0000 (08:26 -0700)]
openzfs: unbreak 32-bit builds.

32-bit builds are broken. Fix that by using PRIu64 instead of a
bare '%lu'.

Feel free to revert when upstream has this fixed. I'm agnostic as to the
proper fix, but don't have the time to fight upstreaming this on top of
everything else.

Alan Somers [Wed, 12 Jul 2023 20:46:27 +0000 (14:46 -0600)]
zfsd: fault disks that generate too many I/O delay events

If ZFS reports that a disk had at least 8 I/O operations over 60s that
were each delayed by at least 30s (implying a queue depth > 4 or I/O
aggregation, obviously), fault that disk.  Disks that respond this
slowly can degrade the entire system's performance.
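
The counting rule (at least 8 delay events within 60 seconds) amounts to a sliding-window check, sketched below in Python.  DelayTracker is a hypothetical name, and the model only counts events that ZFS has already classified as delayed; it does not reproduce zfsd's C++ implementation.

```python
from collections import deque


class DelayTracker:
    """Sliding-window counter for I/O delay events on one disk."""

    def __init__(self, threshold=8, window=60.0):
        self.threshold = threshold  # events needed to fault the disk
        self.window = window        # window length in seconds
        self.events = deque()

    def record(self, now):
        """Record a delay event at time `now`; return True if the disk should be faulted."""
        self.events.append(now)
        # Drop events that have fallen out of the window.
        while self.events[0] < now - self.window:
            self.events.popleft()
        return len(self.events) >= self.threshold
```

Eight events one second apart trip the threshold on the eighth call, while events arriving every ten seconds never accumulate to eight within the window.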

MFC after: 2 weeks
Sponsored by: Axcient
Reviewed by: delphij
Differential Revision: https://reviews.freebsd.org/D42825

Warner Losh [Wed, 29 Nov 2023 01:50:57 +0000 (18:50 -0700)]
mpi3mr: Minor tweak to task queue pausing

Use a while loop with cancel / drain to make sure that all tasks have
completed before proceeding to reset.

Suggested by: jhb
Sponsored by: Netflix

Warner Losh [Wed, 29 Nov 2023 01:50:52 +0000 (18:50 -0700)]
mpi3mr: Assume dma_hiaddr is BUS_SPACE_MAXADDR

No sense having a variable for this. So use BUS_SPACE_MAXADDR and remove
dma_hiaddr from softc.

Suggested by: jhb
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D42808

Warner Losh [Wed, 29 Nov 2023 01:50:47 +0000 (18:50 -0700)]
mpi3mr: Replace can't happen DataLength == 0 with an assert

Replace the test for DataLength == 0 with an assert. It can't happen,
but an assert doesn't hurt. Emacs removed some trailing white space too.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D42807

Alexander Motin [Wed, 29 Nov 2023 01:50:39 +0000 (18:50 -0700)]
mpi3mr: Use template for main busdma tag.

Use the simpler template code for the parent busdma tag for all I/O to
this card.

Reviewed by: mav, jhb, imp
Differential Revision: https://reviews.freebsd.org/D42607

Alexander Motin [Wed, 29 Nov 2023 01:50:30 +0000 (18:50 -0700)]
mpi3mr: Make these bus_dmamap_load calls synchronous

These calls "should" all be synchronous. There's no bouncing that's
needed for them (at least in the typical case that we have a sane card
that has more bits of dma addresses decoded than we have memory), so
there are no errors possible. Ensure these calls are really synchronous
with BUS_DMA_NOWAIT flags (which should never fail now that the
bus_dmamem_alloc() has succeeded).

Reviewed by: mav, jhb, imp
Differential Revision: https://reviews.freebsd.org/D42606

Alexander Motin [Wed, 29 Nov 2023 01:50:24 +0000 (18:50 -0700)]
mpi3mr: Fix MAXPHYS usage

This usage is obsolete. Replace with maximum bus space size. maxphys
will sort itself out at higher levels.

Reviewed by: mav, jhb, imp
Differential Revision: https://reviews.freebsd.org/D42605

Warner Losh [Wed, 29 Nov 2023 01:50:10 +0000 (18:50 -0700)]
mpi3mr: Add firmware version

Publish the firmware version on the card like we do for mps/mpr.

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D42588

Warner Losh [Wed, 29 Nov 2023 01:49:56 +0000 (18:49 -0700)]
mpi3mr: Trivial trailing white space reduction

Sponsored by: Netflix

Warner Losh [Wed, 29 Nov 2023 01:49:49 +0000 (18:49 -0700)]
mpi3mr: Honor the dma mask from IOCFacts

The number of significant bits that are decoded is returned in the flags
field of the IOCFacts structure from the device.  Rather than assume the
worst with a pessimal 32-bit maximum, look at this value and pass it
along to all the dma map creation requests.
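
Turning a count of decoded address bits into a highest usable address is simple arithmetic, sketched here in Python.  How the bit count is extracted from the IOCFacts flags field is not shown in this message, so the function takes the count directly; dma_high_addr is a hypothetical helper name.

```python
def dma_high_addr(addr_bits: int) -> int:
    """Highest address reachable with the given number of decoded address bits."""
    if not 1 <= addr_bits <= 64:
        raise ValueError("address width must be between 1 and 64 bits")
    return (1 << addr_bits) - 1
```

A card that decodes 32 bits yields 0xFFFFFFFF, the pessimal limit mentioned above, while wider masks raise the limit accordingly.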

A lot of those creations are repetitive and could just inherit from the
base tag if we moved to the templated interface.  This is called out as
desirable future work not done at this time.

In addition, due to a chicken and an egg problem, we have to allocate
some of the maps with a 32-bit loaddr.  These are the ones we need to
read iocfacts.  And they are fine to be so restricted: they are little
used after startup, and when they are used, bouncing is fine.
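
As a rough illustration of the idea above, the sketch below turns a count of
decoded address bits into the highest bus address a device can reach. The
name dma_lowaddr_from_bits and the 32-bit fallback for a zero count are
assumptions made for this example; how the driver extracts the count from
the IOCFacts flags field is not shown.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a driver function): convert the number of
 * address bits a device decodes into the highest address it can reach.
 */
static uint64_t
dma_lowaddr_from_bits(unsigned int bits)
{
	if (bits == 0)
		return (UINT64_C(0xFFFFFFFF));	/* assumed fallback: 32-bit */
	if (bits >= 64)
		return (UINT64_MAX);		/* full 64-bit addressing */
	return ((UINT64_C(1) << bits) - 1);
}
```

A value computed this way is what one would expect to pass as the low
address limit when creating DMA tags, in place of a fixed 32-bit maximum.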

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D42559

7 months agompi3mr: Fix EINPROGRESS errors hanging the card
Warner Losh [Wed, 29 Nov 2023 01:49:39 +0000 (18:49 -0700)]
mpi3mr: Fix EINPROGRESS errors hanging the card

Move enqueueing of commands to bus_dmamap_load_ccb callback

Fix a fundamental difference between FreeBSD and Linux. On Linux, your dma
load callback always happens before the load returns, so drivers are
written to load the map, then submit to hardware. On FreeBSD, the callback
may be deferred and the load may return EINPROGRESS. This means the
callback is responsible for queueing the request to the hardware after the
SGL list is created. Make a number of interrelated changes:

At the end of mpi3mr_prepare_sgls, add a call to mpi3mr_enqueue_request.

Split the hardware submission out from the end of mpi3mr_action_scsiio
and move it into a new routine mpi3mr_enqueue_request.

Move all error completion from the end of mpi3mr_action_scsiio to where
the error is detected. We cannot pass errors back from the
mpi3mr_enqueue_request to do this on a 'failed' mpi3mr in a centralized
place (since it has to be fire and forget).

Add comments about zero length SGLs never making it into
mpi3mr_prepare_sgls. Keep the code there for the moment, but we only set
cm->data to non-NULL when scsiio_req->DataLength is not zero. So the
datalength can't be zero and we can't send the zero SGLs.

Add comments about other "impossible" tests in mpi3mr_prepare_sgls that
really should be simple asserts of some flavor.

Eliminate cm->error_code, since we can't pass data back from the
mpi3mr_prepare_sgl callback anymore.

In mpi3mr_map_request, call mpi3mr_enqueue_request for the no data case.
This seems to work even though we've not done the special zero length
handling that was in mpi3mr_prepare_sgls, giving further evidence to it
not actually being needed. This is needed for SCSI CDBs that have no
data to pass to the drive like TEST UNIT READY.

With this change, and the prior ones, we're now able to run with mpi3mr
on 128GB systems and very heavy disk load (so many buffers land > 4GB:
the driver instructs busdma to never use memory above 4GB, which may be
too conservative, but an issue for another time).
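
The control flow described above can be modeled in userspace roughly as
follows. This is a sketch only: every model_* name is invented for the
example and none of them is a busdma or mpi3mr interface. It shows why the
hardware submission has to live in the load callback: the command is
enqueued whether the callback runs immediately or is deferred.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace model only; none of these names are real busdma or driver APIs. */
struct model_cmd {
	int sgl_built;		/* set once the scatter/gather list exists */
	int enqueued;		/* set once the command is handed to "hardware" */
};

typedef void (*model_load_cb)(struct model_cmd *);

/* One-slot queue standing in for a list of deferred callbacks. */
static struct model_cmd *pending_cmd;
static model_load_cb pending_cb;

/* Stand-in for the hardware submission step. */
static void
model_enqueue_request(struct model_cmd *cm)
{
	cm->enqueued = 1;
}

/* Load callback: build the SGL, then submit from here, not from the caller. */
static void
model_prepare_sgls(struct model_cmd *cm)
{
	cm->sgl_built = 1;
	model_enqueue_request(cm);
}

/*
 * Stand-in for a map load.  When 'defer' is set, the callback is parked
 * and EINPROGRESS is returned instead of running the callback inline.
 */
static int
model_dmamap_load(struct model_cmd *cm, model_load_cb cb, int defer)
{
	if (defer) {
		pending_cmd = cm;
		pending_cb = cb;
		return (EINPROGRESS);
	}
	cb(cm);
	return (0);
}

/* Run the parked callback, as would happen once resources free up. */
static void
model_run_deferred(void)
{
	if (pending_cb != NULL) {
		pending_cb(pending_cmd);
		pending_cb = NULL;
		pending_cmd = NULL;
	}
}
```

In this model, a caller that submitted to hardware right after
model_dmamap_load() returned would send a command with no SGL in the
deferred case, which is the failure mode the commit addresses.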

Sponsored by: Netflix
Reviewed by: sumit.saxena_broadcom.com, mav, jhb
Differential Revision: https://reviews.freebsd.org/D42543

7 months agompi3mr: Cleanup setting of status in processing scsiio requests
Warner Losh [Wed, 29 Nov 2023 01:49:30 +0000 (18:49 -0700)]
mpi3mr: Cleanup setting of status in processing scsiio requests

More uniformly use mpi3mr_set_ccbstatus in mpi3mr_action_scsiio.  The
routine mostly used it, but also has setting of status by hand. In those
cases where we want to error out the request, use this routine.

As part of this, move setting CAM_SIM_QUEUED later in the function to
when we're sure it's been queued. Remove the places we clear it before
this.

Sponsored by: Netflix
Reviewed by: mav, jhb
Differential Revision: https://reviews.freebsd.org/D42542

7 months agompi3mr: Only set callout_owned when we create a timeout
Warner Losh [Wed, 29 Nov 2023 01:49:24 +0000 (18:49 -0700)]
mpi3mr: Only set callout_owned when we create a timeout

Since we assume there's a timeout to cancel when this is true, only set
it true when we set the timeout. Otherwise we may try to cancel a timeout
when there's been an error in submission.
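
A minimal sketch of the invariant described above, using invented names
(model_cmd, model_submit, and model_complete are not driver identifiers):
the flag becomes true only at the point where the timeout is armed, so the
completion path never cancels a timeout that was never created.

```c
#include <stdbool.h>

/* Hypothetical command state for this sketch; not the driver's struct. */
struct model_cmd {
	bool callout_owned;	/* true only while a timeout is armed */
	int cancels;		/* how many times a timeout was cancelled */
};

/* Submit a command; arm the timeout (and set the flag) only on success. */
static int
model_submit(struct model_cmd *cm, int submit_error)
{
	cm->callout_owned = false;
	if (submit_error != 0)
		return (submit_error);	/* no timeout was created */
	cm->callout_owned = true;	/* timeout armed here */
	return (0);
}

/* Completion path: cancel the timeout only if one was created. */
static void
model_complete(struct model_cmd *cm)
{
	if (cm->callout_owned) {
		cm->cancels++;
		cm->callout_owned = false;
	}
}
```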

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D42541

7 months agompi3mr: Minor style fix
Warner Losh [Wed, 29 Nov 2023 01:49:16 +0000 (18:49 -0700)]
mpi3mr: Minor style fix

Fold two lines to make this more readable.

Sponsored by: Netflix
Reviewed by: mav, jhb
Differential Revision: https://reviews.freebsd.org/D42540

7 months agompi3mr: Reduce the scope of the reset_mutext
Warner Losh [Wed, 29 Nov 2023 01:49:08 +0000 (18:49 -0700)]
mpi3mr: Reduce the scope of the reset_mutext

Reduce the scope of reset_mutext to protect the msleep in the watch dog
thread as well as the MPI3MR_FLAGS_SHUTDOWN bit. Use it to protect the
wakeup in mpi3mr_detach so this thread can exit sooner when we're trying
to do an orderly shutdown. Optimize the flow to check the sleep and
other conditions before going to sleep.

It's an open question if this should protect sc->unrecoverable, and if
we should wakeup the watchdog thread when we set it. We might also want
to move to booleans for the three flags that we have now in
mpi3mr_flags. There are a number of U8s that should really be bools and
we might want to also group them together to pack softc better.

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D42539

7 months agompi3mr: Remove unused fields in struct mpi3mr_cmd
Warner Losh [Wed, 29 Nov 2023 01:49:01 +0000 (18:49 -0700)]
mpi3mr: Remove unused fields in struct mpi3mr_cmd

All of these fields are either unused, or just initialized. Remove
them. This saves about 1MB of memory for the cards that I have which can
do 8k transactions at once.

Sponsored by: Netflix
Reviewed by: mav, jhb
Differential Revision: https://reviews.freebsd.org/D42538

7 months agompi3mr: Don't hold fwevt_lock over call to taskqueue_drain
Warner Losh [Wed, 29 Nov 2023 01:48:48 +0000 (18:48 -0700)]
mpi3mr: Don't hold fwevt_lock over call to taskqueue_drain

Holding fwevt_lock when we call taskqueue_drain can lead to deadlock
because it's draining a queue that needs fwevt_lock to do work, so that
other thread will try to take out the lock and block, making the thread
never finish and taskqueue_drain never complete. There's a witness
warning/error for this which was exposed when the lock was converted to
a MTX_DEF lock from a MTX_SPIN prior to committing to the FreeBSD tree.

The lock appears to be to protect against additional items being added
to the event list while we're doing a reset. Since the taskqueue is
blocked, items can get added to the list, but won't be processed during
the reset, but there is still a (likely small) race between the
taskqueue_drain and the taskqueue_block calls where an interrupt could
fire on another CPU, resulting in a task being enqueued and started
before the block can take effect. The only way to fix that race is to
turn off interrupt processing during a reset. So we replace a deadlock
with a smaller race.

Sponsored by: Netflix
Reviewed by: sumit.saxena_broadcom.com, mav, jhb
Differential Revision: https://reviews.freebsd.org/D42537

7 months agosys/sys: Remove some more vestiges of the $FreeBSD$
Konstantin Belousov [Tue, 28 Nov 2023 22:56:03 +0000 (00:56 +0200)]
sys/sys: Remove some more vestiges of the $FreeBSD$

Reviewed by: imp
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42824

7 months agonetlink: Add tests when adding an interface route
Jose Luis Duran [Tue, 28 Nov 2023 19:58:03 +0000 (14:58 -0500)]
netlink: Add tests when adding an interface route

Add tests for adding a route using an interface only (without an IP
address).

Reviewed by: rcm
Approved by: kp (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41436

7 months agoZIL: Assert record sizes in different places
Alexander Motin [Tue, 28 Nov 2023 21:35:14 +0000 (16:35 -0500)]
ZIL: Assert record sizes in different places

This should make sure the log is written without overflows.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15517

7 months agomodule/icp/asm-arm/sha2: fix compiling on armv5/6
Shengqi Chen [Wed, 22 Nov 2023 14:27:24 +0000 (22:27 +0800)]
module/icp/asm-arm/sha2: fix compiling on armv5/6

The `adr` insn in the neon kernel generates a compile
error on armv5/6 targets. Fix that by using `ldr`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #15557

7 months agomodule/icp/asm-arm/sha2: auto detect __ARM_ARCH
Shengqi Chen [Wed, 22 Nov 2023 13:58:47 +0000 (21:58 +0800)]
module/icp/asm-arm/sha2: auto detect __ARM_ARCH

This patch uses __ARM_ARCH set by compiler (both
GCC and Clang have this) whenever possible instead
of hardcoding it to 7. This change allows code to
compile on earlier ARM architectures such as armv5te.
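
The selection logic described above might look like the following sketch.
SHA2_ARM_ARCH and sha2_target_arch are hypothetical names chosen for this
example; only __ARM_ARCH and the former hardcoded value of 7 come from the
commit message.

```c
/*
 * SHA2_ARM_ARCH is a hypothetical macro name used only in this sketch.
 * Prefer the compiler-provided __ARM_ARCH; fall back to 7, the value that
 * was previously hardcoded, when the compiler does not define it.
 */
#if defined(__ARM_ARCH)
#define	SHA2_ARM_ARCH	__ARM_ARCH
#else
#define	SHA2_ARM_ARCH	7
#endif

/* Report the architecture level the code was built for. */
static int
sha2_target_arch(void)
{
	return (SHA2_ARM_ARCH);
}
```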

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #15557

7 months agoroute: introduce add interface route test cases
R. Christian McDonald [Tue, 28 Nov 2023 18:18:15 +0000 (13:18 -0500)]
route: introduce add interface route test cases

As a followup to D41330 and D41436, this patch introduces two new tests
for sbin/route: interface_route_v[46].

These tests fail without D41330.

Reviewed by: kp
Approved by: kp (mentor)
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")

7 months agonetlink: fix adding an interface route
KUROSAWA Takahiro [Tue, 28 Nov 2023 18:14:50 +0000 (13:14 -0500)]
netlink: fix adding an interface route

"route add <host> -iface <netif>" for a netif without an IPv4/IPv6
address fails with EINVAL. Need to use a link-level ifaddr for gw if
an ifaddr for dst is not found as the rtsock-based implementation does.

PR: 275341
Reported by: Sean Cody <sean@tinfoilhat.ca>
Reviewed by: rcm
Tested by: rcm
Approved by: kp (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41330

7 months agozfs: merge openzfs/zfs@688514e47
Martin Matuska [Tue, 28 Nov 2023 20:35:02 +0000 (21:35 +0100)]
zfs: merge openzfs/zfs@688514e47

Notable upstream pull request merges:
 #15532 c1a47de86 zdb: Fix zdb '-O|-r' options with -e/exported zpool
 #15535 cf3316633 ZVOL: Minor code cleanup
 #15541 803a9c12c brt: lift internal definitions into _impl header
 #15541 213d68296 zdb: show BRT statistics and dump its contents
 #15543 a49087510 ZIL: Refactor TX_WRITE encryption similar to
                  TX_CLONE_RANGE
 #15543 27d8c23c5 ZIL: Do not encrypt block pointers in lr_clone_range_t
 #15549 67894a597 unnecessary alloc/free in dsl_scan_visitbp()
 #15551 126efb588 FreeBSD: Fix the build on FreeBSD 12
 #15563 acb33ee1c FreeBSD: Fix ZFS so that snapshots under .zfs/snapshot are
                  NFS visible
 #15564 7bbd42ef4 Don't allow attach to a raidz child vdev
 #15566 688514e47 dmu_buf_will_clone: fix race in transition back to NOFILL
 #15571 30d581121 dnode_is_dirty: check dnode and its data for dirtiness

Obtained from: OpenZFS
OpenZFS commit: 688514e4704bdee4551d25960febd322ac26f297

7 months agoifconfig: add -D option to print driver name for interface
Mike Karels [Tue, 28 Nov 2023 19:47:37 +0000 (13:47 -0600)]
ifconfig: add -D option to print driver name for interface

Add -D option to add the driver name and unit number to ifconfig output
for normal display, including -a.  Use ifconfig_get_orig_name() from
libifconfig to fetch the name.  Note that this is the original name
for many drivers, but not for some exceptions like epair (which appends
'a' or 'b' to the unit number).  epair interface pairs both display
as "epair0", etc.  Make -v imply -D; might as well be fully verbose.

MFC after: 1 week
Reviewed by: zlei, kp
Differential Revision: https://reviews.freebsd.org/D42721

7 months agoossl: Fix handling of separate AAD buffers in ossl_aes_gcm()
Mark Johnston [Tue, 28 Nov 2023 19:35:49 +0000 (14:35 -0500)]
ossl: Fix handling of separate AAD buffers in ossl_aes_gcm()

Consumers may optionally provide a reference to a separate buffer
containing AAD, but ossl_aes_gcm() didn't handle this and would thus
compute an incorrect digest.

Fixes: 9a3444d91c70 ("ossl: Add a VAES-based AES-GCM implementation for amd64")
Reviewed by: jhb
MFC after: 3 days
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D42736

7 months agoLinux 6.6 compat: fix configure error with clang (#15558)
Jaron Kent-Dobias [Tue, 28 Nov 2023 19:34:40 +0000 (20:34 +0100)]
Linux 6.6 compat: fix configure error with clang (#15558)

With Linux v6.6.x and clang 16, a configure step fails on a warning that
later results in an error while building, due to 'ts' being
uninitialized. Add a trivial initialization to silence the warning.

Signed-off-by: Jaron Kent-Dobias <jaron@kent-dobias.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>

7 months agocompiler-rt: remove unnecessary include
Dimitry Andric [Tue, 28 Nov 2023 18:17:36 +0000 (19:17 +0100)]
compiler-rt: remove unnecessary include

This is to sync the code with upstream, see:
https://github.com/llvm/llvm-project/pull/73439#discussion_r1406644942

Fixes: 4c9a0adad182
MFC after: 3 days

7 months agodmu_buf_will_clone: fix race in transition back to NOFILL
Rob N [Tue, 28 Nov 2023 17:53:04 +0000 (04:53 +1100)]
dmu_buf_will_clone: fix race in transition back to NOFILL

Previously, dmu_buf_will_clone() would roll back any dirty record, but
would not clean out the modified data nor reset the state before
releasing the lock. That leaves the last-written data in db_data, but
the dbuf in the wrong state.

This is eventually corrected when the dbuf state is made NOFILL, and
dbuf_noread() called (which clears out the old data), but at this point
it's too late, because the lock was already dropped with that invalid
state.

Any caller acquiring the lock before the call into
dmu_buf_will_not_fill() can find what appears to be a clean, readable
buffer, and would take the wrong state from it: it should be getting the
data from the cloned block, not from earlier (unwritten) dirty data.

Even after the state was switched to NOFILL, the old data was still not
cleaned out until dbuf_noread(), which is another gap for a caller to
take the lock and read the wrong data.

This commit fixes all this by properly cleaning up the previous state
and then setting the new state before dropping the lock. The
DBUF_VERIFY() calls confirm that the dbuf is in a valid state when the
lock is dropped.

Sponsored-by: Klara, Inc.
Sponsored-By: OpenDrives Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #15566
Closes #15526

7 months agonullfs: do not allow bypass on copy_file_range()
Konstantin Belousov [Sat, 18 Nov 2023 09:23:22 +0000 (11:23 +0200)]
nullfs: do not allow bypass on copy_file_range()

There must be no callers of VOP_COPY_FILE_RANGE() except
vn_copy_file_range(), which does enough to find the write-vnodes where
to call the VOP.

Reviewed by: markj, Olivier Certner <olce.freebsd@certner.fr>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42603

7 months agovn_copy_file_range(): provide ENOSYS fallback to vn_generic_copy_file_range()
Konstantin Belousov [Sat, 18 Nov 2023 08:59:19 +0000 (10:59 +0200)]
vn_copy_file_range(): provide ENOSYS fallback to vn_generic_copy_file_range()
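
The fallback named in the title can be sketched as below. The types and
model_* functions are invented for illustration and do not mirror the
kernel's VFS interfaces; the only behavior taken from the commit title is
that an ENOSYS result from the filesystem-specific operation leads to the
generic implementation being used.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical types for this sketch; not the kernel's VFS interfaces. */
typedef int (*copy_op_t)(size_t len, size_t *copied);

/* Generic path: always available. */
static int
model_generic_copy(size_t len, size_t *copied)
{
	*copied = len;
	return (0);
}

/* A filesystem that does not implement the operation. */
static int
model_unsupported_copy(size_t len, size_t *copied)
{
	(void)len;
	(void)copied;
	return (ENOSYS);
}

/* Try the filesystem-specific operation; fall back on ENOSYS only. */
static int
model_copy_file_range(copy_op_t fs_op, size_t len, size_t *copied)
{
	int error;

	error = fs_op(len, copied);
	if (error == ENOSYS)
		error = model_generic_copy(len, copied);
	return (error);
}
```

Other error values are passed through unchanged in this sketch, so a real
failure in the filesystem-specific path is not masked by the fallback.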

Reviewed by: markj, Olivier Certner <olce.freebsd@certner.fr>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42603

7 months agovn_copy_file_range(): find write vnodes on which to call the VOP
Konstantin Belousov [Sat, 18 Nov 2023 08:57:44 +0000 (10:57 +0200)]
vn_copy_file_range(): find write vnodes on which to call the VOP

Reviewed by: markj, Olivier Certner <olce.freebsd@certner.fr>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42603

7 months agoVFS: add VOP_GETLOWVNODE()
Konstantin Belousov [Sat, 18 Nov 2023 08:55:48 +0000 (10:55 +0200)]
VFS: add VOP_GETLOWVNODE()

It is similar to VOP_GETWRITEMOUNT(), and for given vnode vp should
return the lower vnode which would actually handle write to vp.
Flags allow specifying FREAD or FWRITE for the benefit of a possible
unionfs implementation.
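
As a toy model of the idea, the sketch below walks a stack of layered nodes
down to the one that would handle the I/O. struct model_vnode and
model_getlowvnode are made up for this example, the FREAD/FWRITE flags are
omitted, and the real VOP is implemented per filesystem rather than as a
single loop.

```c
#include <stddef.h>

/* Hypothetical model of a stacked vnode; not the kernel's struct vnode. */
struct model_vnode {
	struct model_vnode *lower;	/* NULL when this layer handles I/O itself */
};

/* Follow the stack down to the vnode that would actually handle the write. */
static struct model_vnode *
model_getlowvnode(struct model_vnode *vp)
{
	while (vp != NULL && vp->lower != NULL)
		vp = vp->lower;
	return (vp);
}
```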

Reviewed by: markj, Olivier Certner <olce.freebsd@certner.fr>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42603

7 months agoEVFILT_TIMER: initialize stop timer list in type-stable proc init, instead of fork
Konstantin Belousov [Tue, 28 Nov 2023 15:42:49 +0000 (17:42 +0200)]
EVFILT_TIMER: initialize stop timer list in type-stable proc init, instead of fork

Since a kqueue timer may exist after the process that created it has exited
(same scenario with rfork(2) as in PR 275286), make the tailq
p_kqtim_stop accessed by filt_timerdetach() type-stable.

Noted and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42777

7 months agoEVFILT_SIGNAL: do not use target process pointer on detach
Konstantin Belousov [Tue, 28 Nov 2023 12:51:54 +0000 (14:51 +0200)]
EVFILT_SIGNAL: do not use target process pointer on detach

It is enough to know the knlist to remove from it, and the list is
autodestroyed on last removal.

PR: 275286
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42777

7 months agoRevert "kqueue: on process exit, force-clear its registered signal events"
Konstantin Belousov [Tue, 28 Nov 2023 12:32:24 +0000 (14:32 +0200)]
Revert "kqueue: on process exit, force-clear its registered signal events"

This reverts commit 393ac29f0b8be068c8e46f76c2eeee07d20ea4df.  A
different fix is following, which preserves the semantics required by the
sys.kqueue.proc3_test.proc3 test.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
PR: 275286
Differential revision: https://reviews.freebsd.org/D42777

7 months agounnecessary alloc/free in dsl_scan_visitbp()
Matthew Ahrens [Tue, 28 Nov 2023 17:20:48 +0000 (09:20 -0800)]
unnecessary alloc/free in dsl_scan_visitbp()

Clean up code in dsl_scan_visitbp() by removing an unnecessary
alloc/free and `goto`.  This has the side benefit of reducing CPU usage,
which is only really noticeable if we are not doing i/o for the leaf
blocks, like when `zfs_no_scrub_io` is set.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #15549

7 months agopst-raid: De-pessimize the building of i386 kernels
Warner Losh [Tue, 28 Nov 2023 17:14:04 +0000 (10:14 -0700)]
pst-raid: De-pessimize the building of i386 kernels

Add include of sys/proc.h

Fixes: c4dacfa7f4b8

7 months agomemfd_create: don't allocate heap memory
Brooks Davis [Mon, 27 Nov 2023 17:07:06 +0000 (17:07 +0000)]
memfd_create: don't allocate heap memory

Rather than calling calloc() to allocate space for a page size array to
pass to getpagesizes(), just follow the getpagesizes() implementation
and allocate MAXPAGESIZES elements on the stack.  This avoids the need
for the allocation.

While this does mean that a new libc is required to take advantage of a
new huge page size, that was already true due to getpagesizes() using a
static buffer of MAXPAGESIZES elements.
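
The stack-array approach can be sketched as follows. MODEL_MAXPAGESIZES,
model_getpagesizes, and the two page sizes are stand-ins invented so the
example is self-contained; the real MAXPAGESIZES constant and
getpagesizes() come from the system headers and libc.

```c
#include <stddef.h>

/* Hypothetical stand-in for the system's MAXPAGESIZES constant. */
#define	MODEL_MAXPAGESIZES	4

/* Stand-in query: fill 'sizes' with up to 'nelem' page sizes; return the count. */
static int
model_getpagesizes(size_t sizes[], int nelem)
{
	static const size_t supported[] = { 4096, 2097152 };	/* made-up values */
	int i, n;

	n = (int)(sizeof(supported) / sizeof(supported[0]));
	if (nelem < n)
		n = nelem;
	for (i = 0; i < n; i++)
		sizes[i] = supported[i];
	return (n);
}

/* Find the largest page size using a fixed-size stack array, no heap. */
static size_t
model_largest_pagesize(void)
{
	size_t sizes[MODEL_MAXPAGESIZES];
	size_t largest = 0;
	int i, n;

	n = model_getpagesizes(sizes, MODEL_MAXPAGESIZES);
	for (i = 0; i < n; i++)
		if (sizes[i] > largest)
			largest = sizes[i];
	return (largest);
}
```

Because the array length is a compile-time constant, there is no
allocation to fail and nothing to free on any return path.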

Reviewed by: kevans, imp, emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42710

7 months agomemfd_create: move implementation to libc/gen
Brooks Davis [Mon, 27 Nov 2023 17:06:33 +0000 (17:06 +0000)]
memfd_create: move implementation to libc/gen

Due to memfd_create(3)'s construction of a path to pass to shm_open2(2),
it has a much larger than typical dependency footprint for a system
call wrapper (the list currently includes calloc, memset, sprintf, and
strlen).  As such, split it off into its own file under libc/gen to
lighten libc/sys's dependency list.

Reviewed by: kevans, imp, emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42709

7 months agogetpagesize(3): drop support for non-ELF kernels
Brooks Davis [Mon, 27 Nov 2023 17:06:25 +0000 (17:06 +0000)]
getpagesize(3): drop support for non-ELF kernels

AT_PAGESZ was introduced with ELF support in 1996 (commit
e1743d02cd14069f69a50bb8a6c626c1c6f47ddd) so we can safely count on
being able to use it to get our page size via elf_aux_info().  As such
we don't need a fallback sysctl query.

Save a few bytes of bss by dropping caching as elf_aux_info() runs
in constant time for a given query.

Reviewed by: kevans, imp, emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42708

7 months agogetpagesizes(3): drop support for kernels before 9.0
Brooks Davis [Mon, 27 Nov 2023 17:06:01 +0000 (17:06 +0000)]
getpagesizes(3): drop support for kernels before 9.0

AT_PAGESIZES and elf_aux_info were added prior to FreeBSD 9.0 in commit
ee235befcb8253fab9beea27b916f1bc46b33147.  It's safe to say that a
FreeBSD 15 libc won't work on an 8.x kernel, so drop the sysctl fallback.

Reviewed by: kevans, imp, emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42707