Kristof Provost [Wed, 17 Jan 2024 17:11:27 +0000 (18:11 +0100)]
pf: work around icmp6 packet-too-big not being sent when binat-ing
If we're applying NPTv6 we pass a packet with a modified source and/or
destination address to the network stack.
If that packet then turns out to be larger than the MTU of the sending
interface the stack will attempt to generate an icmp6 packet-too-big
error, but may fail to look up the appropriate source address for that
error message. Even if it does, pf would still have to undo the binat
operation inside the icmp6 packet so the sending host can make sense of
the error.
We can avoid both problems entirely by having pf also perform the MTU
check (taking the potential refragmentation into account), and
generating the icmp6 error directly in pf.
Alexander Ziaee [Fri, 12 Jan 2024 22:12:48 +0000 (17:12 -0500)]
newfs_msdos.8: example for specific cluster size
The usual use case in 2024 for newfs_msdos is creating filesystems on SD cards
for older hardware. In most tutorials, they call the cluster size "allocation
size". Therefore, add a small note next to cluster size that it is also called
allocation size, and add an example for how to do this.
Cy Schubert [Sat, 20 Jan 2024 13:52:35 +0000 (05:52 -0800)]
rc.d/kdc: Support start of MIT krb5kdc
Some users wishing to use the MIT krb5kdc have discovered the
kdc script workaround applied to the MIT krb5 ports is insufficient.
Let's build into this rc script the smarts to determine whether
base or ports Heimdal kdc is being invoked or the MIT krb5kdc.
While at it, remove kdc_start_precmd(). This will simplify a future
jail patch.
Warner Losh [Sat, 20 Jan 2024 04:32:16 +0000 (21:32 -0700)]
firmware(9): Update example
Update the example to include a firmware module in the kernel from npe
to iwn. Npe was deleted 6 years ago, so it makes a poor example of how to
embed firmware in the kernel.
Jessica Clarke [Sat, 20 Jan 2024 22:07:48 +0000 (22:07 +0000)]
tools/build/make.py: Add missing comma to fix tinderbox and worlds
The missing comma meant this was interpreted as a single target called
"tinderboxworlds", and so neither tinderbox nor worlds were recognised
as being MI targets (i.e. still required TARGET(_ARCH) to be given).
Fixes: 5157b451c654 ("tools/build/make.py: Grow the list of MI targets")
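The failure mode is easy to reproduce in isolation, because Python joins
adjacent string literals into one. A minimal sketch, assuming a made-up
target list (the real list in make.py is longer, and is_mi_target() is a
hypothetical helper, not a function from that script):

```python
# Hypothetical target lists, for illustration only.
broken_targets = [
    "targets",
    "tinderbox"   # missing comma: this literal is joined with the next one
    "worlds",
    "universe",
]
fixed_targets = [
    "targets",
    "tinderbox",
    "worlds",
    "universe",
]

def is_mi_target(name, targets):
    """Return True if name is listed as a machine-independent target."""
    return name in targets

print(broken_targets)
print(is_mi_target("tinderbox", broken_targets))
print(is_mi_target("tinderbox", fixed_targets))
```

The broken list silently has three entries instead of four, and no syntax
error is raised, which is why this kind of mistake is easy to miss.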
Mark Johnston [Fri, 19 Jan 2024 19:06:16 +0000 (14:06 -0500)]
makefs: Make it possible to silence warnings about duplicate paths
When generating a VM image from an installworld mtree manifest, makefs
spits out several thousand warnings about duplicate paths in the
manifest. These are harmless and have been around for a long time (see
the phabricator revision for some more details), so let's at least have
a way to make makefs quieter.
Mark Johnston [Thu, 18 Jan 2024 21:47:52 +0000 (16:47 -0500)]
mlx5: Zero DMA memory in mlx5_alloc_cmd_msg() and alloc_cmd_page()
These functions may map more memory for DMA than is actually used, since
the allocator operates on multiples of a 4KB page size. Thus,
bus_dmamap_sync() can trigger KMSAN reports when the unused portion of
a page is not zeroed.
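The size mismatch can be illustrated with a small model. This is only a
sketch of the rounding described above, not the driver code: mapped_size()
and alloc_zeroed() are hypothetical names, and a bytearray stands in for
the DMA buffer.

```python
PAGE_SIZE = 4096  # 4KB allocation granule, as stated in the commit message

def mapped_size(requested):
    """Round a request up to a whole number of pages."""
    return (requested + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE

def alloc_zeroed(requested):
    """Model a buffer whose unused tail is zeroed rather than uninitialized."""
    return bytearray(mapped_size(requested))

buf = alloc_zeroed(100)
unused_tail = len(buf) - 100  # bytes that are mapped but never written
print(len(buf), unused_tail)
```

A sync over the whole mapping touches the unused tail as well, so zeroing
the full allocation keeps every byte it reads initialized.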
msdosfs_integrity_error(): plug possible busy leak
If taskqueue_enqueue() returned error, unbusy().
Handle parallel calls to msdosfs_integrity_error() by unbusying in
msdosfs_remount_ro() up to pending times.
Noted and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43482
msdosfs_rename(): handle errors from msdosfs_lookup_ino()
Properly working storage and correct filesystem structure indeed only
allow the EJUSTRETURN return code, but since the called function needs
to read directory blocks and (re)parse the content, the assert does not
necessarily hold.
PR: 276408
Reported by: John F. Carr
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43482
Cy Schubert [Wed, 6 Dec 2023 15:30:05 +0000 (07:30 -0800)]
kerberos: Fix numerous segfaults when using weak crypto
Weak crypto is provided by the openssl legacy provider which is
not loaded by default. Load the legacy providers as needed.
When the legacy provider is loaded into the default context the default
provider will no longer be automatically loaded. Without the default
provider the various kerberos applications and functions will abort().
This is the second attempt at this patch. Instead of linking
secure/lib/libcrypto at build time we now link it at runtime, avoiding
buildworld failures under Linux and macOS. This is because
TARGET_ENDIANNESS is undefined at pre-build time.
PR: 272835
MFC after: 3 days
X-MFC: only to stable/14
Tested by: netchild
Joerg Pulz <Joerg.Pulz@frm2.tum.de> (previous version)
Aaron LI [Wed, 17 Jan 2024 23:29:23 +0000 (23:29 +0000)]
if_wg: fix access to noise_local->l_has_identity and l_private
These members are protected by the identity lock, so rlock it in
noise_remote_alloc() and then assert that we have it held to some extent
in noise_precompute_ss().
Aaron LI [Wed, 17 Jan 2024 23:29:23 +0000 (23:29 +0000)]
if_wg: fix erroneous calculation in calculate_padding() for p_mtu == 0
In practice this is harmless; only keepalive packets may realistically have
p_mtu == 0, and they'll also have no payload so the math works out the same
either way. Still, let's prefer technical accuracy and calculate the amount
of padding needed rather than the padded length...
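The distinction between the two quantities can be sketched as follows. The
padding unit of 16 bytes and both function names are assumptions made for
illustration; they are not taken from the commit.

```python
PADDING_UNIT = 16  # assumed padding multiple (a power of two)

def padded_length(length):
    """Length after rounding up to the next multiple of PADDING_UNIT."""
    return (length + PADDING_UNIT - 1) & ~(PADDING_UNIT - 1)

def padding_needed(length):
    """Number of bytes to append so that length reaches padded_length()."""
    return padded_length(length) - length

for length in (0, 1, 16, 17):
    print(length, padded_length(length), padding_needed(length))
```

For an empty payload both functions return zero, which matches the
observation that the old calculation was harmless for keepalives; for any
other length they differ.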
Notable upstream pull request merges:
#15660 66670ba9f fix(mount): do not truncate shares not zfs mount
#15719 3bddc4dae spa: Fix FreeBSD sysctl handlers (already merged)
#15719 5a703d136 spa: Let spa_taskq_param_get()'s addition of a newline
be optional
#15721 6138af86b Stop wasting time on malloc in snprintf_zstd_header
#15723 1f5bf9600 Make zdb -R a little more sane.
#15726 20dd16d9f Make zdb -R scale less poorly
#15737 d9885b377 fix: Uber block label not always found for aux vdevs
#15737 2df2a58dc Extend aux label to add path information
#15737 b64be1624 Add path handling for aux vdevs in `label_path`
#15747 a1771d243 Fix "out of memory" error
#15752 1a11ad9d2 Fix a potential use-after-free in zfs_setsecattr()
#15772 f45dd90f3 Fix cloning into mmaped and cached file
#15781 1494e8fba Autotrim High Load Average Fix
There is a lack of proper visibility checking in kern.ttys sysctl handler
which leads to information leak about processes outside the current jail.
This can be demonstrated with pstat -t: when called from within a jail,
it will output all terminal devices including process groups and
session leader process IDs:
jail# pstat -t | grep pts/ | head
LINE INQ CAN LIN LOW OUTQ USE LOW COL SESS PGID STATE
pts/2 1920 0 0 192 1984 0 199 0 4132 27245 Oi
pts/3 1920 0 0 192 1984 0 199 16 24890 33627 Oi
pts/5 0 0 0 0 0 0 0 25 17758 0 G
pts/16 0 0 0 0 0 0 0 0 52495 0 G
pts/15 0 0 0 0 0 0 0 25 53446 0 G
pts/17 0 0 0 0 0 0 0 6702 33230 0 G
pts/19 0 0 0 0 0 0 0 14 1116 0 G
pts/0 0 0 0 0 0 0 0 0 2241 0 G
pts/23 0 0 0 0 0 0 0 20 15639 0 G
pts/6 0 0 0 0 0 0 0 0 44062 93792 G
jail# pstat -t | grep pts/ | wc -l
85
Devfs does the filtering correctly and we get only one entry:
Tino Reichardt [Wed, 17 Jan 2024 17:06:14 +0000 (18:06 +0100)]
fix: variable type with zfs-tests/cmd/clonefile.c
Compiling on arm64 freebsd-13.2 and arm64 almalinux-8 currently fails with
this error:
```
CC tests/zfs-tests/cmd/clonefile.o
tests/zfs-tests/cmd/clonefile.c:166:43: error: result of comparison of \
constant -1 with expression of type 'char' is always true \
[-Werror,-Wtautological-constant-out-of-range-compare]
while ((c = getopt(argc, argv, "crfdq")) != -1) {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~
1 error generated.
gmake[2]: *** [Makefile:8675: tests/zfs-tests/cmd/clonefile.o] Error 1
```
Fix: use correct variable type `int`.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #15783
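Why the comparison is "always true" can be modelled outside of C. The
sketch below assumes plain char is unsigned on the affected targets, as the
compiler diagnostic implies; store_in_char() is a hypothetical helper that
mimics assigning an int to an 8-bit char.

```python
GETOPT_DONE = -1  # getopt() returns -1 when no options remain

def store_in_char(value, char_is_signed):
    """Model assigning an int to a C 'char' of the given signedness."""
    byte = value & 0xFF
    if char_is_signed and byte >= 0x80:
        return byte - 0x100
    return byte

# With an unsigned char the sentinel is stored as 255, so the loop
# condition "c != -1" can never become false.
print(store_in_char(GETOPT_DONE, char_is_signed=False) != GETOPT_DONE)
# With a signed char (or, correctly, an int) the sentinel survives.
print(store_in_char(GETOPT_DONE, char_is_signed=True) != GETOPT_DONE)
```

Declaring the variable as int sidesteps the question of char signedness
entirely, since int is what getopt() returns.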
If the destination file is mmaped and the mmaped region was already
read, so it is cached, we need to update mmaped pages after successful
clone using update_pages().
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Pointed out by: Ka Ho Ng <khng@freebsd.org>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #15772
rc.d/jail: add legacy compatibility for zfs.dataset
Evaluate the jail_${jailname}_zfs_dataset variable for legacy
jail managers.
This variable can take a space separated list of datasets.
The singular was used specially to allow unmaintained jail
managers like ezjail to use this (simply rename
jail_${jailname}_zfs_datasets in the ezjail config to
jail_${jailname}_zfs_dataset).
Wei Hu [Wed, 17 Jan 2024 09:19:35 +0000 (09:19 +0000)]
mana: Fix TX CQE error handling
For an unknown TX CQE error type (probably from newer hardware),
still free the mbuf, update the queue tail, etc., otherwise the
accounting will be wrong.
Also, TX errors can be triggered by injecting corrupted packets, so
replace the mana_err logging with mana_dbg.
Reported by: NetApp
MFC after: 1 week
Sponsored by: Microsoft
Add zfs.dataset to jail(8) to add a list of ZFS datasets.
Bump FreeBSD version for jail managers to switch to native
dataset support.
Datasets are attached to the jail after the jail creation and
before the execution of any start command. Unlike current
implementations in jail managers which attach datasets after
the start command, this allows the zfs rc.d script to mount
the datasets on start.
Rob N [Tue, 16 Jan 2024 22:01:17 +0000 (09:01 +1100)]
Linux 6.7 compat: zfs_setattr fix atime update
In db4fc559c I messed up and changed this bit of code to set the inode
atime to an uninitialised value, when actually it was just supposed to
load the atime from the inode to be stored in the SA. This changes it
to what it should have been.
Ensure times change by the right amount
Previously, we only checked if the times changed at all, which missed a
bug where the atime was being set to an undefined value.
Now ensure the times change by two seconds (or thereabouts), ensuring
we catch cases where we set the time to something bonkers.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #15762
Closes #15773
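The difference between the old and new test conditions can be sketched as
below. This is not the test script itself: changed_by_about() and its
one-second tolerance are hypothetical, chosen only to show why "changed at
all" is too weak a check.

```python
def changed_by_about(before, after, expected=2.0, tolerance=1.0):
    """True if after - before is within tolerance of the expected delta."""
    return abs((after - before) - expected) <= tolerance

atime_before = 1_700_000_000.0
good_after = atime_before + 2.0   # a plausible update two seconds later
bogus_after = 0.0                 # stand-in for an undefined value

# The old style of check accepts both values, because both differ.
print(good_after != atime_before, bogus_after != atime_before)
# The stricter check accepts only the plausible one.
print(changed_by_about(atime_before, good_after),
      changed_by_about(atime_before, bogus_after))
```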
Lalufu [Tue, 16 Jan 2024 21:32:59 +0000 (22:32 +0100)]
Make sure all necessary RPM path macros are defined
When building (s)rpm files through the Makefile, a directory structure
is created in /tmp to hold the various files.
In case the user running the command has overridden some of the RPM path
settings through their user profile (for example in `~/.rpmmacros`),
these paths do not line up with the configuration, and the build fails.
Make sure all paths used are properly defined.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ralf Ertzinger <ralf@skytale.net>
Closes #15756
youzhongyang [Tue, 16 Jan 2024 21:30:58 +0000 (16:30 -0500)]
Make spl_kmem_cache size check consistent
On Linux x86_64, kmem cache can have size up to 4M,
however increasing spl_kmem_cache_slab_limit can lead
to crash due to the size check inconsistency.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #15757
Ameer Hamza [Thu, 4 Jan 2024 14:35:04 +0000 (19:35 +0500)]
Add path handling for aux vdevs in `label_path`
If the AUX vdev is added using UUID, importing the pool falls back to
opening the AUX vdev by disk name instead of UUID, due to the absence of
path information for AUX vdevs. Since the AUX label now has path
information, this PR adds path handling for it in `label_path`.
Reviewed-by: Umer Saleem <usaleem@ixsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #15737
Ameer Hamza [Thu, 4 Jan 2024 14:02:50 +0000 (19:02 +0500)]
fix: Uber block label not always found for aux vdevs
When a spare or l2cache (aux) vdev is added during pool creation,
spa->spa_uberblock is not dumped until that point. Subsequently,
the aux label is never synchronized after its initial creation,
resulting in the uberblock label remaining undumped. The uberblock
is crucial for lib_blkid in identifying the ZFS partition type. To
address this issue, we now ensure sync of the uberblock label once
if it's not dumped initially.
Reviewed-by: Umer Saleem <usaleem@ixsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #15737
Rich Ercolani [Tue, 16 Jan 2024 21:16:08 +0000 (16:16 -0500)]
Make zdb -R a little more sane.
zdb -R has a minor flaw in which it will not always print the full
output of a decompressed block. Oops.
While I was in there, I also reworked the logic so it won't try
ZLE unless everything else fails, which will hopefully avoid the
problem ZDB_NO_ZLE was intended to mitigate of reporting a lot of
false positives of ZLE compressed blocks...
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #15723
Florian Walpen [Sat, 25 Nov 2023 00:04:34 +0000 (01:04 +0100)]
sound: Fix OSS API requests for more than 8 channels
Audio devices with more than 8 channels need bitperfect mode to operate,
because the vchan processing chain is limited to 8 channels. For these devices,
let applications properly select a certain number of channels supported
by the driver, instead of mapping the request to a vchan format.
Xavier Beaudouin [Tue, 16 Jan 2024 20:44:34 +0000 (20:44 +0000)]
Add UDP encapsulation of ESP in IPv6
This patch provides UDP encapsulation of ESP packets over IPv6.
Ports the IPv4 code to IPv6 and adds support for IPv6 in udpencap.c
As required by the RFC and unlike in IPv4 encapsulation,
UDP checksums are calculated.
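For readers unfamiliar with what calculating the checksum involves, the
following sketch computes a UDP checksum over the IPv6 pseudo-header. It is
an illustration of the general algorithm, not the code added by this
patch; the addresses are documentation prefixes, and the port number and
payload bytes are arbitrary placeholders.

```python
import ipaddress
import struct

UDP_PROTO = 17  # IPv6 Next Header value for UDP

def ones_complement_sum(data):
    """16-bit one's-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(src, dst, length):
    """IPv6 pseudo-header: addresses, upper-layer length, next header."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!I3xB", length, UDP_PROTO))

def udp6_datagram(src, dst, sport, dport, payload):
    """Build a UDP header plus payload with the checksum filled in."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    total = ones_complement_sum(pseudo_header(src, dst, length)
                                + header + payload)
    csum = (~total & 0xFFFF) or 0xFFFF  # a computed zero is sent as all ones
    return struct.pack("!HHHH", sport, dport, length, csum) + payload

def udp6_checksum_ok(src, dst, datagram):
    """A receiver sums everything, checksum included, and expects 0xFFFF."""
    data = pseudo_header(src, dst, len(datagram)) + datagram
    return ones_complement_sum(data) == 0xFFFF

dgram = udp6_datagram("2001:db8::1", "2001:db8::2", 4500, 4500, b"esp-bytes")
print(udp6_checksum_ok("2001:db8::1", "2001:db8::2", dgram))
```

Because the pseudo-header includes both addresses, any rewrite of the
source or destination address in transit invalidates the checksum unless
it is recomputed.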
Direct the review request to #linuxkpi instead of #x11 as it
also is #wireless these days and possibly others in the future.
I would suggest #x11 reviewers also add themselves to #linuxkpi
instead.
Gleb Smirnoff [Tue, 16 Jan 2024 18:26:10 +0000 (10:26 -0800)]
sockets: retire sorflush()
With removal of dom_dispose method the function boils down to two
meaningful function calls: socantrcvmore() and sbrelease(). The latter is
only relevant for protocols that use generic socket buffers.
The socket I/O sx(9) lock acquisition in sorflush() is not relevant for
shutdown(2) operation as it doesn't do any I/O that may interleave with
read(2) or write(2). The socket buffer mutex acquisition inside
sbrelease() is what guarantees thread safety. This sx(9) acquisition in
soshutdown() can be tracked down to 4.4BSD times, where it used to be
sblock(), and it was carried over through the years evolving together with
sockets with no reconsideration of why we carry it over. I can't tell
if that sblock() made sense back then, but it doesn't make any today.
Gleb Smirnoff [Tue, 16 Jan 2024 18:26:10 +0000 (10:26 -0800)]
sockets: make pr_shutdown fully protocol specific method
Disassemble a one-for-all soshutdown() into protocol specific methods.
This creates a small amount of copy & paste, but makes code a lot more
self documented, as protocol specific method would execute only the code
that is relevant to that protocol and nothing else. This also fixes a
couple recent regressions and reduces risk of future regressions. The
extended KPI for the new pr_shutdown removes need for the extra pr_flush
which was added for the sake of SCTP which could not perform its shutdown
properly with the old one. Particularly for SCTP this change streamlines
a lot of code.
Some notes on why certain parts of code were copied or were not to certain
protocols:
* The (SS_ISCONNECTED | SS_ISCONNECTING | SS_ISDISCONNECTING) check is
needed only for those protocols that may be connected or disconnected.
* The above reduces into only SS_ISCONNECTED for those protocols that
always connect instantly.
* The ENOTCONN and continue processing hack is left only for datagram
protocols.
* The SOLISTENING(so) block is copied to those protocols that listen(2).
* sorflush() on SHUT_RD is copied almost to every protocol, but that
will be refactored later.
* wakeup(&so->so_timeo) is copied to protocols that can make a non-instant
connect(2), can SO_LINGER or can accept(2).
There are three protocols (netgraph(4), Bluetooth, SDP) that did not have
pr_shutdown, but old soshutdown() would still perform sorflush() on
SHUT_RD for them and also wakeup(9). Those protocols partially supported
shutdown(2), returning EOPNOTSUPP for SHUT_WR/SHUT_RDWR; now they have fully
lost shutdown(2) support. I'm pretty sure netgraph(4) and Bluetooth are okay
about that and SDP is almost abandoned anyway.
Roger Pau Monné [Tue, 16 Jan 2024 15:32:56 +0000 (16:32 +0100)]
x86/xen: fix HVM guest hypercall page setup
c7368ccb6801 didn't take into account that vm_guest will also get set up by
generic identify CPU code, and hence by the time xen_hvm_init() gets called
vm_guest will always be set if running as a Xen guest, either by the PVH entry
point code, or by generic CPU identification.
xen_hvm_init() and xen_hvm_init_hypercall_stubs() were relying on xen_domain()
returning false when running as an HVM guest, and used that in order to
figure out whether the hypercall page needed to be populated.
Get rid of such assumptions and simplify the code since legacy PVH is no
longer supported.
subr_bus: introduce device_set_descf() and modify allocation logic
device_set_descf() is a printf-like version of device_set_desc().
Allocation code has been transferred from device_set_desc_internal() to
device_set_desc_copy() and device_set_descf() to avoid complicating
device_set_desc_internal(). The "copy" argument in
device_set_desc_internal() has been replaced with a flag which is set
when the description string has been allocated with M_BUS.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: imp, markj
Differential Revision: https://reviews.freebsd.org/D43370
usb: use only usb_devinfo() in device_set_usb_desc()
device_set_usb_desc() first tries to fetch device information through
the iInterface descriptor, otherwise it falls back to usb_devinfo().
Since usb_devinfo() is both guaranteed to work, and is more verbose, get
rid of the initial iInterface attempt.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: imp, markj
Differential Revision: https://reviews.freebsd.org/D43383
sound: remove PCM_KLDSTRING() and fix status strings
PCM_KLDSTRING() prints the kernel module associated with a given audio
device only when that module is not compiled in. Get rid of
PCM_KLDSTRING() altogether and print the driver name (even for modules
that are compiled in) instead, as it implies the module as well.
While here, convert all status strings to the following dmesg-like
format:
[<port|mem> <irq>] on <driver>
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: markj, imp
Differential Revision: https://reviews.freebsd.org/D43349
snd_uaudio: provide information about the device name and attached driver
Unlike the other sound drivers, snd_uaudio(4) doesn't provide
information about the device's description and the driver it's attached
to. A side-effect of this is that applications such as mixer(8), that
fetch these strings through the OSS API's SNDCTL_CARDINFO ioctl will
show a USB audio device as:
pcm0:mixer: <USB Audio> at ? kld snd_uaudio
This patch replaces the generic "USB Audio" description with the
device's actual manufacturer and product strings, and the "at ?" string
with the driver it's attached to:
pcm0:mixer: <Focusrite Scarlett Solo USB> at uaudio0 kld snd_uaudio
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: markj, emaste
Differential Revision: https://reviews.freebsd.org/D43347
rilysh [Mon, 8 Jan 2024 06:06:55 +0000 (11:36 +0530)]
bhyve: return ENOMEM instead of EFAULT and call free() after being used
1. In basl_load() function, when allocation fails,
it returns an EFAULT instead of ENOMEM. An EFAULT
can mislead in some scenarios, whereas an ENOMEM
for an allocation function makes much more sense.
2. Call free() on addr, as it's not being used
anymore after the basl_table_append_bytes()
function.
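The first point can be illustrated with a small model of an allocating
loader. This is a sketch only: load_table() and its allocate parameter are
hypothetical and do not mirror basl_load(); the point is simply that the
error reported should describe what actually failed.

```python
import errno

def load_table(size, allocate=bytearray):
    """Hypothetical loader returning (errno_value, buffer)."""
    try:
        buf = allocate(size)
    except MemoryError:
        # The allocation failed, so report ENOMEM rather than EFAULT,
        # which would suggest a bad address instead.
        return errno.ENOMEM, None
    return 0, buf

def failing_allocator(size):
    """Stand-in allocator that always fails, to exercise the error path."""
    raise MemoryError

print(load_table(16)[0] == 0)
print(load_table(16, failing_allocator)[0] == errno.ENOMEM)
```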