* The empty test case no longer fails because 89f1dcb3eb46 causes empty
files to bypass the bug.
* The bug still exists, so add a test case which exercises it.
* While here, tighten up some of the checks.
A similar patch has been submitted upstream.
PR: 274615
X-MFC-With: 89f1dcb3eb46
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D44609
It's possible for the capture buffer to be smaller than indicated by the
header length. However, pfsync_print() only took the header length into
account. As a result we could read outside of the buffer.
Check that we have at least the expected amount of data before we start
parsing.
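
A minimal userland sketch of the kind of check described above, not the actual pfsync_print() change. The struct demo_hdr layout and the demo_can_parse() helper are hypothetical and exist only for illustration; the point is that both the captured length and the header-claimed length are validated before any parsing starts.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, for illustration only. */
struct demo_hdr {
	uint8_t		version;
	uint8_t		pad;
	uint16_t	len;	/* total length claimed by the header */
};

/*
 * Return 1 if the buffer holds at least as many captured bytes as the
 * header claims, 0 otherwise.  'caplen' is the number of bytes that
 * were actually captured, which may be smaller than the claimed length.
 */
static int
demo_can_parse(const uint8_t *buf, size_t caplen)
{
	struct demo_hdr hdr;

	/* The header itself must be fully present before we read it. */
	if (caplen < sizeof(hdr))
		return (0);
	memcpy(&hdr, buf, sizeof(hdr));

	/* A claimed length shorter than the header is malformed. */
	if (hdr.len < sizeof(hdr))
		return (0);

	/* Only parse when every claimed byte was captured. */
	return (caplen >= hdr.len);
}
```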
There is a conflict between bsm/audit.h and security/audit/audit.h due
to the way that staging is being set up using .PATH to point to the
full directory and the leaf files being specified in the list. Due to
this, the bsm/audit.h was getting staged as both bsm/audit.h and
security/audit/audit.h since the sys/bsm directory is listed first in
the .PATH list.
Use sys/security in the .PATH instead of sys/security/audit and specify
the audit header files as audit/<name>.h. This ensures that we get the
correct audit.h staged for security/audit/audit.h.
Reviewed by: sjg
Obtained from: Juniper Networks, Inc.
Alan Cox [Wed, 3 Apr 2024 05:21:08 +0000 (00:21 -0500)]
arm64: correctly handle a failed BTI check in pmap_enter_l2()
If pmap_enter_l2() does not create a mapping because the BTI check
fails, then we should release the reference on the page table page
acquired from pmap_alloc_l2(). Otherwise, the page table page will
never be reclaimed.
The ddb pretty-printer currently does not print out enum values that
are not labeled (e.g. X | Y).
The enum printer was reworked to print non-labeled values.
Mark Johnston [Wed, 3 Apr 2024 15:29:25 +0000 (11:29 -0400)]
nextboot: Write nextboot.conf safely
As in the old nextboot.sh script:
- First write everything to a tempfile instead of /boot/nextboot.conf.
- fsync() the tempfile before renaming it to nextboot.conf.
Fixes: fd6d47375a78 ("rescue,nextboot: Install nextboot as a link to reboot, rm nextboot.sh")
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44572
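
The two steps listed above can be sketched in userland C as follows. This is an illustration of the general write-to-tempfile, fsync(), rename() pattern under stated assumptions, not the code from the commit: demo_write_safely() and the ".XXXXXX" tempfile naming are hypothetical, and a short write is simply treated as a failure.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Write 'data' to 'path' so that a crash leaves either the old file or
 * the complete new file, never a partial one.  Returns 0 on success.
 */
static int
demo_write_safely(const char *path, const char *data)
{
	char tmp[1024];
	size_t len = strlen(data);
	int fd, n;

	/* Create the tempfile next to the target so rename() stays atomic. */
	n = snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return (-1);
	fd = mkstemp(tmp);
	if (fd == -1)
		return (-1);

	/* Write everything, then force it to stable storage. */
	if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
		close(fd);
		unlink(tmp);
		return (-1);
	}
	if (close(fd) != 0 || rename(tmp, path) != 0) {
		unlink(tmp);
		return (-1);
	}
	return (0);
}
```

The fsync() before rename() matters because the rename could otherwise reach the disk before the file contents do.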
Previously, we would error out if we encountered a global extended
header, because we don't know what it means. This doesn't really
matter though, and traditionally, tar implementations have either
ignored them or treated them as plain files, so just ignore them.
This allows tarfs to mount tar files created by `git archive`.
MFC after: 3 days
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D44600
Reinstate returning EOVERFLOW from stats_v1_blob_clone()
a0993376ec5f (from D43179) subtly changed stats_v1_blob_clone() to stop returning EOVERFLOW in the case where the user buffer is not large enough to receive the entire statsblob. This results in any consumers which are implemented to retry on receiving EOVERFLOW to instead give up after receiving an empty statsblob header.
Fix by latching any errors recorded prior to copyout.
Reviewed by: markj
Obtained from: Netflix, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44585
Fixes: a0993376ec5f ("stats: Check for errors from copyout()")
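
A rough userland sketch of what "latching" an earlier error can look like. It is not the stats_v1_blob_clone() code: demo_copyout() is a stand-in for copyout(9) that always succeeds, demo_blob_clone() is hypothetical, and copying a truncated prefix is a simplification chosen for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for copyout(9); always succeeds in this userland sketch. */
static int
demo_copyout(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
	return (0);
}

static int
demo_blob_clone(const void *blob, size_t bloblen, void *ubuf, size_t ubuflen)
{
	int error = 0, tmperror;
	size_t n = bloblen;

	if (ubuflen < bloblen) {
		error = EOVERFLOW;	/* record the shortfall */
		n = ubuflen;		/* copy only what fits */
	}

	tmperror = demo_copyout(blob, ubuf, n);

	/*
	 * Latch: a successful copy must not overwrite an error recorded
	 * earlier, otherwise the caller never sees EOVERFLOW.
	 */
	if (error == 0)
		error = tmperror;
	return (error);
}
```

A consumer that retries with a larger buffer on EOVERFLOW depends on the first error surviving the copy step.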
sound: Move sndstat_prepare_pcm() to pcm/sndstat.c and remove sndstat_entry->handler
Since all sndstat_entry->handler fields point to sndstat_prepare_pcm(),
we can just call the function directly, without assigning it to a
function pointer and calling it indirectly.
While here, move sndstat_prepare_pcm() to pcm/sndstat.c, as it is more
suitable there.
No functional change intended.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: dev_submerge.ch, markj
Differential Revision: https://reviews.freebsd.org/D44571
Fix the logic which determines if the destination Q variable can represent the source Q variable's value with full accuracy.
The new logic is mostly self explanatory except for the value fit checks.
If b has fewer integer bits than a, 0 == (Q_GIABSVAL(a) & (~Q_TC(a, 0) << Q_NIBITS(b))) is checking that a's integer value does not have high-order bits set above what b is capable of storing.
If b has fewer fractional bits than a, 0 == (Q_GFABSVAL(a) & ~(~Q_TC(a, 0) << (Q_NFBITS(a) - Q_NFBITS(b)))) is checking that a's fractional value does not have low-order bits set below what b is capable of storing.
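
The two mask tests can be illustrated with plain 64-bit integers. This sketch deliberately does not use the Q_* macros; demo_int_fits() and demo_frac_fits() are hypothetical helpers, and the shift counts are assumed to be less than 64.

```c
#include <stdint.h>

/*
 * Integer part: 'ival' fits in 'nibits' bits if no bit at or above
 * position 'nibits' is set.  Assumes nibits < 64.
 */
static int
demo_int_fits(uint64_t ival, unsigned nibits)
{
	return ((ival & (~UINT64_C(0) << nibits)) == 0);
}

/*
 * Fractional part: a fraction held in 'a_nfbits' bits survives a move
 * to 'b_nfbits' bits (b_nfbits < a_nfbits) only if the low-order bits
 * that would be dropped are all zero.
 */
static int
demo_frac_fits(uint64_t fval, unsigned a_nfbits, unsigned b_nfbits)
{
	unsigned drop = a_nfbits - b_nfbits;

	return ((fval & ~(~UINT64_C(0) << drop)) == 0);
}
```

For example, the 4-bit fraction 1000 (one half) fits in a single fractional bit, while 1100 (three quarters) does not.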
Michael Tuexen [Mon, 1 Apr 2024 19:51:59 +0000 (21:51 +0200)]
tcp hpts: improve consistency
The target_slot argument of max_slots_available() can be NULL.
Therefore, check for this in all places.
Right now, all callers provide a non-NULL pointer.
Reported by: Coverity Scan
CID: 1527732
Reviewed by: rrs
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D44527
cp: Improved conformance when copying directories.
* When copying a directory, if the destination exists and is not a
directory, we would previously emit an error message and exit. The
correct behavior according to POSIX is to emit an error message and
continue without descending further into the source directory.
* When copying a directory, if the destination does not exist and we
fail to create it, we would previously emit an error message and
exit. The correct behavior according to POSIX is to emit an error
message and continue. Whether to descend further into the source
directory is explicitly left unspecified; GNU cp does not, which
seems to me to be the safer and less surprising option, so let's not
either.
Mark Johnston [Mon, 1 Apr 2024 17:20:55 +0000 (13:20 -0400)]
wg: Use ENETUNREACH when transmitting to a non-existent peer
The old errno value used is specifically for Capsicum and shouldn't be
co-opted in this way. It has special handling in the generic syscall
layer (see syscallret()). OpenBSD returns ENETUNREACH in this case;
let's do the same thing.
Florian Walpen [Sun, 31 Mar 2024 19:14:16 +0000 (20:14 +0100)]
snd_hdspe(4): Only buffer_copy() audio data once.
Instead of blindly copying two periods of audio data to and from DMA
buffers, keep track of the writing position and derive the actual
part of audio data that needs to be copied.
This approximately halves the number of samples copied in total.
Rick Macklem [Sun, 31 Mar 2024 19:00:08 +0000 (12:00 -0700)]
mountd.c: Add warning messages for administrative controls
When "administrative controls" (which are exports of subdirectories
within a NFS server's local file system) are used, they export the
entire local server file system. (The subdirectory only applies to
the Mount protocol used for NFSv3 mounts.)
To minimize the risk that this causes confusion w.r.t. what is exported
to NFS client(s), this patch generates warning messages for these.
Only one message is generated for each server local file system.
The messages can be silenced via a new "-A" command line option.
The mountd.8 man page will be patched via a separate commit.
Mark Johnston [Sun, 31 Mar 2024 18:14:02 +0000 (14:14 -0400)]
kern linker: Don't invoke dtors without having invoked ctors
I have a kernel module which fails to load because of an unrecognized
relocation type. link_elf_load_file() fails before the module's ctors
are invoked and it calls linker_file_unload(), which causes the module's
dtors to be executed, resulting in a kernel panic.
Add a flag to the linker file to ensure that dtors are not invoked if
unloading due to an error prior to ctors being invoked.
At the moment I only implemented this for link_elf_obj.c since
link_elf.c doesn't invoke dtors, but I refactored link_elf.c to make
them more similar.
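
The flag-based guard can be sketched as below. The struct and function names are hypothetical and much simpler than the real linker file; the sketch only shows that the unload path consults a flag set by the constructor path.

```c
/* Hypothetical, simplified stand-in for a linker file. */
struct demo_file {
	int	ctors_invoked;	/* set once constructors have run */
	int	dtor_calls;	/* counts destructor runs, for the demo */
};

static void
demo_run_ctors(struct demo_file *f)
{
	f->ctors_invoked = 1;
}

static void
demo_unload(struct demo_file *f)
{
	/* Skip destructors when loading failed before the ctors ran. */
	if (f->ctors_invoked)
		f->dtor_calls++;
}
```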
Hot-unplugging a sound device, such as a USB sound card, whilst being
consumed by an application, results in an infinite loop until either the
application closes the device's file descriptor, or the channel
automatically times out after hw.snd.timeout seconds. In the case of a
detach however, the timeout approach is still not ideal, since we want
all resources to be released immediately, without waiting for N seconds
until we can use the bus again.
The timeout mechanism works by calling chn_sleep() in chn_read() and
chn_write() (see pcm/channel.c) in order to send the thread to sleep,
using cv_timedwait_sig(). Since chn_sleep() sets the CHN_F_SLEEPING flag
while waiting for cv_timedwait_sig() to return, we can test this flag in
pcm_unregister() (called during detach) and wakeup the sleeping
thread(s) to immediately kill the channel(s) being consumed.
sound: Get rid of snd_clone and use DEVFS_CDEVPRIV(9)
Currently the snd_clone framework creates device nodes on-demand for
every channel, through the dsp_clone() callback, and is responsible for
routing audio to the appropriate channel(s). This patch gets rid of the
whole snd_clone framework (including any related sysctls) and instead
uses DEVFS_CDEVPRIV(9) to handle device opening, channel allocation and
audio routing. This results in a significant reduction in code size as
well as complexity.
Behavior that is preserved:
- hw.snd.basename_clone.
- Exclusive access of an audio device (i.e VCHANs disabled).
- Multiple processes can read from/write to the device.
- A device can only be opened as many times as the maximum allowed
channel number (see SND_MAXHWCHAN in pcm/sound.h).
- OSSv4 compatibility aliases are preserved.
Behavior changes:
Only one /dev/dspX device node is created (on attach) for each audio
device, as opposed to the current /dev/dspX.Y devices created by
snd_clone. According to the sound(4) man page, devices are not meant to
be opened through /dev/dspX.Y anyway, so it is best if we do not create
device nodes for them in the first place. As a result of this, modify
dsp_oss_audioinfo() to print /dev/dspX in the "ai->devnode", instead of
/dev/dspX.Y.
Sponsored by: The FreeBSD Foundation
MFC after: 2 months
Reviewed by: dev_submerge.ch, markj
Differential Revision: https://reviews.freebsd.org/D44411
Alan Cox [Sat, 30 Mar 2024 20:35:32 +0000 (15:35 -0500)]
arm64: enable superpage mappings by pmap_mapdev{,_attr}()
In order for pmap_kenter{,_device}() to create superpage mappings,
either 64 KB or 2 MB, pmap_mapdev{,_attr}() must request appropriately
aligned virtual addresses.
Eliot Solomon [Sun, 24 Mar 2024 19:01:47 +0000 (14:01 -0500)]
arm64 pmap: Add ATTR_CONTIGUOUS support [Part 1]
The ATTR_CONTIGUOUS bit within an L3 page table entry designates that
L3 page as being part of an aligned, physically contiguous collection
of L3 pages. For example, 16 aligned, physically contiguous 4 KB pages
can form a 64 KB superpage, occupying a single TLB entry. While this
change only creates ATTR_CONTIGUOUS mappings in a few places,
specifically, the direct map and pmap_kenter{,_device}(), it adds all
of the necessary code for handling them once they exist, including
demotion, protection, and removal. Consequently, new ATTR_CONTIGUOUS
usage can be added (and tested) incrementally.
Modify the implementation of sysctl vm.pmap.kernel_maps so that it
correctly reports the number of ATTR_CONTIGUOUS mappings on machines
configured to use a 16 KB base page size, where an ATTR_CONTIGUOUS
mapping consists of 128 base pages.
Additionally, this change adds support for creating L2 superpage
mappings to pmap_kenter{,_device}().
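
The sizes quoted above follow from simple arithmetic, sketched here with hypothetical helper names. This is not pmap code; it only checks the size of a contiguous run and the alignment such a run requires.

```c
#include <stdint.h>

/* Size in bytes of a run of 'npages' pages of 'base_page' bytes each. */
static uint64_t
demo_contig_size(uint64_t base_page, unsigned npages)
{
	return (base_page * npages);
}

/* 'size' must be a power of two; true if 'addr' is aligned to it. */
static int
demo_is_aligned(uint64_t addr, uint64_t size)
{
	return ((addr & (size - 1)) == 0);
}
```

With a 4 KB base page, 16 pages give 64 KB; with a 16 KB base page, 128 pages give 2 MB.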
thread_single(9): decline external requests for traced or debugger-stopped procs
A debugger has the power to cause unbounded delay in single-threading,
which then blocks the threaded taskqueue. The reproducer is
`truss -f timeout 2 sleep 10`.
Reported by: mjg
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44523
Robert Evans [Sat, 30 Mar 2024 00:11:52 +0000 (20:11 -0400)]
Linux 5.18+ compat: Detect filemap_range_has_page
In v5.18 `filemap_range_has_page` moved to `pagemap.h`
`pagemap.h` has been around since 3.10 so just include both
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Robert Evans <evansr@google.com>
Closes #16034
Update the I2C controller logic to be more consistent with the
newer version of the controller reference manual.
This makes it work better on modern LS/LX platforms and avoids
unnecessary delays. Also fixes a lock leak.
vf_i2c: split up and add ACPI attachments in addition to FDT
Move the code from the arm specific to the iicbus controller directory.
Split up between general logic and bus attachment code.
Add support for ACPI attachment in addition to FDT.
MFC after: 7 days
Tested by: bz (LS1088a FDT), Pierre-Luc Drouin (Honeycomb, ACPI)
Based on: D24917 by Val Packett (initial early version)
Differential Revision: https://reviews.freebsd.org/D44020
Robert Evans [Fri, 29 Mar 2024 21:59:23 +0000 (17:59 -0400)]
Fix buffer underflow if sysfs file is empty
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Jason Lee <jasonlee@lanl.gov>
Signed-off-by: Robert Evans <evansr@google.com>
Closes #16028
Closes #16035
Rob N [Fri, 29 Mar 2024 21:51:33 +0000 (08:51 +1100)]
vdev_disk: clean up spa/bdev mode conversion
43e8f6e37 introduced a subtle API misuse, in that it passed the output
from vdev_bdev_mode() back into itself. Fortunately, the
SPA_MODE_(READ|WRITE) bit values exactly map to the FMODE_(READ|WRITE) &
BLK_OPEN_(READ|WRITE) bit values, so it didn't result in a bug, but it
was hard to read and understand, so I cleaned it up.
In doing so, I noticed that the only call to vdev_bdev_mode() without
the "exclusive" flag set was in that misuse, and actually, we never do a
non-exclusive blkdev_get_by_path(). So I've just made exclusive be
always-on.
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #15995
currently, the linux kernel allows 2^20 minor devices per major device
number. ZFS reserves blocks of 2^4 minors per zvol: 1 for the zvol
itself, the other 15 for the first partitions of that zvol. as a result,
only 2^16 such blocks are available for use.
there are no checks in place to avoid overflowing into the major device
number when more than 2^16 zvols are allocated (with volmode=dev or
default). instead of ignoring this limit, which comes with all sorts of
weird knock-on effects, detect this situation and simply fail allocating
the zvol block device early on.
without this safeguard, the kernel will reject the attempt to create an
already existing block device, but ZFS doesn't handle this error and
gets confused about which zvol occupies which minor slot, potentially
resulting in kernel NULL derefs and other issues later on.
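
The limit works out as 2^20 / 2^4 = 2^16 blocks of minors. A small sketch of an early check along those lines, with hypothetical names and constants taken from the numbers in the text rather than from the ZFS source:

```c
#include <stdint.h>

#define	DEMO_MINOR_BITS		20	/* minors per major: 2^20 */
#define	DEMO_ZVOL_MINOR_BITS	4	/* minors per zvol: 2^4 */

/*
 * Compute the first minor number for zvol block 'idx', failing early
 * instead of overflowing into the major number.
 */
static int
demo_zvol_minor(uint32_t idx, uint32_t *minor)
{
	uint32_t max_blocks = 1u << (DEMO_MINOR_BITS - DEMO_ZVOL_MINOR_BITS);

	if (idx >= max_blocks)
		return (-1);
	*minor = idx << DEMO_ZVOL_MINOR_BITS;
	return (0);
}
```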
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Closes #16006
Gleb Smirnoff [Fri, 29 Mar 2024 20:35:51 +0000 (13:35 -0700)]
linux: make linux_netlink_p->msg_from_linux be able to fail
The KPI for this function was misleading. From the NetLink perspective it
looked like a function that: a) allocates new hdr, b) can fail. Neither
was true. Let the function return an error code instead of returning the
same hdr it was passed. In case future Linux NetLink compatibility
support calls for reallocating the header, pass hdr as a pointer to pointer.
With a KPI that returns an error, propagate domain conversion errors all the
way up to NetLink module. This fixes panic when unknown domain is
converted to 0xff and this invalid value is passed into NetLink
processing.
Gleb Smirnoff [Fri, 29 Mar 2024 20:35:37 +0000 (13:35 -0700)]
linux: use sa_family_t for address family conversions
Express "conversion failed" with the maximum possible value. This reduces
the number of size/signedness conversions in the code that uses the
functions.
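
A sketch of a conversion function that reports failure through the largest sa_family_t value. The names DEMO_AF_UNKNOWN and demo_linux_to_native_domain() are hypothetical; the numeric inputs 2 and 10 are the Linux values of AF_INET and AF_INET6. It relies on sa_family_t being an unsigned type, as POSIX specifies.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Largest value of the unsigned sa_family_t, used as "conversion failed". */
#define	DEMO_AF_UNKNOWN	((sa_family_t)-1)

/* Linux address family numbers (assumed inputs for the demo). */
#define	DEMO_LINUX_AF_INET	2
#define	DEMO_LINUX_AF_INET6	10

static sa_family_t
demo_linux_to_native_domain(sa_family_t ldomain)
{
	switch (ldomain) {
	case DEMO_LINUX_AF_INET:
		return (AF_INET);
	case DEMO_LINUX_AF_INET6:
		return (AF_INET6);
	default:
		return (DEMO_AF_UNKNOWN);
	}
}
```

Callers compare the result against the sentinel in the same type, so no signed/unsigned casts are needed at the call sites.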
Gleb Smirnoff [Fri, 29 Mar 2024 19:35:41 +0000 (12:35 -0700)]
if_tuntap: simplify storage of per-vnet cloners
There is no need for a separate structure nor for a linked list.
Provide each VNET with an array of pointers to if_clone that has the same
size as the driver list.
Bojan Novković [Fri, 29 Mar 2024 19:17:19 +0000 (20:17 +0100)]
kern_ctf.c: Don't print out warning messages unconditionally
The kernel CTF loading routines print various warnings when attempting
to load CTF data from an ELF file. After the changes in c21bc6f3c242
those warnings are unnecessarily printed for each kernel module
that was compiled without CTF data.
The kernel linker already uses the bootverbose flag to conditionally
print CTF loading errors. This patch alters kern_ctf.c
routines to do the same.
Gleb Smirnoff [Fri, 29 Mar 2024 19:16:59 +0000 (12:16 -0700)]
inpcb: fully retire inp_ppcb pointer
Before a protocol specific control block started to embed inpcb in self
(see 0aa120d52f3c, e68b3792440c, 483fe96511ec) this pointer used to point
at it.
Retain kf_sock_inpcb field in the struct kinfo_file in <sys/user.h>. The
exp-run detected a minimal use of the field in ports:
* sysutils/lsof - patched upstream
* net-mgmt/netdata - patch accepted upstream
* emulators/qemu-user-static - upstream master branch seems not using
the field anymore
We can keep the field around for some time, but eventually it may be
reused for something else.
George Wilson [Fri, 29 Mar 2024 19:15:56 +0000 (15:15 -0400)]
Add ashift validation when adding devices to a pool
Currently, zpool add allows users to add top-level vdevs that have
different ashifts but doing so prevents users from being able to
perform a top-level vdev removal. Often times consumers may not realize
that they have mismatched ashifts until the top-level removal fails.
This feature adds ashift validation to the zpool add command and will
fail the operation if the sector size of the specified vdev does not
match the existing pool. This behavior can be disabled by using the -f
flag. In addition, new flags have been added to provide fine-grained
control to disable specific checks. These flags
are:
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Mark Maybee <mmaybee@delphix.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #15509
Prevent a use-after-free in kern_poll() by making sure the buffer's
selinfo is drained. This is required for a subsequent patch that
implements asynchronous audio device detach.
Reported by: KASAN
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D44544
Colin Percival [Fri, 29 Mar 2024 07:10:50 +0000 (00:10 -0700)]
release.sh: Don't install git if already present
Prior to this commit, we install git from ports if there is a ports
tree available and git is not installed, and we install git from pkg
otherwise -- including the case where git is already installed.
Rework the logic to not (re)install git at all if it is already
installed.
Gleb Smirnoff [Thu, 28 Mar 2024 21:10:15 +0000 (14:10 -0700)]
pfilctl: fix 'pfilctl hooks' when nothing is connected
The 'hooks' command actually worked accidentally until now. It used
PFILIOC_LISTHEADS to determine current number of hooks. This worked when
at least one head had a hook connected to it.
kerneldump: Add flag to indicate kernel core was successfully dumped
This allows for shutdown_final EVENTHANDLERs to know that a core dump
successfully occurred. Embedded systems may want to record this fact
or act on it.
Kristof Provost [Wed, 27 Mar 2024 14:47:21 +0000 (15:47 +0100)]
pf: fix reply-to after rdr and dummynet
If we redirect a packet to localhost and it gets dummynet'd it may be
re-injected later (e.g. when delayed) which means it will be passed
through ip_input() again. ip_input() will then reject the packet because
it's directed to the loopback address, but did not arrive on a loopback
interface.
Fix this by having pf set the rcvif to V_iflo if we redirect to
loopback.
See also: https://redmine.pfsense.org/issues/15363
Sponsored by: Rubicon Communications, LLC ("Netgate")
Randall Stewart [Thu, 28 Mar 2024 12:12:37 +0000 (08:12 -0400)]
Optimize HPTS so that little work is done until we have a hpts thread that is over the connection threshold
HPTS inserts a softclock for system call return that optimizes performance. However, when
no HPTS threads need the help (i.e. when they have fewer than 100 or so connections), then
there should be little work done, i.e. check the counter and return instead of running through
all the threads getting locks etc.
Robert Evans [Wed, 27 Mar 2024 21:59:16 +0000 (17:59 -0400)]
ZTS: fix flakiness in cp_files_002_pos
Fix RANDOM to not return zero.
Overwriting with `dd ... count=0` does not test anything.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Robert Evans <evansr@google.com>
Closes #16029
Alexander Motin [Mon, 18 Mar 2024 18:19:53 +0000 (14:19 -0400)]
BRT: Fix holes cloning.
- When reading L0 block pointers, handle buffers without ones and
without dirty records as holes. Those appear when dnode size
was increased, but the end was never written, so there are no new
indirection levels to store the pointers. It makes no sense to
return EAGAIN here, since sync won't create new indirection levels
until there are actual writes.
- When cloning blocks set destination hole logical birth time
to the current TXG. Otherwise if we are cloning over existing
data, newly created holes may not be properly replicated later.
Use BP_SET_BIRTH() when possible to not replicate its logic.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15994
Closes #16007
Mike Karels [Wed, 27 Mar 2024 20:10:43 +0000 (15:10 -0500)]
bsdinstall: draw attention to new network config options
The network configuration options have changed in bsdinstall, with
an Auto option to proceed directly to DHCP and IPv6 autoconfig (which
is the default) as well as Manual (the old mode). For users like me
that were used to hitting return automatically to select an interface,
but want manual configuration, attempt to call out the difference:
Change the menu caption to say "Please select a network interface
and configuration mode:" and not just an interface.
Gleb Smirnoff [Wed, 27 Mar 2024 19:19:44 +0000 (12:19 -0700)]
sockets: define shutdown(2) constants in cpp namespace
There is software that uses SHUT_RD, SHUT_WR as preprocessor defines and
its build was broken by enum declaration. Keep the enum, but provide
defines to propagate the constants to cpp namespace.
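
The general technique of keeping an enum while also exposing its constants to the preprocessor can be sketched as follows. The DEMO_* names are hypothetical and this is not a copy of the header change; each macro simply expands to the enumerator of the same name, so #ifdef tests in consumer code succeed.

```c
/* Hypothetical enum mirroring the pattern; not the real declaration. */
enum demo_shut_how {
	DEMO_SHUT_RD = 0,
#define	DEMO_SHUT_RD	DEMO_SHUT_RD
	DEMO_SHUT_WR,
#define	DEMO_SHUT_WR	DEMO_SHUT_WR
};

/* Consumer-style check: this branch is only taken if cpp sees the names. */
#if defined(DEMO_SHUT_RD) && defined(DEMO_SHUT_WR)
static const int demo_cpp_sees_constants = 1;
#else
static const int demo_cpp_sees_constants = 0;
#endif
```

A macro that expands to its own name is not expanded recursively, so the enumerators keep their enum type and values.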