Warner Losh [Wed, 3 Apr 2024 17:26:17 +0000 (11:26 -0600)]
nvme: Add LPA bits
Add all the bits from the NVMe 2.0 base specification: CMD_EFFECTS to
indicate the commands and effects log page is supported, TELEMETRY to
indicate that the telemetry log pages and protocols are supported,
PERSISTENT_EVENTS to indicate the persistent event log is supported,
LOG_PAGES_PAGE to indicate that various log pages related to log page
and command support are supported: L0, L5, L12, and L13; and
DA4_TELEMETRY to indicate that the DA4 area is supported for telemetry
data.
arm64: Add a CPU reset hook instead of expecting PSCI
Some SoCs do not include a PSCI for power management and defer it to
something else instead. Add a CPU reset hook to account for this, and
use it in the psci driver.
Reviewed by: andrew
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D44535
Collin Funk [Sun, 11 Feb 2024 04:26:38 +0000 (20:26 -0800)]
which: Use size_t instead of ssize_t for pathlen
The "pathlen" variable is the return value of strlen(3) and is then
passed as an argument to malloc(3) and memcpy(3). The size_t type
matches the prototype for these functions. The size_t type is unsigned
so it can fit larger $PATH values than ssize_t. However, in practice
ssize_t should be large enough, so this change is just for clarity.
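A minimal sketch of the pattern the which(1) change describes, not the actual which.c source; the helper name copy_path() is hypothetical:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from which.c): duplicate a PATH-style string.
 * strlen(3) returns size_t, and malloc(3) and memcpy(3) both take size_t,
 * so keeping the length in a size_t avoids signed/unsigned conversions.
 */
static char *
copy_path(const char *path)
{
	size_t pathlen;
	char *copy;

	assert(path != NULL);
	pathlen = strlen(path) + 1;	/* include the terminating NUL */
	copy = malloc(pathlen);
	if (copy == NULL)
		return (NULL);
	memcpy(copy, path, pathlen);
	return (copy);
}
```

The caller owns the returned buffer and releases it with free(3).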
LinuxKPI: Remove the temporary variable fileid from the macro request_module
The variable fileid stores the result from kern_kldload() but never gets
used. Since the third parameter `*fileid` of kern_kldload() can be NULL,
this unused variable can be safely removed.
Michael Tuexen [Fri, 5 Apr 2024 15:47:03 +0000 (17:47 +0200)]
tcp rack: fix sending
In rack_output(), idle is used as a boolean variable. So don't use it
as an int and don't clear it afterwards.
This avoids setting idle to false, when it is not intended.
Reported by: olivier
Reviewed by: rrs, rscheff
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D44610
Mark Johnston [Fri, 5 Apr 2024 15:14:36 +0000 (11:14 -0400)]
tarfs: Implement VOP_BMAP
This lets tarfs provide readahead/behind hints to the VFS, which helps
memory-mapped I/O performance, important when faulting in
executables out of a tarfs mount as one might if tarfs is used to back
the root filesystem, for example. The improvement is particularly
noticeable when the backing tarball is zstd-compressed.
The implementation simply returns the extent of the virtual block
containing the target offset, clamped by the maximum I/O size. This is
perhaps simplistic; it effectively just chooses values that would
correspond to a single VOP_READ call in tarfs_read_file().
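One way to read the clamping described for the tarfs VOP_BMAP change, as a rough sketch rather than the tarfs code. It works in bytes instead of the block counts a real VOP_BMAP reports, and every name in it (demo_run, demo_bmap_hint, bstart, blen, maxio) is an assumption made for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: bytes readable behind and ahead of an offset. */
struct demo_run {
	uint64_t behind;
	uint64_t ahead;
};

/*
 * Given a target offset inside the block [bstart, bstart + blen),
 * report how much of that block lies before and after the offset,
 * with each side clamped to the maximum I/O size.
 */
static struct demo_run
demo_bmap_hint(uint64_t off, uint64_t bstart, uint64_t blen, uint64_t maxio)
{
	struct demo_run r;

	assert(off >= bstart && off < bstart + blen);
	r.behind = off - bstart;
	r.ahead = bstart + blen - off;	/* counts the byte at "off" itself */
	if (r.behind > maxio)
		r.behind = maxio;
	if (r.ahead > maxio)
		r.ahead = maxio;
	return (r);
}
```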
The revert was not directly due to the attack (CVE-2024-3094):
our import process has removed the test cases and build scripts
that would have enabled the attack. However, reverting would
help to reduce potential confusion and false positives from
security scanners that assess risk based solely on version
numbers.
Another commit will follow to restore binary compatibility with
the liblzma 5.6.0 library by making the previously private
symbol (lzma_mt_block_size) public.
* Give link(1) its own usage message.
* Use getprogname(3) instead of rolling our own.
* Verify that the target file does not already exist.
* Add tests specific to link(1).
MFC after: 3 days
Sponsored by: Klara, Inc.
Reviewed by: allanjude
Differential Revision: https://reviews.freebsd.org/D44635
* The empty test case no longer fails because 89f1dcb3eb46 causes empty
files to bypass the bug.
* The bug still exists, so add a test case which exercises it.
* While here, tighten up some of the checks.
A similar patch has been submitted upstream.
PR: 274615
X-MFC-With: 89f1dcb3eb46
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D44609
It's possible for the capture buffer to be smaller than indicated by the
header length. However, pfsync_print() only took the header length into
account. As a result we could read outside of the buffer.
Check that we have at least the expected amount of data before we start
parsing.
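The bounds check described for pfsync_print() might look roughly like the sketch below. The header layout and the names demo_header and demo_can_parse() are hypothetical; this is not the real pfsync header or the tcpdump code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, for illustration only. */
struct demo_header {
	uint8_t		version;
	uint8_t		pad;
	uint16_t	len;	/* total length claimed by the header */
};

/*
 * Return true only if the captured buffer holds the whole header and
 * every byte the header claims to carry.
 */
static bool
demo_can_parse(const uint8_t *buf, size_t caplen)
{
	struct demo_header hdr;

	assert(buf != NULL);
	if (caplen < sizeof(hdr))
		return (false);		/* truncated inside the header */
	memcpy(&hdr, buf, sizeof(hdr));	/* avoid unaligned access */
	if (hdr.len < sizeof(hdr))
		return (false);		/* claimed length is nonsensical */
	return (caplen >= hdr.len);	/* capture may be shorter than claimed */
}
```

Parsing only proceeds when the function returns true, so a short capture is rejected before any field beyond the buffer is read.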
There is a conflict between bsm/audit.h and security/audit/audit.h due
to the way that staging is being set up using .PATH to point to the
full directory and the leaf files being specified in the list. Due to
this, the bsm/audit.h was getting staged as both bsm/audit.h and
security/audit/audit.h since the sys/bsm directory is listed first in
the .PATH list.
Use sys/security in the .PATH instead of sys/security/audit and specify
the audit header files as audit/<name>.h. This ensures that we get the
correct audit.h staged for security/audit/audit.h.
Reviewed by: sjg
Obtained from: Juniper Networks, Inc.
Alan Cox [Wed, 3 Apr 2024 05:21:08 +0000 (00:21 -0500)]
arm64: correctly handle a failed BTI check in pmap_enter_l2()
If pmap_enter_l2() does not create a mapping because the BTI check
fails, then we should release the reference on the page table page
acquired from pmap_alloc_l2(). Otherwise, the page table page will
never be reclaimed.
The ddb pretty-printer currently does not print out enum values that
are not labeled (e.g. X | Y).
The enum printer was reworked to print non-labeled values.
Mark Johnston [Wed, 3 Apr 2024 15:29:25 +0000 (11:29 -0400)]
nextboot: Write nextboot.conf safely
As in the old nextboot.sh script:
- First write everything to a tempfile instead of /boot/nextboot.conf.
- fsync() the tempfile before renaming it to nextboot.conf.
Fixes: fd6d47375a78 ("rescue,nextboot: Install nextboot as a link to reboot, rm nextboot.sh")
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44572
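The two steps listed in the nextboot commit (write to a tempfile, fsync() it, then rename) can be sketched as follows. This is an illustration under assumptions, not the nextboot source: the helper name write_file_safely() and the ".XXXXXX" temporary-name suffix are hypothetical, and a short write is simply treated as a failure:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace "path" with "data" by writing a temporary
 * file in the same directory, flushing it to stable storage, and renaming
 * it over the target.  Returns 0 on success and -1 on failure.
 */
static int
write_file_safely(const char *path, const char *data)
{
	char tmp[1024];
	size_t len;
	int fd;

	assert(path != NULL && data != NULL);
	if (snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path) >= (int)sizeof(tmp))
		return (-1);		/* path too long for the buffer */
	fd = mkstemp(tmp);
	if (fd == -1)
		return (-1);
	len = strlen(data);
	if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
		close(fd);
		unlink(tmp);
		return (-1);
	}
	if (close(fd) != 0 || rename(tmp, path) != 0) {
		unlink(tmp);
		return (-1);
	}
	return (0);
}
```

Because rename(2) replaces the target atomically, a reader sees either the old file or the complete new one, never a partially written file.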
Previously, we would error out if we encountered a global extended
header, because we don't know what it means. This doesn't really
matter though, and traditionally, tar implementations have either
ignored them or treated them as plain files, so just ignore them.
This allows tarfs to mount tar files created by `git archive`.
MFC after: 3 days
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D44600
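For the tarfs change that ignores global extended headers, the decision can be sketched as a switch on the tar header's typeflag byte. The typeflag values are the ones defined for the ustar and pax formats ('g' marks a global extended header); the enum, the function name, and the choice to reject unknown types are assumptions of this sketch, not tarfs code:

```c
#include <assert.h>

/* Hypothetical actions a reader might take for one tar entry. */
enum demo_action {
	DEMO_FILE,	/* regular file */
	DEMO_DIR,	/* directory */
	DEMO_EXTHDR,	/* per-file extended header: parse it */
	DEMO_SKIP,	/* ignore the entry and move on */
	DEMO_ERROR	/* unknown type: give up (this sketch only) */
};

static enum demo_action
demo_classify(char typeflag)
{
	switch (typeflag) {
	case '0':
	case '\0':
		return (DEMO_FILE);
	case '5':
		return (DEMO_DIR);
	case 'x':
		return (DEMO_EXTHDR);
	case 'g':
		/* Global extended header: skipped instead of failing. */
		return (DEMO_SKIP);
	default:
		return (DEMO_ERROR);
	}
}
```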
Reinstate returning EOVERFLOW from stats_v1_blob_clone()
a0993376ec5f (from D43179) subtly changed stats_v1_blob_clone() to stop returning EOVERFLOW when the user buffer is not large enough to receive the entire statsblob. As a result, any consumers implemented to retry on receiving EOVERFLOW instead give up after receiving an empty statsblob header.
Fix by latching any errors recorded prior to copyout.
Reviewed by: markj
Obtained from: Netflix, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44585
Fixes: a0993376ec5f ("stats: Check for errors from copyout()")
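The idea of latching an error recorded before copyout can be illustrated in userland as below. This is a sketch, not the stats(9) code: demo_copyout() is a stub standing in for copyout(9), and demo_blob_clone() with its parameters is hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userland stand-in for copyout(9); this stub always succeeds. */
static int
demo_copyout(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
	return (0);
}

/*
 * Hypothetical clone routine: copy all "srclen" bytes if they fit,
 * otherwise copy only the first "hdrlen" bytes and report EOVERFLOW.
 */
static int
demo_blob_clone(void *dst, size_t dstlen, const void *src, size_t srclen,
    size_t hdrlen)
{
	size_t copylen;
	int error, tmperror;

	assert(hdrlen <= srclen && hdrlen <= dstlen);
	error = 0;
	copylen = srclen;
	if (dstlen < srclen) {
		error = EOVERFLOW;	/* recorded before the copy */
		copylen = hdrlen;
	}
	tmperror = demo_copyout(src, dst, copylen);
	/* Latch the earlier error: a successful copy must not clear it. */
	if (error == 0)
		error = tmperror;
	return (error);
}
```

A caller that sees EOVERFLOW can then allocate a larger buffer and retry, which is the behavior the commit restores.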
sound: Move sndstat_prepare_pcm() to pcm/sndstat.c and remove sndstat_entry->handler
Since all sndstat_entry->handler fields point to sndstat_prepare_pcm(),
we can just call the function directly, without assigning it to a
function pointer and calling it indirectly.
While here, move sndstat_prepare_pcm() to pcm/sndstat.c, as it is more
suitable there.
No functional change intended.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: dev_submerge.ch, markj
Differential Revision: https://reviews.freebsd.org/D44571
Fix the logic which determines if the destination Q variable can represent the source Q variable's value with full accuracy.
The new logic is mostly self explanatory except for the value fit checks.
If b has fewer integer bits than a, 0 == (Q_GIABSVAL(a) & (~Q_TC(a, 0) << Q_NIBITS(b))) is checking that a's integer value does not have high-order bits set above what b is capable of storing.
If b has fewer fractional bits than a, 0 == (Q_GFABSVAL(a) & ~(~Q_TC(a, 0) << (Q_NFBITS(a) - Q_NFBITS(b)))) is checking that a's fractional value does not have low-order bits set below what b is capable of storing.
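The two value fit checks can be tried out with plain integers. The sketch below does not use the qmath macros; it assumes the integer and fractional magnitudes are passed in as separate uint64_t values together with the bit counts, and demo_q_fits() is a hypothetical name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * ival/fval: absolute integer and fractional bit patterns of the source.
 * ia/fa:     integer and fractional bit counts of the source (a).
 * ib/fb:     integer and fractional bit counts of the destination (b).
 */
static bool
demo_q_fits(uint64_t ival, unsigned ia, uint64_t fval, unsigned fa,
    unsigned ib, unsigned fb)
{
	assert(ia < 64 && fa < 64 && ib < 64 && fb < 64);
	/* Fewer integer bits: no bit at or above position ib may be set. */
	if (ib < ia && (ival & (~UINT64_C(0) << ib)) != 0)
		return (false);
	/* Fewer fractional bits: the low (fa - fb) bits must be clear. */
	if (fb < fa && (fval & ~(~UINT64_C(0) << (fa - fb))) != 0)
		return (false);
	return (true);
}
```

For example, with 8 integer and 8 fractional source bits and a destination of 4 and 4 bits, an integer part of 15 with a fractional pattern of 0x80 fits, while an integer part of 16 or a fractional pattern of 0x08 does not.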
Michael Tuexen [Mon, 1 Apr 2024 19:51:59 +0000 (21:51 +0200)]
tcp hpts: improve consistency
The target_slot argument of max_slots_available() can be NULL.
Therefore, check for this in all places.
Right now, all callers provide a non-NULL pointer.
Reported by: Coverity Scan
CID: 1527732
Reviewed by: rrs
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D44527
cp: Improved conformance when copying directories.
* When copying a directory, if the destination exists and is not a
directory, we would previously emit an error message and exit. The
correct behavior according to POSIX is to emit an error message and
continue without descending further into the source directory.
* When copying a directory, if the destination does not exist and we
fail to create it, we would previously emit an error message and
exit. The correct behavior according to POSIX is to emit an error
message and continue. Whether to descend further into the source
directory is explicitly left unspecified; GNU cp does not, which
seems to me to be the safer and less surprising option, so let's not
either.
Mark Johnston [Mon, 1 Apr 2024 17:20:55 +0000 (13:20 -0400)]
wg: Use ENETUNREACH when transmitting to a non-existent peer
The old errno value used is specifically for Capsicum and shouldn't be
co-opted in this way. It has special handling in the generic syscall
layer (see syscallret()). OpenBSD returns ENETUNREACH in this case;
let's do the same thing.
Florian Walpen [Sun, 31 Mar 2024 19:14:16 +0000 (20:14 +0100)]
snd_hdspe(4): Only buffer_copy() audio data once.
Instead of blindly copying two periods of audio data to and from DMA
buffers, keep track of the writing position and derive the actual
part of audio data that needs to be copied.
This approximately halves the number of samples copied in total.
Rick Macklem [Sun, 31 Mar 2024 19:00:08 +0000 (12:00 -0700)]
mountd.c: Add warning messages for administrative controls
When "administrative controls" (which are exports of subdirectories
within an NFS server's local file system) are used, they export the
entire local server file system. (The subdirectory only applies to
the Mount protocol used for NFSv3 mounts.)
To minimize the risk that this causes confusion w.r.t. what is exported
to NFS client(s), this patch generates warning messages for these.
Only one message is generated for each server local file system.
The messages can be silenced via a new "-A" command line option.
The mountd.8 man page will be patched via a separate commit.
Mark Johnston [Sun, 31 Mar 2024 18:14:02 +0000 (14:14 -0400)]
kern linker: Don't invoke dtors without having invoked ctors
I have a kernel module which fails to load because of an unrecognized
relocation type. link_elf_load_file() fails before the module's ctors
are invoked and it calls linker_file_unload(), which causes the module's
dtors to be executed, resulting in a kernel panic.
Add a flag to the linker file to ensure that dtors are not invoked if
unloading due to an error prior to ctors being invoked.
At the moment I only implemented this for link_elf_obj.c since
link_elf.c doesn't invoke dtors, but I refactored link_elf.c to make
them more similar.
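The flag described in the kernel linker commit amounts to a guard like the one sketched here. The structure, field, and function names are hypothetical and only model the control flow, not link_elf_obj.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a linker file with call counters for testing. */
struct demo_file {
	bool	ctors_invoked;	/* set once constructors have run */
	int	ctor_calls;
	int	dtor_calls;
};

static void
demo_unload(struct demo_file *f)
{
	assert(f != NULL);
	/* Only run destructors if the constructors actually ran. */
	if (f->ctors_invoked) {
		f->dtor_calls++;
		f->ctors_invoked = false;
	}
}

static int
demo_load(struct demo_file *f, bool reloc_ok)
{
	assert(f != NULL);
	if (!reloc_ok) {
		/* Failure before ctors: unload must not call dtors. */
		demo_unload(f);
		return (-1);
	}
	f->ctor_calls++;
	f->ctors_invoked = true;
	return (0);
}
```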
Hot-unplugging a sound device, such as a USB sound card, whilst being
consumed by an application, results in an infinite loop until either the
application closes the device's file descriptor, or the channel
automatically times out after hw.snd.timeout seconds. In the case of a
detach however, the timeout approach is still not ideal, since we want
all resources to be released immediately, without waiting for N seconds
until we can use the bus again.
The timeout mechanism works by calling chn_sleep() in chn_read() and
chn_write() (see pcm/channel.c) in order to send the thread to sleep,
using cv_timedwait_sig(). Since chn_sleep() sets the CHN_F_SLEEPING flag
while waiting for cv_timedwait_sig() to return, we can test this flag in
pcm_unregister() (called during detach) and wake up the sleeping
thread(s) to immediately kill the channel(s) being consumed.
sound: Get rid of snd_clone and use DEVFS_CDEVPRIV(9)
Currently the snd_clone framework creates device nodes on-demand for
every channel, through the dsp_clone() callback, and is responsible for
routing audio to the appropriate channel(s). This patch gets rid of the
whole snd_clone framework (including any related sysctls) and instead
uses DEVFS_CDEVPRIV(9) to handle device opening, channel allocation and
audio routing. This results in a significant reduction in code size as
well as complexity.
Behavior that is preserved:
- hw.snd.basename_clone.
- Exclusive access of an audio device (i.e., VCHANs disabled).
- Multiple processes can read from/write to the device.
- A device can only be opened as many times as the maximum allowed
channel number (see SND_MAXHWCHAN in pcm/sound.h).
- OSSv4 compatibility aliases are preserved.
Behavior changes:
Only one /dev/dspX device node is created (on attach) for each audio
device, as opposed to the current /dev/dspX.Y devices created by
snd_clone. According to the sound(4) man page, devices are not meant to
be opened through /dev/dspX.Y anyway, so it is best if we do not create
device nodes for them in the first place. As a result of this, modify
dsp_oss_audioinfo() to print /dev/dspX in the "ai->devnode", instead of
/dev/dspX.Y.
Sponsored by: The FreeBSD Foundation
MFC after: 2 months
Reviewed by: dev_submerge.ch, markj
Differential Revision: https://reviews.freebsd.org/D44411
Alan Cox [Sat, 30 Mar 2024 20:35:32 +0000 (15:35 -0500)]
arm64: enable superpage mappings by pmap_mapdev{,_attr}()
In order for pmap_kenter{,_device}() to create superpage mappings,
either 64 KB or 2 MB, pmap_mapdev{,_attr}() must request appropriately
aligned virtual addresses.
Eliot Solomon [Sun, 24 Mar 2024 19:01:47 +0000 (14:01 -0500)]
arm64 pmap: Add ATTR_CONTIGUOUS support [Part 1]
The ATTR_CONTIGUOUS bit within an L3 page table entry designates that
L3 page as being part of an aligned, physically contiguous collection
of L3 pages. For example, 16 aligned, physically contiguous 4 KB pages
can form a 64 KB superpage, occupying a single TLB entry. While this
change only creates ATTR_CONTIGUOUS mappings in a few places,
specifically, the direct map and pmap_kenter{,_device}(), it adds all
of the necessary code for handling them once they exist, including
demotion, protection, and removal. Consequently, new ATTR_CONTIGUOUS
usage can be added (and tested) incrementally.
Modify the implementation of sysctl vm.pmap.kernel_maps so that it
correctly reports the number of ATTR_CONTIGUOUS mappings on machines
configured to use a 16 KB base page size, where an ATTR_CONTIGUOUS
mapping consists of 128 base pages.
Additionally, this change adds support for creating L2 superpage
mappings to pmap_kenter{,_device}().
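The sizes quoted in the ATTR_CONTIGUOUS commit can be checked with a little arithmetic. The sketch below takes the page counts from the text (16 pages with a 4 KB base page, 128 with a 16 KB base page); the function names and the eligibility test are assumptions for illustration, not pmap code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Base pages per ATTR_CONTIGUOUS run, as stated in the commit text. */
static unsigned
demo_contig_pages(uint64_t page_size)
{
	assert(page_size == 4096 || page_size == 16384);
	return (page_size == 4096 ? 16 : 128);
}

/* Bytes covered by one ATTR_CONTIGUOUS run. */
static uint64_t
demo_contig_size(uint64_t page_size)
{
	return (page_size * demo_contig_pages(page_size));
}

/*
 * A range can start with a contiguous run only if both addresses are
 * aligned to the run size and the range is at least one run long.
 */
static bool
demo_contig_ok(uint64_t va, uint64_t pa, uint64_t len, uint64_t page_size)
{
	uint64_t mask;

	mask = demo_contig_size(page_size) - 1;
	return ((va & mask) == 0 && (pa & mask) == 0 && len > mask);
}
```

With 4 KB base pages a run covers 16 * 4 KB = 64 KB; with 16 KB base pages it covers 128 * 16 KB = 2 MB.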
thread_single(9): decline external requests for traced or debugger-stopped procs
A debugger has the power to cause unbounded delay in single-threading,
which then blocks the threaded taskqueue. The reproducer is
`truss -f timeout 2 sleep 10`.
Reported by: mjg
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44523
Robert Evans [Sat, 30 Mar 2024 00:11:52 +0000 (20:11 -0400)]
Linux 5.18+ compat: Detect filemap_range_has_page
In v5.18, `filemap_range_has_page` moved to `pagemap.h`.
`pagemap.h` has been around since 3.10, so just include both.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Robert Evans <evansr@google.com>
Closes #16034
Update the I2C controller logic to be more consistent with the
newer version of the controller reference manual.
This makes it work better on modern LS/LX platforms and avoids
unnecessary delays. Also fixes a lock leak.
vf_i2c: split up and add ACPI attachments in addition to FDT
Move the code from the arm-specific directory to the iicbus controller directory.
Split up between general logic and bus attachment code.
Add support for ACPI attachment in addition to FDT.
MFC after: 7 days
Tested by: bz (LS1088a FDT), Pierre-Luc Drouin (Honeycomb, ACPI)
Based on: D24917 by Val Packett (initial early version)
Differential Revision: https://reviews.freebsd.org/D44020