Dimitry Andric [Wed, 22 Nov 2023 18:23:06 +0000 (19:23 +0100)]
compiler-rt: avoid segfaults when re-exec'ing with ASLR
After 930a7c2ac67e ("compiler-rt: re-exec with ASLR disabled when
necessary") and 96fe7c8ab0f6 ("compiler-rt: support ReExec() on
FreeBSD"), binaries linked against the sanitizer libraries may segfault
due to procctl(2) being intercepted. Instead, the non-intercepted
internal_procctl() should be called.
Similarly, the ReExec() function that re-executes the binary after
turning off ASLR should not call elf_aux_info(3) and realpath(3), since
these will also be intercepted. Instead, loop directly over the elf aux
info vector to find the executable path, and avoid calling realpath(3)
since it is actually unwanted for this use case.
Gordon Bergling [Sat, 18 Nov 2023 09:09:40 +0000 (10:09 +0100)]
Add a HISTORY section for memcpy(3) and mempcpy(3)
The memcpy() function first appeared in AT&T System V UNIX and was
reimplemented for 4.3BSD-Tahoe. The mempcpy() function first appeared in
FreeBSD 13.1.
Kristof Provost [Fri, 17 Nov 2023 12:52:34 +0000 (13:52 +0100)]
pf: sctp heartbeats confirm a connection
When we create a new state for multihomed sctp connections (i.e.
based on INIT/INIT_ACK or ASCONF parameters) the new connection will
never see a COOKIE/COOKIE_ACK exchange. We should consider HEARTBEAT_ACK
to be a confirmation that the connection is established.
This ensures that such connections do not time out earlier than
expected.
MFC after: 1 week
Sponsored by: Orange Business Services
Kristof Provost [Thu, 16 Nov 2023 19:55:02 +0000 (20:55 +0100)]
pf: skip urpf check for sctp multihomed states
When we create a new state for multihomed sctp connections (i.e.
based on INIT/INIT_ACK or ASCONF parameters) we cannot know what
interfaces we'll be seeing that traffic on. These states are floating
states, i.e. on "all" interfaces. We cannot do reverse path filtering
for these states, so do not do so.
MFC after: 1 week
Sponsored by: Orange Business Services
Kristof Provost [Thu, 16 Nov 2023 16:06:29 +0000 (17:06 +0100)]
pf: always create multihomed states as floating
When we create a new state for multihomed sctp connections (i.e.
based on INIT/INIT_ACK or ASCONF parameters) we cannot know what
interfaces we'll be seeing that traffic on. Make those states floating,
irrespective of state policy.
MFC after: 1 week
Sponsored by: Orange Business Services
Rick Macklem [Mon, 6 Nov 2023 22:25:30 +0000 (14:25 -0800)]
nfscl: newnfs_copycred() cannot be called when a mutex is held
Since newnfs_copycred() calls crsetgroups() which in turn calls
crextend() which might do a malloc(M_WAITOK), newnfs_copycred()
cannot be called with a mutex held. Fortunately, the malloc()
call is rarely done, since XU_GROUPS is 16 and the NFS client
uses a maximum of 17 (only 17 groups will cause the malloc() to
be called). Further, it is only a problem if the malloc() tries
to sleep(). As such, this bug does not seem to have caused
problems in practice.
This patch fixes the one place in the NFS client where
newnfs_copycred() is called while a mutex is held by moving the
call to after where the mutex is released.
Found by inspection while working on an experimental patch.
Rick Macklem [Sun, 22 Oct 2023 01:33:33 +0000 (18:33 -0700)]
nfscl: Handle a Getattr failure with NFSERR_DELAY following Open
During testing at a recent IETF NFSv4 Bakeathon, a non-FreeBSD
server was rebooted. After the reboot, the FreeBSD client sent
an Open/Claim_previous with a Getattr after the Open in the same
compound. The Open/Claim_previous was done to recover the Open
and a Delegation for for a file. The Open succeeded, but the
Getattr after the Open failed with NFSERR_DELAY. This resulted
in the FreeBSD client retrying the entire RPC over and over again,
until the server's recovery grace period ended. Since the Open
succeeded, there was no need to retry the entire RPC.
This patch modifies the NFSv4 client side recovery Open/Claim_previous
RPC reply handling to deal with this case. With this patch, the
Getattr reply of NFSERR_DELAY is ignored and the successful Open
reply is processed.
This bug will not normally affect users, since this non-FreeBSD
server is not widely used (it may not even have shipped to any
customers).
Bojan Novković [Mon, 13 Nov 2023 18:02:30 +0000 (20:02 +0200)]
tty: properly check character position when handling IUTF8 backspaces
The tty_rubchar() code handling backspaces for UTF-8 characters didn't
properly check whether the beginning of the current line was reached.
This resulted in a kernel panic in ttyinq_unputchar() when prodded with
certain malformed UTF-8 sequences.
128f63cedc14 and 9e589b093857 added proper UTF-8 backspacing handling in
the tty(4) driver, which is enabled by setting the new IUTF8 flag
through stty(1). Since the default locale is UTF-8, and the feature
itself is important enough, enable IUTF8 by default.
Related discussion:
https://lists.freebsd.org/archives/freebsd-arch/2023-November/000534.html
Aaron LI [Sat, 11 Nov 2023 13:13:08 +0000 (14:13 +0100)]
if_wg: Missing radix unlock can cause deadlock
In function 'wg_aip_add()', the error path of returning ENOMEM when
(node == NULL) is forgetting to unlock the radix tree, and thus may lead
to a deadlock.
Rick Macklem [Tue, 17 Oct 2023 20:55:48 +0000 (13:55 -0700)]
nfsd: Avoid acquiring a vnode for some NFSv4 Readdir operations
Without this patch, a NFSv4 Readdir operation acquires the vnode for
each entry in the directory. If only the Type, Fileid, Mounted_on_fileid
and ReaddirError attributes are requested by a client, acquiring the vnode
is not necessary for non-directories. Directory vnodes must be acquired
to check for server file system mount points.
This patch avoids acquiring the vnode, as above, resulting in a 3-8%
improvement in Readdir RPC RTT for some simple tests I did.
Note that only non-rdirplus NFSv4 mounts will benefit from this change.
Tested during a recent IETF NFSv4 Bakeathon testing event.
Alexander Motin [Mon, 6 Nov 2023 16:05:48 +0000 (11:05 -0500)]
nvme: Introduce longer timeouts for admin queue
KIOXIA CD8 SSDs routinely take ~25 seconds to delete non-empty
namespace. In some cases like hot-plug it takes longer, triggering
timeout and controller resets after just 30 seconds. Linux for many
years has separate 60 seconds timeout for admin queue. This patch
does the same. And it is good to be consistent.
Rick Macklem [Wed, 18 Oct 2023 19:42:12 +0000 (12:42 -0700)]
nfscl: Handle the NFSERR_RETRYUNCACHEDREP error from a NFSv4 server
In a recent email list discussion related to NFSv4 mount problems
against a non-FreeBSD NFSv4 server, the reporter of the issue noted
that the server had replied 10068 (NFSERR_RETRYUNCACHEDREP). This
did not seem related to the mount problem, but I had never seen this
error before. It indicates that an RPC retry after a new TCP
connection has been established failed because the server did not
cache the reply. Since this should only happen for idempotent
operations, redoing the RPC should be safe.
This patch modifies the NFSv4.1/4.2 client to redo the RPC instead
of considering the server error fatal. It should only affect the
unusual case where TCP connections to NFSv4 servers are breaking
without the NFSv4 server rebooting.
Jose Luis Duran [Fri, 6 Oct 2023 17:55:06 +0000 (17:55 +0000)]
ping: Avoid reporting NaNs
Avoid calculating the square root of negative zero, which can easily
happen on certain architectures when calculating the population standard
deviation with a sample size of one, e.g., 0.01 - (0.1 * 0.1) =
-0.000000.
Avoid returning a NaN by capping the minimum possible variance value to
zero (positive).
In the future, maybe skip reporting statistics at all for a single
sample.
Zhenlei Huang [Sat, 21 Oct 2023 04:52:27 +0000 (12:52 +0800)]
bpf: Make dead_bpf_if const
The dead_bpf_if is not subjected to be written. Make it const so that
on destructive writing to it the kernel will panic instead of silent
memory corruption.
Franco Fichtner [Fri, 10 Nov 2023 11:42:17 +0000 (12:42 +0100)]
libpfctl: fix label setting
A mismerge caused the labels list to be added to the wrong nvlist,
breaking label configuration.
If you compare the change from from main and stable/13 you
can see that main uses "nvl" and stable/13 has "nlvr" for
nvlist_append_string_array() but the backport changes it to "nlv".
This code was supposed to apply to pfctl_add_eth_rule() but instead
applied to pfctl_add_rule() for otherwise interesting reasons. Since
pfctl_add_eth_rule() uses "nvl" and pfctl_add_rule() uses "nvlr" but
also has "nvl" this compiled fine but still broke the label set.
Kristof Provost [Fri, 27 Oct 2023 12:13:57 +0000 (14:13 +0200)]
libpfctl: be more tolerant of kernel extensions
Allow the kernel to supply more array elements than expected, but cut
off when we hit what we think the maximum is. This will improve forward
compatibility (i.e. old userspace with newer kernel).
Kristof Provost [Tue, 10 Oct 2023 09:56:15 +0000 (11:56 +0200)]
pf tests: ensure that we generate all permutations for SCTP multihome
The initial multihome implementation was a little simplistic, and failed
to create all of the required states. Given a client with IP 1 and 2 and
a server with IP 3 and 4 we end up creating states for 1 - 3 and 2 - 3,
as well as 3 - 1 and 4 - 1, but not for 2 - 4.
Kristof Provost [Tue, 17 Oct 2023 16:10:39 +0000 (18:10 +0200)]
pf: fix missing SCTP multihomed states
The existing code to create extra states when SCTP endpoints supplied
extra addresses missed a case. As a result we failed to generate all of
the required states.
Briefly, if host A has address 1 and 2 and host B has addres 3 and 4 we
generated 1 - 3 and 2 - 3, as well as 1 - 4, but not 2 - 4.
Store the list of endpoints supplied by each host and use those to
generate all of the connection permutations.
The commit that purported to fix CVE-2014-8611 (805288c2f062) only hid
it behind another bug. Two later commits, 86a16ada1ea6 and 44cf1e5eb470, attempted to address this new bug but mostly just confused
the issue. This commit rolls back the three previous changes and fixes
CVE-2014-8611 correctly.
The key to understanding the bug (and the fix) is that `_w` has
different meanings for different stream modes. If the stream is
unbuffered, it is always zero. If the stream is fully buffered, it is
the amount of space remaining in the buffer (equal to the buffer size
when the buffer is empty and zero when the buffer is full). If the
stream is line-buffered, it is a negative number reflecting the amount
of data in the buffer (zero when the buffer is empty and negative buffer
size when the buffer is full).
At the heart of `fflush()`, we call the stream's write function in a
loop, where `t` represents the return value from the last call and `n`
the amount of data that remains to be written. When the write function
fails, we need to move the unwritten data to the top of the buffer
(unless nothing was written) and adjust `_p` (which points to the next
free location in the buffer) and `_w` accordingly. These variables have
already been set to the values they should have after a successful
flush, so instead of adjusting them down to reflect what was written,
we're adjusting them up to reflect what remains.
The bug was that while `_p` was always adjusted, we only adjusted `_w`
if the stream was fully buffered. The fix is to also adjust `_w` for
line-buffered streams. Everything else is just noise.
Fixes: 805288c2f062 Fixes: 86a16ada1ea6 Fixes: 44cf1e5eb470
Sponsored by: Klara, Inc.
Mariusz Zaborski [Mon, 23 Oct 2023 21:03:51 +0000 (23:03 +0200)]
cap_net: correct capability name from addr2name to name2addr
Previously, while checking name2addr capabilities, we mistakenly used
the addr2name set. This error could cause a process to inadvertently
reset its limitations.
Yishai Hadas [Sat, 28 Oct 2023 20:55:47 +0000 (16:55 -0400)]
mlx5ib: Fix RSS Toeplitz setup to be aligned with the HW specification
The specification for the Toeplitz function doesn't require to set the key
explicitly to be symmetric. In case a symmetric functionality is required
a symmetric key can be simply used.
Wrongly forcing the algorithm to symmetric causes the wrong packet
distribution and a performance degradation.
Rick Macklem [Thu, 19 Oct 2023 19:35:35 +0000 (12:35 -0700)]
nfsd: Fix NFSv4.1/4.2 Claim_Deleg_Cur_FH
When I implemented a test patch using Open Claim_Deleg_Cur_FH
I discovered that the NFSv4.1/4.2 server was broken for this
Open option. Fortunately it is never used by the FreeBSD
client and never used by other clients unless delegations
are enabled. (The FreeBSD NFSv4 server does not have delegations
enabled by default.)
Claim_Deleg_Cur_FH was broken because the code mistakenly
assumed a stateID argument, which is not the case.
This patch fixes the bug by changing the XDR parser to not
expect a stateID and to fill most of the stateID in from the
clientID. The clientID is the first two elements of the "other"
array for the stateID and is sufficient to identify which
client the delegation is issued to. Since there is only one
delegation issued to a client per file, this is sufficient to
locate the correct delegation.
If you are running non-FreeBSD NFSv4.1/4.2 mounts against the
FreeBSD server, you need this patch if you have delegations enabled.
Zhenlei Huang [Thu, 2 Nov 2023 05:14:40 +0000 (13:14 +0800)]
cam/ata: Postpone removal of two compat sysctls until 15
Prefer UNMAPPEDIO and ROTATING from flags sysctl. See
1. aeab0812e68c (Add flags sysctl to ada)
2. cf3ff63e55e4 (Convert unmappedio over to a flag)
3. 96eb32bf0f5a (Convert rotating to a flag bit)
Reviewed by: imp, ken, #cam
MFC after: immediately (we want this in 14.0)
Differential Revision: https://reviews.freebsd.org/D42402
Kristof Provost [Mon, 23 Oct 2023 11:46:11 +0000 (13:46 +0200)]
libpfctl: fix Coverity issues
- handle snl_finalize_msg() returning NULL
- insert the correct data into the states list
- add missing nvlist_destroy()
- incorrect order for array bounds
Kristof Provost [Mon, 23 Oct 2023 11:43:52 +0000 (13:43 +0200)]
libpfctl: fix pfctl_do_ioctl()
pfctl_do_ioctl() copies the packed request data into the request buffer
and then frees it. However, it's possible for the buffer to be too small
for the reply, causing us to allocate a new buffer. We then copied from
the freed request, and freed it again.
Do not free the request buffer until we're all the way done.
netlink: fix potential llentry lock leak in newneigh handler
The netlink newneigh handler has the potential to leak the lock on
llentry objects in the kernel. This patch reconciles several paths
through the newneigh handler that could result in a lock leak.
Gleb Smirnoff [Thu, 26 Oct 2023 09:59:21 +0000 (02:59 -0700)]
bhyve: fix arguments to ioctl(VMIO_SIOCSIFFLAGS)
ioctl(2)'s with integer argument shall pass command argument by value,
not by pointer. The ioctl(2) manual page is not very clear about that.
See sys/kern/sys_generic.c:sys_ioctl() near IOC_VOID.
Kristof Provost [Fri, 6 Oct 2023 12:20:17 +0000 (14:20 +0200)]
pfctl: fix incorrect mask on dynamic address
A PF rule using an IPv4 address followed by an IPv6 address and then a
dynamic address, e.g. "pass from {192.0.2.1 2001:db8::1} to (pppoe0)",
will have an incorrect /32 mask applied to the dynamic address.
MFC after: 3 weeks
Obtained from: OpenBSD
See also: https://ftp.openbsd.org/pub/OpenBSD/patches/5.6/common/007_pfctl.patch.sig
Sponsored by: Rubicon Communications, LLC ("Netgate")
Event: Oslo Hackathon at Modirum
Brooks Davis [Thu, 26 Oct 2023 20:38:41 +0000 (21:38 +0100)]
libprocstat: improve conditional for 32-bit compat
Include support for translating 32-bit auxv vectors on non-64-bit
platforms that aren't riscv (which has no 32-bit ABI support and
probably never will).
Brooks Davis [Thu, 26 Oct 2023 20:38:41 +0000 (21:38 +0100)]
libprocstat: make sv_name not static
Making this variable static makes is_elf32_sysctl() and callers thread
unsafe.
Use a less absurd length for sv_name. The longest name in the system is
"FreeBSD ELF64 V2" which tips the scales at 16+1 bytes. We'll almost
certainly have other problems if we exceed 32 characters.
John Baldwin [Fri, 20 Oct 2023 21:52:38 +0000 (14:52 -0700)]
x86 msi: Enable/disable IDT vectors for MSI groups all at once
Unlike MSI-X, when a device uses multiple MSI interrupts, the entire
group of interrupts are enabled/disabled at once in the relevant PCI
config register. Currently, the interrupt code enables the IDT vector
for each MSI interrupt when a handler is first registered. If the PCI
device triggers an MSI interrupt which doesn't yet have a handler,
this can trigger a panic when the Xrsvd ISR executes rather than
treating it as a stray device interrupt.
To fix, enable all the IDT vectors for an MSI group when the first
interrupt handler is configured, and don't disable the IDT vectors
until the last interrupt handler for the group is torn down.
When migrating an MSI group between CPUs, enable/disable the entire
group of IDT vectors if at least one interrupt handler is configured
for the group.
John Baldwin [Mon, 16 Oct 2023 22:19:07 +0000 (15:19 -0700)]
acpi_pcib: Trust decoded bus range from _CRS over _BBN
Currently if _BBN doesn't match the first bus in the decoded bus range
from _CRS for a Host to PCI bridge, the driver fails to attach as a
defensive measure.
There is now firmware in the field where these do not match, and the
_BBN values are clearly wrong, so rather than failing attach, trust
the range from _CRS over _BBN.
John Baldwin [Mon, 16 Oct 2023 22:17:48 +0000 (15:17 -0700)]
bhyve: Replace many fprintf(stderr, ...) calls with EPRINTLN
EPRINTLN handles newlines appropriately when stdout/stderr have been
reused as the backend for a serial port.
For bhyverun.c itself, the rule this attempts to follow is to use
regular fprintf/perror/warn/err prior to init_pci() (which is when
serial ports are configured) and to switch to EPRINTLN afterwards.
John Baldwin [Fri, 13 Oct 2023 19:26:22 +0000 (12:26 -0700)]
bhyve: Some fwctl simplifications.
- Collapse IDENT_SEND/IDENT_WAIT states down to a single state.
- Remove unused 'len' argument to op_data callback. The value passed
in (total amount of remaining data to receive) didn't seem very useful
and no op_data implementations used it.