Mark Johnston [Sat, 25 Feb 2023 15:21:19 +0000 (10:21 -0500)]
ck_queue: add CK_*_FOREACH_FROM
This is a variant of CK_*_FOREACH from FreeBSD queue.h which starts
iteration at the specified item. If the item pointer is NULL, iteration
starts from the beginning of the list.
Mike Karels [Sat, 25 Feb 2023 14:04:00 +0000 (08:04 -0600)]
sys/conf/NOTES: clean up whitespace
Most options in kernel config files use "options<space><tab>OPTION".
This allows the option to be commented out without shifting columns.
A few options had two tabs, and some had spaces. Make them consistent.
Piotr Kubaj [Fri, 24 Feb 2023 13:26:31 +0000 (14:26 +0100)]
gh-bc: don't disable LTO on powerpc64
Summary:
The LTO issue has been fixed. While -flto for some reason is commented out,
since it wasn't completely removed, it may be expected to be reenabled.
Reviewers: se
Approved by: se
MFC after: 3 days
Subscribers: imp
Differential Revision: https://reviews.freebsd.org/D38755
Paul Floyd [Fri, 24 Feb 2023 16:29:01 +0000 (11:29 -0500)]
libc: handle zero alignment in memalign()
For compatibility with glibc. The previous code would trigger a division
by zero in roundup() and terminate. Instead, just pass through to
malloc() for align == 0.
Mitchell Horne [Fri, 24 Feb 2023 17:19:54 +0000 (13:19 -0400)]
bcm_dma: don't dereference NULL softc
This file defines a small API to be used by other drivers. If any of
these functions are called before the bcm_dma device has attached we
should handle the error gracefully. Fix a formatting quirk while here.
Reviewed by: manu
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D38756
Mark Millard [Fri, 17 Feb 2023 20:30:35 +0000 (16:30 -0400)]
bcm_dma: attach at an earlier bus pass
The sdhci_bcm driver attach routine relies on bcm_dma already being
attached, in order to allocate a DMA channel. However, both drivers
attached at the default pass so this is not guaranteed. Newer RPI
firmware exposes this assumption, and the result is a NULL-dereference
in bcm_dma_allocate().
Rick Macklem [Fri, 24 Feb 2023 15:36:28 +0000 (07:36 -0800)]
nfsd: Fix a use after free when vnet prisons are deleted
The Kasan tests show the nfsrvd_cleancache() results
in a modify after free. I think this occurs because the
nfsrv_cleanup() function gets executed after nfs_cleanup()
which free's the nfsstatsv1_p.
This patch makes them use the same subsystem and sets
SI_ORDER_FIRST for nfs_cleanup(), so that it will be called
after nfsrv_cleanup() via VNET_SYSUNINIT().
The patch also sets nfsstatsv1_p NULL after free'ng it,
so that a crash will result if it is used after free'ng.
The reason for this is while looping through loose source routing (LSRR)
and strict source routing (SSRR), hlen will become smaller than the IP
header. It may even become negative. This should terminate the loop.
However, when hlen is unsigned, an integer underflow occurs becoming a
large number causing the loop to continue virtually forever until hlen
is either by chance smaller than the lenghth of an IP header or it
segfaults.
Mike Karels [Thu, 23 Feb 2023 17:45:08 +0000 (11:45 -0600)]
riscv kernel config: clean up whitespace
Most options in kernel config files use "options<space><tab>OPTION".
This allows the option to be commented out without shifting columns.
A few options had two tabs, and some had spaces. Make them consistent.
Mike Karels [Thu, 23 Feb 2023 17:44:18 +0000 (11:44 -0600)]
powerpc kernel config: clean up whitespace
Most options in kernel config files use "options<space><tab>OPTION".
This allows the option to be commented out without shifting columns.
A few options had two tabs, and some had spaces. Make them consistent.
Mike Karels [Thu, 23 Feb 2023 17:42:59 +0000 (11:42 -0600)]
i386 kernel config: clean up whitespace
Most options in kernel config files use "options<space><tab>OPTION".
This allows the option to be commented out without shifting columns.
A few options had two tabs, and some had spaces. Make them consistent.
Mike Karels [Thu, 23 Feb 2023 17:41:31 +0000 (11:41 -0600)]
arm64 kernel config: clean up whitespace
Most options in kernel config files use "options<space><tab>OPTION".
This allows the option to be commented out without shifting columns.
A few options had two tabs, and some had spaces. Make them consistent.
Mike Karels [Thu, 23 Feb 2023 17:38:26 +0000 (11:38 -0600)]
arm kernel config: clean up whitespace
Most options in kernel config files use "options<space><tab>OPTION".
This allows the option to be commented out without shifting columns.
A few options had two tabs, and some had spaces. Make them consistent.
Mike Karels [Thu, 23 Feb 2023 17:09:39 +0000 (11:09 -0600)]
amd64 kernel config: clean up whitespace
Most options in kernel config files use "options<space><tab>OPTION".
This allows the option to be commented out without shifting columns.
A few options had two tabs, and some had spaces. Make them consistent.
Mark Johnston [Fri, 24 Feb 2023 02:45:01 +0000 (21:45 -0500)]
config: Include errno.h in mkmakefile.cc
Commit da8884202940 ("config: error out on malformed env/hint lines")
added a reference to EINVAL. In some configurations the bootstrap tools
build fails for lack of errno definitions.
Fixes: da8884202940 ("config: error out on malformed env/hint lines")
Reported by: syzbot+b1a5d112a737d9a2be9b@syzkaller.appspotmail.com
Bjoern A. Zeeb [Sat, 18 Feb 2023 01:15:21 +0000 (01:15 +0000)]
net80211: ieee80211_swscan_bg_scan() track return variable under lock
As the comment says it probably does not matter but use a local
variable to track state under lock so we can return the last known
good state of what we thought we were operating under after unlocking.
Likely no functional changes.
Sponsored by: The FreeBSD Foundation
MFC atfer: 3 days
Reviewed by: enweiwu, adrian
Differential Revision: https://reviews.freebsd.org/D38660
Zhenlei Huang [Thu, 23 Feb 2023 18:00:09 +0000 (02:00 +0800)]
Delete obsolete Solaris compat header file stdlib.h
This drops function `getexecname()` redirection.
Historically `getexecname()` is a compatibility definition. Since
openzfs has its own implementation of function `getexecname()` in libspl
and has been merged into base, the compat header file stdlib.h is
no longer needed and should not be used.
Also without this fix libspl will end up an incompatible version of
`getprogname()` with libc. In particular, if zfs is enabled, programs
such as pgrep in /rescue can be wrongly statically linked with libspl
and will not function properly.
PR: 269738
Reviewed by: markj
Fixes: 9e5787d2284e Merge OpenZFS support in to HEAD
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D38733
* Make nhop_set_blackhole() set all necessary properties for the
nexthop
* Make nexthops blackhole/reject based on the rtm_type netlink
property instead of using rtflags.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days
Kornel Dulęba [Mon, 20 Feb 2023 14:48:40 +0000 (14:48 +0000)]
arm: Fix initialization of VFP context
Make sure that pcb_vfpsaved is always initialized.
Create a vfp_new_thread helper that is heavily based on the arm64 logic.
While here remove un unnecessary assigment and add an assertion
to make sure that it's been properly initialized before we return
from a VFP exception.
Reported by: Mark Millard <marklmi@yahoo.com>
Tested by: Mark Millard <marklmi@yahoo.com>
Differential Revision: https://reviews.freebsd.org/D38698
Kornel Dulęba [Mon, 20 Feb 2023 14:44:36 +0000 (14:44 +0000)]
arm: Unbreak debugging programs that use FP instructions
Contrary to arm64, on armv7 get_vfpcontext/set_vfpcontext can be called
from cpu_ptrace. This can be triggered when gdb hits a breakpoint
in a userspace program.
Relax td == currthread assertion to account for that situation.
While here update an outdated comment in vfp_discard.
Reported by: Mark Millard <marklmi@yahoo.com>
Tested by: Mark Millard <marklmi@yahoo.com>
Differential Revision: https://reviews.freebsd.org/D38696
Gleb Smirnoff [Thu, 23 Feb 2023 04:44:46 +0000 (20:44 -0800)]
unix/dgram tests: match the kernel behavior
In CURRENT for some time an overflowed unix/dgram socket would
return EAGAIN if it has O_NONBLOCK set. This proved to be
undesired. See 71e70c25c00 for details. Update tests to match
the "new" behavior, which actually is the historical behavior.
Rick Macklem [Wed, 22 Feb 2023 22:09:15 +0000 (14:09 -0800)]
nfsd.c: Log a more meaningful failure message
For the cases where the nfsd(8) daemon is already running or
has failed to start within a prison due to an incorrect prison
configuration, the failure message logged is:
Can't read stable storage file: operation not permitted
This patch replaces the above with more meaningful messages.
It depends on commit 10dff9da9748 to differentiate between the
above two cases, however even without this commit, the messages
should be an improvement.
Rick Macklem [Wed, 22 Feb 2023 21:19:07 +0000 (13:19 -0800)]
nfsd: Return ENXIO instead of EPERM when nfsd(8) already running
The nfsd(8) daemon generates an error message that does not
indicate that the nfsd daemon is already running when the nfssvc(2)
syscall fails for the NFSSVC_STABLERESTART. Also, the check for
running nfsd(8) in a vnet prison will return EPERM when it fails.
This patch replaces EPERM with ENXIO so that the nfsd(8) daemon
can generate more reasonable failure messages. The nfsd(8) daemon
will be patched in a future commit.
Mitchell Horne [Wed, 22 Feb 2023 15:11:15 +0000 (11:11 -0400)]
lockmgr: upgrade panic return checks
We short-circuit lockmgr functions in the face of a kernel panic. Other
lock implementations do this with a SCHEDULER_STOPPED() check, which
covers the additional case where the debugger is active but the system
has not panicked. Update this code to match that behaviour.
Michael Tuexen [Tue, 21 Feb 2023 21:38:18 +0000 (22:38 +0100)]
bblog: improve timeout event handling
Extend the BBLog RTO event to deal with all timers of the base
stack. Also provide information about starting, stopping, and
running off. The expiration of the retransmission timer is
reported as it was done before.
Reviewed by: rscheff@
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D38710
Rick Macklem [Tue, 21 Feb 2023 21:00:42 +0000 (13:00 -0800)]
vfs_export: Add mnt_exjail to control exports done in prisons
If there are multiple instances of mountd(8) (in different
prisons), there will be confusion if they manipulate the
exports of the same file system. This patch adds mnt_exjail
to "struct mount" so that the credentials (and, therefore,
the prison) that did the exports for that file system can
be recorded. If another prison has already exported the
file system, vfs_export() will fail with an error.
If mnt_exjail == NULL, the file system has not been exported.
mnt_exjail is checked by the NFS server, so that exports done
from within a different prison will not be used.
The patch also implements vfs_exjail_destroy(), which is
called from prison_cleanup() to release all the mnt_exjail
credential references, so that the prison can be removed.
Mainly to avoid doing a scan of the mountlist for the case
where there were no exports done from within the prison,
a count of how many file systems have been exported from
within the prison is kept in pr_exportcnt.
Michael Tuexen [Tue, 21 Feb 2023 17:26:49 +0000 (18:26 +0100)]
tcp: rearrange enum and remove unused variable
Rearrange the enum tt_which such that TT_REXMIT is 0. This allows
an extension of the BBLog event RTO in a backwards compatible way.
Remove tcptimers, which was only used in trpt, a utility removed
from the source tree recently.
Gleb Smirnoff [Tue, 21 Feb 2023 16:50:07 +0000 (08:50 -0800)]
Revert "unix/dgram: return EAGAIN instead of ENOBUFS when O_NONBLOCK set"
This API change led to unexpected consequences with Go runtime. The
Go runtime emulates blocking sockets over non-blocking sockets and
for that uses available event dispatcher on the target OS, which is
kevent(2) if availabe, with OS independent layer on top. It expects
that if whatever O_NONBLOCK socket returned ever EAGAIN, then it is
supposed to be reported as writable by the event dispatcher. kevent(2)
would never report a unix/dgram socket, since they never change their
state, they always are writeable. The expectations of Go are not
literally specified by SUS, however they are in its spirit. The SUS
specifies EAGAIN for send(2) as "The socket's file descriptor is marked
O_NONBLOCK and the requested operation would block" [1]. This doesn't
apply to FreeBSD unix/dgram socket, it never blocks on send(2).
Thus, changing API trying to mimic Linux was a mistake. But what about
the problem we tried to fix? Discussed that with Max Dounin of nginx,
and we agreed that the log bomb described shall be fixed on nginx side,
and it actually isn't specific to FreeBSD, may happen with nginx on any
non-Linux system with a certain configuration.
Zhenlei Huang [Tue, 21 Feb 2023 15:43:25 +0000 (23:43 +0800)]
jail: Fix redoing ip restricting
`prison_ip_restrict()` is called in loop FOREACH_PRISON_DESCENDANT_LOCKED.
While under low memory, it is still possible that in subsequent rounds
`prison_ip_restrict()` succeed and `redo_ip[46]` flip over from true to
false, thus leave some prisons's IPv[46] addresses unrestricted.
Reviewed by: jamie
Fixes: 8bce8d28abe6 jail: Avoid multipurpose return value of function prison_ip_restrict()
Differential Revision: https://reviews.freebsd.org/D38697
Rick Macklem [Tue, 21 Feb 2023 03:43:37 +0000 (19:43 -0800)]
nfscommon: Use IS_DEFAULT_VNET() in the vnet initialization
Another oopsie. The vnet initialization function in
nfs_commonport.c for initializing prison0 by testing
curthread->td_ucred->cr_prison == &prison0. This is bogus
and always true. Replace it with IS_DEFAULT_VNET(curvnet).
Rick Macklem [Tue, 21 Feb 2023 00:40:07 +0000 (16:40 -0800)]
nfscl: Add NFSD_CURVNET macros to nfsclient syscall
Although the nfsclient syscall is used for client side,
it does set up server side krpc for callbacks. As such,
it needs to have the vnet set. This patch does this.
Without this patch, the system would crash when the
nfscbd(8) daemon was killed.
Michael Tuexen [Mon, 20 Feb 2023 20:42:57 +0000 (21:42 +0100)]
bblog: sync tcp_log_events with Netflix tree
This allows the addition of entries to tcp_log_events without
causing conflicts in the Netflix tree.
rrs@ will upstream the related functional changes eventually.
tarfs: Really prevent descending into a non-directory.
The previous fix was incorrect: we need to verify that the current node, if it exists, is not a directory, but we were checking the parent node instead. Address this, add more tests, and fix the test cleanup routines.
PR: 269519, 269561
Fixes: ae6cff89738b
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D38645
Rick Macklem [Mon, 20 Feb 2023 21:11:22 +0000 (13:11 -0800)]
nfsd: Add VNET_SYSUNINIT() macros for vnet cleanup
Commit ed03776ca7f4 enabled the vnet front end macros.
As such, for kernels built with the VIMAGE option will malloc
data and initialize locks on a per-vnet basis, typically
via a VNET_SYSINIT().
This patch adds VNET_SYSUNINIT() macros to do the frees
of the per-vnet malloc'd data and destroys of per-vnet
locks. It also removes the mtx_lock/mtx_unlock calls
from nfsrvd_cleancache(), since they are not needed.
netlink: fix IPv6 route addition with link-local gateway
Currently kernel assumes that IPv6 gateway address is in "embedded"
form - that is, for the link-local IPv6 addresses, interface index
is embedded in bytes 2 and 3 of the address.
Fix address embedding in netlink by wrapping nhop_set_gw() in the
netlink-specific nl_set_nexthop_gw(), which does such embedding
automatically.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days
Mark Johnston [Mon, 20 Feb 2023 13:58:19 +0000 (08:58 -0500)]
man9: Add an smr(9) manual page
Also update the UMA manual page to mention its SMR-enabled
functionality, and update locking.9 to mention both epoch and SMR.
Details of its usage are provided in the SMR manual page.
Reviewed by: Olivier Certner, mhorne, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38108
Andrew Turner [Mon, 20 Feb 2023 12:22:11 +0000 (12:22 +0000)]
When saving a context on arm call the vfp handler
When adding kernel VFP support on arm a comparison instruction was
removed, however the branch to vfp_save_state was still conditional.
Remove the conditional check and always call into vfp_save_state as
it could cause unexpected results otherwise.
In the manual page for zmore(1) and zless(1) it is said that zless(1) is
equivalent to zmore(1) except that it uses less(1) as a pager. However
zmore(1) is able to handle uncompressed files transparently while zless(1)
is not.
Add another case to the switch in lesspipe.sh to handle non-compressed files.