Alexander Motin [Wed, 17 Mar 2021 14:30:40 +0000 (10:30 -0400)]
nvme: Replace potentially long DELAY() with pause().
In some cases like broken hardware nvme(4) may wait minutes for
controller response before timeout. Doing so in a tight spin loop
made whole system unresponsive.
Kyle Evans [Tue, 16 Mar 2021 02:38:23 +0000 (21:38 -0500)]
base: remove if_wg(4) and associated utilities, manpage
After length decisions, we've decided that the if_wg(4) driver and
related work is not yet ready to live in the tree. This driver has
larger security implications than many, and thus will be held to
more scrutiny than other drivers.
Please also see the related message sent to the freebsd-hackers@
and freebsd-arch@ lists by Kyle Evans <kevans@FreeBSD.org> on
2021/03/16, with the subject line "Removing WireGuard Support From Base"
for additional context.
Daniel Engerg [Wed, 17 Mar 2021 14:00:57 +0000 (15:00 +0100)]
Remove tmpfs size and properly format generated fstab for arm
Remove tmpfs size limitation, this breaks make installworld and installation of some packages
Format generated fstab using tabs to make it consistent and readable
Cy Schubert [Wed, 17 Mar 2021 00:06:17 +0000 (17:06 -0700)]
wpa: import fix for P2P provision discovery processing vulnerability
Latest version available from: https://w1.fi/security/2021-1/
Vulnerability
A vulnerability was discovered in how wpa_supplicant processes P2P
(Wi-Fi Direct) provision discovery requests. Under a corner case
condition, an invalid Provision Discovery Request frame could end up
reaching a state where the oldest peer entry needs to be removed. With
a suitably constructed invalid frame, this could result in use
(read+write) of freed memory. This can result in an attacker within
radio range of the device running P2P discovery being able to cause
unexpected behavior, including termination of the wpa_supplicant process
and potentially code execution.
Vulnerable versions/configurations
wpa_supplicant v1.0-v2.9 with CONFIG_P2P build option enabled
An attacker (or a system controlled by the attacker) needs to be within
radio range of the vulnerable system to send a set of suitably
constructed management frames that trigger the corner case to be reached
in the management of the P2P peer table.
These ioctl commands aim to provide easier ways for user space
applications to enumerate existing audio devices and the node they can
potentially use.
The exchange of device lists between user space and kernel is done on
nv(9). Some ioctl commands are added to /dev/sndstat node:
- SNDSTAT_REFRESH_DEVS
- SNDSTAT_GET_DEVS
- SNDSTAT_ADD_USER_DEVS
- SNDSTAT_FLUSH_USER_DEVS
Bump __FreeBSD_version to reflect the addition of the ioctls.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D26884
Jung-uk Kim [Tue, 16 Mar 2021 18:16:10 +0000 (14:16 -0400)]
pkgbase: Fix building out-of-tree manual pages
c7e6cb9e08d6 introduced MK_MANSPLITPKG but it was not available for
building out-of-tree manual pages. For example, x11/nvidia-driver fails
with the following error:
Kristof Provost [Thu, 11 Mar 2021 10:37:05 +0000 (11:37 +0100)]
pf: pool/kpool conversion code
stuct pf_pool and struct pf_kpool are different. We should not simply
bcopy() them.
Happily it turns out that their differences were all pointers, and the
userspace provided pointers were overwritten by the kernel, so this did
actually work correctly, but we should fix it anyway.
Emmanuel Vadot [Tue, 16 Mar 2021 06:11:56 +0000 (07:11 +0100)]
pkgbase: Add an src.conf option for splitting man pages
Man pages can be big in total, add an options to split man pages
in -man packages so we produce smaller packages.
This is useful for small jails or mfsroot produced of pkgbase.
The option is off by default.
Emmanuel Vadot [Tue, 16 Mar 2021 06:12:56 +0000 (07:12 +0100)]
pkgbase: Remove case for runtime and jail package ucl generation
They aren't needed and produce wrong package comments :
We use to have "runtime-dev package" instead of
"FreeBSD Base System (Development Files)" for example
Emmanuel Vadot [Tue, 16 Mar 2021 06:12:53 +0000 (07:12 +0100)]
include: Remove symlink installation
headers could be installed as symlink to the source tree instead of copies.
Remove the possibility to do that.
This make the makefile easier to read and to maintain and also don't duplicate
code.
While here remove some directories from LSBUDIRS as we already install them using
the INCS stuff.
As follow-on work to e4b8deb222278b2a, move page table page
allocation and freeing into their own functions. Use these
functions to provide separate kernel vs. user page table page
accounting, and to wrap common tasks such as management of
zero-filled page state.
netmap: fix memory leak in NETMAP_REQ_PORT_INFO_GET
The netmap_ioctl() function has a reference counting bug in case of
NETMAP_REQ_PORT_INFO_GET command. When `hdr->nr_name[0] == '\0'`,
the function does not decrease the refcount of "nmd", which is
increased by netmap_mem_find(), causing a refcount leak.
Reported by: Xiyu Yang <sherllyyang00@gmail.com>
Submitted by: Carl Smith <carl.smith@alliedtelesis.co.nz>
MFC after: 3 days
PR: 254311
pkg(7): when bootstrapping first search for pkg.bsd file then pkg.txz
The package extension is going to be changed to .bsd to be among other
things resilient to the change of compression format used and reduce
the impact of all third party tool of that change.
Ensure the bootstrap knows about it
Reviewed by: manu
Differential revision: https://reviews.freebsd.org/D29232
Kyle Evans [Mon, 15 Mar 2021 12:31:09 +0000 (07:31 -0500)]
if_wg: fix build with DIAGNOSTICS
This file got resynced with OpenBSD to pick up fixes that had taken
place after the version initially ported to FreeBSD. KASSERT there is
more like MPASS here.
Reported by: David Wolfskill <david@catwhisker.org>
Kyle Evans [Mon, 15 Mar 2021 02:25:40 +0000 (21:25 -0500)]
if_wg: import latest fixup work from the wireguard-freebsd project
This is the culmination of about a week of work from three developers to
fix a number of functional and security issues. This patch consists of
work done by the following folks:
- Jason A. Donenfeld <Jason@zx2c4.com>
- Matt Dunwoodie <ncon@noconroy.net>
- Kyle Evans <kevans@FreeBSD.org>
Notable changes include:
- Packets are now correctly staged for processing once the handshake has
completed, resulting in less packet loss in the interim.
- Various race conditions have been resolved, particularly w.r.t. socket
and packet lifetime (panics)
- Various tests have been added to assure correct functionality and
tooling conformance
- Many security issues have been addressed
- if_wg now maintains jail-friendly semantics: sockets are created in
the interface's home vnet so that it can act as the sole network
connection for a jail
- if_wg no longer fails to remove peer allowed-ips of 0.0.0.0/0
- if_wg now exports via ioctl a format that is future proof and
complete. It is additionally supported by the upstream
wireguard-tools (which we plan to merge in to base soon)
- if_wg now conforms to the WireGuard protocol and is more closely
aligned with security auditing guidelines
Note that the driver has been rebased away from using iflib. iflib
poses a number of challenges for a cloned device trying to operate in a
vnet that are non-trivial to solve and adds complexity to the
implementation for little gain.
The crypto implementation that was previously added to the tree was a
super complex integration of what previously appeared in an old out of
tree Linux module, which has been reduced to crypto.c containing simple
boring reference implementations. This is part of a near-to-mid term
goal to work with FreeBSD kernel crypto folks and take advantage of or
improve accelerated crypto already offered elsewhere.
There's additional test suite effort underway out-of-tree taking
advantage of the aforementioned jail-friendly semantics to test a number
of real-world topologies, based on netns.sh.
Also note that this is still a work in progress; work going further will
be much smaller in nature.
Now, try this again, but with a 512-byte type-ahead buffer. While there
is buffer space, control input is handled and non-control input is
buffered. When the buffer is exhausted, the default is to print a
warning and drop further non-control input in order to continue handling
control input. sysctl debug.ddb.prioritize_control_input can be set to
0 to instead preserve all input but lose immediate handling of control
input. This could for example effect pasting of a large script into the
ddb console.
Suggested by: Anton Rang <rang@acm.org>
Reviewed by: markj
Discussed with: imp
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D28676
Ryan Libby [Sun, 14 Mar 2021 23:04:27 +0000 (16:04 -0700)]
kobj: avoid gcc -Wcast-function-type
The actual type of kobjop_t is arbitrary, it is only used as a generic
function pointer type. Declare it as void (*)(void) in order to avoid
gcc's -Wcast-function-type, which is included in -Wextra.
Notable upstream pull request merges:
#11153 Scalable teardown lock for FreeBSD
#11651 Don't bomb out when using keylocation=file://
#11667 zvol: call zil_replaying() during replay
#11683 abd_get_offset_struct() may allocate new abd
#11693 Intentionally allow ZFS_READONLY in zfs_write
#11716 zpool import cachefile improvements
#11720 FreeBSD: Clean up zfsdev_close to match Linux
#11730 FreeBSD: bring back possibility to rewind the
checkpoint from bootloader
Gordon Bergling [Sat, 13 Mar 2021 18:28:26 +0000 (19:28 +0100)]
ls(1): Refine the HISTORY within the manual page.
A simple find command appeared in Version 1 AT&T UNIX and was removed in
Version 3 AT&T UNIX. It was rewritten for Version 5 AT&T UNIX and later
be enhanced for the Programmer's Workbench (PWB). These changes were
later incorporated in AT&T UNIX v7.
Notable upstream pull request merges:
#11153 Scalable teardown lock for FreeBSD
#11651 Don't bomb out when using keylocation=file://
#11667 zvol: call zil_replaying() during replay
#11683 abd_get_offset_struct() may allocate new abd
#11693 Intentionally allow ZFS_READONLY in zfs_write
#11716 zpool import cachefile improvements
#11720 FreeBSD: Clean up zfsdev_close to match Linux
#11730 FreeBSD: bring back possibility to rewind the
checkpoint from bootloader
Dimitry Andric [Sat, 13 Mar 2021 13:54:24 +0000 (14:54 +0100)]
Partially revert libcxxrt changes to avoid _Unwind_Exception change
(Note I am also applying this to main and stable/13, to restore the old
libcxxrt ABI and to avoid having to maintain a compat library.)
After the recent cherry-picking of libcxxrt commits 0ee0dbfb0d26 and d2b3fadf2db5, users reported that editors/libreoffice packages from the
official package builders did not start anymore. It turns out that the
combination of these commits subtly changes the ABI, requiring all
applications that depend on internal details of struct _Unwind_Exception
(available via unwind-arm.h and unwind-itanium.h) to be recompiled.
However, the FreeBSD package builders always use -RELEASE jails, so
these still use the old declaration of struct _Unwind_Exception, which
is not entirely compatible. In particular, LibreOffice uses this struct
in its internal "uno bridge" component, where it attempts to setup its
own exception handling mechanism.
To fix this incompatibility, go back to the old declarations of struct
_Unwind_Exception, and restore the __LP64__ specific workaround we had
in place before (which was to cope with yet another, older ABI bug).
Effectively, this reverts upstream libcxxrt commits 88bdf6b290da
("Specify double-word alignment for ARM unwind") and b96169641f79
("Updated Itanium unwind"), and reapplies our commit 3c4fd2463bb2
("libcxxrt: add padding in __cxa_allocate_* to fix alignment").
Mariusz Zaborski [Sat, 13 Mar 2021 11:56:17 +0000 (12:56 +0100)]
zfs: bring back possibility to rewind the checkpoint from
Add parsing of the rewind options.
When I was upstreaming the change [1], I omitted the part where we
detect that the pool should be rewind. When the FreeBSD repo has
synced with the OpenZFS, this part of the code was removed.
Adrian Chadd [Sat, 13 Mar 2021 07:30:25 +0000 (23:30 -0800)]
[ath] do a cold reset if TSFOOR triggers
TSFOOR happens if a beacon with a given TSF isn't received within the
programmed/expected TSF value, plus/minus a fudge range. (OOR == out of range.)
If this happens then it could be because the baseband/mac is stuck, or
the baseband is deaf. So, do a cold reset and resync the beacon to
try and unstick the hardware.
It also happens when a bad AP decides to err, slew its TSF because they
themselves are resetting and they don't preserve the TSF "well."
This has fixed a bunch of weird corner cases on my 2GHz AP radio upstairs
here where it occasionally goes deaf due to how much 2GHz noise is up
here (and ANI gets a little sideways) and this unsticks the station
VAP.
For AP modes a hung baseband/mac usually ends up as a stuck beacon
and those have been addressed for a long time by just resetting the
hardware. But similar hangs in station mode didn't have a similar
recovery mechanism.
Tested:
* AR9380, STA mode, 2GHz/5GHz
* AR9580, STA mode, 5GHz
* QCA9344 SoC w/ on-board wifi (TL-WDR4300/3600 devices); 2GHz
STA mode
Adrian Chadd [Sat, 13 Mar 2021 01:29:09 +0000 (17:29 -0800)]
[ath] validate ts_antenna before updating tx statistics
Right now ts_antenna is either 0 or 1 in each supported HAL so
this is purely a sanity check.
Later on if I ever get magical free time I may add some extensions
for the NICs that can have slightly more complicated antenna switches
for transmit and I'd like this to not bust memory.
Peter Jeremy [Fri, 12 Mar 2021 22:06:04 +0000 (09:06 +1100)]
arm64: Add support for the RK805/RK808 RTC
Implement a driver for the RTC embedded in the RK805/RK808 power
management system used for RK3328 and RK3399 SoCs.
Based on experiments on my RK808, setting the time doesn't alter the
internal/inaccessible sub-second counter, therefore there's no point
in calling clock_schedule().
Based on an earlier revision by andrew.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D22692
Sponsored by: Google
MFC after: 1 week
Warner Losh [Fri, 12 Mar 2021 20:20:52 +0000 (13:20 -0700)]
cam: Run all XPT_ASYNC ccbs in a dedicated thread
Queue all XPT_ASYNC ccb's and run those in a new cam async thread. This thread
is allowed to sleep for things like memory. This should allow us to make all the
registration routines for cam periph drivers simpler since they can assume they
can always allocate memory. This is a separate thread so that any I/O that's
completed in xpt_done_td isn't held up.
This should fix the panics for WAITOK alloations that are elsewhere in the
storage stack that aren't so easy to convert to NOWAIT. Additional future work
will convert other allocations in the registration path to WAITOK should
detailed analysis show it to be safe.
John Baldwin [Fri, 12 Mar 2021 18:35:56 +0000 (10:35 -0800)]
ccr: Disable requests on port 1 when needed to workaround a firmware bug.
Completions for crypto requests on port 1 can sometimes return a stale
cookie value due to a firmware bug. Disable requests on port 1 by
default on affected firmware.
John Baldwin [Fri, 12 Mar 2021 18:35:05 +0000 (10:35 -0800)]
ccr: Set the RX channel ID correctly in work requests.
These fixes are only relevant for requests on the second port. In
some cases, the crypto completion data, completion message, and
receive descriptor could be written in the wrong order.
- Add a separate rx_channel_id that is a copy of the port's rx_c_chan
and use it when an RX channel ID is required in crypto requests
instead of using the tx_channel_id.
- Set the correct rx_channel_id in the CPL_RX_PHYS_ADDR used to write
the crypto result.
- Set the FID to the first rx queue ID on the adapter rather than the
queue ID of the first rx queue for the port.
- While here, use tx_chan to set the tx_channel_id though this is
identical to the previous value.
Apparently GCC only supports arithmetic expressions that use static
const variables in initializers starting with GCC8. To keep older
versions happy use a macro instead.
Fixes: 221622ec0c ("lib/msun: Avoid FE_INEXACT for x86 log2l/log10l")
Reported by: Jenkins
Reviewed By: imp
Differential Revision: https://reviews.freebsd.org/D29233
Fetch the sigfastblock value in syscalls that wait for signals
We have seen several cases of processes which have become "stuck" in
kern_sigsuspend(). When this occurs, the kernel's td_sigblock_val
is set to 0x10 (one block outstanding) and the userspace copy of the
word is set to 0 (unblocked). Because the kernel's cached value
shows that signals are blocked, kern_sigsuspend() blocks almost all
signals, which means the process hangs indefinitely in sigsuspend().
It is not entirely clear what is causing this condition to occur.
However, it seems to make sense to add some protection against this
case by fetching the latest sigfastblock value from userspace for
syscalls which will sleep waiting for signals. Here, the change is
applied to kern_sigsuspend() and kern_sigtimedwait().
John Baldwin [Fri, 12 Mar 2021 17:47:31 +0000 (09:47 -0800)]
amd64: Cleanups to setting TLS registers for Linux binaries.
- Use update_pcb_bases() when updating FS or GS base addresses to
permit use of FSBASE and GSBASE in Linux processes. This also sets
PCB_FULL_IRET. linux32 was setting PCB_32BIT which should be a
no-op (exec sets it).
- Remove write-only variables to construct unused segment descriptors
for linux32.
John Baldwin [Fri, 12 Mar 2021 17:45:18 +0000 (09:45 -0800)]
amd64: Only update fsbase/gsbase in pcb for curthread.
Before the pcb is copied to the new thread during cpu_fork() and
cpu_copy_thread(), the kernel re-reads the current register values in
case they are stale. This is done by setting PCB_FULL_IRET in
pcb_flags.
This works fine for user threads, but the creation of kernel processes
and kernel threads do not follow the normal synchronization rules for
pcb_flags. Specifically, new kernel processes are always forked from
thread0, not from curthread, so adjusting pcb_flags via a simple
instruction without the LOCK prefix can race with thread0 running on
another CPU. Similarly, kthread_add() clones from the first thread in
the relevant kernel process, not from curthread. In practice, Netflix
encountered a panic where the pcb_flags in the first kthread of the
KTLS process were trashed due to update_pcb_bases() in
cpu_copy_thread() running from thread0 to create one of the other KTLS
threads racing with the first KTLS kthread calling fpu_kern_thread()
on another CPU. In the panicking case, the write to update pcb_flags
in fpu_kern_thread() was lost triggering an "Unregistered use of FPU
in kernel" panic when the first KTLS kthread later tried to use the
FPU.
Alex Richardson [Fri, 12 Mar 2021 17:15:00 +0000 (17:15 +0000)]
Allow using sanitizers for ssp tests with out-of-tree compiler
With an out-of-tree Clang, we can use the -resource-dir flag when linking
to point it at the runtime libraries from the current SYSROOT.
This moves the path to the clang-internal library directory to a separate
.mk file that can be used by Makefiles that want to find the sanitizer
libraries. I intend to re-use this .mk file for my upcoming changes that
allow building the entire base system with ASAN/UBSAN/MSAN.
Reviewed By: dim
Differential Revision: https://reviews.freebsd.org/D28852
Alex Richardson [Fri, 12 Mar 2021 17:12:10 +0000 (17:12 +0000)]
capsicum-test: Update for O_BENEATH removal
Update the tests to check O_RESOLVE_BENEATH instead.
If this looks reasonable, I'll try to upstream this change.
This keeps a compat fallback for O_BENEATH since the Linux port still
has/had O_BENEATH with "no .., no absolute paths" semantics.
Test Plan: `/usr/tests/sys/capsicum/capsicum-test -u 977` passes and
runs the O_RESOLVE_BENEATH tests.
Reviewed By: markj
Differential Revision: https://reviews.freebsd.org/D29016
Alex Richardson [Fri, 12 Mar 2021 17:01:37 +0000 (17:01 +0000)]
Save all fpcr/fpsr bits in the AArch64 fenv_t
The existing code masked off all bits that it didn't know about. To be
future-proof, we should save and restore the entire fpcr/fpsr registers.
Additionally, the existing fesetenv() was incorrectly setting the rounding
mode in fpsr instead of fpcr.
This patch stores fpcr in the high 32 bits of fenv_t and fpsr in the low
bits instead of trying to interleave them in a single 32-bit field.
Technically, this is an ABI break if you re-compile parts of your code or
pass a fenv_t between DSOs that were compiled with different versions
of fenv.h. However, I believe we should fix this since the existing code
was broken and passing fenv_t across DSOs should rarely happen.
Reviewed By: andrew
Differential Revision: https://reviews.freebsd.org/D29160
linux(4): make getcwd(2) return ERANGE instead of ENOMEM
For native FreeBSD binaries, the return value from __getcwd(2)
doesn't really matter, as the libc wrapper takes over and returns
the proper errno.
PR: kern/254120
Reported By: Alex S <iwtcex@gmail.com>
Reviewed By: kib
Sponsored By: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29217
Kristof Provost [Wed, 10 Mar 2021 21:56:11 +0000 (22:56 +0100)]
pf: Fully remove interrupt events on vnet cleanup
swi_remove() removes the software interrupt handler but does not remove
the associated interrupt event.
This is visible when creating and remove a vnet jail in `procstat -t
12`.
We can remove it manually with intr_event_destroy().
Kristof Provost [Wed, 10 Mar 2021 14:11:59 +0000 (15:11 +0100)]
uma: allow uma_zfree_pcu(..., NULL)
We already allow free(NULL) and uma_zfree(..., NULL). Make
uma_zfree_pcpu(..., NULL) work as well.
This also means that counter_u64_free(NULL) will work.
null_vput_pair(): release use reference on dvp earlier
We might own the last use reference, and then vrele() at the end would
need to take the dvp vnode lock to inactivate, which causes deadlock
with vp. We cannot vrele() dvp from start since this might unlock ldvp.
Handle it by holding the vnode and dropping use ref after lowerfs
VOP_VPUT_PAIR() ended. This effectivaly requires unlock of the vp vnode
after VOP_VPUT_PAIR(), so the call is changed to set unlock_vp to true
unconditionally. This opens more opportunities for vp to be reclaimed,
if lvp is still alive we reinstantiate vp with null_nodeget().
Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178
vlrureclaim: only skip vnode with resident pages if it own the pages
Nullfs vnode which shares vm_object and pages with the lower vnode should
not be exempt from the reclaim just because lower vnode cached a lot.
Their reclamation is actually very cheap and should be preferred over
real fs vnodes, but this change is already useful.
Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178
FFS: assign fully initialized struct mount_softdeps to um_softdep
Other threads observing the non-NULL um_softdep can assume that it is
safe to use it. This is important for ro->rw remounts where change from
read-only to read-write status cannot be made atomic.
Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178