Mark Johnston [Mon, 16 Aug 2021 17:15:25 +0000 (13:15 -0400)]
sigtimedwait: Use a unique wait channel for sleeping
When a sigtimedwait(2) caller goes to sleep, it uses a wait channel of
p->p_sigacts with the proc lock as the interlock. However, p_sigacts
can be shared between processes if a child is created with
rfork(RFSIGSHARE | RFPROC). Thus we can end up with two threads
sleeping on the same wait channel using different locks, which is not
permitted.
Fix the problem simply by using a process-unique wait channel, following
the example of sigsuspend. The actual wait channel value is irrelevant
here, sleeping threads are awoken using sleepq_abort().
John Baldwin [Mon, 16 Aug 2021 17:42:46 +0000 (10:42 -0700)]
ktls: Fix accounting for TLS 1.0 empty fragments.
TLS 1.0 empty fragment mbufs have no payload and thus m_epg_npgs is
zero. However, these mbufs need to occupy a "unit" of space for the
purposes of M_NOTREADY tracking similar to regular mbufs. Previously
this was done for the page count returned from ktls_frame() and passed
to ktls_enqueue() as well as the page count passed to pru_ready().
However, sbready() and mb_free_notready() only use m_epg_nrdy to
determine the number of "units" of space in an M_EXT mbuf, so when a
TLS 1.0 fragment was marked ready it would mark one unit of the next
mbuf in the socket buffer as ready as well. To fix, set m_epg_nrdy to
1 for empty fragments. This actually simplifies the code as now only
ktls_frame() has to handle TLS 1.0 fragments explicitly and the rest
of the KTLS functions can just use m_epg_nrdy.
Dimitry Andric [Mon, 16 Aug 2021 16:56:41 +0000 (18:56 +0200)]
Apply upstream lldb fix for unhandled Error causing abort
Merge commit 5033f0793fe6 from llvm git (by Dimitry Andric):
[lldb] Avoid unhandled Error in TypeSystemMap::GetTypeSystemForLanguage
When assertions are turned off, the `llvm::Error` value created at the
start of this function is overwritten using the move-assignment
operator, but the success value is never checked. Whenever a TypeSystem
cannot be found or created, this can lead to lldb core dumping with:
Program aborted due to an unhandled Error:
Error value was Success. (Note: Success values must still be checked prior to being destroyed).
Fix this by not creating a `llvm::Error` value in advance, and directly
returning the result of `llvm::make_error` instead, whenever an error is
encountered.
See also: <https://bugs.freebsd.org/253881> and
<https://bugs.freebsd.org/257829>.
Ed Maste [Thu, 25 Jun 2020 00:42:10 +0000 (20:42 -0400)]
bxe: tag files to skip clang-format formatting
bxe contains three files which are sets of constants or other data, and
might be auto-generated or have an upstream. They are rather large
files and clang-format takes quite some time when run against them, so
just skip formatting.
Fangrui Song [Sun, 15 Aug 2021 04:13:33 +0000 (07:13 +0300)]
rtld: Switch to the standard symbol lookup behavior if LD_DYNAMIC_WEAK is set
The current lookup prefers a strong definition to a STB_WEAK definition
(similar to glibc pre-2.2 behavior) which does not conform to the ELF
specification.
The non-compliant behavior provoked https://reviews.llvm.org/D4418
which was intended to fix -shared-libasan but introduced
new problems (and caused some sanitizer tests (e.g.
test/asan/TestCases/interception_failure_test.cpp) to fail): sanitizer
interceptors are STB_GLOBAL instead of STB_WEAK, so defining a second
STB_GLOBAL interceptor can lead to a multiple definition linker error.
For example, in a -fsanitize={address,memory,...} build, libc functions
like malloc/free/strtol/... cannot be provided by user object files.
See
https://docs.freebsd.org/cgi/getmsg.cgi?fetch=16483939+0+archive/2014/freebsd-current/20140716.freebsd-current
for discussions.
This patch implements the ELF-compliant behavior when LD_DYNAMIC_WEAK is
set. STB_WEAK wrestling in symbol lookups in `Search the dynamic linker
itself` are untouched.
For a Variant II architecture, the TP offset of a TLS symbol is st_value -
tlsoffset + r_addend. tlsoffset is computed by either calculate_tls_offset
or calculate_first_tls_offset.
The return value of calculate_first_tls_offset is the smallest integer
satisfying res >= size and (-res) % p_align = p_vaddr % p_align
(= p_offset % p_align). (The formula is a bit contrived. The basic idea
is to subtract the minimum integer from size + align - 1 so that the result
ihas the expected remainder.)
Kyle Evans [Fri, 26 Jun 2020 02:38:27 +0000 (21:38 -0500)]
domain: make it safer to add domains post-domainfinalize
I can see two concerns for adding domains after domainfinalize:
1.) The slow/fast callouts have already been setup.
2.) Userland could create a socket while we're in the middle of
initialization.
We can address #1 fairly easily by tracking whether the domain's been
initialized for at least the default vnet. There are still some concerns
about the callbacks being invoked while a vnet is in the process of
being created/destroyed, but this is a pre-existing issue that the
callbacks must coordinate anyways.
We should also address #2, but technically this has been an issue
anyways because we don't assert on post-domainfinalize additions; we
don't seem to hit it in practice.
Future work can fix that up to make sure we don't find partially
constructed domains, but care must be taken to make sure that at least,
e.g., the usages of pffindproto in ip_input.c can still find them.
Kyle Evans [Fri, 26 Jun 2020 02:17:47 +0000 (21:17 -0500)]
domain: give domains a chance to probe for availability
This gives any given domain a chance to indicate that it's not actually
supported on the current system. If dom_probe isn't supplied, we assume
the domain is universally applicable as most of them are. Keeping
fully-initialized and registered domains around that physically can't
work on a large majority of FreeBSD deployments is sub-optimal and leads
to errors that aren't consistent with the reality of why the socket
can't be created (e.g. ESOCKTNOSUPPORT) because such scenario has to be
caught upon pru_attach, at which point kicking back the more-appropriate
EAFNOSUPPORT would seem weird.
The initial consumer of this will be hvsock, which is only available on
HyperV guests.
Reviewed by: cem (earlier version), bcr (manpages)
Differential Revision: https://reviews.freebsd.org/D25062
Kornel Duleba [Mon, 16 Aug 2021 04:26:50 +0000 (06:26 +0200)]
tpm_tis: Improve interrupt allocation
Validate the irq received from ACPI. Test if it works by sending a simple command and checking if the interrupt handler was executed.
Internal buffer allocation was moved away from common code to tis and crb parts - in order to test the interrupt we need to have it allocated early.
Adam Fenn [Sat, 7 Aug 2021 20:01:46 +0000 (13:01 -0700)]
pvclock: Add 'struct pvclock' API
Consolidate more hypervisor-agnostic functionality behind a new 'struct
pvclock' API.
This should also make it easier to subsequently add hypervisor-agnostic
vDSO timekeeping support.
Also, perform some clean-up:
- Remove 'pvclock_get_last_cycles()'; do not allow external access
to 'pvclock_last_systime' since this is not necessary.
- Consolidate/simplify wall and system time reading codepaths.
- Ensure correct ordering within wall and system time reading
codepaths via 'atomic(9)' and 'rdtsc_ordered()' rather than via
'rmb()'.
- Remove some extra newlines.
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31418
fstatat(2): handle non-vnode file descriptors for AT_EMPTY_PATH
Set NIRES_EMPTYPATH earlies, to have use of EMPTYPATH recorded even if
we are going to return error. When namei_setup() refused to accept dirfd,
which is not of the vnode type, and indicated by ENOTDIR error return,
fall back to kern_fstat(dirfd).
Reported by: dchagin
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31530
ufs rename: ensure that the result of ufs_checkpath() is stable
ufs_rename() calls ufs_checkpath() to ensure that the target directory
is not a child of the source. If not, rename would create a loop.
For instance:
source->X1->X2->target
and if source moved under target, we get corrupted filesystem.
Suppose that we initially have
source->X1 .... and X2->target
where X1 is not on path from root to X2. Then ufs_checkpath() accepts
the inodes, but there is nothing preventing parallel rename of X2 to become
under X1, after checkpath finished.
Ensure stability of ufs_checkpath() result by taking a per-mount sx in
ufs_rename right before ufs_checkpath() and till the end.
Reviewed by: chs, mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Mark Johnston [Fri, 13 Aug 2021 13:52:05 +0000 (09:52 -0400)]
arc4random: Avoid KMSAN false positives from pre-seeding results
If code calls arc4random(), and our RNG is not yet seeded and
random_bypass_before_seeding is true, we'll compute a key using the
SHA256 hash of some hopefully hard-to-predict data, including the
contents of an uninitialized stack buffer (which is also the output
buffer).
When KMSAN is enabled, this use of uninitialized state propagtes through
to the arc4random() output, resulting in false positives. To address
this, lie to KMSAN and explicitly mark the buffer as initialized.
Reviewed by: cem (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31510
lld rounds up p_memsz(PT_GNU_RELRO) to satisfy common-page-size. If the
page size is smaller than common-page-size, rounding up relro_size may
incorrectly make some RW pages read-only.
GNU ld, gold, and ld.lld ensures p_vaddr+p_memsz is a multiple of
common-page-size. While max-page-size >= system the page size,
common-page-size can be smaller than the system page size.
This is a new major release with a number of changes and extensions:
- Limited the number of temporary numbers and made the space for them
static so that allocating more space for them cannot fail.
- Allowed integers with non-zero scale to be used with power, places,
and shift operators.
- Added greatest common divisor and least common multiple to lib2.bc.
- Made bc and dc UTF-8 capable.
- Added the ability for users to have bc and dc quit on SIGINT.
- Added the ability for users to disable prompt and TTY mode by
environment variables.
- Added the ability for users to redefine keywords.
- Added dc's modular exponentiation and divmod to bc.
- Added the ability to assign strings to variables and array elements
and pass them to functions in bc.
- Added dc's asciify command and stream printing to bc.
- Added bitwise and, or, xor, left shift, right shift, reverse,
left rotate, right rotate, and mod functions to lib2.bc.
- Added the functions s2u(x) and s2un(x,n), to lib2.bc.
Kornel Duleba [Fri, 13 Aug 2021 07:35:08 +0000 (09:35 +0200)]
ipsec: Return error code if no matching SA was found
If we matched SP to a packet, but no associated SA was found
ipsec4_allocsa will return NULL while setting error=0.
This resulted in use after free and potential kernel panic.
Return EINPROGRESS if the case described above instead.
Rick Macklem [Thu, 12 Aug 2021 23:48:28 +0000 (16:48 -0700)]
nfsd: Fix sanity check for NFSv4.2 Allocate operations
The NFSv4.2 Allocate operation sanity checks the aa_offset
and aa_length arguments. Since they are assigned to variables
of type off_t (signed) it was possible for them to be negative.
It was also possible for aa_offset+aa_length to exceed OFF_MAX
when stored in lo_end, which is uint64_t.
This patch adds checks for these cases to the sanity check.
Jessica Clarke [Thu, 12 Aug 2021 22:50:48 +0000 (23:50 +0100)]
tools/build/cross-build: Fix building libllvmminimal on Linux
There is a __used member in glibc's posix_spawn_file_actions_t in
spawn.h, so we must temporarily undefine __used when including it,
otherwise Support/Unix/Program.inc fails to build. This is based on
similar handling for __unused in other headers.
Fixes: 31ba4ce8898f ("Allow bootstrapping llvm-tblgen on macOS and Linux")
MFC after: 1 week
Jessica Clarke [Thu, 12 Aug 2021 22:45:09 +0000 (23:45 +0100)]
bsd.compiler.mk: Fix cross-building from non-FreeBSD
On non-FreeBSD, the various MACHINE variables for the host when
bootstrapping can be missing or not match FreeBSD's naming, causing
bsd.endian.mk to be unable to infer the endianness. Work around this by
assuming it's unsupported.
Note that we can't check BOOTSTRAPPING here as Makefile.inc1 includes
bsd.compiler.mk before that is set, and so we are unable to catch errors
during buildworld itself when cross-building and bsd.endian.mk failed,
but such errors should also show up when building on FreeBSD.
Fixes: 47363e99d3d3 ("Enable compressed debug on little-endian targets")
John Baldwin [Thu, 12 Aug 2021 15:48:14 +0000 (08:48 -0700)]
cxgbei: Wait for the final CPL to be received in icl_cxgbei_conn_close.
A socket in the FIN_WAIT_1 state is marked disconnected by
do_close_con_rpl() even though there might still receive data pending.
This is because the socket at that point has set SBS_CANTRCVMORE which
causes the protocol layer to discard any data received before the FIN.
However, icl_cxgbei_conn_close needs to wait until all the data has
been discarded. Replace the wait for SS_ISDISCONNECTED with instead
waiting for final_cpl_received() to be called.
Ka Ho Ng [Thu, 12 Aug 2021 15:01:02 +0000 (23:01 +0800)]
uipc_shm: Implements fspacectl(2) support
This implements fspacectl(2) support on shared memory objects. The
semantic of SPACECTL_DEALLOC is equivalent to clearing the backing
store and free the pages within the affected range. If the call
succeeds, subsequent reads on the affected range return all zero.
tests/sys/posixshm/posixshm_tests.c is expanded to include a
fspacectl(2) functional test.
Sponsored by: The FreeBSD Foundation
Reviewed by: kevans, kib
Differential Revision: https://reviews.freebsd.org/D31490
Ka Ho Ng [Thu, 12 Aug 2021 14:58:52 +0000 (22:58 +0800)]
vfs: Add ioflag to VOP_DEALLOCATE(9)
The addition of ioflag allows callers passing
IO_SYNC/IO_DATASYNC/IO_DIRECT down to the file system implementation.
The vop_stddeallocate fallback implementation is updated to pass the
ioflag to the file system implementation. vn_deallocate(9) internally is
also changed to pass ioflag to the VOP_DEALLOCATE call.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D31500
Cy Schubert [Thu, 12 Aug 2021 13:38:21 +0000 (06:38 -0700)]
wpa: Add wpa_cli action file event
Yan Zhong at FreeBSD Foundation is working on a wireless network
configuratior for an experimental FreeBSD installer. The new installer
requires an event to detect when connecting to a network fails due to a
bad password. When this happens a WPA-EVENT-TEMP-DISABLED event is
triggered. This patch passes the event to an action file provided by
the new experimental installer.
Submitted by: Yang Zhong <yzhong () freebsdfoundation.org>
Reviewed by: assumed to be reviewed by emaste (and cy)
MFC after: 1 week
Dmitry Chagin [Thu, 12 Aug 2021 08:49:01 +0000 (11:49 +0300)]
linux(4): Add struct clone_args for future clone3 system call.
In preparation for clone3 system call add struct clone_args and use it in
clone implementation.
Move all of clone related bits to the newly created linux_fork.h header.
Dmitry Chagin [Thu, 12 Aug 2021 08:45:25 +0000 (11:45 +0300)]
fork: Allow ABI to specify fork return values for child.
At least Linux x86 ABI's does not use carry bit and expects that the dx register
is preserved. For this add a new sv_set_fork_retval hook and call it from cpu_fork().
Add a short comment about touching dx in x86_set_fork_retval(), for more details
see phab comments from kib@ and imp@.