Rick Macklem [Sun, 12 Dec 2021 23:40:30 +0000 (15:40 -0800)]
nfscl: Fix must_commit handling for mirrored pNFS mounts
For pNFS mounts to mirrored Flexible File layout pNFS servers,
the "must_commit" component in the nfsclwritedsdorpc
structure must be checked and the "must_commit" argument passed
into nfscl_doiods() must be updated. Technically, only writes to
the DS with a writeverf change must be redone, but since this
occurrence will be rare, the must_commit argument to nfscl_doiosd()
is set to 1, so all writes to all DSs will be redone.
This bug would affect few, since use of mirrored pNFS servers
is rare and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots.
Dimitry Andric [Sun, 12 Dec 2021 20:11:40 +0000 (21:11 +0100)]
Revert clang change that breaks CTF on aarch64
Revert commit e655e74a318e from llvm git (by Peter Collingbourne):
AST: Create __va_list in the std namespace even in C.
This ensures that the mangled type names match between C and C++,
which is significant when using -fsanitize=cfi-icall. Ideally we
wouldn't have created this namespace at all, but it's now part of
the ABI (e.g. in mangled names), so we can't change it.
As reported by Jessica in https://reviews.llvm.org/D104830#3129527, this
upstream change is implemented in such a way that it breaks DTrace's
CTF. Since a proper fix has not yet been forthcoming, and we are
unaffected by the (CFI-related) problem upstream was trying to address,
revert the change for now.
Rebecca Cran [Sun, 28 Nov 2021 16:34:33 +0000 (09:34 -0700)]
bhyve: Support a _VARS.fd file for bootrom
OVMF creates two separate .fd files, a _CODE.fd file containing
the UEFI code, and a _VARS.fd file containing a template of an
empty UEFI variable store.
OVMF decides to write variables to the memory range just below the
boot rom code if it detects a CFI flash device. So here we add
just the barest facsimile of CFI command handling to bootrom.c
that is needed to placate OVMF.
Submitted by: D Scott Phillips <d.scott.phillips@intel.com>
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D19976
MFC After: 1 week
Only accept at most superpage alignment, or if the arch does not have
superpages supported, artificially limit it to PAGE_SIZE * 1024.
This is somewhat arbitrary, and e.g. could change what binaries do
we accept between native i386 vs. amd64 ia32 with superpages disabled,
but I do not believe the difference there is affecting anybody with
real (useful) binaries.
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33359
Invalid (artificial) layout of the loadable ELF segments might result in
triggering the assertion. This means that the file should not be
executed, regardless of the kernel debug mode. Change calling
conventions for rnd_elf{32,64} helpers to allow returning an error, and
abort activation with ENOEXEC if its invariants are broken.
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33359
Rick Macklem [Sat, 11 Dec 2021 23:00:30 +0000 (15:00 -0800)]
nfscl: Fix must_commit/writeverf handling for Direct I/O
Without this patch, the KASSERT(must_commit == 0,..) can be
triggered by the writeverf in the Direct I/O write reply changing.
This is not a situation that should cause a panic(). Correct
handling is to ignore the change in "writeverf" for Direct
I/O, since it is done with NFSWRITE_FILESYNC.
This patch modifies the semantics of the "must_commit"
argument slightly, allowing an initial value of 2 to indicate
that a change in "writeverf" should be ignored.
It also fixes the KASSERT()s.
This bug would affect few, since Direct I/O is not enabled
by default and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots, however I found the
bug when testing against a Linux 5.15.1 kernel nfsd, which
replied to a NFSWRITE_FILESYNC write with a "writeverf" of all
0x0 bytes.
Alexander Motin [Sat, 11 Dec 2021 04:18:52 +0000 (23:18 -0500)]
Make msgbuf_peekbytes() not return leading zeroes.
Introduce new MSGBUF_WRAP flag, indicating that buffer has wrapped
at least once and does not keep zeroes from the last msgbuf_clear().
It allows msgbuf_peekbytes() to return only real data, not requiring
every consumer to trim the leading zeroes after doing pointless copy.
The most visible effect is that kern.msgbuf sysctl now always returns
proper zero-terminated string, not only after the first buffer wrap.
Warner Losh [Fri, 10 Dec 2021 18:04:48 +0000 (11:04 -0700)]
stand: remove mips support
As part of decommissioning mips support, remove the boot loader
support. Do this in advance of other boot loader work to limit the
amount of work that will be thrown away.
Florian Walpen [Fri, 10 Dec 2021 01:35:38 +0000 (03:35 +0200)]
Add idle priority scheduling privilege group to MAC/priority
Add an idletime user group that allows non-root users to run processes
with idle scheduling priority. Privileges are granted by a MAC policy in
the mac_priority module. For this purpose, the kernel privilege
PRIV_SCHED_IDPRIO was added to sys/priv.h (kernel module ABI change).
Deprecate the system wide sysctl(8) knob
security.bsd.unprivileged_idprio which lets any user run idle priority
processes, regardless of context. While the knob is still working, it is
marked as deprecated in the description and in the man pages.
Warner Losh [Thu, 9 Dec 2021 23:52:55 +0000 (16:52 -0700)]
hda: Push giant / newbus topo lock down a layer
Rather than picking up Giant at the start of these sysctl handlers, push
acquiring Giant down a layer to protect walking the children and also
to add/delete children for the reconfigure.
Warner Losh [Thu, 9 Dec 2021 23:52:44 +0000 (16:52 -0700)]
newbus: add bus_topo_assert
Add bus_topo_assert() and implmement it as GIANT_REQUIRED for the
moment. This will allow us to change more easily to a newbus-specific
lock int he future.
Warner Losh [Fri, 10 Dec 2021 00:04:45 +0000 (17:04 -0700)]
Create wrapper for Giant taken for newbus
Create a wrapper for newbus to take giant and for busses to take it too.
bus_topo_lock() should be called before interacting with newbus routines
and unlocked with bus_topo_unlock(). If you need the topology lock for
some reason, bus_topo_mtx() will provide that.
John Baldwin [Thu, 9 Dec 2021 21:17:54 +0000 (13:17 -0800)]
TLS: Use <machine/tls.h> for libc and rtld.
- Include <machine/tls.h> in MD rtld_machdep.h headers.
- Remove local definitions of TLS_* constants from rtld_machdep.h
headers and libc using the values from <machine/tls.h> instead.
- Use _tcb_set() instead of inlined versions in MD
allocate_initial_tls() routines in rtld. The one exception is amd64
whose _tcb_set() invokes the amd64_set_fsbase ifunc. rtld cannot
use ifuncs, so amd64 inlines the logic to optionally write to fsbase
directly.
- Use _tcb_set() instead of _set_tp() in libc.
- Use '&_tcb_get()->tcb_dtv' instead of _get_tp() in both rtld and libc.
This permits removing _get_tp.c from rtld.
- Use TLS_TCB_SIZE and TLS_TCB_ALIGN with allocate_tls() in MD
allocate_initial_tls() routines in rtld.
John Baldwin [Thu, 9 Dec 2021 21:17:41 +0000 (13:17 -0800)]
libthr: Use <machine/tls.h> for most MD TLS details.
Note that on amd64 this effectively removes the unused tcb_spare field
from the end of struct tcb since the definition of struct tcb in
<x86/tls.h> does not include that field.
Reviewed by: kib, jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33352
John Baldwin [Thu, 9 Dec 2021 21:17:13 +0000 (13:17 -0800)]
Add <machine/tls.h> header to hold MD constants and helpers for TLS.
The header exports the following:
- Definition of struct tcb.
- Helpers to get/set the tcb for the current thread.
- TLS_TCB_SIZE (size of TCB)
- TLS_TCB_ALIGN (alignment of TCB)
- TLS_VARIANT_I or TLS_VARIANT_II
- TLS_DTV_OFFSET (bias of pointers in dtv[])
- TLS_TP_OFFSET (bias of "thread pointer" relative to TCB)
Note that TLS_TP_OFFSET does not account for if the unbiased thread
pointer points to the start of the TCB (arm and x86) or the end of the
TCB (MIPS, PowerPC, and RISC-V).
Note also that for amd64, the struct tcb does not include the unused
tcb_spare field included in the current structure in libthr. libthr
does not use this field, and the existing calls in libc and rtld that
allocate a TCB for amd64 assume it is the size of 3 Elf_Addr's (and
thus do not allocate room for tcb_spare).
A <sys/_tls_variant_i.h> header is used by architectures using
Variant I TLS which uses a common struct tcb.
Reviewed by: kib (older version of x86/tls.h), jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33351
John Baldwin [Thu, 9 Dec 2021 21:16:19 +0000 (13:16 -0800)]
mips: Add TLS_DTV_OFFSET to the result of tls_get_addr_common.
Previously TLS_DTV_OFFSET was added to the offset passed to
tls_get_addr_common; however, this approach matches powerpc and RISC-V
and better matches the intention.
Reviewed by: kib, jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33347
John Baldwin [Thu, 9 Dec 2021 21:16:00 +0000 (13:16 -0800)]
mips: Rename TLS_DTP_OFFSET to TLS_DTV_OFFSET.
This is the more standard name for the bias of dtv pointers used on
other platforms. This also fixes a few other places that were using
the wrong bias previously on MIPS such as dlpi_tls_data in struct
dl_phdr_info and the recently added __libc_tls_get_addr().
Reviewed by: kib, jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33346
John Baldwin [Thu, 9 Dec 2021 21:15:38 +0000 (13:15 -0800)]
libthr: Remove the DTV_OFFSET macro.
This macro is confusing as it is not related to the similarly named
TLS_DTV_OFFSET. Instead, replace its one use with the desired
expression which is the same on all platforms.
Reviewed by: kib, emaste, jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33345
John Baldwin [Thu, 9 Dec 2021 19:52:43 +0000 (11:52 -0800)]
cryptosoft: Stop single-threading requests within a session.
All of the request handlers no longer modify session state, so remove
the mutex limiting operations to one per session. In addition, change
the pointer to the session state passed to process callbacks to const.
Suggested by: mjg
Reviewed by: mjg, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33317
John Baldwin [Thu, 9 Dec 2021 19:52:42 +0000 (11:52 -0800)]
cryptosoft: Fully support per-operation keys for auth algorithms.
Only pre-allocate auth contexts when a session-wide key is provided or
for sessions without keys. For sessions with per-operation keys,
always initialize the on-stack context directly rather than
initializing the session context in swcr_authprepare (now removed) and
then copying that session context into the on-stack context.
This approach permits parallel auth operations without needing a
serializing lock. In addition, the previous code assumed that auth
sessions always provided an initial key unlike cipher sessions which
assume either an initial key or per-op keys.
While here, fix the Blake2 auth transforms to function like other auth
transforms where Setkey is invoked after Init rather than before.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33316
John Baldwin [Thu, 9 Dec 2021 19:52:42 +0000 (11:52 -0800)]
cryptosoft: Allocate cipher contexts on the stack during operations.
As is done with authentication contexts, allocate cipher contexts on
the stack while completing requests. This permits safely dispatching
concurrent requests on a single session. The cipher context in the
session is now only allocated when a session key is provided during
session setup to serve as a template to initialize the on-stack
context similar to auth operations.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33198
John Baldwin [Thu, 9 Dec 2021 19:52:42 +0000 (11:52 -0800)]
crypto: Refactor software support for AEAD ciphers.
Extend struct enc_xform to add new members to handle auth operations
for AEAD ciphers. In particular, AEAD operations in cryptosoft no
longer use a struct auth_hash. Instead, the setkey and reinit methods
of struct enc_xform are responsible for initializing both the cipher
and auth state.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33196
John Baldwin [Thu, 9 Dec 2021 19:52:42 +0000 (11:52 -0800)]
GMAC: Reset initial hash value and counter in AES_GMAC_Reinit().
Previously, these values were only cleared in AES_GMAC_Init(), so a
second set of operations could reuse the final hash as the initial
hash. Currently this bug does not trigger in cryptosoft as existing
GMAC and GCM operations always use an on-stack auth context
initialized from a template context.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33315
John Baldwin [Thu, 9 Dec 2021 19:52:41 +0000 (11:52 -0800)]
crypto: Don't assert for empty output buffers.
It is always valid for crp_payload_output_start to be 0. However, if
an output buffer is empty (e.g. a decryption request with a tag but an
empty payload), the existing assertion failed since 0 is not less than
0.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33193