Simon J. Gerraty [Mon, 12 Feb 2024 22:35:01 +0000 (14:35 -0800)]
libsecureboot do not report expected unverified files
By default only report unverified files at severity VE_WANT
and above. This includes *.conf but not *.hints, *.cookie
or *.tgz which get VE_TRY as their severity.
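The reporting rule above can be sketched roughly as follows. This is only an illustration, not the libsecureboot code: the numeric severity values, the suffix table and the function name are assumptions, and only the VE_TRY / VE_WANT names and the file suffixes come from the text.

```python
# Severity names are from the commit message; the numeric values are
# assumptions chosen only so that VE_TRY sorts below VE_WANT.
VE_TRY = 0
VE_WANT = 1

# Hypothetical suffix-to-severity table following the examples in the text.
SEVERITY_BY_SUFFIX = {
    ".conf": VE_WANT,
    ".hints": VE_TRY,
    ".cookie": VE_TRY,
    ".tgz": VE_TRY,
}


def should_report_unverified(path, threshold=VE_WANT):
    """Return True if an unverified file at `path` would be reported."""
    for suffix, severity in SEVERITY_BY_SUFFIX.items():
        if path.endswith(suffix):
            return severity >= threshold
    # Assumption for this sketch: unknown file types are always reported.
    return True
```

With the default threshold, an unverified loader.conf is reported while device.hints is not.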
If Verbose is set to 0, then VerifyFlags should default to 0 too.
Thus the combination of
module_verbose=0
VE_VERBOSE=0
is sufficient to make the loader almost totally silent.
When verify_prep has to find_manifest and it is verified ok
return VE_NOT_CHECKED to verify_file so that it can skip
repeating verify_fd
Also add better debugging output for is_verified and add_verify_status.
vectx handle compressed modules
When verifying a compressed module (.ko.gz or .ko.bz2)
stat() reports the size as -1 (unknown).
vectx_lseek needs to spot this during closing - and just read until
EOF is hit.
Note: because of the way libsa's open() works, verify_prep will see
the path to be verified as module.ko not module.ko.bz2 etc. This is
actually ok, because we need a separate module.ko.bz2 entry so that
the package can be verified, and the hash for module.ko is of the
uncompressed file which is what vectx will see.
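The "read until EOF when the size is unknown" idea can be sketched as below. This is not the vectx code; the function name, the SHA-256 choice and the chunk size are assumptions made for illustration, with -1 standing in for the unknown size that stat() reports.

```python
import hashlib
import io


def hash_to_eof(stream, size=-1, chunk=4096):
    """Hash `size` bytes of `stream`, or everything up to EOF if size is -1."""
    digest = hashlib.sha256()
    remaining = size
    while remaining != 0:
        # With an unknown size (-1) always ask for a full chunk.
        want = chunk if remaining < 0 else min(chunk, remaining)
        data = stream.read(want)
        if not data:
            # EOF: the only stopping condition when the size is unknown.
            break
        digest.update(data)
        if remaining > 0:
            remaining -= len(data)
    return digest.hexdigest()
```

Called on an in-memory stream such as io.BytesIO(data) with the default size, it hashes the whole (uncompressed) content.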
Re-work local.trust.mk so site.trust.mk need only set
VE_SIGN_URL_LIST (if using the mentioned signing server)
interp.c: restrict interactive input
Apply the same restrictions to interactive input as for
unverified conf and hints files.
At the destruction of the tcpcb, no timers are supposed to
be running. However, it turns out that stopping them in the
close() / shutdown() call does not have the desired effect
under all circumstances.
This partially reverts 62d47d73b7eb to reduce the nuisance
caused.
Warner Losh [Mon, 12 Feb 2024 18:46:20 +0000 (11:46 -0700)]
rescue,nextboot: Install nextboot as a link to reboot, rm nextboot.sh
Reboot now emulates the nextboot shell script completely. Retire the
nextboot.sh script and install the link. Retain the same manual page,
since there are enough differences between nextboot and reboot that
talking about nextboot would likely be confusing in nextboot.8.
The nextboot.sh script no longer exists, so doesn't need to be fixed up
to create rescue. However, now we need a link from nextboot to reboot.
Warner Losh [Mon, 12 Feb 2024 18:46:11 +0000 (11:46 -0700)]
reboot: Allow this to be installed as nextboot
Allow nextboot to be a link to reboot. It does everything reboot
does, except that it doesn't actually set the system up to reboot and
reboot it. Also, don't accept the reboot args related to rebooting when
in nextboot mode.
Warner Losh [Mon, 12 Feb 2024 18:45:37 +0000 (11:45 -0700)]
reboot: Implement zfs support
Implement full -k support for ZFS. For ZFS, we have to set a
property that gets cleared by the boot loader for whether or not to
process nextboot.conf. Do this using system("zfsbootcfg...") rather than
coding the small subset of that program inline to avoid CDDL
contamination of reboot and the complications of disabling CDDL and/or
ZFS. The few bytes needed to implement reboot for systems with ZFS are
not worth saving for systems w/o ZFS.
Only set nextboot_enable=YES for UFS filesystems. They are the only ones
that need that as the first line. Its presence on ZFS can cause the
kernel to not be oneshot.
Warner Losh [Mon, 12 Feb 2024 18:45:20 +0000 (11:45 -0700)]
reboot: Add sanity checking of write to nextboot.conf
Add sanity checking to the write to nextboot. Move to separate function
and allow force to override all errors. If we can't write nextboot.conf,
don't silently fail anymore.
Warner Losh [Mon, 12 Feb 2024 18:45:01 +0000 (11:45 -0700)]
reboot: Don't reboot if the next kernel isn't there
reboot -k garbage won't boot garbage unless /boot/garbage/kernel is
there. Refuse to reboot if it is missing, though allow -f to force
it for special-use cases. This is in keeping with nextboot.sh.
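The check described here amounts to something like the sketch below. It is an illustration only, not the reboot(8) source (which is C); the function name and the boot_dir parameter are hypothetical, while the /boot/<name>/kernel layout and the force override come from the text.

```python
import os


def may_reboot_into(kernel, force=False, boot_dir="/boot"):
    """Return True if rebooting into `kernel` should be allowed."""
    # The kernel image is expected at <boot_dir>/<kernel>/kernel.
    kernel_path = os.path.join(boot_dir, kernel, "kernel")
    # A force flag (like -f) overrides the missing-kernel refusal.
    return force or os.path.isfile(kernel_path)
```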
Warner Losh [Mon, 12 Feb 2024 18:44:32 +0000 (11:44 -0700)]
rescue: belatedly add zfsbootcfg
nextboot.sh uses zfsbootcfg to enable nextboot functionality for ZFS,
but zfsbootcfg was never added. Add it now since the nextboot binary
that replaced the script also uses it via system.
Warner Losh [Mon, 12 Feb 2024 18:44:22 +0000 (11:44 -0700)]
zfsbootcfg: Remove bogus CFLAGS
When using the zfsbootcfg library, we're talking only to it, not to the
rest of ZFS, nor are we using anything that needs access to the ZFS
compilation environment. Remove all the compiling OpenZFS itself flags.
Ed Maste [Mon, 12 Feb 2024 15:36:36 +0000 (10:36 -0500)]
style.lua.9: remove mention of $FreeBSD$
Also restore a comment line in an example which previously started with
-- $FreeBSD$ but was removed in 6ef644f5889a. The example shows the use of
a module require statement block following the license header.
snd_uaudio(4) selects the first matching rate/channel/bit/format/buffer
configuration for use during attach, even though it will print the rest
of the supported configurations detected. To make this clear, mark the
selected playback and recording configurations with a "selected" string.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: dev_submerge.ch
Differential Revision: https://reviews.freebsd.org/D43766
Florian Walpen [Mon, 12 Feb 2024 11:05:27 +0000 (13:05 +0200)]
snd_uaudio(4): Fix config detection with defaults set.
Let the USB audio descriptor iteration detect configurations with more
channels and larger sample size, even when the following global sysctl
tunables are set to a lower value:
Florian Walpen [Mon, 12 Feb 2024 11:04:57 +0000 (13:04 +0200)]
snd_uaudio(4): Adapt buffer length to buffer_ms tunable.
Adapt the length of the driver side audio buffer to the USB transfer
interval, which is adjustable through the buffer_ms tunable. This
eliminates unnecessary latency in USB audio playback.
To reduce power consumption caused by frequent CPU wakeups, increase the
default buffer_ms value to 4ms. In combination with adaptive buffer
length, this still results in less roundtrip latency compared to the
previous 2ms default.
Extend the buffer_ms value range to 1ms for low latency applications.
mixer(8): Use new mixer if we change the default unit
If we use the -d option to change the default unit, close the current
mixer and open the one we set as the default to avoid printing and
applying changes (if any) to the old one.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: dev_submerge.ch, markj
Differential Revision: https://reviews.freebsd.org/D43809
The input options of "dev.mute" (0, 1, ^) and "dev.recsrc" (+, -, ^, =)
are quite cryptic. Allow the input to also be an actual description of
what these options do.
+ -> add (recsrc)
- -> remove (recsrc)
^ -> toggle (recsrc, mute)
= -> set (recsrc)
0 -> off (mute)
1 -> on (mute)
Also, deprecate the use of the symbol options in the EXAMPLES section of
the man page, by using the new descriptive options.
In the future, we might want to get rid of the symbol options
altogether, but preserve backwards compatibility for now.
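The mapping between the descriptive and symbol options can be sketched as below. This is not the mixer(8) code (which is C); the function and table names are hypothetical, and only the word-to-symbol pairs and the control each applies to are taken from the list above.

```python
# Descriptive word -> symbol, as listed in the commit message.
DESCRIPTIVE_TO_SYMBOL = {
    "add": "+",
    "remove": "-",
    "toggle": "^",
    "set": "=",
    "off": "0",
    "on": "1",
}

# Symbols accepted per control, following the "(recsrc)" / "(mute)" notes.
ALLOWED_SYMBOLS = {
    "recsrc": {"+", "-", "^", "="},
    "mute": {"^", "0", "1"},
}


def normalize_option(control, value):
    """Translate a descriptive or symbol option into its symbol form."""
    symbol = DESCRIPTIVE_TO_SYMBOL.get(value, value)
    if symbol not in ALLOWED_SYMBOLS[control]:
        raise ValueError(f"invalid {control} option: {value!r}")
    return symbol
```

Accepting both spellings through one lookup is what keeps the old symbol options working while the descriptive ones are introduced.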
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: dev_submerge.ch, imp
Differential Revision: https://reviews.freebsd.org/D43796
246e0457d93071ffd901c78e3ee7badc5f51bd4c ("mixer.8: Add terse example
for increasing volume") mentions that the example changes the volume of
the "first mixer found", while the example shows how to change the
volume of the current mixer's "vol" device. Rephrase the sentence to
reflect the actual behavior of the command.
Also, improve the example by using the % operator, instead of hardcoding
0.05.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: dev_submerge.ch, markj
Differential Revision: https://reviews.freebsd.org/D43795
mixer(8): Allow full PCM device names as input for the -d option
The -d option is a wrapper around hw.snd.default_unit. Currently
mixer(8) expects the option argument to be just the unit's number (e.g.,
pcm0 -> 0). To avoid confusion, allow full device names of the form
"pcmN" as well.
While here, improve the -d option's description in the man page.
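Accepting both forms of the -d argument boils down to something like the following sketch. It is illustrative only and not the mixer(8) implementation; the function name and the error handling are assumptions.

```python
def parse_default_unit(arg):
    """Return the unit number from either "N" or "pcmN"."""
    # Strip the optional device-name prefix, e.g. "pcm0" -> "0".
    text = arg[3:] if arg.startswith("pcm") else arg
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"invalid unit: {arg!r}")
    return int(text)
```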
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: dev_submerge.ch, imp
Differential Revision: https://reviews.freebsd.org/D43794
Eugene Grosbein [Mon, 12 Feb 2024 07:24:28 +0000 (14:24 +0700)]
graid: unbreak Promise RAID1 with 4+ providers
Fix a problem in graid implementation of Promise RAID1 created with 4+ disks.
Such an array generally works fine until reboot only due to a bug
in metadata writing code. Before the fix, next taste erroneously created
RAID1E (kind of RAID10) instead of RAID1, hence graid used wrong offsets
for I/O operations.
The bug did not affect Promise RAID1 arrays with 2 or 3 disks only.
Dimitry Andric [Sun, 11 Feb 2024 22:45:51 +0000 (23:45 +0100)]
Bump __FreeBSD_version after clang/llvm PIE change
Otherwise, incremental builds might fail with various interesting
errors. This is a bit of a big hammer, but I don't know of any other way
to force rebuilds of all these libraries.
Dimitry Andric [Sun, 11 Feb 2024 18:01:56 +0000 (19:01 +0100)]
Build clang and other llvm executables as PIE
There is no reason anymore to not build these as PIE. Unfortunately
bsd.lib.mk does not allow for building _only_ PIE static libraries, so
lib/clang/Makefile.inc needs a kludge to work around that issue.
Print reasons when parser declined to parse notes, due to mis-alignment,
invalid length, or too many notes (the latter typically means that there
is a loop). Also increase the loop limit to 4096, which gives enough
iterations for notes to fill whole notes' page.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
LinuxKPI: Allow kmalloc to be called when FPU protection is enabled
Amdgpu driver does a lot of memory allocations in FPU-protected sections
of code for certain display cores, e.g. for DCN30. This does not work
on FreeBSD as its malloc function can not be run within a critical
section. Check this condition and temporarily exit from FPU-protected
context to work around the issue and reduce source code patching.
Simon J. Gerraty [Sat, 10 Feb 2024 18:14:23 +0000 (10:14 -0800)]
rc.subr avoid noise if /usr not mounted
basename, sed and tty are all in /usr/bin and not available
until /usr is mounted.
basename and tty we can replace with a function, but sed is more
important. Fix o_verify to just use shell builtins, and
rc_trace should avoid trying to set RC_LEVEL until sed is available.
Introduce the allocuio() and freeuio() functions to allocate and
deallocate struct uio. This hides the actual allocator interface, so it
is easier to modify the sub-allocation layout of struct uio and the
corresponding iovec array.
tcp: stop timers and clean scoreboard in tcp_close()
Stop timers when in tcp_close() instead of doing that in tcp_discardcb().
A connection in CLOSED state shall not need any timers. Assert that no
timer is rescheduled after that in tcp_timer_activate() and verify that
this is also the expected state in tcp_discardcb().
tcp: stop doing superfluous work after sending RST
When sending a RST control segment in tcp_output() it
means we are in TCPS_CLOSED state, called from tcp_drop().
Once the RST is sent, don't call tcp_timer_activate() or
update anything in tcpcb, since that will go away shortly.
tcp: clean scoreboard when releasing the socket buffer
The SACK scoreboard is conceptually an extension of the socket
buffer. Remove it when the socket buffer goes away with
soisdisconnected(). Verify that this is also the expected
state in tcp_discardcb().
John Baldwin [Fri, 9 Feb 2024 19:53:43 +0000 (11:53 -0800)]
cam: Check if cam_simq_alloc fails for the xpt bus during module init
This is very unlikely to fail (and if it does, CAM isn't going to work
regardless), but fail with an error rather than a guaranteed panic via
NULL pointer dereference.
John Baldwin [Fri, 9 Feb 2024 18:27:45 +0000 (10:27 -0800)]
pcib: Refine handling of resources allocated from bridge windows
Fix a long-standing layering violation in the original NEW_PCIB code
by not passing suballocated resources up to the parent bus for
activation and mapping. Instead, handle activation and mapping of
sub-allocated resources in this driver. When mapping resources,
request a mapping from a suitable sub-region of the resource allocated
from the parent bus for the associated bridge window.
Note that this does require passing RF_ACTIVE (with RF_UNMAPPED) when
allocating bridge window resources from the parent.
John Baldwin [Fri, 9 Feb 2024 18:27:45 +0000 (10:27 -0800)]
acpi: Cleanup handling of suballocated resources
For resources suballocated from the system resource rmans, handle
those in the ACPI bus driver without passing them up to the parent.
This means using bus_generic_rman_* for several bus methods for
operations on suballocated resources. For bus_map/unmap_resource,
find the system resource allocated from the parent bus (nexus) that
contains the range being mapped and request a mapping of that parent
resource.
This avoids a layering violation where nexus drivers were previously
asked to manage the activation and mapping of resources
belonging to the ACPI resource managers.
Note that this does require passing RF_ACTIVE (with RF_UNMAPPED) when
allocating system resources from the parent.
While here, don't assume that the parent bus (nexus) provides a
resource list that sysres resources are placed on. Instead, create a
dedicated resource_list in the ACPI bus driver's softc to hold sysres
resources.
Jessica Clarke [Fri, 9 Feb 2024 18:13:47 +0000 (18:13 +0000)]
bsdinstall: Add new Auto option to netconfig interface selection dialog
This changes the OK / Cancel buttons into Auto / Manual / Cancel, with
Auto being the default. Manual behaves like OK used to, i.e. presents a
series of dialogs asking exactly how to configure the interface, and
Cancel is unchanged, exiting with exit code 1. Auto will attempt to
configure IPv4+DHCP and IPv6+SLAAC with no interaction, failing only if
neither can be configured, thereby supporting all of IPv4-only,
IPv6-only and dual-stack environments. If at least one DNS server is
provided, it will also skip asking for DNS settings, otherwise it will
act like Manual mode for the purposes of DNS settings and prompt. For a
standard dual-stack environment this cuts down the number of netconfig
dialogs from 6 (interface, IPv4, DHCP, IPv6, SLAAC, DNS) to just the
first one.
Debugging boot issues can be helped by
logging each rc.d script as it is run
and being able to selectively enable/disable set -x
debug.sh provides an elaborate framework for debugging shell scripts.
For secure systems, we want to be paranoid about what we read
during boot.
dot() simply reads (.) arg file if it exists
vdot() if mac_veriexec is active, ignore unverified files
otherwise behaves much the same as dot()
safe_dot() in safe_eval.sh allows reading an untrusted file;
limiting the input to simple variable assignments.
In load_rc_config allow caller to provide an option to indicate how to
handle its arg:
-v use vdot()
-s use sdot() which will try to use vdot() and fall back to safe_dot()
The default is to read using dot()
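The idea behind safe_dot(), accepting only simple variable assignments from an untrusted file, can be illustrated with the sketch below. This is not how safe_eval.sh is written (that is a shell script with its own rules); the regular expression, the set of characters allowed in values and the function name are all assumptions.

```python
import re

# Assumed definition of a "simple" assignment: NAME=value where the value
# contains no quoting, expansion or command-substitution characters.
_SIMPLE_ASSIGNMENT = re.compile(
    r"^([A-Za-z_][A-Za-z0-9_]*)=([A-Za-z0-9_./:@,+-]*)$"
)


def safe_assignments(text):
    """Return only the simple NAME=value settings found in `text`."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SIMPLE_ASSIGNMENT.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
        # Anything else (commands, expansions, ...) is silently ignored.
    return settings
```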
rc_run_scripts()
encapsulate the running of rc.d scripts
so that we can easily call it more than twice.
We vdot local.rc.subr to pick up extensions (like
run_rc_scripts_final) and overrides.
We also allow rc.subr.local or rc.conf to set rc_config_xtra
eg (rc_config_xtra=XXX for historic compatibility)
rc use set -o verify around the reading in of rc.subr
This has no effect if mac_veriexec is not active, but if it is; ensures
rc.subr has not been tampered with.
Austin Zhang [Wed, 7 Feb 2024 18:55:02 +0000 (12:55 -0600)]
ntb: Add Intel Xeon Gen4 support
The NTB hardware of Xeon Ice Lake and Sapphire Rapids has register mapping changes.
Add a new NTB_XEON_GEN4 device type and use it to conditionalize driver logic where it differs.
Emil Tsalapatis [Thu, 8 Feb 2024 01:13:43 +0000 (20:13 -0500)]
fusefs: only test for incoherency if FN_SIZECHANGE is set
FUSE emits spurious incoherency warnings in writethrough mode. The
warnings are triggered by setattr calls generated by vnode truncation
turning the cached va_size vattr stale, causing comparisons with the
fresh version provided by the server to fail. Only validate the vnode's
va_size vattr if the FN_SIZECHANGE flag is set.
This is a part of the research work at RCSLab, University of Waterloo.
Brooks Davis [Thu, 8 Feb 2024 18:21:56 +0000 (18:21 +0000)]
libsys: actually install manpages
In initial hacking I'd bluntly disabled manpage installation in libsys,
then later disabled them for libc, but forgot to fix the former leading
to no syscall manpages.
PR: 276887
Reported by: Martin Birgmeier <d8zNeCFG@aon.at>
tcp: ensure tcp_sack_partialack does not inflate cwnd after RTO
The implicit assumption of snd_nxt always being larger than
snd_recover is not true after RTO. In that case, cwnd
would get inflated to ssthresh, which may be much larger
than the current pipe (data in flight).
tcp: calculate ssthresh on RTO according to RFC5681
Per RFC5681, only adjust ssthresh on the initial
retransmission timeout. Since RTO often happens
during loss recovery, while cwnd no longer tracks
all data in flight, calculate pipe properly.
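For reference, the RFC 5681 rule (equation 4, ssthresh = max(FlightSize / 2, 2 * SMSS), applied only when the segment has not already been retransmitted by the timer) can be sketched as below. This is not the FreeBSD code; the function and parameter names are hypothetical, and flight_size stands for the properly calculated pipe mentioned above.

```python
def ssthresh_after_rto(ssthresh, flight_size, smss, first_timeout):
    """Return the new ssthresh after a retransmission timeout."""
    if not first_timeout:
        # Repeated timeouts for the same segment leave ssthresh alone.
        return ssthresh
    # RFC 5681, equation (4): half the data in flight, but at least 2 * SMSS.
    return max(flight_size // 2, 2 * smss)
```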
tcp: use tcp_fixed_maxseg instead of tcp_maxseg in cc modules
tcp_fixed_maxseg() is the streamlined calculation of typical
tcp options and more suitable for heavy use in the congestion
control modules on every received packet.
Gleb Smirnoff [Thu, 8 Feb 2024 17:00:23 +0000 (09:00 -0800)]
unix: retire LOCAL_CONNWAIT
This socket option was added in 6a2989fd54a9 together with LOCAL_CREDS.
Both options originate from NetBSD. The LOCAL_CREDS seems to be used by
some software and is covered by our test suite.
The main problem with LOCAL_CONNWAIT is that it doesn't work as
documented. A basic test shows that connect(2) indeed blocks, but
accept(2) on the other side does not wake it up. Indeed, I don't see what
code in the accept(2) path would go into the peer socket of a unix/stream
listener's child and would make wakeup(&so->so_timeo). I tried the test
even on a FreeBSD 6.4-RELEASE and it produced the same results as on
CURRENT.
The other thing that puzzles me is why that option would be useful even if
it worked? Because on unix/stream you can send(2) immediately after
connect(2) and that would put data on the peer receive buffer even before
listener had done accept(2). In other words, one side can do connect(2)
then send(2), and only after that would the remote side make accept(2),
and the remote would see the data sent before the accept(2). Again this
undocumented feature of unix(4) is present on all versions from FreeBSD 6
to CURRENT.
Gleb Smirnoff [Thu, 8 Feb 2024 17:00:23 +0000 (09:00 -0800)]
tests/unix_passfd: test that control mixed with data creates records
If a socket has data interleaved with control it would never allow reading
two pieces of data, nor two pieces of control, with one recvmsg(2). In
other words, presence of control makes a SOCK_STREAM socket behave like
SOCK_SEQPACKET, where control marks the records. This is not a documented
or specified behavior, but this is how it worked always for BSD sockets.
If you look closer at it, this actually makes a lot of sense, as if it
were the opposite both the kernel code and an application code would
become way more complex.
The change makes recvfd_payload() return the received length and requires
the caller to do ATF_REQUIRE() itself. This required a small change to
existing test rights_creds_payload. It also refactors a bit f28532a0f363,
pushing two identical calls out of TEST_PROTO ifdef.
Gleb Smirnoff [Thu, 8 Feb 2024 17:00:23 +0000 (09:00 -0800)]
unix/stream: do not put empty mbufs on the socket
It is a legitimate case to use sendmsg(2) to send control only, with zero
bytes of data and then recvmsg(2) them with zero length iov, receiving
control only. This sendmsg(2)+recvmsg(2) would leave a zero length mbuf on
the top of the socket buffer. If you now try to repeat this combo again,
your recvmsg(2) would not return control data, because it sits behind an
MT_DATA mbuf and you have provided zero length uio_resid. IMHO, the best
strategy to deal with zero length buffers in a chain is to not put them
there in the first place. Thus, solve this right in uipc_send() instead
of touching soreceive_generic().
Lexi Winter [Sat, 3 Feb 2024 13:19:03 +0000 (13:19 +0000)]
traceroute: remove configuration #defines
traceroute used a series of #defines to specify what features are
available on the host platform. As traceroute is now in source, these
are unnecessary and complicate the code, so remove them.
Lexi Winter [Sat, 3 Feb 2024 13:10:09 +0000 (13:10 +0000)]
traceroute: move from contrib to usr.sbin
traceroute hasn't had a vendor import since 2002, while since then it's
had several significant FreeBSD-specific commits. Since it's unlikely
another vendor import will happen, and to make the merge of traceroute6
into traceroute easier, import traceroute into usr.sbin.
Mark Johnston [Thu, 8 Feb 2024 16:11:02 +0000 (11:11 -0500)]
arm64: Add pmap integration for KMSAN
- In pmap_bootstrap_san(), allocate the root PTPs for the shadow maps.
(For KASAN, this is done earlier since we need to do some special
bootstrapping for the kernel stack.)
- Adjust ifdefs to include KMSAN.
- Expand the shadow maps when pmap_growkernel() is called.
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D43405
Mark Johnston [Thu, 8 Feb 2024 16:10:43 +0000 (11:10 -0500)]
arm64: Simplify and improve KASAN shadow map bootstrapping
- Move pmap_bootstrap_allocate_kasan_l2() close to the place where it is
actually used.
- Simplify pmap_bootstrap_allocate_kasan_l2() a bit: eliminate some
unneeded variables and zero and exclude each 2MB mapping as we go
rather than doing that all at once. Excluded regions will be
coalesced.
- As a consequence of the previous point, ensure that we do not zero a
preexisting 2MB mapping.
- Simplify pmap_bootstrap_san() and prepare it to work with KMSAN.
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D43404
Mark Johnston [Thu, 8 Feb 2024 16:02:48 +0000 (11:02 -0500)]
arm64: Disable kernel superpage promotion when KMSAN is configured
The break-before-make operation required to promote or demote a
superpage leaves a window where the KMSAN runtime can trigger a fatal
data abort. More specifically, the code in pmap_update_entry() which
executes after ATTR_DESCR_VALID is cleared may implicitly attempt to
access KMSAN context via curthread, but we may be promoting or demoting
a 2MB page containing the curthread structure.
Reviewed by: imp
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D43158