John Baldwin [Wed, 6 Oct 2021 21:08:49 +0000 (14:08 -0700)]
crypto: Support Chacha20-Poly1305 with a nonce size of 8 bytes.
This is useful for WireGuard which uses a nonce of 8 bytes rather
than the 12 bytes used for IPsec and TLS.
Note that this also fixes a (should be) harmless bug in ossl(4) where
the counter was incorrectly treated as a 64-bit counter instead of a
32-bit counter in terms of wrapping when using a 12 byte nonce.
However, this required a single message (TLS record) longer than 64 *
(2^32 - 1) bytes (about 256 GB) to trigger.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32122
John Baldwin [Wed, 6 Oct 2021 21:08:48 +0000 (14:08 -0700)]
crypto: Test all of the AES-CCM KAT vectors.
Previously, only test vectors which used the default nonce and tag
sizes (12 and 16, respectively) were tested. This now tests all of
the vectors. This exposed some additional issues around requests with
an empty payload (which wasn't supported) and an empty AAD (which
falls back to CIOCCRYPT instead of CIOCCRYPTAEAD).
- Make use of the 'ivlen' and 'maclen' fields for CIOGSESSION2 to
test AES-CCM vectors with non-default nonce and tag lengths.
- Permit requests with an empty payload.
- Permit an input MAC for requests without AAD.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32121
John Baldwin [Wed, 6 Oct 2021 21:08:48 +0000 (14:08 -0700)]
cryptosoft: Fix support for variable tag lengths in AES-CCM.
The tag length is included as one of the values in the flags byte of
block 0 passed to CBC_MAC, so merely copying the first N bytes is
insufficient.
To avoid adding more sideband data to the CBC MAC software context,
pull the generation of block 0, the AAD length, and AAD padding out of
cbc_mac.c and into cryptosoft.c. This matches how GCM/GMAC are
handled where the length block is constructed in cryptosoft.c and
passed as an input to the Update callback. As a result, the CBC MAC
Update() routine is now much simpler and simply performs the
XOR-and-encrypt step on each input block.
While here, avoid a copy to the staging block in the Update routine
when one or more full blocks are passed as input to the Update
callback.
Reviewed by: sef
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32120
John Baldwin [Wed, 6 Oct 2021 21:08:47 +0000 (14:08 -0700)]
crypto: Support multiple nonce lengths for AES-CCM.
Permit nonces of lengths 7 through 13 in the OCF framework and the
cryptosoft driver. A helper function (ccm_max_payload_length) can be
used in OCF drivers to reject CCM requests which are too large for the
specified nonce length.
Reviewed by: sef
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32111
John Baldwin [Wed, 6 Oct 2021 21:08:47 +0000 (14:08 -0700)]
cryptocheck: Support multiple IV sizes for AES-CCM.
By default, the "normal" IV size (12) is used, but it can be overriden
via -I. If -I is not specified and -z is specified, issue requests
for all possible IV sizes.
Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32110
John Baldwin [Wed, 6 Oct 2021 21:08:47 +0000 (14:08 -0700)]
cryptodev: Allow some CIOCCRYPT operations with an empty payload.
If an operation would generate a MAC output (e.g. for digest operation
or for an AEAD or EtA operation), then an empty payload buffer is
valid. Only reject requests with an empty buffer for "plain" cipher
sessions.
Some of the AES-CCM NIST KAT vectors use an empty payload.
While here, don't advance crp_payload_start for requests that use an
empty payload with an inline IV. (*)
Reported by: syzbot+d4b94fbd9a44b032f428@syzkaller.appspotmail.com (*)
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32109
John Baldwin [Wed, 6 Oct 2021 21:08:46 +0000 (14:08 -0700)]
cryptodev: Permit explicit IV/nonce and MAC/tag lengths.
Add 'ivlen' and 'maclen' fields to the structure used for CIOGSESSION2
to specify the explicit IV/nonce and MAC/tag lengths for crypto
sessions. If these fields are zero, the default lengths are used.
This permits selecting an alternate nonce length for AEAD ciphers such
as AES-CCM which support multiple nonce leengths. It also supports
truncated MACs as input to AEAD or ETA requests.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32107
John Baldwin [Wed, 6 Oct 2021 21:08:46 +0000 (14:08 -0700)]
crypto: Permit variable-sized IVs for ciphers with a reinit hook.
Add a 'len' argument to the reinit hook in 'struct enc_xform' to
permit support for AEAD ciphers such as AES-CCM and Chacha20-Poly1305
which support different nonce lengths.
Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32105
John Baldwin [Wed, 6 Oct 2021 21:08:46 +0000 (14:08 -0700)]
cryptodev: Use 'csp' in the handlers for requests.
- Retire cse->mode and use csp->csp_mode instead.
- Use csp->csp_cipher_algorithm instead of the ivsize when checking
for the fixup for the IV length for AES-XTS.
Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32103
Felix Johnson [Wed, 6 Oct 2021 20:47:02 +0000 (22:47 +0200)]
login.conf.5: Mark passwordtime as implemented
login.conf.5 listed passwordtime in RESERVED CAPABILITIES, which is a
section for capabilities not implemented in the base system. However,
passwordtime has been implemented in the base for several years now.
Alan Somers [Sat, 25 Sep 2021 16:16:20 +0000 (10:16 -0600)]
fusefs: fix intermittency in the dev_fuse_poll test
The DevFusePoll::access/select test would occasionally segfault. The
cause was a file descriptor that was shared between two threads. The
first thread would kill the second and close the file descriptor. But
it was possible that the second would read the file descriptor before it
shut down. That did not cause problems for kqueue, poll, or blocking
operation, but it triggered segfaults in select's macros.
Mark Johnston [Wed, 6 Oct 2021 20:03:30 +0000 (16:03 -0400)]
malloc: Unmark KASAN redzones if the full allocation size was requested
Consumers that want the full allocation size will typically access the
full buffer, so mark the entire allocation as valid to avoid useless
KASAN reports.
Alan Somers [Sun, 3 Oct 2021 17:51:14 +0000 (11:51 -0600)]
fusefs: Fix a bug during VOP_STRATEGY when the server changes file size
If the FUSE server tells the kernel that a file's size has changed, then
the kernel must invalidate any portion of that file in cache. But the
kernel can't do that during VOP_STRATEGY, because the file's buffers are
already locked. Instead, proceed with the write.
Alan Somers [Sat, 2 Oct 2021 18:17:36 +0000 (12:17 -0600)]
fusefs: fix a recurse-on-non-recursive lockmgr panic
fuse_vnop_bmap needs to know the file's size in order to calculate the
optimum amount of readahead. If the file's size is unknown, it must ask
the FUSE server. But if the file's data was previously cached and the
server reports that its size has shrunk, fusefs must invalidate the
cached data. That's not possible during VOP_BMAP because the buffer
object is already locked.
Fix the panic by not querying the FUSE server for the file's size during
VOP_BMAP if we don't need it. That's also a a slight performance
optimization.
Alan Somers [Sun, 3 Oct 2021 16:59:04 +0000 (10:59 -0600)]
fusefs: quiet some cache-related warnings
If the FUSE server does something that would make our cache incoherent,
we should print a warning to the user. However, we previously warned in
some situations when we shouldn't, such as if the file's size changed on
the server _after_ our own attribute cache had expired. This change
suppresses the warning in cases like that. It also moves the warning
logic to a single place within the code.
We actually do not know is it safe or not to flush cache for random
BAR/register page existing in the system. It is well-known that for
instance LAPICs cannot tolerate cache flush. As report indicates,
there are more such devices.
This issue typically affects AMD machines which do not report self-snoop,
causing real CLFLUSH invocation on the mapped pages. Intels do self-snoop,
so this change should be nop for them, and unsafe devices, if any, are
already ignored.
Reported and tested by: manu
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32318
Similar to pmap_page_set_memattr() by setting MD page cache attribute
to the argument. Unlike pmap_page_set_memattr(), does not flush cache
for the direct mapping of the page.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32318
The default value for LAPIC registers page physical address
is usually right. Having this value available early makes
pmap_force_invalidate_cache_range(), used on non-self-snoop machines,
avoid flushing LAPIC range for early calls.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32318
Alexander Motin [Tue, 5 Oct 2021 19:01:16 +0000 (15:01 -0400)]
cam(4): Limit search for disks in SES enclosure by single bus
At least for SAS that we only support now disks are typically
connected to the same bus as the enclosure. Limiting the search
scope makes it much faster on systems with multiple buses and
thousands of disks.
Alexander Motin [Tue, 5 Oct 2021 18:54:03 +0000 (14:54 -0400)]
cam(4): Improve XPT_DEV_MATCH
Remove *_MATCH_NONE enums, making no sense and so never used. Make
*_MATCH_ANY enums 0 (no any match flags set), previously used by
*_MATCH_NONE. Bump CAM_VERSION to 0x1a reflecting those changes and
add compat shims.
When traversing through buses and devices do not descend if we can
already see that requested pattern does not match the bus or device.
It allows to save significant amount of time on system with thousands
of disks when doing limited searches.
sdhci: Fix crash caused by M_WAITOK in sdhci dumps
In some contexts it is illegal to wait for memory allocation, causing
kernel panic. By default sbuf_new passes M_WAITOK to malloc,
which caused crashes when sdhci_dumpcaps or sdhci_dumpregs was callend in
non sutiable context.
Mark Johnston [Mon, 4 Oct 2021 21:48:44 +0000 (17:48 -0400)]
geom_label: Add more validation for NTFS volume tasting
- Ensure that the computed MFT record size isn't negative or larger than
maxphys before trying to read $Volume.
- Guard against truncated records in volume metadata.
- Ensure that the record length is large enough to contain the volume
name.
- Verify that the (UTF-16-encoded) volume name's length is a multiple of
two.
PR: 258833, 258914
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
David Bright [Mon, 4 Oct 2021 15:43:41 +0000 (08:43 -0700)]
RPCBIND: skip ipv6 link local when request is not from link local address
RPCINFO on macOS behaves different compared to other linux clients and
doesn't provide request address in rpcb structure of the
RPCBPROC_GETADDRLIST call which doesn't seem to be forbidden.
In this case RPCBIND uses RPC call's source address and picks a
closest corresponding local address. If there are no addresses in the
same subnet as the source address, return of RPCBIND may vary
depending on the order of addresses returned in getifaddrs. If a link
local precedes global address it may be returned even if the request
comes from neither a link local nor from link local in a different
scope, which will prevent services like nfs from working in tpc6
scenario on macOS clients. Issue can be seen only on FreeBSD rpcbind
port due to changes in workflow of addrmerge call.
Mark Johnston [Mon, 4 Oct 2021 16:28:22 +0000 (12:28 -0400)]
libctf: Improve check for duplicate SOU definitions in ctf_add_type()
When copying a struct or union from one CTF container to another,
ctf_add_type() checks whether it matches an existing type in the
destination container. It does so by looking for a type with the same
name and kind as the new type, and if one exists, it iterates over all
members of the source type and checks whether a member with matching
name and offset exists in the matched destination type. This can
produce false positives, for example because member types are not
compared, but this is not expected to arise in practice. If the match
fails, ctf_add_type() returns an error.
The procedure used for member comparison breaks down in the face of
anonymous struct and union members. ctf_member_iter() visits each
member in the source definition and looks up the corresponding member in
the desination definition by name using ctf_member_info(), but this
function will descend into anonymous members and thus fail to match.
Fix the problem by introducing a custom comparison routine which does
not assume member names are unique. This should also be faster for
types with many members; in the previous scheme, membcmp() would perform
a linear scan of the desination type's members to perform a lookup by
name. The new routine steps through the members of both types in a
single loop.
PR: 258763
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
ncurses: fix path where to find curses.h at bootstrap
after the split, curses.h is now generated by tinfo Makefile, but
still used for a file generated in ncurses lib. Adjust the path to
make sure curses.h is always found
many external program expects libncurses to not be provided as a single
library. Instead of fixing all ports, distribute ncurses the way
upstream distributes it
Turn libncursesw.so into a ldscript which will link automatically as
needed to libtinfow so so this change is seamless at compile time.
Stefan Eßer [Mon, 4 Oct 2021 07:50:44 +0000 (09:50 +0200)]
contrib/bc: temporarily disconnect the tests for 5.0.2
The tests that come with version 5.0.2 have been extended to cover the
line editing functions. It has been found that these tests generate
false negative results in FreeBSD, most likely due to an issue in the
pexpect functionality used.
These history tests are skipped on systems that do not have python and
py-pexpect installed (and thus are unlikely to cause CI test failures),
but in order to not cause irritating failures on systems were these
packages are in fact installed, I temporarily disconnect them.
I had planned to skip this version due to the issue with the history
tests, but some committer has asked me to go ahead since the currently
used version 5.0.0 contains a macro name that collides with a project
he is working on.
No MFC of this version is planned. A version 5.0.3 is expected to be
released soon, and that version will allow to reconnect the tests and
will be MFCed.
Stefan Eßer [Mon, 4 Oct 2021 07:41:06 +0000 (09:41 +0200)]
contrib/bc: remove files ommitted from the release
A number of files have been removed from the release distribution of
this bc implementation. They were mostly relevant for pre release
testing and benchmarking to identify regressions. The Markdown
sources of the man pages are only relevant for combinations of build
options not used in FreeBSD and need non-default conversion tools
(available as ports in FreeBSD).
All the omitted files can be found in the upstream git repository,
and they are fetched when building this software as a port. But they
have never been used in the FreeBSD base system.
Colin Percival [Sun, 3 Oct 2021 21:55:10 +0000 (14:55 -0700)]
loader bcache: Allow readahead up to 256 kB I/Os
Prior to this commit, the loader would perform readaheads of up to
128 kB; when booting on a UFS filesystem this resulted in a series
of 160 kB reads (32 kB request + 128 kB readahead).
This commit allows readaheads to be longer, subject to a total I/O
size limit of 256 kB; i.e. 32 kB read requests will have added
readaheads of up to 224 kB.
In my testing on an EC2 c5.xlarge instance, this change reduces the
boot time by roughly 80 ms.
Colin Percival [Sun, 3 Oct 2021 21:49:41 +0000 (14:49 -0700)]
loader bcache: Track unconsumed readahead
The loader bcache attempts to determine whether readahead is useful,
increasing or decreasing its readahead length based on whether a
read could be serviced out of the cache. This resulted in two
unfortunate behaviours:
1. A series of consecutive 32 kB reads are requested and bcache
performs 16 kB readaheads. For each read, bcache determines that,
since only the first 16 kB is already in the cache, the readahead
was not useful, and keeps the readahead at the minimum (16 kB) level.
2. A series of consecutive 32 kB reads are requested and bcache
starts with a 32 kB readahead resulting in a 64 kB being read on
the first request. The second 32 kB request can be serviced out of
the cache, and bcache responds by doubling its readahead length to
64 kB. The third 32 kB request cannot be serviced out of the cache,
and bcache reduces its readahead length back down to 32 kB.
The first syndrome converts a series of 32 kB reads into a series of
(misaligned) 32 kB reads, while the second syndrome converts a series
of 32 kB reads into a series of 64 kB reads; in both cases we do not
increase the readahead length to its limit (currently 128 kB) no
matter how many consecutive read requests are made.
This change avoids this problem by tracking the "unconsumed
readahead" length; readahead is deemed to be useful (and the
read-ahead length is potentially increased) not only if a request was
completely serviced out of the cache, but also if *any* of the request
was serviced out of the cache and that length matches the amount of
unconsumed readahead. Conversely, we now only reduce the readahead
length in cases where there was unconsumed readahead data.
In my testing on an EC2 c5.xlarge instance, this change reduces the
boot time by roughly 120 ms.
If a device is plugged into the x16 slot that itself has a bridge, such
as many graphics cards, we currently fail to allocate a bus number for
its child bus (and so pcib_attach_child skips adding a child bus for
further enumeration) as, when the new child bridge attaches, it attempts
to allocate a bus number from its parent (pcib7) which in turn attempts
to grow its own bus range by calling bus_adjust_resource on its own
parent (pcib2) whose bus rman cannot accommodate the request and needs
to itself be extended by calling its own parent (pcib1). Note that
pcib3-7 do not face the same issue when they attach since pcib1 ends up
managing bus numbers 1-255 from the beginning and so never needs to grow
its own range.
Jessica Clarke [Sun, 3 Oct 2021 18:34:53 +0000 (19:34 +0100)]
riscv: Add vt and kbdmux to GENERIC for video console support
No in-tree drivers are supported for RISC-V (given it supports UEFI we
could enable the EFI framebuffer, but U-Boot has very limited hardware
support and EDK2 remains a work in progress), but drm-kmod exists with
drivers for video cards that can be used with the HiFive Unmatched.
Jessica Clarke [Sun, 3 Oct 2021 18:34:40 +0000 (19:34 +0100)]
LinuxKPI: Add more #ifdef VM_MEMATTR_WRITE_COMBINING guards
One of the three uses is already guarded; this guards the remaining ones
to support architectures like riscv that do not provide write-combining,
and is needed to build drm-kmod on riscv.
Jessica Clarke [Sun, 3 Oct 2021 18:34:25 +0000 (19:34 +0100)]
libgcc_s: Export 64-bit int to 128-bit float functions
The corresponding 32-bit int and 128-bit int functions were added in 790a6be5a169, as were all combinations of the float to int functions,
but these two were overlooked. __floatditf is needed to build curl for
riscv as there's a signed 64-bit int to 128-bit float conversion in
lib/progress.c's trspeed as of 7.77.0.
Reviewed by: dim
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31997
Jessica Clarke [Sun, 3 Oct 2021 18:33:47 +0000 (19:33 +0100)]
riscv: Add a stub pmap_change_attr implementation
pmap_change_attr is required by drm-kmod so we need the function to
exist. Since the Svpbmt extension is on the horizon we will likely end
up with a real implementation of it, so this stub implementation does
all the necessary page table walking to validate the input, ensuring
that no new errors are returned once it's implemented fully (other than
due to out of memory conditions when demoting L2 entries) and providing
a skeleton for that future implementation.
Kyle Evans [Sat, 2 Oct 2021 05:17:57 +0000 (00:17 -0500)]
vfs: remove dead fifoop VOP_KQFILTER implementations
These began to become obsolete in d6d64f0f2c26 (r137739) and the deal
was later sealed in 003e18aef4c4 (r137801) when vfs.fifofs.fops was
dropped and vop-bypass for pipes became mandatory.
Alexander Motin [Sun, 3 Oct 2021 00:57:55 +0000 (20:57 -0400)]
sleepqueue(9): Remove sbinuptime() from sleepq_timeout().
Callout c_time is always bigger or equal than the scheduled time. It
is also smaller than sbinuptime() and can't change while the callback
is running. So we reliably can use it instead of sbinuptime() here.
In case there was a race and the callout was rescheduled to the later
time, the callback will be called again.
According to profiles it saves ~5% of the timer interrupt time even
with fast TSC timecounter.
Robert Wing [Sat, 2 Oct 2021 23:11:40 +0000 (15:11 -0800)]
ffs: retire unused fsckpid mount option
The fsckpid mount option was introduced in 927a12ae16433b50 along with a
couple sysctl's to support SU+J with snapshots. However, those sysctl's
were never used and eventually removed in f2620e9ceb3ede02.
There are no in-tree consumers of this mount option.
Rick Macklem [Sat, 2 Oct 2021 21:11:15 +0000 (14:11 -0700)]
nfsd: Fix pNFS handling of Deallocate
For a pNFS server configuration, an NFSv4.2 Deallocate operation
is proxied to the DS(s). The code that parsed the reply for the
proxy RPC is broken and did not process the pre-operation attributes.
This patch fixes this problem.
This bug would only affect pNFS servers built from recent main/FreeBSD14
sources.