stand/loader.efi: fix regression with ignoring nvstore
To read/update the boot loader nvstore, we always need to call
zfs_attach_nvstore() regardless of whether we use bootonce key
in nvstore or the bootfs property of the pool. The call was
unintentionally left in the block of code that is processed
only when bootonce key is present.
Switch the repository to use https by default, base is providing a CA
root bundle suitable to validate the certificates used by the project.
This can now be activated without requiring another packages to be installed
Doug Moore [Mon, 11 Sep 2023 06:53:40 +0000 (01:53 -0500)]
radix_trie: have vm_radix use pctrie code
Implement everything currently in vm_radix.c with calls to functions
in subr_pctrie.c, asccessed via the interface provided by the
DEFINE_PCTRIE_SMR macro.
If we hit the csfailed case in pf_create_state() we may have allocated
a state, so we must also free it. While here reduce the amount of
duplicated cleanup code.
cxgbe(4): Fix tracing with netlink enabled kernels.
1. The tracing ifnet's name must match the nexus name. One way to do
this is to not use a unit number.
2. Do not hold the tracer lock around ether_ifattach or ether_ifdetach.
netlink calls back into the driver (see stack below) and that leads
to illegal lock recursion.
If we receive a state with a route-to interface name set and we can't
find the interface we do not insert the state. However, in that case we
must still clean up the state (and state keys).
Do so, so we do not leak states.
lib/libc/amd64/string/memchr.S: fix behaviour with overly long buffers
When memchr(buf, c, len) is called with a phony len (say, SIZE_MAX),
buf + len overflows and we have buf + len < buf. This confuses the
implementation and makes it return incorrect results. Neverthless we
must support this case as memchr() is guaranteed to work even with
phony buffer lengths, as long as a match is found before the buffer
actually ends.
Sponsored by: The FreeBSD Foundation
Reported by: yuri, des
Tested by: des
Approved by: mjg (blanket, via IRC)
MFC after: 1 week
MFC to: stable/14
PR: 273652
John Baldwin [Sat, 9 Sep 2023 19:13:57 +0000 (12:13 -0700)]
gic_acpi: Limit the number of CPUs to GIC_MAXCPU
madt_table_data contains an array of pointers for each CPU and was
allocated on the stack. If MAXCPU is raised to a sufficiently large
value this can overflow the kernel stack. Cap the stack growth by
using GIC_MAXCPU instead as for other parts of the gicv1/v2 driver in
commit a0e20c0ded1a.
update max_variance limit in zdb_block_size_histogram test for CI
Commit 2d7843401a628ef8c483229742dd58bca70bc27e had previously
updated this hardcoded limit to allow for CI testing. As there
is no deterministic pass/fail value, the need has arisen for
one more small increase.
Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Edmund Nadolski <edmund.nadolski@ixsystems.com>
Closes #15252
Alexander Motin [Sat, 9 Sep 2023 17:22:36 +0000 (13:22 -0400)]
Add more constraints for block cloning.
- We cannot clone into files with smaller block size if there is
more than one block, since we can not grow the block size.
- Block size must be power-of-2 if destination offset != 0, since
there can be no multiple blocks of non-power-of-2 size.
The first should handle the case when destination file has several
blocks but still is not bigger than one block of the source file.
The second fixes panic in dmu_buf_hold_array_by_dnode() on attempt
to concatenate files with equal but non-power-of-2 block sizes.
While there, assert that error is reported if we made no progress.
Reviewed-by: Pawel Jakub Dawidek <pawel@dawidek.net> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <rob.norris@klarasystems.com> Reviewed-by: Kay Pedersen <mail@mkwg.de> Reviewed-by: Umer Saleem <usaleem@ixsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15251
Warner Losh [Sat, 9 Sep 2023 02:18:33 +0000 (20:18 -0600)]
atkbdc: Add additional heurstic for Chromebook keyboards
It turns out that that heurstic used to determine if we have a Google
coreboot, and thus have the i8042 emulation bugs, is incorrect. At least
one Acer "Peppy" Chromebook has an issue because Acer space'd out the
smbios.bios.version string we're using as part of the heuristic. So, if
the version starts with a space, then enable the workarounds if the
smbios.bios.reldate is 2018 or earlier. While not perfect, it should be
a reasonable dividing line and still allow newer core boot-based
machines that aren't Chromebooks to not have the workaround.
carp: Explicitly mark tunnable net.inet.carp.allow with CTLFLAG_NOFETCH
With recent change 110113bc086f, a vnet tunable can be initialized when
there is a corresponding kernel environment variable unless it is marked
with the flag CTLFLAG_NOFETCH.
The initialization may happen during early boot(linker preload), at that
time vnet0 has not been created. The hander carp_allow_sysctl() for the
tunable net.inet.carp.allow requires vnet, thus invoking it during early
boot will cause kernel panic.
The tunnable is initialized by vnet sysinit routine ipcarp_sysinit() so
let's just mark it with flag CTLFLAG_NOFETCH.
No functional change intended.
Fixes: 110113bc086f sysctl(9): Enable vnet sysctl variables to be loader tunable
MFC after: 2 week
Differential Revision: https://reviews.freebsd.org/D41525
The module preload happens before vnet0 creation, at this moment the vnet
list is empty thus invoking vnet_data_copy() during preload is a noop.
With recent change 110113bc086f, for dynamic module load, aka via kldload,
linker will do vnet propagation right after registering sysctls which
happens after module load, then previous propagation (during module load)
is redundant.
In 3da1cf1e88f8, the meaning of the flag CTLFLAG_TUN is extended to
automatically check if there is a kernel environment variable which
shall initialize the SYSCTL during early boot. It works for all SYSCTL
types both statically and dynamically created ones, except for the
SYSCTLs which belong to VNETs.
This change extends the meaning further, to allow it also works for
the SYSCTLs which belong to VNETs. A typical usage is
```
VNET_DEFINE_STATIC(int, foo) = 0;
SYSCTL_INT(_net, OID_AUTO, foo, CTLFLAG_RWTUN | CTLFLAG_VNET,
&VNET_NAME(foo), 0, "Description of the foo loader tunable");
```
Note that the implementation has a limitation. It behaves the same way
as that of non-vnet loader tunables. That is, after the kernel or modules
being initialized, any changes (e.g. via kenv) to kernel environment
variable will not affect the corresponding vnet variable of subsequently
created VNETs. To overcome it, we can use TUNABLE_XXX_FETCH to fetch
the kernel environment variable into those vnet variables during vnet
constructing.
This change will fix the following SYSCTLs those belong to VNETs and
have CTLFLAG_TUN flag:
```
net.add_addr_allfibs
net.bpf.optimize_writers
net.inet.tcp.fastopen.ccache_buckets
net.link.bridge.inherit_mac
net.link.bridge.ipfw_arp
net.link.bridge.log_stp
net.link.bridge.pfil_bridge
net.link.bridge.pfil_local_phys
net.link.bridge.pfil_member
net.link.bridge.pfil_onlyip
net.link.lagg.default_use_flowid
net.link.lagg.default_use_numa
net.link.lagg.default_flowid_shift
net.link.lagg.lacp.debug
net.link.lagg.lacp.default_strict_mode
```
Although the following vnet SYSCTLs have CTLFLAG_TUN flag, theirs
values are re-fetched via TUNABLE_XXX_FETCH, thus are not affected
by this change.
```
net.inet.ip.reass_hashsize
net.inet.tcp.hostcache.cachelimit
net.inet.tcp.hostcache.hashsize
net.inet.tcp.hostcache.bucketlimit
net.inet.tcp.syncache.bucketlimit
net.inet.tcp.syncache.cachelimit
net.inet.tcp.syncache.hashsize
net.key.spdcache.maxentries
net.key.spdcache.threshold
```
In memoriam: hselasky
Discussed with: hselasky, glebius
Fixes: 3da1cf1e88f8 Extend the meaning of the CTLFLAG_TUN flag ...
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D39638
John Baldwin [Fri, 8 Sep 2023 23:30:35 +0000 (16:30 -0700)]
cxgbe tom: Call t4_rcvd_locked from do_rx_data to return RX credits
In particular, the kernel RPC layer used by the NFS client never
invokes pru_rcvd since it always reads data from the socket upcall
via MSG_SOCALLBCK which avoids calling pru_rcvd. As a result, on an
NFS client connection managed by t4_tom, RX credits were never
returned to the TOE connection to open the TCP window resulting in
connection hangs.
To fix, expand the set of conditions in do_rx_data where RX credits
are returned to match those in t4_rcvd_locked by calling the function
directly.
lib/libc/tests/string: add unit tests for strcspn(3)
We currently use the NetBSD test suite to cover strcspn(3). It only
contains a very rudimentary test of this function. This all new set
of unit tests for the FreeBSD test suite should cover many more edge
cases relating to alignment issues.
Sponsored by: The FreeBSD Foundation
Approved by: mjg
MFC after: 1 week
MFC to: stable/14
Differential Revision: https://reviews.freebsd.org/D41557
This changeset adds both a scalar and an x86-64-v2 implementation
of the strcspn(3) function to libc. A baseline implementation does not
appear to be feasible given the requirements of the function.
The scalar implementation is similar to the generic libc implementation,
but expands the bit set into a byte set to reduce latency, improving
performance. This approach could probably be backported to the generic
C version to benefit other platforms.
The x86-64-v2 implementation is built around the infamous pcmpistri
instruction. An alternative implementation based on the Muła/Langdale
algorithm [1] was prototyped, but performed worse than the pcmpistri
approach except for sets of more than 16 characters with long input
strings.
All implementations provide special cases for the empty set (reduces to
strlen as well as single-character sets (reduces to strchr). The
x86-64-v2 kernel falls back to the scalar implementation for sets of
more than 32 characters. This limit could be raised by additional
multiples of 16 through the use of additional pcmpistri code paths, but
I consider this case to be too rare to be of importance.
Michael Tuexen [Fri, 8 Sep 2023 14:20:51 +0000 (16:20 +0200)]
sctp: cleanup locking for notifications
All notifications are now queued via sctp_ulp_notify(). Do
the locking of the inp read lock there and validate this in all
functions being used.
This is one step in avoiding race conditions when closing the
read end of an SCTP socket.
Mike Karels [Fri, 8 Sep 2023 14:06:42 +0000 (09:06 -0500)]
mountd: do not warn about using class mask with -mask
The previous code would warn that the mask was being defaulted to
an obsolete class mask even if -mask was present after -network.
Import a fix from Peter Much with a little tweaking, deferring the
warning until after all parameters are processed.
PR: 263011
Obtained from: pmc at citilink.dinoex.sub.org
MFC after: 3 days
Reviewed by: rmacklem
Differential Revision: https://reviews.freebsd.org/D41774
When building the frequencies table we convert the value in the DTS to
megahertz and loose precision. While it's not a problem for most of the
DTS it is when the expected frequency value is strict down to the hertz.
So it's either we don't truncate the value and have some ugly and long
values in the sysctls or we just find the closest frequency.
Do the latter.
The M1 uses FDT, and has bge to start with. Add a SOC_* option for
the first SoC we'll be supporting.
IOMMU is added commented out because it does have it, but IOMMU is not
well-tested on aarch64. An initial version of the DART driver will be
upstreamed that just puts the DARTs that support bypass mode into bypass
mode -- we'll be missing some functionality, but we at least still end
up with some USB ports.
Fix by cleaning up all the sub-entries inside "kernel/spl" when the
spl module is unloaded.
Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Closes #15239
Ed Maste [Thu, 7 Sep 2023 16:32:39 +0000 (12:32 -0400)]
ssh-keygen: Generate Ed25519 keys when invoked without arguments
Ed25519 keys are convenient because they're much smaller, and the next
OpenSSH release (9.5) will switch to them by default. Apply the change
to FreeBSD main now, to help identify issues as early as possible.
Reviewed by: kevans, karels, des
Relnotes: Yes
Obtained from: OpenBSD 9de458a24986
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41773
pf: mark removed connections within a multihome association as shutting down
Parse IP removal in ASCONF chunks, find the affected state(s) and mark
them as shutting down. This will cause them to time out according to
PFTM_TCP_CLOSING timeouts, rather than waiting for the established
session timeout.
MFC after: 3 weeks
Sponsored by: Orange Business Services
Kristof Provost [Wed, 2 Aug 2023 08:44:52 +0000 (10:44 +0200)]
pf tests: basic SCTP multihoming test
The SCTP server will announce multiple addresses. Block one of them with
pf, connect to the other have the client use the blocked address. pf
is expected to have created state for all of the addresses announced by
the server.
In a separate test case add the secondary (client) IP after the
connection has been established. The intent is to verify the
functionality of the ASCONF chunk parsing.
Kristof Provost [Wed, 2 Aug 2023 17:05:00 +0000 (19:05 +0200)]
pf: support SCTP multihoming
SCTP may announce additional IP addresses it'll use in the INIT/INIT_ACK
chunks, or in ASCONF chunks at any time during the connection. Parse these
parameters, evaluate the ruleset for the new connection and if allowed
create the corresponding states.
Ed Maste [Wed, 30 Aug 2023 20:49:44 +0000 (16:49 -0400)]
Update WITH_/WITHOUT_SSP descriptions
ProPolice refers to a specific implementation by Hiroaki Etoh and
Kunikazu Yoda. The implementation in contemporary Clang and GCC is
somewhat different and newer, so use a generic term in the src.conf
descriptions.
Martin Matuska [Thu, 7 Sep 2023 15:18:12 +0000 (17:18 +0200)]
libarchive: merge security fix from vendor branch
This commit fixes a couple of security vulnerabilities in the PAX writer:
1. Heap overflow in url_encode() in archive_write_set_format_pax.c
2. NULL dereference in archive_write_pax_header_xattrs()
3. Another NULL dereference in archive_write_pax_header_xattrs()
4. NULL dereference in archive_write_pax_header_xattr()
Security: No known reference yet
Obtained from: https://github.com/libarchive/libarchive/commit/1b4e0d0f9
MFC after: 3 days
Ed Maste [Thu, 7 Sep 2023 15:10:33 +0000 (11:10 -0400)]
syscons: refer to it as the legacy console
vt(4) is the default console, and although there is no firm deprecation
plan for syscons(4) yet it it is not actively maintained and is not
compatible with contemporary systems (i.e., those booting via UEFI).
Reviewed by: manu
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Each Rx descriptor points to a packet buffer of size 2K, which means
that MTUs greater than 2K see multi-descriptor packets. The TCP-hood of
such packets was being incorrectly determined by looking for a flag on
the last descriptor instead of the first descriptor.
This adds specific width length modifiers in the form of wN and wfN (where N is 8, 16, 32, or 64) which allow printing intN_t and int_fastN_t without resorting to casts or PRI macros.
groff 1.23.0 changed the semantics of the -man parameter, and many
manual pages are not rendered. The -mandoc parameter brings back
the old behavior, as in groff 1.22.4 and earlier.
PR: 273565, 273245
Reviewed by: emaste, bapt
MFC after: 1 week for all supported branches (stable/12, 13, 14)
Differential Revision: https://reviews.freebsd.org/D41737
__crt_aligned_alloc_offset(): fix ov_index for backing allocation address
Wrong value of ov_index resulted in magic check failure, and refuse to
free() the memory allocated with __crt_aligned_alloc_offset().
Then the TLS segments of exited threads leaked.
Colin Percival [Tue, 5 Sep 2023 23:46:38 +0000 (16:46 -0700)]
init_main: Switch from SLIST to STAILQ, fix order
Constructing an SLIST of SYSINITs by inserting them one by one at the
head of the list resulted in them being sorted in anti-stable order:
When two SYSINITs tied for (subsystem, order), they were executed in
the reverse order to the order in which they appeared in the linker
set.
Note that while this changes struct sysinit, it doesn't affect ABI
since SLIST_ENTRY and STAILQ_ENTRY are compatible (in both cases a
single pointer to the next element).
Andrew Turner [Wed, 6 Sep 2023 11:07:41 +0000 (12:07 +0100)]
arm64: Enable FEAT_E0PD when supported
FEAT_E0PD adds two fields to the tcr_el1 special register that, when
set, cause userspace access to either the top or bottom half of the
address spaces without a page walk.
This can be used to stop userspace probing the kernel address space
as the CPU will raise an exception in the same time if the probed
address is in the TLB or not.
Reviewed by: kevans
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41760
diff3: make the diff3 -E -m and diff3 -m behaviour match gnu diff3
In gnu diff3 3 way merging files where the new file and the target are
already the same will die and show what has failed to be merged except
if -E is passed in argument, in this case it will finish the merge.
This difference in behaviour was breaking one of the etcupdate testcase
with bsd diff3