Jake Freeland [Thu, 24 Aug 2023 04:39:54 +0000 (22:39 -0600)]
timerfd: Move implementation from linux compat to sys/kern
Move the timerfd impelemntation from linux compat code to sys/kern. Use
it to implement the new system calls for timerfd. Add a hook to kern_tc
to allow timerfd to know when the system time has stepped. Add kqueue
support to timerfd. Adjust a few names to be less Linux centric.
ps: add a new option -D to reimplement tree traversal
It takes a non-optional parameter string, one of "up", "down", or "both"
that can request tree traversal in the chosen directions. This adds PIDs
from the paths to the selection of PIDs and can be used together with -d
to draw a subset of the process tree.
By commiting ca8c0d5e8110 I was hoping that the existing option -d
could just be extended to work with -p to implement a feature that was
and I think is still needed, that is to show all descendant processes
of a given process id or a set of process ids.
After a complaint from -current which may represent a wider
dissatisfaction with this change in the program's behavior, I think it
will be better to revert ca8c0d5e8110 and reintroduce this feature
using a separate option -D.
Michael Tuexen [Thu, 24 Aug 2023 13:52:55 +0000 (15:52 +0200)]
sctp: improve handling of socket shutdown for reading
If a socket is marked as cannot read anymore, drop chunks which
should be added to a control element in the receive queue.
This is consistent with dropping control elements instead of
adding them in the same situation.
pf: Access r->rpool.cur->kif under mutex protection
pf_route() sends traffic to a specified next hop over a specific
interface. The next hop is obtained in pf_map_addr() but the interface
is obtained directly via r->rpool.cur->kif` outside of the lock held in
pf_map_addr() in multiple places around pf. The chosen interface is not
stored in source node.
Move the interface selection into pf_map_addr(), have the function
return it together with the chosen IP address and ensure its stored
in struct pf_ksrc_node, store it in the source node and use the stored
value when needed.
Robert Wing [Wed, 23 Aug 2023 18:39:13 +0000 (10:39 -0800)]
bectl: make mount subcommand less verbose
The mount subcommand currently produces output such as:
# bectl mount <bootenv>
Successfully mounted <bootenv> at <mountpoint>
This commit changes it to only print the mountpoint:
# bectl mount <bootenv>
<mountpoint>
This makes it easier to script the mount subcommand. If an error occurs
while mounting, an error message is printed to stderr and bectl will
exit with a non-zero value.
Jessica Clarke [Wed, 23 Aug 2023 17:00:16 +0000 (18:00 +0100)]
Makefile: Support universe-toolchain on non-FreeBSD
We currently pass MACHINE and MACHINE_ARCH as TARGET and TARGET_ARCH
respectively for universe-toolchain, but on non-FreeBSD these may not
have values that we understand (e.g. on Linux it will be x86_64 rather
than amd64) for TARGET/TARGET_ARCH (note that we do support them for
MACHINE/MACHINE_ARCH). Since the choice is a bit arbitrary and merely
determines what LLVM's default triple will be, use amd64 on non-FreeBSD
as a known-good default.
Jessica Clarke [Wed, 23 Aug 2023 16:56:56 +0000 (17:56 +0100)]
tools/build/make.py: Make --with-default-sys-path mirror usr.bin/bmake
The top-level Makefile passes -m to its sub-makes in order to ensure
they use the in-tree mk files in share/mk, but the top-level make itself
has to rely on whatever environment the bmake used has. For FreeBSD, we
configure the system bmake with .../share/mk:/usr/share/mk, which means
it will pick up src's share/mk whenever run from within the src tree,
but currently for non-FreeBSD we configure our bootstrap bmake only with
bmake's own mk files. This is mostly compatible, with two exceptions:
1. "targets" runs at the top level, but needs TARGET_MACHINE_LIST and
the corresponding MACHINE_ARCH_LIST_${target}, otherwise it will just
print an empty list.
2. "universe" and "universe-toolchain", when run at the top level (i.e.
not via the various wrappers around universe like tinderbox), end up
failing in universe-toolchain itself with:
bmake[1]: "/path/to/freebsd/share/mk/src.sys.obj.mk" line 112: Cannot use MAKEOBJDIR=
Unset MAKEOBJDIR to get default: MAKEOBJDIR='${.CURDIR:S,^${SRCTOP},${OBJTOP},}'
By including .../share/mk in the default sys path like FreeBSD's system
bmake we ensure that we get the in-tree mk files for the top-level make,
not just sub-makes, and avoid such issues.
Note that we cannot (yet) stop using the installed mk files, since the
MAKEOBJDIRPREFIX check in Makefile runs in the object directory and uses
env -i, thereby losing the MAKESYSPATH exported by src.sys.env.mk. Other
such issues may also exist, though are likely rare if so.
We currently assume that any existing bootstrapped bmake binary will
work, but this means it never gets updated as contrib/bmake is, and
similarly we won't rebuild it as and when the configure arguments given
to boot-strap change. Whilst the former isn't necessarily a huge problem
given WANT_MAKE_VERSION rarely gets bumped in Makefile, having fewer
variables is a good thing, and so it's easiest if we just always keep it
up-to-date rather than trying to do something similar to what's already
in Makefile (which may or may not be accurate, given updating FreeBSD
gives you an updated bmake, but nothing does so for our bootstrapped
bmake on non-FreeBSD). The latter is more problematic, though, and the
next commit will be changing this configuration.
We thus now add in two checks. The first is to compare MAKE_VERSION
against _MAKE_VERSION from contrib/bmake/VERSION. The second is to
record at bootstrap time the exact configuration used, and compare that
against what we would bootstrap with.
Andrew Turner [Wed, 23 Aug 2023 14:32:56 +0000 (15:32 +0100)]
Support dynamically sized register sets
We don't always know the size of the register set at compile time,
e.g. on arm64 the size of the SVE registers need to be queried on boot.
To support register sets that needs to be calculated at run time
query the correct size when it is zero.
Andrew Turner [Tue, 22 Aug 2023 10:51:26 +0000 (11:51 +0100)]
gicv3: Split out finding the page size
When adding indirect (2 level) tabled we will need to know the page
size to calculate the size of the level 1 table. To allow for this find
the page size before entering the loop to calculate the final register
value.
Piotr Kubaj [Tue, 22 Aug 2023 10:45:56 +0000 (12:45 +0200)]
iavf: remove compatibility code and address some warnings
Code for pre-11 FreeBSD versions is removed.
Also removed are macros that are not used anymore and "i" variable
does not shadow anymore other "i" variable.
Zhenlei Huang [Wed, 23 Aug 2023 09:48:12 +0000 (17:48 +0800)]
net: Do not overwrite if_vlan's PCP
In commit c7cffd65c5d8 the function ether_8021q_frame() was slightly
refactored to use pointer of struct ether_8021q_tag as parameter qtag to
include the new option proto.
It is wrong to write to qtag->pcp as it will effectively change the memory
that qtag points to. Unfortunately the transmit routine of if_vlan parses
pointer of the member ifv_qtag of its softc which stores vlan interface's
PCP internally, when transmitting mbufs that contains PCP the vlan
interface's PCP will get overwritten.
Fix by operating on a local copy of qtag->pcp. Also mark 'struct ether_8021q_tag'
as const so that compilers can pick up such kind of bug.
PR: 273304
Reviewed by: kp
Fixes: c7cffd65c5d85 Add support for stacked VLANs (IEEE 802.1ad, AKA Q-in-Q)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39505
Michael Tuexen [Wed, 23 Aug 2023 06:36:15 +0000 (08:36 +0200)]
sctp: improve handling of SHUTDOWN and SHUTDOWN ACK chunks
When handling a SHUTDOWN or SHUTDOWN ACK chunk detect if the peer
is violating the protocol by not having made sure all user messages
are reveived by the peer. If this situation is detected, abort the
association.
Kyle Evans [Wed, 23 Aug 2023 03:40:45 +0000 (22:40 -0500)]
libc: iconv: zero out cv_shared on allocation
Right now we have to zero-initialize most fields in the varius callers,
but this is a little error prone. Simplify it by zeroing it out upon
allocation instead, drop the other redundant initialization.
Reviewed by: markj
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D41546
Kyle Evans [Wed, 23 Aug 2023 03:40:45 +0000 (22:40 -0500)]
libc: fix c*rtomb/mbrtoc*
In 693f88c9da8d ("iconv_std: complete the //IGNORE support"), we
more completely implemented //IGNORE, which changed the semantics of
ci_discard_ilseq. DISCARD_ILSEQ semantics are supposed to match
//IGNORE, so we really can't do much about that particular
incompatibility. This broke c*rtomb and mbrtoc* handling of invalid
sequences, but it turns out they don't want DISCARD_ILSEQ semantics at
all; they really want the subset that we call
_CITRUS_ICONV_F_HIDE_INVALID.
This restores the exact flow in iconv_std to precisely how it happened
prior to 693f88c9da8d.
PR: 265871
Fixes: 693f88c9da8d ("iconv_std: complete the //IGNORE support")
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D41513
This is an attempt at clean-room implementation of the Linux'
membarrier(2) syscall. For documentation, you would need to read
both membarrier(2) Linux man page, the comments in Linux
kernel/sched/membarrier.c implementation and possibly look at
actual uses.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32360
For amd64, i386, arm, and riscv, i.e. all architectures except arm64,
the custom implementation is provided since we maintain the bitmask of
active CPUs anyway.
Arm64 uses somewhat naive iteration over CPUs and match current vmspace'
pmap with the argument. It is not guaranteed that vmspace->pmap is the
same as the active pmap, but the inaccuracy should be toleratable.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32360
Bartosz Sobczak [Tue, 22 Aug 2023 23:07:11 +0000 (16:07 -0700)]
ofed: mask seq_num identifier to occupy only 3 bytes
The seq_num among other things is used to assign rq_psn value, which is
a 24-bit identifier. When the seq_num is full 4-byte value, we are
usually receiving: '_ib_modify_qp rq_psn overflow, masking to 24 bits'
warning.
This is burdensome for running rdma traffic with large number of
connections, because the number of logs is growing fast.
Jessica Clarke [Tue, 22 Aug 2023 20:01:03 +0000 (21:01 +0100)]
libzstd: Explicitly define ZSTD_DISABLE_ASM
On FreeBSD, ZSTD_ASM_SUPPORTED is defined as 0, but on macOS and Linux
it is defined as 1, yet we don't build any of the assembly sources.
Rather than add them just for bootstrapping on non-FreeBSD, explicitly
define ZSTD_DISABLE_ASM so they're not needed and everything is
consistent.
This fixes building a bootstrap LLVM toolchain on non-FreeBSD amd64 (the
only architecture with assembly available).
Jessica Clarke [Tue, 22 Aug 2023 20:00:37 +0000 (21:00 +0100)]
arm: Add missing no-ctfconvert for fw_stub.awk target
This target produces a C file not an object file, so using ctfconvert on
it should not be attempted. This keeps it in sync with all other uses of
fw_stub.awk, squashes a warning seen during the build of TEGRA124 on
FreeBSD and avoids the same issue failing the build on non-FreeBSD (such
errors are #ifdef'ed into being warnings on FreeBSD in ctfconvert, which
should be revisited in the future).
Reviewed by: manu
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41542
Jessica Clarke [Tue, 22 Aug 2023 20:00:28 +0000 (21:00 +0100)]
kbdcontrol: Support building as a bootstrap tool on old and non-FreeBSD
Systems that predate 971bac5ace7a ("kbd: consolidate kb interfaces
(phase one)") cannot build kbdcontrol since kbdelays and kbrates moved
to sys/kbio.h. Moreover, on non-FreeBSD, it requires all kinds of ioctls
and sysctls that are highly FreeBSD-specific to build, but we use it as
a bootstrap tool to generate the keymaps used by some kernels (LINT ones
in particular). Thus, when bootstrapping kbdcontrol, disable everything
that's not needed for that singular use, and use the in-tree kbio.h to
get the definitions of the necessary structures.
This allows KBDMUX_DFLT_KEYMAP, UKBD_DFLT_KEYMAP and ATKBD_DFLT_KEYMAP
to be enabled when building on non-FreeBSD, and thus LINT kernels.
Marius Strobl [Tue, 22 Aug 2023 18:12:59 +0000 (20:12 +0200)]
tcp_info: Add and export more FreeBSD-specific fields
This change adds struct tcp_info fields corresponding to the following
struct tcpcb ones:
- snd_una
- snd_max
- rcv_numsacks
- rcv_adv
- dupacks
Note that while both tcp_fill_info() and fill_tcp_info_from_tcb() are
extended accordingly, no counterpart of rcv_numsacks is available in
the cxgbe(4) TOE PCB, though.
Zhenlei Huang [Tue, 22 Aug 2023 09:20:10 +0000 (17:20 +0800)]
geom_linux_lvm: Check the offset of physical volume header
The LVM label is stored on any of the first four sectors, and the
PV (physical volume) header is stored within the same sector following
the LVM label. The current implementation does not fully check the
offset of PV header, when attaching a bad formatted LVM PV the kernel
may crash due to out-of-bounds memory read.
bhyve: add config option to load ACPI tables into memory
For backward compatibility, the ACPI tables are loaded into the guest
memory. Windows scans the memory, finds the ACPI tables and uses them.
It ignores the ACPI tables provided by the UEFI. We are patching the
ACPI tables in the guest memory, so that's mostly fine. However, Windows
will break when the ACPI tables become to large or when we add entries
which can't be patched by bhyve. One example of an unpatchable entry, is
a TPM log. The TPM log has to be allocated by the guest firmware. As the
address of the TPM log is unpredictable, bhyve can't assign it in the
memory version of the ACPI tables. Additionally, this makes it
impossible for bhyve to calculate a correct checksum of the table.
By default ACPI tables are still loaded into guest memory for backward
compatibility. The new acpi_tables_in_memory config value can be set to
false to avoid this behaviour.
John Baldwin [Tue, 22 Aug 2023 04:02:42 +0000 (21:02 -0700)]
libcrypto: Update assembly build glue for x86 for OpenSSL 3.0.
Notably, define AES_ASM which is required for any AES acceleration
(OpenSSL 1.0 gated all AES acceleration on OPENSSL_CPUID_OBJ instead).
Enabling this exposed that new assembly files added in OpenSSL 3.0
needed to be included in the build (aes-x86-64.S and aes-586.S). Both
of these files supplant both aes_core.c and aes_cbc.c. The last file
had to be moved out of the MI SRCS line for aes and into each ASM_*
for non-x86.
As part of this I audited the generated configdata.pm for amd64, i386,
and aarch64 and found the following additional discrepecancies that are
fixed here as well:
- Enabled BSAES_ASM on amd64 which requires bsase-x86_64.S
- Enabled WHIRLPOOL_ASM on amd64 (asm sources already built)
- Enabled CMLL_ASM on amd64 and i386 (asm sources already built)
aarch64 had no discreprecancies in configdata.pm, and no *.pl asm
generators were missing for aarch64 in Makefile.asm. I did not check
powerpc or armv7, but for armv7 all of the asm generators seem to be
present in Makefile.asm.
Reported by: gallatin (AES-GCM using plain software on amd64)
Reviewed by: gallatin, ngie, emaste
Differential Revision: https://reviews.freebsd.org/D41539
Ed Maste [Fri, 18 Aug 2023 03:29:33 +0000 (23:29 -0400)]
x86: handle domains with no CPUs usable for intr delivery
We can end up with a domain having no CPUs capable of receiving I/O
interrupts. This can occur, for example, when all APIC IDs in a given
domain are 256 or greater, and we have no IOMMU.
In this case disable per-domain interrupt support, effectively reverting
to the behaviour before commit a48de40bcc09 ("Only use CPUs in the
domain the device is attached to for default"). This has a performance
impact but at least allows the system to be functional. It is a stop-
gap until we can rely on the presence of an IOMMU on all x86 platforms.
Thanks to AMD for providing the high-thread-count machine I used for
testing this change, and to cperciva for testing on other hardware.
Reviewed by: jhb
Tested by: cperciva, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41501
This changeset adds a baseline implementation of memcmp and bcmp
for amd64. The same code is used for both functions with conditional
code were the behaviour differs (we need more precise output for the
memcmp case).
FreeBSD documents that memcmp returns the difference between the
mismatching characters. Slightly faster code would be possible could
we relax this requirement to the ISO/IEC 9899:1999 requirement of
merely returning a negative/positive integer or zero.
Performance is better than bionic and glibc, except for long strings
were the two are 13% faster. This could be because they use SSE4
ptest which we cannot use in a baseline kernel.
Sponsored by: The FreeBSD Foundation
Approved by: mjg
Differential Revision: https://reviews.freebsd.org/D41442
This commit adds a baseline implementation of stpcpy(3) for amd64.
It performs quite well in comparison to the previous scalar implementation
as well as agains bionic and glibc (though glibc is faster for very long
strings). Fiddle with the Makefile to also have strcpy(3) call into the
optimised stpcpy(3) code, fixing an oversight from D9841.
Sponsored by: The FreeBSD Foundation
Reviewed by: imp ngie emaste
Approved by: mjg kib
Fixes: D9841
Differential Revision: https://reviews.freebsd.org/D41349
Doug Moore [Mon, 21 Aug 2023 17:28:51 +0000 (12:28 -0500)]
pctrie: change for vm_radix compatibility
Restructure parts of pctrie code to make it more compatible with the
needs of vm_radix code.
1. End passing function pointers for memory management.
By breaking insertion into two functions, the call for allocating
memory can happen at the top level and be inlined, rather than
happening via an function pointer to a memory allocator.
By changing the remove function slightly, freeing of memory, when
necessary, can happen at the top level and be inlined.
By turning the reclamation code into two functions, one for starting
iteration over to-be-freed nodes and the other continuing it, all the
freeing can happen at the top level and be inlined.
2. Offer a version of remove that does not panic and returns the freed
value (or NULL).
3. Offer a 'replace' operation, to replace one leaf with another that
has the same key.
These are three of the roadblocks that prevent code sharing between
pctrie and vm_radix code.
It is modelled after aligned_alloc(3). Most importantly, to free the
allocation, __crt_free() can be used. Additionally, caller may specify
offset into the aligned allocation, so that we return offset-ed from
alignment pointer.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D41150
The current version of LinuxKPI lacks support for "page pool" which
needs enhancing and updating a decade or so old shortcut mapping
struct page directly to struct vm_page.
Complement the driver to make compile on FreeBSD
using LinuxKPI with changes covered by #ifdef (__FreeBSD__).
Add the module build framework but keep disconnected from the
build for now.
The current driver (or rather LinuxKPI) lacks support for some
"qcom" bits needed in order to get things working (as does ath11k).
There was interest by various people to enhance support further
for ath11k which will equally benefit ath12k.
Given the lack of full license texts on the files this is
imported under the draft policy for handling SPDX files (D29226)
and with approval for BSD-3-Clause-Clear. [1]
Approved by: core (jhb, 2023-05-11) [1]
MFC after: 20 days
Bjoern A. Zeeb [Wed, 17 May 2023 20:21:14 +0000 (20:21 +0000)]
ath10k: specifically mark a debug workaround for an early firmware crash
During the last import a (debug) workaround to avoid an early firmware
crash was imported. In order to remember and to separate it from
upstream sources put it under #ifdef
Bjoern A. Zeeb [Tue, 16 May 2023 22:07:53 +0000 (22:07 +0000)]
LinuxKPI: 802.11: update compat code for updated drivers
Adjust and add structs, fields, functions to make more modern versions
of LinuxKPI based wireless drivers (based on wireless-testing (
wt-2023-06-09, wt-2023-07-24, and later)) compile.
Some of these changes can only be applied once all drivers get
updated to not break the old versions currently in the tree.
Mark those changes with __FOR_LATER_DRV_UPDATE for now and flip the
switch at a later point.
Sponsored by: The FreeBSD Foundation
MFC after: 20 days
Ed Maste [Sun, 20 Aug 2023 19:16:54 +0000 (15:16 -0400)]
ssh: fix OpenSSH 9.4 regression with multiplexed sessions
Upstream commit message:
upstream: fix regression in OpenSSH 9.4 (mux.c r1.99) that caused
multiplexed sessions to ignore SIGINT under some circumstances.
Reported by / feedback naddy@, ok dtucker@
rtld: unlock bind lock when calling into crt __pthread_distribute_static_tls method
The method might require resolving and binding symbols, which means
recursing on the bind lock. It is safe to unlock the bind lock,
since we operate on the private object list, and user attempting to
unload an object from the list of not yet fully loaded objects caused
self-inflicted race.
It is similar to how we treat user' init/fini methods.
Reported by: stevek
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Val Packett [Sun, 20 Aug 2023 09:53:32 +0000 (12:53 +0300)]
Add atopcase, the Apple HID over SPI input driver
The driver provides support for Human Interface Devices (HID) on
Serial Peripheral Interface (SPI) buses on Apple Intel Macs
produced in 2015-2018.
The driver appears to work more stable after installation of Darwin OSI
in acpi(4) driver.
To install Darwin OSI insert following lines into /boot/loader.conf:
hw.acpi.install_interface="Darwin"
hw.acpi.remove_interface="Windows 2009, Windows 2012"