Introduce the allocuio() and freeuio() functions to allocate and
deallocate struct uio. This hides the actual allocator interface, so it
is easier to modify the sub-allocation layout of struct uio and the
corresponding iovec array.
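A minimal userland sketch of the idea, assuming the header and its iovec array share one allocation. The names struct uio_sketch, allocuio_sketch() and freeuio_sketch() are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/uio.h>	/* struct iovec */

/* Hypothetical stand-in for the kernel's struct uio; only two fields shown. */
struct uio_sketch {
	struct iovec *uio_iov;	/* points into the same allocation */
	int uio_iovcnt;
};

/* Allocate the header and its iovec array as a single block. */
static struct uio_sketch *
allocuio_sketch(int iovcnt)
{
	struct uio_sketch *uio;

	if (iovcnt < 0)
		return (NULL);
	uio = calloc(1, sizeof(*uio) + (size_t)iovcnt * sizeof(struct iovec));
	if (uio == NULL)
		return (NULL);
	uio->uio_iov = (struct iovec *)(uio + 1);
	uio->uio_iovcnt = iovcnt;
	return (uio);
}

/* Callers never see how the memory was laid out or obtained. */
static void
freeuio_sketch(struct uio_sketch *uio)
{
	free(uio);
}
```

Because callers only use the two functions, the layout behind them can change without touching any call site.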
tcp: stop timers and clean scoreboard in tcp_close()
Stop timers when in tcp_close() instead of doing that in tcp_discardcb().
A connection in CLOSED state shall not need any timers. Assert that no
timer is rescheduled after that in tcp_timer_activate() and verify that
this is also the expected state in tcp_discardcb().
tcp: stop doing superfluous work after sending RST
When sending a RST control segment in tcp_output() it
means we are in TCPS_CLOSED state, called from tcp_drop().
Once the RST is sent, don't call tcp_timer_activate() or
update anything in tcpcb, since that will go away shortly.
tcp: clean scoreboard when releasing the socket buffer
The SACK scoreboard is conceptually an extension of the socket
buffer. Remove it when the socket buffer goes away with
soisdisconnected(). Verify that this is also the expected
state in tcp_discardcb().
John Baldwin [Fri, 9 Feb 2024 19:53:43 +0000 (11:53 -0800)]
cam: Check if cam_simq_alloc fails for the xpt bus during module init
This is very unlikely to fail (and if it does, CAM isn't going to work
regardless), but fail with an error rather than a guaranteed panic via
NULL pointer dereference.
John Baldwin [Fri, 9 Feb 2024 18:27:45 +0000 (10:27 -0800)]
pcib: Refine handling of resources allocated from bridge windows
Fix a long-standing layering violation in the original NEW_PCIB code
by not passing suballocated resources up to the parent bus for
activation and mapping. Instead, handle activation and mapping of
sub-allocated resources in this driver. When mapping resources,
request a mapping from a suitable sub-region of the resource allocated
from the parent bus for the associated bridge window.
Note that this does require passing RF_ACTIVE (with RF_UNMAPPED) when
allocating bridge window resources from the parent.
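The sub-region lookup described above can be sketched as follows. This is an illustration only: struct range_sketch and subregion_of_window() are hypothetical and do not correspond to the real struct resource or bus methods.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inclusive address range, standing in for a struct resource. */
struct range_sketch {
	uint64_t start;
	uint64_t end;
};

/*
 * Compute where a sub-allocated child range sits inside its bridge window.
 * Returns 0 and stores the byte offset and length on success, -1 if the
 * child is not fully contained in the window.
 */
static int
subregion_of_window(const struct range_sketch *window,
    const struct range_sketch *child, uint64_t *offset, uint64_t *length)
{
	if (child->start > child->end ||
	    child->start < window->start || child->end > window->end)
		return (-1);
	*offset = child->start - window->start;
	*length = child->end - child->start + 1;
	return (0);
}
```

The offset and length are what a driver would hand to the parent when asking for a mapping of part of the window resource.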
John Baldwin [Fri, 9 Feb 2024 18:27:45 +0000 (10:27 -0800)]
acpi: Cleanup handling of suballocated resources
For resources suballocated from the system resource rmans, handle
those in the ACPI bus driver without passing them up to the parent.
This means using bus_generic_rman_* for several bus methods for
operations on suballocated resources. For bus_map/unmap_resource,
find the system resource allocated from the parent bus (nexus) that
contains the range being mapped and request a mapping of that parent
resource.
This avoids a layering violation where nexus drivers were previously
asked to manage the activation and mapping of resources
belonging to the ACPI resource managers.
Note that this does require passing RF_ACTIVE (with RF_UNMAPPED) when
allocating system resources from the parent.
While here, don't assume that the parent bus (nexus) provides a
resource list that sysres resources are placed on. Instead, create a
dedicated resource_list in the ACPI bus driver's softc to hold sysres
resources.
Jessica Clarke [Fri, 9 Feb 2024 18:13:47 +0000 (18:13 +0000)]
bsdinstall: Add new Auto option to netconfig interface selection dialog
This changes the OK / Cancel buttons into Auto / Manual / Cancel, with
Auto being the default. Manual behaves like OK used to, i.e. presents a
series of dialogs asking exactly how to configure the interface, and
Cancel is unchanged, exiting with exit code 1. Auto will attempt to
configure IPv4+DHCP and IPv6+SLAAC with no interaction, failing only if
neither can be configured, thereby supporting all of IPv4-only,
IPv6-only and dual-stack environments. If at least one DNS server is
provided, it will also skip asking for DNS settings, otherwise it will
act like Manual mode for the purposes of DNS settings and prompt. For a
standard dual-stack environment this cuts down the number of netconfig
dialogs from 6 (interface, IPv4, DHCP, IPv6, SLAAC, DNS) to just the
first one.
Debugging boot issues can be helped by
logging each rc.d script as it is run
and being able to selectively enable/disable set -x
debug.sh provides an elaborate framework for debugging shell scripts.
For secure systems, we want to be paranoid about what we read
during boot.
dot() simply reads (.) arg file if it exists
vdot() if mac_veriexec is active, ignore unverified files
otherwise behaves much the same as dot()
safe_dot() in safe_eval.sh allows reading an untrusted file;
limiting the input to simple variable assignments.
In load_rc_config allow caller to provide an option to indicate how to
handle its arg:
-v use vdot()
-s use sdot() which will try to use vdot() and fall back to safe_dot()
The default is to read using dot()
rc_run_scripts()
encapsulate the running of rc.d scripts
so that we can easily call it more than twice.
We vdot local.rc.subr to pick up extensions (like
run_rc_scripts_final) and overrides.
We also allow rc.subr.local or rc.conf to set rc_config_xtra
eg (rc_config_xtra=XXX for historic compatibility)
rc use set -o verify around the reading in of rc.subr
This has no effect if mac_veriexec is not active, but if it is, this ensures
rc.subr has not been tampered with.
Austin Zhang [Wed, 7 Feb 2024 18:55:02 +0000 (12:55 -0600)]
ntb: Add Intel Xeon Gen4 support
The NTB hardware of Xeon Ice Lake and Sapphire Rapids has register mapping changes.
Add a new NTB_XEON_GEN4 device type and use it to conditionalize the driver logic that differs.
Emil Tsalapatis [Thu, 8 Feb 2024 01:13:43 +0000 (20:13 -0500)]
fusefs: only test for incoherency if FN_SIZECHANGE is set
FUSE emits spurious incoherency warnings in writethrough mode. The
warnings are triggered by setattr calls generated by vnode truncation
turning the cached va_size vattr stale, causing comparisons with the
fresh version provided by the server to fail. Only validate the vnode's
va_size vattr if the FN_SIZECHANGE flag is set.
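A sketch of the check as described, assuming the flag gates the comparison. FN_SIZECHANGE_SKETCH and size_incoherent() are hypothetical; the real flag value and the surrounding fusefs code differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value; the real FN_SIZECHANGE is defined in fusefs. */
#define FN_SIZECHANGE_SKETCH 0x1u

/*
 * Decide whether to report a size mismatch between the cached size and
 * the size returned by the server.  The comparison is made only when the
 * flag is set, as the commit message describes.
 */
static bool
size_incoherent(uint32_t node_flags, uint64_t cached_size,
    uint64_t server_size)
{
	if ((node_flags & FN_SIZECHANGE_SKETCH) == 0)
		return (false);		/* skip validation entirely */
	return (cached_size != server_size);
}
```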
This is a part of the research work at RCSLab, University of Waterloo.
Brooks Davis [Thu, 8 Feb 2024 18:21:56 +0000 (18:21 +0000)]
libsys: actually install manpages
In initial hacking I'd bluntly disabled manpage installation in libsys,
then later disabled them for libc, but forgot to fix the former, leading
to no syscall manpages.
PR: 276887
Reported by: Martin Birgmeier <d8zNeCFG@aon.at>
tcp: ensure tcp_sack_partialack does not inflate cwnd after RTO
The implicit assumption of snd_nxt always being larger than
snd_recover is not true after RTO. In that case, cwnd
would get inflated to ssthresh, which may be much larger
than the current pipe (data in flight).
tcp: calculate ssthresh on RTO according to RFC5681
Per RFC5681, only adjust ssthresh on the initial
retransmission timeout. Since RTO often happens
during loss recovery, while cwnd no longer tracks
all data in flight, calculate pipe properly.
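For reference, RFC 5681 equation (4) sets ssthresh to max(FlightSize / 2, 2 * SMSS). A small sketch of that formula follows; ssthresh_on_loss() is a hypothetical helper, not the function in the TCP stack, and it takes the amount of data in flight as an input rather than deriving it.

```c
#include <assert.h>
#include <stdint.h>

/* RFC 5681 equation (4): ssthresh = max(FlightSize / 2, 2 * SMSS). */
static uint32_t
ssthresh_on_loss(uint32_t flight_size, uint32_t smss)
{
	uint32_t half = flight_size / 2;
	uint32_t min_thresh = 2 * smss;

	return (half > min_thresh ? half : min_thresh);
}
```

With 20000 bytes in flight and an SMSS of 1460 this yields 10000; with only 2000 bytes in flight the 2 * SMSS floor of 2920 applies.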
tcp: use tcp_fixed_maxseg instead of tcp_maxseg in cc modules
tcp_fixed_maxseg() is the streamlined calculation of typical
tcp options and more suitable for heavy use in the congestion
control modules on every received packet.
Gleb Smirnoff [Thu, 8 Feb 2024 17:00:23 +0000 (09:00 -0800)]
unix: retire LOCAL_CONNWAIT
This socket option was added in 6a2989fd54a9 together with LOCAL_CREDS.
Both options originate from NetBSD. The LOCAL_CREDS seems to be used by
some software and is covered by our test suite.
The main problem with LOCAL_CONNWAIT is that it doesn't work as
documented. A basic test shows that connect(2) indeed blocks, but
accept(2) on the other side does not wake it up. Indeed, I don't see what
code in the accept(2) path would go into the peer socket of a unix/stream
listener's child and would make wakeup(&so->so_timeo). I tried the test
even on a FreeBSD 6.4-RELEASE and it produced the same results as on
CURRENT.
The other thing that puzzles me is why that option would be useful even if
it worked? Because on unix/stream you can send(2) immediately after
connect(2) and that would put data on the peer receive buffer even before
listener had done accept(2). In other words, one side can do connect(2)
then send(2), only after the remote side would make accept(2) and the
remote would see the data sent before the accept(2). Again this
undocumented feature of unix(4) is present on all versions from FreeBSD 6
to CURRENT.
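The connect(2) then send(2) before accept(2) sequence can be reproduced with a short userland sketch like the one below. The helper name send_before_accept() and the /tmp socket path are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * connect(2) and send(2) on a unix/stream socket before the listener
 * calls accept(2), then accept and read.  Returns the number of bytes
 * received by the accepted socket, or -1 on any failure.
 */
static ssize_t
send_before_accept(char *buf, size_t buflen)
{
	struct sockaddr_un addr;
	ssize_t n = -1;
	int lfd, cfd, afd = -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	snprintf(addr.sun_path, sizeof(addr.sun_path),
	    "/tmp/unixdemo.%ld.sock", (long)getpid());
	unlink(addr.sun_path);		/* remove a stale socket, if any */

	lfd = socket(AF_UNIX, SOCK_STREAM, 0);
	cfd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (lfd == -1 || cfd == -1)
		goto out;
	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
	    listen(lfd, 1) == -1)
		goto out;
	/* Client side: connect and send while nobody has accepted yet. */
	if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
	    send(cfd, "hello", 5, 0) != 5)
		goto out;
	/* Server side: only now accept, then read the queued data. */
	afd = accept(lfd, NULL, NULL);
	if (afd != -1)
		n = recv(afd, buf, buflen, 0);
out:
	if (afd != -1)
		close(afd);
	if (cfd != -1)
		close(cfd);
	if (lfd != -1)
		close(lfd);
	unlink(addr.sun_path);
	return (n);
}
```

Both the client and the listener run in one process here, which is only possible because connect(2) does not wait for accept(2).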
Gleb Smirnoff [Thu, 8 Feb 2024 17:00:23 +0000 (09:00 -0800)]
tests/unix_passfd: test that control mixed with data creates records
If a socket has data interleaved with control, it never allows reading
two pieces of data, nor two pieces of control, with one recvmsg(2). In
other words, presence of control makes a SOCK_STREAM socket behave like
SOCK_SEQPACKET, where control marks the records. This is not a documented
or specified behavior, but this is how it worked always for BSD sockets.
If you look closer at it, this actually makes a lot of sense, as if it
were the opposite both the kernel code and an application code would
become way more complex.
The change makes recvfd_payload() return the received length and requires
caller to do ATF_REQUIRE() itself. This required a small change to
existing test rights_creds_payload. It also refactors a bit f28532a0f363,
pushing two identical calls out of TEST_PROTO ifdef.
Gleb Smirnoff [Thu, 8 Feb 2024 17:00:23 +0000 (09:00 -0800)]
unix/stream: do not put empty mbufs on the socket
It is a legitimate case to use sendmsg(2) to send control only, with zero
bytes of data and then recvmsg(2) them with zero length iov, receiving
control only. This sendmsg(2)+recmsg(2) would leave a zero length mbuf on
the top of the socket buffer. If you now try to repeat this combo again,
your recvmsg(2) would not return control data, because it sits behind an
MT_DATA mbuf and you have provided zero length uio_resid. IMHO, best
strategy to deal with zero length buffers in a chain is to not put them
there in the first place. Thus, solve this right in uipc_send() instead
of touching soreceive_generic().
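The "do not put them there in the first place" strategy can be sketched on a simplified chain. struct buf_sketch and strip_empty() are hypothetical stand-ins for mbufs and do not mirror the code in uipc_send().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-in for an mbuf. */
struct buf_sketch {
	struct buf_sketch *next;
	size_t len;
};

/*
 * Drop zero-length buffers from a chain before it is queued.
 * Returns the new head; the skipped buffers are unlinked, not freed.
 */
static struct buf_sketch *
strip_empty(struct buf_sketch *head)
{
	struct buf_sketch **pp = &head;

	while (*pp != NULL) {
		if ((*pp)->len == 0)
			*pp = (*pp)->next;	/* unlink the empty buffer */
		else
			pp = &(*pp)->next;
	}
	return (head);
}
```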
Lexi Winter [Sat, 3 Feb 2024 13:19:03 +0000 (13:19 +0000)]
traceroute: remove configuration #defines
traceroute used a series of #defines to specify what features are
available on the host platform. As traceroute is now in source, these
are unnecessary and complicate the code, so remove them.
Lexi Winter [Sat, 3 Feb 2024 13:10:09 +0000 (13:10 +0000)]
traceroute: move from contrib to usr.sbin
traceroute hasn't had a vendor import since 2002, while since then it's
had several significant FreeBSD-specific commits. Since it's unlikely
another vendor import will happen, and to make the merge of traceroute6
into traceroute easier, import traceroute into usr.sbin.
Mark Johnston [Thu, 8 Feb 2024 16:11:02 +0000 (11:11 -0500)]
arm64: Add pmap integration for KMSAN
- In pmap_bootstrap_san(), allocate the root PTPs for the shadow maps.
(For KASAN, this is done earlier since we need to do some special
bootstrapping for the kernel stack.)
- Adjust ifdefs to include KMSAN.
- Expand the shadow maps when pmap_growkernel() is called.
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D43405
Mark Johnston [Thu, 8 Feb 2024 16:10:43 +0000 (11:10 -0500)]
arm64: Simplify and improve KASAN shadow map bootstrapping
- Move pmap_bootstrap_allocate_kasan_l2() close to the place where it is
actually used.
- Simplify pmap_bootstrap_allocate_kasan_l2() a bit: eliminate some
unneeded variables and zero and exclude each 2MB mapping as we go
rather than doing that all at once. Excluded regions will be
coalesced.
- As a consequence of the previous point, ensure that we do not zero a
preexisting 2MB mapping.
- Simplify pmap_bootstrap_san() and prepare it to work with KMSAN.
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D43404
Mark Johnston [Thu, 8 Feb 2024 16:02:48 +0000 (11:02 -0500)]
arm64: Disable kernel superpage promotion when KMSAN is configured
The break-before-make operation required to promote or demote a
superpage leaves a window where the KMSAN runtime can trigger a fatal
data abort. More specifically, the code in pmap_update_entry() which
executes after ATTR_DESCR_VALID is cleared may implicitly attempt to
access KMSAN context via curthread, but we may be promoting or demoting
a 2MB page containing the curthread structure.
Reviewed by: imp
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D43158
Mark Johnston [Thu, 8 Feb 2024 16:00:40 +0000 (11:00 -0500)]
arm64: Add msan.h
This is mostly a copy of amd64's msan.h, except that we currently do not
avoid shadowing the kernel itself, and we need a more restrictive upper
bound in kmsan_md_unsupported() to avoid probing non-existent shadow
mappings of device mappings.
Reviewed by: imp
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D43156
Mark Johnston [Thu, 8 Feb 2024 15:57:36 +0000 (10:57 -0500)]
arm64: Make KMSAN aware of exceptions
- Call kmsan_intr_enter() when an exception occurs. This ensures that
code running in the exception context does not clobber thread-local
KMSAN state.
- Ensure that stack memory containing trap frames is treated as
initialized.
Co-authored-by: Alexander Stetsenko <alex.stetsenko@klarasystems.com>
Reviewed by: imp
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D43155
ck_pr/aarch64: Specify output operands for ck_pr_md_store_*
As in commit 2f9acab, we want to specify output operand widths so that
MSAN compiler instrumentation correctly updates the shadow map. In
particular, LLVM's implementation depends on having type information for
output operands, even when that's not otherwise necessary. Without it,
KMSAN in FreeBSD generates false positives on aarch64.
Reviewed by: cognet
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Brooks Davis [Wed, 7 Feb 2024 19:58:33 +0000 (19:58 +0000)]
rescue: Don't explicitly link with libsys
libpthread contains the symbols we need when statically linked. This
was a leftover from a prior version of ef9871c6205c that I failed to
remove before I pushed.
CVE-2020-24370 is a security vulnerability in lua. Although the CVE
description in CVE-2020-24370 said that this CVE only affected lua
5.4.0, according to lua this CVE actually existed since lua 5.2. The
root cause of this CVE is the negation overflow that occurs when you
try to take the negative of 0x80000000. Thus, this CVE also exists in
openzfs. Try to backport the fix to the lua in openzfs since the
original fix is for 5.4 and several functions have been changed.
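The negation overflow itself is easy to show in C. The sketch below is not the Lua patch; magnitude() is a hypothetical helper that illustrates one well-defined way to take the absolute value of the most negative int.

```c
#include <assert.h>
#include <limits.h>

/*
 * Magnitude of n without signed overflow.  Writing -n is undefined
 * behaviour in C when n == INT_MIN; converting to unsigned first makes
 * the negation well defined (it wraps modulo UINT_MAX + 1).
 */
static unsigned int
magnitude(int n)
{
	if (n < 0)
		return (0u - (unsigned int)n);
	return ((unsigned int)n);
}
```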
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: ChenHao Lu <18302010006@fudan.edu.cn>
Closes #15847
Brooks Davis [Wed, 7 Feb 2024 19:38:16 +0000 (19:38 +0000)]
libthr: filter rather than link with libsys
This allows gcc + GNU ld to link programs with -m32 -pthread without
erroring out due to _umtx_op_err being undefined (unless -lsys is added
to the link command).
We now always link _umtx_op_err into libthr (not just when it's static)
and filter it with libsys so we call that implementation. The dynamic
implementations (at least the assembly ones) should likely become stubs
as a further refinement.
Cameron Harr [Wed, 7 Feb 2024 17:12:12 +0000 (09:12 -0800)]
Add 'zpool status -e' flag to see unhealthy vdevs
When very large pools are present, it can be laborious to find
reasons for why a pool is degraded and/or where an unhealthy vdev
is. This option filters out vdevs that are ONLINE and with no errors
to make it easier to see where the issues are. Root and parents of
unhealthy vdevs will always be printed.
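The filtering rule can be sketched as a recursive walk over a simplified vdev tree. struct vdev_sketch and both helpers are hypothetical and are not the zpool(8) implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified vdev node; not the real ZFS structures. */
struct vdev_sketch {
	bool online;			/* state is ONLINE */
	unsigned long errors;		/* read + write + checksum errors */
	const struct vdev_sketch *child;	/* array of children */
	size_t nchildren;
};

/* A vdev is unhealthy if it is not ONLINE or has recorded errors. */
static bool
vdev_unhealthy(const struct vdev_sketch *vd)
{
	return (!vd->online || vd->errors != 0);
}

/* Print a vdev if it, or anything below it, is unhealthy. */
static bool
vdev_should_print(const struct vdev_sketch *vd)
{
	size_t i;

	if (vdev_unhealthy(vd))
		return (true);
	for (i = 0; i < vd->nchildren; i++)
		if (vdev_should_print(&vd->child[i]))
			return (true);
	return (false);
}
```

In this sketch the caller would print the root unconditionally and apply vdev_should_print() to everything below it.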
Testing:
ZFS errors and drive failures for multiple vdevs were simulated with
zinject.
Sample vdev listings with '-e' option
- All vdevs healthy
NAME STATE READ WRITE CKSUM
iron5 ONLINE 0 0 0
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Cameron Harr <harr1@llnl.gov>
Closes #15769
Mark Johnston [Wed, 7 Feb 2024 13:47:24 +0000 (08:47 -0500)]
vmm: Expose more registers to VM_GET_REGISTER
In a follow-up revision the gdb stub will support sending an XML target
description to gdb, which lets us send additional registers, including
the ones added in this patch.
Warner Losh [Wed, 7 Feb 2024 05:47:42 +0000 (22:47 -0700)]
acpica: Fix my mismerge
I merged in the limits.h include. I should have resolved this by
deleting it (since we have no easy way to 'fix' it with compat headers).
GENERIC doesn't bring in the debugger, but LINT does...
Warner Losh [Tue, 6 Feb 2024 23:11:38 +0000 (16:11 -0700)]
leapseconds: Update to the canonical place.
IERS is the source of truth for leap seconds. Their leapsecond file is
updated most quickly and is always right (unlike the IANA one which
often lags). IERS operates this public service for the express purpose
of random people downloading it. Their terms of service are compatible
with open source (we could include this in our release). Rather than
fighting with questions around this because the IANA one changed
locations or the auto update script broke, just use this.
This is in preference to the NIST ftp copy. NIST is in the process of
retiring their FTP services.
Warner Losh [Tue, 6 Feb 2024 22:46:06 +0000 (15:46 -0700)]
arm: Move locore-v6.S to locore.S
As a separate commit, now move locore-v6.S to locore.S. This makes git
annotate work, at least back to 2014 when Ian created locore-v6.S. svn
didn't save enough metadata for the converter to allow it to go back
further.
Warner Losh [Tue, 6 Feb 2024 22:42:03 +0000 (15:42 -0700)]
arm: Use locore-v6.S directly
Use locore-v6.S directly, rather than indirectly via including
locore.S. This loses acle-compat.h inclusion, but that's only needed for
gcc 4.8 and earlier. Since we don't support anything that old, there's
no need for it here.
Warner Losh [Tue, 6 Feb 2024 22:26:17 +0000 (15:26 -0700)]
arm: Catchup to atmel retirement
AT91 boot2 loaders have been long gone, and don't support the AT91 parts
that have armv7 cores (since we don't have specific support for
that). Mentioning its interface is OBE, so remove it.
Warner Losh [Tue, 6 Feb 2024 21:16:51 +0000 (14:16 -0700)]
git-arc: Retain color status messages
Newer versions of arcanist have an --ansi option to always include the
ansi colors when doing an arc list (or any command really). Add this to
the arc list that's relevant. Add filter to filter out the 'bolding'
though since that interferes with our parsing. This should restore the
color output after df834e06bbc7.
pf: Ensure that st->kif is obtained in a way which respects the r->rpool->mtx mutex
The redirection pool stored in r->rpool.cur is used for loadbalancing
and cur can change whenever loadbalancing happens, which is for every
new connection. Therefore it can't be trusted outside of pf_map_addr()
and the r->rpool->mtx mutex. After evaluating the ruleset, loadbalancing
decision is made in pf_map_addr() called from within pf_create_state()
and stored in the state itself.
This patch modifies BOUND_IFACE() so that it only uses the information
already stored in the state which has been obtained in a way which
respects the r->rpool->mtx mutex.
Vitaliy Gusev [Tue, 6 Feb 2024 15:36:17 +0000 (10:36 -0500)]
vmm: Fix compiling error with BHYVE_SNAPSHOT
The return values of copyin() and copyout() must be checked.
vm_snapshot_buf_cmp() is unused by the kernel and was incorrectly
implemented, so just remove it.
Allows the development, testing and deployment of netmap(4)-based code
on arm64 without having to recompile the kernel. netmap(4) is already
in the amd64 and powerpc64 default configs, so it does not seem
unreasonable to also provide it on arm64 by default.
Note that netmap(4) is useful even on systems without NICs that fully
support it.
Andriy Gapon [Tue, 6 Feb 2024 08:55:13 +0000 (10:55 +0200)]
fix poweroff regression from 9cdf326b4f by delaying shutdown_halt
The regression affected ACPI-based systems without EFI poweroff support
(including VMs).
The key reason for the regression is that I overlooked that poweroff is
requested by RB_POWEROFF | RB_HALT combination of flags. In my opinion,
that command is a bit bipolar, but since we've been doing that forever,
then so be it. Because of that flag combination, the order of
shutdown_final handlers that check for either flag does matter.
Some additional complexity comes from platform-specific shutdown_final
handlers that aim to handle multiple reboot options at once. E.g.,
acpi_shutdown_final handles both poweroff and reboot / reset. As
explained in 9cdf326b4f, such a handler must run after shutdown_panic to
give it a chance. But as the change revealed, the handler must also run
before shutdown_halt, so that the system can actually power off before
entering the halt limbo.
Previously, shutdown_panic and shutdown_halt had the same priority which
appears to be incompatible with handlers that can do both poweroff and
reset.
The above also applies to power cycle handlers.
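Why the order matters for the RB_POWEROFF | RB_HALT combination can be shown with a toy model. The flag values and the first_to_act() helper are illustrative only; the real flags are defined in sys/reboot.h and the real handlers are registered as shutdown_final event handlers.

```c
#include <assert.h>
#include <string.h>

/* Illustrative flag bits; the real values live in <sys/reboot.h>. */
#define RB_HALT_SKETCH		0x1
#define RB_POWEROFF_SKETCH	0x2

enum handler { H_HALT, H_POWEROFF };

/*
 * Return the name of the first handler, in the given run order, that
 * acts on howto.  Each handler only tests its own flag.
 */
static const char *
first_to_act(int howto, const enum handler *order, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		if (order[i] == H_HALT && (howto & RB_HALT_SKETCH))
			return ("halt");
		if (order[i] == H_POWEROFF && (howto & RB_POWEROFF_SKETCH))
			return ("poweroff");
	}
	return ("none");
}
```

With both flags set, running the halt handler first stops the system before the poweroff handler ever gets a turn; reversing the order lets the poweroff happen.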
PR: 276784
Reported by: many
Tested by: Katsuyuki Miyoshi <katsubsd@gmail.com>,
Masachika ISHIZUKA <ish@amail.plala.or.jp>
Fixes: 9cdf326b4fae run acpi_shutdown_final later to give other handlers a chance
MFC after: 1 week
Bartosz Sobczak [Tue, 6 Feb 2024 02:43:48 +0000 (18:43 -0800)]
ofed: fix warnings during libibverbs compilation
create_qp_handle_resp_common_cleanup should be void
__ibv_cleanup_wq should use wq->cond for cond destroy
both issues were overlooked in: a687910 ('Cleanup pthread locks in ofed RDMA verbs')
On Linux the ioctl_ficlonerange() and ioctl_ficlone() system calls
are expected to either fully clone the specified range or return an
error. The range may be for an entire file. While internally ZFS
supports cloning partial ranges there's no way to return the length
cloned to the caller so we need to make this all or nothing.
As part of this change support for the REMAP_FILE_CAN_SHORTEN flag
has been added. When REMAP_FILE_CAN_SHORTEN is set zfs_clone_range()
will return a shortened range when encountering pending dirty records.
When the flag is clear, zfs_clone_range() will block and wait for the records
to be written out allowing the blocks to be cloned.
Furthermore, the file range lock is held over the region being cloned
to prevent it from being modified while cloning. This doesn't quite
provide atomic semantics since if an error is encountered only a
portion of the range may be cloned. This will be converted to an
error if REMAP_FILE_CAN_SHORTEN was not provided and returned to the
caller. However, the destination file range is left in an undefined
state.
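The all-or-nothing rule can be sketched as a small decision function. CAN_SHORTEN_SKETCH, clone_result() and the choice of EAGAIN are assumptions for illustration; the real flag is REMAP_FILE_CAN_SHORTEN and the error actually returned by ZFS may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative flag bit, standing in for REMAP_FILE_CAN_SHORTEN. */
#define CAN_SHORTEN_SKETCH 0x1u

/*
 * Turn the number of bytes actually cloned into the value reported to
 * the caller: the full or shortened length on success, or a negative
 * errno when a partial clone is not acceptable.
 */
static int64_t
clone_result(uint64_t requested, uint64_t cloned, unsigned int flags)
{
	if (cloned == requested)
		return ((int64_t)cloned);
	if ((flags & CAN_SHORTEN_SKETCH) != 0 && cloned > 0)
		return ((int64_t)cloned);	/* shortened range is allowed */
	return (-EAGAIN);	/* placeholder error: all or nothing */
}
```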
A test case has been added which exercises this functionality by
verifying that `cp --reflink=never|auto|always` works correctly.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #15728
Closes #15842
Marius Strobl [Mon, 5 Feb 2024 19:36:13 +0000 (20:36 +0100)]
fib_algo(4): Lower level of algorithm switching messages to LOG_INFO
Otherwise, with the default flm_debug_level of LOG_NOTICE, it's rather
easy to trigger debug messages such as:
[fib_algo] inet.0 (bsearch4#18) rebuild_fd_flm: switching algo to
radix4_lockless
Also, the "severity" of these events generally only justifies LOG_INFO
and not LOG_NOTICE.
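The effect of the level change follows from how syslog priorities are ordered: numerically lower means more severe. The sketch below assumes a message is emitted when its level is at or below the configured debug level; would_log() is a hypothetical helper, not the fib_algo macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <syslog.h>	/* LOG_NOTICE, LOG_INFO, LOG_DEBUG */

/*
 * Assumed filter: a message is emitted when its level is at or below the
 * configured debug level (numerically lower means more severe).
 */
static bool
would_log(int msg_level, int debug_level)
{
	return (msg_level <= debug_level);
}
```

Under that assumption, a LOG_INFO message is suppressed at the default LOG_NOTICE debug level, while a LOG_NOTICE message is not.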
Marius Strobl [Mon, 5 Feb 2024 19:08:33 +0000 (20:08 +0100)]
sdhci_fsl_fdt(4): Actually use modified SDHCI capabilities
SDHCI_QUIRK_MISSING_CAPS needs to be set unconditionally so sdhci(4)
adheres to the slot caps and caps2 set by sdhci_fsl_fdt(4). However,
so far this bug didn't have an impact as the front-end only filters
SDHCI_CAN_DO_SUSPEND, which in turn isn't used, yet.
Warner Losh [Mon, 5 Feb 2024 22:13:57 +0000 (15:13 -0700)]
acpica: Create merge commit against vendor branch
Merge tracking branch 'vendor/acpica' for vendor/acpica/20230628, and
resolve conflicts.
This deletes files that we've deleted since the last merge (during SVN
times it seems) so future merges don't bring them up.
It resolves conflicts in several files that we have modified (but we can
likely fix the build system so we don't have to modify them since it's
almost all headers) and one ifndef kernel that could be solved with an
empty #define.
It also deletes new files in the platform directory that are similar to
prior non-freebsd platform files we've deleted.
Igor Ostapenko [Mon, 5 Feb 2024 16:22:31 +0000 (17:22 +0100)]
pf: Ensure that st->kif is obtained in a way which respects the r->rpool->mtx mutex
The redirection pool stored in r->rpool.cur is used for loadbalancing
and cur can change whenever loadbalancing happens, which is for every
new connection. Therefore it can't be trusted outside of pf_map_addr()
and the r->rpool->mtx mutex. After evaluating the ruleset, loadbalancing
decision is made in pf_map_addr() called from within pf_create_state()
and stored in the state itself.
This patch modifies BOUND_IFACE() so that it only uses the information
already stored in the state which has been obtained in a way which
respects the r->rpool->mtx mutex.
Brooks Davis [Fri, 5 Jan 2024 19:04:53 +0000 (19:04 +0000)]
libc: make syscall stubs empty for shared lib
They are always replaced by libsys so just make them empty. In
https://reviews.freebsd.org/D14609 x86 variants call abort2, but that
requires per-arch assembly and should be of low value in the steady
state.
Brooks Davis [Wed, 15 Nov 2023 23:35:16 +0000 (23:35 +0000)]
libc: link libsys as an auxiliary filter library
At runtime, when rtld loads libc it will also load libsys. For each
symbol that is present in both, the libsys one will override the libc
one. It continues to be the case that programs need only link against
libc (usually implicitly). The linkage to libsys is automatic.
Brooks Davis [Wed, 15 Nov 2023 23:31:57 +0000 (23:31 +0000)]
libsys: plumb in to build
libsys provides the FreeBSD kernel interface (auxargs, system calls,
vdso). It can be linked directly for programs using a non-standard
libc and will later be linked as a filter library to libc providing
the actual system call implementation.
Brooks Davis [Tue, 21 Nov 2023 16:55:06 +0000 (16:55 +0000)]
makesyscalls: generate private syscall symbols
For libsys we need to expose all the private symbols (_ and __sys_
prefixes) so libsys can replace the libc versions. Rather than trying
to maintain a table, teach makesyscalls to generate it.
There are a small number of "_" prefixed symbols that are exposed as
public interfaces rather than in the private symbol space. Since the
list is short, just hardcode it for now.
It doesn't appear that we need to export freebsd#_foo symbols for compat
system calls explicitly. If it turns out we do, there are probably few
enough of them to handle separately.
Brooks Davis [Tue, 21 Nov 2023 18:30:43 +0000 (18:30 +0000)]
libc: compile _once in libsys
auxv support requires _once(), but we don't want the libsys version
stomping on the libc version should they diverge in the future. We
could rename it entirely, but for now just hook it in via Makefile.sys.