libibverbs: remove nonexistent symbols from the linker map
The function ibv_query_device_ex is static inline, it is not exported
from the dso. With lld 16, which is much more picky about versioning and
undefined symbols, this becomes an error.
The ibv_register_driver driver symbol is explicitly versioned in
sources, it is non-existent in un-versioned object files.
This performs very well. x86-64-v3 and x86-64-v4 kernels were written,
too, but performed worse than the baseline kernel on short strings.
These may be added at a future point in time if the performance issues
can be fixed.
Kevin Bowling [Thu, 3 Aug 2023 20:49:15 +0000 (13:49 -0700)]
e1000: Enable TSO for lem(4) and em(4)
Most em(4) devices now enjoy TSO and TSO6, matching NetBSD and Linux
defaults.
A prior commit automasks TSO on 10/100 Ethernet due to errata and other
bugs for IPv6 were fixed recently allowing this.
Mike Karels identified a performance anomaly on Intel 82574L devices.
These are multiqueue enabled on FreeBSD since the conversion to
iflib. I am investigating whether this can be fixed, in the mean time
MSI-X with checksum offloads remain default.
i219 SPT devices have an errata that downclocks the DMA engine, which
results in TSO not being able to acheive line rate. Therefore, it is
disabled on:
* Intel(R) I219-LM and I219-V SPT
* Intel(R) I219-LM and I219-V SPT-H (2)
* Intel(R) I219-LM and I219-V LBG (3)
* Intel(R) I219-LM and I219-V SPT (4)
* Intel(R) I219-LM and I219-V SPT (5)
Many lem(4) devices enjoy TSO, exceptions being 82542, 82543, 82547.
TSO6 may be possible for some chipsets but I am still working through
my testing matrix and that is hidden behind hw.em.unsupported_tso.
If you encounter issues, you may disable TSO with for example:
ifconfig em0 -tso -tso6.
I ask to be informed of any deviations from normal operation requiring
this.
Thanks to cc@ for access to emulab.net.
On a sample I219 system it saves about 16% CPU on IPv4 and 19% on IPv6.
Ed Maste [Fri, 30 Sep 2022 12:14:22 +0000 (08:14 -0400)]
amd64: Bump MAXCPU to 1024 (from 256)
Hardware with more than 256 CPU cores is currently available and will
become increasingly common over FreeBSD 14's lifetime. Increase MAXCPU
in the amd64 GENERIC kernel configuration to 1024.
Earlier commits increased some related limits. These prerequisite
commits include at least:
- d7ed40243769 Increase MAX_APIC_ID safeguard to 0x800
- d1639e43c589 cpuset: increase userland maximum size to 1024
Global and allocated arrays sized by MAXCPU result in excessive bloat
on systems with lower core counts. In addition, some code used u_char
(8 bits) to hold a CPU index, which is not valid if MAXCPU is greater
than 256.
A number of recent commits addressed these sorts of issues, including
at least:
- 133935d26f20 pf: atomically increment state ids
- 74ac712f72cf vmm: Dynamically allocate a couple of per-CPU state save areas
- 78cfa762ebf2 callout: Move per-CPU callout state into the dpcpu region
- 42f722e721cd amd64: store pcids pmap data in pcpu zone
- 9801e7c275f6 smp_topo: dynamically allocate group array
- 9fb6718d1b18 smp: Dynamically allocate the stoppcbs array
- 2bb16c635249 x86: retire use of intr_bind
There are some additional allocations still to be converted and
more scalability work is required to make effective use of very high
core count systems, but this change allows us to boot on these systems
and provides a Kernel Binary Interface (KBI) for the FreeBSD 14 release
that supports these configurations.
Special thanks to AMD for providing hardware to test these changes.
PR: 269572
Reviewed by: des
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36838
`intr_bind(u_int vector, u_char cpu);` looked suspicious since
everywhere else "cpu" is a u_int and >256 processors isn't unreasonable
now. `intr_bind()` is not used anywhere in FreeBSD (now, after commit bf42f3738087). Time to remove.
Steve Kargl [Thu, 3 Aug 2023 19:51:17 +0000 (21:51 +0200)]
Clean up libm use of the __ieee754_ prefix
This removes the __ieee754_ prefix from a number of the math functions.
msun/src/math_private.h contains the statement that
/*
* ieee style elementary functions
*
* We rename functions here to improve other sources' diffability
* against fdlibm.
*/
#define __ieee754_sqrt sqrt
...
Here, fdlibm refers to https://netlib.org/fdlibm. It is seen from
https://netlib.org/fdlibm/readme that this prefix was used to
differentiate between different standards:
Wrapper functions will twist the result of the ieee754
function to comply to the standard specified by the value
of _LIB_VERSION
if _LIB_VERSION = _IEEE_, return the ieee754 result;
if _LIB_VERSION = _SVID_, return SVID result;
if _LIB_VERSION = _XOPEN_, return XOPEN result;
if _LIB_VERSION = _POSIX_, return POSIX/ANSI result.
(These are macros, see fdlibm.h for their definition.)
AFAICT, FreeBSD has never supported these wrappers. In addition, as C99,
principally the long double, functions were added to libm, this
convention was not maintained. Given that only 148 of 324 files under
lib/msun contain a "Copyright (C) 1993 by Sun Microsystems" statement,
the removal of the __ieee754_ prefix provides consistency across all
source files.
The last time someone compared lib/msun to fdlibm appears to be
Reduce diffs against vendor source (Sun fdlibm 5.3).
The most recent fdlibm RCS string that appears in a Sun Microsystem
copyrighted file is date "95/01/18". With Oracle Corporation's
acquisition of Sun Microsystems in 2009, it is unlikely that fdlibm will
ever be updated. A search for fdlibm at https://opensource.oracle.com/
yields no hits.
Finally, OpenBSD removed the use of this prefix over 21 years ago. pSee
revision 1.6 of OpenBSD's math_private.h.
Note: this does not drop the __ieee754_ prefix from the trigonometric
argument reduction functions, e.g., __ieee754_rem_pio2. These functions
are internal to the libm and exported through Symbol.map; and thus,
reserved for the implementation.
spibus(4): Add support for ACPI-based children enumeration
When spibus is attached as child of Intel SPI controller it scans all
ACPI nodes for "SPI Serial Bus Connection Resource Descriptor" described
in section 19.6.126 of ACPI specs.
If such a descriptor is found, SPI child is added to spibus, it's SPI
chip select, mode, clock, IRQ resource and ACPI handle are added to ivars.
Existing ACPI bus-hosted child is deleted afterwards.
Apple ACPI SPI extensions are supported.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D41248
intelspi: Add support for ddb/kdb -compatible polled mode
Required for Apple and Microsoft -compatible HID-over-SPI drivers.
Most logic was already implemented in commit 3c0867343819
"spibus: extend API: add cs_delay ivar, KEEP_CS and NO_SLEEP flags".
It dissallowed driver sleeps in the interrupt context. This commit
extends this feature to handle ddb/kdb context with following:
- Skip driver locking if SPI functions were called from kdb/ddb.
- Reinitialize controller if kdb/ddb initiated SPI transfer has
interrupted another already running one. Does not work very
reliable yet.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D41247
Some devices like Apple HID-over-SPI may contain more than one report
descriptors necessitating creation of multiple hidbus children.
Add indentificator of child devices to distinct them.
No functional changes intended.
Doug Moore [Thu, 3 Aug 2023 14:19:48 +0000 (09:19 -0500)]
vm_phys_enqueue_contig: handle npages==0
By letting vm_phys_enqueue_contig handle the case when npages == 0,
the callers can stop checking it, and the compiler can stop
zero-checking with every call to ffs(). Letting vm_phys_enqueue_contig
call vm_phys_enqueue_contig for part of its work also saves a few
bytes.
Mitchell Horne [Thu, 3 Aug 2023 13:48:15 +0000 (10:48 -0300)]
intro(9): rewrite from scratch
This page has existed as a placeholder since its creation in 1995. It
does not provide a useful introduction to the content in this section.
Reimagine it as a top-level overview page containing brief descriptions
and links to existing pages in section 9. It is roughly organized into
sub-sections, grouped by topic or subsystem. In other words, the page is
meant to function as a map to other content.
There is a balance to be found here between providing as many links as
possible and keeping the page concise and searchable. In general the aim
is to reference pages which provide the best entry point to a particular
topic. For example, a link is given to locking(9), but not to the
specific lock pages such as mutex(9) or rwlock(9).
NetBSD has done something similar with their intro(9), so some
inspiration has been taken from there, although their content doesn't
align that closely with what we have.
I have done a thorough review of our existing pages and formed these
subsections around them, but they are meant to evolve.
PR: 270481
Reviewed by: imp, emaste
MFC after: 3 weeks
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41104
Kevin Bowling [Thu, 3 Aug 2023 05:47:15 +0000 (22:47 -0700)]
e1000: Automask TSO on lem(4)/em(4) 10/100 Ethernet
This feature masks TSO capability when a link comes up at 10 or 100mbit
due to errata on the chips. This behavior matches previous versions of
FreeBSD as well as NetBSD and Linux.
A tunable, hw.em.unsupported_tso may be set if the admin desires to
disabling automasking and configure TSO settings manually.
Steve Kargl [Mon, 31 Jul 2023 22:34:48 +0000 (01:34 +0300)]
Fixes for bugs in sinpi/cospi/tanpi
patch to fix half-cycle trigonometric functions
Paul Zimmermann, a MPFR developer, contacted me about large errors in
the half-cycle trigonometric functions. I've have investigated these
issues and developed the attached patch. The float, double, and ld80
(long double) changes have been tested.
Caveat emptor: The ld128 changes have not been compiled. The ld128
changes have not been tested. I do not have access to a system that
uses ld128 floating point.
Here is an itemized list of changes:
* lib/msun/src/math_private.h:
. Add fast floor macros to compute the integer part of |x| for
0 <= |x| 01xp(N-1), where N is the precision of the type of x.
These macros are used in the half-cycle trigonometric functions
(e.g., sinpi(x)).
. The FFLOOR80 macros is used with the Intel 80-bit extended double
functions. This macors corrects an off-by-one error, which led to
enormous error for |x| > 0x1p32.
* lib/msun/src/s_cospif.c:
* lib/msun/src/s_cospi.c:
* lib/msun/ld80/s_cospil.c:
. Update Copyright years.
. Use FFLOOR*() macro to get integer part of |x|.
. Correct handle the range 0x1p(N-1) <= |x| < 0x1pN. Here, one needs
to determine if the integral value of |x| is even or odd to choose
+1 or -1. If |x| >= 0x1pN, always return +1.
* lib/msun/src/s_sinpif.c:
* lib/msun/src/s_sinpi.c:
* lib/msun/ld80/s_sinpil.c:
. Update Copyright years.
. Use FFLOOR*() macro to get integer part of |x|.
* lib/msun/src/s_tanpif.c:
* lib/msun/src/s_tanpi.c:
* lib/msun/ld80/s_tanpil.c:
. Update Copyright years.
. For +-0.5, return +-inf. Previously, tanpi[fl]() returned an NaN.
. Use FFLOOR*() to get integer part of |x|. Need to determine if the
integer part is even or odd. This is used to set +-0 for |x|
integral
and +-inf for (n+1/2).
. For 0x1p(N-1) <= |x| < 0x1pN need to determine if x is an even or
odd
integer to select +0 or -0. For |x| >= 0x1pN, it is always an even
integer, select 0.
. Note, tanpi[fl](x) is an odd function, so one needs to consider
tanpi[fl](-|x|) = - tanpi[fl](|x|).
* lib/msun/ld128/s_cospil.c:
* lib/msun/ld128/s_sinpil.c:
* lib/msun/ld128/s_tanpil.c:
. Update Copyright years.
. These routines use an FFLOOR128 macros, which likely should be
replaced by a bit twiddling algorithm.
. The same considerations above are applied to 0x1p112 <= |x| <
0x1p113,
and |x| >= 0x1p113 cases.
. Note, even and odd determination used fmodl(x,2.), which is likely
slow.
Steve Kargl [Mon, 31 Jul 2023 22:32:54 +0000 (01:32 +0300)]
Cleanup debugging code in libm
David Das (das@) committed Bruce Evan's (bde's) WIP code for
expl() and logl() in git revision 25a4d6bfda29119. That code
included instrumentation that allowed bde to generate pari
scripts used in testing/debugging. This patch removes that
instrumentation as it is unlikely that others will ever use it.
* math/libm/msun/src/math_private.h:
. Remove bde's macros for the generation of pari scripts.
* math/libm/msun/ld128/s_expl.c:
* math/libm/msun/ld128/s_logl.c:
* math/libm/msun/ld80/s_expl.c:
* math/libm/msun/ld80/s_logl.c:
. Remove bde's DOPRINT_START macro.
. Change RETURNP to RETURNF.
. Change RETURN2P to RETURNF. Adjust arguments as needed.
. Change RETURNPI to RETURNI.
. Change RETURN2PI to RETURNI. Adjust arguments as needed.
Ed Maste [Wed, 2 Aug 2023 14:18:33 +0000 (10:18 -0400)]
ssh: comment deprecated option handling for retired local patches
Older versions of FreeBSD included the HPN patch set and provided
client-side VersionAddendum. Both of these changes have been retired
but we've retained the option parsing for backwards compatibility to
avoid breaking upgrades. Add comment references to the relevant
commits.
Mark Johnston [Wed, 2 Aug 2023 13:24:06 +0000 (09:24 -0400)]
ObsoleteFiles.inc: Add an entry for libdtrace.so.2 in /usr/lib
There was a window between commits 4ae699122810 ("dtrace: Add
WITH_DTRACE_ASAN") and 848ff9bc1b97 ("libdtrace: Explicitly set SHLIBDIR
and SHLIB_MAJOR") where libdtrace.so.2 was being installed to /usr/lib
instead of /lib.
Mark Johnston [Tue, 1 Aug 2023 21:58:42 +0000 (17:58 -0400)]
kdb: Permit a NULL thread credential in kdb_backend_permitted()
Early during boot, thread0 runs with td->td_ucred == NULL. This is
fixed up in proc0_init() at SI_SUB_INTRINSIC. If a panic occurs before
then, rather than dereference a NULL pointer, simply allow the thread to
enter KDB.
Doug Moore [Wed, 2 Aug 2023 03:12:00 +0000 (22:12 -0500)]
vm_phys_enqueue_contig: handle npages==0
By letting vm_phys_enqueue_contig handle the case when npages == 0,
the callers can stop checking it, and the compiler can stop
zero-checking with every call to ffs(). Letting vm_phys_enqueue_contig
call vm_phys_enqueue_contig for part of its work also saves a few
bytes.
Ed Maste [Tue, 1 Aug 2023 12:48:02 +0000 (08:48 -0400)]
retire SHARED_TOOLCHAIN knob
Toolchain components were historically statically linked. They became
normal dynamically linked executables in commit 6ab18ea64d19. There is
no need to keep a special case build option for the toolchain; users who
want statically linked toolchain (or any other) components can use the
existing NO_SHARED knob.
Reviewed by: dim, sjg
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41266
John Baldwin [Tue, 1 Aug 2023 22:20:53 +0000 (15:20 -0700)]
mmc_xpt: Remove dubious end of mmc_print_ident
The end of this function finishes the passed in sbuf, calls printf
manually on the contents, and then clears it. The caller then tries
to print the resulting sbuf. This works currently but will not work
for future callers that pass in an external sbuf to be appended to.
John Baldwin [Tue, 1 Aug 2023 22:20:25 +0000 (15:20 -0700)]
cam xpt_*nounce_periph*: Various fixes for periphs without a protocol
If the periph doesn't have a valid protocol, these routines emit
fallback messages. However, the fallback messages duplicated the
periph name and unit number, and in the case of *denounce* included a
spurious newline.
John Baldwin [Tue, 1 Aug 2023 21:01:57 +0000 (14:01 -0700)]
Makefile.inc1: Enable requesting the universe toolchain.
make universe builds a cross toolchain under HOST_OBJTMP/tmp via the
universe-toolchain target. However, doing a plain 'make buildworld'
after a universe/tinderbox run (e.g. to reproduce a failure and test
the fix for it), will try to build a new cross toolchain under
OBJTMP/tmp which can be tedious. This commit adds a make variable
(UNIVERSE_TOOLCHAIN) which can be used similar to CROSS_TOOLCHAIN to
request an external toolchain. If this variable is set (value doesn't
matter), the the universe toolchain is used as an external toolchain.
Warner Losh [Tue, 1 Aug 2023 20:51:10 +0000 (14:51 -0600)]
cam: Log more error codes
Log CAM_DEV_NOT_THERE status CCBs because they get dropped if a drive
disappears and these requests timeout or are cancelled. It's useful to
know the outstanding commands for failure analysis. Log
CAM_NVME_STATUS_ERROR status CCBs to bring in NVMe errors (this will be
more important in future commits that expand the information logged).
Kirk McKusick [Tue, 1 Aug 2023 20:16:11 +0000 (13:16 -0700)]
Support background fsck_ffs(8) on filesystems using journaled soft updates
An earlier addition of code to fsck_ffs(8) allowed it to support
snapshots when running with journalled soft updates. Further
functionality has now been added to fsck_ffs(8) to allow it to use
snapshots to run in background on live filesystems running with
journaled soft updates. This commit enables the use of this functionality.
Tested-by: Peter Holm Sponsored-by: The FreeBSD Foundation
MFC-after: 2 weeks
John Baldwin [Thu, 29 Jun 2023 18:27:12 +0000 (11:27 -0700)]
bhyve: Fully reset the fwctl state machine if the guest requests a reset.
If a guest tries to reset the fwctl device while a pending request was
in flight, the fwctl state machine can be left in an incomplete state.
Specifically, rinfo is not cleared.
Normally the state machine for fwctl alternates between REQ (receiving
request) and RESP (sending response) and ignores port writes while in
RESP or port reads while in REQ. Once a guest completes the writes to
the port to send a request, the state machine transitions to RESP and
ignores future writes.
However, if a guest writes a full request and then resets the fwctl
device, the state would transition to REQ without draining the pending
response or discarding the received request. Instead, additional
port writes after the reset were treated as new payload bytes, but
were appended to the previously-received request and could overflow
the fget_str buffer.
To fix, fully reset the fwctl state machine if the guest requests a
reset.
admbugs: 998
Approved by: so
Reviewed by: markj
Reported by: Omri Ben Bassat <t-benbassato@microsoft.com>
Security: FreeBSD-SA-23:07.bhyve
Security: CVE-2023-3494
Eric van Gyzen [Tue, 25 Jul 2023 16:58:11 +0000 (11:58 -0500)]
proc_detach: use ptrace(PT_KILL) to kill the tracee
When MFC'ing commit dad11f990e2 to stable/12, the child would dump core
when dtrace exited. It was getting SIGTRAP, even though proc_detach
sent a SIGKILL. I could not find the reason for this difference in
behavior from main (and stable/13). The present change, however, works
as expected, probably due the proc_wkilled special case in kern_ptrace.
It also seems like a more obvious approach.
While I'm here, fix two other issues in the previous code:
It would SIGKILL a tracee even in read-only mode.
It would SIGSTOP/SIGCONT the tracee if ptrace succeeded but errno happened
to be EBUSY for some other reason.
Eric van Gyzen [Tue, 25 Jul 2023 16:59:07 +0000 (11:59 -0500)]
dtrace: do not overload libproc flags
dtrace stored its PR_RLC and PR_KLC flags in the proc_handle's flags,
where they collided with PATTACH_FORCE and PATTACH_RDONLY, respectively.
Thus, Psetflags(PR_KLC) effectively also set the PATTACH_RDONLY flag.
Since the flags are private to dtrace (at least on FreeBSD), store them in
dtrace's own dt_proc structure instead.
On FreeBSD, either PR_RLC or PR_KLC was always set, so remove code that
would handle the case where neither was set.
Eric van Gyzen [Tue, 25 Jul 2023 16:58:47 +0000 (11:58 -0500)]
dtrace: remove illumos code from dt_proc.c
The illumos #ifdef's in this file make it harder to read and maintain.
There is little change upstream, and increasing changes for FreeBSD.
Remove the illumos code with `unifdef`. The only manual changes here
are the #includes and #defines at the top, and removing blank lines.
Mark Johnston [Tue, 1 Aug 2023 14:11:23 +0000 (10:11 -0400)]
libdtrace: Explicitly set SHLIBDIR and SHLIB_MAJOR
They were previously being defined by cddl/lib/Makefile.inc, and as of
commit 4ae699122810 were being overridden by defaults in bsd.own.mk,
which changed SHLIBDIR to /usr/lib.
John Baldwin [Mon, 31 Jul 2023 20:24:44 +0000 (13:24 -0700)]
memdesc: Add routines for copying data to/from memory descriptors
These are modeled on the API used for m_copydata/m_copyback and the
crypto buffer APIs. One day it might be nice to reduce the
proliferation of these by adding cursors and using memdesc directly
for crypto request buffers.
tools/build: Work around broken Clang FreeBSD resource dir logic pre-13
Prior to Clang 13 (e.g. in the Clang 11 present in 13.0-RELEASE), the
resource directory logic for FreeBSD was broken and would not resolve
symlinks, meaning symlinks would only work if in a directory next to the
containing lib directory. Therefore we cannot even use a symlink for
worldtmp, we have to make a wrapper script that execs the real binary
via an absolute path.
Kevin Bowling [Mon, 31 Jul 2023 15:16:24 +0000 (08:16 -0700)]
e1000: Fix lem(4)/em(4) TSO6
* Fix TSO6 by specializing IP checksum insertion and following Intel SDM
values for IPv6.
* Remove unnecessary 82544 IP-bit handling
* Remove TSO6 from lem(4) capabilitities