CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

Update doc links in README

null_vput_pair(): release use reference on dvp earlier

We might own the last use reference, and then vrele() at the end would
need to take the dvp vnode lock to inactivate, which causes deadlock
with vp. We cannot vrele() dvp from start since this might unlock ldvp.

Handle it by holding the vnode and dropping use ref after lowerfs
VOP_VPUT_PAIR() ended. This effectively requires unlock of the vp vnode
after VOP_VPUT_PAIR(), so the call is changed to set unlock_vp to true
unconditionally. This opens more opportunities for vp to be reclaimed;
if lvp is still alive we reinstantiate vp with null_nodeget().

Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

vlrureclaim: only skip vnode with resident pages if it owns the pages

A nullfs vnode that shares vm_object and pages with the lower vnode should
not be exempt from reclaim just because the lower vnode cached a lot.
Their reclamation is actually very cheap and should be preferred over
real fs vnodes, but this change is already useful.

Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

softdep_unmount: assert that no dangling dependencies are left

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

FFS: assign fully initialized struct mount_softdeps to um_softdep

Other threads observing the non-NULL um_softdep can assume that it is
safe to use it. This is important for ro->rw remounts where change from
read-only to read-write status cannot be made atomic.
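
The publication rule described above can be sketched in userland C11. This is only an illustration under stated assumptions: `struct softdeps`, its fields, the `um_softdep` variable and both helper functions are hypothetical stand-ins, and the kernel uses its own primitives rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mount_softdeps; fields are illustrative. */
struct softdeps {
    int nworklists;
    int max_deps;
};

/* Hypothetical stand-in for the um_softdep pointer in the mount structure. */
static _Atomic(struct softdeps *) um_softdep;

/* Writer: finish every field before making the pointer visible. */
static int
softdeps_publish(int max_deps)
{
    struct softdeps *sd;

    sd = calloc(1, sizeof(*sd));
    if (sd == NULL)
        return (-1);
    sd->nworklists = 0;
    sd->max_deps = max_deps;
    /* Release store: the initialization above is ordered before it. */
    atomic_store_explicit(&um_softdep, sd, memory_order_release);
    return (0);
}

/* Reader: a non-NULL pointer implies the structure is ready to use. */
static int
softdeps_max(void)
{
    struct softdeps *sd;

    sd = atomic_load_explicit(&um_softdep, memory_order_acquire);
    return (sd != NULL ? sd->max_deps : -1);
}
```

A reader that calls `softdeps_max()` before `softdeps_publish()` sees NULL and backs off; afterwards it sees a pointer whose fields are all set.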

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

Assert that um_softdep is NULL on free(ump), i.e. softdep_unmount() was called

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

ffs_mount: when remounting ro->rw and sbupdate failed, cleanup softdeps

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

softdep_unmount: handle spurious wakeups

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

softdep_flush(): do not access ump after we acked FLUSH_EXIT and unlocked SU lock

otherwise we might follow a pointer in the freed memory.

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

ffs: clear MNT_SOFTDEP earlier when remounting rw to ro

Suppose that we remount rw->ro and in parallel some reader tries to
instantiate a vnode, e.g. during lookup. Suppose that softdep_unmount()
has already started, but we have not yet cleared the MNT_SOFTDEP flag.
Then ffs_vgetf() calls into softdep_load_inodeblock(), which accesses
destroyed hashes and freed memory.

Set/clear fs_ronly simultaneously (WRT files flush) with MNT_SOFTDEP.
It might be reasonable to move the change of fs_ronly to under MNT_ILOCK,
but no readers take it.

Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

Rework MOUNTED/DOING SOFTDEP/SUJ macros

Now MNT_SOFTDEP indicates that SU are active in any variant +-J, and
SU+J is indicated by MNT_SOFTDEP | MNT_SUJ combination. The reason is
that unmount will be able to easily hide SU from other operations by
clearing MNT_SOFTDEP while keeping the record of the active journal.
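
A minimal sketch of the flag scheme described above. The `X_` bit values and the three helpers are hypothetical and chosen only for illustration; they are not the definitions from the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; the real flags are defined in the tree. */
#define X_MNT_SOFTDEP 0x1u  /* soft updates active, with or without journal */
#define X_MNT_SUJ     0x2u  /* a journal is recorded for this mount */

/* SU is in use in any variant (+J or not). */
static bool
su_active(uint64_t flags)
{
    return ((flags & X_MNT_SOFTDEP) != 0);
}

/* SU+J requires both bits. */
static bool
suj_active(uint64_t flags)
{
    const uint64_t both = X_MNT_SOFTDEP | X_MNT_SUJ;

    return ((flags & both) == both);
}

/* Unmount path: hide SU from other operations, keep the journal record. */
static uint64_t
su_hide(uint64_t flags)
{
    return (flags & ~(uint64_t)X_MNT_SOFTDEP);
}
```

After `su_hide()`, both predicates report false, yet the `X_MNT_SUJ` bit is still set, so the fact that a journal was active is not lost.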

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

ffs softdep: clear ump->um_softdep on softdep_unmount()

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

ffs_extern.h: Add comments for ffs_vgetf() flags

Requested and reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

Add FFSV_FORCEINODEDEP flag for ffs_vgetf()

It will be used to allow SU flush code to sync the volume while external
consumers see that SU is already disabled on the filesystem. Use it where
ffs_vgetf() is called by SU code to process dependencies.

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

simplify journal_mount: move the out label after success block

This removes the need to check for error == 0.

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178

Do not complain about incorrect cylinder group check-hashes when
asked to add them to a filesystem.

MFC after: 3 days
Sponsored by: Netflix

Hyper-V: hn: Enable vSwitch RSC support in hn netvsc driver

Receive Segment Coalescing (RSC) in the vSwitch is a feature available in
Windows Server 2019 hosts and later. It reduces the per-packet processing
overhead by coalescing multiple TCP segments when possible. This happens
mostly when TCP traffic flows between different guests on the same host.
This patch adds netvsc driver support for this feature.

The patch also updates NVS version to 6.1 as needed for RSC
enablement.

MFC after: 2 weeks
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D29075

readme: Link to COPYRIGHT file

Fix arch rendering

readme: update style

Update the style to one sentence per line, as is currently used in the FreeBSD
document project. Make the links to the handbook clickable.

Remove README in favor of README.md

Complete the transition to README.md I started 3 years ago. Remove the
now-redundant README file. It's currently just README.md w/o the light markup
and adds no real value. This also allows us to use additional MarkDown
markup as we see fit w/o worrying about keeping things in sync.

SPDX: Spell 4 clause BSD license correctly

gmirror: Pre-allocate the timeout event structure

We can't call malloc(M_WAITOK) in a callout handler.

Reviewed by: imp
Reported by: pho
Tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29223

development(7): update to reflect Git transition

Reviewed By: debdrup, imp (earlier version)
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D28939

man: Remove obsolete info from hosts man page

The NIC no longer provides a host database, and hasn't for quite some
time. Remove that paragraph, it's not been relevant for many years. Also, hosts
appeared in 4.1c, not 4.2, so correct that too.

Noticed by: Henry Bent

nvme: use config_intrhook_drain to avoid removable card races

nvme drives are configured early in boot. However, a number of the configuration
steps take a while, so we defer those to a config intrhook that runs
before the root filesystem is mounted. At the same time, the PCI hot plug wakes
up and tests the status of the card. It may decide that the card has gone away
and deletes the child. As part of that process nvme_detach is called. If this
call happens after the config_intrhook starts to run, but before it is finished,
there's a race where we can tear down the device's soft state while the
config_intrhook is still using it. Use the new config_intrhook_drain to
disestablish the hook. Either it will be removed w/o running, or the routine
will wait for it to finish. This closes the race and allows safe hotplug at any
time, even very early in boot.

Sponsored by: Netflix, Inc
Reviewed by: jhb, mav
Differential Revision: https://reviews.freebsd.org/D29006

config_intrhook: provide config_intrhook_drain

config_intrhook_drain will remove the hook from the list as
config_intrhook_disestablish does if the hook hasn't been called. If it has,
config_intrhook_drain will wait for the hook to be disestablished in the normal
course (or expedited, it's up to the driver to decide how and when
to call config_intrhook_disestablish).

This is intended for removable devices that use config_intrhook and might be
attached early in boot, but that may be removed before the kernel can call the
config_intrhook or before it ends. To prevent all races, the detach routine will
need to call config_intrhook_drain.

Sponsored by: Netflix, Inc
Reviewed by: jhb, mav, gde (in D29006 for man page)
Differential Revision: https://reviews.freebsd.org/D29005

linsysfs: create /sys/bus/ and /sys/subsystem/

This looks like a no-op, but it prevents udevadm(8) from failing
loudly, which in turn unbreaks installation of libfprint-2-2, which
in Focal is a dependency for make-4.2.1-1.2.

One might wonder why installing a build utility involves messing
with device handling...

Sponsored By: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29133

vm_reserv: Fix list locking in vm_reserv_reclaim_contig()

The per-domain partpop queue is locked by the combination of the
per-domain lock and individual reservation mutexes.
vm_reserv_reclaim_contig() scans the queue looking for partially
populated reservations that can be reclaimed in order to satisfy the
caller's allocation.

During the scan, we drop the per-domain lock. At this point, the rvn
pointer may be invalidated. Take care to load rvn after re-acquiring
the per-domain lock.
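
The "load the successor only after re-locking" rule can be sketched with a toy singly linked queue. Everything here is hypothetical: the `struct resv` type, the flag-based lock (which only checks lock discipline and is not a real mutex), and `drop_successor()`, which simulates another thread dequeuing an entry while the lock is dropped. The sketch also assumes the entry being worked on stays queued.

```c
#include <assert.h>
#include <stddef.h>

struct resv {
    struct resv *next;
    int id;
};

static struct resv *queue_head; /* toy partially populated queue */
static int queue_locked;        /* toy lock: checks discipline only */
static int last_visited;        /* id of the most recently visited entry */

static void queue_lock(void)   { assert(!queue_locked); queue_locked = 1; }
static void queue_unlock(void) { assert(queue_locked); queue_locked = 0; }

/* Unlink r from the queue; the caller must hold the lock. */
static void
queue_remove(struct resv *r)
{
    struct resv **pp;

    assert(queue_locked);
    for (pp = &queue_head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == r) {
            *pp = r->next;
            r->next = NULL;
            return;
        }
    }
}

/* Example work(): pretends another thread dequeued r's successor. */
static void
drop_successor(struct resv *r)
{
    last_visited = r->id;
    queue_lock();
    if (r->next != NULL)
        queue_remove(r->next);
    queue_unlock();
}

/* Visit every queued entry, calling work() with the lock dropped. */
static int
queue_scan(void (*work)(struct resv *))
{
    struct resv *r, *rn;
    int visited = 0;

    queue_lock();
    for (r = queue_head; r != NULL; r = rn) {
        queue_unlock();
        work(r);        /* the queue may change behind our back */
        queue_lock();
        rn = r->next;   /* successor loaded only after re-locking */
        visited++;
    }
    queue_unlock();
    return (visited);
}
```

Had `rn` been read before the lock was dropped, the scan would continue through an entry that is no longer on the queue.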

While here, simplify the condition used to check whether a reservation
was dequeued while the per-domain lock was dropped.

Reviewed by: alc, kib
Reported by: gallatin
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29203

usb: tiny formatting nit

Format 300 baud like all the others here. No functional change.

pf: Remove redundant kif != NULL checks

pf_kkif_free() already checks for NULL, so we don't have to check before
we call it.

Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29195

pf: Factor out pf_krule_free()

Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29194

usr.sbin/pwm/pwm add support for flags

The pwm utility can't set the only flag defined (PWM_POLARITY_INVERTED), so this
patch adds the option -I (capital letter i) to send it to the drivers.

None of the existing PWM drivers have implemented support for flags,
but soon-ish I will put up a review of a pwm driver using TI OMAP DMTimer.

Differential Revision: https://reviews.freebsd.org/D29137
MFC after: 2 weeks

share/man/man9/pwmbus.9 fix types in arguments

Fix the types of period and duty in share/man/man9/pwmbus.9 to match the ones in sys/dev/pwm/pwmbus.c.

Reviewed By: rpokala
Differential Revision: https://reviews.freebsd.org/D29139
MFC after: 3 days

kern.mk: fix -Wno-error style to fix build with Clang 12

Clang 12 no longer supports -Wno-error-..., only the -Wno-error=...
style (which is already used everywhere else in the tree).

Differential Revision: https://reviews.freebsd.org/D29157

Flush remaining routes from the routing table during VNET shutdown.

Summary:
This fixes rtentry leak for the cloned interfaces created inside the
VNET.

PR: 253998
Reported by: rashey at superbox.pl
MFC after: 3 days

Loopback teardown order is `SI_SUB_INIT_IF`, which happens after `SI_SUB_PROTO_DOMAIN` (route table teardown).
Thus, any route table operations are too late to schedule.
As the intent of the vnet teardown procedures is to minimise the amount of effort by doing global cleanups instead of per-interface ones, address this by adding a relatively light-weight routing table cleanup function, `rib_flush_routes()`.
It removes all remaining routes from the routing table and schedules the deletion, which will happen later, when `rtables_destroy()` waits for the current epoch to finish.

Test Plan:
```
set_skip:set_skip_group_lo -> passed [0.053s]
tail -n 200 /var/log/messages | grep rtentry
```

Reviewers: #network, kp, bz

Reviewed By: kp

Subscribers: imp, ae

Differential Revision: https://reviews.freebsd.org/D29116

ktls: Fix non-inplace TLS 1.3 encryption.

Copy the iovec for the trailer from the proper place. This is the same
fix as the one for CBC encryption in ff6a7e4ba6bf.

Reported by: gallatin
Reviewed by: gallatin, markj
Fixes: 49f6925ca
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29177

Move time math out of disabled interrupts sections.

We don't need the result before next sleep time, so no reason to
additionally increase interrupt latency.

While there, remove the extra PM ticks to microseconds conversion, which made
C2/C3 sleep times look 4 times smaller than they really were. The conversion
is already done by AcpiGetTimerDuration(). Now I see reported sleep
times up to 0.5s, just as expected for planned 2 wakeups per second.

MFC after: 1 month

arm64: Fix COMPAT_FREEBSD32.

The ENTRY() macro was modified by commit
28d945204ea1014d7de6906af8470ed8b3311335 to add an optional NOP instruction
at the beginning of the function. It is of course an arm64 instruction, so
unsuitable for the 32-bit sigcode. So just use EENTRY() instead for
aarch32_sigcode. This should fix receiving signals when running 32-bit
binaries on FreeBSD/arm64.

MFC After: 1 week

Fix post-start check when unbound.conf has moved.

Reported by: phk@
MFC after: 1 week

Fix local-unbound setup for some IPv6 deployments.

PR: 250984
MFC after: 1 week

ns8250: don't drop IER_TXRDY on bus_grab/ungrab

It has been observed that some systems are often unable to resume from
ddb after entering with debug.kdb.enter=1. Checking the status further
shows the terminal is blocked waiting in tty_drain(), but it never makes
progress in clearing the output queue, because sc->sc_txbusy is high.

I noticed that when entering polling mode for the debugger, IER_TXRDY is
set in the failure case. Since this bit is never tracked by the softc,
it will not be restored by ns8250_bus_ungrab(). This creates a race in
which a TX interrupt can be lost, creating the hang described above.
Ensuring that this bit is restored is enough to prevent this, and resume
from ddb as expected.

The solution is to track this bit in the sc->ier field, for the same
lifetime that TX interrupts are enabled.
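
A toy model of why the bit has to live in the shadow copy. The structure, the `X_` bit values and the four helpers are hypothetical and do not mirror the driver's real code; `hw_ier` merely stands in for the device register.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values; the real ones come from the driver headers. */
#define X_IER_RXRDY 0x01u
#define X_IER_TXRDY 0x02u

/* Toy softc: hw_ier models the device register, ier is the shadow copy. */
struct toy_uart {
    uint8_t hw_ier;
    uint8_t ier;
};

/* Start transmitting: record the TX interrupt in the shadow as well. */
static void
toy_tx_start(struct toy_uart *sc)
{
    sc->ier |= X_IER_TXRDY;
    sc->hw_ier = sc->ier;
}

/* Transmit finished: the bit leaves the shadow together with the hardware. */
static void
toy_tx_done(struct toy_uart *sc)
{
    sc->ier &= (uint8_t)~X_IER_TXRDY;
    sc->hw_ier = sc->ier;
}

/* Enter polling mode (e.g. for a debugger): mask interrupts in hardware only. */
static void
toy_grab(struct toy_uart *sc)
{
    sc->hw_ier = (uint8_t)(sc->ier & ~(X_IER_RXRDY | X_IER_TXRDY));
}

/* Leave polling mode: whatever the shadow holds is what gets restored. */
static void
toy_ungrab(struct toy_uart *sc)
{
    sc->hw_ier = sc->ier;
}
```

If `toy_tx_start()` wrote the bit to `hw_ier` alone, `toy_ungrab()` would restore a value without it, which is the lost-interrupt scenario the commit message describes.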

PR: 223917, 240122
Reviewed by: imp, manu
Tested by: bz
MFC after: 5 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29130

AArch64: Clear VFP state on execve()

I noticed that many of the math-related tests were failing on AArch64.
After a lot of debugging, I noticed that the floating point exception flags
were not being reset when starting a new process. This change resets the
VFP inside exec_setregs() to ensure no VFP register state is leaked from
parent processes to children.

This commit also moves the clearing of fpcr that was added in 65618fdda0f27
from fork() to execve() since that makes more sense: fork() can retain
current register values, but execve() should result in a well-defined
clean state.

Reviewed By: andrew
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29060

Allocating the LinuxKPI current structure from a software interrupt thread
must be done using the M_NOWAIT flag after 1ae20f7c70ea.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

Use the word "LinuxKPI" instead of "Linux compatibility", to not confuse with
user-space Linux compatibility support. No functional change.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

Allocating the LinuxKPI current structure from an interrupt thread must be
done using the M_NOWAIT flag after 1ae20f7c70ea.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

wg(4): note the persistent-keepalive ifconfig(8) option

MFC after: 3 days
Fixes: b3dac3913dc9

Implement basic support for allocating memory from a specific numa node
in the LinuxKPI.

Differential Revision: https://reviews.freebsd.org/D29077
Reviewed by: markj@ and kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

if_wg: export tx_bytes, rx_bytes, and last_handshake

The names are self-explanatory; these are currently only used by the
wg(8) tool, but they are handy data points to have.

Reviewed by: grehan
MFC after: 3 days
Discussed with: decke
Differential Revision: https://reviews.freebsd.org/D29143

iflib: allow clone detach if not yet init

If we hit an error during init, then we'll unwind our state and attempt
to detach the device -- don't block it.

This was discovered by creating a wg0 with missing parameters; said
failure ended up leaving this orphaned device in place and ended up
panicking the system upon enumeration of the dev.* sysctl space.

Reviewed by: gallatin, markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29145

if_wg: wg_input: remove a couple locals (NFC)

We have no use for the udphdr or this hlen local, just spell out the
addition inline.

MFC after: 3 days
Reviewed by: grehan, markj
Differential Revision: https://reviews.freebsd.org/D29142

amd64 pmap: convert to counter(9), add PV and pagetable page counts

This change converts most of the counters in the amd64 pmap from
global atomics to scalable counter(9) counters.  Per discussion
with kib@, it also removes the handrolled per-CPU PCID save count
as it isn't considered generally useful.

The bulk of these counters remain guarded by PV_STATS, as it seems
unlikely that they will be useful outside of very specific debugging
scenarios.  However, this change does add two new counters that
are available without PV_STATS.  pv_page_count and pt_page_count
track the number of active physical-to-virtual list pages and page
table pages, respectively.  These will be useful in evaluating
the memory footprint of pmap structures under various workloads,
which will help to guide future changes in this area.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D28923

ofwfb: fix boot on LE

Some framebuffer properties obtained from the device tree were not being
properly converted to host endian.
Replace OF_getprop calls by OF_getencprop where needed to fix this.

This fixes boot on PowerPC64 LE, when using ofwfb as the system console.

Reviewed by: bdragon
Sponsored by: Eldorado Research Institute (eldorado.org.br)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27475

Revert "rc: implement parallel boot"

This is not ready yet for prime time

This reverts commit 763db58932874bb47fc6f9322ab81cc947f80991.
This reverts commit f1ab799927c8e93e8f58e5039f287a2ca45675ec.
This reverts commit 6e822e99570fdf4c564be04840a054bccc070222.
This reverts commit 77e1ccbee3ed6c837929e4e232fd07f95bfc8294.

ifconfig: allow displaying/setting persistent-keepalive

The kernel-side already accepted a persistent-keepalive-interval, so
just add a verb to ifconfig(8) for it and start exporting it so that
ifconfig(8) can view it.

PR: 253790
MFC after: 3 days
Discussed with: decke

ifconfig: wg: stop requiring peer endpoints

The way that wireguard is designed does not actually require all peers
to have endpoints. In an architecture that might mimic a traditional
VPN server <-> client, the wg interface on a server would have a number
of peers without set endpoints -- the expectation is that the "clients"
will connect to the "server" peer, which will authenticate the
connection as a known peer and learn the endpoint from there.

MFC after: 3 days
Discussed with: decke, grehan (independently)

kern: malloc: fix panic on M_WAITOK during THREAD_NO_SLEEPING()

Simple condition flip; we wanted to panic here after epoch_trace_list().

Reviewed by: glebius, markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29125

if_wg: avoid sleeping under the net epoch

No sleeping allowed here, so avoid it. Collect the subset of data we
want inside of the epoch, as we'll need extra allocations when we add
items to the nvlist.

Reviewed by: grehan (earlier version), markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29124

if_wg: return to m_defrag() of incoming mbuf, sans leak

This partially reverts df55485085 but still fixes the leak. It was
overlooked (sigh) that some packets will exceed MHLEN and cannot be
physically contiguous without clustering, but we don't actually need
it to be. m_defrag() should pull up enough for any of the headers that
we do need to be accessible.

Fixes: df55485085
Pointy hat; kevans

mountd(8): generate a syslog message when the "V4:" line is missing

Daniel reported that NFSv4 mounts were not working despite having
set "nfsv4_server_enable=YES" in /etc/rc.conf. Mountd was logging a
message that there was no /etc/exports file.
He noted that creating an /etc/exports file with a "V4:" line in it
was needed to make NFSv4 mounts work.
At least one "V4:" line in one of the exports(5) file(s) is needed to
make NFSv4 mounts work. This patch fixes mountd.c so that it logs a
message indicating that there is no "V4:" line in any exports(5)
file when NFSv4 mounts are enabled.
To avoid this message being generated erroneously, /etc/rc.d/mountd
is updated to make sure vfs.nfsd.server_max_nfsvers is properly set
before mountd(8) is started.

Reported by: debdrup
PR: 253901
MFC after: 2 weeks

Do not read timer extra time when MWAIT is used.

When we enter C2+ state via memory read, it may take chipset some
time to stop CPU. Extra register read covers that time. But MWAIT
makes CPU stop immediately, so we don't need to waste time after
wakeup with interrupts still disabled, increasing latency.

On my system it reduces ping localhost latency, waking up all CPUs
once a second, from 277us to 242us.

MFC after: 1 month

Change mwait_bm_avoidance use to match Linux.

Even though the information is very limited, it seems the intent of
this flag is to control ACPI_BITREG_BUS_MASTER_STATUS use for C3,
not to force ACPI_BITREG_ARB_DISABLE manipulations for C2, where it was
never needed, and where the register has not really done anything for years.
It wasted lots of CPU time on the congested global ACPI hardware lock
when many CPU cores were trying to enter/exit deep C-states at the same time.

On an idle 80-core system it pushed ping localhost latency up to 20ms,
since badport_bandlim() via counter_ratecheck() wakes up all CPUs
at the same time once a second just to synchronously reset the counters.
Now enabling C-states increases the latency from 0.1 to just 0.25ms.

Discussed with: kib
MFC after: 1 month

Move back the isa non-PNP driver deadline to FreeBSD 14.

config_intrhook: Move from TAILQ to STAILQ and padding

config_intrhook doesn't need to be a two-pointer TAILQ. We rarely add/delete
from this and so those need not be optimized. Instead, use the one-pointer
STAILQ plus a uintptr_t to be used as a flags word. This will allow these
changes to be MFC'd to 12 and 13 to fix a race in removable devices.
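
A small check of the size argument, using the `<sys/queue.h>` macros. The two structures and their field names are hypothetical before/after layouts, not the real declaration, and the assertions assume a platform where `uintptr_t` has the same size and alignment as a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical "before" layout: TAILQ linkage costs two pointers. */
struct hook_tailq {
    TAILQ_ENTRY(hook_tailq) links;
    void (*func)(void *);
    void *arg;
};

/* Hypothetical "after" layout: one pointer plus a pointer-sized flags word. */
struct hook_stailq {
    STAILQ_ENTRY(hook_stailq) links;
    uintptr_t flags;
    void (*func)(void *);
    void *arg;
};

/* Same size and same member offsets, so the change is layout compatible. */
_Static_assert(sizeof(struct hook_tailq) == sizeof(struct hook_stailq),
    "structure size changed");
_Static_assert(offsetof(struct hook_tailq, func) ==
    offsetof(struct hook_stailq, func), "func moved");
_Static_assert(offsetof(struct hook_tailq, arg) ==
    offsetof(struct hook_stailq, arg), "arg moved");
```

Because the flags word occupies the slot freed by the second list pointer, code compiled against the old layout still finds `func` and `arg` in the same place.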

Feedback from: jhb
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D29004

Fix 'in6_purgeaddr: err=65, destination address delete failed' message.

P2P ifa may require 2 routes: one is the loopback route, another is
the "prefix" route towards its destination.

Current code marks loopback routes existence with IFA_RTSELF and
"prefix" p2p routes with IFA_ROUTE.

For historic reasons, we fill in ifa_dstaddr for loopback interfaces.
To avoid installing the same route twice, we preemptively set
IFA_RTSELF when adding "prefix" route for loopback.
However, the teardown part doesn't have this hack, so we try to
remove the same route twice.

Fix this by checking if ifa_dstaddr is different from the ifa_addr
and moving this logic into a separate function.

Reviewed By: kp
Differential Revision: https://reviews.freebsd.org/D29121
MFC after: 3 days

if_vtbe: Add missing includes to fix build

PR: 254137
Reported by: Mina Galić <me@igalic.co>
MFC after: 3 days
Fixes: f8bc74e2f4a5 ("tap: add support for virtio-net offloads")

x86: tsc: deprioritize TSC on VirtualBox

Misbehavior has been observed with TSC under VirtualBox, where threads
doing small sleeps (~1 second) may miss their wake up and hang around
in a sleep state indefinitely. Switching back to ACPI-fast decidedly
fixes it, so stop using TSC on VirtualBox at least for the time being.

This partially reverts 84eaf2ccc6aa, applying it only to VirtualBox and
increasing the quality to 0. Negative qualities can never be chosen and
cannot be chosen with the tunable recently added. If we do not have a
timecounter with a higher quality than 0, then TSC does at least leave
the system mostly usable.

PR: 253087
Reviewed by: emaste, kib
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29132

Add ObsoleteFiles.inc entries for various OCF headers removed in 13.

MFC after: 3 days

Correct the name of the structure used for TCP socket options.

The structure was renamed while refactoring Netflix's KTLS changes for
upstreaming, but the original name remained in tcp.4 and was
subsequently copied to ktls.4.

PR: 254141
Reported by: asomers
MFC after: 3 days

wg: Fix a mismerge

df55485085 fixed a leak that I had initially fixed in a11009dccb.

Fixes: a11009dccb

iflib: Make if_shared_ctx_t a pointer to const

This structure is shared among multiple instances of a driver, so we
should ensure that it doesn't somehow get treated as if there's a
separate instance per interface. This is especially important for
software-only drivers like wg.

DEVICE_REGISTER() still returns a void * and so the per-driver sctx
structures are not yet defined with the const qualifier.

Reviewed by: gallatin, erj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29102

Rename _cscan_atomic.h and _cscan_bus.h to atomic_san.h and bus_san.h

Other kernel sanitizers (KMSAN, KASAN) require interceptors as well, so
put these in a more generic place as a step towards importing the other
sanitizers.

No functional change intended.

MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29103

ath_hal: Stop printing messages during boot

ath_hal is compiled into the kernel by default and so always prints a
message to dmesg even when the system has no ath hardware.

MFC after: 1 week
Sponsored by: The FreeBSD Foundation

posix timers: Improve the overrun calculation

timer_settime(2) may be used to configure a timeout in the past.  If
the timer is also periodic, we also try to compute the number of timer
overruns that occurred between the initial timeout and the time at which
the timer fired.  This is done in a loop which iterates once per period
between the initial timeout and now.  If the period is small and the
initial timeout was a long time ago, this loop can take forever to run,
so the system is effectively DOSed.

Replace the loop with a more direct calculation of
(now - initial timeout) / period to compute the number of overruns.
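
The difference between the two approaches can be sketched as follows. The function names and the use of plain nanosecond counters are assumptions made for illustration; the kernel works on its own time types and its exact boundary handling may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Loop form: one iteration per elapsed period, O((now - first) / period). */
static uint64_t
overruns_loop(uint64_t first, uint64_t period, uint64_t now)
{
    uint64_t n = 0;

    while (first + period <= now) {
        first += period;
        n++;
    }
    return (n);
}

/* Direct form: a single division, O(1) regardless of how old 'first' is. */
static uint64_t
overruns_direct(uint64_t first, uint64_t period, uint64_t now)
{
    if (period == 0 || now <= first)
        return (0);
    return ((now - first) / period);
}
```

With a period of 1 ns and an initial timeout an hour in the past, the loop needs 3.6e12 iterations while the direct form needs one division.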

Reported by: syzkaller
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29093

posix timers: Sprinkle some style fixes

MFC after: 1 week
Sponsored by: The FreeBSD Foundation

posix timers: Declare unexported functions as static

MFC after: 1 week
Sponsored by: The FreeBSD Foundation

wg: Style

MFC after: 1 week
Sponsored by: The FreeBSD Foundation

wg: Avoid leaking mbufs when the input handshake queue is full

Reviewed by: grehan
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29011

dumpon.8: Ask DDB to call doadump() rather than calling it directly

Sponsored by: The FreeBSD Foundation
MFC after: 1 week

lib/msun: Avoid FE_INEXACT for x86 log2l/log10l

This fixes tests/lib/msun/logarithm_test after compiling the test with
-fno-builtin (D28577). Adding invln10_lo + invln10_hi results in
FE_INEXACT (for all inputs) and the same for the log2l invln2_lo + invln2_hi.
This patch avoids FE_INEXACT (for exact results such as 0) by defining a
constant and using that.

Reviewed By: dim
Differential Revision: https://reviews.freebsd.org/D28786

tests/sys/cddl: correctly quote atf_set "require.progs"

The argument has to be a single whitespace-separated value. While touching
all these lines, also add ksh93, since `atf_set "require.progs"` overrides
the default value specified in the Kyuafile. This then results in tests
being executed despite ksh93 not being installed.

Reviewed By: asomers
Differential Revision: https://reviews.freebsd.org/D29066

kern.mk: Fix wrong variable being used for linker path after 172a624f0

When I synchronized kern.mk with bsd.sys.mk, I accidentally changed
CCLDFLAGS to LDFLAGS which is not used by the kernel builds. This commit
should unbreak the GitHub actions cross-build CI. I didn't notice it
locally because cheribuild already passes -fuse-ld in the linker flags as
it predates this being done in the makefiles.

Reported By: Jose Luis Duran
Fixes: 172a624f0 ("Silence annoying and incorrect non-default linker warning with GCC")

stress2: open(2) tests with BENEATH flags.

Update tests to reflect the changes of "open(2): Remove O_BENEATH and
AT_BENEATH" in 20e91ca36a56.

if_wg: avoid null ptr deref

While we're here, sync up with OpenBSD and don't use a keypair !kp_valid

MFC after: 3 days

wg_input: avoid leaking due to an m_defrag failure

m_defrag() will not free the chain on failure, leaking the mbuf.
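
The ownership rule can be illustrated with a toy buffer chain in userland. `struct buf`, `chain_defrag()` and the other helpers are hypothetical; they only mimic the contract stated above (on failure NULL is returned and the input chain is left untouched) and are not the mbuf API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_CAP 16

struct buf {
    struct buf *next;
    size_t len;
    char data[BUF_CAP];
};

static int live_bufs;   /* allocation counter used to detect leaks */

static struct buf *
buf_alloc(const char *s, struct buf *next)
{
    struct buf *b = malloc(sizeof(*b));

    assert(b != NULL && strlen(s) <= BUF_CAP);
    b->next = next;
    b->len = strlen(s);
    memcpy(b->data, s, b->len);
    live_bufs++;
    return (b);
}

static void
chain_free(struct buf *b)
{
    struct buf *n;

    for (; b != NULL; b = n) {
        n = b->next;
        free(b);
        live_bufs--;
    }
}

/*
 * Toy defragmenter: on success the old chain is consumed and a single
 * buffer is returned; on failure NULL is returned and the input is kept.
 */
static struct buf *
chain_defrag(struct buf *b)
{
    struct buf *n, *p;
    size_t total = 0;

    for (p = b; p != NULL; p = p->next)
        total += p->len;
    if (total > BUF_CAP)
        return (NULL);  /* input chain is still owned by the caller */
    n = buf_alloc("", NULL);
    for (p = b; p != NULL; p = p->next) {
        memcpy(n->data + n->len, p->data, p->len);
        n->len += p->len;
    }
    chain_free(b);
    return (n);
}

/* Caller side: free the original chain when defragmentation fails. */
static struct buf *
input_defrag(struct buf *b)
{
    struct buf *n = chain_defrag(b);

    if (n == NULL)
        chain_free(b);
    return (n);
}
```

Dropping the `chain_free(b)` call in `input_defrag()` leaves `live_bufs` non-zero after a failed defragmentation, which is the kind of leak the commit fixes.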

Obtained from: OpenBSD
MFC after: 3 days

if_wg: release correct lock in noise_remote_begin_session()

The keypair lock is not taken until later.

Obtained from: Jason A. Donenfeld via OpenBSD
MFC after: 3 days

Simplify using nvlist_append_string_array().

Reported by: hrs
MFC after: 1 week

Make kern.timecounter.hardware tunable

Noted and reviewed by: kevans
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D29122

bhyve/snapshot: use SOCK_DGRAM instead of SOCK_STREAM

The save/restore feature uses a unix domain socket to send messages
from bhyvectl(8) to a bhyve(8) process. A datagram socket will suffice
for this.

An added benefit of using a datagram socket is simplified code. For
bhyve, the listen/accept calls are dropped; and for bhyvectl, the
connect() call is dropped.
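
A minimal sketch of the datagram pattern, assuming a POSIX system. The helper names and the idea of passing a socket path are hypothetical and are not taken from bhyve or bhyvectl; the point is only that the receiver needs socket() and bind(), and the sender needs socket() and sendto().

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

static void
fill_addr(struct sockaddr_un *addr, const char *path)
{
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strncpy(addr->sun_path, path, sizeof(addr->sun_path) - 1);
}

/* Receiver: socket() + bind() only, no listen() or accept(). */
static int
dgram_listen(const char *path)
{
    struct sockaddr_un addr;
    int s;

    s = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (s < 0)
        return (-1);
    fill_addr(&addr, path);
    (void)unlink(path);     /* remove a stale socket file, if any */
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(s);
        return (-1);
    }
    return (s);
}

/* Sender: socket() + sendto() only, no connect(). */
static int
dgram_send(const char *path, const void *msg, size_t len)
{
    struct sockaddr_un addr;
    ssize_t n;
    int s;

    s = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (s < 0)
        return (-1);
    fill_addr(&addr, path);
    n = sendto(s, msg, len, 0, (struct sockaddr *)&addr, sizeof(addr));
    close(s);
    return (n == (ssize_t)len ? 0 : -1);
}
```

The receiver then reads each message with a single recv() call, since datagram sockets preserve message boundaries.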

EPRINTLN handles raw mode for bhyve(8), use it to print error messages.

Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D28983

decryptcore: do not include sys/sysctl.h

It's not needed. Removing it is a small improvement in portability.

Sponsored by: Dell EMC Isilon

dumpon: do not print errno for resolver failure

When the netdump host name fails to resolve, don't print errno, since
it's irrelevant. We might as well use a different exit status, too.

Sponsored by: Dell EMC Isilon

Fix dpdk/ldradix fib lookup algorithm preference calculation.

The current preference numbers were copied from the IPv4 code,
assuming 500k routes to be the full-view. Adjust to the current
reality (100k full-view).

Reported by: Marek Zarychta <zarychtam at plan-b.pwste.edu.pl>
MFC after: 3 days

armv8crypto: fix AES-XTS regression introduced by ed9b7f44

Initialization of the XTS key schedule was accidentally dropped
when adding AES-GCM support so all-zero schedule was used instead.
This rendered previously created GELI partitions unusable.
This change restores proper XTS key schedule initialization.

Reported by: Peter Jeremy <peter@rulingia.com>
MFC after: immediately

wg(4): Fix an example in the manual page

The example in the manual page of wg(4) for connecting to a
peer was missing the 'public-key' ifconfig(8) keyword and for the
addressed peer the port must be specified.

PR: 253866
Reported by: Sergey Akhmatov <sergey at akhmatov dot ru>
Reviewed by: debdrup
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29115

net80211: ratectl header guard against multiple inclusions

Add the missing #ifndef/#define/#endif guards against multiple inclusions
to ieee80211_ratectl.h.

MFC after: 3 days
Sponsored-by: Rubicon Communications, LLC ("Netgate")

Revert "TEST gitrepo-dev"

This reverts commit 4287fa844f5e8f0021ada77c81ce96f9b547fccf.

TEST gitrepo-dev

mvebu_gpio: Fix settings of gpio pin direction.

Data Output Enable Control register is inverted – 0 means output direction.
Reflect this fact in code.
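
The inverted polarity can be sketched with two helpers operating on a plain 32-bit value. The function names are hypothetical and the value stands in for the register contents; no real register access is shown, and `pin` is assumed to be below 32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return the new register value: a cleared bit selects output direction. */
static uint32_t
doe_set_direction(uint32_t reg, unsigned int pin, bool output)
{
    uint32_t mask = UINT32_C(1) << pin;

    return (output ? (reg & ~mask) : (reg | mask));
}

/* A pin is an output when its enable bit reads as 0. */
static bool
doe_is_output(uint32_t reg, unsigned int pin)
{
    return (((reg >> pin) & 1u) == 0);
}
```

Starting from all ones (every pin an input), making pin 3 an output yields 0xFFFFFFF7.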

MFC after: 3 weeks

bhyvectl: print a better error message when vm_open() fails

Use errno to print a more descriptive error message when vm_open() fails

libvmm: preserve errno when vm_device_open() fails

vm_destroy() squashes errno by making a dive into sysctlbyname() - we
can safely skip vm_destroy() here since it's not doing any critical
clean up at this point. Replace vm_destroy() with a free() call.
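
The commit avoids the errno-clobbering call altogether. When a cleanup call cannot be skipped, a different and common way to keep the original error is to save and restore errno around it, sketched below. All three functions are hypothetical stand-ins and are not libvmm code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical operation that fails and reports the reason in errno. */
static int
failing_open(void)
{
    errno = EACCES;
    return (-1);
}

/* Hypothetical cleanup that overwrites errno as a side effect. */
static void
noisy_cleanup(void)
{
    errno = ENOENT;
}

/* Keep the first error visible to the caller across the cleanup. */
static int
open_preserving_errno(void)
{
    int fd, saved;

    fd = failing_open();
    if (fd < 0) {
        saved = errno;
        noisy_cleanup();
        errno = saved;
    }
    return (fd);
}
```

The caller then sees EACCES from the failed open rather than ENOENT from the cleanup.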

PR: 250671
MFC after: 3 days
Submitted by: marko@apache.org
Reviewed by: grehan
Differential Revision: https://reviews.freebsd.org/D29109

poly1305: Chase xform_poly1305.h removal

It was missed in bb6e84c988d3 and afbee98232f4.