CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

nfsclient: add checks for a server returning the current directory

Commit 3fe2c68ba20f dealt with a panic in cache_enter_time() where
the vnode referred to the directory argument.
It would also be possible to get these panics if a broken
NFS server were to return the directory as a new object being
created within the directory or in a Lookup reply.

This patch adds checks to avoid the panics and logs
messages to indicate that the server is broken for the
file object creation cases.

(cherry picked from commit 3e04ab36ba5ce5cbbf6d22f17a01a391a04e465f)

dumpon.8: Ask DDB to call doadump() rather than calling it directly

Sponsored by: The FreeBSD Foundation

(cherry picked from commit af06ff55535d9b2de253103e974558104e0a3d97)

Rename _cscan_atomic.h and _cscan_bus.h to atomic_san.h and bus_san.h

Other kernel sanitizers (KMSAN, KASAN) require interceptors as well, so
put these in a more generic place as a step towards importing the other
sanitizers.

No functional change intended.

Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29103

(cherry picked from commit 435c7cfb2418fdac48fa53e29e38ef03646b817d)

ath_hal: Stop printing messages during boot

ath_hal is compiled into the kernel by default and so always prints a
message to dmesg even when the system has no ath hardware.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 0e72eb460228e4b9cb790beb7113d0a5c88db503)

posix timers: Improve the overrun calculation

timer_settime(2) may be used to configure a timeout in the past.  If
the timer is also periodic, we also try to compute the number of timer
overruns that occurred between the initial timeout and the time at which
the timer fired.  This is done in a loop which iterates once per period
between the initial timeout and now.  If the period is small and the
initial timeout was a long time ago, this loop can take forever to run,
so the system is effectively DOSed.

Replace the loop with a more direct calculation of
(now - initial timeout) / period to compute the number of overruns.

Reported by: syzkaller
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29093

(cherry picked from commit 7995dae9d3f58abf38ef0001cee24131f3c9054b)
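
The direct calculation described in the entry above can be sketched as follows. This is only an illustration, not the kernel code: the helper names are hypothetical, times are plain nanosecond counts instead of struct timespec, and the kernel's exact rounding is not stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: O(1) overrun count, computed as
 * (now - initial timeout) / period.
 */
static uint64_t
overruns_direct(uint64_t initial_ns, uint64_t now_ns, uint64_t period_ns)
{
	assert(period_ns > 0);	/* only periodic timers take this path */
	if (now_ns <= initial_ns)
		return (0);	/* the timeout is not in the past */
	return ((now_ns - initial_ns) / period_ns);
}

/*
 * The shape of the loop being replaced: one iteration per elapsed
 * period, so a tiny period and an old timeout mean a huge trip count.
 */
static uint64_t
overruns_loop(uint64_t initial_ns, uint64_t now_ns, uint64_t period_ns)
{
	uint64_t n = 0;

	assert(period_ns > 0);
	while (initial_ns + period_ns <= now_ns) {
		initial_ns += period_ns;
		n++;
	}
	return (n);
}
```

Both helpers return the same count; only the direct form stays cheap when the period is one nanosecond and the initial timeout is seconds in the past.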

posix timers: Sprinkle some style fixes

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 60d12ef952a39581e967a1a608522fdbdedefa01)

posix timers: Declare unexported functions as static

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 8ff2b41c05a8f98e30250b929e9722f714f1f319)

wg: Style

Sponsored by: The FreeBSD Foundation

(cherry picked from commit d8cebef50e7b5fac1e28bcb1f931962210f9b5f6)

ns8250: don't drop IER_TXRDY on bus_grab/ungrab

It has been observed that some systems are often unable to resume from
ddb after entering with debug.kdb.enter=1. Checking the status further
shows the terminal is blocked waiting in tty_drain(), but it never makes
progress in clearing the output queue, because sc->sc_txbusy is high.

I noticed that when entering polling mode for the debugger, IER_TXRDY is
set in the failure case. Since this bit is never tracked by the softc,
it will not be restored by ns8250_bus_ungrab(). This creates a race in
which a TX interrupt can be lost, creating the hang described above.
Ensuring that this bit is restored is enough to prevent this, and resume
from ddb as expected.

The solution is to track this bit in the sc->ier field, for the same
lifetime that TX interrupts are enabled.

PR: 223917, 240122
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 7e7f7beee732810d3afcc83828341ac3e139b5bd)

development(7): update to reflect Git transition

Reviewed By: debdrup, imp (earlier version)
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D28939

(cherry picked from commit d28cbb7944e5b1015d94a04cadc97d473838611e)

linux(4): make getcwd(2) return ERANGE instead of ENOMEM

For native FreeBSD binaries, the return value from __getcwd(2)
doesn't really matter, as the libc wrapper takes over and returns
the proper errno.

PR: kern/254120
Reported By: Alex S <iwtcex@gmail.com>
Reviewed By: kib
Sponsored By: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29217

(cherry picked from commit 0dfbdd9fc269f0438ffcc31632d35234a90584ad)
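
For context, a caller of getcwd(3) conventionally sees ERANGE when the buffer is too small for the path. The sketch below is a userland illustration of that convention only; the helper name is hypothetical and it does not exercise the linux(4) code path changed by the entry above.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Call getcwd(3) with a buffer too small for any absolute path
 * (even "/" needs two bytes) and return the errno it fails with,
 * or 0 if the call unexpectedly succeeds.
 */
static int
getcwd_tiny_buffer_errno(void)
{
	char buf[1];

	errno = 0;
	if (getcwd(buf, sizeof(buf)) != NULL)
		return (0);
	return (errno);
}
```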

Make kern.timecounter.hardware tunable

(cherry picked from commit 56b9bee63a42dbac712acf540f23a4c3dbd099a9)

Move ic_check_send_space clear to the actual check.

It closes a tiny race in which the flag could be set between being
cleared and the space being checked, which would create some more work
for us. Setting the flag is protected by both locks, so we can clear it
in either place, but in between both locks are dropped.

MFC after: 1 week

(cherry picked from commit afc3e54eeee635a525c88e4678cc38e3219302c3)

Restore condition removed in df3747c6607b.

I think it allowed us to avoid some TX thread wakeups while the socket
buffer is full. But add another option there: if ic_check_send_space
is set, the socket has just reported that new space appeared, so
it may make sense to pull more data from ic_to_send for better TX
coalescing.

MFC after: 1 week

(cherry picked from commit aff9b9ee894e3e6b6d8c7e4182d6b973804df853)

Replace STAILQ_SWAP() with simpler STAILQ_CONCAT().

Also remove stray STAILQ_REMOVE_AFTER(), not causing problems only
because STAILQ_SWAP() fixed corrupted stqh_last.

MFC after: 1 week

(cherry picked from commit df3747c6607be12d48db825653e6adfc3041e97f)

Fix initiator panic after 6895f89fe54e.

There are sessions without socket that are not disconnecting yet.

MFC after: 3 weeks

(cherry picked from commit 06e9c710998b83a3be21f7f264187fff5d590bc3)

Optimize TX coalescing by keeping pointer to last mbuf.

Previously, each m_cat() call traversed the entire coalesced chain.

MFC after: 1 week

(cherry picked from commit b85a67f54a40053e75658a17c620b89bafaba67f)
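
The idea of keeping a pointer to the last element can be sketched outside the kernel. The types and function names below are hypothetical stand-ins, not the mbuf API: struct buf plays the role of an mbuf and struct chain holds the coalescing state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: only the link and a length. */
struct buf {
	struct buf *next;
	int len;
};

/* Coalescing state: head of the chain plus a cached tail pointer. */
struct chain {
	struct buf *head;
	struct buf *last;
};

/* Append without a cached tail: walks the whole chain on every call. */
static void
chain_append_walk(struct chain *c, struct buf *b)
{
	struct buf **pp = &c->head;

	while (*pp != NULL)
		pp = &(*pp)->next;
	*pp = b;
}

/* Append with a cached tail; b may itself be a short chain. */
static void
chain_append_last(struct chain *c, struct buf *b)
{
	assert(b != NULL);
	if (c->head == NULL)
		c->head = b;
	else
		c->last->next = b;
	/* Advance the cached tail over what was just appended. */
	while (b->next != NULL)
		b = b->next;
	c->last = b;
}
```

With the cached tail, the cost of an append depends only on the length of the piece being added, not on how much has already been coalesced.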

Optimize out a few extra memory accesses.

MFC after: 1 week

(cherry picked from commit a59e2982fe3e6339629cc77fe9d349d60e03a05e)

Micro-optimize OOA queue processing.

- Move ctl_get_cmd_entry() calls from every OOA traversal to when
  the requests are first inserted, storing seridx in struct ctl_scsiio.
- Move some checks out of the loop in ctl_check_ooa().
- Replace checks for errors that can not happen with asserts.
- Transpose ctl_serialize_table, so that any OOA traversal accesses
  only one row (cache line).  Compact it from enum to uint8_t.
- Optimize static branch predictions in hottest places.

Because it is O(n) on deep LUN queues, this can be the hottest code
path in CTL, and the additional 20% of IOPS I see in some 4KB I/O tests
is good to have in reserve.  About 50% of CPU time here according
to the profiles is now spent in two memory accesses per traversed
request in OOA.

Sponsored by: iXsystems, Inc.
MFC after: 2 weeks

(cherry picked from commit 9d9fd8b79f0ebe59f791c8225fa01ab59858b7b5)

Coalesce socket reads in software iSCSI.

Instead of 2-4 socket reads per PDU this can do as few as one read
per megabyte, dramatically reducing TCP overhead and lock contention.

With this on iSCSI target I can write more than 4GB/s through a
single connection.

MFC after: 1 month

(cherry picked from commit 6895f89fe54e0858aea70d2bd2a9651f45d7998e)

Fix build after 2c7dc6bae9fd.

MFC after: 1 month

(cherry picked from commit c02a28754bc229c05e8baf9b6632cbd59bc73e48)

Refactor CTL datamove KPI.

- Make frontends call unified CTL core method ctl_datamove_done()
  to report move completion. It reduces code duplication in
  different backends by accounting DMA time in common code.
- Add a samethr argument to ctl_datamove_done() and the be_move_done()
  callback, reporting whether the callback is called in the same
  context as ctl_datamove(). It lets some cases, like iSCSI write
  with immediate data or camsim frontend write, save one context
  switch, since we know that the context is sleepable.
- Remove data_move_done() methods from struct ctl_backend_driver,
  unused since forever.

MFC after: 1 month

(cherry picked from commit 2c7dc6bae9fd5c2fa0a65768df8e4e99c2f159f1)

Microoptimize CTL I/O queues.

Switch OOA queue from TAILQ to LIST and change its direction, so that
we traverse it forward, not backward. There is only one place where
we really need other direction, and it is not critical.

Use STAILQ_REMOVE_HEAD() instead of STAILQ_REMOVE() in backends.

Replace a few impossible conditions with assertions.

MFC after: 1 month

(cherry picked from commit 05d882b780f5be2da6f3d3bfef9160aacc4888d6)

Save context switch per I/O for iSCSI and IOCTL frontends.

Introduce new CTL core KPI ctl_run(), preprocessing I/Os in the caller
context instead of scheduling another thread just for that. This call
may sleep, which is not acceptable for some frontends like the original
CAM/FC one, but iSCSI already has separate sleepable per-connection RX
threads, and scheduling another thread is mostly just a waste of time.
IOCTL frontend actually waits for the I/O completion in the caller
thread, so using another thread for this makes even less sense.

With this change I can measure ~5% IOPS improvement on 4KB iSCSI I/Os
to ZFS.

MFC after: 1 month

(cherry picked from commit 812c9f48a2b7bccc31b2a6077b299822357832e4)

Move XPT_IMMEDIATE_NOTIFY handling out of periph lock.

It is rare, but it is still better to not have lock dependencies.

MFC after: 1 month

(cherry picked from commit c67a2909a629db138227993e1093e66bb6c00af5)

newsyslog(8): Implement a new 'E' flag to not rotate empty log files

Based on an idea from dvl's coworker, László DANIELISZ, implement
a new flag, 'E', that prevents newsyslog(8) from rotating the empty
log files. This 'E' flag is mostly useful in conjunction with the 'B'
flag that instructs newsyslog(8) to not insert an informational
message into the log file after rotation, keeping it still empty.

Reviewed by: markj, ian, manpages (rpokala)
Approved by: markj, ian, manpages (rpokala)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28940

(cherry picked from commit c7d27b225df8d7fb36a31a21737d4309593c4604)

Do not complain about incorrect cylinder group check-hashes when
asked to add them to a filesystem.

Sponsored by: Netflix

(cherry picked from commit 6385cabd5be627c4f395e3abf215882aaeb36320)

lib/flua/libjail: Allow empty params table

The name or jid always gets added to the params, and that's enough to
avoid allocating a 0 length params array.

Reported by: kevans
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D28778

(cherry picked from commit e175b519a6fb83889fb3ca679b73d11ea5bea7ad)

sbin/ifconfig: Get lagg status with libifconfig

Also trimmed an unused block of code that never prints out LAGG_PROTOS.

Reviewed by: kp (earlier version)
Differential Revision: https://reviews.freebsd.org/D28961

(cherry picked from commit a0ebb915045ed0056decec5f001471af4e999f61)

libifconfig: Add a function to get down reason

For use in ifconfig.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D28991

(cherry picked from commit b12a960e4274926171dc7a4f9887a0d0a5195b44)

sbin/ifconfig: Get bridge status with libifconfig

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D28954

(cherry picked from commit 6f497e47e925f6886f444a8e31e2e939fca264f2)

sbin/ifconfig: Get groups with libifconfig

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D28965

(cherry picked from commit 64bacab177f7c743af3268a3e1ffcddaf77a68d0)

sbin/ifconfig: Get carp status with libifconfig

A trivial change now that ifconfig is already using libifconfig.

Reviewed by: kp (earlier version)
Differential Revision: https://reviews.freebsd.org/D28955

(cherry picked from commit da393346ac47b22b5f8af4040a59971faadd2c5c)

sbin/ifconfig: Minor housekeeping

Coalesce adjacent lint ifdefs.
Fix spelling of nitems.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D29022

(cherry picked from commit 88832d59dec10e97dd64b44391606776b20e782b)

libifconfig: Fix typo in symbol map

(cherry picked from commit 80545a16df95263781b3422695527b6238f4bd2c)

sbin/ifconfig: Drop local name var in sfp_status

There is already a globally defined name variable.

(cherry picked from commit 9995455218ff19df9cf0dcaf0198269dc76eeb2d)

libifconfig: Set error in ifconfig_get_groups

This should return -1 with OTHER/ENOMEM set in the handle when malloc
fails, like everywhere else in libifconfig.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D28964

(cherry picked from commit 1d9ba697f99a88b321a7d8b96fa142ea774cd3be)

vm_reserv: Fix list locking in vm_reserv_reclaim_contig()

The per-domain partpop queue is locked by the combination of the
per-domain lock and individual reservation mutexes.
vm_reserv_reclaim_contig() scans the queue looking for partially
populated reservations that can be reclaimed in order to satisfy the
caller's allocation.

During the scan, we drop the per-domain lock. At this point, the rvn
pointer may be invalidated. Take care to load rvn after re-acquiring
the per-domain lock.

While here, simplify the condition used to check whether a reservation
was dequeued while the per-domain lock was dropped.

Reviewed by: alc, kib
Reported by: gallatin
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29203

(cherry picked from commit 968079f253c11433d47bece4b41b46fcbf985903)

Flush remaining routes from the routing table during VNET shutdown.

Summary:
This fixes rtentry leak for the cloned interfaces created inside the
VNET.

Loopback teardown order is `SI_SUB_INIT_IF`, which happens after `SI_SUB_PROTO_DOMAIN` (route table teardown).
Thus, any route table operations are too late to schedule.
As the intent of the vnet teardown procedures is to minimise the amount of effort by doing global cleanups instead of per-interface ones, address this by adding a relatively light-weight routing table cleanup function, `rib_flush_routes()`.
It removes all remaining routes from the routing table and schedules the deletion, which will happen later, when `rtables_destroy()` waits for the current epoch to finish.

Test Plan:
```
set_skip:set_skip_group_lo -> passed [0.053s]
tail -n 200 /var/log/messages | grep rtentry
```

PR: 253998
Reported by: rashey at superbox.pl
Reviewed By: kp
Differential Revision: https://reviews.freebsd.org/D29116

(cherry picked from commit b1d63265ac399112b3bca36c3d75df1a3c2c8102)

Fix various NOINET* builds broken by 145bf6c0af48.

Reported by: mjg, bdragon

(cherry picked from commit 8ca99aecf749dd088310f81f3c5364a462f1e332)

Fix blackhole/reject routes.

Traditionally the *BSD routing stack required some interface data
to be supplied for blackhole/reject routes. This led to
a variety of hacks in routing daemons when inserting such routes.
With the recent routing stack changes, a gateway sockaddr without
RTF_GATEWAY started to be treated differently, purely as a link
identifier.

This change broke net/bird, which installs blackhole routes with
127.0.0.1 gateway without RTF_GATEWAY flags.

Fix this by automatically constructing necessary gateway data at
rtsock level if RTF_REJECT/RTF_BLACKHOLE is set.

Reported by: Marek Zarychta <zarychtam at plan-b.pwste.edu.pl>
Reviewed by: donner

(cherry picked from commit 145bf6c0af48b89f13465e145f4516de37c31d85)

Partially revert libcxxrt changes to avoid _Unwind_Exception change

(Note I am also applying this to main and stable/13, to restore the old
libcxxrt ABI and to avoid having to maintain a compat library.)

After the recent cherry-picking of libcxxrt commits 0ee0dbfb0d26 and
d2b3fadf2db5, users reported that editors/libreoffice packages from the
official package builders did not start anymore. It turns out that the
combination of these commits subtly changes the ABI, requiring all
applications that depend on internal details of struct _Unwind_Exception
(available via unwind-arm.h and unwind-itanium.h) to be recompiled.

However, the FreeBSD package builders always use -RELEASE jails, so
these still use the old declaration of struct _Unwind_Exception, which
is not entirely compatible. In particular, LibreOffice uses this struct
in its internal "uno bridge" component, where it attempts to set up its
own exception handling mechanism.

To fix this incompatibility, go back to the old declarations of struct
_Unwind_Exception, and restore the __LP64__ specific workaround we had
in place before (which was to cope with yet another, older ABI bug).

Effectively, this reverts upstream libcxxrt commits 88bdf6b290da
("Specify double-word alignment for ARM unwind") and b96169641f79
("Updated Itanium unwind"), and reapplies our commit 3c4fd2463bb2
("libcxxrt: add padding in __cxa_allocate_* to fix alignment").

PR: 253840

Restore AT_RESOLVE_BENEATH support for funlinkat(2)/unlinkat(2).

(cherry picked from commit ead7697f04c036853535a4281cec9aa09ef21270)

MFC jail: Don't allow jails under dying parents

If a jail is created with jail_set(...JAIL_DYING), and it has a parent
currently in a dying state, that will bring the parent jail back to
life. Restrict that to require that the parent itself be explicitly
brought back first, and not implicitly created along with the new
child jail.

Differential Revision: https://reviews.freebsd.org/D28515

(cherry picked from commit 0a2a96f35a4c2dab3486438680fa289e12971e4b)

MFC jail: Fix locking on an early jail_set error.

I had locked allprison_lock without immediately setting PD_LIST_LOCKED.

(cherry picked from commit 108a9384e9e945cccba73c959f7e9cdb023cbcad)

MFC jail: Add PD_KILL to remove a prison in prison_deref().

Add the PD_KILL flag that instructs prison_deref() to take steps
to actively kill a prison and its descendents, namely marking it
PRISON_STATE_DYING, clearing its PR_PERSIST flag, and killing any
attached processes.

This replaces a similar loop in sys_jail_remove(), bringing the
operation under the same single hold on allprison_lock that it already
has. It is also used to clean up failed jail (re-)creations in
kern_jail_set(), which didn't generally take all the proper steps.

Differential Revision: https://reviews.freebsd.org/D28473

(cherry picked from commit 811e27fa3c445664e36071a7d08228fc7fb85676)

MFC jail: back out 811e27fa3c44 until it doesn't break Jenkins

Reported by: arichardson

(cherry picked from commit ddfffb41a22d4798a036fe2d30e59694ba7cdad3)

MFC jail: re-commit 811e27fa3c44 with fixes

Make sure PD_KILL isn't passed to do_jail_attach, where it might end
up trying to kill the caller's prison (even prison0).

Fix the child jail loop in prison_deref_kill, which was doing the
post-order part during the pre-order part. That's not a
system-killer, but it made jails not always die correctly.

(cherry picked from commit c861373bdff90d8167a0d998899ca718ccdb541b)

MFC jail: Add safety around prison_deref() flags.

do_jail_attach() now only uses the PD_XXX flags that refer to lock
status, so make sure that something else like PD_KILL doesn't slip
through.

Add a KASSERT() in prison_deref() to catch any further PD_KILL misuse.

(cherry picked from commit 589e4c1df4a6e4b1368f26fc7fef704a2e5cb42c)

MFC jail: Add pr_state to struct prison

Rather than using references (pr_ref and pr_uref) to deduce the state
of a prison, keep track of its state explicitly. A prison is either
"invalid" (pr_ref == 0), "alive" (pr_uref > 0) or "dying"
(pr_uref == 0).

State transitions are generally tied to the reference counts, but with
some flexibility: a new prison is "invalid" even though it now starts
with a reference, and jail_remove(2) sets the state to "dying" before
the user reference count drops to zero (which was previously
accomplished via the PR_REMOVE flag).

pr_state is protected by both the prison mutex and allprison_lock, so
it has the same availability guarantees as the reference counts do.

Differential Revision: https://reviews.freebsd.org/D27876

(cherry picked from commit 1158508a8086a1a93492c1a2e22b61cd7fee4ec7)
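
The reference-count-based deduction that the entry above moves away from can be written out as a small sketch. The enum and function names are hypothetical; only the three conditions (pr_ref == 0, pr_uref > 0, pr_uref == 0) come from the commit message.

```c
/* Hypothetical names for the three states listed in the message. */
enum prison_state_sketch {
	SK_INVALID,	/* pr_ref == 0 */
	SK_ALIVE,	/* pr_uref > 0 */
	SK_DYING	/* pr_uref == 0 while pr_ref > 0 */
};

/* Deduce the state from the two reference counts alone. */
static enum prison_state_sketch
state_from_refs(int pr_ref, int pr_uref)
{
	if (pr_ref == 0)
		return (SK_INVALID);
	return (pr_uref > 0 ? SK_ALIVE : SK_DYING);
}
```

An explicit state field allows the exceptions the message describes, such as a new prison that already holds a reference but is still "invalid", which a pure function of the counts cannot express.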

MFC jail: Fix a LOR introduced in 1158508a8086

(cherry picked from commit 701d6b50ae7b0b2b50fbd191c2dbd646ef3b4a67)

x86: tsc: deprioritize TSC on VirtualBox

Misbehavior has been observed with TSC under VirtualBox, where threads
doing small sleeps (~1 second) may miss their wake up and hang around
in a sleep state indefinitely. Switching back to ACPI-fast decidedly
fixes it, so stop using TSC on VirtualBox at least for the time being.

This partially reverts 84eaf2ccc6aa, applying it only to VirtualBox and
increasing the quality to 0. Negative qualities can never be chosen and
cannot be chosen with the tunable recently added. If we do not have a
timecounter with a higher quality than 0, then TSC does at least leave
the system mostly usable.

PR: 253087

(cherry picked from commit 8cc15b0dfc2f3299662e78f18bd6127f83c14ab4)

MFC jail: Change the locking around pr_ref and pr_uref

Require both the prison mutex and allprison_lock when pr_ref or
pr_uref go to/from zero.  Adding a non-first or removing a non-last
reference remain lock-free.  This means that a shared hold on
allprison_lock is sufficient for prison_isalive() to be useful, which
removes a number of cases of lock/check/unlock on the prison mutex.

Expand the locking in kern_jail_set() to keep allprison_lock held
exclusive until the new prison is valid, thus making invalid prisons
invisible to any thread holding allprison_lock (except of course the
one creating or destroying the prison).  This renders prison_isvalid()
nearly redundant, now used only in asserts.

Differential Revision: https://reviews.freebsd.org/D28419
Differential Revision: https://reviews.freebsd.org/D28458

(cherry picked from commit f7496dcab0360a74bfb00cd6118f66323fffda61)

MFC jail: fix build after the previous commit
Noted by: Michael Butler <imb protected-networks.net>

(cherry picked from commit ee9b37ae5c115c41835119bb5c9d2e14c83abd65)

MFC jail: Improve locking when removing prisons

Change the flow of prison_deref() so it doesn't let go of allprison_lock
until it's completely done using it (except for a possible drop as part
of an upgrade on its first try).

Differential Revision: https://reviews.freebsd.org/D28458

(cherry picked from commit 6e1d1bfcac77603541706807803a198c6d954d7c)

opencrypto: Make cryptosoft attach silently

cryptosoft is always present and doesn't print any useful information
when it attaches.

Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29098

(cherry picked from commit 4fc60fa9294f82c7f4e1a0e71f9a17794124217f)

netmap: Stop printing a line to the dmesg in netmap_init()

netmap is compiled into the kernel by default so initialization was
always reported, and netmap uses a formatting convention not used in the
rest of the kernel.

Reviewed by: vmaffione
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29099

(cherry picked from commit fef845097190f0ecb783d6c75a9398c4e4a4c0e1)

ktls: Hide initialization message behind bootverbose

We don't typically print anything when a subsystem initializes itself,
and KTLS is currently disabled by default anyway.

Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29097

(cherry picked from commit 89b650872bba2e4bfbc94a200946b461ef69ae22)

acpi: Make nexus_acpi quiet on amd64 and i386

Otherwise during attach newbus prints "nexus0", which is not very
useful.

The generic nexus device is already quiet, as is nexus_acpi on arm64.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 732b69c9f9c84408e7e680a93ab91ce76ffef2ce)

Add ObsoleteFiles.inc entries for various OCF headers removed in 13.

(cherry picked from commit ef74bfc6fed298d5ca0e3cb92bf008b715ea0c2f)

Correct the name of the structure used for TCP socket options.

The structure was renamed while refactoring Netflix's KTLS changes for
upstreaming, but the original name remained in tcp.4 and was
subsequently copied to ktls.4.

PR: 254141
Reported by: asomers

(cherry picked from commit c5a365623f88999b524d94003187ef09fda55f67)

Remove the usr/tests/usr.bin/yacc/yacc directory when removing yacc.

(cherry picked from commit e6cfd2939a4261c1f4bf802368cea8faf824c128)

wg(4): Fix an example in the manual page

The example in the manual page of wg(4) for connecting to a
peer was missing the 'public-key' ifconfig(8) keyword and for the
addressed peer the port must be specified.

PR: 253866
Reported by: Sergey Akhmatov <sergey at akhmatov dot ru>
Reviewed by: debdrup
Differential Revision: https://reviews.freebsd.org/D29115

(cherry picked from commit f7bfe310191c8292da51c8da166a521ff16e0e46)

security(7): mention new W^X sysctls in the manual page

Reviewed by: emaste, gbe
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28986

(cherry picked from commit 907023b454f06a6af87f21f8a9d6de6c11b2d275)

wg: Avoid leaking mbufs when the input handshake queue is full

Reviewed by: grehan
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29011

(cherry picked from commit a11009dccb6a2e75de2b8f1b45a0896eda2e6d85)

qat.4: Fix some firmware module names

PR: 252984

(cherry picked from commit 3adf72a36b9b151eef57e3d83f71a3a9fbacb78d)

Fix 'in6_purgeaddr: err=65, destination address delete failed' message.

P2P ifa may require 2 routes: one is the loopback route, another is
the "prefix" route towards its destination.

The current code marks the existence of loopback routes with IFA_RTSELF
and of "prefix" p2p routes with IFA_ROUTE.

For historic reasons, we fill in ifa_dstaddr for loopback interfaces.
To avoid installing the same route twice, we preemptively set
IFA_RTSELF when adding "prefix" route for loopback.
However, the teardown part doesn't have this hack, so we try to
remove the same route twice.

Fix this by checking if ifa_dstaddr is different from the ifa_addr
and moving this logic into a separate function.

Reviewed By: kp
Differential Revision: https://reviews.freebsd.org/D29121

(cherry picked from commit 7634919e15f1147b6f26d55354be375bc9b198db)

Enforce net epoch in in6_selectsrc().

in6_selectsrc() may call fib6_lookup() in some cases, which requires
epoch. Wrap in6_selectsrc* calls into epoch inside its users.
Mark it as requiring epoch by adding NET_EPOCH_ASSERT().

Differential Revision: https://reviews.freebsd.org/D28647

(cherry picked from commit 605284b894748d23136b30a202689493d8f8af52)

Fix dpdk/ldradix fib lookup algorithm preference calculation.

The current preference numbers were copied from the IPv4 code,
assuming 500k routes to be the full view. Adjust to the current
reality (100k full view).

Reported by: Marek Zarychta <zarychtam at plan-b.pwste.edu.pl>

(cherry picked from commit d5be41beb7c44119730791d92782d8e77174d312)

Fix setting static entries for arp/ndp.

rtsock message validation changes committed in 2fe5a79425c7
did not take llinfo messages into account.

Add a special validation case for RTA_GATEWAY llinfo messages.

(cherry picked from commit e5b394f2d0d94f190c9da2346fd22d7c6fb14730)

Fix arp/ndp deletion broken by 2fe5a79425c7.

Changes in 2fe5a79425c7 moved dst sockaddr masking from the
routing control plane to the rtsock code.

It broke arp/ndp deletion.
It turns out, arp/ndp perform RTM_GET request first to get an
interface index necessary for the deletion.
Then they simply stamp the reply with RTF_LLDATA and set the
command to RTM_DELETE.
As a result, kernel receives request with non-empty RTA_NETMASK
and clears RTA_DST host bits before passing the message to the
lla code.

De facto, the only needed bits are RTA_DST, RTA_GATEWAY and the
subset of rtm_flags.

With that in mind, fix the interface by clearing RTA_NETMASK
for every message with RTF_LLDATA.

While here, clean up arp/ndp code a bit.

Reviewed by: gnn
Differential Revision: https://reviews.freebsd.org/D28804

(cherry picked from commit f9e1cd6c99200846b324a8b65f7f31ded74013bd)

Fix NOINET6 build broken by 2fe5a79425c7.

Reported by: mjg

(cherry picked from commit a4513bace0e0c38d38b0c49c1daea60f2741d781)

Fix dst/netmask handling in routing socket code.

Traditionally routing socket code did almost zero checks on
the input message except for the most basic size checks.

This resulted in an unclear KPI boundary for the routing system code
(`rtrequest*` and now `rib_action()`) w.r.t. message validity.

Multiple potential problems and nuances exist:
* Host bits in RTAX_DST sockaddr. Existing applications do send prefixes
with host bits uncleared. Even `route(8)` does this, as they hope the kernel
will do the job of fixing it. Code inside `rib_action()` needs to handle
it on its own (see the ugly `rt_maskedcopy()` hack).
* There are multiple ways of adding a host route: it can be DST without
netmask or DST with /32(/128) netmask. Also, RTF_HOST has to be set correspondingly.
Currently, these 2 options create 2 DIFFERENT routes in the kernel.
* No sockaddr length/content checking for the "secondary" fields exists: nothing
stops an rtsock application from sending a sockaddr_in with a length of 25 (instead of 16).
The kernel will accept it, install it to the RIB as is and propagate it to all rtsock consumers,
potentially triggering bugs in their code. Same goes for sin_port, sin_zero, etc.

The goal of this change is to make rtsock verify all sockaddr and prefix consistency.
Said differently, `rib_action()` or its internals should NOT need to change any of the
sockaddrs supplied by the `rt_addrinfo` structure due to incorrectness.

To be more specific, this change implements the following:
* sockaddr cleanup/validation check is added immediately after getting sockaddrs from rtm.
* Per-family dst/netmask checks clear host bits in dst and zero all dst/netmask "secondary" fields.
* The same netmask checking code converts /32(/128) netmasks to "host" route case
(NULL netmask, RTF_HOST), removing the dualism.
* Instead of allowing ANY "known" sockaddr families (0<..<AF_MAX), allow only actually
supported ones (inet, inet6, link).
* Automatically convert `sockaddr_sdl` (AF_LINK) gateways to
`sockaddr_sdl_short`.

Reported by: Guy Yur <guyyur at gmail.com>
Reviewed By: donner
Differential Revision: https://reviews.freebsd.org/D28668

(cherry picked from commit 2fe5a79425c79f7b828acd91da66d97230925fc8)
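
Two of the normalization steps listed in the entry above, clearing host bits and folding a /32 netmask into the host-route case, can be sketched for IPv4. This is a simplified illustration, not the rtsock code: the struct and function are hypothetical, addresses are host-order integers rather than sockaddrs, and no mask contiguity or length checks are done.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical normalized form of an IPv4 destination. */
struct route4 {
	uint32_t dst;
	uint32_t mask;		/* meaningful only when has_mask is set */
	bool has_mask;
	bool host;		/* stands in for RTF_HOST */
};

/*
 * Normalize (dst, mask).  A NULL mask means no netmask was supplied.
 * Both "no netmask" and an all-ones netmask become the same host
 * route; otherwise the host bits of dst are cleared.
 */
static struct route4
normalize_route4(uint32_t dst, const uint32_t *mask)
{
	struct route4 r = { dst, 0, false, true };

	if (mask == NULL || *mask == UINT32_MAX)
		return (r);
	r.mask = *mask;
	r.has_mask = true;
	r.host = false;
	r.dst = dst & *mask;	/* clear host bits */
	return (r);
}
```

For example, 192.168.1.23 with a /24 mask normalizes to 192.168.1.0/24, while the same address with a /32 mask or with no mask yields one identical host route.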

Add ifa_try_ref() to simplify ifa handling inside epoch.

More and more code migrates from lock-based protection to the NET_EPOCH
umbrella. It requires some logic changes, including, notably, refcount
handling.

When we have an `ifa` pointer and we're running inside epoch we're
guaranteed that this pointer will not be freed.
However, the following case can still happen:
* in thread 1 we drop to 0 refcount for ifa and schedule its deletion.
* in thread 2 we use this ifa and reference it
* destroy callout kicks in
* unhappy user reports bug

To address it, a new `ifa_try_ref()` function is added, which returns
failure when we try to reference an `ifa` with 0 refcount.
Additionally, the existing `ifa_ref()` is enforced with `KASSERT` to provide
a cleaner error in such scenarios.

Reviewed By: rstone, donner
Differential Revision: https://reviews.freebsd.org/D28639

(cherry picked from commit 600eade2fb4faacfcd408a01140ef15f85f0c817)
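
The "try to reference, fail at zero" pattern from the entry above can be sketched with C11 atomics. The struct and function names are hypothetical and this is not the kernel implementation; assert() stands in for KASSERT.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for an ifa. */
struct obj {
	atomic_uint refcount;
};

/* Take a reference only if the count has not already reached zero. */
static bool
obj_try_ref(struct obj *o)
{
	unsigned int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return (true);
	}
	return (false);		/* deletion is already scheduled */
}

/* Unconditional reference: the caller must already hold one. */
static void
obj_ref(struct obj *o)
{
	unsigned int old = atomic_fetch_add(&o->refcount, 1);

	assert(old != 0);	/* stand-in for the KASSERT */
	(void)old;		/* keep NDEBUG builds warning-free */
}
```

A reader inside the epoch section would call obj_try_ref() and skip the object when it returns false, instead of resurrecting an object that another thread has already released.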

Make in_localip_more() fib-aware.

It fixes loopback route installation for the interfaces
in the different fibs using the same prefix.

Reviewed By: donner
PR: 189088
Differential Revision: https://reviews.freebsd.org/D28673

(cherry picked from commit 9fdbf7eef5c006002769add15b1ebb8fa8d9e220)

Remove per-packet ifa refcounting from IPv6 fast path.

Currently ip6_input() calls in6ifa_ifwithaddr() for
every local packet, in order to check if the target ip
belongs to the local ifa in proper state and increase
its counters.

in6ifa_ifwithaddr() references the ifa it finds.
With the epoch changes, neither `ip6_input()` nor any other current caller
of `in6ifa_ifwithaddr()` needs this reference
anymore, as epoch provides a stability guarantee.

Given that, update `in6ifa_ifwithaddr()` so that
it can return the ifa without referencing it, while preserving
the option of getting a referenced ifa if so desired.

Differential Revision: https://reviews.freebsd.org/D28648

(cherry picked from commit 8268d82cff1bcd7969e5b3c676f28684784a7a43)

Remove now-unused RTF_RNH_LOCKED route flag.

(cherry picked from commit 64d5c2777731c1376dd44b6a5fdb68b168d073dc)

Do not reference returned ifa in in6_ifawithifp().

The only place where in6_ifawithifp() is used is ip6_output(),
which uses the returned ifa to bump traffic counters.
Given that ifa stability is guaranteed by epoch, do not refcount the ifa.

This eliminates 2 atomic ops from IPv6 fast path.

Reviewed By: rstone
Differential Revision: https://reviews.freebsd.org/D28649

(cherry picked from commit 1bd44b11e59f1e9ee7245f8de1f823bc5287b9ef)

backlight(8): Add note that with option it prints the current brightness.

MFC after: 3 days
PR: 253737

(cherry picked from commit 1df30489a8f7083c98010c94d9ce522f9e8213dc)

backlight: Fix incr/decr with percent value of 0

This now does nothing instead of incr/decr by 10%

MFC After: 3 days
PR: 253736

(cherry picked from commit 3b005d51bd0fe4d8d19fb2df4d470b6e8baebf16)

zfs: update openzfs version reference to bedbc13da

It was missed in the latest merge.

(cherry picked from commit 6781b8a32e702c694d3f813959d326e36facc19f)

zfs: merge OpenZFS master-bedbc13da

Notable upstream commits:
  8e43fa12c Fix vdev_rebuild_thread deadlock
  03ef8f09e Add missing checks for unsupported features
  2e160dee9 Fix assert in FreeBSD-specific dmu_read_pages
  bedbc13da Cancel TRIM / initialize on FAULTED non-writeable vdevs

Obtained from: OpenZFS

(cherry picked from commit caed7b1c399de04279822028e15b36367e84232f)

openzfs: attach pam_zfs_key to build

This PAM module allows unlocking encrypted user home datasets when
logging in (and changing passphrase when changing the account password),
see https://github.com/openzfs/zfs/pull/9903

Also supposed to unload the key when the last session for the user is
done, but there are EBUSY issues:
https://github.com/openzfs/zfs/issues/11222#issuecomment-731897858

Submitted by: Greg V <greg_unrelenting.technology>
Reviewed by: mm
Differential Revision: https://reviews.freebsd.org/D28018

(cherry picked from commit ee21ee1572d40a3b74f18638dae38c1a9ad1e9e3)

zfs: add missing seqc write begin/end around zfs_acl_chown_setattr

It happens to trip over an assert but does not matter for correctness at
this time. However, do it for future proofing.

Reported by: avg

(cherry picked from commit 1d8510c1a64d61a85c74c8b02fb12e6f31ede5a1)

zfs: add missing checks for unsupported features

After the merge of OpenZFS master-9312e0fd1 it has become possible to
import ZFS pools with an active org.illumos:edonr feature on FreeBSD,
leading to a panic.

In addition, "zpool status" reported all pools without edonr as upgradable
and "zpool upgrade -v" listed edonr in the list of upgradable features.

This bugfix has been accepted upstream but is not yet included there.

Obtained from: https://github.com/openzfs/zfs/pull/11653
Differential Revision: https://reviews.freebsd.org/D28935
Reported by: garga (on freebsd-current@)
Reviewed by: freqlabs

(cherry picked from commit c170aa9f37e4ce9338a0f26e3e983f7123ea8c1a)

Install links for zpool feature compat aliases

The alias links were missed when this feature was introduced to the
FreeBSD build system in 10f57cb98fd61b2669640a84aa73ad118601f281.

Reviewed by: mm
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D28925

(cherry picked from commit 2ae79aa362e7a2ee72657b39be64f1390158aaf6)

zfs: merge OpenZFS master-9312e0fd1

Notable upstream changes:
  778869fa1 Fix reporting of mount progress
  e7adccf7f Disable use of hardware crypto offload drivers on FreeBSD
  03e02e5b5 Fix checksum errors not being counted on repeated repair
  64e0fe14f Restore FreeBSD resource usage accounting
  11f2e9a49 Fix panic if scrubbing after removing a slog device

(cherry picked from commit ba27dd8be821792e15bdabfac69fd6cab0cf9dd3)

zfs: bump version and install new share files

- bump version to 2.0.0-FreeBSD_gbf156c966
- install definition files for the new "-o compatibility" option
to "zpool create"

MFC after: 2 weeks

(cherry picked from commit 10f57cb98fd61b2669640a84aa73ad118601f281)

zfs: merge OpenZFS master-bf156c966

Notable upstream changes:
bf156c966 Remove unused abd_alloc_scatter_offset_chunkcnt
658fb8020 Add "compatibility" property for zpool feature sets

This update introduces a new pool property called "compatibility"
that can be used to enable a limited set of pool features on pool
creation and "stick" to it, so the "zpool upgrade" does not
accidentally enable features that are not desired. The value of
this property may then be changed later.

See zpool-features(5) for more information about the "compatibility"
pool property.

Obtained from: OpenZFS

(cherry picked from commit ee36e25a86cbe2a9474c1d61f2c4b450da8ef952)

zfs: change file mode of all merged tests

If the ksh files are not executable then the tests are not run
and reported as failed.

(cherry picked from commit afcb3c4cb49f1ba9690d066c3dc1af9c7bee1ea3)

zfs: merge OpenZFS master-436ab35a5

- speed up writing to ZFS pools without ZIL devices (aa755b3)
- speed up importing ZFS pools (2d8f72d, a0e0199, cf0977a)
...

Reviewed by: mjg (partial)
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D28677

(cherry picked from commit 184c1b943937986c81e1996d999d21626ec7a4ff)

ice(4): Update to version 0.28.1-k

This updates the driver to align with the version included in
the "Intel Ethernet Adapter Complete Driver Pack", version 25.6.

There are no major functional changes; this mostly contains
bug fixes and changes to prepare for new features. This version
of the driver uses the previously committed ice_ddp package
1.3.19.0.

Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Tested by: jeffrey.e.pieper@intel.com
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D28640

(cherry picked from commit d08b8680e12ad692736c84238dcf45c70c228914)

ice_ddp: Update package file to 1.3.19.0

This package is intended to be used with ice(4) version 0.28.1-k.
That update will happen in a forthcoming commit.

Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Sponsored by: Intel Corporation

(cherry picked from commit a7ac518bff64d48cf262c60c4dc57eef34e74a07)

[PowerPC64] add mpr to GENERIC64 and GENERIC64LE

Submitted by: Andre Fernando da Silva <andre.silva@eldorado.org.br>
Reviewed by: luporl, alfredo, Sreekanth Reddy <sreekanth.reddy@broadcom.com> (by email)
Sponsored by: Eldorado Research Institute (eldorado.org.br)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25785

(cherry picked from commit 231633a2e9000d67b09f132ee26951a4621c778a)

mpr: big-endian support

This fixes the mpr driver on big-endian devices.
Tested on powerpc64 and powerpc64le targets using a SAS9300-8i card
(LSISAS3008 pci vendor=0x1000 device=0x0097)

Submitted by: Andre Fernando da Silva <andre.silva@eldorado.org.br>
Reviewed by: luporl, alfredo, Sreekanth Reddy <sreekanth.reddy@broadcom.com> (by email)
Sponsored by: Eldorado Research Institute (eldorado.org.br)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25785

(cherry picked from commit 71900a794da046ad5322caae2774aed5b3d361b9)

Add a few missed files to libclang_rt.profile-<arch>.a

Otherwise, programs compiled with -fprofile-instr-generate will
encounter undefined symbol errors during linking, for example
__llvm_profile_counter_bias, lprofSetRuntimeCounterRelocation and a few
others were missing from the profile library.

Reported by: ota@j.email.ne.jp
PR: 254001

(cherry picked from commit 772c631af81abdb6d498d972bab79d04d3db16d0)

Build lib/msun tests with compiler builtins disabled

This forces the compiler to emit calls to libm functions, instead of
possibly substituting pre-calculated results at compile time, which
should help to actually test those functions.

Reviewed by: emaste, arichardson, ngie
Differential Revision: https://reviews.freebsd.org/D28577

(cherry picked from commit cf97d2a1dab8f2cddc4466fe64d37818339c73be)

riscv: Add a soft-float implementation of fabs()

We could just use a C implementation using __builtin_fabs(), but using
this assembly version guarantees that there is no additional prolog/epilog
code. Additionally, clang generates worse code for masking off the top bit
than GCC: https://bugs.llvm.org/show_bug.cgi?id=49377.
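
For illustration only, a portable C sketch of the "mask off the top bit"
approach mentioned above might look like the following. This is not the
committed assembly; `soft_fabs()` is a hypothetical name, and the code
assumes a 64-bit IEEE 754 double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t),
    "this sketch assumes a 64-bit IEEE 754 double");

/* Hypothetical helper: absolute value by clearing the sign bit. */
static double
soft_fabs(double x)
{
	uint64_t bits;

	/* memcpy avoids strict-aliasing problems when reinterpreting. */
	memcpy(&bits, &x, sizeof(bits));
	bits &= ~(UINT64_C(1) << 63);	/* bit 63 is the sign bit */
	memcpy(&x, &bits, sizeof(x));
	return (x);
}
```

Because only the sign bit changes, negative zero becomes positive zero and
the payload of a NaN is left untouched.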

This fixes the RISCV64 softfloat world build after cf97d2a1dab8. That commit
added -fno-builtin to the msun tests which resulted in the first references to
fabs (previously the compiler inlined all calls).

Reviewed By: dim
Reported by: mjg
Differential Revision: https://reviews.freebsd.org/D28994

(cherry picked from commit 524b018d200408bed5eb0d2b892db5b9fb46808b)

riscv: Fix whitespace issues in fabs added in 524b018d2004

(cherry picked from commit 066dab17e7a4a78d43dbcef8119960ddc8090a73)

clang: Fix -gz=zlib options for linker

Clang commit ccb4124a4172bf2cb2e1cd7c253f0f1654fce294:

Fix -gz=zlib options for linker

gcc translates -gz=zlib to --compress-debug-sections=zlib for both
assembler and linker but clang only does this for assembler.

The linker needs --compress-debug-sections=zlib option to compress the
debug sections in the generated executable or shared library.

Due to this bug, -gz=zlib has no effect on the generated executable or
shared library.

This patch fixes that.

Clang commit 462cf39a5c180621b56f7602270ce33eb7b68d23:

[Driver] Fix -gz=zlib options for linker also on FreeBSD

ccb4124a4172 fixed translating -gz=zlib to --compress-debug-sections for
linker invocation for several ToolChains, but omitted FreeBSD.

PR: 253942
Approved by: dim
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29028

(cherry picked from commit 19587d742264c5caec33d218e9cea6eb78f6c6bb)

Mount the EFI system partition (ESP) on newly-installed systems and VM
images.

Per hier(7), the ESP will be mounted at /boot/efi. On UFS systems,
any existing ESP will be reused and mounted there; otherwise, a new one
will be made. On ZFS systems, space for an ESP is allocated on all disks
in the root pool, but only the partition actually used to boot is set up
and mounted.

This makes future upgrades of the EFI loader easier (upgrade scripts can
just change /boot/efi) and also greatly simplifies the parts of the
installer involved in initialization of the ESP. It also makes the
installer's behavior correspond to the documentation in hier(7).

Reviewed by: imp, tsoome, bdragon
Approved by: re (gjb)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D28897

(cherry picked from commit 0b7472b3d8d2f1e90fade5236b44fd98d8e396c2)
(cherry picked from commit 2c26d77d989abe48c662eeb6f52f7e4c9b81680c)
(cherry picked from commit e77cf2a4ab32a381df3c06d25b8b4f650047c3f2)
(cherry picked from commit e70eb40271512dfbca7cecf823e4b445e3989c2e)

ipfw: add IPv6 support for sockarg opcode.

Sponsored by: Yandex LLC

(cherry picked from commit a9f7eba9597189c0e438f6986067d31dca1c53b0)

loader: cursor off should restore display content

When drawing the cursor, we should store the original display
content because there may be image data we would like to restore
when the cursor is removed.

PR: 254054
Reported by: Jose Luis Duran

(cherry picked from commit d708f23ebb06cfc9cf8f96f17a43eb63653b818a)

atomic(9): note that atomic_interrupt_fence first appeared in 13.0

(cherry picked from commit f5e930b369c6ea7a3f81d8e5b52cc395bb7b4187)

Do not exit ctl_be_block_worker() prematurely.

Returning while there are still I/Os in a queue may leave them stuck
indefinitely, since there is only one taskqueue task for all of them.
I think I've reproduced this by switching ha_role to secondary under
heavy load.
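
A minimal sketch of the drain-until-empty pattern this implies, not the
actual ctl_be_block_worker() code: `struct queue`, its `pending` counter and
`worker_drain()` are hypothetical stand-ins for the real I/O queues and the
single taskqueue handler that serves all of them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue: just a count of pending I/Os. */
struct queue {
	int pending;
};

/*
 * Process I/Os from 'n' queues in priority order and return how many
 * were handled.  The only exit is a full pass that finds every queue
 * empty; handling one I/O restarts the scan instead of returning.
 */
static int
worker_drain(struct queue q[], size_t n)
{
	int done = 0;

	assert(q != NULL);
	for (;;) {
		int progressed = 0;
		size_t i;

		for (i = 0; i < n; i++) {
			if (q[i].pending > 0) {
				q[i].pending--;	/* "handle" one I/O */
				done++;
				progressed = 1;
				break;		/* rescan from the first queue */
			}
		}
		if (!progressed)
			return (done);	/* all queues were empty */
	}
}
```

An early return from inside the scan would strand whatever remained in the
other queues until something else happened to schedule the task again.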

MFC after: 3 days

(cherry picked from commit 6ed39db2573bb808ac2c206cd6c831f0be86219c)

Move back the isa non-PNP driver deadline to FreeBSD 14.

(cherry picked from commit 6ffdaa5f2d4f0881557f64dabf61fb57541e0fba)

if_vtbe: Add missing includes to fix build

PR: 254137
Reported by: Mina Galić <me@igalic.co>
Fixes: f8bc74e2f4a5 ("tap: add support for virtio-net offloads")

(cherry picked from commit f2f8405cf6b50a9d91acc02073abf1062d9d34f4)

bc: Vendor import new version 3.3.3

(cherry picked from commit 028616d0dd69a3da7a30cb94d35f040bf2ced6b9)

Make length(0) and length(0.0) return 1 for compatibility with GNU bc
and the traditional FreeBSD bc.

Fix a potential division by zero error in a non-standard (extended)
math library function.