CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

pf: do not copy anchor_wildcard / anchor_relative from userspace

We overwrite these fields again in pf_kanchor_setup() anyway.

MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 1c680e620bf7e53d043d10b23bdfc980e45e6455)

pf: remove unused field from pf_kanchor

The 'match' field is only used in the userspace version of the struct
(pf_anchor).

MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 76c2e71c4c65a85279505005716aa43101c47bf7)

pfctl: Remove unused variable

MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 90dedf0fef71d3e3081015525665bf335f9c7ee3)

devd(8): Note default config file search locations

PR: 197003
Reported by: Harald Schmalzbauer <bugzilla.freebsd@omnilan.de>

(cherry picked from commit bad324ace4f817206baf86ae7379c35c8199048e)

sched_ule(4): Fix possible significance loss.

Before this change, setting the kern.sched.interact sysctl above 32 gave
all interactive threads the identical priority PRI_MIN_INTERACT, because
((PRI_MAX_INTERACT - PRI_MIN_INTERACT + 1) / sched_interact) truncated to
zero. Setting the sysctl lower could shrink the range of priority levels
in use by up to half, which is not great either.

Changing the order of operations should fix the issue by always using the
full range of priorities. Overflow is impossible there since both the
score and the priority values are small. While there, make the variables
unsigned, as they really are.
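
The effect of the operation order can be sketched in user space as
follows. This is an illustration only: the two constants are placeholder
values chosen to give a 32-level range (matching the threshold of 32
mentioned above), and the function names are hypothetical, not the
scheduler's own.

```c
#include <assert.h>

/* Placeholder values giving a 32-level range; not the kernel's definitions. */
#define PRI_MIN_INTERACT	88u
#define PRI_MAX_INTERACT	119u

/* Old order: divide first, so the per-score step is truncated early. */
unsigned
pri_divide_first(unsigned score, unsigned interact)
{
	assert(interact > 0 && score < interact);
	return (PRI_MIN_INTERACT +
	    ((PRI_MAX_INTERACT - PRI_MIN_INTERACT + 1) / interact) * score);
}

/* New order: multiply first, divide last, so the full range stays usable. */
unsigned
pri_multiply_first(unsigned score, unsigned interact)
{
	assert(interact > 0 && score < interact);
	return (PRI_MIN_INTERACT +
	    ((PRI_MAX_INTERACT - PRI_MIN_INTERACT + 1) * score) / interact);
}
```

With these placeholder values and a threshold of 40, the highest score
(39) maps to 88 when dividing first and to 119 when multiplying first.
With a threshold of 17, the highest score (16) maps to 104 versus 118.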

MFC after: 1 month

(cherry picked from commit 1c119e173ddc7f5603a3b6cf940dc524e494a667)

sched_ule(4): Fix hang with steal_thresh < 2.

e745d729be60 caused an infinite loop with interrupts disabled in the load
stealing code if steal_thresh was set below 2. Such a configuration should
not generally be used, but it appears some people use it to work around
some problems.

To fix the problem, explicitly pass to sched_highest() the minimum number
of transferable threads supported by the caller, instead of guessing.

MFC after: 25 days

(cherry picked from commit 08063e9f98a33980a09e3bd465926719b3437122)

x86: Add NUMA nodes into CPU topology.

Depending on hardware, NUMA nodes may match last level caches, or
they may be above them (AMD Zen 2/3) or below (Intel Xeon w/ SNC).
This information is provided by ACPI instead of CPUID, and it is
provided for each CPU individually instead of mask widths, but
this code should be able to properly handle all the above cases.

This change should immediately allow idle stealing in sched_ule(4)
to prefer load from NUMA-local CPUs to remote ones when the node
does not match LLC. Later we may think of how to better handle it
on sched_pickcpu() side.

MFC after: 1 month

(cherry picked from commit ef50d5fbc39fc39970eab1234222b5ac1d9ba74c)

Fix build without SMP.

MFC after: 1 month

(cherry picked from commit 8db1669959ceebdc60a7d402830663953bf32818)

sched_ule(4): Improve long-term load balancer.

Before this change the long-term load balancer was unable to migrate
running threads, only ones waiting on run queues.  But with a growing
number of CPU cores it is now quite typical for a system not to have
many waiting threads.  At the same time, if by some coincidence two
long-running CPU-bound threads ended up sharing the same physical CPU
core, they could suffer from the SMT penalty indefinitely, and the
load balancer couldn't help.

Improve that by teaching the load balancer to hint running threads
to migrate by marking them with TDF_NEEDRESCHED and the new TDF_PICKCPU
flag, making sched_pickcpu() search for a better CPU later, when
it is convenient.

Fix the CPU search logic when balancing to limit round-robin migrations
in case of almost equal load across a group of physical cores.  The
previous code bounced threads all across the system, which should be
pretty bad for caches and NUMA affinity, while the additional fairness
was almost invisible, diminishing with the number of cores in the group.

MFC after: 1 month

(cherry picked from commit e745d729be60a47b49eb19c02a6864a747fb2744)

sbuf(9): Microoptimize sbuf_put_byte()

This function is heavily used by sbuf_vprintf(), so this simple
inlining halves the time of kern.geom.confxml generation.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.

(cherry picked from commit 7835b2cb4a1ae57f403739a2f1076ec7188f18c9)

Bump __FreeBSD_version for OCF changes to support variable nonce lengths.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit ac847dbf73685a5df9f70bbcdefa9fdeb559071d)

crypto: Support Chacha20-Poly1305 with a nonce size of 8 bytes.

This is useful for WireGuard which uses a nonce of 8 bytes rather
than the 12 bytes used for IPsec and TLS.

Note that this also fixes a (should be) harmless bug in ossl(4) where
the counter was incorrectly treated as a 64-bit counter instead of a
32-bit counter in terms of wrapping when using a 12 byte nonce.
However, this required a single message (TLS record) longer than 64 *
(2^32 - 1) bytes (about 256 GB) to trigger.
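
As a quick check of the figure above, the arithmetic can be written out
as a small helper. The function and macro names are hypothetical and
exist only for this illustration; the 64-byte block size is the
ChaCha20 block size.

```c
#include <assert.h>
#include <stdint.h>

#define CHACHA20_BLOCK_LEN	64u	/* bytes produced per counter value */

/* Bytes covered by 2^counter_bits - 1 blocks of keystream. */
uint64_t
chacha20_bytes_for_counter(unsigned counter_bits)
{
	assert(counter_bits >= 1 && counter_bits <= 32);
	uint64_t blocks = (UINT64_C(1) << counter_bits) - 1;
	return (CHACHA20_BLOCK_LEN * blocks);
}
```

For a 32-bit counter this gives 274877906880 bytes, which is just under
256 * 2^30.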

Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32122

(cherry picked from commit 42dcd39528c6188a259951e28bbad309234324e4)

crypto: Test all of the AES-CCM KAT vectors.

Previously, only test vectors which used the default nonce and tag
sizes (12 and 16, respectively) were tested.  This now tests all of
the vectors.  This exposed some additional issues around requests with
an empty payload (which wasn't supported) and an empty AAD (which
falls back to CIOCCRYPT instead of CIOCCRYPTAEAD).

- Make use of the 'ivlen' and 'maclen' fields for CIOGSESSION2 to
  test AES-CCM vectors with non-default nonce and tag lengths.

- Permit requests with an empty payload.

- Permit an input MAC for requests without AAD.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32121

(cherry picked from commit 668770dc7de2ec8b5f5edf71e09b8a404120f6fa)

cryptosoft: Fix support for variable tag lengths in AES-CCM.

The tag length is included as one of the values in the flags byte of
block 0 passed to CBC_MAC, so merely copying the first N bytes is
insufficient.
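
To see why truncation alone cannot work, here is a minimal sketch of the
block 0 flags byte as laid out in RFC 3610. The function name is
hypothetical and this is not the cryptosoft code; it only shows that the
tag length is an input to the MAC computation itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Flags byte of CCM block B_0 (RFC 3610):
 *   bit 6      Adata, set when the request carries AAD
 *   bits 5..3  (tag_len - 2) / 2
 *   bits 2..0  L - 1, where L = 15 - nonce_len
 */
uint8_t
ccm_b0_flags(unsigned nonce_len, unsigned tag_len, int have_aad)
{
	assert(nonce_len >= 7 && nonce_len <= 13);
	assert(tag_len >= 4 && tag_len <= 16 && tag_len % 2 == 0);

	unsigned L = 15 - nonce_len;
	return ((uint8_t)((have_aad ? 0x40 : 0) |
	    (((tag_len - 2) / 2) << 3) | (L - 1)));
}
```

With a 12-byte nonce and AAD present, a 16-byte tag yields 0x7a while an
8-byte tag yields 0x5a, so the two CBC-MAC computations start from
different first blocks and produce unrelated tags.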

To avoid adding more sideband data to the CBC MAC software context,
pull the generation of block 0, the AAD length, and AAD padding out of
cbc_mac.c and into cryptosoft.c. This matches how GCM/GMAC are
handled where the length block is constructed in cryptosoft.c and
passed as an input to the Update callback. As a result, the CBC MAC
Update() routine is now much simpler and simply performs the
XOR-and-encrypt step on each input block.

While here, avoid a copy to the staging block in the Update routine
when one or more full blocks are passed as input to the Update
callback.

Reviewed by: sef
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32120

(cherry picked from commit 4361c4eb6e3620e68d005c1671fdbf60b1fe83c6)

safexcel: Support truncated tags for AES-CCM.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32119

(cherry picked from commit 366ae4a000b1483390ddbf28e3dc420ebac894a0)

safexcel: Support multiple nonce lengths for AES-CCM.

Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32118

(cherry picked from commit 2ec2e4df094ba632e5e74268a8818f71903a4537)

ccr: Support AES-CCM requests with truncated tags.

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32117

(cherry picked from commit e148e407df5c8b1c83bcd44da9f4837d94431d02)

ccr: Support multiple nonce lengths for AES-CCM.

Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32116

(cherry picked from commit 3e6a97b3a7bc80b1c12dd7b5208bfe99019c42b4)

aesni: Support AES-CCM requests with a truncated tag.

Reviewed by: sef
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32115

(cherry picked from commit 655eb762c31044a791e8c8c6355515e7c89c07ef)

aesni: Permit AES-CCM requests with neither payload nor AAD.

Reviewed by: sef
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32114

(cherry picked from commit c09c379c7aa7337680ff3cb73691ce12d627128b)

aesni: Handle requests with an empty payload.

Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32113

(cherry picked from commit d718c2d3c805001db0b0ae0cc0c8a811b8a90a95)

aesni: Support multiple nonce lengths for AES-CCM.

Reviewed by: sef
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32112

(cherry picked from commit 8e6af6adfc2cc3d0ea89c20eaa5914e453c48b49)

crypto: Support multiple nonce lengths for AES-CCM.

Permit nonces of lengths 7 through 13 in the OCF framework and the
cryptosoft driver. A helper function (ccm_max_payload_length) can be
used in OCF drivers to reject CCM requests which are too large for the
specified nonce length.
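
The relationship between nonce length and payload limit follows from
RFC 3610: a nonce of N bytes leaves L = 15 - N bytes for the length
field, so the payload is bounded by 2^(8L) - 1 bytes. The sketch below
illustrates that bound only; the function name and the plain integer
argument are assumptions made for this example and do not reproduce the
in-tree helper's signature.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the largest CCM payload for a nonce length, or 0 if invalid. */
uint64_t
ccm_max_payload_sketch(unsigned nonce_len)
{
	if (nonce_len < 7 || nonce_len > 13)
		return (0);

	unsigned L = 15 - nonce_len;	/* width of the length field in bytes */
	assert(L >= 2 && L <= 8);
	if (L == 8)
		return (UINT64_MAX);	/* avoid shifting by the full width */
	return ((UINT64_C(1) << (8 * L)) - 1);
}
```

For example, a 13-byte nonce limits the payload to 65535 bytes, while
the default 12-byte nonce allows up to 16777215 bytes.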

Reviewed by: sef
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32111

(cherry picked from commit ae18720d2792287c9ec658404f1a3173014d4979)

cryptocheck: Support multiple IV sizes for AES-CCM.

By default, the "normal" IV size (12) is used, but it can be overridden
via -I. If -I is not specified and -z is specified, issue requests
for all possible IV sizes.

Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32110

(cherry picked from commit bcb0fd6accc095295765b08b02f5f3b07ea62536)

cryptodev: Allow some CIOCCRYPT operations with an empty payload.

If an operation would generate a MAC output (e.g. for digest operation
or for an AEAD or EtA operation), then an empty payload buffer is
valid. Only reject requests with an empty buffer for "plain" cipher
sessions.
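
The rule can be sketched as a small validity check. The enum, the
function name, and the choice of EINVAL as the rejection code are all
assumptions made for illustration; they are not taken from cryptodev.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy session modes standing in for the kinds of sessions named above. */
enum toy_mode { TOY_MODE_CIPHER, TOY_MODE_DIGEST, TOY_MODE_AEAD, TOY_MODE_ETA };

/* Returns 0 if a payload of 'len' bytes is acceptable for 'mode'. */
int
toy_check_payload_len(enum toy_mode mode, size_t len)
{
	switch (mode) {
	case TOY_MODE_CIPHER:
		/* A plain cipher produces no MAC, so nothing would be output. */
		return (len == 0 ? EINVAL : 0);
	case TOY_MODE_DIGEST:
	case TOY_MODE_AEAD:
	case TOY_MODE_ETA:
		/* A MAC is still generated, so an empty payload is valid. */
		return (0);
	}
	assert(0 && "unknown mode");
	return (EINVAL);
}
```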

Some of the AES-CCM NIST KAT vectors use an empty payload.

While here, don't advance crp_payload_start for requests that use an
empty payload with an inline IV. (*)

Reported by: syzbot+d4b94fbd9a44b032f428@syzkaller.appspotmail.com (*)
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32109

(cherry picked from commit a0cbcbb7917b0b8566ec0853425a73d7958ddbed)

cryptodev: Permit CIOCCRYPT for AEAD ciphers.

A request without AAD for an AEAD cipher can be submitted via
CIOCCRYPT rather than CIOCCRYPTAEAD.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32108

(cherry picked from commit 70dbebea124236184a66a30175ba307793971f00)

cryptodev: Permit explicit IV/nonce and MAC/tag lengths.

Add 'ivlen' and 'maclen' fields to the structure used for CIOGSESSION2
to specify the explicit IV/nonce and MAC/tag lengths for crypto
sessions. If these fields are zero, the default lengths are used.

This permits selecting an alternate nonce length for AEAD ciphers such
as AES-CCM which support multiple nonce lengths. It also supports
truncated MACs as input to AEAD or ETA requests.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32107

(cherry picked from commit 16676123fc85233334983e0071cb446357abec8d)

cryptosoft, ccr: Use crp_iv directly for AES-CCM and AES-GCM.

Rather than copying crp_iv to a local array on the stack that is then
passed to xform reinit routines, pass crp_iv directly and remove the
local copy.

Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32106

(cherry picked from commit 5ae5ed5b8fd2955378ab67ba127cad8c981678ab)

crypto: Permit variable-sized IVs for ciphers with a reinit hook.

Add a 'len' argument to the reinit hook in 'struct enc_xform' to
permit support for AEAD ciphers such as AES-CCM and Chacha20-Poly1305
which support different nonce lengths.

Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32105

(cherry picked from commit 1833d6042c9a0116e8a1198256fd8fbc99cb11ad)
(cherry picked from commit d586c978b9b4216869e589daa5bbcc33225a0e35)

ossl: Use crypto_cursor_segment().

Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30447

(cherry picked from commit 1c09320d5833fef8a4b6cc0091883fd47ea1eb1b)

cryptosoft: Use crypto_cursor_segment().

Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30446

(cherry picked from commit 86be314d09bc2857bb63d0a1e34945c63daa0008)

crypto: Add crypto_cursor_segment() to fetch both base and length.

This function combines crypto_cursor_segbase() and
crypto_cursor_seglen() into a single function. This is mostly
beneficial in the unmapped mbuf case where back to back calls of these
two functions have to iterate over the sub-components of unmapped
mbufs twice.
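
The saving can be illustrated with a toy cursor in user space. All
types and functions below are hypothetical stand-ins, not the OCF API;
the 'walks' counter models the cost of locating the current segment,
which is the part that is expensive for unmapped mbufs.

```c
#include <assert.h>
#include <stddef.h>

struct toy_segment {
	char	*base;
	size_t	len;
};

struct toy_cursor {
	const struct toy_segment *segs;
	size_t	nsegs;
	size_t	idx;	/* current segment */
	size_t	off;	/* offset within the current segment */
	unsigned walks;	/* how many times the segment was looked up */
};

/* Models the (potentially costly) walk to find the current segment. */
static const struct toy_segment *
toy_lookup(struct toy_cursor *cc)
{
	assert(cc->idx < cc->nsegs);
	cc->walks++;
	return (&cc->segs[cc->idx]);
}

/* Two separate accessors: each one repeats the walk. */
char *
toy_segbase(struct toy_cursor *cc)
{
	return (toy_lookup(cc)->base + cc->off);
}

size_t
toy_seglen(struct toy_cursor *cc)
{
	return (toy_lookup(cc)->len - cc->off);
}

/* Combined accessor: one walk yields both the base and the length. */
char *
toy_segment(struct toy_cursor *cc, size_t *len)
{
	const struct toy_segment *s = toy_lookup(cc);

	*len = s->len - cc->off;
	return (s->base + cc->off);
}
```

Calling toy_segbase() and toy_seglen() back to back costs two walks,
while toy_segment() returns the same pair after a single walk.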

Bump __FreeBSD_version for crypto drivers in ports.

Suggested by: markj
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30445

(cherry picked from commit beb817edfe22cdea91e19a60c42caabd9404da48)

crypto: Add a new type of crypto buffer for a single mbuf.

This is intended for use in KTLS transmit where each TLS record is
described by a single mbuf that is itself queued in the socket buffer.
Using the existing CRYPTO_BUF_MBUF would result in
bus_dmamap_load_crp() walking additional mbufs in the socket buffer
that are not relevant, but generating a S/G list that potentially
exceeds the limit of the tag (while also wasting CPU cycles).

Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30136

(cherry picked from commit 883a0196b629a07e52562b4103cc0f6391083080)

sglist: Add sglist_append_single_mbuf().

This function appends the contents of a single mbuf to an sglist
rather than an entire mbuf chain.

Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30135

(cherry picked from commit 6663f8a23e7cb60d798c5ffbd9c716b62b204f2a)

Support unmapped mbufs in crypto buffers.

Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30134

(cherry picked from commit 1c8f4b3c9f9e8ca5823d153d3b117246b3d18db4)

Rename m_unmappedtouio() to m_unmapped_uiomove().

This function doesn't only copy data into a uio but instead is a
variant of uiomove() similar to uiomove_fromphys().

Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30444

(cherry picked from commit aa341db39b6373c5e242f376a3cabe6a6b99141e)

Extend m_copyback() to support unmapped mbufs.

Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30133

(cherry picked from commit 3f9dac85cc8f2963026fdc2d5477acb607176a89)

Extend m_apply() to support unmapped mbufs.

m_apply() invokes the callback function separately on each segment of
an unmapped mbuf: the TLS header, individual pages, and the TLS
trailer.

Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30132

(cherry picked from commit 3c7a01d773ac2d128eabb596eed7098f76966cc5)

ccp, ccr: Simplify drivers to assume an AES-GCM IV length of 12.

While here, use crypto_read_iv() in a few more places in ccr(4) that I
missed previously.

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32104

(cherry picked from commit cb128893b92994456107d6ca722fdf6e5028eacc)

cryptodev: Use 'csp' in the handlers for requests.

- Retire cse->mode and use csp->csp_mode instead.
- Use csp->csp_cipher_algorithm instead of the ivsize when checking
  for the fixup for the IV length for AES-XTS.

Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32103

(cherry picked from commit b4e0a27c5be5090a9db16dd0ad417543b1fb0c4a)

cryptocheck: Expand the set of sizes tested by -z.

Test individual sizes up to the max encryption block length as well as
a few sizes that include 1 full block and a partial block before
doubling the size.

Reviewed by: cem, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29518

(cherry picked from commit c86de1dab8e65bc9d11501ca51f2e152276cb94e)

ossl: Don't encrypt/decrypt too much data for chacha20.

The loops for Chacha20 and Chacha20+Poly1305 which encrypted/decrypted
full blocks of data used the minimum of the input and output segment
lengths to determine the size of the next chunk ('todo') to pass to
Chacha20_ctr32().  However, the input and output segments could extend
past the end of the ciphertext region into the tag (e.g.  if a "plain"
single mbuf contained an entire TLS record).  If the length of the tag
plus the length of the last partial block together were at least as
large as a full Chacha20 block (64 bytes), then an extra block was
encrypted/decrypted overlapping with the tag.  Fix this by also
capping the amount of data to encrypt/decrypt by the amount of
remaining data in the ciphertext region ('resid').
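
A minimal sketch of the chunk-size calculation before and after the fix
is shown below. The function and macro names are hypothetical and the
code is a simplification for illustration, not the ossl(4) loop itself.

```c
#include <assert.h>
#include <stddef.h>

#define CHACHA20_BLOCK_LEN	64u

static size_t
min_size(size_t a, size_t b)
{
	return (a < b ? a : b);
}

/* Before: only the segment lengths bound the chunk of full blocks. */
size_t
chunk_uncapped(size_t inlen, size_t outlen)
{
	size_t todo = min_size(inlen, outlen);

	return (todo - todo % CHACHA20_BLOCK_LEN);
}

/* After: also cap by the bytes left in the ciphertext region ('resid'). */
size_t
chunk_capped(size_t inlen, size_t outlen, size_t resid)
{
	size_t todo = min_size(min_size(inlen, outlen), resid);

	assert(todo <= resid);
	return (todo - todo % CHACHA20_BLOCK_LEN);
}
```

Suppose a segment holds the last 50 bytes of ciphertext followed by a
16-byte tag, 66 bytes in total. The uncapped calculation selects a full
64-byte block that runs into the tag, while the capped one selects no
full block and leaves the 50 bytes to the partial-block path.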

Reported by: gallatin
Reviewed by: cem, gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29517

(cherry picked from commit d2e076c37b0963a8be89684a656c4e1640dc7a3e)

Add Chacha20+Poly1305 to the list of AEAD algorithms.

Sponsored by: Netflix

(cherry picked from commit c853c53d024a3cc950854dfaade7f50303c5a022)

ossl: Add support for the ChaCha20 + Poly1305 AEAD cipher from RFC 8439

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D28757

(cherry picked from commit 78991a93eb9dd3074a3fc19b88a7c3e34e1ec703)

poly1305: Don't export generic Poly1305_* symbols from xform_poly1305.c.

There currently isn't a need to provide a public interface to a
software Poly1305 implementation beyond what is already available via
libsodium's APIs and these symbols conflict with symbols shared within
the ossl.ko module between ossl_poly1305.c and ossl_chacha20.c.

Reported by: se, kp
Fixes: 78991a93eb9d
Sponsored by: Netflix

(cherry picked from commit bb6e84c988d3f54eff602ed544ceaa9b9fe3e9ff)

ossl: Add ChaCha20 cipher support.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D28756

(cherry picked from commit 92aecd1e6fac47ffc893f628c1fe289568bb19cb)

The ChaCha20 counter is little endian, not big endian.

Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D28755

(cherry picked from commit a899ce4ba4c404d342bf892b8b756b66fc65d6b5)

ossl: Add Poly1305 digest support.

Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D28754

(cherry picked from commit a079e38b08f2f07c50ba915dae66d099559abdcc)

cryptocheck: Free generated IV after each GMAC test.

Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D28753

(cherry picked from commit 442a293611461834778d1b7cd2ac170fb3427dcf)

cryptocheck: Add support for the Poly1305 digest.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D28758

(cherry picked from commit 68c03734484f679bf2f15fc81359128e331db364)

cryptosoft: Support per-op keys for AES-GCM and AES-CCM.

Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D28752

(cherry picked from commit a10020cfe2830e9626ac58ae97ecd12afb3553be)

cryptocheck: Add Chacha20-Poly1305 AEAD coverage.

- Make openssl_gcm_encrypt generic to AEAD ciphers (aside from CCM)
  and use it for Chacha20-Poly1305.

- Use generic AEAD control constants instead of GCM/CCM specific names.

Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27838

(cherry picked from commit 1bd9fc96d4e4a26bb0060698c07b6f13d19cd819)

Add an implementation of CHACHA20_POLY1305 to cryptosoft.

This uses the chacha20 IETF and poly1305 implementations from
libsodium. A separate auth_hash is created for the auth side whose
Setkey method derives the poly1305 key from the AEAD key and nonce as
described in RFC 8439.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27837

(cherry picked from commit dd2e1352b68aa33f7f6f8c19aaf88cf287013ae8)

Add an OCF algorithm for ChaCha20-Poly1305 AEAD.

Note that this algorithm implements the mode defined in RFC 8439.

Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27836

(cherry picked from commit fc8fc743d89388c0c5b97a491428fab2b36beac8)

contrib/tzdata: correct DST in Fiji

Direct commit to stable/13.

Unfortunately, there is still no clear consensus on the tz mailing list
about some of the changes introduced by tzdata 2021b and later releases.
Pending consensus, only merge the recently announced DST transition date
for Fiji and corrections to commentary from tzdata 2021d. This corrects
future timestamps in Fiji.

cxgbe/t4_tom: Use stale L2T entry and avoid busy-waiting for resolution.

Sponsored by: Chelsio Communications

(cherry picked from commit 53c17de2b472c5c4982d5a020268ad3098241498)

cxgbe(4): Fix the decode and display of the DBVFIFO region in meminfo.

Sponsored by: Chelsio Communications

(cherry picked from commit 92de737996660b70376a8b72b80037f89d876056)

cxgbe(4): Display HMA information in meminfo.

This should have been added with initial T6 support many years ago.

Sponsored by: Chelsio Communications

(cherry picked from commit 83a611e09238ead5a765c0ea2c02699fe8175756)

cxgbe(4): Initialize abs_id for ctrl and ofld queues.

Sponsored by: Chelsio Communications

(cherry picked from commit 76c890229628109e46f01c5037b773b59247a1f8)

cxgbetool(8): Update the register definitions used to decode regdump.

Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications

(cherry picked from commit 35e62b00c3342cffee042093b72a52f3f19e5263)

cxgbe(4): Skip a few more T5/T6 registers during a regdump.

These registers have read side effects and a read at just the right
(wrong?) time can trash some internal hw state.

Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications

(cherry picked from commit f13920b39b8b500a17fc276629d70828f9f2d4b1)

cxgbe(4): Update firmwares to 1.26.2.0.

The firmwares and the following changelog are from the "Chelsio Unified
Wire v3.15.0.0 for Linux."

Version : 1.26.2.0
Date    : 09/24/2021
====================

FIXES
-----

BASE:
- Added support for SFP+ RJ45 (0x1C).
- Fixing backward compatibility issue with older drivers when multiple
  speeds are passed to firmware.

OFLD:
- Do not touch tp_plen_max if driver is supplying tp_plen_max. This
  fixes a connection reset issue in iscsi.

ENHANCEMENTS
------------

BASE:
- Firmware header modified to add firmware binary signature.

Sponsored by: Chelsio Communications

(cherry picked from commit 45d6fbaec23eee457197a14517e715c947114d99)

cxgbe(4): Update firmwares to 1.26.0.0.

Changes since 1.25.6.0 are listed here. This list comes from the
Release Notes for "Chelsio Unified Wire 3.14.0.4 for Linux" dated
2021-07-08.

Fixes
-----

BASE:
- Wait 5ms before and after the i2c command that clears the mod_select.
  This fixes incorrect port module type read from i2c.

Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications

(cherry picked from commit 3c900106ea7aab69690945ad885b4df1095c1504)

cxgbe(4): Do not configure traffic classes automatically on attach.

The driver used to configure all available classes with some default
parameters on attach and the rest of t4_sched.c was written with the
assumption that all traffic classes are always valid in the hardware.
But this resulted in a lot of informational messages being logged in the
firmware's circular log, crowding out other more useful messages.

This change leaves the tx scheduler alone during attach to reduce the
spam in the devlog. The state of every class is now tracked separately
from its flags and there is support for an 'uninitialized' state.

Sponsored by: Chelsio Communications

(cherry picked from commit ec8004dd419d8c8acfc9025dd050f141c949d53a)

cxgbe(4): Get the number of usable traffic classes from the firmware.

Recent firmwares are able to utilize the traffic classes of tx channels
that were previously unused. This effectively doubles the number of
traffic classes available per port for 2 port cards. Stop using the raw
per-channel value in the driver and ask the firmware for the number of
usable traffic classes instead.

Sponsored by: Chelsio Communications

(cherry picked from commit 6beb67c7e0ad4c3f8277ed1122ef5efcde0a269c)

cxgbe/iw_cxgbe: Support for 512 SGL entries in one memory registration.

Use the correct SGL limit within iw_cxgbe, firmwares >= 1.25.6.0 support
up to 512 entries per MR.

Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications

(cherry picked from commit 211972cfb816f8da8b8a4c524b44dde4638c3288)

cxgbe(4): Check if the firmware supports 512 SGL per FR MR.

Firmwares >= 1.25.6.0 support 512 SGL entries in a single memory
registration request.

Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications

(cherry picked from commit db15dbf8801120241b7bfb6461341f2cb421ef8e)

cxgbe(4): Update firmwares to 1.25.6.0.

Changes since 1.25.0.0 are listed here.  This list comes from the
Release Notes for the "Chelsio Unified Wire v3.14.0.3 for Linux"
release dated 2021-05-21.

Fixes
-----

BASE:
- Fixed Back to back T6 100G-CR4 link coming up with NO FEC sometimes.
- [T5] Try to bring up link in 1G speed if link doesn't come up on 10G.
- Fixed a bug to not allow BaseR fec in 100G speed.
- Fixed linkup issues on BT adapter in 1G and 100M speed.
- Fixed an issue to allow driver to send VI_ENABLE multiple times (once
  with rx disable and then later rx enable).
- Fixed rate limiting not working on class number 16 to 30.
- Fixed backward compatibility issue in port type interpretation with vpd
  version 0x80.

ETH:
- Fixed a case when firmware failed to deliver NIC WR completion to host.
- No rate limit support for WR ETH_TX_PKTS2 due to performance reasons.

OFLD
- Fixed a connection hang in SO adapters when tp_plen_max (set by driver)
  is more than the window size.
- Added fw_filter_vnic_mode to firmware API file (t4fw_interface.h)
- Use correct rx channel in coprocessor crypto completion (CPL_FW6_PLD). This
  was causing out of order completion to host.

FOiSCSI
- Fixed a crash due to unaligned access of ipv6 address.
- Fixed a crash during lun reset.

Enhancements
------------

ETH:
- Rate limiting support added for encapsulated (vxlan, nvgre, geneve) NIC TCP
  packets.

OFLD:
- More than 128 SGLs supported in FW_RI_FR_NSMR_WR. Now, more than 16GB
  (up to 64GB) of PBLs can be written with a single FW_RI_FR_NSMR_WR.

Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications

(cherry picked from commit e0fa04e257c1af4c793a70a124ba41e592570c14)

cxgb(4): Report proper TSO limits.

Sponsored by: Chelsio Communications

(cherry picked from commit f13d72fd0b743a1fd97dd31f4abf19e8814c420b)

cxgbe(4): Fix an incorrect assert.

CTRL and OFLD tx queues do not have automatic tx credit flush enabled so
it is okay for the cidx not to be the same as the pidx when the queue is
destroyed.

Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications

(cherry picked from commit 5ef87bf8b687575bee010967e23cd2c552b43ad9)

cxgbe(4): Empty the clib_db before trying to destroy it.

This fixes a panic on driver unload.

Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications

(cherry picked from commit bb877c0620347eb86f25f4382c42d58685c348d4)

cxgbe(4): Use correct argument in call to hashdestroy.

This fixes a panic on driver module unload.

Fixes: 24b98f288d11 cxgbe(4): Overhaul CLIP (Compressed Local IPv6) table management.
Sponsored by: Chelsio Communications

(cherry picked from commit 740d722def71905dd74a622acce1561701ccbec6)

cxgbetool(8): add a 'clip' subcommand to deal with the CLIP table.

Sponsored by: Chelsio Communications

(cherry picked from commit ac02945f7e2b5ab84fe510fc052c35350e31220d)

cxgbe(4): Overhaul CLIP (Compressed Local IPv6) table management.

- Process the list of local IPs once instead of once per adapter.  Add
  addresses from all VNETs to the driver's list but leave hardware
  updates for later when the global VNET/IFADDR list locks have been
  released.

- Add address to the hardware table synchronously when a CLIP entry is
  requested for an address that's not already in there.

- Provide ioctls that allow userspace tools to manage addresses in the
  CLIP table.

- Add a knob (hw.cxgbe.clip_db_auto) that controls whether local IPs are
  automatically added to the CLIP table or not.

Sponsored by: Chelsio Communications

(cherry picked from commit 24b98f288d11750f2cdfbfe360be1c92a9c2ee1d)

cxgbe(4): Fix build warnings with NOINET kernels.

MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D26334

(cherry picked from commit ffbb373c5a95c37be693330a76a093fbcf546440)

cxgbe(4): Add support for NIC suspend/resume and live reset.

Add suspend/resume callbacks to the driver and a live reset built around
them.  This commit covers the basic NIC and future commits will expand
this functionality to other stateful parts of the chip.  Suspend and
resume operate on the chip (the t?nex nexus device) and affect all its
ports.  It is not possible to suspend/resume or reset individual ports.
All these operations can be performed on a running NIC.  A reset will
look like a link bounce to the networking stack.

Here are some ways to exercise this functionality:

/* Manual suspend and resume. */
# devctl suspend t6nex0
# devctl resume t6nex0

/* Manual reset. */
# devctl reset t6nex0

/* Manual reset with driver sysctl. */
# sysctl dev.t6nex.0.reset=1

/* Automatic adapter reset on any fatal error. */
# hw.cxgbe.reset_on_fatal_err=1

Suspend disables the adapter (DMA, interrupts, and the port PHYs) and
marks the hardware as unavailable to the driver.  All ifnets associated
with the adapter are still visible to the kernel but operations that
require hardware interaction will fail with ENXIO.  All ifnets report
link-down while the adapter is suspended.

Resume will reattach to the card, reconfigure it as before, and recreate
the queues servicing the existing ifnets.  The ifnets are able to send
and receive traffic as soon as the link comes back up.

Reset is roughly the same as a suspend and a resume with at least one of
these events in between: D0->D3Hot->D0, FLR, PCIe link retrain.

(cherry picked from commit 83b5cda106a2dc0c8ace1718485c2ef05c5aa62b)

cxgbe(4): Separate the sw- and hw-specific parts of resource allocations

The driver uses both software resources (locks, callouts, memory for
descriptors and for bookkeeping, sysctls, etc.) and hardware resources
(VIs, DMA queues, TCAM entries, etc.) to operate the NIC. This commit
splits the single *_ALLOCATED flag used to track all these resources
into separate *_SW_ALLOCATED and *_HW_ALLOCATED flags.

This is the simplified pseudocode that now applies to most queues (foo
can be ctrlq/txq/rxq/ofld_txq/ofld_rxq):

/* Idempotent */
alloc_foo
{
        if (!SW_ALLOCATED)
                init_iq/init_eq/init_fl         no-fail sw init
                alloc_iq_fl/alloc_eq/alloc_wrq  may-fail sw alloc
                add_foo_sysctls, etc.           no-fail post-alloc items
        if (!HW_ALLOCATED)
                alloc_iq_fl_hwq/alloc_eq_hwq    hw resource allocation
}

/* Idempotent */
free_foo
{
        if (HW_ALLOCATED)
                free_iq_fl_hwq/free_eq_hwq      release hw resources
        if (SW_ALLOCATED)
                free_iq_fl/free_eq/free_wrq     release sw resources
}

The routines that take the driver to FULL_INIT_DONE and VI_INIT_DONE and
back are now all idempotent. The quiesce routines pay attention to the
HW_ALLOCATED flag and will not wait on the hardware for pidx/cidx
updates and other completions if this flag is not set.
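
As an illustration only, the idempotent alloc/free pattern described above
can be sketched in standalone C.  Everything here is hypothetical (struct
foo, its fields, and the fake hardware id counter); it is not the driver's
code and only mirrors the shape of the two-flag scheme.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical queue; the fields are illustrative, not the driver's. */
struct foo {
	bool sw_allocated;	/* stands in for *_SW_ALLOCATED */
	bool hw_allocated;	/* stands in for *_HW_ALLOCATED */
	void *desc;		/* software resource: descriptor memory */
	int hw_id;		/* hardware resource: a pretend queue id */
};

static int next_hw_id = 1;	/* pretend hardware id allocator */

/* Idempotent: each half is skipped if its flag is already set. */
int
alloc_foo(struct foo *q)
{
	if (!q->sw_allocated) {
		q->desc = malloc(64);		/* may-fail sw alloc */
		if (q->desc == NULL)
			return (-1);
		q->sw_allocated = true;
	}
	if (!q->hw_allocated) {
		q->hw_id = next_hw_id++;	/* pretend hw allocation */
		q->hw_allocated = true;
	}
	return (0);
}

/* Idempotent: release hardware first, then software, only if held. */
void
free_foo(struct foo *q)
{
	if (q->hw_allocated) {
		q->hw_id = 0;
		q->hw_allocated = false;
	}
	if (q->sw_allocated) {
		free(q->desc);
		q->desc = NULL;
		q->sw_allocated = false;
	}
}
```

Calling alloc_foo() or free_foo() twice in a row is harmless in this
sketch: the second call finds both flags already in the desired state and
does nothing.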

Sponsored by: Chelsio Communications

(cherry picked from commit 43bbae19483fbde0a91e61acad8a6e71e334c8b8)

pf: Introduce pf_nvbool()

Similar to the existing functions for strings and ints, this lets us
simplify some of the nvlist conversion code.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 776df104fa54bb581e1fb88ac44af4fa7fd4052b)

bhyve: Update usage and synopsis for the -k flag

Let's make it clear to users that -k is for configuration files.
Also, point to bhyve_config(5) in the paragraph describing the flag.

Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D32467

(cherry picked from commit f656df586ad1aa86a2b9f076a224f746d5308613)

nfscl: Fix another deadlock related to the NFSv4 clientID lock

Without this patch, it is possible to hang the NFSv4 client,
when a rename/remove is being done on a file where the client
holds a delegation, if pNFS is being used.  For a delegation
to be returned, dirty data blocks must be flushed to the NFSv4
server.  When pNFS is in use, a shared lock on the clientID
must be acquired while doing a write to the DS(s).
However, if rename/remove is doing the delegation return, an exclusive
lock will be acquired on the clientID, preventing the write to the
DS(s) from acquiring a shared lock on the clientID.

This patch stops rename/remove from doing a delegation return
if pNFS is enabled.  Since doing delegation return in the same
compound as rename/remove is only an optimization, not doing
so should not cause problems.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

(cherry picked from commit b82168e657d378ff86ea18c4f03b98aac9ee9bb3)

nfscl: Fix a deadlock related to the NFSv4 clientID lock

Without this patch, it is possible for a process doing an NFSv4
Open/create of a file to block, while it already holds a shared
lock on the clientID, in order to let another process acquire
the exclusive lock on the clientID. Both processes then
deadlock: one waits for the exclusive lock while the other
holds the shared lock. This deadlock is unlikely to occur
unless delegations are in use on the NFSv4 mount.

This patch fixes the problem by not deferring to the process
waiting for the exclusive lock when a shared lock (reference cnt)
is already held by the process.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

(cherry picked from commit 120b20bdf49630cf2a7dbc5f93b9e985e1f4f198)

Revert "libc/locale: Fix races between localeconv(3) and setlocale(3)"

This reverts commit f89204d6b99d11aa1f67722e8c1d33b0fc4d61d7.

I didn't intend to push this commit yet, pending discussion on PR
258360.

PR: 258360

mount: Check for !VDIR mount points before handling -o emptydir

To implement -o emptydir, vfs_emptydir() checks that the passed
directory is empty. This should be done after checking whether the
vnode is of type VDIR, though, or vfs_emptydir() may end up calling
VOP_READDIR on a non-directory.

Reported by: syzbot+4006732c69fb0f792b2c@syzkaller.appspotmail.com
Reviewed by: kib, imp
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 03d5820f738de130b2feb66833f18741b7f92a14)

libc/locale: Fix races between localeconv(3) and setlocale(3)

Each locale embeds a lazily initialized lconv which is populated by
localeconv(3) and localeconv_l(3).  When setlocale(3) updates the global
locale, the lconv needs to be (lazily) reinitialized.  To signal this,
we set flag variables in the locale structure.  There are two problems:

- The flags are set before the locale is fully updated, so a concurrent
  localeconv() call can observe partially initialized locale data.
- No barriers ensure that localeconv() observes a fully initialized
  locale if a flag is set.

So, move the flag update appropriately, and use acq/rel barriers to
provide some synchronization.  Note that this is inadequate in the face
of multiple concurrent calls to setlocale(3), but this is not expected
to work regardless.
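
A minimal sketch of the ordering described above, assuming C11
<stdatomic.h> (the committed change may use different primitives).  The
struct and function names are hypothetical stand-ins for the locale
structure, setlocale(3) and localeconv(3); the sketch assumes one updater
and one reader at a time.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a locale and its lazily built lconv. */
struct fake_locale {
	int decimal_point;	/* "locale data" written by the updater */
	int cached_point;	/* lazily rebuilt "lconv" field */
	atomic_int changed;	/* nonzero: the cache must be rebuilt */
};

/* Updater: finish writing the data, then publish the flag (release). */
void
fake_setlocale(struct fake_locale *l, int point)
{
	l->decimal_point = point;
	atomic_store_explicit(&l->changed, 1, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store above, so a
 * reader that sees the flag set also sees the fully written data.
 */
int
fake_localeconv(struct fake_locale *l)
{
	if (atomic_load_explicit(&l->changed, memory_order_acquire)) {
		l->cached_point = l->decimal_point;
		atomic_store_explicit(&l->changed, 0, memory_order_release);
	}
	return (l->cached_point);
}
```

Setting the flag before the data is written, or loading it without
acquire semantics, would reintroduce the two problems listed above.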

Thanks to Henry Hu <henry.hu.sh@gmail.com> for providing a test case
demonstrating the race.

PR: 258360
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 7eb138a9e53636366e615bdf04062fedc044bcea)

rtld direct exec: add -d option

(cherry picked from commit ba7f9c1b61329630af25e75cdaca261b389318c7)

sysdecode_enum.3: Fix a typo: SIGBTRAP -> SIGTRAP.

Sponsored by: DARPA

(cherry picked from commit 680d70b59e0379ded0cc94e3772bc47be2163c7f)

Add EPOCH_TRACE to NOTES to get LINT coverage.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit b9485d76e3ad4924032a23c82b8a30a0dce31918)

Document kern.log_wakeups_per_second.

PR: 148680

(cherry picked from commit c51e4962a3cf2959d1f1cb9ab74ceab448583169)

Remove 'make update'.

In the CVS days this used to be a wrapper around either CVS or CVSup and
used to support updating src, doc, and ports checkouts. With the move
to Subversion this only supported updating src and was itself a
wrapper around 'svn update'. With Git, users are probably better off
using appropriate Git commands directly to update without needing an
explicit make target as a wrapper.

Reviewed by: bcr, imp, emaste
Differential Revision: https://reviews.freebsd.org/D30736

(cherry picked from commit e290182bcf3895ca659dff111bca6a077c4708b1)

config: Fix typo in comment.

(cherry picked from commit bcaa6aa15383cacf5f20179be919bb8dd45cc5f4)

nfscl: Add a check for "has acquired a delegation" to nfscl_removedeleg()

Commit 5e5ca4c8fc53 added a flag to a NFSv4 mount point that is set when
the first delegation is acquired from the NFSv4 server.

For a common case where delegations are not being issued by the
NFSv4 server, the nfscl_removedeleg() code acquires the mutex lock for
open/lock state, finds the delegation list empty, then just unlocks the
mutex and returns. This patch adds a check of the flag to avoid the
need to acquire the mutex for this common case.

This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
in the common case where the server is not issuing delegations.
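
The fast path described above can be sketched in userspace C as follows.
This is an assumption-laden illustration, not the nfscl code: struct
fake_mount, its fields, and the lock_count instrumentation are all
hypothetical, and a pthread mutex stands in for the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mount point state. */
struct fake_mount {
	atomic_bool has_deleg;	/* set when the first delegation arrives */
	pthread_mutex_t lock;	/* protects ndeleg */
	int ndeleg;		/* number of delegations held */
	int lock_count;		/* demo only: locks taken by the remove path */
};

void
fake_mount_init(struct fake_mount *m)
{
	atomic_init(&m->has_deleg, false);
	pthread_mutex_init(&m->lock, NULL);
	m->ndeleg = 0;
	m->lock_count = 0;
}

void
fake_add_deleg(struct fake_mount *m)
{
	pthread_mutex_lock(&m->lock);
	m->ndeleg++;
	atomic_store(&m->has_deleg, true);
	pthread_mutex_unlock(&m->lock);
}

/* Returns true if a delegation was removed. */
bool
fake_remove_deleg(struct fake_mount *m)
{
	bool removed = false;

	/* Fast path: no delegation was ever acquired, skip the mutex. */
	if (!atomic_load(&m->has_deleg))
		return (false);

	pthread_mutex_lock(&m->lock);
	m->lock_count++;
	if (m->ndeleg > 0) {
		m->ndeleg--;
		removed = true;
	}
	pthread_mutex_unlock(&m->lock);
	return (removed);
}
```

On a mount where the server never issues delegations, fake_remove_deleg()
returns without touching the mutex at all, which is the contention the
commit aims to avoid.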

This commit should not affect the high level semantics of delegation
handling.

(cherry picked from commit 62c5be4ab4c8b8127185286e148638cb8cdf45f4)

selsocket: handle sopoll() errors correctly

Without this change, unmounting smbfs filesystems with an INVARIANTS
kernel would panic after 10e64782ed59727e8c9fe4a5c7e17f497903c8eb.

PR: 253079
Found by: markj
Reviewed by: markj, jhb
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D32492

(cherry picked from commit 04c91ac48ad13ce0d1392cedbd69c2c0223d206f)

makesyscalls.lua: add a CAPENABLED flag

The CAPENABLED flag indicates that the syscall can be used in capsicum
capability mode. It is intended to replace capabilities.conf.

Reviewed by: kevans, emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D31349

(cherry picked from commit 6945df3fff57a9606f8c8a4e3865def3a0e915e7)

makesyscalls.lua: Add a new syscall type: RESERVED

RESERVED syscall numbers are reserved for local/vendor use. RESERVED is
identical to UNIMPL except that comments are ignored.

Reviewed by: kevans
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27988

(cherry picked from commit 119fa6ee8a8056aab5e4ab1719d3c563cdb4a95a)

ciss(4): Fix typo.

(cherry picked from commit 5f8cb13cfb0c91a4ec1a9648a3ae245b1dff36f6)

ciss(4): Properly handle data underrun.

For SCSI, a data underrun is a normal occurrence and should not be
reported as an error. This fixes MODE SENSE used by modern CAM.

MFC after: 1 month

(cherry picked from commit e8144a13e075ff13c1f162690c7f14dd3f0a4862)

cam(4): Limit search for disks in SES enclosure by single bus

At least for SAS, which is all we currently support, disks are
typically connected to the same bus as the enclosure. Limiting the
search scope makes it much faster on systems with multiple buses and
thousands of disks.

Reviewed by: imp
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D32305

(cherry picked from commit 730ea72c706ef8e025616772cfd86fd89ed3d42e)

cam(4): Improve XPT_DEV_MATCH

Remove the *_MATCH_NONE enums, which made no sense and so were never
used. Make the *_MATCH_ANY enums 0 (no match flags set), the value
previously used by *_MATCH_NONE. Bump CAM_VERSION to 0x1a to reflect
those changes and add compat shims.

When traversing buses and devices, do not descend if we can already
see that the requested pattern does not match the bus or device.
This saves a significant amount of time on systems with thousands
of disks when doing limited searches.

Reviewed by: imp
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D32304

(cherry picked from commit 8f9be1eed11c27c66386c3d72cd6c6aef597fa0d)

Change lowest address on subnet (host 0) not to broadcast by default.

The address with a host part of all zeros was used as a broadcast long
ago, but the default has been all ones since 4.3BSD and RFC1122.  Until
now, we would broadcast the host zero address as well as the configured
address.  Change to not broadcasting that address by default, but add a
sysctl (net.inet.ip.broadcast_lowest) to re-enable it.  Note that the
correct way to use the zero address for broadcast would be to configure
it as the broadcast address for the network.

See https://datatracker.ietf.org/doc/draft-schoen-intarea-lowest-address/
and the discussion in https://reviews.freebsd.org/D19316.  Note, Linux
now implements this.
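
For illustration, the two candidate broadcast addresses of a subnet and
the effect of the new default can be worked through in a few lines of C.
The helper names and the is_broadcast() predicate are hypothetical and
deliberately simplified (no special handling of very short host parts);
they are not the kernel's implementation.  Addresses are IPv4 values in
host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host part all zeros: the "lowest address" (host 0) of the subnet. */
uint32_t
lowest_addr(uint32_t addr, uint32_t mask)
{
	return (addr & mask);
}

/* Host part all ones: the standard broadcast address. */
uint32_t
highest_addr(uint32_t addr, uint32_t mask)
{
	return ((addr & mask) | ~mask);
}

/*
 * Hypothetical predicate: dst is a broadcast if it equals the
 * configured broadcast address, or if it is host 0 and the
 * broadcast_lowest knob (modelled as a plain bool) is enabled.
 */
bool
is_broadcast(uint32_t dst, uint32_t ifaddr, uint32_t mask,
    uint32_t configured_bcast, bool broadcast_lowest)
{
	if (dst == configured_bcast)
		return (true);
	if (broadcast_lowest && dst == lowest_addr(ifaddr, mask))
		return (true);
	return (false);
}
```

For 192.0.2.10/24 (0xC000020A with mask 0xFFFFFF00) the lowest address is
192.0.2.0 and the highest is 192.0.2.255.  With broadcast_lowest false,
only the configured broadcast address matches; configuring 192.0.2.0 as
the broadcast address makes it match regardless of the knob.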

Reviewed by: rgrimes, tuexen; melifaro (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D31861

(cherry picked from commit fd0765933c3ccb059ad7456e657b2e8ed22f58b0)

Fix two typos in source code comments

- s/alocated/allocated/
- s/realocated/reallocated/

(cherry picked from commit 899a3b38f5172d70360396caeebb5b694638282e)