Alexander Motin [Mon, 5 Apr 2021 14:34:40 +0000 (10:34 -0400)]
Set PCIe device's Max_Payload_Size to match PCIe root's.
Usually on boot the MPS is already configured by BIOS. But we've
found that on hot-plug it is not true at least for our Supermicro
X11 boards. As result, mismatch between root's configuration of
256 bytes and device's default of 128 bytes cause problems for some
devices, while others seem to work fine.
Rick Macklem [Tue, 20 Apr 2021 00:51:07 +0000 (17:51 -0700)]
nfsd: fix stripe size reply for the File Layout pNFS server
At a recent testing event I found out that I had misinterpreted
RFC5661 where it describes the stripe size in the File Layout's
nfl_util field. This patch fixes the pNFS File Layout server
so that it returns the correct value to the NFSv4.1/4.2 pNFS
enabled client.
This affects almost no one, since pNFS server configurations
are rare and the extant pNFS aware NFS clients seemed to
function correctly despite the erroneous stripe size.
It *might* be needed for correct behaviour if a recent
Linux client mounts a FreeBSD pNFS server configuration
that is using File Layout (non-mirrored configuration).
Mark Johnston [Mon, 26 Apr 2021 18:53:16 +0000 (14:53 -0400)]
imgact_elf: Ensure that the return value in parse_notes is initialized
parse_notes relies on the caller-supplied callback to initialize "res".
Two callbacks are used in practice, brandnote_cb and note_fctl_cb, and
the latter fails to initialize res. Fix it.
In the worst case, the bug would cause the inner loop of check_note to
examine more program headers than necessary, and the note header usually
comes last anyway.
Reviewed by: kib
Reported by: KMSAN
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29986
Kevin Bowling [Mon, 19 Apr 2021 02:11:27 +0000 (19:11 -0700)]
e1000: Add support for [Tiger, Alder, Meteor] Lake
Add support for current and future client platform PCI IDs. These are
all I219 variants and have no known driver changes versus previous
generation client platform I219 variants.
Commit 0a1fdb867c72 changed the internal KAPI between the krpc
and NFS. As such, the krpc, nfscommon and nfscl modules must
all be rebuilt from sources.
Rick Macklem [Sun, 11 Apr 2021 21:34:57 +0000 (14:34 -0700)]
nfsv4 client: do the BindConnectionToSession as required
During a recent testing event, it was reported that the NFSv4.1
server erroneously bound the back channel to a new TCP connection.
RFC5661 specifies that the fore channel is implicitly bound to a
new TCP connection when an RPC with Sequence (almost any of them)
is done on it. For the back channel to be bound to the new TCP
connection, an explicit BindConnectionToSession must be done as
the first RPC on the new connection.
Since new TCP connections are created by the "reconnect" layer
(sys/rpc/clnt_rc.c) of the krpc, this patch adds an optional
upcall done by the krpc whenever a new connection is created.
The patch also adds the specific upcall function that does a
BindConnectionToSession and configures the krpc to call it
when required.
This is necessary for correct interoperability with NFSv4.1
servers when the nfscbd daemon is running.
If doing NFSv4.1 mounts without this patch, it is
recommended that the nfscbd daemon not be running and that
the "pnfs" mount option not be specified.
There is probably a PR for this, but I can't find this, or remember who
submitted it. The patch got lost in the noise of another that wasn't
ready to commit.
Mark Johnston [Tue, 27 Apr 2021 00:04:25 +0000 (20:04 -0400)]
aesni: Avoid modifying session keys in hmac_update()
Otherwise aesni_process() is not thread-safe for AES+SHA-HMAC
transforms, since hmac_update() updates the caller-supplied key directly
to create the derived key. Use a buffer on the stack to store a copy of
the key used for computing inner and outer digests.
This is a direct commit to stable/12 as the bug is not present in later
branches.
iwnstats: fix build with clang and allow install under /usr/local/sbin
iwnstats was not compiling because of some issues raised by the clang
compiler due to -Werror. As a tool it is not connected to world build.
Add missing field "barker_mrc" initialization in struct
iwn_sensitivity_limits for -Wmissing-field-initializers, remove unused
pointer *is on iwn_stats_*_print functions and unused variables for
-Wunused-parameter and -Wunused-variable.
The value for field "barker_mrc" of struct iwn2030_sensitivity_limits
was obtained from linux 3.2 wireless/iwlwifi driver code (iwl-2000.c:115
.barker_corr_th_min_mrc = 390).
Also set BINDIR in Makefile to make it possible to install under
/usr/local/sbin/iwnstats as it require super user.
Reviewed by: adrian
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29800
Alexander Motin [Tue, 13 Apr 2021 15:19:10 +0000 (11:19 -0400)]
Fix race in case of device destruction.
During device destruction it is possible that open() succeed, but
fdevname() return NULL, that can't be assigned to string variable.
Fix that by adding explicit NULL check.
Also while there switch from fdevname() to fdevname_r().
"zgrep --version" is expected to print the version information in the
same way as "zgrep -V". However, the case handling the --version flag
is never reached, so "zgrep --version" prints:
zgrep: missing pattern
instead of:
grep (BSD grep, GNU compatible) 2.6.0-FreeBSD
Rick Macklem [Sat, 10 Apr 2021 22:50:25 +0000 (15:50 -0700)]
nfsd: fix replies from session cache for multiple retries
Recent testing of network partitioning a FreeBSD NFSv4.1
server from a Linux NFSv4.1 client identified problems
with both the FreeBSD server and Linux client.
Commit 05a39c2c1c18 fixed replying with the cached reply in
in the session slot if same session slot sequence#.
However, the code uses the reply and, as such,
will fail for a subsequent retry of the RPC.
A subsequent retry would be an extremely rare event,
but this patch fixes this, so long as m_copym(..M_NOWAIT)
does not fail, which should also be a rare event.
This fix affects the exceedingly rare case where a NFSv4
client retries a non-idempotent RPC, such as a lock
operation, multiple times. Note that retries only occur
after the client has needed to create a new TCP connection,
with a new TCP connection for each retry.
Alexander Motin [Sat, 17 Apr 2021 14:41:35 +0000 (10:41 -0400)]
mpt(4): Remove incorrect S/G segments limits.
First, two of those four checks are unreachable.
Second, I don't believe there should be ">=" instead of ">".
Third, bus_dma(9) already returns the same EFBIG if ">".
This fixes false I/O errors in worst S/G cases with maxphys >= 2MB.
Alexander Motin [Fri, 16 Apr 2021 19:39:01 +0000 (15:39 -0400)]
pms(4): Limit maximum I/O size to 256KB instead of 1MB.
There is a weird limit of AGTIAPI_MAX_DMA_SEGS (128) S/G segments per
I/O since the initial driver import. I don't know why it was added,
can only guess some hardware limitation, but in worst case it means
maximum I/O size of 508KB. Respect it to be safe, rounding to 256KB.
Rick Macklem [Thu, 8 Apr 2021 21:04:22 +0000 (14:04 -0700)]
nfsd: fix replies from session cache for retried RPCs
Recent testing of network partitioning a FreeBSD NFSv4.1
server from a Linux NFSv4.1 client identified problems
with both the FreeBSD server and Linux client.
The FreeBSD server failec to reply using the cached
reply in the session slot when an RPC was retried on
the session slot, as indicated by same slot sequence#.
This patch fixes this. It should also fix a similar
failure for NFSv4.0 mounts, when the sequence# in
the open/lock_owner requires a reply be done from
an entry locked into the DRC.
This fix affects the fairly rare case where a NFSv4
client retries a non-idempotent RPC, such as a lock
operation. Note that retries only occur after the
client has needed to create a new TCP connection.
Improve size readability.
Preserve more space for swap devise names.
Prevent line overflow with long devise name.
Don't draw a bar when swap is not used at all.
Simplify and optimize code.
Change the label to end at end of 100%.
PR: 251655
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D27496
Mark Johnston [Wed, 21 Apr 2021 18:50:48 +0000 (14:50 -0400)]
safexcel: Fix the SHA-HMAC digest computation when AAD is present
The driver would fail to include the AAD in the input stream, resulting
in incorrect digests for requests combining SHA-HMAC with AES-CBC or
-CTR. Ensure that the AAD is included in the processor's input stream,
and fix the corresponding instruction sequence to include the AAD as
input to the digest computation.
This is a direct commit to stable/12 since the bug was introduced while
merging there and is not present in later branches.
When we find a state for packets that was created by a reply-to rule we
still need to process the packet. The state may require us to modify the
packet (e.g. in rdr or nat cases), which we won't do with the shortcut.
pf: change pf_route so pf only runs when packets enter and leave the stack.
before this change pf_route operated on the semantic that pf runs
when packets go over an interface, so when pf_route changed which
interface the packet was on it would run pf_test again. this change
changes (restores) the semantic that pf is only supposed to run
when packets go in or out of the network stack, even if route-to
is responsibly for short circuiting past the network stack.
just to be clear, for normal packets (ie, those not touched by
route-to/reply-to/dup-to), there isn't a difference between running
pf when packets enter or leave the stack, or having pf run when a
packet goes over an interface.
the main reason for this change is that running the same packet
through pf multiple times creates confusion for the state table.
by default, pf states are floating, meaning that packets are matched
to states regardless of which interface they're going over. if a
packet leaving on em0 is rerouted out em1, both traversals will end
up using the same state, which at best will make the accounting
look weird, or at worst fail some checks in the state and get
dropped.
another reason for this commit is is to make handling of the changes
that route-to makes consistent with other changes that are made to
packet. eg, when nat is applied to a packet, we don't run pf_test
again with the new addresses.
the main caveat with this diff is you can't have one rule that
pushes a packet out a different interface, and then have a rule on
that second interface that NATs the packet. i'm not convinced this
ever worked reliably or was used much anyway, so we don't think
it's a big concern.
discussed with many, with special thanks to bluhm@, sashan@ and
sthen@ for weathering most of that pain.
ok claudio@ sashan@ jmatthew@
Rick Macklem [Mon, 5 Apr 2021 01:15:54 +0000 (18:15 -0700)]
nfsd: make the server repeat CB_RECALL every couple of seconds
Commit 01ae8969a9ee stopped the NFSv4.1/4.2 server from implicitly
binding the back channel to a new TCP connection so that it
conforms to RFC5661, for NFSv4.1/4.2. An effect of this
for the Linux NFS client is that it will do a
BindConnectionToSession when it sees NFSV4SEQ_CBPATHDOWN
set in a sequence reply. This will fix the back channel, but the
first attempt at a callback like CB_RECALL will already have
failed. Without this patch, a CB_RECALL will not be retried
and that can result in a 5 minute delay until the delegation
times out.
This patch modifies the code so that it will retry the
CB_RECALL every couple of seconds, often avoiding the
5 minute delay.
This is not critical for correct behaviour, but avoids
the 5 minute delay for the case where the Linux client
re-binds the back channel via BindConnectionToSession.
Rick Macklem [Sun, 4 Apr 2021 22:05:39 +0000 (15:05 -0700)]
nfsd: fix BindConnectionToSession so that it clears "cb path down"
Commit 01ae8969a9ee stopped the NFSv4.1/4.2 server from implicitly
binding the back channel to a new TCP connection so that it
conforms to RFC5661, for NFSv4.1/4.2. An effect of this
for the Linux NFS client is that it will do a
BindConnectionToSession when it sees NFSV4SEQ_CBPATHDOWN
set in a sequence reply. It will do this for every RPC
reply until it no longer sees the flag.
Without that patch, this will happen until the client does
an Open, which will clear LCL_CBDOWN.
This patch clears LCL_CBDOWN right away, so that
NFSV4SEQ_CBPATHDOWN will no longer be sent to the client
in Sequence replies and the Linux client will not repeat
the BindConnectionToSession RPCs.
This is not critical for correct behaviour, but reduces
RPC overheads for cases where the Open will not be done
for a while.
tcp: For hostcache performance, use atomics instead of counters
As accessing the tcp hostcache happens frequently on some
classes of servers, it was recommended to use atomic_add/subtract
rather than (per-CPU distributed) counters, which have to be
summed up at high cost to cache efficiency.
tcp: Make hostcache.cache_count MPSAFE by using a counter_u64_t
Addressing the underlying root cause for cache_count to
show unexpectedly high values, by protecting all arithmetic on
that global variable by using counter(9).
tcp: reduce memory footprint when listing tcp hostcache
In tcp_hostcache_list, the sbuf used would need a large (~2MB)
blocking allocation of memory (M_WAITOK), when listing a
full hostcache. This may stall the requestor for an indeterminate
time.
A further optimization is to return the expected userspace
buffersize right away, rather than preparing the output of
each current entry of the hostcase, provided by: @tuexen.
This makes use of the ready-made functions of sbuf to work
with sysctl, and repeatedly drain the much smaller buffer.
While sbuf_drain was an internal function, two
KASSERTS checked the sanity of it being called.
However, an external caller may be ignorant if
there is any data to drain, or if an error has
already accumulated. Be nice and return immediately
with the accumulated error.
Rick Macklem [Thu, 1 Apr 2021 22:09:03 +0000 (15:09 -0700)]
nfsd: silence rpcb_unset noise for NFSv4 only servers
An NFSv4 only configuration does not register with
rpcbind(). Without this patch a failure to rpcb_unset()
is reported when the daemon is terminated for this case.
This is harmless noise, but this patch avoids calling
rpcb_unset() for the NFSv4 only case, avoiding the noise.
When called with "-d", it still does the rpcb_unset(),
assuming that the configuration might have been
changed to NFSv4 only and unregistering with
rpcbind() might still be needed.
Avoid raising unexpected floating point exceptions in libm
When using clang with x86_64 CPUs that support AVX, some floating point
transformations may raise exceptions that would not have been raised by
the original code. To avoid this, use the -fp-exception-behavior=maytrap
flag, introduced in clang 10.0.0.
In particular, this fixes a number of test failures with ctanhf(3) and
ctanf(3), when libm is compiled with -mavx. An unexpected FE_INVALID
exception is then raised, because clang emits vdivps instructions to
perform certain divides. (The vdivps instruction operates on multiple
single-precision float operands simultaneously, but the exceptions may
be influenced by unused parts of the XMM registers. In this particular
case, it was calculating 0 / 0, which results in FE_INVALID.)
If -fp-exception-behavior=maytrap is specified however, clang uses
vdivss instructions instead, which work on one operand, and should not
raise unexpected exceptions.
Only use -fp-exception-behavior=maytrap on x86, for now
After 3b00222f156d, it turns out that clang only supports strict
floating point semantics for SystemZ and x86 at the moment, while for
other architectures it is still experimental.
Therefore, only use -fp-exception-behavior=maytrap on x86 for now,
otherwise this option results in "error: overriding currently
unsupported use of floating point exceptions on this target
[-Werror,-Wunsupported-floating-point-opt]" on other architectures.
Avoid -pedantic warnings about using _Generic in __fp_type_select
When compiling parts of math.h with clang using a C standard before C11,
and using -pedantic, it will result in warnings similar to:
bug254714.c:5:11: warning: '_Generic' is a C11 extension [-Wc11-extensions]
return !isfinite(1.0);
^
/usr/include/math.h:111:21: note: expanded from macro 'isfinite'
^
/usr/include/math.h:82:39: note: expanded from macro '__fp_type_select'
^
This is because the block that enables use of _Generic is conditional
not only on C11, but also on whether the compiler advertises support for
C generic selections via __has_extension(c_generic_selections).
To work around the warning without having to pessimize the code, use the
__extension__ keyword, which is supported by both clang and gcc. While
here, remove the check for __clang__, as _Generic has been supported for
a long time by gcc too now.
Kristof Provost [Thu, 25 Mar 2021 12:59:14 +0000 (13:59 +0100)]
libnv: Allow use in non-sleepable contexts
44c125c4cebc2fd87c6260b90eddae11201f5232 switched the nvlist allocations
to be M_WAITOK, but this precludes the use in non-sleepable contexts.
(E.g. with a nonsleepable lock held).
All callers for these allocation functions already cope with memory
alloation failures, so there's no reason to allow sleeping during
allocations.
pf tests: make synproxy and nat work correctly even if inetd is running
tests/sys/netfil/pf/synproxy fails if inetd has been running
outside of the jail because pidfile_open() fails with EEXIST.
tests/sys/netfil/pf/nat has the same problem but the test succeeds
because whether inetd is running is not so important.
Fix the problem by changing the pidfile path from the default
location.