Warner Losh [Fri, 15 Apr 2022 20:41:13 +0000 (14:41 -0600)]
nvme: Rename min_page_size to page_size and save mps
The Memory Page Size sets the basic unit of operation for the drive. We
currently set this to the drive's minimum page size, but we could set it
to any page size the drive supports in the future. Replace min_page_size
(it's now unused for that purpose) with page_size to reflect this and
cache the MPS we want to use. Use NVME_MPS_SHIFT to compute page_size.
Warner Losh [Fri, 15 Apr 2022 20:40:41 +0000 (14:40 -0600)]
nvme: Define NVME_MPS_SHIFT
The memory page size (MPS) is expressed in terms of a 2^(number + 12)
and other items in the system inherit this. Create a define rather than
sprinkling 12 everywehere.
Alan Somers [Fri, 15 Apr 2022 19:04:24 +0000 (13:04 -0600)]
fusefs: validate servers' error values
Formerly fusefs would pass up the stack any error value returned by the
fuse server. However, some values aren't valid for userland, but have
special meanings within the kernel. One of these, EJUSTRETURN, could
cause a kernel page fault if the server returned it in response to
FUSE_LOOKUP. Fix by validating all errors returned by the server.
Also, fix a data lifetime bug in the FUSE_DESTROY test.
Brooks Davis [Fri, 15 Apr 2022 19:04:41 +0000 (20:04 +0100)]
lpr: remove a.out binary detection
Since the first unattributed commit in 1981, lpr has attempted to
prevent users from printing executables (and in earlier versions
archives). Archive detection was lost in 1992 when lpr gained a
dependency on a.out.h. No corresponding support was added for ELF files
with the full transiation to ELF in 1998, but a.out support has been
dragged forward to and contaminated platforms that never supported
a.out.
While this feature isn't unuseful, preventing the printing of
a single file format we stopped producing ~20 years ago isn't worth
the costs (however minimal).
Initially we were using the IEs from ieee80211_probereq_ie() of net80211
and put them into the common_ies field. Start by manually building the
per-band and common IE parts as drivers put them back together.
This also involves allocating the req.ie as one buffer for all IEs over
all bands and setting req.ie_len correctly based on how many bytes we
put in.
Manually building per-band scan IEs we still use the net80211 routines
to add IEs to the buffer (mostly).
This is needed by Realtek drivers but will equally used by others.
Realtek would simply panic due to skbs being allocated with the wrong
length.
Longer-term this will help us, e.g., when not supporting VHT on 2Ghz
and we would have to do this anyway.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
LinuxKPI: 802.11: use an sx lock to protect the list of vifs
Use an sx lock to protect the list of vifs. We could use the
linux mutex compat for this but our current implementation may
re-acquire the lock recursively so allow this. The change is
mainly motivated by the fact that some callers may sleep in the
interator function called. Recursiveness is needed because we
see find_sta_by_ifaddr() being called from an iterator function
from iterate_interfaces().
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
LinuxKPI: 802.11: start adding rate control to ieee80211_tx_status()
Start adding rate control feedback in ieee80211_tx_status() in order
for net80211 to be able to report something back (which may not
yet be the view of the firmware). iwlwifi is reporting back an MSC 0
even with HT disabled (to be investigated) so we cannot (yet) use
the firmware/driver rate feedback directly.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Implement skb_copy() with omissions of fragments and possibly other fields
for now. Should we hit frags at any point a log message will let us know.
For the few cases we need this currently this is enough.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
LinuxKPI: skbuff: dev_kfree_skb_irq() and improvements
While it is currently unclear if we will have to defer work in
dev_kfree_skb_irq() to call dev_kfree_skb().
We only have one caller which seems to be fine on FreeBSD by calling
it directly for now.
While here shortcut skb_put()/skb_put_data() saving us work if there
are no adjustments to do.
Also adjust the logging in skb_is_gso() to avoid getting spammed by it.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
This code was marked gone_in(14), so it can now be removed.
The only consumer of this interface is dumpon(8). We do not maintain
strict backwards compatibility for this utility because a) it
can't/shouldn't be used from a jail or chroot and b) it is highly
specific interface unique to FreeBSD. The host's (presumably more
up-to-date) copy of dumpon(8) should be used to configure kernel dump
devices.
Reviewed by: markj, emaste
MFC after: never
Differential Revision: https://reviews.freebsd.org/D34914
This code was marked gone_in(13), so its time has passed.
The only consumer of this interface is dumpon(8). We do not maintain
strict backwards compatibility for this utility because a) it
can't/shouldn't be used from a jail or chroot and b) it is highly
specific interface unique to FreeBSD. The host's (presumably more
up-to-date) copy of dumpon(8) should be used to configure kernel dump
devices.
Ed Maste [Thu, 14 Apr 2022 00:50:17 +0000 (20:50 -0400)]
scp: switch to using the SFTP protocol by default
From upstream release notes https://www.openssh.com/txt/release-9.0
This release switches scp(1) from using the legacy scp/rcp protocol
to using the SFTP protocol by default.
Legacy scp/rcp performs wildcard expansion of remote filenames (e.g.
"scp host:* .") through the remote shell. This has the side effect of
requiring double quoting of shell meta-characters in file names
included on scp(1) command-lines, otherwise they could be interpreted
as shell commands on the remote side.
This creates one area of potential incompatibility: scp(1) when using
the SFTP protocol no longer requires this finicky and brittle quoting,
and attempts to use it may cause transfers to fail. We consider the
removal of the need for double-quoting shell characters in file names
to be a benefit and do not intend to introduce bug-compatibility for
legacy scp/rcp in scp(1) when using the SFTP protocol.
Another area of potential incompatibility relates to the use of remote
paths relative to other user's home directories, for example -
"scp host:~user/file /tmp". The SFTP protocol has no native way to
expand a ~user path. However, sftp-server(8) in OpenSSH 8.7 and later
support a protocol extension "expand-path@openssh.com" to support
this.
In case of incompatibility, the scp(1) client may be instructed to use
the legacy scp/rcp using the -O flag.
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Ed Maste [Fri, 15 Apr 2022 14:41:08 +0000 (10:41 -0400)]
ssh: update to OpenSSH v9.0p1
Release notes are available at https://www.openssh.com/txt/release-9.0
Some highlights:
* ssh(1), sshd(8): use the hybrid Streamlined NTRU Prime + x25519 key
exchange method by default ("sntrup761x25519-sha512@openssh.com").
The NTRU algorithm is believed to resist attacks enabled by future
quantum computers and is paired with the X25519 ECDH key exchange
(the previous default) as a backstop against any weaknesses in
NTRU Prime that may be discovered in the future. The combination
ensures that the hybrid exchange offers at least as good security
as the status quo.
* sftp-server(8): support the "copy-data" extension to allow server-
side copying of files/data, following the design in
draft-ietf-secsh-filexfer-extensions-00. bz2948
* sftp(1): add a "cp" command to allow the sftp client to perform
server-side file copies.
This commit excludes the scp(1) change to use the SFTP protocol by
default; that change will immediately follow.
MFC after: 1 month
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Tom Jones [Fri, 15 Apr 2022 13:59:14 +0000 (14:59 +0100)]
diff3: allow diff3 ed scripts to generate deletions
diff3 with the -e (ed script flag) can generate line deletions, add
support for deletions and add a test case to exercise this behaviour.
This functionality was unearthed through comparison of bsd diff3 and gnu
diff3 output.
Reviewed by: pstef
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34912
Rick Macklem [Thu, 14 Apr 2022 23:15:56 +0000 (16:15 -0700)]
nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port. For FreeBSD, this argument
is always NULL, so remove it to clean up the code.
This commit gets rid of "stuff" for nfscl_nget().
Future commits will do the same for other functions.
cxgbe/t4_tom: Support for round-robin selection of offload queues.
A COP (Connection Offload Policy) rule can now specify that the tx
and/or rx queue for a new tid should be selected in a round-robin
manner. There is no change in default behavior.
The driver queries the firmware to find out if it supports this feature
and enables it if it does. The firmware moves the iSCSI page pod region
to a lower address so that some of it is located in the faster on-chip
memory instead of external DDR.
Randall Stewart [Thu, 14 Apr 2022 20:07:34 +0000 (16:07 -0400)]
tcp: adding a functionality to define "trace points" so that BB logging can be enabled at specific events.
This commit will add a new concept to rack, tracepoints. A tracepoint
is a defined point inserted into the code (3 are included in this initial patch) that
allows a developer to insert a point that might be of interest. The developer numbers
the point in the tcp_rack.h file and then can use sysctl to enable that (or all) trace
points. A limit is also given to how many BB logged connections will turn on
so that a box is not overrun by BB logging.
Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D34898
Randall Stewart [Thu, 14 Apr 2022 20:04:08 +0000 (16:04 -0400)]
tcp - hpts timing is off when we are above 1200 connections.
HPTS timing begins to go off when we reach the threshold of connections (1200 by default)
where we have any returning syscall or LRO stop finding the oldest hpts thread that
has not run but instead using the CPU it is on. This ends up causing quite a lot of times
where hpts threads may not run for extended periods of time. On top of all that which
causes heartburn if you are pacing in tcp, you also have the fact that where AMD's
podded L3 cache may have sets of 8 CPU's that share a L3, hpts is unaware of this
and thus on amd you can generate a lot of cache misses.
So to fix this we will get rid of the CPU mode, and always use oldest. But also make
HPTS aware of the CPU topology and keep the "oldest" to be within the same L3 cache.
This also works nicely for NUMA as well couple with Drew's earlier NUMA changes.
Mark Johnston [Thu, 14 Apr 2022 19:45:54 +0000 (15:45 -0400)]
vm: Move the "vm_wait in early boot" assertion to the proper place
The assertion was added in commit 1771e987ca6a. After that, vm_wait()
and friends were refactored such that the actual sleep happens
elsewhere. Now the assertion condition is not checked when
vm_wait_doms() is called directly, and it is checked even if we are not
going to sleep (because vm_page_count_min_set(wdoms) is false).
Reviewed by: alc, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34909
Ed Maste [Wed, 13 Apr 2022 22:36:03 +0000 (18:36 -0400)]
Allow posix_fadvise in capability mode
posix_fadvise operates only on a provided fd. Noted by
Mathieu <sigsys@gmail.com> in review D34761.
No new CAP_ rights are added for posix_fadvise(), as 'advice' in
general only influences when I/O happens; the fd must have existing
CAP_ rights for actual data access.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34903
Mike Karels [Mon, 11 Apr 2022 19:44:49 +0000 (14:44 -0500)]
genet: fix problems with interface down/up
The genet interface did not resume operation correctly after doing
ifconfig down then up. The down/reset procedure did not clear the
RUNNING flag, and did not reset enough of the hardware state. This
patch is modeled on OpenBSD code, with a call to gen_reset added
to reset the controller completely. Regularize the parameter to
gen_dma_disable() while here.
Gordon Bergling [Thu, 14 Apr 2022 08:04:14 +0000 (10:04 +0200)]
time(3): Refine history in the manual page
The time() system call first appeared in Version 1 AT&T UNIX. Through
the Version 3 AT&T UNIX, it returned 60 Hz ticks since an epoch that
changed occasionally, because it was a 32-bit value that overflowed in a
little over 2 years.
In Version 4 AT&T UNIX the granularity of the return value was reduced to
whole seconds, delaying the aforementioned overflow until 2038.
Version 7 AT&T UNIX introduced the ftime() system call, which returned
time at a millisecond level, though retained the gtime() system call
(exposed as time() in userland). time() could have been implemented as a
wrapper around ftime(), but that wasn't done.
4.1cBSD implemented a higher-precision time function gettimeofday() to
replace ftime() and reimplemented time() in terms of that.
Since FreeBSD 9 the implementation of time() uses
clock_gettime(CLOCK_SECOND) instead of gettimeofday() for performance
reasons.
cxgbe(4): Fix control flow issues reported by Coverity.
CID 1487932: Control flow issues (NESTING_INDENT_MISMATCH).
The macro on this line expands into multiple statements, only the first
of which is nested within the preceding parent while the rest are not.
9828 ulp_region(RX_TLS_KEY);
Reported by: Coverity (CID 1487932)
Fixes: f88b31885c4 cxgbe(4): meminfo should get the TLS region's limits from the hardware.
MFC after: 3 days
Sponsored by: Chelsio Communications
John Baldwin [Wed, 13 Apr 2022 23:08:21 +0000 (16:08 -0700)]
sctp: #ifdef INET-only and INET6-only variables.
Duplicating the SCTP_PCB_FLAGS_BOUND_V6 check made the #ifdef's
simpler than applying #ifdef's directly to the original code. Modern
compilers should cache the result rather than testing the flag twice.