Kristof Provost [Tue, 15 Feb 2022 10:49:39 +0000 (11:49 +0100)]
netinet: allow UDP tunnels to be removed
udp_set_kernel_tunneling() rejects new callbacks if one is already set.
Allow callbacks to be cleared. The use case for this is OpenVPN DCO,
where the socket is opened by userspace and then adopted by the kernel
to run the tunnel. If the DCO interface is removed but userspace does
not close the socket (something the kernel cannot prevent) the installed
callbacks could be called with an invalidated context.
Allow new functions to be set, but only if they're NULL (i.e. allow the
callback functions to be cleared).
Bjoern A. Zeeb [Wed, 16 Feb 2022 03:48:54 +0000 (03:48 +0000)]
LinuxKPI: 802.11 assign an(y) early chandef
The Realtek driver assumes an early chandef to be set. At the time
of linuxkpi_ieee80211_ifattach() we do not really know one yet so
try to find the first one which is available and set that.
This prevents a NULL-deref panic.
Bjoern A. Zeeb [Wed, 16 Feb 2022 03:20:29 +0000 (03:20 +0000)]
LinuxKPI: 802.11: defer workq allocation until we have a name
Turned out all the workq's taskqueues were named "wlanNA" if you had
more then one card in a machine as by the time we called wiphy_name()
the device name was not set yet and we returned the fallback.
Move the alloc_ordered_workqueue() from linuxkpi_ieee80211_alloc_hw()
to linuxkpi_ieee80211_ifattach() at which time the device name has
to be set to give us a unique name.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Bjoern A. Zeeb [Wed, 16 Feb 2022 03:00:34 +0000 (03:00 +0000)]
LinuxKPI: 802.11 scan update
Realtek's rtw88 is returning a hard-coded 1 in case they cannot
hw_scan (fw not advertising it). In that case if we want any scan
to run we need to fall-back to sw scan. Start dealing with this.
Long-term we probably need to keep internal state.
Mark Johnston [Wed, 16 Feb 2022 02:50:41 +0000 (21:50 -0500)]
armv8crypto: Use cursors to access crypto buffer data
Currently armv8crypto copies the scheme used in aesni(9), where payload
data and output buffers are allocated on the fly if the crypto buffer is
not virtually contiguous. This scheme is simple but incurs a lot of
overhead: for an encryption request with a separate output buffer we
have to
- allocate a temporary buffer to hold the payload
- copy input data into the buffer
- copy the encrypted payload to the output buffer
- zero the temporary buffer before freeing it
We have a handy crypto buffer cursor abstraction now, so reimplement the
armv8crypto routines using that instead of temporary buffers. This
introduces some extra complexity, but gallatin@ reports a 10% throughput
improvement with a KTLS workload without additional CPU usage. The
driver still allocates an AAD buffer for AES-GCM if necessary.
Mark Johnston [Wed, 16 Feb 2022 02:45:32 +0000 (21:45 -0500)]
opencrypto: Add a routine to copy a crypto buffer cursor
This was useful in converting armv8crypto to use buffer cursors. There
are some cases where one wants to make two passes over data, and this
provides a way to "reset" a cursor.
Bjoern A. Zeeb [Wed, 16 Feb 2022 02:10:10 +0000 (02:10 +0000)]
LinuxKPI: skbuff updates
Various updates to skbuff for new/updated drivers and some housekeeping:
- update types and struct members, add new (stub) functions
- improve freeing of frags.
- fix an issue with sleeping during alloc for dev_alloc_skb().
- Adjust a KASSERT for skb_reserve() which apparently can be called
multiple times if no data was put into the skb yet.
- move the sysctl from linux_8022.c (which may be in a different module)
to linux_skbuff.c so in case we turn debugging on we do not run into
unresolved symbols. Rename the sysctl variable to be less conflicting
and update debugging macros along with that; also add IMPROVE().
- add DDB support to show an skbuff.
- adjust comments/whitespace.
No functional changes intended for iwlwifi.
Sponsored by: The FreeBSD Foundation (partially)
MFC after: 3 days
Bjoern A. Zeeb [Tue, 15 Feb 2022 23:45:15 +0000 (23:45 +0000)]
LinuxKPI: 802.11 header updates and add/adjust source dependencies.
This update is for more/newer versions of drivers:
- add and properly place more structs, enums, defines needed by drivers.
- correct types of struct fields.
- make various function arguments const.
- move REG_RULE() macro to its own file regulatory.h and
use macros for calculations.
- add linuxkpi_ieee80211_get_channel() implementation.
- change linuxkpi_ieee80211_ifattach() to return int for error checking.
No intended functional changes for iwlwifi.
Sponsored by: The FreeBSD Foundation (partially)
MFC after: 3 days
Rick Macklem [Tue, 15 Feb 2022 22:18:23 +0000 (14:18 -0800)]
gssd: Modify /etc/rc.d/gssd so that it starts after NETWORKING
Arno Tuber reported via email that he needed to restart the gssd daemon
after booting, to get his Kerberized NFS mount to work.
Without this patch, rcorder shows that the gssd starts before NETWORKING
and kdc. The gssd will need NETWORKING to connect to the KDC and, if
the kdc is running on the same system, it does not make sense to start it
before the kdc. This fixed the problem for Arno.
While here, I also added a "# BEFORE: mountcritremote".
It does not affect ordering at this time, but I felt
it should be added, since the gssd needs to be running
when remote NFS mounts are done.
iscsi: Use calloutng instead of ticks in iscsi initiator
callout *_sbt functions are used to reduce ping/timeout scheduling
overhead, while allowing later improvments in the functionality.
Keep similar 1000ms callouts while adding a 10 ms window, to allow
some kernel scheduling improvements.
Reviewed By: jhb
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D34222
Jessica Clarke [Tue, 15 Feb 2022 16:10:34 +0000 (16:10 +0000)]
libpmc: Allow specifying explicit EVENT_xxH events on armv7 and arm64
This is useful for processors where we don't have an event table; in
those cases we default to a Cortex A8 (armv7) or Cortex A53 (arm64) in
order to attempt to provide something useful, but you're then limited to
the counters in those tables, some of which may also not be implemented
(e.g. LD/ST_RETIRED are no longer implemented in more recent cores,
replaced by LD/ST_SPEC).
Adding the raw EVENT_xxH event lists to each table ensures that you can
always request the exact events you want, regardless of what has been
detected or is known.
Mark Johnston [Tue, 15 Feb 2022 14:03:31 +0000 (09:03 -0500)]
mlx5e: Make TLS tag zones unmanaged
These zones are cache zones used to allocate TLS offload contexts from
firmware. Releasing items from the cache is a sleepable operation due
to the need to await a response from the firmware command freeing the
tag, so items cannot be reclaimed from the zone in non-sleepable
contexts. Since the cache size is limited by firmware limits, avoid
this by setting UMA_ZONE_UNMANAGED to avoid reclamation by uma_timeout()
and the low memory handler.
Reviewed by: hselasky, kib
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34142
Mark Johnston [Tue, 15 Feb 2022 13:57:22 +0000 (08:57 -0500)]
uma: Add UMA_ZONE_UNMANAGED
Allow a zone to opt out of cache size management. In particular,
uma_reclaim() and uma_reclaim_domain() will not reclaim any memory from
the zone, nor will uma_timeout() purge cached items if the zone is idle.
This effectively means that the zone consumer has control over when
items are reclaimed from the cache. In particular, uma_zone_reclaim()
will still reclaim cached items from an unmanaged zone.
Reviewed by: hselasky, kib
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34142
The padding in CURRENT shall not shrink. It is
designed that in CURRENT at always stays
the same, and then when a new stable is branched, it
inherits 6 pointer placeholders that can be used
withing this stable/X lifetime to extend the structure.
Kristof Provost [Thu, 9 Dec 2021 13:24:13 +0000 (14:24 +0100)]
if_epair: implement fanout
Allow multiple cores to be used to process if_epair traffic. We do this
(if RSS is enabled) based on the RSS hash of the incoming packet. This
allows us to distribute the load over multiple cores, rather than
sending everything to the same one.
We also switch from swi_sched() to taskqueues, which also contributes to
better throughput.
Ed Maste [Tue, 15 Feb 2022 03:30:29 +0000 (22:30 -0500)]
elfctl: fix -e invalid operation error handling
Validate the operation prior to parsing the feature string, so that e.g.
-e 0x1 reports invalid operation '0' rather than invalid feature 'x11'.
Also make it an error rather than a warning, so that it is not repeated
if multiple files are specified.
(Previously an invalid operation resulted in a segfault.)
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Bjoern A. Zeeb [Mon, 14 Feb 2022 23:55:16 +0000 (23:55 +0000)]
Bump __FreeBSD_version to 1400052 for LinuxKPI changes.
Add a marker after GUID_INIT() and linux/pm_qos.h were added, so
that future version of drm-kmod can selectively remove these bits.
The latest port version does not require user updates for this so
no UPDATING entry.
Bjoern A. Zeeb [Mon, 14 Feb 2022 22:29:38 +0000 (22:29 +0000)]
LinuxKPI: 802.11: get rid of lkpi_ic_getradiocaps warnings
Users are seeing warnings about 2 channels (1 per band)
triggered by an ioctl from wpa_supplicant usually:
lkpi_ic_getradiocaps: Adding chan ... returned error 55
This was an early FAQ.
Check the current number of channels against maxchans and the return
code from net80211. In case net80211 reports that we reached the limit
do not print the warning and do not try to add further channels.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Ed Maste [Mon, 14 Feb 2022 20:14:02 +0000 (15:14 -0500)]
Cirrus-CI: use qemu-nox11
We use -nographic for the smoke test and there is no need to pull in all
of the x11 deps. This saves some time and bandwidth during package
installation.
When I originally added Cirrus-CI support the -nox11 package was not
available.
John Baldwin [Mon, 14 Feb 2022 19:48:47 +0000 (11:48 -0800)]
Disable -Wreturn-type on GCC.
GCC is more pedantic than clang about warning when a function doesn't
handle undefined enum values (see GCC bug 87950). Clang's warning
gives a more pragmatic coverage and should find any real bugs, so
disable the warning for GCC rather than adding __unreachable
annotations to appease GCC.
Gleb Smirnoff [Mon, 14 Feb 2022 17:21:55 +0000 (09:21 -0800)]
unix/dgram: return EAGAIN instead of ENOBUFS when O_NONBLOCK set
This is behavior what some programs expect and what Linux does. For
example nginx expects EAGAIN when sending messages to /var/run/log,
which it connects to with O_NONBLOCK. Particularly with nginx the
problem is magnified by the fact that a ENOBUFS on send(2) is also
logged, so situation creates a log-bomb - a failed log message
triggers another log message.
Franco Fichtner [Mon, 14 Feb 2022 14:43:29 +0000 (09:43 -0500)]
dhclient: support VID 0 (no vlan) decapsulation
VLAN ID 0 is supposed to be interpreted as having no VLAN with a bit of
priority on the side, but the kernel is not able to decapsulate this on
the fly so dhclient needs to take care of it.
Mark Johnston [Mon, 14 Feb 2022 14:41:49 +0000 (09:41 -0500)]
git-arc: Fix review title matching
Properly handle the case where the title of one commit is a suffix or
prefix of the title of a second commit, and one wishes to create reviews
for both.
Mark Johnston [Mon, 14 Feb 2022 14:38:53 +0000 (09:38 -0500)]
sleepqueue: Address a lock order reversal
After commit 74cf7cae4d22 ("softclock: Use dedicated ithreads for
running callouts."), there is a lock order reversal between the per-CPU
callout lock and the scheduler lock. softclock_thread() locks callout
lock then the scheduler lock, when preparing to switch off-CPU, and
sleepq_remove_thread() stops the timed sleep callout while potentially
holding a scheduler lock. In the latter case, it's the thread itself
that's locked, and if the thread is sleeping then its lock will be a
sleepqueue lock, but if it's still in the process of going to sleep
it'll be a scheduler lock.
We could perhaps change softclock_thread() to try to acquire locks in
the opposite order, but that'd require dropping and re-acquiring the
callout lock, which seems expensive for an operation that will happen
quite frequently. We can instead perhaps avoid stopping the
td_slpcallout callout if the thread is still going to sleep, which is
what this patch does. This will result in a spurious call to
sleepq_timeout(), but some counters suggest that this is very rare.
PR: 261198
Fixes: 74cf7cae4d22 ("softclock: Use dedicated ithreads for running callouts.")
Reported and tested by: thj
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34204
Bjoern A. Zeeb [Wed, 9 Feb 2022 11:58:40 +0000 (11:58 +0000)]
LinuxKPI: add kstrtoint_from_user() and DECLARE_FLEX_ARRAY()
Add an implementation of kstrtoint_from_user() based on the other
implementations and an attempt at DECLARE_FLEX_ARRAY() which works
for the driver needing it.
MFC after: 3 days
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D34231
Bjoern A. Zeeb [Tue, 8 Feb 2022 23:47:15 +0000 (23:47 +0000)]
TCP syncache: enhance KASSERT output
Improve the "syncache: mbuf too small" assertion message with various
variables (some not actually needed) but enough that it will be obvious
if (a) we use IPv4 or IPv6, (b) if UDP tunneling is on, (c) what
max_linkhdr is, and (d) what MHLEN is.
This should help diagnostics in the future.
The case was hit with wireless drivers setting a large ic_headroom
and using IPv6.
Simon J. Gerraty [Sun, 13 Feb 2022 20:45:57 +0000 (12:45 -0800)]
Add support for module_verbose
Set module_verbose to control the printing of information
about loaded modules and kernel:
0 MODULE_VERBOSE_SILENT None
1 MODULE_VERBOSE_SIZE Pathname and size
2 MODULE_VERBOSE_TWIDDLE as for 1 but also twiddle for progress
3 MODULE_VERBOSE_FULL extra detail
When the loader is verifying modules we already have a
running indication of progress and module_verbose=0 makes sense.
Dimitry Andric [Mon, 7 Feb 2022 20:59:46 +0000 (21:59 +0100)]
bwi: Fix clang 14 warning about possible unaligned access
On architectures with strict alignment requirements (e.g. arm), clang 14
warns about a packed struct which encloses a non-packed union:
In file included from sys/dev/bwi/bwimac.c:79:
sys/dev/bwi/if_bwivar.h:308:7: error: field iv_val within 'struct bwi_fw_iv' is less aligned than 'union (unnamed union at sys/dev/bwi/if_bwivar.h:305:2)' and is usually due to 'struct bwi_fw_iv' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
} iv_val;
^
It appears to help if you also add __packed to the inner union (i.e.
iv_val). No change to the layout is intended.
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D34196
Kyle Evans [Sat, 12 Feb 2022 21:36:24 +0000 (15:36 -0600)]
freebsd-update: improve BE creation feature
This addresses one nit and one bug in the BE creation feature of
freebsd-update:
The nit addressed is that it currently only names the BEs after the
userland version, but the kernel version may be higher. After this
change, we request both and pass them through sort(1) to choose the
highest. This is especially helpful if a freebsd-update patch touched
one but not the other.
The bug fixed is that roots updated that are not located at '/', e.g.,
by using -b or -j, will no longer create boot environments
automatically. There's a very low chance these will actually change the
BE in any meaningful way, anyways. It could make sense in the future
to allow an argument-override to create the BE anyways if someone comes
up with a non-standard setup, e.g., where a jail is an important part of
their boot environment on an appliance or some such setup.
Half of this patch is submitted by delphij@, the other half kevans@.
PR: 261446
MFC after: 3 days
Reviewed by: delphij, emaste, Dave Fullard <dave_fullard.ca>
Differential Revision: https://reviews.freebsd.org/D34257
Navdeep Parhar [Sat, 12 Feb 2022 00:58:46 +0000 (16:58 -0800)]
cxgbe(4): Fix illegal hardware access in cxgbe_refresh_stats.
cxgbe_refresh_stats takes into account VI_SKIP_STATS but not
VI_INIT_DONE when deciding whether to read the hardware stats. But
before this change VI_SKIP_STATS was set only for VIs with VI_INIT_DONE.
That meant that cxgbe_refresh_stats always accessed the hardware for
uninitialized VIs, and this is a problem if the adapter is suspended or
in the middle of a reset.
Fix this by setting VI_SKIP_STATS on all VIs during suspend. While
here, ignore VI_INIT_DONE in vi_refresh_stats too to be consistent with
cxgbe_refresh_stats.
Our mktemp(1) implementation uses 8-X for a temp file by default.
That's ok, but we should increase the value from 8 to 10 as
many other OS already did.
John Baldwin [Sat, 12 Feb 2022 00:04:52 +0000 (16:04 -0800)]
Cast pointer to uintptr_t to avoid alignment warnings.
Both struct ip and struct udphdr both have an aligment of 2, but the
cast from struct ip to a uint32_t pointer confused GCC 9 into raising
the required alignment to 4 and then raising a
-Waddress-of-packed-member error when casting to struct udphdr.
John Baldwin [Fri, 11 Feb 2022 23:16:25 +0000 (15:16 -0800)]
ktls: Write-lock the INP when changing a transmit TLS session.
The TCP rate pacing code relies on being able to read this pointer
safely while holding an INP lock. The initial TLS session pointer is
set while holding the write lock already.
Warner Losh [Fri, 11 Feb 2022 19:50:51 +0000 (12:50 -0700)]
cleankernel: A target to delete the kernel compile file
With the meta-build, it's always a NO_CLEAN build. Provide a way to
remove so one can rebuild from scratch. 'cleankernel' will delete the
kernel and modules object directories. Document this in build(7).
it seems that glibc supports them, and such spelling is mentioned in the
ld.bfd manual. Idea seems to auto-correct some quoting/makefile sytnax
errors on linker command line.
Reviewed by: emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D34247
Ed Maste [Thu, 10 Feb 2022 20:42:05 +0000 (15:42 -0500)]
Makefile.inc1: synthesize PKG_ABI from newvers.sh variables
Previously we inspected ${WSTAGEDIR}/usr/bin/uname to determine PKG_ABI,
but the file will not exist in some cases - for example, if building
only kernel packages. We can instead synthesize the PKG_ABI from
information already provided by newvers.sh.
Reviewed by: kevans, manu (both earlier rev)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34249