ieee80211_defrag() accepts fragmented 802.11 frames in a protected Wi-Fi
network even when some of the fragments are not encrypted.
Track whether the fragments are encrypted or not and only accept
successive ones if they match the state of the first fragment.
This relates to section 6.3 in the 2021 Usenix "FragAttacks" (Fragment
and Forge: Breaking Wi-Fi Through Frame Aggregation and Fragmentation)
paper.
Coherent is lower 32bit only by default in Linux and our only default
dma mask is 64bit currently which violates expectations unless
dma_set_coherent_mask() was called explicitly with a different mask.
Implement coherent by creating a second tag, and storing the tags in the
objects and use the tag from the object wherever possible.
This currently does not update the scatterlist or pool (both could be
converted but S/G cannot be MFCed as easily).
There is a 2nd change embedded in the updated logic of
linux_dma_alloc_coherent() to always zero the allocation as
otherwise some drivers get cranky on uninialised garbage.
LinuxKPI: dma-mapping.h unify "mask" and "dma_mask"
In some places we are using "mask" and others "dma_mask" for the
same thing. Harmonize the various places to "dma_mask" as used in
linux_pci.c. For the declaration remove the argument names to
avoid the entire problem.
This is in preparation for an upcoming change.
No functional changes intended.
As reported by multiple people testing iwlwifi, device_release_driver()
can lead to a panic on secondary errors (usually during attach).
Disable device_release_driver() for the short-term to prevent the panic
but leave it in place so it can be re-worked and fixed properly for
the long-term more easily.
net80211: add func/line information to IEEE80211_DISCARD* macros
While debugging is very good in net80211, some log messages are
repeated in multiple places 1:1. In order to distinguish where the
discard happened and to speed up analysis, add __func__:__LINE__
information to all these messages.
Add fsleep() function now required by rtw88. This seems to be
making a decision depending on time to sleep on how to sleep.
Given our compat framework already is lenient on how long to sleep,
this is a cut down version.
LinuxKPI: add module_pci_driver() and pci_alloc_irq_vectors()
Add the two new functions needed by rtw88 to register the driver and
handle the module bits as well as a version of pci_alloc_irq_vectors()
for what is needed.
Mark Johnston [Wed, 3 Nov 2021 16:28:48 +0000 (12:28 -0400)]
kasan: Disable validation of function parameters passed by value
It appears that the emitted code in the caller does not update shadow
state for values passed on the stack to the callee, which it seemingly
ought to do after pushing values on the stack and prior to the call
itself. This leaves open a window where an interrupt handler can cause
regions of the stack containing these values to be poisoned, resulting
in rare false positive reports. This happens particularly in the amd64
TLB invalidation code, where we liberally pass cpuset_t's around by
value.
LLVM has a flag to disable validation of accesses of function parameters
passed by value. Such validation is itself a relatively new feature.
Turn it off for now.
Reported by: pho, syzkaller
Sponsored by: The FreeBSD Foundation
Rick Macklem [Wed, 3 Nov 2021 21:25:44 +0000 (14:25 -0700)]
nfscl: Fix forced dismount from looping on commit
When a forced dismount is in progress, it is possible to
end up looping, retrying commits that fail.
This patch fixes the problem by pretending
that commits succeeded when a forced dismount is in prgress.
Rick Macklem [Wed, 3 Nov 2021 19:15:40 +0000 (12:15 -0700)]
nfscl: Fix use after free for forced dismount
When a forced dismount is done and delegations are being
issued by the server (disabled by default for FreeBSD
servers), the delegation structure is free'd before the
loop calling vflush(). This could result in a use after
free of the delegation structure.
This patch changes the code so that the delegation
structures are not free'd until after the vflush()
loop for forced dismounts.
Found during a recent IETF NFSv4 working group testing event.
Rick Macklem [Wed, 3 Nov 2021 00:28:13 +0000 (17:28 -0700)]
nfscl: Check for a forced dismount in nfscl_getref()
The nfscl_getref() function is called within nfscl_doiods() when
the NFSv4.1/4.2 pNFS client is doing I/O on a DS. As such,
nfscl_getref() needs to check for a forced dismount.
This patch adds that check.
Found during a recent IETF NFSv4 working group testing event.
Rick Macklem [Tue, 2 Nov 2021 00:21:31 +0000 (17:21 -0700)]
nfscl: Use a smaller initial delay time for NFSERR_DELAY
For NFS RPCs that receive a NFSERR_DELAY reply, the delay time
is initially 1sec and then increases exponentially to NFS_TRYLATERDEL.
It was found that this delay time is excessive for some NFSv4
servers, which work well with a 1msec delay.
A 1sec delay resulted in very slow performance for Remove and
Rename when delegations and pNFS were enabled.
This patch decreases the initial delay time to 1msec.
Found during a recent IETF NFSv4 working group testing event.
Mark Johnston [Sat, 16 Oct 2021 17:13:26 +0000 (13:13 -0400)]
bhyve: Fix the WITH_BHYVE_SNAPSHOT build
Note, this breaks compatibility with snapshots generated by older builds
of bhyve(8).
Fixes: 7fa233534736 ("bhyve: Map the MSI-X table unconditionally for passthrough")
Reported by: Greg V <greg@unrelenting.technology>
Reviewed by: grehan, bz
Sponsored by: The FreeBSD Foundation
Mark Johnston [Sat, 9 Oct 2021 15:36:19 +0000 (11:36 -0400)]
bhyve: Map the MSI-X table unconditionally for passthrough
It is possible for the PBA to reside in the same page as the MSI-X
table. And, while devices are not supposed to do this, at least some
Intel wifi devices place registers in a page shared with the MSI-X
table. To handle the first case we currently map the PBA page using
/dev/mem, and the second case is not handled.
Kill two birds with one stone: map the MSI-X table BAR using the
PCIOCBARMMAP ioctl instead of /dev/mem, and map the entire table so that
accesses beyond the bounds of the table can be emulated. Regions of the
BAR not containing the table are left unmapped.
PR: 251046
Reviewed by: bz, grehan, jhb
Sponsored by: The FreeBSD Foundation
Mark Johnston [Tue, 9 Nov 2021 18:07:57 +0000 (13:07 -0500)]
pci: Implement pci_bar_enabled() for SR-IOV VFs
In a VF's configuration space, "memory space enable" is hard-wired to 0,
so the existing implementation always returns false. We need to read
the SR-IOV control register from the PF device to get the value of the
MSE bit.
Fix pci_bar_enabled() to read this register instead for VFs. I don't
see any way to access the PF's config space without a backpointer in the
pci device ivars, so I added one.
This fixes a regression where bhyve(8) fails to map the MSI-X table
after commit 7fa233534736 ("bhyve: Map the MSI-X table unconditionally
for passthrough") when a VF is passed through, since with that commit we
use PCIOCBARMMAP to map the table and that ioctl always fails for VFs
without this change. As a bonus, pciconf(8) now correctly reports the
enablement of BARs for VFs.
Reported and tested by: Raúl Muñoz <raul.munoz@custos.es>
Reviewed by: rstone, jhb
Sponsored by: The FreeBSD Foundation
Mitchell Horne [Mon, 8 Nov 2021 19:33:25 +0000 (15:33 -0400)]
hwpmc: initialize arm64 counter/interrupt state
Performance counters and overflow interrupts are assumed to be disabled
by default, but this is not guaranteed. Ensure we disable both during
per-cpu initialization, before enabling the PMU. Otherwise, some systems
(such as the Ampere eMAG) would experience an interrupt storm upon
loading the hwpmc module.
Reviewed by: br
MFC after: 5 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32854
Mark Johnston [Mon, 1 Nov 2021 13:27:52 +0000 (09:27 -0400)]
uma: Fix handling of reserves in zone_import()
Kegs with no items reserved have uk_reserve = 0. So the check
keg->uk_reserve >= dom->ud_free_items will be true once all slabs are
depleted. Then, rather than go and allocate a fresh slab, we return to
the cache layer.
The intent was to do this only when the keg actually has a reserve, so
modify the check to verify this first. Another approach would be to
make uk_reserve signed and set it to -1 until uma_zone_reserve() is
called, but this requires a few casts elsewhere.
Fixes: 1b2dcc8c54a8 ("uma: Avoid depleting keg reserves when filling a bucket")
Sponsored by: The FreeBSD Foundation
Mark Johnston [Mon, 1 Nov 2021 13:27:35 +0000 (09:27 -0400)]
uma: Improve M_USE_RESERVE handling in keg_fetch_slab()
M_USE_RESERVE is used in a couple of places in the VM to avoid unbounded
recursion when the direct map is not available, as is the case on 32-bit
platforms or when certain kernel sanitizers (KASAN and KMSAN) are
enabled. For example, to allocate KVA, the kernel might allocate a
kernel map entry, which might require a new slab, which requires KVA.
For these zones, we use uma_prealloc() to populate a reserve of items,
and then in certain serialized contexts M_USE_RESERVE can be used to
guarantee a successful allocation. uma_prealloc() allocates the
requested number of items, distributing them evenly among NUMA domains.
Thus, in a first-touch zone, to satisfy an M_USE_RESERVE allocation we
might have to check the slab lists of other domains than the current one
to provide the semantics expected by consumers.
So, try harder to find an item if M_USE_RESERVE is specified and the keg
doesn't have anything for current (first-touch) domain. Specifically,
fall back to a round-robin slab allocation. This change fixes boot-time
panics on NUMA systems with KASAN or KMSAN enabled.[1]
Alternately we could have uma_prealloc() allocate the requested number
of items for each domain, but for some existing consumers this would be
quite wasteful. In general I think keg_fetch_slab() should try harder
to find free slabs in other domains before trying to allocate fresh
ones, but let's limit this to M_USE_RESERVE for now.
Also fix a separate problem that I noticed: in a non-round-robin slab
allocation with M_WAITOK, rather than sleeping after a failed slab
allocation we simply try again. Call vm_wait_domain() before retrying.
Reported by: mjg, tuexen [1]
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
Rick Macklem [Sat, 30 Oct 2021 03:35:02 +0000 (20:35 -0700)]
nfscl: Use NFSMNTP_DELEGISSUED in two more functions
Commit 5e5ca4c8fc53 added a NFSMNTP_DELEGISSUED flag to indicate when
a delegation has been issued to the mount. For the common case
where an NFSv4 server is not issuing delegations, this flag
can be checked to avoid acquisition of the NFSCLSTATEMUTEX.
This patch adds checks for NFSMNTP_DELEGISSUED being set
to two more functions.
This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
for the common case where server is not issuing delegations.
Rick Macklem [Sat, 30 Oct 2021 23:35:02 +0000 (16:35 -0700)]
nfscl: Fix race between Lookup and Setattr of size
PR#259071 provides a test program that fails for the NFS client.
Testing with it, there appears to be a race between Lookup
and VOPs like Setattr-of-size, where Lookup ends up loading
stale attributes (including what might be the wrong file size)
into the NFS vnode's attribute cache.
The race occurs when the modifying VOP (which holds a lock
on the vnode), blocks the acquisition of the vnode in Lookup,
after the RPC (with now potentially stale attributes).
Here's what seems to happen:
Child Parent
does stat(), which does
VOP_LOOKUP(), doing the Lookup
RPC with the directory vnode
locked, acquiring file attributes
valid at this point in time
blocks waiting for locked file does ftruncate(), which
vnode does VOP_SETATTR() of Size,
changing the file's size
while holding an exclusive
lock on the file's vnode
releases the vnode lock
acquires file vnode and fills in
now stale attributes including
the old wrong Size
does a read() which returns
wrong data size
This patch fixes the problem by saving a timestamp in the NFS vnode
in the VOPs that modify the file (Setattr-of-size, Allocate).
Then lookup/readdirplus compares that timestamp with the time just
before starting the RPC after it has acquired the file's vnode.
If the modifying RPC occurred during the Lookup, the attributes
in the RPC reply are discarded, since they might be stale.
With this patch the test program works as expected.
Note that the test program does not fail on a July stable/12,
although this race is in the NFS client code. I suspect a
fairly recent change to the name caching code exposed this
bug.
Rick Macklem [Sun, 31 Oct 2021 00:08:28 +0000 (17:08 -0700)]
nfscl: Add setting n_localmodtime to the Write RPC code
Similar to commit 2be417843a, I believe there could be a race between
the NFS client VOP_LOOKUP() and file Writing that could result in stale
file attributes being loaded into the NFS vnode by VOP_LOOKUP().
I have not been able to reproduce a failure due to this race, but
I believe that there are two possibilities:
The Lookup RPC happens while VOP_WRITE() is being executed and loads
stale file attributes after VOP_WRITE() returns when it has already
completed the Write/Commit RPC(s).
--> For this case, setting the local modify timestamp at the end of
VOP_WRITE() should ensure that stale file attributes are not loaded.
The Lookup RPC occurs after VOP_WRITE() has returned, while
asynchronous Write/Commit RPCs are in progress and then is
blocked by the vnode held by VOP_OPEN/VOP_CLOSE/VOP_FSYNC which
will flush writes via ncl_flush() or ncl_vinvalbuf(), clearing the
NMODIFIED flag (which indicates Writes-in-progress). The VOP_LOOKUP()
then acquires the NFS vnode lock and fills in stale file attributes.
--> Setting the local modify timestamp in ncl_flsuh() and ncl_vinvalbuf()
when they clear NMODIFIED should ensure that stale file attributes
are not loaded.
Rick Macklem [Sun, 31 Oct 2021 23:31:31 +0000 (16:31 -0700)]
nfscl: Do pNFS layout return_on_close synchronously
For pNFS servers that specify that Layouts are to be returned
upon close, they may expect that LayoutReturn to happen before
the associated Close.
This patch modifies the NFSv4.1/4.2 pNFS client so that this
is done. This only affects a pNFS mount against a non-FreeBSD
NFSv4.1/4.2 server that specifies return_on_close in LayoutGet
replies.
Found during a recent IETF NFSv4 working group testing event.
pf: ensure we have the correct source/destination IP address in ICMP errors
When we route-to a packet that later turns out to not fit in the
outbound interface MTU we generate an ICMP error.
However, if we've already changed those (i.e. we've passed through a NAT
rule) we have to undo the transformation first.
Andriy Gapon [Thu, 4 Nov 2021 11:56:12 +0000 (13:56 +0200)]
htu21: allow configuration via hints on FDT-based systems
On-board devices should be configured via the FDT and overlays.
Hints are primarily useful for external and temporarily attached devices.
Adding hints is much easier and faster than writing and compiling
an overlay.
Andriy Gapon [Sat, 6 Nov 2021 16:47:32 +0000 (18:47 +0200)]
htu21: don't needlessly bother hardware when measurements are not needed
sysctl(8) first queries a sysctl to get a size of its value even if the
sysctl is of a fixed size, e.g. it has an integer type.
Only after that sysctl(8) queries an actual value of the sysctl.
Previosuly the driver would needlessly read a sensor in the first step.
Peter Grehan [Mon, 1 Nov 2021 13:35:43 +0000 (23:35 +1000)]
igc: Use hardware routine for PHY reset
Summary:
The previously used software reset routine wasn't sufficient
to reset the PHY if the bootloader hadn't left the device in
an initialized state. This was seen with the onboard igc port
on an 11th-gen Intel NUC.
The software reset isn't used in the Linux driver so all related
code has been removed.
Tested on: Netgate 6100 onboard ports, a discrete PCIe I225-LM card,
and an 11th-gen Intel NUC.
ip_divert: calculate delayed checksum for IPv6 adress family
Before passing an IPv6 packet to application apply delayed checksum
calculation. Mbuf flags will be lost when divert listener will return a
packet back, so we will not be able to do delayed checksum calculation
later. Also an application will get a packet with correct checksum.
Felix Johnson [Mon, 8 Nov 2021 06:14:58 +0000 (01:14 -0500)]
find(1): Update date format reference and remove cvs(1) references
cvs(1) is not installed by default. Change the date format reference to
note that find(1) understands ISO8601 and RFC822 date formats. Also
remove references to cvs(1).
Mark Johnston [Wed, 3 Nov 2021 19:09:17 +0000 (15:09 -0400)]
scsi_cd: Improve TOC access validation
1. During CD probing, we read the TOC header to find the number of
entries, then read the TOC itself. The header determines the number
of entries, which determines the amount of data to read from the
device into the softc in the CD_STATE_MEDIA_TOC_FULL state. We
hard-code a limit of 99 tracks (plus one for the lead-out) in the
softc, but were not validating that the size reported by the media
would fit in this hard-coded limit. Kernel memory corruption could
occur if not.[1] Add validation to check this, and refuse to cache
the TOC if it would not fit.
2. The CDIOCPLAYTRACKS ioctl uses caller provided track numbers to index
into the TOC, but we only validate the starting index. Add
validation of the ending index.
Also, raise the hard-coded limit from 100 tracks to 170, per a
suggestion from Ken.
Reported by: C Turt <ecturt@gmail.com> [1]
Reviewed by: ken, avg
Sponsored by: The FreeBSD Foundation
Rick Macklem [Tue, 26 Oct 2021 02:09:14 +0000 (19:09 -0700)]
nfscl: Add a missing delegation lock release
There was a case in nfscl_doiods() where the function would return
without releasing the delegation shared lock, if it was aquired by
the call to nfscl_getstateid(). This patch adds that release.
I have never observed a failure due to this missing release, so I
do not know if it ever happens in practice. However, since the pNFS
client is not yet heavily used, it might be the case.
Found by code inspection during a recent NFSv4 IETF working group
testing event.
Ed Maste [Thu, 2 Sep 2021 20:43:59 +0000 (16:43 -0400)]
openssh: restore local change to gssapi include logic
/usr/include/gssapi.h claims that it is deprecated, and gssapi/gssapi.h
should be used instead. So, test HAVE_GSSAPI_GSSAPI_H first falling
back to HAVE_GSSAPI_H.
This will be submitted upstream.
Fixes: 6eac665c8126 ("openssh: diff reduction against...")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31810
Dimitry Andric [Fri, 5 Nov 2021 21:26:16 +0000 (22:26 +0100)]
Partially revert ac76bc1145dd because it is no longer necessary
In ac76bc1145dd, I added a few volatiles to work around ctrig_test
failures with {inf,inf}. This is not necessary anymore now, since in 3b00222f156d we added -fp-exception-behavior=maytrap for clang >= 10 in
libm's Makefile. (The flag tells clang to use stricter floating point
semantics, which libm depends on.)
Ed Maste [Mon, 25 Oct 2021 21:25:26 +0000 (17:25 -0400)]
strip/objcopy: handle empty file as unknown
Previously strip reported a somewhat cryptic error for empty files:
strip: elf_begin() failed: Invalid argument
Add a special case to treat empty files as with an unknown file format.
This is consistent with llvm-strip. GNU strip produces no output which
does not seem like useful behaviour (but it does exit with status 1).
Reported by: andrew
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32648
This makes it consistent with other date(1) implementations. Also, it
feels more consistent since hours and minutes are already represented as
HH and MM respectively.
- Use Cm instead of Ar or Sq for command modifiers of the -v flag.
- Remove unnecessary "Ar ..." from the synopsis. It's not clear what it
was referring to.
- Add missing arguments to the -f and -v flags.
- Stylize the dot before "ss" with Cm in the default format in the -f
flag description.
- Set LC_ALL=C in the last example so that the output format of
date(1) always matches the specified format of the -f flag not matter
the locale.
- List the -f flag as optional in all usage lines in the synopsis.