Fix bsnmpd(1) crash with ill-formed Discovery message
RFC 3414 Section 4. Discovery specifies that a discovery request message has a
varBindList left empty. Nonetheless, bsnmpd(1) should not crash when receiving
a non-zero var-bindings list in a Discovery Request message.
Andrew Turner [Wed, 29 Sep 2021 13:33:18 +0000 (14:33 +0100)]
Add a gic interface to allocate MSI interrupts
The previous update to handle the gicv2m as a child of the gicv3 driver
assumed there was only a single gicv2m child. On some hardware there
are multiple children. Support this by removing the mbi ivars and
adding a new interface to handle MSI allocation in a given range.
Tested by: mw, trasz
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32224
Andrew Turner [Tue, 21 Sep 2021 17:10:57 +0000 (17:10 +0000)]
Allow ddb and dtrace use the DMAP region on arm64
When writing to memory on arm64 we may be trying to be accessing a
read-only page. In this case try to access via the DMAP region to
get a writable location.
While here simplify writing data in DDB and stop trashing the size as
it is passed into the cache handling functions.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32053
Andrew Turner [Thu, 23 Sep 2021 10:32:16 +0000 (10:32 +0000)]
Also print symbols when printing arm64 registers
When printing arm64 registers because of an exception in the kernel
also print the symbol and offset. This can be used to track down why
the exception occured without needing external tools.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32077
modules: felix: Remove etherswitch_if.c from Makefile
Having it included confuses KOBJOPLOOKUP resulting in kobj_error_method
being called instead of a devmethod from the switch driver.
That in turn returns ENXIO which was treated as a pointer and
dereferenced by etherswitch ioctl logic causing the kernel to panic.
It was missed during the conversion of kernel configs.
Although the driver is already built as a kernel module we might
want to have it built-in for diskless booting and such.
We'd likely be better served by converting these to the equivalent mem*
calls, but just kill the knob for now. The b* macros being defined get
in the way of _FORTIFY_SOURCE.
kqueue: don't arbitrarily restrict long-past values for NOTE_ABSTIME
NOTE_ABSTIME values are converted to values relative to boottime in
filt_timervalidate(), and negative values are currently rejected. We
don't reject times in the past in general, so clamp this up to 0 as
needed such that the timer fires immediately rather than imposing what
looks like an arbitrary restriction.
Another possible scenario is that the system clock had to be adjusted
by ~minutes or ~hours and we have less than that in terms of uptime,
making a reasonable short-timeout suddenly invalid. Firing it is still
a valid choice in this scenario so that applications can at least
expect a consistent behavior.
Colin Percival [Thu, 30 Sep 2021 21:48:14 +0000 (14:48 -0700)]
EFI loader: Don't free bcache for DEVT_DISK devs
Booting on an EC2 c5.xlarge instance, this reduces the number of I/Os
performed from 609 to 432, reduces the total number of blocks read
from 61963 to 60797, and reduces the time spent in the loader by 39 ms.
Note that b4cb3fe0e39a allowed the bcache to be retained for most of
the boot process, but relies on mounting filesystems; this commit
allows the bcache to be retained at the start of the boot process,
before the root filesystem has been located.
These aren't a part of or use libjail(3), but rather are direct
syscalls. Still, they seem like good additions, allowing us to attach
to already-running jails.
Warner Losh [Thu, 30 Sep 2021 20:16:19 +0000 (14:16 -0600)]
uart: Match simple comm
Match the PCI simple comm devices (or try to). Be conservative and use
legacy interrupts rather than msi messages by default for this 'catch
all' since it matches what Linux does (it has opt-in generally for MSI,
but also matches more devices because it does a catch-all like
implemented in this commit).
Ed Maste [Wed, 29 Sep 2021 21:24:39 +0000 (17:24 -0400)]
mgb: Fix nop admin interrupt handling
Previously mgb_admin_intr printed a diagnostic message if no interrupt
status bits were set, but it's not valid to call device_printf() from a
filter. Just drop the message as it has no user-facing value.
Also return FILTER_STRAY in this case - there is nothing further for
the driver to do.
Reviewed by: kbowling
MFC after: 1 week
Fixes: 8890ab7758b8 ("Introduce if_mgb driver...")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32231
Mathy Vanhoef [Sun, 6 Jun 2021 22:10:56 +0000 (22:10 +0000)]
net80211: prevent plaintext injection by A-MSDU RFC1042/EAPOL frames
No longer accept plaintext A-MSDU frames that start with an RFC1042
header with EtherType EAPOL. This is done by only accepting EAPOL
packets that are included in non-aggregated 802.11 frames.
Note that before this patch, FreeBSD also only accepted EAPOL frames
that are sent in a non-aggregated 802.11 frame due to bugs in
processing EAPOL packets inside A-MSDUs. In other words,
compatibility with legitimate devices remains the same.
This relates to section 6.5 in the 2021 Usenix "FragAttacks" (Fragment
and Forge: Breaking Wi-Fi Through Frame Aggregation and Fragmentation)
paper.
Mathy Vanhoef [Sun, 6 Jun 2021 22:10:52 +0000 (22:10 +0000)]
net80211: mitigation against A-MSDU design flaw
Mitigate A-MSDU injection attacks by detecting if the destination address
of a subframe equals an RFC1042 (i.e., LLC/SNAP) header, and if so
dropping the complete A-MSDU frame. This mitigates known attacks,
although new (unknown) aggregation-based attacks may remain possible.
This defense works because in A-MSDU aggregation injection attacks, a
normal encrypted Wi-Fi frame is turned into an A-MSDU frame. This means
the first 6 bytes of the first A-MSDU subframe correspond to an RFC1042
header. In other words, the destination MAC address of the first A-MSDU
subframe contains the start of an RFC1042 header during an aggregation
attack. We can detect this and thereby prevent this specific attack.
This relates to section 7.2 in the 2021 Usenix "FragAttacks" (Fragment
and Forge: Breaking Wi-Fi Through Frame Aggregation and Fragmentation)
paper.
ieee80211_defrag() accepts fragmented 802.11 frames in a protected Wi-Fi
network even when some of the fragments are not encrypted.
Track whether the fragments are encrypted or not and only accept
successive ones if they match the state of the first fragment.
This relates to section 6.3 in the 2021 Usenix "FragAttacks" (Fragment
and Forge: Breaking Wi-Fi Through Frame Aggregation and Fragmentation)
paper.
Looking for "tsc-tsc" in the pmu tables will fail every time. Instead,
make this an alias for the static TSC event defined in pmc_events.h.
This fixes 'pmcstat -s cycles' on Intel and AMD.
Reviewed by: emaste
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32197
ocs_fc: Fix CAM status reporting in ocs_fc(4) when no data is returned.
In ocs_scsi_initiator_io_cb(), if the SCSI command that is
getting completed had a residual equal to the transfer length,
it was setting the CCB status to CAM_REQ_CMP.
That breaks the expected behavior for commands like READ ATTRIBUTE.
For READ ATTRIBUTE, if the first attribute requested doesn't exist,
the command is supposed to return an error (Illegal Request,
Invalid Field in CDB). The broken behavior for READ ATTRIBUTE
caused LTFS tape formatting to fail. It looks for attribute
0x1623, and expects to see an error if the attribute isn't present.
In addition, if the residual is negative (indicating an overrun),
only set the CCB status to CAM_DATA_RUN_ERR if we have not already
reported an error. The SCSI sense data will have more detail about
what went wrong.
sys/dev/ocs_fc/ocs_cam.c:
In ocs_scsi_initiator_io_cb(), don't set the status to
CAM_REQ_CMP if the residual is equal to the transfer length.
Also, only set CAM_DATA_RUN_ERR if we didn't get SCSI
status.
Warner Losh [Thu, 30 Sep 2021 03:58:27 +0000 (21:58 -0600)]
bluetooth: Remove stray btccc references
The 3com bluetooth PC Card adapter was removed from the tree when PC
Card support was removed earlier this year. Remove stray references to
it still in the tree.
Warner Losh [Thu, 30 Sep 2021 02:18:28 +0000 (20:18 -0600)]
fd: Move from using device_busy to a refcount
Use refcounting to delay the detach rather than device_busy and/or
device_unbusy. fd/fdc is one of the few consumers of device_busy in the
tree for that, and it's not a good fit. Also, nothing is waking 'fd' and
other drivers don't loop like this. Return EBUSY if we still have active
users.
Warner Losh [Sun, 5 Sep 2021 18:41:16 +0000 (12:41 -0600)]
bluetooth: complete removal of ng_h4
The ng_h4 module was disconnected 13 years ago when the tty later was
locked by Ed. It completely fails to compile, and has a number of false
positives for Giant use. Remove it for lack of interest. Bluetooth has
largely (completely?) moved on from bluetooth over UART transport.
ipfilter: Save time and cycles swapping bucket table sizes
NAT hash tables are inverted for inbound vs outbound. Rather than spend
the time and cycles swapping them, let's simply calculate the bucket
lengths inversely.
LinuxKPI: Allow cdev_pager prefault handler to steal pages
from other vm_objects. This workarounds "Page already inserted" panic
in vm_page_insert routine triggered on attempt to mmap file created
with shmem_file_setup call. After introduction of "GTT mmap
interface v4" a.k.a. MMAP_OFFSET, vm_objects allocated by these calls
may try to own intersected sets of pages that leads to the assertion.
Although drm-kmod contains better implementation which is able to
allocate real entries on pseudofs, this feature has never been used.
Starting from drm-kmod v5.6 old implementation began to leak entries
on each drm device close(). Now just drop pseudofs support instead of
fixing it in drm-kmod and provide stub in base.
The implementation of the progress bar is simple, but duplicated for
most minidump implementations. Extract the common bits to kern_dump.c.
Ensure that the bar is reset with each subsequent dump; this was only
done on some platforms previously.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D31885
The function is identical in each minidump implementation, so move it to
vm_phys.c. The only slight exception is powerpc where the function was
public, for use in moea64_scan_pmap().
Fix error return of kern.ipc.posix_shm_list, which caused it (and thus
"posixshmcontrol ls") to fail for all jails that didn't happen to own
the last shm object in the list.
Andrew Turner [Thu, 23 Sep 2021 15:00:55 +0000 (15:00 +0000)]
Add the arm64 table attributes and use them
Add the table page table attributes on arm64 and use them to add
restrictions to the block and page entries below them. This ensures
we are unable to increase the permissions in these last level entries
without also changing them in the upper levels.
Use the attributes to ensure the kernel can't execute from userspace
memory and vice versa, userspace has no access to read or write kernel
memory, and that the DMAP region is non-executable.
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32081
Warner Losh [Wed, 29 Sep 2021 16:19:51 +0000 (10:19 -0600)]
f00f: We don't need giant to create IDT for workaround.
We don't need to assert we have Giant here. All machines that require
the F00F workaround are UP and interrupts are disabled. Since we are
single threaded, it's safe to allocate the IDT area with pmap_trm_alloc,
interact with the current idt table and replace the IDT table without
any Giant locking.
Warner Losh [Wed, 29 Sep 2021 15:21:17 +0000 (09:21 -0600)]
loader: create separate man pages for each of the loaders
Create a man page per loader. Loader(8) will have information common to
all of them, while loader_${INTERP}(8) will have information relevant to
that specific loader. Rewrite loader(8) to give an overview and point to
the appropriate man page. Rewrite each of the loader_${INTER}(8) man
pages to contain only the relevant information to that loader. Put all
the common commands, environment variables, etc in loader_simp(8) and
refernce that from the loader_lua or loader_4th man pages. The
loader_lua(8) could use more details about the Lua
integration. Additional organization may be benefitial.
sdhci_xenon: split driver file into generic file and fdt parts
This patch splits driver code into two seperate files sdhci_xenon.c
and sdhci_xenon_fdt.c. This will allow future implementation of ACPI
discovery of sdhci on Xenon chips.
Ed Maste [Wed, 29 Sep 2021 13:59:10 +0000 (09:59 -0400)]
mgb: Use MGB_DEBUG instead of DEBUG
The debug register dump routine is not hooked up and is really only
useful to driver developers, so put it under an mgb-specific MGB_DEBUG
rather than general DEBUG.
MFC after: 1 week
Fixes: 8890ab7758b8 ("Introduce if_mgb driver...")
Sponsored by: The FreeBSD Foundation
Use atomic counters to ensure that we correctly track the number of half
open states and syncookie responses in-flight.
This determines if we activate or deactivate syncookies in adaptive
mode.
mmc: Fix regression in 8a8166e5bcfb breaking Stratix 10 boot
The refactoring in 8a8166e5bcfb introduced a functional change that
breaks booting on the Stratix 10, hanging when it should be attaching
da0. Previously OF_getencprop was called with a pointer to host->f_max,
so if it wasn't present then the existing value was left untouched, but
after that commit it will instead clobber the value with 0. The dwmmc
driver, as used on the Stratix 10, sets a default value before calling
mmc_fdt_parse and so was broken by this functional change. It appears
that aw_mmc also does the same thing, so was presumably also broken on
some boards.
Coherent is lower 32bit only by default in Linux and our only default
dma mask is 64bit currently which violates expectations unless
dma_set_coherent_mask() was called explicitly with a different mask.
Implement coherent by creating a second tag, and storing the tags in the
objects and use the tag from the object wherever possible.
This currently does not update the scatterlist or pool (both could be
converted but S/G cannot be MFCed as easily).
There is a 2nd change embedded in the updated logic of
linux_dma_alloc_coherent() to always zero the allocation as
otherwise some drivers get cranky on uninialised garbage.
Sponsored by: The FreeBSD Foundation
MFC after: 7 days
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D32164
mvneta_find_ethernet_prop_switch() is file-local static to
if_mvneta_fdt.c. Normally we would not need a function declararion
but in case MVNETA_DEBUG is set it becomes public. Move the
function declaration from if_mvneta.c to if_mvneta_fdt.c to avoid
a warning during each compile.
Warner Losh [Wed, 29 Sep 2021 03:21:50 +0000 (21:21 -0600)]
nvme: Sanity check completion id
Make sure the completion ID is in the range of [0..num_trackers) since
the values past the end of the act_tr array are never going to be valid
trackers and will lead to pain and suffering if we try to dereference
them to get the tracker or to set the tracker back to NULL as we
complete the I/O.
Warner Losh [Wed, 29 Sep 2021 03:11:17 +0000 (21:11 -0600)]
nvme: Add sanity check for phase on startup.
The proper phase for the qpiar right after reset in the first interrupt
is 1. For it, make sure that we're not still in phase 0. This is an
illegal state to be processing interrupts and indicates that we've
failed to properly protect against a race between initializing our state
and processing interrupts. Modify stat resetting code so it resets the
number of interrpts to 1 instead of 0 so we don't trigger a false
positive panic.
Warner Losh [Wed, 29 Sep 2021 03:09:45 +0000 (21:09 -0600)]
nvme: start qpair in state RECOVERY_WAITING
An interrupt happens on the admin queue right away after the reset, so
as soon as we enable interrupts, we'll get a call to our interrupt
handler. It is safe to ignore this interrupt if we're not yet
initialized, or to process it if we are. If we are initialized, we'll
see there's no completion records and return. If we're not, we'll
process no completion records and return. Either way, nothing is
processed and nothing is lost.
Until we've completely setup the qpair, we need to avoid processing
completion records. Start the qpair in the waiting recovery state so we
return immediately when we try to process completions. The code already
sets it to 'NONE' when we're initialization is complete. It's safe to
defer completion processing here because we don't send any commands
before the initialization of the software state of the qpair is
complete. And even if we were to somehow send a command prior to that
completing, the completion record for that command would be processed
when we send commands to the admin qpair after we've setup the software
state. There's no good central point to add an assert for this last
condition.
This fixes an KASSERT "received completion for unknown cmd" panic on
boot.
Colin Percival [Tue, 28 Sep 2021 18:39:59 +0000 (11:39 -0700)]
loader: Set twiddle globaldiv to 16 by default
Booting FreeBSD on an EC2 c5.xlarge instance, the loader "twiddles"
810 times over the course of 510 ms, a rate of 1.59 kHz. Even accepting
that many systems are slower than this particular VM and will take
longer to boot (especially if using spinning-rust disks), this seems
like an unhelpfully large amount of twiddling when compared to the
~60 Hz frame rate of many displays; printing the twiddles also consumes
roughly 10% of the boot time on the aforementioned VM.
Setting the default globaldiv to 16 dramatically reduces the time spent
printing twiddles to the console while still twiddling at roughly 100
Hz; this should be ample even for systems which take longer to boot and
consequently twiddle slower.
Note that this can adjusted via the twiddle_divisor variable in
loader.conf, but that file is not processed until nearly halfway
through the loader's runtime.