John Baldwin [Tue, 19 Apr 2022 17:43:06 +0000 (10:43 -0700)]
Deprecate the 'devclass' argument from *DRIVER_MODULE() macros.
This argument is useless for the vast majority of drivers. For now,
use __VA_ARGS__ wrapper macros so that the *DRIVER_MODULE()
macros accept both the old version (with a devclass) and the new
version (which omits the argument and stores NULL in the
driver_module_data structure). This provides an API compatibility
shim that can be merged to older stable branches.
Once all drivers relevant to 14.0 (both in and out of tree) have been
updated, the API compat shims can be dropped.
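
The argument-count dispatch described above can be sketched in plain C. This is only an illustration of the __VA_ARGS__ technique, not the kernel macros: demo_module_data, DEMO_MODULE and the helper names are hypothetical, and the real *DRIVER_MODULE() macros take more arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for driver_module_data; not the kernel structure. */
struct demo_module_data {
    const char *name;
    void **devclass;    /* NULL when the argument is omitted */
    int order;
};

/* New form: no devclass argument, NULL is stored. */
#define DEMO_MODULE_NEW(name, order)        { #name, NULL, (order) }
/* Old form: a devclass variable is still accepted. */
#define DEMO_MODULE_OLD(name, dc, order)    { #name, &(dc), (order) }

/* Pick whichever macro name lands in the fourth position. */
#define DEMO_PICK(_1, _2, _3, NAME, ...)    NAME
#define DEMO_MODULE(...) \
    DEMO_PICK(__VA_ARGS__, DEMO_MODULE_OLD, DEMO_MODULE_NEW, 0)(__VA_ARGS__)

static void *foo_devclass;
static const struct demo_module_data foo_old = DEMO_MODULE(foo, foo_devclass, 1);
static const struct demo_module_data foo_new = DEMO_MODULE(foo, 2);

/* Returns 1 when the entry was declared with the old, devclass-carrying form. */
int
demo_uses_devclass(const struct demo_module_data *d)
{
    assert(d != NULL);
    return (d->devclass != NULL);
}
```

With three arguments the old macro is selected; with two, the new one is selected and NULL is stored, so both spellings compile against the same wrapper.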
Rick Macklem [Tue, 3 May 2022 14:22:15 +0000 (07:22 -0700)]
nfscl: Acquire a refcount on "cred" for mirrored pNFS RPCs
When the NFSv4.1/4.2 client is doing a pnfs mount to
mirrored DS(s), asynchronous threads are used to do the
RPCs against the DS(s) concurrently. If a DS is slow
to reply, it is possible for the "cred" to be freed
before the asynchronous thread is done with it, causing
a panic/crash.
This patch fixes the problem by acquiring a refcount on
the "cred" while it is being used by the asynchronous thread
for a DS RPC. This bug was found during a recent IETF
NFSv4 testing event.
This bug only affects "pnfs" mounts to mirrored pNFS
servers.
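
A minimal user-space sketch of the hold/release pattern described above. The demo_cred type and all function names are hypothetical, and the counter is not atomic as it would have to be with real threads; in the kernel the corresponding primitives are crhold(9) and crfree(9), though this log does not state which calls the patch uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical credential object; the kernel's struct ucred is richer. */
struct demo_cred {
    int refs;
    int uid;
};

static int demo_creds_freed;    /* counts real frees, for illustration */

struct demo_cred *
demo_cred_new(int uid)
{
    struct demo_cred *c = malloc(sizeof(*c));

    assert(c != NULL);
    c->refs = 1;
    c->uid = uid;
    return (c);
}

/* Take an extra reference before handing the cred to another thread. */
struct demo_cred *
demo_cred_hold(struct demo_cred *c)
{
    c->refs++;    /* a real implementation needs an atomic increment */
    return (c);
}

/* Drop a reference; the last one out frees the object. */
void
demo_cred_free(struct demo_cred *c)
{
    assert(c->refs > 0);
    if (--c->refs == 0) {
        free(c);
        demo_creds_freed++;
    }
}

/* Simulate a slow DS: the caller finishes first, the worker finishes late. */
int
demo_slow_ds_rpc(int uid)
{
    struct demo_cred *caller = demo_cred_new(uid);
    struct demo_cred *worker = demo_cred_hold(caller);
    int seen;

    demo_cred_free(caller);     /* caller is done; the cred must survive */
    seen = worker->uid;         /* still valid thanks to the hold */
    demo_cred_free(worker);     /* last reference, the object is freed */
    return (seen);
}
```

Without the hold, the caller's release would free the object while the worker still dereferences it, which is the use-after-free described above.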
Rick Macklem [Mon, 2 May 2022 19:45:42 +0000 (12:45 -0700)]
nfsd: Fix session slot freeing for NFSv4.1/4.2
Without this patch the NFSv4.1/4.2 server erroneously
always frees session slot zero for callbacks. This only
affects 4.1/4.2 mounts if the server has delegations
enabled or is a pNFS configuration. Even for those
cases, the effect is mainly to only use slot 0 for
callbacks, serializing all of them. There is a slight
chance that callbacks will fail if the client performs
them in a different order than received on the TCP
connection.
If this bug affects your server, you will see console
messages like:
newnfs_request: Bad session slot
This patch fixes the problem. Found during a recent
IETF NFSv4 testing event.
This (almost) finishes the clock implementation for the RK3328 SoC.
The clocks are now correctly implemented, respecting the clock hierarchy.
The missing clocks are mostly the DDR clocks; implementing those is only
useful for debugging as we will never set them in the kernel.
The ARMCLK still needs to be rewritten so it looks closer to how the
hardware is done.
Some clocks in the RK3328 SoC (and possibly others) have registers not in
the CLKSEL_CON range. Add a macro for muxes that live outside the
CLKSEL_CON range; it just takes a raw offset.
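
The difference between an indexed mux and a raw-offset mux might look like the sketch below. The structure, the macro names, the 0x100 base and the sample offsets are all assumptions for illustration, not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout: CLKSEL_CON(n) registers are 4 bytes apart. */
#define DEMO_CLKSEL_CON(idx)    (0x100u + (uint32_t)(idx) * 4u)

struct demo_mux {
    const char *name;
    uint32_t offset;    /* register offset inside the clock unit */
    uint32_t shift;     /* position of the selector field */
    uint32_t width;     /* selector width in bits */
};

/* Mux addressed by CLKSEL_CON index. */
#define DEMO_MUX(_name, _idx, _shift, _width) \
    { (_name), DEMO_CLKSEL_CON(_idx), (_shift), (_width) }
/* Mux living outside the CLKSEL_CON range: the offset is passed through. */
#define DEMO_MUXRAW(_name, _offset, _shift, _width) \
    { (_name), (_offset), (_shift), (_width) }

static const struct demo_mux demo_muxes[] = {
    DEMO_MUX("demo_clk_a", 2, 8, 2),
    DEMO_MUXRAW("demo_clk_b", 0x0b0u, 0, 1),
};

/* Extract the currently selected parent index from a register value. */
uint32_t
demo_mux_read(const struct demo_mux *m, uint32_t regval)
{
    assert(m->width > 0 && m->width < 32);
    return ((regval >> m->shift) & ((1u << m->width) - 1u));
}
```

Both macros fill the same structure, so the rest of the mux code does not need to know how the offset was obtained.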
Corvin Köhne [Tue, 3 May 2022 14:01:22 +0000 (16:01 +0200)]
bsdinstall/script: umount before zpool export
When running zpool export first, boot/efi and dev are still mounted, so
zpool export fails. By running bsdinstall umount first the pool can be
cleanly exported.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D35114
Sponsored by: Beckhoff Automation GmbH & Co. KG
MFC After: 3 days
Corvin Köhne [Tue, 3 May 2022 14:00:09 +0000 (16:00 +0200)]
bsdinstall: stop messing with file descriptors
Throughout the bsdinstall script fd 3 is used by f_dprintf (set through
$TERMINAL_STDOUT_PASSTHRU). By closing file descriptor 3 here, the
final f_dprintf "Installation Completed ..." does not work anymore.
By putting the code into a subshell, file descriptors can be edited
without interference with the calling script.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D35113
Sponsored by: Beckhoff Automation GmbH & Co. KG
MFC after: 3 days
At the moment, writes to BAR registers that aren't 4 byte aligned are
ignored. So, there's no overflow yet. Nevertheless, if this behaviour
changes in the future, it could unintentionally introduce a buffer
overflow. Additionally, some compilers or tools will detect this
potential overflow and complain about it.
Reviewed by: markj
Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com>
Reported-by: Andy Fiddaman <andy@omniosce.org>
Differential Revision: https://reviews.freebsd.org/D34689
(cherry picked from commit 45ddbf211274eb28c0ccd0042640de57015dd390)
pci_parse_legacy_config splits the options string by comma characters.
strchr returns a pointer to the first occurrence of a character. In that
case, it's a comma. So, pci_parse_legacy_config will stop at the first
character and create a new config node with a name of NULL.
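
The strchr behaviour is easy to reproduce outside bhyve. The helpers below are hypothetical and are not the pci_parse_legacy_config code; they only show that a leading comma yields an empty first token, and one way a splitter could skip such tokens.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the first ',' found by strchr is the very first character. */
int
demo_starts_with_separator(const char *opts)
{
    assert(opts != NULL);
    return (strchr(opts, ',') == opts);
}

/* Count comma-separated options, ignoring empty ones. */
size_t
demo_count_options(const char *opts)
{
    size_t n = 0;

    assert(opts != NULL);
    while (*opts != '\0') {
        const char *comma = strchr(opts, ',');
        size_t len = (comma != NULL) ? (size_t)(comma - opts) : strlen(opts);

        if (len > 0)
            n++;            /* only non-empty names become options */
        if (comma == NULL)
            break;
        opts = comma + 1;   /* continue after the separator */
    }
    return (n);
}
```

A zero-length token is exactly the case where a naive splitter would create a node without a usable name.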
It simplifies the declaration of the driver structures a little. There
are no current consumers of this macro, in fact it looks like it was
added for exactly this purpose.
This decreases the scope of some variables, so rework the initialization
in vt_init_logos() such that it doesn't require them.
This device is present on the Allwinner D1-based SoCs. Without this
driver, the watchdog timeout will trigger a reset a few seconds after
control is given to the kernel.
Milan Obuch [Thu, 7 Apr 2022 13:04:18 +0000 (10:04 -0300)]
cgem: support SGMII PHY connection mode
As the PolarFire SoC needs SGMII to connect the PHY, check the
'phy-mode' property of device tree node for ethernet and act on it
appropriately.
Add the compatible strings for the PolarFire SoC device tree.
'microchip,mpfs-mss-gem' is not officially documented but has been
observed in the available firmware for this platform, so it is included
for now.
Milan Obuch [Thu, 7 Apr 2022 12:57:25 +0000 (09:57 -0300)]
cgem: rework hardware quirk detection
Rather than doing these checks based on the detected hardware variant, allow
quirks to be specified as a set of flags for each compatible string.
This simplifies adding support for new compatible hardware.
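
A rough sketch of per-compatible-string quirk flags follows. The compatible strings, quirk names and table layout are invented for illustration; the driver's actual table and flag set are not shown in this log.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical quirk flags. */
#define DEMO_QUIRK_NONE         0x0
#define DEMO_QUIRK_RX_HANG      0x1
#define DEMO_QUIRK_NEEDS_SGMII  0x2

struct demo_compat {
    const char *compat;     /* device tree compatible string */
    int quirks;             /* bitwise OR of DEMO_QUIRK_* */
};

/* Adding new hardware means adding one row here. */
static const struct demo_compat demo_compat_table[] = {
    { "example,gem-v1",    DEMO_QUIRK_RX_HANG },
    { "example,gem-v2",    DEMO_QUIRK_NONE },
    { "example,gem-sgmii", DEMO_QUIRK_NEEDS_SGMII },
    { NULL, 0 }
};

/* Return the quirk flags for a compatible string, or -1 if unknown. */
int
demo_lookup_quirks(const char *compat)
{
    const struct demo_compat *c;

    assert(compat != NULL);
    for (c = demo_compat_table; c->compat != NULL; c++) {
        if (strcmp(c->compat, compat) == 0)
            return (c->quirks);
    }
    return (-1);
}
```

The attach path can then test individual bits of the returned value instead of branching on a detected hardware variant.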
All files are now created relative to savedirfd, e.g. with openat(2).
Therefore, we do not need character buffers to be PATH_MAX bytes long,
just long enough to hold the complete filename. 32 bytes is long enough
in all cases. These can be allocated on the stack.
While here, fix an error message that attempts to use an uninitialized
infoname.
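
As a sketch of the buffer-size argument: when a file is opened relative to a directory descriptor, only the bare filename has to fit in the buffer. The function names and the "prefix.N" naming are assumptions for illustration; only openat(2) and the 32-byte bound come from the text above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

#define DEMO_NAME_MAX   32  /* the 32-byte bound quoted above */

/* Format "<prefix>.<bounds>"; returns the length, or -1 if it does not fit. */
int
demo_format_name(char *buf, size_t len, const char *prefix, int bounds)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "%s.%d", prefix, bounds);
    return ((n < 0 || (size_t)n >= len) ? -1 : n);
}

/* Create the file relative to an already-open directory descriptor. */
int
demo_create_in_dir(int dirfd, const char *prefix, int bounds)
{
    char name[DEMO_NAME_MAX];   /* stack buffer, no PATH_MAX needed */

    if (demo_format_name(name, sizeof(name), prefix, bounds) < 0)
        return (-1);
    return (openat(dirfd, name, O_WRONLY | O_CREAT | O_TRUNC, 0600));
}
```

Checking the snprintf return value keeps a too-long name from being silently truncated before it reaches openat(2).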
Reviewed by: markj
MFC after: 3 days
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34821
Emmanuel Vadot [Mon, 21 Feb 2022 17:31:00 +0000 (18:31 +0100)]
files: Make mmc_helper depend on gpio
mmc_helper has a hard dependency on gpio_if.h.
gpio(4) isn't in the default x86 kernel and none of the x86
sd/mmc drivers uses mmc_helper so just add a dependency on gpio.
Corvin Köhne [Thu, 10 Mar 2022 10:30:17 +0000 (11:30 +0100)]
bhyve/usage: memory size is not in MB
For backward compatibility, the memory size will be interpreted in MB if
it's smaller than 1 MB and has no suffix. Nowadays, the -m switch accepts
more than just MB. Respect it in the usage message.
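
The backward-compatibility rule can be sketched as follows. This is not bhyve's parser: the function name and the limited K/M/G suffix handling are assumptions; only the "no suffix and below 1 MB means MB" rule comes from the message above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MB (1024ULL * 1024ULL)

/* Parse a memory size; returns 0 on a malformed string. */
uint64_t
demo_parse_memsize(const char *opt)
{
    char *end;
    unsigned long long val;

    assert(opt != NULL);
    val = strtoull(opt, &end, 0);
    if (end == opt)
        return (0);             /* no digits at all */
    if (*end == '\0') {
        /* No suffix: small values are taken as MB for compatibility. */
        return ((val < DEMO_MB) ? val * DEMO_MB : val);
    }
    if (end[1] != '\0')
        return (0);             /* trailing garbage after the suffix */
    switch (*end) {
    case 'k': case 'K':
        return (val << 10);
    case 'm': case 'M':
        return (val << 20);
    case 'g': case 'G':
        return (val << 30);
    default:
        return (0);
    }
}
```

Under this rule "512" and "512M" give the same result, while "2097152" is taken as a byte count.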
Corvin Köhne [Thu, 10 Mar 2022 10:28:06 +0000 (11:28 +0100)]
bhyve: add ROM emulation
Some PCI devices, especially GPUs, require a ROM to work properly.
The ROM is executed by boot firmware to initialize the device.
To add a ROM to a device use the new ROM option for passthru device
(e.g. -s passthru,0/2/0,rom=<path>/<to>/<rom>).
It's necessary that the ROM is executed by the boot firmware.
It won't be executed by any OS.
Additionally, the boot firmware should be configured to execute the
ROM file.
For that reason, it's only possible to use a ROM when using
OVMF with enabled bus enumeration.
Corvin Köhne [Thu, 10 Mar 2022 10:26:19 +0000 (11:26 +0100)]
bhyve: export funcs for read/write pci config
Export functions for reading and writing the pci config space from passthru
device to be used by other devices.
This is required for lpc devices to set their vendor/device ids to their
physical values.
Otherwise, GPU passthrough for integrated Intel GPUs won't work properly.
Writes to the PWREN register should be done in update_ios based
on the power_mode value in the ios struct.
Also, neither the manuals (RockChip and Altera) nor Linux mention
the need for an inverted PWREN value, so just remove this.
This fixes eMMC (and possibly SD) when u-boot didn't set up the controller.
When using SDIO, the block size is per function and most of the time
not equal to MMC_SECTOR_SIZE. Fix SDIO on dwmmc by setting the correct
block size in the mmc registers.
MFC after: 1 month
Sponsored by: Diablotin Systems
Andrew Turner [Fri, 29 Apr 2022 12:02:15 +0000 (13:02 +0100)]
Map the ACPI tables into the DMAP
When we try to load these tables via acpidump(8) we need them to be in
the DMAP for /dev/mem to access. Add the EFI ACPI reclaim memory type
to the list of memory we map into DMAP but not used by the kernel as
this is the recommended place to put these.
Eugene Grosbein [Sun, 1 May 2022 16:34:08 +0000 (23:34 +0700)]
ng_pppoe: introduce new sysctl net.graph.pppoe.lcp_pcp
The new sysctl allows marking transmitted PPPoE LCP Control
Ethernet frames with the needed 3-bit Priority Code Point (PCP) value.
A conforming driver like if_vlan(4) uses the value to fill the
IEEE 802.1p class of service field.
This is similar to Cisco IOS "control-packets vlan cos priority"
command.
It helps to avoid premature disconnection of user sessions
due to control frame drops (LCP Echo etc.)
if network infrastructure has a bottleneck at a switch
or the xdsl DSLAM.
See also:
https://sourceforge.net/p/mpd/discussion/44692/thread/c7abe70e3a/
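
For reference, the PCP value occupies the top three bits of the 16-bit 802.1Q Tag Control Information field, followed by one DEI bit and a 12-bit VLAN ID. The helpers below are a stand-alone sketch with hypothetical names, not the ng_pppoe or if_vlan code.

```c
#include <assert.h>
#include <stdint.h>

/* Build the TCI field: PCP (3 bits), DEI (1 bit, left at 0), VID (12 bits). */
uint16_t
demo_make_tci(unsigned pcp, unsigned vid)
{
    assert(pcp <= 7 && vid <= 0x0fff);
    return ((uint16_t)((pcp << 13) | vid));
}

/* Extract the PCP bits back out of a TCI value. */
uint8_t
demo_tci_pcp(uint16_t tci)
{
    return ((uint8_t)(tci >> 13));
}
```

For example, PCP 7 on VLAN 100 gives a TCI of 0xE064.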
Kevin Bowling [Thu, 12 May 2022 15:38:09 +0000 (08:38 -0700)]
e1000: Increase rx_buffer_size to 32b
Extend the size of the local rx_buffer_size variable to account for
larger buffer sizes possible on 82580, i350 chips.
From i350 datasheet, 6.2.10 Initialization Control 4 (LAN Base Address
+ Offset 0x13):
When 4 ports are enabled maximum buffer size is 36 KB. When 2 ports are
enabled maximum buffer size is 72 KB. When only a single port is
enabled maximum buffer size is 144 KB.
and 8.3:
The overall available internal buffer size in the I350 for all ports is
144 KB for receive buffers and 80 KB for transmit Buffers. Disabled
ports memory can be shared between active ports and sharing can be
asymmetric. The default buffer size for each port is loaded from the
EEPROM on initialization.
From the reporter:
But for I350 when only 2 ports are used PBA size can be set as 72KB
(see datasheet RXPbsize or e1000_rxpbs_adjust_82580 function in
e1000_82575.c). In this case calculating the rx_buffer_size overflows
as 0x0048 << 10 = 73728 or 0x12000 pushed into u16. It is then set as
0x2000 or 8192.
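
The reporter's arithmetic can be checked with a few lines of C. The function names are hypothetical; the shift by 10 and the 0x0048 value are taken from the report above.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: the byte count is squeezed into 16 bits and wraps. */
uint16_t
demo_rx_buffer_size_u16(uint32_t pba_kb)
{
    assert(pba_kb <= 0xffff);
    return ((uint16_t)(pba_kb << 10));
}

/* New behaviour: a 32-bit variable holds the full byte count. */
uint32_t
demo_rx_buffer_size_u32(uint32_t pba_kb)
{
    assert(pba_kb <= 0xffff);
    return (pba_kb << 10);
}
```

A 36 KB buffer (0x24) still fits in 16 bits, which is presumably why the overflow only showed up with the larger 72 KB and 144 KB configurations.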
Scott Long [Sun, 27 Feb 2022 01:29:08 +0000 (18:29 -0700)]
Default to always accepting the PHY that's present. Linux did
something similar a while back, and there are devices in the wild
that otherwise won't attach. This patch is temporary until the
PHY code is further cleared up.
Kristof Provost [Sat, 7 May 2022 15:15:34 +0000 (17:15 +0200)]
epair: unbind prior to returning to userspace
If 'options RSS' is set we bind the epair tasks to different CPUs. We
must take care to not keep the current thread bound to the last CPU when
we return to userspace.
MFC after: 1 week
Sponsored by: Orange Business Services
John Baldwin [Wed, 4 May 2022 20:08:36 +0000 (13:08 -0700)]
OpenSSL: KTLS: Enable KTLS for receiving as well in TLS 1.3
This removes a guard condition that prevents KTLS being enabled for
receiving in TLS 1.3. Use the correct sequence number and BIO for
receive vs transmit offload.
John Baldwin [Wed, 4 May 2022 20:08:27 +0000 (13:08 -0700)]
OpenSSL: KTLS: Handle TLS 1.3 in ssl3_get_record.
- Don't unpad records, check the outer record type, or extract the
inner record type from TLS 1.3 records handled by the kernel. KTLS
performs all of these steps and returns the inner record type in the
TLS header.
- When checking the length of a received TLS 1.3 record don't allow
for the extra byte for the nested record type when KTLS is used.
- Pass a pointer to the record type in the TLS header to the
SSL3_RT_INNER_CONTENT_TYPE message callback. For KTLS, the old
pointer pointed to the last byte of payload rather than the record
type. For the non-KTLS case, the TLS header has been updated with
the inner type before this callback is invoked.
John Baldwin [Wed, 4 May 2022 20:08:17 +0000 (13:08 -0700)]
OpenSSL: KTLS: Add using_ktls helper variable in ssl3_get_record().
When KTLS receive is enabled, pending data may still be present due to
read ahead. This data must still be processed the same as records
received without KTLS. To ease readability (especially in
consideration of additional checks which will be added for TLS 1.3),
add a helper variable 'using_ktls' that is true when the KTLS receive
path is being used to receive a record.
John Baldwin [Wed, 4 May 2022 20:08:03 +0000 (13:08 -0700)]
OpenSSL: KTLS: Check for unprocessed receive records in ktls_configure_crypto.
KTLS implementations currently assume that the start of the in-kernel
socket buffer is aligned with the start of a TLS record for the
receive side. The socket option to enable KTLS specifies the TLS
sequence number of this initial record.
When read ahead is enabled, data can be pending in the SSL read buffer
after negotiating session keys. This pending data must be examined to
ensure that the kernel's socket buffer does not contain a partial TLS
record as well as to determine the correct sequence number of the
first TLS record to be processed by the kernel.
In preparation for enabling receive kernel offload for TLS 1.3, move
the existing logic to handle read ahead from t1_enc.c into ktls.c and
invoke it from ktls_configure_crypto().
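
The examination of pending read-ahead data can be sketched as a walk over TLS record headers (1 byte type, 2 bytes version, 2 bytes length). This is not the OpenSSL code; the function and buffer names are hypothetical. Each complete pending record would advance the sequence number handed to the kernel by one, and a trailing partial record is reported as an error.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TLS_HDR_LEN    5   /* type(1) + version(2) + length(2) */

/*
 * Count the complete TLS records in a read-ahead buffer.
 * Returns -1 if the buffer ends in the middle of a record.
 */
int
demo_count_pending_records(const unsigned char *buf, size_t len)
{
    int count = 0;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        size_t reclen;

        if (len < DEMO_TLS_HDR_LEN)
            return (-1);            /* partial header */
        reclen = ((size_t)buf[3] << 8) | buf[4];
        if (len - DEMO_TLS_HDR_LEN < reclen)
            return (-1);            /* partial payload */
        buf += DEMO_TLS_HDR_LEN + reclen;
        len -= DEMO_TLS_HDR_LEN + reclen;
        count++;
    }
    return (count);
}

/* Sample buffer: two application-data records with tiny payloads. */
static const unsigned char demo_two_records[] = {
    0x17, 0x03, 0x03, 0x00, 0x02, 0xaa, 0xbb,
    0x17, 0x03, 0x03, 0x00, 0x01, 0xcc
};
```

A return of -1 corresponds to the case where the rest of a record is still in the kernel's socket buffer, so the buffer would not start on a record boundary.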
John Baldwin [Wed, 4 May 2022 20:07:36 +0000 (13:07 -0700)]
OpenSSL: Cleanup record length checks for KTLS
In some corner cases the check for packets
which exceed the allowed record length was missing
when KTLS is initially enabled, when some
unprocessed packets are still pending.
John Baldwin [Mon, 18 Apr 2022 19:04:30 +0000 (12:04 -0700)]
destroy_dev_sched*: Don't hold Giant for all deferred destroy_dev.
Rather than using taskqueue_swi_giant which holds Giant for all
deferred destroy_dev calls, create a separate queue for destroyed
devices with D_NEEDGIANT set in the corresponding cdevsw. The task
for this queue holds Giant while destroying deferred devices while the
task for the default queue does not hold Giant.
In addition, switch to taskqueue_thread for destroy_dev_sched.
Deferred destroy_dev requests don't need to run at an SWI priority.
John Baldwin [Mon, 18 Apr 2022 21:09:20 +0000 (14:09 -0700)]
arm ti_mbox_attach: Write sysconfig to TI_MBOX_SYSCONFIG to request reset.
This variable was flagged with a set-but-unused warning as its value was
read from a register and then modified to set a bit
(TI_MBOX_SYSCONFIG_SOFTRST). After the variable is modified, the code
then loops waiting for the SOFTRST bit to go clear in the
TI_MBOX_SYSCONFIG register. Presumably merely reading from the
register does not request a reset as other places in the driver read
this register, so most likely the updated value of sysconfig setting
the reset bit is supposed to be written to the register to request a
reset before the polling loop that waits for the reset to finish.
John Baldwin [Mon, 18 Apr 2022 19:06:52 +0000 (12:06 -0700)]
ata_kauai: Fix support for "shasta" controllers.
The probe routine was setting a value in the softc, but since the
probe routine was not returning zero, this value was lost since the
softc was reallocated (and re-zeroed) when the device was attached.
This is similar in nature to the fixes from 965205eb66cae3fd5de75a70a3aef2f014f98020.
To fix, move the code to set the 'shasta' flag to the start of attach
along with related code to set an IRQ resource on some non-shasta
devices. The IRQ resource still "worked" being in the probe routine
as the IRQ resource persisted after probe returned, but it is cleaner
to go ahead and move it to attach after setting the 'shasta' flag.
I have no way to test this, but noticed this while reading the code.
John Baldwin [Tue, 12 Apr 2022 21:58:59 +0000 (14:58 -0700)]
netgraph: Remove the rethook parameter from NG_NODE_FOREACH_HOOK.
This parameter was set to the hook that terminated the iteration
early. However, none of the remaining callers used this argument and
it was always set to an otherwise-unused variable.
John Baldwin [Tue, 12 Apr 2022 17:05:39 +0000 (10:05 -0700)]
x86: Remove silly checks for <sys/cdefs.h>.
These headers #include <sys/cdefs.h> right after checking if it has
already been #included. The nested #include already existed when the
check for _SYS_CDEFS_H_ was added, so the check shouldn't have been
added in the first place.