Mark Johnston [Thu, 21 Jan 2021 19:30:18 +0000 (14:30 -0500)]
Define PNP info after defining driver modules
PNP info definitions currently have an unfortunate requirement in that
they must follow the associated module definition in the module metadata
linker set. Otherwise devmatch can segfault while processing the linker
hints file since kldxref maintains the order in the linker set.
A number of drivers violate this requirement. In some cases this can
cause devmatch(8) to segfault when processing the linker hints file.
Work around the problem for now simply by adjusting the drivers.
Andrew Gallatin [Thu, 21 Jan 2021 14:45:15 +0000 (09:45 -0500)]
iflib: Fix a NULL pointer deref
rxd_frag_to_sd() can be called with a NULL pf_rv parameter in the current
code. This patch fixes the NULL pointer dereference in that
case, thus avoiding a possible panic.
elf: add some definitions for i386 and amd64 relocations
I believe that rtld does not need to implement them, they are mostly for
the static linker. 'Mostly' because for amd64 our kernel linker loads
object files, and amd64 relocation types could be observed.
Defines were taken from glibc sources.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28205
Alexander Motin [Thu, 21 Jan 2021 02:33:14 +0000 (21:33 -0500)]
Remove FirstBurstLength limit for software iSCSI.
For hardware offload solicited data may potentially be handled more
efficiently than unsolicited due to direct data placement. Or there
can be some unsolicited write buffering limitations. It may create
situations where FirstBurstLength limit is really useful.
The software driver, though, has none of those factors: it has to do a memory
copy anyway and has no such hard limit on temporary storage. At the same
time, more active use of unsolicited transfers avoids some
of the Ready To Transfer (R2T) PDU round-trip times and processing.
This change effectively doubles, from 64KB to 128KB, the maximum size
of a write command that can be transferred within one link RTT. Tests
of (64KB, 128KB] QD1 writes mixed with simultaneous QD8 reads over
the same connection, increasing RTT, show almost double the write speed
and half the latency, while we should be able to afford a few megabytes of
RAM for additional buffering on a target these days.
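The round-trip argument can be sketched with a toy model. It assumes that a write no larger than the negotiated first-burst limit travels as unsolicited data with the command, and that anything larger needs at least one R2T exchange first. The function name and parameters are hypothetical and are not taken from the iSCSI code.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (an assumption, not the driver's logic): a write needs an
 * R2T round trip only if it does not fit into the unsolicited first burst.
 */
bool
write_needs_r2t(size_t write_len, size_t first_burst_len)
{
	return (write_len > first_burst_len);
}
```

Under this model a 96KB write needs an R2T with a 64KB limit but not with a 128KB one, which is the (64KB, 128KB] range the tests above exercise.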
Kyle Evans [Mon, 18 Jan 2021 20:11:58 +0000 (14:11 -0600)]
pkgbase: allow update-packages for first-run of packaging
If ${REPODIR}/${PKG_ABI} does not exist when we begin real-update-packages,
skip the comparison with the non-existent previous repository and just
finish the repo off. This allows external scripts to just assume they can
run `update-packages` rather than figuring out if they'd previously run
`packages` for this Version/Arch combo.
PKG_VERSION_FROM_DIR was added so that we could perhaps detect the three
distinct cases:
1.) If the repo has not yet been created, PKG_VERSION_FROM_DIR will be
empty.
2.) If the repo is in some intermediate state between created and fully
initialized, PKG_VERSION_FROM_DIR may point to the ABI directory.
3.) If the repo is fully initialized, then PKG_VERSION_FROM_DIR points to
the latest build to compare to.
Option #2 is explicitly unhandled at the moment, but this is no different
than it was before.
Reviewed-by: manu
Differential-Revision: https://reviews.freebsd.org/D28229
Jessica Clarke [Thu, 21 Jan 2021 02:14:41 +0000 (02:14 +0000)]
virtio_mmio: Delete a stale #if 0'ed debug print
This was blindly moved in r360722 but the variable being printed is not
yet initialised. It's of little use and can easily be added back in the
right place if needed by someone debugging, so just delete it.
Jessica Clarke [Thu, 21 Jan 2021 01:54:52 +0000 (01:54 +0000)]
linux64: Don't pass unnecessary -S and -g to objcopy
Since we use --input-type binary these options are rather meaningless. Both
binutils and elftoolchain ignore the option in this case, but LLVM does not,
and instead strips all symbols from the output file, causing missing symbols at
run time if building with llvm-objcopy. Thus simply remove the options; the
linux module has never included them for building its VDSO (added in r283407),
but for some reason the original commit of linux64 (r283424) added them.
These should however eventually be changed to use template assembly files as is
now done for firmware and MFS_IMAGE.
Jessica Clarke [Thu, 21 Jan 2021 01:54:12 +0000 (01:54 +0000)]
Rename i386's Linux ELF to Linux ELF32
This is what amd64 calls the i386 Linux ABI in order to distinguish it
from the amd64 Linux ABI, and matches the nomenclature used for the
FreeBSD ABIs where they always have the size suffix in the name.
Jessica Clarke [Thu, 21 Jan 2021 01:21:35 +0000 (01:21 +0000)]
Build VirtIO modules on all architectures
Currently only amd64, i386 and powerpc build VirtIO modules, yet all other
architectures have at least one kernel configuration that includes the
transport drivers, and so they lack drivers for all the devices they don't
statically compile into the kernel. Instead, enable the build everywhere so all
architectures have the full set of device drivers available.
Jessica Clarke [Thu, 21 Jan 2021 01:07:23 +0000 (01:07 +0000)]
virtio: Reduce boilerplate for device driver module definitions
Rather than have every device register itself for both virtio_pci and
virtio_mmio, provide a VIRTIO_DRIVER_MODULE wrapper to declare both,
merge VIRTIO_SIMPLE_PNPTABLE with VIRTIO_SIMPLE_PNPINFO and make the
latter register for both buses. This also has the benefit of abstracting
away the available transports and their names.
We must check MagicValue not just Version before anything else, and then
we must check DeviceID and immediately abort if zero (and this must not
be an error).
Do all this when probing rather than at the start of attaching as that's
where this belongs, and provides a clear boundary between the device
detection and device initialisation parts of the specified driver
initialisation process. This also means we don't create empty device
instances for placeholder devices, reducing clutter on systems that
pre-allocate a large number, such as QEMU's AArch64 virt machine (which
provides 32).
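A minimal sketch of the probe order described above, assuming the register values have already been read into a plain struct. The struct, enum and function names are hypothetical; the magic value 0x74726976 ("virt") and versions 1 (legacy) and 2 are the ones the VirtIO MMIO transport defines.

```c
#include <stdint.h>

#define EX_VIRTIO_MMIO_MAGIC	0x74726976u	/* "virt", little endian */

/* Hypothetical snapshot of the first MMIO registers. */
struct ex_mmio_regs {
	uint32_t magic;
	uint32_t version;
	uint32_t device_id;
};

enum ex_probe_result {
	EX_PROBE_OK,		/* a real device, go on to attach */
	EX_PROBE_NOT_VIRTIO,	/* wrong magic, not our device */
	EX_PROBE_BAD_VERSION,	/* unknown transport version */
	EX_PROBE_PLACEHOLDER	/* DeviceID 0: skip quietly, not an error */
};

enum ex_probe_result
ex_mmio_probe(const struct ex_mmio_regs *r)
{
	/* MagicValue first, before trusting any other register. */
	if (r->magic != EX_VIRTIO_MMIO_MAGIC)
		return (EX_PROBE_NOT_VIRTIO);
	if (r->version != 1 && r->version != 2)
		return (EX_PROBE_BAD_VERSION);
	/* A zero DeviceID marks an empty slot. */
	if (r->device_id == 0)
		return (EX_PROBE_PLACEHOLDER);
	return (EX_PROBE_OK);
}
```

Returning a distinct placeholder result lets the caller decline the device at probe time without logging an error or creating an empty instance.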
John Baldwin [Thu, 21 Jan 2021 00:37:55 +0000 (16:37 -0800)]
Restructure the crypto(7) manpage and add authentication algorithms.
Add separate sections for authentication algorithms, block ciphers,
stream ciphers, and AEAD algorithms. Describe properties common to
algorithms in each section to avoid duplication.
Use flat tables to list algorithm properties rather than nested
tables.
John Baldwin [Thu, 21 Jan 2021 00:33:34 +0000 (16:33 -0800)]
Simplify dynamic ipfilter sysctls.
Pass the structure offset in arg2 instead of arg1. This avoids
having to undo the pointer arithmetic on arg1. Instead arg2 can
be used directly as an offset relative to the desired structure.
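The idea can be illustrated outside the kernel roughly as follows. The structure and its fields are hypothetical stand-ins, and the helper is not the ipfilter handler; it only shows a field being fetched from a base pointer plus an offset passed as an integer argument.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-instance state; field names are illustrative only. */
struct ex_softc {
	int		flags;
	unsigned int	log_level;
	unsigned int	max_states;
};

/*
 * Read an unsigned int tunable given the instance and the field offset
 * (the role arg2 plays in the description above).
 */
unsigned int
ex_read_tunable(const struct ex_softc *sc, intptr_t off)
{
	unsigned int val;

	memcpy(&val, (const unsigned char *)sc + off, sizeof(val));
	return (val);
}
```

A caller would register each tunable with something like offsetof(struct ex_softc, max_states) as the offset, so the same handler works for any instance of the structure.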
Jamie Gritton [Wed, 20 Jan 2021 23:08:27 +0000 (15:08 -0800)]
jail: Use refcount(9) for prison references.
Use refcount(9) for both pr_ref and pr_uref in struct prison. This
allows prisons to be held and freed without requiring the prison mutex.
An exception to this is that dropping the last reference will still
lock the prison, to keep the guarantee that a locked prison remains
valid and alive (provided it was at the time it was locked).
Among other things, this honors the promise made in a comment in
crcopy(9), that it will not block, which hasn't been true for two
decades.
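The locking rule can be sketched in userland C11 as below. This is an illustration under assumptions, not the jail code: the object, the spin flag standing in for the prison mutex, and all function names are hypothetical. Only the drop of the last reference takes the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object; "lock" stands in for the prison mutex. */
struct ex_obj {
	atomic_uint	ref;
	atomic_flag	lock;
	unsigned int	lock_taken;	/* how often the lock was needed */
	bool		alive;
};

void
ex_obj_init(struct ex_obj *o)
{
	atomic_init(&o->ref, 1);
	atomic_flag_clear(&o->lock);
	o->lock_taken = 0;
	o->alive = true;
}

/* Taking a reference never needs the lock. */
void
ex_obj_hold(struct ex_obj *o)
{
	atomic_fetch_add(&o->ref, 1);
}

/* Drop one reference unless it is the last one; true if dropped. */
static bool
ex_release_if_not_last(struct ex_obj *o)
{
	unsigned int old = atomic_load(&o->ref);

	while (old > 1) {
		if (atomic_compare_exchange_weak(&o->ref, &old, old - 1))
			return (true);
	}
	return (false);
}

/* Returns true when this call dropped the last reference. */
bool
ex_obj_free(struct ex_obj *o)
{
	bool last;

	if (ex_release_if_not_last(o))
		return (false);		/* lock-free fast path */
	while (atomic_flag_test_and_set(&o->lock))
		;			/* spin: slow path only */
	o->lock_taken++;
	last = atomic_fetch_sub(&o->ref, 1) == 1;
	if (last)
		o->alive = false;	/* teardown happens under the lock */
	atomic_flag_clear(&o->lock);
	return (last);
}
```

Because teardown only happens with the lock held, a holder of the lock can rely on the object staying alive, while ordinary hold and free calls stay lock-free.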
Ryan Libby [Wed, 20 Jan 2021 21:59:49 +0000 (13:59 -0800)]
pms: handle maximum size IO with any alignment
Define the maximum number of segments to allow for non-page alignment
at the beginning and end of a maxphys size transfer. Also set
ccb_pathinq.maxio consistent with maxphys.
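The segment count behind this can be worked out with a small helper. The page size and names below are assumptions for illustration, and maxphys is assumed to be a multiple of the page size.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_PAGE_SIZE	4096u	/* assumed page size */

/* Number of pages (segments) touched by [addr, addr + len). */
size_t
ex_segs_for(uintptr_t addr, size_t len)
{
	if (len == 0)
		return (0);
	return ((size_t)((addr + len - 1) / EX_PAGE_SIZE -
	    addr / EX_PAGE_SIZE + 1));
}

/* Worst case for a maxphys-sized transfer with arbitrary alignment. */
size_t
ex_max_segs(size_t maxphys)
{
	return (maxphys / EX_PAGE_SIZE + 1);
}
```

A page-aligned 128KB transfer touches 32 pages, but the same transfer starting one byte into a page touches 33, hence the extra segment.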
hms: Workaround idle mouse drift in I2C sampling mode.
Many I2C "compatibility" mouse devices found on touchpads continue to
return the last report data in sampling mode after the touch has ended.
That results in cursor drift. Filter out such reports by comparing the
content of the current report with the content of the previous one.
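A rough userland sketch of such a filter, assuming reports are small byte buffers. The buffer limit, struct and function names are hypothetical and do not come from the hms driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EX_REPORT_MAX	16	/* assumed upper bound on report size */

struct ex_report_filter {
	unsigned char	last[EX_REPORT_MAX];
	size_t		last_len;
	bool		have_last;
};

/* Returns false when the report is a byte-for-byte repeat of the last one. */
bool
ex_report_is_new(struct ex_report_filter *f, const unsigned char *buf,
    size_t len)
{
	if (len > EX_REPORT_MAX)
		return (true);	/* too large to track, pass it through */
	if (f->have_last && f->last_len == len &&
	    memcmp(f->last, buf, len) == 0)
		return (false);
	memcpy(f->last, buf, len);
	f->last_len = len;
	f->have_last = true;
	return (true);
}
```

The caller would drop any report for which the function returns false, so a device that keeps replaying its final report stops moving the cursor.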
hconf(4): Do not fetch report before writing new usage values back.
There is a report that reading the surface/button switch feature report
causes the SYN1B7D touchpad to malfunction. As the specs do not require it
to be readable, assume that report usages have their default values on attach
and the last written values during operation. Do not apply default usage values
on attachment and resume.
While here fix manpage typos and add avg@ to copyright header.
Andrew Turner [Wed, 20 Jan 2021 09:56:47 +0000 (09:56 +0000)]
Handle arm64 undefined instructions on msr exceptions
When userspace tries to access a special register that it doesn't have
access to, the kernel receives an exception. On most cores this exception
has been observed to be the undefined instruction exception, however on
the Apple M1 under a QEMU based hypervisor it can be the MSR exception.
Handle this second case by also running the undefined exception handler
on these exceptions.
Alan Somers [Sun, 3 Jan 2021 04:25:05 +0000 (21:25 -0700)]
aio: micro-optimize the lio_opcode assignments
This allows slightly more efficient opcode testing in-kernel. It is
transparent to userland, except to applications that sneakily submit
aio fsync or aio mlock operations via lio_listio, which has never been
documented, requires the use of deliberately undefined constants
(LIO_SYNC and LIO_MLOCK), and is arguably a bug.
Cy Schubert [Wed, 20 Jan 2021 15:20:22 +0000 (07:20 -0800)]
wpa_supplicant uses PF_ROUTE to return the routing table in order to
determine the length of the routing table buffer. As of 81728a538d24
wpa_supplicant is started before the routing table has been populated,
resulting in a length of zero being returned. This causes
wpa_supplicant to loop endlessly. (The workaround is to kill and restart
wpa_supplicant as by the time it is restarted the routing table is
populated.)
(Personally, I was not able to reproduce this unless wlan0 was a member of
lagg0. However, others experienced this problem on standalone wlan0.)
Address panic with PRR due to missed initialization of recover_fs
Summary:
When using the base stack in conjunction with RACK, it appears that
infrequently, ++tp->t_dupacks is instantly larger than tcprexmtthresh.
This leaves the recover flightsize (sackhint.recover_fs) uninitialized,
leading to a div/0 panic.
Address this by properly initializing the variable just prior to first
use, if it is not properly initialized.
In order to prevent stale information from a prior recovery to
negatively impact the PRR calculations in this event, also clear
recover_fs once loss recovery is finished.
Finally, improve the readability of the initialization of recover_fs
when t_dupacks == tcprexmtthresh by adjusting the indentation and
using the max(1, snd_nxt - snd_una) macro.
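The guard can be sketched as below. This is a simplified illustration, not the tcp_input code: the function names are hypothetical, sequence numbers are plain 32-bit values, and the proportional step only follows the general shape of PRR (delivered data scaled by ssthresh over the recovery flight size).

```c
#include <stdint.h>

/* Flight size at the start of recovery, never less than 1. */
int32_t
ex_recover_flightsize(uint32_t snd_nxt, uint32_t snd_una)
{
	int32_t fs = (int32_t)(snd_nxt - snd_una);	/* wraps like seq space */

	return (fs > 1 ? fs : 1);
}

/*
 * Proportional share of data that may be sent; initializes a zero
 * recover_fs lazily so the division below can never trap.
 */
int64_t
ex_prr_share(int64_t delivered, int64_t ssthresh, int32_t *recover_fs,
    uint32_t snd_nxt, uint32_t snd_una)
{
	if (*recover_fs <= 0)
		*recover_fs = ex_recover_flightsize(snd_nxt, snd_una);
	/* Ceiling division, as PRR rounds up. */
	return ((delivered * ssthresh + *recover_fs - 1) / *recover_fs);
}
```

Clearing recover_fs back to zero when recovery ends, as the message describes, makes the lazy initialization fire again on the next loss episode.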
Alex Richardson [Wed, 20 Jan 2021 09:56:01 +0000 (09:56 +0000)]
libc: Fix null pointer arithmetic warning in mergesort
This file has other questionable code and "optimizations" (such as copying
one int at a time) that are probably no longer useful, so it might make
sense to replace it with a different implementation at some point.
Mark Johnston [Wed, 20 Jan 2021 02:32:33 +0000 (21:32 -0500)]
ktls: Improve handling of the bind_threads tunable a bit
- Only check for empty domains if we actually tried to configure domain
affinity in the first place. Otherwise setting bind_threads=1 will
always cause the sysctl value to be reported as zero. This is
harmless since the threads end up being bound, but it's confusing.
- Try to improve the sysctl description a bit.
Mark Johnston [Wed, 20 Jan 2021 01:34:36 +0000 (20:34 -0500)]
arm64, riscv: Set VM_KMEM_SIZE_SCALE to 1
This setting limits the amount of memory that can be allocated to UMA.
On systems with a direct map and ample KVA, however, there is no reason
for VM_KMEM_SIZE_SCALE to be larger than 1. This appears to have been
inherited from the 32-bit ARM platform definitions.
Also remove VM_KMEM_SIZE_MIN, which is not needed when
VM_KMEM_SIZE_SCALE is defined to be 1.[*]
Mark Johnston [Wed, 20 Jan 2021 01:34:35 +0000 (20:34 -0500)]
arm64: Stop setting VM_BCACHE_SIZE_MAX
This setting places a (small) limit on the size of the buffer cache,
constraining UFS performance on large servers. The setting comes from
the initial arm64 implementation and appears to be vestigial. Remove it.
Mark Johnston [Wed, 20 Jan 2021 01:34:35 +0000 (20:34 -0500)]
opencrypto: Fix assignment of crypto completions to worker threads
Since r336439 we simply take the session pointer value mod the number of
worker threads (ncpu by default). On small systems this ends up
funneling all completion work through a single thread, which becomes a
bottleneck when processing IPSec traffic using hardware crypto drivers.
(Software drivers such as aesni(4) are unaffected since they invoke
completion handlers synchronously.)
Instead, maintain an incrementing counter with a unique value per
session, and use that to distribute work to completion threads.
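Why the pointer value distributes poorly can be shown with a short sketch. The helper names and the dispatcher struct are hypothetical; the point is only that allocator-aligned addresses share their low bits, while a per-session counter does not.

```c
#include <stdint.h>

/* Old scheme: session address modulo the number of worker threads. */
unsigned int
ex_pick_by_pointer(const void *session, unsigned int nthreads)
{
	return ((unsigned int)((uintptr_t)session % nthreads));
}

/* New scheme: hand every session a unique, incrementing identifier. */
struct ex_dispatcher {
	uint32_t next_id;
};

uint32_t
ex_session_assign_id(struct ex_dispatcher *d)
{
	return (d->next_id++);
}

unsigned int
ex_pick_by_id(uint32_t id, unsigned int nthreads)
{
	return (id % nthreads);
}
```

With four workers, sessions at addresses such as 0x1000, 0x1100 and 0x1200 all map to worker 0, whereas identifiers 0, 1, 2 and 3 spread across all four.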
Mark Johnston [Wed, 20 Jan 2021 01:34:35 +0000 (20:34 -0500)]
opencrypto: Embed the driver softc in the session structure
Store the driver softc below the fields owned by opencrypto. This is
a bit simpler and saves a pointer dereference when fetching the driver
softc when processing a request.
Get rid of the crypto session UMA zone. Session allocations are not
frequent or performance-critical enough to warrant a dedicated zone.
Alex Richardson [Tue, 19 Jan 2021 15:05:43 +0000 (15:05 +0000)]
Remove remaining uses of ${COMPILER_FEATURES:Mc++11}
All supported compilers have C++11 support so these checks can be replaced
with MK_CXX guards.
See also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252759
Alex Richardson [Tue, 19 Jan 2021 11:35:04 +0000 (11:35 +0000)]
getopt: Fix conversion from string-literal to non-const char *
Define a non-const static char EMSG[] = "" to avoid having to add
__DECONST() to all uses of EMSG. Also make current_dash a const char *
to fix this warning.
Alex Richardson [Tue, 19 Jan 2021 11:32:33 +0000 (11:32 +0000)]
Require uint32_t alignment for ipfw_insn
There are many casts of this struct to uint32_t, so we also need to ensure
that it is sufficiently aligned to safely perform this cast on architectures
that don't allow unaligned accesses. This fixes lots of -Wcast-align warnings.
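The alignment requirement can be illustrated with C11 alignment specifiers. The two structs below are simplified stand-ins with assumed field names, not the real ipfw_insn definition, and the header may well express the requirement with a compiler attribute instead.

```c
#include <stdint.h>

/* Without a specifier the struct only needs uint16_t alignment. */
struct ex_insn_plain {
	uint8_t		opcode;
	uint8_t		len;
	uint16_t	arg1;
};

/* Raising the alignment of the first member raises it for the struct. */
struct ex_insn_aligned {
	_Alignas(uint32_t) uint8_t	opcode;
	uint8_t				len;
	uint16_t			arg1;
};

/* True when p is suitably aligned to be accessed as a uint32_t. */
int
ex_is_word_aligned(const void *p)
{
	return ((uintptr_t)p % _Alignof(uint32_t) == 0);
}
```

With the stricter alignment every element of an array of such structs starts on a uint32_t boundary, which is what a cast to uint32_t * relies on.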
Alex Richardson [Tue, 19 Jan 2021 11:32:32 +0000 (11:32 +0000)]
libalias: Fix -Wcast-align compiler warnings
This fixes -Wcast-align warnings caused by the underaligned `struct ip`.
This also silences them in the public functions by changing the function
signature from char * to void *. This is source and binary compatible and
avoids the -Wcast-align warning.
John Baldwin [Tue, 19 Jan 2021 19:51:27 +0000 (11:51 -0800)]
Convert unmapped mbufs before computing checksums in IPsec.
This is similar to the logic used in ip_output() to convert mbufs
prior to computing checksums. Unmapped mbufs can be sent when using
sendfile() over IPsec or using KTLS over IPsec.
Reported by: Sony Arpita Das @ Chelsio QA
Reviewed by: np
Sponsored by: Chelsio
Differential Revision: https://reviews.freebsd.org/D28187
John Baldwin [Fri, 8 Jan 2021 22:56:22 +0000 (14:56 -0800)]
arm64: Trim duplicate code from cpu_fork_kthread_handler().
cpu_fork_kthread_handler() is always called after either cpu_fork() or
cpu_copy_thread(). The arm64 version was duplicating some of the work
already done by both of those functions.
Glen Barber [Tue, 19 Jan 2021 18:38:33 +0000 (13:38 -0500)]
release: Add workaround to use SVN for ports
The ports tree is scheduled to be converted from Subversion to Git
after the currently-scheduled 13.0-RELEASE, so the source of truth
will be Subversion for the ports tree.
Lutz Donnerhacke [Tue, 19 Jan 2021 14:56:16 +0000 (15:56 +0100)]
ixl: Permit 802.1ad frames to pass through the chip
This patch is a quick hack to change the internal Ethertype used
within the chip. All frames with this type are dropped silently.
This patch allows you to overwrite the factory default 0x88a8, which
is used by IEEE 802.1ad VLAN stacking.
Michal Meloun [Wed, 13 Jan 2021 12:50:54 +0000 (13:50 +0100)]
arm64 busdma: Fix loading of small bounced buffers.
- Don't oversize the buffer fragment. PAGE_SIZE - (curaddr & PAGE_MASK)
may be greater than the total length of the buffer.
- Don't use roundup2(len, alignment) to calculate the buffer fragment
size. The length of current bounced fragment is not subject to alignment
restriction, and next fragment should start at the page boundary.
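The corrected fragment sizing amounts to clamping the distance to the end of the page by the bytes that remain. A minimal sketch, with an assumed 4KB page size and hypothetical names:

```c
#include <stddef.h>
#include <stdint.h>

#define EX_PAGE_SIZE	4096u			/* assumed page size */
#define EX_PAGE_MASK	(EX_PAGE_SIZE - 1)

/* Length of the next bounced fragment starting at curaddr. */
size_t
ex_bounce_frag_len(uintptr_t curaddr, size_t remaining)
{
	size_t to_page_end = EX_PAGE_SIZE - (curaddr & EX_PAGE_MASK);

	/* Never hand out more than the buffer actually has left. */
	return (remaining < to_page_end ? remaining : to_page_end);
}
```

For a 32-byte buffer at 0x1010 this yields 32 rather than the 4080 bytes left in the page, and no alignment rounding is applied to the result.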
Stefan Eßer [Tue, 19 Jan 2021 11:46:52 +0000 (12:46 +0100)]
Remove dependency on files in /usr/bin
In order to reduce the prerequisites of this file, implement the
pattern matching and the creation of a temporary test directory without
using grep or mktemp, respectively.
The new version makes it possible to provide a writable /tmp in any
case and independently of other local or remote file systems (except /
and /dev) being mounted.
The use of "dd if=/dev/random" has the same dependency on /dev/random
being operational as the previous version that used "mktemp". If this
is found to be an issue on platforms that have not gathered
sufficient entropy at the time when this script is run, I suggest
replacing the "dd" command with "ps lauxww" to get a somewhat random
test directory name.
Mateusz Guzik [Tue, 19 Jan 2021 09:08:24 +0000 (10:08 +0100)]
cache: save a branch in cache_fplookup_next
Previously the code would branch up front to find out whether it should
fire the SDT probe and bump the numposhits counter, depending
on cache_fplookup_cross_mount.
Arguably it should be done regardless of what said function returns.
Bryan Venteicher [Tue, 19 Jan 2021 04:55:25 +0000 (04:55 +0000)]
if_vtnet: Schedule Rx task if pending items when enabling interrupt
Prior to V1, the driver would enable interrupts and then notify the
host that DRIVER_OK. Since for V1, DRIVER_OK needs to be set before
notifying the virtqueues, there may be items in the queues waiting
to be processed by the time interrupts are enabled.
This fixes a bug where the Rx queue would appear stuck, only being
usable after an interface down/up cycle.
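The pattern can be sketched with a simulated queue. The struct and function names are hypothetical and the real driver works on virtqueues and taskqueues; this only shows the check for already-pending items at the moment interrupts are enabled.

```c
#include <stdbool.h>

/* Hypothetical receive queue state. */
struct ex_rxq {
	int	pending;	/* items the host has already queued */
	bool	intr_enabled;
	bool	task_scheduled;
};

/* Enable the interrupt; report whether work was already waiting. */
bool
ex_rxq_enable_intr(struct ex_rxq *q)
{
	q->intr_enabled = true;
	return (q->pending > 0);
}

/*
 * If items arrived before the interrupt was armed, no interrupt will
 * fire for them, so disable it again and process them from a task.
 */
void
ex_rxq_start(struct ex_rxq *q)
{
	if (ex_rxq_enable_intr(q)) {
		q->intr_enabled = false;
		q->task_scheduled = true;
	}
}
```

Without the check, items queued between DRIVER_OK and the interrupt being enabled would sit unprocessed, which matches the stuck Rx queue symptom described above.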
Bryan Venteicher [Tue, 19 Jan 2021 04:55:25 +0000 (04:55 +0000)]
if_vtnet: Limit allocations of unused virtqueues
For multiqueue, we may use fewer than the provided maximum number of
queues. Try to limit allocations of the unused queues: no interrupts,
no indirect descriptors, and no taskqueues.
Bryan Venteicher [Tue, 19 Jan 2021 04:55:25 +0000 (04:55 +0000)]
if_vtnet: Add support for software LRO
This is useful when running on hosts that support checksum offloading
but not the GUEST_TSO (LRO) feature. Or potentially, some GRO-like
support when doing forwarding.
Only enable SW LRO when the host LRO is not available, since enabling both
tends to be harmful and is difficult to enable/disable selectively
with only a single IFCAP_LRO flag.
Bryan Venteicher [Tue, 19 Jan 2021 04:55:24 +0000 (04:55 +0000)]
if_vtnet: Defer updating generated MAC address until attached
This improves spec compliance because the driver is not supposed
to notify the device prior to setting the DRIVER_OK status, which
could happen with the VIRTIO_NET_F_CTRL_MAC_ADDR.
The VIRTIO_NET_F_MAC feature should always be negotiated, so this would
be a rare situation.
Bryan Venteicher [Tue, 19 Jan 2021 04:55:24 +0000 (04:55 +0000)]
if_vtnet: Remove at attach PROMISC handling
This may have been required in an early, early, early version of the
specification but I cannot find any reference to it, and a promiscuous
default seems very odd so remove this code.
Bryan Venteicher [Tue, 19 Jan 2021 04:55:24 +0000 (04:55 +0000)]
if_vtnet: Support VIRTIO_NET_F_SPEED_DUPLEX
This feature lets the guest driver know the speed and duplex of
the "link". Instead of trying to support many media types based
on the possible/likely speeds/duplexes, only use the speed to
set the interface baudrate.
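The conversion can be sketched as below, assuming the speed field is reported in units of 1 Mbit/s with 0xFFFFFFFF meaning unknown, as the VirtIO specification describes it. The function name and the fallback parameter are hypothetical.

```c
#include <stdint.h>

#define EX_SPEED_UNKNOWN	0xFFFFFFFFu	/* assumed "unknown" marker */

/* Convert a reported link speed in Mbit/s to a baudrate in bit/s. */
uint64_t
ex_speed_to_baudrate(uint32_t speed_mbps, uint64_t fallback)
{
	if (speed_mbps == EX_SPEED_UNKNOWN)
		return (fallback);
	return ((uint64_t)speed_mbps * 1000000u);
}
```

Widening to 64 bits before multiplying matters here, since a 10 Gbit/s link already exceeds what a 32-bit baudrate could hold.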
Bryan Venteicher [Tue, 19 Jan 2021 04:55:24 +0000 (04:55 +0000)]
if_vtnet: Support VIRTIO_NET_F_MTU
This feature lets the guest driver know the maximum MTU size
supported by the host device. If set, use this to limit the
acceptable MTUs, and improve how the receive mbuf cluster size
is then selected.