John Baldwin [Wed, 2 Feb 2022 20:18:43 +0000 (12:18 -0800)]
stand/efi: Pass --no-dynamic-linker to ld.bfd >= 2.34.
ld.bfd in binutils 2.34+ now reports an error in more cases for custom
ldscripts that do not place PHDRs in a LOAD segment. However, EFI
binaries are not dynamic binaries which need PHDRs, so pass
--no-dynamic-linker to disable this check.
John Baldwin [Tue, 1 Feb 2022 01:33:31 +0000 (17:33 -0800)]
fstyp: Remove __packed from struct exfat_de_label.
This fixes a -Waddress-of-packed-member warning about a possibly
unaligned pointer from GCC 9 when calling convert_label().
__packed has to be removed from struct exfat_dirent as well to fix an
alignment warning when casting from a struct exfat_dirent pointer to a
struct exfat_de_label pointer.
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D32144
John Baldwin [Tue, 1 Feb 2022 01:11:27 +0000 (17:11 -0800)]
hyperv storvsc: Don't abuse struct sglist to hold virtual addresses.
struct sglist is intended for holding S/G lists of physical address
ranges, not virtual address ranges. GCC 9.x issues several warnings
due to casts between pointers and integers of different sizes as a
result (vm_paddr_t is 64-bits on i386). Instead, add a local 'struct
hv_sglist' which uses an array of 'struct iovec' to hold the S/G list
of virtual address ranges.
John Baldwin [Sat, 25 Sep 2021 18:24:35 +0000 (11:24 -0700)]
kernel: Disable errors for -Walloca-larger-than for GCC.
GCC complains about the use of alloca() with variable sizes (for XSAVE
state len) in sendsig() for i386. Modern XSAVE state is probably
getting a bit large for the i386 kstack, but downgrade the error to a
warning.
John Baldwin [Wed, 15 Sep 2021 16:03:18 +0000 (09:03 -0700)]
infiniband: Disable -Wredundant-decl warnings.
ib_uverbs_flow_resources_free() is declard in two header files in
upstream OFED. Disable the warning to avoid introducing diffs to fix
the build on GCC 9.
While here, fix the ibcore module to disable the same warnings
disabled in OFED_CFLAGS.
John Baldwin [Wed, 15 Sep 2021 16:03:17 +0000 (09:03 -0700)]
<linux/overflow.h>: Don't use __has_builtin().
GCC only added support for __has_builtin in GCC 10. However, all
supported versions of GCC and clang include these builtins so just use
them unconditionally.
John Baldwin [Thu, 18 Feb 2021 00:32:11 +0000 (16:32 -0800)]
Add a VA_IS_CLEANMAP() macro.
This macro returns true if a provided virtual address is contained
in the kernel's clean submap.
In CHERI kernels, the buffer cache and transient I/O map are allocated
as separate regions. Abstracting this check reduces the diff relative
to FreeBSD. It is perhaps slightly more readable as well.
A different exception is raised when we hit a 32bits breakpoint, rather than
a 64bits one, so handle those as well when COMPAT_FREEBSD32 is defined.
This should fix SIGBUS at least when using breakpoints with thumb2 code.
mlx5en(4): Use hard-coded 4K page size for RQ/SQ/CQ.
The page size specified for RQ, SQ and CQ is always in units of 4KBytes.
Make sure we subtract MLX5_ADAPTER_PAGE_SHIFT, 12, instead of PAGE_SHIFT
which may vary. This fixes support for using the mlx5en driver on systems
having non-4K page size.
Mark Johnston [Mon, 25 Apr 2022 13:13:03 +0000 (09:13 -0400)]
linuxkpi: Mitigate a seqlock livelock
Disable preemption in seqlock write sections when using the _irqsave
variant. This ensures that a writer can't be preempted and subsequently
starved by a reader running in a callout handler on the same CPU.
This fixes occasional display hangs seen when using the i915 driver.
Tested by: emaste, wulf
Reviewed by: wulf, hselasky
Sponsored by: The FreeBSD Foundation
Alex Richardson [Mon, 13 Sep 2021 09:16:05 +0000 (10:16 +0100)]
Add a test for https://reviews.freebsd.org/D31858 (PR 258310)
This test (based on https://github.com/jiixyj/epoll-shim/pull/32#issuecomment-891276654)
fails reproducibly on QEMU with KVM and `-smp 2` prior to D31858 (committed
as 98168a6e6c12dab8f608f6b5f5b0b175d2b87ef0) and passes with the patch applied.
Alex Richardson [Thu, 9 Sep 2021 10:46:53 +0000 (11:46 +0100)]
Export _mmap and __sys_mmap from libc.so
Unlike the other syscalls these two symbols were missing from the
version script. I noticed this while looking into the compiler-rt
runtime libraries for CHERI.
Reviewed by: brooks
Obtained from: https://github.com/CTSRD-CHERI/cheribsd/pull/1063
MFC after: 3 days
Ed Maste [Sat, 30 Apr 2022 19:25:35 +0000 (15:25 -0400)]
Update WITH_PROFILE description as it is yet removed
GCC still wants to link against (for example) libc_p.a when -pg is in
use, and it's unclear when and how this will be addressed. Change the
WITH_PROFILE option description to claim that it may be removed from an
unspecified future version of FreeBSD, rather than FreeBSD 14.
Reported by: Steve Kargl
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Mark Johnston [Thu, 20 Jan 2022 13:23:38 +0000 (08:23 -0500)]
clockcalib: Fix an overflow bug
tc_counter_mask is an unsigned int and in the TSC timecounter is equal
to UINT_MAX, so the addition tc->tc_counter_mask + 1 can overflow to 0,
resulting in a hang during boot.
Fixes: c2705ceaeb09 ("x86: Speed up clock calibration")
Reviewed by: cperciva
Sponsored by: The FreeBSD Foundation
Implement ieee80211_is_data_present() and a subset of
ieee80211_is_bufferable_mmpdu() which hopefully is good enough in
the compat code for now.
This is partly in preparation for some TXQ changes coming up soon.
In ieee80211_beacon_loss() call into net80211::ieee80211_beacon_miss()
rather than manually bouncing our state. That should give us the
ability to send a probereq and see if the AP is till there rather than
right away going to scan.
When we issue a request to pf and expect a serialised nvlist as a reply
we have to supply a suitable buffer to the kernel.
The required size for this buffer is difficult to predict, and may be
(slightly) different from request to request.
If it's insufficient the kernel will return ENOSPC. Teach libpfctl to
catch this and send the request again with a larger buffer.
Alexander Motin [Thu, 28 Apr 2022 01:39:50 +0000 (21:39 -0400)]
CAM: Keep periph_links when restoring CCB in camperiphdone().
While recovery command executed, some other commands from the periph
may complete, that may affect periph_links of this CCB. So restoring
original CCB we must keep current periph_links as more up to date.
I've found this triggering assertions with debug kernel and suspect
some memory corruptions otherwise when spun down disk receives two
or sometimes more concurrent requests.
Ed Maste [Fri, 11 Mar 2022 21:37:03 +0000 (16:37 -0500)]
teken: color #3 is yellow not brown - use TC_YELLOW as the name
The console escape code standard (ECMA-48) specifies color #3 (escape
code 33) as yellow. A brown console color is an artifact of the VGA
palette, which replaces dim (but not bright) yellow with brown.
Reviewed by: adrian, imp
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34531
Ed Maste [Wed, 16 Feb 2022 21:10:45 +0000 (16:10 -0500)]
Clean up warnings in pthread tests
I intend to move these into lib/libthr/tests/ and connect to kyua. This
is a first step to address warnings emitted when building using standard
make infrastructure.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34306
Andrew Turner [Sat, 9 Apr 2022 11:43:08 +0000 (12:43 +0100)]
Only return a mapped address from efi_phys_to_kva
On some hardware the EFI system table is not in memory mapped in the
DMAP section. Rather than panic the kernel check if it is mapped and
return a failure if not from efi_phys_to_kva.
Andrew Turner [Mon, 28 Mar 2022 11:37:09 +0000 (12:37 +0100)]
Handle non-page aligned/sized memory in physmem
In some configurations the firmware may pass memory regions that are
not page sized or aligned, e.g. when using 16k pages on arm64. If this
is the case we will calculate many small regions because the alignment
is applied before being inserted. As we round the start up and end down
this will leave a 1 page hole between what should have been a single
region.
Fix by keeping the original alignment until we are just about to insert
the region into the avail array.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34694
Andrew Turner [Mon, 4 Apr 2022 09:28:59 +0000 (10:28 +0100)]
Fix a coherent bus check in the arm64 busdma
In the arm64 busdma we have an internal flag to signal when a tag is
for a cache-coherent device. In this case we don't need to adjust the
size and alignment of allocated buffers to be within a cache line.
The cache line adjustment was incorrectly using the coherent flag
passed in to bus_dma_tag_create and not the internal flag. Fix it to
use the latter to reduce the memory usage slightly.
Reviewed by: bz
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34763
Ed Maste [Sun, 27 Feb 2022 19:04:09 +0000 (14:04 -0500)]
fwcontrol: eliminate set but not used warning
The variable was used in an #if 0 block; just move the variable
definition and setting into the same block since Firewire is mainly of
historical interest and is unlikely to see ongoing development in
FreeBSD.
Ed Maste [Sat, 23 Apr 2022 19:52:03 +0000 (15:52 -0400)]
ssh: use upstream SSH_OPENSSL_VERSION macro
With the upgrade to OpenSSH 6.7p1 in commit a0ee8cc636cd we replaced
WITH_OPENSSL ifdefs with an OPENSSL_VERSION macro, later changing it
to OPENSSL_VERSION_STRING.
A few years later OpenSSH made an equivalent change (with a different
macro name), in commit 4d94b031ff88. Switch to the macro name they
chose.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
lacp: short timeout erroneously declares link-flapping
Panasas was seeing a higher-than-expected number of link-flap events.
After joint debugging with the switch vendor, we determined there were
problems on both sides; either of which might cause the occasional
event, but together caused lots of them.
On the switch side, an internal queuing issue was causing LACP PDUs --
which should be sent every second, in short-timeout mode -- to sometimes
be sent slightly later than they should have been. In some cases, two
successive PDUs were late, but we never saw three late PDUs in a row.
On the FreeBSD side, we saw a link-flap event every time there were two
late PDUs, while the spec says that it takes *three* seconds of downtime
to trigger that event. It turns out that if a PDU was received shortly
before the timer code was run, it would decrement less than a full
second after the PDU arrived. Then two delayed PDUs would cause two
additional decrements, causing it to reach zero less than three seconds
after the most-recent on-time PDU.
The solution is to note the time a PDU arrives, and only decrement if at
least a full second has elapsed since then.
John Baldwin [Fri, 18 Feb 2022 23:20:14 +0000 (15:20 -0800)]
ctl ramdisk: Free compare buffer after a compare I/O request.
For a compare request, the ramdisk backend allocates a temporary
buffer to hold the I/O data and then compares it against the LUN's
pages in ctl_backend_ramdisk_cmp after the data has been filled.
However, the tempory buffer was leaked when after the comparison was
complete. Fix this by freeing the buffer after the comparison.
John Baldwin [Tue, 8 Feb 2022 00:20:06 +0000 (16:20 -0800)]
cxgbei: Dispatch sent PDUs to the NIC asynchronously.
Previously the driver was called to send PDUs to the NIC synchronously
from the icl_conn_pdu_queue_cb callback. However, this performed a
fair bit of work while holding the icl connection lock. Instead,
change the callback to add sent PDUs to a STAILQ and defer dispatching
of PDUs to the NIC to a helper thread similar to the scheme used in
the TCP iSCSI backend.
- Replace rx_flags int and the sole RXF_ACTIVE flag with a simple
rx_active bool.
- Add a pool of transmit worker threads for cxgbei.
- Fix worker thread exit to depend on the wakeup in kthread_exit()
to fix a race with module unload.
John Baldwin [Mon, 7 Feb 2022 22:11:10 +0000 (14:11 -0800)]
Extend the VMM stats interface to support a dynamic count of statistics.
- Add a starting index to 'struct vmstats' and change the
VM_STATS ioctl to fetch the 64 stats starting at that index.
A compat shim for <= 13 continues to fetch only the first 64
stats.
- Extend vm_get_stats() in libvmmapi to use a loop and a static
thread local buffer which grows to hold the stats needed.
John Baldwin [Mon, 7 Feb 2022 20:55:08 +0000 (12:55 -0800)]
cfiscsi_done: Free the dummy PDU earlier.
The dummy PDU needs to be freed before marking task abortion complete
as otherwise cfiscsi_session_terminate_tasks can return and destroy
the session in another thread before the PDU is freed.
Fixes: 2e8d1a55258d iscsi: Allocate a dummy PDU for the internal nexus reset task.
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D34176
John Baldwin [Fri, 4 Feb 2022 23:38:49 +0000 (15:38 -0800)]
cxgbei: Rework parsing of pre-offload PDUs.
sbcut() returns mbufs in reverse order so is not suitable for reading
data from the socket buffer. Instead, check for already-received data
in the receive worker thread before passing offload PDUs up to the
iSCSI layer. This uses soreceive() to read data from the socket and
is also to use M_WAITOK since it now runs from a worker thread instead
of an interrupt thread.
Also, fix decoding of the data segment length for pre-offload PDUs.
Reported by: Jithesh Arakkan @ Chelsio
Fixes: a8c4147edcdc cxgbei: Parse all PDUs received prior to enabling offload mode.
Sponsored by: Chelsio Communications
John Baldwin [Fri, 28 Jan 2022 21:07:04 +0000 (13:07 -0800)]
iscsi: Allocate a dummy PDU for the internal nexus reset task.
When an iSCSI target session is terminated, an internal nexus reset
task is posted to abort existing tasks belonging to the session.
Previously, the ctl_io for this internal nexus reset stored a pointer
to the session in the slot that normally holds a pointer to the PDU
from the initiator that triggered the I/O request. The completion
handler then assumed that any nexus reset I/O was due to an internal
request and fetched the session pointer (instead of the PDU pointer)
from the ctl_io. However, it is possible to trigger a nexus reset via
an on-the-wire task management PDU. If such a PDU were sent to the
target, then the completion handler would incorrectly treat this
request as an internal request and treat the pointer to the received
PDU as a pointer to the session instead.
To fix, allocate a dummy PDU for the internal reset task and use an
invalid opcode to differentiate internal nexus resets from resets
requested by the initiator.