Michael Gmelin [Wed, 16 Mar 2022 22:08:55 +0000 (23:08 +0100)]
if_epair: fix race condition on multi-core systems
As an unwanted side effect of the performance improvements in 24f0bfbad57b9, epair interfaces stop forwarding traffic on higher
load levels when running on multi-core systems.
This happens due to a race condition in the logic that decides when to
place work in the task queue(s) responsible for processing the content
of ring buffers.
In order to fix this, a field named state is added to the epair_queue
structure. This field is used by the affected functions to signal each
other that something happened in the underlying ring buffers that might
require work to be scheduled in task queue(s), replacing the existing
logic, which relied on checking if ring buffers are empty or not.
epair_menq() does:
  - set BIT_MBUF_QUEUED
  - queue mbuf
  - if testandset BIT_QUEUE_TASK:
      enqueue task
epair_tx_start_deferred() does:
  - swap ring buffers
  - process mbufs
  - clear BIT_QUEUE_TASK
  - if testandclear BIT_MBUF_QUEUED:
      enqueue task
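The two-bit handshake above can be sketched in user space with C11
atomics. This is only an illustration under stated assumptions: the
struct, the function names, the bit values and the counters standing
in for the ring buffers and the task queue are all hypothetical, and
"if testandset" is read here as "enqueue only when the bit was
previously clear".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical bit values; the commit message only names the bits. */
#define BIT_QUEUE_TASK	(1u << 0)
#define BIT_MBUF_QUEUED	(1u << 1)

struct queue_sketch {
	atomic_uint	state;		/* stand-in for the new "state" field */
	unsigned	pending;	/* stand-in for ring buffer contents */
	unsigned	scheduled;	/* counts simulated task enqueues */
};

/* Producer side, following the epair_menq() steps. */
static void
sketch_menq(struct queue_sketch *q)
{
	assert(q != NULL);
	atomic_fetch_or(&q->state, BIT_MBUF_QUEUED);
	q->pending++;			/* "queue mbuf" */
	/* Enqueue the task only if the bit was clear before this call. */
	if ((atomic_fetch_or(&q->state, BIT_QUEUE_TASK) & BIT_QUEUE_TASK) == 0)
		q->scheduled++;
}

/* Consumer side, following the epair_tx_start_deferred() steps. */
static unsigned
sketch_tx_start_deferred(struct queue_sketch *q)
{
	unsigned done;

	assert(q != NULL);
	done = q->pending;		/* "swap ring buffers", "process mbufs" */
	q->pending = 0;
	atomic_fetch_and(&q->state, ~BIT_QUEUE_TASK);
	/* Something was queued since the last pass: schedule another run. */
	if ((atomic_fetch_and(&q->state, ~BIT_MBUF_QUEUED) & BIT_MBUF_QUEUED) != 0)
		q->scheduled++;
	return (done);
}
```

The point of the design is that the decision to schedule work no
longer depends on observing whether a ring buffer is empty; both sides
only look at the state bits. The sketch is single-threaded apart from
the state word, so it shows the order of operations, not the race.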
PR: 262571
Reported by: Johan Hendriks <joh.hendriks@gmail.com>
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D34569
Kirk McKusick [Wed, 16 Mar 2022 18:37:15 +0000 (11:37 -0700)]
Ensure that fsck(8) / fsck_ffs(8) produces the correct exit code
for missing devices.
The fsck_ffs(8) utility uses its internal function openfilesys()
when opening a disk to be checked. This change avoids the use
of pfatal() in openfilesys(), which always exits with failure (exit
value 8), so that the caller can choose the correct exit value.
In the case of a non-existent device it should exit with value 3,
which allows the startup system to wait for drives (such as those
attached by USB) to come online.
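The idea of letting the caller pick the exit value can be sketched as
follows. The helper and macro names are hypothetical and this is not
the fsck_ffs(8) source; only the values 8 and 3 come from the message
above, and mapping ENOENT to "non-existent device" is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define SKETCH_STATUS_FAILED	8	/* general failure, per the message */
#define SKETCH_STATUS_NODEVICE	3	/* device does not exist, per the message */

/* Hypothetical open helper: reports failure instead of exiting. */
static int
open_device_sketch(const char *path)
{
	assert(path != NULL);
	return (open(path, O_RDONLY));	/* -1 with errno set on failure */
}

/* Hypothetical caller: maps the failure to an exit status, 0 on success. */
static int
check_device_sketch(const char *path)
{
	int fd;

	fd = open_device_sketch(path);
	if (fd < 0)
		return (errno == ENOENT ? SKETCH_STATUS_NODEVICE :
		    SKETCH_STATUS_FAILED);
	close(fd);
	return (0);
}
```

A startup script can then tell "retry later, the disk has not shown
up yet" (3) apart from "the check itself failed" (8).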
The total size of the user-provided nmreq was first computed and then
trusted during the copyin. This might lead to kernel memory corruption
and escape from jails/containers.
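A generic way to avoid trusting a size computed from user-controlled
data is to copy the request once and then validate every length field
against the number of bytes actually copied. The sketch below shows
only that pattern; the option header layout and all names are
hypothetical and it is not the netmap code or the actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical option header: length of the option including this header. */
struct opt_hdr_sketch {
	uint32_t len;
};

/*
 * Walk a buffer of `total` bytes that has already been copied out of
 * untrusted memory. Returns the number of options, or -1 if any
 * declared length does not fit in what was actually copied.
 */
static int
validate_options_sketch(const unsigned char *buf, size_t total)
{
	struct opt_hdr_sketch h;
	size_t off = 0;
	int n = 0;

	assert(buf != NULL);
	while (off < total) {
		if (total - off < sizeof(h))
			return (-1);	/* truncated header */
		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > total - off)
			return (-1);	/* length lies about the buffer */
		off += h.len;
		n++;
	}
	return (n);
}
```

Because the lengths are read from the private copy, a concurrent
change to the original user memory cannot invalidate the check.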
Reported by: Lucas Leong (@_wmliang_) of Trend Micro Zero Day Initiative
Security: CVE-2022-23084
MFC after: 3 days
An unsanitized field in an option could be abused, causing an integer
overflow followed by kernel memory corruption. This might be used
to escape jails/containers.
Reported by: Reno Robert and Lucas Leong (@_wmliang_) of Trend Micro
Zero Day Initiative
Security: CVE-2022-23085
Eugene Grosbein [Wed, 16 Mar 2022 04:41:51 +0000 (11:41 +0700)]
virtio_random(8): avoid deadlock at shutdown time
FreeBSD 13+ running as a virtual guest may load the virtio_random(8)
driver by means of devd(8) unless the driver is blacklisted or disabled
via device.hints(5). Currently, the driver may prevent
the system from rebooting or shutting down correctly.
This change deactivates virtio_random at a very late stage
of the system shutdown sequence to avoid a deadlock
that results in a kernel hang.
Andrew Turner [Thu, 10 Mar 2022 14:39:03 +0000 (14:39 +0000)]
Fix arm64 TLB invalidation with non-4k pages
When using 16k or 64k pages, atop will shift the address by more than
the amount needed for a tlbi instruction. Replace this with a new macro
to shift the address by 12 and use PAGE_SIZE in the for loop to let the
code work with any page size.
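The difference between the two shifts can be illustrated with a small
user-space sketch. All macro and function names are hypothetical, the
16k page size is an assumed configuration, and the loop only records
the operands rather than issuing any instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	14	/* assumed 16k pages */
#define SKETCH_PAGE_SIZE	((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_TLBI_VA_SHIFT	12	/* operand is the address shifted by 12 */

/* Page-number conversion: shifts by the page size, too far for 16k. */
#define sketch_atop(va)		((uint64_t)(va) >> SKETCH_PAGE_SHIFT)
/* Operand conversion: always shifts by 12, whatever the page size. */
#define sketch_tlbi_va(va)	((uint64_t)(va) >> SKETCH_TLBI_VA_SHIFT)

/* Record one operand per page in [sva, eva); returns how many were stored. */
static size_t
sketch_tlbi_range(uint64_t sva, uint64_t eva, uint64_t *ops, size_t max)
{
	size_t n = 0;

	assert(ops != NULL);
	for (uint64_t va = sva; va < eva && n < max; va += SKETCH_PAGE_SIZE)
		ops[n++] = sketch_tlbi_va(va);
	return (n);
}
```

With 4k pages both conversions give the same value, which is why the
problem only shows up once the page size changes.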
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34516
Andrew Turner [Mon, 14 Mar 2022 12:40:25 +0000 (12:40 +0000)]
Make page size dynamic in libkvm for arm64
To allow for a future 16k or 64k page size we need to tell libkvm which
is being used. Add a flag field in unused space in minidumphdr and use
it to signal between the different options.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34548
Andrew Turner [Thu, 10 Mar 2022 14:40:38 +0000 (14:40 +0000)]
Fix calculating l0index in _pmap_alloc_l3 on arm64
When moving from the l1 index to l0 index we need to use the l1 shift
value not the l0 shift value. With 4k pages they are identical, however
with 16k pages we only have 2 l0 entries so the shift value is incorrect.
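A worked example of why the two shift values diverge, assuming a 16k
granule where the L1 to L3 tables hold 2048 entries each and the L0
table holds 2. The macro and function names are hypothetical and the
expressions are a simplified reading of the description above, not the
pmap source.

```c
#include <assert.h>

#define SK_LN_ENTRIES_SHIFT	11	/* assumed: 2048 entries per L1 table */
#define SK_L0_ENTRIES_SHIFT	1	/* assumed: 2 entries in the L0 table */

/* Each L0 entry covers one full L1 table, so divide by its entry count. */
static unsigned long
sketch_l0index_fixed(unsigned long l1index)
{
	assert(l1index <
	    (1ul << (SK_LN_ENTRIES_SHIFT + SK_L0_ENTRIES_SHIFT)));
	return (l1index >> SK_LN_ENTRIES_SHIFT);
}

/* Using the L0 entry count instead gives an index far out of range. */
static unsigned long
sketch_l0index_buggy(unsigned long l1index)
{
	return (l1index >> SK_L0_ENTRIES_SHIFT);
}
```

For l1index 2048, the first L1 entry under the second L0 entry, the
fixed form gives 1 while the old form gives 1024, well past a table
with only 2 entries. With 4k pages both shifts are 9, which hides the
difference.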
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34517
/usr/freebsd-dist is used by various programs as the location for
FreeBSD distribution files. In-tree programs following this convention
are bsdinstall(8) and release(7).
Wojciech Macek [Mon, 14 Mar 2022 06:51:21 +0000 (07:51 +0100)]
armv6/legacy: optimize cpu_getcount performance
Use nanotime instead of binuptime.
This change is for old and legacy platforms and does not affect armv7
and later.
Might be MFCed to 13.0 once armv6 support is finally dropped in 14.0.
Rick Macklem [Sun, 13 Mar 2022 20:15:12 +0000 (13:15 -0700)]
nfscl: Fix NFSv4.1/4.2 Lookup+Open RPC
Use of the Lookup+Open RPC is currently disabled,
due to a problem detected during testing. This
patch fixes this problem. The problem was that
nfscl_postop_attr() does not parse the attributes
if nd_repstat != 0. It also would parse the
return status for the operation, where the
Lookup+Open code had already parsed it.
The first change in the patch does not make any
semantic change, but makes the code identical
to what is done later in the function, so that
it is apparent that the semantics should be the
same in both places.
Lookup+Open remains disabled while further
testing is being done, so this patch has no
effect at this time.
FreeBSD 14.0 is going to ship with a new implementation of the mixer(8)
command. Unfortunately, in order to support new features like mute, the
command-line interface of the new implementation is not backwards
compatible.
Update all the remaining documentation and scripts in the src tree
to use the new syntax.
While here, document in usbhidaction.1 that the mute functionality is
now supported.
Reviewed by: christos, debdrup, hselasky
Approved by: hselasky (src)
Fixes: 903873ce1560 Implement and use new mixer(3) library for FreeBSD.
Differential Revision: https://reviews.freebsd.org/D34545
usbtest: Fix issue when multiple devices are sharing same USB vendor and product ID.
When there are multiple devices sharing the same USB vendor and product ID,
the wrong device may be selected. Fix this by also matching the bus and
device address, ugen<X>.<Y> .
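The selection rule can be sketched as a lookup that requires all four
values to match. The struct and function below are hypothetical and do
not use the real USB library API; they only show why the vendor and
product IDs alone are ambiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device record: IDs plus the ugen<bus>.<addr> location. */
struct usb_dev_sketch {
	uint16_t vid;
	uint16_t pid;
	unsigned bus;
	unsigned addr;
};

/* Return the device matching IDs and location, or NULL if there is none. */
static const struct usb_dev_sketch *
find_device_sketch(const struct usb_dev_sketch *devs, size_t n,
    uint16_t vid, uint16_t pid, unsigned bus, unsigned addr)
{
	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (devs[i].vid == vid && devs[i].pid == pid &&
		    devs[i].bus == bus && devs[i].addr == addr)
			return (&devs[i]);
	}
	return (NULL);
}
```

With two identical adapters attached, a lookup by IDs alone would
always return the first one found; adding bus and address makes the
choice unambiguous.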
- Use correct macros (e.g., Pa for paths, Ar for arguments, Cm for
command modifiers).
- Pet igor and mandoc -Tlint (e.g., start sentences after a newline).
- Use Ta instead of a tab character in tables.
- Stylize all table headers with Sy consistently.
- Add a missing "vol" variant to the synopsis of "dev.volume".
- Sort dev.recsrc command modifiers consistently.
- Use "Bd -literal" for code blocks in the examples. "Bl -tag" is not
the right macro for that.
Fixes: 903873ce1560 Implement and use new mixer(3) library for FreeBSD.
This version provides improvements and fixes mainly to use bsddialog
utility in bsdinstall/scripts. The lib API is not broken so the
previous converted utilities (tzsetup, distextract, etc.) are OK.
Alexander Motin [Sat, 12 Mar 2022 16:49:37 +0000 (11:49 -0500)]
GEOM: Introduce partial confxml API
Traditionally, GEOM's primary channel of information from kernel to
user-space was confxml, fetched by libgeom through the
kern.geom.confxml sysctl. It is convenient and informative,
representing the full state of GEOM in a single XML document. But
problems start to arise on systems with hundreds of disks, where the
full confxml size reaches many megabytes, taking significant time to
first write it and then parse it.
This patch introduces an alternative solution that allows fetching a
much smaller XML document, a subset of the full confxml, limited to
64KB and representing only one specified geom and optionally its
parents. It uses the existing GEOM control interface, extended with a
new "getxml" verb. In case of any error, such as a buffer overflow, it
transparently falls back to the traditional full confxml. This patch
uses the new API in user-space GEOM tools where possible.
Ed Maste [Fri, 11 Mar 2022 21:37:03 +0000 (16:37 -0500)]
teken: color #3 is yellow not brown - use TC_YELLOW as the name
The console escape code standard (ECMA-48) specifies color #3 (escape
code 33) as yellow. A brown console color is an artifact of the VGA
palette, which replaces dim (but not bright) yellow with brown.
Reviewed by: adrian, imp
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34531
Ed Maste [Fri, 11 Mar 2022 19:27:46 +0000 (14:27 -0500)]
loader: accept "yellow" as a named color
For historical reasons console color number 3 may be either yellow (most
consoles) or brown (VGA palette). The console escape code standard
uses "yellow", but teken color name constants appear to be based on the
VGA scheme and use TC_BROWN for color 3. Even so, the palette table
used 50,50,0 as the RGB percentage tuple, resulting in a dim yellow for
framebuffer consoles at the time teken was introduced.
Amusingly, in 19e2ce2d8367 the comment on the palette entry was changed
from "brown" to "dark yellow" but the colour itself was changed from
a pure yellow to being somewhat brown.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
John Baldwin [Fri, 11 Mar 2022 19:29:45 +0000 (11:29 -0800)]
arm64 hwpmc: Support restricting counters to user or kernel mode.
Support the "usr" and "os" qualifiers on arm64 events to restrict
event counting to either usermode or the kernel, respectively. If
neither qualifier is given, events are counted in both.
Reviewed by: emaste
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34527
John Baldwin [Thu, 10 Mar 2022 23:50:52 +0000 (15:50 -0800)]
cxgbei: Support unmapped I/O requests.
- Add icl_pdu_append_bio and icl_pdu_get_bio methods.
- Add new page pod routines for allocating and writing page pods for
unmapped bio requests. Use these new routines for setting up DDP
for iSCSI tasks with a SCSI I/O CCB which uses CAM_DATA_BIO.
- When ICL_NOCOPY is used to append data from an unmapped I/O request
to a PDU, construct unmapped mbufs from the relevant pages backing
the struct bio. This also requires changes in the t4_push_pdus path
to support unmapped mbufs.
John Baldwin [Thu, 10 Mar 2022 23:50:26 +0000 (15:50 -0800)]
iscsi: Support unmapped I/O requests in the default initiator.
- Add icl_pdu_append_bio and icl_pdu_get_bio methods.
- When ICL_NOCOPY is used to append data from an unmapped I/O request
to a PDU, construct unmapped mbufs from the relevant pages backing
the struct bio.
- Use m_apply with a helper to compute crc32 digests on mbuf chains
to handle unmapped mbufs. Since m_apply requires PMAP_HAS_DMAP
for unmapped mbufs, only support unmapped requests when PMAP_HAS_DMAP
is true.
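The "apply a helper over each segment" approach to digests can be
sketched in user space as below. The segment struct and function names
are hypothetical stand-ins for an mbuf chain and m_apply(), and the
bitwise CRC32C routine is a slow reference form chosen for brevity,
not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a chain of buffers. */
struct seg_sketch {
	const void		*data;
	size_t			 len;
	const struct seg_sketch	*next;
};

/* Call f(arg, data, len) on each segment; stop on the first error. */
static int
chain_apply_sketch(const struct seg_sketch *s,
    int (*f)(void *, const void *, size_t), void *arg)
{
	for (; s != NULL; s = s->next) {
		int error = f(arg, s->data, s->len);
		if (error != 0)
			return (error);
	}
	return (0);
}

/* Helper: fold one segment into the running CRC32C kept in *arg. */
static int
crc32c_update_cb(void *arg, const void *data, size_t len)
{
	uint32_t crc = *(uint32_t *)arg;
	const unsigned char *p = data;

	while (len-- > 0) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	*(uint32_t *)arg = crc;
	return (0);
}

/* Digest of the whole chain, independent of where segments are split. */
static uint32_t
chain_crc32c_sketch(const struct seg_sketch *s)
{
	uint32_t crc = 0xFFFFFFFFu;
	int error = chain_apply_sketch(s, crc32c_update_cb, &crc);

	assert(error == 0);
	(void)error;
	return (crc ^ 0xFFFFFFFFu);
}
```

Keeping the running value in the callback argument means the digest
code never needs one contiguous view of the payload, which is what
makes the per-segment approach suitable for chains that are not mapped
as a single buffer.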
John Baldwin [Thu, 10 Mar 2022 23:49:53 +0000 (15:49 -0800)]
iscsi: Handle unmapped I/O requests.
Don't assume that csio->data_ptr is a pointer to a data buffer that can
be passed to icl_get_pdu_data and icl_append_data. For unmapped I/O
requests, csio->data_ptr is instead a pointer to a struct bio as
indicated by CAM_DATA_BIO. To support these requests, add
icl_pdu_append_bio and icl_pdu_get_bio methods which pass a pointer to
the bio and an offset and length relative to the bio's buffer.
Note that only backends supporting unmapped requests need to implement
these hooks.
Implement simple no-op hooks for the iser backend.
John Baldwin [Thu, 10 Mar 2022 23:48:20 +0000 (15:48 -0800)]
iscsi: Use ICL_NOCOPY for SCSI command immediate data and R2T.
The associated csio ccb will not be completed via xpt_done() until
after the associated PDUs are transmitted to the other side and either
the original PDU is acked with a SCSI response, or a response is
received for a subsequent abort CCB (which means the earlier PDU has
also been sent since it would have been sent before the abort PDU).
This does assume that once an I/O request has been aborted, no further
PDUs with data payload are queued for that I/O request.
John Baldwin [Thu, 10 Mar 2022 23:40:44 +0000 (15:40 -0800)]
gcore: Use PT_GETREGSET to fetch NT_PRSTATUS and NT_FPREGSET.
Add an elf_putregnote() helper to build the ELF note for a register
set. One nice result of this approach is that this reuses the
kernel's support for generating 32-bit register sets for 32-bit
processes, avoiding the need to duplicate that logic in elf32core.c.
Reviewed by: markj
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34447
John Baldwin [Thu, 10 Mar 2022 23:40:19 +0000 (15:40 -0800)]
Store core dump notes for all valid register sets for FreeBSD processes.
In particular, use a generic wrapper around struct regset rather than
requiring per-regset helpers. This helper replaces the MI
__elfN(note_prstatus) and __elfN(note_fpregset) helpers. It also
removes the need to explicitly dump NT_ARM_ADDR_MASK in the arm64
__elfN(dump_thread).
Reviewed by: markj, emaste
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34446
John Baldwin [Thu, 10 Mar 2022 23:39:53 +0000 (15:39 -0800)]
libpmcstat: Fix a few ARM-specific issues with function symbols.
- Refine the checks for ARM mapping symbols and apply them on arm64 as
well as 32-bit arm. In particular, mapping symbols are not strictly
limited to just "$a" but can carry additional characters as a
suffix (e.g. "$a.1"). Add "$x" to the list of mapping symbol
prefixes.
- Clear the LSB of function symbol addresses. Thumb function
addresses set the LSB to enable Thumb mode. However, the actual
function starts at the aligned address with LSB clear. Not clearing
the LSB can cause pmcannotate to pass misaligned addresses to
objdump when extracting disassembly.
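Both checks can be sketched in a few lines. The function names are
hypothetical and this is not the libpmcstat code; the prefix set
"$a", "$d", "$t", "$x" and the "." suffix rule are assumptions taken
from the ARM ELF conventions rather than from the message above, which
only mentions "$a", "$a.1" and "$x".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True for "$a", "$d", "$t", "$x", optionally followed by ".<anything>". */
static bool
is_mapping_symbol_sketch(const char *name)
{
	assert(name != NULL);
	if (name[0] != '$' || name[1] == '\0')
		return (false);
	if (strchr("adtx", name[1]) == NULL)
		return (false);
	return (name[2] == '\0' || name[2] == '.');
}

/* Thumb function symbols set bit 0; the code starts at the even address. */
static uint64_t
function_start_sketch(uint64_t sym_value)
{
	return (sym_value & ~(uint64_t)1);
}
```

Filtering out the mapping symbols keeps them from being reported as
functions, and masking the low bit gives a disassembler an aligned
start address.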
Reviewed by: andrew
Obtained from: CheriBSD
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34416
John Baldwin [Thu, 10 Mar 2022 23:39:37 +0000 (15:39 -0800)]
ddb: Add 'show gic <name>' and 'show all gics' commands.
To handle the different register layouts for different versions, add a
GIC_DB_SHOW() method. Currently this hook is only implemented for
versions 1 and 2.
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34415
Ed Maste [Sun, 27 Feb 2022 19:04:09 +0000 (14:04 -0500)]
fwcontrol: eliminate set but not used warning
The variable was used in an #if 0 block; just move the variable
definition and setting into the same block since Firewire is mainly of
historical interest and is unlikely to see ongoing development in
FreeBSD.
Andrew Turner [Thu, 10 Mar 2022 18:00:40 +0000 (18:00 +0000)]
Split out creating the arm64 L2 dmap entries
When creating the DMAP region we may need to create level 2 page table
entries at the start and end of a block of memory. The code to do this
was almost identical, so we can merge it into a single function.