Mark Johnston [Thu, 21 Apr 2022 17:22:09 +0000 (13:22 -0400)]
mld6: Ensure that mld_domifattach() always succeeds
mld_domifattach() does a memory allocation under the global MLD mutex
and so can fail, but no error handling prevents a null pointer
dereference in this case. The mutex is only needed when updating the
global softc list; the allocation and static initialization of the softc
does not require this mutex. So, reduce the scope of the mutex and use
M_WAITOK for the allocation.
PR: 261457
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34943
pf: counter argument to pfr_pool_get() may never be NULL
Coverity points out that if counter was NULL when passed to
pfr_pool_get() we could potentially end up dereferencing it.
Happily all users of the function pass a non-NULL pointer. Enforce this
by assertion and remove the pointless NULL check.
Move the use of 'sc' to after the NULL check.
It's very unlikely that we'd actually hit this, but Coverity is correct
that it's not a good idea to dereference the pointer and only then NULL
check it.
Mark Johnston [Thu, 21 Apr 2022 14:49:22 +0000 (10:49 -0400)]
ctfdump: Remove definitions of warn() and vwarn()
The presence of the latter causes a link error when building a
statically linked ctfdump(1) because libc defines the same symbol.
libc's warn() is defined as a weak symbol and so does not cause the same
problem, but let's just use libc's version.
Reported by: stephane rochoy <stephane.rochoy@stormshield.eu>
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
xhci(4): Ensure the so-called data toggle gets properly reset.
Use the drop and enable endpoint context commands to force a reset of
the data toggle for USB 2.0 and USB 3.0 after:
- clear endpoint halt command (when the driver wishes).
- set config command (when the kernel or user-space wants).
- set alternate setting command (only affected endpoints).
Some XHCI HW implementations may not allow the endpoint reset command when
the endpoint context is not in the halted state.
Reported by: Juniper and Gary Jennejohn
MFC after: 1 week
Sponsored by: NVIDIA Networking
Doug Moore [Wed, 20 Apr 2022 22:24:11 +0000 (17:24 -0500)]
dev/iommu: Include offset in maxaddr check.
If iommu_gas_match_one has to adjust for a boundary crossing, its
check against maxaddr includes 'offset' in its calculation, to ensure
that the allocated memory does not exceed the max address. However, if
there's no boundary crossing adjustment, then the maxaddr check
disregards 'offset'. Fix that.
Alan Somers [Wed, 21 Apr 2021 22:56:48 +0000 (16:56 -0600)]
ctlstat: add prometheus output
When invoked by inetd, ctlstat -P will now produce output suitable for
ingestion into Prometheus.
It's a drop-in replacement for https://github.com/Gandi/ctld_exporter,
except that it doesn't report the number of initiators per target, and
it does report time and dma_time.
Andrew Turner [Tue, 19 Apr 2022 16:22:37 +0000 (17:22 +0100)]
Fill the page size array in one posix shm test
The largepage_config posix shared memory test was failing on arm64 as
the page size array is never filled out. Fix this by calling
getpagesizes(3), via pagesizes.
Reviewed by: markj, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34960
15b1eb142c changed the callout code to store the CALLOUT_SHAREDLOCK flag
in c_iflags (where it used to be c_flags), but failed to update the
check in softclock_call_cc(). This resulted in the callout code always
taking the write lock, even if a read lock had been requested (with
the CALLOUT_SHAREDLOCK flag in callout_init_rm()).
When we issue a request to pf and expect a serialised nvlist as a reply
we have to supply a suitable buffer to the kernel.
The required size for this buffer is difficult to predict, and may be
(slightly) different from request to request.
If it's insufficient the kernel will return ENOSPC. Teach libpfctl to
catch this and send the request again with a larger buffer.
Previously it was disabled right before translation was enabled.
This way the disable logic is still executed even when translation
is not be activated, e.g. with hw.iommu.dma=0 tunable set.
On some platforms we need to disable PMR in order for core dump to work.
At the same time it was observed that enabling translation has
a significant impact on network performance.
With this patch PMR can be disabled, with IOMMU translation not being
turned on by appending the following to the loader.conf:
Ed Maste [Tue, 19 Apr 2022 19:44:46 +0000 (15:44 -0400)]
capsicum: briefly describe capabilities in man page
Provide a very brief introduction to capabilities, using a couple of
sentences from David Chisnall's mailing list response[1] to a question
about Linux capabilities and Capsicum.
Mailing list subject (in case the archive URL changes) was
Re: Linux capabilities to Capsicum
John Baldwin [Tue, 19 Apr 2022 17:43:06 +0000 (10:43 -0700)]
Deprecate the 'devclass' argument from *DRIVER_MODULE() macros.
This argument is useless for the vast majority of drivers. For now,
use __VA_ARGS__ wrapper macros so that that the *DRIVER_MODULE()
macros accept both the old version (with a devclass) and the new
version (which omits the argument and stores NULL in the
driver_module_data structure). This provides an API compatiblity
shim that can be merged to older stable branches.
Once all drivers relevant to 14.0 (both in and out of tree) have been
updated, the API compat shims can be dropped.
Insert padding in __cxa_exception struct for compatibility
Similar to https://github.com/llvm/llvm-project/commit/f2a436058fcb, the
addition of __attribute__((__aligned__)) to _Unwind_Exception (in commit b9616964) causes implicit padding to be inserted before the unwindHeader
field in __cxa_exception.
Applications attempt to get at the earlier fields in __cxa_exception, so
preserve the same negative offsets in __cxa_exception, by moving the
padding to the beginning of the struct.
The assumption here is that if the ABI is not aware of the padding
before unwindHeader and put the referenceCount/primaryException in
there, no padding should exist before unwindHeader.
This should make libreoffice's custom exception handling mechanisms work
correctly, even if it was built against an older cxxabi.h/unwind.h pair.
Tom Jones [Tue, 19 Apr 2022 15:20:24 +0000 (16:20 +0100)]
diff3: Add support for -m
diff3 in -m mode generates a complete file with changes bracketed with
conflict markers. This adds support for diff3 to generate version
control style three way merge output.
The output format was inferred from looking at the gnu diff3 output on a
selection of test files as a specification of what diff3 -m should
output is not available. It is likely there are cases where the -m
output differs from other tools and I am happy to update diff3 to
address these.
Discussed with: pstef, kevans
Sponsored by: Klara, Inc.
Tom Jones [Tue, 19 Apr 2022 14:43:35 +0000 (15:43 +0100)]
diff3: Add support for -A
Diff3 in -A mode generates an ed script to show how the 3 files and
brackets changes that conflict. The ed script generated should when
applied leave familiar merge conflict markers in a patched file.
Diff3 output is not documented, this feature has been arrived at by
comparing bsd diff3 output to gnu diff3 output until they were made to
agree. There are likely to still be differences between these formats.
The gnu diff3 guide is actually quite good at explaining how diff3
output should appear, but it doesn't cover every form of output from
diff3.
Tom Jones [Tue, 19 Apr 2022 13:47:07 +0000 (14:47 +0100)]
diff3: Clean up printing of ranges for edscript output
Replace the edscript code that tracked and printed lines using byte
offsets with code that can work from line offsets.
This tidies up the reduces duplication in the edscript output code. It
also fixes the usage of the de struct so that it only tracks diffs as
line offsets rather than the usage changing from line offsets to byte
offsets during the lifetime of diff3.
Large files with large numbers of ranges will probably suffer in
performance here, but as we don't use diff3 yet this isn't a regression.
Include a warning for future hackers so they have a place to start
hacking from.
Reviewed by: pstef
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34941
Alan Somers [Mon, 18 Apr 2022 21:29:37 +0000 (15:29 -0600)]
prometheus_sysctl_exporter: fix metric aliasing
When exporting sysctls to Prometheus, the exporter replaces "." with
"_". This caused several metrics to alias, confusing the Prometheus
server. Fix it by:
* Renaming the "tcp_log_bucket" UMA zone to "tcp_log_id_bucket". Also,
rename "tcp_log_node" to "tcp_log_id_node" for consistency.
* Not exporting sysctls with "(LEGACY)" in the description. That is
used by ZFS sysctls that have been replaced by others, many of which
alias to the same Prometheus metric name (like "vfs.zfs.arc_max" and
"vfs.zfs.arc.max").
diff: tests: loosen up requirements for report_identical
This test cannot run without an unprivileged_user being specified
anyways, so just run as the unprivileged user. Revoking read permisions
works just as well if you're guaranteed non-root.
Reviewed by: pstef
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34950
Alan Somers [Mon, 18 Apr 2022 23:03:53 +0000 (17:03 -0600)]
fusefs: correctly handle servers that report too much data written
During a FUSE_WRITE, the kernel requests the server to write a certain
amount of data, and the server responds with the amount that it actually
did write. It is obviously an error for the server to write more than
it was provided, and we always treated it as such, but there were two
problems:
* If the server responded with a huge amount, greater than INT_MAX, it
would trigger an integer overflow which would cause a panic.
* When extending the file, we wrongly set the file's size before
validing the amount written.
Michael Tuexen [Mon, 18 Apr 2022 22:40:31 +0000 (00:40 +0200)]
if_vtnet: improve dumping a kernel
Disable software LRO during kernel dumping, because having it enabled
requires to be in a network epoch, which might or might not be the
case depending on the code path resulting in the panic.
Reviewed by: markj
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D34787
Mark Johnston [Mon, 18 Apr 2022 21:16:10 +0000 (17:16 -0400)]
geli: Add a chicken switch for unmapped I/O
We have a report of a panic in GELI that appears to go away when
unmapped I/O is disabled. Add a tunable to make such investigations
easier in the future. No functional change intended.
PR: 262894
Reviewed by: asomers
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34944
John Baldwin [Mon, 18 Apr 2022 21:09:20 +0000 (14:09 -0700)]
arm ti_mbox_attach: Write sysconfig to TI_MBOX_SYSCONFIG to request reset.
This variable was flagged as a set but unused warning as its value was
read from a register and then modified to set a bit
(TI_MBOX_SYSCONFIG_SOFTRST). After the variable is modified, the code
then loops waiting for the SOFTRST bit to go clear in the
TI_MBOX_SYSCONFIG register. Presumably merely reading from the
register does not request a reset as other places in the driver read
this register, so most likely the updated value of sysconfig setting
the reset bit is supposed to be written to the register to request a
reset before the polling loop that waits for the reset to finish.
John Baldwin [Mon, 18 Apr 2022 19:48:42 +0000 (12:48 -0700)]
iscsi: Fetch limits based on a socket rather than assuming global limits.
cxgbei needs the ability to return different limits based on the
connection (e.g. if the connection is over a T5 adapter or a T6
adapter as well as factoring in the MTU).
This change plumbs through the changes in the ioctls without changing
any of the backends. The limits callback passed to icl_register now
accepts a second socket argument which holds the integer file
descriptor. To support ABI compatiblity for old binaries, the
callback should return "global" values if the socket fd is zero.
The CTL_ISCSI_LIMITS argument used with CTL_ISCSI by ctld(8) now
accepts the socket fd in a field that was previously part of a
reserved spare field. Old binaries zero this request which results in
passing a socket fd of 0 to the limits callback.
The ISCSIDREQUEST ioctl no longer returns limits. Instead, iscsid(8)
invokes a new ISCSIDLIMITS ioctl after establishing the connection via
connect(2). For ABI compat, if the old ISCSIDREQUEST is invoked, the
global limits are still fetched (with a socket fd of 0) and returned.
John Baldwin [Mon, 18 Apr 2022 19:25:08 +0000 (12:25 -0700)]
linuxkpi_ieee80211_tx_status: Mark ridx as unused.
__diagused only squelches warnings for variables used under
INVARIANTS, it does not apply to custom debug knobs like
LINUXKPI_DEBUG_80211. Use __unused instead.
John Baldwin [Mon, 18 Apr 2022 19:27:48 +0000 (12:27 -0700)]
uhid_snes: Remove USB_ST_TRANSFERRED handling for the status request.
The result of the request computed in new_status was never returned to
the caller leaving new_status as a set-but-unused variable. Removing
new_status leaves sc->previous_status as a write-only variable.
Removing sc->previous_status leaves current_status as a write-only
variable, so it collapses down to removing the entire
USB_ST_TRANSFERRED case.
Arguably, all of the support for UHID_SNES_STATUS_DT_RD should be
removed as it doesn't return anything to the caller. If the request
should be fixed instead then this commit should be reverted and
new_status should be returned to whoever submitted the request.
John Baldwin [Mon, 18 Apr 2022 19:06:52 +0000 (12:06 -0700)]
ata_kauai: Fix support for "shasta" controllers.
The probe routine was setting a value in the softc, but since the
probe routine was not returning zero, this value was lost since the
softc was reallocated (and re-zeroed) when the device was attached.
This is similar in nature to the fixes from 965205eb66cae3fd5de75a70a3aef2f014f98020.
To fix, move the code to set the 'shasta' flag to the start of attach
along with related code to set an IRQ resource on some non-shasta
devices. The IRQ resource still "worked" being in the probe routine
as the IRQ resource persisted after probe returned, but it is cleaner
to go ahead and move it to attach after setting the 'shasta' flag.
I have no way to test this, but noticed this while reading the code.
John Baldwin [Mon, 18 Apr 2022 19:04:30 +0000 (12:04 -0700)]
destroy_dev_sched*: Don't hold Giant for all deferred destroy_dev.
Rather than using taskqueue_swi_giant which holds Giant for all
deferred destroy_dev calls, create a separate queue for destroyed
devices with D_NEEDGIANT set in the corresponding cdevsw. The task
for this queue holds Giant whild destroying deferred devices while the
task for the default queue does not hold Giant.
In addition, switch to taskqueue_thread for destroy_dev_sched.
Deferred destroy_dev requests don't need to run at an SWI priority.
ping: split the visual part of -f into a new option -.
After this, we'll be able to ping a host and not spam the terminal, and
no flooding will have to be involved. I've been doing this under Linux
as ping -fi1 host.
Reviewed by: rpokala, Pau Amma
Differential Revision: https://reviews.freebsd.org/D34882
The new '-L' flag will cause savecore to invoke the new mem(4) kernel
dump ioctl, taking a dump of the running system and writing the result
to a temporary file. Validation of the dump header is performed, similar
to regular crash dumps, and the final result is written to
livecore.X[.zst|.gz].
Also added is the '-Z' flag, which instructs the kernel to compress the
livedump compressed with zstd, akin to the existing -z flag. This option
has no effect in normal savecore(8) operation, but in theory could be
extended to perform such compression while reading the dump from the
dump device.
Encryption is unsupported for live dumps.
For example: 'savecore -Lz /var/crash' would create:
/var/crash/livecore.0.gz
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34347
All files are now created relative to savedirfd, e.g. with openat(2).
Therefore, we do not need character buffers to be PATH_MAX bytes long,
just long enough to hold the complete filename. 32 bytes is long enough
in all cases. These can be allocated on the stack.
While here, fix an error message that attempts to use an uninitialized
infoname.
Reviewed by: markj
MFC after: 3 days
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34821