dumpon: Fix unconfiguring netdump with "off" and "/dev/null".
Netdump has its own configuration tracking such that
ioctl(/dev/null, DIOCSKERNELDUMP) does a dumper_remove() but does not
notify netdump about the removal. Simply sending the same ioctl to
/dev/netdump handles the situation.
dumpon: Fix -v causing error when configuring an encrypted dump
If -v is specified when adding a new device then a full listing of
configured devices is displayed. This requires sysctl access which
genkey()'s use of capability mode was blocking permission to access.
This leads to both confusing console spam but also incorrectly returning
an error status even if no other had been encountered.
dumpon: Sysctl get 'kern.shutdown.dumpdevname': Operation not permitted
Fix this by generating the key in a child process.
Ed Maste [Mon, 26 Jul 2021 14:33:09 +0000 (10:33 -0400)]
Cirrus-CI: Temporarily skip package build + test
The PKG_FORMAT=tar used by Cirrus CI's pkgbase build is failing after 6cafdee71d2b ("pkgbase: Track pkg 1.17"). Skip package build and test
in Cirrus-CI until new pkg is available.
Numerous counters got migrated from straight uint64_t to the counter(9)
API. Unfortunately the implementation comes with a significiant
performance hit on some platforms and cannot be easily fixed.
Work around the problem by implementing a pf-specific variant.
Now that dounmount() supports a dedicated taskqueue, we can simply call
it with MNT_DEFERRED directly from the failing context. This also
avoids blocking taskqueue_thread with a potentially-expensive unmount
operation.
We no longer allow upper filesystems to be unregistered from the base
mount while vfs_notify_upper() or any other upper operation is pending.
New upper mounts can still be registered during this period, but they
will be added at the end of the upper mount tailq. We therefore no
longer need to allocate marker nodes during vfs_notify_upper() to keep
our place in the iteration.
Allow stacked filesystems to be recursively unmounted
In certain emergency cases such as media failure or removal, UFS will
initiate a forced unmount in order to prevent dirty buffers from
accumulating against the no-longer-usable filesystem. The presence
of a stacked filesystem such as nullfs or unionfs above the UFS mount
will prevent this forced unmount from succeeding.
This change addreses the situation by allowing stacked filesystems to
be recursively unmounted on a taskqueue thread when the MNT_RECURSE
flag is specified to dounmount(). This call will block until all upper
mounts have been removed unless the caller specifies the MNT_DEFERRED
flag to indicate the base filesystem should also be unmounted from the
taskqueue.
To achieve this, the recently-added vfs_pin_from_vp()/vfs_unpin() KPIs
have been combined with the existing 'mnt_uppers' list used by nullfs
and renamed to vfs_register_upper_from_vp()/vfs_unregister_upper().
The format of the mnt_uppers list has also been changed to accommodate
filesystems such as unionfs in which a given mount may be stacked atop
more than one lower mount. Additionally, management of lower FS
reclaim/unlink notifications has been split into a separate list
managed by a separate set of KPIs, as registration of an upper FS no
longer implies interest in these notifications.
Alan Cox [Sat, 24 Jul 2021 08:50:27 +0000 (03:50 -0500)]
amd64: Don't repeat unnecessary tests when cmpset fails
When a cmpset for removing the PG_RW bit in pmap_promote_pde() fails,
there is no need to repeat the alignment, PG_A, and PG_V tests just to
reload the PTE's value. The only bit that we need be concerned with at
this point is PG_M. Use fcmpset instead.
amd64 pti init: fix calculation of the kernel text start
Old expression happens to provide the correct answer, but assumes that
kernel is loaded at physical address zero, with 2M gap. Do not use
kernphys to calculate KVA of kernel text start, just explicitly write
out KERNBASE and the hole size.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31121
With recent ATF (v2.5) the PMIC is reset to I2C mode.
Without a PMIC no regulators can be changed/enabled/disabled
This fixes cpufreq on A64 (at least) and anything else that needs
regulators handled by the PMIC.
Warner Losh [Tue, 20 Jul 2021 04:47:30 +0000 (22:47 -0600)]
awk: Make -F '' and -v FS="" behave the same
IEEE Std 1003.1-2008 mandates that -F str be treated the same as -v
FS=str. For a null string, this was not the case. Since awk(1) documents
that a null string for FS has a specific behavior, make -F '' behave
consistently with -v FS="".
Warner Losh [Sat, 24 Jul 2021 15:03:53 +0000 (09:03 -0600)]
devctl: don't publish the mount options
Mount options aren't solely ASCII strings. In addition, experience to
date suggests that the mount options are much less useful than was
originally supposed and the mount flags suffice to make decisions. Drop
the reporting of options for the mount/remount/unmount events.
Alan Cox [Sat, 24 Jul 2021 03:50:10 +0000 (22:50 -0500)]
amd64: Eliminate a redundant test from pmap_enter_object()
The call to pmap_allow_2m_x_page() in pmap_enter_object() is redundant.
Specifically, even without the call to pmap_allow_2m_x_page() in
pmap_enter_object(), pmap_allow_2m_x_page() is eventually called by
pmap_enter_pde(), so the outcome will be the same. Essentially,
calling pmap_allow_2m_x_page() in pmap_enter_object() amounts to
"optimizing" for the unexpected case.
Warner Losh [Fri, 23 Jul 2021 21:21:02 +0000 (15:21 -0600)]
geom_disk: use a preallocated geom_event for disk destruction.
Preallocate a geom_event (using the new geom_alloc_event) when we create
a disk. When we create the disk, we're going to be in a sleepable
context, so we can always allocate this extra bit of memory. Then use
this preallocated memory to free the disk. CAM can try to free the disk
from an unsleepable context if there was I/O outstanding when the disk
was destroyted (say because the SIM said it had gone away). The I/O
context isn't sleepable. Rather than trying to invent a retry mechanism
and making sure all the other geom_disk consumers did it properly,
preallocating the event ensure that the geom_disk will be properly torn
down, even when there's memory pressure when the disk departs.
Warner Losh [Fri, 23 Jul 2021 21:16:57 +0000 (15:16 -0600)]
geom: create an API to allocate events, and use that storage to send them
g_alloc_event will allocate storage for an opaque event. g_post_event_ep
can use memory returned by g_alloc_event to send an event from a context
that might not be able to allocate the event. Occasionally, we can
alloate memory when we create an object, but not while we're destroy
it. This allows one to allocate at creation time memory to use when
destorying the object.
Mark Johnston [Fri, 23 Jul 2021 14:41:00 +0000 (10:41 -0400)]
KASAN: Disable checking before triggering a panic
KASAN hooks will not generate reports if panicstr != NULL, but then
there is a window after the initial panic() call where another report
may be raised. This can happen if a false positive occurs; to simplify
debugging of such problems, avoid recursing.
Mark Johnston [Fri, 23 Jul 2021 14:30:29 +0000 (10:30 -0400)]
redzone: Raise a compile error if KASAN is configured
redzone(9) does some munging of the allocation to insert redzones before
and after a valid memory buffer, but KASAN does not know about this and
will raise false positives if both are configured. Until this is fixed,
do not allow both to be configured. Note that KASAN provides similar
checking on its own but currently does not force the creation of
redzones for all UMA allocations; this should be addressed as well.
Eric van Gyzen [Fri, 23 Jul 2021 13:49:55 +0000 (08:49 -0500)]
aio_md_test: NUL-terminate result of readlink
readlink does not NUL-terminate the output buffer. This led to spurious
failures to destroy the md device because the unit number was garbage.
NUL-terminate the output buffer.
Eric van Gyzen [Fri, 23 Jul 2021 13:24:52 +0000 (08:24 -0500)]
aio_md_test: fix cleanup
ATF cleanup functions cannot use functions such as ATF_REQUIRE
and atf_tc_fail. These functions assert that a test case is
currently running, which is not true during cleanup, so the
process aborts. Change the cleanup function to simply print
to stderr and return.
Martin Matuska [Fri, 23 Jul 2021 00:50:13 +0000 (02:50 +0200)]
zfs: merge openzfs/zfs@14b43fbd9 (master) into main
Notable upstream pull request merges:
#12271 Tinker with slop space accounting with dedup
#12279 Fix ARC ghost states eviction accounting
#12284 Add Module Parameter Regarding Log Size Limit
#12300 Introduce dsl_dir_diduse_transfer_space()
#12314 Optimize allocation throttling
#12348 Minor ARC optimizations
#12350 Detect HAVE_LARGE_STACKS at compile time
#12356 Use SET_ERROR for more errors in FreeBSD vnops
#12375 FreeBSD: Ignore make_dev_s() errors
#12378 FreeBSD: Switch from MAXPHYS to maxphys on FreeBSD 13+
Ryan Moeller [Thu, 22 Jul 2021 21:29:27 +0000 (17:29 -0400)]
zloop: Add a max iterations option, use default run/pass times
It is useful to have control over the number of iterations of zloop so
we can easily produce "x core dumps found *in y iterations*" metrics.
Using random values for run/pass times doesn't improve coverage in a
meaningful way.
Randomizing run time could be seen as a compromise between running a
greater variety of shorter tests versus a smaller variety of longer
tests within a fixed time span. However, it is not desirable when
running a fixed number of iterations.
Pass time already incorporates randomness within ztest.
Either parameter can be passed to ztest explicitly if the defaults are
not satisfactory.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12411
riscv: Fix pmap_kextract racing with concurrent superpage promotion/demotion
This repeats amd64's cfcbf8c6fd3b (r180498) and i386's cf3508519c5e
(r202894) but for riscv; pmap_kextract must be lock-free and so it can
race with superpage promotion and demotion, thus the L2 entry must only
be loaded once to avoid using inconsistent state.
Yang Zhong [Thu, 22 Jul 2021 17:16:01 +0000 (13:16 -0400)]
mmc: Drain the intrhook in mmc_detach()
Buggy SD card drivers may attach and detach a mmc(4) driver instance in
quick succession. In this case mmc(4) must disestablish its intrhook
callback during detach. Thus, this change adds a call to
config_intrhook_drain(), which blocks or does nothing if the intrhook is
running or has already ran (the SD card was plugged in), and
disestablishes the hook if it hasn't ran yet (the SD card was not
plugged in).
PR: 254373
Reviewed by: imp, manu, markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31262
Alan Somers [Mon, 21 Jun 2021 18:20:51 +0000 (12:20 -0600)]
[skip ci] fix syntax in CODEOWNERS
* Fix invalid usernames
- Fix spelling of ngie-eign
- Delete users who aren't members of the FreeBSD org
* Fix spelling of usr.bin/fetch in CODEOWNERS
* rm "a directory anywhere in the repo" patterns from CODEOWNERS
Even though they're documented as working, in practice they don't.
https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/creating-a-repository-on-github/about-code-owners#codeowners-file-location
Alan Somers [Wed, 21 Jul 2021 21:11:00 +0000 (15:11 -0600)]
Escape any '.' characters in sysctl node names
ZFS creates some sysctl nodes that include a pool name, and '.' is an
allowed character in pool names. But it's the separator in the sysctl
tree, so it can't be included in a sysctl name. Replace it with "%25".
Handily, "%" is illegal in ZFS pool names, so there's no ambiguity
there.
Alexander Motin [Thu, 22 Jul 2021 16:22:14 +0000 (12:22 -0400)]
FreeBSD: Ignore make_dev_s() errors
Since errors returned by zvol_create_minor_impl() are ignored by the
common code, it is more convenient to ignore make_dev_s() errors there.
It allows, for example, to get device created for the zvol after later
rename instead of having it further stuck in half-created state.
zvol_rename_minor() already ignores those errors.
While there, switch from MAXPHYS to maxphys in FreeBSD 13+.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc.
Closes #12375
When using SDIO the block size if per function and most of the time
not equal to MMC_SECTOR_SIZE, fix sdio on dwmmc by setting the correct
block size in the mmc registers.
MFC after: 1 month
Sponsored by: Diablotin Systems
IR can be noisy in dmesg if it "receive" some unwanted data.
Add a tunable hw.aw_cir.debug to enable those message that are
only useful if one wants to debug the driver.
fdt: Expose the model, compatible and freebsd dts brandind as sysctl
This make it easier for script to get the hardware on which they are running.
MFC after: 1 month
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D31205
Reviewed by: imp
Should be ok on powerpc: jhibbits (over irc)