Kirk McKusick [Tue, 13 Jun 2023 07:21:43 +0000 (00:21 -0700)]
Write out corrected superblock when creating a UFS/FFS snapshot.
When taking a snapshot on a UFS version 1 filesystem we need to
call ffs_oldfscompat_write() to unwind any in-memory changes that
were made to the superblock before writing it. The cause of this bug
was that the trimmed down maximum file size was not being reverted.
PR: 271352
Tested-by: Peter Holm
MFC-after: 1 week
Sponsored-by: The FreeBSD Foundation
This variable was used to print the created interface name in the
atexit(3) handler. The interface name was calculated in
ifclonecreate() by matching old & new names.
This change alters the implementation in the following way:
1) the function responsible for the interface creation (ifcreate_ioctl)
updates all necessary state internally. This removes the need for the
name manipulation hack in wlan_create().
2) As the atexit(3) handler does not accept any parameters, explicitly store
the name to print in the ifname_to_print variable read by the atexit(3)
handler.
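A minimal sketch of point 2, not the ifconfig code itself. Only the
ifname_to_print name comes from this message; the helper names and the
16-byte buffer size are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Name to report at exit; an empty string means "print nothing". */
static char ifname_to_print[16];	/* size is an assumption */

/* atexit(3) handlers take no arguments, so read the file-scope variable. */
static void
print_ifname(void)
{
	if (ifname_to_print[0] != '\0')
		printf("%s\n", ifname_to_print);
}

/* Hypothetical helper: record the name once the interface exists. */
static void
set_ifname_to_print(const char *name)
{
	assert(name != NULL);
	snprintf(ifname_to_print, sizeof(ifname_to_print), "%s", name);
}

/* Register the handler; returns 0 on success, like atexit(3). */
static int
register_ifname_printer(void)
{
	return (atexit(print_ifname));
}
```

The creation path would call set_ifname_to_print() after a successful
create, and the handler prints whatever was stored last.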
ifconfig: add if_ctx argument to the generic and ifclone callbacks.
This is the continuation of the ifconfig cleanup work. This change is
a prerequisite for the next changes removing some of the global variables.
It will also help in implementing functionality via Netlink instead of ioctl.
No functional changes intended.
* vxlan_cb() was removed as it contained no code
* ioctl_ifcreate() was renamed to ifcreate_ioctl() to follow the other
netlink/ioctl function naming. Netlink and ioctl provide _different_
interfaces and it's not possible to have a unified interface object
that can be filled by either netlink or ioctl implementations. With that
in mind, I'm leaning more towards the function_<nl|ioctl> postfix pattern
than an ioctl_ or netlink_ prefix.
Xin LI [Tue, 13 Jun 2023 04:08:32 +0000 (21:08 -0700)]
expand_number: Tighten check of unit.
The current code silently ignores characters after the unit as long as
the unit itself is recognized. This commit makes expand_number(3)
fail with EINVAL if buf does not terminate after the unit character.
Historically, the function has accepted and ignored "B" as an SI unit. This
behavior is preserved, so e.g. KB and MB are still accepted as aliases of
K and M; document this behavior in the manual page.
While I am there, also write a few test cases to validate the behavior.
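The tightened rule can be sketched as follows. This is not the libutil
implementation: parse_scaled() is a hypothetical helper that parses
decimal only and omits overflow checks, and it merely illustrates
"optional unit, optional B, then end of string":

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse "<digits>[unit][B]".
 * Returns 0 on success, or -1 with errno set to EINVAL.
 */
static int
parse_scaled(const char *buf, uint64_t *out)
{
	static const char units[] = "kmgtpe";	/* each step is 2^10 */
	const char *u;
	char *end;
	unsigned long long n;

	assert(buf != NULL && out != NULL);
	if (!isdigit((unsigned char)buf[0]))
		goto bad;
	n = strtoull(buf, &end, 10);
	/* Optional unit letter. */
	if (*end != '\0' &&
	    (u = strchr(units, tolower((unsigned char)*end))) != NULL) {
		n <<= 10 * (unsigned)(u - units + 1);
		end++;
	}
	/* Optional "B", accepted and ignored. */
	if (*end == 'B' || *end == 'b')
		end++;
	/* The tightened check: nothing may follow the unit. */
	if (*end != '\0')
		goto bad;
	*out = n;
	return (0);
bad:
	errno = EINVAL;
	return (-1);
}
```

With this sketch "1K" and "2MB" parse, while "1Kx" and "1KBB" are
rejected instead of having the trailing characters ignored.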
Warner Losh [Tue, 13 Jun 2023 03:37:10 +0000 (21:37 -0600)]
nvme: Switch to nda by default
We already run nda by default on all the !x86 architectures. Switch the
default to nda. nda creates nvd compatibility links by default, so this
should be a nop. If this causes problems for your application, set
hw.nvme.use_nvd=1 in your loader.conf.
Mitchell Horne [Mon, 12 Jun 2023 18:59:00 +0000 (15:59 -0300)]
mac(9): update SEE ALSO
Rather than maintaining an incomplete list of MAC modules references,
just reference mac(4), where such a list can be found.
Reviewed by: Mina Galić <freebsd@igalic.co>
Reviewed by: Pau Amma <pauamma@gundo.com>
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40485
Mitchell Horne [Mon, 12 Jun 2023 18:56:34 +0000 (15:56 -0300)]
mac(4): update the references to MAC modules
Add entries for mac_ntpd(4) and mac_priority(4) to the table of MAC
modules.
Drop the entry for mac_none(4) from the list, but retain the
cross-reference in SEE ALSO. This module has no functional impact and is
of minimal interest to users. Add a new cross-reference to the similar
mac_stub(4), limited to SEE ALSO for the same reasoning.
Reviewed by: Pau Amma <pauamma@gundo.com>
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40483
Alan Cox [Wed, 31 May 2023 23:10:41 +0000 (18:10 -0500)]
amd64/arm64 pmap: Stop requiring the accessed bit for superpage promotion
Stop requiring all of the PTEs to have the accessed bit set for superpage
promotion to occur. Given that change, add support for promotion to
pmap_enter_quick(), which does not set the accessed bit in the PTE that
it creates.
Since the final mapping within a superpage-aligned and sized region of a
memory-mapped file is typically created by a call to pmap_enter_quick(),
we now achieve promotions in circumstances where they did not occur
before, for example, the X server's read-only mapping of libLLVM-15.so.
See also https://www.usenix.org/system/files/atc20-zhu-weixi_0.pdf
Ed Maste [Mon, 12 Jun 2023 17:54:56 +0000 (13:54 -0400)]
wg(4): add Matt Macy back to AUTHORS section
Matt did the initial in-kernel FreeBSD driver port. The driver would
not exist without that work and some of it remains, even if the driver
was largely rewritten and reworked before being added back to the tree.
Overview:
Intel(R) QuickAssist Technology (Intel(R) QAT) provides hardware
acceleration for offloading security, authentication and compression
services from the CPU, thus significantly increasing the performance and
efficiency of standard platform solutions.
This commit introduces:
- Intel® 4xxx Series VF driver support.
- Device configurability via sysctls.
- UIO support for Intel® 4xxx Series devices.
Patch co-authored by: Krzysztof Zdziarski <krzysztofx.zdziarski@intel.com>
Patch co-authored by: Michal Gulbicki <michalx.gulbicki@intel.com>
Patch co-authored by: Julian Grajkowski <julianx.grajkowski@intel.com>
Patch co-authored by: Piotr Kasierski <piotrx.kasierski@intel.com>
Patch co-authored by: Lukasz Kolodzinski <lukaszx.kolodzinski@intel.com>
Patch co-authored by: Karol Grzadziel <karolx.grzadziel@intel.com>
Mark Johnston [Mon, 12 Jun 2023 16:09:34 +0000 (12:09 -0400)]
opencrypto: Handle end-of-cursor conditions in crypto_cursor_segment()
Some consumers, e.g., swcr_encdec(), may call crypto_cursor_segment()
after having advanced the cursor to the end of the buffer. In this case
I believe the right behaviour is to return NULL and a length of 0.
When this occurs with a CRYPTO_BUF_VMPAGE buffer, the cc_vmpage pointer
will point past the end of the page pointer array, so
crypto_cursor_segment() ends up dereferencing a random pointer before
the function returns a length of 0. The uio-backed cursor has
a similar problem.
Address this by keeping track of the residual buffer length and
returning immediately once the length is zero.
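The idea of tracking the residual length can be sketched with a
simplified cursor. The struct and function names below are hypothetical
and do not match the opencrypto API; segments are assumed to have
non-zero length:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified buffer made of contiguous segments. */
struct seg {
	const unsigned char *base;
	size_t len;
};

struct cursor {
	const struct seg *segs;	/* current segment */
	size_t off;		/* offset within the current segment */
	size_t resid;		/* bytes left in the whole buffer */
};

static void
cursor_init(struct cursor *c, const struct seg *segs, size_t nsegs)
{
	size_t i;

	c->segs = segs;
	c->off = 0;
	c->resid = 0;
	for (i = 0; i < nsegs; i++)
		c->resid += segs[i].len;
}

/* Return the current contiguous run, or NULL and length 0 at the end. */
static const void *
cursor_segment(const struct cursor *c, size_t *len)
{
	assert(c != NULL && len != NULL);
	if (c->resid == 0) {
		/* c->segs may point past the array; do not dereference. */
		*len = 0;
		return (NULL);
	}
	*len = c->segs->len - c->off;
	return (c->segs->base + c->off);
}

static void
cursor_advance(struct cursor *c, size_t n)
{
	assert(n <= c->resid);
	c->resid -= n;
	while (n > 0) {
		size_t avail = c->segs->len - c->off;

		if (n < avail) {
			c->off += n;
			break;
		}
		n -= avail;
		c->segs++;	/* may now be one past the last segment */
		c->off = 0;
	}
}
```

Once resid reaches zero the segment pointer is never read again, which
is the property the fix relies on.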
PR: 271766
Reported by: Andrew "RhodiumToad" Gierth <andrew@tao11.riddles.org.uk>
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40428
A TPM has an event log. Therefore, qemu adds a FwCfg item and adds it to
an ACPI table. We like to use the same OVMF driver as qemu, so we should
do the same. This commit adds the ability to basl to do it.
John Baldwin [Mon, 12 Jun 2023 10:47:35 +0000 (12:47 +0200)]
bhyve: Remove vestigial support for setting max vCPUs.
The kernel part of the hypervisor is not going to support per-VM maxcpu
limits. The topology is only used to control the values returned by
CPUID leaves for which max vCPUs is not relevant.
Certain internet service providers transmit vlan 0 priority tagged
EAPOL frames from the ONT towards the residential gateway. VID 0
should be ignored, and the frame processed according to the priority
set in the 802.1P bits and the encapsulated EtherType (i.e. EAPOL).
The pcap filter utilized by l2_packet is inadequate for this use case.
Here we modify the pcap filter to accept both unencapsulated and
encapsulated (with VLAN 0) EAPOL EtherTypes. This preserves the
original filter behavior while also matching on encapsulated EAPOL.
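The accept condition described above can be written as a plain C
predicate over a raw Ethernet frame. This is only an illustration, not
the pcap filter from this change; is_eapol_frame() and the macro names
are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETYPE_EAPOL	0x888e	/* 802.1X EAPOL */
#define ETYPE_VLAN	0x8100	/* 802.1Q tag */

/* Read a big-endian 16-bit field. */
static uint16_t
be16(const uint8_t *p)
{
	return ((uint16_t)(p[0] << 8 | p[1]));
}

/*
 * Accept untagged EAPOL, or EAPOL behind a priority tag (VID 0).
 * The 802.1P priority bits are deliberately not examined.
 */
static bool
is_eapol_frame(const uint8_t *frame, size_t len)
{
	uint16_t type;

	assert(frame != NULL);
	if (len < 14)			/* dst + src + EtherType */
		return (false);
	type = be16(frame + 12);
	if (type == ETYPE_EAPOL)
		return (true);
	if (type != ETYPE_VLAN || len < 18)
		return (false);
	/* The low 12 bits of the TCI hold the VLAN ID. */
	if ((be16(frame + 14) & 0x0fff) != 0)
		return (false);
	return (be16(frame + 16) == ETYPE_EAPOL);
}
```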
Bjoern A. Zeeb [Sat, 10 Jun 2023 22:56:03 +0000 (22:56 +0000)]
LinuxKPI: 802.11: correct HE_MAC_CAP3 values
While we had assigned dummy values so far to HE, correct the HW_MAC_CAP3
values to avoid compile time errors of drivers when shifting values out
of range.
Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Bjoern A. Zeeb [Sat, 10 Jun 2023 22:52:04 +0000 (22:52 +0000)]
LinuxKPI: add dummy of.h
Given https://reviews.freebsd.org/D34318 was abandoned, add an empty
of.h dummy header file to at least avoid #include errors and avoid
covering those #include with CONFIG_OF.
Bjoern A. Zeeb [Sat, 10 Jun 2023 21:53:56 +0000 (21:53 +0000)]
LinuxKPI: 802.11: improve scan handling
Under certain circumstances a hw_scan may be downgraded to a software
scan. Handle these situations better and make sure we free resources
in all cases once. [1]
Also leave a note about scanning all bands (or we would have to switch
bands manually).
In both cases hardware doing and driver saying seem not entirely
consistent for all and all firmware.
Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Reported by: imp [1]
George Amanakis [Sat, 10 Jun 2023 00:05:47 +0000 (02:05 +0200)]
Fix the L2ARC write size calculating logic (2)
While commit bcd5321 adjusts the write size based on the size of the log
block, this happens after comparing the unadjusted write size to the
evicted (target) size.
In this case l2ad_hand will exceed l2ad_evict and violate an assertion
at the end of l2arc_write_buffers().
Fix this by adding the max log block size to the allocated size of the
buffer to be committed before comparing the result to the target
size.
Also reset the l2arc_trim_ahead ZFS module variable when the adjusted
write size exceeds the size of the L2ARC device.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #14936
Closes #14954
kern_ntptime: Fix undefined behavior of the shift operator
When the L_LINT macro is used with negative numbers [e.g.
L_LINT(time_freq, -MAXFREQ)], it can cause undefined
behavior. It should be similar to the L_RSHIFT(v, n) macro.
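Left-shifting a negative signed value is undefined in C. A minimal
sketch of the shift-the-magnitude approach follows; it is not the
kernel macro. lint_to_fixed() is hypothetical and the 32-bit shift
width is an assumption made for illustration:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an integer to a 64-bit fixed-point value with 32 fraction
 * bits without ever left-shifting a negative number.
 */
static int64_t
lint_to_fixed(int32_t a)
{
	/* The magnitude of INT32_MIN would overflow after the shift. */
	assert(a > INT32_MIN);
	if (a < 0)
		return (-((int64_t)(-(int64_t)a) << 32));
	return ((int64_t)a << 32);
}
```

Negative inputs are negated, shifted as a positive value, and negated
again, so the shift operand is never negative.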
Brad Smith [Fri, 9 Jun 2023 20:01:35 +0000 (22:01 +0200)]
msun: Correct FreeBSD version in sincos() man page
The sincos() man page notes the function was added to msun in FreeBSD
9.0, which must have been an oversight in the review, as it was committed
to 12.0 and then backported to the 11 branch.
So I have provided a diff to correct this to the first FreeBSD version
it shipped with, which was 11.2.
Alexander Motin [Fri, 9 Jun 2023 19:40:55 +0000 (15:40 -0400)]
Finally drop long disabled vdev cache.
It was a vdev level read cache, designed to aggregate many small
reads by speculatively issuing bigger reads instead and caching
the result. But since it has almost no idea about what is going
on, with the exception of the ZIO_FLAG_DONT_CACHE flag set by higher layers,
it was found to do more harm than good, for which reason it has been
disabled for the past 12 years. These days we have much better
instruments to enlarge the I/Os, such as speculative and prescient
prefetches, I/O scheduler, I/O aggregation etc.
Besides removing dead code, this removes one extra mutex
lock/unlock per write inside vdev_cache_write(), which was not otherwise
disabled and was still trying to do some work.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #14953
Alexander Motin [Fri, 9 Jun 2023 17:14:05 +0000 (13:14 -0400)]
Improve l2arc reporting in arc_summary.
- Do not report L2ARC as FAULTED in presence of in-flight writes.
- Report read and write I/Os, bytes and errors.
- Remove a few numbers not important to the average user.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #12304
Closes #14946
Alexander Motin [Fri, 9 Jun 2023 17:12:52 +0000 (13:12 -0400)]
Use list_remove_head() where possible.
... instead of list_head() + list_remove(). On FreeBSD the list
functions are not inlined, so in addition to more compact code
this also saves another function call.
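The before/after pattern can be sketched with a tiny stand-in list.
The mini_list_* functions are hypothetical and only mimic the shape of
list_head(), list_remove() and list_remove_head(); they are not the
ZFS list implementation:

```c
#include <assert.h>
#include <stddef.h>

struct node {
	struct node *next;
	int value;
};

struct mini_list {
	struct node *head;
};

static struct node *
mini_list_head(const struct mini_list *l)
{
	return (l->head);
}

static void
mini_list_remove(struct mini_list *l, struct node *n)
{
	struct node **pp;

	for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			return;
		}
	}
}

/* Unlink and return the first element, or NULL if the list is empty. */
static struct node *
mini_list_remove_head(struct mini_list *l)
{
	struct node *n;

	assert(l != NULL);
	n = l->head;
	if (n != NULL) {
		l->head = n->next;
		n->next = NULL;
	}
	return (n);
}

/* Before: two calls per element. */
static int
drain_two_calls(struct mini_list *l)
{
	struct node *n;
	int sum = 0;

	while ((n = mini_list_head(l)) != NULL) {
		mini_list_remove(l, n);
		sum += n->value;
	}
	return (sum);
}

/* After: one call per element. */
static int
drain_one_call(struct mini_list *l)
{
	struct node *n;
	int sum = 0;

	while ((n = mini_list_remove_head(l)) != NULL)
		sum += n->value;
	return (sum);
}
```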
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #14955
We are not allowed to access lwb after setting LWB_STATE_FLUSH_DONE
state and dropping zl_lock, since it may be freed by zil_sync().
To free itxs and waiters after dropping the lock we need to move
lwb_itxs and lwb_waiters lists elements to local storage.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #14957
Closes #14959
Andrew Turner [Fri, 9 Jun 2023 08:36:12 +0000 (09:36 +0100)]
Add more arm64 ID registers to the user_regs array
This is a mapping from ID register value to offset in struct cpu_desc.
These registers may be needed with future architecture revisions either
by userspace or by bhyve.
Ed Maste [Fri, 9 Jun 2023 13:53:08 +0000 (09:53 -0400)]
Cirrus-CI: split main script into separate world + kernel
It appears that Cirrus-CI has a 100MB limit for log output, and we
exceed that (!) with the amd64-gcc12 build. Separate world and kernel
build tasks in an attempt to stay below the limit.
This also has the benefit of showing world and kernel build status
separately in the Cirrus-CI UI.
veriexec: Do not save error from file info in fingerprint status
We do not want or need to propagate the error from fetching file info
when determining the file status. It could cause open(2) and similar
calls to fail when trying to access devices.
Randall Stewart [Fri, 9 Jun 2023 14:27:08 +0000 (10:27 -0400)]
tcp: Rack fixes and misc updates
So over the past few weeks we have found several bugs and updated hybrid pacing to have
more data in the low-level logging. We have also moved more of the BBlogs to "verbose" mode
so that we don't generate a lot of the debug data unless you put verbose/debug on.
There were a couple of notable bugs, one being the incorrect passing of percentage
for reduction to timely and the other the incorrect use of 20% timely Beta instead of
80%. This also expands a simple idea to be able to pace a cwnd (fillcw) as an alternate
pacing mechanism combining that with timely reduction/increase.
Xin LI [Fri, 9 Jun 2023 01:38:47 +0000 (18:38 -0700)]
hexdump: Partial lines cannot be repetitions of earlier lines.
When checking for repetitions of earlier lines, we compare the
first nread bytes of the line against the saved line. However,
when we read a partial line, it should never be treated as a
repetition of an earlier line, even if the first bytes match.
This change fixes a bug where a partial line could be
incorrectly identified as a repetition of an earlier line.
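A minimal sketch of the rule, assuming a fixed block size per output
line. is_repeat() and its parameters are hypothetical and do not
mirror the hexdump source:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Decide whether the current line repeats the saved one.
 * nread is the number of bytes actually read for this line.
 */
static bool
is_repeat(const unsigned char *cur, size_t nread,
    const unsigned char *saved, size_t blocksize)
{
	assert(nread <= blocksize);
	/* A partial line is never a repetition, even if its bytes match. */
	if (nread != blocksize)
		return (false);
	return (memcmp(cur, saved, blocksize) == 0);
}
```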
Gleb Smirnoff [Thu, 8 Jun 2023 18:14:45 +0000 (11:14 -0700)]
stand/boot1.efi: use the bootonce dataset as root dataset
Before this change we would only pass the bootonce dataset name
to the environment for the next loader, while actually reading
the next stage loader from the 'bootfs' dataset, not the bootonce
dataset.
Another problem fixed by this change is booting from a configuration
where the bootonce attribute is present but the 'bootfs' property is not set.
Gleb Smirnoff [Thu, 8 Jun 2023 18:14:45 +0000 (11:14 -0700)]
stand/loader.efi: read zfs bootonce attribute before checking currdev
First check if bootonce is configured and if it is, then change currdev
accordingly and after that do the sanity check. This fixes boot in the
situation where the ZFS pool doesn't have the "bootfs" property but has
the bootonce attribute set. A strange, but legitimate case.
Jessica Clarke [Thu, 8 Jun 2023 16:00:50 +0000 (17:00 +0100)]
etc: Don't create stray directories in NO_ROOT distrib-dirs
The loop above is responsible for creating the actual directories,
whilst this one is just responsible for creating the corresponding
METALOG. Since DESTDIR already includes DISTBASE, this results in
creating a second set of roots (one per MTREES entry) within DISTBASE
whenever DISTBASE is non-empty, such as base/base, base/base/var,
base/base/usr, etc. in the distributeworld case. This is purely cosmetic
though as they won't appear in the METALOG.
Jessica Clarke [Thu, 8 Jun 2023 17:35:23 +0000 (18:35 +0100)]
Makefile.inc1: Fix distributeworld mtree mangling for dist root dir
The trailing slash means that ./base itself doesn't get mangled and
remains as-is in the output, leading to a stray /base in base.txz for
NO_ROOT builds and thus in the installed system. Since this action is
running on a line whose file matches one listed by find (and we're
printing all of these as part of that distribution), we don't need to
care about the possibility of a path like ./basefoo/bar where the path
prefix isn't ./base, and can thus just drop the slash rather than
needing something more complicated like "slash or whitespace or EOL" as
one might first think.
Jessica Clarke [Thu, 8 Jun 2023 16:47:04 +0000 (17:47 +0100)]
Makefile.inc1: Use INSTALL_DDIR for distributeworld's distrib-dirs
INSTALL_DDIR is the canonicalised version of DESTDIR/DISTDIR. Whilst
most of what distrib-dirs does doesn't need the canonicalised form, it
is responsible for installing the POSIX and en_US.US_ASCII NLS symlinks
to C, and therefore needs the canonicalised version for those two uses
of install for NO_ROOT builds, since our install does a naive text-based
prefix strip when creating the METALOG entry rather than a smarter path
semantics-aware one (which itself is really a bug, and has bitten us
many times). As a result, using plain DESTDIR/DISTDIR instead can result
in the METALOG having ./path/to/destdir/base/usr/share/nls/$LOCALE
rather than ./base/usr/share/nls/$LOCALE and then being filtered out
when creating base.meta (or, if you're unlucky and the absolute path
begins with base or tests, weird things will probably happen).
Given this footgun an audit of DESTDIR uses is probably in order,
especially those using DESTDIR/DISTDIR, but this is sufficient for now.
Bjoern A. Zeeb [Sat, 20 May 2023 00:53:21 +0000 (00:53 +0000)]
LinuxKPI: add devm_ioremap()
Given we do not seem to support ioremap(), do not support the "devm"
version either and simply return NULL, which means we do not have
to keep track of the memory to be freed on device free later.
Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D40173
Bjoern A. Zeeb [Sat, 20 May 2023 00:48:28 +0000 (00:48 +0000)]
LinuxKPI: pci: update struct msi_desc
It seems struct msi_desc is set up differently (or was changed) compared
to how we added it a while ago. Catch up in order to keep drivers
directly accessing fields compiling.
Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D40175
Andrew Turner [Thu, 1 Jun 2023 17:25:37 +0000 (18:25 +0100)]
arm64: Reduce the direct use of cpu_desc
To help move to a dynamically allocated cpu_desc array, reduce the
places we use it directly and create a pointer that is passed in to
functions that read it.