Chuck Silvers [Wed, 3 May 2023 20:21:19 +0000 (13:21 -0700)]
fsck_ffs: fix the previous change that skipped pass 5 in some cases
The previous change involved calling check_cgmagic() twice in a row
for the same CG in order to differentiate when the CG was already ok vs.
when the CG was rebuilt, but that doesn't work because the second call
(which was supposed to rebuild the CG) returns 0 (indicating that
the CG was not rebuilt) due to the prevfailcg check causing an early
failure return. Fix this by moving the rebuild part of check_cgmagic()
out into a separate function which is called by pass1() when it wants to
rebuild a CG.
Mark Johnston [Wed, 3 May 2023 15:56:46 +0000 (11:56 -0400)]
unix: Fix locking in uipc_peeraddr()
After the locking protocol changed in commit 75a67bf3d00d ("AF_UNIX:
make unix socket locking finer grained"), uipc_peeraddr() was not
updated accordingly.
The link lock now only protects global socket lists. The PCB lock is
used to protect the link between connected PCBs, so use that. Remove an
old comment which appears to be noting that unp_conn is not set for
connected SOCK_DGRAM sockets (in one direction anyway).
Ed Maste [Tue, 2 May 2023 17:27:18 +0000 (13:27 -0400)]
src.conf: add WITH_TOOLCHAIN description
It is not used by the in-tree default configuration, but adding it
allows downstream projects with different defaults to make use of
Cirrus-CI (as the makeman test added in 85e8c2a03490 requires that
there are no missing option descriptions).
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39936
#11680 Add support for zpool user properties
#14145 Storage device expansion "silently" fails on degraded vdev
#14405 Create zap for root vdev
#14659 Allow MMP to bypass waiting for other threads
#14674 Miscellaneous FreBSD compilation bugfixes
#14692 Fix some signedness issues in arc_evict()
#14702 Fix typo in check_clones()
#14715 module: small fixes for FreeBSD/aarch64
#14716 Trim needless zeroes from checksum events
#14719 vdev: expose zfs_vdev_max_ms_shift as a module parameter
#14722 Fix "Detach spare vdev in case if resilvering does not happen"
#14723 freebsd clone range fixes
#14728 Fix BLAKE3 aarch64 assembly for FreeBSD and macOS
#14735 Fix in check_filesystem()
#14739 Fix data corruption when cloning embedded blocks
#14758 Fix VERIFY(!zil_replaying(zilog, tx)) panic
#14761 Revert "ZFS_IOC_COUNT_FILLED does unnecessary txg_wait_synced()"
#14774 FreeBSD .zfs fixups
#14776 FreeBSD: make zfs_vfs_held() definition consistent with declaration
#14779 powerpc64: Support ELFv2 asm on Big Endian
#14788 FreeBSD: add missing vop_fplookup assignments
#14789 PAM: support the authentication facility
#14790 Revert "Fix data race between zil_commit() and zil_suspend()"
#14795 Fix positive ABD size assertion in abd_verify()
#14798 Mark TX_COMMIT transaction with TXG_NOTHROTTLE
#14804 Correct ABD size for split block ZIOs
#14806 Use correct block pointer in block cloning case.
#14808 blake3: fix up bogus checksums in face of cpu migration
Functions manipulating source nodes can fail due to various reasons like
memory allocation errors, hitting configured limits or lack of
redirection targets. Ensure those errors are properly caught and
propagated in the code. Increase the error counters not only when
parsing the main ruleset but the NAT ruleset too.
Ed Maste [Tue, 2 May 2023 19:57:20 +0000 (15:57 -0400)]
rtld: don't add extraneous -L directory when MK_TOOLCHAIN == no
rtld's Makefile used to add -L${LIBDIR} to LDFLAGS when MK_TOOLCHAIN was
no. This was done as part of a change to fix building rtld with
MK_TOOLCHAIN == no (although I'm not sure this part was necessary).
In any case as of 5f2e84015da7 libc_pic.a is built independent of the
MK_TOOLCHAIN setting and the main part of the workaround has already
been removed. Remove the rest now.
IfAPI: Add if_maddr_empty() to check for any maddrs
if_llmaddr_count() only counts link-level multicast addresses.
hv_netvsc(4) needs to know if there are any multicast addresses. Since
hv_netvsc(4) is the only instance where this would be used, make it a
simple boolean. If others need a if_maddr_count(), that can be added in
the future.
We always build the kernel floating point support. Now that the
riscv64sf userspace variant has been removed the option is required for
correct operation.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39851
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #14806
Kristof Provost [Tue, 2 May 2023 08:45:01 +0000 (10:45 +0200)]
rtsol: introduce an 'always' script
In addition to the 'M' and 'O' scripts (for when 'Managed' and 'Other'
flags are set) also introduce an 'always' script that is called for any
router advertisement (so even if M and O are not set).
This is primarly useful for systems like pfSense that wish to be
informed of routers for further system configuration.
Intel DMAR: remove parsing of 6-level paging capability
Early versions of the VT-d spec mentioned 6-level paging support as a
possible value for the SAGAW capability, but later versions removed it
and SAGAW=0x10 is currently listed as a reserved value.
The 6-level (agaw=64) entry in sagaw_bits is furthermore problematic
with clang15 because the attempted comparison against 1ULL << 64 in
dmar_maxaddr2mgaw() causes the compiler to elide the last iteration
of the initial loop, which bypasses the subsequent logic to find the
greatest HW-supported address width. This results in 5-level paging
always being selected regardless of whether the hardware supports it,
which can result address translation failure due to invalid context-
entry programming.
Reviewed by: kib
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39896
This change eliminates the struct pmap_pcid array embedded into struct
pmap and sized by MAXCPU, which would bloat with MAXCPU increase. Also
it removes false sharing of cache lines, since the array elements are
mostly locally accessed by corresponding CPUs.
Suggested by: mjg
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D39890
Ed Maste [Mon, 1 May 2023 20:58:42 +0000 (16:58 -0400)]
src.opts.mk: Decouple MK_INCLUDES from MK_TOOLCHAIN
Prior to 590461a4b89b installation of include files was controlled
directly by ${MK_TOOLCHAIN}. 590461a4b89b added an INCLUDES knob
defaulting to YES. Setting WITHOUT_TOOLCHAIN forced it off to retain
existing behaviour.
Decouple them now, as there are reasonable use cases for installing
libraries and include files without a compiler or other tool chain
components.
Reviewed by: imp, jrtc27
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39918
Currently when layering the ABD buffer of each split block on top of
an indirect vdev's ZIO ABD we don't specify the split block's ABD.
This results in those ABDs being incorrectly sized by inheriting
the size of their parent ABD which is larger than what each split
block needs.
The above behavior isn't causing any bugs currently but can lead
to unexpected ABD sizes for people analyzing and/or working on
the ZIO codepath. This patch fixes this behavior by properly setting
the ABD size for split block ZIOs.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #14804
Warner Losh [Mon, 1 May 2023 21:12:41 +0000 (15:12 -0600)]
stand: More protection against malformed smbios tables
Add some more sanity checks to make sure we don't march off the end of
the table. Typically, smbios structures are well formed, or Windows
wouldn't boot. Sometimes they aren't, and this at least fails safe.
Warner Losh [Mon, 1 May 2023 21:12:34 +0000 (15:12 -0600)]
stand: Avoid unaligned access in smbios code
This code was written on x86 where unaligned accesses were
easy. LinuxBoot running on aarch64 uses mmap of /dev/mem to read the
smbios table. Linux's mapping of this memory doesn't allow the normal
unaligned fixup, so we get a bus error instead. We can't use the more
natural le16dec and friends because they optimize into a single,
unaligned memory load. We don't see this issue on aarch64 UEFI because
memory is mapped such that unaligned accesses are fixed up.
Warner Losh [Mon, 1 May 2023 21:12:29 +0000 (15:12 -0600)]
kboot: Add smbios support
Add support for getting smbios from /sys/firmware/efi/systab, if
any. Add ptov mapping that uses mmap on /dev/mem to do the mapping with
64k pages (usually we only need 1 or two mappings).
Warner Losh [Mon, 1 May 2023 16:07:00 +0000 (10:07 -0600)]
stand: back out the most of the horrible aarch64 kludge
Add one ifdef to upstrem code and get rid of compiling the horrible
checked-in aarch64 assembler for the boot loader that the loader will
never use. I'll attempt to upstream this and adjust as needed.
Warner Losh [Mon, 1 May 2023 15:28:17 +0000 (09:28 -0600)]
stand/userboot: Simplify code
We have way more than 8k of stack for the current value of the zfs
bootonce attribute. Allocate buf on the stack rather than the
complicated malloc / free dance.
Warner Losh [Mon, 1 May 2023 15:27:14 +0000 (09:27 -0600)]
stand/boot1.efi: Implement bootonce for ZFS
Implement ZFS bootonce protocol. We pass zfs-bootonce=t to the next boot
stage as a command line argument. Unlike zfsboot -> loader handoff in
the BIOS case, we don't use the OS_BOOTONCE_USED. This would require
modifications to loader.efi which would only server to make it more
complicated. Instead, use the command line parsing interface for the
boot1.efi -> loader.efi to pass in the zfs-bootonce kenv that will be
needed by rc.d/zfsbe to activate the BE if boot progresses that far.
Warner Losh [Mon, 1 May 2023 15:26:51 +0000 (09:26 -0600)]
stand/zfs: Refactor zfs_get_bootenv
Create a new interface to zfs_get_bootenv called zfs_get_bootenv_spa
which takes a spa instead of a void * (effectively a devdesc *). Use
that in zfs_get_bootenv.
Warner Losh [Mon, 1 May 2023 15:26:44 +0000 (09:26 -0600)]
stand/zfs: Move spa_find_by_dev from zfsimpl.c to zfs.c
zfsimpl.c doesn't know about devdesc at all, but zfs.c does. Move it to
zfs.c, which is the only user. Keep it static for now, but it could be
exposed later if something else were to need it.
Warner Losh [Mon, 1 May 2023 15:26:31 +0000 (09:26 -0600)]
stand/boot1.efi: Allow modules to add env variables
Sometimes filesystem modules need to pass details of the state of the
filesystem to later stages of a boot. Provide a generic method to do
so. We'll add them after any env variables set in our config files.
Ed Maste [Mon, 1 May 2023 20:33:47 +0000 (16:33 -0400)]
bsd.lib.mk: decouple lib*_pic.a from TOOLCHAIN build knob
A user may use a tool chain from a package or just use an existing
tool chain from a previous installation. There is no reason for this
to disable the installation of lib${LIB}_pic.a.
This also means we don't need to force MK_TOOLCHAIN=yes in lib/libc.
Ed Maste [Mon, 1 May 2023 17:25:18 +0000 (13:25 -0400)]
pkgbase: hide duplicate METALOG directory warnings under verbose
Creating directories multiple times is an inherent side effect of the
way installation is done. Hide warnings from duplicate directory
entries (with identical metadata) under metalog_reader's verbose mode.
Duplicate file entries are always reported. They currently generate
warnings but will be switched to errors once the few instances currently
in the tree are fixed.
PR: 244596, 271178
Reviewed by: kevans
Sponsored By: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39898
pf: reduce number of hashing operations when handling source nodes
Reduce number of hashing operations when handling source nodes by always
having a pointer to the hash row mutex in the source node. Provide
macros for handling and asserting the mutex. Calculate the hash only
once in pf_find_src_node() and then use this hash in subsequent
operations.
This script's goal is to check the system for peripherals that needs
firmware and install the needed packages for them.
For now it only support pci subsystem and only video classes for AMD
and Intel GPUs.
TI support is in a sad state for years.
We haven't been able to keep up with all the breaking changes that
upstream do in the DTS. This requires a lot of new drivers to handle the
new buses that they create and all the new clocks that they expose.
Keep the code for now in case somebody is interested in reviving this
platform but stop bloating GENERIC with code that don't work.
Reviewed by: imp, mmel
MFC after: never
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39843
listen(2): improve administrator control over logging
As documented in listen.2 manual page, the kernel emits a LOG_DEBUG
syslog message if a socket listen queue overflows. For some appliances,
it may be desirable to change the priority to some higher value
like LOG_INFO while keeping other debugging suppressed.
OTOH there are cases when such overflows are normal and expected.
Then it may be desirable to suppress overflow logging altogether,
so that dmesg buffer is not flooded over long run.
In addition to existing sysctl kern.ipc.sooverinterval,
introduce new sysctl kern.ipc.sooverprio that defaults to 7 (LOG_DEBUG)
to preserve current behavior. It may be changed to any value
in a range of 0..7 for corresponding priority or to -1 to suppress logging.
Document it in the listen.2 manual page.
Stefan Eßer [Sun, 30 Apr 2023 17:53:26 +0000 (19:53 +0200)]
contrib/bc: import version 6.5.0
This release that fixes an infinite loop bug in the (non-standard)
extended math library functions root() and cbrt(), fixes a bug with
BC_LINE_LENGTH=0, and adds the fib() function to the extended math
library to calculate Fibonacci numbers.
There were two issues with the carp key configuration in the new netlink
code.
The first is that userspace failed to actually pass the CARP_NL_KEY
attribute to the kernel, so a key was never set.
The second issue is that snl_attr_get_string() returns a pointer to the
string inside the netlink message. It does not copy the string to the
target buffer. That's somewhat inconvenient to work with in libifconfig
where we have a static buffer for the key.
Introduce snl_attr_copy_string() which can copy a string to a target
buffer and uses the 'arg' parameter to pass the buffer size, so it
doesn't accidentally exceed the available space.
Michael Tuexen [Sun, 30 Apr 2023 09:39:32 +0000 (11:39 +0200)]
sctp: improve handling of stale cookie error causes
* If a measure of staleness of 0 is reported, use the RTT instead.
* Ensure that we always send a cookie preservative parameter by
rounding up during the calculation.
* If allowed, perform a round trip time measurement.
* Clear the overall error counter, since the error cause also
acts like an ACK.
I find it very annoying that there is no FreeBSD infrastructure to
determine failures across architectures other than to check in
changes and then have Jenkins find them.
A check in the superblock validity code verifies that the computed
size of the filesystem cylinder groups (CGSIZE macro) does not
exceed the filesystem block size (fs_bsize).
A report was received that a filesystem had been flagged as failing
this check. We were unable to determine how the reported filesystem
could have been created. This commit adds a check at the end of the
newfs(8) command to verify that the the cylinder group size is valid.
If an oversize cylinder group is found newfs(8) prints a diagnostic
output and rebuilds the filesystem to make it compiliant.
Updates to UFS/FFS superblock integrity checks when reading a superblock.
Check for an uninitialed (zero valued) fs_maxbsize and set it
to its minimum valid size (fs_bsize). Uninitialed fs_maxbsize
were left by older versions of makefs(8) and the superblock
integrity checks fail when they are found.
No legitimate superblocks should fail as a result of these changes.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Provide an additional line of output for the superblock giving the
computed size of the cylinder group (CGSIZE macro) along with the
details needed to calculate it.
Other virtual interface drivers (e.g. if_gif, if_stf, if_ovpn) all start
with if_. The wireguard file is also named if_wg, but the module name
was 'wg'.
Ravi Pokala [Thu, 27 Apr 2023 05:03:48 +0000 (22:03 -0700)]
jedec_dimm(4): Refactor offset adjustment and page0 reset
Offsets greater than 255 bytes reside on page1 of the SPD device.
Accessing them requires switching to page1, and adjusting the absolute
offset to be relative to the start of page1. After the access, the page
must be set back to page0. These operations are performed in several
places, so break them out into their own functions.
Also, replace a pair of default cases, which should be impossible due to
earlier checks, with __assert_unreachable().