Gordon Bergling [Tue, 12 Mar 2024 14:44:48 +0000 (15:44 +0100)]
md5.1: Fix the GNU mode example when using a digest file
The last example in the manpage md5(1) wants to demonstrate
GNU mode (md5sum), but uses BSD mode (md5) instead.
In GNU mode, the -c option does not compare against a hash string
passed as a parameter. Instead, it expects a digest file,
as created under the name digest for /boot/loader.conf in
the example above.
PR: 276560
Reviewed by: mhorne, des
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D44098
Randall Stewart [Tue, 12 Mar 2024 11:55:02 +0000 (07:55 -0400)]
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes
and improvements have been added. I also add in a fix to BBR to deal with
the changes that have been in hpts for a while i.e. only one call no matter
if mbuf queue or tcp_output.
It basically does little except BBlogs and is a placemark for future work on
doing path capacity measurements.
With a bit of a struggle with git I finally got rack_pcm.c into place (apologies
for not noticing this error). The LINT kernel is running on my box now .. sigh.
Reviewed by: tuexen, glebius
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D43986
Sumit Saxena [Tue, 12 Mar 2024 06:51:09 +0000 (06:51 +0000)]
mrsas: don't reference the removed physical disk of RAID1 during IO submission
When a physical disk (PD) belonging to a RAID1 virtual disk (VD) is
removed, the driver may still use the reference to the removed PD while
submitting IO to the controller. The controller firmware faults upon
receipt of such IO.
This patch fixes the issue by not using any reference to the removed PD.
Warner Losh [Tue, 12 Mar 2024 04:19:05 +0000 (22:19 -0600)]
timezone: Move to the XSI/POSIX definition for timezone.
The old timezone(3) function has long since been obsolete and has a
fatally flawed interface. Retain this function for compatibility
purposes, but shift to providing the offset from UTC in the 'timezone'
variable and whether or not the timezone observes summer time in the
'daylight' variable. Document the tzname variable that's already been
set. Also make _tztab() static. It's not used in libc (or anywhere in
the tree) and it's not exported as a public dynamic symbol.
Warner Losh [Mon, 11 Mar 2024 20:15:44 +0000 (14:15 -0600)]
kboot: kbootfdt: fix error handling
If we are able to open /sys/firmware/fdt, but aren't able to read it,
fall back to /proc/device-tree. Remove comment that's not really true,
it turns out.
Warner Losh [Mon, 11 Mar 2024 20:15:34 +0000 (14:15 -0600)]
kboot: Print UEFI memory map
If we can read the UEFI memory map, go ahead and print the memory map.
While the kernel prints this with bootverbose, having it at this stage
is useful for debugging other problems.
Warner Losh [Mon, 11 Mar 2024 20:15:24 +0000 (14:15 -0600)]
kboot: hostfs -- check for llseek failure correctly
The host_* syscalls are all raw Linux system calls, not the POSIX
wrappers that glibc / musl create. So we have to range check the
return value of host_llseek correctly to use the negative value hack
that all Linux system calls use.
This fixes a false positive error detection when we do something like
lseek(fd, 0xf1234567, ...); This returns 0xf1234567, which is a negative
value which used to trigger the error path. Instead, we check using the
is_linux_error() and store the return value in a long. Translate that
errno to a host errno and set the global errno to that and return
-1. lseek can't otherwise return a negative number, since it's the
offset after seeking into the file, which by definition is positive.
This kept the 'read the UEFI memory map out of physical memory' from
working on aarch64 (whose boot loader falls back to reading it since
there are restrictive kernel options that can also prevent it), since
the physical address the memory map was at on my platform was like
0xfa008018.
Warner Losh [Mon, 11 Mar 2024 20:15:10 +0000 (14:15 -0600)]
kboot: Avoid UB in signed shift
offset is signed. Copy it to the unsigned res before shifting. This
avoids any possible undefined behavior for right shifting signed
numbers. No functional change intended (and the code generated is
nearly the same for aarch64).
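A minimal sketch of the pattern described above, assuming the shift is used to split a 64-bit file offset into two 32-bit halves. The commit text does not say what the shift is for, so split_offset() and its parameters are hypothetical names, not the kboot code:

```c
#include <stdint.h>

/*
 * Hypothetical helper: split a signed 64-bit offset into high and low
 * 32-bit halves without ever shifting a signed value.
 */
static void
split_offset(int64_t offset, uint32_t *high, uint32_t *low)
{
	/* Converting to unsigned is well defined (reduction modulo 2^64). */
	uint64_t res = (uint64_t)offset;

	/* Shifting and masking the unsigned copy is fully defined. */
	*high = (uint32_t)(res >> 32);
	*low = (uint32_t)(res & 0xffffffffu);
}
```

Doing the conversion first means the result no longer depends on how the compiler treats shifts of negative values.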
Warner Losh [Mon, 11 Mar 2024 20:15:03 +0000 (14:15 -0600)]
kboot: Create function for error checking.
Linux has the convention of returning -ERRNO to flag errors from its
system calls. Sometimes other negative values are returned that are
success... However, only values -1 to -4096 (inclusive) are really
errors. The rest are either truncated values that only look negative (so
use long instead of int), or are things like addresses or legal unsigned
file offsets or similar that are successful returns. Filter out the
latter.
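The convention described above can be sketched roughly as follows. This is an illustration only: the names sketch_is_linux_error() and sketch_syscall_result() are hypothetical, the lower bound is taken from the range stated in the commit text, and no Linux-to-host errno translation is attempted:

```c
#include <errno.h>
#include <stdbool.h>

/* Lower bound of the error range as stated in the commit text. */
#define SKETCH_ERR_FLOOR	(-4096L)

/* True only for raw return values inside the -ERRNO range. */
static bool
sketch_is_linux_error(long rv)
{
	return (rv < 0 && rv >= SKETCH_ERR_FLOOR);
}

/*
 * Turn a raw system call return value into the familiar POSIX shape:
 * -1 with errno set on error, the value itself otherwise.
 */
static long
sketch_syscall_result(long rv)
{
	if (sketch_is_linux_error(rv)) {
		errno = (int)-rv;	/* assumption: errno values used as-is */
		return (-1);
	}
	return (rv);
}
```

Keeping the raw value in a long matters here: a large but valid result such as a file offset stays outside the error range instead of appearing negative after truncation.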
Randall Stewart [Mon, 11 Mar 2024 11:36:54 +0000 (07:36 -0400)]
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes
and improvements have been added. I also add in a fix to BBR to deal with
the changes that have been in hpts for a while i.e. only one call no matter
if mbuf queue or tcp_output.
Note there is a new file that I can't figure out how to get in rack_pcm.c
It basically does little except BBlogs and is a placemark for future work on
doing path capacity measurements.
Reviewed by: tuexen, glebius
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D43986
unionfs: accommodate underlying FS calls that may re-lock
Since non-doomed unionfs vnodes always share their primary lock with
either the lower or upper vnode, any forwarded call to the base FS
which transiently drops that upper or lower vnode lock may result in
the unionfs vnode becoming completely unlocked during that transient
window. The unionfs vnode may then become doomed by a concurrent
forced unmount, which can lead to either or both of the following:
--Complete loss of the unionfs lock: in the process of being
doomed, the unionfs vnode switches back to the default vnode lock,
so even if the base FS VOP reacquires the upper/lower vnode lock,
that no longer translates into the unionfs vnode being relocked.
This will then violate that caller's locking assumptions as well
as various assertions that are enabled with DEBUG_VFS_LOCKS.
--Complete loss of reference on the upper/lower vnode: the caller
normally holds a reference on the unionfs vnode, while the unionfs
vnode in turn holds references on the upper/lower vnodes. But in
the course of being doomed, the unionfs vnode will drop the latter
set of references, which can effectively lead to the base FS VOP
executing with no references at all on its vnode, violating the
assumption that vnodes can't be recycled during these calls and
(if lucky) violating various assertions in the base FS.
Fix this by adding two new functions, unionfs_forward_vop_start_pair()
and unionfs_forward_vop_finish_pair(), which are intended to bookend
any forwarded VOP which may transiently unlock the relevant vnode(s).
These functions are currently only applied to VOPs that modify file
state (and require vnode reference and lock state to be identical at
call entry and exit), as the common reason for transiently dropping
locks is to update filesystem metadata.
uipc_bindat(): Explicitly specify exclusive locking for the new vnode
When calling VOP_CREATE(), uipc_bindat() reuses the componentname
object from the preceding lookup operation, which is likely to specify
LK_SHARED. Furthermore, the VOP_CREATE() interface technically only
requires the newly-created vnode to be returned with a shared lock.
However, the socket layer requires the new vnode to be locked exclusive
and asserts to that effect.
In most cases, this is not a practical concern because most if not
all base-layer filesystems (certainly FFS, ZFS, and msdosfs at least)
always return the vnode locked exclusive regardless of the lock flags.
However, it is an issue for unionfs which uses cn_lkflags to determine
how the new unionfs wrapper vnode should be locked. While it would
be easy enough to work around this issue within unionfs itself, it
seems better for the socket layer to be explicit about its locking
requirements when issuing VOP_CREATE().
vn_lock_pair(): allow lkflags1/lkflags2 to be 0 if vp1/vp2 is NULL
It's a bit strange to require the caller to pass contrived lock flags
if the corresponding vnode is NULL, simply to appease the assertion
that exactly one of LK_SHARED or LK_EXCLUSIVE must be set. On the
other hand, we still want to catch cases in which completely bogus
or corrupt flags are passed even if the corresponding vnode is NULL.
Therefore, specifically allow empty flags for lkflags1/lkflags2 iff
the respective vp1/vp2 param is NULL.
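The relaxed rule can be sketched as a small predicate. This is not the kernel assertion itself: SK_LK_SHARED and SK_LK_EXCLUSIVE are hypothetical stand-ins for the real LK_* values, and sk_lkflags_ok() is a made-up name:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for illustration; the kernel's LK_* differ. */
#define SK_LK_SHARED	0x1
#define SK_LK_EXCLUSIVE	0x2

/* Check one (vnode, lock flags) argument pair. */
static bool
sk_lkflags_ok(const void *vp, int lkflags)
{
	/* Empty flags are tolerated only when there is no vnode to lock. */
	if (lkflags == 0)
		return (vp == NULL);
	/* Otherwise exactly one of shared/exclusive, and nothing else. */
	return (lkflags == SK_LK_SHARED || lkflags == SK_LK_EXCLUSIVE);
}
```

With this shape, a NULL vnode accepts either empty or well-formed flags, while bogus flag values are still rejected whether or not the vnode is NULL.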
Kyle Evans [Sat, 9 Mar 2024 02:01:17 +0000 (20:01 -0600)]
crunchgen: slap a dependency on the generated makefile for .lo
crunchgen generates a foo.lo for each binary it will end up crunching
into the final product. While they have a dependency on the libs that
are used to link them, nothing will force relinking if the set of libs
needed to link them is changed. Because of this, incremental builds may
not be possible if one builds a version of, e.g., rescue/ with a broken
set of libs specified for a project -- a subsequent fix won't be rolled
in cleanly, it will require purging the rescue/ objdir.
This is a bit crude, but the foo.mk we generate doesn't actually get
regenerated all that often in practice, so a spurious relink for the
vast majority of crunched objects won't actually happen all that often.
Brooks Davis [Fri, 8 Mar 2024 19:14:24 +0000 (19:14 +0000)]
lib{c,sys}: fix incremental builds
I removed lib/libsys/{aarch64,arm,riscv}/syscall.S in favor of an
identical generated version. We need to clean out the .ddepend files to
ensure the generated version is actually generated.
The guard here is technically too strict, but should be fine in practice
and I've verified both the breakage and fix on an armv7 build.
Mitchell Horne [Fri, 8 Mar 2024 14:09:17 +0000 (10:09 -0400)]
simple_mfd: don't attach children twice
Trying to probe+attach the child device at the point it is added comes
before the syscon handle is set up (if relevant). It will therefore be
unavailable to the attach method which is expecting it, and the first
attempt to attach the device will fail.
Just rely on the call to bus_generic_attach() at the end of the function
to perform probe+attach of dev's children.
Reviewed by: manu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D44268
Mitchell Horne [Fri, 8 Mar 2024 14:09:08 +0000 (10:09 -0400)]
clkdom_dump(): improve output text
If the call to clknode_get_freq() returns an error (unlikely), report
this, rather than printing the error code as the clock frequency.
If the clock has no parent (e.g. a fixed reference clock), print "none"
rather than "(NULL)(-1)". This is a more human-legible presentation of the
same information.
Reviewed by: manu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D44267
Wei Hu [Fri, 8 Mar 2024 10:00:25 +0000 (10:00 +0000)]
Hyper-V: vPCI: fix cpu id mis-mapping in vmbus_pcib_map_msi()
The msi address contains apic id. The code in vmbus_pcib_map_msi()
treats it as cpu id, which could cause mis-configuration of msix
IRQs, leading to missing interrupts for SRIOV devices. This happens
when apic id is not the same as cpu id on certain large VM sizes
with multiple numa domains in Azure. Fix this issue by correctly
mapping apic ids to cpu ids.
vPCI versions before 1.4 support only up to 64 vcpus
for msi/msix interrupts. This change also adds a check and returns
an error if the vcpu_id is greater than 63.
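A rough sketch of the two ideas above, under loud assumptions: the table contents, the table layout (index is cpu id, value is apic id), and every sk_* name are hypothetical and do not reflect the actual driver code:

```c
#include <stdint.h>

#define SK_NCPU		4
#define SK_MAX_VCPU_ID	63	/* limit stated for vPCI before 1.4 */

/* Hypothetical table: index is the cpu id, value is that CPU's apic id. */
static const uint32_t sk_cpu_apic_id[SK_NCPU] = { 0, 2, 64, 66 };

/* Return the cpu id that owns apic_id, or -1 if no CPU has it. */
static int
sk_apic_id_to_cpu(uint32_t apic_id)
{
	for (int cpu = 0; cpu < SK_NCPU; cpu++) {
		if (sk_cpu_apic_id[cpu] == apic_id)
			return (cpu);
	}
	return (-1);
}

/* Return 0 if vcpu_id fits the 64-vcpu limit, -1 otherwise. */
static int
sk_check_vcpu_id(int vcpu_id)
{
	return ((vcpu_id < 0 || vcpu_id > SK_MAX_VCPU_ID) ? -1 : 0);
}
```

In this made-up table, apic id 64 belongs to cpu 2, so treating the apic id as a cpu id would target a CPU that does not exist.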
Michael Tuexen [Fri, 8 Mar 2024 09:03:43 +0000 (10:03 +0100)]
TCP LRO: disable mbuf queuing when packet filter hooks are in place
When doing mbuf queueing, the packet filter hooks in ether_demux(),
ip_input(), and ip6_input() are by-passed. This means that the packet
filters don't process incoming packets, which might result in
connection failures. For example bypassing the TCP sequence number
validation will result in dropping valid packets.
Please note that this patch is only disabling mbuf queueing, not LRO.
Reported by: Herbert J. Skuhra
Reviewed by: glebius, rrs, rscheff
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D43769
Warner Losh [Fri, 8 Mar 2024 05:40:43 +0000 (22:40 -0700)]
awk: Fix the tests
I'd forgotten that we have to adjust the stderr tests from
upstream. Remove the OK files. Also remove system-status.*. These
restore the fixes I made in 517e52b6c21c which were lost when I imported
the last version of awk.
Also, force LANG to be C.UTF-8 when testing to ensure that stray lang
settings don't fail tests.
Brooks Davis [Thu, 7 Mar 2024 20:09:00 +0000 (20:09 +0000)]
libc/quad: narrow list of symbols exposed on i386
These symbols aren't present on i386 so don't try to expose them.
Given the structure of quad/Makefile.inc, it might make more sense to
have per-arch symbol maps here, but this is sufficient to build with
WITHOUT_UNDEFINED_VERSION on i386.
Brooks Davis [Thu, 7 Mar 2024 20:08:38 +0000 (20:08 +0000)]
libc/iconv: don't export nonexistent symbols
It's unclear to me that any of these symbols ever existed. The ones
I've spot checked are only mentioned in the initial Citrus iconv import
(commit ad30f8e79bd1) and this code hasn't changed much over time.
if_bnxt: Set 1G/10G baseT force speed as auto speeds
The firmware lacks support for manually setting 1G and 10G baseT speeds.
However, the driver can enable auto speed masks to achieve automatic configuration
at these speeds.
if_bnxt: Implementation of Extended Port Hardware Stats Support for THOR Controller
The newly added port extended hardware statistics are now accessible to
users through the sysctl interface. Also, a few obsolete stats are removed
and a few stats are renamed.
if_bnxt: Fix media speed update issue in "ifconfig -m" during PHY hot plug
Currently, if a media type (e.g., DAC) is hot-plugged out and another type
(e.g., optical cable) is hot-plugged in, the new speed is not reflected in
ifconfig. This occurs when the driver fails to update speeds with unchanged
tx and rx flow control.
To fix this, a phy_type check ensures that phy speeds are updated when a
new phy is detected.
Warner Losh [Thu, 7 Mar 2024 01:22:34 +0000 (18:22 -0700)]
nvme: Log reset success or failure to devd
We're logging when we start a reset, but not when we complete it, nor
the result. Now log a success or timed_out event for the reset.
Currently, the only detectable error we have from reset is 'failure to
become ready in time,' though the code looks like it might be more
generic. Log this and if we ever have other failure modes, change the
logging to devd when that happens.
Warner Losh [Thu, 7 Mar 2024 01:22:13 +0000 (18:22 -0700)]
nvme: split devctl out to its own function
Split the devctl aspect of things out to its own function in
nvme_ctrlr_devctl_log. In preparing to document this, and based on
actual use, we want something different for the SMART errors, so this
will facilitate that.
Brooks Davis [Thu, 7 Mar 2024 01:01:36 +0000 (01:01 +0000)]
libsys: don't try to expose yield
The undocumented yield system call has never been implemented via libc
or libsys (except accidentally for <15 minutes in 1998 between commits
abd529cebab9 and 0db2fac06ab7). Avoid trying to export it now to avoid
failures when linking with --no-undefined-version.
Brooks Davis [Thu, 7 Mar 2024 00:59:07 +0000 (00:59 +0000)]
syscall(2): make i386 less of an outlier
Unlike other architectures, i386 only defined syscall() and not
_syscall() or __sys_syscall(). The syscall() function then invoked the
desired system call directly rather than invoking syscall(2). Keep the
latter as it's marginally more efficient, but also create the
conventional _syscall() and __sys_syscall() stubs.
This avoids the need to special case syscall(2) in the symbol list
generation in libsys.
Brooks Davis [Thu, 7 Mar 2024 00:55:11 +0000 (00:55 +0000)]
heimdal: don't try to expose nonexistent symbols
For one reason or another these symbols aren't present so don't try to
make them available for linkage.
In the case of libroken these seem to be compatibility bits we don't
need and thus don't compile. For others it seems to be rot upstream, but
I've not investigated deeply.
Brooks Davis [Thu, 7 Mar 2024 00:54:55 +0000 (00:54 +0000)]
heimdal: don't export nonexistent _wind_ucs2read
This symbol table entry came in with the 1.5 import (commit
7c450da7b446), but the only other mention is a commented out entry in
lib/wind/libwind-exports.def.
Remove the include that crept in by accident
Clang complains about CLOCK_BOOTTIME being the same for now as
CLOCK_UPTIME, so remove CLOCK_BOOTTIME and leave a comment for
what to do when CLOCK_BOOTTIME will be different for real.
The checksum code assumed that struct ustar_header filled an entire
block and calculated the checksum based on the size of the structure.
The header is in fact only 500 bytes long while the checksum covers
the entire block (“logical record” in POSIX terms). Add padding and
an assertion, and clean up the checksum code.
MFC after: 3 days
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44226
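To illustrate the checksum fix described above, here is a self-contained sketch. It is not the tarfs code: the sk_* names and the pad field are hypothetical, while the field layout follows the POSIX ustar header, whose fields total 500 bytes:

```c
#include <stddef.h>
#include <string.h>

#define SK_BLOCK_SIZE	512

/* POSIX ustar header fields (500 bytes) plus padding to a full block. */
struct sk_ustar_header {
	char name[100];
	char mode[8];
	char uid[8];
	char gid[8];
	char size[12];
	char mtime[12];
	char checksum[8];
	char typeflag[1];
	char linkname[100];
	char magic[6];
	char version[2];
	char uname[32];
	char gname[32];
	char devmajor[8];
	char devminor[8];
	char prefix[155];
	char pad[12];		/* hypothetical name for the added padding */
};

/* The structure must cover exactly one block for the sum to be right. */
_Static_assert(sizeof(struct sk_ustar_header) == SK_BLOCK_SIZE,
    "ustar header must fill one block");

/* Sum every byte of the block, counting the checksum field as spaces. */
static unsigned int
sk_ustar_checksum(const struct sk_ustar_header *hdr)
{
	const unsigned char *p = (const unsigned char *)hdr;
	const size_t start = offsetof(struct sk_ustar_header, checksum);
	const size_t end = start + sizeof(hdr->checksum);
	unsigned int sum = 0;

	for (size_t i = 0; i < SK_BLOCK_SIZE; i++)
		sum += (i >= start && i < end) ? (unsigned int)' ' : p[i];
	return (sum);
}
```

An all-zero block sums to 256 in this sketch, because the eight checksum bytes each count as a space (0x20). Bytes in the padding area contribute to the sum as well, which is the case a 500-byte structure would miss.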
* Reject hard or soft links with an empty target path. Currently, a
debugging kernel will hit an assertion in tarfs_lookup_path() while
a non-debugging kernel will happily create a link to the mount root.
* Use a temporary variable to store the result of the link target path,
and copy it to tnp->other only once we have found it to be valid.
Otherwise we error out after creating a reference to the target but
before incrementing the target's reference count, which results in a
use-after-free situation in the cleanup code.
* Correctly return ENOENT from tarfs_lookup_path() if the requested
path was not found and create_dirs is false. Luckily, existing
callers did not rely solely on the return value.
MFC after: 3 days
PR: 277360
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Reviewed by: sjg
Differential Revision: https://reviews.freebsd.org/D44161
Sumit Saxena [Fri, 23 Feb 2024 08:20:26 +0000 (08:20 +0000)]
if_bnxt: Correcting the firmware package version parsing logic
The firmware package version currently appears as "Unknown" through
the sysctl interface. The parsing logic for extracting the firmware
package version from the package log has been modified to ensure
compatibility with all controllers.
Alan Somers [Tue, 5 Mar 2024 17:55:55 +0000 (10:55 -0700)]
zfsd: Use vdev prop values for fault/degrade thresholds
ZED uses vdev props for setting disk fault/degrade thresholds; this
patch enables zfsd to use the same vdev props for these same tasks.
OpenZFS on Linux is using vdev props for ZED disk fault/degrade
thresholds. Originally the thresholds supported were for io and checksum
events and recently this was updated to process slow io events as
well, see
https://github.com/openzfs/zfs/commit/cbe882298e4ddc3917dfaf239eca475fe06d62d4
This patch enables us to use the same vdev props in zfsd as ZED uses.
After this patch is merged both OSs will use the same vdev props to set
retirement thresholds.
It's probably important to note that the threshold defaults are
different between the two OSes. I've kept the existing defaults inside
zfsd and DID NOT match them to what ZED does.
Eugene Grosbein [Tue, 5 Mar 2024 17:23:41 +0000 (00:23 +0700)]
diskinfo(8): introduce new option -l
In modes -p or -s, add an option -l to start each line
with a device name separated with a tab. Update the manual page.
Add an example to list names with corresponding serial numbers:
Kyle Evans [Tue, 5 Mar 2024 04:14:07 +0000 (22:14 -0600)]
ktrace: log genio events on failed write
Visibility into the contents of the buffer when a write(2) has failed
can be immensely useful in debugging IPC issues -- pushing this to
discuss the idea, or maybe an alternative where we can set a flag like
KTRFAC_ERRIO to enable it.
When a genio event is potentially raised after an error, currently we'll
just free the uio and return. However, such data can be useful when
debugging communication between processes to, e.g., understand what the
remote side should have grabbed before closing a pipe. Tap out the
entire buffer on failure rather than simply discarding it.