Michael Tuexen [Wed, 10 Jan 2024 07:33:09 +0000 (08:33 +0100)]
tcpsso: fix when used without -i option
Since fdb987bebddf it is not possible anymore to use inp_next
iterator for bound, but unconnected sockets. This applies
to TCP listening sockets. Therefore the mentioned commit broke
tcpsso on listening sockets if the -i option was not used.
Fix this by iterating through all endpoints instead of only
through the bound, but unconnected ones.
Marius Strobl [Tue, 9 Jan 2024 22:01:46 +0000 (23:01 +0100)]
igb(4): Remove disconnected SYSCTL
The global hw.igb.rx_process_limit knob never was adhered to by the
in-tree version of this driver but similar functionality is available
via the device-specific dev.igb.N.iflib.rx_budget.
While at it, remove the - besides initialization of tx_process_limit -
unused {r,t}x_process_limit members.
Andrew Gallatin [Tue, 9 Jan 2024 20:52:07 +0000 (15:52 -0500)]
apei: Mark ReadAckRegister resource as shareable
Work around vendors who use the same address for multiple
ReadAckRegisters in their ACPI HEST table. This
allows apei to attach cleanly on Ampere Altra servers.
Note the issue is not specific to Ampere, I've run into
it with at least one other vendor (whose server is not
yet released).
Gleb Smirnoff [Tue, 9 Jan 2024 21:01:28 +0000 (13:01 -0800)]
netlink: fix regression with group writers
Refactoring of the argument list of nl_send_one() led to dereferencing
the wrong union member. Rename nl_send_one() to a more generic name,
isolate a new nl_send_one() as the callback only for the normal
writer and provide the correct argument to nl_send() from nl_send_group().
John Baldwin [Tue, 9 Jan 2024 19:23:10 +0000 (11:23 -0800)]
acpi: Only reserve resources enumerated via _CRS
In particular, don't reserve resources added by drivers via other
means (e.g. acpi_bus_alloc_gas which calls bus_alloc_resource
right after adding the resource).
The intention of reserved resources is to ensure that a resource range
that a bus driver knows is assigned to a device is reserved by the
system even if no driver is attached to the device. This prevents
other "wildcard" resource requests from conflicting with these
resources. For ACPI, the only resources the bus driver knows about
for unattached devices are the resources returned from _CRS. All of
these resources are already reserved now via acpi_reserve_resources
called from acpi_probe_children.
As such, remove the logic from acpi_set_resource to try to reserve
resources when they are set. This permits RF_SHAREABLE to work with
acpi_bus_alloc_gas without requiring hacks like the current one for
CPU device resources in acpi_set_resource.
Reported by: gallatin (RF_SHAREABLE not working)
Diagnosed by: jrtc27
John Baldwin [Tue, 9 Jan 2024 19:05:03 +0000 (11:05 -0800)]
memdesc: Helper function to construct mbuf chain backed by memdesc buffer
memdesc_alloc_ext_mbufs constructs a chain of external (M_EXT or
M_EXTPG) mbufs backed by a data buffer described by a memory
descriptor.
Since memory descriptors are not an actual buffer, just a description
of a buffer, the caller is required to supply a couple of helper
routines to manage allocation of the raw mbufs and associating them
with a reference to the underlying buffer.
John Baldwin [Tue, 9 Jan 2024 18:57:48 +0000 (10:57 -0800)]
kldxref: Workaround incorrect PT_DYNAMIC in existing powerpc kernels
Existing powerpc kernels include additional sections beyond .dynamic
in the PT_DYNAMIC segment. Relax the requirement for an exact size
match of the section and segment for PowerPC files as a workaround.
gofaster [Tue, 9 Jan 2024 17:49:30 +0000 (12:49 -0500)]
Add Gotify notification support to ZED
This commit adds the zed_notify_gotify() function and hooks it
into zed_notify(). This will allow ZED to send notifications
to a self-hosted Gotify service, which can be received
on a desktop or mobile device. It is configured with ZED_GOTIFY_URL,
ZED_GOTIFY_APPTOKEN and ZED_GOTIFY_PRIORITY variables in zed.rc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: gofaster <felix.gofaster@gmail.com>
Closes #15693
Alexander Motin [Tue, 9 Jan 2024 17:48:40 +0000 (12:48 -0500)]
Fix livelist assertions for dedup and cloning
Two block pointers in livelist pointing to the same location may
be caused not only by dedup, but also by block cloning. We should
not assert D bit set in them.
Two block pointers in livelist pointing to the same location may
have different logical birth time in case of dedup or cloning. We
should assert identical physical birth time instead.
Assert identical physical block size between pointers in addition
to checksum, since that is what checksums are calculated on.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15732
Alexander Motin [Tue, 9 Jan 2024 17:46:43 +0000 (12:46 -0500)]
Improve block sizes checks during cloning
- Fail if source block is smaller than destination. We can only
grow blocks, not shrink them.
- Fail if we do not have full znode range lock. In that case grow
is not even called. We should improve zfs_rangelock_cb() somehow
to know when cloning needs to grow the block size unlike write.
- Fail if we tried to resize, but failed. There are many reasons
for it to fail that we can not predict at this level, so be ready
for them. Unlike write, that may proceed after growth failure,
block cloning can't and must return error.
This fixes assertion inside dmu_brt_clone() when it sees different
number of blocks held in destination than it got block pointers.
Builds without ZFS_DEBUG returned EXDEV, so are not affected much.
Reviewed-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15724
Closes #15735
Kent Ross [Tue, 9 Jan 2024 17:13:52 +0000 (09:13 -0800)]
make zdb_decompress_block check decompression reliably
This function decompresses to two buffers and then compares them to
check whether the (opaque) decompression process filled the whole
buffer. Previously it began with lbuf uninitialized and lbuf2 filled
with pseudorandom data. This neither guarantees that any bytes not
written by the compressor would be different, nor seems incredibly
sound otherwise!
After these changes, instead of filling one buffer with generated
pseudorandom data we overwrite each buffer with completely different
data. This should remove the possibility of low-probability failures,
as well as make the process simpler and cheaper.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Kent Ross <k@mad.cash>
Closes #15733
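The two-buffer check described in the zdb_decompress_block commit above can be sketched in isolation. This is a minimal illustration, not the zdb code: toy_decomp_fn, toy_fill_all, toy_fill_half and toy_filled_whole_buffer are hypothetical names, the "decompressors" are stand-ins that merely fill bytes, and the 0x00/0xff pre-fill patterns are an assumption chosen so that every byte differs between the two buffers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an opaque decompressor writing into dst. */
typedef void (*toy_decomp_fn)(unsigned char fill, unsigned char *dst,
    size_t dstlen);

/* Well-behaved stand-in: writes every byte of the destination. */
void
toy_fill_all(unsigned char fill, unsigned char *dst, size_t dstlen)
{
	memset(dst, fill, dstlen);
}

/* Misbehaving stand-in: leaves the second half of dst untouched. */
void
toy_fill_half(unsigned char fill, unsigned char *dst, size_t dstlen)
{
	memset(dst, fill, dstlen / 2);
}

/*
 * Pre-fill the two buffers with patterns that differ in every byte
 * (assumed here: 0x00 and 0xff), run the decompressor into both, and
 * compare. Any byte the decompressor did not write still differs, so
 * the comparison fails deterministically rather than probabilistically.
 * Returns 1 if the whole buffer was written, 0 otherwise.
 */
int
toy_filled_whole_buffer(toy_decomp_fn fn, unsigned char fill,
    unsigned char *buf1, unsigned char *buf2, size_t len)
{
	memset(buf1, 0x00, len);
	memset(buf2, 0xff, len);
	fn(fill, buf1, len);
	fn(fill, buf2, len);
	return (memcmp(buf1, buf2, len) == 0);
}
```

With an uninitialized first buffer, the untouched bytes could happen to match the second buffer; pre-filling both with complementary patterns removes that possibility.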
The standard is somewhat unclear, but on the balance, I believe that the
phrase “the rest of the input line” should be interpreted to mean the
rest of the input line including the terminating newline if and only if
there is one. This means the current implementation is incorrect on two
points:
- First, it suppresses the previous line's newline in the '1' case.
- Second, it unconditionally emits a newline at the end of the output
for non-empty input, even if the input did not end with a newline.
Resolve this by rewriting the main loop. Instead of special-casing the
first line and then assuming that every line ends with a newline, we
remember how each line ends and emit that either at the beginning of
the next line or at the end of the file except in the one case ('+')
where the standard explicitly says not to.
While here, try to reduce diff to upstream a little and update their
RCS tag to reflect the fact that while we've diverged significantly
from them, we've incorporated all their changes. Remove the useless
second RCS tag.
We also update the tests to account for the change in interpretation
of the '1' case and add a test case for unterminated input.
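The rewritten main loop described above can be sketched as a small string-to-string function. This is an illustration under assumptions, not the committed asa(1) code: asa_sketch and put are hypothetical names, form feed and carriage return are assumed as the page-advance and return-to-column-1 characters, an empty input line is treated as having a blank control character, and unknown control characters are treated like a space.

```c
#include <stddef.h>

/* Append one byte if room remains, always leaving space for the NUL. */
static void
put(char *out, size_t outsz, size_t *n, char c)
{
	if (*n + 1 < outsz)
		out[(*n)++] = c;
}

/*
 * Translate carriage-control input `in` into `out` (NUL-terminated).
 * The newline that ended the previous line is remembered in
 * pending_eol and emitted at the start of the next line, or at end of
 * input, except when the next line starts with '+', where it is
 * replaced by a carriage return. Returns the number of bytes written.
 */
size_t
asa_sketch(const char *in, char *out, size_t outsz)
{
	size_t n = 0;
	int pending_eol = 0;

	if (outsz == 0)
		return (0);
	while (*in != '\0') {
		char ctl = ' ';

		if (*in != '\n')
			ctl = *in++;	/* consume the control character */
		if (pending_eol)
			put(out, outsz, &n, ctl == '+' ? '\r' : '\n');
		if (ctl == '0')
			put(out, outsz, &n, '\n');	/* extra blank line */
		else if (ctl == '1')
			put(out, outsz, &n, '\f');	/* new page */
		while (*in != '\0' && *in != '\n')
			put(out, outsz, &n, *in++);
		pending_eol = (*in == '\n');
		if (pending_eol)
			in++;
	}
	if (pending_eol)
		put(out, outsz, &n, '\n');
	out[n] = '\0';
	return (n);
}
```

In this sketch the '1' case keeps the previous line's newline, and input that does not end with a newline produces output that does not end with one either.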
Olivier Certner [Thu, 4 Jan 2024 15:10:40 +0000 (16:10 +0100)]
libthr: thr_attr.c: More style and clarity fixes
The change of argument for sizeof() (from a type to an object) is to be
consistent with the change done for the malloc() code just above in the
preceding commit touching this file.
Consider bit flags as integers and test whether they are set with an
explicit comparison with 0.
Use an explicit flag value (PTHREAD_SCOPE_SYSTEM) in place of a variable
that has this value at point of substitution.
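The two style points above (sizeof applied to an object rather than a type, and flag tests written as an explicit comparison with 0) might look like the following. The structure, flag name and functions are hypothetical and are not taken from thr_attr.c.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_ATTR_DETACHED	0x1	/* hypothetical flag bit */

struct toy_attr {
	int	flags;
	size_t	stacksize;
};

/*
 * sizeof(*dst) follows the object, so the allocation and the copy stay
 * correct even if the type of dst changes later.
 */
struct toy_attr *
toy_attr_dup(const struct toy_attr *src)
{
	struct toy_attr *dst;

	dst = malloc(sizeof(*dst));
	if (dst == NULL)
		return (NULL);
	memcpy(dst, src, sizeof(*dst));
	return (dst);
}

/* Treat the flag word as an integer and compare the masked value with 0. */
int
toy_attr_is_detached(const struct toy_attr *attr)
{
	return ((attr->flags & TOY_ATTR_DETACHED) != 0);
}
```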
Kyle Evans [Tue, 9 Jan 2024 04:21:36 +0000 (22:21 -0600)]
build: only inspect the first word of toolchain tools
CC/CXX/CPP/LD may all have arguments supplied in various circumstances,
which break the logic here. We only need to determine which of these
tools we're expecting to invoke from PATH, which just requires
examination of the first word. Limit our scope to exactly that.
Kyle Evans [Tue, 9 Jan 2024 03:08:16 +0000 (21:08 -0600)]
bhyveload: add CAP_SEEK to our dirfd rights
In the case of hostbase_fd, this is in fact a bug fix; we have a seek
callback that the host: filesystem may use in loader, and we really
don't have a good excuse to break it.
bootfd-derived fds will only be used with fdlopen(3) and rtld doesn't
seem to need pread / lseek at all for it today, but there's no reason to
break if it finds a good reason to later.
Gleb Smirnoff [Tue, 9 Jan 2024 01:20:31 +0000 (17:20 -0800)]
sockets: on shutdown(2) do sorflush() only in case of generic sockbuf
This is a quick plug to fix panic with Netlink which has protocol specific
buffers. Note that PF_UNIX/SOCK_DGRAM, which also has its own buffers,
avoids the panic due to being SOCK_DGRAM. A correct but more complicated
fix that needs to be done is to merge pr_shutdown, pr_flush and dom_dispose
into one protocol method that may call sorflush for generic sockets or do
their own stuff for protocols which have their own buffers.
Alexander Motin [Tue, 9 Jan 2024 00:49:39 +0000 (19:49 -0500)]
ZIL: Update Linux tracing after #15635
While picking parts from #14909 I missed the Linux tracing specific
ones. That went unnoticed in default configurations, but breaks the
build in some.
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15730
Shengqi Chen [Tue, 9 Jan 2024 00:05:24 +0000 (08:05 +0800)]
Linux 6.2 compat: add check for kernel_neon_* availability
This patch adds a check for `kernel_neon_*` symbols on arm and arm64
platforms to address the following issues:
1. Linux 6.2+ on arm64 has exported them with `EXPORT_SYMBOL_GPL`, so
license compatibility must be checked before use.
2. On both arm and arm64, the definitions of these symbols are guarded
by `CONFIG_KERNEL_MODE_NEON`, but their declarations are still
present. Checking in configuration phase only leads to MODPOST
errors (undefined references).
tcp: prevent spurious empty segments and fix uncommon panic
Only try sending more data on pure ACKs when there is
more data available in the send buffer.
In the case of a retransmitted SYN not being sent due to
an internal error, the snd_una/snd_nxt accounting could
be off, leading to a panic. Pulling snd_nxt up to snd_una
prevents this from happening.
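The "pull snd_nxt up to snd_una" step can be sketched with wraparound-safe sequence arithmetic. This is a simplified illustration, not the committed change: struct toy_tcpcb and toy_pull_up_snd_nxt are hypothetical, and TOY_SEQ_LT is a local macro written in the usual style of TCP sequence-number comparison.

```c
#include <stdint.h>

/* Wraparound-safe "a is before b" for 32-bit sequence numbers. */
#define TOY_SEQ_LT(a, b)	((int32_t)((a) - (b)) < 0)

/* Hypothetical, stripped-down stand-in for the TCP control block. */
struct toy_tcpcb {
	uint32_t	snd_una;	/* oldest unacknowledged sequence number */
	uint32_t	snd_nxt;	/* next sequence number to send */
};

/*
 * If snd_nxt has fallen behind snd_una (for example because a segment
 * that was accounted for never actually went out), move it forward so
 * that the invariant snd_una <= snd_nxt holds again.
 */
void
toy_pull_up_snd_nxt(struct toy_tcpcb *tp)
{
	if (TOY_SEQ_LT(tp->snd_nxt, tp->snd_una))
		tp->snd_nxt = tp->snd_una;
}
```

The signed cast makes the comparison work across the 2^32 wrap, which a plain `<` on unsigned values would not.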
Kyle Evans [Mon, 8 Jan 2024 17:49:40 +0000 (11:49 -0600)]
bhyveload: make error printing consistent
Previously we used a mix of perror(3) + exit(3) and err(3); standardize
on the latter instead. This does remove one free() in an error path,
because we're decidedly leaking a lot more than just the loader name
there (loader handle, vcpu, vmctx...) anyways.
Bjoern A. Zeeb [Mon, 8 Jan 2024 15:29:09 +0000 (15:29 +0000)]
ath10k/rtw89: make compile again after LinuxKPI changes
Both drivers are not yet attached to the build so this change is
for people currently trying them out.
In 96ab16ebab6319dce9b3041961b0ab6e20a4fecc the sys/rman.h include
was removed. In various wireless drivers we prefer to directly use
bus_dma functions rather than io* LinuxKPI ones. In order to cast
the pointer we need sys/rman.h back for our native 'struct resource'
in their pci.c implementations.
Long-term we should consider providing some lkpi_-FreeBSD-specific
wrapper functions to avoid this problem.
Improve log messages to be more helpful in error cases.
Change one LinuxKPI sleep function as we cannot call the original
one from a context in which we cannot sleep.
Both cases were hit during testing.
Warner Losh [Sun, 7 Jan 2024 16:14:13 +0000 (09:14 -0700)]
checkstyle9: Remove irrelevant stuff from qemu
Remove some qemu project specific things we don't care about
o Remove python interpreter check
o Remove linux header check
o Remove trace file special treatment
o Add $FreeBSD$ tag additions
o Remove some experimental code we won't need
o Remove commented out initializer code that we don't explicitly have a
rule for.
Mark Johnston [Sun, 7 Jan 2024 16:35:06 +0000 (11:35 -0500)]
dtrace/profile: Set t_dtrace_trapframe for profile probes
profile provider probes fire in the context of a timer interrupt. Thus,
the "regs" action can make use of the interrupt trap frame to get
register values when the interrupt happened in kernel mode. Make that
trap frame available when possible so that "regs" works more or less as
it already does with the fbt and kinst providers.
Warner Losh [Sun, 7 Jan 2024 03:46:42 +0000 (20:46 -0700)]
style.yml: Don't run this on branch pushes
We don't need to run this on branch pushes, just pull requests. It's
designed to be a gross filter for incoming commits, not something
perfect we need to keep green. It also doesn't work quite right for
branch pushes anyway and needs adjustment.
Also remove some debugging information. We don't need it anymore.
Warner Losh [Sat, 6 Jan 2024 15:20:17 +0000 (08:20 -0700)]
Connect my checkstyle9.pl script to an action.
Connect the checkstyle9.pl script to a github action. This will provide
feedback to people submitting changes when the style is grossly wrong. And
can provide other automated feedback for the commit message in the future.
It already catches the github noreply author.
It pulls the full repo to do this. Optimizations welcome. After messing
with that for a few hours, I decided to punt and commit the slow, working
version and let someone else optimize from here.
Keeping the SACK scoreboard intact after the first RTO
and retransmitting all data anew only on subsequent RTOs
allows a more timely and efficient loss recovery under
many adverse circumstances.
Mike Karels [Fri, 5 Jan 2024 19:41:24 +0000 (13:41 -0600)]
arm64/RPI: enable powerd by default on arm64-aarch64-RPI images
Most 64-bit Raspberry Pi models have a variable processor clock
speed that defaults to a slow speed (e.g. 600 MHz for a nominal
1.5 GHz clock). This results in everything running slowly unless
or until powerd is started, and FreeBSD is then thought to be slow.
Enable powerd by default in /etc/rc.conf on the arm64-aarch64-RPI
images. Tested on Raspberry Pi 3B+ and 4B so far.
Kyle Evans [Fri, 5 Jan 2024 06:09:31 +0000 (00:09 -0600)]
kern: console: make /dev/console backing console more predictable
Specifically, altering the console list with conscontrol has some weird
behavior:
1. If you remove the first configured console, /dev/console will become
unconfigured
2. Any console added becomes the /dev/console
In a multicons situation, #1 is clearly a bug and #2 is perhaps slightly
less clear. If we have ttyu0, ttyv0, then it seems obvious that one
would want ttyv0 to take over the console if ttyu0 is removed. If we
add ttyu0 back in, then it's debatable whether it should take over the
console or not.
Fix it now to make the /dev/console selection more FIFO-ish, with
respect to how conscontrol affects it. A `primary` verb for
conscontrol(8) might be a good addition.
Kyle Evans [Fri, 5 Jan 2024 06:21:15 +0000 (00:21 -0600)]
bhyveload: support guest rebooting from the loader
userboot has an EXIT_REBOOT code that it uses when the 'reboot' loader
command is executed. Use that and longjmp back to reinit the VM
entirely with a reboot request. This fixes the 'reboot' option in the
loader menu to actually reboot rather than shut down the VM.
The JMP_* constants are introduced to keep track of why we're doing a
longjmp, though they aren't currently used. We'll notably still do a
complete reload of the interpreter to give the rebooted VM that new
loader smell. It just seemed forward thinking to just keep track of the
different setjmp points.
While we're here, we don't actually need to keep the fd we passed to
fdlopen(3), so let's avoid leaking it.
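The reboot flow described above (jump back to a setjmp point and redo initialization when the loader asks for a reboot) can be sketched in userland. Everything here is hypothetical: the TOY_* constants, toy_loader_exit and toy_run_until_boot only illustrate the control flow and are not bhyveload's actual identifiers or structure.

```c
#include <setjmp.h>

enum { TOY_EXIT_OK = 0, TOY_EXIT_REBOOT = 1 };	/* hypothetical exit codes */
enum { TOY_JMP_REBOOT = 1 };			/* hypothetical longjmp reason */

static jmp_buf toy_restart;

/* Stand-in for the loader's exit callback. */
static void
toy_loader_exit(int code)
{
	if (code == TOY_EXIT_REBOOT)
		longjmp(toy_restart, TOY_JMP_REBOOT);
}

/*
 * Simulate a guest that requests `reboots` reboots from the loader
 * before finally booting. Returns how many times initialization ran.
 * Locals modified after setjmp() are volatile so that their values are
 * well defined after the longjmp().
 */
int
toy_run_until_boot(int reboots)
{
	volatile int remaining = reboots;
	volatile int inits = 0;

	switch (setjmp(toy_restart)) {
	case 0:			/* first pass */
	case TOY_JMP_REBOOT:	/* came back for a reboot */
	default:
		break;
	}
	inits++;		/* (re)initialize the VM and loader here */
	if (remaining > 0) {
		remaining--;
		toy_loader_exit(TOY_EXIT_REBOOT);
	}
	return (inits);
}
```

Recording the reason as the longjmp value costs nothing even when, as in this sketch, every reason takes the same path.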
Kyle Evans [Fri, 5 Jan 2024 06:21:14 +0000 (00:21 -0600)]
bhyveload: limit rights on the dirfds we create
In neither case do we need write access to the directories we're working
with; userboot doesn't support fo_write on the host device, and the
bootfd is only ever needed for loader loading.
This improves on 8bf0882e18 ("bhyveload: enter capability mode [...]")
so that arbitrary code in the loader can't open writable fds to either
of the directories we need to maintain access to.
nfsclient: limit situations when we do unlocked read-ahead by nfsiod
If there were or are writeable mappings, read-ahead might overwrite the
dirty pages data that is not yet reflected as a delayed write in the
matching buffer state.
Noted by: rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Revision e99215a614675 reorganized the code in vtruncbuf(), and moved
the logic to flush meta buffers into a dedicated loop. While doing it,
the condition was changed from bp->b_lblkno < 0 (to handle) into
bp->b_lblkno > 0 (to skip), which causes buffer at lblkno to needlessly
flush.
Reviewed by: chs, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43261
before ncl_flush() when done to ensure that the server sees our cached
data, because it potentially changes the server response. This is
relevant for copy_file_range(), seek(), and allocate().
Convert LK_SHARED invp lock into LK_EXCLUSIVE if needed to properly call
vm_object_page_clean().
Reported by: asomers
PR: 276002
Noted and reviewed by: rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43250
Ed Maste [Fri, 5 Jan 2024 03:16:30 +0000 (22:16 -0500)]
ssh: Update to OpenSSH 9.6p1
From the release notes,
> This release contains a number of security fixes, some small features
> and bugfixes.
The most significant change in 9.6p1 is a set of fixes for a newly-
discovered weakness in the SSH transport protocol. The fix was already
merged into FreeBSD and released as FreeBSD-SA-23:19.openssh.
Full release notes at https://www.openssh.com/txt/release-9.6
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Lexi Winter [Thu, 4 Jan 2024 22:34:58 +0000 (22:34 +0000)]
mail: add volatile in grabh()
setjmp() requires that any stack variables modified between the setjmp
call and the longjmp() must be volatile. This means that 'saveint' in
grabh() must be volatile, since it's modified after the setjmp().
Otherwise, the signal handler is not properly restored, resulting in a
crash (SIGBUS) if ^C is typed twice while composing.
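A minimal sketch of the rule behind the grabh() fix: a local that is modified after setjmp() and read after longjmp() needs to be volatile, otherwise its value is indeterminate. The function names are hypothetical, and an ordinary function call stands in for the signal handler that performs the jump in mail(1).

```c
#include <setjmp.h>

static jmp_buf toy_env;

/* Stand-in for the signal handler that jumps back to the setjmp() point. */
static void
toy_bail_out(void)
{
	longjmp(toy_env, 1);
}

/*
 * Returns the value of `saved` as seen after the longjmp(). Because
 * `saved` is volatile, the store made after setjmp() is guaranteed to
 * be visible; without volatile the compiler may keep it in a register
 * whose contents are rolled back by longjmp().
 */
int
toy_value_after_longjmp(void)
{
	volatile int saved = 0;

	if (setjmp(toy_env) != 0)
		return (saved);		/* second return, via longjmp() */

	saved = 42;			/* modified after setjmp() */
	toy_bail_out();
	return (-1);			/* not reached */
}
```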
Kristof Provost [Thu, 4 Jan 2024 12:45:56 +0000 (13:45 +0100)]
libpfctl: introduce a handle-enabled variant of pfctl_add_rule()
Introduce pfctl_add_rule_h(), which takes a pfctl_handle rather than a
file descriptor (which it didn't use). This means that library users can
open the handle while they're running as root, but later drop privileges
and still add rules to pf.
Kristof Provost [Thu, 4 Jan 2024 09:50:14 +0000 (10:50 +0100)]
libpfctl: introduce pfctl_handle
Consumers of libpfctl can (and in future, should) open a handle. This
handle is an opaque object which contains the /dev/pf file descriptor
and a netlink handle. This means that libpfctl users can open the handle
as root, then drop privileges and still access pf.
Already add the handle to pfctl_startstop() and pfctl_get_creatorids()
as these are new in main, and not present on stable branches. Other
calls will have handle-enabled alternatives implemented in subsequent
commits.
Kristof Provost [Tue, 2 Jan 2024 14:52:39 +0000 (15:52 +0100)]
pflog: pass the action to pflog directly
If a packet is malformed, it is dropped by pf(4). The rule referenced
in pflog(4) is the default rule. As the default rule is a pass
rule, tcpdump printed "pass" although the packet was actually
dropped. Use the actual action, rather than the rule's action, or an
attempt at guessing the correct action.
Inspired by OpenBSD's 'pflog(4) logs packet dropped by default rule with block.' commit.
Kristof Provost [Tue, 2 Jan 2024 13:54:06 +0000 (14:54 +0100)]
pf: don't clobber log flag
If we decide to discard a packet due to unexpected IP options or
unsupported headers we set pd.act.log. However, this can later get
overwritten when we copy the state's saved actions over.
Merge the two log fields to ensure we log as expected.
Mark Johnston [Thu, 4 Jan 2024 17:02:04 +0000 (12:02 -0500)]
systm: Annotate copyin() and related functions with __result_use_check
Now that all in-tree callers check for errors (or cast them away), we
can ask the compiler to check that new code does the same. This was
prompted by SA-23:18.nfsclient, which was caused by missing error
handling. This change is a weak mitigation since code can easily fail
to propagate error handling to the right place, but it's better than
nothing.
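The effect of such an annotation can be sketched in userland. This assumes a GCC- or Clang-compatible compiler and uses the warn_unused_result attribute directly; TOY_RESULT_USE_CHECK, toy_copyin and toy_fetch_int are hypothetical stand-ins, not the kernel's definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumed spelling for GCC/Clang; expands to nothing elsewhere. */
#if defined(__GNUC__)
#define TOY_RESULT_USE_CHECK	__attribute__((__warn_unused_result__))
#else
#define TOY_RESULT_USE_CHECK
#endif

/* Hypothetical userland stand-in for copyin(): fails on a NULL pointer. */
TOY_RESULT_USE_CHECK int
toy_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (uaddr == NULL || kaddr == NULL)
		return (EFAULT);
	memcpy(kaddr, uaddr, len);
	return (0);
}

/*
 * A caller that propagates the error. Calling toy_copyin() as a bare
 * statement here would draw a compiler warning instead of silently
 * continuing with an unfilled *out.
 */
int
toy_fetch_int(const void *uaddr, int *out)
{
	int error;

	error = toy_copyin(uaddr, out, sizeof(*out));
	if (error != 0)
		return (error);
	return (0);
}
```

As the commit message notes, this only catches results that are dropped outright; a caller can still store the error and then fail to act on it.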
Mark Johnston [Thu, 4 Jan 2024 13:34:31 +0000 (08:34 -0500)]
targ: Handle errors from suword()
In targstart() we are already handling an error and have no good way to
signal the failure to upper layers, so ignore the return value of
suword() there.
This is in preparation for annotating copyin() and related functions
with __result_use_check.