Mark Johnston [Fri, 7 May 2021 18:27:58 +0000 (14:27 -0400)]
divert: Fix mbuf ownership confusion in div_output()
div_output_outbound() and div_output_inbound() relied on the caller to
free the mbuf if an error occurred. However, this is contrary to the
semantics of their callees, ip_output(), ip6_output() and
netisr_queue_src(), which always consume the mbuf. So, if one of these
functions returned an error, that would get propagated up to
div_output(), resulting in a double free.
Fix the problem by making div_output_outbound() and div_output_inbound()
responsible for freeing the mbuf in all cases.
Reported by: Michael Schmiedgen <schmiedgen@gmx.net>
Tested by: Michael Schmiedgen
Reviewed by: donner
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30129
sbin/ipfw: Fix null pointer deference when printing counters
ipfw -[tT] prints statistics of the last access. If the rule was never
used, the counter might be not exist. This happens unconditionally on
inserting a new rule. Avoid printing statistics in this case.
Rick Macklem [Sun, 25 Apr 2021 19:52:48 +0000 (12:52 -0700)]
nfscl: fix delegation recall when the file is not open
Without this patch, if a NFSv4 server recalled a
delegation when the file is not open, the renew
thread would block in the NFS VOP_INACTIVE()
trying to acquire the client state lock that it
already holds.
This patch fixes the problem by delaying the
vrele() call until after the client state
lock is released.
This bug has been in the NFSv4 client for
a long time, but since it only affects
delegation when recalled due to another
client opening the file, it got missed
during previous testing.
Until you have this patch in your client,
you should avoid the use of delegations.
If we reassemble a packet we modify the IP header (to set the length and
remove the fragment offset information), but we failed to update the
checksum. On certain setups (mostly where we did not re-fragment again
afterwards) this could lead to us sending out packets with incorrect
checksums.
struct pf_rule had a few counter_u64_t counters. Those couldn't be
usefully comminicated with userspace, so the fields were doubled up in
uint64_t u_* versions.
Now that we use struct pfctl_rule (i.e. a fully userspace version) we
can safely change the structure and remove this wart.
Stop using the kernel's struct pf_rule, switch to libpfctl's pfctl_rule.
Now that we use nvlists to communicate with the kernel these structures
can be fully decoupled.
pf: Move prototypes for userspace functions to userspace header
These functions no longer exist in the kernel, so there's no reason to
keep the prototypes in a kernel header. Move them to pfctl where they're
actually implemented.
Kristof Provost [Fri, 26 Mar 2021 10:22:15 +0000 (11:22 +0100)]
pfctl: Use the new DIOCGETRULENV ioctl
Create wrapper functions to handle the parsing of the nvlist and move
that code into pfctl_ioctl.c.
At some point this should be moved into a libpfctl.
opal_console: fix serial console output corruption on powerpc64
Adds OPAL_CONSOLE_WRITE error handling and implements a call to
OPAL_CONSOLE_WRITE_BUFFER_SPACE to verify if there's enough space
before writing to console.
This fixes serial port output getting corrupted on fast writes, like
on "dmesg" output.
Tested on Raptor Blackbird running powerpc64 BE kernel
Zhenlei Huang [Mon, 3 May 2021 16:46:19 +0000 (12:46 -0400)]
traceroute6: Properly calculate UDP checksum
The revision D25604 capsicumize traceroute6. For UDP the send socket was
changed from SOCK_DGRAM to SOCK_RAW and thus the UDP checksum need be
calculated by application itself other than the kernel.
outpacket is filled with zeros by line 707, thus the first round the UDP
checksum is correct. But subsequent rounds outudp->uh_sum will be left
with garbage.
Alexander Motin [Tue, 6 Apr 2021 21:27:16 +0000 (17:27 -0400)]
Introduce "soft" serseq variant.
With new ZFS prefetcher improvements it is no longer needed to fully
serialize reads to reach decent prediction hit rate. Softer variant
only creates small time window to reduce races instead of completely
blocking following reads while previous is running. It much less
hurts the performance in case of prediction miss.
Mark Johnston [Wed, 28 Apr 2021 14:42:59 +0000 (10:42 -0400)]
pipe: Avoid calling selrecord() on a closing pipe
pipe_poll() may add the calling thread to the selinfo lists of both ends
of a pipe. It is ok to do this for the local end, since we know we hold
a reference on the file and so the local end is not closed. It is not
ok to do this for the remote end, which may already be closed and have
called seldrain(). In this scenario, when the polling thread wakes up,
it may end up referencing a freed selinfo.
pkg(7): replace usage of sbuf(9) with open_memstream(3)
open_memstream(3) is a standard way to obtain the same feature we do get
by using sbuf(9) (aka dynamic size buffer), switching to using it makes
pkg(7) more portable, and reduces its number of dependencies.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D30005
pkg(7): when bootstrapping first search for pkg.pkg file then pkg.txz
The package extension is going to be changed to .pkg to be among other
things resilient to the change of compression format used and reduce
the impact of all third party tool of that change.
Ensure the bootstrap knows about it
Reviewed by: manu
Differential revision: https://reviews.freebsd.org/D29232
Moritz Schmitt [Tue, 27 Apr 2021 01:59:12 +0000 (03:59 +0200)]
Make pkg(7) use environment variables specified in pkg.conf
Modify /usr/sbin/pkg to use environment variables specified in pkg.conf.
This allows control over underlying libraries like fetch(3), which can
be configured by setting HTTP_PROXY.
Alexander Motin [Mon, 5 Apr 2021 14:34:40 +0000 (10:34 -0400)]
Set PCIe device's Max_Payload_Size to match PCIe root's.
Usually on boot the MPS is already configured by BIOS. But we've
found that on hot-plug it is not true at least for our Supermicro
X11 boards. As result, mismatch between root's configuration of
256 bytes and device's default of 128 bytes cause problems for some
devices, while others seem to work fine.
33cb3cb2e321 introduced an `rib_head` structure field under the
FIB_ALGO define. This may be problematic for the CTF, as some
of the files including `route_var.h` do not have `fib_algo`
defined.
Make dtrace happy by making the field unconditional.
Add rib_walk_from() wrapper for selective rib tree traversal.
Provide wrapper for the rnh_walktree_from() rib callback.
As currently `struct rib_head` is considered internal to the
routing subsystem, this wrapper is necessary to maintain isolation
from the external code.
[fib algo] Delay algo init at fib growth to to allow to reliably use rib KPI.
Currently, most of the rib(9) KPI does not use rnh pointers, using
fibnum and family parameters to determine the rib pointer instead.
This works well except for the case when we initialize new rib pointers
during fib growth.
In that case, there is no mapping between fib/family and the new rib,
as an entirely new rib pointer array is populated.
Address this by delaying fib algo initialization till after switching
to the new pointer array and updating the number of fibs.
Set datapath pointer to the dummy function, so the potential callers
won't crash the kernel in the brief moment when the rib exists, but
no fib algo is attached.
This change allows to avoid creating duplicates of existing rib functions,
with altered signature.
Traditionally we had 2 sources of information whether the
added/delete route request targets network or a host route:
netmask (RTA_NETMASK) and RTF_HOST flag.
The former one is tricky: netmask can be empty or can explicitly
specify the host netmask. Parsing netmask sockaddr requires per-family
parsing and that's what rtsock code traditionally avoided. As a result,
consistency was not enforced and it was possible to specify network with
the RTF_HOST flag and vice versa.
Continue normalization efforts from D29826 and D29826 and ensure that
RTF_HOST flag always reflects host/network data from netmask field.
Differential Revision: https://reviews.freebsd.org/D29958
MFC after: 2 days
Rick Macklem [Tue, 20 Apr 2021 00:51:07 +0000 (17:51 -0700)]
nfsd: fix stripe size reply for the File Layout pNFS server
At a recent testing event I found out that I had misinterpreted
RFC5661 where it describes the stripe size in the File Layout's
nfl_util field. This patch fixes the pNFS File Layout server
so that it returns the correct value to the NFSv4.1/4.2 pNFS
enabled client.
This affects almost no one, since pNFS server configurations
are rare and the extant pNFS aware NFS clients seemed to
function correctly despite the erroneous stripe size.
It *might* be needed for correct behaviour if a recent
Linux client mounts a FreeBSD pNFS server configuration
that is using File Layout (non-mirrored configuration).
Mark Johnston [Mon, 26 Apr 2021 18:53:16 +0000 (14:53 -0400)]
imgact_elf: Ensure that the return value in parse_notes is initialized
parse_notes relies on the caller-supplied callback to initialize "res".
Two callbacks are used in practice, brandnote_cb and note_fctl_cb, and
the latter fails to initialize res. Fix it.
In the worst case, the bug would cause the inner loop of check_note to
examine more program headers than necessary, and the note header usually
comes last anyway.
Reviewed by: kib
Reported by: KMSAN
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29986