instead of manual zeroing of the debug registers file in pcb.
This centralizes the cleaning code, but the practical difference is
that PCB_DBREGS flag is cleared, saving some operations on context
switching.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D29687
Warner Losh [Fri, 9 Apr 2021 22:35:17 +0000 (16:35 -0600)]
efivar: Attempt to fix setting/printing/deleting EFI vars with '-' in their name
Due to how we're parsing UUIDs, we were disallowing setting, printing or
deleting any UEFI variable with a '-' in it when you attempted to do that
operation with the exact name (wildcard reporting was unaffected). Fix the
parser to loop over all the dashes in the name and only give up when all
possible matches are exhausted.
tcp_hostcache: implement tcp_hc_updatemtu() via tcp_hc_update.
Locking changes are planned here, and without this change too
much copy-and-paste would be between these two functions.
This fixes a regression in d36d6816151705907393889, where the call to
__tls_get_address() was performed under rtld_bind_lock write-locked.
Instead use tls_get_addr_slow() directly, with locked = true.
Reported by: jkim, many others
Tested by: jkim, bdragon (powerpc), mhorne (riscv)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D29623
The intent is to better handle time intervals with large amount of RIB
updates (e.g. BGP peer going up or down), while still keeping low sync
delay for the rest scenarios.
The implementation is the following: updates are bucketed into the
buckets of size 50ms. If the number of updates within a current bucket
exceeds the threshold of 500 routes/sec (e.g. 10 updates per bucket
interval), the update is delayed for another 50ms. This can be repeated
until the maximum update delay (1 sec) is reached.
Gordon Bergling [Fri, 9 Apr 2021 15:28:18 +0000 (17:28 +0200)]
sysctl.conf(5): Mention sysctl.conf.local in the sysctl.conf(5) manual page
The possibility of using a sysctl.conf.local on a machine that has a shared
sysctl.conf(5) isn't documented. So mention the sysctl.conf.local in the
manual page.
PR: 254901
Submitted by: Jose Luis Duran <jlduran at gmail dot com>
Reported by: Jose Luis Duran <jlduran at gmail dot com>
Reviewed by: markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29673
Wojciech Macek [Fri, 9 Apr 2021 07:28:44 +0000 (09:28 +0200)]
pci_dw: Trim ATU windows bigger than 4GB
The size of the ATU MEM/IO windows is implicitly casted to uint32_t.
Because of that some window sizes were silently demoted to 0 and ignored.
Check the size if its too large, trim it to 4GB and print a warning message.
Rick Macklem [Thu, 8 Apr 2021 21:04:22 +0000 (14:04 -0700)]
nfsd: fix replies from session cache for retried RPCs
Recent testing of network partitioning a FreeBSD NFSv4.1
server from a Linux NFSv4.1 client identified problems
with both the FreeBSD server and Linux client.
The FreeBSD server failec to reply using the cached
reply in the session slot when an RPC was retried on
the session slot, as indicated by same slot sequence#.
This patch fixes this. It should also fix a similar
failure for NFSv4.0 mounts, when the sequence# in
the open/lock_owner requires a reply be done from
an entry locked into the DRC.
This fix affects the fairly rare case where a NFSv4
client retries a non-idempotent RPC, such as a lock
operation. Note that retries only occur after the
client has needed to create a new TCP connection.
Gleb Smirnoff [Mon, 22 Mar 2021 20:51:42 +0000 (13:51 -0700)]
tcp_hostcache: add bool argument for tcp_hc_lookup() to tell are we
looking to only read from the result, or to update it as well.
For now doesn't affect locking, but allows to push stats and expire
update into single place.
tcp: Prepare PRR to work with NewReno LossRecovery
Add proper PRR vnet declarations for consistency.
Also add pointer to tcpopt struct to tcp_do_prr_ack, in preparation
for it to deal with non-SACK window reduction (after loss).
Avoid -pedantic warnings about using _Generic in __fp_type_select
When compiling parts of math.h with clang using a C standard before C11,
and using -pedantic, it will result in warnings similar to:
bug254714.c:5:11: warning: '_Generic' is a C11 extension [-Wc11-extensions]
return !isfinite(1.0);
^
/usr/include/math.h:111:21: note: expanded from macro 'isfinite'
^
/usr/include/math.h:82:39: note: expanded from macro '__fp_type_select'
^
This is because the block that enables use of _Generic is conditional
not only on C11, but also on whether the compiler advertises support for
C generic selections via __has_extension(c_generic_selections).
To work around the warning without having to pessimize the code, use the
__extension__ keyword, which is supported by both clang and gcc. While
here, remove the check for __clang__, as _Generic has been supported for
a long time by gcc too now.
bhyve: fix regression in legacy virtio-9p config parsing
Commit 621b5090487de9fed1b503769702a9a2a27cc7bb introduced a regression
in legacy virtio-9p config parsing by not initializing *sharename to
NULL. As a result, "sharename != NULL" check in the first iteration fails
and bhyve exits with "virtio-9p: more than one share name given".
Andrew Turner [Thu, 1 Apr 2021 14:38:09 +0000 (14:38 +0000)]
arm64: Fix finding the pmc event ID
The lower pmc event bits were masked off to find the PMC event ID.
The doesn't work when there are more events. Switch it to use the
offser relative to the first event while also checking the ID is
in the expected range.
Reviewed by: gnn, ray
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D29600
Andrew Turner [Tue, 23 Mar 2021 18:23:47 +0000 (18:23 +0000)]
Discard the arm64 VFP state before resetting it
When resetting the VFP state we need to discard any old state so we don't
try to save it on a context switch. Move this first so resetting the pcb
is safe to perform outside a critical section.
Reviewed by: arichardson
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D29401
Ryan Libby [Wed, 7 Apr 2021 19:39:05 +0000 (12:39 -0700)]
shared shadow vm object invalidation regression test
Add a regression test for a scenario where a shadow vm object is shared
by multiple mappings. If a page COW occurs through one of the mappings,
then the virtual-to-physical mapping may become invalidated.
In situations when the current file name wasn't the first element on
the list we were cleaning the current name too early.
This might cause us to pre-cache the same file twice.
Yongbo Yao [Wed, 7 Apr 2021 18:33:22 +0000 (13:33 -0500)]
Loader: support booting OS from memory disk (MD)
Until now, the boot image can be embedded into the loader with
/sys/tools/embed_mfs.sh, and memory disk (MD) is already supported
in loader source. But due to memory disk (MD) driver isn't registered
to the loader yet, the boot image can't be boot from embedded memory
disk.
Mark Johnston [Wed, 7 Apr 2021 18:19:52 +0000 (14:19 -0400)]
capsicum: Limit socket operations in capability mode
Capsicum did not prevent certain privileged networking operations,
specifically creation of raw sockets and network configuration ioctls.
However, these facilities can be used to circumvent some of the
restrictions that capability mode is supposed to enforce.
Add capability mode checks to disallow network configuration ioctls and
creation of sockets other than PF_LOCAL and SOCK_DGRAM/STREAM/SEQPACKET
internet sockets.
Reviewed by: oshogbo
Discussed with: emaste
Reported by: manu
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29423
When we find a state for packets that was created by a reply-to rule we
still need to process the packet. The state may require us to modify the
packet (e.g. in rdr or nat cases), which we won't do with the shortcut.
Kristof Provost [Thu, 25 Mar 2021 12:59:14 +0000 (13:59 +0100)]
libnv: Allow use in non-sleepable contexts
44c125c4cebc2fd87c6260b90eddae11201f5232 switched the nvlist allocations
to be M_WAITOK, but this precludes the use in non-sleepable contexts.
(E.g. with a nonsleepable lock held).
All callers for these allocation functions already cope with memory
alloation failures, so there's no reason to allow sleeping during
allocations.
pf tests: make synproxy and nat work correctly even if inetd is running
tests/sys/netfil/pf/synproxy fails if inetd has been running
outside of the jail because pidfile_open() fails with EEXIST.
tests/sys/netfil/pf/nat has the same problem but the test succeeds
because whether inetd is running is not so important.
Fix the problem by changing the pidfile path from the default
location.
Ka Ho Ng [Wed, 7 Apr 2021 11:00:31 +0000 (19:00 +0800)]
Document vnode_pager_setsize(9)
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Reviewed by: bcr
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D29408
When the FreeBSD testsuite runs the libarchive tests it checks that stderr
is empty. Since #1382 this is no longer the case. This change restores
the behaviour of silencing bunzip2 stderr but doesn't bring back the
output text check.
Alexander Motin [Tue, 6 Apr 2021 21:27:16 +0000 (17:27 -0400)]
Introduce "soft" serseq variant.
With new ZFS prefetcher improvements it is no longer needed to fully
serialize reads to reach decent prediction hit rate. Softer variant
only creates small time window to reduce races instead of completely
blocking following reads while previous is running. It much less
hurts the performance in case of prediction miss.
Allocate extra inodes in makefs when leaving free space in UFS images.
By default, makefs(8) has very few spare inodes in its output images,
which is fine for static filesystems, but not so great for VM images
where many more files will be added. Make makefs(8) use the same
default settings as newfs(8) when creating images with free space --
there isn't much point to leaving free space on the image if you
can't put files there. If no free space is requested, use current
behavior of a minimal number of available inodes.
Reviewed by: manu
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D29492
Eric van Gyzen [Tue, 6 Apr 2021 14:36:52 +0000 (09:36 -0500)]
uefisign: fix handling of errors from child proc
Close the unused pipe file descriptors so the parent will notice if
the child exits prematurely. Previously, the parent would block
forever on a read from the pipe.
Marcin Wojtas [Tue, 6 Apr 2021 16:50:36 +0000 (18:50 +0200)]
pci_user: fix build for 32-bit platforms
Commit: f2f1ab39c040 ("pci_user: call bus_translate_resource before BAR mmap")
broke build for 32-bit platforms due to rman_res_t and vm_paddr_t
incompatible types. Fix that.
Marcin Wojtas [Tue, 6 Apr 2021 15:10:04 +0000 (17:10 +0200)]
pci_user: call bus_translate_resource before BAR mmap
On some armv8 machines it is possible that the mapping between CPU
and PCI bus BAR base addresses is not 1:1. In case a BAR is allocated
in kernel using bus_alloc_resource_any this translation is handled in
ofw_pci_activate_resource.
Do the same in pci_user.c by calling bus_translate_resource devmethod.
This fixes mmaping BARs to userspace on Marvell SoCs (Armada 7k8k/CN913x)
and possibly many other platforms.
Marcin Wojtas [Tue, 6 Apr 2021 15:00:05 +0000 (17:00 +0200)]
pciconf: Use VM_MEMATTR_DEVICE on supported architectures
Some architectures - armv7, armv8 and riscv use VM_MEMATTR_DEVICE
when mapping device registers in kernel. Do the same in pciconf.
On armada8k SoC all reads from BARs mapped with hitherto attribute
(VM_MEMATTR_UNCACHEABLE) return 0xff's.
Radix MMU code was missing TLB invalidations when some Level 3 PDEs were
modified. This caused TLB multi-hit machine check interrupts when
superpages were enabled.
Reviewed by: jhibbits
MFC after: 2 weeks
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D29511