Mark Johnston [Wed, 5 Oct 2022 19:12:46 +0000 (15:12 -0400)]
vm_page: Fix a logic error in the handling of PQ_ACTIVE operations
As an optimization, vm_page_activate() avoids requeuing a page that's
already in the active queue. A page's location in the active queue is
mostly unimportant.
When a page is unwired and placed back in the page queues,
vm_page_unwire() avoids moving pages out of PQ_ACTIVE to honour the
request, the idea being that they're likely mapped and so will simply
get bounced back in to PQ_ACTIVE during a queue scan.
In both cases, if the page was logically in PQ_ACTIVE but had not yet
been physically enqueued (i.e., the page is in a per-CPU batch), we
would end up clearing PGA_REQUEUE from the page. Then, batch processing
would ignore the page, so it would end up unwired and not in any queues.
This can arise, for example, when a page is allocated and then
vm_page_activate() is called multiple times in quick succession. The
result is that the page is hidden from the page daemon, so while it will
be freed when its VM object is destroyed, it cannot be reclaimed under
memory pressure.
Fix the bug: when checking if a page is in PQ_ACTIVE, only perform the
optimization if the page is physically enqueued.
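The corrected check can be pictured with a small user-space model. This is only a sketch: the struct layout, the flag values and the helper name activate_can_skip() are assumptions made for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the kernel's definitions differ. */
#define PQ_NONE      255
#define PQ_ACTIVE    1
#define PGA_ENQUEUED 0x01	/* page is physically linked into its queue */
#define PGA_REQUEUE  0x02	/* a batched queue operation is pending */

struct page_astate {
	uint8_t queue;	/* logical queue index */
	uint8_t flags;	/* PGA_* flags */
};

/*
 * Decide whether an "activate" request may be skipped.  The shortcut is
 * taken only when the page is logically in PQ_ACTIVE *and* physically
 * enqueued; a page still sitting in a per-CPU batch keeps its pending
 * requeue request.
 */
static bool
activate_can_skip(struct page_astate as)
{
	return (as.queue == PQ_ACTIVE && (as.flags & PGA_ENQUEUED) != 0);
}

/* Usage: a batched page must not take the shortcut, an enqueued one may. */
static void
activate_example(void)
{
	struct page_astate batched = { PQ_ACTIVE, PGA_REQUEUE };
	struct page_astate queued = { PQ_ACTIVE, PGA_ENQUEUED };

	assert(!activate_can_skip(batched));
	assert(activate_can_skip(queued));
}
```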
PR: 256507
Fixes: f3f38e2580f1 ("Start implementing queue state updates using fcmpset loops.")
Reviewed by: alc, kib
Sponsored by: E-CARD Ltd.
Sponsored by: Klara, Inc.
We have had the authorization from the University of California to remove
the advertising clause for a while. wosch@, who also holds a copyright
on this code, has also approved the relicensing.
In the default configuration, add 2 bindings which have been requested by
many during the HEADSUP discussion:
* csh like arrow history navigation
* ctrl-arrow to jump from word to word
Add history as an alias for fc -l.
This change makes ident only dependent on libc functions
This makes our ident(1) more portable. Also, because we only depend on
libc, which is maintained with excellent backward compatibility, if one
day ident is removed from base, someone using FreeBSD 22 will be able to
fetch ident from FreeBSD 14 to run ident against a FreeBSD 1.0 binary.
Most programs in ports look for .pc files in order to get the
necessary information on how to compile and link against openssl.
The ports framework now also has a way to hide or force a path for
pkgconf.
Providing .pc files along with openssl in base will allow us (once all
supported versions of FreeBSD have them) to improve the framework to
deal with openssl in base vs openssl in ports (and libressl).
This will also greatly reduce the number of patches necessary to
work around build systems which only know how to detect where
openssl is installed via pkgconf.
Doug Moore [Wed, 21 Sep 2022 04:21:14 +0000 (23:21 -0500)]
rb_tree: augmentation shortcut
RB-tree augmentation maintains data in each node of the tree that
represents the product of some associative operator applied to all the
nodes of the subtree rooted at that node. If a node in the tree
changes, augmentation data is updated for that node and for all
nodes on the path from that node to the tree root. However,
sometimes, augmenting a node changes no data in that node,
particularly if the associated operation is something involving 'max'
or 'min'. If augmentation changes nothing in a node, then the work of
walking to the tree root from that point is pointless, because
augmentation will change nothing in those nodes either. This change
makes it possible to avoid that wasted work.
Define RB_AUGMENT_CHECK as a macro much like RB_AUGMENT, but which
returns a value 'true' when augmentation changes the augmentation data
of a node, and false otherwise. Change code that unconditionally walks
and augments to the top of tree to code that stops once an
augmentation has no effect. In the case of rebalancing the tree after
insertion or deletion, where previously a node rotated into the path
was inevitably augmented on the march to the tree root, now check to
see if it needs augmentation because the march to the tree root
stopped before reaching it.
Change the augmentation function in iommu_gas.c so that it returns
true/false to indicate whether the augmentation had any effect.
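The early-exit walk can be sketched with a plain parent-pointer tree and a "max" augmentation. This is a minimal illustration, not the tree(3) macros: struct node, augment_check() and augment_walk() are hypothetical names, and rebalancing is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *parent, *left, *right;
	int value;	/* this node's own key */
	int max;	/* augmentation: largest value in the subtree */
};

/* Recompute n->max and report whether it changed. */
static bool
augment_check(struct node *n)
{
	int m;

	assert(n != NULL);
	m = n->value;
	if (n->left != NULL && n->left->max > m)
		m = n->left->max;
	if (n->right != NULL && n->right->max > m)
		m = n->right->max;
	if (m == n->max)
		return (false);
	n->max = m;
	return (true);
}

/*
 * Walk from n toward the root, stopping at the first node whose
 * augmentation data did not change.  Returns the number of nodes visited.
 */
static int
augment_walk(struct node *n)
{
	int visited = 0;

	while (n != NULL) {
		visited++;
		if (!augment_check(n))
			break;
		n = n->parent;
	}
	return (visited);
}
```

With a 'max' augmentation, raising a leaf's value without exceeding its parent's maximum stops the walk at the parent instead of continuing to the root.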
Doug Moore [Tue, 13 Sep 2022 06:11:47 +0000 (01:11 -0500)]
rb_tree: pass parent to RB_INSERT_COLOR
Change RB_INSERT_COLOR to take a parent parameter, to avoid looking up
a value already available. Make adjustments to a linux rbtree header,
which invokes it.
Mark Johnston [Sat, 24 Sep 2022 13:18:04 +0000 (09:18 -0400)]
smr: Fix synchronization in smr_enter()
smr_enter() must publish its observed read sequence number before
issuing any subsequent memory operations. The ordering provided by
atomic_add_acq_int() is insufficient on some platforms, at least on
arm64, because it permits reordering of subsequent loads with the store
to c_seq.
Thus, use atomic_thread_fence_seq_cst() to issue a store-load barrier
after publishing the read sequence number.
On x86, take advantage of the fact that memory operations are not
reordered with locked instructions to improve code density: we can store
the observed read sequence and provide a store-load barrier with a
single operation.
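The generic (non-x86) ordering can be sketched with C11 atomics in user space. This is a simplified model under stated assumptions: struct smr_model, its field names and the use of 0 as "not in a section" are invented for the example, and the kernel uses its own atomic(9) primitives rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>

struct smr_model {
	atomic_uint wr_seq;	/* shared write sequence number */
	atomic_uint c_seq;	/* published read sequence; 0 = not in a section */
};

static void
smr_enter_model(struct smr_model *s)
{
	unsigned int seq;

	seq = atomic_load_explicit(&s->wr_seq, memory_order_acquire);
	assert(seq != 0);	/* 0 is reserved in this model */
	/* Publish the observed read sequence number... */
	atomic_store_explicit(&s->c_seq, seq, memory_order_relaxed);
	/*
	 * ...then issue a full fence so that loads inside the read section
	 * cannot be reordered ahead of the store to c_seq.
	 */
	atomic_thread_fence(memory_order_seq_cst);
}

static void
smr_exit_model(struct smr_model *s)
{
	/* Release ordering keeps the section's accesses before the exit. */
	atomic_store_explicit(&s->c_seq, 0, memory_order_release);
}
```

An acquire operation alone only orders later accesses after the load half of the operation, which is why the sketch uses a separate sequentially consistent fence after the store.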
Based on a patch from Pierre Habouzit <pierre@habouzit.net>.
Mark Johnston [Sat, 24 Sep 2022 13:19:21 +0000 (09:19 -0400)]
amd64: Make it possible to grow the KERNBASE region of KVA
pmap_growkernel() may be called when mapping a region above KERNBASE,
typically for a kernel module. If we have enough PTPs left over from
bootstrap, pmap_growkernel() does nothing. However, it's possible to
run out, and in this case pmap_growkernel() will try to grow the kernel
map all the way from kernel_vm_end to somewhere past KERNBASE, which can
easily run the system out of memory. This happens with large kernel
modules such as the nvidia GPU driver. There is also a WIP dtrace
provider which needs to map KVA in the region above KERNBASE (to provide
trampolines which allow a copy of traced kernel instruction to be
executed), and its allocations could potentially trigger this scenario.
This change modifies pmap_growkernel() to manage the two regions
separately, allowing them to grow independently. The end of the
KERNBASE region is tracked by modifying "nkpt".
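The idea of growing the two regions independently can be illustrated with a toy model. Everything here is an assumption made for the sketch: the struct, the 2 MiB step, the split address and the use of an end address for the upper region (the commit tracks it through "nkpt" instead).

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_KERNBASE	UINT64_C(0xffffffff80000000)	/* assumed split point */
#define MODEL_STEP	UINT64_C(0x200000)		/* assumed growth unit */

struct kva_model {
	uint64_t kernel_vm_end;	/* end of the main kernel map region */
	uint64_t kernbase_end;	/* end of the region above the split point */
};

/* Round addr up to the next multiple of MODEL_STEP. */
static uint64_t
round_up_step(uint64_t addr)
{
	assert(addr <= UINT64_MAX - (MODEL_STEP - 1));
	return ((addr + MODEL_STEP - 1) & ~(MODEL_STEP - 1));
}

/*
 * Grow only the region that contains addr and leave the other one alone.
 * Returns the number of growth steps that were needed.
 */
static uint64_t
growkernel_model(struct kva_model *m, uint64_t addr)
{
	uint64_t *end, steps, target;

	end = addr >= MODEL_KERNBASE ? &m->kernbase_end : &m->kernel_vm_end;
	target = round_up_step(addr);
	if (target <= *end)
		return (0);
	steps = (target - *end) / MODEL_STEP;
	*end = target;
	return (steps);
}
```

In this model a request just above the split point costs a couple of steps, rather than one step for every 2 MiB between kernel_vm_end and the requested address.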
Gleb Smirnoff [Mon, 3 Jan 2022 02:32:30 +0000 (18:32 -0800)]
sshd: update the libwrap patch to drop connections early
OpenSSH dropped libwrap support in OpenSSH 6.7p in 2014
(f2719b7c in github.com/openssh/openssh-portable) and we
have maintained the patch ourselves since 2016 (a0ee8cc636cd).
Over the years, the libwrap support has deteriorated, and that
was probably the reason for its removal upstream. The original idea
of libwrap was to drop illegitimate connections as soon as possible,
but over the years the code was pushed further and further down and
ended up in the forked client connection handler.
The negative effect of late dropping is an increased attack surface
for hosts that are to be dropped anyway. Apart from hypothetical
future vulnerabilities in connection handling, today a malicious
host listed in /etc/hosts.allow can still trigger sshd to enter
connection throttling mode, which is enabled by default (see
MaxStartups in sshd_config(5)), effectively mounting a DoS attack.
Note that on OpenBSD this attack isn't possible, since they enable
MaxStartups together with UseBlacklist.
The only negative effect of the early drop that I can imagine is that
the main listener now parses a file in /etc, and if our root filesystem
goes bad, it would get stuck. But it is unlikely you'd be able to log in
in that case anyway.
Implementation details:
- For brevity we reuse the same struct request_info. This isn't
a documented feature of libwrap, but code review, viewing data
in a debugger and real life testing show that if we clear
RQ_CLIENT_NAME and RQ_CLIENT_ADDR every time, it works as intended.
- We set SO_LINGER on the socket to force immediate connection reset.
- We log the message exactly as libwrap's refuse() would do.
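The SO_LINGER step can be sketched on its own with the standard socket API. The helper name set_abortive_close() is hypothetical and this is not the code from the patch; it only shows the usual way to request an immediate reset on close.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Ask for an abortive close: with lingering enabled and a zero timeout,
 * close() on a connected TCP socket discards unsent data and resets the
 * connection instead of performing the normal shutdown sequence.
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
static int
set_abortive_close(int fd)
{
	struct linger l;

	assert(fd >= 0);
	l.l_onoff = 1;	/* enable lingering */
	l.l_linger = 0;	/* zero timeout */
	return (setsockopt(fd, SOL_SOCKET, SO_LINGER, &l, sizeof(l)));
}
```

A listener would call this on the descriptor returned by accept() just before closing a connection it has decided to refuse.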
Ed Maste [Fri, 15 Apr 2022 14:41:08 +0000 (10:41 -0400)]
ssh: update to OpenSSH v9.0p1
Release notes are available at https://www.openssh.com/txt/release-9.0
Some highlights:
* ssh(1), sshd(8): use the hybrid Streamlined NTRU Prime + x25519 key
exchange method by default ("sntrup761x25519-sha512@openssh.com").
The NTRU algorithm is believed to resist attacks enabled by future
quantum computers and is paired with the X25519 ECDH key exchange
(the previous default) as a backstop against any weaknesses in
NTRU Prime that may be discovered in the future. The combination
ensures that the hybrid exchange offers at least as good security
as the status quo.
* sftp-server(8): support the "copy-data" extension to allow server-
side copying of files/data, following the design in
draft-ietf-secsh-filexfer-extensions-00. bz2948
* sftp(1): add a "cp" command to allow the sftp client to perform
server-side file copies.
This commit excludes the scp(1) change to use the SFTP protocol by
default; that change will immediately follow.
MFC after: 1 month
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Ed Maste [Wed, 13 Apr 2022 20:00:56 +0000 (16:00 -0400)]
ssh: update to OpenSSH v8.9p1
Release notes are available at https://www.openssh.com/txt/release-8.9
Some highlights:
* ssh(1), sshd(8), ssh-add(1), ssh-agent(1): add a system for
restricting forwarding and use of keys added to ssh-agent(1)
* ssh(1), sshd(8): add the sntrup761x25519-sha512@openssh.com hybrid
ECDH/x25519 + Streamlined NTRU Prime post-quantum KEX to the
default KEXAlgorithms list (after the ECDH methods but before the
prime-group DH ones). The next release of OpenSSH is likely to
make this key exchange the default method.
* sshd(8), portable OpenSSH only: this release removes in-built
support for MD5-hashed passwords. If you require these on your
system then we recommend linking against libxcrypt or similar.
A near-future release of OpenSSH will switch scp(1) from using the
legacy scp/rcp protocol to using SFTP by default.
Legacy scp/rcp performs wildcard expansion of remote filenames (e.g.
"scp host:* .") through the remote shell. This has the side effect of
requiring double quoting of shell meta-characters in file names
included on scp(1) command-lines, otherwise they could be interpreted
as shell commands on the remote side.
MFC after: 1 month
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
ipfilter/ippool: Return error code when listing a pool fails
When an internal or other error occurs during the listing of a pool,
return an error code when exiting ippool(8). Printing an error to
stderr without returning an error code is useless in shell scripts.
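The general pattern, remembering a failure so that it reaches the exit status, might look like the sketch below. It is not ippool(8) code: list_all(), the callback type and the two stand-in listing steps are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-pool listing step: returns 0 on success, -1 on error. */
typedef int (*list_fn)(int pool);

/*
 * List every pool.  An error is still reported on stderr, but it is also
 * recorded so the caller can hand a non-zero status to exit().
 */
static int
list_all(list_fn fn, int npools)
{
	int status = EXIT_SUCCESS;

	assert(fn != NULL);
	for (int i = 0; i < npools; i++) {
		if (fn(i) != 0) {
			fprintf(stderr, "pool %d: listing failed\n", i);
			status = EXIT_FAILURE;
		}
	}
	return (status);
}

/* Two stand-in listing steps for demonstration. */
static int
list_ok(int pool)
{
	(void)pool;
	return (0);
}

static int
list_fail_second(int pool)
{
	return (pool == 1 ? -1 : 0);
}
```

A program's main() would end with something like "return (list_all(fn, n));", which lets a shell script test the result with "if" or "set -e".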
powerpc: cpuset: add local functions for copyin/copyout
Add local functions to work around an instruction segment trap (panic)
when the indirect functions copyin and copyout are called by an external
loadable kernel module (i.e. pfsync, zfs and linuxulator). The crash
was triggered by change 47a57144af25a7bd768b29272d50a36fdf2874ba, but
a kernel binary linked with LLD 9 works fine. An LLVM bisect indicates
that LLD behavior changed after dc06b0bc9ad055d06535462d91bfc2a744b2f589.
This is known to affect powerpc targets only and the final fix is still
being discussed with the LLVM community.
PR: 266730
Reviewed by: luporl, jhibbits (on IRC, previous version)
MFC after: 2 days
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D36234
nfs: skip bootpc when vfs.root.mountfrom is other than nfs
If "vfs.root.mountfrom" is set and the value is something other
than "nfs:*", it means the user doesn't want to mount root via nfs,
so there's no reason to continue with bootpc.
This fixes the powerpcspe kernel (MPC85XXSPE), which is compiled with
BOOTP_NFSROOT by default and gets stuck in a bootpc/dhcp request loop
when no DHCP server is available on the network, even when the user
specifies a local disk via the "vfs.root.mountfrom" kernel parameter.
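The decision described above amounts to a prefix test on the variable's value. The sketch below is a user-space illustration only: bootpc_needed() is a hypothetical helper, and treating an empty string like an unset variable is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the bootpc exchange should go ahead.  An unset (or, by
 * assumption, empty) value keeps the previous behaviour; a set value
 * continues only when it names an nfs root.
 */
static bool
bootpc_needed(const char *mountfrom)
{
	if (mountfrom == NULL || mountfrom[0] == '\0')
		return (true);
	return (strncmp(mountfrom, "nfs:", 4) == 0);
}

/* Usage: a local disk skips bootpc, an nfs root does not. */
static void
bootpc_example(void)
{
	assert(!bootpc_needed("ufs:/dev/ada0p2"));
	assert(bootpc_needed("nfs:10.0.0.1:/export/root"));
}
```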
Reviewed by: imp
MFC after: 2 weeks
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D35098
Michael Gmelin [Wed, 7 Sep 2022 16:56:49 +0000 (18:56 +0200)]
stand: Unbreak FAT32 in loader
This corrects an issue introduced in b4cb3fe0e39a3, where a freshly
allocated `DOS_FS` structure would not be initialized properly before
use in `dos_open`.
In case of FAT32 file systems, this would leave `fs->dirents`
uninitialized and - depending on its content and due to checks in
`parsebs` - prevent mounting the file system successfully.
This particularly impacted the EFI loader, as it was sometimes not
able to read files from a FAT32-formatted EFI partition, including
LoaderEnv (`/efi/freebsd/loader.env`).
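One common way to avoid this class of bug is to zero-fill the structure at allocation time. The sketch below uses a hypothetical stand-in struct, not the loader's real `DOS_FS`, and it does not claim to reproduce the actual fix.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in; the loader's real structure has more fields. */
struct dos_fs_model {
	unsigned int fatsz;	/* FAT entry size in bits */
	unsigned int dirents;	/* number of root directory entries */
};

/*
 * Allocate a zero-filled structure.  With calloc() no field starts out
 * holding leftover heap contents, so later checks that read a field
 * before it is assigned see 0 rather than garbage.
 */
static struct dos_fs_model *
dos_fs_alloc(void)
{
	return (calloc(1, sizeof(struct dos_fs_model)));
}

/* Usage: every field of a fresh structure reads as zero. */
static void
dos_fs_example(void)
{
	struct dos_fs_model *fs = dos_fs_alloc();

	assert(fs != NULL);
	assert(fs->fatsz == 0 && fs->dirents == 0);
	free(fs);
}
```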
Leandro Lupori [Thu, 14 Oct 2021 16:13:27 +0000 (13:13 -0300)]
powerpc64: make radix with superpages default
As Radix MMU with superpages enabled is now stable, make it the
default choice on supported hardware (POWER9 and above), since its
performance is greater than that of HPT MMU.
Reviewed by: alfredo, jhibbits
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D30797
When running in a virtualized environment, TLB invalidations can only
be performed on process scope, as only the hypervisor is allowed to
invalidate a global scope, or else a Program Interrupt is triggered.
Since we are here, also make sure that the register process table
hypercall returns success.
Reviewed by: jhibbits
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D31775
Justin Hibbits [Thu, 12 Aug 2021 00:03:27 +0000 (19:03 -0500)]
powerpc/pseries: Allow radix pmap in pseries for ISA 3.0
ISA 3.0 allows for nested radix translations with minimal to no
involvement of the hypervisor. This should make pseries significantly
faster on POWER9 pseries instances, as fewer hypercalls are needed to
manage pmap now.
Leandro Lupori [Thu, 25 Nov 2021 19:41:46 +0000 (16:41 -0300)]
powerpc64le: fix boot when using QEMU PowerNV
When using QEMU PowerNV with the latest op-build release (v2.7), its
kexec transfers control to the FreeBSD kernel in BE mode, causing an
instant exception on LE kernels. Make kboot able to detect and
swap endianness to fix this.
Reviewed by: imp
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D33104
Martin Matuska [Tue, 4 Oct 2022 15:52:09 +0000 (17:52 +0200)]
zfs: merge openzfs/zfs@6a6bd4939 (zfs-2.1-release) into stable/13
OpenZFS release 2.1.6
Notable upstream pull request merges:
#11733 ICP: Add missing stack frame info to SHA asm files
#12274 Optimize txg_kick() process
#12284 Add Module Parameter Regarding Log Size Limit
#12285 Introduce a tunable to exclude special class buffers from L2ARC
#12287 Remove refcount from spa_config_*()
#12425 Avoid small buffer copying on write
#12516 Fix NFS and large reads on older kernels
#12678 spa.c: Replace VERIFY(nvlist_*(...) == 0) with fnvlist_*
#12789 Improve log spacemap load time
#13022 Add more control/visibility and speedup spa_load_verify()
#13106 add physical device size to SIZE column in 'zpool list -v'
#13388 Improve mg_aliquot math
#13405 Revert "Reduce dbuf_find() lock contention"
#13452 More speculative prefetcher improvements
#13476 Refactor Log Size Limit
#13540 AVL: Remove obsolete branching optimizations
#13553 Reduce ZIO io_lock contention on sorted scrub
#13555 Scrub mirror children without BPs
#13563 FreeBSD: Improve crypto_dispatch() handling
#13576 Several sorted scrub optimizations
#13579 Fix and disable blocks statistics during scrub
#13582 Several B-tree optimizations
#13591 Avoid two 64-bit divisions per scanned block
#13606 Avoid memory copies during mirror scrub
#13613 Avoid memory copy when verifying raidz/draid parity
#13643 Fix scrub resume from newly created hole
#13756 FreeBSD: Mark ZFS_MODULE_PARAM_CALL as MPSAFE
#13767 arcstat: fix -p option
#13781 Importing from cachefile can trip assertion
#13794 Apply arc_shrink_shift to ARC above arc_c_min
#13798 Improve too large physical ashift handling
#13811 Fix column width in 'zpool iostat -v' and 'zpool list -v'
#13842 make DMU_OT_IS_METADATA and DMU_OT_IS_ENCRYPTED return B_TRUE
or B_FALSE
#13855 zfs recv hangs if max recordsize is less than received
recordsize
#13861 Fix use-after-free in btree code
#13865 vdev_draid_lookup_map() should not iterate outside draid_maps
#13878 Delay ZFS_PROP_SHARESMB property to handle it for encrypted
raw receive
#13882 FreeBSD: Fix integer conversion for vnlru_free{,_vfsops}()
#13885 Fix incorrect size given to bqueue_enqueue() call in dmu_redact.c
#13908 FreeBSD: stop passing LK_INTERLOCK to VOP_LOCK
#13930 zpool: Don't print "repairing" on force faulted drives
#13954 Fix bad free in skein code
When WITHOUT_SENDMAIL is enabled and WITHOUT_MAILWRAPPER is disabled
we install /bin/rmail as a link to /usr/sbin/mailwrapper.
Ensure make delete-old does not unlink /bin/rmail in that case.
Justin Hibbits [Tue, 4 Jan 2022 15:22:04 +0000 (09:22 -0600)]
busdma: Fix powerpc DMA alignment check
The original logic was to check if there's no filter and the address is
misaligned relative to the requirements. The refactoring in
c606ab59e7f9423f7027320e9a4514c7db39658d missed this, and instead caused
it to return failure if the address *is* properly aligned.
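The intended condition can be written out as a small predicate. This is a sketch with hypothetical helper names, not the busdma code itself, and it assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when addr is not a multiple of alignment (a power of two). */
static bool
addr_misaligned(uint64_t addr, uint64_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return ((addr & (alignment - 1)) != 0);
}

/*
 * The check fails only when there is no filter and the address is
 * misaligned.  Inverting the alignment test would instead fail every
 * properly aligned address.
 */
static bool
alignment_check_fails(bool has_filter, uint64_t addr, uint64_t alignment)
{
	return (!has_filter && addr_misaligned(addr, alignment));
}
```

For example, with an 8-byte requirement and no filter, address 0x1004 fails the check while 0x1000 passes.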