FreeBSD/FreeBSD.git
Kirk McKusick [Fri, 20 Oct 2023 22:14:46 +0000 (15:14 -0700)]
Fix a bug in fsck_ffs(8) triggered by corrupted filesystems.

Reported-by: Andreas Bock
PR: 274404
(cherry picked from commit 1e39a0886e0999520a7e7136e3f7d09e9cd9a5f2)

Mikel Lechner [Sat, 21 Oct 2023 06:08:38 +0000 (09:08 +0300)]
ufs quotas: fix configuring soft quota grace time

PR: 274552

(cherry picked from commit 2fee3974603bce6f2dc153eb6af459cb4f864ab4)

Gleb Smirnoff [Thu, 26 Oct 2023 09:59:21 +0000 (02:59 -0700)]
bhyve: fix arguments to ioctl(VMIO_SIOCSIFFLAGS)

An ioctl(2) that takes an integer argument must pass that argument by value,
not by pointer.  The ioctl(2) manual page is not very clear about that.
See sys/kern/sys_generic.c:sys_ioctl() near IOC_VOID.

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D42366
Fixes: fd8b9c73a5a63a7aa438a73951d7a535b4f25d9a

(cherry picked from commit f407a72a506d2630d60d9096c42058f12dff874e)

Kristof Provost [Fri, 6 Oct 2023 12:20:17 +0000 (14:20 +0200)]
pfctl: fix incorrect mask on dynamic address

A PF rule using an IPv4 address followed by an IPv6 address and then a
dynamic address, e.g. "pass from {192.0.2.1 2001:db8::1} to (pppoe0)",
will have an incorrect /32 mask applied to the dynamic address.

MFC after: 3 weeks
Obtained from: OpenBSD
See also: https://ftp.openbsd.org/pub/OpenBSD/patches/5.6/common/007_pfctl.patch.sig
Sponsored by: Rubicon Communications, LLC ("Netgate")
Event: Oslo Hackathon at Modirum

(cherry picked from commit 7ce98cf2f87a22240b66e4c38fd887431a25bf7d)

Konstantin Belousov [Thu, 26 Oct 2023 22:26:46 +0000 (01:26 +0300)]
mips: add enough glue for membarrier(2)

This is a direct commit to stable/13.

Brooks Davis [Thu, 26 Oct 2023 20:38:41 +0000 (21:38 +0100)]
libprocstat: improve conditional for 32-bit compat

Include support for translating 32-bit auxv vectors on non-64-bit
platforms that aren't riscv (which has no 32-bit ABI support and
probably never will).

Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42201

(cherry picked from commit 248fe3d3483cb3ec2c78dd31dc02a467060a6577)

Brooks Davis [Thu, 26 Oct 2023 20:38:41 +0000 (21:38 +0100)]
libprocstat: copy all the 32-bit auxv entries

Use source struct size not the destination struct size so we copy all
the auxv entries, not just the first half of them.
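The size mix-up described above can be sketched in a few lines. This is an illustration only, not the libprocstat code: it assumes a 32-bit auxv entry occupies 8 bytes (a 4-byte type plus a 4-byte value) and a 64-bit entry occupies 16 bytes, and the helper names `count_entries` and `widen` are made up for the example.

```python
import struct

# Assumed layouts: 32-bit entry = two 4-byte fields, 64-bit entry = two 8-byte fields.
SRC_ENTRY = struct.Struct("<II")  # source (32-bit) entry, 8 bytes
DST_ENTRY = struct.Struct("<QQ")  # destination (64-bit) entry, 16 bytes


def count_entries(buf: bytes, entry_size: int) -> int:
    """Number of whole entries in buf for the given entry size."""
    return len(buf) // entry_size


def widen(buf: bytes, count: int) -> list:
    """Read `count` 32-bit entries and return them as (type, value) pairs."""
    return [SRC_ENTRY.unpack_from(buf, i * SRC_ENTRY.size) for i in range(count)]


# Four 32-bit entries, as they might arrive from a 32-bit process.
raw = b"".join(SRC_ENTRY.pack(t, v) for t, v in [(3, 0x1000), (4, 32), (5, 9), (0, 0)])

wrong = widen(raw, count_entries(raw, DST_ENTRY.size))  # divides by the destination size
right = widen(raw, count_entries(raw, SRC_ENTRY.size))  # divides by the source size
```

Dividing the buffer length by the larger destination size yields half the real entry count, which matches the "first half" symptom the commit message describes.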

Fix a style issue on an adjacent line.

Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42200

(cherry picked from commit 8f06fabe39ac3ebca4ab448a456945008305a23f)

Brooks Davis [Thu, 26 Oct 2023 20:38:41 +0000 (21:38 +0100)]
libprocstat: make sv_name not static

Making this variable static makes is_elf32_sysctl() and callers thread
unsafe.

Use a less absurd length for sv_name.  The longest name in the system is
"FreeBSD ELF64 V2" which tips the scales at 16+1 bytes.  We'll almost
certainly have other problems if we exceed 32 characters.

Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42199

(cherry picked from commit 72a4ee26a7c665ae1c31abe1c6feeaa7ccaba140)

Brooks Davis [Thu, 26 Oct 2023 20:38:41 +0000 (21:38 +0100)]
libprocstat: simplify auxv value conversion

Avoid a weird dance through the union and treat all 32-bit values as
unsigned integers.  This avoids sign extension of flags and userspace
pointers.
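The sign-extension problem can be shown with a small sketch. It is not the libprocstat code; the helper names are hypothetical and the sample value is an arbitrary 32-bit address with the top bit set.

```python
import struct

MASK64 = (1 << 64) - 1


def widen_signed(raw: bytes) -> int:
    """Read 4 bytes as a signed 32-bit integer, then view the result as 64 bits."""
    (value,) = struct.unpack("<i", raw)
    return value & MASK64


def widen_unsigned(raw: bytes) -> int:
    """Read 4 bytes as an unsigned 32-bit integer, then view the result as 64 bits."""
    (value,) = struct.unpack("<I", raw)
    return value & MASK64


# An arbitrary 32-bit userspace address whose most significant bit is set.
raw = struct.pack("<I", 0xFFFFE000)

signed_result = widen_signed(raw)      # upper 32 bits are filled with ones
unsigned_result = widen_unsigned(raw)  # upper 32 bits stay zero
```

Treating every 32-bit value as unsigned keeps the widened value equal to the original, which is the behaviour wanted for flags and pointers.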

Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42198

(cherry picked from commit 9735cc0e41825bb9e95d16433d381ffe4c190f38)

Brooks Davis [Thu, 26 Oct 2023 20:38:40 +0000 (21:38 +0100)]
libprocstat: style: space after switch

Style demands a space after the switch keyword.

Noticed reviewing code in CheriBSD that propagated the style bug.

Reported by: markj
Sponsored by: DARPA
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D42041

(cherry picked from commit ccac440f7cbb013de41aa3933f3f7be77225c44f)

Konstantin Belousov [Thu, 26 Oct 2023 04:06:56 +0000 (07:06 +0300)]
Regen

Konstantin Belousov [Thu, 7 Oct 2021 21:10:07 +0000 (00:10 +0300)]
Add membarrier(2)

(cherry picked from commit 4a69fc16a583face922319c476f3e739d9ce9140)

Konstantin Belousov [Thu, 7 Oct 2021 21:57:55 +0000 (00:57 +0300)]
Add cpu_sync_core()

(cherry picked from commit 74ccb8ecf6c115a79f008bc32d4981f1126b63a8)

Konstantin Belousov [Thu, 7 Oct 2021 22:25:54 +0000 (01:25 +0300)]
add pmap_active_cpus()

(cherry picked from commit 8882b7852acf2588d87ccb6d4c6bf7694511fc56)

Warner Losh [Fri, 9 Jun 2023 13:26:07 +0000 (07:26 -0600)]
makesyscall: Stop generating $FreeBSD$

With 14 coming, we no longer need to generate the $FreeBSD$. We can
likely MFC that to 13 as well.

MFC After: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D39879

(cherry picked from commit 61fe63f698148f8d63fe6f366c5e20bc7ad60b87)

Warner Losh [Thu, 20 Apr 2023 22:16:16 +0000 (16:16 -0600)]
makesyscalls.lua: Minor fluff removal

luacheck pointed out two minor issues: line isn't declared as a global,
so declare it local. Also remove an unused parameter.

Suggested by: kevans
Sponsored by: Netflix

(cherry picked from commit c1e987e0624ebf509a6d86099dd47ae6d72a8c03)

Warner Losh [Thu, 20 Apr 2023 22:15:57 +0000 (16:15 -0600)]
makesyscalls.lua: Make more luaish

x["y"] can be written as x.y, which looks better and is a more typical
lua idiom.

Sponsored by: Netflix
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D39709

(cherry picked from commit 1dd350fce0aad85c559b962654f71d1449f21727)

John Baldwin [Fri, 20 Oct 2023 21:53:49 +0000 (14:53 -0700)]
acpi_pcib: Rename decoded_bus_range to get_decoded_bus_range

While here, change the return value to bool.

Discussed by: gibbs

(cherry picked from commit f6c2774fe415f3b79c551b8075c159d6a7d4d0bf)

John Baldwin [Fri, 20 Oct 2023 21:53:05 +0000 (14:53 -0700)]
x86: Cosmetic cleanups to struct msi_intsrc

- Sort members by size.

- Change msi_msix from a u_int to a bool.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D42305

(cherry picked from commit bfccb4a429795954cfeca4ba60a07c0e1ec35e07)

John Baldwin [Fri, 20 Oct 2023 21:52:38 +0000 (14:52 -0700)]
x86 msi: Enable/disable IDT vectors for MSI groups all at once

Unlike MSI-X, when a device uses multiple MSI interrupts, the entire
group of interrupts is enabled/disabled at once in the relevant PCI
config register.  Currently, the interrupt code enables the IDT vector
for each MSI interrupt when a handler is first registered.  If the PCI
device triggers an MSI interrupt which doesn't yet have a handler,
this can trigger a panic when the Xrsvd ISR executes rather than
treating it as a stray device interrupt.

To fix, enable all the IDT vectors for an MSI group when the first
interrupt handler is configured, and don't disable the IDT vectors
until the last interrupt handler for the group is torn down.

When migrating an MSI group between CPUs, enable/disable the entire
group of IDT vectors if at least one interrupt handler is configured
for the group.
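The enable-on-first, disable-on-last rule can be modelled as a simple reference count. The sketch below is a hypothetical model, not the x86 interrupt code: the class `MsiGroup` and its fields are invented for illustration, and migration between CPUs is left out.

```python
class MsiGroup:
    """Hypothetical model of one MSI group whose vectors are switched together."""

    def __init__(self, count: int) -> None:
        self.enabled = [False] * count  # per-vector "IDT vector enabled" flag
        self.handlers = 0               # number of handlers configured in the group

    def add_handler(self) -> None:
        # The first handler enables every vector in the group, not just its own.
        if self.handlers == 0:
            self.enabled = [True] * len(self.enabled)
        self.handlers += 1

    def remove_handler(self) -> None:
        if self.handlers == 0:
            raise ValueError("no handler configured")
        self.handlers -= 1
        # Only tearing down the last handler disables the vectors again.
        if self.handlers == 0:
            self.enabled = [False] * len(self.enabled)


group = MsiGroup(4)
group.add_handler()
after_first_add = list(group.enabled)
group.add_handler()
group.remove_handler()
after_one_removed = list(group.enabled)
group.remove_handler()
after_last_removed = list(group.enabled)
```

With this rule, an interrupt on a vector that has no handler yet still lands on an enabled vector, so it can be treated as a stray interrupt.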

Reported by: jhay
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D42232

(cherry picked from commit 2d4924892144f653a7a7afba27ed1bf536dd7e51)

John Baldwin [Mon, 16 Oct 2023 22:19:07 +0000 (15:19 -0700)]
acpi_pcib: Trust decoded bus range from _CRS over _BBN

Currently if _BBN doesn't match the first bus in the decoded bus range
from _CRS for a Host to PCI bridge, the driver fails to attach as a
defensive measure.

There is now firmware in the field where these do not match, and the
_BBN values are clearly wrong, so rather than failing attach, trust
the range from _CRS over _BBN.

Co-authored-by: Justin Gibbs <gibbs@FreeBSD.org>
Reported by: gibbs
Reviewed by: imp (earlier version)
Differential Revision: https://reviews.freebsd.org/D42231

(cherry picked from commit 22a6678b627b39ceb94f7323be1010e928d92494)

John Baldwin [Mon, 16 Oct 2023 22:17:48 +0000 (15:17 -0700)]
bhyve: Replace many fprintf(stderr, ...) calls with EPRINTLN

EPRINTLN handles newlines appropriately when stdout/stderr have been
reused as the backend for a serial port.

For bhyverun.c itself, the rule this attempts to follow is to use
regular fprintf/perror/warn/err prior to init_pci() (which is when
serial ports are configured) and to switch to EPRINTLN afterwards.

Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D42182

(cherry picked from commit b0936440b8fcee523c0b26fdbbef7c3b2b5098bf)

John Baldwin [Fri, 13 Oct 2023 19:26:58 +0000 (12:26 -0700)]
bhyve ahci: Replace WPRINTF with EPRINTLN

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D42181

(cherry picked from commit edd2a9b887864d07ac5af480b4b8f35cb76443f6)

John Baldwin [Fri, 13 Oct 2023 19:26:22 +0000 (12:26 -0700)]
bhyve: Some fwctl simplifications.

- Collapse IDENT_SEND/IDENT_WAIT states down to a single state.

- Remove unused 'len' argument to op_data callback.  The value passed
  in (total amount of remaining data to receive) didn't seem very useful
  and no op_data implementations used it.

Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D41286

(cherry picked from commit f0852344e7abf4d74508185e67a1b98d6cdbd026)

Yuri Pankov [Thu, 12 Oct 2023 19:49:47 +0000 (12:49 -0700)]
bhyve: Document the hw.vmm.maxcpu tunable and the current limit on vCPUs

Reviewed by: corvink (original version)
Co-authored-by: John Baldwin <jhb@FreeBSD.org>
Differential Revision: https://reviews.freebsd.org/D40074

(cherry picked from commit da202b0fe616e9314739f01493ae310e37a36d8d)

John Baldwin [Wed, 11 Oct 2023 21:22:17 +0000 (14:22 -0700)]
amd64: Remove a stale comment from cpu_setregs

Reviewed by: kib, markj, emaste
Differential Revision: https://reviews.freebsd.org/D42134

(cherry picked from commit e839ebfc0dc5851d383ac38740f32e96f7bd5186)

John Baldwin [Wed, 11 Oct 2023 21:21:12 +0000 (14:21 -0700)]
riscv: Tidy panic messages for exceptions

- Remove trailing newlines

- Be consistent about the format used to print pointer values

- Print the trap value for access faults (it is the faulting address
  if non-zero) and illegal instructions (it is the first N bytes of
  the decoded instruction if non-zero)

Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D41786

(cherry picked from commit ff79f35bdae5742f4e56e1dc18fffc5d9ea98876)

John Baldwin [Tue, 10 Oct 2023 17:34:43 +0000 (10:34 -0700)]
Trim various $FreeBSD$

Approved by: markj (cddl/contrib changes)
Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41961

(cherry picked from commit f53355131f65d64e7643d734dbcd4fb2a5de20ed)

John Baldwin [Mon, 25 Sep 2023 14:56:02 +0000 (07:56 -0700)]
sendmail: Drop $FreeBSD$ from .mc files

Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41960

(cherry picked from commit 3aaa7724d68fb001ca3c7e75950edcb617aaeb65)

John Baldwin [Mon, 25 Sep 2023 14:55:43 +0000 (07:55 -0700)]
Update a few tools to not embed $FreeBSD$ in generated files

Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41959

(cherry picked from commit c4e2333cb2a59e44004d824a1093d9bf1d9fe273)

John Baldwin [Mon, 25 Sep 2023 14:55:18 +0000 (07:55 -0700)]
powerpc/generate-hfs.sh: Don't include $FreeBSD$ in prefix to uuencoded image

Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41958

(cherry picked from commit 5919ab299160e6d330bfd8bacf7bd1c5ad8cabb9)

John Baldwin [Mon, 25 Sep 2023 14:54:56 +0000 (07:54 -0700)]
Purge more stray embedded $FreeBSD$ strings

These do not use __FBSDID but instead use bare char arrays.

Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41957

(cherry picked from commit eba230afba4932f02a1ca44efc797cf7499a5cb0)

John Baldwin [Mon, 25 Sep 2023 14:50:33 +0000 (07:50 -0700)]
lpr: Remove now unused fallback definition for __FBSDID

Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41956

(cherry picked from commit e4c68414d0854b5e43dfd1b2b0cfbc295702e831)

John Baldwin [Mon, 25 Sep 2023 14:50:11 +0000 (07:50 -0700)]
Update a couple of tools to not embed __FBSDID in generated files

Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41955

(cherry picked from commit 99159b076a278d1feb0e18ae99fd866c90443893)

John Baldwin [Mon, 25 Sep 2023 14:49:52 +0000 (07:49 -0700)]
Remove a few more stray __FBSDID uses

Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41954

(cherry picked from commit 16837d353cdde87672d08112610e51e4121c4e50)

John Baldwin [Mon, 25 Sep 2023 14:49:30 +0000 (07:49 -0700)]
videomode: Regenerate files

Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41953

(cherry picked from commit fc3cc652e500bd8e33b4b77449d167f1df073acb)

John Baldwin [Mon, 25 Sep 2023 14:46:53 +0000 (07:46 -0700)]
videomode/devlist2h.awk: Don't include $FreeBSD$ in generated files

Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41952

(cherry picked from commit bd524e2ddb77e1c691f308359ab917414ecb8bed)

John Baldwin [Mon, 25 Sep 2023 14:46:09 +0000 (07:46 -0700)]
make_*_driver.sh: Don't include $FreeBSD$ in generated files

Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D41950

(cherry picked from commit 97232e04ca07dffeef629c1628f1cc95f062b41a)

John Baldwin [Sat, 23 Sep 2023 21:49:11 +0000 (14:49 -0700)]
factor: Remove an empty #ifdef __FBSDID clause

(cherry picked from commit f2f73fa7bd4b24c22ced0ff4566e03115dc9cb5f)

John Baldwin [Sat, 23 Sep 2023 20:27:49 +0000 (13:27 -0700)]
ifconfig/ifvlan.c: Whitespace fix

(cherry picked from commit 701468baa415c7d563d1ad28d3133d0a976908e6)

Mark Johnston [Tue, 24 Oct 2023 18:03:49 +0000 (14:03 -0400)]
Revert "socket tests: Add a regression test for ktrace+recv(MSG_TRUNC)"

This reverts commit f5a9a849e9034c597c2b0a9014673a44834b9516.

This test will require extra work to port to stable/13.

Mark Johnston [Tue, 24 Oct 2023 18:03:23 +0000 (14:03 -0400)]
Revert "socket tests: Build fix"

This reverts commit 1b07f630c11ccf899612a7d02777fe0855e3bb25.

This test will require extra work to port to stable/13.

Mark Johnston [Tue, 17 Oct 2023 14:21:32 +0000 (10:21 -0400)]
socket tests: Build fix

Fixes: d8735eb7acc0 ("socket tests: Add a regression test for ktrace+recv(MSG_TRUNC)")
Reported by: Jenkins

(cherry picked from commit 4bd1e19684945aa1fd3397b58613f5210fda9091)

Jan Bramkamp [Mon, 4 Sep 2023 08:38:25 +0000 (10:38 +0200)]
bhyve: Use VMIO_SIOCSIFFLAGS instead of SIOCGIFFLAGS

Creating an IP socket to invoke the SIOCGIFFLAGS ioctl on is the only
thing preventing bhyve from working inside a bhyve jail with IPv4 and
IPv6 disabled, which restricts the jailed bhyve process to accessing the
host network only via a tap/vmnet device node.

PR: 273557
Fixes: 56be282bc999 ("bhyve: net_backends, automatically IFF_UP tap devices")
Reviewed by: markj
MFC after: 1 week

(cherry picked from commit fd8b9c73a5a63a7aa438a73951d7a535b4f25d9a)

Mark Johnston [Mon, 16 Oct 2023 20:12:37 +0000 (16:12 -0400)]
uiomove: Add some assertions

Make sure that we don't try to copy with a negative resid.

Make sure that we don't walk off the end of the iovec array.

Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42098

(cherry picked from commit 8fd0ec53deaad34383d4b344714b74d67105b258)

Mark Johnston [Mon, 16 Oct 2023 22:23:36 +0000 (18:23 -0400)]
socket tests: Add a regression test for ktrace+recv(MSG_TRUNC)

MFC after: 1 week

(cherry picked from commit d8735eb7acc0613fd19f74a49d3bdcb7ed0e9b0e)

Mark Johnston [Mon, 16 Oct 2023 21:35:07 +0000 (17:35 -0400)]
socket tests: Clean up the MSG_TRUNC regression tests a bit

- Fix style.
- Move test case-specific code out of the shared function and into the
  individual test cases.
- Remove unneeded setting of SO_REUSEPORT.
- Avoid unnecessary copying.
- Use ATF_REQUIRE* instead of ATF_CHECK*.  The former cause test
  execution to stop after a failed assertion, which is what we want.
- Add a test case for AF_LOCAL/SOCK_SEQPACKET sockets.

MFC after: 1 week

(cherry picked from commit b5e7dbac756afb49c58315c7081737b34a1d2dfd)

Mark Johnston [Tue, 17 Oct 2023 14:25:38 +0000 (10:25 -0400)]
geom_linux_lvm: Avoid removing from vg_list before inserting

PR: 266693
Reported by: Robert Morris <rtm@lcs.mit.edu>
MFC after: 1 week

(cherry picked from commit 56279238b03a0ccef245b22fff7679fe35cffccc)

R. Christian McDonald [Tue, 17 Oct 2023 16:57:22 +0000 (18:57 +0200)]
ndp: fix timestamp display output

The current xo_format string is incorrect. This restores the display
format prior to libxo-ification work while also explicitly marking
tv_sec and tv_usec as encoded output only.

MFC after: 1 week
Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D42269

(cherry picked from commit 2bb78b46e02413483409fe73244995524b838b6e)

Zhenlei Huang [Tue, 17 Oct 2023 07:05:25 +0000 (15:05 +0800)]
x86: Prefer consistent naming for loader tunables

The following loader tunables have corresponding sysctl MIBs, but
with inconsistent naming, probably for historical reasons. Let's prefer
consistent naming for them so that they are easier to maintain.

 1. hw.dmar.timeout -> hw.iommu.dmar.timeout
 2. hw.lapic_eoi_suppression -> hw.apic.eoi_suppression
 3. hw.lapic_tsc_deadline -> hw.apic.timer_tsc_deadline
 4. hw.x2apic_enable -> hw.apic.x2apic_mode

These tunables are for field debugging, so there is no need to keep the
old names for compatibility.
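Because the old names are dropped, a /boot/loader.conf that still sets them would need updating. A minimal sketch of such a rewrite follows; the mapping is taken from the list above, while the helper `rename_tunable` and its simple `name=value` parsing are assumptions made for the example.

```python
# Old tunable name -> new tunable name, as listed in the commit message.
RENAMED_TUNABLES = {
    "hw.dmar.timeout": "hw.iommu.dmar.timeout",
    "hw.lapic_eoi_suppression": "hw.apic.eoi_suppression",
    "hw.lapic_tsc_deadline": "hw.apic.timer_tsc_deadline",
    "hw.x2apic_enable": "hw.apic.x2apic_mode",
}


def rename_tunable(line: str) -> str:
    """Rewrite one `name=value` line if it uses an old name; otherwise return it as is."""
    name, sep, value = line.partition("=")
    new_name = RENAMED_TUNABLES.get(name.strip())
    if sep and new_name:
        return f"{new_name}={value}"
    return line


old_conf = ['hw.x2apic_enable="0"', "# local settings", 'kern.hz="100"']
new_conf = [rename_tunable(line) for line in old_conf]
```

Comments and unrelated settings pass through unchanged; only lines whose name matches one of the four old tunables are rewritten.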

Reviewed by: kib
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42248

(cherry picked from commit 12cce5994b92f8235f379d660ccb28da8e69f55b)
(cherry picked from commit 6cd7e3d118f247a8f6bc0f8162a9cb67155b7c76)

Zhenlei Huang [Fri, 20 Oct 2023 07:31:44 +0000 (15:31 +0800)]
amd64 pmap: Prefer consistent naming for loader tunable

The sysctl knob 'vm.pmap.allow_2m_x_ept' is a loader tunable and has a
public documentation entry in security(7), but it is fetched from the
kernel environment as 'hw.allow_2m_x_ept'. That is inconsistent and obscure.

Because the public security advisory FreeBSD-SA-19:25.mcepsc [1] exists,
people may refer to it and use 'hw.allow_2m_x_ept', so let's keep the old
name for compatibility.

[1] https://www.freebsd.org/security/advisories/FreeBSD-SA-19:25.mcepsc.asc

Reviewed by: kib
Fixes: c08973d09c95 Workaround for Intel SKL002/SKL012S errata
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42311

(cherry picked from commit 9e7f349ff10691c2e3fb03898dbc942794a47566)
(cherry picked from commit 8784b153a31fc0b3a12449a2f0377eb038e6fb7b)

Zhenlei Huang [Thu, 19 Oct 2023 17:18:25 +0000 (01:18 +0800)]
vmx: Prefer consistent naming for loader tunables

The following loader tunables have corresponding sysctl MIBs, but
with different names, probably for historical reasons. Let's prefer
consistent naming for them so that they are easier to read and
maintain.

 1. hw.vmm.l1d_flush -> hw.vmm.vmx.l1d_flush
 2. hw.vmm.l1d_flush_sw -> hw.vmm.vmx.l1d_flush_sw
 3. hw.vmm.vmx.use_apic_pir -> hw.vmm.vmx.cap.posted_interrupts
 4. hw.vmm.vmx.use_apic_vid -> hw.vmm.vmx.cap.virtual_interrupt_delivery
 5. hw.vmm.vmx.use_tpr_shadowing -> hw.vmm.vmx.cap.tpr_shadowing

Old names are kept for compatibility.

Meanwhile, add sysctl flag CTLFLAG_TUN to them so that `sysctl -T` will
report them correctly.

Reviewed by: corvink, jhb, kib, #bhyve
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D42251

(cherry picked from commit f3ff0918ffcdbcb4c39175f3f9be70999edb14e8)
(cherry picked from commit 9e48b627aed346bf5e950134a581218d3097eb7c)

Konstantin Belousov [Fri, 29 Sep 2023 18:43:42 +0000 (21:43 +0300)]
automount(8): when flushing autofs, specify fsid

PR: 272446

(cherry picked from commit 56c44bd92efa002b2185445878fc98172ae8c66f)

Andrew Gierth [Mon, 10 Jul 2023 15:09:56 +0000 (16:09 +0100)]
automount: check for mounted-over autofs instances on flush

PR: 272446

(cherry picked from commit 21b8e363c4eb24c0a5659101603cc08a86d87759)

Konstantin Belousov [Fri, 29 Sep 2023 18:42:50 +0000 (21:42 +0300)]
nmount(MNT_UPDATE): add optional generic fsid parameter

(cherry picked from commit 9ef7a491a4236810e50f0a2ee8d52f5c4bb02c64)

Kyle Evans [Thu, 12 Oct 2023 02:51:07 +0000 (21:51 -0500)]
freebsd-update: create deep BEs by default

The -r flag to bectl needs to go away, and we need to just do the right
thing.  In the meantime, we can apply an -r in freebsd-update as a
minimal fix to stop creating partial backups in these (non-default) deep
BE setups.

PR: 267535
(cherry picked from commit 989c5f6da99081b1f2b76ec09e91078e531e1250)

Zhenlei Huang [Thu, 19 Oct 2023 17:00:31 +0000 (01:00 +0800)]
pmap: Prefer consistent naming for loader tunable

The sysctl knob 'vm.pmap.pv_entry_max' has been a loader tunable since
7ff48af7040f (Allow a specific setting for pv entries) but is fetched
from the system environment as 'vm.pmap.pv_entries'. That is inconsistent
and obscure.

This reverts 36e1b9702e21 (Correct the tunable name in the message).

PR: 231577
Reviewed by: jhibbits, alc, kib
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42274

(cherry picked from commit 02320f64209563e35fa371fc5eac94067f688f7f)
(cherry picked from commit e53f8ca323e8e563d4b55883fc3544bea75aab29)

Zhenlei Huang [Sun, 15 Oct 2023 14:29:18 +0000 (22:29 +0800)]
veriexec: Correctly export symbols

There is no symbol named 'mac_veriexec_get_executable_flags'; the right
one is the function 'mac_veriexec_metadata_get_executable_flags()'.

Reviewed by: stevek
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42133

(cherry picked from commit f34c9c4e3bdc2b8bffae4ac26897e0e847e9f76f)
(cherry picked from commit d8aaf09792338fae07b9618665ea9612b7b92a6e)

Zhenlei Huang [Thu, 19 Oct 2023 15:23:33 +0000 (23:23 +0800)]
amd64: Fix two typos of loader tunables

To match the sysctl MIBs and document entries in security(7).

Fixes: 2dec2b4a34b4 amd64: flush L1 data cache on syscall return with an error
Fixes: 17edf152e556 Control for Special Register Buffer Data Sampling mitigation

Reviewed by: kib
MFC after: 1 day
Differential Revision: https://reviews.freebsd.org/D42249

(cherry picked from commit afbb8041a0633c97acb51ac895c9ae3cde4fe540)
(cherry picked from commit 032a0b44541ffb669a4553105c6f6343ab4e3a67)

Zhenlei Huang [Fri, 13 Oct 2023 14:42:34 +0000 (22:42 +0800)]
kasan.9: Mention the loader tunable 'debug.kasan.disable'

Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42165

(cherry picked from commit 2df97575088d2efe71d6ee136a677cf50249f96d)
(cherry picked from commit c878532881d639ff50fded05bca37595d8b9d00d)

Christos Margiolis [Fri, 13 Oct 2023 05:14:57 +0000 (08:14 +0300)]
teken: fix style in teken_wcwidth.h

Reviewed by: bojan.novkovic_fer.hr
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42164

(cherry picked from commit 90367ba750bcbf3f9ac4609c3ec8df4ab95a22af)

Bojan Novković [Fri, 13 Oct 2023 05:14:36 +0000 (08:14 +0300)]
tty/teken: fix UTF8 sequence validation logic

This patch fixes the UTF-8 sequence validation logic in
teken_utf8_bytes_to_codepoint() and fixes the fallback behaviour in
ttydisc_rubchar() when an invalid UTF-8 sequence is encountered. The code
previously used __bitcount() to extract sequence length information from
the leading byte. However, that approach breaks for certain code
points that have additional bits set in the first half of the leading
byte (e.g. Cyrillic characters). This led to incorrect behaviour when
deleting those characters using backspaces. The code now checks the
number of consecutive set bits in the leading byte starting from the
MSB, as per RFC 3629.
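The difference between the two approaches can be sketched as follows. This is not the teken code: the function names are invented, and `set_bits_in_high_nibble` is only an assumption about what the earlier __bitcount()-based check amounted to.

```python
def set_bits_in_high_nibble(lead: int) -> int:
    """Assumed shape of the old check: count every set bit in the top four bits."""
    return bin(lead & 0xF0).count("1")


def leading_ones(lead: int) -> int:
    """Count consecutive set bits starting from the most significant bit."""
    n = 0
    mask = 0x80
    while mask and lead & mask:
        n += 1
        mask >>= 1
    return n


def utf8_sequence_length(lead: int) -> int:
    """Sequence length implied by a lead byte per RFC 3629, or 0 if invalid."""
    n = leading_ones(lead)
    if n == 0:
        return 1  # 0xxxxxxx: single-byte (ASCII)
    if 2 <= n <= 4:
        return n  # 110xxxxx, 1110xxxx, 11110xxx
    return 0      # 10xxxxxx is a continuation byte; 5 or more ones are not valid


cyrillic_lead = "ж".encode("utf-8")[0]  # 0xD0 == 0b11010000
latin_lead = "é".encode("utf-8")[0]     # 0xC3 == 0b11000011
```

For the Latin lead byte both counts agree, while for the Cyrillic lead byte the extra set bit after the `110` prefix inflates the plain bit count, so only the leading-ones count gives the true two-byte length.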

Reviewed by: christos
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42147

(cherry picked from commit 2fed1c579c52d63b72fc08ffcc652ba0183f9254)

Mateusz Guzik [Sun, 8 Oct 2023 13:54:11 +0000 (13:54 +0000)]
teken: fix up unused func warnings

Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 4b9aa38ef0e5bedcdd90b6627cc1c215037a1121)

Christos Margiolis [Sat, 7 Oct 2023 21:36:59 +0000 (00:36 +0300)]
teken: use __bitcount() instead of bitcount()

The use of bitcount() triggered a build error because it couldn't be
located. __bitcount() on the other hand is defined in sys/types.h, which
is included in teken/teken.h.

MFC after: 2 weeks

(cherry picked from commit 6d3296f16a06bcaa49918799e683936711dcf9c9)

Bojan Novković [Sat, 7 Oct 2023 18:00:11 +0000 (21:00 +0300)]
tty: fix improper backspace behaviour for UTF8 characters when in canonical mode

This patch adds additional logic in ttydisc_rubchar() to properly handle
backspace behaviour for UTF-8 characters.

Currently, typing a backspace after a UTF-8 character will delete only
one byte from the byte sequence, leaving garbled output in the tty's
output queue. With this change all of the character's bytes are deleted.
This change is only active when the IUTF8 flag is set (see
19054eb6053189144aa962b2ecc1bf5087758a3e "(s)tty: add support for IUTF8
input flag")

The code uses the teken_wcwidth() function to properly handle character
column widths for different code points, and adds the
teken_utf8_bytes_to_codepoint() function that converts a UTF-8 byte
sequence to a codepoint, as specified in RFC 3629.
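The byte-level effect of the change can be illustrated with a short sketch. It is not the ttydisc_rubchar() logic and ignores column widths; the helper `rub_last_char` is hypothetical and relies only on the fact that UTF-8 continuation bytes have the form 10xxxxxx.

```python
def rub_last_char(buf: bytearray) -> int:
    """Remove the last UTF-8 character from buf and return how many bytes were removed."""
    removed = 0
    # Strip trailing continuation bytes (10xxxxxx); a character has at most three.
    while buf and (buf[-1] & 0xC0) == 0x80 and removed < 3:
        buf.pop()
        removed += 1
    # Then strip the lead byte (or the single ASCII byte).
    if buf:
        buf.pop()
        removed += 1
    return removed


line = bytearray("aж".encode("utf-8"))  # b"a\xd0\xb6"
first = rub_last_char(line)             # removes both bytes of the Cyrillic letter
after_first = bytes(line)
second = rub_last_char(line)            # removes the ASCII letter
```

Removing a single byte instead would leave a dangling lead byte in the buffer, which is the garbled output described above.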

Reported by: christos
Reviewed by: christos, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42067

(cherry picked from commit 9e589b0938579f3f4d89fa5c051f845bf754184d)

Bojan Novković [Sat, 7 Oct 2023 17:59:57 +0000 (20:59 +0300)]
(s)tty: add support for IUTF8 input flag

This patch adds the necessary kernel and stty code to support setting
the IUTF8 flag for ttys. It is the first of two patches that fix
backspace behaviour for UTF-8 encoded characters when in canonical mode.

Reported by: christos
Reviewed by: christos, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42066

(cherry picked from commit 128f63cedc14ae21b35f74e11e2fe1a5659c58e8)

Mark Johnston [Mon, 16 Oct 2023 20:11:55 +0000 (16:11 -0400)]
ktrace: Handle uio_resid underflow via MSG_TRUNC

When recvmsg(2) is used with MSG_TRUNC on an atomic socket type (DGRAM
or SEQPACKET), soreceive_generic() and uipc_peek_dgram() may
intentionally underflow uio_resid so that userspace can find out how
many bytes it should have asked for.

If this happens, and KTR_GENIO is enabled, ktrgenio() will attempt to
copy in beyond the end of the output buffer's iovec.  In general this
will silently cause the ktrace operation to fail since it'll result in
EFAULT from uiomove().  Let's be more careful and make sure not to try
and copy more bytes than we have.
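The arithmetic behind the underflow can be sketched with plain integers. This is a simplified model rather than the ktrgenio() code; the function name `bytes_to_trace` and its parameters are assumptions for illustration, with `resid` standing in for uio_resid after the receive.

```python
def bytes_to_trace(buffer_len: int, resid: int) -> int:
    """How many bytes can safely be copied out of a receive buffer for tracing.

    buffer_len: size of the buffer the caller supplied
    resid:      leftover count after the receive; negative when a truncated
                datagram was longer than the buffer
    """
    reported = buffer_len - resid  # what the caller is told; may exceed buffer_len
    # Never copy more than the buffer actually holds, and never a negative amount.
    return max(0, min(reported, buffer_len))


# A 150-byte datagram received into a 100-byte buffer leaves resid at -50.
truncated = bytes_to_trace(100, -50)
# A 60-byte datagram received into a 100-byte buffer leaves resid at 40.
normal = bytes_to_trace(100, 40)
```

In the truncated case the reported length is 150, but only the 100 bytes that fit in the buffer are available to copy.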

Fixes: be1f485d7d6b ("sockets: add MSG_TRUNC flag handling for recvfrom()/recvmsg().")
Reported by: syzbot+30b4bb0c0bc0f53ac198@syzkaller.appspotmail.com
Reviewed by: kib
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42099

(cherry picked from commit 761ae1ce798add862d78728cc5ac5240ce7db779)

Konstantin Belousov [Wed, 11 Oct 2023 18:56:28 +0000 (21:56 +0300)]
arm64, riscv: warn about ignored kstack_pages for thread0

(cherry picked from commit 6aa641b71d0dd1b26674f0b6dba086410f643595)

Konstantin Belousov [Tue, 10 Oct 2023 00:02:06 +0000 (03:02 +0300)]
arm64: do not disable the kern.kstack_pages tunable on arm64

(cherry picked from commit 39cddbd7a07c182c4f121bea5a6effa36862fc63)

Konstantin Belousov [Mon, 9 Oct 2023 23:56:37 +0000 (02:56 +0300)]
arm64, riscv: Use KSTACK_PAGES for the thread0 kstack size designator

(cherry picked from commit ac63f7534d0102352bf993ebe2c748ce2ffd432e)

8 months ago arm64 locore.S: fix typos
Konstantin Belousov [Mon, 9 Oct 2023 23:55:45 +0000 (02:55 +0300)]
arm64 locore.S: fix typos

(cherry picked from commit 4095e0bcb9e8fac51eedad89211a5b16af7f55ad)

8 months ago Hyper-V: vmbus: check if signaling host is needed in vmbus_rxbr_read
Wei Hu [Fri, 20 Oct 2023 08:58:20 +0000 (08:58 +0000)]
Hyper-V: vmbus: check if signaling host is needed in vmbus_rxbr_read

It has been observed that netvsc's send rings can stall on the latest
Azure Boost platforms. This is because the vmbus_rxbr_read() routine
doesn't check whether the host is waiting for more room to put data,
which leaves the host side sleeping forever on this vmbus channel. The
problem was only observed on the latest platform because the host
requests that more buffer ring room be available, which makes
the issue much more likely to occur.

Fix this by adding a check to vmbus_rxbr_read() and signaling
the host in the callers if the check indicates that it is needed.

Reported by: NetApp
Tested by: whu
Sponsored by: Microsoft

(cherry picked from commit 49fa9a64372b087cfd66459a20f4ffd25464b6a3)

8 months ago arm64/compat32: Fix handling of 32bits FP registers.
Olivier Houchard [Mon, 16 Oct 2023 20:18:24 +0000 (22:18 +0200)]
arm64/compat32: Fix handling of 32bits FP registers.

We must treat the aarch32 FP registers as 16 128-bit registers, and store
them as the first 16 aarch64 FP registers.

PR: 267788

(cherry picked from commit ccd0f34d8585cba727dd17a381309855af655b82)
(cherry picked from commit 0e0a03c792542a2509702378559622efafc86548)

8 months ago vm_phys: Add corresponding sysctl knob for loader tunable
Zhenlei Huang [Thu, 12 Oct 2023 10:14:49 +0000 (18:14 +0800)]
vm_phys: Add corresponding sysctl knob for loader tunable

The loader tunable 'vm.numa.disabled' does not have a corresponding sysctl
MIB entry. Add one so that it can be retrieved and so that `sysctl -T`
reports it correctly.

Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42138

(cherry picked from commit c415cfc8be1b732a80f1ada6d52091e08eeb9ab5)
(cherry picked from commit e26b7e8d02f648623ad343016533487634a16698)

8 months ago vm_page: Add corresponding sysctl knob for loader tunable
Zhenlei Huang [Thu, 12 Oct 2023 10:14:49 +0000 (18:14 +0800)]
vm_page: Add corresponding sysctl knob for loader tunable

The loader tunable 'vm.pgcache_zone_max_pcpu' does not have a corresponding
sysctl MIB entry. Add one so that it can be retrieved and so that
`sysctl -T` reports it correctly.

Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42138

(cherry picked from commit a55fbda874db31b804490567c69502c891b6ff61)
(cherry picked from commit cb5bc8a748dfefc68e3905e8fdf17e0484844383)

8 months ago kasan: Add corresponding sysctl knob for loader tunable
Zhenlei Huang [Thu, 12 Oct 2023 10:14:48 +0000 (18:14 +0800)]
kasan: Add corresponding sysctl knob for loader tunable

The loader tunable 'debug.kasan.disabled' does not have a corresponding
sysctl MIB entry. Add one so that it can be retrieved and so that
`sysctl -T` reports it correctly.

Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42138

(cherry picked from commit db5d0bc868be669ed6588ebeccf8c02e76aabc41)
(cherry picked from commit 6f8ef4d6e44ee27a08c14ab6a892ffccf332bcf7)

8 months ago uma.h: Fix a typo in a source code comment
Gordon Bergling [Sun, 15 Oct 2023 12:09:21 +0000 (14:09 +0200)]
uma.h: Fix a typo in a source code comment

- s/setable/settable/

(cherry picked from commit fc9f1d2c6391b1a4b133aab56ace625b72c9ea85)

8 months ago MFC: Remove confDH_PARAMETERS settings in favor of using sendmail's
Gregory Neil Shapiro [Fri, 18 Aug 2023 00:32:56 +0000 (00:32 +0000)]
MFC: Remove confDH_PARAMETERS settings in favor of using sendmail's
built-in default which was added in sendmail 8.15.2 (the config
line predates that 8.15.2 feature).  This also alleviates the need
for admins to create the DH parameters file if they opt to use
Diffie-Hellman.

PR: 248387

(cherry picked from commit 98fd1add676321978db72d77d34ef51ca454c814)

9 months ago ptsname.3: accommodate upcoming POSIX Issue 8 ptsname_r
Ed Maste [Fri, 13 Oct 2023 20:25:53 +0000 (16:25 -0400)]
ptsname.3: accommodate upcoming POSIX Issue 8 ptsname_r

POSIX has accepted a proposal[1] to add glibc-compatible ptsname_r.  It
indicates an error by returning the error number, rather than returning
-1 and setting errno.  Update RETURN VALUES in ptsname_r's man page now
to encourage folks to test that the return value != 0 rather than == -1.
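
A minimal sketch of the recommended check follows.  The wrapper name
pty_name() is hypothetical, and the _GNU_SOURCE define is an assumption
that only matters when building against glibc.  Testing for a non-zero
result works whether the implementation returns -1 and sets errno or
returns the error number directly.

```c
#define _GNU_SOURCE	/* assumption: needed to expose ptsname_r() on glibc */
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: returns 0 on success and 1 on any failure.
 * The comparison is against 0, not -1, so both return conventions
 * are treated as errors.
 */
int
pty_name(int fd, char *buf, size_t len)
{
	if (ptsname_r(fd, buf, len) != 0)
		return (1);
	return (0);
}
```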

[1] https://www.austingroupbugs.net/bug_view_page.php?bug_id=508

Reported by: Collin Funk
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42204

(cherry picked from commit a5ed6a815e38d6c622cd97a6020592ded579cf7a)

9 months ago sctp: Various fixes for loader tunables
Zhenlei Huang [Mon, 9 Oct 2023 04:36:48 +0000 (12:36 +0800)]
sctp: Various fixes for loader tunables

The following sysctl variables are actually loader tunables. Add sysctl
flag CTLFLAG_TUN to them so that `sysctl -T` will report them correctly.

 1. net.inet.sctp.tcbhashsize
 2. net.inet.sctp.pcbhashsize
 3. net.inet.sctp.chunkscale

The loader tunables 'net.inet.sctp.tcbhashsize' and 'net.inet.sctp.chunkscale'
are only used during vnet initialization, so it makes no sense to make them
writable tunables.

Validate the values of the loader tunables on vnet initialization and reset
them to their defaults if invalid, to prevent potential kernel panics.

Reviewed by: tuexen, #transport, #network
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42007

(cherry picked from commit dac91eb7660324677d8a2f71bd6f192422355ba1)
(cherry picked from commit fd9de12a71109d1e3bb4b20e7d040fc9a1784dc2)

9 months ago tcp: Simplify the initialization of loader tunable 'net.inet.tcp.tcbhashsize'
Zhenlei Huang [Sun, 8 Oct 2023 10:03:59 +0000 (18:03 +0800)]
tcp: Simplify the initialization of loader tunable 'net.inet.tcp.tcbhashsize'

No functional change intended.

Reviewed by: cc, rscheff, #transport
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41998

(cherry picked from commit 38ecc80b2a4e5e11ece83ca4df63632f0b6fa394)
(cherry picked from commit 3a97686fc11ae51ceb4004c07702a8a20f71410d)

9 months ago vkbd: correct ref count on cloned cdevs
Konstantin Belousov [Mon, 2 Oct 2023 22:25:52 +0000 (01:25 +0300)]
vkbd: correct ref count on cloned cdevs

(cherry picked from commit 6e92fc930943a85f311e986a02e2b3dae9e37126)

9 months ago tun/tap: correct ref count on cloned cdevs
Konstantin Belousov [Thu, 21 Sep 2023 10:47:14 +0000 (13:47 +0300)]
tun/tap: correct ref count on cloned cdevs

PR: 273418

(cherry picked from commit 27f1ec0be24b45559793e486a4fa5a2e7fdadc17)

9 months ago unbound: Import upstream 0ee44ef3 when ENOBUFS is returned
Cy Schubert [Fri, 13 Oct 2023 00:04:25 +0000 (17:04 -0700)]
unbound: Import upstream 0ee44ef3 when ENOBUFS is returned

From upstream 0ee44ef3:

- Fix send of udp retries when ENOBUFS is returned. It stops looping
  and also waits for the condition to go away. Reported by Florian
  Obser.

PR: 274352

Merge commit '292d51198aa319c58f534549851e9c28486abdf4'

(cherry picked from commit 6e71235e558ef579605e7f35b02f983b9a246a4a)

9 months ago fusefs: sanitize FUSE_READLINK results for embedded NULs
Alan Somers [Wed, 4 Oct 2023 18:48:01 +0000 (12:48 -0600)]
fusefs: sanitize FUSE_READLINK results for embedded NULs

If VOP_READLINK returns a path that contains a NUL, it will trigger an
assertion in vfs_lookup.  Sanitize such paths in fusefs, rejecting any
and warning the user about the misbehaving server.
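
The check itself can be sketched in a few lines of C.  This is an
illustration only: link_target_ok() is a hypothetical name, and the
sketch assumes the server's reply is available as a buffer with an
explicit length.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical check: a link target of `len` bytes is acceptable only
 * if none of those bytes is a NUL.
 */
bool
link_target_ok(const char *target, size_t len)
{
	return (memchr(target, '\0', len) == NULL);
}
```

A caller would reject the reply and log a warning when the function
returns false.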

PR: 274268
Sponsored by: Axcient
Reviewed by: mjg, markj
Differential Revision: https://reviews.freebsd.org/D42081

(cherry picked from commit 662ec2f781521c36b76af748d74bb0a3c2e27a76)

9 months ago mrsas: Fix callout locking in mrsas_complete_cmd()
Mark Johnston [Sat, 7 Oct 2023 00:31:03 +0000 (20:31 -0400)]
mrsas: Fix callout locking in mrsas_complete_cmd()

callout_stop() requires the associated lock to be held.

This is a bit hacky, but I believe it's safe since the subsequent
mrsas_cmd_done() call will also acquire the SIM lock to stop a different
callout.

PR: 265484
Reviewed by: imp
Tested by: Jérémie Jourdin <jeremie.jourdin@advens.fr>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39559

(cherry picked from commit 4640df1b0a49697840b81f6bcd269a483514c6aa)

9 months ago vfs cache: s/vfs.cache_fast_lookup/vfs.cache.param.fast_lookup
Mateusz Guzik [Tue, 3 Oct 2023 13:34:32 +0000 (13:34 +0000)]
vfs cache: s/vfs.cache_fast_lookup/vfs.cache.param.fast_lookup

(cherry picked from commit 38a375c472d295df41adf73c5ddd50543f9d877c)

9 months ago vfs: convert recycles_count and recycles_free_count to mere u_long
Mateusz Guzik [Thu, 12 Oct 2023 06:57:59 +0000 (06:57 +0000)]
vfs: convert recycles_count and recycles_free_count to mere u_long

Only vnlru ever updates them.

This also removes recycles_count updates from hand-rolled debug vnode
recycling via sysctl.

Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 37544d9768110fd67527db7f2a3f7bb6fc977582)

9 months ago vfs: count recycles by vnlru and by vn_alloc separately
Mateusz Guzik [Thu, 12 Oct 2023 06:47:45 +0000 (06:47 +0000)]
vfs: count recycles by vnlru and by vn_alloc separately

Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit a92fc3122d2becfbf5a627af6eda5cedfac57c31)

9 months ago vfs: count calls to uma_reclaim in vnlru
Mateusz Guzik [Wed, 11 Oct 2023 22:48:03 +0000 (22:48 +0000)]
vfs: count calls to uma_reclaim in vnlru

(cherry picked from commit bb679b0c49094757f2aef3d8fe46c41dc8192fea)

9 months ago vfs: add max_vnlru_free to the vfs.vnode.vnlru tree
Mateusz Guzik [Wed, 11 Oct 2023 13:05:43 +0000 (13:05 +0000)]
vfs: add max_vnlru_free to the vfs.vnode.vnlru tree

While here rename the var internally.

(cherry picked from commit 281a9715b582861fe4097c2f27eb27b208d752b1)

9 months ago vfs: further speed up continuous free vnode recycle
Mateusz Guzik [Wed, 11 Oct 2023 09:42:12 +0000 (09:42 +0000)]
vfs: further speed up continuous free vnode recycle

The primary bottleneck *was* vnode_list mtx, which got artificially
worsened due to the following work done with the lock held:
1. the global heavily modified numvnodes counter was being read,
   inducing massive cache line ping pong
2. should the value fit limits (which it normally did) there would be an
   avoidable write to vn_alloc_cyclecount, which is being read outside
   of the lock, once more inducing traffic

But if vn_alloc_cyclecount is 0, which it normally is even when facing
vnode shortage, there is no need to check numvnodes nor set it to 0 again.

Another problem was numvnodes adjustment (which made the locked read
much worse). While it fundamentally does not scale as it is not
distributed in any fashion, it was avoidably slow. When bumping over the
vnode limit, it would be modified with atomics 3 times: inc + dec to
backpedal in vn_alloc, then final inc in vn_alloc_hard.

One can let some slop persist over calls to vnlru_free instead.

In principle each thread in the system could get here and bump it, so a
limit is put in place to keep things sane.

Bench setup same as in prior commits: zfs, 20 separate directory trees
each with 1 million files in total and 20 find(1) processes stating them
in parallel (one per each tree).

Total run time (in seconds) goes down as follows:
vnode limit    8388608    400000
before         ~20        ~35
after          ~8         ~15

With this in place the primary bottleneck is now ZFS.

Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 054f45e026d898bdc8f974d33dd748937dee1d6b)

9 months ago vfs: don't recycle transiently excess vnodes
Mateusz Guzik [Wed, 11 Oct 2023 06:39:48 +0000 (06:39 +0000)]
vfs: don't recycle transiently excess vnodes

Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit a4f753e812d8913e9be481c6dfa1574c7f032a56)

9 months ago vfs: prefix regular vnlru with a special case for free vnodes
Mateusz Guzik [Thu, 14 Sep 2023 19:08:40 +0000 (19:08 +0000)]
vfs: prefix regular vnlru with a special case for free vnodes

Works around severe performance problems in certain corner cases, see
the commentary added.

Modifying vnlru logic has proven rather error prone in the past and a
release is near, thus take the easy way out and fix it without having to
dig into the current machinery.

(cherry picked from commit 90a008e94bb205e5b8f3c41d57e155b59a6be95d)

9 months ago vfs: consult freevnodes in vnlru_kick_cond
Mateusz Guzik [Tue, 10 Oct 2023 16:19:53 +0000 (16:19 +0000)]
vfs: consult freevnodes in vnlru_kick_cond

If the count is high enough there is no point trying to produce more.
Not going there reduces traffic on the vnode_list mtx.

This further shaves total real time in a test mentioned in:
74be676d87745eb7 ("vfs: drop one vnode list lock trip during vnlru free
recycle") -- 20 instances of find each creating 1 million vnodes, while
total limit is set to 400k.

Time goes down from ~41 to ~35 seconds.

Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 23ef25d25d989e7213bc1d3ef32b0f48a9eb2537)

9 months ago vfs: be less eager to call uma_reclaim(UMA_RECLAIM_DRAIN)
Mateusz Guzik [Tue, 10 Oct 2023 16:15:53 +0000 (16:15 +0000)]
vfs: be less eager to call uma_reclaim(UMA_RECLAIM_DRAIN)

In the face of a vnode shortage the count can very easily go a few units
above the limit before going back down.

Calling uma_reclaim results in a massive amount of work, which in this case
is not warranted.

Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 1bf55a739e754765fa2dc15ab6481fe411084be3)

9 months ago vfs: don't provoke recycling non-free vnodes without a good reason
Mateusz Guzik [Thu, 14 Sep 2023 16:13:01 +0000 (16:13 +0000)]
vfs: don't provoke recycling non-free vnodes without a good reason

If the total number of free vnodes is at or above target, there is no
point creating more of them.

Tested by: pho (in a bigger patch)

(cherry picked from commit 8733bc277a383cf59f38a83956f4f523869cfc90)

9 months ago vfs cache: denote a known bug in cache_remove_cnp
Mateusz Guzik [Thu, 5 Oct 2023 12:32:29 +0000 (12:32 +0000)]
vfs cache: denote a known bug in cache_remove_cnp

(cherry picked from commit cd2105d691f446f7dbddf5965d82b9e9103bc8d2)

9 months ago vfs cache: plug a hypothetical corner case when freeing
Mateusz Guzik [Sat, 23 Sep 2023 02:04:06 +0000 (02:04 +0000)]
vfs cache: plug a hypothetical corner case when freeing

cache_zap_unlocked_bucket is called with a bunch of addresses and
without any locks held, forcing it to revalidate everything from
scratch.

It did not account for a case where the entry is reallocated with
everything the same except for the target vnode.

Should the target use a different lock than the one expected, freeing
would proceed without being properly synchronized.

Note this is almost impossible to hit in practice.

(cherry picked from commit 0f15054f7990f9c772bea34778a8838aa05ebed8)

9 months ago vfs cache: sanitize debug counters
Mateusz Guzik [Thu, 5 Oct 2023 12:16:18 +0000 (12:16 +0000)]
vfs cache: sanitize debug counters

They are very rarely triggered, so no need for per-cpu distribution.

At the same time the non-cpu ones still should use atomics to not lose
any updates.
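
As a userspace analogue of a single shared counter that is updated
atomically, consider the following C11 sketch.  It uses <stdatomic.h>
rather than the kernel's own atomic primitives, and the names
debug_counter, debug_counter_inc() and debug_counter_read() are
hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical shared debug counter; not a name from the kernel source. */
static atomic_ulong debug_counter;

/* Bump the counter; the atomic read-modify-write cannot lose an update. */
void
debug_counter_inc(void)
{
	atomic_fetch_add_explicit(&debug_counter, 1, memory_order_relaxed);
}

/* Read the current value of the counter. */
unsigned long
debug_counter_read(void)
{
	return (atomic_load_explicit(&debug_counter, memory_order_relaxed));
}
```

Relaxed ordering is enough here because the counter is only a
statistic and does not order any other memory accesses.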

(cherry picked from commit 2749c222da8a6325d39c0571f72b1dbed2f7d583)