CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

exception: Fix typos

Signed-off-by: Elyes Haouas <ehaouas@noos.fr>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/885

minidump_machdep: Fix typo

Signed-off-by: Elyes Haouas <ehaouas@noos.fr>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/885

pmap: Fix typos

Signed-off-by: Elyes Haouas <ehaouas@noos.fr>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/885

vmm: Fix typo

Signed-off-by: Elyes Haouas <ehaouas@noos.fr>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/885

atomic: Fix typo

Signed-off-by: Elyes Haouas <ehaouas@noos.fr>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/885

msan: Fix typo

Signed-off-by: Elyes Haouas <ehaouas@noos.fr>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/885

release/Makefile.vm: Support read-only ports tree

Build qemu (if needed) with WRKDIRPREFIX=/tmp/ports DISTDIR=/tmp/distfiles
so that we can have a read-only /usr/ports and don't contaminate it. This
became an issue when I enabled parallel release building, since one image
might be creating its ports.txz file at the same time as we're building
qemu as a prerequisite for building another image.

MFC after: 5 days

Revert "cloudware: allow disk format to be a list"

This reverts commit 6ec9aaf63c81a68881cb6312f777349a0ac82ad5.

Requested by: cperciva

arm64/vmm: Define a dummy _start symbol in vmm_hyp_blob.elf

To silence a linker warning about _start being missing. This blob
contains code executed at EL2 and is only meant to be entered via
exception handlers.

Reviewed by: bz, emaste
Fixes: 47e073941f4e ("Import the kernel parts of bhyve/arm64")
Differential Revision: https://reviews.freebsd.org/D44735

cloudware: allow disk format to be a list

Make basic-cloudinit available both in qcow2 and raw formats

MFC After: 1 week
Reviewed by: Allanjude
Sponsored by: OVHCloud
Differential Revision: https://reviews.freebsd.org/D44747

scmi: Add an SCMI VirtIO transport driver

Add an SCMI transport driver based on the virtio-scmi backend.

Reviewed by: andrew, bryanv
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43048

vtscmi: Add a virtio-scmi driver

Add a new virtio backend to support SCMI VirtIO devices (type 32) as
defined by the VirtIO specification since version v1.2.

https://docs.oasis-open.org/virtio/virtio/v1.2/cs01/virtio-v1.2-cs01.pdf

Reviewed by: andrew, bryanv
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43047

scmi: Introduce a new SCMI API and port CLK SCMI driver to it

Expose new scmi_buf_get/put API methods to build and send messages;
command request descriptors are now pre-allocated when the SCMI core is
initialized and kept in a free list, instead of being allocated on the
stack of the caller of the SCMI request.

Dynamically allocated descriptors enable the SCMI core to keep around
and track outstanding transactions for as long as needed, outliving the
lifetime of the caller stack: this allows tracking of late or missing
replies and will be needed when adding support for SCMI transports
that allow more messages to be in flight concurrently.

Move the existing CLK SCMI driver to the new API.

Reviewed by: andrew
Tested on: Arm Morello Board
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43046
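
A minimal sketch of the pre-allocated free list idea described in this
commit. The names req_pool, pool_init, pool_get and pool_put, the pool
size and the payload size are all hypothetical; they are not the actual
scmi_buf_get/put signatures. Locking is omitted, which a real kernel
implementation would need.

```c
#include <stddef.h>

#define POOL_SIZE 4			/* hypothetical descriptor count */

struct req {
	struct req *next;		/* link in the free list */
	unsigned char payload[64];	/* hypothetical message buffer */
};

struct req_pool {
	struct req slots[POOL_SIZE];	/* storage allocated once, up front */
	struct req *free_head;		/* head of the free list */
};

/* Chain every descriptor into the free list once, at init time. */
static void
pool_init(struct req_pool *p)
{
	p->free_head = NULL;
	for (size_t i = 0; i < POOL_SIZE; i++) {
		p->slots[i].next = p->free_head;
		p->free_head = &p->slots[i];
	}
}

/* Take a descriptor from the free list; NULL if all are in use. */
static struct req *
pool_get(struct req_pool *p)
{
	struct req *r = p->free_head;

	if (r != NULL)
		p->free_head = r->next;
	return (r);
}

/* Return a descriptor to the free list once the transaction is over. */
static void
pool_put(struct req_pool *p, struct req *r)
{
	r->next = p->free_head;
	p->free_head = r;
}
```

Because a descriptor lives in the pool rather than on the caller's
stack, it can stay allocated after the caller has returned, for example
while waiting for a late reply.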

scmi: Add SCMI message tracking and centralize tx/rx logic

To support new, more parallel SCMI transports that by nature allow
multiple concurrent commands to be in flight, pending a reply, we must
be able to use the sequence number provided in the SCMI messages to
track the message status, matching commands and replies while keeping
track of timeouts and duplicates.

Add the needed message tracking machinery in the core SCMI stack and
move the residual common tx/rx logic from the specific transports to
the core SCMI stack, while adding one more interface to let the
transports customize their behaviour.

Reviewed by: andrew
Tested on: Arm Morello Board
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43045
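
A rough sketch of sequence-number based tracking as described in this
commit, assuming a small fixed table indexed by sequence number. The
names tracker, tracker_send, tracker_reply and tracker_release, the
table size and the result codes are hypothetical and do not reflect the
real SCMI core.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_INFLIGHT 8		/* hypothetical number of sequence numbers */

enum slot_state { SLOT_FREE = 0, SLOT_PENDING, SLOT_REPLIED };
enum track_result { TRACK_OK, TRACK_DUPLICATE, TRACK_UNEXPECTED };

struct tracker {
	enum slot_state state[MAX_INFLIGHT];	/* indexed by sequence number */
};

/* Record that a command with this sequence number has been sent. */
static bool
tracker_send(struct tracker *t, uint32_t seq)
{
	if (seq >= MAX_INFLIGHT || t->state[seq] != SLOT_FREE)
		return (false);
	t->state[seq] = SLOT_PENDING;
	return (true);
}

/* Match an incoming reply against the outstanding commands. */
static enum track_result
tracker_reply(struct tracker *t, uint32_t seq)
{
	if (seq >= MAX_INFLIGHT || t->state[seq] == SLOT_FREE)
		return (TRACK_UNEXPECTED);	/* never sent, or timed out */
	if (t->state[seq] == SLOT_REPLIED)
		return (TRACK_DUPLICATE);	/* second reply, same command */
	t->state[seq] = SLOT_REPLIED;
	return (TRACK_OK);
}

/* Free the slot after the reply is consumed or the command times out. */
static void
tracker_release(struct tracker *t, uint32_t seq)
{
	if (seq < MAX_INFLIGHT)
		t->state[seq] = SLOT_FREE;
}
```

A reply that arrives after tracker_release() was called on timeout is
reported as unexpected, which is how a late reply would show up in this
simplified model.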

scmi: Add new SCMI interfaces for init and message processing

Introduce a couple of new SCMI interface methods to allow centralized
initialization of transport-specific features and a couple of methods
to handle message reception from the SCMI core.

Move SCMI SMT related calls out of the core common SCMI code into the
transport specific layers Mailbox/SMC.

Make SCMI Mailbox/SMC transports use the new interface methods for
initialization and message reception.

Reviewed by: andrew
Tested on: Arm Morello Board
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43044

scmi: Protect SCMI/SMT channels from concurrent transmissions

The SCMI/SMT memory areas are used by the agent and the platform as
channels to exchange commands and replies.

Once the platform has completed its processing and a reply is ready to
be read from the agent, the platform will relinquish the channel to the
agent by setting the CHANNEL_FREE bits in the related SMT area.

When this happens, though, the agent still has to read back the reply
message, and any other concurrent request issued in the meantime must
be held back until the reply is processed; otherwise the reply risks
being overwritten by the new request.

The base->mtx lock that currently guards the whole scmi_request()
operation is released when sleeping waiting for a reply, so the above
mentioned race can still happen or, in a slightly different scenario,
the concurrent transmission could just fail, finding the channel busy,
after having sneaked through the mutex.

Adding a new mechanism to let the agent explicitly acquire/release the
channel paves the way, in the future, to remove such central common
lock in favour of new dedicated per-transport locking mechanisms, since
not all transports will necessarily need the same level of protection.

Add a flag, controlled by the agent, to mark when the channel has an
inflight command transaction still pending to be completed and make the
agent spin on it when queueing multiple concurrent messages on the same
SMT channel.

Reviewed by: andrew
Tested on: Arm Morello Board
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43043
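
The agent-controlled inflight flag described in this commit can be
sketched with C11 atomics. This is only an illustration under
assumptions: the names channel, chan_try_acquire, chan_acquire and
chan_release are hypothetical, and the real driver uses kernel
primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct channel {
	atomic_flag inflight;	/* set while a transaction is pending */
};

/* Try to claim the channel; true if we now own it. */
static bool
chan_try_acquire(struct channel *c)
{
	return (!atomic_flag_test_and_set_explicit(&c->inflight,
	    memory_order_acquire));
}

/* Spin until the previous transaction has been fully processed. */
static void
chan_acquire(struct channel *c)
{
	while (!chan_try_acquire(c))
		;	/* busy-wait; a real driver might pause or yield */
}

/* Called only after the reply has been read back from the channel. */
static void
chan_release(struct channel *c)
{
	atomic_flag_clear_explicit(&c->inflight, memory_order_release);
}
```

The key point is that the flag is cleared by the agent after it has
consumed the reply, not when the platform marks the channel free.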

scmi: Fix SCMI mailbox polling mechanism

When the system is cold, the SCMI stack processes commands in polling
mode with the current polling mechanism being a check of the status
register in the mailbox controller to see if there is any pending
doorbell request.

However, the completion interrupt is optional per the SCMI specification
and a system could simply have been designed without it: for this
reason polling on the mailbox controller status registers is not going
to work in all situations.

Moreover, alternative SCMI transports based on shared memory, like
SMC, will not have a mailbox controller to poll at all.

On the other hand, the associated SCMI Shared Memory Transport defines
dedicated channel flags and status bits that can be used by the agent to
explicitly request a polling-based transaction, even if the completion
interrupt was available, and to check afterwards when the platform has
completed its processing on the outstanding command.

Use SCMI/SMT specific mechanism to process transactions in polling mode.

Reviewed by: andrew
Tested on: Arm Morello Board
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43042

scmi: Extend and refactor SCMI shmem support

Add a few new common public scmi_shmem methods to be used to handle SCMI
shared memory areas from multiple transports; while doing that, revise
the shared memory accesses to read only the SMT header fields strictly
relevant to the SCMI message processing.

Move all the SCMI shmem related code to the existing scmi_shmem.c file
and add a new dedicated scmi_shmem.h header.

Introduce some commonly needed message header manipulation macros.

Reviewed by: andrew
Tested on: Arm Morello Board
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43041

scmi: Add an SCMI SMC transport driver

Using the SCMI transport interface add a new SMC transport to the
SCMI stack.

Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43040

scmi: Split out the SCMI mailbox to a new file

Add a new SCMI interface file to allow for multiple kinds of transports
and move the mailbox transport to its own file, using the new interface.

Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43039

scmi: Implement scmi_clknode_recalc_freq method

Allow the SCMI clock frequency to be queried back, useful for testing
the IRQ path via sysctl access.

Reviewed by: andrew
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43038

macio: Set resource map size

PR: 278278
Fixes: af081ec6f7cf ("powerpc: powermac: Use bus_generic_rman_*")

libllvm: add missed tblgen headers and sources for BPF target

Noticed by: vishwin
PR: 276104
MFC after: 1 month

ahc/ahd: Fix target mode operation

After 5e63cdb457f9 the drivers didn't clear CAM_DIS_DISCONNECT in
ah*_handle_target_cmd() when needed, only set it.

Reported/tested by: HP van Braam <hp@tmm.cx>
MFC after: 1 week

Revert "sendfile: mark it explicitly as a TCP only feature"

This reverts commit 3b7aa842e27dcf07181f161b1abde0067ed51e97.

vm: improve kstack_object pindex calculation to avoid pindex holes

This commit replaces the linear transformation of kernel virtual
addresses to kstack_object pindex values with a non-linear
scheme that circumvents physical memory fragmentation caused by
kernel stack guard pages. The new mapping scheme is used to
effectively "skip" guard pages and assign pindices for
non-guard pages in a contiguous fashion.

The new allocation scheme requires that all default-sized kstack KVAs
come from a separate, specially aligned region of the KVA space.
For this to work, this commit introduces a dedicated per-domain
kstack KVA arena used to allocate kernel stacks of default size.
The behaviour on 32-bit platforms remains unchanged due to a
significantly smaller KVA space.

Aside from fulfilling the requirements imposed by the new scheme, a
separate kstack KVA arena facilitates superpage promotion in the rest
of kernel and causes most kstacks to have guard pages at both ends.

Reviewed by:  alc, kib, markj
Tested by:    markj
Approved by:  markj (mentor)
Differential Revision: https://reviews.freebsd.org/D38852
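
To illustrate the idea of a non-linear mapping that skips guard pages,
here is a small sketch. It assumes a region laid out as repeating
blocks of guard pages followed by stack pages, starting at page index
0; that layout, the constants and the function name kstack_pindex are
assumptions made for this example, not the code from the commit.

```c
#include <stdint.h>

#define KSTACK_PAGES		4	/* hypothetical stack size in pages */
#define KSTACK_GUARD_PAGES	1	/* hypothetical guard pages per stack */

/*
 * Map the index of a stack (non-guard) page within the region to an
 * object pindex, so that pindices of consecutive stacks are contiguous
 * and no pindex is ever assigned to a guard page.
 */
static uint64_t
kstack_pindex(uint64_t page_idx)
{
	uint64_t block = page_idx / (KSTACK_PAGES + KSTACK_GUARD_PAGES);

	/* Subtract the guard pages of this block and all earlier ones. */
	return (page_idx - (block + 1) * KSTACK_GUARD_PAGES);
}
```

With these constants, pages 1 to 4 (the first stack) map to pindices 0
to 3 and pages 6 to 9 (the second stack) map to pindices 4 to 7, whereas
a linear mapping would leave holes at the guard page positions.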

bhyve: Implement a PL031 RTC on arm64

Unlike amd64's, this RTC is implemented entirely in userspace. This is
the same RTC as is provided by QEMU's virt machine.

Reviewed by: jhb
MFC after: 2 weeks
Obtained from: CheriBSD

bhyve: Extract uart-clock from fdt_add_uart as an apb-pclk

This clock will also be used by the PL031 RTC (rather than defining
redundant per-device clocks).

Reviewed by: jhb
MFC after: 2 weeks
Obtained from: CheriBSD

bhyve: Extend mevent to support updating timers

This will be used by a new PL031 implementation to provide an RTC for
arm64 guests.

Reviewed by: jhb
MFC after: 2 weeks
Obtained from: CheriBSD

bhyve: Fix arm64 PCI I/O range to match FDT

This is supposed to combine with the memory range to make one contiguous
block, as is laid out in the FDT, so make this match what the OS is told
and thus actually configures.

Also drop the confusing leading zero from all three of these constants
that is making these 9 rather than 8 hex digits long (as one would
expect for a 32-bit address).

Reviewed by: jhb
MFC after: 2 weeks
Obtained from: CheriBSD

bhyve: Support legacy PCI interrupts on arm64

This allows us to remove various #ifdef hacks and enable building more
PCI devices.

Note that a hole is left in the interrupt mapping for the RTC rather
than having the two core devices straddle the PCIe interrupts. QEMU's
virt machine also takes this approach.

Reviewed by: jhb
MFC after: 2 weeks
Obtained from: CheriBSD

src.conf.5: Regenerate

libvmmapi: Conditionalize compilation of some functions

Hide definitions of several functions that currently don't have
implementations in the arm64 vmm port. In particular, add a
WITH_VMMAPI_SNAPSHOT preprocessor variable that can be used to enable
compilation of save/restore functions, and conditionalize compilation of
some functions only used by amd64 bhyve. If in the long term they
remain amd64-only, they can move to vmmapi_machdep.c, but for now it's
not clear to me that that's the right thing to do.

MFC after: 2 weeks
Sponsored by: Innovate UK

bhyve: Push option parsing down into bhyverun_machdep.c

After a couple of attempts I think this is the cleanest approach despite
the expense of some code duplication. Quite a few of the single-letter
bhyve options are x86-specific.

I think that going forward we should strongly discourage the addition of
new options and instead configure guests using the more general
configuration file syntax.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41753

arm64: Connect bhyve and libvmmapi to the build

Reviewed by: corvink, andrew, jhb, emaste
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41742

bhyve: Partially disable INT#x support in virtio for arm64

A FreeBSD guest won't make use of this support and pci_lintr_* is not
implemented on arm64. Simply make pci_lintr_*() calls amd64-specific
for now.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41741

bhyve: Use vm_raise_msi() instead of vm_lapic_msi()

No functional change intended.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41740

bhyve: Add PCI mappings for arm64

- The extended config space and BAR ranges are listed in the FDT.
- Avoid referencing I/O ports in ACPI tables. Currently the arm64 port
does not support ACPI in any case.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41739

bhyve: Do not compile PCI passthrough support on arm64

Some required kernel functionality is not yet implemented.

For now this means that one cannot specify host PCI register values, but
that functionality is only used by amd64-specific device models for now.
Note that this limitation is rather artificial; it arises only because
pci_host_read_config() lives in pci_passthru.c.

Reviewed by: corvink, andrew, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41738

bhyve: Add bhyverun and vmexit handlers for arm64

Reviewed by: corvink, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D41006

libvmmapi: Zero out the structure passed to VM_GET_MEMSEG

Avoid assuming that the kernel zeros the name buffer, it does not do
this for zero-length segments.

MFC after: 2 weeks
Sponsored by: Innovate UK
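
A sketch of why zeroing the structure first matters. The struct layout
and the fake_get_memseg() stand-in for the kernel are hypothetical;
they only model the behaviour described above, where the name buffer is
left untouched for zero-length segments.

```c
#include <string.h>

struct memseg {
	int	segid;
	size_t	len;
	char	name[16];	/* hypothetical size */
};

/* Stand-in for the ioctl: does not write name for empty segments. */
static void
fake_get_memseg(struct memseg *ms, size_t len)
{
	ms->len = len;
	if (len != 0)
		strcpy(ms->name, "sysmem");
}

/* Caller side: clear the whole structure before handing it over. */
static void
get_memseg(struct memseg *ms, int segid, size_t len_in_kernel)
{
	memset(ms, 0, sizeof(*ms));
	ms->segid = segid;
	fake_get_memseg(ms, len_in_kernel);
}
```

Without the memset(), a caller reading name after querying an empty
segment would see whatever happened to be in the buffer beforehand.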

libvmmapi: Make vm_raise_msi() a common function

Currently, bhyve PCI emulation uses vm_lapic_msi() to raise an MSI in
the guest. The arm64 port has a similar function, vm_raise_msi().
Add vm_raise_msi() on amd64 as well and have it simply call
vm_lapic_msi() so that bhyve can use a common, generically named
function.

Reviewed by: corvink, andrew, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41752

libvmmapi: Add arm64 support

- Define wrappers for some MD ioctls.
- Provide a list of vmm device ioctls for cap_ioctl_limit().
- Disable use of the lowmem region.

Reviewed by: corvink
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41005

libvmmapi: Make memory segment handling a bit more abstract

libvmmapi leaves a hole at [3GB, 4GB) in the guest physical address
space.  This hole is not used in the arm64 port, which maps everything
above 4GB.  This change makes the code a bit more general to accommodate
arm64 more naturally.  In particular:

- Remove vm_set_lowmem_limit(): it is unused and doesn't have
  well-defined constraints, e.g., nothing prevents a consumer from
  setting a lowmem limit above the highmem base.
- Define a constant for the highmem base and use that everywhere that
  the base is currently hard-coded.
- Make the lowmem limit a compile-time constant instead of a vmctx field.
- Store segment info in an array.
- Add vm_get_highmem_base(), for use in bhyve since the current value is
  hard-coded in some places.

No functional change intended.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41004
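
The layout described in this commit (a lowmem limit of 3GB, a highmem
base of 4GB, and segment info kept in an array) can be sketched as
follows. The names seg, layout_memory and the SEG_* indices are
hypothetical and are not the libvmmapi API.

```c
#include <stdint.h>

#define GB		(1024ULL * 1024 * 1024)
#define LOWMEM_LIMIT	(3 * GB)	/* compile-time constant */
#define HIGHMEM_BASE	(4 * GB)	/* single definition of the base */

enum { SEG_LOWMEM, SEG_HIGHMEM, SEG_COUNT };

struct seg {
	uint64_t base;
	uint64_t size;
};

/* Split guest memory around the [3GB, 4GB) hole. */
static void
layout_memory(uint64_t memsize, struct seg segs[SEG_COUNT])
{
	uint64_t low = memsize < LOWMEM_LIMIT ? memsize : LOWMEM_LIMIT;

	segs[SEG_LOWMEM].base = 0;
	segs[SEG_LOWMEM].size = low;
	segs[SEG_HIGHMEM].base = HIGHMEM_BASE;
	segs[SEG_HIGHMEM].size = memsize - low;	/* 0 if it all fits low */
}
```

A port that maps everything above 4GB would, in this model, simply use
a lowmem limit of zero so that all memory lands in the highmem segment.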

libvmmapi: Move PCI passthrough ioctl wrappers into a separate file

The arm64 port doesn't implement PCI passthrough and in particular
doesn't define the ioctls used by these wrappers. It might be that the
ppt ioctl interface will require modification to support arm64. Until
that's sorted out one way or another, put this code in a separate file
so that it's easy to conditionally compile.

No functional change intended.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41003

libvmmapi: Move more amd64-specific ioctl wrappers to vmmapi_machdep.c

No functional change intended.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41002

libvmmapi: Split the ioctl list into MI and MD lists

To enable use in capability mode, libvmmapi needs a list of all the
ioctls that might be invoked on the vmm device handle.  Some of these
ioctls are amd64-specific.  Move the ioctl list to vmmapi_machdep.c and
define a list of MI ioctls so that the arm64 port can build its own list
without duplicating common ioctls.  No functional change intended.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41001

libvmmapi: Move VM capability names to vmmapi_machdep.c

Add some missing entries while here.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41000

libvmmapi: Move some ioctl wrappers to vmmapi_machdep.c

ioctls relating to segments and various x86-specific interrupt
controllers are easy candidates to move to vmmapi_machdep.c.

In vmmapi.h I'm just ifdefing MD prototypes for now. We could instead
split vmmapi.h into multiple headers, e.g., vmmapi.h and
vmmapi_machdep.h, but it's not obvious to me yet that that's the right
approach.

No functional change intended.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D40999

libvmmapi: Add a subdirectory for amd64-specific code

Move vmmapi_freebsd.c there. It contains x86-specific code used only by
bhyveload(8).

Move vcpu_reset() into vmmapi_machdep.c. It is also x86-specific.

No functional change intended.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D40998

bhyve: Use vm_get_highmem_base() instead of hard-coding the value

This reduces the coupling between libvmmapi (which creates the highmem
segment) and bhyve, in preparation for the arm64 port.

No functional change intended.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D40992

bhyve: Add FDT building code for arm64

fdt.c provides some basic routines which let platform initialization
code build the FDT that gets passed into the guest. For now this is not
very generic; we declare info about CPUs, memory, a single UART
(specified by -o console), a PCIe controller (used for virtio devices),
an interrupt controller and the platform timer.

Co-authored-by: andrew
Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40996

bhyve: Provide optional libfdt linking

The arm64 port currently does not support ACPI, it instead builds up an
FDT which is exported to the guest. This mechanism will not be used on
amd64 but isn't really arm64-specific either, so provide an opt-in
mechanism to link libfdt.

No functional change intended.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D40995

bhyve: Add PL011 UART emulation

This will be used for arm64 guests, instead of the existing ns16550 UART
model.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D40997

sys_procctl(): Make it clear that negative commands are invalid

An initial reading of the preamble of sys_procctl() gives the impression
that no test prevents a malicious user from passing a negative command
index (in 'uap->com'), which is soon used as an index into the static
array procctl_cmds_info[].

However, a closer examination leads to the conclusion that the existing
code is technically correct.  Indeed, the comparison of 'uap->com' to
the nitems() expression, which expands to a ratio of sizeof(), leads to
a conversion of 'uap->com' to an 'unsigned int' as per Usual Arithmetic
Conversions/Integer Promotions applied by '<=', because sizeof() returns
'size_t' values, and we define 'size_t' as an equivalent of 'unsigned
int' (which is not mandated by the standard, the latter allowing, e.g.,
integers of lower ranks).

With this conversion, negative values of 'uap->com' are automatically
ruled-out since they are converted to very big unsigned integers which
are caught by the test.  An analysis of assembly code produced by LLVM
16 on amd64 and practical tests confirm that no exploitation is possible.

However, the guard code as written is misleading to readers and might
trip up static analysis tools.  Make sure that negative values are
explicitly excluded so that it is immediately clear that EINVAL will be
returned in this case.

Build tested with clang 16 and GCC 12.

Approved by:    markj (mentor)
MFC after:      1 week
Sponsored by:   The FreeBSD Foundation
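
The difference between the implicit and the explicit guard can be seen
in a small userspace sketch. The cmds table and the function names are
hypothetical; the cast in com_valid_implicit() spells out the
conversion to an unsigned type that the compiler otherwise performs
implicitly when an int is compared against a sizeof()-based expression.
The sketch assumes size_t is at least as wide as int.

```c
#include <stdbool.h>
#include <stddef.h>

#define nitems(x)	(sizeof(x) / sizeof((x)[0]))

/* Hypothetical command table standing in for procctl_cmds_info[]. */
static const char *const cmds[] = { "zero", "one", "two" };

/*
 * Original style: a negative com becomes a very large unsigned value
 * and is therefore rejected, but only as a side effect of conversion.
 */
static bool
com_valid_implicit(int com)
{
	return (!((size_t)com >= nitems(cmds)));
}

/* Explicit style: the negative case is visibly excluded. */
static bool
com_valid_explicit(int com)
{
	return (com >= 0 && (size_t)com < nitems(cmds));
}
```

Both functions accept and reject the same values; the second simply
makes the intent obvious to readers and to static analysis tools.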

tcp: Make tcp_var.h more self-contained

struct tcpcb embeds a struct osd and a struct callout. Rather than
forcing all consumers to pull in the same headers, include the headers
directly.

No functional change intended.

Reviewed by: glebius
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44685

vm_reserv_reclaim_contig: Return NULL not false

Reviewed by: dougm, zlei
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44667

pciconf(8): dump AMD IOMMU Base Capability

Reviewed by: emaste
Sponsored by: Advanced Micro Devices (AMD)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44732

pcireg.h: Add AMD IOMMU Base Cap definitions

Reviewed by: emaste
Sponsored by: Advanced Micro Devices (AMD)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44732

pcireg.h: add include guard

Reviewed by: emaste
Sponsored by: Advanced Micro Devices (AMD)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44732

libbe(3): history: fix

'bectl(8) and libbe' (not 'libbe and libbe(3)').

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/857

libbe(3): consistency, and authors

Consistency with the manual page for bectl(8), including addition of an
AUTHORS section.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/857

bectl(8): authors: Kyle Evans: fine-tune

Discussed with Kyle in Discord.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/857

bectl(8): authors: be more explicit

Cross-reference (name) the manual page that was written by Bryan
Drewery.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/857

bectl(8): HISTORY, AUTHORS: further attention

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/857

bectl(8): corrections, changes

beadm(1) no longer exists.

Cross-reference beadm(8).

Aim to improve the HISTORY and AUTHORS sections, including consistency
with the manual pages for beadm(8) and libbe(3).

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/857

exit.3: add the comma after an empty space

The exit(3) man page shows __cxa_atexit(3,) instead of __cxa_atexit(3),
in one section. It seems the comma ends up inside the parentheses; with
an extra space, it is rendered as expected.

Signed-off-by: rilysh <nightquick@proton.me>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1006

Reapply "release.sh: Add -jN to `make release`"

With the latest changes to release/Makefile, it is now possible to
run `make release -jN` without the build failing (at least in my
latest tests).

This reverts commit 7b707e797b2cd6265ba8f6215e59445e9efb9e97.

MFC after: 1 week

release: Don't reuse disc1/bootonly directories

The disc1 and bootonly directories have files distributed into them
for use in "full" and "mini" images; the former are disc1.iso and
memstick.img, and the latter is bootonly.iso and mini-memstick.img.

Unfortunately the scripts which package a directory tree into an ISO
or memory stick image also modify the directory, for example to
create an appropriate /etc/fstab file; so creating two images at the
same time breaks.

Resolve this by copying disc1 to disc1-disc1 and disc1-memstick,
and copying bootonly to bootonly-bootonly and bootonly-memstick,
before using those directories for constructing the ISO+memstick
images.

MFC after: 1 week

release: distributekernel before packagekernel

With these as a single make command, `make -j` breaks when it tries to
package up a kernel which hasn't been distributed yet.

MFC after: 1 week

release: make -j compat: cd inside subshell

Place instances of "cd foo && bar" inside subshells for compatibility
with modern make(8) which uses a single shell for the duration of a
makefile target.

MFC after: 1 week

bcm2838_xhci: add module

bcm2838_xhci(4) is a shim for the XHCI controller on the Raspberry Pi 4B
SoC. It loads the controller's firmware before passing control to the
normal xhci(4) driver.

When xhci(4) is built as a module (and not in the kernel), bcm2838_xhci
is not built at all and the RPi4's XHCI controller won't attach due to
missing firmware.

To fix this, build a new module, bcm2838_xhci.ko, which depends on
xhci.ko. For the dependency to work correctly, also modify xhci to
provide the 'xhci' module in addition to the 'xhci_pci' module it
already provided.

Since bcm2838_xhci is specific to a quirk of the RPi4 SoC, only build
the module for AArch64.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1142

tests: Add ktrace regression test for shm_open

Verify that a capability violation is recorded when shm_open(2) is called
with a non-anonymous path.

Approved by: markj (mentor)
Reviewed by: markj
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D44733

uipc_shm: Copyin userpath for ktrace(2)

If userpath is not SHM_ANON, then copy it in early so ktrace(2) can
record it. Without this change, ktrace(2) will attempt to strcpy a
userspace string and trigger a page fault.

Reported by: syzbot+490b9c2a89f53b1b9779@syzkaller.appspotmail.com
Fixes: 0cd9cde767c3
Approved by: markj (mentor)
Reviewed by: markj
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D44702

unionfs_lookup(): fix wild accesses to vnode private data

There are a few spots in which unionfs_lookup() accesses unionfs vnode
private data without holding the corresponding vnode lock or interlock.

Reviewed by: kib, olce
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D44601

NOTES: Move NVMe entries to MI file

While here, adjust the sample setting for NVME_USE_NVD to use a
non-default setting as is typical in entries in NOTES.

Discussed with: imp
Reviewed by: manu
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44691

sys: Enable NVMe drivers on all architectures

The NVMe drivers are portable and are already included statically in
GENERIC on other architectures such as aarch64 and riscv64.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44690

NOTES: Tidy entries for SATA controllers

- Add typical comments after device entries (copied from amd64
  GENERIC)

- Add an entry for 'device ada'.  Normally this is pulled in via
  'device sd', but is documented in ada(4) and can be used to include
  ATA/SATA disk support in a kernel without SCSI disk support.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44689

NOTES: Add devices for iSCSI support

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44688

iser: Add kernel build glue

'device iser' is documented in iser(4) but not supported. Hook it up
to the build.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44687

NOTES: Move OFED options to MI NOTES

Disable in armv7 NOTES to match sys/modules/Makefile

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44686

periodic/daily/801.trim-zfs: Add a daily zfs trim script

As mentioned in zpoolprops(7), on some SSDs, it may not be desirable to
use ZFS autotrim because a large number of trim requests can degrade
disk performance; instead, the pool should be manually trimmed at
regular intervals.

Add a new daily periodic script for this purpose, 801.trim-zfs. If
enabled (daily_trim_zfs_enable=YES; the default is NO), it will run a
'zpool trim' operation on all online pools, or on the pools listed in
'daily_trim_zfs_pools'.

The trim is not started if the pool is degraded (which matches the
behaviour of the existing 800.scrub-zfs script) or if a trim is already
running on that pool. Having autotrim enabled does not inhibit the
periodic trim; it's sometimes desirable to run periodic trims even with
autotrim enabled, because autotrim can elide trims for very small
regions.

PR: 275965
MFC after: 1 week
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/956

pci_host_generic: Tolerate range resource allocation failures

QEMU for armv7 includes a PCI memory range whose CPU address is
greater than 4GB.  This falls outside the range of armv7's global
mem_rman used by the nexus driver.  As a result, pcib0 fails to
attach, blocking all PCI devices.

Instead, change the driver to be a bit more tolerant.  If allocating a
resource for a range fails, don't fail attaching the entire driver,
but do skip adding the associated PCI range to the relevant rman in
the pcib driver.  This will prevent child devices from using BARs that
allocate from this range.  In the case of QEMU on armv7, devices can
still allocate from an earlier PCI memory range that is within the
32-bit address space (and in fact none of the firmware-assigned memory
BARs use addresses from the upper range).

While here, reorder the operations on I/O ranges a bit: 1) print the
range under bootverbose first (rather than last) so that the range is
printed before any relevant errors for the range, 2) move
rman_manage_region last after the parent resource has been set and
allocated.

Reported by: markj, Jenkins
Reviewed by: markj
Fixes: d79b6b8ec267 pci_host_generic: Don't rewrite resource start address for translation
Differential Revision: https://reviews.freebsd.org/D44698

Revert "unix: new implementation of unix/stream & unix/seqpacket"

The regressions in aio(4) and kernel RPC aren't a 5 minute problem.

This reverts commit d80a97def9a1db6f07f5d2e68f7ad62b27918947.
This reverts commit d1cbb17a873c787a527316bbb27551e97d5ad30c.
This reverts commit fb8a8333b481cc4256d0b3f0b5b4feaa4594e01f.

config.mk: Add MK_VIMAGE knob

Default to VIMAGE as yes.
Add VIMAGE to __DEFAULT_DEPENDENT_OPTIONS (to define VIMAGE_SUPPORT)

Only output VIMAGE to opt_global.h when VIMAGE support is wanted.

Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D39636

arm64 pmap: Add ATTR_CONTIGUOUS support [Part 2]

Create ATTR_CONTIGUOUS mappings in pmap_enter_object().  As a result,
when the base page size is 4 KB, the read-only data and text sections
of large (2 MB+) executables, e.g., clang, can be mapped using 64 KB
pages.  Similarly, when the base page size is 16 KB, the read-only
data section of large executables can be mapped using 2 MB pages.
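
The sizes quoted above work out to 16 base pages per contiguous mapping
with 4 KB pages and 128 with 16 KB pages.  The sketch below shows that
arithmetic together with a simplified eligibility check.  The helper
names are hypothetical, and the alignment test is an assumption about
what such a check looks like, not the logic in pmap_enter_object().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define	KB(x)	((uint64_t)(x) << 10)
#define	MB(x)	((uint64_t)(x) << 20)

/* Number of base pages covered by one contiguous mapping. */
static uint64_t
contig_entries(uint64_t base_page, uint64_t contig_size)
{
	return (contig_size / base_page);
}

/*
 * Simplified check: the virtual and physical addresses must both be
 * aligned to the contiguous size, and the run must cover at least one
 * full contiguous block.  contig_size must be a power of two.
 */
static bool
contig_eligible(uint64_t va, uint64_t pa, uint64_t len, uint64_t contig_size)
{
	uint64_t mask = contig_size - 1;

	return ((va & mask) == 0 && (pa & mask) == 0 && len >= contig_size);
}
```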

Rename pmap_enter_2mpage().  Given that we have grown support for 16 KB
base pages, we should no longer include page sizes that may vary, e.g.,
2mpage, in pmap function names.  Requested by: andrew

Co-authored-by: Eliot Solomon <ehs3@rice.edu>
Differential Revision: https://reviews.freebsd.org/D44575

rpc: use new macros to lock socket buffers

Fixes: d80a97def9a1db6f07f5d2e68f7ad62b27918947

vm: add macro to mark arguments used when NUMA is defined

This fixes compiler warnings when -Wunused-arguments is enabled and
not quieted.
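
A minimal sketch of the idea follows.  NUMA_USED and pick_domain() are
hypothetical names chosen for illustration; the macro added by this
commit may be spelled differently.

```c
#include <assert.h>

/*
 * NUMA_USED is a hypothetical name.  When NUMA is defined the argument
 * really is used, so no attribute is needed.  Otherwise, tell the
 * compiler that leaving it unused is intentional.
 */
#ifdef NUMA
#define	NUMA_USED
#else
#define	NUMA_USED	__attribute__((__unused__))
#endif

static int
pick_domain(int requested NUMA_USED)
{
#ifdef NUMA
	return (requested);
#else
	return (0);	/* single-domain build: the argument is ignored */
#endif
}
```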

Reviewed by: kib, markj
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D44623

ng_socket: Treat EEXIST from kern_kldload() as success

EEXIST is possible in a race condition.
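
The handling amounts to folding one error code into success.  A sketch,
with kldload_result() as a hypothetical helper name rather than the
actual ng_socket code:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: treat "module already loaded" as success. */
static int
kldload_result(int error)
{
	/*
	 * Another thread may have loaded the module between our check
	 * and our own load attempt; the module is present either way.
	 */
	if (error == EEXIST)
		return (0);
	return (error);
}
```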

Inspired by: ffc72591b1f5 (Don't worry if a module is already loaded ...)
Reviewed by: glebius
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44633

mountd.8: Document the new -A mountd option

Commit fefb7c399b39 added warning messages noting that
administrative controls which export directories that
are not local server file system mount points actually
export the entire local server file system.
That commit also added a new command line option "-A" that
silences these warnings.

This patch documents the new "-A" mountd option.

This is a content change.

Reviewed by: markj, pauamma_gundo.com (manpages)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44692

sockets: Add hhook in sonewconn for inheriting OSD specific data

Added HHOOK_SOCKET_NEWCONN and bumped HHOOK_SOCKET_LAST

Reviewed by: glebius, tuexen
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D44632

unix: return immediately on MSG_OOB

Jumping to the cleanup routines would operate on an uninitialized on-stack mc.

Fixes: d80a97def9a1db6f07f5d2e68f7ad62b27918947
Reported-by: syzbot+4adf0b37849ea7723586@syzkaller.appspotmail.com

unix: fix the ad hoc STAILQ_PREPEND()

If there is nothing to prepend, don't try STAILQ_INSERT_HEAD().
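
Why the guard matters can be shown with a userland sketch of a
hand-rolled prepend.  This is not the kernel code: struct item, struct
itemq and prepend() are hypothetical, and the function touches the
stqh_first/stqh_last head fields directly only to make the failure mode
visible.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct item {
	STAILQ_ENTRY(item) link;
	int	v;
};
STAILQ_HEAD(itemq, item);

/* Move every element of src to the front of dst, leaving src empty. */
static void
prepend(struct itemq *dst, struct itemq *src)
{
	/*
	 * Nothing to prepend.  Without this check, an empty src combined
	 * with an empty dst would leave dst's tail pointer aimed at
	 * src's head.
	 */
	if (STAILQ_EMPTY(src))
		return;

	/* Hook dst's elements after the last element of src. */
	*src->stqh_last = STAILQ_FIRST(dst);
	if (STAILQ_EMPTY(dst))
		dst->stqh_last = src->stqh_last;
	dst->stqh_first = STAILQ_FIRST(src);
	STAILQ_INIT(src);
}
```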

Fixes: d80a97def9a1db6f07f5d2e68f7ad62b27918947
Reported-by: syzbot+bb7f3d07c79b5faf8de8@syzkaller.appspotmail.com

icmp: correct the assertion that checks limit + jitter

Fixes: 4399e055ea610cdefa1470ad1ee614dd81ba5e56

cp: Never follow symbolic links in destination.

Historically, BSD cp has followed symbolic links in the destination
when copying recursively, while GNU cp has not.  POSIX is somewhat
vague on the topic, but both interpretations are within bounds.  In
33ad990ce974, cp was changed to apply the same logic for symbolic
links in the destination as for symbolic links in the source: follow
if not recursing (which is moot, as this situation can only arise
while recursing) or if the `-L` option was given.  There is no support
for this in POSIX.  We can either switch back, or go all the way.

Having carefully weighed the kind of trouble you can run into by
following unexpected symlinks up against the kind of trouble you can
run into by not following symlinks you expected to follow, we choose
to go all the way.

Note that this means we need to stat the destination twice: once,
following links, to check if it is or references the same file as the
source, and a second time, not following links, to set the dne flag
and determine the destination's type.
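
The two lookups can be sketched with stat(2) and lstat(2).  This is an
illustration under stated assumptions, not the code in cp:
struct dest_info and probe_dest() are hypothetical names, and only the
"dne" flag name comes from the text above.

```c
#define	_XOPEN_SOURCE 700	/* expose lstat() under strict C modes */

#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical result of probing a copy destination. */
struct dest_info {
	bool	same_as_source;	/* dest is, or references, the source */
	bool	dne;		/* dest itself does not exist */
	mode_t	mode;		/* st_mode of dest itself; 0 if dne */
};

static int
probe_dest(const char *src, const char *dst, struct dest_info *out)
{
	struct stat src_sb, dst_sb;

	out->same_as_source = false;
	out->dne = false;
	out->mode = 0;

	if (stat(src, &src_sb) != 0)
		return (-1);

	/* First lookup, following links: is this the same file? */
	if (stat(dst, &dst_sb) == 0 &&
	    dst_sb.st_dev == src_sb.st_dev && dst_sb.st_ino == src_sb.st_ino)
		out->same_as_source = true;

	/* Second lookup, not following links: existence and type. */
	if (lstat(dst, &dst_sb) != 0) {
		if (errno != ENOENT)
			return (-1);
		out->dne = true;
	} else {
		out->mode = dst_sb.st_mode;
	}
	return (0);
}
```

A destination that is a symlink to the source is reported as the same
file by the first lookup, while the second lookup still sees a symlink
(S_ISLNK(mode)) rather than the type of its target.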

While here, remove a needless complication in the dne logic.  We don't
need to explicitly reject overwriting a directory with a non-directory,
because it will fail anyway.

Finally, add test cases for copying a directory to a symlink and
overwriting a directory with a non-directory.

MFC after: never
Relnotes: yes
Sponsored by: Klara, Inc.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D44578

unix: new implementation of unix/stream & unix/seqpacket

Provide protocol specific pr_sosend and pr_soreceive for PF_UNIX
SOCK_STREAM sockets and implement SOCK_SEQPACKET sockets as an extension
of SOCK_STREAM.  The change meets three goals: get rid of unix(4) specific
stuff in the generic socket code, provide faster and more robust unix/stream
sockets, and bring unix/seqpacket much closer to specification.  Highlights
follow:

- The send buffer now is truly bypassed.  Previously it was always empty,
but the send(2) still needed to acquire its lock and do a variety of
tricks to be woken up at the right time while sleeping on it.  Now the
only two things we care about in the send buffer are the I/O sx(9) lock
that serializes operations and the value of so_snd.sb_hiwat, which we can
read without obtaining a lock.  The sleep of a send(2) happens on the mutex
of the receive buffer of the peer.  A bulk send/recv of data with large
socket buffers will make both syscalls just bounce between owning the
receive buffer lock and copyin(9)/copyout(9); no other locks would be
involved.

- The implementation uses the new mchain structure to manipulate mbuf chains.
Note that this required converting to mchain two functions that are shared
with unix/dgram, unp_internalize() and unp_addsockcred(), as well as adding
a new shared one, uipc_process_kernel_mbuf().  This induces some
non-functional changes in the unix/dgram code as well.  There is room for
improvement here, as right now it is a mix of mchain and manually managed
mbuf chains.

- unix/seqpacket, previously marked as PR_ADDR & PR_ATOMIC and thus treated
as a datagram socket by the generic socket code, now becomes a true stream
socket with record markers.

- unix/stream loses the sendfile(2) support.  This can be brought back,
but requires some work.  Let's first see if there is any interest in this
feature beyond the purely academic.

Reviewed by: markj, tuexen
Differential Revision: https://reviews.freebsd.org/D44151

mbuf: provide mc_uiotomc(), a function to copy from uio(9) to mchain

Implement m_uiotombuf() as a wrapper around mc_uiotomc(). The M_EXTPG is
left untouched. The m_uiotombuf() is left as a compat KPI. New code
should use either mc_uiotomc() or m_uiotombuf_nomap().

Reviewed by: markj, tuexen
Differential Revision: https://reviews.freebsd.org/D44150

mbuf: provide mc_get() that allocates struct mchain of given length

Implement m_getm2(), which is widely used via m_getm() macro, as a wrapper
around mc_get(). New code is advised to use mc_get().

Reviewed by: markj, tuexen
Differential Revision: https://reviews.freebsd.org/D44149

mbuf: add mc_split() that works on two struct mchain

It preserves tail pointers and all length/memory accounting, so that the
caller doesn't need to do any extra traversals. It doesn't respect M_PKTHDR,
but it may be improved if needed. It respects M_EOR, though. The first
consumer will be the new unix(4) SOCK_STREAM and SOCK_SEQPACKET.

Also provide a much simpler mc_concat() that glues two chains back.

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D44148

mbuf: provide new type for mbuf manipulation - mbuf chain

It tracks both the first mbuf and last mbuf, making it handy to use inside
functions that are interested in both. It also tracks length of data and
memory usage. It can be allocated on the stack and passed to an mbuf
allocation or another mbuf manipulation function. It can be embedded into
a kernel facility's internal structure to represent the simplest kind of
data buffer. It uses modern queue(3) based linkage, but is also compatible
with old style m_next linkage. Transitioning older code to the new type can
be done gradually: code that doesn't understand the chain yet can be
supplied with STAILQ_FIRST(&mc.mc_q). So you can have a mix of old style and new
style code in one function as a temporary solution.
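
The shape of such a type can be sketched in userland C with queue(3).
This is a simplified illustration, not the kernel definition: struct buf
stands in for struct mbuf, the chain_*() helpers and legacy_length() are
hypothetical, and apart from mc_q (named above) the field names are
assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Simplified stand-in for struct mbuf. */
struct buf {
	STAILQ_ENTRY(buf) link;	/* queue(3) linkage; doubles as "next" */
	size_t	len;		/* bytes of data held */
	size_t	size;		/* bytes of memory consumed */
};
STAILQ_HEAD(bufq, buf);

/* A chain tracks first and last buffer plus running totals. */
struct chain {
	struct bufq mc_q;	/* head and tail of the chain */
	size_t	mc_len;		/* total data length (assumed name) */
	size_t	mc_mlen;	/* total memory usage (assumed name) */
};

static void
chain_init(struct chain *mc)
{
	STAILQ_INIT(&mc->mc_q);
	mc->mc_len = mc->mc_mlen = 0;
}

/* O(1) append: the tail is tracked, so no traversal is needed. */
static void
chain_append(struct chain *mc, struct buf *b)
{
	STAILQ_INSERT_TAIL(&mc->mc_q, b, link);
	mc->mc_len += b->len;
	mc->mc_mlen += b->size;
}

/* Glue src onto the end of dst and carry the accounting over. */
static void
chain_concat(struct chain *dst, struct chain *src)
{
	STAILQ_CONCAT(&dst->mc_q, &src->mc_q);
	dst->mc_len += src->mc_len;
	dst->mc_mlen += src->mc_mlen;
	src->mc_len = src->mc_mlen = 0;
}

/* Old-style code: knows only a head pointer and walks "next" links. */
static size_t
legacy_length(const struct buf *b)
{
	size_t total = 0;

	for (; b != NULL; b = STAILQ_NEXT(b, link))
		total += b->len;
	return (total);
}
```

Code that has not been converted can still be handed
STAILQ_FIRST(&mc.mc_q), as legacy_length() is here, while converted code
reads mc_len directly without walking the chain.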

Reviewed by: markj, tuexen
Differential Revision: https://reviews.freebsd.org/D44147