packages: add package for NTP

Reviewed by: imp, manu
Pull Request: https://github.com/freebsd/freebsd-src/pull/1193

zfs: unbreak macOS bootstrap

Temporary patch until vendor implements a fix.

libcbor: vendor update to 0.11.0

Sponsored by: The FreeBSD Foundation

__cxa_thread_call_dtors(3): fix dtor pointer validity check

When checking for the destructor pointer belonging to some still
loaded dso, do not limit the possible dso to the one that instantiated
the destructor. For instance, a dso could set up the dtor pointer to a
function from libcxx.

PR: 278701
Reported by: vd
Reviewed by: dim, emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D45074
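
The validity check described in this commit can be illustrated with a small Python model. This is only a sketch, not the libc code: the `LoadedObject` type, the address ranges, and both function names are hypothetical, and the "old" variant is an interpretation of the commit text (only the registering dso was considered).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LoadedObject:
    """Hypothetical stand-in for one loaded dso and its mapped range."""
    name: str
    start: int
    end: int  # exclusive upper bound

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.end


def dtor_ok_any_dso(dtor_addr: int, loaded: Sequence[LoadedObject]) -> bool:
    """Relaxed check: accept the dtor if any still-loaded dso contains it."""
    return any(obj.contains(dtor_addr) for obj in loaded)


def dtor_ok_registrant_only(dtor_addr: int, registrant: LoadedObject,
                            loaded: Sequence[LoadedObject]) -> bool:
    """Stricter check (assumed old behaviour): only the registering dso counts."""
    return registrant in loaded and registrant.contains(dtor_addr)
```

With a destructor that lives in a different library than the one that registered it, the stricter check rejects a pointer that the relaxed check accepts, which is the case the commit describes.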

Merge bmake-20240430

Merge commit '507951f55039f9d1ceae507d510f8cb68225fbc5'

Import bmake-20240430

Interesting/relevant changes since bmake-20240309

ChangeLog since bmake-20240309

2024-04-30  Simon J Gerraty  <sjg@beast.crufty.net>

* VERSION (_MAKE_VERSION): 20240430
Merge with NetBSD make, pick up
o main.c: ensure '.include <makefile>' respects MAKESYSPATH.
Dir_FindFile will search .CURDIR first unless ".DOTLAST" is seen.

2024-04-28  Simon J Gerraty  <sjg@beast.crufty.net>

* VERSION (_MAKE_VERSION): 20240428
Merge with NetBSD make, pick up
o simplify freeing of lists
o arch.c: trim pointless comments
o var.c: delay variable assignments until actually needed
don't reallocate memory after evaluating an expression, result is
almost always short-lived.

2024-04-26  Simon J Gerraty  <sjg@beast.crufty.net>

* VERSION (_MAKE_VERSION): 20240426
Merge with NetBSD make, pick up
o job.c: in debug output, print the directory in which a job
failed at same time as failed target so it is more easily found in
build log.

2024-04-24  Simon J Gerraty  <sjg@beast.crufty.net>

* VERSION (_MAKE_VERSION): 20240424
Merge with NetBSD make, pick up
o clean up comments, code and tests

2024-04-23  Simon J Gerraty  <sjg@beast.crufty.net>

* VERSION (_MAKE_VERSION): 20240422
Merge with NetBSD make, pick up
o var.c: avoid LazyBuf for :*time modifiers.
LazyBuf's are not nul terminated so not suitable for passing to
functions that expect that. These modifiers are used sparingly so
an extra allocation is not a problem.

2024-04-20  Simon J Gerraty  <sjg@beast.crufty.net>

* VERSION (_MAKE_VERSION): 20240420
Merge with NetBSD make, pick up
o provide more context information for parse/evaluate errors

2024-04-14  Simon J Gerraty  <sjg@beast.crufty.net>

* VERSION (_MAKE_VERSION): 20240414
Merge with NetBSD make, pick up
o parse.c: print -dp debug info earlier so we see which
.if or .for line is being parsed.

2024-04-04  Simon J Gerraty  <sjg@beast.crufty.net>

* VERSION (_MAKE_VERSION): 20240404
Merge with NetBSD make, pick up
o fix some unit tests for Cygwin
o parse.c: exit immediately after reading a null byte from a makefile

* fix generation of bmake.cat1

2024-03-19  Simon J Gerraty  <sjg@beast.crufty.net>

* VERSION (_MAKE_VERSION): 20240314
Add/Improve support for Cygwin
o uname -s output isn't useful so allow configure to
set FORCE_MAKE_OS - to force the value of .MAKE.OS
and use Cygwin which matches uname -o
o fix some unit-tests for Cygwin

* configure.in: use_makefile=no for Cygwin et al.
NOTE: bmake does not support Cygwin and likely never will,

mk/ChangeLog since bmake-20240309

2024-04-24  Simon J Gerraty  <sjg@beast.crufty.net>

* meta.autodep.mk: do not override start_utc

2024-04-18  Simon J Gerraty  <sjg@beast.crufty.net>

* sys.dirdeps.mk: set defaults for DEP_* at level 0 too.
These help when first include of Makefile.depend happens in a leaf
dir.

* install-mk (MK_VERSION): 20240414

2024-04-09  Simon J Gerraty  <sjg@beast.crufty.net>

* install-mk (MK_VERSION): 20240408

* init.mk: allow for _ as well as . to join V
and Q from QUALIFIED_VAR_LIST and VAR_QUALIFIER_LIST.

* progs.mk: avoid overlap between PROG_VARS and
init.mk's QUALIFIED_VAR_LIST since PROG would also
match its VAR_QUALIFIER_LIST,
libs.mk does not have the same issue.

* subdir.mk: _SUBDIRUSE for realinstall should run install
remove include of ${.CURDIR}/Makefile.inc that can be done via
local.subdir.mk where needed

* own.mk: do not conflict with man.mk

2024-03-19  Simon J Gerraty  <sjg@beast.crufty.net>

* install-mk (MK_VERSION): 20240314

* add sys/Cygwin.mk from Christian Franke

Vendor import of libcbor 0.11.0

RELNOTES: Document the addition of NVMe over Fabrics support

tpm: Refactor TIS and add a SPI attachment

Summary:
Though mostly used in x86 devices, TPM can be used on others, with a
direct SPI attachment. Refactor the TPM 2.0 driver set to use an
attachment interface, and implement a SPI bus interface.

Test Plan:
Tested on a Raspberry Pi 4, with a GeeekPi TPM2.0 module (SLB9670
TPM) using security/tpm2-tools tpm2_getcaps for very light testing against the
spibus attachment.

Reviewed by: kd
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D45069

libdiff: More type issues.

Sponsored by: Klara, Inc.
Reviewed by: allanjude
Differential Revision: https://reviews.freebsd.org/D45080

zfs: merge openzfs/zfs@8f1b7a6fa

Notable upstream pull request merges:
#15839 c3f2f1aa2 vdev probe to slow disk can stall mmp write checker
#15888 5044c4e3f Fast Dedup: ZAP Shrinking
#15996 db499e68f Overflowing refreservation is bad
#16118 67d13998b Make more taskq parameters writable
#16128 21bc066ec Fix updating the zvol_htable when renaming a zvol
#16130 645b83307 Improve write issue taskqs utilization
#16131 8fd3a5d02 Slightly improve dnode hash
#16134 a6edc0adb zio: try to execute TYPE_NULL ZIOs on the current task
#16141 b28461b7c Fix arcstats for FreeBSD after zfetch support

Obtained from: OpenZFS
OpenZFS commit: 8f1b7a6fa6762ea4c89198ceb11c521f80b92ddc

MINIMAL: Grow minimal to support ata, scsi and nvme

Until the boot loader automatically loads these things (including the
CAM dependency), we need to have them in the minimal kernel since they
are needed to boot. These aren't strictly required to be in the kernel,
since modules work, but are high enough demand items that until we sort
out boot loader automation, I'm adding them here. These devices are also
common in vm environments. The delta is relatively small in size. Once
the boot loader automation arrives, these and a lot of other things can
be trimmed. It's less than ideal, but is a good middle ground for the
moment.

Sponsored by: Netflix
Reviewed by: kevans, emaste
Differential Revision: https://reviews.freebsd.org/D45012

diff: Sort headers.

MFC after: 1 week
Sponsored by: Klara, Inc.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D45078

libdiff: Fix type issues.

MFC after: 1 week
Sponsored by: Klara, Inc.
Reviewed by: allanjude, markj
Differential Revision: https://reviews.freebsd.org/D45077

geom_stripe: Cascade cantrim just like we do for gmirror

If any of the disks can support trim, cascade that up the
stack. Otherwise, trims won't pass through striped raid setups.

PR: 277673
Reviewed by: imp (minor style tweaks from bug report)
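
The cascading rule in this commit amounts to a logical OR over the component disks. A minimal Python sketch, assuming a hypothetical helper name and a plain list of per-disk flags (this is not the GEOM code):

```python
def stripe_can_trim(component_can_trim):
    """Report trim support for the stripe if any component supports it.

    `component_can_trim` is a hypothetical list of booleans, one per disk.
    """
    return any(component_can_trim)
```

Using "any" rather than "all" matches the commit text: a single trim-capable disk is enough for trims to be passed down the stack.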

da: Update trim stats for WRITE SAME and ATA TRIM

The scsi UNMAP path updated trim stats in the da sysctl, but the ATA
TRIM passthru and WRITE SAME paths did not. Add code so they do.

PR: 277637
Reviewed by: imp (tweaked WS path to update ranges)

tests/sendfile: test operation on unix/stream socket

Although there are already multiple tests in the tests collection
that utilize sendfile(2) support over unix/stream sockets, none of
them exercises the asynchronous part of the operation. This test
framework, however, uses a trick to toggle true async operation and
guarantee that pr_ready method of unix/stream is also tested.

Reviewed by: chs
Differential Revision: https://reviews.freebsd.org/D45055

tests/sendfile: factor out tcp_socketpair()

It creates a pair of connected TCP sockets for later testing. No
functional change.

Reviewed by: chs
Differential Revision: https://reviews.freebsd.org/D45054

libarchive: fix thread autodetermination for zstd compression format

The libarchive code uses sysconf(3) to determine the number of threads
when 0 has been given as the number of threads to use

MFC after: 3 days

linuxkpi: Fix set_memory_*

set_memory_* is currently implemented using PHYS_TO_DMAP but not all
architectures have a DMAP. Looking at how this function is used the
given address isn't physical but virtual so the PHYS_TO_DMAP call can
simply be removed.

Also cast numpages before shifting it to avoid overflow.

Reviewed by: kib, markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D45057
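
Why the cast matters can be shown with a small Python model of the arithmetic. This is a sketch under stated assumptions: 4 KiB pages (`PAGE_SHIFT = 12`) and a 32-bit shift modelled as simple truncation. In C, overflow of a signed shift is undefined, so truncation is only an illustration of one possible outcome.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def length_narrow(numpages):
    """Model shifting in 32-bit arithmetic: bits above bit 31 are lost."""
    return (numpages << PAGE_SHIFT) & 0xFFFFFFFF


def length_widened(numpages):
    """Model widening to 64 bits first, then shifting."""
    return (numpages << PAGE_SHIFT) & 0xFFFFFFFFFFFFFFFF
```

For a range of 2^20 pages (4 GiB), the narrow computation loses the only set bit while the widened one keeps it.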

Tighten boundary check in split(1) to prevent a potential buffer overflow.

Before increasing sufflen, make sure the current name plus two (including
the terminating NUL character and the to-be-added character) does not
exceed the fixed buffer length, and stop immediately if this would occur.

In the worst case, the code would write a NUL character beyond the
boundary; however, it would be caught by open(2) and, based on the memory
layout, we do not believe this would constitute a security vulnerability.

MFC after: 3 days
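
The boundary check described above can be sketched in Python. This is an illustration only: the buffer size and the function name are hypothetical, and the real code operates on a C character buffer.

```python
BUF_SIZE = 16  # hypothetical fixed buffer length, kept small for illustration


def can_grow_suffix(current_name_len, buf_size=BUF_SIZE):
    """Return True if one more suffix character plus the NUL still fit.

    The "+ 2" accounts for the to-be-added character and the terminating
    NUL, as stated in the commit message.
    """
    return current_name_len + 2 <= buf_size
```

With a 16-byte buffer, a 14-character name may still grow by one character (15 characters plus NUL), while a 15-character name may not.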

pf tests: fix REQUIRED_MODULES typo

This ensures we don't try to run the nat66 tests if pf is not loaded.

Sponsored by: Rubicon Communications, LLC ("Netgate")

periodic.conf: remove long deprecated security_daily_compat_var()

This function is documented to be gone after 11. Time to remove this
compat shim.

PR: 275296
Reviewed by: jrm (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D44796

nvmfd: A simple userspace daemon for the NVMe over Fabrics controller

This daemon can operate as a purely userspace controller exporting one
or more simulated RAM disks or local block devices as NVMe namespaces
to a remote host.  In this case the daemon provides a discovery
controller with a single entry for an I/O controller.

nvmfd can also offload I/O controller queue pairs to the nvmft.ko
in-kernel Fabrics controller when -K is passed.  In this mode, nvmfd
still accepts connections and performs initial transport-specific
negotiation in userland.  The daemon still provides a userspace-only
discovery controller with a single entry for an I/O controller.
However, queue pairs for the I/O controller are handed off to the CTL
NVMF frontend.

Eventually ctld(8) should be refactored to provide an abstraction
for the frontend protocol and the discovery and the kernel mode of
this daemon should be merged into ctld(8).  At that point this daemon
can be moved to tools/tools/nvmf as a debugging tool (mostly as sample
code for a userspace controller using libnvmf).

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44731

nvmfdd: A simple userspace NVMe over Fabrics host

This program uses libnvmf to connect to a remote Fabrics controller
and perform a single read or write operation. The write command reads
data from stdin to construct one or more NVM Write commands sent to
the remote namespace. The read command uses one or more NVM Read
commands to read blocks from a remote namespace writing the data to
stdout.

Reviewed by: chuck, imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44730

ctladm: Add nvterminate command to drop active NVMeoF associations

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44729

ctladm: Add nvlist command to list active NVMeoF associations

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44728

ctladm: Permit creating nvmf ports

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44727

nvmft: The in-kernel NVMe over Fabrics controller

This is the server (target in SCSI terms) for NVMe over Fabrics.
Userland is responsible for accepting a new queue pair and receiving
the initial Connect command before handing the queue pair off via an
ioctl to this CTL frontend.

This frontend exposes CTL LUNs as NVMe namespaces to remote hosts.
Users can add LUNs to CTL that can be shared via either iSCSI or
NVMeoF.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44726

mbuf: Add EXT_CTL for mbufs backed by a CTL backend buffer

This is somewhat similar to EXT_NET_DRV, but CTL isn't a network
driver.

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44725

ctl: Add NVMF port type and ioctls

- Add CTL_PORT_NVMF as a new port type.

- Define a new CTL_NVMF ioctl for NVMF-specific operations similar to
  CTL_ISCSI.  This ioctl supports a command to handoff a single
  queue pair, a command to enumerate active associations, and a
  command to disconnect one or more active associations.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44724

ctl_backend_block: Add support for NVMe

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44723

ctl_backend_block: Prepare for NVMe support

- Use wrapper routines for access to shared fields between SCSI and
NVMe I/O requests.

- Use protocol-agnostic wrapper routines for I/O completion status.

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44851

ctl_backend_ramdisk: Add support for NVMe

One known caveat is that the support for WRITE_UNCORRECTABLE is not
quite correct as reads from LBAs after a WRITE_UNCORRECTABLE will
return zeroes rather than an error. Fixing this would likely require
special handling for PG_ANCHOR for NVMe requests (or adding a new
PG_UNCORRECTABLE).

Reviewed by: ken, imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44722

ctl_backend_ramdisk: Prepare for NVMe support

- Use wrapper routines for access to shared fields between SCSI and
NVMe I/O requests.

- Use protocol-agnostic wrapper routines for I/O completion status.

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44849

ctl: Add helper routines to populate NVMe namespace data IDs for a LUN

These will be used by the backends to populate the unique ID fields
like EUI64 in the NVMe namespace data (CNS == 0) and namespace
identification descriptor list (CNS == 3).

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44721

ctl: Support for NVMe commands

- Add support for queueing and executing NVMe admin and NVM commands
  via ctl_run and ctl_queue.  This requires fixing a few places that
  were SCSI-specific to add NVME logic.

- NVMe has much simpler command ordering requirements than SCSI.  In
  particular, the HBA is not required to enforce any specific ordering
  for requests with overlapping LBAs.  The host is required to manage
  that ordering.  However, fused commands (currently only COMPARE and
  WRITE NVM commands can be fused) are required to be executed
  atomically.

  To support fused commands, make the second half of a fused command
  block on the first half, and have commands submitted after a fused
  command pair block on the second half.

- Add handlers and command tables for admin and NVM commands that
  operate on individual namespaces and will be passed down from an
  NVMe over Fabrics controller to a CTL LUN.

Reviewed by: ken, imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44720
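
The blocking rule for fused commands can be modelled with a short Python sketch. It is not the CTL serialization code: the `Fuse` enum and `blocking_deps` function are hypothetical, and the model assumes that no command has completed yet, so every command submitted after a fused pair still sees the pair as outstanding.

```python
from enum import Enum


class Fuse(Enum):
    NORMAL = 0   # not part of a fused pair
    FIRST = 1    # first half of a fused pair
    SECOND = 2   # second half of a fused pair


def blocking_deps(commands):
    """For each command, return the index of the command it blocks on.

    The second half of a pair blocks on the first half; anything submitted
    after a pair blocks on that pair's second half. None means "not blocked".
    """
    deps = []
    pending_first = None  # index of a first half still waiting for its partner
    last_second = None    # index of the most recent second half
    for i, fuse in enumerate(commands):
        if fuse is Fuse.SECOND:
            if pending_first is None:
                raise ValueError("second half submitted without a first half")
            deps.append(pending_first)
            pending_first = None
            last_second = i
        else:
            deps.append(last_second)
            if fuse is Fuse.FIRST:
                pending_first = i
    return deps
```

Because later commands wait on the second half, and the second half waits on the first, the pair is effectively executed as one unit relative to everything submitted after it.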

ctl: Add assertions in SCSI-only paths

Assert that only SCSI I/O requests are passed in various places
that assume a SCSI I/O request (that is, places that access fields
in io->scsiio directly).

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44847

ctl: Update some core data paths to be protocol agnostic

- Add wrapper routines for invoking the be_move_done and io_continue
  callbacks in SCSI and NVMe I/O requests.

- Use wrapper routines for access to shared fields between SCSI and
  NVMe I/O requests.

- ctl_config_write_done is not fully updated since it resubmits SCSI
  commands via ctl_scsiio.  This will be completed in a subsequent
  commit when ctl_nvmeio is added.

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44846

ctl: Support NVMe requests in debug trace functions

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44719

ctl: Add helper routines for setting NVMe completion status

Also includes a few protocol-agnostic wrappers for setting a generic
status (such as success) for a CTL I/O request whether it be SCSI or
NVMe.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44718

ctl: Add structure and related constants for NVMe commands

This includes static inline functions to serve as getters/setters for
fields shared between SCSI and NVMe I/O requests to manage data
buffers.

Reviewed by: ken, imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44717

nvme: Add constants for the Fused Operation (FUSE) field in commands

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44845

ctl: Add CTL_IO_ASSERT wrapper macro

Currently, this pattern is commonly used to assert that a union ctl_io
is a SCSI request. In the future it will be used to assert other
types.

Suggested by: imp
Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44844

ctl: Avoid an upcast for calling ctl_scsi_path_string

Change the first argument of ctl_scsi_path_string to be the embedded
header structure instead of the union. Currently union ctl_io and
struct ctl_scsiio have the same alignment, but this changes on i386 if
a new union member is added that contains a uint64_t member (such as
an embedded struct nvme_command for NVMeoF). In that case, union
ctl_io requires stronger alignment, so the upcast from struct
ctl_scsiio to union ctl_io in ctl_scsi_sense_sbuf raises an increasing
alignment warning on i386.

Avoid the warning by passing struct ctl_io_hdr as the first argument
to ctl_scsi_path_string instead.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44716

nvmecontrol: New commands to support Fabrics hosts

- discover: Connects to a remote Discovery controller, fetches its
  Discovery Log Page, and enumerates the remote controllers described
  in the log page.

  The -v option can be used to display the Identify Controller data
  structure for the Discovery controller.  This is only really useful
  for debugging.

- connect: Connects to a remote I/O controller and establishes an
  association of an admin queue and a single I/O queue.  The
  association is handed off to the in-kernel host to create a new
  nvmeX device.

- connect-all: Connects to a Discovery controller and attempts to
  create an association with each I/O controller enumerated in the
  Discovery controller's Discovery Log Page.

- reconnect: Establishes a new association with a remote I/O
  controller for an existing nvmeX device.  This can be used to
  restore access to a remote I/O controller after the loss of a prior
  association due to a transport error, controller reboot, etc.

- disconnect: Deletes one or more nvmeX devices after detaching their
  namespaces and terminating any active associations.  The devices to
  delete can be identified by either a nvmeX device name or the NQN of
  the remote controller.

- disconnect-all: Deletes all active associations with remote
  controllers.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44715

nvmf: The in-kernel NVMe over Fabrics host

This is the client (initiator in SCSI terms) for NVMe over Fabrics.
Userland is responsible for creating a set of queue pairs and then
handing them off via an ioctl to this driver, e.g. via the 'connect'
command from nvmecontrol(8).  An nvmeX new-bus device is created
at the top-level to represent the remote controller similar to PCI
nvmeX devices for PCI-express controllers.

As with nvme(4), namespace devices named /dev/nvmeXnsY are created and
pass through commands can be submitted to either the namespace devices
or the controller device.  For example, 'nvmecontrol identify nvmeX'
works for a remote Fabrics controller the same as for a PCI-express
controller.

nvmf exports remote namespaces via nda(4) devices using the new NVMF
CAM transport.  nvmf does not support nvd(4), only nda(4).

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44714

cam: Add a XPORT_NVMF for NVMe over Fabrics sims

Reviewed by: ken, imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44713

nvmf_tcp: Add a TCP transport for NVMe over Fabrics

Structurally this is very similar to the TCP transport for iSCSI
(icl_soft.c).  One key difference is that NVMeoF transports use a more
abstract interface working with NVMe commands rather than transport
PDUs.  Thus, the data transfer for a given command is managed entirely
in the transport backend.

Similar to icl_soft.c, separate kthreads are used to handle transmit
and receive for each queue pair.  On the transmit side, when a capsule
is transmitted by an upper layer, it is placed on a queue for
processing by the transmit thread.  The transmit thread converts
command response capsules into suitable TCP PDUs where each PDU is
described by an mbuf chain that is then queued to the backing socket's
send buffer.  Command capsules can embed data along with the NVMe
command.

On the receive side, a socket upcall notifies the receive kthread when
more data arrives.  Once enough data has arrived for a PDU, the PDU is
handled synchronously in the kthread.  PDUs such as R2T or data
related PDUs are handled internally, with callbacks invoked if a data
transfer encounters an error, or once the data transfer has completed.
Received capsule PDUs invoke the upper layer's capsule_received
callback.

struct nvmf_tcp_command_buffer manages a TCP command buffer for data
transfers that do not use in-capsule-data as described in the NVMeoF
spec.  Data related PDUs such as R2T, C2H, and H2C are associated with
a command buffer except in the case of the send_controller_data
transport method which simply constructs one or more C2H PDUs from the
caller's mbuf chain.

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44712

nvmf: Add infrastructure kernel module for NVMe over Fabrics

nvmf_transport.ko provides routines for managing NVMeoF queue pairs
and capsules.  It provides a glue layer between transports (such as
TCP or RDMA) and an NVMeoF host (initiator) and controller (target).

Unlike the synchronous API exposed to the host and controller by
libnvmf, the kernel's transport layer uses an asynchronous API built
on callbacks.  Upper layers provide callbacks on queue pairs that are
invoked for transport errors (error_cb) or anytime a capsule is
received (receive_cb).

Data transfers for a command are usually associated with a callback
that is invoked once a transfer has finished either due to an error
or successful completion.

For an upper layer that is a host, command capsules are allocated and
populated with an NVMe SQE by calling nvmf_allocate_command.  A data
buffer (described by a struct memdesc) can be associated with a
command capsule before it is transmitted via nvmf_capsule_append_data.
This function accepts a direction (send vs receive) as well as the
data transfer callback.  The host then transmits the command via
nvmf_transmit_capsule.  The host must ensure that the data buffer
described by the 'struct memdesc' remains valid until the data
transfer callback is called.  The queue pair's receive_cb callback
should match received response capsules up with previously transmitted
commands.

For the controller, incoming commands are received via the queue
pair's receive_cb callback.  nvmf_receive_controller_data is used to
retrieve any data from a command (e.g. the data for a WRITE command).
It can be called multiple times to split the data transfer into
smaller sizes.  This function accepts an I/O completion callback that
is invoked once the data transfer has completed.
nvmf_send_controller_data is used to send data to a remote host in
response to a command.  In this case a callback function is not used
but the status is returned synchronously.  Finally, the controller can
allocate a response capsule via nvmf_allocate_response populated with
a supplied CQE and send the response via nvmf_transmit_capsule.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44711
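
The commit notes that a host's receive_cb should match received response capsules with previously transmitted commands. A minimal Python sketch of that bookkeeping follows. It is not the kernel API: the class and method names are hypothetical, and matching on the command identifier (CID) is an assumption based on NVMe completions carrying the CID of the command they complete.

```python
class HostQueuePair:
    """Hypothetical host-side bookkeeping for one queue pair."""

    def __init__(self):
        self._pending = {}  # CID -> completion callback for in-flight commands

    def transmit(self, cid, on_complete):
        """Record an in-flight command before it is handed to the transport."""
        if cid in self._pending:
            raise ValueError("CID %d is already in flight" % cid)
        self._pending[cid] = on_complete

    def receive_cb(self, cid, status):
        """Called for each received response capsule.

        Returns False for a response that matches no in-flight command.
        """
        on_complete = self._pending.pop(cid, None)
        if on_complete is None:
            return False
        on_complete(status)
        return True
```

Removing the entry before invoking the callback means a duplicate response for the same CID is reported as unmatched instead of completing the command twice.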

libnvmf: Add internal library to support NVMe over Fabrics

libnvmf provides APIs for transmitting and receiving Command and
Response capsules along with data associated with NVMe commands.
Capsules are represented by 'struct nvmf_capsule' objects.

Capsules are transmitted and received on queue pairs represented by
'struct nvmf_qpair' objects.

Queue pairs belong to an association represented by a 'struct
nvmf_association' object.

libnvmf provides additional helper APIs to assist with constructing
command capsules for a host, response capsules for a controller,
connecting queue pairs to a remote controller and optionally
offloading connected queues to an in-kernel host, accepting queue pair
connections from remote hosts and optionally offloading connected
queues to an in-kernel controller, constructing controller data
structures for local controllers, etc.

libnvmf also includes an internal transport abstraction as well as an
implementation of a userspace TCP transport.

libnvmf is primarily intended for ease of use and low-traffic use cases
such as establishing connections that are handed off to the kernel.
As such, it uses a simple API built on blocking I/O.

For a host, a consumer first populates a 'struct
nvmf_association_params' with a set of parameters shared by all queue
pairs for a single association such as whether or not to use SQ flow
control and header and data digests and creates a 'struct
nvmf_association' object.  The consumer is responsible for
establishing a TCP socket for each queue pair.  This socket is
included in the 'struct nvmf_qpair_params' passed to 'nvmf_connect' to
complete transport-specific negotiation, send a Fabrics Connect
command, and wait for the Connect reply. Upon success, a new 'struct
nvmf_qpair' object is returned.  This queue pair can then be used to
send and receive capsules.  A command capsule is allocated, populated
with an SQE and optional data buffer, and transmitted via
nvmf_host_transmit_command.  The consumer can then wait for a reply
via nvmf_host_wait_for_response.  The library also provides some
wrapper functions such as nvmf_read_property and nvmf_write_property
which send a command and wait for a response synchronously.

For a controller, a consumer uses a single association for a set of
incoming connections.  A consumer can choose to use multiple
associations (e.g. a separate association for connections to a
discovery controller listening on a different port than I/O
controllers).  The consumer is responsible for accepting TCP sockets
directly, but once a socket has been accepted it is passed to
nvmf_accept to perform transport-specific negotiation and wait for the
Connect command.  Similar to nvmf_connect, nvmf_accept returns a newly
constructed nvmf_qpair.  However, in contrast to nvmf_connect,
nvmf_accept does not complete the Fabrics negotiation.  The consumer
must explicitly send a response capsule before waiting for additional
command capsules to arrive.  In particular, in the kernel offload
case, the Connect command and data are provided to the kernel
controller and the Connect response capsule is sent by the kernel once
it is ready to handle the new queue pair.

For userspace controller command handling, the consumer uses
nvmf_controller_receive_capsule to wait for a command capsule.
nvmf_receive_controller_data is used to retrieve any data from a
command (e.g. the data for a WRITE command).  It can be called
multiple times to split the data transfer into smaller sizes.
nvmf_send_controller_data is used to send data to a remote host in
response to a command.  It also sends a response capsule indicating
success, or an error if an internal error occurs.  nvmf_send_response
is used to send a response without associated data.  There are also
several convenience wrappers such as nvmf_send_success and
nvmf_send_generic_error.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44710

nvmft: Add NVMeoF controller routines shared between kernel and userland

This includes functions to validate NVMe Qualified Names, compute an
initial value of the CAP property, validate changes to the CC
property, and populate the Identify Controller data structure for an
I/O controller.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44709

nvmf_tcp.h: Internal header shared between userspace and kernel

- Helper macros for specific SGL types used with the TCP transport

- An inline function which validates various fields in TCP PDUs

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44708

nvmf: Install nvmf.h and nvmf_proto.h in /usr/include/dev/nvmf

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44707

nvmf.h: New header defining ioctls for NVMe over Fabrics

This defines structures, ioctl commands, and related constants used
for both the Fabrics host and controller.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44706

nvmf_proto.h: Add additional types and constants from the 1.1 spec

- Add opcode, command structure, and new error code for Disconnect
  fabrics opcode.

- Add a generic struct nvmf_fabric_command.

- Add constants for special controller ID values.

- Add constants for the cattr field in the Connect command and the
  default value for the kato field in the Connect command.

- Add constants for the offset of controller properties (Fabrics
  version of controller registers).

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44705

nvmf_proto.h: Update for use in FreeBSD

- Replace SPDK_STATIC_ASSERT with _Static_assert.

- Remove SPDK_ and spdk_ prefixes from types and constants.

- Switch to using FreeBSD headers, e.g. <dev/nvme/nvme.h> in place of
  "spdk/nvme_spec.h".

- Add a definition of NVME_NQN_FIELD_SIZE (from SPDK's nvme_spec.h).

- Remove constant for the fabrics opcode as this is already present in
  <dev/nvme/nvme.h>.

- Use types from <dev/nvme/nvme.h> for NVMe structures including
  struct nvme_sgl_descriptor, struct nvme_command, and
  struct nvme_completion.

- Use plain uint16_t in place of struct spdk_nvme_status.

Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44704

nvmf_proto.h: NVMe over Fabrics protocol definitions

This is a copy of spdk/include/spdk/nvmf_spec.h as of commit
470e851852bb948334a272c9f8de495020fa082f from Intel's SPDK.
Subsequent commits will modify it to be a suitable header for the
kernel, but importing the stock file first makes it easier to see
how the resulting header is derived from the original.

Reviewed by: imp
Obtained from: SPDK (https://github.com/spdk/spdk.git)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44703

vdev_disk: disable flushes if device does not support it

If the underlying device doesn't have a write-back cache, the kernel
will just return a successful response. This doesn't hurt anything, but
it's extra work on the IO taskqs that is unnecessary. So, detect this
when we open the device for the first time.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #16148

.github: Update the path used for the homebrew LLVM install on macOS

Pull Request: https://github.com/freebsd/freebsd-src/pull/1212

cam/iosched: Document latency buckets correctly.

Document how latency buckets are actually computed: They are a doubling
from 20us to 10.485s by default, but based at
kern.cam.iosched.bucket_base_us and increase with a ratio of
kern.cam.iosched.bucket_ratio / 100 from one to the next.
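
As a rough illustration of the arithmetic described above, the sketch below computes the upper bound of a bucket from a base and a ratio given in hundredths. The helper name bucket_limit_us() and the count of 20 buckets are assumptions made for this example (20us doubled 19 times is 10,485,760us, which matches the 10.485s figure); this is not the code in the CAM I/O scheduler.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the kernel: upper bound, in
 * microseconds, of latency bucket `idx`, where each bucket is
 * ratio_pct / 100 times the previous one.
 */
static uint64_t
bucket_limit_us(uint64_t base_us, unsigned ratio_pct, unsigned idx)
{
	uint64_t limit = base_us;

	assert(ratio_pct > 100);	/* buckets must grow */
	for (unsigned i = 0; i < idx; i++)
		limit = limit * ratio_pct / 100;
	return (limit);
}
```

With a base of 20 and a ratio of 200, bucket 0 ends at 20us and bucket 19 ends at 10,485,760us.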

Sponsored by: Netflix

Revert "Make WITHOUT_UNDEFINED_VERSION the default"

This is causing failures on gcc13 CI builds so those need to be fixed
or worked around.

This reverts commit 4510f2ca9170927309a423274e03f1eb8e27da27.

nvmecontrol: Allow optional /dev/ for device names

nvmecontrol operates on devices. Allow a user to specify the /dev/ if
they want. Any device that starts with / will be treated as if it was a
full path for maximum flexibility.

Sponsored by: Netflix

date.1: Note that nanosecond support is to appear first in 14.1

Sponsored by: Klara, Inc.

RELNOTES: Mention date(1)'s nanosecond support

Sponsored by: Klara, Inc.

sysctl: Make sysctl_ctx_free() a bit safer

Clear the list before returning so that sysctl_ctx_free() can be called
more than once on the same list without side effects. This simplifies
error handling in drivers; previously, drivers would have to be careful
to call sysctl_ctx_free() at most once to avoid a use-after-free.
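
The pattern can be sketched in userland C as follows. The types and functions here (struct ctx, ctx_add(), ctx_free()) are hypothetical stand-ins, not the kernel's sysctl context API; the point is only that resetting the list head after freeing turns a second call into a no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a sysctl context and its entries. */
struct entry {
	struct entry *next;
};

struct ctx {
	struct entry *head;
};

static int
ctx_add(struct ctx *c)
{
	struct entry *e;

	assert(c != NULL);
	e = malloc(sizeof(*e));
	if (e == NULL)
		return (-1);
	e->next = c->head;
	c->head = e;
	return (0);
}

/*
 * Free every entry and leave the list empty, so that a second call
 * finds nothing to free instead of touching freed memory.
 * Returns the number of entries released.
 */
static int
ctx_free(struct ctx *c)
{
	struct entry *e, *next;
	int freed = 0;

	assert(c != NULL);
	/* Fetch the successor before freeing, as a _SAFE iterator does. */
	for (e = c->head; e != NULL; e = next) {
		next = e->next;
		free(e);
		freed++;
	}
	c->head = NULL;		/* the step that makes repeat calls safe */
	return (freed);
}
```

A driver error path written against such an interface can call ctx_free() unconditionally, even if an earlier cleanup step already did.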

While here, use TAILQ_FOREACH_SAFE in the loop which unregisters OIDs.

Reviewed by: thj, emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45041

Make WITHOUT_UNDEFINED_VERSION the default

Link with --no-undefined-version by default. This detects and prevents
the accidental removal of symbols from versioned libraries.

Reviewed by: arichardson, kib, dim, emaste
Differential Revision: https://reviews.freebsd.org/D44216

libgcc_s: __extendxftf2 and __trunctfxf2 are amd64-only

__extendxftf2 and __trunctfxf2 build on amd64, but not on aarch64 or riscv.

Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D45052

ctladm: Use nitems() in a few more places

Sponsored by: Chelsio Communications

Differential Revision: https://reviews.freebsd.org/D45017

in6.h: expose s6_addr* definitions to user level

The only element of in6_addr that is specified in RFC 3493 or
in POSIX.1-2017 is s6_addr, implemented via a #define to a union
member.  However, FreeBSD and other BSD systems have additional
definitions for the other union members, s6_addr{8,16,32} which
are defined for the kernel and loader.  Some Linux applications
also use them, and they seem to be allowed by the RFC and POSIX.
Remove the current ifdefs, exposing the additional fields to user
level, and replace with #if __BSD_VISIBLE.  Add an explanatory
comment expanding on the previous "nonstandard" comment.

MFC after: 1 week
Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D44979

sctp: document sctp_recvmsg as implemented

PR: 275990
MFC after: 3 days

Improve write issue taskqs utilization

- Reduce number of allocators on small systems down to one per 4
CPU cores, keeping maximum at 4 on 16+ core systems. Small systems
should not have the lock contention multiple allocators are supposed
to solve, while having several metaslabs open and modified each
TXG is not free.
- Reduce number of write issue taskqs down to one per 16 CPU
cores and an integer fraction of number of allocators. On mid-
sized systems, where multiple allocators already make sense, too
many write issue taskqs may reduce write speed on single-file
workloads, since single file is handled by only one taskq to
reduce fragmentation. On large systems, which can actually benefit
from the better IOPS of many taskqs, the bottleneck is less important,
since in the worst case there will be at least 16 cores to handle it.
- Distribute dnodes between allocators (and taskqs) in a round-
robin fashion instead of relying on sync taskqs to be balanced.
The latter is not guaranteed and may depend on scheduling.
- Remove io_wr_iss_tq from struct zio. io_allocator is enough.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #16130

Slightly improve dnode hash

As I understand it, the dnode hash includes 8 bits of the objset
pointer, starting at bit 6, just to be less predictable. But since
objset_t is more than 1KB in size, its allocations are likely aligned
to 2KB, which means the 11 lower bits provide no entropy. Just take
the 8 bits starting from bit 11.
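
The effect of the shift can be checked with a small experiment. This is a sketch under the alignment assumption stated above, using hypothetical helper names; it is not the hash code itself. It counts how many distinct 8-bit values come out of 256 addresses spaced 2KB apart.

```c
#include <assert.h>
#include <stdint.h>

/* Extract 8 bits of an address, starting at bit `shift`. */
static unsigned
ptr_bits(uintptr_t addr, unsigned shift)
{
	return ((unsigned)((addr >> shift) & 0xff));
}

/*
 * Count the distinct 8-bit values produced by `n` addresses spaced
 * `align` bytes apart, beginning at `base`.
 */
static unsigned
distinct_values(uintptr_t base, uintptr_t align, unsigned n, unsigned shift)
{
	unsigned char seen[256] = { 0 };
	unsigned count = 0;

	assert(align != 0);
	for (unsigned i = 0; i < n; i++) {
		unsigned v = ptr_bits(base + i * align, shift);

		if (!seen[v]) {
			seen[v] = 1;
			count++;
		}
	}
	return (count);
}
```

For 2KB-aligned addresses, a shift of 6 leaves only bits 11 to 13 varying, so at most 8 distinct values appear; a shift of 11 lets all 8 extracted bits vary.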

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #16131

libspl/assert: use libunwind for backtrace when available

libunwind seems to do a better job of resolving symbols than
backtrace(), and is also useful on platforms that don't have backtrace()
(eg musl). If it's available, use it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16140

libspl/assert: dump backtrace in assert

Adds a check for the backtrace() function. If available, uses it to show
a stack backtrace in the assertion output.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16140

libspl/assert: add lock around assertion output

If multiple threads trip an assertion at the same moment (quite common),
they can be printing at the same time, and their output gets messy.

This adds a simple lock around the whole thing, to prevent a second task
printing assert output before the first has finished.

Additionally, if libspl_assert_ok is not set, abort() is called without
dropping the lock, so that any other asserting tasks will be killed
before starting any output, rather than only getting part-way through.
This is a tradeoff; it's assumed that multiple threads asserting at the
same moment are likely the same fault in different instances of a
thread, and so there won't be any more useful information from the other
tasks anyway.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16140

libspl/assert: show process/task details in assert output

Makes it much easier to see what thing complained.

The ways to get the thread id, program name and thread name vary wildly
between Linux and FreeBSD, so those are set up in macros. pthread_getname_np()
did not appear in musl until very recently, but the same info has always
been available via prctl(PR_GET_NAME), so we use that instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16140

libzpool: set thread names

Arrange for the thread/task name to be set when new threads are created.
This makes them visible in the process table etc.

pthread_setname_np() is generally available in glibc, musl and FreeBSD,
so no test is required.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16140

find_system_library: fix var cleanup when library not found

The "not found" path is attempting to clear SOMELIB_CFLAGS and
SOMELIB_LIBS by resetting them in AC_SUBST(). However, the second arg to
AC_SUBST is expanded in autoconf with `m4_ifvaln([$2], [[$1]=$2])`,
which is defined as "if the first arg is non-empty". The m4 "empty"
construction is [], therefore, the existing AC_SUBST calls never modify
the variables at all.

The effect of this is that leftovers from the library test can leak out.
At least, if a library header is found in the first stage, but the
library itself is not, -lsomelib is added to SOMELIB_LIBS and further
tests done. If that library is not found, SOMELIB_LIBS will not be
cleared.

For most of our library tests this hasn't been a problem, as they're
either always found properly via pkg-config or set directly, or the
calling test immediately aborts configure. For an optional dependency
however, an apparent "partial" result where the header is found but no
corresponding library causes link errors later.

I think a complete fix should probably not be setting SOMELIB_xxx until
the final result is known, but for now, adjusting the AC_SUBST calls to
explicitly set the empty shell string (which is not "empty" to m4) at
least restores the intent.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16140

in_pcb: don't leak credential refcounts on error

In the error path during allocating an in_pcb, the credentials
associated with the new struct get their reference count
increased early on, but not decremented when the allocation
fails.
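
The shape of the fix can be sketched with a toy reference count. Everything below (struct cred, cred_hold(), cred_drop(), pcb_alloc() and its fail_late flag) is hypothetical and only mirrors the description above; it is not the in_pcb allocation code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted credential. */
struct cred {
	int refs;
};

static void
cred_hold(struct cred *cr)
{
	cr->refs++;
}

static void
cred_drop(struct cred *cr)
{
	assert(cr->refs > 0);
	cr->refs--;
}

struct pcb {
	struct cred *cred;
};

/*
 * Allocate a pcb that holds a reference on `cr`.  `fail_late`
 * simulates a setup step that fails after the reference was taken.
 */
static struct pcb *
pcb_alloc(struct cred *cr, int fail_late)
{
	struct pcb *p = malloc(sizeof(*p));

	if (p == NULL)
		return (NULL);
	cred_hold(cr);			/* reference taken early on */
	p->cred = cr;
	if (fail_late) {
		cred_drop(cr);		/* without this, the count leaks */
		free(p);
		return (NULL);
	}
	return (p);
}
```

After a failed allocation the credential's count is back at its starting value, which is the invariant the leak violated.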

Reported by: cmiller_netapp.com
MFC after: 3 days
Reviewed by: jhb, tuexen
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D45033

beinstall: retire mergemaster support

Mergemaster has been deprecated for some time, and will be retired.

Reviewed by: kevans
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41799

libgcc_s: 80-bit long double functions are x86-only

Don't try to expose them on other architectures.

Reviewed by: arichardson
Differential Revision: https://reviews.freebsd.org/D45028

guestrpc module to handle VMware backdoor port GuestRPC functionality

Convert existing FreeBSD vmware_hvcall function to take a channel
and parameter arguments.

Add vmware_guestrpc_cmd() to send GuestRPC commands to the VMware
hypervisor. The sbuf argument is used for both the command to send
and to store the data to return to the caller.

The following KPIs can be used to get and set FreeBSD-specific guest
information in key/value pairs:
* vmware_guestrpc_set_guestinfo
- set a value into the guestinfo.fbsd.<keyword> key
* vmware_guestrpc_get_guestinfo
- get the value stored in the guestinfo.fbsd.<keyword> key

Add VMware devices to x86 NOTES

Reviewed by: jhb
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D44528

release: Stage non-UFS images in vm-images-stage

When the VM image building code was updated to support building
non-UFS images, the vm-images-stage target was not updated to
install those newly built images to the FTP site. As a result, we
have been sending weekly snapshot announcements since August claiming
that ZFS VM images are available when they are not in fact present
anywhere publicly accessible.

Fixes: 32ae9a6b3937 "release: Build UFS and ZFS VM images"
Reported by: Michael Dexter
MFC after: 5 days

Fix up a mistake in the CFLAGS added. Pointed out by jrtc.

Out of tree modules should be built with DTrace by default.

examples: Install bhyve files on arm64

Sponsored by: Innovate UK

bhyve: Move lock of uart frontend to uart backend

Currently, the uart lock in bhyve lives in the frontend. This causes
some problems:

1. If every frontend needs a lock, it may as well live in the backend,
   since they all share the same uart_softc.
2. If the backend needs to modify uart state after initialization, it
   cannot, because the backend cannot take the lock. For example, to
   implement telnet support for uart in the backend, the backend should
   wait for a connection at initialization. Once a remote process
   connects, it needs to modify rfd and wfd in the backend.

So I decided to move it to the backend.

Reviewed by: corvink, jhb, markj
Differential Revision: https://reviews.freebsd.org/D44947

vmrun.sh: Add arm64 support

For now, we enumerate disk devices before network devices. This is to
work around a problem wherein u-boot remaps BARs during boot in a way
that bhyve does not handle. Some discussion and experiments suggest
that this can be handled by having bhyve not map BARs during boot on
arm64; until a solution is implemented, however, this workaround is
sufficient for simple usage and doesn't have any real downsides.

The console and bootrom are specified slightly differently versus amd64,
and a few of vmrun.sh's command-line options are amd64-only.

Reviewed by: corvink, jhb
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D44933

bhyvectl: Add arm64 bits and hook it up to the build

For now this implementation doesn't provide any machine dependent
functionality on arm64, but it's enough to be able to reset and destroy
VMs.

Reviewed by: jhb
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D44932

bhyvectl: Prepare to add arm64 support

Move MD code into a separate directory and add a simple interface which
lets the MD bits register options and handle them.

No functional change intended.

Reviewed by: jhb
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D44932

sdt: Add macros which expand to probe and provider structure names

No functional change intended.

MFC after: 1 week

fattime: fix fattime to timespec conversion of dates beyond 2106-02-06

It turns out that the only conversion issue was in fattime2timespec, where
multiplying the number of seconds in a day by the number of days overflowed
32-bit unsigned int for dates beyond 2106-02-07 06:28:15.

Casting one of the multiplicands as time_t forces a 64-bit multiplication on
systems where time_t is 64 bits and produces no binary changes on the one
remaining system with 32-bit time_t (namely i386).
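
The overflow and the effect of the cast can be reproduced in isolation. The sketch below is not the fattime2timespec code; the function names are hypothetical and int64_t stands in for a 64-bit time_t.

```c
#include <assert.h>
#include <stdint.h>

#define	SECS_PER_DAY	86400u

/* 32-bit product: wraps once days * 86400 exceeds UINT32_MAX. */
static uint32_t
days_to_secs32(uint32_t days)
{
	return (SECS_PER_DAY * days);
}

/* Widen one operand first so the multiplication is done in 64 bits. */
static int64_t
days_to_secs64(uint32_t days)
{
	return ((int64_t)SECS_PER_DAY * days);
}
```

Day 49710 since the epoch still fits in 32 bits (4,294,944,000 seconds), while day 49711 does not: the 32-bit product wraps to 63,104 and the widened product is 4,295,030,400.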

Since the code is now tested & fixed, this change removes the fixme comments.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44755

fattime: make the test code check beyond 32-bit time_t limits

On systems that have a 64-bit time_t, the test code now exercises the whole
range of fattime.  To do this, this commit...

1. replaces the call to random() with two calls to arc4random() to
   generate a 33-bit number of seconds in order to cover the entire range of
   fattime [1970,2107].  (32-bits stops just short - in January 2106.)
   On systems with 32-bit time_t, the extra bits are discarded and only the
   time_t expressible range is tested.
2. casts time_t values passed to printf as longs and changes the format
   string to match.
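
One way to build a 33-bit value from two 32-bit random words is sketched below. How the test code actually combines its two arc4random() results is not stated here, so the helper make_33bit() and its bit layout are assumptions; the caller is expected to supply the two random words.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: keep one bit of `hi` as bit 32 and use all of
 * `lo` for bits 0..31, giving a value in [0, 2^33).
 */
static uint64_t
make_33bit(uint32_t hi, uint32_t lo)
{
	uint64_t v = ((uint64_t)(hi & 1) << 32) | lo;

	assert(v < ((uint64_t)1 << 33));
	return (v);
}
```

Assigning the result to a 32-bit time_t discards bit 32, which matches the behaviour described for 32-bit systems.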

Now, the test code builds, runs, and exercises what it can (i.e., the whole
fattime range or the 32-bit time_t subset of it) on both 32-bit and 64-bit
time_t systems.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44754

fattime: make the test code build again

This change...

1. replaces calls to timet2fattime/fattime2timet with calls to
   timespec2fattime/fattime2timespec.  The functions got renamed shortly
   after they landed in the kernel but the test code wasn't updated (see
   7ea93e912bf0ef).
2. adds a utc_offset stub.

With this, the test code builds and runs as a 32-bit binary (cc -Wall -O2
-m32 subr_fattime.c).

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44753

cxgbe/tom: Fix the rx channel selection in options2.

This affects TOE operation when multiple rx c-channels are in use for
offload, which is an unusual configuration.

MFC after: 1 week
Sponsored by: Chelsio Communications

cxgbe(4): Query TPCHMAP once and not once per port.

MFC after: 1 week
Sponsored by: Chelsio Communications

cxgbe(4): Rename rx_c_chan to rx_chan.

It is the equivalent of tx_chan but for receive so rx_chan is a better
name. Initialize both using helper functions and make sure both are
displayed in the sysctl MIB.

MFC after: 1 week
Sponsored by: Chelsio Communications

clang-format: Minor tweaks

Invert KeepEmptyLinesAtTheStartOfBlocks.  We used to require an empty
line at the beginning of functions with no local variables, which I
believe is the reason for this setting.  Now it is discouraged in new
code.

Tell clang-format to align consecutive macros, since we tend to do that.
clang-format's output isn't quite what we want here.  Typically we have
a tab after a #define for some reason, and clang-format doesn't appear
to have an option for that.  clang-format will also use a mix of tabs
and spaces to minimize indentation, which is also against our
convention.  However, the result looks better with this setting than
without.

Reviewed by: emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29870

arm: Remove duplicate definitions in armreg.h

No functional change intended.

MFC after: 1 week