Oops, although the vneting macros do not do anything yet,
commit 7344856e3a6d did change where things are initialized
and one of the initialization functions was not being called
early enough. This patch moves nfsrvd_init(0) to the
function called via (VNET_)SYSINIT() to fix this.
Rick Macklem [Sun, 12 Feb 2023 02:27:59 +0000 (18:27 -0800)]
nfsd: Delete nfsrv_prison_cleanup() until vneting enabled
Oops, although the vneting macros do not do anything yet,
commit 7344856e3a6d enabled the prison cleanup function, that
would get called and crash the system when a jail was terminated.
This patch gets rid of nfsrv_prison_cleanup() for now.
It can go in when the vnet macros are enabled as
front ends to the vnet macros.
Rick Macklem [Sat, 11 Feb 2023 23:51:19 +0000 (15:51 -0800)]
nfsd: Prepare the NFS server code to run in a vnet prison
This patch defines null macros that can be used to apply
the vnet macros for global variables and SYSCTL flags.
It also applies these macros to many of the global variables
and some of the SYSCTLs. Since the macros do nothing, these
changes should not result in semantics changes, although the
changes are large in number.
The patch does change several global variables that were
arrays or structures to pointers to same. For these variables,
modified initialization and cleanup code malloc's and free's
the arrays/structures. This was done so that the vnet footprint
would be about 300bytes when the macros are defined as vnet macros,
allowing nfsd.ko to load dynamically.
I believe the comments in D37519 have been addressed, although
it has never been reviewed, due in part to the large size of the patch.
This is the first of a series of patches that will put D37519 in main.
Once everything is in main, the macros will be defined as front
end macros to the vnet ones.
Andrew Turner [Sat, 4 Jun 2022 11:13:51 +0000 (12:13 +0100)]
Make SMCCC usable by device drivers
To allow device drivers to call into SMCCC we need to initialise it
earlier. As it depends on PSCI, and that is detected via ACPI or FDT
move the call to smccc_init to the PSCI driver.
Add a function for drivers to read the smccc version, or 0 if smccc
is not present.
Andrew Turner [Wed, 18 Jan 2023 09:30:32 +0000 (09:30 +0000)]
Always store the arm64 VFP context
If a thread enters a kernel FP context the PCB_FP_STARTED may be
unset when calling get_fpcontext even if the VFP unit has been used
by the current thread.
Reduce the use of this flag to just decide when to store the VFP state.
While here add an assert to check the assumption that the passed in
thread is the current thread and remove the unneeded critical section.
The latter is unneeded as the only place we would need it is in
vfp_save_state and this already has a critical section when needed.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D37998
Andrew Turner [Wed, 18 Jan 2023 09:30:20 +0000 (09:30 +0000)]
Always read the VFP regs in the arm64 fill_fpregs
The PCB_FP_STARTED is used to indicate that the current VFP context
has been used since either 1. the start of the thread, or 2. exiting
a kernel FP context.
When case 2 was added to the kernel this could cause incorrect results
to be returned when a thread exits the kernel FP context and fill_fpregs
is called before it has restored the VFP state, e.g. by trappin on a
userspace VFP instruction.
In both of the cases the base save area is still valid so reduce the
use of the PCB_FP_STARTED flag check to help decide if we need to
store the current threads VFP state.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D37994
Kyle Evans [Sun, 5 Mar 2023 00:49:04 +0000 (18:49 -0600)]
arm: generic_timer: use interrupt-names when available
Offsets for all of thse can be a bit complicated as not all interrupts
will be present, only phys and virt are actually required, and sec-phys
could optionally be specified before phys. Push idx/name pairs into
a new config struct and maintain the old indices while still getting the
correct timers.
Split fdt/acpi attach out independently and allocate interrupts before
we head into the common attach(). The secure physical timer is also
optional there, so mark it so to avoid erroring out if we run into
problems.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D38911
Andrew Turner [Thu, 2 Feb 2023 16:26:25 +0000 (16:26 +0000)]
Limit where we disable the Arm generic timer
Only disable the Arm generic timer on arm64 when entering the kernel
through EL2. There is no guarantee it will be enabled if we are running
under a hypervisor.
Andrew Turner [Sat, 28 Jan 2023 17:36:24 +0000 (17:36 +0000)]
Disable the arm physical timer when an irq exists
Some firmware leaves the timers enabled. Ensure they are disabled if
there are any physical timer interrupt resources to ensure we don't
receive any unexpected interrupts from them.
Kristof Provost [Wed, 10 May 2023 16:26:29 +0000 (18:26 +0200)]
e1000: fix VLAN 0
VLAN 0 essentially means "Treat as untagged, but with priority bits",
and is used by some ISPs.
On igb/em interfaces we did not receive packets with VLAN tag 0 unless
vlanhwfilter was disabled.
This can be fixed by explicitly listing VLAN 0 in the hardware VLAN
filter (VFTA). Do this from em_setup_vlan_hw_support(), where we already
(re-)write the VFTA.
Mark Johnston [Thu, 11 May 2023 13:31:30 +0000 (09:31 -0400)]
cap_net tests: Skip tests if there is no connectivity
When testing cap_connect() and name/addr lookup functions, skip tests if
we fail and the error is not ENOTCAPABLE. This makes the tests amenable
to running in CI without Internet connectivity.
Corvin Köhne [Wed, 10 May 2023 13:19:25 +0000 (09:19 -0400)]
vmm: don't free unallocated memory
If vmx or svm is disabled in BIOS or the device isn't supported by vmm,
modinit won't allocate these state save areas. As kmem_free panics when
passing a NULL pointer to it, loading the vmm kernel module causes a
panic too.
PR: 271251
Reviewed by: markj
Fixes: 74ac712f72cfd6d7b3db3c9d3b72ccf2824aa183 ("vmm: Dynamically allocate a couple of per-CPU state save areas")
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39974
Mark Johnston [Wed, 10 May 2023 13:18:16 +0000 (09:18 -0400)]
unix: Fix locking in uipc_peeraddr()
After the locking protocol changed in commit 75a67bf3d00d ("AF_UNIX:
make unix socket locking finer grained"), uipc_peeraddr() was not
updated accordingly.
The link lock now only protects global socket lists. The PCB lock is
used to protect the link between connected PCBs, so use that. Remove an
old comment which appears to be noting that unp_conn is not set for
connected SOCK_DGRAM sockets (in one direction anyway).
Ed Maste [Mon, 13 Feb 2023 14:28:35 +0000 (09:28 -0500)]
Cirrus-CI: use llvm15 toolchain packages
As of commit 50d7464c3fe6 we use llvm15 as the system toolchain, and
commit eca005d8531f added compiler options incompatible with earlier
versions. Switch to llvm15 packages.
Mark Johnston [Tue, 21 Mar 2023 13:36:58 +0000 (09:36 -0400)]
dtrace: Sync dis_tables.c with illumos
This brings in the following commits:
commit 584b574a3b16c6772c8204ec1d1c957c56f22a87
12174 i86pc: variable may be used uninitialized
Author: Toomas Soome <tsoome@me.com>
Reviewed by: John Levon <john.levon@joyent.com>
Reviewed by: Andrew Stormont <astormont@racktopsystems.com>
Approved by: Dan McDonald <danmcd@joyent.com>
commit a25e615d76804404e5fc63897a9196d4f92c3f5e
12371 dis x86 EVEX prefix mishandled
12372 dis EVEX encoding SIB mishandled
12373 dis support for EVEX vaes instructions
12374 dis support for EVEX vpclmulqdq instructions
12375 dis support for gfni instructions
Author: Robert Mustacchi <rm@fingolfin.org>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Approved by: Joshua M. Clulow <josh@sysmgr.org>
commit c1e9bf00765d7ac9cf1986575e4489dd8710d9b1
12369 dis WBNOINVD support
Author: Robert Mustacchi <rm@joyent.com>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@joyent.com>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Andy Fiddaman <andy@omniosce.org>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Dan McDonald <danmcd@joyent.com>
commit e4f6ce7088a7dd335b9edf4774325f888692e5fb
10893 Need support for new Cascade Lake Instructions
Author: Robert Mustacchi <rm@joyent.com>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@joyent.com>
Reviewed by: Dan McDonald <danmcd@joyent.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Gordon Ross <gwr@nexenta.com>
commit cff040f3ef42d16ae655969398f5a5e6e700b85e
10226 Need support for new EPYC ISA extensions
Author: Robert Mustacchi <rm@joyent.com>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@joyent.com>
Reviewed by: Jason King <jason.king@joyent.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Dan McDonald <danmcd@joyent.com>
commit d242cdf5288b86d9070d88791c8ee696612becdc
8492 AVX512 dis - legacy logical instructions
Author: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Gordon Ross <gordon.w.ross@gmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
commit 81b505b772ab015c588c56bb116239ee549b6eee
8384 AVX512 dis - EVEX prefix support
8385 32-bit avx dis test mishandles EVEX prefix
8386 32-bit bound dis is incorrect
Author: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Gordon Ross <gordon.w.ross@gmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
commit 92381362ae635a3bea638d87b7119f1623b6212e
8319 dis support for new xsave instructions
Author: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Gordon Ross <gordon.w.ross@gmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
commit a4e73d5d60e566669c550027fae2b1d87b4be2b4
8240 AVX512 dis - opmask instruction support
Author: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
959b2dfd39979fe8a9a315a52741d009eb168822
7825 want avx dis tests
7826 PCLMULQDQ psuedo-ops aren't properly described in dis
7827 dis tests for f16c, movbe, cpuid, msr, tsc, fence instrs
7828 sysenter and sysexit dis should be allowed in 64-bit x86
Author: Robert Mustacchi <rm@joyent.com>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Mark Johnston [Tue, 25 Apr 2023 17:33:08 +0000 (13:33 -0400)]
vmm: Expose some more AVX512 CPUID bits to guests
This is required to announce support for some accelerated AES
operations. AVX512BW indicates support for the AVX512-FP16 extension
and AVX512VL indicates support for the use of AVX512 instructions with
vector lengths smaller than 512 bits.
VAES and VPCLMULQDQ extensions indicate that VEX-prefixed AES-NI and
pclmulqdq instructions are supported.
All of these bits are needed for OpenSSL to use VAES to accelerate
AES-GCM transforms.
Chuck Silvers [Wed, 3 May 2023 20:21:19 +0000 (13:21 -0700)]
fsck_ffs: fix the previous change that skipped pass 5 in some cases
The previous change involved calling check_cgmagic() twice in a row
for the same CG in order to differentiate when the CG was already ok vs.
when the CG was rebuilt, but that doesn't work because the second call
(which was supposed to rebuild the CG) returns 0 (indicating that
the CG was not rebuilt) due to the prevfailcg check causing an early
failure return. Fix this by moving the rebuild part of check_cgmagic()
out into a separate function which is called by pass1() when it wants to
rebuild a CG.
Dimitry Andric [Fri, 5 May 2023 16:19:40 +0000 (18:19 +0200)]
Apply libc++ fix for compiling <type_traits> with gcc 13
Merge commit 484e64f7e7b2 from llvm-project (by Roland McGrath):
[libc++] Use __is_convertible built-in when available
https://github.com/llvm/llvm-project/issues/62396 reports that
GCC 13 barfs on parsing <type_traits> because of the declarations
of `struct __is_convertible`. In GCC 13, `__is_convertible` is a
built-in, but `__is_convertible_to` is not. Clang has both, so
using either should be fine.
vmm: fix HLT loop while vcpu has requested virtual interrupts
This fixes the detection of pending interrupts when pirval is 0 and the
pending bit is set
More information how this situation occurs, can be found here:
https://github.com/freebsd/freebsd-src/blob/c5b5f2d8086f540fefe4826da013dd31d4e45fe8/sys/amd64/vmm/intel/vmx.c#L4016-L4031
Reviewed by: corvink, markj
Fixes: 02cc877968bbcd57695035c67114a67427f54549 ("Recognize a pending virtual interrupt while emulating the halt instruction.")
MFC after: 1 week
Sponsored by: vStack
Differential Revision: https://reviews.freebsd.org/D39620
There are some use cases where bhyve has to prepare some special memory
regions. E.g. GPU passthrough for Intel integrated graphic devices needs
to reserve some memory for the graphic device. So, bhyve has to inform
the guest about those memory regions. This information can be passed by
the qemu fwcfg interface. As qemu creates an E820 table, we can reuse
the existing fwcfg item "etc/e820".
This commit is the first one of a series. It only adds a basic
implementation for the creation of the E820 table. Some subsequent
commits will add more items to the E820 table and register it as fwcfg
item.
bhyve: add helper struct for qemus acpi table loader
The hypervisor is aware of all system properties. For the guest bios
it's hard and complex to detect all system properties. For that reason,
it would be better if the hypervisor creates acpi tables instead of the
guest. Therefore, the hypervisor has to send the acpi tables to the
guest. At the moment, bhyve just copies the acpi tables into the guest
memory. This approach has some restrictions. You have to keep sure that
the guest doesn't overwrite them accidentally. Additionally, the size of
acpi tables is limited.
Providing a plain copy of all acpi tables by fwcfg isn't possible. Acpi
tables have to point to each other. So, if the guest copies the acpi
tables into memory by it's own, it has to patch the tables. Due to
different layouts for different acpi tables, there's no generic way to
do that. For that reason, qemu created a table loader interface. It
contains commands for the guest for loading specific blobs into guest
memory and patching those blobs.
This commit adds a qemu_loader class which handles the creation of qemu
loader commands. At the moment, the WRITE_POINTER command isn't
implement. It won't be required by bhyve's acpi table generation yet.
Ed Maste [Tue, 2 May 2023 17:27:18 +0000 (13:27 -0400)]
src.conf: add WITH_TOOLCHAIN description
It is not used by the in-tree default configuration, but adding it
allows downstream projects with different defaults to make use of
Cirrus-CI (as the makeman test added in 85e8c2a03490 requires that
there are no missing option descriptions).
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39936
Intel DMAR: remove parsing of 6-level paging capability
Early versions of the VT-d spec mentioned 6-level paging support as a
possible value for the SAGAW capability, but later versions removed it
and SAGAW=0x10 is currently listed as a reserved value.
The 6-level (agaw=64) entry in sagaw_bits is furthermore problematic
with clang15 because the attempted comparison against 1ULL << 64 in
dmar_maxaddr2mgaw() causes the compiler to elide the last iteration
of the initial loop, which bypasses the subsequent logic to find the
greatest HW-supported address width. This results in 5-level paging
always being selected regardless of whether the hardware supports it,
which can result address translation failure due to invalid context-
entry programming.
Ravi Pokala [Thu, 27 Apr 2023 05:03:48 +0000 (22:03 -0700)]
jedec_dimm(4): Refactor offset adjustment and page0 reset
Offsets greater than 255 bytes reside on page1 of the SPD device.
Accessing them requires switching to page1, and adjusting the absolute
offset to be relative to the start of page1. After the access, the page
must be set back to page0. These operations are performed in several
places, so break them out into their own functions.
Also, replace a pair of default cases, which should be impossible due to
earlier checks, with __assert_unreachable().
Ravi Pokala [Tue, 25 Apr 2023 06:07:39 +0000 (23:07 -0700)]
jedec_dimm(4): Add manufacturing year and week.
DDR3 and DDR4 encode the week and year that the DIMM was manufactured,
as a pair of two-digit binary-coded decimal values. Read the values, and
report them as (uint8_t)s.
Ed Maste [Mon, 1 May 2023 17:25:18 +0000 (13:25 -0400)]
pkgbase: hide duplicate METALOG directory warnings under verbose
Creating directories multiple times is an inherent side effect of the
way installation is done. Hide warnings from duplicate directory
entries (with identical metadata) under metalog_reader's verbose mode.
Duplicate file entries are always reported. They currently generate
warnings but will be switched to errors once the few instances currently
in the tree are fixed.
PR: 244596, 271178
Reviewed by: kevans
Sponsored By: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39898
boolean_t: change to unsigned int to avoid signed bitfield warnings
This is the final part, which actually makes boolean_t unsigned. Note
that we do not change its size, nor do we try to change it directly to
bool, since that results in a lot of regressions.
Converting the remaining instances of boolean_t to plain C99 bool can
now be done in a piecemeal fashion, after which boolean_t may hopefully
be retired.
vm: fix a number of functions to match the expected prototypes
Noticed while attempting to make boolean_t unsigned: some vm-related
function declarations and defintions were using boolean_t where they
should have used int, and vice versa.
zfs: make zfs_vfs_held() definition consistent with declaration
Noticed while attempting to change boolean_t into an actual bool: in
include/sys/zfs_ioctl_impl.h, zfs_vfs_held() is declared to return a
boolean_t, but in module/os/freebsd/zfs/zfs_ioctl_os.c it is defined to
return an int. Make the definition match the declaration.
Apply clang fix for assertion building emulators/rpcs3
Merge commit a5e1a93ea10f from llvm-project (by Mariya Podchishchaeva):
[clang] Fix crash when handling nested immediate invocations
Before this patch it was expected that if there was several immediate
invocations they all belong to the same expression evaluation context.
During parsing of non local variable initializer a new evaluation context is
pushed, so code like this
```
namespace scope {
struct channel {
consteval channel(const char* name) noexcept { }
};
consteval const char* make_channel_name(const char* name) { return name;}
channel rsx_log(make_channel_name("rsx_log"));
}
```
produced a nested immediate invocation whose subexpressions are attached
to different expression evaluation contexts. The constructor call
belongs to TU context and `make_channel_name` call to context of
variable initializer.
This patch removes this assumption and adds tracking of previously
failed immediate invocations, so it is possible when handling an
immediate invocation th check that its subexpressions from possibly another
evaluation context contains errors and not produce duplicate
diagnostics.
John Baldwin [Wed, 7 Dec 2022 20:32:19 +0000 (12:32 -0800)]
aesni: Remove misleading array bounds for aesni_decryt_ecb.
All the other functions used pointers for from/to instead of
fixed-size array parameters. More importantly, this function can
accept pointers to buffers of multiple blocks, not just a single
block.
John Baldwin [Wed, 22 Mar 2023 19:35:09 +0000 (12:35 -0700)]
sys: Stop enabling -Wnested-externs.
clang doesn't implement this warning, so violations are only caught by
GCC. It is also no longer a common practice to use this as it was in
the original BSD code, so the need for the warning is not as important
as when it was used to do cleanups 20 years ago. A recent commit
(c3179891f897d840f578a5139839fcacb587c96d) triggers this warning on
GCC, but that commit uses nested externs purposefully.