There are some use cases where bhyve has to prepare some special memory
regions. E.g. GPU passthrough for Intel integrated graphic devices needs
to reserve some memory for the graphic device. So, bhyve has to inform
the guest about those memory regions. This information can be passed by
the qemu fwcfg interface. As qemu creates an E820 table, we can reuse
the existing fwcfg item "etc/e820".
This commit is the first one of a series. It only adds a basic
implementation for the creation of the E820 table. Some subsequent
commits will add more items to the E820 table and register it as fwcfg
item.
bhyve: add helper struct for qemus acpi table loader
The hypervisor is aware of all system properties. For the guest bios
it's hard and complex to detect all system properties. For that reason,
it would be better if the hypervisor creates acpi tables instead of the
guest. Therefore, the hypervisor has to send the acpi tables to the
guest. At the moment, bhyve just copies the acpi tables into the guest
memory. This approach has some restrictions. You have to keep sure that
the guest doesn't overwrite them accidentally. Additionally, the size of
acpi tables is limited.
Providing a plain copy of all acpi tables by fwcfg isn't possible. Acpi
tables have to point to each other. So, if the guest copies the acpi
tables into memory by it's own, it has to patch the tables. Due to
different layouts for different acpi tables, there's no generic way to
do that. For that reason, qemu created a table loader interface. It
contains commands for the guest for loading specific blobs into guest
memory and patching those blobs.
This commit adds a qemu_loader class which handles the creation of qemu
loader commands. At the moment, the WRITE_POINTER command isn't
implement. It won't be required by bhyve's acpi table generation yet.
Ed Maste [Tue, 2 May 2023 17:27:18 +0000 (13:27 -0400)]
src.conf: add WITH_TOOLCHAIN description
It is not used by the in-tree default configuration, but adding it
allows downstream projects with different defaults to make use of
Cirrus-CI (as the makeman test added in 85e8c2a03490 requires that
there are no missing option descriptions).
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39936
Intel DMAR: remove parsing of 6-level paging capability
Early versions of the VT-d spec mentioned 6-level paging support as a
possible value for the SAGAW capability, but later versions removed it
and SAGAW=0x10 is currently listed as a reserved value.
The 6-level (agaw=64) entry in sagaw_bits is furthermore problematic
with clang15 because the attempted comparison against 1ULL << 64 in
dmar_maxaddr2mgaw() causes the compiler to elide the last iteration
of the initial loop, which bypasses the subsequent logic to find the
greatest HW-supported address width. This results in 5-level paging
always being selected regardless of whether the hardware supports it,
which can result address translation failure due to invalid context-
entry programming.
Ravi Pokala [Thu, 27 Apr 2023 05:03:48 +0000 (22:03 -0700)]
jedec_dimm(4): Refactor offset adjustment and page0 reset
Offsets greater than 255 bytes reside on page1 of the SPD device.
Accessing them requires switching to page1, and adjusting the absolute
offset to be relative to the start of page1. After the access, the page
must be set back to page0. These operations are performed in several
places, so break them out into their own functions.
Also, replace a pair of default cases, which should be impossible due to
earlier checks, with __assert_unreachable().
Ravi Pokala [Tue, 25 Apr 2023 06:07:39 +0000 (23:07 -0700)]
jedec_dimm(4): Add manufacturing year and week.
DDR3 and DDR4 encode the week and year that the DIMM was manufactured,
as a pair of two-digit binary-coded decimal values. Read the values, and
report them as (uint8_t)s.
Ed Maste [Mon, 1 May 2023 17:25:18 +0000 (13:25 -0400)]
pkgbase: hide duplicate METALOG directory warnings under verbose
Creating directories multiple times is an inherent side effect of the
way installation is done. Hide warnings from duplicate directory
entries (with identical metadata) under metalog_reader's verbose mode.
Duplicate file entries are always reported. They currently generate
warnings but will be switched to errors once the few instances currently
in the tree are fixed.
PR: 244596, 271178
Reviewed by: kevans
Sponsored By: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39898
boolean_t: change to unsigned int to avoid signed bitfield warnings
This is the final part, which actually makes boolean_t unsigned. Note
that we do not change its size, nor do we try to change it directly to
bool, since that results in a lot of regressions.
Converting the remaining instances of boolean_t to plain C99 bool can
now be done in a piecemeal fashion, after which boolean_t may hopefully
be retired.
vm: fix a number of functions to match the expected prototypes
Noticed while attempting to make boolean_t unsigned: some vm-related
function declarations and defintions were using boolean_t where they
should have used int, and vice versa.
zfs: make zfs_vfs_held() definition consistent with declaration
Noticed while attempting to change boolean_t into an actual bool: in
include/sys/zfs_ioctl_impl.h, zfs_vfs_held() is declared to return a
boolean_t, but in module/os/freebsd/zfs/zfs_ioctl_os.c it is defined to
return an int. Make the definition match the declaration.
Apply clang fix for assertion building emulators/rpcs3
Merge commit a5e1a93ea10f from llvm-project (by Mariya Podchishchaeva):
[clang] Fix crash when handling nested immediate invocations
Before this patch it was expected that if there was several immediate
invocations they all belong to the same expression evaluation context.
During parsing of non local variable initializer a new evaluation context is
pushed, so code like this
```
namespace scope {
struct channel {
consteval channel(const char* name) noexcept { }
};
consteval const char* make_channel_name(const char* name) { return name;}
channel rsx_log(make_channel_name("rsx_log"));
}
```
produced a nested immediate invocation whose subexpressions are attached
to different expression evaluation contexts. The constructor call
belongs to TU context and `make_channel_name` call to context of
variable initializer.
This patch removes this assumption and adds tracking of previously
failed immediate invocations, so it is possible when handling an
immediate invocation th check that its subexpressions from possibly another
evaluation context contains errors and not produce duplicate
diagnostics.
John Baldwin [Wed, 7 Dec 2022 20:32:19 +0000 (12:32 -0800)]
aesni: Remove misleading array bounds for aesni_decryt_ecb.
All the other functions used pointers for from/to instead of
fixed-size array parameters. More importantly, this function can
accept pointers to buffers of multiple blocks, not just a single
block.
John Baldwin [Wed, 22 Mar 2023 19:35:09 +0000 (12:35 -0700)]
sys: Stop enabling -Wnested-externs.
clang doesn't implement this warning, so violations are only caught by
GCC. It is also no longer a common practice to use this as it was in
the original BSD code, so the need for the warning is not as important
as when it was used to do cleanups 20 years ago. A recent commit
(c3179891f897d840f578a5139839fcacb587c96d) triggers this warning on
GCC, but that commit uses nested externs purposefully.
John Baldwin [Wed, 21 Dec 2022 18:47:08 +0000 (10:47 -0800)]
Disable -Wzero-length-bounds for the kernel for GCC 12.
The mlx5 driver and some other OFED bits use a somewhat dubious
pattern of:
struct foo {
uint64_t arg[0];
/* Real members of a struct */
};
The code then treats 'arg' as if it were really a kind of union
such that foo.arg[N] functions similarly to (uint64_t *)foo[N].
This uses of foo.arg[N] then trigger this warning.
No real bugs were found by this warning though, so just turn it off
globally.
John Baldwin [Wed, 21 Dec 2022 18:46:26 +0000 (10:46 -0800)]
Disable -Wdangling-pointer for the kernel for GCC 12.
Some of the warnings raised in the kernel seem to be outright bugs in
the compiler (e.g. the cases in ata_xpt.c and scsi_xpt.c). Other
cases are not fatal and it didn't seem to find any legitimate bugs in
the kernel.
John Baldwin [Wed, 21 Dec 2022 18:45:45 +0000 (10:45 -0800)]
iee80211_hwmp: Don't dereference NULL ni in debug printf.
In this call to IEEE80211_NOTE, ni is always NULL due to the assignment
a few lines earlier at the start of the function. If debug traces are
enabled, then this will pass an invalid pointer as the 'mac' pointer to
ieee80211_note_mac. Use IEEE80211_DPRINTF which doesn't take a 'ni'
argument instead.
John Baldwin [Wed, 21 Dec 2022 18:45:26 +0000 (10:45 -0800)]
mrsas: Don't leak a stack pointer value in the softc.
mrsas_issue_blocked_cmd stores a pointer to an on-stack variable
in its softc so that the driver can call wakeup() on the correct
pointer. Once the loop around tsleep() has finished however, the
pointer is no longer needed and any further use would be invalid.
Clear sc->chan to NULL after the loop.
John Baldwin [Wed, 21 Dec 2022 18:46:06 +0000 (10:46 -0800)]
Disable errors for -Wnonnull for the kernel for GCC 12.
The USB code and some other places raise false positives when a NULL
pointer is passed to an inlined function along with a separate length
and the compiler can't determine that the separate length of 0
prevents the use of the NULL pointer.
John Baldwin [Wed, 22 Mar 2023 19:34:34 +0000 (12:34 -0700)]
bhyve: Accept a variable-length string name for qemu_fwcfg_add_file.
It is illegal (UB?) to pass a shorter array to a function argument
that takes a fixed-length array. Do a runtime check for names that
are too long via strlen() instead.
John Baldwin [Mon, 5 Dec 2022 00:27:22 +0000 (16:27 -0800)]
libsa: Disable -Wdangling-pointer for zfs.c.
GCC 12 warns about a dangling pointer to 'objid' in
zfs_bootenv_initial(). However, this appears to be a false positive
as the pointer to 'objid' is only passed to zfs_lookup_dataset() but
not saved anywhere that outlives the lifetime of the
zfs_bootenv_initial() function.
John Baldwin [Wed, 30 Nov 2022 22:56:19 +0000 (14:56 -0800)]
Explicitly set CXXSTD to c++11 for old C++ code using std::auto_ptr<>.
GCC 12 defaults to C++17 which removes (not just deprecates)
std::auto_ptr<>. Trying to use CXXSTD of c++03 doesn't work with
libc++ headers, but c++11 does.
John Baldwin [Mon, 3 Oct 2022 23:10:44 +0000 (16:10 -0700)]
libefivar: Fix a buffer overread.
DevPathToTextUsbWWID allocates a separate copy of the SerialNumber
string to append a null terminator if the original string is not
null terminated. However, by using AllocateCopyPool, it tries to
copy 'Length + 1' words from the existing string containing 'Length'
characters into the target string. Split the copy out to only
copy 'Length' characters instead.
John Baldwin [Wed, 30 Nov 2022 22:50:35 +0000 (14:50 -0800)]
Silence GCC warnings when using libc++ headers.
GCC 12 raises warnings about literal operator suffixes not preceded by
'_' in libc++ headers such as <string_view> as it doesn't recognize
libc++ headers being an implementation of the standard.
GCC 12 also warns about clang-specific pragmas in <locale>.
Disabling these warnings globally for all C++ code is not ideal, but
is a better option than patching libc++ headers to ignore these
warnings.
John Baldwin [Thu, 23 Mar 2023 16:31:29 +0000 (09:31 -0700)]
libpmc: Use LIB_CXX instead of explicit LDADD to link a C++ library.
This uses the C++ compiler as the linker instead of the C compiler
letting the compiler driver pick the right libraries. This is a no-op
on main and stable/13 but matters for stable/12 where the current
logic breaks for external GCC since it tries to use a non-existent
libstdc++.
Before the commit 6cc44223cb6717795afdac4348bbe7e2a968a07d the
field event_mask was fully copied to the EventMasks field.
After this commit the event_mask (uint8_t) is 4 times casted to
EventMask (uint32_t). Because of that 24 bits of each event_mask array
is lost.
This commits brings back simple copying of field, and after words
converting 32 bits field to the requested endian.
I don't think we need more sophisticated method,
as the array is of size 4 (for 32 bits version).
Ed Maste [Mon, 24 Apr 2023 19:41:45 +0000 (15:41 -0400)]
ipv6: disable RFC 4620 nodeinfo by default
RFC 4620 is an experimental RFC that can be used to request information
about a host, including:
- the fully-qualified or single-component name
- some set of the Responder's IPv6 unicast addresses
- some set of the Responder's IPv4 unicast addresses
This is not something that should be made available by default.
PR: 257709
Submitted by: ruben@verweg.com
Reviewed by: melifaro
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39778
Ed Maste [Mon, 17 Apr 2023 18:56:51 +0000 (14:56 -0400)]
geom: use bool for one-bit wide bit-field
A one-bit wide bit-field can take only the values 0 and -1. Clang 16
introduced a warning that "implicit truncation from 'int' to a one-bit
wide bit-field changes value from 1 to -1". Fix by using c99 bool.
Reported by: Clang, via dim
Reviewed by: dim
Sponsored by: The FreeBSD Foundation
Stefan Eßer [Tue, 25 Apr 2023 07:40:05 +0000 (09:40 +0200)]
sys/fs: do not report blocks allocated for synthetic file systems
The pseudo file systems (devfs, fdescfs, procfs, etc.) report total
and available blocks and inodes despite being synthetic with no
underlying storage device to which those values could be applied.
The current code of these file systems tends to report a fixed number
of total blocks but no free blocks, and in the case of procfs,
libprocfs, linsysfs also no free inodes.
This can be irritating in e.g. the "df" output, since 100% of the
resources seem to be in use, but it can also create warnings in
monitoring tools used for capacity management.
This patch makes these file systems return the same value for the
total and free parameters, leading to 0% in use being displayed by
"df". Since there is no resource that can be exhausted, this appears
to be a sensible result.
fs/msdosfs: add tracking of free root directory entries
This update implements tallying of free directory entries during
create, delete, or rename operations on FAT12 and FAT16 file systems.
Prior to this change, the total number of root directory entries
was reported as number of inodes, but 0 as the number of free
inodes, causing system health monitoring software to warn about
a suspected disk full issue.
The FAT12 and FAT16 file systems provide a limited number of
root directory entries, e.g. 512 on typical hard disk formats.
The valid range of values is 1 to 65535, but the msdosfs code
will effectively round up "odd" values to the next multiple of 16
(e.g. 513 would allow for 528 root directory entries).
This update implements tracking of directory entries during create,
delete, or rename operations, with initial values determined by
scanning the directory when the file system is mounted.
Total and free directory entries are reported in the f_files and
f_ffree elements of struct statfs, despite differences in semantics
of these values:
- There is no limit on the number of files and directories that can
be created on a FAT file system. Only the root directory of FAT12
and FAT16 file systems is limited, any number of files can still be
created in sub-directories, even when 0 free "inodes" are reported.
- A single file can require 1 to 21 directory entries, depending on
the character set, structure, and length of the name. The DOS 8.3
style file name takes up 1 entry, and if the name does not comply
with the syntax of a DOS 8.3 file name, 1 additional entry is used
for each 13 characters of the file name. Since all these entries
have to be contiguous, it is possible that a file or directory with
a long name can not be created, despite a sufficient total number of
free directory entries.
- Renaming a file can require more directory entries than currently
allocated to store its long name, which may prevent an in-place
update of the name if more entries are needed. This may cause a
rename operation to fail if no contiguous range of free entries for
the new name can be found.
- The volume label is stored in a directory entry. An empty FAT file
system with a volume label will therefore show 1 used "inode" in
df.
- The perceentage of free inodes shown in df or monitoring tools does
only represent the state of the root directory of a FAT12 or FAT16
file system. Neither does a reported value of 0% free inodes does
prevent files from being created in sub-directories, nor does a
value of 50% free inodes guarantee that even a single file with
a "long" name can be created in the root directory (if every other
directory entry is occupied and there are no 2 contiguous entries).
The statfs(2) and df(1) man pages have been updated with a notice
regarding the possibly different semantics of values reported as
total and free inodes for non-Unix file systems.
fs/msdosfs: Fix potential panic and size calculations
Some combinations of FAT12 file system parameters could cause a kernel
panic due to an unmapped access if the size of the FAT was larger than
the CPU page size. The reason is that FAT12 uses 3 bytes to store
2 FAT pointers, leading to partial FAT pointers at the end of buffers
of a size that is not a multiple of 3.
With a typical page size of 4 KB, this caused the FAT entry at byte
offsets 4095 and 4096 to cross the page boundary, with only the first
page mapped. This was fixed by adjusting the mapping to always cover
both bytes of each FAT entry.
Testing revealed 2 other inconsistencies that are fixed by this commit:
1) The calculation of the size of the data area did not take into
account the fact that the first two data block numbers are reserved
and that the data area starts with block 2. This could cause a
FAT12 file system created with the maximum supported number of
blocks to be incorrectly identified as FAT16.
2) The root directory does not take up space in the data area of a
FAT12 or FAT16 file system, since it is placed into a reserved
area outside of that data area. This commits makes stat() report
the logical size of the root directory, but with 0 blocks allocated
from the data area.
When sorting, both the C11 standard (ISO/IEC 9899:2011, K.3.6.3.2) and
the ISO/IEC JTC1 SC22 WG14 N1172 standard, does not define objects of
zero size as undefined behaviour. However Microsoft's cpp-docs does.
Add proper checks for this. Found while working on bsort(3).
mlx5en(4): Don't wait for receive queue to fill up with mbufs during open channels.
Failure to get mbufs may be transient.
Don't permanently fail to open the channels due to lack of mbufs.
This also makes modifying channel parameters faster.
Implement an API for sending a zero-length-packet. The purpose of such a
USB packet is to toggle the binary packet counter for USB 1.0/2.0 protocols,
without sending any data, so that the first packet sent after opening
a USB BULK endpoint doesn't get lost. This is for devices not supporting
the USB standard defined clear-stall handling.