John Baldwin [Fri, 18 Nov 2022 18:00:00 +0000 (10:00 -0800)]
vmm: Remove the per-vm cookie argument from vmmops taking a vcpu.
This requires storing a reference to the per-vm cookie in the
CPU-specific vCPU structure. Take advantage of this new field to
remove no-longer-needed function arguments in the CPU-specific
backends. In particular, stop passing the per-vm cookie to functions
that either don't use it or only use it for KTR traces.
John Baldwin [Fri, 18 Nov 2022 17:59:21 +0000 (09:59 -0800)]
vmm: Refactor storage of CPU-dependent per-vCPU data.
Rather than storing static arrays of per-vCPU data in the CPU-specific
per-VM structure, adopt a more dynamic model similar to that used to
manage CPU-specific per-VM data.
That is, add new vmmops methods to init and cleanup a single vCPU.
The init method returns a pointer that is stored in 'struct vcpu' as a
cookie pointer. This cookie pointer is now passed to other vmmops
callbacks in place of the integer index. The index is now only used
in KTR traces and when calling back into the CPU-independent layer.
John Baldwin [Fri, 18 Nov 2022 17:58:56 +0000 (09:58 -0800)]
vmm vmx: Add a global bool to indicate if the host has the TSC_AUX MSR.
A future commit will remove direct access to vCPU structures from
struct vmx, so add a dedicated boolean for this rather than checking
the capabilities for vCPU 0.
John Baldwin [Fri, 18 Nov 2022 17:58:41 +0000 (09:58 -0800)]
vmm: Rework snapshotting of CPU-specific per-vCPU data.
Previously some per-vCPU state was saved in vmmops_snapshot and other
state was saved in vmmops_vcmx_snapshot. Consolidate all per-vCPU
state into the latter routine and rename the hook to the more generic
'vcpu_snapshot'. Note that the CPU-independent per-vCPU data is still
stored in a separate blob as well as the per-vCPU local APIC data.
John Baldwin [Fri, 18 Nov 2022 17:57:29 +0000 (09:57 -0800)]
vmm: Simplify saving of absolute TSC values in snapshots.
Read the current "now" TSC value and use it to compute absolute time
saved value in vm_snapshot_vcpus rather than iterating over vCPUs
multiple times in vm_snapshot_vm.
Ed Maste [Thu, 17 Nov 2022 19:22:33 +0000 (14:22 -0500)]
pkgbase: do not record dependency on non-existent liby package
liby-dev provides (only) liby.a. liby has no headers or man pages, and
there is no liby package. Add a special case to record no dependency on
the package that does not exist.
PR: 266923
Reviewed by: bapt
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37429
Warner Losh [Fri, 18 Nov 2022 17:33:03 +0000 (10:33 -0700)]
stand: Remove i386-only support fire firewire
Remove support for booting off of firewire, and for having dcons via
firewire in the loader. Kernel support for these things is unchanged.
Discussed on arch@ and the current state is not working (and the build
was wrong to boot).
evdev: Extend EVIOCGRAB ioctl scope to cover sysmouse interface
of psm(4), ums(4) and sysmouse(4) drivers.
EVIOCGRAB ioctl execution on /dev/input/event# device node gains
exclusive access to this device to caller. It is used mostly for
development purposes and remote control software. See e.g.
https://reviews.freebsd.org/D30020 which is the reason of creation
of this change.
Cy Schubert [Thu, 17 Nov 2022 15:43:29 +0000 (07:43 -0800)]
heimdal: Fix: Too large time skew, client time 1970-01-01T01:00:00
Part of ed549cb0c53f zeroed out a data structure in the resulting code-file
when a TUTCTime type was freed. This part of the patch applies to Heimdal
7.1+ and not our Heimdal 1.5.2.
PR: 267827
Reported by: Peter Much <pmc@citylink.dinoex.sub.org>
Tested by: Peter Much <pmc@citylink.dinoex.sub.org>
Fixes: ed549cb0c53f
MFC after: TBD with philip@
Ed Maste [Thu, 17 Nov 2022 14:15:20 +0000 (09:15 -0500)]
pkgbase: report type for duplicated METALOG entries
Duplicate METALOG file entries are more of a concern than duplicate
directories. The metalog check tool previously did not include the
entry type in the warnings, making it hard to find the ones of concern.
Ed Maste [Wed, 16 Nov 2022 19:53:42 +0000 (14:53 -0500)]
pkgbase: examine METALOG files relative to stage root directory
Previously we stripped the '.' from the beginning of each METALOG entry
to determine the path to stat. This meant that we examined files on the
build host, not the staged files.
Instead, strip off the last part of the specified METALOG pathname to
find the stage root directory, and stat files relative to that.
Reviewed by: bapt
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37412
Rick Macklem [Thu, 17 Nov 2022 01:37:22 +0000 (17:37 -0800)]
vfs_vnops.c: Fix blksize for ZFS
Since ZFS reports _PC_MIN_HOLE_SIZE as 512 (although it
appears that an unwritten region must be at least f_iosize
to remain unallocated), vn_generic_copy_file_range()
uses 4096 for the copy blksize for ZFS, reulting in slow copies.
For most other file systems, _PC_MIN_HOLE_SIZE and f_iosize
are the same value, so this patch modifies the code to
use f_iosize for most cases. It also documents in comments
why the blksize is being set a certain way, so that the code
does not appear to be doing "magic math".
Martin Matuska [Thu, 17 Nov 2022 01:00:05 +0000 (02:00 +0100)]
zfs: workaround panic on rootfs mount
The import of OpenZFS PR 13758 causes a panic in zfsctl_is_node()
if ZFS is mounting as root filesystem. This implements a workaround
until the issue is resolved by authors.
Ed Maste [Thu, 17 Nov 2022 00:12:18 +0000 (19:12 -0500)]
libcompat: avoid installing include files twice
Previously some headers were getting installed twice, once as expected
and then a second time as part of the compat32 library stage.
Makefile.libcompat sets -DLIBRARIES_ONLY for the install make invocation
which causes bsd.lib.mk to skip headers. However some headers are
handled via bsd.prog.mk, which does not use LIBRARIES_ONLY. Explicitly
set MK_INCLUDES=no.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37413
Notable upstream pull request merges:
#13680 Add options to zfs redundant_metadata property
#13758 Allow mounting snapshots in .zfs/snapshot as a regular user
#13838 quota: disable quota check for ZVOL
#13839 quota: extend quota for dataset
#13973 Fix memory leaks in dmu_send()/dmu_send_obj()
#13977 Avoid unnecessary metaslab_check_free calling
#13978 PAM: Fix unchecked return value from zfs_key_config_load()
#13979 Handle possible null pointers from malloc/strdup/strndup()
#13997 zstream: allow decompress to fix metadata for uncompressed
records
#13998 zvol_wait logic may terminate prematurely
#14001 FreeBSD: Fix a pair of bugs in zfs_fhtovp()
#14003 Stop ganging due to past vdev write errors
#14039 Optimize microzaps
#14050 Fix draid2+2s metadata error on simultaneous 2 drive failures
#14062 zed: Avoid core dump if wholedisk property does not exist
#14077 Propagate extent_bytes change to autotrim thread
#14079 FreeBSD: vn_flush_cached_data: observe vnode locking contract
#14093 Fix ARC target collapse when zfs_arc_meta_limit_percent=100
#14106 Add ability to recompress send streams with new compression
algorithm
#14119 Deny receiving into encrypted datasets if the keys are not
loaded
#14120 Fix arc_p aggressive increase
#14129 zed: Prevent special vdev to be replaced by hot spare
#14133 Expose zfs_vdev_open_timeout_ms as a tunable
#14135 FreeBSD: Fix out of bounds read in zfs_ioctl_ozfs_to_legacy()
#14152 Adds the `-p` option to `zfs holds`
#14161 Handle and detect #13709's unlock regression
Dapeng Gao [Wed, 16 Nov 2022 18:29:28 +0000 (18:29 +0000)]
Check alignment of fp in unwind_frame
A misaligned frame pointer is certainly not a valid frame pointer and
with strict alignment enabled (as on CHERI) can cause panics when it is
loaded from later in the code.
Building the DSDT table by basl will allow it to be loaded by qemu's
ACPI table loader.
Building the DSDT is complex and basl doesn't support it yet. For that
reason, it's still compiled by iasl. It's just a bit restructured.
Upcoming commits will restructure the builds of all other ACPI tables in
a similar way. So, this commit is done for consistency reasons. We're
starting with DSDT because it doesn't point to any other tables and it's
the last one in our current build list.
Emmanuel Vadot [Tue, 15 Nov 2022 12:54:49 +0000 (13:54 +0100)]
usb/dwc3: Only force USB2 based on the PHY register and IP version
We shouldn't force USB2 only based on if we have an external PHY.
The internal PHY register tell us what link speed we can acheive
and we need to force USB2 only if it cannot do USB3.
This is only available after revision 0x290A of the dwc3 IP.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D37394
Fixed: 1331c0f44b6a ("Add support for RockChip RK356X to DWC3 driver.")
Sponsored by: Beckhoff Automation GmbH & Co. KG
Emmanuel Vadot [Tue, 15 Nov 2022 08:58:30 +0000 (09:58 +0100)]
dwc3: Handle optional clocks
Usually dwc3 needs a glue node that contain the SoC specific clocks/resets.
For some reason the RK3328 DTS doesn't have this glue node and the clocks
are specified in the dwc3 node directly.
The bindings says that it is allowed but doesn't specified some strict names
for them.
Add a specific case for RK3328 based on the compatible string.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D37392
Sponsored by: Beckhoff Automation GmbH & Co. KG
John Baldwin [Wed, 16 Nov 2022 05:20:18 +0000 (21:20 -0800)]
rs: Test actual output in the tests.
Previously the tests just verified if command line arguments raised an
error or not, they did not test how command line arguments affected
the output. This adds some sample (if simple) input and output to
each flag test as well as adding a few additional trivial tests.
John Baldwin [Wed, 16 Nov 2022 05:19:35 +0000 (21:19 -0800)]
rs: Use getopt() and strtol() instead of mannual parsing.
This uses the "::" extension to getopt() to handle options which take
an optional argument.
The updated flag tests were all wrong before and only passed because
the manual parser failed to raise errors when a required argument was
missing. The invalid argument test now gets a better error message.
John Baldwin [Wed, 16 Nov 2022 05:17:28 +0000 (21:17 -0800)]
depend-cleanup.sh: Handle rs(1) moving to C++.
To support changes in filenames for programs (and not just libraries),
update clean_dep() to check .depend.foo.o files as well as
.depend.foo.pico files.
Kyle Evans [Wed, 16 Nov 2022 04:07:28 +0000 (22:07 -0600)]
share: i18n: fix mismatch in BIG5 esdb generation
In the first loop, we setup Big5_$i_variable where $i are elements of
$PART with : replaced to @. Do the same in the second loop when we're
trying to refer to the same variable.
No functional change, because none of the in-tree mappings have an @
in them.
John Baldwin [Wed, 16 Nov 2022 03:18:58 +0000 (19:18 -0800)]
libfetch: Pass a zeroed digest to DigestCalcResponse.
GCC 12 warns that passing "" (a constant of char[1]) to a parameter of
type char[33] could potentially overread. It is not clear from the
context that c->qops can never be "auth-int" (and if it can't, then
the "auth-int" handling in DigestCalcResponse is dead code that should
be removed since this is the only place the function is called).
John Baldwin [Wed, 16 Nov 2022 03:17:36 +0000 (19:17 -0800)]
diff: Don't (ab)use sprintf() as a kind of strcat().
Previously print_header() used sprintf() of a buffer to itself as a
kind of string builder but without checking for overflows. This
raised -Wformat-truncation and -Wrestrict warnings in GCC. Instead,
just conditionally print the new timestamp fields after the initial
strftime()-formatted string. While here, use sizeof(buf) with
strftime() rather than a magic number.
John Baldwin [Wed, 16 Nov 2022 03:16:50 +0000 (19:16 -0800)]
diff: Don't treat null characters like carriage returns in readhash().
The implicit fall-through in the !D_FORCEASCII case caused null
characters to be treated as carriage returns honoring the D_STRIPCR,
D_FOLDBLANKS, and D_IGNOREBLANKS flags.
Reported by: GCC -Wimplicit-fallthrough
Reviewed by: bapt
Fixes: 3cbf98e2bee9 diff: read whole files to determine if they are ASCII text
Differential Revision: https://reviews.freebsd.org/D36813
Rich Ercolani [Tue, 15 Nov 2022 22:44:12 +0000 (17:44 -0500)]
Handle and detect #13709's unlock regression (#14161)
In #13709, as in #11294 before it, it turns out that 63a26454 still had
the same failure mode as when it was first landed as d1d47691, and
fails to unlock certain datasets that formerly worked.
Rather than reverting it again, let's add handling to just throw out
the accounting metadata that failed to unlock when that happens, as
well as a test with a pre-broken pool image to ensure that we never get
bitten by this again.
Fixes: #13709 Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Rick Macklem [Tue, 15 Nov 2022 21:30:41 +0000 (13:30 -0800)]
rpc.tlsservd.8: Update man page for new -N/--numdaemons option
Commit 1e588a9ceb36 added a new command line option -N/numdaemons
that specifies how many daemons to run. This allows a server
to be configured with more than one rpc.tlsservd daemon, which
may be necessary to handle a reboot for an NFS server with
many NFS-over-TLS client mounts.
Cy Schubert [Tue, 8 Nov 2022 08:53:29 +0000 (00:53 -0800)]
heimdal: Fix multiple security vulnerabilities
The following issues are patched:
- CVE-2022-42898 PAC parse integer overflows
- CVE-2022-3437 Overflows and non-constant time leaks in DES{,3} and arcfour
- CVE-2021-44758 NULL dereference DoS in SPNEGO acceptors
- CVE-2022-44640 Heimdal KDC: invalid free in ASN.1 codec
Note that CVE-2022-44640 is a severe vulnerability, possibly a 10.0
on the Common Vulnerability Scoring System (CVSS) v3, as we believe
it should be possible to get an RCE on a KDC, which means that
credentials can be compromised that can be used to impersonate
anyone in a realm or forest of realms.
Heimdal's ASN.1 compiler generates code that allows specially
crafted DER encodings of CHOICEs to invoke the wrong free function
on the decoded structure upon decode error. This is known to impact
the Heimdal KDC, leading to an invalid free() of an address partly
or wholly under the control of the attacker, in turn leading to a
potential remote code execution (RCE) vulnerability.
This error affects the DER codec for all extensible CHOICE types
used in Heimdal, though not all cases will be exploitable. We have
not completed a thorough analysis of all the Heimdal components
affected, thus the Kerberos client, the X.509 library, and other
parts, may be affected as well.
This bug has been in Heimdal's ASN.1 compiler since 2005, but it may
only affect Heimdal 1.6 and up. It was first reported by Douglas
Bagnall, though it had been found independently by the Heimdal
maintainers via fuzzing a few weeks earlier.
While no zero-day exploit is known, such an exploit will likely be
available soon after public disclosure.
- CVE-2019-14870: Validate client attributes in protocol-transition
- CVE-2019-14870: Apply forwardable policy in protocol-transition
- CVE-2019-14870: Always lookup impersonate client in DB
Sponsored by: so (philip)
Obtained from: so (philip)
Tested by: philip, cy
MFC after: immediately
John Baldwin [Tue, 15 Nov 2022 20:08:51 +0000 (12:08 -0800)]
cxgbe: Enable TOE TLS RX when an RX key is provided via setsockopt().
Rather than requiring a socket to be created as a TLS socket from the
get go, switch a TOE socket from "plain" TOE to TLS mode when a
receive key is added to the socket.
The firmware is only able to switch a "plain" TOE connection to TLS
mode if the head of the pending socket data is the start of a TLS
record, so the connection is migrated to TLS mode as a multi-step
process.
When TOE TLS RX is enabled, the associated connection's receive side
is frozen via a flag in the TCB. The state of the socket buffer is
then examined to determine if the pending data in the socket buffer
ends on a TLS record boundary. If so, the connection is migrated to
TLS mode and unfrozen. Otherwise, the connection is unfrozen
temporarily until more data arrives. Once more data arrives, the
receive queue is frozen again and rechecked. This continues until the
connection is paused at a record boundary. Any records received
before TLS mode is enabled are decrypted as software records.
Note that this removes the 'rx_tls_ports' sysctl. TOE TLS offload for
receive is now enabled automatically on existing TOE connections when
using a KTLS-aware SSL library just as it was previously enabled
automatically for TLS transmit. This also enables TLS offload for TOE
connections which enable TLS after passing initial data in the clear
(e.g. STARTTLS with SMTP).
John Baldwin [Tue, 15 Nov 2022 20:03:19 +0000 (12:03 -0800)]
ktls: Add tests for receiving corrupted or invalid records.
These should all trigger errors when reading from the socket.
Tests include truncated records (socket closed early on the other
side), corrupted records (bits flipped in explicit IVs, ciphertext, or
MAC), invalid header fields, and various invalid record lengths.
John Baldwin [Tue, 15 Nov 2022 20:02:57 +0000 (12:02 -0800)]
ktls_ocf: Reject encrypted TLS records using AEAD that are too small.
If a TLS record is too small to contain the required explicit IV,
record_type (TLS 1.3), and MAC, reject attempts to decrypt it with
EMSGSIZE without submitting it to OCF. OCF drivers may not properly
detect that regions in the crypto request are outside the bounds of
the mbuf chain. The caller isn't supposed to submit such requests.
John Baldwin [Tue, 15 Nov 2022 20:02:03 +0000 (12:02 -0800)]
ktls: Add software support for AES-CBC decryption for TLS 1.1+.
This is mainly intended to provide a fallback for TOE TLS which may
need to use software decryption for an initial record at the start
of a connection.
Andrew Turner [Mon, 31 Oct 2022 15:08:26 +0000 (15:08 +0000)]
Split out the arm64 EL2 exception vectors
These were originally in locore.S as they are only needed so we have
a valid value to put into the vbar_el2 register. As these will soon
be used by bhyve so move them to a new file as we already have with
the EL1 exception vectors in exception.S.
Obtained from: https://github.com/FreeBSD-UPB/freebsd-src (earlier version)
Sponsored by: Innovate UK
Sponsored by: The FreeBSD Foundation
Andrew Turner [Mon, 14 Nov 2022 15:48:43 +0000 (15:48 +0000)]
Add the arch field to the arm64 MIDR macros
For completeness add accessors for the MIDR field. As the field is
always 0xf on arm64 it is unneeded in the current MICR handling, but
will be used in the vmm module for bhyve.
Obtained from: https://github.com/FreeBSD-UPB/freebsd-src (earlier version)
Sponsored by: The FreeBSD Foundation
Andrew Turner [Mon, 7 Nov 2022 11:21:42 +0000 (11:21 +0000)]
Disable superpage use for stage 2 arm64 mappings
When modifying a stage 2 mapping we may need to call into the
hypervisor to invalidate the TLB. Until it is known if the cost of
this operation is less than the performance gains superpages offers
disable their use.
Reviewed by: kib. markj
Sponsored by: Innovate UK
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37299
Kristof Provost [Tue, 15 Nov 2022 11:11:32 +0000 (12:11 +0100)]
pfsync: fix memory leak
The recent refactoring to prepare for pfsync over IPv6 introduced a
memory leak.
If we don't have a sync peer configured we return early (without sending
out a packet), but failed to free the newly allocated packet.
Kristof Provost [Wed, 9 Nov 2022 13:48:05 +0000 (14:48 +0100)]
if_ovpn: pass control packets through the socket
Rather than passing control packets through the ioctl interface allow
them to pass through the normal UDP socket flow.
This simplifies both kernel and userspace, and matches the approach
taken (or the one that will be taken) on the Linux side of things.
Some ACPI tables like XSDT contain pointers to other ACPI tables. When
an ACPI table is loaded by qemu's loader, the address in the guest
memory is unknown. For that reason, the qemu loader supports patching
those pointers. Basl keeps track of all pointers and causes the qemu
loader to patch all pointers.
The qemu ACPI table loader is unsupport yet. However, in a future commit
bhyve will use dynamic ACPI table offsets based on the size and
alignment requirements of each ACPI table. Therefore, tracking ACPI
table pointer is required too.
The qemu ACPI table loader patches the ACPI tables. After patching them,
checksums aren't correct any more. It has to calculate a new checksum
for the ACPI table. For that reason, basl has to keep track of checksums
and has to cause the qemu loader to create new checksums for the tables.
The qemu ACPI table loader isn't supported yet. However, the address of
all tables is unknown as long as bhyve hasn't finished ACPI table
creation. So, the checksum of tables which include pointer to other
tables are unknown too. This requires tracking of checksums too.
ACPI tables have different layouts. So, there's no common position for
the length field. When tables are build by basl, the length is unknown
at the beginning. It has to be set after building the table.