Corvin Köhne [Mon, 6 Feb 2023 10:43:49 +0000 (11:43 +0100)]
bhyve: add helper to read PCI IDs from bhyve config
Changing the PCI IDs is valuable in some situations. The Intel GOP
driver requires that some PCI IDs of the LPC bridge are aligned with the
physical values of the host LPC bridge. Another use case is Oracle's
virtio drivers. They require a different subvendor ID than the default one.
For that reason, create a helper which makes it easy to read PCI IDs
from bhyve config. Additionally, this helper ensures that all emulation
devices are using the same config keys.
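A minimal sketch of what such a helper could look like, assuming a flat
key/value config store. The names cfg_entry, cfg_lookup and read_pci_id are
hypothetical and are not bhyve's actual API; the point is only that every
device goes through one lookup with a default fallback.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a config store: a flat key/value list. */
struct cfg_entry {
	const char *key;
	const char *value;
};

/* Return the value stored for key, or NULL if the key is not set. */
static const char *
cfg_lookup(const struct cfg_entry *cfg, size_t n, const char *key)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(cfg[i].key, key) == 0)
			return (cfg[i].value);
	}
	return (NULL);
}

/*
 * Hypothetical helper: read a 16-bit PCI ID from the config and fall back
 * to def when the key is absent or the value does not parse as a 16-bit
 * number.
 */
static uint16_t
read_pci_id(const struct cfg_entry *cfg, size_t n, const char *key,
    uint16_t def)
{
	const char *value;
	char *end;
	unsigned long id;

	value = cfg_lookup(cfg, n, key);
	if (value == NULL)
		return (def);
	id = strtoul(value, &end, 0);
	if (end == value || *end != '\0' || id > UINT16_MAX)
		return (def);
	return ((uint16_t)id);
}
```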
Corvin Köhne [Mon, 6 Feb 2023 09:26:33 +0000 (09:26 +0000)]
pci: add tunable hw.pci.enable_mps_tune
If the tunable is set to 0, the tuning of the MPS (maximum payload size)
is disabled and the default MPS values set by the BIOS are used. In this
case the system may use a lower speed or operate in a less optimized
state, but it can resolve issues with stability and compatibility. With
specific devices, tuning the MPS can lead to a complete freeze of
the system.
Reviewed by: manu
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D38397
The tzdata 2023c release reverts all changes made in 2023b other than
commentary, as that appears to be the best of a bad set of short-notice
choices for modeling this week's daylight saving chaos in Lebanon.
netmap: get rid of save_if_input for emulated adapters
The save_if_input function pointer was meant to save the previous
value of ifp->if_input before replacing it with the emulated
adapter hook.
However, the same pointer value is already stored in the if_input
field of the netmap_adapter struct, to be used for host TX ring processing.
Reuse the netmap_adapter if_input field to simplify the code
and save some space.
Kirk McKusick [Tue, 7 Mar 2023 23:12:37 +0000 (15:12 -0800)]
Correct several bugs in fsck_ffs(8) triggered by corrupted filesystems.
If a directory entry has an illegal inode number (less than zero
or greater than the last inode in the filesystem) the entry is removed.
If a directory's '.' or '..' entry had an illegal inode number, it
was being removed. Since fsck_ffs knows what the correct value is
for these two entries, fix them rather than deleting them.
Add much more extensive cylinder group checks and use them to be
more careful about rebuilding a cylinder group.
Check for out-of-range block numbers before trying to free them.
When a directory is deleted also remove its cache entry created
in pass1 so that later passes do not try to operate on a deleted
directory.
Check for ctime(3) returning NULL before trying to use its return.
When freeing a directory inode, do not try to interpret it as a
directory.
Reserve space in the inostatlist to have room to allocate a
lost+found directory.
If an invalid block number is found past the end of an inode simply
remove it rather than clearing and removing the inode.
Modernize the inoinfo structure to use queue(3) LIST rather than a
handrolled linked list implementation.
Reported by: Bob Prohaska, John-Mark Gurney, and Mark Millard
Tested by: Peter Holm
Reviewed by: Peter Holm
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38668
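For readers unfamiliar with queue(3), the LIST macros replace hand-written
next/prev pointer handling roughly as sketched below. The dirinfo structure
and its functions are hypothetical and are not the actual inoinfo code; the
sketch also shows dropping a cached entry when its directory is deleted.

```c
#include <sys/queue.h>
#include <stdlib.h>

/* Hypothetical record; the field names are illustrative only. */
struct dirinfo {
	LIST_ENTRY(dirinfo) link;	/* linkage managed by queue(3) */
	unsigned long inum;		/* inode number used as lookup key */
};

LIST_HEAD(dirinfo_head, dirinfo);

/* Allocate a record for inum and insert it at the head of the list. */
static struct dirinfo *
dirinfo_add(struct dirinfo_head *head, unsigned long inum)
{
	struct dirinfo *di;

	di = calloc(1, sizeof(*di));
	if (di == NULL)
		return (NULL);
	di->inum = inum;
	LIST_INSERT_HEAD(head, di, link);
	return (di);
}

/* Linear search for inum; returns NULL when no record matches. */
static struct dirinfo *
dirinfo_find(struct dirinfo_head *head, unsigned long inum)
{
	struct dirinfo *di;

	LIST_FOREACH(di, head, link) {
		if (di->inum == inum)
			return (di);
	}
	return (NULL);
}

/* Unlink and free a record; LIST_REMOVE does not need the list head. */
static void
dirinfo_del(struct dirinfo *di)
{
	LIST_REMOVE(di, link);
	free(di);
}
```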
Mitchell Horne [Mon, 20 Mar 2023 19:54:11 +0000 (16:54 -0300)]
KASSERT(9): some updates
- Add a little bit of introductory text
- Improve the existing example: ANSI C, use a better assertion than a
NULL check (which is discouraged)
- Document the widely used MPASS macro in this page
- Drop the cross-reference to config(8)
Reviewed by: kib, markj, rpokala, Pau Amma <pauamma@gundo.com>
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39131
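To illustrate the calling convention of the two macros, here is a userland
sketch. The macro definitions below are stand-ins written for this example
(the kernel versions call panic(9) and compile away without INVARIANTS), and
ref_release() is a hypothetical function. Note that KASSERT takes its message
as a parenthesized printf-style argument list, while MPASS takes only the
expression.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userland stand-in for illustration only; not the kernel definition. */
#define	KASSERT(exp, msg) do {						\
	if (!(exp)) {							\
		printf msg;						\
		printf("\n");						\
		abort();						\
	}								\
} while (0)

/* MPASS needs no message; it reports the expression and location. */
#define	MPASS(exp)							\
	KASSERT((exp), ("Assertion %s failed at %s:%d",			\
	    #exp, __FILE__, __LINE__))

/* Hypothetical function: drop one reference and return what is left. */
static int
ref_release(int *refs)
{
	/* Assert a real invariant rather than a bare NULL check. */
	KASSERT(*refs > 0,
	    ("%s: refcount %d is not positive", __func__, *refs));
	(*refs)--;
	MPASS(*refs >= 0);
	return (*refs);
}
```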
Mitchell Horne [Mon, 20 Mar 2023 19:50:50 +0000 (16:50 -0300)]
critical(9): small updates
- Document CRITICAL_ASSERT() in this man page.
- Clarify that a thread may also handle interrupts in a critical
section, not only faults/exceptions.
- Note the negative effects of critical section abuse
- Some other minor clarifications
- Add short SEE ALSO
Reviewed by: kib, markj, rpokala, Pau Amma <pauamma@gundo.com>
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39130
Mark Johnston [Wed, 22 Mar 2023 13:02:54 +0000 (09:02 -0400)]
bhyve: Sleep briefly in the VMEXIT_DEBUG handler
As of commit 0bda8d3e9f7a ("vmm: permit some IPIs to be handled by
userspace") and commit 9cc9abf409cc ("bhyve: create all vcpus on
startup"), we have a misbehaviour where AP vCPU threads spin until they
receive a SIPI. In particular, since they are "suspended", they simply
call the VMEXIT_DEBUG handler in a loop, but the handler is a no-op by
default.
This is tricky to fix since the gdb stub isn't aware of whether a given
vCPU is supposed to be running. For 13.2's sake, introduce a simple
workaround wherein the VMEXIT_DEBUG handler sleeps for a short period.
This ensures that host CPU usage remains sane when VMs are starting
without penalizing users of VMEXIT_DEBUG too much.
Reviewed by: corvink, jhb
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39174
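The shape of the workaround can be sketched as follows. The function name
and the sleep_ns parameter are hypothetical; this is not the bhyve handler
itself, only an illustration of turning a no-op that is called in a tight
loop into one that yields the host CPU for a short time.

```c
#define	_POSIX_C_SOURCE	200809L
#include <time.h>

/*
 * Hypothetical handler: instead of returning immediately, which lets the
 * calling loop spin, sleep for a short interval first.
 */
static int
debug_exit_handler(long sleep_ns)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = sleep_ns };

	/* An early wakeup caused by a signal is harmless here. */
	(void)nanosleep(&ts, NULL);
	return (0);
}
```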
Kristof Provost [Mon, 20 Mar 2023 13:26:33 +0000 (14:26 +0100)]
pfsync: add missing unlock in pfsync_defer_tmo()
The callout for pfsync_defer_tmo() is created with
CALLOUT_RETURNUNLOCKED, because while the callout framework takes care
of taking the lock we want to run a few operations outside of the lock,
so we unlock ourselves.
However, if `sc->sc_sync_if == NULL` we return without releasing the
lock, and leak the lock, causing later deadlocks.
Ensure we always release the bucket lock when we exit pfsync_defer_tmo()
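The bug pattern can be sketched in userland as below. The toy_lock type and
the bucket fields are hypothetical stand-ins, not pf's data structures; the
handler is entered with the lock held and must release it on every return
path, including the early one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock that only tracks ownership so misuse trips an assertion. */
struct toy_lock {
	int held;
};

static void
toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void
toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

struct bucket {
	struct toy_lock lock;
	void *sync_if;	/* NULL when no sync interface is configured */
	int sent;
};

/*
 * Hypothetical timeout handler: called with b->lock held, and expected
 * to have dropped it by the time it returns.
 */
static void
defer_timeout(struct bucket *b)
{
	if (b->sync_if == NULL) {
		/* The early return must drop the lock as well. */
		toy_lock_release(&b->lock);
		return;
	}
	b->sent++;
	toy_lock_release(&b->lock);
	/* Work that must run without the lock would follow here. */
}
```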
Kristof Provost [Mon, 20 Mar 2023 13:29:55 +0000 (14:29 +0100)]
pfsync: fix pfsync_undefer_state() locking
pfsync_undefer_state() takes the bucket lock, but could get called from
places (e.g. from pfsync_update_state() or pfsync_delete_state()) where
we already held the lock.
As it can also be called from places where we don't yet hold the lock,
create a new locked variant for use when the lock is already held. Keep
using pfsync_undefer_state() where the lock must still be taken.
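The locked/unlocked split is a common pattern and might look like the
sketch below. All names are hypothetical stand-ins rather than the pfsync
functions, and the toy lock only records ownership so that taking it twice
would trip an assertion instead of deadlocking.

```c
#include <assert.h>

/* Toy non-recursive lock: acquiring it twice is a bug. */
struct toy_lock {
	int held;
};

static void
toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void
toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

struct bucket {
	struct toy_lock lock;
	int deferred;
};

/* Locked variant: the caller must already hold b->lock. */
static void
undefer_locked(struct bucket *b)
{
	assert(b->lock.held);
	if (b->deferred > 0)
		b->deferred--;
}

/* Unlocked variant: takes the lock around the locked variant. */
static void
undefer(struct bucket *b)
{
	toy_lock_acquire(&b->lock);
	undefer_locked(b);
	toy_lock_release(&b->lock);
}

/* A caller that already holds the lock uses the locked variant. */
static void
update_state(struct bucket *b)
{
	toy_lock_acquire(&b->lock);
	undefer_locked(b);	/* undefer() here would take the lock twice */
	toy_lock_release(&b->lock);
}
```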
Vitaliy Gusev [Fri, 17 Mar 2023 09:17:22 +0000 (10:17 +0100)]
vmm: fix missing ipi statistic
ipi counters are missing in bhyvectl's output because vm_maxcpu is 0
when initializing them. That's because vmm_stat_register is executed
before vmm_init.
Instead of directly fixing it, there's a better solution in illumos
which is cherry picked:
https://github.com/illumos/illumos-gate/commit/65a3bc83734e5fb0fc2c19df3e5112b87dcdc3f8
It replaces the matrix statistic by two counters per vcpu. One for
counting the ipis to the vcpu and one counting the ipis received by the
vcpu. This has several advantages:
- A matrix statistic becomes huge when using many vcpus.
- A matrix statistic easily reaches the MAX_VMM_STAT_ELEMS limit.
- Two counters are enough in most cases. DTrace can be used for more
advanced debugging purposes.
- A matrix statistic wastes memory. The matrix size is determined by
vm_maxcpu regardless of the number of vcpus assigned to the vm.
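The size argument can be made concrete with a small sketch. It assumes that
one counter tracks the IPIs a vCPU sends and the other the IPIs it receives;
NVCPU, ipi_stats and the helper functions are hypothetical and not taken
from vmm.

```c
#include <stddef.h>
#include <stdint.h>

#define	NVCPU	4	/* hypothetical vCPU count for the example */

/* Two counters per vCPU instead of an NVCPU x NVCPU matrix. */
struct ipi_stats {
	uint64_t sent[NVCPU];		/* IPIs sent by each vCPU */
	uint64_t received[NVCPU];	/* IPIs received by each vCPU */
};

/* Record one IPI from vCPU src to vCPU dst. */
static void
ipi_count(struct ipi_stats *st, int src, int dst)
{
	st->sent[src]++;
	st->received[dst]++;
}

/* Number of elements a full source x destination matrix needs. */
static size_t
matrix_elems(size_t ncpu)
{
	return (ncpu * ncpu);
}

/* Number of elements two per-vCPU counters need. */
static size_t
counter_elems(size_t ncpu)
{
	return (2 * ncpu);
}
```

The matrix grows quadratically with the vCPU count while the counters grow
linearly, which is why the matrix reaches an element limit so much sooner.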
Corvin Köhne [Wed, 11 Aug 2021 08:04:36 +0000 (10:04 +0200)]
bhyve: add helper for adding fwcfg files
Fwcfg items without a fixed index are reported by the file_dir. They
have an index of 0x20 and above. This helper simplifies the addition of
such fwcfg items. It selects a new free index, assigns it to the fwcfg
item and creates a proper entry in the file_dir.
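A rough sketch of the index selection, assuming a fixed-size directory in
which the n-th file simply receives index 0x20 + n. The structure layout,
the capacity and the name length are hypothetical and do not reproduce
bhyve's fwcfg code; only the 0x20 base comes from the text above.

```c
#include <stdint.h>
#include <string.h>

#define	FILE_FIRST	0x20	/* first index available to files */
#define	FILE_MAX	8	/* hypothetical directory capacity */
#define	FILE_NAME_LEN	56	/* hypothetical name field size */

struct file_entry {
	uint16_t index;
	uint32_t size;
	char name[FILE_NAME_LEN];
};

struct file_dir {
	uint32_t count;
	struct file_entry files[FILE_MAX];
};

/*
 * Hypothetical helper: pick the next free index, record the file in the
 * directory and return the index, or -1 if the file does not fit.
 */
static int
file_dir_add(struct file_dir *dir, const char *name, uint32_t size)
{
	struct file_entry *fe;

	if (dir->count >= FILE_MAX || strlen(name) >= FILE_NAME_LEN)
		return (-1);
	fe = &dir->files[dir->count];
	fe->index = (uint16_t)(FILE_FIRST + dir->count);
	fe->size = size;
	strcpy(fe->name, name);	/* length was checked above */
	dir->count++;
	return (fe->index);
}
```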
Mitchell Horne [Wed, 15 Mar 2023 15:26:57 +0000 (12:26 -0300)]
arm64: limit EFI excluded regions to physical memory types
Consolidate add_efi_map_entry() and exclude_efi_map_entry() into a
single function, handle_efi_map_entry(), so that the exact set of entry
types handled is the same in the addition or exclusion cases. Before,
exclude_efi_map_entry() had a 'default' case that would exclude all
entry types that were not listed explicitly in the switch statement.
Logically, we do not need to exclude a range that could not possibly be
added to physmem, and we do not need to exclude bus ranges that are not
physical memory, for example EFI_MD_TYPE_IOMEM.
Since physmem's ram0 device will reserve bus memory resources for its
owned ranges, this was preventing attachment of the watchdog device on
the RPI4B. For some reason its region of memory-mapped I/O appeared in
the EFI memory map (with the aforementioned EFI_MD_TYPE_IOMEM type).
This change fixes the attachment issue, as we prevent the physmem API
from messing with this range of bus space.
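The idea of one handler for both passes can be sketched as follows. The
type names and the exact partition of types are assumptions made for
illustration and do not mirror the real EFI_MD_TYPE_* handling; what matters
is that both passes share one switch and that unlisted types fall through to
a default that does nothing.

```c
/* Hypothetical memory descriptor types, loosely modeled on an EFI map. */
enum md_type {
	MD_FREE,
	MD_BOOT_DATA,
	MD_RUNTIME_DATA,
	MD_IOMEM
};

enum md_action {
	MD_ADD,
	MD_EXCLUDE
};

/* Counts stand in for calls into a physmem-like API. */
struct physmap {
	int added;
	int excluded;
};

/* One function serves both passes, so both see the same set of types. */
static void
handle_map_entry(struct physmap *pm, enum md_type type, enum md_action act)
{
	switch (type) {
	case MD_FREE:
	case MD_BOOT_DATA:
		/* Usable RAM: only the add pass acts on it. */
		if (act == MD_ADD)
			pm->added++;
		break;
	case MD_RUNTIME_DATA:
		/* Physical memory that must not be allocated from. */
		if (act == MD_EXCLUDE)
			pm->excluded++;
		break;
	default:
		/* Not physical memory (e.g. MMIO): ignored in both passes. */
		break;
	}
}
```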
Warner Losh [Wed, 30 Nov 2022 23:28:01 +0000 (16:28 -0700)]
arm64/machdep: Add parameter to the EFI table walking code
It would be nice to be able to pass an arbitrary pointer to the callback
code. Add one, and pass NULL in all the places that we do that today.
As noted by andrew@, we should likely refactor this into MI code and use
it here and on amd64, but that is for the future.
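The extra parameter follows the usual C idiom of an opaque pointer handed
through to the callback. A minimal sketch, with hypothetical names that are
not the arm64 machdep functions:

```c
#include <stddef.h>

/* Hypothetical table entry. */
struct map_entry {
	int type;
	unsigned long pages;
};

/* The callback receives the entry plus an opaque, caller-chosen pointer. */
typedef void (*map_entry_cb)(const struct map_entry *, void *argp);

/* Walk the table and pass argp through to every callback invocation. */
static void
foreach_map_entry(const struct map_entry *map, size_t n, map_entry_cb cb,
    void *argp)
{
	for (size_t i = 0; i < n; i++)
		cb(&map[i], argp);
}

/* Example callback: argp points at a running page total. */
static void
count_pages(const struct map_entry *e, void *argp)
{
	unsigned long *total = argp;

	*total += e->pages;
}
```

Callbacks that need no state simply ignore argp, and their callers pass
NULL.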
Warner Losh [Wed, 24 Aug 2022 12:27:01 +0000 (06:27 -0600)]
arm64: Remove unused typedef
We don't use EFI_MEMORY_DESCRIPTOR that's typedef'd here. We use the one
from sys/efi.h instead. Remove the clutter here as these two are subtly
different (though wind up with the same layout due to alignment rules).
Mark Johnston [Wed, 10 Nov 2021 21:57:12 +0000 (16:57 -0500)]
mbuf: Fix an offset calculation in m_apply_extpg_one()
We were not including the requested starting offset in the page offset.
Reviewed by: jhb
Fixes: 3c7a01d773ac ("Extend m_apply() to support unmapped mbufs.")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32922
Ed Maste [Wed, 23 Nov 2022 17:14:18 +0000 (12:14 -0500)]
csh: install hard link with same mode as target
Previously when using NO_ROOT we recorded METALOG entries for the /.cshrc
hard link with a different file mode than the link target, which is not
permitted.
We cannot just set LINKMODE here as it would also apply to the hard link
for the tcsh binary.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37499
Ed Maste [Wed, 23 Nov 2022 15:34:58 +0000 (10:34 -0500)]
pam.d: install hard link with same mode as target
Previously when using NO_ROOT we recorded a METALOG entry for the
pam.d/ftp hard link with a different file mode than the link target
pam.d/ftpd, which is not permitted.
This change is similar to 1dbb9994d4dd for .profile
Ed Maste [Wed, 23 Nov 2022 15:44:41 +0000 (10:44 -0500)]
dwatch: install hard links with same mode as target
Previously when using NO_ROOT we recorded METALOG entries for dwatch
hard links with different file modes than their link targets, which is
not permitted.
Reviewed by: bapt
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37477
Ed Maste [Wed, 23 Nov 2022 15:20:49 +0000 (10:20 -0500)]
sh: install hard link with same mode as target
Previously when using NO_ROOT we recorded a METALOG entry for the
/.profile hard link with a different mode than the link target, which is
not permitted.
Reviewed by: bapt
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37476
Ed Maste [Wed, 16 Nov 2022 21:24:19 +0000 (16:24 -0500)]
CI: Run pkgbase METALOG lint script
tools/pkgbase/metalog_reader.lua checks for errors in METALOG (for
pkgbase staging), such as hard links with differing modes, duplicate
entries, etc. Run it as part of the Cirrus-CI job to prevent
regressions.
Reviewed by: manu, imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37521
Warner Losh [Wed, 22 Mar 2023 02:25:58 +0000 (20:25 -0600)]
_endian.h: Include sys/cdefs.h for visibility macros
BYTE_ORDER, LITTLE_ENDIAN and BIG_ENDIAN will be required by the
forthcoming POSIX Issue 8. In addition, they are provided in the BSD
compilation environments. However, depending on the order in which
includes happen, sys/cdefs.h may or may not be included when endian.h is
included. Include it here so we can safely test __BSD_VISIBLE. Add
visibility when we're compiling in the future for issue 8, but since the
date number for issue 8 hasn't been fixed, use strictly greater than the
issue 7 date of 200809.
The missing include had the side effect of sometimes (in the traditional
BSD compilation environment)
#if BYTE_ORDER == LITTLE_ENDIAN
and
#if BYTE_ORDER == BIG_ENDIAN
both being true because none of these were defined. This fixes
that. It also fixes including it after <stdio.h> but not before.
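Both tests can be true at once because the preprocessor replaces any
identifier that is not a defined macro with 0 inside #if, so each comparison
becomes 0 == 0. The sketch below reproduces this with DEMO_-prefixed names,
which are hypothetical and deliberately left undefined.

```c
/*
 * DEMO_BYTE_ORDER, DEMO_LITTLE_ENDIAN and DEMO_BIG_ENDIAN are not defined
 * anywhere, so each evaluates to 0 in the #if expressions below.
 */
#if DEMO_BYTE_ORDER == DEMO_LITTLE_ENDIAN
#define	DEMO_LOOKS_LITTLE	1
#else
#define	DEMO_LOOKS_LITTLE	0
#endif

#if DEMO_BYTE_ORDER == DEMO_BIG_ENDIAN
#define	DEMO_LOOKS_BIG	1
#else
#define	DEMO_LOOKS_BIG	0
#endif

/* Returns 1 when both tests were taken, which is the failure described. */
static int
both_branches_taken(void)
{
	return (DEMO_LOOKS_LITTLE && DEMO_LOOKS_BIG);
}
```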
John Baldwin [Mon, 7 Feb 2022 20:47:51 +0000 (12:47 -0800)]
Stop adding -Wredundant-decls to CWARNFLAGS.
clang doesn't implement it, and Linux doesn't enforce it. As a
result, new instances keep cropping up both in FreeBSD's code and in
upstream sources from vendors.
Mark Johnston [Mon, 20 Mar 2023 18:16:00 +0000 (14:16 -0400)]
kerneldump: Inline dump_savectx() into its callers
The callers of dump_savectx() (i.e., doadump() and livedump_start())
subsequently call dumpsys()/minidumpsys(), which dump the calling
thread's stack when writing the dump. If dump_savectx() gets its own
stack frame, that frame might be clobbered when its caller later calls
dumpsys()/minidumpsys(), making it difficult for debuggers to unwind the
stack.
Fix this by making dump_savectx() a macro, so that savectx() is always
called directly by the function which subsequently calls
dumpsys()/minidumpsys().
This fixes stack unwinding for the panicking thread from arm64
minidumps. The same happened to work on amd64, but kgdb reports the
dump_savectx() calls as coming from dumpsys(), so in that case it
appears to work by accident.
Fixes: c9114f9f86f9 ("Add new vnode dumper to support live minidumps")
Reviewed by: mhorne, jhb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39151
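The difference between a function and a macro can be illustrated in
userland by using __func__ as a proxy for "which frame did the capture
happen in". All names below are hypothetical; the real savectx() saves
register state rather than a function name.

```c
/* Hypothetical context: remembers where it was captured. */
struct ctx {
	const char *saved_in;
};

/* Stand-in for the real context-saving primitive. */
static void
toy_savectx(struct ctx *c, const char *where)
{
	c->saved_in = where;
}

/* As a function: the capture happens in save_fn()'s own frame. */
static void
save_fn(struct ctx *c)
{
	toy_savectx(c, __func__);
}

/* As a macro: the body expands in place, inside the caller's frame. */
#define	SAVE_MACRO(c)	toy_savectx((c), __func__)

static const char *
dump_with_fn(struct ctx *c)
{
	save_fn(c);
	return (c->saved_in);
}

static const char *
dump_with_macro(struct ctx *c)
{
	SAVE_MACRO(c);
	return (c->saved_in);
}
```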
Ed Maste [Mon, 13 Mar 2023 20:51:51 +0000 (16:51 -0400)]
makefs: do not call brelse if bread returns an error
If bread returns an error there is no bp to brelse. One of these
changes was taken from NetBSD commit 0a62dad69f62 ("This works well
enough to populate..."), the rest were found by looking for the same
pattern.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39069
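The corrected pattern can be sketched as below. toy_bread() and
toy_brelse() are hypothetical userland stand-ins, not the makefs buffer
routines; the live_bufs counter exists only so the example can show that
nothing is released on the error path and nothing is leaked on success.

```c
#include <errno.h>
#include <stdlib.h>

struct buf {
	int blkno;
};

static int live_bufs;	/* buffers currently handed out */

/* Stand-in for bread(): on error, no buffer is returned to the caller. */
static int
toy_bread(int blkno, struct buf **bpp)
{
	*bpp = NULL;
	if (blkno < 0)
		return (EINVAL);
	*bpp = malloc(sizeof(**bpp));
	if (*bpp == NULL)
		return (ENOMEM);
	(*bpp)->blkno = blkno;
	live_bufs++;
	return (0);
}

/* Stand-in for brelse(): release a buffer obtained from toy_bread(). */
static void
toy_brelse(struct buf *bp)
{
	live_bufs--;
	free(bp);
}

/* Read a block and release it again; returns 0 or an errno value. */
static int
read_block(int blkno)
{
	struct buf *bp;
	int error;

	error = toy_bread(blkno, &bp);
	if (error != 0)
		return (error);	/* no release: there is no buffer */
	/* ... the buffer contents would be used here ... */
	toy_brelse(bp);
	return (0);
}
```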
Mark Johnston [Mon, 6 Mar 2023 20:06:00 +0000 (15:06 -0500)]
netinet: Tighten checks for unspecified source addresses
The assertions added in commit b0ccf53f2455 ("inpcb: Assert against
wildcard addrs in in_pcblookup_hash_locked()") revealed that protocol
layers may pass the unspecified address to in_pcblookup().
Add some checks to filter out such packets before we attempt an inpcb
lookup:
- Disallow the use of an unspecified source address in in_pcbladdr() and
in6_pcbladdr().
- Disallow IP packets with an unspecified destination address.
- Disallow TCP packets with an unspecified source address, and add an
assertion to verify the comment claiming that the case of an
unspecified destination address is handled by the IP layer.
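In userland terms, the kind of check being added looks roughly like this.
The function names are hypothetical; INADDR_ANY and
IN6_IS_ADDR_UNSPECIFIED() are the standard definitions from <netinet/in.h>.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/* Reject the IPv4 unspecified address (0.0.0.0) as a source. */
static bool
in4_src_ok(struct in_addr src)
{
	return (src.s_addr != htonl(INADDR_ANY));
}

/* Reject the IPv6 unspecified address (::) as a source. */
static bool
in6_src_ok(const struct in6_addr *src)
{
	return (!IN6_IS_ADDR_UNSPECIFIED(src));
}
```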
Mark Johnston [Fri, 10 Mar 2023 22:06:46 +0000 (17:06 -0500)]
netbsd-tests: Remove some pointless sleeps from message queue tests
- In the msgctl tests, there is no point in sleeping after a fork().
Just block immediately in wait().
- In non-blocking send/recv tests, just wait for the child to exit once
it's reached a message limit. If a bug prevents the child from
exiting promptly, the test will time out.
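The fork-then-wait pattern without a sleep can be sketched as follows. The
run_child() helper and the two child functions are hypothetical and are not
taken from the test suite; the parent blocks in waitpid(2) right away, which
returns as soon as the child exits.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run fn in a child process and block in waitpid() immediately.
 * Returns the child's exit status, or -1 on failure.
 */
static int
run_child(int (*fn)(void))
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1)
		return (-1);
	if (pid == 0)
		_exit(fn());
	if (waitpid(pid, &status, 0) != pid)
		return (-1);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}

/* Example children with distinct exit codes. */
static int
child_ok(void)
{
	return (0);
}

static int
child_fail(void)
{
	return (3);
}
```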
Kristof Provost [Sun, 12 Mar 2023 15:08:31 +0000 (16:08 +0100)]
pf tests: test IPv6 fragmentation with link-local addresses
We've observed a panic after pf_refragment6() with link-local addresses,
because pf_refragment6() calls ip6_forward() even for a simple output
case.
That results in us entering ip6_forward() with an mbuf with a NULL
m->m_pkthdr.rcvif, which can cause a NULL deref (but seemingly not for
GUAs).