Warner Losh [Tue, 25 Oct 2022 05:05:07 +0000 (23:05 -0600)]
subr_physmem: Fix userspace build
Include stdbool.h in userspace configurations. For the kernel builds we
get it from sys/types.h, but bool isn't defined there for non-kernel
builds and this otherwise kernel-only file is used for the physmem test
suite.
x86/include/elf.h: make inclusion blocks for elf32.h and elf64.h similar
They were copy-pasted when x86/include/elf.h file was merged from its
i386 and amd64 counterparts. Having the text around inclusions
significantly different is somewhat confusing.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37085
i386: move hard-coded load address for PIE below default linker base
both for i386 native and compat32 amd64. We know the ld-elf.so.1 size
in advance, it fits there. Trying to push it up after the end of a
binary cannot work reliably and eventually fails for large binaries.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37085
arm, arm64: tweak hard-coded load addresses for PIE binaries
They are used when ASLR is not applied.
The adjustment is needed because rtld direct exec mode puts ld-elf.so.1
at the PIE load address, and this address must not conflict with the
default linker's load address for non-PIE binaries. Otherwise rtld in
direct mode cannot activate the image. An example of such an implicit
failure is ldd(1) refusing to run.
Reported by: kp
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37085
Warner Losh [Thu, 6 Oct 2022 03:55:26 +0000 (21:55 -0600)]
physmem: Add physmem_excluded to query if a region is excluded
In order to safely reuse excluded memory when it's reserved for special
purpose, we need to test whether or not the memory has been reserved
early in boot. physmem_excluded will return true when the entire range
is excluded, false otherwise.
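A minimal sketch of the "entire range is excluded" test described above. The struct excl_region type and the range_is_excluded() helper are hypothetical names for illustration, not the code in subr_physmem.c, and the sketch assumes a range spanning two adjacent excluded regions can be reported as not excluded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one excluded region; not the kernel's own type. */
struct excl_region {
	uint64_t addr;	/* first byte of the region */
	uint64_t size;	/* length in bytes */
};

/*
 * Return true only when [pa, pa + size) lies entirely inside one
 * excluded region. A partial overlap returns false.
 */
static bool
range_is_excluded(const struct excl_region *tbl, size_t n, uint64_t pa,
    uint64_t size)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		uint64_t start = tbl[i].addr;
		uint64_t end = start + tbl[i].size;

		if (pa >= start && pa + size <= end)
			return (true);
	}
	return (false);
}
```

A caller that wants to reuse reserved memory would proceed only when the helper returns true for the whole range it intends to touch.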
Warner Losh [Thu, 6 Oct 2022 02:56:43 +0000 (20:56 -0600)]
efi: Add linux memory reserve table definitions
There is some hardware which can't be completely reset to release the
memory it is using (so far only the GICv3 on arm has fit this
bill). Since that memory needs to be reserved by the OS for that
hardware's later use of it, create defines for code that will parse that
memory table. Otherwise the system may allocate the memory for block I/O,
network packets, etc., which will lead to memory corruption.
When booting via Linux's kexec protocol, it will add this table to the
EFI systbl's cfgtbl array. While the mechanism to pass 'configuration'
is standardized, these specific tables are not documented except in the
Linux source. Include comments gleaned from its study.
Ed Maste [Mon, 24 Oct 2022 18:06:41 +0000 (14:06 -0400)]
build: Use `rm -fv` for BATCH_DELETE_OLD_FILES
It's possible to have files with odd permissions in the tmproot (or
sysroot), causing rm to prompt for each one during e.g. buildworld.
Add -f to forcibly delete these.
Reviewed by: brooks
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37111
Mark Johnston [Tue, 25 Oct 2022 13:16:23 +0000 (09:16 -0400)]
bhyve: Address warnings in blockif_proc()
- Use unsigned types for all arithmetic. Use a new signed variable for
holding the return value of pread() and pwrite().
- Handle short I/O from pwrite().
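One common way to handle short writes is to loop until the whole buffer is written, keeping the pwrite() return value in a signed variable and the running totals in unsigned ones. The sketch below illustrates that pattern; pwrite_all() is a hypothetical helper name and this is not the code in blockif_proc().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes from buf at offset off, retrying after short writes.
 * Returns the number of bytes written, or -1 on error.
 */
static ssize_t
pwrite_all(int fd, const void *buf, size_t len, off_t off)
{
	const char *p = buf;
	size_t done = 0;	/* unsigned running total */

	assert(buf != NULL || len == 0);
	while (done < len) {
		/* Signed variable holds the pwrite() result only. */
		ssize_t n = pwrite(fd, p + done, len - done,
		    off + (off_t)done);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0)
			break;	/* no progress; report what was written */
		done += (size_t)n;
	}
	return ((ssize_t)done);
}
```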
Mark Johnston [Fri, 9 Sep 2022 00:40:02 +0000 (20:40 -0400)]
bhyve: Avoid shadowing global variables in bhyverun.c
- Rename the global cores/sockets/threads to cpu_cores/sockets/threads.
This way, num_vcpus_allowed() doesn't shadow them.
- The global maxcpus is unused, remove it for the same reason.
Kyle Evans [Tue, 23 Aug 2022 02:08:03 +0000 (21:08 -0500)]
split: add some tests
This should cover all of the basic functionality, as well as the recent
enhancement to use a dynamic buffer size rather than limiting patterns
and lines to MAXBSIZE.
Reviewed by: bapt
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D36324
Kyle Evans [Tue, 23 Aug 2022 02:05:58 +0000 (21:05 -0500)]
split: switch to getline() for line/pattern matching
Get rid of split's home-grown logic for growing the buffer; arbitrarily
breaking at LONG_MAX bytes instead of 65536 bytes gives us much more
wiggle room. Additionally, we'll actually fail out entirely if we can't
fit a line, which makes noticing this class of problem much easier.
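For illustration, a small sketch of the getline() pattern: the library grows the buffer as needed, so no fixed line limit is required, and an error is reported instead of silently breaking the line. The longest_line() helper is a hypothetical example, not code from split(1).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Return the length of the longest line in fp, counting the trailing
 * newline when present, or -1 if reading failed.
 */
static ssize_t
longest_line(FILE *fp)
{
	char *line = NULL;	/* getline() allocates and grows this */
	size_t cap = 0;
	ssize_t len, longest = 0;

	assert(fp != NULL);
	while ((len = getline(&line, &cap, fp)) != -1) {
		if (len > longest)
			longest = len;
	}
	free(line);
	return (ferror(fp) ? -1 : longest);
}
```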
Mark Johnston [Mon, 24 Oct 2022 21:31:11 +0000 (17:31 -0400)]
libvmmapi: Provide an interface for limiting rights on the device fd
Currently libvmmapi provides a way to get a list of the allowed ioctls
on the vmm device file, so that bhyve can limit rights on the device
file fd. The interface is rather strange: it allocates a copy of the
list but returns a const pointer, so the caller has to cast away the
const in order to free it without aggravating the compiler.
As far as I can see, there's no reason to make a copy of the array, but
changing vm_get_ioctls() to not do that would break compatibility. So
this change just introduces a better interface: move all rights-limiting
logic into libvmmapi.
Any new operations on the fd should be wrapped by libvmmapi, so also
discourage use of vm_get_device_fd(). Currently bhyve uses it only when
limiting rights on the device fd.
Bjoern A. Zeeb [Mon, 24 Oct 2022 20:54:20 +0000 (20:54 +0000)]
dpaa2: cleanup some include files
2782ed8f6cd3d7f59219a783bc7fa7bbfb1fe26f fixed the standalone module
build. Remove the now duplicate includes for opt_acpi.h and
opt_platform.h. Also remove the if_mdio.h again in both the Makefile
and the implementation file as it is not (currently) used.
Randall Stewart [Mon, 24 Oct 2022 19:47:29 +0000 (15:47 -0400)]
Rack and BBR broken with the new timewait state purge.
We recently got rid of the explicit INP_TIMEWAIT state; this has caused some
minor breakage to both rack and bbr. Basically the timewait check that was
in tcp_lro.c is now gone. This means that compressed_ack and mbuf_queued
packets will arrive at TCP without going through tcp_input_with_port(). We need
to expand the check that was stripped to look at the tcp_state (t_state) and
not "LRO" packets that are in the TCPS_TIMEWAIT state.
Warner Losh [Mon, 24 Oct 2022 18:12:17 +0000 (12:12 -0600)]
config.mk: All options in DEFAULTS are now defined in opt_global.h
To simplify management of all the options that should be enabled for the
different architectures, adopt the convention that all options listed in
DEFAULTS will be #defined to 1 in opt_global.h for untied builds. Except
for GEOM_* and ISAPNP, they are all in opt_global.h. ISAPNP is in
opt_dontuse.h, so only filter GEOM_*.
Warner Losh [Mon, 24 Oct 2022 18:12:09 +0000 (12:12 -0600)]
config: Make ISAPNP be in opt_dontuse.h
Nothing uses ISAPNP today, apart from bringing in files or not. There's
really no need to ever do #ifdef ISAPNP in drivers and such. It means
use the ISA bus plug and play isolation protocol to enumerate the bus,
not the more useful 'you might have devices with isa pnp ids' which all
drivers hide behind DEV_ISA and/or an isa clause in the files files.
- Add a zfs_exit() call in an error path, otherwise a lock is
leaked.
- Remove the fid_gen > 1 check. That appears to be Linux-specific:
zfsctl_snapdir_fid() sets fid_gen to 0 or 1 depending on whether
the snapshot directory is mounted. On FreeBSD it fails, making
snapshot dirs inaccessible via NFS.
New driver for the ACPI generic event device, defined in the ACPI spec.
Some ACPI power buttons may not work without this.
In qemu arm64 with "virt" machine, with ACPI firmware,
enable devd check devd message by
and invoke following command in qemu monitor
(qemu) system_powerdown
and make sure a power button input event appears.
(Setting sysctl hw.acpi.power_button_state=S5 does not work,
because the ACPI tree does not have a \_S5 object.)
Kristof Provost [Fri, 14 Oct 2022 05:57:33 +0000 (07:57 +0200)]
bridge: default to not filtering L3
Change the default for net.link.bridge.pfil_member and
net.link.bridge.pfil_bridge to zero.
That is, default to not calling layer 3 firewalls on the bridge or its
member interfaces.
With either of these enabled the bridge will, during L2 processing,
remove the Ethernet header from packets, feed them to L3 firewalls,
re-add the Ethernet header and send them out.
Not only does this interact very poorly with firewalls which defer
packets, or reassemble and refragment IPv6, it also causes considerable
confusion for users, because the firewall gets called in unexpected
ways.
For example, a bridge which contains a bhyve tap and the host's LAN
interface. We'd expect traffic between the LAN and bhyve VM to pass, no
matter what (layer 3) firewall rules are set on the host. That's not the
case as long as pfil_bridge or pfil_member are set.
Reviewed by: Zhenlei Huang
MFC: never
Differential Revision: https://reviews.freebsd.org/D37009
Kristof Provost [Mon, 17 Oct 2022 09:06:34 +0000 (11:06 +0200)]
if_ovpn: add sysctls for netisr_queue() and crypto_dispatch_async()
Allow the choice between asynchronous and synchronous netisr and crypto
calls. These have performance implications, but depend on the specific
setup and OCF back-end.
Bjoern A. Zeeb [Sun, 23 Oct 2022 21:48:22 +0000 (21:48 +0000)]
LinuxKPI: 802.11: add MO tracing
Add a macro to each implemented mac80211 operation. This currently
turns into a printf if LINUXKPI_80211_DEBUG is defined but in the
future could become a different probe as well.
This is helpful for quick analysis and boot-time problem debugging
when DTrace and other frameworks may be harder to use.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
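A rough sketch of a trace macro that expands to a printf only when LINUXKPI_80211_DEBUG is defined. The TRACE_MO and mo_start names are hypothetical stand-ins; the macro and operations in the LinuxKPI 802.11 code may look different.

```c
#include <assert.h>
#include <stdio.h>

#ifdef LINUXKPI_80211_DEBUG
/* Debug build: print the calling function and line, then the message. */
#define	TRACE_MO(...) do {					\
	printf("TRACE_MO %s:%d: ", __func__, __LINE__);	\
	printf(__VA_ARGS__);					\
	printf("\n");						\
} while (0)
#else
/* Non-debug build: the trace point compiles to nothing. */
#define	TRACE_MO(...) do { } while (0)
#endif

/* Hypothetical mac80211-style operation wrapper with a trace point. */
static int
mo_start(int hw_id)
{
	TRACE_MO("hw %d", hw_id);
	return (hw_id >= 0 ? 0 : -1);
}
```

Because the macro is the only place that knows about printf, it could later be redirected to a different probe without touching the call sites.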
Vitaliy Gusev [Sun, 23 Oct 2022 18:47:56 +0000 (14:47 -0400)]
bhyve: Handle snapshots of unconfigured virtio-net devices
If the device was reset or never configured, features_negotiated is not
set; calling pci_vtnet_neg_features is then wrong and resume crashes with
a "Segmentation fault".
Mark Johnston [Sat, 22 Oct 2022 17:41:33 +0000 (13:41 -0400)]
bhyve: Fix some warnings in the snapshot code
- Qualify unexported symbols with "static".
- Drop some unnecessary and incorrect casts.
- Avoid arithmetic on void pointers.
- Avoid signed/unsigned comparisons in loops which use nitems() as a
bound.
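The kinds of fixes listed above can be sketched as follows. The table and helper names are hypothetical, and nitems() is defined locally because it normally comes from FreeBSD's sys/param.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef nitems
#define	nitems(x)	(sizeof((x)) / sizeof((x)[0]))
#endif

/* Unexported table, so it is qualified with "static". */
static const uint32_t section_sizes[] = { 4, 8, 16 };

/* Use size_t for the index so the comparison with nitems() is unsigned. */
static size_t
total_size(void)
{
	size_t sum = 0;

	for (size_t i = 0; i < nitems(section_sizes); i++)
		sum += section_sizes[i];
	return (sum);
}

/* Do the arithmetic on a byte pointer instead of on void *. */
static void *
advance(void *buf, size_t off)
{
	assert(buf != NULL);
	return ((uint8_t *)buf + off);
}
```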
Mark Johnston [Sat, 22 Oct 2022 17:35:40 +0000 (13:35 -0400)]
bhyve: Fix some warnings in the ps2 emulation code
- Include headers containing prototypes for exported functions.
- Initialize all fields of the extended translation table.
- Qualify an unexported translation table as static.
- Fix error handling for a read(2).
- Fix some style bugs.
List of changes:
- Use integer multiplication instead of long multiplication, because the result is an integer.
- Remove multiple if-statements and predict new if-statements.
- Rename local variable name, "ticks" into "retval" to avoid shadowing
the system "ticks" global variable.
Warner Losh [Sun, 23 Oct 2022 01:09:10 +0000 (19:09 -0600)]
stand/efi: Call md_copymodules based on __LP64__ to fix 32-bit arm
When I refactored everything, I neglected to pass in the proper is64
value on 32-bit platforms. This corrects that. This prevented armv7 and
armv6 platforms from booting due to misaligned data in the kernel. The
only platform where we support 32-bit booting is armv[67], which I
apparently neglected to test before committing my refactoring.
Warner Losh [Sat, 22 Oct 2022 15:09:23 +0000 (09:09 -0600)]
stand/kboot: hostdisk isn't a DEVT_DISK, use a different value.
We assume in all the code that a DEVT_DISK uses common/disk.c and/or
common/part.c and we can access a struct disk_devdesc. hostdisk.c
opens raw devices directly, so has no such structures. Define a
kboot-specific DEVT_HOSTDISK and use that instead.
In addition, disk_fmtdev assumes it is working with a struct
disk_devdesc, so write hostdisk_fmtdev as well.
Bjoern A. Zeeb [Sat, 22 Oct 2022 17:40:17 +0000 (17:40 +0000)]
iwlwifi: prepare to support debugfs
Import two files left out initially from the driver needed for debugfs
support [1]. Adjust the driver further to make it compile on FreeBSD.
This is currently turned off and needs more LinuxKPI/lindebugfs work.
Being in the tree will allow us to collaboratively work on it and
then we can enable it for good.
Colin Percival [Fri, 21 Oct 2022 18:13:36 +0000 (11:13 -0700)]
x86/busdma: Limit reserved pages if low nsegs
When bus_dmamap_create is called, if bouncing might be required we
reserve enough pages for a maximum-length request, subject to the
MAX_BPAGES constraint (32 MB on amd64; 32 MB or 2 MB on i386
depending on the amount of RAM).
Since pages used for bouncing are typically non-consecutive, each
bounced page will typically constitute a busdma segment; as such, we
are unlikely to ever successfully use more pages than the nsegments
limit. Limit the number of pages reserved to nsegments.
On FreeBSD/Firecracker, this reduces bounce page memory consumption
from 32 MB to 512 kB, making VMs with 128 MB of RAM usable.
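As a worked example of the limit, the sketch below clamps the reservation first to the MAX_BPAGES cap and then to nsegments. The function name, the 4 KB page size, and the nsegments value of 128 are assumptions chosen because 128 pages of 4 KB match the 512 kB figure; the commit does not state them.

```c
#include <assert.h>
#include <stddef.h>

#define	SKETCH_PAGE_SIZE	4096u	/* assumed page size */

/*
 * Pages to reserve for bouncing: enough for a maximum-length request,
 * capped by max_bpages and then by the segment limit.
 */
static size_t
bounce_pages_to_reserve(size_t maxsize, size_t max_bpages, size_t nsegments)
{
	size_t pages;

	assert(nsegments > 0);
	pages = (maxsize + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	if (pages > max_bpages)
		pages = max_bpages;
	if (pages > nsegments)
		pages = nsegments;	/* the limit added by this change */
	return (pages);
}
```

With a 32 MB cap (8192 pages) and 128 segments, the result is 128 pages, or 512 kB; without the nsegments clamp it would be the full 8192 pages.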
Alan Somers [Wed, 12 Oct 2022 22:44:09 +0000 (16:44 -0600)]
ctld: if adding a target fails, retry it on the next reload
If the admin creates more CTL ports than kern.cam.ctl.max_ports, then
adding some will fail. If he then removes some ports and does
"service ctld reload", he would expect that the new ports would get
added in the newly-freed port space. But they don't, because ctld
assigned them port numbers during their first creation attempts.
Fix this bug by removing newly created ports from ctld's internal list
if the kernel rejects them for any reason. That way, a subsequent
config reload will attempt to add them again, possibly with new port
numbers.
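The idea of dropping rejected ports so a later reload retries them can be sketched with a plain linked list. The struct port type and both helpers are hypothetical; ctld's real data structures are not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical port record kept in the daemon's internal list. */
struct port {
	int id;
	bool accepted;		/* false if the kernel rejected it */
	struct port *next;
};

/* Push a new port on the front of the list; returns NULL on failure. */
static struct port *
port_add(struct port **head, int id, bool accepted)
{
	struct port *p = malloc(sizeof(*p));

	if (p == NULL)
		return (NULL);
	p->id = id;
	p->accepted = accepted;
	p->next = *head;
	*head = p;
	return (p);
}

/*
 * Remove every rejected port so that a later reload sees it as new
 * and tries to add it again. Returns the number of ports removed.
 */
static size_t
prune_rejected(struct port **head)
{
	size_t removed = 0;
	struct port **pp = head;

	assert(head != NULL);
	while (*pp != NULL) {
		struct port *p = *pp;

		if (!p->accepted) {
			*pp = p->next;
			free(p);
			removed++;
		} else
			pp = &p->next;
	}
	return (removed);
}
```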
Warner Losh [Fri, 21 Oct 2022 23:39:34 +0000 (17:39 -0600)]
stand/geli: Bail out if you can't get the disk's size
If the DIOCGMEDIASIZE ioctl fails, assume the disk doesn't have geli
encryption. While all disks should implement this, fail safe for disks /
partitions that do not.
Kirk McKusick [Fri, 21 Oct 2022 18:00:00 +0000 (11:00 -0700)]
Increase the maximum size of the journaled soft-updates journal.
The size of the journaled soft-updates journal should be big enough
to hold two minutes of filesystem metadata-update activity. The
maximum size of the soft updates journal was set in the 1990s. At
the time it was assumed that disk arrays would top out at 16 drives
and disk writes per drive would top out at 500 per second. Today's
I/O subsystems are considerably bigger and faster than those limits.
Thus this delta removes the hard upper limit and lets tunefs(8) and
newfs(8) set the upper bound based on the size of the filesystem and
its cylinder groups.
Kirk McKusick [Fri, 21 Oct 2022 17:56:20 +0000 (10:56 -0700)]
Add a description of soft updates journaling to newfs(8).
Add a description to the newfs(8) -j (journal enablement) flag
that explains what soft updates journaling does, the tradeoffs
to using it, and the limitations that it imposes. Copied from
the description in tunefs(8).