Ed Maste [Tue, 2 May 2023 20:57:12 +0000 (16:57 -0400)]
cpuset: increase userland maximum size to 1024
Hardware with more than 256 CPU cores is now available and will become
increasingly common. Bump CPU_MAXSIZE (used for userland cpuset_t
sizing) to 1024 to define the ABI for FreeBSD 14.
This change is reapplied after a change to decouple cpuset from bhyve:
commit e17eca327633 ("vmm: Avoid embedding cpuset_t ioctl ABIs").
PR: 269572, 271213 [exp-run]
Reviewed by: mjg, jhb
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39941
Randall Stewart [Wed, 24 May 2023 10:35:36 +0000 (06:35 -0400)]
tcp: request tracking is not http specific.
This change is a name change only. TCP Request tracking can track sendfile and even non-sendfile requests. The
names however in the current code use http, and they should not. The feature is not http specific. Lets change the
name so they more properly reflect whats going on. This also fixes conflicts with http_req which caused application pain.
Mitchell Horne [Wed, 24 May 2023 13:27:55 +0000 (10:27 -0300)]
ofw_cpu: quiet secondary CPU devices
We already do plenty to announce the different CPUs in dmesg. Follow the
ACPI CPU strategy of reporting the first CPU device, but quieting the
rest for non-verbose boot. This cuts down slightly on dmesg output.
Reviewed by: manu, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40243
Andrew Turner [Thu, 4 May 2023 10:30:57 +0000 (11:30 +0100)]
Add more arm64 special registers
These will be used by bhyve
Reviewed by: markj
Sponsored by: Arm Ltd
Sponsored by: Innovate UK
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40128
Introduce macro for PHYS_TO_PTE, setting the groundwork for future
support of various Arm VMSA extensions.
For extensions such as 52-bit VA/PA (FEAT_LPA2), the representation of
an address between a PTE and PA are not equivalent. This macro will
allow converting between the different representations.
Currently PHYS_TO_PTE is a NOP. Replace all instances where we go from
PA to PTE with new PHYS_TO_PTE macro.
Reviewed by: markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D39828
Introduce macro for PTE_TO_PHYS, setting the groundwork for future
support of various Arm VMSA extensions.
For extensions such as 52-bit VA/PA (FEAT_LPA2), the representation of
an address between a PTE and PA are not equivalent. This macro will
allow converting between the different representations.
Currently going from PTE to PA is achieved by masking off the upper and
lower attributes. Retain this behaviour but replace all instances with
the new macro instead.
Mark Johnston [Wed, 24 May 2023 01:14:36 +0000 (21:14 -0400)]
bhyve: Remove the exitcode stats structure
This structure isn't used for anything, and only counts a subset of
vmexit types. Moreover, it is not accurate since there is no
synchronization between vcpu threads. Simply remove it.
No functional change intended.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40245
Mark Johnston [Wed, 24 May 2023 01:13:33 +0000 (21:13 -0400)]
vmm: Avoid embedding cpuset_t ioctl ABIs
Commit 0bda8d3e9f7a ("vmm: permit some IPIs to be handled by userspace")
embedded cpuset_t into the vmm(4) ioctl ABI. This was a mistake since
we otherwise have some leeway to change the cpuset_t for the whole
system, but we want to keep the vmm ioctl ABI stable.
Rework IPI reporting to avoid this problem. Along the way, make VM_RUN
a bit more efficient:
- Split vmexit metadata out of the main VM_RUN structure. This data is
only written by the kernel.
- Have userspace pass a cpuset_t pointer and cpusetsize in the VM_RUN
structure, as is done for cpuset syscalls.
- Have the destination CPU mask for VM_EXITCODE_IPIs live outside the
vmexit info structure, and make VM_RUN copy it out separately. Zero
out any extra bytes in the CPU mask, like cpuset syscalls do.
- Modify the vmexit handler prototype to take a full VM_RUN structure.
PR: 271330
Reviewed by: corvink, jhb (previous versions)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40113
Yan Ka Chiu [Tue, 23 May 2023 20:39:22 +0000 (16:39 -0400)]
ifconfig(8): Teach ifconfig to attach and run itself in a jail
Add -j <jail> flag to ifconfig to allow ifconfig to attach and run inside a
jail. This allow parent to configure network interfaces of its children
even if ifconfig is not available in child's tree (e.g. Linux Jails)
Dimitry Andric [Tue, 23 May 2023 17:40:36 +0000 (19:40 +0200)]
Update -ftrivial-auto-var-init flags for clang >= 16
As of clang 16, the -ftrivial-auto-var-init=zero option no longer needs
-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
to enable the option. Only add it for older clang versions.
Mark Johnston [Tue, 23 May 2023 14:14:06 +0000 (10:14 -0400)]
md: Get rid of the pbuf zone
The zone is used solely to provide KVA for mapping BIOs so that we can
pass mapped buffers to VOP_READ and VOP_WRITE. Currently we preallocate
nswbuf/10 bufs for this purpose during boot.
The intent was to limit KVA usage on 32-bit systems, but the
preallocation means that we in fact consumed more KVA than needed unless
one has more than nswbuf/10 (typically 25) vnode-backed MD devices
in existence, which I would argue is the uncommon case.
Meanwhile, all I/O to an MD is handled by a dedicated thread, so we can
instead simply preallocate the KVA region at MD device creation time.
Architectures that are not included in the #ifdef won't be able to
compile libdtrace. This was tested on an ARM64 build. If the ifdef is
removed, libdtrace can be compiled with no problems, otherwise it fails
at libdtrace.
Reviewed by: markj
Approved by: markj (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39948
dtrace(1): add -d flag to dump D script post-dt_sugar
By specifying the -d flag, libdtrace will dump the D script after it has
applied syntactical sugar transformations (e.g if/else). This is useful
for both understanding what dt_sugar does, as well as debugging it.
Reviewed by: markj
Approved by: markj (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38732
The current implementation of KINST_TRAMP_INIT is working only on amd64,
where the breakpoint instruction is one byte long, which might not be
the case for other architectures (e.g in RISC-V it's either 2 or 4
bytes). This patch introduces two machine-dependent constants,
KINST_TRAMP_FILL_PATTERN and KINST_TRAMP_FILL_SIZE, which hold the fill
instruction and the size of that instruction in bytes respectively.
Reviewed by: markj
Approved by: markj (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39504
Mitchell Horne [Mon, 22 May 2023 23:54:36 +0000 (20:54 -0300)]
riscv: Print less CPU info
Change the reporting strategy to more closely follow what arm64
implements:
- Always print the one-line CPU summary when a core comes online
- Only print the additional fields (e.g. ISA) when they differ from the
CPU before it
In the common case of identical CPUs this results in informative but
non-repetitive output. For example, in QEMU:
CPU 0 : Vendor=Unspecified Core=Unknown (Hart 0)
marchid=0x80032, mimpid=0x80032
MMU: 0x7<Sv39,Sv48,Sv57>
ISA: 0x112d<Atomic,Compressed,Double,Float,Mult/Div>
real memory = 8589934592 (8192 MB)
avail memory = 8332300288 (7946 MB)
FreeBSD/SMP: Multiprocessor System Detected: 6 CPUs
CPU 1 : Vendor=Unspecified Core=Unknown (Hart 1)
CPU 2 : Vendor=Unspecified Core=Unknown (Hart 2)
CPU 3 : Vendor=Unspecified Core=Unknown (Hart 3)
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40024
Mitchell Horne [Mon, 22 May 2023 23:53:43 +0000 (20:53 -0300)]
riscv: MMU detection
Detect and report the supported MMU for each CPU. Export the
capabilities to the rest of the kernel and use it in pmap_bootstrap() to
check for Sv48 support.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39814
Mitchell Horne [Mon, 22 May 2023 23:51:44 +0000 (20:51 -0300)]
riscv: Rework CPU identification (second part)
Modify when and how we perform parsing and reporting. Most notably,
everything now executes on CPU 0.
The de-facto standard way to enumerate CPU features (ISA extensions) on
RISC-V is by parsing each CPU's ISA string. We currently obtain this
information from the device tree, and in the future will be able to pull
it from ACPI tables.
Eliminate the SYSINIT from identcpu.c. We still need to walk the /cpus
list in the device tree, but now do this one CPU at a time, as a step in
the identify_cpu() procedure. This is slightly less error prone, and
allows us to parse ISA features for CPU 0 much earlier.
Make use of the SMP hooks cpu_mp_start() and cpu_mp_announce() to
identify and print secondary CPU info, respectively. This causes
secondary processor identification to be printed much earlier in boot;
everything is done by SI_SUB_CPU, SI_ORDER_THIRD. Adjust some other
printf() calls so that we get enough useful info to debug under
bootverbose.
Reviewed by: markj (slightly earlier version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39811
Mitchell Horne [Mon, 22 May 2023 23:50:09 +0000 (20:50 -0300)]
riscv: Call identify_cpu() earlier for CPU 0
It is advantageous to have knowledge of ISA features as early as
possible. For example, the presence of newer virtual memory extensions
may be useful to pmap_bootstrap().
To achieve this, split out the printf() parts of identify_cpu() into a
separate function, printcpuinfo(). This latter function will be called
later in boot after the console has been initialized.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39810
Mitchell Horne [Mon, 22 May 2023 23:48:41 +0000 (20:48 -0300)]
riscv: Rework CPU identification (first part)
Make better use of the RISC-V identification CSRs: mvendorid, marchid,
and mimpid. This code was written before these registers were
well-specified, or even available to the kernel. It currently fails to
recognize any CPU or platform.
Per the privileged specification, mvendorid contains the JEDEC vendor ID,
or zero.
The marchid register denotes the CPU microarchitecture. This is either
one of the globally allocated open-source implementation IDs, or the
field has a custom encoding. Therefore, for known vendors (SiFive) we
can also maintain a list of known marchid values. If we can not give a
name to the CPU but marchid is non-zero, then just print its value in
the report.
The mimpid (implementation ID) could be used in the future to more
uniquely identify the micro-architecture, but it really remains to be
seen how it gets used. For now we just print its value.
Thank you to Danjel Qyteza <danq1222@gmail.com> who submitted an early
version of this change to me, although it has been almost entirely
rewritten.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39809
libdtrace: get rid of illumos ifdefs in dt_module_update(), fix dm_file and dm_modid
Because dt_module_update() is highly OS-specific, the ifdefs make it
hard to read and follow what is going on. Also handle dm_modid, and
remove handling of the ".filename" section, since we can easily fetch
the filename from the module's pathname (k_stat->pathname).
Reviewed by: markj
Approved by: markj (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39177
Mike Karels [Tue, 23 May 2023 12:21:50 +0000 (07:21 -0500)]
pwd.1: replace /home with /sys in example
The default location for home directories is moving from /usr/home
to /home, and the /home symlink will no longer exist. Switch to
another example that is in base, /sys.
Mike Karels [Tue, 23 May 2023 12:18:27 +0000 (07:18 -0500)]
bsdinstall on zfs: create dataset for /home rather than /usr/home
Now that pw (hence adduser and the initial install) use /home for
user home directories rather than /usr/home, create a dataset for
/home rather than /usr/home. Update the man page to match.
Mike Karels [Tue, 23 May 2023 12:17:42 +0000 (07:17 -0500)]
pw: do not move /home/$user to /usr/home
When adding a user, pw will create the path to the home directory
if needed. However, if creating a path with just one component,
i.e. that appears to be in the root directory, pw would create the
directory in /usr, and create a symlink from the root directory.
Most commonly, this meant that the default of /home/$user would turn
into /usr/home/$user. This was added in a self-described kludge 26
years ago. It made (some) sense when root was generally a small
partition, with most of the space in /usr. However, the default is
now one large partition. /home really doesn't belong under /usr,
and anyone who wants to use /usr/home can specify it explicitly.
Remove the kludge to move /home under /usr and create the symlink,
and just use the specified path. Note that this operation was
done only on the first invocation for a path, and this happened most
commonly when adding a user during the install.
Modify the test that checked for the creation of the symlink to
verify that the symlink is *not* made, but rather a directory.
Add a test that intermediate directories are still created.
Notable upstream pull request merges:
#12355 Teach zpool scrub to scrub only blocks in error log
#14811 Refine special_small_blocks property validation
#14854 zil: Some micro-optimizations
#14855 zil: Free lwb_buf after write completion
#14860 Fixes for issues identified by recent Coverity defect reports
#14861 Probe vdevs before marking removed
#14873 Add the ability to uninitialize a zpool
#14875 Hold db_mtx when updating db_state
Xin LI [Tue, 23 May 2023 04:23:57 +0000 (21:23 -0700)]
/etc/rc.d/motd: Update to accommodate changes in uname(1) and newvers.sh
The recent changes to the uname(1) command removed trailing spaces for
better POSIX conformance, but it broke the regular expression used by
the motd script which expected it. This commit addresses this by removing
the requirement, as it is no longer present.
Additionally, a recent change in newvers.sh introduced a new format for
uname -v, which omited the build number and build dates to improve
reproducible build support. This commit adds support for this new format.
Rose [Mon, 8 May 2023 23:08:18 +0000 (19:08 -0400)]
Correct size parameter to strncmp
The wrong value passed to strncmp meant that only enable and disable were being
accepted. This change corrects the logic so enabled and disabled are also
accepted.
kinst uses this function as well, but because it is not exported, it
implements its own copy of it. The patch also exposes the function to
userland, so programs that need to use dtrace_disx86() can use this
function instead of rolling their own copies.
Reviewed by: markj
Approved by: markj (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39871
Kyle Evans [Mon, 15 May 2023 17:21:45 +0000 (12:21 -0500)]
arm64: gicv3: setup PPIs on all APs after they're online
For all PPIs setup earlier than SI_SUB_SMP, PIC_INIT_SECONDARY ends up
cleaning these up for each AP as it comes online. Once they're online,
we don't currently do anything to make sure they're configured for other
APs. Fix it by using smp_rendezvous for the meaty bits of configuring a
PPI, which will just do single-thread behavior before APs are online but
do the right thing for other CPUs after.
While we're here, make sure redistributor config is correct for other
APs as they come online in gic_v3_init_secondary.
libthr rtld locks: do not leak URWLOCK_READ_WAITERS into child
Since there is only the current thread in the child, no pending readers
exist. Clear the bit, since it confuses future attempts to acquire
write ownership of the rtld locks, due to URWLOCK_PREFER_READERS flag.
To be future-proof, clear all state about pending writers and readers.
PR: 271490
Reported and tested by: KJ Tsanaktsidis <kj@kjtsanaktsidis.id.au>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D40178
netlink: call IPv6 hook when adding IPv4 addresses.
This provides compatibility with ifioctl() version of SIOCAIFADDR.
This change is temporary until the IPv4/IPv6 address handling code
is moved to netinet[6].
Bjoern A. Zeeb [Sat, 13 May 2023 15:17:47 +0000 (15:17 +0000)]
LinuxKPI: fix WRITE_ONCE(), remove ACCESS_ONCE()
Fix a gcc warning: "to be safe all intermediate pointers in cast from
'...' to '...' must be 'const' qualified [-Wcast-qual]".
Doing what is essentially a __DECONST() adding the uintptr_t gets
rid of the massive amount of warnings we get in LinuxKPI and lets
us see the actual problems a lot better.
This is a follow-up to 74e908b3c63b28de1d590dc42502fbe959a6da2e which
fixed READ_ONCE().
ACCESS_ONCE() seems to be an obsolete KPI these days in Linux and
FreeBSD does not use it either directly so we can entirely remove
it now.
Sponsored by: The FreeBSD Foundation
Suggested by: jhb
Reviewed by: hselasky
MFC after: 10 days
Differential Revision: https://reviews.freebsd.org/D40084
Fedor Uporov [Mon, 8 May 2023 16:14:02 +0000 (19:14 +0300)]
ext2fs: Add large sectorsize disks support
The ext2fs does not support disks with sectorsize more 512 bytes.
The main issue is in reading/writing superblock, which is not aligned
with 4k value. Reimplement the superblock reading logic to make it
indifferent to disk logical sector size. The logical sector size
more then page size is not supported, like it is doing on Linux side.
Bjoern A. Zeeb [Tue, 16 May 2023 20:59:30 +0000 (20:59 +0000)]
LinuxKPI: implement pci_rescan_bus()
Try to implement pci_rescan_bus(). pci_rescan_method() is already
doing most of the job. We only have to do the count for the return
value again ourselves.
Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D40122
Enji Cooper [Sat, 13 May 2023 02:38:18 +0000 (19:38 -0700)]
Require the OpenSSL 1.1 APIs when compiling ldns
Moving the APIs from OpenSSL 1.1 supporting APIs to 3.x supporting APIs
is a non-trivial effort. Require 1.1 API compatibility to unblock
updating OpenSSL in base to 3.x.
This mirrors what upstream has done in their configure.ac file.
Submitted by: Pierre Pronchery <pierre@freebsdfoundation.org>
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40082
Bjoern A. Zeeb [Thu, 11 May 2023 20:41:40 +0000 (20:41 +0000)]
fwget: add support for various WiFi NICs
Add support for Realtek, QCA, and Mediatek WiFi NIC cards.
We group the matching entries by driver in sub-functions in order
to semi-automatically create the lists for now.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D40073
Bjoern A. Zeeb [Thu, 11 May 2023 20:36:50 +0000 (20:36 +0000)]
fwget: improve the pci base script
When matching "class" only match the class byte and not subclass and
programming interface.
Extend the list of supported classes by network, old, and misc (for no
better names on the latter two).
Extend the list of known vendors for various WiFi NICs.
Add a "pci_fixup_class" as some wireless cards have unexpected PCI
classes set. In case we cannot find a matching file for the original
try to see if a "fixed up" version exists. This allows us to avoid
duplicate matching files for the same vendor/driver but different
chipsets.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D40072
Bjoern A. Zeeb [Thu, 11 May 2023 20:30:44 +0000 (20:30 +0000)]
fwget: simplify adding firmware images to pkg to install
Rather than using echo to return the firmware package name, call a
new function (addpkg) which will also deal with (i) no leading space
and (ii) remove duplicates (as some devices have dual-wifi-cards).
In addition we won't have a line break when having multiple packages.
While here also do not call pkg(8) anymore if there is no package to
install and use the correct variable to install all and not just the
last found package.
Currently carp implementation peeks into the opaque 'afp->af_addreq'
buffer, assumes it knows the af-specific layout and assigns vhid
directly.
Simplify the code and remove abstraction leak by introducing per-afp
callback for setting vhid.
This change is a pre-requisite to set addresses via Netlink,
as Netlink implementiation uses different structure layout.
Bjoern A. Zeeb [Sat, 20 May 2023 00:51:01 +0000 (00:51 +0000)]
LinuxKPI: netdevice: add dev_set_threaded()
Add dev_set_threaded() to the dummy functions of netdevice.h in order
to keep an upcoming wireless driver compiling.
While here also update the name of a function argument for consistency.