Mitchell Horne [Mon, 22 May 2023 23:53:43 +0000 (20:53 -0300)]
riscv: MMU detection
Detect and report the supported MMU for each CPU. Export the
capabilities to the rest of the kernel and use it in pmap_bootstrap() to
check for Sv48 support.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39814
Mitchell Horne [Mon, 22 May 2023 23:51:44 +0000 (20:51 -0300)]
riscv: Rework CPU identification (second part)
Modify when and how we perform parsing and reporting. Most notably,
everything now executes on CPU 0.
The de-facto standard way to enumerate CPU features (ISA extensions) on
RISC-V is by parsing each CPU's ISA string. We currently obtain this
information from the device tree, and in the future will be able to pull
it from ACPI tables.
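
To illustrate what parsing an ISA string involves, here is a minimal userland sketch in C. The helper name parse_isa_single_letter() and the bitmask layout are hypothetical and are not taken from identcpu.c. It assumes multi-letter extensions are separated by an underscore and only collects the single-letter extensions that follow the rv32/rv64 prefix.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit for a single-letter extension: 'a' -> bit 0, ..., 'z' -> bit 25. */
#define ISA_BIT(c)	(UINT32_C(1) << ((c) - 'a'))

/*
 * Hypothetical helper: return a bitmask of the single-letter extensions
 * in an ISA string such as "rv64imafdc_zicsr", or 0 if the string does
 * not start with "rv32" or "rv64".
 */
static uint32_t
parse_isa_single_letter(const char *isa)
{
	uint32_t mask = 0;

	assert(isa != NULL);
	if (strncmp(isa, "rv32", 4) != 0 && strncmp(isa, "rv64", 4) != 0)
		return (0);

	/* Single letters run until the first '_' or the end of the string. */
	for (isa += 4; *isa >= 'a' && *isa <= 'z'; isa++)
		mask |= ISA_BIT(*isa);
	return (mask);
}
```

A caller could then test for, say, the compressed extension with (mask & ISA_BIT('c')) != 0.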
Eliminate the SYSINIT from identcpu.c. We still need to walk the /cpus
list in the device tree, but now do this one CPU at a time, as a step in
the identify_cpu() procedure. This is slightly less error prone, and
allows us to parse ISA features for CPU 0 much earlier.
Make use of the SMP hooks cpu_mp_start() and cpu_mp_announce() to
identify and print secondary CPU info, respectively. This causes
secondary processor identification to be printed much earlier in boot;
everything is done by SI_SUB_CPU, SI_ORDER_THIRD. Adjust some other
printf() calls so that we get enough useful info to debug under
bootverbose.
Reviewed by: markj (slightly earlier version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39811
Mitchell Horne [Mon, 22 May 2023 23:50:09 +0000 (20:50 -0300)]
riscv: Call identify_cpu() earlier for CPU 0
It is advantageous to have knowledge of ISA features as early as
possible. For example, the presence of newer virtual memory extensions
may be useful to pmap_bootstrap().
To achieve this, split out the printf() parts of identify_cpu() into a
separate function, printcpuinfo(). This latter function will be called
later in boot after the console has been initialized.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39810
Mitchell Horne [Mon, 22 May 2023 23:48:41 +0000 (20:48 -0300)]
riscv: Rework CPU identification (first part)
Make better use of the RISC-V identification CSRs: mvendorid, marchid,
and mimpid. This code was written before these registers were
well-specified, or even available to the kernel. It currently fails to
recognize any CPU or platform.
Per the privileged specification, mvendorid contains the JEDEC vendor ID,
or zero.
The marchid register denotes the CPU microarchitecture. This is either
one of the globally allocated open-source implementation IDs, or the
field has a custom encoding. Therefore, for known vendors (SiFive) we
can also maintain a list of known marchid values. If we cannot give a
name to the CPU but marchid is non-zero, then just print its value in
the report.
The mimpid (implementation ID) could be used in the future to more
uniquely identify the micro-architecture, but it really remains to be
seen how it gets used. For now we just print its value.
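
The marchid handling described above could be sketched roughly as follows. This is a userland illustration under stated assumptions, not the code from this change: the struct, macro, and function names are hypothetical, the sketch assumes a 64-bit marchid (RV64), and the three open-source IDs are given only as examples from the public allocation list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative table entry; the name and layout are hypothetical. */
struct march_entry {
	uint64_t	marchid;
	const char	*name;
};

/*
 * Open-source IDs are non-zero with the most-significant bit clear;
 * IDs with the MSB set are allocated by the vendor.
 */
#define MARCHID_MSB			(UINT64_C(1) << 63)
#define MARCHID_IS_OPENSOURCE(id)	((id) != 0 && ((id) & MARCHID_MSB) == 0)

/* A few globally allocated open-source IDs, listed as examples. */
static const struct march_entry opensource_marchids[] = {
	{ 1, "Rocket" },
	{ 2, "BOOM" },
	{ 3, "CVA6" },
};

/* Return the name for marchid from tbl, or NULL if it is unknown. */
static const char *
marchid_lookup(const struct march_entry *tbl, size_t n, uint64_t marchid)
{
	size_t i;

	assert(tbl != NULL);
	for (i = 0; i < n; i++) {
		if (tbl[i].marchid == marchid)
			return (tbl[i].name);
	}
	return (NULL);
}
```

When the lookup returns NULL and marchid is non-zero, the caller would fall back to printing the raw value. A per-vendor table could reuse the same lookup function.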
Thank you to Danjel Qyteza <danq1222@gmail.com> who submitted an early
version of this change to me, although it has been almost entirely
rewritten.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39809
libdtrace: get rid of illumos ifdefs in dt_module_update(), fix dm_file and dm_modid
Because dt_module_update() is highly OS-specific, the ifdefs make it
hard to read and follow what is going on. Also handle dm_modid, and
remove handling of the ".filename" section, since we can easily fetch
the filename from the module's pathname (k_stat->pathname).
Reviewed by: markj
Approved by: markj (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39177
Mike Karels [Tue, 23 May 2023 12:21:50 +0000 (07:21 -0500)]
pwd.1: replace /home with /sys in example
The default location for home directories is moving from /usr/home
to /home, and the /home symlink will no longer exist. Switch to
another example that is in base, /sys.
Mike Karels [Tue, 23 May 2023 12:18:27 +0000 (07:18 -0500)]
bsdinstall on zfs: create dataset for /home rather than /usr/home
Now that pw (hence adduser and the initial install) uses /home for
user home directories rather than /usr/home, create a dataset for
/home rather than /usr/home. Update the man page to match.
Mike Karels [Tue, 23 May 2023 12:17:42 +0000 (07:17 -0500)]
pw: do not move /home/$user to /usr/home
When adding a user, pw will create the path to the home directory
if needed. However, if creating a path with just one component,
i.e. that appears to be in the root directory, pw would create the
directory in /usr, and create a symlink from the root directory.
Most commonly, this meant that the default of /home/$user would turn
into /usr/home/$user. This was added in a self-described kludge 26
years ago. It made (some) sense when root was generally a small
partition, with most of the space in /usr. However, the default is
now one large partition. /home really doesn't belong under /usr,
and anyone who wants to use /usr/home can specify it explicitly.
Remove the kludge to move /home under /usr and create the symlink,
and just use the specified path. Note that this operation was
done only on the first invocation for a path, and this happened most
commonly when adding a user during the install.
Modify the test that checked for the creation of the symlink to
verify that the symlink is *not* made, but rather a directory.
Add a test that intermediate directories are still created.
Notable upstream pull request merges:
#12355 Teach zpool scrub to scrub only blocks in error log
#14811 Refine special_small_blocks property validation
#14854 zil: Some micro-optimizations
#14855 zil: Free lwb_buf after write completion
#14860 Fixes for issues identified by recent Coverity defect reports
#14861 Probe vdevs before marking removed
#14873 Add the ability to uninitialize a zpool
#14875 Hold db_mtx when updating db_state
Xin LI [Tue, 23 May 2023 04:23:57 +0000 (21:23 -0700)]
/etc/rc.d/motd: Update to accommodate changes in uname(1) and newvers.sh
The recent changes to the uname(1) command removed trailing spaces for
better POSIX conformance, but this broke the regular expression used by
the motd script, which expected them. This commit addresses this by
dropping that requirement, as the trailing spaces are no longer present.
Additionally, a recent change in newvers.sh introduced a new format for
uname -v, which omitted the build number and build dates to improve
reproducible build support. This commit adds support for this new format.
Rose [Mon, 8 May 2023 23:08:18 +0000 (19:08 -0400)]
Correct size parameter to strncmp
The wrong value passed to strncmp meant that only enable and disable were being
accepted. This change corrects the logic so enabled and disabled are also
accepted.
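
The effect of the size argument can be shown with a small sketch. This is not the code touched by the commit; the two function names are hypothetical and only demonstrate one way a wrong bound can restrict matches to the exact words.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Bound too large: sizeof("enable") is 7, so the terminating NUL is
 * compared too and only the exact word "enable" matches.
 */
static bool
is_enable_strict(const char *s)
{
	assert(s != NULL);
	return (strncmp(s, "enable", sizeof("enable")) == 0);
}

/*
 * Bound of strlen("enable") == 6: any string starting with "enable"
 * matches, which includes "enabled".
 */
static bool
is_enable_prefix(const char *s)
{
	assert(s != NULL);
	return (strncmp(s, "enable", strlen("enable")) == 0);
}
```

Note that the prefix form also accepts longer words such as "enablement", so real code may prefer two explicit strcmp() calls.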
kinst uses this function as well, but because it is not exported, it
implements its own copy of it. The patch also exposes the function to
userland, so programs that need to use dtrace_disx86() can use this
function instead of rolling their own copies.
Reviewed by: markj
Approved by: markj (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39871
Kyle Evans [Mon, 15 May 2023 17:21:45 +0000 (12:21 -0500)]
arm64: gicv3: setup PPIs on all APs after they're online
For all PPIs setup earlier than SI_SUB_SMP, PIC_INIT_SECONDARY ends up
cleaning these up for each AP as it comes online. Once they're online,
we don't currently do anything to make sure they're configured for other
APs. Fix it by using smp_rendezvous for the meaty bits of configuring a
PPI, which will just do single-thread behavior before APs are online but
do the right thing for other CPUs after.
While we're here, make sure redistributor config is correct for other
APs as they come online in gic_v3_init_secondary.
libthr rtld locks: do not leak URWLOCK_READ_WAITERS into child
Since there is only the current thread in the child, no pending readers
exist. Clear the bit, since it confuses future attempts to acquire
write ownership of the rtld locks, due to URWLOCK_PREFER_READERS flag.
To be future-proof, clear all state about pending writers and readers.
PR: 271490
Reported and tested by: KJ Tsanaktsidis <kj@kjtsanaktsidis.id.au>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D40178
netlink: call IPv6 hook when adding IPv4 addresses.
This provides compatibility with ifioctl() version of SIOCAIFADDR.
This change is temporary until the IPv4/IPv6 address handling code
is moved to netinet[6].
Bjoern A. Zeeb [Sat, 13 May 2023 15:17:47 +0000 (15:17 +0000)]
LinuxKPI: fix WRITE_ONCE(), remove ACCESS_ONCE()
Fix a gcc warning: "to be safe all intermediate pointers in cast from
'...' to '...' must be 'const' qualified [-Wcast-qual]".
Doing what is essentially a __DECONST() adding the uintptr_t gets
rid of the massive amount of warnings we get in LinuxKPI and lets
us see the actual problems a lot better.
This is a follow-up to 74e908b3c63b28de1d590dc42502fbe959a6da2e which
fixed READ_ONCE().
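
The casting trick can be sketched in plain C as follows. The macros MY_DECONST() and MY_WRITE_ONCE() are hypothetical stand-ins written for this illustration, not the LinuxKPI definitions; they omit any compiler barriers, and __typeof__ is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/* Same idea as __DECONST(): launder the pointer through uintptr_t. */
#define MY_DECONST(type, var)	((type)(uintptr_t)(const void *)(var))

/*
 * Hypothetical store-once macro. Going through uintptr_t means the
 * compiler no longer sees a pointer-to-pointer cast that could drop
 * qualifiers, so -Wcast-qual has nothing to complain about.
 */
#define MY_WRITE_ONCE(x, v) do {					\
	*(volatile __typeof__(x) *)(uintptr_t)&(x) = (v);		\
} while (0)

static int
store_demo(void)
{
	int value = 0;
	const int *ro = &value;

	MY_WRITE_ONCE(value, 42);
	/* Legal because the underlying object is not itself const. */
	*MY_DECONST(int *, ro) = *ro + 1;
	assert(value == 43);
	return (value);
}
```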
ACCESS_ONCE() seems to be an obsolete KPI these days in Linux, and
FreeBSD does not use it directly either, so we can remove it entirely
now.
Sponsored by: The FreeBSD Foundation
Suggested by: jhb
Reviewed by: hselasky
MFC after: 10 days
Differential Revision: https://reviews.freebsd.org/D40084
Fedor Uporov [Mon, 8 May 2023 16:14:02 +0000 (19:14 +0300)]
ext2fs: Add large sectorsize disks support
ext2fs does not support disks with a sector size larger than 512 bytes.
The main issue is in reading and writing the superblock, which is not
aligned to a 4k boundary. Reimplement the superblock reading logic to make
it independent of the disk's logical sector size. A logical sector size
larger than the page size is not supported, as is also the case on Linux.
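
The alignment problem can be made concrete with a little arithmetic. The ext2 superblock starts at byte offset 1024 and is 1024 bytes long, so with large sectors it sits in the middle of sector 0. The sketch below is a hypothetical userland helper (sb_plan() and struct sb_read_plan are names invented here, not the ext2fs code) that computes which whole sectors to read and where the superblock lands in the buffer.

```c
#include <assert.h>
#include <stdint.h>

#define SB_OFFSET	1024u	/* ext2 superblock starts at byte 1024 */
#define SB_SIZE		1024u	/* and is 1024 bytes long */

struct sb_read_plan {
	uint32_t first_sector;	/* first logical sector to read */
	uint32_t nsectors;	/* number of whole sectors to read */
	uint32_t offset;	/* superblock offset inside the buffer */
};

static struct sb_read_plan
sb_plan(uint32_t secsize)
{
	struct sb_read_plan p;
	uint32_t end = SB_OFFSET + SB_SIZE;

	assert(secsize != 0);
	p.first_sector = SB_OFFSET / secsize;
	p.offset = SB_OFFSET % secsize;
	/* Round the end of the superblock up to a sector boundary. */
	p.nsectors = (end + secsize - 1) / secsize - p.first_sector;
	return (p);
}
```

With 512-byte sectors this yields sectors 2 and 3 at offset 0; with 4096-byte sectors it yields sector 0 with the superblock at offset 1024 inside the buffer.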
Bjoern A. Zeeb [Tue, 16 May 2023 20:59:30 +0000 (20:59 +0000)]
LinuxKPI: implement pci_rescan_bus()
Try to implement pci_rescan_bus(). pci_rescan_method() is already
doing most of the job. We only have to do the count for the return
value again ourselves.
Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D40122
Enji Cooper [Sat, 13 May 2023 02:38:18 +0000 (19:38 -0700)]
Require the OpenSSL 1.1 APIs when compiling ldns
Porting the code from the OpenSSL 1.1 APIs to their 3.x replacements
is a non-trivial effort. Require 1.1 API compatibility to unblock
updating OpenSSL in base to 3.x.
This mirrors what upstream has done in their configure.ac file.
Submitted by: Pierre Pronchery <pierre@freebsdfoundation.org>
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40082
Bjoern A. Zeeb [Thu, 11 May 2023 20:41:40 +0000 (20:41 +0000)]
fwget: add support for various WiFi NICs
Add support for Realtek, QCA, and Mediatek WiFi NICs.
We group the matching entries by driver in sub-functions in order
to semi-automatically create the lists for now.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D40073
Bjoern A. Zeeb [Thu, 11 May 2023 20:36:50 +0000 (20:36 +0000)]
fwget: improve the pci base script
When matching "class" only match the class byte and not subclass and
programming interface.
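
The fwget script itself is shell, but the matching rule is easy to show in C. A PCI class code is 24 bits: base class, subclass, and programming interface, one byte each. The sketch below uses hypothetical names and only compares the base class byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Base class is the top byte of the 24-bit class code. */
#define PCI_BASE_CLASS(code)	(((code) >> 16) & 0xffu)

/* Match on the base class only, ignoring subclass and prog-if. */
static bool
class_matches(uint32_t classcode, uint8_t want)
{
	assert((classcode >> 24) == 0);	/* class codes are 24 bits wide */
	return (PCI_BASE_CLASS(classcode) == want);
}
```

For example, both 0x020000 (Ethernet) and 0x028000 (other network controller) match base class 0x02.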
Extend the list of supported classes by network, old, and misc (for no
better names on the latter two).
Extend the list of known vendors for various WiFi NICs.
Add a "pci_fixup_class" as some wireless cards have unexpected PCI
classes set. In case we cannot find a matching file for the original
try to see if a "fixed up" version exists. This allows us to avoid
duplicate matching files for the same vendor/driver but different
chipsets.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D40072
Bjoern A. Zeeb [Thu, 11 May 2023 20:30:44 +0000 (20:30 +0000)]
fwget: simplify adding firmware images to pkg to install
Rather than using echo to return the firmware package name, call a
new function (addpkg) which will also deal with (i) no leading space
and (ii) remove duplicates (as some devices have dual-wifi-cards).
In addition we won't have a line break when having multiple packages.
While here also do not call pkg(8) anymore if there is no package to
install and use the correct variable to install all and not just the
last found package.
Currently carp implementation peeks into the opaque 'afp->af_addreq'
buffer, assumes it knows the af-specific layout and assigns vhid
directly.
Simplify the code and remove abstraction leak by introducing per-afp
callback for setting vhid.
This change is a prerequisite for setting addresses via Netlink,
as the Netlink implementation uses a different structure layout.
Bjoern A. Zeeb [Sat, 20 May 2023 00:51:01 +0000 (00:51 +0000)]
LinuxKPI: netdevice: add dev_set_threaded()
Add dev_set_threaded() to the dummy functions of netdevice.h in order
to keep an upcoming wireless driver compiling.
While here also update the name of a function argument for consistency.
Brian Atkinson [Fri, 19 May 2023 20:05:53 +0000 (16:05 -0400)]
Hold db_mtx when updating db_state
Commit 555ef90 did some general code refactoring for
dmu_buf_will_not_fill() and dmu_buf_will_fill(). However, the db_mtx was
not held when updating db->db_state in those code blocks. The rest of the
dbuf code always holds the db_mtx when updating db_state. This is
important because cv_wait() db_changed is used to check for db_state
changes.
Update dmu_buf_will_not_fill() and dmu_buf_will_fill() to hold the
db_mtx when updating db_state.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #14875
Brian Behlendorf [Fri, 19 May 2023 20:05:09 +0000 (13:05 -0700)]
Probe vdevs before marking removed
Before allowing the ZED to mark a vdev as REMOVED due to a
hotplug event confirm that it is non-responsive with probe.
Any device which can be successfully probed should be left
ONLINE to prevent a healthy pool from being incorrectly
SUSPENDED. This may occur for at least the following two
scenarios.
1) Drive expansion (zpool online -e) in VMware environments.
If, during the partition resize operation, a partition is
removed and re-created then udev will send a removed event.
2) Re-scanning the namespaces of an NVMe device (nvme ns-rescan)
may result in a udev remove and add event being delivered.
Finally, update the ZED to only kick in a spare when the
removal was successful.
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #14859
Closes #14861
Randall Stewart [Fri, 19 May 2023 15:16:28 +0000 (11:16 -0400)]
There are congestion control algorithms that will pull in srtt, and this can cause issues with rack.
When using rack, cubic and htcp will grab the srtt, but they assume it is in ticks. For rack
it is in microseconds (which we should probably move all stacks to, actually). This causes
issues, so instead let's make a new interface so that any CC module can pull the srtt in
whatever granularity it wants.
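
The shape of such an interface might look like the sketch below. The enum and function names are hypothetical and chosen for this illustration; the sketch assumes the stack stores srtt in microseconds and that the caller supplies the tick rate hz.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical granularity selector for the caller. */
enum srtt_gran {
	SRTT_USEC,
	SRTT_MSEC,
	SRTT_TICKS
};

/* Convert a microsecond srtt into the unit the CC module asked for. */
static uint64_t
srtt_convert(uint64_t srtt_usec, enum srtt_gran gran, uint32_t hz)
{
	switch (gran) {
	case SRTT_USEC:
		return (srtt_usec);
	case SRTT_MSEC:
		return (srtt_usec / 1000);
	case SRTT_TICKS:
		assert(hz != 0);
		return (srtt_usec * hz / 1000000);
	}
	return (0);
}
```

The mismatch is easy to see with numbers: a 20 ms srtt is 20000 in microseconds, and a module that reads that value as ticks at hz = 1000 would believe the round trip takes 20 seconds.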
Kristof Provost [Thu, 18 May 2023 18:04:45 +0000 (20:04 +0200)]
if_bridge: fix potential panic
When a new bridge_rtnode is added it is added with a NULL brt_dst. The
brt_dst is set after the entry is added. This means there's a small
window where another core could also attempt to add this node, leading
to the code attempting to log that the MAC addresses moved to a new
interface.
Aside from that being a spurious log entry it also panics, because
obif is NULL (and we attempt to dereference it).
Avoid this by setting brt_dst before we insert the bridge_rtnode.
Assert that obif is non-NULL, as an extra precaution.
Kristof Provost [Thu, 18 May 2023 20:06:37 +0000 (22:06 +0200)]
carp test: improve jail names for unicast_ll_v6 test
Rename the jails used in the unicast_ll_v6 test, to ensure the jail
names are unique to this test.
That is one of the requirements for running these tests in parallel.
Kristof Provost [Thu, 18 May 2023 19:37:48 +0000 (21:37 +0200)]
pfsync tests: check for the correct IP address
When checking if the state synced over we should look for
198.51.100.254, not 198.51.100.2. The test worked because the incorrect
address is a substring of the correct one, but we should fix it anyway.
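
The substring pitfall is easy to reproduce. The test in question is a shell script, so the C sketch below is only an illustration with a hypothetical helper, has_addr(), that additionally checks the characters on either side of a match.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if addr occurs in line as a whole address, i.e. it is not
 * preceded by a digit or dot and not followed by another digit.
 */
static bool
has_addr(const char *line, const char *addr)
{
	const char *p;
	size_t len;

	assert(line != NULL && addr != NULL);
	len = strlen(addr);
	if (len == 0)
		return (false);
	for (p = strstr(line, addr); p != NULL; p = strstr(p + 1, addr)) {
		bool left_ok = (p == line) ||
		    !(isdigit((unsigned char)p[-1]) || p[-1] == '.');
		bool right_ok = !isdigit((unsigned char)p[len]);

		if (left_ok && right_ok)
			return (true);
	}
	return (false);
}
```

A plain strstr() for "198.51.100.2" succeeds on a line that only contains 198.51.100.254, whereas has_addr() rejects it.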
Reported by: Naman Sood <naman@freebsdfoundation.org>
MFC after: 1 week
George Amanakis [Fri, 17 Dec 2021 20:35:28 +0000 (21:35 +0100)]
Teach zpool scrub to scrub only blocks in error log
Added a flag '-e' in zpool scrub to scrub only blocks in error log. A
user can pause, resume and cancel the error scrub by passing additional
command line arguments -p -s just like a regular scrub. This involves
adding a new flag, creating new libzfs interfaces, a new ioctl, and the
actual iteration and read-issuing logic. Error scrubbing is executed in
multiple txg to make sure pool performance is not affected.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Co-authored-by: TulsiJain tulsi.jain@delphix.com
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #8995
Closes #12355
Bjoern A. Zeeb [Wed, 17 May 2023 20:40:47 +0000 (20:40 +0000)]
ifconfig: improve trimming off interface number at end
When trying to auto-load a module, we trim the interface number off
the end. Currently we stop at the first digit. For interfaces which
have numbers in the driver name this does not work well.
In the current example ifconfig ath10k0 would load ath(4) instead of
ath10k(4). For module/interface names like rtw88[0] we never guess
correctly.
To improve for the case we can, start trimming off digits from the
end rather than the front.
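
The difference between the two strategies can be sketched in a few lines of C. Both function names are hypothetical and this is not the ifconfig code; each returns the length of the guessed driver name.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Old approach: stop at the first digit anywhere in the name. */
static size_t
driver_len_first_digit(const char *ifname)
{
	assert(ifname != NULL);
	return (strcspn(ifname, "0123456789"));
}

/* New approach: strip only the run of digits at the end. */
static size_t
driver_len_trailing(const char *ifname)
{
	size_t len;

	assert(ifname != NULL);
	len = strlen(ifname);
	while (len > 0 && isdigit((unsigned char)ifname[len - 1]))
		len--;
	return (len);
}
```

For "ath10k0" the first function yields 3 ("ath") and the second yields 6 ("ath10k"). For "rtw880" both yield 3 ("rtw"), which is the case that cannot be guessed correctly.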
Sponsored by: The FreeBSD Foundation
Reported by: thierry
MFC after: 20 days
Reviewed by: melifaro, thierry
Differential Revision: https://reviews.freebsd.org/D40137
Colin Percival [Thu, 18 May 2023 03:17:24 +0000 (20:17 -0700)]
rc.d/netif: Don't DAD if lo0 is the only IPv6 IF
The code in rc.d/netif waits for IPv6 Duplicate Address Detection if
any network interfaces support IPv6. Unfortunately, since lo0 *always*
has IPv6 enabled, this means unconditionally sleeping, even on systems
which have no external IPv6 interfaces.
Since we presume that there is little risk of a duplicate address being
assigned on lo0, amend the test to wait only if there is an interface
*other than lo0* which supports IPv6.
Dmitry Chagin [Thu, 18 May 2023 07:55:39 +0000 (10:55 +0300)]
linux(4): Check fd passed to unlockpt()
In our implementation, grantpt() and unlockpt() don't actually have
any use, because PTY's are created on the fly and already have proper
permissions upon creation.
At least check that a proper fd is passed to unlockpt(). For grantpt()
Glibc calls TIOCGPTN ioctl which would fail if fd is not a master.
Corvin Köhne [Fri, 12 May 2023 05:37:32 +0000 (07:37 +0200)]
bhyve: error out if fwcfg user file isn't read completely
At the moment, fwcfg reads the file once at startup and passes these
data to the guest. Therefore, we should always read the whole file.
Otherwise we should error out.
Additionally, GCC 12 complains that the comparison checking whether
fwcfg_file->size is lower than 0 is always false due to the limited
range of the data type.
Reviewed by: markj
Fixes: ca14781c8170f3517ae79e198c0c880dbc3142dd ("bhyve: add cmdline option for user defined fw_cfg items")
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D40076
Kristof Provost [Thu, 11 May 2023 16:10:33 +0000 (18:10 +0200)]
pf: release rules lock before passing the packet to dummynet
In the Ethernet rules we held the PF_RULES lock while we called
ip_dn_io_ptr() (i.e. dummynet). That meant that we could end up back in
pf while still holding the PF_RULES lock.
That's not immediately fatal, because that lock is recursive, but still
not ideal.
There also appear to be scenarios where this can actually trigger
deadlocks.
We don't need to hold the PF_RULES lock, as long as we make a local copy
of the data we need from the rule (in this case, the action and
bridge_to target). It's safe to keep the struct ifnet pointer around,
because we remain in NET_EPOCH.
routing: fix panic triggered by the 'gr_idx != 0' assert in nhg code
Nexthop groups can be referenced by the external code. The reference
can be released after the VNET destruction. Furthermore, nexthop
groups use a single per-rib lock, which is destroyed during the
VNET destruction. To eliminate the use-after-free problem, each nhg
is marked as "unlinked" during the VNET destruction stage, leaving
nhg_idx intact. Normally there should not be such nexthops, but if
there are any, the kernel will panic on 'gr_idx != 0' when the
last nhg reference is released.
Address this by using the assert checks only when the nexthop group
is destroyed during "valid" VNET lifetime.
pfsync: Remove deletion of states using the full pfsync_state struct
State deletions are sent over pfsync using struct pfsync_del_c.
Remove the code for receiving state deletions using struct pfsync_state
as such deletions are never sent. Rename functions and constants so that
only the "compressed" versions remain.