Kristof Provost [Thu, 20 Jan 2022 17:31:45 +0000 (18:31 +0100)]
pf: support masking mac addresses
When filtering Ethernet packets allow rules to specify a mac address
with a mask. This indicates which bits of the specified address are
significant. This allows users to do things like filter based on device
manufacturer.
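
As a rough illustration of what "significant bits" means here, the sketch below compares two addresses only on the bits set in the mask. The helper name mac_match_masked() and the MAC_LEN constant are hypothetical and are not taken from the pf sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_LEN 6	/* Ethernet address length in bytes */

/*
 * Hypothetical helper: return true if addr equals rule on every bit
 * that is set in mask. Bits cleared in mask are ignored.
 */
static bool
mac_match_masked(const uint8_t addr[MAC_LEN], const uint8_t rule[MAC_LEN],
    const uint8_t mask[MAC_LEN])
{
	for (size_t i = 0; i < MAC_LEN; i++) {
		if ((addr[i] & mask[i]) != (rule[i] & mask[i]))
			return (false);
	}
	return (true);
}
```

With a mask of ff:ff:ff:00:00:00 only the first three bytes (the manufacturer prefix) take part in the comparison.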
Teach the 'ether' rules to accept { mac1, mac2, ... } lists, similar to
the lists of interfaces or IP addresses we already supported for layer 3
filtering.
Allow packets to be tagged with dummynet information. Note that we do
not apply dummynet shaping on the L2 traffic, but instead mark it for
dummynet processing in the L3 code. This is the same approach as we take
for ALTQ.
Kristof Provost [Wed, 10 Feb 2021 12:28:14 +0000 (13:28 +0100)]
pf: Do not hold PF_RULES_RLOCK while processing Ethernet rules
Avoid the overhead of acquiring a (read) RULES lock when processing the
Ethernet rules.
We can get away with that because when rules are modified they're staged
in V_pf_keth_inactive. We take care to ensure the swap to V_pf_keth is
atomic, so that pf_test_eth_rule() always sees either the old rules, or
the new ruleset.
We need to take care not to delete the old ruleset until we're sure no
pf_test_eth_rule() is still running with those. We accomplish that by
using NET_EPOCH_CALL() to actually free the old rules.
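
The swap described above can be sketched in userland C11 as follows. This is only an illustration under assumptions: struct ruleset, rules_acquire() and rules_commit() are hypothetical names, C11 atomics stand in for the kernel's primitives, and the deferred free (NET_EPOCH_CALL() in the commit) is left to the caller.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a staged set of Ethernet rules. */
struct ruleset {
	int generation;
};

/* The currently active ruleset; readers never take a lock. */
static _Atomic(struct ruleset *) active_rules;

/* Reader side: a single atomic load yields a complete old or new ruleset. */
static struct ruleset *
rules_acquire(void)
{
	return (atomic_load_explicit(&active_rules, memory_order_acquire));
}

/*
 * Writer side: publish the staged ruleset in one atomic exchange and
 * return the previous one. The caller must not free the returned
 * pointer until all readers that may still hold it have finished.
 */
static struct ruleset *
rules_commit(struct ruleset *staged)
{
	return (atomic_exchange_explicit(&active_rules, staged,
	    memory_order_acq_rel));
}
```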
Ed Maste [Tue, 1 Mar 2022 21:42:13 +0000 (16:42 -0500)]
ssh: use standalone config file for security key support
An upcoming OpenSSH update has multiple config.h settings that change
depending on whether builtin security key support is enabled. Prepare
for this by moving ENABLE_SK_INTERNAL to a new sk_config.h header
(similar to the approach used for optional krb5 support) and optionally
including that, instead of defining the macro directly from CFLAGS.
Reviewed by: kevans
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34407
libusb(3): Ignore SIGPIPE when initializing the LibUSB v1.0 API.
The LibUSB v1.0 emulation layer uses pipes internally to signal between
threads. When USB devices are reset, as part of loading firmware, SIGPIPE
may happen, and that is expected and should be ignored.
Warner Losh [Wed, 2 Mar 2022 05:54:53 +0000 (22:54 -0700)]
bootstrap: bump minimum supported version
Bump the minimum supported version to build -current to 11.4R in
preparation for removing support for older systems. 11.4R was selected
as the most recent version to go out of support.
Warner Losh [Wed, 2 Mar 2022 05:54:45 +0000 (22:54 -0700)]
bootstrap: No need for kbdcontrol bootstrap anymore
We only need kbdcontrol when bootstrapping from FreeBSD 10 or
pre-FreeBSD 11.0 current. Since we can no longer build from these
versions of FreeBSD, remove the support for bootstrapping them.
Warner Losh [Tue, 1 Mar 2022 23:58:28 +0000 (16:58 -0700)]
hier: Document SYSROOT conventions
Define a place for sysroot trees to live. This assumes they come from
the base in some way, though there's not yet a build/install/etc sysroot
target. Include the FreeBSD version so multiple versions can be
installed on one system (it also includes the whole uname version, so
one could, in theory, install variants like CheriBSD or whatever on the
same system as FreeBSD). Use MACHINE.MACHINE_ARCH to be consistent with
the release practices, /usr/obj and other naming conventions.
Mark Johnston [Tue, 1 Mar 2022 16:53:42 +0000 (11:53 -0500)]
fasttrap: Avoid creating WX mappings
fasttrap instruments certain instructions by overwriting them and
copying the original instruction to some per-thread scratch space which
is executed after the probe fires. This trampoline jumps back to the
tracepoint after executing the original instruction.
The created mapping has both write and execute permissions, and so this
mechanism doesn't work when allow_wx is disabled. Work around the
restriction by using proc_rwmem() to write to the trampoline.
Reviewed by: vangyzen
Tested by: Amit <akamit91@hotmail.com>
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34304
Mark Johnston [Tue, 1 Mar 2022 16:48:39 +0000 (11:48 -0500)]
proc: Relax proc_rwmem()'s assertion on the process hold count
This reference ensures that the process and its associated vmspace will
not be destroyed while proc_rwmem() is executing. If, however, the
calling thread belongs to the target process, then it is unnecessary to
hold the process. In particular, fasttrap - a module which enables
userspace dtrace - may frequently call proc_rwmem(), and we'd prefer to
avoid the overhead of locking and bumping the hold count when possible.
Thus, make the assertion conditional on "p != curproc". Also assert
that the process is not already exiting. No functional change intended.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Ed Maste [Mon, 28 Feb 2022 01:11:20 +0000 (20:11 -0500)]
zfs: Update test format strings to match variable types
And drop stray 'd' from the end of some printed numbers. I assume this
was the result of someone thinking u is a printf length modifier for d,
not a format specifier itself.
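
A small sketch of the effect, using hypothetical wrapper names: in "%ud" the conversion is "%u" and the trailing "d" is printed as a literal character.

```c
#include <stdio.h>

/* Format value with the mistaken "%ud": "%u" converts, "d" is literal. */
static int
format_ud(char *buf, size_t len, unsigned int value)
{
	return (snprintf(buf, len, "%ud", value));
}

/* Format value with the intended "%u" conversion. */
static int
format_u(char *buf, size_t len, unsigned int value)
{
	return (snprintf(buf, len, "%u", value));
}
```

Formatting 42 with the first helper yields "42d", with the second "42".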
Reviewed by: kevans, rew
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34387
Warner Losh [Tue, 1 Mar 2022 00:26:00 +0000 (17:26 -0700)]
ath: Suppress set but unused warnings
The ath driver has a lot of these warnings. It's an older driver, so
just suppress these warnings until they can be fixed. They are a mix of
simple dead stores, debugging output and stuff that would require
careful study to know if it's safe to remove the access or not (there are
likely very few of the latter, but if there are any they are latent bugs
that the compiler could optimize away). Since I have no ath hardware to test
on anymore, take the conservative approach.
Warner Losh [Mon, 28 Feb 2022 21:28:33 +0000 (14:28 -0700)]
acpi hints: Abstract out acpi_hint_device_matches_resources
Abstract out acpi_hint_device_matches_resources from
acpi_hint_device_unit to simplify that code. Continue matching like
we've always matched: no functional change.
Warner Losh [Mon, 28 Feb 2022 21:27:55 +0000 (14:27 -0700)]
pci: switch logic a little
If we find a match, then assign it. Flip the logic in the if and assign
the unit rather than continuing if it doesn't match. Will make it easier
to expand to other matching schemes.
Warner Losh [Mon, 28 Feb 2022 21:27:42 +0000 (14:27 -0700)]
bus: Add ACPI locator support
Add support for printing ACPI paths. This is a bit of a degenerate case
for this interface since it's always just the device handle if the
device has one. But it is illustrative of how to do this for a few nodes
in the tree.
Warner Losh [Mon, 28 Feb 2022 21:27:28 +0000 (14:27 -0700)]
libdevctl: Add devctl_getpath
Helper routine to call the kernel to get a path to the named device.
Different path enumeration methods (called locators) can be used
for different path types depending on what the kernel implements.
Warner Losh [Mon, 28 Feb 2022 21:27:09 +0000 (14:27 -0700)]
bus: Introduce the bus interface get_device_path
This returns the full path of the child device requested. Since
there are different ways to reckon the entire path, include a 'locator'
method. The default 'FreeBSD' method uses a filesystem-like path name
with each device to the root node separated by /. Other locators will be
UEFI, ACPI and fdt, though others are possible in the future. Make the
locator a string to allow maximum flexibility.
Warner Losh [Mon, 28 Feb 2022 21:26:19 +0000 (14:26 -0700)]
devctl2: Change to 644 protections
We make sure that we check for device privs (usually meaning root or
better) for everything. To allow other functions that don't require
this, default to 644 protection.
Mark Johnston [Tue, 1 Mar 2022 14:07:14 +0000 (09:07 -0500)]
riscv: Add support for enabling SV48 mode
This increases the size of the user map from 256GB to 128TB. The kernel
map is left unchanged for now.
For now SV48 mode is left disabled by default, but can be enabled with a
tunable. Note that extant hardware does not implement SV48, but QEMU
does.
- In pmap_bootstrap(), allocate a L0 page and attempt to enable SV48
mode. If the write to SATP doesn't take, the kernel continues to run
in SV39 mode.
- Define VM_MAX_USER_ADDRESS to refer to the SV48 limit. In SV39 mode,
the region [VM_MAX_USER_ADDRESS_SV39, VM_MAX_USER_ADDRESS_SV48] is not
mappable.
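
The sizes quoted above follow from the width of the virtual address: the user map is the lower half of the address space. A minimal sketch of the arithmetic, with user_map_size() being a hypothetical helper rather than anything in the riscv pmap:

```c
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes of the lower (user) half of a
 * virtual address space that is va_bits wide, i.e. 2^(va_bits - 1).
 */
static uint64_t
user_map_size(unsigned int va_bits)
{
	return (UINT64_C(1) << (va_bits - 1));
}
```

For SV39 this gives 2^38 bytes (256GB) and for SV48 2^47 bytes (128TB).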
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34280
Mark Johnston [Tue, 1 Mar 2022 13:54:55 +0000 (08:54 -0500)]
release: Remove references to ChallengeResponseAuthentication
This sshd_config keyword was replaced by KbdInteractiveAuthentication in
openssh 8.7, though ChallengeResponseAuthentication is silently accepted
as an alias. However, this means that the code in ec2.conf which
modifies a commented-out line no longer does anything. Apply a minimal
fix.
Reviewed by: cperciva, emaste
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34400
Tom Jones [Tue, 1 Mar 2022 13:23:25 +0000 (13:23 +0000)]
diff: Use start of change when searching for function
Use the start of change when searching for a function rather than the
start of the context. In short functions this could result in the search
for the function name starting from before the function definition.
Kirk McKusick [Tue, 1 Mar 2022 00:36:08 +0000 (16:36 -0800)]
Create a new GEOM utility, gunion(8).
The gunion(8) utility is used to track changes to a read-only disk on
a writable disk. Logically, a writable disk is placed over a read-only
disk. Write requests are intercepted and stored on the writable
disk. Read requests are first checked to see if they have been
written on the top (writable disk) and if found are returned. If
they have not been written on the top disk, then they are read from
the lower disk.
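
The read and write paths described above can be sketched as below. This is an in-memory toy under assumptions, not the GEOM implementation: struct union_disk, ud_read(), ud_write(), and the tiny NSECTORS/SECSIZE values are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NSECTORS 8	/* toy disk size, in sectors */
#define SECSIZE 4	/* toy sector size, in bytes */

struct union_disk {
	const uint8_t (*lower)[SECSIZE];	/* read-only disk */
	uint8_t upper[NSECTORS][SECSIZE];	/* writable disk */
	bool written[NSECTORS];			/* sectors present on upper */
};

/* Writes are intercepted: store on the upper disk and record the sector. */
static void
ud_write(struct union_disk *ud, unsigned int sec, const uint8_t *data)
{
	memcpy(ud->upper[sec], data, SECSIZE);
	ud->written[sec] = true;
}

/* Reads come from the upper disk if the sector was written, else lower. */
static void
ud_read(const struct union_disk *ud, unsigned int sec, uint8_t *out)
{
	const uint8_t *src;

	src = ud->written[sec] ? ud->upper[sec] : ud->lower[sec];
	memcpy(out, src, SECSIZE);
}
```

The lower disk is never modified by either path, which is what makes it possible to discard the recorded writes later.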
The gunion(8) utility can be especially useful if you have a large
disk with a corrupted filesystem that you are unsure of how to
repair. You can use gunion(8) to place another disk over the corrupted
disk and then attempt to repair the filesystem. If the repair fails,
you can revert all the changes in the upper disk and be back to the
unchanged state of the lower disk thus allowing you to try another
approach to repairing it. If the repair is successful you can commit
all the writes recorded on the top disk to the lower disk.
Another use of the gunion(8) utility is to try out upgrades to your
system. Place the upper disk over the disk holding your filesystem
that is to be upgraded and then run the upgrade on it. If it works,
commit it; if it fails, revert the upgrade.
Further details can be found in the gunion(8) manual page.
Jessica Clarke [Mon, 28 Feb 2022 22:37:37 +0000 (22:37 +0000)]
release: Add support for building on non-FreeBSD
This requires two sets of changes. Firstly, for non-FreeBSD, we do not
know where tools are in PATH (and it is likely that some are not in
system directories and have been built as bootstrap tools during the
build), so we should leave PATH alone and trust the user. Secondly,
makefs needs a master.passwd for building images from a METALOG file, so
pass the directory in the image tree to makefs's -N option in order to
pick up a valid FreeBSD master.passwd; this is unnecessary on FreeBSD
(except in the edge case of building an image that refers to users or
groups not present in the host's database, which is unlikely but
technically possible) but harmless so can be done unconditionally.
Jessica Clarke [Mon, 28 Feb 2022 22:37:21 +0000 (22:37 +0000)]
install-boot.sh: Avoid - in function names for POSIX compatibility
FreeBSD sh supports this but other common POSIX shells do not; in
particular, dash does not, unlike bash and zsh. This allows the script
to be used on non-FreeBSD systems for release media building.
Jessica Clarke [Mon, 28 Feb 2022 22:37:03 +0000 (22:37 +0000)]
release: Support -DNO_ROOT image building
This requires a bunch of METALOG mangling to include the files we inject
into the tree. The mkisoimages.sh and make-memstick.sh scripts are now
called with the current directory inside the tree so that the relative
paths in the METALOG match up with the current directory. The scripts do
not require this when not using a METALOG, but for simplicity we always
do so. The Makefile mangles the real METALOG created from the install,
as those files are shared across all uses of the tree, but the shell
scripts create a temporary copy of the METALOG that they mangle as their
tree modifications are specific to that image. We also need to pass -D
to makefs to turn any duplicate METALOG entry errors into warnings, as
we have many (harmless) instances of those.
Whilst dvd1.iso should work, the !NOPKG code will need more work to
support this.
All media will also lack mergemaster and etcupdate trees, since more
work is needed to add -DNO_ROOT modes to them. Users of install media
built this way will have to manually bootstrap them.
Jessica Clarke [Mon, 28 Feb 2022 22:36:51 +0000 (22:36 +0000)]
mkisoimages.sh: Avoid creating temporary files in the current directory
Currently the current directory is the parent of the rootfs directory,
but this will change in order to support NO_ROOT builds that use a
metalog manifest, since those need to have the current directory be the
rootfs itself in order for the relative paths to be correct, and we do
not want the non-METALOG case (which passes the directory to makefs) to
pick up leftover temporary .img files from a previous failed build.
Jessica Clarke [Mon, 28 Feb 2022 22:36:39 +0000 (22:36 +0000)]
Fix hand-rolled METALOG entries for installconfig during distributeworld
During distributeworld we call distribute on subdirectories, which in
turn calls installconfig. However, this recursive installconfig call
appends the distribution name (in these cases, "base") to DESTDIR. For
install(1) this works fine as its -D argument comes from the top-level
Makefile.inc1, which passes the original DESTDIR, thereby resulting in
the METALOG entry having the distribution name as a prefix representing
its true installed path relative to the root, but for the hand-rolled
entries they do not use install(1) and thus do not have access to what
the original DESTDIR was, resulting in the METALOG missing this prefix.
Thus, pass down the name of the distribution via a new variable DISTBASE
(chosen as Makefile.inc1 already uses that to convey this exact same
information to etc's distrib-dirs during distributeworld) and prepend
this to the handful of manually-generated METALOG entries. For the
installworld case this variable will be empty and so this behaves as
before.
Note that we need to be careful to avoid double slashes in the METALOG;
distributeworld uses find | awk to split the single METALOG up into
multiple dist.meta files, and this relies on the paths in the METALOG
having the exact prefix ./dist (or ./dist/usr/lib/debug).
Cy Schubert [Mon, 28 Feb 2022 19:43:33 +0000 (11:43 -0800)]
ipfilter: Print protocol when listing NAT table mappings
NAT table mappings list only the source and destination IP, the source
and destination port numbers, and their mappings. But the protocol is not
listed. Now that Facebook and Google use QUIC, seeing port 443 in a
list of active NAT sessions could mean 443/tcp or 443/udp. This patch
adds the protocol to the listing to aid in determining whether HTTPS is
TCP or QUIC in a NAT mapping listing. This also helps differentiate
between other protocols such as ICMP, ESP, and AH in the ipnat list of active
sessions.
Warner Losh [Mon, 28 Feb 2022 17:17:06 +0000 (10:17 -0700)]
Report I/O stats from the CAM_IOSCHED_DYNAMIC extension
Report, on a periodic basis, the I/O latencies the CAM I/O scheduler
computes. These times are only for the hardware portion of the I/O as
measured from the time the operation is scheduled with the SIM using
xpt_action() until the SIM reports it has completed with xpt_done(). Any
time the I/O operation spends in a software queue is not included.
The P50 (median), P90, P99 and P99.9 statistics about the latency of
each of the read, write and trim operations that completed during the
polling interval are reported. If there are fewer than 2, 10, 100 or
1000 operations during the polling interval, no statistic is reported
and a single dash '-' is displayed.
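
The minimum-sample rule above can be sketched as follows. This is an assumption-laden illustration: it works on a sorted array of latency samples with the nearest-rank method, which may differ from how the tool and the CAM I/O scheduler actually derive these figures, and min_samples() and percentile() are hypothetical names. Percentiles are passed in tenths of a percent (500, 900, 990, 999).

```c
#include <stddef.h>

/* Minimum number of operations needed before a percentile is reported. */
static size_t
min_samples(unsigned int permille)
{
	switch (permille) {
	case 500:	/* P50 */
		return (2);
	case 900:	/* P90 */
		return (10);
	case 990:	/* P99 */
		return (100);
	case 999:	/* P99.9 */
		return (1000);
	default:
		return (1);
	}
}

/*
 * Return the requested percentile of n sorted samples using the
 * nearest-rank method, or -1 when there are too few samples (the
 * case displayed as '-').
 */
static long
percentile(const long *sorted, size_t n, unsigned int permille)
{
	size_t rank;

	if (permille == 0 || permille > 1000 || n < min_samples(permille))
		return (-1);
	rank = (n * permille + 999) / 1000;	/* ceil(n * p / 1000) */
	return (sorted[rank - 1]);
}
```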
The read, write and trim commands (either on the command line or at run
time) toggle display of these operations. The color command toggles
color (it defaults to on, like gstat). When color is enabled, unknown
statistics are reported in blue, high latency for a statistic is
reported in red, medium in magenta and low in green (as with gstat). The
med= and hi= commands can set these latency thresholds.
Limitations: The entire sysctl space for all the devices is walked for
each polling period. This should be optimized to remember the OIDs and
only do such polling when the xpt generation changes. There is also no
way to filter devices displayed. This command only works on physical
devices that are connected to SCSI, ATA or NVME sims as those are the
only ones that are instrumented in the CAM I/O scheduler (the
CAM_IOSCHED_DYNAMIC option must be in the kernel, and the dynamic
scheduler can't be disabled).