Jessica Clarke [Wed, 24 Jan 2024 23:49:54 +0000 (23:49 +0000)]
riscv: Convert local interrupt controller to a newbus PIC
Currently the local interrupt controller implementation is based on
pre-INTRNG arm/arm64 code, using hand-rolled event code rather than
INTRNG. This then interacts weirdly with the PLIC, and other future
interrupt controllers like the APLIC and IMSICs in the upcoming AIA
specification, since they become the root PIC despite not being the
logical root. Instead, use a real newbus device for it and register
it as the root PIC.
This also adapts the IPI code to make use of the newly-added INTRNG
generic IPI handling framework, adding a new sbi_ipi as the PIC. In
future there will be alternative devices for sending IPIs that will
register with higher priorities, such as the proposed AIA IMSIC and
ACLINT SSWI.
Jessica Clarke [Wed, 24 Jan 2024 23:49:54 +0000 (23:49 +0000)]
riscv: Create a newbus device for the SBI driver
This approach is based on the Arm PSCI driver, though that makes more
extensive use of its softc than we do here. This will be used to extract
the SBI IPI code as a real PIC.
Jessica Clarke [Wed, 24 Jan 2024 23:49:54 +0000 (23:49 +0000)]
intrng: Allow alternative IPI PICs to be registered and used
On RISC-V, the root PIC (whether the PLIC or, as will be the case in
future, the local interrupt controller) cannot send IPIs, relying on
another means to trigger the necessary software interrupts (firmware
calls), but there are upcoming standard devices that will be able to
inject them, so we can't just put the firmware calls in the root PIC
driver.
Thus, split out a new intr_ipi_dev from intr_irq_root_dev to use for
sending IPIs. New devices can be registered with a given priority up
until the first IPI is set up, when the best device seen so far gets
frozen as the IPI device to use.
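
The register-then-freeze behaviour described above can be sketched in userland C. This is a minimal illustration, not the INTRNG code: `struct ipi_dev`, `ipi_dev_register()` and `ipi_dev_select()` are hypothetical names, and the sketch assumes that a numerically higher priority wins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device handle; not the kernel's device_t. */
struct ipi_dev {
	const char *name;
};

static const struct ipi_dev *best_dev;	/* best candidate seen so far */
static unsigned int best_prio;		/* priority of that candidate */
static bool ipi_dev_frozen;		/* set once the first IPI is set up */

/*
 * Offer a device as the IPI sender.  Assumption: a higher number means
 * a higher priority.  Returns false once the choice has been frozen.
 */
bool
ipi_dev_register(const struct ipi_dev *dev, unsigned int prio)
{
	assert(dev != NULL);
	if (ipi_dev_frozen)
		return (false);
	if (best_dev == NULL || prio > best_prio) {
		best_dev = dev;
		best_prio = prio;
	}
	return (true);
}

/* Called when an IPI is set up: freeze and return the chosen device. */
const struct ipi_dev *
ipi_dev_select(void)
{
	ipi_dev_frozen = true;
	return (best_dev);
}
```

In this sketch a late registration is simply refused; how the real framework reports that case is not stated in the commit message.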
Jessica Clarke [Wed, 24 Jan 2024 23:49:53 +0000 (23:49 +0000)]
intrng: Extract arm/arm64 IPI->PIC glue code
The arm and arm64 implementations of dispatching IPIs via PIC_IPI_SEND
are almost identical, and entirely MI with the lone exception of a
single store barrier on arm64 (that is likely either redundant or needed
on arm too). Thus, de-duplicate this code by moving it to INTRNG as a
generic IPI glue framework. The ipi_* functions remain declared in MD
smp.h headers and implemented in MD code, but are trivial wrappers
around intr_ipi_send that could be made MI, at least for INTRNG ports,
at a later date.
Note that, whilst both arm and arm64 had an ii_send member in intr_ipi
to abstract over how to send interrupts, they were always ultimately
using PIC_IPI_SEND, and so this complexity has been removed. A follow-up
commit will re-introduce the same flexibility by instead allowing a
device other than the root PIC to be registered as the IPI sender.
As part of this, strengthen a MAXCPU assertion that was missed in commit
2f0b059eeafc ("intrng: switch from MAXCPU to mp_ncpus") (which itself is
mis-titled).
Jessica Clarke [Wed, 24 Jan 2024 23:49:53 +0000 (23:49 +0000)]
intrng: Remove irq_root_ipicount and corresponding intr_pic_claim_root arg
The static irq_root_ipicount variable is only ever written to (with the
value passed to intr_pic_claim_root), never read. Moreover, the bcm2836
driver, as used by the Raspberry Pi 2B and 3A/B (but not 4, which uses a
GIC-400, though does have the legacy interrupt controller present too)
passes 0 as ipicount, despite implementing IPIs. It's thus inaccurate
and serves no purpose, so should be removed.
Kyle Evans [Wed, 24 Jan 2024 19:36:26 +0000 (13:36 -0600)]
kern: tty: fix recanonicalization
`ti->ti_begin` is actually the offset within the first block that is
unread, so we must use that for our lower bound.
Moving to the previous block has to be done at the end of the loop in
order to correctly handle the case of ti_begin == TTYINQ_DATASIZE. At
that point, lastblock is still the last one with data written and the
next write into the queue would advance lastblock. If we move to the
previous block at the beginning, then we're essentially off by one block
for the entire scan and run the risk of running off the end of the block
queue.
The ti_begin == 0 case is still handled correctly, as we skip the loop
entirely and the linestart gets recorded as the first byte available for
writing. The bit after the loop about moving to the next block is also
still correct, even with both previous fixes in mind: we skipped moving
to the previous block if we hit ti_begin, and `off + 1` would in fact be
a member of the next block from where we're reading if it falls on a
block boundary.
Reported by: dim
Fixes: 522083ffbd1ab ("kern: tty: recanonicalize the buffer on [...]")
Kristof Provost [Wed, 24 Jan 2024 16:34:01 +0000 (17:34 +0100)]
pf: only check MTU for IPv6 packets when forwarding
When the packets are generated locally (i.e. PFIL_FWD is not set) we
might generate overly large packets and rely on the NIC to fragment them
for us. In that case we'd reject a valid packet.
Reported by: Herbert J. Skuhra <herbert@gojira.at>
Tested by: Herbert J. Skuhra <herbert@gojira.at>
Fixes: 54c62e3e5d8cd90c5571a1d4c8c5f062d580480e
Sponsored by: Rubicon Communications, LLC ("Netgate")
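
The decision described in this commit can be sketched as a small predicate. This is an illustration only, not pf's code: `ipv6_exceeds_mtu()` and its boolean `forwarding` parameter are hypothetical stand-ins for the check on PFIL_FWD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: decide whether an IPv6 packet should be rejected
 * as too big.  `forwarding` stands in for "PFIL_FWD is set".
 */
bool
ipv6_exceeds_mtu(bool forwarding, size_t pkt_len, size_t mtu)
{
	assert(mtu > 0);
	/* Locally generated packets may be oversized; the NIC fragments. */
	if (!forwarding)
		return (false);
	return (pkt_len > mtu);
}
```

With this shape, a 9000-byte locally generated packet on a 1500-byte MTU interface is let through, while the same packet being forwarded is rejected.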
Ed Maste [Wed, 24 Jan 2024 15:05:09 +0000 (10:05 -0500)]
ccdconfig: remove obsolete references to BSD disklabels
ccd(4) previously had knowledge of BSD disklabels, and relied on their
use on the underlying disks, but this hasn't been the case since 2003
(commit 0f76d6d822f4).
Remove disklabel references from the man page.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43574
Gleb Smirnoff [Wed, 24 Jan 2024 17:33:27 +0000 (09:33 -0800)]
callout: retire callout_async_drain()
This function was used only in TCP before 446ccdd08e2a. It was born in
pain in 2016 to plug different complex panics in TCP timers. It wasn't
warmly accepted in phabricator by all of the reviewers and my recollection
of overall agreement was that "if you need this KPI, then you'd better fix
your code to not need it". However, the function served its duty well all
the way to FreeBSD 14. But now that TCP doesn't need it anymore, let's
retire it to reduce complexity of callout code and also to avoid its
further use.
tcp: pass maxseg around instead of calculating locally
Improve slowpath processing (reordering, retransmissions)
slightly by calculating maxseg only once. This typically
saves one of two calls to tcp_maxseg().
Ed Maste [Mon, 22 Jan 2024 14:49:02 +0000 (09:49 -0500)]
release: rework distributions list
Components like base.txz and ports.txz are called distributions in the
installer, and with the introduction of pkgbase we will start dealing
with normal pkg packages in the installer. Rename EXTRA_PACKAGES to
DISTRIBUTIONS, and move base.txz and kernel.txz to that list.
This introduces no functional change but is a small cleanup in advance
of some pkgbase experimentation.
Reviewed by: cperciva
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43544
This is based purely on reading the Linux kcmp(2) man page.
In addition to the Linux set of comparators, I also added KCMP_FILEOBJ to
compare the files' underlying objects.
Tested by: manu
Reviewed by: brooks, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43518
The method should return 0 if the files' underlying objects are the same.
In other words, if 0 is returned, I/O on either file modifies the same
object.
Reviewed by: brooks, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43518
Kyle Evans [Wed, 24 Jan 2024 05:00:36 +0000 (23:00 -0600)]
ncurses: serialize the tinfo build a little bit
Move ncurses_dll.h to GENHDRS to start with; it's been generated from
ncurses_dll.h.in for years, so it's not actually in a different category
than all of the other GENHDRS. Slap an .ORDER on it to ensure that we
build ncurses_dll.h and curses.h before any *.c gets compiled.
This should sufficiently address a build race seen downstream where
ncurses_dll.h is present but not yet populated.
Reviewed by: bapt
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D43540
Mike Karels [Tue, 23 Jan 2024 17:23:38 +0000 (11:23 -0600)]
tmpfs: increase vfs.tmpfs.memory_percent to 100 as workaround
The changes to avoid letting tmpfs use all of memory + swap do not
work well with ZFS ARC. The ARC can grow quite large, and will shrink
when there is memory pressure, but tmpfs does not allow for that.
Pending investigation of the right way to handle this, change the
default value of the vfs.tmpfs.memory_percent sysctl to 100 as a
workaround. The sysctl can be set to 95 to get back to the previous
default.
John Baldwin [Tue, 23 Jan 2024 17:38:09 +0000 (09:38 -0800)]
ofw_pcib: Use bus_generic_rman_*
- Implement bus_map/unmap_resource pulling bits from the previous
ofw_pcib_activate/deactivate_resource. One difference here is that
the bus_unmap_resource implementation uses bus_space_unmap instead
of pmap_unmapdev as a complement to the existing use of bus_space_map.
- Use bus_generic_rman_* in various routines for memory and I/O port
resources.
- Use pci_domain_* for PCI_RES_BUS in
ofw_pcib_activate/deactivate_resource.
John Baldwin [Tue, 23 Jan 2024 17:37:53 +0000 (09:37 -0800)]
powerpc: Fix bus_space_unmap
Previously it failed to compile since the macro passed too many
arguments to the function. Fix by adding the bus handle to the
function and adding an implementation that calls pmap_unmapdev.
John Baldwin [Tue, 23 Jan 2024 17:37:13 +0000 (09:37 -0800)]
arm nexus: Use bus_generic_rman_*
- Implement bus_get_rman pulling bits from nexus_alloc_resource.
- Implement bus_map/unmap_resource pulling bits from
nexus_activate/deactivate_resource.
- Use bus_generic_rman_* for
bus_alloc/adjust/activate/deactivate/release_resource except for
custom interrupt activate/deactivate logic still in
nexus_activate/deactivate_resource.
Mark Johnston [Tue, 23 Jan 2024 16:40:52 +0000 (11:40 -0500)]
bhyve: Simplify register definitions a bit
It's awkward to have separate tables for information which is logically
connected. Merge the gdb_regset[] and gdb_regsize[] arrays and update
gdb_read_regs() to cope with the result. This makes the addition of
arm64 support a bit cleaner.
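
The idea of merging two parallel arrays into one table can be sketched as follows. The register names and sizes here are illustrative assumptions, not bhyve's actual gdb_regset[] contents, and `regs_total_size()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register identifiers; not bhyve's actual values. */
enum reg_id { REG_RAX, REG_RIP, REG_EFLAGS };

/* One entry holds both the id and its size, so they cannot drift apart. */
struct regdef {
	enum reg_id id;
	size_t size;	/* size in bytes */
};

static const struct regdef regs[] = {
	{ REG_RAX, 8 },
	{ REG_RIP, 8 },
	{ REG_EFLAGS, 4 },
};

#define	NREGS	(sizeof(regs) / sizeof(regs[0]))

/* Total number of bytes needed to hold every register in the table. */
size_t
regs_total_size(void)
{
	size_t total = 0;

	for (size_t i = 0; i < NREGS; i++) {
		assert(regs[i].size > 0);
		total += regs[i].size;
	}
	return (total);
}
```

Adding a register for another architecture then means adding one table entry rather than keeping two arrays in step.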
Ed Maste [Tue, 23 Jan 2024 02:05:58 +0000 (21:05 -0500)]
bsdlabel: limit to 8 partitions
bsdlabel is intended to support up to 20 partitions, but the disklabel
struct has a d_partitions array with only BSD_NPARTS_MIN (8) entries.
Previously, an attempt to operate on a bsdlabel with more than eight
partitions resulted in a buffer overflow.
As a stopgap, limit bsdlabel to 8 partitions until this is fixed
properly.
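
A bounds check of the kind this stopgap implies might look like the sketch below. `struct label`, `NPARTS_MAX` and `label_valid()` are hypothetical; only the limit of 8 (BSD_NPARTS_MIN) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	NPARTS_MAX	8	/* mirrors BSD_NPARTS_MIN (8) */

/* Hypothetical, simplified label; not the real disklabel struct. */
struct label {
	unsigned int npartitions;		/* count claimed by the label */
	unsigned int part_size[NPARTS_MAX];	/* fixed-size backing array */
};

/* Reject labels that claim more partitions than the array can hold. */
bool
label_valid(const struct label *lp)
{
	assert(lp != NULL);
	return (lp->npartitions <= NPARTS_MAX);
}
```

Checking the count before indexing the array is what prevents the overflow; a label claiming 20 partitions fails validation instead of being walked.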
Kristof Provost [Mon, 22 Jan 2024 16:35:54 +0000 (17:35 +0100)]
pflow: limit to no more than 128 flow exporters
While there are no inherent limits to the number of exporters, we're
likely to scale rather badly to very large numbers. There's also no
obvious use case for more than a handful. Limit to 128 exporters to
prevent foot-shooting.
Aaron LI [Mon, 22 Jan 2024 16:18:56 +0000 (10:18 -0600)]
wg: detach bpf upon destroy as well
bpfattach() is called in wg_clone_create(), but the bpfdetach() is
missing from wg_clone_destroy(). Add the missing bpfdetach() to avoid
leaking both the associated bpf bits as well as the ifnet that bpf will
hold a reference to.
Kristof Provost [Wed, 17 Jan 2024 17:11:27 +0000 (18:11 +0100)]
pf: work around icmp6 packet-too-big not being sent when binat-ing
If we're applying NPTv6 we pass a packet with a modified source and/or
destination address to the network stack.
If that packet then turns out to be larger than the MTU of the sending
interface the stack will attempt to generate an icmp6 packet-too-big
error, but may fail to look up the appropriate source address for that
error message. Even if it does, pf would still have to undo the binat
operation inside the icmp6 packet so the sending host can make sense of
the error.
We can avoid both problems entirely by having pf also perform the MTU
check (taking the potential refragmentation into account), and
generating the icmp6 error directly in pf.
Alexander Ziaee [Fri, 12 Jan 2024 22:12:48 +0000 (17:12 -0500)]
newfs_msdos.8: example for specific cluster size
The usual use case in 2024 for newfs_msdos is creating filesystems on SD
cards for older hardware. Most tutorials call the cluster size the
"allocation size". Therefore, add a small note next to cluster size that it
is also called allocation size, and add an example of how to set it.
Cy Schubert [Sat, 20 Jan 2024 13:52:35 +0000 (05:52 -0800)]
rc.d/kdc: Support start of MIT krb5kdc
Some users wishing to use the MIT krb5kdc have discovered the
kdc script workaround applied to the MIT krb5 ports is insufficient.
Let's build into this rc script the smarts to determine whether
base or ports Heimdal kdc is being invoked or the MIT krb5kdc.
While at it, remove kdc_start_precmd(). This will simplify a future
jail patch.
Warner Losh [Sat, 20 Jan 2024 04:32:16 +0000 (21:32 -0700)]
firmware(9): Update example
Update the example of including a firmware module in the kernel from npe
to iwn. Npe was deleted 6 years ago, so it makes a poor example of how to
embed firmware in the kernel.