Kristof Provost [Sun, 12 Mar 2023 15:08:31 +0000 (16:08 +0100)]
pf tests: test IPv6 fragmentation with link-local addresses
We've observed a panic after pf_refragment6() with link-local addresses,
because pf_refragment6() calls ip6_forward() even for a simple output
case.
That results in us entering ip6_forward() with an mbuf with a NULL
m->m_pkthdr.rcvif, which can cause a NULL deref (but seemingly not for
GUAs.
Kristof Provost [Mon, 13 Mar 2023 09:27:59 +0000 (10:27 +0100)]
pf: set scope in pf_refragment6()
Link-local traffic needs to have a scope embedded before it's passed on
to ip6_output(). Do so in pf_refragment6(), because when we end up here
in the output path we may have passed through ip6_output() already
(before being reassembled), where the scope would have been removed.
Re-embed the scope so that link-local traffic is sent correctly.
Kristof Provost [Sun, 12 Mar 2023 17:34:42 +0000 (18:34 +0100)]
pf: distinguish forwarding and output cases for pf_refragment6()
Re-introduce PFIL_FWD, because pf's pf_refragment6() needs to know if
we're ip6_forward()-ing or ip6_output()-ing.
ip6_forward() relies on m->m_pkthdr.rcvif, at least for link-local
traffic (for in6_get_unicast_scopeid()). rcvif is not set for locally
generated traffic (e.g. from icmp6_reflect()), so we need to call the
correct output function.
Michael Tuexen [Thu, 16 Mar 2023 09:45:13 +0000 (10:45 +0100)]
sctp: don't do RTT measurements with cookies
When receiving a cookie, the receiver does not know whether the
peer retransmitted the COOKIE-ECHO chunk or not. Therefore, don't
do an RTT measurement. It might be much too long.
To overcome this limitation, one could do at least two things:
1. Bundle the INIT-ACK chunk with a HEARTBEAT chunk for doing the
RTT measurement. But this is not allowed.
2. Add a flag to the COOKIE-ECHO chunk, which indicates that it
is the initial transmission, and not a retransmission. But
this requires an RFC.
Michael Tuexen [Wed, 15 Mar 2023 21:29:52 +0000 (22:29 +0100)]
sctp: improve negotiation of zero checksum feature
Enforce consistency between announcing 0-cksum support and actually
using it in the association. The value from the inp when the
INIT ACK is sent must be used, not the one from the inp when the
cookie is received.
Summary:
* add snl_send_message() as a convenient send wrapper
* add signed integer parsers
* add snl_read_reply_code() to simplify operation result checks
* add snl_read_reply_multi() to simplify reading multipart messages
* add snl_create_genl_msg_request()
* add snl_get_genl_family() to simplify family name->id resolution
* add tests for some of the functionality
Andrew Turner [Wed, 15 Mar 2023 13:33:02 +0000 (13:33 +0000)]
Use the arm physical timer when able
To allow bhyve manage the virtual timer while in a guest have FreeBSD
use the virtual timer only when bhyve will be unavailable due to not
starting at EL2 where the hypervisor switcher will run.
Mitchell Horne [Wed, 15 Mar 2023 15:26:57 +0000 (12:26 -0300)]
arm64: limit EFI excluded regions to physical memory types
Consolidate add_efi_map_entry() and exclude_efi_map_entry() into a
single function, handle_efi_map_entry(), so that the exact set of entry
types handled is the same in the addition or exclusion cases. Before,
exclude_efi_map_entry() had a 'default' case that would exclude all
entry types that were not listed explicitly in the switch statement.
Logically, we do not need to exclude a range that could not possibly be
added to physmem, and we do not need to exclude bus ranges that are not
physical memory, for example EFI_MD_TYPE_IOMEM.
Since physmem's ram0 device will reserve bus memory resources for its
owned ranges, this was preventing attachment of the watchdog device on
the RPI4B. For some reason its region of memory-mapped I/O appeared in
the EFI memory map (with the aforementioned EFI_MD_TYPE_IOMEM type).
This change fixes the attachment issue, as we prevent the physmem API
from messing with this range of bus space.
John Grafton [Wed, 15 Mar 2023 03:14:14 +0000 (21:14 -0600)]
libbe: Avoid double printing cloning errors.
be_clone calls be_clone_cb and both call set_error on the return
error path. set_error prints the error resulting in a double print.
be_clone_cb should just return the error code and allow be_clone
to print it.
Ed Maste [Wed, 16 Nov 2022 21:24:19 +0000 (16:24 -0500)]
CI: Run pkgbase METALOG lint script
tools/pkgbase/metalog_reader.lua checks for errors in METALOG (for
pkgbase staging), such as hard links with differing modes, duplicate
entries, etc. Run it as part of the Cirrus-CI job to prevent
regressions.
Reviewed by: manu, imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37521
Jessica Clarke [Wed, 15 Mar 2023 00:06:53 +0000 (00:06 +0000)]
Add new DISK_IMAGE_TOOLS_BOOTSTRAP option
This will build etdump, makefs and mkimg as bootstrap tools to allow
easily creating disk images. Note that etdump is bootstrapped due to its
use in the release scripts for building ISO images.
Bjoern A. Zeeb [Tue, 14 Mar 2023 21:01:19 +0000 (21:01 +0000)]
net80211: make ieee80211_scan_dump_channels private
ieee80211_scan_dump_channels() is only used locally and only when
IEEE80211_DEBUG is compiled. Stop exporting it, make it file local
and hide under the #ifdef to reduce the footprint for production
kernels a tiny bit.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D38833
Bjoern A. Zeeb [Tue, 14 Mar 2023 21:00:48 +0000 (21:00 +0000)]
net80211: define mask for ss_flags rather than using hardcoded 0xfff
scan state ss_flags in two places cut off the "internal" GOTPICK
options. Replace the hardcoded 0xfff with a defined mask.
Note that "internal" flags is confusing as we also supplement the
the 16bit by another 16bit of "internal flags" passed around but
comaparing to GOTPICK never stored to my understanding.
No functional change.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D38832
netmap: get rid of save_if_input for emulated adapters
The save_if_input function pointer was meant to save the previous
value of ifp->if_input before replacing it with the emulated
adapter hook.
However, the same pointer value is already stored in the if_input
field of the netmap_adapter struct, to be used for host TX ring processing.
Reuse the netmap_adapter if_input field to simplify the code
and save some space.
Justin Hibbits [Thu, 9 Feb 2023 02:32:47 +0000 (21:32 -0500)]
infiniband: Convert BPF handling for IfAPI
Summary:
All callers of infiniband_bpf_mtap() call it through the wrapper macro,
which checks the if_bpf member explicitly. Since this is getting
hidden, move this check into the internal function and remove the
wrapper macro.
Ed Maste [Wed, 8 Feb 2023 13:16:53 +0000 (08:16 -0500)]
ssh: fix leak and apply style(9) to hostname canonicalization
Fixes: bf2e2524a2ce ("ssh: canonicize the host name before...")
Fixes: 3e74849a1ee2 ("ssh: canonicize the host name before...")
Reviewed by: rew
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38441
Ed Maste [Tue, 14 Mar 2023 17:01:20 +0000 (13:01 -0400)]
compiler-rt: remove eprintf
It was used by ancient GCC assert.h. Prior to 2001 GCC used to provide
its own assert.h The GCC assert.h required __eprintf to emit the error
message. FreeBSD's own assert.h never used this.
Reviewed by: ed (previously), imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D2597
dl [Tue, 14 Mar 2023 04:26:41 +0000 (22:26 -0600)]
Increase protection provided by veriexec with new unlink/rename hooks.
Functions implemented :
- mac_veriexec_vnode_check_unlink: Unlink on a file has been
requested and requires validation. This function prohibits the
deleting a protected file (or deleting one of these hard links, if
any).
- mac_veriexec_vnode_check_rename_from: Rename the file has been
requested and must be validated. This function controls the renaming
of protected file
- mac_veriexec_vnode_check_rename_to: File overwrite rename has been
requested and must be validated. This function prevent overwriting of
a file protected (overwriting by mv command).
The 3 fonctions together aim to control the 'removal' (via unlink) and
the 'mv' on files protected by veriexec. The intention is to reach the
functional level of NetBSD veriexec.
Add sysctl node security.mac.veriexec.unlink to toggle control on
syscall unlink.
Add tunable kernel variable security.mac.veriexec.block_unlink to toggle
unlink protection. Add the corresponding read-only sysctl.
[ tidied up commit message, trailing whitespace, long lines, { placement ]
Allan Jude [Sat, 26 Nov 2022 18:11:13 +0000 (18:11 +0000)]
loader: Add support for booting from a ZFS snapshot
When booting from a snapshot we need to follow a different code path
to turn the objset ID into the name, and for forward lookups we need
to walk the parent's snapnames_zap.
With this, it is possible to set the pools BOOTFS property to a
snapshot and boot with a read-only filesystem of that snapshot.
Reviewed by: tsoome, rew, imp
Sponsored By: Beckhoff Automation GmbH & Co. KG
Sponsored By: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38600
Roger Pau Monné [Mon, 13 Mar 2023 14:17:21 +0000 (15:17 +0100)]
xen: take struct size into account for video information
The xenpf_dom0_console_t structure can grow as more data is added, and
hence we need to check that the fields we accesses have been filled by
Xen. The only extra field FreeBSD currently uses is the top 32 bits
for the frame buffer physical address.
Note that this field is present in all the versions that make the
information available from the platform hypercall interface, so the
check here is mostly cosmetic, and to remember us that newly added
fields require checking the size of the returned data.
Fixes: 6f80738b228c ('xen: fetch dom0 video console information from Xen')
Sponsored by: Citrix Systems R&D
lucy [Mon, 13 Mar 2023 22:01:12 +0000 (16:01 -0600)]
Add GNU glibc compatible secure_getenv
Add mostly glibc and msl compatible secure_getenv. Return NULL if
issetugid() indicates the process is tainted, otherwise getenv(x). The
rational behind this is the fact that many Linux applications use this
function instead of getenv() as it's widely consider a, "best
practice".
Jessica Clarke [Tue, 14 Mar 2023 04:12:31 +0000 (04:12 +0000)]
arm64: Move Azure-specific config from std.hyperv to std.azure
Hyper-V does not provide Mellanox hardware, some of Azure's instances
do, thus the configuration to enable them does not belong in the generic
std.hyperv config.
Fixes: 15e7fa83ef3c ("arm64: Hyper-V: Add vPCI and Mellanox driver modules into build")
Warner Losh [Tue, 14 Mar 2023 02:33:35 +0000 (20:33 -0600)]
Parse /kboot.conf
If there's a kboot.conf, prase it after the command line args are
parsed. It's not always easy to get all the right command line args
depending on the environment. Allow an escape hatch. While we can't do
everything one might like in this file, we can do enough.
Ed Maste [Mon, 13 Mar 2023 20:51:51 +0000 (16:51 -0400)]
makefs: do not call brelse if bread returns an error
If bread returns an error there is no bp to brelse. One of these
changes was taken from NetBSD commit 0a62dad69f62 ("This works well
enough to populate..."), the rest were found by looking for the same
pattern.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39069
Kyle Evans [Sun, 5 Mar 2023 00:49:04 +0000 (18:49 -0600)]
arm: generic_timer: use interrupt-names when available
Offsets for all of thse can be a bit complicated as not all interrupts
will be present, only phys and virt are actually required, and sec-phys
could optionally be specified before phys. Push idx/name pairs into
a new config struct and maintain the old indices while still getting the
correct timers.
Split fdt/acpi attach out independently and allocate interrupts before
we head into the common attach(). The secure physical timer is also
optional there, so mark it so to avoid erroring out if we run into
problems.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D38911
Warner Losh [Mon, 13 Mar 2023 20:28:51 +0000 (14:28 -0600)]
makefs: make msdos creation go fast
Add missing brelse(bp). Without it the cache grows and we have a n^2
lookup. I'm not entirely sure why we read the block before we write it
back out, since the only side effect of that is to allocate memory,
clear the memory, read it in from disk, throw it away with the contents
of the file being written out. We likely should just do a getblk() here
instead, but even with all that, this takes the time it takes to create
a 150MB msdos fs image down from 5 minutes to 30 seconds.
See code review for how we got this. tl;dr: netbsd move brelse
into bwrite and we picked up msdos code after that, but not the
move. That change should be picked up later.
Sponsored by: Netflix
Reviewed by: emaste
MFC After: 1 day (13.2 is coming fast)
Differential Revision: https://reviews.freebsd.org/D39025
Pawel Biernacki [Mon, 13 Mar 2023 16:36:11 +0000 (16:36 +0000)]
netinet6: allow disabling excess log messages
RFC 4443 specifies cases where certain packets, like those originating from
local-scope addresses destined outside of the scope shouldn't be forwarded.
The current practice is to drop them, send ICMPv6 message where appropriate,
and log the message:
At times the volume of such messages cat get very high. Let's allow local
admins to disable such messages on per vnet basis, keeping the current
default (log).
Dimitry Andric [Sat, 25 Feb 2023 00:45:48 +0000 (01:45 +0100)]
zfs: Use .section .rodata instead of .rodata on FreeBSD
In commit 0a5b942d4 the FreeBSD SECTION_STATIC macro was set to
".rodata". This assembler directive is supported by LLVM (as a
convenience alias for ".section .rodata") by not by GNU as.
This caused the FreeBSD builds that are done with gcc to fail.
Therefore, use ".section .rodata" instead, similar to the other
asm_linkage.h headers.
lib/csu: do not compile the body of handle_static_init() for PIC build at all
The referenced symbols that provide init array boundaries are weak,
hidden, and undefined. The code that iterates over that arrays is not
used for the case when libc is compiled as dso.
This should fix linking with ld.bfd.
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Rick Macklem [Sun, 12 Mar 2023 21:34:25 +0000 (14:34 -0700)]
rc.d: Fix NFS server startup scripts to enable vnet prison use
Now that commit cbbb22031f9b is in main,
it is possible to run nfsd(8), nfsuserd(8), mountd(8),
gssd(8) and rpc.tlsservd(8) in an appropriately configured vnet
prison if the "allow.nfsd" option is specified in jail.conf.
This patch fixes the rc scripts for this.
Mostly just replaces the "nojail" KEYWORD with "nojailvnet",
but also avoids setting vfs.nfsd.srvmaxio in a prison, since it
must be set outside of the prisons and applies to all
nfsd(8) instances.
Ihor Antonov [Sun, 12 Mar 2023 16:07:34 +0000 (10:07 -0600)]
daemon: move variables into struct daemon_state
The fact that most of the daemon's state is stored on the stack
of the main() makes it hard to split the logic smaller chunks.
Which in turn leads to huge main func that does a a lot of things.
struct log_params existed because some variables need to be passed
into other functions together.
This change renames struct log_params into daemon_state
and moves the rest of the variables into it. This is a necessary
preparation step for further refactroing.
Justin Hibbits [Sun, 12 Mar 2023 15:46:57 +0000 (11:46 -0400)]
powerpc/pmap: Add pmap_sync_icache() for radix pmap
DTrace pid provider writes to user space to set breakpoints. Failing to
sync the icache can lead to SIGTRAP. Radix pmap is the only one missing
a pmap_sync_icache() method, so the pid provider would only potentially
crash a process on a POWER9 or later system.
The current name was a historical curiosity that started when init array
support was added, and then the file appeared a convenient place for the
addition of the MI common code to csu. It is now referenced by name in
single place and the rename is easy, so do it.
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Why? Most trivial point, it shaves around 600 bytes from the dynamic
binaries on amd64. Less trivial, the removed code is no longer part of
the ABI, and we can ship updates to it with libc updates. Right now most
of the csu is linked into the binaries and require us to do somewhat
tricky ABI compat when it needs to change. For instance, the init_array
change would be much simpler and does not require note tagging if we
have init calling code in libc.
This could be improved more, by splitting dynamic and static
initialization. For instance, &_DYNAMIC tests can be removed then.
Such change, nonetheless, would require building libc three times.
I left this for later, after this change stabilizes, if ever.
Reviewed by: markj
Discussed with: jrtc27 (some objections, see the review), imp
Tested by: markj (aarch64)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D37220
Justin Hibbits [Sat, 11 Mar 2023 16:30:00 +0000 (11:30 -0500)]
dtrace/powerpc: "Fix" stack traces across trap frames
In function boundary tracing the link register is not yet saved to the
save stack location, so the save point contains whatever the previous
'lr' save was, or even garbage, at the time the trap is taken. Address
this by explicitly loading the link register from the trap frame instead
of the stack, and propagate that out.
Mateusz Guzik [Tue, 7 Mar 2023 20:56:54 +0000 (20:56 +0000)]
vm: read-locked fault handling for backing objects
This is almost the simplest patch which manages to avoid write locking
for backing objects, as a result mostly fixing vm object contention
problems.
What is not fixed:
1. cacheline ping pong due to read-locks
2. cacheline ping pong due to pip
3. cacheling ping pong due to object busying
4. write locking on first object
On top of it the use of VM_OBJECT_UNLOCK instead of explicitly tracking
the state is slower multithreaded that it needs to be, done for
simplicity for the time being.