Kornel Dulęba [Tue, 16 Aug 2022 08:16:50 +0000 (10:16 +0200)]
icmp6: Improve validation of PMTU
Currently we accept any pmtu between IPV6_MMTU(1280B) and the link mtu.
In some network topologies could allow a bad actor to perform a DOS attack.
Contrary to IPv4 in IPv6 oversized packets are dropped, and a ICMP
PACKET_TOO_BIG message is sent back to the sender.
After receiving an ICMPv6 packet with pmtu bigger than the
current one the victim will start sending frames that will be dropped
a router with reduced MTU.
Although it will eventually receive another message with correct pmtu,
an attacker can still just inject their spoofed packets frequently
enough to overwrite the correct value.
This issue is described in detail in RFC8201, section 6.
Fix this by checking the current pmtu, and accepting the new one only
if it's smaller.
Alexander Motin [Thu, 14 Jul 2022 19:38:14 +0000 (15:38 -0400)]
Delay GEOM disk_create() until CAM periph probe completes.
Before this patch CAM periph drivers called both disk_alloc() and
disk_create() same time on periph creation. But then prevented disks
from opening until the periph probe completion with cam_periph_hold().
As result, especially if disk misbehaves during the probe, GEOM event
thread, triggered to taste the disk, got blocked on open attempt,
potentially for a long time, unable to process other events.
This patch moves disk_create() call from periph creation to the end of
the probe. To allow disk_create() calls from non-sleepable CAM contexts
some of its duties requiring memory allocations are moved either back
to disk_alloc() or forward to g_disk_create(), so now disk_alloc() and
disk_add_alias() are the only disk methods that require sleeping. If
disk fails during the probe disk_create() may just be skipped, going
directly to disk_destroy(). Other method calls during that time are
just ignored. Since GEOM may now see the disks after CAM bus scan is
already completed, introduce per-periph boot hold functions. Enclosure
driver already had such mechanism, so just generalize it.
Per the EHABI32 specification, __cxa_end_cleanup must take care to
preserve registers before calling _Unwind_Resume(). So, libcxxrt uses
an assembly stub which preserves caller-saved registers around the call
to __cxa_get_cleanup(). But:
- it failed to restore them properly,
- it did not preserve the link register.
Fix both of these problems. This is needed to fix exception unwinding
on FreeBSD with LLVM 14. Note that r4 is callee-saved but is pushed
onto the stack to preserve stack pointer alignment.
When printing the current state name and the old state numeric value,
both are always the same. Remove the redundant ostate. It is always
the same as iv_state.
Cy Schubert [Sun, 12 Jun 2022 19:02:47 +0000 (12:02 -0700)]
rc.d/wpa_supplicant: Remove the sleep to improve boot time
bapt@ had discovered a noticeable boot improvement without the sleep.
Without the sleep does not affect warm or cold boot however a
service netif restart may cause dhclient to spend a few extra seconds
to rerequest the DHCP request.
Brooks Davis [Fri, 1 Jul 2022 07:33:16 +0000 (08:33 +0100)]
installworld: improve portability of ldd use
b3b462229f97 added a case statement to ignore lines containing strings
in square brackets such as "[vdso]" and "[preloaded]". On MacOS
Monterey where /bin/sh may be zsh, this fails with:
/bin/sh: -c: line 0: syntax error near unexpected token `;;'
Invoke grep in the pipeline to remove such lines instead.
Mark Johnston [Wed, 3 Aug 2022 00:32:17 +0000 (20:32 -0400)]
ctfconvert: Give bitfield types names distinct from the base type
CTF integers have an explicit width and so can be used to represent
bitfields. Bitfield types emitted by ctfconvert(1) share the name of
the base integer type, so a struct field with type "unsigned int : 15"
will have a type named "unsigned int".
To avoid ambiguity when looking up types by name, add a suffix to names
of bitfield types to distinguish them from the base type. Then, if
ctfmerge happens to order bitfield types before the corresponding base
type in a CTF file, a name lookup will return the base type, which is
always going to be the desired behaviour.
PR: 265403
Reported by: cy
Sponsored by: The FreeBSD Foundation
DB_COMMAND(9): update to mention additional macros
Document the existing alias definitions, and augment the example with
one of these. Also, describe the purpose of the newly added _FLAGS
variations of these command definitions.
Make some small style improvements to appease mandoc -Tlint.
Reviewed by: markj
MFC after: 3 days
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35664
Mitchell Horne [Thu, 30 Jun 2022 15:58:22 +0000 (12:58 -0300)]
ddb: add _FLAGS command variants
Provide _FLAGS variants of the various command definition macros, in
anticipation of adding a new flag. This can also be used for some
existing commands which require special flag values.
Reviewed by: markj
MFC after: 3 days
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35581
Provide separate helper macros for regular commands and next-level table
commands as they are mutually exclusive. This ensures proper
initialization of each element and allows us to exclude some redundant
fields, such as specifying .more = NULL for every regular command.
Reviewed by: markj, jhb
MFC after: 3 days
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35580
Mitchell Horne [Tue, 21 Jun 2022 13:29:53 +0000 (10:29 -0300)]
bus_if: provide a default null rescan method
There is an existing helper function, bus_null_rescan(), but most
drivers won't use it except to override an inherited rescan method. It
also returns the same error as an empty method. Make this the default
rescan method in bus_if.m and return a more sensible error code.
This gives a slightly more meaningful error message when attempting
'devctl rescan' on buses and devices alike:
"Device not configured" --> "Operation not supported by device"
Reviewed by: imp
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35501
rtld-elf: Fix leaks and wild frees in origin_subst
55abf23dd36b inverted the value passed to origin_subst_one when rolling
up the existing code into a loop. If the first token is found ($ORIGIN),
this results in a wild free of part of strtab. Processing the second
token works fine and will act how the first should have regardless of
whether found, allocating memory for the string without freeing.
Processing subsequent tokens however will then leak, regardless of
whether found, as they will also believe they need to allocate memory
and can't free the string.
Found by: CHERI
Reviewed by: kib, markj
Fixes: 55abf23dd36b ("rtld: make token substitution table-driven")
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D35792
Gordon Bergling [Sun, 7 Aug 2022 14:30:24 +0000 (16:30 +0200)]
libutil: Fix mandoc warnings
- missing comma before name
- possible typo in section name: Sh CAVEAT instead of CAVEATS
- useless macro: Tn
- blank line in fill mode, using .sp
- no blank before trailing delimiter: Dv NULL?
Gordon Bergling [Sun, 7 Aug 2022 12:53:53 +0000 (14:53 +0200)]
libpathconv: Fix mandoc warnings in abs2rel(3) and rel2abs(3)
- cannot parse date, using it verbatim: Dec 15, 1997"
- sections out of conventional order: Sh SEE ALSO
- possible typo in section name: Sh EXAMPLE instead of EXAMPLES
- AUTHORS section without An macro
Mark Johnston [Mon, 25 Jul 2022 20:53:21 +0000 (16:53 -0400)]
vm_fault: Shoot down shared mappings in vm_fault_copy_entry()
As in vm_fault_cow(), it's possible, albeit rare, for multiple vm_maps
to share a shadow object. When copying a page from a backing object
into the shadow, all mappings of the source page must therefore be
removed. Otherwise, future operations on the object tree may detect
that the source page is fully shadowed and thus can be freed.
Approved by: so
Security: FreeBSD-SA-22:11.vm
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35635
elf_note_prpsinfo: handle more failures from proc_getargv()
Resulting sbuf_len() from proc_getargv() might return 0 if user mangled
ps_strings enough. Also, sbuf_len() API contract is to return -1 if the
buffer overflowed. The later should not occur because get_ps_strings()
checks for catenated length, but check for this subtle detail explicitly
as well to be more resilent.
The end result is that p_comm is used in this situations.
Approved by: so
Security: FreeBSD-SA-22:09.elf
Reported by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed by: delphij, markj
admbugs: 988
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35391
Been some time since 364fe18b8c8 when the URL was first in this file.
Update from svnweb to cgit for the URL listed at the end of this file.
In addition, update all URLs to HTTPS. Replace two URLs with links to
archive.org as the original URLs are no longer valid.
Mark Adler [Sat, 30 Jul 2022 22:51:11 +0000 (15:51 -0700)]
zlib: Fix a bug when getting a gzip header extra field with inflate().
If the extra field was larger than the space the user provided with
inflateGetHeader(), and if multiple calls of inflate() delivered
the extra header data, then there could be a buffer overflow of the
provided space. This commit assures that provided space is not
exceeded.
Kristof Provost [Thu, 23 Jun 2022 20:35:29 +0000 (22:35 +0200)]
ipsec: replace SECASVAR mtx by rmlock
This mutex is a significant point of contention in the ipsec code, and
can be relatively trivially replaced by a read-mostly lock.
It does require a separate lock for the replay protection, which we do
here by adding a separate mutex.
This improves throughput (without replay protection) by 10-15%.
ed crowe [Mon, 25 Jul 2022 15:35:46 +0000 (11:35 -0400)]
asmc: Add support for MacBookPro6,2
Modify asmc_sms_printintr() to be silent when the ambient light sensor
interrupt fires on this model, since the messages can otherwise fill up
the dmesg.
Alan Cox [Fri, 29 Jul 2022 06:14:46 +0000 (01:14 -0500)]
iommu_gas: Eliminate redundant parameters and push down lock acquisition
Since IOMMU map entries store a reference to the domain in which they
reside, there is no need to pass the domain to iommu_gas_free_entry(),
iommu_gas_free_space(), and iommu_gas_free_region().
Push down the acquisition and release of the IOMMU domain lock into
iommu_gas_free_space() and iommu_gas_free_region().
Both of these changes allow for simplifications in the callers of the
functions without really complicating the functions themselves.
Moreover, the latter change eliminates the direct use of the IOMMU
domain lock from the x86-specific DMAR code.
Alan Cox [Tue, 26 Jul 2022 04:53:15 +0000 (23:53 -0500)]
x86/iommu: Correct a recent change to iommu_domain_unload_entry()
Correct 8bc367384745. When iommu_domain_unload_entry() performs a
synchronous IOTLB invalidation, it must call dmar_domain_free_entry()
to remove the entry from the domain's RB_TREE.
Push down the acquisition and release of the DMAR lock into the
recently introduced function dmar_qi_invalidate_sync_locked() and
remove the _locked suffix.
Doug Moore [Sun, 10 Jul 2022 19:24:23 +0000 (14:24 -0500)]
iommu_gas: consolidate find_space helpers
Merge lowermatch and uppermatch into find_space. Eliminate uppermatch
recursion. Merge match_insert into match_one and eliminate some
redundant calculation. Move some initialization out of find_space and
into map (and out from under a lock).
Doug Moore [Wed, 20 Apr 2022 22:24:11 +0000 (17:24 -0500)]
dev/iommu: Include offset in maxaddr check.
If iommu_gas_match_one has to adjust for a boundary crossing, its
check against maxaddr includes 'offset' in its calculation, to ensure
that the allocated memory does not exceed the max address. However, if
there's no boundary crossing adjustment, then the maxaddr check
disregards 'offset'. Fix that.
Alan Cox [Mon, 18 Jul 2022 00:56:39 +0000 (19:56 -0500)]
x86/iommu: Shrink the critical section in dmar_qi_task()
It is safe to test and clear the Invalidation Wait Descriptor
Complete flag before acquiring the DMAR lock in dmar_qi_task(),
rather than waiting until the lock is held.
Alan Cox [Fri, 22 Jul 2022 17:00:26 +0000 (12:00 -0500)]
iommu_gas: Eliminate a possible case of use-after-free
Eliminate a possible case of use-after-free in an error handling path
after a mapping failure. Specifically, eliminate IOMMU_MAP_ENTRY_QI_NF
and instead perform the IOTLB invalidation synchronously. Otherwise,
when iommu_domain_unload_entry() is called and told not to free the
IOMMU map entry, the caller could free the entry before dmar_qi_task()
is finished with it.
Alan Cox [Thu, 21 Jul 2022 06:53:54 +0000 (01:53 -0500)]
iommu_gas: Avoid double unmapping on error
In the extremely unlikely case that the iommu_gas_map_region() call in
bus_dma_iommu_load_ident() failed, we would attempt to unmap the failed
entry twice, first in iommu_gas_map_region(), and a second time in the
caller. Once is enough, and twice is problematic because it leads to a
second RB_REMOVE call on the same tree node. Like it or not, RB_TREE
does not handle that possibility.
Steve Kargl [Thu, 4 Aug 2022 17:31:57 +0000 (19:31 +0200)]
[libm] Correct comments in s_cbrt[l].c
Damian McGuckin <damianm at esi dot com dot au> noted that the accuracy
claims in the code for cbrt(3) and cbrtl(3) were incorrect. Fix the
comments to more accurately describe the accuracies.
David Sips [Fri, 22 Jul 2022 17:17:04 +0000 (19:17 +0200)]
if_vlan: avoid hash table thrashing when adding and removing entries
vlan_remhash() uses incorrect value for b.
When using the default value for VLAN_DEF_HWIDTH (4), the VLAN hash-list table
expands from 16 chains to 32 chains as the 129th entry is added. trunk->hwidth
becomes 5. Say a few more entries are added and there are now 135 entries.
trunk-hwidth will still be 5. If an entry is removed, vlan_remhash() will
calculate a value of 32 for b. refcnt will be decremented to 134. The if
comparison at line 473 will return true and vlan_growhash() will be called. The
VLAN hash-list table will be compressed from 32 chains wide to 16 chains wide.
hwidth will become 4. This is an error, and it can be seen when a new VLAN is
added. The table will again be expanded. If an entry is then removed, again
the table is contracted.
If the number of VLANS stays in the range of 128-512, each time an insert
follows a remove, the table will expand. Each time a remove follows an
insert, the table will be contracted.
The fix is simple. The line 473 should test that the number of entries has
decreased such that the table should be contracted using what would be the new
value of hwidth. line 467 should be:
Mark Johnston [Thu, 28 Jul 2022 13:38:52 +0000 (09:38 -0400)]
riscv: Avoid passing invalid addresses to pmap_fault()
After the addition of SV48 support, VIRT_IS_VALID() did not exclude
addresses that are in the SV39 address space hole but not in the SV48
address space hole. This can result in mishandling of accesses to that
range when in SV39 mode.
Fix the problem by modifying VIRT_IS_VALID() to use the runtime address
space bounds. Then, if the address is invalid, and pcb_onfault is set,
give vm_fault_trap() a chance to veto the access instead of panicking.
PR: 265439
Reviewed by: jhb
Reported and tested by: Robert Morris <rtm@lcs.mit.edu>
Fixes: 31218f3209ac ("riscv: Add support for enabling SV48 mode")
Sponsored by: The FreeBSD Foundation
Ed Maste [Tue, 28 Jun 2022 13:06:12 +0000 (09:06 -0400)]
Add ELFCOMPRESS_ZSTD ELF compression constant
ELFCOMPRESS_ZSTD indicates that an ELF section is compressed with zstd.
It is the second compression type, after the existing ELFCOMPRESS_ZLIB.
Zstd generally provides a compelling tradeoff of speed and compression
(other algorithms may compress slightly better but take a lot longer,
or run faster but do not compress nearly as well).
See https://groups.google.com/g/generic-abi/c/satyPkuMisk for details.
ELFCOMPRESS_ZSTD will be supported in a future Clang/LLVM update. ELF
Tool Chain tools also need updating.
Reviewed by: Fangrui Song
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Suppress possible unused variable warning for icl_soft.c
With clang 15, the following -Werror warning is produced on i386:
sys/dev/iscsi//icl_soft.c:1277:6: error: variable 'i' set but not used [-Werror,-Wunused-but-set-variable]
int i;
^
The 'i' variable is used later in the icl_soft_conn_pdu_get_bio()
function, via the PHYS_TO_DMAP() macro. However, on i386 and some other
architectures, this macro is defined to panic immediately, so in those
cases, 'i' is indeed not used. Suppress the warning by marking 'i' as
unused.
Adjust function definition in isa's pnp.c to avoid clang 15 warning
With clang 15, the following -Werror warning is produced:
sys/isa/pnp.c:118:24: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
pnp_send_initiation_key()
^
void
This is because pnp_send_initiation_key() is declared with a (void)
argument list, but defined with an empty argument list. Make the
definition match the declaration.
Mark Johnston [Mon, 18 Jul 2022 19:50:45 +0000 (15:50 -0400)]
sched_ule: Ensure we hold the thread lock when modifying td_flags
The load balancer may force a running thread to reschedule and pick a
new CPU. To do this it sets some flags in the thread running on a
loaded CPU. But the code assumed that a running thread's lock is the
same as that of the corresponding runqueue, and there are small windows
where this is not true. In this case, we can end up with non-atomic
modifications to td_flags.
Since this load balancing is best-effort, simply give up if the thread's
lock doesn't match; in this case the thread is about to enter the
scheduler anyway.
Reviewed by: kib
Reported by: glebius
Fixes: e745d729be60 ("sched_ule(4): Improve long-term load balancer.")
Sponsored by: The FreeBSD Foundation
Mark Johnston [Thu, 14 Jul 2022 14:43:53 +0000 (10:43 -0400)]
sched_ule: Use explicit atomic accesses for tdq fields
Different fields in the tdq have different synchronization protocols.
Some are constant, some are accessed only while holding the tdq lock,
some are modified with the lock held but accessed without the lock, some
are accessed only on the tdq's CPU, and some are not synchronized by the
lock at all.
Convert ULE to stop using volatile and instead use atomic_load_* and
atomic_store_* to provide the desired semantics for lockless accesses.
This makes the intent of the code more explicit, gives more freedom to
the compiler when accesses do not need to be qualified, and lets KCSAN
intercept unlocked accesses.
Thus:
- Introduce macros to provide unlocked accessors for certain fields.
- Use atomic_load/store for all accesses of tdq_cpu_idle, which is not
synchronized by the mutex.
- Use atomic_load/store for accesses of the switch count, which is
updated by sched_clock().
- Add some comments to fields of struct tdq describing how accesses are
synchronized.
No functional change intended.
Reviewed by: mav, kib
Sponsored by: The FreeBSD Foundation
Mark Johnston [Thu, 14 Jul 2022 14:24:25 +0000 (10:24 -0400)]
x86: Add a required store-load barrier in cpu_idle()
ULE's tdq_notify() tries to avoid delivering IPIs to the idle thread.
In particular, it tries to detect whether the idle thread is running.
There are two mechanisms for this:
- tdq_cpu_idle, an MI flag which is set prior to calling cpu_idle(). If
tdq_cpu_idle == 0, then no IPI is needed;
- idle_state, an x86-specific state flag which is updated after
cpu_idleclock() is called.
The implementation of the second mechanism is racy; the race can cause a
CPU to go to sleep with pending work. Specifically, cpu_idle_*() set
idle_state = STATE_SLEEPING, then check for pending work by loading the
tdq_load field of the CPU's runqueue. These operations can be reordered
so that the idle thread observes tdq_load == 0, and tdq_notify()
observes idle_state == STATE_RUNNING.
Some counters indicate that the idle_state check in tdq_notify()
frequently elides an IPI. So, fix the problem by inserting a fence
after the store to idle_state, immediately before idling the CPU.
PR: 264867
Reviewed by: mav, kib, jhb
Sponsored by: The FreeBSD Foundation
Mark Johnston [Thu, 14 Jul 2022 14:23:43 +0000 (10:23 -0400)]
sched_ule: Enable preemption of curthread in the load balancer
The load balancer executes from statclock and periodically tries to move
threads among CPUs in order to balance load. It may move a thread to
the current CPU (the loader balancer always runs on CPU 0). When it
does so, it may need to schedule preemption of the interrupted thread.
Use sched_setpreempt() to do so, same as sched_add().
PR: 264867
Reviewed by: mav, kib, jhb
Sponsored by: The FreeBSD Foundation