Dimitry Andric [Thu, 30 Dec 2021 09:53:25 +0000 (10:53 +0100)]
Avoid emitting popcnt in libclang_rt.fuzzer*.a if unsupported
Since popcnt is only supported by CPUTYPE=nehalem and later, ensure that
this instruction is only emitted when appropriate. Otherwise, programs
using the library can abort with SIGILL.
See also: https://github.com/llvm/llvm-project/issues/52893
PR: 258156
Reported by: Eric Rucker <bhtooefr@bhtooefr.org>
MFC after: 3 days
Mark Johnston [Thu, 16 Dec 2021 21:53:59 +0000 (16:53 -0500)]
fd: Initialize more export_fd_buf fields in kern_proc_cwd_out()
In particular, we need to initialize efbuf->flags, since
export_vnode_to_sb() loads that field. This was mostly harmless since
the flag only determines whether the output kinfo_file is packed, and
KERN_PROC_CWD only ever emits a single kinfo_file anyway.
Mark Johnston [Fri, 17 Dec 2021 15:59:29 +0000 (10:59 -0500)]
unix: Increase the default datagram recv buffer size
syslog(3) was recently change to support larger messages, up to 8KB.
Our syslogd handles this fine, as it adjusts /dev/log's recv buffer to a
large size. rsyslog, however, uses the system default of 4KB. This
leads to problems since our syslog(3) retries indefinitely when a send()
returns ENOBUFS, but if the message is large enough this will never
succeed.
Increase the default recv buffer size for datagram sockets to support
8KB syslog messages without requiring the logging daemon to adjust its
buffers.
PR: 260126
Reviewed by: asomers
Sponsored by: The FreeBSD Foundation
Bjoern A. Zeeb [Mon, 27 Dec 2021 17:42:51 +0000 (17:42 +0000)]
iwlwifi: plug memory modified after free
In certain situations we saw a memory modified after free. This was
tracked down to a pointer not NULLed after free and used in a different
code path. It is unclear how the race happens pending further
investigation but setting the pointer to NULL after free and adding a
check in the 2nd code path handling the case gracefully helps for now.
While here improve another debug messge in sta handling.
Bjoern A. Zeeb [Mon, 27 Dec 2021 17:40:02 +0000 (17:40 +0000)]
iwlwifi: add man pages
Add and hook up man pages for iwlwifi and iwlwififw and install a copy
of the firmware license to /usr/share/docs/legal so it will always be
shipped with the installed system.
Bjoern A. Zeeb [Mon, 27 Dec 2021 17:36:04 +0000 (17:36 +0000)]
iwlwifi: also depend on linuxkpi_wlan
The 802.11 compat code is split off linuxkpi.ko into linuxkpi_wlan.ko
in case it is built as a module. Depend on that.
While here adjust our module to a longer version to avoid conflicts.
Bjoern A. Zeeb [Sun, 26 Dec 2021 18:52:51 +0000 (18:52 +0000)]
LinuxKPI: add 802.11 compat code
Add 802.11 compat code for mac80211 and to a minimal degree cfg80211.
This allows us to compile and use basic functionality of wireless
drivers such as iwlwifi.
This is a constant work in progress but having it in the tree will
allow others to test and more easy to track changes and avoid having
snapshots no longer applying to branches.
Bjoern A. Zeeb [Sun, 26 Dec 2021 18:29:29 +0000 (18:29 +0000)]
LinuxKPI: import beginning of a new version of netdevice.h
Import a netdevice update complementing the last remaining bits of
the old ifnet derived implementation. Along add a (for now) task
based NAPI implementation.
This is the minimal set of chnages which are needed for the initial
support of wireless drivers. The NAPI implementation has an option to
still switch to "direct dispatch" as it had been used by these drivers
before not relying on a deferred context along with some printf tracing.
This has been helpful in the last weeks for debugging and will be
cleaned once we have had broader testing and are sure this is fine as-is.
Should we need a more time-sensitive or load-sensitive response
in the future we can always switch to something more sophisticated.
Sponsored by: The FreeBSD Foundation
X-Differential Revision: D33075 (abandoned without feedback a while ago)
Bjoern A. Zeeb [Sun, 26 Dec 2021 18:26:26 +0000 (18:26 +0000)]
LinuxKPI: add a work-in-progress skbuff implementation
This is a work-in-progress implementation of sk_buff compat code
used for wireless drivers only currently.
Bring in this version of the code as it has proven to be good enough
to have packets going for a few months.
The current implementation has several drawbacks including the need
for us to copy data between sk_buffs and mbufs.
Do not rely on the internals of this implementation. They are highly
likely to change as we will improve the integration to FreeBSD mbufs.
Bjoern A. Zeeb [Sun, 26 Dec 2021 17:26:58 +0000 (17:26 +0000)]
net80211: adjust a printf to toeee80211_note
Throughout net80211 there are multiple ways to log (debugging)
information. Start to clenaup one as I kept hitting it to harmonize
the output. The more we get away from printfs into either wrapper
functions or macros the more likely we can use holistic systematic
tracing in the future.
Bjoern A. Zeeb [Sun, 26 Dec 2021 17:25:57 +0000 (17:25 +0000)]
net80211: add debugging information
Add more STATE / DEBUG probes and enhance the output of one in order
to track state changes triggered by "ack" (or not).
This helped to narrow down causes from drivers or the LinuxKPI 802.11
compat framework which kept us in a scan -> auth -> scan loop.
Bjoern A. Zeeb [Sun, 26 Dec 2021 17:24:04 +0000 (17:24 +0000)]
net80211: format debug functions as single line
Making use of the debug output was hard given debug lines were run in
parts through vlog (if_printf) and in (multiple) parts through printf(s).
Like some of the functions alreay have, use a local buffer to format
the string and then use a single if_printf; in addition given these
functions are debug-only, add an extra printf in case we find our
buffers still to be too small so we can adjust for the future.
We already found that 128 characters are to short for some log messages.
Bump the buffer sizes collectively to 256 characters which also is
the maximum of if_vlog() so getting longer would need further changes
elsewhere.
Bjoern A. Zeeb [Wed, 17 Nov 2021 19:35:46 +0000 (19:35 +0000)]
modules: increase MAXMODNAME and provide backward compat
With various firmware files used by graphics and wireless drivers
we are exceeding the current 32 character module name (file path
in kldxref) length.
In order to overcome this issue bump it to the maximum path length
for the next version.
To be able to MFC provide backward compat support for another version
of the struct as the offsets for the second half change due to the
array size increase.
MAXMODNAME being defined to MAXPATHLEN needs param.h to be
included first. With only 7 modules (or LinuxKPI module.h) not
doing that adjust them rather than including param.h in module.h [1].
Reported by: Greg V (greg unrelenting.technology)
Sponsored by: The FreeBSD Foundation
Suggested by: imp [1]
Reviewed by: imp (and others to different level)
Differential Revision: https://reviews.freebsd.org/D32383
Bjoern A. Zeeb [Sun, 28 Nov 2021 18:35:53 +0000 (18:35 +0000)]
iwlwifi: import Intel's iwlwifi/mvm driver.
Over the past few months we published multiple snapshots for this
Linux derived driver and it has become fairly stable in terms of
minimal local changes needed for new updates.
Given the lack of full license texts on non-local files this is
imported under the draft policy for handling SPDX files (D29226). [1]
Do not yet hook this to the build until the remaining compat code
is all in. Along with the firmware import this will make publishing
the last bits and final testing a lot easier.
Sponsored by: The FreeBSD Foundation
Approved by: core (imp) [1]
Bjoern A. Zeeb [Sun, 28 Nov 2021 18:34:19 +0000 (18:34 +0000)]
iwlwifi: import firmware for Intel iwlwifi/mvm supported chipsets.
Import the most recent versions of the firmware images for iwlwifi
chipsets supported by the "mvm" sub-driver.
This is based on linux-firmware at f5d519563ac9d2d1f382a817aae5ec5473811ac8.
The license of the firmware matches the previous iwnfw(4) and
iwmfw(4) firmware files and you can find a copy in
sys/contrib/dev/iwlwififw/LICENCE.iwlwifi_firmware .
Add build infrastructure to create the .ko files but do not yet hook
it up to the build until all parts are in the tree.
There is an open issue concerning kldxref that we need to resolve
(D32383).
Bjoern A. Zeeb [Wed, 24 Nov 2021 21:28:46 +0000 (21:28 +0000)]
LinuxKPI: USB return possible error from suspend/resume
USB suspend/resume cannot fail so we never returned the error which
resulted in a -Wunused-but-set-variable warning.
Initialize the return variable and return a possible error possibly
triggering a printf upstream to at least have a trace of the problem.
This also fixes the warning.
Suggested by: hselasky
Sponsored by: The FreeBSD Foundation
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D33107
Bjoern A. Zeeb [Wed, 24 Nov 2021 17:36:48 +0000 (17:36 +0000)]
net80211: fix -Wunused-but-set-variable warnings
Put the offending variables under the appropriate #ifdefs
(mostly IEEE80211_DEBUG, in one case IEEE80211_SUPPORT_SUPERG, and
in two cases under __notyet__ to revisit why these had been left
there but not used).
Dawid Gorecki [Wed, 13 Oct 2021 19:06:05 +0000 (21:06 +0200)]
libthr: Use kern.stacktop for thread stack calculation.
Use the new kern.stacktop sysctl to retrieve the address of stack top
instead of kern.usrstack. kern.usrstack does not have any knowledge
of the stack gap, so this can cause problems with thread stacks.
Using kern.stacktop sysctl should fix most of those problems.
kern.usrstack is used as a fallback when kern.stacktop cannot be read.
Rename usrstack variables to stacktop to reflect this change.
Fixes problems with firefox and thunderbird not starting with
stack gap enabled.
Dawid Gorecki [Wed, 13 Oct 2021 19:03:37 +0000 (21:03 +0200)]
kern_exec: Add kern.stacktop sysctl.
With stack gap enabled top of the stack is moved down by a random
amount of bytes. Because of that some multithreaded applications
which use kern.usrstack sysctl to calculate address of stacks for
their threads can fail. Add kern.stacktop sysctl, which can be used
to retrieve address of the stack after stack gap is applied to it.
Returns value identical to kern.usrstack for processes which have
no stack gap.
Dawid Gorecki [Wed, 13 Oct 2021 19:01:08 +0000 (21:01 +0200)]
setrlimit: Take stack gap into account.
Calling setrlimit with stack gap enabled and with low values of stack
resource limit often caused the program to abort immediately after
exiting the syscall. This happened due to the fact that the resource
limit was calculated assuming that the stack started at sv_usrstack,
while with stack gap enabled the stack is moved by a random number
of bytes.
Save information about stack size in struct vmspace and adjust the
rlim_cur value. If the rlim_cur and stack gap is bigger than rlim_max,
then the value is truncated to rlim_max.
Emmanuel Vadot [Wed, 8 Dec 2021 10:37:50 +0000 (11:37 +0100)]
bsdinstall: zfsboot: Prompt user for zpool name if the pool already exists
If one install FreeBSD on multiple disks (say 13 and CURRENT) the first created
pool will always be used.
Prompt the user for a new pool name if we detect that the default or supplied one
already exists.
Emmanuel Vadot [Wed, 8 Dec 2021 09:37:54 +0000 (10:37 +0100)]
bsdinstall: bootconfig: Try to clean old efi boot entries
If one install FreeBSD on the same machine multiple times in a row or
on different harddrive they have a lot of 'FreeBSD' efi boot entries added.
With this patch we now do :
- If there is no 'FreeBSD' entry we add one like before
- If there is one or more entries we ask the user if they want to delete
them all and add a new one
- If they say yes we do that
- If they say no we prompt them an inputbox so they can enter a different
entry name if they want, it defaults to 'FreeBSD'
Rick Macklem [Thu, 16 Dec 2021 00:36:40 +0000 (16:36 -0800)]
nfscl: Handle CB_SEQUENCE not first op correctly
The check for "not first operation" in CB_SEQUENCE
was done after the slot, etc. was updated. This patch
moves the check to the beginning of CB_SEQUENCE
processing.
While here, also fix the check for "no CB_SEQUENCE operation first"
by moving the check to the beginning of callback operation parsing,
since the check was in a couple of the other operations, but
not all of them.
I thought that these new auth_stat values had been agreed
upon by the IETF NFSv4 working group, but that no longer
is the case. As such, delete them and use AUTH_TOOWEAK
instead. Leave the code that uses these new auth_stat
values in the sources #ifdef notnow, in case they are
defined in the future.
Maxim Sobolev [Fri, 20 Aug 2021 19:33:51 +0000 (12:33 -0700)]
Only trigger read-ahead if two adjacent blocks have been requested.
The change makes block caching algorithm to work better for remote
media on low-BW/high-delay links.
This cuts boot time over IP KVMs noticeably, since the initialization
stage reads bunch of small 4th (and now lua) files that are not in
the same cache stripe (usually), thus wasting lot of bandwidth and
increasing latency even further.
The original regression came in 2017 with revision 87ed2b7f5. We've
seen increase of time it takes for the loader to get to the kernel
loading from under a minute to 10-15 minutes in many cases.
Colin Percival [Thu, 30 Sep 2021 21:48:14 +0000 (14:48 -0700)]
EFI loader: Don't free bcache for DEVT_DISK devs
Booting on an EC2 c5.xlarge instance, this reduces the number of I/Os
performed from 609 to 432, reduces the total number of blocks read
from 61963 to 60797, and reduces the time spent in the loader by 39 ms.
Note that b4cb3fe0e39a allowed the bcache to be retained for most of
the boot process, but relies on mounting filesystems; this commit
allows the bcache to be retained at the start of the boot process,
before the root filesystem has been located.
Colin Percival [Tue, 28 Sep 2021 18:39:59 +0000 (11:39 -0700)]
loader: Set twiddle globaldiv to 16 by default
Booting FreeBSD on an EC2 c5.xlarge instance, the loader "twiddles"
810 times over the course of 510 ms, a rate of 1.59 kHz. Even accepting
that many systems are slower than this particular VM and will take
longer to boot (especially if using spinning-rust disks), this seems
like an unhelpfully large amount of twiddling when compared to the
~60 Hz frame rate of many displays; printing the twiddles also consumes
roughly 10% of the boot time on the aforementioned VM.
Setting the default globaldiv to 16 dramatically reduces the time spent
printing twiddles to the console while still twiddling at roughly 100
Hz; this should be ample even for systems which take longer to boot and
consequently twiddle slower.
Note that this can adjusted via the twiddle_divisor variable in
loader.conf, but that file is not processed until nearly halfway
through the loader's runtime.
Colin Percival [Sun, 30 May 2021 20:17:11 +0000 (20:17 +0000)]
MFC loader+userland TSLOG support
stand/common: Add file_addbuf()
libsa: Add support for timestamp logging (tslog)
stand/common: Add support for timestamp logging (tslog)
i386/loader: Call tslog_init
efi/loader: Call tslog_init (+ bugfix)
stand/common command_boot: Pass tslog to kernel
kern_tslog: Include tslog data from loader
loader: Use tslog to instrument some functions
Add userland boot profiling to TSLOG (+ bugfix)
Summary:
One was required to press a key to continue after every 18 lines of
output. This requirement had been in the "show vmopag" command since it
was introduced, which was many years before paging was added to DDB.
With paging, this explict key check is no longer necessary.
Obtained from: Juniper Networks, Inc.
MFC after: 1 week
Test Plan:
Run "show vmopag" from db> prompt and see that it does not need additional
keypresses other than the ones needed for the pager.
Bjoern A. Zeeb [Tue, 30 Nov 2021 14:23:18 +0000 (14:23 +0000)]
fw_stub: fix -Wunused-but-set-variable for firmware files
In case we are only embedding a single firmware image the variable
"parent" gets set but never used. Add checks for the number of files
for it and only print it out if we are exceeding the single file count.
This fixes -Wunused-but-set-variable warnings for the majority of
firmware files in the tree.
Bjoern A. Zeeb [Mon, 13 Dec 2021 22:10:25 +0000 (22:10 +0000)]
rc: network.subr improve network6_getladdr()
In network6_getladdr() we are iterating over inet6 lines and are not
interested in any others. So tell ifconfig to limit output to "inet6"
as much as possible.
This is probably a micro-optimisation but was noticed while looking
at other IPv6-related boot-time improvements.
Mark Johnston [Tue, 28 Dec 2021 22:44:57 +0000 (17:44 -0500)]
x86: Do not attempt to calibrate the LAPIC timer if no APIC is present
Reported and tested by: Michael Butler <imb@protected-networks.net>
Reviewed by: jhb, kib
Fixes: 62d09b46ad75 ("x86: Defer LAPIC calibration until after timecounters are available")
Sponsored by: The FreeBSD Foundation
Mark Johnston [Mon, 6 Dec 2021 15:42:19 +0000 (10:42 -0500)]
x86: Perform late TSC calibration before LAPIC timer calibration
This ensures that LAPIC calibration is done using the correct tsc_freq
value, i.e., the one associated with the TSC timecounter. It does mean
though that TSC calibration cannot use sbinuptime() to read the
reference timecounter, as timehands are not yet set up.
Reviewed by: kib, jhb
Sponsored by: The FreeBSD Foundation
Mark Johnston [Mon, 6 Dec 2021 15:42:10 +0000 (10:42 -0500)]
x86: Defer LAPIC calibration until after timecounters are available
This ensures that we have a good reference timecounter for performing
calibration.
Change lapic_setup to avoid configuring the timer when booting, and move
calibration and initial configuration to a new lapic routine,
lapic_calibrate_timer. This calibration will be initiated from
cpu_initclocks(), before an eventtimer is selected.
Reviewed by: kib, jhb
Sponsored by: The FreeBSD Foundation
Mark Johnston [Mon, 15 Nov 2021 20:31:21 +0000 (15:31 -0500)]
x86: Implement deferred TSC calibration
There is no universal way to find the TSC frequency. Newer Intel CPUs
may report it via CPUID leaves 0x15 and 0x16. Sometimes it can be
obtained from the PLATFORM_INFO MSR as well, though we never use that.
On older platforms we derive the frequency using a DELAY(1000000) call,
which uses the 8254 PIT. On some newer platforms the 8254 is apparently
non-functional, leading to bogus calibration results. On such platforms
the TSC frequency must be available from CPUID. It is also possible to
disable calibration with a tunable, in which case we try to parse the
brand string if the TSC freq is not available from CPUID.
CPUID 0x15 provides an authoritative TSC frequency value, but even that
is not always available on new Intel platforms. CPUID 0x16 provides the
specified processor base frequency, which is not the same as the TSC
frequency. Empirically, it is close enough for early boot, but too far
off for timekeeping: on a Comet Lake NUC, CPUID 0x16 yields 1600MHz but
the TSC frequency is rougly 1608MHz, leading to frequent clock stepping
when NTP is in use.
Thus we have a situation where we cannot calibrate using the PIT and
cannot obtain a precise frequency from CPUID (or MSRs). This change
seeks to address that by using the CPUID 0x16 value during early boot
and refining the calibration later once ACPI-based timecounters are
available. TSC frequency detection is thus split into two phases:
Early phase:
- On Intel platforms, query CPUID 0x15 and 0x16 and use that value
initially if available.
- Otherwise, get an estimate using the PIT, reducing the delay loop to
100ms from 1s.
- Continue to register the TSC as the CPU ticks provider early, even
though the frequency may be off. Otherwise any code executed during
boot that uses cpu_ticks() (e.g., context switching) gets tripped up
when the ticks provider changes.
Later phase:
- In SI_SUB_CLOCKS, once the timehands are initialized, load the current
TSC and timecounter (sbinuptime()) values at the beginning and end of
a 1s interval and use the timecounter frequency (typically from
kvmclock, HPET or the ACPI PM timer) to estimate the TSC frequency.
- Update the TSC timecounter, global tsc_freq and CPU ticker with the
new frequency and finally register the TSC as a timecounter.
Reviewed by: kib, jhb (previous version)
Discussed with: imp, cperciva
Sponsored by: The FreeBSD Foundation
Piotr Kubaj [Mon, 22 Nov 2021 02:28:46 +0000 (03:28 +0100)]
Add assembly optimized code for OpenSSL on powerpc, powerpc64 and powerpc64le
Summary:
1. https://github.com/openssl/openssl/commit/34ab13b7d8e3e723adb60be8142e38b7c9cd382a
needs to be merged for ELFv2 support on big-endian.
2. crypto/openssl/crypto/ppccap.c needs to be patched.
Same reason as in https://github.com/openssl/openssl/pull/17082.
Bjoern A. Zeeb [Sun, 26 Dec 2021 15:33:48 +0000 (15:33 +0000)]
IPv4: fix redirect sending conditions
RFC792,1009,1122 state the original conditions for sending a redirect.
RFC1812 further refine these.
ip_forward() still sepcifies the checks originally implemented for these
(we do slightly more/different than suggested as makes sense).
The implementation added in 8ad114c082a159c0dde95aa35d2e3e108aa30a75
to ip_tryforward() however is flawed and may send a "multi-hop"
redirects (to a host not on the directly connected network).
Do proper checks in ip_tryforward() to stop us from sending redirects
in situations we may not. Keep as much logic out of ip_tryforward()
and in ip_redir_alloc() and only do the mbuf copy once we are sure we
will send a redirect.
While here enhance and fix comments as to which conditions are handled
for sending redirects in various places.
Andrew Turner [Tue, 14 Dec 2021 15:49:07 +0000 (15:49 +0000)]
Fix dtrace fbt return probes on arm64
As with arm and riscv fix return fbt probes on arm64. arg0 should be
the offset within the function of the return instruction and arg1
should be the return value.
Reviewed by: kp, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33440
Andrew Turner [Fri, 19 Nov 2021 11:32:58 +0000 (11:32 +0000)]
Add accelerated arm64 sha512 to libmd
As with sha256 add support for accelerated sha512 support to libmd on
arm64. This depends on clang 13+ to build as this is the first release
with the needed intrinsics. Gcc should also support them, however from
a currently unknown release.
Reviewed by: cem
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33373
Andrew Turner [Tue, 7 Dec 2021 14:23:13 +0000 (14:23 +0000)]
Handle table attributes in the arm64 kernel map
When getting the arm64 kernel maps sysctl we should look at the table
attributes as well as the block/page attributes. These attributes are
different to the last level attributes so need to be translated.
The previous code assumed the table and block/page attributes are
identical, however this is not the case. Handle the difference by
extracting the code into new helper functions & calling them as needed
based on the entry type being checked.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33321
Andrew Turner [Mon, 6 Dec 2021 13:08:39 +0000 (13:08 +0000)]
Teach vm.pmap.kernel_maps about both XN bits
The arm64 vm.pmap.kernel_maps sysctl would only check the kernel XN bit
when printing the kernel mapping. It can also be useful to check none
of the mappings allow userspace to execute from a given virtual address.
To check for this add the user XN bit when getting the kernel maps.
While here fix the ATTR_S1_AP_USER check to use ATTR_S1_AP to shift the
value to the correct offset.
Reviewed by: kib (earlier version), markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33304
Andrew Turner [Tue, 14 Dec 2021 18:09:47 +0000 (18:09 +0000)]
Don't fail changing props for unmapped DMAP memory
When recursing in pmap_change_props_locked we may fail because there is
no pte. This shouldn't be considered a fail as it may happen in a few
cases, e.g. there are multiple normal memory ranges with device memory
between them.
Reported by: cperciva
Tested by: cperciva
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33459
Andrew Turner [Tue, 14 Dec 2021 10:05:15 +0000 (10:05 +0000)]
Only change DMAP props on DMAP covered memory
When changing memory properties in the arm64 pmap we need to keep both
the kernel address and DMAP mappings in sync.
To keep the kernel and DMAP memory in sync we recurse when updating the
former to also update the latter. There was insuffucuent checking around
this recursion. It would check if the virtual address is not within the
DMAP region, but not if the physical address is covered.
Add the missing check as without it the recursion may return an error.
Mark Johnston [Tue, 14 Dec 2021 20:10:46 +0000 (15:10 -0500)]
vm_fault: Fix vm_fault_populate()'s handling of VM_FAULT_WIRE
vm_map_wire() works by calling vm_fault(VM_FAULT_WIRE) on each page in
the rage. (For largepage mappings, it calls vm_fault() once per large
page.)
A pager's populate method may return more than one page to be mapped.
If VM_FAULT_WIRE is also specified, we'd wire each page in the run, not
just the fault page. Consider an object with two pages mapped in a
vm_map_entry, and suppose vm_map_wire() is called on the entry. Then,
the first vm_fault() would allocate and wire both pages, and the second
would encounter a valid page upon lookup and wire it again in the
regular fault handler. So the second page is wired twice and will be
leaked when the object is destroyed.
Fix the problem by modify vm_fault_populate() to wire only the fault
page. Also modify the error handler for pmap_enter(psind=1) to not test
fs->wired, since it must be false.
PR: 260347
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Clean up a couple of MD warts in vm_fault_populate():
--Eliminate a big ifdef that encompassed all currently-supported
architectures except mips and powerpc32. This applied to the case
in which we've allocated a superpage but the pager-populated range
is insufficient for a superpage mapping. For platforms that don't
support superpages the check should be inexpensive as we shouldn't
get a superpage in the first place. Make the normal-page fallback
logic identical for all platforms and provide a simple implementation
of pmap_ps_enabled() for MIPS and Book-E/AIM32 powerpc.
--Apply the logic for handling pmap_enter() failure if a superpage
mapping can't be supported due to additional protection policy.
Use KERN_PROTECTION_FAILURE instead of KERN_FAILURE for this case,
and note Intel PKU on amd64 as the first example of such protection
policy.
Andriy Gapon [Mon, 6 Dec 2021 07:59:28 +0000 (09:59 +0200)]
vmxnet3: skip zero-length descriptor in the middle of a packet
Passing up such descriptors to iflib is obviously wasteful.
But the main conern is that we may overrun iri_frags array because of
them. That's been observed in practice.
Also, assert that the number of fragments / descriptors / segments is
less than IFLIB_MAX_RX_SEGS.
Dimitry Andric [Fri, 24 Dec 2021 11:46:00 +0000 (12:46 +0100)]
Apply clang fix for crash or assertion failure compiling part of llvm
Merge commit 77e8f4eeeeed from llvm git (by David Green):
[ARM] Define ComplexPatternFuncMutatesDAG
Some of the Arm complex pattern functions call canExtractShiftFromMul,
which can modify the DAG in-place. For this to be valid and handled
successfully we need to define ComplexPatternFuncMutatesDAG.
When building parts of llvm targeting armv6 on stable/12, the following
assertion can appear (or if assertions are disabled, clang is likely to
crash):
Assertion failed: (NodeToMatch->getOpcode() != ISD::DELETED_NODE && "NodeToMatch was removed partway through selection"), function SelectCodeCommon, file /usr/src/contrib/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp, line 3573.
PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /usr/obj/usr/src/freebsd12-amd64/tmp/usr/bin/c++ -cc1 -triple armv6kz-unknown-freebsd12.3-gnueabihf -S --mrelax-relocations -disable-free -disable-llvm-verifier -discard-value-names -mrelocation-model static -mconstructor-aliases -target-cpu arm1176jzf-s -target-feature +vfp2 -target-feature +vfp2sp -target-feature -vfp3 -target-feature -vfp3d16 -target-feature -vfp3d16sp -target-feature -vfp3sp -target-feature -fp16 -target-feature -vfp4 -target-feature -vfp4d16 -target-feature -vfp4d16sp -target-feature -vfp4sp -target-feature -fp-armv8 -target-feature -fp-armv8d16 -target-feature -fp-armv8d16sp -target-feature -fp-armv8sp -target-feature -fullfp16 -target-feature +fp64 -target-feature -d32 -target-feature -neon -target-feature -sha2 -target-feature -aes -target-feature -fp16fml -target-feature +strict-align -target-abi aapcs-linux -mfloat-abi hard -fallow-half-arguments-and-returns -ffunction-sections -fdata-sections -O1 -std=c++14 -fdeprecated-macro -fno-rtti -fno-signed-char -faddrsig -fexperimental-new-pass-manager PPCISelLowering-009095.ii
1. <eof> parser at end of file
2. Code generation
3. Running pass 'Function Pass Manager' on module 'PPCISelLowering-009095.cpp'.
4. Running pass 'ARM Instruction Selection' on function '@_ZN4llvm17PPCTargetLoweringC2ERKNS_16PPCTargetMachineERKNS_12PPCSubtargetE'
This crash or assertion is fixed by the upstream commit.
Dimitry Andric [Mon, 20 Dec 2021 09:52:02 +0000 (10:52 +0100)]
tests/libalias: Make inline functions static inline
In C, plain inline functions should never be used: they should be
declared either static inline or extern inline. In this case, they are
clearly meant to be static inline.
黃清隆 [Mon, 13 Dec 2021 16:09:15 +0000 (08:09 -0800)]
sys/dev/arcmsr: Update Areca RAID driver to fix some issues on ARC-1886.
1. Doorbell interrupt status may arrive lately when doorbell interrupt on
ARC-1886.
2. System boot up hung when ARC-1886 with no volume created or no device
attached.
Many thanks to Areca for continuing to support FreeBSD.
Rick Macklem [Mon, 13 Dec 2021 23:21:31 +0000 (15:21 -0800)]
nfsd: Limit parsing of layout errors to maxcnt bytes
This patch decrements maxcnt by the appropriate
number of bytes during parsing and checks to see
if there is data remaining. If not, it just returns
from nfsrv_flexlayouterr() without further processing.
This prevents the tl pointer from running off the end
of the error data pointed at by layp, if there are
flaws in the data.
Domagoj Stolfa [Fri, 17 Dec 2021 16:01:54 +0000 (11:01 -0500)]
dtrace: Disable getf() as it is broken on FreeBSD
getf() on FreeBSD calls _sx_slock(), _sx_sunlock() and fget_locked().
Furthermore, it does not set the per-core fault flag, meaning it
usually ends up in a double fault panic once getf() does get called,
especially from fbt.
Reviewing the DTrace Toolkit + a number of other scripts scattered
around FreeBSD, I have not been able to find one use of getf(). Given
how broken the implementation currently is, we disable it until it
can be implemented properly.
Also comment out a test in aggs/tst.subr.d for getf().
Rick Macklem [Sun, 12 Dec 2021 23:40:30 +0000 (15:40 -0800)]
nfscl: Fix must_commit handling for mirrored pNFS mounts
For pNFS mounts to mirrored Flexible File layout pNFS servers,
the "must_commit" component in the nfsclwritedsdorpc
structure must be checked and the "must_commit" argument passed
into nfscl_doiods() must be updated. Technically, only writes to
the DS with a writeverf change must be redone, but since this
occurrence will be rare, the must_commit argument to nfscl_doiosd()
is set to 1, so all writes to all DSs will be redone.
This bug would affect few, since use of mirrored pNFS servers
is rare and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots.
Andriy Gapon [Wed, 15 Dec 2021 11:37:59 +0000 (13:37 +0200)]
mmc_sim: fix setting of the mutex name
To quote the manual:
The pointer passed in as name and type is saved rather than the data
it points to. The data pointed to must remain stable until the mutex
is destroyed.
It seems that the type is actually copied, but the name is stored as
a pointer indeed.
mmc_cam_sim_alloc used a name stored on stack.
So, a corrupt mutex name would be reported.
For example:
lock order reversal: (sleepable after non-sleepable)
1st 0xd7285b20 <8A><C0><C0>P@<C1><D0>P@<C1>^D^A (aw_mmc_sim, sleep mutex) @ sys/cam/cam_xpt.c:2804
This change moves the name to struct mmc_sim.
Also, that name is used as the sim name as well.
Unused mtx_name variable is removed too.
The name buffer is reduced to 16 characters.
Rick Macklem [Sat, 11 Dec 2021 23:00:30 +0000 (15:00 -0800)]
nfscl: Fix must_commit/writeverf handling for Direct I/O
Without this patch, the KASSERT(must_commit == 0,..) can be
triggered by the writeverf in the Direct I/O write reply changing.
This is not a situation that should cause a panic(). Correct
handling is to ignore the change in "writeverf" for Direct
I/O, since it is done with NFSWRITE_FILESYNC.
This patch modifies the semantics of the "must_commit"
argument slightly, allowing an initial value of 2 to indicate
that a change in "writeverf" should be ignored.
It also fixes the KASSERT()s.
This bug would affect few, since Direct I/O is not enabled
by default and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots, however I found the
bug when testing against a Linux 5.15.1 kernel nfsd, which
replied to a NFSWRITE_FILESYNC write with a "writeverf" of all
0x0 bytes.