Mark Johnston [Fri, 21 Jan 2022 18:28:13 +0000 (13:28 -0500)]
Avoid memory allocations in the ARC eviction thread
When the eviction thread goes to shrink an ARC state, it allocates a set
of marker buffers used to hold its place in the state's sublists.
This can be problematic in low memory conditions, since:
1) the allocation can be substantial, as we allocate NCPU markers;
2) on at least FreeBSD, page reclamation can block in
arc_wait_for_eviction().
In particular, in stress tests it's possible to hit a deadlock on
FreeBSD when the number of free pages is very low, wherein the system is
waiting for the page daemon to reclaim memory, the page daemon is
waiting for the ARC eviction thread to finish, and the ARC eviction
thread is blocked waiting for more memory.
Try to reduce the likelihood of such deadlocks by pre-allocating markers
for the eviction thread at ARC initialization time. When evicting
buffers from an ARC state, check to see if the current thread is the ARC
eviction thread, and use the pre-allocated markers for that purpose
rather than dynamically allocating them.
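A minimal userland sketch of the approach described above, not the actual ARC code. All names (`NMARKERS`, `markers_init`, `markers_get`, `markers_put`, the integer thread ids) are hypothetical stand-ins; the real change works on ARC marker buffers and kernel thread identity.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NMARKERS 8	/* hypothetical stand-in for the per-sublist marker count */

struct marker { int placeholder; };

/* Reserved once at init time, while memory is still plentiful. */
static struct marker *prealloc_markers[NMARKERS];
static int evict_thread_id = -1;

/* Hypothetical init hook: reserve the markers and record the eviction thread. */
static int
markers_init(int evict_tid)
{
	for (int i = 0; i < NMARKERS; i++) {
		prealloc_markers[i] = calloc(1, sizeof(struct marker));
		if (prealloc_markers[i] == NULL)
			return (-1);
	}
	evict_thread_id = evict_tid;
	return (0);
}

/*
 * Return a marker array for the calling thread. The eviction thread
 * reuses the reserved set and never allocates; any other caller
 * allocates dynamically and must release the set with markers_put().
 */
static struct marker **
markers_get(int cur_tid, bool *dynamic)
{
	struct marker **m;

	*dynamic = false;
	if (cur_tid == evict_thread_id)
		return (prealloc_markers);

	m = calloc(NMARKERS, sizeof(*m));
	if (m == NULL)
		return (NULL);
	for (int i = 0; i < NMARKERS; i++) {
		m[i] = calloc(1, sizeof(struct marker));
		if (m[i] == NULL) {
			while (i-- > 0)
				free(m[i]);
			free(m);
			return (NULL);
		}
	}
	*dynamic = true;
	return (m);
}

/* Free only dynamically allocated sets; the reserved set stays in place. */
static void
markers_put(struct marker **m, bool dynamic)
{
	if (!dynamic)
		return;
	for (int i = 0; i < NMARKERS; i++)
		free(m[i]);
	free(m);
}
```

The point of the split is that the one thread which must make progress under memory pressure never enters the allocator, while other callers keep the old behavior.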
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12985
(cherry picked from commit 6e2a59181e286a397d260fa9f140b58688d60c58)
Mark Johnston [Thu, 10 Feb 2022 20:32:23 +0000 (15:32 -0500)]
libctf: Rip out CTFv1 support
CTFv1 was obsolete before libctf was imported into FreeBSD, and
ctfconvert/ctfmerge can emit only CTFv2. Make ctf.h a bit easier to
maintain by ripping v1 support out. No functional change intended.
Mark Johnston [Thu, 10 Feb 2022 20:31:26 +0000 (15:31 -0500)]
tcp: Avoid conditionally defined fields in union lro_address
The layout of the structure ends up depending on whether the including
file includes opt_inet.h and opt_inet6.h, so different compilation units
can end up seeing different versions of the structure. Fix this by
unconditionally defining the address fields.
As a side effect, this eliminates some duplication in the kernel's CTF
type graph.
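To illustrate the hazard described above, here is a small sketch with hypothetical types (`addr4`, `addr6`, and the two `view_*` unions are not the real `union lro_address`). It models what two compilation units would see if one had only the IPv4 option defined and the other had both.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a 4-byte and a 16-byte address type. */
struct addr4 { uint8_t b[4]; };
struct addr6 { uint8_t b[16]; };

/* View of a file where only the IPv4 member was conditionally compiled in. */
union view_inet_only {
	struct addr4 a4;
};

/*
 * View of a file where both members were compiled in. This is also the
 * shape after the fix: every member is defined unconditionally, so all
 * compilation units agree on one size and layout.
 */
union view_both {
	struct addr4 a4;
	struct addr6 a6;
};
```

With these definitions `sizeof(union view_inet_only)` is 4 and `sizeof(union view_both)` is 16, so any structure embedding the union would have different member offsets in the two files.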
Reviewed by: rscheff, tuexen
Sponsored by: The FreeBSD Foundation
linux: Add additional ptracestop only if the debugger is Linux
In 6e66030c4c0, an additional ptracestop was added in order
to implement PTRACE_EVENT_EXEC. Make it only apply to cases
where the debugger is a Linux process; native FreeBSD
debuggers can trace Linux processes too, but they don't
expect that additional ptracestop.
Reimplement bdf0f24bb16d556a5b by checking for the caller's ABI in
the implementation of PT_GET_SC_ARGS, and copying out everything if
it is the Linuxolator.
Also fix a minor information leak: if PT_GET_SC_ARGS_ALL is done on a
thread reused from another process, it allows reading some of that
thread's last syscall arguments. Clear td_sa.args in thread_alloc().
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D31968
Bjoern A. Zeeb [Sun, 20 Feb 2022 18:10:45 +0000 (18:10 +0000)]
Bump __FreeBSD_version to 1300526 for LinuxKPI changes.
This successfully builds against drm-fbsd13-kmod-5.4.144.g20220128,
so there are no conflicting changes on the MFC. Given there are overlaps,
bump __FreeBSD_version so they can be detected and removed as needed.
Bjoern A. Zeeb [Thu, 17 Feb 2022 00:58:12 +0000 (00:58 +0000)]
LinuxKPI: 802.11 simplify beacon checks in rx path
In linuxkpi_ieee80211_rx() check if the frame is a beacon once upfront
and use the result for enhanced debugging and further checks.
This was done initially for rx_status->device_timestamp debugging.
Bjoern A. Zeeb [Fri, 1 Oct 2021 10:51:50 +0000 (10:51 +0000)]
LinuxKPI: implement dma_sync_single_for_*, apply to (un)map single/sg
Implement dma_sync_single_for_{cpu,device} translating the Linux
DMA_ flags to BUS_DMASYNC_ combinations. Make map_single/unmap_single*
functions call the respective sync function. Apply the same logic to
the scatter-gather list map/unmap functions.
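A rough sketch of what such a direction-to-sync translation can look like. The enum and the `SYNC_*` bits are hypothetical stand-ins for the Linux `DMA_*` directions and the bus_dma(9) `BUS_DMASYNC_*` operations, and the mapping follows the general bus_dma(9) semantics; the exact combinations chosen in the commit may differ.

```c
/* Hypothetical stand-ins for the Linux DMA direction values. */
enum dma_dir {
	DIR_BIDIRECTIONAL,
	DIR_TO_DEVICE,
	DIR_FROM_DEVICE,
	DIR_NONE
};

/* Hypothetical stand-ins for the bus_dma(9) sync operation bits. */
#define SYNC_PREREAD	0x1	/* before the device writes into memory */
#define SYNC_POSTREAD	0x2	/* after the device wrote into memory */
#define SYNC_PREWRITE	0x4	/* before the device reads from memory */
#define SYNC_POSTWRITE	0x8	/* after the device read from memory */

/* Operations to apply before handing the buffer to the device. */
static int
sync_ops_for_device(enum dma_dir dir)
{
	switch (dir) {
	case DIR_BIDIRECTIONAL:
		return (SYNC_PREREAD | SYNC_PREWRITE);
	case DIR_TO_DEVICE:
		return (SYNC_PREWRITE);
	case DIR_FROM_DEVICE:
		return (SYNC_PREREAD);
	default:
		return (0);
	}
}

/* Operations to apply before the CPU touches the buffer again. */
static int
sync_ops_for_cpu(enum dma_dir dir)
{
	switch (dir) {
	case DIR_BIDIRECTIONAL:
		return (SYNC_POSTREAD | SYNC_POSTWRITE);
	case DIR_TO_DEVICE:
		return (SYNC_POSTWRITE);
	case DIR_FROM_DEVICE:
		return (SYNC_POSTREAD);
	default:
		return (0);
	}
}
```

In this model a map operation would end with the for-device sync and an unmap would start with the for-cpu sync, which matches the pairing the message describes for the single and scatter-gather paths.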
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D32255
Bjoern A. Zeeb [Wed, 16 Feb 2022 23:57:27 +0000 (23:57 +0000)]
LinuxKPI: 802.11: disable ic_headroom for the moment
There is a problem with some drivers, such as rtw88, asking for more
headroom than we currently can handle throughout the stack (we have
other legacy wireless drivers in the tree with similar problems).
This may trigger an assertion in the TCP syncache where we are checking
for a reply to fit in MHLEN.
While for the moment we still copy data from mbufs to skbs,
we can simply disable the extra headroom request in ic_headroom and
deal with it ourselves (which we already did anyway).
Leave a link to the thread on freebsd-transport detailing more of the
problem so we can find it again and solve it here or there.
Bjoern A. Zeeb [Thu, 17 Feb 2022 00:15:56 +0000 (00:15 +0000)]
LinuxKPI: 802.11 advertise full offload scanning based on hw_scan only
We disabled hw_scan for drivers not advertising SINGLE_SCAN_ON_ALL_BANDS.
Do not depend on this hw flag to set IEEE80211_FEXT_SCAN_OFFLOAD for
net80211 as otherwise scanning will never work.
Long-term we probably want to re-think how we do/integrate hw_scan
better in net80211.
Bjoern A. Zeeb [Tue, 15 Feb 2022 23:45:15 +0000 (23:45 +0000)]
LinuxKPI: 802.11 header updates and add/adjust source dependencies.
This update is for more/newer versions of drivers:
- add and properly place more structs, enums, defines needed by drivers.
- correct types of struct fields.
- make various function arguments const.
- move REG_RULE() macro to its own file regulatory.h and
use macros for calculations.
- add linuxkpi_ieee80211_get_channel() implementation.
- change linuxkpi_ieee80211_ifattach() to return int for error checking.
Bjoern A. Zeeb [Wed, 16 Feb 2022 02:10:10 +0000 (02:10 +0000)]
LinuxKPI: skbuff updates
Various updates to skbuff for new/updated drivers and some housekeeping:
- update types and struct members, add new (stub) functions
- improve freeing of frags.
- fix an issue with sleeping during alloc for dev_alloc_skb().
- Adjust a KASSERT for skb_reserve() which apparently can be called
multiple times if no data was put into the skb yet.
- move the sysctl from linux_8022.c (which may be in a different module)
to linux_skbuff.c so in case we turn debugging on we do not run into
unresolved symbols. Rename the sysctl variable to be less conflicting
and update debugging macros along with that; also add IMPROVE().
- add DDB support to show an skbuff.
- adjust comments/whitespace.
Bjoern A. Zeeb [Wed, 16 Feb 2022 03:20:29 +0000 (03:20 +0000)]
LinuxKPI: 802.11: defer workq allocation until we have a name
It turned out that all the workq's taskqueues were named "wlanNA" if you
had more than one card in a machine, as by the time we called wiphy_name()
the device name was not set yet and we returned the fallback.
Move the alloc_ordered_workqueue() from linuxkpi_ieee80211_alloc_hw()
to linuxkpi_ieee80211_ifattach() at which time the device name has
to be set to give us a unique name.
Bjoern A. Zeeb [Wed, 16 Feb 2022 03:48:54 +0000 (03:48 +0000)]
LinuxKPI: 802.11 assign an(y) early chandef
The Realtek driver assumes an early chandef to be set. At the time
of linuxkpi_ieee80211_ifattach() we do not really know one yet so
try to find the first one which is available and set that.
This prevents a NULL-deref panic.
Bjoern A. Zeeb [Wed, 16 Feb 2022 03:00:34 +0000 (03:00 +0000)]
LinuxKPI: 802.11 scan update
Realtek's rtw88 is returning a hard-coded 1 in case they cannot
hw_scan (fw not advertising it). In that case, if we want any scan
to run, we need to fall back to sw scan. Start dealing with this.
Long-term we probably need to keep internal state.
Bjoern A. Zeeb [Wed, 9 Feb 2022 11:58:40 +0000 (11:58 +0000)]
LinuxKPI: add kstrtoint_from_user() and DECLARE_FLEX_ARRAY()
Add an implementation of kstrtoint_from_user() based on the other
implementations and an attempt at DECLARE_FLEX_ARRAY() which works
for the driver needing it.
Bjoern A. Zeeb [Mon, 14 Feb 2022 22:29:38 +0000 (22:29 +0000)]
LinuxKPI: 802.11: get rid of lkpi_ic_getradiocaps warnings
Users are seeing warnings about 2 channels (1 per band)
triggered by an ioctl from wpa_supplicant usually:
lkpi_ic_getradiocaps: Adding chan ... returned error 55
This was an early FAQ.
Check the current number of channels against maxchans and the return
code from net80211. In case net80211 reports that we reached the limit
do not print the warning and do not try to add further channels.
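The check described above can be sketched as follows. On FreeBSD, error 55 is ENOBUFS. The `chan_table` structure and the `chan_add()` helper are hypothetical stand-ins for the net80211 channel list and its add-channel call; only the control flow is meant to match the description.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical channel table with a fixed capacity. */
struct chan_table {
	int maxchans;	/* capacity provided by the caller */
	int nchans;	/* channels added so far */
	int warnings;	/* number of warnings printed */
};

/* Hypothetical stand-in for the net80211 add-channel call. */
static int
chan_add(struct chan_table *t, int freq)
{
	(void)freq;
	if (t->nchans >= t->maxchans)
		return (ENOBUFS);
	t->nchans++;
	return (0);
}

/*
 * Add channels until the table is full. Reaching the limit is expected
 * and handled silently; only other errors produce a warning.
 */
static void
chan_add_all(struct chan_table *t, const int *freqs, int n)
{
	for (int i = 0; i < n && t->nchans < t->maxchans; i++) {
		int error = chan_add(t, freqs[i]);

		if (error == ENOBUFS)
			break;		/* limit reached: stop quietly */
		if (error != 0) {
			t->warnings++;
			fprintf(stderr, "adding chan %d returned error %d\n",
			    freqs[i], error);
		}
	}
}
```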
Bjoern A. Zeeb [Wed, 9 Feb 2022 12:07:44 +0000 (12:07 +0000)]
LinuxKPI: add linux/pm_qos.h
Add a linux/pm_qos.h with three dummy functions and a struct as needed
by a driver and drm-kmod [1] in main with no intent to support this for
the moment.
Bjoern A. Zeeb [Tue, 8 Feb 2022 23:47:15 +0000 (23:47 +0000)]
TCP syncache: enhance KASSERT output
Improve the "syncache: mbuf too small" assertion message with various
variables (some not actually needed) but enough that it will be obvious
if (a) we use IPv4 or IPv6, (b) if UDP tunneling is on, (c) what
max_linkhdr is, and (d) what MHLEN is.
This should help diagnostics in the future.
The case was hit with wireless drivers setting a large ic_headroom
and using IPv6.
John Baldwin [Thu, 10 Feb 2022 20:47:08 +0000 (12:47 -0800)]
libthr: Disable stack unwinding on ARM.
When a thread exits, _Unwind_ForcedUnwind() is used to walk up stack
frames executing pending cleanups pushed by pthread_cleanup_push().
The cleanups are popped by thread_unwind_stop() which is passed as a
callback function to _Unwind_ForcedUnwind().
LLVM's libunwind uses a different function type for the callback on
32-bit ARM relative to all other platforms. The previous unwind.h
header (as well as the unwind.h from libcxxrt) uses the non-ARM type on
all platforms, so this has likely been broken on 32-bit ARM since it
switched to using LLVM's libunwind.
For now, just disable stack unwinding on 32-bit ARM to unbreak the
build until a proper fix is tested.
John Baldwin [Thu, 27 Jan 2022 22:42:40 +0000 (14:42 -0800)]
Change the return value of _Unwind_GetCFA in include/unwind.h.
I tested the original commit as part of a series that culminates in
removing this header and installing LLVM libunwind's unwind.h in its
place so missed updating this header as was done in b84693501af6.
Pointy hat to: jhb
Reported by: kevans
Fixes: 3a502289d316 Use uintptr_t for return type of _Unwind_GetCFA.
Kristof Provost [Tue, 1 Feb 2022 17:25:57 +0000 (18:25 +0100)]
pf: deal with tables gaining or losing counters
When we create a table without counters, add an entry, and later
re-define the table to have counters, we wound up trying to read
non-existent counters.
We now cope with this by attempting to add them if needed, removing them
when they're no longer needed and not trying to read from counters that
are not present.
Kenneth D. Merry [Mon, 24 Jan 2022 21:19:25 +0000 (16:19 -0500)]
Fix non-printable characters in NVMe model and serial numbers.
The NVMe 1.4 spec simply says that Model and Serial numbers are
ASCII strings. Unlike SCSI, it doesn't prohibit non-printable
characters or say that the strings should be padded with spaces.
Since 2014, we have had cam_strvis_sbuf(), which gives additional
options for handling non-ASCII characters. That behavior hasn't
been available for non-sbuf consumers, so users of cam_strvis()
were left with having octal ASCII codes inserted.
So, to avoid having garbage or octal characters in the strings, use
cam_strvis_sbuf() to create a new function, cam_strvis_flag(), and
re-implement cam_strvis() using cam_strvis_flag().
Now, for the NVMe drives, we can use cam_strvis_flag with the
CAM_STRVIS_FLAG_NONASCII_SPC flag. This transforms non-printable
characters into spaces.
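The difference between the two behaviors can be sketched in plain C. This is an illustration only: `strvis_sketch()` and the `POLICY_*` names are hypothetical, and the real cam_strvis_flag() goes through an sbuf and cam_strvis_sbuf(), which may apply further handling not modeled here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical policies mirroring the escape and space behaviors. */
enum nonascii_policy { POLICY_ESC, POLICY_SPC };

/*
 * Copy srclen bytes from src into dst (always NUL-terminated when
 * dstlen > 0). Printable ASCII is copied as-is; anything else becomes
 * either a space or a four-character octal escape such as "\001".
 */
static void
strvis_sketch(char *dst, size_t dstlen, const unsigned char *src,
    size_t srclen, enum nonascii_policy policy)
{
	size_t off = 0;

	if (dstlen == 0)
		return;
	for (size_t i = 0; i < srclen && off + 1 < dstlen; i++) {
		unsigned char c = src[i];

		if (c >= 0x20 && c < 0x7f) {
			dst[off++] = (char)c;
		} else if (policy == POLICY_SPC) {
			dst[off++] = ' ';
		} else {
			/* Need room for four characters plus the terminator. */
			if (off + 4 >= dstlen)
				break;
			off += (size_t)snprintf(dst + off, dstlen - off,
			    "\\%03o", (unsigned int)c);
		}
	}
	dst[off] = '\0';
}
```

For the input bytes `'A', 'B', 0x01, 'C'`, the space policy yields `AB C`, while the escape policy yields `AB\001C`, which is the kind of output the message calls octal characters.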
sys/cam/cam.c:
Add a new function, cam_strvis_flag(), that creates an sbuf
on the stack with the user's destination buffer, and calls
cam_strvis_sbuf() with the given flag argument.
Re-implement cam_strvis() to call cam_strvis_flag with the
CAM_STRVIS_FLAG_NONASCII_ESC argument. This should be the
equivalent of the old cam_strvis() function, except for the
overhead of creating the sbuf and calling sbuf_putc/printf.
sys/cam/cam.h:
Declaration for cam_strvis_flag.
sys/cam/nvme/nvme_all.c:
In nvme_print_ident, use the NONASCII_SPC flag with
cam_strvis_flag().
sys/cam/nvme/nvme_da.c:
In ndaregister(), use cam_strvis_flag() with the
NONASCII_SPC flag for the disk description and serial
number we report to GEOM.
sys/cam/nvme/nvme_xpt.c:
In nvme_probe_done(), use cam_strvis_flag with the
NONASCII_SPC flag when storing the drive serial number
in the CAM EDT.
Make vmdaemon timeout configurable, so that one can adjust
how often it runs.
Here's a trick: set this to 1, then run 'limits -m 0 sh',
then run whatever you want with 'ktrace -it XXX', and observe
how the working set changes over time.
This builds on the recently introduced NO_NEW_PRIVS flag to implement
unprivileged chroot, enabled by `security.bsd.unprivileged_chroot`.
It allows non-root processes to chroot(2), provided they have the
NO_NEW_PRIVS flag set.
The chroot(8) utility gets a new flag, -n, which sets NO_NEW_PRIVS
before chrooting.
Kristof Provost [Mon, 31 Jan 2022 17:31:53 +0000 (18:31 +0100)]
libpfctl: fix pfctl_kill_states()
735748f30a changed the output of the states so that the creator id
endianness would be consistent. This means that we need to convert the
host endianness creatorid back to big-endian before we give it to the
kernel.
Kristof Provost [Fri, 21 Jan 2022 16:50:15 +0000 (17:50 +0100)]
libpfctl: fix creatorid endianness
We provide the hostid (which is the state creatorid) to the kernel as a
big endian number (see pfctl/pfctl.c pfctl_set_hostid()), so convert it
back to system endianness when we get it from the kernel.
This avoids a confusing mismatch between the value the user configures
and the value displayed in the state.
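A small sketch of the round trip, using the standard htonl()/ntohl() conversions. The helper names `creatorid_to_kernel()` and `creatorid_from_kernel()` are hypothetical; libpfctl may use different conversion macros internally.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Convert a host-order creator id to the big-endian form handed to the kernel. */
static uint32_t
creatorid_to_kernel(uint32_t host_id)
{
	return (htonl(host_id));
}

/* Convert a big-endian creator id read from the kernel back to host order. */
static uint32_t
creatorid_from_kernel(uint32_t wire_id)
{
	return (ntohl(wire_id));
}
```

On a little-endian machine, skipping the second conversion would display a configured id of 0x11223344 as 0x44332211, which is the mismatch described above. The same reasoning applies in the other direction when a host-order id taken from state output is passed back to the kernel, as in pfctl_kill_states().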
Ed Maste [Sun, 16 Jan 2022 19:22:05 +0000 (14:22 -0500)]
compiler-rt: re-exec with ASLR disabled when necessary
Some sanitizers (at least msan) currently require ASLR to be disabled.
When we detect that ASLR is enabled, re-exec with it disabled rather
than exiting with an error. See LLVM GitHub issue 53256 for more
detail: https://github.com/llvm/llvm-project/issues/53256
No objection: dim
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33934
Ed Maste [Tue, 15 Feb 2022 03:30:29 +0000 (22:30 -0500)]
elfctl: fix -e invalid operation error handling
Validate the operation prior to parsing the feature string, so that e.g.
-e 0x1 reports invalid operation '0' rather than invalid feature 'x11'.
Also make it an error rather than a warning, so that it is not repeated
if multiple files are specified.
(Previously an invalid operation resulted in a segfault.)
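The ordering fix can be sketched as below: check the leading operation character first, and only then hand the remainder to the feature parser. `parse_edit_arg()` is a hypothetical helper, and the assumption that the valid operations are '+', '-' and '=' is taken from the documented form of the -e argument, not from this commit message.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Split an edit argument into its operation character and feature list.
 * Returns false and fills err when the operation is missing or invalid,
 * so the caller can fail once instead of warning for every file.
 */
static bool
parse_edit_arg(const char *arg, char *op, const char **features,
    char *err, size_t errlen)
{
	if (arg[0] == '\0') {
		snprintf(err, errlen, "missing operation");
		return (false);
	}
	if (strchr("+-=", arg[0]) == NULL) {
		snprintf(err, errlen, "invalid operation '%c'", arg[0]);
		return (false);
	}
	*op = arg[0];
	*features = arg + 1;	/* feature string is parsed only after this point */
	return (true);
}
```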
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
The flag values seem to be the same between Linux and FreeBSD.
Comparing to a Linux VM on the same hardware, we're missing
HWCAP_EVTSTRM, HWCAP_CPUID, HWCAP_DCPOP, HWCAP_USCAT, HWCAP_PACA,
and HWCAP_PACG.