Bjoern A. Zeeb [Sat, 1 Jan 2022 18:08:31 +0000 (18:08 +0000)]
iwlwifi: clarify page update
Based on some feedback clarify the man page for
- how to load the driver currently
- status of the driver with respect to iwm(4)
and leave a comment to (automatically) add a full list of chipsets
to the man page.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Reviewed by: debdrup
Differential Revision: https://reviews.freebsd.org/D33713
Alan Somers [Thu, 2 Dec 2021 02:50:47 +0000 (19:50 -0700)]
fusefs: fix .. lookups when the parent has been reclaimed.
By default, FUSE file systems are assumed not to support lookups for "."
and "..". They must opt-in to that. To cope with this limitation, the
fusefs kernel module caches every fuse vnode's parent's inode number,
and uses that during VOP_LOOKUP for "..". But if the parent's vnode has
been reclaimed that won't be possible. Previously we panicked in this
situation. Now, we'll return ESTALE instead. Or, if the file system
has opted into ".." lookups, we'll just do that instead.
This commit also fixes VOP_LOOKUP to respect the cache timeout for ".."
lookups, if the FUSE file system specified a finite timeout.
Alan Somers [Thu, 2 Dec 2021 02:38:04 +0000 (19:38 -0700)]
fusefs: in the tests, always assume debug.try_reclaim_vnode is available
In an earlier version of the revision that created that sysctl (D20519)
the sysctl was gated by INVARIANTS, so the test had to check for it.
But in the committed version it is always available.
Alan Somers [Mon, 29 Nov 2021 02:17:34 +0000 (19:17 -0700)]
Fix a race in fusefs that can corrupt a file's size.
VOPs like VOP_SETATTR can change a file's size, with the vnode
exclusively locked. But VOPs like VOP_LOOKUP look up the file size from
the server without the vnode locked. So a race is possible. For
example:
1) One thread calls VOP_SETATTR to truncate a file. It locks the vnode
and sends FUSE_SETATTR to the server.
2) A second thread calls VOP_LOOKUP and fetches the file's attributes from
the server. Then it blocks trying to acquire the vnode lock.
3) FUSE_SETATTR returns and the first thread releases the vnode lock.
4) The second thread acquires the vnode lock and caches the file's
attributes, which are now out-of-date.
Fix this race by recording a timestamp in the vnode of the last time
that its filesize was modified. Check that timestamp during VOP_LOOKUP
and VFS_VGET. If it's newer than the time at which FUSE_LOOKUP was
issued to the server, ignore the attributes returned by FUSE_LOOKUP.
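A minimal sketch of the timestamp check described above, not the
committed fusefs code. The struct and function names are hypothetical,
and time is modelled as a plain counter rather than a kernel timestamp:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-vnode fusefs data. */
struct node_state {
	uint64_t cached_size;		/* file size currently cached */
	uint64_t last_size_change;	/* time of the last local size change */
};

/* Record a local size change (e.g. a truncate) made at time `now`. */
static void
node_set_size(struct node_state *n, uint64_t new_size, uint64_t now)
{
	n->cached_size = new_size;
	n->last_size_change = now;
}

/*
 * Apply the size from a lookup reply whose request was issued at
 * `issued_at`. Returns true if the reply was used, false if it was
 * ignored because the size changed locally after the request was sent.
 */
static bool
node_apply_lookup_size(struct node_state *n, uint64_t reply_size,
    uint64_t issued_at)
{
	if (n->last_size_change > issued_at)
		return (false);
	n->cached_size = reply_size;
	return (true);
}
```

In the race above, the lookup is issued before the truncate completes,
so its reply is dropped and the truncated size stays cached.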
Mark Johnston [Fri, 31 Dec 2021 22:01:39 +0000 (17:01 -0500)]
callout: Wait for the softclock thread to switch before rescheduling
When a softclock thread prepares to go off-CPU, the following happens in
the context of the thread:
1. callout state is locked
2. thread state is set to IWAIT
3. thread lock is switched from the tdq lock to the callout lock
4. tdq lock is released
5. sched_switch() sets td_lock to &blocked_lock
6. sched_switch() releases old td_lock (callout lock)
7. sched_switch() removes td from its runqueue
8. cpu_switch() sets td_lock back to the callout lock
Suppose a timer interrupt fires while the softclock thread is switching
off, and callout_process() schedules the softclock thread. Then there
is a window between steps 5 and 8 where callout_process() can call
sched_add() while td_lock is &blocked_lock, but this is not correct
since the thread is not logically locked.
callout_process() thus needs to spin waiting for the softclock thread to
finish switching off (i.e., after step 8 completes) before rescheduling
it, since callout_process() does not acquire the thread lock directly.
Reported by: syzbot+fb44dbf6734ff492c337@syzkaller.appspotmail.com
Fixes: 74cf7cae4d22 ("softclock: Use dedicated ithreads for running callouts.")
Reviewed by: mav, kib, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33709
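The spin-wait can be sketched in userland C11 as follows. This is an
illustration only: toy_lock, toy_thread and wait_for_switch() are
hypothetical stand-ins, and the kernel uses its own primitives rather
than <stdatomic.h>:

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for the kernel's lock and thread structures. */
struct toy_lock { int unused; };
struct toy_thread { _Atomic(struct toy_lock *) td_lock; };

/* Sentinel installed in td_lock while a thread is mid-switch (step 5). */
static struct toy_lock blocked_lock;

/*
 * Spin until td_lock no longer points at blocked_lock, i.e. until the
 * switch has completed (step 8), then return the lock now in place.
 */
static struct toy_lock *
wait_for_switch(struct toy_thread *td)
{
	struct toy_lock *l;

	do {
		l = atomic_load_explicit(&td->td_lock, memory_order_acquire);
	} while (l == &blocked_lock);
	return (l);
}
```

Only after this returns is it safe, in the terms of the steps above, to
treat the thread as logically locked by the callout lock.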
Dmitry Wagin [Tue, 23 Mar 2021 16:01:15 +0000 (12:01 -0400)]
libc: Some enhancements to syslog(3)
This is a re-application of commit
2d82b47a5b4ef18550565dd55628d51f54d0af2e, which was reverted since it
broke with syslog daemons that don't adjust the /dev/log recv buffer
size. Now that the default is large enough to accommodate 8KB messages,
restore support for large messages.
Bjoern A. Zeeb [Fri, 31 Dec 2021 11:51:18 +0000 (11:51 +0000)]
iwlwifi: import correct firmware versions for select Intel iwlwifi/mvm
The firmware files for 3160, 7260, and 7265 imported contain old versions
no longer supported by the driver.
Replace with latest versions from linux-firmware to possibly also
support these chip revisions.
Reported by: FreeBSD User (freebsd walstatt-de.de) on wireless (2021-12-30)
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Bjoern A. Zeeb [Fri, 31 Dec 2021 11:47:14 +0000 (11:47 +0000)]
LinuxKPI: 802.11 fix queue wait
We are using a bandaid to wait for queues after station creation,
looping and pausing.
The abort condition was looping in the wrong direction, so we were
potentially waiting forever if queues never became ready.
From initial user test data we also found that the wait time was
too low in some cases, so increase the length.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
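A bounded poll loop of the kind described can be sketched as below.
This is not the LinuxKPI code. wait_ready() and ready_after() are
hypothetical, and the pause between polls is omitted:

```c
#include <stdbool.h>

/*
 * Poll ready(arg) at most max_tries times. The counter runs towards
 * the limit, so the loop always ends even if ready() never returns
 * true.
 */
static bool
wait_ready(bool (*ready)(void *), void *arg, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		if (ready(arg))
			return (true);
		/* A real implementation would pause here between polls. */
	}
	return (false);
}

/* Example predicate: reports ready once *arg has counted down to 0. */
static bool
ready_after(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return (true);
	(*remaining)--;
	return (false);
}
```

The bug class described above corresponds to a loop counter that moves
away from its limit, which turns a bounded wait into an unbounded one.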
Stefan Eßer [Fri, 31 Dec 2021 10:08:34 +0000 (11:08 +0100)]
sys/cpuset.h: add 3 more macros provided by GLIBC
The lang/python* ports failed since they expected CPU_COUNT_S() to be
provided by sys/cpuset.h. Add this function plus 2 more in a way that
is compatible with GLIBC.
Doug Moore [Fri, 31 Dec 2021 05:31:18 +0000 (23:31 -0600)]
vm_phys: #include vm_extern
Arm64 and powerpc don't include vm_extern.h indirectly in vm_phys.c, which
means that for the sake of those architectures, it must be included explicitly.
Also, fix a set-unused warning that Jenkins found.
Reported by: Jenkins
Fixes: c606ab59e7f9 vm_extern: use standard address checkers everywhere
Doug Moore [Fri, 31 Dec 2021 04:09:08 +0000 (22:09 -0600)]
vm_extern: use standard address checkers everywhere
Define simple functions for alignment and boundary checks and use them
everywhere instead of having slightly different implementations
scattered about. Define them in vm_extern.h and use them where
possible where vm_extern.h is included.
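As an illustration of what such checkers look like, here is a sketch
with hypothetical names and plain 64-bit integers. It is not copied
from vm_extern.h, and it assumes that alignment and boundary are
powers of two and that size is nonzero:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of alignment (a power of two). */
static inline bool
addr_align_ok(uint64_t addr, uint64_t alignment)
{
	return ((addr & (alignment - 1)) == 0);
}

/*
 * True if [addr, addr + size) does not cross a multiple of boundary
 * (a power of two). A boundary of 0 means "no restriction", because
 * the mask -boundary is then 0.
 */
static inline bool
addr_bound_ok(uint64_t addr, uint64_t size, uint64_t boundary)
{
	return (((addr ^ (addr + size - 1)) & -boundary) == 0);
}
```

The boundary test compares the first and last byte of the range: if
they differ in any bit at or above the boundary bit, the range
straddles a boundary.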
Warner Losh [Fri, 31 Dec 2021 03:56:09 +0000 (20:56 -0700)]
mips: Remove sys/mips
Remove sys/mips as the next step of decommissioning mips from the tree.
Remove mips special cases from the kernel make files. Remove the mips
specific linker scripts.
Warner Losh [Thu, 30 Dec 2021 22:56:57 +0000 (15:56 -0700)]
tinybsd: Remove
This hasn't been updated in 10 years in any real way. It's time to
retire it. It hasn't worked in some time due to drivers being removed
starting in FreeBSD 10. All the interesting bits have already been
hoisted into other parts of base. The Google Code site hasn't had any
commits since 2011 and claims to target FreeBSD 5, 6, 7, and 8.
Should someone fix the numerous issues, it can be restored.
John Baldwin [Thu, 30 Dec 2021 22:54:29 +0000 (14:54 -0800)]
softclock: Use dedicated ithreads for running callouts.
Rather than using the swi infrastructure, rewrite softclock() as a
thread loop (softclock_thread()) and use it as the main routine of the
softclock threads. The threads use the CC_LOCK as the thread lock
when idle.
When parsing a rule to rotate log files on a specific week day,
parseDWM() can advance the time to the next week. If the next week is
in the next month, then tm_mon is incremented. However, the increment
was failing to handle the wraparound from December to January, so when
parsing a rule during the last week of December, the month would
advance to month 12. This triggered an out-of-bounds read of the
mtab[] array in days_pmonth() after parseDWM() returned. To fix,
this change resets the month to January and increments the year when
the month increment wraps.
The default rule for /var/log/weekly.log triggers this during the
last week of December each year.
Reported by: CHERI
Obtained from: CheriBSD
Reviewed by: jhb
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33687
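The wraparound handling described above can be sketched as follows.
advance_month() is a hypothetical helper, not the newsyslog code. It
relies only on struct tm's convention that tm_mon runs from 0 to 11:

```c
#include <time.h>

/*
 * Advance tm_mon by one month, carrying into tm_year when moving from
 * December (11) to January (0), so tm_mon always stays within 0..11
 * and remains a valid index into a 12-entry days-per-month table.
 */
static void
advance_month(struct tm *tm)
{
	tm->tm_mon++;
	if (tm->tm_mon > 11) {
		tm->tm_mon = 0;
		tm->tm_year++;
	}
}
```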
Emmanuel Vadot [Thu, 30 Dec 2021 08:47:06 +0000 (09:47 +0100)]
loader: tftp: Copy the first block into the cache
tftp_open reads the first block so copy it in the cached data.
If we have more than one block (i.e. we called tftp_read before
tftp_preload) simply just reset the transfer.
Michael Tuexen [Thu, 30 Dec 2021 14:16:05 +0000 (15:16 +0100)]
sctp: improve sctp_pathmtu_adjustment()
Allow the resending of DATA chunks to be controlled by the caller,
which allows retiring sctp_mtu_size_reset() in a separate commit.
Also improve the computation of the overhead and use 32-bit integers
consistently.
Thanks to Timo Voelker for pointing me to the code.
Stefan Eßer [Thu, 30 Dec 2021 11:20:32 +0000 (12:20 +0100)]
Make CPU_SET macros compliant with other implementations
The introduction of <sched.h> improved compatibility with some 3rd
party software, but caused the configure scripts of some ports to
assume that they were run in a GLIBC compatible environment.
Parts of sched.h were made conditional on -D_WITH_CPU_SET_T being
added to ports, but there still were compatibility issues due to
invalid assumptions made in autoconfigure scripts.
The difference between the FreeBSD versions of macros like CPU_AND,
CPU_OR, etc. and the GLIBC versions was in the number of arguments:
FreeBSD used a 2-address scheme (one source argument is also used as
the destination of the operation), while GLIBC uses a 3-address
scheme (2 source operands and a separately passed destination).
The GLIBC scheme provides a super-set of the functionality of the
FreeBSD macros, since it does not prevent passing the same variable
as source and destination arguments. In code that wanted to preserve
both source arguments, the FreeBSD macros required a temporary copy of
one of the source arguments.
This patch set makes it possible to unconditionally provide functions
and macros expected by 3rd party software written for GLIBC based systems, but
breaks builds of externally maintained sources that use any of the
following macros: CPU_AND, CPU_ANDNOT, CPU_OR, CPU_XOR.
One contributed driver (contrib/ofed/libmlx5) has been patched to
support both the old and the new CPU_OR signatures. If this commit
is merged to -STABLE, the version test will have to be extended to
cover more ranges.
Ports that have added -D_WITH_CPU_SET_T to build on -CURRENT
no longer require that option.
The FreeBSD version has been bumped to 1400046 to reflect this
incompatible change.
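The difference between the two calling conventions can be shown with a
toy bitmask type. toyset_t, TOY_AND2 and TOY_AND3 are hypothetical and
stand in for cpuset_t and the real CPU_AND variants, whose definitions
are not reproduced here:

```c
#include <stdint.h>

/* Hypothetical toy CPU set: one bit per CPU, at most 64 CPUs. */
typedef struct { uint64_t bits; } toyset_t;

/* Old FreeBSD-style 2-address form: dst = dst & src. */
#define TOY_AND2(dst, src) \
	((dst)->bits &= (src)->bits)

/* GLIBC-style 3-address form: dst = src1 & src2. */
#define TOY_AND3(dst, src1, src2) \
	((dst)->bits = (src1)->bits & (src2)->bits)
```

TOY_AND3(&a, &a, &b) has the same effect as TOY_AND2(&a, &b), which is
why the 3-address form is a superset. TOY_AND3(&r, &a, &b) leaves both
sources intact without a temporary copy.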
Dimitry Andric [Thu, 30 Dec 2021 09:53:25 +0000 (10:53 +0100)]
Avoid emitting popcnt in libclang_rt.fuzzer*.a if unsupported
Since popcnt is only supported by CPUTYPE=nehalem and later, ensure that
this instruction is only emitted when appropriate. Otherwise, programs
using the library can abort with SIGILL.
See also: https://github.com/llvm/llvm-project/issues/52893
PR: 258156
Reported by: Eric Rucker <bhtooefr@bhtooefr.org>
MFC after: 3 days
Fedor Uporov [Fri, 24 Dec 2021 14:18:15 +0000 (17:18 +0300)]
Improve extents verification logic
Add functionality for extents validation inside the filesystem
extents block. The main logic is implemented in the
ext4_validate_extent_entries() function, which verifies extents
or extent indexes depending on the extent depth value.
Fedor Uporov [Fri, 29 Oct 2021 12:45:50 +0000 (15:45 +0300)]
Add more accurate directory entries check
Rename ext2_dirbadentry() to ext2_check_direntry(). Add directory
entry inode value check, and call ext2_check_direntry() in all cases.
The dirchk sysctl is removed.
Alexander Motin [Thu, 30 Dec 2021 03:58:52 +0000 (22:58 -0500)]
CTL: Allow I/Os up to 8MB, depending on maxphys value.
For years the CTL block backend limited I/O size to 1MB, splitting larger
requests into sequentially processed chunks. That is sufficient for
most use cases, since typical initiators rarely use bigger I/Os.
One known exception is VMware VAAI offload, which by default sends up
to 8 4MB EXTENDED COPY requests at the same time. CTL internally converted
those into 32 1MB READ/WRITE requests, which could overwhelm the block
backend, which has a finite number of processing threads, and make more
important interactive I/Os wait in its queue. Previously this was
partially covered by the CTL core serializing sequential reads to help
the ZFS speculative prefetcher, but that serialization was significantly
relaxed after recent ZFS improvements.
With the new settings the block backend receives 8 4MB requests, which
should be easier for both CTL itself and the underlying storage.
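The request counts quoted above follow from simple rounding-up
division. backend_ios() is a hypothetical helper used only to show the
arithmetic:

```c
#include <stddef.h>

/*
 * Number of backend I/Os produced when nreqs requests of req_size
 * bytes each are split into pieces of at most max_io bytes.
 */
static size_t
backend_ios(size_t nreqs, size_t req_size, size_t max_io)
{
	size_t per_req = (req_size + max_io - 1) / max_io;	/* round up */

	return (nreqs * per_req);
}
```

With a 1MB limit, 8 requests of 4MB become 8 * 4 = 32 backend I/Os.
With an 8MB limit each 4MB request fits in one I/O, giving 8.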
John Baldwin [Thu, 30 Dec 2021 01:50:03 +0000 (17:50 -0800)]
/dev/crypto: Minimize cipher-specific logic.
Rather than duplicating the switches in crypto_auth_hash() and
crypto_cipher(), copy the algorithm constants from the new session
ioctl into a csp directly which permits using the functions in
crypto.c.
John Baldwin [Wed, 29 Dec 2021 22:36:04 +0000 (14:36 -0800)]
iscsi: Handle large Text responses.
Text requests and responses can span multiple PDUs. In that case, the
sender sets the Continue bit in non-final PDUs and the Final bit in
the last PDU. The receiver responds to non-final PDUs with an empty
text PDU.
To support this, add a more abstract API in libiscsi which accepts and
receives key sets rather than PDUs. These routines internally send or
receive one or more PDUs. Use these new functions to replace the
handling of TextRequest and TextResponse PDUs in discovery sessions in
both ctld and iscsid.
Note that there is not currently a use case for large Text requests
and those are still always sent as a single PDU. However, discovery
sessions can return a text response listing targets that spans
multiple PDUs, so the new API supports sending and receiving multi-PDU
responses.
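The receive side of a multi-PDU text response can be sketched as below.
The types and text_buf_add() are hypothetical and much simpler than the
libiscsi API. The sketch only models the Continue and Final bits and a
fixed-size buffer:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified Text PDU: two flags plus a data segment. */
struct text_pdu {
	bool		 cont;	/* Continue bit: more PDUs follow */
	bool		 final;	/* Final bit: last PDU of the response */
	const char	*data;
	size_t		 len;
};

/* Accumulates the data segments of one multi-PDU text response. */
struct text_buf {
	char	data[256];
	size_t	len;
};

/*
 * Append one PDU's data segment. Returns 1 when the response is
 * complete, 0 when the receiver should answer with an empty Text PDU
 * and wait for more, and -1 on overflow or if both bits are set.
 */
static int
text_buf_add(struct text_buf *tb, const struct text_pdu *pdu)
{
	if (pdu->cont && pdu->final)
		return (-1);
	if (pdu->len > sizeof(tb->data) - tb->len)
		return (-1);
	memcpy(tb->data + tb->len, pdu->data, pdu->len);
	tb->len += pdu->len;
	return (pdu->final ? 1 : 0);
}
```

Keys are parsed only after the final PDU arrives, since a key=value
pair may be split across two data segments.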
Toomas Soome [Sun, 26 Dec 2021 09:01:16 +0000 (11:01 +0200)]
bhyve smbios type 3 structure is incorrect
If you look at the SMBIOS specification, you'll find something is
missing. In particular at offset 0Dh is supposed to be the OEM-defined
field. This should go between security and height. It is not legal to
actually skip this and will lead to other folks not properly
interpreting later parts of the table.
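For reference, the fixed part of the type 3 structure can be written as
a packed C struct with the offsets checked at compile time. The field
names are hypothetical and this is not bhyve's definition. The offsets
follow my reading of the SMBIOS System Enclosure layout, and the
packed attribute assumes GCC or Clang:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed part of an SMBIOS type 3 (System Enclosure) structure. */
struct smbios_type3 {
	uint8_t		type;		/* 00h: structure type (3) */
	uint8_t		length;		/* 01h */
	uint16_t	handle;		/* 02h */
	uint8_t		manufacturer;	/* 04h: string number */
	uint8_t		chassis_type;	/* 05h */
	uint8_t		version;	/* 06h: string number */
	uint8_t		serial;		/* 07h: string number */
	uint8_t		asset_tag;	/* 08h: string number */
	uint8_t		bootup_state;	/* 09h */
	uint8_t		psu_state;	/* 0Ah */
	uint8_t		thermal_state;	/* 0Bh */
	uint8_t		security;	/* 0Ch */
	uint32_t	oem_defined;	/* 0Dh: the field that was missing */
	uint8_t		height;		/* 11h */
	uint8_t		power_cords;	/* 12h */
	uint8_t		elem_count;	/* 13h */
	uint8_t		elem_len;	/* 14h */
} __attribute__((packed));

/* Without oem_defined, height would land at 0Dh instead of 11h. */
static_assert(offsetof(struct smbios_type3, oem_defined) == 0x0d,
    "OEM-defined field must sit at offset 0Dh");
static_assert(offsetof(struct smbios_type3, height) == 0x11,
    "height must follow the 4-byte OEM-defined field");
```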
nhops: split nh_family into nh_upper_family and nh_neigh_family.
With IPv4 over IPv6 nexthops and IP->MPLS support, there is a need
to distinguish the "upper" (i.e. traffic) family from the "neighbor" (i.e. LLE/gateway
address) family. Store them explicitly in the private part of the nexthop data.
While here, store nhop fibnum in nhop_prip datastructure to make it self-contained.
Introduce a new function, lltable_get(), to retrieve lltable pointer
for the specified interface and family.
Use it to avoid all-iftable list traversal when adding or deleting
ARP/ND records.
Colin Percival [Mon, 20 Dec 2021 17:55:36 +0000 (09:55 -0800)]
vfs_mountroot: Check for root dev before waiting
If GEOM is idle but the root device is not yet present when we enter
vfs_mountroot_wait_if_necessary, we call vfs_mountroot_wait to wait
for root holds (e.g. CAM or USB initialization). Upon returning from
vfs_mountroot_wait, we wait 100 ms at a time until the root device
shows up.
Since the root device most likely appeared during vfs_mountroot_wait
-- waiting for subsystems which may be responsible for the root
device is the whole purpose of that function -- it makes sense to
check if the device is now present rather than printing a warning
and pausing for 100 ms before checking.
Reviewed by: trasz
Fixes: a3ba3d09c248 Make root mount wait mechanism smarter
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D33593
Colin Percival [Mon, 20 Dec 2021 17:51:34 +0000 (09:51 -0800)]
vfs_mountroot: Wait for GEOM idle post root holds
In the case of a root hold related to the initialization of a disk
device, a flurry of GEOM tasting is likely to take place as soon as
the device is initialized and the root hold is released. If we
don't wait for GEOM idle it's easy for vfs_mountroot to "win" the
race and proceed before the root filesystem GEOM is ready.
Colin Percival [Mon, 20 Dec 2021 15:17:25 +0000 (07:17 -0800)]
vfs_mountroot: Skip 'Root mount waiting' < 1 s
While the message is technically correct, it's not particularly
helpful in the case where we're only waiting a few ms; this case
occurs frequently on EC2 arm64 instances with CAM initialization
racing to release its root hold before vfs_mountroot reaches this
point. Only print the message if we end up waiting for more than
one second.
Ed Maste [Wed, 29 Dec 2021 19:59:06 +0000 (14:59 -0500)]
ar: deprecate -T option
Other ar implementations (GNU, LLVM) use -T to mean thin archive
rather than use only the first fifteen characters of the archive member
name. We support both -T and -f for this, with -f documented as an
alias of -T.
An exp-run showed that the ports invoking `ar -T` expect thin archives,
not truncated names. Switch -f to be the documented flag for this
behaviour, and emit a warning when -T is used.
The warning will be changed to an error in the future (in main), once
ports no longer use -T.
PR: 260523 [exp-run]
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Bjoern A. Zeeb [Thu, 23 Dec 2021 14:59:49 +0000 (14:59 +0000)]
bhyve: passthru: enable BARs before possibly mmap(2)ing them
The first time we start bhyve with a passthru device everything is fine
as on boot we do enable BARs. If a driver (unload) inside bhyve disables
the BAR(s) as some Linux drivers do, we need to make sure we re-enable
them on next bhyve start.
If we are trying to mmap a disabled BAR for MSI-X (PCIOCBARMMAP)
the kernel will give us an EBUSY.
While we were re-enabling the BAR(s) in the current code loop,
cfginit() was writing the changes out too late to the real hardware.
Move the call to init_msix_table() after the register on the real
hardware was updated. That way the kernel will be happy and the
mmap will succeed and bhyve will start.
Also simplify the code: given that the last argument to init_msix_table()
is unused, we do not need to do checks for each BAR. [1]
MFC after: 3 days
PR: 260148
Pointed out by: markj [1]
Sponsored by: The FreeBSD Foundation
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D33628
Kevin Bowling [Wed, 29 Dec 2021 16:37:34 +0000 (09:37 -0700)]
igc: Remove redundant IFCAP_VLAN_HWTAGGING check
Match igb(4) as in f7926a6d0c10. From Vincenzo, this check is redundant
to setup providing us an IGC_RXD_STAT_VP bit and would make for an
unexpected condition if IFCAP_VLAN_HWTAGGING were not set but the tag
was stripped, which would be passed up the stack breaking isolation.