Alexander Motin [Sat, 17 Apr 2021 14:41:35 +0000 (10:41 -0400)]
mpt(4): Remove incorrect S/G segments limits.
First, two of those four checks are unreachable.
Second, I believe the comparison should be ">" rather than ">=".
Third, bus_dma(9) already returns the same EFBIG if ">".
This fixes false I/O errors in worst S/G cases with maxphys >= 2MB.
Approved by: so
Security: EN-21:13.mpt
MFC after: 1 week
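The boundary condition discussed in the mpt(4) entry above can be illustrated with a minimal sketch. The names `MAX_SEGS`, `check_inclusive` and `check_strict` are hypothetical and do not come from the driver; the sketch only assumes that a request using exactly the maximum number of S/G segments is legal.

```python
import errno

MAX_SEGS = 4  # hypothetical segment limit, for illustration only


def check_inclusive(nseg, max_segs=MAX_SEGS):
    """Reject with '>=': wrongly refuses a request that uses exactly max_segs."""
    return errno.EFBIG if nseg >= max_segs else 0


def check_strict(nseg, max_segs=MAX_SEGS):
    """Reject with '>': only refuses requests that exceed max_segs."""
    return errno.EFBIG if nseg > max_segs else 0
```

With these definitions a worst-case request of exactly `MAX_SEGS` segments fails under the inclusive check but passes under the strict one, which matches the "false I/O errors" described in the commit message.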
Zero `struct weightened_nhop` fields in nhgrp_get_addition_group().
`struct weightened_nhop` has spare 32bit between the fields due to
the alignment (on amd64).
Not zeroing these spare bits results in duplicate nhop groups
in the kernel because of how the comparison works.
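As a rough illustration of why unzeroed padding matters, the sketch below models a 16-byte record with four spare bytes and compares two records byte by byte. The layout (8-byte pointer value, 4-byte weight, 4 spare bytes) and the byte-wise comparison are assumptions made for the example; they are not taken from the kernel source.

```python
import struct


def pack_entry(nhop_ptr, weight, spare=b"\x00\x00\x00\x00"):
    """Model a 16-byte record: 8-byte pointer, 4-byte weight, 4 spare bytes."""
    return struct.pack("<QI", nhop_ptr, weight) + spare


def same_bytes(a, b):
    """Byte-wise comparison, in the spirit of memcmp()."""
    return a == b


# Two logically identical entries; one carries stale data in the spare bytes.
zeroed = pack_entry(0x1000, 1)
dirty = pack_entry(0x1000, 1, spare=b"\xde\xad\xbe\xef")
```

Here `zeroed` and `dirty` describe the same nexthop and weight, yet a byte-wise comparison treats them as different, which is how a duplicate group could arise.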
Fetch the sigfastblock value in syscalls that wait for signals
We have seen several cases of processes which have become "stuck" in
kern_sigsuspend(). When this occurs, the kernel's td_sigblock_val
is set to 0x10 (one block outstanding) and the userspace copy of the
word is set to 0 (unblocked). Because the kernel's cached value
shows that signals are blocked, kern_sigsuspend() blocks almost all
signals, which means the process hangs indefinitely in sigsuspend().
It is not entirely clear what is causing this condition to occur.
However, it seems to make sense to add some protection against this
case by fetching the latest sigfastblock value from userspace for
syscalls which will sleep waiting for signals. Here, the change is
applied to kern_sigsuspend() and kern_sigtimedwait().
Robert Watson [Sun, 21 Mar 2021 00:01:54 +0000 (00:01 +0000)]
Tune DTrace 'aframes' for the FBT and profile providers on arm64.
In both cases, too few frames were trimmed, leading to exception handling
or DTrace internals being exposed in stack traces returned by D's stack()
primitive.
Reviewed by: emaste, andrew
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D29356
On FreeBSD/arm, fill_fpregs and fill_dbregs are stubs that zero the reg
struct and return success. set_fpregs and set_dbregs do nothing and
return success.
Provide the same implementation for arm64 COMPAT_FREEBSD32.
Reviewed by: andrew
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29314
Plug nexthop group refcount leak.
In the case of batch route deletion via rib_walk_del(), when
some paths from the multipath route get deleted, the old
multipath group was not freed.
Flush remaining routes from the routing table during VNET shutdown.
Summary:
This fixes rtentry leak for the cloned interfaces created inside the
VNET.
Loopback teardown order is `SI_SUB_INIT_IF`, which happens after `SI_SUB_PROTO_DOMAIN` (route table teardown).
Thus, any route table operations are too late to schedule.
As the intent of the vnet teardown procedures is to minimise the amount of effort by doing global cleanups instead of per-interface ones, address this by adding a relatively light-weight routing table cleanup function, `rib_flush_routes()`.
It removes all remaining routes from the routing table and schedules the deletion, which will happen later, when `rtables_destroy()` waits for the current epoch to finish.
Traditionally, the *BSD routing stack required some interface data to be
supplied for blackhole/reject routes. This led to a
variety of hacks in routing daemons when inserting such routes.
With the recent routing stack changes, gateway sockaddr without
RTF_GATEWAY started to be treated differently, purely as link
identifier.
This change broke net/bird, which installs blackhole routes with
127.0.0.1 gateway without RTF_GATEWAY flags.
Fix this by automatically constructing necessary gateway data at
rtsock level if RTF_REJECT/RTF_BLACKHOLE is set.
Reported by: Marek Zarychta <zarychtam at plan-b.pwste.edu.pl>
Reviewed by: donner
Approved by: re (gjb)
Emmanuel Vadot [Sat, 27 Mar 2021 11:04:51 +0000 (12:04 +0100)]
release: amd64: Fix ISO/USB hybrid image
Recent mkimg changes require partitions to be given in explicit order.
This is so we can have the first partition starting at a specific offset
and the next ones starting after without having to specify an offset.
Switch the partition in the mkisoimage.sh script so the first one created
is the isoboot one.
Approved by: re(gjb)
PR: 254490
Reported by: Michael Dexter <editor@callfortesting.org>
Tested by: Vincent Milum Jr <freebsd@darkain.com>
MFC after: Right now
Mark Johnston [Thu, 25 Mar 2021 21:55:20 +0000 (17:55 -0400)]
accept_filter: Fix filter parameter handling
For filters which implement accf_create, the setsockopt(2) handler
caches the filter name in the socket, but it also incorrectly frees the
buffer containing the copy, leaving a dangling pointer. Note that no
accept filters provided in the base system are susceptible to this, as
they don't implement accf_create.
Approved by: re (gjb)
Reported by: Alexey Kulaev <alex.qart@gmail.com>
Discussed with: emaste
Security: kernel use-after-free
Sponsored by: The FreeBSD Foundation
Mark Johnston [Tue, 23 Mar 2021 13:38:59 +0000 (09:38 -0400)]
pf: Handle unmapped mbufs when computing checksums
Approved by: re (cperciva)
PR: 254419
Reviewed by: gallatin, kp
Tested by: Igor A. Valkov <viaprog@gmail.com>
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29378
Mark Johnston [Sun, 21 Mar 2021 18:18:10 +0000 (14:18 -0400)]
rtsold: Fix validation of RDNSS options
The header specifies the size of the option in multiples of eight bytes.
The option consists of an eight-byte header followed by one or more IPv6
addresses, so the option is invalid if the size is not equal to 1+2n for
some n>0. Check this.
The bug can cause random stack data to be formatted as an IPv6 address
and passed to resolvconf(8), but a host able to trigger the bug may also
specify arbitrary addresses this way.
Approved by: re (cperciva)
Reported by: Q C <cq674350529@gmail.com>
Sponsored by: The FreeBSD Foundation
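The length rule described in the rtsold entry above can be written down as a small check. This is a sketch of the arithmetic only, not the rtsold code; the function names are hypothetical.

```python
def rdnss_length_valid(length_units):
    """Validate an RDNSS option length given in multiples of eight bytes.

    One unit is the eight-byte header; each IPv6 address adds two units
    (16 bytes). At least one address is required, so valid lengths are
    3, 5, 7, ... (that is, 1 + 2n for some n > 0).
    """
    return length_units >= 3 and (length_units - 1) % 2 == 0


def rdnss_address_count(length_units):
    """Return how many IPv6 addresses a valid option carries."""
    if not rdnss_length_valid(length_units):
        raise ValueError("invalid RDNSS option length")
    return (length_units - 1) // 2
```

An even length such as 4 would leave half an address after the header, which is the case where unrelated stack data could be formatted as an address.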
Lawrence Stewart [Wed, 24 Mar 2021 04:25:49 +0000 (15:25 +1100)]
random(9): Restore historical [0,2^31-1] output range and related man documentation.
Commit SVN r364219 / Git 8a0edc914ffd changed random(9) to be a shim around
prng32(9) and inadvertently caused random(9) to begin returning numbers in the
range [0,2^32-1] instead of [0,2^31-1], where the latter has been the documented
range for decades.
The increased output range has been identified as the source of numerous bugs in
code written against the historical output range e.g. ipfw "prob" rules and
stats(3) are known to be affected, and a non-exhaustive audit of the tree
identified other random(9) consumers which are also likely affected.
As random(9) is deprecated and slated for eventual removal in 14.0, consumers
should gradually be audited and migrated to prng(9).
Submitted by: Loic Prylli <lprylli@netflix.com>
Obtained from: Netflix
Reviewed by: cem, delphij, imp
MFC after: 1 day
MFC to: stable/13, releng/13.0
Differential Revision: https://reviews.freebsd.org/D29385
Approved by: re (delphij)
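To make the two ranges in the random(9) entry above concrete, the sketch below reduces a 32-bit value to the historical 31-bit range. Masking off the top bit is an assumption chosen for illustration; the commit message does not say how the range is restored, and the names are hypothetical.

```python
HISTORICAL_MAX = 2**31 - 1  # upper bound of the documented range


def clamp_to_31_bits(value32):
    """Map a value in [0, 2^32-1] into [0, 2^31-1] by dropping the top bit."""
    return value32 & HISTORICAL_MAX
```

A consumer that assumes results never exceed `HISTORICAL_MAX` sees values such as `0xFFFFFFFF` only in the widened range, which is the kind of input that breaks code written against the historical contract.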
Alex Richardson [Mon, 1 Mar 2021 14:27:30 +0000 (14:27 +0000)]
AArch64: Don't set flush-subnormals-to-zero flag on startup
This flag has been set on startup since 65618fdda0f272a823e6701966421bdca0efa301.
However, this causes some of the math-related tests to fail as they report
zero instead of a tiny number. This fixes at least
/usr/tests/lib/msun/ldexp_test and possibly others.
Additionally, setting this flag prevents printf() from printing subnormal
numbers in decimal form.
See also https://www.openwall.com/lists/musl/2021/02/26/1
- Call vm_object_reference() before vm_map_lookup_done().
- Use vm_mmap_to_errno() to convert vm_map_* return values to errno.
- Fix memory leak of e->obj.
netmap: fix memory leak in NETMAP_REQ_PORT_INFO_GET
The netmap_ioctl() function has a reference counting bug in case of
NETMAP_REQ_PORT_INFO_GET command. When `hdr->nr_name[0] == '\0'`,
the function does not decrease the refcount of "nmd", which is
increased by netmap_mem_find(), causing a refcount leak.
Approved by: re (gjb)
Reported by: Xiyu Yang <sherllyyang00@gmail.com>
Submitted by: Carl Smith <carl.smith@alliedtelesis.co.nz>
MFC after: 3 days
PR: 254311
Brandon Bergren [Mon, 1 Mar 2021 02:35:53 +0000 (20:35 -0600)]
[PowerPC64] Fix multiple issues in fpsetmask().
Building R exposed a problem in fpsetmask() whereby we were not properly
clamping the provided mask to the valid range.
R initializes the mask by calling fpsetmask(~0) on FreeBSD. Since we
recently enabled precise exceptions, this was causing an immediate
SIGFPE because we were attempting to set invalid bits in the fpscr.
Properly limit the range of bits that can be set via fpsetmask().
While here, use the correct fp_except_t type instead of fp_rnd_t.
Reported by: pkubaj (in IRC)
Sponsored by: Tag1 Consulting, Inc.
Approved by: re (gjb) (Post-RC3 outstanding request approved for RC4)
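The clamping described in the fpsetmask() entry above amounts to masking the caller's value with the set of valid bits. In the sketch below `FP_VALID_MASK` is a hypothetical five-bit value, not the real PowerPC64 definition.

```python
FP_VALID_MASK = 0x1F  # hypothetical: five exception-enable bits


def clamp_fp_mask(mask):
    """Keep only the bits that are valid exception-enable bits."""
    return mask & FP_VALID_MASK
```

A call such as `clamp_fp_mask(~0)` then yields only the valid bits instead of attempting to set every bit in the register.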
Nathan Whitehorn [Tue, 23 Mar 2021 13:19:42 +0000 (09:19 -0400)]
Fix scripted installs on EFI systems after default mounting of the ESP.
Because the ESP mount point (/boot/efi) is in mtree, tar will attempt to
extract a directory at that point post-mount when the system is installed.
Normally, this is fine, since tar can happily set whatever properties it
wants. For FAT32 file systems, however, like the ESP, tar will attempt to
set mtime on the root directory, which FAT does not support, and tar will
interpret this as a fatal error, breaking the install (see
https://github.com/libarchive/libarchive/issues/1516). This issue would
also break scripted installs on bare-metal POWER8, POWER9, and PS3
systems, as well as some ARM systems.
This patch solves the problem in two ways:
- If stdout is a TTY, use the distextract stage instead of tar, as in
interactive installs. distextract solves this problem internally and
provides a nicer UI to boot, but requires a TTY.
- If stdout is not a TTY, use tar but, as a stopgap for 13.0, exclude
boot/efi from tarball extraction and then add it by hand. This is a
hack, and better solutions (as in the libarchive ticket above) will
obsolete it, but it solves the most common case, leaving only
unattended TTY-less installs on a few tier-2 platforms broken.
In addition, fix a bug in fstab generation, uncovered once the tar issue
is fixed: umount(8) can depend on the ordering of lines in fstab in a
way that mount(8) does not. The partition editor now writes out fstab in
mount order, making sure umount (run at the end of scripted, but not
interactive, installs) succeeds.
PR: 254395
Approved by: re (gjb)
Reviewed by: gjb, imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29380
Mariusz Zaborski [Sat, 13 Mar 2021 11:56:17 +0000 (12:56 +0100)]
zfs: bring back possibility to rewind the checkpoint from
Add parsing of the rewind options.
When I was upstreaming the change [1], I omitted the part where we
detect that the pool should be rewound. When the FreeBSD repo was
synced with OpenZFS, this part of the code was removed.
Michael Tuexen [Thu, 18 Mar 2021 20:25:47 +0000 (21:25 +0100)]
vtnet: fix TSO for TCP/IPv6
The decision whether a TCP packet is sent over IPv4 or IPv6 was
based on ethertype, which works correctly. In D27926 the criterion
was changed to checking if the CSUM_IP_TSO flag is set in the
csum-flags and then considering it to be TCP/IPv4.
However, the TCP stack sets the flag to CSUM_TSO for IPv4 and IPv6,
where CSUM_TSO is defined as CSUM_IP_TSO|CSUM_IP6_TSO.
Therefore TCP/IPv6 packets get misclassified as TCP/IPv4,
which breaks TSO for TCP/IPv6.
This patch bases the check again on the ethertype.
This fix is instantly MFCed.
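The misclassification described in the vtnet entry above follows directly from the flag arithmetic. In the sketch below the two CSUM bit values are hypothetical placeholders and the function names are invented for the example; only the relationship CSUM_TSO = CSUM_IP_TSO|CSUM_IP6_TSO comes from the commit message. The ethertype constants are the standard IPv4 and IPv6 values.

```python
CSUM_IP_TSO = 0x1   # hypothetical bit value
CSUM_IP6_TSO = 0x2  # hypothetical bit value
CSUM_TSO = CSUM_IP_TSO | CSUM_IP6_TSO

ETHERTYPE_IP = 0x0800
ETHERTYPE_IPV6 = 0x86DD


def classify_by_flag(csum_flags):
    """Broken criterion: any packet with CSUM_IP_TSO set is treated as IPv4."""
    return "ipv4" if csum_flags & CSUM_IP_TSO else "ipv6"


def classify_by_ethertype(ethertype):
    """Restored criterion: decide from the ethertype alone."""
    if ethertype == ETHERTYPE_IP:
        return "ipv4"
    if ethertype == ETHERTYPE_IPV6:
        return "ipv6"
    raise ValueError("not an IP ethertype")
```

Because the TCP stack sets `CSUM_TSO` for both families, `classify_by_flag(CSUM_TSO)` always answers "ipv4", even for a packet whose ethertype says IPv6.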
Mateusz Guzik [Wed, 17 Mar 2021 21:33:47 +0000 (22:33 +0100)]
vfs: fix vnlru marker handling for filtered/unfiltered cases
The global list has a marker with an invariant that free vnodes are
placed somewhere past that. A caller which performs filtering (like ZFS)
can move said marker all the way to the end, across free vnodes which
don't match. Then a caller which does not perform filtering will fail to
find them. This makes vn_alloc_hard sleep for 1 second instead of
reclaiming, resulting in significant stalls.
Fix the problem by requiring an explicit marker by callers which do
filtering.
As a temporary measure extend vnlru_free to restart if it fails to
reclaim anything.
Big thanks go to the reporter for testing several iterations of the
patch.
Scott Long [Thu, 18 Mar 2021 07:34:07 +0000 (07:34 +0000)]
base: remove if_wg(4) and associated utilities, manpage
After lengthy discussions, we've decided that the if_wg(4) driver and
related work is not yet ready to live in the tree. This driver has
larger security implications than many, and thus will be held to
more scrutiny than other drivers.
Andrew Gierth [Wed, 3 Mar 2021 18:25:11 +0000 (12:25 -0600)]
service(8): use an environment more consistent with init(8)
init(8) sets the "daemon" login class without specifying a pw
entry (so no substitutions are done on the variables). service(8)'s
use of env -L had the effect of specifying root's pw entry, with two
effects: getpwnam and getpwuid are being called, which may not be
entirely safe depending on what nsswitch is up to and what stage of
boot we are at, and substitutions would have been done.
Fix by teaching env(8) to allow -L -/classname to set the class
environment with no pw entry at all specified, and use it in
service(8).
linux(4): make getcwd(2) return ERANGE instead of ENOMEM
For native FreeBSD binaries, the return value from __getcwd(2)
doesn't really matter, as the libc wrapper takes over and returns
the proper errno.
Approved by: re (gjb)
PR: kern/254120
Reported By: Alex S <iwtcex@gmail.com>
Reviewed By: kib
Sponsored By: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29217
Kyle Evans [Mon, 8 Mar 2021 20:20:10 +0000 (14:20 -0600)]
x86: tsc: deprioritize TSC on VirtualBox
Misbehavior has been observed with TSC under VirtualBox, where threads
doing small sleeps (~1 second) may miss their wake up and hang around
in a sleep state indefinitely. Switching back to ACPI-fast decidedly
fixes it, so stop using TSC on VirtualBox at least for the time being.
This partially reverts 84eaf2ccc6aa, applying it only to VirtualBox and
increasing the quality to 0. Negative qualities can never be chosen and
cannot be chosen with the tunable recently added. If we do not have a
timecounter with a higher quality than 0, then TSC does at least leave
the system mostly usable.
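The effect of setting the TSC quality to 0 rather than a negative value can be sketched as a selection rule: the highest non-negative quality wins, and negative qualities are never picked. This is a simplified model of the behaviour described above, not the kernel's selection code; the function name and the quality numbers in the usage note are illustrative.

```python
def pick_timecounter(qualities):
    """Pick the name with the highest quality, ignoring negative qualities.

    `qualities` maps a timecounter name to its quality value.
    Returns None when no candidate has a non-negative quality.
    """
    eligible = {name: q for name, q in qualities.items() if q >= 0}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)
```

With `{"TSC": 0, "ACPI-fast": 900}` the better counter is chosen, while `{"TSC": 0}` still selects TSC as a last resort; a negative quality would leave nothing to select.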
Mark Johnston [Sun, 14 Mar 2021 16:39:23 +0000 (12:39 -0400)]
vm_reserv: Fix list locking in vm_reserv_reclaim_contig()
The per-domain partpop queue is locked by the combination of the
per-domain lock and individual reservation mutexes.
vm_reserv_reclaim_contig() scans the queue looking for partially
populated reservations that can be reclaimed in order to satisfy the
caller's allocation.
During the scan, we drop the per-domain lock. At this point, the rvn
pointer may be invalidated. Take care to load rvn after re-acquiring
the per-domain lock.
While here, simplify the condition used to check whether a reservation
was dequeued while the per-domain lock was dropped.
Approved by: re (gjb)
Reviewed by: alc, kib
Reported by: gallatin
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29203
John Baldwin [Mon, 8 Mar 2021 18:46:40 +0000 (10:46 -0800)]
Correct the name of the structure used for TCP socket options.
The structure was renamed while refactoring Netflix's KTLS changes for
upstreaming, but the original name remained in tcp.4 and was
subsequently copied to ktls.4.
PR: 254141
Reported by: asomers
Approved by: re (gjb)
P2P ifa may require 2 routes: one is the loopback route, another is
the "prefix" route towards its destination.
Current code marks the existence of loopback routes with IFA_RTSELF and
"prefix" p2p routes with IFA_ROUTE.
For historic reasons, we fill in ifa_dstaddr for loopback interfaces.
To avoid installing the same route twice, we preemptively set
IFA_RTSELF when adding "prefix" route for loopback.
However, the teardown part doesn't have this hack, so we try to
remove the same route twice.
Fix this by checking if ifa_dstaddr is different from the ifa_addr
and moving this logic into a separate function.
Nathan Whitehorn [Tue, 23 Feb 2021 21:16:52 +0000 (16:16 -0500)]
Mount the EFI system partition (ESP) on newly-installed systems and VM
images.
Per hier(7), the ESP will be mounted at /boot/efi. On UFS systems,
any existing ESP will be reused and mounted there; otherwise, a new one
will be made. On ZFS systems, space for an ESP is allocated on all disks
in the root pool, but only the partition actually used to boot is set up
and mounted.
This makes future upgrades of the EFI loader easier (upgrade scripts can
just change /boot/efi) and also greatly simplifies the parts of the
installer involved in initialization of the ESP. It also makes the
installer's behavior correspond to the documentation in hier(7).
Kyle Evans [Thu, 4 Mar 2021 19:28:53 +0000 (13:28 -0600)]
jail(8): reset to root cpuset before attaching to run commands
Recent changes have made it such that attaching to a jail will augment
the attaching process' cpu mask with the jail's cpuset. While this is
convenient for allowing the administrator to cpuset arbitrary programs
that will attach to a jail, this is decidedly not convenient for
executing long-running daemons during jail creation.
This change inserts a reset of the process cpuset to the root cpuset
between the fork and attach to execute a command. This allows commands
executed to have the widest mask possible, and the administrator can
cpuset(1) it back down inside the jail as needed.
With this applied, one should be able to change a jail's cpuset at
exec.poststart in addition to exec.created. The former was made
difficult if jail(8) itself was running with a constrained set, as then
some processes may have been spawned inside the jail with a non-root
set. The latter is the preferred option so that processes starting in
the jail are constrained appropriately up front.
Note that all system commands are still run with the process' initial
cpuset applied.
Gordon Bergling [Sun, 7 Mar 2021 19:27:59 +0000 (20:27 +0100)]
wg(4): Fix an example in the manual page
The example in the manual page of wg(4) for connecting to a
peer was missing the 'public-key' ifconfig(8) keyword and for the
addressed peer the port must be specified.
Traditionally routing socket code did almost zero checks on
the input message except for the most basic size checks.
This resulted in an unclear KPI boundary for the routing system code
(`rtrequest*` and now `rib_action()`) w.r.t. message validity.
Multiple potential problems and nuances exist:
* Host bits in RTAX_DST sockaddr. Existing applications do send prefixes
with hostbits uncleared. Even `route(8)` does this, as they hope the kernel
would do the job of fixing it. Code inside `rib_action()` needs to handle
it on its own (see `rt_maskedcopy()` ugly hack).
* There are multiple ways of adding a host route: it can be DST without
netmask or DST with /32(/128) netmask. Also, RTF_HOST has to be set correspondingly.
Currently, these 2 options create 2 DIFFERENT routes in the kernel.
* No sockaddr length/content checking for the "secondary" fields exists: nothing
 stops an rtsock application from sending a sockaddr_in with length of 25 (instead of 16).
Kernel will accept it, install to RIB as is and propagate to all rtsock consumers,
potentially triggering bugs in their code. Same goes for sin_port, sin_zero, etc.
The goal of this change is to make rtsock verify all sockaddr and prefix consistency.
Said differently, `rib_action()` or internals should NOT need to change any of the
sockaddrs supplied by `rt_addrinfo` structure due to incorrectness.
To be more specific, this change implements the following:
* sockaddr cleanup/validation check is added immediately after getting sockaddrs from rtm.
* Per-family dst/netmask checks clear host bits in dst and zero all dst/netmask "secondary" fields.
* The same netmask checking code converts /32(/128) netmasks to "host" route case
(NULL netmask, RTF_HOST), removing the dualism.
* Instead of allowing ANY "known" sockaddr families (0<..<AF_MAX), allow only actually
supported ones (inet, inet6, link).
* Automatically convert `sockaddr_sdl` (AF_LINK) gateways to
`sockaddr_sdl_short`.
Reported by: Guy Yur <guyyur at gmail.com>
Reviewed By: donner
Approved by: re(gjb)
Differential Revision: https://reviews.freebsd.org/D28668
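Two of the normalisations listed in the rtsock entry above (clearing host bits and folding /32 or /128 prefixes into the host-route case) can be sketched in user space with Python's `ipaddress` module. The function name and the returned tuple are hypothetical; the kernel operates on sockaddr structures rather than strings.

```python
import ipaddress


def normalize_prefix(dst, prefixlen):
    """Return (destination, prefixlen_or_None, is_host_route).

    Host bits in `dst` are cleared. A full-length prefix (/32 for IPv4,
    /128 for IPv6) is reported as a host route with no netmask.
    """
    net = ipaddress.ip_network(f"{dst}/{prefixlen}", strict=False)
    if net.prefixlen == net.max_prefixlen:
        return (str(net.network_address), None, True)
    return (str(net.network_address), net.prefixlen, False)
```

Under this rule a destination given with an explicit /32 and the same destination given with no netmask both end up as one host route, instead of the two different routes mentioned in the commit message.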
Eric Joyner [Tue, 23 Feb 2021 01:45:09 +0000 (17:45 -0800)]
ice(4): Update to version 0.28.1-k
This updates the driver to align with the version included in
the "Intel Ethernet Adapter Complete Driver Pack", version 25.6.
There are no major functional changes; this mostly contains
bug fixes and changes to prepare for new features. This version
of the driver uses the previously committed ice_ddp package
1.3.19.0.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Tested by: jeffrey.e.pieper@intel.com
Approved by: re (gjb)
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D28640
ixl(4): Add ability to control link state on ifconfig down
Add sysctl link_active_on_if_down, which allows user to control
if interface is kept in active state when it is brought
down with ifconfig. Set it to enabled by default to preserve
backwards compatibility.
ixl(4): Report RX errors as sum of all RX error counters
HW keeps track of RX errors using several counters, each for
specific type of errors. Report RX errors to OS as sum
of all those counters: CRC errors, illegal bytes, checksum,
length, undersize, fragment, oversize and jabber errors.
There is no HW counter for frames with invalid L3/L4 checksums
so add a SW one.
Also add a "rx_errors" sysctl with a copy of netstat IERRORS
counter value to make it more easily accessible from scripts.
ix(4): Report RX errors as sum of all RX error counters
HW keeps track of RX errors using several counters, each for
specific type of errors. Report RX errors to OS as sum
of all those counters: CRC errors, illegal bytes, checksum,
length, undersize, fragment, oversize and jabber errors.
Also, add new "rx_errs" sysctl in the dev.ix.N.mac_stats tree. This is
to provide another way to display the sum of RX errors.
The X700 family of controllers has a limited number of available VLAN
HW filters. The driver did not properly handle the case when the user
assigned more VLANs to an interface which had all filters
already in use. Fix that by disabling HW filtering when
it is impossible to create filters for all requested VLANs.
Keep track of registered VLANs using bitstring to be able
to re-enable HW filtering when number of requested VLANs
drops below the limit.
Also switch all allocations to use M_IXL malloc type
to ease detecting memory leaks in the driver.
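The VLAN bookkeeping described above can be modelled as a bitstring of registered VLAN IDs plus a limit on hardware filters. The class below is a sketch under stated assumptions: the class and method names are hypothetical, the limit is a parameter rather than the real hardware value, and hardware filtering is assumed to be usable whenever the number of registered VLANs does not exceed the limit.

```python
class VlanFilterState:
    """Track registered VLANs and decide whether HW filtering can be used."""

    def __init__(self, hw_filter_limit):
        self.hw_filter_limit = hw_filter_limit
        self.vlans = 0  # bit n set means VLAN ID n is registered

    def register(self, vid):
        self.vlans |= 1 << vid

    def unregister(self, vid):
        self.vlans &= ~(1 << vid)

    def count(self):
        return bin(self.vlans).count("1")

    def hw_filtering_enabled(self):
        # Fall back to no HW filtering when not every VLAN can get a filter.
        return self.count() <= self.hw_filter_limit
```

Keeping the full set of registered VLANs, rather than only those with filters, is what allows filtering to be switched back on once enough VLANs are removed.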
Dimitry Andric [Fri, 5 Mar 2021 20:06:05 +0000 (21:06 +0100)]
Add a few missed files to libclang_rt.profile-<arch>.a
Otherwise, programs compiled with -fprofile-instr-generate will
encounter undefined symbol errors during linking, for example
__llvm_profile_counter_bias, lprofSetRuntimeCounterRelocation and a few
others were missing from the profile library.
Approved by: re (gjb)
Reported by: ota@j.email.ne.jp
PR: 254001
armv8crypto: fix AES-XTS regression introduced by ed9b7f44
Initialization of the XTS key schedule was accidentally dropped
when adding AES-GCM support so all-zero schedule was used instead.
This rendered previously created GELI partitions unusable.
This change restores proper XTS key schedule initialization.
Reported by: Peter Jeremy <peter@rulingia.com>
MFC after: immediately
Approved by: re (gjb)
Alexander Motin [Sat, 6 Mar 2021 03:39:52 +0000 (22:39 -0500)]
Do not exit ctl_be_block_worker() prematurely.
Returning while there are any I/Os in a queue may result in them being stuck
indefinitely, since there is only one taskqueue task for all of them.
I think I've reproduced this by switching ha_role to secondary under
heavy load.
Mitchell Horne [Thu, 4 Mar 2021 17:52:45 +0000 (13:52 -0400)]
riscv: fix errors in some atomic type aliases
This appears to be a copy-and-paste error that has simply been
overlooked. The tree contains only two calls to any of the affected
variants, but recent additions to the test suite started exercising the
call to atomic_clear_rel_int() in ng_leave_write(), reliably causing
panics.
Apparently, the issue was inherited from the arm64 atomic header. That
instance was addressed in c90baf6817a0, but the fix did not make its way
to RISC-V.
Note that the particular test case ng_macfilter_test:main still appears
to fail on this platform, but this change reduces the panic to a
timeout.
Stefan Eßer [Wed, 17 Feb 2021 21:56:16 +0000 (22:56 +0100)]
Upgrade to version 3.3.0
This update changes the behavior of "-e" or "-f" in BC_ENV_ARGS:
Use of these options on the command line makes bc exit after executing
the given commands. These options will not cause bc to exit when
passed via the environment (but EOF in STDIN or -e or -f on the
command line will make bc exit as before).
The same applies to DC_ENV_ARGS with regard to the dc program.
Martin Matuska [Wed, 3 Mar 2021 01:38:09 +0000 (02:38 +0100)]
zfs: cancel TRIM or initialize on FAULTED non-writeable vdevs
From the openzfs commit message:
When a device which is actively trimming or initializing becomes
FAULTED, and therefore no longer writable, cancel the active
TRIM or initialization. When the device is merely taken offline
with `zpool offline` then stop the operation but do not cancel it.
When the device is brought back online the operation will be
resumed if possible.
Martin Matuska [Wed, 3 Mar 2021 01:32:59 +0000 (02:32 +0100)]
zfs: fix assert in FreeBSD-specific dmu_read_pages
From the openzfs 2e160dee9 commit message:
The function has three similar pieces of code: for read-behind pages,
requested pages and read-ahead pages. All three pieces had an
assert to ensure that the page is not mapped. Later the assert was
relaxed to require that the page is not mapped for writing. But that
was done in two places out of three. This change fixes the third piece,
read-ahead.
Martin Matuska [Wed, 3 Mar 2021 01:28:56 +0000 (02:28 +0100)]
zfs: fix vdev_rebuild_thread deadlock
From the openzfs 8e43fa12c commit message:
The metaslab_disable() call may block waiting for a txg sync.
Therefore it's important that vdev_rebuild_thread release the
SCL_CONFIG read lock it is holding before this call. Failure
to do so can result in the txg_sync thread getting blocked
waiting for this lock which results in a deadlock.
Martin Matuska [Wed, 3 Mar 2021 01:25:03 +0000 (02:25 +0100)]
zfs: fix overly broad locking in spa_vdev_config_exit()
Resolves a deadlock which can occur when the ZED or zpool
command attaches a new device.
From the openzfs 75a089ed3 commit message:
Calling vdev_free() only requires that we acquire the spa config
SCL_STATE_ALL locks, not the SCL_ALL locks. In particular, we
need to avoid taking the SCL_CONFIG lock (included in SCL_ALL) as a
writer since this can lead to a deadlock. The txg_sync_thread() may
block in spa_txg_history_init_io() when taking the SCL_CONFIG lock
as a reader when it detects there's a pending writer.
Ed Maste [Tue, 2 Mar 2021 22:35:48 +0000 (17:35 -0500)]
growfs: allow operation on RW-mounted filesystems
growfs supports growing mounted filesystems (writes are temporarily
suspended while the grow happens). Drop the check for fs_clean == 0
to restore this case. Leave fs_flags check for FS_UNCLEAN or
FS_NEEDSFSCK which represent the state of the filesystem when it was
mounted, and fsck should be run first if they are set.
PR: 253754
Reviewed by: mckusick
Approved by: re (gjb)
Fixes: 6eb925f8450f ("Filesystem utilities that modify the...")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29021
Peter Grehan [Sat, 27 Feb 2021 04:15:04 +0000 (14:15 +1000)]
Import wireguard fixes from pfSense 2.5
Merge the following fixes from https://github.com/pfsense/FreeBSD-src
1940e7d3 Save address of ingress packets to allow wg to work on HA
8f5531f1 Fix connection to IPv6 endpoint
825ed9ee Fix tcpdump for wg IPv6 rx tunnel traffic
2ec232d3 Fix issue with replying to INITIATION messages in server mode
ec77593a Return immediately in wg_init if in DETACH'd state
0f0dde6f Remove unnecessary wg debug printf on transmit
2766dc94 Detect and fix case in wg_init() where sockets weren't cleaned up
b62cc7ac Close the UDP tunnel sockets when the interface has been stopped
linux: fix handling of flags for 32 bit send(2) syscall
Previously the flags were passed as-is, which could result
in spurious EAGAIN returned for non-blocking sockets, which
broke some Steam games.
Approved by: re (gjb)
PR: 248065
Reported By: Alex S <iwtcex@gmail.com>
Tested By: Alex S <iwtcex@gmail.com>
Reviewed By: emaste
MFC After: 3 days
Sponsored By: The FreeBSD Foundation
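Passing flags "as-is" goes wrong whenever the two ABIs assign different bits to the same meaning. The sketch below shows the general shape of a translation step; every numeric value and every name in it is a placeholder chosen for the example and should not be read as the real Linux or FreeBSD constants.

```python
# Placeholder bit assignments, for illustration only.
LINUX_MSG_OOB = 0x01
LINUX_MSG_DONTWAIT = 0x40
LINUX_MSG_WAITALL = 0x100

NATIVE_MSG_OOB = 0x01
NATIVE_MSG_DONTWAIT = 0x80
NATIVE_MSG_WAITALL = 0x40

LINUX_TO_NATIVE = {
    LINUX_MSG_OOB: NATIVE_MSG_OOB,
    LINUX_MSG_DONTWAIT: NATIVE_MSG_DONTWAIT,
    LINUX_MSG_WAITALL: NATIVE_MSG_WAITALL,
}


def translate_flags(linux_flags):
    """Translate each known Linux flag bit to its native counterpart."""
    native = 0
    for linux_bit, native_bit in LINUX_TO_NATIVE.items():
        if linux_flags & linux_bit:
            native |= native_bit
    return native
```

With these placeholder values, an untranslated `LINUX_MSG_DONTWAIT` would be read natively as a different flag, which is the class of mistake a translation table avoids.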
Use compat.linux.emul_path instead of hardcoded path in /etc/rc.d/linux
In /etc/rc.d/linux the mounting paths of procfs, sysfs and devfs
are hardcoded to "/compat/linux". Switching to the content of
compat.linux.emul_path sysctl would allow the linuxulator to be moved
to a different place.
Approved by: re (gjb)
Submitted by: freebsdnewbie_freenet.de
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27807