Martin Matuska [Mon, 23 Aug 2021 00:54:15 +0000 (02:54 +0200)]
libarchive: import changes from upstream
Libarchive 3.5.2
New features:
PR #1502: Support for PWB and v7 binary cpio formats
PR #1509: Support of deflate algorithm in symbolic link decompression
for ZIP archives
Important bugfixes:
IS #1044: fix extraction of hardlinks to symlinks
PR #1480: Fix truncation of size values during 7zip archive
extraction on 32bit architectures
PR #1504: fix rar header skiming
PR #1514: ZIP excessive disk read - fix location of central directory
PR #1520: fix double-free in CAB reader
PR #1521: Fixed leak of rar before ending with error
PR #1530: Handle short writes from archive_write_callback
PR #1532: 7zip: Use compression settings from file also for file header
IS #1566: do not follow symlinks when processing the fixup list
Martin Matuska [Mon, 23 Aug 2021 00:24:04 +0000 (02:24 +0200)]
Update vendor/libarchive/dist to libarchive/libarchive@1b2c437b9
Libarchive 3.5.2
New features:
PR #1502: Support for PWB and v7 binary cpio formats
PR #1509: Support of deflate algorithm in symbolic link decompression
for ZIP archives
Important bugfixes:
IS #1044: fix extraction of hardlinks to symlinks
PR #1480: Fix truncation of size values during 7zip archive
extraction on 32bit architectures
PR #1504: fix rar header skiming
PR #1514: ZIP excessive disk read - fix location of central directory
PR #1520: fix double-free in CAB reader
PR #1521: Fixed leak of rar before ending with error
PR #1530: Handle short writes from archive_write_callback
PR #1532: 7zip: Use compression settings from file also for file header
IS #1566: do not follow symlinks when processing the fixup list
Zhenlei Huang [Sun, 22 Aug 2021 22:28:47 +0000 (22:28 +0000)]
routing: Allow using IPv6 next-hops for IPv4 routes (RFC 5549).
Implement kernel support for RFC 5549/8950.
* Relax control plane restrictions and allow specifying IPv6 gateways
for IPv4 routes. This behavior is controlled by the
net.route.rib_route_ipv6_nexthop sysctl (on by default).
* Always pass final destination in ro->ro_dst in ip_forward().
* Use ro->ro_dst to exract packet family inside if_output() routines.
Consistently use RO_GET_FAMILY() macro to handle ro=NULL case.
* Pass extracted family to nd6_resolve() to get the LLE with proper encap.
It leverages recent lltable changes committed in c541bd368f86.
Presence of the functionality can be checked using ipv4_rfc5549_support feature(3).
Example usage:
route add -net 192.0.0.0/24 -inet6 fe80::5054:ff:fe14:e319%vtnet0
Bjoern A. Zeeb [Sun, 22 Aug 2021 17:21:01 +0000 (17:21 +0000)]
vm: use __func__ for the correct function name
In fee2a2fa39834d8d5eaa981298fce9d2ed31546d the KASSERTs in
vm_page_unwire_noq() changed from "vm_page_unwire" to "vm_page_unref".
While the former no longer was part of that function the latter does
not exist as a function and is highly confusing when hit when using
tools to lookup the functions and not doing a full-text search.
Use %s __func__ for printing the function name, as that will do the
right thing as code moves around and functions get renamed.
Hit: while debugging a wired page leak with linuxkpi/iwlwifi
Sponsored by: The FreeBSD Foundation
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D31635
Thomas Munro [Sun, 22 Aug 2021 11:34:07 +0000 (23:34 +1200)]
Fix aio_readv(2), aio_writev(2) with SIGEV_THREAD.
Add missing wrapper code to librt for these new functions so that
SIGEV_THREAD works. Without machinery to convert it to SIGEV_THREAD_ID,
you got EINVAL.
Thomas Munro [Sun, 22 Aug 2021 09:48:59 +0000 (21:48 +1200)]
lio_listio(2): Allow LIO_READV and LIO_WRITEV.
Allow multiple vector IOs to be started with one system call.
aio_readv() and aio_writev() already used these opcodes under the
covers. This commit makes them available to user space.
Being non-standard extensions, they're only visible if __BSD_VISIBLE is
defined, like the functions.
- make sure rings are disabled during resets
- introduce netmap_update_hostrings_mode(), with support
for multiple host rings
- always initialize ni_bufs_head in netmap_if
ni_bufs_head was not properly initialized when no external buffers were
requestedx and contained the ni_bufs_head from the last request. This
was causing spurious buffer frees when alternating between apps that
used external buffers and apps that did not use them.
- check na validitity under lock on detach
- netmap_mem: fix leak on error path
- nm_dispatch: fix compilation on Raspberry Pi
Dimitry Andric [Sat, 21 Aug 2021 21:03:37 +0000 (23:03 +0200)]
Apply clang fix for assertion failure compiling multimedia/minitube
Merge commit 79f9cfbc21e0 from llvm git (by Yaxun (Sam) Liu):
Do not merge LocalInstantiationScope for template specialization
A lambda in a function template may be recursively instantiated. The recursive
lambda will cause a lambda function instantiated multiple times, one inside another.
The inner LocalInstantiationScope should not be marked as MergeWithParentScope
since it already has references to locals properly substituted, otherwise it causes
assertion due to the check for duplicate locals in merged LocalInstantiationScope.
lltable: Add support for "child" LLEs holding encap for IPv4oIPv6 entries.
Currently we use pre-calculated headers inside LLE entries as prepend data
for `if_output` functions. Using these headers allows saving some
CPU cycles/memory accesses on the fast path.
However, this approach makes adding L2 header for IPv4 traffic with IPv6
nexthops more complex, as it is not possible to store multiple
pre-calculated headers inside lle. Additionally, the solution space is
limited by the fact that PCB caching saves LLEs in addition to the nexthop.
Thus, add support for creating special "child" LLEs for the purpose of holding
custom family encaps and store mbufs pending resolution. To simplify handling
of those LLEs, store them in a linked-list inside a "parent" (e.g. normal) LLE.
Such LLEs are not visible when iterating LLE table. Their lifecycle is bound
to the "parent" LLE - it is not possible to delete "child" when parent is alive.
Furthermore, "child" LLEs are static (RTF_STATIC), avoding complex state
machine used by the standard LLEs.
nd6_lookup() and nd6_resolve() now accepts an additional argument, family,
allowing to return such child LLEs. This change uses `LLE_SF()` macro which
packs family and flags in a single int field. This is done to simplify merging
back to stable/. Once this code lands, most of the cases will be converted to
use a dedicated `family` parameter.
Toomas Soome [Sat, 21 Aug 2021 11:25:36 +0000 (14:25 +0300)]
loader: FB console does leave garbage on screen while scrolling
Scrolling screen will leave "trail" of chars from first column.
Apparently caused by cursor location mismanagement.
Make sure we do not [attempt to] set cursor out of the screen.
Alexander Motin [Sat, 21 Aug 2021 13:31:41 +0000 (09:31 -0400)]
cam(4): Fix quick unplug/replug for SCSI.
If some device is plugged back in after unplug before the probe periph
destroyed, it will just restart the probe process. But I've found that
PROBE_INQUIRY_CKSUM flag not cleared between the iterations may cause
AC_FOUND_DEVICE not reported on the second iteration, and because of
AC_LOST_DEVICE reported during the first iteration, the device end up
configured, but without any periphs attached.
We've found that enabled serial console and 102-disk JBOD cause enough
probe delays to easily trigger the issue for half of the disks. This
change fixes it reliably on my tests.
When link_active_on_if_down flag is disabled and link is brought down
with ifconfig, FW reports a false positive link event about an
unqualified transceiver. The condition used in the driver to filter out
those false positive events was incorrect and caused that unqualified
module event to also not be reported when the event was valid.
Change the condition to rely on IFF_UP flag instead of
link_active_on_if_down and bump driver version to 2.3.1-k.
Maxim Sobolev [Fri, 20 Aug 2021 19:33:51 +0000 (12:33 -0700)]
Only trigger read-ahead if two adjacent blocks have been requested.
The change makes block caching algorithm to work better for remote
media on low-BW/high-delay links.
This cuts boot time over IP KVMs noticeably, since the initialization
stage reads bunch of small 4th (and now lua) files that are not in
the same cache stripe (usually), thus wasting lot of bandwidth and
increasing latency even further.
The original regression came in 2017 with revision 87ed2b7f5. We've
seen increase of time it takes for the loader to get to the kernel
loading from under a minute to 10-15 minutes in many cases.
Use interruptible wait for blocking recursive unmounts
Now that we allow recursive unmount attempts to be abandoned upon
exceeding the retry limit, we should avoid leaving an unkillable
thread when a synchronous unmount request was issued against the
base filesystem.
VFS: add retry limit and delay for failed recursive unmounts
A forcible unmount attempt may fail due to a transient condition, but
it may also fail due to some issue in the filesystem implementation
that will indefinitely prevent successful unmount. In such a case,
the retry logic in the recursive unmount facility will cause the
deferred unmount taskqueue to execute constantly.
Avoid this scenario by imposing a retry limit, with a default value
of 10, beyond which the recursive unmount facility will emit a log
message and give up. Additionally, introduce a grace period, with
a default value of 1s, between successive unmount retries on the
same mount.
Create a new sysctl node, vfs.deferred_unmount, to export the total
number of failed recursive unmount attempts since boot, and to allow
the retry limit and retry grace period to be tuned.
Alexander Motin [Fri, 20 Aug 2021 18:37:55 +0000 (14:37 -0400)]
mpr(4): Fix unmatched devq release.
Before this change devq was frozen only if some command was sent to
the target after reset started, but release was called always. This
change freezes the devq immediately, leaving mprsas_action_scsiio()
check only to cover race condition due to different lock devq use.
This should also avoid unnecessary requeue of the commands, creating
additional log noise and confusing some broken apps.
Maxim Sobolev [Fri, 20 Aug 2021 16:43:04 +0000 (09:43 -0700)]
Allow rc.d script to provide "status" method, even if it does not
define procname or have a PID file. This might be useful for cases,
such as mounting local FS, when there is no running daemon
still some other persistent state in the system which status
can be checked.
It is still possible to have a status method before this by having
extra_commands="status", but it's not obvious and might give
an script writer some extra legwork to figure out how and why
the straight method is not working.
Fully revert f83f5d58394db57576bbed6dc7531997cabeb102 for the sys/dev/usb/serial
folder, only keeping the zero length packet API introduced in sys/dev/usb
after more reports of USB serial devices not supporting ZLPs.
Alexander Motin [Fri, 20 Aug 2021 13:46:51 +0000 (09:46 -0400)]
mpr(4): Handle mprsas_alloc_tm() errors on device removal.
SAS9305-16e with firmware 16.00.01.00 report HighPriorityCredit of
only 8, while for comparison some other combinations I have report
100 or even 128. In case of large JBOD detach requirement to send
target reset command to each target same time overflows the limit,
and without adequate handling makes devices stuck in half-detached
state, preventing later re-attach.
To handle that in case of allocation error mark the target with new
MPRSAS_TARGET_TOREMOVE flag, and retry the removal attempt next time
something else free high priority command. With this patch I can
successfully detach/attach 102 disk JBOD from/to the SAS9305-16e.
Wei Hu [Fri, 20 Aug 2021 08:43:10 +0000 (08:43 +0000)]
Microsoft Azure Network Adapter(MANA) VF support
MANA is the new network adapter from Microsoft which will be available
in Azure public cloud. It provides SRIOV NIC as virtual function to
guest OS running on Hyper-V.
The code can be divided into two major parts. Gdma_main.c is the one to
bring up the hardware board and drives all underlying hardware queue
infrastructure. Mana_en.c contains all main ethernet driver code.
It has only tested and supported on amd64 architecture.
ls: prevent no-color build from complaining when COLORTERM is non-empty
As 257886 reports, if ls(1) is built with WITHOUT_LS_COLORS="YES", it
issues a warning whenever COLORTERM is non-empty. The warning is not
useful, so I thought to remove it, but as Ed pointed out, we may want
to have a way to determine whether a particular copy of ls has been
compiled with color support or not.
Therefore move the warnx() call to the getopt loop in
a WITHOUT_LS_COLORS build to fire when the user asks for colored output.
PR: 257886
Reported by: Marko Turk
Reviewed by: kevans
Bjoern A. Zeeb [Thu, 19 Aug 2021 17:27:04 +0000 (12:27 -0500)]
localedef: unbreak WITHOUT_LOCALES
After 0fa5403d493b ("pkgbase: move locales into their own package") we
need usr.bin/localedef as a bootstrap tool independent on where
WITHOUT_LOCALE was specified as we ALWAYS process C.UTF-8.
At the same time LOCALES= in the local Makefile is empty but
C.UTF-8 with WITHOUT_LOCALES. C.UTF-8 is excluded from FILES, and thus
after the replacement FILES= is set to only .LC_CTYPE which results in
a build failure not knowing how to build that. Tweak the substitution to
replace only non-empty words so that FILES remains harmlessly empty.
Ka Ho Ng [Thu, 19 Aug 2021 10:30:41 +0000 (18:30 +0800)]
truncate(1): Add hole-punching support
This commit adds hole-punching support to the truncate(1) utility. If
the option -d is specified, truncate(1) performs zeroing, and if
possible hole-punching in case the operation is supported by the
underlying file system of the specified files.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31556
Alexander Motin [Wed, 18 Aug 2021 21:11:03 +0000 (17:11 -0400)]
geli(8): Do not report error on resize to the same size.
Just validate the old metadata and exit. Originally the check was
added to not thash the only copy of metadata, but we can achieve the
same just by skipping the writing/trashing. The metadata validation
should protect user from wrongly specifying new size instead of old.
John Baldwin [Wed, 18 Aug 2021 17:56:28 +0000 (10:56 -0700)]
iscsi: Teach the iSCSI stack about "large" received PDUs.
When using iSCSI PDU offload (cxgbei) on T6 adapters, a burst of
received PDUs can be reported via a single message to the driver.
Previously the driver passed these multi-PDU bursts up to the iSCSI
stack up as a single "large" PDU by rewriting the buffer offset, data
segment length, and DataSN fields in the iSCSI header. The DataSN
field in particular was rewritten so that each of the "large" PDUs
used consecutively increasing values. While this worked, the forged
DataSN values did not match the ExpDataSN value in the subsequent SCSI
Response PDU. The initiator does not currently verify this value, but
the forged DataSN values prevent adding a check.
To avoid this, allow a logical iSCSI PDU (struct icl_pdu) to describe
a burst of PDUs via a new 'ip_additional_pdus' field. Normally this
field is set to zero when 'struct icl_pdu' represents a single PDU.
If logical PDU represents a burst of on-the-wire PDUs, then 'ip_npdus'
contains the count of additional on-the-wire PDUs. The header of this
"large" PDU is still modified, but the DataSN field now contains the
DataSN value of the first on-the-wire PDU in the burst.
Kyle Evans [Wed, 18 Aug 2021 17:31:45 +0000 (12:31 -0500)]
uipc: avoid circular pr_{slow,fast}timos
domain_init() gets reinvoked for each vnet on a system, so we must not
alter global state. Practically speaking, we were creating circular
lists and tying up a softclock thread into an infinite loop.
The breakage here was most easily observed by simply creating a jail
in a new vnet and watching the system suddenly become erratic.
Reported by: markj
Fixes: e0a17c3f063f ("uipc: create dedicated lists for fast ...")
Pointy hat: kevans
Cyril Zhang [Wed, 18 Aug 2021 17:41:33 +0000 (13:41 -0400)]
vmm: Add credential to cdev object
Add a credential to the cdev object in sysctl_vmm_create(), then check
that we have the correct credentials in sysctl_vmm_destroy(). This
prevents a process in one jail from opening or destroying the /dev/vmm
file corresponding to a VM in a sibling jail.
Add regression tests.
Reviewed by: jhb, markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31156
Scott Long [Tue, 17 Aug 2021 21:50:18 +0000 (21:50 +0000)]
- Fix the growfs rc script to cope with diskid labels.
- Fix a warning in growfs. gpart commit is supposed to be called on disk
device.
- Silence a gpart commit warning in growfs.
John Baldwin [Tue, 17 Aug 2021 21:39:58 +0000 (14:39 -0700)]
OpenSSL: Refactor KTLS tests to better support TLS 1.3.
Most of this upstream commit touched tests not included in the
vendor import. The one change merged in is to remove a constant
only present in an internal header to appease the older tests.
John Baldwin [Tue, 17 Aug 2021 21:39:32 +0000 (14:39 -0700)]
OpenSSL: Update KTLS documentation
KTLS support has been changed to be off by default, and configuration is
via a single "option" rather two "modes". Documentation is updated
accordingly.
John Baldwin [Tue, 17 Aug 2021 21:39:03 +0000 (14:39 -0700)]
OpenSSL: Only enable KTLS if it is explicitly configured
It has always been the case that KTLS is not compiled by default. However
if it is compiled then it was automatically used unless specifically
configured not to. This is problematic because it avoids any crypto
implementations from providers. A user who configures all crypto to use
the FIPS provider may unexpectedly find that TLS related crypto is actually
being performed outside of the FIPS boundary.
Instead we change KTLS so that it is disabled by default.
We also swap to using a single "option" (i.e. SSL_OP_ENABLE_KTLS) rather
than two separate "modes", (i.e. SSL_MODE_NO_KTLS_RX and
SSL_MODE_NO_KTLS_TX).
John Baldwin [Tue, 17 Aug 2021 21:37:47 +0000 (14:37 -0700)]
OpenSSL: Correct the return value of BIO_get_ktls_*().
BIO_get_ktls_send() and BIO_get_ktls_recv() are documented as
returning either 0 or 1. However, they were actually returning the
internal value of the associated BIO flag for the true case instead of
1.
When a prefix gets deleted from the RIB, dpdk_lpm algo needs to know
the nexthop of the "parent" prefix to update its internal state.
The glue code, which utilises RIB as a backing route store, uses
fib[46]_lookup_rt() for the prefix destination after its deletion
to fetch the desired nexthop.
This approach does not work when deleting less-specific prefixes
with most-specific ones are still present. For example, if
10.0.0.0/24, 10.0.0.0/23 and 10.0.0.0/22 exist in RIB, deleting
10.0.0.0/23 would result in 10.0.0.0/24 being returned as a search
result instead of 10.0.0.0/22. This, in turn, results in the failed
datastructure update: part of the deleted /23 prefix will still
contain the reference to an old nexthop. This leads to the
use-after-free behaviour, ending with the eventual crashes.
Fix the logic flaw by properly fetching the prefix "parent" via
newly-created rt_get_inet[6]_parent() helpers.
Randall Stewart [Tue, 17 Aug 2021 20:29:22 +0000 (16:29 -0400)]
tcp: Add support for DSACK based reordering window to rack.
The rack stack, with respect to the rack bits in it, was originally built based
on an early I-D of rack. In fact at that time the TLP bits were in a separate
I-D. The dynamic reordering window based on DSACK events was not present
in rack at that time. It is now part of the RFC and we need to update our stack
to include these features. However we want to have a way to control the feature
so that we can, if the admin decides, make it stay the same way system wide as
well as via socket option. The new sysctl and socket option has the following
meaning for setting:
00 (0) - Keep the old way, i.e. reordering window is 1 and do not use DSACK bytes to add to reorder window
01 (1) - Change the Reordering window to 1/4 of an RTT but do not use DSACK bytes to add to reorder window
10 (2) - Keep the reordering window as 1, but do use SACK bytes to add additional 1/4 RTT delay to the reorder window
11 (3) - reordering window is 1/4 of an RTT and add additional DSACK bytes to increase the reordering window (RFC behavior)
The default currently in the sysctl is 3 so we get standards based behavior.
Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D31506