Andriy Gapon [Fri, 22 May 2020 11:41:08 +0000 (11:41 +0000)]
MFC r358380,r358382: dsl_dataset_promote_sync: populate 'oldname' before using it
It's very unlikely that zfsvfs_update_fromname() and
zvol_rename_minors() ever did anything during the promote operation as
the old name was not initialized.
MFC r361075:
Assign process group of the TTY under the "proctree_lock".
This fixes a race where concurrent calls to doenterpgrp() and
leavepgrp() while TIOCSCTTY is executing may result in tp->t_pgrp
changing value so that tty_rel_pgrp() misses clearing it to NULL. For
more details refer to the use of pgdelete() in the kernel.
No functional change intended.
Panic backtrace:
__mtx_lock_sleep() # page fault due to using destroyed mutex
tty_signal_pgrp()
tty_ioctl()
ptsdev_ioctl()
kern_ioctl()
sys_ioctl()
amd64_syscall()
MFC r360477: Correctly set up the initial TCP congestion window in all cases
by not including the SYN bit sequence space in cwnd related calculations.
Snd_und is adjusted explicitly in all cases, outside the cwnd update, instead.
This fixes an off-by-one conformance issue with regular TCP sessions
not using Appropriate Byte Counting (RFC3465), sending one more
packet during the initial window than expected.
MFC r360479: Prevent premature shrinking of the scaled receive window
which can cause a TCP client to use invalid or stale TCP sequence numbers for ACK packets.
Packets with old sequence numbers are ignored and not used to update the send window size.
This might cause the TCP session to hang indefinitely under some circumstances.
Kyle Evans [Thu, 21 May 2020 02:08:34 +0000 (02:08 +0000)]
MFC r361011: kernel: provide panicky version of __unreachable
__builtin_unreachable doesn't raise any compile-time warnings/errors on its
own, so problems with its usage can't be easily detected. While it would be
nice for this situation to change and compilers to at least add a warning
for trivial cases where local state means the instruction can't be reached,
this isn't the case at the moment and likely will not happen.
This commit adds an __assert_unreachable, whose intent is incredibly clear:
it asserts that this instruction is unreachable. On INVARIANTS builds, it's
a panic(), and on non-INVARIANTS it expands to __unreachable().
Existing users of __unreachable() are converted to __assert_unreachable,
to improve debuggability if this assumption is violated.
Ryan Moeller [Thu, 21 May 2020 02:04:10 +0000 (02:04 +0000)]
MFC r361066:
jail: Add exec.prepare and exec.release command hooks
This change introduces new jail command hooks that run before and after any
other actions.
The exec.prepare hook can be used for example to invoke a script that checks
if the jail's root exists, creating it if it does not. Since arbitrary
variables in jail.conf can be passed to the command, it can be pretty useful
for templating jails.
An example use case for exec.release would be to remove the filesystem of an
ephemeral jail.
The names "prepare" and "release" are borrowed from the names of similar hooks
in libvirt.
Kyle Evans [Thu, 21 May 2020 01:55:10 +0000 (01:55 +0000)]
MFC r361065: pf tests: fix up a couple WARNS= 6 nits
common_init_tbl is only used within this single CU, so it should be marked
static.
WARNS=6 also complained about the var defined by
`ATF_TC_WITH_CLEANUP(getastats);` being unused, which turns out to be
because it's not been hooked up in ATF_TP_ADD_TCS. kp@ did not immediately
recall any reason for this, and the case passes on my local system, so hook
it up.
Note that I've not yet set WARNS= 6 here. Investigation is underway to see
if we can feasibly default WARNS to 6 for src builds to catch directories
too deep to inherit a WARNS from the top-level subdirectories' Makefile.inc.
Those particular WARNS settings will be subsequently removed as they become
redundant with a more-global default.
Kyle Evans [Thu, 21 May 2020 01:53:03 +0000 (01:53 +0000)]
MFC r361000, r361036: improve inetd(8) examples
r361000:
inetd(8): Provide HTTP proxy example using netcat
One of the fortunes that are included in freebsd-tips talks about how
the superserver can be used to proxy connections with netcat, but there are
no examples provided. This commit adds an example with comment explaining
what it does.
It's been reported/noted that a well-timed `certctl rehash` will completely
obliterate $CERTDESTDIR, which may get used by ports or system
administrators. While we can't guarantee the certctl semantics when other
non-certctl-controlled bits live here, we should make some amount of effort
to play nice.
Pruning all existing links, which we'll subsequently rebuild as needed, is
sufficient for our needs. This can still be destructive, but it's perhaps
less likely to cause issues.
I also note that we should probably be pruning /etc/ssl/blacklisted upon
rehash as well.
r361023: certctl: follow-up to r361022, prune blacklist as well
Otherwise, removals from the blacklist may not get processed as they should.
While we're here, restructure these to not bother with mkdir(1) if we've
already tested them to exist.
r361148: certctl: don't fall over flat with relative DESTDIR
Up until now, all of our DESTDIR use has been with absolute paths. It turned
out that the cd in/out dance we do here breaks us down later on, as the
relative path no longer resolves.
Convert EXTENSIONS to an ERE that we'll use to grep ls -1 of the dir we're
inspecting, rather than cd'ing into it and globbing it up.
Alexander Motin [Tue, 19 May 2020 14:31:47 +0000 (14:31 +0000)]
MFC r360564: Cleanup LUN addition/removal.
- Make ctl_add_lun() synchronous. Asynchronous addition was used by
Copan's proprietary code long ago and never for upstream FreeBSD.
- Move LUN enable/disable calls from backends to CTL core.
- Serialize LUN modification and partially removal to avoid double frees.
- Slightly unify backends code.
Colin Percival [Tue, 19 May 2020 01:40:45 +0000 (01:40 +0000)]
MFC r361114:
Move the devmatch rc.d script before netif in the boot process.
Prior to this change, using lagg to aggregate wired and wireless networks
was broken in the (relatively common) case where wifi drivers + firmware
are loaded by devmatch, since the interface didn't exist at the time when
the lagg interface was being created.
Colin Percival [Tue, 19 May 2020 01:39:37 +0000 (01:39 +0000)]
MFC r361097:
Send Lid status notification via devd from acpi_lid_status_update.
Some laptops don't send ACPI "lid status changed" notifications upon
opening the lid if the system was currently suspended. In r358219
this was partially fixed, updating the "lid_status" variable upon
resume even if there is no "status changed" notification from ACPI.
Unfortunately the fix in r358219 did not include notifying userland
via devd; this causes problems on systems using upowerd (e.g. KDE),
since upowerd remembers the most recent devd notification about the
lid status rather than querying the sysctl to get the current status.
This showed up as two symptoms when KDE's "When laptop lid closed: Sleep"
option is set:
1. 50% of the time, closing the lid would not trigger S3 sleep.
2. 50% of the time, plugging/unplugging AC power would trigger S3 sleep.
MFC r351003:
Fix build with DRM and INVARIANTS enabled.
The DRM drivers use the lockdep assertion macros with spinlock_t locks
which are backed by mutexes, not sx locks. This causes compile
failures since you can't use sx_assert with a mutex. Instead, change
the lockdep macros to use lock_class methods. This works by assuming
that each LinuxKPI locking primitive embeds a FreeBSD lock as its
first structure and uses a cast to get to the underlying 'struct
lock_object'.
MFC r360070:
Add missing feature descriptions to hci_features2str().
The list of possible features in hccontrol/features2str() is incomplete.
Refer to "Bluetooth Core Specification 5.2 Vol. 2 Part C. 3.3 Feature Mask Definition".
Peter Grehan [Sun, 17 May 2020 11:09:38 +0000 (11:09 +0000)]
MFC r361064
Hide host CPUID 0x15 TSC/Crystal ratio/freq info from guest
In recent Linux (5.3+) and OpenBSD (6.6+) kernels, and with hosts that
support CPUID 0x15, the local APIC frequency is determined directly
from the reported crystal clock to avoid calibration against the 8254
timer.
However, the local APIC frequency implemented by bhyve is 128MHz, where
most h/w systems report frequencies around 25MHz. This shows up on
OpenBSD guests as repeated keystrokes on the emulated PS2 keyboard
when using VNC, since the kernel's timers are now much shorter.
Fix by reporting all-zeroes for CPUID 0x15. This allows guests to fall
back to using the 8254 to calibrate the local APIC frequency.
Future work could be to compute values returned for 0x15 that would
match the host TSC and bhyve local APIC frequency, though all dependencies
on this would need to be examined (for example, Linux will start using
0x16 for some hosts).
Add new stats(7) man page and hook it up to the build.
This man page contains stat utilities that are available in
the base system. This is a better approach than looking them
up via "apropos stat" or similar commands.
Thanks to Daniel Ebdrup Jensen for writing the original page
and incorporating the feedback given.
Submitted by: Daniel Ebdrup Jensen
Reviewed by: 0mp, allanjude, brueffer, bcr
Approved by: bcr
Relnotes: yes (new stats(7) man page)
Differential Revision: https://reviews.freebsd.org/D24417
Alan Somers [Sun, 17 May 2020 02:35:50 +0000 (02:35 +0000)]
MFC r360339, r360567
r360339:
mac_bsdextended: ATFify the tests
The new tests have more complete setup and cleanup, are more granular, and
correctly annotate expected failures and skipped tests. A follow-up commit
will resolve a conflict with the fusefs tests (bug 244229).
r360567:
Resolve conflict between the fusefs(5) and mac_bsdextended(4) tests
mac_bsdextended(4), when enabled, causes ordinary operations to send many
more VOP_GETATTRs to file system. The fusefs tests expectations aren't
written with those in mind. Optionally expecting them would greatly
obfuscate the fusefs tests. Worse, certain fusefs functionality (like
attribute caching) would be impossible to test if the tests couldn't expect
an exact number of GETATTR operations.
This commit resolves that conflict by making two changes:
1. The fusefs tests will now check for mac_bsdextended, and skip if it's
enabled.
2. The mac_bsdextended tests will now check whether the module is enabled, not
merely loaded. If it's loaded but disabled, the tests will automatically
enable it for the duration of the tests.
With these changes, a CI system can achieve best coverage by loading both
fusefs and mac_bsdextended at boot, and setting
security.mac.bsdextended.enabled=0
Eric Joyner [Thu, 14 May 2020 19:56:54 +0000 (19:56 +0000)]
MFC r360398 and r360902
These commits introduce a new iflib device-dependent method and
implements that method in the Intel ethernet network drivers;
this method tells iflib if the network interface needs to be
restarted when certain events happen.
This fixes several issues that occur when VLANs are registered
or unregistered with the network interface.
Dimitry Andric [Thu, 14 May 2020 19:09:13 +0000 (19:09 +0000)]
MFC r360915:
Use -fno-asynchronous-unwind-tables to compile lib/csu
Summary:
In r209294 kib added -fno-asynchronous-unwind-tables to the compile
flags for the GNU C startup components. This was done to work around a
BFD ld assertion, "no .eh_frame_hdr table will be created", which is
produced because of the layout of the startup objects.
Add the same flag to lib/csu too, for the same reason. And similarly to
r209294, also add -fno-omit-frame-pointer.
This is primarily meant to quickly MFC to stable/11, so it can end up in
the 11.4 release, as a fix for https://bugs.freebsd.org/246322.
John Baldwin [Thu, 14 May 2020 18:59:34 +0000 (18:59 +0000)]
MFC 360710: Deprecate ubsec(4) for FreeBSD 13.0.
With the removal of in-tree consumers of DES, Triple DES, and
MD5-HMAC, the only algorithm this driver still supports is SHA1-HMAC.
This is not very useful as a standalone algorithm (IPsec AH-only with
SHA1 would be the only user).
This driver has also not been kept up to date with the original driver
in OpenBSD which supports a few more cards and AES-CBC on newer cards.
The newest card currently supported by this driver was released in
2005.
John Baldwin [Thu, 14 May 2020 18:49:43 +0000 (18:49 +0000)]
MFC 360399:
Update the cached MSI state when any MSI capability register is written.
bhyve uses cached copies of the MSI capability registers to generate
MSI interrupts for device models. Previously, these cached fields
were only set when the MSI capability control register was updated.
The Linux kernel recently adopted a change to deal with races in MSI
interrupt delivery that writes to the MSI capability address and data
registers to alter the destination of MSI interrupts without writing
to the MSI capability control register. bhyve was not updating its
cached registers for these writes and continued to send interrupts
with the old data value to the old address. Fix this by recomputing
the cached values for every write to any MSI capability register.
John Baldwin [Thu, 14 May 2020 17:56:43 +0000 (17:56 +0000)]
MFC 360171,360179,360285,360388: Don't dereference various user pointers.
360171:
Don't access a user buffer directly from the kernel.
The handle_string callback for the ENCIOC_SETSTRING ioctl was passing
a user pointer to memcpy(). Fix by using copyin() instead.
For ENCIOC_GETSTRING ioctls, the handler was storing the user pointer
in a CCB's data_ptr field where it was indirected by other code. Fix
this by allocating a temporary buffer (which ENCIOC_SETSTRING already
did) and copying the result out to the user buffer after the CCB has
been processed.
360179:
Don't pass a user buffer pointer as the data pointer in a CCB.
Allocate a temporary buffer in the kernel to serve as the CCB data
pointer for a pass-through transaction and use copyin/copyout to
shuffle the data to/from the user buffer.
360285:
Don't indirect user pointers directly in two 802.11s ioctls.
IEEE80211_MESH_RTCMD_ADD was invoking memcmp() to validate the
supplied address directly on the user pointer rather than first doing
a copyin() and validating the copied value.
IEEE80211_MESH_RTCMD_DELETE was passing the user pointer directly to
ieee80211_mesh_rt_del() rather than copying the user buffer into a
temporary kernel buffer.
360388:
Don't run strcmp() against strings stored in user memory.
Instead, copy the strings into a temporary buffer on the stack and
run strcmp on the copies.
MFC r360491: Introduce a lower bound of 2 MSS to TCP Cubic.
Running TCP Cubic together with ECN could end up reducing cwnd down to 1 byte, if the
receiver continously sets the ECE flag, resulting in very poor transmission speeds.
In line with RFC6582 App. B, a lower bound of 2 MSS is introduced, as well as a typecast
to prevent any potential integer overflows during intermediate calculation steps of the
adjusted cwnd.
Reported by: Cheng Cui
Reviewed by: tuexen (mentor)
Approved by: tuexen (mentor)
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D23353
John Baldwin [Wed, 13 May 2020 21:16:02 +0000 (21:16 +0000)]
MFC 359047,359054: Deprecate procfs-based process debugging.
359047:
Mark procfs-based process debugging as deprecated for FreeBSD 13.
Attempting to use ioctls on /proc/<pid>/mem to control a process will
trigger warnings on the console. The <sys/pioctl.h> include file will
also now emit a compile-time warning when used from userland.
359054:
Fix the workaround to ignore the #warning for GCC.
clang and gcc use different warning flags for #warning preprocessor
directives.
For both 12 and 11, adjust the GCC warning flags to only be added in
4.7 and later since 4.2.1 does not support -Wno-cpp. For 11, add the
needed warning suppression to procctl's build. procctl was removed in
12.0.
Justin Hibbits [Tue, 12 May 2020 21:51:56 +0000 (21:51 +0000)]
MFC r343969 by nwhitehorn:
Performance improvements for octe(4):
- Distribute RX load across multiple cores, if present. This reverts
r217212, which is no longer relevant (I think because of the newer
SDK).
- Use newer APIs for pinning taskqueue entries to specific cores.
- Deepen RX buffers.
This more than doubles NAT forwarding throughput on an EdgeRouter Lite from,
with typical packet mixture, 90 Mbps to over 200 Mbps. The result matches
forwarding throughput in Linux without the UBNT hardware offload on the same
hardware, and thus likely reflects hardware limits.
Marcin Wojtas [Tue, 12 May 2020 18:44:41 +0000 (18:44 +0000)]
MFC r360777: Optimize ENA Rx refill for low memory conditions
Sometimes, especially when there is not much memory in the system left,
allocating mbuf jumbo clusters (like 9KB or 16KB) can take a lot of time
and it is not guaranteed that it'll succeed. In that situation, the
fallback will work, but if the refill needs to take a place for a lot of
descriptors at once, the time spent in m_getjcl looking for memory can
cause system unresponsiveness due to high priority of the Rx task. This
can also lead to driver reset, because Tx cleanup routine is being
blocked and timer service could detect that Tx packets aren't cleaned
up. The reset routine can further create another unresponsiveness - Rx
rings are being refilled there, so m_getjcl will again burn the CPU.
This was causing NVMe driver timeouts and resets, because network driver
is having higher priority.
Instead of 16KB jumbo clusters for the Rx buffers, 9KB clusters are
enough - ENA MTU is being set to 9K anyway, so it's very unlikely that
more space than 9KB will be needed.
However, 9KB jumbo clusters can still cause issues, so by default the
page size mbuf cluster will be used for the Rx descriptors. This can have a
small (~2%) impact on the throughput of the device, so to restore
original behavior, one must change sysctl "hw.ena.enable_9k_mbufs" to
"1" in "/boot/loader.conf" file.
As a part of this patch (important fix), the version of the driver
was updated to v2.1.2.
Ed Maste [Tue, 12 May 2020 16:52:08 +0000 (16:52 +0000)]
MFC r360968: libalias: fix potential memory disclosure from ftp module
admbugs: 956
Submitted by: markj
Reported by: Vishnu Dev TJ working with Trend Micro Zero Day Initiative
Approved by: so
Security: FreeBSD-SA-20:13.libalias
Security: CVE-2020-7455
Security: ZDI-CAN-10849
Ed Maste [Tue, 12 May 2020 16:49:04 +0000 (16:49 +0000)]
MFC r360967: libalias: validate packet lengths before accessing headers
admbugs: 956
Submitted by: ae
Reported by: Lucas Leong (@_wmliang_) of Trend Micro Zero Day Initiative
Reported by: Vishnu working with Trend Micro Zero Day Initiative
Approved by: so
Security: FreeBSD-SA-20:12.libalias
Security: CVE-2020-7454
Security: ZDI-CAN-10624, ZDI-CAN-10850
John Baldwin [Mon, 11 May 2020 21:24:22 +0000 (21:24 +0000)]
MFC 357643: Tidy the _set_tp function for RISC-V.
- Use a constant for the offset instead of a magic number.
- Use an addi instruction that writes to tp directly instead of a mv
that writes the result of a compiler-generated addi.
John Baldwin [Mon, 11 May 2020 20:58:27 +0000 (20:58 +0000)]
MFC 357632: Use the context created in makectx() for stack traces.
Always use the kdb_thr_ctx() for db_trace_thread() as on other
architectures. Initialize pcb_ra to be the sepc from the saved
trapframe rather than the saved ra to avoid skipping a frame.
John Baldwin [Mon, 11 May 2020 20:41:27 +0000 (20:41 +0000)]
MFC 357630: Fix DDB to unwind across exception frames.
While here, add an extra line of information for exceptions and
interrupts and compress the per-frame line down to one line to match
other architectures.