- Mention option's arguments in the list of options (so that now we mention
"-N system" instead of just "-N").
- Stylize signals and other constants like O_APPEND with Dv.
- Sort options.
- Change indentation width for readability.
- Fix a couple of typos.
- Sort symbols list.
- Use Sy instead of Cm for symbols. They are not command modifiers.
- Use Ex -std in the EXIT STATUS section for consistency with other manual
pages.
- Use Ql instead of Dq Li for inline code examples as Li has recently been
deprecated by mdoc.
- Update synopsis to present all available arguments.
- Consistently call the argument specifying an arbitrary directory a
"directory".
- Do not put macros into -width argument to Bl. They do not expand there.
- Stylize command modifiers like "daily" with Cm instead of Pa. While
technically periodic(8) operates on directories with such names, it is
confusing from the perspective of the manual page reader as Pa and Ar are
stylized the same way. Also, I cannot recall a single manual page where
Pa would be used to describe the syntax of command-line arguments.
Michael Tuexen [Wed, 1 Jul 2020 23:47:51 +0000 (23:47 +0000)]
MFC r349893:
This commit updates rack to what is basically being used at NF as
well as sets in some of the groundwork for committing BBR. The
hpts system is updated as well as some other needed utilities
for the entrance of BBR. This is actually part 1 of 3 more
needed commits which will finally complete with BBRv1 being
added as a new tcp stack.
Merge conflics were manually resolved.
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D20834
Michael Tuexen [Wed, 1 Jul 2020 22:22:26 +0000 (22:22 +0000)]
MFC r356663:
Fix race when accepting TCP connections.
When expanding a SYN-cache entry to a socket/inp a two step approach was
taken:
1) The local address was filled in, then the inp was added to the hash
table.
2) The remote address was filled in and the inp was relocated in the
hash table.
Before the epoch changes, a write lock was held when this happens and
the code looking up entries was holding a corresponding read lock.
Since the read lock is gone away after the introduction of the
epochs, the half populated inp was found during lookup.
This resulted in processing TCP segments in the context of the wrong
TCP connection.
This patch changes the above procedure in a way that the inp is fully
populated before inserted into the hash table.
Thanks to Paul <devgs@ukr.net> for reporting the issue on the net@
mailing list and for testing the patch!
Reviewed by: rrs@
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D22971
Michael Tuexen [Wed, 1 Jul 2020 21:52:14 +0000 (21:52 +0000)]
MFC r356270:
Improve input validation of the spp_pathmtu field in the
SCTP_PEER_ADDR_PARAMS socket option. The code in the stack assumes
sane values for the MTU.
This issue was found by running an instance of syzkaller.
MFC r356271:
Remove empty line which was added in previous commit by accident.
Michael Tuexen [Wed, 1 Jul 2020 20:45:26 +0000 (20:45 +0000)]
MFC r355268:
Add a description for the TCP sysctl variable rfc6675_pipe.
It was introduced by r290122, but no documentation was provided.
This is taken from https://reviews.freebsd.org/D21798, since it
is not related to the feature added there.
Michael Tuexen [Wed, 1 Jul 2020 20:41:23 +0000 (20:41 +0000)]
MFC r355266:
In order for the TCP Handshake to support ECN++, and further ECN-related
improvements, the ECN bits need to be exposed to the TCP SYNcache.
This change is a minimal modification to the function headers, without any
functional change intended.
Michael Tuexen [Wed, 1 Jul 2020 20:37:06 +0000 (20:37 +0000)]
MFC r355264:
Update the hostcache also for PTB messages received for SCTP/IPv6.
The corresponding code for SCTP/IPv4 was introduced in
https://svnweb.freebsd.org/base?view=revision&revision=317597
Submitted by: Julius Flohr
Differential Revision: https://reviews.freebsd.org/D22605
Michael Tuexen [Wed, 1 Jul 2020 20:32:51 +0000 (20:32 +0000)]
MFC r355172:
Really ignore the SCTP association identifier on 1-to-1 style sockets
as requiresd by the socket API specification.
Thanks to Inaki Baz Castillo, who found this bug running the userland
stack with valgrind and reported the issue in
https://github.com/sctplab/usrsctp/issues/408
Michael Tuexen [Wed, 1 Jul 2020 20:26:35 +0000 (20:26 +0000)]
MFC r355135: Plug two mbuf leaks during INIT-ACK handling.
One leak happens when there is not enough memory to allocate the
the resources for streams. The other leak happens if the are
unknown parameters in the received INIT-ACK chunk which require
reporting and the INIT-ACK requires sending an ABORT due to illegal
parameter combinations.
Hopefully this fixes
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=19083
Michael Tuexen [Wed, 1 Jul 2020 19:50:03 +0000 (19:50 +0000)]
MFC r354773: Improve TCP CUBIC specific after idle reaction.
The adjustments are inspired by the Linux stack, which has had a
functionally equivalent implementation for more than a decade now.
MFC r356224: Add curly braces missed in above change
Submitted by: rscheff
Reviewed by: Cheng Cui
Differential Revision: https://reviews.freebsd.org/D18982
if_vtnet: let vtnet_rx_vq_intr() and vtnet_rxq_tq_intr() share code
Since the two functions are similar, introduce a common function
(vtnet_rx_vq_process()) to share common code.
This also improves locking, by ensuring vrxs_rescheduled is accessed
under the RXQ lock, and taskqueue_enqueue() is not called under the
lock (therefore avoiding a spurious duplicate lock warning).
Michael Tuexen [Wed, 1 Jul 2020 19:23:15 +0000 (19:23 +0000)]
MFC r354772: Implement a TCP CUBIC-specific after idle reaction.
This patch addresses a very common case of frequent application stalls,
where TCP runs idle and looses the state of the network.
Submitted by: rscheff
Reviewed by: Cheng Cui
Differential Revision: https://reviews.freebsd.org/D18954
Michael Tuexen [Wed, 1 Jul 2020 18:03:38 +0000 (18:03 +0000)]
MFC r353480: Use event handler in SCTP
Use an event handler to notify the SCTP about IP address changes
instead of calling an SCTP specific function from the IP code.
This is a requirement of supporting SCTP as a kernel loadable module.
This patch was developed by markj@, I tweaked a bit the SCTP related
code.
MFC r362006: Prevent TCP Cubic to abruptly increase cwnd after app-limited
Cubic calculates the new cwnd based on absolute time
elapsed since the start of an epoch. A cubic epoch is
started on congestion events, or once the congestion
avoidance phase is started, after slow-start has
completed.
When a sender is application limited for an extended
amount of time and subsequently a larger volume of data
becomes ready for sending, Cubic recalculates cwnd
with a lingering cubic epoch. This recalculation
of the cwnd can induce a massive increase in cwnd,
causing a burst of data to be sent at line rate by
the sender.
This adds a flag to reset the cubic epoch once a
session transitions from app-limited to cwnd-limited
to prevent the above effect.
MFC r361987: Prevent TCP Cubic to abruptly increase cwnd after slow-start
Introducing flags to track the initial Wmax dragging and exit
from slow-start in TCP Cubic. This prevents sudden jumps in the
caluclated cwnd by cubic, especially when the flow is application
limited during slow start (cwnd can not grow as fast as expected).
The downside is that cubic may remain slightly longer in the
concave region before starting the convex region beyond Wmax again.
MFC r361806: Add O_DIRECT flag to DD for cache bypass
FreeBSD DD utility has not had support for the O_DIRECT flag, which
is useful to bypass local caching, e.g. for unconditionally issuing
NFS IO requests during testing.
In the current iflib_netmap_rxsync, there is nothing that prevents
kring->nr_hwtail to overrun kring->nr_hwcur during the descriptor
import phase. This may cause errors in netmap applications, such as:
em1 RX0: fail 'head < kring->nr_hwcur || head > kring->nr_hwtail'
h 795 c 795 t 282 rh 795 rc 795 rt 282 hc 282 ht 282
Dimitry Andric [Tue, 30 Jun 2020 15:53:52 +0000 (15:53 +0000)]
MFC r362623:
Fix copy/paste mistake in kvm_getswapinfo(3)
It seems this manpage was copied from kvm_getloadavg(3), but the
DIAGNOSTICS section was not updated completely. Update the section with
correct information about a return value of -1.
Don't compress and uuencode the "hexdump -C" output files. Just
save them with the $FreeBSD$ tag prepended. Changes to these
files are now a lot easier to comprehend, which makes diffs also
reviewable.
VHDX is the successor of Microsoft's VHD file format. It increases
maximum capacity of the virtual drive to 64TB and introduces features
to better handle power/system failures.
VHDX is the required format for 2nd generation Hyper-V VMs.
r362029:
Fix reading EDID on TVs/monitors without E-DCC support
Writing segment id to I2C device 0x30 only required if the segment is
non-zero. On the devices without E-DCC support writing to that address
fails and whole transaction then fails too. To avoid this do
not attempt write to the segment selection device unless required.
r362030:
Add mode selection to iMX6 IPU driver
- Configure ipu1_di0 tob e sourced from the VIDEO_PLL(PLL5) and hardcode
frequency to (455000000/3)Mhz. This value, further divided, can yield
frequencies close enough to support 1080p, 720p, 1024x768, and 640x480
modes. This is not ideal but it's an improvement comparing to the only
hardcoded 1024x768 mode.
- Fix memory leaks if attach method failed
- Print EDID when -v passed to the kernel
Cy Schubert [Sun, 28 Jun 2020 03:28:28 +0000 (03:28 +0000)]
MFC r362568:
MFV r362565:
Update 4.2.8p14 --> 4.2.8p15
Summary: Systems that use a CMAC algorithm in ntp.keys will not release
a bit of memory on each packet that uses a CMAC keyid, eventually causing
ntpd to run out of memory and fail. The CMAC cleanup from
https://bugs.ntp.org/3447, part of ntp-4.2.8p11, introduced a bug whereby
the CMAC data structure was no longer completely removed.
Kyle Evans [Sun, 28 Jun 2020 02:38:07 +0000 (02:38 +0000)]
MFC r361884: sed: attempt to learn about hex escapes (e.g. \x27)
Somewhat predictably, software often wants to use \x27/\x24 among others so
that they can decline worrying about ugly escaping, if said escaping is even
possible. Right now, this software is using these and getting the wrong
results, as we'll interpret those as x27 and x24 respectively. Some examples
of this, when an exp-run was ran, were science/octopus and misc/vifm.
Go ahead and process these at all times. We allow either one or two digits,
and the tests account for both. If extra digits are specified, e.g. \x2727,
then the third and fourth digits are interpreted literally as one might
expect.
Rick Macklem [Sun, 28 Jun 2020 01:29:14 +0000 (01:29 +0000)]
MFC: r361854
Fix mountd so that it will not lose SIGHUPs that indicate "reload exports".
Without this patch, if a SIGHUP is handled while the process is executing
get_exportlist(), that SIGHUP is essentially ignored because the got_sighup
variable is reset to 0 after get_exportlist().
This results in the exports file(s) not being reloaded until another SIGHUP
signal is sent to mountd.
This patch fixes this by resetting got_sighup to zero before the
get_exportlist() call while SIGHUP is blocked.
It also defines a delay time of 250msec before doing another exports reload
if there are RPC request(s) to process. This prevents repeated exports reloads
from delaying handling of RPC requests significantly.
- Sort options in the list & indent for readability.
- Pet linters
- Use "\(em" instead of "--"
- Remove Tn macros
- Use Ql instead of Dq Li
- Add arguments to the -M, -N, -p, and -u options in their descriptions.
- Use Sy instead of Li for field names. Li is deprecated, and Ql makes no
sense here.
- Replace a literal block with a list for the table of special names
related to FD.
- Use Ql instead of ``X''.
- Add a dot after etc.
- Reference fuser(1).
Kristof Provost [Fri, 26 Jun 2020 12:11:22 +0000 (12:11 +0000)]
MFC r360345:
bridge: epoch-ification
Run the bridge datapath under epoch, rather than under the
BRIDGE_LOCK().
We still take the BRIDGE_LOCK() whenever we insert or delete items in
the relevant lists, but we use epoch callbacks to free items so that
it's safe to iterate the lists without the BRIDGE_LOCK.
Tests on mercat5/6 shows this increases bridge throughput significantly,
from 3.7Mpps to 18.6Mpps.
Kristof Provost [Fri, 26 Jun 2020 10:08:57 +0000 (10:08 +0000)]
MFC r359641:
bridge: Change lists to CK_LIST as a peparation for epochification
Prepare the ground for a rework of the bridge locking approach. We will
use an epoch-based approach in the datapath and making it safe to
iterate over the interface, span and rtnode lists without holding the
BRIDGE_LOCK. Replace the relevant lists by their ConcurrencyKit
equivalents.
Kristof Provost [Fri, 26 Jun 2020 09:52:43 +0000 (09:52 +0000)]
MFC r358325:
bridge: Move locking defines into if_bridge.c
The locking defines for if_bridge used to live in if_bridgevar.h, but
they're only ever used by the bridge implementation itself (in
if_bridge.c). Moving them into the .c file.
Ryan Moeller [Fri, 26 Jun 2020 00:58:59 +0000 (00:58 +0000)]
MFC r362544:
libdevdctl: Force full match of "timestamp" field name
OpenZFS generates events with a "zio_timestamp" field, which gets mistaken for
"timestamp" by libdevdctl due to imprecise string matching. Then later it is
assumed a "timestamp" field exists when it doesn't and an exception is thrown.
Add a space to the search string so we match exactly "timestamp" rather than
anything with that as a suffix.
Approved by: mav (mentor)
Sponsored by: iXsystems, Inc.
Rick Macklem [Thu, 25 Jun 2020 02:45:49 +0000 (02:45 +0000)]
MFC: r361780, r361956
Fix mountd to handle getgrouplist() not returning groups[0] == groups[1].
Prior to r174547, getgrouplist(3) always returned a groups list with
element 0 and 1 set to the basegid argument, so long as ngroups was > 1.
Post-r174547 this is not the case. r328304 disabled the deduplication that
removed the duplicate, but the duplicate still does not occur unless the
group for a user in the password database is also entered in the group
database.
This patch fixes mountd so that it handles the case where a user specified
with the -maproot or -mapall exports option has a getgrouplist(3) groups
list where groups[0] != groups[1].
MFC r361347: With RFC3168 ECN, CWR SHOULD only be sent with new data
Overly conservative data receivers may ignore the CWR flag on other
packets, and keep ECE latched. This can result in continous reduction
of the congestion window, and very poor performance when ECN is
enabled.
Alexander Motin [Wed, 24 Jun 2020 13:49:30 +0000 (13:49 +0000)]
MFC r362337: Make polled request timeout less invasive.
Instead of panic after one second of polling, make the normal timeout
handler to activate, reset the controller and abort the outstanding
requests. If all of it won't happen within 10 seconds then something
in the driver is likely stuck bad and panic is the only way out.
In particular this fixed device hot unplug during execution of those
polled commands, allowing clean device detach instead of panic.
In the receive interrupt routine, always call netmap_rx_irq().
The latter function will return != NM_IRQ_PASS if netmap is not
active on that specific receive queue, so that the driver can go
on with iflib_rxeof(). Note that netmap supports partial opening,
where only a subset of the RX or TX rings can be open in netmap mode.
Checking the IFCAP_NETMAP flag is not enough to make sure that the
queue is indeed in netmap mode.
Moreover, in case netmap_rx_irq() returns NM_IRQ_RESCHED, it means
that netmap expects the driver to call netmap_rx_irq() again as soon
as possible. Currently, this may happen when the device is attached
to a VALE switch.
The nm_register callback needs to call nm_set_native_flags()
or nm_clear_native_flags() once the device has been stopped.
However, in the current implementation this is not true,
as the device is stopped by vtnet_init_locked(). This causes
race conditions where the driver crashes as soon as it
dequeues netmap buffers assuming they are mbufs (or the other
way around).
To fix the issue, we extend vtnet_init_locked() with a second
argument that, if not zero, will set/clear the netmap flags.
This results in a huge simplification of the nm_register
callback itself.
Also, use netmap_reset() to check if a ring is going to be
re-initialized in netmap mode.