Chuck Tuffli [Mon, 24 Aug 2020 01:51:21 +0000 (01:51 +0000)]
bhyve: NVMe queue create must init head/tail
The NVMe emulation code did not explicitly initialize queue head and
tail pointers on queue creation. As these pointers are part of
calloc()'ed memory, this only becomes a problem if the queues are
deleted and then recreated.
This error can manifest with messages about completions not matching a
command.
Chuck Tuffli [Mon, 24 Aug 2020 01:51:17 +0000 (01:51 +0000)]
bhyve: NVMe set nominal health values
Some operating systems believe bhyve's emulated NVMe drive is failing
based on certain values in the SMART / Health Information log page being
zero. Fix is to set the reported temperature and available spare values
to reasonable defaults.
Kyle Evans [Sun, 23 Aug 2020 23:56:57 +0000 (23:56 +0000)]
caroot: switch to using echo+shell glob to enumerate certs
This solves an issue on stable/12 that causes certs to not get installed.
ls is apparently not in PATH during installworld, so TRUSTED_CERTS ends up
blank and nothing gets installed. We don't really require anything
ls-specific, though, so let's just simplify it.
Bjoern A. Zeeb [Sun, 23 Aug 2020 21:37:20 +0000 (21:37 +0000)]
net80211: set_vht_extchan() reverse order to always return best
In set_vht_extchan() the checks are performed in the order of VHT20/40/80.
That means if a channel has a lower and higheer VHT flag set we would
return the lower first.
We normally do not set more than one VHT flag so this change is supposed
to be a NOP but follows the logical thinking order of returning the best
first. Also we nowhere assert a single VHT flag so make sure we'll not
be stuck with VHT20 when we could do more.
While here add the debugging printfs for VHT160 and VHT80P80 which still
need doing once we deal with a driver at that level.
Since LA57 was moved to the main SDM document with revision 072, it
seems that we should have a support for it, and silicons are coming.
This patch makes pmap support both LA48 and LA57 hardware. The
selection of page table level is done at startup, kernel always
receives control from loader with 4-level paging. It is not clear how
UEFI spec would adapt LA57, for instance it could hand out control in
LA57 mode sometimes.
To switch from LA48 to LA57 requires turning off long mode, requesting
LA57 in CR4, then re-entering long mode. This is somewhat delicate
and done in pmap_bootstrap_la57(). AP startup in LA57 mode is much
easier, we only need to toggle a bit in CR4 and load right value in CR3.
I decided to not change kernel map for now. Single PML5 entry is
created that points to the existing kernel_pml4 (KML4Phys) page, and a
pml5 entry to create our recursive mapping for vtopte()/vtopde().
This decision is motivated by the fact that we cannot overcommit for
KVA, so large space there is unusable until machines start providing
wider physical memory addressing. Another reason is that I do not
want to break our fragile autotuning, so the KVA expansion is not
included into this first step. Nice side effect is that minidumps are
compatible.
On the other hand, (very) large address space is definitely
immediately useful for some userspace applications.
For userspace, numbering of pte entries (or page table pages) is
always done for 5-level structures even if we operate in 4-level mode.
The pmap_is_la57() function is added to report the mode of the
specified pmap, this is done not to allow simultaneous 4-/5-levels
(which is not allowed by hw), but to accomodate for EPT which has
separate level control and in principle might not allow 5-leve EPT
despite x86 paging supports it. Anyway, it does not seems critical to
have 5-level EPT support now.
Tested by: pho (LA48 hardware)
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D25273
Pass pointers to info parsed from notes, to brandinfo->header_supported filter.
Currently, we parse notes for the values of ELF FreeBSD feature flags
and osrel. Knowing these values, or knowing that image does not carry
the note if pointers are NULL, is useful to decide which ABI variant
(brand) we want to activate for the image.
Right now this is only a plumbing change
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D25273
Mateusz Guzik [Sun, 23 Aug 2020 11:06:59 +0000 (11:06 +0000)]
libc: hide alphasort_thunk behind I_AM_SCANDIR_B
Should unbreak gcc build as reported by tinderbox:
lib/libc/gen/scandir.c:59:12: warning: 'alphasort_thunk' declared 'static' but never defined [-Wunused-function]
Attempt of adding assertions that pgrp->pg_jobc counters do not
underflow in r361967, reverted in r362910, points out bugs in the
handling of job control. Peter Holm was able to narrow down the
problem to very easy reproduction with timeout(1) which uses reaping.
The following list of problems with calculation of pg_jobs which
directs SIGHUP/SIGCONT delivery for orphaned process group was
identified:
- Re-calculation of the orphaned status for children of exiting parent
was wrong, but mostly unnoticed when all children were reparented to
init(8). When child can be reparented to a different process which
could affect the child' job control state, it was not properly
accounted for in pg_jobc.
- Lockless check for exiting process' parent process group is racy
because nothing prevents the parent from changing its group
membership.
- Exited process is left in the process group, until waited. This
affects other calculations of pg_jobc.
Split handling of job control status on process changing its process
group, and process exiting. Calculate increments and decrements for
pg_jobs by exact checking the orphanage instead of assuming process
group membership for children and parent. Move the call to killjobc()
later under the proctree_lock. Mark exiting process in killjobc()
with a new flag P_TREE_GRPEXITED and skip it for all pg_jobc
calculations after the flag is set.
Add checker that independently recalculates pg_jobc value and compares
it with the memoized process group state. This is enabled under INVARIANTS.
Rename rt_flags to rte_flags && reduce number of rt_nhop accesses.
No functional changes.
Most of the routing flags are stored in the netxtop instead of rtentry.
Rename rt->rt_flags to rt->rte_flags to simplify reading/modifying code
checking routing flags.
In the new multipath code, rt->rt_nhop may actually point to nexthop group
instead of nhop. To ease transition, reduce the amount of rt->rt_nhop->...
accesses.
Warner Losh [Sat, 22 Aug 2020 19:02:15 +0000 (19:02 +0000)]
Retire obsolete sysctl hw.bus.devctl_disable
hw.bus.devctl_disable has tagged been obsolete for a decade. Remove it. Also
remove some long obsolete comments. This was done and backed out once in 2014,
but we've had enough releases with the 'new' method of setting queue length that
we can just remove this sysctl now (stable/11, stable/12 and current all don't
reference it).
Add test for checking RTF_HOST and RTAX_NETMASK inconsistency.
RTF_HOST indicates whether route is a host route
(netmask is empty or /{32,128}).
Check that if netmask is empty and host route is not specified, kernel
returns an error.
Ed Maste [Sat, 22 Aug 2020 14:39:14 +0000 (14:39 +0000)]
acpi_iort: fix mapping end calculation
According to the ARM Design Document "IO Remapping Table Platform"
(DEN 0049D), the "Number of IDs" field of the ID mapping format means
"The number of IDs in the range minus one".
Submitted by: Greg V <greg@unrelenting.technology>
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D25179
Dimitry Andric [Sat, 22 Aug 2020 12:05:11 +0000 (12:05 +0000)]
Add a few new source files to libc++, in particular the implementation
part of std::random_shuffle. These were split off at some point by
upstream, but I forgot to add them to our Makefile.
This should allow some ports which use std::random_shuffle to correctly
link again.
Instantiate Error in Target::GetEntryPointAddress() only when
necessary
When Target::GetEntryPointAddress() calls
exe_module->GetObjectFile()->GetEntryPointAddress(), and the returned
entry_addr is valid, it can immediately be returned.
However, just before that, an llvm::Error value has been setup, but
in this case it is not consumed before returning, like is done
further below in the function.
In https://bugs.freebsd.org/248745 we got a bug report for this,
where a very simple test case aborts and dumps core:
* thread #1, name = 'testcase', stop reason = breakpoint 1.1
frame #0: 0x00000000002018d4 testcase`main(argc=1, argv=0x00007fffffffea18) at testcase.c:3:5
1 int main(int argc, char *argv[])
2 {
-> 3 return 0;
4 }
(lldb) p argc
Program aborted due to an unhandled Error:
Error value was Success. (Note: Success values must still be checked prior to being destroyed).
Thread 1 received signal SIGABRT, Aborted.
thr_kill () at thr_kill.S:3
3 thr_kill.S: No such file or directory.
(gdb) bt
#0 thr_kill () at thr_kill.S:3
#1 0x00000008049a0004 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2 0x0000000804916229 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
#3 0x000000000451b5f5 in fatalUncheckedError () at /usr/src/contrib/llvm-project/llvm/lib/Support/Error.cpp:112
#4 0x00000000019cf008 in GetEntryPointAddress () at /usr/src/contrib/llvm-project/llvm/include/llvm/Support/Error.h:267
#5 0x0000000001bccbd8 in ConstructorSetup () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:67
#6 0x0000000001bcd2c0 in ThreadPlanCallFunction () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:114
#7 0x00000000020076d4 in InferiorCallMmap () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/Utility/InferiorCallPOSIX.cpp:97
#8 0x0000000001f4be33 in DoAllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/FreeBSD/ProcessFreeBSD.cpp:604
#9 0x0000000001fe51b9 in AllocatePage () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:347
#10 0x0000000001fe5385 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:383
#11 0x0000000001974da2 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2301
#12 CanJIT () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2331
#13 0x0000000001a1bf3d in Evaluate () at /usr/src/contrib/llvm-project/lldb/source/Expression/UserExpression.cpp:190
#14 0x00000000019ce7a2 in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Target/Target.cpp:2372
#15 0x0000000001ad784c in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:414
#16 0x0000000001ad86ae in DoExecute () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:646
#17 0x0000000001a5e3ed in Execute () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandObject.cpp:1003
#18 0x0000000001a6c4a3 in HandleCommand () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:1762
#19 0x0000000001a6f98c in IOHandlerInputComplete () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2760
#20 0x0000000001a90b08 in Run () at /usr/src/contrib/llvm-project/lldb/source/Core/IOHandler.cpp:548
#21 0x00000000019a6c6a in ExecuteIOHandlers () at /usr/src/contrib/llvm-project/lldb/source/Core/Debugger.cpp:903
#22 0x0000000001a70337 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2946
#23 0x0000000001d9d812 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/API/SBDebugger.cpp:1169
#24 0x0000000001918be8 in MainLoop () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:675
#25 0x000000000191a114 in main () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:890
Fix the incorrect error catch by only instantiating an Error object
if it is necessary.
Mateusz Guzik [Sat, 22 Aug 2020 06:56:04 +0000 (06:56 +0000)]
vfs: add a work around for vp_crossmp bug to realpath
The actual bug is not yet addressed as it will get much easier after other
problems are addressed (most notably rename contract).
The only affected in-tree consumer is realpath. Everyone else happens to be
performing lookups within a mount point, having a side effect of ni_dvp being
set to mount point's root vnode in the worst case.
Rick Macklem [Sat, 22 Aug 2020 03:57:55 +0000 (03:57 +0000)]
Add TLS support to the kernel RPC.
An internet draft titled "Towards Remote Procedure Call Encryption By Default"
describes how TLS is to be used for Sun RPC, with NFS as an intended use case.
This patch adds client and server support for this to the kernel RPC,
using KERN_TLS and upcalls to daemons for the handshake, peer reset and
other non-application data record cases.
The upcalls to the daemons use three fields to uniquely identify the
TCP connection. They are the time.tv_sec, time.tv_usec of the connection
establshment, plus a 64bit sequence number. The time fields avoid problems
with re-use of the sequence number after a daemon restart.
For the server side, once a Null RPC with AUTH_TLS is received, kernel
reception on the socket is blocked and an upcall to the rpctlssd(8) daemon
is done to perform the TLS handshake. Upon completion, the completion
status of the handshake is stored in xp_tls as flag bits and the reply to
the Null RPC is sent.
For the client, if CLSET_TLS has been set, a new TCP connection will
send the Null RPC with AUTH_TLS to initiate the handshake. The client
kernel RPC code will then block kernel I/O on the socket and do an upcall
to the rpctlscd(8) daemon to perform the handshake.
If the upcall is successful, ct_rcvstate will be maintained to indicate
if/when an upcall is being done.
If non-application data records are received, the code does an upcall to
the appropriate daemon, which will do a SSL_read() of 0 length to handle
the record(s).
When the socket is being shut down, upcalls are done to the daemons, so
that they can perform SSL_shutdown() calls to perform the "peer reset".
The rpctlssd(8) and rpctlscd(8) daemons require a patched version of the
openssl library and, as such, will not be committed to head at this time.
Although the changes done by this patch are fairly numerous, there should
be no semantics change to the kernel RPC at this time.
A future commit to the NFS code will optionally enable use of TLS for NFS.
Bjoern A. Zeeb [Fri, 21 Aug 2020 22:31:45 +0000 (22:31 +0000)]
After the clang/llvm version 11 import LLD_VERSION is no longer used
upstream so Version.inc now only defines LLD_VERSION_STRING.
This breaks the WANT_LINKER_VERSION magic and might lead to us building
more than needed (e.g., for croos-tools).
Change the awk script to parse LLD_VERSION_STRING instead of LLD_VERSION,
which not only unbreaks the current situation but should also be backwards
compatible as dim points out.
PR: 248818
Reviewed by: emaste, dim (seems right and the way to go)
MFC after: 4 weeks
X-MFC before: 364284
Allow to dynamically grow the amount of fibs in each vnet.
This change alters current behavior. Currently, if one defines
ROUTETABLES > 1 in the kernel config, each vnet will be created
with the number of fibs defined in the kernel config.
After this commit vnets will be created with fibs=1.
Dynamic net.fibs is not compatible with net.add_addr_allfibs.
The plan is to deprecate the latter and make
net.add_addr_allfibs=0 default behaviour.
Eric van Gyzen [Fri, 21 Aug 2020 19:34:41 +0000 (19:34 +0000)]
ixgbe: fix impossible condition
Coverity flagged this condition: The condition
offset == 0 && offset == 65535
can never be true because offset cannot be equal
to two different values at the same time.
Andrew Gallatin [Fri, 21 Aug 2020 18:31:57 +0000 (18:31 +0000)]
uma: record allocation failures due to zone limits
The zone limit mechanism was recently reworked, and
allocation failures due to limits being exceeded
were inadvertently no longer being recorded. This
would lead to, for example, mbuf allocation failures
not being indicated in netstat -m or vmstat -z
Coverity has identified the line in this change as "Potential integer
overflowing expression" due to the variable i declared as an int
and used in an expression with vm_paddr_t, a 64bit variable.
This change has very little effect as when this line is execute
nkpt is small and phys_addr is a the beginning of physical memory.
But there is no explicit protection that the above is true.
- Remove trailing whitespace
- Address igor and mandoc warnings
- Sort options
- Use macros consistently (e.g., Fl for flags, Dq for quoting, Bd for code
blocks)
- Add a history section
- Fix incorrect use of macros in various places
Brandon Bergren [Fri, 21 Aug 2020 03:31:01 +0000 (03:31 +0000)]
[PowerPC] Fix translation-related crashes during startup
After spending a lot of time trying to track down what was going on, I have
isolated the "black screen" failures when using boot1 to boot a G4 machine.
It turns out we were replacing the traps before installing the temporary
BAT entry for the bottom of physical memory. That meant that until the MMU
was bootstrapped, the cached translations were the only thing keeping us
from losing.
Throwing boot1 into the mix was affecting execution flow enough to cause us
to hit an uncached page and crash.
Fix this by properly setting up the initial BAT entry at the same time we
are replacing the OpenFirmware traps, so we can continue executing in
segment 0 until the rest of the DMAP has been set up.
A second thing discovered while researching this is that we were entering a
BAT region for segment 16. It turns out this range was a) considered part
of KVA, and b) has firmware mappings with varying attributes.
If we ever accessed an unmapped page in segment 16, it would cause a BAT
entry to be installed for the whole segment, which would bypass the
existing mappings until it was flushed out again.
Instead, translate the OFW memory attributes into VM memory attributes and
install the ranges into the kernel address space properly.
Gleb Smirnoff [Thu, 20 Aug 2020 20:31:47 +0000 (20:31 +0000)]
When we have a command returned by zfs_nextboot() that is longer
than command in the loader.conf, the latter needs to be nul terminated,
otherwise garbage trailer left from zfs_nextboot() will be passed to
parse_cmd() together with loader.conf command.
While here, reset cmd to empty string if read() returns error.
Mark Johnston [Thu, 20 Aug 2020 19:28:19 +0000 (19:28 +0000)]
Enable creation of static userspace probes in incremental builds.
To define USDT probes, dtrace -G makes use of relocations for undefined
symbols: the target address is overwritten with NOPs and the location is
recorded in the DOF section of the output object file. To avoid link
errors, the original relocation is destroyed. However, this means that
the same input object file cannot be processed multiple times, as
happens during incremental rebuilds. Instead, only set the relocation
type to NONE, so that all information required to reconstruct USDT
probes is preserved.
Reported by: bdrewery
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Mark Johnston [Thu, 20 Aug 2020 19:27:49 +0000 (19:27 +0000)]
Remove non-FreeBSD ifdefs from dt_link.c.
This file is too complicated as it is and has diverged a fair bit from
illumos due to toolchain differences, so just drop unused code
(including SPARC support).
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Dimitry Andric [Thu, 20 Aug 2020 18:50:46 +0000 (18:50 +0000)]
Bump kldxref's MAXSEGS to 16, to stop complaints about the kernel
supposedly having too many segments, when lld 11 links it. Such kernels
should load just fine.
Note that we may still do some tweaking of our kernel linker scripts, to
lower the number of segments, although the exact benefit is not entirely
clear.
The AMD's Ryzen 3 3200g XHCI controllers apparently need the evaluate
control endpoint context command, but we don't need to issue this
command when the bMaxPacketSize is received after the read of the USB
device descriptor, because this part should be handled automatically.
Warner Losh [Thu, 20 Aug 2020 17:35:47 +0000 (17:35 +0000)]
Remove the long obsolete ufm driver.
It was a driver for a USB FM tuner that was available in the market in 2002. I
wrote the driver in 2003. I've not used it since 2005 or so, so it's time to
retire this driver. No userland code ever interfaced to the special device it
created. There's no user base: the last bug I received on this driver was in
2004.
Warner Losh [Thu, 20 Aug 2020 16:52:48 +0000 (16:52 +0000)]
Use names suggested by kib@ in review D25969, move call for unmount to not call
with vnode locked, use NOWAIT alloc and only report when we don't overflow.
These changes were accidentally omitted from r364402, except for the not
reporting on overflow. They were lumped in with a debugging commit in my tree
that I omitted w/o realizing this.
Other issues from the review are pending some other changes I need to do first.
Emmanuel Vadot [Thu, 20 Aug 2020 12:50:49 +0000 (12:50 +0000)]
libsa: smbios: Parse the chassis type and export it as smbios.chassis.type
It can be useful to know what type of machine we are running on for desktop
related thing.
It also allow us to support all the DMI variable that linux driver can fetch.
MFC after: 1 week
Sponsored by: Sponsored-by: The FreeBSD Foundation
Mateusz Guzik [Thu, 20 Aug 2020 10:06:50 +0000 (10:06 +0000)]
cache: don't use cache_purge_negative when renaming
It avoidably scans (and evicts) unrelated entries. Instead take
advantage of passed componentname and perform a hash lookup
for the exact one.
Sample data from buildworld probed on cache_purge_negative extended
to count both scanned and evicted entries on each call are below.
At most it has to evict 1.
Pedro F. Giffuni [Thu, 20 Aug 2020 05:08:49 +0000 (05:08 +0000)]
extfs: remove redundant little endian conversion.
The XTIME_TO_NSEC macro already calls the htole32(), so there is no need
to call it twice. This code does nothing on LE platforms and affects only
nanosecond and birthtime fields so it's difficult to notice on regular use.
Rick Macklem [Thu, 20 Aug 2020 03:53:18 +0000 (03:53 +0000)]
Add MSG_TLSAPPDATA to lib/libsysdecode/mktables.
I have no idea what this does (and until now that it even existed), but
apparently it needs this entry changed for the MSG_TLSAPPDATA, since
it is kernel only.
Alan Somers [Thu, 20 Aug 2020 01:31:21 +0000 (01:31 +0000)]
zfs: fix EIO accessing dataset after resuming interrupted receive
ZFS unmounts a dataset while receiving into it and remounts it afterwards.
But if ZFS is resuming an incomplete receive, it screws up and ends up with
a dataset that is mounted, but returns EIO for every access. This commit
fixes that condition.
While the vulnerable code also exists in OpenZFS, the problem is not
reproducible there. Apparently OpenZFS doesn't unmount the destination
dataset during receive, like FreeBSD does.
Mark Johnston [Thu, 20 Aug 2020 00:52:53 +0000 (00:52 +0000)]
Use pmap_mapbios() to map ACPI tables on amd64 and i386.
The ACPI table-mapping code used pmap_kenter_temporary() to create
mappings, which in turn uses the fixed-size crashdump map. Moreover,
the code was not verifying that the table fits in this map, so when
mapping large tables we could clobber adjacent mappings. This use of
pmap_kenter_temporary() appears to predate support in pmap_mapbios() for
creating early mappings, but that restriction no longer applies.
PR: 248746
Reviewed by: kib, mav
Tested by: gallatin, Curtis Villamizar <curtis@ipv6.occnc.com>
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26125
Rick Macklem [Wed, 19 Aug 2020 23:42:33 +0000 (23:42 +0000)]
Add the MSG_TLSAPPDATA flag to indicate "return ENXIO" for non-application TLS
data records.
The kernel RPC cannot process non-application data records when
using TLS. It must to an upcall to a userspace daemon that will
call SSL_read() to process them.
This patch adds a new flag called MSG_TLSAPPDATA that the kernel
RPC can use to tell sorecieve() to return ENXIO instead of a non-application
data record, when that is what is at the top of the receive queue.
I put the code in #ifdef KERN_TLS/#endif, although it will build without
that, so that it is recognized as only useful when KERN_TLS is enabled.
The alternative to doing this is to have the kernel RPC re-queue the
non-application data message after receiving it, but that seems more
complicated and might introduce message ordering issues when there
are multiple non-application data records one after another.
I do not know what, if any, changes will be required to support TLS1.3.
Andrew Gallatin [Wed, 19 Aug 2020 17:59:06 +0000 (17:59 +0000)]
TCP: remove special treatment for hardware (ifnet) TLS
Remove most special treatment for ifnet TLS in the TCP stack, except
for code to avoid mixing handshakes and bulk data.
This code made heroic efforts to send down entire TLS records to
NICs. It was added to improve the PCIe bus efficiency of older TLS
offload NICs which did not keep state per-session, and so would need
to re-DMA the first part(s) of a TLS record if a TLS record was sent
in multiple TCP packets or TSOs. Newer TLS offload NICs do not need
this feature.
At Netflix, we've run extensive QoE tests which show that this feature
reduces client quality metrics, presumably because the effort to send
TLS records atomically causes the server to both wait too long to send
data (leading to buffers running dry), and to send too much data at
once (leading to packet loss).
Warner Losh [Wed, 19 Aug 2020 17:10:09 +0000 (17:10 +0000)]
Document the VFS FS events
MOUNT notifies when a filesystem is mounted
REMOUNT notifies when a filesystem is mounted again
UNMOUNT notifies when a filesystem is unmounted
These events are asynchronous to the actual state of the event (though the data
is recorded at a time when it is stable). The mount event is reported after the
filesystem is mounted. However, in the interim it may be unmounted by another
agent. Likewise, umount is called just before the mountpoint is finished tearing
down. It may be remounted (or maybe if the process scheduling is wonky and devd
gets to run before the last few steps are complete).