Bjoern A. Zeeb [Wed, 26 May 2021 17:51:24 +0000 (17:51 +0000)]
OFED: migrate LinuxKPI net_device/ifnet macros into ofed
The LinuxKPI net_device actually is an ifnet; in order to further
clean that up so we can extend "net_device" migrate the few macros
left into ofed and make sure the header is included in all files
which need access to the macros.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D30477
Bjoern A. Zeeb [Mon, 24 May 2021 17:54:16 +0000 (17:54 +0000)]
LinuxKPI: byteorder.h
Add a few more le<n>_{tp,add}_cpu*() #defines/functions found in
wireless drivers. While here fill most of the combinatorics gaps
and also add the remaining combinations [1].
Suggested by: emaste [1] (for one part)
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30418
Bjoern A. Zeeb [Mon, 24 May 2021 18:09:37 +0000 (18:09 +0000)]
LinuxKPI: add ether_addr_equal_unaligned()
Replace the implementation for ether_addr_equal() with
ether_addr_equal_unaligned() and add a define for ether_addr_equal()
pointing to the now ether_addr_equal_unaligned() implementation.
This way ether_addr_equal_unaligned() cannot be broken by accident [1].
Suggested by: emaste [1]
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30425
Bjoern A. Zeeb [Mon, 24 May 2021 18:14:37 +0000 (18:14 +0000)]
LinuxKPI: add irq_set_affinity_hint()
Add an implementation for irq_set_affinity_hint() to linux/interrupt.h
and include linux/hardirq.h for synchronize_irq() as needed by
wireless drivers.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30427
Bjoern A. Zeeb [Mon, 24 May 2021 18:17:30 +0000 (18:17 +0000)]
LinuxKPI: add linux/{ip,tcp,udp}.h
Add header files for struct and accessors for IPv4, UDP, and TCP.
Only parts of the fields of the structs have been seen while working
on wireless drivers. The remaining field names are filled up with
the FreeBSD field names for now. If you have insights into their
correct naming in Linux, feel free to adjust.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30428
Bjoern A. Zeeb [Mon, 24 May 2021 18:26:41 +0000 (18:26 +0000)]
LinuxKPI: change BUILD_BUG_ON()
BUILD_BUG_ON() can be used inside functions where the definition to
CTASSERT() (_Static_assert()) seems to not work.
Go back to an old-style CTASSERT() implementation but also add a
variable dclaration to avoid "unsued typedef" errors and dummy-use
the variable to avoid "unusued variable" errors. Given it is all
self-contained in a block and not used outside this should be
optimised away.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30431
Bjoern A. Zeeb [Mon, 24 May 2021 18:40:42 +0000 (18:40 +0000)]
LinuxKPI: add rcu_dereference_check()
Add a define for rcu_dereference_check() to rcu_dereference_protected()
which ignores the check argument. Our lockdep compat implementation
for use cases found in iwlwifi would return 1 anyway.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30436
Bjoern A. Zeeb [Mon, 24 May 2021 18:53:28 +0000 (18:53 +0000)]
LinuxKPI: extract stringify() in their own header file
Add linux/stringify.h as directly included by drivers. Remove the
definitions from compiler.h and include the new header in places
where the stringify macros are already used without linuxkpi.
Bjoern A. Zeeb [Wed, 31 Mar 2021 15:25:01 +0000 (15:25 +0000)]
LinuxKPI: treat firmware file names more lenient
A lot of firmware files have a "-" in the name. That "-" is a problem
when dealing with shell variables or loader (e.g., auto-loading .ko).
It may thus often be convenient to generate firmware kernel object files
with s/-/_/g in the name. In order to automatically find them from
drivers using LinuxKPI also substitue the '-' for a '_' like we do
for '/' and '.' already.
Reviewed by: hselasky, manu (ok)
Differential Revision: https://reviews.freebsd.org/D29514
Bjoern A. Zeeb [Tue, 30 Mar 2021 15:58:55 +0000 (15:58 +0000)]
mlx5: remove dependency on ifnet specifics of linux/netdevice.h
Rename the last remaining bits depending on ifnet from linux/netdevice.h
instead of using the compat macros. This helps clearing up
struct netdevice being struct ifnet from linux/netdevice.h.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky, kib
Differential Revision: https://reviews.freebsd.org/D29497
The two functions in linux/inetdevice.h are highly FreeBSD/ifnet
specific. This is a result of struct net_device being mapped to
struct ifnet.
The only known consumer of these functions are two files in the
ofed/infiniband code.
As a first step of cleaning up copy linux/inetdevice.h to
rdma/ib_addr_freebsd.h. (It stayed a separate file to preserve
copyright and license of the original file; otherwise it could be
merged into ib_addr.h where more EPOCH/vnet/.. are already used).
Slightly rename the function to not conflict with LinuxKPI
in the future.
Remove the three last, now unneeded includes of inetdevice.h and
zap linux/inetdevice.h to an empty header file with only the forward
include to netdevice.h remaining.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky, kib
Differential Revision: https://reviews.freebsd.org/D29434
Bjoern A. Zeeb [Fri, 26 Mar 2021 16:10:25 +0000 (16:10 +0000)]
cxgbe: remove unused linux headers
Remove unused #includes of LinuxKPI headers noticed while trying to
solve LinuxKPI struct net_device and related functions.
Neither netdevice.h nor inetdevice.h nor notifier.h seem to be needed.
This takes cxgbe(4) out of the picture of D29366.
Sponsored by: The FreeBSD Foundation
Reviewed by: np
Differential Revision: https://reviews.freebsd.org/D29432
Bjoern A. Zeeb [Fri, 26 Mar 2021 17:17:10 +0000 (17:17 +0000)]
qlnxr: remove netdevice.h
Remove unused #includes of a LinuxKPI header noticed while trying to
solve LinuxKPI struct net_device and related functions.
This takes qlnxr out of the picture of D29366.
Bjoern A. Zeeb [Sun, 21 Mar 2021 21:18:34 +0000 (21:18 +0000)]
LinuxKPI: netdevice notifier callback argument
Introduce struct netdev_notifier_info as a container to pass
net_device to the callback functions.
Adjust netdev_notifier_info_to_dev() to return the net_device field.
Add explicit casts from ifp to ni->dev even though currently
struct net_device is defined to struct ifnet. This is needed in
preparation for untangling this and improving the net_device compat
code.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D29365
Bjoern A. Zeeb [Tue, 23 Mar 2021 14:31:37 +0000 (14:31 +0000)]
qlnxr: remove duplicate defines
upper_32_bits() and lower_32_bits() are defined twice in this file.
With the extra conditinal removed on LinuxKPI in 3b1ecc9fa1b5
they are also included from there already. Use the LinuxKPI version
and remove the two local ones.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D29392
Bjoern A. Zeeb [Tue, 23 Mar 2021 14:24:49 +0000 (14:24 +0000)]
LinuxKPI: remove < 5.0 version support
We are not aware of any out-of-tree consumers anymore
which would need KPI support for before Linux version 5.
Update the two in-tree consumers to use the new KPI.
This allows us to remove the extra version check and
will also give access to {lower,upper}_32_bits() unconditionally.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky, rlibby, rstone
X-MFC: to 13 only
Differential Revision: https://reviews.freebsd.org/D29391
Bjoern A. Zeeb [Tue, 23 Mar 2021 17:13:15 +0000 (17:13 +0000)]
LinuxKPI: add pci_ids.h
brcm80211 include pci_ids.h directly while historically we were tracking
IDs in pci.h. Move the current set of IDs from pci.h to pci_ids.h and
while here add IDs for Realtek and Broadcom as well as a network class
as needed by their wireless drivers.
We still include pci_ids.h from pci.h so this should not change anything.
Bjoern A. Zeeb [Tue, 23 Mar 2021 16:37:35 +0000 (16:37 +0000)]
LinuxKPI: add more linux-specific errno
Add ERFKILL and EBADE found in iwlwifi and brcmfmac wireless drivers.
While here add a comment above the block of error numbers above 500 to
document expectations.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D29396
Bjoern A. Zeeb [Sun, 21 Mar 2021 21:07:45 +0000 (21:07 +0000)]
ofed/linuxkpi: use proper accessor function
In the notifier event callback function rather than casting directly
to the expected type use the proper accessor function as the mlx drivers
already do.
This is preparational work to allow us to improve the struct net_device
is struct ifnet compat code shortcut in the future.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D29364
Bjoern A. Zeeb [Thu, 18 Mar 2021 22:23:15 +0000 (22:23 +0000)]
linuxkpi: add ieee80211_node.h to headers to include before LIST_HEAD
ieee80211_node.h uses LIST_HEAD() which LinuxKPI redefines and this
can lead to problems (see comment there). Make sure the net80211
header file is handled correctly by adding it to the list of files
to include before re-defining the macro.
Also add header files needed as dependencies.
Sponsored by: The FreeBSD Foundation
Reviewed by: philip, hselasky
Differential Revision: https://reviews.freebsd.org/D29336
Bjoern A. Zeeb [Tue, 23 Mar 2021 15:08:46 +0000 (15:08 +0000)]
ifconfig: 80211, add line break after key info
Beauty correction for verbose mode or in case we print multiple key
information to not continue with the next options directly after
as we did so far, e.g.:
AES-CCM 2:128-bit
AES-CCM 3:128-bit powersavemode ...
Sponsored by: The FreeBSD Foundation
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D29393
Bjoern A. Zeeb [Thu, 18 Mar 2021 22:15:00 +0000 (22:15 +0000)]
net80211: prefix get_random_bytes() with net80211_
Both linux/random.h and net80211 have a function named
get_random_bytes(). With overlapping files included these collide.
Arguably the function could be renamed in linuxkpi but the generic
name should also not be used in net80211 so rename it there.
Sponsored by: The FreeBSD Foundation
Reviewed by: philip, adrian
Differential Revision: https://reviews.freebsd.org/D29335
lib80211: Start adding 11ac ETSI bits to regdomain.xml
This change currently (partially) duplicates AC1 freqbands as AC2
as they are not fully overlapping.
It then adds the 11ac netband to the "etsi" domain including
"indoor" and "dfs" flags, which we can deal with, as well as
appropriate (round down) maxpower values.
Comments are left for the actual frequency bands as we do use the
centerfreq for the first/last (chansep sized) channel in the
freqband and their "id" name, which can be confusing.
Reviewed by: philip, adrian
Differential Revision: https://reviews.freebsd.org/D25999
Factor out ieee80211_probereq_ie() and ieee80211_probereq_ie_len()
and make the length dynamic rather than static max. The latter is
needed as our current fixed length was longer than some "hw scan",
e.g. that of ath10k, will take. This way we can pass what we have.
Should this not be sufficient in the future we might have to deal
with filtering and much more error handling.
This also removes a duplicate calculation for ieee80211_ie_wpa [1].
c338cf2c6d5eacdee813191d5976aa531d450ee7 split up ieee80211_probereq_ie().
For HW scans we usually do not want to add a SSID to the IEs.
During that split we allocate memory based on the length which will
always include the length of the SSID and only later we reduced the
length but never updated the value passed back to the caller.
Split the SSID handling up and reduce the length before malloc().
This not only makes us not over-allocate in these situatoins but also
fixes the length returned to the caller and with that usually directly
passed to firmware.
Repoprted by: Martin Husemann <martin NetBSD.org> [1]
Sponsored by: Rubicon Communications, LLC ("Netgate")
Sponsored by: The FreeBSD Foundation (update for alloc)
Reviewed by: adrian, martin NetBSD.org (earlier version)
Reviewed by: philip
Differential Revision: https://reviews.freebsd.org/D26545
Differential Revision: https://reviews.freebsd.org/D30813
Bjoern A. Zeeb [Wed, 10 Mar 2021 22:17:07 +0000 (22:17 +0000)]
termios: add more speeds
A lot of small arm64 gadgets are using 1500000 as console speed.
While cu can perfectly deal with this some 3rd party software, e.g.,
comms/conserver-con add speeds based on B<n> being defined.
Having it defined here simplifies enhancing other software.
Obtained from: NetBSD sys/sys/termios.h 1.36
Reviewed by: philip (,okayed by imp)
Differential Revision: https://reviews.freebsd.org/D29209
Bjoern A. Zeeb [Tue, 23 Mar 2021 15:47:24 +0000 (15:47 +0000)]
pci: enhance printf for leaked MSI[-X] vectors
When debugging leaked MSI/MSI-X vectors through LinuxKPI I found
the informational printf unhelpful. Rather than just stating we
leaked also tell how many MSI or MSI-X vectors we leak.
Sponsored by: The FreeBSD Foundation
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D29394
NanoBSD has helper script "fill_pkg.sh" which links all packages and
ther dependencies from "package dump" (like /usr/ports/packages/All) to
specified director. fill_pkg.sh has some limitations:
1) It needs ports tree, which should have exactly same versions as
"package dump".
2) It requires full paths to needed ports, including "/usr/ports" part.
3) It has assumptions about Nano Package Dir (it assumes, that it
specified rtelative to current directory).
4) It does not have any diagnostics (almost).
This PR enhances "fill_pkg.sh" script in several ways:
1) Nano package dir could be absolute path.
2) Script understands four ways to specify "root" ports/packages:
(a) Absolute directory with port (old one)
(b) Relative directory with port, relative to ${PORTSDIR} or /usr/ports
(c) Absolute path to file with package (with .tbz suffix)
(d) Name of package in dump dir, with or without .tbz suffix
These ways can be mixed in one call. Dependencies for
packages are obtained with 'pkg_info -r' call, and are searched for
in same directory as "parent" package. Dependencies for ports are
obtained in old way from port's Makefile.
3) Three levels of diagnostic (and -v option, could be repeated) are added.
4) All path variables are enclosed in quotes, to make script work with paths,
containing spaces.
Note: imp merged in the changes to fill_pkg.sh since this has been a PR.
John Hood [Sun, 11 Jul 2021 14:44:12 +0000 (08:44 -0600)]
loader: support.4th resets the read buffer incorrectly
Large nextboot.conf files (over 80 bytes) are not read correctly by the
Forth loader, causing file parsing to abort, and nextboot configuration
fails to apply.
Simple repro:
nextboot -e foo=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
shutdown -r now
That will cause the bug to cause a parse failure but shouldn't otherwise
affect the boot. Depending on your loader configuration, you may also
have to set beastie_disable and/or reduce the number of modules loaded
to see the error on a small console screen. 12.0 or CURRENT users will
also have to explicitly use the Forth loader instead of the Lua loader.
The error will look something like:
Warning: syntax error on file /boot/loader.conf.local
foo="xxxxxxxxxxxxxxnextboot_enable="YES"
^
/boot/support.4th has crude file I/O buffering, which uses a buffer
'read_buffer', defined to be 80 bytes by the 'read_buffer_size'
constant. The loader first tastes nextboot.conf, reading and parsing
the first line in it for nextboot_enable="YES". If this is true, then
it reopens the file and parses it like other loader .conf files.
Unfortunately, the file I/O buffering code does not fully reset the
buffer state in the reset_line_reading word. If the last file was read
to the end, that doesn't matter; the file buffer is treated as empty
anyway. But in the nextboot.conf case, the loader will not read to the
end of file if it is over 80 bytes, and the file buffer may be reused
when reading the next file. When the file is reread, the corrupt text
may cause file parsing to abort on bad syntax (if the corrupt line has
<>2 quotes in it), the wrong variable to be set, no variable to be set
at all, or (if the splice happens to land at a line ending) something
approximating normal operation.
The bug is very old, dating back to at least 2000 if not before, and is
still present in 12.0 and CURRENT r345863 (though it is now hidden by
the Lua loader by default).
Suggested one-line attached. This does change the behavior of the
reset_line_reading word, which is exported in the line-reading
dictionary (though the export is not documented in loader man pages).
But repo history shows it was probably exported for the PNP support
code, which was never included in the loader build, and was removed 5
months ago.
One thing that puzzles me: how has this bug gone unnoticed/unfixed for
nearly 2 decades? I find it hard to believe that nobody's tried to do
something interesting with nextboot, like load a kernel and filesystem,
which is what I'm doing.
The fslsdma device requires sdma_fw, but that's not included in
GENERIC. That firmware is not in the FreeBSD tree at the moment, but
could easily be.
The license for the firmware can be found in the linux firmware repo:
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=3123d78e09d2f815de4d94aa35c07b3c0469c80e
and looks to be a BSD license + no reverse engineer.
We can add this back after the firmware is imported, made a port, or
whose automatic loading can be made to happen.
Reviewed by: imp (with ian finding the license)
PR: 237466
MFC after: 1 week
Warner Losh [Thu, 8 Jul 2021 19:53:18 +0000 (13:53 -0600)]
devmatch: don't announce autoloading so much
devmatch rc script would announce it was loading a module multiple
times. It used kldload -n so it really wasn't loading it that many
times, but the message is confusing. Use kldstat to see if we need to
load the module before saying we do. This fixes the vast majority of the
problems. It may be possible to race devmatch with a user invocation and
devd, though quite hard. In that case we'll announce things twice, but
still only load it once. No attempt is made to fix this.
Warner Losh [Thu, 8 Jul 2021 19:44:21 +0000 (13:44 -0600)]
devmatch: Be tolerant of .ko being present.
We document that we did not need .ko on the module names in
devmatch_blocklist, but we really needed them. Keep the documentation
the same, but strip the .ko when we need to use the names so you can
specify either.
devmatch loads a number of things automatically. Allow the list of
things to load to happen first in case those drivers affect what would
be loaded. Normally, this will produce the same results, but there's
some special cases that may not when drivers are loaded that report
other drivers missing, like virtio_pci.
Rather than allocating however much memory userspace asks for we only
allocate enough for a handful of states, and copy to userspace for each
completed row.
We start out with enough space for 16 states (per row), but grow that as
required. In most configurations we expect at most a handful of states
per row (more than that would have other negative effects on packet
processing performance).
Happily this wasn't a real bug, because pf_killstates() never fails, but
we should check the return value anyway, in case it does ever start
returning errors.
pidx is never NULL, and is used unconditionally later on in the
function.
Add an assertion, as documentation for the requirement to provide an idx
pointer.
Eugene Grosbein [Mon, 17 May 2021 21:03:15 +0000 (04:03 +0700)]
ipfw: reload sysctl.conf variables if needed
Currently ipfw has multiple components that are not parts
of GENERIC kernel like dummynet etc. They can bring in important
sysctls if enabled with rc.conf(5) and loaded with ipfw startup script
by means of "required_modules" after initial consult
with /etc/sysctl.conf at boot time. Here is an example of one
increasing limit for dummynet hold queues that defaults to 100:
net.inet.ip.dummynet.pipe_slot_limit=1000
This makes it possible to use ipfw/dummynet rules such as:
ipfw pipe 1 config bw 50Mbit/s queue 1000
Such rule is rejected unless above sysctl is applied.
Another example is a group of net.inet.ip.alias.* sysctls
created after libalias.ko loaded as dependency of ipfw_nat.
This is not a problem if corresponding code compiled in custom kernel
so sysctls exist when sysctl.conf is read early or kernel modules
loaded with a loader. This change makes it work also for GENERIC
and modules loaded by means of rc.conf(5) settings.
Eugene Grosbein [Wed, 19 May 2021 13:02:31 +0000 (20:02 +0700)]
rc.d: unbreak sysctl lastload
/etc/rc.d/securelevel is supposed to run /etc/rc.d/sysctl lastload
late at boot time to apply /etc/sysctl.conf settings that fail
to apply early. However, this does not work in default configuration
because of kern_securelevel_enable="NO" by default.
Add new script /etc/rc.d/sysctl_lastload that starts unconditionally.
Warner Losh [Fri, 16 Jul 2021 04:03:24 +0000 (22:03 -0600)]
UPDATING: Not unusual side effect of the awk bug fixed in 3e804463521
You might not be able to build the kernel if you have an awk between Jul
10th and today. It does not affect all platforms due to the nature of
the bug (so amd64 is unaffected in stable/13 or current, but is affected
in stable/12. i386 seems to be affected everywhere).
Warner Losh [Thu, 15 Jul 2021 22:46:06 +0000 (16:46 -0600)]
awk: revert upstream's attempt to disallow hex strings
Upstream one-true-awk decided to disallow hex strings as numbers. This
is in line with awk's behavior prior to C99, and allowed by the POSIX
standard. The standard, however, allows them to be treated as numbers
because that's what the standard said in the 2001 through 2004 editions.
Since 2001, the nawk in FreeBSD has treated them as numbers, so restore
that behavior, allowed by the standard.
A number of scripts in the FreeBSD tree depend on this interpretation,
including scripts to build the kernel which had mysteriously started
failing for some people and not others. By re-allowing 0x hex numbers,
this fixes those scripts and restores POLA.
Upstream issue: https://github.com/onetrueawk/awk/issues/126
Sponsored by: Netflix
Reviewed by: kevans
MFC After: asap due to regression alrady merged to stable
Differential Revision: https://reviews.freebsd.org/D31199
When the setting of kern.ipc.maxsockbuf is less than what is
desired for I/O based on vfs.maxbcachebuf and vfs.nfs.bufpackets,
a console message of "Consider increasing kern.ipc.maxsockbuf".
is printed.
This patch modifies the message to provide a suggested value
for kern.ipc.maxsockbuf.
Note that the setting is only needed when the NFS rsize/wsize
is set to vfs.maxbcachebuf.
While here, make nfs_bufpackets global, so that it can be used
by a future patch that adds a sysctl to set the NFS server's
maximum I/O size. Also, remove "sizeof(u_int32_t)" from the maximum
packet length, since NFS_MAXXDR is already an "overestimate"
of the actual length.
pf: allow table stats clearing and reading with ruleset rlock
Instead serialize against these operations with a dedicated lock.
Prior to the change, When pushing 17 mln pps of traffic, calling
DIOCRGETTSTATS in a loop would restrict throughput to about 7 mln. With
the change there is no slowdown.
Creating tables and zeroing their counters induces excessive IPIs (14
per table), which in turns kills single- and multi-threaded performance.
Work around the problem by extending per-CPU counters with a general
counter populated on "zeroing" requests -- it stores the currently found
sum. Then requests to report the current value are the sum of per-CPU
counters subtracted by the saved value.
Sample timings when loading a config with 100k tables on a 104-way box:
stock:
pfctl -f tables100000.conf 0.39s user 69.37s system 99% cpu 1:09.76 total
pfctl -f tables100000.conf 0.40s user 68.14s system 99% cpu 1:08.54 total
patched:
pfctl -f tables100000.conf 0.35s user 6.41s system 99% cpu 6.771 total
pfctl -f tables100000.conf 0.48s user 6.47s system 99% cpu 6.949 total
Stefan Eßer [Sat, 10 Jul 2021 11:00:56 +0000 (13:00 +0200)]
libalias: fix divide by zero causing panic
The packet_limit can fall to 0, leading to a divide by zero abort in
the "packets % packet_limit".
An possible solution would be to apply a lower limit of 1 after the
calculation of packet_limit, but since any number modulo 1 gives 0,
the more efficient solution is to skip the modulo operation for
packet_limit <= 1.
Randall Stewart [Thu, 8 Jul 2021 11:06:58 +0000 (07:06 -0400)]
tcp: Fix 32 bit platform breakage
This fixes the incorrect use of a sysctl add to u64. It
was for a useconds time, but on 32 bit platforms its
not a u64. Instead use the long directive.
Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D31107
Randall Stewart [Tue, 6 Jul 2021 19:23:22 +0000 (15:23 -0400)]
tcp: HPTS performance enhancements
HPTS drives both rack and bbr, and yet there have been many complaints
about performance. This bit of work restructures hpts to help reduce CPU
overhead. It does this by now instead of relying on the timer/callout to
drive it instead use user return from a system call as well as lro flushes
to drive hpts. The timer becomes a backstop that dynamically adjusts
based on how "late" we are.
Randall Stewart [Tue, 6 Jul 2021 14:36:14 +0000 (10:36 -0400)]
tcp: Address goodput and TLP edge cases.
There are several cases where we make a goodput measurement and we are running
out of data when we decide to make the measurement. In reality we should not make
such a measurement if there is no chance we can have "enough" data. There is also
some corner case TLP's that end up not registering as a TLP like they should, we
fix this by pushing the doing_tlp setup to the actual timeout that knows it did
a TLP. This makes it so we always have the appropriate flag on the sendmap
indicating a TLP being done as well as count correctly so we make no more
that two TLP's.
In addressing the goodput lets also add a "quality" metric that can be viewed via
blackbox logs so that a casual observer does not have to figure out how good
of a measurement it is. This is needed due to the fact that we may still make
a measurement that is of a poorer quality as we run out of data but still have
a minimal amount of data to make a measurement.
Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D31076
Randall Stewart [Fri, 25 Jun 2021 13:30:54 +0000 (09:30 -0400)]
tcp: Preparation for allowing hardware TLS to be able to kick a tcp connection that is retransmitting too much out of hardware and back to software.
Hardware TLS is now supported in some interface cards and it works well. Except that
when we have connections that retransmit a lot we get into trouble with all the retransmits.
This prep step makes way for change that Drew will be making so that we can "kick out" a
session from hardware TLS.
Randall Stewart [Thu, 24 Jun 2021 18:42:21 +0000 (14:42 -0400)]
tcp: Rack not being very friendly with V6:4 socket and having a connection from V4
There were two bugs that prevented V4 sockets from connecting to
a rack server running a V4/V6 socket. As well as a bug that stops the
mapped v4 in V6 address from working.