This commit introduces SRD metrics through sysctl.
The metrics can be queried using the following sysctl node:
sysctl dev.ena.<device index>.ena_srd_info
This commit adds sysctl support for customer metrics.
Different customer metrics can be found in the following sysctl node:
sysctl dev.ena.<device index>.customer_metrics
ena: Introduce shared sample interval for all stats
Rename sample_interval node to stats_sample_interval and move
it up in the sysctl tree to make it clear that it's relevant for
all the stats and not only ENI metrics (Currently, sample interval node
is found under eni_metrics node).
Path to node:
dev.ena.<device_index>.stats_sample_interval
Once this parameter is set, it sets the sample interval for all the
stats nodes, including SRD/customer metrics.
Osama Abboud [Mon, 30 Oct 2023 11:27:03 +0000 (11:27 +0000)]
ena: Add sysctl support for spreading IRQs
This commit allows spreading IO IRQs over different CPUs through sysctl.
Two sysctl nodes are introduced:
1- base_cpu: serves as the first CPU to which the first IO IRQ
will be bound.
2- cpu_stride: sets the distance between every two CPUs to which every
two consecutive IO IRQs are bound.
For example for doing the following IO IRQs / CPU binding:
Run the following commands:
sysctl dev.ena.<device index>.irq_affinity.base_cpu=0
sysctl dev.ena.<device_index>.irq_affinity.cpu_stride=2
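The effect of the two nodes can be sketched as simple arithmetic. This is a minimal illustration, not the driver's code: irq_to_cpu() and its ncpus parameter are hypothetical names, and wrapping around the number of CPUs is an assumption made for the sketch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the driver): returns the CPU that
 * the IO IRQ with zero-based index irq_idx would be bound to, given the
 * base_cpu and cpu_stride values set through sysctl.
 * Wrapping modulo ncpus is an assumption of this sketch; ncpus must be
 * nonzero.
 */
static inline uint32_t
irq_to_cpu(uint32_t base_cpu, uint32_t cpu_stride, uint32_t irq_idx,
    uint32_t ncpus)
{
	return ((base_cpu + irq_idx * cpu_stride) % ncpus);
}
```

With base_cpu=0 and cpu_stride=2, as in the commands above, consecutive IO IRQs would land on every second CPU starting from CPU 0.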
Also introduced rss_enabled field, which is intended to replace
'#ifdef RSS' in multiple places, in order to prevent code duplication.
We want to bind interrupts to CPUs if RSS is set OR if the newly
defined sysctl parameter is set. This requires removing a couple of
'#ifdef RSS' in the structs as well, since we'll be using the
relevant parameters in the CPU binding code.
Osama Abboud [Thu, 28 Dec 2023 13:25:43 +0000 (13:25 +0000)]
ena: Upgrade ena-com to freebsd v2.7.0
This commit introduces a number of infrastructures in ena-com, some of
which are being used as of ENA v2.7.0 while other certain infrastructure
assets have been made available for potential future application.
Upgrade ena-com to include the following changes:
* Introduce customer metrics infrastructures
* Introduce SRD metrics infrastructures
* Remove unused fields from ena_com_io_cq and ena_com_io_sq structs
* Minor rework of ena_com_fill_hash_function
* Introduce PHC infrastructures
* Update the licenses for ena-com files
* Delete duplicate *_defs.h found in ena-com and ena_defs directories
* Add validation for completion descriptors consistency
* Move ena_fbsd_log.h file to ena_plat.h
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
Dimitry Andric [Thu, 28 Dec 2023 12:57:41 +0000 (13:57 +0100)]
Reorganize libclang_rt Makefile and make more lib/arch combos available
Upstream has made more clang runtime libraries available for more
architectures, so add them. To make this easier, split up subdir lists
into functional parts (asan, tsan, etc), and put each architecture into
its own .if block.
Effectively, this adds the following libraries for aarch64: asan, cfi,
fuzzer, msan, safestack, stats, tsan, ubsan, xray.
Warner Losh [Thu, 21 Dec 2023 20:36:12 +0000 (13:36 -0700)]
vtnet: Better adjust for ethernet alignment.
Move adjustment of the mbuf from where we allocate it to where we are
about to queue it to the device. Do this only on those platforms that
require it. This allows us to receive an entire jumbo frame on other
platforms. It also doesn't make the adjustment on subsequent frames when
we queue multiple mbufs for LRO operations.
For the normal use case on armv7, there's no difference because we only
ever allocate one mbuf. However, for the LRO cases it increases what's
available in LRO. It also ensures that we get enough mbufs in those cases
as well (though I have no ability to test this in an LRO scenario with
armv7).
This has the side effect of reverting 527b62e37e68.
Jose Luis Duran [Thu, 28 Dec 2023 05:26:23 +0000 (22:26 -0700)]
mtree: Update mtree flags in README file
- Add -b (suppress blank lines before directories).
- The equivalent of `-i` in fmtree is `-j` in mtree (nmtree) (indent the
output 4 spaces).
- Add `-F freebsd9` compatibility flavor (print the closing `..` at the
end).
Warner Losh [Thu, 28 Dec 2023 00:16:33 +0000 (17:16 -0700)]
contributing: Add note about static analyzers
Please don't submit the raw results of some static analysis. Please do
submit the thoughtful results, though. Please test with kyua and create
test cases for any actual bugs that might be fixed.
Graham Perrin [Wed, 27 Dec 2023 23:36:26 +0000 (16:36 -0700)]
bsd-family-tree: tidiness, width
Tidy the raggedness in the section that begins [44B]. As the line that begins
[KB] was previously tidied, now tidy the section to accommodate [BSDI] and
[TUHS]. Rewrap the section to fit the same number of columns.
Colin Percival [Wed, 27 Dec 2023 08:09:08 +0000 (00:09 -0800)]
x86: Adjust base addr for PCI MCFG regions
Each bus gets 1 MB of address space; the actual base address for an
MCFG bus range is the address from the table plus the starting bus
number times 1 MB.
The PCI spec is unclear on this point, but this change matches what
Linux does, which is likely enough of a de facto standard regardless
of what any de jure standard might attempt to say.
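The address arithmetic described above can be sketched as follows. This is only an illustration of the formula in the commit message; mcfg_region_base() and MCFG_BUS_SPAN are hypothetical names, not identifiers from the kernel source.

```c
#include <stdint.h>

/* Each bus gets 1 MB of configuration address space. */
#define MCFG_BUS_SPAN	(UINT64_C(1) << 20)

/*
 * Hypothetical helper: computes the actual base address of an MCFG bus
 * range from the address stored in the table and the starting bus
 * number of that range.
 */
static inline uint64_t
mcfg_region_base(uint64_t table_base, uint8_t start_bus)
{
	return (table_base + (uint64_t)start_bus * MCFG_BUS_SPAN);
}
```

For a range starting at bus 0 the result equals the table address, so only ranges with a non-zero starting bus are affected by the adjustment.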
Mark Johnston [Wed, 27 Dec 2023 20:17:53 +0000 (15:17 -0500)]
Fix the FreeBSD userspace build (#15716)
- Mark some parameters to zpool_power*() as unused.
- Add a stub zpool_disk_wait().
Fixes: a9520e6e5 ("zpool: Add slot power control, print power status")
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Lexi Winter [Wed, 27 Dec 2023 17:30:31 +0000 (17:30 +0000)]
nfsstat: update option strings in docs
Add the missing -q option to the nfsstat(1) manpage SYNOPSIS (it is
already documented in DESCRIPTION), and add the missing -E and -q
options to the built-in usage output.
Alan Somers [Mon, 9 Oct 2023 18:26:25 +0000 (12:26 -0600)]
Fix multiple bugs with ctld's UCL parsing
* Don't segfault when parsing a misformatted auth-group section
* If the config file specifies a chap section within a target but no
auth-group, create a new anonymous auth-group. That matches the
behavior with non-UCL config files.
* Protect some potential segfaults with assertions
Gleb Smirnoff [Wed, 27 Dec 2023 16:34:37 +0000 (08:34 -0800)]
inpcb: poison several inpcb pointers in in_pcbfree()
There are a few subsystems that reference an inpcb and allow it to
outlive in_pcbfree(). There are no known bugs where they dereference
the options pointers of a freed inpcb. Enforce this so that such bugs
don't appear in the future.
Gleb Smirnoff [Wed, 27 Dec 2023 16:34:37 +0000 (08:34 -0800)]
inpcb: reorder inpcb destruction
First, merge in_pcbdetach() with in_pcbfree(). The comment for
in_pcbdetach() was no longer correct. Then, make sure we remove
the inpcb from the hash before we commit any destructive actions
on it. There are a couple of functions that rely on the hash lock
skipping SMR + inpcb lock to look up an inpcb. Although there are
no known functions that similarly rely on the global inpcb list
lock, also do list removal before destructive actions.
Mark Johnston [Wed, 27 Dec 2023 15:13:29 +0000 (10:13 -0500)]
netmap: Ignore errors in CSB_WRITE()
The CSB_WRITE() and _READ() macros respectively write to and read from
userspace memory and so can in principle fault. However, we do not
check for errors and will proceed blindly if they fail. Add assertions
to verify that they do not.
This is in preparation for annotating copyin() and related functions
with __result_use_check.
Ihor Antonov [Wed, 27 Dec 2023 06:07:26 +0000 (00:07 -0600)]
daemon: replace memchr with memrchr
Looping over lines in the buffer is not needed.
The same effect can be achieved by looking for the last newline.
If found, the buffer is guaranteed to have one or more complete lines.
All complete lines are flushed at once with no looping.
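The idea can be sketched in userland as follows. This is a minimal sketch, not the code from daemon(8): complete_lines_len() is a hypothetical helper, and the _GNU_SOURCE define is only needed to expose memrchr() on glibc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes memrchr() on glibc; not needed on FreeBSD */
#endif
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns how many leading bytes of buf[0..len)
 * make up complete lines, i.e. everything up to and including the last
 * '\n'. Returns 0 when the buffer holds no complete line yet.
 */
static size_t
complete_lines_len(const char *buf, size_t len)
{
	const char *last_nl = memrchr(buf, '\n', len);

	if (last_nl == NULL)
		return (0);
	return ((size_t)(last_nl - buf) + 1);
}
```

A caller would write out the first complete_lines_len() bytes in one go and keep the remaining partial line in the buffer for the next read.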
Ihor Antonov [Wed, 27 Dec 2023 06:07:25 +0000 (00:07 -0600)]
daemon: move buffer into daemon_state
There is no reason for a buffer in listen_child()
to be a static function variable. The buffer and
its position are parts of the daemon state and should
live together with the rest of the state variables.
Ihor Antonov [Wed, 27 Dec 2023 06:07:25 +0000 (00:07 -0600)]
daemon: fix clang-tidy warnings
Fixed narrowing conversions:
- strtol replaced with strtonum with range check
- read returns ssize_t
- kevent.data explicitly cast to int before passing into strerror
While we're here:
- Defined and documented maximum restart delay.
- Fixed typo in a comment.
- Removed unused includes.
Gleb Smirnoff [Wed, 27 Dec 2023 04:22:12 +0000 (20:22 -0800)]
netlink: simplify socket destruction
Destroy the socket at the file descriptor close(2). There is no
reason to linger for any longer, there are no external references.
Remove pr_detach method as nothing is left to do after pr_close.
Remove pr_abort method as it shall never be executed for this type
of socket.
Alexander Motin [Wed, 27 Dec 2023 03:30:56 +0000 (22:30 -0500)]
Schedule fast taskqueue callouts on right CPU.
With fast taskqueues using direct callouts we can reduce the number of
CPU wakeups by scheduling the callout on the current CPU if the taskqueue
calls taskqueue_enqueue_timeout() on itself. The trick won't work for
regular taskqueues, since the callout thread will occupy the CPU.
It also may not work in case of multiple threads since we do not
know which thread will pick the task, and we do not want excessive
callout migrations. So we optimize only the other cases we can.
In practice this allows the iichid(4) taskqueue to stay on the CPU where
the underlying ig4(4) interrupts are routed and to not kick CPU 0 with
timer interrupts on each sampling period (every 2nd/3rd sleep).
The tests mostly focus on testing various corner cases.
The tests take a long time to run, so for the common.run runfile
we randomly select a hundred tests.
To run all the bclone tests, bclone.run runfile should be used.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #15631
netpfil: Use accessor functions and named constants for all tcphdr flags
Update all remaining references to the struct tcphdr th_x2 field.
This completes the compatibility of various aspects with AccECN
(TH_AE), after the internal ipfw "re-checksum required" was moved
to use the TH_RES1 flag.
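The accessor pattern can be illustrated with a small userland sketch. Everything here is hypothetical and only mirrors the idea: struct sketch_tcphdr stands in for the two flag-carrying fields of struct tcphdr, and the sketch_* names and SK_* constant values are assumptions of the sketch rather than the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the flag-carrying fields of struct tcphdr. */
struct sketch_tcphdr {
	uint8_t	th_x2;		/* upper 4 flag bits (low nibble used) */
	uint8_t	th_flags;	/* lower 8 flag bits */
};

/* Named constants for the sketch; the values are assumptions. */
#define SK_TH_SYN	0x002
#define SK_TH_ACK	0x010
#define SK_TH_AE	0x100	/* lives in th_x2 */

/* Combine both fields into one 12-bit flags value. */
static inline uint16_t
sketch_get_flags(const struct sketch_tcphdr *th)
{
	return ((uint16_t)(((uint16_t)th->th_x2 << 8) | th->th_flags));
}

/* Split a 12-bit flags value back into the two fields. */
static inline void
sketch_set_flags(struct sketch_tcphdr *th, uint16_t flags)
{
	th->th_x2 = (flags >> 8) & 0x0f;
	th->th_flags = flags & 0xff;
}
```

Callers that go through such accessors never touch th_x2 directly, so a flag stored in the upper bits is handled the same way as the classic eight flags.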
Alexander Motin [Wed, 27 Dec 2023 00:36:34 +0000 (19:36 -0500)]
iichid(4): Switch taskqueue to "fast"
While a "fast" taskqueue may be more expensive due to spinlock use,
when used mainly for timeout tasks it avoids extra context switches
to and from the callout thread, which are even more expensive.
Alexander Motin [Wed, 27 Dec 2023 00:28:56 +0000 (19:28 -0500)]
iichid(4): Unify two taskqueue tasks
taskqueue_enqueue_timeout(0) is equivalent to taskqueue_enqueue(),
so no need to create a separate periodic_task and event_task to run
exactly the same handler.