pkg(7): replace usage of sbuf(9) with open_memstream(3)
open_memstream(3) is a standard way to obtain the same feature we do get
by using sbuf(9) (aka dynamic size buffer), switching to using it makes
pkg(7) more portable, and reduces its number of dependencies.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D30005
pkg(7): when bootstrapping first search for pkg.pkg file then pkg.txz
The package extension is going to be changed to .pkg to be among other
things resilient to the change of compression format used and reduce
the impact of all third party tool of that change.
Ensure the bootstrap knows about it
Reviewed by: manu
Differential revision: https://reviews.freebsd.org/D29232
Moritz Schmitt [Tue, 27 Apr 2021 01:59:12 +0000 (03:59 +0200)]
Make pkg(7) use environment variables specified in pkg.conf
Modify /usr/sbin/pkg to use environment variables specified in pkg.conf.
This allows control over underlying libraries like fetch(3), which can
be configured by setting HTTP_PROXY.
Alexander Motin [Mon, 5 Apr 2021 14:34:40 +0000 (10:34 -0400)]
Set PCIe device's Max_Payload_Size to match PCIe root's.
Usually on boot the MPS is already configured by BIOS. But we've
found that on hot-plug it is not true at least for our Supermicro
X11 boards. As result, mismatch between root's configuration of
256 bytes and device's default of 128 bytes cause problems for some
devices, while others seem to work fine.
33cb3cb2e321 introduced an `rib_head` structure field under the
FIB_ALGO define. This may be problematic for the CTF, as some
of the files including `route_var.h` do not have `fib_algo`
defined.
Make dtrace happy by making the field unconditional.
Add rib_walk_from() wrapper for selective rib tree traversal.
Provide wrapper for the rnh_walktree_from() rib callback.
As currently `struct rib_head` is considered internal to the
routing subsystem, this wrapper is necessary to maintain isolation
from the external code.
[fib algo] Delay algo init at fib growth to to allow to reliably use rib KPI.
Currently, most of the rib(9) KPI does not use rnh pointers, using
fibnum and family parameters to determine the rib pointer instead.
This works well except for the case when we initialize new rib pointers
during fib growth.
In that case, there is no mapping between fib/family and the new rib,
as an entirely new rib pointer array is populated.
Address this by delaying fib algo initialization till after switching
to the new pointer array and updating the number of fibs.
Set datapath pointer to the dummy function, so the potential callers
won't crash the kernel in the brief moment when the rib exists, but
no fib algo is attached.
This change allows to avoid creating duplicates of existing rib functions,
with altered signature.
Traditionally we had 2 sources of information whether the
added/delete route request targets network or a host route:
netmask (RTA_NETMASK) and RTF_HOST flag.
The former one is tricky: netmask can be empty or can explicitly
specify the host netmask. Parsing netmask sockaddr requires per-family
parsing and that's what rtsock code traditionally avoided. As a result,
consistency was not enforced and it was possible to specify network with
the RTF_HOST flag and vice versa.
Continue normalization efforts from D29826 and D29826 and ensure that
RTF_HOST flag always reflects host/network data from netmask field.
Differential Revision: https://reviews.freebsd.org/D29958
MFC after: 2 days
Rick Macklem [Tue, 20 Apr 2021 00:51:07 +0000 (17:51 -0700)]
nfsd: fix stripe size reply for the File Layout pNFS server
At a recent testing event I found out that I had misinterpreted
RFC5661 where it describes the stripe size in the File Layout's
nfl_util field. This patch fixes the pNFS File Layout server
so that it returns the correct value to the NFSv4.1/4.2 pNFS
enabled client.
This affects almost no one, since pNFS server configurations
are rare and the extant pNFS aware NFS clients seemed to
function correctly despite the erroneous stripe size.
It *might* be needed for correct behaviour if a recent
Linux client mounts a FreeBSD pNFS server configuration
that is using File Layout (non-mirrored configuration).
Mark Johnston [Mon, 26 Apr 2021 18:53:16 +0000 (14:53 -0400)]
imgact_elf: Ensure that the return value in parse_notes is initialized
parse_notes relies on the caller-supplied callback to initialize "res".
Two callbacks are used in practice, brandnote_cb and note_fctl_cb, and
the latter fails to initialize res. Fix it.
In the worst case, the bug would cause the inner loop of check_note to
examine more program headers than necessary, and the note header usually
comes last anyway.
Reviewed by: kib
Reported by: KMSAN
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29986
Kevin Bowling [Mon, 19 Apr 2021 02:11:27 +0000 (19:11 -0700)]
e1000: Add support for [Tiger, Alder, Meteor] Lake
Add support for current and future client platform PCI IDs. These are
all I219 variants and have no known driver changes versus previous
generation client platform I219 variants.
ng_ubt: Do not clear stall before receiving of HCI command response.
Unconditional execution of "clear feature" request at SETUP stage was
workaround for probe failures on ng_ubt.ko re-kldloading which is
unnecessary now.
Rick Macklem [Fri, 30 Apr 2021 01:29:25 +0000 (18:29 -0700)]
param.h: bump __FreeBSD_version for commit 5a45802b3c8c
Commit 5a45802b3c8c changed the internal KAPI between the krpc
and NFS. As such, the krpc, nfscommon and nfscl modules must
all be rebuilt from sources.
Rick Macklem [Sun, 11 Apr 2021 21:34:57 +0000 (14:34 -0700)]
nfsv4 client: do the BindConnectionToSession as required
During a recent testing event, it was reported that the NFSv4.1/4.2
server erroneously bound the back channel to a new TCP connection.
RFC5661 specifies that the fore channel is implicitly bound to a
new TCP connection when an RPC with Sequence (almost any of them)
is done on it. For the back channel to be bound to the new TCP
connection, an explicit BindConnectionToSession must be done as
the first RPC on the new connection.
Since new TCP connections are created by the "reconnect" layer
(sys/rpc/clnt_rc.c) of the krpc, this patch adds an optional
upcall done by the krpc whenever a new connection is created.
The patch also adds the specific upcall function that does a
BindConnectionToSession and configures the krpc to call it
when required.
This is necessary for correct interoperability with NFSv4.1/NFSv4.2
servers when the nfscbd daemon is running.
If doing NFSv4.1/NFSv4.2 mounts without this patch, it is
recommended that the nfscbd daemon not be running and that
the "pnfs" mount option not be specified.
truncate(1) is not case-sensitive with regard to setting the size
of a file. makefs(8), however, does not honor upper-case values.
Update release-specific files and the release(7) manual page to
reflect this.
Modular fib lookup framework features logic that allows
route update batching for the algorithms that cannot easily
apply the routing change without rebuilding. As a result,
dataplane lookups may return old data until the the sync
takes place. With the default sync timeout of 50ms, it is
possible that new binary like ping(8) executed exactly after
route(8) will still use the old fib data.
To address some aspects of the problem, framework executes
all rtable changes without RTF_GATEWAY synchronously.
To fix the aforementioned problem, this diff extends sync
execution for all RTF_STATIC routes (e.g. ones maintained by
route(8).
This fixes a bunch of tests in the networking space.
Currently, PCB caching mechanism relies on the rib generation
counter (rnh_gen) to invalidate cached nhops/LLE entries.
With certain fib algorithms, it is now possible that the
datapath lookup state applies RIB changes with some delay.
In that scenario, PCB cache will invalidate on the RIB change,
but the new lookup may result in the same nexthop being returned.
When fib algo finally gets in sync with the RIB changes, PCB cache
will not receive any notification and will end up caching the stale data.
To fix this, introduce additional counter, rnh_gen_rib, which is used
only when FIB_ALGO is enabled.
This counter is incremented by the control plane. Each time when fib algo
synchronises with the RIB, it updates rnh_gen to the current rnh_gen_rib value.
Initial fib algo implementation was build on a very simple set of
principles w.r.t updates:
1) algorithm is ether able to apply the change synchronously (DIR24-8)
or requires full rebuild (bsearch, lradix).
2) framework falls back to rebuild on every error (memory allocation,
nhg limit, other internal algo errors, etc).
This changes brings the new "intermediate" concept - batched updates.
Algotirhm can indicate that the particular update has to be handled in
batched fashion (FLM_BATCH).
The framework will write this update and other updates to the temporary
buffer instead of pushing them to the algo callback.
Depending on the update rate, the framework will batch 50..1024 ms of updates
and submit them to a different algo callback.
This functionality is handy for the slow-to-rebuild algorithms like DXR.
The intent is to better handle time intervals with large amount of RIB
updates (e.g. BGP peer going up or down), while still keeping low sync
delay for the rest scenarios.
The implementation is the following: updates are bucketed into the
buckets of size 50ms. If the number of updates within a current bucket
exceeds the threshold of 500 routes/sec (e.g. 10 updates per bucket
interval), the update is delayed for another 50ms. This can be repeated
until the maximum update delay (1 sec) is reached.
inp_lookup_mcast_ifp() is static and is only used in the inp_join_group().
The latter function is also static, and is only used in the inp_setmoptions(),
which relies on inp being non-NULL.
As a result, in the current code, inp_lookup_mcast_ifp() is always called
with non-NULL inp. Eliminate unused RT_DEFAULT_FIB condition and always
use inp fib instead.
Mark Johnston [Wed, 14 Apr 2021 16:57:24 +0000 (12:57 -0400)]
uma: Introduce per-domain reclamation functions
Make it possible to reclaim items from a specific NUMA domain.
- Add uma_zone_reclaim_domain() and uma_reclaim_domain().
- Permit parallel reclamations. Use a counter instead of a flag to
synchronize with zone_dtor().
- Use the zone lock to protect cache_shrink() now that parallel reclaims
can happen.
- Add a sysctl that can be used to trigger reclamation from a specific
domain.
Currently the new KPIs are unused, so there should be no functional
change.
Reviewed by: mav
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29685
Mark Johnston [Wed, 14 Apr 2021 16:56:39 +0000 (12:56 -0400)]
domainset: Define additional global policies
Add global definitions for first-touch and interleave policies. The
former may be useful for UMA, which implements a similar policy without
using domainset iterators.
No functional change intended.
Reviewed by: mav
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29104
Mark Johnston [Wed, 21 Apr 2021 19:38:01 +0000 (15:38 -0400)]
Add required checks for unmapped mbufs in ipdivert and ipfw
Also add an M_ASSERTMAPPED() macro to verify that all mbufs in the chain
are mapped. Use it in ipfw_nat, which operates on a chain returned by
m_megapullup().
PR: 255164
Reviewed by: ae, gallatin
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29838
iwnstats: fix build with clang and allow install under /usr/local/sbin
iwnstats was not compiling because of some issues raised by the clang
compiler due to -Werror. As a tool it is not connected to world build.
Add missing field "barker_mrc" initialization in struct
iwn_sensitivity_limits for -Wmissing-field-initializers, remove unused
pointer *is on iwn_stats_*_print functions and unused variables for
-Wunused-parameter and -Wunused-variable.
The value for field "barker_mrc" of struct iwn2030_sensitivity_limits
was obtained from linux 3.2 wireless/iwlwifi driver code (iwl-2000.c:115
.barker_corr_th_min_mrc = 390).
Also set BINDIR in Makefile to make it possible to install under
/usr/local/sbin/iwnstats as it require super user.
Reviewed by: adrian
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29800
Alexander Motin [Tue, 13 Apr 2021 15:19:10 +0000 (11:19 -0400)]
Fix race in case of device destruction.
During device destruction it is possible that open() succeed, but
fdevname() return NULL, that can't be assigned to string variable.
Fix that by adding explicit NULL check.
Also while there switch from fdevname() to fdevname_r().
Ka Ho Ng [Sun, 11 Apr 2021 06:45:37 +0000 (14:45 +0800)]
vnode_pager_setsize.9: Some clarifications on the manpage
A number of changes:
- Clarifies the locking rules when calling the routine.
- Correct the description regarding the content range to be purged.
- Document the effects on page fault handler.
MFC with: 86a52e262a6f
Sponsored by: The FreeBSD Foundation
Reviewed by: bcr, kib
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D29637
Nathan Whitehorn [Thu, 25 Feb 2021 02:16:56 +0000 (21:16 -0500)]
Use makefs(8) in release VM-image generation instead of md(4) and newfs.
Using makefs instead reduces the privileges needed to build VM images,
simplifies the script (no need to copy files to a fresh image at the end),
and improves portability by allowing generation of cross-endian images.
As a result of the last, this patch also adds support for generation of
powerpc64 and powerpc64le VM images.
No other changes to the output. Tested and working for both amd64 and
powerpc64 targets.
Allocate extra inodes in makefs when leaving free space in UFS images.
By default, makefs(8) has very few spare inodes in its output images,
which is fine for static filesystems, but not so great for VM images
where many more files will be added. Make makefs(8) use the same
default settings as newfs(8) when creating images with free space --
there isn't much point to leaving free space on the image if you
can't put files there. If no free space is requested, use current
behavior of a minimal number of available inodes.
Reviewed by: manu
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D29492
John Baldwin [Fri, 26 Mar 2021 22:05:31 +0000 (15:05 -0700)]
cxgbe: Add a struct sge_ofld_txq type.
This type mirrors struct sge_ofld_rxq and holds state for TCP offload
transmit queues. Currently it only holds a work queue but will
include additional state in future changes.