Dmitry Chagin [Tue, 14 Feb 2023 14:46:32 +0000 (17:46 +0300)]
linux(4): Move use_real_names knob to the linux.c
The MI linux.[c|h] files are module independent with respect to the Linux
emulation layer (i.e., intended for both ISAs, 32 and 64 bit), the analogue
of the MD linux.h. They hold code that cannot be placed into the MI source
and header files where it would belong by common sense, i.e., code that is
machine independent but ISA dependent.
For the use_real_names knob, the code would belong in
linux_socket.[c|h], but linux_socket is ISA dependent.
netlink: fix IPv6 route addition with link-local gateway
Currently the kernel assumes that the IPv6 gateway address is in "embedded"
form - that is, for the link-local IPv6 addresses, interface index
is embedded in bytes 2 and 3 of the address.
Fix address embedding in netlink by wrapping nhop_set_gw() in the
netlink-specific nl_set_nexthop_gw(), which does such embedding
automatically.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days
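A minimal sketch of the "embedded" form described in the netlink gateway
fix above, for illustration only. The struct addr6 type and the
embed_ifindex() helper are hypothetical stand-ins and are not the
kernel's nl_set_nexthop_gw(); the only fact taken from the message is
that the interface index lives in bytes 2 and 3 of a link-local address.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct in6_addr: 16 raw address bytes. */
struct addr6 {
	uint8_t b[16];
};

/* True for fe80::/10 link-local unicast addresses. */
static int
is_linklocal(const struct addr6 *a)
{
	return (a->b[0] == 0xfe && (a->b[1] & 0xc0) == 0x80);
}

/*
 * Store the low 16 bits of the interface index in bytes 2 and 3
 * (network byte order) of a link-local address. Addresses that are
 * not link-local are left untouched.
 */
static void
embed_ifindex(struct addr6 *a, uint16_t ifindex)
{
	if (!is_linklocal(a))
		return;
	a->b[2] = (uint8_t)(ifindex >> 8);
	a->b[3] = (uint8_t)(ifindex & 0xff);
}
```

Wrapping the setter this way means every netlink caller gets the
embedding without having to remember to do it.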
testing: handling non-root users with VNETs in pytest-based tests.
Currently isolation and resource requirements are handled directly
by the kyua runner, based on the requirements specified by the test.
It works well for simple tests, but may cause discrepancy with tests
doing complex pre-setups. For example, all tests that perform
VNET setups require root access to properly function.
This change adds additional handling of the "require_user" property
within the python testing framework. Specifically, it requests
root access if the test class signals its root requirements and
drops privileges to the desired user after performing the pre-setup.
Tom Hukins [Fri, 24 Feb 2023 10:25:35 +0000 (10:25 +0000)]
netlink: Fix "version introduced" documentation.
netlink(4) and associated features will exist in FreeBSD 14.0 but they
will also exist in 13.2, an older version, from commits such as 02b958b
and b309249.
Mark Johnston [Thu, 9 Feb 2023 20:52:35 +0000 (15:52 -0500)]
vmm: Fix AP startup compatibility for old bhyve executables
These changes unbreak AP startup when using a 13.1-RELEASE bhyve
executable with a newer kernel:
- Correct the destination mask for the VM_EXITCODE_IPI message generated
by an INIT or STARTUP IPI in vlapic_icrlo_write_handler().
- Only initialize vlapics on active vCPUs. 13.1-RELEASE bhyve activates
AP vCPUs only after the BSP starts them with an IPI, and vmm now
allocates vcpu structures lazily, so the STARTUP handling in
vm_handle_ipi() could trigger a page fault.
- Fix an off-by-one setting the vcpuid in a VM_EXITCODE_SPINUP_AP
message.
Fixes: 7c326ab5bb9a ("vmm: don't lock a mtx in the icr_low write handler")
Reviewed by: jhb, corvink
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38446
Jessica Clarke [Thu, 30 Jun 2022 20:03:26 +0000 (21:03 +0100)]
.github: Attempt to fix and increase robustness of macOS action
Homebrew has added LLVM 14 and made that the default version, but GitHub
continues to install LLVM 13 for now, so it ends up only accessible via
the versioned name and not the unversioned one. We also add an explicit
installation of llvm@13 so that, if GitHub updates the image to using
LLVM 14, the action continues to work, albeit slightly more slowly. This
also ensures the compiler label remains correct rather than outdated, as
has occurred in the past, and that we don't get new versions of LLVM
before we're ready for them, which is especially relevant for stable
branches. This all mirrors how the Ubuntu jobs are configured.
Rick Macklem [Wed, 8 Feb 2023 22:25:01 +0000 (14:25 -0800)]
nfscl: Fix interaction between mmap'd and VOP_WRITE file updates
asomers@ found a problem with the NFS client, where a write to
an NFS mounted file done via mmap(2) was lost when fspacectl(2)
was done before it. This turned out to be caused by clearing the
dirty bit on pages when the client was doing commit RPCs,
due to the second argument to vfs_busy_pages() being set to 1.
Commit RPCs tell the server to commit previously written data to
stable storage. However, Commit RPCs do not write data from the
client to the server. As such, if the dirty bit on the page has
been set by a mmap'd write to an address in the page, it should
not be cleared. Clearing it causes the mmap'd write to be lost.
This patch fixes the problem by changing the 2nd argument to
vfs_busy_pages() to 0 for this case.
I doubt this bug has affected many, since it was inherited from
the old NFS client and was in 4.3 FreeBSD twenty years ago.
Although fspacectl(2) is FreeBSD 14 specific, a write(2) would
cause the same failure.
- Account for a filter required to enable reception of untagged frames
while registering and unregistering VLANs to avoid trying to add more
filters than HW supports
- While adding MAC/VLAN filters, pre-set the matching method field in the
Admin Queue Command response buffer to the expected error value to work
around an issue with some FW versions, which do not update that field if
the operation fails, and to be able to correctly track which filters were
configured in HW.
- Remove unused IXL_MAX_FILTERS macro definition
- Update number of available MAC/VLAN filters as in newer FW versions it
was decreased by one.
Extend SFP+ cage crosstalk fix by re-checking link state after 5ms delay
to filter out spurious link up indication by transceiver with no fiber
cable connected.
Piotr Kubaj [Thu, 16 Feb 2023 23:49:43 +0000 (00:49 +0100)]
llvm: make sure to use ELFv2 ABI on powerpc64
Currently LLVM is more or less set up to use ELFv2, but it still defaults to
ELFv1 in some places. This causes lld to generate broken binaries when used
with LTO.
Kyle Evans [Thu, 28 Oct 2021 04:40:08 +0000 (23:40 -0500)]
kern: physmem: improve region coalescing logic
The existing logic didn't take into account newly inserted mappings
wholly contained by an existing region (or vice versa), nor did it
account for weird overlap scenarios. The latter is probably unlikely
to happen, but the former may happen in UEFI: BootServicesData allocated
within a large chunk of ConventionalMemory. This situation blows up vm
initialization.
While we're here, remove the "exact match" logic as it's likely wrong;
if an exact match exists with conflicting flags, for instance, then we
should probably be doing something else. The new logic takes into
account exact matches as part of the overlapping efforts.
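As a rough illustration of the coalescing cases named in the physmem
change above (containment, partial overlap, exact match), here is a
sketch over a sorted array of half-open ranges. struct region,
MAX_REGIONS and region_insert() are hypothetical names, and the sketch
ignores region flags entirely, which the real code cannot do.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 16

/* Hypothetical region type: half-open range [start, end). */
struct region {
	uint64_t start;
	uint64_t end;
};

/*
 * Insert [start, end) into the sorted, non-overlapping array r[0..n).
 * Existing regions that overlap or touch the new one are folded into it.
 * Returns the new count, or n unchanged if the array is full.
 */
static size_t
region_insert(struct region *r, size_t n, uint64_t start, uint64_t end)
{
	size_t i, j, k;

	/* Skip regions that end strictly before the new one starts. */
	for (i = 0; i < n && r[i].end < start; i++)
		;
	/* Absorb every region that overlaps or abuts the new range. */
	for (j = i; j < n && r[j].start <= end; j++) {
		if (r[j].start < start)
			start = r[j].start;
		if (r[j].end > end)
			end = r[j].end;
	}
	if (j == i) {
		/* Nothing absorbed: open a slot at index i. */
		if (n == MAX_REGIONS)
			return (n);
		for (k = n; k > i; k--)
			r[k] = r[k - 1];
		n++;
	} else {
		/* Regions i..j-1 collapse into one: close the gap. */
		for (k = j; k < n; k++)
			r[i + 1 + (k - j)] = r[k];
		n -= (j - i - 1);
	}
	r[i].start = start;
	r[i].end = end;
	return (n);
}
```

An exact match needs no special case here: it is absorbed like any other
overlap and the resulting range is unchanged.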
Mark Johnston [Fri, 3 Feb 2023 15:54:23 +0000 (10:54 -0500)]
pvclock: Export a vDSO page even without rdtscp available
When the cycle counter is "stable", i.e., synchronized across vCPUs by
the hypervisor, userspace can use a serialized rdtsc instead of relying
on rdtscp, just like the kernel timecounter does. This can be useful
for performance in guests where the hypervisor hides rdtscp for some
reason.
To avoid breaking compatibility with older userspace which expects
rdtscp to be usable when pvclock exports timekeeping info, hide this
feature behind a sysctl.
Reviewed by: kib
Tested by: Shrikanth R Kamath <kshrikanth@juniper.net>
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38342
Mark Johnston [Fri, 3 Feb 2023 15:53:20 +0000 (10:53 -0500)]
libc: Fall back to rdtsc when using pvclock and rdtscp is not available
In preparation for a follow-up revision wherein kvmclock may export
timekeeping info to userspace even in the absence of AMDID_RDTSCP, fall
back to using rdtsc when rdtscp isn't available. This mimics
pvclock_read_time_info() in the kernel.
Reviewed by: kib
Tested by: Shrikanth R Kamath <kshrikanth@juniper.net>
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38341
Ed Maste [Mon, 20 Feb 2023 16:19:35 +0000 (09:19 -0700)]
lua: reduce diffs between luaconf.h copies
Upstream luaconf.h is contrib/lua/src/luaconf.h.dist, while userland lua
and loader lua have copies in lib/liblua/luaconf.h and
stand/liblua/luaconf.h.
Adjust whitespace, VCS tags, etc. to match upstream's version, for ease
of comparison.
Reviewed By: imp
Sponsored By: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38206
Warner Losh [Mon, 20 Feb 2023 16:09:07 +0000 (09:09 -0700)]
stand: Update mfc notes
Using some automation I found a few mistakes in my earlier list and also
a change merged without a cherry-pick -x (and evidently changed as
well, since git log --cherry didn't know it had been merged). Fix them
so I can use this list with my experimental mfc script.
nd6_resolve_slow() can be called without mbuf. If the LLE entry
is not reachable, nd6_resolve_slow() will add this NULL mbuf to
the holdchain via lltable_append_entry_queue, which will "append"
NULL to the end of the queue (effectively no-op) and bump la_numhold
value. When this entry gets freed, the kernel will panic due to the
inconsistency between the amount of mbufs in the queue and the value
of la_numhold.
Fix the panic by checking that the mbuf is not NULL prior to inserting it
into the holdchain.
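A simplified model of the invariant behind the nd6_resolve_slow() fix
above: the counter must always equal the number of packets actually
linked into the queue. struct pkt, struct holdq and the helpers below
are hypothetical stand-ins for the mbuf and LLE structures, not the
kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct mbuf and the LLE hold queue. */
struct pkt {
	struct pkt *next;
};

struct holdq {
	struct pkt *head;
	struct pkt *tail;
	int numheld;	/* must equal the number of queued packets */
};

/* Append p and bump the counter; the caller guarantees p != NULL. */
static void
holdq_append(struct holdq *q, struct pkt *p)
{
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->numheld++;
}

/* Slow path: only queue when there is a packet to hold. */
static void
resolve_slow(struct holdq *q, struct pkt *p)
{
	if (p != NULL)
		holdq_append(q, p);
}

/* Count the packets actually linked into the queue. */
static int
holdq_len(const struct holdq *q)
{
	int n = 0;

	for (const struct pkt *p = q->head; p != NULL; p = p->next)
		n++;
	return (n);
}
```

With the guard in the caller, a call without a packet leaves both the
list and the counter untouched, so they cannot drift apart.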
Arseny Smalyuk [Tue, 31 May 2022 20:04:51 +0000 (20:04 +0000)]
netinet6: Fix mbuf leak in NDP
Mbufs leak when manually removing incomplete NDP records with pending
packets via ndp -d.
It happens because lltable_drop_entry_queue() relies on the `la_numheld`
counter when dropping NDP entries (lles). It turned out the NDP code never
increased `la_numheld`, so the actual free never happened.
Fix the issue by introducing unified lltable_append_entry_queue(),
common for both ARP and NDP code, properly addressing packet queue
maintenance.
The current code skipped adding the interface when it had to
reallocate the temporary buffer.
Tweak the code to perform the reallocation first and then add
the interface unconditionally.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days
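The "reallocate first, then add unconditionally" shape from the fix
above can be sketched as follows. struct iflist and iflist_add() are
hypothetical names used only for illustration; the actual buffer and
element types in the kernel differ.

```c
#include <stdlib.h>

/* Hypothetical growable list of interface indexes. */
struct iflist {
	int *items;
	size_t count;
	size_t cap;
};

/*
 * Grow the buffer first if it is full, then store the new entry
 * unconditionally, so the entry that triggered the growth is not lost.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int
iflist_add(struct iflist *l, int ifindex)
{
	if (l->count == l->cap) {
		size_t ncap = (l->cap == 0) ? 4 : l->cap * 2;
		int *n = realloc(l->items, ncap * sizeof(*n));

		if (n == NULL)
			return (-1);
		l->items = n;
		l->cap = ncap;
	}
	l->items[l->count++] = ifindex;
	return (0);
}
```

Keeping the store outside the growth branch leaves a single code path
for adding an entry, which is what makes the omission impossible.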
netlink: return optional metadata with the operation result.
Some operations like interface creation may need to return metadata
- in this case, interface name - back to the caller if the operation
is successful.
This change implements attaching an `NLMSGERR_ATTR_COOKIE` nla to the
operation reply message via `nlmsg_report_cookie()`.
Additionally, on successful interface creation, interface index and
interface name are returned in the `IFLA_NEW_IFINDEX` and `IFLA_IFNAME`
TLVs, encapsulated in the `NLMSGERR_ATTR_COOKIE`.
Xin LI [Mon, 13 Feb 2023 04:56:17 +0000 (20:56 -0800)]
cleanvar: Be more careful when cleaning up /var.
The cleanvar script uses find -delete to remove stale files under /var,
which could lead to unwanted removal of files in some unusual scenarios.
For example, when a mounted fdescfs(5) is present under /var/run/samba/fd,
find(1) could descend into a directory that is out of /var/run and remove
files that should not be removed.
To mitigate this, modify the script to use find -x, which restricts the
find scope to one file system only instead of descending into mounted
file systems.
Alan Somers [Sat, 11 Feb 2023 23:43:30 +0000 (16:43 -0700)]
fusefs: fix some resource leaks
fusefs would leak tickets in three cases:
* After FUSE_CREATE, if the server returned a bad inode number.
* After a FUSE_FALLOCATE operation during VOP_ALLOCATE
* After a FUSE_FALLOCATE operation during VOP_DEALLOCATE
Warner Losh [Thu, 16 Feb 2023 16:36:03 +0000 (09:36 -0700)]
efivar: Really look for labels for the provider with right efimedia
The prior code mistakenly thought that the g_consumer that hung off the
provider we found was the right thing to use to find all the glabel
aliases for this node. However, the only way to find that is to iterate
through all the geoms that belong to the glabel geom class, looking for
those geoms with the same name as the provider with the right efimedia.
Do this in a way that caches glabel class, and allows for it to be
absent. Tighten the filter for mounted filesystems to only look
for the ones that are mounted on /dev/.. since the rest of the code
assumes that.
Warner Losh [Thu, 16 Feb 2023 16:36:03 +0000 (09:36 -0700)]
efibootmgr: Add --efidev (-u) to discover UEFI's device path to a dev or file
"efibootmgr --efidev unix-path" will return the UEFI device-path to the
file or device specified by unix-path. It's useful for debugging, but
may also be useful for scripting.
Sponsored by: Netflix
Reviewed by: corvink, manu
Differential Revision: https://reviews.freebsd.org/D38617
Warner Losh [Thu, 16 Feb 2023 16:36:03 +0000 (09:36 -0700)]
efivar: support device paths as well as mounted paths in path_to_dp
In path_to_dp, allow passing in either the actual device path (e.g.
"/dev/foo/bar") or the path where the device is mounted (say
/mnt/baz/bing). In the former case we'll assume the path within the
device is nothing (the relpath). In the latter, we'll take from the
mount point on down as the relpath.
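For the mounted-path case described above, taking "from the mount point
on down" amounts to stripping the mount point prefix at a component
boundary. The helper below is a hypothetical sketch, not the path_to_dp
code; whether the real relpath keeps a leading slash is not stated in
the message, so dropping it here is an assumption.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the part of path below mountpoint, or NULL if path is not
 * under mountpoint. A path equal to the mount point yields "".
 */
static const char *
relpath_under(const char *mountpoint, const char *path)
{
	size_t len = strlen(mountpoint);

	/* Ignore trailing slashes so "/mnt/" and "/mnt" behave alike. */
	while (len > 1 && mountpoint[len - 1] == '/')
		len--;
	if (strncmp(path, mountpoint, len) != 0)
		return (NULL);
	/* Root mount point: everything after the leading '/' is relative. */
	if (len == 1 && mountpoint[0] == '/')
		return (path + 1);
	/* Require a component boundary after the prefix. */
	if (path[len] == '\0')
		return (path + len);
	if (path[len] == '/')
		return (path + len + 1);
	return (NULL);
}
```

The boundary check keeps "/mnt/baz" from matching "/mnt/bazaar". In the
device-path case the relpath is simply empty.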
Warner Losh [Thu, 16 Feb 2023 16:36:03 +0000 (09:36 -0700)]
efivar: Try harder to find label's efimedia
If there's no efimedia attribute on the provider, and the provider's a
glabel, then find the 'parent' geom. In this case, the provider's name
is label-type/name, but the geom's label will be that of the underlying
device (e.g. ada0p1). If it is, recursively call find_geom_efimedia with
the geom's name, which should have the efimedia attribute.
routing: always pass rtentry to add_route_flags().
add_route_flags() uses `rt` prefix data to look up the current
rtentry from the routing table. Update rib_add_route_px() to
always pass rtentry regardless of the op_flags.
Reported by: Stefan Grundmann <sg2342@googlemail.com>
MFC after: 1 day
netlink: use ifmedia to provide vlan interface operstate.
Netlink customers rely on admin and operational state when
working with interfaces. The current implementation returns
"unknown" operstate for all interface types except IFT_ETHER
and IFT_LOOP.
This change updates the code to fetch vlan operstate in the same way
as for the ether interfaces. For the rest of the interface types,
operstate is now mapped to the admin state.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days
There were changes in -HEAD domain/protosw setup logic and
.pru_rcvd netlink handler was missed when performing the merge.
Lack of this handler resulted in userland waiting forever
when performing large dumps of data.
This change restores the handler as a direct commit to stable/13.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
Mark Johnston [Thu, 26 Jan 2023 15:46:19 +0000 (10:46 -0500)]
netlink: Zero-initialize writer structures allocated on the stack
The prevailing pattern seems to be to simply initialize all fields to
zero. Without this, it's possible to trigger a branch on uninitialized
memory, specifically, when testing nw->ignore_limit in
nlmsg_refill_buffer().
Initialize the writer structure in a couple of functions where this is
necessary.
Reported by: KMSAN
Reviewed by: melifaro
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38213
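The pattern from the writer-structure change above, in miniature. The
struct writer layout is hypothetical; only the ignore_limit field name
comes from the commit message, and the helpers are illustrative rather
than the netlink API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical writer; only ignore_limit is named in the commit. */
struct writer {
	char *buf;
	size_t len;
	size_t offset;
	bool ignore_limit;
};

/* Branches on a flag that a careless caller may never have set. */
static bool
writer_can_grow(const struct writer *w, size_t limit)
{
	return (w->ignore_limit || w->len < limit);
}

/* Start from an all-zero structure, then fill in only what is known. */
static void
writer_init(struct writer *w, char *buf, size_t len)
{
	memset(w, 0, sizeof(*w));
	w->buf = buf;
	w->len = len;
}
```

Zeroing the whole structure up front means fields added later are
covered too, without having to revisit every stack allocation.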
Mark Johnston [Tue, 17 Jan 2023 14:36:54 +0000 (09:36 -0500)]
netlink: Zero-initialize mbuf messages
Some users of nlmsg_reserve_object() and nlmsg_reserve_data() are not
careful to fully initialize pad and reserved fields, allowing
uninitialized bytes to leak to userspace. For example, dump_nhgrp()
doesn't set nhm->resvd = 0.
Meanwhile, nlmsg_get_ns_buf() and nlmsg_get_ns_lbuf() zero-initialize
the buffer, so nlmsg_get_ns_mbuf() is inconsistent. Let's just make
them all behave the same here.
Reported by: KMSAN
Reviewed by: melifaro
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38098
Val Packett [Mon, 6 Feb 2023 21:50:13 +0000 (21:50 +0000)]
LinuxKPI: return an address string in pci_name()
amdgpu's virtual display feature uses pci_name() to match a module parameter
string, and the documentation shows an example of `0000:26:00.0` for the name.
In our case the name was just `drmn`, which is not actually unique across
devices.
The other consumers are wireless drivers, which will benefit from this
change.
Generate the expected string for pci_name() to return.
Related to: https://github.com/freebsd/drm-kmod/issues/134
Sponsored by: https://www.patreon.com/valpackett
Reviewed by: bz, hselasky, manu (earlier)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D34248
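A small sketch of producing an address string in the form of the
`0000:26:00.0` example quoted above, read as domain:bus:slot.function.
format_pci_name() is a hypothetical helper for illustration, not the
LinuxKPI implementation, and how the four values are obtained from the
device is left out.

```c
#include <stdio.h>

/*
 * Format a PCI address as domain:bus:slot.function.
 * Returns the snprintf() result (length excluding the terminator).
 */
static int
format_pci_name(char *buf, size_t len, unsigned domain, unsigned bus,
    unsigned slot, unsigned func)
{
	return (snprintf(buf, len, "%04x:%02x:%02x.%x",
	    domain, bus, slot, func));
}
```

Unlike a driver-name-plus-unit string, this form stays distinct for
every function on the bus.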
Bjoern A. Zeeb [Sat, 28 Jan 2023 15:02:51 +0000 (15:02 +0000)]
LinuxKPI: pci: add more functions
Add a dummy pci_assign_resource() and an implementation of
pci_irq_vector() returning the irq for MSI-X, MSI, and legacy interrupt.
Both are needed by wireless drivers.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D38237
Sponsored by: The FreeBSD Foundation
Discussed with: grehan (in Dec)
MFC after: 3 days
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D38222
Bjoern A. Zeeb [Mon, 28 Nov 2022 18:27:03 +0000 (18:27 +0000)]
LinuxKPI: implement irq_get_msi_desc()
Add irq_get_msi_desc() as a wrapper around a PCI function which will
allocate a single cached value (see comment on struct) for the
msi_desc requested if it doesn't exist yet and handle freeing it
when the PCI device goes away. We take the values from the ivars of
the native (FreeBSD) device.
While changing struct pci_dev also add the msi_cap field requested by
a wireless driver.
Bjoern A. Zeeb [Tue, 31 Jan 2023 16:17:14 +0000 (16:17 +0000)]
LinuxKPI: 802.11: basic implementation of *queue(s)/*txq*
LinuxKPI: 802.11: deal with stopped queues
Very basic implementations of ieee80211_{wake,stop}_queue[s],
as well as ieee80211_txq_schedule_start(), ieee80211_next_txq(),
and ieee80211_schedule_txq().
Various combinations of these are used by different wireless
drivers, incl. iwlwifi.
Bjoern A. Zeeb [Tue, 31 Jan 2023 23:00:28 +0000 (23:00 +0000)]
LinuxKPI: 802.11: enhance lkpi_scan_ies_add() for HT and VHT
Add code (currently disabled by #ifdef) for HT and VHT to
lkpi_scan_ies_add(). Switch to a local variable for ic given
the new code also needs the value.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Bjoern A. Zeeb [Sat, 28 Jan 2023 15:53:03 +0000 (15:53 +0000)]
LinuxKPI: pm.h: add dummy pm_wakeup_event()
Add a dummy implementation of pm_wakeup_event() which is used to notify
the power management system about a wakeup (which we do not implement
yet).
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D38239
Bjoern A. Zeeb [Sat, 28 Jan 2023 15:18:24 +0000 (15:18 +0000)]
LinuxKPI: device: add device_set_wakeup_enable()
Add a dummy device_set_wakeup_enable() which is used for WoWLAN which we
do not (yet) support and device_wakeup_enable() which is a wrapper to the
former with the enable argument being true.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D38238
Bjoern A. Zeeb [Sat, 28 Jan 2023 16:15:19 +0000 (16:15 +0000)]
LinuxKPI: const argument to irq_set_affinity_hint()
irq_set_affinity_hint() takes a const mask argument and some drivers
pass it in as such where earlier implementations were more lenient.
Deal with it and __DECONST() the argument when passed to intr_setaffinity().
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D38242
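To illustrate the const handling in the irq_set_affinity_hint() change
above: the DECONST() macro below is a local stand-in with the same shape
as FreeBSD's __DECONST(), and struct mask, legacy_set_affinity() and
set_affinity_hint() are hypothetical names, not the LinuxKPI code.

```c
#include <stdint.h>

/* Local stand-in shaped like FreeBSD's __DECONST() in <sys/cdefs.h>. */
#define DECONST(type, var)	((type)(uintptr_t)(const void *)(var))

/* Hypothetical mask type. */
struct mask {
	unsigned long bits;
};

/* Older interface that takes a non-const pointer but only reads it. */
static unsigned long
legacy_set_affinity(struct mask *m)
{
	return (m->bits);
}

/* Linux-style entry point: accepts const and strips it at the boundary. */
static unsigned long
set_affinity_hint(const struct mask *m)
{
	return (legacy_set_affinity(DECONST(struct mask *, m)));
}
```

This is only safe because the callee never writes through the pointer;
the cast silences the qualifier warning, it does not make the object
writable.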
Add a dummy irq_set_status_flags() along with #defines passed by the driver.
Add disable_irq_nosync() as another wrapper to lkpi_disable_irq().
Those are used by wireless drivers.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D38241
Piotr Kubaj [Tue, 14 Feb 2023 01:29:44 +0000 (17:29 -0800)]
ice(4): Update to 1.37.7-k
Notable changes include:
- DSCP QoS Support (leveraging support added in
rG9c950139051298831ce19d01ea5fb33ec6ea7f89)
- Improved PFC handling and TC queue assignments (now all remaining
queues are assigned to TC 0 when more than one TC is enabled and the
number of available queues does not evenly divide between them)
- Support for dumping the internal FW state for additional debugging by
Intel support
- Support for allowing "No FEC" to be a valid state for the LESM to
negotiate when using non-standard compliant modules
Also includes various bug fixes and smaller enhancements.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Reviewed by: erj@
Tested by: Jeff Pieper <jeffrey.pieper@intel.com>
MFC after: 3 days
Relnotes: yes
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D38109
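The TC queue assignment rule in the ice(4) update above (leftover queues
go to TC 0 when the count does not divide evenly) comes down to simple
arithmetic. The function below is a hypothetical sketch of that rule
alone; the driver may apply further constraints that the message does
not describe.

```c
/*
 * Split nqueues across ntc traffic classes. Each class gets the same
 * base share; any remainder is added to TC 0. out[] must hold ntc entries.
 */
static void
assign_tc_queues(unsigned nqueues, unsigned ntc, unsigned *out)
{
	unsigned base, i;

	if (ntc == 0)
		return;
	base = nqueues / ntc;
	for (i = 0; i < ntc; i++)
		out[i] = base;
	out[0] += nqueues % ntc;
}
```

For example, 10 queues over 4 TCs gives a base share of 2 with 2 left
over, so TC 0 ends up with 4 queues and the other three with 2 each.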