CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

LinuxKPI: Add some typical header pollution

To reduce amount of drm-kmod patching

MFC after: 1 week
Reviewed by: hselasky, manu
Differential Revision: https://reviews.freebsd.org/D33297

(cherry picked from commit f1a7639a165d2ef945c0fdac5862167da671c7c4)

LinuxKPI: Implement smp_*mb barriers with atomic_thread_fence_*

for x86 and move them to asm/barrier.h

MFC after: 1 week
Reviewed by: bz, hselasky, manu
Differential Revision: https://reviews.freebsd.org/D33296

(cherry picked from commit 2fb5be7978c27505c02b667a21ce3a79f72e2091)

LinuxKPI: Make lockdep*_pin_lock macros useable for drm-kmod

Summary:
- Add dummy struct pin_cookie definition;
- Convert lockdep_pin_lock macro to function;
- Fix 'unused variable' compile-time errors;

MFC after: 1 week
Reviewers: hselasky, manu
Differential Revision: https://reviews.freebsd.org/D33295

(cherry picked from commit 68fcdba38b7ea65b1f2f395fbd25fb59880d7163)

LinuxKPI: Convert schedule() to inlined function

to prevent name clashing with drm-kmod

MFC after: 1 week
Reviewed by: hselasky, manu
Differential Revision: https://reviews.freebsd.org/D33294

(cherry picked from commit 7ec6cbf1d2fd104734ac8844d8c0bc1fdf50cb6d)

LinuxKPI: Add support for XA_FLAGS_ALLOC1 xarray flag

XA_FLAGS_ALLOC1 causes allocation of xarray entries starting at 1
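
As a rough illustration of the flag's effect (this is not the LinuxKPI
code), an allocator that hands out the lowest free index can start its
search at 1 instead of 0 when the flag is set. The names sketch_ida,
SKETCH_ALLOC1 and sketch_alloc are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_SLOTS  8
#define SKETCH_ALLOC1 0x1u	/* hypothetical stand-in for XA_FLAGS_ALLOC1 */

struct sketch_ida {
	unsigned int flags;
	bool used[SKETCH_SLOTS];
};

/* Return the lowest free index, or -1 when every slot is taken. */
static int
sketch_alloc(struct sketch_ida *ida)
{
	/* With the flag set, index 0 is never handed out. */
	size_t first = (ida->flags & SKETCH_ALLOC1) ? 1 : 0;

	for (size_t i = first; i < SKETCH_SLOTS; i++) {
		if (!ida->used[i]) {
			ida->used[i] = true;
			return ((int)i);
		}
	}
	return (-1);
}
```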

Required by drm-kmod 5.7

MFC after: 1 week
Reviewed by: hselasky, manu
Differential Revision: https://reviews.freebsd.org/D33293

(cherry picked from commit e705066cd87559831096f9638603f35d2fea635f)

LinuxKPI: Implement default sysfs kobject attribute operations

Required by drm-kmod 5.7

MFC after: 1 week
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D33292

(cherry picked from commit 04d42cb453888cbda0fb81d38bd722962ca6fc03)

LinuxKPI: Implement kstrtoull

Required by drm-kmod 5.7
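
For readers unfamiliar with the interface, here is a userland sketch of
kstrtoull()-style parsing built on strtoull(3). It assumes the usual Linux
semantics (0 on success, -EINVAL on malformed input, -ERANGE on overflow,
one optional trailing newline) and is not the code from this commit;
sketch_kstrtoull is a hypothetical name.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse the whole string as an unsigned number; reject trailing junk. */
static int
sketch_kstrtoull(const char *s, unsigned int base, unsigned long long *res)
{
	char *end;
	unsigned long long val;

	/* A single leading '+' is tolerated; '-' and whitespace are not. */
	if (*s == '+')
		s++;
	if (!isalnum((unsigned char)*s))
		return (-EINVAL);

	errno = 0;
	val = strtoull(s, &end, (int)base);
	if (end == s)
		return (-EINVAL);	/* no digits were consumed */
	if (errno == ERANGE)
		return (-ERANGE);	/* value does not fit */
	if (*end == '\n')
		end++;			/* allow one trailing newline */
	if (*end != '\0')
		return (-EINVAL);
	*res = val;
	return (0);
}
```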

MFC after: 1 week
Reviewed by: hselasky, manu
Differential Revision: https://reviews.freebsd.org/D33291

(cherry picked from commit c427456fd560d6def83cd3867cc5bf01d20653e5)

LinuxKPI: Implement dev_driver_string()

Required by drm-kmod 5.7

MFC after: 1 week
Reviewed by: bz, hselasky, manu
Differential Revision: https://reviews.freebsd.org/D33290

(cherry picked from commit bc923d93dffc9d82a705d4e5b9960daa9acdcca6)

LinuxKPI: Implement clflush_cache_range()

Required by drm-kmod 5.7

MFC after: 1 week
Reviewed by: hselasky, manu
Differential Revision: https://reviews.freebsd.org/D33289

(cherry picked from commit db562aeff7755a1128165cc0fbf8252756004847)

LinuxKPI: Add clflush argument type conversion wrapper

to reduce amount of source patching in drm-kmod.

MFC after: 1 week
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D33288

(cherry picked from commit 9a79e08ae761f1d53c982e1d089b923e25b0092f)

LinuxKPI: Implement interval_tree

Required by drm-kmod

MFC after: 1 week
Reviewed by: hselasky, manu
Differential Revision: https://reviews.freebsd.org/D32869

(cherry picked from commit dbc920bd9a9b413182a1940155539a3144a405aa)

LinuxKPI: Import some linux/rbtree.h functions from OpenBSD

Required by drm-kmod

Obtained from: OpenBSD
MFC after: 1 week

(cherry picked from commit dd52763387abd18bb6ac510b1148632a13b945f0)

kqueue(2): Add note about format of the data for NOTE_EXIT

PR: 261346

(cherry picked from commit 7406ec4ea99c1c61e88d5c98c58094093b9e78fb)

nvd: For AHCI attached devices, report ahci bridge

When an NVME device is attached via an AHCI controller, we have no access
to its config space. So instead of information about the nvme drive
itself, return info about the AHCI controller as the next best
thing. Since the Intel Hardware RAID support looks at these values, this
likely is best.

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D33286

(cherry picked from commit b8194f37666768dac35a0e1105c41242aad9b2d0)

carp: fix send error demotion recovery

The problem is that carp(4) would clear the error counter on first
successful send, and stop counting successes after that. Fix this
logic and document it in human language.

PR: 260499
Differential revision: https://reviews.freebsd.org/D33536

(cherry picked from commit 9a8cf950b259f6833c7562ce941b0cfeae6687e5)

zone.9: Remove documentation of non-existent NUMA configuration flags

These configuration options were removed in commit dfe13344f557.

Some forthcoming work will update the UMA man page to describe its
current behaviour on NUMA systems.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 39d4ccf82607c99655fda0d76357a7f534fa724f)

rc: make ctld depend on NETWORKING

This fixes a problem where ctld(8) would refuse to start on boot
with a specific IP address to listen on configured in ctl.conf(5).
It also fixes a problem where ctld(8) would fail to start with
some network interfaces which require a sysctl.conf(5) tweak
to configure them, eg to switch them from InfiniBand to IP mode.

PR: 232397

(cherry picked from commit 015351de04e3e621cff825cc1fdad5faf078c3ac)

adaspindown: check disk power mode before sending IDLE command

If a disk is already in STANDBY mode, then setting IDLE mode can
actually spin it up.

(cherry picked from commit 15910dc0bcb526d575f8cc49efe1f98a5091c88e)

nvme: Do not rearm timeout for commands without one.

Admin queues almost always have several ASYNC_EVENT_REQUEST outstanding.
They have no timeouts, but their presence in qpair->outstanding_tr caused
useless timeout callout rearming twice a second.

While there, relax timeout callout period from 0.5s to 0.5-1s to improve
aggregation. Command timeouts are measured in seconds, so we don't need
to be precise here.

Reviewed by: imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33781

(cherry picked from commit b3c9b6060f9a3525196867d8e812b24fc0bc61e1)

nvme_sim: Only report PCI related stats when we can

For AHCI attached devices, we report the location and identification
information of the AHCI controller that we're attached to. We also
don't report link speed in that case, since we can't get to the PCIe
config space registers to find that out.

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D33287

(cherry picked from commit 8f07932272c4b34804bc575c4f8bffecd15cd4ef)

nvme_ahci: Mark AHCI devices as such in the controller

Add a quirk to flag AHCI attachment to the controller. This is for any
of the strategies for attaching nvme devices as children of the AHCI
device for Intel's RAID devices. This also has a side effect of cleaning
up resource allocation from failed nvme_attach calls now.

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D33285

(cherry picked from commit 7cf8d63c884c484fee9b287f792549ee15270ae7)

nvme: Move to a quirk for the Intel alignment data

Prior to NVMe 1.3, Intel produced a series of drives that had
performance alignment data in the vendor specific space since no
standard had been defined. Move testing the versions to a quirk so the
NVMe NS code doesn't know about PCI device info.

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D33284

(cherry picked from commit 053f8ed6ebf2355a92cb1798a9701f701610771c)

nvme: Reduce traffic to the doorbell register

Reduce traffic to doorbell register when processing multiple completion
events at once. Only write it at the end of the loop after we've
processed everything (assuming we found at least one completion,
even if that completion wasn't valid).
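
The idea can be sketched against a simulated completion queue. This is an
illustration only, not the driver code: the struct layout, the phase
handling and all sketch_* names are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CQ_SIZE 4

struct sketch_cpl {
	uint16_t cid;
	bool phase;
};

struct sketch_cq {
	struct sketch_cpl ring[SKETCH_CQ_SIZE];
	uint32_t head;			/* next entry to inspect */
	bool phase;			/* phase value marking a new entry */
	uint32_t doorbell;		/* simulated head doorbell register */
	unsigned int doorbell_writes;	/* how often the register was written */
};

/* Consume all new entries, then ring the doorbell once after the loop. */
static unsigned int
sketch_process_completions(struct sketch_cq *cq)
{
	unsigned int done = 0;

	while (cq->ring[cq->head].phase == cq->phase) {
		done++;
		if (++cq->head == SKETCH_CQ_SIZE) {
			cq->head = 0;
			cq->phase = !cq->phase;	/* wrapping flips the phase */
		}
	}
	if (done != 0) {
		cq->doorbell = cq->head;
		cq->doorbell_writes++;
	}
	return (done);
}
```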

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D32470

(cherry picked from commit 2ec165e3f065217ae8d54a2a8235fe1f219805ea)

nvme: Restore hotplug warning

Restore hotplug warning in recovery state machine. No functional change
other than what message gets printed.

Sponsored by: Netflix

(cherry picked from commit 18dc12bfd2e23ad2ea97db54cb8ee499f6f014da)

nvme: Use adaptive spinning when polling for completion or state change

We only use nvme_completion_poll in the initialization path. The
commands they queue and wait for finish quickly as they involve no I/O
to the drive's media. These commands take about 20-200 microseconds
each. Set the wait time to 1us and then increase it by a factor of 1.5
each successive iteration (max 1ms). This reduces initialization time by
80ms in cperciva's tests.

Use this same technique waiting for RDY state transitions. This saves
another 20ms. In total we're down from ~330ms to ~2ms.
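
A small model of the backoff described above, assuming the delay grows by
a factor of 1.5 per iteration from 1us up to a 1ms cap. The sleep is
simulated by advancing a counter, and the sketch_* names are hypothetical;
the driver itself uses kernel sleep primitives.

```c
#include <stdint.h>

#define SKETCH_NS_PER_US 1000ULL
#define SKETCH_MAX_NS    1000000ULL	/* 1ms cap */

/* Grow the delay by a factor of 1.5, saturating at the cap. */
static uint64_t
sketch_next_delay(uint64_t ns)
{
	uint64_t next = ns * 3 / 2;

	return (next < SKETCH_MAX_NS ? next : SKETCH_MAX_NS);
}

/* Total time slept before a task that finishes at finish_ns is seen done. */
static uint64_t
sketch_wait_until(uint64_t finish_ns)
{
	uint64_t now = 0, delay = SKETCH_NS_PER_US;

	while (now < finish_ns) {
		now += delay;		/* stand-in for a real sleep */
		delay = sketch_next_delay(delay);
	}
	return (now);
}
```

Short waits are noticed within a microsecond or two, while long waits
settle into 1ms polls instead of busy-looping.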

Tested by: cperciva
Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D32259

(cherry picked from commit 83581511d9476ef5084f47e3cc379be7191ae866)

nvme: Only reset once on attach.

The FreeBSD nvme driver has reset the nvme controller twice on attach to
address a theoretical issue assuring the hardware is in a known
state. However, experience has shown the second reset is unnecessary and
increases the time to boot. Eliminate the second reset. Should there be
a situation when you need a second reset (for buggy or at least somewhat
out of the mainstream hardware), the hardware option NVME_2X_RESET will
restore the old behavior. Document this in nvme(4).

If there's any trouble at all with this, I'll add a sysctl tunable to
control it.

Sponsored by: Netflix
Reviewed by: cperciva, mav
Differential Revision: https://reviews.freebsd.org/D32241

(cherry picked from commit 4b3da659bf62b0f5306b5acee9add41b84361498)

nvme: Remove pause while resetting

After some study of the code and the standard, I think we can just drop
the pause(), unconditionally.  If we're not initialized, then there's
nothing to wait for from a software perspective.  If we are initialized,
then there might be outstanding I/O. If so, then the qpair 'recovery
state' will transition to WAITING in nvme_ctrlr_disable_qpairs, which
will ignore any interrupts for items that complete before we complete
the reset by setting cc.en=0.

If we go on to fail the controller, we'll cancel the outstanding I/O
transactions.  If we reset the controller, the hardware throws away
pending transactions and we retry all the pending I/O transactions. Any
transactions that happen to complete before cc.en=0 will have the same
effect in the end (doing the same transaction twice is just inefficient,
it won't affect the state of the device any differently than having done
it once).

The standard imposes no wait times here, so it isn't needed from that
perspective.

Unanswered Question: Do we need to disable interrupts while we
disable in legacy mode, since those are level-sensitive?

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D32248

(cherry picked from commit e5e26e4a24a1142e02a9b477877e13ed0c194f36)

nvme: Explain a workaround a little better

The "don't touch the mmio of the drive after we do an EN 1->0 transition"
workaround is only for a tiny number of drives that have this unfortunate
issue.

Sponsored by: Netflix

(cherry picked from commit 77054a897f6440632267e75ebe31793c7555b79e)

nvme_ctrlr_enable: Small style nits

Rewrite the nested if's using the preferred FreeBSD style for branches
of ifs that return. NFC. Minor tweaks to the comments to better fit new
code layout.

Sponsored by: Netflix
Reviewed by: mav, chuck (prior rev, but comments rolled in)
Differential Revision: https://reviews.freebsd.org/D32245

(cherry picked from commit a245627a4e9553c84eddea07570daaf85c1067b6)

nvme: Use MS_2_TICKS rather than rolling our own

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D32246

(cherry picked from commit 26259f6ab96e68766556b973d3a4ca0ad125e174)

nvme_ctrlr_enable: Remove unnecessary 5ms delays

Remove the 5ms delays after writing the administrative queue
registers. These delays are from the very earliest days of the driver
(they are in the first commit) and were most likely vestiges of the
Chatham NVMe prototype card that was used to create this driver. Many of
the workarounds necessary for it aren't necessary for standards
compliant cards. The original driver had other areas marked for Chatham,
but these were not. They are unneeded. There's three lines of supporting
evidence.

First, the NVMe standards make no mention of a delay time after these
registers are written. Second, the Linux driver doesn't have them, even
as an option. Third, all my nvme cards work w/o them.

To be safe, add a write barrier between setting up the admin queue and
enabling the controller.

Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D32247

(cherry picked from commit d5fca1dc1d7de15695b65374d6457abd29a747ee)

nvme: Sanity check completion id

Make sure the completion ID is in the range of [0..num_trackers) since
the values past the end of the act_tr array are never going to be valid
trackers and will lead to pain and suffering if we try to dereference
them to get the tracker or to set the tracker back to NULL as we
complete the I/O.
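
A minimal sketch of such a range check, with a simplified stand-in for the
qpair structure. Only act_tr and num_trackers come from the text above;
the rest is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_TRACKERS 4

struct sketch_tracker {
	int unused;
};

struct sketch_qpair {
	uint32_t num_trackers;
	struct sketch_tracker *act_tr[SKETCH_NUM_TRACKERS];
};

/* Return the active tracker for cid, or NULL if cid is invalid or idle. */
static struct sketch_tracker *
sketch_lookup_tracker(struct sketch_qpair *qp, uint16_t cid)
{
	if (cid >= qp->num_trackers)
		return (NULL);	/* never index past the end of act_tr */
	return (qp->act_tr[cid]);
}
```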

Sponsored by: Netflix
Reviewed by: mav, chs, chuck
Differential Revision: https://reviews.freebsd.org/D32088

(cherry picked from commit 36a87d0c6fe9d65de23f177ef84000b205f87e39)

nvme: count number of ignored interrupts

Count the number of times we're asked to process completions, but that
we ignore because the state of the qpair isn't in RECOVERY_NONE.
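
In outline, the gate and its counter might look like the following. The
enum, struct and function names are hypothetical simplifications, not the
driver's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

enum sketch_recovery {
	SKETCH_RECOVERY_NONE,
	SKETCH_RECOVERY_WAITING
};

struct sketch_qpair_state {
	enum sketch_recovery recovery_state;
	uint64_t num_ignored;	/* calls skipped while not in NONE */
	uint64_t num_processed;	/* calls that examined completions */
};

/* Returns true when completions were examined, false when ignored. */
static bool
sketch_handle_interrupt(struct sketch_qpair_state *qp)
{
	if (qp->recovery_state != SKETCH_RECOVERY_NONE) {
		qp->num_ignored++;
		return (false);
	}
	qp->num_processed++;
	return (true);
}
```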

Sponsored by: Netflix
Reviewed by: mav, chuck
Differential Revision: https://reviews.freebsd.org/D32212

(cherry picked from commit 587aa25525e54ea775298c402acd7a647f9838fb)

nvme: Add sanity check for phase on startup.

The proper phase for the qpair right after reset in the first interrupt
is 1. For that interrupt, make sure that we're not still in phase 0. This is an
illegal state to be processing interrupts and indicates that we've
failed to properly protect against a race between initializing our state
and processing interrupts. Modify stat resetting code so it resets the
number of interrupts to 1 instead of 0 so we don't trigger a false
positive panic.

Sponsored by: Netflix
Reviewed by: cperciva, mav (prior version)
Differential Revision: https://reviews.freebsd.org/D32211

(cherry picked from commit 7d5eebe0f4a0f2aa5c8c7dfdd1a9ce1513849da8)

nvme: start qpair in state RECOVERY_WAITING

An interrupt happens on the admin queue right away after the reset, so
as soon as we enable interrupts, we'll get a call to our interrupt
handler. It is safe to ignore this interrupt if we're not yet
initialized, or to process it if we are. If we are initialized, we'll
see there's no completion records and return. If we're not, we'll
process no completion records and return. Either way, nothing is
processed and nothing is lost.

Until we've completely setup the qpair, we need to avoid processing
completion records. Start the qpair in the waiting recovery state so we
return immediately when we try to process completions. The code already
sets it to 'NONE' when our initialization is complete. It's safe to
defer completion processing here because we don't send any commands
before the initialization of the software state of the qpair is
complete. And even if we were to somehow send a command prior to that
completing, the completion record for that command would be processed
when we send commands to the admin qpair after we've setup the software
state. There's no good central point to add an assert for this last
condition.

This fixes a KASSERT "received completion for unknown cmd" panic on
boot.

Fixes: 502dc84a8b6703e7c0626739179a3cdffdd22d81
Sponsored by: Netflix
Reviewed by: mav, cperciva, gallatin
Differential Revision: https://reviews.freebsd.org/D32210

(cherry picked from commit fa81f3731d1a2984a28ae44e60d12a0659b8fd2f)

nvme: Use shared timeout rather than timeout per transaction

Keep track of the approximate time commands are 'due' and the next
deadline for a command. Twice a second, wake up to see if any commands
have entered timeout. If so, quiesce and then enter a recovery mode
half the timeout further in the future to allow the ISR to
complete. Once we exit recovery mode, we go back to operations as
normal.
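
The core of the scheme is one periodic pass over the outstanding commands
rather than one timer per command. A simplified sketch follows; the struct
and names are hypothetical, and treating a zero deadline as "no timeout"
is an assumption of this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_cmd {
	bool active;
	uint64_t deadline;	/* 0 means "no timeout" in this sketch */
};

/* One periodic scan of all commands; true if any is past its deadline. */
static bool
sketch_any_timed_out(const struct sketch_cmd *cmds, size_t n, uint64_t now)
{
	for (size_t i = 0; i < n; i++) {
		if (cmds[i].active && cmds[i].deadline != 0 &&
		    now > cmds[i].deadline)
			return (true);
	}
	return (false);
}
```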

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D28583

(cherry picked from commit 502dc84a8b6703e7c0626739179a3cdffdd22d81)

nvme/nda: Fail all nvme I/Os after controller fails

Once the controller has failed, fail all I/O w/o sending it to the
device. The rest of the nvme driver won't schedule any I/O to the
failed device, and the controller is in an indeterminate state and can't
accept I/O. Fail both at the top end of the sim and the bottom
end. Don't bother queueing up the I/O for failure in a different task.

Reviewed by: chuck
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31341

(cherry picked from commit 4b977e6dda92fe093ea300f1a91dbcf877b64fa0)

Add some nvme initialization routines to TSLOG

About 335 ms of EC2 instance boot time is being spent here.

(cherry picked from commit bad42df9bfcb8d77bdec04ea1f9acd874c762740)

Fix build. Sorry.

MFC after: 2 weeks

(cherry picked from commit 135c269d87c47890cc27309bdfedb98fe04021be)

CTL: Relax callouts precisions.

MFC after: 2 weeks

(cherry picked from commit f4d499fd670283ee09f8870088c1b394843ae468)

cam: Relax callouts precisions.

On large systems even relatively rare callouts may fire many times
per second. This should allow them to aggregate better, since we do
not require any precision when polling for media change, etc.

MFC after: 2 weeks

(cherry picked from commit 0e5c50bf60727a5a832da9ba9dac06c057307a76)

netbsd-tests: Fix the libc stat_socket test

The test tries to connect a socket to a closed port at 127.0.0.1. It
sets O_NONBLOCK on the socket first and expects to get EINPROGRESS from
connect(2), but this is not guaranteed, ECONNREFUSED is possible.
Handle both cases, and re-enable the test.
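
The accepted outcomes can be expressed as a small predicate around a
non-blocking connect(2). This is a sketch under the assumptions stated in
the comments, not the test's actual code; both function names are
hypothetical.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/* A connect to a closed port may still be pending or already refused. */
static bool
sketch_refused_or_pending(int rc, int err)
{
	return (rc == -1 && (err == EINPROGRESS || err == ECONNREFUSED));
}

/* Try a non-blocking connect to 127.0.0.1:port; 0 if the outcome is OK. */
static int
sketch_try_connect(unsigned short port)
{
	struct sockaddr_in sin;
	int fd, flags, rc, err;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd == -1)
		return (-1);
	flags = fcntl(fd, F_GETFL);
	if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
		close(fd);
		return (-1);
	}
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	rc = connect(fd, (struct sockaddr *)&sin, sizeof(sin));
	err = errno;		/* save before close() can overwrite it */
	close(fd);
	return (sketch_refused_or_pending(rc, err) ? 0 : -1);
}
```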

PR: 240621
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 95c75073d3d1ca9dcae41784453172f199bb2c0f)

Free UMA zones when a pass(4) instance goes away.

If the UMA zones are not freed, we get warnings about re-using the
sysctl variables associated with the UMA zones, and we're leaking
the other memory associated with the zone structures. e.g.:

sysctl_warn_reuse: can't re-use a leaf (vm.uma.pass44.size)!
sysctl_warn_reuse: can't re-use a leaf (vm.uma.pass44.flags)!
sysctl_warn_reuse: can't re-use a leaf (vm.uma.pass44.bucket_size)!
sysctl_warn_reuse: can't re-use a leaf (vm.uma.pass44.bucket_size_max)!
sysctl_warn_reuse: can't re-use a leaf (vm.uma.pass44.keg.name)!
sysctl_warn_reuse: can't re-use a leaf (vm.uma.pass44.keg.rsize)!
sysctl_warn_reuse: can't re-use a leaf (vm.uma.pass44.keg.ppera)!
sysctl_warn_reuse: can't re-use a leaf (vm.uma.pass44.keg.ipers)!

Also, correctly clear the PASS_FLAG_ZONE_INPROG flag in
passcreatezone(). The way it was previously done, it would have
set the flag and cleared all other flags that were set at
that point.
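
For reference, the usual idiom for clearing a single flag is shown below.
The flag names and values are hypothetical, and the sketch makes no claim
about what the earlier pass(4) code looked like.

```c
#include <stdint.h>

#define SKETCH_FLAG_OPEN        0x01u
#define SKETCH_FLAG_ZONE_INPROG 0x02u
#define SKETCH_FLAG_ZONE_VALID  0x04u

/*
 * Clear one flag and leave the others alone.  By contrast, a plain
 * assignment such as "flags = flag" would leave only that flag set.
 */
static uint32_t
sketch_clear_flag(uint32_t flags, uint32_t flag)
{
	return (flags & ~flag);
}
```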

Sponsored by: Spectra Logic

(cherry picked from commit ca2a7262df5ec5fd07d4ac61738947f48c9cd7f2)

net80211: ieee80211_dump_node() cosmetics

Printing %p does not need the 0x prefix and while here mark the
ieee80211_node_table argument unused given we do not need it in the
current incarnation of the function.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit c3db9d4a1439e0144415e007599a94dde4bee01b)

LinuxKPI: 802.11 correct enum ieee80211_channel_flags

enum ieee80211_channel_flags are used as bit fields and not as 1..n.
Correct the values using BIT(n).
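
To see why sequential values break flag tests, compare the two hypothetical
enums below. The names and values are made up for illustration and are not
the LinuxKPI definitions.

```c
#define SKETCH_BIT(n) (1U << (n))

/* Sequential values: 3 == (1 | 2), so the third flag overlaps the others. */
enum sketch_seq_flags {
	SEQ_DISABLED = 1,
	SEQ_NO_IR = 2,
	SEQ_RADAR = 3
};

/* One bit per flag: every flag can be tested independently. */
enum sketch_bit_flags {
	BIT_DISABLED = SKETCH_BIT(0),
	BIT_NO_IR = SKETCH_BIT(1),
	BIT_RADAR = SKETCH_BIT(2)
};

static int
sketch_has_flag(unsigned int flags, unsigned int flag)
{
	return ((flags & flag) != 0);
}
```

With the sequential values, a channel carrying only the third flag also
tests as disabled, which is the kind of false positive described above.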

This is also hoped to fix problems with 7260 cards which come up and
panic due to an empty channel list as all channels are set disabled [1][2].
It will hopefully also fix the one or other oddity.

Reported by: ambrisko, Mike Tancsa (mike sentex.net) [1]
Confirmed to fix by: ambrisko, Mike Tancsa (mike sentex.net) [2]
Sponsored by: The FreeBSD Foundation

(cherry picked from commit d7ce88aafc870944d5eda477b125478f56844f81)

LinuxKPI: 802.11 Refine/add DTIM/TSF handling

Correct data types related to delivery traffic indication map (DTIM)/
timing synchronization function (TSF) and implement/refine their
handling.  This information is used/needed by iwlwifi to set a station
as associated.  This will hopefully avoid more "no beacon heard"
time event failures.

The recording of the Linux specific sync_device_ts is done in the
receive path for now in case we do have the right information
available.  I need to investigate as to how-much it may make sense
to also migrate it into net80211 in the future depending on the
usage in other drivers (or how we did handle this in the past in
natively ported versions, e.g. iwm).

Sponsored by: The FreeBSD Foundation

(cherry picked from commit c8dafefaee00d5741ea141f4f7514811437add06)

LinuxKPI: 802.11 handle connection loss differently

Rather than just bouncing back to SCAN bounce to INIT on connection
loss. This should be refined in the future as the comment already
indicates but we need to tie two different worlds together.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit f3229b62a14953f0e487e8b32fef14b073da2890)

pci_dw_mv: Don't enable unhandled interrupts.

Mainly, link error interrupts should only be activated on a fully linked port;
otherwise noise on lanes can cause livelock. But we don't have error
counters yet, so leave these interrupts disabled.

(cherry picked from commit ce5a4083de2d79bc44d209c9e355a09ede47346c)

simple_mfd: switch to controllable locking for syscon provider.

MFC after: 3 weeks

(cherry picked from commit f97f57b51855cecb9b497a90dfed06dac2c21111)

mvebu_gpio: Fix settings of gpio pin direction.

Data Output Enable Control register is inverted – 0 means output direction.
Reflect this fact in code.
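
A sketch of the inverted polarity on a simulated register value. The
function names are hypothetical and no real hardware access is shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated Data Output Enable register: a 0 bit selects output. */
static uint32_t
sketch_set_direction(uint32_t doe, unsigned int pin, bool output)
{
	uint32_t mask = UINT32_C(1) << pin;

	return (output ? (doe & ~mask) : (doe | mask));
}

static bool
sketch_is_output(uint32_t doe, unsigned int pin)
{
	return ((doe & (UINT32_C(1) << pin)) == 0);
}
```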

MFC after: 3 weeks

(cherry picked from commit 01c6d7918985c6e8610d6245af0f745ced86ffd5)

mvebu_gpio: Multiple fixes.

- gpio register access primitives
- locking in interrupt path
- cleanup

In cooperation with: mw
Reviewed by: mw (initial version)
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D29044
Differential Revision: https://reviews.freebsd.org/D28911

(cherry picked from commit a5dce53b75d8750ba95623ad2dbffac4acfd3545)

mvebu_gpio: fix interrupt cause register configuration

According to Armada 8k documentation, the interrupt cause register
(at offset 0x14) is RW0C. Update the configuration in attach and
the mvebu_gpio_isrc_eoi() to follow the description.
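
RW0C means a written 0 clears a bit and a written 1 leaves it unchanged,
so acknowledging one interrupt means writing the complement of its mask.
The model below simulates that behaviour; the sketch_* names are
hypothetical and this is not the driver code.

```c
#include <stdint.h>

/* Simulated RW0C write: 0 bits clear, 1 bits leave the register as is. */
static uint32_t
sketch_rw0c_write(uint32_t reg, uint32_t value)
{
	return (reg & value);
}

/* Acknowledge one pin's interrupt without disturbing the others. */
static uint32_t
sketch_eoi(uint32_t cause, unsigned int pin)
{
	return (sketch_rw0c_write(cause, ~(UINT32_C(1) << pin)));
}
```

Writing the bare mask, as one would for a write-1-to-clear register, would
instead clear every other pending bit and keep the one meant to be
acknowledged.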

Reviewed by: mmel
Obtained from: Semihalf
Sponsored by: Marvell
Differential Revision: https://reviews.freebsd.org/D29013

(cherry picked from commit 819760b35f3196227a1d90089fb98ee115e7ed0d)

tegra/ahci: do not advertise enclosure management facility

It is not implemented in HW.

MFC after: 1 week

(cherry picked from commit 6e9119768dad5289c188e8a07aada98783b52abf)

tegra124: Implement new get_gate method for tegra124 clocks.

MFC after: 1 week

(cherry picked from commit be01656fa4cd78f191c0ad8a6f4640a0c520d5a9)

tegra210: Implement new get_gate method for tegra210 clocks.

MFC after: 1 week

(cherry picked from commit 7c0ec6638548e78a4fd85a5a2d811bac7c2da98b)

extres/clk: Add a method to detect the HW state of the clock gate.

- add method to read gate enable/disable status from HW
- show gate status in sysctl clock dump

MFC after: 1 week

(cherry picked from commit 1a74d77f851212f8cc80e6b15e30c2b252b84d48)

extres/clk: Improve sysctl dump of clocks.

Always recalculate the frequency, the cache is lazily initialized so it is not always up to date.
While I'm here, mark the sysctl as MPSAFE.

Discussed with: manu, adrian
MFC after: 1 week

(cherry picked from commit 72a2f3b5e28ada60de01f08b28888f70eec0baed)

arm: Fix handling of undefined instruction aborts in THUMB2 mode.

Correctly recognize NEON/SIMD and VFP instructions in THUMB2 mode and pass
these to the appropriate handler. Note that it is not necessary to filter
all undefined instruction variants or register combinations; that is the job
of the given handler.

Reported by: Robert Clausecker <fuz@fuz.su>
PR: 259187
MFC after: 2 weeks

(cherry picked from commit a670e1c13a522df4fb8c63bb023b88b1d65de797)

dwmmc: Calculate the maximum transaction length correctly.

We should reserve two descriptors (not MMC_SECTORS) for potentially
unaligned (so bounced) buffer fragments, one for the starting fragment
and one for the ending fragment.

Submitted by: kjopek@gmail.com
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D30387

(cherry picked from commit dfb7360222856e7e4f5e0e5564281a25af63319c)

booti: Enable loading the kernel image to any address aligned to 2 MB

We've supported this for a long time, plus most u-boot setups quietly expect

MFC after: 2 weeks

(cherry picked from commit b07a6bd15a58aa6e23761c51eba78d449cd2cbf3)

intrng: remove now redundant shadow variable.

Should not be a functional change.

Submitted by: ehem_freebsd@m5p.com
Discussed in: https://reviews.freebsd.org/D29310
MFC after: 4 weeks

(cherry picked from commit e88c3b1b02a663f18f51167f54a50e7b4f0eca02)

intrng: Releasing interrupt source should clear interrupt table full state.

The first release of an interrupt in a situation where the interrupt table
is full should schedule a full table check the next time an interrupt is
allocated. A full check is necessary to ensure maximum separation between
the order of allocation and the order of release.

Submitted by: ehem_freebsd@m5p.com (initial version)
Discussed in: https://reviews.freebsd.org/D29310
MFC after: 4 weeks

(cherry picked from commit a49f208d94b873b2187adbfe1d785b3bc8bdc598)

Fix error value returned by ofw_bus_gen_get_node().

By definition ofw_bus_get_node() should consistently return -1 when there
is no associated OF node.

MFC after: 4 weeks
Discussed with: nwhitehorn
Analyzed in: https://reviews.freebsd.org/D30761

(cherry picked from commit 3eae4e106ac7222364fc9dc8c3d35d4ad8c5293a)

riscv: gdb(4) support

Add the MD portion required for the gdb stub.

Reviewed by: jhb (earlier version)
Discussed with: jrtc27
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33734

(cherry picked from commit d72e944812f8774ab0b78fdff9b0204386dd6151)

ipfilter: Fixup comment

Fix comment documenting checksum block in ip_nat.c. Fastforward doesn't
perform checksum.

(cherry picked from commit 896a0153190937e071a31c682c6cf55e4c599886)

ipfilter: Fix whitespace errors

(cherry picked from commit 6af38b34e4b9863171e0240df1d1d432606c21a1)

ipfilter: Fix IP header checksums post ftp proxy

Don't assume checksums will be calculated later in fastforward.

(cherry picked from commit 2a6465245fa3f5323c2036049a730e2f2b95d270)

ipfilter: Correct function description

Correct the parameters descriptions for ipf_fix_outcksum and
ipf_fix_incksum.

(cherry picked from commit 4b5c0c9b813160842b942a4e978d482e8a7d3f7e)

mmc_da: remove write-only local variables

(cherry picked from commit dfb1c97ab973d6c248b4886d7cc28be72c7b33f2)

truss(1): detach more carefully

(cherry picked from commit 12f747e6ff675edfc1f2f95f7fc435dc01e0c29c)

truss: remove write-only variable

(cherry picked from commit ba33c288488d4543d1a140cd5c44b8b3c4c29915)

libc: correct SPDX tag on strstr.c

It was obtained from musl, and is MIT licensed.

MFC after: 3 days
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 2e9bc9d14440aa17e6945a9b6613ebb1711fe960)

libc: fix misleading comment in strstr

Obtained from: musl c53e9b239418

(cherry picked from commit c6750f07b43d18d39729570533f4ecb56da286bf)

When doing a read-only mount of a UFS filesystem using gjournal(8),
suppress error message about a missing gjournal provider.

Submitted by: Andreas Longwitz
Sponsored by: Netflix

(cherry picked from commit 1fbcaa13b033230c52487a270803bd0f7723e107)

systat -vm: Humanize output for ease of reading.

Using 8 width is too wide for large numbers like 1379991K;
1330M is easier to read.

Submitted by: ota_j.email.ne.jp
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33495

(cherry picked from commit a115a4aa51ae891330c9c4404dd4df13b601556f)

usbconfig: actually set the exit code in usage()

Oversight in previous commit: usage() had been turned to accept
an "exitcode" parameter, but it hasn't been used.

MFC after: 2 weeks

(cherry picked from commit 1654b51455cd1c890b08685551abafec88111606)

usbconfig: implement a -v option

Implement a -v option to usbconfig(8), as a shortcut for the most
frequently needed commands dump_device_desc, dump_curr_config_desc,
and show_ifdrv.

While here, implement a real -h option that has been promised by the
man page.

Use <sysexits.h> to declare the utility return codes.

Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D33586
MFC after: 2 weeks

(cherry picked from commit d69b9cc26d1c24a4cbc37478a571b1f531aa7bcc)

usbconfig: documentation fixes, mainly for -i option

* in usage(), clearly mark -i interface as optional
* both, -u busnum and -a devaddr are optional as well
* various minor man page fixes
* clearly mark those two commands that actually use -i ifaceidx
* remove unused bitfield tag got_iface
* fix indentation level according to review comment

Differential Revision: https://reviews.freebsd.org/D33579/
Reviewed by: hselasky
MFC after: 2 weeks

(cherry picked from commit cae1884d4791726f5acf5d64bba9a3583b63e38b)

usbconfig: use getopt(3) for option handling

This makes option handling consistent with other utilities as well as
POSIX rules. By that, it's no longer important whether option name and
its argument are separated by a space or not, so -d5.3 works the same
as -d 5.3.

Also, recognize either /dev/ugen or ugen as prefix to the -d argument.

Note that this removes the undocumented feature that allowed to
specify multiple -d n.m options interleaved with commands referring to
that particular device in a single run.

(cherry picked from commit ae450e6de96b5ec65f425a52b08dc859576ab8d0)

swap_pager: uma_zcreate() doesn't fail

Remove always-false checks for UMA zone creation failure. No functional
change intended.

Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 43b3b8e52d642e94876728202d9b9863315c8525)

vm_pageout: Group sysctl variables together with sysctl definitions

Fix some style bugs while here. No functional change intended.

Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit c4a25e0713263200d41673fe1740e3e0362ccd95)

fusefs: implement VOP_ALLOCATE

Now posix_fallocate will be correctly forwarded to fuse file system
servers, for those that support it.

Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33389

(cherry picked from commit 398c88c7582a195cbfeb689ceff1400cc717673f)

fusefs: in the tests, always assume debug.try_reclaim_vnode is available

In an earlier version of the revision that created that sysctl (D20519)
the sysctl was gated by INVARIANTS, so the test had to check for it.
But in the committed version it is always available.

(cherry picked from commit 19ab361045343bb777176bb08468f7706d7649c4)

fusefs: move common code from forget.cc to utils.cc

(cherry picked from commit 8d99a6b91b788b7ddf88f975f288f7c6479f4be3)

fusefs: fix .. lookups when the parent has been reclaimed.

By default, FUSE file systems are assumed not to support lookups for "."
and "..".  They must opt-in to that.  To cope with this limitation, the
fusefs kernel module caches every fuse vnode's parent's inode number,
and uses that during VOP_LOOKUP for "..".  But if the parent's vnode has
been reclaimed that won't be possible.  Previously we panicked in this
situation.  Now, we'll return ESTALE instead.  Or, if the file system
has opted into ".." lookups, we'll just do that instead.

This commit also fixes VOP_LOOKUP to respect the cache timeout for ".."
lookups, if the FUSE file system specified a finite timeout.

PR: 259974
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33239

(cherry picked from commit 1613087a8127122b03a3730046d051adf4edd14f)

fusefs: copy_file_range must update file timestamps

If FUSE_COPY_FILE_RANGE returns successfully, update the atime of the
source and the mtime and ctime of the destination.

Reviewers: pfg
Differential Revision: https://reviews.freebsd.org/D33159

(cherry picked from commit 5169832c96451e0c939338d8ef34cd0875a24b83)

Fix a race in fusefs that can corrupt a file's size.

VOPs like VOP_SETATTR can change a file's size, with the vnode
exclusively locked.  But VOPs like VOP_LOOKUP look up the file size from
the server without the vnode locked.  So a race is possible.  For
example:

1) One thread calls VOP_SETATTR to truncate a file.  It locks the vnode
   and sends FUSE_SETATTR to the server.
2) A second thread calls VOP_LOOKUP and fetches the file's attributes from
   the server.  Then it blocks trying to acquire the vnode lock.
3) FUSE_SETATTR returns and the first thread releases the vnode lock.
4) The second thread acquires the vnode lock and caches the file's
   attributes, which are now out-of-date.

Fix this race by recording a timestamp in the vnode of the last time
that its filesize was modified.  Check that timestamp during VOP_LOOKUP
and VFS_VGET.  If it's newer than the time at which FUSE_LOOKUP was
issued to the server, ignore the attributes returned by FUSE_LOOKUP.
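
The check described above can be sketched roughly as follows.  This is
a minimal illustration, not the committed fusefs code: struct
node_sketch, its fields, and cache_lookup_size() are hypothetical
names, and plain integers stand in for real timestamps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-vnode fusefs data. */
struct node_sketch {
	uint64_t cached_size;		/* file size currently cached */
	uint64_t last_local_modify;	/* time of the last local size change */
};

/*
 * Cache the size from a lookup reply unless the size was changed locally
 * after the lookup request was issued.  Returns true if the reply was used.
 */
static bool
cache_lookup_size(struct node_sketch *np, uint64_t lookup_issued,
    uint64_t reply_size)
{
	if (np->last_local_modify > lookup_issued)
		return (false);	/* stale reply: keep the cached size */
	np->cached_size = reply_size;
	return (true);
}
```

In terms of the numbered example, the second thread would reach this
check in step 4 with a lookup_issued value older than the truncation,
so the out-of-date size would be discarded.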

PR: 259071
Reported by: Agata <chogata@moosefs.pro>
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33158

(cherry picked from commit 13d593a5b060cf7be40acfa2ca9dc9e0e2339a31)

file: upgrade to 5.41.

(cherry picked from commit 43a5ec4eb41567cc92586503212743d89686d78f)

bhyve: dynamically register FwCtl ports

Qemu's FwCfg uses the same ports as Bhyve's FwCtl. Statically allocated
ports would not allow switching between Qemu's FwCfg and Bhyve's
FwCtl.

Reviewed by:    markj
MFC after:      2 weeks
Sponsored by:   Beckhoff Automation GmbH & Co. KG
Differential Revision:  https://reviews.freebsd.org/D33496

(cherry picked from commit 9fe79f2f2b22ea068e5acb2af23d130a13d2ab06)

bhyve: add more slop to 64 bit BARs

Bhyve allocates small 64 bit BARs below 4 GB and generates ACPI tables
based on this allocation. If the guest decides to relocate those BARs
above 4 GB, it could lead to mismatching ACPI tables. Especially
when using OVMF with enabled bus enumeration it could cause
issues. OVMF relocates all 64 bit BARs above 4 GB. The guest OS
may be unable to recover from this situation and disables some PCI
devices because their BARs are located outside of the MMIO space
reported by ACPI. Avoid this situation by giving the guest more
space for relocating BARs.

Let's be paranoid. The available space for BARs below 4 GB is 512 MB
large. Use a slop of 512 MB. It'll allow the guest to relocate all
BARs below 4 GB to an address above 4 GB. We could run into issues
if we exceed the memlimit above 4 GB. However, this space has
a size of 32 GB. Even when using many PCI devices with large BARs
like framebuffers or when using multiple PCI busses, it's very
unlikely that we run out of space due to the large slop.
Additionally, this situation will occur on startup and not at runtime
which is much better.

Reviewed by:    markj
MFC after:      2 weeks
Sponsored by:   Beckhoff Automation GmbH & Co. KG
Differential Revision:  https://reviews.freebsd.org/D33118

(cherry picked from commit 7d55d295086e0f568b42c89604fad3e47633b2ed)

bhyve: allow reading of fwctl signature multiple times

At the moment, you only have one single chance to read the fwctl
signature. At boot bhyve is in the state IDENT_WAIT. It's then
possible to switch to IDENT_SEND. After bhyve sends the signature,
it switches to REQ. From now on it's impossible to switch back to
IDENT_SEND to read the signature. For that reason, only a single
driver can read the signature. A guest can't use two drivers to
identify that fwctl is present. It gets even worse when using
OVMF. OVMF uses a library to access fwctl. Therefore, every single
OVMF driver would try to read the signature. Currently, only a
single OVMF driver accesses the fwctl. So, there's no issue with
it yet. However, no OS driver would have a chance to detect fwctl when
using OVMF because its signature was already consumed by OVMF.
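
One way to picture the state machine is the sketch below.  It is an
assumption-laden illustration rather than the bhyve source: the state
names come from the text above, but struct fw_sketch, fw_select(),
fw_read(), the placeholder signature bytes, and the rule "writing 0
selects the signature" are all hypothetical.  The point it shows is
that the signature can be requested again from any state, not only
from IDENT_WAIT.

```c
#include <stddef.h>
#include <stdint.h>

enum fw_state { IDENT_WAIT, IDENT_SEND, REQ };

struct fw_sketch {
	enum fw_state state;
	size_t idx;		/* next signature byte to hand out */
};

/* Placeholder bytes, not the real fwctl signature. */
static const char sketch_sig[4] = { 'S', 'I', 'G', 'N' };

/* Guest write to the (hypothetical) select register. */
static void
fw_select(struct fw_sketch *fw, uint16_t val)
{
	if (val == 0) {
		/* Allowed in every state, so the signature can be re-read. */
		fw->state = IDENT_SEND;
		fw->idx = 0;
	}
}

/* Guest read of one signature byte; returns -1 outside IDENT_SEND. */
static int
fw_read(struct fw_sketch *fw)
{
	int c;

	if (fw->state != IDENT_SEND)
		return (-1);
	c = sketch_sig[fw->idx++];
	if (fw->idx == sizeof(sketch_sig))
		fw->state = REQ;	/* signature fully sent */
	return (c);
}
```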

Reviewed by:    markj
MFC after:      2 weeks
Sponsored by:   Beckhoff Automation GmbH & Co. KG
Differential Revision:  https://reviews.freebsd.org/D31981

(cherry picked from commit 8ec366ec6c943550a011effe50bc73e3875f8ead)

bhyve: enumerate BARs by size

Framebuffers, for example, can require a large amount of space, and
BARs need to be aligned to their size. If BARs aren't allocated by
size, the MMIO space becomes heavily fragmented. Allocate BARs in
order of size to reduce fragmentation and the risk of
OUT_OF_MMIO_SPACE issues.
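
The effect of ordering can be illustrated with a small sketch.  It is
not the bhyve allocator: align_up(), cmp_desc(), and place_bars() are
hypothetical helpers, and BAR sizes are assumed to be powers of two.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round addr up to a multiple of size; size must be a power of two. */
static uint64_t
align_up(uint64_t addr, uint64_t size)
{
	return ((addr + size - 1) & ~(size - 1));
}

/* qsort(3) comparator that orders sizes from largest to smallest. */
static int
cmp_desc(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return ((x < y) - (x > y));
}

/* Place BARs in array order starting at base; return the end address. */
static uint64_t
place_bars(const uint64_t *sizes, size_t n, uint64_t base)
{
	uint64_t addr = base;

	for (size_t i = 0; i < n; i++)
		addr = align_up(addr, sizes[i]) + sizes[i];
	return (addr);
}
```

For two 4 KB and two 16 MB BARs interleaved as 4 KB, 16 MB, 4 KB,
16 MB, placing them in that order from address 0 ends at 0x4000000
because of alignment gaps.  Sorting with cmp_desc first ends at
0x2002000.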

Reviewed by: markj
MFC after: 2 weeks
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D28278

(cherry picked from commit 01f9362ef4eb14b041ccdf935fccf0f794074258)

gpart(8): MFC: add minimal reference to glabel(8) to manual page

(cherry picked from commit ba94a95402f335c8e7aa8e28ebdad43361c65909)

LinuxKPI: 802.11 correctly spell queues

PR: 261078
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 312ba38a9bee9510fb7836997b1360d95b9305d8)

LinuxKPI: 802.11 fix locking in lkpi_stop_hw_scan()

In lkpi_stop_hw_scan() we have to unlock around cancelling the
hardware scan and an msleep to wait for the confirmation that the
scan ended. Otherwise we are sleeping with the non-sleepable
net80211 com lock held. At the same time we need to hold the lhw
lock for the msleep().
This lock change got lost in the refactoring of lkpi_iv_newstate().

Reported by: ambrisko, delphij
PR: 261075
Sponsored by: The FreeBSD Foundation

(cherry picked from commit bec766282f242aab3a4bfba402ea74cb0ccf96fb)

LinuxKPI: 802.11 update compat code for driver updates

Add more (dummy in case of HE) defines, structs, functions and another
mac80211 function pointer needed to update and support recent drivers.

(cherry picked from commit 51b461b3db33b7cd7cbc62c9206568321f7298ad)

LinuxKPI / iwlwifi: fix spelling of constants

Fix the spelling of IEEE80211_HE_PHY_CAP9_NOMINAL_PKT_PADDING_*
(was "NOMIMAL"). The original version came from iwlwifi
in iwlwifi-next. Other drivers (from wireless-testing) already
use the correct spelling and need this change in LinuxKPI.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit e200809190fd8472fe525b2527ff1122b37999ac)

LinuxKPI: 802.11 handle IEEE80211_CONF_IDLE better

We never initialized hw->conf.flags for IEEE80211_CONF_IDLE but
on set_channel we would clear it and announce a change.
This led to a problem where drivers may do unneeded work every
time, which may lead to unexpected behaviour (for no
better driver code).

Properly initialize conf.flags with IEEE80211_CONF_IDLE.
Factor out the toggling into a function and clear IDLE while
sw scanning and when associated and set again when scan ends
or we are bouncing out of assoc.

(cherry picked from commit 086be6a80979f76124972273d62106583e35c83c)

LinuxKPI: ip.h add #include

Also include netinet/in.h so that in_addr is known for ip.h.
Found by compiling a new piece of code which complained.

(cherry picked from commit 4ddc0079eab3633aa8370eeec9e37b3796cc88bd)

LinuxKPI: bitfields add more *replace_bits()

Add or extend the already existing *_replace_bits() implementations
using macros as we do for the other parts in the file for
le<n>p_replace_bits(), u<n>p_replace_bits(), and _u<n>_replace_bits().
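
As a rough idea of what such a helper computes, here is a plain
function sketch for the 32-bit case.  It is not the LinuxKPI macro
implementation: sketch_u32_replace_bits() is a hypothetical name, and
the mask is assumed to be non-zero and contiguous.

```c
#include <stdint.h>

/*
 * Return old with the bits selected by field replaced by val, where val
 * is given relative to the lowest set bit of the mask.
 */
static uint32_t
sketch_u32_replace_bits(uint32_t old, uint32_t val, uint32_t field)
{
	uint32_t lsb = field & (~field + 1);	/* lowest set bit of mask */

	return ((old & ~field) | ((val * lsb) & field));
}
```

For example, replacing the field 0x00000f00 in 0xffffffff with the
value 0x5 gives 0xfffff5ff.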

Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D33799

(cherry picked from commit 2fb0569f1ff58209420ed9c5500476ad7d93e702)

LinuxKPI: add hex2bin()

Add a hex2bin() implementation needed by a driver's debugfs code.
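
For readers unfamiliar with the helper, the sketch below shows the
general idea: convert count pairs of hex digits from src into count
bytes in dst.  It is an illustration only, not the committed code;
sketch_hex2bin() and hex_digit() are hypothetical names, and -1 is
used here as a generic error return.

```c
#include <stddef.h>
#include <stdint.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return (c - '0');
	if (c >= 'a' && c <= 'f')
		return (c - 'a' + 10);
	if (c >= 'A' && c <= 'F')
		return (c - 'A' + 10);
	return (-1);
}

/* Decode 2 * count hex digits from src into count bytes at dst. */
static int
sketch_hex2bin(uint8_t *dst, const char *src, size_t count)
{
	while (count-- > 0) {
		int hi, lo;

		hi = hex_digit(*src++);
		if (hi < 0)
			return (-1);
		lo = hex_digit(*src++);
		if (lo < 0)
			return (-1);
		*dst++ = (uint8_t)((hi << 4) | lo);
	}
	return (0);
}
```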

Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D33798

(cherry picked from commit deb9bfbd5bcea70c79f70c7091c35d399b40fb0a)

tests: Add some regression tests for a couple of KERN_PROC_* sysctls

Sponsored by: The FreeBSD Foundation

(cherry picked from commit fff0ae77b9960bb26034297fcb885d26432354bf)