Ian Lepore [Sun, 22 Dec 2019 22:33:22 +0000 (22:33 +0000)]
In gptboot, don't assume a partition number is a single digit, 1-9. GPT
disks can have up to 128 partitions, so parse contiguous digits and then
validate that the number is between 1 and 128 inclusive.
I'm not sure 128 is a hard limit in the GPT standard, but it's the common
number in use, and it's a better upper limit than 9.
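The digit parsing described above can be sketched roughly as follows. This is
only an illustration, not the gptboot code: the function name parse_partnum(),
the GPT_MAX_PARTS constant, and the use of <ctype.h> are all assumptions made
for this example.

```c
#include <ctype.h>
#include <stddef.h>

#define GPT_MAX_PARTS	128	/* assumed upper bound, per the message above */

/*
 * Parse a run of decimal digits at *sp as a partition number.
 * On success, store the number in *partp, advance *sp past the digits,
 * and return 0.  Return -1 if there are no digits or the value is
 * outside 1..GPT_MAX_PARTS.  (Hypothetical helper, for illustration.)
 */
static int
parse_partnum(const char **sp, int *partp)
{
	const char *s = *sp;
	int n = 0;

	if (!isdigit((unsigned char)*s))
		return (-1);
	while (isdigit((unsigned char)*s)) {
		n = n * 10 + (*s - '0');
		if (n > GPT_MAX_PARTS)	/* bail early; also avoids overflow */
			return (-1);
		s++;
	}
	if (n < 1)
		return (-1);
	*partp = n;
	*sp = s;
	return (0);
}
```

Checking the bound inside the loop keeps an arbitrarily long digit string from
overflowing the accumulator.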
Warner Losh [Mon, 16 Dec 2019 21:52:12 +0000 (21:52 +0000)]
Use symbolic names for int13 calls
For all the INT13 calls, use symbolic names instead of magic numbers. This makes
it easier to understand what the code is doing w/o a trip to google to find what
these numbers mean.
Emmanuel Vadot [Fri, 8 Nov 2019 20:08:44 +0000 (20:08 +0000)]
loader.efi: Default to serial if we don't have a ConOut variable
In the EFI implementation in U-Boot no ConOut EFI variable is created;
this causes loader to fall back to the TERM_EMU implementation, which is very
very very slow (and uses the ConOut device in the system table anyway).
The UEFI spec isn't clear on whether this variable needs to exist.
fusefs: don't panic if FUSE_GETATTR fails during VOP_GETPAGES
During VOP_GETPAGES, fusefs needs to determine the file's length, which
could require a FUSE_GETATTR operation. If that fails, it's better to
SIGBUS than panic.
vinvalbuf: do not panic if we were unable to flush dirty buffers
Return EBUSY instead and let the caller handle the issue.
For vgone()/vnode reclamation, the caller first does vinvalbuf(V_SAVE),
which returns EBUSY if dirty buffers were not flushed. The caller then
calls vinvalbuf(0) because of the non-zero return, which gets rid of all
dirty buffers without dependencies.
PR: 238565
Reviewed by: asomers, mckusick
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D30555
Alan Somers [Thu, 22 Apr 2021 21:09:03 +0000 (15:09 -0600)]
gmultipath: make physpath distinct from the underlying providers'
zfsd uses a device's physical path attribute to automatically replace a
missing ZFS disk when a blank disk is inserted into the same physical
slot. Currently gmultipath passes through its underlying providers'
physical path attribute. That may cause zfsd to replace a missing
gmultipath provider with a newly arrived, single-path disk. That would
be bad.
This commit fixes that problem by simply appending "/mp" to the
underlying providers' physical path, in a manner similar to what geli
already does.
Charlie Root [Tue, 12 Jan 2021 01:56:12 +0000 (18:56 -0700)]
ICMP checksum test: Fix for big endian
The in_cksum tests originally tried to simulate a BE environment by
swapping the byte order of the input. But that's overcomplicated, and
didn't actually work on real BE hardware. The correct testing strategy
is just to test on the native endianness, and run the tests in both BE
and LE environments.
MFC uipc_shm: Fix kern.ipc.posix_shm_list for jails
Fix error return of kern.ipc.posix_shm_list, which caused it (and thus
"posixshmcontrol ls") to fail for all jails that didn't happen to own
the last shm object in the list.
Toomas Soome [Mon, 13 Jan 2020 20:02:27 +0000 (20:02 +0000)]
Backout 356693. The libsa malloc does provide necessary alignment and
memalign by 4 will reduce alignment for some platforms. Thanks to Ian for
pointing this out.
Ravi Pokala [Tue, 31 Mar 2020 20:09:20 +0000 (20:09 +0000)]
Fix build for mips.XLP64 kernel, by re-ordering headers
The log for the failure contained errors like this:
| In file included from ${SRCTOP}/sys/mips/nlm/dev/net/xlpge.c:34:
| In file included from ${SRCTOP}/sys/sys/systm.h:44:
| In file included from ./machine/atomic.h:849:
| ${SRCTOP}/sys/sys/_atomic_subword.h:222:37: error: unknown type name 'u_long'; did you mean 'long'?
| atomic_testandset_acq_long(volatile u_long *p, u_int v)
| ^~~~~~
| long
And similar "unknown type name" errors for u_int, not recognizing bool as a type, etc.
This was caused by including <sys/param.h> too far down; move it up where it belongs.
While here, add a blank line after '__FBSDID()', in keeping with convention.
Justin Hibbits [Fri, 15 Nov 2019 04:33:07 +0000 (04:33 +0000)]
atomic: Add atomic_cmpset_masked to powerpc and use it
Summary:
This is a more optimal way of doing atomic_cmpset_masked() than the
fallback in sys/_atomic_subword.h. There's also an override for
_atomic_fcmpset_masked_word(), which may or may not be necessary, and is
unused for powerpc.
Justin Hibbits [Tue, 8 Oct 2019 01:36:34 +0000 (01:36 +0000)]
powerpc: Implement atomic_(f)cmpset_ for short and char
This adds two implementations for each atomic_fcmpset_ and atomic_cmpset_
short and char functions, selectable at compile time for the target
architecture. By default, it uses a generic shift-and-mask to perform atomic
updates to sub-components of 32-bit words from <sys/_atomic_subword.h>.
However, if ISA_206_ATOMICS is defined it uses the ll/sc instructions for
halfword and bytes, introduced in PowerISA 2.06. These instructions are
supported by all IBM processors from POWER7 on, as well as the Freescale/NXP
e6500 core. Although the e5500 and e500mc both implement PowerISA 2.06 they
do not implement these instructions.
As part of this, clean up the atomic_(f)cmpset_acq and _rel wrappers, by
using macros to reduce code duplication.
ISA_206_ATOMICS requires clang or newer binutils (2.20 or later).
Kyle Evans [Wed, 2 Oct 2019 17:06:28 +0000 (17:06 +0000)]
Provide generic sub-word atomic *cmpset
Provide *cmpset_{8,16} as wrappers around atomic_fcmpset_32. Initial users
will be mips and sparc64, and perhaps parts of powerpc.
These are not for general consumption; machine/atomic.h should include this
header as needed to provide atomic_{,f}cmpset_{8,16} and machine/atomic.h
should provide acq_ and rel_ variants.
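A rough userland sketch of the shift-and-mask idea follows. It uses C11
<stdatomic.h> as a stand-in for atomic_fcmpset_32, and the function name
cmpset_8_in_word() and its byte-index argument are assumptions for this
example; the real header derives the containing word and the shift from the
byte's address and the platform's endianness.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Compare-and-set one byte inside an aligned 32-bit word.
 * 'idx' counts bytes from the least significant end (0..3).
 * Returns 1 if the byte held 'old' and was replaced by 'val', else 0.
 * (Hypothetical helper, for illustration only.)
 */
static int
cmpset_8_in_word(_Atomic uint32_t *word, unsigned idx, uint8_t old,
    uint8_t val)
{
	unsigned shift = idx * 8;
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t cur, want;

	cur = atomic_load(word);
	do {
		/* Fail if the target byte does not match. */
		if (((cur & mask) >> shift) != old)
			return (0);
		/* Keep the neighbouring bytes, replace only ours. */
		want = (cur & ~mask) | ((uint32_t)val << shift);
		/* On failure 'cur' is refreshed and the byte is rechecked. */
	} while (!atomic_compare_exchange_weak(word, &cur, want));
	return (1);
}
```

The loop retries when a neighbouring byte changed underneath us, since that is
not a failure of the sub-word comparison itself.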
Kyle Evans [Thu, 2 Jan 2020 22:52:31 +0000 (22:52 +0000)]
sys/dev/cfi: include sys/types.h as well
This will soon be a dependency for machine/atomic.h on mips with the
introduction of 64-bit atomics; the scope here is pretty narrow, so throw it
here in the header just before systm.h, which includes machine/atomic.h
Kyle Evans [Wed, 2 Oct 2019 15:13:40 +0000 (15:13 +0000)]
mips: fcmpset: do not spin on sc failure
For ll/sc architectures, atomic(9) allows failure modes where *old == val
due to write failure and callers should compensate for this. Do not retry on
failure, just leave 0 in ret and fail the operation if we couldn't sc it.
This lets the caller determine if it should retry or not.
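From the caller's side, this means an fcmpset-style operation is normally
wrapped in a loop. The sketch below uses C11 atomic_compare_exchange_weak() as
a userland stand-in with similar semantics (it may fail spuriously and it
refreshes the expected value on failure); fetch_add_with_retry() is a
hypothetical name for this example.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Atomically add 'delta' to *p and return the previous value.
 * (Hypothetical helper, for illustration only.)
 */
static uint32_t
fetch_add_with_retry(_Atomic uint32_t *p, uint32_t delta)
{
	uint32_t old = atomic_load(p);

	/*
	 * The weak compare-exchange may fail even when *p == old, much
	 * like a failed sc.  On any failure it stores the current value
	 * in 'old', so the caller simply decides to try again.
	 */
	while (!atomic_compare_exchange_weak(p, &old, old + delta))
		continue;
	return (old);
}
```

A caller that does not need the update to succeed can instead test the result
once and move on, which is the flexibility the change is after.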
Warner Losh [Tue, 17 Dec 2019 03:20:37 +0000 (03:20 +0000)]
Two minor issues:
(1) Don't define load/store 64 atomics for o32. They aren't atomic
there.
(2) Add comment about why we need 64 atomic define on n32 only.
Brandon Bergren [Thu, 2 Jan 2020 23:20:37 +0000 (23:20 +0000)]
[PowerPC] [MIPS] Implement 32-bit kernel emulation of atomic64 operations
This is a lock-based emulation of 64-bit atomics for kernel use, split off
from an earlier patch by jhibbits.
This is needed to unblock future improvements that reduce the need for
locking on 64-bit platforms by using atomic updates.
The implementation allows for future integration with userland atomic64,
but as that implies going through sysarch for every use, the current
status quo of userland doing its own locking may be for the best.
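As a minimal sketch of what lock-based emulation means, the example below
guards every 64-bit access with a single spin lock. The names a64_lock,
emu_cmpset_64() and emu_fetchadd_64() are invented for this example, and the
single global C11 atomic_flag is an assumption; the kernel implementation is
free to pick a different locking scheme.

```c
#include <stdatomic.h>
#include <stdint.h>

/* One global spin lock: an assumption of this sketch. */
static atomic_flag a64_lock = ATOMIC_FLAG_INIT;

static void
a64_enter(void)
{
	while (atomic_flag_test_and_set_explicit(&a64_lock,
	    memory_order_acquire))
		continue;
}

static void
a64_exit(void)
{
	atomic_flag_clear_explicit(&a64_lock, memory_order_release);
}

/* Emulated compare-and-set: returns 1 if *p was 'old' and is now 'val'. */
static int
emu_cmpset_64(uint64_t *p, uint64_t old, uint64_t val)
{
	int ok;

	a64_enter();
	ok = (*p == old);
	if (ok)
		*p = val;
	a64_exit();
	return (ok);
}

/* Emulated fetch-and-add: returns the value before the addition. */
static uint64_t
emu_fetchadd_64(uint64_t *p, uint64_t delta)
{
	uint64_t prev;

	a64_enter();
	prev = *p;
	*p = prev + delta;
	a64_exit();
	return (prev);
}
```

The emulation is only atomic with respect to other accesses that go through
the same helpers, which is why every 64-bit operation has to be covered.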
Ian Lepore [Tue, 28 Sep 2021 19:29:10 +0000 (13:29 -0600)]
Fix busdma resource leak on usb device detach.
When a usb device is detached, usb_pc_dmamap_destroy() called
bus_dmamap_destroy() while the map was still loaded. That's harmless on x86
architectures, but on all other platforms it causes bus_dmamap_destroy() to
return EBUSY and leak away any memory resources (including bounce buffers)
associated with the mapping, as well as any allocated map structure itself.
This change introduces a new is_loaded flag to the usb_page_cache struct to
track whether a map is loaded or not. If the map is loaded,
bus_dmamap_unload() is called before bus_dmamap_destroy() to avoid leaking
away resources.
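The unload-before-destroy ordering can be illustrated with a small mock. The
mock_* functions below only imitate the EBUSY behaviour described above and
are not the busdma API; struct page_cache and page_cache_destroy() are
likewise hypothetical stand-ins for the usb_page_cache handling.

```c
#include <errno.h>
#include <stdbool.h>

/* Mock of a DMA map that refuses to be destroyed while loaded. */
struct mock_map {
	bool loaded;
	bool destroyed;
};

static void
mock_unload(struct mock_map *m)
{
	m->loaded = false;
}

static int
mock_destroy(struct mock_map *m)
{
	if (m->loaded)
		return (EBUSY);	/* resources would leak here */
	m->destroyed = true;
	return (0);
}

/* Hypothetical stand-in for the page cache with the new flag. */
struct page_cache {
	struct mock_map map;
	bool is_loaded;
};

static int
page_cache_destroy(struct page_cache *pc)
{
	/* Unload first if a load is still outstanding. */
	if (pc->is_loaded) {
		mock_unload(&pc->map);
		pc->is_loaded = false;
	}
	return (mock_destroy(&pc->map));
}
```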
Kyle Evans [Wed, 6 Oct 2021 14:50:32 +0000 (09:50 -0500)]
tests: kqueue: CLOCK_BOOTTIME is an alias of CLOCK_UPTIME
Build-test should be done a buildenv from a newer branch. =-( We don't
have this alias in stable/12, so just provide it locally in a way that
won't break should 155f15118a77 find its way here.
Use atomic counters to ensure that we correctly track the number of half
open states and syncookie responses in-flight.
This determines if we activate or deactivate syncookies in adaptive
mode.
We'd likely be better served by converting these to the equivalent mem*
calls, but just kill the knob for now. The b* macros being defined get
in the way of _FORTIFY_SOURCE.
kqueue: don't arbitrarily restrict long-past values for NOTE_ABSTIME
NOTE_ABSTIME values are converted to values relative to boottime in
filt_timervalidate(), and negative values are currently rejected. We
don't reject times in the past in general, so clamp this up to 0 as
needed such that the timer fires immediately rather than imposing what
looks like an arbitrary restriction.
Another possible scenario is that the system clock had to be adjusted
by ~minutes or ~hours and we have less than that in terms of uptime,
making a reasonable short-timeout suddenly invalid. Firing it is still
a valid choice in this scenario so that applications can at least
expect a consistent behavior.
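The clamping itself is simple; a sketch, assuming plain 64-bit nanosecond
values rather than the kernel's time types, and a hypothetical function name
abstime_to_uptime():

```c
#include <stdint.h>

/*
 * Convert an absolute deadline to one relative to boot time.
 * A deadline that lies before boot is clamped to 0, so the timer
 * fires immediately instead of being rejected.  Assumes the
 * subtraction itself cannot overflow.  (Illustration only.)
 */
static int64_t
abstime_to_uptime(int64_t abs_ns, int64_t boottime_ns)
{
	int64_t rel = abs_ns - boottime_ns;

	return (rel < 0 ? 0 : rel);
}
```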
This function was renamed to kern_reboot() in 2010, but the man page has
failed to keep in sync. Bring it up to date on the rename, add the
shutdown hooks to the synopsis, and document the (obvious) fact that
kern_reboot() does not return.
Fix an outdated reference to the old name in kern_reboot(), and leave a
reference to the man page so future readers might find it before any
large changes.
Reviewed by: imp, markj
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32085
rman: fix overflow in rman_reserve_resource_bound()
If the default range of [0, ~0] is given, then (~0 - 0) + 1 == 0. This
in turn will cause any allocation of non-zero size to fail. Zero-sized
allocations are prohibited, so add a KASSERT to this effect.
History indicates it is part of the original rman code. This bug may in
fact be older than some contributors.
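The wraparound is easy to reproduce in isolation. The sketch below uses
uint64_t as a stand-in for the resource type, and range_fits() shows one
overflow-free way to phrase the check; it is not necessarily how the commit
fixes it, and both function names are invented for this example.

```c
#include <stdint.h>

typedef uint64_t range_t;	/* hypothetical stand-in for the rman type */

/* Naive size computation: wraps to 0 for the full range [0, ~0]. */
static range_t
range_size_naive(range_t start, range_t end)
{
	return ((end - start) + 1);
}

/*
 * Overflow-free check: does a region of 'count' units fit in
 * [start, end]?  Zero-sized requests are rejected outright.
 */
static int
range_fits(range_t start, range_t end, range_t count)
{
	if (count == 0 || end < start)
		return (0);
	return (count - 1 <= end - start);
}
```

Comparing count - 1 against end - start avoids ever forming the size of the
full range, which is the one value that does not fit in the type.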
David Bright [Mon, 27 Sep 2021 13:18:46 +0000 (06:18 -0700)]
ntb_hw_intel: fix xeon NTB gen3 bar disable logic
The NTB gen3 driver was supposed to disable NTB BAR access by
default, but due to an incorrect register access method, the BAR disable
logic did not work as expected. Those registers should be modified
through NTB BAR0 rather than PCI configuration space.
In addition, we should protect ourselves from a bad buddy node, so
ingress disable logic is implemented as well.
When WITHOUT_INET6 is selected we generate null if-then-else blocks
due to incorrect placement of #if statements. Move the #if statements,
eliminating unnecessary runtime comparisons when WITHOUT_INET6 is set.
From jilles: POSIX requires that a script set `OPTIND=1` before using
different sets of parameters with `getopts`, or the results will be
unspecified.
The specific problem observed here is that we would execute `man -f` or
`man -k` without cleaning up state from man_parse_args()'s `getopts`
loop. FreeBSD's /bin/sh seems to reset OPTIND to 1 after we hit the
second getopts loop, rendering the following shift harmless; other
/bin/sh implementations will leave it at what we came into the loop at
(e.g., bash as /bin/sh), shifting off any keywords that we had.
Matt Macy [Sat, 19 Dec 2020 01:08:33 +0000 (17:08 -0800)]
iflib: ensure that tx interrupts enabled and cleanups
Doing a 'dd' over iscsi will reliably cause stalls. Tx
cleaning _should_ reliably happen as data is sent.
However, currently if the transmit queue fills it will
wait until the iflib timer (hz/2) runs.
This change causes the tx taskq thread to be run
if there are completed descriptors.
While here:
- make timer interrupt delay a sysctl
- simplify txd_db_check handling
- comment on INTR types
Background on the change:
Initially doorbell updates were minimized by only writing to the register
on every fourth packet. If txq_drain would return without writing to the
doorbell it scheduled a callout on the next tick to do the doorbell write
to ensure that the write otherwise happened "soon". At that time a sysctl
was added for users to avoid the potential added latency by simply writing
to the doorbell register on every packet. This worked perfectly well for
e1000 and ixgbe ... and appeared to work well on ixl. However, as it
turned out there was a race in this approach that would lock up the ixl MAC.
It was possible for a lower producer index to be written after a higher one.
On e1000 and ixgbe this was harmless - on ixl it was fatal. My initial
response was to add a lock around doorbell writes - fixing the problem but
adding an unacceptable amount of lock contention.
The next iteration was to use transmit interrupts to drive delayed doorbell
writes. If there were no packets in the queue all doorbell writes would be
immediate; as the queue started to fill up we could delay doorbell writes
further and further. At the start of drain if we've cleaned any packets we
know we've moved the state machine along and we write the doorbell (an
obvious missing optimization was to skip that doorbell write if db_pending
is zero). This change required that tx interrupts be scheduled periodically
as opposed to just when the hardware txq was full. However, that just leads
to our next problem.
Initially dedicated msix vectors were used for both tx and rx. However, it
was often possible to use up all available vectors before we set up all the
queues we wanted. By having rx and tx share a vector for a given queue we
could halve the number of vectors used by a given configuration. The problem
here is that with this change only e1000 passed the necessary value to have
the fast interrupt drive tx when appropriate.
Eric van Gyzen [Fri, 21 Aug 2020 19:34:41 +0000 (19:34 +0000)]
ixgbe: fix impossible condition
Coverity flagged this condition: The condition
offset == 0 && offset == 65535
can never be true because offset cannot be equal
to two different values at the same time.
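A tiny sketch of the two forms follows. The assumption here, which the message
above does not spell out, is that the intent was to reject either sentinel
value, so the corrected form uses a logical OR; both function names are
hypothetical.

```c
#include <stdint.h>

/* The flagged form: can never be true for any offset. */
static int
offset_invalid_flagged(uint16_t offset)
{
	return (offset == 0 && offset == 65535);
}

/* Presumed intent (an assumption): either sentinel is invalid. */
static int
offset_invalid(uint16_t offset)
{
	return (offset == 0 || offset == 65535);
}
```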
Mark Johnston [Mon, 8 Mar 2021 17:39:06 +0000 (12:39 -0500)]
iflib: Make if_shared_ctx_t a pointer to const
This structure is shared among multiple instances of a driver, so we
should ensure that it doesn't somehow get treated as if there's a
separate instance per interface. This is especially important for
software-only drivers like wg.
DEVICE_REGISTER() still returns a void * and so the per-driver sctx
structures are not yet defined with the const qualifier.
Reviewed by: gallatin, erj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29102
Conrad Meyer [Wed, 29 Jan 2020 05:31:40 +0000 (05:31 +0000)]
ixgbe(4): Eliminate bogus sizeof() expressions
All of these uses of sizeof() were on the wrong type in relation to the pointer
passed to SYSCTL_ADD_PROC as arg1. Fortunately, none of the handlers actually
use arg2. So just don't pass a (non-zero) arg2.
Eric Joyner [Tue, 23 Jul 2019 18:14:32 +0000 (18:14 +0000)]
ixgbe(4): Fix enabling/disabling and reconfiguration of queues
- The wrong order of casting and bit shifting meant that enabling and disabling
queues didn't work properly for queue numbers larger than 32. Use literals
with the right suffix instead.
- The TX ring tail address was not updated during reinitialization of TX
structures, which could block sending traffic.
- Also remove unused variables 'eims' and 'active_queues'.
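The cast-versus-shift ordering in the first item is a common C pitfall. A
sketch of the corrected form, with the problematic form shown only as a
comment because evaluating it is undefined behavior; queue_mask() is a
hypothetical name and not the driver's function.

```c
#include <stdint.h>

/*
 * Problematic form, not compiled here:
 *
 *	mask = (uint64_t)(1 << queue);
 *
 * The shift is performed on a 32-bit int and only the result is
 * widened, so high queue numbers never produce the intended bit
 * (and the shift is undefined once it reaches the width of int).
 */

/* Corrected form: the literal is 64 bits wide before shifting. */
static uint64_t
queue_mask(unsigned queue)
{
	return (1ULL << queue);		/* valid for queue 0..63 */
}
```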
Warner Losh [Wed, 1 Sep 2021 19:37:27 +0000 (13:37 -0600)]
ppbus: Set the lock for pps interface, update to latest api
Since we take a lock when we enter the ioctl, we need to set driver_mtx
in the pps structure so it can be dropped while sleeping during a call
to timepps_fetch() with a non-zero timeout (PPS_CANWAIT feature).
MFC After: 5 days
Sponsored by: Netflix
Reviewed by: ian
Differential Revision: https://reviews.freebsd.org/D31763
unzip: sync with NetBSD upstream to add passphrase support
- Add support for password protected zip archives.
We use memset_s() rather than explicit_bzero() for better portability
(see PR).
- Use success/failure macro in exit()
- Mention ZIPX format in unzip(1)
Submitted by: Mingye Wang and Alex Kozlov (ak@)
PR: 244181
Reviewed by: mizhka
Obtained from: NetBSD
Differential Revision: https://reviews.freebsd.org/D28892
ng_ether: Create netgraph nodes for bridge interfaces.
Create netgraph nodes for bridge interfaces when the ng_ether module
is loaded. If a bridge interface is created after loading the ng_ether
module, a netgraph node is created via ether_ifattach().
We can't copyout() while holding a lock, in case it triggers a page
fault.
Release the lock before copyout, which is safe because we've already
copied all the data into the nvlist.
Wenzhuo Lu [Fri, 16 Oct 2015 02:51:09 +0000 (10:51 +0800)]
e1000: fix K1 configuration
This patch is for the following updates to the K1 configurations:
Tx idle period for entering K1 should be 128 ns.
Minimum Tx idle period in K1 should be 256 ns.
Philip Paeps [Wed, 29 Sep 2021 04:43:58 +0000 (12:43 +0800)]
contrib/tzdata: correct DST in Jordan and Samoa
Direct commit to stable/12.
The recent tzdata 2021b release includes several controversial changes
under active debate on the tz mailing list. Pending consensus, and
hopefully a 2021c release reflecting it, only merge the DST changes for
Jordan and Samoa. This corrects present and future timestamps in those
regions.
Alexander Motin [Wed, 15 Sep 2021 01:06:39 +0000 (21:06 -0400)]
ipmi(4): Limit maximum watchdog pre-timeout interval.
Previously the default pre-timeout interval of 120 seconds made it
impossible to set a timeout interval below that, resulting in error
0xcc (Invalid data field in Request), at least on Supermicro boards.
To fix that, limit the maximum pre-timeout interval to ~1/4 of the timeout
interval, which sounds like a reasonable default: not so short that it fires
too late, but also not so long that it gives many false reports.
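The limit can be sketched as a simple clamp. The 120-second default comes from
the message above; the function name clamp_pretimeout(), the use of seconds as
the unit, and integer division for the "~1/4" are assumptions of this example.

```c
/* Default pre-timeout in seconds, as mentioned in the message above. */
#define PRETIMEOUT_DEFAULT	120

/*
 * Limit the pre-timeout interval to a quarter of the timeout interval.
 * (Hypothetical helper; the exact rounding is an assumption.)
 */
static unsigned
clamp_pretimeout(unsigned pretimeout, unsigned timeout)
{
	unsigned limit = timeout / 4;

	return (pretimeout > limit ? limit : pretimeout);
}
```

With a 60-second timeout the default pre-timeout is reduced to 15 seconds,
while long timeouts keep the full 120 seconds.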