marius [Wed, 13 Feb 2019 16:02:55 +0000 (16:02 +0000)]
MFC: r343372
ixl(4): Fix handling data passed with ioctl from NVM update tool
From Krzysztof:
Ensure that the entire data buffer passed from the NVM update tool is copied in
to kernel space and copied back out to user space using copyin() and copyout().
PR: 234104
Submitted by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Reported by: Finn <ixbug@riseup.net>
Differential Revision: https://reviews.freebsd.org/D18817
marius [Wed, 13 Feb 2019 14:28:02 +0000 (14:28 +0000)]
MFC: r343203
ixgbe: this statement may fall through warnings with gcc
The recent gcc versions (7 and 8 at least) can check for switch case
statements for fall through (implicit-fallthrough). When fall through
is intentional, the default method for suppressing the warning is to place
a /* FALLTHROUGH */ comment immediately before the next case statement.
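As a minimal sketch of the convention described above (the function name and
the staged-cleanup scenario are hypothetical, not taken from ixgbe), each
intentional fall through is annotated right before the next case label:

```c
#include <assert.h>

/*
 * Hypothetical example: count how many cleanup steps run when starting
 * from a given stage.  Every stage intentionally falls through to the
 * next one, and each fall through is marked for the compiler.
 */
int
cleanup_steps(int stage)
{
	int steps = 0;

	switch (stage) {
	case 2:
		steps++;
		/* FALLTHROUGH */
	case 1:
		steps++;
		/* FALLTHROUGH */
	case 0:
		steps++;
		break;
	default:
		/* Unknown stage: nothing to clean up. */
		break;
	}
	return (steps);
}
```

Calling cleanup_steps(2) runs all three steps, while cleanup_steps(0) runs
only the last one.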
marius [Wed, 13 Feb 2019 14:25:05 +0000 (14:25 +0000)]
MFC: r333879, r342749
- Even though 64-bit atomics are supported on i386 there are panics
indicating that the code does not work correctly there. Switch
to mutex based variant (and fix that while we're here).
Reported by: pho, kib
- mp_ring: avoid items offset difference between iflib and mp_ring
on architectures without 64-bit atomics
Reported by: Augustin Cavalier <waddlesplash@gmail.com>
mav [Wed, 13 Feb 2019 00:39:28 +0000 (00:39 +0000)]
MFC r343586: Remove BIO_ORDERED flag from BIO_FLUSH sent by ZFS.
In all cases where ZFS sends BIO_FLUSH, it first waits for all related
writes to complete, so its BIO_FLUSH does not care about strict ordering.
Removing the flag makes life much easier, at least for the NVMe driver, whose
hardware has no concept of request ordering and relies completely on software.
mav [Wed, 13 Feb 2019 00:38:28 +0000 (00:38 +0000)]
MFC r343582,r343588:Relax BIO_FLUSH ordering in da(4), respecting BIO_ORDERED.
r212160 tightened this from always using MSG_SIMPLE_Q_TAG to always
MSG_ORDERED_Q_TAG. Since it also marked all BIO_FLUSH requests with
BIO_ORDERED, this commit changes nothing immediately, but it gives
BIO_FLUSH callers back the ability to specify the ordering they really
need, as with other request types.
mav [Wed, 13 Feb 2019 00:35:09 +0000 (00:35 +0000)]
MFC r343585: Only sort requests of types that have concept of offset.
Other types, such as BIO_FLUSH or BIO_ZONE, or especially new/unknown ones,
may imply some degree of ordering even if strict ordering is not requested
explicitly.
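The rule above can be sketched as a small predicate. This is only an
illustration: the enum and function names are hypothetical, and treating
read, write and delete as the offset-bearing types is an assumption, not
something stated in this log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request types, loosely named after the ones in the text. */
enum req_cmd { REQ_READ, REQ_WRITE, REQ_DELETE, REQ_FLUSH, REQ_ZONE };

/*
 * Return true only for request types that carry a meaningful offset and
 * can therefore be sorted by it (assumed here: read, write, delete).
 * Everything else, including values this code does not know about, is
 * reported as unsortable so that a caller can keep it in arrival order.
 */
bool
req_is_sortable(int cmd)
{
	switch (cmd) {
	case REQ_READ:
	case REQ_WRITE:
	case REQ_DELETE:
		return (true);
	default:
		return (false);
	}
}
```

Defaulting to "unsortable" is the conservative choice for new or unknown
types, which matches the concern expressed in the commit message.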
ngie [Tue, 12 Feb 2019 23:37:20 +0000 (23:37 +0000)]
MFC r342598:
Remove legacy rc.d infrastructure references from rc(8)
Legacy rc.d scripts (.sh extension) have not been supported since
r193118. Remove the outdated references to the legacy format, as they
are no longer valid.
ram [Tue, 12 Feb 2019 17:05:59 +0000 (17:05 +0000)]
MFC r342946: Remove accessing remote node and domain objects
while processing cam actions.
Issue:
ocs_fc(4) driver panics. It's induced by setting the port_state
sysctl to offline, then online, then offline, then online, and so
forth and so on in rapid succession.
Reason:
When port_state is set to online, FC discovery starts and the OS
enumerates the target disks by calling ocs_action(). Setting the
port state to "offline" at that point deletes the domain/sport/nodes.
In ocs_action()->XPT_GET_TRAN_SETTINGS we access the remote
node, which can be invalid, to get the wwpn, wwnn and port.
Fix:
Removed accessing of remote node and domain in some ocs_action() cases.
Populated the required values from ocs_fcport.
This removes the dependency of node and domain structures while
processing XPT_PATH_INQ and XPT_GET_TRAN_SETTINGS.
We will invalidate the target entries after the device lost
timeout(30 seconds).
kib [Tue, 12 Feb 2019 16:56:10 +0000 (16:56 +0000)]
Fix PAE modules build on i386.
Reimplement PAE version of pte_load() by copying/pasting the
atomic_load_acq_64_i586() into its definition. pmap_kextract() is defined
as inline and uses pte_load() in its body, so the pte_load() should be
available when pmap.h is included. On stable/11, the atomic inlines are
not exposed to modules.
This is a direct commit to stable/11.
Reported by: dim
Sponsored by: The FreeBSD Foundation
vmaffione [Tue, 12 Feb 2019 09:26:05 +0000 (09:26 +0000)]
MFC r343772, r343867
netmap: refactor logging macros and pipes
Changelist:
- Replace ND, D and RD macros with nm_prdis, nm_prinf, nm_prerr
and nm_prlim, to avoid possible naming conflicts.
- Add netmap_krings_mode_commit() helper function and use that
to reduce code duplication.
- Refactor pipes control code to export some functions that
can be reused by the veth driver (on Linux) and epair(4).
- Add check to reject API requests with version less than 11.
- Small code refactoring for the null adapter.
ngie [Tue, 12 Feb 2019 03:13:10 +0000 (03:13 +0000)]
MFC r342904:
route(8): clarify -prefixlen description
Try to reword -prefixlen section to more clearly and accurately describe how
the -prefixlen modifier works.
While here, fix a word that igor considered a typo: aggregatable addresses is a
valid technical term per RFC-2374, however, it was superseded by the term
"aggregator" in RFC-3587.
mav [Tue, 12 Feb 2019 00:53:43 +0000 (00:53 +0000)]
MFC r343562, r343563: Reimplement BIO_ORDERED handling in nvd(4).
This fixes BIO_ORDERED semantics while also improving performance by:
- sleeping also before BIO_ORDERED bio, as defined, not only after;
- not queueing BIO_ORDERED bio to taskqueue if no other bios running;
- waking up sleeping taskqueue explicitly rather than relying on polling.
On Samsung SSD 970 PRO this shows sync write latency, measured with
`diskinfo -wS`, reduction from ~2ms to ~1.1ms by not sleeping without
reason till next HZ tick.
On the same device ZFS pool with 8 ZVOLs synchronously writing 4KB blocks
shows ~950 IOPS instead of ~750 IOPS before. I suspect ZFS does not need
BIO_ORDERED on BIO_FLUSH at all, but that will be next question.
kp [Mon, 11 Feb 2019 19:08:03 +0000 (19:08 +0000)]
MFC r343520:
pfctl: Point users to net.pf.request_maxcount if large requests are rejected
The kernel will reject very large tables to avoid resource exhaustion
attacks. Some users run into this limit with legitimate table
configurations.
The error message in this case was not very clear:
pf.conf:1: cannot define table nets: Invalid argument
pfctl: Syntax error in config file: pf rules not loaded
If a table definition fails we now check the request_maxcount sysctl,
and if we've tried to create more than that point the user at
net.pf.request_maxcount:
pf.conf:1: cannot define table nets: too many elements.
Consider increasing net.pf.request_maxcount.
pfctl: Syntax error in config file: pf rules not loaded
bcr [Mon, 11 Feb 2019 17:48:52 +0000 (17:48 +0000)]
MFC r343921:
Add an example to pw.8 about how to add an existing user to a group.
Instead of using pw to modify group membership, users often edit
/etc/group by hand, which is discouraged. Provide an example of
adding a user to the wheel group, which is a common use case.
I'm using a different user here than in the previous example as that
deleted the user (although the examples don't necessarily have to
be followed in order).
ram [Mon, 11 Feb 2019 16:28:04 +0000 (16:28 +0000)]
MFC r336446: Implemented Device Lost Timer,
which is used to give target device the time to recover before marking dead.
Issue: IO fails immediately after doing port-toggle.
Fix: Added LDT (Device Lost Timer): we wait a specific period of time before telling the OS about a lost device.
mav [Mon, 11 Feb 2019 14:49:10 +0000 (14:49 +0000)]
MFC r343728: Check element type before setting LEDs.
With r319610, sesutil started twiddling the bits of every SES device.
Not everything is a disk slot, there are also fan controllers, temperature
sensors, even power supplies, among other things controlled by SES.
Add a type check to make sure we are only operating on device slot and array
device slot elements. Other type elements will be skipped, but it would be
simple to add additional cases for controlling the ident LEDs of other
element types (which are not necessarily the same bits).
Rather than doing raw bit manipulation of an unstructured byte array using
unnamed numeric constants, leverage existing code abstractions.
Submitted by: Ryan Moeller <ryan@freqlabs.com>
Sponsored by: iXsystems, Inc.
avos [Mon, 11 Feb 2019 00:31:58 +0000 (00:31 +0000)]
MFC r343815:
iwn(4): plug initialization path vs interrupt handler races
There are a few places in the interrupt handler where the driver
lock is dropped; ensure that the device is still running before
processing the remaining ring entries.
For 11n / 11ac we are still using non-11n rates for management and
multicast traffic by default; check 'MCS rate' bit to determine how
to print them correctly.
This fixes 'Assertion failed: ((VT.getVectorNumElements() +
N2C->getZExtValue() <= N1.getValueType().getVectorNumElements()) &&
"Extract subvector overflow!"), function getNode' when building the
multimedia/aom port (with AVX2 enabled).
mav [Sat, 9 Feb 2019 02:10:03 +0000 (02:10 +0000)]
MFC r343673: Fix integer math overflow in UMA hash_alloc().
512GB of ZFS ABD ARC means abd_chunk zone of 128M 4KB items. To manage
them UMA tries to allocate a 2GB hash table, whose size does not fit into
the int variable, causing a later allocation failure, which makes ARC shrink
back below 512GB and prevents it from using more RAM. With this change I
easily reached >700GB ARC size on 768GB RAM machine.
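The arithmetic behind this overflow can be sketched as follows. The helper
names are hypothetical, the 16-byte entry size is an assumption chosen only
so that 128M entries come out at 2GB, and a 32-bit int is assumed:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Compute a table size in bytes using a 64-bit type. */
uint64_t
table_bytes(uint64_t nentries, uint64_t entry_size)
{
	return (nentries * entry_size);
}

/* Report whether a byte count can be stored in a plain int. */
bool
fits_in_int(uint64_t bytes)
{
	return (bytes <= (uint64_t)INT_MAX);
}
```

With 128M entries of 16 bytes each, table_bytes() yields 2^31 bytes, which is
one more than INT_MAX on platforms with a 32-bit int, so fits_in_int()
returns false. Keeping the size in a wider type avoids the problem.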
vmaffione [Thu, 7 Feb 2019 10:44:53 +0000 (10:44 +0000)]
MFC r343689
netmap: upgrade sync-kloop support
Add SYNC_KLOOP_MODE option, and add support for direct mode, where application
executes the TXSYNC and RXSYNC in the context of the ioeventfd wake up callback.
dim [Thu, 7 Feb 2019 06:55:26 +0000 (06:55 +0000)]
MFC r343748:
Use NLDT to get number of LDTs on i386
Compiling a GENERIC kernel for i386 with clang 8.0 results in the
following warning:
/usr/src/sys/i386/i386/sys_machdep.c:542:40: error: 'sizeof ((ldt))' will return the size of the pointer, not the array itself [-Werror,-Wsizeof-pointer-div]
nldt = pldt != NULL ? pldt->ldt_len : nitems(ldt);
^~~~~~~~~~~
/usr/src/sys/sys/param.h:299:32: note: expanded from macro 'nitems'
#define nitems(x) (sizeof((x)) / sizeof((x)[0]))
~~~~~~~~~~~ ^
Indeed, 'ldt' is declared as 'union descriptor *', so nitems() is not
the right way to determine the number of LDTs. Instead, the NLDT define
from sys/x86/include/segments.h should be used.
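A minimal userland sketch of the distinction, using hypothetical names
(NENTRIES stands in for NLDT, and the structures are invented for the
example): nitems() is only meaningful on a real array, so a table reached
through a pointer needs its length from a constant or a stored field.

```c
#include <assert.h>
#include <stddef.h>

#define	nitems(x)	(sizeof((x)) / sizeof((x)[0]))

#define	NENTRIES	17	/* hypothetical stand-in for NLDT */

struct entry { unsigned long long raw; };	/* hypothetical descriptor */
struct proc_table { size_t len; };		/* hypothetical per-process table */

static struct entry table[NENTRIES];

/* Valid: 'table' is a true array, so nitems() yields its element count. */
size_t
count_static_table(void)
{
	return (nitems(table));
}

/*
 * For a table that is only reachable through a pointer, sizeof would
 * measure the pointer itself.  Use the stored length when there is one,
 * and fall back to the named constant otherwise.
 */
size_t
table_len(const struct proc_table *pt)
{
	return (pt != NULL ? pt->len : NENTRIES);
}
```

Here table_len(NULL) returns NENTRIES, mirroring the shape of the corrected
expression in the commit.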
vmaffione [Wed, 6 Feb 2019 09:49:42 +0000 (09:49 +0000)]
MFC r343346
netmap: improvements to the netmap kloop (CSB mode)
Changelist:
- Add the proper memory barriers in the kloop ring processing
functions.
- Fix memory barriers usage in the user helpers (nm_sync_kloop_appl_write,
nm_sync_kloop_appl_read).
- Fix nm_kr_txempty() helper to look at rhead rather than rcur. This
is important since the kloop can read a value of rcur which is ahead
of the value of rhead (see explanation in nm_sync_kloop_appl_write)
- Remove obsolete ptnetmap_guest_write_kring_csb() and
ptnet_guest_read_kring_csb().
- Prepare in advance the arguments for netmap_sync_kloop_[tr]x_ring(),
to make the kloop faster.
- Provide kernel and user implementation for nm_ldld_barrier() and
nm_ldst_barrier()
vmaffione [Wed, 6 Feb 2019 09:38:44 +0000 (09:38 +0000)]
MFC r343344
netmap: fix knote() argument to match the mutex state
The nm_os_selwakeup function needs to call knote() to wake up kqueue(9)
users. However, this function can be called from different code paths,
with different lock requirements.
This patch fixes the knote() call argument to match the relevant lock state.
Also, comments have been updated to reflect current code.
avos [Wed, 6 Feb 2019 01:53:01 +0000 (01:53 +0000)]
MFC r343697:
net80211(4): fix rate check when 'roaming' ifconfig(8) option is set to 'auto'
Do not try to clear 'basic rate' bit from roamRate; it cannot be here and,
actually, this operation clears 'MCS rate' bit instead, breaking comparison
for 11n / 11ac modes.
Remove ipsd (IP Scan Detector). It is unused and to my knowledge has
never been used on any platform that ipfilter has been on. However
it looks like it could be a useful utility, therefore there are plans
to make it a port one day. It lacks a man page as well.
When cleaning up a vnet we free the counters in V_pf_default_rule and
V_pf_status from shutdown_pf(), but we can still use them later, for example
through pf_purge_expired_src_nodes().
Free them as the very last operation, as they rely on nothing else themselves.
brooks [Wed, 30 Jan 2019 23:38:42 +0000 (23:38 +0000)]
MFC r340242:
Add a top-level make target to rebuild all sysent files.
The sysent target is useful when changing makesyscalls.sh, when
making paired changes to syscalls.master files, or in a future where
freebsd32 sysent entries are built from the default syscalls.master.
vmaffione [Tue, 29 Jan 2019 18:18:55 +0000 (18:18 +0000)]
ixl: remove unnecessary limitations related to netmap
Netmap supports the case where TX rings and RX rings have different size.
Remove unnecessary limitations related to netmap support, making the code
simpler.
Also, check that the value of the hw head index written back from the NIC
is valid.
kp [Tue, 29 Jan 2019 17:49:39 +0000 (17:49 +0000)]
MFC r343295:
pf: Validate psn_len in DIOCGETSRCNODES
psn_len is controlled by user space, but we allocated memory based on it.
Check how much memory we might need at most (i.e. how many source nodes we
have) and limit the allocation to that.
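The validation described above amounts to clamping a user-supplied length
to what the kernel could actually need. A minimal sketch, with a
hypothetical function name and parameters rather than the actual pf code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Limit a user-controlled length to the most that could be needed for
 * 'count' items of 'item_size' bytes.  The multiplication is guarded so
 * that the upper bound itself cannot wrap around.
 */
size_t
clamp_alloc_len(size_t user_len, size_t count, size_t item_size)
{
	size_t needed;

	if (item_size != 0 && count > SIZE_MAX / item_size)
		needed = SIZE_MAX;
	else
		needed = count * item_size;
	return (user_len < needed ? user_len : needed);
}
```

A request for 1GB when only ten 64-byte items exist is reduced to 640 bytes,
while a smaller request is left unchanged.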
pfg [Mon, 28 Jan 2019 02:12:48 +0000 (02:12 +0000)]
MFC r343459: (partial)
ext2fs: Add some extra consistency checks for the superblock.
Maliciously formed, or badly corrupted, filesystems can cause kernel
panics. In general, such acts of foot-shooting can only be accomplished
by root, but in a world with VM images that is moving towards automated
mounts it is important to have some form of prevention.
Reported by: Christopher Krah, Thomas Barabosch, and Jan-Niclas Hilgert
of Fraunhofer FKIE.
Incidentally this should also fix a memory corruption issue reported by
Dr Silvio Cesare of InfoSect.
Huge thanks to all researchers for making us aware of the issue.
Note: for the MFC to stable/11 several changes had to be made.
avos [Mon, 28 Jan 2019 01:50:47 +0000 (01:50 +0000)]
MFC r343238:
urtw(4): add length checks in Rx path.
- Check if buffer can contain Rx descriptor before accessing it.
- Verify upper / lower bounds for frame length.
- Do not pass too short frames into ieee80211_find_rxnode().
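The three checks listed above can be sketched in userland as a single
predicate. All constants and names below are hypothetical placeholders, not
the values used by urtw(4):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	RX_DESC_SIZE	16	/* hypothetical Rx descriptor size */
#define	FRAME_MIN	10	/* hypothetical shortest acceptable frame */
#define	FRAME_MAX	2048	/* hypothetical longest acceptable frame */

/*
 * Validate a received buffer before touching its contents: the buffer
 * must hold the descriptor, the frame length must be within bounds, and
 * the frame must fit in what remains after the descriptor.
 */
bool
rx_len_ok(size_t buflen, size_t framelen)
{
	if (buflen < RX_DESC_SIZE)
		return (false);
	if (framelen < FRAME_MIN || framelen > FRAME_MAX)
		return (false);
	if (framelen > buflen - RX_DESC_SIZE)
		return (false);
	return (true);
}
```

Checking the descriptor size first keeps the later subtraction from
wrapping around on very short buffers.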
avos [Mon, 28 Jan 2019 01:37:36 +0000 (01:37 +0000)]
MFC r343234:
run(4): add more length checks in Rx path.
- Discard frames that are bigger than MCLBYTES (to prevent buffer overrun).
- Check buffer length before accessing its contents.
- Fix len <-> dmalen check - the last includes Rx Wireless information
structure size.
- Fix out-of-bounds read during Rx node search for ACK / CTS frames
(monitor mode only).
While here:
- Mark few suspicious places with comments.
- Move common cleanup to the function end.
avos [Mon, 28 Jan 2019 01:12:20 +0000 (01:12 +0000)]
MFC r343340:
net80211: fix channel list construction for non-auto operating mode.
Change how the channel list mode <-> desired mode match is done:
- Match channel list mode for the following non-auto desired modes:
* 11b: 11g, 11ng;
* 11a: 11na
- Add pre-defined channels only when one of the following conditions is met:
* the desired channel mode is 'auto' or
* the desired channel and selected channel list modes are exactly
the same or
* the previous rule (11g / 11n promotion) applies.
Before r275875 construction worked properly for all modes except
11ng / 11na, which were completely broken
(i.e., the scan list was empty); after r275875 all checks were removed,
so the scan table was populated by all device-compatible channels
(the desired mode was ignored).
For example, if I set 'ifconfig wlan0 mode 11ng' for RTL8821AU:
- pre-r275875: nothing, scan will not work;
- after r275875: both 11ng and 11na bands were scanned; also, since 11b
channel list was used, 14th channel was scanned too.
- after this change: only 11ng - 1-13 channels - are used for scanning.
Note: since 11-stable does not have VHT mode definitions
they were removed from this merge.
marius [Sun, 27 Jan 2019 19:04:28 +0000 (19:04 +0000)]
MFC: r342634 (partial)
o Don't allocate resources for SDMA in sdhci(4) if the controller or the
front-end doesn't support SDMA or the latter implements a platform-
specific transfer method instead. While at it, factor out allocation
and freeing of SDMA resources to sdhci_dma_{alloc,free}() in order to
keep the code more readable when adding support for ADMA variants.
o Base the size of the SDMA bounce buffer on MAXPHYS up to the maximum
of 512 KiB instead of using a fixed 4-KiB-buffer. With the default
MAXPHYS of 128 KiB and depending on the controller and medium, this
reduces the number of SDHCI interrupts by a factor of ~16 to ~32 on
sequential reads while an increase of throughput of up to ~84 % was
seen.
Front-ends for broken controllers that only support an SDMA buffer
boundary of a specific size may set SDHCI_QUIRK_BROKEN_SDMA_BOUNDARY
and supply a size via struct sdhci_slot. According to Linux, only -
unsupported in stable/11 anyway - Qualcomm MSM-type SDHCI controllers
are affected by this, though.
Requested by: Shreyank Amartya (unconditional bump to 512 KiB)
o Introduce a SDHCI_DEPEND macro for specifying the dependency of the
front-end modules on the sdhci(4) one and bump the module version
of sdhci(4) to 2 via an also newly introduced SDHCI_VERSION in order
to ensure that all components are in sync WRT struct sdhci_slot.
o In sdhci(4):
- Make pointers const where applicable, and
- replace a few device_printf(9) calls with slot_printf() for
consistency.
marius [Sun, 27 Jan 2019 14:36:52 +0000 (14:36 +0000)]
MFC: r333745, r333764, r337533, r339375, r341041
- ck: add support for executing callbacks outside of main poll loop
Pull in change from upstream deca119d14bfffd440770eb67cbdbeaf7b57eb7b
- Import CK as of commit deca119d14bfffd440770eb67cbdbeaf7b57eb7b.
This is mostly a noop, for mergeinfo purpose, because the relevant changes
were committed directly.
- Import CK as of commit 08813496570879fbcc2adcdd9ddc0a054361bfde, mostly
to avoid using lwsync on ppc32.
- Import CK as of commit 5221ae2f3722a78c7fc41e47069ad94983d3bccb.
This fixes two problems, one where epoch calls could occur before all
the readers had exited the epoch section, and one where the epoch calls
could be unnecessarily delayed.
- Import CK as of 21d3e319407d19dece16ee317c757ffc54a452bc, which makes its
sparcv9 atomics compatible with the FreeBSD kernel by using instructions
which access the appropriate address space.
Since r287197 ieee80211com is a part of drivers softc; as a result,
after detach all pointers to it (iv_ic, ni_ic) are invalid. Most
possible users (tasks, interrupt handlers) are blocked / removed
when device is stopped; however, ioctl handlers were not tracked
and may crash if ieee80211com structure is accessed.
Since ieee80211com pointer access from ieee80211vap structure is not
protected by lock (constant after interface creation) and used in
many other places just use reference counting for ioctl handlers;
on detach set 'detached' flag and wait until reference counter goes to 0.
For KBI stability the last element of iv_spare[] array was reused.
avos [Sat, 26 Jan 2019 13:36:06 +0000 (13:36 +0000)]
MFC r343249:
Fix duplicate wpa_supplicant(8) / hostapd(8) startup with devd(8)
Do not invoke 'wlan_up' function from devd(8) on interface
creation event (an example to create such event:
'ifconfig wlan0 create wlandev rtwn0');
they're typically produced during 'service netif (re)start'
and result in duplicate interface initialization.
From the user side, if the WPA option is used, this results in messages like:
- /etc/rc.d/wpa_supplicant: WARNING: failed to start wpa_supplicant
or
- wpa_supplicant already running? (pid=xxxx).
(for HOSTAP interfaces this race may result in startup failure).
As a side effect, wpa_supplicant(8) / hostapd(8) will not be
invoked when new wlan(4) interface is created manually and
corresponding configuration for it is present in rc.conf(5).
This change does not affect device attach / removal events.