When cleaning up a vnet we free the counters in V_pf_default_rule and
V_pf_status from shutdown_pf(), but we can still use them later, for example
through pf_purge_expired_src_nodes().
Free them as the very last operation, as they rely on nothing else themselves.
brooks [Wed, 30 Jan 2019 23:38:42 +0000 (23:38 +0000)]
MFC r340242:
Add a top-level make target to rebuild all sysent files.
The sysent target is useful when changing makesyscalls.sh, when
making paired changes to syscalls.master files, or in a future where
freebsd32 sysent entries are built from the default syscalls.master.
vmaffione [Tue, 29 Jan 2019 18:18:55 +0000 (18:18 +0000)]
ixl: remove unnecessary limitations related to netmap
Netmap supports the case where TX rings and RX rings have different size.
Remove unnecessary limitations related to netmap support, making the code
simpler.
Also, check that the value of the hw head index written back from the NIC
is valid.
kp [Tue, 29 Jan 2019 17:49:39 +0000 (17:49 +0000)]
MFC r343295:
pf: Validate psn_len in DIOCGETSRCNODES
psn_len is controlled by user space, but we allocated memory based on it.
Check how much memory we might need at most (i.e. how many source nodes we
have) and limit the allocation to that.
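A minimal sketch of the clamping described above, not the pf code itself.
The helper name, its parameters, and the choice to return 0 on overflow are
assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of pf): limit a user-supplied buffer
 * length to what node_count records of node_size bytes can ever need.
 * Returns 0 if the worst-case size would overflow size_t.
 */
static size_t
clamp_request_len(size_t requested, size_t node_count, size_t node_size)
{
	size_t needed;

	if (node_size != 0 && node_count > SIZE_MAX / node_size)
		return (0);
	needed = node_count * node_size;
	return (requested < needed ? requested : needed);
}
```

The allocation is then sized from the returned value instead of from the
raw user-space length.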
pfg [Mon, 28 Jan 2019 02:12:48 +0000 (02:12 +0000)]
MFC r343459: (partial)
ext2fs: Add some extra consistency checks for the superblock.
Maliciously formed, or badly corrupted, filesystems can cause kernel
panics. In general, such acts of foot-shooting can only be accomplished
by root, but in a world with VM images that is moving towards automated
mounts it is important to have some form of prevention.
Reported by: Christopher Krah, Thomas Barabosch, and Jan-Niclas Hilgert
of Fraunhofer FKIE.
Incidentally this should also fix a memory corruption issue reported by
Dr Silvio Cesare of InfoSect.
Huge thanks to all researchers for making us aware of the issue.
Note: for the MFC to stable/11 several changes had to be made.
avos [Mon, 28 Jan 2019 01:50:47 +0000 (01:50 +0000)]
MFC r343238:
urtw(4): add length checks in Rx path.
- Check if buffer can contain Rx descriptor before accessing it.
- Verify upper / lower bounds for frame length.
- Do not pass too short frames into ieee80211_find_rxnode().
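The three checks above could look roughly like the following sketch. All
sizes and names here are hypothetical placeholders, not the constants
urtw(4) actually uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sizes only; the real driver has its own constants. */
#define RX_DESC_LEN	16	/* hypothetical Rx descriptor size */
#define FRAME_MIN_LEN	10	/* hypothetical shortest acceptable frame */
#define FRAME_MAX_LEN	2048	/* hypothetical largest acceptable frame */

/* Return true only if the descriptor and the frame both fit the buffer. */
static bool
rx_frame_ok(size_t buf_len, size_t frame_len)
{
	/* The buffer must hold the Rx descriptor before we read it. */
	if (buf_len < RX_DESC_LEN)
		return (false);
	/* Lower and upper bounds on the reported frame length. */
	if (frame_len < FRAME_MIN_LEN || frame_len > FRAME_MAX_LEN)
		return (false);
	/* The frame must fit in what remains after the descriptor. */
	if (frame_len > buf_len - RX_DESC_LEN)
		return (false);
	return (true);
}
```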
avos [Mon, 28 Jan 2019 01:37:36 +0000 (01:37 +0000)]
MFC r343234:
run(4): add more length checks in Rx path.
- Discard frames that are bigger than MCLBYTES (to prevent buffer overrun).
- Check buffer length before accessing its contents.
- Fix len <-> dmalen check - the last includes Rx Wireless information
structure size.
- Fix out-of-bounds read during Rx node search for ACK / CTS frames
(monitor mode only).
While here:
- Mark few suspicious places with comments.
- Move common cleanup to the function end.
avos [Mon, 28 Jan 2019 01:12:20 +0000 (01:12 +0000)]
MFC r343340:
net80211: fix channel list construction for non-auto operating mode.
Change the way how channel list mode <-> desired mode match is done:
- Match channel list mode for next non-auto desired modes:
* 11b: 11g, 11ng;
* 11a: 11na
- Add pre-defined channels only when one of the next conditions met:
* the desired channel mode is 'auto' or
* the desired channel and selected channel list modes are exactly
the same or
* the previous rule (11g / 11n promotion) applies.
Before r275875 construction worked properly for all except
11ng / 11na modes - these were completely broken
(i.e., the scan list was empty); after r275875 all checks were removed,
so scan table was populated by all device-compatible channels
(desired mode was ignored).
For example, if I set 'ifconfig wlan0 mode 11ng' for RTL8821AU:
- pre-r275875: nothing, scan will not work;
- after r275875: both 11ng and 11na bands were scanned; also, since 11b
channel list was used, 14th channel was scanned too.
- after this change: only 11ng - 1-13 channels - are used for scanning.
Note: since 11-stable does not have VHT mode definitions
they were removed from this merge.
marius [Sun, 27 Jan 2019 19:04:28 +0000 (19:04 +0000)]
MFC: r342634 (partial)
o Don't allocate resources for SDMA in sdhci(4) if the controller or the
front-end doesn't support SDMA or the latter implements a platform-
specific transfer method instead. While at it, factor out allocation
and freeing of SDMA resources to sdhci_dma_{alloc,free}() in order to
keep the code more readable when adding support for ADMA variants.
o Base the size of the SDMA bounce buffer on MAXPHYS up to the maximum
of 512 KiB instead of using a fixed 4-KiB-buffer. With the default
MAXPHYS of 128 KiB and depending on the controller and medium, this
reduces the number of SDHCI interrupts by a factor of ~16 to ~32 on
sequential reads while an increase of throughput of up to ~84 % was
seen.
Front-ends for broken controllers that only support an SDMA buffer
boundary of a specific size may set SDHCI_QUIRK_BROKEN_SDMA_BOUNDARY
and supply a size via struct sdhci_slot. According to Linux, only -
unsupported in stable/11 anyway - Qualcomm MSM-type SDHCI controllers
are affected by this, though.
Requested by: Shreyank Amartya (unconditional bump to 512 KiB)
o Introduce a SDHCI_DEPEND macro for specifying the dependency of the
front-end modules on the sdhci(4) one and bump the module version
of sdhci(4) to 2 via an also newly introduced SDHCI_VERSION in order
to ensure that all components are in sync WRT struct sdhci_slot.
o In sdhci(4):
- Make pointers const where applicable, and
- replace a few device_printf(9) calls with slot_printf() for
consistency.
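The bounce buffer sizing from the second item above can be sketched as
follows. This is an illustration under the assumption of one interrupt per
buffer boundary; the function names are hypothetical and not taken from
sdhci(4).

```c
#include <stddef.h>

#define KIB		1024u
#define SDMA_BUF_MAX	(512u * KIB)	/* cap stated in the commit message */
#define SDMA_BUF_OLD	(4u * KIB)	/* previous fixed buffer size */

/* Bounce buffer size: MAXPHYS, capped at 512 KiB. */
static size_t
sdma_bounce_size(size_t maxphys)
{
	return (maxphys < SDMA_BUF_MAX ? maxphys : SDMA_BUF_MAX);
}

/* Buffer-sized chunks needed for a transfer, rounding up. */
static size_t
sdma_chunks(size_t xfer_len, size_t buf_size)
{
	return ((xfer_len + buf_size - 1) / buf_size);
}
```

For a 128 KiB transfer this gives 32 chunks with the old 4 KiB buffer and 1
with a 128 KiB buffer, which matches the upper end of the ~16 to ~32 range
quoted above.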
marius [Sun, 27 Jan 2019 14:36:52 +0000 (14:36 +0000)]
MFC: r333745, r333764, r337533, r339375, r341041
- ck: add support for executing callbacks outside of main poll loop
Pull in change from upstream deca119d14bfffd440770eb67cbdbeaf7b57eb7b
- Import CK as of commit deca119d14bfffd440770eb67cbdbeaf7b57eb7b.
This is mostly a noop, for mergeinfo purpose, because the relevant changes
were committed directly.
- Import CK as of commit 08813496570879fbcc2adcdd9ddc0a054361bfde, mostly
to avoid using lwsync on ppc32.
- Import CK as of commit 5221ae2f3722a78c7fc41e47069ad94983d3bccb.
This fixes two problems, one where epoch calls could occur before all
the readers had exited the epoch section, and one where the epoch calls
could be unnecessarily delayed.
- Import CK as of 21d3e319407d19dece16ee317c757ffc54a452bc, which makes its
sparcv9 atomics compatible with the FreeBSD kernel by using instructions
which access the appropriate address space.
Since r287197 ieee80211com is a part of drivers softc; as a result,
after detach all pointers to it (iv_ic, ni_ic) are invalid. Most
possible users (tasks, interrupt handlers) are blocked / removed
when device is stopped; however, ioctl handlers were not tracked
and may crash if ieee80211com structure is accessed.
Since ieee80211com pointer access from ieee80211vap structure is not
protected by lock (constant after interface creation) and used in
many other places just use reference counting for ioctl handlers;
on detach set 'detached' flag and wait until reference counter goes to 0.
For KBI stability the last element of iv_spare[] array was reused.
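A minimal sketch of the reference counting scheme described above, using
C11 atomics instead of the kernel primitives. The structure and function
names are hypothetical; the real code sleeps until the counter drains,
while this sketch only exposes the condition being waited for.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical gate guarding ioctl handlers; not the net80211 layout. */
struct ioctl_gate {
	atomic_uint	refs;		/* handlers currently inside */
	atomic_bool	detached;	/* set once detach has started */
};

/* Take a reference first, then back out if detach already began. */
static bool
gate_enter(struct ioctl_gate *g)
{
	atomic_fetch_add(&g->refs, 1);
	if (atomic_load(&g->detached)) {
		atomic_fetch_sub(&g->refs, 1);
		return (false);
	}
	return (true);
}

static void
gate_exit(struct ioctl_gate *g)
{
	atomic_fetch_sub(&g->refs, 1);
}

/* Detach side: refuse new handlers, then wait for gate_drained(). */
static void
gate_mark_detached(struct ioctl_gate *g)
{
	atomic_store(&g->detached, true);
}

static bool
gate_drained(struct ioctl_gate *g)
{
	return (atomic_load(&g->refs) == 0);
}
```

Incrementing before testing the flag means a handler that races with
detach is either seen by the detach side or sees the flag itself.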
avos [Sat, 26 Jan 2019 13:36:06 +0000 (13:36 +0000)]
MFC r343249:
Fix duplicate wpa_supplicant(8) / hostapd(8) startup with devd(8)
Do not invoke 'wlan_up' function from devd(8) on interface
creation event (an example to create such event:
'ifconfig wlan0 create wlandev rtwn0');
they're typically produced during 'service netif (re)start'
and result in duplicate interface initialization.
From the user side if WPA option is used, this results in messages like:
- /etc/rc.d/wpa_supplicant: WARNING: failed to start wpa_supplicant
or
- wpa_supplicant already running? (pid=xxxx).
(for HOSTAP interfaces this race may result in startup failure).
As a side effect, wpa_supplicant(8) / hostapd(8) will not be
invoked when new wlan(4) interface is created manually and
corresponding configuration for it is present in rc.conf(5).
This change does not affect device attach / removal events.
avos [Sat, 26 Jan 2019 12:35:06 +0000 (12:35 +0000)]
MFC r343190:
net80211: drop m_pullup call from ieee80211_crypto_decap.
For most wireless drivers Rx mbuf is allocated as one
contiguous chunk; only few are using chains for allocations -
but even then at least MCLBYTES (minus Rx descriptor size) is
available in the first mbuf.
In addition to the above, m_pullup was never called here - otherwise,
reallocation would break post-crypto_decap logic (ieee80211_decap,
ieee80211_deliver_data...), so just remove it; length check is left
in case some truncated frame appears here.
jilles [Fri, 25 Jan 2019 22:52:49 +0000 (22:52 +0000)]
MFC r343105: libedit: Avoid out of bounds read in 'bind' command
This is CVS revision 1.31 from NetBSD lib/libedit/chartype.c:
Make sure that argv is NULL terminated since functions like tty_stty rely
on it to be so (Gerry Swinslow)
This broke when the wide-character support was enabled in libedit. The
conversion from multibyte to wide-character did not supply the apparently
expected terminating NULL in the new argv array.
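A sketch of a multibyte to wide-character argv conversion that keeps the
terminating NULL. This is not the libedit code; widen_argv() is a
hypothetical helper, and it relies on the POSIX behaviour of mbstowcs()
with a NULL destination to measure each string.

```c
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical converter (not libedit's): allocate argc + 1 slots so
 * that the last one is always the NULL terminator callers expect.
 */
static wchar_t **
widen_argv(int argc, const char **argv)
{
	wchar_t **wargv;
	size_t len;
	int i;

	wargv = calloc((size_t)argc + 1, sizeof(*wargv));
	if (wargv == NULL)
		return (NULL);
	for (i = 0; i < argc; i++) {
		/* POSIX: a NULL destination returns the required length. */
		len = mbstowcs(NULL, argv[i], 0);
		if (len == (size_t)-1)
			goto fail;
		wargv[i] = malloc((len + 1) * sizeof(wchar_t));
		if (wargv[i] == NULL)
			goto fail;
		mbstowcs(wargv[i], argv[i], len + 1);
	}
	wargv[argc] = NULL;	/* explicit terminator */
	return (wargv);
fail:
	while (i-- > 0)
		free(wargv[i]);
	free(wargv);
	return (NULL);
}
```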
mav [Fri, 25 Jan 2019 20:00:59 +0000 (20:00 +0000)]
MFC r342557, r342559: Reimplement nvd(4) detach handling.
Previous code typically crashed in case of NVMe device unplug or even clean
detach while some I/Os are still in flight. To fix this the new code calls
disk_gone() and waits for confirmation of all references gone before calling
disk_destroy(), freeing other resources and allowing controller detach.
While there, fix disk lists locking and reimplement unit numbers assignment.
tuexen [Fri, 25 Jan 2019 15:25:53 +0000 (15:25 +0000)]
MFC r338138:
Enabling the IPPROTO_IPV6 level socket option IPV6_USE_MIN_MTU on a TCP
socket resulted in sending fragmented IPV6 packets.
This is fixed by reducing the MSS to the appropriate value. In addition,
if the socket option is set before the handshake happens, announce this
MSS to the peer. This is not strictly required, but done since TCP
is conservative.
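As a worked example of the reduced MSS, assuming no IPv6 extension headers
and a plain 20-byte TCP header (the macro and function names below are
made up for this sketch, and the kernel computes the value in its own way):

```c
/* Header sizes are fixed by the protocols; names are illustrative. */
#define SKETCH_IPV6_MIN_MTU	1280u	/* IPv6 minimum link MTU */
#define SKETCH_IPV6_HDR_LEN	40u	/* fixed IPv6 header */
#define SKETCH_TCP_HDR_LEN	20u	/* TCP header without options */

/* MSS that keeps every segment within the IPv6 minimum MTU. */
static unsigned int
min_mtu_mss(void)
{
	return (SKETCH_IPV6_MIN_MTU - SKETCH_IPV6_HDR_LEN -
	    SKETCH_TCP_HDR_LEN);
}
```

With these numbers the MSS is 1220 bytes, so a full segment fits in a
single 1280-byte packet and no fragmentation is needed.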
mw [Thu, 24 Jan 2019 11:31:57 +0000 (11:31 +0000)]
MFC: Second part of Amazon ENA driver fixes and improvements
Now, the driver functionality is aligned with the latest version in HEAD.
r343074 Suppress excessive error prints in ENA TX hotpath
r336099 Add PNP info to PCI attachment of ena driver
r333456 Do not pass header length to the ENA controller
r333453 Apply fixes in ena-com
r333450 Upgrade ENA version to v0.8.1
r325593 Fix setting AENQ group in ENA driver
r325592 Allow usage of more RX descriptors than 1 in ENA driver
r325591 Read max MTU from the ENA device
r325590 Fix calculating io queues number in ENA driver
r325589 Rework printouts and logging level in ENA driver
r325587 Fix comparing L3 type with L4 enum on RX hash in ENA driver
r325586 Fix compilation warnings when building ENA driver with gcc compiler
r325585 Fix checking if the DF flag was set in ENA driver
r325584 Cleanup of the ENA driver header file
r325583 Allow partial MSI-x allocation in ENA driver
r325582 Remove deprecated and unused counters in ENA driver
r325581 Cover ENA driver code with branch predictioning statements
mw [Thu, 24 Jan 2019 09:53:41 +0000 (09:53 +0000)]
MFC: First part of Amazon ENA driver fixes and improvements
r325580 Refactor style of the ENA driver
r325579 Fix error handling in the ENA driver and lock drbr_free() call
r325578 Destroy admin queue after freeing interrupts in ENA driver
r325577 Split function checking for missing TX completion in ENA driver
r325576 Check for Rx ring state to prevent from stall in the ENA driver
r325574 Add RX OOO completion feature
r325512 Change function validate_tx_req_id() to inline in ENA driver
r325511 Fix ENA driver error handling in attach and basic style fixes
r325239 Rework counting of hardware statistics in ENA driver
r325236 Update ena-com HAL to v1.1.4.3 and update driver accordingly
mav [Wed, 23 Jan 2019 01:23:45 +0000 (01:23 +0000)]
MFC r342558: Switch from mutexes to atomics in GEOM_DEV I/O path.
Mutexes in I/O path there were used twice per I/O to atomically access
several variables to close and/or destroy the device on last request
completion. I found the way to fit all required info into one integer,
suitable for atomic operations. It opened race window on device close,
but addition of timeout to the msleep() there should cover it.
Profiling shows removal of significant spinning time on those mutexes
and IOPS increase from ~600K to >800K to NVMe on 72-core systems.
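The idea of fitting the state into one integer can be illustrated with C11
atomics. The bit layout below is an assumption for this sketch and does
not reproduce the actual GEOM_DEV encoding, nor the msleep() timeout that
covers the close race.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: one flag bit plus an in-flight request count. */
#define ST_CLOSING	0x80000000u
#define ST_COUNT_MASK	0x7fffffffu

typedef _Atomic uint32_t io_state_t;

/* One atomic add per request start, no mutex. */
static void
io_start(io_state_t *st)
{
	atomic_fetch_add(st, 1);
}

/* True when this was the last request and a close is pending. */
static bool
io_done(io_state_t *st)
{
	uint32_t old = atomic_fetch_sub(st, 1);

	return (old == (ST_CLOSING | 1u));
}

static void
request_close(io_state_t *st)
{
	atomic_fetch_or(st, ST_CLOSING);
}
```

Because the count and the flag change in a single atomic operation, the
completion path can tell that it owns the final close without taking a
lock.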
mav [Wed, 23 Jan 2019 00:55:57 +0000 (00:55 +0000)]
MFC r342400: Increase MTX_POOL_SLEEP_SIZE from 128 to 1024.
This value remained unchanged for 15 years, and now this bump reduces
lock spinning in GEOM and BIO layers while doing ~1.6M IOPS to 4 NVMe
on 72-core system from ~25% to ~5% at the cost of additional 28KB RAM.
While there, align struct mtx_pool fields to cache lines.
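The 28KB figure can be checked with a small calculation, assuming each
pool mutex takes 32 bytes. That size is an assumption of this sketch and
is not stated in the commit message.

```c
#include <stddef.h>

#define POOL_OLD	128u	/* previous MTX_POOL_SLEEP_SIZE */
#define POOL_NEW	1024u	/* new MTX_POOL_SLEEP_SIZE */
#define MTX_BYTES	32u	/* assumed size of one mutex */

/* Extra memory needed by the larger pool, in bytes. */
static size_t
pool_extra_bytes(void)
{
	return ((size_t)(POOL_NEW - POOL_OLD) * MTX_BYTES);
}
```

Under that assumption the 896 additional mutexes take 28672 bytes, i.e.
28 KiB.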
rgrimes [Tue, 22 Jan 2019 21:52:07 +0000 (21:52 +0000)]
MFC: 325765 (imp) Add notes about overlapping copies.
Add notes to each of these that specifically state that results are
undefined if the strings overlap. In the case of memcpy, we document
the overlapping behavior on FreeBSD (pre-existing). For str*, it is
left unspecified, however, since the default (and x86) implementations
do not handle overlapping strings properly.
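For illustration, a copy whose source and destination overlap should go
through memmove(3), which is defined for overlapping regions. The helper
below is a hypothetical example and is not taken from the manual pages.

```c
#include <string.h>

/*
 * Shift a NUL-terminated string right by n bytes inside a buffer of
 * cap bytes, padding the front with spaces.  Source and destination
 * overlap, so memmove() is used rather than memcpy() or strcpy().
 * Returns 0 on success, -1 if the result would not fit.
 */
static int
shift_right(char *buf, size_t cap, size_t n)
{
	size_t len = strlen(buf) + 1;	/* include the terminator */

	if (n > cap || len > cap - n)
		return (-1);
	memmove(buf + n, buf, len);
	memset(buf, ' ', n);
	return (0);
}
```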
mav [Tue, 22 Jan 2019 21:35:25 +0000 (21:35 +0000)]
MFC r342977 (by cem): amdtemp(4): Add support for Family 15h, Model >=60h
Family 15h is a bit of an oddball. Early models used the same temperature
register and spec (mostly[1]) as earlier CPU families.
Model 60h-6Fh and 70-7Fh use something more like Family 17h's Service
Management Network, communicating with it in a similar fashion. To support
them, add support for their version of SMU indirection to amdsmn(4) and use
it in amdtemp(4) on these models.
While here, clarify some of the deviceid macros in amdtemp(4) that were
added with arbitrary, incorrect family numbers, and remove ones that were
not used. Additionally, clarify intent and condition of heterogeneous
multi-socket system detection.
[1]: 15h adds the "adjust range by -49°C if a certain condition is met,"
which previous families did not have.
Reported by: D. C. <tjoard AT gmail.com>
PR: 234657
Tested by: D. C. <tjoard AT gmail.com>
kp [Tue, 22 Jan 2019 01:07:20 +0000 (01:07 +0000)]
MFC r343041
pf: silence a runtime warning
Sometimes, for negated tables, pf can log 'pfr_update_stats: assertion failed'.
This warning does not clarify anything for users, so silence it, just as
OpenBSD has.
kp [Sun, 20 Jan 2019 22:01:41 +0000 (22:01 +0000)]
MFC r342989
pfctl: Fix 'set skip' handling for groups
When we skip on a group the kernel will automatically skip on the member
interfaces. We still need to update our own cache though, or we risk
overruling the kernel afterwards.
This manifested as 'set skip' working initially, then not working when
the rules were reloaded.
wulf [Fri, 18 Jan 2019 21:12:00 +0000 (21:12 +0000)]
MFC r340912,r340913:
psm(4): Revert r328640 and add minimal support for active AUX port
multiplexers
Active PS/2 multiplexing is a method for attaching up to four PS/2
pointing devices to a computer. Enabling of multiplexed mode allows
commands to be directed to individual devices using routing prefixes.
Multiplexed mode reports input with each byte tagged to identify
its source. This method differs from one currently supported by psm(4)
where so called guest device (trackpoint) is attached to special
interface located on the host device (touchpad) and latter performs
guest protocol conversion to special encapsulation packet format.
At present time active PS/2 multiplexing is used in some models of
HP laptops e.g. EliteBook 8560w, 9470m. Enabling of absolute operation
mode on such touchpads is connected with following problems:
1. Touchpad's port priority is lower than trackpoint's. That blocks
information queries thus prevents touchpad detection and configuration.
2. Touchpad and trackpoint have different protocol packet sizes and
sync bytes.
As PS/2 usage is on decline only minimal possible set of changes to
support Synaptics touchpad and generic mice is implemented.
Active multiplexing mode is enabled only at probe stage to scan through
attached PS/2 devices to query and configure Synaptics touchpad.
After touchpad has been configured, mux is switched back to legacy
(hidden multiplexing) mode to perform normal interrupt-driven input
data processing. Overflow bit values rather than tags are used to
separate packets produced by different devices. Switching back to
legacy mode allows to avoid psm(4) and atkbd(4) rework to support
4 instances of mouse driver.
Note: While in hidden multiplexing mode KBC does some editing of the
packet stream. It remembers the button bits from the last packet
received from each device, and replaces the button bits of every
packet with the logical OR of all devices’ most recent button bits.
This sort of button crosstalk results in spurious button events
which are inhibited with various tricks. E.g. trackpoint middle
button events are suppressed while trackpad surface is touched and
touchpad left and right button events are suppressed if corresponding
trackpoint buttons are pressed.
PR: 231058
Reported by: Michael Figiel <mifigiel at gmail.com>
Tested by: Michael Figiel <mifigiel at gmail.com>
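The button crosstalk described in the note above can be modelled as
follows. This is a sketch of the KBC behaviour as the commit message
describes it, not psm(4) code; the structure and function names are
hypothetical. It relies on the standard PS/2 packet layout, where the low
three bits of the first byte are the left, right and middle buttons.

```c
#include <stdint.h>

#define MUX_PORTS	4	/* active multiplexing handles up to four */
#define BUTTON_MASK	0x07u	/* left, right, middle in byte 0 */

/* Hypothetical model of the controller state, one entry per port. */
struct mux_state {
	uint8_t	last_buttons[MUX_PORTS];
};

/*
 * Remember the buttons reported by this port, then replace the button
 * bits of the packet with the OR of all ports' most recent buttons.
 */
static uint8_t
mux_edit_buttons(struct mux_state *s, unsigned int port, uint8_t byte0)
{
	uint8_t merged = 0;
	unsigned int i;

	s->last_buttons[port % MUX_PORTS] = byte0 & BUTTON_MASK;
	for (i = 0; i < MUX_PORTS; i++)
		merged |= s->last_buttons[i];
	return ((uint8_t)((byte0 & ~BUTTON_MASK) | merged));
}
```

A packet from one device therefore carries a button that is held on
another device, which is the spurious event the driver has to suppress.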
ae [Fri, 18 Jan 2019 09:57:03 +0000 (09:57 +0000)]
MFC 342925:
Relax requirement to packet size of CARP protocol and remove version check.
CARP shares protocol number 112 with VRRP (RFC 5798). And the size of
VRRP packet may be smaller than CARP. ipfw_chk() does m_pullup() to at
least sizeof(struct carp_header) and can fail when packet is VRRP. This
leads to packet drop and message about failed pullup attempt.
Also, RFC 5798 defines version 3 of VRRP protocol; this version number
is also unsupported by CARP and such check leads to packet drop.
carp_input() does its own checks for protocol version and packet size,
so we can remove these checks to be able to pass VRRP packets.
hselasky [Fri, 18 Jan 2019 08:57:23 +0000 (08:57 +0000)]
MFC r342884:
Fix loopback traffic when using non-lo0 link local IPv6 addresses.
The loopback interface can only receive packets with a single scope ID,
namely the scope ID of the loopback interface itself. To mitigate this
packets which use the scope ID are appearing as received by the real
network interface, see "origifp" in the patch. The current code would
drop packets which are designated for loopback which use a link-local
scope ID in the destination address or source address, because they
won't match the lo0's scope ID. To fix this restore the network
interface pointer from the scope ID in the destination address for
the problematic cases. See comments added in patch for a more detailed
description.
This issue was introduced with route caching by karels@ .
hselasky [Fri, 18 Jan 2019 08:48:30 +0000 (08:48 +0000)]
MFC r342778:
Reduce timeout for reading the USB HUB port status to 1000ms and try to filter
out dead USB HUB devices by implementing an error counter, so that the USB
enumeration thread does not spend all its time reading from non-responding
devices, blocking user-space access in the end.
dim [Wed, 16 Jan 2019 20:38:17 +0000 (20:38 +0000)]
Pull in r337861 from upstream llvm trunk (by Hideki Saito):
[LV] Fix for PR38110, LV encountered llvm_unreachable()
Summary: truncateToMinimalBitWidths() doesn't handle all Instructions
and the worst case is compiler crash via llvm_unreachable(). Fix is
to add a case to handle PHINode and changed the worst case to NO-OP
(from compiler crash).
This should fix "Unhandled instruction type!" (if assertions are
enabled) or segmentation faults (if assertions are disabled) when
compiling certain versions of the net-p2p/gtk-gnutella port.
Direct commit to stable/11 and stable/12, since head already has this
fix.
shurd [Wed, 16 Jan 2019 19:20:14 +0000 (19:20 +0000)]
MFC r342855:
Use iflib_if_init_locked() during resume instead of iflib_init_locked().
iflib_init_locked() assumes that iflib_stop() has been called, however,
it is not called for suspend. iflib_if_init_locked() calls stop then init,
so fixes the problem.
This was causing errors after a resume from suspend.
gonzo [Wed, 16 Jan 2019 04:01:30 +0000 (04:01 +0000)]
MFC r335675:
Fix file(1) dumpdate reporting for dump(8) files
Magic file for dump(8) had this dump and previous dump dates reversed.
Fix order for all three flavours of the dump(8) format.
This fix was committed to upstream repo as magic/Magdir/dump,v 1.17
and will be merged during next vendor import.
kevans [Tue, 15 Jan 2019 16:12:47 +0000 (16:12 +0000)]
MFC r342792, r342805: Provide rc_service variable for rc service scripts
r342792: rc.subr: Provide rc_service variable for service scripts
Some rc scripts in ports (e.g. uwsgi, apache, openvpn) allow for
'application profiles' that usually require the rc script to be invoked
again for each active profile. Because there's no consistent way to
determine the path because it differs between manual/service(8) invocations
and /etc/rc invocations, this leads to patterns like these:
- www/uwsgi hardcodes the script path
- security/openvpn guesses either $_file or $0 based on $0 = /etc/rc
Instead of forcing rc scripts to guess, provide an rc_service variable to
the scripts that gets set appropriately both for direct execution or when a
script is being executed via run_rc_script (e.g. /etc/rc).
This is our analog of an OpenRC variable with the same name, different case
(RC_SERVICE).
r342805: rc.subr: Fix typo
Originally intended as 'in case in needs to be re-invoked', but it was later
decided (by myself) that 're-invoke itself' makes it more clear that the
script is expected to use this in a way.
r305074:
Remove CHS alignment. It's not needed and causes problems for the BBB
boot partition. NetBSD removed it in 1.10 in their repo some time ago.
r305075:
The code only converts from bpbHugeSectors to bpbSectors if the sum of
the hidden and huge sectors is less than or equal to MAXU16. When
formatting in Windows bpbSectors is still used for 63488 sectors and
2048 hidden (sum > MAXU16). The hidden sectors count is the number of
sectors before the FAT16 Boot Record so it shouldn't affect the sector
count. Attached patch (huge_sec_conversion.patch) to only check for
bpb.bpbHugeSectors <= MAXU16 when converting to bpbSectors.
r327275:
Close fd and fd1 before returning now that we're done with them.
r327570:
Only call close if fd and fd1 are not -1.
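The sector count conversion from the r305075 item above can be sketched
like this. The structure is a cut-down stand-in for the real BPB and the
function name is hypothetical; zeroing the huge count after the
conversion is an assumption of this sketch.

```c
#include <stdint.h>

#define MAXU16	0xffffu

/* Cut-down, hypothetical stand-in for the BPB fields involved. */
struct bpb_sketch {
	uint16_t	sectors;	/* 16-bit total sector count */
	uint32_t	huge_sectors;	/* 32-bit total sector count */
	uint32_t	hidden_sectors;	/* sectors before the boot record */
};

/*
 * Use the 16-bit field whenever the count itself fits.  The hidden
 * sectors are deliberately not part of the comparison.
 */
static void
bpb_fold_sectors(struct bpb_sketch *b)
{
	if (b->huge_sectors <= MAXU16) {
		b->sectors = (uint16_t)b->huge_sectors;
		b->huge_sectors = 0;
	}
}
```

With 63488 sectors and 2048 hidden sectors the sum exceeds MAXU16, but the
count alone does not, so the 16-bit field is used as Windows does.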
avos [Tue, 15 Jan 2019 02:16:23 +0000 (02:16 +0000)]
MFC r342966:
net80211: fix possible panic for some drivers after r342464
Check if rate control structures were allocated before trying to
access them in various places; this was possible before on
allocation failure (unlikely), but was revealed after r342211
where allocation was deferred.
If the driver uses wlan_amrr(4) and it is loaded, it
is possible to reproduce the panic via
sysctl net.wlan.<number>.rate_stats
(for wlan0 the number will be 0).
The patch was adjusted a bit since file contents are different enough
since r306591.
avos [Mon, 14 Jan 2019 07:54:11 +0000 (07:54 +0000)]
MFC r342883:
net80211: fix panic when device is removed during initialization
if_dead() is called during device detach - check if interface
still exists before trying to refresh vap MAC address
(IF_LLADDR will trigger page fault otherwise).