The nm_register callback needs to call nm_set_native_flags()
or nm_clear_native_flags() once the device has been stopped.
However, in the current implementation this is not true,
as the device is stopped by vtnet_init_locked(). This causes
race conditions where the driver crashes as soon as it
dequeues netmap buffers assuming they are mbufs (or the other
way around).
To fix the issue, we extend vtnet_init_locked() with a second
argument that, if not zero, will set/clear the netmap flags.
This results in a huge simplification of the nm_register
callback itself.
Also, use netmap_reset() to check if a ring is going to be
re-initialized in netmap mode.
Jilles Tjoelker [Sun, 14 Jun 2020 19:41:24 +0000 (19:41 +0000)]
sh/tests: Add tests for SIGINT in non-jobc background commands
If job control is not enabled, background commands shall ignore SIGINT and
SIGQUIT, and it shall be possible to override that ignore in the same shell.
Brandon Bergren [Sun, 14 Jun 2020 16:47:16 +0000 (16:47 +0000)]
[PowerPC] Fix scc z8530 driver
Parts of the z8530 driver were still using the SUN channel spacing.
This was invalid on PowerMac and QEMU, where the attachment was to escc,
not escc-legacy. This means the driver has apparently NEVER worked properly
on Macintosh hardware.
Add documentation for the channel spacing details, and change to using
driver-specific initialization instead of hardcoded spacing so either
spacing can be used.
Fixes boot hang in QEMU when using the serial console, and fixes use on
Xserve serial (and presumably PowerMacs that have a Stealth Serial port
or similar)
Toomas Soome [Sun, 14 Jun 2020 06:58:58 +0000 (06:58 +0000)]
Move font related data structured to sys/font.c and update vtfontcvt
Prepare support to be able to handle font data in loader, consolidate
data structures to sys/font.h and update vtfontcvt.
vtfontcvt update is about to output set of glyphs in form of C source,
the implementation does allow to output compressed or uncompressed font
bitmaps.
Rick Macklem [Sun, 14 Jun 2020 00:40:00 +0000 (00:40 +0000)]
Modify mountd to use the new struct export_args committed by r362158.
r362158 modified struct export_args for make the ex_flags field 64bits
and also changed the anonymous credentials to allow more than 16 groups.
This patch fixes mountd.c to use the new structure.
It does allocate larger exportlist and grouplist structures now.
That will be fixed in a future commit.
The only visible change will be that the credentials provided for the
-maproot and -mapall exports options can now have more than 16 groups.
Rick Macklem [Sun, 14 Jun 2020 00:10:18 +0000 (00:10 +0000)]
Fix export_args ex_flags field so that is 64bits, the same as mnt_flags.
Since mnt_flags was upgraded to 64bits there has been a quirk in
"struct export_args", since it hold a copy of mnt_flags
in ex_flags, which is an "int" (32bits).
This happens to currently work, since all the flag bits used in ex_flags are
defined in the low order 32bits. However, new export flags cannot be defined.
Also, ex_anon is a "struct xucred", which limits it to 16 additional groups.
This patch revises "struct export_args" to make ex_flags 64bits and replaces
ex_anon with ex_uid, ex_ngroups and ex_groups (which points to a
groups list, so it can be malloc'd up to NGROUPS in size.
This requires that the VFS_CHECKEXP() arguments change, so I also modified the
last "secflavors" argument to be an array pointer, so that the
secflavors could be copied in VFS_CHECKEXP() while the export entry is locked.
(Without this patch VFS_CHECKEXP() returns a pointer to the secflavors
array and then it is used after being unlocked, which is potentially
a problem if the exports entry is changed.
In practice this does not occur when mountd is run with "-S",
but I think it is worth fixing.)
This patch also deleted the vfs_oexport_conv() function, since
do_mount_update() does the conversion, as required by the old vfs_cmount()
calls.
Adrian Chadd [Sat, 13 Jun 2020 23:35:22 +0000 (23:35 +0000)]
[net80211] Handle offloaded AMSDU in AMPDU reordering.
In the 11n world, most NICs did A-MPDU receive/transmit offloading but
not A-MSDU offloading. So, the net80211 A-MPDU receive path would just
receive MPDUs, do the reordering bit, pass it up to the rest of
net80211 for crypto decap and then do A-MSDU decap before throwing ethernet
frames up to the rest of the system.
However 11ac and 11ax NICs are increasingly doing A-MSDU offload (and
newer 11ax stuff does socket offload, but hey I don't want to scare people
JUST yet) - so although A-MPDU reordering may be done in the OS, A-MSDUs
look like a normal MPDU. This means that all the MSDUs are actually
faked into a set of MPDUs with matching 802.11 header - the sequence number,
QoS header and any encryption verification bits (like IV) are just copied.
This shows up as MASSIVE packet loss in net80211, cause after the first MPDU
we just toss the rest.
(And don't get me started about ethernet decap with A-MPDU host reordering;
we'll have to cross that bridge for later 11ac and 11ax bits too.)
Anyway, this work changes each A-MPDU reorder slot into an mbufq.
The mbufq is treated as a whole set of frames to pass up to the stack
and reordered/de-duped as a group. The last frame in the reorder list
is checked to see if it's an A-MSDU final frame so any duplicates are
correctly tossed rather than double-received. Other than that, the
rest of the logic is unchanged.
The previous commit did a small subset of this - if there wasn't any reordering
going on then it'd accept the A-MSDUs. This is the rest of the needed work.
This is a no-op for 11n NICs doing A-MPDU reordering but needing software
A-MSDU decap - they aren't tagged as A-MSDU and so any subsequent
frames added to the reorder slot are tossed.
Adrian Chadd [Sat, 13 Jun 2020 22:20:02 +0000 (22:20 +0000)]
[net80211] separate out node allocation and node initialisation.
This is a new, optional (for now!) method that drivers can use to separate
node allocation and node initialisation. Right now they're the same, and
drivers that need to do node allocation via firmware commands need to sleep
and thus they need to defer node allocation into an internal taskqueue.
Right now they're just separate but not deferred. Later on if I get the time
we'll start deferring the node and key related operations but that requires
making a bunch of other stuff (notably things that generate frames!) also
async/deferred.
Fix ldd for PIE binaries after rtld stopped accepting binaries for dlopen.
ldd proclaims ET_DYN objects as shared libraries and tries to
dlopen(RTLD_TRACE) them to get dependencies. Since PIE binaries are
ET_DYN | DF_1_PIE, refusal to dlopen such binaries breaks ldd.
Fix it by reading and parsing dynamic segment looking for DF_FLAG_1
and taking DF_1_PIE into account when deciding between binary and
library.
Reported by: Dewayne Geraghty <dewayne@heuristicsystems.com.au>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25257
Yuri Pankov [Sat, 13 Jun 2020 14:11:02 +0000 (14:11 +0000)]
nvi: fallback to ISO8859-1 as last resort
Current logic of using user's locale encoding that is UTF-8 doesn't make
much sense if we already failed the looks_utf8() check and skipped
encoding set using "fileencoding" as being UTF-8 as well; fallback to
ISO8859-1 in that case.
CVE-2020-11655: SQLite through 3.31.1 allows attackers to cause a denial of
service (segmentation fault) via a malformed window-function query because
the AggInfo object's initialization is mishandled.
CVE-2020-13434: SQLite through 3.32.0 has an integer overflow in
sqlite3_str_vappendf in printf.c.
CVE-2020-13435: SQLite through 3.32.0 has a segmentation fault in
sqlite3ExprCodeTarget in expr.c.
CVE-2020-13630: ext/fts3/fts3.c in SQLite before 3.32.0 has a
use-after-free in fts3EvalNextRow, related to the snippet feature
CVE-2020-13631: SQLite before 3.32.0 allows a virtual table to be renamed
to the name of one of its shadow tables, related to alter.c and build.c.
CVE-2020-13632: ext/fts3/fts3_snippet.c in SQLite before 3.32.0 ha s a
NULL pointer dereference via a crafted matchinfo() query.
Doug Moore [Sat, 13 Jun 2020 01:54:09 +0000 (01:54 +0000)]
Linuxkpi uses the rb-tree structures without using their interfaces,
making them break when the representation changes. Revert changes that
eliminated the color field from rb-trees, leaving everything as it was
before.
John Baldwin [Fri, 12 Jun 2020 23:10:30 +0000 (23:10 +0000)]
Various optimizations to software AES-CCM and AES-GCM.
- Make use of cursors to avoid data copies for AES-CCM and AES-GCM.
Pass pointers into the request's input and/or output buffers
directly to the Update, encrypt, and decrypt hooks rather than
always copying all data into a temporary block buffer on the stack.
- Move handling for partial final blocks out of the main loop.
This removes branches from the main loop and permits using
encrypt/decrypt_last which avoids a memset to clear the rest of the
block on the stack.
- Shrink the on-stack buffers to assume AES block sizes and CCM/GCM
tag lengths.
- For AAD data, pass larger chunks to axf->Update. CCM can take each
AAD segment in a single call. GMAC can take multiple blocks at a
time.
Control for Special Register Buffer Data Sampling mitigation.
New microcode update for Intel enables mitigation for SRBDS, which
slows down RDSEED and related instructions. The update also provides
a control to limit the mitigation to SGX enclaves, which should
restore the speed of random generator by the cost of potential
cross-core bufer sampling.
See https://software.intel.com/security-software-guidance/insights/deep-dive-special-register-buffer-data-sampling
GIve the user control over it.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25221
Rtld itself is a shared object which does not have vendor note, so
after the direct exec of ld-elf.so.1 process has p_osrel set to zero.
This affects the ABI of syscalls.
Set osrel to the __FreeBSD_version value at compile time right after
rtld identified direct exec mode. Then, switch to the osrel read from
the binary note or zero if no note, right before starting calling
ifunc resolvers, which is the first byte of the user code.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
John Baldwin [Fri, 12 Jun 2020 21:21:18 +0000 (21:21 +0000)]
Various fixes to TLS for MIPS.
- Clear the current thread's TLS pointer on exec. Previously the TLS
pointer (and register) remain unchanged.
- Explicitly clear the TLS pointer when new threads are created.
- Make md_tls_tcb_offset per-process instead of per-thread.
The layout of the TLS and TCB are identical for all threads in a
process, it is only the TLS pointer values themselves that vary by
thread. This also makes setting md_tls_tcb_offset in
cpu_set_user_tls() redundant with the setting in exec_setregs(), so
only set it in exec_setregs().
Eric van Gyzen [Fri, 12 Jun 2020 21:17:56 +0000 (21:17 +0000)]
FPU init: allocate initial state from UMA to ensure alignment
The Intel Instruction Set Reference says this about the XSAVE instruction:
Use of a destination operand not aligned to 64-byte boundary
(in either 64-bit or 32-bit modes) results in a general-protection
(#GP) exception.
This alignment happens naturally when all malloc buckets are powers
of two. However, this change is necessary on some systems when
certain non-power-of-two (and non-multiple of 64) malloc buckets
are defined.
Reviewed by: cem; kib; earlier version by jhb
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D25098
Eric van Gyzen [Fri, 12 Jun 2020 21:10:45 +0000 (21:10 +0000)]
FPU init: Do potentially blocking operations before disabling interrupts
In particular, uma_zcreate creates sysctl oids, which locks an sx lock,
which uses IPIs under contention. IPIs tend not to work very well
when interrupts are disabled. Who knew, right?
Randall Stewart [Fri, 12 Jun 2020 19:56:19 +0000 (19:56 +0000)]
So it turns out with the right window scaling you can get the code in all stacks to
always want to do a window update, even when no data can be sent. Now in
cases where you are not pacing thats probably ok, you just send an extra
window update or two. However with bbr (and rack if its paced) every time
the pacer goes off its going to send a "window update".
Also in testing bbr I have found that if we are not responding to
data right away we end up staying in startup but incorrectly holding
a pacing gain of 192 (a loss). This is because the idle window code
does not restict itself to only work with PROBE_BW. In all other
states you dont want it doing a PROBE_BW state change.
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D25247
Andrew Gallatin [Fri, 12 Jun 2020 18:41:12 +0000 (18:41 +0000)]
x86: Bump default msi/msix vector limit to 2048
Given that 64c/128t CPUs are currently available, and that many
devices (nvme, many NICs) desire to map 1 MSI-X vector per core,
or even 1 per-thread, it is becoming far easier to see MSI-X interrupt
setup fail due to msi vector exhaustion, and devices fail to attach at
boot on large system.
This bump costs 12KB on amd64 (and 6KB on i386), which seems
worth the trade off for a better out of the box experience on
high end hardware.
Reviewed by: jhb
MFC after: 21 days
Sponsored by: Netflix
Kyle Evans [Fri, 12 Jun 2020 18:13:32 +0000 (18:13 +0000)]
posix_spawn: fix for some custom allocator setups
libc cannot assume that aligned_alloc and free come from jemalloc, or that
any application providing its own malloc and free is actually providing
aligned_alloc.
Switch back to malloc and just make sure we're passing a properly aligned
stack into rfork_thread, as an application perhaps can't reasonably replace
just malloc or just free without headaches.
This unbreaks ksh93 after r361996, which provides malloc/free but no
aligned_alloc.
Reported by: freqlabs
Diagnosed by: Andrew Gierth <andrew_tao173.riddles.org.uk>
X-MFC-With: r361996
Doug Moore [Fri, 12 Jun 2020 16:51:55 +0000 (16:51 +0000)]
The linuxkpi code accesses left/right rb tree pointers without using
RB_LEFT or RB_RIGHT, so they aren't stripping off the color bit
encoded there. Strip off that bit for linuxkpi.
Adrian Chadd [Fri, 12 Jun 2020 06:10:27 +0000 (06:10 +0000)]
[wlanstats] Add the per-node amsdu hardware decap'ed receive stats.
This is useful for tracking hardware provided AMSDU frames to see
when we're (a) seeing them, and (b) seeing the split between
intermediary and final frames.
Adrian Chadd [Fri, 12 Jun 2020 04:19:03 +0000 (04:19 +0000)]
[net80211] First part of A-MSDU offload handling - don't bump A-MPDU reordering seqno
When doing A-MSDU offload handling the driver is required to mark
A-MSDUs from the same MPDU with the same sequence number.
It then tags them as AMSDU (if it's a decap'ed A-MSDU) and AMSDU_MORE
(saying there's more AMSDUs decapped in the same MSDU.)
This allows encryption and sequence number offload to work right.
In the A-MSDU path the sequence number check looks at the A-MSDU flags
in the frame to see whether it's part of the same seqno and will pass them
(ie, not increment rx_seq until the last A-MSDU is seen from the driver,
or a new seqno shows up.0
However, I did this work in the A-MSDU path but not the A-MSDU in A-MPDU path.
For the non A-MDSU offload case the A-MPDU receive reordering will do its
thing and then pass up the MPDU up for decap - which then will see it's
an A-MSDU and decap each sub-frame. But this isn't done for offloaded
A-MSDU frames.
This requires two parts:
* Don't bump the RX sequence number, same as above; and
* If frames go into the reordering buffer, they need to be added into the slot
as a set of frames rather than a single frame, so once a new seqno shows up
this slot can be marked as "full" and we can move on.
This patch does the first. The latter requires that I find and commit
work to change rxa_m from an mbuf to an mbufq and the nhandle A-MSDU
there. But, the first is enough to allow the normal case (ie, no or not
a lot of A-MPDU RX reordering) to work.
This allows the athp driver (QCA9880) throughput to go from VERY low
(like 5mbit TCP, 1/3-1/4 expected UDP throughput) to ~ 250mbit TCP
and > 300mbit UDP on a VHT/40 channel. TCP sucks because, well, it
shows up as MASSIVE packet loss when all but one frame in a decap'ed
A-MSDU stream is dropped. Le whoops.
Now, where'd I put that laptop with the patch for rxa_m mbufq that
I wrote like in 2017...
Tested:
* AR9380, STA/AP mode (a big no-op, no A-MSDU hardware decap);
* if_run (RT3593), STA DWDS mode (A-MPDU / A-MSDU receive, but again
no A-MSDU hardware decap);
* QCA9880, STA/AP mode (which is doing hardware A-MPDU/A-MSDU decap,
but no A-MPDU reordering in the firmware.)
Ravi Pokala [Thu, 11 Jun 2020 22:46:08 +0000 (22:46 +0000)]
Decode the "LACP Fast Timeout" LAGG option flag
r286700 added the "lacp_fast_timeout" option to `ifconfig', but we forgot to
include the new option in the string used to decode the option bits. Add
"LACP_FAST_TIMO" to LAGG_OPT_BITS.
Also, s/LAGG_OPT_LACP_TIMEOUT/LAGG_OPT_LACP_FAST_TIMO/g , to be clearer that
the flag indicates "Fast Timeout" mode.
This function returns NULL if the ring identified by
queue id and direction is in netmap mode. Otherwise
return the corresponding kring.
Use this function to replace vtnet_netmap_queue_on().
Eric Joyner [Thu, 11 Jun 2020 15:59:49 +0000 (15:59 +0000)]
em(4): Always reinit interface when adding/removing VLAN
This partially reverts r361053 since there have been reports
by users that this breaks some functionality for em(4)
devices; it seems at first glance that some sort of interface
restart is required for those cards.
This isn't a proper fix; this unbreaks those users until a proper
fix is found for their issues.
PR: 240818
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days
Michal Meloun [Thu, 11 Jun 2020 12:53:22 +0000 (12:53 +0000)]
Fix grabbing of tegra uart.
An attempt to write to FCR register may corrupt transmit FIFO,
so we should wait for the FIFO to be empty before we can modify it.
Andriy Gapon [Thu, 11 Jun 2020 10:41:31 +0000 (10:41 +0000)]
rework how ZVOLs are updated in response to DSL operations
With this change all ZVOL updates are initiated from the SPA sync
context instead of a mix of the sync and open contexts. The updates are
queued to be applied by a dedicated thread in the original order. This
should ensure that ZVOLs always accurately reflect the corresponding
datasets. ZFS ioctl operations wait on the mentioned thread to complete
its work. Thus, the illusion of the synchronous ZVOL update is
preserved. At the same time, the SPA sync thread never blocks on ZVOL
related operations avoiding problems like reported in bug 203864.
This change is based on earlier work in the same direction: D7179 and
D14669 by Anthoine Bourgeois. D7179 tried to perform ZVOL operations
in the open context and that opened races between them. D14669 uses a
design very similar to this change but with different implementation
details.
This change also heavily borrows from similar code in ZoL, but there are
many differences too. See:
- https://github.com/zfsonlinux/zfs/commit/a0bd735adb1b1eb81fef10b4db102ee051c4d4ff
- https://github.com/zfsonlinux/zfs/issues/3681
- https://github.com/zfsonlinux/zfs/issues/2217
Andriy Gapon [Thu, 11 Jun 2020 05:34:31 +0000 (05:34 +0000)]
iicbb: rebuild the bit-banging algorithms using different primitives
I2C_SET was quite inflexible, it used too long delays as well as some
unnecessary delays. The new building blocks are iicbb_clockin and
iicbb_clockout. The former sets SDA and starts the high period of SCL,
the latter executes the low period of SCL. What happens during the high
phase depends on the operation. For writes we just hold both lines, for
reads we poll SDA. S, Sr and P change SDA in the middle of the high
period.
Also, the calculation of udelay has been updated, so that the resulting
period more closely corresponds the requested bus frequency. There is a
new knob, io_delay, that allows to further adjust udelay based on the
estimated latency of pin toggling operations.
Finally, I slightly changed debug tracing and added error indicators to
it. The debug prints are compiled in but disabled by default. This can
be of use if there is any fallout from this change.
Some ideas for further improvements:
- add a function for sub-microsecond delays (e.g., in units of 1/10th of
a microsecond) and use it for more precise timing of short delays;
- account for the actual time spent in the pin I/O.
Some sample debug output with the new code follows.
Reading temperature and humidity from HTU21 in the bus hold mode:
<<w80+ we3+ <w81+ .....r6d+ rac+ r94- >>
<<w80+ we5+ <w81+ .............r47+ re2+ r84- >>
where '<<' is S, '<' is Sr, '>>' is P, '.' is one millisecond of clock
stretching by the slave.
Reading temperature and humidity in the no-hold mode:
<<w80+ wf3+ >>
<<w81- >>
<<w81+ r6d+ r54+ raf- >>
<<w80+ wf5+ >>
<<w81- >>
<<w81+ r48+ r4e+ r9c- >>
where '+' is Ack and '-' is NoAck.
We see that first read attempts are not acknowledged.