rmacklem [Tue, 16 Jun 2020 02:31:22 +0000 (02:31 +0000)]
Expose UID_xxx and GID_xxx definitions to userspace.
This patch moves the UID_xxx and GID_xxx definitions out of the
#ifdef _KERNEL section, so that userspace programs like mountd
can use them.
There are a couple of userspace programs that do define UID_ROOT,
but they do not include sys/conf.h. Since they are defined as
the same value, maybe they should be changed to include sys/conf.h.
U-APSD (unscheduled automatic power save delivery) is a power save method
that's a bit better than legacy PS-POLL - stations can mark frames with
an extra flag that tells the AP to leak out more frames after it sends
its own frames rather than needing to send a PS-POLL to get another frame
from the AP.
Now, this code just handles the negotiation bits; it doesn't actually
implement U-APSD. That's up to drivers, and nothing in the tree yet
implements this. I /may/ implement this for ath(4) if I eventually care
enough but right now I plan on just implementing it for firmware offload
based NICs that handle this in the NIC.
I'll commit the ifconfig bit after this and I may have some follow-up
commits as this gets used more by me in local testing.
This should be a glorious no-op for everyone else. If things change
for anyone that isn't fixed by a complete recompile then please reach out
to me.
trasz [Mon, 15 Jun 2020 20:12:10 +0000 (20:12 +0000)]
Make Linux uname(2) return x86_64 to 32-bit apps. This helps Steam.
PR: kern/240432
Analyzed by by: Alex S <iwtcex@gmail.com>
Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25248
vmaffione [Mon, 15 Jun 2020 19:46:34 +0000 (19:46 +0000)]
if_vtnet: let vtnet_rx_vq_intr() and vtnet_rxq_tq_intr() share code
Since the two functions are similar, introduce a common function
(vtnet_rx_vq_process()) to share common code.
This also improves locking, by ensuring vrxs_rescheduled is accessed
under the RXQ lock, and taskqueue_enqueue() is not called under the
lock (therefore avoiding a spurious duplicate lock warning).
jhb [Mon, 15 Jun 2020 19:38:48 +0000 (19:38 +0000)]
Remove the sed hack for ABI tag notes.
The ELF notes compiled in C were placed in a section with the wrong type
(SHT_PROGBITS instead of SHT_NOTE). Previously, sed was used on the
generated assembly to rewrite the section type. Instead, write the notes
in assembly which permits setting the correct section type directly.
While here, move inline assembly entry points out of C and into assembly
for aarch64, arm, and riscv.
jhb [Mon, 15 Jun 2020 18:57:43 +0000 (18:57 +0000)]
Simplify MACHINE_ARCH to be a single string.
Big endian and armv4 mean that we are now down to only two supported
variants. A future change will use MACHINE_ARCH in assembly which
does not support C-style string concatentation and thus needs
MACHINE_ARCH defined as a single string.
jrtc27 [Sun, 14 Jun 2020 22:39:34 +0000 (22:39 +0000)]
vtnet: Fix regression introduced in r361944
For legacy devices that don't support MrgRxBuf (such as bhyve pre-r358180),
r361944 failed to update the receive handler to account for the additional
padding introduced by the unused num_buffers field that is now always present
in struct vtnet_rx_header. Thus, calculate the padding dynamically based on
vtnet_hdr_size.
vmaffione [Sun, 14 Jun 2020 20:47:31 +0000 (20:47 +0000)]
netmap: vtnet: fix races in vtnet_netmap_reg()
The nm_register callback needs to call nm_set_native_flags()
or nm_clear_native_flags() once the device has been stopped.
However, in the current implementation this is not true,
as the device is stopped by vtnet_init_locked(). This causes
race conditions where the driver crashes as soon as it
dequeues netmap buffers assuming they are mbufs (or the other
way around).
To fix the issue, we extend vtnet_init_locked() with a second
argument that, if not zero, will set/clear the netmap flags.
This results in a huge simplification of the nm_register
callback itself.
Also, use netmap_reset() to check if a ring is going to be
re-initialized in netmap mode.
jilles [Sun, 14 Jun 2020 19:41:24 +0000 (19:41 +0000)]
sh/tests: Add tests for SIGINT in non-jobc background commands
If job control is not enabled, background commands shall ignore SIGINT and
SIGQUIT, and it shall be possible to override that ignore in the same shell.
bdragon [Sun, 14 Jun 2020 16:47:16 +0000 (16:47 +0000)]
[PowerPC] Fix scc z8530 driver
Parts of the z8530 driver were still using the SUN channel spacing.
This was invalid on PowerMac and QEMU, where the attachment was to escc,
not escc-legacy. This means the driver has apparently NEVER worked properly
on Macintosh hardware.
Add documentation for the channel spacing details, and change to using
driver-specific initialization instead of hardcoded spacing so either
spacing can be used.
Fixes boot hang in QEMU when using the serial console, and fixes use on
Xserve serial (and presumably PowerMacs that have a Stealth Serial port
or similar)
tsoome [Sun, 14 Jun 2020 06:58:58 +0000 (06:58 +0000)]
Move font related data structured to sys/font.c and update vtfontcvt
Prepare support to be able to handle font data in loader, consolidate
data structures to sys/font.h and update vtfontcvt.
vtfontcvt update is about to output set of glyphs in form of C source,
the implementation does allow to output compressed or uncompressed font
bitmaps.
rmacklem [Sun, 14 Jun 2020 00:40:00 +0000 (00:40 +0000)]
Modify mountd to use the new struct export_args committed by r362158.
r362158 modified struct export_args for make the ex_flags field 64bits
and also changed the anonymous credentials to allow more than 16 groups.
This patch fixes mountd.c to use the new structure.
It does allocate larger exportlist and grouplist structures now.
That will be fixed in a future commit.
The only visible change will be that the credentials provided for the
-maproot and -mapall exports options can now have more than 16 groups.
rmacklem [Sun, 14 Jun 2020 00:10:18 +0000 (00:10 +0000)]
Fix export_args ex_flags field so that is 64bits, the same as mnt_flags.
Since mnt_flags was upgraded to 64bits there has been a quirk in
"struct export_args", since it hold a copy of mnt_flags
in ex_flags, which is an "int" (32bits).
This happens to currently work, since all the flag bits used in ex_flags are
defined in the low order 32bits. However, new export flags cannot be defined.
Also, ex_anon is a "struct xucred", which limits it to 16 additional groups.
This patch revises "struct export_args" to make ex_flags 64bits and replaces
ex_anon with ex_uid, ex_ngroups and ex_groups (which points to a
groups list, so it can be malloc'd up to NGROUPS in size.
This requires that the VFS_CHECKEXP() arguments change, so I also modified the
last "secflavors" argument to be an array pointer, so that the
secflavors could be copied in VFS_CHECKEXP() while the export entry is locked.
(Without this patch VFS_CHECKEXP() returns a pointer to the secflavors
array and then it is used after being unlocked, which is potentially
a problem if the exports entry is changed.
In practice this does not occur when mountd is run with "-S",
but I think it is worth fixing.)
This patch also deleted the vfs_oexport_conv() function, since
do_mount_update() does the conversion, as required by the old vfs_cmount()
calls.
adrian [Sat, 13 Jun 2020 23:35:22 +0000 (23:35 +0000)]
[net80211] Handle offloaded AMSDU in AMPDU reordering.
In the 11n world, most NICs did A-MPDU receive/transmit offloading but
not A-MSDU offloading. So, the net80211 A-MPDU receive path would just
receive MPDUs, do the reordering bit, pass it up to the rest of
net80211 for crypto decap and then do A-MSDU decap before throwing ethernet
frames up to the rest of the system.
However 11ac and 11ax NICs are increasingly doing A-MSDU offload (and
newer 11ax stuff does socket offload, but hey I don't want to scare people
JUST yet) - so although A-MPDU reordering may be done in the OS, A-MSDUs
look like a normal MPDU. This means that all the MSDUs are actually
faked into a set of MPDUs with matching 802.11 header - the sequence number,
QoS header and any encryption verification bits (like IV) are just copied.
This shows up as MASSIVE packet loss in net80211, cause after the first MPDU
we just toss the rest.
(And don't get me started about ethernet decap with A-MPDU host reordering;
we'll have to cross that bridge for later 11ac and 11ax bits too.)
Anyway, this work changes each A-MPDU reorder slot into an mbufq.
The mbufq is treated as a whole set of frames to pass up to the stack
and reordered/de-duped as a group. The last frame in the reorder list
is checked to see if it's an A-MSDU final frame so any duplicates are
correctly tossed rather than double-received. Other than that, the
rest of the logic is unchanged.
The previous commit did a small subset of this - if there wasn't any reordering
going on then it'd accept the A-MSDUs. This is the rest of the needed work.
This is a no-op for 11n NICs doing A-MPDU reordering but needing software
A-MSDU decap - they aren't tagged as A-MSDU and so any subsequent
frames added to the reorder slot are tossed.
adrian [Sat, 13 Jun 2020 22:20:02 +0000 (22:20 +0000)]
[net80211] separate out node allocation and node initialisation.
This is a new, optional (for now!) method that drivers can use to separate
node allocation and node initialisation. Right now they're the same, and
drivers that need to do node allocation via firmware commands need to sleep
and thus they need to defer node allocation into an internal taskqueue.
Right now they're just separate but not deferred. Later on if I get the time
we'll start deferring the node and key related operations but that requires
making a bunch of other stuff (notably things that generate frames!) also
async/deferred.
kib [Sat, 13 Jun 2020 18:21:31 +0000 (18:21 +0000)]
Fix ldd for PIE binaries after rtld stopped accepting binaries for dlopen.
ldd proclaims ET_DYN objects as shared libraries and tries to
dlopen(RTLD_TRACE) them to get dependencies. Since PIE binaries are
ET_DYN | DF_1_PIE, refusal to dlopen such binaries breaks ldd.
Fix it by reading and parsing dynamic segment looking for DF_FLAG_1
and taking DF_1_PIE into account when deciding between binary and
library.
Reported by: Dewayne Geraghty <dewayne@heuristicsystems.com.au>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25257
yuripv [Sat, 13 Jun 2020 14:11:02 +0000 (14:11 +0000)]
nvi: fallback to ISO8859-1 as last resort
Current logic of using user's locale encoding that is UTF-8 doesn't make
much sense if we already failed the looks_utf8() check and skipped
encoding set using "fileencoding" as being UTF-8 as well; fallback to
ISO8859-1 in that case.
CVE-2020-11655: SQLite through 3.31.1 allows attackers to cause a denial of
service (segmentation fault) via a malformed window-function query because
the AggInfo object's initialization is mishandled.
CVE-2020-13434: SQLite through 3.32.0 has an integer overflow in
sqlite3_str_vappendf in printf.c.
CVE-2020-13435: SQLite through 3.32.0 has a segmentation fault in
sqlite3ExprCodeTarget in expr.c.
CVE-2020-13630: ext/fts3/fts3.c in SQLite before 3.32.0 has a
use-after-free in fts3EvalNextRow, related to the snippet feature
CVE-2020-13631: SQLite before 3.32.0 allows a virtual table to be renamed
to the name of one of its shadow tables, related to alter.c and build.c.
CVE-2020-13632: ext/fts3/fts3_snippet.c in SQLite before 3.32.0 ha s a
NULL pointer dereference via a crafted matchinfo() query.
dougm [Sat, 13 Jun 2020 01:54:09 +0000 (01:54 +0000)]
Linuxkpi uses the rb-tree structures without using their interfaces,
making them break when the representation changes. Revert changes that
eliminated the color field from rb-trees, leaving everything as it was
before.
jhb [Fri, 12 Jun 2020 23:10:30 +0000 (23:10 +0000)]
Various optimizations to software AES-CCM and AES-GCM.
- Make use of cursors to avoid data copies for AES-CCM and AES-GCM.
Pass pointers into the request's input and/or output buffers
directly to the Update, encrypt, and decrypt hooks rather than
always copying all data into a temporary block buffer on the stack.
- Move handling for partial final blocks out of the main loop.
This removes branches from the main loop and permits using
encrypt/decrypt_last which avoids a memset to clear the rest of the
block on the stack.
- Shrink the on-stack buffers to assume AES block sizes and CCM/GCM
tag lengths.
- For AAD data, pass larger chunks to axf->Update. CCM can take each
AAD segment in a single call. GMAC can take multiple blocks at a
time.
kib [Fri, 12 Jun 2020 22:14:45 +0000 (22:14 +0000)]
Control for Special Register Buffer Data Sampling mitigation.
New microcode update for Intel enables mitigation for SRBDS, which
slows down RDSEED and related instructions. The update also provides
a control to limit the mitigation to SGX enclaves, which should
restore the speed of random generator by the cost of potential
cross-core bufer sampling.
See https://software.intel.com/security-software-guidance/insights/deep-dive-special-register-buffer-data-sampling
GIve the user control over it.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25221
kib [Fri, 12 Jun 2020 22:10:03 +0000 (22:10 +0000)]
rtld: set osrel when in the direct exec mode.
Rtld itself is a shared object which does not have vendor note, so
after the direct exec of ld-elf.so.1 process has p_osrel set to zero.
This affects the ABI of syscalls.
Set osrel to the __FreeBSD_version value at compile time right after
rtld identified direct exec mode. Then, switch to the osrel read from
the binary note or zero if no note, right before starting calling
ifunc resolvers, which is the first byte of the user code.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
jhb [Fri, 12 Jun 2020 21:21:18 +0000 (21:21 +0000)]
Various fixes to TLS for MIPS.
- Clear the current thread's TLS pointer on exec. Previously the TLS
pointer (and register) remain unchanged.
- Explicitly clear the TLS pointer when new threads are created.
- Make md_tls_tcb_offset per-process instead of per-thread.
The layout of the TLS and TCB are identical for all threads in a
process, it is only the TLS pointer values themselves that vary by
thread. This also makes setting md_tls_tcb_offset in
cpu_set_user_tls() redundant with the setting in exec_setregs(), so
only set it in exec_setregs().
vangyzen [Fri, 12 Jun 2020 21:17:56 +0000 (21:17 +0000)]
FPU init: allocate initial state from UMA to ensure alignment
The Intel Instruction Set Reference says this about the XSAVE instruction:
Use of a destination operand not aligned to 64-byte boundary
(in either 64-bit or 32-bit modes) results in a general-protection
(#GP) exception.
This alignment happens naturally when all malloc buckets are powers
of two. However, this change is necessary on some systems when
certain non-power-of-two (and non-multiple of 64) malloc buckets
are defined.
Reviewed by: cem; kib; earlier version by jhb
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D25098
vangyzen [Fri, 12 Jun 2020 21:10:45 +0000 (21:10 +0000)]
FPU init: Do potentially blocking operations before disabling interrupts
In particular, uma_zcreate creates sysctl oids, which locks an sx lock,
which uses IPIs under contention. IPIs tend not to work very well
when interrupts are disabled. Who knew, right?
rrs [Fri, 12 Jun 2020 19:56:19 +0000 (19:56 +0000)]
So it turns out with the right window scaling you can get the code in all stacks to
always want to do a window update, even when no data can be sent. Now in
cases where you are not pacing thats probably ok, you just send an extra
window update or two. However with bbr (and rack if its paced) every time
the pacer goes off its going to send a "window update".
Also in testing bbr I have found that if we are not responding to
data right away we end up staying in startup but incorrectly holding
a pacing gain of 192 (a loss). This is because the idle window code
does not restict itself to only work with PROBE_BW. In all other
states you dont want it doing a PROBE_BW state change.
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D25247
gallatin [Fri, 12 Jun 2020 18:41:12 +0000 (18:41 +0000)]
x86: Bump default msi/msix vector limit to 2048
Given that 64c/128t CPUs are currently available, and that many
devices (nvme, many NICs) desire to map 1 MSI-X vector per core,
or even 1 per-thread, it is becoming far easier to see MSI-X interrupt
setup fail due to msi vector exhaustion, and devices fail to attach at
boot on large system.
This bump costs 12KB on amd64 (and 6KB on i386), which seems
worth the trade off for a better out of the box experience on
high end hardware.
Reviewed by: jhb
MFC after: 21 days
Sponsored by: Netflix
kevans [Fri, 12 Jun 2020 18:13:32 +0000 (18:13 +0000)]
posix_spawn: fix for some custom allocator setups
libc cannot assume that aligned_alloc and free come from jemalloc, or that
any application providing its own malloc and free is actually providing
aligned_alloc.
Switch back to malloc and just make sure we're passing a properly aligned
stack into rfork_thread, as an application perhaps can't reasonably replace
just malloc or just free without headaches.
This unbreaks ksh93 after r361996, which provides malloc/free but no
aligned_alloc.
Reported by: freqlabs
Diagnosed by: Andrew Gierth <andrew_tao173.riddles.org.uk>
X-MFC-With: r361996
dougm [Fri, 12 Jun 2020 16:51:55 +0000 (16:51 +0000)]
The linuxkpi code accesses left/right rb tree pointers without using
RB_LEFT or RB_RIGHT, so they aren't stripping off the color bit
encoded there. Strip off that bit for linuxkpi.
adrian [Fri, 12 Jun 2020 06:10:27 +0000 (06:10 +0000)]
[wlanstats] Add the per-node amsdu hardware decap'ed receive stats.
This is useful for tracking hardware provided AMSDU frames to see
when we're (a) seeing them, and (b) seeing the split between
intermediary and final frames.
adrian [Fri, 12 Jun 2020 04:19:03 +0000 (04:19 +0000)]
[net80211] First part of A-MSDU offload handling - don't bump A-MPDU reordering seqno
When doing A-MSDU offload handling the driver is required to mark
A-MSDUs from the same MPDU with the same sequence number.
It then tags them as AMSDU (if it's a decap'ed A-MSDU) and AMSDU_MORE
(saying there's more AMSDUs decapped in the same MSDU.)
This allows encryption and sequence number offload to work right.
In the A-MSDU path the sequence number check looks at the A-MSDU flags
in the frame to see whether it's part of the same seqno and will pass them
(ie, not increment rx_seq until the last A-MSDU is seen from the driver,
or a new seqno shows up.0
However, I did this work in the A-MSDU path but not the A-MSDU in A-MPDU path.
For the non A-MDSU offload case the A-MPDU receive reordering will do its
thing and then pass up the MPDU up for decap - which then will see it's
an A-MSDU and decap each sub-frame. But this isn't done for offloaded
A-MSDU frames.
This requires two parts:
* Don't bump the RX sequence number, same as above; and
* If frames go into the reordering buffer, they need to be added into the slot
as a set of frames rather than a single frame, so once a new seqno shows up
this slot can be marked as "full" and we can move on.
This patch does the first. The latter requires that I find and commit
work to change rxa_m from an mbuf to an mbufq and the nhandle A-MSDU
there. But, the first is enough to allow the normal case (ie, no or not
a lot of A-MPDU RX reordering) to work.
This allows the athp driver (QCA9880) throughput to go from VERY low
(like 5mbit TCP, 1/3-1/4 expected UDP throughput) to ~ 250mbit TCP
and > 300mbit UDP on a VHT/40 channel. TCP sucks because, well, it
shows up as MASSIVE packet loss when all but one frame in a decap'ed
A-MSDU stream is dropped. Le whoops.
Now, where'd I put that laptop with the patch for rxa_m mbufq that
I wrote like in 2017...
Tested:
* AR9380, STA/AP mode (a big no-op, no A-MSDU hardware decap);
* if_run (RT3593), STA DWDS mode (A-MPDU / A-MSDU receive, but again
no A-MSDU hardware decap);
* QCA9880, STA/AP mode (which is doing hardware A-MPDU/A-MSDU decap,
but no A-MPDU reordering in the firmware.)
rpokala [Thu, 11 Jun 2020 22:46:08 +0000 (22:46 +0000)]
Decode the "LACP Fast Timeout" LAGG option flag
r286700 added the "lacp_fast_timeout" option to `ifconfig', but we forgot to
include the new option in the string used to decode the option bits. Add
"LACP_FAST_TIMO" to LAGG_OPT_BITS.
Also, s/LAGG_OPT_LACP_TIMEOUT/LAGG_OPT_LACP_FAST_TIMO/g , to be clearer that
the flag indicates "Fast Timeout" mode.
vmaffione [Thu, 11 Jun 2020 20:35:28 +0000 (20:35 +0000)]
netmap: introduce netmap_kring_on()
This function returns NULL if the ring identified by
queue id and direction is in netmap mode. Otherwise
return the corresponding kring.
Use this function to replace vtnet_netmap_queue_on().