Kyle Evans [Wed, 26 Aug 2020 00:50:27 +0000 (00:50 +0000)]
MFC r364600: caroot: switch to using echo+shell glob to enumerate certs
This solves an issue on stable/12 that causes certs to not get installed.
ls is apparently not in PATH during installworld, so TRUSTED_CERTS ends up
blank and nothing gets installed. We don't really require anything
ls-specific, though, so let's just simplify it.
Alexander Motin [Wed, 26 Aug 2020 00:28:28 +0000 (00:28 +0000)]
MFC r364399: Remove some noisy ACPI tables messages from verbose dmesg.
Those messages were printed hundreds of times during boot, often multiple
times for each table. We already print information about the tables once,
in a more organized form, so there is no need to duplicate it whenever
random ACPI drivers attach.
Code was checking for NETMAP_{SW,HW}_RING in req->nr_ringid which
had already been masked by NETMAP_RING_MASK. Therefore, the comparisons
always failed and set NR_REG_ALL_NIC. Check against the original nmr
structure.
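
The masking problem described above can be illustrated with a small sketch.
The ex_* names and the enum are hypothetical, and the flag and mask values
are assumptions modelled on the legacy netmap ringid encoding; this is not
the actual netmap code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed, illustrative values modelled on the legacy ringid encoding. */
#define EX_HW_RING	0x4000u	/* flag: bind one hardware ring */
#define EX_SW_RING	0x2000u	/* flag: bind the host (software) ring */
#define EX_RING_MASK	0x0fffu	/* low bits: ring index */

enum ex_regmode { EX_REG_ALL_NIC, EX_REG_SW, EX_REG_ONE_NIC };

/* The mask keeps only the index, so the flag bits cannot survive it. */
static_assert((EX_RING_MASK & (EX_HW_RING | EX_SW_RING)) == 0,
    "flag bits lie outside the ring index mask");

/* Buggy variant: tests the flags on the already masked value. */
static enum ex_regmode
ex_regmode_buggy(uint16_t orig_ringid)
{
	uint16_t masked = orig_ringid & EX_RING_MASK;

	if (masked & EX_SW_RING)	/* can never be true */
		return (EX_REG_SW);
	if (masked & EX_HW_RING)	/* can never be true */
		return (EX_REG_ONE_NIC);
	return (EX_REG_ALL_NIC);
}

/* Fixed variant: tests the flags on the original, unmasked value. */
static enum ex_regmode
ex_regmode_fixed(uint16_t orig_ringid)
{
	if (orig_ringid & EX_SW_RING)
		return (EX_REG_SW);
	if (orig_ringid & EX_HW_RING)
		return (EX_REG_ONE_NIC);
	return (EX_REG_ALL_NIC);
}
```

With these definitions the buggy variant returns EX_REG_ALL_NIC for every
input, which matches the behaviour the commit message reports.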
Navdeep Parhar [Tue, 25 Aug 2020 02:14:36 +0000 (02:14 +0000)]
MFC r351444, r357475, r357479, r357481-r357482, r358859, and r364497.
All these are rx improvements in the cxgbe(4) driver.
r351444:
cxgbe(4): Use the same buffer size for TOE rx queues as the NIC rx queues.
This is a minor simplification.
r357475:
cxgbe(4): Initialize the rx buffer's metadata on first-use and not on
allocation.
refill_fl doesn't touch any part of a freshly allocated cluster after
this change.
r357479:
cxgbe(4): Avoid ext_arg2 in rxb_free.
ext_arg2 is the only item in the third cacheline in an mbuf and could be
cold by the time rxb_free runs. Put the information needed by rxb_free
in the same line as the refcount, which is very likely to be hot given
that rxb_free runs when the refcount is decremented and reaches 0.
r357481:
cxgbe(4): Retire the allow_mbufs_in_cluster optimization.
This simplifies the driver's rx fast path as well as the bookkeeping
code that tracks various rx buffer sizes and layouts.
r357482:
cxgbe(4): Treat NIC rx as special and run its handler directly and not
via the t4_cpl_handler dispatch table.
r358859:
cxgbe(4): Do not try to use 0 as an rx buffer address when the driver is
already allocating from the safe zone and the allocation fails.
This bug was introduced in r357481.
r364497:
cxgbe(4): Use large clusters for TOE rx queues when TOE+TLS is enabled.
Rx is more efficient within the chip when the receive buffer size
matches the TLS PDU size.
Linuxulator depends on fundamental kernel settings such as SMP. Many
of them are listed in opt_global.h, which is not generated while building
modules outside of a kernel, so such modules never match the real configured
kernel.
So, we should prevent our users from building obviously defective modules.
Therefore, remove the root cause of building modules outside of a
kernel: the possibility of building modules with DEBUG or KTR flags.
Also remove all of the DEBUG printfs, as they are incomplete and not
informative in threaded programs; half of the system calls have no DEBUG
printf anyway. For debugging Linux programs we have dtrace, ktr and ktrace.
Rework linux accept(2). This makes the code flow easier to follow,
and fixes a bug where calling accept(2) could result in closing fd 0.
Note that the code still contains a number of problems: it makes
assumptions about l_sockaddr_in being the same as sockaddr_in,
the EFAULT-related code looks like it doesn't work at all, and the
socket type check is racy. Those will be addressed later on;
I'm trying to work in small steps to avoid breaking one thing while
fixing another.
Make linux(4) create /dev/shm. Linux applications often expect
a tmpfs to be mounted there, and because they like to verify it's
actually a mountpoint, a symlink won't do.
Our bsd_to_linux_sockaddr() and linux_to_bsd_sockaddr() functions
alter the userspace sockaddr to convert the format between Linux and BSD versions.
That's a minimum of 3 copyin/copyout operations for one syscall.
Also, some syscalls use linux_sa_put() and linux_getsockaddr() when copying a
sockaddr to userspace or from userspace, respectively.
To avoid this chaos, especially converting the sockaddr in userspace,
rewrite these 4 functions to convert the sockaddr only in the kernel and keep
only 2 of these functions.
Also, in order to reduce duplication between MD parts of the Linuxulator, move
the struct sockaddr conversion functions that are MI into the linux_common module.
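
As a rough illustration of converting a sockaddr in a kernel-side buffer
instead of rewriting the caller's copy, here is a minimal sketch. The ex_*
structures and functions are hypothetical stand-ins, not the Linuxulator
code. The layouts reflect the usual difference (a 16-bit family on Linux,
an 8-bit length plus 8-bit family on BSD), and only three address families
are handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local copies of the two layouts; not the kernel headers. */
struct ex_linux_sockaddr {
	uint16_t	sa_family;
	char		sa_data[14];
};

struct ex_bsd_sockaddr {
	uint8_t		sa_len;
	uint8_t		sa_family;
	char		sa_data[14];
};

static_assert(sizeof(struct ex_linux_sockaddr) ==
    sizeof(struct ex_bsd_sockaddr), "both generic layouts are 16 bytes");

#define EX_LINUX_AF_INET6	10
#define EX_BSD_AF_INET6		28

/* Map a Linux address family to the BSD one; -1 if unsupported here. */
static int
ex_linux_to_bsd_family(int lfam)
{
	switch (lfam) {
	case 1:				/* AF_UNIX, same value on both */
	case 2:				/* AF_INET, same value on both */
		return (lfam);
	case EX_LINUX_AF_INET6:
		return (EX_BSD_AF_INET6);
	default:
		return (-1);
	}
}

/* Convert into a separate buffer; the input is never modified. */
static int
ex_linux_to_bsd_sockaddr(const struct ex_linux_sockaddr *in,
    struct ex_bsd_sockaddr *out)
{
	int fam = ex_linux_to_bsd_family(in->sa_family);

	if (fam < 0)
		return (-1);
	out->sa_len = sizeof(*out);
	out->sa_family = (uint8_t)fam;
	memcpy(out->sa_data, in->sa_data, sizeof(out->sa_data));
	return (0);
}
```

Because the conversion writes to a second buffer, a real implementation
along these lines would need one copyin of the user data and no copyout
just to fix up the format.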
It is documented as a raw hardware-based clock not subject to NTP or
incremental adjustments. With this "not as precise as CLOCK_MONOTONIC"
description in mind, map it to our CLOCK_MONOTONIC_FAST (the same
mapping as for the Linux CLOCK_MONOTONIC_COARSE).
This is needed for the webcomponent of steam (chromium) and some
other steam component or game.
The linux-steam-utils port contains a LD_PRELOAD based fix for this.
There this is mapped to CLOCK_MONOTONIC.
As an untrained ear/eye (= the majority of people) normally does not
notice a difference in jitter in the 10-20 ms range, especially
when not paying attention, for example in a browser session
while watching a video stream, the mapping to CLOCK_MONOTONIC_FAST
seems more appropriate than to CLOCK_MONOTONIC.
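
A minimal sketch of such a clock id mapping follows. The ex_* names are
hypothetical and the numeric ids are copied in as local constants for
illustration; the real translation lives in linux_to_native_clockid() and
covers more clocks than shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Linux clock ids, copied as local constants for illustration. */
#define EX_LINUX_CLOCK_REALTIME		0
#define EX_LINUX_CLOCK_MONOTONIC	1
#define EX_LINUX_CLOCK_MONOTONIC_RAW	4
#define EX_LINUX_CLOCK_MONOTONIC_COARSE	6

/* FreeBSD clock ids, copied as local constants for illustration. */
#define EX_BSD_CLOCK_REALTIME		0
#define EX_BSD_CLOCK_MONOTONIC		4
#define EX_BSD_CLOCK_MONOTONIC_FAST	12

/*
 * Translate a Linux clock id to a native one.  On success store the
 * native id in *nclk and return 0; otherwise return EINVAL and leave
 * *nclk untouched.
 */
static int
ex_linux_to_native_clockid(int lclk, int *nclk)
{
	assert(nclk != NULL);

	switch (lclk) {
	case EX_LINUX_CLOCK_REALTIME:
		*nclk = EX_BSD_CLOCK_REALTIME;
		return (0);
	case EX_LINUX_CLOCK_MONOTONIC:
		*nclk = EX_BSD_CLOCK_MONOTONIC;
		return (0);
	case EX_LINUX_CLOCK_MONOTONIC_RAW:	/* same mapping as COARSE */
	case EX_LINUX_CLOCK_MONOTONIC_COARSE:
		*nclk = EX_BSD_CLOCK_MONOTONIC_FAST;
		return (0);
	default:
		return (EINVAL);
	}
}
```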
MFC r363130 by netchild:
Fix r363125 (Implement CLOCK_MONOTONIC_RAW (linux >= 2.6.28)),
by really using the MONOTONIC version and not the REALTIME version.
Fix clock_gettime() and clock_getres() for cpu clocks:
- handle the CLOCK_{PROCESS,THREAD}_CPUTIME_ID specified directly;
- fix thread id calculation as in the Linuxulator we should
convert the user supplied thread id to struct thread * by linux_tdfind();
- fix CPUCLOCK_SCHED case by using kern_{process,thread}_cputime()
directly as native get_cputime() used by kern_clock_gettime() uses
native tdfind()/pfind() to find the process/thread.
linux_to_native_clockid() properly initializes the nwhich variable (or returns an error),
so don't initialize nwhich in its declaration and remove the stale comment from r161304.
r363843:
linuxkpi: Add time_after32 and time_before32
These compare two 32-bit times.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib, hselasky
Differential Revision: https://reviews.freebsd.org/D25700
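
For readers unfamiliar with wrap-safe time comparison, the usual technique
is to look at the sign of the 32-bit difference. The sketch below uses
hypothetical ex_* names and follows the well-known Linux idiom; it is not
copied from the linuxkpi header.

```c
#include <assert.h>
#include <stdint.h>

/* True if 32-bit time a is after b, even across a counter wrap. */
#define ex_time_after32(a, b) \
	((int32_t)((uint32_t)(b) - (uint32_t)(a)) < 0)

/* True if 32-bit time a is before b. */
#define ex_time_before32(a, b)	ex_time_after32(b, a)

/* A small value just past the wrap counts as after a large one. */
static_assert(ex_time_after32(2u, 0xfffffff0u),
    "comparison survives counter wraparound");
```

The comparison is only meaningful while the two times are less than 2^31
ticks apart.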
r364232:
linuxkpi: Add a few wait_bit functions
The Linux function does a lot more than that, as multiple waitqueues could be fetched
from a static table based on the hash of the argument, but since in DRM it's only used
in one place, just add a single variable.
We will probably need to change that in the future, but it's OK with DRM even with current
Linux.
Reviewed by: hselasky
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26054
Emmanuel Vadot [Mon, 24 Aug 2020 13:14:38 +0000 (13:14 +0000)]
MFC r361450, r361452, r361550-r361551
r361450:
linuxkpi: Add refcount.h
Implement some refcount functions needed by drm.
Just use the atomic_t struct and functions from linuxkpi for simplicity.
Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24985
r361452:
linuxkpi: Fix mod_timer and del_timer_sync
mod_timer is supposed to return 1 if the modified timer was pending, which
is exactly what callout_reset does, so return the value after checking
that it's a correct one in case the API changes.
del_timer_sync returns int, so add a function and handle that.
Since handlers are called in a thread context, we can simply use a workqueue
to emulate those functions.
The DRM code was patched to do that already, having it in linuxkpi allows us
to not patch the upstream code.
Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24859
r361343:
linuxkpi: Add rcu_work functions
The rcu_work function helps to queue some work after waiting for a grace
period.
This is needed by DRM drivers.
Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24942
r361418:
libkern: Add arc4random_uniform
This variant gets a random number below the limit passed as the argument.
This is simply a copy of the libc version.
Sponsored-by: The FreeBSD Foundation
Reviewed by: cem, hselasky (previous version)
Differential Revision: https://reviews.freebsd.org/D24962
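
The general technique behind a uniform bounded random number is rejection
sampling, which avoids modulo bias. The following is a sketch of that
technique only: ex_uniform(), the callback type and the counter-based
generator are hypothetical, and the generator is deliberately
deterministic so the behaviour can be followed by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical source of 32 random bits. */
typedef uint32_t (*ex_rng32_fn)(void *);

/* Return a value in [0, upper_bound) without modulo bias. */
static uint32_t
ex_uniform(uint32_t upper_bound, ex_rng32_fn rng, void *arg)
{
	uint32_t r, min;

	assert(rng != NULL);
	if (upper_bound < 2)
		return (0);

	/* 2^32 mod upper_bound, computed in 32-bit arithmetic. */
	min = -upper_bound % upper_bound;

	/* Reject the low values that would make some results likelier. */
	do {
		r = rng(arg);
	} while (r < min);

	return (r % upper_bound);
}

/* Deterministic stand-in generator: returns 0, 1, 2, ... */
static uint32_t
ex_counter_rng(void *arg)
{
	uint32_t *state = arg;

	return ((*state)++);
}
```

With an upper bound of 10 the rejection threshold is 6, so the counter
generator's first six outputs are discarded before a value is accepted.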
r361419:
linuxkpi: Add prandom_u32_max
This is just a wrapper around arc4random_uniform
Needed by DRM v5.3
Sponsored-by: The FreeBSD Foundation
Reviewed by: cem, hselasky
Differential Revision: https://reviews.freebsd.org/D24961
r361422:
bbr: Use arc4random_uniform from libkern.
This unbreaks the LINT build.
Reported by: jenkins, melifaro
r361449:
linuxkpi: Add __same_type and __must_be_array macros
The __same_type macro simply wraps __builtin_types_compatible_p, which
exists for both GCC and Clang and returns 1 if both types are the same.
The __must_be_array macro returns 1 if the argument is an array.
This is needed for DRM v5.3
Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24953
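
A sketch of how such type checks can be built on the compiler builtin is
shown below. It requires GCC or Clang. The ex_* macros are hypothetical;
in particular ex_is_array is just one way to get "1 for an array" and is
not necessarily how linuxkpi defines __must_be_array.

```c
#include <assert.h>

/* 1 if both expressions have compatible types, else 0. */
#define ex_same_type(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/*
 * 1 for a true array, 0 for a pointer: an array type is never
 * compatible with the pointer type of &(a)[0], while a pointer is.
 */
#define ex_is_array(a)	(!ex_same_type((a), &(a)[0]))

static int ex_ints[4];
static int *ex_ptr = ex_ints;

/* The builtin is a constant expression, so it works at compile time. */
static_assert(ex_is_array(ex_ints), "ex_ints is an array");
```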
Fix linuxulator prlimit64(2) with pid == 0. This makes 'ulimit -a'
return something reasonable, and helps linux binaries which attempt
to close all the files, eg apt(8).
Make linux(4) set the openfiles soft resource limit to 1024 for Linux
applications, which often depend on this being the case. There's a new
sysctl, compat.linux.default_openfiles, to control this behaviour.
Move compat.linux.map_sched_prio sysctl definition to linux_mib.c so it is
only defined by linux_common kernel module and not both linux and linux64
modules.
linuxulator: Map scheduler priorities to Linux priorities.
On Linux the valid range of priorities for the SCHED_FIFO and SCHED_RR
scheduling policies is [1,99]. For SCHED_OTHER the single valid priority is
0. On FreeBSD it is [0,31] for all policies. Programs are supposed to
query the valid range using sched_get_priority_(min|max), but of course some
programs assume the Linux values are valid.
This commit adds a tunable compat.linux.map_sched_prio. When enabled
sched_get_priority_(min|max) return the Linux values and sched_setscheduler
and sched_(get|set)param translate between FreeBSD and Linux values.
Because there are more Linux levels than FreeBSD levels, multiple Linux
levels map to a single FreeBSD level, which means pre-emption might not
happen as it does on Linux, so the tunable allows to disable this behaviour.
It is enabled by default because I think it is unlikely that anyone runs
real-time software under Linux emulation on FreeBSD that critically relies
on correct pre-emption.
This fixes FMOD, a commercial sound library used by several games.
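
To make the many-to-one mapping concrete, here is a sketch of one possible
linear translation between the two ranges. The ex_* functions and the
exact rounding are assumptions for illustration; the formula used by the
Linuxulator may differ.

```c
#include <assert.h>

#define EX_LINUX_RT_MIN	1	/* Linux SCHED_FIFO/SCHED_RR minimum */
#define EX_LINUX_RT_MAX	99	/* Linux SCHED_FIFO/SCHED_RR maximum */
#define EX_BSD_RT_MIN	0	/* FreeBSD minimum */
#define EX_BSD_RT_MAX	31	/* FreeBSD maximum */

/* Scale a Linux priority in [1,99] down to a FreeBSD one in [0,31]. */
static int
ex_linux_to_bsd_prio(int lprio)
{
	assert(lprio >= EX_LINUX_RT_MIN && lprio <= EX_LINUX_RT_MAX);
	return ((lprio - EX_LINUX_RT_MIN) * (EX_BSD_RT_MAX - EX_BSD_RT_MIN) /
	    (EX_LINUX_RT_MAX - EX_LINUX_RT_MIN) + EX_BSD_RT_MIN);
}

/* Scale a FreeBSD priority in [0,31] up to a Linux one in [1,99]. */
static int
ex_bsd_to_linux_prio(int bprio)
{
	assert(bprio >= EX_BSD_RT_MIN && bprio <= EX_BSD_RT_MAX);
	return ((bprio - EX_BSD_RT_MIN) * (EX_LINUX_RT_MAX - EX_LINUX_RT_MIN) /
	    (EX_BSD_RT_MAX - EX_BSD_RT_MIN) + EX_LINUX_RT_MIN);
}
```

In this sketch Linux priorities 1 and 2 both land on FreeBSD priority 0,
which is the loss of distinction the commit message warns about.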
Add compat.linux.ignore_ip_recverr sysctl. This is a workaround
for missing IP_RECVERR setsockopt(2) support. Without it, DNS
resolution is broken for glibc >= 2.30 (glibc BZ #24047).
From the user point of view this fixes "yum update" on recent
CentOS 8.
Support SO_SNDBUFFORCE/SO_RCVBUFFORCE by aliasing them to the
standard SO_SNDBUF/SO_RCVBUF. Mostly cosmetics, to get rid
of the warning during 'apt upgrade'.
Emmanuel Vadot [Mon, 24 Aug 2020 10:46:09 +0000 (10:46 +0000)]
MFC r361007, r361138-r361140, r361245-r361246
r361007:
linuxkpi: Add EBADRQC to errno.h
This is used in the amdgpu driver from Linux 5.2
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24807
r361138:
linuxkpi: Add atomic_dec_and_mutex_lock
This function decrements the counter; if the result is 0 it acquires
the mutex and returns 1, otherwise it simply returns 0.
Needed by DRM from Linux v5.3
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24847
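
The semantics can be sketched in userland with C11 atomics and a pthread
mutex. ex_dec_and_mutex_lock() is a hypothetical stand-in written for
illustration under those assumptions; it is not the linuxkpi
implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/*
 * Decrement *cnt.  Return 1 with the mutex held if the counter reached
 * 0, otherwise return 0 without holding the mutex.
 */
static int
ex_dec_and_mutex_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
	int old = atomic_load(cnt);

	assert(old > 0);

	/* Fast path: decrement lock-free unless that would reach 0. */
	while (old != 1) {
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return (0);
	}

	/* Slow path: the counter may hit 0, so decide under the mutex. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(cnt, 1) != 1) {
		/* Someone else took a reference in the meantime. */
		pthread_mutex_unlock(lock);
		return (0);
	}
	return (1);
}
```

Taking the mutex before the final decrement means that whoever sees the
counter hit 0 already holds the lock protecting the object's teardown.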
r361139:
linuxkpi: Add __mutex_init
Same as mutex_init. The lock_class_key argument seems to be used only for
debugging in Linux, so simply ignore it for now.
Needed by DRM in Linux v5.3
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24848
r361140:
linuxkpi: Add offsetofend macro
This calculates the offset of the end of the member in the given struct.
Needed by DRM in Linux v5.3
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24849
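
One common way to write such a macro is offsetof plus the member's size,
as in the sketch below. ex_offsetofend and struct ex_hdr are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte past MEMBER within TYPE. */
#define ex_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical example structure. */
struct ex_hdr {
	uint32_t	id;
	uint16_t	flags;
	uint8_t		payload[10];
};

/* With no padding in between, id ends exactly where flags begins. */
static_assert(ex_offsetofend(struct ex_hdr, id) ==
    offsetof(struct ex_hdr, flags), "id is followed directly by flags");
```

A typical use is checking that a caller-supplied structure size is large
enough to contain a given field.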
r361245:
linuxkpi: Add __init_waitqueue_head
The only difference from init_waitqueue_head is that the name and the
lock class key are provided, but we don't use those, so call init_waitqueue_head
directly.
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24861
r361246:
linuxkpi: add pci_dev_present
pci_dev_present shows whether a set of PCI IDs is present in the system.
It just wraps pci_find_device.
Needed by DRM v5.2
Emmanuel Vadot [Mon, 24 Aug 2020 10:42:04 +0000 (10:42 +0000)]
MFC r360787, r360851, r360870-r360872
r360787:
linuxkpi: Add pci_iomap and pci_iounmap
Those functions are used to map/unmap the I/O regions of a PCI device.
Different resources can be mapped depending on the BAR, so use a
tailq to store them all.
Sponsored-by: The FreeBSD Foundation
Reviewed by: emaste, hselasky
Differential Revision: https://reviews.freebsd.org/D24696
r360851:
linuxkpi: Add bitmap_copy and bitmap_andnot
bitmap_copy simply copies the bitmaps, no idea why it exists.
bitmap_andnot is similar to bitmap_and but uses ~src2 (the bitwise complement).
Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24782
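
A sketch of an and-not over word arrays is shown below. ex_bitmap_andnot
is a hypothetical helper that combines the first source with the bitwise
complement of the second; the return convention (nonzero if any result
bit is set) is an assumption borrowed from the Linux API, and this is not
the linuxkpi code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define EX_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/*
 * dst = src1 & ~src2 over the first nbits bits.
 * Returns 1 if any bit is set in the result, else 0.
 */
static int
ex_bitmap_andnot(unsigned long *dst, const unsigned long *src1,
    const unsigned long *src2, size_t nbits)
{
	size_t i, full = nbits / EX_BITS_PER_LONG;
	size_t rem = nbits % EX_BITS_PER_LONG;
	unsigned long any = 0;

	assert(dst != NULL && src1 != NULL && src2 != NULL);

	/* Whole words first. */
	for (i = 0; i < full; i++)
		any |= (dst[i] = src1[i] & ~src2[i]);

	/* Then the partial last word, ignoring bits past nbits. */
	if (rem != 0) {
		unsigned long mask = (1UL << rem) - 1;

		any |= (dst[i] = src1[i] & ~src2[i] & mask);
	}
	return (any != 0);
}
```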
r360870:
linuxkpi: Add bitmap_alloc and bitmap_free
This is a simple call to kmalloc_array/kfree, therefore include linux/slab.h, as
this is where the kmalloc_array/kfree definitions are.
Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24794
r360871:
linuxkpi: Really add bitmap_alloc and bitmap_zalloc
This was missing in r360870
Sponsored-by: The FreeBSD Foundation
r360872:
qlnx: Do not redefine types.
r360870 added linux/slab.h to linux/bitmap.h, and this includes linux/types.h.
The qlnx driver is redefining some of those types, so remove them and add an
explicit linux/types.h include.
Pointy hat: manu
Reported by: Austin Shafer <ashafer@badland.io>
Michael Tuexen [Mon, 24 Aug 2020 09:15:52 +0000 (09:15 +0000)]
MFC r363456:
Clear the pointer to the socket when closing it also in case of
an ungraceful operation.
This fixes a use-after-free bug found and reported by Taylor
Brandstetter of Google by testing the userland stack.
Michael Tuexen [Mon, 24 Aug 2020 09:13:06 +0000 (09:13 +0000)]
MFC r363323:
Add reference counts for inp/stcb/net when timers are running.
This avoids a use-after-free reported for the userland stack.
Thanks to Taylor Brandstetter for suggesting a patch for
the userland stack.
Michael Tuexen [Mon, 24 Aug 2020 09:10:19 +0000 (09:10 +0000)]
MFC r363275:
Improve the locking of address lists by adding some asserts and
rearranging the addition of addresses such that the lock is not
given up during checking and adding.
Michael Tuexen [Mon, 24 Aug 2020 09:06:46 +0000 (09:06 +0000)]
MFC r363194:
Improve the error handling in generating ASCONF chunks.
In case of errors, the cleanup was not consistent.
Thanks to Felix Weinrank for fuzzing the userland stack and making
me aware of the issue.
Michael Tuexen [Mon, 24 Aug 2020 09:00:07 +0000 (09:00 +0000)]
MFC r363076:
Fix a use-after-free bug for the userland stack. The kernel
stack is not affected.
Thanks to Mark Wodrich from Google for finding and reporting the
bug.
Michael Tuexen [Mon, 24 Aug 2020 08:58:45 +0000 (08:58 +0000)]
MFC r363046:
Optimize flushing of receive queues.
This addresses an issue found and reported for the userland stack in
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=21243
Michael Tuexen [Mon, 24 Aug 2020 08:38:58 +0000 (08:38 +0000)]
MFC r363008:
Improve handling of PKTDROP chunks. This includes the input validation
to address two issues found by ossfuzz testing the userland stack:
* https://oss-fuzz.com/testcase-detail/5387560242380800
* https://oss-fuzz.com/testcase-detail/4887954068865024
and adding support for I-DATA chunks in addition to DATA chunks.
Michael Tuexen [Mon, 24 Aug 2020 08:35:13 +0000 (08:35 +0000)]
MFC r362722:
Don't send packets containing ERROR chunks in response to unknown
chunks when being in a state where the verification tag to be used
is not known yet.
Michael Tuexen [Mon, 24 Aug 2020 08:32:16 +0000 (08:32 +0000)]
MFC r362581:
Fix the accounting for fragmented unordered messages when using
interleaving.
This was reported for the userland stack in
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=19321
Michael Tuexen [Mon, 24 Aug 2020 08:25:00 +0000 (08:25 +0000)]
MFC r362473:
Cleanup the definition of struct sctp_getaddresses. This structure
is used by the IPPROTO_SCTP level socket options SCTP_GET_PEER_ADDRESSES
and SCTP_GET_LOCAL_ADDRESSES, which are used by libc to implement
sctp_getladdrs() and sctp_getpaddrs().
These changes allow an old libc to work on a newer kernel.
Michael Tuexen [Mon, 24 Aug 2020 08:19:25 +0000 (08:19 +0000)]
MFC r362451:
Use a struct sockaddr_in or struct sockaddr_in6 as the option value
for the IPPROTO_SCTP level socket options SCTP_BINDX_ADD_ADDR and
SCTP_BINDX_REM_ADDR. These socket options are intended for internal
use only to implement sctp_bindx().
This is one user of struct sctp_getaddresses less.
struct sctp_getaddresses is strange and will be changed shortly.