MFC r347188:
Disabling a PCI device should only disable busmaster in the LinuxKPI.
As the Linux comment for this function points out:
Signal to the system that the PCI device is not in use by the system
anymore. This only involves disabling PCI bus-mastering, if active.
MFC r347185:
Allow controlling pr_debug at runtime in the LinuxKPI.
Turning on pr_debug at compile time makes it non-optional at runtime.
This often means that the amount of debugging output is unbearable.
Allow developers to turn on pr_debug output only when needed.
Ian Lepore [Thu, 16 May 2019 15:30:35 +0000 (15:30 +0000)]
MFC r346968, r346973
r346968:
Update the manpage text to show the output generated by the first-stage
bootloader these days (x86 instead of i386).
r346973:
Add a paragraph that mentions gptboot having an interactive mode, and
direct the user to the boot(8) manpage, which provides the details on that.
Change default value of kern.bootfile to reflect reality
In most cases kern.bootfile is populated from the information
provided by loader(8). There are certain scenarios when the loader
is not available, for instance when the kernel is loaded by u-boot
or some other BootROM directly. In this case the default value
"/kernel" points to an invalid location and breaks some functionality,
like using installkernel on a self-hosted system or dtrace's CTF
lookup. This can be fixed by setting the value manually, but a
default that reflects the correct location is better than a default that
points to an invalid one.
Current default was set around FreeBSD 1, when "/kernel" was the
actual path. Transition to /boot/kernel/kernel happened circa FreeBSD 3.
Ian Lepore [Wed, 15 May 2019 17:50:17 +0000 (17:50 +0000)]
MFC r347422:
Allow dcons(4) to be unloaded when loaded as a module.
When the module is unloaded, the tty devices are destroyed. That requires
implementing the tsw_free callback to avoid a panic. This driver requires
no particular cleanup to be done from the callback, but the module itself
must remain in memory until the deferred tsw_free callbacks are invoked.
These changes implement that by incrementing a reference count variable in
the detach routine, and decrementing it in the tsw_free callback. The
MOD_UNLOAD event handler doesn't return until the count drops to zero.
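The counting scheme described above can be modelled in user space. This is
only a sketch, not the driver code: the class and method names (UnloadGate,
wait_for_unload) are hypothetical, and a Python condition variable stands in
for whatever sleep/wakeup mechanism the kernel module actually uses.

```python
import threading


class UnloadGate:
    """Counts tty devices whose deferred free callback is still pending."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0

    def detach(self):
        # Called once per tty device being destroyed: one more callback owed.
        with self._cond:
            self._pending += 1

    def tsw_free(self):
        # Deferred callback: the tty layer has finished with one device.
        with self._cond:
            self._pending -= 1
            if self._pending == 0:
                self._cond.notify_all()

    def wait_for_unload(self, timeout=None):
        # Stand-in for the MOD_UNLOAD handler: block until nothing is pending.
        # Returns True once the count is zero, False if the timeout expires.
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0, timeout)
```

The point of the design is that the unload path never has to know when the
callbacks will run; it only has to wait until every detach has been balanced
by a tsw_free.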
Enji Cooper [Wed, 15 May 2019 07:51:35 +0000 (07:51 +0000)]
MFC r347075:
Fix `clang -Wcast-qual` issues
Remove unnecessary `char*` casting for arguments passed to `cget*(3)`, and
deconst `_PATH_PRINTCAP` before passing it to `cget*` via the `printcapdb`
variable.
This unblocks ^/projects/runtime-coverage-v2 from building cleanly on
universe13a.freebsd.org. I suspect the issue was introduced through some
changes to `bsd.*.mk` inclusion on the branch, which I will continue to
investigate/isolate.
Alexander Motin [Wed, 15 May 2019 01:38:34 +0000 (01:38 +0000)]
MFC r347240: Fix dataset name comparison in zfs_compare().
The code never returned a match when comparing two datasets (not snapshots).
As a result, uu_avl_find(), called from zfs_callback(), never succeeded,
allowing the same dataset to be added to the list multiple times, for example:
# zfs get name pers pers pers@z pers@z
NAME PROPERTY VALUE SOURCE
pers name pers -
pers name pers -
pers@z name pers@z -
With the patch:
# zfs get name pers pers pers@z pers@z
NAME PROPERTY VALUE SOURCE
pers name pers -
pers@z name pers@z -
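The effect of the comparison fix can be illustrated with a small model. This
is a sketch under assumptions, not the zfs_compare() source: compare_names()
and unique_names() are hypothetical helpers, and snapshots of the same dataset
are ordered by snapshot name here purely for simplicity.

```python
def compare_names(left, right):
    """Return -1, 0 or 1; a dataset sorts before its own snapshots."""
    lbase, lsep, lsnap = left.partition("@")
    rbase, rsep, rsnap = right.partition("@")
    if lbase != rbase:
        return -1 if lbase < rbase else 1
    if not lsep and not rsep:
        return 0  # two plain datasets with the same name: a match
    if not lsep:
        return -1  # dataset before its snapshot
    if not rsep:
        return 1
    # Both are snapshots of one dataset (assumption: order by snapshot name).
    return (lsnap > rsnap) - (lsnap < rsnap)


def unique_names(names):
    """Keep the first occurrence of each name, using compare_names() to match."""
    seen = []
    for name in names:
        if not any(compare_names(name, other) == 0 for other in seen):
            seen.append(name)
    return seen
```

With a comparison that can return 0 for two plain datasets,
unique_names(["pers", "pers", "pers@z", "pers@z"]) keeps one "pers" and one
"pers@z", which corresponds to the output shown with the patch.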
Factor out retrieving the interpreter path from the main ELF
loader routine.
MFC r345734 by kib:
Fix branding after r345661.
In particular, elf32 FreeBSD binaries were not executed on LP64 hosts.
The interp_name_len value should account for the nul terminator. This
is needed for strncmp()s in brand checking code to work.
Factor out resource limit enforcement code in the ELF loader.
It makes the code slightly easier to follow, and might make
it easier to fix the resource accounting to also account for
the interpreter.
The PROC_UNLOCK() is moved earlier - I don't see anything
it should protect; the lim_max() is a wrapper around lim_rlimit(),
and that, differently from lim_rlimit_proc(), doesn't require
the proc lock to be held.
Remove sv_pagesize, originally introduced with r100384.
In all of the architectures we have today, we always use PAGE_SIZE.
While in theory one could define different things, none of the
current architectures do, even the ones that have transitioned from
32-bit to 64-bit like i386 and arm. Some ancient mips binaries on
other systems used 8k instead of 4k, but we don't support running
those and likely never will due to their age and obscurity.
Differently from the original commit, the merge leaves the struct
member in place to preserve the ABI.
MFC r346028:
Fix URE_WDT6_SET_MODE value in the register definition.
Both the Linux and u-boot sources for the RTL8152 driver have this value.
The RTL8152 USB ethernet is used in the NanoPI R1 board as the second ethernet.
This fixes the RTL8152 USB ethernet not being detected after
reboot.
MFC r347241 (partial): Initial mechanism for mapping ifname <-> kld
if_tun/if_tap mappings have been removed and the vmnet mapping has been
updated to the if_tap module.
MFC r347392: ifconfig(8): Partial revert of r347241
r347241 introduced an ifname <-> kld mapping table, mostly so tun/tap/vmnet
can autoload the correct module on use. It also inadvertently made bogus
some previously valid uses of sizeof().
Revert back to ifkind on the stack for simplicity's sake. This reduces the
diff from the previous version of ifmaybeload for easier auditing.
MFC r347429: ifconfig(8): Add kld mappings for ipsec/enc
Additionally, providing mappings makes the comparison for already loaded
modules a little more strict. This should have been done at initial
introduction, but there was no real reason to do so. However, it proves
necessary for enc, which has a standard enc -> if_enc mapping, but there
also exists an 'enc' module that's actually CAM. The mapping lets us
unambiguously determine the correct module.
Stephen Hurd [Mon, 13 May 2019 18:48:08 +0000 (18:48 +0000)]
MFC r346708:
iflib: Better control over queue core assignment
By default, cores are now assigned to queues in a sequential
manner rather than all NICs starting at the first core. On a four-core
system with two NICs each using two queue pairs, the nic:queue -> core
mapping has changed from this:
0:0 -> 0, 0:1 -> 1
1:0 -> 0, 1:1 -> 1
To this:
0:0 -> 0, 0:1 -> 1
1:0 -> 2, 1:1 -> 3
Additionally, a device can now be configured to use separate cores for TX
and RX queues.
Two new tunables have been added, dev.X.Y.iflib.separate_txrx and
dev.X.Y.iflib.core_offset. If core_offset is set, the NIC is not part
of the auto-assigned sequence.
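The old and new default mappings can be reproduced with a few lines. This is
an illustrative model only: assign_cores() is a hypothetical helper, wrapping
around with a modulo when queues outnumber cores is an assumption, and the
separate_txrx and core_offset tunables are not modelled.

```python
def assign_cores(queues_per_nic, ncores, sequential=True):
    """Map (nic, queue) -> core for a list giving each NIC's queue count."""
    mapping = {}
    next_core = 0
    for nic, nqueues in enumerate(queues_per_nic):
        if not sequential:
            next_core = 0  # old behaviour: every NIC starts at the first core
        for queue in range(nqueues):
            mapping[(nic, queue)] = next_core % ncores
            next_core += 1
    return mapping
```

For the four-core, two-NIC case above, assign_cores([2, 2], 4) spreads the
four queue pairs over cores 0 to 3, while sequential=False puts both NICs on
cores 0 and 1.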
Reviewed by: marius
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D20029
MFC r346885:
Handle HAVE_PROTO flag and print "proto" keyword for O_IP4 and O_IP6
opcodes when it is needed.
This should fix the problem where a rule printed by `ipfw show` cannot
be added back due to the missing "proto" keyword.
Justin Hibbits [Sat, 11 May 2019 18:31:05 +0000 (18:31 +0000)]
MFC r345829, r345831
r345829:
powerpc: Apply r178139 from sparc64 to powerpc's fpu_sqrt
This fix was committed less than 2 months after the code was forked into the
powerpc kernel. Though powerpc doesn't use quad-precision floating point,
or need it for emulation, the changes do look like correctness fixes
overall.
This was found while trying to get fsqrt emulation working on e5500, which
does have a real FPU, but lacks the fsqrt instruction. This is not the
complete fix, the rest is to be committed separately.
r345831:
powerpc: Allow emulating optional FPU instructions on CPUs with an FPU
The e5500 has an FPU, but lacks the optional fsqrt instruction. This
instruction gets emulated in the kernel, but the emulation uses stale data,
from the last switch out, and does not return the result of the operation
immediately. Fix both of these conditions by saving and restoring the FPRs
around the emulation point.
Attempting to build www/firefox on POWER9 resulted in an HMI exception being
thrown, a fatal trap currently. This is typically caused by timer facility
errors, but examination of the Hypervisor Maintenance Exception Register
(HMER) yielded only that an exception had recovered, with no information of
the actual exception cause.
When an HMI occurs, OPAL_HANDLE_HMI or OPAL_HANDLE_HMI2 must be called to
handle the exception at the firmware level. If the exception is handled, we
can continue.
This adds only the preliminary handler, enough to prevent package building
from panicking. An enhancement in the future is to use the flags returned
by OPAL_HANDLE_HMI2 to print more useful error messages, and log maintenance
events.
Dimitry Andric [Sat, 11 May 2019 09:56:59 +0000 (09:56 +0000)]
MFC r347243:
Pull in r360099 from upstream llvm trunk (by Eli Friedman):
[ARM] Glue register copies to tail calls.
This generally follows what other targets do. I don't completely
understand why the special case for tail calls existed in the first
place; even when the code was committed in r105413, call lowering
didn't work in the way described in the comments.
Stack protector lowering breaks if the register copies are not glued
to a tail call: we have to insert the stack protector check before
the tail call, and we choose the location based on the assumption
that all physical register dependencies of a tail call are adjacent
to the tail call. (See FindSplitPointForStackProtector.) This is sort
of fragile, but I don't see any reason to break that assumption.
I'm guessing nobody has seen this before just because it's hard to
convince the scheduler to actually schedule the code in a way that
breaks; even without the glue, the only computation that could
actually be scheduled after the register copies is the computation of
the call address, and the scheduler usually prefers to schedule that
before the copies anyway.
This should fix several instances of "Bad machine code: Using an
undefined physical register", when compiling ports such as
multimedia/vlc, audio/alsa-lib and devel/avro-c for armv6, with
-fstack-protector-strong.
Enji Cooper [Thu, 9 May 2019 17:02:47 +0000 (17:02 +0000)]
MFC r346578:
Build libclang_rt/profile on all clang-supported architectures
There's no reason why a special case needs to be added specifically for amd64,
arm, and i386, as the code is written in machine architecture agnostic C/C++.
This will make it possible for all supporting clang architectures to produce
runtime coverage with `--coverage`.
r346602:
tun(4): Defer clearing TUN_OPEN until much later
tun destruction will not continue until TUN_OPEN is cleared. There are brief
moments in tunclose where the mutex is dropped and we've already cleared
TUN_OPEN, so tun_destroy would be able to proceed while we're in the middle
of cleaning up the tun still. tun_destroy should be blocked until these
parts (address/route purges, mostly) are complete.
r346670:
tun/tap: close race between destroy/ioctl handler
It seems that there should be a better way to handle this, but this seems to
be the more common approach and it should likely get replaced in all of the
places it happens... Basically, thread 1 is in the process of destroying the
tun/tap while thread 2 is executing one of the ioctls that requires the
tun/tap mutex and the mutex is destroyed before the ioctl handler can
acquire it.
This is only one of the races described/found in PR 233955.
r346671:
tun(4): Don't allow open of open or dying devices
Previously, a pid check was used to prevent open of the tun(4); this works,
but may not make the most sense as we don't prevent the owner process from
opening the tun device multiple times.
The potential race described near tun_pid should not be an issue: if a
tun(4) is to be handed off, its fd has to have been sent via control message
or some other mechanism that duplicates the fd to the receiving process so
that it may set the pid. Otherwise, the pid gets cleared when the original
process closes it and you have no effective handoff mechanism.
Close up another potential issue with handing a tun(4) off by not clobbering
state if the closer isn't the controller anymore. If we want some state to
be cleared, we should do that a little more surgically.
Additionally, nothing prevents a dying tun(4) from being "reopened" in the
middle of tun_destroy as soon as the mutex is unlocked, quickly leading to a
bad time. Return EBUSY if we're marked for destruction, as well, and the
consumer will need to deal with it. The associated character device will be
destroyed in short order.
r347183:
geom: fix initialization order
There's a race between the initialization of devsoftc.mtx (by devinit)
and the creation of the geom worker thread g_run_events, which calls
devctl_queue_data_f. Both of those are initialized at SI_SUB_DRIVERS
and SI_ORDER_FIRST, which means the geom worker thread can be created
before the mutex has been initialized, leading to the panic below:
panic: mtx_lock() of spin mutex (null) @ /usr/home/osstest/build.135317.build-amd64-freebsd/freebsd/sys/kern/subr_bus.c:620
cpuid = 3
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe003b968710
vpanic() at vpanic+0x19d/frame 0xfffffe003b968760
panic() at panic+0x43/frame 0xfffffe003b9687c0
__mtx_lock_flags() at __mtx_lock_flags+0x145/frame 0xfffffe003b968810
devctl_queue_data_f() at devctl_queue_data_f+0x6a/frame 0xfffffe003b968840
g_dev_taste() at g_dev_taste+0x463/frame 0xfffffe003b968a00
g_load_class() at g_load_class+0x1bc/frame 0xfffffe003b968a30
g_run_events() at g_run_events+0x197/frame 0xfffffe003b968a70
fork_exit() at fork_exit+0x84/frame 0xfffffe003b968ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe003b968ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 13 tid 100029 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why
Fix this by initializing geom at SI_ORDER_SECOND instead of
SI_ORDER_FIRST.
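Why the change of order is enough can be seen in a toy model of the startup
sequence. This is a sketch, not kernel code: the numeric values are
placeholders rather than the real constants, and run_order() is a hypothetical
helper that sorts entries by subsystem and then by order.

```python
# Placeholder values; only their relative order matters in this model.
SI_SUB_DRIVERS = 1
SI_ORDER_FIRST = 0
SI_ORDER_SECOND = 1


def run_order(sysinits):
    """Return entry names sorted by (subsystem, order)."""
    ordered = sorted(sysinits, key=lambda entry: (entry[1], entry[2]))
    return [name for name, _sub, _order in ordered]
```

When both entries share the same (subsystem, order) pair the key is a tie, so
nothing in the key decides which runs first. Giving geom SI_ORDER_SECOND makes
the key itself place it after devinit, regardless of registration order.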
Alexander Motin [Wed, 8 May 2019 15:23:45 +0000 (15:23 +0000)]
MFC r346898: ip multicast debug: fix strings vs defines
Turning on multicast debug made multicast failure worse
because the strings and #define values no longer matched
up. Fix them, and make sure they stay matched-up.
Alexander Motin [Wed, 8 May 2019 15:17:04 +0000 (15:17 +0000)]
MFC r346491: Polish SCSI sense data validity checks.
According to the specs and common sense, all sense data reported in descriptor
format should be valid. Practice shows otherwise: some devices return
descriptors with invalid data, making the error messages look worse.
Decouple block/stream commands sense data and information field printing.
Judging by the present specs, there are many more cases when those fields are
not related, and the incomplete old code was not printing valid sense data and
was leaving empty lines for invalid data.
Justin Hibbits [Mon, 6 May 2019 03:35:44 +0000 (03:35 +0000)]
MFC r344613:
powerpc/mpc85xx: Synchronize timebase the platform correct way
Summary:
To safely synchronize timebase we need to disable the timebase on all
cores, set timebase, and resynchronize. This adds two new devices, mutually
exclusive, which attach on the SoC simplebus, to freeze and unfreeze the
timebase. The devices are singletons, and platform-specific, so no reason
to make them optional and in separate files.
This was found to be necessary for top(1) to work correctly on an AmigaOne
X5000 (P5020 SoC). It also fixes bufdaemon and bufspacedaemon hangs at
shutdown.
Justin Hibbits [Mon, 6 May 2019 03:15:07 +0000 (03:15 +0000)]
MFC r339559,344083,344202,344203,344204
Bulk merge of Book-E pmap changes
r339559: powerpc/booke: Turn tlb*_print_tlbentries() into 'show tlb*' DDB
commands
r344083: powerpc/booke: Use the 'tlbilx' instruction on newer cores
r344202,344204: powerpc/booke: Use DMAP where possible for page copy and zeroing
r344203: powerpc/booke: depessimize MAS register updates
Rick Macklem [Mon, 6 May 2019 03:06:22 +0000 (03:06 +0000)]
MFC: r346856
Add #ifdef INET6 around declaration of nbuf.
It was reported that without #ifdef INET6 around the declaration of "nbuf",
a build would report an unused variable. For some reason, I didn't see that
warning when I did a build, but it seems reasonable to add these #ifdef INET6's.
r347027:
libbe(3): Properly mount BEs with mountpoint=none
Instead of pretending to successfully mount them while not actually
mounting anything, we'll now actually mount them *and* claim we mounted them
successfully.
Reported by: ler
r347028:
libbe: set mountpoint=none in be_import
If we're going to set a mountpoint at all, mountpoint=none makes more sense
than mountpoint=/.
Kyle Evans [Mon, 6 May 2019 02:08:52 +0000 (02:08 +0000)]
MFC r347021: fdt: Fix installation of aarch64 dtb
r345519 rewrote parts of how we build .dtb, but mistakenly dropped the
vendor dir for aarch64. Simply drop the :T for building ${DTB} in the
aarch64 case- it'll get applied at install-time as-needed, with :H:T for
determining the vendor dir.
Michael Tuexen [Sat, 4 May 2019 13:58:45 +0000 (13:58 +0000)]
MFC r346854:
Some test scripts use ncat --sctp --listen port to run an SCTP discard
server in the background. However, when running in the background,
stdin is closed and ncat initiates a graceful shutdown of the SCTP
association. This is not expected by the client. Therefore, the
ncat-based discard server is replaced by a perl-based one.
In addition, to remove the dependency on ncat, which needs to be
installed via the nmap port, the code testing for a free SCTP port
is also changed to use the perl-based client.
Finally, remove some debug output from the report generated.
Michael Tuexen [Sat, 4 May 2019 13:55:51 +0000 (13:55 +0000)]
MFC r346400:
Improve input validation for the socket option IPV6_CHECKSUM.
When using the IPPROTO_IPV6 level socket option IPV6_CHECKSUM on a raw
IPv6 socket, ensure that the value is either -1 or a non-negative even
number.
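The rule stated above is small enough to write down directly. The function
name is hypothetical and this is a model of the rule, not the kernel's option
handler.

```python
def ipv6_checksum_offset_is_valid(value):
    """Accept -1 (checksumming disabled) or a non-negative even offset."""
    return value == -1 or (value >= 0 and value % 2 == 0)
```

So -1, 0 and 2 are accepted, while 3 and -2 are rejected.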
MFC r346401:
Avoid a buffer overwrite in rip6_output() when computing the checksum
as requested by the user via the IPPROTO_IPV6 level socket option
IPV6_CHECKSUM. The check if there are enough bytes in the packet to
store the checksum at the requested offset was wrong by 1.
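A bounds check of this kind has to account for the full width of the
checksum. The sketch below shows the general shape of such a check. It is not
the rip6_output() code; checksum_fits() is a hypothetical helper.

```python
CHECKSUM_SIZE = 2  # an Internet checksum is 16 bits wide


def checksum_fits(offset, payload_len):
    """True if a 2-byte checksum at `offset` lies entirely inside the payload."""
    return offset >= 0 and offset + CHECKSUM_SIZE <= payload_len
```

A check that only requires one byte after the offset would accept offset 2
in a 3-byte payload and let the second checksum byte land past the end, which
is the kind of off-by-one the fix closes.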
MFC r346402:
When a checksum has to be computed for a received IPv6 packet because it
is requested by the application using the IPPROTO_IPV6 level socket option
IPV6_CHECKSUM on a raw socket, ensure that the packet contains enough
bytes to contain the checksum at the specified offset.
MFC r346406:
When an IPv6 packet is received for a raw socket which has the
IPPROTO_IPV6 level socket option IPV6_CHECKSUM enabled and the
checksum check fails, drop the message. Without this fix, an
ICMP6 message was sent indicating a parameter problem.
Thanks to bz@ for suggesting a way to simplify this fix.
MFC: r346715: Acpi MADT table correction for VM_MAXCPU > 21
The bhyve ACPI MADT table was given a static space of 256 (0x100) bytes,
which is enough to allow VM_MAXCPU to be 21. This patch calculates the
space actually needed for the table, so VM_MAXCPU can be an arbitrary
value without overflowing it.
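The limit of 21 follows from simple arithmetic on the MADT entry sizes. The
sizes below are the structure lengths from the ACPI specification. The set of
fixed entries (one I/O APIC, two interrupt source overrides, one Local APIC
NMI) is an assumption about what the table holds, chosen because it reproduces
the limit quoted above, and the helper names are hypothetical.

```python
MADT_HEADER = 44             # 36-byte table header + LAPIC address + flags
LOCAL_APIC_ENTRY = 8         # one per vCPU
IO_APIC_ENTRY = 12
INT_SRC_OVERRIDE_ENTRY = 10
LOCAL_APIC_NMI_ENTRY = 6


def madt_size(ncpu, n_ioapic=1, n_overrides=2, n_nmi=1):
    """Bytes needed for a MADT with the given number of entries."""
    return (MADT_HEADER
            + ncpu * LOCAL_APIC_ENTRY
            + n_ioapic * IO_APIC_ENTRY
            + n_overrides * INT_SRC_OVERRIDE_ENTRY
            + n_nmi * LOCAL_APIC_NMI_ENTRY)


def max_cpus(space):
    """Largest vCPU count whose MADT still fits in `space` bytes."""
    return (space - madt_size(0)) // LOCAL_APIC_ENTRY
```

Under these assumptions the fixed part takes 82 bytes, 21 vCPUs need 250
bytes, and a 22nd vCPU pushes the table to 258 bytes, past the 256-byte area.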
Michael Tuexen [Sat, 4 May 2019 13:05:21 +0000 (13:05 +0000)]
MFC r346197:
When sending a routing message, don't allow the user to set the
RTF_RNH_LOCKED flag in rtm_flags, since this flag is used only
internally.
Michael Tuexen [Sat, 4 May 2019 13:02:46 +0000 (13:02 +0000)]
MFC r346182:
When sending IPv4 packets on a SOCK_RAW socket using the IP_HDRINCL option,
ensure that the ip_hl field is valid. Furthermore, ensure that the complete
IPv4 header is contained in the first mbuf. Finally, move the length checks
before relying on them when accessing fields of the IPv4 header.
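The order of the checks matters: lengths are verified before any header field
is read. The following is a user-space sketch of that ordering, not the
kernel's output routine; hdrincl_packet_ok() and the first_mbuf_len parameter
are hypothetical stand-ins for the packet and the size of its first buffer.

```python
IP_MIN_HEADER = 20  # an IPv4 header without options is 20 bytes


def hdrincl_packet_ok(packet, first_mbuf_len):
    """Validate a caller-supplied IPv4 header before trusting its fields."""
    # Length checks first: do not read ip_hl from a packet that is too short.
    if len(packet) < IP_MIN_HEADER or first_mbuf_len < IP_MIN_HEADER:
        return False
    header_len = (packet[0] & 0x0F) * 4  # ip_hl counts 32-bit words
    if header_len < IP_MIN_HEADER or header_len > len(packet):
        return False
    # The complete header must sit in the first buffer.
    return header_len <= first_mbuf_len
```

A packet starting with 0x45 followed by 19 more bytes passes, while one
claiming ip_hl of 4, or one whose claimed header is longer than the packet or
the first buffer, is rejected.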
Michael Tuexen [Sat, 4 May 2019 12:07:00 +0000 (12:07 +0000)]
MFC r345461:
Limit the size of messages sent on 1-to-many style SCTP sockets with the
SCTP_SENDALL flag. Also allow only one operation per SCTP endpoint.
This fixes an issue found by running syzkaller and is joint work with rrs@.
Michael Tuexen [Sat, 4 May 2019 11:15:01 +0000 (11:15 +0000)]
MFC r344872:
After removing an entry from the stream scheduler list, set the pointers
to NULL, since we are checking for it in case the element gets inserted
again.
Michael Tuexen [Sat, 4 May 2019 11:13:03 +0000 (11:13 +0000)]
MFC r344742:
Allocate an association id and register the stcb while holding the lock.
This avoids a race where stcbs that are not completely initialized can be
found.