FreeBSD/FreeBSD.git
4 years agoThis commit adds BBR (Bottleneck Bandwidth and RTT) congestion control. This
rrs [Tue, 24 Sep 2019 18:18:11 +0000 (18:18 +0000)]
This commit adds BBR (Bottleneck Bandwidth and RTT) congestion control. This
is a completely separate TCP stack (tcp_bbr.ko) that will be built only if
you add the make options WITH_EXTRA_TCP_STACKS=1 and also include the option
TCPHPTS. You can also include the RATELIMIT option if you have a NIC that
supports hardware pacing; BBR understands how to use such a feature.

Note that this commit also adds in a general purpose time-filter which
allows you to have a min-filter or max-filter. A filter allows you to
have a low (or high) value for some period of time and degrade slowly
to another value as time passes. You can find out the details of
BBR by looking at the original paper at:

https://queue.acm.org/detail.cfm?id=3022184

or consult the many other web resources you can find by searching
for "BBR congestion control". It should be noted that
BBRv1 (which this is) does tend to unfairness in cases of small
buffered paths, and it will usually get less bandwidth in the case
of large BDP paths (when competing with new-reno or cubic flows). BBR
is still an active research area and we do plan on implementing V2
of BBR to see if it is an improvement over V1.

Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D21582

4 years agoix, ixv: Read msix_bar from device configuration
erj [Tue, 24 Sep 2019 17:06:32 +0000 (17:06 +0000)]
ix, ixv: Read msix_bar from device configuration

Instead of predicting the MSI-X bar index based on the device's MAC
type, read it from the device's PCI configuration.

PR: 239704
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by: erj@
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21547

4 years agoiflib: Remove redundant VLAN events deregistration
erj [Tue, 24 Sep 2019 17:03:31 +0000 (17:03 +0000)]
iflib: Remove redundant VLAN events deregistration

From Piotr:
r351152 introduced the iflib_deregister() function, which calls
EVENTHANDLER_DEREGISTER() to unregister VLAN events. This patch removes
the duplicate EVENTHANDLER_DEREGISTER() calls placed in
iflib_device_deregister(), as this function now calls
iflib_deregister(). This avoids deregistering the same event twice.

This patch also adds a check in iflib_vlan_register() to prevent
registering a VLAN during detach.

Patch co-authored by Krzysztof Galazka <krzysztof.galazka@intel.com>,
erj <erj@FreeBSD.org> and Jacob Keller <jacob.e.keller@intel.com>.

Signed-off-by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by: gallatin@, erj@
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21711

4 years agoFix a minor typo
olivier [Tue, 24 Sep 2019 16:49:42 +0000 (16:49 +0000)]
Fix a minor typo

Approved by: lwhsu
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19970

4 years agoFix coredump_phnum_test in case of kern.compress_user_cores=1
olivier [Tue, 24 Sep 2019 16:45:34 +0000 (16:45 +0000)]
Fix coredump_phnum_test in case of kern.compress_user_cores=1

PR: 240783
Approved by: ngie, lwhsu
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21776

4 years agoPlumb a memory leak.
tuexen [Tue, 24 Sep 2019 13:15:24 +0000 (13:15 +0000)]
Plumb a memory leak.
Thanks to Felix Weinrank for finding this issue using fuzz testing
and reporting it for the userland stack:
https://github.com/sctplab/usrsctp/issues/378

MFC after: 3 days

4 years agolib/libc/regex: fix build with REDEBUG defined
yuripv [Tue, 24 Sep 2019 12:21:01 +0000 (12:21 +0000)]
lib/libc/regex: fix build with REDEBUG defined

Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D21760

4 years agoReplace all mtx_lock()/mtx_unlock() on n_mtx with the macros.
rmacklem [Tue, 24 Sep 2019 01:58:54 +0000 (01:58 +0000)]
Replace all mtx_lock()/mtx_unlock() on n_mtx with the macros.

For a long time, some places in the NFS code have locked/unlocked the
NFS node lock with the macros NFSLOCKNODE()/NFSUNLOCKNODE() whereas
others have simply used mtx_lock()/mtx_unlock().
Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called, replace all occurrences of mtx_lock/mtx_unlock
with the macros to simplify making the change to an sx lock in a future commit.
There is no semantic change as a result of this commit.

I am not sure if the change to an sx lock will be MFC'd soon, so I put
an MFC of 1 week on this commit so that it could be MFC'd with that commit.

Suggested by: kib
MFC after: 1 week

4 years agoClean LINT* kernel configurations for arm*
lwhsu [Tue, 24 Sep 2019 01:56:27 +0000 (01:56 +0000)]
Clean LINT* kernel configurations for arm*

MFC after: 3 days
Sponsored by: The FreeBSD Foundation

4 years agoping6: Use caph_rights_limit(3) for STDIN_FILENO
markj [Mon, 23 Sep 2019 22:20:11 +0000 (22:20 +0000)]
ping6: Use caph_rights_limit(3) for STDIN_FILENO

Update some error messages while here.

Reported by: olivier
MFC after: 3 days

4 years agocache: tidy up handling of negative entries
mjg [Mon, 23 Sep 2019 20:50:04 +0000 (20:50 +0000)]
cache: tidy up handling of negative entries

- track the total count of hot entries
- pre-read the lock when shrinking since it is typically already taken
- place the lock in its own cacheline
- shorten the hold time of hot lock list when zapping

Sponsored by: The FreeBSD Foundation

4 years agoMake nvme(4) driver some more NUMA aware.
mav [Mon, 23 Sep 2019 17:53:47 +0000 (17:53 +0000)]
Make nvme(4) driver some more NUMA aware.

 - For each queue pair precalculate CPU and domain it is bound to.
If queue pairs are not per-CPU, then use the domain of the device.
 - Allocate most of queue pair memory from the domain it is bound to.
 - Bind callouts to the same CPUs as queue pair to avoid migrations.
 - Do not assign queue pairs to each SMT thread.  It just wasted
resources and increased lock contention.
 - Remove fixed multiplier of CPUs per queue pair, spread them evenly.
This allows using more queue pairs in some hardware configurations.
 - If queue pair serves multiple CPUs, bind different NVMe devices to
different CPUs.

MFC after: 1 month
Sponsored by: iXsystems, Inc.

4 years agoImplement x86 dtrace_invop_(un)init() in C.
markj [Mon, 23 Sep 2019 15:08:17 +0000 (15:08 +0000)]
Implement x86 dtrace_invop_(un)init() in C.

There is no reason for these routines to be written in assembly.  In
the ports of DTrace to other platforms, they are already written in C.
No functional change intended.

MFC after: 1 week
Sponsored by: Netflix

4 years agoFix a harmless typo.
markj [Mon, 23 Sep 2019 14:34:23 +0000 (14:34 +0000)]
Fix a harmless typo.

MFC after: 1 week

4 years agoRevert r316820.
markj [Mon, 23 Sep 2019 14:29:05 +0000 (14:29 +0000)]
Revert r316820.

Despite appearing correct, r316820 breaks packet rx/tx for jme(4)
interfaces.  With 12.1 approaching, let's just revert the commit for now.

PR: 233952
Tested by: Armin Gruner <ag-freebsd@muc.de>
MFC after: 3 days

4 years agoSet NX on some non-leaf direct map page table entries.
markj [Mon, 23 Sep 2019 14:19:41 +0000 (14:19 +0000)]
Set NX on some non-leaf direct map page table entries.

The direct map is never used for execution of code, so we might as well
set NX in the direct map's PML4Es.  Also clarify the intent of the code
in create_pagetables() that restricts access protections on the region
of the direct map mapping the kernel text.

Reviewed by: alc, kib (previous version)
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21759
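
For readers unfamiliar with the NX bit referred to in the commit above: on amd64 it is bit 63 of a page table entry, at every level of the hierarchy. The sketch below only shows the bit manipulation; the macro and function names are hypothetical and are not the identifiers used in the pmap code.

```c
#include <stdint.h>

/* Hypothetical name; bit 63 marks an amd64 paging entry as no-execute. */
#define SKETCH_PG_NX (1ULL << 63)

/* Return the entry with execution disabled for everything it maps. */
static uint64_t
pte_set_nx(uint64_t pte)
{
    return (pte | SKETCH_PG_NX);
}

/* Nonzero if code could be fetched through this entry. */
static int
pte_is_executable(uint64_t pte)
{
    return ((pte & SKETCH_PG_NX) == 0);
}
```

Setting the bit in a non-leaf entry such as a PML4E disables execution for the whole range below it, which is why a single entry can cover a large part of the direct map.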

4 years agoUse elf_relocaddr() when handling R_X86_64_RELATIVE relocations.
markj [Mon, 23 Sep 2019 14:14:43 +0000 (14:14 +0000)]
Use elf_relocaddr() when handling R_X86_64_RELATIVE relocations.

This is required for DPCPU and VNET data variable definitions to work when
KLDs are linked as DSOs.  R_X86_64_RELATIVE relocations should not appear
in object files, so assert this in elf_relocaddr().

Reviewed by: kib
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21755

4 years agoSet NX in mappings created by pmap_kenter() and pmap_kenter_attr().
markj [Mon, 23 Sep 2019 14:11:59 +0000 (14:11 +0000)]
Set NX in mappings created by pmap_kenter() and pmap_kenter_attr().

There does not appear to be any existing need for such mappings to be
executable.

Reviewed by: alc, kib
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21754

4 years agoFix destruction of the robust mutexes.
kib [Mon, 23 Sep 2019 13:24:31 +0000 (13:24 +0000)]
Fix destruction of the robust mutexes.

If a robust mutex's owner terminated, causing kernel-assisted state
recovery, and pthread_mutex_destroy() is then executed as the next
action, an assert is triggered about the mutex still being on the list.
Ignore the mutex linkage in pthread_mutex_destroy() for shared robust
mutexes with a dead owner, same as for enqueue_mutex().

Reported by: avg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

4 years agomips: fix XLPN32 after r352434
kevans [Mon, 23 Sep 2019 12:43:08 +0000 (12:43 +0000)]
mips: fix XLPN32 after r352434

SYSINIT usage was added, but the <sys/kernel.h> dependency was not added.
This worked by coincidence, as most of the mips configs have DDB enabled and
pmap.c gets <sys/kernel.h> via ddb.h pollution.

Reported by: dim

4 years agoCreate a "drm" subdirectory for drm devices in linsysfs. Recent versions of
tijl [Mon, 23 Sep 2019 12:27:55 +0000 (12:27 +0000)]
Create a "drm" subdirectory for drm devices in linsysfs.  Recent versions of
linux libdrm check for the existence of this directory:

https://cgit.freedesktop.org/mesa/drm/commit/?id=f8392583418aef5e27bfed9989aeb601e20cc96d

MFC after: 2 weeks

4 years agocache: count evictions of negative entries
mjg [Mon, 23 Sep 2019 08:53:14 +0000 (08:53 +0000)]
cache: count evictions of negative entries

Sponsored by: The FreeBSD Foundation

4 years agoAdd two options to allow mount to avoid covering up existing mount points.
sef [Mon, 23 Sep 2019 04:28:07 +0000 (04:28 +0000)]
Add two options to allow mount to avoid covering up existing mount points.
The two options are

* nocover/cover:  Prevent/allow mounting over an existing root mountpoint.
E.g., "mount -t ufs -o nocover /dev/sd1a /usr/local" will fail if /usr/local
is already a mountpoint.
* emptydir/noemptydir:  Prevent/allow mounting on a non-empty directory.
E.g., "mount -t ufs -o emptydir /dev/sd1a /usr" will fail.

Neither of these options is intended to be a default, for historical and
compatibility reasons.

Reviewed by: allanjude, kib
Differential Revision: https://reviews.freebsd.org/D21458
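
The emptydir option described in the commit above is enforced inside the kernel, and that code is not shown here. As a rough userland analogue only, the following sketch shows what "the directory is empty" means in practice: nothing besides "." and "..". The function name is hypothetical.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/* Return 1 if `path` is an empty directory, 0 if not, -1 on error. */
static int
dir_is_empty(const char *path)
{
    DIR *d;
    struct dirent *e;
    int empty = 1;

    assert(path != NULL);
    d = opendir(path);
    if (d == NULL)
        return (-1);
    while ((e = readdir(d)) != NULL) {
        /* "." and ".." are always present and do not count. */
        if (strcmp(e->d_name, ".") != 0 && strcmp(e->d_name, "..") != 0) {
            empty = 0;
            break;
        }
    }
    closedir(d);
    return (empty);
}
```

A script could run a check like this before calling mount(8) on systems that lack the option, although doing so is racy in a way the in-kernel check presumably is not.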

4 years agocache: try to avoid vhold if locks held
mjg [Sun, 22 Sep 2019 20:50:24 +0000 (20:50 +0000)]
cache: try to avoid vhold if locks held

Sponsored by: The FreeBSD Foundation

4 years agocache: jump in negative success instead of positive
mjg [Sun, 22 Sep 2019 20:49:17 +0000 (20:49 +0000)]
cache: jump in negative success instead of positive

Sponsored by: The FreeBSD Foundation

4 years agolockprof: move per-cpu data to dpcpu
mjg [Sun, 22 Sep 2019 20:44:24 +0000 (20:44 +0000)]
lockprof: move per-cpu data to dpcpu

Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21747

4 years agoi386: reduce differences in source between PAE and non-PAE pmaps ...
kib [Sun, 22 Sep 2019 19:59:10 +0000 (19:59 +0000)]
i386: reduce differences in source between PAE and non-PAE pmaps ...

by defining pg_nx as zero for non-PAE and correspondingly simplifying
some expressions.

Suggested and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21757

4 years agoi386: implement sysctl vm.pmap.kernel_maps.
kib [Sun, 22 Sep 2019 19:23:00 +0000 (19:23 +0000)]
i386: implement sysctl vm.pmap.kernel_maps.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21739

4 years agoamd64: minor tweaks to pat decoding in sysctl vm.pmap.kernel_maps.
kib [Sun, 22 Sep 2019 19:20:37 +0000 (19:20 +0000)]
amd64: minor tweaks to pat decoding in sysctl vm.pmap.kernel_maps.

Decode PAT_UNCACHED.
When unknown pat mode is encountered, print the pte bits combination
instead of the index, which is always 8.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21738

4 years agoocteon-sdk: suppress another set of warnings under clang
kevans [Sun, 22 Sep 2019 18:32:05 +0000 (18:32 +0000)]
octeon-sdk: suppress another set of warnings under clang

Clang sees this construct and warns that adding an int to a string like this
does not concatenate the two. Fortunately, this is not what octeon-sdk
actually intended to do, so we take the path towards remediation that clang
offers: use array indexing instead.
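
The construct clang complains about can be shown with a small sketch. The string and function name below are made up for illustration and are not taken from octeon-sdk. The warning involved is clang's -Wstring-plus-int.

```c
#include <assert.h>
#include <string.h>

/* Return the part of a fixed label that follows its first `n` characters. */
static const char *
skip_prefix(int n)
{
    assert(n >= 0 && n <= (int)strlen("prefix_name"));
    /*
     * Written as "prefix_name" + n, clang warns that adding an int to a
     * string does not append to it. Indexing and taking the address
     * yields the same pointer without the warning.
     */
    return (&"prefix_name"[n]);
}
```

Both spellings produce the same pointer; the indexed form just makes the intent explicit.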

4 years agoocteon1: suppress a couple of warnings under clang
kevans [Sun, 22 Sep 2019 18:30:19 +0000 (18:30 +0000)]
octeon1: suppress a couple of warnings under clang

These appear in octeon-sdk -- there are new releases, but they don't seem to
address the running issues in octeon-sdk. GCC4.2 is more than happy, but
clang is much less-so and most of them are fairly innocuous and perhaps a
by-product of their style guide, which may make some of the changes harder
to upstream (if this is even possible anymore).

4 years agoHonor CWARNFLAGS.clang/gcc in the kernel build
kevans [Sun, 22 Sep 2019 18:27:57 +0000 (18:27 +0000)]
Honor CWARNFLAGS.clang/gcc in the kernel build

Some kernel builds or users may want to disable warnings on a per-compiler
basis, so do this now.

4 years agoloader_lua: lua color changes should end with reset
tsoome [Sun, 22 Sep 2019 17:39:20 +0000 (17:39 +0000)]
loader_lua: lua color changes should end with reset

The color change should end with the reset sequence, not a switch to white.

4 years agoloader_4th: menu items need to reset color attribute, not switch to white
tsoome [Sun, 22 Sep 2019 16:10:25 +0000 (16:10 +0000)]
loader_4th: menu items need to reset color attribute, not switch to white

Forth menu kernel and BE entries, instead of resetting the color attribute,
are switching to white color.

4 years agoAdd support for ps -H on corefiles in libkvm
karels [Sun, 22 Sep 2019 13:56:27 +0000 (13:56 +0000)]
Add support for ps -H on corefiles in libkvm

Add support for kernel threads in kvm_getprocs() and the underlying
kvm_proclist() in libkvm when fetching from a kernel core file. This
has been missing for several releases, ever since kernel threads became
normal threads.  The loop over the processes now contains a sub-loop for
threads, which iterates beyond the first thread only when threads are
requested.  Also set some fields such as tid that were previously
uninitialized.

Reviewed by: vangyzen, jhb (earlier revision)
MFC after: 4 days
Sponsored by: Forcepoint LLC
Differential Revision: https://reviews.freebsd.org/D21461

4 years agoDon't hold the info lock when calling sctp_select_a_tag().
tuexen [Sun, 22 Sep 2019 11:11:01 +0000 (11:11 +0000)]
Don't hold the info lock when calling sctp_select_a_tag().

This avoids a double lock bug in the NAT colliding state processing
of SCTP. Thanks to Felix Weinrank for finding and reporting this issue in
https://github.com/sctplab/usrsctp/issues/374
He found this bug using fuzz testing.

MFC after: 3 days

4 years agoCleanup the RTO calculation and perform some consistency checks
tuexen [Sun, 22 Sep 2019 10:40:15 +0000 (10:40 +0000)]
Cleanup the RTO calculation and perform some consistency checks
before computing the RTO.
This should fix an overflow issue reported by Felix Weinrank in
https://github.com/sctplab/usrsctp/issues/375
for the userland stack and found by running a fuzz tester.

MFC after: 3 days

4 years agoMFZoL: Retire send space estimation via ZFS_IOC_SEND
avg [Sun, 22 Sep 2019 08:44:41 +0000 (08:44 +0000)]
MFZoL: Retire send space estimation via ZFS_IOC_SEND

Add a small wrapper around libzfs_core's lzc_send_space() to libzfs so
that every legacy ZFS_IOC_SEND consumer, along with their userland
counterpart estimate_ioctl(), can leverage ZFS_IOC_SEND_SPACE to
request send space estimation.

The legacy functionality in zfs_ioc_send() is left untouched for
compatibility purposes.

Obtained from: ZoL
Obtained from: zfsonlinux/zfs@cf7684bc8d57
Author: loli10K <ezomori.nozomu@gmail.com>
MFC after: 2 weeks

4 years agoprint summary line for space estimate of zfs send from bookmark
avg [Sun, 22 Sep 2019 08:34:23 +0000 (08:34 +0000)]
print summary line for space estimate of zfs send from bookmark

Although there is always a single stream and the total size in the
summary is always equal to the size reported for the stream, it's nice
to follow the usual output format.

MFC after: 3 days

4 years agokern.elf{32,64}.pie_base sysctl: enforce page alignment.
kib [Sat, 21 Sep 2019 20:03:17 +0000 (20:03 +0000)]
kern.elf{32,64}.pie_base sysctl: enforce page alignment.

Requested by: rstone
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
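
The pie_base commit above does not show its check, so the following is only a sketch of what enforcing page alignment usually amounts to: rejecting any value whose low bits, below the page size, are nonzero. The function name and the 4096-byte page size used in the example are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if `addr` is a multiple of `page_size`. */
static int
is_page_aligned(uint64_t addr, uint64_t page_size)
{
    /* The mask trick requires page_size to be a nonzero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return ((addr & (page_size - 1)) == 0);
}
```

With 4096-byte pages, 0x1000000 passes the check and 0x1000800 does not.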

4 years agoIn case a translation fault on the kernel address space occurs from
alc [Sat, 21 Sep 2019 19:51:57 +0000 (19:51 +0000)]
In case a translation fault on the kernel address space occurs from
within a critical section, we must perform a lock-free check on the
faulting address.

Reported by: andrew
Reviewed by: andrew, markj
X-MFC with: r350579
Differential Revision: https://reviews.freebsd.org/D21685

4 years agolockprof: use CPUFOREACH and drop always false lp_cpu NULL checks
mjg [Sat, 21 Sep 2019 19:05:38 +0000 (19:05 +0000)]
lockprof: use CPUFOREACH and drop always false lp_cpu NULL checks

Sponsored by: The FreeBSD Foundation

4 years agoMake non-ASLR pie base tunable.
kib [Sat, 21 Sep 2019 18:00:23 +0000 (18:00 +0000)]
Make non-ASLR pie base tunable.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week

4 years agoamd64 pmap: Fix formats for 64bit addresses in ddb and sysctl output.
kib [Sat, 21 Sep 2019 17:59:15 +0000 (17:59 +0000)]
amd64 pmap: Fix formats for 64bit addresses in ddb and sysctl output.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21737

4 years agoFix a regression introduced in r344601, and work properly with the
sef [Sat, 21 Sep 2019 17:54:42 +0000 (17:54 +0000)]
Fix a regression introduced in r344601, and work properly with the
-v and -n options.

PR: 240640
Reported by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: avg
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D21709

4 years agoAllocate callout wheel from the respective memory domain.
mav [Sat, 21 Sep 2019 15:38:08 +0000 (15:38 +0000)]
Allocate callout wheel from the respective memory domain.

MFC after: 1 week

4 years agojot.1: Explain default argument values more precisely
0mp [Sat, 21 Sep 2019 15:01:11 +0000 (15:01 +0000)]
jot.1: Explain default argument values more precisely

The way jot(1) defaults missing arguments doesn't match the behaviour
described in the manpage, which states that with fewer than 3 arguments
missing values are supplied from left to right.

In fact, with one or two arguments, the last (s which is step size or seed)
defaults to 1 (or -1 if begin and end specify a descending range), and then
omitted arguments are set to default starting with the leftmost until three
arguments are available.

This is why `jot 2 1000` prints 1000 and 1001 instead of 1000 and 100.

PR: 135475
Submitted by: Jonathan McKeown <j.mckeown@ru.ac.za>
Approved by: doc (bcr)
Differential Revision: https://reviews.freebsd.org/D21736
Event: EuroBSDcon 2019
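
The `jot 2 1000` example in the commit above can be worked through numerically. With reps = 2 and begin = 1000 given, the step defaults to 1, and the end value follows from the other three. The helper below is a hypothetical illustration of that arithmetic, not code from jot(1).

```c
#include <assert.h>

/* Hypothetical helper: the end value implied by reps, begin and step. */
static long
implied_end(long reps, long begin, long step)
{
    assert(reps >= 1);
    /* The first value is begin; each further value adds one step. */
    return (begin + (reps - 1) * step);
}
```

For reps = 2, begin = 1000 and step = 1 this gives 1001, matching the output described in the commit message (1000 and 1001).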

4 years agoascii(7): Add STANDARDS section and update HISTORY section
0mp [Sat, 21 Sep 2019 14:16:37 +0000 (14:16 +0000)]
ascii(7): Add STANDARDS section and update HISTORY section

PR: 240727
Submitted by: Gordon Bergling <gbergling@gmail.com>
Approved by: src (imp)
Event: EuroBSDcon 2019

4 years ago- Revert WARNS to 2 because of mismatch between (xdrproc_t) and xdr_void().
hrs [Sat, 21 Sep 2019 13:34:06 +0000 (13:34 +0000)]
- Revert WARNS to 2 because of mismatch between (xdrproc_t) and xdr_void().
- Add prototype of from_addr().

4 years agoFix warnings and set WARNS=6.
hrs [Sat, 21 Sep 2019 12:33:41 +0000 (12:33 +0000)]
Fix warnings and set WARNS=6.

4 years agoFix build errors of test.c, which had been broken for a long time.
hrs [Sat, 21 Sep 2019 01:29:59 +0000 (01:29 +0000)]
Fix build errors of test.c, which had been broken for a long time.
This is a temporary fix and should be converted to complete
test scenarios by using this tool.

4 years agoImprove wording and move descriptions about
hrs [Sat, 21 Sep 2019 00:44:37 +0000 (00:44 +0000)]
Improve wording and move descriptions about
locale to LC_CTYPE in the ENVIRONMENT section.

4 years agoAdd a workaround for servers which respond RPC_PROGNOTREGISTERED
hrs [Sat, 21 Sep 2019 00:17:40 +0000 (00:17 +0000)]
Add a workaround for servers which respond RPC_PROGNOTREGISTERED
to a clnt_create() call even when it is actually a program
version mismatch.

Normally the server is supposed to return RPC_PROGVERSMISMATCH
when it supports the specified program but does not support
the specified version.  Some filers return RPC_PROGNOTREGISTERED
to RQUOTA v2 calls and FreeBSD does not retry with the old
v1 calls.  This change fixes this failure scenario.

Submitted by: Jian-Bo Liao
PR: 236179
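
The fallback described in the commit above can be sketched abstractly. Everything in this block is hypothetical: the status enum stands in for the RPC client error codes, the probe callback stands in for a clnt_create() attempt, and none of it is the actual quota code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for the RPC client errors. */
enum probe_status {
    PROBE_OK,
    PROBE_VERS_MISMATCH,   /* program known, version not supported */
    PROBE_NOT_REGISTERED,  /* what some filers return instead */
    PROBE_OTHER_ERROR
};

/* Hypothetical callback: try to reach the server with one version. */
typedef enum probe_status (*probe_fn)(int version);

/* Return the RQUOTA version that worked (2 or 1), or -1 if neither did. */
static int
pick_rquota_version(probe_fn probe)
{
    enum probe_status st;

    assert(probe != NULL);
    st = probe(2);
    if (st == PROBE_OK)
        return (2);
    /* Treat "not registered" like a version mismatch and retry with v1. */
    if (st == PROBE_VERS_MISMATCH || st == PROBE_NOT_REGISTERED) {
        if (probe(1) == PROBE_OK)
            return (1);
    }
    return (-1);
}

/* Example server that only speaks v1 and misreports v2 requests. */
static enum probe_status
quirky_filer(int version)
{
    return (version == 1 ? PROBE_OK : PROBE_NOT_REGISTERED);
}
```

Against the quirky_filer() example, pick_rquota_version() falls back to version 1 instead of giving up.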

4 years agomsdosfs: do not deget unlinked denodes
kevans [Fri, 20 Sep 2019 20:47:10 +0000 (20:47 +0000)]
msdosfs: do not deget unlinked denodes

When a file is unlinked, the denode is not reclaimed until the last
reference is dropped, but the directory entry is immediately up for reuse.
This is a problem later when createde goes to grab a denode for the newly
created entry -- we search the hash and find a dead denode, then return that
without even bumping the reference count and the data later gets truncated
when the last reference to the unlinked file is dropped.

This manifested itself as a broken in-place strip(1) on msdosfs. elfcopy
will do a sequence incredibly roughly like this:

open("/mnt/foo", ...) => fd 3
mmap()
unlink("/mnt/foo")
open("/mnt/foo", ...) => fd 4
write(4, ...)
close(4)
close(3)

and the resulting file would be truncated, but the write succeeded, as long
as a reference to the unlinked file had not been closed.

Some archaeology indicates that this bug has likely existed since msdosfs
was converted to use vfs_hash instead of a home rolled hash implementation
in r143570. Prior to that point, the hashget implementation would do a
refcnt check while searching and explicitly only return a denode with
de_refcnt != 0. vfs_hash did not yet have the callback that it does today,
so this slipped away and did not come back when it later grew that
functionality.

The comment indicating that we want to skip these denodes has been updated
to reflect where this is actually done. My repo-diving session seems to
indicate that the refcnt check was likely never actually below the comment,
to be pedantic, but instead a detail wrapped up in the hashget
implementation since the beginning of its inclusion into FreeBSD.

This bug was the cause behind the issue addressed in r352557.

Reported by: jhibbits
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21731
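
The open/mmap/unlink/open/write/close/close sequence from the msdosfs commit above can be replayed with a short program. This is a sketch under assumptions: the function name, open flags and file contents are invented for illustration, and the commit message itself says the real elfcopy sequence is only roughly like this. On a file system without the bug, the function returns the number of bytes written to the recreated file.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Replay the sequence on `path` and return the final size of the
 * recreated file, or -1 if any step fails.
 */
static long
replay_strip_sequence(const char *path, const char *newdata)
{
    static const char old[] = "original contents";
    struct stat sb;
    size_t newlen;
    void *map;
    int fd3, fd4;
    long result = -1;

    assert(path != NULL && newdata != NULL);
    newlen = strlen(newdata);

    /* Setup: create the file that will be mapped and then unlinked. */
    fd3 = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd3 == -1)
        return (-1);
    if (write(fd3, old, sizeof(old)) != (ssize_t)sizeof(old))
        goto out_fd3;
    map = mmap(NULL, sizeof(old), PROT_READ, MAP_SHARED, fd3, 0);
    if (map == MAP_FAILED)
        goto out_fd3;

    /* Remove the name, then create a new file under the same name. */
    if (unlink(path) == -1)
        goto out_map;
    fd4 = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd4 == -1)
        goto out_map;
    if (write(fd4, newdata, newlen) == (ssize_t)newlen)
        result = 0;
    close(fd4);

out_map:
    munmap(map, sizeof(old));
out_fd3:
    /* The last reference to the unlinked file is dropped here. */
    close(fd3);
    if (result == 0 && stat(path, &sb) == 0)
        result = (long)sb.st_size;
    else
        result = -1;
    return (result);
}
```

Running this against a path on an affected msdosfs mount would be expected to report a truncated file, based on the description in the commit message; that outcome has not been verified here.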

4 years agoloader: Respect loader_color=YES for serial consoles
kevans [Fri, 20 Sep 2019 19:43:40 +0000 (19:43 +0000)]
loader: Respect loader_color=YES for serial consoles

It's not uncommon these days for the terminals attached to serial consoles
to support ANSI escape sequences. However, we assume escape sequences may
break some serial consoles and default to not using them when boot_serial or
boot_multicons is set (or if console contains "comconsole" in the forth loader) for
broader compatibility. We also have loader_color which can be explicitly set
to "NO" to disable the use of ANSI escape sequences.

The problem is that loader_color=YES gets ignored when boot_serial=YES or
boot_multicons=YES (or when console contains "comconsole" in the forth
loader).

To fix, the existing default behavior remains unchanged when loader_color is
unset, loader_color=NO explicitly disables the use of ANSI escape sequences
still, and the change is that loader_color=YES can now be used to explicitly
allow ANSI escapes when a serial console is enabled.

Submitted by: Ryan Moeller <ryan@ixsystems.com>
Reviewed by: tsoome (forth), kevans (lua)
MFC after: 1 week
Sponsored by: iXsystems, Inc. (Ryan)
Differential Revision: https://reviews.freebsd.org/D21732
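
The resulting behavior of loader_color described in the commit above can be summarized as a small decision function. The real logic lives in the Lua and Forth loaders; this C sketch only mirrors the rules stated in the commit message. The function name, the case-insensitive comparison and the handling of unrecognized values are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/*
 * loader_color is NULL when the variable is unset. serial is true when
 * boot_serial or boot_multicons is set, or the console is a comconsole.
 */
static bool
use_ansi_color(const char *loader_color, bool serial)
{
    if (loader_color == NULL)
        return (!serial);       /* unchanged default behavior */
    if (strcasecmp(loader_color, "NO") == 0)
        return (false);         /* explicit opt-out */
    if (strcasecmp(loader_color, "YES") == 0)
        return (true);          /* explicit opt-in, even on serial */
    return (!serial);           /* assumption: treat other values as unset */
}
```

The only case that changes with the commit is loader_color=YES together with a serial console, which previously fell through to the serial default.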

4 years agotop(1): support multibyte characters in command names (ARGV array)
daichi [Fri, 20 Sep 2019 17:37:23 +0000 (17:37 +0000)]
top(1): support multibyte characters in command names (ARGV array)
depending on locale.

 - add setlocale()
 - remove printable() function
 - add VIS_OCTAL and VIS_SAFE to the flag of strvisx() to display
   non-printable characters that do not use C-style backslash sequences
   in three digit octal sequence, or remove it

This change allows multibyte characters to be displayed according to
locale. If it is recognized as a non-display character according to the
locale, it is displayed in three digit octal sequence.

Reference:
https://www.mail-archive.com/svn-src-all@freebsd.org/msg165751.html
https://www.mail-archive.com/svn-src-all@freebsd.org/msg165766.html
https://www.mail-archive.com/svn-src-all@freebsd.org/msg165833.html
https://www.mail-archive.com/svn-src-all@freebsd.org/msg165846.html
https://www.mail-archive.com/svn-src-all@freebsd.org/msg165891.html

Submitted by: hrs
Differential Revision: https://reviews.freebsd.org/D16204

4 years agopowerpc/loader: Install ubldr without stripping
jhibbits [Fri, 20 Sep 2019 13:35:28 +0000 (13:35 +0000)]
powerpc/loader: Install ubldr without stripping

Summary:
Install's strip capability, by way of strip(1), doesn't seem to work
correctly on msdosfs, and instead ends up truncating the resulting
binary to 0-length.  As a workaround, don't strip ubldr(8).  This
fixes installworld on Book-E ubldr-based platforms, which prior to this
would need to manually install ubldr separately after installworld, in
order to have a functional ubldr.

The same thing could be done on PowerNV platforms that use msdosfs /boot
volumes, since loader and loader.kboot, etc, all get truncated to 0 on
install.  However, PowerNV does not use loader, instead loading from
petitboot, so it's not really necessary at this time.

Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D21725

4 years agoAdd quirk for XHCI(4) controllers to support USB control transfers
hselasky [Fri, 20 Sep 2019 11:28:45 +0000 (11:28 +0000)]
Add quirk for XHCI(4) controllers to support USB control transfers
above 1Kbyte.  It appears that some XHCI(4) controllers do not
support USB control transfers that are split using a link TRB. The
next NORMAL TRB after the link TRB simply fails with XHCI error
code 4. The quirk ensures we allocate a 64Kbyte buffer so that the
data stage TRB is not broken with a link TRB.

Found at: EuroBSDcon 2019
MFC after: 1 week
Sponsored by: Mellanox Technologies

4 years agoIncrease the maximum user-space buffer size from 256kBytes to 32MBytes for
hselasky [Fri, 20 Sep 2019 11:00:02 +0000 (11:00 +0000)]
Increase the maximum user-space buffer size from 256kBytes to 32MBytes for
libusb. This is useful for speeding up large data transfers while reducing
the interrupt rate.

Found at: EuroBSDcon 2019
MFC after: 1 week
Sponsored by: Mellanox Technologies

4 years agoThe maximum TD size is 31 and not 15.
hselasky [Fri, 20 Sep 2019 10:56:13 +0000 (10:56 +0000)]
The maximum TD size is 31 and not 15.

Found at: EuroBSDcon 2019
MFC after: 1 week
Sponsored by: Mellanox Technologies

4 years agoEnsure libthr is always built before libprivatezstd when building the
bapt [Fri, 20 Sep 2019 09:45:38 +0000 (09:45 +0000)]
Ensure libthr is always built before libprivatezstd when building the
startup libs

Reported by: "Galazka, Krzysztof" <krzysztof.galazka@intel.com>

4 years agoremove redundant "ktls" in KTLS thr name
gallatin [Fri, 20 Sep 2019 09:36:07 +0000 (09:36 +0000)]
remove redundant "ktls" in KTLS thr name

This reduces the string width of the ktls thread name
and improves "ps" output.

Glanced at by: jhb
Event: EuroBSDCon hackathon
Sponsored by: Netflix

4 years agoelf_common: add ELF note names
emaste [Fri, 20 Sep 2019 09:04:52 +0000 (09:04 +0000)]
elf_common: add ELF note names

r348628 added a definition of NT_GNU_BUILD_ID.  Some software (Valgrind)
also expects a #define for the note name (ELF_NOTE_GNU) in the case that
NT_GNU_BUILD_ID is defined.

PR: 239669
Reported by: Yuichiro NAITO
Sponsored by: The FreeBSD Foundation
Event: EuroBSDCon FreeBSD DevSummit 2019

4 years agoFix the handling of invalid parameters in ASCONF chunks.
tuexen [Fri, 20 Sep 2019 08:20:20 +0000 (08:20 +0000)]
Fix the handling of invalid parameters in ASCONF chunks.
Thanks to Mark Wodrich from Google for reporting the issue in
https://github.com/sctplab/usrsctp/issues/376
for the userland stack.

MFC after: 3 days

4 years agoloader: fix typo in zalloc.
tsoome [Fri, 20 Sep 2019 05:22:34 +0000 (05:22 +0000)]
loader: fix typo in zalloc.

4 years agoImprove ioat(4) NUMA-awareness.
mav [Thu, 19 Sep 2019 22:15:57 +0000 (22:15 +0000)]
Improve ioat(4) NUMA-awareness.

Allocate ioat->ring memory from the device domain.
Schedule ioat->poll_timer to the first CPU of the device domain.

According to the pcm-numa tool from the intel-pcm port, this reduces the number of
remote DRAM accesses while copying data by 75%.  And unless it is noise,
I've noticed some speed improvement when copying data to another domain.

MFC after: 1 week
Sponsored by: iXsystems, Inc.

4 years agovfs: group fields used for per-cpu ops in one cacheline
mjg [Thu, 19 Sep 2019 21:23:14 +0000 (21:23 +0000)]
vfs: group fields used for per-cpu ops in one cacheline

Sponsored by: The FreeBSD Foundation

4 years agoFix src component detection
grembo [Thu, 19 Sep 2019 21:13:51 +0000 (21:13 +0000)]
Fix src component detection

Reviewed by: emaste
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21579

4 years agoFollow up on r352304 which disabled default mlockall() at startup.
cy [Thu, 19 Sep 2019 20:16:51 +0000 (20:16 +0000)]
Follow up on r352304 which disabled default mlockall() at startup.
Unfortunately, though the original tarball supports this in ./configure
(for Linux), fully supporting disabling of mlockall() by default requires
a little extra help; otherwise the following is logged in syslog:

Cannot set RLIMIT_MEMLOCK: Operation not permitted

MFC after: 2 weeks
X-MFC with: r352304

4 years agoApply r346792 (cperciva) from stable/12 to head. The original commit
gjb [Thu, 19 Sep 2019 16:43:12 +0000 (16:43 +0000)]
Apply r346792 (cperciva) from stable/12 to head.  The original commit
message:

 On non-x86 systems, use "quarterly" packages.

 x86 architectures have "latest" package builds on stable/*, so keep using
 those (they'll get switched over to "quarterly" during releases).

The original commit was a direct commit to stable/12, as at the time it
was presumed it would not be necessary for head.  However, when it is time
to create a releng branch or switch from PRERELEASE/STABLE to BETA/RC, the
pkg(7) Makefile needs further adjusting.  This commit includes those
further adjustments, evaluating the BRANCH variable from release/Makefile
to determine the pkg(7) repository to use.

MFC after: immediate (if possible)
Sponsored by: Rubicon Communications, LLC (Netgate)

4 years agoReduce calls to close(2) at startup through the use of closefrom(2).
cy [Thu, 19 Sep 2019 14:45:04 +0000 (14:45 +0000)]
Reduce calls to close(2) at startup through the use of closefrom(2).

Submitted by: pawel.biernacki@gmail.com
Reviewed by: mjg, cy
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21715

4 years agoWhitespace cleanup, no functional change
lwhsu [Thu, 19 Sep 2019 13:25:19 +0000 (13:25 +0000)]
Whitespace cleanup, no functional change

Sponsored by: The FreeBSD Foundation

4 years agoTemporarily add test_write_filter_zstd to BROKEN_TESTS as it always fails in CI
lwhsu [Thu, 19 Sep 2019 13:23:25 +0000 (13:23 +0000)]
Temporarily add test_write_filter_zstd to BROKEN_TESTS as it always fails in CI

There is currently no trivial way to mark a single libarchive test as
skipped, so just add it to BROKEN_TESTS for now.

PR: 240683
Sponsored by: The FreeBSD Foundation

4 years agofreebsd-update: make usage output consistent
emaste [Thu, 19 Sep 2019 11:46:43 +0000 (11:46 +0000)]
freebsd-update: make usage output consistent

Drop trailing . which appeared only on description of IDS.

Submitted by: grembo
Event: EuroBSDCon Norway FreeBSD DevSummit

4 years agofreebsd-update.8: appease igor
emaste [Thu, 19 Sep 2019 11:34:35 +0000 (11:34 +0000)]
freebsd-update.8: appease igor

igor follows American style guides in the belief that abbreviations i.e.
and e.g. are always followed by a comma.  Make that change now so that
future updates to freebsd-update.8 do not complain about this.

Submitted by: grembo
Event: EuroBSDCon Norway FreeBSD DevSummit

4 years agoWhen the RACK stack computes the space for user data in a TCP segment,
tuexen [Thu, 19 Sep 2019 10:27:47 +0000 (10:27 +0000)]
When the RACK stack computes the space for user data in a TCP segment,
it wasn't taking the IP level options into account. This patch fixes this.
In addition, it also corrects a KASSERT and adds protection code to assure
that the IP header chain and the TCP header fit in the first fragment as
required by RFC 7112.

Reviewed by: rrs@
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D21666
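
The accounting described above can be sketched roughly as follows. This is
not the RACK stack's code: segment_payload_space() and its parameters are
hypothetical names chosen for illustration, and the only point is that the
IP option length has to be subtracted along with the other header lengths.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the RACK stack: bytes left for
 * user data once every header and option length is subtracted from
 * the MTU.  Returns 0 when the headers alone do not fit.
 */
static size_t
segment_payload_space(size_t mtu, size_t ip_hdr_len, size_t ip_opt_len,
    size_t tcp_hdr_len, size_t tcp_opt_len)
{
	size_t overhead;

	assert(mtu > 0);
	/* Leaving out ip_opt_len here is the kind of bug described above. */
	overhead = ip_hdr_len + ip_opt_len + tcp_hdr_len + tcp_opt_len;
	if (overhead >= mtu)
		return (0);
	return (mtu - overhead);
}
```

With a 1500-byte MTU, 20-byte IP and TCP headers and 12 bytes of TCP
options, adding 8 bytes of IP options shrinks the space by the same 8 bytes.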

4 years agoWhen processing an incoming IPv6 packet over the loopback interface which
tuexen [Thu, 19 Sep 2019 10:22:29 +0000 (10:22 +0000)]
When processing an incoming IPv6 packet over the loopback interface which
contains Hop-by-Hop options, the mbuf chain is potentially changed in
ip6_hopopts_input(), called by ip6_input_hbh().
This can happen because of the use of IP6_EXTHDR_CHECK, which might
call m_pullup().
So provide the updated pointer back to the caller of ip6_input_hbh() to
avoid using a freed mbuf chain in ip6_input().

Reviewed by: markj@
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D21664
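
A userland analogy of the pattern, assuming a hypothetical struct buf in
place of an mbuf chain (buf_new() and buf_ensure() are illustrative names,
not kernel functions): a callee that may free and replace its argument has
to hand the new pointer back, here through a pointer-to-pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userland buffer; it only stands in for an mbuf chain. */
struct buf {
	size_t len;
	unsigned char data[];
};

static struct buf *
buf_new(size_t len)
{
	struct buf *b = calloc(1, sizeof(*b) + len);

	if (b != NULL)
		b->len = len;
	return (b);
}

/*
 * Like a pullup, this may free the old object and allocate a new one.
 * The result is stored through 'bp' so the caller cannot keep a stale
 * pointer.  Returns 0 on success; on failure returns -1, the old
 * buffer has been freed and *bp is NULL.
 */
static int
buf_ensure(struct buf **bp, size_t need)
{
	struct buf *old, *nb;

	assert(bp != NULL && *bp != NULL);
	old = *bp;
	if (old->len >= need)
		return (0);		/* large enough, pointer unchanged */

	nb = calloc(1, sizeof(*nb) + need);
	if (nb != NULL) {
		memcpy(nb->data, old->data, old->len);
		nb->len = need;
	}
	free(old);			/* the old pointer is now invalid */
	*bp = nb;
	return (nb != NULL ? 0 : -1);
}
```

A caller that kept using its own copy of the old pointer after buf_ensure()
would touch freed memory, which is the class of bug described above.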

4 years agoupdate zfs send usage help with r352447
avg [Thu, 19 Sep 2019 09:48:01 +0000 (09:48 +0000)]
update zfs send usage help with r352447

MFC after: 3 days

4 years agofix dsl_scan_ds_clone_swapped logic
avg [Thu, 19 Sep 2019 09:43:56 +0000 (09:43 +0000)]
fix dsl_scan_ds_clone_swapped logic

It was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if only ds1 was already present, then it would be first
replaced with ds2 and then ds2 would be replaced back with ds1.  Also,
both cases did not properly handle a situation where both ds1 and ds2
are already queued.  A duplicate insertion would be attempted and its
failure would result in a panic.

This change has also been submitted to ZoL as zfsonlinux/zfs@dd262c9

PR: 239566
Reported by: pascal.guitierrez@gmail.com
MFC after: 4 days
Sponsored by: CyberSecure
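
The case analysis can be sketched with a small in-memory set. This is a
simplified illustration, not the ZFS code: struct idset and the idset_*()
helpers are hypothetical stand-ins for the ZAP object and the in-memory
queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size ID set standing in for the scan queue. */
#define IDSET_MAX 8
struct idset {
	uint64_t ids[IDSET_MAX];
	size_t count;
};

static bool
idset_contains(const struct idset *s, uint64_t id)
{
	for (size_t i = 0; i < s->count; i++)
		if (s->ids[i] == id)
			return (true);
	return (false);
}

/* Replaces entry 'from' with 'to'; 'to' must not already be present. */
static void
idset_replace(struct idset *s, uint64_t from, uint64_t to)
{
	assert(!idset_contains(s, to));
	for (size_t i = 0; i < s->count; i++)
		if (s->ids[i] == from)
			s->ids[i] = to;
}

/* Accounts for datasets ds1 and ds2 having swapped their contents. */
static void
idset_clone_swapped(struct idset *s, uint64_t ds1, uint64_t ds2)
{
	bool has1 = idset_contains(s, ds1);
	bool has2 = idset_contains(s, ds2);

	if (has1 && has2)
		return;		/* both queued: nothing to change */
	if (has1)
		idset_replace(s, ds1, ds2);
	else if (has2)		/* else-if: never undo the first replace */
		idset_replace(s, ds2, ds1);
}
```

Testing both memberships before changing anything is what avoids the
replace-and-replace-back sequence and the duplicate insertion.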

4 years agovt: fix problems with trying to switch to a closed VT
avg [Thu, 19 Sep 2019 09:22:45 +0000 (09:22 +0000)]
vt: fix problems with trying to switch to a closed VT

If there is an attempt to switch from a process-owned VT to a closed VT,
then vt(4) first requests the process to release its VT and only then
realizes that the target VT is closed and, so, the switch is not
possible.  So, the driver does not actually do any switch, but at the
same time the owning process is not notified about that and it does not
re-acquire the VT.

This change adds an early check for the target VT state, so that the
switch can be refused before the process coordination dance.
On top of that, the code now checks for a failure of vt_window_switch()
and calls vt_window_postswitch() for the current VT if it is in the
process mode.

Test Plan:
- configure VT1 - VT8 (ttyv0 - ttyv7) to be text consoles (run getty)
- configure VT9 (ttyv8) to run an X server
- make sure that the X server configuration allows VT switching
- leave VT10 - VT12 unconfigured
- while in the X server press Ctrl+Alt+F10
- without the patch, observe strange screen content and problems with
  keyboard input
- with the patch, observe that nothing happens

The problem has been observed and the fix has been tested with an nVidia
graphics card and the proprietary nvidia driver.
Not sure if that matters.

Reviewed by: ray
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D21704

4 years agosys/vm/vm_glue.c: Incorrect function name in panic string
allanjude [Thu, 19 Sep 2019 07:28:24 +0000 (07:28 +0000)]
sys/vm/vm_glue.c: Incorrect function name in panic string

Use __func__ to avoid this issue in the future.

Submitted by: Wuyang Chung <wuyang.chung1@gmail.com>
Reviewed by: markj, emaste
Obtained from: https://github.com/freebsd/freebsd/pull/410
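
A minimal userland sketch of the idea, assuming a hypothetical
FORMAT_PANIC() macro in place of the kernel's panic(9) and a made-up
function name: because __func__ is supplied by the compiler, the message
cannot go stale when the function is renamed or the code is copied.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a panic message formatter; panic(9) is not
 * used so that the sketch runs in userland.
 */
#define FORMAT_PANIC(buf, len, msg) \
	(void)snprintf((buf), (len), "%s: %s", __func__, (msg))

/* Hypothetical function name, chosen only for illustration. */
static const char *
example_stack_alloc(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	/* __func__ is "example_stack_alloc" here; nothing is typed by hand. */
	FORMAT_PANIC(buf, len, "out of pages");
	return (buf);
}
```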

4 years agoAdd some tests for page fault signals and codes
jilles [Wed, 18 Sep 2019 21:00:32 +0000 (21:00 +0000)]
Add some tests for page fault signals and codes

It is useful to have some tests for page fault signals.

More tests would be useful but creating the conditions (such as various
kinds of running out of memory and I/O errors) is more complicated.

The tests page_fault_signal__bus_objerr_1 and
page_fault_signal__bus_objerr_2 depend on https://reviews.freebsd.org/D21566
before they can pass.

PR: 211924
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D21624

4 years agoFix typo, setting hidden flag instead of reparse.
mav [Wed, 18 Sep 2019 19:33:08 +0000 (19:33 +0000)]
Fix typo, setting hidden flag instead of reparse.

Submitted by: Ryan Moeller <ryan@ixsystems.com>
MFC after: 3 days
Sponsored by: iXsystems, Inc.

4 years agotruss: decode sysctl names.
kib [Wed, 18 Sep 2019 16:15:05 +0000 (16:15 +0000)]
truss: decode sysctl names.

Submitted by: Pawel Biernacki
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21688

4 years agoAdd support for BERI statcounters.
br [Wed, 18 Sep 2019 16:13:50 +0000 (16:13 +0000)]
Add support for BERI statcounters.

BERI stands for Bluespec Extensible RISC Implementation, based on MIPS.

BERI has not implemented standard MIPS performance monitoring counters;
instead, it provides statistical counters.

BERI statcounters have several limitations:
- They can't be written
- They don't support start/stop operation
- No hardware interrupt is provided on a counter overflow.

So make it a module separate from hwpmc_mips and support process/system
counting mode only.

Sponsored by: DARPA, AFRL

4 years agosysctl: use names instead of magic numbers.
kib [Wed, 18 Sep 2019 16:13:10 +0000 (16:13 +0000)]
sysctl: use names instead of magic numbers.

Replace magic numbers with symbols for internal sysctl operations.
Convert in-kernel and libc consumers.

Submitted by: Pawel Biernacki
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21693

4 years agoAdd the missing bits for LIBADD to properly function now that
bapt [Wed, 18 Sep 2019 08:02:03 +0000 (08:02 +0000)]
Add the missing bits for LIBADD to properly function now that
libarchive is linked to libzstd

Pointy hat: bapt
Reported by: antoine

4 years agoAdd native support for zstd to libarchive
bapt [Wed, 18 Sep 2019 07:57:56 +0000 (07:57 +0000)]
Add native support for zstd to libarchive

Note that old pkg will fail to build after this. A recent ports tree (one
providing pkg 1.12+) is required to build. Older, already built pkg should
continue working as expected.

PR: 238797
Exp run by: antoine
Reviewed by: cem
Approved by: cem
Differential Revision: https://reviews.freebsd.org/D20752

4 years agosrc.conf(5): regenerate after r352465, r352466
kevans [Wed, 18 Sep 2019 02:04:41 +0000 (02:04 +0000)]
src.conf(5): regenerate after r352465, r352466

These changed the defaults for the GOOGLETEST knob and added a description
for WITH_GOOGLETEST.

4 years agoAdd description for WITH_GOOGLETEST
kevans [Wed, 18 Sep 2019 02:03:39 +0000 (02:03 +0000)]
Add description for WITH_GOOGLETEST

This is the logical negation of WITHOUT_GOOGLETEST, and helpful to have as
we now have different per-arch defaults for this option.

4 years agogoogletest: default-disable on all of MIPS for now
kevans [Wed, 18 Sep 2019 01:58:56 +0000 (01:58 +0000)]
googletest: default-disable on all of MIPS for now

Parts of the fusefs tests trigger a bug in current versions of llvm: IR
representation of some routine for the MIPS targets is a function with a
large number of arguments. This then leads the compiler on an hour+ long
goose chase, which is OK if you build the current tree but less so if you're
trying an external toolchain or doing a universe build involving mips when it
eventually gets switched over to LLVM.

Better, accurate details can be found in LLVM PR43263.

4 years agomips: ubldr: use truncated load address for mips32
kevans [Wed, 18 Sep 2019 01:33:17 +0000 (01:33 +0000)]
mips: ubldr: use truncated load address for mips32

BFD appears to silently truncate 0xffffffff80800000 when it processes the
ldscript for 32-bit mips, but LLD chokes on it as the linker script tries to
place elements above 32-bit range. It's unclear to me if silent truncation
is kosher or not and whether this patch is really what we want to do, but it
is one approach at least.

Reviewed by: imp, mizhka
Differential Revision: https://reviews.freebsd.org/D21487
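
For reference, the arithmetic behind the truncation can be sketched as
below. The helper names are hypothetical and this is neither linker nor
loader code; it only shows that the 64-bit constant is the sign-extended
form of the 32-bit address 0x80800000.

```c
#include <assert.h>
#include <stdint.h>

/* Keeps only the low 32 bits of a 64-bit address. */
static uint32_t
truncate_addr32(uint64_t addr)
{
	return ((uint32_t)(addr & 0xffffffffULL));
}

/* Sign-extends a 32-bit address to 64 bits without relying on casts. */
static uint64_t
sign_extend_addr32(uint32_t addr)
{
	return (addr | ((addr & 0x80000000u) ? 0xffffffff00000000ULL : 0ULL));
}

/* Picks the load address form for a 32-bit or 64-bit link. */
static uint64_t
load_addr_for(unsigned bits, uint64_t addr)
{
	assert(bits == 32 || bits == 64);
	return (bits == 32 ? truncate_addr32(addr) : addr);
}
```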

4 years agoTemporarily skip sys.netpfil.common.tos.pf_tos on i386 CI as it always fails
lwhsu [Tue, 17 Sep 2019 22:09:14 +0000 (22:09 +0000)]
Temporarily skip sys.netpfil.common.tos.pf_tos on i386 CI as it always fails

PR: 240086
Sponsored by: The FreeBSD Foundation

4 years agoTemporarily skip sys.netpfil.common.forward.pf_v4 on i386 CI as it always fails
lwhsu [Tue, 17 Sep 2019 22:08:16 +0000 (22:08 +0000)]
Temporarily skip sys.netpfil.common.forward.pf_v4 on i386 CI as it always fails

PR: 240085
Sponsored by: The FreeBSD Foundation

4 years agoUse correct filename in newsyslog.conf
swills [Tue, 17 Sep 2019 20:05:06 +0000 (20:05 +0000)]
Use correct filename in newsyslog.conf

Approved by: bapt (implicit)
Differential Revision: https://reviews.freebsd.org/D21561

4 years agolog daemon.info to /var/log/daemon.log by default
swills [Tue, 17 Sep 2019 20:03:20 +0000 (20:03 +0000)]
log daemon.info to /var/log/daemon.log by default

Log the daemon facility now that daemon(8) has syslog support, which
defaults to the daemon facility at info priority.

Reviewed by: bapt
Approved by: bapt
Differential Revision: https://reviews.freebsd.org/D21561

4 years agoifconfig: add report of the string from SIOCGIFDOWNREASON.
kib [Tue, 17 Sep 2019 18:51:10 +0000 (18:51 +0000)]
ifconfig: add report of the string from SIOCGIFDOWNREASON.

Sample output:
# ifconfig mce0
mce0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=3ed07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6,TXRTLMT,HWRXTSTMP>
        ether e4:1d:2d:e7:10:0a
        media: Ethernet autoselect <full-duplex,rxpause,txpause>
        status: no carrier (Negotiation failure)
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

Reviewed by: hselasky, rrs
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21527

4 years agoAdd SIOCGIFDOWNREASON.
kib [Tue, 17 Sep 2019 18:49:13 +0000 (18:49 +0000)]
Add SIOCGIFDOWNREASON.

The ioctl(2) is intended to provide more details about why the link is
down.

Eventually we might define a comprehensive list of codes for the
situations.  But the interface also allows the driver to provide a
free-form null-terminated ASCII string carrying arbitrary non-formalized
information.  A sample implementation exists for mlx5(4), where the
string is fetched from firmware controlling the port.

Reviewed by: hselasky, rrs
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21527

4 years agoFurther refine r352393, only call vnode_pager_setsize() outside the
kib [Tue, 17 Sep 2019 18:41:39 +0000 (18:41 +0000)]
Further refine r352393, only call vnode_pager_setsize() outside the
node lock when shrinking.

This is similar to r252528, applied to the above commit.

Apparently there is a race which makes it necessary to at least keep the
n_size and pager size consistent when extending.  The current suspect is
that iod threads perform vnode_pager_setsize() without taking the
vnode lock, which corrupts the file content.

Reported and tested by: Masachika ISHIZUKA <ish@amail.plala.or.jp>
Discussed with: rmacklem (related issues)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

4 years agorealloc(x, 0) should not return NULL.
kib [Tue, 17 Sep 2019 18:36:29 +0000 (18:36 +0000)]
realloc(x, 0) should not return NULL.

See http://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_400.
Upstream jemalloc issue is opened by emaste at
https://github.com/jemalloc/jemalloc/issues/1629.

Reviewed by: emaste
PR: 240456
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21632
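
For code that has to run on allocators where realloc(x, 0) may still return
NULL, one defensive pattern is sketched below. realloc_nonnull() and
resize_buffer() are hypothetical helpers, not part of libc or jemalloc;
they request at least one byte so that NULL only ever means failure.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: never passes size 0 to realloc(), so a NULL
 * return always means the allocation failed.
 */
static void *
realloc_nonnull(void *ptr, size_t size)
{
	return (realloc(ptr, size == 0 ? 1 : size));
}

/* Resizes *bufp; on failure the old buffer is left untouched. */
static int
resize_buffer(void **bufp, size_t size)
{
	void *newp;

	assert(bufp != NULL);
	newp = realloc_nonnull(*bufp, size);
	if (newp == NULL)
		return (-1);	/* unambiguous: allocation failed */
	*bufp = newp;
	return (0);
}
```

Without such a wrapper, a caller cannot tell from a NULL result whether the
old block was freed or is still allocated.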