FreeBSD/FreeBSD.git
4 years agoMFV r352731:
mm [Thu, 26 Sep 2019 01:50:20 +0000 (01:50 +0000)]
MFV r352731:
Sync libarchive with vendor.

Relevant vendor changes:
  Issue #1237: Fix integer overflow in archive_read_support_filter_lz4.c
  PR #1249: Correct some typographical and grammatical errors.
  PR #1250: Minor corrections to the formatting of manual pages

MFC after: 1 week

4 years agoFix some broken relocation handling
mhorne [Thu, 26 Sep 2019 00:58:47 +0000 (00:58 +0000)]
Fix some broken relocation handling

In a few cases, the symbol lookup is missing before attempting to
perform the relocation. While the relocation types affected are
currently unused, this results in an uninitialized variable warning,
that is escalated to an error when building with clang.

Reviewed by: markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21773

4 years agoCleanup of elf_machdep.c
mhorne [Thu, 26 Sep 2019 00:54:07 +0000 (00:54 +0000)]
Cleanup of elf_machdep.c

Fix some style(9) violations.

This also changes the name of the machine-dependent sysctl kern.debug_kld to
debug.kld_reloc, and changes its type from int to bool. This is acceptable
since we are not currently concerned with preserving the RISC-V ABI.

Reviewed by: markj, kp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21772

4 years agoMicrooptimize sched_pickcpu() CPU affinity on SMT.
mav [Thu, 26 Sep 2019 00:35:06 +0000 (00:35 +0000)]
Microoptimize sched_pickcpu() CPU affinity on SMT.

Using CPU_FFS() to implement CPUSET_FOREACH() saves up to ~0.5% of CPU
time on a 72-thread SMT system doing 80K IOPS to NVMe from one thread.

MFC after: 1 month
Sponsored by: iXsystems, Inc.
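
The find-first-set idea can be illustrated outside the kernel. The following is a minimal sketch in Python, assuming a CPU set modelled as a plain integer bitmask; `foreach_set_bit` and `foreach_naive` are hypothetical helpers, not the kernel macros. Instead of testing every possible CPU index, the loop jumps straight to the lowest set bit and clears it.

```python
def foreach_set_bit(mask):
    """Yield the index of each set bit, lowest first.

    Hypothetical helper: models a CPU set as an int bitmask and visits
    only the set bits, in the spirit of a find-first-set loop.
    Indices here are zero-based.
    """
    while mask:
        lowest = mask & -mask          # isolate the lowest set bit
        yield lowest.bit_length() - 1  # zero-based index of that bit
        mask ^= lowest                 # clear it and continue


def foreach_naive(mask, nbits):
    """Baseline: test every index from 0 to nbits - 1."""
    return [i for i in range(nbits) if mask & (1 << i)]
```

Both helpers return the same indices; the first one does work proportional to the number of set bits rather than to the size of the set.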

4 years agoAdd SPDX tags to recently added files
kevans [Wed, 25 Sep 2019 22:53:30 +0000 (22:53 +0000)]
Add SPDX tags to recently added files

Reported by: Pawel Biernacki

4 years agoefibootmgr(8): fix markup and style issues
yuripv [Wed, 25 Sep 2019 21:23:30 +0000 (21:23 +0000)]
efibootmgr(8): fix markup and style issues

- split synopsis into separate options that can't be used together
- sort options
- fix (style) issues reported by mandoc lint

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D21710

4 years agostyle(9): remove extraneous empty lines
glebius [Wed, 25 Sep 2019 20:46:09 +0000 (20:46 +0000)]
style(9): remove extraneous empty lines

4 years agoMicrooptimize sched_pickcpu() after r352658.
mav [Wed, 25 Sep 2019 19:29:09 +0000 (19:29 +0000)]
Microoptimize sched_pickcpu() after r352658.

I've noticed that I missed the intr check at one more SCHED_AFFINITY(),
so instead of adding one more branch I prefer to remove a few.

Profiler shows the function CPU time reduction from 0.24% to 0.16%.

MFC after: 1 month
Sponsored by: iXsystems, Inc.

4 years agoposix_spawn(3): handle potential signal issues with vfork
kevans [Wed, 25 Sep 2019 19:22:03 +0000 (19:22 +0000)]
posix_spawn(3): handle potential signal issues with vfork

Described in [1], signal handlers running in a vfork child have
opportunities to corrupt the parent's state. Address this by adding a new
rfork(2) flag, RFSPAWN, that has vfork(2) semantics but also resets signal
handlers in the child during creation.

x86 uses rfork_thread(3) instead of a direct rfork(2) because rfork with
RFMEM/RFSPAWN cannot work when the return address is stored on the stack --
further information about this problem is described under RFMEM in the
rfork(2) man page.

Addressing this has been identified as a prerequisite to using posix_spawn
in subprocess on FreeBSD [2].

[1] https://ewontfix.com/7/
[2] https://bugs.python.org/issue35823

Reviewed by: jilles, kib
Differential Revision: https://reviews.freebsd.org/D19058
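
For readers coming from the Python side of [2], here is a minimal sketch of spawning a child through posix_spawn with Python's `os.posix_spawn`. The `setsigdef` argument asks for the listed signals to start with their default disposition in the child. The helper name `spawn_and_wait` and the child command are illustrative assumptions. The sketch does not show RFSPAWN itself; whether libc's posix_spawn uses it is up to the platform.

```python
import os
import signal
import sys


def spawn_and_wait(code, default_signals=(signal.SIGINT, signal.SIGTERM)):
    """Run `python -c code` via posix_spawn and return its exit status.

    Hypothetical helper for illustration only.
    """
    argv = [sys.executable, "-c", code]
    pid = os.posix_spawn(
        sys.executable,
        argv,
        os.environ,
        setsigdef=default_signals,  # reset these signals to SIG_DFL in the child
    )
    _, status = os.waitpid(pid, 0)
    # Return None if the child did not exit normally (e.g. killed by a signal).
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
```

A call such as `spawn_and_wait("import sys; sys.exit(7)")` returns the child's exit status.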

4 years agorfork(2): add RFSPAWN flag
kevans [Wed, 25 Sep 2019 19:20:41 +0000 (19:20 +0000)]
rfork(2): add RFSPAWN flag

When RFSPAWN is passed, rfork exhibits vfork(2) semantics but also resets
signal handlers in the child during creation to avoid a point of corruption
of parent state from the child.

This flag will be used by posix_spawn(3) to handle potential signal issues.

Reviewed by: jilles, kib
Differential Revision: https://reviews.freebsd.org/D19058

4 years agoDo not left-shift a negative number (inducing undefined behavior in
dim [Wed, 25 Sep 2019 18:50:57 +0000 (18:50 +0000)]
Do not left-shift a negative number (inducing undefined behavior in
C/C++) in exp(3), expf(3), expm1(3) and expm1f(3) during intermediate
computations that compute the IEEE-754 bit pattern for |2**k| for
integer |k|.

The implementations of exp(3), expf(3), expm1(3) and expm1f(3) need to
compute IEEE-754 bit patterns for 2**k in certain places.  (k is an
integer and 2**k is exactly representable in IEEE-754.)

Currently they do things like 0x3FF0'0000+(k<<20), which is to say they
take the bit pattern representing 1 and then add directly to the
exponent field to get the desired power of two.  This is fine when k is
non-negative.

But when k<0 (and certain classes of input trigger this), this
left-shifts a negative number -- an operation with undefined behavior in
C and C++.

The desired semantics can be achieved by instead adding the
possibly-negative k to the IEEE-754 exponent bias to get the desired
exponent field, _then_ shifting that into its proper overall position.

(Note that in case of s_expm1.c and s_expm1f.c, there are SET_HIGH_WORD
and SET_FLOAT_WORD uses further down in each of these files that perform
shift operations involving k, but by these points k's range has been
restricted to 2 < k <= 56, and the shift operations under those
circumstances can't do anything that would be UB.)

Submitted by: Jeff Walden, https://github.com/jswalden
Obtained from: https://github.com/freebsd/freebsd/pull/411
Obtained from: https://github.com/freebsd/freebsd/pull/412
MFC after: 3 days
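
The arithmetic behind "add k to the bias, then shift" can be checked with a minimal Python sketch. Python integers have no undefined behavior, so this shows only that the bias-first formula yields the right bit pattern; the constant and function names are assumptions made for illustration, not identifiers from msun.

```python
import struct

EXP_BIAS = 0x3FF   # IEEE-754 double-precision exponent bias
EXP_SHIFT = 20     # exponent field position within the high 32-bit word


def high_word_pow2(k):
    """High 32-bit word of the double 2**k, for normal exponents only."""
    if not -1022 <= k <= 1023:
        raise ValueError("2**k is not a normal double")
    # Add k to the bias first; the sum is always positive here,
    # so the shift never operates on a negative value.
    return (EXP_BIAS + k) << EXP_SHIFT


def pow2_from_bits(k):
    """Build the double 2**k from its bit pattern (low word is zero)."""
    bits = high_word_pow2(k) << 32
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

For k = 0 the high word is 0x3FF00000, the pattern for 1.0 mentioned above; for negative k the exponent field simply drops below the bias.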

4 years agocompat/freebsd32: restore style after r352705 (no functional change)
kevans [Wed, 25 Sep 2019 18:48:05 +0000 (18:48 +0000)]
compat/freebsd32: restore style after r352705 (no functional change)

The escaped newlines haven't been necessary since r339624, but this file has
not been reformatted. Restore the style.

4 years agoAdd debugging facility EPOCH_TRACE that checks that epochs entered are
glebius [Wed, 25 Sep 2019 18:26:31 +0000 (18:26 +0000)]
Add debugging facility EPOCH_TRACE that checks that epochs entered are
properly nested and warns about recursive entrances.  Unlike with locks,
there is nothing fundamentally wrong with such use; the intent of the tracer
is to help review complex epoch-protected code paths, and we mean the
network stack here.

Reviewed by: hselasky
Sponsored by: Netflix
Pull Request: https://reviews.freebsd.org/D21610
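
What "properly nested" and "recursive entrance" mean can be sketched with a toy model in Python. This is not the EPOCH_TRACE implementation; the class and method names are hypothetical, and a real tracer would keep this state per thread.

```python
class EpochTracer:
    """Toy model of a nesting checker; names are hypothetical."""

    def __init__(self):
        self.stack = []      # epochs currently entered, innermost last
        self.warnings = []   # human-readable diagnostics

    def enter(self, epoch):
        if epoch in self.stack:
            self.warnings.append(f"recursive entry into {epoch}")
        self.stack.append(epoch)

    def exit(self, epoch):
        if not self.stack or self.stack[-1] != epoch:
            self.warnings.append(f"improperly nested exit from {epoch}")
            # Remove the innermost matching entry, if any, to keep going.
            for i in range(len(self.stack) - 1, -1, -1):
                if self.stack[i] == epoch:
                    del self.stack[i]
                    break
            return
        self.stack.pop()
```

Both conditions only produce warnings, in line with the note that such use is not fundamentally wrong.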

4 years agosysent: regenerate after r352705
kevans [Wed, 25 Sep 2019 18:09:19 +0000 (18:09 +0000)]
sysent: regenerate after r352705

This also implements it, fixes kdump, and removes no longer needed bits from
lib/libc/sys/shm_open.c for the interim.

4 years agoMark shm_open(2) as COMPAT12, succeeded by shm_open2
kevans [Wed, 25 Sep 2019 18:06:48 +0000 (18:06 +0000)]
Mark shm_open(2) as COMPAT12, succeeded by shm_open2

Implementation and regenerated files will follow.

4 years agoAdjust Makefile.inc1 syscall sub commit
kevans [Wed, 25 Sep 2019 18:04:09 +0000 (18:04 +0000)]
Adjust Makefile.inc1 syscall sub commit

4 years agoAdd linux-compatible memfd_create
kevans [Wed, 25 Sep 2019 18:03:18 +0000 (18:03 +0000)]
Add linux-compatible memfd_create

memfd_create is effectively a SHM_ANON shm_open(2) mapping with optional
CLOEXEC and file sealing support. This is used by some mesa parts, some
linux libs, and qemu can also take advantage of it and uses the sealing to
prevent resizing the region.

This reimplements shm_open in terms of shm_open2(2) at the same time.

shm_open(2) will be moved to COMPAT12 shortly.

Reviewed by: markj, kib
Differential Revision: https://reviews.freebsd.org/D21393
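
How a consumer might use sealing to prevent resizing can be sketched from userland. The following Python sketch assumes a Python build that exposes `os.memfd_create` and the `fcntl` seal constants, which is platform-dependent, so availability is checked first. The helper names are hypothetical.

```python
import fcntl
import os


def sealing_supported():
    """True if this Python build exposes memfd_create and the seal fcntls."""
    return hasattr(os, "memfd_create") and hasattr(fcntl, "F_ADD_SEALS")


def resize_blocked_after_seal(size=4096):
    """Create an anonymous memory fd, seal its size, then try to grow it.

    Returns True if the kernel refused the resize.
    """
    fd = os.memfd_create("demo", os.MFD_CLOEXEC | os.MFD_ALLOW_SEALING)
    try:
        os.ftruncate(fd, size)
        # Forbid both growing and shrinking from now on.
        fcntl.fcntl(fd, fcntl.F_ADD_SEALS,
                    fcntl.F_SEAL_GROW | fcntl.F_SEAL_SHRINK)
        try:
            os.ftruncate(fd, size * 2)
        except OSError:
            return True
        return False
    finally:
        os.close(fd)
```

Callers should check `sealing_supported()` before calling `resize_blocked_after_seal()`, since the constants are missing on platforms where Python was built without them.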

4 years agoEnhance the 'ps' command so that it prints a line per proc and a line
glebius [Wed, 25 Sep 2019 18:03:15 +0000 (18:03 +0000)]
Enhance the 'ps' command so that it prints a line per proc and a line
per thread, so that instead of repeating the same info for all threads
in a proc, it prints thread-specific info. Also include the thread number,
which matches the 'info threads' output and can be used as an argument for
thread switching with the 'thread' command.

4 years agosysent: regenerate after r352700
kevans [Wed, 25 Sep 2019 17:59:58 +0000 (17:59 +0000)]
sysent: regenerate after r352700

4 years agoAdd a shm_open2 syscall to support upcoming memfd_create
kevans [Wed, 25 Sep 2019 17:59:15 +0000 (17:59 +0000)]
Add a shm_open2 syscall to support upcoming memfd_create

shm_open2 allows a little more flexibility than the original shm_open.
shm_open2 doesn't enforce CLOEXEC on its callers, and it has a separate
shmflag argument that can be expanded later. Currently the only shmflag is
to allow file sealing on the returned fd.

shm_open and memfd_create will both be implemented in libc to use this new
syscall.

__FreeBSD_version is bumped to indicate the presence.

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D21393

4 years agoIn suite.test.mk, test if ${DESTDIR} exists before attempting to run
dim [Wed, 25 Sep 2019 17:52:59 +0000 (17:52 +0000)]
In suite.test.mk, test if ${DESTDIR} exists before attempting to run
chflags -R on it, otherwise the command will error out.  (Note that
adding -f to the chflags invocation does not help, unlike with rm.)

MFC after: 3 days

4 years agoIn r340411, libufs.so's major number was bumped to 7, but an entry in
dim [Wed, 25 Sep 2019 17:35:34 +0000 (17:35 +0000)]
In r340411, libufs.so's major number was bumped to 7, but an entry in
ObsoleteFiles.inc was not added.  Retroactively fix that.

4 years ago[2/3] Add an initial seal argument to kern_shm_open()
kevans [Wed, 25 Sep 2019 17:35:03 +0000 (17:35 +0000)]
[2/3] Add an initial seal argument to kern_shm_open()

Now that flags may be set on posixshm, add an argument to kern_shm_open()
for the initial seals. To maintain past behavior where callers of
shm_open(2) are guaranteed to not have any seals applied to the fd they're
given, apply F_SEAL_SEAL for existing callers of kern_shm_open. A special
flag could be opened later for shm_open(2) to indicate that sealing should
be allowed.

We currently restrict initial seals to F_SEAL_SEAL. We cannot error out if
F_SEAL_SEAL is re-applied, as this would easily break shm_open() twice to a
shmfd that already existed. A note's been added about the assumptions we've
made here as a hint towards anyone wanting to allow other seals to be
applied at creation.

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D21392

4 years agoUpdate fcntl(2) after r352695
kevans [Wed, 25 Sep 2019 17:33:12 +0000 (17:33 +0000)]
Update fcntl(2) after r352695

4 years ago[1/3] Add mostly Linux-compatible file sealing support
kevans [Wed, 25 Sep 2019 17:32:43 +0000 (17:32 +0000)]
[1/3] Add mostly Linux-compatible file sealing support

File sealing applies protections against certain actions
(currently: write, growth, shrink) at the inode level. New fileops are added
to accommodate seals - EINVAL is returned by fcntl(2) if they are not
implemented.

Reviewed by: markj, kib
Differential Revision: https://reviews.freebsd.org/D21391

4 years agosysent: regenerate after r352693
kevans [Wed, 25 Sep 2019 17:30:28 +0000 (17:30 +0000)]
sysent: regenerate after r352693

4 years agoAdd COMPAT12 support to makesyscalls.sh
kevans [Wed, 25 Sep 2019 17:29:45 +0000 (17:29 +0000)]
Add COMPAT12 support to makesyscalls.sh

Reviewed by: kib, imp, brooks (all without syscalls.master edits)
Differential Revision: https://reviews.freebsd.org/D21366

4 years agobsdgrep(1): various fixes of empty pattern/exit code/-c behavior
kevans [Wed, 25 Sep 2019 17:14:43 +0000 (17:14 +0000)]
bsdgrep(1): various fixes of empty pattern/exit code/-c behavior

When an empty pattern is encountered in the pattern list, I had previously
broken bsdgrep to count that as a "match all" and ignore any other patterns
in the list. This commit rectifies that mistake, among others:

- The -v flag semantics were not quite right; lines matched should have been
  counted differently based on whether the -v flag was set or not. procline
  now definitively returns whether it's matched or not, and interpreting
  that result has been kicked up a level.
- Empty patterns with the -x flag were broken similarly to empty patterns
  with the -w flag. The former is a whole-line match and should be more
  strict, only matching blank lines. No -x and no -w will match the
  empty string at the beginning of each line.
- The exit code with -L was broken, w.r.t. modern grep. Modern grep will
  exit(0) if any file that didn't match was output, so our interpretation
  was simply backwards. The new interpretation makes sense to me.

Tests updated and added to try and catch some of this.

This misbehavior was found by autoconf while fixing ports found in PR 229925
expecting either a more sane or a more GNU-like sed.

MFC after: 1 week
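
The intended semantics for empty patterns, -x and -v can be summarized with a toy model in Python. It treats patterns as fixed strings rather than regular expressions and is not bsdgrep's code; `line_selected` is a hypothetical name. The point it illustrates is that "did any pattern match" is decided first, and -v is applied to that result afterwards.

```python
def line_selected(line, patterns, whole_line=False, invert=False):
    """Toy model of pattern-list matching for fixed-string patterns."""
    def one(pattern):
        if whole_line:           # -x: pattern must equal the entire line
            return line == pattern
        return pattern in line   # an empty pattern is found in every line

    matched = any(one(p) for p in patterns)
    # Decide "matched" first, then let -v flip the interpretation.
    return matched != invert
```

With this model an empty pattern selects every line, an empty pattern with -x selects only blank lines, and other patterns in the list are still consulted.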

4 years agoAdd some counters for per-VM page events.
markj [Wed, 25 Sep 2019 17:08:35 +0000 (17:08 +0000)]
Add some counters for per-VM page events.

For now, just count batched page queue state operations.
vm.stats.page.queue_ops counts the number of batch entries that
successfully completed, while queue_nops counts entries that had no
effect, which occurs when the queue operation had been completed before
the batch entry was processed.

Reviewed by: alc, kib
MFC after: 1 week
Sponsored by: Intel, Netflix
Differential Revision: https://reviews.freebsd.org/D21782

4 years agoremove obsolete i386 MD memchr implementation
emaste [Wed, 25 Sep 2019 16:49:22 +0000 (16:49 +0000)]
remove obsolete i386 MD memchr implementation

bde reports (in a reply to r351700 commit mail):
    This uses scasb, which was last optimal on the 8086, or perhaps the
    original i386.  On freefall, it is several times slower than the
    naive translation of the naive C code.

Reported by: bde
Reviewed by: kib, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21785

4 years agoComplete the removal of the "wire_count" field from struct vm_page.
markj [Wed, 25 Sep 2019 16:11:35 +0000 (16:11 +0000)]
Complete the removal of the "wire_count" field from struct vm_page.

Convert all remaining references to that field to "ref_count" and update
comments accordingly.  No functional change intended.

Reviewed by: alc, kib
Sponsored by: Intel, Netflix
Differential Revision: https://reviews.freebsd.org/D21768

4 years agox86: Fall back to leaf 0x16 if TSC frequency is obtained by CPUID and
kib [Wed, 25 Sep 2019 13:36:56 +0000 (13:36 +0000)]
x86: Fall back to leaf 0x16 if TSC frequency is obtained by CPUID and
leaf 0x15 is not functional.

This should improve automatic TSC frequency determination on
Skylake/Kabylake/... families, where 0x15 exists but does not provide
all necessary information.  SDM contains relatively strong wording
against such uses of 0x16, but Intel does not give us any other way to
obtain the frequency. Linux did the same in the commit
604dc9170f2435d27da5039a3efd757dceadc684.

Based on submission by: Neel Chauhan <neel@neelc.org>
PR: 240475
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21777
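
The fallback order can be written down as plain arithmetic. The sketch below is Python rather than kernel code and takes raw register values as arguments. The register layout (leaf 0x15: EAX = denominator, EBX = numerator, ECX = crystal frequency in Hz; leaf 0x16: EAX low 16 bits = base frequency in MHz) is stated here from the Intel SDM as an assumption, and the function name and tuple shapes are hypothetical.

```python
def tsc_freq_from_cpuid(leaf15, leaf16=None):
    """Estimate the TSC frequency in Hz from raw CPUID register values.

    leaf15: (eax, ebx, ecx) = (denominator, numerator, crystal Hz)
    leaf16: (eax,) with the base frequency in MHz in the low 16 bits, or None
    Returns None when neither leaf gives a usable answer.
    """
    eax, ebx, ecx = leaf15
    if eax and ebx and ecx:
        # Leaf 0x15 is fully populated: crystal * numerator / denominator.
        return ecx * ebx // eax
    if leaf16 is not None:
        base_mhz = leaf16[0] & 0xFFFF
        if base_mhz:
            # Leaf 0x15 is incomplete: fall back to the base frequency.
            return base_mhz * 1_000_000
    return None
```

For example, a 24 MHz crystal with a 250/2 ratio gives 3 GHz from leaf 0x15 alone; if ECX reads as zero, the leaf 0x16 value is used instead.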

4 years agovt: use colors from terminal emulator
tsoome [Wed, 25 Sep 2019 13:24:31 +0000 (13:24 +0000)]
vt: use colors from terminal emulator

Instead of hardcoded colors, use the terminal state. This also means
we need to record the pointer to the terminal state with vtbuf.

4 years agokernel: terminal_init() should check for teken colors from kenv
tsoome [Wed, 25 Sep 2019 13:21:07 +0000 (13:21 +0000)]
kernel: terminal_init() should check for teken colors from kenv

Check for teken.fg_color and teken.bg_color and prepare the color
attributes accordingly.

When white background is used, make it light to improve visibility.
When black background is used, make kernel messages light.

4 years agoRELNOTES: Document r352668 (crontab -n and -q options)
kevans [Wed, 25 Sep 2019 13:04:34 +0000 (13:04 +0000)]
RELNOTES: Document r352668 (crontab -n and -q options)

Suggested by: bapt

4 years agoFix wrong assertion in r352658.
mav [Wed, 25 Sep 2019 11:58:54 +0000 (11:58 +0000)]
Fix wrong assertion in r352658.

MFC after: 1 month

4 years agoSize is unsigned, so remove the test entirely.
imp [Wed, 25 Sep 2019 07:51:30 +0000 (07:51 +0000)]
Size is unsigned, so remove the test entirely.

The kernel won't crash if you have a bad value and I'd rather not have
nvmecontrol know the internal details about how the nvme driver limits
the transfer size.

4 years agoloader: fix indentation in efi_console and vidconsole
tsoome [Wed, 25 Sep 2019 07:36:35 +0000 (07:36 +0000)]
loader: fix indentation in efi_console and vidconsole

Remove extra tab.

Reported by: yuripv

4 years agoloader: add teken.fg_color and teken.bg_color variables
tsoome [Wed, 25 Sep 2019 07:09:25 +0000 (07:09 +0000)]
loader: add teken.fg_color and teken.bg_color variables

Add settable variables to control teken default color attributes.
The supported colors are 0-7 or basic color names:
black, red, green, brown, blue, magenta, cyan, white.

The current implementation does add some duplication which will be addressed
later.

4 years agocron: add log suppression and mail suppression for successful runs
kevans [Wed, 25 Sep 2019 02:37:40 +0000 (02:37 +0000)]
cron: add log suppression and mail suppression for successful runs

This commit adds two new extensions to crontab, ported from OpenBSD:
- -n: suppress mail on successful run
- -q: suppress logging of command execution

The -q option appears decades old, but -n is relatively new. The
original proposal by Job Snijder can be found here [1], and gives very
convincing reasons for inclusion in base.

This patch is a nearly identical port of OpenBSD cron for -q and -n
features. It is written to follow existing conventions and style of the
existing codebase.

Example usage:

# should only send email, but won't show up in log
* * * * * -q date

# should not send email
* * * * * -n date

# should not send email or log
* * * * * -n -q date

# should send email because of ping failure
* * * * * -n -q ping -c 1 5.5.5.5

[1]: https://marc.info/?l=openbsd-tech&m=152874866117948&w=2

PR: 237538
Submitted by: Naveen Nathan <freebsd_t.lastninja.net>
Reviewed by: bcr (manpages)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20046

4 years agopowerpc/atomic: Follow recommendations on atomic primitive comparisons
jhibbits [Wed, 25 Sep 2019 01:39:58 +0000 (01:39 +0000)]
powerpc/atomic: Follow recommendations on atomic primitive comparisons

Both IBM and Freescale programming examples presume the cmpset operands will
favor equal, and pessimize the non-equal case instead.  Do the same for
atomic_cmpset_* and atomic_fcmpset_*.  This slightly pessimizes the failure
case, in favor of the success case.

MFC after: 3 weeks

4 years agopowerpc: Allocate DPCPU block from domain-local memory
jhibbits [Wed, 25 Sep 2019 01:23:08 +0000 (01:23 +0000)]
powerpc: Allocate DPCPU block from domain-local memory

This should improve NUMA scalability a little, by binding to the CPU's NUMA
domain.  This matches what's done on amd64.

4 years agoAfter my comnd changes, the number of threads and size weren't set. In
imp [Wed, 25 Sep 2019 00:24:57 +0000 (00:24 +0000)]
After my comnd changes, the number of threads and size weren't set. In
addition, the flags are optional, but were made to be mandatory. Set
these things, as well as sanity check the specified size.

Submitted by: Stefan Rink
PR: 240798

4 years agoReplace all mtx_lock()/mtx_unlock() on the iod lock with macros.
rmacklem [Tue, 24 Sep 2019 23:38:10 +0000 (23:38 +0000)]
Replace all mtx_lock()/mtx_unlock() on the iod lock with macros.

Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called and the iod lock is held when the NFS node lock
is acquired, the iod mutex will need to be changed to an sx lock as well.
To simplify the future commit that changes both the NFS node lock and iod lock
to sx locks, this commit replaces all mtx_lock()/mtx_unlock() calls on the
iod lock with macros.
There is no semantic change as a result of this commit.

I don't know when the future commit will happen and be MFC'd, so I have
set the MFC on this commit to one week so that it can be MFC'd at the same
time.

Suggested by: kib
MFC after: 1 week

4 years agoFix white spaces.
jkim [Tue, 24 Sep 2019 21:41:19 +0000 (21:41 +0000)]
Fix white spaces.

4 years agofreebsd-update: Add `updatesready' and `showconfig' commands
grembo [Tue, 24 Sep 2019 20:49:33 +0000 (20:49 +0000)]
freebsd-update: Add `updatesready' and `showconfig' commands

`freebsd-update updatesready' can be used to check if there are any pending
fetched updates that can be installed.

`freebsd-update showconfig' writes freebsd-update's configuration to
stdout.

This also changes the exit code of `freebsd-update install' to 2 in case
there are no updates pending to be installed and there wasn't a fetch phase
in the same invocation. This allows scripts to tell apart these error
conditions without breaking existing jail managers.

See freebsd-update(8) for details.

PR: 240757, 240177, 229346
Reviewed by: manpages (bcr), secteam (emaste), yuripv
Differential Revision: https://reviews.freebsd.org/D21473

4 years agolets put (void) in a couple of functions to keep older platforms that
rrs [Tue, 24 Sep 2019 20:36:43 +0000 (20:36 +0000)]
lets put (void) in a couple of functions to keep older platforms that
are stuck with gcc happy (ppc). The changes are needed in both bbr and
rack.

Obtained from: Michael Tuexen (mtuexen@)

4 years agodon't call in_ratelmit detach when RATELIMIT is not
rrs [Tue, 24 Sep 2019 20:11:55 +0000 (20:11 +0000)]
don't call in_ratelmit detach when RATELIMIT is not
compiled in the kernel.

4 years agoFix the ifdefs in tcp_ratelimit.h. They were reversed so
rrs [Tue, 24 Sep 2019 20:04:31 +0000 (20:04 +0000)]
Fix the ifdefs in tcp_ratelimit.h. They were reversed so
that instead of functions only being inside the _KERNEL and
the absence of RATELIMIT causing us to have NULL/error returning
interfaces we ended up with non-kernel getting the error path.
Oops.

4 years agoFix/improve interrupt threads scheduling.
mav [Tue, 24 Sep 2019 20:01:20 +0000 (20:01 +0000)]
Fix/improve interrupt threads scheduling.

While doing some tests with very high interrupt rates, I noticed that one of
the conditions I added in r232207 to make interrupt threads run on the local
CPU in most cases never worked as expected (it worked only if the thread had
previously executed on some other CPU, which is quite the opposite).  It caused
additional CPU usage to run a full CPU search and could schedule interrupt
threads to some other CPU.

This patch removes that code and instead reuses the existing non-interrupt
code path with some tweaks for the interrupt case:
 - On SMT systems, if the current thread is idle, don't look at other threads.
Even if they are busy, it may take more time to do a full search and bounce
the interrupt thread to another core than to execute it locally, even sharing
CPU resources.  It is the other threads that should migrate, not bound interrupts.
 - Try hard to keep interrupt threads within the LLC of their original CPU.
This improves scheduling cost and supposedly cache and memory locality.

On a test system with 72 threads doing 2.2M IOPS to NVMe this saves a few
percent of CPU time while adding a few percent to IOPS.

MFC after: 1 month
Sponsored by: iXsystems, Inc.

4 years agoThis commit adds BBR (Bottleneck Bandwidth and RTT) congestion control. This
rrs [Tue, 24 Sep 2019 18:18:11 +0000 (18:18 +0000)]
This commit adds BBR (Bottleneck Bandwidth and RTT) congestion control. This
is a completely separate TCP stack (tcp_bbr.ko) that will be built only if
you add the make options WITH_EXTRA_TCP_STACKS=1 and also include the option
TCPHPTS. You can also include the RATELIMIT option if you have a NIC interface that
supports hardware pacing; BBR understands how to use such a feature.

Note that this commit also adds a general-purpose time filter, which
can act as a min-filter or a max-filter. A filter lets you
hold a low (or high) value for some period of time and degrade slowly
to another value as time passes. You can find out the details of
BBR by looking at the original paper at:

https://queue.acm.org/detail.cfm?id=3022184

or consult the many other web resources that can be found by searching
for "BBR congestion control". It should be noted that
BBRv1 (which this is) does tend to unfairness in cases of small
buffered paths, and it will usually get less bandwidth in the case
of large BDP paths (when competing with new-reno or cubic flows). BBR
is still an active research area and we do plan on implementing V2
of BBR to see if it is an improvement over V1.

Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D21582
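
The behaviour of a time-windowed min-filter can be sketched in a few lines of Python. This is a generic monotonic-deque version written for illustration, not the filter added by this commit; the class name and the unitless timestamps are assumptions.

```python
from collections import deque


class WindowedMinFilter:
    """Track the minimum sample seen within the last `window` time units."""

    def __init__(self, window):
        self.window = window
        self.samples = deque()  # (time, value), values strictly increasing

    def update(self, now, value):
        # Drop older samples that can never be the minimum again.
        while self.samples and self.samples[-1][1] >= value:
            self.samples.pop()
        self.samples.append((now, value))
        # Expire samples that have aged out of the window.
        while self.samples[0][0] <= now - self.window:
            self.samples.popleft()
        return self.samples[0][1]
```

A low sample is held as the minimum until it ages out of the window, after which the reported value moves to the next-lowest recent sample. A max-filter is the same structure with the comparison reversed.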

4 years agoix, ixv: Read msix_bar from device configuration
erj [Tue, 24 Sep 2019 17:06:32 +0000 (17:06 +0000)]
ix, ixv: Read msix_bar from device configuration

Rather than predicting the MSI-X BAR index based on the device's MAC
type, read it from the device's PCI configuration.

PR: 239704
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by: erj@
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21547

4 years agoiflib: Remove redundant VLAN events deregistration
erj [Tue, 24 Sep 2019 17:03:31 +0000 (17:03 +0000)]
iflib: Remove redundant VLAN events deregistration

From Piotr:
r351152 introduced the iflib_deregister() function, which calls
EVENTHANDLER_DEREGISTER() to unregister VLAN events. This patch removes
the duplicate EVENTHANDLER_DEREGISTER() calls placed in
iflib_device_deregister(), as this function now calls
iflib_deregister(). This avoids deregistering the same event twice.

This patch also adds a check in iflib_vlan_register() to prevent
registering a VLAN during detach.

Patch co-authored by Krzysztof Galazka <krzysztof.galazka@intel.com>,
erj <erj@FreeBSD.org> and Jacob Keller <jacob.e.keller@intel.com>.

Signed-off-by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by: gallatin@, erj@
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21711

4 years agoFix a minor typo
olivier [Tue, 24 Sep 2019 16:49:42 +0000 (16:49 +0000)]
Fix a minor typo

Approved by: lwhsu
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19970

4 years agoFix coredump_phnum_test in case of kern.compress_user_cores=1
olivier [Tue, 24 Sep 2019 16:45:34 +0000 (16:45 +0000)]
Fix coredump_phnum_test in case of kern.compress_user_cores=1

PR: 240783
Approved by: ngie, lwhsu
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21776

4 years agoPlumb a memory leak.
tuexen [Tue, 24 Sep 2019 13:15:24 +0000 (13:15 +0000)]
Plumb a memory leak.
Thanks to Felix Weinrank for finding this issue using fuzz testing
and reporting it for the userland stack:
https://github.com/sctplab/usrsctp/issues/378

MFC after: 3 days

4 years agolib/libc/regex: fix build with REDEBUG defined
yuripv [Tue, 24 Sep 2019 12:21:01 +0000 (12:21 +0000)]
lib/libc/regex: fix build with REDEBUG defined

Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D21760

4 years agoReplace all mtx_lock()/mtx_unlock() on n_mtx with the macros.
rmacklem [Tue, 24 Sep 2019 01:58:54 +0000 (01:58 +0000)]
Replace all mtx_lock()/mtx_unlock() on n_mtx with the macros.

For a long time, some places in the NFS code have locked/unlocked the
NFS node lock with the macros NFSLOCKNODE()/NFSUNLOCKNODE() whereas
others have simply used mtx_lock()/mtx_unlock().
Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called, replace all occurrences of mtx_lock/mtx_unlock
with the macros to simplify making the change to an sx lock in a future commit.
There is no semantic change as a result of this commit.

I am not sure if the change to an sx lock will be MFC'd soon, so I put
an MFC of 1 week on this commit so that it could be MFC'd with that commit.

Suggested by: kib
MFC after: 1 week

4 years agoClean LINT* kernel configurations for arm*
lwhsu [Tue, 24 Sep 2019 01:56:27 +0000 (01:56 +0000)]
Clean LINT* kernel configurations for arm*

MFC after: 3 days
Sponsored by: The FreeBSD Foundation

4 years agoping6: Use caph_rights_limit(3) for STDIN_FILENO
markj [Mon, 23 Sep 2019 22:20:11 +0000 (22:20 +0000)]
ping6: Use caph_rights_limit(3) for STDIN_FILENO

Update some error messages while here.

Reported by: olivier
MFC after: 3 days

4 years agocache: tidy up handling of negative entries
mjg [Mon, 23 Sep 2019 20:50:04 +0000 (20:50 +0000)]
cache: tidy up handling of negative entries

- track the total count of hot entries
- pre-read the lock when shrinking since it is typically already taken
- place the lock in its own cacheline
- shorten the hold time of hot lock list when zapping

Sponsored by: The FreeBSD Foundation

4 years agoMake nvme(4) driver some more NUMA aware.
mav [Mon, 23 Sep 2019 17:53:47 +0000 (17:53 +0000)]
Make nvme(4) driver some more NUMA aware.

 - For each queue pair precalculate CPU and domain it is bound to.
If queue pairs are not per-CPU, then use the domain of the device.
 - Allocate most of queue pair memory from the domain it is bound to.
 - Bind callouts to the same CPUs as queue pair to avoid migrations.
 - Do not assign queue pairs to each SMT thread.  It just wasted
resources and increased lock congestion.
 - Remove fixed multiplier of CPUs per queue pair, spread them evenly.
This allows using more queue pairs in some hardware configurations.
 - If queue pair serves multiple CPUs, bind different NVMe devices to
different CPUs.

MFC after: 1 month
Sponsored by: iXsystems, Inc.
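
One way to spread CPUs evenly over queue pairs without a fixed multiplier is a proportional mapping. The Python sketch below shows the arithmetic only; it is an assumption about how such a spread can be computed, not a copy of the driver's code, and the function names are hypothetical.

```python
def queue_for_cpu(cpu, ncpus, nqueues):
    """Map a CPU index to a queue pair so CPUs are spread evenly."""
    if not 0 <= cpu < ncpus or nqueues < 1:
        raise ValueError("cpu out of range or no queues")
    return cpu * nqueues // ncpus


def cpus_per_queue(ncpus, nqueues):
    """Count how many CPUs land on each queue pair."""
    counts = [0] * nqueues
    for cpu in range(ncpus):
        counts[queue_for_cpu(cpu, ncpus, nqueues)] += 1
    return counts
```

When the queue count does not divide the CPU count, the per-queue counts differ by at most one, so every queue pair the hardware offers can be put to use.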

4 years agoImplement x86 dtrace_invop_(un)init() in C.
markj [Mon, 23 Sep 2019 15:08:17 +0000 (15:08 +0000)]
Implement x86 dtrace_invop_(un)init() in C.

There is no reason for these routines to be written in assembly.  In
the ports of DTrace to other platforms, they are already written in C.
No functional change intended.

MFC after: 1 week
Sponsored by: Netflix

4 years agoFix a harmless typo.
markj [Mon, 23 Sep 2019 14:34:23 +0000 (14:34 +0000)]
Fix a harmless typo.

MFC after: 1 week

4 years agoRevert r316820.
markj [Mon, 23 Sep 2019 14:29:05 +0000 (14:29 +0000)]
Revert r316820.

Despite appearing correct, r316820 breaks packet rx/tx for jme(4)
interfaces.  With 12.1 approaching, let's just revert the commit for now.

PR: 233952
Tested by: Armin Gruner <ag-freebsd@muc.de>
MFC after: 3 days

4 years agoSet NX on some non-leaf direct map page table entries.
markj [Mon, 23 Sep 2019 14:19:41 +0000 (14:19 +0000)]
Set NX on some non-leaf direct map page table entries.

The direct map is never used for execution of code, so we might as well
set NX in the direct map's PML4Es.  Also clarify the intent of the code
in create_pagetables() that restricts access protections on the region
of the direct map mapping the kernel text.

Reviewed by: alc, kib (previous version)
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21759

4 years agoUse elf_relocaddr() when handling R_X86_64_RELATIVE relocations.
markj [Mon, 23 Sep 2019 14:14:43 +0000 (14:14 +0000)]
Use elf_relocaddr() when handling R_X86_64_RELATIVE relocations.

This is required for DPCPU and VNET data variable definitions to work when
KLDs are linked as DSOs.  R_X86_64_RELATIVE relocations should not appear
in object files, so assert this in elf_relocaddr().

Reviewed by: kib
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21755

4 years agoSet NX in mappings created by pmap_kenter() and pmap_kenter_attr().
markj [Mon, 23 Sep 2019 14:11:59 +0000 (14:11 +0000)]
Set NX in mappings created by pmap_kenter() and pmap_kenter_attr().

There does not appear to be any existing need for such mappings to be
executable.

Reviewed by: alc, kib
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21754

4 years agoFix destruction of the robust mutexes.
kib [Mon, 23 Sep 2019 13:24:31 +0000 (13:24 +0000)]
Fix destruction of the robust mutexes.

If a robust mutex's owner terminated, causing kernel-assisted state
recovery, and pthread_mutex_destroy() is then executed as the next
action, an assertion is triggered about the mutex still being on the list.
Ignore the mutex linkage in pthread_mutex_destroy() for shared robust
mutexes with a dead owner, same as for enqueue_mutex().

Reported by: avg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

4 years agomips: fix XLPN32 after r352434
kevans [Mon, 23 Sep 2019 12:43:08 +0000 (12:43 +0000)]
mips: fix XLPN32 after r352434

SYSINIT usage was added, but the <sys/kernel.h> dependency was not added.
This worked by coincidence, as most of the mips configs have DDB enabled and
pmap.c gets <sys/kernel.h> via ddb.h pollution.

Reported by: dim

4 years agoCreate a "drm" subdirectory for drm devices in linsysfs. Recent versions of
tijl [Mon, 23 Sep 2019 12:27:55 +0000 (12:27 +0000)]
Create a "drm" subdirectory for drm devices in linsysfs.  Recent versions of
linux libdrm check for the existence of this directory:

https://cgit.freedesktop.org/mesa/drm/commit/?id=f8392583418aef5e27bfed9989aeb601e20cc96d

MFC after: 2 weeks

4 years agocache: count evictions of negative entries
mjg [Mon, 23 Sep 2019 08:53:14 +0000 (08:53 +0000)]
cache: count evictions of negative entries

Sponsored by: The FreeBSD Foundation

4 years agoAdd two options to allow mount to avoid covering up existing mount points.
sef [Mon, 23 Sep 2019 04:28:07 +0000 (04:28 +0000)]
Add two options to allow mount to avoid covering up existing mount points.
The two options are

* nocover/cover:  Prevent/allow mounting over an existing root mountpoint.
E.g., "mount -t ufs -o nocover /dev/sd1a /usr/local" will fail if /usr/local
is already a mountpoint.
* emptydir/noemptydir:  Prevent/allow mounting on a non-empty directory.
E.g., "mount -t ufs -o emptydir /dev/sd1a /usr" will fail.

Neither of these options is intended to be a default, for historical and
compatibility reasons.

Reviewed by: allanjude, kib
Differential Revision: https://reviews.freebsd.org/D21458
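
The intent of the two options can be modelled in userland roughly as below. This is only a sketch of the semantics described in the commit message, not the kernel implementation; check_mount_target() and its error messages are hypothetical, and it relies on os.path.ismount() and os.scandir() from the Python standard library.

```python
import os


def check_mount_target(path, nocover=False, emptydir=False):
    """Raise OSError if mounting on path would violate the options.

    Hypothetical userland model of the checks described above.
    """
    # nocover: refuse to mount over something that is already a mount point.
    if nocover and os.path.ismount(path):
        raise OSError(f"{path}: already a mount point (nocover)")
    # emptydir: refuse to mount on a directory that has any entries.
    # os.scandir() never yields "." or "..".
    if emptydir:
        with os.scandir(path) as entries:
            if next(entries, None) is not None:
                raise OSError(f"{path}: directory not empty (emptydir)")


if __name__ == "__main__":
    try:
        check_mount_target("/", nocover=True)
    except OSError as err:
        print(err)
```

With neither flag set the function accepts any target, which mirrors the statement that neither option is intended to be a default.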

4 years agocache: try to avoid vhold if locks held
mjg [Sun, 22 Sep 2019 20:50:24 +0000 (20:50 +0000)]
cache: try to avoid vhold if locks held

Sponsored by: The FreeBSD Foundation

4 years agocache: jump in negative success instead of positive
mjg [Sun, 22 Sep 2019 20:49:17 +0000 (20:49 +0000)]
cache: jump in negative success instead of positive

Sponsored by: The FreeBSD Foundation

4 years agolockprof: move per-cpu data to dpcpu
mjg [Sun, 22 Sep 2019 20:44:24 +0000 (20:44 +0000)]
lockprof: move per-cpu data to dpcpu

Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21747

4 years agoi386: reduce differences in source between PAE and non-PAE pmaps ...
kib [Sun, 22 Sep 2019 19:59:10 +0000 (19:59 +0000)]
i386: reduce differences in source between PAE and non-PAE pmaps ...

by defining pg_nx as zero for non-PAE and correspondingly simplifying
some expressions.

Suggested and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21757
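
Why a zero pg_nx simplifies expressions can be shown with a small model. The sketch below is an illustration under stated assumptions, not the i386 pmap code: make_pte() is a hypothetical helper, and the bit values used are the usual x86 valid (bit 0), writable (bit 1) and PAE no-execute (bit 63) bits.

```python
PG_V = 0x001          # entry is valid
PG_RW = 0x002         # entry is writable
PG_NX_PAE = 1 << 63   # no-execute bit, only present in PAE entries


def make_pte(phys_addr, writable, executable, pg_nx):
    """Build a page table entry value (hypothetical helper).

    pg_nx is PG_NX_PAE for a PAE pmap and 0 for a non-PAE pmap, so the
    same expression works for both without a PAE-specific branch.
    """
    pte = phys_addr | PG_V
    if writable:
        pte |= PG_RW
    if not executable:
        pte |= pg_nx  # OR with 0 is a no-op on non-PAE
    return pte


if __name__ == "__main__":
    print(hex(make_pte(0x1000, True, False, 0)))
    print(hex(make_pte(0x1000, True, False, PG_NX_PAE)))
```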

4 years agoi386: implement sysctl vm.pmap.kernel_maps.
kib [Sun, 22 Sep 2019 19:23:00 +0000 (19:23 +0000)]
i386: implement sysctl vm.pmap.kernel_maps.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21739

4 years agoamd64: minor tweaks to pat decoding in sysctl vm.pmap.kernel_maps.
kib [Sun, 22 Sep 2019 19:20:37 +0000 (19:20 +0000)]
amd64: minor tweaks to pat decoding in sysctl vm.pmap.kernel_maps.

Decode PAT_UNCACHED.
When an unknown PAT mode is encountered, print the PTE bit combination
instead of the index, which is always 8.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21738

4 years agoocteon-sdk: suppress another set of warnings under clang
kevans [Sun, 22 Sep 2019 18:32:05 +0000 (18:32 +0000)]
octeon-sdk: suppress another set of warnings under clang

Clang sees this construct and warns that adding an int to a string like this
does not concatenate the two. Fortunately, this is not what octeon-sdk
actually intended to do, so we take the path towards remediation that clang
offers: use array indexing instead.

4 years agoocteon1: suppress a couple of warnings under clang
kevans [Sun, 22 Sep 2019 18:30:19 +0000 (18:30 +0000)]
octeon1: suppress a couple of warnings under clang

These appear in octeon-sdk -- there are new releases, but they don't seem to
address the running issues in octeon-sdk. GCC 4.2 is more than happy, but
clang is much less so. Most of the warnings are fairly innocuous and perhaps
a by-product of their style guide, which may make some of the changes harder
to upstream (if this is even possible anymore).

4 years agoHonor CWARNFLAGS.clang/gcc in the kernel build
kevans [Sun, 22 Sep 2019 18:27:57 +0000 (18:27 +0000)]
Honor CWARNFLAGS.clang/gcc in the kernel build

Some kernel builds or users may want to disable warnings on a per-compiler
basis, so do this now.

4 years agoloader_lua: lua color changes should end with reset
tsoome [Sun, 22 Sep 2019 17:39:20 +0000 (17:39 +0000)]
loader_lua: lua color changes should end with reset

The color change should end with a reset sequence, not a switch to white.
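
The difference between resetting and switching to white can be seen in the escape sequences themselves. The sketch below is not the loader's Lua code; colorize() is a hypothetical helper that uses the standard ANSI SGR sequences, where "ESC[3Nm" selects foreground color N and "ESC[0m" restores the terminal's default attributes.

```python
ESC = "\x1b"
RESET = ESC + "[0m"   # restore default attributes
WHITE = ESC + "[37m"  # force a white foreground instead


def colorize(text, color):
    """Wrap text in a foreground color (0-7) and end with a reset.

    Hypothetical helper: ending with RESET rather than WHITE leaves the
    terminal in whatever default color the user configured.
    """
    if not 0 <= color <= 7:
        raise ValueError("color must be in the range 0-7")
    return f"{ESC}[3{color}m{text}{RESET}"


if __name__ == "__main__":
    print(colorize("kernel", 1))
```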

4 years agoloader_4th: menu items need to reset color attribute, not switch to white
tsoome [Sun, 22 Sep 2019 16:10:25 +0000 (16:10 +0000)]
loader_4th: menu items need to reset color attribute, not switch to white

Forth menu kernel and BE entries, instead of resetting the color attribute,
are switching to white color.

4 years agoAdd support for ps -H on corefiles in libkvm
karels [Sun, 22 Sep 2019 13:56:27 +0000 (13:56 +0000)]
Add support for ps -H on corefiles in libkvm

Add support for kernel threads in kvm_getprocs() and the underlying
kvm_proclist() in libkvm when fetching from a kernel core file. This
has been missing and needed for several releases, ever since kernel
threads became normal threads.  The loop over the processes now contains a sub-loop for
threads, which iterates beyond the first thread only when threads are
requested.  Also set some fields such as tid that were previously
uninitialized.

Reviewed by: vangyzen, jhb (earlier revision)
MFC after: 4 days
Sponsored by: Forcepoint LLC
Differential Revision: https://reviews.freebsd.org/D21461
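
The loop structure described above can be sketched like this. It is a simplified model, not the libkvm code: processes are plain dictionaries, and list_entries() and its field names are hypothetical.

```python
def list_entries(processes, want_threads=False):
    """Return one entry per process, or one per thread if requested.

    Hypothetical model: the inner loop always visits the first thread
    and continues past it only when threads were asked for.
    """
    entries = []
    for proc in processes:
        for tid in proc["tids"]:
            # Fill in tid for every entry, including the first thread.
            entries.append({"pid": proc["pid"], "tid": tid})
            if not want_threads:
                break
    return entries


if __name__ == "__main__":
    procs = [{"pid": 0, "tids": [100000, 100001]}, {"pid": 1, "tids": [100002]}]
    print(list_entries(procs))
    print(list_entries(procs, want_threads=True))
```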

4 years agoDon't hold the info lock when calling sctp_select_a_tag().
tuexen [Sun, 22 Sep 2019 11:11:01 +0000 (11:11 +0000)]
Don't hold the info lock when calling sctp_select_a_tag().

This avoids a double lock bug in the NAT colliding state processing
of SCTP. Thanks to Felix Weinrank for finding and reporting this issue in
https://github.com/sctplab/usrsctp/issues/374
He found this bug using fuzz testing.

MFC after: 3 days

4 years agoCleanup the RTO calculation and perform some consistency checks
tuexen [Sun, 22 Sep 2019 10:40:15 +0000 (10:40 +0000)]
Cleanup the RTO calculation and perform some consistency checks
before computing the RTO.
This should fix an overflow issue reported by Felix Weinrank in
https://github.com/sctplab/usrsctp/issues/375
for the userland stack and found by running a fuzz tester.

MFC after: 3 days

4 years agoMFZoL: Retire send space estimation via ZFS_IOC_SEND
avg [Sun, 22 Sep 2019 08:44:41 +0000 (08:44 +0000)]
MFZoL: Retire send space estimation via ZFS_IOC_SEND

Add a small wrapper around libzfs_core's lzc_send_space() to libzfs so
that every legacy ZFS_IOC_SEND consumer, along with their userland
counterpart estimate_ioctl(), can leverage ZFS_IOC_SEND_SPACE to
request send space estimation.

The legacy functionality in zfs_ioc_send() is left untouched for
compatibility purposes.

Obtained from: ZoL
Obtained from: zfsonlinux/zfs@cf7684bc8d57
Author: loli10K <ezomori.nozomu@gmail.com>
MFC after: 2 weeks

4 years agoprint summary line for space estimate of zfs send from bookmark
avg [Sun, 22 Sep 2019 08:34:23 +0000 (08:34 +0000)]
print summary line for space estimate of zfs send from bookmark

Although there is always a single stream and the total size in the
summary is always equal to the size reported for the stream, it's nice
to follow the usual output format.

MFC after: 3 days

4 years agokern.elf{32,64}.pie_base sysctl: enforce page alignment.
kib [Sat, 21 Sep 2019 20:03:17 +0000 (20:03 +0000)]
kern.elf{32,64}.pie_base sysctl: enforce page alignment.

Requested by: rstone
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
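
A page alignment check of this kind can be sketched as follows. The commit message does not say whether a misaligned value is rejected or rounded, so this example simply rejects it; validate_pie_base() is a hypothetical name and the 4096-byte page size is an assumption.

```python
PAGE_SIZE = 4096  # assumed page size for the example


def validate_pie_base(value):
    """Return value if it is page aligned, otherwise raise ValueError."""
    # For a power-of-two page size, aligned addresses have all low bits clear.
    if value & (PAGE_SIZE - 1):
        raise ValueError(f"{value:#x} is not a multiple of {PAGE_SIZE:#x}")
    return value


if __name__ == "__main__":
    print(hex(validate_pie_base(0x1000000)))
```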

4 years agoIn case a translation fault on the kernel address space occurs from
alc [Sat, 21 Sep 2019 19:51:57 +0000 (19:51 +0000)]
In case a translation fault on the kernel address space occurs from
within a critical section, we must perform a lock-free check on the
faulting address.

Reported by: andrew
Reviewed by: andrew, markj
X-MFC with: r350579
Differential Revision: https://reviews.freebsd.org/D21685

4 years agolockprof: use CPUFOREACH and drop always false lp_cpu NULL checks
mjg [Sat, 21 Sep 2019 19:05:38 +0000 (19:05 +0000)]
lockprof: use CPUFOREACH and drop always false lp_cpu NULL checks

Sponsored by: The FreeBSD Foundation

4 years agoMake non-ASLR pie base tunable.
kib [Sat, 21 Sep 2019 18:00:23 +0000 (18:00 +0000)]
Make non-ASLR pie base tunable.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week

4 years agoamd64 pmap: Fix formats for 64bit addresses in ddb and sysctl output.
kib [Sat, 21 Sep 2019 17:59:15 +0000 (17:59 +0000)]
amd64 pmap: Fix formats for 64bit addresses in ddb and sysctl output.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21737

4 years agoFix a regression introduced in r344601, and work properly with the
sef [Sat, 21 Sep 2019 17:54:42 +0000 (17:54 +0000)]
Fix a regression introduced in r344601, and work properly with the
-v and -n options.

PR: 240640
Reported by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: avg
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D21709

4 years agoAllocate callout wheel from the respective memory domain.
mav [Sat, 21 Sep 2019 15:38:08 +0000 (15:38 +0000)]
Allocate callout wheel from the respective memory domain.

MFC after: 1 week

4 years agojot.1: Explain default argument values more precisely
0mp [Sat, 21 Sep 2019 15:01:11 +0000 (15:01 +0000)]
jot.1: Explain default argument values more precisely

The way jot(1) defaults missing arguments doesn't match the behaviour
described in the manpage, which states that with fewer than 3 arguments
missing values are supplied from left to right.

In fact, with one or two arguments, the last (s which is step size or seed)
defaults to 1 (or -1 if begin and end specify a descending range), and then
omitted arguments are set to default starting with the leftmost until three
arguments are available.

This is why `jot 2 1000` prints 1000 and 1001 instead of 1000 and 100.

PR: 135475
Submitted by: Jonathan McKeown <j.mckeown@ru.ac.za>
Approved by: doc (bcr)
Differential Revision: https://reviews.freebsd.org/D21736
Event: EuroBSDcon 2019
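
The defaulting rule described above can be modelled as below. This is a sketch of the commit message's description for integer sequences only, not jot's source: the jot() function is hypothetical, it covers only the one- and two-argument cases, and the default values (100 repetitions, begin 1, end 100) are assumptions based on the usual jot(1) defaults.

```python
DEFAULTS = {"reps": 100, "begin": 1, "end": 100}  # assumed jot(1) defaults


def jot(reps=None, begin=None, end=None):
    """Model jot's defaults when one or two of reps/begin/end are given."""
    args = {"reps": reps, "begin": begin, "end": end}
    supplied = sum(v is not None for v in args.values())
    if supplied not in (1, 2):
        raise ValueError("this sketch handles one or two arguments only")
    # The step defaults first: -1 for a descending range, otherwise 1.
    descending = begin is not None and end is not None and begin > end
    step = -1 if descending else 1
    known = supplied + 1  # the step now counts as a known value
    # Then omitted arguments get defaults, leftmost first, until three
    # of the four values are known.
    for name in ("reps", "begin", "end"):
        if known == 3:
            break
        if args[name] is None:
            args[name] = DEFAULTS[name]
            known += 1
    # The single remaining value is computed from the other three.
    if args["end"] is None:
        args["end"] = args["begin"] + (args["reps"] - 1) * step
    elif args["begin"] is None:
        args["begin"] = args["end"] - (args["reps"] - 1) * step
    else:
        args["reps"] = (args["end"] - args["begin"]) // step + 1
    return [args["begin"] + i * step for i in range(args["reps"])]


if __name__ == "__main__":
    print(jot(2, 1000))
```

In this model, jot(2, 1000) never consults the default end of 100, because reps, begin and the step already make three known values.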

4 years agoascii(7): Add STANDARDS section and update HISTORY section
0mp [Sat, 21 Sep 2019 14:16:37 +0000 (14:16 +0000)]
ascii(7): Add STANDARDS section and update HISTORY section

PR: 240727
Submitted by: Gordon Bergling <gbergling@gmail.com>
Approved by: src (imp)
Event: EuroBSDcon 2019

4 years ago- Revert WARNS to 2 because of mismatch between (xdrproc_t) and xdr_void().
hrs [Sat, 21 Sep 2019 13:34:06 +0000 (13:34 +0000)]
- Revert WARNS to 2 because of mismatch between (xdrproc_t) and xdr_void().
- Add prototype of from_addr().

4 years agoFix warnings and set WARNS=6.
hrs [Sat, 21 Sep 2019 12:33:41 +0000 (12:33 +0000)]
Fix warnings and set WARNS=6.