bsdgrep(1): various fixes of empty pattern/exit code/-c behavior
When an empty pattern is encountered in the pattern list, I had previously
broken bsdgrep to count that as a "match all" and ignore any other patterns
in the list. This commit rectifies that mistake, among others:
- The -v flag semantics were not quite right; lines matched should have been
counted differently based on whether the -v flag was set or not. procline
now definitively returns whether it's matched or not, and interpreting
that result has been kicked up a level.
- Empty patterns with the -x flag were broken similarly to empty patterns
with the -w flag. The former is a whole-line match and should be more
strict, only matching blank lines. No -x and no -w will match the
empty string at the beginning of each line.
- The exit code with -L was broken, w.r.t. modern grep. Modern grep will
exit(0) if any file that didn't match was output, so our interpretation
was simply backwards. The new interpretation makes sense to me.
Tests updated and added to try and catch some of this.
This misbehavior was found by autoconf while fixing ports found in PR 229925
expecting either a more sane or a more GNU-like sed.
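The empty-pattern, -x, -v and -L rules described above can be sketched
roughly as follows. This is a simplified model and not bsdgrep's code:
patterns are treated as fixed strings, -w is left out, and the names
line_matches, line_selected and exit_code_for_L are made up for
illustration.

```python
def line_matches(line, patterns, whole_line=False):
    """Report whether any pattern matches; the caller applies -v."""
    for pat in patterns:
        if whole_line:
            # -x: the pattern must cover the whole line, so an empty
            # pattern only matches a blank line.
            if line == pat:
                return True
        elif pat in line:
            # No -x: an empty pattern matches the empty string at the
            # start of every line ("" in line is always true).
            return True
    return False


def line_selected(line, patterns, whole_line=False, invert=False):
    """Interpret the match result one level up, as -v requires."""
    return line_matches(line, patterns, whole_line) != invert


def exit_code_for_L(files_listed):
    """-L: exit 0 if at least one non-matching file was printed."""
    return 0 if files_listed else 1
```

Note that an empty pattern in the list no longer hides the other
patterns: with -x, the list ["", "abc"] still matches the line "abc"
through its second entry.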
Mark Johnston [Wed, 25 Sep 2019 17:08:35 +0000 (17:08 +0000)]
Add some counters for per-VM page events.
For now, just count batched page queue state operations.
vm.stats.page.queue_ops counts the number of batch entries that
successfully completed, while queue_nops counts entries that had no
effect, which occurs when the queue operation had been completed before
the batch entry was processed.
Ed Maste [Wed, 25 Sep 2019 16:49:22 +0000 (16:49 +0000)]
remove obsolete i386 MD memchr implementation
bde reports (in a reply to r351700 commit mail):
This uses scasb, which was last optimal on the 8086, or perhaps the
original i386. On freefall, it is several times slower than the
naive translation of the naive C code.
Reported by: bde
Reviewed by: kib, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21785
x86: Fall back to leaf 0x16 if TSC frequency is obtained by CPUID and
leaf 0x15 is not functional.
This should improve automatic TSC frequency determination on
Skylake/Kabylake/... families, where 0x15 exists but does not provide
all necessary information. SDM contains relatively strong wording
against such uses of 0x16, but Intel does not give us any other way to
obtain the frequency. Linux did the same in the commit 604dc9170f2435d27da5039a3efd757dceadc684.
Based on submission by: Neel Chauhan <neel@neelc.org>
PR: 240475
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21777
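A minimal sketch of the fallback, assuming the register layout
documented in the SDM (leaf 0x15: EAX = ratio denominator, EBX = ratio
numerator, ECX = crystal frequency in Hz; leaf 0x16: EAX = base
frequency in MHz). The function name and the way register values are
passed in are illustrative only and do not mirror the kernel code.

```python
def tsc_freq_from_cpuid(leaf15, leaf16_eax=0):
    """Return the TSC frequency in Hz, or None if it cannot be derived.

    leaf15 is the (EAX, EBX, ECX) triple read from CPUID leaf 0x15.
    leaf16_eax is EAX read from CPUID leaf 0x16 (0 if unavailable).
    """
    denominator, numerator, crystal_hz = leaf15
    if denominator != 0 and numerator != 0 and crystal_hz != 0:
        # Leaf 0x15 is fully populated: crystal clock times the ratio.
        return crystal_hz * numerator // denominator
    # Leaf 0x15 exists but lacks some of the needed information:
    # fall back to the base frequency (bits 15:0, in MHz) of leaf 0x16.
    base_mhz = leaf16_eax & 0xFFFF
    if base_mhz != 0:
        return base_mhz * 1_000_000
    return None
```

For example, a 24 MHz crystal with a 250/2 ratio yields 3 GHz from leaf
0x15 alone, while a zero crystal frequency with a 3000 MHz base
frequency yields the same value through leaf 0x16.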
Warner Losh [Wed, 25 Sep 2019 07:51:30 +0000 (07:51 +0000)]
Size is unsigned, so remove the test entirely.
The kernel won't crash if you have a bad value and I'd rather not have
nvmecontrol know the internal details about how the nvme driver limits
the transfer size.
Toomas Soome [Wed, 25 Sep 2019 07:09:25 +0000 (07:09 +0000)]
loader: add teken.fg_color and teken.bg_color variables
Add settable variables to control teken default color attributes.
The supported colors are 0-7 or basic color names:
black, red, green, brown, blue, magenta, cyan, white.
The current implementation does add some duplication which will be addressed
later.
cron: add log suppression and mail suppression for successful runs
This commit adds two new extensions to crontab, ported from OpenBSD:
- -n: suppress mail on successful run
- -q: suppress logging of command execution
The -q option appears decades old, but -n is relatively new. The
original proposal by Job Snijder can be found here [1], and gives very
convincing reasons for inclusion in base.
This patch is a nearly identical port of OpenBSD cron for -q and -n
features. It is written to follow existing conventions and style of the
existing codebase.
Example usage:
# should only send email, but won't show up in log
* * * * * -q date
# should not send email
* * * * * -n date
# should not send email or log
* * * * * -n -q date
# should send email because of ping failure
* * * * * -n -q ping -c 1 5.5.5.5
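The effect of the two flags on the examples above can be modelled
roughly like this. It is only a sketch: the helper names are
hypothetical, the parsing is far simpler than cron's own, and the
has_output parameter encodes the assumption that mail is only
considered when the job produced output.

```python
def split_cron_flags(command_field):
    """Split leading -n/-q flags from the command of a crontab entry."""
    flags = set()
    words = command_field.split()
    while words and words[0] in ("-n", "-q"):
        flags.add(words.pop(0)[1])
    return flags, " ".join(words)


def should_log(flags):
    """-q suppresses logging of the command execution."""
    return "q" not in flags


def should_mail(flags, exit_status, has_output=True):
    """-n suppresses mail only when the run was successful."""
    if not has_output:
        return False  # assumption: nothing to mail without output
    return not ("n" in flags and exit_status == 0)
```

With these rules, "-n -q ping -c 1 5.5.5.5" is neither logged nor
mailed on success, but a non-zero exit status from ping still produces
mail.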
powerpc/atomic: Follow recommendations on atomic primitive comparisons
Both IBM and Freescale programming examples presume the cmpset operands will
favor equal, and pessimize the non-equal case instead. Do the same for
atomic_cmpset_* and atomic_fcmpset_*. This slightly pessimizes the failure
case, in favor of the success case.
Warner Losh [Wed, 25 Sep 2019 00:24:57 +0000 (00:24 +0000)]
After my comnd changes, the number of threads and size weren't set. In
addition, the flags are optional, but were made to be mandatory. Set
these things, as well as sanity check the specified size.
Rick Macklem [Tue, 24 Sep 2019 23:38:10 +0000 (23:38 +0000)]
Replace all mtx_lock()/mtx_unlock() on the iod lock with macros.
Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called and the iod lock is held when the NFS node lock
is acquired, the iod mutex will need to be changed to an sx lock as well.
To simplify the future commit that changes both the NFS node lock and iod lock
to sx locks, this commit replaces all mtx_lock()/mtx_unlock() calls on the
iod lock with macros.
There is no semantic change as a result of this commit.
I don't know when the future commit will happen and be MFC'd, so I have
set the MFC on this commit to one week so that it can be MFC'd at the same
time.
Michael Gmelin [Tue, 24 Sep 2019 20:49:33 +0000 (20:49 +0000)]
freebsd-update: Add `updatesready' and `showconfig' commands
`freebsd-update updatesready' can be used to check if there are any pending
fetched updates that can be installed.
`freebsd-update showconfig' writes freebsd-update's configuration to
stdout.
This also changes the exit code of `freebsd-update install' to 2 in case
there are no updates pending to be installed and there wasn't a fetch phase
in the same invocation. This allows scripts to tell apart these error
conditions without breaking existing jail managers.
Randall Stewart [Tue, 24 Sep 2019 20:04:31 +0000 (20:04 +0000)]
Fix the ifdefs in tcp_ratelimit.h. They were reversed so
that instead of functions only being inside the _KERNEL and
the absence of RATELIMIT causing us to have NULL/error returning
interfaces we ended up with non-kernel getting the error path.
opps..
Alexander Motin [Tue, 24 Sep 2019 20:01:20 +0000 (20:01 +0000)]
Fix/improve interrupt threads scheduling.
Doing some tests with very high interrupt rates I've noticed that one of
conditions I added in r232207 to make interrupt threads in most cases
run on local CPU never worked as expected (worked only if previous time
it was executed on some other CPU, which is quite the opposite). It caused
additional CPU usage to run full CPU search and could schedule interrupt
threads to some other CPU.
This patch removes that code and instead reuses existing non-interrupt
code path with some tweaks for interrupt case:
- On SMT systems, if current thread is idle, don't look at other threads.
Even if they are busy, it may take more time to do a full search and bounce
the interrupt thread to another core than to execute it locally, even sharing
CPU resources. It is the other threads that should migrate, not bound interrupts.
- Try hard to keep interrupt threads within LLC of their original CPU.
This improves scheduling cost and supposedly cache and memory locality.
On a test system with 72 threads doing 2.2M IOPS to NVMe this saves a few
percent of CPU time while adding a few percent to IOPS.
Randall Stewart [Tue, 24 Sep 2019 18:18:11 +0000 (18:18 +0000)]
This commit adds BBR (Bottleneck Bandwidth and RTT) congestion control. This
is a completely separate TCP stack (tcp_bbr.ko) that will be built only if
you add the make options WITH_EXTRA_TCP_STACKS=1 and also include the option
TCPHPTS. You can also include the RATELIMIT option if you have a NIC interface that
supports hardware pacing; BBR understands how to use such a feature.
Note that this commit also adds in a general purpose time-filter which
allows you to have a min-filter or max-filter. A filter allows you to
have a low (or high) value for some period of time and degrade slowly
to another value as time passes. You can find out the details of
BBR by looking at the original paper at:
or consult many other web resources you can find on the web
referenced by "BBR congestion control". It should be noted that
BBRv1 (which this is) does tend to unfairness in cases of small
buffered paths, and it will usually get less bandwidth in the case
of large BDP paths (when competing with new-reno or cubic flows). BBR
is still an active research area and we do plan on implementing V2
of BBR to see if it is an improvement over V1.
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D21582
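To illustrate the idea behind a min-filter, here is a sliding-window
minimum built on a monotonic deque. This is a swapped-in textbook
technique chosen for brevity, not the time-filter added by this commit;
the class name and interface are hypothetical.

```python
from collections import deque


class WindowedMin:
    """Track the minimum sample seen during the last `window` time units."""

    def __init__(self, window):
        self.window = window
        self.samples = deque()  # (time, value) pairs, values increasing

    def update(self, now, value):
        # Older samples that are not smaller can never be the minimum again.
        while self.samples and self.samples[-1][1] >= value:
            self.samples.pop()
        self.samples.append((now, value))
        # Expire samples that have aged out, always keeping the newest one.
        while len(self.samples) > 1 and self.samples[0][0] <= now - self.window:
            self.samples.popleft()
        return self.samples[0][1]
```

A low sample is held for one window and the reported value then rises
step by step towards more recent samples, which is the "degrade slowly"
behaviour described above. A max-filter is the same structure with the
comparison reversed.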
From Piotr:
r351152 introduced iflib_deregister() function calling
EVENTHANDLER_DEREGISTER() to unregister VLAN events. This patch removes
duplicate of EVENTHANDLER_DEREGISTER() calls placed in
iflib_device_deregister() as this function is now calling
iflib_deregister(). This is to avoid deregistering the same event twice.
This patch also adds a check in iflib_vlan_register() to prevent
registering a VLAN while in detach.
Patch co-authored by Krzysztof Galazka <krzysztof.galazka@intel.com>,
erj <erj@FreeBSD.org> and Jacob Keller <jacob.e.keller@intel.com>.
Signed-off-by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by: gallatin@, erj@
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21711
Michael Tuexen [Tue, 24 Sep 2019 13:15:24 +0000 (13:15 +0000)]
Plumb a memory leak.
Thanks to Felix Weinrank for finding this issue using fuzz testing
and reporting it for the userland stack:
https://github.com/sctplab/usrsctp/issues/378
Rick Macklem [Tue, 24 Sep 2019 01:58:54 +0000 (01:58 +0000)]
Replace all mtx_lock()/mtx_unlock() on n_mtx with the macros.
For a long time, some places in the NFS code have locked/unlocked the
NFS node lock with the macros NFSLOCKNODE()/NFSUNLOCKNODE() whereas
others have simply used mtx_lock()/mtx_unlock().
Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called, replace all occurrences of mtx_lock/mtx_unlock
with the macros to simplify making the change to an sx lock in a future commit.
There is no semantic change as a result of this commit.
I am not sure if the change to an sx lock will be MFC'd soon, so I put
an MFC of 1 week on this commit so that it could be MFC'd with that commit.
- track the total count of hot entries
- pre-read the lock when shrinking since it is typically already taken
- place the lock in its own cacheline
- shorten the hold time of hot lock list when zapping
Alexander Motin [Mon, 23 Sep 2019 17:53:47 +0000 (17:53 +0000)]
Make nvme(4) driver some more NUMA aware.
- For each queue pair precalculate CPU and domain it is bound to.
If queue pairs are not per-CPU, then use the domain of the device.
- Allocate most of queue pair memory from the domain it is bound to.
- Bind callouts to the same CPUs as queue pair to avoid migrations.
- Do not assign queue pairs to each SMT thread. It just wasted
resources and increased lock congestion.
- Remove fixed multiplier of CPUs per queue pair, spread them evenly.
This allows using more queue pairs in some hardware configurations.
- If queue pair serves multiple CPUs, bind different NVMe devices to
different CPUs.
Mark Johnston [Mon, 23 Sep 2019 15:08:17 +0000 (15:08 +0000)]
Implement x86 dtrace_invop_(un)init() in C.
There is no reason for these routines to be written in assembly. In
the ports of DTrace to other platforms, they are already written in C.
No functional change intended.
Mark Johnston [Mon, 23 Sep 2019 14:19:41 +0000 (14:19 +0000)]
Set NX on some non-leaf direct map page table entries.
The direct map is never used for execution of code, so we might as well
set NX in the direct map's PML4Es. Also clarify the intent of the code
in create_pagetables() that restricts access protections on the region
of the direct map mapping the kernel text.
Mark Johnston [Mon, 23 Sep 2019 14:14:43 +0000 (14:14 +0000)]
Use elf_relocaddr() when handling R_X86_64_RELATIVE relocations.
This is required for DPCPU and VNET data variable definitions to work when
KLDs are linked as DSOs. R_X86_64_RELATIVE relocations should not appear
in object files, so assert this in elf_relocaddr().
If a robust mutex's owner terminated, causing kernel-assisted state
recovery, and then pthread_mutex_destroy() is executed as the next
action, an assert is triggered about the mutex still being on the list.
Ignore the mutex linkage in pthread_mutex_destroy() for shared robust
mutexes with dead owner, same as for enqueue_mutex().
Reported by: avg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
SYSINIT usage was added, but the <sys/kernel.h> dependency was not added.
This worked by coincidence, as most of the mips configs have DDB enabled and
pmap.c gets <sys/kernel.h> via ddb.h pollution.
Sean Eric Fagan [Mon, 23 Sep 2019 04:28:07 +0000 (04:28 +0000)]
Add two options to allow mount to avoid covering up existing mount points.
The two options are
* nocover/cover: Prevent/allow mounting over an existing root mountpoint.
E.g., "mount -t ufs -o nocover /dev/sd1a /usr/local" will fail if /usr/local
is already a mountpoint.
* emptydir/noemptydir: Prevent/allow mounting on a non-empty directory.
E.g., "mount -t ufs -o emptydir /dev/sd1a /usr" will fail.
Neither of these options is intended to be a default, for historical and
compatibility reasons.
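As a rough userland approximation of the two checks (the real checks
happen in the kernel at mount time), the conditions can be expressed as
below. The function name and error strings are hypothetical, and
os.path.ismount() is only a stand-in for the kernel's test that the
target is the root of an existing mount.

```python
import os
import tempfile


def check_mount_target(path, nocover=False, emptydir=False):
    """Return an error string if the options forbid mounting on path."""
    if nocover and os.path.ismount(path):
        return "nocover: %s is already a mount point" % path
    if emptydir and os.listdir(path):
        return "emptydir: %s is not empty" % path
    return None


# Demo on a scratch directory: empty at first, then populated.
with tempfile.TemporaryDirectory() as scratch:
    before = check_mount_target(scratch, emptydir=True)
    open(os.path.join(scratch, "file"), "w").close()
    after = check_mount_target(scratch, emptydir=True)
```

Here before is None because the scratch directory starts out empty,
while after carries the emptydir error once a file exists in it.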
octeon-sdk: suppress another set of warnings under clang
Clang sees this construct and warns that adding an int to a string like this
does not concatenate the two. Fortunately, this is not what octeon-sdk
actually intended to do, so we take the path towards remediation that clang
offers: use array indexing instead.
octeon1: suppress a couple of warnings under clang
These appear in octeon-sdk -- there are new releases, but they don't seem to
address the running issues in octeon-sdk. GCC4.2 is more than happy, but
clang is much less-so and most of them are fairly innocuous and perhaps a
by-product of their style guide, which may make some of the changes harder
to upstream (if this is even possible anymore).
Mike Karels [Sun, 22 Sep 2019 13:56:27 +0000 (13:56 +0000)]
Add support for ps -H on corefiles in libkvm
Add support for kernel threads in kvm_getprocs() and the underlying
kvm_proclist() in libkvm when fetching from a kernel core file. This
has been missing/needed for several releases, since kernel threads became
normal threads. The loop over the processes now contains a sub-loop for
threads, which iterates beyond the first thread only when threads are
requested. Also set some fields such as tid that were previously
uninitialized.
Michael Tuexen [Sun, 22 Sep 2019 11:11:01 +0000 (11:11 +0000)]
Don't hold the info lock when calling sctp_select_a_tag().
This avoids a double lock bug in the NAT colliding state processing
of SCTP. Thanks to Felix Weinrank for finding and reporting this issue in
https://github.com/sctplab/usrsctp/issues/374
He found this bug using fuzz testing.
Michael Tuexen [Sun, 22 Sep 2019 10:40:15 +0000 (10:40 +0000)]
Cleanup the RTO calculation and perform some consistency checks
before computing the RTO.
This should fix an overflow issue reported by Felix Weinrank in
https://github.com/sctplab/usrsctp/issues/375
for the userland stack and found by running a fuzz tester.
MFZoL: Retire send space estimation via ZFS_IOC_SEND
Add a small wrapper around libzfs_core's lzc_send_space() to libzfs so
that every legacy ZFS_IOC_SEND consumer, along with their userland
counterpart estimate_ioctl(), can leverage ZFS_IOC_SEND_SPACE to
request send space estimation.
The legacy functionality in zfs_ioc_send() is left untouched for
compatibility purposes.
print summary line for space estimate of zfs send from bookmark
Although there is always a single stream and the total size in the
summary is always equal to the size reported for the stream, it's nice
to follow the usual output format.
Alan Cox [Sat, 21 Sep 2019 19:51:57 +0000 (19:51 +0000)]
In case a translation fault on the kernel address space occurs from
within a critical section, we must perform a lock-free check on the
faulting address.
jot.1: Explain default argument values more precisely
The way jot(1) defaults missing arguments doesn't match the behaviour
described in the manpage, which states that with fewer than 3 arguments
missing values are supplied from left to right.
In fact, with one or two arguments, the last (s which is step size or seed)
defaults to 1 (or -1 if begin and end specify a descending range), and then
omitted arguments are set to default starting with the leftmost until three
arguments are available.
This is why `jot 2 1000` prints 1000 and 1001 instead of 1000 and 100.
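The one- and two-argument cases described above can be modelled as
follows. This sketch covers only those cases with integer arguments;
the function name is hypothetical, and the descending-range rule (s
defaulting to -1) does not come into play because end is never given
here.

```python
def jot_short_form(reps, begin=None):
    """Model jot(1) with one or two arguments (reps, then begin)."""
    step = 1       # the last argument, s, defaults to 1
    if begin is None:
        begin = 1  # the leftmost omitted argument takes its default
    # reps, begin and s are now known, so end is derived, not defaulted.
    return [begin + i * step for i in range(reps)]
```

Because s is fixed before end is considered, jot_short_form(2, 1000)
yields 1000 and 1001; the default end value of 100 is never used.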
Hiroki Sato [Sat, 21 Sep 2019 01:29:59 +0000 (01:29 +0000)]
Fix build errors of test.c, which had been broken for a long time.
This is a temporary fix and should be converted to complete
test scenarios by using this tool.
Hiroki Sato [Sat, 21 Sep 2019 00:17:40 +0000 (00:17 +0000)]
Add a workaround for servers which respond RPC_PROGNOTREGISTERED
to a clnt_create() call even when it is actually a program
version mismatch.
Normally the server is supposed to return RPC_PROGVERSMISMATCH
when it supports the specified program but does not support
the specified version. Some filers return RPC_PROGNOTREGISTERED
to RQUOTA v2 calls and FreeBSD does not retry with the old
v1 calls. This change fixes this failure scenario.
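The retry logic can be sketched as below. The exception classes stand
in for the two RPC status codes, and create_client, clnt_create (passed
in as a callable) and quirky_filer are hypothetical names used only for
this illustration; none of this is the actual libc or quota code.

```python
class ProgVersMismatch(Exception):
    """Stand-in for RPC_PROGVERSMISMATCH."""


class ProgNotRegistered(Exception):
    """Stand-in for RPC_PROGNOTREGISTERED."""


def create_client(clnt_create, versions=(2, 1)):
    """Try each program version in turn, newest first."""
    last_error = None
    for version in versions:
        try:
            return clnt_create(version), version
        except (ProgVersMismatch, ProgNotRegistered) as error:
            # Workaround: some servers report "not registered" when
            # only the version is unsupported, so retry on both.
            last_error = error
    raise last_error


def quirky_filer(version):
    """Fake server that only speaks v1 but misreports v2 as unregistered."""
    if version != 1:
        raise ProgNotRegistered()
    return "v1 client handle"
```

Calling create_client(quirky_filer) falls through to version 1 instead
of giving up after the misleading error for version 2.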
When a file is unlinked, the denode is not reclaimed until the last
reference is dropped, but the directory entry is immediately up for reuse.
This is a problem later when createde goes to grab a denode for the newly
created entry -- we search the hash and find a dead denode, then return that
without even bumping the reference count and the data later gets truncated
when the last reference to the unlinked file is dropped.
This manifested itself as a broken in-place strip(1) on msdosfs. elfcopy
will do a sequence incredibly roughly like this:
and the resulting file would be truncated, but the write succeeded, as long
as a reference to the unlinked file had not been closed.
Some archaeology indicates that this bug has likely existed since msdosfs
was converted to use vfs_hash instead of a home rolled hash implementation
in r143570. Prior to that point, the hashget implementation would do a
refcnt check while searching and explicitly only return a denode with
de_refcnt != 0. vfs_hash did not yet have the callback that it does today,
so this slipped away and did not come back when it later grew that
functionality.
The comment indicating that we want to skip these denodes has been updated
to reflect where this is actually done. My repo-diving session seems to
indicate that the refcnt check was likely never actually below the comment,
to be pedantic, but instead a detail wrapped up in the hashget
implementation since the beginning of its inclusion into FreeBSD.
This bug was the cause behind the issue addressed in r352557.
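The lookup rule that was lost and restored can be sketched like this.
It is a toy model: the Denode class, the dict-of-lists table and
hash_get are illustrative stand-ins, not the vfs_hash interface or the
msdosfs structures.

```python
class Denode:
    def __init__(self, key, refcnt):
        self.key = key
        self.refcnt = refcnt  # 0 means unlinked and waiting for reclaim


def hash_get(table, key):
    """Return a live denode for key, skipping dead ones, or None."""
    for node in table.get(key, []):
        if node.refcnt == 0:
            continue  # dead denode: its directory entry may be reused
        return node
    return None
```

When only a dead denode is hashed under the key, the lookup returns
None, so the caller would set up a fresh denode for the new entry
rather than reuse one that is about to be truncated.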
loader: Respect loader_color=YES for serial consoles
It's not uncommon these days for the terminals attached to serial consoles
to support ANSI escape sequences. However, we assume escape sequences may
break some serial consoles and default to not using them when boot_serial or
boot_multicons is set (or if console contains "comconsole" in the forth loader) for
broader compatibility. We also have loader_color which can be explicitly set
to "NO" to disable the use of ANSI escape sequences.
The problem is that loader_color=YES gets ignored when boot_serial=YES or
boot_multicons=YES (or when console contains "comconsole" in the forth
loader).
To fix, the existing default behavior remains unchanged when loader_color is
unset, loader_color=NO explicitly disables the use of ANSI escape sequences
still, and the change is that loader_color=YES can now be used to explicitly
allow ANSI escapes when a serial console is enabled.
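The resulting decision can be summarised by the sketch below. The
function name and the dict used for the loader environment are
hypothetical, and comparing values case-insensitively against "YES" and
"NO" is an assumption of this sketch rather than a statement about the
loader's parsing.

```python
def use_ansi_escapes(env):
    """Decide whether the loader should emit ANSI escape sequences."""
    color = env.get("loader_color", "").upper()
    if color == "YES":
        return True   # explicit opt-in now wins, even on serial
    if color == "NO":
        return False  # explicit opt-out, unchanged
    serial = (
        env.get("boot_serial", "").upper() == "YES"
        or env.get("boot_multicons", "").upper() == "YES"
        or "comconsole" in env.get("console", "")
    )
    return not serial  # unset: keep the old conservative default
```

The only behavioural change is the first branch: previously a serial
console setting would override loader_color=YES.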
top(1): support multibyte characters in command names (ARGV array)
depending on locale.
- add setlocale()
- remove printable() function
- add VIS_OCTAL and VIS_SAFE to the flag of strvisx() to display
non-printable characters that do not use C-style backslash sequences
in three digit octal sequence, or remove it
This change allows multibyte characters to be displayed according to
locale. If it is recognized as a non-display character according to the
locale, it is displayed in three digit octal sequence.
Summary:
Install's strip capability, by way of strip(1), doesn't seem to work
correctly on msdosfs, and instead ends up truncating the resulting
binary to 0-length. As a workaround, don't strip ubldr(8). This
fixes installworld on Book-E ubldr-based platforms, which prior to this
would need to manually install ubldr separately after installworld, in
order to have a functional ubldr.
The same thing could be done on PowerNV platforms that use msdosfs /boot
volumes, since loader and loader.kboot, etc, all get truncated to 0 on
install. However, PowerNV does not use loader, instead loading from
petitboot, so it's not really necessary at this time.
Add quirk for XHCI(4) controllers to support USB control transfers
above 1Kbyte. It appears that some XHCI(4) controllers do not support
USB control transfers that are split using a link TRB. The
next NORMAL TRB after the link TRB is simply failing with XHCI error
code 4. The quirk ensures we allocate a 64Kbyte buffer so that the
data stage TRB is not broken with a link TRB.
Increase the maximum user-space buffer size from 256kBytes to 32MBytes for
libusb. This is useful for speeding up large data transfers while reducing
the interrupt rate.
Ed Maste [Fri, 20 Sep 2019 09:04:52 +0000 (09:04 +0000)]
elf_common: add ELF note names
r348628 added a definition of NT_GNU_BUILD_ID. Some software (Valgrind)
also expects a #define for the note name (ELF_NOTE_GNU) in the case that
NT_GNU_BUILD_ID is defined.
PR: 239669
Reported by: Yuichiro NAITO
Sponsored by: The FreeBSD Foundation
Event: EuroBSDCon FreeBSD DevSummit 2019
Michael Tuexen [Fri, 20 Sep 2019 08:20:20 +0000 (08:20 +0000)]
Fix the handling of invalid parameters in ASCONF chunks.
Thanks to Mark Wodrich from Google for reporting the issue in
https://github.com/sctplab/usrsctp/issues/376
for the userland stack.
Alexander Motin [Thu, 19 Sep 2019 22:15:57 +0000 (22:15 +0000)]
Improve ioat(4) NUMA-awareness.
Allocate ioat->ring memory from the device domain.
Schedule ioat->poll_timer to the first CPU of the device domain.
According to pcm-numa tool from intel-pcm port, this reduces number of
remote DRAM accesses while copying data by 75%. And unless it is noise,
I've noticed some speed improvement when copying data to other domain.
Follow up on r352304 which disabled default mlockall() at startup.
Unfortunately though the original tarball supports this in ./configure
(for Linux), to fully support disabling of mlockall() by default requires
a little extra help otherwise the following is logged in syslog:
Cannot set RLIMIT_MEMLOCK: Operation not permitted
Apply r346792 (cperciva) from stable/12 to head. The original commit
message:
On non-x86 systems, use "quarterly" packages.
x86 architectures have "latest" package builds on stable/*, so keep using
those (they'll get switched over to "quarterly" during releases).
The original commit was a direct commit to stable/12, as at the time it
was presumed it would not be necessary for head. However, when it is time
to create a releng branch or switch from PRERELEASE/STABLE to BETA/RC, the
pkg(7) Makefile needs further adjusting. This commit includes those
further adjustments, evaluating the BRANCH variable from release/Makefile
to determine the pkg(7) repository to use.
Ed Maste [Thu, 19 Sep 2019 11:34:35 +0000 (11:34 +0000)]
freebsd-update.8: appease igor
igor follows American style guides in the belief that abbreviations i.e.
and e.g. are always followed by a comma. Make that change now so that
future updates to freebsd-update.8 do not complain about this.
Michael Tuexen [Thu, 19 Sep 2019 10:27:47 +0000 (10:27 +0000)]
When the RACK stack computes the space for user data in a TCP segment,
it wasn't taking the IP level options into account. This patch fixes this.
In addition, it also corrects a KASSERT and adds protection code to ensure
that the IP header chain and the TCP header fit in the first fragment as
required by RFC 7112.
Reviewed by: rrs@
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D21666
Michael Tuexen [Thu, 19 Sep 2019 10:22:29 +0000 (10:22 +0000)]
When processing an incoming IPv6 packet over the loopback interface which
contains Hop-by-Hop options, the mbuf chain is potentially changed in
ip6_hopopts_input(), called by ip6_input_hbh().
This can happen because of the use of IP6_EXTHDR_CHECK, which might
call m_pullup().
So provide the updated pointer back to the caller of ip6_input_hbh() to
avoid using a freed mbuf chain in ip6_input().
Reviewed by: markj@
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D21664