CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

Extend the info about the limitations of datasets in jails.

Reviewed by: allanjude
Sponsored by: Essen Hackathon

buildworld fix: private appears to have special meaning on FreeBSD - revert to priv

Limit the amount of dnode metadata in the ARC

In addition, import the most recent arc_prune_async implementation as a dependency

commit 25458cbef9e59ef9ee6a7e729ab2522ed308f88f
Author: Tim Chase <tim@chase2k.com>
Date:   Wed Jul 13 07:42:40 2016 -0500

    Limit the amount of dnode metadata in the ARC

    Metadata-intensive workloads can cause the ARC to become permanently
    filled with dnode_t objects as they're pinned by the VFS layer.
    Subsequent data-intensive workloads may only benefit from about
    25% of the potential ARC (arc_c_max - arc_meta_limit).

    In order to help track metadata usage more precisely, the other_size
    metadata arcstat has been replaced with dbuf_size, dnode_size and bonus_size.

    The new zfs_arc_dnode_limit tunable, which defaults to 10% of
    zfs_arc_meta_limit, defines the minimum number of bytes which is desirable
    to be consumed by dnodes.  Attempts to evict non-metadata will trigger
    async prune tasks if the space used by dnodes exceeds this limit.

    The new zfs_arc_dnode_reduce_percent tunable specifies the amount by
    which the excess dnode space is attempted to be pruned as a percentage of
    the amount by which zfs_arc_dnode_limit is being exceeded.  By default,
    it tries to unpin 10% of the dnodes.

    The problem of dnode metadata pinning was observed with the following
    testing procedure (in this example, zfs_arc_max is set to 4GiB):

        - Create a large number of small files until arc_meta_used exceeds
          arc_meta_limit (3GiB with default tuning) and arc_prune
          starts increasing.

        - Create a 3GiB file with dd.  Observe arc_meta_used.  It will still
          be around 3GiB.

        - Repeatedly read the 3GiB file and observe arc_meta_used as before.
          It will continue to stay around 3GiB.

    With this modification, space for the 3GiB file is gradually made
    available as subsequent demands on the ARC are made.  The previous behavior
    can be restored by setting zfs_arc_dnode_limit to the same value as the
    zfs_arc_meta_limit.
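
The interplay of the two tunables comes down to a little arithmetic. The following is a minimal sketch of that arithmetic only, not the ZFS code: the function name and its parameters are hypothetical, and it assumes the prune target is a percentage of the excess over the limit, as the description above states.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the ZFS sources): number of bytes
 * of dnode metadata to ask the async prune tasks to release.
 */
static uint64_t
dnode_prune_target(uint64_t dnode_size, uint64_t dnode_limit,
    uint64_t reduce_percent)
{
	assert(reduce_percent <= 100);
	if (dnode_size <= dnode_limit)
		return (0);	/* under the limit: nothing to prune */
	/* Prune a percentage of the excess, not of the total. */
	return ((dnode_size - dnode_limit) * reduce_percent / 100);
}
```

With a limit of 200 bytes, 1200 bytes in use and the default 10%, the sketch asks for 100 bytes; when usage is at or below the limit it asks for nothing.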

Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Issue #4345
    Issue #4512
    Issue #4773
    Closes #4858

Eliminate a redundant assignment.

MFC after: 1 week

indent(1): revert r334640 and r334632

While the STACKSIZE macro is indeed problematic on some systems, the commits
were wrong to shrink il[] and cstk[], because they need to be of the same
size as p_stack[] as they're accessed with the same index ps.tos.

Move all NTP related files to usr.sbin/ntp/ntpd.

This helps with pkgbase by using CONFS to tag these as config files.

Approved by: allanjude (mentor), ian, cy
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16661

Move all periodic related config and scripts to usr.sbin/periodic/

This makes pkgbase easier by tagging these as CONFS so they are properly
tagged as config files.

Approved by: will (mentor)
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16553

pf tests: Basic test for 'set skip on $groupname'

This tests for the problem reported in PR 229241, where using a group
name in 'set skip on' did not work as expected.

Sponsored by: Essen Hackathon

pf: Take the IF_ADDR_RLOCK() when iterating over the group list

We did do this elsewhere in pf, but the lock was missing here.

Sponsored by: Essen Hackathon

pf: Fix 'set skip on' for groups

The pfi_skip_if() function sometimes caused skipping of groups to work,
if the members of the group used the groupname as a name prefix.
This is often the case, e.g. group lo usually contains lo0, lo1, ...,
but not always.

Rather than relying on the name, explicitly check for group memberships.
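
The difference between the two approaches can be sketched as follows. This is a simplified illustration, not pf's code: struct ifrec and both function names are hypothetical, and the prefix test only approximates the old behavior described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified interface record; not pf's real structures. */
struct ifrec {
	const char *name;		/* e.g. "lo0" */
	const char *const *groups;	/* NULL-terminated list of group names */
};

/* Name heuristic: treat the group name as a prefix of the interface name. */
static int
skip_by_prefix(const struct ifrec *ifp, const char *group)
{
	return (strncmp(ifp->name, group, strlen(group)) == 0);
}

/* Explicit check: walk the interface's own group list. */
static int
skip_by_membership(const struct ifrec *ifp, const char *group)
{
	assert(ifp->groups != NULL);
	for (size_t i = 0; ifp->groups[i] != NULL; i++)
		if (strcmp(ifp->groups[i], group) == 0)
			return (1);
	return (0);
}
```

An interface named em0 that belongs to a group called "trusted" is missed by the prefix heuristic but found by the membership walk.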

Obtained from: OpenBSD (pf_if.c,v 1.62, pf_if.c,v 1.63)
Sponsored by: Essen Hackathon

- Correct the description of when jobs are executed in relation to the load
average to match reality (slightly different from what was submitted in the
PR: use an English word instead of a math symbol).
- Wrap the corresponding part to below 80 characters per line.

Submitted by: yamagi@yamagi.org
PR: 202202
Sponsored by: Essen Hackathon

Re-enable reading byte swapped NFS_MAGIC dumps.

Fix bug introduced in r98542: prior to this revision the byte-swapped
value was compared at this place. The current check is in a conditional
section where the non-byte-swapped value was already checked to be not
the value which is checked again. As byte-swapping is activated afterwards,
it only makes sense if the byte-swapped value is checked.
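
The reasoning can be illustrated with a small sketch. It is not the actual source: the magic value, the enum and the function names are placeholders chosen for the example.

```c
#include <stdint.h>

#define	EXAMPLE_MAGIC	60012u	/* illustrative magic number for this sketch */

enum hdr_order { HDR_BAD, HDR_NATIVE, HDR_SWAPPED };

/* Reverse the byte order of a 32-bit value. */
static uint32_t
swap32(uint32_t v)
{
	return ((v >> 24) | ((v >> 8) & 0x0000ff00u) |
	    ((v << 8) & 0x00ff0000u) | (v << 24));
}

static enum hdr_order
classify_magic(uint32_t magic)
{
	if (magic == EXAMPLE_MAGIC)
		return (HDR_NATIVE);
	/*
	 * The native value is already known not to match here, so
	 * comparing it again could never succeed; test the swapped value.
	 */
	if (swap32(magic) == EXAMPLE_MAGIC)
		return (HDR_SWAPPED);	/* caller would enable byte swapping */
	return (HDR_BAD);
}
```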

Submitted by: Keith White <kwhite@site.uottawa.ca>
PR: 200059
MFC after: 1 month
Sponsored by: Essen Hackathon

Fix the build by just installing systop since testing shows it works with:

dwatch -X systop

Reviewed by: kp
Approved by: allanjude (mentor)

Remove unused MAPDESCFILE.

Move pf.os to sbin/pfctl/

Approved by: will (mentor)
Glanced at by: kp
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16557

Move cron.d/at to usr.bin/at/

This helps with pkgbase as it tags this as a config file so it is handled as
such.

Approved by: allanjude (mentor)
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16673

Move snmpd.config to usr.sbin/bsnmpd/bsnmpd/

This helps with pkgbase as this config file will now be tagged as a config
file.

Approved by: allanjude (mentor)
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16674

Move sysctl.conf to sbin/sysctl/ and switch to CONFS.

This helps with pkgbase to tag this config file as a config file.

Approved by: allanjude (mentor), will (mentor)
Differential Revision: https://reviews.freebsd.org/D16559

Move ddb.conf to sbin/ddb/ and switch to CONFS.

This helps pkgbase as this config file will now be tagged as a config file.

Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D16675

Move OpenBSM to CONFS

This helps with pkgbase as these config files will be properly tagged as
config files.

Approved by: allanjude (mentor), oshogbo
Differential Revision: https://reviews.freebsd.org/D16679

Add svnlite to places where svn is mentioned.

The Makefile part of the PR has already been solved differently, so that
part of the PR is skipped. The man page change is slightly modified to
fit the way the Makefile works and the spirit of what is intended here.

Submitted by: Juan Ramón Molina Menor <info@juanmolina.eu>
PR: 194910
Sponsored by: Essen Hackathon

Add "ESI Juli@ XTe" as a supported device.

Submitted by: Vladislav Movchan <vladislav.movchan@gmail.com>
PR: 222025
Sponsored by: Essen Hackathon

printf: Fix \c in %b in printf builtin exiting the shell after r337458

SVN r337458 erroneously partially reverted r265885.

This is immediately visible when running the Kyua/ATF tests for
usr.bin/printf, which actually test sh's printf builtin.

PR: 229641

IEEE!

Pointy hat: myself

Pull in r338481 from upstream llvm trunk (by Chandler Carruth):

  [x86] Fix a really subtle miscompile due to a somewhat glaring bug in
  EFLAGS copy lowering.

  If you have a branch of LLVM, you may want to cherrypick this. It is
  extremely unlikely to hit this case empirically, but it will likely
  manifest as an "impossible" branch being taken somewhere, and will be
  ... very hard to debug.

  Hitting this requires complex conditions living across complex
  control flow combined with some interesting memory (non-stack)
  initialized with the results of a comparison. Also, because you have
  to arrange for an EFLAGS copy to be in *just* the right place, almost
  anything you do to the code will hide the bug. I was unable to reduce
  anything remotely resembling a "good" test case from the place where
  I hit it, and so instead I have constructed synthetic MIR testing
  that directly exercises the bug in question (as well as the good
  behavior for completeness).

  The issue is that we would mistakenly assume any SETcc with a valid
  condition and an initial operand that was a register and a virtual
  register at that to be a register *defining* SETcc...

  It isn't though....

  This would in turn cause us to test some other bizarre register,
  typically the base pointer of some memory. Now, testing this register
  and using that to branch on doesn't make any sense. It even fails the
  machine verifier (if you are running it) due to the wrong register
  class. But it will make it through LLVM, assemble, and it *looks*
  fine... But wow do you get a very unusual and surprising branch taken
  in your actual code.

  The fix is to actually check what kind of SETcc instruction we're
  dealing with. Because there are a bunch of them, I just test the
  may-store bit in the instruction. I've also added an assert for
  sanity that ensures we are, in fact, *defining* the register operand.
  =D

Noticed by: kib
MFC after: 1 week

Drop the ternary operator for calculating ssid display length in list_scan().
Regardless of whether a verbose scan is required, we'd still want to display the
full SSID name by default, so use the IEEE80211_NWID_LEN constant to set the
value instead.

Tested on rene@'s laptop.
Reviewed by: kp
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16566

Advise reader to also see mdconfig(8) in mount_cd9660(8).
It is useful for learning how to mount an ISO file via loopback.

Reviewed by: jilles
Approved by: bcr (mentor)
Differential Revision: https://reviews.freebsd.org/D16067

dwatch(1): Add systop profile

Provides a top-like view of syscall consumers.

MFC after: 3 days
X-MFC-to: stable/11
Sponsored by: Smule, Inc.

dwatch(1): Fix syntax error in vop_readdir profile

Reported by: Arne Ehrlich <ehrlich@consider-it.de>
MFC after: 3 days
X-MFC-to: stable/11
Sponsored by: Smule, Inc.

cxgbe(4): Create two variants of service_iq, one for queues with
freelists and one for those without.

MFH: 3 weeks
Sponsored by: Chelsio Communications

Destroy a couple of rogue svn:mergeinfo

stat(1): cache id->name resolution

When invoked on a large list of files, it is most common for a small number of
uids/gids to own most of the results.

Like ls(1), use pwcache(3) to avoid repeatedly looking up the same IDs.

Example microbenchmark and non-scientific results:

$ time (find /usr/src -type f -print0 | xargs -0 stat >/dev/null)

BEFORE:
3.62s user 5.23s system 102% cpu 8.655 total
3.47s user 5.38s system 102% cpu 8.647 total

AFTER:
1.23s user 1.81s system 108% cpu 2.810 total
1.43s user 1.54s system 107% cpu 2.754 total

Does this microbenchmark have any real-world significance?  Until a use case
is demonstrated otherwise, I doubt it.  Ordinarily I would be resistant to
optimizing pointless microbenchmarks in base utilities (e.g., recent totally
gratuitous changes to yes(1)).  However, the pwcache(3) APIs actually
simplify stat(1) logic ever so slightly compared to the raw APIs they wrap,
so I think this is at worst harmless.
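
Why caching helps can be shown with a hand-rolled stand-in. The sketch below does not use pwcache(3) itself; it wraps the standard getpwuid(3) in a small direct-mapped cache, and every name in it (cached_user_name, uid_cache, uid_lookups, CACHE_SLOTS) is hypothetical.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

#define	CACHE_SLOTS	64	/* arbitrary size for this sketch */

struct uid_slot {
	int	valid;
	uid_t	uid;
	char	name[32];
};

static struct uid_slot uid_cache[CACHE_SLOTS];
static unsigned long uid_lookups;	/* counts real getpwuid(3) calls */

/* Resolve a uid to a name, hitting the password database only on a miss. */
static const char *
cached_user_name(uid_t uid)
{
	struct uid_slot *s = &uid_cache[uid % CACHE_SLOTS];
	struct passwd *pw;

	if (s->valid && s->uid == uid)
		return (s->name);	/* cache hit */
	uid_lookups++;
	pw = getpwuid(uid);
	if (pw != NULL)
		snprintf(s->name, sizeof(s->name), "%s", pw->pw_name);
	else	/* unknown uid: fall back to the number */
		snprintf(s->name, sizeof(s->name), "%lu", (unsigned long)uid);
	s->uid = uid;
	s->valid = 1;
	return (s->name);
}
```

When most files belong to a handful of owners, nearly every call after the first few is a cache hit, which is the effect the timings above reflect.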

PR: 230491
Reported by: Thomas Hurst <tom AT hur.st>
Discussed with: gad@

Fix escaping, otherwise Dx gets translated as the macro for DragonFly.
From 2018 Linuxhotel Hackathon & DevSummit

Approved by: eadler
Obtained from: OpenBSD r1.49
Differential Revision: https://reviews.freebsd.org/D16616

ZFS/MFV: Use cached feature info in spa_add_feature_stats()

commit 417104bdd3c7ce07ec58674dd078f9891c3bc780
Author: Ned Bass <bass6@llnl.gov>
Date:   Thu Feb 26 12:24:11 2015 -0800

    Use cached feature info in spa_add_feature_stats()

    Avoid issuing I/O to the pool when retrieving feature flags information.
    Trying to read the ZAPs from disk means that zpool clear would hang if
    the pool is suspended and recovery would require a reboot. To keep the
    feature stats resident in memory, we hang a cached nvlist off of the
    spa.  It is built up from disk the first time spa_add_feature_stats() is
    called, and refreshed thereafter using the cached feature reference
    counts. spa_add_feature_stats() gets called at pool import time so we
    can be sure the cached nvlist will be available if the pool is later
    suspended.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes #3082

Fix misspellings of transmitter/transmitted

Reviewed by: emaste, bcr
Sponsored by: Smule, Inc.
Differential Revision: https://reviews.freebsd.org/D16025

In r308100, an explicit -fexceptions flag was added for the C sources
from LLVM's libunwind, which end up in libgcc_eh.a and libgcc_s.so.
This is because the unwinder needs the unwinder data for its own
functions.

However, for the C++ sources in libunwind, -fexceptions is already the
default, and this can have the side effect of generating a reference to
__gxx_personality_v0, the so-called personality function, which is
normally provided by the C++ ABI library (libcxxrt or libsupc++).

If the reference ends up in the eventual libgcc_s.so, linking any
non-C++ programs against it will fail with "undefined reference to
`__gxx_personality_v0'".

Note that at high optimization levels, the reference is usually
optimized away, which is why we have never noticed this problem before.

With clang 7.0.0 though, higher optimization levels don't help anymore,
since the addition of address-significance tables [1] in
<https://reviews.llvm.org/rL337339>. Effectively, this always causes a
reference to __gxx_personality_v0.

After discussion with the upstream author of that change, it turns out
that we should compile libunwind sources with the -fno-exceptions
-funwind-tables flags instead. This ensures unwind tables are
generated, but no references to any personality functions are emitted.

[1] https://lists.llvm.org/pipermail/llvm-dev/2018-May/123514.html

Reported by: jbeich
PR: 230399
MFC after: 1 week

Disable the D subroutines msgsize() and msgdsize().

They are specific to illumos and the corresponding DIF subroutines are
already disabled on FreeBSD.

Reported by: gnn

Walk back r337554 while discussion continues

The idea was to get the uncontroversial mechanical change out of the way,
then get the meatier functional changes reviewed subsequently.  I had not
realized that the immediately adjacent issue was addressed in a different
direction in r334506 (see Warner's guidance in D15592).

Discussion continues, trying to determine if there is a secondary issue
still[1] and how best to fix it.  With 12-related activities coming up,
while that is ongoing, just take this back for now.

[1]: Shutdown-time eventhandler events fire normally during panic's reboot
path.  Driver callbacks that attempt to issue and wait on interrupt-
completed IO may never complete, hanging the system.  This is particularly
obnoxious in the shutdown/panic path, as the debugger cannot be entered
anymore and the hang prevents reboot restoring availability.

(There's nothing CAM-specific about this problem -- any shutdown
event-triggered driver could do something like this during panic.  But most
NICs, etc.  don't try to send spin-down commands at shutdown. ;-))

Discussed with: imp, markj

subr_prf: remove think-o that had returned to local patch

Reported by: cognet

boot tagging: minor fixes

msgbufinit may be called multiple times as we initialize the msgbuf into a
progressively larger buffer. This doesn't happen as of now on head, but it
may happen in the future and we generally support this. As such, only print
the boot tag if we've just initialized the buffer for the first time.

The boot tag also now has a newline appended to it for better visibility,
and has been switched to a normal printf, by request of bde, after we've
denoted that the msgbuf is mapped.

Update man page to include FreeBSD-specific details.

While this implements a standards-conforming C11 function, there are
implementation details the programmer needs to know. Include those
here. Make changes inspired by comments on the initial review as well,
though mostly this involves stealing the epoch verbiage from
gettimeofday(2). Add myself to authors since I've now changed a
substantial amount of this man page.

Remove assert.h and commented out _DIAGASSERT.

Remove assert.h and _DIAGASSERT to create a paper-trail of changes
from NetBSD. Specifically didn't fix other style issues since I
don't want this to diverge from the NetBSD original too much and
that's too niggling a change to be worth future merge hassles.

Differential Review: https://reviews.freebsd.org/D16649

Bring in timespec_get from NetBSD.

Bring in the functionality for timespec_get from NetBSD. I've lightly
edited the .c file to remove _DIAGASSERT because FreeBSD doesn't have
that functionality and the typical #define'ing it to assert isn't
right here. The man page is verbatim from NetBSD, but will be revised
as part of a larger cleanup of the time man pages (they are
inconsistent and vague in all the wrong places).

Differential Review: https://reviews.freebsd.org/D16649

Restore the behaviour changed in r337536, when bad `ipfw delete` command
returns error.

Now -q option only makes it quiet. And when -f flag is specified, the
command will ignore errors and continue executing with next batched
command.

MFC after: 2 weeks

ath: Minor style cleanups

device_printf => DPRINTF and two whitespace adjustments

Submitted by: Augustin Cavalier <waddlesplash@gmail.com>
Obtained from: Haiku (4a88aa503ad4155a20931e263d24343043994ea9)
MFC after: 1 week

ieee80211_node: fix whitespace issues

Submitted by: Augustin Cavalier <waddlesplash@gmail.com>
Obtained from: Haiku (dffc3e235360cd7b71261239ee8507b7d62a1471)
MFC after: 1 week

net80211: Drain ageq before cleaning it up.

The comment above ieee80211_ageq_cleanup specifically notes that the queue
is assumed to be empty, and in order to make it so, ieee80211_ageq_drain
must be used.

Submitted by: Augustin Cavalier <waddlesplash@gmail.com>
Obtained from: Haiku (dffc3e235360cd7b71261239ee8507b7d62a1471)
MFC after: 1 week

bwi(4): Set ic->ic_softc before bwi_getradiocaps to avoid bad deref

Submitted by: François Revol <revol@free.fr>
Obtained from: Haiku (ba88131cfde64e21bedb4ebedd699cfa5e7fd314)
MFC after: 1 week

readelf: display NT_GNU_PROPERTY_TYPE_0 note name

NT_GNU_PROPERTY_TYPE_0 in a .note.gnu.property section "contains a
program property note which describes special handling requirements
for linker and run-time loader." (from the "System V Application Binary
Interface - Linux Extensions")

Intel CET uses two processor-specific program properties in
NT_GNU_PROPERTY_TYPE_0: GNU_PROPERTY_X86_FEATURE_1_IBT to indicate that
all executable sections are compatible with Indirect Branch Tracking,
and GNU_PROPERTY_X86_FEATURE_1_SHSTK to indicate that sections are
compatible with shadow stack.

A later change should add decoding of the individual properties.

MFC after: 1 week
Sponsored by: The FreeBSD Foundation

Remove unneeded ipsec-related includes.

Reviewed by: rrs
Differential Revision: https://reviews.freebsd.org/D16637

Performance optimization of AVL tree comparator functions

MFV:
commit ee36c709c3d5f7040e1bd11f5c75318aa03e789f
Author: Gvozden Neskovic <neskovic@gmail.com>
Date:   Sat Aug 27 20:12:53 2016 +0200

    perf: 2.75x faster ddt_entry_compare()
        The first 256 bits of ddt_key_t are a block checksum, which is expected
    to be close to random data. Hence, on average, comparison only needs to
    look at the first few bytes of the keys. To reduce the number of conditional
    jump instructions, the result is computed as: sign(memcmp(k1, k2)).

    Sign of an integer 'a' can be obtained as: `(0 < a) - (a < 0)` := {-1, 0, 1} ,
    which is computed efficiently.  Synthetic performance evaluation of
    original and new algorithm over 1G random keys on 2.6GHz Intel(R) Xeon(R)
    CPU E5-2660 v3:

    old     6.85789 s
    new     2.49089 s

    perf: 2.8x faster vdev_queue_offset_compare() and vdev_queue_timestamp_compare()
        Compute the result directly instead of using conditionals

    perf: zfs_range_compare()
        Speedup between 1.1x - 2.5x, depending on compiler version and
    optimization level.

    perf: spa_error_entry_compare()
        `bcmp()` is not suitable for comparator use. Use `memcmp()` instead.

    perf: 2.8x faster metaslab_compare() and metaslab_rangesize_compare()
    perf: 2.8x faster zil_bp_compare()
    perf: 2.8x faster mze_compare()
    perf: faster dbuf_compare()
    perf: faster compares in spa_misc
    perf: 2.8x faster layout_hash_compare()
    perf: 2.8x faster space_reftree_compare()
    perf: libzfs: faster avl tree comparators
    perf: guid_compare()
    perf: dsl_deadlist_compare()
    perf: perm_set_compare()
    perf: 2x faster range_tree_seg_compare()
    perf: faster unique_compare()
    perf: faster vdev_cache _compare()
    perf: faster vdev_uberblock_compare()
    perf: faster fuid _compare()
    perf: faster zfs_znode_hold_compare()
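
The branch-free sign computation described above can be sketched as follows. The macro, struct and function names here are made up for the illustration and are not claimed to match the ZFS sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for this sketch. */
#define	SIGN_OF(a)	(((a) > 0) - ((a) < 0))
#define	CMP(a, b)	(((a) > (b)) - ((a) < (b)))

struct key256 {
	uint8_t bytes[32];	/* stands in for a 256-bit checksum */
};

/* Comparator returning exactly -1, 0 or 1. */
static int
key256_compare(const struct key256 *k1, const struct key256 *k2)
{
	assert(k1 != NULL && k2 != NULL);
	int r = memcmp(k1->bytes, k2->bytes, sizeof(k1->bytes));

	return (SIGN_OF(r));	/* memcmp() only promises the sign */
}

/* The same idea applied directly to two scalar values. */
static int
offset_compare(uint64_t a, uint64_t b)
{
	return (CMP(a, b));
}
```

Each comparison operator yields 0 or 1, so the subtraction gives -1, 0 or 1 without any conditional in the source.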

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Richard Elling <richard.elling@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes #5033

Make distribution now happens from top of source tree.

Silence debugging output

powerpc: Add lwsync and ptesync 'sync' opcode variants to ddb disassembler

The canonical form of sync is:

  sync L, E (if Category Elemental Memory Barriers implemented)

The L bits (2) denote the type of sync:

  0 -- hwsync
  1 -- lwsync
  2 -- ptesync or hwsync

It's been found that most 32-bit CPUs designed prior to the introduction of
lwsync will ignore the L bits.  However, some cores, particularly the e500 core,
will trigger an illegal instruction exception.  Adding these variants will make
it easier to see which sync variant is actually being used in case of a trap.
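
Decoding the L field can be sketched as below. This is an illustration rather than the ddb disassembler's table: sync_variant is a hypothetical name, and the bit positions assume the usual encoding (primary opcode 31, extended opcode 598, L in the two bits starting at bit 21 counted from the least significant bit).

```c
#include <stddef.h>
#include <stdint.h>

/* Return the mnemonic for a sync-family instruction, or NULL otherwise. */
static const char *
sync_variant(uint32_t insn)
{
	/* Primary opcode 31 with extended opcode 598 identifies sync. */
	if ((insn >> 26) != 31 || ((insn >> 1) & 0x3ff) != 598)
		return (NULL);
	switch ((insn >> 21) & 0x3) {	/* the 2-bit L field */
	case 0:
		return ("hwsync");
	case 1:
		return ("lwsync");
	case 2:
		return ("ptesync");	/* or hwsync, per the list above */
	default:
		return ("sync");	/* L value not listed above */
	}
}
```

Under those assumptions, 0x7c0004ac decodes as hwsync, 0x7c2004ac as lwsync and 0x7c4004ac as ptesync.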

Correct a comment. Should have been detected by ipf_nat_in() not
ipf_nat_out().

MFC after: 1 week
X-MFC-with: r337558

Makefile.inc1: Add libl to -legacy as well

libl is needed for config(8), which is a bootstrap-tool. It is possible to
build a system WITHOUT_TOOLCHAIN to exclude lex and thus, libl. We still
need to support building from this kind of host, though.

While here, group the config(8) dependencies together and add a small
explanation. These can likely both be scoped more clearly, but this will
need some further investigation.

Reported by: rgrimes (not WITHOUT_TOOLCHAIN, but provoked investigation)
MFC after: immediately

Identify the return value (rval) that led to the IPv4 NAT failure
in ipf_nat_checkout() and report it in the frb_natv4out and frb_natv4in
dtrace probes.

This is currently being used to diagnose NAT failures in PR/208566. It's
rather handy so this commit makes it available for future diagnosis and
debugging efforts.

PR: 208566
MFC after: 1 week

Rename head from -CURRENT to -ALPHA1 as part of the
12.0-RELEASE cycle. This commit marks the start of
the code slush for the 12.0 cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

Invoke the growfs rc script for each boot on GCE.

PR: 230275
Submitted by: gustavo.scalet@collabora.com
MFC after: 3 days
Sponsored by: The FreeBSD Foundation

Update and replace old rc daemons for GCE images.

PR: 229000
Submitted by: helen.koike@collabora.com
MFC after: 3 days
Sponsored by: The FreeBSD Foundation

cam(4): Add an xpt-neutral flag indicating a valid panic CCB

No functional change.

Note that this change is careful to set the CCB header xflags after
foo_fill_bar() routines, which generally zero existing flags. An earlier
version of this patch mistakenly set the flag before the fill routines.

Submitted by: Scott Ferris <sferris AT isilon.com>, jhibbits@
Reviewed by: bdrewery@, markj@, and non-committer FreeBSD contributor Anton Rang
Sponsored by: Dell EMC Isilon

cxgbe(4): Add a sysctl to control the tx credit reclaim mechanism for
netmap tx queues. There is no change in default behavior.

Sponsored by: Chelsio Communications

Add optional LLVM BPF target support

BPF (eBPF) is an independent instruction set architecture that was
introduced in Linux a few years ago. Originally, the eBPF execution
environment existed only inside the Linux kernel. In recent years,
however, some user space implementations have appeared
(https://github.com/iovisor/ubpf,
https://doc.dpdk.org/guides/prog_guide/bpf_lib.html), and a kernel space
implementation for FreeBSD is in progress
(https://github.com/YutaroHayakawa/generic-ebpf).

The BPF target support can be enabled using WITH_LLVM_TARGET_BPF, as it
is not built by default.

Submitted by: Yutaro Hayakawa <yhayakawa3720@gmail.com>
Reviewed by: dim, bdrewery
Differential Revision: https://reviews.freebsd.org/D16033

cam_ccb.h: Remove redundant declarations of static inline functions

No functional change.

They're unnecessarily confusing for tools like grep or ctags.

Sponsored by: Dell EMC Isilon

cxgbe(4): Set fl_pktshift to 0 by default.

Sponsored by: Chelsio Communications

libnv: Remove -I${SRCTOP}/sys

This should have been done as part of r336019 -- including ${SRCTOP}/sys is
not a good business model for something that's built in legacy/bootstrap
stages.

Beyond that, libnv seems to build quite alright as legacy, part of
buildworld, and standalone without. Axe it.

Reported by: truckman (head building stable/11)
Tested by: Shawn Webb (HardenedBSD)
MFC after: 3 days

subr_prf: style(9) the sizeof

Reported by: jkim, ian

Account for the lowmem handlers in the inactive queue scan target.

Before r329882 the target would be computed after lowmem handlers run
and free pages.  On some systems a significant amount of page
reclamation happens this way.  However, with r329882 the target is
computed first, which can lead to unnecessary reclamation from the
page cache, and this in turn may result in excessive swapping.

Instead, adjust the target after running lowmem handlers.  Don't
invoke the lowmem handlers before the PID controller, though, since
that would hide the true rate of page allocation.

Reviewed by: alc, kib (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16606

subr_prf: Use "sizeof current_boot_tag" instead

BOOT_TAG: Make a config(5) option, expose as sysctl and loader tunable

BOOT_TAG lived briefly in sys/msgbuf.h, but this wasn't necessarily great
for changing it or removing it. Move it into subr_prf.c and add options for
it to opt_printf.h.

One can specify both the BOOT_TAG and BOOT_TAG_SZ (really, size of the
buffer that holds the BOOT_TAG). We expose it as kern.boot_tag and also add
a loader tunable by the same name that we'll fetch upon initialization of
the msgbuf.

This allows for flexibility and also ensures that there's a consistent way
to figure out the boot tag of the running kernel, rather than relying on
headers to be in-sync.

Prodded super-super-lightly by: imp

msgbuf: Light detailing (const'ify and bool'itize)

Correct default path of kernel modules.

cxgbe(4): Display pkt-size and burst-size in traffic class parameters.

cxgbetool(8): Userspace part of support for high priority filters on T6+.

MFC after: 1 week
Sponsored by: Chelsio Communications

cxgbe(4): Add support for high priority filters on T6+. They have their
own region in the TCAM starting with T6, unlike previous chips where
they were in the same region as normal filters.

These filters "hit" before anything else in the LE's lookup. The exact
order is:
a) High priority filters
b) TOE's active region (TCAM and/or hash)
c) Servers (TOE hw listeners)
d) Normal filters

MFC after: 1 week
Sponsored by: Chelsio Communications

[ppc] Fix kernel panic when using BOOTP_NFSROOT

On PowerPC (and possibly other architectures that don't use
EARLY_AP_STARTUP), the config task queue may be used uninitialized.
This was observed while trying to mount the root fs from NFS, as
reported here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230168.

This patch has 2 main changes:
1- Perform a basic initialization of qgroup_config, similar to
what is done in taskqgroup_adjust, but simpler.
This makes qgroup_config ready to be used during NFS root mount.

2- When EARLY_AP_STARTUP is not used, call inm_init() and
in6m_init() right before SI_SUB_ROOT_CONF, because bootp needs
to send multicast packets to request an IP.

PR: Bug 230168
Reported by: sbruno
Reviewed by: jhibbits, mmacy, sbruno
Approved by: jhibbits
Differential Revision: D16633

If -q flag is specified, do not complain when we are trying to delete
nonexistent NAT instance or nonexistent rule.

This allows batched `delete` commands to be executed without failing when
a nonexistent rule is encountered.

Obtained from: Yandex LLC
MFC after: 2 weeks
Sponsored by: Yandex LLC

Use NULLs instead of casted zeroes, for consistency.

MFC after: 2 weeks
Sponsored by: DARPA, AFRL

Refactor common code into execute_script().

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16627

Import CK as of commit 08813496570879fbcc2adcdd9ddc0a054361bfde, mostly
to avoid using lwsync on ppc32.

Make ldconfig(8) atomic, by removing an unnecessary call to unlink(2)
before rename(2).
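
The pattern can be sketched as follows. This is not ldconfig's code: replace_file and its parameters are hypothetical, and the temporary file is assumed to live on the same file system as the target so that rename(2) can replace it in one step.

```c
#include <stdio.h>

/*
 * Hypothetical helper: replace 'path' with new contents so that readers
 * see either the old file or the new one, never a missing file.
 */
static int
replace_file(const char *path, const char *tmppath, const char *data)
{
	FILE *fp = fopen(tmppath, "w");

	if (fp == NULL)
		return (-1);
	if (fputs(data, fp) == EOF) {
		fclose(fp);
		remove(tmppath);
		return (-1);
	}
	if (fclose(fp) != 0) {
		remove(tmppath);
		return (-1);
	}
	/* No unlink(path) first: rename() replaces the target in one step. */
	if (rename(tmppath, path) != 0) {
		remove(tmppath);
		return (-1);
	}
	return (0);
}
```

Calling unlink(2) on the target before the rename would open a window in which the file does not exist at all; dropping that call closes the window.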

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16641

Implement missing atomic_fcmpset_XXX() support for i386.

This also fixes i386 build after r337527.

MFC after: 1 week
Sponsored by: Mellanox Technologies

add an option for ddb ps command to print process arguments

We use ps to collect the information of all processes in textdump. But
it doesn't contain process arguments which however sometimes are very
useful for debugging. The new 'a' modifier adds that capability.

While here, remove 'm' modifier from ddb.4. It was in the manual page
from its very first revision, but I could not find any evidence of the
code ever supporting it.

Submitted by: Terry Hu <thu@panzura.com>
Reviewed by: kib
MFC after: 1 week
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D16603

Use atomic_fcmpset_XXX() instead of atomic_cmpset_XXX() when possible
in the LinuxKPI.
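
The convenience of the fcmpset form is that a failed attempt writes the value it observed back into the caller's "expected" variable, so a retry loop needs no separate reload. The sketch below shows the same loop shape with C11 atomic_compare_exchange_weak() as an analog; it does not use FreeBSD's atomic(9) functions, and bounded_add is a hypothetical example function.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically add 'delta' unless the counter would exceed 'max'. */
static int
bounded_add(_Atomic uint32_t *ctr, uint32_t delta, uint32_t max)
{
	uint32_t old = atomic_load(ctr);

	do {
		if (delta > max || old > max - delta)
			return (0);	/* would overflow the bound */
		/* On failure 'old' is refreshed, so no explicit reload. */
	} while (!atomic_compare_exchange_weak(ctr, &old, old + delta));
	return (1);
}
```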

Suggested by: mjg @
MFC after: 1 week
Sponsored by: Mellanox Technologies

Follow up to r333195, add US MacBook/MacBook Pro keyboard support.
Tested by: rcyu

epoch_block_wait: don't check TD_RUNNING

struct epoch_thread is not type safe (stack allocated) and thus cannot be dereferenced from another CPU

Reported by: novel@

libi386: Fix typo in pxe.h

PR: 207337
Submitted by: Tony Narlock <tony@git-pull.com>
MFC after: 1 week

libsa: exit on EOF in ngets

It was possible in some rare circumstances for ngets to behave terribly with
bhyveload and some form of redirecting user input over a pipe.

PR: 198706
Submitted by: Ivan Krivonos <int0dster@gmail.com>
MFC after: 1 week

In read_zones(), check that the file name actually fits in the buffer
and make sure it is NUL-terminated by using strlcpy().
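
A truncation check of this kind can be sketched portably. The example below is a stand-in, not the read_zones() change: it uses snprintf(3), which like strlcpy() always NUL-terminates a non-empty buffer and returns the length the full string would have needed; copy_name is a hypothetical name.

```c
#include <stdio.h>

/* Returns 0 on success, -1 if 'src' does not fit in 'dst' with its NUL. */
static int
copy_name(char *dst, size_t dstsize, const char *src)
{
	int n = snprintf(dst, dstsize, "%s", src);

	/* A result >= dstsize means the copy was truncated. */
	return ((n < 0 || (size_t)n >= dstsize) ? -1 : 0);
}
```

With strlcpy() the equivalent test is whether its return value is greater than or equal to the buffer size.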

Reviewed by: imp (earlier revision)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16595

isoboot, gptboot: Fix WITHOUT_LOADER_GELI (gptboot) and isoboot in general

gptboot was broken when r316078 added the LOADER_GELI_SUPPORT #ifdef to
not pass geliargs via __exec. KARGS_FLAGS_EXTARG must not be used if we're
not going to pass an additional argument to __exec.

PR: 228151
Submitted by: guyyur@gmail.com
MFC after: 1 week

kern: Add a BOOT_TAG marker at the beginning of boot dmesg

From the "newly licensed to drive" PR department, add a BOOT_TAG marker (by
default, --<<BOOT>>--) to the beginning of each boot's dmesg. This makes it
easier to do textproc magic to locate the start of each boot and, of
particular interest to some, the dmesg of the current boot.

The PR has a dmesg(8) component as well that I've opted not to include for
the moment- it was the more contentious part of this PR.

bde@ also made the statement that this boot tag should be written with an
ordinary printf, which I've- for the moment- declined to change about this
patch to keep it more transparent to observers of the boot process.

PR: 43434
Submitted by: dak <aurelien.nephtali@wanadoo.fr> (basically rewritten)
MFC after: maybe never
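
A minimal sketch of the kind of text processing the marker enables, assuming
the message buffer has already been read into a NUL-terminated string.
last_boot() is a hypothetical helper, not part of the kernel or dmesg(8); it
returns a pointer to the last occurrence of the tag, which is where the
current boot's messages begin.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the start of the last occurrence of tag in
 * msgbuf, or NULL if the tag never appears.
 */
static const char *
last_boot(const char *msgbuf, const char *tag)
{
	const char *p, *last = NULL;

	assert(msgbuf != NULL && tag != NULL && tag[0] != '\0');
	for (p = strstr(msgbuf, tag); p != NULL; p = strstr(p + 1, tag))
		last = p;
	return (last);
}
```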

Fix a typo plus add a couple of sentences to pnfsserver.4.

This is a content change.

Terminate filter_create_ext() args with NULL, not 0.

filter_create_ext() is documented to take a NULL terminated set of
arguments. A bare 0 is passed as an int, so this would fail on 64-bit
systems if the value were not passed in a register. On all currently
supported 64-bit architectures it is.

Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
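
To illustrate the sentinel issue in general terms: a variadic callee that
reads pointers until it sees a null pointer fetches a pointer-sized value for
the terminator, while a bare 0 in the call is passed as an int. The sketch
below is not filter_create_ext(); count_args() is a hypothetical function
written only to show the calling convention.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical variadic function: count string arguments up to the
 * terminating null pointer.
 */
static int
count_args(const char *first, ...)
{
	va_list ap;
	const char *s;
	int n = 0;

	va_start(ap, first);
	/* Every variadic argument, including the terminator, is read as a pointer. */
	for (s = first; s != NULL; s = va_arg(ap, char *))
		n++;
	va_end(ap);
	return (n);
}
```

A portable call is count_args("a", "b", (char *)NULL). Writing 0 for the last
argument passes an int where the callee reads a pointer, which is undefined
behaviour even where it happens to work.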

ls(1): Enable colors when COLORTERM is set in the environment

COLORTERM is the de facto standard, while CLICOLOR is generally specific to
FreeBSD and ls(1).

PR: 230101
Submitted by: D Green <dfrg@xsmail.com> (with manpage additions by myself)
Reviewed by: cem ("LGTM" in PR; pre-manpage changes)
MFC after: 1 week

dd: add status=progress support

This reports the current status on a single line every second, mirroring
similar functionality in GNU dd, and carefully interacts with SIGINFO.

PR: 229615
Submitted by: Thomas Hurst <tom@hur.st> (modified for style(9) nits by me)
MFC after: 1 week

apply(1): Fix magic number substitution with magic character ' '

Using a space as the magic character would result in problems if the command
started with a number:

- For a 'valid' number n, n < size of argv, it would erroneously get
replaced with that argument; e.g. `apply -a ' ' -d 1rm x` => `execxrm x`

- For an 'invalid' number n, n >= size of argv, it would segfault.
e.g. `apply -a ' ' 2to3 test.py` would try to access argv[2]

This problem occurred because apply(1) would prepend "exec " to the command
string before doing the actual magic number replacements, so it would come
across "exec 2to3 1" and assume that the " 2" is also a magic number to be
replaced.

Re-work this to instead just append "exec " to the command sbuf and
work around the ugliness. This also simplifies stuff in the process.

PR: 226948
Submitted by: Tobias Stoeckmann <tobias@stoeckmann.org>
MFC after: 1 week
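
The ordering problem can be sketched as follows. This is not the apply(1)
source: build_command() is a hypothetical function that uses a fixed buffer
instead of an sbuf and assumes single-digit magic numbers. It writes "exec "
to the output first and then scans only the user's command, so the space in
"exec " is never treated as a magic character, and it rejects an out-of-range
number instead of indexing past the argument list.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: expand <magic><digit> in cmd using args[digit - 1].
 * Returns 0 on success, -1 on overflow or an out-of-range digit.
 */
static int
build_command(char *out, size_t outsize, const char *cmd, char magic,
    const char *const *args, int nargs)
{
	size_t len;
	int n;

	assert(out != NULL && cmd != NULL && args != NULL);
	len = (size_t)snprintf(out, outsize, "exec ");
	if (len >= outsize)
		return (-1);
	for (; *cmd != '\0'; cmd++) {
		if (cmd[0] == magic && isdigit((unsigned char)cmd[1]) &&
		    cmd[1] != '0') {
			n = cmd[1] - '0';
			if (n > nargs)
				return (-1);	/* do not read past args */
			len += (size_t)snprintf(out + len, outsize - len,
			    "%s", args[n - 1]);
			cmd++;			/* skip the digit */
		} else {
			len += (size_t)snprintf(out + len, outsize - len,
			    "%c", *cmd);
		}
		if (len >= outsize)
			return (-1);
	}
	return (0);
}
```

With ' ' as the magic character and the command "2to3 1", the leading "2" is
left alone because nothing precedes it in the scanned string.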

powerpc64/powernv: re-read RTC after polling

If OPAL_RTC_READ is busy and returns OPAL_BUSY_EVENT instead of the requested
information on the first call, the system will crash, since the ymd and hmsm
variables will contain junk values.

This is happening because we were not calling OPAL_RTC_READ again after
OPAL_POLL_EVENTS' return, which would finally replace the old/junk hmsm and ymd
values.

The code was also mixing OPAL_RTC_READ and OPAL_POLL_EVENTS return values.

This patch fixes this logic and guarantees that we call OPAL_RTC_READ after
OPAL_POLL_EVENTS returns, and that the code will only proceed if
OPAL_RTC_READ returns OPAL_SUCCESS.

Reviewed by: jhibbits
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D16617
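
The corrected control flow can be sketched roughly as below. This is not the
powernv driver code: the status codes, the function-pointer types, the
max_tries bound and the fake firmware functions are all hypothetical and
exist only so the loop can be exercised in userland. The two points it shows
are that the read is repeated after every poll and that only the read's
return value decides whether ymd and hmsm may be used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; the real OPAL values are not reproduced here. */
enum { SK_SUCCESS, SK_BUSY_EVENT, SK_ERROR };

typedef int (*rtc_read_fn)(uint32_t *ymd, uint64_t *hmsm);
typedef void (*poll_events_fn)(void);

static int
rtc_read_retry(rtc_read_fn rtc_read, poll_events_fn poll_events,
    uint32_t *ymd, uint64_t *hmsm, int max_tries)
{
	int i, rv = SK_ERROR;

	assert(rtc_read != NULL && poll_events != NULL);
	for (i = 0; i < max_tries; i++) {
		rv = rtc_read(ymd, hmsm);	/* re-read after each poll */
		if (rv != SK_BUSY_EVENT)
			break;
		poll_events();			/* its result is not mixed into rv */
	}
	return (rv);	/* caller uses ymd/hmsm only on SK_SUCCESS */
}

/* Fake firmware for demonstration: busy on the first call, then succeeds. */
static int fake_calls;

static int
fake_rtc_read(uint32_t *ymd, uint64_t *hmsm)
{
	if (++fake_calls < 2)
		return (SK_BUSY_EVENT);	/* outputs left untouched */
	*ymd = 0x20180810;
	*hmsm = 0;
	return (SK_SUCCESS);
}

static void
fake_poll_events(void)
{
}
```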

Fix the err() arguments for a nfssvc(8) failure.

argv has been incremented during argument handling, so elements of the
array are no longer valid. Change the err() arguments so only the
first string pointer in argv is used.
Found during code inspection.

Assorted fixes to handling of LayoutRecall callbacks, mostly error handling.

After a re-read of the appropriate section of RFC5661, I decided that a
few things should be changed related to LayoutRecall callback handling.
Here are the things fixed by this patch.
- For two of the three cases in which LayoutRecall is done, I now think
  setting the clora_changed argument false is correct.
- All errors other than NFSERR_DELAY returned by LayoutRecall appear
  permanent, so don't retry for any of them. (NFSERR_DELAY is retried by
  newnfs_request(), so it is not affected by this patch.)
- Instead of waiting "forever" (actually until the process is SIGTERM'd)
  for Layouts to be returned during a mirror copy, fail and return
  ENXIO after about 1 minute.
  Waiting for a <ctrl>C made sense when pnfsdscopymr() was done by itself,
  but did not make sense when done via find(1).
This patch only affects the pNFS server.

Use the right variable when updating interface routes.

PR: 229807
Submitted by: John Hay <jhay@meraka.org.za>
MFC after: 2 weeks

Switch the default pager for most commands to less

Finally, a pager for the nineties.

MFC after: Never
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/D13465
Poll: https://reviews.freebsd.org/V7