
Simplify VM and UMA startup by eliminating boot pages.  Instead use careful
ordering to allocate early pages in the same way boot pages were but only
as needed.  After the KVA allocator has started up we allocate the KVA that
we consumed during boot.  This also makes the boot pages freeable since they
have vm_page structures allocated with the rest of memory.

Parts of this patch were written and tested by markj.

Reviewed by: glebius, markj
Differential Revision: https://reviews.freebsd.org/D23102

[PPC64] memcpy/memmove/bcopy optimization

For copies shorter than 512 bytes, the data is copied using plain
ld/std instructions.
For 512 bytes or more, the copy is done in 3 phases:

Phase 1: copy from the src buffer until it's aligned at a 16-byte boundary
Phase 2: copy as many aligned 64-byte blocks from the src buffer as possible
Phase 3: copy the remaining data, if any

In phase 2, this code uses VSX instructions when available. Otherwise,
it uses ldx/stdx.
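
The three phases can be modelled in portable C. This is only a sketch of the
control flow described above, not the PPC64 assembly: the names phased_copy,
copy_bytes, SMALL_COPY_LIMIT, BLOCK_SIZE and SRC_ALIGN are made up for
illustration, and memcpy() stands in for the VSX or ldx/stdx block copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the values come from the description above. */
#define SMALL_COPY_LIMIT 512
#define BLOCK_SIZE       64
#define SRC_ALIGN        16

/* Byte-wise copy, standing in for the plain ld/std path. */
static void
copy_bytes(unsigned char *dst, const unsigned char *src, size_t len)
{
	while (len-- > 0)
		*dst++ = *src++;
}

/* Illustrative model of the copy; buffers must not overlap. */
void *
phased_copy(void *dstv, const void *srcv, size_t len)
{
	unsigned char *dst = dstv;
	const unsigned char *src = srcv;
	size_t head, blocks;

	assert(dst != NULL && src != NULL);
	if (len < SMALL_COPY_LIMIT) {
		copy_bytes(dst, src, len);
		return (dstv);
	}

	/* Phase 1: advance until src sits on a 16-byte boundary. */
	head = (SRC_ALIGN - ((uintptr_t)src % SRC_ALIGN)) % SRC_ALIGN;
	copy_bytes(dst, src, head);
	dst += head;
	src += head;
	len -= head;

	/* Phase 2: copy as many whole 64-byte blocks as possible. */
	for (blocks = len / BLOCK_SIZE; blocks > 0; blocks--) {
		memcpy(dst, src, BLOCK_SIZE);
		dst += BLOCK_SIZE;
		src += BLOCK_SIZE;
		len -= BLOCK_SIZE;
	}

	/* Phase 3: copy the remaining data (fewer than 64 bytes). */
	copy_bytes(dst, src, len);
	return (dstv);
}
```

Only the source is aligned in phase 1, matching the description; the
destination may remain unaligned in this model.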

Submitted by: Luis Pires <lffpires_ruabrasil.org> (original version)
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D15118

[PPC64] strncpy optimization

Assembly optimization of strncpy for PowerPC64, using double words
instead of bytes to copy strings.

Submitted by: Leonardo Bianconi <leonardo.bianconi_eldorado.org.br> (original version)
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D15369

[PPC64] strcpy optimization

Assembly optimization of strcpy for PowerPC64, using double words
instead of bytes to copy strings.
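
A rough C illustration of copying by doublewords follows. It is not the
committed assembly. The helper names has_zero_byte and dword_strcpy are
hypothetical, and the sketch assumes the source string lives in a buffer
whose size is a multiple of 8 bytes so that a whole doubleword load never
reads past it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Nonzero iff at least one of the eight bytes in w is 0x00. */
static int
has_zero_byte(uint64_t w)
{
	return (((w - ONES) & ~w & HIGHS) != 0);
}

/*
 * Copy a NUL-terminated string one doubleword at a time.
 * Assumption for this sketch: src is padded to a multiple of 8 bytes
 * and dst has room for the string and its terminator.
 */
char *
dword_strcpy(char *dst, const char *src)
{
	char *d = dst;
	uint64_t w;

	assert(dst != NULL && src != NULL);
	for (;;) {
		memcpy(&w, src, sizeof(w));	/* one 8-byte load */
		if (has_zero_byte(w))
			break;
		memcpy(d, &w, sizeof(w));	/* one 8-byte store */
		d += sizeof(w);
		src += sizeof(w);
	}
	/* The terminator is in this doubleword: finish byte by byte. */
	while ((*d++ = *src++) != '\0')
		;
	return (dst);
}
```

The gain comes from testing eight bytes for a terminator with a handful of
arithmetic operations instead of eight separate compares.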

Submitted by: Leonardo Bianconi <leonardo.bianconi_eldorado.org.br> (original version)
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D15368

acpi_ibm: reference ThinkPad instead of IBM

These are now Lenovo ThinkPads, not IBM ThinkPads.

PR: 234403
Submitted by: Kevin Zheng <kevinz5000@gmail.com> (original)

Peter Holm reports that his test, which does a umount(8) on an active
mount point while numerous tests are writing to files on that mount
point, causes the umount(8) to hang forever.

The unmount(2) system call is handled in the kernel by the dounmount()
function. The cause of the hang is that prior to dounmount() calling
VFS_UNMOUNT() it is calling VFS_SYNC(mp, MNT_WAIT). The MNT_WAIT
flag indicates that VFS_SYNC() should not return until all the dirty
buffers associated with the mount point have been written to disk.
Because user processes are allowed to continue writing and can do
so faster than the data can be written to disk, the call to VFS_SYNC()
can never finish.

Unlike VFS_SYNC(), the VFS_UNMOUNT() routine can suspend all processes
when they request to do a write thus having a finite number of dirty
buffers to write that cannot be expanded. There is no need to call
VFS_SYNC() before calling VFS_UNMOUNT(), because VFS_UNMOUNT() needs
to flush everything again anyway after suspending writes, to catch
anything that was dirtied between the VFS_SYNC() and writes being
suspended.

The fix is to simply remove the unnecessary call to VFS_SYNC() from
dounmount().

Reported by:  Peter Holm
Analysis by:  Chuck Silvers
Tested by:    Peter Holm
MFC after:    7 days
Sponsored by: Netflix

Fix a spacing error from the previous commit for -ll mode. Add a little
more space padding to that mode to give the columns a consistent offset.

mips trampoline: don't bother with unwind tables

The utility here seems somewhat limited, but clang will attempt to generate
.eh_frame and actively fail in doing so. It is perhaps worth investigating
why it's being generated in the first place (GCC doesn't do so), but this
isn't a high priority.

Handle a NULL thread pointer in linux_close_file().

This can happen if a file is closed during unix socket GC. The same bug
was fixed for devfs descriptors in r228361.

PR: 242913
Reported and tested by: iz-rpi03@hs-karlsruhe.de
Reviewed by: hselasky, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23178

Update WITHOUT_BINUTILS* descriptions

In the WITHOUT_ descriptions we don't need to mention that ld.bfd is
limited to powerpc. When WITHOUT_BINUTILS is specified ld.bfd is not
installed on any CPU architecture.

bsdinstall: Change "default" (first) Partitioning method to ZFS

Reported by: Ruben Schade (during his talk at linux.conf.au)
Approved by: philip
Differential Revision: https://reviews.freebsd.org/D23173

gif_transmit() must always be called in the network epoch.

A miss from r356754.

Introduce NET_EPOCH_CALL() macro and use it everywhere where we free
data based on the network epoch. The macro reverses the argument
order of epoch_call(9) - first function, then its argument. NFC
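
The argument reversal can be pictured with a userland stand-in. Everything
here except the macro's argument order is an assumption for illustration:
the types are stubs, the epoch_call() stub runs the callback immediately
instead of deferring it, and struct entry, entry_free and release_entry are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel types; names mirror epoch(9) but are stubs. */
struct epoch_context { int unused; };
typedef struct epoch_context *epoch_context_t;
typedef void epoch_callback_t(epoch_context_t);
struct epoch { int calls; };
typedef struct epoch *epoch_t;

static struct epoch net_epoch_storage;
static epoch_t net_epoch_preempt = &net_epoch_storage;

/* Stub: the real epoch_call(9) defers the callback; this runs it at once. */
static void
epoch_call(epoch_t epoch, epoch_context_t ctx, epoch_callback_t *callback)
{
	assert(epoch != NULL && callback != NULL);
	epoch->calls++;
	callback(ctx);
}

/* Function first, then its argument: the reverse of the stub above. */
#define NET_EPOCH_CALL(f, c) epoch_call(net_epoch_preempt, (c), (f))

/* Hypothetical object released through the macro. */
struct entry {
	struct epoch_context ctx;	/* first member, so casts are valid */
	int freed;
};

static void
entry_free(epoch_context_t ctx)
{
	struct entry *e = (struct entry *)ctx;

	e->freed = 1;
}

int
release_entry(struct entry *e)
{
	NET_EPOCH_CALL(entry_free, &e->ctx);
	return (e->freed);
}
```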

Use official macro to enter/exit the network epoch. NFC

Mechanically substitute assertion of in_epoch(net_epoch_preempt) to
NET_EPOCH_ASSERT(). NFC

Stop header pollution and don't include if_var.h via in_pcb.h.

Since this code dereferences struct ifnet, it must include if_var.h
explicitly, not via header pollution. While here move TCPSTATES
declaration right above the include that is going to make use of it.

Since this code uses if_ref()/if_rele() it must include if_var.h
explicitly, not via header pollution.

Netgraph queue processing thread must process all its items
in the network epoch.

Reported by: Michael Zhilin <mizhka@ >

- Move global network epoch definition to epoch.h, as more different
  subsystems tend to need to know about it, and including if_var.h is
  huge header pollution for them.  Polluting possible non-network
  users with single symbol seems much lesser evil.
- Remove non-preemptible network epoch.  Not used yet, and unlikely
  to get used in the near future.

The non-preemptible network epoch identified by net_epoch isn't used.
This code definitely meant net_epoch_preempt.

vfs: in vop_stdadd_writecount only vlazy vnodes on mounts using msync

The only reason to vlazy there is to (overzealously) ensure all vnodes
which need to be visited by msync scan can be found there.

In particular this is of no use to zfs and tmpfs.

While here depessimize the check.

tmpfs: add missing CTLFLAG_MPSAFE annotation

nfs: add missing CTLFLAG_MPSAFE annotations

fusefs: add missing CTLFLAG_MPSAFE annotation

rtld: remove hand rolled memset and bzero

They were introduced to take care of ifunc, but right now no architecture
provides ifunc'ed variants. Since rtld uses memset extensively this results in
a pessimization. Should someone want to use ifunc here they should provide a
mandatory symbol (e.g., rtld_memset).

See the review for profiling data.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D23176

bsdinstall: Use TMPDIR if set

Submitted by: Ryan Moeller <ryan@freqlabs.com>
Reviewed by: bcran, Nick Wolff <darkfiberiru@gmail.com>
Differential Revision: https://reviews.freebsd.org/D22979/

When sync'ing a mount point, the mount point's vnodes were scanned
twice. Once to update the changed inodes, and a second time to update
changed quota information. This change merges these two scans into a
single scan which does both inode and quota updates.

MFC after: 7 days

src.conf.5: regen after r356736, limiting ld.bfd to powerpc

Preserve the inherited value of the status register in cpu_set_upcall().

Instead of re-deriving the value of SR using logic similar to
exec_set_regs(), just inherit the value from the existing thread
similar to fork().

Reviewed by: brooks
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D23059

limit ld.bfd to powerpc

All archs except powerpc either use lld or require external toolchain.
powerpc still needs binutils ld to link 32-bit binaries.

Reviewed by: jhibbits
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23107

Revert r353140: Re-add ALLOW_MIPS_SHARED_TEXTREL, sprinkle it around

arichardson has an actual fix for the same issue that this was working
around; given that we don't build with llvm today, go ahead and revert the
workaround in advance.

src.conf.5: regen after option description updates

Update WITH_/WITHOUT_CLANG_IS_CC descriptions

Describe /usr/bin/cc etc. as links to the compiler, and don't conflate
WITHOUT_CLANG_IS_CC with installing GCC. Leave a reference to WITH_GCC
and WITHOUT_CLANG_IS_CC installing links to GCC, although this will be
removed in ~1.5 months when GCC 4.2.1 is removed from the tree.

Sponsored by: The FreeBSD Foundation

Update WITH_AMD description reflecting upcoming removal

In-tree amd(8) is deprecated; update WITH_AMD's description to make
this more clear.

Sponsored by: The FreeBSD Foundation

Do not skip line-by-line comparison if -q and -I are specified.

This fixes a regression from r356695.

Submitted by: kevans
Reported by: Jenkins via lwhsu
MFC after: 6 days

storvsc: port a Linux patch, properly set residual data length on errors

This change is based on Linux commit 40630f462824ee. csio.resid should
account for transfer_len only for success and SRB_STATUS_DATA_OVERRUN
condition.

I am not sure how exactly this change works, but I have a report from a
user that they see lots of checksum errors when running a pool scrub
concurrently with iozone -l 1 -s 100G. After applying this patch the
problem cannot be reproduced.

Reviewed by: nobody
Sponsored by: CyberSecure
Differential Revision: https://reviews.freebsd.org/D22312

Make linux(4) use kern_setsockopt(9) instead of going through
sys_setsockopt. Just a cleanup; no functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22812

Make linux(4) use kern_getsockopt(9) instead of going through
sys_getsockopt(). It's a cleanup; no functional changes.

Reviewed by: kib (earlier version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22813

Make linux getcpu(2) report the domain.

Submitted by: markj
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23144

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case.
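
A minimal illustration of the convention, using open(2). The function name
open_errno is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open path read-only.  Returns 0 on success or the errno value
 * on failure.  The failure test is "== -1", and errno is read only then.
 */
int
open_errno(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (errno);
	(void)close(fd);
	return (0);
}
```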

Obtained from: OpenBSD
MFC after: 3 days

asprintf returns -1, not an arbitrary value < 0. Also, upon error the
(very sloppy) specification leaves an undefined value in *ret, so it is
wrong to inspect it; the error condition is enough.

Obtained from: OpenBSD
MFC after: 3 days

mkstemp returns -1

Obtained from: OpenBSD
MFC after: 3 days

Restore loop break in vm_pageout_lowmem().

r355004 removed the return statement from this loop with the intention to
also call uma_reclaim_wakeup().  But in the case of vm.lowmem_period=0 it
causes an infinite loop.

Reviewed by: markj
Sponsored by: iXsystems, Inc.

uma: split slabzone into two sizes

By allowing more items per slab, we can improve memory efficiency for
small allocs.  If we were just to increase the bitmap size of the
slabzone, we would then waste slabzone memory.  So, split slabzone into
two zones, one especially for 8-byte allocs (512 per slab).  The
practical effect should be reduced memory usage for counter(9).

Reviewed by: jeff, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23149

malloc: remove assumptions about MINALLOCSIZE

Remove assumptions about the minimum MINALLOCSIZE, in order to allow
testing of smaller MINALLOCSIZE. A following patch will lower the
MINALLOCSIZE, but not so much that the present patch is required for
correctness at these sites.

Reviewed by: jeff, markj
Sponsored by: Dell EMC Isilon

uma: fixup some ktr messages

Reviewed by: markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23148

Fix a long standing bug in journaled soft-updates.  The dirrem structure
needs to handle file removal, directory removal, file move, directory move,
etc.  The code in handle_workitem_remove() needs to propagate any completed
journal entries to the write that will render the change stable.  In the
case of a moved directory this means the new parent.  However, for an
overwrite that frees a directory (DIRCHG) we must move the jsegdep to the
removed inode to be released when it is stable in the cg bitmap or the
unlinked inode list.  This case was previously unhandled and caused a
panic.

Reported by: mckusick, pho
Reviewed by: mckusick
Tested by: pho

cxgbe/iw_cxgbe: Do not allow memory registrations with page size greater
than 128MB, which is the maximum supported by the hardware in RDMA mode.

Obtained from: Chelsio Communications
MFC after: 3 days
Sponsored by: Chelsio Communications

powerpc/mpc85xx: Partially revert r356640

The count block was correct before. r356640 caused a read past the end of
the tuple.

fstyp hammer2: remove dead code

best_i will always be >= 0, so remove code to test otherwise.

Reported by: Coverity
CID: 1412244
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23159

fstyp hammer: use strlcpy

Use strlcpy to guarantee NUL termination. Due to this, there is
no need for strncmp; simply use strcmp.
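
Why the bounded compare becomes unnecessary can be seen from strlcpy's
contract. The sketch below re-implements that contract under the
hypothetical name sketch_strlcpy so that it builds on systems whose libc
lacks strlcpy(3); it is not the fstyp code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, writing at most dsize bytes including the NUL.
 * Returns strlen(src); a result >= dsize means the copy was truncated.
 */
size_t
sketch_strlcpy(char *dst, const char *src, size_t dsize)
{
	size_t srclen, n;

	assert(dst != NULL && src != NULL);
	srclen = strlen(src);
	if (dsize > 0) {
		n = srclen < dsize - 1 ? srclen : dsize - 1;
		memcpy(dst, src, n);
		dst[n] = '\0';	/* always terminated when dsize > 0 */
	}
	return (srclen);
}
```

Because the destination is always NUL-terminated, a plain strcmp() on it
cannot run off the end of the buffer.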

Reported by: Coverity
CID: 1412242
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23159

Map ECKSUM and EFRAGS from ZFS onto real errnos.

Make it less confusing when, for example, stat sets errno to 122 because a
checksum failed in ZFS:

Before: getfacl: /foo/bar: stat() failed: Unknown error: 122
After: getfacl: /foo/bar: stat() failed: Integrity check failed

Submitted by: Ryan Moeller <ryan@ixsystems.com>
Reviewed by: mckusick, mav
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D22973

savecore: include time zone in info.N file

This helps with event correlation when machines are distributed
across multiple time zones.

Format the time with relaxed ISO 8601 for all the usual reasons.
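
One way to produce such a timestamp is strftime(3) with a numeric zone
offset. The exact format string used by savecore is not given in this log,
so the one below is an assumption, and format_dump_time is a hypothetical
name.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Format tm as date, time, and numeric UTC offset, for example
 * "YYYY-MM-DD HH:MM:SS +hhmm".  Returns the length written, or 0 if
 * the buffer is too small.
 */
size_t
format_dump_time(char *buf, size_t buflen, const struct tm *tm)
{
	assert(buf != NULL && tm != NULL);
	return (strftime(buf, buflen, "%Y-%m-%d %H:%M:%S %z", tm));
}
```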

MFC after: 2 weeks
Sponsored by: Dell EMC Isilon

Add missing comma in nfsv4_errstr

Reported by: Coverity
CID: 1412243
Sponsored by: Dell EMC Isilon

netmap: disable passthrough with no hypervisor support

The netmap passthrough subsystem requires proper support in the
hypervisor. In particular, two PCI device ids (from the Red Hat
PCI vendor id 0x1b36) need to be assigned to the two netmap
virtual devices. We therefore disable these devices until the ids have
been assigned, in order to avoid conflicts with other
virtual devices emulated by upstream QEMU.

PR: 241774
MFC after: 3 days

vmx: fix initialization of TSO related descriptor fields

Fix a mistake introduced by r343291, which ported the vmx(4)
driver to iflib.
In case of TSO, the hlen field of the (first) tx descriptor must
be initialized to the cumulative length of Ethernet, IP and TCP
headers. The length of the TCP header was missing.

PR: 236999
Reported by: pkelsey
Reviewed by: avg
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22967

Fix yet another regression from r354484. Error code from cr_cansee()
aliases with hard error from other operations.

Reported by: flo

Merge commit f46ba4f07 from llvm git (by Simon Atanasyan):

  [mips] Use less registers to load address of TargetExternalSymbol

  There is no pattern matched `add hi, (MipsLo texternalsym)`. As a
  result, loading an address of 32-bit symbol requires two registers
  and one more additional instruction:
  ```
  addiu $1, $zero, %lo(foo)
  lui   $2, %hi(foo)
  addu  $25, $2, $1
  ```

  This patch adds the missed pattern and enables generation more
  effective set of instructions:
  ```
  lui   $1, %hi(foo)
  addiu $25, $1, %lo(foo)
  ```

  Differential Revision: https://reviews.llvm.org/D66771

  llvm-svn: 370196

Merge commit 59bb3609f from llvm git (by Simon Atanasyan):

  [mips] Fix 64-bit address loading in case of applying 32-bit mask to
  the result

  If result of 64-bit address loading combines with 32-bit mask, LLVM
  tries to optimize the code and remove "redundant" loading of upper
  32-bits of the address. It leads to incorrect code on MIPS64 targets.

  MIPS backend creates the following chain of commands to load 64-bit
  address in the `MipsTargetLowering::getAddrNonPICSym64` method:
  ```
  (add (shl (add (shl (add %highest(sym), %higher(sym)),
                      16),
                 %hi(sym)),
            16),
       %lo(%sym))
  ```

  If the mask presents, LLVM decides to optimize the chain of commands.
  It really does not make sense to load upper 32-bits because the
  0x0fffffff mask anyway clears them. After removing redundant commands
  we get this chain:
  ```
  (add (shl (%hi(sym), 16), %lo(%sym))
  ```

  There is no patterns matched `(MipsHi (i64 symbol))`. Due a bug in
  `SYM_32` predicate definition, backend incorrectly selects a pattern
  for a 32-bit symbols and uses the `lui` instruction for loading
  `%hi(sym)`.

  As a result we get incorrect set of instructions with unnecessary
  16-bit left shifting:
  ```
  lui     at,0x0
      R_MIPS_HI16     foo
  dsll    at,at,0x10
  daddiu  at,at,0
      R_MIPS_LO16     foo
  ```

  This patch resolves two problems:
  - Fix `SYM_32/SYM_64` predicates to prevent selection of patterns
    dedicated to 32-bit symbols in case of using N64 ABI.
  - Add missed patterns for 64-bit symbols for `%hi/%lo`.

  Fix PR42736.

  Differential Revision: https://reviews.llvm.org/D66228

  llvm-svn: 370268

These two commits fix a miscompilation of the kernel for mips64, and
should allow clang to be used as the default compiler for mips64.

Requested by: arichards
MFC after: 3 days

Backout 356693. The libsa malloc does provide necessary alignment and
memalign by 4 will reduce alignment for some platforms. Thanks to Ian for
pointing this out.

Optimize diff -q.

Once we know whether the files differ, we don't need to do any further
work.
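
The idea reduces to returning at the first difference. The sketch below
compares two streams byte by byte, which is a simplification of what
diff(1) does, and streams_differ is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Return 1 as soon as the streams are known to differ, 0 if identical. */
int
streams_differ(FILE *a, FILE *b)
{
	int ca, cb;

	assert(a != NULL && b != NULL);
	do {
		ca = getc(a);
		cb = getc(b);
		if (ca != cb)
			return (1);	/* first difference: stop reading */
	} while (ca != EOF);
	return (0);
}
```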

PR: 242828
Submitted by: fehmi noyan isi <fnoyanisi@yahoo.com> (original version)
Reviewed by: bapt, kevans
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23152

tap(4): also note that we drop configured addresses

This provides a specific pointer for users of tap(4) to understand why their
interfaces are losing their addresses, and specifically how to work around
this if they need different behavior.

This manpage received a .Dd bump earlier today in r35688, so no bump occurs
this time.

Submitted by: sigsys@gmail.com (via IRC)

loader: allocate properly aligned buffer for network packet

Use memalign(4, size) to ensure we have properly aligned buffer.

MFC after: 2 weeks

Install tap(4) manpage as vmnet(4) as well

If one comes across a vmnet interface, this is a useful pointer to have
towards what it actually is if they're otherwise unfamiliar.

MFC after: 3 days

gprof: Enable riscv

Add a missing riscv.h header file, and fix the check for riscv (must test
MACHINE_CPUARCH, not MACHINE_ARCH, if we want to use 'riscv').

Sponsored by: Axiado

Fix a typo.

MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (netgate.com)

Ensure the TYPE, BRANCH, and REVISION variables are set in
cloudware targets when OSRELEASE is overridden.

Submitted by: Trond Endrestol
PR: 243287
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (netgate.com)

src.conf.5: regen after r356615, KERBEROS_SUPPORT dep on KERBEROS

ufs: relax an overzealous assert added in r356671

Part of i_flag can persist across a drop to hold count of 0, at which
point the vnode is taken off the lazy list. Then whoever locks and unlocks
the vnode can trip on the assert.

This trips over kyua running a test untarring character devices to ufs.

Reported by: lwhsu

Code must not unlock a mutex while owning the thread lock.

Reviewed by: hselasky, markj
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23150

Sync with r356645. desiredvnodes is now maxvnodes.

As of r356642 desiredvnodes is u_long.

Unbound's config.h is manually maintained, using a ./configure produced
config.h as a guide. In practice contributed software maintains a copy
of config.h within its build directory tree containing its Makefile.
usr.sbin/unbound is the home for its config.h.

MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22983

RISC-V: fix global symbol lookups for mpentry with lld

This is a follow up to r356481. In locore.S, before virtual memory is
set up, we should avoid using indirect address lookups through the GOT.
Therefore we need to convert uses of the la instruction to lla, which
always generates an auipc/addi pair of instructions. This conversion was
done for the BSP case, but not the AP case, resulting in a fault
somewhere before mpva and a failure to bring APs online.

Reported by: lwhsu
Reviewed by: lwhsu, jrtc27 (accepted in a comment)
Differential Revision: https://reviews.freebsd.org/D23138

vfs: per-cpu batched requeuing of free vnodes

Constant requeuing adds significant lock contention in certain
workloads. Lessen the problem by batching it.

Per-cpu areas are locked in order to synchronize against UMA freeing
memory.

vnode's v_mflag is converted to short to prevent the struct from
growing.

Sample result from an incremental make -s -j 104 bzImage on tmpfs:
stock: 122.38s user 1780.45s system 6242% cpu 30.480 total
patched: 144.84s user 985.90s system 4856% cpu 23.282 total

Reviewed by: jeff
Tested by: pho (in a larger patch, previous version)
Differential Revision: https://reviews.freebsd.org/D22998

vfs: rework vnode list management

The current notion of an active vnode is eliminated.

Vnodes transition between 0<->1 hold counts all the time and the
associated traversal between different lists induces significant
scalability problems in certain workloads.

Introduce a global list containing all allocated vnodes. They get
unlinked only when UMA reclaims memory and are only requeued when
hold count reaches 0.

Sample result from an incremental make -s -j 104 bzImage on tmpfs:
stock: 118.55s user 3649.73s system 7479% cpu 50.382 total
patched: 122.38s user 1780.45s system 6242% cpu 30.480 total

Reviewed by: jeff
Tested by: pho (in a larger patch, previous version)
Differential Revision: https://reviews.freebsd.org/D22997

ufs: use lazy list instead of active list for syncer

Quota code is temporarily regressed to do a full vnode scan.

Reviewed by: jeff
Tested by: pho (in a larger patch, previous version)
Differential Revision: https://reviews.freebsd.org/D22996

vfs: add per-mount vnode lazy list and use it for deferred inactive + msync

This obviates the need to scan the entire active list looking for vnodes
of interest.

msync is handled by adding all vnodes with write count to the lazy list.

deferred inactive directly adds vnodes as it sets the VI_DEFINACT flag.

Vnodes get dequeued from the list when their hold count reaches 0.

Newly added MNT_VNODE_FOREACH_LAZY* macros support filtering so that
spurious locking is avoided in the common case.

Reviewed by: jeff
Tested by: pho (in a larger patch, previous version)
Differential Revision: https://reviews.freebsd.org/D22995

ufs: add a setter for inode i_flag field

This will be used later to add vnodes to the lazy list.

Reviewed by: kib (previous version), jeff
Tested by: pho (in a larger patch)
Differential Revision: https://reviews.freebsd.org/D22994

Fix a typo in r356667 comment

No functional change.

Reported by: bdragon
Approved by: csprng(markm), earlier version
X-MFC-With: r356667

getrandom(2): Add Linux GRND_INSECURE API flag

Treat it as a synonym for GRND_NONBLOCK.  The reasoning is this:

We have two choices for handling Linux's GRND_INSECURE API flag.

1. We could ignore it completely (like GRND_RANDOM).  However, this might
produce the surprising result of GRND_INSECURE requests blocking, when the
Linux API does not block.

2. Alternatively, we could treat GRND_INSECURE requests as requests for
GRND_NONBLOCK.  Here, the surprising result for Linux programs is that
invocations with unseeded random(4) will produce EAGAIN, rather than
garbage.

Honoring the flag in the way Linux does seems fraught.  If we actually use
the output of a random(4) implementation prior to seeding, we leak some
entropy (in an information theory and also practical sense) from what will
be the initial seed to attackers (or allow attackers to arbitrarily DoS
initial seeding, if we don't leak).  This seems unacceptable -- it defeats
the purpose of blocking on initial seeding.

Secondary to that concern, before seeding we may have arbitrarily little
entropy collected; producing output from zero or a handful of entropy bits
does not seem particularly useful to userspace.

If userspace can accept garbage, insecure, non-random bytes, they can create
their own insecure garbage with srandom(time(NULL)) or similar.  Any program
which would be satisfied with a 3-bit key CTR stream has no need for CSPRNG
bytes.  So asking the kernel to produce such an output from the secure
getrandom(2) API seems inane.

For now, we've elected to emulate GRND_INSECURE as an alternative spelling
of GRND_NONBLOCK (2).  Consider this API not-quite stable for now.  We
guarantee it will never block.  But we will attempt to monitor actual port
uptake of this bizarre API and may revise our plans for the unseeded
behavior (prior to stable/13 branching).

Approved by: csprng(markm), manpages(bcr)
See also: https://lwn.net/ml/linux-kernel/cover.1577088521.git.luto@kernel.org/
See also: https://lwn.net/ml/linux-kernel/20200107204400.GH3619@mit.edu/
Differential Revision: https://reviews.freebsd.org/D23130

Fix the way 'factor' behaves when using OpenSSL to match the description
of how it works when not compiled with OpenSSL.

Also, allow users to specify a hexadecimal number by using a prefix of
'0x'. Before this, users could only specify a hexadecimal value if that
value included a hex digit ('a'-'f') in the value.
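
The input rule described above can be sketched with strtoull(3). This is
not the factor(6) source, which may use different routines, and
parse_number is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a decimal or hexadecimal number; 0 on success, -1 on error. */
int
parse_number(const char *str, unsigned long long *out)
{
	const char *p;
	char *end;
	int base = 10;

	assert(str != NULL && out != NULL);
	if (str[0] == '0' && (str[1] == 'x' || str[1] == 'X')) {
		base = 16;	/* explicit 0x prefix */
	} else {
		for (p = str; *p != '\0'; p++) {
			if (isxdigit((unsigned char)*p) &&
			    !isdigit((unsigned char)*p)) {
				base = 16;	/* a-f seen: must be hex */
				break;
			}
		}
	}

	errno = 0;
	*out = strtoull(str, &end, base);
	if (errno != 0 || end == str || *end != '\0')
		return (-1);
	return (0);
}
```

With the prefix rule, "0x10" parses as 16, while a bare "10" still parses
as decimal 10.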

PR: 243136
Submitted by: Steve Kargl
Reviewed by: gad
MFC after: 3 weeks

Fix race when accepting TCP connections.

When expanding a SYN-cache entry to a socket/inp a two step approach was
taken:
1) The local address was filled in, then the inp was added to the hash
table.
2) The remote address was filled in and the inp was relocated in the
hash table.
Before the epoch changes, a write lock was held when this happens and
the code looking up entries was holding a corresponding read lock.
Since the read lock is gone away after the introduction of the
epochs, the half populated inp was found during lookup.
This resulted in processing TCP segments in the context of the wrong
TCP connection.
This patch changes the above procedure in a way that the inp is fully
populated before inserted into the hash table.

Thanks to Paul <devgs@ukr.net> for reporting the issue on the net@
mailing list and for testing the patch!

Reviewed by: rrs@
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D22971

nd6_rtr: consistently use __func__ for nd6log()

Over time one or two hard coded function names did not match the
actual function anymore. Consistently use __func__ for nd6log() calls
and re-wrap/re-format some messages for consistency.

MFC after: 2 weeks

nd6_rtr: make nd6_prefix_onlink() static

nd6_prefix_onlink() is not used anywhere outside nd6_rtr.c. Stop
exporting it and make it file local static.

Fix division by zero issue.

Thanks to Stas Denisov for reporting the issue for the userland stack
and providing a fix.

MFC after: 3 days

Add kern_getpriority(), make Linuxulator use it.

Reviewed by: kib, emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22842

Add kern_setpriority(), use it in Linuxulator.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22841

Tighten FAT checks and fix off-by-one error in corner case.

sbin/fsck_msdosfs/fat.c:
- readfat:
    * Only truncate out-of-range cluster pointers (1, or greater than
      NumClusters but smaller than CLUST_RSRVD), as the current cluster
      may contain some data. We can't fix reserved cluster pointers at
      this pass, because we do not know the potential cluster preceding
      it.
    * Accept valid cluster for head bitmap. This is a no-op, and mainly
      to improve code readability, because the 1 is already handled in
      the previous else if block.
- truncate_at: absorbed into checkchain.
- checkchain: save the previous node we have traversed in case that we
   have a chain that ends with a special (>= CLUST_RSRVD) cluster, or is
   free. In these cases, we need to truncate at the cluster preceding the
   current cluster, as the current cluster contains a marker instead of
   a next pointer and can not be changed to CLUST_EOF (the else case can
   happen if the user answered "no" at some point in readfat()).
- clearchain: correct the iterator for next cluster so that we don't
   stop after clearing the first cluster.
- checklost: If checkchain() thinks the chain has no cluster, it
   doesn't make sense to reconnect it, so don't bother asking.

Reviewed by: kevlo
MFC after: 24 days
X-MFC-With: r356313
Differential Revision: https://reviews.freebsd.org/D23065

Add "panicked" boolean which can be tested instead of panicstr

The test is performed all the time and reading entire panicstr to do it
wastes space.

Add KERNEL_PANICKED macro for use in place of direct panicstr tests
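
A userland sketch of the pattern follows. The globals here are stand-ins
for the kernel's, the macro body is an assumption (the kernel version may
add a branch-prediction hint), and record_panic and should_skip_locking
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userland stand-ins for the kernel globals named in the log. */
static const char *panicstr;	/* message; NULL until a panic happens */
static bool panicked;		/* one-byte flag, cheap to test */

/* Hot paths test the flag through the macro instead of reading panicstr. */
#define KERNEL_PANICKED() (panicked)

/* Record the panic message and raise the flag. */
void
record_panic(const char *msg)
{
	assert(msg != NULL);
	panicstr = msg;
	panicked = true;
}

/* Example of a frequently executed check that only needs the flag. */
int
should_skip_locking(void)
{
	return (KERNEL_PANICKED() ? 1 : 0);
}
```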

sysctl: add missing CTLFLAG_MPSAFE annotation to CONST_STRING

vm: add missing CTLFLAG_MPSAFE annotations

This covers all vm/* files.

dtrace: add missing CTLFLAG_MPSAFE annotations

zfs: add missing CTLFLAG_MPSAFE annotations

Makefile.inc1: push /usr/libexec into the BPATH/TMPPATH

${WORLDTMP}/legacy/usr/libexec will only have libexec/ bits that we've
pushed as bootstrap tools, so this is generally safe to include prior to
PATH. The following are the ramifications of this change:

- BPATH addition gets us at least bootstrap flua in WMAKEENV path for
  buildenv, for those earlier systems where it's bootstrapped still

- Reworked the sysent target to just set PATH and let it get worked out in
  src.lua.mk or individual sysent makefiles -- this gives us back the
  ability to overwrite LUA_CMD and use a different/external lua for these
  targets.  sysent can also now work cleanly in buildenv.

- tools/build/Makefile will now symlink the host flua into build's host
  tools so that the above can work without needing to add the host's
  /usr/libexec explicitly into TMPPATH.

Reviewed by: arichardson, brooks, imp (all slightly earlier version)
Differential Revision: https://reviews.freebsd.org/D22464

regulator: small enhancements to regulator_shutdown

Highlights:

- Exit early if we're not disabling unused regulators; there's no need to
  take the regulator topology lock and re-evaluate this every iteration, as
  it's not going to change.
- Don't emit a notice that we're shutting down a regulator if it's not
  enabled, to reduce noise.
- Mention the outcome of the shutdown, to aid debugging and easily let
  developer/user collect list of regulators we actually shutdown to
  determine problematic one.

Reviewed by: manu
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D22213

vfs: only recalculate watermarks when limits are changing

Previously they would get recalculated all the time, in particular in:
getnewvnode -> vcheckspace -> vspace

vfs: deduplicate vnode allocation logic

This creates a dedicated routine (vn_alloc) to allocate vnodes.

As a side effect code duplication with getnewvnode_reserve is eliminated.

Add vn_free for symmetry.