CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

amd64 pmap_pkru_same: prev_ppr was always NULL

Fix the logic so it works as it appears.

Reported by: Coverity
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: D26211 (in progress, so omitting full URL)

Install library symlinks atomically.

As we do for shared library binaries, pass -S to install(1) when
installing symlinks. Doing so helps avoid transient failures when
libraries are being reinstalled, which seems to be the root cause of
spurious libgcc_s.so link failures during CI builds.

PR: 233769
Reviewed by: emaste
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26453

sys/contrib/dev/ath: remove unintentional double semicolon

Approved by: adrian

build: provide a default WARNS for all in-tree builds

The current default is provided in various Makefile.inc in some top-level
directories and covers a good portion of the tree, but doesn't cover parts
of the build a little deeper (e.g. libcasper).

Provide a default in src.sys.mk and set WARNS to it in bsd.sys.mk if that
variable is defined. This lets us relatively cleanly provide a default WARNS
no matter where you're building in the src tree without breaking things
outside of the tree.

Crunchgen has been updated as a bootstrap tool to work on this change
because it needs r365605 at a minimum to succeed. The cleanup necessary to
successfully walk over this change on WITHOUT_CLEAN builds has been added.

There is a supplemental project to this to list all of the warnings that are
encountered when the environment has WARNS=6 NO_WERROR=yes:
https://warns.kevans.dev -- this project will hopefully eventually go away
in favor of CI doing a much better job than it.

Reviewed by: emaste, brooks, ngie (all earlier version)
Reviewed by: emaste, arichardson (depend-cleanup.sh change)
Differential Revision: https://reviews.freebsd.org/D26455

vm_ooffset_t is now unsigned

vm_ooffset_t is now unsigned. Remove some tests for negative values,
or make other adjustments accordingly.

Reported by: Coverity
Reviewed by: kib markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D26214
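
Illustration only, not code from this commit: a minimal C sketch of why
tests for negative values become dead code once an offset type is
unsigned, and what a bounds check can look like instead. The typedef and
the demo_range_ok() helper are hypothetical stand-ins, not kernel
definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an unsigned object offset type. */
typedef uint64_t demo_ooffset_t;

/*
 * With an unsigned type, "off < 0" is always false, so the useful
 * question is whether [off, off + len) fits inside the object without
 * the addition wrapping around.
 */
bool
demo_range_ok(demo_ooffset_t off, demo_ooffset_t len, demo_ooffset_t size)
{
	if (off > size)
		return (false);
	/* Same meaning as off + len <= size, but cannot overflow. */
	return (len <= size - off);
}
```

A call such as demo_range_ok(UINT64_MAX, 2, 10) is rejected by the first
comparison instead of relying on a sign test.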

arm64: generate ISO release images

Some IPMI implementations on arm64 are reportedly unable to load our
memstick installer images, but support the older ISO format. Start
generating these for arm64.

Unlike installer ISOs for other platforms, these images are UEFI-only.

Reviewed by: emaste
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26452

pkgbase: use consistent annotation for object keys

Everywhere else we use objects ("scripts", generally) we do specify the
optional colon. Be consistent and do the same for directories.

PR: 249273
Submitted by: Martin <martin.jakob gmx com>
MFC after: 1 week

Remove unnecessary include "../Makefile.inc"

This is already pulled in by bsd.init.mk.

Reported By: kevans

Initialize some local variables earlier

Move the initialization of these variables to the beginning of their
respective functions.

On our end this creates a small amount of unneeded churn, as these
variables are properly initialized before their first use in all cases.
However, changing this benefits at least one downstream consumer
(NetApp) by allowing local and future modifications to these functions
to be made without worrying about where the initialization occurs.

Reviewed by: melifaro, rscheff
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26454

Add pargs, penv, pwdx commands and aliases to procstat(1).

Intent is to mimic Solaris commands with the same names.

Submitted by: Juraj Lutter <juraj@lutter.sk>
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26310

Assert we are not traversing through superpages in the arm64 pmap.

Reviewed by: alc, andrew
MFC after: 1 week
Sponsored by: Juniper Networks, Inc., Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26465

Ensure that a protection key is selected in pmap_enter_largepage().

Reviewed by: alc, kib
Reported by: Coverity
MFC with: r365518
Differential Revision: https://reviews.freebsd.org/D26464

Fix error checking in shm_create_largepage().

Reviewed by: alc, kib
Reported by: Coverity
MFC with: r365524
Differential Revision: https://reviews.freebsd.org/D26464

libarchive: fix mismatch between library and test configuration

I was investigating libarchive test failures on CheriBSD and it turns out
we get a reproducible SIGBUS for test_archive_md5, etc. Debugging this shows
that libarchive and the tests disagree when it comes to the definition of
archive_md5_ctx: libarchive assumes it's the OpenSSL type whereas the tests
use the libmd type. The latter is not necessarily aligned enough to store
a pointer (16 bytes for CHERI RISC-V), so we were crashing when storing
EVP_MD_CTX* to an 8-byte-aligned archive_md5_ctx.

To avoid problems like this in the future, factor out the common compiler
flags into a Makefile.inc and include that from the tests Makefile.

Reviewed By: lwhsu
Differential Revision: https://reviews.freebsd.org/D26469
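
Illustration only, and of the failure mode rather than of the Makefile
fix: a minimal C sketch showing how two translation units that pick
different definitions for the same context type can disagree on whether
the storage may hold a pointer. The struct layout and all demo_* names
are hypothetical; they are not the libmd or OpenSSL definitions.

```c
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical stand-in for a digest context made of 32-bit words. */
struct demo_libmd_ctx {
	unsigned int	state[4];
	unsigned int	count[2];
	unsigned char	buffer[64];
};

/* Hypothetical stand-in for a context that is just an opaque pointer. */
typedef void *demo_openssl_ctx;

/* True if storage of this type is big and aligned enough for a pointer. */
#define	DEMO_CAN_HOLD_POINTER(type)				\
	(sizeof(type) >= sizeof(void *) &&			\
	    alignof(type) >= alignof(void *))

/* What the library would see if built against the pointer type. */
int
demo_library_view_ok(void)
{
	return (DEMO_CAN_HOLD_POINTER(demo_openssl_ctx));
}

/* What a test would see if built against the word-based struct. */
int
demo_test_view_ok(void)
{
	return (DEMO_CAN_HOLD_POINTER(struct demo_libmd_ctx));
}
```

On platforms where pointers need stricter alignment than unsigned int,
the second check fails, which is the kind of mismatch that sharing one
set of compiler flags is meant to rule out.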

crypto_buffer(9): Bring back the reference for bus_dma(9)

The reference was accidentally deleted in r365855.

Reported by: jhb
Pointy hat to: gbe

Use atf_fail instead of exit 1 to indicate mpath tests failure.

Fix byte-reversal of language ID in string descriptor.

The language ID of the string descriptors in the USB mouse was
0x0904, while the spec requires 0x0409 (English - United States).

Submitted by: Wanpeng Qian
Reviewed by: grehan
Approved by: grehan (#bhyve)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D26472
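
Illustration only, not the bhyve code: a minimal C sketch of building
string descriptor zero with a single LANGID. USB multi-byte fields are
little-endian, so 0x0409 appears in the buffer as 0x09 followed by 0x04.
The demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define	DEMO_USB_DT_STRING	0x03	/* bDescriptorType: STRING */
#define	DEMO_LANGID_EN_US	0x0409	/* English (United States) */

/*
 * Fill buf with a string descriptor zero that advertises one LANGID.
 * Returns the descriptor length, or 0 if buf is too small.
 */
size_t
demo_build_langid_desc(uint8_t *buf, size_t buflen, uint16_t langid)
{
	if (buflen < 4)
		return (0);
	buf[0] = 4;				/* bLength */
	buf[1] = DEMO_USB_DT_STRING;		/* bDescriptorType */
	buf[2] = (uint8_t)(langid & 0xff);	/* low byte first */
	buf[3] = (uint8_t)(langid >> 8);	/* high byte second */
	return (4);
}
```

Writing the two bytes in the opposite order yields 0x0904 on the wire,
which is the reversal described above.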

cxgbe(4): add the firmware binaries instead of the empty files that were added
in r365861.

Obtained from: Chelsio Communications
MFC after: 3 days
Sponsored by: Chelsio Communications

cxgbe(4): add support for stateless offloads for VXLAN traffic.

Hardware assistance includes checksumming (tx and rx), TSO, and RSS on
the inner traffic in a VXLAN tunnel.

Relnotes: Yes
Sponsored by: Chelsio Communications

if_vxlan(4): add support for hardware assisted checksumming, TSO, and RSS.

This lets a VXLAN pseudo-interface take advantage of hardware checksumming (tx
and rx), TSO, and RSS if the NIC is capable of performing these operations on
inner VXLAN traffic.

A VXLAN interface inherits the capabilities of its vxlandev interface if one is
specified or of the interface that hosts the vxlanlocal address. If other
interfaces will carry traffic for that VXLAN then they must have the same
hardware capabilities.

On transmit, if_vxlan verifies that the outbound interface has the required
capabilities and then translates the CSUM_ flags to their inner equivalents.
This tells the hardware ifnet that it needs to operate on the inner frame and
not the outer VXLAN headers.

An event is generated when a VXLAN ifnet starts. This allows hardware drivers to
configure their devices to expect VXLAN traffic on the specified incoming port.

On receive, the hardware does RSS and checksum verification on the inner frame.
if_vxlan now does a direct netisr dispatch to take full advantage of RSS. It is
not very clear why it didn't do this already.

Future work:
Rx: it should be possible to avoid the first trip up the protocol stack to get
the frame to if_vxlan just so it can decapsulate and requeue for a second trip
up the stack. The hardware NIC driver could directly call an if_vxlan receive
routine for VXLAN traffic instead.

Rx: LRO. Depends on what happens with the previous item. There will have to
be a mechanism to indicate that it's time for if_vxlan to flush its LRO state.

Reviewed by: kib@
Relnotes: Yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25873
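
Illustration only, not the if_vxlan code: a minimal C sketch of
translating outer checksum-offload requests into their inner
equivalents on transmit. The flag names and values below are made up
for the example (the real CSUM_* bits live in sys/mbuf.h), and setting
an encapsulation marker alongside the inner flags is an assumption of
this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define	DEMO_CSUM_IP		0x0001u
#define	DEMO_CSUM_IP_UDP	0x0002u
#define	DEMO_CSUM_IP_TCP	0x0004u
#define	DEMO_CSUM_INNER_IP	0x0100u
#define	DEMO_CSUM_INNER_IP_UDP	0x0200u
#define	DEMO_CSUM_INNER_IP_TCP	0x0400u
#define	DEMO_CSUM_ENCAP_VXLAN	0x1000u

static const struct {
	uint32_t outer;
	uint32_t inner;
} demo_csum_map[] = {
	{ DEMO_CSUM_IP,		DEMO_CSUM_INNER_IP },
	{ DEMO_CSUM_IP_UDP,	DEMO_CSUM_INNER_IP_UDP },
	{ DEMO_CSUM_IP_TCP,	DEMO_CSUM_INNER_IP_TCP },
};

/*
 * Map each requested outer offload to its inner counterpart so that
 * the NIC operates on the encapsulated frame, not the tunnel headers.
 */
uint32_t
demo_csum_to_inner(uint32_t flags)
{
	uint32_t out = 0;
	size_t i;

	for (i = 0; i < sizeof(demo_csum_map) / sizeof(demo_csum_map[0]); i++) {
		if ((flags & demo_csum_map[i].outer) != 0)
			out |= demo_csum_map[i].inner;
	}
	if (out != 0)
		out |= DEMO_CSUM_ENCAP_VXLAN;	/* assumed tunnel marker */
	return (out);
}
```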

Add a knob to allow zero UDP checksums for UDP/IPv6 traffic on the given UDP port.

This will be used by some upcoming changes to if_vxlan(4).  RFC 7348 (VXLAN)
says that the UDP checksum "SHOULD be transmitted as zero.  When a packet is
received with a UDP checksum of zero, it MUST be accepted for decapsulation."
But the original IPv6 RFCs did not allow zero UDP checksum.  RFC 6935 attempts
to resolve this.

Reviewed by: kib@
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25873
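
Illustration only, not the kernel implementation: a minimal C sketch of
a per-port policy that accepts a zero UDP checksum over IPv6 only where
the knob has been set. The struct, its field names, and the function
are hypothetical; port 4789 is the IANA-assigned VXLAN port.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port policy record. */
struct demo_udp_port {
	uint16_t port;
	bool	 allow_zero_csum6;	/* the "knob" for this port */
};

/*
 * Decide whether a received UDP/IPv6 datagram passes the checksum
 * stage. csum is the checksum field from the UDP header; csum_valid is
 * the result of verifying a nonzero checksum.
 */
bool
demo_udp6_csum_accept(const struct demo_udp_port *p, uint16_t csum,
    bool csum_valid)
{
	if (csum == 0)
		return (p->allow_zero_csum6);
	return (csum_valid);
}
```

With this shape, ordinary UDP/IPv6 sockets keep rejecting zero
checksums, and only a tunnel port that opted in accepts them.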

Add two new ifnet capabilities for hw checksumming and TSO for VXLAN traffic.

These are similar to the existing VLAN capabilities.

Reviewed by: kib@
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25873

mbuf checksum flags and fields to support tunneling protocols.

These are being added to support VXLAN but will work for GENEVE as well.
ENCAP_RSVD1 will likely become ENCAP_GENEVE in the future.

The size of struct mbuf does not change and that means this change can be MFC'd.
If size wasn't a constraint a cleaner way may have been to add inner_csum_flags
and inner_csum_data to go with csum_flags and csum_data.

Reviewed by: kib@
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25873

State kgssapi dependency on xdr.

Submitted by: Dmitry Afanasiev
PR: 249378
MFC after: 3 days

cxgbe(4): Update T4/5/6 firmwares to 1.25.0.0.

Obtained from: Chelsio Communications
MFC after: 3 days
Sponsored by: Chelsio Communications

arch(7): Some markup fixes

- no blank before trailing delimiter

MFC after: 3 days

man(9): Some markup fixes

- whitespace at end of input line
- skipping paragraph macro: Pp after Sh

MFC after: 3 days

pwmbus(9): some markup fixes

- whitespace at end of input line

MFC after: 3 days

mbuf(9): Some markup fixes

- whitespace at end of input line
- no blank before trailing delimiter: Dv MJUM16BYTES

MFC after: 3 days

crypto_buffer(9): Sort the SEE ALSO section

MFC after: 3 days

VOP_INACTIVE(9): Remove trailing whitespace

MFC after: 3 days

domainset(9): Some markup fixes

- new sentence, new line
- whitespace at end of input line

MFC after: 3 days

Revert r361257: bsdinstall: do a `certctl rehash` upon installation [...]

As of r365829, any given base distribution set will now include the /etc/ssl
symlinks that this rehash would've otherwise installed. This extra step is
no longer required.

MFC after: 1 week
X-MFC-With: r365837

rmlock(9): Some markup fixes

- new sentence, new line

MFC after: 3 days

bus_dma(9): Some markup fixes

- new sentence, new line
- no blank before trailing delimiter
- whitespace at end of input line

MFC after: 3 days

Merge commit 46673763f from llvm git (by Craig Topper):

  [X86] Place new constant node in topological order in
  X86DAGToDAGISel::matchBitExtract

  Fixes PR47482

This should fix 'Assertion failed: (Op->getNodeId() != -1 && "Node has
already selected predecessor node"), function DoInstructionSelection,
file
/usr/src/contrib/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp,
line 1149' when compiling part of the project_painter project, while
targeting the bdver2 (or higher) CPU.

Reported by: jkim
MFC after: 6 weeks
X-MFC-With: r364284

Merge commit e09107ab8 from llvm git (by Raul Tambre):

  [Sema] Introduce BuiltinAttr, per-declaration builtin-ness

  Instead of relying on whether a certain identifier is a builtin,
  introduce BuiltinAttr to specify a declaration as having builtin
  semantics.

  This fixes incompatible redeclarations of builtins, as reverting the
  identifier as being builtin due to one incompatible redeclaration
  would have broken rest of the builtin calls.
  Mostly-compatible redeclarations of builtins also no longer have
  builtin semantics. They don't call the builtin nor inherit their
  attributes.
  A long-standing FIXME regarding builtins inside a namespace enclosed
  in extern "C" not being recognized is also addressed.

  Due to the more correct handling attributes for builtin functions are
  added in more places, resulting in more useful warnings.
  Tests are updated to reflect that.

  Intrinsics without an inline definition in intrin.h had `inline` and
  `static` removed as they had no effect and caused them to no longer
  be recognized as builtins otherwise.

  A pthread_create() related test is XFAIL-ed, as it relied on it being
  recognized as a builtin based on its name.
  The builtin declaration syntax is too restrictive and doesn't allow
  custom structs, function pointers, etc.
  It seems to be the only case and fixing this would require reworking
  the current builtin syntax, so this seems acceptable.

  Fixes PR45410.

  Reviewed By: rsmith, yutsumi

  Differential Revision: https://reviews.llvm.org/D77491

This should fix 'Assertion failed: (i < getNumParams() && "Illegal
param #"), function getParamDecl, file
/usr/src/contrib/llvm-project/clang/include/clang/AST/Decl.h, line 2430'
when building the graphics/pgplot port.

Note that there may also have been other ports which triggered this
assertion, if they redeclare standard functions with incompatible
arguments.

Reported by: zeising
MFC after: 6 weeks
X-MFC-With: r364284

makefs: connect cd9660 El Torito EFI boot image system type

Sponsored by: The FreeBSD Foundation

Cirrus-CI: build as an unprivileged user

The Cirrus-CI-provided working tree is owned by root. Leave that as is
for simplicity but build as an unprivileged user; this tests building
with an unmodifiable source tree as a side effect.

Continue running the smoke test as root for now, as it failed when run
as an unprivileged user - pkg reported "Fail to chmod
/usr/bin/.pkgtemp.lpq.dUHpEqPGJ9pq:Operation not permitted"

Sponsored by: The FreeBSD Foundation

Fix additional memory leak in process_mapfile

Additional Coverity detected memory leak fix.

Submitted by: bret_ketchum@dell.com
Reported by: Coverity
Reviewed by: cem, emaste
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D26462

Remove no longer used variable.

Pointy hat to: jhb
Reported by: kevans
MFC after: 1 week

Move to a more robust and conservative allocation scheme for devctl messages

Change the zone setup:
- Allow slabs to be returned to the OS
- Set the number of slots to the max devctl will queue before discarding
- Reserve 2% of the max (capped at 100) for low memory allocations
- Disable per-cpu caching since we don't need it and we avoid some pathologies

Change the allocation strategy a bit:
- If a normal allocation fails, try to get the reserve
- If a reserve allocation fails, re-use the oldest-queued entry for storage
- If there's a weird race/failure and nothing on the queue to steal, return NULL

This addresses two main issues in the old code:
- If devd had died, and we're generating a lot of messages, we have an
unbounded leak. This new scheme avoids the issue that led to this.
- The MPASS that was 'sure' the allocation couldn't have failed turned out
to be wrong in some rare cases. The new code doesn't make this assumption.

Since we reserve only 2% of the space, we go from about 1MB of
allocation all the time to more like 50kB for the reserve.

Reviewed by: markj@
Differential Revision: https://reviews.freebsd.org/D26448
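
Illustration only, not the devctl code: a minimal C sketch of the
fallback order described above, using plain counters in place of a real
allocator zone. All demo_* names are hypothetical.

```c
#include <stddef.h>

/* Reserve 2% of the queue maximum, capped at 100 entries. */
size_t
demo_reserve_slots(size_t max)
{
	size_t r = max * 2 / 100;

	return (r > 100 ? 100 : r);
}

enum demo_src { DEMO_NORMAL, DEMO_RESERVE, DEMO_STOLEN, DEMO_NONE };

struct demo_pool {
	size_t normal_free;	/* slots available to normal allocations */
	size_t reserve_free;	/* slots held back for low-memory use */
	size_t queued;		/* messages waiting to be read */
};

/* Try the normal pool, then the reserve, then steal the oldest entry. */
enum demo_src
demo_alloc(struct demo_pool *p)
{
	if (p->normal_free > 0) {
		p->normal_free--;
		return (DEMO_NORMAL);
	}
	if (p->reserve_free > 0) {
		p->reserve_free--;
		return (DEMO_RESERVE);
	}
	if (p->queued > 0) {
		p->queued--;	/* oldest queued entry gives up its storage */
		return (DEMO_STOLEN);
	}
	return (DEMO_NONE);	/* caller must cope with failure */
}
```

The last branch is the important design point: the caller has to handle
a failed allocation instead of asserting that it cannot happen.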

Remove support for setting some obscure fields.

Don't permit setting the exception bitmap or VMCS entry interrupt
information. These are not generally useful to set. If it is needed
in the future, dedicated pseudo registers can be added for these that
would be used with vm_set_register().

Discussed with: grehan
MFC after: 1 week

Increase the default vm.max_user_wired value.

Since r347532 (merged to stable/12) we only count user-wired pages
towards the system limit. However, we now also treat pages wired by
hypervisors (bhyve and virtualbox) as user-wired, so starting VMs with
large amounts of RAM tends to fail due to the low limit.

The purpose of the limit is to provide a seatbelt, not to impose some
policy on the use of wired memory. Thus, increase the default limit to
allow reasonable VM configurations to work without tuning.

Reviewed by: kib
Discussed with: dougm
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26424

Add some basic regression tests for SHM_LARGEPAGE.

Discussed with: kib
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D25900

Promote the installworld `certctl rehash` to distributeworld

Contrary to my belief, installworld is not sufficient for getting certs
installed into VM images. Promote the rehash to both installworld and
distributeworld (notably: not stageworld) and rehash the base distdir so we
end up with /etc/ssl/certs populated in the base dist archive. A future
commit will remove the rehash from bsdinstall, which doesn't really need to
happen if they're installed into base.txz.

While here, fix a minor typo: s/CERTCLTFLAGS/CERTCTLFLAGS/

MFC after: 1 week

Stop using lorder and ranlib when building libraries

Use of ranlib or lorder is no longer necessary with current linkers
(probably anything newer than ~1990) and ar's ability to create an object
index and symbol table in the archive.
Currently the build system uses lorder+tsort to sort the .o files in
dependency order so that a single-pass linker can use them. However,
we can use the -s flag to ar to add an index to the .a file which makes
lorder unnecessary.
Running ar -s is equivalent to running ranlib afterwards, so we can also
skip the ranlib invocation.

Similarly, we don't have to pass the .o files for shared libraries in
dependency order since both ld.bfd and ld.lld will correctly resolve
references between the .o files.

This removes many fork()+execve calls for each library so should speed up
builds a bit. Additionally lorder.sh uses a regular expression that is not
supported by the macOS libc or glibc and results in many warnings when
cross-building (see D25989).

There is one functional change: lorder.sh removed duplicated .o files
from the linker command line which now no longer happens. I fixed the duplicates
in the base system in r364649. I also checked the ports tree for uses of
bsd.lib.mk and found one duplicate source file which I fixed in r548168.
Most ports use CMake/autotools rather than bsd.lib.mk but if this breaks any
ports that I missed in my search please let me know.

Avoiding the shell script actually speeds up the linking step noticeably: I
measured how long it takes to rebuild the .a and .so files for lib/libc using a
basic benchmark: `rm $LIBC_OBJDIR/*.so* $LIBC_OBJDIR/*.a* && /usr/bin/time make -DWITHOUT_TESTS -s > /dev/null`
Without this change ~4.5 seconds and afterwards ~3.1 seconds.
Looking at truss -cf output we can see that the number of fork() system
calls goes down from 27 to 12 (and the speedup while tracing is more
noticeable: 81 seconds -> 65 seconds).

See also https://www.gnu.org/software/coreutils/manual/html_node/tsort-background.html
for some more background:
This whole procedure has been obsolete since about 1980, because Unix
archives now contain a symbol table (traditionally built by ranlib, now
generally built by ar itself), and the Unix linker uses the symbol table
to effectively make multiple passes over an archive file.

Or alternatively https://www.unix.com/man-page/osf1/1/lorder/:
The lorder command is essentially obsolete. Use the following command in
its place: % ar -ts file.a

Reviewed By: emaste, imp, dim
Differential Revision: https://reviews.freebsd.org/D26044

Add dtb/sifive module

This allows building the HiFive Unleashed device tree blob.

Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D26459

Reduce code duplication by introducing linux_copyout_sockaddr()
helper function. No functional changes.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25804

Add support for SOUND_MIXER_WRITE_MONITOR ioctl. Fixes alsamixer(1)
on my x220.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25806

Get rid of sv_errtbl and SV_ABI_ERRNO().

Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D26388

Revert r249362, atime update in tmpfs is fixed in r365810

PR: 249362
Sponsored by: The FreeBSD Foundation

geom_part: make it possible to recover a broken GPT after some LBAs are cut off

This is a follow-up to r365477.

If a pre-formatted device has a GPT with a partition covering the
last available LBAs, and the device is attached through a bridge that
reduces the number of LBAs, then forcing GEOM to use the primary GPT
may not be enough. We should also make it possible to recover the GPT,
which requires either deleting or resizing that partition.

This change enables the "gpart delete" and "gpart resize" commands
on a corrupted GPT, to be followed by "gpart recover".

It still does not allow modifying a corrupted GPT without first
setting sysctl kern.geom.part.check_integrity=0.

For example:

# gpart show da0
=>        34  3906963389  da0  GPT  (1.8T) [CORRUPT]
          34      262144    1  ms-reserved  (128M)
      262178        2014       - free -  (1.0M)
      264192  3906764943    2  freebsd-swap  (1.8T)
# gpart resize -i 2 -s 3900000000 da0
# gpart recover da0

Reported by: Alex Korchmar
MFC after: 3 days

installworld: run `certctl rehash` after installation completes

This was originally introduced back in r360833, and subsequently reverted
because it was broken for -DNO_ROOT builds and it may not have been the
correct place for it.

While debatably this may still not be 'the correct place,' it's much cleaner
than scattering rehashes all throughout the tree. brooks has fixed the issue
with -DNO_ROOT by properly writing to the METALOG in r361397.

Do note that this is different than what was originally committed; brooks
had revisions in D24932 that made it actually use the revised unprivileged
mode and write to METALOG, along with being a little more friendly to
foreign crossbuilds and just using the certctl in-tree.

With this change, I believe we should now have a populated /etc/ssl/certs in
the VM images.

MFC after: 1 week

Put calls to check_pgrp_jobc() in fixjobc_kill() under INVARIANTS.

Reported by: Michael Butler <imb@protected-networks.net>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

Add check_pgrp_jobc() calls into process exit path.

Both before and after job control adjustments.

Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26416

Fix fixjobc+orphanage.

Orphans affect job control state, we must account for them when
changing pg_jobc.

Instead of p_pptr, use proc_realparent() to get parent relevant for
job control.

Use the correct calculation of the parent for an exiting process. For jobc
purposes, we must use realparent, but if it is also exiting, we should
fall back to the reaper, then recursively find a non-exiting reaper.

Reported by: trasz
PR: 249257
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26416
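
Illustration only, not the kernel code: a minimal C sketch of picking
the parent that matters for job control when processes may be exiting.
It assumes a simplified process record in which every process has a
reaper and the top-level process (init) is its own reaper; the struct
and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified process record. */
struct demo_proc {
	struct demo_proc *realparent;	/* parent relevant for job control */
	struct demo_proc *reaper;	/* init points at itself */
	bool		  exiting;
};

/*
 * Start from the real parent. While the candidate is itself exiting,
 * fall back to its reaper, stopping at the first non-exiting process
 * or at the self-reaping top of the tree.
 */
struct demo_proc *
demo_jobc_parent(const struct demo_proc *p)
{
	struct demo_proc *pp = p->realparent;

	while (pp->exiting && pp->reaper != pp)
		pp = pp->reaper;
	return (pp);
}
```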

Assert that P_TREE_GRPEXITED is set only once.

Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26416

proc_realparent: if p_oppid does not match pid of the current parent
for non-orphaned process, return reaper instead of init.

Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26416

Improve ddb 'show pgrpdump' command.

Use ddb pager.
Make lines more compact.
Eliminate unneeded casts.
Print more job-control related info when reporting process group.

Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26416

tmpfs: restore atime updates for reads from page cache.

Split TMPFS_NODE_ACCESSED bit into dedicated byte that can be updated
atomically without locks or (locked) atomics.

tn_update_getattr() change also contains unrelated bug fix.

Reported by: lwhsu
PR: 249362
Reviewed by: markj (previous version)
Discussed with: mjg
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26451

Style.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days

Merge llvm, clang, compiler-rt, libc++, libunwind, lld, lldb and openmp
release/11.x llvmorg-11.0.0-rc2-91-g6e042866c30.

MFC after: 6 weeks
X-MFC-With: r364284

if_media: definitions for 40GE LM4 ethernet media type

Reviewed by: erj
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26276

Move PLTs to the beginning of amd64 kernel modules.

As with .text, the aim is to ensure that executable sections are
segregated from the rest, to avoid creation of writeable and executable
mappings. Recent versions of LLVM emit a PLT in firmware modules.

Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26444

Temporarily skip sys.fs.tmpfs.times_test.{empty,non_empty} in CI

PR: 249362
Sponsored by: The FreeBSD Foundation

Update to 2020.08.19

MFC after: 3 days

Use standard bool type, instead of non-standard boolean_t

Fix a LOR between the NFS server and server side krpc.

Recent testing of the NFS-over-TLS code found a LOR between the mutex lock
used for sessions and the sleep lock used for server side krpc socket
structures.
The code in nfsrv_checksequence() would call SVC_RELEASE() with the mutex
held. Normally this is ok, since all that happens is SVC_RELEASE()
decrements a reference count. However, if the socket has just been shut
down, SVC_RELEASE() drops the reference count to 0 and acquires a sleep
lock during destruction of the server side krpc structure.

This patch fixes the problem by moving the SVC_RELEASE() call in
nfsrv_checksequence() down a few lines to below where the mutex is released.

MFC after: 1 week
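
Illustration only, not the NFS server code: a minimal single-threaded C
model of the pattern described above, where the pointer is detached
while the mutex is held and the reference is dropped only after the
mutex is released. The mutex is modeled as a flag so the ordering can
be checked; every demo_* name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_mutex { bool held; };	/* models a non-sleepable mutex */

struct demo_xprt {
	int  refs;
	bool destroyed_under_mutex;	/* set if teardown ran too early */
};

struct demo_session {
	struct demo_mutex mtx;
	struct demo_xprt *xprt;
};

static void demo_lock(struct demo_mutex *m) { m->held = true; }
static void demo_unlock(struct demo_mutex *m) { m->held = false; }

/* Dropping the last reference runs teardown that may need to sleep. */
static void
demo_release(struct demo_xprt *x, const struct demo_mutex *m)
{
	if (--x->refs == 0 && m->held)
		x->destroyed_under_mutex = true;	/* the bad ordering */
}

/* Swap the transport under the mutex, release the old one after it. */
void
demo_replace_xprt(struct demo_session *s, struct demo_xprt *newx)
{
	struct demo_xprt *old;

	demo_lock(&s->mtx);
	old = s->xprt;
	s->xprt = newx;
	demo_unlock(&s->mtx);
	if (old != NULL)
		demo_release(old, &s->mtx);	/* mutex no longer held */
}
```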

Fix locking in uipc_accept().

Reported by: cy
MFC after: 1 week
Sponsored by: The FreeBSD Foundation

Add tmpfs page cache read support.

Or it could be explained as lockless (for vnode lock) reads.  Reads
are performed from the node tn_obj object.  Tmpfs regular vnode object
lifecycle is significantly different from the normal OBJT_VNODE: it is
alive as long as ref_count > 0.

Ensure liveness of the tmpfs VREG node and consequently v_object
inside VOP_READ_PGCACHE by referencing tmpfs node in tmpfs_open().
Provide custom tmpfs fo_close() method on file, to ensure that close
is paired with open.

Add tmpfs VOP_READ_PGCACHE that takes advantage of all tmpfs quirks.
In terms of code size, it is quite cheap to support page-ins for reads
on tmpfs even if we do not own the tmpfs vnode lock.  Also, we can handle
holes in a tmpfs node without additional effort, and there is no
limit on the transfer size.

Reviewed by: markj
Discussed with and benchmarked by: mjg (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26346

Microoptimize tmpfs node ref/unref by using atomics.

Avoid tmpfs mount and node locks when ref count is greater than zero,
which is the case until node is being destroyed by unlink or unmount.

Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26346
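
Illustration only, not the tmpfs code: a minimal C11 sketch of a
reference count whose common paths avoid locks. A reference is taken
without a lock only if the count is already nonzero, and dropped
without a lock only if it is not the last one; the remaining cases are
left to a lock-protected slow path that is not shown. The demo_* names
are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_node {
	atomic_uint refcount;
};

/* Fast path: succeeds only while the node is known to be alive. */
bool
demo_ref_if_nonzero(struct demo_node *n)
{
	unsigned int old = atomic_load(&n->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&n->refcount, &old, old + 1))
			return (true);
	}
	return (false);		/* caller must take the locked slow path */
}

/* Fast path: never drops the final reference. */
bool
demo_unref_if_not_last(struct demo_node *n)
{
	unsigned int old = atomic_load(&n->refcount);

	while (old > 1) {
		if (atomic_compare_exchange_weak(&n->refcount, &old, old - 1))
			return (true);
	}
	return (false);		/* last reference: destroy under locks */
}
```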

Convert page cache read to VOP.

There are several negative side-effects of not calling into VOP layer
at all for page cache reads. The biggest is the missed activation of
EVFILT_READ knotes.

Also, it allows filesystem to make more fine grained decision to
refuse read from page cache.

Keep VIRF_PGREAD flag around, it is still useful for nullfs, and for
asserts.

Reviewed by: markj
Tested by: pho
Discussed with: mjg
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26346

vfs_subr.c: export io_hold_cnt and vn_read_from_obj().

Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26346

Do not copy vp into f_data for DTYPE_VNODE files.

The pointer to vnode is already stored into f_vnode, so f_data can be
reused. Fix all found users of f_data for DTYPE_VNODE.

Provide finit_vnode() helper to initialize file of DTYPE_VNODE type.

Reviewed by: markj (previous version)
Discussed with: freqlabs (openzfs chunk)
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26346

e1000: Properly retain promisc flag

From Franco:
The iflib rewrite forced the promisc flag but it was not reported
to the system. Noticed on a stock VM that went into unsolicited
promisc mode when dhclient was started during bootup.

PR: 248869
Submitted by: Franco Fichtner <franco@opnsense.org>
Reviewed by: erj@
MFC after: 3 days

bhyve: do not permit write access to VMCB / VMCS

Reported by: Patrick Mooney
Submitted by: jhb
Security: CVE-2020-24718

igb(4): Fix define and includes with RSS option enabled

This re-adds the opt_rss.h header to the driver and includes some
RSS-specific headers when RSS is defined.

PR: 249191
Submitted by: Milosz Kaniewski <milosz.kaniewski@gmail.com>
MFC after: 3 days

ftpd: Exit during authentication if an error occurs after chroot().

admbug: 969
Security: CVE-2020-7468

[PowerPC64LE] Use correct in_masks table on LE to fix checksumming

Due to a check that should have been an endian check being an #if 0,
the wrong checksum mask table was being used on LE, which was causing
extreme strangeness in DNS resolution -- *some* hosts would be resolvable,
but most would not.

This fixes DNS resolution.

(I am committing some parts of the LE patchset ahead of time to reduce the
amount of work I have to do while committing the main patchset.)

Sponsored by: Tag1 Consulting, Inc.
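
Illustration only, not the kernel's in_masks table: a minimal C sketch
of why a mask table for partial words has to be chosen by byte order.
The masks keep the first nbytes of a 32-bit word in memory order, and
the constants differ between big-endian and little-endian hosts. All
demo_* names are hypothetical, and the byte order is probed at run time
only to keep the sketch portable.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int
demo_host_is_le(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return (first == 1);
}

/* Mask that keeps the first nbytes (0..4) of a word as laid out in memory. */
uint32_t
demo_lead_mask(unsigned int nbytes)
{
	static const uint32_t be_masks[5] = {
		0x00000000, 0xff000000, 0xffff0000, 0xffffff00, 0xffffffff
	};
	static const uint32_t le_masks[5] = {
		0x00000000, 0x000000ff, 0x0000ffff, 0x00ffffff, 0xffffffff
	};

	if (nbytes > 4)
		nbytes = 4;
	return (demo_host_is_le() ? le_masks[nbytes] : be_masks[nbytes]);
}

/* Copy in to out, zeroing everything after the first nbytes. */
void
demo_keep_lead_bytes(const uint8_t in[4], unsigned int nbytes, uint8_t out[4])
{
	uint32_t word;

	memcpy(&word, in, sizeof(word));
	word &= demo_lead_mask(nbytes);
	memcpy(out, &word, sizeof(word));
}
```

Using the big-endian table on a little-endian host would keep the wrong
bytes of a trailing partial word, which corrupts the resulting sum.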

[PowerPC64LE] Set up the powernv partition table correctly.

The partition table is always big endian.

Sponsored by: Tag1 Consulting, Inc.
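
Illustration only, not the powernv code: a minimal C sketch of writing
a 64-bit field in big-endian byte order regardless of the host's
endianness, which is what a structure defined as "always big endian"
requires on a little-endian kernel. The demo_* helpers are hypothetical;
in the kernel, byteorder(9) helpers such as htobe64() serve this
purpose.

```c
#include <stdint.h>

/* Store v at dst with the most significant byte first. */
void
demo_store_be64(uint8_t *dst, uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		dst[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Load a big-endian 64-bit value from src. */
uint64_t
demo_load_be64(const uint8_t *src)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | src[i];
	return (v);
}
```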

bhyve: intercept AMD SVM instructions.

Intercept and report #UD to VM on SVM/AMD in case VM tried to execute an
SVM instruction. Otherwise, SVM allows execution of them, and instructions
operate on host physical addresses despite being executed in guest mode.

Reported by: Maxime Villard <max@m00nbsd.net>
admbug: 972
CVE: CVE-2020-7467
Reviewed by: grehan, markj
Differential revision: https://reviews.freebsd.org/D26313

Fix locking in uipc_accept().

This function wasn't converted to use the new locking protocol in
r333744. Make it use the PCB lock for synchronizing connection state.

Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26300

Simplify unix socket connection peer locking.

unp_pcb_owned_lock2() has some sharp edges and forces callers to deal
with a bunch of cases.  Simplify it:

- Rename to unp_pcb_lock_peer().
- Return the connected peer instead of forcing callers to load it
  beforehand.
- Handle self-connected sockets.
- In unp_connectat(), just lock the accept socket directly.  It should
  not be possible for the nascent socket to participate in any other
  lock orders.
- Get rid of connect_internal().  It does not provide any useful
  checking anymore.
- Block in unp_connectat() when a different thread is concurrently
  attempting to lock both sides of a connection.  This provides simpler
  semantics for callers of unp_pcb_lock_peer().
- Make unp_connectat() return EISCONN if the socket is already
  connected.  This fixes a race[1] when multiple threads attempt to
  connect() to different addresses using the same datagram socket.
  Upper layers will disconnect a connected datagram socket before
  calling the protocol connect's method, but there is no synchronization
  between this and protocol-layer code.

Reported by: syzkaller [1]
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26299

Avoid an unnecessary malloc() when connecting dgram sockets.

The allocated memory is only required for SOCK_STREAM and SOCK_SEQPACKET
sockets.

Reviewed by: kevans
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26298

Simplify unp_disconnect() callers.

In all cases, PCBs are unlocked after unp_disconnect() returns. Since
unp_disconnect() may release the last PCB reference, callers may have to
bump the refcount before the call just so that they can release it
again.

Change unp_disconnect() to release PCB locks as well as connection
references; this lets us remove several refcount manipulations. Tighten
assertions.

Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26297

Rename unp_pcb_lock2().

unp_pcb_lock_pair() seems like a better name. Also make it handle the
case where the two sockets are the same instead of making callers do it.
No functional change intended.
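
A minimal userland sketch of the idea, assuming pthread mutexes and a
hypothetical struct pcb in place of the kernel's PCB and its lock. Locking
the lower address first is an assumption of this sketch, not something
this commit message states.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a unix socket PCB. */
struct pcb {
	pthread_mutex_t lock;
};

/*
 * Lock two PCBs.  If both pointers name the same PCB, lock it once.
 * Otherwise take the lower address first so that every thread agrees
 * on the order and two threads cannot deadlock against each other.
 */
static void
pcb_lock_pair(struct pcb *a, struct pcb *b)
{
	assert(a != NULL && b != NULL);
	if (a == b) {
		pthread_mutex_lock(&a->lock);
	} else if ((uintptr_t)a < (uintptr_t)b) {
		pthread_mutex_lock(&a->lock);
		pthread_mutex_lock(&b->lock);
	} else {
		pthread_mutex_lock(&b->lock);
		pthread_mutex_lock(&a->lock);
	}
}

/* Release what pcb_lock_pair() acquired, again handling a == b. */
static void
pcb_unlock_pair(struct pcb *a, struct pcb *b)
{
	pthread_mutex_unlock(&a->lock);
	if (a != b)
		pthread_mutex_unlock(&b->lock);
}
```

With the a == b case handled inside the helper, callers can pass both
ends of a self-connected socket without checking for it first.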

Reviewed by: glebius, kevans, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26296

Improve unix socket PCB refcounting.

- Use refcount_init().
- Define an INVARIANTS-only zone destructor to assert that various
  bits of PCB state aren't left dangling.
- Annotate unp_pcb_rele() with __result_use_check.
- Simplify control flow.
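
The refcounting pattern in the list above can be sketched in userland C
roughly as follows. All names here are hypothetical, C11 atomics stand in
for the kernel's refcount(9) routines, and the GCC/Clang
warn_unused_result attribute plays the role of __result_use_check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical PCB with an embedded reference count. */
struct pcb {
	atomic_uint refs;
	bool freed;	/* stands in for returning the PCB to its zone */
};

static void
pcb_init(struct pcb *p)
{
	atomic_init(&p->refs, 1);	/* the creator holds one reference */
	p->freed = false;
}

static void
pcb_hold(struct pcb *p)
{
	atomic_fetch_add(&p->refs, 1);
}

/*
 * Drop one reference.  Returns true if it was the last one and the PCB
 * was freed, in which case the caller must not touch the PCB again.
 * The attribute makes the compiler warn when the result is ignored.
 */
__attribute__((warn_unused_result))
static bool
pcb_rele(struct pcb *p)
{
	unsigned int old = atomic_fetch_sub(&p->refs, 1);

	assert(old > 0);
	if (old == 1) {
		p->freed = true;
		return (true);
	}
	return (false);
}
```

Forcing callers to look at the return value makes it harder to keep
using a PCB after the final reference has been dropped.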

Reviewed by: glebius, kevans, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26295

Update unix domain socket locking comments.

- Define a locking key for unpcb members.
- Rewrite some of the locking protocol description to make it less
  verbose and avoid referencing some subroutines which will be renamed.
- Reorder includes.

Reviewed by: glebius, kevans, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26294

certctl: fix unprivileged mode

The first issue was lack of quoting around INSTALLFLAGS, which set it
incorrectly and produced an error on -M.

The second issue was that we weren't actually doing the install in
unprivileged mode, making it effectively useless. This was designed to pass
through the proper metalog/unpriv flags to install(1), so just let it
happen.

MFC after: 3 days

Move SV_ABI_ERRNO translation into linux-specific code, to simplify
the syscall path and declutter it a bit. No functional changes intended.

Reviewed by: kib (earlier version)
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D26378

src.conf.5: regen after r365753

Add descriptions of the WITH(OUT)_GH_BC options that exist in -CURRENT
(default: WITH_GH_BC) and 12-STABLE (default: WITHOUT_GH_BC).

Since the new implementation of bc and dc is optionally available in
12-STABLE, I intend to MFC these descriptions for inclusion in 12.2.

MFC after: 3 days

Include sys/types.h here

It's included by header pollution in most of the compile
environments. However, in the standalone environment, it's not
included. Go ahead and include it always since the overhead is low and
it is simpler that way.

MFC After: 3 days

Use ATTR_DEFAULT in the arm64 locore.S

We can use ATTR_DEFAULT directly in locore.S as it fits within an orr
instruction operand.

Sponsored by: Innovate UK

Fix some posixshmcontrol nits.

- Exit with an error if no path is specified.
- Man page typo.
- Error message typo.

Reviewed by: kib
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D26376

[PowerPC] Remove obsolete MK_LOADER_FORCE_LE

In D12421, the ability to compile stand/ in little-endian was added, with the
intention to extend loader.kboot to run in Petitboot.

However, no further work was done, as the kernel then gained self-execution
capabilities as Petitboot was taught to load FreeBSD kernels directly.

The FreeBSD installer on powerpc64 (on POWER8 and POWER9) uses
/boot/etc/kboot.conf instead of loader.

As this option does nothing but cause stand/ to be miscompiled and actively
causes confusion, remove it.

(I have a functioning petitboot loader in my local tree; however, it turned
out to be quite inconvenient to use due to the current petitboot plugin
design, so I put it on hold.)

Reviewed by: emaste, imp, jhibbits
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D26430

We don't need the sc_ekeys_lock in standalone environment.

When we bring geli into the boot loader, we are single threaded so
we don't have to worry about locking. We have no mutexes, and don't need
to use them, so comment it out.

MFC After: 3 days

Don't do the busy dance in icee_open/close

We don't need to do the busy dance for this driver. It's handled by
destroy_dev() entirely. Since all we did was busy/unbusy in
open/close, just delete them. We therefore don't need to track closes
either.

Reviewed by: ian@
Differential Revision: https://reviews.freebsd.org/D26431

Tweak what's visible in the standalone environment. We define offsetof
in stand.h typically, but when this is included we can define it
multiple times. However, we don't define bool in stand.h at the
moment, so allow it to be defined inside types.h when we're building
for the standalone environment.

MFC After: 3 days