FreeBSD/FreeBSD.git
3 years agoRemove commented-out lines describing the old never-implemented -t option.
ian [Sun, 26 Jul 2020 17:50:39 +0000 (17:50 +0000)]
Remove commented-out lines describing the old never-implemented -t option.

In 2018, r338094 removed the commented-out code for supporting the -t
command line option which had been present since the BSD 4.4 Lite import,
but was never implemented for FreeBSD.

This does the same for the man page.

3 years agoRevert r363564
manu [Sun, 26 Jul 2020 17:21:24 +0000 (17:21 +0000)]
Revert r363564

linux/sizes.h doesn't exist in base ... sorry.

3 years agolinuxkpi: Add taint* defines
manu [Sun, 26 Jul 2020 16:31:49 +0000 (16:31 +0000)]
linuxkpi: Add taint* defines

This isn't used by us, but it allows us to port drivers more easily.

Reviewed by: hselasky
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25703

3 years agolinuxkpi: Include hardirq.h in preempt.h and lockdep.h in hardirq.h
manu [Sun, 26 Jul 2020 16:30:59 +0000 (16:30 +0000)]
linuxkpi: Include hardirq.h in preempt.h and lockdep.h in hardirq.h

Linux does the same, this avoids ifdef or extra includes in ported drivers.

Reviewed by: emaste, hselasky
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25702

3 years agolinuxkpi: Include linux/sizes.h in dma-mapping.h
manu [Sun, 26 Jul 2020 16:30:01 +0000 (16:30 +0000)]
linuxkpi: Include linux/sizes.h in dma-mapping.h

Linux does the same, this avoids ifdef or extra includes in ported drivers.

Reviewed by: emaste, hselasky
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25701

3 years agochio: avoid out of bounds read
emaste [Sun, 26 Jul 2020 15:10:33 +0000 (15:10 +0000)]
chio: avoid out of bounds read

ch_ces is allocated with space for total_elem entries.

CID: 1418536
Reported by: Coverity Scan
Sponsored by: The FreeBSD Foundation

3 years agoBump __FreeBSD_version after introduction of lockless lookup to the VFS layer
mjg [Sun, 26 Jul 2020 13:30:33 +0000 (13:30 +0000)]
Bump __FreeBSD_version after introduction of lockless lookup to the VFS layer

3 years agoRename DMAR flags:
br [Sun, 26 Jul 2020 12:29:22 +0000 (12:29 +0000)]
Rename DMAR flags:
o DMAR_DOMAIN_* -> IOMMU_DOMAIN_*
o DMAR_PGF_* -> IOMMU_PGF_*

Reviewed by: kib
Sponsored by: DARPA/AFRL
Differential Revision: https://reviews.freebsd.org/D25812

3 years agoarm64: Only compile imx8 files if soc_freescale_imx8 is selected
manu [Sun, 26 Jul 2020 10:07:05 +0000 (10:07 +0000)]
arm64: Only compile imx8 files if soc_freescale_imx8 is selected

No Objection from:  gonzo

3 years agosed: treat '[' as ordinary character in 'y' command
yuripv [Sun, 26 Jul 2020 09:15:05 +0000 (09:15 +0000)]
sed: treat '[' as ordinary character in 'y' command

'y' does not handle bracket expressions, treat '[' as ordinary character
and do not apply bracket expression checks (GNU sed agrees).
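
As a rough illustration of the behavior described above, the sketch below
builds a 'y'-style translation table in which '[' is just another byte.
The names y_build_table() and y_apply() are hypothetical and this is not
the sed(1) source; it only shows that no bracket-expression parsing is
involved in a character-for-character mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: build a 256-entry translation table from two
 * equal-length strings, the way a 'y/src/dst/' command maps characters.
 * Every byte of src, including '[', is treated as an ordinary character.
 * Returns 0 on success, -1 if the lengths differ.
 */
static int
y_build_table(const char *src, const char *dst, unsigned char table[256])
{
	size_t i, n;

	n = strlen(src);
	if (n != strlen(dst))
		return (-1);
	for (i = 0; i < 256; i++)
		table[i] = (unsigned char)i;	/* identity by default */
	for (i = 0; i < n; i++)
		table[(unsigned char)src[i]] = (unsigned char)dst[i];
	return (0);
}

/* Apply the table to a NUL-terminated buffer in place. */
static void
y_apply(const unsigned char table[256], char *buf)
{
	for (; *buf != '\0'; buf++)
		*buf = (char)table[(unsigned char)*buf];
}
```

With a table built from "[]" and "()", the input "a[b]c" would become
"a(b)c", with no attempt to interpret "[b]" as a bracket expression.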

PR: 247931
Reviewed by: pfg, kevans
Tested by: antoine (exp-run), Quentin L'Hours <lhoursquentin@gmail.com>
Differential Revision: https://reviews.freebsd.org/D25640

3 years agoAdd support for ext_pgs mbufs to nfsrv_adj().
rmacklem [Sun, 26 Jul 2020 02:42:09 +0000 (02:42 +0000)]
Add support for ext_pgs mbufs to nfsrv_adj().

This patch uses a slightly different algorithm for nfsrv_adj()
since ext_pgs mbuf lists are not permitted to have m_len == 0 mbufs.
As such, the code now frees mbufs after the adjustment in the list instead
of setting their m_len field to 0.
Since mbuf(s) may be trimmed off the tail of the list, the function now
returns a pointer to the last mbuf in the list.  This saves the caller
from needing to use m_last() to find the last mbuf.
It also implies that it might return a NULL list, which required a check for
that in nfsrvd_readlink().
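
The trimming strategy described above can be sketched on a simplified
list. This is not nfsrv_adj() itself: struct demo_buf and demo_trim_tail()
are hypothetical stand-ins for mbufs and the real function, and the sketch
assumes the input list contains no zero-length buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf: a length and a next pointer. */
struct demo_buf {
	struct demo_buf *next;
	size_t len;
};

/*
 * Trim 'trim' bytes from the tail of the list at *headp. Buffers that
 * end up empty are freed rather than left with len == 0. Returns the
 * new last buffer, or NULL if every buffer was trimmed away (in which
 * case *headp is set to NULL as well).
 */
static struct demo_buf *
demo_trim_tail(struct demo_buf **headp, size_t trim)
{
	struct demo_buf *b, *last, *next;
	size_t total = 0, keep;

	for (b = *headp; b != NULL; b = b->next)
		total += b->len;
	keep = trim >= total ? 0 : total - trim;

	last = NULL;
	for (b = *headp; b != NULL && keep > 0; b = b->next) {
		if (b->len > keep)
			b->len = keep;	/* shorten the new last buffer */
		keep -= b->len;
		last = b;
	}
	/* 'b' now heads the part of the list that must be freed. */
	if (last != NULL)
		last->next = NULL;
	else
		*headp = NULL;
	for (; b != NULL; b = next) {
		next = b->next;
		free(b);
	}
	return (last);
}
```

Because the function can return NULL, a caller has to check the result
before dereferencing it, which mirrors the extra check the commit message
mentions for nfsrvd_readlink().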

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Use of ext_pgs mbufs will not be enabled until the kernel RPC is updated
to handle TLS.

3 years agoUse snprintf instead of sprintf.
delphij [Sun, 26 Jul 2020 01:45:26 +0000 (01:45 +0000)]
Use snprintf instead of sprintf.

MFC after: 2 weeks
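
For readers unfamiliar with the motivation, a minimal sketch of the
sprintf-to-snprintf pattern follows. The function format_path() and its
arguments are hypothetical and are not taken from the changed file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical example: format a path into a fixed-size buffer.
 * snprintf() never writes more than buflen bytes (including the NUL) and
 * returns the length the full string would have had, so truncation can
 * be detected by comparing the return value with the buffer size.
 * Returns 0 on success, -1 on error or truncation.
 */
static int
format_path(char *buf, size_t buflen, const char *dir, const char *name)
{
	int n;

	n = snprintf(buf, buflen, "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= buflen)
		return (-1);
	return (0);
}
```

Unlike sprintf(), an oversized input here yields a truncated but still
NUL-terminated string and an error return instead of a buffer overflow.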

3 years agogeom_label: Make glabel labels more trivial by separating the tasting
delphij [Sun, 26 Jul 2020 00:44:59 +0000 (00:44 +0000)]
geom_label: Make glabel labels more trivial by separating the tasting
routines out.

While there, also simplify the creation of label paths a little bit
by requiring the / suffix for label directory prefixes (ld_dir renamed
to ld_dirprefix to indicate the change) and stop defining macros for
these when they are only used once.

Reviewed by: cem
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25597

3 years agoo Make the _hw_iommu sysctl node non-static;
br [Sat, 25 Jul 2020 21:37:07 +0000 (21:37 +0000)]
o Make the _hw_iommu sysctl node non-static;
o Move the dmar sysctl knobs to _hw_iommu_dmar.

Reviewed by: kib
Sponsored by: DARPA/AFRL
Differential Revision: https://reviews.freebsd.org/D25807

3 years agoo Move iommu gas prototypes, DMAR flags to iommu.h;
br [Sat, 25 Jul 2020 19:07:12 +0000 (19:07 +0000)]
o Move iommu gas prototypes, DMAR flags to iommu.h;
o Move hw.dmar sysctl node to iommu_gas.c.

Reviewed by: kib
Sponsored by: DARPA/AFRL
Differential Revision: https://reviews.freebsd.org/D25802

3 years agoFix an overflow bug in the blist allocator that needlessly capped max
dougm [Sat, 25 Jul 2020 18:29:10 +0000 (18:29 +0000)]
Fix an overflow bug in the blist allocator that needlessly capped max
swap size by dividing a value, which was always a multiple of 64, by
64.  Remove the code that reduced max swap size down to that cap.

Eliminate the distinction between BLIST_BMAP_RADIX and
BLIST_META_RADIX.  Call them both BLIST_RADIX.

Make improvements to the blist self-test code to silence compiler
warnings and to test larger blists.

Reported by: jmallett
Reviewed by: alc
Discussed with: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D25736

3 years agoclean up whitespace...
jmg [Sat, 25 Jul 2020 18:09:04 +0000 (18:09 +0000)]
clean up whitespace...

3 years agofd: put back FILEDESC_SUNLOCK to pwd_hold lost during rebase
mjg [Sat, 25 Jul 2020 15:34:29 +0000 (15:34 +0000)]
fd: put back FILEDESC_SUNLOCK to pwd_hold lost during rebase

Reported by: pho

3 years agoAllow swi_sched() to be called from NMI context.
mav [Sat, 25 Jul 2020 15:19:38 +0000 (15:19 +0000)]
Allow swi_sched() to be called from NMI context.

To handle hardware errors reported via NMIs, I need a way to escape NMI
context, which is too restrictive to do anything significant.

To do that, this change introduces a new swi_sched() flag, SWI_FROMNMI, which
makes it careful about the KPIs it uses.  On platforms allowing IPI sending from NMI
context (x86 for now) it immediately wakes clk_intr_event via new IPI_SWI,
otherwise it works just like SWI_DELAY.  To handle the delayed SWIs this
patch calls clk_intr_event on every hardclock() tick.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D25754

3 years agoMove Intel GAS to dev/iommu/ as now a part of generic iommu framework.
br [Sat, 25 Jul 2020 11:34:50 +0000 (11:34 +0000)]
Move Intel GAS to dev/iommu/ as now a part of generic iommu framework.

Reviewed by: kib
Sponsored by: DARPA/AFRL
Differential Revision: https://reviews.freebsd.org/D25799

3 years agovfs: add support for !LOCKLEAF to lockless lookup
mjg [Sat, 25 Jul 2020 10:40:38 +0000 (10:40 +0000)]
vfs: add support for !LOCKLEAF to lockless lookup

Tested by:      pho (in a patchset)
Differential Revision: https://reviews.freebsd.org/D23916

3 years agozfs: add support for lockless lookup
mjg [Sat, 25 Jul 2020 10:39:41 +0000 (10:39 +0000)]
zfs: add support for lockless lookup

Tested by: pho (in a patchset, previous version)
Differential Revision: https://reviews.freebsd.org/D25581

3 years agotmpfs: add support for lockless lookup
mjg [Sat, 25 Jul 2020 10:38:44 +0000 (10:38 +0000)]
tmpfs: add support for lockless lookup

Reviewed by:    kib
Tested by:      pho (in a patchset)
Differential Revision: https://reviews.freebsd.org/D25580

3 years agoufs: add support for lockless lookup
mjg [Sat, 25 Jul 2020 10:38:05 +0000 (10:38 +0000)]
ufs: add support for lockless lookup

ACLs are not supported, meaning their presence will force the use of the old lookup.

Reviewed by:    kib
Tested by:      pho (in a patchset)
Differential Revision: https://reviews.freebsd.org/D25579

3 years agovfs: lockless lookup
mjg [Sat, 25 Jul 2020 10:37:15 +0000 (10:37 +0000)]
vfs: lockless lookup

Provides full scalability as long as all visited filesystems support the
lookup and terminal vnodes are different.

Inner workings are explained in the comment above cache_fplookup.

Capabilities and fd-relative lookups are not supported and will result in
immediate fallback to regular code.

Symlinks, ".." in the path, mount points without support for lockless lookup
and mismatched counters will result in an attempt to get a reference to the
directory vnode and continue in regular lookup. If this fails, the entire
operation is aborted and regular lookup starts from scratch. However, care is
taken that data is not copied again from userspace.

Sample benchmark:
incremental -j 104 bzImage on tmpfs:
before: 142.96s user 1025.63s system 4924% cpu 23.731 total
after: 147.36s user 313.40s system 3216% cpu 14.326 total

Sample microbenchmark: access calls to separate files in /tmpfs, 104 workers, ops/s:
before:   2165816
after:  151216530

Reviewed by:    kib
Tested by:      pho (in a patchset)
Differential Revision: https://reviews.freebsd.org/D25578

3 years agovfs: add the infrastructure for lockless lookup
mjg [Sat, 25 Jul 2020 10:32:45 +0000 (10:32 +0000)]
vfs: add the infrastructure for lockless lookup

Reviewed by:    kib
Tested by:      pho (in a patchset)
Differential Revision: https://reviews.freebsd.org/D25577

3 years agovfs: introduce vnode sequence counters
mjg [Sat, 25 Jul 2020 10:31:52 +0000 (10:31 +0000)]
vfs: introduce vnode sequence counters

Modified on each permission change and link/unlink.
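
The general idea of a sequence counter can be sketched in portable C11.
This is not the FreeBSD seqc(9) interface: the demo_seqc names are
hypothetical, the sketch uses default sequentially consistent atomics
for simplicity, and a real implementation needs carefully chosen barriers
around the accesses to the protected data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sequence counter; odd values mean "write in progress". */
struct demo_seqc {
	atomic_uint seq;
};

static void
demo_seqc_write_begin(struct demo_seqc *sc)
{
	/* Counter becomes odd: readers that see it must retry or bail. */
	atomic_fetch_add(&sc->seq, 1);
}

static void
demo_seqc_write_end(struct demo_seqc *sc)
{
	/* Counter becomes even again, but differs from the old value. */
	atomic_fetch_add(&sc->seq, 1);
}

/* Snapshot the counter before reading the protected fields. */
static unsigned
demo_seqc_read(struct demo_seqc *sc)
{
	return (atomic_load(&sc->seq));
}

/* True if no write started or finished since the snapshot was taken. */
static bool
demo_seqc_consistent(struct demo_seqc *sc, unsigned snap)
{
	if ((snap & 1) != 0)
		return (false);
	return (atomic_load(&sc->seq) == snap);
}
```

A reader takes a snapshot, reads the fields it needs without any lock,
and then checks consistency; if the check fails, it discards what it read
and falls back to a slower, locked path.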

Reviewed by: kib
Tested by: pho (in a patchset)
Differential Revision: https://reviews.freebsd.org/D25573

3 years agoseqc: add a sleepable variant and convert some routines to macros
mjg [Sat, 25 Jul 2020 10:29:48 +0000 (10:29 +0000)]
seqc: add a sleepable variant and convert some routines to macros

This temporarily duplicates some code.

Macro conversion convinces clang to carry predicts into consumers.

3 years agoSplit-out the Intel GAS (Guest Address Space) management component
br [Sat, 25 Jul 2020 09:28:38 +0000 (09:28 +0000)]
Split-out the Intel GAS (Guest Address Space) management component
from Intel DMAR support, so it can be used on other IOMMU systems.

Reviewed by: kib
Sponsored by: DARPA/AFRL
Differential Revision: https://reviews.freebsd.org/D25743

3 years agoRemove duplicated content from _eventhandler.h
mjg [Sat, 25 Jul 2020 07:48:20 +0000 (07:48 +0000)]
Remove duplicated content from _eventhandler.h

3 years agoRemove leftover macros for long gone vmsize mtx
mjg [Sat, 25 Jul 2020 07:45:44 +0000 (07:45 +0000)]
Remove leftover macros for long gone vmsize mtx

3 years agoGuard sbcompress_ktls_rx with KERN_TLS
mjg [Sat, 25 Jul 2020 07:15:23 +0000 (07:15 +0000)]
Guard sbcompress_ktls_rx with KERN_TLS

Fixes a compilation warning after r363464

3 years agoDo a lockless check in kthread_suspend_check
mjg [Sat, 25 Jul 2020 07:14:33 +0000 (07:14 +0000)]
Do a lockless check in kthread_suspend_check

Otherwise an idle system running lockstat sleep 10 reports contention on
the process lock coming from bufdaemon.

While here fix a style nit.

3 years agoRevert r363123.
mmel [Sat, 25 Jul 2020 06:32:23 +0000 (06:32 +0000)]
Revert r363123.
As Emanuel pointed out to me, Linux processes these clock assignments in
forward order, not in reverse. I misread the original code.
The problem with the wrong order of assigned clocks found in tegra (and some
imx) DT should be reanalyzed and solved in a different way.

MFC with: r363123
Reported by: manu

3 years agoAdd support for ext_pgs mbufs to nfsm_uiombuflist() and nfsm_split().
rmacklem [Fri, 24 Jul 2020 23:17:09 +0000 (23:17 +0000)]
Add support for ext_pgs mbufs to nfsm_uiombuflist() and nfsm_split().

This patch uses a slightly different algorithm for nfsm_uiombuflist() for
the non-ext_pgs case, where a variable called "mcp" is maintained, pointing to
the current location that mbuf data can be filled into. This avoids use of
mtod(mp, char *) + mp->m_len to calculate the location, since this does
not work for ext_pgs mbufs and I think it makes the algorithm more readable.
This change should not result in semantic changes for the non-ext_pgs case.
The patch also deletes some unneeded code.

It also adds support for anonymous page ext_pgs mbufs to nfsm_split().

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.
At this time for this case, use of ext_pgs mbufs cannot be enabled, since
ktls_encrypt() replaces the unencrypted data with encrypted data in place.

Until such time as this can be enabled, there should be no semantic change.
Also, note that this code is only used by the NFS client for a mirrored pNFS
server.

3 years agocxgbe(4): Some updates to the common code.
np [Fri, 24 Jul 2020 23:15:42 +0000 (23:15 +0000)]
cxgbe(4): Some updates to the common code.

Obtained from: Chelsio Communications
MFC after: 1 week
Sponsored by: Chelsio Communications

3 years agoMake it possible to get/set MMC frequency from camcontrol
kibab [Fri, 24 Jul 2020 21:14:59 +0000 (21:14 +0000)]
Make it possible to get/set MMC frequency from camcontrol

Enhance camcontrol(8) so that it's possible to manually set frequency for SD/MMC cards.
While here, display more information about the current controller, such as
supported operating modes and VCCQ voltages, as well as current VCCQ voltage.

Reviewed by: manu
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D25795

3 years agoIntroduce ipi_self_from_nmi().
mav [Fri, 24 Jul 2020 20:52:09 +0000 (20:52 +0000)]
Introduce ipi_self_from_nmi().

It allows safe IPI sending to current CPU from NMI context.

Unlike other ipi_*() functions this waits for delivery to leave LAPIC in
a state safe for interrupted code.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.

3 years agoUse APIC_IPI_DEST_OTHERS for bitmapped IPIs too.
mav [Fri, 24 Jul 2020 20:44:50 +0000 (20:44 +0000)]
Use APIC_IPI_DEST_OTHERS for bitmapped IPIs too.

It should save a bunch of LAPIC register accesses.

MFC after: 2 weeks

3 years agoMake lapic_ipi_vectored(APIC_IPI_DEST_SELF) NMI safe.
mav [Fri, 24 Jul 2020 19:54:15 +0000 (19:54 +0000)]
Make lapic_ipi_vectored(APIC_IPI_DEST_SELF) NMI safe.

Sending an IPI to self or to all CPUs does not require a write into the upper
part of the ICR, which is prone to races.  Previously the code disabled
interrupts, but that was not enough for NMIs.  Instead, when possible, write only
the lower part of the register, or use the special SELF IPI register in x2APIC mode.

This also removes ICR reads used to preserve reserved bits on write.
It was there from the beginning, but I failed to find an explanation why,
nor do I see Linux doing it.  The specification even says that ICR content
may be lost in deep C-states, so if hardware does not bother to preserve
it, why should we?

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.

3 years agodwmmc: Add MMCCAM part
manu [Fri, 24 Jul 2020 19:52:52 +0000 (19:52 +0000)]
dwmmc: Add MMCCAM part

Add support for MMCCAM for dwmmc

Submitted by: kibab
Tested On: Rock64, RockPro64

3 years agommccam: aw_mmc: Only print the new ios value under bootverbose
manu [Fri, 24 Jul 2020 18:44:50 +0000 (18:44 +0000)]
mmccam: aw_mmc: Only print the new ios value under bootverbose

3 years agommccam: Make non bootverbose more readable
manu [Fri, 24 Jul 2020 18:43:46 +0000 (18:43 +0000)]
mmccam: Make non bootverbose more readable

Remove some debug printfs.
Convert some to CAM_DEBUG
Only print some when bootverbose is set.

3 years agoUse gbincore_unlocked for unprotected incore()
cem [Fri, 24 Jul 2020 17:34:44 +0000 (17:34 +0000)]
Use gbincore_unlocked for unprotected incore()

Reviewed by: markj
Sponsored by: Isilon
Differential Revision: https://reviews.freebsd.org/D25790

3 years agoAdd unlocked/SMR fast path to getblk()
cem [Fri, 24 Jul 2020 17:34:04 +0000 (17:34 +0000)]
Add unlocked/SMR fast path to getblk()

Convert the bufobj tries to an SMR zone/PCTRIE and add a gbincore_unlocked()
API wrapping this functionality.  Use it for a fast path in getblkx(),
falling back to locked lookup if we raced a thread changing the buf's
identity.

Reported by: Attilio
Reviewed by: kib, markj
Testing: pho (in progress)
Sponsored by: Isilon
Differential Revision: https://reviews.freebsd.org/D25782

3 years agoUse SMR to provide safe unlocked lookup for pctries from SMR zones
cem [Fri, 24 Jul 2020 17:32:10 +0000 (17:32 +0000)]
Use SMR to provide safe unlocked lookup for pctries from SMR zones

Adapt r358130, for the almost identical vm_radix, to the pctrie subsystem.
Like that change, the tree is kept correct for readers with store barriers
and careful ordering.  Existing locks serialize writers.

Add a PCTRIE_DEFINE_SMR() wrapper that takes an additional smr_t parameter
and instantiates a FOO_PCTRIE_LOOKUP_UNLOCKED() function, in addition to the
usual definitions created by PCTRIE_DEFINE().

Interface consumers will be introduced in later commits.

As future work, it might be nice to add vm_radix algorithms missing from
generic pctrie to the pctrie interface, and then adapt vm_radix to use
pctrie.

Reported by: Attilio
Reviewed by: markj
Sponsored by: Isilon
Differential Revision: https://reviews.freebsd.org/D25781

3 years agolockmgr: add missing 'continue' to account for spuriously failed fcmpset
mjg [Fri, 24 Jul 2020 17:28:24 +0000 (17:28 +0000)]
lockmgr: add missing 'continue' to account for spuriously failed fcmpset

PR: 248245
Reported by: gbe
Noted by: markj
Fixes by: r363415 ("lockmgr: add adaptive spinning")
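
The class of bug named in the title can be illustrated with C11 atomics,
where atomic_compare_exchange_weak() stands in for fcmpset on the
assumption that both may fail spuriously and both update the expected
value on failure. The demo_try_lock() function and DEMO_UNLOCKED constant
are hypothetical and are not lockmgr code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define	DEMO_UNLOCKED	((uintptr_t)0)

/*
 * Hypothetical try-lock loop. atomic_compare_exchange_weak() may fail
 * even when *lock == expected, and on failure it stores the observed
 * value into 'expected'. The loop therefore re-evaluates the observed
 * value and retries, instead of treating any failure as contention.
 */
static bool
demo_try_lock(_Atomic uintptr_t *lock, uintptr_t owner)
{
	uintptr_t expected = DEMO_UNLOCKED;

	assert(owner != DEMO_UNLOCKED);
	for (;;) {
		if (atomic_compare_exchange_weak(lock, &expected, owner))
			return (true);
		if (expected == DEMO_UNLOCKED)
			continue;	/* spurious failure: just retry */
		return (false);		/* genuinely held by someone else */
	}
}
```

Without the 'continue', a spurious failure would fall through to the
contended path even though the lock was observed to be free.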

3 years agommccam: Add some aliases for non-mmccam to mmccam transition
manu [Fri, 24 Jul 2020 17:11:14 +0000 (17:11 +0000)]
mmccam: Add some aliases for non-mmccam to mmccam transition

A new tunable, kern.cam.sdda.mmcsd_compat, enables or disables
this feature (it is enabled by default).

3 years agoRemove reference to nlist(3) missed in SCCS revision 5.26 by mckusick
jmallett [Fri, 24 Jul 2020 16:58:13 +0000 (16:58 +0000)]
Remove reference to nlist(3) missed in SCCS revision 5.26 by mckusick
when converting rwhod(8) to using kern.boottime rather than extracting
the boot time from kernel memory directly.

Reviewed by: imp

3 years agoFix grammar issues and typos
0mp [Fri, 24 Jul 2020 15:04:34 +0000 (15:04 +0000)]
Fix grammar issues and typos

Reported by: ian
MFC after: 1 week

3 years agoDocument that force_depend() supports only /etc/rc.d scripts
0mp [Fri, 24 Jul 2020 14:17:37 +0000 (14:17 +0000)]
Document that force_depend() supports only /etc/rc.d scripts

Currently, force_depend() from rc.subr(8) does not support depending on
scripts outside of /etc/rc.d (like /usr/local/etc/rc.d). The /etc/rc.d path
is hard-coded into force_depend().

MFC after: 1 week

3 years agovm: fix swap reservation leak and clean up surrounding code
mjg [Fri, 24 Jul 2020 13:23:32 +0000 (13:23 +0000)]
vm: fix swap reservation leak and clean up surrounding code

The code did not subtract from the global counter if per-uid reservation
failed.

Cleanup highlights:
- load overcommit once
- move per-uid manipulation to dedicated routines
- don't fetch wire count if requested size is below the limit
- convert return type from int to bool
- ifdef the routines with _KERNEL to keep vm.h compilable by userspace

Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D25787

3 years agoInclude TMPFS in all the GENERIC kernel configs
arichardson [Fri, 24 Jul 2020 08:40:04 +0000 (08:40 +0000)]
Include TMPFS in all the GENERIC kernel configs

Being able to use tmpfs without kernel modules is very useful when building
small MFS_ROOT kernels without a real file system.
Including TMPFS also matches arm/GENERIC and the MIPS std.MALTA configs.

Compiling TMPFS only adds 4 .c files so this should not make much of a
difference to NO_MODULES build times (as we do for our minimal RISC-V
images).

Reviewed By: br (earlier version for riscv), brooks, emaste
Differential Revision: https://reviews.freebsd.org/D25317

3 years agofix up docs for m_getjcl as well..
jmg [Fri, 24 Jul 2020 00:47:14 +0000 (00:47 +0000)]
fix up docs for m_getjcl as well..

3 years agodocument that m_get2 only accepts up to MJUMPAGESIZE..
jmg [Fri, 24 Jul 2020 00:35:21 +0000 (00:35 +0000)]
document that m_get2 only accepts up to MJUMPAGESIZE..

3 years agoAdd support for KTLS RX via software decryption.
jhb [Thu, 23 Jul 2020 23:48:18 +0000 (23:48 +0000)]
Add support for KTLS RX via software decryption.

Allow TLS records to be decrypted in the kernel after being received
by a NIC.  At a high level this is somewhat similar to software KTLS
for the transmit path except in reverse.  Protocols enqueue mbufs
containing encrypted TLS records (or portions of records) into the
tail of a socket buffer and the KTLS layer decrypts those records
before returning them to userland applications.  However, there is an
important difference:

- In the transmit case, the socket buffer is always a single "record"
  holding a chain of mbufs.  Not-yet-encrypted mbufs are marked not
  ready (M_NOTREADY) and released to protocols for transmit by marking
  mbufs ready once their data is encrypted.

- In the receive case, incoming (encrypted) data appended to the
  socket buffer is still a single stream of data from the protocol,
  but decrypted TLS records are stored as separate records in the
  socket buffer and read individually via recvmsg().

Initially I tried to make this work by marking incoming mbufs as
M_NOTREADY, but there didn't seem to be a non-gross way to deal with
picking a portion of the mbuf chain and turning it into a new record
in the socket buffer after decrypting the TLS record it contained
(along with prepending a control message).  Such mbufs would
also need to be "pinned" in some way while they are being decrypted
such that a concurrent sbcut() wouldn't free them out from under the
thread performing decryption.

As such, I settled on the following solution:

- Socket buffers now contain an additional chain of mbufs (sb_mtls,
  sb_mtlstail, and sb_tlscc) containing encrypted mbufs appended by
  the protocol layer.  These mbufs are still marked M_NOTREADY, but
  soreceive*() generally don't know about them (except that they will
  block waiting for data to be decrypted for a blocking read).

- Each time a new mbuf is appended to this TLS mbuf chain, the socket
  buffer peeks at the TLS record header at the head of the chain to
  determine the encrypted record's length.  If enough data is queued
  for the TLS record, the socket is placed on a per-CPU TLS workqueue
  (reusing the existing KTLS workqueues and worker threads).

- The worker thread loops over the TLS mbuf chain decrypting records
  until it runs out of data.  Each record is detached from the TLS
  mbuf chain while it is being decrypted to keep the mbufs "pinned".
  However, a new sb_dtlscc field tracks the character count of the
  detached record and sbcut()/sbdrop() is updated to account for the
  detached record.  After the record is decrypted, the worker thread
  first checks to see if sbcut() dropped the record.  If so, it is
  freed (can happen when a socket is closed with pending data).
  Otherwise, the header and trailer are stripped from the original
  mbufs, a control message is created holding the decrypted TLS
  header, and the decrypted TLS record is appended to the "normal"
  socket buffer chain.

(Side note: the SBCHECK() infrastructure was very useful as I was
 able to add assertions there about the TLS chain that caught several
 bugs during development.)
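
The "peek at the TLS record header" step described above can be sketched
in userland C. The sketch relies only on the standard TLS record layout
(a 5-byte header whose last two bytes are a big-endian payload length);
demo_tls_record_ready() is a hypothetical name and the code operates on a
flat byte array rather than an mbuf chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	DEMO_TLS_HDR_LEN	5	/* type(1) + version(2) + length(2) */

/*
 * Hypothetical check: given the bytes queued so far, report whether a
 * complete TLS record (header plus payload) is available. The payload
 * length is the big-endian 16-bit field at offset 3 of the header.
 * Once the header is complete, *reclen is set to the total size of the
 * record including the header, even if the payload is still partial.
 */
static bool
demo_tls_record_ready(const uint8_t *buf, size_t queued, size_t *reclen)
{
	size_t payload;

	if (queued < DEMO_TLS_HDR_LEN)
		return (false);		/* header itself is incomplete */
	payload = ((size_t)buf[3] << 8) | buf[4];
	*reclen = DEMO_TLS_HDR_LEN + payload;
	return (queued >= *reclen);
}
```

In the design described by the commit, a positive result of this kind of
check is what would cause the socket to be handed to a worker thread for
decryption.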

Tested by: rmacklem (various versions)
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24628

3 years agoLimit gmirror failpoint tests to the test worker
bdrewery [Thu, 23 Jul 2020 23:29:50 +0000 (23:29 +0000)]
Limit gmirror failpoint tests to the test worker

This avoids injecting errors into the test system's mirrors.

gnop seems like a good solution here but it injects errors at the wrong
place vs where these tests expect and does not support a 'max global count'
like the failpoints do with 'n*' syntax.

Reviewed by: cem, vangyzen
Sponsored by: Dell EMC Isilon

3 years agoupdate example to make it active when creating a new boot method...
jmg [Thu, 23 Jul 2020 22:28:35 +0000 (22:28 +0000)]
update example to make it active when creating a new boot method...

Clean up some of the sentences and grammar...

make igor happy..

3 years agoConsolidate duplicated code into a ktls_ocf_dispatch function.
jhb [Thu, 23 Jul 2020 21:43:06 +0000 (21:43 +0000)]
Consolidate duplicated code into a ktls_ocf_dispatch function.

This function manages the loop around crypto_dispatch and coordination
with ktls_ocf_callback.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D25757

3 years agoSet si_trapno to the exception code from esr.
jhb [Thu, 23 Jul 2020 21:40:03 +0000 (21:40 +0000)]
Set si_trapno to the exception code from esr.

Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25771

3 years agoPass the right size to memcpy() when copying the array of FP registers.
jhb [Thu, 23 Jul 2020 21:33:10 +0000 (21:33 +0000)]
Pass the right size to memcpy() when copying the array of FP registers.

The size of the containing structure was passed instead of the size of
the array.  This happened to be harmless as the extra word copied is
one we copy in the next line anyway.
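
A minimal sketch of the distinction, using a hypothetical structure
(struct demo_fpregs and its field names are illustrative and do not match
the real register frame):

```c
#include <assert.h>
#include <string.h>

#define	DEMO_NFPREGS	4	/* hypothetical register count */

/* Hypothetical layout: an array of FP registers followed by one extra word. */
struct demo_fpregs {
	unsigned long regs[DEMO_NFPREGS];
	unsigned long fpcsr;
};

static void
demo_copy_fpregs(struct demo_fpregs *dst, const struct demo_fpregs *src)
{
	/*
	 * sizeof(dst->regs) covers only the array. Using sizeof(*dst) here
	 * would also copy fpcsr, which reads past the end of the source
	 * array even though it stays inside the structure.
	 */
	memcpy(dst->regs, src->regs, sizeof(dst->regs));
	dst->fpcsr = src->fpcsr;
}
```

On conventional hardware the oversized copy is harmless, as the message
says, but a bounds-checking architecture sees an access beyond the array
that was named as the source.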

Reported by: CHERI (bounds check violation)
Reviewed by: brooks, imp
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25791

3 years agoSet si_addr to badvaddr for TLB faults.
jhb [Thu, 23 Jul 2020 20:08:42 +0000 (20:08 +0000)]
Set si_addr to badvaddr for TLB faults.

Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25775

3 years agomd5: return non-zero if built-in tests (-x) fail
emaste [Thu, 23 Jul 2020 20:06:24 +0000 (20:06 +0000)]
md5: return non-zero if built-in tests (-x) fail

MFC after: 1 week
Sponsored by: The FreeBSD Foundation

3 years agoClear the pointer to the socket when closing it also in case of
tuexen [Thu, 23 Jul 2020 19:43:49 +0000 (19:43 +0000)]
Clear the pointer to the socket when closing it also in case of
an ungraceful operation.
This fixes a use-after-free bug found and reported by Taylor
Brandstetter of Google by testing the userland stack.

MFC after: 1 week

3 years agomodules/crypto: disable optimized assembly skein1024 implementation
emaste [Thu, 23 Jul 2020 19:19:33 +0000 (19:19 +0000)]
modules/crypto: disable optimized assembly skein1024 implementation

It is presumably broken in the same way as userland skein1024 (see r363454)

PR: 248221

3 years agolibmd: temporarily disable optimized assembly skein1024 implementation
emaste [Thu, 23 Jul 2020 18:55:47 +0000 (18:55 +0000)]
libmd: temporarily disable optimized assembly skein1024 implementation

It is apparently broken when assembled by contemporary GNU as as well as
Clang IAS (which is used in the default configuration).

PR: 248221
Reported by: pizzamig
Sponsored by: The FreeBSD Foundation

3 years agoDocument the IPFILTER_PREDEFINED environment variable.
cy [Thu, 23 Jul 2020 17:39:49 +0000 (17:39 +0000)]
Document the IPFILTER_PREDEFINED environment variable.

PR: 248088
Reported by: joeb1@a1poweruser.com
MFC after: 1 week

3 years agoLoad ipfilter, ipnat, and ippool rules, and start ipmon in a vnet jail.
cy [Thu, 23 Jul 2020 17:39:45 +0000 (17:39 +0000)]
Load ipfilter, ipnat, and ippool rules, and start ipmon in a vnet jail.

PR: 248109
Reported by: joeb1@a1poweruser.com
MFC after: 2 weeks

3 years agolocks: fix a long standing bug for primitives with kdtrace but without spinning
mjg [Thu, 23 Jul 2020 17:26:53 +0000 (17:26 +0000)]
locks: fix a long standing bug for primitives with kdtrace but without spinning

In such a case the second argument to lock_delay_arg_init was NULL which was
immediately causing a null pointer deref.

Since the structure is only used for spin count, provide a dedicated routine
initializing it.

Reported by: andrew

3 years agoRank balanced (RB) trees are a class of balanced trees that includes
dougm [Thu, 23 Jul 2020 17:16:20 +0000 (17:16 +0000)]
Rank balanced (RB) trees are a class of balanced trees that includes
AVL trees, red-black trees, and others. Weak AVL (wavl) trees are a
recently discovered member of that class. This change replaces
red-black rebalancing with weak AVL rebalancing in the RB tree macros.

Wavl trees sit between AVL and red-black trees in terms of how
strictly balance is enforced. They have the stricter balance of AVL
trees as the tree is built - a wavl tree is an AVL tree until the
first deletion. Once removals start, wavl trees are lazier about
rebalancing than AVL trees, so that removals can be fast, but the
balance of the tree can decay to that of a red-black tree. Subsequent
insertions can push balance back toward the stricter AVL conditions.

Removing a node from a wavl tree never requires more than two
rotations, which is better than either red-black or AVL
trees. Inserting a node into a wavl tree never requires more than two
rotations, which matches red-black and AVL trees. The only
disadvantage of wavl trees compared to red-black trees is that more insertions
are likely to adjust the tree a bit. That's the cost of keeping the
tree more balanced.

Testing has shown that for the cases where red-black trees do worst,
wavl trees' better balance leads to faster lookups, so that if lookups
outnumber insertions by a nontrivial amount, lookup time saved exceeds
the extra cost of balancing.

Reviewed by: alc, gbe, markj
Tested by: pho
Discussed with: emaste
Differential Revision: https://reviews.freebsd.org/D25480

3 years agorc.firewall: Merge two identical conditions into one.
markj [Thu, 23 Jul 2020 15:03:28 +0000 (15:03 +0000)]
rc.firewall: Merge two identical conditions into one.

No functional change intended.

PR: 247949
Submitted by: Jose Luis Duran <jlduran@gmail.com>
MFC after: 1 week

3 years agoAdd missing newlines.
mav [Thu, 23 Jul 2020 14:33:25 +0000 (14:33 +0000)]
Add missing newlines.

MFC after: 3 days

3 years agoMFOpenZFS: Fix zpool history unbounded memory usage
markj [Thu, 23 Jul 2020 14:21:45 +0000 (14:21 +0000)]
MFOpenZFS: Fix zpool history unbounded memory usage

In the original implementation, zpool history reads the whole history
before printing anything, causing memory usage to grow without bound. We fix
this by breaking it into read-print iterations.
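
The read-print pattern can be sketched generically with stdio. This does
not use the libzfs API: demo_stream_copy() and the DEMO_CHUNK size are
hypothetical, and the sketch only shows how processing one chunk at a
time keeps memory use independent of the total input size.

```c
#include <assert.h>
#include <stdio.h>

#define	DEMO_CHUNK	16	/* deliberately small, hypothetical chunk size */

/*
 * Hypothetical read-print loop: copy 'in' to 'out' one chunk at a time,
 * so memory use is bounded by DEMO_CHUNK regardless of the input size.
 * Returns the number of bytes copied, or -1 on a read or write error.
 */
static long
demo_stream_copy(FILE *in, FILE *out)
{
	char buf[DEMO_CHUNK];
	size_t n;
	long total = 0;

	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		if (fwrite(buf, 1, n, out) != n)
			return (-1);
		total += (long)n;
	}
	return (ferror(in) ? -1 : total);
}
```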

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #9516

Note, this change changes the libzfs.so ABI by modifying the prototype
of zpool_get_history().  Since libzfs is effectively private to the base
system it is anticipated that this will not be a problem.

PR: 247557
Obtained from: OpenZFS
Reported and tested by: Sam Vaughan <samjvaughan@gmail.com>
Discussed with: freqlabs
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25745
openzfs/zfs@7125a109dcc55628336ff3f58e59e503f4d7694d

3 years agocuse: Stop checking for failures from malloc(M_WAITOK).
markj [Thu, 23 Jul 2020 14:03:37 +0000 (14:03 +0000)]
cuse: Stop checking for failures from malloc(M_WAITOK).

PR: 240545
Submitted by: Andrew Reiter <arr@watson.org>
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25765

3 years agontb: Stop checking for failures from malloc(M_WAITOK).
markj [Thu, 23 Jul 2020 14:03:24 +0000 (14:03 +0000)]
ntb: Stop checking for failures from malloc(M_WAITOK).

PR: 240545
Submitted by: Andrew Reiter <arr@watson.org>
Reviewed by: cem, mav
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25768

3 years agovm: annotate swap_reserved with __exclusive_cache_line
mjg [Thu, 23 Jul 2020 08:42:16 +0000 (08:42 +0000)]
vm: annotate swap_reserved with __exclusive_cache_line

The counter is updated all the time and variables read afterwards share
its cacheline. Note this still fundamentally does not scale and needs to
be replaced; in the meantime it gets a bandaid.
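
A userland analogue of the idea, assuming a 64-byte cache line, might look
like the sketch below. The struct and field names are hypothetical and
this is not the kernel annotation itself; it only shows a frequently
written counter being kept off the line that holds read-mostly data.

```c
#include <stdalign.h>
#include <stddef.h>

/* Assumed cache line size; the real value is platform-dependent. */
#define	CACHE_LINE_SIZE_ASSUMED	64

/*
 * Hypothetical layout: each member starts on its own cache line, so
 * writes to hot_counter do not invalidate the line that holds
 * read_mostly_limit on other CPUs.
 */
struct counters_sketch {
	alignas(CACHE_LINE_SIZE_ASSUMED) unsigned long hot_counter;
	alignas(CACHE_LINE_SIZE_ASSUMED) unsigned long read_mostly_limit;
};
```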

brk1_processes -t 52 ops/s:
before: 8598298
after:  9098080

3 years agoDetect and handle an invalid reassembly constellation, which results in
tuexen [Thu, 23 Jul 2020 01:35:24 +0000 (01:35 +0000)]
Detect and handle an invalid reassembly constellation, which results in
a memory leak.

Thanks to Felix Weinrank for finding this issue by fuzz testing the
userland stack.

MFC after: 1 week

3 years agoCorrect a type-mismatch between xdr_long and the variable "bad".
brooks [Wed, 22 Jul 2020 23:39:58 +0000 (23:39 +0000)]
Correct a type-mismatch between xdr_long and the variable "bad".

Way back in r28911 (August 1997, CVS rev 1.22) we imported a NetBSD
information leak fix via OpenBSD.  Unfortunately we failed to track the
followup commit that fixed the type of the error code.  Apply the change
from int to long now.
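
Why the type matters can be sketched as follows. xdr_long() takes a
long *, so the callee accesses sizeof(long) bytes through the pointer;
encode_long below is a hypothetical stand-in with the same pointer type,
and encode_error is likewise made up for illustration.

```c
#include <string.h>

/* Hypothetical stand-in for an encoder that, like xdr_long(), takes a long *. */
size_t
encode_long(unsigned char *dst, const long *src)
{
	memcpy(dst, src, sizeof(*src));	/* reads sizeof(long) bytes */
	return (sizeof(*src));
}

/*
 * Encode an error code.  The local is declared long so that its size
 * matches what the callee reads; with an int local, the callee would
 * read past the variable wherever long is wider than int.
 */
size_t
encode_error(unsigned char *dst, int error)
{
	long bad = error;

	return (encode_long(dst, &bad));
}
```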

Reviewed by: emaste
Found by: CHERI
Obtained from: CheriBSD
MFC after: 3 days
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25779

3 years agoUse SI_ORDER_(FOURTH|FIFTH) rather than bespoke versions.
brooks [Wed, 22 Jul 2020 23:35:41 +0000 (23:35 +0000)]
Use SI_ORDER_(FOURTH|FIFTH) rather than bespoke versions.

No functional change.

When these SYSINITs were added these macros didn't exist.

Reviewed by: imp
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25758

3 years agoModify writing to mirrored pNFS DSs to prepare for use of ext_pgs mbufs.
rmacklem [Wed, 22 Jul 2020 23:33:37 +0000 (23:33 +0000)]
Modify writing to mirrored pNFS DSs to prepare for use of ext_pgs mbufs.

This patch modifies writing to mirrored pNFS DSs slightly so that there is
only one m_copym() call for a mirrored pair instead of two of them.
This call replaces the custom nfsm_copym() call, which is no longer needed
and deleted by this patch. The patch does introduce a new nfsm_split()
function that only calls m_split() for the non-ext_pgs case.
The semantics of nfsm_uiombuflist() is changed to include code that nul
pads the generated mbuf list. This was done by nfsm_copym() prior to this patch.
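
The commit text does not state the padding unit. Assuming the usual XDR
rule that opaque data is padded with zero bytes to a multiple of four, the
pad length could be computed as in this sketch (xdr_pad_len is a
hypothetical helper, not a function from the patch):

```c
#include <stddef.h>

/* Number of zero bytes needed to round len up to a multiple of 4. */
size_t
xdr_pad_len(size_t len)
{
	return ((4 - (len & 3)) & 3);
}
```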

The main reason for this change is that it allows the data to be a list
of ext_pgs mbufs, since the m_copym() is for the entire mbuf list.
This support will be added in a future commit.

This patch only affects writing to mirrored flexible file layout pNFS servers.

3 years agoAdd missing space after switch.
jhb [Wed, 22 Jul 2020 22:51:14 +0000 (22:51 +0000)]
Add missing space after switch.

Reviewed by: br, emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25778

3 years agoAvoid reading one byte before the path buffer.
brooks [Wed, 22 Jul 2020 21:44:51 +0000 (21:44 +0000)]
Avoid reading one byte before the path buffer.

This happens when there's only one component (e.g. "/foo"). This
(mostly-harmless) bug has been present since June 1990 when it was
committed to mountd.c SCCS version 5.9.

Note: the bug is on the second changed line, the first line is changed
for visual consistency.
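
The general class of bug can be illustrated with a sketch that backs up
over the last path component while testing the bound before every
dereference. This is not the mountd.c code; strip_last_component is a
hypothetical helper written for illustration.

```c
#include <string.h>

/*
 * Hypothetical helper: truncate path in place to its parent directory.
 * Every bounds check on cp is evaluated before the dereference of
 * cp[-1] that it guards, so nothing before path[0] is ever read.
 */
void
strip_last_component(char *path)
{
	char *cp = path + strlen(path);

	/* Skip trailing slashes, but keep a lone leading "/". */
	while (cp > path + 1 && *(cp - 1) == '/')
		cp--;
	/* Skip the last component itself. */
	while (cp > path && *(cp - 1) != '/')
		cp--;
	/* Skip the separator(s), again keeping a leading "/". */
	while (cp > path + 1 && *(cp - 1) == '/')
		cp--;
	*cp = '\0';
}
```

With a single-component input such as "/foo", the second loop stops at
path + 1 because the bound is checked first, and the result is "/".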

Reviewed by: cem, emaste, mckusick, rmacklem
Found with: CHERI
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25759

3 years agoUntie nmi_handle_intr() from DEV_ISA.
mav [Wed, 22 Jul 2020 20:15:21 +0000 (20:15 +0000)]
Untie nmi_handle_intr() from DEV_ISA.

The only part of nmi_handle_intr() depending on ISA is isa_nmi(), which is
already wrapped.  Entering debugger on NMI does not really depend on ISA.

MFC after: 2 weeks

3 years agommccam: Add support for 1.2V and 1.8V eMMC
manu [Wed, 22 Jul 2020 19:08:05 +0000 (19:08 +0000)]
mmccam: Add support for 1.2V and 1.8V eMMC

If the card reports that it supports 1.2V or 1.8V signaling, switch to this voltage.

Submitted by: kibab

3 years agommccam: Add support for 1.8V sdcard
manu [Wed, 22 Jul 2020 19:04:45 +0000 (19:04 +0000)]
mmccam: Add support for 1.8V sdcard

If the card reports that it supports 1.8V signaling, switch to this voltage.
While here, update the list of modes for mmccam.

Submitted by: kibab

3 years agoaw_mmc: Start a mmccam discovery when the CD handler is called.
manu [Wed, 22 Jul 2020 18:33:36 +0000 (18:33 +0000)]
aw_mmc: Start a mmccam discovery when the CD handler is called.

Submitted by: kibab

3 years agommccam: Add a generic mmccam_start_discovery function
manu [Wed, 22 Jul 2020 18:30:17 +0000 (18:30 +0000)]
mmccam: Add a generic mmccam_start_discovery function

This is a generic function to start a scan request for the given
cam_sim.
Other drivers can now just use this function to request a new rescan.

Submitted by: kibab

3 years agommccam: Use a sbuf for the mmc ident function
manu [Wed, 22 Jul 2020 18:21:37 +0000 (18:21 +0000)]
mmccam: Use a sbuf for the mmc ident function

While here fix a typo.

3 years agoFix sys.geom.class.eli.onetime_test.onetime after r363402
lwhsu [Wed, 22 Jul 2020 17:37:11 +0000 (17:37 +0000)]
Fix sys.geom.class.eli.onetime_test.onetime after r363402

PR: 247954
X-MFC with: r363402
Sponsored by: The FreeBSD Foundation

3 years agommc_xpt: Fix debug messages
manu [Wed, 22 Jul 2020 17:36:28 +0000 (17:36 +0000)]
mmc_xpt: Fix debug messages

PROBE_RESET was printed for PROBE_IDENTIFY, fix this.
While here add one for the PROBE_RESET.

Submitted by: kibab

3 years agopkg-bootstrap: complain on improper `pkg bootstrap` usage
kevans [Wed, 22 Jul 2020 17:33:35 +0000 (17:33 +0000)]
pkg-bootstrap: complain on improper `pkg bootstrap` usage

Right now, the bootstrap will gloss over things like pkg bootstrap -x or
pkg bootstrap -f pkg. Make it more clear that this is incorrect, and hint
at the correct formatting.
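
The kind of check described can be sketched as below. The accepted
argument set (nothing, or a single -f) is an assumption inferred from the
two rejected examples above, and bootstrap_args_ok is a hypothetical name
rather than the function in the change.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical validator for the arguments that follow "bootstrap".
 * Assumption: only no arguments or a lone "-f" are valid, so both
 * "-x" and "-f pkg" are rejected.
 */
bool
bootstrap_args_ok(int argc, char *const argv[])
{
	if (argc == 0)
		return (true);
	return (argc == 1 && strcmp(argv[0], "-f") == 0);
}
```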

Reported by: jhb (IIRC via IRC)
Approved by: bapt, jhb, manu
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24750

3 years agousb(4): Stop checking for failures from malloc(M_WAITOK).
markj [Wed, 22 Jul 2020 14:32:47 +0000 (14:32 +0000)]
usb(4): Stop checking for failures from malloc(M_WAITOK).

Handle the fact that parts of usb(4) can be compiled into the boot
loader, where M_WAITOK does not guarantee a successful allocation.

PR: 240545
Submitted by: Andrew Reiter <arr@watson.org> (original version)
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25706

3 years agoAdd tests for "add", "change" and "delete" functionality of /sbin/route.
thj [Wed, 22 Jul 2020 13:49:54 +0000 (13:49 +0000)]
Add tests for "add", "change" and "delete" functionality of /sbin/route.

Add tests to cover "add", "change" and "delete" functionality of /sbin/route
for ipv4 and ipv6. These tests for the existing route tool are the first step
towards creating libroute.

Submitted by:   Ahsan Barkati
Sponsored by:   Google, Inc. (GSoC 2020)
Reviewed by:    kp, thj
Approved by:    bz (mentor)
MFC after:      1 month
Differential Revision:  https://reviews.freebsd.org/D25220

3 years agogeli(8): Add missing commands in the EXAMPLES section
gbe [Wed, 22 Jul 2020 13:00:56 +0000 (13:00 +0000)]
geli(8): Add missing commands in the EXAMPLES section

- Add a missing 'geli attach' command
- Fix the passphrase prompt for a 'geli attach' command

Reported by: Fabian Keil <freebsd-listen at fabiankeil dot de>
Reviewed by: bcr (mentor)
Approved by: bcr (mentor)
Differential Revision: https://reviews.freebsd.org/D25761

3 years agolockmgr: add adaptive spinning
mjg [Wed, 22 Jul 2020 12:30:31 +0000 (12:30 +0000)]
lockmgr: add adaptive spinning

It is very conservative: it spins only when LK_ADAPTIVE is passed, only
on exclusive locks, and never when any waiters are present. The buffer
cache remains non-spinning.
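
The policy can be sketched in userland C11 atomics as follows. This is a
simplification under stated assumptions: the flag values, SPIN_LIMIT and
try_lock_adaptive are hypothetical, and a bounded retry count stands in
for whatever condition the kernel code uses to decide how long to spin.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define	LOCK_HELD	0x1u	/* hypothetical: exclusively owned */
#define	LOCK_WAITERS	0x2u	/* hypothetical: a thread sleeps on the lock */

enum { SPIN_LIMIT = 1000 };	/* hypothetical bound on spinning */

/*
 * Try to take the lock exclusively, spinning briefly when allowed.
 * Returns false when the caller should fall back to sleeping.
 */
bool
try_lock_adaptive(atomic_uint *lock, bool adaptive)
{
	for (int i = 0; i < SPIN_LIMIT; i++) {
		unsigned int v = atomic_load_explicit(lock,
		    memory_order_relaxed);

		if ((v & LOCK_HELD) == 0) {
			if (atomic_compare_exchange_weak_explicit(lock, &v,
			    v | LOCK_HELD, memory_order_acquire,
			    memory_order_relaxed))
				return (true);
			continue;
		}
		/* Never spin without the opt-in flag or once waiters exist. */
		if (!adaptive || (v & LOCK_WAITERS) != 0)
			return (false);
	}
	return (false);
}
```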

This reduces total sleep times during buildworld etc., but it does not shorten
total real time (culprits are contention in the vm subsystem along with slock +
upgrade which is not covered).

For microbenchmarks: open3_processes -t 52 (open/close of the same file for
writing) ops/s:
before: 258845
after: 801638

Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D25753

3 years agoConsistently use gctl_get_provider instead of home-grown variants.
delphij [Wed, 22 Jul 2020 02:15:21 +0000 (02:15 +0000)]
Consistently use gctl_get_provider instead of home-grown variants.

Reviewed by: cem, imp
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D25739

3 years agogctl_get_class, gctl_get_geom and gctl_get_provider: provide feedback
delphij [Wed, 22 Jul 2020 02:14:27 +0000 (02:14 +0000)]
gctl_get_class, gctl_get_geom and gctl_get_provider: provide feedback
when the requested argument is missing.

Reviewed by: cem
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D25738

3 years agolibbe: annotate lbh as __unused in be_is_auto_snapshot_name
kevans [Wed, 22 Jul 2020 02:09:10 +0000 (02:09 +0000)]
libbe: annotate lbh as __unused in be_is_auto_snapshot_name

lbh is included for consistency with other functions and in case
future work needs to use it, but it is currently unused. Mark it,
and a post-OpenZFS-import world will be able to raise WARNS of
libbe to the default (pending some minor changes to openzfs libzfs).

MFC after: 3 days

3 years agogetty appears to date from 3rd edition research unix. That's the oldest man page
imp [Wed, 22 Jul 2020 00:44:47 +0000 (00:44 +0000)]
getty appears to date from 3rd edition research unix. That's the oldest man page
on TUHS and its 'unix 1972' restoration effort has assembler sources that look
like a simpler version of what's in the 5th edition.

3 years agoINTRNG: only shuffle for !EARLY_AP_STARTUP
mhorne [Tue, 21 Jul 2020 22:47:02 +0000 (22:47 +0000)]
INTRNG: only shuffle for !EARLY_AP_STARTUP

During device attachment, all interrupt sources will bind to the BSP,
as it is the only processor online. This means interrupts must be
redistributed ("shuffled") later, during SI_SUB_SMP.

For the EARLY_AP_STARTUP case, this is no longer true. SI_SUB_SMP will
execute much earlier, meaning APs will be online and available before
devices begin attachment, and there will therefore be nothing to
shuffle.
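
What a late "shuffle" amounts to can be sketched as a redistribution pass
over sources that were all bound to one CPU. The round-robin policy and
the name shuffle_irqs are assumptions made for illustration; the actual
INTRNG placement logic is not shown in this message.

```c
/*
 * Hypothetical shuffle: irq_cpu[i] holds the CPU that source i is bound
 * to.  Before the pass every entry is the boot processor; afterwards the
 * sources are spread round-robin over ncpus CPUs.
 */
void
shuffle_irqs(int *irq_cpu, int nirqs, int ncpus)
{
	if (ncpus <= 0)
		return;
	for (int i = 0; i < nirqs; i++)
		irq_cpu[i] = i % ncpus;
}
```

When all CPUs are already online at attach time, each source can be
placed on its final CPU directly and no such pass is needed.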

All PIC-conforming interrupt controllers will handle this early
distribution properly, except for RISC-V's PLIC. Make the necessary
tweak to the PLIC driver.

While here, convert irq_assign_cpu from a boolean_t to a bool.

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D25693