cem [Fri, 22 Nov 2019 19:30:31 +0000 (19:30 +0000)]
random/ivy: Provide mechanism to read independent seed values from rdrand
On x86 platforms with the intrinsic, rdrand is a deterministic bit generator
(AES-CTR) seeded from an entropic source. On x86 platforms with rdseed, it
is something closer to the upstream entropic source. (There is more nuance;
a block diagram is provided in [1].)
On devices with rdrand and without rdseed, there is no good intrinsic for
acecssing the good entropic soure directly. However, the DRBG is guaranteed
to reseed every 8 kB on these platforms. As a conservative option, on such
hardware we can read an extra 7.99kB samples every time we want a sample
from an independent seed.
As one can imagine, this drastically slows the effective read rate of
RDRAND (a factor of 1024 on amd64 and 2048 on ia32). Microbenchmarks on AMD
Zen (has RDSEED) show an RDRAND rate of 25 MB/s and Intel Haswell (no
RDSEED) show RDRAND of 170 MB/s. This would reduce the read rate on Haswell
to ~170 kB/s (at 100% CPU). random(4)'s harvestq thread periodically
"feeds" from pure sources in amounts of 128-1024 bytes. On Haswell,
enabling this feature increases the CPU time of RDRAND in each "feed" from
approximately 0.7-6 µs to 0.7-6 ms.
Because there is some performance penalty to this more conservative option,
a knob is provided to enable the change. The change does not affect
platforms with RDSEED.
mav [Fri, 22 Nov 2019 18:39:51 +0000 (18:39 +0000)]
Make CAM use root_mount_hold_token() to delay boot.
Before this change CAM used config_intrhook_establish() for this purpose,
but that approach does not allow to delay it again after releasing once.
USB stack uses root_mount_hold() to delay boot until bus scan is complete.
But once it is, CAM had no time to scan SCSI bus, registered by umass(4),
if it already done other scans and called config_intrhook_disestablish().
The new approach makes it work smooth, assuming the USB device is found
during the initial bus scan. Devices appearing on USB bus later may still
require setting kern.cam.boot_delay, but hopefully those are minority.
markj [Fri, 22 Nov 2019 16:31:30 +0000 (16:31 +0000)]
Reclaim memory from UMA if the page daemon is struggling.
Use the UMA reclaim thread to asynchronously drain all caches if
there is a severe shortage in a domain. Otherwise we only trigger UMA
reclamation every 10s even when the system has completely run out of
memory.
Stop entirely draining the caches when one domain falls below its min
threshold. In some workloads it is normal for one NUMA domain to end
up being nearly depleted by kernel memory allocations, for example for
the ZFS ARC. The domainset iterators skip domains below the
vmd_min_free theshold on the first iteration, so we should allow that
mechanism to limit further depletion of the domain's free pages before
taking the extreme step of calling uma_reclaim(UMA_RECLAIM_DRAIN_CPU).
Discussed with: jeff
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22395
markj [Fri, 22 Nov 2019 16:31:10 +0000 (16:31 +0000)]
Update the checks in vm_page_zone_import().
- Remove the cnt == 1 check. UMA passes cnt == 1 when it has disabled
per-CPU caching. In this case we might as well just allocate a single
page and return it to the caller, since the caller is going to do
exactly that anyway if the UMA cache allocation attempt fails.
- Don't replenish caches if the domain is severely short on free pages.
With large buckets we may otherwise quickly exacerbate a situation
where the page daemon is failing to keep up.
- Don't replenish caches if the calling thread belongs to the page
daemon, which should avoid creating extra memory pressure when it is
trying to free memory. Virtually all such allocations while occur in
the context of laundering, where the laundry thread must allocate
slabs for various swap and I/O-related UMA zones.
Reviewed by: kib
Discussed with: alc, jeff
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22394
markj [Fri, 22 Nov 2019 16:30:47 +0000 (16:30 +0000)]
Revise the page cache size policy.
In r353734 the use of the page caches was limited to systems with a
relatively large amount of RAM per CPU. This was to mitigate some
issues reported with the system not able to keep up with memory pressure
in cases where it had been able to do so prior to the addition of the
direct free pool cache. This change re-enables those caches.
The change modifies uma_zone_set_maxcache(), which was introduced
specifically for the page cache zones. Rather than using it to limit
only the full bucket cache, have it also set uz_count_max to provide an
upper bound on the per-CPU cache size that is consistent with the number
of items requested. Remove its return value since it has no use.
Enable the page cache zones unconditionally, and limit them to 0.1% of
the domain's pages. The limit can be overridden by the
vm.pgcache_zone_max tunable as before.
Change the item size parameter passed to uma_zcache_create() to the
correct size, and stop setting UMA_ZONE_MAXBUCKET. This allows the page
cache buckets to be adaptively sized, like the rest of UMA's caches.
This also causes the initial bucket size to be small, so only systems
which benefit from large caches will get them.
Reviewed by: gallatin, jeff
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22393
markj [Fri, 22 Nov 2019 16:28:52 +0000 (16:28 +0000)]
Fix locking in vm_reserv_reclaim_contig().
We were not properly handling the case where the trylock of the
reservaton fails, in which case we could leak reservation lock.
Introduce a marker reservation to implement precise scanning in
vm_reserv_reclaim_contig(). Before, a race could result in early
termination of the scan in rare situations. Use the marker's lock to
serialize scans of the partpop queue so that a global marker structure
can be used. Modify vm_reserv_reclaim_inactive() to handle the presence
of a marker while minimizing the hold time of domain-global locks.
Reviewed by: alc, jeff, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22392
mav [Fri, 22 Nov 2019 15:41:47 +0000 (15:41 +0000)]
Fix off-by-one error in HPA/AMA maximum reporting.
Before my refactoring the code reported value as maximum number of sectors,
adding one to the maximum sector number returned by respective command.
While this difference is somewhat confusing, restore previous behavior.
jhibbits [Fri, 22 Nov 2019 04:34:46 +0000 (04:34 +0000)]
powerpc/ptrace: Give ptrace(2) access to SPE registers when available
SPE registers are already exported in core dumps with the VMX note, so use
the same interface for live access.
Instead of simply guarding out in #ifndef __SPE__ the cpu_feature check, I
chose to keep the check and check against PPC_FEATURE_SPE, on the off-chance
someone decides to run a SPE kernel on a non-SPE device (which is possible,
though highly unlikely, and would be no different from running a MPC85XX
kernel in that instance).
rmacklem [Fri, 22 Nov 2019 00:22:55 +0000 (00:22 +0000)]
Fix the pNFS server's reporting of SpaceUsed (va_bytes).
The pNFS server currently reports SpaceUsed (va_bytes) for the metadata
file. This in not correct, since the metadata file is always empty and,
as such, va_bytes is just the allocation for the empty file.
This patch adds va_bytes to the list of attributes acquired from the
DS for a file, so that it includes the allocated data size and is updated
when the file is written.
For files created on a pNFS server before this patch is applied, the
va_bytes value is estimated by rounding va_size up to a multiple of
BLKDEV_IOSIZE. Once the file is written after this patch has been
applied to the metadata server, the va_bytes returned for the file
will be correct.
This patch only affects a pNFS metadata server.
Found during testing of the NFSv4.2 pNFS server for the Allocate operation.
(Not yet in head/current.)
dim [Thu, 21 Nov 2019 20:36:46 +0000 (20:36 +0000)]
Merge commit a751f557d from llvm git (by Simon Atanasyan):
[mips] Set macros for Octeon+ CPU
This is one of the upstream changes needed for adding support for the
OCTEON+ CPU type, so that we can test Clang builds using the most
commonly available FreeBSD/mips64 reference platform, the Edge Router
Lite.
dim [Thu, 21 Nov 2019 20:35:53 +0000 (20:35 +0000)]
Merge commit 0d14656b9 from llvm git (by Simon Atanasyan):
[mips] Set __OCTEON__ macros
This is one of the upstream changes needed for adding support for the
OCTEON+ CPU type, so that we can test Clang builds using the most
commonly available FreeBSD/mips64 reference platform, the Edge Router
Lite.
dim [Thu, 21 Nov 2019 20:32:34 +0000 (20:32 +0000)]
Merge commit e578d0fd2 from llvm git (by Simon Atanasyan):
[mips] Fix `__mips_isa_rev` macros value for Octeon CPU
This is one of the upstream changes needed for adding support for the
OCTEON+ CPU type, so that we can test Clang builds using the most
commonly available FreeBSD/mips64 reference platform, the Edge Router
Lite.
dim [Thu, 21 Nov 2019 20:26:34 +0000 (20:26 +0000)]
Merge commit 3552d3e0f from llvm git (by Simon Atanasyan):
[mips] Add `octeon+` to the list of CPUs accepted by the driver
This is one of the upstream changes needed for adding support for the
OCTEON+ CPU type, so that we can test Clang builds using the most
commonly available FreeBSD/mips64 reference platform, the Edge Router
Lite.
This is one of the upstream changes needed for adding support for the
OCTEON+ CPU type, so that we can test Clang builds using the most
commonly available FreeBSD/mips64 reference platform, the Edge Router
Lite.
This is one of the upstream changes needed for adding support for the
OCTEON+ CPU type, so that we can test Clang builds using the most
commonly available FreeBSD/mips64 reference platform, the Edge Router
Lite.
dim [Thu, 21 Nov 2019 20:22:07 +0000 (20:22 +0000)]
Merge commit 7bed381ea from llvm git (by Simon Atanasyan):
[mips] Implement Octeon+ `saa` and `saad` instructions
`saa` and `saad` are 32-bit and 64-bit store atomic add instructions.
memory[base] = memory[base] + rt
These instructions are available for "Octeon+" CPU. The patch adds
support for both instructions to MIPS assembler and diassembler and
introduces new CPU type - "octeon+".
Next patches will implement `.set arch=octeon+` directive and
`AFL_EXT_OCTEONP` ISA extension flag support.
This is one of the upstream changes needed for adding support for the
OCTEON+ CPU type, so that we can test Clang builds using the most
commonly available FreeBSD/mips64 reference platform, the Edge Router
Lite.
erj [Thu, 21 Nov 2019 19:57:56 +0000 (19:57 +0000)]
bitstring: add functions to find contiguous set/unset bit sequences
Add bit_ffs_area_at and bit_ffc_area_at functions for searching a bit
string for a sequence of contiguous set or unset bits of at least the
specified size.
The bit_ffc_area function will be used by the Intel ice driver for
implementing resource assignment logic using a bitstring to represent
whether or not a given index has been assigned or is currently free.
The bit_ffs_area, bit_ffc_area_at and bit_ffs_area_at functions are
implemented for completeness.
I'd like to add further test cases for the new functions, but I'm not
really sure how to add them easily. The new functions depend on specific
sequences of bits being set, while the bitstring tests appear to run for
varying bit sizes.
erj [Thu, 21 Nov 2019 19:36:11 +0000 (19:36 +0000)]
bitstring: exit early if _start is past size of the bitstring
bit_ffs_at and bit_ffc_at both take _start parameters which indicate to
start searching from _start onwards.
If the given _start index is past the size of the bit string, these
functions will calculate an address of the current bitstring which is
after the expected size. The function will also dereference the memory,
resulting in a read buffer overflow.
The output of the function remains correct, because the tests ensure to
stop the loop if the current bitstring chunk passes the stop bitstring
chunk, and because of a check to ensure the reported _value is never
past _nbits.
However, if <sys/bitstring.h> is ever used in code which is checked by
-fsanitize=undefined, or similar static analysis, it can produce
warnings about reading past the buffer size.
Because of the above mentioned checks, these buffer overflows do not
occur as long as _start is less than _nbits. Additionally, by definition
bit_ffs_at and bif_ffc_at should set _result to -1 in any case where the
_start is after the _nbits.
Check for this case at the start of the function and exit early if so,
preventing the buffer read overflow, and reducing the amount of
computation that occurs.
Note that it may seem odd to ever have code that could call bit_ffc_at
or bit_ffs_at with a _start value greater than _nbits. However, consider
a for-loop that used bit_ffs and bit_ffs_at to loop over a bit string
and perform some operation on each bit that was set. If the last bit of
the bit string was set, the simplest loop implementation would call
bit_ffs_at with a start of _nbits, and expect that to return -1. While
it does infact perform correctly, this is what ultimately triggers the
unexpected buffer read overflow.
jhb [Thu, 21 Nov 2019 19:30:31 +0000 (19:30 +0000)]
NIC KTLS for Chelsio T6 adapters.
This adds support for ifnet (NIC) KTLS using Chelsio T6 adapters.
Unlike the TOE-based KTLS in r353328, NIC TLS works with non-TOE
connections.
NIC KTLS on T6 is not able to use the normal TSO (LSO) path to segment
the encrypted TLS frames output by the crypto engine. Instead, the
TOE is placed into a special setup to permit "dummy" connections to be
associated with regular sockets using KTLS. This permits using the
TOE to segment the encrypted TLS records. However, this approach does
have some limitations:
1) Regular TOE sockets cannot be used when the TOE is in this special
mode. One can use either TOE and TOE-based KTLS or NIC KTLS, but
not both at the same time.
2) In NIC KTLS mode, the TOE is only able to accept a per-connection
timestamp offset that varies in the upper 4 bits. Put another way,
only connections whose timestamp offset has the 28 lower bits
cleared can use NIC KTLS and generate correct timestamps. The
driver will refuse to enable NIC KTLS on connections with a
timestamp offset with any of the lower 28 bits set. To use NIC
KTLS, users can either disable TCP timestamps by setting the
net.inet.tcp.rfc1323 sysctl to 0, or apply a local patch to the
tcp_new_ts_offset() function to clear the lower 28 bits of the
generated offset.
3) Because the TCP segmentation relies on fields mirrored in a TCB in
the TOE, not all fields in a TCP packet can be sent in the TCP
segments generated from a TLS record. Specifically, for packets
containing TCP options other than timestamps, the driver will
inject an "empty" TCP packet holding the requested options (e.g. a
SACK scoreboard) along with the segments from the TLS record.
These empty TCP packets are counted by the
dev.cc.N.txq.M.kern_tls_options sysctls.
Unlike TOE TLS which is able to buffer encrypted TLS records in
on-card memory to handle retransmits, NIC KTLS must re-encrypt TLS
records for retransmit requests as well as non-retransmit requests
that do not include the start of a TLS record but do include the
trailer. The T6 NIC KTLS code tries to optimize some of the cases for
requests to transmit partial TLS records. In particular it attempts
to minimize sending "waste" bytes that have to be given as input to
the crypto engine but are not needed on the wire to satisfy mbufs sent
from the TCP stack down to the driver.
TCP packets for TLS requests are broken down into the following
classes (with associated counters):
- Mbufs that send an entire TLS record in full do not have any waste
bytes (dev.cc.N.txq.M.kern_tls_full).
- Mbufs that send a short TLS record that ends before the end of the
trailer (dev.cc.N.txq.M.kern_tls_short). For sockets using AES-CBC,
the encryption must always start at the beginning, so if the mbuf
starts at an offset into the TLS record, the offset bytes will be
"waste" bytes. For sockets using AES-GCM, the encryption can start
at the 16 byte block before the starting offset capping the waste at
15 bytes.
- Mbufs that send a partial TLS record that has a non-zero starting
offset but ends at the end of the trailer
(dev.cc.N.txq.M.kern_tls_partial). In order to compute the
authentication hash stored in the trailer, the entire TLS record
must be sent as input to the crypto engine, so the bytes before the
offset are always "waste" bytes.
In addition, other per-txq sysctls are provided:
- dev.cc.N.txq.M.kern_tls_cbc: Count of sockets sent via this txq
using AES-CBC.
- dev.cc.N.txq.M.kern_tls_gcm: Count of sockets sent via this txq
using AES-GCM.
- dev.cc.N.txq.M.kern_tls_fin: Count of empty FIN-only packets sent to
compensate for the TOE engine not being able to set FIN on the last
segment of a TLS record if the TLS record mbuf had FIN set.
- dev.cc.N.txq.M.kern_tls_records: Count of TLS records sent via this
txq including full, short, and partial records.
- dev.cc.N.txq.M.kern_tls_octets: Count of non-waste bytes (TLS header
and payload) sent for TLS record requests.
- dev.cc.N.txq.M.kern_tls_waste: Count of waste bytes sent for TLS
record requests.
To enable NIC KTLS with T6, set the following tunables prior to
loading the cxgbe(4) driver:
ian [Thu, 21 Nov 2019 19:13:05 +0000 (19:13 +0000)]
Rewrite iicdev_writeto() to use a single buffer and a single iic_msg, rather
than effectively doing scatter/gather IO with a pair of iic_msgs that direct
the controller to do a single transfer with no bus STOP/START between the
two buffers. It turns out we have multiple i2c hardware drivers that don't
honor the NOSTOP and NOSTART flags; sometimes they just try to do the
transfers anyway, creating confusing failures or leading to corrupted data.
jhb [Thu, 21 Nov 2019 18:14:26 +0000 (18:14 +0000)]
Add a kmod.opts.mk.
This Makefile sets KERN_OPTS. This permits kernel module Makefiles to
use KERN_OPTS to control the value of variables such as SRCS that are
used by bsd.kmod.mk for KERN_OPTS values that honor WITH/WITHOUT
options for standalone builds.
zeising [Thu, 21 Nov 2019 15:38:27 +0000 (15:38 +0000)]
ObsoleteFiles.inc: add sio(4) leftovers
Add the manual page for sio(4) to ObsoleteFiles.inc, so that make delete-all
will remove it. The manual page was removed together with sio(4) in
r354929.
kevans [Thu, 21 Nov 2019 14:01:44 +0000 (14:01 +0000)]
bcm2835_sdhci: only inspect interrupts we handle
We'll write the value we read back to ack pending interrupts, but we should
at least make it clear to ourselves that we only want to ack pending
transfer interrupts.
https://www.illumos.org/issues/10592
This is a collection of recent fixes from ZoL: 8eef997679b Error path in metaslab_load_impl() forgets to drop ms_sync_lock 928e8ad47d3 Introduce auxiliary metaslab histograms 425d3237ee8 Get rid of space_map_update() for ms_synced_length 6c926f426a2 Simplify log vdev removal code 21e7cf5da89 zdb -L should skip leak detection altogether df72b8bebe0 Rename range_tree_verify to range_tree_verify_not_present 75058f33034 Remove unused vdev_t fields
andrew [Thu, 21 Nov 2019 11:22:08 +0000 (11:22 +0000)]
Port the NetBSD KCSAN runtime to FreeBSD.
Update the NetBSD Kernel Concurrency Sanitizer (KCSAN) runtime to work in
the FreeBSD kernel. It is a useful tool for finding data races between
threads executing on different CPUs.
This can be enabled by enabling KCSAN in the kernel config, or by using the
GENERIC-KCSAN amd64 kernel. It works on amd64 and arm64, however the later
needs a compiler change to allow -fsanitize=thread that KCSAN uses.
10601 Pool allocation classes
https://www.illumos.org/issues/10601
illumos port of ZoL Pool allocation classes. Includes at least these two
commits: 441709695 Pool allocation classes misplacing small file blocks cc99f275a Pool allocation classes
10757 Add -gLp to zpool subcommands for alt vdev names
https://www.illumos.org/issues/10757
Port from ZoL of d2f3e292d Add -gLp to zpool subcommands for alt vdev names
Note that a subsequent ZoL commit changed -p to -P a77f29f93 Change full path subcommand flag from -p to -P
Portions contributed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Portions contributed by: Håkan Johansson <f96hajo@chalmers.se>
Portions contributed by: Richard Yao <ryao@gentoo.org>
Portions contributed by: Chunwei Chen <david.chen@nutanix.com>
Portions contributed by: loli10K <ezomori.nozomu@gmail.com>
Author: Don Brady <don.brady@delphix.com>
11541 allocation_classes feature must be enabled to add log device
https://www.illumos.org/issues/11541
After the allocation_classes feature was integrated, one can no longer add a
log device to a pool unless that feature is enabled. There is an explicit check
for this, but it is unnecessary in the case of log devices, so we should handle
this better instead of forcing the feature to be enabled.
Author: Jerry Jelinek <jerry.jelinek@joyent.com>
FreeBSD notes.
I faithfully added the new -g, -L, -P flags, but only -g does something:
vdev GUIDs are displayed instead of device names. -L, resolve symlinks,
and -P, display full disk paths, do nothing at the moment.
The use of special vdevs is backward compatible for read-only access, so
root pools should be bootable, but exercise caution.
emaste [Thu, 21 Nov 2019 03:10:02 +0000 (03:10 +0000)]
mark arm.arm (v4/v5) kernels as NO_UNIVERSE for now
r354290 removed arm.arm from universe, but arm.arm kernels were still
found and built during the kernel stage. I'm not aware of a better way
to address this at the moment, but since there aren't many arm.arm
kernels anyhow just add an explicit NO_UNIVERSE to them.
kevans [Thu, 21 Nov 2019 02:49:41 +0000 (02:49 +0000)]
bcm2835_sdhci: clean up DMA segments in error handling path
Later parts assume that this would've been done if interrupts are enabled,
but this is the only case in which that wouldn't have been true. This commit
also reorders operations such that we're done touching slot/slot->intmask
before we call back into the SDHCI framework and exit.
kevans [Thu, 21 Nov 2019 02:47:55 +0000 (02:47 +0000)]
bcm2835_sdhci: roll back r354823
r354823 kicked DATA_END handling out of the DMA interrupt path "to make
things easy", but this was likely a mistake -- if we know we're done after
we've finished pending DMA operations, we should go ahead and acknowledge
it rather than waiting for the controller to finalize it. If it's not ready,
we'll simply re-enable interrupts and wait for it anyways, to be re-entered
in sdhci_data_intr.
kevans [Thu, 21 Nov 2019 02:41:22 +0000 (02:41 +0000)]
bcm2835_sdhci: clean up DMA segments in error handling path
Later parts assume that this would've been done if interrupts are enabled,
but this is the only case in which that wouldn't have been true. This commit
also reorders operations such that we're done touching slot/slot->intmask
before we call back into the SDHCI framework and exit.
imp [Wed, 20 Nov 2019 23:58:36 +0000 (23:58 +0000)]
Add --esp/-E argument to print the currently booted ESP
Add code to decode the BootCurrent and BootXXXX variable it points at
to deduce the ESP used to boot the system. By default, it prints the
path to that device. With --unix-path (-p) it will instead print the
current mount point for the ESP, if any (or an error). With
--device-path (-d) it wil print the UEFI device path for the ESP.
Note: This is the best guess based on the UEFI variables. If the ESP
is part of a gmirror, etc, that won't be reported. If by some weird
chance there was a complicated series of chain boots, this may not be
what you want. For setups that don't add layers on top of the raw
devices, it is accurate.
imp [Wed, 20 Nov 2019 23:45:31 +0000 (23:45 +0000)]
Create /etc/os-release file.
Each boot, regenerate /var/run/os-release based on the currently running
system. Create a /etc/os-release symlink pointing to this file (so that this
doesn't create a new reason /etc can not be mounted read-only).
This is compatible with what other systems do and is what the sysutil/os-release
port attempted to do, but in an incomplete way. Linux, Solaris and DragonFly all
implement this natively as well. The complete standard can be found at
https://www.freedesktop.org/software/systemd/man/os-release.html
Moving this to the base solves both the non-standard location problem with the
port, as well as the lack of update of this file on system update.
imp [Wed, 20 Nov 2019 21:06:29 +0000 (21:06 +0000)]
Standardize EFI's ESP mount point.
Mount the UEFI ESP on /boot/efi. No current system uses this by default, but
there are many ad-hoc schemes that do this in /efi or /esp or /uefi and adding a
new directory at the top-level would have a much higher likelihood of
collision. Document this in /etc/mtree/BSD.root.mtree and create EFIDIR and
related variables in bsd.own.mk.
brooks [Wed, 20 Nov 2019 18:36:58 +0000 (18:36 +0000)]
Make the warning for deprecated NO_ variables an error.
Support for NO_CTF, NO_DEBUG_FILES, NO_INSTALLLIB, NO_MAN, NO_PROFILE,
and NO_WARNS as deprecated in 2014 with a warning added for each one
found. Turn these into error in preperation for removal of compatability
support before FreeBSD 13.
dim [Wed, 20 Nov 2019 18:12:01 +0000 (18:12 +0000)]
Add explanatory comments for the different SRCS_xxx variables used in
the Makefiles for libllvm and libclang. While here, cleanup a commented
out SRCS entry in libllvmminimal's Makefile.
emaste [Wed, 20 Nov 2019 17:57:46 +0000 (17:57 +0000)]
src.conf.5: regen for several recent changes
r354289 armv6: Switch to LLD by default
r354290 Take arm.arm (armv5) out of universe
r354348 armv6, armv7: Switch to llvm-libunwind by default
r354660 Enable the RISC-V LLVM backend by default.
emaste [Wed, 20 Nov 2019 17:37:45 +0000 (17:37 +0000)]
disable amd(8) by default
As of FreeBSD 10.1 the autofs(5) is available for automounting, and the
amd man page has indicated that the in-tree copy of amd is obsolete.
Disable it by default for now, with the expectation that it will be
removed before FreeBSD 13.0.
Reviewed by: kevans
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22460
emaste [Wed, 20 Nov 2019 16:30:37 +0000 (16:30 +0000)]
sshd: make getpwclass wrapper MON_ISAUTH not MON_AUTH
In r339216 a privsep wrapper was added for login_getpwclass to address
PR 231172. Unfortunately the change used the MON_AUTH flag in the
wrapper, and MON_AUTH includes MON_AUTHDECIDE which triggers an
auth_log() on each invocation. getpwclass() does not participate in the
authentication decision, so should be MON_ISAUTH instead.
PR: 234793
Submitted by: Henry Hu
Reviewed by: Yuichiro NAITO
MFC after: 1 week
dougm [Wed, 20 Nov 2019 16:06:48 +0000 (16:06 +0000)]
Instead of looking up a predecessor or successor to the current map
entry, when that entry has been seen already, keep the
already-looked-up value in a variable and use that instead of looking
it up again.
andrew [Wed, 20 Nov 2019 14:37:48 +0000 (14:37 +0000)]
Import the NetBSD Kernel Concurrency Sanitizer (KCSAN) runtime.
KCSAN is a tool to find concurrent memory access that may race each other.
After a determined number of memory accesses a cell is created, this
describes the current access. It will then delay for a short period
to allow other CPUs a chance to race. If another CPU performs a memory
access to an overlapping region during this delay the race is reported.
This is a straight import of the NetBSD code, it will be adapted to
FreeBSD in a future commit.
mjg [Wed, 20 Nov 2019 12:05:59 +0000 (12:05 +0000)]
vfs: change si_usecount management to count used vnodes
Currently si_usecount is effectively a sum of usecounts from all associated
vnodes. This is maintained by special-casing for VCHR every time usecount is
modified. Apart from complicating the code a little bit, it has a scalability
impact since it forces a read from a cacheline shared with said count.
There are no consumers of the feature in the ports tree. In head there are only
2: revoke and devfs_close. Both can get away with a weaker requirement than the
exact usecount, namely just the count of active vnodes. Changing the meaning to
the latter means we only need to modify it on 0<->1 transitions, avoiding the
check plenty of times (and entirely in something like vrefact).
Reviewed by: kib, jeff
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22202
kib [Wed, 20 Nov 2019 11:12:19 +0000 (11:12 +0000)]
amd64: in double fault handler, do not rely on sane gsbase value.
Typical reasons for doublefault faults are either kernel stack
overflow or bugs in the code that manipulates protection CPU state.
The later code is the code which often has to set up gsbase for
kernel. Switching to explicit load of GSBASE MSR in the fault handler
makes it more probable to output a useful information.
Now all IST handlers have nmi_pcpu structure on top of their stacks.
It would be even more useful to save gsbase value at the moment of the
fault. I did not this because I do not want to modify PCB layout now.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week