kib [Sat, 7 Dec 2019 00:28:08 +0000 (00:28 +0000)]
x86: Restore the critical section around whole ipi_bitmap_handler() if
hardclock IPI is delivered.
In the current code after r355311, critical section is taken only
around hardclockintr() call, and sched_preempt() is called after the
section is exited. If we reschedule after exit, as we typically would
due to conditions that caused IPI, in ULE the runq tdq_ipipending is
not cleared, which blocks generation of further preempt IPIs.
Since all relatively modern (10 years) hardware has per-cpu event
timers, restoring the critical section conditionally does not affect
it.
Reported and tested by: cy
Diagnosed and reviewed by: jeff (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D22716
brooks [Fri, 6 Dec 2019 23:59:23 +0000 (23:59 +0000)]
sysent: Reduce duplication and improve readability.
Use the power of variables to avoid spelling out source and generated
files too many times. The previous Makefiles were hard to read, hard to
edit, and badly formatted.
rmacklem [Fri, 6 Dec 2019 23:51:11 +0000 (23:51 +0000)]
Add a couple of definitions for NFSv4.2 and update macros to use them.
This patch adds code to macros to clear attribute bits not supported
by NFSv4.2. For now, these bits are never set anyhow, but this prepares
the code for the addition of NFSv4.2 support in a future commit.
There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
markj [Fri, 6 Dec 2019 23:39:38 +0000 (23:39 +0000)]
Fix tail -f in capability mode.
We were not adding CAP_EVENT to input file capabilities, so kevent()
always failed with ENOTCAPABLE. tail implements a fallback mode to
poll the file in this case, so the failure was not apparent.
Reviewed by: emaste
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22709
markj [Fri, 6 Dec 2019 23:39:08 +0000 (23:39 +0000)]
Fix fault_type handling in vm_map_lookup().
Suppose that the map entry is wired, so that we later assign
fault_type = entry->protection. Suppose further that we jump back to
RetryLookup. Then fault_type will no longer contain the original
fault protection mask, but instead that of the wired entry.
kevans [Fri, 6 Dec 2019 22:45:36 +0000 (22:45 +0000)]
makesyscalls.lua: improve config processing
The current version will strip out #include directives appearing inside strings, which is clearly wrong. Improve the processing entirely in the following ways:
- Strip only whole-line comments on every single iteration
- Abort if we see a malformed line that doesn't match the key=value format
- For quoted (backtick or double quote) strings, we'll advance to the end of
the key=value pair and make sure there's not extra stuff left over
- For unquoted key=value pairs, we'll strip any trailing comments and verify
there's no internal whitespace
This has revealed the caveat that key/value pairs can't even include escaped quotes (and haven't been able to). I don't know if this is actually problematic, as we're usually looking at cases like "#include <foo>" or raw identifiers.
Reviewed and noticed by: brooks
Differential Revision: https://reviews.freebsd.org/D22698
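The rules listed above can be sketched in C (the script itself is Lua). This is only an illustration: parse_config_line(), the CFG_* return codes and the buffer handling are hypothetical, and rejecting anything other than whitespace after a closing quote is an assumption about what "extra stuff" means.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical return codes; not taken from makesyscalls.lua. */
enum { CFG_BAD = -1, CFG_SKIP = 0, CFG_OK = 1 };

/* Parse one "key=value" line into the caller's key and val buffers. */
static int
parse_config_line(const char *line, char *key, size_t keysz,
    char *val, size_t valsz)
{
    const char *p = line, *eq, *end, *src, *q;
    size_t klen, vlen, i;

    while (*p == ' ' || *p == '\t')
        p++;
    /* Blank lines and whole-line comments are skipped. */
    if (*p == '\0' || *p == '\n' || *p == '#')
        return (CFG_SKIP);

    /* Anything that is not key=value is malformed. */
    eq = strchr(p, '=');
    if (eq == NULL || eq == p)
        return (CFG_BAD);
    klen = (size_t)(eq - p);
    if (klen >= keysz)
        return (CFG_BAD);
    for (i = 0; i < klen; i++)
        if (isspace((unsigned char)p[i]))
            return (CFG_BAD);
    memcpy(key, p, klen);
    key[klen] = '\0';

    p = eq + 1;
    if (*p == '"' || *p == '`') {
        /* Quoted: take everything up to the matching quote, '#' included. */
        end = strchr(p + 1, *p);
        if (end == NULL)
            return (CFG_BAD);
        src = p + 1;
        vlen = (size_t)(end - src);
        /* Assumption: only whitespace may follow the closing quote. */
        for (q = end + 1; *q != '\0'; q++)
            if (!isspace((unsigned char)*q))
                return (CFG_BAD);
    } else {
        /* Unquoted: cut a trailing comment, then trim trailing blanks. */
        src = p;
        vlen = strcspn(p, "#\n");
        while (vlen > 0 && (p[vlen - 1] == ' ' || p[vlen - 1] == '\t'))
            vlen--;
        /* Internal whitespace is not allowed in an unquoted value. */
        for (i = 0; i < vlen; i++)
            if (isspace((unsigned char)p[i]))
                return (CFG_BAD);
    }
    if (vlen >= valsz)
        return (CFG_BAD);
    memcpy(val, src, vlen);
    val[vlen] = '\0';
    return (CFG_OK);
}
```

With this sketch a quoted value such as "#include <foo>" survives intact, while the same text unquoted would be cut at the '#'.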
ian [Fri, 6 Dec 2019 22:32:06 +0000 (22:32 +0000)]
Implement bus_rescan for gpiobus(4). This allows on-the-fly reconfiguration
of gpio devices by using kenv to add hints for a new device and then running
'devctl rescan gpiobus4' to make the new device(s) attach.
It's not particularly easy to detect whether the 'at' hint has been deleted
for a child device that's currently attached, so this doesn't handle that.
But the user can use devctl commands to manually detach an existing device.
imp [Fri, 6 Dec 2019 22:12:39 +0000 (22:12 +0000)]
trackers always know what qpair they are on
Don't needlessly pass around qpair pointers when the tracker knows what
qpair it's on. This will simplify code and make it easier to split
submission and completion queues in the future.
kevans [Fri, 6 Dec 2019 19:33:39 +0000 (19:33 +0000)]
libbe: fix build against sysutils/openzfs, part 1
This is the half of the changes required that work as-is with both in-tree
ZFS and the new hotness, sysutils/openzfs. Highlights are less dependency
on header pollution (from somewhere) and using 'mnttab' instead of
'extmnttab'. In the in-tree ZFS, the latter is a #define for the former,
but in the port extmnttab is actually a distinct struct that's a super-set
of mnttab. We really want mnttab here anyways, so just use it.
jhb [Fri, 6 Dec 2019 19:20:45 +0000 (19:20 +0000)]
Remove SPARE_USRSPACE.
This constant was used to reserve space at the top of the stack to
hold translated system call arguments for non-default ABIs (the
"stackgap"). However, none of the compatibility ABIs have used the
stackgap in many years and the last use of SPARE_USRSPACE was removed
in r355373.
Update the comment related to SIIT and v4mapped addresses being rejected
by us when coming from the wire given we have supported IPv6-only kernels
for a few years now.
See also draft-itojun-v6ops-v4mapped-harmful.
mav [Fri, 6 Dec 2019 16:48:36 +0000 (16:48 +0000)]
Remove some branching from GEOM_DISK hot path.
pp->private simply cannot be NULL in those places.
In g_disk_start() and g_disk_ioctl() both dp != NULL and !dp->d_destroyed
should always be true if disk_gone() and disk_destroy() are used properly,
since GEOM does not send requests to errored providers. If the protocol is
not followed, then no amount of additional checks here give real safety.
In g_disk_access() though the checks are useful, since GEOM blocks only
new opens for errored providers, but allows closes. It should not happen
if disk_gone() and disk_destroy() are used properly, but may otherwise.
To improve cases when disk_gone() is not used, call it from disk_destroy().
It does not give full guarantees, but it errors the provider and makes
GEOM block unwanted requests at least after some race.
In ip6_input() we apply the same v4mapped address check twice. The only
case which skips the first one is M_FASTFWD_OURS which should have passed
the check on the first input pass and passed the firewall.
Remove the 2nd redundant check.
Two changes to EPOCH_TRACE:
(1) add a sysctl to suppress the backtrace from epoch_trace_report().
Sometimes the log line for the recursion is enough and the
backtrace massively spams the console.
(2) In order to be able to go without the backtrace do not only
print where the previous occurrence happened, but also where
the current one happens. That way we have file:line information
for both and can look at them without the need for getting line
numbers from backtrace and a debugging tool.
hselasky [Fri, 6 Dec 2019 15:36:32 +0000 (15:36 +0000)]
Implement hardware TLS via send tags for mlx5en(4), which is supported by
ConnectX-6 DX.
Currently TLS v1.2 and v1.3 with AES 128/256 crypto over TCP/IP (v4
and v6) are supported.
A per PCI device UMA zone is used to manage the memory of the send
tags. To optimize performance some crypto contexts may be cached by
the UMA zone, until the UMA zone finishes the memory of the given send
tag.
An asynchronous task is used to manage setup of the send tags towards the
firmware, most importantly setting the AES 128/256 bit pre-shared keys
for the crypto context.
Updating the state of the AES crypto engine and encrypting data are
both done in the fast path. Each send tag tracks the TCP sequence
number in order to detect non-contiguous blocks of data, which may
require a dump of prior unencrypted data, to restore the crypto state
prior to wire transmission.
Statistics counters have been added to count the amount of TLS data
transmitted in total, and the amount of TLS data which has been dumped
prior to transmission. When non-contiguous TCP sequence numbers are
detected, the software needs to dump the beginning of the current TLS
record up until the point of retransmission. All TLS counters utilize
the counter(9) API.
In order to enable hardware TLS offload the following sysctls must be set:
kern.ipc.mb_use_ext_pgs=1
kern.ipc.tls.ifnet.permitted=1
kern.ipc.tls.enable=1
ian [Fri, 6 Dec 2019 03:48:35 +0000 (03:48 +0000)]
Declare the global kernel symbols created by ldscript.arm in arm's machdep.h,
and remove a couple scattered local declarations.
Most of these aren't referenced in C code (there are some references in
asm code), and they also aren't documented anywhere. This helps a bit
with the latter.
mav [Fri, 6 Dec 2019 03:46:38 +0000 (03:46 +0000)]
Block ioctls for dying GEOM_DEV instances.
For normal I/Os consumer and provider statuses are checked by g_io_check().
But ioctl calls often do not go through it, being dispatched directly. This
change makes their semantics more alike, protecting lower levels.
scottl [Fri, 6 Dec 2019 02:43:05 +0000 (02:43 +0000)]
Move the mds, irbs, and ssb mitigation knobs into machdep.mitigations.
They're in both the old and new places in HEAD for the moment for
discussion and transition. The old locations will be garbage collected
in 4 weeks. MFCs to 12 and 11 will keep the old and new for transition
purposes.
rmacklem [Fri, 6 Dec 2019 01:53:02 +0000 (01:53 +0000)]
Add some definitions for NFSv4.2 which will be used by subsequent commits.
This is a preliminary commit of NFSv4.2 definitions that will be used by
subsequent commits which add NFSv4.2 support to the NFS client and server.
There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
asomers [Fri, 6 Dec 2019 00:12:14 +0000 (00:12 +0000)]
gmultipath: add ATF tests
Add ATF tests for most gmultipath operations. Add some dtrace probes too,
primarily for configuration changes that happen in response to provider
errors.
asomers [Fri, 6 Dec 2019 00:06:05 +0000 (00:06 +0000)]
ses: sanitize illegal strings in SES element descriptors
The SES4r3 standard requires that element descriptors may only contain ASCII
characters in the range 0x20 to 0x7e. Some SuperMicro expanders violate
that rule. This patch adds a sanity check to ses(4). Descriptors in
violation will be replaced by "<invalid>".
This patch fixes "sesutil --libxo xml" on such systems. Previously it would
generate non-well-formed XML output.
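A minimal sketch of the sanity check described above, not the actual ses(4) code: desc_is_valid() and desc_sanitize() are hypothetical names, and only the 0x20 to 0x7e range and the "<invalid>" replacement come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SES_INVALID "<invalid>"    /* replacement text named in the commit */

/* Return 1 if every byte of desc[0..len) lies in 0x20..0x7e, else 0. */
static int
desc_is_valid(const unsigned char *desc, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (desc[i] < 0x20 || desc[i] > 0x7e)
            return (0);
    return (1);
}

/*
 * Copy the descriptor into out as a NUL-terminated string, substituting
 * "<invalid>" when any byte falls outside the permitted range.
 */
static void
desc_sanitize(const unsigned char *desc, size_t len, char *out, size_t outsz)
{
    const char *src = (const char *)desc;

    assert(outsz > 0);
    if (!desc_is_valid(desc, len)) {
        src = SES_INVALID;
        len = strlen(SES_INVALID);
    }
    if (len >= outsz)
        len = outsz - 1;    /* truncate rather than overflow */
    memcpy(out, src, len);
    out[len] = '\0';
}
```

Replacing the whole descriptor, rather than individual bytes, keeps the output printable without guessing at what the expander meant.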
jhb [Thu, 5 Dec 2019 19:37:30 +0000 (19:37 +0000)]
Add a new "riscv-relaxations" linker feature.
When the linker doesn't have this feature, add -mno-relax to CFLAGS
on RISC-V.
Define the feature for ld.bfd, but not lld. If lld gains relaxation
support in a newer version, we can enable it for those versions of lld
in bsd.linker.mk.
alc [Thu, 5 Dec 2019 19:25:49 +0000 (19:25 +0000)]
On a context switch, handle the possibility that the old thread was
preempted after an "ic" or "tlbi" instruction but before it performed a
"dsb" instruction. The "ic" and "tlbi" instructions have unusual
synchronization requirements. If the old thread migrates to a new
processor, its completion of a "dsb" instruction on that new processor does
not guarantee that the "ic" or "tlbi" instructions performed on the old
processor have completed.
This issue is not restricted to the kernel. Since locore.S sets the UCI bit
in SCTLR, user-space programs can perform "ic ivau" instructions (as well as
some forms of the "dc" instruction).
Coverity points out that we've already dereferenced m by the time we check, so
there's no reason to keep the check. Moreover, it's safe to pass NULL to
m_freem() anyway.
The recent rpi-firmware update renamed "0" to "zero" in the RPi0 DTB
filename.
It also included the components needed to boot the RPi4, so install those
now -- interested parties can install sysutils/u-boot-rpi4 and copy
config_rpi4.txt to config.txt on the FAT partition in order to boot the
board. Do note that we currently don't support ethernet/usb/pci.
kevans [Thu, 5 Dec 2019 15:32:33 +0000 (15:32 +0000)]
UPDATING: Add long-belated note about certs in base
While the interaction between this and the ETCSYMLINK option of
security/ca_root_nss isn't necessarily fatal, one should be aware and
attempt to understand the ramifications of mixing the two.
ports-secteam will be contacted to discuss the default option for branches
where certs are being included in base.
hselasky [Thu, 5 Dec 2019 15:16:19 +0000 (15:16 +0000)]
Add basic support for TCP/IP based hardware TLS offload to mlx5core.
The hardware offload is primarily targeted for TLS v1.2 and v1.3,
using AES 128/256 bit pre-shared keys. This patch adds all the needed
hardware structures, capabilities and firmware commands.
mjg [Thu, 5 Dec 2019 13:43:44 +0000 (13:43 +0000)]
sx: check for SX_LOCK_SHARED | SX_LOCK_WRITE_SPINNER when exclusive-locking
First, this removes a spurious difference compared to rw locks.
More importantly though this avoids a trip through sleepq code if the lock
happens to be caught in this state.
mav [Thu, 5 Dec 2019 04:03:08 +0000 (04:03 +0000)]
Switch GEOM_DEV from make_dev_p() to make_dev_s().
It closes the race condition and so allows removing a few NULL checks.
Also while there, use dev->si_drv1 in addition to cp->private to store
the softc pointer. For calls coming from the dev side it gives a reliable
cache hit instead of the frequent misses before.
loos [Thu, 5 Dec 2019 00:56:03 +0000 (00:56 +0000)]
Add the I2C driver for the Armada 37x0.
This controller is a bit tricky as the STOP condition must be indicated in
the last transferred byte; some devices will not like the repeated start
behavior of this controller. A proper fix to this issue is in the works.
This driver works in polling mode, can be used early in the boot (required
in some cases).
rmacklem [Wed, 4 Dec 2019 23:24:40 +0000 (23:24 +0000)]
Add some definitions for NFSv4.2 which will be used by subsequent commits.
This is a preliminary commit of NFSv4.2 definitions that will be used by
subsequent commits which add NFSv4.2 support to the NFS client and server.
There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
jhb [Wed, 4 Dec 2019 21:01:13 +0000 (21:01 +0000)]
Use "far" calls and branches so that lld uses valid relocations.
Conditional branch and jump instructions do not always call via PLT
stubs and thus will not honor LD_PRELOAD, etc. lld warns about using
non-preemptible relocations for preemptible or unknown symbols whereas
bfd does not (at least for RISC-V).
markj [Wed, 4 Dec 2019 19:46:48 +0000 (19:46 +0000)]
Fix an off-by-one error in vm_map_pmap_enter().
If the starting pindex is equal to object->size, there is nothing to do.
This was harmless since the rest of vm_map_pmap_enter() has no effect
when psize == 0.
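The boundary condition can be illustrated with a small stand-alone helper. clamp_psize() is a hypothetical name and this is not the kernel code; it only shows why the comparison should treat pindex == size as already past the end.

```c
#include <stdint.h>

/*
 * Given a starting page index and a requested page count, return how
 * many pages actually fall inside an object of object_size pages.
 * Valid indices are 0..object_size-1, so pindex == object_size is
 * already out of range; comparing with ">" instead of ">=" would be
 * the off-by-one.
 */
static uint64_t
clamp_psize(uint64_t pindex, uint64_t psize, uint64_t object_size)
{
    if (pindex >= object_size)
        return (0);         /* nothing to do */
    if (psize > object_size - pindex)
        psize = object_size - pindex;
    return (psize);
}
```

With the ">" form the pindex == object_size case would fall through with a count of 0, which matches the commit's note that the old behavior was harmless.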
andrew [Wed, 4 Dec 2019 18:40:05 +0000 (18:40 +0000)]
Fix the signature for zone_import and zone_release
These are cast to uma_import and uma_release functions. Use the signature
for these in the zone functions.
This was found with an experimental Kernel CFI. It will complain if the
signature is different than what a function pointer expects. The
simplest way to fix these is to correct the signature.
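As an illustration of the principle (not the UMA code itself), a callback can be defined with exactly the signature of the pointer type it is stored in, converting its argument inside the body instead of casting the function. All names below (import_fn, struct pool, pool_import, fill) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical callback type, loosely modeled on an import hook. */
typedef int (*import_fn)(void *arg, void **store, int count, int domain,
    int flags);

struct pool {
    int items[8];
    int used;
};

/*
 * Defined with the exact import_fn signature, so no function-pointer
 * cast is needed and an indirect-call type check would accept it.
 * The void * argument is converted inside the body.
 */
static int
pool_import(void *arg, void **store, int count, int domain, int flags)
{
    struct pool *p = arg;
    int i;

    (void)domain;
    (void)flags;
    for (i = 0; i < count && p->used < 8; i++)
        store[i] = &p->items[p->used++];
    return (i);     /* number of items handed out */
}

/* Caller that invokes the hook through the typed pointer. */
static int
fill(import_fn fn, void *arg, void **store, int count)
{
    return (fn(arg, store, count, 0, 0));
}
```

Calling a function through a pointer of an incompatible type is undefined behavior in C, which is why a CFI scheme that checks signatures flags the cast-based variant.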
dim [Wed, 4 Dec 2019 18:38:50 +0000 (18:38 +0000)]
Merge commit 241cbf201 from llvm git (by Nemanja Ivanovic):
[PowerPC] Fix crash in peephole optimization
When converting reg+reg shifts to reg+imm rotates, we neglect to
consider the CodeGenOnly versions of the 32-bit shift mnemonics. This
means we produce a rotate with missing operands which causes a crash.
Committing this fix without review since it is non-controversial that
the list of mnemonics to consider should include the 64-bit aliases
for the exact mnemonics.
Fixes PR44183.
This should fix "Assertion failed: (idx < size()), function operator[],
file /usr/src/contrib/llvm/include/llvm/ADT/SmallVector.h, line 153"
when building the graphics/mesa-dri port for the PowerPC64 ELFv2 ABI.
rlibby [Wed, 4 Dec 2019 18:21:29 +0000 (18:21 +0000)]
mbuf zones: take out the trash
The mbuf zones were explicitly specifying the uma trash procedures on
zcreate, conditionally on INVARIANTS, because that used to be necessary
in order to get use-after-free checking for uma zones with non-empty
constructors or destructors. After r355137 uma automatically invokes
the trash constructor and destructor as long as no init and fini are
specified. This now allows the mbuf zones to pass their constructors
and destructors without needing to add on the uma trash procedures
conditionally.
imp [Wed, 4 Dec 2019 16:56:11 +0000 (16:56 +0000)]
Regularize my copyright notice
o Remove All Rights Reserved from my notices
o imp@FreeBSD.org everywhere
o regularize punctuation, eliminate date ranges
o Make sure that it's clear that I don't claim All Rights reserved by listing
All Rights Reserved on same line as other copyright holders (but not
me). Other such holders are also listed last where it's clear.
Remove "All rights reserved" phrase from copyright notes.
With the ratification of the Berne Convention in 2000, it became obsolete.
I have removed that phrase and the "(c)" only from files without copyright
claims by other parties. There are 2 files (pci.c, pci_private.h) that are
also claimed by Michael Smith <msmith@freebsd.org> and by BSDi, which have
therefore not been included in this commit.
When all member nations of the Buenos Aires Convention adopted the Berne
Convention, the phrase "All rights reserved" became unnecessary to assert
copyright. Remove it from files under my copyright.
There are 2 files (pci.c, pci_private.h) that also bear msmith's
and BSDi's copyright. I have left them unchanged for now, since I do not
know whether they (or the legal successor in case of BSDi) would agree.
jhibbits [Wed, 4 Dec 2019 03:51:30 +0000 (03:51 +0000)]
powerpc/booke: Fix some formatting errors in debug printfs
Use the right formats for the types given (vm_offset_t and vm_size_t are
both uint32_t on 32-bit platforms, and uint64_t on 64-bit platforms, and
match size_t in size, so we can use the size_t format as we do in other
similar code).
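A user-space sketch of the same idea: when an address type has the same width as size_t on every target, the z length modifier gives one format string that is correct on both 32-bit and 64-bit builds. addr_t and format_range() are hypothetical stand-ins, not the kernel's types or functions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an address type that matches size_t in size. */
typedef size_t addr_t;

/*
 * Format an address and a length with the size_t modifiers. Using
 * "%x" or "%lx" instead would be wrong on one of the two word sizes.
 */
static int
format_range(char *buf, size_t bufsz, addr_t va, addr_t size)
{
    return (snprintf(buf, bufsz, "va=%#zx size=%zu",
        (size_t)va, (size_t)size));
}
```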
jhibbits [Wed, 4 Dec 2019 03:41:55 +0000 (03:41 +0000)]
powerpc/booke: Fix 32-bit Book-E SMP AP bringup
r354266 changed the type of bp_kernload to vm_paddr_t in platform_mpc85xx.c,
but not the variable itself in locore.S. This caused the AP to not come up,
due to overwriting the following variable (bp_virtaddr). Also, properly
load bp_kernload into MAS3 and MAS7. Prior to r354266, we required loading
into the low 4GB, but now we can load from anywhere in memory that ubldr can
access.
dougm [Wed, 4 Dec 2019 03:36:54 +0000 (03:36 +0000)]
Change the implementation of bit_ffc_area_at so that, in the worst
case, the number of operations spent on each b-bit word is
proportional to lg b rather than b.
For one word, shrink all regions of 0-bits by size-1 bit positions in
no more than O(lg(min(b,size))) operations. In what remains, the first
0-bit is either the start of an area of sufficient size contained
within the original word, or the start of an area that could spill
over into the next word, and prove to be of sufficient size once the
start of that word is examined.
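The single-word shrinking step can be sketched as follows. This is an illustration only: first_clear_area() is a hypothetical name, the word is fixed at 64 bits, bit i is taken to be (1 << i), and unlike the commit it only reports areas that fit entirely inside the word rather than ones that spill into the next word. __builtin_ctzll is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the lowest bit index at which `size` consecutive 0-bits start
 * inside `word`, or -1 if there is no such area within the word.
 */
static int
first_clear_area(uint64_t word, unsigned size)
{
    uint64_t y = word;
    unsigned covered = 0, step;

    assert(size >= 1 && size <= 64);
    /*
     * After the loop, bit i of y is the OR of bits i..i+size-1 of word,
     * so every run of 0-bits has shrunk by size-1 positions. The shift
     * roughly doubles each round, giving O(lg size) iterations.
     */
    while (covered < size - 1) {
        step = covered + 1;
        if (step > size - 1 - covered)
            step = size - 1 - covered;
        y |= y >> step;
        covered += step;
    }
    /* Areas starting in the top size-1 bits would leave the word. */
    if (size > 1)
        y |= ~UINT64_C(0) << (64 - (size - 1));
    if (~y == 0)
        return (-1);
    return (__builtin_ctzll(~y));   /* first surviving 0-bit */
}
```

For example, with bits 0, 1, 4 and 8 set (0x113), the first area of three clear bits starts at bit 5 and the first area of four starts at bit 9.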
jhb [Tue, 3 Dec 2019 23:20:19 +0000 (23:20 +0000)]
Pass 0 to __builtin_frame_address() to appease modern GCC.
Modern versions of GCC warn about passing non-zero values to
__builtin_frame_address(). Passing 1 was only a cosmetic change to remove
the db_trace_self() frame from the printed stack trace.
jhb [Tue, 3 Dec 2019 23:17:54 +0000 (23:17 +0000)]
Use uintptr_t instead of register_t * for the stack base.
- Use ustringp for the location of the argv and environment strings
and allow destp to travel further down the stack for the stackgap
and auxv regions.
- Update the Linux copyout_strings variants to move destp down the
stack as was done for the native ABIs in r263349.
- Stop allocating a space for a stack gap in the Linux ABIs. This
used to hold translated system call arguments, but hasn't been used
since r159992.
mckusick [Tue, 3 Dec 2019 23:07:09 +0000 (23:07 +0000)]
Currently the breadn_flags() and getblkx() interfaces are passed
the vnode, logical block number, and size of data block that is
being requested. They then use the VOP_BMAP function to calculate
the mapping from logical block number to physical block number from
which to access the data. This change expands the interface to also
pass the physical block number in cases where the VOP_BMAP function
may no longer work, for example when a file is being truncated.
No functional change.
Reviewed by: kib
Tested by: Peter Holm
Sponsored by: Netflix
manu [Tue, 3 Dec 2019 22:08:54 +0000 (22:08 +0000)]
cpufreq_dt: Do not attach the device if the cpu isn't present
If we boot with hw.ncpu=X (available on arm and arm64 at least) we
shouldn't attach the cpufreq driver as cf_set_method will try to get
the cpuid and it doesn't exist.
This solves cpufreq panicking on RockChip RK3399 when booting with
hw.ncpu=4
manu [Tue, 3 Dec 2019 21:00:45 +0000 (21:00 +0000)]
Remove "all rights reserved" from copyright for the files I own.
Some of the files have both me and Jared McNeill and he gave me
permission to remove it from his files too.
manu [Tue, 3 Dec 2019 19:18:32 +0000 (19:18 +0000)]
arm64: rockchip: rk3399: Remove the ability to put the PLL in normal mode at boot
RK3399 PLLs have three modes:
- Normal, where they behave normally and their freq is calculated based on
the register values.
- Slow, where the PLL freq is 24MHz (well, the external oscillator).
- Deep Slow, used for suspend where the freq is 32kHz.
We used to put every CPU related PLL in normal mode but it can cause problems
if the firmware didn't set up the clock registers correctly.
And even if it did but left the PLL in slow or deep slow mode, that might be
because the PMIC supplying voltage for the CPU hasn't been configured yet
and we cannot do that at this point.
So remove the ability to set PLLs to normal mode at boot to avoid any problems.
kevans [Tue, 3 Dec 2019 18:44:19 +0000 (18:44 +0000)]
lualoader: correct a typo from r354247
r354247 converted try_include to lfs + dofile with the loader.lua_path added
just before. Fortunately, there was a hardcoded /boot/lua fallback in case
loader.lua_path wasn't being set yet; I typo'd it as loader.lua_paths.
rlibby [Tue, 3 Dec 2019 17:43:57 +0000 (17:43 +0000)]
bitset: avoid pessimized code when bitset size is not constant
We have a couple optimizations for when the bitset is known to be just
one word. But with dynamically sized bitsets, it was actually more work
to determine the size than just to do the necessary computation. Now,
only use the optimization when the size is known to be constant.
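One way to express "only when the size is known to be constant" is with the GCC/Clang builtin __builtin_constant_p. The macros below are a hypothetical sketch with made-up names, not the definitions in the bitset headers.

```c
#include <stddef.h>

#define WORD_BITS       64
#define NWORDS(nbits)   (((nbits) + WORD_BITS - 1) / WORD_BITS)

/* True only when the compiler can prove at compile time that cond holds. */
#define CONST_COND(cond)    (__builtin_constant_p(cond) && (cond))

/*
 * Index of the word holding `bit`. The one-word shortcut is taken only
 * for compile-time-constant sizes; otherwise the division is done
 * directly instead of first computing the size at run time.
 */
#define WORD_INDEX(nbits, bit)                                  \
    (CONST_COND(NWORDS(nbits) == 1) ? (size_t)0 :               \
        (size_t)(bit) / WORD_BITS)

/* Same computation with a size that is only known at run time. */
static size_t
word_index_dynamic(size_t nbits, size_t bit)
{
    return (WORD_INDEX(nbits, bit));
}
```

Both paths give the same answer; the difference is only in how much work the generated code does when the size is not a constant.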
rlibby [Tue, 3 Dec 2019 17:43:52 +0000 (17:43 +0000)]
mips busdma: bzero map on alloc
Maps from the mips busdma dmamap_zone were not completely initialized.
In particular, pagesneeded and pagesreserved were not initialized. This
could cause a crash.
Remove some dead fields from mips struct bus_dmamap while here.
hselasky [Tue, 3 Dec 2019 08:46:59 +0000 (08:46 +0000)]
Use refcount from "in_joingroup_locked()" when joining multicast
groups. Do not acquire additional references. This makes the IPv4 IGMP
code in line with the IPv6 MLD code.
Background:
The IPv4 multicast code puts an extra reference on the in_multi struct
when joining groups. This becomes visible when using daemons like
igmpproxy from ports: multicast entries do not disappear from the
output of ifmcstat(8) when multicast streams are disconnected.
This fixes a regression issue after r349762.
While at it factor the ip_mfilter_insert() and ip6_mfilter_insert() calls
to avoid repeated "is_new" check.
kevans [Tue, 3 Dec 2019 02:30:52 +0000 (02:30 +0000)]
syscons.c: clang-format pass to reduce style inconsistencies
This was purely automatically massaged... some parts are still imperfect,
but this is close enough to make it more readable/easy to work on.
Unfortunately the vt/syscons/kdb situation slightly complicates changes to
tty locking, so some work will need to be done to remediate that.