Brandon Bergren [Thu, 12 Dec 2019 17:40:32 +0000 (17:40 +0000)]
rtld: do not try to mmap a zero-sized PT_LOAD
When a PT_LOAD segment has a zero p_filesz, skip the data mmap, as mmapping
zero bytes from a file is an error.
A PT_LOAD with zero p_filesz is legal (but somewhat uncommon due to segment
merging in modern linkers, as it is more efficient to merge .data and .bss
by just extending p_memsz in the previous segment, assuming compatible
page protection.)
This was seen on ports/graphics/glew on a powerpc64 ELFv2 experimental
build.
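The skip described above can be sketched as follows. This is a minimal model, not the rtld code: the Segment class and plan_mappings() are hypothetical names, and page alignment and protections are ignored.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for the PT_LOAD program header fields used here."""
    p_offset: int
    p_vaddr: int
    p_filesz: int
    p_memsz: int


def plan_mappings(segments):
    """Return (kind, vaddr, length) tuples for the mappings a loader would make."""
    plan = []
    for seg in segments:
        if seg.p_filesz > 0:
            # File-backed part: only attempted when there are bytes to map,
            # since a zero-length file mapping is an error.
            plan.append(("file", seg.p_vaddr, seg.p_filesz))
        if seg.p_memsz > seg.p_filesz:
            # The remainder (.bss-style) is zero-filled anonymous memory.
            plan.append(("anon", seg.p_vaddr + seg.p_filesz,
                         seg.p_memsz - seg.p_filesz))
    return plan
```

A segment with p_filesz == 0 then yields only the anonymous mapping.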
Brandon Bergren [Thu, 12 Dec 2019 16:49:55 +0000 (16:49 +0000)]
[PowerPC] Fix powerpc 32 bit build in mmu_oea64.c
Due to ppc32 building mmu_oea64.c (for use when in bridge mode on a G5), we
need to guard the new moea64_page_array_startup code behind __powerpc64__
to avoid a compile error, since vm_offset_t is not 64-bit on ppc32.
Follow RFC 4443 p2.2 and always use own addresses for reflected ICMPv6
datagrams.
Previously destination address from original datagram was used. That
looked confusing, especially in the traceroute6 output.
Also honor IPSTEALTH kernel option and do TTL/HLIM decrementing only
when stealth mode is disabled.
Reported by: Marco van Tol <marco at tols org>
Reviewed by: melifaro
MFC after: 2 weeks
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D22631
Conrad Meyer [Thu, 12 Dec 2019 04:44:09 +0000 (04:44 +0000)]
arm: libgcc_s: Fix ABI breakage introduced in r354347
Provide the symbol version for llvm-libunwind's _Unwind_Backtrace that libgcc
has historically provided on arm, in addition to the (default) standard version
used on all other architectures.
Mark Johnston [Thu, 12 Dec 2019 02:43:24 +0000 (02:43 +0000)]
Rename tdq_ipipending and clear it in sched_switch().
This fixes a regression after r355311. Specifically, sched_preempt()
may trigger a context switch by calling thread_lock(), since
thread_lock() calls critical_exit() in its slow path and the interrupted
thread may have already been marked for preemption. This would happen
before tdq_ipipending is cleared, blocking further preemption IPIs. The
CPU can be left in this state indefinitely if the interrupted thread
migrates.
Rename tdq_ipipending to tdq_owepreempt. Any switch satisfies a remote
preemption request, so clear tdq_owepreempt in sched_switch() instead of
sched_preempt() to avoid subtle problems of the sort described above.
Reviewed by: jeff, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22758
Ed Maste [Thu, 12 Dec 2019 02:18:18 +0000 (02:18 +0000)]
ObsoleteFiles.inc: remove stale comment
A comment at the top of the file claimed that the file was grouped into
OLD_FILES, OLD_LIBS, then OLD_DIRS, but that hasn't been the case since
the mid-2000s. Delete the stale comment, add a new comment for the
historical split entries, and move the one more recent entry (from 2013)
to group it into a single logical change.
Kyle Evans [Thu, 12 Dec 2019 01:41:55 +0000 (01:41 +0000)]
Add sigsetop extensions commonly found in musl libc and glibc
These functions (sigandset, sigisemptyset, sigorset) are commonly available
in at least musl libc and glibc; sigorset, at least, has proven quite useful
in qemu-bsd-user work for tracking the current process signal mask in a more
self-documenting/aesthetically pleasing manner.
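As a rough illustration of what the three operations compute, here is a model using Python sets of signal numbers. The C functions operate on sigset_t and write their result through a destination pointer; this sketch returns the result instead and is not the libc implementation.

```python
import signal


def sigorset(left, right):
    """Union of two signal sets (models sigorset)."""
    return set(left) | set(right)


def sigandset(left, right):
    """Intersection of two signal sets (models sigandset)."""
    return set(left) & set(right)


def sigisemptyset(sigset):
    """True when the set contains no signals (models sigisemptyset)."""
    return len(sigset) == 0


# Tracking a process signal mask: block one more signal on top of the
# current mask.
current_mask = {signal.SIGINT}
new_mask = sigorset(current_mask, {signal.SIGTERM})
```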
Kyle Evans [Thu, 12 Dec 2019 01:35:56 +0000 (01:35 +0000)]
stand: liblua: drop default buffer size to 128
Lua allocates LUAL_BUFFERSIZE buffers on the stack for various string
functions (string.format, string.gsub) -- this works out to be somewhat
significant and not necessary, based on how we use string operations.
Dropping it risks having to allocate per call to format/gsub, but this is
not the case for our usage. This simply stops allocating 8K buffers on the
stack when luaL_Buffer is used.
Kyle Evans [Thu, 12 Dec 2019 01:33:45 +0000 (01:33 +0000)]
usr.sbin/ntp: don't emit versions w/ make -s
<sys.mk> defines ECHO=echo when not using make -s, and ECHO=true when using
make -s.
export ECHO for ntp products and use it in the mkver script to echo the
version. This suppresses the output as appropriate. ECHO is given a default
value to make sure things still work as expected for anyone that isn't
redefining ECHO.
John Baldwin [Wed, 11 Dec 2019 23:41:39 +0000 (23:41 +0000)]
Emulate reads of the PCI command register for passthrough devices.
VFs return zero for the memory enable bit even if it has been set by a
prior write. After r348779 this caused the annoying behavior that a
guest OS would unintentionally disable memory decoding on a future
read-modify-write operation on the command register. Instead, return
the shadow value of the command register for reads. This ensures that
the guest will only toggle the state of the memory enable bit when it
specifically intends to do so.
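The effect on a guest read-modify-write can be modelled as below. This is a simulation under stated assumptions, not the bhyve code: VfCommandRegister and its methods are hypothetical, and the bit values are the standard PCI command register bits.

```python
PCIM_CMD_MEMEN = 0x0002       # memory space enable
PCIM_CMD_INTXDIS = 0x0400     # INTx disable, used here as an unrelated bit


class VfCommandRegister:
    """Hypothetical model of a VF command register plus a shadow copy."""

    def __init__(self):
        self._hw = 0
        self.shadow = 0

    def write(self, value):
        self._hw = value & 0xFFFF
        self.shadow = value & 0xFFFF

    def read_raw(self):
        # Models the VF behaviour: memory enable always reads back as zero.
        return self._hw & ~PCIM_CMD_MEMEN & 0xFFFF

    def read_emulated(self):
        # Return the shadow value so earlier writes are visible to the guest.
        return self.shadow


reg = VfCommandRegister()
reg.write(PCIM_CMD_MEMEN)
# A guest setting an unrelated bit via read-modify-write:
rmw_from_raw = reg.read_raw() | PCIM_CMD_INTXDIS            # loses MEMEN
rmw_from_shadow = reg.read_emulated() | PCIM_CMD_INTXDIS    # keeps MEMEN
```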
Mateusz Guzik [Wed, 11 Dec 2019 23:11:21 +0000 (23:11 +0000)]
vfs: locking primitives which elide ->v_vnlock and shared locking disablement
Neither feature is needed by many consumers, and both result in avoidable
reads, which in turn put them on profiles due to cache-line ping-ponging.
On top of that, the current lockmgr entry point is slower than necessary
when single-threaded. As an attempted cleanup in preparation for other changes,
provide new routines which don't support either of the aforementioned features.
With these patches in place vop_stdlock and vop_stdunlock disappear from
flamegraphs during -j 104 buildkernel.
Warner Losh [Wed, 11 Dec 2019 22:51:02 +0000 (22:51 +0000)]
Move reset to the interrupt processing stage
This trims the boot time a bit more for AWS and other platforms that have nvme
drives. There's no reason to do this inline. This has been in my tree a while,
but IIRC I talked to Jim Harris about this at one of our face to face meetings.
Ed Maste [Wed, 11 Dec 2019 22:09:22 +0000 (22:09 +0000)]
libpmc: convert arm64 data files to proper json
jevents includes a very permissive json parser that accepts invalid
json, of which there are many examples in libpmc (typically extra or
missing commas). Convert the arm64 files to proper json so other tools
can parse them.
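To see why this matters for other tools, a strict parser such as Python's json module rejects both kinds of error. The event records below are made-up samples, not taken from the libpmc data files.

```python
import json


def is_strict_json(text):
    """Return True if text parses with a standards-conforming JSON parser."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


extra_comma = '[{"EventName": "A"}, {"EventName": "B"},]'
missing_comma = '[{"EventName": "A"} {"EventName": "B"}]'
valid = '[{"EventName": "A"}, {"EventName": "B"}]'
```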
Kyle Evans [Wed, 11 Dec 2019 19:32:52 +0000 (19:32 +0000)]
makesyscalls.lua: trim trailing spaces/commas from args
These are insignificant as far as declarations go, and we've historically
allowed it. fhlinkat in ^/sys/kern/syscalls.master, for example, currently
has a trailing comma after its final argument that this version of
makesyscalls is ignoring (not by conscious decision).
Fix it for now by actively stripping off trailing whitespace/commas until
we decide to actively prohibit it.
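The stripping amounts to something like the following. The script itself is written in Lua; this Python sketch with the hypothetical name trim_args() only illustrates the intended behaviour.

```python
import re


def trim_args(argstr):
    """Strip any run of trailing whitespace and commas from an argument list."""
    return re.sub(r"[\s,]+$", "", argstr)
```

For example, trim_args("int fd, const char *path, ") gives "int fd, const char *path", while commas between arguments are left alone.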
Emmanuel Vadot [Wed, 11 Dec 2019 18:50:23 +0000 (18:50 +0000)]
dwmmc: Handle the card detect interrupt
The driver used to always add the mmc device as its child even
if no card was detected. Add a function that will detect whether the
card is present and attach or detach the mmc device accordingly.
The function is called either on attach (as the interrupt won't have
fired) or from two taskqueues. The first taskqueue calls the function
directly when the sdcard was present and is now removed, and the other
one delays the attach a bit when we didn't have a card and now have one.
This is mostly based on comments from the sdhci driver, where it describes
a situation in which the CD pin is detected before the other pins are connected.
Emmanuel Vadot [Wed, 11 Dec 2019 18:41:13 +0000 (18:41 +0000)]
dwmmc: Add a detach method
This method will disable the regulators and clocks and assert the reset of
the module. It will also detach its children (the mmc device) and release
its resources.
While here enable the regulators on attach as we need them to power up
the sdcard or emmc.
Emmanuel Vadot [Wed, 11 Dec 2019 18:39:05 +0000 (18:39 +0000)]
arm64: rk3328: Add the *clk_peri_niu clocks
Those clocks are always enabled by default and are not really explained
in the TRM, but the reason we have them is that they have the periph clock
as a parent, and that parent should never be disabled, which can happen
if we disable all the children. The current children are the sd/emmc/sdio clocks,
so the board will hang if we disable them.
Emmanuel Vadot [Wed, 11 Dec 2019 18:36:07 +0000 (18:36 +0000)]
arm64: Add explicit devices for dwmmc variant
We used to include the hisi version if soc_hisi_hi6220 was present,
include the altera version if dwmmc_altera was present and include
the rockchip version if soc_rockchip_rk3328 was present.
Now every version has its own device directive.
The rockchip version isn't named dwmmc_rockchip because all other
rockchip drivers are named rk_XXX.
Ed Maste [Wed, 11 Dec 2019 16:43:54 +0000 (16:43 +0000)]
security.7: add caveat about interim sysctl paths from r355436
r355436 moved mitigation sysctls to machdep.mitigations but did not
rationalize the sense of the individual knobs. Clarify that the old
names remain the canonical way to set these mitigations.
Backwards compatibility will be maintained for the original names
(e.g. hw.ibrs_disable), but not for the interim names
(e.g. machdep.mitigations.ibrs.disable).
Andriy Gapon [Wed, 11 Dec 2019 15:52:29 +0000 (15:52 +0000)]
add a sanity check to the system call registration code
A system call number should be at least reserved.
We do not expect an attempt to register a fixed number system call
when nothing at all is known about it.
Dimitry Andric [Tue, 10 Dec 2019 22:10:25 +0000 (22:10 +0000)]
Add a few missed source files to libllvm, for the MK_LLVM_TARGET_BPF=yes
case. Otherwise, linking of clang and other llvm based executables
would complain about missing symbols.
John Baldwin [Tue, 10 Dec 2019 21:56:44 +0000 (21:56 +0000)]
Correct the offset of static TLS variables for Initial-Exec on RISC-V.
TP points to the start of the TLS block after the tcb, but
Obj_Entry.tlsoffset includes the tcb, so subtract the size of the tcb
to compute the offset relative to TP.
This is identical to the same fixes for powerpc in r339072 and r342671.
Reviewed by: James Clarke
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22661
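The adjustment is a single subtraction. In this sketch the 16-byte tcb size (two 8-byte pointers on 64-bit RISC-V) and the function name are assumptions for illustration, not values taken from the commit.

```python
TCB_SIZE = 16  # assumption: two 8-byte pointers on 64-bit RISC-V


def tp_relative_offset(tlsoffset, var_offset, tcb_size=TCB_SIZE):
    """Offset of a static TLS variable relative to TP.

    tlsoffset counts from the start of the tcb, but TP points just past the
    tcb, so the tcb size has to be subtracted.
    """
    return tlsoffset - tcb_size + var_offset
```

With these assumed numbers, the first variable of an object whose tlsoffset is 16 sits at TP + 0.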
Ian Lepore [Tue, 10 Dec 2019 21:48:21 +0000 (21:48 +0000)]
Do not attach children of owc_gpiobus until interrupts are working.
The children of the bus need to do IO on the bus to probe for hardware
presence. Doing IO means timing the bus states using sbinuptime(), and
that requires working timecounters, which are not initialized until after
device attachment has completed.
bus_get/set_resource methods are implemented in child device (iicbus).
As their implementation with bus_generic_rl_get/set calls do not
recurse up the tree, the versions in ig4 are never called.
Kyle Evans [Tue, 10 Dec 2019 19:16:00 +0000 (19:16 +0000)]
sed: process \r, \n, and \t
This is both reasonable and a common GNUism that a lot of ported software
expects.
Universally process \r, \n, and \t into carriage return, newline, and tab
respectively. Newline still doesn't function in contexts where it can't
(e.g. BRE), but we process it anyways rather than passing
UB \n (escaped ordinary) through to the underlying regex engine.
Adding a --posix flag to disable these was considered, but sed.1 already
declares this version of sed a super-set of POSIX specification and this
behavior is the most likely expected when one attempts to use one of these
escape sequences in pattern space.
This differs from pre-r197362 behavior in that we now honor the three
arguably most common escape sequences used with sed(1) and we do so outside
of character classes, too.
Other escape sequences, like \s and \S, will come later when GNU extensions
are added to libregex; sed will likely link against libregex by default,
since the GNU extensions tend to be fairly un-intrusive.
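A minimal sketch of the translation step, assuming it runs over the pattern text before the regex engine sees it. process_escapes() is a hypothetical name, and the real sed code is C and handles more context than this.

```python
_ESCAPES = {"r": "\r", "n": "\n", "t": "\t"}


def process_escapes(pattern):
    """Turn backslash-r, backslash-n and backslash-t into the characters they name."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in _ESCAPES:
                out.append(_ESCAPES[nxt])
            else:
                # Leave every other escape (including an escaped backslash)
                # untouched for the regex engine.
                out.append(ch + nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Consuming escapes two characters at a time keeps an escaped backslash followed by "n" from being misread as a newline.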
Scott Long [Tue, 10 Dec 2019 18:57:39 +0000 (18:57 +0000)]
Fix the TAA state machine to do the right thing when the TAA
mitigation is available in microcode and the operator has set
the sysctl to automatic mode.
Bryan Drewery [Tue, 10 Dec 2019 18:50:50 +0000 (18:50 +0000)]
Fix WITHOUT_CLANG build.
This decouples MK_LLVM_TARGET_ALL from MK_CLANG. It is fine if
LLVM_TARGET_* are set even if MK_CLANG is disabled. It never
made sense to make MK_LLVM_TARGET_* depend on MK_CLANG (which I did
in r335706).
Mark Johnston [Tue, 10 Dec 2019 18:15:20 +0000 (18:15 +0000)]
Add a helper function to the swapout daemon's deactivation code.
vm_swapout_object_deactivate_pages() is renamed to
vm_swapout_object_deactivate(), and the loop body is moved into the new
vm_swapout_object_deactivate_page(). This makes the code a bit easier
to follow and is in preparation for some functional changes.
Mark Johnston [Tue, 10 Dec 2019 18:14:50 +0000 (18:14 +0000)]
Introduce vm_page_astate.
This is a 32-bit structure embedded in each vm_page, consisting mostly
of page queue state. The use of a structure makes it easy to store a
snapshot of a page's queue state in a stack variable and use cmpset
loops to update that state without requiring the page lock.
This change merely adds the structure and updates references to atomic
state fields. No functional change intended.
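The snapshot-and-cmpset pattern can be illustrated with a simulation. Everything here is an assumption made for the sketch: the field layout (16-bit flags, 8-bit queue, 8-bit act_count), the AtomicU32 class that fakes an atomic compare-and-set with a lock, and the helper names.

```python
import threading


class AtomicU32:
    """Simulated 32-bit word with compare-and-set (a lock stands in for the atomic)."""

    def __init__(self, value=0):
        self._value = value & 0xFFFFFFFF
        self._lock = threading.Lock()

    def load(self):
        return self._value

    def cmpset(self, old, new):
        with self._lock:
            if self._value != old:
                return False
            self._value = new & 0xFFFFFFFF
            return True


def pack(flags, queue, act_count):
    """Pack the assumed fields into one 32-bit value."""
    return (flags & 0xFFFF) | ((queue & 0xFF) << 16) | ((act_count & 0xFF) << 24)


def unpack(bits):
    return bits & 0xFFFF, (bits >> 16) & 0xFF, (bits >> 24) & 0xFF


def set_queue(word, queue):
    """Update only the queue field, retrying if the word changed underneath us."""
    while True:
        old = word.load()                      # snapshot into a local variable
        flags, _, act_count = unpack(old)
        if word.cmpset(old, pack(flags, queue, act_count)):
            return old
```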
Ian Lepore [Mon, 9 Dec 2019 21:55:44 +0000 (21:55 +0000)]
Allow baud rates of 1,228,800 and 1,843,200 on CP2101/2/3 usb-serial adapters.
The datasheets for these chips claim the maximum is 921,600, but testing
shows these two higher rates also work (but no rates above 921,600 other
than these two work; these represent dividing the base baud clock by 3 and 2
respectively).
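The divider arithmetic works out as follows. The base clock value is not stated in the commit; it is derived here from the two rates (1,228,800 * 3 and 1,843,200 * 2 both give 3,686,400).

```python
BASE_BAUD_CLOCK = 3_686_400  # derived: 1_228_800 * 3 == 1_843_200 * 2


def baud_for_divisor(divisor):
    """Baud rate produced by dividing the base baud clock by an integer."""
    return BASE_BAUD_CLOCK // divisor


rates = {d: baud_for_divisor(d) for d in (2, 3, 4)}
```

Divisors 2 and 3 give the two newly allowed rates, and divisor 4 gives the documented maximum of 921,600.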
Mark Johnston [Mon, 9 Dec 2019 19:25:15 +0000 (19:25 +0000)]
Configure headphone redirection for the Dell L780 and X1 Carbon 7th gen.
As we do for many other laptops, put the headphone jack and speakers in
the same association by default so that the generic sound device
automatically switches between them.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Dimitry Andric [Mon, 9 Dec 2019 19:17:56 +0000 (19:17 +0000)]
Correctly check for C++17 and higher when declaring timespec_get()
Summary:
In rS338751, the check to declare `timespec_get()` for C++17 and higher
was incorrectly done against a `cplusplus` define, while it should have
been `__cplusplus`.
Fix this by using `__cplusplus`, and also bump `__FreeBSD_version` so it
becomes possible to correctly check for `timespec_get()` in upstream
libc++ headers.
John Baldwin [Mon, 9 Dec 2019 19:17:28 +0000 (19:17 +0000)]
Copy out aux args after the argument and environment vectors.
Partially revert r354741 and r354754 and go back to allocating a
fixed-size chunk of stack space for the auxiliary vector. Keep
sv_copyout_auxargs but change it to accept the address at the end of
the environment vector as an input stack address and no longer
allocate room on the stack. It is now called at the end of
copyout_strings after the argv and environment vectors have been
copied out.
This should fix a regression in r354754 that broke the stack alignment
for newer Linux amd64 binaries (and probably broke Linux arm64 as
well).
Reviewed by: kib
Tested on: amd64 (native, linux64 (only linux-base-c7), and i386)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22695
Ian Lepore [Mon, 9 Dec 2019 19:00:39 +0000 (19:00 +0000)]
Switch gpioths(4) from using a callout to a taskqueue for periodic polling
of the sensor hardware. Part of the polling process involves signalling
the chip then waiting 20 milliseconds. This was being done with DELAY(),
which is a pretty rude thing to do in a callout. Now a taskqueue_thread
task is scheduled to do the polling, and because sleeping is allowed in
the task context, pause_sbt() replaces DELAY() for the 20ms wait.
Kyle Evans [Mon, 9 Dec 2019 17:34:40 +0000 (17:34 +0000)]
RPI: Fix DMA/SDHCI on the BCM2836 (Raspberry Pi 2)
r354875 pushed VCBUS <-> ARMC translations to runtime determination, but
incorrectly mapped addresses for the BCM2836 -- SOC_BCM2835 and SOC_BCM2836
are actually mutually exclusive, so the BCM2836 config (GENERIC) would have
taken the latter path in the header and used 0x3f000000 as peripheral start.
Easily fixed -- split out the BCM2836 into its own memmap config and use
that instead if SOC_BCM2836 is included. With this, we get back to userland
again.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
Leandro Lupori [Mon, 9 Dec 2019 13:40:23 +0000 (13:40 +0000)]
Enable use of ofwcons for early debug
This change enables the use of the OpenFirmware Console (ofwcons), even when VGA
is available, allowing early kernel messages to be seen, which is important in
case of crashes before VGA console initialization.
This is especially useful in virtualized environments, where the user/developer
doesn't have full control of the virtualization engine (e.g. OpenStack).
The old behavior is preserved by default and, in order to use ofwcons, a few
tunables that have been introduced need to be set:
- hw.ofwfb.disable=1 - disable OFW FrameBuffer device
- machdep.ofw.mtx_spin=1 - change PPC OFW mutex to SPIN type, to match kernel
console's mutex type
- debug.quiesce_ofw=0 - don't call OFW quiesce, needed to keep ofwcons I/O
working
More details can be found at differential revision D20640.
Doug Moore [Sun, 8 Dec 2019 22:33:51 +0000 (22:33 +0000)]
Define a vm_map method for user-space for advancing from a map entry
to its successor in cases where examining a map entry requires a
helper like kvm_read_all. Use that method, with kvm_read_all, to fix
procstat_getfiles_kvm, which tries to find the successor now without
using such a helper. This addresses a problem introduced by r355491.
Mateusz Guzik [Sun, 8 Dec 2019 21:30:04 +0000 (21:30 +0000)]
vfs: introduce v_irflag and make v_type smaller
The current vnode layout is not smp-friendly by having frequently read data
avoidably sharing cachelines with very frequently modified fields. In
particular v_iflag inspected for VI_DOOMED can be found in the same line with
v_usecount. Instead make it available in the same cacheline as the v_op, v_data
and v_type which all get read all the time.
v_type is avoidably 4 bytes while the necessary data will easily fit in 1.
Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new
flag field with a new value: VIRF_DOOMED.
Reviewed by: kib, jeff
Differential Revision: https://reviews.freebsd.org/D22715
Mateusz Guzik [Sun, 8 Dec 2019 21:13:07 +0000 (21:13 +0000)]
vfs: clean up vputx a little
1. replace hand-rolled macros for operation type with an enum
2. unlock the vnode in vput itself; there is no need to branch on it. The existence
of VPUTX_VPUT remains significant in that the inactive variant adds LK_NOWAIT
to the locking request.
3. remove the useless v_usecount assertion. A few lines above, the code checks if
v_usecount > 0 and leaves. Should the value be negative, refcount would fail.
4. the CTR "return vnode %p to the freelist" is incorrect, as vdrop may find the
vnode with holdcnt > 1. If the like should exist, it should be moved there.
5. no need to set error = 0 for everyone
Reviewed by: kib, jeff (previous version)
Differential Revision: https://reviews.freebsd.org/D22718
Ian Lepore [Sun, 8 Dec 2019 21:12:33 +0000 (21:12 +0000)]
Add a MODULE_DEPEND() for the gpioths driver. Also, note that the prior commit
changed the sysctl format for the temperature from "I" to "IK", and
correspondingly changed the units from integer degrees C to decikelvin.
For access via sysctl(8) the output will be the same except that now
decimal fractions will be shown when available.
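For reference, the unit change amounts to the conversion below. It assumes the driver works in tenths of a degree and uses 2731 as the decikelvin value of 0 C (273.15 K rounded down to a whole decikelvin); the actual constant and rounding in the driver may differ.

```python
ZERO_C_DECIKELVIN = 2731  # assumption: 273.15 K rounded to whole decikelvin


def celsius_tenths_to_decikelvin(tenths_c):
    """Convert tenths of a degree Celsius to decikelvin."""
    return tenths_c + ZERO_C_DECIKELVIN


def decikelvin_to_celsius_tenths(decikelvin):
    """Inverse conversion, back to tenths of a degree Celsius."""
    return decikelvin - ZERO_C_DECIKELVIN
```

Under these assumptions a reading of 23.5 C (235 tenths) is stored as 2966.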
Ian Lepore [Sun, 8 Dec 2019 20:42:58 +0000 (20:42 +0000)]
Add support for more chips to the gpioths driver.
Previously the driver supported the DHT11 sensor. Now it supports
DHT11, DHT12, DHT21, DHT22, AM3201, AM3202.
All these chips are similar, differing primarily in supported temperature
and humidity ranges and accuracy (and, presumably, cost). There are two
basic data formats reported by the various chips, and it is possible to
figure out at runtime which format to use for decoding the data based on
the range of values in a single byte of the humidity measurement (which
is detailed in a comment block, so I won't recapitulate it here).