Simplify kvm symbol resolution and error handling. The symbol table
nl_symbols will eventually be organized into several modules depending
on MK_* variables.
After the introduction of direct dispatch, the pacing code in g_down()
broke in two ways. One, the pacing variable was accessed in multiple
threads in an unsafe way. Two, since large numbers of I/O could come
down from the buf layer at one time, large numbers of allocation
failures could happen all at once, resulting in a huge pace value that
would limit I/Os to 10 IOPS for minutes (or even hours) at a
time. While a real solution to these problems requires substantial
work (moving to a model with no allocation after the first, or having
some way to wait for more memory with some kind of reserve for pager
and swapper requests), it is relatively easy to make this simplistic
pacing less pathological.
First, move to using a volatile variable with loads and stores. While
this is a little racy, losing the race is safe: either you get memory and
proceed, or you don't and queue. Second, sleep for 1ms (or one tick, whichever
is larger) instead of 100ms. This removes the artificial 10 IOPS limit
while still easing up on new I/Os during memory shortages. Remove
tying the amount of time we do this to the number of failed requests
and do it only as long as we keep failing requests.
Finally, to avoid needless recursion when memory is tight (start ->
g_io_deliver() -> g_io_request() -> start -> ... until we use 1/2 the
stack), don't do direct dispatch while pacing. This should be a rare
event (not steady state) so the performance hit here is worth the
extra safety of not starving g_down() with directly dispatched I/O.
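The pacing scheme described above can be sketched in userland C. This is only an illustration under stated assumptions, not the g_down() code: the names pace, note_alloc_failure(), pace_delay_us(), may_direct_dispatch() and worker_pause_us() are hypothetical, and the tick length is passed in as a parameter rather than read from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pacing flag: plain loads and stores only. */
static volatile int pace;

/* Hypothetical: called when allocating memory for a request fails. */
static void
note_alloc_failure(void)
{
	/* A store, not a counter: many failures cannot inflate the delay. */
	pace = 1;
}

/* Hypothetical: pause length is 1 ms or one tick, whichever is larger. */
static int
pace_delay_us(int tick_us)
{
	assert(tick_us > 0);
	return (tick_us > 1000 ? tick_us : 1000);
}

/* Hypothetical: direct dispatch is allowed only while not pacing. */
static bool
may_direct_dispatch(void)
{
	return (pace == 0);
}

/*
 * Hypothetical: called by the worker before each pass. Returns how long
 * to pause in microseconds, or 0 if no failure was seen since last time.
 */
static int
worker_pause_us(int tick_us)
{
	if (pace == 0)
		return (0);
	pace = 0;	/* A new failure will simply set it again. */
	return (pace_delay_us(tick_us));
}
```

Because the flag is only ever set or cleared, a lost race costs at most one extra or one missed short pause, which matches the "losing the race is safe" argument in the message.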
cem [Wed, 2 Sep 2015 16:48:03 +0000 (16:48 +0000)]
ioat: re-initialize interrupts after resetting hw on BDXDE
Resetting some generations of the I/OAT hardware (just BDXDE for now)
resets the corresponding MSI-X registers. So, teardown and
re-initialize interrupts after resetting the hardware.
The ${BUILDKERNELS:[2..-1]} expression appears to produce a non-zero result
for a one-word variable, which is quite unexpected given the documentation.
So, to avoid double installation of a single kernel, protect the extra
kernels loop with a ${BUILDKERNELS:[#]} > 1 conditional.
It's 2015, and some people are still trying to use fdisk and then
go asking what debug flags to set for GEOM to make it work. Advise
them to use gpart(8) instead.
Something similar should probably be done with disklabel,
but I need to rewrite the disklabel examples first.
Reviewed by: wblock@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3315
Fix dynamic attach/detach of 802.11 devices after r287197:
o In pccard_ether add code to start children of an 802.11
device that are configured in rc.conf.
o In devd.conf provide a regex matching all 802.11 devices,
and on match run pccard_ether to spawn children.
PR: 202784
Submitted by: <vidwer gmail.com>
In collaboration with: "Oleg V. Nauman" <oleg opentransfer.com>
Export current system call code and argument count for system call entry
and exit events. procfs stop events for system call tracing report these
values (argument count for system call entry and code for system call exit),
but ptrace() does not provide this information. (Note that while the system
call code can be determined in an ABI-specific manner during system call
entry, it is not generally available during system call exit.)
The values are exported via new fields at the end of struct ptrace_lwpinfo
available via PT_LWPINFO.
pf: Fix misdetection of forwarding when net.link.bridge.pfil_bridge is set
If net.link.bridge.pfil_bridge is set we can end up thinking we're forwarding in
pf_test6() because the rcvif and the ifp (output interface) are different.
In that case we're bridging though, and the rcvif is the bridge member on which
the packet was received and ifp is the bridge itself.
If we'd set dir to PF_FWD we'd end up calling ip6_forward() which is incorrect.
Instead check if the rcvif is a member of the ifp bridge. (In other words, the
if_bridge is the ifp's softc). If that's the case we're not forwarding but
bridging.
PR: 202351
Reviewed by: eri
Differential Revision: https://reviews.freebsd.org/D3534
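The bridging check described in the pf fix above can be sketched as follows. This is a minimal illustration with stand-in types, not the pf_test6() code: struct ifnet_sketch and is_forwarding() are hypothetical, and only the two pointers the message mentions (the member's if_bridge and the bridge interface's softc) are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for an interface structure. */
struct ifnet_sketch {
	void *if_softc;		/* driver state; for a bridge, the bridge softc */
	void *if_bridge;	/* set on a member: the softc of its bridge */
};

/*
 * Hypothetical: decide whether a packet received on rcvif and leaving
 * through ifp is being forwarded (true) or not (false).
 */
static bool
is_forwarding(const struct ifnet_sketch *rcvif, const struct ifnet_sketch *ifp)
{
	assert(rcvif != NULL && ifp != NULL);

	/* Same interface: not forwarding. */
	if (rcvif == ifp)
		return (false);

	/* rcvif is a member of the bridge ifp: bridging, not forwarding. */
	if (rcvif->if_bridge != NULL && rcvif->if_bridge == ifp->if_softc)
		return (false);

	return (true);
}
```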
Fix an off by one error in r283613: Like regular ffs(), CPU_FFS() returns
1 for CPU 0, etc. so the return value must be decremented to obtain the
first valid CPU ID.
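The off-by-one can be illustrated with the regular ffs() from <strings.h>, which the message says CPU_FFS() mirrors. This is a sketch using a plain int bit mask instead of a cpuset; first_cpu_id() is a hypothetical helper, not part of the fixed code.

```c
#include <assert.h>
#include <strings.h>	/* ffs() */

/*
 * Hypothetical helper: ID of the lowest CPU present in a non-empty mask.
 * ffs() is 1-based (it returns 1 when bit 0 is set), so the result must
 * be decremented to obtain a 0-based CPU ID.
 */
static int
first_cpu_id(int mask)
{
	assert(mask != 0);
	return (ffs(mask) - 1);
}
```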
andrew [Tue, 1 Sep 2015 17:13:04 +0000 (17:13 +0000)]
Add support for the dwc usb in the HiSilicon hi6220 in the HiKey board. For
this we need to force the driver into host mode, as without this the driver
fails to detect any devices.
andrew [Tue, 1 Sep 2015 16:25:12 +0000 (16:25 +0000)]
Add support for the DesignWare MMC hardware in the HiSilicon hi6220. This
SoC is used in the HiKey board from 96boards.
Currently only the SD card is working on the HiKey; as such, devices 0
and 2 will need to be disabled, for example by adding the following to
loader.conf:
andrew [Tue, 1 Sep 2015 15:57:03 +0000 (15:57 +0000)]
Fix how we place each object's thread-local data. The code used was based
on the Variant II code, however arm64 uses Variant I. The former places the
thread pointer after the data, pointing at the thread control block, while
the latter places both before the data.
Because of this we need to use the size of the previous entry to calculate
where to place the current entry. We also need to reserve 16 bytes at the
start for the thread control block.
This also fixes the value of TLS_TCB_SIZE to be correct. This is the size
of two unsigned longs, i.e. 2 * 8 bytes.
While here remove the bogus adjustment of the pointer in the
R_AARCH64_TLS_TPREL64 case. It should be the offset of the data relative
to the thread pointer, including the thread control block.
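The Variant I placement described above can be sketched as a pair of offset helpers. This is an illustration under assumptions, not the rtld code: first_tls_offset(), next_tls_offset() and round_up() are hypothetical names, and alignment handling is reduced to rounding up to a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Two unsigned longs, as in the text: 16 bytes where unsigned long is 8. */
#define	TLS_TCB_SIZE	(2 * sizeof(unsigned long))

/* Round x up to a multiple of align, which must be a power of two. */
static size_t
round_up(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((x + align - 1) & ~(align - 1));
}

/* Hypothetical: the first block starts just past the reserved TCB. */
static size_t
first_tls_offset(size_t align)
{
	return (round_up(TLS_TCB_SIZE, align));
}

/*
 * Hypothetical: each later block is placed after the previous one, so
 * its offset depends on the previous entry's offset and size.
 */
static size_t
next_tls_offset(size_t prev_offset, size_t prev_size, size_t align)
{
	return (round_up(prev_offset + prev_size, align));
}
```

With these helpers the offset of a variable relative to the thread pointer already includes the thread control block, so no further adjustment is applied.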
andrew [Tue, 1 Sep 2015 15:43:56 +0000 (15:43 +0000)]
Ensure we use calculate_first_tls_offset, even if the main program doesn't
have a TLS program header. This is needed on architectures with Variant I
TLS, that is arm, arm64, mips, and powerpc. These place the thread control
block at the start of the buffer and, without this, this data may be
trashed.
This does not appear to be an issue on mips or powerpc, as they include a
second adjustment to move the thread-local data; however, it is an issue on
arm64 (with a future change to fix placing this data), and should be on arm.
I am unable to trigger this on arm, even after changing the code to move the
data around to make it more likely to be hit. This is most likely because my
tests didn't use the variable at offset 0.
Remove '-' separating OSRELEASE and SNAPSHOT_DATE for vagrant
builds, and prepend it to SNAPSHOT_DATE to prevent a trailing '-'
in the final box name for a release build.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Exit notification for EVFILT_PROC removes the knote from the knlist. In
particular, this invalidates the knote kn_link linkage, making the
SLIST_FOREACH() loop access undefined values (e.g. trashed by
QUEUE_MACRO_DEBUG). If the knote is freed by another thread when the kq
lock is released or when influx is cleared, e.g. by knote_scan() for the
kqueue owning the knote, the iteration step would access freed memory.
Use SLIST_FOREACH_SAFE() to fix iteration.
Diagnosed by: avg
Tested by: avg, lstewart, pawel
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
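The difference between the two iteration macros can be shown with a small userland list. This is a sketch, not the kqueue code: struct note, add_note() and drop_notes() are hypothetical, and the fallback definition of SLIST_FOREACH_SAFE() is included only because some C libraries ship a <sys/queue.h> without the _SAFE variants.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Fallback for systems whose <sys/queue.h> lacks the _SAFE variant. */
#ifndef SLIST_FOREACH_SAFE
#define	SLIST_FOREACH_SAFE(var, head, field, tvar)			\
	for ((var) = SLIST_FIRST((head));				\
	    (var) && ((tvar) = SLIST_NEXT((var), field), 1);		\
	    (var) = (tvar))
#endif

struct note {
	int id;
	SLIST_ENTRY(note) link;
};
SLIST_HEAD(notelist, note);

/* Hypothetical: push a new note with the given id onto the list. */
static void
add_note(struct notelist *head, int id)
{
	struct note *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->id = id;
	SLIST_INSERT_HEAD(head, n, link);
}

/*
 * Hypothetical: remove and free every note with a matching id. The next
 * pointer is saved in tmp before the body runs, so the step never reads
 * the linkage of a note that was just removed and freed.
 */
static int
drop_notes(struct notelist *head, int id)
{
	struct note *n, *tmp;
	int removed = 0;

	SLIST_FOREACH_SAFE(n, head, link, tmp) {
		if (n->id == id) {
			SLIST_REMOVE(head, n, note, link);
			free(n);
			removed++;
		}
	}
	return (removed);
}
```

Writing the same loop with SLIST_FOREACH() would read n's link field after free(n) in the iteration step, which is the kind of access the fix avoids.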
Explain why it is fine to not check for M_NOWAIT failures in
kqueue_register(). Remove the unneeded check for a NULL result from a
waitable allocation in kqueue_scan(). uma_free(9) handles a NULL
argument correctly, so remove the checks for NULL. Remove a useless cast
and adjust style in knote_alloc().
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
By extending a file quickly, it is possible to create an excess supply
of the D_NEWBLK kinds of dependencies (i.e. D_ALLOCDIRECT and
D_ALLOCINDIR), which can exhaust kmem.
Handle an excess of D_NEWBLK in the same way as an excess of D_INODEDEP
and D_DIRREM, by scheduling an ast to flush dependencies after the thread
which created the new dep has left the VFS/FFS innards. For D_NEWBLK, the
only way to get rid of them is to do a full sync, since the items are
attached to data blocks of arbitrary vnodes. The check for D_NEWBLK
excess in softdep_ast_cleanup_proc() is unlocked.
For 32-bit arches, reduce the total amount of allowed dependencies by
two. Increasing the limit for 64-bit platforms with direct maps could
be considered.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Fix t_spawnattr test for attributes handling by posix_spawn(3).
Connect it to the build.
The code assumed that SCHED_* constants form a contiguous set of
numbers; remove the assumption by using the schedulers[] array in
get_different_scheduler(). This is a no-op on FreeBSD, but improves
code portability.
The selection of different priority used the min/max priority range of
the current scheduler class, instead of the priority to be changed to.
The bug caused the test failure.
Remove duplication of POSIX_SPAWN_SETSIGDEF flag and now unused
duplications of MIN/MAX definitions.
Reviewed by: jilles, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D3533
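The two fixes to the test described above can be sketched like this. It is an illustration only, not the t_spawnattr source: the signatures of get_different_scheduler() and get_different_priority() are assumptions, as is the exact content of schedulers[].

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Explicit list of policies; no assumption that the values are contiguous. */
static const int schedulers[] = { SCHED_OTHER, SCHED_FIFO, SCHED_RR };
#define	N_SCHEDULERS	(sizeof(schedulers) / sizeof(schedulers[0]))

/* Return a policy from schedulers[] that differs from the current one. */
static int
get_different_scheduler(int current)
{
	size_t i;

	for (i = 0; i < N_SCHEDULERS; i++)
		if (schedulers[i] != current)
			return (schedulers[i]);
	return (current);	/* Not reached with three distinct policies. */
}

/*
 * Pick a priority that is valid for the policy being switched TO, and
 * that differs from current_prio whenever that policy's range allows.
 */
static int
get_different_priority(int target_policy, int current_prio)
{
	int min = sched_get_priority_min(target_policy);
	int max = sched_get_priority_max(target_policy);

	assert(min <= max);
	return (current_prio != max ? max : min);
}
```

The key point is that the min/max range is queried for the target policy, not for the scheduler class the process currently runs under.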
When building multiple kernels use [2..-1] to extract the !INSTALLKERNEL
entries from the BUILDKERNELS list. This is more strict, since
INSTALLKERNEL by definition is the first word of the BUILDKERNELS list.
The previous code failed if INSTALLKERNEL was a substring of an
additional kernel name.
Reviewed by: gjb
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
andrew [Tue, 1 Sep 2015 10:47:42 +0000 (10:47 +0000)]
Remove a variable we only ever write to, and stop assigning 0 to values
in the softc as it's the default value. The latter helps with subclassing
this driver.
As a result of the bug there was a timing window where callout_reset()
would fail to cancel a concurrent execution of a callout that is about
to start and would schedule the callout again.
The callout would fire more times than it is scheduled.
That would happen even if the callout is initialized with a lock.
For example, the bug triggered the "Stray timeout" assertion in
taskqueue_timeout_func().
allanjude [Mon, 31 Aug 2015 22:36:17 +0000 (22:36 +0000)]
Remove duplicate defines introduced in initial ZFS import (r168404)
This change reduces compiler warnings by removing duplicate defines
Line numbers are from r168404 (and r284648)
#define lbolt: lines 384 and 459 (531 and 648) (original was renamed later)
#define lbolt64: lines 385 and 460 (532 and 649) (original was renamed later)
#define gethrestime_sec: lines 390 and 465 (540 and 653)
uint64_t physmem: lines 402 and 463 (561 and 651)
marcel [Sun, 30 Aug 2015 23:58:53 +0000 (23:58 +0000)]
Add support for the UGA draw protocol. This includes adding a
command called 'uga' to show whether UGA is implemented by the
firmware and what the settings are. It also includes filling
the efi_fb structure from the UGA information when GOP isn't
implemented by the firmware.
Since UGA does not provide information about the stride, we
set the stride to the horizontal resolution. This is likely
not correct and we should determine the stride by trial and
error. For now, this should show something on the console
rather than nothing.
gnn [Sun, 30 Aug 2015 20:59:19 +0000 (20:59 +0000)]
A bibliography of FreeBSD and BSD related papers and books.
Keep this file in order by primary key which is the first author's
last name and the year of publication.
kib [Sun, 30 Aug 2015 18:02:57 +0000 (18:02 +0000)]
Use P1B_PRIO_MAX to designate max posix priority for the RR/FIFO
scheduler types. It was intended to be used there, compare with the
min value, and with the test for correctness in ksched_setscheduler().
Note that P1B_PRIO_MAX and RTP_PRIO_MAX do have the same numerical
values, so the change is cosmetic.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
loos [Sun, 30 Aug 2015 16:10:12 +0000 (16:10 +0000)]
Reduce the difference to vendor DTS by using the vendor compat strings (at
some point we have to use the complete vendor DTS files, but we're not
there yet).
loos [Sun, 30 Aug 2015 15:38:41 +0000 (15:38 +0000)]
In preparation to support other A20 based boards, rename the CUBIEBOARD2
kernel configuration to A20.
There are other boards (namely the banana pi) that use exactly the same
devices.
Additionally, we are moving from static FDT support (DTB compiled
in-kernel) to DTB passed to kernel by the boot loader (ubldr). The u-boot
for these boards is already available in ports, and as the crochet support
for these boards isn't committed yet, this should not bring any issues.
jch [Sun, 30 Aug 2015 13:44:46 +0000 (13:44 +0000)]
Revert r286880: While this change made sense at first, it turns out
it helps only the TCP timers' callout(9) usage. As its benefit for
other callout(9) usages did not reach a consensus, the historical
usage should prevail.
jch [Sun, 30 Aug 2015 13:44:39 +0000 (13:44 +0000)]
Put r284245 back in place: If at first this fix was seen as a temporary
workaround for a callout(9) issue, it turns out it is instead the right
way to use callout in mpsafe mode without using callout_drain().
r284245 commit message:
Fix a callout race condition introduced in TCP timers callouts with r281599.
In TCP timer context, it is not enough to check the callout_stop() return
value to decide if a callout is still running or not; previous
callout_reset() return values also have to be checked.
kib [Sun, 30 Aug 2015 04:46:44 +0000 (04:46 +0000)]
Fix a mistake in r287292. Despite correctly stating intent in the
comment above, POSIX_SPAWN_SETSIGMASK and POSIX_SPAWN_SETSIGDEF
handlers used libthr interposed functions instead of syscalls.
Noted by: jilles
Sponsored by: The FreeBSD Foundation
MFC after: 6 days
marcel [Sun, 30 Aug 2015 01:39:59 +0000 (01:39 +0000)]
Add a gop command to help diagnose VT efifb problems. The gop
command has the following sub-commands:
list - list all possible modes (paged)
get - return the current mode
set <mode> - set the current mode to <mode>
rodrigc [Sat, 29 Aug 2015 19:47:20 +0000 (19:47 +0000)]
- Replace N(a)/N(i)/N(T)/LEN(a)/ARRAY_SIZE(a) with nitems()
- Add missing <err.h> for err() and <sys/sysctl.h> for sysctlbyname()
- NULL -> 0 for 5th parameter of sysctlbyname()
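For reference, the nitems() idiom that replaces the assorted local macros looks like this. On FreeBSD the macro comes from <sys/param.h>; the fallback definition and the sum_ints() helper below are only there to keep this sketch self-contained and are not part of the change.

```c
#include <assert.h>
#include <stddef.h>

/* Fallback for systems that do not provide nitems(). */
#ifndef nitems
#define	nitems(x)	(sizeof((x)) / sizeof((x)[0]))
#endif

/* Hypothetical helper: sum the first n elements of v. */
static int
sum_ints(const int *v, size_t n)
{
	size_t i;
	int sum = 0;

	assert(v != NULL || n == 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return (sum);
}
```

nitems() only works on true arrays; applied to a pointer it silently yields the wrong count, so the element count is taken at the point where the array itself is in scope and passed along.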
jilles [Sat, 29 Aug 2015 19:41:47 +0000 (19:41 +0000)]
sh: Add set -o nolog.
POSIX requires this to prevent entering function definitions in history but
this implementation does nothing except retain the option's value. In ksh88,
function definitions were usually entered in the history file, even when
they came from ~/.profile and the $ENV file, to allow displaying their
definitions.
This is also the first option that does not have a letter.
kib [Sat, 29 Aug 2015 14:25:01 +0000 (14:25 +0000)]
Switch libc from using _sig{procmask,action,suspend} symbols, which
are aliases for the syscall stubs and are plt-interposed, to the
libc-private aliases of internally interposed sigprocmask() etc.
Since e.g. _sigaction is not interposed by libthr, calling signal()
removes thr_sighandler() from the handler slot etc. The result was
broken signal semantics and rtld locking.
The added __libc_sigprocmask and other symbols are hidden; they are
not exported and cannot be called through the PLT. The setjmp/longjmp
functions for x86 were changed to use direct calls, and since
PIC_PROLOGUE is only needed for functional PLT indirection on i386, it is
removed as well.
The PowerPC bug of calling the syscall directly in the setjmp/longjmp
implementation is kept as is.
Reported by: Pete French <petefrench@ingresso.co.uk>
Tested by: Michiel Boland <boland37@xs4all.nl>
Reviewed by: jilles (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week