Jayachandran C. [Mon, 12 Dec 2016 15:35:57 +0000 (15:35 +0000)]
Fix gic_cpu_mask() calculation in ARM GIC
r309616 changed the definition of GICD_ITARGETSR(n) to take the irq
id as argument, but the usage of the macro in gic_cpu_mask() was not
updated to reflect this. This causes the cpu mask to be computed
incorrectly.
Update the GICD_ITARGETSR() call to fix this, this fixes a hang seen
while booting freebsd on qemu-system-aarch64 with SMP enabled.
Jayachandran C. [Mon, 12 Dec 2016 15:17:56 +0000 (15:17 +0000)]
Increase interrupt cells in generic_pcie_fdt_route_interrupt
ARM GIC specification in device trees use 3 cells, so the current
limit of 2 causes the last cell to be dropped. This in turn can
cause the interrupt polarity and trigger settings to be incorrect.
Increase the limit to 4 which should handle all reasonable cases.
This fixes issues seen in QEMU when registering PCI interrupts.
Add rcvif local variable to keep inbound interface pointer. Count
ifs6_in_discard errors in all "goto bad" cases. Now it will count
errors even if mbuf was freed. Modify all places where m->m_pkthdr.rcvif
is used to use local rcvif variable.
When a zombie gets reparented due to the parent exit, send SIGCHLD to
the reaper.
The traditional reaper init(8) is aware of zombies silently reparented
to it after the parents exit, it loops around waitpid(2) to collect
them. For other reapers, the silent reparenting is surprising and
collecting zombies requires a thread blocking in waitpid(2) just for
that purpose. It seems that sending second SIGCHLD is a better
workaround than forcing all reapers to obey the setup.
Reported by: Michael Zuo <muh.muhten@gmail.com>, jilles
PR: 213928
Reviewed by: jilles (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Add ip6_tryforward() - a run to completion forwarding implementation
for IPv6.
It gets performance benefits from reduced number of checks. It doesn't
copy mbuf to be able send ICMPv6 error message, because it keeps mbuf
unchanged until the moment, when the route decision has been made.
It doesn't do IPsec checks, and when some IPsec security policies present,
ip6_input() uses normal slow path.
Hiren Panchasara [Sun, 11 Dec 2016 23:14:47 +0000 (23:14 +0000)]
We currently don't do TSO if ip options are present. In case of IPv6, we look at
in6p_options to check that. That is incorrect as we carry ip options in
in6p_outputopts. Also, just checking for in6p_outputopts being NULL won't
suffice as we combine ip options and ip header fields both in that one field.
The commit fixes this by using ip6_optlen() which correctly calculates length
of only ip options for IPv6.
Alexander Motin [Sun, 11 Dec 2016 19:50:39 +0000 (19:50 +0000)]
Postpone ZVOL media/block size caching till first open.
At least on FreeBSD there are no legal way to access media or get its
size without opening device/provider first. Postponing this caching
allows to skip several disk seeks per ZVOL/snapshot during import.
For HDD pool with 1 ZVOL in dev mode with 1000 snapshots this reduces
pool import time from 40 to 10 seconds.
Alan Cox [Sun, 11 Dec 2016 19:24:41 +0000 (19:24 +0000)]
When tmpfs and POSIX shm pagein a page for the sole purpose of performing
truncation, immediately queue the page for asynchronous laundering rather
than making the page pass through inactive queue first.
Michael Tuexen [Sun, 11 Dec 2016 13:26:35 +0000 (13:26 +0000)]
Ensure that the reported ppid and tsn are taken from the first fragment.
This fixes a bug where the wrong ppid was reported, if
* I-DATA was used on the first fragement was not received first
* DATA was used and different ppids where used.
Thanks to Julian Cordes for making me aware of the issue.
- Do not ignore initialization errors; call ieee80211_stop()
when initialization failed.
- Use usb_pause_mtx() instead of DELAY() while waiting for firmware
loading; this fixes system freeze during firmware startup.
- Do not execute rsu_stop() when device is powered off; fixes
'unknown board type (rfconfig=0xff)' error when the device is
reattached.
Enji Cooper [Sat, 10 Dec 2016 22:08:33 +0000 (22:08 +0000)]
Change the process limits for RLIMIT_MEMLOCK to RLIM_INFINITY when
executing :mincore_resid
The default process limits in FreeBSD is 64kB for unprivileged users,
which empirically is too low to run the :mincore_resid testcase.
Process limits are inherited, so even though the default limit for
root users is RLIM_INFINITY, the inherited limit with "sudo" with the
default login.conf will be 64kB.
Use setrlimit to set rlim_max for RLIMIT_MEMLOCK to RLIM_INFINITY to
avoid ENOMEM issues when calling mlock to wire the mmap'ed address
space.
setrlimit requires root access to increase rlim_max, so require root
privileges when running the test
Discovered when executing the tests with sudo, e.g.
"sudo kyua test -k /usr/tests/lib/libc/sys/Kyuafile mincore_test"
Dimitry Andric [Sat, 10 Dec 2016 22:03:44 +0000 (22:03 +0000)]
Tentatively apply https://reviews.llvm.org/D18730 to work around gcc PR
70528 (bogus error: constructor required before non-static data member).
This should fix buildworld with the external gcc package.
- Replace all remaining DPRINTF(N)'s with RSU_DPRINTF.
- Add new RSU_DEBUG_USB flag to track error codes returned by
usbd_do_request_flags().
- Improve few messages.
- Add partial promiscuous mode support (no management frames;
they cannot be received by the firmware and net80211 at the same time).
- Add monitor mode support (all frames).
For horizontal (T-axis) wheel reporting which is not supported by
sysmouse protocol kern.evdev.sysmouse_t_axis sysctl is introduced.
It can take following values:
0 - no T-axis events (default)
1 - T-axis events are originated in ums(4) driver.
2 - T-axis events are originated in psm(4) driver.
Changes include modifications in kernel crash dump routines, dumpon(8) and
savecore(8). A new tool called decryptcore(8) was added.
A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump
configuration in the diocskerneldump_arg structure to the kernel.
The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for
backward ABI compatibility.
dumpon(8) generates an one-time random symmetric key and encrypts it using
an RSA public key in capability mode. Currently only AES-256-CBC is supported
but EKCD was designed to implement support for other algorithms in the future.
The public key is chosen using the -k flag. The dumpon rc(8) script can do this
automatically during startup using the dumppubkey rc.conf(5) variable. Once the
keys are calculated dumpon sends them to the kernel via DIOCSKERNELDUMP I/O
control.
When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random
IV and sets up the key schedule for the specified algorithm. Each time the
kernel tries to write a crash dump to the dump device, the IV is replaced by
a SHA-256 hash of the previous value. This is intended to make a possible
differential cryptanalysis harder since it is possible to write multiple crash
dumps without reboot by repeating the following commands:
# sysctl debug.kdb.enter=1
db> call doadump(0)
db> continue
# savecore
A kernel dump key consists of an algorithm identifier, an IV and an encrypted
symmetric key. The kernel dump key size is included in a kernel dump header.
The size is an unsigned 32-bit integer and it is aligned to a block size.
The header structure has 512 bytes to match the block size so it was required to
make a panic string 4 bytes shorter to add a new field to the header structure.
If the kernel dump key size in the header is nonzero it is assumed that the
kernel dump key is placed after the first header on the dump device and the core
dump is encrypted.
Separate functions were implemented to write the kernel dump header and the
kernel dump key as they need to be unencrypted. The dump_write function encrypts
data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps
are not supported due to the way they are constructed which makes it impossible
to use the CBC mode for encryption. It should be also noted that textdumps don't
contain sensitive data by design as a user decides what information should be
dumped.
savecore(8) writes the kernel dump key to a key.# file if its size in the header
is nonzero. # is the number of the current core dump.
decryptcore(8) decrypts the core dump using a private RSA key and the kernel
dump key. This is performed by a child process in capability mode.
If the decryption was not successful the parent process removes a partially
decrypted core dump.
Description on how to encrypt crash dumps was added to the decryptcore(8),
dumpon(8), rc.conf(5) and savecore(8) manual pages.
EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU.
The feature still has to be tested on arm and arm64 as it wasn't possible to run
FreeBSD due to the problems with QEMU emulation and lack of hardware.
This adds a configuration for arm64 users that track CURRENT but
don't need the extra debug facilities. Copied from the amd64
configuration of the same name.
Marcel Moolenaar [Sat, 10 Dec 2016 03:31:38 +0000 (03:31 +0000)]
Improve upon r309394
Instead of taking an extra reference to deal with pfsync_q_ins()
and pfsync_q_del() taken and dropping a reference (resp,) make
it optional of those functions to take or drop a reference by
passing an extra argument.
Mark Johnston [Sat, 10 Dec 2016 03:13:11 +0000 (03:13 +0000)]
Don't create FBT probes for lock owner methods.
These functions may be called in DTrace probe context, so they cannot be
safely traced. Moreover, they are currently only used by DTrace, so their
corresponding FBT probes are not particularly useful.
Enji Cooper [Fri, 9 Dec 2016 23:42:04 +0000 (23:42 +0000)]
Make test_unmount usable in cleanup subroutines
- Duplicate test_unmount to _test_unmount
- Remove atf_check calls
- Call _test_unmount from test_unmount, checking the exit code
at the end, and returning it to maintain the test_unmount
"contract"
Warner Losh [Fri, 9 Dec 2016 23:37:11 +0000 (23:37 +0000)]
Permit loading of efirt module even when there's no EFI to call. The
module loading is successful, but attempts to use it will not be
successful. This is similar to what we do (did?) with ACPI on non-ACPI
systems. We succeed if we can't find the necessary information to hook
into EFI, but still fail if we're unable to allocate resources if we
do find EFI.
Maxim Sobolev [Fri, 9 Dec 2016 22:13:00 +0000 (22:13 +0000)]
Check that SCM_XXX timestamp returned by the kernel is less 1 second
away in the past from the current time. This should be plenty for the
scheduler to do its job. It provides assurance that the timestamp
returned is actually a valid one, not just some random garbage.
Gleb Smirnoff [Fri, 9 Dec 2016 18:07:28 +0000 (18:07 +0000)]
Treat R_X86_64_PLT32 relocs as R_X86_64_PC32.
If we load a binary that is designed to be a library, it produces
relocatable code via assembler directives in the assembly itself
(rather than compiler options). This emits R_X86_64_PLT32 relocations,
which are not handled by the kernel linker.
Gleb Smirnoff [Fri, 9 Dec 2016 17:58:34 +0000 (17:58 +0000)]
Provide counter_ratecheck(), a MP-friendly substitution to ppsratecheck().
When rated event happens at a very quick rate, the ppsratecheck() is not
only racy, but also becomes a performance bottleneck.
Prefix the Linux KPI's kmem_xxx() functions with linux_ to avoid
conflict with the opensolaris kernel module.
This patch solves a problem where the kernel linker will incorrectly
resolve opensolaris kmem_xxx() functions as linuxkpi ones, which leads
to a panic when these functions are used.
hyperv/storvsc: Fix the SCSI disk attachment issue.
On pre-WS2016 Hyper-V, if the only LUNs > 7 are used, then all disks
fails to attach. Mainly because those versions of Hyper-V do not set
SRB_STATUS properly and deliver junky INQUERY responses.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Reported by: Hongxiong Xian <v-hoxian microsoft com>
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8724
Rick Macklem [Thu, 8 Dec 2016 23:29:56 +0000 (23:29 +0000)]
Patch the nfsd so that it doesn't register with rpcbind for an NFSv4 only
server.
This patch uses the sysctl vfs.nfsd.server_min_nfsvers to determine
if/what versions of NFS service should be registered with rpcbind.
For NFSv4 only, it does not register at all, since NFSv4 always uses 2049
and does not require rpcbind.
For NFSv3 minimum, it registers NFSv3 but not NFSv2.
Dimitry Andric [Thu, 8 Dec 2016 21:02:34 +0000 (21:02 +0000)]
Pull in r281586 from upstream llvm trunk (by Wei Mi):
Add some shortcuts in LazyValueInfo to reduce compile time of
Correlated Value Propagation.
The patch is to partially fix PR10584. Correlated Value Propagation
queries LVI to check non-null for pointer params of each callsite. If
we know the def of param is an alloca instruction, we know it is
non-null and can return early from LVI. Similarly, CVP queries LVI to
check whether pointer for each mem access is constant. If the def of
the pointer is an alloca instruction, we know it is not a constant
pointer. These shortcuts can reduce the cost of CVP significantly.
Alexander Motin [Thu, 8 Dec 2016 15:58:03 +0000 (15:58 +0000)]
Fix spa_alloc_tree sorting by offset in r305331.
Original commit "7090 zfs should improve allocation order" declares alloc
queue sorted by time and offset. But in practice io_offset is always zero,
so sorting happened only by time, while order of writes with equal time was
completely random. On Illumos this did not affected much thanks to using
high resolution timestamps. On FreeBSD due to using much faster but low
resolution timestamps it caused bad data placement on disks, affecting
further read performance.
This change switches zio_timestamp_compare() from comparing uninitialized
io_offset to really populated io_bookmark values. I haven't decided yet
what to do with timestampts, but on simple tests this change gives the
same peformance results by just making code to work as declared.
Use the populate() driver paging method for i915 driver.
In particular, the fault access type is accounted for when the
aperture page is moved to GTT domain. On the other hand, the current
pager structure is left intact, most important, only one page is
instantiated per populate call.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Implement the populate() pager method for phys pager.
It allows to provide configurable agressive prefaulting and useful
hints to page daemon about memory allocations, on faults for pages
managed by phys pager. In fact, this implementation is superior to
the MAP_SHARED_PHYS hack from my Postgresql paper, while giving
similar benefits of reducing the page faults numbers on SysV shared
memory mappings.
Reviewed by: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks