Matt Macy [Thu, 31 May 2018 21:53:07 +0000 (21:53 +0000)]
Reduce overhead of entropy collection
- move harvest mask check inline
- move harvest mask to frequently_read out of actively
modified cache line
- disable ether_input collection and describe its limitations
in NOTES
Typically entropy collection in ether_input was stirring zero
in to the entropy pool while at the same time greatly reducing
max pps. This indicates that perhaps we should more closely
scrutinize how much entropy we're getting from a given source
as well as what our actual entropy collection needs are for
seeding Yarrow.
Dimitry Andric [Thu, 31 May 2018 20:22:47 +0000 (20:22 +0000)]
Resolve conflicts between macros in fenv.h and ieeefp.h
This is a follow-up to r321483, which disabled -Wmacro-redefined for
some lib/msun tests.
If an application included both fenv.h and ieeefp.h, several macros such
as __fldcw(), __fldenv() were defined in both headers, with slightly
different arguments, leading to conflicts.
Fix this by putting all the common macros in the machine-specific
versions of ieeefp.h. Where needed, update the arguments in places
where the macros are invoked.
This also slightly reduces the differences between the amd64 and i386
versions of ieeefp.h.
Conrad Meyer [Thu, 31 May 2018 19:36:24 +0000 (19:36 +0000)]
dhclient(8): allow to supersede interface-mtu option
In some cases broken DHCP servers might send invalid MTU value, so allow to
use 'supersede' in dhclient.conf to override this. When superseded value is
0, MTU value is not updated at all.
PR: 206721
Submitted by: novel@
Reported by: <jimp AT pfsense.org>
MFC after: 37 minutes (if you care about 11, please MFC to 11.2)
Relnotes: yes (potentially surprising behavior change w/ broken dhcpd mtu)
Differential Revision: https://reviews.freebsd.org/D15484
Emmanuel Vadot [Thu, 31 May 2018 15:39:39 +0000 (15:39 +0000)]
aw_mmc: Rework DMA
- Calculate the number of segments based on the page size
- Add some comments on dma function so it's easier to read
- Only enable interrupts on the last dma segment
- If the segments size is the max transfer size, use the special size 0
for the controller.
- The max_data ivars is in block so calculate it properly.
Dimitry Andric [Thu, 31 May 2018 14:38:13 +0000 (14:38 +0000)]
Fix build of stand with base gcc
* Make autoboot() a static function in stand/common/boot.c, so it does
not shadow local variables in gptboot.c and zfsboot.c.
* Remove -Winline from the Makefiles for gptboot, gptzfsboot and
zfsboot, as gcc will always fail to inline some functions, and there
is nothing we can do about it.
* For gcc <= 4.2.1, silence -Wuninitialized for isoboot, as it produces
a false positive warning.
* Remove deprecated and unnecessary -mcpu=i386 flag from stand/defs.mk,
as there is already a -march=i386 flag further in the file.
Reviewed by: imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D15628
Warner Losh [Thu, 31 May 2018 14:23:33 +0000 (14:23 +0000)]
Depart from normal man page proactice a little and provide guidance on
when to use assert, as well as providing a bad example of using
assert. While not strictly necessary, experience has shown issues
with poor assert choice happen often enough that this departure seems
warranted. Also, tighten up the previous example (there's no need
to have extra paragraphs or gratuitously long lines).
Mateusz Guzik [Thu, 31 May 2018 09:56:02 +0000 (09:56 +0000)]
amd64: switch pagecopy from non-temporal stores to rep movsq
The copied data is accessed in part soon after and it results with additional
cache misses during a -j 1 buildkernel WITHOUT_CTF=yes KERNFAST=1, as measured
with pmc stat.
before: 256165411 cache-references # 0.003 refs/inst 15105408 cache-misses # 5.897%
20.70 real # 99.67% cpu
13.24 user # 63.94% cpu
7.40 sys # 35.73% cpu
after: 256764469 cache-references # 0.003 refs/inst 11913551 cache-misses # 4.640%
20.70 real # 99.67% cpu
13.19 user # 63.73% cpu
7.44 sys # 35.95% cpu
Note the real time did not change, but traffic to RAM was reduced (multiple
measurements performed with switching the implementation at runtime).
Since nobody else is using non-temporal for this and there is no apparent
benefit at least these days, don't use them either.
Side note is that pagecopy arguments should probably get reversed to not
have to flip them around in the primitive.
- Restore local change to include <net/bpf.h> inside pcap.h.
This fixes ports build problems.
- Update local copy of dlt.h with new DLT types.
- Revert no longer needed <net/bpf.h> includes which were added
as part of r334277.
Warner Losh [Thu, 31 May 2018 02:57:58 +0000 (02:57 +0000)]
Make the data returned by devinfo harder to overflow.
Rather than using fixed-length strings, pack them into a string table
to return. Also expand the buffer from ~300 charaters to 3k. This should
be enough, even for USB.
This fixes a problem where USB pnp info is truncated on return to
userland.
Warner Losh [Thu, 31 May 2018 02:54:11 +0000 (02:54 +0000)]
Pass a struct devdesc to the format commands. Use proper type rather
than doing weird type-punning that happened to work because the size
was right. We copied a zfs devdesc into a disk_devdesc and then after
passing through a NULL pointer reinterpreted it as a
zfs_devdesc. Instead, pass the base devdesc around and 'upcase' when
we know the types are right.
This has the happy side effect of fixing a gcc warning about bad
type punning.
Navdeep Parhar [Wed, 30 May 2018 22:36:09 +0000 (22:36 +0000)]
cxgbe(4): Consider all supported speeds when building the ifmedia list
for a port. Fix other related issues while here:
- Require port lock for access to link_config.
- Allow 100Mbps operation by tracking the speed in Mbps. Yes, really.
- New port flag to indicate that the media list is immutable. It will
be used in future refinements.
This also fixes a bug where the driver reports incorrect media with
recent firmwares.
MFC after: 2 days
Sponsored by: Chelsio Communications
Alan Somers [Wed, 30 May 2018 21:50:23 +0000 (21:50 +0000)]
#include <bsm/audit.h> in security/audit/audit_ioctl.h
security/audit/audit_ioctl.h uses a type from bsm/audit.h, so needs to
include it. And it needs to know the type's size, so it can't just
forward-declare.
Use pmap_pte_ufast() instead of pmap_pte() in pmap_extract(),
pmap_is_prefaultable() and pmap_incore(), pushing the number of
shootdown IPIs back to the 3/1 kernel.
Benchmarked by: bde
Tested by: pho
Sponsored by: The FreeBSD Foundation
Create yet another temporal pte mapping routine pmap_pte_quick3(),
which is the copy of the pmap_pte_quick() and relies on the
pvh_global_lock to protect the frame. It accounts into the same
counters as pmap_pte_quick(). It is needed since pmap_copy() uses
pmap_pte_quick() already, and since a user pmap is no longer current
pmap.
pmap_copy() still provides the advantage for real-world workloads
involving lot of forks where processes do not exec immediately.
Benchmarked by: bde
Sponsored by: The FreeBSD Foundation
Rick Macklem [Wed, 30 May 2018 20:16:17 +0000 (20:16 +0000)]
Strengthen locking for the NFSv4.1 server DestroySession operation.
If a client did a DestroySession on a session while it was still in use,
the server might try to use the session structure after it is free'd.
I think the client has violated RFC5661 if it does this, but this patch
makes DestroySession block all other nfsd threads so no thread could
be using the session when it is free'd. After the DestroySession, nfsd
threads will not be able to find the session. The patch also adds a check
for nd_sessionid being set, although if that was not the case it would have
been all 0s and unlikely to have a false match.
This might fix the crashes described in PR#228497 for the FreeNAS server.
Ed Maste [Wed, 30 May 2018 18:04:25 +0000 (18:04 +0000)]
Enable lld as the system linker by default on amd64
The migration to LLVM's lld linker has been in progress for quite some
time - about three years ago I opened an upstream LLVM meta-bug to track
issues using lld as FreeBSD's linker, and about 1.5 years ago requested
the first exp-run with lld as the system linker.
As of r327783 we enabled LLD_BOOTSTRAP by default on amd64, using lld as
the linker to link the kernel and world, but GNU ld was still installed
as /usr/bin/ld.
The vast majority of issues observed when building ports with lld as the
system linker have now been solved, so set LLD_IS_LD by default on amd64
and install lld as /usr/bin/ld. A small number of port failures remain
and these will be addressed in the near future.
Thanks to antoine@ for handling the exp-runs, krion@ for investigating
many port failures and adding LLD_UNSAFE or other fixes or workarounds,
and everyone who helped investigate, fix or tag ports.
PR: 214864 (exp-run)
Sponsored by: The FreeBSD Foundation
Alan Somers [Wed, 30 May 2018 15:51:48 +0000 (15:51 +0000)]
Fix OpenBSM with GCC with -Wredundant-decls
Upstream change ed47534 consciously added some redundant functional
declarations, and I'm not sure why. AFAICT they were never required. On
FreeBSD, they break the build with GCC (but not Clang) for any program
including libbsm.h with WARNS=6.
Fix by cherry-picking upstream change
https://github.com/openbsm/openbsm/commit/0553c27
Andrew Turner [Wed, 30 May 2018 15:25:48 +0000 (15:25 +0000)]
Further limit when we call pmap_fault.
We should only call pmap_fault in the kernel when accessing a userspace
address. As this should always happen through specific functions that set
a fault handler we can use this to limit calls to pmap_fault to when this
is set.
This should help with NULL pointer dereferences when we are unable to sleep
so we fall into the correct case.
Warner Losh [Wed, 30 May 2018 15:08:59 +0000 (15:08 +0000)]
devinfo_init() returns an errno, but doesn't set errno, so the error
message when it fails reflects some random thing rather than what it
returned. Set errno to the return value.
Andrew Turner [Wed, 30 May 2018 14:18:19 +0000 (14:18 +0000)]
Push down the locking in pmap_fault to just be around the calls to
arm64_address_translate_*. There is no need to lock around the switch
statement as we only care about a few cases.
Ed Maste [Wed, 30 May 2018 13:51:00 +0000 (13:51 +0000)]
makeroot.sh: allow duplicate entries even with -f <filelist>
makefs disallows duplicate entries unless the -D option is specified.
Previously makeroot.sh enabled -D unless a filelist was provided via
the -f options. The filelist logic creates an mtree manifest from the
METALOG and the provided filelist by passing them through `sort -u`,
so duplicates were not expected. However, duplicates can still occur
when a directory appears in multiple packages -- for example,
Ed Maste [Wed, 30 May 2018 12:55:27 +0000 (12:55 +0000)]
link_elf_obj: correct an error message
Previously we'd report that a file has "no valid symbol table" if it in
fact had two or more. Change the message to report that there must be
exactly one.
Kristof Provost [Wed, 30 May 2018 07:11:33 +0000 (07:11 +0000)]
pf: Replace rwlock on PF_RULES_LOCK with rmlock
Given that PF_RULES_LOCK is a mostly read lock, replace the rwlock with rmlock.
This change improves packet processing rate in high pps environments.
Benchmarking by olivier@ shows a 65% improvement in pps.
While here, also eliminate all appearances of "sys/rwlock.h" includes since it
is not used anymore.
Nathan Whitehorn [Wed, 30 May 2018 04:15:33 +0000 (04:15 +0000)]
If linebytes property is missing from the graphics device, assume no
overscan and synthesize it from the display depth and screen width.
This may not be right, but it sometimes right and is better than
returning CN_DEAD.
Justin Hibbits [Wed, 30 May 2018 03:00:57 +0000 (03:00 +0000)]
Make opal_pci driver work with POWER9
Summary:
Coupled with r334365, this makes PCI work on POWER9. There is still more to
do to fully exploit the hardware capabilities, but this is sufficient to
enable USB and ethernet controllers on a POWER9 Talos II system.
Justin Hibbits [Wed, 30 May 2018 02:41:47 +0000 (02:41 +0000)]
Restrict PCIe maxslots to 0, instead of PCI_SLOTMAX
Summary:
PCIe only permits 1 device on an endpoint, so some devices ignore the device
part of B:D:F probing. Although ARI likely fixes this, not all platforms
support ARI completely or correctly, so some devices end up showing up 32
times on the bus.
This was found during bringup of POWER9/Talos, and has been tested on POWER9
and POWER8 hardware.
Alan Somers [Tue, 29 May 2018 23:08:33 +0000 (23:08 +0000)]
Add initial set of tests for audit(4)
This change includes the framework for testing the auditability of various
syscalls, and includes changes for the first 12. The tests will start
auditd(8) if needed, though they'll be much faster if it's already running.
The syscalls tested in this commit include mkdir(2), mkdirat(2), mknod(2),
mknodat(2), mkfifo(2), mkfifoat(2), link(2), linkat(2), symlink(2),
symlinkat(2), rename(2), and renameat(2).
Devin Teske [Tue, 29 May 2018 22:36:37 +0000 (22:36 +0000)]
dwatch(1): Fix "-t test" for post-processing profiles
Profiles that perform post-processing of the DTrace output were
dropping the "-t test" option on the floor. Fix handling of this
option for said profiles.
X-MFC-to: stable/11
X-MFC-with: r334261-334262
Sponsored by: Smule, Inc.
Stephen Hurd [Tue, 29 May 2018 21:56:39 +0000 (21:56 +0000)]
iflib: mark irq allocation name parameter as constant
The *name parameter passed to iflib_irq_alloc_generic and
iflib_softirq_alloc_generic is never modified. Many places in code pass
string literals and thus should not be modified.
Mark the *name parameter as a const char * instead, so that we enforce
that the name is not modified before passing to bus_describe_intr()
Matt Macy [Tue, 29 May 2018 20:28:34 +0000 (20:28 +0000)]
pmc: Add new sub-command structured "pmc" utility
This will manage pmc functionality with a more
manageable structure of subcommands rather than the
gradually accreted spaghetti logic of overlapping flags
that exists in pmcstat.
This is intended to ultimately have all the same functionality
as pmcannotate+pmccontrol+pmcstat. Currently it just has
"stat" and "system-stat" - counters for the process itself and counters
for the system as a whole respectively (i.e. system-stat includes kernel
threads). Note that the rusage results (page faults/context switches/
user/sys) for stat-system will not account for the system as a whole -
only for the child process specified on the command line.
Implementing stat was suggested by mjg@ and the output is based on that
from Linux's "perf stat".
% pmc stat -- make -j32 buildkernel -DNO_MODULES -ss > /dev/null 9598393 page faults # 0.674 M/sec
387085 voluntary csw # 0.027 M/sec
106989 involuntary csw # 0.008 M/sec 2763965982317 cycles 2542953049760 instructions # 0.920 inst/cycle 511562750157 branches 12917006881 branch-misses # 2.525% 17944429878 cache-references # 0.007 refs/inst 2205119560 cache-misses # 12.289%
43.74 real # 2019.72% cpu
795.09 user # 1817.72% cpu
88.35 sys # 202.00% cpu
Andrew Turner [Tue, 29 May 2018 17:44:40 +0000 (17:44 +0000)]
Increase the number of fdt memory regions we support to 16. Some SoCs have
many excluded regions causing a buffer overflow in the early boot code if
this value is too small.
Obtained from: ABT Systems Ltd
Sponsored by: Turing Robotic Industries
Andriy Gapon [Tue, 29 May 2018 16:16:24 +0000 (16:16 +0000)]
add support for console resuming, implement it for uart, use on x86
This change adds a new optional console method cn_resume and a kernel
console interface cnresume. Consoles that may need to re-initialize
their hardware after suspend (e.g., because firmware does not care to do
it) will implement cn_resume. Note that it is called in rather early
environment not unlike early boot, so the same restrictions apply.
Platform specific code, for platforms that support hardware suspend,
should call cnresume early after resume, before any console output is
expected.
This change fixes a problem with a system of mine failing to resume when
a serial console is used. I found that the serial port was in a strange
configuration and an attempt to write to it likely resulted in an
infinite loop.
To avoid adding cn_resume method to every console driver, CONSOLE_DRIVER
macro has been extended to support optional methods.