Justin Hibbits [Wed, 30 May 2018 02:41:47 +0000 (02:41 +0000)]
Restrict PCIe maxslots to 0, instead of PCI_SLOTMAX
Summary:
PCIe only permits 1 device on an endpoint, so some devices ignore the device
part of B:D:F probing. Although ARI likely fixes this, not all platforms
support ARI completely or correctly, so some devices end up showing up 32
times on the bus.
This was found during bringup of POWER9/Talos, and has been tested on POWER9
and POWER8 hardware.
Alan Somers [Tue, 29 May 2018 23:08:33 +0000 (23:08 +0000)]
Add initial set of tests for audit(4)
This change includes the framework for testing the auditability of various
syscalls, and includes changes for the first 12. The tests will start
auditd(8) if needed, though they'll be much faster if it's already running.
The syscalls tested in this commit include mkdir(2), mkdirat(2), mknod(2),
mknodat(2), mkfifo(2), mkfifoat(2), link(2), linkat(2), symlink(2),
symlinkat(2), rename(2), and renameat(2).
Devin Teske [Tue, 29 May 2018 22:36:37 +0000 (22:36 +0000)]
dwatch(1): Fix "-t test" for post-processing profiles
Profiles that perform post-processing of the DTrace output were
dropping the "-t test" option on the floor. Fix handling of this
option for said profiles.
X-MFC-to: stable/11
X-MFC-with: r334261-334262
Sponsored by: Smule, Inc.
Stephen Hurd [Tue, 29 May 2018 21:56:39 +0000 (21:56 +0000)]
iflib: mark irq allocation name parameter as constant
The *name parameter passed to iflib_irq_alloc_generic and
iflib_softirq_alloc_generic is never modified. Many places in code pass
string literals and thus should not be modified.
Mark the *name parameter as a const char * instead, so that we enforce
that the name is not modified before passing to bus_describe_intr()
Matt Macy [Tue, 29 May 2018 20:28:34 +0000 (20:28 +0000)]
pmc: Add new sub-command structured "pmc" utility
This will manage pmc functionality with a more
manageable structure of subcommands rather than the
gradually accreted spaghetti logic of overlapping flags
that exists in pmcstat.
This is intended to ultimately have all the same functionality
as pmcannotate+pmccontrol+pmcstat. Currently it just has
"stat" and "system-stat" - counters for the process itself and counters
for the system as a whole respectively (i.e. system-stat includes kernel
threads). Note that the rusage results (page faults/context switches/
user/sys) for stat-system will not account for the system as a whole -
only for the child process specified on the command line.
Implementing stat was suggested by mjg@ and the output is based on that
from Linux's "perf stat".
% pmc stat -- make -j32 buildkernel -DNO_MODULES -ss > /dev/null 9598393 page faults # 0.674 M/sec
387085 voluntary csw # 0.027 M/sec
106989 involuntary csw # 0.008 M/sec 2763965982317 cycles 2542953049760 instructions # 0.920 inst/cycle 511562750157 branches 12917006881 branch-misses # 2.525% 17944429878 cache-references # 0.007 refs/inst 2205119560 cache-misses # 12.289%
43.74 real # 2019.72% cpu
795.09 user # 1817.72% cpu
88.35 sys # 202.00% cpu
Andrew Turner [Tue, 29 May 2018 17:44:40 +0000 (17:44 +0000)]
Increase the number of fdt memory regions we support to 16. Some SoCs have
many excluded regions causing a buffer overflow in the early boot code if
this value is too small.
Obtained from: ABT Systems Ltd
Sponsored by: Turing Robotic Industries
Andriy Gapon [Tue, 29 May 2018 16:16:24 +0000 (16:16 +0000)]
add support for console resuming, implement it for uart, use on x86
This change adds a new optional console method cn_resume and a kernel
console interface cnresume. Consoles that may need to re-initialize
their hardware after suspend (e.g., because firmware does not care to do
it) will implement cn_resume. Note that it is called in rather early
environment not unlike early boot, so the same restrictions apply.
Platform specific code, for platforms that support hardware suspend,
should call cnresume early after resume, before any console output is
expected.
This change fixes a problem with a system of mine failing to resume when
a serial console is used. I found that the serial port was in a strange
configuration and an attempt to write to it likely resulted in an
infinite loop.
To avoid adding cn_resume method to every console driver, CONSOLE_DRIVER
macro has been extended to support optional methods.
Ed Maste [Tue, 29 May 2018 15:06:13 +0000 (15:06 +0000)]
switch amd64 memstick installer images to MBR
A good number of BIOSes have trouble booting from GPT in non-UEFI mode.
This is commonly reported with Lenovo desktops and laptops (including
X220, X230, T430, and E31) and Dell systems. Although UEFI is the
preferred amd64 boot method on recent hardware, older hardware does not
support UEFI, a user may wish to boot via BIOS/CSM, and some systems
that support UEFI fail to boot FreeBSD via UEFI (such as an old
AMD FX-6100 that I have).
With this change amd64 memsticks remain dual-mode (booting from either
UEFI or CSM); the partitioning type is just switched from GPT to MBR.
The "vestigial swap partition" in the GPT scheme was added in r265017 to
work around some issue with loader's GPT support, so we should not need
it when using MBR.
There is some concern that future UEFI systems may not boot from MBR,
but I am not aware of any today. In any case the likely path forward
for our installers is to migrate to CD/USB combo images, and if it
becomes necessary introduce a separate memstick specifically for the
MBR BIOS/CSM case.
PR: 227954
Reviewed by: gjb, imp, tsoome
MFC after: 3 days
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D15599
Add support for hardware rate limiting to mlx5en(4).
The hardware rate limiting feature is enabled by the RATELIMIT kernel
option. Please refer to ifconfig(8) and the txrtlmt option and the
SO_MAX_PACING_RATE set socket option for more information. This
feature is compatible with hardware transmit send offload, TSO.
A set of sysctl(8) knobs under dev.mce.<N>.rate_limit are provided to
setup the ratelimit table and also to fine tune various rate limit
related parameters.
Andrew Turner [Tue, 29 May 2018 13:52:25 +0000 (13:52 +0000)]
On ThunderX2 we need to be careful to only map the memory the firmware
lists in the EFI memory map. As such we need to reduce the mappings to
restrict them to not be the full 1G block. For now reduce this to a 2M
block, however this may be further restricted to be 4k page aligned as
other SoCs may require.
This allows ThunderX2 to boot reliably to userspace without performing
any speculative memory accesses to invalid physical memory.
This is a recommit of r334035 now that we can access the EFI Runtime data
through the DMAP region.
Fix an inverted conditional in the netrc code, which would ignore the
value of $HOME and always use the home directory from the passwd
database, unless $HOME was unset, in which case it would use (null).
While there, clean up handling of netrcfd and add debugging aids.
Warner Losh [Tue, 29 May 2018 03:58:29 +0000 (03:58 +0000)]
Teach ufs_module.c about bsd labels and probe 'a' partition.
If the check for a UFS partition at offset 0 on the disk fails, check
to see if there's a BSD disklabel at block 1 (standard) or at offset
512 (install images assume 512 sector size). If found, probe for UFS
on the 'a' partition.
This fixes UEFI booting images from a BSD labeled MBR slice when the
'a' partiton isn't at offset 0. This is a stop-gap fix since we plan
on removing boot1.efi in FreeBSD 12. We can't easily do that for 11.2,
however, hence the short MFC window.
Eric van Gyzen [Tue, 29 May 2018 02:41:32 +0000 (02:41 +0000)]
Bump the date on man pages in r334306
It seems a shame to ruin the patina of the June 4, 1993 date
on abort.3, especially since it still matched the date of
the SCCS ID, but those are the rules.
Marcelo Araujo [Tue, 29 May 2018 01:46:00 +0000 (01:46 +0000)]
Simplify macros EFPRINTF and EFFLUSH. [0]
Also stdarg(3) says that each invocation of va_start() must be paired
with a corresponding invocation of va_end() in the same function. [1]
Matt Macy [Tue, 29 May 2018 00:53:53 +0000 (00:53 +0000)]
route: fix missed ref adds
- ensure that we bump the ifa ref whenever we add a reference
- defer freeing epoch protected references until after the if_purgaddrs
loop
Alan Somers [Mon, 28 May 2018 20:47:39 +0000 (20:47 +0000)]
Fix "Bad tailq" panic when auditing auditon(A_SETCLASS, ...)
Due to an oversight in r195280, auditon(A_SETCLASS, ...) would cause a tailq
element to get added to the tailq twice, resulting in a circular tailq. This
panics when INVARIANTS are on.
Ed Maste [Mon, 28 May 2018 20:06:40 +0000 (20:06 +0000)]
if_muge: Add GMII enable (vs RGMII) bit
The GMII control bit ETH_MAC_CR_GMII_EN_ is not documented in
LAN78xx datasheets, but from the permissively licensed header provided
by Microchip it is:
Change the default USB template from the current 0 to -1. The reason
is that current one (mass storage device) doesn't work as it is - it
needs to be set to 0 after the LUN is configured, which is what the
cfumass rc script does. In other words: the current default does not
work, and to actually make it work it had to be set to -1 in
/boot/loader.conf.
Reviewed by: hselasky@
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Andrew Turner [Mon, 28 May 2018 17:09:29 +0000 (17:09 +0000)]
Create a new function to walk the EFI memory table & run a callback for
each entry. We can then use this to ensure the RunTime data is mapped in
the DMAP, but not in phys_avail.
Niclas Zeising [Mon, 28 May 2018 17:08:37 +0000 (17:08 +0000)]
Complete removal of lmc(4)
The lmc(4) driver was removed in r333144 and relevant files added to
ObsoleteFiles.inc, however, include/sys/dev/lmc was not removed from mtree
and is recreated on every install. Remove it from mtree.
Alan Cox [Mon, 28 May 2018 16:23:39 +0000 (16:23 +0000)]
Addendum to r334233. In vm_fault_populate(), since the page lock is held,
we must use vm_page_xunbusy_maybelocked() rather than vm_page_xunbusy() to
unbusy the page.
Alan Cox [Mon, 28 May 2018 04:38:10 +0000 (04:38 +0000)]
Eliminate duplicate assertions. We assert at the start of vm_fault_hold()
that the map entry is wired if the caller passes the flag VM_FAULT_WIRE.
Eliminate the same assertion, but spelled differently, at the end of
vm_fault_hold() and vm_fault_populate(). Repeat the assertion only if the
map is unlocked and the map lookup must be repeated.
Reviewed by: kib
MFC after: 10 days
Differential Revision: https://reviews.freebsd.org/D15582
Justin Hibbits [Sun, 27 May 2018 20:24:24 +0000 (20:24 +0000)]
Stop idle threads on power9 in the idle task until an interrupt.
This reduces the CPU cycle wastage on power9, which is SMT4. Any idle
thread that's spinning is simply starving working threads on the same core
of valuable resources.
This can be reduced further by taking more advantage of the PSSCR supported
states, as well as permitting state loss, as is currently done for power8.
The currently implemented stop state is the lowest latency, which may still
consume resources.