mmacy [Tue, 29 May 2018 20:28:34 +0000 (20:28 +0000)]
pmc: Add new sub-command structured "pmc" utility
This will manage pmc functionality with a more
manageable structure of subcommands rather than the
gradually accreted spaghetti logic of overlapping flags
that exists in pmcstat.
This is intended to ultimately have all the same functionality
as pmcannotate+pmccontrol+pmcstat. Currently it just has
"stat" and "system-stat" - counters for the process itself and counters
for the system as a whole respectively (i.e. system-stat includes kernel
threads). Note that the rusage results (page faults/context switches/
user/sys) for system-stat will not account for the system as a whole -
only for the child process specified on the command line.
Implementing stat was suggested by mjg@ and the output is based on that
from Linux's "perf stat".
% pmc stat -- make -j32 buildkernel -DNO_MODULES -ss > /dev/null
9598393 page faults # 0.674 M/sec
387085 voluntary csw # 0.027 M/sec
106989 involuntary csw # 0.008 M/sec
2763965982317 cycles
2542953049760 instructions # 0.920 inst/cycle
511562750157 branches
12917006881 branch-misses # 2.525%
17944429878 cache-references # 0.007 refs/inst
2205119560 cache-misses # 12.289%
43.74 real # 2019.72% cpu
795.09 user # 1817.72% cpu
88.35 sys # 202.00% cpu
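The derived figures after each "#" in the sample output are plain ratios of the raw counters. A minimal sketch of that arithmetic follows; the helper names counter_ratio and counter_percent are hypothetical and are not taken from the pmc utility itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: ratio of two raw counter values,
 * e.g. instructions / cycles. The caller must pass a non-zero denominator.
 */
static double
counter_ratio(uint64_t num, uint64_t den)
{
	assert(den != 0);
	return ((double)num / (double)den);
}

/*
 * Hypothetical helper: the same ratio as a percentage,
 * e.g. branch-misses / branches.
 */
static double
counter_percent(uint64_t num, uint64_t den)
{
	return (100.0 * counter_ratio(num, den));
}
```

Applied to the numbers above, instructions over cycles rounds to 0.920, branch-misses over branches to 2.525%, cache-references over instructions to 0.007, and cache-misses over cache-references to 12.289%.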
andrew [Tue, 29 May 2018 17:44:40 +0000 (17:44 +0000)]
Increase the number of fdt memory regions we support to 16. Some SoCs have
many excluded regions causing a buffer overflow in the early boot code if
this value is too small.
Obtained from: ABT Systems Ltd
Sponsored by: Turing Robotic Industries
avg [Tue, 29 May 2018 16:16:24 +0000 (16:16 +0000)]
add support for console resuming, implement it for uart, use on x86
This change adds a new optional console method cn_resume and a kernel
console interface cnresume. Consoles that may need to re-initialize
their hardware after suspend (e.g., because firmware does not care to do
it) will implement cn_resume. Note that it is called in a rather early
environment not unlike early boot, so the same restrictions apply.
Platform specific code, for platforms that support hardware suspend,
should call cnresume early after resume, before any console output is
expected.
This change fixes a problem with a system of mine failing to resume when
a serial console is used. I found that the serial port was in a strange
configuration and an attempt to write to it likely resulted in an
infinite loop.
To avoid adding a cn_resume method to every console driver, the
CONSOLE_DRIVER macro has been extended to support optional methods.
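The idea of an optional method can be sketched as a method table in which the resume hook may be left NULL and the dispatcher skips consoles that do not provide it. This is only an illustration under assumed names: the example_* types and functions below are hypothetical and do not reproduce the kernel's consdev structures or the CONSOLE_DRIVER macro.

```c
#include <assert.h>
#include <stddef.h>

struct example_consdev;

/* Hypothetical method table; cn_resume is optional and may be NULL. */
struct example_consdev_ops {
	void (*cn_resume)(struct example_consdev *);
};

struct example_consdev {
	const struct example_consdev_ops *ops;
	int resumed;	/* counts resume calls, for illustration only */
};

/* Call the resume hook of every console that implements one. */
static void
example_cnresume(struct example_consdev **list, size_t n)
{
	size_t i;

	assert(list != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (list[i]->ops->cn_resume != NULL)
			list[i]->ops->cn_resume(list[i]);
	}
}

/* Example hook a uart-like driver might provide. */
static void
example_uart_resume(struct example_consdev *cd)
{
	cd->resumed++;
}
```

Drivers that have nothing to re-initialize simply leave the pointer NULL, so they need no changes.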
emaste [Tue, 29 May 2018 15:06:13 +0000 (15:06 +0000)]
switch amd64 memstick installer images to MBR
A good number of BIOSes have trouble booting from GPT in non-UEFI mode.
This is commonly reported with Lenovo desktops and laptops (including
X220, X230, T430, and E31) and Dell systems. Although UEFI is the
preferred amd64 boot method on recent hardware, older hardware does not
support UEFI, a user may wish to boot via BIOS/CSM, and some systems
that support UEFI fail to boot FreeBSD via UEFI (such as an old
AMD FX-6100 that I have).
With this change amd64 memsticks remain dual-mode (booting from either
UEFI or CSM); the partitioning type is just switched from GPT to MBR.
The "vestigial swap partition" in the GPT scheme was added in r265017 to
work around some issue with loader's GPT support, so we should not need
it when using MBR.
There is some concern that future UEFI systems may not boot from MBR,
but I am not aware of any today. In any case the likely path forward
for our installers is to migrate to CD/USB combo images, and if it
becomes necessary introduce a separate memstick specifically for the
MBR BIOS/CSM case.
PR: 227954
Reviewed by: gjb, imp, tsoome
MFC after: 3 days
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D15599
hselasky [Tue, 29 May 2018 14:04:57 +0000 (14:04 +0000)]
Add support for hardware rate limiting to mlx5en(4).
The hardware rate limiting feature is enabled by the RATELIMIT kernel
option. Please refer to the txrtlmt option in ifconfig(8) and the
SO_MAX_PACING_RATE socket option for more information. This
feature is compatible with TCP segmentation offload (TSO) in hardware.
A set of sysctl(8) knobs under dev.mce.<N>.rate_limit is provided to
set up the ratelimit table and to fine-tune various rate limit
related parameters.
andrew [Tue, 29 May 2018 13:52:25 +0000 (13:52 +0000)]
On ThunderX2 we need to be careful to only map the memory the firmware
lists in the EFI memory map. As such we need to reduce the mappings to
restrict them to not be the full 1G block. For now reduce this to a 2M
block, however this may be further restricted to be 4k page aligned as
other SoCs may require.
This allows ThunderX2 to boot reliably to userspace without performing
any speculative memory accesses to invalid physical memory.
This is a recommit of r334035 now that we can access the EFI Runtime data
through the DMAP region.
des [Tue, 29 May 2018 13:07:36 +0000 (13:07 +0000)]
Fix an inverted conditional in the netrc code, which would ignore the
value of $HOME and always use the home directory from the passwd
database, unless $HOME was unset, in which case it would use (null).
While there, clean up handling of netrcfd and add debugging aids.
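The intended precedence can be sketched as follows: use $HOME when it is set, and consult the passwd database only otherwise. The function names are hypothetical, and treating an empty $HOME as unset is an assumption of this sketch, not something the commit states.

```c
#include <assert.h>
#include <pwd.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical pure helper: choose between the value of $HOME and the
 * passwd home directory. Either argument may be NULL; returns NULL if
 * neither is usable. An empty string is treated as unset (assumption).
 */
static const char *
pick_home(const char *env_home, const char *pw_dir)
{
	if (env_home != NULL && *env_home != '\0')
		return (env_home);	/* $HOME wins when it is set */
	if (pw_dir != NULL && *pw_dir != '\0')
		return (pw_dir);	/* fall back to the passwd entry */
	return (NULL);
}

/* Hypothetical wrapper that reads the two real sources. */
static const char *
netrc_home_dir(void)
{
	struct passwd *pw;

	pw = getpwuid(getuid());
	return (pick_home(getenv("HOME"), pw != NULL ? pw->pw_dir : NULL));
}
```

Inverting the first test reproduces the symptom described above: the passwd entry is used whenever $HOME is set, and a NULL pointer is used when it is not.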
imp [Tue, 29 May 2018 03:58:29 +0000 (03:58 +0000)]
Teach ufs_module.c about bsd labels and probe 'a' partition.
If the check for a UFS partition at offset 0 on the disk fails, check
to see if there's a BSD disklabel at block 1 (standard) or at offset
512 (install images assume 512 sector size). If found, probe for UFS
on the 'a' partition.
This fixes UEFI booting of images from a BSD labeled MBR slice when the
'a' partition isn't at offset 0. This is a stop-gap fix since we plan
on removing boot1.efi in FreeBSD 12. We can't easily do that for 11.2,
however, hence the short MFC window.
vangyzen [Tue, 29 May 2018 02:41:32 +0000 (02:41 +0000)]
Bump the date on man pages in r334306
It seems a shame to ruin the patina of the June 4, 1993 date
on abort.3, especially since it still matched the date of
the SCCS ID, but those are the rules.
araujo [Tue, 29 May 2018 01:46:00 +0000 (01:46 +0000)]
Simplify macros EFPRINTF and EFFLUSH. [0]
Also stdarg(3) says that each invocation of va_start() must be paired
with a corresponding invocation of va_end() in the same function. [1]
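The pairing rule can be illustrated with a small printf-style helper. The name format_msg is hypothetical and is not the EFPRINTF macro from the commit; the sketch only shows va_start() and va_end() bracketing the use of the argument list within one function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a message into buf and return the value
 * reported by vsnprintf().
 */
static int
format_msg(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && fmt != NULL);
	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);	/* matches the va_start() above, in the same function */
	return (n);
}
```

Keeping both calls in one function, rather than hiding one of them inside a macro used elsewhere, makes the pairing easy to check by eye.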
mmacy [Tue, 29 May 2018 00:53:53 +0000 (00:53 +0000)]
route: fix missed ref adds
- ensure that we bump the ifa ref whenever we add a reference
- defer freeing epoch protected references until after the if_purgeaddrs
loop
asomers [Mon, 28 May 2018 20:47:39 +0000 (20:47 +0000)]
Fix "Bad tailq" panic when auditing auditon(A_SETCLASS, ...)
Due to an oversight in r195280, auditon(A_SETCLASS, ...) would cause a tailq
element to get added to the tailq twice, resulting in a circular tailq. This
panics when INVARIANTS are on.
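Why a double insert is harmful, and one way to guard against it, can be sketched with the queue(3) TAILQ macros. This is not the actual fix from the commit: the class_ent structure and the class_* functions are hypothetical, and the sketch merely updates an existing element in place instead of linking it a second time.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical event-to-class mapping entry. */
struct class_ent {
	int event;
	unsigned int class_mask;
	TAILQ_ENTRY(class_ent) link;
};
TAILQ_HEAD(class_list, class_ent);

/* Return the entry for event, or NULL if there is none. */
static struct class_ent *
class_lookup(struct class_list *head, int event)
{
	struct class_ent *e;

	assert(head != NULL);
	TAILQ_FOREACH(e, head, link) {
		if (e->event == event)
			return (e);
	}
	return (NULL);
}

/* Set the mask for event; insert a new element only if none exists. */
static int
class_set(struct class_list *head, int event, unsigned int mask)
{
	struct class_ent *e;

	e = class_lookup(head, event);
	if (e != NULL) {
		e->class_mask = mask;	/* already linked: update in place */
		return (0);
	}
	e = calloc(1, sizeof(*e));
	if (e == NULL)
		return (-1);
	e->event = event;
	e->class_mask = mask;
	TAILQ_INSERT_HEAD(head, e, link);
	return (0);
}

/* Count the elements; this would never terminate on a circular tailq. */
static int
class_count(struct class_list *head)
{
	struct class_ent *e;
	int n = 0;

	TAILQ_FOREACH(e, head, link)
		n++;
	return (n);
}
```

Inserting an element that is already linked overwrites its link pointers, which is how a circular list of the kind described above can arise.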
emaste [Mon, 28 May 2018 20:06:40 +0000 (20:06 +0000)]
if_muge: Add GMII enable (vs RGMII) bit
The GMII control bit ETH_MAC_CR_GMII_EN_ is not documented in
LAN78xx datasheets, but from the permissively licensed header provided
by Microchip it is:
trasz [Mon, 28 May 2018 18:34:16 +0000 (18:34 +0000)]
Change the default USB template from the current 0 to -1. The reason
is that the current one (mass storage device) doesn't work as it is - it
needs to be set to 0 after the LUN is configured, which is what the
cfumass rc script does. In other words: the current default does not
work, and to actually make it work it had to be set to -1 in
/boot/loader.conf.
Reviewed by: hselasky@
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
andrew [Mon, 28 May 2018 17:09:29 +0000 (17:09 +0000)]
Create a new function to walk the EFI memory table & run a callback for
each entry. We can then use this to ensure the RunTime data is mapped in
the DMAP, but not in phys_avail.
zeising [Mon, 28 May 2018 17:08:37 +0000 (17:08 +0000)]
Complete removal of lmc(4)
The lmc(4) driver was removed in r333144 and relevant files added to
ObsoleteFiles.inc; however, include/sys/dev/lmc was not removed from mtree
and is recreated on every install. Remove it from mtree.
alc [Mon, 28 May 2018 16:23:39 +0000 (16:23 +0000)]
Addendum to r334233. In vm_fault_populate(), since the page lock is held,
we must use vm_page_xunbusy_maybelocked() rather than vm_page_xunbusy() to
unbusy the page.
alc [Mon, 28 May 2018 04:38:10 +0000 (04:38 +0000)]
Eliminate duplicate assertions. We assert at the start of vm_fault_hold()
that the map entry is wired if the caller passes the flag VM_FAULT_WIRE.
Eliminate the same assertion, but spelled differently, at the end of
vm_fault_hold() and vm_fault_populate(). Repeat the assertion only if the
map is unlocked and the map lookup must be repeated.
Reviewed by: kib
MFC after: 10 days
Differential Revision: https://reviews.freebsd.org/D15582
jhibbits [Sun, 27 May 2018 20:24:24 +0000 (20:24 +0000)]
Stop idle threads on power9 in the idle task until an interrupt.
This reduces the CPU cycle wastage on power9, which is SMT4. Any idle
thread that's spinning is simply starving working threads on the same core
of valuable resources.
This can be reduced further by taking more advantage of the PSSCR supported
states, as well as permitting state loss, as is currently done for power8.
The currently implemented stop state is the lowest latency, which may still
consume resources.
rmacklem [Sat, 26 May 2018 23:02:15 +0000 (23:02 +0000)]
Fix the sleep event for layout recall.
The sleep for I/O completion during an NFSv4.1 pNFS layout recall used
the wrong event value and could leave the "[nfscl]" thread for the
mount hung.
This patch fixes the event to be the correct one.
This bug will only affect NFSv4.1 pnfs mounts and only when the server
does a layout recall callback, so it won't affect many. Without the patch,
a mount without the "pnfs" option will avoid the problem.
Found during testing of the pNFS server.
mmacy [Sat, 26 May 2018 19:29:19 +0000 (19:29 +0000)]
pmc(3)/hwpmc(4): update supported Intel processors to rely fully on the
vendor provided pmu-events tables and sundry cleanups.
The vendor pmu-events tables provide counter descriptions, default
sample rates, event, umask, and flag values for all the counter
configuration permutations. Using this gives us:
- much simpler kernel code for the MD component
- helpful long and short event descriptions
- simpler user code
- sample rates that won't overload the system
Update man page with newer sample types and remove unused sample type.
mmacy [Sat, 26 May 2018 18:12:50 +0000 (18:12 +0000)]
pmc(3)/hwpmc(4): update supported Intel processors to rely fully on the
vendor provided pmu-events tables and sundry cleanups.
The vendor pmu-events tables provide counter descriptions, default
sample rates, event, umask, and flag values for all the counter
configuration permutations. Using this gives us:
- much simpler kernel code for the MD component
- helpful long and short event descriptions
- simpler user code
- sample rates that won't overload the system
Update man page with newer sample types and remove unused sample type.
pmclog: update log record types, bump PMC_MAJOR
- explicitly make log record types a multiple of 8 bytes
- hook in pmu event types for pmc_allocate records
- remove references to the no longer used PCSAMPLE record
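Making a record size an explicit multiple of 8 bytes can be sketched with explicit padding plus a compile-time check. The structure below is hypothetical and does not reproduce any real pmclog record layout; it only shows the technique.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical log record; field names are for illustration only. */
struct example_log_rec {
	uint32_t header;
	uint32_t pmcid;
	uint64_t value;
	uint32_t flags;
	uint32_t pad;	/* explicit padding instead of relying on the compiler */
};

/* Fail the build if the record stops being a multiple of 8 bytes. */
_Static_assert(sizeof(struct example_log_rec) % 8 == 0,
    "log record size must be a multiple of 8 bytes");
```

Spelling out the padding keeps the on-disk layout the same regardless of how a given compiler would pad the structure implicitly.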
vangyzen [Sat, 26 May 2018 14:14:56 +0000 (14:14 +0000)]
if_hn: fix use of uninitialized variable
omcast was used without being initialized in the non-multicast case.
The only effect was that the interface's multicast output counter could be
incorrect.