jhibbits [Sat, 19 May 2018 04:21:50 +0000 (04:21 +0000)]
Add hypervisor trap handling, using HSRR0/HSRR1
Summary:
Some hypervisor exceptions on POWER architecture only save state to HSRR0/HSRR1.
Until we have bhyve on POWER, use a lightweight exception frontend which copies
HSRR0/HSRR1 into SRR0/SRR1, and run the normal trap handler.
The first user of this is the Hypervisor Virtualization Interrupt, which targets
the XIVE interrupt controller on POWER9.
jhibbits [Sat, 19 May 2018 03:45:38 +0000 (03:45 +0000)]
Add yet another option for gathering available memory
On some POWER9 systems, 'reg' denotes the full memory in the system, while
'linux,usable-memory' denotes the usable memory. Some memory is reserved for
NVLink usage, so is partitioned off.
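The property choice described above can be modelled roughly as follows. This is a user-space sketch, not the kernel code: the dict layout and the `usable_memory_regions` helper are hypothetical, and only the property names 'reg' and 'linux,usable-memory' come from the commit message.

```python
def usable_memory_regions(node):
    """Return the (start, size) regions the system may actually use.

    `node` is a hypothetical dict mapping device-tree property names to
    lists of (start, size) tuples.
    """
    # Prefer the narrower property when firmware provides it; fall back
    # to 'reg', which may describe all memory including reserved parts.
    if "linux,usable-memory" in node:
        return node["linux,usable-memory"]
    return node["reg"]


full = {"reg": [(0x0, 0x4000_0000)]}
partitioned = {
    "reg": [(0x0, 0x4000_0000)],
    "linux,usable-memory": [(0x0, 0x3000_0000)],
}
```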
mmacy [Sat, 19 May 2018 00:04:01 +0000 (00:04 +0000)]
Silence non-actionable warnings in vendor code
We can't modify vendor code so there's no signal in warnings from it.
Similarly -Waddress-of-packed-member is not useful on networking code
as access to packed structures is fundamental to its operation.
mjg [Fri, 18 May 2018 22:57:52 +0000 (22:57 +0000)]
lockmgr: avoid atomic on unlock in the slow path
The code is pretty much guaranteed not to be able to unlock.
This is a minor nit. The code still performs way too many reads.
The altered exclusive-locked condition is supposed to be always
true as well, to be cleaned up at a later date.
mmacy [Fri, 18 May 2018 20:13:34 +0000 (20:13 +0000)]
ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@
gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.
When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:
o job.c: skip polling job token pipe
o parse.c: be more cautious about detecting dependency line
rather than sysV style include.
also in mk:
* dirdeps.mk: include local.dirdeps-build.mk when .MAKE.LEVEL > 0
ie. we are building something.
* FILES: add dirdeps-options.mk to deal with optional DIRDEPS.
* ldorder.mk: describe how to use LDORDER_EXTERN_BARRIER
if needed.
jhb [Fri, 18 May 2018 19:09:11 +0000 (19:09 +0000)]
Be more robust against garbage input on a TOE TLS TX socket.
If a socket is closed or shutdown and a partial record (or what
appears to be a partial record) is waiting in the socket buffer,
discard the partial record and close the connection rather than
waiting forever for the rest of the record.
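A rough illustration of how a partial record can be recognised, assuming the standard 5-byte TLS record header (content type, version, 16-bit big-endian length). The function names are hypothetical and this is not the driver's code, only a sketch of the check.

```python
import struct

TLS_HEADER_LEN = 5  # content type (1) + version (2) + length (2)


def pending_record_state(buf):
    """Classify the first record waiting in a send buffer.

    Returns "empty", "partial", or "complete".
    """
    if len(buf) == 0:
        return "empty"
    if len(buf) < TLS_HEADER_LEN:
        return "partial"
    _type, _version, length = struct.unpack("!BHH", buf[:TLS_HEADER_LEN])
    if len(buf) < TLS_HEADER_LEN + length:
        return "partial"
    return "complete"


def on_socket_closed(buf):
    """Decide what to do once no more data can arrive on the socket."""
    # The rest of a partial record will never show up, so waiting for it
    # would hang forever; drop it and close instead.
    if pending_record_state(buf) == "partial":
        return "discard-and-close"
    return "continue"
```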
mmacy [Fri, 18 May 2018 17:29:43 +0000 (17:29 +0000)]
epoch(9): Make epochs non-preemptible by default
There are risks associated with waiting on a preemptible epoch section.
Change the name to make them not be the default and document the issue
under CAVEATS.
markj [Fri, 18 May 2018 16:59:58 +0000 (16:59 +0000)]
Don't increment addl_page_shortage for wired pages.
Such pages are dequeued as they're encountered during the inactive queue
scan, so by the time we get to the active queue scan, they should have
already been subtracted from the inactive queue length.
gallatin [Fri, 18 May 2018 14:14:04 +0000 (14:14 +0000)]
Teach pmcannotate about $TMPDIR and _PATH_TMP
Convert pmcannotate to using $TMPDIR and _PATH_TMP rather than hard
coding /tmp for temporary files. Pmcannotate sometimes needs quite a
lot of space to store the output from objdump, and will fail in odd
ways if that output is truncated due to lack of space in /tmp.
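The lookup order can be sketched as below. This is an illustration in Python rather than the tool's C code; `PATH_TMP` stands in for the `_PATH_TMP` macro from `<paths.h>`, and the `temp_dir` helper is hypothetical.

```python
import os

PATH_TMP = "/tmp/"  # stand-in for the C macro _PATH_TMP from <paths.h>


def temp_dir(environ=os.environ):
    """Return the directory to use for temporary files."""
    # Honour $TMPDIR when it is set and non-empty, so large objdump
    # output can be redirected to a roomier filesystem.
    tmpdir = environ.get("TMPDIR")
    if tmpdir:
        return tmpdir
    return PATH_TMP
```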
cognet [Fri, 18 May 2018 13:28:02 +0000 (13:28 +0000)]
Instead of ignoring the VFP registers, set the dumppcb's pcb_fpusaved
field, so that they are saved, as they may be used in the kernel, in the
EFI and the crypto code.
ae [Fri, 18 May 2018 12:12:24 +0000 (12:12 +0000)]
Make the name of the option that toggles the IFCAP_HWRXTSTMP capability
match the name of this capability. It was added recently and has not been
merged to the stable branch, so I hope it is not too late to change the name.
mjg [Fri, 18 May 2018 07:31:26 +0000 (07:31 +0000)]
amd64: tweak the read_frequently section
1. align to 128 bytes to avoid possible waste from the preceding section
2. sort entries by alignment (SORT_BY_ALIGNMENT), plugging the holes (most
entries are one byte in size, but they got interleaved with bigger ones)
Interestingly I was looking for a feature of the sort earlier and failed
to find it. It turns out the script was already utilizing sorting in other
places, so shame on me.
Thanks to Travis Geiselbrecht for pointing me at the feature.
np [Fri, 18 May 2018 06:09:15 +0000 (06:09 +0000)]
cxgbe(4): Implement ifnet callbacks that deal with send tags.
An etid (ethoffload tid) is allocated for a send tag and it acquires a
reference on the traffic class that matches the send parameters
associated with the tag.
emaste [Fri, 18 May 2018 02:58:26 +0000 (02:58 +0000)]
vt: add more cp437 mappings for vga textmode
In UTF-8 locales mandoc uses a number of characters outside of the Basic
Latin group, e.g. from general punctuation or miscellaneous mathematical
symbols, and these rendered as ? in text mode.
This change adds (char, replacement, code point, description):
mmacy [Fri, 18 May 2018 01:52:51 +0000 (01:52 +0000)]
epoch: add non-preemptible "critical" variant
adds:
- epoch_enter_critical() - can be called inside a different epoch,
starts a section that will not acquire any MTX_DEF mutexes or do
anything that might sleep.
- epoch_exit_critical() - corresponding exit call
- epoch_wait_critical() - wait variant that is guaranteed that any
threads in a section are running.
- epoch_global_critical - an epoch_wait_critical safe epoch instance
cognet [Thu, 17 May 2018 22:38:16 +0000 (22:38 +0000)]
In vfp_save_state(), don't bother trying to save the VFP registers if the
provided PCB doesn't have a pcb_fpusaved. All PCBs associated with a thread
should have one, but the dumppcb used when panic'ing doesn't.
rmacklem [Thu, 17 May 2018 21:17:20 +0000 (21:17 +0000)]
Add a missing nfsrv_freesession() call for an unlikely failure case.
Since NFSv4.1 clients normally create a single session which supports
both fore and back channels, it is unlikely that a callback will fail
due to a lack of a back channel.
However, if this failure occurred, the session wasn't being dereferenced
and would never be free'd.
Found by inspection during pNFS server development.
trasz [Thu, 17 May 2018 19:54:11 +0000 (19:54 +0000)]
Add a "multifunction" device side USB template, which provides mass
storage, CDC ACM (serial), and CDC ECM (ethernet) at the same time.
It's quite similar in function to Linux' "g_multi" gadget.
Reviewed by: hselasky@
MFC after: 2 weeks
Relnotes: yes
Sponsored by: The FreeBSD Foundation
manu [Thu, 17 May 2018 19:10:13 +0000 (19:10 +0000)]
release: rpi3: Copy the special rpi3 config.txt
RPI* 32-bit and RPI* 64-bit have different config.txt files.
Copy the correct config.txt to the FAT partition of the release image.
Also copy pwm.dtbo as some people want to use it.
mmacy [Thu, 17 May 2018 17:59:35 +0000 (17:59 +0000)]
AF_UNIX: make unix socket locking finer grained
This change moves to using a reference count across lock drop / reacquire
to guarantee liveness.
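The general pattern can be sketched as follows. This is a user-space model, not the unix socket code: the `Peer` class and `relock_with_reference` helper are hypothetical names used only to show how an extra reference keeps an object alive while its lock is dropped.

```python
import threading


class Peer:
    def __init__(self):
        self.lock = threading.Lock()
        self.refs = 1        # reference held by the owner
        self.freed = False

    def hold(self):
        # Caller must hold self.lock.
        self.refs += 1

    def release(self):
        # Caller must hold self.lock; returns True if the object was freed.
        self.refs -= 1
        if self.refs == 0:
            self.freed = True
        return self.freed


def relock_with_reference(peer, work_without_lock):
    """Drop peer.lock around work_without_lock() without losing the object.

    Entered and exited with peer.lock held. Returns True if the object is
    still alive afterwards, False if ours was the last reference.
    """
    peer.hold()
    peer.lock.release()
    try:
        work_without_lock()
    finally:
        peer.lock.acquire()
    # The extra reference kept the object valid while unlocked; dropping
    # it tells us whether everyone else let go in the meantime.
    return not peer.release()
```

The caller has to check the return value: if the owner went away while the lock was dropped, the object must not be used further.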
Currently sends on unix sockets contend heavily on read locking the list lock.
unix1_processes in will-it-scale peaks at 6 processes and then declines.
With this change I get a substantial improvement in number of operations per second
with 96 processes:
Even with 2 processes it shows a ~18% improvement.
"Small" iron changes (1, 2, and 4 processes):
x before1
+ after1.2
(ministat distribution plot omitted)
    N           Min           Max        Median           Avg        Stddev
x  10       1131648       1197750     1197138.5     1190369.3     20651.839
+  10       1203840       1205056       1204919     1204827.9     353.27404
Difference at 95.0% confidence
        14458.6 +/- 13723
        1.21463% +/- 1.16683%
        (Student's t, pooled s = 14605.2)
x before2
+ after2.2
(ministat distribution plot omitted)
    N           Min           Max        Median           Avg        Stddev
x  10       1972843       2045866     2038186.5     2030443.8     21367.694
+  10       2400853       2402196     2401043.5     2401172.7     385.40024
Difference at 95.0% confidence
        370729 +/- 14198.9
        18.2585% +/- 0.826943%
        (Student's t, pooled s = 15111.7)
x before4
+ after4.2
    N           Min           Max        Median           Avg        Stddev
x  10       3986994       3991728     3990137.5     3989985.2     1300.0164
+  10       4799990       4806664     4806116.5       4805194     1990.6625
Difference at 95.0% confidence
        815209 +/- 1579.64
        20.4314% +/- 0.0421713%
        (Student's t, pooled s = 1681.19)
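As a sanity check on the figures above, the pooled standard deviation and the difference of means for the 1-process run can be recomputed from the reported N, Avg and Stddev values. This is a minimal sketch using the textbook pooled-variance formula; it is not ministat's own code, and the helper names are made up for illustration.

```python
import math


def pooled_stddev(n1, s1, n2, s2):
    """Pooled standard deviation of two samples (Student's t)."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def mean_difference(avg_before, avg_after):
    """Absolute and relative (percent) difference of the two means."""
    diff = avg_after - avg_before
    return diff, 100.0 * diff / avg_before


# Values taken from the before1 / after1.2 rows.
s = pooled_stddev(10, 20651.839, 10, 353.27404)
diff, pct = mean_difference(1190369.3, 1204827.9)
print(f"pooled s = {s:.1f}, difference = {diff:.1f} ({pct:.5f}%)")
```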
manu [Thu, 17 May 2018 16:21:12 +0000 (16:21 +0000)]
release: arm: Format FAT partition as FAT16
r332674 raised the size of the FAT partition from 2MB to 41MB for some
boards. But we format them as FAT12, and this size appears to be too big
for FAT12; some SoC bootroms cannot cope with that.
Format the msdosfs partition as FAT16,
trasz [Thu, 17 May 2018 15:19:29 +0000 (15:19 +0000)]
Fix off-by-one in usb_decode_str_desc(). Previously it would decode
one character too many. Note that this function is only used to decode
string descriptors generated by the kernel itself.
Reviewed by: hselasky@
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
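For illustration, the length arithmetic for a USB string descriptor looks roughly like this. The sketch assumes the standard layout (bLength, bDescriptorType 0x03, then UTF-16LE code units); `decode_string_descriptor` is a hypothetical helper, not the signature or body of usb_decode_str_desc().

```python
USB_DT_STRING = 0x03


def decode_string_descriptor(desc):
    """Decode a USB string descriptor held in a bytes object."""
    b_length = desc[0]
    if b_length < 2 or desc[1] != USB_DT_STRING or b_length > len(desc):
        raise ValueError("not a valid string descriptor")
    # bLength counts the 2-byte header, so the payload holds exactly
    # (bLength - 2) // 2 UTF-16LE code units. Reading one more would
    # run past the end of the descriptor.
    nchars = (b_length - 2) // 2
    payload = desc[2:2 + 2 * nchars]
    return payload.decode("utf-16-le")
```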
sbruno [Thu, 17 May 2018 14:55:41 +0000 (14:55 +0000)]
Retire vxge(4).
This driver was merged to HEAD one week prior to Exar publicly announcing they
had left the Ethernet market. It is not known to be used and has various code
quality issues spotted by Brooks and Hiren. Retire it in preparation for
FreeBSD 12.0.
manu [Thu, 17 May 2018 14:51:22 +0000 (14:51 +0000)]
aw_spi: Fix some silly clock mistake
The module uses the mod clock and not the ahb one.
We need to set the mod clock to twice the speed requested as the smallest
divider in the controller is 2.
The clock test functions weren't calculating the register value based on the
best div but on the max one.
The cdr2 test function was using the cdr1 formula.
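The idea of choosing the best divider rather than the largest one can be sketched as follows. The divider range here is a made-up example and `best_divider` is a hypothetical helper; the real register encodings for cdr1 and cdr2 are not reproduced.

```python
def best_divider(mod_clk, target, dividers):
    """Pick the divider whose output is closest to target without exceeding it.

    Returns None if every divider yields a rate above the target.
    """
    best = None
    for div in dividers:
        rate = mod_clk // div
        if rate > target:
            continue
        if best is None or rate > mod_clk // best:
            best = div
    return best


target = 1_000_000           # requested SPI speed in Hz
mod_clk = 2 * target         # smallest divider is 2, so run mod at twice the speed
dividers = range(2, 514, 2)  # hypothetical set of even dividers
```

With the mod clock at twice the requested speed, the smallest divider gives the requested rate exactly.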