markj [Sat, 12 May 2018 15:35:26 +0000 (15:35 +0000)]
DTrace aarch64: Avoid calling unwind_frame() in the probe context.
unwind_frame() may be instrumented by FBT, leading to recursion into
dtrace_probe(). Manually inline unwind_frame() as we do with stack
unwinding code for other architectures.
According to the Intel SDM (Volme 3, 9.11.7) the BIOS signature MSR
should be zeroed before executing cpuid (although in practice it does
not seem to matter).
PR: 192487
Submitted by: Dan Lukes
Reported by: Henrique de Moraes Holschuh
MFC after: 3 days
manu [Sat, 12 May 2018 13:14:01 +0000 (13:14 +0000)]
aw_mmc: Rework regulator handling
Don't enable regulator on attach but dealt with them on power_up/power_off
Only set the voltage for the signaling regulator since I don't have boards
that can change the supply voltage.
Enable 1.8v signaling voltage.
manu [Sat, 12 May 2018 13:13:34 +0000 (13:13 +0000)]
aw_mmc: Do not fully init the controller in attach
Only do a reset of the controller at attach and init it at power_up.
We use to enable some interrupts in reset, only enable the interrupts
we are interested in when doing a request.
While here remove the regulators handling in power_on as it is very wrong
and will be dealt with in another commit.
des [Sat, 12 May 2018 12:00:18 +0000 (12:00 +0000)]
Upgrade LDNS to 1.7.0.
I've been holding back on this because 1.7.0 requires OpenSSL 1.1.0 or
newer for full DANE support. But we can't wait forever, and nothing in
base uses DANE anyway, so here we go.
kib [Sat, 12 May 2018 11:06:59 +0000 (11:06 +0000)]
Kernel entry from vm86 mode, where PCB_VM86CALL pcb flag is not set,
is executed on the right stack already. No copy from the entry stack
to the kstack must be performed for vm86 bios call code to function.
To access the pcb flags on kernel entry, unconditionally switch to
kernel address space if vm86 mode is detected.
This fixes very early vm86 bios calls, typically done when boot is
performed by boot2 without loader, and kernel falls back to BIOS calls
to get SMAP.
Reported by: bde
Sponsored by: The FreeBSD Foundation
kib [Sat, 12 May 2018 10:48:53 +0000 (10:48 +0000)]
Fix use of the custom TSS on i386 after the 4/4 split.
Record common_tssd, the descriptor to be written in GDT to point to
the common TSS, before LTR is executed. The LTR instruction sets the
loaded descriptor type to 386 TSS busy, which traps on reloads.
dteske [Sat, 12 May 2018 06:18:15 +0000 (06:18 +0000)]
dwatch(1): Expose process for ip/tcp/udp
Knowing the value of execname during these probes is of some value even
if it is commonly the interrupt kernel thread (intr[12]) -- quite often
it is not, but that depends on the probe.
dteske [Sat, 12 May 2018 05:49:31 +0000 (05:49 +0000)]
dwatch(1): Export ARGV to profiles loaded via load_profile()
A module that wishes to post-process the output needs to know which
arguments were passed in order to re-execute a child in a pipe-chain.
Further, the expansion of ARGV needs to be such that items are escaped
properly.
dteske [Sat, 12 May 2018 05:41:28 +0000 (05:41 +0000)]
dwatch(1): Separate default values so `-[BK] num' don't affect usage
If you were to pass an invalid option after `-B num' or `-K num' you
would see that the usage statement would show the value you passed
instead of the actual default.
Moving the default values to separate variables that are unaffected
by the options-parsing allows the usage statement to correctly show
the hard-coded default values if no flags are used.
dteske [Sat, 12 May 2018 05:36:47 +0000 (05:36 +0000)]
dwatch(1): Bugfix, usage displayed with `-1Q'
A return statement should have been an exit in list_profiles().
If the user passed `-Q' to list profiles and asked for one-line
per profile (`-1'), list_profiles() would not exit as should.
mmacy [Sat, 12 May 2018 01:26:34 +0000 (01:26 +0000)]
hwpmc(9): Make pmclog buffer pcpu and update constants
On non-trivial SMP systems the contention on the pmc_owner mutex leads
to a substantial number of samples captured being from the pmc process
itself. This change a) makes buffers larger to avoid contention on the
global list b) makes the working sample buffer per cpu.
Run pmcstat in the background (default event rate of 64k):
pmcstat -S UNHALTED_CORE_CYCLES -O /dev/null sleep 600 &
Before:
make -j96 buildkernel -s >&/dev/null 3336.68s user 24684.10s system 7442% cpu 6:16.50 total
After:
make -j96 buildkernel -s >&/dev/null 2697.82s user 1347.35s system 6058% cpu 1:06.77 total
For more realistic overhead measurement set the sample rate for ~2khz
on a 2.1Ghz processor:
pmcstat -n 1050000 -S UNHALTED_CORE_CYCLES -O /dev/null sleep 6000 &
Collecting 10 samples of `make -j96 buildkernel` from each:
x before
+ after
real time:
N Min Max Median Avg Stddev
x 10 76.4 127.62 84.845 88.577 15.100031
+ 10 59.71 60.79 60.135 60.179 0.29957192
Difference at 95.0% confidence
-28.398 +/- 10.0344
-32.0602% +/- 7.69825%
(Student's t, pooled s = 10.6794)
system time:
N Min Max Median Avg Stddev
x 10 2277.96 6948.53 2949.47 3341.492 1385.2677
+ 10 1038.7 1081.06 1070.555 1064.017 15.85404
Difference at 95.0% confidence
-2277.47 +/- 920.425
-68.1574% +/- 8.77623%
(Student's t, pooled s = 979.596)
x no pmc
+ pmc running
real time:
HEAD:
N Min Max Median Avg Stddev
x 10 58.38 59.15 58.86 58.847 0.22504567
+ 10 76.4 127.62 84.845 88.577 15.100031
Difference at 95.0% confidence
29.73 +/- 10.0335
50.5208% +/- 17.0525%
(Student's t, pooled s = 10.6785)
patched:
N Min Max Median Avg Stddev
x 10 58.38 59.15 58.86 58.847 0.22504567
+ 10 59.71 60.79 60.135 60.179 0.29957192
Difference at 95.0% confidence
1.332 +/- 0.248939
2.2635% +/- 0.426506%
(Student's t, pooled s = 0.264942)
system time:
HEAD:
N Min Max Median Avg Stddev
x 10 1010.15 1073.31 1025.465 1031.524 18.135705
+ 10 2277.96 6948.53 2949.47 3341.492 1385.2677
Difference at 95.0% confidence
2309.97 +/- 920.443
223.937% +/- 89.3039%
(Student's t, pooled s = 979.616)
patched:
N Min Max Median Avg Stddev
x 10 1010.15 1073.31 1025.465 1031.524 18.135705
+ 10 1038.7 1081.06 1070.555 1064.017 15.85404
Difference at 95.0% confidence
32.493 +/- 16.0042
3.15% +/- 1.5794%
(Student's t, pooled s = 17.0331)
rmacklem [Fri, 11 May 2018 22:16:23 +0000 (22:16 +0000)]
Add support for the TestStateID operation to the NFSv4.1 server.
The Linux client now uses the TestStateID operation, so this patch adds
support for it to the NFSv4.1 server. The FreeBSD client never uses this
operation, so it should not be affected.
shurd [Fri, 11 May 2018 19:37:18 +0000 (19:37 +0000)]
Fix mld6query(8) and add a new -g option
The mld6query command relies on KAME behaviour which allows the
ipv6mr_multiaddr member of the request object in a IPV6_JOIN_GROUP
setsockopt() call to be INADDR6_ANY. The FreeBSD stack doesn't allow
this, so mld6query has been non-functional.
Also, add a -g option which sends a General Query (query INADDR6_ANY)
sbruno [Fri, 11 May 2018 17:26:59 +0000 (17:26 +0000)]
vxge(4): deprecation notice
This hardware isn't totally ancient, about equal to a mxge(4) or mlx4en(4),
but the company was sold to Exar which then promptly exited the Ethernet
business so the card was commercially available for under 2 years. On deep
search, the only usage of these cards I found was by the importing of the
driver. There are code quality issues identified by Brooks and Hiren and
no visible use nor maintainership that warrant removal from FreeBSD 12.0.
ae [Fri, 11 May 2018 16:50:25 +0000 (16:50 +0000)]
Apply the change from r272770 to if_ipsec(4) interface.
It is guaranteed that if_ipsec(4) interface is used only for tunnel
mode IPsec, i.e. decrypted and decapsultaed packet has its own IP header.
Thus we can consider it as new packet and clear the protocols flags.
This allows ICMP/ICMPv6 properly handle errors that may cause this packet.
trasz [Fri, 11 May 2018 15:11:53 +0000 (15:11 +0000)]
Improve development(7):
- Use Fx when referring to FreeBSD.
- Use Ql instead of Cm for command invocations.
- Remove some redundant Pp macros.
- Use a literal indented Bd instead of a series of Dl macros.
trasz [Fri, 11 May 2018 14:52:35 +0000 (14:52 +0000)]
Set kldxref_enable="YES" for ARM images. Without it, the images are missing
the /boot/kernel/linker.hints file, which breaks loading some of the modules
with dependencies, eg cfiscsi.ko.
This is a minimal fix for ARM images, in order to safely MFC it before
11.2-RELEASE. Afterwards, however, I believe we should actually just change
the default (as in, etc/defaults/rc.conf). The reason is that it's required
for every image that's being cross-built, as kldxref(1) cannot handle files
for non-native architectures. For the one that is not - amd64 - having it
on by default doesn't change anything - the script is noop if the linker.hints
already exists.
The long-term solution would be to rewrite kldxref(1) to handle other
architectures, and generate linker.hints at build time.
ken [Fri, 11 May 2018 14:50:26 +0000 (14:50 +0000)]
Clear out the entire structure, not just the size of a pointer to it.
sys/dev/ocs/ocs_os.c:
In ocs_thread_create(), use sizeof(*thread) (instead of
sizeof(thread)) as the size argument to memset so that we clear
out the entire thread structure instead of just a few bytes of it.
trasz [Fri, 11 May 2018 14:43:21 +0000 (14:43 +0000)]
Make /etc/rc.d/kldxref not print anything for directories that don't
contain any kernel modules. This makes the common case completely silent,
as it should be.
mjg [Fri, 11 May 2018 08:56:39 +0000 (08:56 +0000)]
amd64: align the .data.exclusive_cache_line section to 128
This aligns the section itself compared to other sections, does not change
internal alignment of fields stored inside. This may or may not come later.
The motivation is partially combating adverse effects of the adjacent cache
line prefetcher. Without the annotation part of read_mostly section was on
the line of fire.
mjg [Fri, 11 May 2018 07:04:57 +0000 (07:04 +0000)]
uma: increase alignment to 128 bytes on amd64
Current UMA internals are not suited for efficient operation in
multi-socket environments. In particular there is very common use of
MAXCPU arrays and other fields which are not always properly aligned and
are not local for target threads (apart from the first node of course).
Turns out the existing UMA_ALIGN macro can be used to mostly work around
the problem until the code get fixed. The current setting of 64 bytes
runs into trouble when adjacent cache line prefetcher gets to work.
An example 128-way benchmark doing a lot of malloc/frees has the following
instruction samples:
mmacy [Fri, 11 May 2018 05:00:40 +0000 (05:00 +0000)]
Allow different bridge types to coexist
if_bridge has a lot of limitations that make it scale poorly to higher data
rates. In my projects/VPC branch I leverage the bridge interface between
layers for my high speed soft switch as well as for purposes of stacking
in general.
mmacy [Fri, 11 May 2018 04:54:12 +0000 (04:54 +0000)]
epoch(9): fix priority handling, make callback lists pcpu, and other fixes
- Lend priority to preempted threads in epoch_wait to handle the case
in which we've had priority lent to us. Previously we borrowed the
priority of the lowest priority preempted thread. (pointed out by mjg@)
- Don't attempt allocate memory per-domain on powerpc, we don't currently
handle empty sockets (as is the case on jhibbits Talos' board).
- Handle deferred callbacks as pcpu lists and poll the lists periodically.
Currently the interval is 1/hz.
- Drop the thread lock when adaptive spinning. Holding the lock starves
other threads and can even lead to lockups.
- Keep a generation count pcpu so that we don't keep spining if a thread
has left and re-entered an epoch section.
- Actually removed the callback from the callback list so that we don't
double free. Sigh ...
des [Fri, 11 May 2018 00:19:49 +0000 (00:19 +0000)]
Slight cleanup of interface event logging.
Make if_printf() use vlog() instead of vprintf(). This means it can no
longer return the number of characters printed, as it used to, but every
single call to if_printf() in the entire kernel ignores the return value
anyway; just return 0 so we don't have to change the prototype.
Consistently use if_printf() throughout sys/net/if.c, instead of a
mixture of if_printf() and log().
In ifa_maintain_loopback_route(), don't needlessly log an error if we
either failed to add a route because it already existed or failed to
remove one because it did not. We still return an error code, though.
des [Fri, 11 May 2018 00:01:43 +0000 (00:01 +0000)]
Reduce <sys/queue.h> pollution.
While <sys/sysctl.h> includes <sys/queue.h> unconditionally, it is only
actually used in code which is conditional on _KERNEL. Make the #include
itself conditional as well, and fix userland code that uses <sys/queue.h>
for other purposes but relied on <sys/sysctl.h> to bring it in.
gjb [Thu, 10 May 2018 21:46:58 +0000 (21:46 +0000)]
Add a special GCE_LICENSE variable to Makefile.gce, which when set,
will include license metadata in the resultant GCE image.
GCE_LICENSE is unset by default, as it primarily pertains to images
produced by the FreeBSD Project, but for downstream FreeBSD consumers,
it can be set in the make(1) environment in the format of:
The "license" is not a license, per se, but required metadata that
is required by the GCE marketplace. For the FreeBSD Project, the
license name is simply 'freebsd', with the description of 'FreeBSD'.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
imp [Thu, 10 May 2018 20:27:12 +0000 (20:27 +0000)]
Revert r333365
Even though we don't use it, it appears something else requires it to
be != 0 to work. This breaks tftp boot in loader.efi, so revert until
that can be sorted out.
emaste [Thu, 10 May 2018 20:10:02 +0000 (20:10 +0000)]
Error out on attempt to link amd64 kernel with old binutils linker
As of r333461 we require ifunc support to link a working amd64 kernel.
The default in-tree bootstrap linker is lld and it has the required
support, as does any modern out-of-tree binutils linker. The in-tree
GNU ld is from binutils 2.17.50 and it does not have ifunc support,
so produce an error rather than a broken kernel.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D15378
dumbbell [Thu, 10 May 2018 17:00:33 +0000 (17:00 +0000)]
vt(4): Use default VGA palette
Before this change, the VGA palette was configured to match the shell
palette (e.g. color #1 was red). There was one glitch early in boot when
the vt(4)'s VGA palette was loaded: the loader's logo would switch from
red to blue. Likewise for the "Booting..." message switching from blue
to red. That's because the loader's logo was drawed with the default VGA
palette where a few colors are swapped compared to the shell palette
(e.g. blue <-> red).
This change configures the default VGA palette during initialization and
converts input's colors from shell to VGA palette index.
There should be no visible changes, except the loader's logo which will
keep its original color.
kib [Thu, 10 May 2018 15:01:43 +0000 (15:01 +0000)]
Make fpusave() and fpurestore() on amd64 ifuncs.
From now on, linking amd64 kernel requires either lld or newer ld.bfd.
Reviewed by: jhb (as part of the large patch)
Discussed with: emaste
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D13838
ae [Thu, 10 May 2018 12:25:01 +0000 (12:25 +0000)]
Fix the printing of rule comments.
Change uint8_t type of opcode argument to int in the print_opcode()
function. Use negative value to print the rest of opcodes, because
zero value is O_NOP, and it can't be uses for this purpose.