Merge r334562 from stable/11 to releng/11.2. r334562 MFC'd the
following revisions to stable/11: r333650, r333652, r333682, r334406,
r334409-r334410, and r334489.
r333650:
cxgbe(4): Claim some more T5 and T6 boards.
r333652:
cxgbe(4): Add support for two more flash parts.
r333682:
cxgbe(4): Fall back to a failsafe configuration built into the firmware
if an error is reported while pre-processing the configuration file that
the driver attempted to use.
Also, allow the user to explicitly use the built-in configuration with
hw.cxgbe.config_file="built-in"
r334406:
cxgbe(4): Consider all supported speeds when building the ifmedia list
for a port. Fix other related issues while here:
- Require port lock for access to link_config.
- Allow 100Mbps operation by tracking the speed in Mbps. Yes, really.
- New port flag to indicate that the media list is immutable. It will
be used in future refinements.
This also fixes a bug where the driver reports incorrect media with
recent firmwares.
r334409:
cxgbe(4): Implement ifm_change callback.
r334410:
cxgbe(4): Use ifm for ifmedia just like the rest of the kernel.
No functional change.
r334489:
cxgbe(4): Include full duplex mediaopt in media that can be reported as
active. Always report full duplex in active media.
gjb [Thu, 31 May 2018 23:55:59 +0000 (23:55 +0000)]
MFC r334068 (phil):
Import libxo-0.9.0:
- Add xo_format_is_numeric() with improved logic to decide if format
strings are numeric, so json output quotes them
- Convert docs to sphinx/rst
- update tests
PR: 221676
Approved by: re (marius)
Sponsored by: The FreeBSD Foundation
marius [Thu, 31 May 2018 23:48:27 +0000 (23:48 +0000)]
Akin r302691 in head, synchronize the build stripping for the disc1
image with that of the bootonly image (but similarly modulo games
and groff(1)) as the amd64 disc1 image is overflowing. This also
removes the redundant MK_LLDB.
This is a direct commit to stable/11 rather than a MFC of r302691 as
the the disc1 image stripping previously has been directly modified
in stable/11 by r303027.
gjb [Thu, 31 May 2018 20:01:58 +0000 (20:01 +0000)]
MFC r334310, r334337:
r334310 (imp):
Teach ufs_module.c about bsd labels and probe 'a' partition.
If the check for a UFS partition at offset 0 on the disk fails, check
to see if there's a BSD disklabel at block 1 (standard) or at offset
512 (install images assume 512 sector size). If found, probe for UFS
on the 'a' partition.
This fixes UEFI booting images from a BSD labeled MBR slice when the
'a' partiton isn't at offset 0. This is a stop-gap fix since we plan
on removing boot1.efi in FreeBSD 12. We can't easily do that for 11.2,
however, hence the short MFC window.
r334337 (emaste):
switch amd64 memstick installer images to MBR
A good number of BIOSes have trouble booting from GPT in non-UEFI
mode.
With this change amd64 memsticks remain dual-mode (booting from either
UEFI or CSM); the partitioning type is just switched from GPT to MBR.
PR: 227954
Note, there are two changes specific to stable/11 where there is code
that had diverged from head and never merged back. The two changes are
an include in stand/efi/boot1/ufs_module.c, replacing sys/disk/bsd.h
with sys/disklabel.h and replacing BSD_MAGIC with DISKMAGIC in the
same file. The latter two are direct commits to stable/11 in order to
avoid unexpected regressions at this point of the 11.2 cycle. Thank
you to imp@ for pointing out what changes needed to be made.
tuexen [Thu, 31 May 2018 16:48:08 +0000 (16:48 +0000)]
MFC r333176:
Fix in the documentation that the default hop limit is not 30, but
the value of the sysctl variable net.inet6.ip6.hlim.
This is true since
https://svnweb.freebsd.org/base?view=revision&revision=122574
The default of 30 (which was correct up to r122574) was incorrectly
documented in
https://svnweb.freebsd.org/base?view=revision&revision=130268
Thanks to Timo Voelker for makeing me aware of the inconsistency
between to code and the documentation.
tuexen [Thu, 31 May 2018 16:14:45 +0000 (16:14 +0000)]
MFC r333382:
When reporting ERROR or ABORT chunks, don't use more data
that is guaranteed to be contigous.
Thanks to Felix Weinrank for finding and reporting this bug
by fuzzing the usrsctp stack.
MFC r333386:
Fix two typos reported by N. J. Mann, which were introduced in
https://svnweb.freebsd.org/changeset/base/333382 by me.
tuexen [Thu, 31 May 2018 16:00:03 +0000 (16:00 +0000)]
MFC r333186:
Send an ICMPv6 PacketTooBig message in case of forwading a packet which
is too big for the outgoing interface and no firewall is involed.
This problem was introduced in
https://svnweb.freebsd.org/changeset/base/324996
Thanks to Irene Ruengeler for finding the bug and testing the fix.
rwlock: diff-reduction of runlock compared to sx sunlock
==
Undo LOCK_PROFILING pessimisation after r313454 and r313455
With the option used to compile the kernel both sx and rw shared ops would
always go to the slow path which added avoidable overhead even when the
facility is disabled.
Furthermore the increased time spent doing uncontested shared lock acquire
would be bogusly added to total wait time, somewhat skewing the results.
Restore old behaviour of going there only when profiling is enabled.
This change is a no-op for kernels without LOCK_PROFILING (which is the
default).
==
sx: fix adaptive spinning broken in r327397
The condition was flipped.
In particular heavy multithreaded kernel builds on zfs started suffering
due to nested sx locks.
For instance make -s -j 128 buildkernel:
before: 3326.67s user 1269.62s system 6981% cpu 1:05.84 total
after: 3365.55s user 911.27s system 6871% cpu 1:02.24 total
==
locks: fix a corner case in r327399
If there were exactly rowner_retries/asx_retries (by default: 10) transitions
between read and write state and the waiters still did not get the lock, the
next owner -> reader transition would result in the code correctly falling
back to turnstile/sleepq where it would incorrectly think it was waiting
for a writer and decide to leave turnstile/sleepq to loop back. From this
point it would take ts/sq trips until the lock gets released.
The bug sometimes manifested itself in stalls during -j 128 package builds.
Refactor the code to fix the bug, while here remove some of the gratituous
differences between rw and sx locks.
==
sx: don't do an atomic op in upgrade if it cananot succeed
The code already pays the cost of reading the lock to obtain the waiters
flag. Checking whether there is more than one reader is not a problem and
avoids dirtying the line.
This also fixes a small corner case: if waiters were to show up between
reading the flag and upgrading the lock, the operation would fail even
though it should not. No correctness change here though.
==
mtx: tidy up recursion handling in thread lock
Normally after grabbing the lock it has to be verified we got the right one
to begin with. However, if we are recursing, it must not change thus the
check can be avoided. In particular this avoids a lock read for non-recursing
case which found out the lock was changed.
While here avoid an irq trip of this happens.
==
locks: slightly depessimize lockstat
The slow path is always taken when lockstat is enabled. This induces
rdtsc (or other) calls to get the cycle count even when there was no
contention.
Still go to the slow path to not mess with the fast path, but avoid
the heavy lifting unless necessary.
This reduces sys and real time during -j 80 buildkernel:
before: 3651.84s user 1105.59s system 5394% cpu 1:28.18 total
after: 3685.99s user 975.74s system 5450% cpu 1:25.53 total
disabled: 3697.96s user 411.13s system 5261% cpu 1:18.10 total
So note this is still a significant hit.
LOCK_PROFILING results are not affected.
==
rw: whack avoidable re-reads in try_upgrade
==
locks: extend speculative spin waiting for readers to drain
Now that 10 years have passed since the original limit of 10000 was
committed, bump it a little bit.
Spinning waiting for writers is semi-informed in the sense that we always
know if the owner is running and base the decision to spin on that.
However, no such information is provided for read-locking. In particular
this means that it is possible for a write-spinner to completely waste cpu
time waiting for the lock to be released, while the reader holding it was
preempted and is now waiting for the spinner to go off cpu.
Nonetheless, in majority of cases it is an improvement to spin instead of
instantly giving up and going to sleep.
The current approach is pretty simple: snatch the number of current readers
and performs that many pauses before checking again. The total number of
pauses to execute is limited to 10k. If the lock is still not free by
that time, go to sleep.
Given the previously noted problem of not knowing whether spinning makes
any sense to begin with the new limit has to remain rather conservative.
But at the very least it should also be related to the machine. Waiting
for writers uses parameters selected based on the number of activated
hardware threads. The upper limit of pause instructions to be executed
in-between re-reads of the lock is typically 16384 or 32678. It was
selected as the limit of total spins. The lower bound is set to
already present 10000 as to not change it for smaller machines.
Bumping the limit reduces system time by few % during benchmarks like
buildworld, buildkernel and others. Tested on 2 and 4 socket machines
(Broadwell, Skylake).
Figuring out how to make a more informed decision while not pessimizing
the fast path is left as an exercise for the reader.
==
fix uninitialized variable warning in reader locks
r334261: Guard against error when given -t "*..."
r334262: Eliminate ANSI dimming in developer mode
r334359: Fix "-t test" for post-processing profiles
Bump FreeBSD_version directly in stable/11 for ports IGNORE (as in r334290)
Reviewed by: gjb
Approved by: re (gjb)
Sponsored by: Smule, Inc.
jhb [Tue, 29 May 2018 13:54:34 +0000 (13:54 +0000)]
MFC 333606: Make the common interrupt entry point labels local labels.
Kernel debuggers depend on symbol names to find stack frames with a
trapframe rather than a normal stack frame. The labels used for the
shared interrupt entry point for the PTI and non-PTI cases did not
match the existing patterns confusing debuggers. Add the '.L' prefix
to mark these symbols as local so they are not visible in the symbol
table.
royger [Tue, 29 May 2018 07:51:24 +0000 (07:51 +0000)]
MFC r334027: xen-blkback: do not use state 3
Linux will not connect to a backend that's in state 3
(XenbusStateInitialised), it needs to be in state 2
(XenbusStateInitWait) for Linux to attempt to connect to the
backend.
marius [Thu, 24 May 2018 23:11:25 +0000 (23:11 +0000)]
MFC: r333955
- Unbreak booting sparc64 kernels after the metadata unification in
r329190 (MFCed to stable/11 in r332150); sparc64 kernels are always
64-bit but with that revision in place, the loader was treating them
as 32-bit ones.
- In order to reduce the likelihood of this kind of breakage in the
future, #ifdef out md_load() on sparc64 and make md_load_dual() -
which is currently local to metadata.c anyway - static.
- Make md_getboothowto() - also local to metadata.c - static.
- Get rid of the unused DTB pointer on sparc64.
ae [Thu, 24 May 2018 11:02:21 +0000 (11:02 +0000)]
MFC r333986:
Remove check for matching the rulenum, ruleid and rule pointer from
dyn_lookup_ipv[46]_state_locked(). These checks are remnants of not
ready to be committed code, and they are there by accident.
Due to the race these checks can lead to creating of duplicate states
when concurrent threads in the same time will try to add state for two
packets of the same flow, but in reverse directions and matched by
different parent rules.
Reported by: lev
MFC r334039:
Restore the ability to keep states after parent rule deletion.
This feature is disabled by default and was removed when dynamic states
implementation changed to be lockless. Now it is reimplemented with small
differences - when dyn_keep_states sysctl variable is enabled,
dyn_match_ipv[46]_state() function doesn't match child states of deleted
rule. And thus they are keept alive until expired. ipfw_dyn_lookup_state()
function does check that state was not orphaned, and if so, it returns
pointer to default_rule and its position in the rules map. The main visible
difference is that orphaned states still have the same rule number that
they have before parent rule deleted, because now a state has many fields
related to rule and changing them all atomically to point to default_rule
seems hard enough.
Reported by: <lantw44 at gmail.com>
Approved by: re (kib)
ken [Mon, 21 May 2018 18:59:34 +0000 (18:59 +0000)]
MFC r333492:
------------------------------------------------------------------------
r333492 | ken | 2018-05-11 08:50:26 -0600 (Fri, 11 May 2018) | 10 lines
Clear out the entire structure, not just the size of a pointer to it.
sys/dev/ocs/ocs_os.c:
In ocs_thread_create(), use sizeof(*thread) (instead of
sizeof(thread)) as the size argument to memset so that we clear
out the entire thread structure instead of just a few bytes of it.
dim [Sun, 20 May 2018 16:03:21 +0000 (16:03 +0000)]
MFC r333715:
Pull in r322325 from upstream llvm trunk (by Matthias Braun):
PeepholeOpt cleanup/refactor; NFC
- Less unnecessary use of `auto`
- Add early `using RegSubRegPair(AndIdx) =` to avoid countless
`TargetInstrInfo::` qualifications.
- Use references instead of pointers where possible.
- Remove unused parameters.
- Rewrite the CopyRewriter class hierarchy:
- Pull out uncoalescable copy rewriting functionality into
PeepholeOptimizer class.
- Use an abstract base class to make it clear that rewriters are
independent.
- Remove unnecessary \brief in doxygen comments.
- Remove unused constructor and method from ValueTracker.
- Replace UseAdvancedTracking of ValueTracker with DisableAdvCopyOpt
use.
Even though upstream marked this as "No Functional Change", it does
contain some functional changes, and these fix a compiler hang for one
particular source file in the devel/godot port.
gjb [Fri, 18 May 2018 14:57:58 +0000 (14:57 +0000)]
MFC r315733, r315737, r315740, r330054:
r315733 (imp):
Impelemnt ttys onifexists in init.
Implement a new init(8) option in /etc/ttys. If this option is present
on the entry in /etc/ttys, the entry will be active if and only if it
exists. If the name starts with a '/', it will be considered an
absolute path. If not, it will be a path relative to /dev.
This allows one to turn off video console getty that aren't present
(while running a getty on them even when they aren't the system
console). Likewise with serial ports.
It differs from onifconsole in only requiring the device exist rather
than it be listed as one of the system consoles.
r315737 (ngie):
Unbreak world by adding sys/stat.h for stat(2)
r315740 (imp):
Simplify the code a little.
r330054 (trasz):
Improve missing tty handling in init(8). This removes a check that did
nothing - it was checking for ENXIO, which, with devfs, is no longer
returned - and was badly placed anyway, and replaces it with similar
one that works, and is done just before starting getty, instead of being
done when rereading ttys(5).
From the practical point of view, this makes init(8) handle disappearing
terminals (eg /dev/ttyU*) gracefully, without unneccessary getty restarts
and resulting error messages.
Reported by: Bart Ender, Andre Albsmeier
PR: 228315
Blocks: 11.2-BETA2
Approved by: re (marius)
Sponsored by: The FreeBSD Foundation
ae [Fri, 18 May 2018 10:17:13 +0000 (10:17 +0000)]
MFC r333497:
Apply the change from r272770 to if_ipsec(4) interface.
It is guaranteed that if_ipsec(4) interface is used only for tunnel
mode IPsec, i.e. decrypted and decapsulated packet has its own IP header.
Thus we can consider it as new packet and clear the protocols flags.
This allows ICMP/ICMPv6 properly handle errors that may cause this packet.
marius [Thu, 17 May 2018 21:22:19 +0000 (21:22 +0000)]
MFC: r333613
The broken DDR52 support of Intel Bay Trail eMMC controllers rumored
in the commit log of r321385 has been confirmed via the public VLI54
erratum. Thus, stop advertising DDR52 for these controllers.
Note that this change should hardly make a difference in practice as
eMMC chips from the same era as these SoCs most likely support HS200
at least, probably even up to HS400ES.
manu [Thu, 17 May 2018 17:00:07 +0000 (17:00 +0000)]
MFC r333737:
release: arm: Format FAT partition as FAT16
r332674 raised the size of the FAT partition from 2MB to 41MB for some
boards. But we format them in FAT12 and this size appears to be to big
for FAT12 and some SoC bootrom cannot cope with that.
Format the msdosfs partition as FAT16,
sbruno [Thu, 17 May 2018 16:32:38 +0000 (16:32 +0000)]
MFC r333499
Add deprecation notice for vxge.
This driver was merged to HEAD one week prior to Exar publicly
announcing theyhad left the Ethernet market. It is not known to be used
and has various code quality issues spotted by Brooks and Hiren. Retire
it in preparation for FreeBSD 12.0.
ae [Thu, 17 May 2018 10:01:47 +0000 (10:01 +0000)]
MFC r333458:
Fix the printing of rule comments.
Change uint8_t type of opcode argument to int in the print_opcode()
function. Use negative value to print the rest of opcodes, because
zero value is O_NOP, and it can't be uses for this purpose.
jhb [Wed, 16 May 2018 21:04:19 +0000 (21:04 +0000)]
MFC 332891,332892: Fixes for atomic_*cmpset() on arm.
332891:
Fix some harmless type mismatches in the ARM atomic_cmpset implementations.
The return value of atomic_cmpset() and atomic_fcmpset() is an int (which
is really a bool) that has the values 0 or 1. Some of the inlines were
using the type being operated on (e.g. uint32_t) as either the return type
of the function, or the type of a local 'ret' variable used to hold the
return value. Fix all of these to just use plain 'int'. Due to C promotion
rules and the fact that the value can only be 0 or 1, these should all be
harmless.
332892:
Implement 32-bit atomic_fcmpset() in userland for armv4/v5.
- Add an implementation of atomic_fcmpset_32() using RAS for armv4/v5.
This fixes recent world breakage due to use of atomic_fcmpset() in
userland.
- While here, be more careful to not expose wrapper macros for 64-bit
atomic_*cmpset to userland for armv4/v5 as only 32-bit cmpset is
implemented.
This has been reviewed, but not runtime-tested, but should fix the arm.arm
and arm.armeb worlds that have been broken for a while.
r331340:
cxgbe(4): Tunnel congestion drops on a port should be cleared when the
stats for that port are cleared.
r331342:
cxgbe(4): Do not read MFG diags information from custom boards.
r331472:
cxgbe(4): Always initialize requested_speed to a valid value.
This fixes an avoidable EINVAL when the user tries to disable AN after
the port is initialized but l1cfg doesn't have a valid speed to use.
r332050:
cxgbe(4): Always display an error message if SIOCSIFFLAGS will leave
IFF_UP and IFF_DRV_RUNNING out of sync. ifhwioctl in the kernel pays no
attention to the return code from the driver ioctl during SIOCSIFFLAGS
so these messages are the only indication that the ioctl was called but
failed.
r333276:
cxgbe(4): Update all firmwares to 1.19.1.0.
r333448:
cxgbe(4): Disable write-combined doorbells by default.
This had been the default behavior but was changed accidentally as part
of the recent iw_cxgbe+OFED overhaul. Fix another bug in that change
while here: the global knob affects all the adapters in the system and
should be left alone by per-adapter code.
ae [Tue, 15 May 2018 11:43:05 +0000 (11:43 +0000)]
MFC r333244:
Immediately propagate EACCES error code to application from tcp_output.
In r309610 and r315514 the behavior of handling EACCES was changed, and
tcp_output() now returns zero when EACCES happens. The reason of this
change was a hesitation that applications that use TCP-MD5 will be
affected by changes in project/ipsec.
TCP-MD5 code returns EACCES when security assocition for given connection
is not configured. But the same error code can return pfil(9), and this
change has affected connections blocked by pfil(9). E.g. application
doesn't return immediately when SYN segment is blocked, instead it waits
when several tries will be failed.
Actually, for TCP-MD5 application it doesn't matter will it get EACCES
after first SYN, or after several tries. Security associtions must be
configured before initiating TCP connection.
I left the EACCES in the switch() to show that it has special handling.
Reported by: Andreas Longwitz <longwitz at incore dot de>
Approved by: re (marius)
hselasky [Tue, 15 May 2018 09:40:52 +0000 (09:40 +0000)]
MFC r333362:
Fix for missing network interface address event when adding the default IPv6
based link-local address.
The default link local address for IPv6 is added as part of bringing the
network interface up. Move the call to "EVENTHANDLER_INVOKE(ifaddr_event,)"
from the SIOCAIFADDR_IN6 ioctl(2) handler to in6_notify_ifa() which should
catch all the cases of adding IPv6 based addresses to a network interface.
Add a witness warning in case the event handler is not allowed to sleep.
Approved by: re (marius)
Reviewed by: network (ae), kib
Differential Revision: https://reviews.freebsd.org/D13407
Sponsored by: Mellanox Technologies
gonzo [Tue, 15 May 2018 02:26:50 +0000 (02:26 +0000)]
MFC r331906:
Approved by: re (gjb)
Fix accidental USB port resets by GPIO on Zynq/Zedboard boards
The Zynq/Zedboard GPIO driver attempts to tri-state all GPIO pins on
boot up but the order in which I reset the hardware can cause the pins
to be briefly held low before being tri-stated. This is a problem on
boards that use GPIO pins to reset devices.
In particular, the Zybo and ZC-706 boards use a GPIO pin as a USB PHY
reset. If U-boot enables the USB port before booting the kernel, the
GPIO driver attach causes a glitch on the USB PHY reset and the USB
port loses power. My fix is to have the GPIO driver leave the pins in
whatever configuration U-boot placed them.
PR: 225713
Submitted by: Thomas Skibo <thoma555-bsd@yahoo.com>
r329188: Use tabs in io.d, fix alignment issues, remove extra newlines
r329334: Add errno definitions to /usr/lib/dtrace/errno.d
r329353: Add inline to errno.d for translating int to string
r329914: Updates and enhancements to io.d to aid DTrace scripting
r329995: Updates and enhancements to signal.d to aid DTrace scripting
r329996: Consistent casing for fallback SIGCHLD (s/Unknown/unknown/)
r330559: Introduce dwatch(1) as a tool for making DTrace more useful
r330560: Bump dwatch(1) internal version from 1.0-beta-91 to 1.0
r330672: Fix display of wrong pid from dtrace_sched(4)
r332865: Add `-dev' option to aid debugging of profiles
r332866: Add profile for send(2)/recv(2) syscalls
r332867: Remove the line used to demonstrate `-dev' option
r333513: Bugfix, usage displayed with `-1Q'
r333514: Separate default values so `-[BK] num' don't affect usage
r333515: Simplify info message test
r333516: Export ARGV to profiles loaded via load_profile()
r333517: Allow `-E code' to override profile EVENT_DETAILS
r333518: Expose process for ip/tcp/udp
r333519: Refactor sendrecv profile
gjb [Mon, 14 May 2018 17:43:43 +0000 (17:43 +0000)]
MFC r333473:
Add a special GCE_LICENSE variable to Makefile.gce, which when set,
will include license metadata in the resultant GCE image.
GCE_LICENSE is unset by default, as it primarily pertains to images
produced by the FreeBSD Project, but for downstream FreeBSD consumers,
it can be set in the make(1) environment in the format of:
The "license" is not a license, per se, but required metadata that
is required by the GCE marketplace. For the FreeBSD Project, the
license name is simply 'freebsd', with the description of 'FreeBSD'.
Approved by: re (marius)
Sponsored by: The FreeBSD Foundation
trasz [Mon, 14 May 2018 15:35:54 +0000 (15:35 +0000)]
MFC r333493:
Set kldxref_enable="YES" for ARM images. Without it, the images are missing
the /boot/kernel/linker.hints file, which breaks loading some of the modules
with dependencies, eg cfiscsi.ko.
This is a minimal fix for ARM images, in order to safely MFC it before
11.2-RELEASE. Afterwards, however, I believe we should actually just change
the default (as in, etc/defaults/rc.conf). The reason is that it's required
for every image that's being cross-built, as kldxref(1) cannot handle files
for non-native architectures. For the one that is not - amd64 - having it
on by default doesn't change anything - the script is noop if the linker.hints
already exists.
The long-term solution would be to rewrite kldxref(1) to handle other
architectures, and generate linker.hints at build time.
Approved by: re (marius@)
Sponsored by: DARPA, AFRL