mmacy [Wed, 9 May 2018 17:48:52 +0000 (17:48 +0000)]
Remove bogus panic
r333345 added a panic to the default case statement on the incorrect
premise that it should "never happen" when in fact it is simply a
different adapter version.
imp [Wed, 9 May 2018 14:11:35 +0000 (14:11 +0000)]
Minor style nits
Use full copyright year.
Remove 'All Rights Reserved' from new file (rights holder OK'd)
Minor #ifdef motion and #endif tagging
Remove __FBSDID macro from comments
kib [Wed, 9 May 2018 12:09:08 +0000 (12:09 +0000)]
Remove PG_U from the rest of the kernel pmap ptes.
Supposedly, they PG_U bits there were set to easier making some kernel
page accessible to userspace in-place. Since it was not used for the
whole existence of the amd64 pmap.c and current design of the shared
pages prefers double-mapping over the in-place access, remove PG_U
both from the direct map and KVA slots.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Wed, 9 May 2018 12:03:40 +0000 (12:03 +0000)]
Remove PG_U from the recursive pte for kernel pmap' PML4 page.
This PML4 page is never used for the userspace process, so there is no
security implications. But the configuration trips SMAP check, which
should be corrected.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Bring in some last changes in NAT64 implementation:
o Modify ipfw(8) to be able set any prefix6 not just Well-Known,
and also show configured prefix6;
o relocate some definitions and macros into proper place;
o convert nat64_debug and nat64_allow_private variables to be
VNET-compatible;
o add struct nat64_config that keeps generic configuration needed
to NAT64 code;
o add nat64_check_prefix6() function to check validness of specified
by user IPv6 prefix according to RFC6052;
o use nat64_check_private_ip4() and nat64_embed_ip4() functions
instead of nat64_get_ip4() and nat64_set_ip4() macros. This allows
to use any configured IPv6 prefixes that are allowed by RFC6052;
o introduce NAT64_WKPFX flag, that is set when IPv6 prefix is
Well-Known IPv6 prefix. It is used to reduce overhead to check this;
o modify nat64lsn_cfg and nat64stl_cfg structures to use nat64_config
structure. And respectivelly modify the rest of code;
o remove now unused ro argument from nat64_output() function;
o remove __FreeBSD_version ifdef, NAT64 was not merged to older versions;
o add commented -DIPFIREWALL_NAT64_DIRECT_OUTPUT flag to module's Makefile
as example.
emaste [Wed, 9 May 2018 11:17:01 +0000 (11:17 +0000)]
lld: Omit PT_NOTE for SHT_NOTE without SHF_ALLOC
A non-alloc note section should not have a PT_NOTE program header.
Found while linking ghc (Haskell compiler) with lld on FreeBSD. Haskell
emits a .debug-ghc-link-info note section (as the name suggests, it
contains link info) as a SHT_NOTE section without SHF_ALLOC set.
For this case ld.bfd does not emit a PT_NOTE segment for
.debug-ghc-link-info. lld previously emitted a PT_NOTE with p_vaddr = 0
and FreeBSD's rtld segfaulted when trying to parse a note at address 0.
kib [Wed, 9 May 2018 10:28:24 +0000 (10:28 +0000)]
Created static libc PIC/no-SSP library to be used by rtld.
Rtld is not compatible with SSP, and since we link libc_pic.a to rtld
to have the basic support like memory and string copy functions, we
have to both carefully limit libc use, and to provide the ssp support
shims. This change makes the libc use in rtld more straighforward but
still limited, and allows to remove the shims, to be done in the next
commit.
eadler [Wed, 9 May 2018 07:46:57 +0000 (07:46 +0000)]
enigma(1) Remove reference to PGP; modernize a bit
- the port was removed 2017-06-07 in r442847
- gnupg1 is the older version of gpg with legacy PGP support
- remove unused macro
- remove now-false statement about export restrictions
These filters reside in the card's memory instead of its TCAM and can be
configured via a new "hashfilter" subcommand in cxgbetool. Hash and
normal TCAM filters can be used together. The hardware does an
exact-match of packet fields for hash filters, unlike the masked match
performed for TCAM filters. Any T5/T6 card with memory can support at
least half a million hash filters. The sample config file with the
driver configures 512K of these, it is possible to double this to 1
million+ in some cases.
The chip does an exact-match of fields of incoming datagrams with hash
filters and performs the action configured for the filter if it matches.
The fields to match are specified in a "filter mask" in the firmware
config file. The filter mask always includes the 5-tuple (sip, dip,
sport, dport, ipproto). It can, optionally, also include any subset of
the filter mode (see filterMode and filterMask in the firmware config
file).
Exact values of the 5-tuple, the physical port, and VLAN tag would have
to be provided while setting up a hash filter with the chip
configuration above.
Hash filters support all actions supported by TCAM filters. A packet
that hits a hash filter can be dropped, let through (with optional
steering to a specific queue or RSS region), switched out of another
port (with optional L2 rewrite of DMAC, SMAC, VLAN tag), or get NAT'ed.
(Support for some of these will show up in the driver in a follow-up
commit very shortly).
imp [Wed, 9 May 2018 02:02:49 +0000 (02:02 +0000)]
Remove 'All Rights Reserved' from the collection copyright and templates.
The original Berkeley Software Distributions were made in the 1980's
and 1990's. At that time, the Buenos Ares Convention of 1910 was in
force in most of the countries in the Americas. It required an
affirmative statement of rights reservation, typically using 'All
Rights Reserved.' The Regents included this phrase in their copyright
notices to invoke this treaty to ensure maximal copyright protection.
In the 1990's, Latin America coutries ratifeid the Berne Convention on
copyrights which prohibited them from requiring an affirmative
statement to reserve the rights. When Nicaragua ratified in 2000, the
Buenos Ares Convention of 1910 was effectively repealed. This made all
the 'All Rights Reserved' phrases obsolete and legal deadweight most
of the time, and certainly in the cases removed here.
Since it's no longer required, and is in fact meaningless, core has
decided to dropped it from the project's collection copyright and
sample templates. It encourages other rights holders to do the same
after consultation with their legal department.
More see https://en.wikipedia.org/wiki/Buenos_Aires_Convention for
more information.
mmacy [Wed, 9 May 2018 00:00:47 +0000 (00:00 +0000)]
Reduce overhead of ktrace checks in the common case.
KTRPOINT() checks both if we are tracing _and_ if we are recursing within
ktrace. The second condition is only ever executed if ktrace is actually
enabled. This change moves the check out of the hot path in to the functions
themselves.
des [Tue, 8 May 2018 23:13:11 +0000 (23:13 +0000)]
Upgrade to OpenSSH 7.6p1. This will be followed shortly by 7.7p1.
This completely removes client-side support for the SSH 1 protocol,
which was already disabled in 12 but is still enabled in 11. For that
reason, we will not be able to merge 7.6p1 or newer back to 11.
imp [Tue, 8 May 2018 20:02:44 +0000 (20:02 +0000)]
Remove ignored command line options
The --device and --part command line options were planned for Linux
compatibility mode. However, that mode will never happen, so remove
them as last vestiges of a false start.
imp [Tue, 8 May 2018 19:43:57 +0000 (19:43 +0000)]
Improve printing the boot variables.
Print the boot variables in the order in the BootOrder variable, if it
exists, and then in verbose mode print any unreferneced BootXXXX
variables. If BootOrder isn't set, fall back to printing all the
variables.
tuexen [Tue, 8 May 2018 18:48:51 +0000 (18:48 +0000)]
When reporting ERROR or ABORT chunks, don't use more data
that is guaranteed to be contigous.
Thanks to Felix Weinrank for finding and reporting this bug
by fuzzing the usrsctp stack.
shurd [Tue, 8 May 2018 17:15:10 +0000 (17:15 +0000)]
iflib: print message when iflib_tx_structures_setup fails
Print a message when iflib_tx_structures_setup fails, like we do for
iflib_rx_structures_setup.
Now that we always print a message from within
iflib_qset_structures_setup when it fails, stop printing one in
iflib_device_register() at the call site.
Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: gallatin
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D15300
kib [Tue, 8 May 2018 17:00:34 +0000 (17:00 +0000)]
Prepare DB# handler for deferred trigger of watchpoints.
Since pop %ss/mov %ss instructions defer all interrupts and exceptions
for the next instruction, it is possible that the userspace watchpoint
trap executes on the first instruction of the kernel entry for
syscall/bpt.
In this case, DB# should be treated similarly to NMI: on amd64 we must
always load GSBASE even if the trap comes from kernel mode, and load
the kernel page table root into %cr3. Moreover, the trap must
use the dedicated stack, because we are still on the user stack when
trapped on syscall entry.
For i386, we must reload %cr3. The syscall instruction is not configured,
so there is no issue with executing on user stack when trapping.
Due to some CPU erratas it is not always possible to detect that the
userspace watchpoint triggered by inspecting %dr6. In trap(), compare the
trap %rip with the known unsafe entry points and if matched pretend that
the watchpoint did not fire at all.
Thank you to the MSRC Incident Response Team, and in particular Greg
Lenti and Nate Warfield, for coordinating the response to this issue
across multiple vendors.
Thanks to Computer Recycling at The Working Center of Kitchener for
making hardware available to allow us to test the patch on additional
CPU families.
Reviewed by: jhb
Discussed with: Matthew Dillon
Tested by: emaste
Sponsored by: The FreeBSD Foundation
Security: CVE-2018-8897
Security: FreeBSD-SA-18:06.debugreg
jhibbits [Tue, 8 May 2018 13:23:39 +0000 (13:23 +0000)]
Fix wrong cpu0 identification
Summary:
chrp_cpuref_init() was relying on the boot strap processor to be
the first child of /cpus. That was not always the case, specially
on pseries with FDT.
This change uses the "reg" property of each CPU instead and also
adds several sanity checks to avoid unexpected behavior (maybe
too many panics?).
The main observed symptom was interrupts being missed by the main
processor, leading to timeouts and the kernel aborting the boot.
hselasky [Tue, 8 May 2018 11:39:01 +0000 (11:39 +0000)]
Fix for missing network interface address event when adding the default IPv6
based link-local address.
The default link local address for IPv6 is added as part of bringing the
network interface up. Move the call to "EVENTHANDLER_INVOKE(ifaddr_event,)"
from the SIOCAIFADDR_IN6 ioctl(2) handler to in6_notify_ifa() which should
catch all the cases of adding IPv6 based addresses to a network interface.
Add a witness warning in case the event handler is not allowed to sleep.
kevans [Tue, 8 May 2018 03:53:46 +0000 (03:53 +0000)]
bsdgrep: Allow "-" to be passed to -f to mean "standard input"
A version of this patch was originally sent to me by se@, matching behavior
from newer versions of GNU grep.
While there have been some differences of opinion on whether stdin should be
closed or not after depleting it in process of -f, I've opted to leave stdin
open and just let the later matching stuff fail and result in a no-match.
I'm not married to the current behavior- it was generally chosen since we
are adopting this in particular from GNU grep, and I would like to stay
consistent without a strong argument to the contrary. The current behavior
isn't technically wrong, it's just fairly unfriendly to the developer-user
of grep that may not realize their usage is trivially invalid.
mmacy [Tue, 8 May 2018 02:22:34 +0000 (02:22 +0000)]
Fix spurious retransmit recovery on low latency networks
TCP's smoothed RTT (SRTT) can be much larger than an actual observed RTT. This can be either because of hz restricting the calculable RTT to 10ms in VMs or 1ms using the default 1000hz or simply because SRTT recently incorporated a larger value.
If an ACK arrives before the calculated badrxtwin (now + SRTT):
tp->t_badrxtwin = ticks + (tp->t_srtt >> (TCP_RTT_SHIFT + 1));
We'll erroneously reset snd_una to snd_max. If multiple segments were dropped and this happens repeatedly the transmit rate will be limited to 1MSS per RTO until we've retransmitted all drops.
mjg [Mon, 7 May 2018 22:29:32 +0000 (22:29 +0000)]
Avoid calls to syscall_thread_enter/exit for statically defined syscalls
The entire mechanism is rarely used and is quite not performant due to
atomci ops on the syscall table. It also has added overhead for completely
unrelated syscalls.
Reduce it by avoiding the func calls if possible (which consistutes vast
majority of cases).
Provides about 3% syscall rate speed up for getuid on Broadwell.
mjg [Mon, 7 May 2018 21:32:08 +0000 (21:32 +0000)]
amd64: stop asserting params != NULL in the syscall path
The parameter is effectively controllable by userspace. It does not matter
what it is set to as it is being passed to copyin - worst case the operation
will just fail.
While here stop computing it unless it is going to be used.
mjg [Mon, 7 May 2018 20:54:42 +0000 (20:54 +0000)]
amd64: fix up memset added in r333324
There was a missing trick expanding the passed pattern to a full word
by multiplication. As a side effect non-zero patterns would be
incorrectly laid down.
This stems from the use of rep stosq which is word-sized, while the passed
argument is byte-sized.
I initially repurposed memcpy into memset without taking this into account.
All but non-bzero testing was performed with a variant utilizing ERMS, i.e.
using only stosb which happens to not into the problem whatsoever. So my bad
twice.
Thanks to Oliver Pinter for noting the problem and providing a testcase.
oshogbo [Mon, 7 May 2018 20:38:09 +0000 (20:38 +0000)]
Introduce caph_enter and caph_enter_casper.
The caph_enter function should made it easier to sandbox application
and not force us to remember that we need to check errno on failure.
Another function is also checking if casper is present.
Reviewed by: emaste, cem (partially)
Differential Revision: https://reviews.freebsd.org/D14557
gallatin [Mon, 7 May 2018 18:11:22 +0000 (18:11 +0000)]
Fix an off-by-one error when deciding to request a tx interrupt
The canonical check for whether or not a ring is drainable is
TXQ_AVAIL() > MAX_TX_DESC() + 2. Use this same construct here,
in order to avoid a potential off-by-one error where we might otherwise
fail to request an interrupt.
gallatin [Mon, 7 May 2018 15:24:03 +0000 (15:24 +0000)]
Boost thread priority while changing CPU frequency
Boost the priority of user-space threads when they set
their affinity to a core to adjust its frequency. This avoids a situation
where a CPU bound kernel thread with the same affinity is running on a
down-clocked core, and will "block" powerd from up-clocking the core
until the kernel thread yields. This can lead to poor perfomance,
and to things potentially getting stuck on Giant.
mav [Mon, 7 May 2018 14:44:55 +0000 (14:44 +0000)]
Keep CARP state as INIT when net.inet.carp.allow=0.
Currently when net.inet.carp.allow=0 CARP state remains as MASTER, which is
not very useful (if there are other masters -- it can lead to split brain,
if there are none -- it makes no sense). Having it as INIT makes it clear
that carp packets are disabled.
Submitted by: wg
MFC after: 1 month
Relnotes: yes
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D14477
avg [Mon, 7 May 2018 12:22:25 +0000 (12:22 +0000)]
x86 cpususpend_handler: call wbinvd after setting suspend state bits
Without a subsequent wbinvd the changes to suspended_cpus (and
resuming_cpus) can be lost at least on AMD systems that use MOESI cache
coherency protocol. That can happen because one of APs ends up as an
Owner of the corresponding cache line(s) and the changes may never reach
the main memory before the AP is reset.
While here, move clearing of suspended_cpus a little bit earlier as the
fact of returning from savectx (with zero return value) means that the
CPU has fully restored it execution context.
Also, rework the comment that describes the need for resuming_cpus.
This change fixed suspend to RAM a previously broken AMD-based system.
manu [Mon, 7 May 2018 07:30:40 +0000 (07:30 +0000)]
clk: Add support for assigned-clock-rates
The properties 'assigned-clocks', 'assigned-clock-parents' and
'assigned-clock-rates' all work together.
'assigned-clocks' holds the list of clock for which we need to either
assign a new parent or a new frequency.
The old code just supported assigning a new parents, add support for
assigning a new frequency too.
manu [Mon, 7 May 2018 07:28:10 +0000 (07:28 +0000)]
arm64: rk: Add support for setting pll rate
Add support for setting pll rate. On RockChip SoC two kind of plls are
supported, integer mode and fractional mode.
The two modes are intended to support more frequencies for the core plls.
While here change the recalc method as it appears that the datasheet is
wrong on the calculation method.
manu [Mon, 7 May 2018 07:26:48 +0000 (07:26 +0000)]
arm64: rockchip: rk3328: Add armclk clock
Add the clock definition for the arm clock.
While here remove the indexes in the clock table as we will need clock
with a 0 index (non-exported clocks).
pfg [Sun, 6 May 2018 21:29:29 +0000 (21:29 +0000)]
msdosfs: use vfs_timestamp() to generate timestamps instead of getnanotime().
Most filesystems, with the notable exceptions of msdosfs and autofs use
only vfs_timestamp() to read the current time. This has the benefit of
configurable granularity (using the vfs.timestamp_precision sysctl).
mmacy [Sun, 6 May 2018 20:34:13 +0000 (20:34 +0000)]
r333175 introduced deferred deletion of multicast addresses in order to permit the driver ioctl
to sleep on commands to the NIC when updating multicast filters. More generally this permitted
driver's to use an sx as a softc lock. Unfortunately this change introduced a race whereby a
a multicast update would still be queued for deletion when ifconfig deleted the interface
thus calling down in to _purgemaddrs and synchronously deleting _all_ of the multicast addresses
on the interface.
Synchronously remove all external references to a multicast address before enqueueing for delete.
mmacy [Sun, 6 May 2018 20:32:47 +0000 (20:32 +0000)]
The ifnet pointer (ifp) in rt_newaddrmsg can be valid without ifp->if_addr being set if
if the ifnet is still live by way of a reference but
in line for deletion. Check ifp->if_addr before dereferencing.
sbruno [Sun, 6 May 2018 16:22:02 +0000 (16:22 +0000)]
Cleanup sundry clang warnings for code that is not upstream in illumos.
https://github.com/illumos/illumos-gate/edit/master/usr/src/lib/libzfs/common/libzfs_sendrecv.c
Patch our version of it to quiesce warnings until someone decides to sync
up our code:
libzfs_sendrecv.c:2555:30: warning: format specifies type 'unsigned long'
but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
sprintf(guidname, "%lu", thisguid);
~~~ ^~~~~~~~
%llu
libzfs_sendrecv.c:2612:29: warning: format specifies type 'unsigned long'
but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
sprintf(guidname, "%lu", parent_fromsnap_guid);
~~~ ^~~~~~~~~~~~~~~~~~~~
%llu
libzfs_sendrecv.c:2645:29: warning: format specifies type 'unsigned long'
but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
sprintf(guidname, "%lu", parent_fromsnap_guid);
~~~ ^~~~~~~~~~~~~~~~~~~~
%llu
manu [Sun, 6 May 2018 14:37:11 +0000 (14:37 +0000)]
am335x_prcm: Delay the frequencies read check
With Linux 4.17 dts the compatible for the prcm added 'simplebus' we mean
that the simplebus driver will attach to it at the BUS_PASS_BUS pass.
Change the pass for the prcm driver to be at BUS_PASS_BUS so we will win
the attach.
This introduce a problem as this driver needs the ti_scm one to be already
attached. ti_scm also attach at BUS_PASS_BUS but after the prcm one as it is
after in the dtb and the simplebus driver simpy walk the tree to attach it's
children.
Use the bus_new_pass method to defer the frequencies read at BUS_PASS_TIMER.
This fixes booting on BeagleBone*
markj [Sun, 6 May 2018 00:38:29 +0000 (00:38 +0000)]
Import the netdump client code.
This is a component of a system which lets the kernel dump core to
a remote host after a panic, rather than to a local storage device.
The server component is available in the ports tree. netdump is
particularly useful on diskless systems.
The netdump(4) man page contains some details describing the protocol.
Support for configuring netdump will be added to dumpon(8) in a future
commit. To use netdump, the kernel must have been compiled with the
NETDUMP option.
The initial revision of netdump was written by Darrell Anderson and
was integrated into Sandvine's OS, from which this version was derived.
Reviewed by: bdrewery, cem (earlier versions), julian, sbruno
MFC after: 1 month
X-MFC note: use a spare field in struct ifnet
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D15253
markj [Sun, 6 May 2018 00:22:38 +0000 (00:22 +0000)]
Refactor some of the MI kernel dump code in preparation for netdump.
- Add clear_dumper() to complement set_dumper().
- Drain netdump's preallocated mbuf pool when clearing the dumper.
- Don't do bounds checking for dumpers with mediasize 0.
- Add dumper callbacks for initialization for writing out headers.
markj [Sun, 6 May 2018 00:19:48 +0000 (00:19 +0000)]
Add an mbuf allocator for netdump.
The aim is to permit mbuf allocations after a panic without calling into
the page allocator, without imposing any runtime overhead during regular
operation of the system, and without modifying driver code. The approach
taken is to preallocate a number of mbufs and clusters, storing them
in linked lists, and using the lists to back some UMA cache zones. At
panic time, the mbuf and cluster zone pointers are overwritten with
those of the cache zones so that the mbuf allocator returns
preallocated items.
Using this scheme, drivers which cache mbuf zone pointers from
m_getzone() require special handling when implementing netdump support.