jamie [Fri, 4 May 2018 20:54:27 +0000 (20:54 +0000)]
Make it easier for filesystems to count themselves as jail-enabled,
by doing most of the work in a new function prison_add_vfs in kern_jail.c
Now a jail-enabled filesystem need only mark itself with VFCF_JAIL, and
the rest is taken care of. This includes adding a jail parameter like
allow.mount.foofs, and a sysctl like security.jail.mount_foofs_allowed.
Both of these used to be a static list of known filesystems, with
predefined permission bits.
gjb [Fri, 4 May 2018 20:38:26 +0000 (20:38 +0000)]
Ensure the ports and src trees are available on GCE images,
satisfying a requirement to allow FreeBSD to be considered
a top-tier supported OS in Google Compute Engine.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
mmacy [Fri, 4 May 2018 19:31:28 +0000 (19:31 +0000)]
% WITHOUT_FORMAT_EXTENSIONS= XCC=/usr/local/bin/gcc8 make -j96 buildkernel KERNCONF=GENERIC-NODEBUG -s >& log
% grep "inlining failed" log | wc
234 3570 36065
Consensus on those polled is that inlining failure warnings are not useful
ian [Fri, 4 May 2018 19:28:05 +0000 (19:28 +0000)]
Properly support the GPIO_PIN_PRESET_{LOW,HIGH} options when configuring
a gpio pin. If neither of the options is specified, pre-set the pin's
output value to the pin's current input value, to achieve glitch-free
transitions to output mode on pins that are pulled up or down at reset
or via fdt pinctrl data.
markj [Fri, 4 May 2018 17:17:30 +0000 (17:17 +0000)]
Fix some races introduced in r332974.
With r332974, when performing a synchronized access of a page's "queue"
field, one must first check whether the page is logically dequeued. If
so, then the page lock does not prevent the page from being removed
from its page queue. Intoduce vm_page_queue(), which returns the page's
logical queue index. In some cases, direct access to the "queue" field
is still required, but such accesses should be confined to sys/vm.
ian [Fri, 4 May 2018 16:23:54 +0000 (16:23 +0000)]
Make reading imx6 gpio pins work correctly whether the pin is in open-drain
mode or not. An earlier attempt to make this work was done in r320456, by
always reading the pad status register (PSR) instead of the data register.
But it turns out the values in PSR only reflect the electrical level of an
output pin if the pad is configured with the SION (Set Input On) bit in the
pinmux config, and most output gpio pads are not configured that way.
So now a gpio read is done by returning the value from the data register,
which works right whether the pin is configured for input or output, unless
the pin has been set for OPENDRAIN mode, in which case the PSR is read
instead. For this to work, the pin must also be configured with SION turned
on in the fdt pinmux data, which is a reasonable thing to require for the
unusual case of reading an open-drain output pin.
shurd [Fri, 4 May 2018 15:20:34 +0000 (15:20 +0000)]
iflib: fix invalid free during queue allocation failure
In r301567, code was added to cleanup to prevent memory leaks for the
Tx and Rx ring structs. This code carefully tracked txq and rxq, and
made sure to free them properly during cleanup.
Because we assigned the txq and rxq pointers into the ctx->ifc_txqs and
ctx->ifc_rxqs, we carefully reset these pointers to NULL, so that
cleanup code would not accidentally free the memory twice.
This was changed by r304021 ("Update iflib to support more NIC designs"),
which removed this resetting of the pointers to NULL, because it re-used
the txq and rxq pointers as an index into the queue set array.
Unfortunately, the cleanup code was left alone. Thus, if we fail to
allocate DMA or fail to configure the queues using the drivers ifdi
methods, we will attempt to free txq and rxq. These variables would now
incorrectly point to the wrong location, resulting in a page fault.
There are a number of methods to correct this, but ultimately the root
cause was that we reuse the txq and rxq pointers for two different
purposes.
Instead, when allocating, store the returned pointer directly into
ctx->ifc_txqs and ctx->ifc_rxqs. Then, assign this to txq and rxq as
index pointers before starting the loop to allocate each queue.
Drop the cleanup code for txq and rxq, and only use ctx->ifc_txqs and
ctx->ifc_rxqs.
Thus, we no longer need to free txq or rxq under any error flow, and
intsead rely solely on the pointers stored in ctx->ifc_txqs and
ctx->ifc_rxqs. This prevents the invalid free(), and ensures that we
still properly cleanup after ourselves as before when failing to
allocate.
philip [Fri, 4 May 2018 10:52:17 +0000 (10:52 +0000)]
Point out that the tzdata 2018e update brings in negative DST for certain time
zones. This does not affect the vast majority of users who do not care about
(or even know about) the tm_isdst flag but may be slightly surprising to those
with a more specialised interest in time zone arcana.
Immediately propagate EACCES error code to application from tcp_output.
In r309610 and r315514 the behavior of handling EACCES was changed, and
tcp_output() now returns zero when EACCES happens. The reason of this
change was a hesitation that applications that use TCP-MD5 will be
affected by changes in project/ipsec.
TCP-MD5 code returns EACCES when security assocition for given connection
is not configured. But the same error code can return pfil(9), and this
change has affected connections blocked by pfil(9). E.g. application
doesn't return immediately when SYN segment is blocked, instead it waits
when several tries will be failed.
Actually, for TCP-MD5 application it doesn't matter will it get EACCES
after first SYN, or after several tries. Security associtions must be
configured before initiating TCP connection.
I left the EACCES in the switch() to show that it has special handling.
Reported by: Andreas Longwitz <longwitz at incore dot de>
MFC after: 10 days
mmacy [Fri, 4 May 2018 06:51:01 +0000 (06:51 +0000)]
`dup1_processes -t 96 -s 5` on a dual 8160
x dup_before
+ dup_after
+------------------------------------------------------------+
| x + |
|x x x x ++ ++|
| |____AM___| |AM||
+------------------------------------------------------------+
N Min Max Median Avg Stddev
x 5 1.514954e+08 1.5230351e+08 1.5206157e+08 1.5199371e+08 341205.71
+ 5 1.5494336e+08 1.5519569e+08 1.5511982e+08 1.5508323e+08 96232.829
Difference at 95.0% confidence
3.08952e+06 +/- 365604
2.03266% +/- 0.245071%
(Student's t, pooled s = 250681)
mjg [Fri, 4 May 2018 04:05:07 +0000 (04:05 +0000)]
amd64: get rid of the pessimized bcopy in syscall arg copy
The code was unnecessarily conditionally copying either 5 or 6 args.
It can blindly copy 6, which also means the size is known at compilation
time and the operation can be depessimized.
Note the entire syscall handling code is rather slow.
Tested on Skylake, sample result for getppid (calls/s):
without pti: 7310106 -> 10653569
with pti: 3304843 -> 4148306
Some syscalls (like read) did not note any difference, other have typically
very modest wins.
kevans [Fri, 4 May 2018 03:13:25 +0000 (03:13 +0000)]
bsdgrep: annihilate our in-tree TRE, previously disabled by default
It was an old TRE that had plenty of bugs and no performance gain over
regex(3). I disabled it by default in r323615, and there was some confusion
about what the knob does- likely due to poor naming on my part- to the tune
of "well, it sounds like it should speed things up" (mentioned by multiple
people).
To compound this, I have no intention of maintaining a second regex
implementation. If someone would like to step up and volunteer to maintain a
lean-and-mean implementation for grep, this is OK, but we have very few
volunteers to maintain even our primary regex implementation.
grehan [Fri, 4 May 2018 01:36:49 +0000 (01:36 +0000)]
Allow arbitrary numbers of columns for VNC server screen resolution.
The prior code only allowed multiples of 32 for the
numbers of columns. Remove this restriction to allow
a forthcoming UEFI firmware update to allow arbitrary
x,y resolutions.
(the code for handling rows already supported non mult-32 values)
grehan [Thu, 3 May 2018 22:51:44 +0000 (22:51 +0000)]
Allow PCI VGA devices to be detached.
GPUs often have a VGA PCI class code and are probed/attached
by the VGA driver. Allow them to be detached so they can
be presented as passthru devices to VM guests.
kib [Thu, 3 May 2018 21:45:59 +0000 (21:45 +0000)]
Add helper macros to hide some boring repeatable ceremonies to define
ifuncs on x86.
Also keep helpers to define 'pseudo-ifuncs' which are emulated by the
indirect jmp.
Reviewed by: jhb (previous version, as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D13838
kib [Thu, 3 May 2018 21:37:46 +0000 (21:37 +0000)]
Implement support for ifuncs in the kernel linker.
Required MD bits are only provided for x86.
Reviewed by: jhb (previous version, as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D13838
kevans [Thu, 3 May 2018 19:49:40 +0000 (19:49 +0000)]
Garbage collect the a83t emac overlays
The 4.16 DTS import brought in emac support for the a83t. Since these
boards' DTS is pulled from /boot and I forgot to hook these up to the build,
they should be fairly safe to go away.
The a83t-sid and h3-sid overlays are still relevant. a83t-sid will likely
come in with 4.18 DTS.
benno [Thu, 3 May 2018 17:52:40 +0000 (17:52 +0000)]
Add a stub manual page for iflib(9).
Currently 'man -k iflib' would find you the right pages for iflib
documentation, namely iflibdd(9) and iflibdi(9) but 'man iflib' would leave
you in the dark. This allows both approaches to find the relevant
documentation.
shurd [Thu, 3 May 2018 17:02:31 +0000 (17:02 +0000)]
Allow iflib NIC drivers to sleep rather than busy wait
Since the move to SMP NIC driver locking has had to go through serious
contortions using mtx around long running hardware operations. This moves
iflib past that.
Individual drivers may now sleep when appropriate.
avg [Thu, 3 May 2018 15:33:18 +0000 (15:33 +0000)]
amdsbwd: add suspend and resume methods
Without the suspend method the watchdog may fire in S1 state. Without
the resume method the watchdog is not re-enabled after returning from S3
state. I observe this on one of my systems.
Not sure if watchdog(4) should participate in the suspend actions.
Right now everything is up to individual drivers.
Marvell SoC identification function was called by SYSINIT on all armv7
platforms, which brakes platforms other than Marvell built with
GENERIC config. Fix this by shifting SoC identifying to Marvell platform
initialization.
Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: manu
Tested by: manu
Obtained from: Semihalf
Sponsored by: Stormshield
eadler [Thu, 3 May 2018 00:57:19 +0000 (00:57 +0000)]
[etc] Update newsyslog.conf default comment
Remove line about allowed flags. It was missing 'pRTY' and is duplicative
of the man page. It didn't describe the flags in any detail to help
remind users of how to configure newsylog.
emaste [Wed, 2 May 2018 23:43:33 +0000 (23:43 +0000)]
Build lld as long as we have a C++11 host compiler
As with Clang, build our toolchain components by default when the host
compiler is capable of doing so, to make them available for testing and
experimentation.
tuexen [Wed, 2 May 2018 22:11:16 +0000 (22:11 +0000)]
Send an ICMPv6 PacketTooBig message in case of forwading a packet which
is too big for the outgoing interface and no firewall is involed.
This problem was introduced in
https://svnweb.freebsd.org/changeset/base/324996
Thanks to Irene Ruengeler for finding the bug and testing the fix.
rmacklem [Wed, 2 May 2018 20:36:11 +0000 (20:36 +0000)]
Add two missing LIST_INIT()s.
This patch adds two missing LIST_INIT()s. Found by inspection.
In practice, these are currently no-ops, since the structure they are
in is malloc'd with M_ZERO and all LIST_INIT does is set the pointer
in the list head to NULL. (In other words, the M_ZERO has already
correctly initialized it.)
kib [Wed, 2 May 2018 20:22:03 +0000 (20:22 +0000)]
mlx5en: Always allow VLAN id 0.
According to the 802.1Q-2014 9.6 VLAN Tag Control Information, VID value 0
means that there is no VLAN tag assigned to the packet, and only PCP and
DEI values from the tag are meaningful. Current flow table programming
filter out such packets.
When programming VLAN filter for flow table, unconditionally add rule which
accept packets with VLAN id 0. The packets are already handled correctly
by the network stack.
mav [Wed, 2 May 2018 20:13:03 +0000 (20:13 +0000)]
Fix LOR between controller and queue locks.
Admin pass-through requests took controller lock before the queue lock,
but in case of request submission to a failed controller controller lock
was taken after the queue lock. Fix that by reducing the lock scopes and
switching to mtx_pool locks to track pass-through request completion.
gonzo [Wed, 2 May 2018 20:04:25 +0000 (20:04 +0000)]
Unbreak RaspberryPi 2 boot after r332839
r332839 changed number of cells per interrupt for local_intc from 1 to 2
to pass type of IRQ. Driver expected only 1 cell so after r332839
all interrupt children of local_intc failed to allocate IRQ resource.
Fix this regression by relaxing check for number of cells in interrupt
property to be either 1 or 2.
tuexen [Wed, 2 May 2018 19:36:46 +0000 (19:36 +0000)]
Fix in the documentation that the default hop limit is not 30, but
the value of the sysctl variable net.inet6.ip6.hlim.
This is true since
https://svnweb.freebsd.org/base?view=revision&revision=122574
The default of 30 (which was correct up to r122574) was incorrectly
documented in
https://svnweb.freebsd.org/base?view=revision&revision=130268
Thanks to Timo Voelker for makeing me aware of the inconsistency
between to code and the documentation.
shurd [Wed, 2 May 2018 19:36:29 +0000 (19:36 +0000)]
Separate list manipulation locking from state change in multicast
Multicast incorrectly calls in to drivers with a mutex held causing drivers
to have to go through all manner of contortions to use a non sleepable lock.
Serialize multicast updates instead.
grehan [Wed, 2 May 2018 17:41:00 +0000 (17:41 +0000)]
Use PCI power-mgmt to reset a device if FLR fails.
A large number of devices don't support PCIe FLR, in particular
graphics adapters. Use PCI power management to perform the
reset if FLR fails or isn't available, by cycling the device
through the D3 state.
This has been tested by a number of users with Nvidia and AMD GPUs.
Submitted and tested by: Matt Macy
Reviewed by: jhb, imp, rgrimes
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D15268
royger [Wed, 2 May 2018 10:19:17 +0000 (10:19 +0000)]
xen: fix gntdev
Current interface to the gntdev in FreeBSD is wrong, and mostly worked
out of luck before the PTI FreeBSD fixes, when kernel and user-space
where sharing the same page tables.
On FreeBSD ioctls have the size of the passed struct encoded in the
ioctl number, because the generic ioctl handler in the OS takes care
of copying the data from user-space to kernel space, and then calls
the device specific ioctl handler. Thus using ioctl structs with
variable sizes is not possible.
The fix is to turn the array of structs at the end of
ioctl_gntdev_alloc_gref and ioctl_gntdev_map_grant_ref into pointers,
that can be properly accessed from the kernel gntdev driver using the
copyin/copyout functions. Note that this is exactly how it's done for
the privcmd driver.
kevans [Wed, 2 May 2018 01:32:34 +0000 (01:32 +0000)]
cmp(1): Provide some long options
These match GNU cmp(1) for compatibility where applicable.
Future work might implement the -i option from GNU cmp(1) to express skip
either in terms of both files or of the form "SKIP1:SKIP2" rather than
specifying them as additional arguments to cmp(1).
scottl [Tue, 1 May 2018 21:42:27 +0000 (21:42 +0000)]
Refactor dadone(). There was no useful code sharing in it; it was just
a 1500 line switch statement. Callers now specify a discrete completion
handler, though they're still welcome to track state via ccb_state.
scottl [Tue, 1 May 2018 20:09:29 +0000 (20:09 +0000)]
cam_periph_runccb() changed several years ago to overwrite the ccb callback
pointer. It's now unhelpful and misleading for callers to continue to set
it, so bring all callers into conformance. There's no real functional change,
but it makes reading the code a lot less confusing.
erj [Tue, 1 May 2018 18:50:12 +0000 (18:50 +0000)]
ixl(4): Update to 1.9.9-k
Refresh upstream driver before impending conversion to iflib.
Major changes:
- Support for descriptor writeback mode (required by ixlv(4) for AVF support)
- Ability to disable firmware LLDP agent by user (PR 221530)
- Fix for TX queue hang when using TSO (PR 221919)
- Separate descriptor ring sizes for TX and RX rings
markj [Tue, 1 May 2018 17:32:43 +0000 (17:32 +0000)]
Print the dump progress indicator after calling dump_start().
Dumpers may wish to print messages from an initialization hook; this
change ensures that such messages aren't mixed with output from the
generic dump code.
emaste [Tue, 1 May 2018 16:30:48 +0000 (16:30 +0000)]
Retire lmc(4)
This driver supports legacy, 32-bit PCI devices, and had an ambiguous
license. Supported devices were already reported to be rare in 2003
(when an earlier version of the driver was removed in r123201).
Reviewed by: rgrimes
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D15245
imp [Tue, 1 May 2018 16:21:01 +0000 (16:21 +0000)]
Remove 'All Rights Reserved.' from all of my Copyrights in sys/arm and
always use 'M. Warner Losh' for consistency.
'All Rights Reserved.' was prescribed by the Buenos Aires Copyright
Convention of 1910, but has been mostly dead since the early 1990's
and completely meaningless since 2000 when Nicaragua ratified the
Berne convention.
Some files not done due to ambiguity of various types.
jhb [Tue, 1 May 2018 15:17:46 +0000 (15:17 +0000)]
Initial debug server for bhyve.
This commit adds a new debug server to bhyve. Unlike the existing -g
option which provides an efficient connection to a debug server
running in the guest OS, this debug server permits inspection and
control of the guest from within the hypervisor itself without
requiring any cooperation from the guest. It is similar to the debug
server provided by qemu.
To avoid conflicting with the existing -g option, a new -G option has
been added that accepts a TCP port. An IPv4 socket is bound to this
port and listens for connections from debuggers. In addition, if the
port begins with the character 'w', the hypervisor will pause the
guest at the first instruction until a debugger attaches and
explicitly continues the guest. Note that only a single debugger can
attach to a guest at a time.
Virtual CPUs are exposed to the remote debugger as threads. General
purpose register values can be read for each virtual CPU. Other
registers cannot currently be read, and no register values can be
changed by the debugger.
The remote debugger can read guest memory but not write to guest
memory. To facilitate source-level debugging of the guest, memory
addresses from the debugger are treated as virtual addresses (rather
than physical addresses) and are resolved to a physical address using
the active virtual address translation of the current virtual CPU.
Memory reads should honor memory mapped I/O regions, though the debug
server does not attempt to honor any alignment or size constraints
when accessing MMIO.
The debug server provides limited support for controlling the guest.
The guest is suspended when a debugger is attached and resumes when a
debugger detaches. A debugger can suspend a guest by sending a Ctrl-C
request (e.g. via Ctrl-C in GDB). A debugger can also continue a
suspended guest while remaining attached. Breakpoints are not yet
supported. Single stepping is supported on Intel CPUs that support
MTRAP VM exits, but is not available on other systems.
While the current debug server has limited functionality, it should
at least be usable for basic debugging now. It is also a useful
checkpoint to serve as a base for adding additional features.
cxgbe(4): Destroy the cdev before disabling interrupts in driver detach.
Filter work requests are submitted in the nexus cdev's ioctl which then
blocks waiting for a reply. If driver detach runs in this state and
disables interrupts the ioctl will never complete and detach will hang
in destroy_cdev.
manu [Tue, 1 May 2018 13:57:08 +0000 (13:57 +0000)]
uart_snps: Add early printf support
Move the allwinner early printf support to the snps driver as it
should work with all implementation.
While here add instruction for enabling it on 64bits SoCs.
jhibbits [Tue, 1 May 2018 04:31:17 +0000 (04:31 +0000)]
Remove dead errata fixup code
This code caused more problems than it should have fixed (boot failures) on
the machines I tested, so has been commented out for a while now. Remove
it, and assume the errata fixups were done by the bootloader where they
belong.
emaste [Tue, 1 May 2018 00:53:46 +0000 (00:53 +0000)]
pwd_mkdb: retire legacy v3 db support (-l option)
pwd_mkdb has emitted v4 password database records since 2003 (r113596)
in addition to v3, and as of r283981 by default it emitted only v4.
As described in r283981, retire the -l legacy option.
The -B and -L options were originally added to set the endianness of v3
records emitted by pwd_mkdb, but they also set the db hash endiannes and
so have been retained temporarily.
Announced on the FreeBSD-Current and FreeBSD-Stable lists. In stable/11
the man page contains a deprecation notice, and pwd_mkdb will emit a
deprecation notice if the -l option is specified.
Reviewed by: delphij, lidl, rgrimes
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D15144
This change allows clean device detach on attach failures and driver unload,
while previous code tried to talk to already shut down controller, or even
accessed resources failed to allocate.
Reserve 3b in the 14b atid to identify the owner and use it to dispatch
the CPL. This allows all CPLs that use an atid to be used as shared
CPLs, although ACT_OPEN_RPL is the only one being converted in this
revision.
Resume starts CPU from the init state, which clears any loaded
microcode updates. As result, IBRS MSRs are no longer available,
until the microcode is reloaded.
I have to forcibly clear cpu_stdext_feature3, which assumes that CPUID
leaf 7 reg %ebx does not report anything except Meltdown/Spectre bugs
bits. If future CPUs add new bits there, hw_ibrs_recalculate() and
identify_cpu1()/identify_cpu2() need to be adjusted for that.
Submitted and tested by: Michael Danilov <mike.d.ft402@gmail.com>
PR: 227866
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D15236
Previously it was possible to connect a socket (which had the
CAP_CONNECT right) by calling "connectat(AT_FDCWD, ...)" even in
capabilties mode. This combination should be treated the same as a call
to connect (i.e. forbidden in capabilities mode). Similarly for bindat.
Disable connectat/bindat with AT_FDCWD in capabilities mode, fix up the
documentation and add tests.