nwhitehorn [Fri, 24 Nov 2017 15:48:17 +0000 (15:48 +0000)]
Switch the default firmware for npe(4) from the QOS_VLAN one to the
plain-vanilla ETH microcode. The QOS_VLAN firmware added support in microcode
for handling IEEE 802.1q tags, but the npe(4) driver did not actually
support the relevant signalling. As a result, it was impossible to use
VLANs with npe(4). Switching to the more basic microcode (same license)
removes the on-NIC promisisng and makes vlan(4) work on both NPE interfaces.
3) libibmad was cloned from git://git.openfabrics.org/~iraweiny/libibmad.git
Tag 1.3.13 with some additional patches from Mellanox.
4) infiniband-diags was cloned from git://git.openfabrics.org/~iraweiny/infiniband-diags.git
Tag 1.6.7 with some additional patches from Mellanox.
NOTES:
======
1) The mthca driver has been removed in kernel and in userspace.
2) All GPLv2 only sources have been removed and where applicable
rewritten from scratch under a BSD license.
3) List of fully supported drivers in userspace and kernel:
a) iw_cxgbe (Chelsio)
b) mlx4ib (Mellanox)
c) mlx5ib (Mellanox)
4) WITH_OFED=YES is still required by make in order to build
OFED userspace and kernel code.
5) Full support has been added for routable RoCE, RoCE v2.
ed [Fri, 24 Nov 2017 14:02:32 +0000 (14:02 +0000)]
Pick the right vDSO file/linker flags when building cloudabi32.ko on ARM64.
The recently imported cloudabi_vdso_armv6_on_64bit.S should be the vDSO
for 32-bit processes when being run on FreeBSD/arm64. This vDSO ensures
that all system call arguments are padded to 64 bits, so that they can
be used by the kernel to call into most of the native implementations
directly.
ed [Fri, 24 Nov 2017 13:51:59 +0000 (13:51 +0000)]
Set CP15BEN in SCTLR to make memory barriers work in 32-bit mode.
Binaries generated by Clang for ARMv6 may contain these instructions:
MCR p15, 0, <Rd>, c7, c10, 5
These instructions are deprecated as of ARMv7, which is why modern
processors have a way of toggling support for them. On FreeBSD/arm64 we
currently disable support for these instructions, meaning that if 32-bit
executables with these instructions are run, they would crash with
SIGILL. This is likely not what we want.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D13145
ed [Fri, 24 Nov 2017 13:50:53 +0000 (13:50 +0000)]
Add rudimentary support for building FreeBSD/arm64 with COMPAT_FREEBSD32.
Right now I'm using two Raspberry Pi's (2 and 3) to test CloudABI
support for armv6, armv7 and aarch64. It would be nice if I could
restrict this to just a single instance when testing smaller changes.
This is why I'd like to get COMPAT_CLOUDABI32 to work on arm64.
As COMPAT_CLOUDABI32 depends on COMPAT_FREEBSD32, at least for the ELF
loading, this change adds all of the bits necessary to at least build a
kernel with COMPAT_FREEBSD32. All of the machine dependent system calls
are still stubbed out, for the reason that implementations for these are
only useful if actual support for running FreeBSD binaries is added.
This is outside the scope of this work.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D13144
avg [Fri, 24 Nov 2017 11:35:43 +0000 (11:35 +0000)]
amd-vi: small improvements to event printing
Ensure that an opening bracket always has a matching closing one.
Ensure that there is always a new-line at the end of a report line.
Also, add a space before the printed event flag.
avg [Fri, 24 Nov 2017 11:20:10 +0000 (11:20 +0000)]
amd-vi: fix and extend definition of Command and Event Status Register (0x2020)
The defined bits are the lower bits, not the higher ones.
Also, the specification has been extended to define bits 0:18 and they
all could potentially be interesting to us, so extend the width of the
field accordingly.
avg [Fri, 24 Nov 2017 11:10:36 +0000 (11:10 +0000)]
vmm/amd: improve iteration over IVHD (type 10h) entries in IVRS table
Many 8-byte entries have zero at byte 4, so the second 4-byte part is
skipped as a 4-byte padding entry. But not all 8-byte entries have that
property and they get misinterpreted.
A real example:
48 00 00 00 ff 01 00 01
This an 8-byte ACPI_IVRS_TYPE_SPECIAL entry for IOAPIC with ID 255 (bogus).
It is reported as:
ivhd0: Unknown dev entry:0xff
Fortunately, it was completely harmless.
Also, bail out early if we encounter an entry of a variable length type.
We do not have proper handling for those yet.
avg [Fri, 24 Nov 2017 10:45:33 +0000 (10:45 +0000)]
zdb: use a heap allocation instead of a huge array on stack
SPA_MAXBLOCKSIZE is 16 MB and having such a large object on the stack is
not nice in general and it could cause some confusing failures in the
single-user mode where the default stack size of 8 MB is used.
I expect that the upstream would make the same change.
ed [Fri, 24 Nov 2017 07:35:08 +0000 (07:35 +0000)]
Don't let cpu_set_syscall_retval() clobber exec_setregs().
Upon successful completion, the execve() system call invokes
exec_setregs() to initialize the registers of the initial thread of the
newly executed process. What is weird is that when execve() returns, it
still goes through the normal system call return path, clobbering the
registers with the system call's return value (td->td_retval).
Though this doesn't seem to be problematic for x86 most of the times (as
the value of eax/rax doesn't matter upon startup), this can be pretty
frustrating for architectures where function argument and return
registers overlap (e.g., ARM). On these systems, exec_setregs() also
needs to initialize td_retval.
Even worse are architectures where cpu_set_syscall_retval() sets
registers to values not derived from td_retval. On these architectures,
there is no way cpu_set_syscall_retval() can set registers to the way it
wants them to be upon the start of execution.
To get rid of this madness, let sys_execve() return EJUSTRETURN. This
will cause cpu_set_syscall_retval() to leave registers intact. This
makes process execution easier to understand. It also eliminates the
difference between execution of the initial process and successive ones.
The initial call to sys_execve() is not performed through a system call
context.
imp [Fri, 24 Nov 2017 05:01:00 +0000 (05:01 +0000)]
Mark the func pointer as __dead2. It looks up loader_main, which
either aborts or exits, but never returns. Tag it as a non-returning
function rather than supply a bogus return(0) at the end of main.
imp [Fri, 24 Nov 2017 05:00:25 +0000 (05:00 +0000)]
Fix theoretical integer overflow issues. If the product here is
greater than 2^31-1, then the result will be huge. This is unlikely,
as we don't support that many sections, but out of an abundace of
caution cast to size_t so the multiplication won't overflow
mysteriously when size_t is larger than 32-bits. The resulting code
may be a smidge larger, but this isn't super-space critical code.
avg [Thu, 23 Nov 2017 22:10:12 +0000 (22:10 +0000)]
vmrun.sh: add -A option for AHCI emulation of disk devices
AHCI emulation is useful for testing scenarios closer to the real
hardware. For example, it allows to exercise the CAM subsystem.
There could be other uses as well.
andrew [Thu, 23 Nov 2017 17:40:40 +0000 (17:40 +0000)]
Ensure we check the program state set in the trap frame on arm and arm64.
This value may be set by userspace so we need to check it before using it.
If this is not done correctly on exception return the kernel may continue
in kernel mode with all registers set to a userspace controlled value. Fix
this by moving the check into set_mcontext, and also add the missing
sanitisation from the arm64 set_regs.
markj [Thu, 23 Nov 2017 14:29:07 +0000 (14:29 +0000)]
Duplicate helpers after disabling inherited tracepoints during a fork.
We may create probes in the nascent child process, so we first need to
ensure that any inherited tracepoints are first removed. Otherwise the
probe sites will not be in the state expected by fasttrap, and it won't
be able to enable the probes.
markj [Thu, 23 Nov 2017 14:07:52 +0000 (14:07 +0000)]
Allow kern.geom.mirror.debug to be negative.
A negative value can be used to suppress all prints from the gmirror
kernel code, which can be useful when attempting to trigger race
conditions using stress tests.
hselasky [Thu, 23 Nov 2017 13:57:44 +0000 (13:57 +0000)]
Make sure the iSCSI I/O limits are set properly so that the ISCSIDSEND IOCTL
can be used prior to the ISCSIDHANDOFF IOCTL which set the negotiated values.
Else the login PDU will fail when passing the "-r" option to "iscsictl" which
means iSCSI over RDMA instead of TCP/IP.
ae [Thu, 23 Nov 2017 08:02:02 +0000 (08:02 +0000)]
Modify ipfw's dynamic states KPI.
Hide the locking logic used in the dynamic states implementation from
generic code. Rename ipfw_install_state() and ipfw_lookup_dyn_rule()
function to have similar names: ipfw_dyn_install_state() and
ipfw_dyn_lookup_state(). Move dynamic rule counters updating to the
ipfw_dyn_lookup_state() function. Now this function return NULL when
there is no state and pointer to the parent rule when state is found.
Thus now there is no need to return pointer to dynamic rule, and no need
to hold bucket lock for this state. Remove ipfw_dyn_unlock() function.
kevans [Thu, 23 Nov 2017 05:54:04 +0000 (05:54 +0000)]
Allwinner a83t: add ccung bits
Upstream DTS has switched to using CCU rather than /clocks nodes. Add a CCU
driver for the a83t to bring us closer to upstream, but don't yet attach it
to ccu node.
Reviewed by: manu
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D12843
mjg [Thu, 23 Nov 2017 03:40:51 +0000 (03:40 +0000)]
sx: unbreak debug after r326107
An assertion was modified to use the found value, but it was not updated to
handle a race where blocked threads appear after the entrance to the func.
Move the assertion down to the area protected with sleepq lock where the
lock is read anyway. This does not affect coverage of the assertion and
is consistent with what rw locks are doing.
landonf [Wed, 22 Nov 2017 23:10:20 +0000 (23:10 +0000)]
bhnd(4): Add a basic ChipCommon GPIO driver sufficient to support bwn(4)
The driver is functional on both BHND Wi-Fi adapters and MIPS SoCs, but
does not currently include support for features not required by bwn(4),
including GPIO interrupt handling.
Approved by: adrian (mentor, implicit)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12708
landonf [Wed, 22 Nov 2017 20:27:46 +0000 (20:27 +0000)]
bhnd(4): extend the PMU APIs to support bwn(4)
The bwn(4) driver requires a number of extensions to the bhnd(4) PMU
interface to support external configuration of PLLs, LDOs, and other
parameters that require chipset or PHY-specific workarounds.
These changes add support for:
- Writing raw voltage register values to PHY-specific LDO regulator
registers (required by LP-PHY).
- Enabling/disabling PHY-specific LDOs (required by LP-PHY)
- Writing to arbitrary PMU chipctrl registers (required for common PHY PLL
reset support).
- Requesting chipset/PLL-specific spurious signal avoidance modes.
- Querying clock frequency and latency.
Additionally, rather than updating legacy PWRCTL support to conform to the
new PMU interface:
- PWRCTL API is now provided by a bhnd_pwrctl_if.m interface.
- Since PWRCTL is only found in older SSB-based chipsets, translation from
bhnd(4) bus APIs to corresponding PWRCTL operations is now handled
entirely within the siba(4) driver.
- The PWRCTL-specific host bridge clock gating APIs in bhnd_bus_if.m have
been lifted out into a standalone bhnd_pwrctl_hostb_if.m interface.
Approved by: adrian (mentor, implicit)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12664
kib [Wed, 22 Nov 2017 16:45:27 +0000 (16:45 +0000)]
Return different error code for the guard page layout violation.
On KERN_NO_SPACE error, as it is returned now, vm_map_find() continues
the loop searching for the suitable range for the requested mapping
with specific alignment. Since the vm_map_findspace() succesfully
finds the same place, the loop never ends.
The errors returned from vm_map_stack() completely repeat the behavior
of vm_map_insert() now, as suggested by Alan.
Reported by: Arto Pekkanen <aksyom@gmail.com>
PR: 223732
Reviewed by: alc, markj
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential revision: https://reviews.freebsd.org/D13186
alc [Wed, 22 Nov 2017 16:39:24 +0000 (16:39 +0000)]
When vm_map_find(find_space = VMFS_OPTIMAL_SPACE) fails to find space, a
second scan of the address space with find_space = VMFS_ANY_SPACE is
performed. Previously, vm_map_find() released and reacquired the map lock
between the first and second scans. However, there is no compelling
reason to do so. This revision modifies vm_map_find() to retain the map
lock.
markj [Wed, 22 Nov 2017 15:54:52 +0000 (15:54 +0000)]
Annotate pragma/err.invalidlibdep.ksh as EXFAIL.
The test creates a D library with a "depends_on library" pragma
referencing a non-existent library, and expects compilation to fail.
However, as far as I can tell, libdtrace is supposed simply abort
compilation of the library in this case, and continue. This behaviour
is desirable when adding libraries which depend on optional KLDs, for
example.
manu [Wed, 22 Nov 2017 15:27:47 +0000 (15:27 +0000)]
bsdinstall: Add ntpdate option
When you install a computer for the first time, the date in the CMOS sometimes
not accurate and you need to ntpdate as ntpd will fail a the time difference
is too big.
Add an option in bsdinstall to enable ntpdate that will do that for us.
emaste [Wed, 22 Nov 2017 15:18:11 +0000 (15:18 +0000)]
Fix indentation in bsdinstall-created wpa_supplicant.conf
r309934 cleaned up some cases in bsdinstall to use heredocs but broke
the indentation of the generated output, because <<- heredocs strip
leading tabs.
PR: 221982
Reviewed by: allanjude, dteske
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D13190
br [Wed, 22 Nov 2017 14:10:58 +0000 (14:10 +0000)]
o Invalidate the correct page in pmap_protect().
With this bug fix we don't need to invalidate all the entries.
o Remove a call to pmap_invalidate_all(). This was never called
as the anyvalid variable is never set.
tsoome [Wed, 22 Nov 2017 08:48:00 +0000 (08:48 +0000)]
loader.efi: efipart does not recognize partitionless disks
Rework the block device handle check to allow more robust device
classification. This is mostly usability issue - it can be quite confusing
for user when no disks are listed with lsdev.
kevans [Wed, 22 Nov 2017 03:44:19 +0000 (03:44 +0000)]
patch(1): don't assume a match if we run out of context to check
Patches with very little context (-U0 and -U1) could get misapplied if
the file to be patched changes and a hunk is no longer applicable. Matching
with fuzz would be attempted and default to a match when we unexpectedly ran
out of context.
This also affected patches with higher levels of context but had limited
actual context due to the hunk being located near the beginning/end of file.
landonf [Tue, 21 Nov 2017 23:25:22 +0000 (23:25 +0000)]
bhnd(4): Add support for querying DMA address translation parameters
BHND Wi-Fi chipsets and SoCs share a common DMA engine, operating within
backplane address space. To support host DMA on Wi-Fi chipsets, the bridge
core maps host address space onto the backplane; any host addresses must
be translated to their corresponding backplane address.
- Defines a new bhnd_get_dma_translation(9) API to support querying DMA
address translation parameters from the bhnd(4) bus.
- Extends bhndb(4) to provide DMA translation descriptors from a DMA
address translation table defined in the host bridge-specific
bhndb_hwcfg.
- Defines bhndb(4) DMA address translation tables for all supported host
bridge cores.
- Extends mips/broadcom's bhnd_nexus driver to return an identity (no-op)
DMA translation descriptor; no translation is required when addressing
the SoC backplane.
Approved by: adrian (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12582
landonf [Tue, 21 Nov 2017 23:15:20 +0000 (23:15 +0000)]
bhnd(4): implement MIPS and PCI(e) interrupt support
On BHND MIPS SoCs, this replaces the use of hard-coded MIPS IRQ#s in the
common bhnd(4) core drivers; we now register an INTRNG child PIC that
handles routing of backplane interrupt vectors via the MIPS core.
On BHND PCI devices, backplane interrupt vectors are now routed to the
PCI/PCIe host bridge core when bus_setup_intr() is called, where they are
dispatched by the PCI core via a host interrupt (e.g. INTx/MSI).
The bhndb(4) bridge driver tracks registered interrupt handlers for the
bridged bhnd(4) devices and manages backplane interrupt routing, while
delegating actual bus interrupt setup/teardown to the parent bus on behalf
of the bridged cores.
Approved by: adrian (mentor, implicit)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12518
ed [Tue, 21 Nov 2017 20:46:21 +0000 (20:46 +0000)]
Import the latest CloudABI definitions, v0.18.
In addition to some small style fixes to the ARMv6 vDSO, this release
includes a new vDSO that can be used for the execution of ARMv6/ARMv7
code on 64-bit platforms.
Just like for i686 on x86-64, this new vDSO is responsible for padding
arguments and return values to 64-bit values, so that the kernel can
easily forward system calls to the native system calls.
emaste [Tue, 21 Nov 2017 20:31:54 +0000 (20:31 +0000)]
filter all passwords (not only changed) from periodic passwd backup
The periodic 200.backup-passwd script outputs any differences it finds
in master.passwd, relative to the previous backup. It intends to elide
the encrypted password field, but previously did so only for changed
lines (i.e., those beginning with - or + in the diff).
Apply the sed expression also to unchanged lines to also elide their
passwords.
PR: 223461
Reported by: Andre Albsmeier
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
kib [Tue, 21 Nov 2017 19:55:32 +0000 (19:55 +0000)]
systat: use and correctly display 64bit counters.
Following struct vmtotal changes, make systat use and correctly
display 64-bit counters. Switch to humanize_number(3) to overcome
homegrown arithmetics limits in pretty printing large numbers. Use
1024 as a divisor for memory fields to make it consistent with other
tools and users expectations.
imp [Tue, 21 Nov 2017 19:23:20 +0000 (19:23 +0000)]
Unbreak riscv build in universe.
riscv doesn't have -msoft-float. For the moment, just don't add
anything. There's no /boot/loader or other bootstrap contained in the
tree for riscv*. However, with real hardware coming next year, there
are plans for one, so keep building at least a minimal libsa and
ficl to prevent bitrot.
avg [Tue, 21 Nov 2017 18:28:14 +0000 (18:28 +0000)]
zfs_write: fix problem with writes appearing to succeed when over quota
The problem happens when the writes have offsets and sizes aligned with
a filesystem's recordsize (maximum block size). In this scenario
dmu_tx_assign() would fail because of being over the quota, but the uio
would already be modified in the code path where we copy data from the
uio into a borrowed ARC buffer. That makes an appearance of a partial
write, so zfs_write() would return success and the uio would be modified
consistently with writing a single block.
That bug can result in a data loss because the writes over the quota
would appear to succeed while the actual data is being discarded.
This commit fixes the bug by ensuring that the uio is not changed until
after all error checks are done. To achieve that the code now uses
uiocopy() + uioskip() as in the original illumos design. We can do that
now that uiocopy() has been updated in r326067 to use
vn_io_fault_uiomove().
imp [Tue, 21 Nov 2017 18:03:47 +0000 (18:03 +0000)]
Fix gptzfsboot for cases with GELI.
HAVE_GPT isn't currently a thing, but HAVE_GELI is. Replace the former
with the latter and remove util.o from the build list (it's picked up
from libsa/libsa32, and that's OK).
gjb [Tue, 21 Nov 2017 18:02:18 +0000 (18:02 +0000)]
Remove /etc/resolv.conf from virtual machine images, which is
copied from the build host. It is renamed to /etc/resolv.conf.bak
on boot, so never used anyway.
Noticed by: peter
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
avg [Tue, 21 Nov 2017 18:01:43 +0000 (18:01 +0000)]
make illumos uiocopy use vn_io_fault_uiomove
uiocopy() is currently unused, its purpose is copy data from a uio
without modifying the uio. It was in use before the vn_io_fault support
was added to ZFS, at which point our code diverged from the illumos code
a little bit. Because ZFS is the only (potential) user of the function
we are free to modify it to better suit ZFS needs.
The intention behind this change is to remove the differences introduced
earlier in zfs_write().
While here, re-implement uioskip() using uiomove() with
uio_segflg == UIO_NOCOPY.
The story of uioskip is the same as with uiocopy.
andrew [Tue, 21 Nov 2017 17:23:16 +0000 (17:23 +0000)]
Add a driver for the EFI RTC. This uses the EFI Runtime Services to query
the system time.
As we seem to only read this time on boot, and this is the only source of
time on many arm64 machines we need to enable this by default there. As
this is not always the case with U-Boot firmware, or when we have been
booted from a non-UEFI environment we only enable the device driver when
the Runtime Services are present and reading the time doesn't result in an
error.
markj [Tue, 21 Nov 2017 16:03:21 +0000 (16:03 +0000)]
Refine symtab sorting in libproc.
Add some rules to more closely match what illumos does when an address
resolves to multiple symbols:
- prefer non-local symbols
- prefer symbols with fewer leading underscores and no leading '$'
hselasky [Tue, 21 Nov 2017 13:56:30 +0000 (13:56 +0000)]
Make sure all initialized mutexes are destroyed in the iser module,
else WITNESS will panic. Prefix all mutex names with "iser_" to
prevent future WITNESS issues.