Ed Schouten [Fri, 24 Nov 2017 13:50:53 +0000 (13:50 +0000)]
Add rudimentary support for building FreeBSD/arm64 with COMPAT_FREEBSD32.
Right now I'm using two Raspberry Pi's (2 and 3) to test CloudABI
support for armv6, armv7 and aarch64. It would be nice if I could
restrict this to just a single instance when testing smaller changes.
This is why I'd like to get COMPAT_CLOUDABI32 to work on arm64.
As COMPAT_CLOUDABI32 depends on COMPAT_FREEBSD32, at least for the ELF
loading, this change adds all of the bits necessary to at least build a
kernel with COMPAT_FREEBSD32. All of the machine dependent system calls
are still stubbed out, for the reason that implementations for these are
only useful if actual support for running FreeBSD binaries is added.
This is outside the scope of this work.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D13144
Andriy Gapon [Fri, 24 Nov 2017 11:35:43 +0000 (11:35 +0000)]
amd-vi: small improvements to event printing
Ensure that an opening bracket always has a matching closing one.
Ensure that there is always a new-line at the end of a report line.
Also, add a space before the printed event flag.
Andriy Gapon [Fri, 24 Nov 2017 11:20:10 +0000 (11:20 +0000)]
amd-vi: fix and extend definition of Command and Event Status Register (0x2020)
The defined bits are the lower bits, not the higher ones.
Also, the specification has been extended to define bits 0:18 and they
all could potentially be interesting to us, so extend the width of the
field accordingly.
Andriy Gapon [Fri, 24 Nov 2017 11:10:36 +0000 (11:10 +0000)]
vmm/amd: improve iteration over IVHD (type 10h) entries in IVRS table
Many 8-byte entries have zero at byte 4, so the second 4-byte part is
skipped as a 4-byte padding entry. But not all 8-byte entries have that
property and they get misinterpreted.
A real example:
48 00 00 00 ff 01 00 01
This an 8-byte ACPI_IVRS_TYPE_SPECIAL entry for IOAPIC with ID 255 (bogus).
It is reported as:
ivhd0: Unknown dev entry:0xff
Fortunately, it was completely harmless.
Also, bail out early if we encounter an entry of a variable length type.
We do not have proper handling for those yet.
Andriy Gapon [Fri, 24 Nov 2017 10:45:33 +0000 (10:45 +0000)]
zdb: use a heap allocation instead of a huge array on stack
SPA_MAXBLOCKSIZE is 16 MB and having such a large object on the stack is
not nice in general and it could cause some confusing failures in the
single-user mode where the default stack size of 8 MB is used.
I expect that the upstream would make the same change.
Ed Schouten [Fri, 24 Nov 2017 07:35:08 +0000 (07:35 +0000)]
Don't let cpu_set_syscall_retval() clobber exec_setregs().
Upon successful completion, the execve() system call invokes
exec_setregs() to initialize the registers of the initial thread of the
newly executed process. What is weird is that when execve() returns, it
still goes through the normal system call return path, clobbering the
registers with the system call's return value (td->td_retval).
Though this doesn't seem to be problematic for x86 most of the times (as
the value of eax/rax doesn't matter upon startup), this can be pretty
frustrating for architectures where function argument and return
registers overlap (e.g., ARM). On these systems, exec_setregs() also
needs to initialize td_retval.
Even worse are architectures where cpu_set_syscall_retval() sets
registers to values not derived from td_retval. On these architectures,
there is no way cpu_set_syscall_retval() can set registers to the way it
wants them to be upon the start of execution.
To get rid of this madness, let sys_execve() return EJUSTRETURN. This
will cause cpu_set_syscall_retval() to leave registers intact. This
makes process execution easier to understand. It also eliminates the
difference between execution of the initial process and successive ones.
The initial call to sys_execve() is not performed through a system call
context.
Warner Losh [Fri, 24 Nov 2017 05:01:00 +0000 (05:01 +0000)]
Mark the func pointer as __dead2. It looks up loader_main, which
either aborts or exits, but never returns. Tag it as a non-returning
function rather than supply a bogus return(0) at the end of main.
Warner Losh [Fri, 24 Nov 2017 05:00:25 +0000 (05:00 +0000)]
Fix theoretical integer overflow issues. If the product here is
greater than 2^31-1, then the result will be huge. This is unlikely,
as we don't support that many sections, but out of an abundace of
caution cast to size_t so the multiplication won't overflow
mysteriously when size_t is larger than 32-bits. The resulting code
may be a smidge larger, but this isn't super-space critical code.
Andriy Gapon [Thu, 23 Nov 2017 22:10:12 +0000 (22:10 +0000)]
vmrun.sh: add -A option for AHCI emulation of disk devices
AHCI emulation is useful for testing scenarios closer to the real
hardware. For example, it allows to exercise the CAM subsystem.
There could be other uses as well.
Andrew Turner [Thu, 23 Nov 2017 17:40:40 +0000 (17:40 +0000)]
Ensure we check the program state set in the trap frame on arm and arm64.
This value may be set by userspace so we need to check it before using it.
If this is not done correctly on exception return the kernel may continue
in kernel mode with all registers set to a userspace controlled value. Fix
this by moving the check into set_mcontext, and also add the missing
sanitisation from the arm64 set_regs.
Mark Johnston [Thu, 23 Nov 2017 14:29:07 +0000 (14:29 +0000)]
Duplicate helpers after disabling inherited tracepoints during a fork.
We may create probes in the nascent child process, so we first need to
ensure that any inherited tracepoints are first removed. Otherwise the
probe sites will not be in the state expected by fasttrap, and it won't
be able to enable the probes.
Mark Johnston [Thu, 23 Nov 2017 14:07:52 +0000 (14:07 +0000)]
Allow kern.geom.mirror.debug to be negative.
A negative value can be used to suppress all prints from the gmirror
kernel code, which can be useful when attempting to trigger race
conditions using stress tests.
Make sure the iSCSI I/O limits are set properly so that the ISCSIDSEND IOCTL
can be used prior to the ISCSIDHANDOFF IOCTL which set the negotiated values.
Else the login PDU will fail when passing the "-r" option to "iscsictl" which
means iSCSI over RDMA instead of TCP/IP.
Hide the locking logic used in the dynamic states implementation from
generic code. Rename ipfw_install_state() and ipfw_lookup_dyn_rule()
function to have similar names: ipfw_dyn_install_state() and
ipfw_dyn_lookup_state(). Move dynamic rule counters updating to the
ipfw_dyn_lookup_state() function. Now this function return NULL when
there is no state and pointer to the parent rule when state is found.
Thus now there is no need to return pointer to dynamic rule, and no need
to hold bucket lock for this state. Remove ipfw_dyn_unlock() function.
Kyle Evans [Thu, 23 Nov 2017 05:54:04 +0000 (05:54 +0000)]
Allwinner a83t: add ccung bits
Upstream DTS has switched to using CCU rather than /clocks nodes. Add a CCU
driver for the a83t to bring us closer to upstream, but don't yet attach it
to ccu node.
Reviewed by: manu
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D12843
Mateusz Guzik [Thu, 23 Nov 2017 03:40:51 +0000 (03:40 +0000)]
sx: unbreak debug after r326107
An assertion was modified to use the found value, but it was not updated to
handle a race where blocked threads appear after the entrance to the func.
Move the assertion down to the area protected with sleepq lock where the
lock is read anyway. This does not affect coverage of the assertion and
is consistent with what rw locks are doing.
Landon J. Fuller [Wed, 22 Nov 2017 23:10:20 +0000 (23:10 +0000)]
bhnd(4): Add a basic ChipCommon GPIO driver sufficient to support bwn(4)
The driver is functional on both BHND Wi-Fi adapters and MIPS SoCs, but
does not currently include support for features not required by bwn(4),
including GPIO interrupt handling.
Approved by: adrian (mentor, implicit)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12708
Landon J. Fuller [Wed, 22 Nov 2017 20:27:46 +0000 (20:27 +0000)]
bhnd(4): extend the PMU APIs to support bwn(4)
The bwn(4) driver requires a number of extensions to the bhnd(4) PMU
interface to support external configuration of PLLs, LDOs, and other
parameters that require chipset or PHY-specific workarounds.
These changes add support for:
- Writing raw voltage register values to PHY-specific LDO regulator
registers (required by LP-PHY).
- Enabling/disabling PHY-specific LDOs (required by LP-PHY)
- Writing to arbitrary PMU chipctrl registers (required for common PHY PLL
reset support).
- Requesting chipset/PLL-specific spurious signal avoidance modes.
- Querying clock frequency and latency.
Additionally, rather than updating legacy PWRCTL support to conform to the
new PMU interface:
- PWRCTL API is now provided by a bhnd_pwrctl_if.m interface.
- Since PWRCTL is only found in older SSB-based chipsets, translation from
bhnd(4) bus APIs to corresponding PWRCTL operations is now handled
entirely within the siba(4) driver.
- The PWRCTL-specific host bridge clock gating APIs in bhnd_bus_if.m have
been lifted out into a standalone bhnd_pwrctl_hostb_if.m interface.
Approved by: adrian (mentor, implicit)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12664
Return different error code for the guard page layout violation.
On KERN_NO_SPACE error, as it is returned now, vm_map_find() continues
the loop searching for the suitable range for the requested mapping
with specific alignment. Since the vm_map_findspace() succesfully
finds the same place, the loop never ends.
The errors returned from vm_map_stack() completely repeat the behavior
of vm_map_insert() now, as suggested by Alan.
Reported by: Arto Pekkanen <aksyom@gmail.com>
PR: 223732
Reviewed by: alc, markj
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential revision: https://reviews.freebsd.org/D13186
Alan Cox [Wed, 22 Nov 2017 16:39:24 +0000 (16:39 +0000)]
When vm_map_find(find_space = VMFS_OPTIMAL_SPACE) fails to find space, a
second scan of the address space with find_space = VMFS_ANY_SPACE is
performed. Previously, vm_map_find() released and reacquired the map lock
between the first and second scans. However, there is no compelling
reason to do so. This revision modifies vm_map_find() to retain the map
lock.
Mark Johnston [Wed, 22 Nov 2017 15:54:52 +0000 (15:54 +0000)]
Annotate pragma/err.invalidlibdep.ksh as EXFAIL.
The test creates a D library with a "depends_on library" pragma
referencing a non-existent library, and expects compilation to fail.
However, as far as I can tell, libdtrace is supposed simply abort
compilation of the library in this case, and continue. This behaviour
is desirable when adding libraries which depend on optional KLDs, for
example.
Emmanuel Vadot [Wed, 22 Nov 2017 15:27:47 +0000 (15:27 +0000)]
bsdinstall: Add ntpdate option
When you install a computer for the first time, the date in the CMOS sometimes
not accurate and you need to ntpdate as ntpd will fail a the time difference
is too big.
Add an option in bsdinstall to enable ntpdate that will do that for us.
Ed Maste [Wed, 22 Nov 2017 15:18:11 +0000 (15:18 +0000)]
Fix indentation in bsdinstall-created wpa_supplicant.conf
r309934 cleaned up some cases in bsdinstall to use heredocs but broke
the indentation of the generated output, because <<- heredocs strip
leading tabs.
PR: 221982
Reviewed by: allanjude, dteske
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D13190
Ruslan Bukin [Wed, 22 Nov 2017 14:10:58 +0000 (14:10 +0000)]
o Invalidate the correct page in pmap_protect().
With this bug fix we don't need to invalidate all the entries.
o Remove a call to pmap_invalidate_all(). This was never called
as the anyvalid variable is never set.
Toomas Soome [Wed, 22 Nov 2017 08:48:00 +0000 (08:48 +0000)]
loader.efi: efipart does not recognize partitionless disks
Rework the block device handle check to allow more robust device
classification. This is mostly usability issue - it can be quite confusing
for user when no disks are listed with lsdev.
Kyle Evans [Wed, 22 Nov 2017 03:44:19 +0000 (03:44 +0000)]
patch(1): don't assume a match if we run out of context to check
Patches with very little context (-U0 and -U1) could get misapplied if
the file to be patched changes and a hunk is no longer applicable. Matching
with fuzz would be attempted and default to a match when we unexpectedly ran
out of context.
This also affected patches with higher levels of context but had limited
actual context due to the hunk being located near the beginning/end of file.
Landon J. Fuller [Tue, 21 Nov 2017 23:25:22 +0000 (23:25 +0000)]
bhnd(4): Add support for querying DMA address translation parameters
BHND Wi-Fi chipsets and SoCs share a common DMA engine, operating within
backplane address space. To support host DMA on Wi-Fi chipsets, the bridge
core maps host address space onto the backplane; any host addresses must
be translated to their corresponding backplane address.
- Defines a new bhnd_get_dma_translation(9) API to support querying DMA
address translation parameters from the bhnd(4) bus.
- Extends bhndb(4) to provide DMA translation descriptors from a DMA
address translation table defined in the host bridge-specific
bhndb_hwcfg.
- Defines bhndb(4) DMA address translation tables for all supported host
bridge cores.
- Extends mips/broadcom's bhnd_nexus driver to return an identity (no-op)
DMA translation descriptor; no translation is required when addressing
the SoC backplane.
Approved by: adrian (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12582
Landon J. Fuller [Tue, 21 Nov 2017 23:15:20 +0000 (23:15 +0000)]
bhnd(4): implement MIPS and PCI(e) interrupt support
On BHND MIPS SoCs, this replaces the use of hard-coded MIPS IRQ#s in the
common bhnd(4) core drivers; we now register an INTRNG child PIC that
handles routing of backplane interrupt vectors via the MIPS core.
On BHND PCI devices, backplane interrupt vectors are now routed to the
PCI/PCIe host bridge core when bus_setup_intr() is called, where they are
dispatched by the PCI core via a host interrupt (e.g. INTx/MSI).
The bhndb(4) bridge driver tracks registered interrupt handlers for the
bridged bhnd(4) devices and manages backplane interrupt routing, while
delegating actual bus interrupt setup/teardown to the parent bus on behalf
of the bridged cores.
Approved by: adrian (mentor, implicit)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12518
Ed Schouten [Tue, 21 Nov 2017 20:46:21 +0000 (20:46 +0000)]
Import the latest CloudABI definitions, v0.18.
In addition to some small style fixes to the ARMv6 vDSO, this release
includes a new vDSO that can be used for the execution of ARMv6/ARMv7
code on 64-bit platforms.
Just like for i686 on x86-64, this new vDSO is responsible for padding
arguments and return values to 64-bit values, so that the kernel can
easily forward system calls to the native system calls.
Ed Maste [Tue, 21 Nov 2017 20:31:54 +0000 (20:31 +0000)]
filter all passwords (not only changed) from periodic passwd backup
The periodic 200.backup-passwd script outputs any differences it finds
in master.passwd, relative to the previous backup. It intends to elide
the encrypted password field, but previously did so only for changed
lines (i.e., those beginning with - or + in the diff).
Apply the sed expression also to unchanged lines to also elide their
passwords.
PR: 223461
Reported by: Andre Albsmeier
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Following struct vmtotal changes, make systat use and correctly
display 64-bit counters. Switch to humanize_number(3) to overcome
homegrown arithmetics limits in pretty printing large numbers. Use
1024 as a divisor for memory fields to make it consistent with other
tools and users expectations.
Warner Losh [Tue, 21 Nov 2017 19:23:20 +0000 (19:23 +0000)]
Unbreak riscv build in universe.
riscv doesn't have -msoft-float. For the moment, just don't add
anything. There's no /boot/loader or other bootstrap contained in the
tree for riscv*. However, with real hardware coming next year, there
are plans for one, so keep building at least a minimal libsa and
ficl to prevent bitrot.
Andriy Gapon [Tue, 21 Nov 2017 18:28:14 +0000 (18:28 +0000)]
zfs_write: fix problem with writes appearing to succeed when over quota
The problem happens when the writes have offsets and sizes aligned with
a filesystem's recordsize (maximum block size). In this scenario
dmu_tx_assign() would fail because of being over the quota, but the uio
would already be modified in the code path where we copy data from the
uio into a borrowed ARC buffer. That makes an appearance of a partial
write, so zfs_write() would return success and the uio would be modified
consistently with writing a single block.
That bug can result in a data loss because the writes over the quota
would appear to succeed while the actual data is being discarded.
This commit fixes the bug by ensuring that the uio is not changed until
after all error checks are done. To achieve that the code now uses
uiocopy() + uioskip() as in the original illumos design. We can do that
now that uiocopy() has been updated in r326067 to use
vn_io_fault_uiomove().
Warner Losh [Tue, 21 Nov 2017 18:03:47 +0000 (18:03 +0000)]
Fix gptzfsboot for cases with GELI.
HAVE_GPT isn't currently a thing, but HAVE_GELI is. Replace the former
with the latter and remove util.o from the build list (it's picked up
from libsa/libsa32, and that's OK).
Glen Barber [Tue, 21 Nov 2017 18:02:18 +0000 (18:02 +0000)]
Remove /etc/resolv.conf from virtual machine images, which is
copied from the build host. It is renamed to /etc/resolv.conf.bak
on boot, so never used anyway.
Noticed by: peter
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Andriy Gapon [Tue, 21 Nov 2017 18:01:43 +0000 (18:01 +0000)]
make illumos uiocopy use vn_io_fault_uiomove
uiocopy() is currently unused, its purpose is copy data from a uio
without modifying the uio. It was in use before the vn_io_fault support
was added to ZFS, at which point our code diverged from the illumos code
a little bit. Because ZFS is the only (potential) user of the function
we are free to modify it to better suit ZFS needs.
The intention behind this change is to remove the differences introduced
earlier in zfs_write().
While here, re-implement uioskip() using uiomove() with
uio_segflg == UIO_NOCOPY.
The story of uioskip is the same as with uiocopy.
Andrew Turner [Tue, 21 Nov 2017 17:23:16 +0000 (17:23 +0000)]
Add a driver for the EFI RTC. This uses the EFI Runtime Services to query
the system time.
As we seem to only read this time on boot, and this is the only source of
time on many arm64 machines we need to enable this by default there. As
this is not always the case with U-Boot firmware, or when we have been
booted from a non-UEFI environment we only enable the device driver when
the Runtime Services are present and reading the time doesn't result in an
error.
Mark Johnston [Tue, 21 Nov 2017 16:03:21 +0000 (16:03 +0000)]
Refine symtab sorting in libproc.
Add some rules to more closely match what illumos does when an address
resolves to multiple symbols:
- prefer non-local symbols
- prefer symbols with fewer leading underscores and no leading '$'
Make sure all initialized mutexes are destroyed in the iser module,
else WITNESS will panic. Prefix all mutex names with "iser_" to
prevent future WITNESS issues.
Andrew Turner [Tue, 21 Nov 2017 13:19:38 +0000 (13:19 +0000)]
When fpcurthread is not the current thread it may be non-NULL. In this
case another thread has had the VFP unit enabled and will have its state
in the VFP registers along with it stored in memory. As such we don't need
to store the state, but do need to zero the fpcurthread pointer to stop
the VFP driver from using the enable fast path.
Mark Johnston [Tue, 21 Nov 2017 13:17:40 +0000 (13:17 +0000)]
Allow for fictitious physical pages in vm_page_scan_contig().
Some drm2 drivers will set PG_FICTITIOUS in physical pages in order to
satisfy the OBJT_MGTDEVICE object interface, so a scan may encounter
fictitous pages. For now, allow for this possibility; such pages will be
skipped later in the scan since they are wired.
Warner Losh [Tue, 21 Nov 2017 07:35:29 +0000 (07:35 +0000)]
This program is more useful if it skips leading whitespace when
parsing a textual UEFI Device Path, since otherwise it things the
passed in path is a filename. While here, reduce the repetition of
8192.
Warner Losh [Tue, 21 Nov 2017 06:12:21 +0000 (06:12 +0000)]
While the EFI spec allows numbers to be in many forms, libefivar
produces hex numbers for the dsn. Since that come is from EDK2, change
this for symmetry, by generating the dsn as a hex number.
Justin Hibbits [Tue, 21 Nov 2017 03:12:16 +0000 (03:12 +0000)]
Check the page table before TLB1 in pmap_kextract()
The vast majority of pmap_kextract() calls are looking for a physical memory
address, not a device address. By checking the page table first this saves
the formerly inevitable 64 (on e500mc and derivatives) iteration loop
through TLB1 in the most common cases.
Benchmarking this on the P5020 (e5500 core) yields a 300% throughput
improvement on dtsec(4) (115Mbit/s -> 460Mbit/s) measured with iperf.
Benchmarked on the P1022 (e500v2 core, 16 TLB1 entries) yields a 50%
throughput improvement on tsec(4) (~93Mbit/s -> 165Mbit/s) measured with
iperf.
Landon J. Fuller [Tue, 21 Nov 2017 01:54:48 +0000 (01:54 +0000)]
Preemptively map MIPS INTRNG interrupts on non-FDT MIPS targets
This replaces a partial workaround introduced in r305527 that was
incompatible with nested INTRNG interrupt controllers if not also using
FDT.
On non-FDT MIPS INTRNG targets, we now preemptively produce a set of fixed
mappings for the MIPS IRQ range during nexus attach. On FDT targets,
OFW_BUS_MAP_INTR() remains responsible for mapping the MIPS IRQs.
Approved by: adrian (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12385