Justin Hibbits [Sun, 10 Nov 2019 22:08:07 +0000 (22:08 +0000)]
Consolidate powerpcspe CFLAGS
Don't depend on CPUTYPE to define powerpcspe CFLAGS; they should be set
unconditionally. This reduces duplication. Also, set some CFLAGS as
gcc-only, because clang's SPE support always uses the SPE ABI; it's not an
optional feature.
This saves some memory, around 256K I think. It removes some code,
e.g. KPTI does not need to specially map common_tss anymore. Also,
common_tss becomes domain-local.
Reviewed by: jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22231
Alan Cox [Sun, 10 Nov 2019 05:22:01 +0000 (05:22 +0000)]
Eliminate a redundant pmap_load() from pmap_remove_pages().
There is no reason why the pmap_invalidate_all() in pmap_remove_pages()
must be performed before the final PV list lock release. Move it past
the lock release.
Eliminate a stale comment from pmap_page_test_mappings(). We implemented
a modified bit in r350004.
Alexander Motin [Sun, 10 Nov 2019 03:37:45 +0000 (03:37 +0000)]
Add compact scratchpad protocol for ntb_transport(4).
Previously ntb_transport(4) required at least 6 scratchpad registers,
plus 2 more for each additional memory window. That is too much for some
configurations, where several drivers have to share resources of the same
NTB hardware. This patch introduces a new compact version of the protocol,
requiring only 3 scratchpad registers, plus one more for each additional
memory window. The optimization is based on the fact that none of the
version, number of windows, or number of queue pairs really needs more than
one byte, and window sizes of 4GB are not very useful now. The new protocol
is activated automatically when the configuration is low on scratchpad
registers, or it can be activated explicitly with a loader tunable.
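The register arithmetic above can be sketched in C. This is a minimal illustration, not the driver's code: the function names and the byte order within the packed register are assumptions, and only the register counts (6 plus 2 per extra window, versus 3 plus 1 per extra window) come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout: version, window count and queue-pair count,
 * one byte each, packed into a single 32-bit scratchpad value.
 */
static uint32_t
spad_pack(uint8_t version, uint8_t nwindows, uint8_t nqpairs)
{
	return (((uint32_t)version << 16) | ((uint32_t)nwindows << 8) |
	    nqpairs);
}

/* Scratchpad registers needed, using the counts from the commit message. */
static unsigned
spad_needed(unsigned nwindows, int compact)
{
	assert(nwindows >= 1);
	if (compact)
		return (3 + (nwindows - 1));
	return (6 + 2 * (nwindows - 1));
}
```

With three memory windows, the original protocol would need 10 registers under this reading and the compact one 5.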
Alexander Motin [Sun, 10 Nov 2019 03:24:53 +0000 (03:24 +0000)]
Allow splitting PLX NTB BAR2 into several memory windows.
Enabling the Address Lookup Table (A-LUT) allows specifying a separate
translation for each 1/128th or 1/256th of the BAR2. Previously it was
used only to limit effective window size by blocking access through some
of the A-LUT elements. This change allows A-LUT elements to also point to
different memory locations, providing upper layers with several (up to 128)
independent memory windows. A-LUT hardware allows even more flexible
configurations than this, but the NTB KPI has no way to manage that now.
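As a rough illustration of the slicing described above, the sketch below computes where a given A-LUT slice starts inside BAR2. The function name and the assumption of equally sized, contiguous slices are hypothetical; the driver's actual bookkeeping is not shown in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset of A-LUT slice 'idx' within BAR2, assuming the BAR is divided
 * into 'nentries' equal slices (128 or 256, per the commit message).
 */
static uint64_t
alut_slice_offset(uint64_t bar_size, unsigned nentries, unsigned idx)
{
	assert(nentries == 128 || nentries == 256);
	assert(idx < nentries);
	return ((bar_size / nentries) * idx);
}
```

For a 1 GiB BAR2 split into 128 slices, each slice covers 8 MiB, so slice 2 starts at offset 16 MiB.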
Kyle Evans [Sun, 10 Nov 2019 03:06:03 +0000 (03:06 +0000)]
bcm2835_sdhci: don't panic in DMA interrupt if curcmd went away
This is an exceptional case; generally found during controller errors.
A panic when we attempt to access slot->curcmd->data is less ideal than
warning, and other verbiage will be emitted to indicate the exact error.
Rick Macklem [Sun, 10 Nov 2019 01:08:14 +0000 (01:08 +0000)]
Update copy_file_range(2) to be Linux5 compatible.
The current linux man page and testing done on a fairly recent linux5.n
kernel have identified two changes to the semantics of the linux
copy_file_range system call.
Since the copy_file_range(2) system call is intended to be linux compatible
and is only currently in head/current and not used by any commands,
it seems appropriate to update the system call to be compatible with
the current linux one.
The first of these semantic changes was changed to be compatible with
linux5.n by r354564.
For the second semantic change, the old linux man page stated that, if
infd and outfd referred to the same file, EBADF should be returned.
Now, the semantics are to allow infd and outfd to refer to the same file
so long as the byte ranges defined by the input file offset, output file offset
and len do not overlap. If the byte ranges do overlap, EINVAL should be
returned.
This patch modifies copy_file_range(2) to be linux5.n compatible for this
semantic change.
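The same-file rule can be expressed as a small overlap test. This is a sketch under stated assumptions, not the kernel code: the function name is hypothetical, and it treats the two byte ranges as half-open intervals of equal length.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Returns EINVAL if [inoff, inoff + len) and [outoff, outoff + len)
 * overlap, 0 otherwise. Two equal-length ranges overlap exactly when
 * the distance between their starts is smaller than the length.
 */
static int
same_file_range_check(int64_t inoff, int64_t outoff, uint64_t len)
{
	uint64_t dist;

	assert(inoff >= 0 && outoff >= 0);
	dist = inoff < outoff ? (uint64_t)(outoff - inoff) :
	    (uint64_t)(inoff - outoff);
	return (dist < len ? EINVAL : 0);
}
```

Comparing the distance between offsets, rather than computing offset + len, avoids overflow for large offsets.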
Add GEOM attribute to report physical device name, and report it
via 'diskinfo -v'. This avoids the need to track it down via CAM,
and should also work for disks that don't use CAM. And since it's
inherited through the GEOM hierarchy, in most cases one doesn't need
to walk the GEOM graph either, e.g. you can use it on a partition
instead of the disk itself.
Doug Moore [Sat, 9 Nov 2019 17:08:27 +0000 (17:08 +0000)]
For vm_map, #defining DIAGNOSTIC to turn on full assertion-based
consistency checking slows performance dramatically. This change
reduces the number of assertions checked by completely walking the
vm_map tree only when the write-lock is released, and only then if the
number of modifications to the tree since the last walk exceeds the
number of tree nodes.
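The throttling rule described above might look like the following. This is a minimal sketch with hypothetical structure and function names; the real vm_map code and its assertions are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for throttled consistency checks. */
struct map_diag {
	unsigned nentries;	/* nodes currently in the tree */
	unsigned nupdates;	/* modifications since the last full walk */
};

/* Record one modification made under the write lock. */
static void
map_modified(struct map_diag *m)
{
	assert(m != NULL);
	m->nupdates++;
}

/*
 * Called when the write lock is released. Returns true when a full
 * tree walk is due, i.e. when modifications exceed the node count.
 */
static bool
map_unlock_should_walk(struct map_diag *m)
{
	assert(m != NULL);
	if (m->nupdates <= m->nentries)
		return (false);
	m->nupdates = 0;	/* the full walk would run at this point */
	return (true);
}
```

The effect is that the cost of a full walk, which is proportional to the number of nodes, is amortized over at least as many modifications.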
Rick Macklem [Fri, 8 Nov 2019 23:39:17 +0000 (23:39 +0000)]
Update copy_file_range(2) to be Linux5 compatible.
The current linux man page and testing done on a fairly recent linux5.n
kernel have identified two changes to the semantics of the linux
copy_file_range system call.
Since the copy_file_range(2) system call is intended to be linux compatible
and is only currently in head/current and not used by any commands,
it seems appropriate to update the system call to be compatible with
the current linux one.
The old linux man page stated that, if the
offset + len exceeded file_size for the input file, EINVAL should be returned.
Now, the semantics are to copy at most up to the end of the input file and
return the number of bytes copied. If the offset is at or beyond file_size,
0 is returned.
This patch modifies copy_file_range(2) to be linux compatible for this
semantic change.
A separate patch will change copy_file_range(2) for the other semantic
change, which allows the infd and outfd to refer to the same file, so
long as the byte ranges do not overlap.
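The end-of-file behaviour described above amounts to clamping the requested length. The helper below is an illustrative sketch with a hypothetical name; it is not the system call implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of bytes that may be copied starting at 'off' from an input
 * file of 'file_size' bytes: 0 at or past EOF, otherwise at most the
 * bytes remaining before EOF.
 */
static size_t
clamp_copy_len(int64_t file_size, int64_t off, size_t len)
{
	uint64_t avail;

	assert(file_size >= 0 && off >= 0);
	if (off >= file_size)
		return (0);
	avail = (uint64_t)(file_size - off);
	return (len < avail ? len : (size_t)avail);
}
```

Under the old semantics the second case below (offset + len beyond the file size) would have produced EINVAL instead of a short copy.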
Kyle Evans [Fri, 8 Nov 2019 20:14:36 +0000 (20:14 +0000)]
bcm2835_sdhci: remove unused power_id field
This was once set, but I removed it by the time I committed it because both
configurations use the same POWER_ID. This can be separated back out if the
situation changes.
Kyle Evans [Fri, 8 Nov 2019 20:12:57 +0000 (20:12 +0000)]
bcm2835_sdhci: add some very basic support for rpi4
DMA is currently disabled while I work out why it's broken, but this is
enough for upstream U-Boot + rpi-firmware + our rpi3-psci-monitor to boot
with the right config.
The RPi 4 is still not in a good "supported" state, as we have no
USB/PCI-E/Ethernet drivers, but if air-gapped pies only able to operate over
cereal is your thing, here's your guy.
Submitted by: Robert Crowston (with modifications)
Emmanuel Vadot [Fri, 8 Nov 2019 20:08:44 +0000 (20:08 +0000)]
loader.efi: Default to serial if we don't have a ConOut variable
In the EFI implementation in U-Boot no ConOut EFI variable is created;
this causes the loader to fall back to the TERM_EMU implementation, which is
very slow (and uses the ConOut device in the system table anyway).
The UEFI spec isn't clear as to whether this variable needs to exist or not.
Michal Meloun [Fri, 8 Nov 2019 18:57:41 +0000 (18:57 +0000)]
Implement support for (soft)linked clocks.
This kind of clock node represents a temporary placeholder for clocks
defined later in the boot process. Also, these are necessary to break
circular dependencies occasionally occurring in complex clock graphs.
bhyve: add support for virtio-net mergeable rx buffers
Mergeable rx buffers is a virtio-net feature that allows the hypervisor
to use multiple RX descriptor chains to receive a single packet.
Without this feature, a TSO-enabled guest is compelled to publish only
64K (or 32K) long chains, and each of these large buffers is consumed
to receive a single packet, even a very short one. This is a waste of
memory, as an RX queue has room for 256 chains, which means up to 16MB
of buffer memory for each (single-queue) vtnet device.
With the feature on, the guest can publish 2K long chains, and the
hypervisor will merge them as needed.
This change also enables the feature in the netmap backend, which
supports virtio-net offloads. We plan to add support for the
tap backend too.
Note that, unlike QEMU/KVM, we implement one-copy receive here,
while QEMU uses two copies.
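The memory figures in this message follow from simple arithmetic, sketched below. The function names are hypothetical, and the 2K case assumes the same 256-chain queue as the 64K case.

```c
#include <assert.h>
#include <stdint.h>

/* Total buffer memory published in an RX queue of 'nchains' chains. */
static uint64_t
rxq_buffer_bytes(unsigned nchains, uint64_t chain_len)
{
	return ((uint64_t)nchains * chain_len);
}

/* Chains the hypervisor must merge to hold one packet of 'pktlen' bytes. */
static unsigned
chains_for_packet(unsigned pktlen, unsigned chain_len)
{
	assert(chain_len > 0);
	return ((pktlen + chain_len - 1) / chain_len);
}
```

With 64K chains a full 256-entry queue pins 16MB; with 2K chains it pins 512K, and a 64K packet is received by merging 32 chains while a 1500-byte packet uses one.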
Ed Maste [Fri, 8 Nov 2019 14:59:41 +0000 (14:59 +0000)]
elfcopy/strip: Ensure sections have required alignment on output
Object files may specify insufficient alignment on certain sections, for
example due to a bug in NASM[1]. When we detect that case in elfcopy or
strip, emit a warning and increase the alignment to the minimum
required.
The NASM bug was fixed in 2015[2], but we might as well have this fixup
(and warning) in elfcopy in case we encounter such a file for any other
reason.
This might be reworked somewhat upstream - see ELF Tool Chain
ticket 485[3].
Ed Maste [Fri, 8 Nov 2019 14:51:09 +0000 (14:51 +0000)]
kvm: fix types for cross-debugging
As with other libkvm interfaces, use maximum-sized types to support
cross-debugging (e.g. a 64-bit vmcore on a 32-bit host). See
https://lists.freebsd.org/pipermail/svn-src-all/2019-February/176051.html
for further discussion.
This is an API-breaking change, but there are few consumers of this
interface today.
Reviewed by: will
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21945
Bjoern A. Zeeb [Fri, 8 Nov 2019 14:36:44 +0000 (14:36 +0000)]
frag6: properly handle atomic fragments according to RFCs.
RFC 8200 says:
"If the fragment is a whole datagram (that is, both the Fragment
Offset field and the M flag are zero), then it does not need
any further reassembly and should be processed as a fully
reassembled packet (i.e., updating Next Header, adjust Payload
Length, removing the Fragment header, etc.). .."
That means we should remove the fragment header and make all the adjustments
rather than just skipping over the fragment header. The difference should
be noticeable in that a properly handled atomic fragment triggering an ICMPv6
message at an upper layer (e.g. dest unreach, unreachable port) will not
include the fragment header.
Update the test cases to also test for an unfragmentable part. That is
needed so that the next header is properly updated (not just lengths).
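The atomic-fragment test quoted from RFC 8200 can be sketched as follows. The macro and function names are hypothetical and the offset/flag word is assumed to be already converted to host byte order; the field layout (13-bit offset, 2 reserved bits, M flag) and the 8-byte header size are those of the IPv6 Fragment header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAG_OFF_MASK	0xfff8u	/* 13-bit fragment offset */
#define FRAG_MORE_FLAG	0x0001u	/* M (more fragments) flag */
#define FRAG_HDR_LEN	8u	/* size of the Fragment header */

/* A fragment is atomic when both the offset and the M flag are zero. */
static bool
frag_is_atomic(uint16_t offlg)
{
	return ((offlg & (FRAG_OFF_MASK | FRAG_MORE_FLAG)) == 0);
}

/* Payload length once the Fragment header has been removed. */
static uint16_t
plen_without_frag_hdr(uint16_t plen)
{
	assert(plen >= FRAG_HDR_LEN);
	return ((uint16_t)(plen - FRAG_HDR_LEN));
}
```

The reserved bits are ignored by the test, so a header with only reserved bits set still counts as atomic in this sketch.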
Kyle Evans [Fri, 8 Nov 2019 14:28:39 +0000 (14:28 +0000)]
csu: Fix dynamiclib/init_test:jcr_test on !HAVE_CTORS archs
.jcr still needs a 0-entry added in crtend, even on !HAVE_CTORS archs, as
we're still getting .jcr sections added -- presumably due to the reference
in crtbegin. Without this terminal, the .jcr section (without data) overlaps
with the next section and register_classes in crtbegin will be examining the
wrong item.
PR: 241439
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D22132
Ed Maste [Fri, 8 Nov 2019 14:25:26 +0000 (14:25 +0000)]
add reference to PR for sparc64 BSD_CRTBEGIN in BROKEN_OPTIONS
We will soon remove the BSD_CRTBEGIN option (and will use the new CRT
files always) as part of the GCC 4.2.1 removal. Right now BSD_CRTBEGIN
works everywhere but sparc64; add a reference to the PR in case anyone
stumbles across this and is looking for more information.
Ed Maste [Fri, 8 Nov 2019 14:11:25 +0000 (14:11 +0000)]
makefs: avoid warning when creating FAT filesystem on existing file
Previously the mkfs_msdos function (from newfs_msdos) emitted warnings
in the case that an image size is specified and the target is not a
file, or no size is specified and the target is not a character device.
The latter warning (not a character device) doesn't make sense when this
code is used in makefs, regardless of whether an image size is specified
or not.
Ed Maste [Fri, 8 Nov 2019 14:06:48 +0000 (14:06 +0000)]
suggest xtoolchain package if binutils and GCC bootstraps are both broken
Previously we checked for only BINUTILS_BOOTSTRAP as a broken option
and suggested installing the binutils package. This was originally done
for arm64 where we used the in-tree Clang with external binutils package.
Add a case to the warning to suggest instead the full xtoolchain package
if we have no in-tree compiler either.
Humanize more columns in the vmstat(8) output and adjust widths.
The few columns that are not humanized are usually 0. This makes
the output mostly aligned.
Rick Macklem [Fri, 8 Nov 2019 06:40:17 +0000 (06:40 +0000)]
Fix the man page to correctly describe the use of the "len" argument.
The man page incorrectly described the use of the "len" argument, which
is updated to the number of bytes copied and not reduced by the number
of bytes copied.
Justin Hibbits [Fri, 8 Nov 2019 04:26:19 +0000 (04:26 +0000)]
powerpc/booke: Only handle kernel page faults in KVA range
The memory range between VM_MAXUSER_ADDRESS and VM_MIN_KERNEL_ADDRESS is
reserved for devices currently, which are always mapped in TLB1, and
therefore do not exist in the kernel page table. Any page fault in this
range is therefore automatically a fatal fault.
Justin Hibbits [Fri, 8 Nov 2019 03:45:13 +0000 (03:45 +0000)]
powerpc/booke: Make the TLB save area and mask match
Since TLB_MAXNEST is 3, the insert mask should only be 2 bits. Given that 2
bits counts to 4, and that we already have plenty of space wasted in
padding, make the nest level 4 to match the mask.
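The relationship between mask width and nest depth used in this reasoning is just a power of two, as the sketch below shows. The helper names are hypothetical and are not taken from the booke sources.

```c
#include <assert.h>

/* Number of distinct nest levels addressable with 'bits' mask bits. */
static unsigned
levels_for_mask_bits(unsigned bits)
{
	assert(bits < 32);
	return (1u << bits);
}

/* The mask itself: the low 'bits' bits set. */
static unsigned
mask_for_bits(unsigned bits)
{
	assert(bits < 32);
	return ((1u << bits) - 1);
}
```

A 2-bit mask (value 3) covers 4 levels, which is why raising the nest level from 3 to 4 makes the save area and the mask agree.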
Justin Hibbits [Fri, 8 Nov 2019 03:36:19 +0000 (03:36 +0000)]
powerpc/mpc85xx: Add MSI support for Freescale PowerPC SoCs
Freescale SoCs use a set of IRQs at the high end of the OpenPIC IRQ
list, not counted in the NIRQs of the Feature reporting register. Some
SoCs include a MSI inbound window in the PCIe controller configuration
registers as well, but some don't. Currently, this only handles the
SoCs *with* the MSI window.
There are 256 MSIs per MSI bank (32 per MSI IRQ, 8 IRQs per MSI bank).
The P5020 has 3 banks, yielding up to 768 MSIs; older SoCs have only one
bank.
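The MSI counts above can be checked with a short sketch that splits a flat MSI number into bank, IRQ and bit. The linear numbering scheme and all names here are assumptions made for illustration; only the 32-per-IRQ and 8-IRQs-per-bank figures come from the commit message.

```c
#include <assert.h>

#define MSIS_PER_IRQ	32u
#define IRQS_PER_BANK	8u
#define MSIS_PER_BANK	(MSIS_PER_IRQ * IRQS_PER_BANK)	/* 256 */

struct msi_loc {
	unsigned bank;	/* MSI bank */
	unsigned irq;	/* MSI IRQ within the bank */
	unsigned bit;	/* bit within that IRQ's 32-bit group */
};

/* Total MSIs available with 'nbanks' banks. */
static unsigned
msi_total(unsigned nbanks)
{
	return (nbanks * MSIS_PER_BANK);
}

/* Decompose a flat MSI number, assuming simple linear numbering. */
static struct msi_loc
msi_locate(unsigned msi, unsigned nbanks)
{
	struct msi_loc loc;

	assert(msi < msi_total(nbanks));
	loc.bank = msi / MSIS_PER_BANK;
	loc.irq = (msi % MSIS_PER_BANK) / MSIS_PER_IRQ;
	loc.bit = msi % MSIS_PER_IRQ;
	return (loc);
}
```

Three banks give 768 MSIs, matching the P5020 figure, and a single bank gives 256.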
Kyle Evans [Fri, 8 Nov 2019 03:27:56 +0000 (03:27 +0000)]
bcm2835_dma: Mark IRQs shareable
On the RPi4, some of these IRQs are shared. Start moving toward a mode where
we accept that shared IRQs happen and simply ignore interrupts that are
seemingly for no reason.
I would like to be more verbose here, but my 30-minute assessment of the
current world order is that mapping a resource/rid to an actual IRQ number
(as found in FDT data) is not a simple matter. Determining if more than one
handler is attached to an IRQ is closer to feasible, but it's unclear which
way is the cleaner path. Beyond that, we're only really using it to be
slightly more verbose when something's going wrong, so for now just suppress
and drop a complaint-comment.
This was originally submitted (via freebsd-arm@) by Robert Crowston; the
additional verbosity was dropped by kevans@.
Submitted by: Robert Crowston <crowston@protonmail.com>
Mark Johnston [Thu, 7 Nov 2019 23:37:17 +0000 (23:37 +0000)]
iwm: Fix scheduler configuration for aux and cmd queue configuration.
- Configure the scheduler only for the management queue.
- Fix a bug when enabling the scheduler: the queues are specified using a
bitmask.
- Fix style in the area.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
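The bitmask point in the list above is easy to get wrong, so here is a minimal sketch of the distinction. The helper name is hypothetical and this is not the iwm code; it only shows that a queue index has to be turned into a bit before it is used where a mask is expected.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a queue index into the single-bit mask that selects it. */
static uint32_t
queue_mask(unsigned qid)
{
	assert(qid < 32);
	return (UINT32_C(1) << qid);
}
```

Passing a raw index such as 9 where a mask is expected would select queues 0 and 3 (binary 1001) rather than queue 9, whose mask is 0x200.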
Mark Johnston [Thu, 7 Nov 2019 23:37:02 +0000 (23:37 +0000)]
iwm: Implement the new receive path.
This is the multiqueue receive code required for 9000-series chips.
Note that we still only configure a single RX queue for now. Multiqueue
support will require MSI-X configuration and a scheme for managing a
global pool of RX buffers.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Mark Johnston [Thu, 7 Nov 2019 23:36:46 +0000 (23:36 +0000)]
iwm: Enable all 31 tx queues.
For now iwm only ever uses queue 0 and the management queue, but my 9560
raises a software error interrupt during initialization if this flag is
not set. iwlwifi sets it for all 7000- and 8000-series hardware, so we
might as well do it unconditionally.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Mark Johnston [Thu, 7 Nov 2019 23:35:54 +0000 (23:35 +0000)]
iwm: Add device configuration definitions for 9000-series chips.
Match such chips using the device ID. We should really be checking the
subdevice as well, since a smaller number of 9460 and 9560 devices
actually belong to a new series of devices and require different
firmware, but that will require some extra logic in iwm_attach().
Mark Johnston [Thu, 7 Nov 2019 23:29:43 +0000 (23:29 +0000)]
iwm: Call iwm_dev_check() earlier in iwm_attach().
This ensures that the driver softc reflects device capabilities as early
as possible, for use by device initialization code that is conditional
on certain capabilities.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Mark Johnston [Thu, 7 Nov 2019 23:27:54 +0000 (23:27 +0000)]
iwm: Fix style in the TX path.
Also ensure that the htole* macros are applied correctly when specifying
the segment length and upper address bits. No functional change
intended (unless you use iwm(4) on a big-endian machine).
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Brooks Davis [Thu, 7 Nov 2019 22:58:10 +0000 (22:58 +0000)]
libcompat: build 32-bit rtld and ldd as part of "everything"
Alter bsd.compat.mk to set MACHINE and MACHINE_ARCH when included
directly so MD paths in Makefiles work. In the process centralize
setting them in LIBCOMPATWMAKEENV.
Alter .PATH and CFLAGS settings to work when the Makefile is included.
While here only support LIB32 on supported platforms rather than always
enabling it and requiring users of MK_LIB32 to filter based on
TARGET/MACHINE_ARCH.
The net effect of this change is to make Makefile.libcompat only build
compatibility libraries.
Changes relative to r354449:
Correct detection of the compiler type when bsd.compat.mk is used
outside Makefile.libcompat. Previously it always matched the clang
case.
Set LDFLAGS including the linker emulation for mips where -m32 seems to
be insufficient.
Reviewed by: imp, kib (original version in r354449)
Obtained from: CheriBSD (conceptually)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22251
Kyle Evans [Thu, 7 Nov 2019 21:31:15 +0000 (21:31 +0000)]
bcm_lintc: don't attach if "interrupt-controller" is missing
This is a standard required property for interrupt controllers, and present
on the bcm_lintc nodes for currently supported RPi models. For the RPi4, we
have both bcm_lintc as well as GIC-400, but only one may be active at a
time.
Don't probe bcm_lintc if it's missing the "interrupt-controller" property --
in RPi 4 DTS, the bcm_lintc node is actually missing this along with other
required interrupt properties. Presumably, if the earlier boot stages will
support switching to the legacy interrupt controller (as is suggested
possible by the documentation), the DTS will need to be updated to indicate
the proper interrupt-parent and hopefully also mark this node as an
interrupt-controller instead.
Gleb Smirnoff [Thu, 7 Nov 2019 21:28:46 +0000 (21:28 +0000)]
Since pfslowtimo() runs in the network epoch, tcp_slowtimo()
also does. This allows us to simplify tcp_tw_2msl_scan() and
always require the network epoch in it.