This is work done by Michael Tuexen and myself at the IETF. This
adds the new I-Data (Interleaved Data) message. This allows a user
to be able to have complete freedom from Head Of Line blocking that
was previously there due to the in-ability to send multiple large
messages without the TSN's being in sequence. The code as been
tested with Michaels various packet drill scripts as well as
inter-networking between the IETF's location in Argentina and Germany.
This revision adds support to if_rt for more SoCs.
The SoCs I've tried the driver with include the following:
RT3050, RT5350, RT3662, RT3883, MT7620, MT7621, MT7688.
On boards, based on the above SoCs traffic is passing through correctly
and the boards survive a flood ping with very little or no drops (drops
may be caused elsewhere in my test setup, however).
One issue still remains and needs to be fixed in the future: if_rt does
not survive an ifconfig rt0 down/ifconfig rt0 up cycle.
This issue existed before this commit as well, however.
Reviewed by: ray
Approved by: adrian (mentor)
Sponsored by: Smartcom - Bulgaria AD
Differential Revision: https://reviews.freebsd.org/D5864
hyperv: Use lapic_{alloc,free}_ipi to allocate private interrupt vector
Suggested by: jhb
Reviewed by: Dexuan Cui <decui microsoft com>, Jun Su <junsu microsoft com>
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5850
Disable the NetBSD-specific EFAULT requirements test in gettimeofday_err
FreeBSD doesn't specifically list this as a supported error, and in some
configurations/versions of FreeBSD, this test will segfault as the memory
address might be evaluated in userspace, instead of in kernel space like
in NetBSD.
hyperv/vmbus: Use default mtx for channel message queue
First of all sema_post() can't be called w/ spinlock, and the channel
message queue processing is not on hot code path, i.e. spinlock is not
necessary.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5812
Fix PIC lookup by device and xref. There was not taken into account
the situation that someone has a pointer to device but not its xref.
This situation is regular now, after r297539.
Make CloudABI's way of doing TLS more friendly to userspace emulators.
We're currently seeing how hard it would be to run CloudABI binaries on
operating systems cannot be modified easily (Windows, Mac OS X). The
idea is that we want to just run them without any sandboxing. Now
that CloudABI executables are PIE, this is already a bit easier, but TLS
is still problematic:
- CloudABI executables want to write to the %fs, which typically
requires extra system calls by the emulator every time it needs to
switch between CloudABI's and its own TLS.
- If CloudABI executables overwrite the %fs base unconditionally, it
also becomes harder for the emulator to store a backup of the old
value of %fs. To solve this, let's no longer overwrite %fs, but just
%fs:0.
As CloudABI's C library does not use a TCB, this space can now be used
by an emulator to keep track of its internal state. The executable can
now safely overwrite %fs:0, as long as it makes sure that the TCB is
copied over to the new TLS area.
Ensure that there is an initial TLS area set up when the process starts,
only containing a bogus TCB. We don't really care about its contents on
FreeBSD.
Allow using DTRACE for performance analysis of userspace
applications - the function call stack can be captured.
This is almost an exact copy of AMD64 solution.
Convert pci_delete_child() to a bus_child_deleted() method.
Instead of providing a wrapper around device_delete_child() that the PCI
bus and child bus drivers must call explicitly, move the bulk of the logic
from pci_delete_child() into a bus_child_deleted() method
(pci_child_deleted()). This allows PCI devices to be safely deleted via
device_delete_child().
- Add a bus_child_deleted method to the ACPI PCI bus which clears the
device_t associated with the corresponding ACPI handle in addition to
the normal PCI bus cleanup.
- Change cardbus_detach_card to call device_delete_children() and move
CardBus-specific delete logic into a new cardbus_child_deleted() method.
- Use device_delete_child() instead of pci_delete_child() in the SRIOV code.
- Add a bus_child_deleted method to the OpenFirmware PCI bus drivers which
frees the OpenFirmware device info for each PCI device.
Reviewed by: imp
Tested on: amd64 (CardBus and PCI-e hotplug)
Differential Revision: https://reviews.freebsd.org/D5831
adrian [Wed, 6 Apr 2016 01:21:51 +0000 (01:21 +0000)]
[net80211] Initial A-MSDU support for testing / evaluation
A-MSDU is another 11n aggregation mechanism where multiple ethernet
frames get LLC encapsulated (so they have a length field), padded,
and put in a single MPDU (802.11 MAC frame.) This means it gets sent
out as a single frame, with a single seqno, it's acked as one frame, etc.
It turns out that, hah, atheros fast frames is almost but not quite
like this, so I'm reusing all of the current superg/fast-frames stuff
in order to actually transmit A-MSDU. Yes, this means that A-MSDU
frames are also only aggregated two at a time, so it's not necessarily
a huge win, but it's better than nothing.
This doesn't do anything by default - the driver needs to say it does
A-MSDU as well as set the AMSDU software TX capability so this code path
gets exercised.
For now, the only driver that enables this is urtwn. I'll enable it
for rsu at some point soon.
Tested:
* Add an amsdu encap path to aggregate two frames, same as the
fast-frames path.
* Always do the superg init/teardown and node init/teardown stuff,
regardless of whether the nodes are doing fast-frames (the ATH
capability stuff.) That way we can reuse it for amsdu.
* Don't do AMSDU for multicast/broadcast and EAPOL frames.
* If we're doing A-MPDU, then don't bother doing FF/A-MSDU.
We can likely do both together, but I don't want to change
behaviour.
* Teach the fast frames approx txtime logic to support the 11n
rates. But, since we don't currently have a full "current rate"
support, assume it's HT20, long-gi, etc. That way we overshoot
on the TX time estimation, so we're always inside the requirements.
(And we only aggregate two frames for now, so we're not really
going to exceed that.)
* Drop the maximum FF age default down to 2ms, otherwise we end up
with some very annoyingly large latencies.
TODO:
* We only aggregate two ethernet frames, so I'm not checking the max
A-MSDU size. But when it comes time to support >2 frames, we should
obey that.
We were setting an incorrect/undefined size and as it came out the st
struct was not really being used at all. This was actually a bug but
by sheer luck it had no visual effect.
adrian [Tue, 5 Apr 2016 22:14:21 +0000 (22:14 +0000)]
[urtwn] first cut of getting the fast-frames / amsdu support in shape.
The urtwn hardware transmits FF/A-MSDU just fine - it takes an 802.11
frame and will dutifully send the thing.
So:
* bump RX queue up from 1. Why's it 1? That's really silly.
* Add the "software A-MSDU" encap capability bit.
* bump the TX buffer size up so we can at least send A-MSDU frames.
* track active frames submitted to the NIC - we can't make assumptions
about how many are in flight in the NIC though. For 88E parts we
could use per-packet TX indication, but for R92 parts we can't.
So, just fake it somewhat.
* Kick the transmit queue when we finish reception; try to avoid stalls.
* Kick the FF queue a little more regularly.
A-MSDU TX won't happen until the net80211 side is done, but atheros
fast-frames support should now work.
adrian [Tue, 5 Apr 2016 21:54:07 +0000 (21:54 +0000)]
[net80211] Add a new capability flag to indicate that the stack should
do software A-MSDU encapsulation.
Right now there's AMSDU TX/RX capability bits and they're mostly
unused, however I'd like to maintain those as the general configuration,
not also "please software encap AMSDU." For platforms that can do
A-MSDU in firmware (iwn, iwm, etc) then their init paths can read
this flag to configure A-MSDU.
ian [Tue, 5 Apr 2016 13:47:06 +0000 (13:47 +0000)]
Add more DPRINTF() to the ftdi driver. Now everything that can change the
chip's state has a DPRINTF, with things that happen repeatedly at debug=2
level and things that happen frequently (like per-transfer IO) at debug=3.
TEGRA: Fix CPU frequency switching.
The PLL_X, base CPU frequency source, doesn't have a bypass switch and thus
we must use another frequency source for CPU while changing its frequency.
PLL_P is ideal for this, it runs at 480MHz and CPU can be clocked at this
frequency at any CPU voltage.
This is compatible with the ds1307, but comparing the mcp7941x datasheet vs the
ds1307 code, appears there is one bit placement difference, so that is now
accounted for.
Make i2c device child auto-probe work for MPC85xx and QorIQ SoCs.
OFW i2c probing requires a new method ofw_bus_get_node(), and the bus device is
assumed iichb. With these changes, i2c devices attached in fdt are probed and
attached automagically.
Don't wakeup the fdc worker thread once a second when idle.
The fdc worker thread was using a one second timeout while waiting for
a new bio to arrive or for the device to detach. However, the driver
already does a wakeup when queueing a new bio or asking the thread to
detach, so the timeout only served to waste CPU time waking up the
thread once a second just so it could go right back to sleep. Use an
infinite timeout instead.
andrew [Mon, 4 Apr 2016 17:04:33 +0000 (17:04 +0000)]
Add a table to map from the FreeBSD CPUID space to the GIC CPUID space. On
many SoCs these two are the same, however there is no requirement for this
to be the case, e.g. on the ARM Juno we boot on what the GIC thinks of as
CPU 2, but FreeBSD numbers it CPU 0.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Previously, the code determined a topology of processing units
(hardware threads, cores, packages) and then deduced a cache topology
using certain assumptions. The new code builds a topology that
includes both processing units and caches using the information
provided by the hardware.
At the moment, the discovered full topology is used only to creeate
a scheduling topology for SCHED_ULE.
There is no KPI for other kernel uses.
Summary:
- based on APIC ID derivation rules for Intel and AMD CPUs
- can handle non-uniform topologies
- requires homogeneous APIC ID assignment (same bit widths for ID
components)
- topology for dual-node AMD CPUs may not be optimal
- topology for latest AMD CPU models may not be optimal as the code is
several years old
- supports only thread/package/core/cache nodes
Todo:
- AMD dual-node processors
- latest AMD processors
- NUMA nodes
- checking for homogeneity of the APIC ID assignment across packages
- more flexible cache placement within topology
- expose topology to userland, e.g., via sysctl nodes
Long term todo:
- KPI for CPU sharing and affinity with respect to various resources
(e.g., two logical processors may share the same FPU, etc)
Define local-intc for BCM2836 platform (RPI2) and make BCM2835 intc
a child of it. This is done in conformity with Linux dts files and
as preparation for rework of BCM2836 interrupt controller for INTRNG.
Remove FDT specific parts from INTRNG. Change its interface to make it
universal.
(1) New struct intr_map_data is defined as a container for arbitrary
description of an interrupt used by a device. Typically, an interrupt
number and configuration relevant to an interrupt controller is encoded
in such description. However, any additional information may be encoded
too like a set of cpus on which an interrupt should be enabled or vendor
specific data needed for setup of an interrupt in controller. The struct
intr_map_data itself is meant to be opaque for INTRNG.
(2) An intr_map_irq() function is created which takes an interrupt
controller identification and struct intr_map_data as arguments and
returns global interrupt number which identifies an interrupt.
(3) A set of functions to be used by bus drivers is created as well as
a corresponding set of methods for interrupt controller drivers. These
sets take both struct resource and struct intr_map_data as one of the
arguments. There is a goal to keep struct intr_map_data in struct
resource, however, this way a final solution is not limited to that.
(4) Other small changes are done to reflect new situation.
This is only first step aiming to create stable interface for interrupt
controller drivers. Thus, some temporary solution is taken. Interrupt
descriptions for devices are stored in INTRNG and two specific mapping
function are created to be temporary used by bus drivers. That's why
the struct intr_map_data is not opaque for INTRNG now. This temporary
solution will be replaced by final one in next step.
This optimization attempts to utylize as wide as possible register store instructions to zero large buffers.
The implementation, if possible, will use 'dc zva' to zero buffer by cache lines.
Enable 4-byte address support for the mx25l family of SPI flash devices.
Introduce 2 new flags:
- FL_ENABLE_4B_ADDR (forces the use of 4-byte addresses)
- FL_DISABLE_4B_ADDR (forces the use of 3-byte addresses)
If an SPI flash chip is defined with FL_ENABLE_4B_ADDR in its flags,
then an 'Enter 4-byte mode' command is sent to the chip at attach time
and, later, all commands that require addressing are issued with 4-byte
addresses.
If an SPI flash chip is defined with FL_DISABLE_4B_ADDR in its flags,
then an 'Exit 4-byte mode' command is sent to the chip at attach time
and, later, all commands that require addressing are issued with 3-byte
addresses.
For chips that do not have any of these flags defined the behaviour is
unchanged.
This change also adds support for the MX25L25735F and MX25L25635E chips
(vendor id 0xc2, device id 0x2019), which support 4-byte mode and enables
4-byte mode for them. These are 256Mbit devices (32MiB) and, as such, can
only be fully addressed by using 4-byte addresses.
Approved by: adrian (mentor)
Sponsored by: Smartcom - Bulgaria AD
Differential Revision: https://reviews.freebsd.org/D5808
adrian [Sun, 3 Apr 2016 23:39:58 +0000 (23:39 +0000)]
[iwn] Don't try to seamlessly recover from a firmware panic; just restart
the interface.
I know this may be unpopular, but iwn is not yet completely ready for
a transparent firmware restart. I have this thing panic my laptop
reliably because 11n state isn't kept in sync and the TX completion
path ends up trying to free a null node reference.
If there is an error different from ERESTART, there is some
chance that we may end up accessing an uninitialized value. This
doesn't seem likely/possible but initialize announce_buf[0],
just in case.
Improve HDMI display detection by searching the CEA-861 extension block for
an HDMI vendor-specific data block (VSDB) containing the HDMI 24-bit IEEE
registration ID (0x000C03).
remove emulation of VFS_HOLD and VFS_RELE from opensolaris compat
On FreeBSD VFS_HOLD/VN_RELE were mapped to MNT_REF/MNT_REL that
manipulate mnt_ref. But the job of properly maintaining the reference
count is already automatically performed by insmntque(9) and
delmntque(9). So, in effect all ZFS vnodes referenced the corresponding
mountpoint twice.
That was completely harmless, but we want to be very explicit about what
FreeBSD VFS APIs are used, because illumos VFS_HOLD and FreeBSD MNT_REF
provide quite different guarantees with respect to the held vfs_t /
mountpoint. On illumos VFS_HOLD is sufficient to guarantee that
vfs_t.vfs_data stays valid. On the other hand, on FreeBSD MNT_REF does
*not* provide the same guarantee about mnt_data. We have to use
vfs_busy() to get that guarantee.
Thus, the calls to VFS_HOLD/VFS_RELE on vnode init and fini are removed.
VFS_HOLD calls are replaced with vfs_busy in the ioctl handlers.
And because vfs_busy has a richer interface that can not be dumbed down
in all cases it's better to explicitly use it rather than trying to mask
it behind VFS_HOLD.
This change fixes a panic that could result from a race between
zfs_umount() and zfs_ioc_rollback(). We observed a case where
zfsvfs_free() tried to destroy data that zfsvfs_teardown() was still
using. That happened because there was nothing to prevent unmounting of
a ZFS filesystem that was in between zfs_suspend_fs() and
zfs_resume_fs().
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Eli Rosenthal <eli.rosenthal@delphix.com>
MFV r297505:
6739 userland version of cv_timedwait_hires() always assumes absolute time
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: George Wilson <george.wilson@delphix.com>
MFV r297504: 6681 zfs list burning lots of time in dodefault() via dsl_prop_*
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Alex Wilson <alex.wilson@joyent.com>