arybchik [Wed, 25 Mar 2015 13:48:54 +0000 (13:48 +0000)]
MFC: 280375
sfxge: add barriers to BAR write macros
In theory the barriers are required to cope with write combining and
reordering. Two barriers are added (sometimes merged to one):
1. Before the first write to guarantee that previous writes to the region
have been done
2. Before the last write to guarantee that write to the last dword/qword is
done after previous writes
Barriers are inserted before in the assumption that it is better to
postpone barriers as much as it is possible (more chances that the
operation has already been already done and barrier does not stall CPU).
On x86 and amd64 bus space write barriers are just compiler memory barriers
which are definitely required.
Sponsored by: Solarflare Communications, Inc.
Original Differential Revision: https://reviews.freebsd.org/D2077
arybchik [Wed, 25 Mar 2015 13:47:48 +0000 (13:47 +0000)]
MFC: 280374
sfxge: assert either kernel or internal copy of interface flags
ioctl to put interface down sets ifp->if_flags which holds the intended
administratively defined state and calls driver callback to apply it.
When everything is done, driver updates internal copy of
interface flags sc->if_flags which holds the operational state.
So, transmit from Rx path is possible when interface is intended to be
administratively down in accordance with ifp->if_flags, but not applied
yet and the operational state is up in accordance with sc->if_flags.
Sponsored by: Solarflare Communications, Inc.
Original Differential Revision: https://reviews.freebsd.org/D2075
arybchik [Wed, 25 Mar 2015 13:45:20 +0000 (13:45 +0000)]
MFC: 280163
sfxge: prefetch txq->common if TxQ is started only
Transmit may be called when TxQ is not started yet (i.e. txq->common is
invalid). TxQ state is checked below when mbuf is processed and dropped
if TxQ is not started.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
arybchik [Wed, 25 Mar 2015 13:18:51 +0000 (13:18 +0000)]
MFC: 279351
sfxge: expect required init_state on data path and in periodic calls
With the patch applied the number of instruction events is 1% less and
number of mispredicted branch events is 5% less under multistream TCP
traffic load close to line rate.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
hselasky [Wed, 25 Mar 2015 13:14:25 +0000 (13:14 +0000)]
MFC r280322 and r280429:
The synchronisation value returned by the so-called feedback endpoint
appears to be too inaccurate that it can be used to synchronize the
playback data stream. If there is a recording endpoint associated with
the playback endpoint, use that instead. That means if the isochronous
OUT endpoint is asynchronus the USB audio driver will automatically
start recording, if possible, to get exact information about the
needed sample rate adjustments. In no recording endpoint is present,
no rate adaption will be done.
While at it fix an issue where the hardware buffer pointers don't get
reset at the first device PCM trigger.
Make some variables 32-bit to avoid problems with multithreading.
Use the feedback value from the synchronization endpoint as fallback
when there is no recording channel.
arybchik [Wed, 25 Mar 2015 13:12:15 +0000 (13:12 +0000)]
MFC: 279183
sfxge: add common code support for changing TX queue pace
To delay packets from a particular TX queue by a particular time, write a value
into the TX Pace table s.t. pace time <= TX Pace Clock Period * (2 ^ pace value)
- the TX pace clock is 1/13 of the system clock, so its period should be 104 or
52 ns depending on whether turbo mode is active.
EFX_TX_PACE_CLOCK_BASE added by me.
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
arybchik [Wed, 25 Mar 2015 13:05:33 +0000 (13:05 +0000)]
MFC: 279178
sfxge: do no allow EFSYS_MEM_ALLOC sleep
It solves locking problem when EFSYS_MEM_ALLOC is called in
the context holding a mutex (not allowed to sleep).
E.g. on interface bring up or multicast addresses addition.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
arybchik [Wed, 25 Mar 2015 13:03:36 +0000 (13:03 +0000)]
MFC: 279176
sfxge: pass correct address to free allocated memory in the case of load error
It is one more place missed in the previous fix.
Most likely is was just memory leak on the error handling path since
typically efsys_mem_t is filled in by zeros on allocation.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
hselasky [Wed, 25 Mar 2015 13:01:51 +0000 (13:01 +0000)]
MFC r280345:
Fix for out of order device destruction notifications when using the
delist_dev() function. In addition to this change:
- add a proper description of this function
- add a proper witness assert inside this function
- switch a nearby line to use the "cdp" pointer instead of cdev2priv()
arybchik [Wed, 25 Mar 2015 11:06:16 +0000 (11:06 +0000)]
MFC: 279141
sfxge: style fixes and cleanup
Sync endif comment with conditional.
BOOTROM and SIENA_BOOTROM are the same, but highlight that it is Siena.
Restore commented out assertion.
Sync comments with out-of-tree driver.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
arybchik [Wed, 25 Mar 2015 11:04:12 +0000 (11:04 +0000)]
MFC: 279098
sfxge: allow TX and RX queue limits to be changed
Before the common code had hard coded limits on the IDs RXQs and TXQs could
be created with which were suited for the Windows driver with VMQ, and so
would prevent queues with IDs greater than or equal to 259 (for TXQs) or 768
(for RXQs) from being created. This change allows the limits to be set in
efsys.h, so that all 1024 queues can be created during new manftest tests.
Also, the descriptor cache sizes were also hard coded to values suited to
the smaller queue counts, and so it was necessary to make them configurable
as well.
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
arybchik [Wed, 25 Mar 2015 11:00:09 +0000 (11:00 +0000)]
MFC: 279095
sfxge: never set RX_DESCQ_EN during self-test
We must not enable RX queues with random parameters when they are
mapped into a VF with an untrusted driver. It's probably not a good
idea to do this anyway, so take this bit out of the table test masks.
Submitted by: Ben Hutchings
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
arybchik [Wed, 25 Mar 2015 10:57:26 +0000 (10:57 +0000)]
MFC: 279078
sfxge: add assertions that required event handlers are implemented
efx_ev_mcdi() does not assert or check that all event handlers it
calls are non-null. Add assertions at the top for all required
event handlers, as some events (in the case of this bug, monitor
events) are rare.
Submitted by: Ben Hutchings
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
hselasky [Wed, 25 Mar 2015 10:55:08 +0000 (10:55 +0000)]
MFC r279281:
Fix a special case in ip_fragment() to produce a more sensible chain
of packets. When the data payload length excluding any headers, of an
outgoing IPv4 packet exceeds PAGE_SIZE bytes, a special case in
ip_fragment() can kick in to optimise the outgoing payload(s). The
code which was added in r98849 as part of zero copy socket support
assumes that the beginning of any MTU sized payload is aligned to
where a MBUF's "m_data" pointer points. This is not always the case
and can sometimes cause large IPv4 packets, as part of ping replies,
to be split more than needed.
Instead of iterating the MBUFs to figure out how much data is in the
current chain, use the value already in the "m_pkthdr.len" field of
the first MBUF in the chain.
arybchik [Wed, 25 Mar 2015 10:48:28 +0000 (10:48 +0000)]
MFC: 278941
sfxge: support variable-length response to MCDI GET_BOARD_CFG
Allocate the minimum or maximum response length for GET_BOARD_CFG as
appropriate. When looking up firmware subtypes by partition ID,
check the ID against the actual response length.
Merge of the patch made by Ben Hutchings in 2011.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
arybchik [Wed, 25 Mar 2015 10:35:19 +0000 (10:35 +0000)]
MFC: 278835
sfxge: remove full_packet_size from sfxge_tso_state
It makes sfxge_tso_state smaller and even makes tso_start_new_packet()
few bytes smaller. Data used to calculate packet size are used nearby,
so it should be no problems with cache etc.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor), glebius
arybchik [Wed, 25 Mar 2015 10:25:45 +0000 (10:25 +0000)]
MFC: 277895
sfxge: Separate software Tx queue limit for non-TCP traffic
Add separate software Tx queue limit for non-TCP traffic to make total
limit higher and avoid local drops of TCP packets because of no
backpressure.
There is no point to make non-TCP limit high since without backpressure
UDP stream easily overflows any sensible limit.
Split early drops statistics since it is better to have separate counter
for each drop reason to make it unabmiguous.
Add software Tx queue high watermark. The information is very useful to
understand how big queues grow under traffic load.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
arybchik [Wed, 25 Mar 2015 10:14:30 +0000 (10:14 +0000)]
MFC: 277887
sfxge: Remove extra cache-line alignment and reorder sfxge_evq_t
Remove the first member alignment to cacheline since it is nop.
Use __aligned() for the whole structure to make sure that the structure
size is cacheline aligned.
Remove lock alignment to make the structure smaller and fit all members
used on event queue processing into one cacheline (128 bytes) on x86-64.
The lock is obtained as well from different context when event queue
statistics are retrived from sysctl context, but it is infrequent.
Reorder members to avoid padding and go in usage order on event
processing.
As the result all structure members used on event queue processing fit
into exactly one cacheline (128 byte) now.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
arybchik [Wed, 25 Mar 2015 10:12:13 +0000 (10:12 +0000)]
MFC: 277885
sfxge: Move txq->next pointer to part writable on completion path
In fact the pointer is used only if more than one TXQ is processed in
one interrupt.
It is used (read-write) on completion path only.
Also it makes the first part of the structure smaller and it fits now
into one 128byte cache line. So, TXQ structure becomes 128 bytes
smaller.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
arybchik [Wed, 25 Mar 2015 10:05:19 +0000 (10:05 +0000)]
MFC: 272331
Support tunable to control Tx deferred packet list limits
Also increase default for Tx queue get-list limit.
Too small limit results in TCP packets drops especiall when many
streams are running simultaneously.
Put list may be kept small enough since it is just a temporary
location if transmit function can't get Tx queue lock.
Submitted by: Andrew Rybchenko <arybchenko at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
arybchik [Wed, 25 Mar 2015 09:59:38 +0000 (09:59 +0000)]
MFC: 272325
cleanup: code style fixes
Remove trailing whitespaces and tabs.
Enclose value in return statements in parentheses.
Use tabs after #define.
Do not skip comparison with 0/NULL in boolean expressions.
Submitted by: Andrew Rybchenko <arybchenko at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
arybchik [Wed, 25 Mar 2015 09:56:48 +0000 (09:56 +0000)]
MFC: 263649
sfxge: limit software Tx queue size.
Previous implementation limits put queue size only (when Tx lock can't
be acquired), but get queue may grow unboundedly which results in mbuf
pools exhaustion and latency growth.
Submitted by: Andrew Rybchenko <Andrew.Rybchenko at oktetlabs.ru>
Sponsored by: Solarflare Communications, Inc.