yongari [Fri, 8 Oct 2010 20:51:33 +0000 (20:51 +0000)]
MFC r212157:
Unlike most other controllers, NS DP83815/DP83816 controllers seem
to pad with 0xFF when it encounter short frames. According to RFC
1042 the pad bytes should be 0x00.
Because manual padding consumes extra CPU cycles, introduce a new
tunable which controls the padding behavior. Turning this tunable
on will have driver pad manually but it's disabled by default. Users
can enable software padding by setting the following tunable to
non-zero value.
yongari [Fri, 8 Oct 2010 20:48:09 +0000 (20:48 +0000)]
MFC r212121,212156:
r212121:
Move sis_reset() to sis_initl(). This ensures driver starts with
known good state of controller.
r212156:
Fix the last endianness issue on handling station address which
prevented driver from working on big-endian machines. Also rewrite
station address programming to make it work on strict-alignment
architectures. With this change, sis(4) now works on sparc64 and
performance number looks good even though sis(4) have to apply
fixup code to align received frames on 2 bytes boundary on sparc64.
yongari [Fri, 8 Oct 2010 20:39:45 +0000 (20:39 +0000)]
MFC r212117,212119:
rr212117:
Report result of link state change to caller. Previously it always
returned success.
r212119:
Do not reinitialize controller whenever promiscuous mode or
allmulti is toggled. Controller does not require reinitialization.
This removes unnecessary controller reinitialization whenever
tcpdump is used.
While I'm here remove unnecessary variable reinitialization.
yongari [Fri, 8 Oct 2010 20:33:43 +0000 (20:33 +0000)]
MFC r212116:
Overhaul link state change handling. Previously sis(4) blindly
configured TX/RX MACs before getting a valid link. After that, when
link state change callback is called, it called device
initialization again to reconfigure TX/RX MACs depending on
resolved link state. This hack created several bad side effects and
it required more hacks to not collide with sis_tick callback as
well as disabling switching to currently selected media in device
initialization. Also it seems sis(4) was used to be a template
driver for long time so other drivers which was modeled after
sis(4) also should be changed.
TX/RX MACs are now reconfigured after getting a valid link. Fix for
short cable error is also applied after getting a link because it's
only valid when the resolved speed is 100Mbps.
While I'm here slightly reorganize interrupt handler such that
sis(4) always read SIS_ISR register to see whether the interrupt is
ours or not. This change removes another hack and make it possible
to nuke sis_stopped variable in softc.
yongari [Fri, 8 Oct 2010 20:18:44 +0000 (20:18 +0000)]
MFC r212109,212124,212185:
r212109:
bus_dma(9) cleanup.
o Enforce TX/RX descriptor ring alignment. NS data sheet says the
controller needs 4 bytes alignment but use 16 to cover both SiS
and NS controllers. I don't have SiS data sheet so I'm not sure
what is alignment restriction of SiS controller but 16 would be
enough because it's larger than the size of a TX/RX descriptor.
Previously sis(4) ignored the alignment restriction.
o Enforce RX buffer alignment, 4.
Previously sis(4) ignored RX buffer alignment restriction.
o Limit number of TX DMA segment to be used to 16. It seems
controller has no restriction on number of DMA segments but
using more than 16 looks resource waste.
o Collapse long mbuf chains with m_collapse(9) instead of calling
expensive m_defrag(9).
o TX/RX side bus_dmamap_load_mbuf_sg(9) support and remove
unnecessary callbacks.
o Initial endianness support.
o Prefer local alignment fixup code to m_devget(9).
o Pre-allocate TX/RX mbuf DMA maps instead of creating/destroying
these maps in fast TX/RX path. On non-x86 architectures, this is
very expensive operation and there is no need to do that.
o Add missing bus_dmamap_sync(9) in TX/RX path.
o watchdog is now unarmed only when there are no pending frames
on controller. Previously sis(4) blindly unarmed watchdog
without checking the number of queued frames.
o For efficiency, loaded DMA map is reused for error frames.
o DMA map loading failure is now gracefully handled. Previously
sis(4) ignored any DMA map loading errors.
o Nuke unused macros which are not appropriate for endianness
operation.
o Stop embedding driver maintained structures into descriptor
rings. Because TX/RX descriptor structures are shared between
host and controller, frequent bus_dmamap_sync(9) operations are
required in fast path. Embedding driver structures will increase
the size of DMA map which in turn will slow down performance.
r212124:
Fix stupid error in r212109 which didn't swap DMA maps. This caused
IOMMU panic on sparc64 under high TX load.
r212185:
Fix another bug introduced in r212109. We should unload DMA maps
only after sending the last fragment of a frame so the mbuf pointer
also should be stored in the last descriptor index.
yongari [Fri, 8 Oct 2010 19:27:34 +0000 (19:27 +0000)]
MFC r212971:
Remove unnecessary controller reinitialization.
StarFire controller does not require controller reinitialization to
program perfect filters. While here, make driver immediately exit
from interrupt/polling handler if driver reinitialized controller.
yongari [Fri, 8 Oct 2010 19:24:38 +0000 (19:24 +0000)]
MFC r212969:
Make sure to clear IFF_DRV_RUNNING to reinitialize controller.
While I'm here update if_oerrors counter when driver encounters
watchdog timeout.
yongari [Fri, 8 Oct 2010 18:49:59 +0000 (18:49 +0000)]
MFC r212069,212071:
r212069:
bge_txeof() already checks whether it has to free transmitted mbufs
or not by comparing reported TX consumer index with saved index. So
remove unnecessary check done after freeing transmitted mbufs.
While I'm here nuke unnecessary variable initializations.
r212071:
Remove unnecessary atomic operation in bge_poll. bge(4) always
holds a driver lock in the function entry and
memory synchronization is handled by bus_dmamap_sync(9).
yongari [Fri, 8 Oct 2010 18:43:06 +0000 (18:43 +0000)]
MFC r212061,212065,212302:
r212061:
Split common parent DMA tag into ring DMA tag and TX/RX mbuf DMA
tag. All controllers that are not BCM5755 or higher have 4GB
boundary DMA bug. Previously bge(4) used 32bit DMA address to
workaround the bug(r199670). However this caused the use of bounce
buffers such that it resulted in poor performance for systems which
have more than 4GB memory. Because bus_dma(9) honors boundary
restriction requirement of DMA tag for dynamic buffers, having a
separate TX/RX mbuf DMA tag will greatly reduce the possibility of
using bounce buffers. For DMA buffers allocated with
bus_dmamem_alloc(9), now bge(4) explicitly checks whether the
requested memory region crossed the boundary or not.
With this change, only the DMA buffer that crossed the boundary
will use 32bit DMA address. Other DMA buffers are not affected as
separate DMA tag is created for each DMA buffer.
Even if 32bit DMA address space is used for a buffer, the chance to
use bounce buffer is still very low as the size of buffer is small.
This change should eliminate most usage of bounce buffers on
systems that have more than 4GB memory.
More correct fix would be teaching bus_dma(9) to honor boundary
restriction for buffers created with bus_dmamem_alloc(9) but it
seems that is not easy.
While I'm here cleanup bge_dma_map_addr() and remove unnecessary
member variables in bge_dmamap_arg structure.
Tested by: marcel
r212065:
Handle PAE case correctly. You cannot effectively specify a 4GB
boundary in PAE case so use a 2GB boundary for PAE as suggested by
jhb.
Pointed out by: jhb
Reviewed by: jhb
r212302:
Make sure to create DMA'able memory for statistics block. This was
missed in r212061 and it caused crashes for 570x controllers as
controller DMAed statistics to physical address 0.
yongari [Fri, 8 Oct 2010 18:19:05 +0000 (18:19 +0000)]
MFC r213306:
Rename rl_setmulti() to rl_rxfilter() as rl_rxfilter() will handle
IFF_ALLMULTI/IFF_PROMISC as well as multicast filter configuration.
Rewrite RX filter logic to reduce number of register accesses and
make it handle promiscuous/allmulti toggling without controller
reinitialization.
Previously rl(4) counted on controller reinitialization to reprogram
promiscuous configuration but r211767 resulted in avoiding
controller reinitialization whenever promiscuous mode is toggled.
To address this, keep track of driver's view of interface state and
handle IFF_ALLMULTI/IFF_PROMISC changes without reinitializing
controller. This should fix a regression introduced in r211267.
While I'm here remove unnecessary variable reassignment in ioctl
handler.
emaste [Fri, 8 Oct 2010 14:59:14 +0000 (14:59 +0000)]
MFC r213013:
Move test for zero bufp or size before rseq and wseq calculation. This
avoids spinning in an infinite loop for some (possibly corrupt?) core
files at work.
emaste [Fri, 8 Oct 2010 14:56:39 +0000 (14:56 +0000)]
MFC r212570:
Allow a kernel config to specify a set but empty value via
'makeoptions OPTION=' for consistency with the make commandline.
Previously 'makeoptions WERROR=' would result in a syntax error; now
it produces the same effect as 'makeoptions WERROR'. Both forms now
result in 'WERROR=' in the generated Makefile.
delphij [Thu, 7 Oct 2010 00:36:58 +0000 (00:36 +0000)]
MFC r211059:
Address an edge condition that we found at work, where the carp(4)
interface goes to issue LINK_UP, then LINK_DOWN, then LINK_UP at
cold boot. This behavior is not observed when carp(4) interface
is created slightly later, when the underlying interface is fully
up.
Before this change what happen at boot is roughly:
- ifconfig creates em0 interface;
- ifconfig clones a carp device using em0;
(em0's link state is DOWN at this point)
- carp state: INIT -> BACKUP [*]
- carp state: BACKUP -> MASTER
- [Some negotiate between em0 and switch]
- em0 kicks up link state change event
(em0's link state is now up DOWN at this point)
- do_link_state_change() -> carp_carpdev_state()
- carp state: MASTER -> INIT (via carp_set_state(sc, INIT)) [+]
- carp state: INIT -> BACKUP
- carp state: BACKUP -> MASTER
At the [*] stage, em0 did not received any broadcast message from other
node, and assume our node is the master, thus carp(4) sets the link
state to "UP" after becoming a master. At [+], the master status
is forcely set to "INIT", then an election is casted, after which our
node would actually become a master.
We believe that at the [*] stage, the master status should remain as
"INIT" since the underlying parent interface's link state is not up.
Obtained from: iXsystems, Inc.
Reported by: jpaetzel
delphij [Thu, 7 Oct 2010 00:29:07 +0000 (00:29 +0000)]
MFC r213044:
In the past gunzip(1) write()'s after each inflate return. This is
not optimal from a performance standpoint since the write buffer is
not necessarily be filled up when the inflate rountine reached the
end of input buffer and it's not the end of file.
This problem gets uncovered by trying to pipe gunzip -c output to
a GEOM device directly, which enforces the writes be multiple of
sector size.
Sponsored by: iXsystems, Inc.
Reported by: jpaetzel
kib [Wed, 6 Oct 2010 10:00:37 +0000 (10:00 +0000)]
MFC r212998:
For sparc64 relocations that directly put bits of the symbol value into
the location, apply elf_relocaddr to the symbol value to have right
values for the symbols from dpcpu segment.
marius [Mon, 4 Oct 2010 20:13:19 +0000 (20:13 +0000)]
MFC: r213105, r213147
Improve r56796; the reply handler actually may remove the request from
the chain in which case it shouldn't be removed twice.
Reported by: Staale Kristoffersen
marius [Mon, 4 Oct 2010 20:02:48 +0000 (20:02 +0000)]
MFC: r213102
Remove the duplicate logging of failed read requests, whose error message
also was inappropriate as it triggered for every EACCESS and ENOTFOUND, not
just the case the -n option is intended to deal with and thus really spammed
us with ~20 messages in the default configuration when booting a diskless
FreeBSD client, introduced with r207608 (commited to stable/8 in 213038)
again.
MFC r213174:
Some schemes can allocate memory for internal purposes but when
GEOM does withering this memory doesn't freed. Add G_PART_DESTROY
call to g_part_wither. Also add missed g_free() call to G_PART_READ
method for MBR and PC98 schemes.
MFC r203261 (by marcel):
Export the UUID of the partition in the XML. The partition UUID is used
by EFI's device path to identify a partition. In order for FreeBSD to
add EFI boot options, proper device paths need to be constructed.
kib [Sat, 2 Oct 2010 17:41:47 +0000 (17:41 +0000)]
MFC r212824:
Adopt the deferring of object deallocation for the deleted map entries
on map unlock to the lock downgrade and later read unlock operation.
MFC r212868 (by alc) [1]:
Make refinements to r212824. Redo the implementation of
vm_map_unlock_and_wait().
MFC r212732:
Fix panic, when due to some kind of congestion on FIS-based switching
port multiplier some command triggers false positive timeout, but then
completes normally.
In the case of non-sequential mappings, ofw_mapmem() could ask Open
Firmware to map a memory region with negative length, causing crashes
and Undefined Behavior. Add the appropriate check to make the behavior
defined.
Fix a problem where device detection would work unreliably on Serverworks
K2 SATA controllers. The chip's status register must be read first, and
as a long, for other registers to be correctly updated after a command, and
this includes the command sequence in device detection as well as the
previously handled case after interrupts. While here, clean up some
previous hacks related to this controller.
MFC r211913:
Do not allocate multicast array memory in multicast filter
configuration function. For failed memory allocations, em(4)/lem(4)
called panic(9) which is not acceptable on production box.
igb(4)/ixgb(4)/ix(4) allocated the required memory in stack which
consumed 768 bytes of stack memory which looks too big.
To address these issues, allocate multicast array memory in device
attach time and make multicast configuration success under any
conditions. This change also removes the excessive use of memory in
stack.
MFC r211909:
If em(4) failed to allocate RX buffers, do not call panic(9).
Just showing some buffer allocation error is more appropriate
action for drivers. This should fix occasional panic reported on
em(4) when driver encountered resource shortage.
MFC r212755:
Fix incorrect RX BD producer updates. The producer index was
already updated after allocating mbuf so driver had to use the last
index instead of using next producer index. This should fix driver
hang which may happen under high network load.
Reported by: Igor Sysoev <is <> rambler-co dot ru>, Vlad Galu <dudu <> dudu dot ro>
Tested by: Igor Sysoev <is <> rambler-co dot ru>, Vlad Galu <dudu <> dudu dot ro>
MFC 212902:
Tweak the stats exported by the e1000 drivers:
- Add a single sysctl procedure to all three drivers to read an arbitrary
register (the register is passed as arg2). Use it to replace existing
routines in igb(4) that used a separate routine for each register, and
to add support for missing stats in em(4) and lem(4).
- Move the 'rx_overruns' and 'watchdog_timeouts' stats out of the MAC stats
section as they are driver stats, not MAC counters.
- Simplify the code that creates per-queue stats in igb(4) to use a single
loop and remove duplicated code.
- Properly read all 64 bits of the 'good octets received/transmitted' in
em(4) and lem(4).
- Actually read the interrupt count registers in em(4), and drop the
'host to card' sysctl stats from em(4) as they are not implemented in
any of the hardware this driver supports.
- Restore several stats to em(4) that were lost in the earlier stats
conversion including per-queue stats.
- Export several MAC stats in em(4) that were exported in igb(4) but not
in em(4).
- Export stats in lem(4) using individual sysctls as in em(4) and igb(4).
MFC: r212834
Fix nfsrv_freeallnfslocks() in the experimental NFSv4 server so that
it frees local locks correctly upon close. In order for
nfsrv_localunlock() to work correctly, the lock can no longer be in
the lockowner's stateid list. As such, nfsrv_freenfslock() has to
be called before nfsrv_localunlock(), to get rid of the lock structure
on the lockowner's stateid list. This only affected operation when
local locks (vfs.newnfs.enable_locallocks=1) are enabled, which is
not the default at this time.
MFC: r212833
Fix the experimental NFSv4 server so that it performs local VOP_ADVLOCK()
unlock operations correctly. It was passing in F_SETLK instead of
F_UNLCK as the operation for the unlock case. This only affected
operation when local locking (vfs.newnfs.enable_locallocks=1) was enabled.
MFC: r212443
This patch applies one of the two fixes suggested by
zack.kirsch at isilon.com for a race between nfsrv_freeopen()
and nfsrv_getlockfile() in the experimental NFS server that
he found during testing. Although nfsrv_freeopen() holds a
sleep lock on the lock file structure when called with
cansleep != 0, nfsrv_getlockfile() could still search the
list, once it acquired the NFSLOCKSTATE() mutex. I believe
that acquiring the mutex in nfsrv_freeopen() fixes the race.
MFC: r212439
Fix the NFSVNO_CMPFH() macro in the experimental NFS server so
that it works correctly for ZFS file handles. It is possible to
have two ZFS file handles that differ only in the bytes in the
fid_reserved field of the generic "struct fid" and comparing the
bytes in fid_data didn't catch this case. This patch changes the
macro to compare all bytes of "struct fid".
MFC: r212362
Fix the experimental NFS client so that it doesn't panic when
NFSv2,3 byte range locking is attempted. A fix that allows the
nlm_advlock() to work with both clients is in progress, but
may take a while. As such, I am doing this commit so that
the kernel doesn't panic in the meantime.
Don't attempt to write label with GEOM_BSD based method if the class is
not available. This improves error reporting when bsdlabel(8) is unable
to open a device for writing. If GEOM_BSD was unavailable, only a rather
obscure error message "Class not found" was printed.
In setusercontext(), do not apply user settings unless running as the
user in question (usually but not necessarily because we were called
with LOGIN_SETUSER). This plugs a hole where users could raise their
resource limits and expand their CPU mask.
MFC r211765:
Remove unnecessary controller reinitialization.
CAM filter handling was rewritten long time ago so it should not
require controller reinitialization.
MFC r211764:
Add PNP id for Compex RL2000.
I'm not sure whether adding this logical id is correct or not
because Compex RL2000 is in the list of supported hardware list.
I guess the Compex RL2000 could be PCI variant while the controller
in question is ISA controller. It seems PNP compat id didn't match
or it had multiple compat ids so isa_pnp_probe() seemed to return
ENOENT.