emaste [Fri, 20 Jan 2012 00:20:00 +0000 (00:20 +0000)]
MFC r216269:
Don't warn if a partition appears not to be aligned on a track boundary.
Modern disks use LBA and create a fake CHS geometry that doesn't have any
relation to the on-disk layout of data.
gnn [Thu, 19 Jan 2012 19:39:41 +0000 (19:39 +0000)]
MFC: 229965
Fix for PR 138526.
Add the ability for /dev/null and /dev/zero to accept
being set into non blocking mode via fcntl(). This
brings the code into compliance with IEEE Std 1003.1-2001
as referenced in another PR, 94729.
truckman [Wed, 18 Jan 2012 21:50:59 +0000 (21:50 +0000)]
MFC: r229984
Pass the arguments to mtx_init() in the correct order. There should be
no change to the binary because the value of MTX_DEF is zero and there
is a visible function prototype.
bz [Tue, 17 Jan 2012 22:08:58 +0000 (22:08 +0000)]
MFC r225048:
In this branch when doing no further checkes there is no reason use
the temporary variable and check with if as TUNABLE_*_FETCH do not
alter values unless successfully found the tunable.
bz [Tue, 17 Jan 2012 22:02:11 +0000 (22:02 +0000)]
MFC r224516:
Introduce a tunable to disable the time consuming parts of bootup
memtesting, which can easily save seconds to minutes of boot time.
The tunable name is kept general to allow reusing the code in
alternate frameworks.
glebius [Fri, 13 Jan 2012 23:25:58 +0000 (23:25 +0000)]
Merge r228463, that explicily uses 255.0.0.0 mask for the temporary prefix.
This change isn't actually needed in the stable/8, but let it be here, in
case if anyone tries to run stable/8 world on a head/ kernel.
jhb [Fri, 13 Jan 2012 20:35:43 +0000 (20:35 +0000)]
MFC 221891,229665,229672,229700:
Remove the assertion from tcp_input() that rcv_nxt is always greater
than or equal to rcv_adv and fix tcp_twstart() to handle this case by
assuming the last window was zero rather than a negative value.
The code in tcp_input() already safely handled this case. It can happen
due to delayed ACKs along with a remote sender that sends data beyond
the window we previously advertised. If we have room in our socket buffer
for the extra data beyond the advertised window, we will accept it.
However, if the ACK for that segment is delayed, then we will not
effectively fixup rcv_adv to account for that extra data until the
next segment arrives and forces out an ACK. When that next segment
arrives, rcv_nxt will be beyond rcv_adv.
jhb [Fri, 13 Jan 2012 20:25:56 +0000 (20:25 +0000)]
MFC 228960:
Cap the priority calculated from the current thread's running tick count
at SCHED_PRI_RANGE to prevent overflows in the priority value. This can
happen due to irregularities with clock interrupts under certain
virtualization environments.
jhb [Fri, 13 Jan 2012 20:15:49 +0000 (20:15 +0000)]
MFC 229429:
Some small fixes to CPU accounting for threads:
- Only initialize the per-cpu switchticks and switchtime in sched_throw()
for the very first context switch on APs during boot. This avoids a
small gap between the middle of thread_exit() and sched_throw() where
time is not accounted to any thread.
- In thread_exit(), update the timestamp bookkeeping to track the changes
to mi_switch() introduced by td_rux so that the code once again matches
the comment claiming it is mimicing mi_switch(). Specifically, only
update the per-thread stats directly and depend on ruxagg() to update
p_rux rather than adjusting p_rux directly. While here, move the
timestamp bookkeeping as late in the function as possible.
jhb [Fri, 13 Jan 2012 19:51:15 +0000 (19:51 +0000)]
MFC 229390,229420,229479:
Fix some races in the multicast code by removing places where we would
drop the IF_ADDR_LOCK while walking an interface's multicast address list:
- Use TAILQ_FOREACH() instead of TAILQ_FOREACH_SAFE() for some loops that
do not modify the queues they iterate over.
- When cancelling multicast timers on an interface, don't release the
reference on a group in the leaving state while iterating over the loop.
Instead, use the same approach used in igmp_ifdetach() and mld_ifdetach()
of placing the groups to free on a pending release list and then releasing
the references after dropping the IF_ADDR_LOCK.
- Use the mli_relinmhead list normally used to defer calls to
in6m_release_locked() to defer calls to mld_v1_transmit_report() until
after the IF_ADDR_LOCK is dropped.
jhb [Fri, 13 Jan 2012 19:20:33 +0000 (19:20 +0000)]
MFC 229414,229476,229477:
Various fixes to the SIOC[DG]LIFADDR ioctl handlers:
- Grab a reference on any matching interface address (ifa) before dropping
the IF_ADDR_LOCK() and release the reference after using it to prevent a
potential use-after-free.
- Fix the IPv4 ioctl handlers in in_lifaddr_ioctl() to work with IPv4
interface addresses rather than IPv6.
- Add missing interface address list locking in the IPv4 handlers.
jhb [Fri, 13 Jan 2012 19:13:43 +0000 (19:13 +0000)]
MFC 215605,215606,222952,229400:
Various improvements to the 'cscope' target:
- Add x86 to ALL_ARCH.
- Add lex and yacc sources to things cscope'd.
- Include sys/xen in cscope tag file generation.
- Improve the cscope target's handling of MD directories. Automatically
include the MACHINE_ARCH directory if it differs from MACHINE when
building an index for a single machine. Also, include the 'x86'
directory when building an index for i386, pc98, or amd64.
jhb [Fri, 13 Jan 2012 18:54:10 +0000 (18:54 +0000)]
MFC 228849, 229727:
Add post-VOP hooks for VOP_DELETEEXTATTR() and VOP_SETEXTATTR() and use
these to trigger a NOTE_ATTRIB EVFILT_VNODE kevent when the extended
attributes of a vnode are changed.
jhb [Fri, 13 Jan 2012 18:49:28 +0000 (18:49 +0000)]
MFC 228738:
Allow boot0cfg to force a PXE boot via boot0 on the next boot.
- Fix boot0 to check for PXE when using the pre-set setting for the
preferred slice.
- Update boot0cfg to use slice 6 to select PXE. Accept a 'pxe' argument
instead of a number for the 's' option as a way to select PXE as well.
mckusick [Fri, 13 Jan 2012 07:10:52 +0000 (07:10 +0000)]
MFC: 226520
The current /etc/dumpdates file restricts device names to 32 characters.
With the addition of various GEOM layers some device names now exceed
this length, for example /dev/mirror/encrypted.elig.journal. This
change expands the field to 53 bytes which brings the /etc/dumpdates
lines to 80 characters. Exceeding 80 characters makes the /etc/dumpdates
file much less human readable. A test is added to dump so that it
verifies that the device name will fit in the 53 character field
failing the dump if it is too long.
This change has been checked to verify that its /etc/dumpdates file
is compatible with older versions of dump.
Reported by: Martin Sugioarto <martin@sugioarto.com>
PR: kern/160678
mav [Thu, 12 Jan 2012 15:57:03 +0000 (15:57 +0000)]
MFC r228461:
Fix few bugs in isp(4) target mode support:
- in destroy_lun_state() assert hold == 1 instead of 0, as it should
receive hold taken by the create_lun_state() or get_lun_statep() before;
- fix hold count leak inside rls_lun_statep() that also fired above assert;
- in destroy_lun_state() use SIM bus number instead of SIM path id for
ISP_GET_PC_ADDR(), as it was before r196008;
- make isp_disable_lun() to set status in CCB;
- make isp_target_mark_aborted() set status into the proper CCB.
mav [Thu, 12 Jan 2012 15:02:51 +0000 (15:02 +0000)]
MFC r228808, r228847, 229395:
r228808, r228847:
Make cd driver to handle Audio CDs, reporting their 2352 bytes sectors to
GEOM and using READ CD command for reading data, same as acd driver does.
Audio CDs identified by checking respective bit of the control field of
the first track in TOC.
229395:
Add support for CDRIOCGETBLOCKSIZE and CDRIOCSETBLOCKSIZE IOCTLs to control
sector size same as acd driver does. Together with r228808 and r228847 this
allows existing multimedia/vlc to play Audio CDs via CAM cd driver.
mckusick [Wed, 11 Jan 2012 19:12:29 +0000 (19:12 +0000)]
MFC: 226265
When unmounting a filesystem always wait for the vfs_busy lock to clear
so that if no vnodes in the filesystem are actively in use the unmount
will succeed rather than failing with EBUSY.
Reported by: Garrett Cooper
Reviewed by: Attilio Rao and Kostik Belousov
Tested by: Garrett Cooper
PR: kern/161016
mav [Wed, 11 Jan 2012 18:14:22 +0000 (18:14 +0000)]
MFC r228726, r228727:
Cast some vendor-specific spell on VIA VT1708S codecs to:
- make analog input loopback work;
- get access to the mics boost controls.
rmacklem [Wed, 11 Jan 2012 01:58:49 +0000 (01:58 +0000)]
MFC: r228827
During investigation of an NFSv4 client crash reported by glebius@,
jhb@ spotted that nfscl_getstateid() might modify credentials when
called from nfsrpc_read() for the case where p != NULL, whereas
nfsrpc_read() only did a crdup() to get new credentials for p == NULL.
This bug was introduced by r195510, since pre-r195510 nfscl_getstateid()
only modified credentials for the p == NULL case. This patch modifies
nfsrpc_read()/nfsrpc_write() so that they do crdup() for the p != NULL case.
It is conceivable that this bug caused the crash reported by glebius@, but
that will not be determined for some time, since the crash occurred after
about 1month of operation.
rmacklem [Tue, 10 Jan 2012 02:55:43 +0000 (02:55 +0000)]
MFC: r228757
jwd@ reported a problem via email where the old NFS client would
get a reply of EEXIST from an NFS server when a Mkdir RPC was retried,
for an NFS over UDP mount.
Upon investigation, it was found that the client was retransmitting
the Mkdir RPC request over UDP, but with a different xid. As such,
the retransmitted message would miss the Duplicate Request Cache
in the server, causing it to reply EEXIST. The kernel client side
UDP rpc code has two timers. The first one causes a retransmit using
the same xid and socket and was set to a fixed value of 3seconds.
(The default can be overridden via CLSET_RETRY_TIMEOUT.)
The second one creates a new socket and xid and should be larger
than the first. However, both NFS clients were setting the second
timer to nm_timeo ("timeout=<value>" mount argument), which defaulted to
1second, so the first timer would never time out.
This patch fixes both NFS clients so that they set the first timer
using nm_timeo and makes the second timer larger than the first one.
yongari [Mon, 9 Jan 2012 20:14:52 +0000 (20:14 +0000)]
MFC r198999:
Take a step towards removing if_watchdog/if_timer. Don't explicitly set
if_watchdog/if_timer to NULL/0 when initializing an ifnet. if_alloc()
sets those members to NULL/0 already.
yongari [Mon, 9 Jan 2012 19:30:23 +0000 (19:30 +0000)]
MFC r228716:
TCP header size is represented by number of 32bits words.
Fix the TCP header size calculation such that makes TSO engine
cache all header(ethernet/IP/TCP) bytes to its internal buffer.
While here, remove extra pull up for TCP payload. Unlike some
em(4) controllers, fxp(4) does not require such work around for
TSO.
The two limitations are ethernet/IP/TCP header size should be less
than or equal to the size of controller's internal buffer(80 bytes)
and these header information should be found in the first fragment
of a TSO frame.
yongari [Mon, 9 Jan 2012 19:08:52 +0000 (19:08 +0000)]
MFC r228476:
Rework link state tracking and remove superfluous link UP/DOWN
messages.
o Add check for actually resolved speed in miibus_statchg callback
instead of blindly reprogramming BCE_EMAC_MODE register. The
callback may be called multiple times(e.g. link UP, link
transition, auto-negotiate complete etc) while auto-negotiation
is in progress. All unresolved link state changes are ignored
now and setting BCE_EMAC_MODE after link establishment is done
once.
o bce(4) is careful enough not to drive MII_TICK if driver got a
valid link. To detect lost link, bce(4) relied on link state
change interrupt and if driver see the interrupt, it forced to
drive MII_TICK by calling bce_tick() in interrupt handler.
Because bce(4) generates multiple link state change interrupts
while auto-negotiation is in progress, bce_tick() would be
called multiple times and this resulted in generating multiple
link UP/DOWN messages.
With this change, bce_tick() is not called in interrupt handler
anymore such that miibus_statchg callback handles link state
changes with consistent manner.
yongari [Mon, 9 Jan 2012 18:32:45 +0000 (18:32 +0000)]
MFC r210522,213489,218423,218527:
r210522:
Fix an apparent typo.
r213489:
Add the capability to read the complete contents of the NVRAM via sysctl
dev.bce.<unit>.nvram_dump
Add the capability to write the complete contents of the NVRAM via sysctl
dev.bce.<unit>.nvram_write
These are only available if the kernel option BCE_DEBUG is enabled.
The nvram_write sysctl also requires the kernel option
BCE_NVRAM_WRITE_SUPPORT to be enabled. These are to be used at your
own caution. Since the MAC addresses are stored in the NVRAM, if you
dump one NIC and restore it on another NIC the destination NIC's
MAC addresses will not be preserved. A tool can be made using these
sysctl's to manage the on-chip firmware.
r218423:
- Added systcls for header splitting, RX/TX buffer count, interrupt
coalescing, strict RX MTU, verbose output, and shared memory debug.
- Added additional debug counters (VLAN tags and split header frames).
- Updated debug counters to 64 bit definitions.
- Updated l2fhdr bit definitions.
- Combined RX buffer sizing into a single function.
- Added buffer size and interrupt coalescing settings to adapter info
printout.
r218527:
- Added error checking to nvram read functions.
- Minor style updates.
rmacklem [Sun, 8 Jan 2012 23:30:23 +0000 (23:30 +0000)]
MFC: r228560
Patch the new NFS server in a manner analagous to r228520 for the
old NFS server, so that it correctly handles a count == 0 argument
for Commit.
rmacklem [Sun, 8 Jan 2012 01:09:00 +0000 (01:09 +0000)]
MFC: r228260
This patch adds a sysctl to the NFSv4 server which optionally disables the
check for a UTF-8 compliant file name. Enabling this sysctl results in
an NFSv4 server that is non-RFC3530 compliant, therefore it is not enabled
by default. However, enabling this sysctl results in NFSv3 compatible
behaviour and fixes the problem reported by "dan at sunsaturn.com"
to freebsd-current@ on Nov. 14, 2011 under the subject "NFSV4 readlink_stat".
- Instead of printing file's content calculate MD5 hash of the file,
so it can be easly compared to the hash calculated via file system.
- Some other minor improvements.
MFC r226612 (pjd):
Because ZFS boot code was very fragile in the past and real PITA to debug,
introduce zfsboottest.sh script that will verify if it will be possible to boot
from the given pool.
# zfsboottest.sh system
Where "system" is pool name of the pool we want to boot from.
What is being verified by the script:
- Does the pool exist?
- Does it have bootfs property configured?
- Is mountpoint property of the boot dataset set to 'legacy'?
Dataset configured in bootfs property has to be mounted to perform more
checks:
- Does the /boot directory in boot dataset exist?
- Is this dataset configured as root file system in /etc/fstab or set
in vfs.root.mountfrom variable in /boot/loader.conf?
By using zfsboottest tool the script will read all the files in /boot
directory using ZFS boot code and calculate their checksums.
Then, it will walk /boot directory using find(1) though regular file sytem
and also read all the files in /boot directory and calculate their checksums.
If any of the files cannot be looked up, read or checksum is invalid it will
be reported and booting off of this pool is probably not possible.
Some additional checks may be interesting as well. For example if the disks
contain proper pmbr and gptzfsboot code or if all expected files in /boot/
are present.
When upgrading FreeBSD, one should snapshot datasets that contain operating
system, upgrade (install new world and kernel) and use zfsboottest.sh to verify
if it will be possible to boot from new configuration. If all is good one
should upgrade boot blocks, by eg.:
MFC r226550 (pjd):
Initialize 'rc' properly before using it. This error could lead to infinite
loop when data reconstruction was needed.
MFC r226551 (pjd):
Don't mark vdev as healthy too soon, so we won't try to use invalid vdevs.
MFC r226552 (pjd):
Never pass NULL block pointer when reading. This is neither expected nor
handled by lower layers like vdev_raidz, which uses bp for checksum
verification. This bug could lead to NULL pointer reference and resets
during boot.
MFC r226553 (pjd):
Always pass data size for checksum verification function, as using
physical block size declared in bp may not always be what we want.
For example in case of gang block header physical block size declared
in bp is much larger than SPA_GANGBLOCKSIZE (512 bytes) and checksum
calculation failed. This bug could lead to accessing unallocated
memory and resets/failures during boot.
MFC r226568 (pjd) [1]:
- Correctly read gang header from raidz.
- Decompress assembled gang block data if compressed.
- Verify checksum of a gang header.
- Verify checksum of assembled gang block data.
- Verify checksum of uber block.
rmacklem [Sat, 7 Jan 2012 02:09:49 +0000 (02:09 +0000)]
MFC: r228217
Post r223774, the NFSv4 client no longer has multiple instances
of the same lock_owner4 string. As such, the handling of cleanup
of lock_owners could be simplified. This simplification permitted
the client to do a ReleaseLockOwner operation when the process that
the lock_owner4 string represents, has exited. This permits the
server to release any storage related to the lock_owner4 string
before the associated open is closed. Without this change, it
is possible to exhaust a server's storage when a long running
process opens a file and then many child processes do locking
on the file, because the open doesn't get closed. A similar patch
was applied to the Linux NFSv4 client recently so that it wouldn't
exhaust a server's storage.
yongari [Sat, 7 Jan 2012 01:08:17 +0000 (01:08 +0000)]
MFC r207761:
Belatedly merge r207761. For unknown reason r207761 was not
fully merged (r208073) to stable/8 but mergeinfo was recorded.
Add a fastpath to allocate from packet zone when using m_getjcl.
This will add support for packet zone for at least igb and ixgbe
and will avoid to check for that in bce and mxge.
hselasky [Fri, 6 Jan 2012 22:54:03 +0000 (22:54 +0000)]
Fix build of ehci_mbus.c by applying patches similar
to ones in r228483. This file was missed by a recent
MFC because the file is named differently in 10-current.
yongari [Fri, 6 Jan 2012 21:45:08 +0000 (21:45 +0000)]
MFC r228333,228335-228336,228362,228368-228369,228381:
r228333:
Protect SIOCSIFMTU ioctl handler with driver lock.
Don't blindly re-initialize controller whenever MTU is changed.
Now, reinitializing is done only when driver is running.
While here, remove unnecessary assignment of error value since it
was already initialized to 0.
r228335:
Consistently use a tab character instead of using either a space or
tab after #define.
While I'm here consistently use capital letters when it uses
hexadecimal notation.
No functional changes.
r228336:
Disable all clocks and put PHY into COMA before entering into
suspend state. This will save more power.
On resume, make sure to enable all clocks. While I'm here, if
controller is not fast ethernet, enable gigabit PHY.
r228362:
Do not disable interrupt without knowing whether the raised
interrupt is ours. Note, interrupts are automatically ACKed when
the status register is read.
Add RX/TX DMA error to interrupt handler and do full controller
reset if driver happen to encounter these errors. There is no way
to recover from these DMA errors without controller reset.
Rename local variable name intrs with status to enhance
readability.
While I'm here, rename ET_INTR_TXEOF and ET_INTR_RXEOF to
ET_INTR_TXDMA and ET_INTR_RXDMA respectively. These interrupts
indicate that a frame is successfully DMAed to controller's
internal FIFO and they have nothing to do with EOF(end of frame).
Driver does not need to wait actual end of TX/RX of a frame(e.g.
no need to wait the end signal of TX which is generated when a
frame in TX FIFO is emptied by MAC). Previous names were somewhat
confusing.
r228368:
Remove unnecessary definition of ET_PCIR_BAR. Controller support
I/O memory only.
While here, use pci_set_max_read_req(9) rather than directly
manipulating PCIe device control register.
r228369:
Announce flow control ability to PHY driver and enable RX flow
control. Controller does not automatically generate pause frames
based on number of available RX buffers so it's very hard to
know when driver should generate XON frame in time. The only
mechanism driver can detect low number of RX buffer condition is
ET_INTR_RXRING0_LOW or ET_INTR_RXRING1_LOW interrupt. This
interrupt is generated whenever controller notices the number of
available RX buffers are lower than pre-programmed value(
ET_RX_RING0_MINCNT and ET_RX_RING1_MINCNT register). This scheme
does not provide a way to detect when controller sees enough number
of RX buffers again such that efficient generation of XON/XOFF
frame is not easy.
While here, add more flow control related register definition.
r228381:
FreeBSD driver does not require arpcom structure in softc.
jhb [Fri, 6 Jan 2012 19:32:39 +0000 (19:32 +0000)]
MFC 226217,227070,227341,227502:
Add the posix_fadvise(2) system call. It is somewhat similar to
madvise(2) except that it operates on a file descriptor instead of a
memory region. It is currently only supported on regular files.
Note that this adds a new VOP, so all filesystem modules must be
recompiled.
yongari [Fri, 6 Jan 2012 19:26:31 +0000 (19:26 +0000)]
MFC r228326-228327,228331-228332:
r228326:
Controller does not require TX start command for every frame. So
send a single TX command after setting up all TX frames. This
removes unnecessary register accesses and bus_dmamap_sync(9) calls.
et(4) uses TX interrupt moderation so it's possible to have TX
buffers that were already transmitted but waiting for TX completion
interrupt. If the number of available TX descriptor is less then
1/3 of total TX descriptor, try reclaiming first to get enough free
TX descriptors before setting up TX descriptors.
After r228325, et_txeof() no longer tries to send frames after
reclaiming TX buffers. That change was made to give more chance
to transmit frames in main interrupt handler since we can still
send frames in interrupt handler with RX interrupt. So right
before exiting interrupt hander, after enabling interrupt, try to
send more frames. This gives slightly better performance numbers.
While I'm here reduce number of spare TX descriptors from 8 to 4.
Controller does not require reserved TX descriptors, it was just to
reduce TX overhead. After r228325, driver has much lower TX
overhead so it does not make sense to reserve 8 TX descriptors.
r228327:
Remove et_enable_intrs(), et_disable_intrs() functions and
manipulation of interrupt register access is done through
CSR_WRITE_4 macro. Also add disabling interrupt into et_reset()
because we want interrupt disabled state after controller reset.
While I'm here slightly change interrupt handler to be more
readable one.
r228331:
Rework link state tracking and TX/RX MAC configuration.
o Do not report link status if driver is not running.
o TX/RX MAC configuration should be done with resolved speed,
duplex and flow control after establishing a link so it can't
be done in driver initialization routine.
Move the configuration to miibus_statchg callback which will be
called whenever any link state change is detected.
At this moment, flow-control is not enabled yet mainly because
I was not able to set correct flow control parameters to
generate TX pause frames.
o Now TX/RX MAC is enabled only when a valid link is detected.
Rearragnge hardware initialization routine a bit to leave
enabling MAC to miibus_statchg callback. In order to that,
TX/RX DMA engine is enabled in et_init_locked().
o Introduce ET_FLAG_LINK flag to track current link state.
o Introduce ET_FLAG_FASTETHER flag to mark whether controller is
fast ethernet. This flag is checked in miibus_statchg callback
to know whether PHY established a valid link.
o In et_stop(), TX/RX MAC is explicitly disabled instead of
relying on et_reset(). And move et_reset() from et_stop() to
controller initialization. Controler reset is not required here
and it would also clear critial registers(i.e station address,
RX filter configuration, WOL etc) that are required to make WOL
work.
o Switching to current media is done in et_init_locked() after
setting IFF_DRV_RUNNING flag. This should ensure reliable
auto-negotiation/manual link establishment.
o In et_start_locked(), check whether driver got a valid link
before trying to send frames.
o Remove checking a link in et_tick() as this is done by
miibus_statchg callback.
r228332:
Implement hardware MAC statistics counter. Counters could be
queried with dev.et.%d.stats sysctl node where %d is an instance of
device.
yongari [Fri, 6 Jan 2012 19:11:20 +0000 (19:11 +0000)]
MFC r228325:
Overhaul bus_dma(9) usage in et(4) and clean up TX/RX path. This
change should make et(4) work on any architectures.
o Remove m_getl inline function and replace it with stanard mbuf
interfaces. Previous code tried to minimize code duplication
but this came from incorrect use of common DMA tag.
Driver may be still use a common RX allocation handler with
additional structure changes but I don't see much point to do
that it would make it hard to understand the code.
o Remove DragonflyBSD specific constant EVL_ENCAPLEN, use
ETHER_VLAN_ENCAP_LEN instead.
o Add bunch of new RX status definition. It seems controller
supports RX checksum offloading but I was not able to make the
feature work yet. Currently driver checks whether recevied
frame is good one or not.
o Avoid a typedef ending in '_t' as style(9) says.
o Controller has no restriction on DMA address space, so there
is no reason to limit the DMA address to 32bit. Descriptor
rings, status blocks and TX/RX buffers now use full 64bit DMA
addressing.
o Allocate DMA memory shared between host and controller as
coherent.
o Create 3 separate DMA tags to be used as TX, mini RX ring and
stanard RX ring. Previously it created a single DMA tag and it
was used to all three rings.
o et(4) does not support jumbo frame at this moment and I still
don't quite understand how jumbo frame works on this controller
so use two RX rings to handle small sized frame and normal sized
frame respectively. The mini RX ring will be used to receive
frames that are less than or equal to 127 bytes. The second RX
ring is used to receive frames that are not handled by the first
RX ring.
If jumbo frame support is implemented, driver may have to choose
better RX scheme by letting the second RX ring handle jumbo
frames. This scheme will mimic Broadcom's efficient jumbo frame
handling feature. However RAM buffer size(16KB) of the
controller is too small to hold 2 jumbo frames, if 9KB
jumbo frame is used, I'm not sure how good performance would it
have.
o In et_rxeof(), make sure to check whether controller received
good frame or not. Passing corrupted frame to upper layer is
bad idea.
o If driver receives a bad frame or driver fails to allocate RX
buffer due to resource shortage condition, reuse previously
loaded DMA map for RX buffer instead of unloading/loading RX
buffer again.
o et_init_tx_ring() never fails so change return type to void.
o In watchdog handler, show TX DMA write back status of errored
frame which could be used as a clue to debug watchdog timeout.
o Add missing bus_dmamap_sync() in various places such that et(4)
should work with bounce buffers(e.g. PAE).
o TX side bus_dmamap_load_mbuf_sg(9) support.
o RX side bus_dmamap_load_mbuf_sg(9) support.
o Controller has no DMA alignment limit in RX buffer so use
m_adj(9) in RX buffer allocation to make IP header align on 2
bytes boundary. Otherwise it would trigger unaligned access
error in upper layer on strict alignment architectures.
One of down side of controller is it provides limited set of RX
buffer length like most Intel controllers. This is not problem
at this moment because driver does not support jumbo frame yet
but it may require alignment fixup code to support jumbo frame
on strict alignment architectures.
o In et_txeof(), don't zero TX descriptors for transmitted frames.
TX descriptors don't need write access after transmission.
Driver sets IFF_DRV_OACTIVE when the number of available TX
descriptors are less than or equal to ET_NSEG_SPARE. Make sure
to clear IFF_DRV_OACTIVE only when the number of available TX
descriptor is greater than ET_NSEG_SPARE.