kib [Mon, 8 Nov 2010 10:23:39 +0000 (10:23 +0000)]
MFC r214026:
Do not synchronously start the nfsiod threads at all. The r212506
fixed the issues with file descriptor locks, but the same problems are
present for vnode lock/user map lock.
kib [Mon, 8 Nov 2010 10:18:01 +0000 (10:18 +0000)]
MFC r214851:
Fix a bug in r214049. The nvp == vp case shall be handled specially
only for !usevget case. If VFS_VGET is working, the vnode shared lock
is obtained recursively and vput() shall be done, not vunref().
MFC r214352 adapted to stable/8:
Reimplemented "gpart destroy -F". Now it does all work in kernel.
This was needed for recover implementation.
Implement the recover command for GPT. Now GPT will be marked as
corrupt when any of three types of corruption is detected:
1. Damaged primary GPT header or table
2. Damaged secondary GPT header or table
3. Secondary header is not located in the last LBA
A GPT marked as corrupt becomes read-only. Any changes to a corrupt table
are prohibited. Only the "destroy" and "recover" commands are allowed.
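The three corruption checks and the resulting read-only policy can be sketched roughly as follows. This is only an illustration of the rules stated above; the function names and parameters are hypothetical and do not correspond to the in-kernel GEOM code.

```python
def gpt_corruption_reasons(primary_ok, secondary_ok, secondary_lba, last_lba):
    """Return a list of the corruption conditions that apply (hypothetical helper)."""
    reasons = []
    if not primary_ok:
        reasons.append("damaged primary GPT header or table")
    if not secondary_ok:
        reasons.append("damaged secondary GPT header or table")
    if secondary_lba != last_lba:
        reasons.append("secondary header is not located in the last LBA")
    return reasons


def command_allowed(command, corrupt):
    """A corrupt table is read-only: only "destroy" and "recover" may run."""
    if not corrupt:
        return True
    return command in ("destroy", "recover")


# Example: the disk was grown, so the secondary header is no longer in the last LBA.
reasons = gpt_corruption_reasons(True, True, secondary_lba=1000, last_lba=2000)
corrupt = bool(reasons)
print(reasons, command_allowed("add", corrupt), command_allowed("recover", corrupt))
```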
marius [Sun, 7 Nov 2010 17:50:54 +0000 (17:50 +0000)]
MFC: r214528
- When resetting pm_active and pm_context of a pmap in pmap_pinit() we
need locking as otherwise we may race against the other parts of the
MD code which expects a consistent state of these. While at it move
the resetting of the pmap before entering it in the TSB.
- Spell a 0 as TLB_CTX_KERNEL.
marius [Sun, 7 Nov 2010 17:48:07 +0000 (17:48 +0000)]
MFC: r214264
- Add IFM_10_2 and IFM_10_5 media via tlphy(4) only in case the respective
interface also has such connectors.
- In tl_attach() unify three different ways of obtaining the device and
vendor IDs and remove the now obsolete tl_dinfo from tl_softc.
- Given that tlphy(4) only handles the integrated PHYs of NICs driven by
tl(4) make it only probe on the latter.
- Switch mlphy(4) and tlphy(4) to use mii_phy_add_media()/mii_phy_setmedia().
- Simplify looking for the respective companion PHY in mlphy(4) and tlphy(4)
by ignoring the native one by just comparing the device_t's directly rather
than the device name.
marius [Sun, 7 Nov 2010 17:35:42 +0000 (17:35 +0000)]
MFC: r214262
- Take advantage of mii_phy_dev_probe().
- Use mii_phy_add_media() instead of mii_add_media(). I'm not sure how
this driver actually managed to work before as mii_add_media() is
intended to be used together with mii_anar() while mii_phy_add_media()
is intended to be used with mii_phy_setmedia(), however this driver
used mii_add_media() along with mii_phy_setmedia().
marius [Sun, 7 Nov 2010 16:56:29 +0000 (16:56 +0000)]
MFC: r213894, r213896, r214913
Converted the remainder of the NIC drivers to use the mii_attach()
introduced in r213878 (MFC'ed to stable/8 in r214685) instead of
mii_phy_probe(). Unlike r213893 (MFC'ed to stable/8 in r214909) these
are only straightforward conversions though.
marius [Sun, 7 Nov 2010 11:12:29 +0000 (11:12 +0000)]
MFC: r213893, r213908, r214566, r214605, r214846
Convert the PHY drivers to honor the mii_flags passed down and convert
the NIC drivers as well as the PHY drivers to take advantage of the
mii_attach() introduced in r213878 (MFC'ed to stable/8 in r214684) to
get rid of certain hacks. For the most part these were:
- Artificially limiting miibus_{read,write}reg methods to certain PHY
addresses; we now let mii_attach() only probe the PHY at the desired
address(es) instead.
- PHY drivers setting MIIF_* flags based on the NIC driver they hang
off from, partly even based on grabbing and using the softc of the
parent; we now pass these flags down from the NIC to the PHY drivers
via mii_attach(). This got us rid of all such hacks except those of
brgphy() in combination with bce(4) and bge(4), which is way beyond
what can be expressed with simple flags.
While at it, I took the opportunity to change the NIC drivers to pass
up the error returned by mii_attach() (previously by mii_phy_probe())
and unify the error message used in this case where and as appropriate
as mii_attach() actually can fail for a number of reasons, not just
because of no PHY(s) being present at the expected address(es).
Make the IPsec SADB embedded route cache a union to be able to hold both the
legacy and IPv6 route destination address.
Previously in case of IPv6, there was a memory overwrite due to not enough
space for the IPv6 address.
lstewart [Sat, 6 Nov 2010 10:31:52 +0000 (10:31 +0000)]
MFC r213913:
Retire the system-wide, per-reassembly queue segment limit. The mechanism is far
too coarse grained to be useful and the default value significantly degrades TCP
performance on moderate to high bandwidth-delay product paths with non-zero loss
(e.g. 5+Mbps connections across the public Internet often suffer).
Replace the outgoing mechanism with an individual per-queue limit based on the
number of MSS segments that fit into the socket's receive buffer. This should
strike a good balance between performance and the potential for resource
exhaustion when FreeBSD is acting as a TCP receiver. With socket buffer
autotuning (which is enabled by default), the reassembly queue tracks the socket
buffer and benefits too.
As the XXX comment suggests, my testing uncovered some unexpected behaviour
which requires further investigation. By using so->so_rcv.sb_hiwat instead of
sbspace(&so->so_rcv), we allow more segments to be held across both the socket
receive buffer and reassembly queue than we probably should. The tradeoff is
better performance in at least one common scenario, versus a devious sender's
ability to consume more resources on a FreeBSD receiver.
Sponsored by: FreeBSD Foundation
Reviewed by: andre, gnn, rpaulo
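As a rough illustration of the per-queue limit described above, the sketch below derives a segment cap from the receive buffer high-water mark and the MSS. The function name, the rounding and the example numbers are assumptions for illustration; the exact formula used in the kernel may differ.

```python
def reass_queue_limit(sb_hiwat: int, mss: int) -> int:
    """Return how many whole MSS-sized segments fit in a receive buffer.

    sb_hiwat: receive buffer high-water mark in bytes (cf. so->so_rcv.sb_hiwat).
    mss:      maximum segment size in bytes.
    The helper itself is hypothetical and only mirrors the description above.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    return sb_hiwat // mss


# A 64 KiB receive buffer with a 1460-byte MSS holds 44 whole segments.
print(reass_queue_limit(65536, 1460))
```

Because the limit is derived from the buffer size, a buffer grown by autotuning yields a proportionally larger reassembly limit.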
lstewart [Sat, 6 Nov 2010 10:26:49 +0000 (10:26 +0000)]
MFC r213912:
- Switch the "net.inet.tcp.reass.cursegments" and
"net.inet.tcp.reass.maxsegments" sysctl variables to be based on UMA zone
stats. The value returned by the cursegments sysctl is approximate owing to
the way in which uma_zone_get_cur is implemented.
- Discontinue use of V_tcp_reass_qsize as a global reassembly segment count
variable in the reassembly implementation. The variable was used without
proper synchronisation and was duplicating accounting done by UMA already. The
lack of synchronisation was particularly problematic on SMP systems
terminating many TCP sessions, resulting in poor TCP performance for
connections with non-zero packet loss.
Sponsored by: FreeBSD Foundation
Reviewed by: andre, gnn, rpaulo (as part of a larger patch)
lstewart [Sat, 6 Nov 2010 10:17:43 +0000 (10:17 +0000)]
MFC r210203:
- Move common code from the hook functions that fills in a packet node struct to
a separate inline function. This further reduces duplicate code that didn't
have a good reason to stay as it was.
- Reorder the malloc of a pkt_node struct in the hook functions such that it
only occurs if we managed to find a usable tcpcb associated with the packet.
- Make the inp_locally_locked variable's type consistent with the prototype of
siftr_siftdata().
lstewart [Sat, 6 Nov 2010 10:06:58 +0000 (10:06 +0000)]
MFC r213910:
- Simplify implementation of uma_zone_get_max.
- Add uma_zone_get_cur which returns the current approximate occupancy of a
zone. This is useful for providing stats via sysctl amongst other things.
Sponsored by: FreeBSD Foundation
Reviewed by: gnn, jhb
lstewart [Sat, 6 Nov 2010 09:56:14 +0000 (09:56 +0000)]
MFC r211396 (originally committed by andre):
Add uma_zone_get_max() to obtain the effective limit after a call
to uma_zone_set_max().
The UMA zone limit is not exactly set to the value supplied but rounded up to
completely fill the backing store increment (a page normally). This can lead to
surprising situations where the number of elements allocated from UMA is higher
than the supplied limit value. The new get function reads back the effective
value so that the supplied limit value can be adjusted to the real limit.
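The rounding effect can be shown with a small sketch. It assumes the limit is rounded up to a whole number of backing-store increments, each holding a fixed number of items; the function name and the items-per-increment figure are hypothetical.

```python
def effective_zone_limit(requested: int, items_per_increment: int) -> int:
    """Round a requested item limit up to a whole number of increments."""
    if requested <= 0 or items_per_increment <= 0:
        raise ValueError("arguments must be positive")
    increments = -(-requested // items_per_increment)  # ceiling division
    return increments * items_per_increment


# Asking for 100 items when each increment holds 32 yields a limit of 128.
print(effective_zone_limit(100, 32))
```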
lstewart [Sat, 6 Nov 2010 09:42:41 +0000 (09:42 +0000)]
MFC r213158:
Internalise reassembly queue related functionality and variables which should
not be used outside of the reassembly queue implementation. Provide a new
function to flush all segments from a reassembly queue and call it from the
appropriate places instead of manipulating the queue directly.
Sponsored by: FreeBSD Foundation
Reviewed by: andre, gnn, rpaulo
lstewart [Sat, 6 Nov 2010 09:34:51 +0000 (09:34 +0000)]
MFC r209662,209665:
Import the Statistical Information For TCP Research (SIFTR) kernel module into
FreeBSD. SIFTR logs a range of statistics on active TCP connections to a log
file, providing the ability to make highly granular measurements of TCP
connection state. The tool is aimed at system administrators, developers and
researchers alike. Please take it for a spin and test it out - the man page
should have all the information required to get you going.
Many thanks go to the Cisco University Research Program Fund at Community
Foundation Silicon Valley and the FreeBSD Foundation. Their support of our work
at the Centre for Advanced Internet Architectures, Swinburne University of
Technology is greatly appreciated.
r209980:
Catch up with the rename of DPCPU_SUM to DPCPU_VARSUM.
r209982:
The SIFTR DPCPU statistics struct was not being zeroed between enable/disable
cycles so the values would accumulate rather than reset for each cycle.
Sponsored by: Cisco URP (r209662), FreeBSD Foundation
Reviewed by: dwmalone, gnn, rpaulo (r209662)
Tested by: Many on freebsd-current@ and elsewhere over the years
lstewart [Sat, 6 Nov 2010 09:23:49 +0000 (09:23 +0000)]
MFC r209050 (originally committed by jhb):
Add helper macros to iterate over available CPUs in the system.
CPU_FOREACH(i) iterates over the CPU IDs of all available CPUs. The
CPU_FIRST() and CPU_NEXT(i) macros can also be used to iterate over
available CPU IDs. CPU_NEXT(i) wraps around to CPU_FIRST() rather than
returning some sort of terminator.
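The wrap-around behaviour of CPU_NEXT(i) can be modelled as below. This is a Python model of the semantics described above, not the kernel macros; the set of present CPU IDs and the max_id parameter are assumptions made for the example.

```python
def cpu_foreach(present, max_id):
    """Return the IDs of all available CPUs in ascending order."""
    return [i for i in range(max_id + 1) if i in present]


def cpu_first(present, max_id):
    """Return the lowest available CPU ID."""
    return cpu_foreach(present, max_id)[0]


def cpu_next(present, i, max_id):
    """Return the next available CPU ID after i, wrapping to the first one."""
    n = max_id + 1
    for step in range(1, n + 1):
        candidate = (i + step) % n
        if candidate in present:
            return candidate
    raise ValueError("no CPUs present")


present = {0, 2, 3}  # CPU 1 is absent in this example
print(cpu_foreach(present, 3), cpu_next(present, 0, 3), cpu_next(present, 3, 3))
```

Since cpu_next() never returns a terminator, a loop built on it needs its own stop condition, such as returning to the starting ID.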
rmacklem [Fri, 5 Nov 2010 02:33:27 +0000 (02:33 +0000)]
MFC: r214511
Add a call for nfsrpc_close() to ncl_reclaim() in the experimental
NFSv4 client, since the call in ncl_inactive() might be missed
because VOP_INACTIVE() is not guaranteed to be called before
VOP_RECLAIM().
rmacklem [Fri, 5 Nov 2010 02:12:18 +0000 (02:12 +0000)]
MFC: r214406
Add a flag to the experimental NFSv4 client to indicate when
delegations are being returned for reasons other than a Recall.
Also, re-organize nfscl_recalldeleg() slightly, so that it leaves
clearing NMODIFIED to the ncl_flush() call and invalidates the
attribute cache after flushing. It is hoped that these changes
might fix the problem others have seen when using the NFSv4
client with delegations enabled, since I can't reliably reproduce
the problem. These changes only affect the client when doing NFSv4
mounts with delegations enabled.
jhb [Thu, 4 Nov 2010 17:19:16 +0000 (17:19 +0000)]
MFC 214448:
Use 'PCPU_GET(apic_id)' to determine the BSP's APIC ID on a UP machine
when routing interrupts instead of cpu_apic_ids[0] since cpu_apic_ids[]
is only populated for multiple-CPU machines. This also matches what the
code does when SMP is not enabled.
jhb [Thu, 4 Nov 2010 17:12:29 +0000 (17:12 +0000)]
MFC 214203:
- Add a new PCI quirk to whitelist an old chipset that doesn't support
PCI-express or PCI-X capabilities if we are running in a virtual machine.
- Whitelist the Intel 82440 chipset used by QEMU.
jhb [Thu, 4 Nov 2010 17:06:54 +0000 (17:06 +0000)]
MFC 211820,211821,212292:
Intel QPI chipsets actually provide extra "non-core" PCI buses that
provide PCI devices for various hardware such as memory controllers,
etc. for each socket. These PCI buses are not enumerated via ACPI
however. Add qpi(4) pseudo bus and Host-PCI bridge drivers to
enumerate these buses. Currently the driver uses the CPU ID to
determine the bridges' presence.
rmacklem [Wed, 3 Nov 2010 22:17:42 +0000 (22:17 +0000)]
MFC: r214255
Modify the experimental NFSv4 server's file handle hash function
to use the generic hash32_buf() function. Although adding the
bytes seemed sufficient for UFS and ZFS, since most of the bytes
are the same for file handles on the same volume, this might not
be sufficient for other file systems. Use of a generic function
also seems preferable to one specific to NFSv4.
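To illustrate why a plain byte sum can be a weak hash, the sketch below compares it with a simple multiplicative hash. The multiplier and seed are assumptions chosen for the example and are not claimed to match the constants used by hash32_buf().

```python
def sum_hash(fid: bytes) -> int:
    """Hash by adding the bytes: order-insensitive, so permutations collide."""
    return sum(fid) & 0xFFFFFFFF


def mult_hash(fid: bytes, seed: int = 5381) -> int:
    """Multiplicative 32-bit hash; each byte's position affects the result."""
    h = seed
    for b in fid:
        h = (h * 33 + b) & 0xFFFFFFFF
    return h


a = bytes([1, 2, 3, 4])
b = bytes([4, 3, 2, 1])
print(sum_hash(a) == sum_hash(b))    # True: same bytes in a different order
print(mult_hash(a) == mult_hash(b))  # False: position matters
```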
kib [Wed, 3 Nov 2010 21:21:12 +0000 (21:21 +0000)]
MFC r208453:
Reorganize syscall entry and leave handling.
Implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC.
The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.
MFC r208514:
Change ia64's struct syscall_args definition so that args is a pointer to
the arguments array instead of the array itself.
MFC r208566:
Allow to use syscallname(9) outside subr_trap.c.
MFC r209258 (by rpaulo):
Make DTrace syscall provider work again by including opt_kdtrace.h here.
MFC r209313:
Only enable kdtrace hook in the LINT on the architectures that implement it.
MFC r209697:
Obey sv_syscallnames bounds in syscallname().
NOTE: The KBI of the struct sysentvec is changed, new required members
sv_set_syscall_retval, sv_fetch_syscall_args and sv_syscallnames are
added. The sv_prepsyscall field is now ignored. Third-party modules
using the struct sysentvec must be modified and recompiled, we believe
that only ABI emulators are affected. No such out-of-tree modules are
known. In-tree modules that are affected by the change were converted
to depend on exact version of the kernel, see r214421.
nwhitehorn [Wed, 3 Nov 2010 15:31:37 +0000 (15:31 +0000)]
MFC r214349:
The EHCI_CAPLENGTH and EHCI_HCIVERSION registers are actually sub-registers
within the first 4 bytes of the EHCI memory space. For controllers that
use big-endian MMIO, reading them with 1- and 2-byte reads would then
return the wrong values. Instead, read the combined register with a
4-byte read and mask out the interesting quantities.
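The "read 4 bytes and mask" approach can be sketched as follows. The layout assumed here (CAPLENGTH in the low byte, HCIVERSION in the upper 16 bits of the little-endian 32-bit word) follows the EHCI capability register layout; the function name and sample bytes are made up for the example.

```python
import struct


def decode_ehci_caps(raw4: bytes):
    """Split the first 4 bytes of EHCI register space into its sub-registers."""
    (word,) = struct.unpack("<I", raw4)   # one combined 32-bit read
    caplength = word & 0xFF               # bits 7:0
    hciversion = (word >> 16) & 0xFFFF    # bits 31:16
    return caplength, hciversion


# Sample bytes: CAPLENGTH = 0x20, reserved = 0x00, HCIVERSION = 0x0100.
print(decode_ehci_caps(bytes([0x20, 0x00, 0x00, 0x01])))
```

Doing one full-width read and extracting fields arithmetically keeps the result independent of how narrower accesses are byte-swapped by the bus.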
jhb [Wed, 3 Nov 2010 15:25:30 +0000 (15:25 +0000)]
MFC 213672,213674,214396:
- Report subcommand handler errors in mfiutil so that tools that
invoke the utilities can robustly report errors.
- Fix compile with -DDEBUG by using the correct mfi_pd_ref union definition
in mfireg.h.
- Save errno values before calling warn(3) so that errors are correctly
reported.
- Use powerof2() from <sys/param.h> rather than a copy and paste version.
Remove setpgid() call before executing child process.
Using a separate process group here is bad, since (for example) job
control in the TTY layer prevents interaction with the TTY, causing the
child process to hang.
edwin [Wed, 3 Nov 2010 10:10:34 +0000 (10:10 +0000)]
MFC of r214002, r214010
- Stylify of uudecode(1)
Part of PR bin/124739.
- "b64decode -r" did not handle arbitrary breaks in base64 encoded
data. White space should be accepted anywhere in a base64 encoded
stream, not just after every chunk (4 characters).
Test-scenario:
VmVsb2NpdHkgUmV3YXJkcw==
and
VmVsb2NpdHkgUmV3YXJkcw
==
should both produce "Velocity Rewards"
PR: bin/124739
Submitted by: Mark Andrews <marka@isc.org>
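The test scenario above can be reproduced with a short sketch that discards all white space before decoding. It uses Python's standard base64 module rather than the b64decode(1) code; the helper name is hypothetical.

```python
import base64


def decode_b64_ignoring_whitespace(text: str) -> bytes:
    """Decode base64 data, accepting white space anywhere in the stream."""
    compact = "".join(text.split())  # drop spaces, tabs and newlines
    return base64.b64decode(compact, validate=True)


one_line = "VmVsb2NpdHkgUmV3YXJkcw=="
broken = "VmVsb2NpdHkgUmV3YXJkcw\n==\n"
print(decode_b64_ignoring_whitespace(one_line))
print(decode_b64_ignoring_whitespace(broken))
```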
kib [Wed, 3 Nov 2010 08:34:00 +0000 (08:34 +0000)]
MFC r214049:
When readdirplus() is handled on the exported filesystem that does
not support VFS_VGET, like msdosfs, do not call VOP_LOOKUP() for
dotdot on the root directory. Our filesystems expect that VFS handles
dotdot lookups on root on its own.
yongari [Wed, 3 Nov 2010 01:28:09 +0000 (01:28 +0000)]
MFC r214432:
Use a shortened model name and add RTL8168 and RTL8111 to the list of
supported devices. re(4) has supported all variants of RTL8168,
RTL8111 and RTL810x. I think this change will cover all controllers
supported by re(4).
yongari [Wed, 3 Nov 2010 01:24:33 +0000 (01:24 +0000)]
MFC r214302:
Add TSO support over VLAN for i82550/i82551. Controller requires
VLAN hardware tagging to make TSO work over VLAN. So if VLAN
hardware tagging is disabled explicitly clear TSO over VLAN. While
I'm here allow disabling VLAN TX checksum offloading.
yongari [Wed, 3 Nov 2010 00:03:26 +0000 (00:03 +0000)]
MFC r214087,214219,214251,214292:
r214087:
Add workaround for BCM5906 controller silicon bug. If the device
receives two back-to-back send BDs with less than or equal to 8
total bytes then the device may hang. The two back-to-back send
BDs must be in the same frame for this failure to occur.
Thanks to davidch for detailed errata information.
Reviewed by: davidch
r214219:
Add workaround for BCM5906 A1 controller silicon bug. When
auto-negotiation results in half-duplex operation, excess collision
on the ethernet link may cause internal chip delays that may result
in subsequent valid frames being dropped due to insufficient
receive buffer resources. The workaround is to choose de-pipeline
method as a flow control decision for SDI. De-pipeline method
allows only 1 data in TxMbuf at a time such that a request to RDMA
from SDI is made only when TxMbuf is empty. Thanks to david for
providing detailed errata information.
r214251:
Apply the same workaround for SDI flow control used on BCM5906 A1
to BCM5906 A0/A2. This should fix the long-standing BCM5906 A2 lockup
issues. The data sheet explicitly mentions that BCM5906 A0, A1 and A2
use de-pipelined mode.
Special thanks to Buganini who tried all combinations of
experimental patches for more than 10 days.
Tested by: Buganini <buganini <> gmail dot com >
r214292:
Use bge_chipid to compare controller ids. r214251 incorrectly used
bge_chiprev.
Reported by: Buganini <buganini <> gmail dot com >
yongari [Tue, 2 Nov 2010 23:54:59 +0000 (23:54 +0000)]
MFC r213747,213808,214216:
r213747:
Protect bge(4) from accessing invalid NIC internal memory regions
on BCM5906.
Tested by: Buganini < buganini <> gmail dot com >
r213808:
Add more checks for resolved link speed in bge_miibus_statchg().
Link UP state could be reported first before actual completion of
auto-negotiation. This change makes bge(4) reprogram BGE_MAC_MODE,
BGE_TX_MODE and BGE_RX_MODE register only after controller got a
valid link.
r214216:
Enable TX MAC state machine lockup fix for both BCM5755 or higher
and BCM5906. Publicly available data sheet just says it may happen
due to corrupted TxMbuf.
yongari [Tue, 2 Nov 2010 23:48:08 +0000 (23:48 +0000)]
MFC r213522,213587,213711:
r213522:
Fix a long-standing bug which regarded some revisions of the controller
as 5788. This caused the BGE_MISC_LOCAL_CTL register to be used to
generate link state change interrupts for non-5788 controllers. The
interrupt handler may or may not detect link state attention as
status block wouldn't be updated when an interrupt was generated
with BGE_MISC_LOCAL_CTL register. All controllers except 5700 and
5788 should use host coalescing mode register to trigger an
interrupt.
r213587:
Do not blindly UP the interface when interface's MTU is changed. If
driver is not running there is no need to up the interface. While
I'm here hold driver lock before modifying MTU as it is referenced
in RX handler.
r213711:
The IFF_DRV_RUNNING flag is set at the end of bge_init_locked. But
before setting the flag, interrupt was already enabled such that
interrupt handler could be run before setting IFF_DRV_RUNNING flag.
This can lose the initial link state change interrupt, which in turn
makes bge(4) think that it still does not have a valid link. Fix this
race by protecting the taskqueue with a driver lock.
While I'm here move reenabling interrupt code after handling of link
state change.
yongari [Tue, 2 Nov 2010 23:41:43 +0000 (23:41 +0000)]
MFC r213495,213742:
r213495:
Add more comments to rings supported by the controller. Different
versions of controller support different number of ring control
blocks such that adjust code a bit to access known number of
send/receive ring control blocks. Previously bge(4) blindly
accessed 16 send/receive RCBs. Also move initializing standard
receive producer ring producer index, jumbo receive producer ring
producer index and mini receive producer ring producer index to
the end of each receive producer ring initialization.
Do not assume mini receive producer ring is available only when
controller has jumbo frame capability, instead explicitly check
ASIC version BCM5700 to disable mini receive producer ring.
Additionally always enable send ring 0 regardless of controller
versions. Previously bge(4) didn't enable send ring 0 if controller
is BGE_IS_5705_PLUS. Because bge(4) needs at least 1 send ring to send
frames, I have no idea how it would have worked so far.
Submitted by: davidch
r213742:
Fix a regression introduced in r213495. r213495 disabled mini
receive producer ring only for BCM5700. It was believed that
BCM5700 with external SSRAM is the only controller that supports
mini ring but it seems all BCM570[0-4] requires to disable mini
receive producer ring. Otherwise, it caused unexpected RX DMA
error or watchdog timeouts.
Reported by: marius, Steve Kargl <sgk <> troutmask dot apl dot washington dot edu>
Tested by: marius, Steve Kargl <sgk <> troutmask dot apl dot washington dot edu>
yongari [Tue, 2 Nov 2010 23:35:08 +0000 (23:35 +0000)]
MFC r213485,213710,213812:
r213485:
Overhaul MII register access routine and remove unnecessary
BGE_MI_MODE register accesses. Previously bge(4) used to read
BGE_MI_MODE register to detect whether it needs to disable
autopolling feature or not. Because we don't touch autopolling in
other part of driver there is no reason to read BGE_MI_MODE
register given that we know default value in advance. In order to
achieve the goal, check whether the controller has CPMU(Central
Power Management Unit) capability. If controller has CPMU feature,
use 500KHz MII management interface(mdio/mdc) frequency regardless
core clock frequency. Otherwise use default MII clock. While I'm
here, add CPMU register definition.
In bge_miibus_readreg(), rearrange code a bit and remove goto
statement. In bge_miibus_writereg(), make sure to restore
autopolling even if MII write failed. The delay time inserted after
accessing BGE_MI_MODE register increased from 40us to 80us.
The default PHY address is now stored in softc. All PHYs supported
by bge(4) currently use PHY address 1 but it will be changed when
we add newer controllers. This change will make it easier to change
default PHY address depending on PHY models.
Submitted by: davidch
r213710:
Remove one last reference of BGE_MI_MODE register for auto polling.
Previously bge(4) always enabled auto polling for non-BGE_FLAG_TBI
controllers. With this change, auto polling is not used anymore so
polling through mii(4) was introduced.
Reviewed by: davidch
r213812:
Fix a regression introduced in r213710. r213710 removed the use of
auto polling such that it made all controllers obtain link status
information from the state of the LNKRDY input signal. Broadcom
recommends disabling auto polling such that driver should rely on
PHY interrupts for link status change indications. Unfortunately it
seems some controllers(BCM5703, BCM5704 and BCM5705) have PHY
related issues so Linux took other approach to workaround it.
bge(4) didn't follow that and it used to enable auto polling to
workaround it. Restore this old behavior for BCM5700 family
controllers and BCM5705 to use auto polling. For BCM5700 and
BCM5701, it seems it does not need to enable auto polling but I
restored it for safety.
Special thanks to marius who tried lots of patches with patience.
yongari [Tue, 2 Nov 2010 23:23:48 +0000 (23:23 +0000)]
MFC r213411,213464-213465,213468:
r213411:
Enable fix for read DMA FIFO overruns on controllers that have this
fix. Note, we still need workaround for controllers that lack this
fix and it needs more work in RX BD updating.
Submitted by: davidch
r213464:
Separate common flags into controller specific and PHY related
flags. There should be no functional changes. This change will make
it easy to add more quirk/flags in future.
Reviewed by: davidch
r213465:
Rearrange code a bit to correctly set PHY flags. This change makes
it easy to add more newer ASICs.
Obtained from: OpenBSD
r213468:
Fix bge(4) build breakage when BGE_REGISTER_DEBUG is defined.
yongari [Tue, 2 Nov 2010 23:04:23 +0000 (23:04 +0000)]
MFC r213316,213333-213334:
r213316:
Fix IFCAP_TXCSUM/IFCAP_RXCSUM handling. Previously bge(4) used
IFCAP_HWCSUM to know which capability should be changed such that
disabling RX checksum offloading resulted in disabling TX checksum
offloading.
r213333:
Allow write DMA to request larger DMA burst size to get better
performance on BCM5785.
yongari [Tue, 2 Nov 2010 22:57:20 +0000 (22:57 +0000)]
MFC r213283,213410:
r213283:
Implement hardware MAC statistics for BCM5705 or newer Broadcom
controllers. bge(4) exported MAC statistics on controllers that
maintain the statistics in the NIC's internal memory. Newer
controllers require register access to fetch these values. These
counters provide useful information to diagnose driver issues.
r213410:
Consistently use ifHCOutOctets/ifHCInOctets instead of Octets as
these names are used in data sheet. Also use UnicastPkts,
MulticastPkts and BroadcastPkts instead of UcastPkts, McastPkts
and BcastPkts to clarify its meaning.
Move all NV defines into nv.c, they are not used externally thus there is
no need to make them visible from outside.
r214283:
Implement nv_exists() function that returns true if argument of the given
name exists.
r214284:
Before this change on first connect between primary and secondary we
initialize all the data. This is huge waste of time and resources if
there were no writes yet, as there is no real data to synchronize.
Optimize this by sending "virgin" argument to secondary, which gives it a hint
that synchronization is not needed.
In the common case (where both nodes are configured at the same time) instead
of synchronizing everything, we don't synchronize at all.
r214692:
Send packets to remote node only via the send thread to avoid possible
races - in this case a keepalive packet was sent from the wrong thread, which
led to the connection being dropped because of a corrupted packet.
Fix it by sending keepalive packets directly from the send thread.
As a bonus we now send keepalive packets only when connection is idle.
yongari [Tue, 2 Nov 2010 22:44:51 +0000 (22:44 +0000)]
MFC r213081,213225,213280:
r213081:
Always show asic/chip revision in device attach phase. There are
too many bge(4) controllers there and model name does not
necessarily match asic/chip revision. Relying on VPD string made
it hard to identify exact asic/chip revision so the first step to
debug bge(4) was getting exact asic/chip information with verbose
boot which may not be available on production server.
r213255:
Set the number of RX frames to receive after RX MBUF low watermark
has reached. This reduced number of dropped frames when
flow-control is enabled. Previously it dropped incoming frames once
RX MBUF low watermark has reached. The value used in MAC RX MBUF
low watermark is greater than or equal to 4 so receiving two more
RX frames should not be a problem.
Obtained from: OpenBSD
r213280:
After r207391, brgphy(4) passes resolved flow-control settings to
parent driver. Use that information to configure flow-control.
One drawback is there is no way to disable flow-control as we still
don't have proper way to not advertise RX/TX pause capability to
link partner. But I don't think it would cause severe problems and
users can selectively disable flow-control in switch port.
pjd [Tue, 2 Nov 2010 22:30:19 +0000 (22:30 +0000)]
MFC r211854:
- When VFS_VGET() is not supported, switch to VOP_LOOKUP().
- We are fine by only share-locking the vnode.
- Remove assertion that doesn't hold for ZFS where we cross mount points
boundaries by going into .zfs/snapshot/<name>/.
marius [Tue, 2 Nov 2010 22:12:06 +0000 (22:12 +0000)]
MFC: r214526
Partially revert r203829 (MFC'ed to stable/7 in r205920); as it turns out
what the PowerPC OFW loader did was incorrect as further down the road
cons_probe() calls malloc() so the former can't be called before init_heap()
has succeeded. Instead just exit to the firmware in case init_heap() fails
like OF_init() does when hitting a problem as we're then likely running in
a very broken environment where hardly anything can be trusted to work.
marius [Tue, 2 Nov 2010 20:06:46 +0000 (20:06 +0000)]
MFC: r213878
Add a NetBSD-compatible mii_attach(), which is intended to eventually
replace mii_phy_probe() altogether. Compared to the latter the advantages
of mii_attach() are:
- intended to be called multiple times in order to attach PHYs in multiple
passes (f.e. in order to only use sub-ranges of the 0 to MII_NPHY - 1
range)
- being able to pass along the capability mask from the NIC to the PHY
drivers
- being able to specify at which address (phyloc) to probe for a PHY
(instead of always probing at all addresses from 0 to MII_NPHY - 1)
- being able to specify which PHY instance (offloc) to attach
- being able to pass along MIIF_* flags from the NIC to the PHY drivers
(f.e. as required to indicate to the PHY drivers that flow control is
supported by the NIC driver, which actually is the motivation for this
change).
While at it, I used the opportunity to get rid of some hacks in mii(4)
like miibus_probe() generally doing work besides sheer probing and the
"EVIL HACK" (which will vanish entirely along with mii_phy_probe()) by
passing the struct ifnet pointer via an argument of mii_attach() as well
as to fix some resource leaks in mii(4) in case something fails.
Commits which will update the PHY drivers to honor the MII flags passed
down from the NIC drivers and take advantage of mii_attach() to get rid
of certain types of hacks in NIC and PHY drivers as well as a conversion
of the remaining uses of mii_phy_probe() will follow shortly.
mav [Tue, 2 Nov 2010 09:26:12 +0000 (09:26 +0000)]
MFC r214016:
Set of legacy mode SATA enhancements:
- Implement proper combined mode decoding for Intel controllers to properly
identify SATA and PATA channels and associate ATA channels with SATA ports.
This fixes wrong reporting and in some cases hard resets to wrong SATA ports.
- Improve SATA registers support to handle hot-plug events and potentially
interface errors. For ICH5/6300ESB chipsets these registers accessible via
PCI config space. For later ones they may be accessible via PCI BAR(5).
- For controllers not generating interrupts on hot-plug events, implement
periodic status polling. Use it to detect hot-plug on Intel and VIA
controllers. Same probably could also be used for Serverworks and SIS.
mav [Tue, 2 Nov 2010 09:15:27 +0000 (09:15 +0000)]
MFC r213301:
Revert r132291.
Restore setting PIO/WDMA timings for VIA UDMA133 controllers.
Linux disables only AST register writing there, but not all timings.
mav [Tue, 2 Nov 2010 09:05:40 +0000 (09:05 +0000)]
MFC r214102:
Workaround strange situation when EDMA_RESQIP register returns zero instead
of proper value. It caused bunch of "EMPTY CRPB" messages and potentially
may cause premature requests completion, which could cause data corruption.
For most cases it seems enough to just reread register to get proper value.
To protect against worse cases - erase processed queue entries with
impossible values and ignore them if problem still happen.
bschmidt [Mon, 1 Nov 2010 19:05:38 +0000 (19:05 +0000)]
MFC r214160,214162,214236
r214236 & r214160:
The firmware does pad notifications to an even number of bytes (at least
the association notification), the included information though always
contains an elem block with an odd number of bytes. We handle the last
byte as if it might contain a whole elem block, this of course is not
true as one byte is not enough to hold a block, we therefore discard the
complete frame. The solution here is to subtract one from the actual
notification length, this is also what the Linux driver does. With this
change the frame ends exactly where the last elem block ends.
r214162:
The firmware always sets bit 14 and 15, to get the real associd we need
to clear those bits.
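Clearing bits 14 and 15 amounts to masking the value with 0x3fff, as in the sketch below. The function name and the sample value are hypothetical.

```python
ASSOCID_MASK = 0x3FFF  # keeps bits 0..13, clears bits 14 and 15


def real_associd(raw: int) -> int:
    """Strip the two top bits that the firmware always sets."""
    return raw & ASSOCID_MASK


# A raw value of 0xC001 has bits 14 and 15 set; the real associd is 1.
print(real_associd(0xC001))
```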
trasz [Mon, 1 Nov 2010 15:36:47 +0000 (15:36 +0000)]
MFC r212906:
First step at adapting FreeBSD to support PSARC/2010/029. This makes
acl_is_trivial_np(3) properly recognize the new trivial ACLs. From
the user point of view, that means "ls -l" no longer shows plus signs
for all the files when running ZFS v28.
rmacklem [Mon, 1 Nov 2010 02:21:35 +0000 (02:21 +0000)]
MFC: r214224
Modify the file handle hash function in the experimental NFS
server so that it will work better for non-UFS file systems.
The new function simply sums the bytes of the fh_fid field
of fhandle_t.
rmacklem [Mon, 1 Nov 2010 01:55:15 +0000 (01:55 +0000)]
MFC: r214149
Modify the experimental NFS server in a manner analogous to
r214049 for the regular NFS server, so that it will not do
a VOP_LOOKUP() of ".." when at the root of a file system
when performing a ReaddirPlus RPC.