MFC of 1.133 import and fixes:
- add support for T3C
- add multicast support
- update copyrights
- add infrastructure for multiple transmit queues
- add support for compiling firmware in to the kernel
- add conditional define for enabling link at device attach
- exit tick handler if shutdown is in progress
- add sysctls for dumping transmit queues
- use jumbo clusters for large packets
- add pcpu caching
- add inline mbuf header on receive
MFC if_rl.c, rev 1.174 to RELENG_7.
It seems that RealTek 8129/8139 chip reports invalid length of
received frame under certain conditions. wpaul said the length
0xfff0 is special meaning that indicates hardware is in the
process of copying a packet into host memory. But it seems
there are other cases that hardware is busy or stuck in bad
situation even if the received frame length is not 0xfff0.
To work-around this condition, add a check that verifys that
recevied frame length is in valid range. If received length is out
of range reinitialize hardware to recover from stuck condition.
These files were repocopied for HEAD. The repo copy process renames
tags, adding a prefix of 'old_', so the history for these files is
in old_RELENG_7 etc.
MFC: nlm_prot_impl.c 1.5, clnt_rc.c 1.3
Avoid error cascades when trying to monitor a host that doesn't resolve in
DNS. Tighten error handling when attempting to communicate with the userland
part of rpc.lockd.
Add rfcomm_pppd_server rc script to allow start rfcomm_pppd(8) in server
mode at boot time. Multiple profiles can be started at the same time.
The whole idea is very similar to the ppp rc script.
- Add write(2) support for psm(4) in native operation level. Now arbitrary
commands can be written to /dev/psm%d and status can be read back from it.
- Clean up and fix style(9) nits.
MFC sf(4) to latest HEAD.
o sf(4) repocopied from sys/pci to sys/dev/sf
Overhaul sf(4) to make it run on all architectures and implement
checksum offoload by downloading AIC-6915 firmware. Changes are
o Header file cleanup.
o Simplified probe logic.
o s/u_int{8,16,32}_t/uint{8,16,32}_t/g
o K&R -> ANSI C.
o In register access function, added support both memory mapped and
IO space register acccess. The function will dynamically detect
which method would be choosed.
o sf_setperf() was modified to support strict-alignment
architectures.
o Use SF_MII_DATAPORT instead of hardcoded value 0xffff.
o Added link state/speed, duplex changes handling task q. The task q
is also responsible for flow control settings.
o Always hornor link up/down state reported by mii layers. The link
state information is used in sf_start() to determine whether we
got a valid link.
o Added experimental flow-control setup. It was commented out but
will be activated once we have flow-cotrol infrastructure in mii
layer.
o Simplify IFF_UP/IFCAP_POLLING and IFF_PROMISC handling logic. Rx
filter always honors promiscuous mode.
o Implemented suspend/resume methods.
o Reorganized Rx filter routine so promiscuous mode changes doesn't
require interface re-initialization.
o Reimplemnted driver probe routine such that it looks for matching
device from supported hardware list table. This change will help to
add newer hardware revision to the driver.
o Use ETHER_ADDR_LEN instead of hardcoded value.
o Prefer memory space register mapping over I/O space as the hardware
requires lots of register access to get various consumer/producer
index. Failing to get memory space mapping, sf(4) falls back to I/O
space mapping. Use of memory space register mapping requires
somewhat large memory space(512K), though.
o Switch to simpler bus_{read,write}_{1,2,4}.
o Use PCIR_BAR macro to get BARs.
o Program PCI cache line size if the cache line size was set to 0
and enable PCI MWI.
o Add a new sysctl node 'dev.sf.N.stats' that shows various MAC
counters for Rx/Tx statistics.
o Add a sysctl node to configure interrupt moderation timer. The
timer defers interrupts generation until time specified in timer
control register is expired. The value in the timer register is in
units of 102.4us. The allowable range for the timer is 0 - 31
(0 ~ 3.276ms).
The default value is 1(102.4us). Users can change the timer value
with dev.sf.N.int_mod sysctl(8) variable/loader(8) tunable.
o bus_dma(9) conversion
- Enable 64bit DMA addressing.
- Enable 64bit descriptor format support.
- Apply descriptor ring alignment requirements(256 bytes alignment).
- Apply Rx buffer address alignment requirements(4 bytes alignment).
- Apply 4GB boundary restrictions(Tx/Rx ring and its completion ring
should live in the same 4GB address space.)
- Set number of allowable number of DMA segments to 16. In fact,
AIC-6915 doesn't have a limit for number of DMA segments but it
would be waste of Tx descriptor resource if we allow more than 16.
- Rx/Tx side bus_dmamap_load_mbuf_sg(9) support.
- Added alignment fixup code for strict-alignment architectures.
- Added endianness support code in Tx/Rx descriptor access.
With these changes sf(4) should work on all platforms.
o Don't set if_mtu in device attach, it's handled in ether_ifattach.
o Use our own callout to drive watchdog timer.
o Enable VLAN oversized frames and announce sf(4)'s VLAN capability
to upper layer.
o In sf_detach(), remove mtx_initialized KASSERT as it's not possible
to get there without initialzing the mutex. Also mark that we're
about to detaching so active bpf listeners do not panic the system.
o To reduce PCI register access cycles, Rx completion ring is
directly scanned instead of reading consumer/producer index
registers. In theory, Tx completion ring also can be directly
scanned. However the completion ring is composed of two types
completion(1 for Tx done and 1 and DMA done). So reading producer
index via register access would be more safer way to detect the
ring wrap-around.
o In sf_rxeof(), don't use m_devget(9) to align recevied frames. The
alignment is required only for strict-alignment architectures and
now the alignment is handled by sf_fixup_rx() if required. The
removal of the copy operation in fast path should increase Rx
performance a lot on non-strict-alignemnt architectures such as
i386 and amd64.
o In sf_newbuf(), don't set descriptor valid bit as sf(4) is
programmed to run with normal mode. In normal mode, the valid bit
have no meaning. The valid bit should be used only when the
hardware uses polling(prefetch) mode. The end of descriptor queue
bit could be used if needed, but sf(4) relys on auto-wrapping of
hardware on 256 descriptor queue entries so both valid and
descriptor end bit are not used anymore.
o Don't disable generation of Tx DMA completion as said in datasheet
and use the Tx DMA completion entry instead of relying on Tx done
completion entry. Also added additional Tx completion entry type
check in Tx completion handler.
o Don't blindly reset watchdog timer in sf_txeof(). sf(4) now unarm
the the watchdog only if there are no active Tx descriptors in Tx
queue.
o Don't manually update various counters in driver, instead, use
built-in MAC statistic registers to update them. The statistic
registers are updated in every second.
o Modified Tx underrun handlers to increase the threshold value
in units of 256 bytes. Previously it used to increase 16 bytes
at a time which seems to take too long to stabalize whenever Tx
underrun occurrs.
o In interrupt handler, additional check for the interrupt is
performed such that interrupts only for this device is allowed to
process descriptor rings. Because reading SF_ISR register clears
all interrtups, nuke writing to a SF_ISR register.
o Tx underrun is abonormal condition and SF_ISR_ABNORMALINTR includes
the interrupt. So there is no need to inspect the Tx underrun again
in main interrupt loop.
o Don't blindly reinitialize hardware for abnormal interrupt
condition. sf(4) reintializes the hardware only when it encounters
DMA error which requires an explicit hardware reinitialization.
o Fix a long standing bug that incorrectly clears MAC statistic
registers in sf_init_locked.
o Added strict-alignment safe way of ethernet address reprogramming
as IF_LLADDR may return unaligned address.
o Move sf_reset() to sf_init_locked in order to always reset the
hardware to a known state prior to configuring hardware.
o Set default Rx DMA, Tx DMA paramters as shown in datasheet.
o Enable PCI busmaster logic and autopadding for VLAN frames.
o Rework sf_encap.
- Previously sf(4) used to type 0 of Tx descriptor with padding
enabled to store driver private data. Emebedding private data
structures into descriptors is bad idea as the structure size
would be different between 64bit and 32bit architectures. The
type 0 descriptor allows fixed number of DMA segments in
a descriptor format and provides relatively simple interface to
manage multi-fragmented frames.
However, it wastes lots of Tx descriptors as not all frames are
fragmented as the number of allowable segments in a descriptor.
- To overcome the limitation of type 0 descriptor, switch to type
2 descriptor which allows 64bit DMA addressing and can handle
unliumited number of fragmented DMA segments. The drawback of
type 2 descriptor is in its complexity in managing descriptors
as driver should handle the end of Tx ring manually.
- Manually set Tx desciptor queue end mark and record number of
used descriptors to reclaim used descriptors in sf_txeof().
o Rework sf_start.
- Honor link up/down state before attempting transmission.
- Because sf(4) uses only one of two Tx queues, use low priority
queue instead of high one. This will remove one shift operation
in each Tx kick command.
- Cache last produder index into softc such that subsequenet Tx
operation doesn't need to access producer index register.
o Rewrote sf_stats_update to include all available MAC statistic
counters.
o Employ AIC-6915 firmware from Adaptec and implement firmware
download routine and TCP/UDP checksum offload.
Partial checksum offload support was commented out due to the
possibility of firmware bug in RxGFP.
The firmware can strip VLAN tag in Rx path but the lack of firmware
assistance of VLAN tag insertion in transmit side made it useless
on FreeBSD. Unlike checksum offload, FreeBSD requires both Tx/Rx
hardware VLAN assistance capability. The firmware may also detect
wakeup frame and can wake system up from states other than D0.
However, the lack of wakeup support form D3cold state keep me from
adding WOL capability. Also detecting WOL frame requires firmware
support but it's not yet known to me whether the firmware can
process the WOL frame.
o Changed *_ADDR_HIADDR to *_ADDR_HI to match other definitions of
registers.
o Added definitioan to interrupt moderation related constants.
o Redefined SF_INTRS to include Tx DMA done and DMA errors. Removed
Tx done as it's not needed anymore.
o Added definition for Rx/Tx DMA high priority threshold.
o Nuked unused marco SF_IDX_LO, SF_IDX_HI.
o Added complete MAC statistic register definition.
o Modified sf_stats structure to hold all MAC statistic regiters.
o Nuke various driver private padding data in Tx/Rx descriptor
definition. sf(4) no longer requires private padding. Also remove
unused padding related definitions. This greatly simplifies
descriptor manipulation on 64bit architectures.
o Becase we no longer pad driver private data into descriptor,
remove deprecated/not-applicable comments for padding.
o Redefine Rx/Tx desciptor status. sf(4) doesn't use bit fileds
anymore to support endianness.
o Import AIC-6915 firmware for GFP from Adaptec.
MFC:
Make ALTQ cope with disappearing interfaces (particularly common with mpd
and netgraph in gernal). This also allows to add queues for an interface
that is not yet existing (you have to provide the bandwidth for the
interface, however).
- Do as the comment in pmap_bootstrap() suggests and flush all non-locked
TLB entries possibly left over by the firmware and also do so while
bootstrapping APs.
- Use __FBSDID.
MFC:
- add _umtx_op_err function to improve stability because of errno
changed by application signal handler code.
- use kernel based userland rwlock to implement pthread_rwlock,
improve performance in most cases.
marius [Fri, 11 Apr 2008 23:38:07 +0000 (23:38 +0000)]
MFC: if_dc.c 1.194; if_dcreg.h 1.55
- Const'ify the dc_devs array.
- Correct the maxsize parameter when creating the mbufs busdma tag to
reflect the actual requirement of dc(4).
- Move the KASSERT in dc_newbuf() to the right spot.
- Also convert the TX side to take advantage of bus_dmamap_load_mbuf_sg(9).
- Move the comment regarding dc_start_locked() to the right spot.
Add support for displaying a process' current working directory, root
directory, and jail directory within procstat. While this functionality
is available already in fstat, encapsulating it in the kern.proc.filedesc
sysctl makes it accessible without using kvm and thus without needing
elevated permissions.
The new procstat output looks like:
PID COMM FD T V FLAGS REF OFFSET PRO NAME
76792 tcsh cwd v d -------- - - - /usr/src
76792 tcsh root v d -------- - - - /
76792 tcsh 15 v c rw------ 16 9130 - -
76792 tcsh 16 v c rw------ 16 9130 - -
76792 tcsh 17 v c rw------ 16 9130 - -
76792 tcsh 18 v c rw------ 16 9130 - -
76792 tcsh 19 v c rw------ 16 9130 - -
I am also bumping __FreeBSD_version for this as this new feature will be
used in at least one port.
Reviewed by: rwatson
Approved by: rwatson
Note that in the MFC, __FreeBSD_version is not bumped as we will bump it
once (shortly) for all procstat(1) MFC changes together.
Merge procstat.c:1.2, procstat_basic.c:1.2, procstat_files.c:1.4,
procstat_kstack.c:1.3, procstat_threads.c:1.3, procstat_vm.c:1.2 from
HEAD to RELENG_7:
WARNS fixes: mainly constness and avoid comparing signed with
unsigned by making array indicies unsigned. Also note one or two
unused parameters.
Merge init_main.c:1.290, kern_proc.c:1.261, kern_resource.c:1.182,
kern_sync.c:1.305, proc.h:1.499 from HEAD to RELENG_7:
Don't zero td_runtime when billing thread CPU usage to the process;
maintain a separate td_incruntime to hold unbilled CPU usage for
the thread that has the previous properties of td_runtime.
When thread information is requested using the thread monitoring
sysctls, export thread td_runtime instead of process rusage runtime
in kinfo_proc.
This restores the display of individual ithread and other kernel
thread CPU usage since inception in ps -H and top -SH, as well for
libthr user threads, valuable debugging information lost with the
move to try kthreads since they are no longer independent processes.
There is universal agreement that we should rewrite the process and
thread export sysctls, but this commit gets things going a bit
better in the mean time. Likewise, there are resevations about the
continued validity of statclock given the speed of modern processors.
Reviewed by: attilio, emaste, jhb, julian
Note: in MFC, td_incruntime is added to the end of struc thread,
rather than the pre-zero'd section, and manually zero'd on thread
creation, in order to avoid a negative ABI impact for third-party
kernel modules.
In "show lockedvnods" DDB command, use db_printf() rather than printf()
so that the results end up in the DDB output stream rather than the
console output stream.
This should likely also be done for the vprint() function it calls.
Return ESRCH when a kernel stack is queried on a process in execve() --
p_candebug() will return EAGAIN which, if the other process never
leaves execve(), will result in the sysctl spinning and never returning
to userspace. Processes should always eventually leave execve(), but
spinning in kernel while we wait is bad for countless reasons, and
particularly harmful if execve() itself is deadlocked.
Possibly we should return another error, or return a marker indicating
the thread is in execve() so it can be reported that way in userspace.
Merge procstat_args.c:1.2, procstat_bin.c:1.2, procstat_cred.c:1.2,
and procstat_files.c:1.2 from HEAD to RELENG_7:
Add 'COMM' column to a few more output modes of procstat(1). The only
one it's missing from is the VM display, where there's really not room,
and the file output display is looking quite cramped.
MFC if_stge.c rev 1.12 to RELENG_7.
Use m_collapse(9) to collapse mbuf chains instead of relying on
shortest possible chain of mbufs of m_defrag(9). What we want is
chains of mbufs that can be safely stored to a Tx descriptor which
can have up to STGE_MAXTXSEGS mbufs. The ethernet controller does
not need to align Tx buffers on 32bit boundary. So the use of
m_defrag(9) was waste of time.
MFC if_stge.c rev 1.11, if_stgereg.h rev 1.4 to RELENG_7.
Implement WOL capability.
- Turn on WOL bits in suspend/shutdown method.
- WOL is disabled in resume routine as WOL can interfere normal
Rx operation.
- Move stge_reset() to stge_init_locked() as resetting hardware
clears configured Rx information which in turn results in
non-working Rx module after suspend/shutdown operation.
MFC vr(4) to latest HEAD.
o vr(4) repocopied from sys/pci to sys/dev/vr.
Teach vr(4) to use bus_dma(9) and major overhauling to handle link
state change and reliable error recovery.
o Moved vr_softc structure and relevant macros to header file.
o Use PCIR_BAR macro to get BARs.
o Implemented suspend/resume methods.
o Implemented automatic Tx threshold configuration which will be
activated when it suffers from Tx underrun. Also Tx underrun
will try to restart only Tx path and resort to previous
full-reset(both Rx/Tx) operation if restarting Tx path have failed.
o Removed old bit-banging MII interface. Rhine provides simple and
efficient MII interface. While I'm here show PHY address and PHY
register number when its read/write operation was failed.
o Define VR_MII_TIMEOUT constant and use it in MII access routines.
o Always honor link up/down state reported by mii layers. The link
state information is used in vr_start() to determine whether we
got a valid link.
o Removed vr_setcfg() which is now handled in vr_link_task(), link
state taskqueue handler. When mii layer reports link state changes
the taskqueue handler reprograms MAC to reflect negotiated duplex
settings. Flow-control changes are not handled yet and it should
be revisited when mii layer knows the notion of flow-control.
o Added a new sysctl interface to get statistics of an instance of
the driver.(sysctl dev.vr.0.stats=1)
o Chip name was renamed to reflect the official name of the chips
described in VIA Rhine I/II/III datasheet.
REV_ID_3065_A -> REV_ID_VT6102_A
REV_ID_3065_B -> REV_ID_VT6102_B
REV_ID_3065_C -> REV_ID_VT6102_C
REV_ID_3106_J -> REV_ID_VT6105_A0
REV_ID_3106_S -> REV_ID_VT6105M_A0
The following chip revisions were added.
#define REV_ID_VT6105_B0 0x83
#define REV_ID_VT6105_LOM 0x8A
#define REV_ID_VT6107_A0 0x8C
#define REV_ID_VT6107_A1 0x8D
#define REV_ID_VT6105M_B1 0x94
o Always show chip revision number in device attach. This shall help
identifying revision specific issues.
o Check whether EEPROM reloading is complete by inspecting the state
of VR_EECSR_LOAD bit. This bit is self-cleared after the EEPROM
reloading. Previously vr(4) blindly spins for 200us which may/may
not enough to complete the EEPROM reload.
o Removed if_mtu setup. It's done in ether_ifattach().
o Use our own callout to drive watchdog timer.
o In vr_attach disable further interrupts after reset. For VT6102 or
newer hardwares, diable MII state change interrupt as well because
mii state handling is done by mii layer.
o Add more sane register initialization for VT6102 or newer chips.
- Have NIC report error instead of retrying forever.
- Let hardware detect MII coding error.
- Enable MODE10T mode.
- Enable memory-read-multiple for VT6107.
o PHY address for VT6105 or newer chips is located at fixed address 1.
For older chips the PHY address is stored in VR_PHYADDR register.
Armed with these information, there is no need to re-read
VR_PHYADDR register in miibus handler to get PHY address. This
saves one register access cycle for each MII access.
o Don't reprogram VR_PHYADDR register whenever access to a register
located at a PHY address is made. Rhine fmaily allows reprogramming
PHY address location via VR_PHYADDR register depending on
VR_MIISTAT_PHYOPT bit of VR_MIISTAT register. This used to lead
numerous phantom PHYs attached to miibus during phy probe phase and
driver used to limit allowable PHY address in mii register accessors
for certain chip revisions. This removes one more register access
cycle for each MII access.
o Correctly set VLAN header length.
o bus_dma(9) conversion.
- Limit DMA access to be in range of 32bit address space. Hardware
doesn't support DAC.
- Apply descriptor ring alignment requirements(16 bytes alignment)
- Apply Rx buffer address alignment requirements(4 bytes alignment)
- Apply Tx buffer address alignment requirements(4 bytes alignment)
for Rhine I chip. Rhine II or III has no Tx buffer address
alignment restrictions, though.
- Reduce number of allowable number of DMA segments to 8.
- Removed the atomic(9) used in descriptor ownership managements
as it's job of bus_dmamap_sync(9).
With these change vr(4) should work on all platforms.
o Rhine uses two separated 8bits command registers to control Tx/Rx
MAC. So don't access it as a single 16bit register.
o For non-strict alignment architectures vr(4) no longer require
time-consuming copy operation for received frames to align IP
header. This greatly improves Rx performance on i386/amd64
platforms. However the alignment is still necessary for
strict-alignment platforms(e.g. sparc64). The alignment is handled
in new fuction vr_fixup_rx().
o vr_rxeof() now rejects multiple-segmented(fragmented) frames as
vr(4) is not ready to handle this situation. Datasheet said nothing
about the reason when/why it happens.
o In vr_newbuf() don't set VR_RXSTAT_FIRSTFRAG/VR_RXSTAT_LASTFRAG
bits as it's set by hardware.
o Don't pass checksum offload information to upper layer for
fragmented frames. The hardware assisted checksum is valid only
when the frame is non-fragmented IP frames. Also mark the checksum
is valid for corrupted frames such that upper layers doesn't need
to recompute the checksum with software routine.
o Removed vr_rxeoc(). RxDMA doesn't seem to need to be idle before
sending VR_CMD_RX_GO command. Previously it used to stop RxDMA
first which in turn resulted in long delays in Rx error recovery.
o Rewrote Tx completion handler.
- Always check VR_TXSTAT_OWN bit in status word prior to
inspecting other status bits in the status word.
- Collision counter updates were corrected as VT3071 or newer
ones use different bits to notify collisions.
- Unlike other chip revisions, VT86C100A uses different bit to
indicate Tx underrun. For VT3071 or newer ones, check both
VR_TXSTAT_TBUFF and VR_TXSTAT_UDF bits to see whether Tx
underrun was happend. In case of Tx underrun requeue the failed
frame and restart stalled Tx SM. Also double Tx DMA threshold
size on each failure to mitigate future Tx underruns.
- Disarm watchdog timer only if we have no queued packets,
otherwise don't touch watchdog timer.
o Rewrote interrupt handler.
- status word in Tx/Rx descriptors indicates more detailed error
state required to recover from the specific error. There is no
need to rely on interrupt status word to recover from Tx/Rx
error except PCI bus error. Other event notifications like
statistics counter overflows or link state events will be
handled in main interrupt handler.
- Don't touch VR_IMR register if we are in suspend mode. Touching
the register may hang the hardware if we are in suspended state.
Previously it seems that touching VR_IMR register in interrupt
handler was to work-around panic occurred in system shutdown
stage on SMP systems. I think that work-around would hide
root-cause of the panic and I couldn't reproduce the panic
with multiple attempts on my box.
o While padding space to meet minimum frame size, zero the pad data
in order to avoid possibly leaking sensitive data.
o Rewrote vr_start_locked().
- Don't try to queue packets if number of available Tx descriptors
are short than that of required one.
o Don't reinitialize hardware whenever media configuration is
changed. Media/link state changes are reported from mii layer if
this happens and vr_link_task() will perform necessary changes.
o Don't reinitialize hardware if only PROMISC bit was changed. Just
toggle the PROMISC bit in hardware is sufficient to reflect the
request.
o Rearrganed the IFCAP_POLLING/IFCAP_HWCSUM handling in vr_ioctl().
o Generate Tx completion interrupts for every VR_TX_INTR_THRESH-th
frames. This reduces Tx completion interrupts under heavy network
loads.
o Since vr(4) doesn't request Tx interrupts for every queued frames,
reclaim any pending descriptors not handled in Tx completion
handler before actually firing up watchdog timeouts.
o Added vr_tx_stop()/vr_rx_stop() to wait for the end of active
TxDMA/RxDMA cycles(draining). These routines are used in vr_stop()
to ensure sane state of MAC before releasing allocated Tx/Rx
buffers. vr_link_task() also takes advantage of these functions to
get to idle state prior to restarting Tx/Rx.
o Added vr_tx_start()/vr_rx_start() to restart Rx/Tx. By separating
Rx operation from Tx operation vr(4) no longer need to full-reset
the hardware in case of Tx/Rx error recovery.
o Implemented WOL.
o Added VT6105M specific register definitions. VT6105M has the
following hardware capabilities.
- Tx/Rx IP/TCP/UDP checksum offload.
- VLAN hardware tag insertion/extraction. Due to lack of information
for getting extracted VLAN tag in Rx path, VLAN hardware support
was not implemented yet.
- CAM(Content Addressable Memory) based 32 entry perfect multicast/
VLAN filtering.
- 8 priority queues.
o Implemented CAM based 32 entry perfect multicast filtering for
VT6105M. If number of multicast entry is greater than 32, vr(4)
uses traditional hash based filtering.
o Reflect real Tx/Rx descriptor structure. Previously vr(4) used to
embed other driver (private) data into these structure. This type
of embedding make it hard to work on LP64 systems.
o Removed unused vr_mii_frame structure and MII bit-baning
definitions.
o Added new PCI configuration registers that controls mii operation
and mode selection.
o Reduced number of Tx/Rx descriptors to 128 from 256. From my
testing, increasing number of descriptors above than 64 didn't help
increasing performance at all. Experimentations show 128 Rx
descriptors seems to help a lot reducing Rx FIFO overruns under
high system loads. It seems the poor Tx performance of Rhine
hardwares comes from the limitation of hardware. You wouldn't
satuarte the link with vr(4) no matter how fast CPU/large number of
descriptors are used.
o Added vr_statistics structure to hold various counter values.
Add a new function is_default_interface() which determines if this
interface is one with the default route (or there isn't one). Use it to
decide if we should adjust the default route and /etc/resolv.conf.
Fix the delete of the default route. The if statement was totally bogus
and the delete only worked due to a typo. [1]
Reported by: Jordan Coleman <jordan at JordanColeman dot com> [1]
Merge procstaT_kstack.c:1.2, procstat_threads.c:1.2 from HEAD to
RELENG_7:
Display per-thread command line in TDNAME field for -k and -t; if
no per-thread name is available or the name is identical to the
process name, display "-" instead. Very slightly shrink the COMM
entry to make a bit more room, although this doesn't help with
stack traces much.
When a symbol name can't be resolved, return "??" as the name, rather
than "Unknown func", in order to avoid putting spaces in what ideally
is a string separated by white space.
Merge Makefile:1.1, procstat.1:1.1, procstat.c:1.1, procstat.h:1.1,
procstat_args.c:1.1, procstat_basic.c:1.1, procstat_bin.c:1.1,
procstat_cred.c:1.1, procstat_files.c:1.1, procstat_kstack.c:1.1,
procstat_threads.c:1.1, and procstat_vm.c:1.1 from HEAD to RELENG_7:
Add procstat(1), a process inspection utility. This provides both some
of the missing functionality from procfs(4) and new functionality for
monitoring and debugging specific processes. procstat(1) operates in
the following modes:
-b Display binary information for the process.
-c Display command line arguments for the process.
-f Display file descriptor information for the process.
-k Display the stacks of kernel threads in the process.
-s Display security credential information for the process.
-t Display thread information for the process.
-v Display virtual memory mappings for the process.
Further revision and modes are expected.
Testing, ideas, etc: cognet, sam, Skip Ford <skip at menantico dot com>
Wesley Shields <wxs at atarininja dot org>
Merge kern_proc.c:1.257, sysctl.h:1.154, and user.h:1.72 from HEAD to
RELENG_7:
Add another new sysctl in support of the forthcoming procstat(1) to
support its -k argument:
kern.proc.kstack - dump the kernel stack of a process, if debugging
is permitted.
This sysctl is present if either "options DDB" or "options STACK" is
compiled into the kernel. Having support for tracing the kernel
stacks of processes from user space makes it much easier to debug
(or understand) specific wmesg's while avoiding the need to enter
DDB in order to determine the path by which a process came to be
blocked on a particular wait channel or lock.
MFC revision 1.22
date: 2008/03/07 00:01:19; author: delphij; state: Exp; lines: +61 -68
Merge revisions 1.10 and 1.11 from DragonFly:
- Use real getopt() handling instead of the hand-rolled and
IOCCC-worthy "Micro getopt()" macros, plus clean up to the
option handling code:
* Sort the options in the switch statement;
* Plug piddling memory leaks when processing repeated options
by freeing strings before allocating them for a second time;
* Die with a fatal error if the requested report file cannot
be opened for appending;
* Don't call init() before usage() (to prevent the usage
message being mangled by changes to the terminal settings;)
- Clean up the usage message, both in usage() and in the main
program comment, both stylistically (sort and combine options)
and for accuracy (following the manual page, make note of the -s
and -S flags, and use the term 'send' instead of 'say' to reduce
confusion (SAY is the name of a command for output to the user,
not the connection.))
- Introduce per-architecture stack_machdep.c to hold stack_save(9).
- Introduce per-architecture machine/stack.h to capture any common
definitions required between db_trace.c and stack_machdep.c.
- Add new kernel option "options STACK"; we will build in stack(9) if it
is defined, or also if "options DDB" is defined to provide
compatibility with existing users of stack(9).
Add new stack_save_td(9) function, which allows the capture of a
stacktrace of another thread rather than the current thread, which the
existing stack_save(9) was limited to. It requires that the thread be
neither swapped out nor running, which is the responsibility of the
consumer to enforce.
Merge kern_descrip.c:1.314, kern_proc.c:1.256, sysctl.h:1.153,
user.h:1.71 from HEAD to RELENG_7:
Add two new sysctls in support of the forthcoming procstat(1) to support
its -f and -v arguments:
kern.proc.filedesc - dump file descriptor information for a process, if
debugging is permitted, including socket addresses, open flags, file
offsets, file paths, etc.
kern.proc.vmmap - dump virtual memory mapping information for a process,
if debugging is permitted, including layout and information on
underlying objects, such as the type of object and path.
These provide a superset of the information historically available
through the now-deprecated procfs(4), and are intended to be exported
in an ABI-robust form.
Merge Makefile:1.320, stack.9:1.4, kern_lock.c:1.114, subr_stack.c:1.4,
stack.h:1.3, redzone.c:1.2 from HEAD to RELENG_7:
Modify stack(9) stack_print() and stack_sbuf_print() routines to use new
linker interfaces for looking up function names and offsets from
instruction pointers. Create two variants of each call: one that is
"DDB-safe" and avoids locking in the linker, and one that is safe for
use in live kernels, by virtue of observing locking, and in particular
safe when kernel modules are being loaded and unloaded simultaneous to
their use. This will allow them to be used outside of debugging
contexts.
Modify two of three current stack(9) consumers to use the DDB-safe
interfaces, as they run in low-level debugging contexts, such as inside
lockmgr(9) and the kernel memory allocator.
Merge kern_linker.c:1.153, linker.h:1.50 from HEAD to RELENG_7:
The kernel linker includes a number of utility functions to look up symbol
information in support of DDB(4); these functions bypass normal linker
locking as they may run in contexts where locking is unsafe (such as the
kernel debugger).
Add a new interface linker_ddb_search_symbol_name(), which looks up a
symbol name and offset given an address, and also
linker_search_symbol_name() which does the same but *does* follow the
locking conventions of the linker.
Unlike existing functions, these functions place the name in a
caller-provided buffer, which is stable even after linker locks have been
released. These functions will be used in upcoming revisions to stack(9)
to support kernel stack trace generation in contexts as part of a live,
rather than suspended, kernel.
Sync to HEAD.
- Simplify the error handling
- Nuke licence clause 3 & 4
- Drop frames to any of the reserved multicast addresses
- Remove duplicated code for testing local packets.
MFC: Do nextboot -D twice during boot. The first time in rc.d/root which ensures
that we can remove the file as early as possible, but shut up nextboot at this
moment if the operation is failed, because /boot is not necessarily a part of /;
the newly added second run is placed in rc.d/mountlate after all filesystems were
mounted.
peter [Wed, 9 Apr 2008 19:05:59 +0000 (19:05 +0000)]
MFC: collect per-cpu stats for %user, %system, %idle etc. Counters are
stored in pcpu area and merged at collection time rather than a common
set being manipulated with atomic ops.
MFC if_lagg.c r1.26-27, ieee8023ad_lacp.c r1.13-15, ieee8023ad_lacp.h r1.9-11
- Pass any unmatched slowprotocols frames up the stack
- Switch the LACP state machine over to its own mutex
MFC r1.97:
Sync with rev 1.63 of NetBSD's ums.c:
If a mouse has both a wheel and a Z direction we report both.
XXX Due to tradition the wheel is reported as the Z direction (and the Z
direction as W).
Now Apple's Mighty Mouse is fully supported, except the X11 mouse driver
doesn't know what to do with the new coordinate.
MFC 1.206: If the disk reports that it support the Compact Flash
Association command set, announce BIO_DELETE capability and issue
ATA_CFA_ERASE when we get one.