marius [Fri, 26 Nov 2010 21:27:13 +0000 (21:27 +0000)]
MFC: r215780
Remove the description of the link0 link option, since r215297 (merged to
stable/8 in r215881) the master media option generally should be used
instead.
This is MFC'ed in order to discourage the use of the link0 link option,
although it's still available in stable/8).
jkim [Fri, 26 Nov 2010 21:16:21 +0000 (21:16 +0000)]
MFC: r196769, r196771, r211424, r215703, r215754
- Disable caches and flush caches/TLBs when we update PAT as we do for MTRR.
Flushing TLBs is required to ensure cache coherency according to the AMD64
architecture manual. Flushing caches is only required when changing from a
cacheable memory type (WB, WP or WT) to an uncacheable type (WC, UC or UC-).
Since this function is only used once per processor during startup, there is
no need to take any shortcuts.
- Leave PAT indices 0-3 at the default of WB, WT, UC-, and UC. Program 5 as
WP (from default WT) and 6 as WC (from default UC-). Leave 4 and 7 at the
default of WB and UC. This is to avoid transition from a cacheable memory
type to an uncacheable type to minimize possible cache incoherency. Since
we perform flushing caches and TLBs now, this change may not be necessary
any more but we do not want to take any chances.
- Improve pmap_cache_bits() with an array to map PAT memory type to index.
This array is initialized early from pmap_init_pat(), so that we do not need
to handle special cases in the function any more. Now this function is
identical on both amd64 and i386.
marius [Fri, 26 Nov 2010 21:01:19 +0000 (21:01 +0000)]
MFC: r215722
- Fix and enable support for flow control.
- Partially revert r172334; as it turns out the DELAYs in gem_reset_{r,t}x()
are actually necessary although bus space barriers and gem_bitwait() are
used, otherwise the controller may trigger an IOMMU errors on at least
sparc64. This is in line with what Linux and OpenSolaris do.
marius [Fri, 26 Nov 2010 20:59:43 +0000 (20:59 +0000)]
MFC: r215720
- Also probe BCM5214 and BCM5222.
- Add some DSP init code for BCM5221. The values derived from Apple's GMAC
driver and the same init code also exists in Linux's sungem_phy driver.
- Only read media status bits when they are valid.
marius [Fri, 26 Nov 2010 20:55:58 +0000 (20:55 +0000)]
MFC: r215302
Move the limiting of the PHY to 10/100 modes of operation due to limitations
of certain MAC models from brgphy(4) to bge(4) where it belongs. While at it,
update the list of models having that restriction to what OpenBSD uses, which
in turn seems to have obtained that information from the Linux tg3 driver.
marius [Fri, 26 Nov 2010 20:45:49 +0000 (20:45 +0000)]
MFC: r215298, r215459, r215714, r215716
- Change these drivers to take advantage and use the generic IEEE 802.3
annex 31B full duplex flow control as well as the IFM_1000_T master
support committed in r215297 (merged to stable/8 in r215881). For
atphy(4) and jmphy(4) this includes changing these PHY drivers to no
longer unconditionally advertise support for flow control but only if
the selected media has IFM_FLOW set (or MIIF_FORCEPAUSE is set).
- Rename {atphy,jmphy}_auto() to {atphy,jmphy}_setmedia() as these handle
other media types as well.
marius [Fri, 26 Nov 2010 20:37:19 +0000 (20:37 +0000)]
MFC: r214608, r215297(partial), r215713
o Flesh out the generic IEEE 802.3 annex 31B full duplex flow control
support in mii(4):
- Merge generic flow control advertisement (which can be enabled by
passing by MIIF_DOPAUSE to mii_attach(9)) and parsing support from
NetBSD into mii_physubr.c and ukphy_subr.c. Unlike as in NetBSD,
IFM_FLOW isn't implemented as a global option via the "don't care
mask" but instead as a media specific option this. This has the
following advantages:
o allows flow control advertisement with autonegotiation to be
turned on and off via ifconfig(8) with the default typically
being off (though MIIF_FORCEPAUSE has been added causing flow
control to be always advertised, allowing to easily MFC this
changes for drivers that previously used home-grown support for
flow control that behaved that way without breaking POLA)
o allows to deal with PHY drivers where flow control advertisement
with manual selection doesn't work or at least isn't implemented,
like it's the case with brgphy(4), e1000phy(4) and ip1000phy(4),
by setting MIIF_NOMANPAUSE
o the available combinations of media options are readily available
from the `ifconfig -m` output
- Add IFM_FLOW to IFM_SHARED_OPTION_DESCRIPTIONS and IFM_ETH_RXPAUSE
and IFM_ETH_TXPAUSE to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS so
these are understood by ifconfig(8).
o Make the master/slave support in mii(4) actually usable:
- Change IFM_ETH_MASTER from being implemented as a global option via
the "don't care mask" to a media specific one as it actually is only
applicable to IFM_1000_T to date.
- Let mii_phy_setmedia() set GTCR_MAN_MS in IFM_1000_T slave mode to
actually configure manually selected slave mode (like we also do in
the PHY specific implementations).
- Add IFM_ETH_MASTER to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS so it
is understood by ifconfig(8).
o Switch bge(4), bce(4), msk(4), nfe(4) and stge(4) along with brgphy(4),
e1000phy(4) and ip1000phy(4) to use the generic flow control support
instead of home-grown solutions via IFM_FLAGs. This includes changing
these PHY drivers and smcphy(4) to no longer unconditionally advertise
support for flow control but only if the selected media has IFM_FLOW
set (or MIIF_FORCEPAUSE is set) and implemented for these media variants,
i.e. typically only for copper.
o Switch brgphy(4), ciphy(4), e1000phy(4) and ip1000phy(4) to report and
set IFM_1000_T master mode via IFM_ETH_MASTER instead of via IFF_LINK0
and some IFM_FLAGn.
o Switch brgphy(4) to add at least the the supported copper media based on
the contents of the BMSR via mii_phy_add_media() instead of hardcoding
them. The latter approach seems to have developed historically, besides
causing unnecessary code duplication it was also undesirable because
brgphy_mii_phy_auto() already based the capability advertisement on the
contents of the BMSR though.
o Let brgphy(4) set IFM_1000_T master mode on all supported PHY and not
just BCM5701. Apparently this was a misinterpretation of a workaround
in the Linux tg3 driver; BCM5701 seem to require RGPHY_1000CTL_MSE and
BRGPHY_1000CTL_MSC to be set when configuring autonegotiation but
this doesn't mean we can't set these as well on other PHYs for manual
media selection.
o Let ukphy_status() report IFM_1000_T master mode via IFM_ETH_MASTER so
IFM_1000_T master mode support now is generally available with all PHY
drivers.
o Don't let e1000phy(4) set master/slave bits for IFM_1000_SX as it's
not applicable there.
Unlike as in head, bge(4), bce(4), msk(4), nfe(4) and stge(4) are changed
to set MIIF_FORCEPAUSE in stable/8 so they continue to always advertise
support of flow control and brgphy(4), ciphy(4), e1000phy(4) as well as
ip1000phy(4) are changed to still also accept IFF_LINK0 in addition to
the master media option for setting master mode, both in order to not
violate POLA.
marius [Fri, 26 Nov 2010 18:44:01 +0000 (18:44 +0000)]
MFC: r215259, r215272
- When printing media with more than one media option set aggregate these
in a comma delimited list instead of repeating "mediaopt" for each one.
This matches how the options of the active media are printed with
print_media_word() and brings us in line what NetBSD does.
- When setting a media with no sub-type specified also reset the type
specific options along with the global ones so these options don't
stick when f.e. switching to IFM_AUTO.
zec [Fri, 26 Nov 2010 15:46:49 +0000 (15:46 +0000)]
MFC r215726:
Allow for vlan(4) ifnets to have overlapping unit numbers if they are
created in separated vnets. As a side-effect of having a separated
if_cloner instance for each vnet, all vlan ifnets created in a vnet
will be automatically destroyed when vnet teardown is initiated.
Disallow SIOCSETVLAN and SIOCGETVLAN ioctls on vlan ifnets which are
associated with physical ifnets residing in parent vnets.
This is an interim vlan-specific solution which will be superseded by a
more generic if_cloner V_irtualization change from p4. For nooptions
VIMAGE builds, this should be a no-op change.
zec [Fri, 26 Nov 2010 15:44:16 +0000 (15:44 +0000)]
MFC r215673:
Allow for MTU sizes of up to ETHER_MAX_LEN_JUMBO (i.e. 9018) bytes to be
configured on ng_eiface ifnets. The default MTU remains unchanged at
1500 bytes.
Mark ng_eiface ifnets as IFCAP_VLAN_MTU capable, so that the associated
vlan(4) ifnets may use full-sized Ethernet MTUs (1500 bytes).
bschmidt [Fri, 26 Nov 2010 11:55:51 +0000 (11:55 +0000)]
MFC r215708:
Resurrect amd64 support.
- Many drivers on amd64 are picking system uptime, interrupt time and ticks
via global data structure instead of calling functions for performance
reasons. For now just patch such address so driver will not trigger page
fault when trying to access such data. In future, additional callout may
be added to update data in periodic intervals.
- On amd64 we need to allocate "shadow space" on stack before calling any
function.
bschmidt [Fri, 26 Nov 2010 11:48:47 +0000 (11:48 +0000)]
MFC r215707:
Prefer pmap_extract() over pmap_kextract() as done in MmIsAddressValid().
According to the comment for MmIsAddressValid() there are issues on PAE
kernels using pmap_kextract().
kib [Fri, 26 Nov 2010 11:37:35 +0000 (11:37 +0000)]
Partial MFC r215548:
Remove printf()s in the vop_inactive and vop_reclaim() methods related
to prtactive variable. The prtactive variable definition and declaration
are kept in the stable branch to preserve the KPI and KBI.
bschmidt [Thu, 25 Nov 2010 18:50:59 +0000 (18:50 +0000)]
MFC r215135,215419,215420:
- According to specs for MmAllocateContiguousMemorySpecifyCache() physically
contiguous memory with requested restrictions must be allocated.
- Use kmem_alloc_contig() to honour the cache_type variable.
- Fix a panic on i386 for drivers using MmAllocateContiguousMemory()
and MmAllocateContiguousMemorySpecifyCache().
rstone [Thu, 25 Nov 2010 18:32:02 +0000 (18:32 +0000)]
MFC 215474
When netstat was run with -i/-I and -w1 to produce running counters, the idrop
field printed an absolute value rather than the delta from the last value
delphij [Thu, 25 Nov 2010 18:21:08 +0000 (18:21 +0000)]
MFC r215234:
Update to vendor release 1.20.00.19.
Bug fixes:
* Fixed "inquiry data fails comparion at DV1 step"
* Fixed bad range input in bus_alloc_resource for ADAPTER_TYPE_B
* Fixed arcmsr driver prevent arcsas support for Areca SAS HBA ARC13x0
Many thanks to Areca for continuing to support FreeBSD.
MFC r215280:
Workaround build for PAE case for now - revert the PHYS
case to previous panic behavior.
rrs [Thu, 25 Nov 2010 12:10:59 +0000 (12:10 +0000)]
MFC of 215110:
Fix so that a multicast packet can be sent
even if there is no route out to that mcast address. The code in
in_pcb inadvertantly would error (no route) even though
the user may have specified the address with the
proper socket option (to specify the egress interface).
Thanks bz for reminding me I forgot to commit this ;-)
bschmidt [Thu, 25 Nov 2010 08:55:57 +0000 (08:55 +0000)]
MFC r215699:
The meshid element is memcpy()'ed into se_meshid if included in either
beacon or probe-response frames. Fix the condition by checking for the
the array's content instead of the always existing array itself.
brucec [Wed, 24 Nov 2010 21:54:45 +0000 (21:54 +0000)]
MFC r215637:
dispatch_add_command:
Modify the logic so there's only one exit point instead of two.
Only insert valid (non-NULL) values into the queue.
dispatch_free_command:
Ensure that item is not NULL before removing it from the queue and
dereferencing the pointer.
NULL out free'd pointers to catch any use-after-free bugs.
glebius [Wed, 24 Nov 2010 05:37:12 +0000 (05:37 +0000)]
MFhead r214508:
Revert a small part of the r198301, that is entirely unrelated to the
r198301 itself. It also broke the logic of not sending more than one
ARP request per second, that consequently lead to a potential problem
of flooding network with broadcast packets.
Correct bug introduced while purging the -ERRNO Linuxism from the
grant table API. Valid grant refs are in the range of positive 32bit
integers. ENOSPACE, being 29, is also a positive integer. Return
GNTTAB_LIST_END (-1) instead when gnttab_claim_grant_reference() fails.
Correct alignment and boundary constraints in blkfront's bus dma tag. The
blkif interface in Xen requires all I/O to be 512 byte aligned with each
segment bounded by a 4k page.
Note: This submission only documents the proper contraints for blkif I/O.
The alignment code in busdma does not yet handle alignment constraints
correctly in all cases.
Improve the Xen para-virtualized device infrastructure of FreeBSD:
o Add support for backend devices (e.g. blkback)
o Implement extensions to the Xen para-virtualized block API to allow
for larger and more outstanding I/Os.
o Import a completely rewritten block back driver with support for
fronting I/O to both raw devices and files.
o General cleanup and documentation of the XenBus and XenStore support
code.
o Robustness and performance updates for the block front driver.
o Fixes to the netfront driver.
Sponsored by: Spectra Logic Corporation
sys/xen/xenbus/init.txt:
Deleted: This file explains the Linux method for XenBus device
enumeration and thus does not apply to FreeBSD's NewBus approach.
sys/xen/xenbus/xenbus_probe_backend.c:
Deleted: Linux version of backend XenBus service routines. It
was never ported to FreeBSD. See xenbusb.c, xenbusb_if.m,
xenbusb_front.c xenbusb_back.c for details of FreeBSD's XenBus
support.
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenbus/xenbus_xs.c:
sys/xen/xenbus/xenbus_comms.c:
sys/xen/xenbus/xenbus_comms.h:
sys/xen/xenstore/xenstorevar.h:
sys/xen/xenstore/xenstore.c:
Split XenStore into its own tree. XenBus is a software layer
built on top of XenStore. The old arrangement and the naming of
some structures and functions blurred these lines making it
difficult to discern what services are provided by which layer
and at what times these services are available (e.g. during
system startup and shutdown).
sys/xen/xenbus/xenbus_client.c:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_probe.c:
sys/xen/xenbus/xenbusb.c:
sys/xen/xenbus/xenbusb.h:
Split up XenBus code into methods available for use by client
drivers (xenbus.c) and code used by the XenBus "bus code" to
enumerate, attach, detach, and service bus drivers.
sys/xen/reboot.c:
sys/dev/xen/control/control.c:
Add a XenBus front driver for handling shutdown, reboot,
suspend, and resume events published in the XenStore.
Move all PV suspend/reboot support from reboot.c into
this driver.
sys/xen/blkif.h:
New file from Xen vendor with macros and structures used by
a block back driver to service requests from a VM running a
different ABI (e.g. amd64 back with i386 front).
sys/conf/files:
Adjust kernel build spec for new XenBus/XenStore layout and added
Xen functionality.
sys/dev/xen/balloon/balloon.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/blkfront/blkfront.c:
sys/xen/xenbus/...
sys/xen/xenstore/...
o Rename XenStore APIs and structures from xenbus_* to xs_*.
o Adjust to use of M_XENBUS and M_XENSTORE malloc types
for allocation of objects returned by these APIs.
o Adjust for changes in the bus interface for Xen
drivers.
sys/xen/xenbus/...
sys/xen/xenstore/...
Add Doxygen comments for these interfaces and the code that
implements them.
sys/dev/xen/blkback/blkback.c:
o Rewrite the Block Back driver to attach properly via newbus,
operate correctly in both PV and HVM mode regardless of domain
(e.g. can be in a DOM other than 0), and to deal with the latest
metadata available in XenStore for block devices.
o Allow users to specify a file as a backend to blkback, in addition
to character devices. Use the namei lookup of the backend path
to automatically configure, based on file type, the appropriate
backend method.
The current implementation is limited to a single outstanding I/O
at a time to file backed storage.
sys/dev/xen/blkback/blkback.c:
sys/xen/interface/io/blkif.h:
sys/xen/blkif.h:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
Extend the Xen blkif API: Negotiable request size and number of
requests.
This change extends the information recorded in the XenStore
allowing block front/back devices to negotiate for optimal I/O
parameters. This has been achieved without sacrificing backward
compatibility with drivers that are unaware of these protocol
enhancements. The extensions center around the connection protocol
which now includes these additions:
o The back-end device publishes its maximum supported values for,
request I/O size, the number of page segments that can be
associated with a request, the maximum number of requests that
can be concurrently active, and the maximum number of pages that
can be in the shared request ring. These values are published
before the back-end enters the XenbusStateInitWait state.
o The front-end waits for the back-end to enter either the InitWait
or Initialize state. At this point, the front end limits it's
own capabilities to the lesser of the values it finds published
by the backend, it's own maximums, or, should any back-end data
be missing in the store, the values supported by the original
protocol. It then initializes it's internal data structures
including allocation of the shared ring, publishes its maximum
capabilities to the XenStore and transitions to the Initialized
state.
o The back-end waits for the front-end to enter the Initalized
state. At this point, the back end limits it's own capabilities
to the lesser of the values it finds published by the frontend,
it's own maximums, or, should any front-end data be missing in
the store, the values supported by the original protocol. It
then initializes it's internal data structures, attaches to the
shared ring and transitions to the Connected state.
o The front-end waits for the back-end to enter the Connnected
state, transitions itself to the connected state, and can
commence I/O.
Although an updated front-end driver must be aware of the back-end's
InitWait state, the back-end has been coded such that it can
tolerate a front-end that skips this step and transitions directly
to the Initialized state without waiting for the back-end.
sys/xen/interface/io/blkif.h:
o Increase BLKIF_MAX_SEGMENTS_PER_REQUEST to 255. This is
the maximum number possible without changing the blkif
request header structure (nr_segs is a uint8_t).
o Add two new constants:
BLKIF_MAX_SEGMENTS_PER_HEADER_BLOCK, and
BLKIF_MAX_SEGMENTS_PER_SEGMENT_BLOCK. These respectively
indicate the number of segments that can fit in the first
ring-buffer entry of a request, and for each subsequent
(sg element only) ring-buffer entry associated with the
"header" ring-buffer entry of the request.
o Add the blkif_request_segment_t typedef for segment
elements.
o Add the BLKRING_GET_SG_REQUEST() macro which wraps the
RING_GET_REQUEST() macro and returns a properly cast
pointer to an array of blkif_request_segment_ts.
o Add the BLKIF_SEGS_TO_BLOCKS() macro which calculates the
number of ring entries that will be consumed by a blkif
request with the given number of segments.
sys/xen/blkif.h:
o Update for changes in interface/io/blkif.h macros.
o Update the BLKIF_MAX_RING_REQUESTS() macro to take the
ring size as an argument to allow this calculation on
multi-page rings.
o Add a companion macro to BLKIF_MAX_RING_REQUESTS(),
BLKIF_RING_PAGES(). This macro determines the number of
ring pages required in order to support a ring with the
supplied number of request blocks.
sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
o Negotiate with the other-end with the following limits:
Reqeust Size: MAXPHYS
Max Segments: (MAXPHYS/PAGE_SIZE) + 1
Max Requests: 256
Max Ring Pages: Sufficient to support Max Requests with
Max Segments.
o Dynamically allocate request pools and segemnts-per-request.
o Update ring allocation/attachment code to support a
multi-page shared ring.
o Update routines that access the shared ring to handle
multi-block requests.
sys/dev/xen/blkfront/blkfront.c:
o Track blkfront allocations in a blkfront driver specific
malloc pool.
o Strip out XenStore transaction retry logic in the
connection code. Transactions only need to be used when
the update to multiple XenStore nodes must be atomic.
That is not the case here.
o Fully disable blkif_resume() until it can be fixed
properly (it didn't work before this change).
o Destroy bus-dma objects during device instance tear-down.
o Properly handle backend devices with powef-of-2 sector
sizes larger than 512b.
sys/dev/xen/blkback/blkback.c:
Advertise support for and implement the BLKIF_OP_WRITE_BARRIER
and BLKIF_OP_FLUSH_DISKCACHE blkif opcodes using BIO_FLUSH and
the BIO_ORDERED attribute of bios.
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
Fix various bugs in blkfront.
o gnttab_alloc_grant_references() returns 0 for success and
non-zero for failure. The check for < 0 is a leftover
Linuxism.
o When we negotiate with blkback and have to reduce some of our
capabilities, print out the original and reduced capability before
changing the local capability. So the user now gets the correct
information.
o Fix blkif_restart_queue_callback() formatting. Make sure we hold
the mutex in that function before calling xb_startio().
o Fix a couple of KASSERT()s.
o Fix a check in the xb_remove_* macro to be a little more specific.
sys/xen/gnttab.h:
sys/xen/gnttab.c:
Define GNTTAB_LIST_END publicly as GRANT_REF_INVALID.
sys/dev/xen/netfront/netfront.c:
Use GRANT_REF_INVALID instead of driver private definitions of the
same constant.
sys/xen/gnttab.h:
sys/xen/gnttab.c:
Add the gnttab_end_foreign_access_references() API.
This API allows a client to batch the release of an
array of grant references, instead of coding a private
for loop. The implementation takes advantage of this
batching to reduce lock overhead to one acquisition and
release per-batch instead of per-freed grant reference.
While here, reduce the duration the gnttab_list_lock
is held during gnttab_free_grant_references() operations.
The search to find the tail of the incoming free list
does not rely on global state and so can be performed
without holding the lock.
sys/dev/xen/xenpci/evtchn.c:
sys/dev/xen/evtchn/evtchn.c:
sys/xen/xen_intr.h:
o Implement the bind_interdomain_evtchn_to_irqhandler
API for HVM mode. This allows an HVM domain to serve
back end devices to other domains. This API is already
implemented for PV mode.
o Synchronize the API between HVM and PV.
sys/dev/xen/xenpci/xenpci.c:
o Scan the full region of CPUID space in which the Xen
VMM interface may be implemented. On systems using
SuSE as a Dom0 where the Viridian API is also exported,
the VMM interface is above the region we used to
search.
o Pass through bus_alloc_resource() calls so that XenBus drivers
attaching on an HVM system can allocate unused physical address
space from the nexus. The block back driver makes use of this
facility.
sys/i386/xen/xen_machdep.c:
Use the correct type for accessing the statically mapped xenstore
metadata.
sys/xen/interface/hvm/params.h:
sys/xen/xenstore/xenstore.c:
Move hvm_get_parameter() to the correct global header file instead
of as a private method to the XenStore.
sys/xen/interface/io/protocols.h:
Sync with vendor.
sys/xeninterface/io/ring.h:
Add macro for calculating the number of ring pages needed for an N
deep ring.
To avoid duplication within the macros, create and use the new
__RING_HEADER_SIZE() macro. This macro calculates the size of the
ring book keeping struct (producer/consumer indexes, etc.) that
resides at the head of the ring.
Add the __RING_PAGES() macro which calculates the number of shared
ring pages required to support a ring with the given number of
requests.
These APIs are used to support the multi-page ring version of the
Xen block API.
sys/xeninterface/io/xenbus.h:
Add Comments.
sys/xen/xenbus/...
o Refactor the FreeBSD XenBus support code to allow for
both front and backend device attachments.
o Make use of new config_intr_hook capabilities to allow
front and back devices to be probed/attached in parallel.
o Fix bugs in probe/attach state machine that could
cause the system to hang when confronted with a failure
either in the local domain or in a remote domain to
which one of our driver instances is attaching.
o Publish all required state to the XenStore on device
detach and failure. The majority of the missing
functionality was for serving as a back end since the
typical "hot-plug" scripts in Dom0 don't handle the
case of cleaning up for a "service domain" that is
not itself.
o Add dynamic sysctl nodes exposing the generic ivars of
XenBus devices.
o Add doxygen style comments to the majority of the code.
o Cleanup types, formatting, etc.
sys/xen/xenbus/xenbusb.c:
Common code used by both front and back XenBus busses.
sys/xen/xenbus/xenbusb_if.m:
Method definitions for a XenBus bus.
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusb_back.c:
XenBus bus specialization for front and back devices.
sys/dev/xen/blkback/blkback.c:
In xbb_detach() only perform cleanup of our taskqueue and
device statistics structures if they have been initialized.
This avoids a panic when xbb_detach() is called on a partially
initialized device instance, due to an early failure in
attach.
Purge mergeinfo on sys/dev/xen/xenpci. The only unique mergeinfo compared
to head was not useful (it came in with the merge from /user/dfr/xenhvm/7
and that mergeinfo is still present at sys/) and not worth keeping an extra
set of mergeinfo around in the kernel.
mav [Tue, 23 Nov 2010 21:42:26 +0000 (21:42 +0000)]
MFC r215468:
Make ATA_CAM wrapper to report SATA power management capabilities to CAM to
make it configure device to initiate transitions if controller configured
to accept them. This makes hint.ata.X.pm_level=1 mode working.
uqs [Tue, 23 Nov 2010 21:36:53 +0000 (21:36 +0000)]
MFC r214237,214489:
Remove mention of non-existant -o flag for debugging options.
Fix CPU load reporting independent of scheduler used.
- Sample CPU usage data from kern.cp_times, this makes for a far more
accurate and scheduler independent algorithm.
- Rip out the process list scraping that is no longer required.
- Don't update CPU usage sampling on every request, but every 15s
instead. This makes it impossible for an attacker to hide the CPU load
by triggering 4 samplings in short succession when the system is idle.
- After reaching the steady-state, the system will always report the
average CPU load of the last 60 sampled seconds.
- Untangling of call graph.
mav [Tue, 23 Nov 2010 21:35:13 +0000 (21:35 +0000)]
MFC r214288:
Make da driver to handle some probably broken Android devices, returning
zero media and sector size instead of "Medium not present" error,
until some confirmation button is tapped on device.
yongari [Tue, 23 Nov 2010 19:11:27 +0000 (19:11 +0000)]
MFC r215327,215350:
r215327:
P5N32-SLI PREMIUM from ASUSTeK is known to have MSI/MSI-X issue
such that nfe(4) does not work with MSI-X. When MSI-X support was
introduced, I remember MCP55 controller worked without problems so
the issue could be either PCI bridge or BIOS issue. But I also
noticed snd_hda(4) disabled MSI on all MCP55 chipset so I'm still
not sure this is generic issue of MCP55 chipset. If this was PCI
bridge issue we would have added it to a system wide black-list
table but it's not clear to me at this moment whether it was caused
by either broken BIOS or silicon bug of MCP55 chipset.
To workaround the issue, maintain a MSI/MSI-X black-list table in
driver and lookup base board manufacturer and product name from the
table before attempting to use MSI-X. If driver find an matching
entry, nfe(4) will not use MSI/MSI-X and fall back on traditional
INTx mode. This approach should be the last resort since it relies
on smbios and if another instance of MSI/MSI-X breakage is reported
with different maker/product, we may have to get the PCI bridge
black-listed instead of adding an new entry.
PR: kern/152150
r215350:
Plug memory leakage introduced in r215327.
nwhitehorn [Tue, 23 Nov 2010 14:13:12 +0000 (14:13 +0000)]
MFC r208839,214999:
Add two new flags (IIC_M_NOSTOP and IIC_M_NOSTART) to struct iic_msg to
allow consumers of iicbus_transfer() to send messages with repeated starts.
pluknet [Tue, 23 Nov 2010 11:39:11 +0000 (11:39 +0000)]
MFC r215176:
Stop documenting vgonel() after its converting to the static function:
svn r147332 (by jeff): "Don't make vgonel() globally visible".
While here, specify the vnode locking scheme for vgone().
mckusick [Tue, 23 Nov 2010 01:24:27 +0000 (01:24 +0000)]
MFC of 213119
Reported problem:
Large (60GB) filesystems created using "newfs -U -O 1 -b 65536 -f 8192"
show incorrect results from "df" for free and used space when mounted
immediately after creation. fsck on the new filesystem (before ever
mounting it once) gives a "SUMMARY INFORMATION BAD" error in phase 5.
This error hasn't occurred in any runs of fsck immediately after
"newfs -U -b 65536 -f 8192" (leaving out the "-O 1" option).
Solution:
The default UFS1 superblock is located at offset 8K in the filesystem
partition; the default UFS2 superblock is located at offset 64K in
the filesystem partition. For UFS1 filesystems with a blocksize of
64K, the first alternate superblock resides at 64K which is the the
location used for the default UFS2 superblock. By default, the
system first checks for a valid superblock at the default location
for a UFS2 filoesystem. For a UFS1 filesystem with a blocksize of
64K, there is a valid UFS1 superblock at this location. Thus, even
though it is expected to be a backup superblock, the system will
use it as its default superblock. So, we have to ensure that all the
statistcs on usage are correct in this first alternate superblock
as it is the superblock that will actually be used.
While tracking down this problem, another limitation of UFS1 became
evident. For UFS1, the number of inodes per cylinder group is stored
in an int16_t. Thus the maximum number of inodes per cylinder group
is limited to 2^15 - 1. This limit can easily be exceeded for block
sizes of 32K and above. Thus when building UFS1 filesystems, newfs
must limit the number of inodes per cylinder group to 2^15 - 1.
Reported by: Guy Helmer<ghelmer@palisadesys.com>
Followup by: Bruce Cran <brucec@freebsd.org>
PR: 107692
des [Mon, 22 Nov 2010 19:02:30 +0000 (19:02 +0000)]
MFC r209052, r210702, r211095, r211182, r212149: Fix a memory leak and
some potential deadlocks, increase the target limit from 4 to 64, and
numerous other scalability and stability improvements, including.
Submitted by: Daniel Braniss <danny@cs.huji.ac.il>
Sponsored by: Dansk Scanning A/S, Data Robotics Inc.
nwhitehorn [Mon, 22 Nov 2010 17:39:18 +0000 (17:39 +0000)]
MFC r212054:
Restructure how reset and poweroff are handled on PowerPC systems, since
the existing code was very platform specific, and broken for SMP systems
trying to reboot from KDB.
- Add a new PLATFORM_RESET() method to the platform KOBJ interface, and
migrate existing reset functions into platform modules.
- Modify the OF_reboot() routine to submit the request by hand to avoid
the IPIs involved in the regular openfirmware() routine. This fixes
reboot from KDB on SMP machines.
- Move non-KDB reset and poweroff functions on the Powermac platform
into the relevant power control drivers (cuda, pmu, smu), instead of
using them through the Open Firmware backdoor.
- Rename platform_chrp to platform_powermac since it has become
increasingly Powermac specific. When we gain support for IBM systems,
we will grow a new platform_chrp.
nwhitehorn [Mon, 22 Nov 2010 17:13:04 +0000 (17:13 +0000)]
MFC r205506:
Get nexus(4) out of the RTC business. The interface used by nexus(4)
in Open Firmware was Apple-specific, and we have complete coverage of Apple
system controllers, so move RTC responsibilities into the system controller
drivers. This avoids interesting problems from manipulating these devices
through Open Firmware behind the backs of their drivers.
nwhitehorn [Mon, 22 Nov 2010 17:09:42 +0000 (17:09 +0000)]
MFC r215100:
Disabling CPU NAP modes during SMU commands is a hack needed only on U3
systems. Don't use it on non-U3 systems to allow cpu_idle() to work
correctly.
nwhitehorn [Mon, 22 Nov 2010 16:58:07 +0000 (16:58 +0000)]
MFC r213986:
Fix an XXX comment by answering 'no'. OS X does not set the day-of-week
counter on SMU-based systems, which causes FreeBSD to reject the RTC time
when used in a dual-boot environment. Since we don't use the day-of-week
counter anyway, solve this by just not checking that it matches.
kensmith [Mon, 22 Nov 2010 16:09:57 +0000 (16:09 +0000)]
We're a bit under a week from Code Freeze for the upcoming 8.2-RELEASE
cycle. Warn people tracking stable/8 that the branch may be more
active than usual.
brian [Mon, 22 Nov 2010 09:32:54 +0000 (09:32 +0000)]
MFC r212247 & r212724 from head:
Handle geli-encrypted root disk devices.
Add support for identifying a journaled root filesystem.
Fix support for identifying the given /dev/vinum/root example.
netchild [Mon, 22 Nov 2010 08:21:58 +0000 (08:21 +0000)]
MFC r215338:
- print out the PID and program name of the program trying to use an
unsupported futex operation
- for those futex operations which are known to be not supported,
print out which futex operation it is
- shortcut the error return of the unsupported FUTEX_CLOCK_REALTIME in
some cases:
FUTEX_CLOCK_REALTIME can be used to tell linux to use
CLOCK_REALTIME instead of CLOCK_MONOTONIC. FUTEX_CLOCK_REALTIME
however must only be set, if either FUTEX_WAIT_BITSET or
FUTEX_WAIT_REQUEUE_PI are set too. If that's not the case
we can die with ENOSYS right at the beginning.
Submitted by: arundel
Reviewed by: rdivacky (earlier iteration of the patch)
MFC after: 1 week