thompsa [Sun, 28 Nov 2010 07:18:14 +0000 (07:18 +0000)]
MFC r215253
Fix LibUSB v1.0 compliance.
1) We need to allow the USB callback to free the USB transfer itself.
2) The USB transfer buffer should only be automatically freed when
freeing the USB transfer.
thompsa [Sun, 28 Nov 2010 07:04:28 +0000 (07:04 +0000)]
MFC r213809
USB network (NCM driver):
- correct the ethernet payload remainder which
must be post-offseted by -14 bytes instead of
0 bytes. This is not very clearly defined in the
NCM specification.
jchandra [Sun, 28 Nov 2010 06:43:39 +0000 (06:43 +0000)]
MFC r215973
The current implementation of vm_page_alloc_freelist() does not handle
order > 0 correctly. Remove order parameter to the function and use it
only for order 0 pages.
zec [Sat, 27 Nov 2010 23:48:53 +0000 (23:48 +0000)]
MFC r215800:
Simplify ng_pipe locking model by relying on the netgraph framework
to provide serialization of calls into the node, which is accomplished
by marking the node as single-threaded (NGF_FORCE_WRITER).
The price we pay is that each ng_pipe instance now has its own callout
handler which polls for queued frames on each clock tick, as long as
the pipe has any frames in its internal queues. OTOH, we got rid of
the global ng_pipe mutex, so from now on multiple ng_pipe instances
can operate in parallel. This change also fixes counting of forwarded
frames when an ng_pipe node is not enforcing any packet impairments.
While here, attempt to improve adherence to style(9) throughout
otherwise mostly unreadable code.
ae [Sat, 27 Nov 2010 13:53:21 +0000 (13:53 +0000)]
MFC r215570:
Add to gpart(8) an ability to backup partition table and
restore it from given backup.
MFC r215671:
Always dump partition labels with `gpart backup`, but `gpart restore`
does restore them only when -l option is specified [1]. Make number of
entries field in backup format optional. Document -l and -r options of
`gpart show` action.
MFC r215672:
Add SIGINT handler to `gpart restore` action.
ae [Sat, 27 Nov 2010 13:38:17 +0000 (13:38 +0000)]
r214975 was not fully adapted to stable/8 and in-kernel version of
"destroy -F" does not work, because g_part_parm_uint assumes that
parameter is an ASCII string, but in head/ it is not.
Add gpart_destroy wrapper function to gpart(8). It changes "force"
parameter and does convert it to string.
jchandra [Sat, 27 Nov 2010 12:26:40 +0000 (12:26 +0000)]
Merge MIPS platform support to 8-STABLE.
This commit merges the MIPS platform changes that are now stable in
-CURRENT into 8-STABLE. The MIPS changesets are too many (~400) to list
here. But the changesets merged in this commit that affect other platforms
are summarized below:
r204635 : (changes to sys/dev/hwpmc, lib/libpmc, sys/sys/pmc.h)
Add support for hwpmc(4) on the MIPS 24K, 32 bit, embedded processor.
r205845: (changes to sys/modules/Makefile)
Fix for building modules on mips and arm.
r204031: (changes to sys/kern/link_elf_obj.c)
printf fix, as part of kernel module support for MIPS.
r206404: (changes to sys/arm/include/bus.h)
Add BUS_SPACE_UNRESTRICTED and define it to be ~0, just like all the
other platforms - for arm and mips.
r206819: (changes to sys/vm/)
Add VMFS_TLB_ALIGNED_SPACE option and kmem_alloc_nofault_space(), which
is used to allocate kernel stack address on MIPS.
r208165, r211087: (sys/kern/subr_smp.c, sys/kern/sched_ule.c)
Enable ULE scheduler for MIPS, Fix for an issue in SMP when 32 cpus are
enabled.
r208659: (sys/{ia64/ia64,mips/mips,sun4v/sun4v}/pmap.c)
Simplify the inner loop of get_pv_entry()
r208794: (changes to sys/vm/)
Make vm_contig_grow_cache() extern, and use it when vm_phys_alloc_contig()
fails to allocate MIPS page table pages.
r210327: (changes to sys/vm/)
Support for MIPS page table page allocation. Add a new function 'vm_page_t
vm_page_alloc_freelist(int flind, int order, int req)' to vm/vm_page.c to
allocate a page from a specified freelist, and other related changes.
mjacob [Sat, 27 Nov 2010 03:59:55 +0000 (03:59 +0000)]
This is an MFC of 205847
Change how multipath labels are created and managed. This makes it easier
to support various storage boxes which really aren't active-active.
We only write the label on the *first* provider. For all other providers
we just "add" the disk. This also allows for an "add" verb.
A usage implication is that you should specify the currently active
storage path as the first provider.
Note that this does not add RDAC-like functionality, but better allows for
autovolumefailover configurations (additional checkins elsewhere will support
this).
lstewart [Sat, 27 Nov 2010 03:19:59 +0000 (03:19 +0000)]
MFC r215552:
When enabling or disabling SIFTR with a VIMAGE kernel, ensure we add or remove
the SIFTR pfil(9) hook functions to or from all network stacks. This patch
allows packets inbound or outbound from a vnet to be "seen" by SIFTR.
Reported and tested by: David Hayes <dahayes at swin edu au>
lstewart [Sat, 27 Nov 2010 02:18:55 +0000 (02:18 +0000)]
Partially MFC r215166:
Disable priming the congestion window from the host cache. The current method
interacts poorly with delayed ack and appropriate byte counting amongst other
things, resulting in undesired delay during a connection's opening slow start.
Even if we did fix the issues, the current method is still dubious at best and
needs to be thought through thoroughly.
Due to a mistake on my behalf, the change described above was committed to head
as part of a larger patch in revision 215166. Instead of waiting for the MFC of
215166, I'm merging just this small portion for the upcoming release without
bringing the mergeinfo for r215166 along. The mergeinfo will sort itself out
when r215166 is eventually merged.
This is an intentional direct commit to the 8-STABLE branch.
Reported by: Maxim Dounin <mdounin at mdounin ru> and others
Submitted by: andre
nwhitehorn [Sat, 27 Nov 2010 00:46:57 +0000 (00:46 +0000)]
MFC r215752:
Properly use SCHAR_MAX instead of CHAR_MAX for 0x7f. This fixes operation
of locate(1) on systems on which char is unsigned by default (ARM and
PowerPC).
nwhitehorn [Sat, 27 Nov 2010 00:37:13 +0000 (00:37 +0000)]
MFC r214494:
Fix netboot on some Apple machines on which calling dma-free on the
network device can hang the machine. This causes the loss of 64 KB of
accessible memory on netbooted machines.
nwhitehorn [Sat, 27 Nov 2010 00:36:11 +0000 (00:36 +0000)]
MFC r214493,214495:
Fix some memory management issues discovered when trying to boot the PPC
OF loader on systems where address cells and size cells are both 2 (the
Mambo simulator) and fix an error where cons_probe() was called before
init_heap() but used malloc() to set environment variables.
simon [Fri, 26 Nov 2010 22:50:58 +0000 (22:50 +0000)]
Merge OpenSSL 0.9.8p into stable/8.
This merges up to and including head/crypto/openssl/ r215697; and
head/secure/lib/libcrypto/, head/secure/lib/libssl/,
head/secure/usr.bin/openssl/ r215698.
To make the merge simpler, a hack was added to set MACHINE_CPUARCH.
Security: CVE-2010-2939, CVE-2010-3864
Security: http://www.openssl.org/news/secadv_20101116.txt
Security: FreeBSD-SA-10:10.openssl
Approved by: re (implicitly - they did not object to the general idea
of the OpenSSL update)
marius [Fri, 26 Nov 2010 21:27:13 +0000 (21:27 +0000)]
MFC: r215780
Remove the description of the link0 link option, since r215297 (merged to
stable/8 in r215881) the master media option generally should be used
instead.
This is MFC'ed in order to discourage the use of the link0 link option
(although it's still available in stable/8).
jkim [Fri, 26 Nov 2010 21:16:21 +0000 (21:16 +0000)]
MFC: r196769, r196771, r211424, r215703, r215754
- Disable caches and flush caches/TLBs when we update PAT as we do for MTRR.
Flushing TLBs is required to ensure cache coherency according to the AMD64
architecture manual. Flushing caches is only required when changing from a
cacheable memory type (WB, WP or WT) to an uncacheable type (WC, UC or UC-).
Since this function is only used once per processor during startup, there is
no need to take any shortcuts.
- Leave PAT indices 0-3 at the default of WB, WT, UC-, and UC. Program 5 as
WP (from default WT) and 6 as WC (from default UC-). Leave 4 and 7 at the
default of WB and UC. This is to avoid transition from a cacheable memory
type to an uncacheable type to minimize possible cache incoherency. Since
we perform flushing caches and TLBs now, this change may not be necessary
any more but we do not want to take any chances.
- Improve pmap_cache_bits() with an array to map PAT memory type to index.
This array is initialized early from pmap_init_pat(), so that we do not need
to handle special cases in the function any more. Now this function is
identical on both amd64 and i386.
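The table-driven lookup described in the last item can be sketched in Python as follows. This is only an illustration under stated assumptions: the PAT layout is taken from the text above, the three PTE bit values are the usual x86 ones for 4 KB pages, and the choice of the lowest matching index plus the names `build_pat_index` and `cache_bits` are hypothetical, not the kernel's code.

```python
# PAT slot layout described above: 5 is reprogrammed to WP and 6 to WC,
# every other slot keeps its default.
PAT_LAYOUT = ["WB", "WT", "UC-", "UC", "WB", "WP", "WC", "UC"]

# x86 page-table-entry bits that together encode a PAT index (4 KB pages).
PG_NC_PWT = 0x008   # index bit 0
PG_NC_PCD = 0x010   # index bit 1
PG_PTE_PAT = 0x080  # index bit 2


def build_pat_index(layout):
    """Map each memory type to the lowest PAT index holding it (assumption)."""
    table = {}
    for index, mem_type in enumerate(layout):
        table.setdefault(mem_type, index)
    return table


PAT_INDEX = build_pat_index(PAT_LAYOUT)


def cache_bits(mem_type):
    """Translate a memory type into PTE cache-control bits via the table."""
    index = PAT_INDEX[mem_type]
    bits = 0
    if index & 1:
        bits |= PG_NC_PWT
    if index & 2:
        bits |= PG_NC_PCD
    if index & 4:
        bits |= PG_PTE_PAT
    return bits
```

Because the table is built once from the layout, the lookup itself needs no per-type special cases.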
marius [Fri, 26 Nov 2010 21:01:19 +0000 (21:01 +0000)]
MFC: r215722
- Fix and enable support for flow control.
- Partially revert r172334; as it turns out the DELAYs in gem_reset_{r,t}x()
are actually necessary although bus space barriers and gem_bitwait() are
used, otherwise the controller may trigger IOMMU errors on at least
sparc64. This is in line with what Linux and OpenSolaris do.
marius [Fri, 26 Nov 2010 20:59:43 +0000 (20:59 +0000)]
MFC: r215720
- Also probe BCM5214 and BCM5222.
- Add some DSP init code for BCM5221. The values derived from Apple's GMAC
driver and the same init code also exists in Linux's sungem_phy driver.
- Only read media status bits when they are valid.
marius [Fri, 26 Nov 2010 20:55:58 +0000 (20:55 +0000)]
MFC: r215302
Move the limiting of the PHY to 10/100 modes of operation due to limitations
of certain MAC models from brgphy(4) to bge(4) where it belongs. While at it,
update the list of models having that restriction to what OpenBSD uses, which
in turn seems to have obtained that information from the Linux tg3 driver.
marius [Fri, 26 Nov 2010 20:45:49 +0000 (20:45 +0000)]
MFC: r215298, r215459, r215714, r215716
- Change these drivers to take advantage of and use the generic IEEE 802.3
annex 31B full duplex flow control as well as the IFM_1000_T master
support committed in r215297 (merged to stable/8 in r215881). For
atphy(4) and jmphy(4) this includes changing these PHY drivers to no
longer unconditionally advertise support for flow control but only if
the selected media has IFM_FLOW set (or MIIF_FORCEPAUSE is set).
- Rename {atphy,jmphy}_auto() to {atphy,jmphy}_setmedia() as these handle
other media types as well.
marius [Fri, 26 Nov 2010 20:37:19 +0000 (20:37 +0000)]
MFC: r214608, r215297(partial), r215713
o Flesh out the generic IEEE 802.3 annex 31B full duplex flow control
support in mii(4):
- Merge generic flow control advertisement (which can be enabled by
passing MIIF_DOPAUSE to mii_attach(9)) and parsing support from
NetBSD into mii_physubr.c and ukphy_subr.c. Unlike as in NetBSD,
IFM_FLOW isn't implemented as a global option via the "don't care
mask" but instead as a media specific option. This has the
following advantages:
o allows flow control advertisement with autonegotiation to be
turned on and off via ifconfig(8) with the default typically
being off (though MIIF_FORCEPAUSE has been added causing flow
control to be always advertised, allowing to easily MFC these
changes for drivers that previously used home-grown support for
flow control that behaved that way without breaking POLA)
o allows to deal with PHY drivers where flow control advertisement
with manual selection doesn't work or at least isn't implemented,
like it's the case with brgphy(4), e1000phy(4) and ip1000phy(4),
by setting MIIF_NOMANPAUSE
o the available combinations of media options are readily available
from the `ifconfig -m` output
- Add IFM_FLOW to IFM_SHARED_OPTION_DESCRIPTIONS and IFM_ETH_RXPAUSE
and IFM_ETH_TXPAUSE to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS so
these are understood by ifconfig(8).
o Make the master/slave support in mii(4) actually usable:
- Change IFM_ETH_MASTER from being implemented as a global option via
the "don't care mask" to a media specific one as it actually is only
applicable to IFM_1000_T to date.
- Let mii_phy_setmedia() set GTCR_MAN_MS in IFM_1000_T slave mode to
actually configure manually selected slave mode (like we also do in
the PHY specific implementations).
- Add IFM_ETH_MASTER to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS so it
is understood by ifconfig(8).
o Switch bge(4), bce(4), msk(4), nfe(4) and stge(4) along with brgphy(4),
e1000phy(4) and ip1000phy(4) to use the generic flow control support
instead of home-grown solutions via IFM_FLAGs. This includes changing
these PHY drivers and smcphy(4) to no longer unconditionally advertise
support for flow control but only if the selected media has IFM_FLOW
set (or MIIF_FORCEPAUSE is set) and implemented for these media variants,
i.e. typically only for copper.
o Switch brgphy(4), ciphy(4), e1000phy(4) and ip1000phy(4) to report and
set IFM_1000_T master mode via IFM_ETH_MASTER instead of via IFF_LINK0
and some IFM_FLAGn.
o Switch brgphy(4) to add at least the supported copper media based on
the contents of the BMSR via mii_phy_add_media() instead of hardcoding
them. The latter approach seems to have developed historically, besides
causing unnecessary code duplication it was also undesirable because
brgphy_mii_phy_auto() already based the capability advertisement on the
contents of the BMSR though.
o Let brgphy(4) set IFM_1000_T master mode on all supported PHY and not
just BCM5701. Apparently this was a misinterpretation of a workaround
in the Linux tg3 driver; BCM5701 seem to require BRGPHY_1000CTL_MSE and
BRGPHY_1000CTL_MSC to be set when configuring autonegotiation but
this doesn't mean we can't set these as well on other PHYs for manual
media selection.
o Let ukphy_status() report IFM_1000_T master mode via IFM_ETH_MASTER so
IFM_1000_T master mode support now is generally available with all PHY
drivers.
o Don't let e1000phy(4) set master/slave bits for IFM_1000_SX as it's
not applicable there.
Unlike as in head, bge(4), bce(4), msk(4), nfe(4) and stge(4) are changed
to set MIIF_FORCEPAUSE in stable/8 so they continue to always advertise
support of flow control and brgphy(4), ciphy(4), e1000phy(4) as well as
ip1000phy(4) are changed to still also accept IFF_LINK0 in addition to
the master media option for setting master mode, both in order to not
violate POLA.
marius [Fri, 26 Nov 2010 18:44:01 +0000 (18:44 +0000)]
MFC: r215259, r215272
- When printing media with more than one media option set aggregate these
in a comma delimited list instead of repeating "mediaopt" for each one.
This matches how the options of the active media are printed with
print_media_word() and brings us in line with what NetBSD does.
- When setting a media with no sub-type specified also reset the type
specific options along with the global ones so these options don't
stick when f.e. switching to IFM_AUTO.
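The aggregated printing described in the first item can be sketched as below. The helper name `format_media` and the exact output layout are assumptions made for illustration; this is not ifconfig(8) source.

```python
def format_media(subtype, options):
    """Print a media word with all options in one comma-delimited list."""
    if not options:
        return f"media {subtype}"
    # One "mediaopt" keyword followed by every option, instead of
    # repeating "mediaopt" once per option.
    return f"media {subtype} mediaopt {','.join(options)}"
```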
zec [Fri, 26 Nov 2010 15:46:49 +0000 (15:46 +0000)]
MFC r215726:
Allow for vlan(4) ifnets to have overlapping unit numbers if they are
created in separated vnets. As a side-effect of having a separated
if_cloner instance for each vnet, all vlan ifnets created in a vnet
will be automatically destroyed when vnet teardown is initiated.
Disallow SIOCSETVLAN and SIOCGETVLAN ioctls on vlan ifnets which are
associated with physical ifnets residing in parent vnets.
This is an interim vlan-specific solution which will be superseded by a
more generic if_cloner V_irtualization change from p4. For nooptions
VIMAGE builds, this should be a no-op change.
zec [Fri, 26 Nov 2010 15:44:16 +0000 (15:44 +0000)]
MFC r215673:
Allow for MTU sizes of up to ETHER_MAX_LEN_JUMBO (i.e. 9018) bytes to be
configured on ng_eiface ifnets. The default MTU remains unchanged at
1500 bytes.
Mark ng_eiface ifnets as IFCAP_VLAN_MTU capable, so that the associated
vlan(4) ifnets may use full-sized Ethernet MTUs (1500 bytes).
bschmidt [Fri, 26 Nov 2010 11:55:51 +0000 (11:55 +0000)]
MFC r215708:
Resurrect amd64 support.
- Many drivers on amd64 are picking system uptime, interrupt time and ticks
via global data structure instead of calling functions for performance
reasons. For now just patch such address so driver will not trigger page
fault when trying to access such data. In future, additional callout may
be added to update data in periodic intervals.
- On amd64 we need to allocate "shadow space" on stack before calling any
function.
bschmidt [Fri, 26 Nov 2010 11:48:47 +0000 (11:48 +0000)]
MFC r215707:
Prefer pmap_extract() over pmap_kextract() as done in MmIsAddressValid().
According to the comment for MmIsAddressValid() there are issues on PAE
kernels using pmap_kextract().
kib [Fri, 26 Nov 2010 11:37:35 +0000 (11:37 +0000)]
Partial MFC r215548:
Remove printf()s in the vop_inactive and vop_reclaim() methods related
to prtactive variable. The prtactive variable definition and declaration
are kept in the stable branch to preserve the KPI and KBI.
bschmidt [Thu, 25 Nov 2010 18:50:59 +0000 (18:50 +0000)]
MFC r215135,215419,215420:
- According to specs for MmAllocateContiguousMemorySpecifyCache() physically
contiguous memory with requested restrictions must be allocated.
- Use kmem_alloc_contig() to honour the cache_type variable.
- Fix a panic on i386 for drivers using MmAllocateContiguousMemory()
and MmAllocateContiguousMemorySpecifyCache().
rstone [Thu, 25 Nov 2010 18:32:02 +0000 (18:32 +0000)]
MFC 215474
When netstat was run with -i/-I and -w1 to produce running counters, the idrop
field printed an absolute value rather than the delta from the last value
delphij [Thu, 25 Nov 2010 18:21:08 +0000 (18:21 +0000)]
MFC r215234:
Update to vendor release 1.20.00.19.
Bug fixes:
* Fixed "inquiry data fails comparion at DV1 step"
* Fixed bad range input in bus_alloc_resource for ADAPTER_TYPE_B
* Fixed arcmsr driver prevent arcsas support for Areca SAS HBA ARC13x0
Many thanks to Areca for continuing to support FreeBSD.
MFC r215280:
Workaround build for PAE case for now - revert the PHYS
case to previous panic behavior.
rrs [Thu, 25 Nov 2010 12:10:59 +0000 (12:10 +0000)]
MFC of 215110:
Fix so that a multicast packet can be sent
even if there is no route out to that mcast address. The code in
in_pcb inadvertently would error (no route) even though
the user may have specified the address with the
proper socket option (to specify the egress interface).
Thanks bz for reminding me I forgot to commit this ;-)
bschmidt [Thu, 25 Nov 2010 08:55:57 +0000 (08:55 +0000)]
MFC r215699:
The meshid element is memcpy()'ed into se_meshid if included in either
beacon or probe-response frames. Fix the condition by checking for the
array's content instead of the always existing array itself.
brucec [Wed, 24 Nov 2010 21:54:45 +0000 (21:54 +0000)]
MFC r215637:
dispatch_add_command:
Modify the logic so there's only one exit point instead of two.
Only insert valid (non-NULL) values into the queue.
dispatch_free_command:
Ensure that item is not NULL before removing it from the queue and
dereferencing the pointer.
NULL out free'd pointers to catch any use-after-free bugs.
glebius [Wed, 24 Nov 2010 05:37:12 +0000 (05:37 +0000)]
MFhead r214508:
Revert a small part of the r198301, that is entirely unrelated to the
r198301 itself. It also broke the logic of not sending more than one
ARP request per second, that consequently led to a potential problem
of flooding network with broadcast packets.
Correct bug introduced while purging the -ERRNO Linuxism from the
grant table API. Valid grant refs are in the range of positive 32bit
integers. ENOSPC, being 28, is also a positive integer. Return
GNTTAB_LIST_END (-1) instead when gnttab_claim_grant_reference() fails.
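A minimal sketch of why a positive errno value is ambiguous here, assuming the caller tests the result against the sentinel. The functions `claim_buggy` and `claim_fixed` are hypothetical stand-ins, and the sentinel value is the -1 quoted above.

```python
import errno

GNTTAB_LIST_END = -1  # sentinel value as given in the text


def claim_buggy(free_refs):
    """Old behaviour: a positive errno is returned on exhaustion."""
    return free_refs.pop() if free_refs else errno.ENOSPC


def claim_fixed(free_refs):
    """New behaviour: the sentinel can never collide with a valid ref."""
    return free_refs.pop() if free_refs else GNTTAB_LIST_END


def looks_valid(ref):
    """Caller-side check: anything that is not the sentinel is used as a ref."""
    return ref != GNTTAB_LIST_END
```

With an empty free list, the old return value passes the caller's check and would be used as if it were a grant reference.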
Correct alignment and boundary constraints in blkfront's bus dma tag. The
blkif interface in Xen requires all I/O to be 512 byte aligned with each
segment bounded by a 4k page.
Note: This submission only documents the proper constraints for blkif I/O.
The alignment code in busdma does not yet handle alignment constraints
correctly in all cases.
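The two constraints (512-byte alignment, segments bounded by a 4k page) can be illustrated with a small sketch. The function `split_into_segments` is hypothetical and models only the arithmetic, not busdma.

```python
SECTOR_SIZE = 512   # required alignment from the text
PAGE_SIZE = 4096    # each segment must stay inside one 4k page


def split_into_segments(addr, length):
    """Split an I/O range into (address, length) segments that never cross a page."""
    if addr % SECTOR_SIZE or length % SECTOR_SIZE:
        raise ValueError("blkif I/O must be 512-byte aligned")
    segments = []
    while length > 0:
        room = PAGE_SIZE - (addr % PAGE_SIZE)  # bytes left in this page
        seg_len = min(room, length)
        segments.append((addr, seg_len))
        addr += seg_len
        length -= seg_len
    return segments
```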
Improve the Xen para-virtualized device infrastructure of FreeBSD:
o Add support for backend devices (e.g. blkback)
o Implement extensions to the Xen para-virtualized block API to allow
for larger and more outstanding I/Os.
o Import a completely rewritten block back driver with support for
fronting I/O to both raw devices and files.
o General cleanup and documentation of the XenBus and XenStore support
code.
o Robustness and performance updates for the block front driver.
o Fixes to the netfront driver.
Sponsored by: Spectra Logic Corporation
sys/xen/xenbus/init.txt:
Deleted: This file explains the Linux method for XenBus device
enumeration and thus does not apply to FreeBSD's NewBus approach.
sys/xen/xenbus/xenbus_probe_backend.c:
Deleted: Linux version of backend XenBus service routines. It
was never ported to FreeBSD. See xenbusb.c, xenbusb_if.m,
xenbusb_front.c xenbusb_back.c for details of FreeBSD's XenBus
support.
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenbus/xenbus_xs.c:
sys/xen/xenbus/xenbus_comms.c:
sys/xen/xenbus/xenbus_comms.h:
sys/xen/xenstore/xenstorevar.h:
sys/xen/xenstore/xenstore.c:
Split XenStore into its own tree. XenBus is a software layer
built on top of XenStore. The old arrangement and the naming of
some structures and functions blurred these lines making it
difficult to discern what services are provided by which layer
and at what times these services are available (e.g. during
system startup and shutdown).
sys/xen/xenbus/xenbus_client.c:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_probe.c:
sys/xen/xenbus/xenbusb.c:
sys/xen/xenbus/xenbusb.h:
Split up XenBus code into methods available for use by client
drivers (xenbus.c) and code used by the XenBus "bus code" to
enumerate, attach, detach, and service bus drivers.
sys/xen/reboot.c:
sys/dev/xen/control/control.c:
Add a XenBus front driver for handling shutdown, reboot,
suspend, and resume events published in the XenStore.
Move all PV suspend/reboot support from reboot.c into
this driver.
sys/xen/blkif.h:
New file from Xen vendor with macros and structures used by
a block back driver to service requests from a VM running a
different ABI (e.g. amd64 back with i386 front).
sys/conf/files:
Adjust kernel build spec for new XenBus/XenStore layout and added
Xen functionality.
sys/dev/xen/balloon/balloon.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/blkfront/blkfront.c:
sys/xen/xenbus/...
sys/xen/xenstore/...
o Rename XenStore APIs and structures from xenbus_* to xs_*.
o Adjust to use of M_XENBUS and M_XENSTORE malloc types
for allocation of objects returned by these APIs.
o Adjust for changes in the bus interface for Xen
drivers.
sys/xen/xenbus/...
sys/xen/xenstore/...
Add Doxygen comments for these interfaces and the code that
implements them.
sys/dev/xen/blkback/blkback.c:
o Rewrite the Block Back driver to attach properly via newbus,
operate correctly in both PV and HVM mode regardless of domain
(e.g. can be in a DOM other than 0), and to deal with the latest
metadata available in XenStore for block devices.
o Allow users to specify a file as a backend to blkback, in addition
to character devices. Use the namei lookup of the backend path
to automatically configure, based on file type, the appropriate
backend method.
The current implementation is limited to a single outstanding I/O
at a time to file backed storage.
sys/dev/xen/blkback/blkback.c:
sys/xen/interface/io/blkif.h:
sys/xen/blkif.h:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
Extend the Xen blkif API: Negotiable request size and number of
requests.
This change extends the information recorded in the XenStore
allowing block front/back devices to negotiate for optimal I/O
parameters. This has been achieved without sacrificing backward
compatibility with drivers that are unaware of these protocol
enhancements. The extensions center around the connection protocol
which now includes these additions:
o The back-end device publishes its maximum supported values for
request I/O size, the number of page segments that can be
associated with a request, the maximum number of requests that
can be concurrently active, and the maximum number of pages that
can be in the shared request ring. These values are published
before the back-end enters the XenbusStateInitWait state.
o The front-end waits for the back-end to enter either the InitWait
or Initialize state. At this point, the front end limits its
own capabilities to the lesser of the values it finds published
by the backend, its own maximums, or, should any back-end data
be missing in the store, the values supported by the original
protocol. It then initializes its internal data structures
including allocation of the shared ring, publishes its maximum
capabilities to the XenStore and transitions to the Initialized
state.
o The back-end waits for the front-end to enter the Initialized
state. At this point, the back end limits its own capabilities
to the lesser of the values it finds published by the frontend,
its own maximums, or, should any front-end data be missing in
the store, the values supported by the original protocol. It
then initializes its internal data structures, attaches to the
shared ring and transitions to the Connected state.
o The front-end waits for the back-end to enter the Connected
state, transitions itself to the connected state, and can
commence I/O.
Although an updated front-end driver must be aware of the back-end's
InitWait state, the back-end has been coded such that it can
tolerate a front-end that skips this step and transitions directly
to the Initialized state without waiting for the back-end.
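The limit selection each side performs in the steps above can be sketched as follows. The key names and the legacy values are placeholders chosen for illustration, not the actual XenStore node names or protocol constants, and falling back to the legacy value for a missing key is one reading of the description.

```python
# Placeholder values standing in for "the values supported by the
# original protocol"; the real numbers are not given in the text.
LEGACY = {"requests": 32, "segments": 11, "ring_pages": 1}


def negotiate(own_max, peer_published, legacy=LEGACY):
    """Pick, per parameter, the limit one end should operate with."""
    agreed = {}
    for key, own in own_max.items():
        if key in peer_published:
            # Lesser of what the peer published and our own maximum.
            agreed[key] = min(own, peer_published[key])
        else:
            # Peer predates the extension: use the original protocol value.
            agreed[key] = legacy[key]
    return agreed
```

An old peer that publishes nothing yields the legacy limits for every parameter, which is how backward compatibility falls out of the same rule.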
sys/xen/interface/io/blkif.h:
o Increase BLKIF_MAX_SEGMENTS_PER_REQUEST to 255. This is
the maximum number possible without changing the blkif
request header structure (nr_segs is a uint8_t).
o Add two new constants:
BLKIF_MAX_SEGMENTS_PER_HEADER_BLOCK, and
BLKIF_MAX_SEGMENTS_PER_SEGMENT_BLOCK. These respectively
indicate the number of segments that can fit in the first
ring-buffer entry of a request, and for each subsequent
(sg element only) ring-buffer entry associated with the
"header" ring-buffer entry of the request.
o Add the blkif_request_segment_t typedef for segment
elements.
o Add the BLKRING_GET_SG_REQUEST() macro which wraps the
RING_GET_REQUEST() macro and returns a properly cast
pointer to an array of blkif_request_segment_ts.
o Add the BLKIF_SEGS_TO_BLOCKS() macro which calculates the
number of ring entries that will be consumed by a blkif
request with the given number of segments.
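The calculation that BLKIF_SEGS_TO_BLOCKS() is described as performing can be sketched like this. The per-block capacities are passed in as parameters because the text does not state them; the values used in any call are hypothetical.

```python
def segs_to_blocks(nsegs, per_header_block, per_segment_block):
    """Ring entries consumed by a request with nsegs segments."""
    if nsegs <= per_header_block:
        return 1  # everything fits in the "header" ring entry
    extra = nsegs - per_header_block
    # Ceiling division for the segment-only entries that follow the header.
    return 1 + -(-extra // per_segment_block)
```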
sys/xen/blkif.h:
o Update for changes in interface/io/blkif.h macros.
o Update the BLKIF_MAX_RING_REQUESTS() macro to take the
ring size as an argument to allow this calculation on
multi-page rings.
o Add a companion macro to BLKIF_MAX_RING_REQUESTS(),
BLKIF_RING_PAGES(). This macro determines the number of
ring pages required in order to support a ring with the
supplied number of request blocks.
sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
o Negotiate with the other-end with the following limits:
Request Size: MAXPHYS
Max Segments: (MAXPHYS/PAGE_SIZE) + 1
Max Requests: 256
Max Ring Pages: Sufficient to support Max Requests with
Max Segments.
o Dynamically allocate request pools and segments-per-request.
o Update ring allocation/attachment code to support a
multi-page shared ring.
o Update routines that access the shared ring to handle
multi-block requests.
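A short worked example of the "Max Segments" limit above. The figures 128 KB for MAXPHYS and 4 KB for PAGE_SIZE are assumptions used only to make the arithmetic concrete; the extra segment accounts for a transfer that does not start on a page boundary.

```python
def max_segments(maxphys, page_size):
    """(MAXPHYS / PAGE_SIZE) + 1, as listed in the limits above."""
    return maxphys // page_size + 1


def pages_touched(offset, length, page_size):
    """Number of distinct pages a transfer of `length` bytes at `offset` spans."""
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return last - first + 1
```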
sys/dev/xen/blkfront/blkfront.c:
o Track blkfront allocations in a blkfront driver specific
malloc pool.
o Strip out XenStore transaction retry logic in the
connection code. Transactions only need to be used when
the update to multiple XenStore nodes must be atomic.
That is not the case here.
o Fully disable blkif_resume() until it can be fixed
properly (it didn't work before this change).
o Destroy bus-dma objects during device instance tear-down.
o Properly handle backend devices with power-of-2 sector
sizes larger than 512b.
sys/dev/xen/blkback/blkback.c:
Advertise support for and implement the BLKIF_OP_WRITE_BARRIER
and BLKIF_OP_FLUSH_DISKCACHE blkif opcodes using BIO_FLUSH and
the BIO_ORDERED attribute of bios.
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
Fix various bugs in blkfront.
o gnttab_alloc_grant_references() returns 0 for success and
non-zero for failure. The check for < 0 is a leftover
Linuxism.
o When we negotiate with blkback and have to reduce some of our
capabilities, print out the original and reduced capability before
changing the local capability. So the user now gets the correct
information.
o Fix blkif_restart_queue_callback() formatting. Make sure we hold
the mutex in that function before calling xb_startio().
o Fix a couple of KASSERT()s.
o Fix a check in the xb_remove_* macro to be a little more specific.
sys/xen/gnttab.h:
sys/xen/gnttab.c:
Define GNTTAB_LIST_END publicly as GRANT_REF_INVALID.
sys/dev/xen/netfront/netfront.c:
Use GRANT_REF_INVALID instead of driver private definitions of the
same constant.
sys/xen/gnttab.h:
sys/xen/gnttab.c:
Add the gnttab_end_foreign_access_references() API.
This API allows a client to batch the release of an
array of grant references, instead of coding a private
for loop. The implementation takes advantage of this
batching to reduce lock overhead to one acquisition and
release per-batch instead of per-freed grant reference.
While here, reduce the duration the gnttab_list_lock
is held during gnttab_free_grant_references() operations.
The search to find the tail of the incoming free list
does not rely on global state and so can be performed
without holding the lock.
sys/dev/xen/xenpci/evtchn.c:
sys/dev/xen/evtchn/evtchn.c:
sys/xen/xen_intr.h:
o Implement the bind_interdomain_evtchn_to_irqhandler
API for HVM mode. This allows an HVM domain to serve
back end devices to other domains. This API is already
implemented for PV mode.
o Synchronize the API between HVM and PV.
sys/dev/xen/xenpci/xenpci.c:
o Scan the full region of CPUID space in which the Xen
VMM interface may be implemented. On systems using
SuSE as a Dom0 where the Viridian API is also exported,
the VMM interface is above the region we used to
search.
o Pass through bus_alloc_resource() calls so that XenBus drivers
attaching on an HVM system can allocate unused physical address
space from the nexus. The block back driver makes use of this
facility.
sys/i386/xen/xen_machdep.c:
Use the correct type for accessing the statically mapped xenstore
metadata.
sys/xen/interface/hvm/params.h:
sys/xen/xenstore/xenstore.c:
Move hvm_get_parameter() to the correct global header file instead
of as a private method to the XenStore.
sys/xen/interface/io/protocols.h:
Sync with vendor.
sys/xen/interface/io/ring.h:
Add macro for calculating the number of ring pages needed for an N
deep ring.
To avoid duplication within the macros, create and use the new
__RING_HEADER_SIZE() macro. This macro calculates the size of the
ring book keeping struct (producer/consumer indexes, etc.) that
resides at the head of the ring.
Add the __RING_PAGES() macro which calculates the number of shared
ring pages required to support a ring with the given number of
requests.
These APIs are used to support the multi-page ring version of the
Xen block API.
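The page-count calculation attributed to __RING_PAGES() can be sketched as below. The header and entry sizes are hypothetical parameters, and the real macro may apply additional rounding; this only shows the basic "header plus N entries, rounded up to whole pages" idea.

```python
def ring_pages(n_requests, entry_size, header_size, page_size=4096):
    """Shared pages needed for a ring of n_requests entries plus its header."""
    total = header_size + n_requests * entry_size
    return -(-total // page_size)  # ceiling division
```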
sys/xen/interface/io/xenbus.h:
Add Comments.
sys/xen/xenbus/...
o Refactor the FreeBSD XenBus support code to allow for
both front and backend device attachments.
o Make use of new config_intr_hook capabilities to allow
front and back devices to be probed/attached in parallel.
o Fix bugs in probe/attach state machine that could
cause the system to hang when confronted with a failure
either in the local domain or in a remote domain to
which one of our driver instances is attaching.
o Publish all required state to the XenStore on device
detach and failure. The majority of the missing
functionality was for serving as a back end since the
typical "hot-plug" scripts in Dom0 don't handle the
case of cleaning up for a "service domain" that is
not itself.
o Add dynamic sysctl nodes exposing the generic ivars of
XenBus devices.
o Add doxygen style comments to the majority of the code.
o Cleanup types, formatting, etc.
sys/xen/xenbus/xenbusb.c:
Common code used by both front and back XenBus busses.
sys/xen/xenbus/xenbusb_if.m:
Method definitions for a XenBus bus.
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusb_back.c:
XenBus bus specialization for front and back devices.
sys/dev/xen/blkback/blkback.c:
In xbb_detach() only perform cleanup of our taskqueue and
device statistics structures if they have been initialized.
This avoids a panic when xbb_detach() is called on a partially
initialized device instance, due to an early failure in
attach.
Purge mergeinfo on sys/dev/xen/xenpci. The only unique mergeinfo compared
to head was not useful (it came in with the merge from /user/dfr/xenhvm/7
and that mergeinfo is still present at sys/) and not worth keeping an extra
set of mergeinfo around in the kernel.
mav [Tue, 23 Nov 2010 21:42:26 +0000 (21:42 +0000)]
MFC r215468:
Make the ATA_CAM wrapper report SATA power management capabilities to CAM so
that CAM configures the device to initiate transitions if the controller is
configured to accept them. This makes the hint.ata.X.pm_level=1 mode work.
uqs [Tue, 23 Nov 2010 21:36:53 +0000 (21:36 +0000)]
MFC r214237,214489:
Remove mention of nonexistent -o flag for debugging options.
Fix CPU load reporting independent of scheduler used.
- Sample CPU usage data from kern.cp_times, this makes for a far more
accurate and scheduler independent algorithm.
- Rip out the process list scraping that is no longer required.
- Don't update CPU usage sampling on every request, but every 15s
instead. This makes it impossible for an attacker to hide the CPU load
by triggering 4 samplings in short succession when the system is idle.
- After reaching the steady-state, the system will always report the
average CPU load of the last 60 sampled seconds.
- Untangling of call graph.
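The sampling scheme in the list above (one sample per 15s, average over the last 60 sampled seconds) can be sketched as follows. The class `LoadSampler` is hypothetical, and the cumulative (busy, idle) tick pair is a simplification of what kern.cp_times reports.

```python
from collections import deque


class LoadSampler:
    INTERVAL = 15                 # seconds between samples
    WINDOW = 60 // INTERVAL       # samples kept: 4 cover 60 seconds

    def __init__(self):
        self.last = None
        self.loads = deque(maxlen=self.WINDOW)

    def sample(self, busy_ticks, idle_ticks):
        """Record the load since the previous sample from cumulative ticks."""
        if self.last is not None:
            busy = busy_ticks - self.last[0]
            idle = idle_ticks - self.last[1]
            if busy + idle > 0:
                self.loads.append(busy / (busy + idle))
        self.last = (busy_ticks, idle_ticks)

    def average(self):
        """Average load over the retained window (0.0 before any interval)."""
        return sum(self.loads) / len(self.loads) if self.loads else 0.0
```

Since samples are taken on a timer rather than per request, a burst of requests cannot push older readings out of the window.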
mav [Tue, 23 Nov 2010 21:35:13 +0000 (21:35 +0000)]
MFC r214288:
Make the da driver handle some probably broken Android devices that return
zero media and sector size instead of a "Medium not present" error
until some confirmation button is tapped on the device.
yongari [Tue, 23 Nov 2010 19:11:27 +0000 (19:11 +0000)]
MFC r215327,215350:
r215327:
P5N32-SLI PREMIUM from ASUSTeK is known to have MSI/MSI-X issue
such that nfe(4) does not work with MSI-X. When MSI-X support was
introduced, I remember MCP55 controller worked without problems so
the issue could be either PCI bridge or BIOS issue. But I also
noticed snd_hda(4) disabled MSI on all MCP55 chipset so I'm still
not sure this is generic issue of MCP55 chipset. If this was PCI
bridge issue we would have added it to a system wide black-list
table but it's not clear to me at this moment whether it was caused
by either broken BIOS or silicon bug of MCP55 chipset.
To work around the issue, maintain a MSI/MSI-X black-list table in
driver and lookup base board manufacturer and product name from the
table before attempting to use MSI-X. If the driver finds a matching
entry, nfe(4) will not use MSI/MSI-X and fall back on traditional
INTx mode. This approach should be the last resort since it relies
on smbios and if another instance of MSI/MSI-X breakage is reported
with different maker/product, we may have to get the PCI bridge
black-listed instead of adding a new entry.
PR: kern/152150
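The black-list lookup described for r215327 can be sketched roughly as below. The table contents and the maker string are illustrative only (the SMBIOS strings a real board reports may differ), and `choose_interrupt_mode` is a hypothetical name.

```python
# Illustrative entry based on the board named above; the exact strings
# reported by SMBIOS are an assumption.
MSI_BLACKLIST = [("ASUSTeK", "P5N32-SLI PREMIUM")]


def choose_interrupt_mode(maker, product, blacklist=MSI_BLACKLIST):
    """Fall back to INTx when the base board matches a black-list entry."""
    for bad_maker, bad_product in blacklist:
        if maker == bad_maker and product == bad_product:
            return "INTx"
    return "MSI-X"
```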
r215350:
Plug memory leakage introduced in r215327.