rstone [Sat, 10 Mar 2012 02:27:04 +0000 (02:27 +0000)]
MFC r230984:
Whenever a new kernel thread is spawned, explicitly clear any CPU affinity
set on the new thread. This prevents the thread from inadvertently
inheriting affinity from a random sibling.
kib [Wed, 7 Mar 2012 18:33:11 +0000 (18:33 +0000)]
Synchronize nullfs with HEAD, mostly merge all locking changes.
Tested by: pho
MFC r229428:
Document the state of the lowervp vnode for null_nodeget().
MFC r229431:
Do the vput() for the lowervp in the null_nodeget() for error case too.
Several callers of null_nodeget() did the cleanup themselves, but several
missed it, the most prominent being null_bypass(). Remove the cleanup from
the callers; null_nodeget() now frees lowervp itself.
MFC r229600 (by dim):
In sys/fs/nullfs/null_subr.c, in a KASSERT, output the correct vnode
pointer 'lowervp' instead of 'vp', which is uninitialized at that point.
MFC r230304 (by rea):
Use hashdestroy() instead of naive free().
MFC r232299:
Move the code to destroy half-constructed nullfs vnode into helper
function null_destroy_proto() from null_insmntque_dtr(). Also
apply null_destroy_proto() in null_nodeget() when we raced and a vnode
is found in the hash, so the currently allocated protonode shall be
destroyed.
Lock the vnode interlock around reassigning the v_vnlock.
MFC r232301:
Always request exclusive lock for the lower vnode in nullfs_vget().
The null_nodeget() requires exclusive lock on lowervp to be able to
insmntque() new vnode.
MFC r232303:
In null_reclaim(), assert that the reclaimed vnode is fully constructed,
instead of accepting a half-constructed vnode. The previous code could not
decide what to do with such a vnode anyway and, although it processed the
vnode for hash removal, panicked later when getting rid of the nullfs
reference on lowervp.
While there, remove initializations from the declaration block.
MFC r232304:
Document that null_nodeget() cannot take shared-locked lowervp due to
insmntque() requirements.
MFC r232305:
Allow shared locks for reads when the lower filesystem accepts shared locking.
MFC r232383:
Do not expose unlocked unconstructed nullfs vnode on mount list.
Lock the native nullfs vnode lock before switching the locks.
kib [Tue, 6 Mar 2012 11:16:14 +0000 (11:16 +0000)]
MFC r232239:
Fix a race in top non-interactive mode. Use plain sleep(3) call instead
of arming timer and then pausing. If SIGALRM is delivered before pause(3)
is entered, top hangs.
Add regression test scripts for multi-IP FIBs exercising the send,
receive and forward path tagging packets with both the ifconfig fib
option or using ipfw, running ICMP6, TCP/v6 and UDP/v6 tests and
testing both setfib(2) as well as the SO_SETFIB socket option.
At 16 FIBs a total of over 64k return codes/replies/stati are checked,
sometimes multiple times (in different ways, e.g. the reflected request
as well as ipfw counter values).
The scripts need two or three machines to run and are thus not added
to the tools/regression framework but only to tools/test.
MFC r232114:
Update scripts to work around two sh(1) bugs found in stable/8:
1) _x=$((_x + 1)) does not work while x=$((x + 1)) does.
2) Parameter Expansion, esp. "${x%%bar}" does not work if quoted.
Correct typos and improve some details forwarding.sh already
had in initiator, esp. related to ipfw accepting if the default
is deny.
Add an extra stat call to the "delay" function in addition to the
touch which together is still a lot faster than sleep 1 but seems
to help a lot more to mitigate the unrelated kernel race seen.
Add regression tests for the setsockopt(2) SO_SETFIB socket option.
Check that the expected domain(9) families all handle the socket option
correctly and do proper bounds checks. This would catch bugs as fixed
in (r230938,)r230981.
ken [Mon, 5 Mar 2012 19:01:23 +0000 (19:01 +0000)]
MFC 232411:
Fix a problem that was causing the mpt(4) driver to attach to MegaRAID
cards that should be handled by the mfi(4) driver.
The root of the problem is that the mpt(4) driver was masking off the
bottom bit of the PCI device ID when deciding which cards to attach to.
It appears that a number of the mpt(4) Fibre Channel cards had a LAN
variant whose PCI device ID was just one bit off from the FC card's device
ID. The FC cards were even and the LAN cards were odd.
The problem was that this pattern wasn't carried over on the SAS and
parallel SCSI mpt(4) cards. Luckily the SAS and parallel SCSI PCI device
IDs were either even numbers, or they would get masked to a supported
adjacent PCI device ID, and everything worked well.
Now LSI is using some of the odd-numbered PCI device IDs between the 3Gb
SAS device IDs for their new MegaRAID cards. This is causing the mpt(4)
driver to attach to the RAID cards instead of the mfi(4) driver.
The solution is to stop masking off the bottom bit of the device ID, and
explicitly list the PCI device IDs of all supported cards.
This change should be a no-op for mpt(4) hardware. The only intended
functional change is that for the 929X, the is_fc variable gets set. It
wasn't being set previously, but needs to be because the 929X is a Fibre
Channel card.
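The probe problem described above can be sketched as follows. This is only an illustration: the device IDs are made-up placeholders, not real LSI PCI IDs, and the function names are hypothetical rather than taken from the mpt(4) source.

```python
# Made-up device IDs, for illustration only (not real LSI PCI IDs).
SUPPORTED_MPT_IDS = {0x1000, 0x1001, 0x2000, 0x2004}
NEW_RAID_ID = 0x2001  # odd ID next to a supported one, meant for another driver


def probe_masked(device_id: int) -> bool:
    """Old behaviour: drop the bottom bit of the ID before comparing."""
    masked_supported = {dev & ~1 for dev in SUPPORTED_MPT_IDS}
    return (device_id & ~1) in masked_supported


def probe_explicit(device_id: int) -> bool:
    """New behaviour: match only IDs that are listed explicitly."""
    return device_id in SUPPORTED_MPT_IDS
```

With masking, NEW_RAID_ID collapses onto its even neighbour and is claimed by mistake; with the explicit list it is left for the other driver.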
hrs [Mon, 5 Mar 2012 18:40:53 +0000 (18:40 +0000)]
MFC r225682:
Copy ip6po_minmtu and ip6po_prefer_tempaddr in ip6_copypktopts(). This fixes
inconsistency when options are specified by both setsockopt() and ancillary
data types.
delphij [Mon, 5 Mar 2012 17:09:16 +0000 (17:09 +0000)]
MFC r232202:
Drop setuid status while doing file operations to prevent potential
information leak. This changeset is intended to be a minimal one
to make backports easier.
delphij [Mon, 5 Mar 2012 17:06:34 +0000 (17:06 +0000)]
MFC r231888:
Put the signal trap output to standard error instead of standard output.
Without this change, pressing ^T could result in rc.d script putting
junk strings like:
Script <filename> running
in configuration files when redirecting standard output to these files.
emaste [Thu, 1 Mar 2012 19:43:28 +0000 (19:43 +0000)]
MFC r232267:
Workaround for PCIe 4GB boundary issue
Enforce a boundary of no more than 4GB - transfers crossing a 4GB
boundary can lead to data corruption due to PCIe limitations. This
change is a less-intrusive workaround that can be quickly merged back
to older branches; a cleaner implementation will arrive in HEAD later
but may require KPI changes.
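A minimal sketch of the condition being guarded against, assuming a transfer is described by a start address and a byte length. How the kernel enforces the boundary is not shown in this log; the helper names here are hypothetical.

```python
BOUNDARY = 1 << 32  # 4GB


def crosses_4gb(addr: int, length: int) -> bool:
    """True if the byte range [addr, addr + length) spans a 4GB boundary."""
    if length <= 0:
        return False
    return (addr // BOUNDARY) != ((addr + length - 1) // BOUNDARY)


def split_at_4gb(addr: int, length: int) -> list:
    """Split a transfer into pieces that never cross a 4GB boundary."""
    pieces = []
    while length > 0:
        room = BOUNDARY - (addr % BOUNDARY)  # bytes left before the boundary
        chunk = min(length, room)
        pieces.append((addr, chunk))
        addr += chunk
        length -= chunk
    return pieces
```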
r231743
=======
Enhance documentation, improve interoperability, and fix defects in
FreeBSD's front and back Xen blkif interface drivers.
sys/dev/xen/blkfront/block.h:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkback/blkback.c:
Replace FreeBSD specific multi-page ring implementation with
support for both the Citrix and Amazon/RedHat versions of this
extension.
sys/dev/xen/blkfront/blkfront.c:
o Add a per-instance sysctl tree that exposes all negotiated
transport parameters (ring pages, max number of requests,
max request size, max number of segments).
o In blkfront_vdevice_to_unit() add a missing return statement
so that we properly identify the unit number for high numbered
xvd devices.
sys/dev/xen/blkback/blkback.c:
o Add static dtrace probes for several events in this driver.
o Defer connection shutdown processing until the front-end
enters the closed state. This avoids prematurely tearing
down the connection when buggy front-ends transition to the
closing state, even though the device is open and they
veto the close request from the tool stack.
o Add nodes for maximum request size and the number of active
ring pages to the existing, per-instance, sysctl tree.
o Miscellaneous style cleanup.
sys/xen/interface/io/blkif.h:
o Add extensive documentation of the XenStore nodes used to
implement the blkif interface.
o Document the startup sequence between a front and back driver.
o Add structures and documentation for the "discard" feature
(AKA Trim).
o Cleanup some definitions related to FreeBSD's request
number/size/segment-limit extension.
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkback/blkback.c:
sys/xen/xenbus/xenbusvar.h:
Add the convenience function xenbus_get_otherend_state() and
use it to simplify some logic in both block-front and block-back.
r231836
=======
Fix "_" vs. "-" typo in a comment. No functional changes.
r231837
=======
Fix typo in a printf string: "specificed" -> "specified".
r231839
=======
Fix a bug in the calculation of the maximum I/O request size.
The previous code did not limit the I/O request size based on
the maximum number of segments supported by the back-end. In
current practice, since the only back-end supporting chained
requests is the FreeBSD implementation, this limit was never
exceeded.
sys/dev/xen/blkfront/block.h:
Add two macros, XBF_SEGS_TO_SIZE() and XBF_SIZE_TO_SEGS(),
to centralize the logic of reserving a segment to deal with
non-page-aligned I/Os.
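The idea of reserving one segment for non-page-aligned I/O can be sketched like this. The functions are named after the two macros, but their bodies are an assumption about the arithmetic, the page size is assumed to be 4096, and limit_request_size() is a hypothetical helper.

```python
PAGE_SIZE = 4096  # assumed page size


def segs_to_size(segs: int) -> int:
    """Largest I/O that fits in `segs` segments even if it starts mid-page."""
    return (segs - 1) * PAGE_SIZE


def size_to_segs(size: int) -> int:
    """Segments needed for a page-multiple `size` that may start mid-page."""
    return size // PAGE_SIZE + 1


def limit_request_size(wanted: int, backend_max_segs: int) -> int:
    """Clamp a desired request size to what the advertised segments can carry."""
    return min(wanted, segs_to_size(backend_max_segs))
```

For example, with 11 segments the largest unaligned request is 10 pages, because an unaligned 10-page transfer touches 11 pages.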
sys/dev/xen/blkfront/blkfront.c:
o When negotiating transfer parameters, limit the
max_request_size we use and publish, if it is greater
than the maximum, unaligned, I/O we can support with
the number of segments advertised by the backend.
o Don't unilaterally reduce the I/O size published to
the disk layer by a single page. max_request_size
is already properly limited in the transfer parameter
negotiation code.
o Fix typos in printf strings:
"max_requests_segments" -> "max_request_segments"
"specificed" -> "specified"
r231883
=======
Fix regression in the handling of blkback close events for
devices that are unplugged via QEMU.
sys/dev/xen/blkback/blkback.c:
Toolstack initiated closures change the frontend's state
to Closing. The backend must change to Closing as well,
even if we can't actually close yet, in order for the
frontend to notice and start the closing process.
r232308
=======
blkif interface comment cleanups. No functional changes
sys/xen/interface/io/blkif.h:
o Insert space in "Red Hat".
o Fix typo "discard-aligment" -> "discard-alignment"
o Fix typo "unamp" -> "unmap"
o Fix typo "formated" -> "formatted"
o Clarify the text for "params".
o Clarify the text for "sector-size".
o Clarify the text for "max-requests" in the backend section.
thompsa [Wed, 29 Feb 2012 20:22:45 +0000 (20:22 +0000)]
MFC r232008,232010,232080,232089
Using the flowid in the mbuf assumes the network card is giving a good hash for
the traffic flow. This may not be the case, resulting in poor traffic distribution.
Add a sysctl which allows us to fall back to our own flow hash code.
brueffer [Sat, 25 Feb 2012 18:48:06 +0000 (18:48 +0000)]
MFC: r231871
Switch the license boilerplates to our standard one.
Advantages:
- Reduces the number of different license versions in the tree
- Eliminates a typo
- Removes some incorrect author attributions due to c/p
- Removes c/p error potential for future pmc manpages
Approved by: re (kib), jkoshy, gnn, rpaulo, fabient (copyright holders)
marius [Sat, 25 Feb 2012 00:35:28 +0000 (00:35 +0000)]
MFC: r231913
- Probe BCM57780.
- In case the parent is bge(4), don't set the Jumbo frame settings unless
the MAC actually is Jumbo capable as otherwise the PHY might not have the
corresponding registers implemented. This is also in line with what the
Linux tg3 driver does.
PR: 165032
Submitted by: Alexander Milanov
Approved by: re (kib)
Obtained from: OpenBSD
glebius [Fri, 24 Feb 2012 12:32:50 +0000 (12:32 +0000)]
Merge r230598 by kmacy from head:
A flowtable entry can continue referencing an llentry indefinitely
if the entry is repeatedly referenced within its timeout window.
This change clears the LLE_VALID flag when an llentry is removed
from an interface's hash table and adds an extra check to the
flowtable code for the LLE_VALID flag in llentry to avoid retaining
and using a stale reference.
marius [Fri, 24 Feb 2012 00:48:27 +0000 (00:48 +0000)]
MFC: r231621
- As it turns out, MSI-X is broken for at least LSI SAS1068E when passed
through by VMware so blacklist their PCI-PCI bridge for MSI/MSI-X here.
Note that there is currently no quirk type that disables MSI-X only.
Although there is no evidence that MSI fails with the VMware
pass-through, it is really questionable whether MSI generally works in
that setup as VMware only mentions three known working devices [1, p. 4].
Also note that this quirk entry currently doesn't affect the devices
emulated by VMware in any way as these don't claim to support MSI/MSI-X
to begin with. [2]
While at it, make the PCI quirk table const and static.
- Remove some duplicated empty lines.
- Use DEVMETHOD_END.
kib [Tue, 21 Feb 2012 10:16:17 +0000 (10:16 +0000)]
MFC r231160 (by mckusick):
Do not fsync all resident UFS vnodes from the syncer vnode call
to ffs_sync(). Since all inode metadata updates are translated to
inodeblock updates, the vnodes syncing is handled by syncer dirty buffer
wheel. The only things that shall be synced by ffs_sync() from the
syncer calls are the filesystem metadata proper.
kib [Tue, 21 Feb 2012 10:11:17 +0000 (10:11 +0000)]
MFC r231122:
Sprinkle missed calls to asynchronous UFS_UPDATE() in an attempt to
guarantee that all UFS inode metadata changes result in the dirtiness
of the inodeblock.
kib [Tue, 21 Feb 2012 00:32:24 +0000 (00:32 +0000)]
MFC r231075:
Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local
thread flag to disable async i/o for current thread only. Use the
opportunity to move DOINGASYNC() macro into sys/vnode.h and
consistently use it through places which tested for MNTK_ASYNC.
MFC r231204:
Unbreak detection of the async mode for clustered writes after r231075.
dougb [Mon, 20 Feb 2012 10:14:22 +0000 (10:14 +0000)]
MFC r231862:
Increase the default shutdown timer to 90 seconds. This accommodates
systems that take a long time to shut down without adversely
affecting things that shut down quickly. It's also 30 seconds less than
the default hard limit of 120 seconds in kern.init_shutdown_timeout.
tuexen [Sun, 19 Feb 2012 20:15:13 +0000 (20:15 +0000)]
MFC 231672:
Fix a bug where the wrong protocol overhead was used. This can lead
to a deadlock of an association when an IPv6 socket was used to
communicate with IPv4 and an ICMPv4 fragmentation needed message
was received.
While there, simplify the code a bit.
- Implement RDNSS and DNSSL options (RFC 6106, IPv6 Router Advertisement
Options for DNS Configuration).
- rtadvd(8) now supports "noifprefix" to disable gathering on-link prefixes
from interfaces when no "addr" is specified[2]. An entry in rtadvd.conf
with "noifprefix" + no "addr" generates an RA message with no prefix
information option.
- rtadvd(8) now supports RTM_IFANNOUNCE message to fix crashes when an
interface is added or removed.
- Implement burst unsolicited RA sending into the internal RA timer framework
when AdvSendAdvertisements and/or configuration entries are changed as
described in RFC 4861 6.2.4. This fixes issues that make termination of the
rtadvd(8) daemon take a very long time.
- rtadvd(8) now accepts non-existent interfaces as well in the command line.
- Add control socket support and rtadvctl(8) utility to show the RA information
in rtadvd(8). Dumping by SIGUSR1 has been removed in favor of it.
emaste [Thu, 16 Feb 2012 00:38:35 +0000 (00:38 +0000)]
MFC r231573:
Fix panic after "WARNING - ATA_IDENTIFY taskqueue timeout"
When performing a firmware upgrade via atacontrol[1] the subsequent
command may time out producing the error message above. When this
happens the callout could still be active, and the system would then
panic due to a destroyed semaphore.
Instead, ensure that the callout is done first, via callout_drain.
jilles [Wed, 15 Feb 2012 22:35:30 +0000 (22:35 +0000)]
MFC r230512: sockstat: Also show sockets not associated with a descriptor.
Sockets not associated with a file descriptor include TCP TIME_WAIT states
and sockets created via the socket(9) API such as from rpc.lockd and the NFS
client.
alc [Wed, 15 Feb 2012 18:18:29 +0000 (18:18 +0000)]
MFC r229363
Don't pass VM_ALLOC_ZERO to vm_page_grab() in tmpfs_mappedwrite() and
tmpfs_nocacheread(). It is both unnecessary and a pessimization. It
results in either the page being zeroed twice or zeroed first and then
overwritten by an I/O operation.
alc [Wed, 15 Feb 2012 17:09:26 +0000 (17:09 +0000)]
Partial merge of r218345:
Unless "cnt" exceeded MAX_COMMIT_COUNT, nfsrv_commit() and nfsvno_fsync()
were incorrectly calling vm_object_page_clean(). They were passing the
length of the range rather than the ending offset of the range.
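The length-versus-end-offset mistake can be illustrated as below. clean_range() is a hypothetical stand-in for a routine that takes a start and an end offset; it is not the real vm_object_page_clean() interface.

```python
def clean_range(start: int, end: int) -> range:
    """Stand-in for a routine that operates on the byte range [start, end)."""
    return range(start, end)


def commit_buggy(off: int, cnt: int) -> range:
    """Passes the length where the ending offset is expected."""
    return clean_range(off, cnt)


def commit_fixed(off: int, cnt: int) -> range:
    """Passes the ending offset, computed from offset and length."""
    return clean_range(off, off + cnt)
```

The two agree only when the range starts at offset 0; for any other start the buggy call covers the wrong range, or nothing at all.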
bz [Wed, 15 Feb 2012 16:58:08 +0000 (16:58 +0000)]
MFC r231505,231520:
Introduce a new NET_RT_IFLISTL API to query the address list. It works
on extended and extensible structs if_msghdrl and ifa_msghdrl. This
will allow us to extend both the msghdrl structs and eventually if_data
in the future without breaking the ABI.
The MFC is just to provide the new API to old stable branches to make
updating and if needed downgrading a lot easier for updates to 10.
Bump __FreeBSD_version to allow ports to more easily detect the new API.
mav [Wed, 15 Feb 2012 14:31:45 +0000 (14:31 +0000)]
MFC r231762:
Do not handle MOD_SHUTDOWN equally to MOD_UNLOAD in sound kernel module.
MOD_SHUTDOWN is not an end of existence, and there is a life after it.
In particular, code previously called on MOD_SHUTDOWN grabbed lock and
deallocated unit numbering. That caused infinite wait loop if snd_uaudio
tried to destroy its PCM device after that point.
Bring Xen support in stable/8 up to parity with head. Almost all
outstanding Xen support differences between head and stable/8 are included,
except for the just added r231743.
Rename HYPERVISOR_multicall (which performs the multicall hypercall) to
_HYPERVISOR_multicall, and create a new HYPERVISOR_multicall function which
invokes _HYPERVISOR_multicall and checks that the individual hypercalls all
succeeded.
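A sketch of the wrapper pattern described here, under the assumption that the low-level primitive reports one status for the batch plus one result per call. All names are hypothetical stand-ins, not the Xen hypercall API.

```python
def raw_multicall(calls):
    """Stand-in for the low-level primitive: run every call in the batch and
    return (batch_status, per_call_results)."""
    results = [fn(*args) for fn, args in calls]
    return 0, results


def checked_multicall(calls):
    """Wrapper: fail if the batch failed or if any individual call failed."""
    status, results = raw_multicall(calls)
    if status != 0:
        return status
    for result in results:
        if result != 0:
            return result  # first failing call's error code
    return 0


def ok():
    return 0


def einval():
    return -22
```

The point of the wrapper is that a batch can be submitted successfully while one of its entries still fails; checking only the batch status would miss that.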
Add options NO_ADAPTIVE_SX to the XENHVM kernel configuration, matching
its similar disabling of adaptive mutexes and rwlocks. The existing
comment on why this is the case also applies to sx locks.
Make "options XENHVM" compile for i386, not just amd64 -- a largely
mechanical change. This opens the door for using PV device drivers
under Xen HVM on i386, as well as more general harmonisation of i386
and amd64 Xen support in FreeBSD.
Monitor and emit events for XenStore changes to XenBus trees
of the devices we manage. These changes can be due to writes
we make ourselves or due to changes made by the control domain.
The goal of these changes is to ensure that all state transitions
can be detected regardless of their source and to allow common
device policies (e.g. "onlined" backend devices) to be centralized
in the XenBus bus code.
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_if.m:
Add a new method for XenBus drivers "localend_changed".
This method is invoked whenever a write is detected to
a device's XenBus tree. The default implementation of
this method is a no-op.
sys/xen/xenbus/xenbus_if.m:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkback/blkback.c:
Change the signature of the "otherend_changed" method.
This notification cannot fail, so it should return void.
sys/xen/xenbus/xenbusb_back.c:
Add "online" device handling to the XenBus Back Bus
support code. An online backend device remains active
after a front-end detaches as a reconnect is expected
to occur in the near future.
sys/xen/interface/io/xenbus.h:
Add comment block further explaining the meaning and
driver responsibilities associated with the XenBus
Closed state.
sys/xen/xenbus/xenbusb.c:
sys/xen/xenbus/xenbusb.h:
sys/xen/xenbus/xenbusb_back.c:
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusb_if.m:
o Register a XenStore watch against the local XenBus tree
for all devices.
o Cache the string length of the path to our local tree.
o Allow the xenbus front and back drivers to hook/filter both
local and otherend watch processing.
o Update the device ivar version of "state" when we detect
a XenStore update of that node.
sys/dev/xen/control/control.c:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbusb.c:
sys/xen/xenbus/xenbusb.h:
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenstore/xenstorevar.h:
Allow clients of the XenStore watch mechanism to attach
a single uintptr_t worth of client data to the watch.
This removes the need to carefully place client watch
data within enclosing objects so that a cast or offsetof
calculation can be used to convert from watch to enclosing
object.
Several enhancements to the Xen block back driver.
sys/dev/xen/blkback/blkback.c:
o Implement front-end request coalescing. This greatly improves the
performance of front-end clients that are unaware of the dynamic
request-size/number of requests negotiation available in the
FreeBSD backend driver. This required a large restructuring
in how this driver records in-flight transactions and how those
transactions are mapped into kernel KVA. For example, the driver
now includes a mini "KVA manager" that allocates ranges of
contiguous KVA to batches of requests that are physically
contiguous in the backing store so that a single bio or UIO
segment can be used to represent the I/O.
o Refuse to open any backend files or devices if the system
has yet to mount root. This avoids a panic.
o Properly handle "onlined" devices. An "onlined" backend
device stays attached to its backing store across front-end
disconnections. This feature is intended to reduce latency
when a front-end does a hand-off to another driver (e.g.
PV aware bootloader to OS kernel) or during a VM reboot.
o Harden the driver against a pathological/buggy front-end
by carefully vetting front-end XenStore data such as the
front-end state.
o Add sysctls that report the negotiated number of
segments per-request and the number of requests that
can be concurrently in flight.
Properly handle suspend/resume events in the Xen device framework.
Sponsored by: BQ Internet
sys/xen/xenbus/xenbusb.c:
o In xenbusb_resume(), publish the state transition of the
resuming device into XenbusStateInitialising so that the
remote peer can see it. Recording the state locally is
not sufficient to trigger a re-connect sequence.
o In xenbusb_resume(), defer new-bus resume processing until
after the remote peer's XenStore address has been updated.
The drivers may need to refer to this information during
resume processing.
sys/xen/xenbus/xenbusb_back.c:
sys/xen/xenbus/xenbusb_front.c:
Register xenbusb_resume() rather than bus_generic_resume()
as the handler for device_resume events.
sys/xen/xenstore/xenstore.c:
o Fix grammar in a comment.
o In xs_suspend(), pass suspend events on to the child
devices (e.g. xenbusb_front/back) that are attached
to the XenStore.
Add suspend/resume support to the Xen blkfront driver.
Sponsored by: BQ Internet
sys/dev/xen/blkfront/block.h:
sys/dev/xen/blkfront/blkfront.c:
Remove now unused blkif_vdev_t from the blkfront softc.
sys/dev/xen/blkfront/blkfront.c:
o In blkfront_suspend(), indicate the desire to suspend
by changing the softc connected state to SUSPENDED, and
then wait for any I/O pending on the remote peer to
drain. Cancel suspend processing if I/O does not
drain within 30 seconds.
o Enable and update blkfront_resume(). Since I/O is
drained prior to the suspension of the VM, the complicated
recovery process performed by other Xen blkfront
implementations is avoided. We simply tear down the
connection to our old peer, and then re-connect.
o In blkif_initialize(), fix a resource leak and botched
return if we cannot allocate shadow memory for our
requests.
o In blkfront_backend_changed(), correct our response to
the XenbusStateInitialised state. This state indicates
that our backend peer has published sufficient data for
blkfront to publish ring information and other XenStore
data, not that a connection can occur. Blkfront now
will only perform connection processing in response to
the XenbusStateConnected state. This corrects an issue
where blkfront connected before the backend was ready
during resume processing.
[ Forced commit. Actual changes accidentally included in r225704 ]
sys/dev/xen/control/control.c:
Fix locking violations in Xen HVM suspend processing
and have it perform similar actions to those performed
during an ACPI triggered suspend.
Sponsored by: BQ Internet
Approved by: re
MFC after: 1 week
Correct suspend/resume support in the Netfront driver.
Sponsored by: BQ Internet
sys/dev/xen/netfront/netfront.c:
o Implement netfront_suspend(), a specialized suspend
handler for the netfront driver. This routine simply
disables the carrier so the driver is idle during
system suspend processing.
o Fix a leak when re-initializing LRO during a link reset.
o In netif_release_tx_bufs(), when cleaning up the grant
references for our TX ring, use gnttab_end_foreign_access_ref
instead of attempting to grant the page again.
o In netif_release_tx_bufs(), we do not track mbufs associated
with mbuf chains, but instead just free each mbuf directly.
Use m_free(), not m_freem(), to avoid double frees of mbufs.
o Refactor some code to enhance clarity.
Update netfront so that it queries and honors published
back-end features.
sys/dev/xen/netfront/netfront.c:
o Add xn_query_features() which reads the XenStore and
records the TSO, LRO, and chained ring-request support
of the backend.
o Rename xn_configure_lro() to xn_configure_features() and
use this routine to manage the setup of TSO, LRO, and
checksum offload.
o In create_netdev(), initialize if_capabilities and
if_hwassist to the capabilities found on all backends.
Delegate configuration of if_capenable and the TSO flag
in if_hwassist to xn_configure_features().
Reported by: Hugo Silva (fix inspired by patch provided)
Approved by: re
MFC after: 1 week
Add event handlers for (ACPI) suspend/resume events. Suspend event handlers
are invoked right before device drivers go into sleep state and resume event
handlers are invoked right after all device drivers are woken up.
Rewrote the netback driver for xen to attach properly via newbus
and work properly in both HVM and PVM mode (only HVM is tested).
Works with the in-tree FreeBSD netfront driver or the Windows
netfront driver from SuSE. Has not been extensively tested with
a Linux netfront driver. Does not implement LRO, TSO, or
polling. Includes unit tests that may be run through sysctl
after compiling with XNB_DEBUG defined.
Fix page fault in kernel mode when calling m_print() on a
null mbuf. Since m_print() is only used for debugging, there
are no performance concerns for extra error checking code.
sys/kern/subr_scanf.c:
Add the "hh" and "ll" width specifiers from C99 to scanf().
A few callers were already using "ll" even though scanf()
was handling it as "l".
Submitted by: Alan Somers <alans@spectralogic.com>
Submitted by: John Suykerbuyk <johns@spectralogic.com>
Sponsored by: Spectra Logic
MFC after: 1 week
Reviewed by: ken
r230916 | ken | 2012-02-02 10:54:35 -0700 (Thu, 02 Feb 2012) | 13 lines
Fix the netback driver build for i386.
netback.c: Add missing VM includes.
xen/xenvar.h,
xen/xenpmap.h: Move some XENHVM macros from <machine/xen/xenpmap.h> to
<machine/xen/xenvar.h> on i386 to match the amd64 headers.
luigi [Wed, 15 Feb 2012 06:16:52 +0000 (06:16 +0000)]
use 4096 instead of PAGE_SIZE to determine the initial size
of the memory allocated for netmap. Apparently the previous
value fails with an integer overflow on stable/8-IA64
(4M pages ? curious that it does not fail on stable/9 and head)
kevlo [Wed, 15 Feb 2012 05:35:37 +0000 (05:35 +0000)]
MFC r224703:
In rtinit1(), before rtrequest1_fib() is called, info.rti_flags is
initialized by flags (function argument) or-ed with ifa->ifa_flags.
If the NIC has a loopback route to itself, IFA_RTSELF is set on the ifa(s).
As IFA_RTSELF is defined as RTF_HOST, rtrequest1_fib() is then called with
the RTF_HOST flag even if netmask is not NULL. Consequently, netmask is set
to zero in rtrequest1_fib(), and a request to add a network route is
silently turned into a request to add a host route.
Tested by: Andrew Boyer <aboyer at averesystems.com>
Submitted by: Svatopluk Kraus <onwahe at gmail dot com>
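The flag collision described above can be reproduced in a few lines. The numeric values are assumptions chosen for the sketch, and route_flags_fixed() shows just one way to avoid the leak; the log does not spell out the actual change.

```python
# Flag values assumed for illustration.
RTF_UP = 0x1
RTF_HOST = 0x4
IFA_ROUTE = RTF_UP      # interface-address flag sharing a route flag's bit
IFA_RTSELF = RTF_HOST   # per the commit text: defined in terms of RTF_HOST


def route_flags_buggy(flags: int, ifa_flags: int) -> int:
    """OR the caller's flags with all interface-address flags."""
    return flags | ifa_flags


def route_flags_fixed(flags: int, ifa_flags: int) -> int:
    """Mask out IFA_RTSELF so it cannot be mistaken for RTF_HOST."""
    return flags | (ifa_flags & ~IFA_RTSELF)


def is_host_route(rti_flags: int) -> bool:
    return bool(rti_flags & RTF_HOST)
```

A caller that explicitly asks for RTF_HOST still gets a host route in the fixed variant; only the accidental bit from the interface address is dropped.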
yongari [Wed, 15 Feb 2012 04:03:41 +0000 (04:03 +0000)]
MFC r230286,230337-230338,231159:
r230286:
Introduce a tunable that disables use of MSI.
Non-zero value will use INTx.
r230337-230338:
Rename dev.bge.%d.msi_disable to dev.bge.%d.msi which matches
enable/disable and default it to on.
r231159:
Call bge_add_sysctls() early and especially before bge_can_use_msi() so
r230337 actually has a chance of working and doesn't always unconditionally
disable the use of MSIs.
yongari [Wed, 15 Feb 2012 03:49:50 +0000 (03:49 +0000)]
MFC r230336:
Fix a logic error which resulted in putting PHY into sleep when WOL
is active. If WOL is active driver should not put PHY into sleep.
This change makes WOL work on RTL8168E.
glebius [Tue, 14 Feb 2012 17:35:44 +0000 (17:35 +0000)]
Merge netgraph related fixes and enhancements from head/. Revisions
merged: r223754,224031,226829,229003,230213,230480, 230486-230487,231585.
r223754 to ng_base:
- Use refcount(9) API to manage node and hook refcounting.
r224031 to ng_socket:
In ng_attach_cntl() first allocate things that may fail, and then
do the rest of initialization. This simplifies code and fixes
a double free in failure scenario.
r226829 to ng_base:
- If KDB & NETGRAPH_DEBUG are on, print traces on discovered failed
invariants.
- Reduce tautology in NETGRAPH_DEBUG output.
r229003 to ng_base:
style(9), whitespace and spelling nits.
r230213 to ng_socket:
Remove some disabled NOTYET code. Probability of enabling it is low,
if anyone wants, he/she can take it from svn.
r230480 to ng_base:
Convert locks that protect name hash, ID hash and typelist from
mutex(9) to rwlock(9) based locks.
While here remove dropping lock when processing NGM_LISTNODES,
and NGM_LISTTYPES generic commands. We don't need to drop it
since memory allocation is done with M_NOWAIT.
r230486 to hashinit(9):
Convert panic()s to KASSERT()s. This is an optimisation for
hashdestroy() since in absence of INVARIANTS a compiler
will drop the entire for() cycle.
r230487,r231585 to ng_socket:
Provide a findhook method for ng_socket(4). The node stores a
hash with names of its hooks. It starts with size of 16, and
grows when number of hooks reaches twice the current size. A
failure to grow (memory is allocated with M_NOWAIT) isn't
fatal, however.
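The growth policy for the hook hash might look roughly like this. It is a sketch only: the class, the doubling factor, and the can_allocate callback (which simulates an M_NOWAIT allocation that may fail) are all assumptions, not the ng_socket(4) code.

```python
class HookTable:
    """Name-to-hook hash that tries to grow once it holds 2x its bucket count."""

    def __init__(self, size=16, can_allocate=lambda n: True):
        self.buckets = [[] for _ in range(size)]
        self.count = 0
        self.can_allocate = can_allocate  # simulates a non-sleeping allocation

    def _bucket(self, name):
        return self.buckets[hash(name) % len(self.buckets)]

    def add(self, name, hook):
        self._bucket(name).append((name, hook))
        self.count += 1
        if self.count >= 2 * len(self.buckets):
            self._try_grow()

    def find(self, name):
        for key, hook in self._bucket(name):
            if key == name:
                return hook
        return None

    def _try_grow(self):
        new_size = 2 * len(self.buckets)  # doubling is an assumption
        if not self.can_allocate(new_size):
            return  # failure to grow is not fatal; chains just get longer
        old = self.buckets
        self.buckets = [[] for _ in range(new_size)]
        for bucket in old:
            for name, hook in bucket:
                self._bucket(name).append((name, hook))
```

When the allocation fails, lookups still work; they only walk longer chains until a later growth attempt succeeds.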
jimharris [Tue, 14 Feb 2012 15:56:01 +0000 (15:56 +0000)]
MFC r230843, r231134, r231136, r231137, r231296
Add isci(4) driver for amd64 and i386 targets.
The isci driver is for the integrated SAS controller in the Intel C600
(Patsburg) chipset. Source files in sys/dev/isci directory are
FreeBSD-specific, and sys/dev/isci/scil subdirectory contains
an OS-agnostic library (SCIL) published by Intel to control the SAS
controller. This library is used primarily as-is in this driver, with
some post-processing to better integrate into the kernel build
environment.
isci.4 and a README in the sys/dev/isci directory contain a few
additional details.
This driver is only built for amd64 and i386 targets.
ken [Tue, 14 Feb 2012 14:18:28 +0000 (14:18 +0000)]
MFC 231240
Bring in a number of mps(4) driver fixes from LSI:
1. Fixed timeout specification for the msleep in mps_wait_command().
Added 30 second timeout for mps_wait_command() calls in mps_user.c.
2. Make sure we call mps_detach_user() from the kldunload path.
3. Raid Hotplug behavior change.
The driver now removes a volume when it goes to a failed state,
so we also need to add volume back to the OS when it goes to
optimal/degraded/online from failed/missing.
Handle raid volume add and remove from the IR_Volume event.
4. Added some more debugging information.
5. Replace xpt_async(AC_LOST_DEVICE, path, NULL) with
mpssas_rescan_target().
This is to work around a panic in CAM that shows up when adding a
drive with a rescan and removing another device from the driver thread
with an AC_LOST_DEVICE async notification.
This problem was encountered in testing with the LSI sas2ircu utility,
which was used to create a RAID volume from physical disks. The driver
has to create the RAID volume target and remove the physical disk
targets, and triggered a panic in the process.
The CAM issue needs to be fully diagnosed and fixed, but this works
around the issue for now.
6. Fix some memory initialization issues in mps_free_command().
7. Resolve the "devq freeze forever" issue. This was caused by the
internal read capacity command issued in the non-head version of the
driver. When the command completed with an error, the driver wasn't
unfreezing the device queue.
The version in head uses the CAM infrastructure for getting the read
capacity information, and therefore doesn't have the same issue.
8. Bump the version to 13.00.00.00-fbsd. (this is very close to LSI's
internal stable driver 13.00.00.00)