emaste [Thu, 16 Feb 2012 00:46:11 +0000 (00:46 +0000)]
MFC r231573:
Fix panic after "WARNING - ATA_IDENTIFY taskqueue timeout"
When performing a firmware upgrade via atacontrol[1] the subsequent
command may time out producing the error message above. When this
happens the callout could still be active, and the system would then
panic due to a destroyed semaphore.
Instead, ensure that the callout is done first, via callout_drain.
jilles [Wed, 15 Feb 2012 21:03:26 +0000 (21:03 +0000)]
MFC r230512: sockstat: Also show sockets not associated with a descriptor.
Sockets not associated with a file descriptor include TCP TIME_WAIT states
and sockets created via the socket(9) API such as from rpc.lockd and the NFS
client.
alc [Wed, 15 Feb 2012 18:15:26 +0000 (18:15 +0000)]
MFC r229363
Don't pass VM_ALLOC_ZERO to vm_page_grab() in tmpfs_mappedwrite() and
tmpfs_nocacheread(). It is both unnecessary and a pessimization. It
results in either the page being zeroed twice or zeroed first and then
overwritten by an I/O operation.
ken [Wed, 15 Feb 2012 17:28:09 +0000 (17:28 +0000)]
MFC r229997, r230033, and r230334
Bring the CAM Target Layer into stable/9.
r230334 | ken | 2012-01-19 11:42:03 -0700 (Thu, 19 Jan 2012) | 19 lines
Quiet some clang warnings when compiling CTL.
ctl_error.c,
ctl_error.h: Take out the ctl_sense_format enumeration, and use
scsi_sense_data_type instead.
Remove ctl_get_sense_format() and switch ctl_build_ua()
over to using scsi_sense_data_type.
ctl_backend_ramdisk.c,
ctl_backend_block.c:
Use C99 structure initializers instead of GNU initializers.
ctl.c: Switch over to using the SCSI sense format enumeration
instead of the CTL-specific enumeration.
Submitted by: dim (partially)
MFC after: 1 month
r230033 | ken | 2012-01-12 15:08:33 -0700 (Thu, 12 Jan 2012) | 5 lines
Silence some unnecessary verbosity.
Reported by: mav
MFC after: 1 month
r229997 | ken | 2012-01-11 17:34:33 -0700 (Wed, 11 Jan 2012) | 170 lines
Add the CAM Target Layer (CTL).
CTL is a disk and processor device emulation subsystem originally written
for Copan Systems under Linux starting in 2003. It has been shipping in
Copan (now SGI) products since 2005.
It was ported to FreeBSD in 2008, and thanks to an agreement between SGI
(who acquired Copan's assets in 2010) and Spectra Logic in 2010, CTL is
available under a BSD-style license. The intent behind the agreement was
that Spectra would work to get CTL into the FreeBSD tree.
Some CTL features:
- Disk and processor device emulation.
- Tagged queueing
- SCSI task attribute support (ordered, head of queue, simple tags)
- SCSI implicit command ordering support. (e.g. if a read follows a mode
select, the read will be blocked until the mode select completes.)
- Full task management support (abort, LUN reset, target reset, etc.)
- Support for multiple ports
- Support for multiple simultaneous initiators
- Support for multiple simultaneous backing stores
- Persistent reservation support
- Mode sense/select support
- Error injection support
- High Availability support (1)
- All I/O handled in-kernel, no userland context switch overhead.
(1) HA Support is just an API stub, and needs much more to be fully
functional.
ctl.c: The core of CTL. Command handlers and processing,
character driver, and HA support are here.
ctl.h: Basic function declarations and data structures.
ctl_backend.c,
ctl_backend.h: The basic CTL backend API.
ctl_backend_block.c,
ctl_backend_block.h: The block and file backend. This allows for using
a disk or a file as the backing store for a LUN.
Multiple threads are started to do I/O to the
backing device, primarily because the VFS API
requires that to get any concurrency.
ctl_backend_ramdisk.c: A "fake" ramdisk backend. It only allocates a
small amount of memory to act as a source and sink
for reads and writes from an initiator. Therefore
it cannot be used for any real data, but it can be
used to test for throughput. It can also be used
to test initiators' support for extremely large LUNs.
ctl_cmd_table.c: This is a table with all 256 possible SCSI opcodes,
and command handler functions defined for supported
opcodes.
ctl_debug.h: Debugging support.
ctl_error.c,
ctl_error.h: CTL-specific wrappers around the CAM sense building
functions.
ctl_frontend.c,
ctl_frontend.h: These files define the basic CTL frontend port API.
ctl_frontend_cam_sim.c: This is a CTL frontend port that is also a CAM SIM.
This frontend allows for using CTL without any
target-capable hardware. So any LUNs you create in
CTL are visible in CAM via this port.
ctl_frontend_internal.c,
ctl_frontend_internal.h:
This is a frontend port written for Copan to do
some system-specific tasks that required sending
commands into CTL from inside the kernel. This
isn't entirely relevant to FreeBSD in general,
but can perhaps be repurposed.
ctl_ha.h: This is a stubbed-out High Availability API. Much
more is needed for full HA support. See the
comments in the header and the description of what
is needed in the README.ctl.txt file for more
details.
ctl_io.h: This defines most of the core CTL I/O structures.
union ctl_io is conceptually very similar to CAM's
union ccb.
ctl_ioctl.h: This defines all ioctls available through the CTL
character device, and the data structures needed
for those ioctls.
ctl_mem_pool.c,
ctl_mem_pool.h: Generic memory pool implementation used by the
internal frontend.
ctl_private.h: Private data structres (e.g. CTL softc) and
function prototypes. This also includes the SCSI
vendor and product names used by CTL.
ctl_scsi_all.c,
ctl_scsi_all.h: CTL wrappers around CAM sense printing functions.
ctl_ser_table.c: Command serialization table. This defines what
happens when one type of command is followed by
another type of command.
ctl_util.c,
ctl_util.h: CTL utility functions, primarily designed to be
used from userland. See ctladm for the primary
consumer of these functions. These include CDB
building functions.
scsi_ctl.c: CAM target peripheral driver and CTL frontend port.
This is the path into CTL for commands from
target-capable hardware/SIMs.
ctladm/Makefile,
ctladm/ctladm.8,
ctladm/ctladm.c,
ctladm/ctladm.h,
ctladm/util.c: ctladm(8) is the CTL management utility.
It fills a role similar to camcontrol(8).
It allow configuring LUNs, issuing commands,
injecting errors and various other control
functions.
usr.bin/Makefile: Add ctlstat.
ctlstat/Makefile
ctlstat/ctlstat.8,
ctlstat/ctlstat.c: ctlstat(8) fills a role similar to iostat(8).
It reports I/O statistics for CTL.
sys/conf/files: Add CTL files.
sys/conf/NOTES: Add device ctl.
sys/cam/scsi_all.h: To conform to more recent specs, the inquiry CDB
length field is now 2 bytes long.
Add several mode page definitions for CTL.
sys/cam/scsi_all.c: Handle the new 2 byte inquiry length.
bz [Wed, 15 Feb 2012 16:56:52 +0000 (16:56 +0000)]
MFC r231505,231520:
Introduce a new NET_RT_IFLISTL API to query the address list. It works
on extended and extensible structs if_msghdrl and ifa_msghdrl. This
will allow us to extend both the msghdrl structs and eventually if_data
in the future without breaking the ABI.
The MFC is just to provide the new API to old stable branches to make
updating and if needed downgrading a lot easier for updates to 10.
Bump __FreeBSD_version to allow ports to more easily detect the new API.
mav [Wed, 15 Feb 2012 14:30:04 +0000 (14:30 +0000)]
MFC r231647:
Do not handle MOD_SHUTDOWN equally to MOD_UNLOAD in sound kernel module.
MOD_SHUTDOWN is not an end of existence, and there is a life after it.
In particular, code previously called on MOD_SHUTDOWN grabbed lock and
deallocated unit numbering. That caused infinite wait loop if snd_uaudio
tried to destroy its PCM device after that point.
yongari [Wed, 15 Feb 2012 04:02:41 +0000 (04:02 +0000)]
MFC r230286,230337-230338,231159:
r230286:
Introduce a tunable that disables use of MSI.
Non-zero value will use INTx.
r230337-230338:
Rename dev.bge.%d.msi_disable to dev.bge.%d.msi which matches
enable/disable and default it to on.
r231159:
Call bge_add_sysctls() early and especially before bge_can_use_msi() so
r230337 actually has a chance of working and doesn't always unconditionally
disable the use of MSIs.
yongari [Wed, 15 Feb 2012 03:48:22 +0000 (03:48 +0000)]
MFC r230336:
Fix a logic error which resulted in putting PHY into sleep when WOL
is active. If WOL is active driver should not put PHY into sleep.
This change makes WOL work on RTL8168E.
Rewrote the netback driver for xen to attach properly via newbus
and work properly in both HVM and PVM mode (only HVM is tested).
Works with the in-tree FreeBSD netfront driver or the Windows
netfront driver from SuSE. Has not been extensively tested with
a Linux netfront driver. Does not implement LRO, TSO, or
polling. Includes unit tests that may be run through sysctl
after compiling with XNB_DEBUG defined.
Fix page fault in kernel mode when calling m_print() on a
null mbuf. Since m_print() is only used for debugging, there
are no performance concerns for extra error checking code.
sys/kern/subr_scanf.c:
Add the "hh" and "ll" width specifiers from C99 to scanf().
A few callers were already using "ll" even though scanf()
was handling it as "l".
Submitted by: Alan Somers <alans@spectralogic.com>
Submitted by: John Suykerbuyk <johns@spectralogic.com>
Sponsored by: Spectra Logic
Reviewed by: ken
r230916 | ken | 2012-02-02 10:54:35 -0700 (Thu, 02 Feb 2012) | 13 lines
Fix the netback driver build for i386.
netback.c: Add missing VM includes.
xen/xenvar.h,
xen/xenpmap.h: Move some XENHVM macros from <machine/xen/xenpmap.h> to
<machine/xen/xenvar.h> on i386 to match the amd64 headers.
jimharris [Tue, 14 Feb 2012 15:58:49 +0000 (15:58 +0000)]
MFC r230843, r231134, r231136, r231137, r231296
Add isci(4) driver for amd64 and i386 targets.
The isci driver is for the integrated SAS controller in the Intel C600
(Patsburg) chipset. Source files in sys/dev/isci directory are
FreeBSD-specific, and sys/dev/isci/scil subdirectory contains
an OS-agnostic library (SCIL) published by Intel to control the SAS
controller. This library is used primarily as-is in this driver, with
some post-processing to better integrate into the kernel build
environment.
isci.4 and a README in the sys/dev/isci directory contain a few
additional details.
This driver is only built for amd64 and i386 targets.
ken [Tue, 14 Feb 2012 14:17:46 +0000 (14:17 +0000)]
MFC 231240
Bring in a number of mps(4) driver fixes from LSI:
1. Fixed timeout specification for the msleep in mps_wait_command().
Added 30 second timeout for mps_wait_command() calls in mps_user.c.
2. Make sure we call mps_detach_user() from the kldunload path.
3. Raid Hotplug behavior change.
The driver now removes a volume when it goes to a failed state,
so we also need to add volume back to the OS when it goes to
opitimal/degraded/online from failed/missing.
Handle raid volume add and remove from the IR_Volume event.
4. Added some more debugging information.
5. Replace xpt_async(AC_LOST_DEVICE, path, NULL) with
mpssas_rescan_target().
This is to work around a panic in CAM that shows up when adding a
drive with a rescan and removing another device from the driver thread
with an AC_LOST_DEVICE async notification.
This problem was encountered in testing with the LSI sas2ircu utility,
which was used to create a RAID volume from physical disks. The driver
has to create the RAID volume target and remove the physical disk
targets, and triggered a panic in the process.
The CAM issue needs to be fully diagnosed and fixed, but this works
around the issue for now.
6. Fix some memory initialization issues in mps_free_command().
7. Resolve the "devq freeze forever" issue. This was caused by the
internal read capacity command issued in the non-head version of the
driver. When the command completed with an error, the driver wasn't
unfreezing thd device queue.
The version in head uses the CAM infrastructure for getting the read
capacity information, and therefore doesn't have the same issue.
8. Bump the version to 13.00.00.00-fbsd. (this is very close to LSI's
internal stable driver 13.00.00.00)
dougb [Tue, 14 Feb 2012 10:16:56 +0000 (10:16 +0000)]
MFC r230099:
Change rcvar= assignments to the literal values set_rcvar
would have returned. This will slightly reduce boot time,
and help in diff reduction to HEAD.
luigi [Tue, 14 Feb 2012 09:42:02 +0000 (09:42 +0000)]
MFC: import netmap core files into RELENG_9.
This is the same code as in HEAD.
Device driver modifications will be imported separately
because the base drivers differ and patches might be
slightly different between the various releases.
The code is disconnected from the main build targets
unless you explicitly put a 'device netmap' in your
kernel config file.
rmacklem [Tue, 14 Feb 2012 04:48:36 +0000 (04:48 +0000)]
MFC: r230803
When a "mount -u" switches an NFS mount point from TCP to UDP,
any thread doing an I/O RPC with a transfer size greater than
NFS_UDPMAXDATA will be hung indefinitely, retrying the RPC.
After a discussion on freebsd-fs@, I decided to add a warning
message for this case, as suggested by Jeremy Chadwick.
rmacklem [Tue, 14 Feb 2012 04:07:35 +0000 (04:07 +0000)]
MFC: r230801
jwd@ reported a problem via email to freebsd-fs@ on Aug 25, 2011
under the subject "F_RDLCK lock to FreeBSD NFS fails to R/O target file".
This occurred because the server side NLM always checked for VWRITE
access, irrespective of the type of lock request. This patch
replaces VOP_ACCESS(..VWRITE..) with one appropriate to
the lock operation. It allows unlock and lock cancellation
to be done without a check of VOP_ACCESS(), so that files
can't be left locked indefinitely after the file permissions
have been changed.
marius [Tue, 14 Feb 2012 01:05:37 +0000 (01:05 +0000)]
Forced commit to denote that the commit message of r231623 actually
should have read:
MFC: r231518
Flesh out support for SAS1078 and SAS1078DE (which are said to actually
be the same chip):
- The I/O port resource may not be available with these. However, given
that we actually only need this resource for some controllers that
require their firmware to be up- and downloaded (which excludes the
SAS1078{,DE}) just handle failure to allocate this resource gracefully
when possible. While at it, generally put non-fatal resource allocation
failures under bootverbose.
- SAS1078{,DE} use a different hard reset protocol.
- Add workarounds for the 36GB physical address limitation of scatter/
gather elements of these controllers.
dim [Mon, 13 Feb 2012 20:59:20 +0000 (20:59 +0000)]
MFC r231079:
Let rpcgen(1) support an environment variable RPCGEN_CPP to find the C
preprocessor to run. Previously, it always ran /usr/bin/cpp, unless you
used the -Y option, and even then you could not set the basename. It
also attempted to run /usr/ccs/lib/cpp for SVR4 compatibility, but this
is obsolete, and has been removed.
Note that setting RPCGEN_CPP to a command with arguments is supported,
though the command line parsing is simplistic. However, setting it to
e.g. "gcc46 -E" or "clang -E" will lead to problems, because both gcc
and clang in -E mode will consider files with unknown extensions (such
as .x) as object files, and attempt to link them.
This could be worked around by also adding "-x c", but it is much safer
to set RPCGEN_CPP to e.g. "cpp46" or "clang-cpp" instead.
MFC r231080:
Amend r231079 by properly shifting up the existing arguments in
rpc_main.c's insarg() function. I had forgotten to put this in my patch
queue, sorry.
Pointy hat to: me
MFC r231101:
In usr.bin/rpcgen/rpc_main.c, use execvp(3) instead of execv(3), so
rpcgen will search the current PATH for the preprocessor. This makes it
possible to run a preprocessor built during the cross-tools stage of
buildworld.
jhb [Mon, 13 Feb 2012 19:51:59 +0000 (19:51 +0000)]
MFC 230340:
Properly return success once a matching VPD entry is found in
pci_get_vpd_readonly_method(). Previously the loop was always running
to completion and falling through to failing with ENXIO.
glebius [Mon, 13 Feb 2012 15:21:12 +0000 (15:21 +0000)]
Merge from head 226829, 230213, 230480, 230486, 230487, 231585:
r226829 in ng_base:
- If KDB & NETGRAPH_DEBUG are on, print traces on discovered failed
invariants.
- Reduce tautology in NETGRAPH_DEBUG output.
r230213 in ng_socket:
Remove some disabled NOTYET code. Probability of enabling it is low,
if anyone wants, he/she can take it from svn.
r230480 in ng_base:
Convert locks that protect name hash, ID hash and typelist from
mutex(9) to rwlock(9) based locks.
While here remove dropping lock when processing NGM_LISTNODES,
and NGM_LISTTYPES generic commands. We don't need to drop it
since memory allocation is done with M_NOWAIT.
r230486 in subr_hash.c:
Convert panic()s to KASSERT()s. This is an optimisation for
hashdestroy() since in absence of INVARIANTS a compiler
will drop the entire for() cycle.
230487, 231585 in ng_socket:
Provide a findhook method for ng_socket(4). The node stores a
hash with names of its hooks. It starts with size of 16, and
grows when number of hooks reaches twice the current size. A
failure to grow (memory is allocated with M_NOWAIT) isn't
fatal, however.
I used standard hash(9) function for the hash. With 25000
hooks named in the mpd (ports/net/mpd5) manner of "b%u", the
distributions is the following: 72.1% entries consist of one
element, 22.1% consist of two, 5.2% consist of three and
0.6% of four.
Speedup in a synthetic test that creates 25000 hooks and then
runs through a long cyclce dereferencing them in a random order
is over 25 times.
The last merge was done in an ABI preserving manner, the struct
ngsock is still exposed to userland (unlike in head), but its new
fields are at its end and under #ifdef _KERNEL.
tijl [Mon, 13 Feb 2012 10:24:49 +0000 (10:24 +0000)]
MFC r229794:
- Fix how hexdump parses escape strings
From the NetBSD bug:
The way how hexdump(1) parses escape sequences has some bugs.
It shows up when an escape sequence is used as the non-last character
of a format string.
MFC r230649:
Fix decoding of escape sequences in format strings:
- Zero-terminate the resulting string by letting the for-loop copy the
terminating zero.
- Exit the for-loop after handling a backslash at the end of the format
string to fix a buffer overrun.
- Remove some unnecessary comments and blank lines.
truckman [Mon, 13 Feb 2012 07:30:42 +0000 (07:30 +0000)]
MFC r231102:
Improve sparse file handling when printing the block list for an inode by
not bailing out early when a hole is encountered in the direct block list.
Print NULL block pointers in the direct block list. Simplify the
code that prints the fragment count.
brooks [Sun, 12 Feb 2012 23:07:45 +0000 (23:07 +0000)]
MFC 231196:
eui64_aton and eui64_ntoa are actually the equivalent of ether_aton_r and
ether_nota_r and do not use static variables so remove the note copied
from ethers.3 saying they do.
trociny [Sun, 12 Feb 2012 07:57:58 +0000 (07:57 +0000)]
MFC r231015, r231016:
r231015:
Fix the regression introduced in r226859: if the local component is
out of date BIO_READ requests got lost instead of being sent to the
remote component.
Reviewed by: pjd
r231016:
If a local write request is from the synchronization thread, when it
is synchronizing data that is out of date on the local component, we
should not send G_GATE_CMD_DONE acknowledge to the kernel.
This fixes the issue, observed in async mode, when on synchronization
from the remote component the worker terminated with "G_GATE_CMD_DONE
failed" error.
trociny [Sun, 12 Feb 2012 07:55:33 +0000 (07:55 +0000)]
MFC r230874:
Try to avoid ambiguity when sysctl returns ENOMEM additionally
checking the returned oldlen: when ENOMEM is due to the supplied
buffer being too short the return oldlen is equal to buffer size.
Without this additional check sockstat gets stuck in loop leaking the
memory if the returned ENOMEM was due the exceeded memorylocked
limit. This is easily can be observed running `limits -l 1k sockstat'.
trociny [Sun, 12 Feb 2012 07:52:14 +0000 (07:52 +0000)]
MFC r230873:
Try to avoid ambiguity when sysctl returns ENOMEM additionally
checking the returned oldlen: when ENOMEM is due to the supplied
buffer being too short the return oldlen is equal to buffer size.
Without this additional check kvm_getprocs() gets stuck in loop if the
returned ENOMEM was due the exceeded memorylocked limit. This is
easily can be observed running `limits -l 1k top'.
rmacklem [Sun, 12 Feb 2012 06:01:49 +0000 (06:01 +0000)]
MFC: r231133
r228827 fixed a problem where copying of NFSv4 open credentials into
a credential structure would corrupt it. This happened when the
p argument was != NULL. However, I now realize that the copying of
open credentials should only happen for p == NULL, since that indicates
that it is a read-ahead or write-behind. This patch fixes this.
After this commit, r228827 could be reverted, but I think the code is
clearer and safer with the patch, so I am going to leave it in.
Without this patch, it was possible that a NFSv4 VOP_SETATTR() could have
changed the credentials of the caller. This would have happened if
the process doing the VOP_SETATTR() did not have the file open, but
some other process running as a different uid had the file open for writing
at the same time.
brooks [Fri, 10 Feb 2012 15:54:17 +0000 (15:54 +0000)]
MFC r230403.
When creating the jails /dev/log symlink, do it by full path to avoid
creating stray "log" symlinks if the mount fails. That apparently
happens in some ezjail configs.
PR: conf/143084
Submitted by: Dirk Engling <erdgeist at erdgeist.org>
ae [Fri, 10 Feb 2012 06:34:21 +0000 (06:34 +0000)]
MFC r228061:
The size of APM could be bigger than number of already allocated entries.
And the first usable sector should not start from the inside of APM area.
MFC r228076:
Add an ability to increase number of allocated APM entries when we
have reserved free space in the APM area.
Also instead of one write request per each APM entry, use MAXPHYS
sized writes when we are updating APM.
rmacklem [Fri, 10 Feb 2012 03:32:29 +0000 (03:32 +0000)]
MFC: r230605
A problem with respect to data read through the buffer cache for both
NFS clients was reported to freebsd-fs@ under the subject "NFS
corruption in recent HEAD" on Nov. 26, 2011. This problem occurred when
a TCP mounted root fs was changed to using UDP. I believe that this
problem was caused by the change in mnt_stat.f_iosize that occurred
because rsize was decreased to the maximum supported by UDP. This
patch fixes the problem by using v_bufobj.bo_bsize instead of f_iosize,
since the latter is set to f_iosize when the vnode is allocated, but
does not change for a given vnode when f_iosize changes.