mav [Fri, 22 Feb 2013 09:47:21 +0000 (09:47 +0000)]
MFC r242852, r243069:
Several optimizations to sched_idletd():
- Do not try to steal load from other CPUs if there were no context switches
on this CPU (i.e. it was idle all the time and woke up just for bus mastering
or TLB shootdown). If the current CPU was idle, then it is quite unlikely that
some other CPU has load to steal. Under high I/O rate, when TLB shootdowns
cause numerous CPU wakeups, on a 24-CPU system the load stealing code may
consume up to 25% of all CPU time without giving any benefit.
- Change code that implements spinning for load to restart spin in case of
context switch. Previous code periodically called cpu_idle() even under
high interrupt/context switch rate.
- Raise the spinning threshold to 10KHz, where it gives at least some effect
that may be worth the consumed power.
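The first two items can be sketched in userland roughly as follows. This is
an illustration only, not the sched_idletd() code: idle_should_steal(),
spin_until_idle() and the recorded "trace" of switch counter samples are
hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Illustration only; these are not the names used in sched_idletd().
 * 'switchcnt' stands for a per-CPU context switch counter.
 */

/* Steal only if this CPU switched threads since the last idle pass. */
bool
idle_should_steal(unsigned switchcnt, unsigned last_switchcnt)
{
	return (switchcnt != last_switchcnt);
}

/*
 * Replay a trace of switch counter samples, one per spin iteration.
 * Each observed context switch restarts the spin.  Returns the index
 * at which the loop gives up and would call cpu_idle(), or n if it is
 * still spinning when the trace ends.
 */
size_t
spin_until_idle(const unsigned *trace, size_t n, unsigned limit)
{
	unsigned quiet = 0;
	size_t k;

	assert(limit > 0);
	for (k = 1; k < n; k++) {
		if (trace[k] != trace[k - 1])
			quiet = 0;		/* switch seen: restart spin */
		else if (++quiet >= limit)
			return (k);		/* quiet long enough: go idle */
	}
	return (n);
}
```

With a limit of 3, a trace with no switches gives up after three quiet
samples, while a switch in the middle of the trace pushes that point later.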
mav [Thu, 21 Feb 2013 19:02:29 +0000 (19:02 +0000)]
MFC r244014 (by ken):
Fix a device departure bug for the pass(4), enc(4), sg(4) and ch(4)
drivers.
The bug occurs when a userland process has the driver instance
open and the underlying device goes away. We get the devfs
callback that the device node has been destroyed, but not all of
the closes necessary to fully decrement the reference count on the
CAM peripheral.
The reason is that once devfs calls back and says the device has
been destroyed, it is moved off to deadfs, and devfs guarantees
that there will be no more open or close calls. So the solution
is to keep track of how many outstanding open calls there are on
the device, and just release that many references when we get the
callback from devfs.
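A minimal userland model of that bookkeeping is shown below. The struct and
function names (periph, softc, drv_open(), drv_close(), drv_devgone()) are
stand-ins invented for the sketch; the real drivers use the CAM peripheral
reference API and hold the appropriate locks.

```c
#include <assert.h>

/* Stand-ins for a CAM peripheral and a driver softc (hypothetical). */
struct periph {
	int refcount;
};

struct softc {
	struct periph *periph;
	int open_count;		/* opens not yet matched by a close */
};

void
periph_acquire(struct periph *p)
{
	p->refcount++;
}

void
periph_release(struct periph *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

void
drv_open(struct softc *sc)
{
	periph_acquire(sc->periph);
	sc->open_count++;
}

void
drv_close(struct softc *sc)
{
	sc->open_count--;
	periph_release(sc->periph);
}

/*
 * The device node is gone and no further close calls will arrive, so
 * drop one reference for every open that is still outstanding.
 */
void
drv_devgone(struct softc *sc)
{
	while (sc->open_count > 0) {
		sc->open_count--;
		periph_release(sc->periph);
	}
}
```

After drv_devgone() the only reference left is whatever the driver itself
held before the first open.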
scsi_pass.c,
scsi_enc.c,
scsi_enc_internal.h: Add an open count to the softc in these
drivers. Increment it on open and
decrement it on close.
When we get a devfs callback to say that
the device node has gone away, decrement
the peripheral reference count by the
number of still outstanding opens.
Make sure we don't access the peripheral
with cam_periph_unlock() after what might
be the final call to
cam_periph_release_locked(). The
peripheral might have been freed, and we
will be dereferencing freed memory.
scsi_ch.c,
scsi_sg.c: For the ch(4) and sg(4) drivers, add the
same changes described above, and in
addition, fix another bug that was
previously fixed in the pass(4) and enc(4)
drivers.
These drivers were calling destroy_dev()
from their cleanup routine, but that could
cause a deadlock because the cleanup
routine could be indirectly called from
the driver's close routine. This would
cause a deadlock, because the device node
is being held open by the active close
call, and can't be destroyed.
mav [Thu, 21 Feb 2013 18:56:09 +0000 (18:56 +0000)]
MFC r237328 (by ken) for recently merged scsi_enc.c:
Fix several reference counting and object lifetime issues between
the pass(4) and enc(4) drivers and devfs.
The pass(4) driver uses the destroy_dev_sched() routine to
schedule its device node for destruction in a separate thread
context. It does this because the passcleanup() routine can get
called indirectly from the passclose() routine, and that would
cause a deadlock if the close routine tried to destroy its own
device node.
In any case, once a particular passthrough driver number, e.g.
pass3, is destroyed, CAM considers that unit number (3 in this
case) available for reuse.
The problem is that devfs may not be done cleaning up the previous
instance of pass3, and will panic if it isn't done cleaning up the
previous instance.
The solution is to get a callback from devfs when the device node
is removed, and make sure we hold a reference to the peripheral
until that happens.
Testing exposed some other cases where we have reference counting
issues, and those were also fixed in the pass(4) driver.
cam_periph.c: In camperiphfree(), reorder some of the operations.
The peripheral destructor needs to be called before
the peripheral is removed from the list. This is
because once we
remove the peripheral from the list, and drop the
topology lock, the peripheral number may be reused.
But if the destructor hasn't been called yet, there
may still be resources hanging around (like devfs
nodes) that haven't been fully cleaned up.
cam_xpt.c: Add an argument to xpt_remove_periph() to indicate
whether the topology lock is already held.
scsi_enc.c: Acquire an extra reference to the peripheral during
registration, and release it once we get a callback
from devfs indicating that the device node is gone.
Call destroy_dev_sched_cb() in enc_oninvalidate()
instead of calling destroy_dev() in the cleanup
routine.
scsi_pass.c: Add reference counting to handle peripheral and
devfs object lifetime issues.
Add a reference to the peripheral and the devfs
node in the peripheral registration.
Don't attempt to add a physical path alias if the
peripheral has been marked invalid.
Release the devfs reference once the initial
physical path alias taskqueue run has completed.
Schedule devfs node destruction in the
passoninvalidate(), and release our peripheral
reference in a new routine, passdevgonecb() once
the devfs node is gone. This allows the peripheral
to fully go away, and the peripheral destructor,
passcleanup(), will get called.
mav [Thu, 21 Feb 2013 18:49:05 +0000 (18:49 +0000)]
MFC r236138 (by ken) for recently merged scsi_enc.c:
Work around a race condition in devfs by changing the way closes
are handled in most CAM peripheral drivers that are not handled by
GEOM's disk class.
The usual character driver open and close semantics are that the
driver gets N open calls, but only one close, when the last caller
closes the device.
CAM peripheral drivers expect that behavior to be honored to the
letter, and the CAM peripheral driver code (specifically
cam_periph_release_locked_busses()) panics if it is done incorrectly.
Since devfs has to drop its locks while it calls a driver's close
routine, and it does not have a way to delay or prevent open calls
while it is calling the close routine, there is a race.
The sequence of events, simplified a bit, is:
- devfs acquires a lock
- devfs checks the reference count, and if it is 1, continues to close.
- devfs releases the lock
- 2nd process open call on the device happens here
- devfs calls the driver's close routine
- devfs acquires a lock
- devfs decrements the reference count
- devfs releases the lock
- 2nd process close call on the device happens here
At the second close, we get a panic in
cam_periph_release_locked_busses(), complaining that peripheral
has been released when the reference count is already 0. This is
because we have gotten two closes in a row, which should not
happen.
The fix is to add the D_TRACKCLOSE flag to the driver's cdevsw, so
that we get a close() call for each open(). That does happen
reliably, so we can make sure that our reference counts are
correct.
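The difference between the two schemes can be modelled with a few lines of
userland C. This is a simplified sketch under assumptions: struct drv and
the old_*/track_* helpers are hypothetical, and the "old" model (one
reference held for the open state, dropped on every close call) is only an
approximation of what the drivers did before this change.

```c
#include <assert.h>
#include <stdbool.h>

struct drv {
	int refs;		/* references held on the peripheral */
	bool is_open;		/* used by the old scheme only */
	bool underflow;		/* a release at zero, i.e. the panic */
};

/* Old scheme: one reference for the "open" state. */
void
old_open(struct drv *d)
{
	if (!d->is_open) {
		d->is_open = true;
		d->refs++;
	}
}

void
old_close(struct drv *d)
{
	if (d->refs == 0)
		d->underflow = true;
	else
		d->refs--;
	d->is_open = false;
}

/* D_TRACKCLOSE scheme: one reference per open, one release per close. */
void
track_open(struct drv *d)
{
	d->refs++;
}

void
track_close(struct drv *d)
{
	if (d->refs == 0)
		d->underflow = true;
	else
		d->refs--;
}
```

Feeding both models the order the driver sees in the race (open, open,
close, close) underflows the old scheme and balances the new one.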
Note that the sa(4) and pt(4) drivers only allow one context
through the open routine. So these drivers aren't exposed to the
same race condition.
scsi_ch.c,
scsi_enc.c,
scsi_enc_internal.h,
scsi_pass.c,
scsi_sg.c:
For these drivers, change the open() routine to
increment the reference count for every open, and
just decrement the reference count in the close.
Call cam_periph_release_locked() in some scenarios
to avoid additional lock and unlock calls.
scsi_pt.c: Call cam_periph_release_locked() in some scenarios
to avoid additional lock and unlock calls.
mav [Thu, 21 Feb 2013 16:59:28 +0000 (16:59 +0000)]
MFC r238379, r238382 (by bruefer):
Renamed the kern.cam.da.da_send_ordered sysctl and tunable to
kern.cam.da.send_ordered, more in line with the other da sysctls/tunables.
Renamed the kern.cam.ada.ada_send_ordered sysctl and tunable to
kern.cam.ada.send_ordered, more in line with the other da sysctls/tunables.
hselasky [Thu, 21 Feb 2013 07:48:07 +0000 (07:48 +0000)]
MFC r246616 and r246759:
- Move scratch data from the USB bus structure to the USB device
structure so that simultaneous access cannot happen. Protect scratch
area using the enumeration lock.
- Reduce stack usage in usbd_transfer_setup() by moving some big stack
members to the scratch area. This saves around 200 bytes of stack.
- Fix a whitespace.
- Protect control requests using the USB device enumeration lock.
- Make sure all callers of usbd_enum_lock() check the return value.
- Remove the control transfer specific lock.
- Bump the FreeBSD version number, since external USB modules may need
to be recompiled due to a USB device structure change.
Use EXT2_LINK_MAX instead of LINK_MAX.
Use nitems().
Correct off-by-one errors in FFTODT() and DDTOFT().
Remove useless rootino local variable.
Remove unused em_e2fsb definition.
Move assignment where it is not dead.
ed [Tue, 19 Feb 2013 17:57:17 +0000 (17:57 +0000)]
MFC r232977 and r233945:
Make init(8) slightly more robust when /dev/console is missing.
If the environment doesn't offer a working /dev/console, the existing
version of init(8) will simply refuse running rc(8) scripts. This means
you'll only have a system running init(8) and nothing else.
Change the code to do the following:
- Open /dev/console like we used to do, but make it more robust by using
O_NONBLOCK to prevent blocking on a carrier.
- If this fails, use /dev/null as stdin and /var/log/init.log as stdout
and stderr.
- If even this fails, use /dev/null as stdin, stdout and stderr.
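The three steps above can be sketched as follows. This is not the init(8)
source: choose_stdio() and enum stdio_mode are hypothetical, the paths are
passed in as parameters so the function can be tried from an ordinary
process, and the final dup2() onto descriptors 0, 1 and 2 is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

enum stdio_mode { STDIO_CONSOLE, STDIO_LOGFILE, STDIO_NULL, STDIO_FAILED };

/*
 * Pick descriptors for stdin, stdout and stderr and store them in
 * fds[0..2].  The same descriptor may appear more than once.
 */
enum stdio_mode
choose_stdio(const char *console, const char *logfile, int fds[3])
{
	int fd;

	assert(console != NULL && logfile != NULL && fds != NULL);

	/* Step 1: the console; O_NONBLOCK avoids waiting for carrier. */
	fd = open(console, O_RDWR | O_NONBLOCK);
	if (fd >= 0) {
		fds[0] = fds[1] = fds[2] = fd;
		return (STDIO_CONSOLE);
	}

	/* Steps 2 and 3 both read from /dev/null. */
	fds[0] = open("/dev/null", O_RDONLY);
	if (fds[0] < 0)
		return (STDIO_FAILED);

	/* Step 2: log file for stdout and stderr. */
	fd = open(logfile, O_WRONLY | O_APPEND | O_CREAT, 0644);
	if (fd >= 0) {
		fds[1] = fds[2] = fd;
		return (STDIO_LOGFILE);
	}

	/* Step 3: /dev/null for everything. */
	fd = open("/dev/null", O_WRONLY);
	if (fd < 0) {
		close(fds[0]);
		return (STDIO_FAILED);
	}
	fds[1] = fds[2] = fd;
	return (STDIO_NULL);
}
```

Calling it with "/dev/console" and "/var/log/init.log" would follow the
order described in the list.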
So why is this useful? Well, if you remove the `getpid() == 1' check in
main(), you can now use init(8) inside jails to properly execute rc(8).
It still requires some polishing, as existing tools assume init(8) has
PID 1.
Also it is now possible to use init(8) on `headless' devices that
don't even have a serial boot console.
markj [Tue, 19 Feb 2013 16:39:53 +0000 (16:39 +0000)]
MFC r239672 (by rrs):
This small change takes care of a race condition
that can occur when both sides close at the same time.
If that occurs, without this fix the connection enters
FIN1 on both sides and they will forever send FIN|ACK at
each other until the connection times out. This is because
we stopped processing the FIN|ACK and thus did not advance
the sequence and so never ACK'd each other's FIN. This
fix adjusts it so we *do* process the FIN properly and
the race goes away ;-)
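For reference, the expected handling of a simultaneous close can be modelled
as below, assuming "FIN1" refers to TCP's FIN_WAIT_1 state. The sketch
follows the textbook RFC 793 transitions; it is not the patched code, and
struct conn and conn_input() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum tcp_state { FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT };

struct conn {
	enum tcp_state state;
	uint32_t rcv_nxt;	/* next sequence number expected from peer */
	bool fin_acked;		/* peer has ACKed our FIN */
	bool ack_pending;	/* we owe the peer an ACK */
};

/* Process one incoming segment after our own FIN has been sent. */
void
conn_input(struct conn *c, uint32_t seq, bool fin, bool acks_our_fin)
{
	bool got_fin = fin && seq == c->rcv_nxt;

	if (acks_our_fin)
		c->fin_acked = true;
	if (got_fin) {
		c->rcv_nxt++;		/* a FIN consumes one sequence number */
		c->ack_pending = true;	/* and must be acknowledged */
	}

	switch (c->state) {
	case FIN_WAIT_1:
		if (got_fin)
			c->state = c->fin_acked ? TIME_WAIT : CLOSING;
		else if (c->fin_acked)
			c->state = FIN_WAIT_2;
		break;
	case FIN_WAIT_2:
		if (got_fin)
			c->state = TIME_WAIT;
		break;
	case CLOSING:
		if (c->fin_acked)
			c->state = TIME_WAIT;
		break;
	case TIME_WAIT:
		break;
	}
}
```

The important part is that the peer's FIN advances rcv_nxt and schedules an
ACK even though our own FIN has not been acknowledged yet.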
marcel [Mon, 18 Feb 2013 05:05:01 +0000 (05:05 +0000)]
MFC r246715:
Eliminate the PC_CURTHREAD symbol and load the current thread's
thread structure pointer atomically from r13 (the pcpu pointer)
for the current CPU/core.
markj [Sun, 17 Feb 2013 19:49:18 +0000 (19:49 +0000)]
MFC r245961 r245962 r245963.
MFC r245961:
When the 'R' flag is used with a newsyslog.conf entry, some fields of
the corresponding struct sigwork_entry were left uninitialized,
potentially causing an early return from do_sigwork(). Ensure that these
fields are initialized, and handle the 'R' flag properly in
do_sigwork().
MFC r245962:
Ensure that newsyslog -n prints the correct message for a rotation rule
that uses the 'R' flag.
MFC r245963:
Rename the run_cmd field to sw_runcmd to make it consistent with the
other fields in struct sigwork_entry.
pfg [Sun, 17 Feb 2013 01:34:41 +0000 (01:34 +0000)]
MFC r240355, r239372, r246258;
ext2fs: general cleanups.
- Remove unused extern declarations in fs.h
- Correct comments in ext2_dir.h
- Several panic() messages showed wrong function names.
- Remove commented out stray line in ext2_alloc.c.
- Remove the unused macro EXT2_BLOCK_SIZE_BITS() and the then
write-only member e2fs_blocksize_bits from struct m_ext2fs.
- Remove the unused macro EXT2_FIRST_INO() and the then write-only
member e2fs_first_inode from struct m_ext2fs.
- Remove EXT2_DESC_PER_BLOCK() and the member e2fs_descpb from
struct m_ext2fs.
- Remove the unused members e2fs_bmask, e2fs_dbpg and
e2fs_mount_opt from struct m_ext2fs
- Correct harmless off-by-one error for fspath in ext2_vfsops.c.
- Remove the unused and broken macros EXT2_ADDR_PER_BLOCK_BITS()
and EXT2_DESC_PER_BLOCK_BITS().
- Remove the !_KERNEL versions of the EXT2_* macros.
Submitted by: Christoph Mallon
To ease bringing in this change I also brought in these changes:
- Fix typo.
- Fix style nit.
luigi [Sat, 16 Feb 2013 22:44:02 +0000 (22:44 +0000)]
partial MFC of rev=245362:
enable building virtio devices into static kernels.
I think the 'files.*' entries should be improved (also in HEAD) because
bringing up a vtnet device now requires 3 entries in your kernel config:
"device virtio, device virtio_pci, device vtnet"
but i'll leave the fix to a future commit.
This is also the reason not to enable the device in GENERIC kernels now.
hselasky [Thu, 14 Feb 2013 10:32:47 +0000 (10:32 +0000)]
MFC r246397:
Make sure that all mouse buttons are released when clients
using /dev/consolectl close. This fixes a problem where if
a USB mouse is detached while a button is pressed, that
button is never released.
kib [Wed, 13 Feb 2013 23:25:11 +0000 (23:25 +0000)]
MFC r246117:
Rework the __vdso_* symbols attributes to only make the symbols weak,
but use normal references instead of weak ones. This makes statically
linked binaries use the fast gettimeofday(2) by forcing the linker to
resolve the references and by providing the necessary functions.
kib [Wed, 13 Feb 2013 13:55:54 +0000 (13:55 +0000)]
MFC r246116:
Reduce default shift used to calculate the max frequency for the TSC
timecounter to 1, and correspondingly increase the precision of the
gettimeofday(2) and related functions in the default configuration.
MFC r246212:
Remove the (shift > 0) condition when selecting the get_timecount()
implementation.
yongari [Wed, 13 Feb 2013 00:46:41 +0000 (00:46 +0000)]
MFC r246341:
Rework jumbo frame handling. QAC confirmed that the controller
requires 8 bytes alignment on RX buffer. Given that non-jumbo
frame works on any alignments I guess this DMA limitation for RX
buffer could be jumbo frame specific one. Also I'm not sure
whether this DMA limitation is related with 64bit DMA. Previously
age(4) disabled 64bit DMA addressing due to silent data corruption.
So we may need more testing on re-enabling 64bit DMA in future.
While I'm here, change mbuf chaining algorithm to use fixed sized
buffer and force software checksum if controller reports length
error. According to QAC, RFD is not updated at all for jumbo frame
so it works just like alc(4) controllers. This change also added
alignment fixup for strict alignment architectures. Because I'm
not aware of any non-x86 machines that use age(4) controllers it's
just for completeness at this moment.
With this change, jumbo frame should work with age(4).
dim [Sun, 10 Feb 2013 21:24:47 +0000 (21:24 +0000)]
MFC r246259:
Pull in r170135 from upstream clang trunk:
Don't use/link ARCMT, StaticAnalyzer and Rewriter to clang when the user
specifies not to. Don't build ASTMatchers with Rewriter disabled and
StaticAnalyzer when it's disabled.
Without all those three, the clang binary shrinks (x86_64) from ~36MB
to ~32MB (unstripped).
To disable these clang components, and get a smaller clang binary built
and installed, set WITHOUT_CLANG_FULL in src.conf(5). During the
initial stages of buildworld, those extra components are already
disabled automatically, to save some build time.
cperciva [Sun, 10 Feb 2013 17:48:46 +0000 (17:48 +0000)]
MFC r246016:
Add a loader tunable "hw.broken_txfifo" which enables a workaround for a
bug in old versions of QEMU (and Xen, and other places using QEMU code).
- r238072: Do not include <sys/types.h> in the local headers.
- r238360: Various VirtIO improvements
- r240430: No need to leak these into the includer's namespace.
- r241469: virtqueue: Fix non-indirect virtqueues
- r241470: Add Virtio SCSI driver
- r241495: Fix build with PAE enabled
- r244136: Remove duplicated lines
- r244200: Start taskqueues threads after attach cannot fail
pfg [Sat, 9 Feb 2013 01:08:49 +0000 (01:08 +0000)]
MFC r237574, r237625, r246256;
crunch: Sync some NetBSD changes.
crunchide:
Apr 11, 2009: fix some -Wsign-compare issues.
Sep 20, 1999: Free the right thing.
crunchgen:
Apr 14, 2009: Fix some WARNS=4 issues (-Wshadow -Wcast-qual)
Oct 30, 2004: Add (unsigned char) cast to ctype functions
Feb 5, 2001: fix nested extern.
examples:
Aug 30, 2007: NetBSD 36867 - trsp references are deprecated
In order to merge this I also had to merge some previous changes:
- Ensure crunchgen uses the same make binary as the rest of the
build.
- Some amount of style(9): function definitions, header ordering,
and $FreeBSD$.
delphij [Sat, 9 Feb 2013 00:29:36 +0000 (00:29 +0000)]
MFC r245264:
The current ZFS code expects ddt_zap_count to always succeed by asserting
the underlying zap_count() to return no errors. However, it is possible
that the pool reaches a state where zap_count() would return an error,
leading to panics when a pool is imported.
This commit changes ddt_zap_count to return the error returned by
zap_count and handle the error appropriately. With this change, it's now
possible to let zpool rollback damaged transaction groups and import the
pool.
kib [Fri, 8 Feb 2013 10:38:12 +0000 (10:38 +0000)]
MFC r246218:
Backup FATs were sometimes marked dirty by copying their first block
from the primary FAT, and then they were not marked clean on unmount.
Force marking them clean when appropriate.
mav [Wed, 6 Feb 2013 22:30:40 +0000 (22:30 +0000)]
MFC r244001 (by ken):
Fix a panic during CAM EDT traversal.
The problem was a race condition between the EDT traversal used by
things like 'camcontrol devlist', and CAM peripheral driver
removal.
The EDT traversal code holds the CAM topology lock, and wants
to show devices that have been invalidated. It acquires a
reference to the peripheral to make sure the peripheral it is
examining doesn't go away.
However, because the peripheral removal code in camperiphfree()
drops the CAM topology lock to call the peripheral's destructor
routine, we can run into a situation where the EDT traversal
increments the peripheral reference count after the free process is
already in progress. At that point, the reference count is
ignored, because it was 0 when we started the process.
Fix this race by setting a flag, CAM_PERIPH_FREE, that I previously
added and checked in xptperiphtraverse() and xptpdperiphtraverse(),
but failed to use. If the EDT traversal code sees that flag,
it will know that the peripheral free process has already started,
and that it should not access that peripheral.
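The idea can be shown with a small list walk. Everything here is a
stand-in: PERIPH_FLAG_FREE plays the role of CAM_PERIPH_FREE, struct periph
is a toy structure, and the topology lock that the real code holds is
omitted.

```c
#include <assert.h>
#include <stddef.h>

#define PERIPH_FLAG_FREE	0x01	/* stand-in for CAM_PERIPH_FREE */

struct periph {
	unsigned flags;
	int refcount;
	struct periph *next;
};

/* Mark a peripheral as being torn down; its refcount has reached 0. */
void
periph_begin_free(struct periph *p)
{
	assert(p->refcount == 0);
	p->flags |= PERIPH_FLAG_FREE;
}

/*
 * Walk the list and examine every peripheral that is not being freed.
 * Returns the number of peripherals visited.
 */
int
visit_live_periphs(struct periph *head)
{
	struct periph *p;
	int visited = 0;

	for (p = head; p != NULL; p = p->next) {
		if (p->flags & PERIPH_FLAG_FREE)
			continue;	/* free in progress: do not touch */
		p->refcount++;		/* hold it while examining */
		visited++;
		p->refcount--;
	}
	return (visited);
}
```

Without the flag check, the walk would take a reference on a peripheral
whose teardown has already started and would then ignore that reference.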
Also, fix an inconsistency in the locking between
xptpdperiphtraverse() and xptperiphtraverse(). They now both
hold the CAM topology lock while calling the peripheral traversal
function.
cam_xpt.c: Change xptperiphtraverse() to hold the CAM topology
lock across calls to the traversal function.
Take out the comment in xptpdperiphtraverse() that
referenced the locking inconsistency.
cam_periph.c: Set the CAM_PERIPH_FREE flag when we are in the
process of freeing a peripheral driver.
mav [Wed, 6 Feb 2013 22:22:15 +0000 (22:22 +0000)]
MFC r241503:
XPT_DEV_MATCH is probably the only xpt_action() method that is called
without holding the SIM lock. It doesn't really need that lock, but taking it
removes that specific exception, allowing locking to be asserted there later.
mav [Wed, 6 Feb 2013 22:07:38 +0000 (22:07 +0000)]
MFC r235911, r235980, r238739, r238740, r238894, r239213, r241488, r241952,
r242173, r242621, r242634, r242638, r242647, r242720, r244418, r244508,
r245891:
Revamp the CAM enclosure services driver.
This updated driver uses an in-kernel daemon to track state changes and
publishes physical path location information for disk elements into the
CAM device database.
Sponsored by: Spectra Logic Corporation
Sponsored by: iXsystems, Inc.
mav [Wed, 6 Feb 2013 18:40:07 +0000 (18:40 +0000)]
MFC r242175:
Remove priority enforcement from xpt_action(). It is not good, and in some
cases not even safe, to reduce CCB priority after it was scheduled with high
priority. This fixes a reproducible deadlock when a command is sent through
the pass interface while ATA XPT recovers from a command timeout.
Instead, enforce priority in passioctl(). libcam provides no obvious
interface to specify CCB priority, so most (all?) code specifies zero
(highest) priority. This change limits pass CCB priority to the NORMAL run
level, allowing XPT to complete bus and device recovery after reset before
running any payload.
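In effect this is a clamp applied at the ioctl boundary. The sketch below
uses made-up constants and a hypothetical pass_limit_priority() helper; the
real CAM priority values and the passioctl() code may differ.

```c
#include <assert.h>

/*
 * Hypothetical priority levels: lower values mean higher priority,
 * with zero being the highest.  The real CAM constants may differ.
 */
#define PRIO_HIGHEST	0u
#define PRIO_NORMAL	100u

/* Never let a userland request run above the NORMAL level. */
unsigned
pass_limit_priority(unsigned requested)
{
	return (requested < PRIO_NORMAL ? PRIO_NORMAL : requested);
}
```

Requests that already ask for NORMAL or a lower priority pass through
unchanged.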
kib [Wed, 6 Feb 2013 13:53:59 +0000 (13:53 +0000)]
MFC r246119:
Rework the handling of the children for the pthread_vfork_test. The
trivial handler for SIGCHLD is installed, and SIGCHLD is blocked, to
not abandon our zombies to init(8). This way, the zombies are around
slightly longer, allowing to actually exercise the logic for p_pwait
use by the test.
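A single-threaded sketch of that setup follows. It is not the test's code:
spawn_and_reap() and sigchld_noop() are hypothetical, and sigprocmask() is
used here where a threaded program would use pthread_sigmask().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Trivial handler: its only purpose is that SIGCHLD is not ignored. */
void
sigchld_noop(int sig)
{
	(void)sig;
}

/*
 * Fork a child that exits with 'code' and reap it explicitly.
 * Returns the child's exit code, or -1 on error.
 */
int
spawn_and_reap(int code)
{
	struct sigaction sa;
	sigset_t set, oset;
	pid_t pid;
	int status;

	assert(code >= 0 && code <= 255);

	/* With a handler installed the child stays a zombie until waited for. */
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sigchld_noop;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, NULL) != 0)
		return (-1);

	/* Block delivery so the handler does not interrupt the caller. */
	sigemptyset(&set);
	sigaddset(&set, SIGCHLD);
	if (sigprocmask(SIG_BLOCK, &set, &oset) != 0)
		return (-1);

	pid = fork();
	if (pid == 0)
		_exit(code);
	if (pid < 0 || waitpid(pid, &status, 0) != pid) {
		(void)sigprocmask(SIG_SETMASK, &oset, NULL);
		return (-1);
	}
	(void)sigprocmask(SIG_SETMASK, &oset, NULL);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

Because the parent collects the status itself, the zombie exists from the
child's exit until the waitpid() call.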
kib [Wed, 6 Feb 2013 13:49:56 +0000 (13:49 +0000)]
MFC r246118:
The case of pid == WAIT_MYPGRP for the kern_wait() is already handled
in kern_wait6(), which is called by kern_wait(). Remove the redundant
check, introduced in r243136, and add a comment noting this, to make
the code less confusing.
dim [Tue, 5 Feb 2013 19:10:50 +0000 (19:10 +0000)]
MFC r246028 (by theraven):
Fix some symbol version mismatches between libstdc++ and libsupc++/libcxxrt
that were causing the runtime and STL libraries to see different versions of
various classes and functions when libstdc++ is used as a filter.
Note: This changes the ABI for libcxxrt, but libcxxrt is currently only in
-STABLE for testing and is not used by anything unless explicitly enabled by
the end user. No default compiler configurations use it.
libc++ will need to be recompiled after this change. make buildworld will do
this automatically, but make in lib/libc++ will not necessarily work unless the
new libcxxrt is installed first.
PR: kern/171610, stand/175453
Reviewed by: kib
MFC r246297:
Add several missing symbols to libcxxrt's symbol version map, and remove
a few duplicates. This should fix building world with -stdlib=libc++
after r246028.