alc [Mon, 11 Jun 2012 21:41:16 +0000 (21:41 +0000)]
Avoid unnecessary atomic operations for clearing PGA_WRITEABLE in
pmap_remove_pages(). This reduces pmap_remove_pages()'s running time by
4 to 11% in my tests.
mm [Mon, 11 Jun 2012 11:35:22 +0000 (11:35 +0000)]
Introduce "feature flags" for ZFS pools (bump SPA version to 5000).
Add first feature "com.delphix:async_destroy" (asynchronous destroy
of ZFS datasets).
Implement features support in ZFS boot code.
Illumos revisions merged:
13700:2889e2596bd6
13701:1949b688d5fb
2619 asynchronous destruction of ZFS file systems
2747 SPA versioning with zfs feature flags
adrian [Mon, 11 Jun 2012 07:44:16 +0000 (07:44 +0000)]
Wrap the whole (software) TX path from ifnet dequeue to software queue
(or direct dispatch) behind the TXQ lock (which, remember, is doubling
as the TID lock too for now.)
This ensures that:
(a) the sequence number and the CCMP PN allocation are done together;
(b) overlapping transmit paths don't interleave frames, so we don't
end up with the original issue that triggered kern/166190.
I.e., we don't end up with seqnos A, B allocated in thread 1 and C, D in
thread 2, then queued to the software queue as "A C D B"
or similar, leading to the BAW stalls.
This has been tested:
* both STA and AP modes with INVARIANTS and WITNESS;
* TCP and UDP TX;
* both STA->AP and AP->STA.
STA is a Routerstation Pro (single CPU MIPS) and the AP is a dual-core
Centrino.
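The effect of holding one lock across allocation and enqueue can be sketched with a toy model. This is not the ath(4) code: TxQueue, transmit() and run_senders() are hypothetical names, and a Python threading.Lock stands in for the TXQ lock.

```python
import threading


class TxQueue:
    """Toy stand-in for a software TX queue (hypothetical, not driver code)."""

    def __init__(self):
        self.lock = threading.Lock()  # plays the role of the TXQ lock
        self.next_seqno = 0
        self.next_pn = 0
        self.frames = []              # software queue, head at index 0

    def transmit(self, payload):
        # Seqno allocation, PN allocation and the enqueue share one critical
        # section, so no other sender can slip a frame in between the steps.
        with self.lock:
            frame = (self.next_seqno, self.next_pn, payload)
            self.next_seqno += 1
            self.next_pn += 1
            self.frames.append(frame)


def run_senders(n_threads=4, frames_per_thread=500):
    """Run several concurrent senders and return the resulting queue."""
    q = TxQueue()

    def worker(tid):
        for i in range(frames_per_thread):
            q.transmit((tid, i))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return q.frames


frames = run_senders()
seqnos = [f[0] for f in frames]
```

In this model the queue order always matches the allocation order, whichever thread wins the lock at each step.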
adrian [Mon, 11 Jun 2012 07:29:25 +0000 (07:29 +0000)]
When scheduling frames in an aggregate session, the frames should be
scheduled from the head of the software queue rather than trying to
queue the newly given frame.
Dispatching the newly given frame first leads to some rather unfortunate
out of order (but still valid, as it's inside the BAW) frame TX.
This now:
* Always queues the frame at the end of the software queue;
* Tries to direct dispatch the frame at the head of the software queue,
to try and fill up the hardware queue.
TODO:
* I should likely try to queue as many frames to the hardware as I can
at this point, rather than doing one at a time;
* ath_tx_xmit_aggr() may fail and this code assumes that it'll schedule
the TID. Otherwise TX may stall.
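A minimal sketch of the "queue at the tail, dispatch from the head" behaviour described above. It is a toy model, not the driver code: enqueue_and_dispatch() and hw_depth are hypothetical, and plain deques stand in for the software and hardware queues.

```python
from collections import deque


def enqueue_and_dispatch(swq, hwq, new_frame, hw_depth=2):
    """Append new_frame to the software queue, then move the head frame to
    the hardware queue if it has room. Returns the dispatched frame or None."""
    swq.append(new_frame)        # the new frame always goes to the tail
    if len(hwq) >= hw_depth:     # hardware queue full: nothing dispatched
        return None
    head = swq.popleft()         # the oldest frame goes out first
    hwq.append(head)
    return head


swq = deque([10, 11])            # frames already waiting, by seqno
hwq = deque()
sent = enqueue_and_dispatch(swq, hwq, 12)
```

Here the frame handed to the hardware is 10, the oldest one, while the newly given frame 12 waits behind 11.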
adrian [Mon, 11 Jun 2012 07:15:48 +0000 (07:15 +0000)]
Retried frames need to be inserted at the head of the list, not the tail.
This is an unfortunate byproduct of how the routine is used - it's called
with the head frame on the queue, but if the frame fails, it's inserted
at the tail of the queue.
Because of this, the sequence numbers would get all shuffled around and
the BAW would be bumped past this sequence number, that's now at the
end of the software queue. Then, whenever it's time for that frame
to be transmitted, it'll be immediately outside of the BAW and TX will
stall until the BAW catches up.
It can also result in all kinds of weird duplicate BAW frames, leading
to hilarious panics.
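The ordering problem can be shown with a toy queue. requeue_failed() and in_seqno_order() are hypothetical helpers and the BAW itself is not modelled; the sketch only shows how tail insertion shuffles sequence numbers.

```python
from collections import deque


def requeue_failed(swq, frame, at_head=True):
    """Put a failed frame back on the software queue."""
    if at_head:
        swq.appendleft(frame)    # keeps sequence numbers in order
    else:
        swq.append(frame)        # old behaviour: frame ends up last


def in_seqno_order(swq):
    """True if the queue is sorted by sequence number."""
    return list(swq) == sorted(swq)


fixed = deque([6, 7, 8])         # frame 5 was taken from the head and failed
requeue_failed(fixed, 5, at_head=True)

broken = deque([6, 7, 8])
requeue_failed(broken, 5, at_head=False)
```

With head insertion the queue reads 5, 6, 7, 8 again; with tail insertion frame 5 sits behind 8.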
adrian [Mon, 11 Jun 2012 06:59:28 +0000 (06:59 +0000)]
Revert r233227 and followup commits as it breaks CCMP PN replay detection.
This showed up when doing heavy UDP throughput on SMP machines.
The problem is that the 802.11 sequence number is
allocated separately from the CCMP PN replay number (which is assigned
during ieee80211_crypto_encap()).
Under significant throughput (200+ Mbps) the TX path would be stressed
enough that frame TX/retry would force sequence number and PN allocation
to be out of order. So once the frames were reordered via 802.11 seqnos,
the CCMP PN would be far out of order, causing most frames to be discarded
by the receiver.
Fixing this in some local work forced me to:
(a) deal with the issues that lead to the parallel TX causing out of
order sequence numbers in the first place;
(b) fix all the packet queuing issues which lead to strange (but mostly
valid) TX.
I'll begin fixing these in a subsequent commit or five.
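A simplified model of why skewed PN allocation causes drops. It assumes the receiver reorders frames by sequence number and then accepts a frame only if its PN is greater than the last accepted PN; receive() and the sample values are hypothetical, and real PNs and reorder windows are more involved.

```python
def receive(frames):
    """frames: iterable of (seqno, pn) pairs.

    Reorder by 802.11 seqno, then apply a CCMP-style replay check.
    Returns (accepted_seqnos, dropped_seqnos)."""
    last_pn = -1
    accepted, dropped = [], []
    for seqno, pn in sorted(frames):
        if pn > last_pn:
            accepted.append(seqno)
            last_pn = pn
        else:
            dropped.append(seqno)  # looks like a replay to the receiver
    return accepted, dropped


# Seqno and PN allocated together: both increase in step.
in_step = [(0, 100), (1, 101), (2, 102), (3, 103)]
# Seqno and PN allocated separately and out of order with each other.
skewed = [(0, 103), (1, 100), (2, 101), (3, 102)]

ok = receive(in_step)
bad = receive(skewed)
```

In the skewed case only the first frame survives, because every later frame carries a PN lower than the one already seen.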
kib [Sun, 10 Jun 2012 11:31:50 +0000 (11:31 +0000)]
Use the previous stack entry protection and max protection to correctly
propagate the stack execution permissions when the stack is grown down.
First, curproc->p_sysent->sv_stackprot specifies the maximum allowed stack
protection for the current ABI, so the new stack entry was typically always
marked executable. Second, for a non-main stack MAP_STACK mapping,
the PROT_ flags that were specified at mmap(2) call time should be used,
not sv_stackprot.
mav [Sun, 10 Jun 2012 11:17:14 +0000 (11:17 +0000)]
Partially revert r236666:
Return PROTO_ATA protocol in response to XPT_PATH_INQ.
smartmontools uses it to identify ATA devices and I don't know any other
place now where it is important. It could probably use XPT_GDEV_TYPE
instead for more accurate protocol information, but let it live for now.
andrew [Sun, 10 Jun 2012 10:40:22 +0000 (10:40 +0000)]
Remove an unneeded increment from initarm. The variable is uninitialised,
is not used in this part of the function and is correctly initialised later
when it is used.
iwasaki [Sun, 10 Jun 2012 02:38:51 +0000 (02:38 +0000)]
Some fixes for r236772.
- Remove cpuset stopped_cpus which is no longer used.
- Add a short comment for cpuset suspended_cpus clearing.
- Fix the un-ordered x86/acpica/acpi_wakeup.c in conf/files.amd64 and i386.
mckusick [Sat, 9 Jun 2012 22:26:53 +0000 (22:26 +0000)]
When synchronously syncing a device (MNT_WAIT), wait for buffers
to become available. Otherwise we may excessively spin and fail
with ``fsync: giving up on dirty''.
pjd [Sat, 9 Jun 2012 20:16:19 +0000 (20:16 +0000)]
ds_guid of 0 is special, as it is used by snapshot receive code to
differentiate between an incremental and full stream.
Be sure not to generate a guid equal to 0.
Reported by: someone who saw 0 being generated as 64bit random guid
MFC after: 3 days
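The idea can be sketched as a retry loop around the random source. This is not the ZFS code: random_nonzero_guid() and its rand64 parameter are hypothetical, with Python's secrets.randbits() standing in for the kernel's random generator.

```python
import secrets


def random_nonzero_guid(rand64=lambda: secrets.randbits(64)):
    """Return a random 64-bit guid, retrying until the value is not 0."""
    while True:
        guid = rand64()
        if guid != 0:            # 0 is reserved as a special value
            return guid


# A fake random source that yields 0 twice shows the retry in action.
fake_values = iter([0, 0, 42])
guid = random_nonzero_guid(lambda: next(fake_values))
```

With the fake source above the function skips both zeros and returns 42.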
mav [Sat, 9 Jun 2012 13:07:44 +0000 (13:07 +0000)]
One more major cam_periph_error() rewrite to improve error handling and
reporting. It includes:
- removing error messages controlled by bootverbose, replacing them
with more universal and informative debugging at the CAM_DEBUG_INFO level,
which is now built into the kernel by default;
- following more closely the arguments submitted by the caller, such as
SF_PRINT_ALWAYS, SF_QUIET_IR and SF_NO_PRINT; the consumer knows better which
errors are usual/expected at this point and which are really informative;
- adding two new flags, SF_NO_RECOVERY and SF_NO_RETRY, to allow the caller
to specify how much assistance it needs at this point; previously consumers
controlled that by not calling cam_periph_error() at all, but that made
behavior inconsistent and debugging complicated;
- tuning the order of debug messages and actions taken to make debugging output
more readable and cause-effect relationships visible;
- making camperiphdone() (the common device recovery completion handler)
also use cam_periph_error() in most cases, instead of its own dumb code;
- removing manual sense fetching code from cam_periph_error(); I was told
by a number of people that it is the SIM's obligation to fetch sense data, so this
code is useless and only significantly complicates recovery logic;
- making the ada, da and pass drivers use cam_periph_error() with the new limited
recovery options to handle error recovery and debugging in a common way;
as one result, CAM_REQUEUE_REQ and other retrying statuses now
work fine with the pass driver, which caused many problems before;
- reverting r186891 by raj@ to avoid burning a few seconds in tight DELAY()
loops on device probe, while the device simply loads media; I think that problem
may already be fixed in another way, and even if it is not, the solution must be
different.
iwasaki [Sat, 9 Jun 2012 00:37:26 +0000 (00:37 +0000)]
Add x86/acpica/acpi_wakeup.c for amd64 and i386. Differences in the
suspend/resume procedures between them are minimized.
common:
- Add global cpuset suspended_cpus to indicate APs are suspended/resumed.
- Remove acpi_waketag and acpi_wakemap from acpivar.h (no longer used).
- Add some variables in acpi_wakecode.S in order to minimize the difference
between amd64 and i386.
- Disable load_cr3() because now CR3 is restored in resumectx().
amd64:
- Add suspend/resume related members (such as MSR) in PCB.
- Modify savectx() for above new PCB members.
- Merge acpi_switch.S into cpu_switch.S as resumectx().
i386:
- Merge (and remove) suspendctx() into savectx() in order to match the
amd64 code.
jilles [Fri, 8 Jun 2012 22:54:25 +0000 (22:54 +0000)]
sh: Do not assume that SIGPIPE will only kill a subshell in builtins/wait3.0
test.
POSIX says that SIGPIPE affects a process and therefore a SIGPIPE caused and
received by a subshell environment may or may not affect the parent shell
environment.
The change assumes that ${SH} is executed in a new process. This must be the
case if it contains a slash and everyone appears to do so anyway even though
POSIX might permit otherwise.
jhb [Fri, 8 Jun 2012 21:30:35 +0000 (21:30 +0000)]
Several updates:
- Consistently refer to rmlocks as "read-mostly locks".
- Relate rmlocks to rwlocks rather than sx locks since they are closer to
rwlocks.
- Add a separate paragraph on sleepable read-mostly locks contrasting them
with "normal" read-mostly locks.
- The flag passed to rm_init_flags() to enable recursion for readers is
RM_RECURSE, not LO_RECURSABLE.
- Fix the description for RM_RECURSE (it allows readers to recurse, not
writers).
- Explicitly note that rm_try_rlock() honors RM_RECURSE.
- Fix some minor grammar nits.
jhb [Fri, 8 Jun 2012 18:32:09 +0000 (18:32 +0000)]
Split the second half of vn_open_cred() (after a vnode has been found via
a lookup or created via VOP_CREATE()) into a new vn_open_vnode() function
and use this function in fhopen() instead of duplicating code from
vn_open_cred() directly.
dim [Fri, 8 Jun 2012 17:08:27 +0000 (17:08 +0000)]
In usr.bin/sort, use another method of silencing warnings about unused
arguments, which does not trigger self-assignment warnings in certain
circumstances (for example, using clang with ccache).
mav [Thu, 7 Jun 2012 10:05:51 +0000 (10:05 +0000)]
To make CAM debugging easier, compile in some debug flags (CAM_DEBUG_INFO,
CAM_DEBUG_CDB, CAM_DEBUG_PERIPH and CAM_DEBUG_PROBE) by default.
The list of these flags can be modified with the CAM_DEBUG_COMPILE kernel option.
The CAMDEBUG kernel option still enables all possible debug, if not overridden.
An additional 50KB of kernel size is a good price for the ability to debug
problems without rebuilding the kernel. In cases where size is important,
debugging can be compiled out by setting the CAM_DEBUG_COMPILE option to 0.
kib [Wed, 6 Jun 2012 16:30:16 +0000 (16:30 +0000)]
Improve handling of uiomove(9) errors for the NFS client.
Do not brelse() the buffer unconditionally with BIO_ERROR set if
uiomove() failed. The brelse() treats most buffers with BIO_ERROR as
B_INVAL, dropping their content. Instead, if the write request
covered the whole buffer, remember the cached state and brelse() with
BIO_ERROR set only if the buffer was not cached previously.
Update the buffer dirtyoff/dirtyend based on the progress recorded by
uiomove() in the passed struct uio, even in the presence of an
error. Otherwise, usermode could see changed data in the backed pages,
but later the buffer is destroyed without write-back.
If uiomove() failed for an IO_UNIT request, try to truncate the vnode
back to the pre-write state, and rewind the progress in the passed uio
accordingly, following the FFS behaviour.
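The dirty range update can be sketched as follows, assuming the range is extended by however many bytes were actually copied before the error. merge_dirty() and its parameters are hypothetical; only the dirtyoff/dirtyend names come from the message above, and this is not the NFS client code.

```python
def merge_dirty(dirtyoff, dirtyend, on, copied):
    """Return the buffer's (dirtyoff, dirtyend) after `copied` bytes
    landed at offset `on`, whether or not the copy later failed."""
    if copied <= 0:
        return dirtyoff, dirtyend        # nothing landed: range unchanged
    if dirtyend > 0:                     # buffer already has a dirty range
        return min(dirtyoff, on), max(dirtyend, on + copied)
    return on, on + copied               # first dirty bytes in this buffer


# A 50-byte write at offset 100 that failed after copying only 20 bytes
# still marks those 20 bytes dirty.
partial = merge_dirty(0, 0, 100, 20)
# A later write that copied nothing leaves the range as it was.
unchanged = merge_dirty(100, 120, 140, 0)
# A write overlapping the existing range extends it.
extended = merge_dirty(100, 120, 110, 40)
```

Tracking the bytes that really landed keeps the dirty range in step with what usermode can already see in the pages.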