John Baldwin [Thu, 27 May 2010 18:07:20 +0000 (18:07 +0000)]
More gracefully handle stale file handles and attributes when opening a
file via NFS. Specifically, to satisfy close-to-open consistency, the NFS
client always performs at least one RPC on a file during an open(2) to see
if the file has changed. Normally this RPC is an ACCESS or GETATTR RPC
that is forced by flushing a file's attribute cache during nfs_open() and
then requesting new attributes. However, if the file is noticed to be
stale during nfs_open(), the only recourse is to fail the open(2) call
with ESTALE. On the other hand, if the ACCESS or GETATTR RPC is sent
during nfs_lookup(), then the NFS client can fall back to a LOOKUP RPC to
obtain the new file handle in the case that a file has been replaced.
This change causes the NFS client to flush the attribute cache during
nfs_lookup() when validating a name cache hit if the attributes fetched
during nfs_lookup() can be reused in nfs_open(). This allows the client
to open a replaced file via the new file handle the first time that it
notices a replaced file rather than failing with ESTALE in some cases.
Robert Watson [Thu, 27 May 2010 15:27:31 +0000 (15:27 +0000)]
When close() is called on a connected socket pair, SO_ISCONNECTED might be
set but be cleared before the call to sodisconnect(). In this case,
ENOTCONN is returned: suppress this error rather than returning it to
userspace so that close() doesn't report an error improperly.
PR: kern/144061
Reported by: Matt Reimer <mreimer at vpop.net>,
Nikolay Denev <ndenev at gmail.com>,
Mikolaj Golub <to.my.trociny at gmail.com>
MFC after: 3 days
Ulrich Spörlein [Thu, 27 May 2010 12:59:49 +0000 (12:59 +0000)]
mail(1) misses addresses when replying to all
There's a parsing error for fields where addresses are not separated by
a space. Such fields are often produced by MS Outlook, e.g.: Cc: <foo@bar.com>,"Mr Foo" <foo@baz.com>
The following line now splits into the right tokens: Cc: f@b.com,z@y.de, <a@a.de>,<c@c.de>, "foo" <foo>,"bar" <bar>
PR: bin/131861
Submitted by: Pete French <petefrench at ticketswitch.com>
Tested by: Pete French
Reviewed by: mikeh
MFC after: 2 weeks
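The splitting rule described above can be sketched as follows. This is a
minimal illustration and not the mail(1) source: the helper name
next_address() and its interface are hypothetical, and it assumes that a
comma separates addresses unless it appears inside double quotes or angle
brackets.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from mail(1): copy the next address from
 * *sp into buf and advance *sp past it.  A comma ends an address unless it
 * is inside double quotes or angle brackets.  Returns 1 if an address was
 * copied and 0 at the end of the list.
 */
static int
next_address(const char **sp, char *buf, size_t buflen)
{
	const char *s = *sp;
	size_t n = 0;
	int in_quote = 0, in_angle = 0;

	if (buflen == 0)
		return (0);
	/* Skip separators left over from the previous address. */
	while (*s == ',' || isspace((unsigned char)*s))
		s++;
	if (*s == '\0') {
		*sp = s;
		return (0);
	}
	for (; *s != '\0'; s++) {
		if (*s == '"')
			in_quote = !in_quote;
		else if (!in_quote && *s == '<')
			in_angle = 1;
		else if (!in_quote && *s == '>')
			in_angle = 0;
		else if (!in_quote && !in_angle && *s == ',')
			break;
		/* Copy the character, silently truncating if buf is full. */
		if (n + 1 < buflen)
			buf[n++] = *s;
	}
	/* Trim trailing whitespace. */
	while (n > 0 && isspace((unsigned char)buf[n - 1]))
		n--;
	buf[n] = '\0';
	*sp = s;
	return (1);
}
```

Calling next_address() repeatedly on the second Cc: line from the commit
message yields six entries, with the quoted display names kept together
with their angle-bracket addresses.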
Alan Cox [Wed, 26 May 2010 18:00:44 +0000 (18:00 +0000)]
Push down page queues lock acquisition in pmap_enter_object() and
pmap_is_referenced(). Eliminate the corresponding page queues lock
acquisitions from vm_map_pmap_enter() and mincore(), respectively. In
mincore(), this allows some additional cases to complete without ever
acquiring the page queues lock.
Assert that the page is managed in pmap_is_referenced().
On powerpc/aim, push down the page queues lock acquisition from
moea*_is_modified() and moea*_is_referenced() into moea*_query_bit().
Again, this will allow some additional cases to complete without ever
acquiring the page queues lock.
Reorder a few statements in vm_page_dontneed() so that a race can't lead
to an old reference persisting. This scenario is described in detail by a
comment.
Correct a spelling error in vm_page_dontneed().
Assert that the object is locked in vm_page_clear_dirty(), and restrict the
page queues lock assertion to just those cases in which the page is
currently writeable.
Add object locking to vnode_pager_generic_putpages(). This was the one
and only place where vm_page_clear_dirty() was being called without the
object being locked.
Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call
to vm_page_clear_dirty().
Change vnode_pager_generic_putpages() to the modern style of function
definition. Also, change the name of one of the parameters to follow
virtual memory system naming conventions.
Robert Watson [Wed, 26 May 2010 10:46:03 +0000 (10:46 +0000)]
Add unix_close_race, a regression test to catch ENOTCONN being returned
improperly from one of two instances of close(2) being called
simultaneously on both ends of a connected UNIX domain socket. The test
tool is slightly tweaked to improve failure modes, and while it often
triggers the problem, it doesn't do so consistently due to the nature of
the race.
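One round of such a race can be sketched in userspace as shown here. This
is an illustration only and not the unix_close_race source: the function
name close_race_once() is hypothetical, and the real tool may schedule the
two close(2) calls differently. A parent and a child each close both
descriptors of a socket pair, in opposite order, and any failing close(2)
is reported.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical single round of the race: returns 0 if every close(2)
 * succeeded, 1 if any close(2) failed (for example with ENOTCONN), and
 * -1 if the round could not be set up.
 */
static int
close_race_once(void)
{
	int sv[2], status, failed = 0;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	pid = fork();
	if (pid == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	if (pid == 0) {
		/* Child: close both descriptors and report any failure. */
		failed |= (close(sv[0]) != 0);
		failed |= (close(sv[1]) != 0);
		_exit(failed);
	}
	/* Parent: close in the opposite order to race with the child. */
	failed |= (close(sv[1]) != 0);
	failed |= (close(sv[0]) != 0);
	if (waitpid(pid, &status, 0) != pid)
		return (-1);
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		failed = 1;
	return (failed);
}
```

Because the outcome depends on timing, a single round proves little; a
test would call close_race_once() many times in a loop and stop at the
first nonzero result.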
Gleb Smirnoff [Tue, 25 May 2010 21:20:56 +0000 (21:20 +0000)]
Add uep(4), a driver for the USB onscreen touch panel from eGalax.
The driver is a stub. It just creates a device entry and feeds
reassembled packets from the hardware into it.
If in the future we port wsmouse(4) from NetBSD, or make
sysmouse(4) support absolute motion events, then the driver
can be extended to act as a system mouse. Meanwhile, it just
presents /dev/uep0, which can be used by an X driver that
I am going to commit to the ports tree soon.
The name of the driver is chosen to be the same as in NetBSD;
however, due to the different USB stacks, this driver isn't a port.
Change ia64's struct syscall_args definition so that args is a pointer to
the arguments array instead of the array itself. ia64 syscall arguments are
readily available in the frame, so point args to it and avoid an unnecessary
bcopy. Still reserve the array in syscall_args for ia32 emulation.
Suggested and reviewed by: marcel
MFC after: 1 month
Pyun YongHyeon [Mon, 24 May 2010 17:12:44 +0000 (17:12 +0000)]
sge_encap() can sometimes return an error with m_head set to NULL.
Make sure not to requeue a freed mbuf in sge_start_locked(). This
should fix a NULL pointer dereference panic.
Bjoern A. Zeeb [Mon, 24 May 2010 16:41:05 +0000 (16:41 +0000)]
MFp4 @178364:
Implement an optional delay to the ddb reset/reboot command.
This allows textdumps to be run automatically with unattended reboots
after a reasonable timeout, while still permitting an administrator to
break into the debugger if attached to the console at the time of the
event for further debugging. Cap the maximum delay at 1 week to avoid
highly accidental results, and default to 15s in case of problems
parsing the timeout value.
Move the hex2dec helper function from db_thread.c to db_command.c to make
it generally available and prefix it with "db_" to avoid namespace
collisions.
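The delay rules above (15s fallback on a parse problem, cap at one week)
can be sketched in userspace like this. The function name
parse_reset_delay() and the use of strtol() are assumptions made for
illustration; the kernel code parses its argument with its own helper, and
treating a missing argument as "no delay" is likewise an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define RESET_DELAY_DEFAULT	15			/* seconds, fallback on parse errors */
#define RESET_DELAY_MAX		(7 * 24 * 60 * 60)	/* one week, in seconds */

/*
 * Hypothetical helper: turn an optional textual delay into seconds.
 * NULL means no delay was requested; unparsable input falls back to the
 * default; oversized values are capped at one week.
 */
static int
parse_reset_delay(const char *arg)
{
	char *end;
	long val;

	if (arg == NULL)
		return (0);
	errno = 0;
	val = strtol(arg, &end, 10);
	if (errno != 0 || end == arg || *end != '\0' || val < 0)
		return (RESET_DELAY_DEFAULT);
	if (val > RESET_DELAY_MAX)
		return (RESET_DELAY_MAX);
	return ((int)val);
}
```

Capping instead of rejecting keeps a mistyped large value from leaving the
machine waiting for an unbounded time.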
Bjoern A. Zeeb [Mon, 24 May 2010 16:27:47 +0000 (16:27 +0000)]
MFp4 @178283:
Improve IPsec flow distribution for better netisr parallelism.
Instead of using the pointer that would have the last bits masked in a %
statement in netisr_select_cpuid() to select the queue, use the SPI.
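Why a pointer makes a poor flow identifier can be illustrated with a small
sketch. The names select_queue() and queues_used() are hypothetical and
this is not the netisr code; it only assumes that the queue is chosen by
taking the flow identifier modulo the number of queues, and that the
pointers in question are aligned so that their low bits are always zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queue selection: flow identifier modulo queue count. */
static unsigned
select_queue(uintptr_t flow_id, unsigned nqueues)
{
	assert(nqueues > 0);
	return ((unsigned)(flow_id % nqueues));
}

/* Count how many distinct queues a set of flow identifiers maps to. */
static unsigned
queues_used(const uintptr_t *ids, size_t nids, unsigned nqueues)
{
	unsigned char seen[64] = { 0 };
	unsigned q, used = 0;
	size_t i;

	assert(nqueues > 0 && nqueues <= 64);
	for (i = 0; i < nids; i++) {
		q = select_queue(ids[i], nqueues);
		if (!seen[q]) {
			seen[q] = 1;
			used++;
		}
	}
	return (used);
}
```

With four queues, four 64-byte-aligned addresses such as 0x1000, 0x1040,
0x1080 and 0x10c0 all land on the same queue, while four consecutive SPI
values spread over all four queues.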
John Baldwin [Mon, 24 May 2010 15:45:05 +0000 (15:45 +0000)]
Add support for corrected machine check interrupts. CMCI is a new local
APIC interrupt that fires when a threshold of corrected machine check
events is reached. CMCI also includes a count of events when reporting
corrected errors in the bank's status register. Note that individual
banks may or may not support CMCI. If they do, each bank includes its own
threshold register that determines when the interrupt fires. Currently
the code uses a very simple strategy where it doubles the threshold on
each interrupt until it succeeds in throttling the interrupt to occur
only once a minute (this interval can be tuned via sysctl). The threshold
is also adjusted on each hourly poll which will lower the threshold once
events stop occurring.
Tested by: Sailaja Bangaru sbappana at yahoo com
MFC after: 1 month
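The throttling strategy described above can be sketched as follows. This
is a simplified model and not the kernel code: the structure and function
names are hypothetical, the maximum threshold value is an assumption, and
halving the threshold on a quiet poll is only one plausible way to lower
it.

```c
#include <stdint.h>

#define CMC_THROTTLE_SECS	60	/* target: at most one interrupt per minute */
#define CMC_MAX_THRESHOLD	0x7fff	/* assumed upper bound of the threshold */

/* Hypothetical per-bank state. */
struct cmc_bank {
	unsigned threshold;	/* events needed to raise an interrupt */
	int64_t last_intr;	/* uptime of the previous interrupt, in seconds */
};

/* On an interrupt: double the threshold if interrupts arrive too fast. */
static unsigned
cmc_on_interrupt(struct cmc_bank *b, int64_t now)
{
	if (now - b->last_intr < CMC_THROTTLE_SECS) {
		if (b->threshold > CMC_MAX_THRESHOLD / 2)
			b->threshold = CMC_MAX_THRESHOLD;
		else
			b->threshold *= 2;
	}
	b->last_intr = now;
	return (b->threshold);
}

/* On a periodic poll: lower the threshold once the bank has been quiet. */
static unsigned
cmc_on_poll(struct cmc_bank *b, int64_t now)
{
	if (now - b->last_intr >= CMC_THROTTLE_SECS && b->threshold > 1)
		b->threshold /= 2;
	return (b->threshold);
}
```

Doubling converges quickly under an error burst, and the poll path lets
the threshold drift back down so that later isolated errors are still
reported promptly.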
Jilles Tjoelker [Mon, 24 May 2010 15:12:12 +0000 (15:12 +0000)]
sh(1): Rework documentation of shell variables.
* Move the "environment variables" that do not need exporting to be
effective or that are set by the shell without exporting to a new section
"Special Variables".
* Add special variables LINENO and PPID.
* Add environment variables LANG, LC_* and PWD; also describe ENV under
environment variables.
Alan Cox [Mon, 24 May 2010 14:26:57 +0000 (14:26 +0000)]
Roughly half of a typical pmap_mincore() implementation is machine-
independent code. Move this code into mincore(), and eliminate the
page queues lock from pmap_mincore().
Push down the page queues lock into pmap_clear_modify(),
pmap_clear_reference(), and pmap_is_modified(). Assert that these
functions are never passed an unmanaged page.
Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m:
Contrary to what the comment says, pmap_mincore() is not simply an
optimization. Without a complete pmap_mincore() implementation,
mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED
because only the pmap can provide this information.
Eliminate the page queues lock from vfs_setdirty_locked_object(),
vm_pageout_clean(), vm_object_page_collect_flush(), and
vm_object_page_clean(). Generally speaking, these are all accesses
to the page's dirty field, which are synchronized by the containing
vm object's lock.
Reduce the scope of the page queues lock in vm_object_madvise() and
vm_page_dontneed().
Alexander Motin [Mon, 24 May 2010 11:40:49 +0000 (11:40 +0000)]
- Implement MI helper functions, dividing one or two timer interrupts with
arbitrary frequencies into hardclock(), statclock() and profclock() calls.
The same code, with minor variations, was duplicated several times over
the tree for different timer drivers and architectures.
- Switch all x86 archs to the new functions, simplifying the code and
removing extra logic from timer drivers. Other archs are also welcome.
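One common way to divide a single timer interrupt of arbitrary frequency
into several slower clocks is an accumulator per clock, sketched here. It
is an illustration of the general technique, not the helper code added by
this commit; the structure, the field names and timer_interrupt() are
hypothetical.

```c
#include <assert.h>

enum { CLK_HARD, CLK_STAT, CLK_PROF, CLK_COUNT };

/* Hypothetical divider state for one hardware timer. */
struct clock_divider {
	unsigned timer_hz;		/* frequency of the hardware timer */
	unsigned freq[CLK_COUNT];	/* wanted hz, stathz and profhz */
	unsigned acc[CLK_COUNT];	/* per-clock accumulators */
	unsigned long calls[CLK_COUNT];	/* how often each clock was due */
};

/*
 * Called once per hardware timer interrupt.  Each accumulator grows by the
 * wanted frequency; whenever it reaches timer_hz, that clock is due.
 */
static void
timer_interrupt(struct clock_divider *cd)
{
	int i;

	assert(cd->timer_hz > 0);
	for (i = 0; i < CLK_COUNT; i++) {
		cd->acc[i] += cd->freq[i];
		while (cd->acc[i] >= cd->timer_hz) {
			cd->acc[i] -= cd->timer_hz;
			/* A kernel would call the matching clock here. */
			cd->calls[i]++;
		}
	}
}
```

With a 2000 Hz timer and wanted rates of 1000, 128 and 1024 Hz, one second
of interrupts produces exactly 1000, 128 and 1024 calls, even though 128
and 1024 do not divide 2000 evenly.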
Jilles Tjoelker [Mon, 24 May 2010 10:35:57 +0000 (10:35 +0000)]
sh: Reap any zombies before forking for a background command.
This prevents accumulating huge amounts of zombies if a script executes
many background commands but no external commands or subshells.
Note that zombies will not be reaped during long calculations (within
the shell process) or read builtins, but those actions do not create
more zombies.
The terminated background commands will also still be remembered by the
shell.
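Reaping without blocking is typically done with waitpid(2) and WNOHANG, as
in the sketch below. This is not the sh(1) source: reap_zombies() is a
hypothetical name, and unlike the shell it discards the exit statuses
instead of remembering them.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>

/*
 * Hypothetical helper: collect every child that has already terminated,
 * without waiting for children that are still running.  Returns the number
 * of children reaped.
 */
static int
reap_zombies(void)
{
	int reaped = 0, status;
	pid_t pid;

	for (;;) {
		pid = waitpid(-1, &status, WNOHANG);
		if (pid > 0) {
			reaped++;
			continue;
		}
		if (pid == -1 && errno == EINTR)
			continue;
		/* 0: children exist but none has exited; -1: no children. */
		break;
	}
	return (reaped);
}
```

Calling such a helper just before fork(2) bounds the number of zombies by
the number of children that exit between two background commands.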
Fix the double counting of the last process thread's td_incruntime
on exit, which is counted once in thread_exit() and a second time in
proc_reap(), by clearing td_incruntime.
Use the opportunity to revert to the pre-RUSAGE_THREAD exporting of ruxagg()
instead of ruxagg_locked() and use it from thread_exit().
The intention of this commit is to let us take full advantage
of libusb(8) ported to Linux. This decreases the possibility of
collisions within the ioctl() "command" space, especially in
relation to the LINUX_SNDCTL_SEQ... stuff.
Basically, we provide commands that are mapped in the kernel
to the correct ones and forwarded to the USB layer. The port enabling
the functionality brought by this patch is here:
http://www.freebsd.org/cgi/query-pr.cgi?pr=146895
Bump __FreeBSD_version to mark the version from which installing the
port makes sense.
This patch should bring no regressions. So far, only i386 is tested.
Jayachandran C. [Mon, 24 May 2010 06:01:37 +0000 (06:01 +0000)]
Remove unused code in sys/mips/rmi:
- ehcireg.h,ehcivar.h : USB related files from old merge
- pcibus.c : was merged into xlr_pci.c earlier
- xlr_boot1_console.c : obsolete console code using bootloader hooks
- sys/mips/rmi/perfmon* : obsolete custom performance monitoring code
Martin Matuska [Sun, 23 May 2010 21:16:34 +0000 (21:16 +0000)]
Remove kstat.zfs.arcstats.l2_write_bytes_written
The arcstats.l2_write_bytes_written kstat counter introduced
in r205231 duplicated the vendor's arcstats.l2_write_bytes counter
imported in r208373 (OpenSolaris revision 8582:df9361868dbe).
Approved by: pjd, delphij (mentor)
MFC after: 3 days
Marius Strobl [Sun, 23 May 2010 19:46:19 +0000 (19:46 +0000)]
Update the sparc64 hardware list regarding machines that will be supported
beginning with 8.1-RELEASE as well as correct some existing entries and
add previously missed ones.
Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
usermode into struct syscall_args. The structure is machine-dependent
(this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
from the syscall. It is a generalization of
cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
return value.
sv_syscallnames - the table of syscall names.
Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().
The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.
syscallenter() fetches the arguments, calls the syscall implementation
from the ABI sysent table, and sets up the return frame. The
end-of-syscall bookkeeping is done by syscallret().
Take advantage of the single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at the syscall entry or return point, respectively. The
EXEC flag augments SCX and notifies the debugger that the process address
space was changed by one of the exec(2)-family syscalls.
The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.
Alexander Motin [Sun, 23 May 2010 07:53:22 +0000 (07:53 +0000)]
Make table-based HPET identification more clever. Before creating a fake
device, make sure we have no real HPET device entry with the same ID.
As a side effect, this potentially allows several HPETs to be attached.
Use the first of them for timecounting; the rest (if ever present) could
later be used as event sources.
Neel Natu [Sat, 22 May 2010 21:38:57 +0000 (21:38 +0000)]
- Use ptpgzone zone to allocate page table pages irrespective of the amount of
memory on a platform. Tested on the Sibyte with 256MB and 1GB memory
configurations.
- Replace vtophys() with MIPS_KSEG0_TO_PHYS() to convert a page table
page's virtual address to physical. We can safely do this because
page table pages are allocated out of KSEG0.
- Add an assertion to verify that when a page table page is freed it
contains all zeroes. We can now use it after allocation without
zeroing it.
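The invariant behind the last item can be sketched in userspace as shown
here. The helper page_is_zero() is hypothetical and not the pmap code; in
the kernel the check would sit behind an assertion macro at the point
where the page table page is freed.

```c
#include <stddef.h>

/*
 * Hypothetical check: return 1 if every byte of the page is zero and 0
 * otherwise.  A page that passes this check when it is freed can be handed
 * out again without being zeroed first.
 */
static int
page_is_zero(const void *page, size_t size)
{
	const unsigned char *p = page;
	size_t i;

	for (i = 0; i < size; i++) {
		if (p[i] != 0)
			return (0);
	}
	return (1);
}
```

Checking at free time moves the cost of zeroing out of the allocation
path, at the price of requiring every user to clear its entries before
releasing the page.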