kan [Mon, 16 Aug 2004 15:01:22 +0000 (15:01 +0000)]
Upgrading a lock does not play well together with acquiring an exclusive lock
and can lead to two threads being granted exclusive access. Check that no one
has the same lock in exclusive mode before proceeding to acquire it.
The LK_WANT_EXCL and LK_WANT_UPGRADE bits act as mini-locks and can block
other threads. Normally this is not a problem since the mini locks are
upgraded to full locks and the release of the locks will unblock the other
threads. However, if a thread resets the bits without obtaining a full lock,
other threads are not awoken. Add missing wakeups for these cases.
PR: kern/69964
Submitted by: Stephan Uphoff <ups at tree dot com>
Very good catch by: Stephan Uphoff <ups at tree dot com>
tjr [Mon, 16 Aug 2004 14:18:22 +0000 (14:18 +0000)]
Store a pointer to "null" in struct ndblock's defn member instead of a
duplicate allocated on the heap; the address defn points to is significant,
and is checked against the address of "null" in certain conditionals.
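
A minimal sketch of why the address matters, using hypothetical names rather
than the actual source: an identity test against a sentinel string succeeds
only for the sentinel itself, so a heap duplicate with identical contents
fails it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel; stands in for the "null" string mentioned above. */
static char null_str[] = "";

/* Hypothetical, simplified entry holding only the member of interest. */
struct entry {
	char *defn;
};

/* Identity test: true only when defn points at the sentinel itself. */
static int
is_undefined(const struct entry *e)
{
	return (e->defn == null_str);
}

/*
 * Returns 1 if a heap copy of the sentinel passes the identity test,
 * 0 if it does not, and -1 if the allocation fails.
 */
static int
copy_is_sentinel(void)
{
	struct entry e;
	int r;

	e.defn = malloc(sizeof(null_str));
	if (e.defn == NULL)
		return (-1);
	memcpy(e.defn, null_str, sizeof(null_str));
	r = is_undefined(&e);
	free(e.defn);
	return (r);
}
```

Storing the sentinel pointer directly keeps every such comparison valid; a
content comparison would also work but would change what the conditionals mean.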
obrien [Mon, 16 Aug 2004 11:09:59 +0000 (11:09 +0000)]
I'm not sure what tjr envisioned for turning on FreeBSD/i386 rt support,
but make it COMPAT_IA32 for now.
Fix the 'DEBUG' argument code to unbreak the amd64 LINT build.
simon [Mon, 16 Aug 2004 10:49:45 +0000 (10:49 +0000)]
Remove Wiretek UBRJ4 from the list of supported devices. While it is
detected by the driver, it stops working as soon as it is actually used
for network traffic. Perhaps it can be re-added later when the issues
are resolved.
dwmalone [Mon, 16 Aug 2004 10:00:44 +0000 (10:00 +0000)]
When looking for some extra data to include in the hash, use the
address of the dirhash, rather than the first sizeof(struct dirhash
*) bytes of the structure (which, thankfully, seem to be constant).
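
The difference can be sketched as follows. The structure, the FNV-1a hash and
the function names are assumptions chosen for illustration, not the code from
the tree. Passing &dh hashes the pointer value; passing dh hashes the first
pointer-sized run of bytes inside the object.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real structure. */
struct dirhash_like {
	int dh_a;
	int dh_b;
	int dh_c;
	int dh_d;
};

#define FNV_INIT	2166136261u

/* 32-bit FNV-1a over a byte buffer, continuing from hash value h. */
static uint32_t
fnv1a(const void *buf, size_t len, uint32_t h)
{
	const unsigned char *p = buf;

	while (len-- > 0) {
		h ^= *p++;
		h *= 16777619u;
	}
	return (h);
}

/* Mixes in the pointer value itself, which differs per object. */
static uint32_t
mix_address(const struct dirhash_like *dh)
{
	return (fnv1a(&dh, sizeof(dh), FNV_INIT));
}

/* Mixes in the first sizeof(pointer) bytes of the object's contents. */
static uint32_t
mix_contents(const struct dirhash_like *dh)
{
	return (fnv1a(dh, sizeof(dh), FNV_INIT));
}
```

Two objects with identical leading bytes give the same mix_contents() value,
so that variant adds nothing to distinguish them.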
dwmalone [Mon, 16 Aug 2004 09:38:34 +0000 (09:38 +0000)]
Improve MIME handling. This patch is based on Eugene's patch, but
with the following changes:
1) Don't make a mime_types.h 'cos we should avoid creating variables
in header files,
2) Use strrchr to find the extension, rather than strchr,
3) Slightly simplify the mime-type matching loop.
Any goofs are likely to be mine. Note that there are links to more
improvements by Eugene in the PR.
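
A small sketch of point 2, with hypothetical helper names: for a name that
contains more than one dot, strchr() stops at the first dot while strrchr()
finds the last one, which is where the extension starts.

```c
#include <string.h>

/* Returns the text after the last '.', or NULL if there is no dot. */
static const char *
ext_last(const char *name)
{
	const char *dot = strrchr(name, '.');

	return (dot != NULL ? dot + 1 : NULL);
}

/* Same, but keyed on the first '.'; shown only for comparison. */
static const char *
ext_first(const char *name)
{
	const char *dot = strchr(name, '.');

	return (dot != NULL ? dot + 1 : NULL);
}
```

For "index.en.html", ext_last() yields "html" and ext_first() yields
"en.html", so only the former matches an entry in a MIME type table.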
tjr [Mon, 16 Aug 2004 08:19:18 +0000 (08:19 +0000)]
Add support for 32-bit Linux binary emulation on amd64:
- include <machine/../linux32/linux.h> instead of <machine/../linux/linux.h>
if building with the COMPAT_LINUX32 option.
- make minimal changes to the i386 linprocfs_docpuinfo() function to support
amd64. We return a fake CPU family of 6 for now.
tjr [Mon, 16 Aug 2004 07:55:06 +0000 (07:55 +0000)]
Add preliminary support for running 32-bit Linux binaries on amd64, enabled
with the COMPAT_LINUX32 option. This is largely based on the i386 MD Linux
emulation bits, but also builds on the 32-bit FreeBSD and generic IA-32
binary emulation work.
Some of this is still a little rough around the edges, and will need to be
revisited before 32-bit and 64-bit Linux emulation support can coexist in
the same kernel.
alfred [Mon, 16 Aug 2004 07:51:22 +0000 (07:51 +0000)]
This patch merges the sort fields for both pages, so you can (for
example) view io stats while sorting by process size. Also adds
voluntary and involuntary context-switch stats to the io page because
there was lots of room.
Submitted by: Dan Nelson dnelson at allantgroup.com
tjr [Mon, 16 Aug 2004 07:28:16 +0000 (07:28 +0000)]
Changes to MI Linux emulation code necessary to run 32-bit Linux binaries
on AMD64, and the general case where the emulated platform has different
size pointers than we use natively:
- declare certain structure members as l_uintptr_t and use the new PTRIN
and PTROUT macros to convert to and from native pointers.
- declare some structures __packed on amd64 when the layout would differ
from that used on i386.
- include <machine/../linux32/linux.h> instead of <machine/../linux/linux.h>
if compiling with COMPAT_LINUX32. This will need to be revisited before
32-bit and 64-bit Linux emulation support can coexist in the same kernel.
- other small scattered changes.
tjr [Mon, 16 Aug 2004 07:05:44 +0000 (07:05 +0000)]
Add a new type, l_uintptr_t, which is an unsigned integer type with the
same width as a pointer under Linux. Add two new macros, PTRIN and PTROUT,
which convert between l_uintptr_t and native pointers.
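
A sketch of the shape such definitions might take. It assumes a 32-bit
emulated ABI on a host with wider pointers; the typedef, the macro bodies and
the example structure are written for illustration and are not copied from the
tree.

```c
#include <stdint.h>

/* Assumption: pointers in the emulated ABI are 32 bits wide. */
typedef uint32_t l_uintptr_t;

/* Widen an emulated-ABI pointer value into a native pointer. */
#define PTRIN(v)	((void *)(uintptr_t)(v))
/* Narrow a native pointer; only meaningful if it fits in 32 bits. */
#define PTROUT(v)	((l_uintptr_t)(uintptr_t)(v))

/* Hypothetical structure laid out the way the 32-bit ABI expects. */
struct l_iovec_sketch {
	l_uintptr_t	iov_base;	/* pointer stored as an integer */
	uint32_t	iov_len;
};
```

Keeping the member an integer type fixes the structure layout at 8 bytes
regardless of the native pointer size; the conversion to a real pointer
happens only at the point of use.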
pjd [Mon, 16 Aug 2004 06:23:14 +0000 (06:23 +0000)]
Introduce GEOM RAID3 class, i.e. kernel module, which implements RAID3
transformation and graid3(8) userland utility, which can be used for
configuration. No manual page yet, sorry.
alc [Mon, 16 Aug 2004 06:16:12 +0000 (06:16 +0000)]
- Introduce and use a new tunable "debug.mpsafevm". At present, setting
"debug.mpsafevm" results in (almost) Giant-free execution of zero-fill
page faults. (Giant is held only briefly, just long enough to determine
if there is a vnode backing the faulting address.)
Also, condition the acquisition and release of Giant around calls to
pmap_remove() on "debug.mpsafevm".
The effect on performance is significant. On my dual Opteron, I see a
3.6% reduction in "buildworld" time.
- Use atomic operations to update several counters in vm_fault().
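
A userland analogue of the counter change, assuming C11 <stdatomic.h> rather
than the kernel's own atomic primitives; the counter and function names are
hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical statistics counter that several threads may bump. */
static atomic_uint fault_count;

/* Increment without a lock; relaxed ordering suffices for a pure tally. */
static void
count_fault(void)
{
	atomic_fetch_add_explicit(&fault_count, 1u, memory_order_relaxed);
}

/* Read the current tally. */
static unsigned int
read_fault_count(void)
{
	return (atomic_load_explicit(&fault_count, memory_order_relaxed));
}
```

An atomic read-modify-write keeps the count exact without requiring a global
lock to be held around the update.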
rwatson [Mon, 16 Aug 2004 04:41:03 +0000 (04:41 +0000)]
Always acquire the UNIX domain socket subsystem lock (UNP lock)
before dereferencing sotounpcb() and checking its value, as so_pcb
is protected by protocol locking, not subsystem locking. This
prevents races during close() by one thread and use of the socket
in another.
unp_bind() now asserts the UNP lock, and uipc_bind() now acquires
the lock around calls to unp_bind().
green [Mon, 16 Aug 2004 03:11:09 +0000 (03:11 +0000)]
Rather than bringing back all of the changes to make VM map deletion
wait for system wires to disappear, do so (much more trivially) by
instead only checking for system wires of user maps and not kernel maps.
green [Mon, 16 Aug 2004 03:08:38 +0000 (03:08 +0000)]
Allocate the marker, when scanning a kqueue, from the "heap" instead of the
stack. When swapped out, a process's kernel stack would be unavailable,
and we could get a page fault when scanning the same kqueue.
rwatson [Mon, 16 Aug 2004 01:52:04 +0000 (01:52 +0000)]
Annotate the current UNIX domain socket locking strategies, order,
strengths, and weaknesses in a comment. Assert a copyright over the
changes made as part of the locking work.
silby [Mon, 16 Aug 2004 01:27:24 +0000 (01:27 +0000)]
Major enhancements to pipe memory usage:
- pipespace is now able to resize non-empty pipes; this allows
for many more resizing opportunities
- Backing is no longer pre-allocated for the reverse direction
of pipes. This direction is rarely (if ever) used, so this cuts the
amount of map space allocated to a pipe in half.
- Pipe growth is now much more dynamic; a pipe will now grow when
the total amount of data it contains and the size of the write are
larger than the size of the pipe. Previously, only individual writes greater
than the size of the pipe would cause growth.
- In low memory situations, pipes will now shrink during both read
and write operations, where possible. Once the memory shortage
ends, the growth code will cause these pipes to grow back to an appropriate
size.
- If the full PIPE_SIZE allocation fails when a new pipe is created, the
allocation will be retried with SMALL_PIPE_SIZE. This helps to deal
with the situation of a fragmented map after a low memory period has
ended.
- Minor documentation + code changes to support the above.
In total, these changes increase the total number of pipes that
can be allocated simultaneously, drastically reducing the chances that
pipe allocation will fail.
Performance appears unchanged due to dynamic resizing.
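
The change in the growth rule can be sketched as two predicates. This is a
simplification with hypothetical names; the real code presumably also
considers resource limits and memory pressure, which are left out here.

```c
#include <stddef.h>

/* Old rule as described: only the size of the individual write counts. */
static int
should_grow_old(size_t pipe_size, size_t buffered, size_t write_len)
{
	(void)buffered;
	return (write_len > pipe_size);
}

/*
 * New rule as described: buffered data plus the write must exceed the
 * pipe size. Written as a subtraction so the sum cannot overflow.
 */
static int
should_grow_new(size_t pipe_size, size_t buffered, size_t write_len)
{
	return (buffered > pipe_size || write_len > pipe_size - buffered);
}
```

With a 16384-byte pipe holding 12000 bytes, an 8000-byte write triggers growth
under the new rule but not under the old one.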
mbr [Mon, 16 Aug 2004 00:20:31 +0000 (00:20 +0000)]
MFNetBSD
Decrease log severity to debug if a protocol is not supported by the
kernel (rpcbind checks /etc/netconfig if a protocol is available).
This avoids "rpcbind: cannot create socket for tcp6" messages
at startup on IPv4-only kernels.
njl [Sun, 15 Aug 2004 23:39:37 +0000 (23:39 +0000)]
Comment out the ability to enable/disable ACPI at runtime. This appears
to not work reliably and crash some systems. It is not supported at all
on others. Pending discussion, the underlying ioctls will be removed.
marius [Sun, 15 Aug 2004 22:59:34 +0000 (22:59 +0000)]
Add a kludge for building SBus-only kernels, i.e. kernels without support
for EBus, ISA and PCI, by compiling ofw_isa.c and ofw_pci_if.m
unconditionally. The correct way is to rewrite OF_decode_addr() in
ofw_machdep.c in a bus-neutral way. That's certainly possible but we
unfortunately didn't make it for FreeBSD 5.3.
simon [Sun, 15 Aug 2004 22:33:10 +0000 (22:33 +0000)]
- Handle the '\&' mdoc(7) escape sequence.
- Handle the .Sx macro and give a warning if it is used in the
HARDWARE section, since that will probably produce odd text in the
Hardware Notes.
des [Sun, 15 Aug 2004 22:22:35 +0000 (22:22 +0000)]
Fix a couple of edge cases in which sb.st_size may be incorrect or
meaningless. In particular, don't assume that it is left untouched if
stat(2) fails; that assumption happens to fail at high optimization
levels on some platforms.
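
A defensive pattern along these lines, as a sketch with a hypothetical helper
name: the stat buffer is read only after stat(2) succeeds, and the size is
reported only for regular files. Treating other file types as having no
meaningful size is an assumption of this example.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Returns the size of a regular file, or -1 if stat(2) fails or the
 * path does not name a regular file.
 */
static off_t
file_size(const char *path)
{
	struct stat sb;

	if (stat(path, &sb) == -1)
		return (-1);	/* sb is indeterminate here; do not read it */
	if (!S_ISREG(sb.st_mode))
		return (-1);	/* st_size is not a byte count we can trust */
	return (sb.st_size);
}
```

Pre-initializing sb.st_size and hoping a failed call leaves it alone is the
assumption the commit removes.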
simon [Sun, 15 Aug 2004 22:14:29 +0000 (22:14 +0000)]
- Auto generate device listings for the following drivers: mpt, trm,
rl, vr, dc, de, and gem.
- hme(4) is not sparc64 only anymore, so update dev.archlist.txt
accordingly.
des [Sun, 15 Aug 2004 21:58:02 +0000 (21:58 +0000)]
Release the vnode cache mutex when calling vgone(), since vgone() may
sleep. This makes pfs_exit() even less efficient than before, but on
the bright side, the vnode cache mutex no longer needs to be recursive.
marius [Sun, 15 Aug 2004 21:37:52 +0000 (21:37 +0000)]
Correct some uses of the wrong members of the *min()/*max() family, e.g.
min() on unsigned long. None of these are believed to have been fatal though.
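
A sketch of the hazard, assuming a platform where unsigned long is wider than
unsigned int. The function names are hypothetical; they model a min() that
takes unsigned int and a sibling that takes unsigned long.

```c
/* Modeled on a min() whose parameters are unsigned int. */
static unsigned int
min_uint(unsigned int a, unsigned int b)
{
	return (a < b ? a : b);
}

/* Variant whose parameters are unsigned long. */
static unsigned long
min_ulong(unsigned long a, unsigned long b)
{
	return (a < b ? a : b);
}

/*
 * Wrong pairing: the casts spell out the conversion that would otherwise
 * happen silently when unsigned long values are passed to min_uint().
 */
static unsigned long
clamp_wrong(unsigned long len, unsigned long limit)
{
	return (min_uint((unsigned int)len, (unsigned int)limit));
}

/* Right pairing: argument and parameter widths match. */
static unsigned long
clamp_right(unsigned long len, unsigned long limit)
{
	return (min_ulong(len, limit));
}
```

The two agree whenever both values fit in unsigned int, which is consistent
with the misuse having gone unnoticed.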
alc [Sun, 15 Aug 2004 20:54:25 +0000 (20:54 +0000)]
- Make pmap_emulate_reference() MP and preemption safe. Previously, it
contained "sanity" checks that could be violated if another CPU modified
the pmap between the emulation trap and locking the pmap in
pmap_emulate_reference(). As a result, the pte could be inconsistent
with the access that caused the emulation trap. In such cases,
pmap_emulate_reference() now flushes the current CPU's TLB entry and
returns.
- Make pmap_changebit() an inline function, reducing object code size.
simon [Sun, 15 Aug 2004 20:54:07 +0000 (20:54 +0000)]
- Add a HARDWARE section which lists supported devices.
- Add the manufacturer name to each item in the device list.
- Make the note about supporting "IBM e335" into a general list and
change the entry to use the full product name ("IBM eServer xSeries
335").
- Add Dell PowerEdge 1750 to the list of systems with mpt onboard.
marius [Sun, 15 Aug 2004 20:17:29 +0000 (20:17 +0000)]
- Correct the description of the "local-mac-address?" variable. Not all NICs
use it, only those with FCode. Add references to dc(4), gem(4) and hme(4)
for obtaining further information about such devices presently supported
by FreeBSD.
- Correct the HISTORY section. There was an eeprom(8) utility in 4.4BSD and
early versions of FreeBSD 2.x.
- Add an AUTHORS section.
truckman [Sun, 15 Aug 2004 19:17:23 +0000 (19:17 +0000)]
Yet another tweak to the shutdown messages in boot():
Don't count busy buffers before the initial call to sync() and
don't skip the initial sync() if no busy buffers were found.
Always call sync() at least once if syncing is requested. This
defers the "Syncing disks, buffers remaining..." message until
after the initial sync() call and the first count of busy
buffers. This backs out changes in kern_shutdown 1.162.
Print a different message when there are no busy buffers after the
initial sync(), which is now the expected situation.
Print an additional message when syncing has completed successfully
in the unusual situation where the work of syncing was done by
boot().
Uppercase one message to make it consistent with all of the other
kernel shutdown messages.
Discussed with: bde (in a much earlier form, prior to 1.162)
Reviewed by: njl (in an earlier form)
rwatson [Sun, 15 Aug 2004 19:10:05 +0000 (19:10 +0000)]
Add a "fillchar" command line argument to dd(1) that permits the user
to specify an alternative padding character when using a conversion
mode, or when using noerror with sync and an input error occurs. This
facilitates reading old and error-prone media by allowing the user to
more effectively mark error blocks in the output stream.
simon [Sun, 15 Aug 2004 18:09:47 +0000 (18:09 +0000)]
- Add a HARDWARE section which lists supported devices.
- Remove reference to the NOTES section in the entry for Sun DMFE,
since it doesn't work well with the auto generated Hardware Notes. [1]
rwatson [Sun, 15 Aug 2004 18:02:09 +0000 (18:02 +0000)]
Add an "options MP_WATCHDOG" to i386. This option allows one of the
logical CPUs on a system to be used as a dedicated watchdog to cause a
drop to the debugger and/or generate an NMI to the boot processor if
the kernel ceases to respond. A sysctl enables the watchdog running
out of the processor's idle thread; a callout is launched to reset a
timer in the watchdog. If the callout fails to reset the timer for ten
seconds, the watchdog will fire. The sysctl allows you to select which
CPU will run the watchdog.
A sample "debug.leak_schedlock" is included, which causes a sysctl to
spin holding sched_lock in order to trigger the watchdog. On my Xeons,
the watchdog is able to detect this failure mode and break into the
debugger, which cannot otherwise be done without an NMI button.
This option does not currently work with sched_ule due to ule's push
notion of scheduling, similar to machdep.hlt_logical_cpus failing to
work with that scheduler.
On face value, this might seem somewhat inefficient, but there are a
lot of dual-processor Xeons with HTT around, so using one as a watchdog
for testing is not as inefficient as one might fear.
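
The reset-or-fire logic described above can be modeled in a few lines. This is
a simplified sketch with hypothetical names and a seconds-based clock passed
in by the caller; it does not reflect how the kernel option is implemented.

```c
#include <stdint.h>

/* Timeout taken from the description above. */
#define WATCHDOG_TIMEOUT_SEC	10

/* Hypothetical watchdog state: time of the most recent reset. */
struct watchdog_sketch {
	uint64_t last_reset;
};

/* Called periodically by the healthy system (the callout's role). */
static void
watchdog_reset(struct watchdog_sketch *w, uint64_t now)
{
	w->last_reset = now;
}

/* Polled by the dedicated CPU; true once the timeout has elapsed. */
static int
watchdog_expired(const struct watchdog_sketch *w, uint64_t now)
{
	return (now - w->last_reset >= WATCHDOG_TIMEOUT_SEC);
}
```

As long as resets keep arriving, the poll never sees the timeout elapse; a
thread spinning with a scheduler lock held stops the resets and lets it fire.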