phk [Sun, 29 Dec 2002 11:18:25 +0000 (11:18 +0000)]
Use a timeout of one second while we wait for the vnode washer,
this prevents a potential race and makes the system a little bit
less jerky under extreme loads.
phk [Sun, 29 Dec 2002 10:39:05 +0000 (10:39 +0000)]
Vnodes pull in 800-900 bytes these days, all things counted, so we need
to treat desiredvnodes much more like a limit than as a vague concept.
On a 2GB RAM machine where desired vnodes is 130k, we run out of
kmem_map space when we hit about 190k vnodes.
If we wake up the vnode washer in getnewvnode(), sleep until it is done,
so that it has a chance to offer us a washed vnode. If we don't sleep
here we'll just race ahead and allocate yet another vnode which will never
get freed.
In the vnodewasher, instead of doing 10 vnodes per mountpoint per
rotation, do 10% of the vnodes distributed evenly across the
mountpoints.
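The new washer quota is simple arithmetic, sketched below. This is an illustration only, not the committed kernel code: the helper name wash_quota_per_mount() and its parameters are hypothetical, and the real vnode washer may round or clamp differently.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the kernel source): take 10% of all
 * vnodes as the work for one rotation and split that share evenly
 * across the mounted filesystems.
 */
static size_t
wash_quota_per_mount(size_t numvnodes, size_t nmounts)
{
	size_t total;

	if (nmounts == 0)
		return (0);		/* nothing mounted, nothing to wash */
	total = numvnodes / 10;		/* 10% of all vnodes */
	return (total / nmounts);	/* even share per mountpoint */
}
```

With 130k vnodes and four mountpoints this gives 3250 vnodes per mountpoint per rotation, rather than a fixed 10.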
phk [Sun, 29 Dec 2002 10:32:16 +0000 (10:32 +0000)]
There is some sort of race/deadlock which I have not identified
here. It manifests itself by sendmail hanging in "fifoow" during
boot on a diskless machine with sendmail disabled.
Giving the sleep a 1sec timeout breaks the deadlock, but does not solve
the underlying problem.
alc [Sun, 29 Dec 2002 07:17:06 +0000 (07:17 +0000)]
Reduce the number of times that we acquire and release the page queues
lock by making vm_page_rename()'s caller, rather than vm_page_rename(),
responsible for acquiring it.
jake [Sat, 28 Dec 2002 23:57:52 +0000 (23:57 +0000)]
- Moved storing %g1-%g5 in the trapframe until after interrupts are enabled.
- Restore %g6 and %g7 for kernel traps if we are returning to prom code.
This allows complex traps (ones that call into C code) to be handled from
the prom.
dillon [Sat, 28 Dec 2002 23:39:47 +0000 (23:39 +0000)]
Add 'swapctl' - as a hardlink to swapon/swapoff, and augment swapon with
swapctl functionality. The idea is to create a swapctl command that is
fairly close to the OpenBSD and NetBSD version. FreeBSD does not implement
swap priority (and it would be a mistake if we did) so we didn't bother with
that part of it.
rwatson [Sat, 28 Dec 2002 23:33:09 +0000 (23:33 +0000)]
Since our default boot block now supports UFS1 and UFS2 even on
i386, remove the seatbelt preventing users from setting the UFS2 flag
on the root file system on i386. This seatbelt did not exist on
other platforms.
phk [Sat, 28 Dec 2002 22:17:29 +0000 (22:17 +0000)]
It is bad style to define the same structure in multiple header
files which might be included together.
Things like debuggers and lint-like programs get their knickers in
a twist (rightly so one might add) when they find different locations
for the same named struct depending on which .h file was included
first.
This is a stellar example of Very Bad Thinking on the part of the
standards dudes who wrote that both sys/uio.h and sys/socket.h
should define struct iovec the same way.
Fix this by putting struct iovec into its own miniature sys/_iovec.h
file and #include that from sys/socket.h and sys/uio.h.
Sensible people could just put iovec into sys/_types.h but there
is probably some standard or other which will be violated if we
did something that horrible.
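The single-definition pattern can be sketched in one file as below. The names are deliberately hypothetical (struct my_iovec, MY_IOVEC_DECLARED, my_iov_total) so the sketch does not collide with the system's own struct iovec; the actual contents of sys/_iovec.h are not reproduced here.

```c
#include <stddef.h>

/*
 * Contents of a hypothetical shared mini-header.  Both consumer headers
 * would #include it instead of carrying their own copy of the struct,
 * so tools only ever see one definition site.
 */
#ifndef MY_IOVEC_DECLARED
#define MY_IOVEC_DECLARED
struct my_iovec {
	void	*iov_base;	/* base address of the buffer */
	size_t	 iov_len;	/* length of the buffer in bytes */
};
#endif /* !MY_IOVEC_DECLARED */

/* Hypothetical consumer: sum the lengths of an array of buffers. */
static size_t
my_iov_total(const struct my_iovec *iov, size_t cnt)
{
	size_t i, total = 0;

	for (i = 0; i < cnt; i++)
		total += iov[i].iov_len;
	return (total);
}
```

The include guard makes a second inclusion a no-op, so the order in which the two consumer headers are pulled in no longer matters.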
dillon [Sat, 28 Dec 2002 21:15:39 +0000 (21:15 +0000)]
vm_pager_put_pages() takes VM_PAGER_* flags, not OBJPC_* flags. It just
so happens that OBJPC_SYNC has the same value as VM_PAGER_PUT_SYNC so no
harm done. But fix it :-)
dillon [Sat, 28 Dec 2002 21:03:42 +0000 (21:03 +0000)]
Allow the VM object flushing code to cluster. When the filesystem syncer
comes along and flushes a file which has been mmap()'d SHARED/RW, with
dirty pages, it was flushing the underlying VM object asynchronously,
resulting in thousands of 8K writes. With this change the VM Object flushing
code will cluster dirty pages in 64K blocks.
Note that until the low memory deadlock issue is reviewed, it is not safe
to allow the pageout daemon to use this feature. Forced pageouts still
use fs block size'd ops for the moment.
rwatson [Sat, 28 Dec 2002 14:58:50 +0000 (14:58 +0000)]
Change ACPI make_dev() calls to use UID_ and GID_ constants rather
than hard-coded uids and gids.
Switch the device to a group of wheel instead of operator.
Narrow down the permissions on the device to require root privilege
to manipulate the system power state. It may be that we can broaden
access to the device after review of the access control in ACPI.
julian [Sat, 28 Dec 2002 01:23:07 +0000 (01:23 +0000)]
Add code to ddb to allow backtracing an arbitrary thread.
(show thread {address})
Remove the IDLE kse state and replace it with a change in
the way threads share KSEs. Every KSE now has a thread, which is
considered its "owner"; however, a KSE may also be lent to other
threads in the same group to allow completion of in-kernel work.
In this case the owner remains the same and the KSE will revert to the
owner when the other work has been completed.
All creation of upcalls etc. is now done from
kse_reassign() which in turn is called from mi_switch or
thread_exit(). This means that special code can be removed from
msleep() and cv_wait().
kse_release() does not leave a KSE with no thread any more but
converts the existing thread into the KSE's owner, and sets it up
for doing an upcall. It is just inhibited from being scheduled until
there is some reason to do an upcall.
Remove all trace of the kse_idle queue since it is no longer needed.
"Idle" KSEs are now on the loanable queue.
jake [Fri, 27 Dec 2002 19:31:26 +0000 (19:31 +0000)]
Define UMA_MD_SMALL_ALLOC so that uma_small_alloc and uma_small_free will
be used for zones that allocate objects of less than 1 page. The biggest advantage
of this is that all of a sudden the majority of kernel malloc-ed data doesn't
need kva allocated for it. Besides microbenchmarks I haven't seen a measurable
performance improvement from doing this.
rwatson [Fri, 27 Dec 2002 18:20:16 +0000 (18:20 +0000)]
Re-add MNT_ACLS to the list of "updateable" mount flags, per our
documentation. Generally, you really shouldn't twiddle the flag,
but there are sensible scenarios where one might.
rwatson [Fri, 27 Dec 2002 17:50:39 +0000 (17:50 +0000)]
Use UID_ and GID_ constants instead of hard-coded numeric values
with make_dev(). Use OPERATOR instead of implicit WHEEL to match
other storage devices. Use a mode of 0640 to be consistent
with other storage devices.
iedowse [Fri, 27 Dec 2002 17:43:25 +0000 (17:43 +0000)]
Bridged packets are supplied to the firewall with their IP header
in network byte order, but icmp_error() expects the IP header to
be in host order and the code here did not perform the necessary
swapping for the bridged case. This bug causes an "icmp_error: bad
length" panic when certain length IP packets (e.g. ip_len == 0x100)
are rejected by the firewall with an ICMP response.
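To see why a missing swap matters, here is a small userland sketch of what happens to a 16-bit length field that is read in the wrong byte order. swap16() is a hypothetical stand-in written out by hand; kernel code would use the usual ntohs()/htons() conversions, and this is not the committed fix.

```c
#include <stdint.h>

/*
 * Hypothetical helper: exchange the two bytes of a 16-bit value.  On a
 * little-endian host this is the conversion that turns a network-order
 * field into host order, and back again.
 */
static uint16_t
swap16(uint16_t v)
{
	return ((uint16_t)((v << 8) | (v >> 8)));
}
```

A length of 0x0100 (256) becomes 0x0001 when the swap is skipped or applied once too often, so code that trusts the field sees a wildly different packet length.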
iedowse [Fri, 27 Dec 2002 17:15:16 +0000 (17:15 +0000)]
Oops, I misread the purpose of the NULL check in EH_RESTORE() in
revision 1.62. It was checking for M_PREPEND() failing, not for the
case of a NULL mbuf pointer being supplied to the macro. Back out
that revision, and fix the NULL dereference by not calling EH_RESTORE()
in the case where the mbuf pointer is NULL because the firewall
rejected the packet.
rwatson [Fri, 27 Dec 2002 16:44:11 +0000 (16:44 +0000)]
Use UID_ROOT and GID_WHEEL for uid/gid arguments to make_dev().
Remove the setgid bit from the tga device (?).
Synchronize mode with modes used for related frame buffer devices
in MAKEDEV (tga doesn't appear in MAKEDEV).
rwatson [Fri, 27 Dec 2002 16:40:54 +0000 (16:40 +0000)]
Make use of UID_ROOT, GID_WHEEL for make_dev() arguments.
Remove the setgid bit from the 3dfx device (?).
Synchronize permissions with the values in MAKEDEV for consistency.
rwatson [Fri, 27 Dec 2002 16:28:31 +0000 (16:28 +0000)]
Remove S_IROTH from the make_dev() lines for iir-related devices. This
improves protection consistency with other storage devices (generally
root:operator,660). This driver appears not to have an active
maintainer.
phk [Fri, 27 Dec 2002 11:05:05 +0000 (11:05 +0000)]
Use three UMA zones for FFS/UFS inodes instead of malloc space.
Since inodes are currently 144 bytes, this will save 112 bytes per
inode. This can amount to up to 10MByte on large systems.
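The 112-byte figure is consistent with a general-purpose allocator that rounds each request up to the next power of two (256 - 144 = 112). That rounding rule is an assumption made for this sketch, not something stated in the commit, and both helper names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: smallest power of two that is >= n. */
static size_t
roundup_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return (p);
}

/*
 * Bytes lost per object if the allocator hands out power-of-two sized
 * chunks (assumed) instead of exactly sized zone items.
 */
static size_t
waste_per_object(size_t size)
{
	return (roundup_pow2(size) - size);
}
```

Under that assumption, 10 MByte of savings at 112 bytes each corresponds to somewhere around 90,000 cached inodes.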
ru [Fri, 27 Dec 2002 10:09:04 +0000 (10:09 +0000)]
POLA dictates that in the file designated with the -f option
argument, leading whitespace and empty lines be ignored, and
the `#' character marks the rest of the line as a comment.
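The rules described above can be sketched as a line-cleaning helper. This is an illustration of the stated behaviour only; clean_line() is a hypothetical name and the utility's real parser is not shown in the log.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: normalize one input line in place.
 * Cuts the line at the first '#', drops the trailing newline and skips
 * leading whitespace.  Returns a pointer to the first significant
 * character, or NULL if nothing is left (empty or comment-only line).
 */
static char *
clean_line(char *line)
{
	char *p, *hash;

	hash = strchr(line, '#');
	if (hash != NULL)
		*hash = '\0';			/* rest of line is a comment */
	line[strcspn(line, "\n")] = '\0';	/* drop trailing newline */
	p = line;
	while (isspace((unsigned char)*p))	/* skip leading whitespace */
		p++;
	return (*p == '\0' ? NULL : p);
}
```

A caller would read the file line by line and simply skip every line for which the helper returns NULL. Trailing whitespace before a comment is left in place, since the commit message does not say how it is treated.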
jake [Fri, 27 Dec 2002 01:50:29 +0000 (01:50 +0000)]
- Use direct mapped addresses for the message buffer, for the crash dump
mappings, and for pmap_map which is used to map the vm_page structures.
- Don't allocate kva space for any of the above.
hsu [Fri, 27 Dec 2002 00:24:35 +0000 (00:24 +0000)]
Long chain of calls starting with bridge_on(), going through IPv6, and
ending up at ifa_ifwithdstaddr() could lead to a recursive lock of
the ifnet list mutex.
tjr [Thu, 26 Dec 2002 14:34:18 +0000 (14:34 +0000)]
Add an implementation of the POSIX wordexp() and wordfree() functions,
which perform shell-style word expansion on strings. This is still a
little rough around the edges.
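A minimal usage sketch of the POSIX interface follows. wordexp(), wordfree(), wordexp_t and its we_wordc member are standard; count_words() is a hypothetical wrapper written for this example, and the behaviour shown is what POSIX specifies rather than a statement about this particular implementation.

```c
#include <stddef.h>
#include <wordexp.h>

/*
 * Hypothetical wrapper: expand a string the way a shell would and
 * return the number of resulting words, or -1 if expansion fails.
 */
static long
count_words(const char *s)
{
	wordexp_t we;
	long n;

	if (wordexp(s, &we, 0) != 0)
		return (-1);		/* e.g. syntax error or bad character */
	n = (long)we.we_wordc;		/* words are in we.we_wordv[0..n-1] */
	wordfree(&we);			/* release the expanded word list */
	return (n);
}
```

Quoting is honoured during expansion, so a double-quoted phrase counts as a single word.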
iedowse [Thu, 26 Dec 2002 13:25:57 +0000 (13:25 +0000)]
When resuming after a system suspend, re-issue the UHCI_CMD_MAXP
command in case this setting was not saved. Since bandwidth reclamation
(-current only) often results in bus activity continuing to the end
of every frame, most transfers would fail with IOERROR if this
setting is missed.