emaste [Fri, 26 May 2006 18:17:53 +0000 (18:17 +0000)]
Add sanity checking for QUEUE(3) TAILQs under INVARIANTS (similar to
the LIST checks). Races may lead to list corruption, which can be
difficult to unravel in a post-mortem analysis. These checks verify
that the prev and next pointers are consistent when inserting or
removing elements, thus catching any corruption earlier.
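The kind of consistency check described above can be sketched in userland C. This is a minimal illustration, not the code from the commit: the element type and all helper names (`list_head_ok`, `item_next_ok`, `checked_remove`, and so on) are hypothetical, and `assert()` stands in for whatever the kernel does under INVARIANTS.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical element type; only the TAILQ linkage matters here. */
struct item {
    int value;
    TAILQ_ENTRY(item) link;
};
TAILQ_HEAD(item_list, item);

/* The first element's back pointer must point at the head's first field. */
static int
list_head_ok(struct item_list *h)
{
    return (h->tqh_first == NULL ||
        h->tqh_first->link.tqe_prev == &h->tqh_first);
}

/* The slot that tqh_last points at must be empty (end of list). */
static int
list_tail_ok(struct item_list *h)
{
    return (*h->tqh_last == NULL);
}

/* The next element's back pointer must point at our next field. */
static int
item_next_ok(struct item *it)
{
    return (it->link.tqe_next == NULL ||
        it->link.tqe_next->link.tqe_prev == &it->link.tqe_next);
}

/* The slot our back pointer refers to must hold this element. */
static int
item_prev_ok(struct item *it)
{
    return (*it->link.tqe_prev == it);
}

/* Verify the tail before appending, so corruption is caught at the insert. */
static void
checked_insert_tail(struct item_list *h, struct item *it)
{
    assert(list_tail_ok(h));
    TAILQ_INSERT_TAIL(h, it, link);
}

/* Verify both neighbours before unlinking the element. */
static void
checked_remove(struct item_list *h, struct item *it)
{
    assert(item_next_ok(it));
    assert(item_prev_ok(it));
    TAILQ_REMOVE(h, it, link);
}
```

If a race has overwritten a neighbour's `tqe_prev`, the check fails at the insert or remove that first touches the damaged link, rather than much later when the list is walked.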
netchild [Fri, 26 May 2006 18:06:07 +0000 (18:06 +0000)]
This is the kernel subsystem API documentation generation framework.
It uses doxygen to generate the API documentation. For each subsystem
a very small (about 20 lines with comments) subsystem specific Doxyfile
has to be written (have a look at the README for more). All common doxygen
options are specified in a separate file.
The framework is configured to generate not only the HTML version but also
a PDF version. The paper size is currently hardcoded to DIN A4, and depending
on the subsystem you have to increase some limits in the LaTeX configuration
of your system; the README tells more about this.
It also allows cross-references between the subsystems (it generates doxygen
tag files).
Currently the docs are generated in OBJDIR, but this may change after
coordination with doc@. The makefile is prepared to generate/move various
parts of the generated docs to different destinations.
TARGET_ARCH is respected and some env-vars are set for architecture specific
handling of the source (the README tells more).
Subsystems for which docs are generated:
- cam - crypto - dev_pci
- dev_sound - dev_usb - geom
- i4b - kern - libkern
- linux - net80211 - netgraph
- netinet - netinet6 - netipsec
- opencrypto - vm
rodrigc [Fri, 26 May 2006 11:58:30 +0000 (11:58 +0000)]
Remove calls to vfs_export() for exporting a filesystem for NFS mounting
from individual filesystems. Call it instead in vfs_mount.c,
after we call VFS_MOUNT() for a specific filesystem.
phk [Fri, 26 May 2006 11:52:20 +0000 (11:52 +0000)]
Wrap our drivers' gdb_getc() function so that if it returns -1 we
try again. This way it matches the console behaviour and allows us
to share more code.
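A retry wrapper of the kind described above might look roughly like this. It is a sketch only: `getc_fn`, `getc_retry` and the `fake_getc` stand-in driver are hypothetical names, and the real code may bound or pace the retries differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature for a driver's raw getc routine:
 * returns a character, or -1 when nothing is available yet. */
typedef int getc_fn(void);

/* Keep calling the raw routine until it yields a character. */
static int
getc_retry(getc_fn *raw)
{
    int c;

    assert(raw != NULL);
    do {
        c = raw();
    } while (c == -1);
    return (c);
}

/* Stand-in driver for illustration: reports "no data" twice, then 'g'. */
static int fake_calls;

static int
fake_getc(void)
{
    return (++fake_calls < 3 ? -1 : 'g');
}
```

With the stand-in driver, `getc_retry(fake_getc)` returns `'g'` on the third call to the raw routine.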
phk [Fri, 26 May 2006 10:24:00 +0000 (10:24 +0000)]
GC the cn_dbctl_t hook for consoles, it is unused.
This used to make syscons switch to vty0 when we entered DDB but this
was lost in the KDB shuffle. We may want to bring it back down the road
but it should be done by calling cn_init_t/cn_term_t instead, possibly
with a flag argument saying "Debugger!"
rodrigc [Fri, 26 May 2006 01:21:51 +0000 (01:21 +0000)]
Remove calls to vfs_export() for exporting a filesystem for NFS mounting
from individual filesystems. Call it instead in vfs_mount.c,
after we call VFS_MOUNT() for a specific filesystem.
rodrigc [Fri, 26 May 2006 00:32:21 +0000 (00:32 +0000)]
Remove calls to vfs_export() for exporting a filesystem for NFS mounting
from individual filesystems. Call it instead in vfs_mount.c,
after we call VFS_MOUNT() for a specific filesystem.
imp [Thu, 25 May 2006 23:06:38 +0000 (23:06 +0000)]
APM was calling the suspend process from a timeout. This meant that
other timeouts could not happen while suspending, including timeouts
for things like msleep. This caused the system to hang on suspend
when the cbb was enabled, since its suspend path powered down the
socket which used a timeout to wait for it to be done.
APM now creates a thread when it is enabled, and deletes the thread
when it is disabled. This thread takes the place of the timeout by
doing its polling every ~0.9s. When the thread is disabled, it will
wake up early; otherwise it times out and polls the various things the
old timeout polled (APM events, suspend delays, etc).
This makes my Sony VAIO 505TS suspend/resume correctly when APM is
enabled (ACPI is blacklisted on my 505TS).
This will likely fix other problems with the suspend path where
drivers would sleep with msleep and/or do other timeouts. Maybe
there's some special case code that would use DELAY while suspending
and msleep otherwise that can be revisited and removed.
This was also tested by glebius@, who pointed out that in the patch I
sent him, I'd forgotten apm_saver.c
rodrigc [Thu, 25 May 2006 22:12:05 +0000 (22:12 +0000)]
Ignore SIGPIPE signals on write() failures.
We already check for write() failures and handle EPIPE.
Failure to handle SIGPIPE was resulting in rpc.lockd terminating.
PR: bin/97768
Reported by: Gea-Suan Lin <gslin at csie dot nctu dot edu dot tw>
MFC after: 1 day
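The behaviour behind this fix can be reproduced with a small POSIX sketch (not the rpc.lockd code itself; the function name is hypothetical). With SIGPIPE ignored, a write to a pipe that has no reader fails with EPIPE instead of terminating the process:

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Ignore SIGPIPE, then write to a pipe whose read end is closed.
 * Returns the errno from the failed write, or 0 if it succeeded.
 */
static int
write_to_readerless_pipe(void)
{
    int fds[2], error;

    if (pipe(fds) == -1)
        return (errno);
    close(fds[0]);                      /* no reader remains */

    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR) {
        error = errno;
        close(fds[1]);
        return (error);
    }

    /* Without SIG_IGN this write would kill the process. */
    error = (write(fds[1], "x", 1) == -1) ? errno : 0;
    close(fds[1]);
    return (error);
}
```

A caller that already handles EPIPE from write(), as the commit message says rpc.lockd does, then sees `write_to_readerless_pipe() == EPIPE` and can recover normally.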
jhb [Thu, 25 May 2006 22:04:46 +0000 (22:04 +0000)]
Only reference the firmware module once rather than twice. The extra call
was accidentally added in 1.55 and resulted in an extra reference count
being held on the linker file.
rwatson [Thu, 25 May 2006 15:10:13 +0000 (15:10 +0000)]
Use getsock() and fput() instead of fgetsock() and fputsock() in
sendfile(). This causes sendfile() to use the file descriptor
reference to the socket instead of bumping the socket reference
count, which avoids an additional refcount operation, as well as a
potential expensive socket refcount drop, which can lead to
contention on the accept mutex. This change also has the side
effect of further reducing the number of cases where an in-progress
I/O operation can occur on a socket after close, as using the file
descriptor refcount prevents the socket from closing while in use.
phk [Thu, 25 May 2006 11:21:40 +0000 (11:21 +0000)]
In our system there's no intermediate step between a definitive Supreme
Court decision and violent revolution.
-- Al Gore (New York Magazine, May 29 2006)
rwatson [Thu, 25 May 2006 09:50:14 +0000 (09:50 +0000)]
Add a basic regression test for sendfile() over TCP, which sends varying
lengths of headers and data and makes sure it receives about the right
number of bytes.
ups [Thu, 25 May 2006 01:00:35 +0000 (01:00 +0000)]
Do not set B_NOCACHE on buffers when releasing them in flushbuflist().
If B_NOCACHE is set the pages of vm backed buffers will be invalidated.
However clean buffers can be backed by dirty VM pages so invalidating them
can lead to data loss.
Add support for flushing dirty pages in the data invalidation functions
of some network file systems.
This fixes data losses during vnode recycling (and other code paths
using invalbuf(*,V_SAVE,*,*)) for data written using an mmapped file.
sam [Wed, 24 May 2006 22:11:07 +0000 (22:11 +0000)]
When starting up threads in taskqueue_start_threads create them
stopped before adjusting their priority and setting them on the run
queue so they cannot race for resources (pointed out by njl).
While here, add a console printf when thread creation fails; otherwise
no one may notice (e.g. the return value is always 0 and the caller has
no way to verify).
rwatson [Wed, 24 May 2006 21:04:46 +0000 (21:04 +0000)]
Adjust minimum iod threads from 4 to 0 -- since we compile the NFS
client into the kernel by default, and many users won't use NFS,
don't start an extra 4 kernel threads that are unused. Once NFS
becomes active, it will start nfsiod's as it needs them.
We might consider mandating a minimum number of iods equal to the number of
active NFS mounts (truncated to some value), which would force some
to remain available without having to create a new one if the file
system is mostly inactive.
PR: 70880
MFC after: 2 weeks
Prodded by: cel
Head nod: peter
Pointed out by: Joe <fbsd_user at a1poweruser dot com>
imp [Wed, 24 May 2006 17:26:16 +0000 (17:26 +0000)]
Suspend the children before we turn off card events in hardware. This
was done, I believe, to work around some cards having issues in the
suspend case. I think that this helped my Sony VAIO TS505 work better
when it had certain wireless cards in it and I did an apm -z. I've not
tested suspend/resume on other laptops in a long time, so I hope this
doesn't cause grief. Please let me know if it does.
imp [Wed, 24 May 2006 17:22:53 +0000 (17:22 +0000)]
Fix a race when detaching the cbb worker thread. There were a couple
of cases where we didn't take out the lock before setting or clearing
a bit. This apparently can lead to a race at kldunload time (at least
on my Turion64 laptop, never saw it on my Sony Vaio).
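The class of fix described here (always holding the lock while setting or clearing a flag bit) can be sketched in userland with a pthread mutex. The structure, flag names and helpers below are hypothetical and are not taken from the cbb driver:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical flag bits shared between a worker thread and detach. */
#define SC_WORKER_RUNNING 0x01u
#define SC_WORKER_DONE    0x02u

struct softc {
    pthread_mutex_t mtx;
    unsigned int    flags;
};

/* Read-modify-write of flags happens only with the lock held. */
static void
sc_set_flag(struct softc *sc, unsigned int bit)
{
    assert(sc != NULL);
    pthread_mutex_lock(&sc->mtx);
    sc->flags |= bit;
    pthread_mutex_unlock(&sc->mtx);
}

static void
sc_clear_flag(struct softc *sc, unsigned int bit)
{
    assert(sc != NULL);
    pthread_mutex_lock(&sc->mtx);
    sc->flags &= ~bit;
    pthread_mutex_unlock(&sc->mtx);
}

static unsigned int
sc_get_flags(struct softc *sc)
{
    unsigned int flags;

    assert(sc != NULL);
    pthread_mutex_lock(&sc->mtx);
    flags = sc->flags;
    pthread_mutex_unlock(&sc->mtx);
    return (flags);
}
```

An unlocked `flags |= bit` is a read-modify-write, so two threads updating different bits at the same time can lose one of the updates; taking the lock around every update closes that window.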
cel [Wed, 24 May 2006 15:56:36 +0000 (15:56 +0000)]
While reviewing NFS client for another PR, noticed this omission in the
NFSv4 client READDIR logic. This change matches the logic in the version
2 and 3 code.
mjacob [Wed, 24 May 2006 15:22:21 +0000 (15:22 +0000)]
Make physical buffers in cam_periph_mapmem owned by the kernel in case we
return to user space w/o waiting for I/O to complete.
I tried to get several folks who know this code better than me to review it
with no luck. I *do* know that w/o this code, using the SCSI target driver
panics in userret (if it doesn't panic in knote first).
ghelmer [Wed, 24 May 2006 14:03:51 +0000 (14:03 +0000)]
Revision 1.4 set access for all sensitive files in /proc/<PID> to mode 0
if a process's uid or gid has changed, but the /proc/<PID> directory
itself was also set to mode 0. Assuming this doesn't open any
security holes, open access to the /proc/<PID> directory for users
other than root to read or search the directory.
Reviewed by: des (back in February)
MFC after: 3 weeks
oleg [Wed, 24 May 2006 13:09:55 +0000 (13:09 +0000)]
Implement internal (i.e. inside kernel) packet tagging using mbuf_tags(9).
Since tags are kept while the packet resides in kernel space, it's possible to
use other kernel facilities (like netgraph nodes) for altering those tags.
Submitted by: Andrey Elsukov <bu7cher at yandex dot ru>
Submitted by: Vadim Goncharov <vadimnuclight at tpu dot ru>
Approved by: glebius (mentor)
Idea from: OpenBSD PF
MFC after: 1 month
cperciva [Wed, 24 May 2006 03:34:57 +0000 (03:34 +0000)]
If the user asks for "kernel sources" to be installed, extract the
SRC_BASE package (src/[A-Z]*) as well as SRC_SYS (src/sys/*). This
allows users who only install the kernel source code to use the
modern "make buildkernel" approach.
Discussed with: re (scottl, kensmith)
MFC after: 3 days
iedowse [Wed, 24 May 2006 03:04:11 +0000 (03:04 +0000)]
Attempt to follow the procedure described in section 4.10 of the
EHCI spec for linking in new qTDs into an asynchronous QH. This
requires that there is a qTD marked as not active and not halted
at the start of the QH's list, and the hardware will know to re-fetch
the qTD on each pass rather than just looking at the overlay qTD:
"The host controller must be able to advance the queue from the
Fetch QH state in order to avoid all hardware/software race
conditions. This simple mechanism allows software to simply link
qTDs to the queue head and activate them, then the host controller
will always find them if/when they are reachable."
This is achieved by keeping an "inactivesqtd" entry on the QH list,
and re-using it each time as the start of the next transfer, and
allocating a new qTD to become the next inactivesqtd. Then a new
transfer can be activated by just setting its "active" flag, which
avoids all the previous messing with overlay qTD state in
ehci_set_qh_qtd().
kris [Wed, 24 May 2006 00:06:14 +0000 (00:06 +0000)]
Increase the nfs access cache timeout from 2 to 60. The latter is a
more appropriate value and is also the default set by the kernel. I
could not find a justification of why rc.conf began overriding it back
in 1998.
This dramatically cuts NFS traffic on e.g. a busy system with NFS root.
cel [Tue, 23 May 2006 18:48:07 +0000 (18:48 +0000)]
NFS over TCP retransmit behavior should default to a 60 second time out,
mimicking the NFS reference implementation.
NFS over TCP does not need fast retransmit timeouts, since network loss
and congestion are managed by the transport (TCP), unlike with NFS over
UDP. A long timeout prevents the unnecessary retransmission of non-
idempotent NFS requests.
cel [Tue, 23 May 2006 18:33:58 +0000 (18:33 +0000)]
Refactor the NFS over UDP retransmit timeout estimation logic to allow
the estimator to be more easily tuned and maintained.
There should be no functional change except there is now a lower limit
on the retransmit timeout to prevent the client from retransmitting
faster than the server's disks can fill requests, and an upper limit
to prevent the estimator from taking too long to retransmit during a
server outage.
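The lower and upper limits mentioned above amount to clamping the estimator's output. A minimal sketch, assuming the estimate and both bounds are expressed in the same time unit; the function name and the example bounds are hypothetical and are not the values used by the NFS client:

```c
#include <assert.h>

/*
 * Clamp a retransmit timeout estimate to [min_rto, max_rto].
 * The lower bound keeps the client from retransmitting too quickly;
 * the upper bound keeps it from waiting too long during an outage.
 */
static int
clamp_rto(int estimate, int min_rto, int max_rto)
{
    assert(min_rto <= max_rto);
    if (estimate < min_rto)
        return (min_rto);
    if (estimate > max_rto)
        return (max_rto);
    return (estimate);
}
```

For example, with illustrative bounds of 10 and 100, `clamp_rto(1, 10, 100)` yields 10, `clamp_rto(500, 10, 100)` yields 100, and an in-range estimate such as 42 passes through unchanged.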
iedowse [Tue, 23 May 2006 01:27:23 +0000 (01:27 +0000)]
When usb_event_thread() first starts, wait significantly longer
before starting exploring (4 seconds), and extend the wait period
if new USB buses are attached while waiting.
This works around a problem seen when there is more than one EHCI
controller in the system and you kldload usb.ko after the system
has booted. The problem is that usb.ko contains 3 separate PCI
drivers which get initialised one by one (uhci, ohci, ehci), and
when each driver is initialised, all PCI buses are re-probed after
just the addition of that driver. This means that there can be a
significant delay between the attaching of a companion controller
and the subsequent EHCI attach, so it is possible for the companion
controller's USB 1.x bus to be scanned before the EHCI driver gets
a chance to check if there is really a USB 2.x device connected.
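The "wait, and extend the wait when another bus attaches" logic can be modelled as a deadline that is pushed back on each attach. This sketch makes an assumption the commit message does not state: that each new bus restarts the full 4 second wait. All names are hypothetical, and time is passed in as milliseconds so the logic can be exercised without sleeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXPLORE_WAIT_MS 4000L   /* initial wait from the commit message */

struct explore_wait {
    long deadline_ms;           /* exploring may start at this time */
};

/* Called once when the event thread first starts. */
static void
explore_wait_start(struct explore_wait *w, long now_ms)
{
    assert(w != NULL);
    w->deadline_ms = now_ms + EXPLORE_WAIT_MS;
}

/* Called when a new bus attaches; assumed to restart the full wait. */
static void
explore_wait_bus_attached(struct explore_wait *w, long now_ms)
{
    assert(w != NULL);
    w->deadline_ms = now_ms + EXPLORE_WAIT_MS;
}

/* True once the (possibly extended) wait has elapsed. */
static bool
explore_wait_done(const struct explore_wait *w, long now_ms)
{
    assert(w != NULL);
    return (now_ms >= w->deadline_ms);
}
```

With a start at time 0 and a bus attaching at 3000 ms, exploring is still held off at 4000 ms and is allowed from 7000 ms, which gives a later-attaching EHCI controller time to claim its devices first.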