des [Mon, 5 Jul 2004 11:10:57 +0000 (11:10 +0000)]
Make libalias WARNS?=6-clean. This mostly involves renaming variables
named link, foo_link or link_foo to lnk, foo_lnk or lnk_foo, fixing
signed / unsigned comparisons, and shoving unused function arguments
under the carpet.
I was hoping WARNS?=6 might reveal more serious problems, and perhaps
the source of the -O2 breakage, but found no smoking gun.
Locking cleanup for rl(4).
- Eliminate the use of a recursive mutex.
- Mark the driver INTR_MPSAFE.
This work is incomplete and will be refined in a future commit.
- Most notably, _locked() variants of entry points need to be introduced.
- The mii upcall/downcall may still be racy.
- Add a stubbed-out guard against racing rl_detach() for the time being.
Tested on: UP, debug.mpsafenet && !debug.mpsafenet
Reviewed by: silence on -net
style(9) and whitespace cleanup.
Use C99 types. Use ANSI function definitions. Sort prototypes.
Split long lines correctly. Punctuate/wordsmith comments.
Use device_printf()/if_printf() where possible.
- Eliminate the use of a recursive mutex.
- Mark the driver as INTR_MPSAFE.
- Split the default media choice code out into xl_choose_media() to
avoid making poor assumptions about the state of the lock during attach.
- The miibus upcall/downcall paths may still be racy.
Change to commented-out locking assertions there for now.
- Tested with nfsclient, routed, ssh, ntp, dhclient and quagga bgpd.
- This needs SMP test coverage. I do not have such resources.
Tested on: UP, !debug.mpsafenet && debug.mpsafenet
Hardware: 3C905B-TX (0x905510b7)
Use if_printf() and device_printf() where appropriate, i.e.:
- Use device_printf() during device probe/attach.
- Move if_xname initialization to before xl_reset() is called.
- Use if_printf() at all other times after struct ifnet has been
initialized.
Where syslogd would have fsync()ed a file in the past, instead set a flag
FFLAG_NEEDSYNC and fsync the file when select() next returns zero. This
dramatically speeds up the process of logging large amounts of data, while
leaving the essential semantics (that data can be expected to be on disk
if we crash) unchanged.
In my tests, this speeds up the rc phase of booting by 18-20%. [1]
Workaround a locking problem in vlan(4). vlan_setmulti() may be called
with sleepable locks held from further up in the network stack, and
attempts to allocate memory to hold multicast group membership information
with M_WAITOK.
This panic was triggered specifically when an exiting routing daemon
process closes its raw sockets after joining multicast groups on them.
While we're here, comment some possible locking badness.
Yet another pointy hat: When restoring file flags, it's okay to use the
shared stat buffer, but don't try to access it through an uninitialized
pointer.
Nothing says that /var/log can't be not a directory but a symbolic link
to a directory. Therefore, use stat(2) instead of lstat(2) to check if
/var/log exists.
Make the default memory range in the top 2GB of ram in the hopes that
this more accurately reflects what the underlying hardware of most
acpi machines that don't have children pci busses.
We still need a better way to get this information from acpi/hardware.
Make grep run much (~10x) faster in multibyte locales by caching the wide
character representation of input data across calls to dfaexec(), and by
caching the lengths of character across calls to check_multibyte_string().
meta_p is a void *, so a variable that's of type void * can't be
dereferenced directly. Toss an ifdef around it for the moment and
allow this to compile. This likely means that priority packets aren't
queued to the special high priority queue. The maintainer of this
should look into the problem.
This is likely fallout from the netgraph migration to using a more
generic meta tag from the mbug recently.
Introduce debug.nosleepwithlocks sysctl, 0 by default. If set to 1
and WITNESS is not built, then force all M_WAITOK allocations to
M_NOWAIT behavior (transparently). This is to be used temporarily
if wierd deadlocks are reported because we still have code paths
that perform M_WAITOK allocations with lock(s) held, which can
lead to deadlock. If WITNESS is compiled, then the sysctl is ignored
and we ask witness to tell us wether we have locks held, converting
to M_NOWAIT behavior only if it tells us that we do.
Note this removes the previous mbuf.h inclusion as well (only needed
by last revision), and cleans up unneeded [artificial] comparisons
to just the mbuf zones. The problem described above has nothing to
do with previous mbuf wait behavior; it is a general problem.
green [Sun, 4 Jul 2004 15:59:25 +0000 (15:59 +0000)]
Reextend the M_WAITOK-disabling-hack to all three of the mbuf-related
zones, and do it by direct comparison of uma_zone_t instead of strcmp.
The mbuf subsystem used to provide M_TRYWAIT/M_DONTWAIT semantics, but
this is mostly no longer the case. M_WAITOK has taken over the spot
M_TRYWAIT used to have, and for mbuf things, still may return NULL if
the code path is incorrectly holding a mutex going into mbuf allocation
functions.
The M_WAITOK/M_NOWAIT semantics are absolute; though it may deadlock
the system to try to malloc or uma_zalloc something with a mutex held
and M_WAITOK specified, it is absolutely required to not return NULL
and will result in instability and/or security breaches otherwise.
There is still room to add the WITNESS_WARN() to all cases so that
we are notified of the possibility of deadlocks, but it cannot change
the value of the "badness" variable and allow allocation to actually
fail except for the specialized cases which used to be M_TRYWAIT.
The net.link.ether.bridge.enable sysctl MIB variable enables bridge
functionality by setting to a non-zero value. This is an integer, but
is treated as a boolean by the code, so clamp it to a boolean value
when set so as to avoid unnecessary bridge reinitialization if it's
changed to another value.
simon [Sun, 4 Jul 2004 14:17:41 +0000 (14:17 +0000)]
Add a HARDWARE section which lists supported devices. The actual
device listings has been moved (and in some cases more or less
rewritten) from the DESCRIPTION section.
This will be used later for automatically generating device listings
in the Hardware Notes, by parsing the manual pages.
Reviewed in principle by: ru, hrs, trhodes
No objections: -doc, re
Section name inspired by: NetBSD
Avoid accessing accessing memory past the end of mb_properties in the
degenerate case of fgrep with an empty pattern in a multibyte locale.
Found by phkmalloc.
alfred [Sun, 4 Jul 2004 10:52:54 +0000 (10:52 +0000)]
Introduce a new kevent filter. EVFILT_FS that will be used to signal
generic filesystem events to userspace. Currently only mount and unmount
of filesystems are signalled. Soon to be added, up/down status of NFS.
Introduce a sysctl node used to route requests to/from filesystems
based on filesystem ids.
Introduce a new vfsop, vfs_sysctl(mp, req) that is used as the callback/
entrypoint by the sysctl code to change individual filesystems.
alfred [Sun, 4 Jul 2004 10:19:15 +0000 (10:19 +0000)]
Revision 1.496 would not boot on my system due to
ffs_mount -> bdevvp -> getnewvnode(..., mp = NULL, ...) ->
insmntqueue(vp, mp = NULL) -> KASSERT -> panic
Make getnewvnode() only call insmntqueue() if the mountpoint parameter
is not NULL.
When we traverse the vnodes on a mountpoint we need to look out for
our cached 'next vnode' being removed from this mountpoint. If we
find that it was recycled, we restart our traversal from the start
of the list.
Code to do that is in all local disk filesystems (and a few other
places) and looks roughly like this:
(Take a moment and try to spot the locking error before you read on.)
On a SMP system, one CPU could have removed nvp from our mountlist
but not yet gotten to assign a new value to vp->v_mount while another
CPU simultaneously get to the top of the traversal loop where it
finds that (vp->v_mount != mp) is not true despite the fact that
the vnode has indeed been removed from our mountpoint.
Fix:
Introduce the macro MNT_VNODE_FOREACH() to traverse the list of
vnodes on a mountpoint while taking into account that vnodes may
be removed from the list as we go. This saves approx 65 lines of
duplicated code.
Split the insmntque() which potentially moves a vnode from one mount
point to another into delmntque() and insmntque() which does just
what the names say.
Fix delmntque() to set vp->v_mount to NULL while holding the
mountpoint lock.
Yes, NgRecvAsciiMsg has the same results as NgRecvAsciiMsg, but it's
much more apt to note that it has the same result as NgRecvMsg. Make
the manual page less circular in its reference to this fact.
Fix regression in new version of GNU regex code: bracket expressions
like [X-Y] should match all characters between X-Y according to the
locale's collating order, not by binary value. For now, this only fixes
the !MBS_SUPPORT case (which is the default).
Document /var/run/dmesg.boot, which is created by the rc scripts. Many
people have suggested that we document this somewhere, and this was a common
suggestion.
Use the rman_* functions in preference to reaching into struct resource.
Remove __RMAN_RESOURCE_VISIBLE after compilation confirms it is now not
needed.
Don't define __RMAN_RESOURCE_VISISBLE. They aren't needed here after
I've converted the direct accessing of struct resource members to the
preferred interface.
Change M_WAITOK argument to sodupsockaddr() to M_NOWAIT. When the call
to dup_sockaddr() was renamed to sodupsockaddr(), the argument was
changed from '1' to 'M_WAITOK', which changed the semantics. This
resulted in a WITNESS warning about a potential sleep while holding the
NFS server mutex. Now this will no longer happen, restoring a possible
bug present in the original code (setting RC_NAM even though the malloc
to copy the addres may fail). bde observes that the flag names here
should probably not be the same as the malloc flags for name space
reasons.
Commit the first of half of changes that allow busdma to transparently
honor the alignment and boundary constraints in the dma tag when loading
buffers. Previously, these constraints were only honored when allocating
memory via bus_dmamem_alloc(). Now, bus_dmamap_load() will automatically
use bounce buffers when needed.
Also add a set of sysctls to monitor the global busdma stats. These are:
green [Sat, 3 Jul 2004 18:11:41 +0000 (18:11 +0000)]
Limit mbuma damage. Suddenly ALL allocations with M_WAITOK are subject
to failing -- that is, allocations via malloc(M_WAITOK) that are required
to never fail -- if WITNESS is not defined. While everyone should be
running WITNESS, in any case, zone "Mbuf" allocations are really the only
ones that should be screwed with by this hack.
This hack is crashing people, and would continue to do so with or without
WITNESS. Things shouldn't be allocating with M_WAITOK with locks held,
but it's not okay just to always remove M_WAITOK when !WITNESS.
Reported by: Bernd Walter <ticso@cicely5.cicely.de>
By popular request, add a workaround that allows large (>128GB or so)
FAT32 filesystems to be mounted, subject to some fairly serious limitations.
This works by extending the internal pseudo-inode-numbers generated from
the file's starting cluster number to 64-bits, then creating a table
mapping these into arbitrary 32-bit inode numbers, which can fit in
struct dirent's d_fileno and struct vattr's va_fileid fields. The mappings
do not persist across unmounts or reboots, so it's not possible to export
these filesystems through NFS. The mapping table may grow to be rather
large, and may grow large enough to exhaust kernel memory on filesystems
with millions of files.
Don't enable this option unless you understand the consequences.
SMPng locking cleanup for vr(4).
- Remove recursive locking situations. Remove the MTX_RECURSE bit.
- Take the lock for any routine which is not called from within if_vr.c
itself; this includes entry points called by newbus, ifnet, callout,
ifmedia, and polling subsystems.
- Remove spl references from the code added to miibus callbacks in rev 1.60.
- Add the INTR_MPSAFE bit.
- Tidy up some assignments; locks are not needed for taking the address
of something at a known offset, for example.
- Tested on the machine this was committed from.
Tested on: UP only, !debug.mpsafenet && debug.mpsafenet
Reviewed by: rwatson
Fix SCHED_ULE build on SMP. The previous revision (1.110)
introduced a KSE_CAN_MIGRATE() invocation with one argument
missing (class). Either this is a genuine forget or it crept
in from JHB's repo where he may have modified it. If it's
the latter then it may require more attention. For now fix
the make depend.