rwatson [Sat, 28 Jan 2006 21:03:16 +0000 (21:03 +0000)]
Use $FreeBSD$ for the juggle version, rather than $P4$. This is good for
two reasons:
(1) juggle is now maintained in CVS, not P4, so the CVS revision number is
the authoritative one.
(2) Apparently $P4$ requires special handling and juggle was not marked
as needing it, resulting in problems for the P4 importer.
csjp [Sat, 28 Jan 2006 19:24:40 +0000 (19:24 +0000)]
Manage the ucred for the NFS server using the crget/crfree API defined in
kern_prot.c. This API handles reference counting among many other things.
Notably, if MAC is compiled into the kernel, it will properly initialize the
MAC labels when the ucred is allocated.
This work is in preparation for a new MAC entry point which will be responsible
for properly initializing policy specific labels for the NFS server credential.
Utilization of the crfree/crget APIs reduce the complexity associated with
this label's management.
Submitted by: green (with changes) [1]
Obtained from: TrustedBSD Project
Discussed with: rwatson, alfred
[1] I moved the ucred allocation outside the scope of the NFS server lock to
prevent M_WAIKOK allocations from occurring with non-sleep-able locks held.
Additionally, to reduce complexity, the ucred persist as long as the NFS
server descriptor.
pjd [Sat, 28 Jan 2006 14:13:15 +0000 (14:13 +0000)]
- Add a note that passing NULL to pidfile_write(), pidfile_remove() and
pidfile_close() functions is safe. This possibility is used in example code.
- Cast pid_t to int.
jhb [Fri, 27 Jan 2006 23:13:26 +0000 (23:13 +0000)]
Add a basic reader/writer lock implementation to the kernel. This
implementation is by no means perfect as far as some of the algorithms
that it uses and the fact that it is missing some functionality (try
locks and upgrades/downgrades are not there yet), however it does seem
to work in my local testing. There is more detail in the comments in the
code, but the short version follows.
A reader/writer lock is very much like a regular mutex: it cannot be held
across a voluntary sleep; it can be acquired in an interrupt thread; if
the lock is held by a writer then the priority of any threads that block
on the lock will be lent to the owner; the simple case lock operations all
are done in a single atomic op. It also shares some similiarities
with sx locks: it supports reader/writer semantics (multiple readers,
but single writers); readers are allowed to recurse, but writers are not.
We can extend this implementation further by either improving algorithms
or adding new functionality, but this should at least give us a base to
work with now.
Reviewed by: arch (in theory)
Tested on: i386 (4 cpu box with a kernel module that used 4 threads
that randomly chose between read locks and write locks
that ran w/o panicing for over a day solid. It usually
panic'd within a few seconds when there were bugs during
testing. :) The kernel module source is available on
request.)
jhb [Fri, 27 Jan 2006 23:04:43 +0000 (23:04 +0000)]
Oops, commit missed file from the previous change to enable multiple
queues in turnstiles. Add a new thread member td_tsqueue which contains
the sub-queue of a turnstile that a thread is on when it is blocked on a
turnstile.
jhb [Fri, 27 Jan 2006 22:42:12 +0000 (22:42 +0000)]
- Add support for having both a shared and exclusive queue of threads in
each turnstile. Also, allow for the owner thread pointer of a turnstile
to be NULL. This is needed for the upcoming reader/writer lock
implementation.
- Add a new ddb command 'show turnstile' that will look up the turnstile
associated with the given lock argument and display useful information
like the list of threads blocked on each queue, etc. If there isn't an
active turnstile for a lock at the specified address, then the function
will see if there is an active turnstile at the specified address and
display info about it if so.
- Adjust the mutex code to handle the turnstile API changes.
Tested on: i386 (all), alpha, amd64, sparc64 (1 and 3)
jhb [Fri, 27 Jan 2006 22:24:07 +0000 (22:24 +0000)]
Add a new ddb command 'show sleepq'. It takes a wait channel as an
argument and looks for a sleep queue associated with that wait channel.
If it finds one it will display information such as the list of threads
sleeping on that queue. If it can't find a sleep queue for that wait
channel, then it will see if that address matches any of the active
sleep queues. If so, it will display information about the sleepq at the
specified address.
jhb [Fri, 27 Jan 2006 22:22:10 +0000 (22:22 +0000)]
Call WITNESS_CHECK() in the page fault handler and immediately assume it
is a fatal fault if we are holding any non-sleepable locks. This should
cut down on the number of bogus LORs we currently get when the kernel
panics due to a NULL (or bogus) pointer dereference that goes wandering
off into the VM system which tries to acquire locks and then kicks off
the spurious LORs. This should probably be ported to all the archs at
some point.
jhb [Fri, 27 Jan 2006 22:20:15 +0000 (22:20 +0000)]
Add a new macro wrapper WITNESS_CHECK() around the witness_warn() function.
The difference between WITNESS_CHECK() and WITNESS_WARN() is that
WITNESS_CHECK() should be used in the places that the return value of
witness_warn() is checked, whereas WITNESS_WARN() should be used in places
where the return value is ignored. Specifically, in a kernel without
WITNESS enabled, WITNESS_WARN() evaluates to an empty string where as
WITNESS_CHECK evaluates to 0. I also updated the one place that was
checking the return value of WITNESS_WARN() to use WITNESS_CHECK.
jhb [Fri, 27 Jan 2006 22:17:31 +0000 (22:17 +0000)]
Add a new sysctl, debug.ktr.clear. If you write a non-zero value to this
sysctl then it will clear the KTR buffer. Note that if you have active
KTR traces at the same time as a clear operation the behavior is undefined,
though it shouldn't panic.
jkim [Fri, 27 Jan 2006 21:41:49 +0000 (21:41 +0000)]
- Hide 'incorrect geometry warning' in non-interactive mode. A user should
know what they are doing in non-interactive mode. Less scarier warning
goes to debugging info instead.
- Print sanitized geometry to debugging info.
cognet [Fri, 27 Jan 2006 21:11:50 +0000 (21:11 +0000)]
Make sure b_vp and b_bufobj are NULL before calling relpbuf(), as it asserts
they are. They should be NULL at this point, except if we're coming from
swapdev_strategy().
It should only affect the case where we're swapping directly on a file over
NFS.
glebius [Fri, 27 Jan 2006 10:56:22 +0000 (10:56 +0000)]
o Introduce D-Link compat mode, that is default to off and can be set
by NGM_PPPOE_SETMODE message. When D-Link compat mode is on, we will
broadcast PADI with empty Service-Name to all listening hooks.
o Rewrite the compatibility options. Before we had two modes - standard
and non-standard (aka 3Com). Now we have standard mode and two compat
flags, that can be combined.
o Be consistent and do s/STUPID/3COM/g. I don't say that 3Com mode isn't
stupid, just want to make code easier to read.
imp [Fri, 27 Jan 2006 08:25:47 +0000 (08:25 +0000)]
Create mediachg functions for the 3c503 and hpp cards. This is used
to properly configure the right interface to use.
Also call the mediachg function when we set flags UP and are already
running. If this were a pure ifmedia driver, we'd not need to do this
since we'd be ignoring the linkX flags.
This reduces the number of ifdefs to support sub-devices a little as a
nice side effect. It also reduces the number of hpp interfaces
exposed by 33%.
imp [Fri, 27 Jan 2006 08:00:40 +0000 (08:00 +0000)]
Transition from ALTPHYS to LINK2. We already document in the ed(4)
man page that the ifconfig option link2 is used to disable the AUI
transceiver on the 3com boards (should also say HP PC Lan+). This
makes the connection clearer.
Add a note about why we set this flag prior to attaching the device.
We never set or clear the flag later, only test it. There can be no
races here, but this might be asthetically displeasing to some. Also
note that we may no longer need to have this knob at all as we may be
able to do it with the more sophisticated rc.d scripts we have today I
think the only reason it is there is because we didn't used to allow
its proper setting when configured to get the IP address via DHCP.
I'll note that this would be better handled by using ifmedia for all
ed cards, not just those with a miibus...
jasone [Fri, 27 Jan 2006 04:42:10 +0000 (04:42 +0000)]
Add NO_MALLOC_EXTRAS, so that various extra features that can cause
performance degradation can be disabled via something like the following
in /etc/malloc.conf:
ambrisko [Thu, 26 Jan 2006 22:39:12 +0000 (22:39 +0000)]
When the RAID firmware returns a failure, don't hard error the result.
This is important with MegaLib, when issuing a GET_REBUILD_PROG since
it returns an error if the drive is not in rebuild state.
This will be MFC'ed shortly.
Submitted by: ps
Reviewed by: scottl
Found by: ambrisko
marius [Thu, 26 Jan 2006 21:14:32 +0000 (21:14 +0000)]
- Register the generic implementations for the device shutdown, suspend
and resume methods so these events propagate through the device driver
hierarchy.
- In dma(4) enable the chaining of the DMA engine interrupt handler for
the LANCE devices via a dma_setup_intr(). This was commented out before
as I was unsure whether I'd use it but this is probably cleaner than
fiddling with the DMA engine interrupt in the LANCE driver directly.
- In ebus_setup_dinfo() free 'intrs' instead of 'reg' twice in case
setting up a child fails due to routing one of its interrupts fails. [1]
brooks [Thu, 26 Jan 2006 21:05:39 +0000 (21:05 +0000)]
Fix rev 1.12.
/tmp may not be writeable yet when dhclient is first run via
/etc/rc.d/netif so using it may not work. Also, writing to a
predictable file in /tmp as root is a really bad idea since a malicious
user may be able to win a race and insert a symlink which will allow
them to cause any file to be overwritten. To solve these problems,
create the tempory file in /var/run which will exist this early and is
writable only by root.
Security: Local risk if users can cause dhclient to run on demand
(such as by unplugging and replugging the network cable).
cognet [Thu, 26 Jan 2006 20:54:49 +0000 (20:54 +0000)]
Don't attempt to re-create the /dev entry for the slave part if it already
exist when opening the master. This can happen if one open the master, then
open the slave, then close and re-open the master.
njl [Thu, 26 Jan 2006 19:55:29 +0000 (19:55 +0000)]
Since the A-Z range is contained in the previous check, the else-if is
dead code. Clean up both by using isprint() instead, since that's what
it really wants.
marius [Thu, 26 Jan 2006 19:04:18 +0000 (19:04 +0000)]
- Only touch the LED bit of the (LED) AUXIO register when turning the
system LED on or off. Unlike the EBus LED AUXIO register where the
remaining bits are unused the upper bits of the SBus AUXIO register
are used to control other things like the link test enable pin of
the on-board NIC which we don't want to change as a side-effect.
- Remove the superfluous bzero()'ing of the softc obtained from
device_get_softc().
glebius [Thu, 26 Jan 2006 13:06:49 +0000 (13:06 +0000)]
From the RFC2516 it is not clear, what is the correct behavior for a
PPPoE AC, servicing a specific Service-Name, when client sends a PADI
with an empty Service-Name. Should it reply with all available service
names or should it be silent? Our implementation had chosen the latter,
while some other had chosen the former (they say Linux and Cisco). Now
some PPPoE clients appear, that rely on the assumption that AC will
send all names in a PADO reply to a PADI with wildcard Service-Name.
These clients can't connect to FreeBSD AC.
I have requested comments from authors of RFC2516 via email, but
received no reply.
This change makes FreeBSD AC compatible with D-Link DI-614+ and
D-Link DI-624+ SOHO routers, and probably others.
Big thanks to D-Link's Russian office, namely Victor Platov, for
assistance and support in investigation and testing of this change.
Details:
o Split pppoe_match_svc() into three different functions serving
different purposes:
- pppoe_match_svc() - match non-empty Service-Name tag from PADI
against all available hooks in listening state.
- pppoe_find_svc() - check that given Service-Name is not yet
registered.
- pppoe_broadcast_padi() - send a copy of PADI packet with empty
Service-Name tag to all listening hooks.
o For NGM_PPPOE_LISTEN message use pppoe_find_svc().
o In ng_pppoe_rcvdata() in a PADI case use pppoe_match_svc() for
a non-empty Service-Name tag, and pppoe_broadcast_padi() in
either case.
A side effect from the above changes is that now pppoed(8) and mpd
will reply to a empty Service-Name PADI sending a PADO with two
Service-Name tags - an empty one and correct one. This is not fatal,
and will be corrected in pppoed(8) and mpd later. No need to update
node interface version.
jasone [Thu, 26 Jan 2006 08:11:23 +0000 (08:11 +0000)]
Optimize arena_bin_pop() to reduce the number of separator operations.
Remove the block of code that tries to use delayed regions in LIFO order,
since from a policy perspective, it conflicts with LRU caching of newly
coalesced regions in arena_undelay(). There are numerous policy
alternatives, and it isn't readily obvious which (if any) is superior;
this change at least has the virtue of being consistent with policy.
alc [Thu, 26 Jan 2006 05:51:26 +0000 (05:51 +0000)]
Plug a leak in the newer contigmalloc() implementation. Specifically, if
a multipage allocation was aborted midway, the pages that were already
allocated were not always returned to the free list.
cognet [Thu, 26 Jan 2006 01:32:46 +0000 (01:32 +0000)]
Linux compat bits needed to make linux programs use the new ptys :
linux_ioctl.[ch] : Implement LINUX_TIOCGPTN, which returns the pty number
linux_stats.c :
- Return the magic number for devfs.
- In various stats()-related functions, check that we're stating a
file in /dev/pts, and if so, change the st_rdev field to match what linux
expects to be there for a slave pty device. The glibc checks for this, and
their openpty() fails if it is no correct.
cognet [Thu, 26 Jan 2006 01:30:34 +0000 (01:30 +0000)]
Bring in a sysv-style pts implementation, as found in the rwatson_pts perforce branch. It works the same as its SysV/linux counterpart : You obtain a fd to the master pseudo terminal by opening /dev/ptmx, which craetes a node for the master as /dev/pty[num] and a node for the slave as /dev/pts/[num].
It should play nicely with the existing BSD ptys.
By default, the system will use the BSD ptys, one can set the sysctl
kern.pts.enable to 1 to make it use the new pts system.
The max number of pty that can be allocated on a system can be changed with the
sysctl kern.pts.max. It defaults to 1000, and can be increased, but it is not
recommanded, as any pty with a number > 999 won't be handled by whatever uses
utmp(5).