In revision 1.29 timeout() was converted to ng_callout().
The difference is that the callout function installed via the
ng_callout() method is guaranteed to NOT fire after the shutdown
method was run (when a node is marked NGF_INVALID). Also, the
shutdown method and the callout function are guaranteed to NOT
run at the same time, as both require the writer lock. Thus
we can safely ignore a zero return value from ng_uncallout()
(callout_stop()) in shutdown methods, and go on with freeing
the node.
The said revision broke the node shutdown -- ng_bridge_timeout()
is no longer fired after ng_bridge_shutdown() was run, resulting
in a memory leak, dead nodes, and inability to unload the module.
Fix this by cancelling the callout on shutdown, and moving part
responsible for freeing a node resources from ng_bridge_timer()
to ng_bridge_shutdown().
scottl [Wed, 9 Feb 2005 11:44:15 +0000 (11:44 +0000)]
Provide locking for the ccb_bioq. This allows xpt_done() to be called without
Giant held. In camisr(), move the ccb_bioq elements to a temporary local list
and then process the elements off of that list. This enables the list to be
processed by only taking the ccb_bioq_lock once and only for a very short
time.
ccb_bioq_lock is a leaf mutex, so it's fine to call xpt_done() with other
locks held. This is just a very minor step in the work to lock CAM, but
it allows us to avoid some messy locking/unlock dances in certain drivers.
pjd [Wed, 9 Feb 2005 08:29:39 +0000 (08:29 +0000)]
- Remove g_gate_hold()/g_gate_release() from start/done paths. It saves
4 mutex operations per I/O requests.
- Use only one mutex to protect both (incoming and outgoing) queue.
As MUTEX_PROFILING(9) shows, there is no big contention for this lock.
- Protect sc_queue_count with queue mutex, instead of doing atomic
operations on it.
- Remove DROP_GIANT()/PICKUP_GIANT() - ggate is marked as MPSAFE and no
Giant there.
imp [Wed, 9 Feb 2005 06:16:27 +0000 (06:16 +0000)]
Remove DLINK_3, its unused. Remove NETGEAR FA410TX, since it is the
same as the LINKSYS COMBO_ECARD (which also seems to be the same as
another linksys product that also has a modem, but I can't find that
one at the moment). Remove the PCM100, since it is now no longer
used.
imp [Wed, 9 Feb 2005 06:03:36 +0000 (06:03 +0000)]
o Remove duplicate LINKSYS ETHERFAST entry.
o The COMBO_ECARD comes in many flavors, it seems, so probe both the DL10019
and the AX88x90 on it. Since this seems to work with no ill effects, maybe
the probing should happen more generally rather than being table driven.
Need to think more about this.
o Remove PCM100 because it is duplicative (the ETHERFAST is the pcm100 and
apparently has the same IDs). It was here for NetBSD because they match
up an expected MAC address OID, but since we don't bother with that, we
don't need to be so finely discriminating.
o Minor style nit.
imp [Wed, 9 Feb 2005 00:06:12 +0000 (00:06 +0000)]
o Remove ifdef PC98, since this file has diverged quite a bit from
if_ed_isa.c, and they seem to not be helpful anymore.
o Fix style issues from de-Pification.
o change from _isa_ to _cbus_ to the largest extent possible to reflect that
this is really for cbus, not isa.
o Use ANSI function definitions.
o Use ed_clear_memory
o eliminate kvtop
jeff [Tue, 8 Feb 2005 23:27:10 +0000 (23:27 +0000)]
- Add a new assert in the getnewvnode(). Assert that the usecount is still
0 to detect getnewvnode() races.
- Add the vnode address to a few panics near by to help in debugging.
pjd [Tue, 8 Feb 2005 22:15:24 +0000 (22:15 +0000)]
- Add debug.watchdog tunable, so we can specify watchdog CPU from loader
which will help to debug hangs on boot.
- Remove 'U' from debug.watchdog sysctl definition, so if we set it to '-1'
it really shows '-1'.
- Fix comment.
cperciva [Tue, 8 Feb 2005 21:31:11 +0000 (21:31 +0000)]
Add a new sysctl, "security.jail.chflags_allowed", which controls the
behaviour of chflags within a jail. If set to 0 (the default), then a
jailed root user is treated as an unprivileged user; if set to 1, then
a jailed root user is treated the same as an unjailed root user.
This is necessary to allow "make installworld" to work inside a jail,
since it attempts to manipulate the system immutable flag on certain
files.
phk [Tue, 8 Feb 2005 20:29:10 +0000 (20:29 +0000)]
Background writes are entirely an FFS/Softupdates thing.
Give FFS vnodes a specific bufwrite method which contains all the
background write stuff and then calls into the default bufwrite()
for the rest of the job.
Remove all the background write related stuff from the normal bufwrite.
This drags the softdep_move_dependencies() back into FFS.
Long term, it is worth looking at simply copying the data into
allocated memory and issuing the bio directly and not create the
"shadow buf" in the first place (just like copy-on-write is done
in snapshots for instance). I don't think we really gain anything
but complexity from doing this with a buf.
jhb [Tue, 8 Feb 2005 20:25:07 +0000 (20:25 +0000)]
Use the local APIC timer to drive the various kernel clocks on SMP machines
rather than forwarding interrupts from the clock devices around using IPIs:
- Add an IDT vector that pushes a clock frame and calls
lapic_handle_timer().
- Add functions to program the local APIC timer including setting the
divisor, and setting up the timer to either down a periodic countdown
or one-shot countdown.
- Add a lapic_setup_clock() function that the BSP calls from
cpu_init_clocks() to setup the local APIC timer if it is going to be
used. The setup uses a one-shot countdown to calibrate the timer. We
then program the timer on each CPU to fire at a frequency of hz * 3.
stathz is defined as freq / 23 (hz * 3 / 23), and profhz is defined as
freq / 2 (hz * 3 / 2). This gives the clocks relatively prime divisors
while keeping a low LCM for the frequency of the clock interrupts.
Thanks to Peter Jeremy for suggesting this approach.
- Remove the hardclock and statclock forwarding code including the two
associated IPIs. The bitmap IPI handler has now effectively degenerated
to just IPI_AST.
- When the local APIC timer is used we don't turn the RTC on at all, but
we still enable interrupts on the ISA timer 0 (i8254) for timecounting
purposes.
njl [Tue, 8 Feb 2005 17:43:35 +0000 (17:43 +0000)]
Add an initial manpage for the cpufreq framework and methods to help users
and developers who want to import new hardware drivers. This could also
certainly use mdoc review.
phk [Tue, 8 Feb 2005 17:23:39 +0000 (17:23 +0000)]
(forced commit to record correct commit message)
Split ffs_fsync() into a VOP_FSYNC() component and an internal part
called ffs_syncvnode().
Eliminate unnecessary thread argument and XXX'ed curthread passes
for same. Reduce softdep_sync_metadata() from a struct vop_fsync_args
to just the vnode argument it needs.
Convert internal VOP_FSYNC() calls to use ffs_syncvnode().
wpaul [Tue, 8 Feb 2005 17:23:25 +0000 (17:23 +0000)]
Next step on the road to IRPs: create and use an imitation of the
Windows DRIVER_OBJECT and DEVICE_OBJECT mechanism so that we can
simulate driver stacking.
In Windows, each loaded driver image is attached to a DRIVER_OBJECT
structure. Windows uses the registry to match up a given vendor/device
ID combination with a corresponding DRIVER_OBJECT. When a driver image
is first loaded, its DriverEntry() routine is invoked, which sets up
the AddDevice() function pointer in the DRIVER_OBJECT and creates
a dispatch table (based on IRP major codes). When a Windows bus driver
detects a new device, it creates a Physical Device Object (PDO) for
it. This is a DEVICE_OBJECT structure, with semantics analagous to
that of a device_t in FreeBSD. The Windows PNP manager will invoke
the driver's AddDevice() function and pass it pointers to the DRIVER_OBJECT
and the PDO.
The AddDevice() function then creates a new DRIVER_OBJECT structure of
its own. This is known as the Functional Device Object (FDO) and
corresponds roughly to a private softc instance. The driver uses
IoAttachDeviceToDeviceStack() to add this device object to the
driver stack for this PDO. Subsequent drivers (called filter drivers
in Windows-speak) can be loaded which add themselves to the stack.
When someone issues an IRP to a device, it travel along the stack
passing through several possible filter drivers until it reaches
the functional driver (which actually knows how to talk to the hardware)
at which point it will be completed. This is how Windows achieves
driver layering.
Project Evil now simulates most of this. if_ndis now has a modevent
handler which will use MOD_LOAD and MOD_UNLOAD events to drive the
creation and destruction of DRIVER_OBJECTs. (The load event also
does the relocation/dynalinking of the image.) We don't have a registry,
so the DRIVER_OBJECTS are stored in a linked list for now. Eventually,
the list entry will contain the vendor/device ID list extracted from
the .INF file. When ndis_probe() is called and detectes a supported
device, it will create a PDO for the device instance and attach it
to the DRIVER_OBJECT just as in Windows. ndis_attach() will then call
our NdisAddDevice() handler to create the FDO. The NDIS miniport block
is now a device extension hung off the FDO, just as it is in Windows.
The miniport characteristics table is now an extension hung off the
DRIVER_OBJECT as well (the characteristics are the same for all devices
handled by a given driver, so they don't need to be per-instance.)
We also do an IoAttachDeviceToDeviceStack() to put the FDO on the
stack for the PDO. There are a couple of fake bus drivers created
for the PCI and pccard buses. Eventually, there will be one for USB,
which will actually accept USB IRP.s
Things should still work just as before, only now we do things in
the proper order and maintain the correct framework to support passing
IRPs between drivers.
Various changes:
- corrected the comments about IRQL handling in subr_hal.c to more
accurately reflect reality
- update ndiscvt to make the drv_data symbol in ndis_driver_data.h a
global so that if_ndis_pci.o and/or if_ndis_pccard.o can see it.
- Obtain the softc pointer from the miniport block by referencing
the PDO rather than a private pointer of our own (nmb_ifp is no
longer used)
- implement IoAttachDeviceToDeviceStack(), IoDetachDevice(),
IoGetAttachedDevice(), IoAllocateDriverObjectExtension(),
IoGetDriverObjectExtension(), IoCreateDevice(), IoDeleteDevice(),
IoAllocateIrp(), IoReuseIrp(), IoMakeAssociatedIrp(), IoFreeIrp(),
IoInitializeIrp()
- fix a few mistakes in the driver_object and device_object definitions
- add a new module, kern_windrv.c, to handle the driver registration
and relocation/dynalinkign duties (which don't really belong in
kern_ndis.c).
- made ndis_block and ndis_chars in the ndis_softc stucture pointers
and modified all references to it
- fixed NdisMRegisterMiniport() and NdisInitializeWrapper() so they
work correctly with the new driver_object mechanism
- changed ndis_attach() to call NdisAddDevice() instead of ndis_load_driver()
(which is now deprecated)
- used ExAllocatePoolWithTag()/ExFreePool() in lookaside list routines
instead of kludged up alloc/free routines
- added kern_windrv.c to sys/modules/ndis/Makefile and files.i386.
phk [Tue, 8 Feb 2005 16:25:50 +0000 (16:25 +0000)]
For snapshots we need all VOP_LOCKs to be exclusive.
The "business class upgrade" was implemented in UFS's VOP_LOCK
implementation ufs_lock() which is the wrong layer, so move it to
ffs_lock().
Also, as long as we have not abandonned advanced vfs-stacking we
should not preclude it from happening: instead of implementing a
copy locally, use the VOP_LOCK_APV(&ufs) to correctly arrive at
vop_stdlock() at the bottom.
phk [Tue, 8 Feb 2005 15:54:30 +0000 (15:54 +0000)]
For snapshots we need all VOP_LOCKs to be exclusive.
The "business class upgrade" was implemented in UFS's VOP_LOCK
implementation ufs_lock() which is the wrong layer, so move it to
ffs_lock().
Also, as long as we have not abandonned advanced vfs-stacking we
should not preclude it from happening: instead of implementing a
copy locally, use the VOP_LOCK_APV(&ufs) to correctly arrive at
vop_stdlock() at the bottom.
njl [Tue, 8 Feb 2005 07:51:14 +0000 (07:51 +0000)]
Unroll the loop for calculating the 8.3 filename checksum. In testing
on my P3, microbenchmarks show the unrolled version is 78x faster. In
actual use (recursive ls), this gives an average of 9% improvement in
system time and 2% improvement in wall time.
imp [Tue, 8 Feb 2005 06:12:44 +0000 (06:12 +0000)]
Use ANSI function definitions, tweak a couple of prototypes to match (since
K&R prototypes needed to mismatch in the way that they were mismatched),
rename ds_getmcaf to ed_ds_getmcaf. Remove a few register keywords.
imp [Tue, 8 Feb 2005 05:59:43 +0000 (05:59 +0000)]
use fixed types for the calls to ed_pio_readmem, ed_pio_writemem.
Make the special hp versions match the general ones. Also use fixed
types in the WD80x3_generic probe, and change callers' arrays to
match. Fix a couple of minor style issues by using newstyle function
definitions in a couple places.
imp [Tue, 8 Feb 2005 05:45:35 +0000 (05:45 +0000)]
Make it possible to unload ed. Move the ed_pccard_detach routine to
if_ed and rename it to ed_detach(). Tell other busses to use this
routine for detach.
Since I don't actually have any non-pccard ed hardware I can test
with, I've only tested with my pccards.
More improvements in this area likely are possible.
scottl [Tue, 8 Feb 2005 03:43:02 +0000 (03:43 +0000)]
Fix crashdumps on twe. The twe_immediate_request() path was not only
copying data to a temporary buffer before the I/O, but also copying that
temporary buffer back to the original data location after the I/O. When
you're dumping kernel heap and stack and protected pages, this is very
very bad.
A belated thanks to Robert Watson for donating hardware for this (and future)
work.
mlaier [Mon, 7 Feb 2005 23:20:12 +0000 (23:20 +0000)]
Fix sloppy use of "manpage", bump .Dd where applicable and rename RED to
Random Early Detection (not ... Drop) in order to be consistent with other
documentation on ALTQ
jhb [Mon, 7 Feb 2005 22:02:18 +0000 (22:02 +0000)]
- Implement ibcs2_emul_find() using kern_alternate_path(). This changes
the semantics in that the returned filename to use is now a kernel
pointer rather than a user space pointer. This required changing the
arguments to the CHECKALT*() macros some and changing the various system
calls that used pathnames to use the kern_foo() functions that can accept
kernel space filename pointers instead of calling the system call
directly.
- Use kern_open(), kern_access(), kern_execve(), kern_mkfifo(), kern_mknod(),
kern_setitimer(), kern_getrusage(), kern_utimes(), kern_unlink(),
kern_chdir(), kern_chmod(), kern_chown(), kern_symlink(), kern_readlink(),
kern_select(), kern_statfs(), kern_fstatfs(), kern_stat(), kern_lstat(),
kern_fstat().
- Drop the unused 'uap' argument from spx_open().
- Replace a stale duplication of vn_access() in xenix_access() lacking
recent additions such as MAC checks, etc. with a call to kern_access().
jhb [Mon, 7 Feb 2005 21:53:42 +0000 (21:53 +0000)]
- Implement svr4_emul_find() using kern_alternate_path(). This changes
the semantics in that the returned filename to use is now a kernel
pointer rather than a user space pointer. This required changing the
arguments to the CHECKALT*() macros some and changing the various system
calls that used pathnames to use the kern_foo() functions that can accept
kernel space filename pointers instead of calling the system call
directly.
- Use kern_open(), kern_access(), kern_msgctl(), kern_execve(),
kern_mkfifo(), kern_mknod(), kern_statfs(), kern_fstatfs(),
kern_setitimer(), kern_stat(), kern_lstat(), kern_fstat(), kern_utimes(),
kern_pathconf(), and kern_unlink().
Trim more cookies, by playing with different hash functions,
e.g., by trimming all non-alphabet characters and whitespace,
converting to lowercase, and considering only first (or last)
N letters (maybe only consonants). The fortune editor then
displays all fortunes that have the same hash, and allows to
remove one of them. The rest is written to stdout.
jhb [Mon, 7 Feb 2005 18:47:28 +0000 (18:47 +0000)]
- Use kern_{l,f,}stat() and kern_{f,}statfs() functions rather than
duplicating the contents of the same functions inline.
- Consolidate common code to convert a BSD statfs struct to a Linux struct
into a static worker function.
jhb [Mon, 7 Feb 2005 18:44:55 +0000 (18:44 +0000)]
- Tweak kern_msgctl() to return a copy of the requested message queue id
structure in the struct pointed to by the 3rd argument for IPC_STAT and
get rid of the 4th argument. The old way returned a pointer into the
kernel array that the calling function would then access afterwards
without holding the appropriate locks and doing non-lock-safe things like
copyout() with the data anyways. This change removes that unsafeness and
resulting race conditions as well as simplifying the interface.
- Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(),
fstatfs(), and fhstatfs(). Use these wrappers to cut out a lot of
code duplication for freebsd4 and netbsd compatability system calls.
- Add a new lookup function kern_alternate_path() that looks up a filename
under an alternate prefix and determines which filename should be used.
This is basically a more general version of linux_emul_convpath() that
can be shared by all the ABIs thus allowing for further reduction of
code duplication.
sobomax [Mon, 7 Feb 2005 11:35:24 +0000 (11:35 +0000)]
Fix the problem with incorrect throttling level reported immediately after
reboot. Safter the reboot the TCC is usually in the Automatic mode, in which
reading current performance level is likely to produce bogus results make sure
to switch it to the On-Demand mode and set to some known performance level.
Unfortunately there is no reliable way to check that TCC is in the Automatic
mode. Reading bit 4 of ACPI Thermal Monitor Control Register produces 0
regardless of the current mode.