scottl [Mon, 21 Nov 2005 21:27:40 +0000 (21:27 +0000)]
Teach schedgraph how to parse KTR_CRITICAL records. critical_enter/exit
events are now plotted as a counting graph, similar to CPU load, so that
their duration and critnest values can be visualized.
jhb [Mon, 21 Nov 2005 20:22:35 +0000 (20:22 +0000)]
Don't enable PUC_FASTINTR by default in the source. Instead, enable it
via the DEFAULTS kernel configs. This allows folks to turn it that option
off in the kernel configs if desired without having to hack the source.
This is especially useful since PUC_FASTINTR hangs the kernel boot on my
ultra60 which has two uart(4) devices hung off of a puc(4) device.
I did not enable PUC_FASTINTR by default on powerpc since powerpc does not
currently allow sharing of INTR_FAST with non-INTR_FAST like the other
archs.
jhb [Mon, 21 Nov 2005 20:17:46 +0000 (20:17 +0000)]
Create DEFAULTS files for alpha, ia64, powerpc, and sparc64 and move
'device mem' over from GENERIC to DEFAULTS to be consistent with i386 and
amd64. Additionally, on ia64 enable ACPI by default since ia64 requires
acpi.
ps [Mon, 21 Nov 2005 19:23:46 +0000 (19:23 +0000)]
- Always return success from NFS strategy. nfs_doio(), in the
event of an error, does the right thing, in terms of setting
the error flags in the buf header. That fixes a crash from
bstrategy().
- Treat ETIMEDOUT as a "recoverable" error, causing the buffer
to be re-dirtied. ETIMEDOUT can occur on soft mounts, when
the number of retries are exceeded, and we don't want data loss
in that case.
cognet [Mon, 21 Nov 2005 19:10:44 +0000 (19:10 +0000)]
Force pmap to write-back the pte cacheline after each pte modification,
even if the pte is supposed to be cached in write through mode (might be a
skyeye bug, I'll have to check).
jhb [Mon, 21 Nov 2005 18:39:17 +0000 (18:39 +0000)]
Expand the hack to mask the atpics if 'device atpic' is not in the kernel
during boot up. Now we do a full reset of the 8259As and setup a simple
interrupt handler (we actually borrow the apic one that just does an
immediate iret) to handle any spurious interrupts triggered by either chip.
This should fix some folks that were getting a Trap 30 during bootup of
certain SMP AMD systems. This might get pushed into the 6.0 branch as an
errata. For now a suitable workaround is to add 'device atpic' to your
kernel config.
ru [Mon, 21 Nov 2005 16:44:16 +0000 (16:44 +0000)]
- Merge FreeBSD Configuration subsection etc. with SYNOPSIS.
- Remove the description of how to build a module.
- Remove the description of how to patch the sources.
- Refer to the polling(4) manpage on how to enable the polling mode.
- Tidy up markup.
ru [Mon, 21 Nov 2005 14:41:10 +0000 (14:41 +0000)]
Fix mysterious build failures (with parallel make) early in
buildkernel: provide a real but dummy name to ${DEPENDFILE}
so that the relevant exists() check in bsd.prog.mk fails and
ensures that ${GENHDRS} are built before any other objects.
bde [Mon, 21 Nov 2005 04:57:12 +0000 (04:57 +0000)]
Mess up the "kernel" float trig function .c files with ifdefs so that
they can be #included in other .c files to give inline functions, and
use them to inline the functions in most callers (not in e_lgammaf_r.c).
__kernel_tanf() is too large and complicated for gcc to inline very well.
An athlons, this gives a speed increase under favourable pipeline
conditions of about 10% overall (larger for AXP, smaller for A64).
E.g., on AXP, sinf() on uniformly distributed args in [-2Pi, 2Pi]
now takes 30-56 cycles; it used to take 45-61 cycles; hardware fsin
takes 65-129.
yongari [Mon, 21 Nov 2005 04:17:43 +0000 (04:17 +0000)]
busdma cleanup for em(4).
- don't force busdma to pre-allocate bounce pages for parent tag.
- use system supplied roundup2 macro instead of rolling its own version.
- TX/RX decriptor length should be multiple of 128. There is no
no need to expand the size with the multiple of 4096.
- don't create/destroy DMA maps in TX/RX handlers. Use pre-allocated
DMA maps. Since creating DMA maps on sparc64 is time consuming
operations(resource mananger overhead), this change should boost
performance on sparc64. I could get > 2x speedup on Ultra60.
- TX/RX descriptors could be aligned on 128 boundary. Aligning them
on PAGE_SIZE is waste of resource.
- don't blindly create TX DMA tag with size of MCLBYTES * 8. The size
is only valid under jumbo frame environments. Instead of using the
hardcoded value, re-compute necessary size on the fly.
- RX side bus_dmamap_load_mbuf_sg(9) support.
- remove unused macro EM_ROUNDUP and constant EM_MMBA.
yongari [Mon, 21 Nov 2005 03:37:43 +0000 (03:37 +0000)]
Add a hack to ignore PCR bit for 6300ESB, 82801[D-G]B chips. It seems
that enabling busmastering would result in PCR bit ON after codec
reset.
While I'm here add DELAY(1) to codec access routine to give reasonable
time to codec operation. Without the delay, it would cause problems on
super-fast machines(> 2GHz). Also enable legacy audio for all 6300ESB,
82801[D-G]B chips. Previously, it enabled legacy audio for 82801DB(ICH4)
chip only.
Reported by: Maxim Maximov mcsi AT mcsi DOT pp DOT ru
Andrew Bliznak andriko.b AT gmail DOT com
Tested by: brueffer, Maxim Maximov, Andrew Bliznak
bde [Mon, 21 Nov 2005 00:38:21 +0000 (00:38 +0000)]
Use double precision to simplify and optimize a long division.
On athlons, this gives a speedup of 10-20% for tanf() on uniformly
distributed args in [-2Pi, 2Pi]. (It only directly applies for 43%
of the args and gives a 16-20% speedup for these (more for AXP than
A64) and this gives an overall speedup of 10-12% which is all that it
should; however, it gives an overall speedup of 17-20% with gcc-3.3
on AXP-A64 by mysteriously effected cases where it isn't executed.)
I originally intended to use double precision for all internals of
float trig functions and will probably still do this, but benchmarking
showed that converting to double precision and back is a pessimization
in cases where a simple float precision calculation works, so it may
be optimal to switch precisions only when using extra precision is
much simpler.
rodrigc [Sun, 20 Nov 2005 17:04:50 +0000 (17:04 +0000)]
If export mount flag is not passed in, set default parameters
for export structure and pass that to vfs_export().
Currently in userland mount(8), an export structure is unconditionally
passed in, only for UFS. This is an attempt to move that UFS-specific
behavior out of mount(8) and into the UFS filesystem code.
le [Sun, 20 Nov 2005 12:12:31 +0000 (12:12 +0000)]
Always declare variables at the start of the function.
Don't allocate potentially large variables on the stack.
Check strsep() return values when the string comes from userland.
Shorten variable names for lucidity's sake.
marcel [Sun, 20 Nov 2005 01:31:29 +0000 (01:31 +0000)]
Improve inittodr(). Assume the real-time clock is reliable and only
use the base time in case the real-time clock is bogus or behind the
base time. Most importantly, don't sanity-check the base time up front
because it may be zero. This is not a preposterous condition. It just
means that none of the file systems have their mount time updated.
wpaul [Sun, 20 Nov 2005 01:29:29 +0000 (01:29 +0000)]
Correct the API for Windows interupt handling a little. The prototype
for a Windows ISR is 'BOOLEAN isrfunc(KINTERRUPT *, void *)' meaning
the ISR get a pointer to the interrupt object and a context pointer,
and returns TRUE if the ISR determines the interrupt was really generated
by the associated device, or FALSE if not.
I had mistakenly used 'void isrfunc(void *)' instead. It happens the
only thing this affects is the internal ndis_intr() ISR in subr_ndis.c,
but it should be fixed just in case we ever need to register a real
Windows ISR vi IoConnectInterrupt().
For NDIS miniports that provide a MiniportISR() method, the 'is_our_intr'
value returned by the method serves as the return value from ndis_isr(),
and 'call_isr' is used to decide whether or not to schedule the interrupt
handler via DPC. For drivers that only supply MiniportEnableInterrupt()
and MiniportDisableInterrupt() methods, call_isr is always TRUE and
is_our_intr is always FALSE.
In the end, there should be no functional changes, except that now
ntoskrnl_intr() can terminate early once it finds the ISR that wants
to service the interrupt.
rodrigc [Sat, 19 Nov 2005 23:28:19 +0000 (23:28 +0000)]
Add more options to ffs_opts, so that vfs_filteropts() will not
complain when we pass these options to a UFS filesystem as strings
via nmount(): noexec, nosuid, nosymfollow, sync, suiddir
marcel [Sat, 19 Nov 2005 21:51:45 +0000 (21:51 +0000)]
Fix bug introduced in revision 1.186:
When all file systems have a time stamp of zero, which is the case
for example when the root file system is on a read-only medium, we
ended up not calling inittodr() at all. A potential uncleanliness
existed as well. If multiple file systems had a non-zero time stamp,
we would call inittodr() multiple times. While this should not be
harmful, it's definitely not ideal.
Fix both issues by iterating over the mounted file systems to find
the largest time stamp and call inittodr() exactly once with that
time stamp. This could of course be a zero time stamp if none of the
mounted file systems have a non-zero time stamp. In that case the
annoying errors mentioned in the commit log for revision 1.186 still
haven't been avoided. The bottom line is that inittodr() should not
complain when it gets a time base of zero. At the time of this
commit only alpha seems to have that problem.
rodrigc [Sat, 19 Nov 2005 21:22:21 +0000 (21:22 +0000)]
Parse more mount options in vfs_donmount(), before vfs_domount()
is called. It looks like there are lots of different mount flags checked
in vfs_domount(), so we need to do the parsing for these particular
mount flags earlier on. The new flags parsed are:
async, force, multilabel, noasync, noatime, noclusterr, noclusterw,
noexec, nosuid, nosymfollow, snapshot, suiddir, sync, union.
Existing code which uses mount() to mount UFS filesystems is not
affected, but new code which uses nmount() to mount UFS filesystems
should behave better.
andre [Sat, 19 Nov 2005 17:04:52 +0000 (17:04 +0000)]
Remove 'ipprintfs' which were protected under DIAGNOSTIC. It doesn't
have any know to enable it from userland and could only be enabled by
either setting it to 1 at compile time or through the kernel debugger.
In the future it may be brought back as KTR tracing points.
damien [Sat, 19 Nov 2005 16:54:55 +0000 (16:54 +0000)]
Load firmware images directly from the filesystem (looks into /etc/firmware
directory by default) without requiring the user to load them by hand using
e.g iwicontrol. Get rid of the old ioctl crud.
Updated iwi-firmware port coming soon.
jkoshy [Sat, 19 Nov 2005 12:21:11 +0000 (12:21 +0000)]
- Move the documentation for the ENABLE_WPA_SUPPLICANT_EAPOL knob to into
the list for 'world' builds.
- Increase the width of a bullet list.
- Use .Ss to name sub-sections of this file.
ru [Sat, 19 Nov 2005 06:45:44 +0000 (06:45 +0000)]
Add the NO_INCS knob to bsd.prog.mk and bsd.lib.mk to not include
bsd.incs.mk, and use it when installing 32-bit compat libraries
on amd64. This causes it to *not* overwrite native headers with
i386 versions, which was the case with <fenv.h> and <vgl.h>.
bde [Sat, 19 Nov 2005 02:38:27 +0000 (02:38 +0000)]
Moved all the optimizations for |x| <= 9pi/2 from
__ieee754_rem_pio2f() to its 3 callers and manually inline them.
On Athlons, with favourable compiler flags and optimizations and
favourable pipeline conditions, this gives a speedup of 30-40 cycles
for cosf(), sinf() and tanf() on the range pi/4 < |x| <= 9pi/4, so
thes functions are now signifcantly faster than the hardware trig
functions in many cases. E.g., in a benchmark with uniformly distributed
x in [-2pi, 2pi], A64 hardware fcos took 72-129 cycles and cosf() took
37-55 cycles. Out-of-order execution is needed to get both of these
times. The optimizations in this commit apparently work more by
removing 1 serialization point than by reducing latency.
rodrigc [Fri, 18 Nov 2005 22:34:31 +0000 (22:34 +0000)]
Add "shortnames" and "longnames" mount options which are
synonyms for "shortname" and "longname" mount options. The old
(before nmount()) mount_msdosfs program accepted "shortnames" and "longnames",
but the kernel nmount() checked for "shortname" and "longname".
So, make the kernel accept "shortnames", "longnames", "shortname", "longname"
for forwards and backwarsd compatibility.
Discovered by: Rainer Hurling <rhurlin at gwdg dot de>
emaste [Fri, 18 Nov 2005 19:41:55 +0000 (19:41 +0000)]
Add sanity checking for QUEUE(3) lists under INVARIANTS. Races may lead
to list corruption, which can be difficult to unravel in a post-mortem
analysis. These checks verify that prev and next pointers are consistent
when inserting or removing elements, thus catching any corruption earlier.
Also use TRASHIT to break LIST and SLIST link pointers on element removal,
from mlaier via -hackers.
jhb [Fri, 18 Nov 2005 19:26:46 +0000 (19:26 +0000)]
- Always print the trap number so that we have something to start with for
mystery traps. If we don't have a message for a given trap, just use
UNKNOWN for the message.
- Add trap messages for T_XMMFLT and T_RESERVED.
andre [Fri, 18 Nov 2005 17:13:22 +0000 (17:13 +0000)]
Document CLOCK_UPTIME which returns the current uptime in SI seconds.
At the moment it is just an alias for CLOCK_MONOTONIC which reports
the same number.
andre [Fri, 18 Nov 2005 14:44:48 +0000 (14:44 +0000)]
In ip_forward() copy as much into the temporary error mbuf as we
have free space in it. Allocate correct mbuf from the beginning.
This allows icmp_error() to quote the entire TCP header in error
messages.
rodrigc [Fri, 18 Nov 2005 06:06:10 +0000 (06:06 +0000)]
- Add parsing for the following existing UFS/FFS mount options in the nmount()
callpath via vfs_getopt(), and set the appropriate MNT_* flag:
-> acls, async, force, multilabel, noasync, noatime,
-> noclusterr, noclusterw, snapshot, update
- Allow errmsg as a valid mount option via vfs_getopt(),
so we can later add a hook to propagate mount errors back
to userspace via vfs_mount_error().
jdp [Fri, 18 Nov 2005 02:43:49 +0000 (02:43 +0000)]
Fix a bug that caused some /dev entries to continue to exist after
the underlying drive had been hot-unplugged from the system. Here
is a specific example. Filesystem code had opened /dev/da1s1e.
Subsequently, the drive was hot-unplugged. This (correctly) caused
all of the associated /dev/da1* entries to be deleted. When the
filesystem later realized that the drive was gone it closed the
device, reducing the write-access counts to 0 on the geom providers
for da1s1e, da1s1, and da1. This caused geom to re-taste the
providers, resulting in the devices being created again. When the
drive was hot-plugged back in, it resulted in duplicate /dev entries
for da1s1e, da1s1, and da1.
This fix adds a new disk_gone() function which is called by CAM when a
drive goes away. It orphans all of the providers associated with the
drive, setting an error condition of ENXIO in each one. In addition,
we prevent a re-taste on last close for writing if an error condition
has been set in the provider.
rodrigc [Fri, 18 Nov 2005 01:31:10 +0000 (01:31 +0000)]
In vfs_nmount(), check to see if "update" mount option was passed
in, and if so, set MNT_UPDATE filesystem flag.
vfs_nmount() calls vfs_domount(), and there is special logic
inside vfs_domount() if MNT_UPDATE is set. This is very important
when we want to do an update mount of the root filesystem, using nmount().
rwatson [Thu, 17 Nov 2005 19:31:52 +0000 (19:31 +0000)]
Print (total - used) as the amount of available swap for a swap device
when printing swapinfo output, rather than (total), as that is (strictly
speaking) more accurate.
Pointed out by: Rob <spamrefuse at yahoo dot com>
MFC after: 3 days
harti [Thu, 17 Nov 2005 12:19:19 +0000 (12:19 +0000)]
When a user is in more than 16 groups the call to authunix_create() will
result in abort() beeing called. This is because there is a limit of
the number of groups in the RPC which is 16. When the actual number of
groups is too large it results in xdr_array() returning an error which,
in turn, authunix_create() handles by just calling abort().
Fix this by passing only the first 16 groups to authunix_create().
cperciva [Thu, 17 Nov 2005 11:01:32 +0000 (11:01 +0000)]
Correctly handle a TCP connection being shutdown by the server while
we're reading response headers. (Handle it as a connection-killing
error, rather than entering an infinite loop reading zero bytes.)
Reported by: simon
Discovered thanks to: A not-very-transparent transparent HTTP proxy.
MFC after: 3 days
glebius [Thu, 17 Nov 2005 10:13:18 +0000 (10:13 +0000)]
- Backout last change, since it is memory overkill for a non busy host or
for a notebook with em(4) adapter.
- Introduce tunables em.hw.txd and em.hw.rxd, which allow administrator
to configure number of transmit and receive descriptors.
- Check em.hw.txd and em.hw.rxd against hardware limits [*] and require
them to be multiple of 128.
[*] According to comments in if_em.h the 82540EM/82541ER chips can handle
more than 256 descriptors. Since we don't have this hardware to test,
we decided to mimic NetBSD wm(4) driver, that limits these chips to
256 descriptors.