Bruce Evans [Thu, 24 Nov 2005 02:04:26 +0000 (02:04 +0000)]
Optimized by eliminating the special case for 0.67434 <= |x| < pi/4.
A single polynomial approximation for tan(x) works in infinite precision
up to |x| < pi/2, but in finite precision, to restrict the accumulated
roundoff error to < 1 ulp, |x| must be restricted to less than about
sqrt(0.5/((1.5+1.5)/3)) ~= 0.707. We restricted it a bit more to
give a safety margin including some slop for optimizations. Now that
we use double precision for the calculations, the accumulated roundoff
error is in double-precision ulps so it can easily be made almost 2**29
times smaller than a single-precision ulp. Near x = pi/4 its maximum
is about 0.5+(1.5+1.5)*x**2/3 ~= 1.117 double-precision ulps.
The minimax polynomial needs to be different to work for the larger
interval. I didn't increase its degree the old degree is just large
enough to keep the final error less than 1 ulp and increasing the
degree would be a pessimization. The maximum error is now ~0.80
ulps instead of ~0.53 ulps.
The speedup from this optimization for uniformly distributed args in
[-2pi, 2pi] is 28-43% on athlons, depending on how badly gcc selected
and scheduled the instructions in the old version. The old version
has some int-to-float conversions that are apparently difficult to schedule
well, but gcc-3.3 somehow did everything ~10 cycles or ~10% faster than
gcc-3.4, with the difference especially large on AXPs. On A64s, the
problem seems to be related to documented penalties for moving single
precision data to undead xmm registers. With this version, the speed
is cycles is almost independent of the athlon and gcc version despite
the large differences in instruction selection to use the FPU on AXPs
and SSE on A64s.
Craig Rodrigues [Wed, 23 Nov 2005 23:06:33 +0000 (23:06 +0000)]
These files were never hooked into the build, and were the start
of an nmount()-based mount program for UFS.
Now that mount(8) calls nmount() directly for mounting UFS filesystems,
they are unnecessary.
Craig Rodrigues [Wed, 23 Nov 2005 20:51:15 +0000 (20:51 +0000)]
In nmount() and vfs_donmount(), do not strcmp() the options in the iovec
directly. We need to copyin() the strings in the iovec before
we can strcmp() them. Also, when we want to send the errmsg back
to userspace, we need to copyout()/copystr() the string.
Add a small helper function vfs_getopt_pos() which takes in the
name of an option, and returns the array index of the name in the iovec,
or -1 if not found. This allows us to locate an option in
the iovec without actually manipulating the iovec members. directly via
strcmp().
Craig Rodrigues [Wed, 23 Nov 2005 20:17:27 +0000 (20:17 +0000)]
Do not pass userquota and groupquota mount options to nmount().
These options are read from fstab by quotacheck(8), but are not
valid mount options that need to be passed down the the filesystem.
Tai-hwa Liang [Wed, 23 Nov 2005 19:52:14 +0000 (19:52 +0000)]
- Adding the missing 'W' option back which was accidentally removed
in rev1.37.
- Fixing a core dump inside build_iovec_argf by providing a !NULL format
string to vsnprintf(3).
John Baldwin [Wed, 23 Nov 2005 18:51:34 +0000 (18:51 +0000)]
Add locking and mark MPSAFE:
- Add locked variants of start, init, and ifmedia_upd.
- Add a mutex to the softc and remove spl calls.
- Use callout(9) rather than timeout(9).
- Setup interrupt handler last in attach.
- Use M_ZERO rather than bzero.
John Baldwin [Wed, 23 Nov 2005 16:36:13 +0000 (16:36 +0000)]
- Quiet the pci_link(4) devices so that they don't show up in dmesg now.
- Improve panic message if we fail to read the PCI bus number from a bridge
device.
- Don't try to lookup a BIOS IRQ for a link unless the link is routed via
an ISA IRQ since BIOSen currently only route PCI link devices via ISA
IRQs.
Bruce Evans [Wed, 23 Nov 2005 14:27:56 +0000 (14:27 +0000)]
Use only double precision for "kernel" tanf (except for returning float).
This is a minor interface change. The function is renamed from
__kernel_tanf() to __kernel_tandf() so that misues of it will cause
link errors and not crashes.
This version is a routine translation with no special optimizations
for accuracy or efficiency. It gives an unimportant increase in
accuracy, from ~0.9 ulps to 0.5285 ulps. Almost all of the error is
from the minimax polynomial (~0.03 ulps and the final rounding step
(< 0.5 ulps). It gives strange differences in efficiency in the -5
to +10% range, with -O1 fairly consistently becoming faster and -O2
slower on AXP and A64 with gcc-3.3 and gcc-3.4.
Ed Maste [Wed, 23 Nov 2005 04:02:27 +0000 (04:02 +0000)]
Userland applications may include queue.h and define INVARIANTS
but not provide a panic(9) implementation. Thus, enable the sanity
checks under INVARIANTS only if _KERNEL is also defined.
Bruce Evans [Wed, 23 Nov 2005 03:03:09 +0000 (03:03 +0000)]
Simplified setiing up args for __kernel_rem_pio2(). We already have x
with a 24-bit fraction, so we don't need a loop to split it into up to
3 terms with 24-bit fractions.
Bruce Evans [Wed, 23 Nov 2005 02:06:06 +0000 (02:06 +0000)]
Quick fix for stack buffer overrun in rev.1.13. Oops. The prec == 1
arg to __kernel_rem_pio2() gives 53-bit (double) precision, not single
precision and/or the array dimension like I thought. prec == 2 is
used in e_rem_pio2.c for double precision although it is documented
to be for 64-bit (extended) precision, and I just reduced it by 1
thinking that this would give the value suitable for 24-bit (float)
precision. Reducing it 1 more to the documented value for float
precision doesn't actually work (it gives errors of ~0.75 ulps in the
reduced arg, but errors of much less than 0.5 ulps are needed; the bug
seems to be in kernel_rem_pio2.c). Keep using a value 1 larger than
the documented value but supply an array large enough hold the extra
unused result from this.
The bug can also be fixed quickly by increasing init_jk[0] in
k_rem_pio2.c from 2 to 3. This gives behaviour identical to using
prec == 1 except it doesn't create the extra result. It isn't clear
how the precision bug affects higher precisions. 113-bit (quad) is
the largest precision, so there is no way to use a large precision
to fix it.
Nate Lawson [Wed, 23 Nov 2005 00:57:51 +0000 (00:57 +0000)]
Try to fix problems with periodic hangs by never directly calling _BIF.
Instead, re-evaluate _BIF only when we get a notify and use the cached
results. We also still evaluate _BIF once on boot. Also, optimize the
init loop a little by only querying for a particular info if it's not valid.
Maksim Yevmenkin [Wed, 23 Nov 2005 00:56:18 +0000 (00:56 +0000)]
Teach rfcomm_sppd(1) about service names, so it is possible to specify
service name instead of channel number with -c command option. Supported
service names are: DUN (Dial-Up Networking), FAX (Fax) and SP (Serial Port).
John Baldwin [Tue, 22 Nov 2005 22:34:14 +0000 (22:34 +0000)]
Garbage collect the code to store diagnostics codes in a CMOS register
during SMP startup. We haven't had any issues with starting up the APs
on i386 in quite a while now which is all this code is really useful for.
If someone ever does really need it they can always dig it up out of the
attic.
Marius Strobl [Tue, 22 Nov 2005 22:32:50 +0000 (22:32 +0000)]
- Add a workaround (change the interrupt map mask to compare the full
INO) for incorrect interrupt map entries on E250 machines. These
incorrect entries caused the INO of the on-board HME to be also
assigned to the second on-board NS16550 and to the on-board printer
port controller. Further down the road caused hme(4) to fail to attach
to the on-board HME in FreeBSD 5 and 6 as INTR_FAST and non-INTR_FAST
handlers can't share the same IRQ there (it's unknown what whould
happen in -CURRENT now that INTR_FAST and non-INTR_FAST handlers can
share an IRQ but I'd expect funny problems with uart(4)).
- Make sure there are exactly 4 PCI ranges instead of just checking
that the bridge has a 'ranges' property in the OFW device tree at all.
Besides the fact that currently the 64bit memory range isn't used by
this driver it we can't really work with less than 4 ranges and don't
have memory for more than 4 bus handles for the ranges in the softc.
- Remove sc_range and sc_nrange from softc; for the bridges supported
by this driver we no longer need to know the ranges besides the bus
handles obtained from them once this driver is attached. That way we
also can free the memory allocated for sc_range during attach again.
- Remove sc_dvmabase from the softc and pass it to psycho_iommu_init()
via an additional argument as we no longer need to know the DVMA base
in this driver once the IOMMU is initialized.
- Remove sc_dmatag from the softc, there isn't much sense in keeping
the nexus dma tag around locally.
Marius Strobl [Tue, 22 Nov 2005 21:34:26 +0000 (21:34 +0000)]
Some clean-up, style changes and changes that will reduce differences
between this driver and other Host-PCI bridge drivers based on this one:
- Make the code fit into 80 columns.
- Make the code adhere style(9) (don't use function calls in initializers,
use uintXX_t instead of u_intXX_t, add missing prototypes, ...).
- Remove unused and superfluous struct declaration, softc member, casts,
includes, etc.
- Use FBSDID.
- Sprinkle const.
- Try to make comments and messages consistent in style throughout the
driver.
- Use convenience macros for the number of interrupts and ranges of the
bridge.
- Use __func__ instead of hardcoded function names in panic strings and
error messages. Some of the hardcoded function names actually were
outdated through moving code around. [1]
- Rename softc members related to the PCI side of the bridge to sc_pci_*
in order to make it clear which side of the bridge they refer to (so
stuff like sc_bushandle vs. sc_bh is less confusing while reading the
code).
Marius Strobl [Tue, 22 Nov 2005 17:32:51 +0000 (17:32 +0000)]
- Add ofw_bus_if.h to SRCS on sparc64 as envctrl.c and pcf_ebus.c depend
on it.
- Sync with sys/conf/files* and build pcf_isa.c only on i386 for now.
- Try to adhere to style.Makefile(5) (sorting, whitespace).
Marius Strobl [Tue, 22 Nov 2005 17:25:10 +0000 (17:25 +0000)]
Conditionalize the compilation of the envctrl.c front-end of pcf(4)
additionally on ebus(4) as the 'SUNW,envctrl' devices (as well as
'SUNW,envctrltwo' and 'SUNW,rasctrl', which we might want to also
support in envctrl.c in the future) are only found on EBus.
Marius Strobl [Tue, 22 Nov 2005 17:12:49 +0000 (17:12 +0000)]
Move zs.c from files to files.powerpc as zs(4) by now is only supported
on powerpc (more or less...). That way people updating from FreeBSD 5 to
FreeBSD 6 and beyond on sparc64 will get an error from config(8) rather
than a mysterious compile error when they have a stale 'device zs' in
their kernel config file.
Marius Strobl [Tue, 22 Nov 2005 16:39:44 +0000 (16:39 +0000)]
- Convert these bus drivers to make use of the newly introduced set of
ofw_bus_gen_get_*() for providing the ofw_bus KOBJ interface in order
to reduce code duplication.
- While here sync the various sparc64 bus drivers a bit (handle failure
to attach a child gracefully instead of panicing, move the printing
of child resources common to bus_print_child() and bus_probe_nomatch()
implementations of a bus into a <bus>_print_res() function, ...) and
fix some minor bugs and nits (plug memory leaks present when attaching
a bus or child device fails, remove unused struct members, ...).
Additional testing by: kris (central(4) and fhc(4))
Marius Strobl [Tue, 22 Nov 2005 16:37:45 +0000 (16:37 +0000)]
- Add a new method ofw_bus_default_get_devinfo() that allows to retrieve
a newly introduced struct ofw_bus_devinfo which can hold the OFW info
of a device recallable via the ofw_bus KOBJ interface. Introduce a set
of functions ofw_bus_gen_get_*() which use ofw_bus_default_get_devinfo()
to provide generic subroutines for implementing the rest of the ofw_bus
KOBJ interface in a bus driver.
This is inspired by bus_get_resource_list() and bus_generic_rl_*_resource()
and allows to reduce code duplication in bus drivers as they only have
to provide an ofw_bus_default_get_devinfo() implementation in order to
provide the ofw_bus KOBJ interface via ofw_bus_gen_get_*().
- While here add a comment to ofw_bus_if.m describing the intention of
the ofw_bus KOBJ interface.
Ruslan Ermilov [Tue, 22 Nov 2005 12:02:41 +0000 (12:02 +0000)]
Get rid of SPECIAL_INSTALLCHECKS variable that isn't settable
by a user. Instead, add individual checks as dependencies to
the main "installcheck" target. Make sure that installkernel
etc. depend on it (including the UID/GID checks).
Boris Popov [Tue, 22 Nov 2005 07:13:00 +0000 (07:13 +0000)]
Fix interaction with Windows 2000/XP based servers:
If the complete reply on the TRANS2_FIND_FIRST2 request fits exactly
into one responce packet, then next call to TRANS2_FIND_NEXT2 will return
zero entries and server will close current transaction. To avoid
subsequent errors we should not perform FIND_CLOSE2 request.
John Polstra [Tue, 22 Nov 2005 01:55:29 +0000 (01:55 +0000)]
Fix a bug in the loop in sonewconn that makes room on the incomplete
connection queue for a new connection. It was removing connections
from the wrong list.
Submitted by: Paul Mikesell
Sponsored by: Isilon Systems
MFC after: 1 week
John Baldwin [Mon, 21 Nov 2005 22:14:49 +0000 (22:14 +0000)]
Overhaul nve(4) locking to make it more like other ethernet drivers in
the tree.
- Add locked variants of nve_start(), nve_init(), and nve_ifmedia_upd().
- Use callout_* to manage callouts rather than timeout(9).
- Mark interrupt handler MPSAFE (IFF_NEEDGIANT was already clear).
- Lock the driver lock in driver entry points such as the interrupt
handler, if_start, and if_init rather than locking the driver mutex
in the various work functions called by the binary blob. The spin lock
used by the binary block can probably be stubbed out now.
- Use IFQ_DRV_IS_EMPTY() macro rather than doing it by hand.
- Fix locking in detach.
- Remove some unused fields from the softc.
John Baldwin [Mon, 21 Nov 2005 22:01:16 +0000 (22:01 +0000)]
Fix the code to look up the BIOS IRQ for a given link device by reading
the IRQ set by the BIOS in existing devices to actually get the correct
bus number of the child PCI bus. I was not reading the bus number from
the bridge device correctly. The __BUS_ACCESSOR() macros (from which
pcib_get_bus() is built) assume that the passed in argument is a child
device. However, at the time I'm reading the bus there is no child
device yet, so I was passing in the pcib device as the child device.
The parent of the pcib device probably returned an error in the case of
a host bridge, thus resulting in random stack garbage for the bus number.
For PCI-PCI bridges, the bus number being used was actually the subvendor
of the PCI-PCI bridge device itself.
John Baldwin [Mon, 21 Nov 2005 21:50:07 +0000 (21:50 +0000)]
Various fixes to make de(4) not panic after ru@'s IF_LLADDR() changes:
- Don't call tulip_addr_filter() to reset the RX address filter in
tulip_reset() since that gets called before ether_ifattach(). Just
call it in tulip_init_locked().
- Use be16dec() and le16dec() to parse MAC addresses when programming
the RX filter.
- Let ether_ioctl() handle SIOCSIFMTU since we were doing the exact same
thing with the added bonus that we leaked the driver lock if the MTU
was > ETHERMTU in the homerolled version. This part will be MFC'd.
Clue from: wpaul (1)
Stolen from: marcel (2 via patch for dc(4))
MFC after: 1 week
Scott Long [Mon, 21 Nov 2005 21:27:40 +0000 (21:27 +0000)]
Teach schedgraph how to parse KTR_CRITICAL records. critical_enter/exit
events are now plotted as a counting graph, similar to CPU load, so that
their duration and critnest values can be visualized.
John Baldwin [Mon, 21 Nov 2005 20:22:35 +0000 (20:22 +0000)]
Don't enable PUC_FASTINTR by default in the source. Instead, enable it
via the DEFAULTS kernel configs. This allows folks to turn it that option
off in the kernel configs if desired without having to hack the source.
This is especially useful since PUC_FASTINTR hangs the kernel boot on my
ultra60 which has two uart(4) devices hung off of a puc(4) device.
I did not enable PUC_FASTINTR by default on powerpc since powerpc does not
currently allow sharing of INTR_FAST with non-INTR_FAST like the other
archs.
John Baldwin [Mon, 21 Nov 2005 20:17:46 +0000 (20:17 +0000)]
Create DEFAULTS files for alpha, ia64, powerpc, and sparc64 and move
'device mem' over from GENERIC to DEFAULTS to be consistent with i386 and
amd64. Additionally, on ia64 enable ACPI by default since ia64 requires
acpi.
Paul Saab [Mon, 21 Nov 2005 19:23:46 +0000 (19:23 +0000)]
- Always return success from NFS strategy. nfs_doio(), in the
event of an error, does the right thing, in terms of setting
the error flags in the buf header. That fixes a crash from
bstrategy().
- Treat ETIMEDOUT as a "recoverable" error, causing the buffer
to be re-dirtied. ETIMEDOUT can occur on soft mounts, when
the number of retries are exceeded, and we don't want data loss
in that case.
Olivier Houchard [Mon, 21 Nov 2005 19:10:44 +0000 (19:10 +0000)]
Force pmap to write-back the pte cacheline after each pte modification,
even if the pte is supposed to be cached in write through mode (might be a
skyeye bug, I'll have to check).
John Baldwin [Mon, 21 Nov 2005 18:39:17 +0000 (18:39 +0000)]
Expand the hack to mask the atpics if 'device atpic' is not in the kernel
during boot up. Now we do a full reset of the 8259As and setup a simple
interrupt handler (we actually borrow the apic one that just does an
immediate iret) to handle any spurious interrupts triggered by either chip.
This should fix some folks that were getting a Trap 30 during bootup of
certain SMP AMD systems. This might get pushed into the 6.0 branch as an
errata. For now a suitable workaround is to add 'device atpic' to your
kernel config.
Ruslan Ermilov [Mon, 21 Nov 2005 16:44:16 +0000 (16:44 +0000)]
- Merge FreeBSD Configuration subsection etc. with SYNOPSIS.
- Remove the description of how to build a module.
- Remove the description of how to patch the sources.
- Refer to the polling(4) manpage on how to enable the polling mode.
- Tidy up markup.
Ruslan Ermilov [Mon, 21 Nov 2005 14:41:10 +0000 (14:41 +0000)]
Fix mysterious build failures (with parallel make) early in
buildkernel: provide a real but dummy name to ${DEPENDFILE}
so that the relevant exists() check in bsd.prog.mk fails and
ensures that ${GENHDRS} are built before any other objects.
Bruce Evans [Mon, 21 Nov 2005 04:57:12 +0000 (04:57 +0000)]
Mess up the "kernel" float trig function .c files with ifdefs so that
they can be #included in other .c files to give inline functions, and
use them to inline the functions in most callers (not in e_lgammaf_r.c).
__kernel_tanf() is too large and complicated for gcc to inline very well.
An athlons, this gives a speed increase under favourable pipeline
conditions of about 10% overall (larger for AXP, smaller for A64).
E.g., on AXP, sinf() on uniformly distributed args in [-2Pi, 2Pi]
now takes 30-56 cycles; it used to take 45-61 cycles; hardware fsin
takes 65-129.
Pyun YongHyeon [Mon, 21 Nov 2005 04:17:43 +0000 (04:17 +0000)]
busdma cleanup for em(4).
- don't force busdma to pre-allocate bounce pages for parent tag.
- use system supplied roundup2 macro instead of rolling its own version.
- TX/RX decriptor length should be multiple of 128. There is no
no need to expand the size with the multiple of 4096.
- don't create/destroy DMA maps in TX/RX handlers. Use pre-allocated
DMA maps. Since creating DMA maps on sparc64 is time consuming
operations(resource mananger overhead), this change should boost
performance on sparc64. I could get > 2x speedup on Ultra60.
- TX/RX descriptors could be aligned on 128 boundary. Aligning them
on PAGE_SIZE is waste of resource.
- don't blindly create TX DMA tag with size of MCLBYTES * 8. The size
is only valid under jumbo frame environments. Instead of using the
hardcoded value, re-compute necessary size on the fly.
- RX side bus_dmamap_load_mbuf_sg(9) support.
- remove unused macro EM_ROUNDUP and constant EM_MMBA.
Pyun YongHyeon [Mon, 21 Nov 2005 03:37:43 +0000 (03:37 +0000)]
Add a hack to ignore PCR bit for 6300ESB, 82801[D-G]B chips. It seems
that enabling busmastering would result in PCR bit ON after codec
reset.
While I'm here add DELAY(1) to codec access routine to give reasonable
time to codec operation. Without the delay, it would cause problems on
super-fast machines(> 2GHz). Also enable legacy audio for all 6300ESB,
82801[D-G]B chips. Previously, it enabled legacy audio for 82801DB(ICH4)
chip only.
Reported by: Maxim Maximov mcsi AT mcsi DOT pp DOT ru
Andrew Bliznak andriko.b AT gmail DOT com
Tested by: brueffer, Maxim Maximov, Andrew Bliznak
Bruce Evans [Mon, 21 Nov 2005 00:38:21 +0000 (00:38 +0000)]
Use double precision to simplify and optimize a long division.
On athlons, this gives a speedup of 10-20% for tanf() on uniformly
distributed args in [-2Pi, 2Pi]. (It only directly applies for 43%
of the args and gives a 16-20% speedup for these (more for AXP than
A64) and this gives an overall speedup of 10-12% which is all that it
should; however, it gives an overall speedup of 17-20% with gcc-3.3
on AXP-A64 by mysteriously effected cases where it isn't executed.)
I originally intended to use double precision for all internals of
float trig functions and will probably still do this, but benchmarking
showed that converting to double precision and back is a pessimization
in cases where a simple float precision calculation works, so it may
be optimal to switch precisions only when using extra precision is
much simpler.
Craig Rodrigues [Sun, 20 Nov 2005 17:04:50 +0000 (17:04 +0000)]
If export mount flag is not passed in, set default parameters
for export structure and pass that to vfs_export().
Currently in userland mount(8), an export structure is unconditionally
passed in, only for UFS. This is an attempt to move that UFS-specific
behavior out of mount(8) and into the UFS filesystem code.
Lukas Ertl [Sun, 20 Nov 2005 12:12:31 +0000 (12:12 +0000)]
Always declare variables at the start of the function.
Don't allocate potentially large variables on the stack.
Check strsep() return values when the string comes from userland.
Shorten variable names for lucidity's sake.
Marcel Moolenaar [Sun, 20 Nov 2005 01:31:29 +0000 (01:31 +0000)]
Improve inittodr(). Assume the real-time clock is reliable and only
use the base time in case the real-time clock is bogus or behind the
base time. Most importantly, don't sanity-check the base time up front
because it may be zero. This is not a preposterous condition. It just
means that none of the file systems have their mount time updated.
Bill Paul [Sun, 20 Nov 2005 01:29:29 +0000 (01:29 +0000)]
Correct the API for Windows interupt handling a little. The prototype
for a Windows ISR is 'BOOLEAN isrfunc(KINTERRUPT *, void *)' meaning
the ISR get a pointer to the interrupt object and a context pointer,
and returns TRUE if the ISR determines the interrupt was really generated
by the associated device, or FALSE if not.
I had mistakenly used 'void isrfunc(void *)' instead. It happens the
only thing this affects is the internal ndis_intr() ISR in subr_ndis.c,
but it should be fixed just in case we ever need to register a real
Windows ISR vi IoConnectInterrupt().
For NDIS miniports that provide a MiniportISR() method, the 'is_our_intr'
value returned by the method serves as the return value from ndis_isr(),
and 'call_isr' is used to decide whether or not to schedule the interrupt
handler via DPC. For drivers that only supply MiniportEnableInterrupt()
and MiniportDisableInterrupt() methods, call_isr is always TRUE and
is_our_intr is always FALSE.
In the end, there should be no functional changes, except that now
ntoskrnl_intr() can terminate early once it finds the ISR that wants
to service the interrupt.
Craig Rodrigues [Sat, 19 Nov 2005 23:28:19 +0000 (23:28 +0000)]
Add more options to ffs_opts, so that vfs_filteropts() will not
complain when we pass these options to a UFS filesystem as strings
via nmount(): noexec, nosuid, nosymfollow, sync, suiddir