Yaroslav Tykhiy [Thu, 21 Dec 2006 21:35:49 +0000 (21:35 +0000)]
Allow this module to get its options from the kernel build directory
instead of always hard coding them in CFLAGS. POLA is kept here:
The module file built with GENERIC stays the same.
Randall Stewart [Thu, 21 Dec 2006 19:58:04 +0000 (19:58 +0000)]
The prepend function did not handle non-pkthdr's correctly.
It always called MH_ALIGN for small lengths being
prepended (less than MHLEN). This meant that if you did
a prepend on a non M_PKTHDR the system would panic with
the KASSERT in MH_ALIGN. Instead we are now aware of
this and do a MH_ALIGN or M_ALIGN as appropriate.
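A minimal userland sketch of the choice described above, not the kernel
code itself. The toy_* names, the flag value and the two buffer sizes are
hypothetical stand-ins for the mbuf macros and constants; only the
"pick the align routine by packet-header flag" logic is the point.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PKTHDR 0x1   /* hypothetical stand-in for M_PKTHDR */
#define TOY_MLEN   224   /* hypothetical data bytes in a plain mbuf */
#define TOY_MHLEN  168   /* hypothetical data bytes after a packet header */

struct toy_mbuf {
	int    flags;
	size_t len;
	size_t data_off;     /* offset of the data start inside the buffer */
};

/* Place len bytes at the end of a packet-header mbuf, long-aligned. */
void toy_mh_align(struct toy_mbuf *m, size_t len)
{
	/* Models the assertion that fired for non-pkthdr mbufs. */
	assert(m->flags & TOY_PKTHDR);
	m->data_off = (TOY_MHLEN - len) & ~(sizeof(long) - 1);
}

/* Same placement for an mbuf without a packet header. */
void toy_m_align(struct toy_mbuf *m, size_t len)
{
	m->data_off = (TOY_MLEN - len) & ~(sizeof(long) - 1);
}

/* The fixed behaviour: choose the align routine that matches the mbuf. */
void toy_prepend_align(struct toy_mbuf *m, size_t len)
{
	if (m->flags & TOY_PKTHDR)
		toy_mh_align(m, len);
	else
		toy_m_align(m, len);
	m->len = len;
}
```

Calling toy_mh_align() directly on an mbuf without TOY_PKTHDR trips the
assertion, which is the userland analogue of the panic described above.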
Ceri Davies [Thu, 21 Dec 2006 19:08:25 +0000 (19:08 +0000)]
Correct the description of minpoll and maxpoll.
Note that while later versions of the ntpd documentation use the term
"dual logarithm", the text added here is consistent with the remainder
of the current document.
Robert Watson [Thu, 21 Dec 2006 09:51:34 +0000 (09:51 +0000)]
Remove mac_enforce_subsystem debugging sysctls. Enforcement on
subsystems will be a property of policy modules, which may require
access control check entry points to be invoked even when not actively
enforcing (i.e., to track information flow without providing
protection).
Obtained from: TrustedBSD Project
Suggested by: Christopher dot Vance at sparta dot com
Marcel Moolenaar [Thu, 21 Dec 2006 05:40:46 +0000 (05:40 +0000)]
Unbreak 64-bit little-endian systems that do require alignment.
The fix involves using le16dec(), le32dec(), le16enc() and
le32enc(). This eliminates invalid casts and duplicated logic.
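A sketch of why byte-wise decoding avoids the alignment problem. The
sketch_* functions below are hypothetical userland equivalents written for
illustration, not the sys/endian.h implementations; they read and write one
byte at a time, so the buffer may start at any address.

```c
#include <stdint.h>

/* Decode a 16-bit little-endian value from an arbitrarily aligned buffer. */
uint16_t sketch_le16dec(const void *buf)
{
	const uint8_t *p = buf;

	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode a 32-bit little-endian value from an arbitrarily aligned buffer. */
uint32_t sketch_le32dec(const void *buf)
{
	const uint8_t *p = buf;

	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Encode a 32-bit value as little-endian bytes, one byte at a time. */
void sketch_le32enc(void *buf, uint32_t v)
{
	uint8_t *p = buf;

	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}
```

By contrast, a cast such as *(uint32_t *)(raw + 1) may fault on a machine
that requires natural alignment.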
Robert Watson [Wed, 20 Dec 2006 23:41:59 +0000 (23:41 +0000)]
Comment LABEL_TO_SLOT() macro, including observing that we'd like to improve
this policy API to avoid encoding struct label binary layout in policy
modules.
Keep in sync with the if_bridge(4) module (rev. 1.20 if_bridgevar.h,
1.12 bridgestp.h) and rename all PointToPoint related variables
from P2P to PTP (s/P2P/PTP/g s/p2p/ptp/g).
Robert Watson [Wed, 20 Dec 2006 20:40:29 +0000 (20:40 +0000)]
Externalize local stack copy of the ifnet label, rather than the copy on
the ifnet itself. The stack copy has been made while holding the mutex
protecting ifnet labels, so copying from the ifnet copy could result in
an inconsistent version being copied out.
Jung-uk Kim [Wed, 20 Dec 2006 20:17:35 +0000 (20:17 +0000)]
MFP4: 109655
- Move linux_nanosleep() from src/sys/amd64/linux32/linux32_machdep.c to
src/sys/compat/linux/linux_time.c.
- Validate timespec ranges before use as Linux kernel does.
- Fix l_timespec structure.
- Clean up style(9) nits.
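A sketch of the kind of range check meant by "validate timespec ranges",
assuming the usual nanosleep rule that tv_sec must not be negative and
tv_nsec must lie in [0, 999999999]. The struct layout and the function
name are hypothetical and do not reproduce the code in linux_time.c.

```c
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical 32-bit Linux-side layout, for illustration only. */
struct sketch_l_timespec {
	int32_t tv_sec;
	int32_t tv_nsec;
};

/* Convert to the native struct, rejecting out-of-range values first. */
int sketch_linux_to_native_timespec(struct timespec *ntp,
    const struct sketch_l_timespec *ltp)
{
	if (ltp->tv_sec < 0 || ltp->tv_nsec < 0 || ltp->tv_nsec > 999999999)
		return EINVAL;
	ntp->tv_sec = ltp->tv_sec;
	ntp->tv_nsec = ltp->tv_nsec;
	return 0;
}
```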
Jung-uk Kim [Wed, 20 Dec 2006 20:08:45 +0000 (20:08 +0000)]
MFP4: 110179
Add rudimentary IPC_INFO/MSG_INFO command support for linux_msgctl()
to pacify Linux ipcs(1). While I am here, add more bound checks
for linux_msgsnd() and linux_msgrcv().
Jung-uk Kim [Wed, 20 Dec 2006 19:26:30 +0000 (19:26 +0000)]
MFP4: (part of) 110058
copyin()/copyout() for message type is separated from msgsnd()/msgrcv() and
it is done from its wrapper functions to support 32-bit emulations. After I
implemented this, I have briefly referenced NetBSD and Darwin. NetBSD passes
copyin()/copyout() function pointers from wrappers. Darwin passes size of
message type as an argument, which is actually similar to my first
implementation (P4 109706). We may revisit these implementations later.
Xin LI [Wed, 20 Dec 2006 17:10:53 +0000 (17:10 +0000)]
On amd64 platform, use linux32 headers so 32-bit Linux applications
would be able to work with aac(4).
This approach is used by some other drivers as well. However, we
need a more generic way to do this in order to avoid having to
special case headers in individual drivers for each platform.
Yaroslav Tykhiy [Wed, 20 Dec 2006 12:59:50 +0000 (12:59 +0000)]
Syscons cannot be stopped, so provide a no-op stop method.
The default stop method from rc.subr isn't suited for this
case and produces a bogus warning: "syscons not running".
Bruce Evans [Wed, 20 Dec 2006 12:03:21 +0000 (12:03 +0000)]
In bge_txeof(), cancel the watchdog timeout if all descriptors have
been handled instead of when at least one descriptor was just handled.
For bge, it is normal to get a txeof when only a small fraction of the
queued tx descriptors have been handled, so the bug broke the watchdog
in a usual case.
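A toy model of the corrected rule, not the driver code. The structure and
function names are hypothetical; the point is only that the timer is
cleared when no queued descriptors remain, rather than whenever at least
one was reclaimed.

```c
/* Hypothetical, simplified view of a transmit ring. */
struct toy_tx_ring {
	unsigned queued;   /* descriptors handed to the device, not yet done */
	int      timer;    /* watchdog countdown; 0 means disarmed */
};

/* Reclaim `completed` descriptors; disarm the watchdog only when idle. */
void toy_txeof(struct toy_tx_ring *r, unsigned completed)
{
	if (completed > r->queued)
		completed = r->queued;
	r->queued -= completed;

	/* The old behaviour amounted to: if (completed > 0) r->timer = 0; */
	if (r->queued == 0)
		r->timer = 0;
}
```

With ten descriptors queued and three completed, the timer stays armed, so
a later stall on the remaining seven is still caught.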
Yaroslav Tykhiy [Wed, 20 Dec 2006 11:37:15 +0000 (11:37 +0000)]
Improve rc.d conformance:
- don't play a needless trick with prestart, just use start method;
- provide no-op stop method so that we don't get bogus "abi not running" error.
Bruce Evans [Wed, 20 Dec 2006 11:14:45 +0000 (11:14 +0000)]
Avoid a race and a pessimization in bge_intr():
- moved the synchronizing bus read to after the bus write for the first
interrupt ack so that it actually synchronizes everything necessary.
We were acking not only the status update that triggered the interrupt
together with any status updates that occurred before we got around
to the bus write for the ack, but also any status updates that occur
after we do the bus write but before the write reaches the device.
The corresponding race for the second interrupt ack resulted in
sometimes returning from the interrupt handler with acked but
unserviced interrupt events. Such events then remain unserviced
until further events cause another interrupt or the watchdog times
out.
The race was often lost on my 5705, apparently since my 5705 has broken
event coalescing which causes a status update for almost every packet,
so another status update is quite likely to occur while the interrupt
handler is running. Watchdog timeouts weren't very noticeable,
apparently because bge_txeof() has one of the usual bugs resetting the
watchdog.
- don't disable device interrupts while bge_intr() is running. Doing this
just had the side effects of:
- entering a device mode in which different coalescing parameters apply.
Different coalescing parameters can be used to either inhibit or
enhance the chance of getting another status update while in the
interrupt handler. This feature is useless with the current
organization of the interrupt handler but might be useful with a
taskqueue handler.
- giving a race for ack+reenable/return. This cannot be handled
by simply rearranging the order of bus accesses like the race for
ack+keepenable/entry. It is necessary to sync the ack and then
check for new events.
- taking longer, especially with the extra code to avoid the race on
ack+reenable/return.
In rev. 1.514, iodone on async buffer may happen before code checks the
vnode v_flag. For cluster buffers this would result in dereferencing NULL
b_vp. To prevent the panic, cache relevant vnode flag before calling
bstrategy.
Reported by: Peter Holm, kris
Tested by: Peter Holm
Reviewed by: tegge
Pointy hat to: kib
Yaroslav Tykhiy [Wed, 20 Dec 2006 06:20:04 +0000 (06:20 +0000)]
Allow for module-path being a semicolon-separated list of dirs.
This is consistent with kern.module_path sysctl and also compensates
for the unconventional syntax of asf(8) where the last of multiple
arguments is the output file, which prevents us from using the
traditional Unix syntax "foo file ..." to specify multiple module
dirs.
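A sketch of splitting such a list, assuming empty elements are simply
skipped. The function name and callback signature are hypothetical and are
not taken from asf(8).

```c
#include <stddef.h>
#include <string.h>

/*
 * Walk a semicolon-separated list of directories.  For each non-empty
 * element, call visit(start, length, arg) if visit is not NULL.
 * Returns the number of non-empty elements found.
 */
size_t sketch_split_module_path(const char *path,
    void (*visit)(const char *, size_t, void *), void *arg)
{
	size_t n = 0;

	while (*path != '\0') {
		size_t len = strcspn(path, ";");

		if (len > 0) {
			if (visit != NULL)
				visit(path, len, arg);
			n++;
		}
		path += len;
		if (*path == ';')
			path++;
	}
	return n;
}
```

For example, "/boot/kernel;/boot/modules" yields two elements, so a single
module-path argument can carry several directories.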
David Xu [Wed, 20 Dec 2006 04:40:39 +0000 (04:40 +0000)]
Add a lwpid field to the per-cpu structure; the lwpid represents the id of
the thread currently running on each cpu. This allows us to add in-kernel
adaptive spinning for user-level mutexes. While spinning in user space is
possible, without the correct thread running state exported from the kernel
it can hardly be implemented efficiently without wasting cpu cycles.
However, exporting the thread running state is unlikely to be implemented
soon, as the interfaces have to be designed and stabilized first. This
implementation is transparent to user space and can be disabled
dynamically. With this change, the performance of a mutex ping-pong program
improves massively on an SMP machine. Performance of the mysql super-smack
select benchmark increases by about 7% on an Intel dual dual-core2 Xeon
machine, which indicates that on systems with a bunch of cpus and low
system-call overhead (athlon64, opteron, and core-2 are known to be fast),
the adaptive spin does help performance.
Added sysctls:
kern.threads.umtx_dflt_spins
if the sysctl value is non-zero, a zero umutex.m_spincount will
cause the sysctl value to be used as a spin cycle count.
kern.threads.umtx_max_spins
the sysctl sets upper limit of spin cycle count.
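A sketch of how the two sysctls could combine into an effective spin
count. The function name is hypothetical, and applying the upper limit to
the default value as well is an assumption; the kernel code may order
these checks differently.

```c
/*
 * m_spincount: per-mutex value (0 means "not set by the user")
 * dflt_spins:  value of kern.threads.umtx_dflt_spins
 * max_spins:   value of kern.threads.umtx_max_spins
 */
int sketch_umtx_spin_count(int m_spincount, int dflt_spins, int max_spins)
{
	int spins = m_spincount;

	/* A zero per-mutex count falls back to a non-zero default. */
	if (spins == 0 && dflt_spins != 0)
		spins = dflt_spins;

	/* Assumption: the limit caps whichever value was chosen. */
	if (spins > max_spins)
		spins = max_spins;

	return spins;
}
```

With both the per-mutex count and the default at zero, the result is zero
and no spinning takes place.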
Marius Strobl [Wed, 20 Dec 2006 02:13:59 +0000 (02:13 +0000)]
- Use the re_tick() callout instead of if_slowtimo() for driving
re_watchdog() in order to avoid races accessing if_timer.
- Use bus_get_dma_tag() so re(4) works on platforms requiring it.
- Remove invalid BUS_DMA_ALLOCNOW when creating the parent DMA tag
and the tags that are used for static memory allocations.
- Don't bother to set if_mtu to ETHERMTU, ether_ifattach() does that.
- Remove an unused variable in re_intr().
Marius Strobl [Wed, 20 Dec 2006 01:49:56 +0000 (01:49 +0000)]
Fix a bug originally introduced in rev. 1.74; don't reload the
watchdog timer in dc_txeof() in case there are still unhandled
descriptors as dc_poll() invokes dc_txeof() unconditionally.
Otherwise this would result in the watchdog timer constantly being
reloaded and thus prevent the watchdog from ever firing in
the DEVICE_POLLING case.
Peter Grehan [Wed, 20 Dec 2006 01:10:21 +0000 (01:10 +0000)]
Remove bogus increment of re-hashed PTEG index. This snuck in with r1.12 of
pmap.c, and is potentially the cause of hangs reported on machines with a
small amount of memory. On machines with sufficient RAM, and without a lot
of processes running, this situation would probably never occur.
Testing is still incomplete, but it is obviously wrong so remove the
offending code now.
The issue of what to do when both the primary and secondary hash overflow
is still open.
Reported by: Dan Kresja at windriver dot com, via alc
Jung-uk Kim [Wed, 20 Dec 2006 00:08:47 +0000 (00:08 +0000)]
- Do not depend on auto negotiation for link speed/duplex status.
- Read link status from BMSR instead of auxiliary status register.
- Clean up style(9) nits.
Jung-uk Kim [Tue, 19 Dec 2006 22:50:49 +0000 (22:50 +0000)]
Clear full-duplex when half-duplex flag is set. This actually makes
'mediaopt half-duplex' work as it should. It is now equivalent to
'-mediaopt full-duplex'.
Craig Rodrigues [Tue, 19 Dec 2006 02:31:58 +0000 (02:31 +0000)]
For big-endian version of getulong() macro, cast result to u_int32_t.
This macro was written expecting a 32-bit unsigned long, and
doesn't work properly on 64-bit systems. This bug caused vn_stat()
to return incorrect values for files larger than 2 GB on msdosfs filesystems
on 64-bit systems.
PR: 106703
Submitted by: Axel Gonzalez <loox e-shell net>
MFC after: 3 days
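A sketch of one plausible way a byte-wise decoder misbehaves with a 64-bit
unsigned long, and how a 32-bit cast avoids it. The function names are
hypothetical and the macro text itself is not reproduced here; the
"unfixed" variant only models a value that passes through a signed 32-bit
int before being widened.

```c
#include <stdint.h>

/* Assemble four little-endian bytes as an unsigned 32-bit pattern. */
static uint32_t sketch_bits(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fixed form: the result is an unsigned 32-bit value before widening. */
unsigned long sketch_getulong_fixed(const uint8_t *p)
{
	return (uint32_t)sketch_bits(p);
}

/*
 * Model of the failure: the value is treated as a signed int first, so
 * bit 31 is sign-extended when unsigned long is 64 bits wide.  (The
 * conversion to int32_t is implementation-defined for such values.)
 */
unsigned long sketch_getulong_unfixed(const uint8_t *p)
{
	int32_t as_int = (int32_t)sketch_bits(p);

	return (unsigned long)as_int;
}
```

For a 2 GB size field (bytes 00 00 00 80), the fixed form returns
0x80000000, while the modelled unfixed form returns 0xffffffff80000000
where unsigned long is 64 bits wide.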
Craig Rodrigues [Tue, 19 Dec 2006 01:55:45 +0000 (01:55 +0000)]
Fix get_ulong() macro on AMD64 (or any little-endian 64-bit platform).
This bug caused vn_stat() to fail on files larger than 2 GB on msdosfs
filesystems on AMD64.
PR: 106703
Tested by: Axel Gonzalez <loox e-shell net>
MFC after: 3 days
Peter Edwards [Mon, 18 Dec 2006 17:08:07 +0000 (17:08 +0000)]
Clean bound and non-bound pthread structures consistently before
they become candidates for reuse. Without this fix, some of the
state from a thread structure's previous incarnation could interfere
with its new one. Specifically, a non-bound thread started as
"suspended" (see pthread_attr_setcreatesuspend_np()) might not get
scheduled at all when resumed, as the "active" flag would be set
spuriously.
Jung-uk Kim [Mon, 18 Dec 2006 16:40:04 +0000 (16:40 +0000)]
- Remove stale VPD support and its comment and get device name from VPD API.
- Do not repeatedly read vendor/device IDs while probing.
- Remove redundant bzero(3) for softc. device_get_softc(9) does it for free[1].
Kip Macy [Mon, 18 Dec 2006 07:35:14 +0000 (07:35 +0000)]
Add an interface for passing the entire kernel size up front to the
loader so that memory can be allocated aligned at the beginning of
the desired large page.