tjr [Tue, 24 Aug 2004 13:00:55 +0000 (13:00 +0000)]
Replace the current implementations of ftw() and nftw() with the OpenBSD
implementations written by Todd C. Miller. These are cleaner, less buggy
and actively maintained.
rwatson [Tue, 24 Aug 2004 05:28:18 +0000 (05:28 +0000)]
Conditional acquisition of socket buffer mutexes when testing socket
buffers with kqueue filters is no longer required: the kqueue framework
will guarantee that the mutex is held on entering the filter, either
due to a call from the socket code already holding the mutex, or by
explicitly acquiring it. This removes the last of the conditional
socket locking.
imp [Tue, 24 Aug 2004 05:19:15 +0000 (05:19 +0000)]
Set the description to NULL in the right detach routine. This should
keep dangling pointers to strings in loaded modules from hanging
around after the drivers are unloaded.
rwatson [Tue, 24 Aug 2004 04:59:26 +0000 (04:59 +0000)]
Make sure to properly initialize 'size' to sizeof(sin) before passing
it into accept(). Depending on the initial value in memory, it is
otherwise possible to get EINVAL.
rwatson [Tue, 24 Aug 2004 04:02:41 +0000 (04:02 +0000)]
Add a basic kqueue + UNIX domain socket pair regression test to do some
elementary exercising of kqueues on datagram and stream sockets. Note
that the datagram write kqueue case is left untested due to potentially
confusing behavior for the developer (me) that might require attention.
dwhite [Tue, 24 Aug 2004 03:47:41 +0000 (03:47 +0000)]
Pick up changes in rev 1.8 of src/sys/dev/ic/mpt_netbsd.c from NetBSD.
Set the DMA SGL length correctly if the DMA request must be chained because
it is too large to fit in one SGL.
This should fix this driver for some Dell Precision systems.
RELENG_5 candidate.
PR: kern/66479
Submitted by: HITOSHI Osada <qfh02545@nifty.com>
peter [Tue, 24 Aug 2004 00:15:37 +0000 (00:15 +0000)]
struct tm.tm_year is listed as 'years since 1900', and is signed. On
64 bit systems, years roughly -2^31 through 2^31 can be represented in
time_t without any trouble. 32 bit time_t systems only range from
roughly 1902 through 2038. As a consequence, none of the date munging
code for all the various calendar tweaks before then is present. There
are other problems including the fact that there was no 'year zero' and
so on. So rather than get excited about trying to figure out when the
calendar jumped by two weeks etc, simply disallow negative (ie: prior to
1900) years.
This happens to have an important side effect. If you bzero a 'struct
tm', it corresponds to 'Jan 0, 1900, 00:00 GMT'. This happens to be
representable (after canonification) in 64 bit time_t space. Zero tm
structs are generally an error and mktime normally returns -1 for them.
Interestingly, it tries to canonify the 'jan 0' to 'dec 31, 1899', ie:
year -1. This conveniently trips the negative year test above, which
means we can trivially detect the null 'tm' struct.
This actually tripped up code at work. :-/ (Don't ask)
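The interaction between 'Jan 0' and the negative year test can be sketched as follows. This is an illustration only, not the libc code: check_tm_year and days_in_month are hypothetical helpers, tm_mon is assumed to start in the range 0..11, and only the day, month and year fields are normalized.

```c
#include <string.h>
#include <time.h>

/* Days in month 'mon' (0..11) of year 'year' (years since 1900). */
static int
days_in_month(int year, int mon)
{
	static const int mdays[12] =
	    { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
	int y = year + 1900;

	if (mon == 1 && y % 4 == 0 && (y % 100 != 0 || y % 400 == 0))
		return (29);
	return (mdays[mon]);
}

/*
 * Hypothetical helper: fold a day-of-month below 1 into the previous
 * month(s), then reject years before 1900.
 * Returns 0 if the year is acceptable, -1 if it is negative.
 */
int
check_tm_year(struct tm *tm)
{
	while (tm->tm_mday < 1) {
		if (--tm->tm_mon < 0) {
			tm->tm_mon = 11;
			tm->tm_year--;	/* borrow from the previous year */
		}
		tm->tm_mday += days_in_month(tm->tm_year, tm->tm_mon);
	}
	return (tm->tm_year < 0 ? -1 : 0);
}
```

A zeroed struct tm normalizes to Dec 31 of year -1 in this sketch, so the negative year test rejects it.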
imp [Mon, 23 Aug 2004 23:28:02 +0000 (23:28 +0000)]
Even in an 80-column, fixed-width font, there's plenty of room for all
the arguments to bus_dmamap_load, so don't use '...' but list the
actual args. '...' usually means a variable number of args (cf
printf(3)), but bus_dmamap_load takes a fixed number of arguments.
imp [Mon, 23 Aug 2004 23:17:31 +0000 (23:17 +0000)]
In the SYNOPSIS section, move the bus_dmamem_alloc function prototype
to just before bus_dmamem_free, which is (a) more logical; (b) likely
what was originally intended and (c) matches the order in the NAME and
FUNCTIONS sections.
peter [Mon, 23 Aug 2004 21:39:29 +0000 (21:39 +0000)]
Commit Doug White and Alan Cox's fix for the cross-ipi smp deadlock.
We were obtaining different spin mutexes (which disable interrupts after
acquisition) and spin waiting for delivery. For example, KSE processes
do LDT operations which use smp_rendezvous, while other parts of the
system are doing things like tlb shootdowns with a different mutex.
This patch uses the common smp_rendezvous mutex for all MD home-grown
IPIs that spinwait for delivery. Having the single mutex means that
the spinloop to acquire it will enable interrupts periodically, thus
avoiding the cross-ipi deadlock.
mjacob [Mon, 23 Aug 2004 19:04:19 +0000 (19:04 +0000)]
Until I can get a clearer architecture from PHK about why he wants
the geometry code to grab a mutex that prohibits any driver on the
stack below it from sleeping, it's not safe to allow anything in
the top half of isp to sleep (excepting the thread that Fibre Channel
instances use to re-scan loops/fabrics).
imp [Mon, 23 Aug 2004 18:51:36 +0000 (18:51 +0000)]
Add a blanket note about 5.x being the same as 6.0 and vice versa for
the time being. Also add a note that says we are going to remove the
band-aids for 4.early -> 6.0 after 5.3-RELEASE so people get used to
the idea, even though it has been planned since before 5.0 was
released.
le [Mon, 23 Aug 2004 17:50:18 +0000 (17:50 +0000)]
Compare the addresses of two RAID5 work packets directly instead
of the addresses of their related bios when locking one out, since
they could share a bio and this could lead to parity corruption.
njl [Mon, 23 Aug 2004 16:28:42 +0000 (16:28 +0000)]
Rework sysresource management. Instead of having each sysresource object
hold its own values, pass them up to the parent (acpi0) and merge/uniq them
on the way. After the namespace evaluation, acpi will reserve these
resources and manage them via rman before bus_generic_probe() and
bus_generic_attach(). This is necessary because some systems specify
conflicting resources in separate sysresource objects. It's also cleaner
in that the interface between sysresource and acpi is now merely the parent's
resource list. This code handles the following cases:
1. Unique resource: add it to the parent via bus_set_resource().
2. New wholly contained in old: discard new.
3. New tail overlaps old head: grow old head downward.
AND/OR
4. New head overlaps old tail: grow old tail upward.
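The four cases above can be sketched on a single pair of ranges. This is a simplified illustration, assuming closed ranges [start, end]; struct res_range, enum merge_result and merge_range are hypothetical names and do not reflect the acpi(4) data structures or the rman interface.

```c
#include <stdint.h>

/* Hypothetical closed range [start, end]. */
struct res_range {
	uint64_t start;
	uint64_t end;
};

enum merge_result { MERGE_UNIQUE, MERGE_CONTAINED, MERGE_GREW };

/*
 * Merge the new range 'nr' into the existing range 'old'.
 * MERGE_UNIQUE means the caller should add 'nr' as a separate resource.
 */
enum merge_result
merge_range(struct res_range *old, const struct res_range *nr)
{
	/* Case 1: no overlap at all, so the resource is unique. */
	if (nr->end < old->start || nr->start > old->end)
		return (MERGE_UNIQUE);

	/* Case 2: new is wholly contained in old, so discard it. */
	if (nr->start >= old->start && nr->end <= old->end)
		return (MERGE_CONTAINED);

	/* Case 3: new tail overlaps old head, grow old head downward. */
	if (nr->start < old->start)
		old->start = nr->start;

	/* Case 4: new head overlaps old tail, grow old tail upward. */
	if (nr->end > old->end)
		old->end = nr->end;

	return (MERGE_GREW);
}
```

Cases 3 and 4 are checked independently, which covers the "AND/OR" situation where the new range extends past both ends of the old one.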
obrien [Mon, 23 Aug 2004 16:25:07 +0000 (16:25 +0000)]
Forced commit to document:
Doug Rabson <dfr@nlsystems.com>
Message-Id: <200408220940.18504.dfr@nlsystems.com>
Size does matter for the alpha loader. The firmware gives it 256k
of address space which we overflowed many years ago. I extended it
in sys/boot/alpha/common/main.c:extend_heap() by adding 512k to the
loader's mapped address space.
sobomax [Mon, 23 Aug 2004 15:55:03 +0000 (15:55 +0000)]
My recent measurement shows that CPU_DISABLE_CMPXCHG is no longer necessary
with VMware 4.x. At least with VMware version 4.5.2, i386 version of
atomic_cmpset_int() is about 30 times slower than non-i386 version. It
makes this delta a good 5.3 MFC candidate, since otherwise it will
mislead users who run FreeBSD under modern VMware.
des [Mon, 23 Aug 2004 12:41:29 +0000 (12:41 +0000)]
Don't try to translate the control message unless we're certain it's
valid; otherwise a caller could trick us into changing any 32-bit word
in kernel memory to LINUX_SOL_SOCKET (0x00000001) if its previous value
is SOL_SOCKET (0x0000ffff).
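A minimal sketch of the "validate before translating" idea, under stated assumptions: struct sk_cmsghdr and translate_cmsg_level are hypothetical and much simpler than the kernel's control message handling; only the two constant values come from the message above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_SOL_SOCKET		0xffff	/* native value quoted above */
#define SK_LINUX_SOL_SOCKET	1	/* Linux value quoted above */

/* Hypothetical control message header, not the kernel's struct cmsghdr. */
struct sk_cmsghdr {
	uint32_t len;		/* total length, header included */
	int32_t  level;
	int32_t  type;
};

/*
 * Rewrite the level field only after checking that the buffer really
 * holds a header and that the header's own length is plausible.
 * Returns 0 if the message was valid, -1 if it was rejected untouched.
 */
int
translate_cmsg_level(void *buf, size_t buflen)
{
	struct sk_cmsghdr hdr;

	if (buf == NULL || buflen < sizeof(hdr))
		return (-1);
	memcpy(&hdr, buf, sizeof(hdr));
	if (hdr.len < sizeof(hdr) || hdr.len > buflen)
		return (-1);

	if (hdr.level == SK_SOL_SOCKET) {
		hdr.level = SK_LINUX_SOL_SOCKET;
		memcpy(buf, &hdr, sizeof(hdr));
	}
	return (0);
}
```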
imp [Mon, 23 Aug 2004 03:38:21 +0000 (03:38 +0000)]
Make this compile again in the standalone and the MODULES_WITH_WORLD
environments. Chances are good that this doesn't produce a good
module, but I leave the proper defaults to the dummy opt_* files to
the author.
rwatson [Mon, 23 Aug 2004 03:00:27 +0000 (03:00 +0000)]
Remove in6_prefix.[ch] and the contained router renumbering capability.
The prefix management code currently resides in nd6, leaving only the
unused router renumbering capability in the in6_prefix files. Removing
it will make it easier for us to provide locking for the remainder of
IPv6 by reducing the number of objects requiring synchronized access.
This functionality has also been removed from NetBSD and OpenBSD.
Submitted by: George Neville-Neil <gnn at neville-neil.com>
Discussed with/approved by: suz, keiichi at kame.net, core at kame.net
kan [Mon, 23 Aug 2004 02:39:45 +0000 (02:39 +0000)]
Temporarily back out r1.74 as it seems to cause a number of regressions
according to numerous reports. It might get reintroduced some time later
when the exact failure mode is better understood.
mux [Sun, 22 Aug 2004 23:01:13 +0000 (23:01 +0000)]
Pass a correct lowaddr to bus_dma_tag_create(), lnc(4) cards can only
deal with 24-bit addresses. While the two other attachments, namely
isa and cbus, do it properly, the PCI attachment was passing
BUS_SPACE_MAXADDR instead of BUS_SPACE_MAXADDR_24BIT. This bug
became apparent with the new contigmalloc() code.
This fixes the problem reported with lnc(4) interfaces inside VMware,
and should theoretically also fix any user of a PCI lnc(4) card. It
is a RELENG_5 MFC candidate.
marcel [Sun, 22 Aug 2004 20:52:23 +0000 (20:52 +0000)]
Move the cow field between wire_count and hold_count. This is the
position that is 64-bit aligned and makes sure that the valid and
dirty fields are also 64-bit aligned. This means that if PAGE_SIZE
is 32K, the size of the vm_page structure is only increased by 8
bytes instead of 16 bytes. More importantly, the vm_page structure
is either 120 or 128 bytes on ia64. These are "interesting" sizes.
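The effect of field placement on padding can be illustrated with two toy layouts. These are hypothetical structures that only borrow a few field names; they are not struct vm_page, and the exact sizes depend on the platform ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Small fields scattered between 64-bit fields: padding may be needed. */
struct loose {
	uint32_t wire_count;
	uint64_t valid;
	uint16_t cow;
	uint16_t hold_count;
	uint64_t dirty;
};

/* Small fields grouped so the 64-bit fields start on 8-byte offsets. */
struct tight {
	uint32_t wire_count;
	uint16_t cow;
	uint16_t hold_count;
	uint64_t valid;
	uint64_t dirty;
};

/* Bytes saved by the grouped layout on the current ABI (may be zero). */
size_t
padding_saved(void)
{
	return (sizeof(struct loose) - sizeof(struct tight));
}
```

On a typical LP64 ABI the grouped layout is 8 bytes smaller; on ABIs that only require 4-byte alignment for 64-bit integers the two sizes may be equal.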
cperciva [Sun, 22 Aug 2004 19:44:24 +0000 (19:44 +0000)]
When creating a new md, wait for geom's event queue to become empty
before returning. Device nodes are created via the "taste" mechanism,
so this is necessary in order to make sure that devfs entries are
created before mdconfig(8) returns.
green [Sun, 22 Aug 2004 18:57:40 +0000 (18:57 +0000)]
The new contigmalloc code is exposing a lot of misuses of busdma memory
allocation. Notably, in this case, the driver tries to allocate several
pieces of memory and then fails if the pieces allocated after the first
do not come after it physically, and within a specific range (8MB I
believe). Of course, this could just as easily fail for any number of
reasons, but it almost always fails now that contiguous allocations start
at the end of possible specified memory locations rather than the beginning.
Allocate all the possibly-needed memory up front, even though it's a waste,
to get around this. The least bogus solution would be to take the physical
address from the first allocation and create a new tag that specified that
further allocations must follow it within that 8MB window, then use that
when allocating new channels, but that's left for anyone else that really
feels like doing it.
mlaier [Sun, 22 Aug 2004 16:42:28 +0000 (16:42 +0000)]
Allow early drop for non-ALTQ enabled queues in an ALTQ-enabled kernel.
Previously the early drop was disabled unconditionally for ALTQ-enabled
kernels.
This should give some benefit for the normal gateway + LAN-server case with
a busy LAN leg and an ALTQ managed uplink.
pjd [Sun, 22 Aug 2004 16:21:12 +0000 (16:21 +0000)]
Implementation of 'verify reading' algorithm, which uses parity data for
verification of regular data when device is in complete state.
On verification error, EIO error is returned for the bio and sysctl
kern.geom.raid3.stat.parity_mismatch is increased.
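A rough sketch of the check, assuming plain XOR parity across the data components. verify_parity is a hypothetical helper and is not the graid3 code; a caller would map a mismatch to EIO and bump the counter.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * XOR byte i of every data component and compare the result with byte i
 * of the parity component.
 * Returns 0 if all 'len' bytes agree, -1 on the first mismatch.
 */
int
verify_parity(const uint8_t *const *data, size_t ncomp,
    const uint8_t *parity, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		uint8_t x = 0;

		for (size_t c = 0; c < ncomp; c++)
			x ^= data[c][i];
		if (x != parity[i])
			return (-1);
	}
	return (0);
}
```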
marcel [Sun, 22 Aug 2004 06:24:59 +0000 (06:24 +0000)]
Part 2 of fixing the boot code: gcc 3.4 fixes.
The whole problem seems to be size. Which is odd, because it is said
that size doesn't matter. Anyway... Add -Os to strategic places in the
makefile to have the final loader be as small as possible. This seems
to be enough to make it work. For now... I think something is more
fundamentally wrong; or something more fundamental is wrong. Potato,
potaato.
kensmith [Sun, 22 Aug 2004 05:34:07 +0000 (05:34 +0000)]
Found another one. Why does mdconfig hate me? Add a "sleep 5" to
this script; without it, sparc64 ISO building was consistently failing
because the /dev/md0 device name was not present when the commands
following mdconfig ran. Apparently there is the possibility of a delay
between when mdconfig finishes and the names become visible in /dev.
Yes, we could code this better than an unconditional call to "sleep 5"
but IMHO we should fix the underlying problem instead.
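For illustration, a bounded poll is one way the unconditional sleep could be coded "better". This is only a sketch: wait_for_path is a hypothetical helper, the 10 ms step is an arbitrary choice, and it does not address the underlying delay between mdconfig returning and the name appearing.

```c
#include <sys/stat.h>
#include <time.h>

/*
 * Poll for 'path' every 10 ms until it exists or 'timeout_ms' has
 * elapsed. Returns 0 when the path exists, -1 on timeout.
 */
int
wait_for_path(const char *path, int timeout_ms)
{
	const struct timespec step = { 0, 10 * 1000 * 1000 };
	struct stat sb;
	int waited = 0;

	for (;;) {
		if (stat(path, &sb) == 0)
			return (0);
		if (waited >= timeout_ms)
			return (-1);
		nanosleep(&step, NULL);
		waited += 10;
	}
}
```

A call such as wait_for_path("/dev/md0", 5000) returns as soon as the node shows up instead of always waiting the full five seconds.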
csjp [Sun, 22 Aug 2004 02:03:41 +0000 (02:03 +0000)]
Currently, if the secure level is low enough, system flags can
be manipulated by prison root. In 4.x prison root can not manipulate
system flags, regardless of the security level. This behavior
should remain consistent to avoid any surprises which could lead
to security problems for system administrators who give out
privileged access to jails.
This commit changes suser_cred's flag argument from SUSER_ALLOWJAIL
to 0. This will prevent prison root from being able to manipulate
system flags on files.
rwatson [Sun, 22 Aug 2004 01:32:48 +0000 (01:32 +0000)]
When sliding the m_data pointer forward, update m_pkthdr.len as well
as m_len, or the pkthdr length will be inconsistent with the actual
length of data in the mbuf chain. The symptom of this occurring was
"out of data" warnings from in_cksum_skip() on large UDP packets sent
via the loopback interface.
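The invariant can be sketched with a heavily simplified stand-in for a packet-header mbuf. struct sk_mbuf and sk_trim_front are hypothetical; a real mbuf keeps the packet length in m_pkthdr.len and has many more fields.

```c
#include <stddef.h>

/* Hypothetical, simplified packet-header mbuf. */
struct sk_mbuf {
	char *m_data;		/* start of valid data in this mbuf */
	int   m_len;		/* bytes of data in this mbuf */
	int   m_pkthdr_len;	/* bytes of data in the whole chain */
};

/*
 * Slide the data pointer forward by 'n' bytes and keep both length
 * fields consistent. Returns 0 on success, -1 if 'n' is out of range.
 */
int
sk_trim_front(struct sk_mbuf *m, int n)
{
	if (n < 0 || n > m->m_len)
		return (-1);
	m->m_data += n;
	m->m_len -= n;
	m->m_pkthdr_len -= n;	/* the update that was missing */
	return (0);
}
```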
marcel [Sun, 22 Aug 2004 00:26:01 +0000 (00:26 +0000)]
Part 1 of fixing the boot code: binutils 2.15 fixes.
The binutils 2.15 assembler now automatically and non-optionally adds
the .eh_frame section for unwind information. This section appears
to wreak havoc on the final boot code. Fix this by using a special
linker script that discards the .eh_frame sections, but is otherwise
identical to the linker internal script used for -N.
alc [Sun, 22 Aug 2004 00:08:43 +0000 (00:08 +0000)]
In the previous revision, I failed to condition an early release of Giant
in vm_fault() on debug_mpsafevm. If debug_mpsafevm was not set, the result
was an assertion failure early in the boot process.
rwatson [Sat, 21 Aug 2004 21:45:40 +0000 (21:45 +0000)]
If a tunable for the routing socket netisr queue max is defined, allow it
to override the default value, rather than the default value overriding
the tunable.
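The intended ordering can be sketched in userland terms: set the default first, then let a valid tunable replace it. sk_pick_maxqlen is a hypothetical helper that parses a string; the kernel fetches tunables through its own interface, and the validity check here is an assumption. The default of 256 is the one named in the earlier commit that introduced the tunable.

```c
#include <limits.h>
#include <stdlib.h>

#define SK_DEFAULT_MAXQLEN	256

/*
 * Return the queue limit: the default unless 'tunable' holds a positive
 * decimal integer, in which case the tunable wins.
 */
int
sk_pick_maxqlen(const char *tunable)
{
	int qlen = SK_DEFAULT_MAXQLEN;	/* default is assigned first */

	if (tunable != NULL) {
		char *end;
		long v = strtol(tunable, &end, 10);

		/* Override only with a fully parsed, positive value. */
		if (end != tunable && *end == '\0' && v > 0 && v <= INT_MAX)
			qlen = (int)v;
	}
	return (qlen);
}
```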
rwatson [Sat, 21 Aug 2004 21:20:06 +0000 (21:20 +0000)]
Allow the size of the routing socket netisr queue to be configured using
the tunable or sysctl 'net.route.netisr_maxqlen'. Default the maximum
depth to 256 rather than IFQ_MAXLEN due to the downsides of dropping
routing messages.
MT5 candidate.
Discussed with: mdodd, mlaier, Vincent Jardin <jardin at 6wind.com>
trhodes [Sat, 21 Aug 2004 20:19:19 +0000 (20:19 +0000)]
Allow mac_bsdextended(4) to log failed attempts to syslog's AUTHPRIV
facility. This is disabled by default but may be turned on by using
the mac_bsdextended_logging sysctl.
trhodes [Sat, 21 Aug 2004 20:15:08 +0000 (20:15 +0000)]
Give the mac_bsdextended(4) policy the ability to apply only the first
matching rule instead of all matching rules. This is similar to how ipfw(8) works.
Provide a sysctl, mac_bsdextended_firstmatch_enabled, to enable this
feature.
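The difference between the two modes can be sketched as follows. This is an assumption-laden illustration: struct sk_rule and sk_check are hypothetical, rules match on a uid only, and access is assumed to be permitted when no rule matches.

```c
#include <stddef.h>

/* Hypothetical rule: applies to one uid and either permits or denies. */
struct sk_rule {
	int uid;
	int permit;
};

/*
 * Returns 1 if access is permitted, 0 if it is denied.
 * firstmatch == 0: every matching rule must permit.
 * firstmatch != 0: the first matching rule decides.
 */
int
sk_check(const struct sk_rule *rules, size_t n, int uid, int firstmatch)
{
	for (size_t i = 0; i < n; i++) {
		if (rules[i].uid != uid)
			continue;
		if (firstmatch)
			return (rules[i].permit);
		if (!rules[i].permit)
			return (0);
	}
	return (1);
}
```

With a permit rule followed by a deny rule for the same uid, the all-rules mode denies access while first-match mode permits it.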
obrien [Sat, 21 Aug 2004 19:44:43 +0000 (19:44 +0000)]
Hit people over the head so they realize run-time errors of the form
/libexec/ld-elf.so.1: Undefined symbol "_ZNSs20_S_empty_rep_storageE"
do mean they are hitting the GCC 3.4 ABI change issue.
alc [Sat, 21 Aug 2004 19:20:21 +0000 (19:20 +0000)]
Further reduce the use of Giant by vm_fault(): Giant is held only when
manipulating a vnode, e.g., calling vput(). This reduces contention for
Giant during many copy-on-write faults, resulting in some additional
speedup on SMPs.
Note: debug_mpsafevm must be enabled for this optimization to take effect.