Warner Losh [Sun, 11 Jan 2004 06:52:31 +0000 (06:52 +0000)]
Add support for subtractive decoding bridges. These bridges pass all
signals to addresses to the child busses. Typically, ProgIf of 1
means a subtractive bridge. However, Intel has a whole lot of ones
with a ProgIf of 80 that are also subtractive. We cope with these
bridges too. This eliminates hw.pci.allow_unsupported_io_range
because that had almost the same effect as these patches (almost means
'buggy'). Remove the bogus checks for ISA bus locations: these cycles
aren't special and are only passed by transparent bridges.
We allow any range to succeed. If the range is a superset of the
range that's decoded, trim the resource to that range. Otherwise,
pass the range unchanged. This will change the location that PC Card
and CardBus cards are attached. This might bogusly cause some
overlapping allocation that wasn't present before, but the overlapping
fixes need to be in the pci level.
Robert Watson [Sun, 11 Jan 2004 06:24:34 +0000 (06:24 +0000)]
Release audit device major number reservation. The new audit
implementation writes directly to a file, similar to the Darwin,
Solaris, and whoever else implementations, rather than buffering
through a pseudo-device.
David E. O'Brien [Sun, 11 Jan 2004 03:34:02 +0000 (03:34 +0000)]
Vendor import emu10k1.h from version 1.0.1 of the ALSA driver.
ftp://ftp.alsa-project.org/pub/driver/alsa-driver-1.0.1.tar.bz2
or http://www.alsa-project.org/alsa/cvs/alsa-kernel/include/emu10k1.h
Robert Watson [Sun, 11 Jan 2004 02:28:06 +0000 (02:28 +0000)]
When not creating a core dump due to resource limits specifying
a maximum dump size of 0, return a size-related error, rather
than returning success. Otherwise, waitpid() will incorrectly
return a status indicating that a core dump was created. Note
that the specific error doesn't actually matter, since it's lost.
Robert Watson [Sun, 11 Jan 2004 01:29:03 +0000 (01:29 +0000)]
Problem:
When an NFS server is port-scanned nfsd sometimes exits. This has
happened 3 times the last few weeks.
Nfsd has been written to exit when accept(2) fails. Unfortunately
accept can sometimes make a "normal" return with errno ECONNABORTED
and in this case nfsd exits prematurely.
Solution:
Check for ECONNABORTED (and also EINTR, since nfsd uses signals)
and continue.
Robert Watson [Sat, 10 Jan 2004 22:38:54 +0000 (22:38 +0000)]
Update the diskless(8) documentation to indicate that the use of the
kernel BOOTP options is *not* required if the boot loader can pass
network configuration information to the kernel using the kernel
environment. As such, PXE doesn't require them. However, the NFS
options are required in the kernel (previously not documented).
Robert Watson [Sat, 10 Jan 2004 17:41:04 +0000 (17:41 +0000)]
Clarify the behavior of ptrace(2) a little bit: the tracing process
must first attach to the traced process. If the tracing process
exits without detaching, the traced process will be killed rather
than continued. For the duration of the tracing session, the traced
process is reparented to the tracing process (with resulting expected
behaviors). It is permissible to trace more than one other process
at a time. When using waitpid() to monitor the behavior of the traced
process, signals are intercepted: they may optionally then be
forwarded using ptrace(). Signals are generated normally by and for
the process, but also by the tracing facility (SIGTRAP).
Ruslan Ermilov [Sat, 10 Jan 2004 16:30:29 +0000 (16:30 +0000)]
Moved the code for :U and :L modifiers where it belongs, so that
the fallback for SysV (now in POSIX) variable substitution works
for old_string arguments starting with 'U' or 'L'.
Maxim Sobolev [Sat, 10 Jan 2004 13:09:21 +0000 (13:09 +0000)]
Fix serious ugliness introduced in 1.61, which leads to long delay in boot
sequence when machine is started without attached USB mouse. Only do
repeated attempts to re-open device if the usb module has been actually
loaded. Also fix broken logic in doing delays between open attempts - do
delays between attempts, not after each attempt.
Due to previous behaviour being very annoying for notebook owners this
is a good 5.2 MFC candidate.
Don Lewis [Sat, 10 Jan 2004 08:53:00 +0000 (08:53 +0000)]
Check that sa_len is the appropriate value in tcp_usr_bind(),
tcp6_usr_bind(), tcp_usr_connect(), and tcp6_usr_connect() before checking
to see whether the address is multicast so that the proper errno value
will be returned if sa_len is incorrect. The checks are identical to the
ones in in_pcbbind_setup(), in6_pcbbind(), and in6_pcbladdr(), which are
called after the multicast address check passes.
Don Lewis [Sat, 10 Jan 2004 08:28:54 +0000 (08:28 +0000)]
Add a somewhat redundant check on the len arguement to getsockaddr() to
avoid relying on the minimum memory allocation size to avoid problems.
The check is somewhat redundant because the consumers of the returned
structure will check that sa_len is a protocol-specific larger size.
Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by: nectar
MFC after: 30 days
Don Lewis [Sat, 10 Jan 2004 08:14:27 +0000 (08:14 +0000)]
Don't execute the code in in6_ifdetach() that removes the link-local
allnodes multicast route if the routing table has not been initialized.
This avoids a panic during boot if an interface detaches before the
routing table is initialized.
Alfred Perlstein [Sat, 10 Jan 2004 02:59:54 +0000 (02:59 +0000)]
Fix a panic when attempting a v4 op against a v3/v2-only server.
It happens because rpcclnt_request is incorrectly returning 0 in the case
of an rpc mismatch or auth error.
Prevent a race condition between fork1() and whatever changes the pgrp by
setting the new process' p_pgrp again before inserting it in the p_pglist.
Without it we can get the new process to be inserted in a different p_pglist
than the one p2->p_pgrp points to, and this is not something we want to happen.
This is not a fix, merely a bandaid, but it will work until someone finds a
better way to do it.
Change sdp_open_local(3) API. It now takes a path to a control socket
Teach sdpcontrol(8) how to talk to the local SDP server
Update man pages
s/u_int/uint
Bruce A. Mah [Fri, 9 Jan 2004 20:10:20 +0000 (20:10 +0000)]
First 5.2-RELEASE errata, documenting some known issues in the
release: xdm(1) black-and-white-ness, ACPI problems, ATA device
problems, NFS floppy install requirements, pcm(4) vchan instabilities.
Nate Lawson [Fri, 9 Jan 2004 20:01:42 +0000 (20:01 +0000)]
Expand the check for overriding the OS name to override _OS* (including
_OS_, _OS, and _OSI). This should fix this option for people who reported
it not changing anything.
Jacques Vidrine [Fri, 9 Jan 2004 16:52:09 +0000 (16:52 +0000)]
Provide sysarch(2) prototypes in the MD sysarch.h headers. While I'm
at it, use the ANSI C generic pointer type for the second argument,
thus matching the documentation.
Remove the now extraneous (and now conflicting) function declarations
in various libc sources. Remove now unnecessary casts.
Andre Oppermann [Fri, 9 Jan 2004 14:14:10 +0000 (14:14 +0000)]
Reduce TCP_MINMSS default to 216. The AX.25 protocol (packet radio)
is frequently used with an MTU of 256 because of slow speeds and a
high packet loss rate.
Jacques Vidrine [Fri, 9 Jan 2004 13:43:49 +0000 (13:43 +0000)]
It was reported that when using nss_ldap, getgrent(3) would behave
incorrectly when encountering `large' groups (many members and/or many
long member names). The reporter tracked this down to the glibc NSS
module compatibility code (nss_compat.c): it would prematurely record
that a NSS module was finished iterating through its database in some
cases.
Two aspects are corrected:
1. nss_compat.c recorded that a NSS module was finished iterating
whenever the module reported something other than SUCCESS. The
correct logic is to continue iteration when the module reports
either SUCCESS or RETURN. The __nss_compat_getgrent_r and
__nss_compat_getpwent_r routines are updated to reflect this.
2. An internal helper macro __nss_compat_result is used to map glibc
NSS status codes to BSD NSS status codes (e.g. NSS_STATUS_SUCCESS ->
NS_SUCCESS). It provided the obvious mapping.
When a NSS routine is called with a too-small buffer, the
convention in the BSD NSS code is to report RETURN. (This is used
to implement reentrant APIs such as getpwnam_r(3).) However, the
convention in glibc for this case is to set errno = ERANGE and
overload TRYAGAIN. __nss_compat_result is updated to handle this
case.
Bill Paul [Fri, 9 Jan 2004 06:53:49 +0000 (06:53 +0000)]
The private data section of ndis_packets has a 'packet flags' byte
which has two important flags in it: the 'allocated by NDIS' flag
and the 'media specific info present' flag. There are two Windows macros
for getting/setting media specific info fields within the ndis_packet
structure which can behave improperly if these flags are not initialized
correctly when a packet is allocated. It seems the correct thing
to do is always set the NDIS_PACKET_ALLOCATED_BY_NDIS flag on
all newly allocated packets.
This fixes the crashes with the Intel Centrino wireless driver.
My sample card now seems to work correctly.
Also, fix a potential LOR involving ndis_txeof() in if_ndis.c.
Bill Paul [Fri, 9 Jan 2004 03:57:00 +0000 (03:57 +0000)]
Implement NdisOpenFile()/NdisCloseFile()/NdisMapFile()/NdisUnmapFile().
By default, we search for files in /compat/ndis. This can be changed with
a systcl. These routines are used by some drivers which need to download
firmware or microcode into their respective devices during initialization.
Also, remove extraneous newlines from the 'built-in' sysctl/registry
variables.
Brian Feldman [Fri, 9 Jan 2004 03:19:40 +0000 (03:19 +0000)]
Add a GraphViz-exporting ngctl(8) "dot" command. You can now create
very useful .dot files of your netgraph(4) to quickly visualize the
nodes, hooks and edges. An example of this can be found here:
http://people.freebsd.org/~green/sample-netgraph-dot.ps
If anyone would like to refine the output further, please do so.
Robert Watson [Thu, 8 Jan 2004 22:49:23 +0000 (22:49 +0000)]
Improve the expressiveness of ttyinfo (^T) when dealing with threads
in slightly less usual states:
If the thread is on a run queue, display "running" if the thread is
actually running, otherwise, "runnable".
If the thread is sleeping, and it's on a sleep queue, display the
name of the queue, otherwise "unknown" -- previously, in this situation
we would display "iowait".
If the thread is waiting on a lock, display *lockname.
If the thread is suspended, display "suspended" -- previously, in
this situation we would display "iowait".
If the thread is waiting for an interrupt, display "intrwait" --
previously, in this situation we would display "iowait".
If the thread is in a state not handled by the above, display
"unknown" -- previously, we would print "iowait".
Among other things, this avoids displaying "iowait" when the foreground
process turns out to be suspended waiting for a debugger to properly
attach.
Robert Watson [Thu, 8 Jan 2004 22:44:54 +0000 (22:44 +0000)]
Drop the sigacts mutex around calls to stopevent() to avoid sleeping
holding the mutex. Because the sigacts pointer can't change while
the process is "live" (proc locking (x)), we know our pointer is still
valid.
Alan Cox [Thu, 8 Jan 2004 20:48:26 +0000 (20:48 +0000)]
- Enable recursive acquisition of the mutex synchronizing access to the
free pages queue. This is presently needed by contigmalloc1().
- Move a sanity check against attempted double allocation of two pages
to the same vm object offset from vm_page_alloc() to vm_page_insert().
This provides better protection because double allocation could occur
through a direct call to vm_page_insert(), such as that by
vm_page_rename().
- Modify contigmalloc1() to hold the mutex synchronizing access to the
free pages queue while it scans vm_page_array in search of free pages.
- Correct a potential leak of pages by contigmalloc1() that I introduced
in revision 1.20: We must convert all cache queue pages to free pages
before we begin removing free pages from the free queue. Otherwise,
if we have to restart the scan because we are unable to acquire the
vm object lock that is necessary to convert a cache queue page to a
free page, we leak those free pages already removed from the free queue.
Maxime Henrion [Thu, 8 Jan 2004 19:08:27 +0000 (19:08 +0000)]
Some integrated Davicom cards in sparc64 boxes have an all zeros
MAC address in the EEPROM, and we need to get it from OpenFirmware.
This isn't very pretty but time is lacking to do this in a better
way this near 5.2-RELEASE. This is a RELENG_5_2 candidate.
Original version by: Marius Strobl <marius@alchemy.franken.de>
Tested by: Pete Bentley <pete@sorted.org>
Reviewed by: jake
Andre Oppermann [Thu, 8 Jan 2004 17:40:07 +0000 (17:40 +0000)]
Limiters and sanity checks for TCP MSS (maximum segement size)
resource exhaustion attacks.
For network link optimization TCP can adjust its MSS and thus
packet size according to the observed path MTU. This is done
dynamically based on feedback from the remote host and network
components along the packet path. This information can be
abused to pretend an extremely low path MTU.
The resource exhaustion works in two ways:
o during tcp connection setup the advertized local MSS is
exchanged between the endpoints. The remote endpoint can
set this arbitrarily low (except for a minimum MTU of 64
octets enforced in the BSD code). When the local host is
sending data it is forced to send many small IP packets
instead of a large one.
For example instead of the normal TCP payload size of 1448
it forces TCP payload size of 12 (MTU 64) and thus we have
a 120 times increase in workload and packets. On fast links
this quickly saturates the local CPU and may also hit pps
processing limites of network components along the path.
This type of attack is particularly effective for servers
where the attacker can download large files (WWW and FTP).
We mitigate it by enforcing a minimum MTU settable by sysctl
net.inet.tcp.minmss defaulting to 256 octets.
o the local host is reveiving data on a TCP connection from
the remote host. The local host has no control over the
packet size the remote host is sending. The remote host
may chose to do what is described in the first attack and
send the data in packets with an TCP payload of at least
one byte. For each packet the tcp_input() function will
be entered, the packet is processed and a sowakeup() is
signalled to the connected process.
For example an attack with 2 Mbit/s gives 4716 packets per
second and the same amount of sowakeup()s to the process
(and context switches).
This type of attack is particularly effective for servers
where the attacker can upload large amounts of data.
Normally this is the case with WWW server where large POSTs
can be made.
We mitigate this by calculating the average MSS payload per
second. If it goes below 'net.inet.tcp.minmss' and the pps
rate is above 'net.inet.tcp.minmssoverload' defaulting to
1000 this particular TCP connection is resetted and dropped.
MITRE CVE: CAN-2004-0002
Reviewed by: sam (mentor)
MFC after: 1 day
Daniel Eischen [Thu, 8 Jan 2004 15:37:09 +0000 (15:37 +0000)]
Add a simple work-around for deadlocking on recursive read locks
on a rwlock while there are writers waiting. We normally favor
writers but when a reader already has at least one other read lock,
we favor the reader. We don't track all the rwlocks owned by a
thread, nor all the threads that own a rwlock -- we just keep
a count of all the read locks owned by a thread.
* firewire
Add tcode_str[] and improve debug message.
* sbp
If max_speed is negative, use the maximum speed which the
ohci chip supports. The default max_speed is -1.
* if_fwe
If tx_speed is negative, use the maximum speed which the
ohci chip supports. The default tx_speed is 2.
Bill Paul [Thu, 8 Jan 2004 10:44:37 +0000 (10:44 +0000)]
Correct the definition of the ndis_miniport_interrupt structure:
the ni_dpccountlock member is an ndis_kspin_lock, not an
ndis_spin_lock (the latter is too big).
Run if_ndis.c:ndis_tick() via taskqueue_schedule(). Also run
ndis_start() via taskqueue in certain circumstances.
Using these tweaks, I can now get the Broadcom BCM5701 NDIS
driver to load and run. Unfortunately, the version I have seems
to suffer from the same bug as the SMC 83820 driver, which is
that it creates a spinlock during its DriverEntry() routine.
I'm still debating the right way to deal with this.
Don Lewis [Thu, 8 Jan 2004 06:22:15 +0000 (06:22 +0000)]
The transmit frame status is stored in the last transmit descriptor for the
frame, not the first. It is probably also not safe to free the mbuf chain
as soon as the OWN bit is cleared on the first descriptor since the chip
may not be done copying the frame into the transmit FIFO. Revert the part of
of busdma conversion (if_dc.c rev 1.115) which changed dc_txeof() to look for
the status in the first descriptor and free the mbuf chain when processing
the first descriptor for the frame, and revert the matching changes elsewhere
in the driver. This part of the busdma change caused the driver to report
spurious collisions and output errors, even when running in full-duplex mode.
Reverting the mbuf chain handling slightly complicates dc_dma_map_txbuf(),
since it is responsible for setting the OWN bits on the descriptors, but does
not normally have direct access to the mbuf chain.
Tested by:
Dejan Lesjak <dejan.lesjak at ijs.si> alpha/<Intel 21143 10/100BaseTX>
"Xin LI" <delphij at frontfree.net> i386/<Macronix 98713 10/100BaseTX>
Wiktor Niesiobedzki <bsd at w.evip.pl> i386/<3Com OfficeConnect 10/100B>