jhb [Tue, 1 Jun 2004 20:28:42 +0000 (20:28 +0000)]
- Add a function ioapic_program_intpin() that completely programs an I/O
APIC interrupt pin based on the settings in the corresponding interrupt
source structure.
- Use ioapic_program_intpin() in place of manual frobbing of the intpin
configuration in ioapic_program_destination() and ioapic_register().
- Use ioapic_program_intpin() to implement suspend/resume support for I/O
APICs.
joerg [Tue, 1 Jun 2004 20:18:25 +0000 (20:18 +0000)]
Add SVR4-compatible VTOC-style elements to the Sun label. The
FreeBSD kernel doesn't use them but sunlabel(8) shortly will,
and both these files are used by sunlabel(8).
jhb [Tue, 1 Jun 2004 19:51:29 +0000 (19:51 +0000)]
Allow the pir0 device add to fail since pir0 may already exist. This should
fix the panics in device_set_ivars() that people were seeing on boxes with
multiple Host-PCI bridges but not using ACPI.
jhb [Tue, 1 Jun 2004 19:50:42 +0000 (19:50 +0000)]
Fix legacy_add_child() to properly handle the case where
device_add_child_ordered() fails (due to a duplicate device add for
example) and properly cleanup and return NULL.
rwatson [Tue, 1 Jun 2004 19:33:06 +0000 (19:33 +0000)]
Replace current locking comments for struct socket/struct sockbuf
with new ones. Annotate constant-after-creation fields as such. The
comments describe a number of locks that are not yet merged.
bde [Tue, 1 Jun 2004 19:28:38 +0000 (19:28 +0000)]
Fixed the sign of the result in some overflow and underflow cases (ones
where the exponent is an odd integer and the base is negative).
Obtained from: fdlibm-5.3
Sun finally released a new version of fdlibm just a couple of weeks
ago. It only fixes 3 bugs (this one, another one in pow() that we
already have (rev.1.9), and one in tan()). I've learned too much about
powf() lately, so this fix was easy to merge. The patch is not verbatim,
because our base version has many differences for portability and I
didn't like global renaming of an unrelated variable to keep it separate
from the sign variable. This patch uses a new variable named sn for
the sign.
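The sign rule behind this fix can be sketched on its own. The helper names below are hypothetical and this is not the fdlibm code; it only shows when the result of pow(x, y) has to be negative, which also applies when the magnitude overflows to infinity or underflows to zero.

```c
#include <stdbool.h>

/* Hypothetical helper: true if y is an odd integer. */
static bool
is_odd_integer(double y)
{
	long long n;

	if (y != y)			/* NaN is not an integer */
		return (false);
	/* Every double with magnitude >= 2^53 is an even integer. */
	if (y >= 9007199254740992.0 || y <= -9007199254740992.0)
		return (false);
	n = (long long)y;
	return ((double)n == y && (n % 2) != 0);
}

/*
 * Hypothetical helper: true if pow(x, y) must be negative for a
 * negative base.  A base of -0.0 is ignored here for brevity.
 */
static bool
pow_result_is_negative(double x, double y)
{
	return (x < 0.0 && is_odd_integer(y));
}
```

Keeping this decision in a variable of its own (the commit uses one named sn) lets the overflow and underflow paths apply the sign to the infinity or zero they return.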
bde [Tue, 1 Jun 2004 19:03:31 +0000 (19:03 +0000)]
Fixed another precision bug in powf(). This one is in the computation
[t=p_l+p_h High]. We multiply t by lg2_h, and want the result to be
exact. For the bogus float case of the high-low decomposition trick,
we normally discard the lowest 12 bits of the fraction for the high
part, keeping 12 bits of precision. That was used for t here, but it
doesn't work because for some reason we only discard the lowest 9
bits in the fraction for lg2_h. Discard another 3 bits of the fraction
for t to compensate.
As explained in the log for the previous commit, the bug is normally
masked by doing float calculations in extra precision on i386's, but
is easily detected by ucbtest on systems that don't have accidental
extra precision.
This completes fixing all the bugs in powf() that were routinely found
by ucbtest.
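The bit budget described above can be checked with a small standalone sketch. The function names are hypothetical and the code is not taken from e_powf.c; it assumes IEEE single precision, where a product is exact only if the significant bits of both factors total at most 24.

```c
#include <stdint.h>
#include <string.h>

/* Clear the lowest nbits bits of x's fraction, keeping the "high" part. */
static float
discard_low_bits(float x, unsigned nbits)
{
	uint32_t bits;

	memcpy(&bits, &x, sizeof(bits));
	bits &= ~((UINT32_C(1) << nbits) - 1);
	memcpy(&x, &bits, sizeof(x));
	return (x);
}

/* Return 1 if a * b is exactly representable as a float. */
static int
product_is_exact(float a, float b)
{
	volatile float p = a * b;	/* forced to float precision */

	/* The double product of two floats is always exact. */
	return ((double)p == (double)a * (double)b);
}
```

A factor with 9 bits discarded keeps 15 significant bits, so the other factor may keep only 9, which means discarding 15 bits (12 plus another 3). With only 12 bits discarded, the product can need up to 27 bits and gets rounded.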
gad [Tue, 1 Jun 2004 19:00:42 +0000 (19:00 +0000)]
Since I'm not ready to add the non-standard ADD_PS_LISTRESET feature,
remove the #ifdef for it for now. I might add the feature for real at
some later date, but there isn't much reason for the #ifdef for now.
bde [Tue, 1 Jun 2004 18:08:39 +0000 (18:08 +0000)]
Fixed 2 bugs in the computation /* t_h=ax+bp[k] High */.
(1) The bit for the 1.0 part of bp[k] was right shifted by 4. This seems
to have been caused by a typo in converting e_pow.c to e_powf.c.
(2) The lower 12 bits of ax+bp[k] were not discarded, so t_h was actually
plain ax+bp[k]. This seems to have been caused by a logic error in
the conversion.
Fixing (1) gives a result wrong in the opposite direction (hex C66CDAD8),
and fixing (2) gives the correct result.
ucbtest has been reporting this particular wrong result on i386 systems
with unpatched libraries for 9 years. I finally figured out the extent
of the bugs. On i386's they are normally hidden by extra precision.
We use the trick of representing floats as a sum of 2 floats (one much
smaller) to get extra precision in intermediate calculations without
explicitly using more than float precision. This trick is just a
pessimization when extra precision is available naturally (as it always
is when dealing with IEEE single precision, so the float precision part
of the library is mostly misimplemented). (1) and (2) break the trick
in different ways, except on i386's it turns out that the intermediate
calculations are done in enough precision to mask both the bugs and
the limited precision of the float variables (as far as ucbtest can
check).
ucbtest detects the bugs because it forces float precision, but this
is not a normal mode of operation so the bug normally has little effect
on i386's.
On systems that do float arithmetic in float precision, e.g., amd64's,
there is no accidental extra precision and the bugs just give wrong
results.
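A minimal sketch of the sum-of-2-floats trick, for finite normal values only. The struct and function names are hypothetical and the code is an illustration, not the e_powf.c implementation; it assumes IEEE single precision.

```c
#include <stdint.h>
#include <string.h>

struct float_pair {
	float hi;	/* leading 12 significant bits of the value */
	float lo;	/* remainder, much smaller in magnitude */
};

/* Split x into hi + lo, with the lowest 12 fraction bits of hi zero. */
static struct float_pair
split_float(float x)
{
	struct float_pair p;
	uint32_t bits;

	memcpy(&bits, &x, sizeof(bits));
	bits &= UINT32_C(0xfffff000);	/* discard the lowest 12 fraction bits */
	memcpy(&p.hi, &bits, sizeof(p.hi));
	p.lo = x - p.hi;		/* exact: only the discarded bits remain */
	return (p);
}
```

Because hi carries only 12 significant bits, the product of two such high parts fits in a float's 24 bits without rounding. If the low bits are not discarded, as in bug (2), hi is just the plain value and that guarantee is lost.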
rwatson [Tue, 1 Jun 2004 18:03:20 +0000 (18:03 +0000)]
Push the VOP_ADVLOCK() call to release advisory locks on vnode file
descriptors out of fdrop_locked() and into vn_closefile(). This
removes all knowledge of vnodes from fdrop_locked(), since the lock
behavior was specific to vnodes. This also removes the specific
requirement for Giant in fdrop_locked(); it's now only required by
code that it calls into.
Add GIANT_REQUIRED to vn_closefile() since VFS requires Giant.
bmilekic [Tue, 1 Jun 2004 16:17:10 +0000 (16:17 +0000)]
Fix a couple of bugs in the mbuf and packet ctors. In the latter case,
nextpkt within the m_hdr was not being initialized to NULL for
!M_PKTHDR cases. *Maybe* this will fix weird socket buffer
inconsistency panics, but we'll see.
phk [Tue, 1 Jun 2004 11:38:06 +0000 (11:38 +0000)]
There is no need to explicitly call ttwakeup() and ttwwakeup() after
ttyclose() has been called. It's already been done once by ttyclose,
and probably once by the line-discipline too.
scottl [Tue, 1 Jun 2004 05:32:26 +0000 (05:32 +0000)]
Collapse aac_map_command() into aac_startio(). Check the AAC_QUEUE_FRZN in
every iteration of aac_startio(). This ensures that a command that is
deferred for lack of resources doesn't immediately get retried in the
aac_startio() loop. This avoids an almost certain livelock.
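The loop shape described above can be sketched in userland. Everything here (struct, fields, function names) is hypothetical and only models the idea; it is not the aac(4) driver code.

```c
#include <stdbool.h>

struct cmd_queue {
	int	pending;	/* commands waiting to be started */
	int	resources;	/* free resource slots */
	bool	frozen;		/* stands in for a "queue frozen" flag */
	int	started;	/* commands handed to the hardware */
};

/* Try to start one command; fails when no resources are left. */
static bool
try_map_command(struct cmd_queue *q)
{
	if (q->resources == 0)
		return (false);
	q->resources--;
	q->started++;
	return (true);
}

/* Start as many commands as possible; returns the number of loop passes. */
static int
start_io(struct cmd_queue *q)
{
	int passes = 0;

	for (;;) {
		passes++;
		/* Checked on every pass, not only on entry. */
		if (q->frozen || q->pending == 0)
			break;
		q->pending--;
		if (!try_map_command(q)) {
			/* Defer: requeue the command and freeze the queue. */
			q->pending++;
			q->frozen = true;
		}
	}
	return (passes);
}
```

Without the per-pass check, the deferred command would be dequeued and retried immediately, failing the same way each time.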
dougb [Tue, 1 Jun 2004 05:00:46 +0000 (05:00 +0000)]
Update the "All I really need to know I learned in kindergarten" entry
by using the text from the Villard Books edition (1989, pages 6 through
8) and formatting to fit in 72 columns.
dougb [Tue, 1 Jun 2004 04:32:11 +0000 (04:32 +0000)]
* Reformat several attributions according to ../Notes (mostly whitespace)
* Spell out some names that were pointlessly abbreviated
* Remove a couple of incidental duplicates
* Harry Truman had no actual middle name. The initial "S" was added to his
name to make him appear more statesmanlike. Therefore it's not usually punctuated.
* Format a couple of actual fortunes to fit into 72 columns
gad [Tue, 1 Jun 2004 03:01:51 +0000 (03:01 +0000)]
Fix so `ps' catches and complains about null-values specified for a
process id, instead of using pid==0. I.e., `ps -p 12,' and `ps -p ,12'
are now errors (instead of being treated like `ps -p 0 -p 12').
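A minimal sketch of a pidlist parser that rejects empty elements instead of reading them as 0. The function name and return convention are hypothetical; this is not the code from ps.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a comma-separated list of process ids into pids[].
 * Returns the number of ids parsed, or -1 on any error, including
 * an empty element as in "12,", ",12" or "1,,2".
 */
static int
parse_pidlist(const char *list, long *pids, int maxpids)
{
	const char *p = list;
	char *end;
	long v;
	int n = 0;

	for (;;) {
		/* An empty element is an error, not pid 0. */
		if (*p == '\0' || *p == ',')
			return (-1);
		errno = 0;
		v = strtol(p, &end, 10);
		if (end == p || errno != 0 || v < 0)
			return (-1);
		if (*end != '\0' && *end != ',')
			return (-1);
		if (n == maxpids)
			return (-1);
		pids[n++] = v;
		if (*end == '\0')
			return (n);
		p = end + 1;
	}
}
```

An explicit "0,12" is still accepted; only a missing value is refused.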
rwatson [Tue, 1 Jun 2004 02:42:56 +0000 (02:42 +0000)]
The SS_COMP and SS_INCOMP flags in the so_state field indicate whether
the socket is on an accept queue of a listen socket. This change
renames the flags to SQ_COMP and SQ_INCOMP, and moves them to a new
state field on the socket, so_qstate, as the locking for these flags
is substantially different from the locking on the remainder of the
flags in so_state.
gad [Tue, 1 Jun 2004 02:31:44 +0000 (02:31 +0000)]
Additional tiny adjustment to kludge-option processing so `ps t p0'
is treated like `ps -t p0', instead of changing it to `ps -T p0'.
Note that `ps t' is still changed to `ps -T', since that is one of
the main reasons for this kludge processing...
gad [Tue, 1 Jun 2004 02:03:21 +0000 (02:03 +0000)]
Rewrite the kludge-option processing to improve how it handles a few
more special situations. This is the code which processes `ps blah',
when "blah" does not include a leading '-'.
This change also removes a long-undocumented BACKWARD_COMPATIBILITY
compile-time option, where:
ps -options arg1 arg2
(with no '-' on "arg1" and "arg2") was treated as:
ps -options -N arg1 -M arg2
This also changes `ps' to check for any additional arguments after
processing all the '-'-options, and attempt to use those arguments as
a pid or pidlist. If an extra argument is not a valid pidlist, then
`ps' will print an error and exit. This seems a more generally useful
extension of the kludge-option processing than the -N/-M behavior, and
has fewer confusing side-effects.
bmilekic [Tue, 1 Jun 2004 01:36:26 +0000 (01:36 +0000)]
Fix a comment above uma_zsecond_create(), describing its arguments.
It doesn't take 'align' and 'flags' but 'master' instead, which is
a reference to the Master Zone, containing the backing Keg.
truckman [Tue, 1 Jun 2004 01:18:51 +0000 (01:18 +0000)]
Add MSG_NBIO flag option to soreceive() and sosend() that causes
them to behave the same as if the SS_NBIO socket flag had been set
for this call. The SS_NBIO flag for ordinary sockets is set by
fcntl(fd, F_SETFL, O_NONBLOCK).
Pass the MSG_NBIO flag to the soreceive() and sosend() calls in
fifo_read() and fifo_write() instead of frobbing the SS_NBIO flag
on the underlying socket for each I/O operation. The O_NONBLOCK
flag is a property of the descriptor, and unlike ordinary sockets,
fifos may be referenced by multiple descriptors.
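The difference between a per-socket flag and a per-call flag can be sketched as follows. The SKETCH_* constants, the struct, and the function names are hypothetical and their values are illustrative; this is not the kernel's soreceive()/sosend() code.

```c
#include <fcntl.h>
#include <stdbool.h>

/* Illustrative values only; not the kernel's definitions. */
#define SKETCH_SS_NBIO	0x0100	/* per-socket state, shared by all users */
#define SKETCH_MSG_NBIO	0x4000	/* per-call flag */

struct sketch_socket {
	int	so_state;
};

/* A call is non-blocking if either the socket or the call says so. */
static bool
call_is_nonblocking(const struct sketch_socket *so, int msg_flags)
{
	return ((so->so_state & SKETCH_SS_NBIO) != 0 ||
	    (msg_flags & SKETCH_MSG_NBIO) != 0);
}

/* Derive the per-call flag from one descriptor's own O_NONBLOCK setting. */
static int
fifo_msg_flags(int descriptor_flags)
{
	return ((descriptor_flags & O_NONBLOCK) != 0 ? SKETCH_MSG_NBIO : 0);
}
```

Two descriptors on the same fifo can then differ in blocking behavior without either one touching the shared socket state.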
csjp [Tue, 1 Jun 2004 00:25:44 +0000 (00:25 +0000)]
Add a warning note to security.jail.allow_raw_sockets
about the risks of enabling raw sockets in prisons.
Because raw sockets can be used to configure and interact
with various network subsystems, extra caution should be
used where privileged access to jails is given out to
untrusted parties. As such, by default this option is disabled.
A few others and I are currently auditing the kernel
source code to ensure that the use of raw sockets by
privileged prison users is safe.
dougb [Tue, 1 Jun 2004 00:16:32 +0000 (00:16 +0000)]
Remove duplicates of the "wherever you go, there you are" fortune,
quote directly from the movie, and give a better attribution (with
correct spelling) for Buckaroo Banzai.
bmilekic [Mon, 31 May 2004 21:46:06 +0000 (21:46 +0000)]
Bring in mbuma to replace mballoc.
mbuma is an Mbuf & Cluster allocator built on top of a number of
extensions to the UMA framework, all included herein.
Extensions to UMA worth noting:
- Better layering between slab <-> zone caches; introduce
Keg structure which splits off slab cache away from the
zone structure and allows multiple zones to be stacked
on top of a single Keg (single type of slab cache);
perhaps we should look into defining a subset API on
top of the Keg for special use by malloc(9),
for example.
- UMA_ZONE_REFCNT zones can now be added, and reference
counters automagically allocated for them within the end
of the associated slab structures. uma_find_refcnt()
does a kextract to fetch the slab struct reference from
the underlying page, and lookup the corresponding refcnt.
mbuma things worth noting:
- integrates mbuf & cluster allocations with extended UMA
and provides caches for commonly-allocated items; defines
several zones (two primary, one secondary) and two kegs.
- change up certain code paths that always used to do:
m_get() + m_clget() to instead just use m_getcl() and
try to take advantage of the newly defined secondary
Packet zone.
- netstat(1) and systat(1) quickly hacked up to do basic
stat reporting but additional stats work needs to be
done once some other details within UMA have been taken
care of and it becomes clearer how stats will work
within the modified framework.
From the user perspective, one implication is that the
NMBCLUSTERS compile-time option is no longer used. The
maximum number of clusters is still capped off according
to maxusers, but it can be made unlimited by setting
the kern.ipc.nmbclusters boot-time tunable to zero.
Work should be done to write an appropriate sysctl
handler allowing dynamic tuning of kern.ipc.nmbclusters
at runtime.
Additional things worth noting/known issues (READ):
- One report of 'ips' (ServeRAID) driver acting really
slow in conjunction with mbuma. Need more data.
Latest report is that ips is equally sucking with
and without mbuma.
- Giant leak in NFS code sometimes occurs, can't
reproduce but currently analyzing; brueffer is
able to reproduce but THIS IS NOT an mbuma-specific
problem and currently occurs even WITHOUT mbuma.
- Issues in network locking: there is at least one
code path in the rip code where one or more locks
are acquired and we end up in m_prepend() with
M_WAITOK, which causes WITNESS to whine from within
UMA. Current temporary solution: force all UMA
allocations to be M_NOWAIT from within UMA for now
to avoid deadlocks unless WITNESS is defined and we
can determine with certainty that we're not holding
any locks when we're M_WAITOK.
- I've seen at least one weird socketbuffer empty-but-
mbuf-still-attached panic. I don't believe this
to be related to mbuma but please keep your eyes
open, turn on debugging, and capture crash dumps.
This change removes more code than it adds.
A paper is available detailing the change and considering
various performance issues; it was presented at BSDCan2004:
http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf
Please read the paper for Future Work and implementation
details, as well as credits.
Testing and Debugging:
rwatson,
brueffer,
Ketrien I. Saihr-Kesenchedra,
...
Reviewed by: Lots of people (for different parts)
ume [Mon, 31 May 2004 21:09:14 +0000 (21:09 +0000)]
Treat IPv4 private address as global scope rather than site scope.
Though it breaks RFC 3484, without this change, dest addr selection
doesn't work well under NAT environment.
rwatson [Mon, 31 May 2004 16:32:49 +0000 (16:32 +0000)]
Add an assertion that nfssvc() isn't called with Giant.
Add two additional pairs of assertions, one at the end of the NFS
server event loop, and one on exit from the NFS daemon, that
assert that if debug.mpsafenet is enabled, Giant is not held, and
that if it is not enabled, Giant will be held. This is intended
to support debugging scenarios where Giant is "leaked" during NFS
processing.
nsouch [Mon, 31 May 2004 14:24:21 +0000 (14:24 +0000)]
Necessary modifications to get pcf working again for ISA. Tested with
my Elektor card. Note that the hints are necessary to specify the
IO base of the pcf chip. This allows the IO base to be checked when the
probe routine is called during ISA enumeration.
The interrupt driven code is mixed with polled mode, which is wrong
and produces supposed spurious interrupts at each access. I still have
to work on it.
kris [Mon, 31 May 2004 07:34:40 +0000 (07:34 +0000)]
Add common share/locale directories (everything used by >= 5 ports [1]) and
/usr/local/www
[1] Semi-arbitrary cutoff, but I didn't want to add every locale directory
used by ports, because a lot are only used by one or two, and it's less
intrusive for these ports to just clean up after themselves.
rwatson [Mon, 31 May 2004 00:59:10 +0000 (00:59 +0000)]
The NFS server modevent code manually patches the system call table to
install nfssvc(). It also updates the argument count, but did so
without setting SYF_MPSAFE, effectively removing the MPSAFE flag even
when syscalls.master indicates it doesn't require Giant. This change
forces the modevent to set MPSAFE as a flag to its internal notion of
an argument count.
Note: this duplication of information is a bad thing, but is a more
general problem I'm not currently willing to address.
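How a flag can be lost when it shares a field with an argument count is easy to show in isolation. The SKETCH_* values, the struct, and the function names below are assumptions made for illustration; they are not copied from the kernel headers or from the NFS modevent code.

```c
/* Illustrative values and names; assumptions, not kernel definitions. */
#define SKETCH_ARGMASK	0x0000ffff	/* low bits: argument count */
#define SKETCH_MPSAFE	0x00010000	/* high bit: MPSAFE flag */

struct sketch_sysent {
	int	narg;		/* argument count plus flag bits */
};

static int
sketch_nargs(const struct sketch_sysent *se)
{
	return (se->narg & SKETCH_ARGMASK);
}

static int
sketch_is_mpsafe(const struct sketch_sysent *se)
{
	return ((se->narg & SKETCH_MPSAFE) != 0);
}

/* Install an argument count, optionally keeping the entry MPSAFE. */
static void
sketch_install(struct sketch_sysent *se, int nargs, int mpsafe)
{
	se->narg = nargs | (mpsafe ? SKETCH_MPSAFE : 0);
}
```

Storing a bare count overwrites the whole field and silently clears the flag; OR-ing the flag into the stored value keeps both.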
imp [Sun, 30 May 2004 23:08:53 +0000 (23:08 +0000)]
Include <machine/bus.h> and <machine/resource.h> here (only in the
kernel). No other sys/*.h file requires machine/foo.h to be included
before it. In addition, all the files that include rman.h would need
to include those two anyway. From these two perspectives, it is
traditional to include things like this.
This lets us stop treating sys/rman.h specially in every bus frontend
file.
rwatson [Sun, 30 May 2004 22:59:54 +0000 (22:59 +0000)]
One more case where we want to drop the NFS server lock and acquire
Giant when entering VFS. Discovered by code inspection; still not
hit without debug.mpsafenet=1.
rwatson [Sun, 30 May 2004 22:41:43 +0000 (22:41 +0000)]
Acquire Giant around two more cases when calling into VFS to vput()
a vnode. Not bumped into with asserts in the main tree because we
run the NFS server with Giant by default. Discovered by inspection.
Complete annotations of Giant acquisition/release to note that it's
only because of VFS that we acquire Giant in most places in the NFS
server.
dwmalone [Sun, 30 May 2004 10:04:03 +0000 (10:04 +0000)]
A log file name may now be prefixed by a '-' if it should not be
explicitly fsynced after kernel messages are logged. This option
should be syntax compatible with a similar option in Linux syslogd.
I've made some small changes to Pekka's patch, hopefully I haven't
goofed anything.
PR: 66790
Submitted by: Pekka Savola <pekkas@netcore.fi>
Obtained from: Martin Schulze's syslogd
MFC after: 1 month
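A minimal sketch of how a leading '-' on a log file name might be recognized. The struct and function names are hypothetical and this is not the syslogd parser; it only records whether an explicit fsync is wanted.

```c
#include <stdbool.h>

struct log_target {
	const char	*path;	/* file name without the '-' prefix */
	bool		 sync;	/* false if the name was prefixed by '-' */
};

/* Interpret the action field of a configuration line. */
static struct log_target
parse_log_target(const char *field)
{
	struct log_target t;

	t.sync = true;
	if (field[0] == '-') {
		t.sync = false;		/* caller asked to skip the fsync */
		field++;
	}
	t.path = field;
	return (t);
}
```

The write path would then consult the sync member before calling fsync(2) on the file.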
stefanf [Sun, 30 May 2004 09:21:56 +0000 (09:21 +0000)]
Add implementations for cimag{,f,l}, creal{,f,l} and conj{,f,l}. They are
needed for cases where GCC's builtin functions cannot be used and for
compilers that don't know about them.
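One way such functions can be written without compiler builtins is sketched below for the double variants. The my_ prefix marks these as illustrations, not the committed code; the union relies on the C99 rule that a complex type is laid out like an array of two reals, real part first.

```c
#include <complex.h>

/* Real part: converting complex to real discards the imaginary part. */
static double
my_creal(double complex z)
{
	return ((double)z);
}

/* Imaginary part: read the second element of the two-real layout. */
static double
my_cimag(double complex z)
{
	union {
		double complex	z;
		double		parts[2];
	} u;

	u.z = z;
	return (u.parts[1]);
}

/* Conjugate: negate the imaginary part in place. */
static double complex
my_conj(double complex z)
{
	union {
		double complex	z;
		double		parts[2];
	} u;

	u.z = z;
	u.parts[1] = -u.parts[1];
	return (u.z);
}
```

Being real functions with fixed parameter and return types, these always convert their argument, which a macro that merely extracts a part cannot guarantee.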
stefanf [Sun, 30 May 2004 08:47:12 +0000 (08:47 +0000)]
Remove the macros for creal{,f} and cimag{,f}. They failed to convert their
arguments to the needed type and so the result type depended on the argument
type. Fixing them isn't really worth the effort because GCC emits the same
assembler code with or without them.
hmp [Sun, 30 May 2004 00:42:38 +0000 (00:42 +0000)]
Correct typo: vm_page_list_find() has been called vm_pageq_find() for quite a
long time, i.e., since the cleanup of the VM Page-queues code done two
years ago.
Reviewed by: Alan Cox <alc at freebsd.org>,
Matthew Dillon <dillon at backplane.com>
dwmalone [Sun, 30 May 2004 00:02:19 +0000 (00:02 +0000)]
Try to be more careful about using the file descriptor f_file.
Syslogd should ensure that f_file is a valid file descriptor when
f_type is FILE, CONSOLE, TTY and for a PIPE where f_pid > 0. If the
descriptor is closed/invalid then the type should be set to UNUSED
or the pid should be set to 0.
To this end:
1) Don't close(f->f_file) if we can't send a message to a remote
host because the file descriptor used for remote logging is
stored in finet, not in f->f_file. f->f_file is probably
uninitialised, so I guess we usually end up closing fd 0.
2) Don't close PIPE file descriptors if they are invalid.
3) If the call to p_open fails, don't set the pid.
The OpenBSD patches in this area set f_file to -1 after the fd is
closed and then avoid calling close if f_file < 0. I haven't done
this, but it might be a good idea too.
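The "set the descriptor to -1 after closing" approach mentioned above can be sketched as a small helper. The function name is hypothetical and this is neither the OpenBSD patch nor the syslogd code.

```c
#include <unistd.h>

/*
 * Close *fdp if it holds a valid descriptor and mark it invalid.
 * Calling it again on the same variable is a harmless no-op.
 */
static int
close_and_invalidate(int *fdp)
{
	int error;

	if (*fdp < 0)
		return (0);	/* already closed or never opened */
	error = close(*fdp);
	*fdp = -1;		/* a stale value can no longer be closed twice */
	return (error);
}
```

This only helps if the descriptor variable is initialised to -1 in the first place, so an unused slot never looks like fd 0.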