bde [Sun, 30 Mar 2008 18:07:12 +0000 (18:07 +0000)]
Use fabs[f]() instead of bit fiddling for setting absolute values.
This makes little difference in float precision, but in double
precision gives a speedup of about 30% on amd64 (A64 CPU) and i386
(A64). This depends on fabs[f]() being inline and efficient. The
bit fiddling (or any use of SET_HIGH_WORD(), which libm does too
much because it was best on old 32-bit machines) always causes
packing overheads and sometimes causes stalls in the packing, since
it operates on only part of a variable in the double precision case.
It apparently did cause stalls in a critical path here.
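A minimal sketch of the two approaches, for illustration only. The
function names are hypothetical, and the bit-fiddling variant masks
the whole 64-bit pattern rather than using libm's SET_HIGH_WORD()
macro, which touches only the high 32-bit word.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Bit-fiddling variant: clear the sign bit through an integer copy. */
static double
abs_by_bits(double x)
{
	uint64_t bits;

	assert(sizeof(bits) == sizeof(x));
	memcpy(&bits, &x, sizeof(bits));
	bits &= ~(UINT64_C(1) << 63);	/* sign bit of an IEEE 754 double */
	memcpy(&x, &bits, sizeof(x));
	return (x);
}

/* fabs() variant: the value never has to leave the FP register. */
static double
abs_by_fabs(double x)
{
	return (fabs(x));
}
```

Both functions return the same value; the difference the commit
measures is only in how the compiler can schedule them.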
bde [Sun, 30 Mar 2008 17:28:27 +0000 (17:28 +0000)]
Use the expression fabs(x+0.0)-fabs(y+0.0) instead of
fabs(x+0.0)+fabs(y+0.0) when mixing NaNs. This improves
consistency of the result by making it harder for the compiler to reorder
the operands. (FP addition is not necessarily commutative because the
order of operands makes a difference on some machines iff the operands are
both NaNs.)
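The expression can be tried in isolation with a sketch like the
following. The function name mix_nans() is hypothetical, and the
comment about x+0.0 describes typical IEEE 754 behaviour, not
something stated in the commit.

```c
#include <math.h>

/*
 * Return a NaN whenever either argument is a NaN.  Adding 0.0 is
 * expected to quiet a signaling NaN on typical IEEE 754 hardware,
 * and fabs() clears the sign bit of each operand.
 */
static double
mix_nans(double x, double y)
{
	return (fabs(x + 0.0) - fabs(y + 0.0));
}
```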
bde [Sun, 30 Mar 2008 17:17:42 +0000 (17:17 +0000)]
Fix a missing mask in a hi+lo decomposition. This bug made the extra
precision in software useless, so hypotf() had some errors in the 1-2
ulp range unless there was extra precision in hardware (as happens on
i386).
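A sketch of a masked hi+lo split for a float, to show why the mask
matters. The function name and the mask value 0xfffff000 are
assumptions chosen for illustration. With the low 12 mantissa bits
cleared, hi has at most 12 significant bits, so hi*hi is exact in
float; without the mask, hi is just x and nothing is gained.

```c
#include <stdint.h>
#include <string.h>

/* Split x so that hi + lo == x and hi has only 12 significant bits. */
static void
split_float(float x, float *hi, float *lo)
{
	uint32_t bits;

	memcpy(&bits, &x, sizeof(bits));
	bits &= 0xfffff000;		/* the mask: drop the low 12 bits */
	memcpy(hi, &bits, sizeof(*hi));
	*lo = x - *hi;			/* exact: same sign and exponent */
}
```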
jeff [Sun, 30 Mar 2008 11:31:14 +0000 (11:31 +0000)]
- Consistently return EDEADLK when presented with a new set that is
incompatible with existing bindings.
- Try to copyout the setid in cpuset() before migrating the proc to the
setid in case the user has supplied a bad buffer.
- Rename cpuset_root() and cpuset_base() to cpuset_ref{root,base} to
be more descriptive and free cpuset_root to be used as a different
type of symbol.
- Make cpuset_root the cpuset_t set of all cpus in the system. This
should contain the same bitmask as all_cpus presently.
- Add a CPU_CMP() macro to compare two sets.
dfr [Sun, 30 Mar 2008 09:36:17 +0000 (09:36 +0000)]
Don't call xdrrec_skiprecord in the non-blocking case. If
__xdrrec_getrec has returned TRUE, then we have a complete request in
the buffer - calling xdrrec_skiprecord is not necessary. In particular,
if there is another record already buffered on the stream,
xdrrec_skiprecord will discard both this request and the next
one, causing the call to xdr_callmsg to fail and the stream to be
closed.
mav [Sun, 30 Mar 2008 07:53:51 +0000 (07:53 +0000)]
- Account all node stats at the shape mode.
- Do not check destination hook presence, it will be done by netgraph.
- Use u_int instead of int in some places to simplify type conversions.
- Use NG_SEND_DATA_ONLY() macro instead of selfmade equivalent.
brooks [Sun, 30 Mar 2008 02:42:39 +0000 (02:42 +0000)]
Add a new function is_default_interface() which determines if this
interface is one with the default route (or there isn't one). Use it to
decide if we should adjust the default route and /etc/resolv.conf.
Fix the delete of the default route. The if statement was totally bogus
and the delete only worked due to a typo. [1]
Reported by: Jordan Coleman <jordan at JordanColeman dot com> [1]
MFC after: 1 week
jeff [Sat, 29 Mar 2008 23:36:26 +0000 (23:36 +0000)]
- Don't allow calls to vn_lock() with no lock type requested. Callers
which simply want a reference should use vref(). Callers which want
to check validity need to hold a lock while performing any action
based on that validity. vn_lock() would always release the interlock
before returning making any action synchronous with the validity check
impossible.
jeff [Sat, 29 Mar 2008 23:24:54 +0000 (23:24 +0000)]
- Simplify null_hashget() and null_hashins() by using vref() rather
than a complex series of steps involving vget() without a lock type
to emulate the same thing.
jhb [Sat, 29 Mar 2008 17:46:03 +0000 (17:46 +0000)]
Change kgdb_parse() to use wrapped versions of parse_expression() and
evaluate_expression() so that any errors are caught and cause the function
to return 0. Otherwise the errors posted an exception (via longjmp())
that aborted the current operation. This fixes the kld handling for
older kernels (6.x and 7.x) that don't have the full pathname stored in
the kernel linker.
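The wrapping idea can be sketched with plain setjmp()/longjmp(). All
names below are hypothetical stand-ins; this is not gdb's own
error-catching interface, only the general pattern of turning a
posted exception into a 0 return.

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf error_env;

/* Hypothetical error path: unwinds to the most recent setjmp(). */
static void
post_error(void)
{
	longjmp(error_env, 1);
}

/* Stand-in for a parser that reports failure by posting an error. */
static long
parse_number(const char *s)
{
	char *end;
	long v;

	v = strtol(s, &end, 0);
	if (end == s || *end != '\0')
		post_error();
	return (v);
}

/* Wrapped version: any posted error is caught and becomes 0. */
static long
parse_number_or_zero(const char *s)
{
	if (setjmp(error_env) != 0)
		return (0);
	return (parse_number(s));
}
```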
marcel [Sat, 29 Mar 2008 17:33:29 +0000 (17:33 +0000)]
Change the order from SI_ORDER_FIRST to SI_ORDER_ANY (within
SI_SUB_DRIVERS) to avoid loading schemes before all the GEOM
classes have been loaded and initialized. Otherwise we may
end up using mutexes that haven't been initialized (due to
g_retaste() posting an event).
das [Sat, 29 Mar 2008 16:19:35 +0000 (16:19 +0000)]
Document modff() and modfl(). Technically, modff() and modfl()
live in libm, while modf() lives in libc due to historical
mistakes. I'm claiming in the manpage that they all live in libm,
since programmers should not rely on the mistake.
jeff [Sat, 29 Mar 2008 07:06:13 +0000 (07:06 +0000)]
- Use vm_object_reference_locked() directly from
vm_object_reference(). This is intended to get rid of vget()
consumers who don't wish to acquire a lock. This is functionally
the same as calling vref(). vm_object_reference_locked() already
uses vref.
jhb [Sat, 29 Mar 2008 03:48:06 +0000 (03:48 +0000)]
Initialize the head pointer in kld_current_sos() to NULL to avoid returning
a junk pointer and possibly causing a seg fault if we don't have any
non-kernel klds (or are unable to walk the list due to core / kernel
mismatch).
mlaier [Sat, 29 Mar 2008 00:24:36 +0000 (00:24 +0000)]
Make ALTQ cope with disappearing interfaces (particularly common with mpd
and netgraph in general). This also allows adding queues for an interface
that does not exist yet (you have to provide the bandwidth for the
interface, however).
attilio [Fri, 28 Mar 2008 12:30:12 +0000 (12:30 +0000)]
b_waiters cannot be adequately protected by the interlock because it is
dropped after the call to lockmgr(), so revert this approach and use
something similar to the previous one:
BUF_LOCKWAITERS() just checks if there are waiters (not the actual number
of them) and it is based on the newly introduced lockmgr_waiters(), which
returns whether the lockmgr has waiters or not. The name was chosen to
differ from the old lockwaiters() so that the two are not confused.
The KPI is extended by this commit, so a __FreeBSD_version bump and a
manpage update will follow soon.
'struct buf' also changes, so the kernel ABI is disturbed.
brooks [Fri, 28 Mar 2008 07:57:52 +0000 (07:57 +0000)]
Add support for hardwiring ppp sessions to particular devices with new
per-profile variables of the form ppp_<profile>_unit. No ppp_unit
variable is supported since tying the same unit to more than one profile
won't work.
marcel [Fri, 28 Mar 2008 06:31:12 +0000 (06:31 +0000)]
When retasting, wither any existing GEOMs of the same class. This
allows the class to create a different GEOM for the same provider
and avoids ending up with multiple GEOMs of the same class with the
same name.
For example, when a disk contains a PC98 partition table but
only MBR is supported, then the partition table can be treated
as an MBR. If support for PC98 is later loaded as a module, the
MBR scheme is pre-empted by the PC98 scheme as expected.
yongari [Fri, 28 Mar 2008 01:21:21 +0000 (01:21 +0000)]
In revisions 1.70, 1.71 and 1.84, re(4) tried to work around checksum
offload bugs by manually padding short IP/UDP frames. Unfortunately,
it seems that these workarounds do not work reliably on newer PCIe
variants of RealTek chips.
To work around the hardware bug, always pad short frames if Tx IP
checksum offload is requested. It seems that the hardware has a
bug in IP checksum offload handling. NetBSD manually pads short
frames only when the length of the IP frame is less than 28 bytes, but
I chose 60 bytes for safety. Also unconditionally set the IP checksum
offload bit in the Tx descriptor if any TCP or UDP checksum offload is
requested. This is what Linux does, but it is not mentioned in the
data sheet.
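The padding rule can be sketched on a flat buffer as follows. The
function name, its parameters and the flat-buffer model are
assumptions for illustration; the driver itself works on mbuf chains.
Only the 60-byte threshold comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	MIN_FRAME_LEN	60	/* pad threshold chosen in this commit */

/*
 * Zero-pad a frame shorter than MIN_FRAME_LEN when Tx IP checksum
 * offload is requested.  Returns the (possibly new) frame length.
 */
static size_t
pad_short_frame(uint8_t *buf, size_t len, size_t cap, int ip_csum)
{
	if (!ip_csum || len >= MIN_FRAME_LEN)
		return (len);
	assert(cap >= MIN_FRAME_LEN);
	memset(buf + len, 0, MIN_FRAME_LEN - len);
	return (MIN_FRAME_LEN);
}
```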
jb [Thu, 27 Mar 2008 23:21:25 +0000 (23:21 +0000)]
The sources covered by Sun's CDDL have been repo copied below the
src/cddl and src/sys/cddl directories per the core@ decision following
the license review.
This change modifies the affected Makefiles to reference the sources
in their new location.
mav [Thu, 27 Mar 2008 23:02:30 +0000 (23:02 +0000)]
Remove the ng_setisr() call from ng_dequeue(). It is useless, as we will
never exit ngintr() anyway while there are ready requests on the queue.
It was added years ago in the hope of parallel queue processing by several
net threads. But even if we sometimes have several threads, we have no
right to process the queue in parallel, as that would break the
serialization of the original requests, which is critically important for
some setups.
mav [Thu, 27 Mar 2008 20:04:20 +0000 (20:04 +0000)]
Switch from timeval to bintime, to use units of 1/(2^20) second instead
of microseconds. This allows bit shifts to be used instead of some heavy
64-bit mul/div operations.
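A sketch of why the 2^-20 second unit helps. The struct below only
mirrors the layout of struct bintime (seconds plus a 64-bit binary
fraction of a second); the struct name, both function names and the
token calculation are hypothetical and not taken from the changed
code.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct bintime: frac counts units of 2^-64 second. */
struct bt {
	int64_t		sec;
	uint64_t	frac;
};

/* Convert to units of 2^-20 second using shifts only. */
static uint64_t
bt_to_ticks20(struct bt t)
{
	assert(t.sec >= 0);
	return (((uint64_t)t.sec << 20) | (t.frac >> 44));
}

/* rate is per second; scaling by elapsed ticks needs no division. */
static uint64_t
tokens_for(uint64_t rate, uint64_t ticks)
{
	return ((rate * ticks) >> 20);
}
```

With microseconds the same scaling would be rate * usec / 1000000,
which needs a 64-bit divide.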
iedowse [Thu, 27 Mar 2008 18:02:30 +0000 (18:02 +0000)]
Add IFF_NEEDSGIANT to IFF_CANTCHANGE, to prevent user-level code
from clearing the IFF_NEEDSGIANT flag on Giant-locked interfaces.
In particular, wpa_supplicant was doing this on USB interfaces,
causing panics when Giant-locked code was then called without Giant.
dfr [Thu, 27 Mar 2008 11:54:20 +0000 (11:54 +0000)]
Add kernel module support for nfslockd and krpc. Use the module system
to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k'
option to rpc.lockd and make kernel NLM the default. A user can still
force the use of the old user NLM by building a kernel without NFSLOCKD
and/or removing the nfslockd.ko module.
alc [Thu, 27 Mar 2008 04:34:17 +0000 (04:34 +0000)]
MFamd64 with a few changes:
1. Add support for automatic promotion of 4KB page mappings to 2MB page
mappings. Automatic promotion can be enabled by setting the tunable
"vm.pmap.pg_ps_enabled" to a non-zero value. By default, automatic
promotion is disabled. Tested by: kris
2. To date, we have assumed that the TLB will only set the PG_M bit in a
PTE if that PTE has the PG_RW bit set. However, this assumption does
not hold on recent processors from Intel. For example, consider a PTE
that has the PG_RW bit set but the PG_M bit clear. Suppose this PTE
is cached in the TLB and later the PG_RW bit is cleared in the PTE,
but the corresponding TLB entry is not (yet) invalidated.
Historically, upon a write access using this (stale) TLB entry, the
TLB would observe that the PG_RW bit had been cleared and initiate a
page fault, aborting the setting of the PG_M bit in the PTE. Now,
however, P4- and Core2-family processors will set the PG_M bit before
observing that the PG_RW bit is clear and initiating a page fault. In
other words, the write does not occur but the PG_M bit is still set.
The real impact of this difference is not that great. Specifically,
we should no longer assert that any PTE with the PG_M bit set must
also have the PG_RW bit set, and we should ignore the state of the
PG_M bit unless the PG_RW bit is set.
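The resulting rule can be written as a small predicate. PG_RW and PG_M
are given their x86 page-table bit positions here; the function name
pte_is_dirty() and the use of a plain uint64_t for the PTE are
assumptions for illustration, not the pmap code itself.

```c
#include <stdint.h>

#define	PG_RW	0x002	/* x86 PTE read/write bit */
#define	PG_M	0x040	/* x86 PTE modified (dirty) bit */

/*
 * Treat a PTE as dirty only if both PG_M and PG_RW are set; a stale
 * PG_M on a read-only PTE is ignored.
 */
static int
pte_is_dirty(uint64_t pte)
{
	return ((pte & (PG_M | PG_RW)) == (PG_M | PG_RW));
}
```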
jb [Thu, 27 Mar 2008 01:33:26 +0000 (01:33 +0000)]
Allow awk (the one true one!) to handle 64 files instead of just 20.
The current FreeBSD syscall generation script uses all 20 and I need
another open file.
It's a shame that something named the 'one-true-awk' is so limited
by an old definition like FOPEN_MAX when it could just make the file
handling dynamic.
This is done to avoid touching contrib sources on a vendor branch.
phk [Wed, 26 Mar 2008 22:12:00 +0000 (22:12 +0000)]
Back in the good old days, PCs had random pieces of rock for
frequency generation and what frequency they generated was anyone's
guess.
In general the 32.768kHz RTC clock x-tal was the best, because that
was a regular wrist-watch Xtal, whereas the X-tal generating the
ISA bus frequency was much lower quality, often costing as much as
several cents a piece, so it made good sense to check the ISA bus
frequency against the RTC clock.
The other relevant property of those machines, is that they
typically had no more than 16MB RAM.
These days, CPU chips croak if their clocks are not tightly within
specs and all necessary frequencies are derived from the master
crystal by means of PLLs.
Considering that it takes on average 1.5 seconds to calibrate the
frequency of the i8254 counter, that more likely than not we will
not actually use the result of the calibration, and, as the final
clincher, that we seldom use the i8254 for anything besides BEL in
syscons anyway, it has become time to drop the calibration code.
If you need to tell the system what frequency your i8254 runs,
you can do so from the loader using hw.i8254.freq or using the
sysctl kern.timecounter.tc.i8254.frequency.
brooks [Wed, 26 Mar 2008 21:54:48 +0000 (21:54 +0000)]
Allow the characters .-+/ to appear in ppp profile names by folding them
to _ when evaluating ppp_<profile>_nat and ppp_<profile>_mode. Document
the per-profile variables.
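The folding rule, sketched in C for illustration only (the rc script
itself is shell). The function names and the 64-byte scratch buffer
are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Replace each of the characters . - + / with an underscore. */
static void
fold_profile(char *name)
{
	for (; *name != '\0'; name++)
		if (strchr(".-+/", *name) != NULL)
			*name = '_';
}

/* Build "ppp_<profile>_<suffix>"; returns 0, or -1 if it won't fit. */
static int
profile_var(char *out, size_t outlen, const char *profile,
    const char *suffix)
{
	char folded[64];
	int n;

	n = snprintf(folded, sizeof(folded), "%s", profile);
	if (n < 0 || (size_t)n >= sizeof(folded))
		return (-1);
	fold_profile(folded);
	n = snprintf(out, outlen, "ppp_%s_%s", folded, suffix);
	return ((n < 0 || (size_t)n >= outlen) ? -1 : 0);
}
```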
rwatson [Wed, 26 Mar 2008 21:29:13 +0000 (21:29 +0000)]
Add a comment explaining that we initialize the 'a' buffer for
zero-copy to the store buffer position on the BPF descriptor,
and the 'b' buffer as the free buffer in order to fill them in
the order documented in bpf(4).
jhb [Wed, 26 Mar 2008 20:48:07 +0000 (20:48 +0000)]
Fix a nit with the 'nofoo' options where 'foo' is mapped to 'nonofoo'
(such as 'atime' vs 'noatime'). The filesystems will always see either
'nofoo' or 'nonofoo', never plain 'foo'. As such, their list of valid
mount options should include 'nofoo' instead of 'foo'. With this fix,
you can do 'mount -u -o atime' on a FFS filesystem that isn't marked as
noatime without getting an error. You can also update a noatime FFS
filesystem mounted via mount(2) (e.g. 6.x /sbin/mount binary) to 'atime'
using nmount(2) (e.g. 7.x /sbin/mount binary).
phk [Wed, 26 Mar 2008 20:09:21 +0000 (20:09 +0000)]
The "free-lance" timer in the i8254 is only used for the speaker
these days, so de-generalize the acquire_timer/release_timer api
to just deal with speakers.
The new (optional) MD functions are:
timer_spkr_acquire()
timer_spkr_release()
and
timer_spkr_setfreq()
the last of which configures the timer to generate a tone of a given
frequency, in Hz instead of units of 1/1193182 second.
Drop timer2 entirely on pc98; it is not used anywhere at all.
Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if
they exist, and do nothing otherwise.
Remove prototypes and empty acquire-/release-timer() and sysbeep()
functions from the non-beeping archs.
This eliminates the need for the speaker driver to know about
i8254frequency at all. In theory this makes the speaker driver MI,
contingent on the timer_spkr_*() functions existing, but the driver
does not know this yet and still attaches to the ISA bus.
Syscons is trickier: in one function, sc_tone(), it knows the hz
and things are just fine.
In the other function, sc_bell(), it seems to get the period from
the KDMKTONE ioctl in terms of 1/1193182th of a second, so we hardcode
the 1193182 and leave it at that. It's probably not important.
Change a few other sysbeep() uses which obviously knew that the
argument was in terms of i8254 frequency, and leave alone those
that look like people thought sysbeep() took frequency in hertz.
This eliminates the knowledge of i8254_freq from all but the actual
clock.c code and the prof_machdep.c on amd64 and i386, where I think
it would be smart to ask for help from the timecounters anyway [TBD].
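The relation between the two units mentioned in this commit (a tone
frequency in Hz versus a period counted in 1/1193182 second) is a
single integer division. The sketch below uses hypothetical function
names and is not the timer_spkr_setfreq() implementation.

```c
#include <assert.h>

#define	I8254_FREQ	1193182u	/* i8254 input clock, in Hz */

/* Tone frequency in Hz -> period in units of 1/1193182 second. */
static unsigned
hz_to_period(unsigned hz)
{
	assert(hz != 0);
	return (I8254_FREQ / hz);
}

/* Period in units of 1/1193182 second -> tone frequency in Hz. */
static unsigned
period_to_hz(unsigned period)
{
	assert(period != 0);
	return (I8254_FREQ / period);
}
```

Because both directions truncate, a round trip is only approximate
for frequencies that do not divide 1193182 evenly.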
dfr [Wed, 26 Mar 2008 15:23:12 +0000 (15:23 +0000)]
Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.
Highlights include:
* Thread-safe kernel RPC client - many threads can use the same RPC
client handle safely with replies being de-multiplexed at the socket
upcall (typically driven directly by the NIC interrupt) and handed
off to whichever thread matches the reply. For UDP sockets, many RPC
clients can share the same socket. This allows the use of a single
privileged UDP port number to talk to an arbitrary number of remote
hosts.
* Single-threaded kernel RPC server. Adding support for a multi-threaded
server would be relatively straightforward and would follow
approximately the Solaris KPI. A single thread should be sufficient
for the NLM since it should rarely block in normal operation.
* Kernel mode NLM server supporting cancel requests and granted
callbacks. I've tested the NLM server reasonably extensively - it
passes both my own tests and the NFS Connectathon locking tests
running on Solaris, Mac OS X and Ubuntu Linux.
* Userland NLM client supported. While the NLM server doesn't have
support for the local NFS client's locking needs, it does have to
field async replies and granted callbacks from remote NLMs that the
local client has contacted. We relay these replies to the userland
rpc.lockd over a local domain RPC socket.
* Robust deadlock detection for the local lock manager. In particular
it will detect deadlocks caused by a lock request that covers more
than one blocking request. As required by the NLM protocol, all
deadlock detection happens synchronously - a user is guaranteed that
if a lock request isn't rejected immediately, the lock will
eventually be granted. The old system allowed for a 'deferred
deadlock' condition where a blocked lock request could wake up and
find that some other deadlock-causing lock owner had beaten them to
the lock.
* Since both local and remote locks are managed by the same kernel
locking code, local and remote processes can safely use file locks
for mutual exclusion. Local processes have no fairness advantage
compared to remote processes when contending to lock a region that
has just been unlocked - the local lock manager enforces a strict
first-come first-served model for both local and remote lockers.
phk [Wed, 26 Mar 2008 15:03:24 +0000 (15:03 +0000)]
Rename timer0_max_count to i8254_max_count.
Rename timer0_real_max_count to i8254_real_max_count and make it static.
Rename timer_freq to i8254_freq and make it a loader tunable.
remko [Wed, 26 Mar 2008 06:45:28 +0000 (06:45 +0000)]
Document the removal data for usbdevs.h and usbdevs_data.h,
sort the entry into its correct place (behind 200407XX before
200406XX because we have an explicit date here).