Optimize the TX side of the part by using the PDC to move bytes out to
the wire. This increases the speed considerably. Start to put
infrastructure in place to do RX side, but that requires more study
before it can be done.
Re-correct commit 1.73, but this time in a way that does not cause
all column-headers to print in lowercase by default. I was in too
much of a rush in committing 1.75, and didn't notice that the case
had changed. This time I did considerably more testing, and used
'diff' instead of just quickly eyeballing the results...
Apologies. I expect this means the dunce cap is mine for awhile.
If this doesn't work, I'll just drop back to 1.72 and hide under
my desk for awhile.
Keep track of the number of in-progress async direct IO writes in the nfsnode.
Make fsync/close wait until all of these drain. Add a check to nfs_getpage() and
nfs_putpage().
Back out the fan control changes that were merged in revision 1.2.2.5.
The necessary changes to the driver haven't been merged yet, which won't
happen before 6.1-RELEASE.
Cache the value of the lower half of each I/O APIC redirection table entry
so that we only have to do an ioapic_write() instead of an ioapic_read()
followed by an ioapic_write() every time we mask and unmask level triggered
interrupts. This cuts the execution time for these operations roughly in
half.
Profiled by: Paolo Pisati <p.pisati@oltrelinux.com>
MFC after: 1 week
Fix a problem introduced by change 1.73, which causes a seg-fault if
the user specifies a keyword which is an alias to some other keyword.
E.g.: stat (for state) or pcpu (for %cpu)..
marius [Tue, 4 Apr 2006 21:00:44 +0000 (21:00 +0000)]
For USIII CPUs the type of the trap caused by peeking/poking non-existent
PCI devices apparently was changed from a special deferred trap with TPC
pointing to the membar #Sync following the failing load/store instruction
to a precise trap with TPC pointing to the failing load/store instruction.
Thus remove the check the check whether TPC points to a membar #Sync in
case of a data access trap as it's off-by-one for USIII CPUs and it should
be sufficient to check whether the trap happend while in fasword*() to
properly detect traps caused by peeking/poking. This also corresponds to
what other OSs do. Note that also only the USIIi manual suggests to check
the TPC for such traps while the USII one doesn't (in the public USIII
manual device peeking/poking isn't mentioned at all).
Add init_lock, and use it to protect against allocator initialization
races. This isn't currently necessary for libpthread or libthr, but
without it external threads libraries like the linuxthreads port are
not safe to use.
Turn a file that was mostly style(9) compliant to a file that's really close
to being completely style(9). The odd-ball indentation in a few places was
really distracting.
The Z8530 on the MacIO has an interrupt per channel. Deal with this
by having interrupt resource variables per channel. We don't set up
different interrupt handlers per channel, though.
Before dereferencing intotw() when INP_TIMEWAIT, check for inp_ppcb being
NULL. We currently do allow this to happen, but may want to remove that
possibility in the future. This case can occur when a socket is left
open after TCP wraps up, and the timewait state is recycled. This will
be cleaned up in the future.
Found by: Kazuaki Oda <kaakun at highway dot ne dot jp>
MFC after: 3 months
Remove unused variables s and error in key_detach. The previous
revision removed their usage but did not remove the declaration. This
caused a warning in my build, which was fatal with -Werror.
jeff [Tue, 4 Apr 2006 06:46:10 +0000 (06:46 +0000)]
- VFS_LOCK_GIANT when recycling a vnode via getnewvnode. We may be
recycling for an unrelated filesystem. I really don't like potentially
acquiring giant in the context of a giantless filesystem but there
are reasonable objections to removing the recycling from this path.
jeff [Tue, 4 Apr 2006 06:44:21 +0000 (06:44 +0000)]
- Properly check against B_DELWRI and B_NEEDSGIANT. This check was
incorrectly written and caused some !NEEDSGIANT buffers to be put in
the NEEDSGIANT queue.
Refactor per-run bitmap manipulation functions so that bitmap offsets only
have to be calculated once per allocator operation.
Make nil const.
Update various comments.
Remove/avoid division where possible.
For the one division operation that remains in the critical path, add a
switch statement that has a case for each small size class, and do division
with a constant divisor in each case. This allows the compiler to generate
optimized code that does not use hardware division [1].
Simplify _get_curthread() and _tcb_ctor because libc and rtld now
already allocate thread pointer space in tls block for initial thread.
Only i386 and amd64 have been done, others still have to be tested.
Increment kdb_active after we stopped the other CPUs and decrement
kdb_active before we restart them. This avoids false positives on
restarted CPUs when they test for kdb_active while kdb_trap() is
still finishing up.
Improve handling of IPI_STOP:
o use atomic operations to fiddle with stopped_cpus and started_cpus.
o disable interrupts while we're waiting to be started.
o remove logic relating to cpustop_restartfunc as it's not used.
Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the
PCB in which the context of stopped CPUs is stored. To access this
PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The
definition, when present, lives in <machine/kdb.h> and abstracts
where MD code saves the context. Define KDB_STOPPEDPCB on i386,
amd64, alpha and sparc64 in accordance to previous code.
peter [Mon, 3 Apr 2006 21:36:01 +0000 (21:36 +0000)]
Shrink the amd64 pv entry from 48 bytes to about 24 bytes. On a machine
with large mmap files mapped into many processes, this saves hundreds of
megabytes of ram.
pv entries were individually allocated and had two tailq entries and two
pointers (or addresses). Each pv entry was linked to a vm_page_t and
a process's address space (pmap). It had the virtual address and a
pointer to the pmap.
This change replaces the individual allocation with a per-process
allocation system. A page ("pv chunk") is allocated and this provides
168 pv entries for that process. We can now eliminate one of the 16 byte
tailq entries because we can simply iterate through the pv chunks to find
all the pv entries for a process. We can eliminate one of the 8 byte
pointers because the location of the pv entry implies the containing
pv chunk, which has the pointer. After overheads from the pv chunk
bitmap and tailq linkage, this works out that each pv entry has an
effective size of 24.38 bytes.
Future work still required, and other problems:
* when running low on pv entries or system ram, we may need to defrag
the chunk pages and free any spares. The stats (vm.pmap.*) show that
this doesn't seem to be that much of a problem, but it can be done if
needed.
* running low on pv entries is now a much bigger problem. The old
get_pv_entry() routine just needed to reclaim one other pv entry.
Now, since they are per-process, we can only use pv entries that are
assigned to our current process, or by stealing an entire page worth
from another process. Under normal circumstances, the pmap_collect()
code should be able to dislodge some pv entries from the current
process. But if needed, it can still reclaim entire pv chunk pages
from other processes.
* This should port to i386 really easily, except there it would reduce
pv entries from 24 bytes to about 12 bytes.
marius [Mon, 3 Apr 2006 21:27:01 +0000 (21:27 +0000)]
- s,tramoline,trampoline, in a comment.
- Use FBSDID in trap.c
- Make the global trap_sig[] static as it's not used outside of trap.c.
- In sendsig() remove an unused variable.
- In trap() sync with the other archs; for fast data access MMU miss and
data access protection traps set ksi_addr to the SFAR reg which contains
the faulting address and otherwise to the TPC reg. Generally the TCP reg
contains the address of the instruction that caused the exception, except
for fast instruction access traps (and some others; more refinement may
be needed here) it also contains the faulting address.
Previously sendsig() always set si_addr to the SFAR reg which is wrong
for most traps.
- In sendsig() add support for FreeBSD old-style signals.
These changes are inspired by kmacy's sun4v changes and allow libsigsegv
to build on FreeBSD/sparc64, but it doesn't pass all checks and tests it
actually should, yet.
In kdb_trap(), change the type of the local variable 'intr' from int
to register_t, as intr_disable() returns the latter and register_t
may be wider than int.
Milosz (sorry for not using the right 'l', it will not display corretly
in the commit log) submitted support for some NO_* knobs for delete-old*
and check-old. I converted it to the new WITHOUT_* knobs (more correctly:
MK_*) and added some dummy ones so that people can see what's missing.
Volunteers can have a look at http://phk.freebsd.dk/misc/build_options/
for a list of files.
The location looks a little bit odd to me, but I don't care about the
color of this bikeshed and follow the suggestion of our build
infrastructure guru to place it "somewhere under src/tools/ please". [1]
The build/mk/ directory looks more sane to me than the other ones there.
sam [Mon, 3 Apr 2006 18:14:02 +0000 (18:14 +0000)]
o add opt_ath.h enable tweaking various config parameters for the driver
without modifying the source code
o default debug msgs and diag support to off
Replace critical_enter() and critical_exit() in kdb_trap() with
intr_disable() and intr_restore() resp. Previously, critical
regions would have interrupts disabled, but that was changed.
Consequently, the debugger could run with interrupts enabled.
This could cause problems for the low-level console code where
received characters would trigger an interrupt that causes
the interrupt handler to read the character instead of the
cngetc() function.
In TCP notify routines, check inpcb for INP_TIMEWAIT and INP_DROPPED.
The INP_DROPPED check replaces the current NULL checks; the INP_TIMEWAIT
checks appear to have always been required, but not been there, which
is/was a bug. This avoids unconditionally casting of in_ppcb to a tcpcb,
when it may be a twtcb, which may have resulted in obscure ICMP-related
panics in earlier releases.
Change inp_ppcb from caddr_t to void *, fix/remove associated related
casts.
Consistently use intotw() to cast inp_ppcb pointers to struct tcptw *
pointers.
Consistently use intotcpcb() to cast inp_ppcb pointers to struct tcpcb *
pointers.
Don't assign tp to the results to intotcpcb() during variable declation
at the top of functions, as that is before the asserts relating to
locking have been performed. Do this later in the function after
appropriate assertions have run to allow that operation to be conisdered
safe.
Fix up locking surrounding tcp_drop sysctl: in the new world order, we
don't free inpcbs until after the socket is closed, so we always need
to unlock an inpcb after calling tcp_drop() on it.
After checking for SO_ISDISCONNECTED in tcp_usr_accept(), return
immediately rather than jumping to the normal output handling, which
assumes we've pulled out the inpcb, which hasn't happened at this
point (and isn't necessary).
Return ECONNABORTED instead of EINVAL when the inpcb has entered
INP_TIMEWAIT or INP_DROPPED, as this is the documented error value.
Eliminate the sc_hasfifo flag from the softc. It was only used by
the NS8250 class driver. The UART has FIFOs if sc_rxfifosz>1, so
test for that instead.
While here properly initialize sc_rxfifosz and sc_txfifosz in the
case the UART doesn't have FIFOs.
Add test cases that check utility syntax errors and redirection errors. For
special built-in utilities they must terminate the shell, for other
utilities only a error message shall be written. We currently fail both
tests.
Issue an error when . (dot) is invoked without a filename. The synopsis
is just ". file" according to POSIX, however many other shells allow
arguments to be passed after the file. For compatibility (we even use that
feature in buildworld) additional arguments are not considered to be an
error, even though this shell does not do anything with the arguments at all.
Use -s to flag POSIX's "special built-in" utilities in builtins.def. Add a
new member to struct builtincmd and set it to 1 if -s was specified. This
is done because there are cases where special builtins must be treated
differently from other builtins.
During reformulation of tcp_usr_detach(), the call to initiate TCP
disconnect for fully connected sockets was dropped, meaning that if
the socket was closed while the connection was alive, it would be
leaked. Structure tcp_usr_detach() so that there are two clear
parts: initiating disconnect, and reclaiming state, and reintroduce
the tcp_disconnect() call in the first part.
New release notes:
hwpmc(4) and pmcstat(8) profiling support for dynamically-loaded
objects, and pmcstat(8) network logging support added[1].
Fix incorrect entries:
gmirror/graid3 uses parallel I/O request for synchronization now.
The parallel I/O request itself has been already supported[2].
Fix typos and the following incorrect entries:
kbdmux(4) GENERIC support currently in amd64 and i386 only[1],
uart(4) GENERIC support currently not in pc98[1],
speaker(4) on amd64 entry needs arch="amd64"[2],
hptmv(4) update entry needs arch="amd64,i386"[2], and
OpenSSH 4.3p1 import has not been merged yet[2].
- Teach pmcstat(8) to log over the network; the -O option now
takes a host:port specification.
- Update the manual page and add an example showing how log
over the network using pmcstat(8) and nc(1). Document the
current inability to process logs in cross-platform manner.
- Have pmcstat_open_log() call err(3) directly in case
of an error; this simplifies error handling in its caller.
When running the second part of the test, kill off the server process from
the first part before starting, or the TCP port we want to bind may be in
use still. Sleep for a short period between tests.
file descriptor handling bug fixed,
support for copying console messages to remote gdb(1),
kbdmux(4) now in GENERIC by default,
scc(4) for generic serial devices added,
net.inet.ip.portrange.reserved[high|low] for IPv6,
ata(4) USB mass storage class support,
gmirror kernel crash dump support,
gmirror and graid3 parallel I/O request support,
mfi(4) for LSI MegaRAID SAS controller family added,
getfacl(1) -q option added,
gvinum(8) resetconfig sub-command support,
libarchive(3) tp format support removed,
libarchive(3) POSIX.1e EA support added,
libc symbol map added,
libm symbol map added,
ls(1) -U option added,
mdconfig(8) XML device listing support,
mdconfig(8) -u option now supports comma-separated multiple devices,
BIND9 DNS resolver library imported,
strtonum(3) from OpenBSD,
rc.d/ike removed,
GNU Readline library updated to 5.1,
OpenSSH updated to 4.3p1,
hostapd updated to 0.4.8,
WPA supplicant updated to 0.4.8,
zlib updated to 1.2.3,
pkg_add(1) -F option added,
portsnap(1) HTTP_PROXY_AUTH handling bug fixed,
"make showconfig" in src/Makefile added, and
/etc/src.conf added.
Introduce pmap_try_insert_pv_entry(), a function that conditionally creates
a pv entry if the number of entries is below the high water mark for pv
entries.
Use pmap_try_insert_pv_entry() in pmap_copy() instead of
pmap_insert_entry(). This avoids possible recursion on a pmap lock in
get_pv_entry().
Eliminate the explicit low-memory checks in pmap_copy(). The check that
the number of pv entries was below the high water mark was largely
ineffective because it was located in the outer loop rather than the
inner loop where pv entries were allocated. Instead of checking, we
attempt the allocation and handle the failure.
Reviewed by: tegge
Reported by: kris
MFC after: 5 days