Robert Watson [Sun, 29 May 2005 16:10:33 +0000 (16:10 +0000)]
Add place-holder audit.h that defines only au_event_t, which is needed
in order to modify the system call table to include event identifiers.
The full audit.h will be merged at a later date.
Robert Watson [Sun, 29 May 2005 13:40:00 +0000 (13:40 +0000)]
Modify vmstat(8)'s domem() routine, which is responsible for extracting
malloc(9) statistics from kernel memory or a kernel coredump, to catch
up with recent changes to adopt per-CPU malloc(9) statistics. The new
routines walk the per-CPU statistics pools and coalesce them for
presentation to the user.
Robert Watson [Sun, 29 May 2005 13:38:07 +0000 (13:38 +0000)]
Kernel malloc layers malloc_type allocation over one of two underlying
allocators: a set of power-of-two UMA zones for small allocations, and the
VM page allocator for large allocations. In order to maintain unified
statistics for specific malloc types, kernel malloc maintains a separate
per-type statistics pool, which can be monitored using vmstat -m. Prior
to this commit, each pool of per-type statistics was protected using a
per-type mutex associated with the malloc type.
This change modifies kernel malloc to maintain per-CPU statistics pools
for each malloc type, and protects writing those statistics using critical
sections. It also moves to unsynchronized reads of per-CPU statistics
when generating coalesced statistics. To do this, several changes are
implemented:
- In the previous world order, the statistics memory was allocated by
the owner of the malloc type structure, allocated statically using
MALLOC_DEFINE(). This embedded the definition of the malloc_type
structure into all kernel modules. Move to a model in which a pointer
within struct malloc_type points at a UMA-allocated
malloc_type_internal data structure owned and maintained by
kern_malloc.c, and not part of the exported ABI/API to the rest of
the kernel. For the purposes of easing a possible MFC, re-use an
existing pointer in 'struct malloc_type', and maintain the current
malloc_type structure size, as well as layout with respect to the
fields reused outside of the malloc subsystem (such as ks_shortdesc).
There are several unused fields as a result of no longer requiring
the mutex in malloc_type.
- Struct malloc_type_internal contains an array of malloc_type_stats,
of size MAXCPU. The structure defined above avoids hard-coding a
kernel compile-time value of MAXCPU into kernel modules that interact
with malloc.
- When accessing per-cpu statistics for a malloc type, surround read -
modify - update requests with critical_enter()/critical_exit() in
order to avoid races during write. The per-CPU fields are written
only from the CPU that owns them.
- Per-CPU stats now maintained "allocated" and "freed" counters for
number of allocations/frees and bytes allocated/freed, since there is
no longer a coherent global notion of the totals. When coalescing
malloc stats, accept a slight race between reading stats across CPUs,
and avoid showing the user a negative allocation count for the type
in the event of a race. The global high watermark is no longer
maintained for a malloc type, as there is no global notion of the
number of allocations.
- While tearing up the sysctl() path, also switch to using sbufs. The
current "export as text" sysctl format is retained with the same
syntax. We may want to change this in the future to export more
per-CPU information, such as how allocations and frees are balanced
across CPUs.
This change results in a substantial speedup of kernel malloc and free
paths on SMP, as critical sections (where usable) out-perform mutexes
due to avoiding atomic/bus-locked operations. There is also a minor
improvement on UP due to the slightly lower cost of critical sections
there. The cost of the change to this approach is the loss of a
continuous notion of total allocations that can be exploited to track
per-type high watermarks, as well as increased complexity when
monitoring statistics.
Due to carefully avoiding changing the ABI, as well as hardening the ABI
against future changes, it is not necessary to recompile kernel modules
for this change. However, MFC'ing this change to RELENG_5 will require
also MFC'ing optimizations for soft critical sections, which may modify
exposed kernel ABIs. The internal malloc API is changed, and
modifications to vmstat in order to restore "vmstat -m" on core dumps will
follow shortly.
Peter Grehan [Sun, 29 May 2005 08:51:21 +0000 (08:51 +0000)]
The end values passed to rman_manage_region() for PCI i/o and mem
spaces were 1 too large. This resulted in the rman list not being
sorted correctly, and USB ports not being discovered on older
TiBooks.
Detective work by: Andreas Tobler <toa at pop dot agri dot ch>
Xin LI [Sun, 29 May 2005 08:43:44 +0000 (08:43 +0000)]
Add VESA mode support for syscons, which enables the support of 15, 16,
24, and 32 bit modes. To use that, syscons(4) must be built with
the compile time option 'options SC_PIXEL_MODE', and VESA support (a.k.a.
vesa.ko) must be either loaded, or be compiled into the kernel.
Do not return EINVAL when the mouse state is changed to what it already is,
which seems to cause problems when you have two mice attached, and
applications are not likely obtain useful information through the EINVAL
caused by showing the mouse pointer twice.
Teach vidcontrol(8) about mode names like MODE_<NUMBER>, where <NUMBER> is
the video mode number from the vidcontrol -i mode output. Also, revert the
video mode if something fails.
Obtained from: DragonFlyBSD
Discussed at: current@ with patch attached [1]
PR: kern/71142 [2]
Submitted by: Xuefeng DENG <dsnofe at msn com> [1],
Cyrille Lefevre <cyrille dot lefevre at laposte dot net> [2]
Fix panic when module is compiled in and it is loaded from loader.conf.
Only panic is fixed, module will be still listed in kldstat(8) output.
Not sure what is correct fix, because adding unloading code in case of
failure to linker_init_kernel_modules() doesn't work.
Change the way options are parsed on the `#!'-line of a shell-script. Instead
of having the kernel parse that line and add an entry to the argument list for
each 'separate word' it finds, have it add only one entry which holds all
the words found on that line. The old behavior is useful in some situations,
but it does not match the way any other operating system will parse that line.
This has been discussed in the thread "Bug in #! processing - One More Time"
on the freebsd-arch mailing list (starting back on Feb 24, 2005). The first
few messages in that thread provide the background in much detail.
Robert Watson [Sat, 28 May 2005 14:34:41 +0000 (14:34 +0000)]
Explicitly acquire Giant around the ntp_gettime() and assert it in the
sysctl path. While this code is close to MPSAFE, it may require some
additional locking. Mark ntp_gettime1() as GIANT_REQUIRED for now.
Robert Watson [Sat, 28 May 2005 13:23:42 +0000 (13:23 +0000)]
Mark the following compatability system calls as MCOMPAT or MCOMPAT4 based
on the their simply wrapping MPSAFE implementations of existing MPSAFE
system calls:
Note that ogetdirentries() is not marked MPSAFE because it does not share
the MPSAFE implementation used for getdirentries(), and requires separate
locking to be implemented.
Bjoern A. Zeeb [Sat, 28 May 2005 13:15:44 +0000 (13:15 +0000)]
Fix use of uninitialized variable len in ngd_send.
Note: len gets intialized to 0 for sap == NULL case only to
make compiler on amd64 happy. This has nothing todo with the
former uninitialized use of len in sap != NULL case.
Robert Watson [Sat, 28 May 2005 12:58:54 +0000 (12:58 +0000)]
Acquire Giant explicitly in fhopen(), fhstat(), and kern_fhstatfs(),
so that we can start to eliminate the presence of non-MPSAFE system
call entries in syscalls.master.
Maksim Yevmenkin [Sat, 28 May 2005 00:48:42 +0000 (00:48 +0000)]
Move AVM USB Bluetooth-Adapter BlueFritz! from "broken" devices list
(where I incorrectly put it initially) to "ignored" devices list (where
it should be). Pointy hat goes to me.
John Baldwin [Fri, 27 May 2005 19:31:00 +0000 (19:31 +0000)]
- Add support to the loader for multiple consoles.
- Teach the i386 and pc98 loaders to honor multiple console requests from
their respective boot2 binaries so that the same console(s) are used in
both boot2 and the loader.
- Since the kernel doesn't support multiple consoles, whichever console is
listed first is treated as the "primary" console and is passed to the
kernel in the boot_howto flags.
PR: kern/66425
Submitted by: Gavin Atkinson gavin at ury dot york dot ac dot uk
MFC after: 1 week
Robert Watson [Fri, 27 May 2005 17:52:56 +0000 (17:52 +0000)]
dd a '-n' option to ministat, which causes it to display only summary
statistics, not graph and statistical test output. Useful for automated
processing.
Robert Watson [Fri, 27 May 2005 17:16:43 +0000 (17:16 +0000)]
In the current world order, each socket has two mutexes: a mutex
that protects socket and receive socket buffer state, and a second
mutex to protect send socket buffer state. In some places, the
mutex shared between the socket and receive socket buffer will be
acquired twice, once by each layer, resulting in some
inconsistency, but providing the abstraction benefit of being able
to more easily separate the two mutexes in the future if desired.
When transitioning a socket to the SS_ISDISCONNECTING or
SS_ISDISCONNECTED states, grab the socket/receive socket buffer lock
once rather than grabbing it as the socket lock, modifying socket
state, then grabbing a second time as the receive lock in order to
modify the socket buffer state to indicate no further data can be
read. This change is believed to close a race between the change in
socket state and the change in socket buffer state, which for a
remotely initiated close on a UNIX domain socket, resulted in
soreceive() returning ENOTCONN rather than an EOF condition.
A similar race still exists in the case of send, however, and is
harder to fix as the socket and send socket buffer mutexes are not
the same, and we would like to avoid holding combinations of socket
mutexes over sb_upcall until we've finished clarifying the locking
protocol for upcalls.
This change has the side affect of reducing the number of mutex
operations to initiate disconnect or perform disconnect on a
socket by two.
David Xu [Fri, 27 May 2005 15:57:27 +0000 (15:57 +0000)]
Remove thread_upcall_check, it was used to avoid race bug in earlier
day's sleep queue code, today the bug no longer exists.
please see 04/25/2004 freebsd-threads@ mailing list archive.
Robert Watson [Fri, 27 May 2005 12:25:42 +0000 (12:25 +0000)]
Back out ipx.h:1.18, which introduced a Linux API compatibility field in
the ipx_net data structure. Doing so introduced a stronger alignment
requirement for the address structure, which in turn propagated into
other dependent data structures, which turns out not to be suported by
the available IPX source code. As a result, a number of user space
applications, such as IPX routing components, failed to operate
correctly.
RELENG_5_3 candidate?
PRs: 74059, 80266
Pointy hat to: bms
Fix by: bde
Tested by: Keith White <Keith dot White at site dot uottawa dot ca>
MFC after: 1 week
Suffering: great
Fix long (and long long) to long double, unsigned to long double and unsigned
long (and unsigned long long) to long double conversions.
- Add a parameter that specifies the position of the sign bit to the _QP_TTOQ
macro, previously it always looked at bit 31. Pass a negative number to
disable sign inspection for unsigned types. This fixes _Qp_xtoq(),
_Qp_uitoq() and _Qp_uxtoq().
- In the functions __fpu_itof() and __fpu_xtof(), look at the sign bit to
decide whether we're doing a conversion from an unsigned type. If so, don't
negate the mantissa if the integer exceeds the biggest signed number.
Maxime Henrion [Fri, 27 May 2005 00:05:16 +0000 (00:05 +0000)]
Use clnt_create_timed() instead of clnt_create(). The former has an
additional argument that allows us to specify a timeout, like we do for
the subsequent clnt_call() calls.
Submitted by: Jeremie Le Hen <jeremie@le-hen.org>
MFC after: 3 weeks
Tony Ackerman [Thu, 26 May 2005 23:32:02 +0000 (23:32 +0000)]
Changes to update driver with latest Intel driver version 2.1.7
- Changed from using explicit devices id to using descriptive labels.
- Added support for 82573 and 82546 Quad adapters.
- Corrected support for 82547EI and 82541ER (mac_type was not assigned)
- Removed #ifdef DBG_STATS and extraneous code.
if_em_hw.c/if_em_hw.h
- Added support for 82573 and 82546 Quad adapters.
- Brought forward Intel's most current mac and phy changes.
Eivind Eklund [Thu, 26 May 2005 23:01:30 +0000 (23:01 +0000)]
Baby, we are not in Kansas anymore. Nor are we in 1996 or FreeBSD 2.1.
Note that these papers are mostly quite old, and add a pointer to
more recent docs.
Olivier Houchard [Thu, 26 May 2005 15:01:13 +0000 (15:01 +0000)]
Don't call vm_page_dirty() in pmap_nuke_pv(), it's not the place to do so, and
it leads to funny things, such as pmap_remove_all() marking the page as dirty.
Gleb Smirnoff [Thu, 26 May 2005 06:50:00 +0000 (06:50 +0000)]
Plug mbuf leak, that I have introduced in 1.85. Also restore important comment
from if_ethersubr.c:1.178. While here adjust formatting, to make code more
readable.
Looking just at Makefiles was slightly too narrow to catch all
"inofficial" declarations of maintainership. Grep all plain files,
and insert the actual command the list was generated with, so
future updates avoid that pitfall.
Removed sheldonh@ who just released his maintainership of ntp/doc
after I informed him of this effort.
Sheldon Hearn [Wed, 25 May 2005 16:30:43 +0000 (16:30 +0000)]
Release maintainership. More ambitious minds have plans for the ntp
docs. Last I heard, Harlan Stenn was considering using FreeBSD's
pages as a starting point for the ISC NTP distribution's own pages.
If that happens, everyone wins and these can go away, to be replaced
by imported versions in contrib/ntp.
Hartmut Brandt [Wed, 25 May 2005 16:06:14 +0000 (16:06 +0000)]
Under certain conditions the condition parser would go one past end of
the string. Until now this caused no harm, because the buffer code used
to tack two NULs onto buffers. With the new, soon to come, parsing code
this isn't the case anymore in all cases, so fix this.