uqs [Tue, 2 Mar 2010 19:04:07 +0000 (19:04 +0000)]
Remove manual .includes in cddl Makefiles
- Break the dependency on ../Makefile.inc for .PATH, and include
../Makefile.inc implicitly. This is required to ...
- Set WARNS?=6 in top-level Makefile.inc
- Remove now redundant WARNS settings, add WARNS?=0 where appropriate
- Remove redundant SHLIB_MAJOR overrides
- Use NO_MAN, not MK_MAN=no
- Remove redundant inclusion of bsd.own.mk
- Order Makefiles more according to style.Makefile(5)
- Reduce diff of cddl Makefiles against each other
luigi [Tue, 2 Mar 2010 17:40:48 +0000 (17:40 +0000)]
Bring in the most recent version of ipfw and dummynet, developed
and tested over the past two months in the ipfw3-head branch. This
also happens to be the same code available in the Linux and Windows
ports of ipfw and dummynet.
The major enhancement is a completely restructured version of
dummynet, with support for different packet scheduling algorithms
(loadable at runtime), faster queue/pipe lookup, and a much cleaner
internal architecture and kernel/userland ABI which simplifies
future extensions.
In addition to the existing schedulers (FIFO and WF2Q+), we include
a Deficit Round Robin (DRR or RR for brevity) scheduler, and a new,
very fast version of WF2Q+ called QFQ.
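
For readers unfamiliar with Deficit Round Robin, the general idea can be
sketched in userland as below. This is only an illustration of the
textbook algorithm, not the dummynet code; the function name
drr_schedule and its arguments are hypothetical.

```python
from collections import deque


def drr_schedule(flows, quantum):
    """Return the order in which packets leave under Deficit Round Robin.

    flows   -- dict mapping a flow name to a list of packet lengths (bytes)
    quantum -- bytes of credit each backlogged flow earns per round
    Illustrative only; names and structure are assumptions, not dummynet's.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    queues = {name: deque(pkts) for name, pkts in flows.items()}
    deficit = {name: 0 for name in flows}
    active = deque(name for name in flows if queues[name])
    sent = []
    while active:
        name = active.popleft()
        deficit[name] += quantum
        q = queues[name]
        # Send packets while the head packet fits in the accumulated credit.
        while q and q[0] <= deficit[name]:
            length = q.popleft()
            deficit[name] -= length
            sent.append((name, length))
        if q:
            active.append(name)  # still backlogged: keep the leftover credit
        else:
            deficit[name] = 0    # an emptied flow does not hoard credit
    return sent
```

A flow whose head packet is larger than its current credit simply waits
and accumulates another quantum on the next round, which is what keeps
the long-run share of bandwidth proportional to the quantum.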
Some test code is also present (in sys/netinet/ipfw/test) that
lets you build and test schedulers in userland.
Also, we have added a compatibility layer that understands requests
from the RELENG_7 and RELENG_8 versions of the /sbin/ipfw binaries,
and replies correctly (at least, it does its best; sometimes you
just cannot tell who sent the request and how to answer).
The compatibility layer should make it possible to MFC this code in a
relatively short time.
Some minor glitches (e.g. handling of ipfw set enable/disable,
and a workaround for a bug in RELENG_7's /sbin/ipfw) will be
fixed with separate commits.
CREDITS:
This work has been partly supported by the ONELAB2 project, and
mostly developed by Riccardo Panicucci and myself.
The code for the qfq scheduler is mostly from Fabio Checconi,
and Marta Carbone and Francesco Magno have helped with testing,
debugging and some bug fixes.
rpaulo [Tue, 2 Mar 2010 12:59:42 +0000 (12:59 +0000)]
Couple of suggestions from Sam regarding latest commit:
o rename the new variables to comply with the naming scheme
o move the new variables to an AR5212 specific struct
o use ahp when available
o revert to previous ts_flags check
rrs [Tue, 2 Mar 2010 12:11:00 +0000 (12:11 +0000)]
- Move rmi_pci_bus_space to header and avoid extern
- remove unused and commented code (MIPS_BUS_SPACE_PCI, pic_usb_ack)
- use rmi_pci_bus_space for USB too (needs byteswap)
- uncomment xls_ehci.c in files.xlr
- changes to xls_ehci.c - updated with dev/usb/controller/ehci_*.c as
glebius [Tue, 2 Mar 2010 10:43:41 +0000 (10:43 +0000)]
Sync with recent changes from luigi: struct ng_ipfw_tag is superseded
by the more general ipfw_rule_ref. The latter isn't documented here, since
it should be documented in ipfw.4.
alfred [Tue, 2 Mar 2010 06:58:58 +0000 (06:58 +0000)]
Merge projects/enhanced_coredumps (r204346) into HEAD:
Enhanced process coredump routines.
This brings in the following features:
1) Limit number of cores per process via the %I coredump formatter.
Example:
If corefilename is set to %N.%I.core and num_cores = 3, then the
first time a process "rpd" dumps core, the corefile is named
"rpd.0.core". If it dumps core again, the kernel generates
"rpd.1.core", and so on until the limit of "num_cores" is reached.
This is useful for keeping several corefiles while preventing the
machine from filling up with them.
2) Encode machine hostname in core dump name via %H.
3) Compress coredumps, useful for embedded platforms with limited space.
A sysctl kern.compress_user_cores is made available if turned on.
To enable compressed coredumps, the following config options need to be set:
options COMPRESS_USER_CORES
device zlib # brings in the zlib requirements.
device gzio # brings in the kernel vnode gzip output module.
4) Eventhandlers are fired to indicate coredumps in progress.
5) The imgact sv_coredump routine has grown a flag to pass in more
state. Currently this is used only to pass down whether to compress
the coredump.
Note that the gzio facility can be used for generic output of gzip'd
streams via vnodes.
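
The %N, %H and %I formatters described above can be modelled in
userland roughly as follows. This is a sketch under stated assumptions,
not the kernel routine: the helper names expand_corename and
next_corename are hypothetical, unknown formatters are simply passed
through, and what the kernel does once num_cores is reached is not
modelled (the sketch just returns None).

```python
def expand_corename(template, name, hostname, index):
    """Expand %N, %H and %I in a core file name template (sketch only)."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template):
            code = template[i + 1]
            if code == "N":
                out.append(name)          # process name
            elif code == "H":
                out.append(hostname)      # machine hostname
            elif code == "I":
                out.append(str(index))    # per-process core index
            else:
                out.append("%" + code)    # assumption: leave others untouched
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def next_corename(template, name, hostname, num_cores, existing):
    """Pick the first unused indexed name, or None once the limit is hit."""
    for index in range(num_cores):
        candidate = expand_corename(template, name, hostname, index)
        if candidate not in existing:
            return candidate
    return None
```

With the template %N.%I.core and a process named "rpd", the first call
yields "rpd.0.core" and, once that file exists, the next yields
"rpd.1.core", matching the example in item 1.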
yongari [Tue, 2 Mar 2010 01:45:02 +0000 (01:45 +0000)]
Remove taskqueue based interrupt handling. After r204541 msk(4)
no longer generates excessive interrupts, so we don't need
two copies of the interrupt handler.
While I'm here, remove two STAT_PUT_IDX register accesses in the LE
status event handler. After r204539 msk(4) always syncs status LEs,
so there is no need to read the STAT_PUT_IDX register to
find the end of status LE processing. Just trust the status LE's
ownership bit.
yongari [Mon, 1 Mar 2010 23:39:43 +0000 (23:39 +0000)]
Implement rudimentary interrupt moderation with the programmable
countdown timer register. The timer resolution may vary among
controllers, but the value is expressed in core clock
cycles. msk(4) automatically computes the required number of clock
cycles from the given value in microseconds.
The default interrupt holdoff timer value is 100us, which
ensures fewer than 10k interrupts per second under load. The timer
value can be changed with the dev.mskc.0.int_holdoff sysctl node.
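
The arithmetic behind the holdoff timer can be sketched as follows. The
function names and the 125 MHz core clock used in the usage note are
assumptions for illustration only; the driver's actual clock values and
register layout are not shown here.

```python
def holdoff_cycles(usec, core_clock_mhz):
    """Convert a holdoff time in microseconds to core clock cycles.

    A clock of N MHz ticks N times per microsecond, so the product is
    the number of cycles the countdown timer has to be loaded with.
    """
    return usec * core_clock_mhz


def max_interrupt_rate(usec):
    """Upper bound on interrupts per second for a given holdoff time."""
    return 1_000_000 // usec
```

For example, with an assumed 125 MHz core clock, holdoff_cycles(100, 125)
gives 12500 cycles, and max_interrupt_rate(100) gives the 10k per second
ceiling mentioned above.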
Note that interrupt moderation is a shared resource on dual-port
controllers, so you can't use a separate interrupt moderation value
for each port. This means we can't stop interrupt moderation in the
driver stop routine. Also have msk_tick() reclaim transmitted Tx
buffers as a safety belt. With this change there is no need to check
for a missing Tx completion interrupt in the watchdog handler, so
remove that check.
yongari [Mon, 1 Mar 2010 22:55:35 +0000 (22:55 +0000)]
Make sure to enable flow-control only if the established link is
full-duplex. Previously msk(4) allowed flow-control on
1000baseT half-duplex media. Also, GMAC pause is enabled if the link
partner is capable of handling it.
While I'm here, use IFM_OPTIONS instead of IFM_GMASK to check
the optional flags of the link.
jhb [Mon, 1 Mar 2010 13:56:15 +0000 (13:56 +0000)]
Print the contents of the miscellaneous (MISC) register to the console if
it is valid along with the other register values when a machine check is
encountered.
rwatson [Mon, 1 Mar 2010 00:46:45 +0000 (00:46 +0000)]
Teach netstat -Q to work with -N and -M by adding libkvm versions of data
query routines. This code is necessarily more fragile in the presence of
kernel changes than querying the kernel via sysctl (the default), but
useful when investigating crashes or live kernel state via firewire.
rwatson [Mon, 1 Mar 2010 00:42:36 +0000 (00:42 +0000)]
Changes to support crashdump analysis of netisr:
- Rename the netisr protocol registration array, 'np' to 'netisr_proto',
in order to reduce the chances of symbol name collisions. It remains
statically defined, but it will be looked up by netstat(1).
- Move certain internal structure definitions from netisr.c to
netisr_internal.h so that netstat(1) can find them. They remain
private, and should not be used for any other purpose (for example,
they should not be used by kernel modules, which must instead use the
public interfaces in netisr.h).
- Store a kernel-compiled version of NETISR_MAXPROT in the global variable
netisr_maxprot, and export via a sysctl, so that it is available for use
by netstat(1). This is especially important for crashdump
interpretation, where the size of the workstream structure is determined
by the maximum number of protocols compiled into the kernel.
rwatson [Mon, 1 Mar 2010 00:27:55 +0000 (00:27 +0000)]
A first cut at teaching libkvm how to deal with dynamic per-CPU storage
(DPCPU):
A new API, kvm_dpcpu_setcpu(3), selects the active CPU for the purposes
of DPCPU. Calls to kvm_nlist(3) will automatically translate DPCPU
symbols and return a pointer to the current CPU's version of the data.
Consumers needing to read the same symbol on several CPUs will invoke a
series of setcpu/nlist calls, one per CPU of interest.
This addition makes it possible for tools like netstat(1) to query the
values of DPCPU variables during crashdump analysis, and is based on
similar code handling virtualized global variables.
MFC after: 1 week
Sponsored by: Juniper Networks, Inc.
This is a split merge because of non-uniform licensing of the DTC package
contents and the way these components will be used in the FreeBSD environment.
The original DTC package is composed of the following two major pieces:
The libfdt component is going to be shared in all aspects of the environment:
- /boot/loader
- kernel
- dtc (the device tree compiler proper, userspace tool)
kib [Sun, 28 Feb 2010 17:10:41 +0000 (17:10 +0000)]
In msdosfs_inactive(), reclaim the vnodes for both SLOT_DELETED and
SLOT_EMPTY deName[0] values. Besides conforming to the FAT specification,
this also fixes the case where vfs_hash_insert found the vnode in the
hash and the newly allocated vnode was vput()ed. There, deName[0] == 0,
and the vnode was not reclaimed but kept on the mountlist indefinitely.
kib [Sun, 28 Feb 2010 17:07:49 +0000 (17:07 +0000)]
Assert that the msdosfs vnode is (e)locked in several places.
The plan is to use the vnode lock to protect the denode and the fat
cache, and to have a separate lock for the block use map.
Change the check and return on an impossible condition into a KASSERT().
kib [Sun, 28 Feb 2010 16:25:49 +0000 (16:25 +0000)]
In both if_tun and if_tap:
Do not do additional dev_ref() on the newly created interface in the
if_clone create method [1]. This reference is not needed and never
removed, causing struct cdevpriv leakage. Remove the setting of
SI_CHEAPCLONE flag as well, since it is unused.
For dev_clone handlers, create cdevs with the call make_dev_credf(MAKEDEV_REF)
instead of calling make_dev() and then dev_ref(), to avoid a race.
Call drain_dev_clone_events() at the module unload time after dev_clone
handler is deinstalled.
jh [Sun, 28 Feb 2010 13:31:29 +0000 (13:31 +0000)]
In _gettemp(), check that the length of the path doesn't exceed
MAXPATHLEN. Otherwise the path name (or part of it) may not fit into
carrybuf, causing a buffer overflow.
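
The kind of bounds check described here can be sketched in userland as
below. This is not the libc code: the helper name
check_template_length is hypothetical, the value 1024 for MAXPATHLEN and
the choice of ENAMETOOLONG as the error are assumptions made for the
illustration.

```python
import errno
import os

MAXPATHLEN = 1024  # assumed value; the real constant comes from <sys/param.h>


def check_template_length(path, maxpathlen=MAXPATHLEN):
    """Reject a template that could not fit in a maxpathlen-sized buffer.

    The buffer also has to hold the terminating NUL, so the longest
    acceptable path is maxpathlen - 1 bytes.
    """
    encoded = path.encode()
    if len(encoded) >= maxpathlen:
        # Assumption: report the problem as ENAMETOOLONG.
        raise OSError(errno.ENAMETOOLONG, os.strerror(errno.ENAMETOOLONG))
    return encoded
```

Checking the length before any copy into the fixed-size buffer is what
prevents the overflow; the copy itself can then stay unchanged.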
rwatson [Sat, 27 Feb 2010 19:57:40 +0000 (19:57 +0000)]
Remove stale comment about socket buffer accounting from access(2) code.
It is the case, however, that the uidinfo of the temporary credential
set up for access(2) is not properly updated when its effective uid is
changed.
marcel [Sat, 27 Feb 2010 18:55:43 +0000 (18:55 +0000)]
Interrupt related cleanups:
o Assign vectors based on priority, because vectors have
implied priority in hardware.
o Use unordered memory accesses to the I/O sapic and use
the acceptance form of the mf instruction.
o Remove the sapicreg.h and sapicvar.h headers. All definitions
in sapicreg.h are private to sapic.c and all definitions in
sapicvar.h are either private or interface functions. Move the
interface functions to intr.h.
o Hide the definition of struct sapic.
alc [Sat, 27 Feb 2010 18:00:57 +0000 (18:00 +0000)]
When running as a guest operating system, the FreeBSD kernel must assume
that the virtual machine monitor has enabled machine check exceptions.
Unfortunately, on AMD Family 10h processors the machine check hardware
has a bug (Erratum 383) that can result in a false machine check exception
when a superpage promotion occurs. Thus, I am disabling superpage
promotion when the FreeBSD kernel is running as a guest operating system
on an AMD Family 10h processor.
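
The decision described above can be sketched as follows. The family
decoding follows the standard CPUID leaf 1 layout (base family in bits
8-11, extended family in bits 20-27, added together when the base family
is 0xF). How the kernel detects that it is running as a guest is not
shown; is_guest is passed in as an assumption, and both function names
are hypothetical.

```python
def cpu_family(signature):
    """Decode the display family from a CPUID leaf 1 EAX signature."""
    base = (signature >> 8) & 0xF
    ext = (signature >> 20) & 0xFF
    # The extended family only counts when the base family is 0xF.
    return base + ext if base == 0xF else base


def allow_superpage_promotion(vendor, signature, is_guest):
    """Return False for the guest-on-AMD-Family-10h case, True otherwise."""
    if is_guest and vendor == "AuthenticAMD" and cpu_family(signature) == 0x10:
        return False
    return True
```

For instance, a signature of 0x00100F42 decodes to family 0x10, so
promotion would be disabled for a guest on such a processor and left
enabled everywhere else.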
kib [Sat, 27 Feb 2010 15:32:49 +0000 (15:32 +0000)]
For kinfo_proc, return in kp->ki_siglist the set of signals pending
in the process queue when gathering information for the process, and the
set of signals pending for the thread when gathering information for a
thread. Previously, the sysctl returned the union of the process pending
set and some arbitrary thread's pending set for the process, and the union
of the process and thread pending sets for the thread.
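
The difference between the old and new behaviour can be modelled with
plain sets. This is only a model of the semantics described in the
message; the function names are hypothetical and signals are represented
by their names rather than by a sigset_t.

```python
def siglist_before(process_pending, thread_pending):
    """Old behaviour: the process set was always merged with a thread set."""
    return set(process_pending) | set(thread_pending)


def siglist_after(process_pending, thread_pending=None):
    """New behaviour: report exactly one pending set, never a union.

    thread_pending is None when the caller asks about the process as a
    whole, and the thread's own pending set otherwise.
    """
    if thread_pending is None:
        return set(process_pending)
    return set(thread_pending)
```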