jhb [Mon, 1 Jun 2009 21:32:52 +0000 (21:32 +0000)]
Add an extension to the character device interface that allows character
device drivers to use arbitrary VM objects to satisfy individual mmap()
requests.
- A new d_mmap_single(cdev, &foff, objsize, &object, prot) callback is
added to cdevsw. This function is called for each mmap() request.
If it returns ENODEV, then the mmap() request will fall back to using
the device's device pager object and d_mmap(). Otherwise, the method
can return a VM object to satisfy this entire mmap() request via
*object. It can also modify the starting offset into this object via
*foff. This allows device drivers to use the file offset as a cookie
to identify specific VM objects.
- vm_mmap_vnode() has been changed to call vm_mmap_cdev() directly when
mapping VCHR vnodes. This avoids duplicating all the cdev mmap
handling code and simplifies some of vm_mmap_vnode().
- D_VERSION has been bumped to D_VERSION_02. Older device drivers
using D_VERSION_01 are still supported.
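
The cookie idea described above can be sketched in plain userland C. This
is only a model under stated assumptions, not kernel code: struct ex_object,
ex_objects[] and ex_mmap_single() are hypothetical names, and the real
callback takes a cdev and a protection argument that are omitted here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a VM object; not the kernel's vm_object. */
struct ex_object {
	uint64_t cookie;	/* file offset that names this object */
	size_t   size;		/* object size in bytes */
};

static struct ex_object ex_objects[] = {
	{ 0x10000, 4096 },
	{ 0x20000, 8192 },
};

/*
 * Model of a d_mmap_single-style callback: treat *foff as a cookie.
 * On a match, hand the object back via *object and reset *foff to 0 so
 * the mapping starts at the beginning of that object.  ENODEV asks the
 * caller to fall back to the device pager and d_mmap().
 */
static int
ex_mmap_single(uint64_t *foff, size_t objsize, struct ex_object **object)
{
	size_t i;

	for (i = 0; i < sizeof(ex_objects) / sizeof(ex_objects[0]); i++) {
		if (ex_objects[i].cookie != *foff)
			continue;
		if (objsize > ex_objects[i].size)
			return (EINVAL);
		*object = &ex_objects[i];
		*foff = 0;
		return (0);
	}
	return (ENODEV);
}
```

A caller would pass the mmap() offset in *foff and, on a zero return, map
*object starting at the (possibly rewritten) offset.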
jhb [Mon, 1 Jun 2009 21:17:03 +0000 (21:17 +0000)]
Rework socket upcalls to close some races with setup/teardown of upcalls.
- Each socket upcall is now invoked with the appropriate socket buffer
locked. It is not permissible to call soisconnected() with this lock
held, however, so socket upcalls now return an integer value. The two
possible values are SU_OK and SU_ISCONNECTED. If an upcall returns
SU_ISCONNECTED, then soisconnected() will be invoked on the
socket after the socket buffer lock is dropped.
- A new API is provided for setting and clearing socket upcalls. The
API consists of soupcall_set() and soupcall_clear().
- To simplify locking, each socket buffer now has a separate upcall.
- When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from
the receive socket buffer automatically. Note that a SO_SND upcall
should never return SU_ISCONNECTED.
- All this means that accept filters should now return SU_ISCONNECTED
instead of calling soisconnected() directly. They also no longer need
to explicitly clear the upcall on the new socket.
- The HTTP accept filter still uses soupcall_set() to manage its internal
state machine, but other accept filters no longer have any explicit
knowledge of socket upcall internals aside from their return value.
- The various RPC client upcalls currently drop the socket buffer lock
while invoking soreceive() as a temporary band-aid. The plan for
the future is to add a new flag to allow soreceive() to be called with
the socket buffer locked.
- The AIO callback for socket I/O is now also invoked with the socket
buffer locked. Previously sowakeup() would drop the socket buffer
lock only to call aio_swake() which immediately re-acquired the socket
buffer lock for the duration of the function call.
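
The return-value protocol described above can be modelled in userland C.
This is a sketch under assumptions, not the kernel implementation: the
ex_ names, the boolean standing in for the socket buffer lock, and the
numeric values of the return codes are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Names mirror the commit message; the values are illustrative only. */
enum { EX_SU_OK = 0, EX_SU_ISCONNECTED = 1 };

struct ex_socket {
	bool sb_locked;		/* models the socket buffer lock */
	bool connected;		/* set by ex_soisconnected() */
	int (*upcall)(struct ex_socket *, void *);
	void *upcall_arg;
	int  bad_calls;		/* times soisconnected ran under the lock */
};

static void
ex_soisconnected(struct ex_socket *so)
{
	if (so->sb_locked)
		so->bad_calls++;	/* would be a locking bug */
	so->connected = true;
}

/* Run the upcall with the lock held; defer soisconnected until unlock. */
static void
ex_sowakeup(struct ex_socket *so)
{
	int ret = EX_SU_OK;

	so->sb_locked = true;
	if (so->upcall != NULL) {
		ret = so->upcall(so, so->upcall_arg);
		if (ret == EX_SU_ISCONNECTED)
			so->upcall = NULL;	/* cleared automatically */
	}
	so->sb_locked = false;
	if (ret == EX_SU_ISCONNECTED)
		ex_soisconnected(so);
}

/* An accept-filter-style upcall: report readiness via the return value. */
static int
ex_accf_upcall(struct ex_socket *so, void *arg)
{
	int *bytes_ready = arg;

	(void)so;
	return (*bytes_ready > 0 ? EX_SU_ISCONNECTED : EX_SU_OK);
}
```

The point of the design is visible in ex_sowakeup(): the upcall never
calls ex_soisconnected() itself, so the call can never happen with the
lock held.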
jhb [Mon, 1 Jun 2009 20:35:39 +0000 (20:35 +0000)]
Add a simple API to manage scatter/gather lists of physical addresses.
Each list describes a logical memory object that is backed by one or more
physical address ranges. To minimize locking, the sglist objects
themselves are immutable once they are shared.
These objects may be used in the future to facilitate I/O requests using
physically-addressed buffers. For the immediate future I plan to use them
to implement a new type of VM object and pager.
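
To illustrate the data structure, here is a minimal userland sketch of a
list of physical address ranges. The ex_ names, the fixed segment limit,
and the merging of adjacent ranges are assumptions made for the example;
they are not taken from the committed API.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_MAXSEGS 8	/* arbitrary limit for the example */

struct ex_seg {
	uint64_t paddr;	/* start of the physical range */
	size_t   len;	/* length of the range in bytes */
};

struct ex_sglist {
	struct ex_seg segs[EX_MAXSEGS];
	size_t nseg;
};

/*
 * Append a physical range.  If it begins exactly where the last segment
 * ends, extend that segment instead of consuming a new slot.
 * Returns 0 on success or -1 if the list is full.
 */
static int
ex_sglist_append(struct ex_sglist *sg, uint64_t paddr, size_t len)
{
	struct ex_seg *last;

	if (len == 0)
		return (0);
	if (sg->nseg > 0) {
		last = &sg->segs[sg->nseg - 1];
		if (last->paddr + last->len == paddr) {
			last->len += len;
			return (0);
		}
	}
	if (sg->nseg == EX_MAXSEGS)
		return (-1);
	sg->segs[sg->nseg].paddr = paddr;
	sg->segs[sg->nseg].len = len;
	sg->nseg++;
	return (0);
}

/* Total number of bytes described by the list. */
static size_t
ex_sglist_length(const struct ex_sglist *sg)
{
	size_t i, total = 0;

	for (i = 0; i < sg->nseg; i++)
		total += sg->segs[i].len;
	return (total);
}
```

Building the list completely before publishing it, and never appending
afterwards, is what would let readers use it without a lock.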
rwatson [Mon, 1 Jun 2009 20:26:51 +0000 (20:26 +0000)]
Add a flags field to struct ucred, and export that via kinfo_proc,
consuming one of its spare fields. The cr_flags field is currently
unused, but will be used for features, including capability mode and
pay-as-you-go audit.
gallatin [Mon, 1 Jun 2009 19:16:57 +0000 (19:16 +0000)]
Set an rx jumbo cluster to the correct size before
using bus_dmamap_load_mbuf_sg() on it. This
prevents data corruption when the mxge MTU is
between 4076 and 8172 on machines with 4KB
pages and MXGE_VIRT_JUMBOS is in use (which it
isn't, in -current or -stable)
joel [Mon, 1 Jun 2009 18:58:46 +0000 (18:58 +0000)]
- Remove obsolete and confusing comment about renaming "sound" to "snd".
We will look at renaming stuff for 9.0, but it's far from certain that we
will do it this way.
- Sort sysctl's alphabetically. I'll add a bunch of new sysctl's once
ariff's next mega-patch goes in, and having everything sorted makes my
job easier.
rwatson [Mon, 1 Jun 2009 18:38:36 +0000 (18:38 +0000)]
Revert a recent netisr2 change: when billing packets to the current
CPU, don't lock the workstream, as its mutexes may not have been
initialized if there are fewer workstreams than CPUs.
imp [Mon, 1 Jun 2009 16:29:03 +0000 (16:29 +0000)]
Move the unlock to after the ifdef (maybe the right fix is to remove
the ifdef) since it calls bwi_start_locked, which expects the lock
to be held...
rwatson [Mon, 1 Jun 2009 16:13:06 +0000 (16:13 +0000)]
Add 'sy_flags', a currently unused per-syscall entry flags field that will
see future use in 9-CURRENT and 8-STABLE for features such as the
capability-mode enable flag and pay-as-you-audit.
Convert the two dimensional array to be malloced and introduce
an accessor function to get the correct rnh pointer back.
Update netstat to get the correct pointer using kvm_read()
as well.
This not only fixes the ABI problem depending on the kernel
option but also permits the tunable to overwrite the kernel
option at boot time up to MAXFIBS, enlarging the number of
FIBs without having to recompile. So people could just use
GENERIC now.
Reviewed by: julian, rwatson, zec
X-MFC: not possible
bms [Mon, 1 Jun 2009 15:30:18 +0000 (15:30 +0000)]
Merge fixes from p4:
* Tighten v1 query input processing.
* Borrow changes from MLDv2 for how general queries are processed.
* Do address field validation upfront before accepting input.
* Do NOT switch protocol version if old querier present timer active.
* Always clear IGMPv3 state in igmp_v3_cancel_link_timers().
* Update comments.
rwatson [Mon, 1 Jun 2009 15:03:58 +0000 (15:03 +0000)]
Garbage collect NETISR_POLL and NETISR_POLLMORE, which are no longer
required for options DEVICE_POLLING.
De-fragment the NETISR_ constant space and lower NETISR_MAXPROT from
32 to 16 -- when sizing queue arrays using this compile-time constant,
significant amounts of memory are saved.
Warn on the console when tunable values for netisr are automatically
adjusted during boot due to exceeding limits, invalid values, or as a
result of DEVICE_POLLING.
mav [Mon, 1 Jun 2009 13:13:47 +0000 (13:13 +0000)]
Comment out old Realtek ALC883 quirk that was disabling phantom power on
mic inputs. I have no idea why it was added at the time, but now I have
several reports that it should be removed to make microphones work. If
this quirk is still required for some systems then they should be identified
and specified explicitly.
rwatson [Mon, 1 Jun 2009 10:41:38 +0000 (10:41 +0000)]
Reimplement the netisr framework in order to support parallel netisr
threads:
- Support up to one netisr thread per CPU, each processing its own
workstream, or set of per-protocol queues. Threads may be bound
to specific CPUs, or allowed to migrate, based on a global policy.
In the future it would be desirable to support topology-centric
policies, such as "one netisr per package".
- Allow each protocol to advertise an ordering policy, which can
currently be one of:
NETISR_POLICY_SOURCE: packets must maintain ordering with respect to
an implicit or explicit source (such as an interface or socket).
NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work,
as well as allowing protocols to provide a flow generation function
for mbufs without flow identifiers (m2flow). Falls back on
NETISR_POLICY_SOURCE if no flow ID is available.
NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for
each packet handled by netisr (m2cpuid).
- Provide utility functions for querying the number of workstreams
being used, as well as a mapping function from workstream to CPU ID,
which protocols may use in work placement decisions.
- Add explicit interfaces to get and set per-protocol queue limits, and
get and clear drop counters, which query data or apply changes across
all workstreams.
- Add a more extensible netisr registration interface, in which
protocols declare 'struct netisr_handler' structures for each
registered NETISR_ type. These include name, handler function,
optional mbuf to flow ID function, optional mbuf to CPU ID function,
queue limit, and ordering policy. Padding is present to allow these
to be expanded in the future. If no queue limit is declared, then
a default is used.
- Queue limits are now per-workstream, and raised from the previous
IFQ_MAXLEN default of 50 to 256.
- All protocols are updated to use the new registration interface, and
with the exception of netnatm, default queue limits. Most protocols
register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use
NETISR_POLICY_FLOW, and will therefore take advantage of driver-
generated flow IDs if present.
- Formalize a non-packet based interface between interface polling and
the netisr, rather than having polling pretend to be two protocols.
Provide two explicit hooks in the netisr worker for start and end
events for runs: netisr_poll() and netisr_pollmore(), as well as a
function, netisr_sched_poll(), to allow the polling code to schedule
netisr execution. DEVICE_POLLING still embeds single-netisr
assumptions in its implementation, so for now if it is compiled into
the kernel, a single and un-bound netisr thread is enforced
regardless of tunable configuration.
In the default configuration, the new netisr implementation maintains
the same basic assumptions as the previous implementation: a single,
un-bound worker thread processes all deferred work, and direct dispatch
is enabled by default wherever possible.
Performance measurement shows a marginal performance improvement over
the old implementation due to the use of batched dequeue.
An rmlock is used to synchronize use and registration/unregistration
using the framework; currently, synchronized use is disabled
(replicating current netisr policy) due to a measurable 3%-6% hit in
ping-pong micro-benchmarking. It will be enabled once further rmlock
optimization has taken place. However, in practice, netisrs are
rarely registered or unregistered at runtime.
A new man page for netisr will follow, but since one doesn't currently
exist, it hasn't been updated.
This change is not appropriate for MFC, although the polling shutdown
handler should be merged to 7-STABLE.
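
The three ordering policies can be illustrated with a small userland
model of work placement. Everything here is hypothetical: the ex_ names,
the packet fields, and in particular the modulo mapping onto workstreams
are assumptions for the sketch, not the committed implementation.

```c
#include <stdint.h>

enum ex_policy { EX_POLICY_SOURCE, EX_POLICY_FLOW, EX_POLICY_CPU };

struct ex_packet {
	int      has_flowid;	/* nonzero if flowid is valid */
	uint32_t flowid;	/* e.g. a driver-generated flow ID */
	uint32_t source_id;	/* e.g. an interface index */
	uint32_t cpu_hint;	/* what an m2cpuid-style hook would pick */
};

/*
 * Pick a workstream in [0, nws) for a packet; nws must be nonzero.
 * The flow policy falls back to the source policy when the packet
 * carries no flow ID.
 */
static unsigned
ex_select_workstream(enum ex_policy policy, const struct ex_packet *p,
    unsigned nws)
{
	if (policy == EX_POLICY_CPU)
		return (p->cpu_hint % nws);
	if (policy == EX_POLICY_FLOW && p->has_flowid)
		return (p->flowid % nws);
	return (p->source_id % nws);
}
```

Packets with the same flow ID (or the same source) always land on the
same workstream, which is what preserves ordering within that flow.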
pjd [Mon, 1 Jun 2009 10:30:00 +0000 (10:30 +0000)]
- Rename IP_NONLOCALOK IP socket option to IP_BINDANY, to be more consistent
with OpenBSD (and BSD/OS originally). We can't easily make it a SOL_SOCKET option
as there is no more space for more SOL_SOCKET options, but this option also
fits better as an IP socket option, it seems.
- Implement this functionality also for IPv6 and RAW IP sockets.
- Always compile it in (don't use additional kernel options).
- Remove sysctl to turn this functionality on and off.
- Introduce a new privilege, PRIV_NETINET_BINDANY, which allows use of this
functionality (currently only unjailed root can use it).
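
A minimal usage sketch for the renamed option follows. The helper name
ex_enable_bindany() is hypothetical; the #ifdef fallback is an addition
so the example also compiles on systems whose headers lack IP_BINDANY.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>

/*
 * Ask the kernel to let socket 's' bind to an address that is not
 * configured locally.  Returns 0 on success or -1 with errno set.
 * The caller is expected to hold the required privilege.
 */
static int
ex_enable_bindany(int s)
{
#ifdef IP_BINDANY
	int on = 1;

	return (setsockopt(s, IPPROTO_IP, IP_BINDANY, &on, sizeof(on)));
#else
	/* Option not provided by this system's headers. */
	(void)s;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}
```

The call would normally be made after socket() and before bind().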
delphij [Mon, 1 Jun 2009 07:05:52 +0000 (07:05 +0000)]
According to Intel documentation (307013), 3Gbps mode is supported on
Desktop chipsets only for ICH7 series, so mark all ICH7M as ATA_SA150
instead of ATA_SA300.
jmallett [Mon, 1 Jun 2009 06:49:09 +0000 (06:49 +0000)]
o) Restructure tcpdrop(8) to provide a facility to try to drop all established
connections, including a flag to instead output a sequence of tcpdrop(8)
invocations that would accomplish the same thing, which is convenient for
scripting.
o) Make tcpdrop complain if the addresses given to it are entirely in different
address families, rather than failing silently.
o) When cross-referencing httpd(8), do not explicitly specify the apache2 port,
since the example in question is generic.
dougb [Mon, 1 Jun 2009 06:31:04 +0000 (06:31 +0000)]
Local hack to get the build going again while ISC works on a more
permanent solution for 9.6.1-release.
"My suggestion is to remove the whole attribute construct.
It only suppresses a warning when a function is unused. In this case
the function is defined as inline, so it's not causing a warning when
not used."
dougb [Mon, 1 Jun 2009 05:37:13 +0000 (05:37 +0000)]
Eliminate the warning that "Values of network_interfaces other than
AUTO are deprecated." There is no good reason to deprecate them, and
setting this to different values can be useful for custom solutions
and/or one-off configuration problems.
dougb [Mon, 1 Jun 2009 05:35:03 +0000 (05:35 +0000)]
Make the pf and ipfw firewalls start before netif, just like ipfilter
already does. This eliminates a logical inconsistency, and a small
window where the system is open after the network comes up.
dougb [Mon, 1 Jun 2009 04:55:13 +0000 (04:55 +0000)]
Substitute ypset for ypbind in REQUIRE lines. If you use ypset it has to
happen right after ypbind, and before anything that uses NIS. The only
change in rcorder accomplished by this patch is to make that happen.
PR: conf/117555
Submitted by: John Marshall <john@rwsrv05.mby.riverwillow.net.au>
rodrigc [Mon, 1 Jun 2009 01:02:30 +0000 (01:02 +0000)]
sys/boot/common.c
=================
Extend the loader to parse the root file system mount options in /etc/fstab,
and set a new loader variable vfs.root.mountfrom.options with these options.
The root mount options must be a comma-delimited string, as specified in
/etc/fstab.
Only set the vfs.root.mountfrom.options variable if it has not been
set in the environment.
sys/kern/vfs_mount.c
====================
When mounting the root file system, pass the mount options
specified in vfs.root.mountfrom.options, but filter out "rw" and "noro",
since the initial mount of the root file system must be done as "ro".
While we are here, try to add a few hints to the mountroot prompt
to give users an idea what might have gone wrong during mounting
of the root file system.
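
The option filtering described for vfs_mount.c can be sketched as a
standalone string routine. This is an illustration only: the function
name ex_filter_root_opts() and its buffer-based interface are
assumptions and do not reflect how the kernel stores mount options.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the comma-delimited options in 'in' to 'out', dropping "rw" and
 * "noro" so the initial root mount stays read-only.
 * Returns 0 on success or -1 if 'out' is too small.
 */
static int
ex_filter_root_opts(const char *in, char *out, size_t outlen)
{
	size_t used = 0, n;
	const char *end;

	if (outlen == 0)
		return (-1);
	out[0] = '\0';
	while (*in != '\0') {
		end = strchr(in, ',');
		n = (end != NULL) ? (size_t)(end - in) : strlen(in);
		if (n > 0 &&
		    !(n == 2 && strncmp(in, "rw", 2) == 0) &&
		    !(n == 4 && strncmp(in, "noro", 4) == 0)) {
			/* Room for an optional comma, the token, and NUL. */
			if (used + (used > 0) + n + 1 > outlen)
				return (-1);
			if (used > 0)
				out[used++] = ',';
			memcpy(out + used, in, n);
			used += n;
			out[used] = '\0';
		}
		in += n;
		if (*in == ',')
			in++;
	}
	return (0);
}
```

For example, the input "rw,noatime,noro,acls" yields "noatime,acls".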
rodrigc [Mon, 1 Jun 2009 00:40:39 +0000 (00:40 +0000)]
Code for parsing nmount options in the kernel was merged
to the stable/7 branch in r190315. So only resort to the fallback_mount()
code, which passes struct nfs_args to the kernel, in kernel versions
less than 702100.
jilles [Sun, 31 May 2009 19:37:06 +0000 (19:37 +0000)]
sh: Make read's timeout (-t) apply to the entire line, not only the first
character.
This avoids using non-standard behaviour of the old (up to FreeBSD 7) TTY
layer: it reprocesses the input queue when switching to canonical mode. The
new TTY layer does not provide this functionality and so read -t worked
very poorly (first character is not echoed, cannot be backspaced but is
still read).
This also agrees with what most other shells with read -t do.
PR: bin/129566
Reviewed by: stefanf
Approved by: ed (mentor)
ed [Sun, 31 May 2009 19:35:41 +0000 (19:35 +0000)]
Restore support for bell pitch/duration.
Because we only support a single argument to tf_param, use 16 bits for
the pitch and 16 bits for the duration. While there, make the argument
unsigned. There isn't a single param call that needs a signed integer.
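
Packing two 16-bit values into one unsigned argument can be sketched as
follows. The helper names are hypothetical, and which half holds the
pitch and which the duration is an assumption made for the example.

```c
#include <stdint.h>

/* Pack pitch into the upper 16 bits and duration into the lower 16. */
static uint32_t
ex_bell_pack(unsigned int pitch, unsigned int duration)
{
	return (((uint32_t)(pitch & 0xffffu) << 16) | (duration & 0xffffu));
}

/* Recover the pitch from a packed parameter. */
static unsigned int
ex_bell_pitch(uint32_t param)
{
	return (param >> 16);
}

/* Recover the duration from a packed parameter. */
static unsigned int
ex_bell_duration(uint32_t param)
{
	return (param & 0xffffu);
}
```

Using an unsigned type keeps the shift of the upper half well defined
for every 16-bit pitch value.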
kib [Sun, 31 May 2009 15:01:50 +0000 (15:01 +0000)]
Unlock the pseudofs vnode before calling fill method for pfs_readlink().
The fill code may need to lock another vnode, e.g. procfs file
implementation.
Reviewed by: des
Tested by: pho
MFC after: 2 weeks
kib [Sun, 31 May 2009 14:58:43 +0000 (14:58 +0000)]
Implement the bypass routine for VOP_VPTOCNP in nullfs.
Among other things, this makes procfs <pid>/file work for executables
started from nullfs mount.
kib [Sun, 31 May 2009 14:57:43 +0000 (14:57 +0000)]
Eliminate code duplication in vn_fullpath1() around the cache lookups
and calls to vn_vptocnp() by moving more of the common code to
vn_vptocnp(). Rename vn_vptocnp() to vn_vptocnp_locked() to signify that
cache is locked around the call.
Do not track buffer position by both the pointer and offset, use only
buflen to record the start of the free space.
Export vn_vptocnp() for external consumers as a wrapper around
vn_vptocnp_locked() that locks the cache and handles hold counts.
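
Tracking the buffer position with a single length can be illustrated by
a userland sketch that assembles a path from its tail. The helper name
ex_prepend_component() is hypothetical and the code is a model of the
idea, not the vn_fullpath1() implementation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Prepend "/name" to the path being assembled at the tail of 'buf'.
 * *buflen is the only cursor: it counts the free bytes at the front of
 * the buffer, so the path built so far starts at buf + *buflen.
 * Returns 0 on success or -1 if the component does not fit.
 */
static int
ex_prepend_component(char *buf, size_t *buflen, const char *name)
{
	size_t n = strlen(name);

	if (*buflen < n + 1)
		return (-1);
	*buflen -= n;
	memcpy(buf + *buflen, name, n);
	*buflen -= 1;
	buf[*buflen] = '/';
	return (0);
}
```

The caller starts with *buflen set to the buffer size minus one, stores
the terminating NUL there, and then prepends components from leaf to
root; no separate pointer into the buffer is needed.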
kib [Sun, 31 May 2009 14:54:20 +0000 (14:54 +0000)]
Do not drop vnode interlock in null_checkvp(). null_lock() verifies that
v_data is not-null before calling NULLVPTOLOWERVP(), and dropping the
interlock allows for reclaim to clean v_data and free the memory.
While there, remove unneeded semicolons and convert the infinite loops
to panics. I would like to remove null_checkvp() altogether, or leave
it as a trivial stub, but not now.
kib [Sun, 31 May 2009 14:52:45 +0000 (14:52 +0000)]
Lock the real null vnode lock before substitution of vp->v_vnlock.
This should not really matter for correctness, since vp->v_lock is
not locked before the call, and null_lock() holds the interlock,
but makes the control flow for reclaim more clear.
deischen [Sun, 31 May 2009 14:48:51 +0000 (14:48 +0000)]
Add a NO_SYNCHRONIZE_CACHE quirk for an AIPTEK2
part identified as Sunplus Technology Inc. This
happens to sit in a Rosewill RX81U-ES-25A 2.5" SATA
to USB 2.0 external enclosure.
stefanf [Sun, 31 May 2009 12:36:14 +0000 (12:36 +0000)]
Fix the eval command in combination with set -e. Before this change the shell
would always terminate if eval returned with a non-zero exit status regardless
of whether the status was actually tested. Unfortunately a new file-scope variable
is needed, the alternative would only be to add a new parameter to all
built-ins.
zec [Sun, 31 May 2009 12:10:04 +0000 (12:10 +0000)]
Introduce an interim userland-kernel API for creating vnets and
assigning ifnets from one vnet to another. Deletion of vnets is not
yet supported.
The interface is implemented as an ioctl extension so that no syscalls
had to be introduced. This should be acceptable given that the new
interface will be used for a short / interim period only, until the
new jail management framework gains the capability of managing vnets.
This method for managing vimages / vnets has been in use for the past
7 years without any observable issues.
The userland tool to be used in conjunction with the interim API can be
found in p4: //depot/projects/vimage-commit2/src/usr.sbin/vimage/... and
will most probably never get committed to svn.
While here, bump copyright notices in kern_vimage.c and vimage.h to
cover work done in year 2009.
rwatson [Sun, 31 May 2009 09:03:14 +0000 (09:03 +0000)]
Upgrade audit(4) from experimental to production status for FreeBSD 8.0.
While there remain some incomplete aspects of the implementation (such
as incomplete auditing of some system calls), the implementation has
been burned in for a few years, as well as in GENERIC for a few years.
nwhitehorn [Sun, 31 May 2009 08:59:15 +0000 (08:59 +0000)]
Provide a new CPU device driver ivar to report the nominal speed of the
CPU, if available. This is meant to solve the issue of cpufreq misreporting
speeds on CPUs that boot in a reduced power mode and have only relative
speed control.
adrian [Sun, 31 May 2009 08:11:39 +0000 (08:11 +0000)]
Fix the MP IPI code to differentiate between bitmapped IPIs and function IPIs.
This attempts to fix the IPI handling code to correctly differentiate
between bitmapped IPIs and function IPIs. The Xen IPIs were on low numbers
which clashed with the bitmapped IPIs.
This commit bumps those IPI numbers up to 240 and above (just like in the i386
code) and fiddles with the ipi_vectors[] logic to call the correct function.
This still isn't "right". Specifically, the IPI code may work fine for TLB
shootdown events but the rendezvous/lazypmap IPIs are thrown by calling ipi_*()
routines which don't set the call_func stuff (function id, addr1, addr2) that
the TLB shootdown events are. So the Xen SMP support is still broken.
adrian [Sun, 31 May 2009 07:25:24 +0000 (07:25 +0000)]
Remove some unused code in ipi_selected().
The code path this was copied from (sys/i386/i386/mp_machdep.c:ipi_selected())
handles bitmap'ed IPIs and normal IPIs via separate notification paths. Xen
SMP handles them the same way.
dougb [Sun, 31 May 2009 05:44:21 +0000 (05:44 +0000)]
Update BIND to version 9.6.1rc1. This version has better performance and
lots of new features compared to 9.4.x, including:
Full NSEC3 support
Automatic zone re-signing
New update-policy methods tcp-self and 6to4-self
DHCID support.
More detailed statistics counters including those supported in BIND 8.
Faster ACL processing.
Efficient LRU cache-cleaning mechanism.
NSID support.
dougb [Sun, 31 May 2009 05:42:58 +0000 (05:42 +0000)]
Update BIND to version 9.6.1rc1. This version has better performance and
lots of new features compared to 9.4.x, including:
Full NSEC3 support
Automatic zone re-signing
New update-policy methods tcp-self and 6to4-self
DHCID support.
More detailed statistics counters including those supported in BIND 8.
Faster ACL processing.
Efficient LRU cache-cleaning mechanism.
NSID support.
dougb [Sat, 30 May 2009 23:50:12 +0000 (23:50 +0000)]
In preparation for the BIND 9.6.1rc1 import, remove this directory.
The libbind library is no longer distributed as part of the main
BIND package, and we never built it in any case.