rwatson [Fri, 22 Oct 2004 12:12:40 +0000 (12:12 +0000)]
Add an annotation to the comment for sysv_ipc.c to indicate that the
MAC Framework doesn't require checks in ipcperm() because checks
relating to System V IPC will be performed in individual IPC
implementations.
rwatson [Fri, 22 Oct 2004 11:29:30 +0000 (11:29 +0000)]
Expand comments on various sections of the MAC Framework Policy API,
as well as document the properties of the mac_policy_conf structure.
Warn about the ABI risks in changing the structure without careful
consideration.
rwatson [Fri, 22 Oct 2004 11:04:58 +0000 (11:04 +0000)]
When MAC is enabled, warn if getnewvnode() is asked to produce a vnode
without a mountpoint. In this scenario, there's no useful source for
a label on the vnode, since we can't query the mountpoint for the
labeling strategy or default label.
phk [Fri, 22 Oct 2004 09:59:37 +0000 (09:59 +0000)]
Alas, poor SPECFS! -- I knew him, Horatio; A filesystem of infinite
jest, of most excellent fancy: he hath taught me lessons a thousand
times; and now, how abhorred in my imagination it is! my gorge rises
at it. Here were those hacks that I have curs'd I know not how
oft. Where be your kludges now? your workarounds? your layering
violations, that were wont to set the table on a roar?
Move the skeleton of specfs into devfs where it now belongs and
bury the rest.
marcel [Fri, 22 Oct 2004 04:49:09 +0000 (04:49 +0000)]
Separate ia64 from the pack. The disc1 is overflowing to such an extent
that most packages cannot be included. It's much easier to list those
that we do want on disc1 for ia64. We only need to list 11 of them.
rwatson [Thu, 21 Oct 2004 18:35:24 +0000 (18:35 +0000)]
Add KTR_GEOM, which allows tracing of basic GEOM I/O events occurring
in the g_up and g_down threads. Each time a bio is propelled up and
down the stack, an event is generated showing the provider, offset,
and length, as well as thread wakeup and work status information.
phk [Thu, 21 Oct 2004 14:42:31 +0000 (14:42 +0000)]
Add BO_* macros parallel to VI_* macros for manipulating the bo_mtx.
Initialize the bo_mtx when we allocate a vnode in getnewvnode(). For
now we point to the vnode's interlock mutex, which retains the exact
same locking semantics.
Move v_numoutput from vnode to bufobj. Add renaming macro to
postpone code sweep.
phk [Thu, 21 Oct 2004 12:51:36 +0000 (12:51 +0000)]
Forced commit to get the right commit message:
Add new include file <sys/bufobj.h> which will contain the gory
details on the new buffer-cache object. (see comments in file
about the direction this is moving).
Include it from <sys/vnode.h> for now to avoid munging a lot of files
which can later be munged back.
Embed a bufobj in vnode.
Move the buf splay trees from the vnode to the bufobj.
phk [Thu, 21 Oct 2004 12:24:38 +0000 (12:24 +0000)]
Add new function ttyinitmode() which sets our systemwide default
modes on a tty structure. Both the ".init" and the current settings
are initialized allowing the function to be used both at attach and
open time.
The function takes an argument to decide if echoing should be enabled
by default. Echoing should not be enabled for regular physical
serial ports unless they are consoles, in which case they should
be configured by ttyconsolemode() instead.
rwatson [Thu, 21 Oct 2004 11:21:13 +0000 (11:21 +0000)]
Modify libugidfw(3) to use MBI_* permission flags from mac_bsdextended.h
instead of using the V* permission flags from vnode.h. Remove include
of vnode.h.
rwatson [Thu, 21 Oct 2004 11:19:02 +0000 (11:19 +0000)]
Modify mac_bsdextended policy so that it defines its own vnode access
right bits rather than piggy-backing on the V* rights defined in
vnode.h. The mac_bsdextended bits are given the same values as the V*
bits to make the new kernel module binary compatible with the old
version of libugidfw that uses V* bits. This avoids leaking kernel
API/ABI to user management tools, and in particular should remove the
need for libugidfw to include vnode.h.
alc [Wed, 20 Oct 2004 17:44:40 +0000 (17:44 +0000)]
Modify the vm object locking in do_sendfile() so that the containing object
is locked when vm_page_io_finish() is called on a page. This is to satisfy
a new, post-RELENG_5 assertion in vm_page_io_finish(). (I am in the
process of transitioning the responsibility for synchronizing access to
various fields/flags on the page from the global page queues lock to the
per-object lock.)
keramida [Wed, 20 Oct 2004 16:58:28 +0000 (16:58 +0000)]
Introduce root_rw_mount as a new variable in defaults/rc.conf to
unbreak /etc/rc.d/root for diskless systems that get their root
filesystem from a read-only NFS mount.
rwatson [Wed, 20 Oct 2004 08:05:02 +0000 (08:05 +0000)]
Explicitly break out NETA license from Berkeley license to clearly
indicate license grant, as well as to indicate that NETA is asserting
only two clauses, not four clauses.
mux [Tue, 19 Oct 2004 23:31:44 +0000 (23:31 +0000)]
Add missing bus_dmamap_sync() calls. If you are using an architecture
with a weak memory model or x86 + PAE (or more specifically, your
driver is using bounce pages) and you have had problems with em(4),
this may fix it. At least this is needed to have em(4) work properly
on FreeBSD/arm.
Original version by: cognet
Reviewed by: tackerman
Tested by: cognet
andre [Tue, 19 Oct 2004 22:08:13 +0000 (22:08 +0000)]
Slightly extend the locking during unload to fully cover the protocol
deregistration. This does not entirely close the race but further narrows
the already extremely small window for it.
rwatson [Tue, 19 Oct 2004 21:35:42 +0000 (21:35 +0000)]
Annotate a newly introduced race present due to the unloading of
protocols: it is possible for sockets to be created and attached
to the divert protocol between the test for sockets present and
successful unload of the registration handler. We will need to
explore more mature APIs for unregistering the protocol and then
draining consumers, or an atomic test-and-unregister mechanism.
andre [Tue, 19 Oct 2004 21:14:57 +0000 (21:14 +0000)]
Convert IPDIVERT into a loadable module. This makes use of the dynamic loadability
of protocols. The call to divert_packet() is done through a function pointer. All
semantics of IPDIVERT remain intact. If IPDIVERT is not loaded ipfw will refuse to
install divert rules and natd will complain about 'protocol not supported'. Once
it is loaded both will work and accept rules and open the divert socket. The module
can only be unloaded if no divert sockets are open. It does not close any divert
sockets when an unload is requested but will return EBUSY instead.
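
The load/unload behaviour described above can be sketched in userland C.
This is only an illustration under stated assumptions: divert_hook,
open_divert_sockets, struct pkt and the choice of EPROTONOSUPPORT as the
"not loaded" error are hypothetical names and values, not the kernel's.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet; not the kernel's struct mbuf. */
struct pkt { int len; };

/* Hook the core code calls; NULL while the "module" is not loaded. */
static void (*divert_hook)(struct pkt *) = NULL;
static int open_divert_sockets = 0;     /* hypothetical bookkeeping */
static int diverted = 0;                /* bytes handed to the module */

static void divert_packet_impl(struct pkt *p) { diverted += p->len; }

/* Module load: publish the function pointer. */
static int divert_load(void)
{
    divert_hook = divert_packet_impl;
    return 0;
}

/* Module unload: refuse with EBUSY while divert sockets are open. */
static int divert_unload(void)
{
    if (open_divert_sockets > 0)
        return EBUSY;
    divert_hook = NULL;
    return 0;
}

/* Rule installation is refused while the hook is unset. */
static int install_divert_rule(void)
{
    return (divert_hook != NULL) ? 0 : EPROTONOSUPPORT;
}

/* Packet path: call through the pointer only if it is set. */
static int deliver(struct pkt *p)
{
    assert(p != NULL);
    if (divert_hook == NULL)
        return EPROTONOSUPPORT;
    divert_hook(p);
    return 0;
}
```

Before divert_load() both install_divert_rule() and deliver() fail; after
it they succeed, and divert_unload() keeps returning EBUSY until the
socket count drops to zero.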
andre [Tue, 19 Oct 2004 20:59:01 +0000 (20:59 +0000)]
Pre-emptively define IPPROTO_SPACER to 32767, the same value as PROTO_SPACER
to document that this value is globally assigned for a special purpose and
may not be reused within the IPPROTO number space.
gibbs [Tue, 19 Oct 2004 20:48:06 +0000 (20:48 +0000)]
aic7xxx.h:
Add constants for SPI protocol delays that are needed for
target mode.
aic7xxx.c:
Correct a target mode issue that caused an occasional
spurious REQ to be seen on the bus when performing manual
message processing (e.g. transfer rate negotiation).
Enforce phase change bus settle rules with explicit
delays when performing manual message processing in
target mode. The sequencer already did this for
"fast-path", target mode message processing.
ru [Tue, 19 Oct 2004 20:38:49 +0000 (20:38 +0000)]
- Removed the .CURDIR/.OBJDIR magic, it is not necessary here.
- Let the built-in sys.mk rule produce the "yearistype" script.
- Install zone files with mode 444 (now that the -m option of
zic(8) has been fixed).
rwatson [Tue, 19 Oct 2004 18:11:55 +0000 (18:11 +0000)]
Define IFF_LOCKGIANT() and IFF_UNLOCKGIANT() macros, which conditionally
acquire Giant if the passed interface has IFF_NEEDSGIANT set on it.
Modify calls into (ifp)->if_ioctl() in if.c to use these macros in order
to ensure that Giant is held.
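
A minimal userland sketch of the conditional-lock pattern described above.
The macro names come from the commit message; everything else is an
assumption for illustration: a counter stands in for the Giant mutex, the
flag value is arbitrary, and struct demo_ifnet, ifp_ioctl() and
demo_ioctl() are hypothetical.

```c
#include <assert.h>

/* Illustrative flag value; not the kernel's definition. */
#define IFF_NEEDSGIANT 0x1

struct demo_ifnet {
    int if_flags;
    int (*if_ioctl)(struct demo_ifnet *, unsigned long, void *);
};

/* A counter models Giant: > 0 means "held". */
static int giant_depth = 0;
static void giant_lock(void)   { giant_depth++; }
static void giant_unlock(void) { assert(giant_depth > 0); giant_depth--; }

/* Acquire Giant only for interfaces that asked for it. */
#define IFF_LOCKGIANT(ifp) do {                 \
    if ((ifp)->if_flags & IFF_NEEDSGIANT)       \
        giant_lock();                           \
} while (0)

#define IFF_UNLOCKGIANT(ifp) do {               \
    if ((ifp)->if_flags & IFF_NEEDSGIANT)       \
        giant_unlock();                         \
} while (0)

/* Call site pattern: bracket the driver ioctl with the macros. */
static int ifp_ioctl(struct demo_ifnet *ifp, unsigned long cmd, void *data)
{
    int error;

    IFF_LOCKGIANT(ifp);
    error = ifp->if_ioctl(ifp, cmd, data);
    IFF_UNLOCKGIANT(ifp);
    return error;
}

/* Example handler: reports the lock depth it observed during the call. */
static int demo_ioctl(struct demo_ifnet *ifp, unsigned long cmd, void *data)
{
    (void)ifp;
    (void)cmd;
    *(int *)data = giant_depth;
    return 0;
}
```

The do { } while (0) wrapper keeps each macro a single statement, so it is
safe inside an unbraced if/else at the call site.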
obrien [Tue, 19 Oct 2004 17:39:15 +0000 (17:39 +0000)]
Size matters. Correctly use a size_t so 64-bit hosts can mount SMB FS's
when using character set conversions.
Also include POSIX <string.h> vs. BSD <strings.h> now that we've broken
traditional BSD behavior [and compatibility with our BSD brethren].
PR: 72445
Submitted by: Vladimir Nechitailo <nechit@lpi.ru>
Patch by: Stasys Smailys <ssmailys@komvista.lt>
obrien [Tue, 19 Oct 2004 17:25:33 +0000 (17:25 +0000)]
Define "I386_CPU" if CPUTYPE is 'i386'. Userland bits can check for "I386_CPU"
to determine if they should select code paths suitable for the 80386 CPU.
andre [Tue, 19 Oct 2004 15:45:57 +0000 (15:45 +0000)]
Support for dynamically loadable and unloadable IP protocols in the ipmux.
With pf_proto_register() it has become possible to dynamically load protocols
within the PF_INET domain. However the PF_INET domain has a second important
structure called ip_protox[] that is derived from the 'struct protosw inetsw[]'
and takes care of the de-multiplexing of the various protocols that ride on
top of IP packets.
The functions ipproto_[un]register() allow the ip_protox[] array mux to be
adjusted dynamically in a consistent and easy way. To register a protocol
within ip_protox[] the existence of a corresponding and matching protocol
definition in inetsw[] is required. The function does not allow an already
registered protocol to be overwritten. The unregister function simply replaces
the mux slot with the default index pointer to IPPROTO_RAW as it was previously.
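
The mux rules above (a matching switch entry is required, no overwriting,
unregister restores the raw default) can be sketched as follows. This is a
userland illustration, not the kernel code: the table contents, the _demo
names, RAW_INDEX and the specific errno values returned are assumptions.

```c
#include <assert.h>
#include <errno.h>

#define NPROTO    256   /* one mux slot per 8-bit IP protocol number */
#define NSW       4     /* size of the illustrative switch table */
#define PROTO_RAW 255   /* stands in for IPPROTO_RAW */
#define RAW_INDEX 0     /* index of the raw entry in inetsw_demo[] */

struct sw_entry { int pr_protocol; };

/* Illustrative stand-in for inetsw[]: slot 0 is the raw/default entry. */
static struct sw_entry inetsw_demo[NSW] = {
    { PROTO_RAW }, { 6 }, { 17 }, { 254 }
};

/* Protocol number -> index into inetsw_demo[]. */
static unsigned char protox_demo[NPROTO];

/* Point every mux slot at the raw entry. */
static void protox_init(void)
{
    int i;

    assert(RAW_INDEX < NSW);
    for (i = 0; i < NPROTO; i++)
        protox_demo[i] = RAW_INDEX;
}

static int ipproto_register_demo(int proto)
{
    int i;

    if (proto <= 0 || proto >= NPROTO)
        return EPROTONOSUPPORT;
    if (protox_demo[proto] != RAW_INDEX)
        return EEXIST;                  /* never overwrite a registration */
    for (i = 0; i < NSW; i++) {
        if (inetsw_demo[i].pr_protocol == proto) {
            protox_demo[proto] = (unsigned char)i;
            return 0;
        }
    }
    return EPROTONOSUPPORT;             /* no matching switch entry */
}

static int ipproto_unregister_demo(int proto)
{
    if (proto <= 0 || proto >= NPROTO)
        return EPROTONOSUPPORT;
    if (protox_demo[proto] == RAW_INDEX)
        return ENOENT;                  /* nothing was registered */
    protox_demo[proto] = RAW_INDEX;     /* restore the default */
    return 0;
}
```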
bms [Tue, 19 Oct 2004 15:30:47 +0000 (15:30 +0000)]
Detach the Rhine completely on shutdown, rather than merely stopping it
as the original logic did. This fixes a race with vr_intr() which was
masked on UP systems and manifested on SMP systems.
andre [Tue, 19 Oct 2004 15:13:30 +0000 (15:13 +0000)]
Support for dynamically loadable and unloadable protocols within existing protocol
families.
The protosw[] array of any particular protocol family ("domain") is of fixed size
defined at compile time. This made it impossible to dynamically add or remove any
protocols to or from it. We work around this by introducing so-called SPACERs
which are embedded into the protosw[] array at compile time. The SPACERs have
a special protocol number (32767) to indicate the fact that they are SPACERs but
are otherwise NULL. Only as many protocols can be dynamically loaded as SPACERs
are provided in the protosw[] structure.
The pr_usrreqs structure is treated specially and contains pointers to dummy
functions that only return EOPNOTSUPP. This is needed because the use of those
function pointers is usually not checked within the kernel, since until now each
was assumed to be a valid function pointer. Instead of fixing all potential
callers we just return a proper error code.
Two new functions provide a clean API to register and unregister a protocol. The
register function expects a pointer to a valid and complete struct protosw including
a pointer to struct pr_usrreqs provided by the caller. Upon successful registration
the pr_init() function will be called to finish initialization of the protocol. The
unregister function restores the SPACER in place of the protocol again. It is the
responsibility of the caller to ensure proper closing of all sockets and freeing
of memory allocated by the unloading protocol.
sys/protosw.h
o Define generic PROTO_SPACER to be 32767
o Prototypes for all pru_*_notsupp() functions
o Prototypes for pf_proto_[un]register() functions
kern/uipc_domain.c
o Global struct pr_usrreqs nousrreqs containing valid pointers to the
pru_*_notsupp() functions
o New functions pf_proto_[un]register()
kern/uipc_socket2.c
o New function bodies for all pru_*_notsupp() functions
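
The SPACER scheme described in this commit can be sketched in userland C.
Only the value 32767 and the EOPNOTSUPP behaviour are taken from the
message above; the _demo types, the table contents, the function names and
the errno values chosen for failures are assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PROTO_SPACER 32767  /* value given in the commit message */
#define NSLOTS 4

struct usrreqs_demo { int (*pru_attach)(void); };

struct protosw_demo {
    int pr_protocol;
    void (*pr_init)(void);
    struct usrreqs_demo *pr_usrreqs;
};

/* Dummy request handler: spacers answer every request with EOPNOTSUPP. */
static int pru_attach_notsupp_demo(void) { return EOPNOTSUPP; }
static struct usrreqs_demo nousrreqs_demo = { pru_attach_notsupp_demo };

static int demo_attach(void) { return 0; }
static struct usrreqs_demo builtin_usrreqs = { demo_attach };

static int init_calls = 0;
static void demo_init(void) { init_calls++; }

/* Fixed-size table: two built-in protocols followed by two spacers. */
static struct protosw_demo sw[NSLOTS] = {
    { 6,            NULL, &builtin_usrreqs },
    { 17,           NULL, &builtin_usrreqs },
    { PROTO_SPACER, NULL, &nousrreqs_demo },
    { PROTO_SPACER, NULL, &nousrreqs_demo },
};

/* Copy the new protocol over the first free spacer, then run pr_init. */
static int proto_register_demo(const struct protosw_demo *npr)
{
    struct protosw_demo *slot = NULL;
    int i;

    if (npr == NULL || npr->pr_usrreqs == NULL ||
        npr->pr_protocol == PROTO_SPACER)
        return EINVAL;
    for (i = 0; i < NSLOTS; i++) {
        if (sw[i].pr_protocol == npr->pr_protocol)
            return EEXIST;
        if (slot == NULL && sw[i].pr_protocol == PROTO_SPACER)
            slot = &sw[i];
    }
    if (slot == NULL)
        return ENOMEM;              /* every spacer is already in use */
    *slot = *npr;
    assert(slot->pr_usrreqs != NULL);
    if (slot->pr_init != NULL)
        slot->pr_init();
    return 0;
}

/* Put the spacer back; closing sockets is left to the caller. */
static int proto_unregister_demo(int proto)
{
    int i;

    if (proto == PROTO_SPACER)
        return EINVAL;
    for (i = 0; i < NSLOTS; i++) {
        if (sw[i].pr_protocol == proto) {
            sw[i].pr_protocol = PROTO_SPACER;
            sw[i].pr_init = NULL;
            sw[i].pr_usrreqs = &nousrreqs_demo;
            return 0;
        }
    }
    return ENOENT;
}
```

Because a spacer always carries valid (if useless) handler pointers,
callers that never check for NULL get an error code back instead of
dereferencing a null pointer.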
le [Tue, 19 Oct 2004 10:29:00 +0000 (10:29 +0000)]
Return the unit number of a channel instead of a hardcoded '1' from
the ATA pccard locking function. This makes pccard devices like
Compact Flash cards work again.
PR: kern/72805
Submitted by: James E. Flemer <jflemer@alum.rpi.edu>
MFC in: 2 days
scottl [Tue, 19 Oct 2004 02:44:38 +0000 (02:44 +0000)]
Forced commit to note that the previous change also eliminates calls to
bus_dmamap_create|destroy for the rx and tx descriptor buffers. Since these
buffers are created with bus_dmamem_alloc(), there is no reason to also
create a map, and doing so just wastes memory.
scottl [Tue, 19 Oct 2004 02:42:49 +0000 (02:42 +0000)]
Use an alignment of 1 instead of ETHER_ALIGN for rx and tx buffers and jumbo
frames. BGE hardware with the rx alignment bug will still be handled by the
calls to m_adj() that already exist. m_adj() is probably better suited for
this task anyways. Just as with if_em, this saves a malloc + several locks
per packet and prevents unneeded data copying within busdma.
scottl [Tue, 19 Oct 2004 02:39:27 +0000 (02:39 +0000)]
Use an alignment of 1 instead of PAGE_SIZE for the rx and tx buffer tags.
Since the e1000 DMA engines have no constraints on the alignment of buffer
transfers, there is no reason to tell busdma that there are. This saves a
minimum of 1 malloc call per packet, which translates to eliminating 4 locks.
It also means that buffers are not needlessly bounced when transferred. The
end result is a 38% improvement in pps in a 4 way bridging environment.
thomas [Mon, 18 Oct 2004 23:40:13 +0000 (23:40 +0000)]
When dumpdev is set to 'auto', and a suitable swap device is found,
create a symbolic link /dev/dumpdev designating that device so
savecore can find and save a previous kernel dump.
jmg [Mon, 18 Oct 2004 23:06:12 +0000 (23:06 +0000)]
fix (for me) the problems where if_de gets really slow after time
(usually taking 20 seconds to transmit a packet).. no longer fall back
to only transmitting one packet (instead of the entire queue) after we
have processed the entire send queue... I have no idea why we didn't
start seeing this problem ~6 years ago when this code was introduced...
rwatson [Mon, 18 Oct 2004 22:19:43 +0000 (22:19 +0000)]
Push acquisition of the accept mutex out of sofree() into the caller
(sorele()/sotryfree()):
- This permits the caller to acquire the accept mutex before the socket
mutex, avoiding sofree() having to drop the socket mutex and re-order,
which could lead to races permitting more than one thread to enter
sofree() after a socket is ready to be free'd.
- This also covers clearing of the so_pcb weak socket reference from
the protocol to the socket, preventing races in clearing and
evaluation of the reference such that sofree() might be called more
than once on the same socket.
This appears to close a race I was able to easily trigger by repeatedly
opening and resetting TCP connections to a host, in which the
tcp_close() code called as a result of the RST raced with the close()
of the accepted socket in the user process resulting in simultaneous
attempts to de-allocate the same socket. The new locking increases
the overhead for operations that may potentially free the socket, so we
will want to revise the synchronization strategy here as we normalize
the reference counting model for sockets. The use of the accept mutex
in freeing of sockets that are not listen sockets is primarily
motivated by the potential need to remove the socket from the
incomplete connection queue on its parent (listen) socket, so cleaning
up the reference model here may allow us to substantially weaken the
synchronization requirements.
RELENG_5_3 candidate.
MFC after: 3 days
Reviewed by: dwhite
Discussed with: gnn, dwhite, green
Reported by: Marc UBM Bocklet <ubm at u-boot-man dot de>
Reported by: Vlad <marchenko at gmail dot com>
phk [Mon, 18 Oct 2004 21:51:27 +0000 (21:51 +0000)]
Add new function ttyinitmode() which sets our systemwide default
modes on a tty structure.
Both the ".init" and the current settings are initialized allowing
the function to be used both at attach and open time.
The function takes an argument to decide if echoing should be enabled.
Echoing should not be enabled for regular physical serial ports
unless they are consoles, in which case they should be configured
by ttyconsolemode() instead.
ru [Mon, 18 Oct 2004 21:42:15 +0000 (21:42 +0000)]
Utilize FILES, SCRIPTS, and SYMLINKS. While here, fixed a bug in
the implementation of the following feature in revision 1.4:
- Install Makefile.yp as /var/yp/Makefile.dist and link it to
/var/yp/Makefile only if /var/yp/Makefile doesn't already exist.
Suggested by Peter Wemm.
The actual code was only symlinking when no /var/yp/Makefile.dist
existed, i.e., never.
glebius [Mon, 18 Oct 2004 20:13:57 +0000 (20:13 +0000)]
Major overhaul.
List of functional changes:
- Make a single device per single node with a single hook.
This gives us parallelism, which can't be achieved on a single
node with many devices/hooks. This also gives us flexibility - we
can play with a particular device node, not affecting others.
- Remove read queue as it is. Use struct ifqueue instead. This change
removes a lot of extra memcpy()ing, m_devget()ting and m_copymem()ming.
In ng_device_receivedata() we enqueue an mbuf and wake readers.
In ngdread() we take one mbuf from the queue and uiomove() it to
userspace. If no mbuf is present we optionally block. [1]
- In ngdwrite() we create an mbuf from uio using m_uiotombuf().
This is faster than uiomove() into a buffer, and then m_copydata(),
and this is much better than huge m_pullup().
- Perform locking of device
- Perform locking of connection list.
- Clear out _rcvmsg method, since it does nothing good yet.
- Implement NGM_DEVICE_GET_DEVNAME message.
- #if 0 the ioctl method, since nothing is done here yet.
- Return immediately from ngdwrite() if uio_resid == 0.
List of tidiness changes:
- Introduce device2priv(), to remove cut'n'paste.
- Use MALLOC/FREE, instead of malloc/free.
- Use unit2minor().
- Use UID_ROOT/GID_WHEEL instead of 0/0.
- Define NGD_DEVICE_DEVNAME, use it.
- Use nicer macros for debugging. [2]
- Return Exxx, not -1.
style(9) changes:
- No "#endif" after short block.
- Break long lines.
- Remove extra spaces, add needed spaces.
rwatson [Mon, 18 Oct 2004 19:29:13 +0000 (19:29 +0000)]
Annotate that get_cyclecount() can be expensive on some platforms,
which juxtaposes nicely with the comment just above on how the
harvest function must be cheap.