1 .\" Copyright (c) 1983, 1986, 1993
2 .\" The Regents of the University of California. All rights reserved.
4 .\" Redistribution and use in source and binary forms, with or without
5 .\" modification, are permitted provided that the following conditions
7 .\" 1. Redistributions of source code must retain the above copyright
8 .\" notice, this list of conditions and the following disclaimer.
9 .\" 2. Redistributions in binary form must reproduce the above copyright
10 .\" notice, this list of conditions and the following disclaimer in the
11 .\" documentation and/or other materials provided with the distribution.
12 .\" 3. Neither the name of the University nor the names of its contributors
13 .\" may be used to endorse or promote products derived from this software
14 .\" without specific prior written permission.
16 .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
17 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
18 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
19 .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
20 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
21 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
22 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
23 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
24 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
25 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
28 .\" @(#)6.t 8.1 (Berkeley) 6/8/93
31 .\".ds RH "Internal layering
35 \s+2Internal layering\s0
The internal structure of the network system is divided into
three layers. These
layers correspond to the services provided by the socket
abstraction, those provided by the communication protocols,
and those provided by the hardware interfaces. The communication
protocols are normally layered into two or more individual
cooperating layers, though they are collectively viewed
in the system as one layer providing services supportive
of the appropriate socket abstraction.
47 The following sections describe the properties of each layer
48 in the system and the interfaces to which each must conform.
52 The socket layer deals with the interprocess communication
53 facilities provided by the system. A socket is a bidirectional
54 endpoint of communication which is ``typed'' by the semantics
55 of communication it supports. The system calls described in
56 the \fIBerkeley Software Architecture Manual\fP [Joy86]
57 are used to manipulate sockets.
59 A socket consists of the following data structure:
struct socket {
	short	so_type;		/* generic type */
	short	so_options;		/* from socket call */
	short	so_linger;		/* time to linger while closing */
	short	so_state;		/* internal state flags */
	caddr_t	so_pcb;			/* protocol control block */
	struct	protosw *so_proto;	/* protocol handle */
	struct	socket *so_head;	/* back pointer to accept socket */
	struct	socket *so_q0;		/* queue of partial connections */
	short	so_q0len;		/* partials on so_q0 */
	struct	socket *so_q;		/* queue of incoming connections */
	short	so_qlen;		/* number of connections on so_q */
	short	so_qlimit;		/* max number queued connections */
	struct	sockbuf so_rcv;		/* receive queue */
	struct	sockbuf so_snd;		/* send queue */
	short	so_timeo;		/* connection timeout */
	u_short	so_error;		/* error affecting connection */
	u_short	so_oobmark;		/* chars to oob mark */
	short	so_pgrp;		/* pgrp for signals */
};
Each socket contains two data queues, \fIso_rcv\fP and \fIso_snd\fP,
and a pointer to routines which provide supporting services.
The type of the socket,
\fIso_type\fP, is defined at socket creation time and used in selecting
those services which are appropriate to support it. The supporting
protocol is selected at socket creation time and recorded in
the socket data structure for later use. Protocols are defined
by a table of procedures, the \fIprotosw\fP structure, which will
be described in detail later. A pointer to a protocol-specific
data structure,
the ``protocol control block,'' is also present in the socket structure.
Protocols control this data structure, which normally includes a
back pointer to the parent socket structure to allow easy
lookup when returning information to a user
(for example, placing an error number in the \fIso_error\fP
field). The other entries in the socket structure are used in
queuing connection requests, validating user requests, storing
socket characteristics (e.g.
options supplied at the time a socket is created), and maintaining
a socket's state.
Processes ``rendezvous at a socket'' in many instances. For instance,
when a process wishes to extract data from a socket's receive queue
and it is empty, or lacks sufficient data to satisfy the request,
the process blocks, supplying the address of the receive queue as
a ``wait channel'' to be used in notification. When data arrives
for the process and is placed in the socket's queue, the blocked
process is identified by the fact that it is waiting ``on the queue.''
115 A socket's state is defined from the following:
.ta \w'#define 'u +\w'SS_ISDISCONNECTING 'u +\w'0x000 'u
#define	SS_NOFDREF	0x001	/* no file table ref any more */
#define	SS_ISCONNECTED	0x002	/* socket connected to a peer */
#define	SS_ISCONNECTING	0x004	/* in process of connecting to peer */
#define	SS_ISDISCONNECTING	0x008	/* in process of disconnecting */
#define	SS_CANTSENDMORE	0x010	/* can't send more data to peer */
#define	SS_CANTRCVMORE	0x020	/* can't receive more data from peer */
#define	SS_RCVATMARK	0x040	/* at mark on input */
#define	SS_PRIV	0x080	/* privileged */
#define	SS_NBIO	0x100	/* non-blocking ops */
#define	SS_ASYNC	0x200	/* async i/o notify */
The state of a socket is manipulated both by the protocols
and the user (through system calls).
When a socket is created, the state is defined based on the type of socket.
It may change as control actions are performed, for example connection
establishment.
It may also change according to the type of
input/output the user wishes to perform, as indicated by options
set with \fIfcntl\fP. ``Non-blocking'' I/O implies that
a process should never be blocked to await resources. Instead, any
call which would block returns prematurely
with the error EWOULDBLOCK, or the service request may be partially
fulfilled, e.g. a request for more data than is present.
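.PP
The choice between blocking, failing with EWOULDBLOCK, and partially
satisfying a request can be sketched as follows.
This is an illustration only, not the kernel's receive routine:
\fIrcv_decide\fP, \fIrcv_action\fP and the RCV_ constants are hypothetical names,
and the sketch assumes a low-water mark of 0, so that any queued data
lets the request proceed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SS_NBIO 0x100	/* non-blocking ops (value from the state table) */

/* Hypothetical outcomes of a receive request in this sketch. */
enum rcv_action { RCV_COPY, RCV_BLOCK, RCV_FAIL };

/*
 * Decide how a receive of `want` characters proceeds when `cc`
 * characters are queued.  On RCV_COPY, *ncopy is the (possibly
 * partial) amount to deliver; on RCV_FAIL, *error holds the errno.
 */
enum rcv_action
rcv_decide(int so_state, unsigned cc, unsigned want,
    unsigned *ncopy, int *error)
{
	assert(ncopy != NULL && error != NULL);
	*ncopy = 0;
	*error = 0;
	if (cc == 0) {
		/* Nothing queued: fail at once if non-blocking, else wait. */
		if (so_state & SS_NBIO) {
			*error = EWOULDBLOCK;
			return RCV_FAIL;
		}
		return RCV_BLOCK;
	}
	/* Some data is present: deliver what there is, up to the request. */
	*ncopy = cc < want ? cc : want;
	return RCV_COPY;
}
```

A caller that receives RCV_BLOCK would sleep on the receive queue and retry
once data arrives; a non-blocking caller never reaches that case.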
144 If a process requested ``asynchronous'' notification of events
145 related to the socket, the SIGIO signal is posted to the process
146 when such events occur.
147 An event is a change in the socket's state;
148 examples of such occurrences are: space
149 becoming available in the send queue, new data available in the
150 receive queue, connection establishment or disestablishment, etc.
152 A socket may be marked ``privileged'' if it was created by the
153 super-user. Only privileged sockets may
154 bind addresses in privileged portions of an address space
155 or use ``raw'' sockets to access lower levels of the network.
159 A socket's data queue contains a pointer to the data stored in
160 the queue and other entries related to the management of
161 the data. The following structure defines a data queue:
struct sockbuf {
	u_short	sb_cc;		/* actual chars in buffer */
	u_short	sb_hiwat;	/* max actual char count */
	u_short	sb_mbcnt;	/* chars of mbufs used */
	u_short	sb_mbmax;	/* max chars of mbufs to use */
	u_short	sb_lowat;	/* low water mark */
	short	sb_timeo;	/* timeout */
	struct	mbuf *sb_mb;	/* the mbuf chain */
	struct	proc *sb_sel;	/* process selecting read/write */
	short	sb_flags;	/* flags, see below */
};
177 Data is stored in a queue as a chain of mbufs.
178 The actual count of data characters as well as high and low water marks are
179 used by the protocols in controlling the flow of data.
180 The amount of buffer space (characters of mbufs and associated data pages)
181 is also recorded along with the limit on buffer allocation.
182 The socket routines cooperate in implementing the flow control
183 policy by blocking a process when it requests to send data and
184 the high water mark has been reached, or when it requests to
185 receive data and less than the low water mark is present
186 (assuming non-blocking I/O has not been specified).*
188 * The low-water mark is always presumed to be 0
189 in the current implementation.
When a socket is created, the supporting protocol ``reserves'' space
for the send and receive queues of the socket.
The limit on buffer allocation is set somewhat higher than the limit
on data characters
to account for the granularity of buffer allocation.
The actual storage associated with a
socket queue may fluctuate during a socket's lifetime, but it is assumed
that this reservation will always allow a protocol to acquire enough memory
to satisfy the high water marks.
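.PP
The two limits described above (data characters and buffer storage)
can be combined into a single ``space available'' figure.
The following is a minimal sketch under that reading, not the system's own code:
\fIsockbuf_sketch\fP, \fIsb_space\fP and \fIsb_send_must_wait\fP are hypothetical
names, and only the four counters from the structure above are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for the data queue; field names follow the text. */
struct sockbuf_sketch {
	unsigned short sb_cc;		/* actual chars in buffer */
	unsigned short sb_hiwat;	/* max actual char count */
	unsigned short sb_mbcnt;	/* chars of mbufs used */
	unsigned short sb_mbmax;	/* max chars of mbufs to use */
};

/*
 * Space left in the queue: the smaller of the room under the
 * high water mark and the room under the buffer allocation limit.
 */
int
sb_space(const struct sockbuf_sketch *sb)
{
	int by_chars, by_mbufs;

	assert(sb != NULL);
	by_chars = (int)sb->sb_hiwat - (int)sb->sb_cc;
	by_mbufs = (int)sb->sb_mbmax - (int)sb->sb_mbcnt;
	return by_chars < by_mbufs ? by_chars : by_mbufs;
}

/* A blocking sender waits once no space remains. */
int
sb_send_must_wait(const struct sockbuf_sketch *sb)
{
	return sb_space(sb) <= 0;
}
```

Because the allocation limit is set above the character limit, the
character count is normally the binding constraint; the storage limit
matters when many small, mostly empty buffers are queued.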
202 The timeout and select values are manipulated by the socket routines
203 in implementing various portions of the interprocess communications
204 facilities and will not be described here.
206 Data queued at a socket is stored in one of two styles.
207 Stream-oriented sockets queue data with no addresses, headers
208 or record boundaries.
209 The data are in mbufs linked through the \fIm_next\fP field.
210 Buffers containing access rights may be present within the chain
211 if the underlying protocol supports passage of access rights.
212 Record-oriented sockets, including datagram sockets,
213 queue data as a list of packets; the sections of packets are distinguished
214 by the types of the mbufs containing them.
215 The mbufs which comprise a record are linked through the \fIm_next\fP field;
216 records are linked from the \fIm_act\fP field of the first mbuf
217 of one packet to the first mbuf of the next.
Each packet begins with an mbuf containing the ``from'' address
if the protocol provides it,
then any buffers containing access rights, and finally any buffers
containing data.
If a record contains no data,
no data buffers are required unless neither address nor access rights
are present.
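.PP
The record-oriented layout can be traversed as sketched below.
The \fIm_next\fP and \fIm_act\fP links are those named in the text;
everything else (\fImbuf_sketch\fP, the SK_ type tags and both functions)
is hypothetical and stands in for the real mbuf definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags distinguishing the sections of a packet. */
enum sk_mtype { SK_DATA, SK_SONAME, SK_RIGHTS };

struct mbuf_sketch {
	struct mbuf_sketch *m_next;	/* next buffer in this record */
	struct mbuf_sketch *m_act;	/* first buffer of the next record */
	int m_len;			/* bytes held in this buffer */
	enum sk_mtype m_type;		/* address, rights or data */
};

/* Count queued records by following m_act from the first mbuf of each. */
int
sk_count_records(const struct mbuf_sketch *head)
{
	int n = 0;

	for (; head != NULL; head = head->m_act)
		n++;
	return n;
}

/* Sum the data bytes of one record, skipping address and rights buffers. */
int
sk_record_datalen(const struct mbuf_sketch *rec)
{
	int len = 0;

	assert(rec != NULL);
	for (; rec != NULL; rec = rec->m_next)
		if (rec->m_type == SK_DATA)
			len += rec->m_len;
	return len;
}
```

Only the first mbuf of a record carries a meaningful \fIm_act\fP link,
which is why the record count never descends along \fIm_next\fP.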
226 A socket queue has a number of flags used in synchronizing access
227 to the data and in acquiring resources:
#define	SB_LOCK	0x01	/* lock on data queue (so_rcv only) */
#define	SB_WANT	0x02	/* someone is waiting to lock */
#define	SB_WAIT	0x04	/* someone is waiting for data/space */
#define	SB_SEL	0x08	/* buffer is selected */
#define	SB_COLL	0x10	/* collision selecting */
236 The last two flags are manipulated by the system in implementing
237 the select mechanism.
239 Socket connection queuing
In dealing with connection-oriented sockets (e.g. SOCK_STREAM)
the two ends are considered distinct. One end is termed
\fIactive\fP, and generates connection requests. The other
end is called \fIpassive\fP and accepts connection requests.
246 From the passive side, a socket is marked with
247 SO_ACCEPTCONN when a \fIlisten\fP call is made,
248 creating two queues of sockets: \fIso_q0\fP for connections
249 in progress and \fIso_q\fP for connections already made and
250 awaiting user acceptance.
251 As a protocol is preparing incoming connections, it creates
252 a socket structure queued on \fIso_q0\fP by calling the routine
253 \fIsonewconn\fP(). When the connection
254 is established, the socket structure is then transferred
255 to \fIso_q\fP, making it available for an \fIaccept\fP.
257 If an SO_ACCEPTCONN socket is closed with sockets on either
258 \fIso_q0\fP or \fIso_q\fP, these sockets are dropped,
259 with notification to the peers as appropriate.
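.PP
The movement of a socket from \fIso_q0\fP to \fIso_q\fP can be sketched as follows.
This is a simplified model, not the kernel's queue code:
the \fInext\fP link, the function names, and the rule that refuses a new
connection once the two queues together reach \fIso_qlimit\fP are all
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

struct sock_sketch {
	struct sock_sketch *so_head;	/* back pointer to accept socket */
	struct sock_sketch *so_q0;	/* queue of partial connections */
	struct sock_sketch *so_q;	/* queue of incoming connections */
	short so_q0len, so_qlen, so_qlimit;
	struct sock_sketch *next;	/* hypothetical queue link */
};

/* Append so to the end of a queue. */
static void
sk_enqueue(struct sock_sketch **list, struct sock_sketch *so)
{
	so->next = NULL;
	while (*list != NULL)
		list = &(*list)->next;
	*list = so;
}

/* Unlink so from a queue; returns 1 if it was found. */
static int
sk_dequeue(struct sock_sketch **list, struct sock_sketch *so)
{
	for (; *list != NULL; list = &(*list)->next)
		if (*list == so) {
			*list = so->next;
			so->next = NULL;
			return 1;
		}
	return 0;
}

/* Protocol side: a connection request has arrived for head. */
int
sk_newconn(struct sock_sketch *head, struct sock_sketch *so)
{
	assert(head != NULL && so != NULL);
	if (head->so_q0len + head->so_qlen >= head->so_qlimit)
		return 0;		/* assumed limit rule; request refused */
	so->so_head = head;
	sk_enqueue(&head->so_q0, so);
	head->so_q0len++;
	return 1;
}

/* Protocol side: the connection is established; make it acceptable. */
int
sk_connected(struct sock_sketch *so)
{
	struct sock_sketch *head = so->so_head;

	if (head == NULL || !sk_dequeue(&head->so_q0, so))
		return 0;
	head->so_q0len--;
	sk_enqueue(&head->so_q, so);
	head->so_qlen++;
	return 1;
}
```

An \fIaccept\fP would then remove the first entry of \fIso_q\fP and
clear its \fIso_head\fP pointer.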
Each socket is created in a communications domain,
which usually implies both an addressing structure (address family)
and a set of protocols which implement various socket types within the domain
(protocol family).
Each domain is defined by the following structure:
.ta .5i +\w'struct 'u +\w'(*dom_externalize)(); 'u
struct domain {
	int	dom_family;	/* PF_xxx */
	int	(*dom_init)();	/* initialize domain data structures */
	int	(*dom_externalize)();	/* externalize access rights */
	int	(*dom_dispose)();	/* dispose of internalized rights */
	struct	protosw *dom_protosw, *dom_protoswNPROTOSW;
	struct	domain *dom_next;
};
At boot time, each domain configured into the kernel
is added to a linked list of domains.
The initialization procedure of each domain is then called.
284 After that time, the domain structure is used to locate protocols
285 within the protocol family.
286 It may also contain procedure references
287 for externalization of access rights at the receiving socket
288 and the disposal of access rights that are not received.
290 Protocols are described by a set of entry points and certain
291 socket-visible characteristics, some of which are used in
292 deciding which socket type(s) they may support.
294 An entry in the ``protocol switch'' table exists for each
295 protocol module configured into the system. It has the following form:
.ta .5i +\w'struct 'u +\w'domain *pr_domain; 'u
struct protosw {
	short	pr_type;	/* socket type used for */
	struct	domain *pr_domain;	/* domain protocol a member of */
	short	pr_protocol;	/* protocol number */
	short	pr_flags;	/* socket visible attributes */
/* protocol-protocol hooks */
	int	(*pr_input)();	/* input to protocol (from below) */
	int	(*pr_output)();	/* output to protocol (from above) */
	int	(*pr_ctlinput)();	/* control input (from below) */
	int	(*pr_ctloutput)();	/* control output (from above) */
/* user-protocol hook */
	int	(*pr_usrreq)();	/* user request */
/* utility hooks */
	int	(*pr_init)();	/* initialization routine */
	int	(*pr_fasttimo)();	/* fast timeout (200ms) */
	int	(*pr_slowtimo)();	/* slow timeout (500ms) */
	int	(*pr_drain)();	/* flush any excess space possible */
};
318 A protocol is called through the \fIpr_init\fP entry before any other.
319 Thereafter it is called every 200 milliseconds through the
320 \fIpr_fasttimo\fP entry and
321 every 500 milliseconds through the \fIpr_slowtimo\fP for timer based actions.
The system will call the \fIpr_drain\fP entry if it is low on space;
the protocol should respond by discarding any non-critical data.
325 Protocols pass data between themselves as chains of mbufs using
326 the \fIpr_input\fP and \fIpr_output\fP routines. \fIPr_input\fP
327 passes data up (towards
328 the user) and \fIpr_output\fP passes it down (towards the network); control
329 information passes up and down on \fIpr_ctlinput\fP and \fIpr_ctloutput\fP.
330 The protocol is responsible for the space occupied by any of the
331 arguments to these entries and must either pass it onward or dispose of it.
332 (On output, the lowest level reached must free buffers storing the arguments;
333 on input, the highest level is responsible for freeing buffers.)
335 The \fIpr_usrreq\fP routine interfaces protocols to the socket
336 code and is described below.
338 The \fIpr_flags\fP field is constructed from the following values:
.ta \w'#define 'u +\w'PR_CONNREQUIRED 'u +8n
#define	PR_ATOMIC	0x01	/* exchange atomic messages only */
#define	PR_ADDR	0x02	/* addresses given with messages */
#define	PR_CONNREQUIRED	0x04	/* connection required by protocol */
#define	PR_WANTRCVD	0x08	/* want PRU_RCVD calls */
#define	PR_RIGHTS	0x10	/* passes capabilities */
347 Protocols which are connection-based specify the PR_CONNREQUIRED
348 flag so that the socket routines will never attempt to send data
349 before a connection has been established. If the PR_WANTRCVD flag
350 is set, the socket routines will notify the protocol when the user
351 has removed data from the socket's receive queue. This allows
352 the protocol to implement acknowledgement on user receipt, and
353 also update windowing information based on the amount of space
available in the receive queue. The PR_ADDR flag indicates that any
data placed in the socket's receive queue will be preceded by the
356 address of the sender. The PR_ATOMIC flag specifies that each \fIuser\fP
357 request to send data must be performed in a single \fIprotocol\fP send
358 request; it is the protocol's responsibility to maintain record
359 boundaries on data to be sent. The PR_RIGHTS flag indicates that the
360 protocol supports the passing of capabilities; this is currently
361 used only by the protocols in the UNIX protocol family.
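.PP
The effect of PR_CONNREQUIRED and PR_ATOMIC on a send request can be
sketched as a check made before any data is handed to the protocol.
The function \fIsk_send_precheck\fP is hypothetical, and the choice of
ENOTCONN and EMSGSIZE as the errors returned is an assumption of this
sketch rather than something stated above.

```c
#include <assert.h>
#include <errno.h>

#define PR_ATOMIC	0x01	/* exchange atomic messages only */
#define PR_CONNREQUIRED	0x04	/* connection required by protocol */
#define SS_ISCONNECTED	0x002	/* socket connected to a peer */

/*
 * Return 0 if a send of len characters may proceed, else an errno.
 * hiwat is the send queue's high water mark.
 */
int
sk_send_precheck(int pr_flags, int so_state, unsigned len, unsigned hiwat)
{
	assert(hiwat > 0);
	/* Connection-based protocols never see data before a connection. */
	if ((pr_flags & PR_CONNREQUIRED) && !(so_state & SS_ISCONNECTED))
		return ENOTCONN;
	/*
	 * An atomic message must go down in one protocol request, so it
	 * can never be larger than the send queue is allowed to hold.
	 */
	if ((pr_flags & PR_ATOMIC) && len > hiwat)
		return EMSGSIZE;
	return 0;
}
```

A stream protocol would typically set only PR_CONNREQUIRED, so a large
write is split across several protocol requests instead of being refused.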
When a socket is created, the socket routines scan the protocol
table for the domain
looking for an appropriate protocol to support the type of
socket being created. The \fIpr_type\fP field contains one of the
possible socket types (e.g. SOCK_STREAM), while the \fIpr_domain\fP
is a back pointer to the domain structure.
The \fIpr_protocol\fP field contains the protocol number of the
protocol, normally a well-known value.
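.PP
The scan can be sketched as two loops: one over the list of domains,
one over the chosen domain's slice of the protocol switch.
The structures below are cut down to the fields the scan needs, and
\fIsk_findtype\fP, the _sk suffixes and the SK_ socket type values are
hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

#define SK_SOCK_STREAM	1	/* illustrative socket type values */
#define SK_SOCK_DGRAM	2

struct domain_sk;

struct protosw_sk {
	short pr_type;			/* socket type used for */
	struct domain_sk *pr_domain;	/* domain protocol a member of */
	short pr_protocol;		/* protocol number */
	short pr_flags;			/* socket visible attributes */
};

struct domain_sk {
	int dom_family;			/* PF_xxx */
	struct protosw_sk *dom_protosw, *dom_protoswNPROTOSW;
	struct domain_sk *dom_next;
};

/* Find the first protocol in the given family that supports type. */
struct protosw_sk *
sk_findtype(struct domain_sk *domains, int family, int type)
{
	struct domain_sk *dp;
	struct protosw_sk *pr;

	assert(type != 0);
	for (dp = domains; dp != NULL; dp = dp->dom_next)
		if (dp->dom_family == family)
			break;
	if (dp == NULL)
		return NULL;		/* no such domain configured */
	/* dom_protoswNPROTOSW points one past the domain's last entry. */
	for (pr = dp->dom_protosw; pr < dp->dom_protoswNPROTOSW; pr++)
		if (pr->pr_type == type)
			return pr;
	return NULL;
}
```

A lookup by explicit protocol number would follow the same pattern,
comparing \fIpr_protocol\fP as well as \fIpr_type\fP.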
372 Network-interface layer
374 Each network-interface configured into a system defines a
375 path through which packets may be sent and received.
376 Normally a hardware device is associated with this interface,
377 though there is no requirement for this (for example, all
378 systems have a software ``loopback'' interface used for
379 debugging and performance analysis).
380 In addition to manipulating the hardware device, an interface
381 module is responsible
382 for encapsulation and decapsulation of any link-layer header
383 information required to deliver a message to its destination.
384 The selection of which interface to use in delivering packets
385 is a routing decision carried out at a
386 higher level than the network-interface layer.
387 An interface may have addresses in one or more address families.
388 The address is set at boot time using an \fIioctl\fP on a socket
389 in the appropriate domain; this operation is implemented by the protocol
390 family, after verifying the operation through the device \fIioctl\fP entry.
392 An interface is defined by the following structure,
.ta .5i +\w'struct 'u +\w'ifaddr *if_addrlist; 'u
struct ifnet {
	char	*if_name;	/* name, e.g. ``en'' or ``lo'' */
	short	if_unit;	/* sub-unit for lower level driver */
	short	if_mtu;	/* maximum transmission unit */
	short	if_flags;	/* up/down, broadcast, etc. */
	short	if_timer;	/* time 'til if_watchdog called */
	struct	ifaddr *if_addrlist;	/* list of addresses of interface */
	struct	ifqueue if_snd;	/* output queue */
	int	(*if_init)();	/* init routine */
	int	(*if_output)();	/* output routine */
	int	(*if_ioctl)();	/* ioctl routine */
	int	(*if_reset)();	/* bus reset routine */
	int	(*if_watchdog)();	/* timer routine */
	int	if_ipackets;	/* packets received on interface */
	int	if_ierrors;	/* input errors on interface */
	int	if_opackets;	/* packets sent on interface */
	int	if_oerrors;	/* output errors on interface */
	int	if_collisions;	/* collisions on csma interfaces */
	struct	ifnet *if_next;
};
416 Each interface address has the following form:
.ta \w'#define 'u +\w'struct 'u +\w'struct 'u +\w'sockaddr ifa_addr; 'u-\w'struct 'u
struct ifaddr {
	struct	sockaddr ifa_addr;	/* address of interface */
	union {
		struct	sockaddr ifu_broadaddr;
		struct	sockaddr ifu_dstaddr;
	} ifa_ifu;
	struct	ifnet *ifa_ifp;	/* back-pointer to interface */
	struct	ifaddr *ifa_next;	/* next address for interface */
};
.ta \w'#define 'u +\w'ifa_broadaddr 'u +\w'ifa_ifu.ifu_broadaddr 'u
#define	ifa_broadaddr	ifa_ifu.ifu_broadaddr	/* broadcast address */
#define	ifa_dstaddr	ifa_ifu.ifu_dstaddr	/* other end of p-to-p link */
432 The protocol generally maintains this structure as part of a larger
433 structure containing additional information concerning the address.
435 Each interface has a send queue and routines used for
436 initialization, \fIif_init\fP, and output, \fIif_output\fP.
437 If the interface resides on a system bus, the routine \fIif_reset\fP
438 will be called after a bus reset has been performed.
439 An interface may also
440 specify a timer routine, \fIif_watchdog\fP;
441 if \fIif_timer\fP is non-zero, it is decremented once per second
442 until it reaches zero, at which time the watchdog routine is called.
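.PP
The watchdog countdown can be sketched as a routine run once per second
over the list of interfaces.
The structure is reduced to the fields involved; \fIsk_if_tick\fP,
\fIsk_demo_watchdog\fP and the assumption that the watchdog routine is
passed the unit number are hypothetical details of this example.

```c
#include <assert.h>
#include <stddef.h>

struct ifnet_sketch {
	short if_unit;			/* sub-unit for lower level driver */
	short if_timer;			/* time 'til if_watchdog called */
	int (*if_watchdog)(int unit);	/* timer routine */
	struct ifnet_sketch *if_next;
};

/* One tick: count down each running timer, firing those that reach zero. */
void
sk_if_tick(struct ifnet_sketch *list)
{
	struct ifnet_sketch *ifp;

	for (ifp = list; ifp != NULL; ifp = ifp->if_next) {
		assert(ifp->if_timer >= 0);
		/* A timer of zero is idle; a non-zero result is still running. */
		if (ifp->if_timer == 0 || --ifp->if_timer != 0)
			continue;
		if (ifp->if_watchdog != NULL)
			(*ifp->if_watchdog)(ifp->if_unit);
	}
}

/* Stand-in driver routine that only counts how often it is called. */
int sk_watchdog_calls = 0;

int
sk_demo_watchdog(int unit)
{
	(void)unit;
	sk_watchdog_calls++;
	return 0;
}
```

A driver that wants periodic service would reload \fIif_timer\fP from
inside its watchdog routine; otherwise the timer stays idle at zero.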
444 The state of an interface and certain characteristics are stored in
445 the \fIif_flags\fP field. The following values are possible:
#define	IFF_UP	0x1	/* interface is up */
#define	IFF_BROADCAST	0x2	/* broadcast is possible */
#define	IFF_DEBUG	0x4	/* turn on debugging */
#define	IFF_LOOPBACK	0x8	/* is a loopback net */
#define	IFF_POINTOPOINT	0x10	/* interface is point-to-point link */
#define	IFF_NOTRAILERS	0x20	/* avoid use of trailers */
#define	IFF_RUNNING	0x40	/* resources allocated */
#define	IFF_NOARP	0x80	/* no address resolution protocol */
457 If the interface is connected to a network which supports transmission
458 of \fIbroadcast\fP packets, the IFF_BROADCAST flag will be set and
459 the \fIifa_broadaddr\fP field will contain the address to be used in
460 sending or accepting a broadcast packet. If the interface is associated
461 with a point-to-point hardware link (for example, a DEC DMR-11), the
462 IFF_POINTOPOINT flag will be set and \fIifa_dstaddr\fP will contain the
address of the host on the other side of the connection. These addresses
and the local address of the interface, \fIifa_addr\fP, are used in
filtering incoming packets. The interface sets IFF_RUNNING after
466 it has allocated system resources and posted an initial read on the
467 device it manages. This state bit is used to avoid multiple allocation
468 requests when an interface's address is changed. The IFF_NOTRAILERS
469 flag indicates the interface should refrain from using a \fItrailer\fP
470 encapsulation on outgoing packets, or (where per-host negotiation
471 of trailers is possible) that trailer encapsulations should not be requested;
472 \fItrailer\fP protocols are described
473 in section 14. The IFF_NOARP flag indicates the interface should not
474 use an ``address resolution protocol'' in mapping internetwork addresses
475 to local network addresses.
477 Various statistics are also stored in the interface structure. These
478 may be viewed by users using the \fInetstat\fP(1) program.
The interface address and flags may be set with the SIOCSIFADDR and
SIOCSIFFLAGS \fIioctl\fP\^s. SIOCSIFADDR is used initially to define each
interface's address; SIOCSIFFLAGS can be used to mark
an interface down and perform site-specific configuration.
The destination address of a point-to-point link is set with SIOCSIFDSTADDR.
Corresponding operations exist to read each value.
Protocol families may also support operations to set and read the broadcast
address.
In addition, the SIOCGIFCONF \fIioctl\fP retrieves a list of interface
names and addresses for all interfaces and protocols on the host.
All hardware-related interfaces currently reside on the UNIBUS.
Consequently, a common set of utility routines for dealing
with the UNIBUS has been developed. Each UNIBUS interface
utilizes a structure of the following form:
.ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR]; 'u
struct ifubinfo {
	short	iff_uban;	/* uba number */
	short	iff_hlen;	/* local net header length */
	struct	uba_regs *iff_uba;	/* uba regs, in vm */
	short	iff_flags;	/* used during uballoc's */
};
506 Additional structures are associated with each receive and transmit buffer,
507 normally one each per interface; for read,
.ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR]; 'u
struct ifrw {
	caddr_t	ifrw_addr;	/* virt addr of header */
	short	ifrw_bdp;	/* unibus bdp */
	short	ifrw_flags;	/* type, etc. */
#define	IFRW_W	0x01	/* is a transmit buffer */
	int	ifrw_info;	/* value from ubaalloc */
	int	ifrw_proto;	/* map register prototype */
	struct	pte *ifrw_mr;	/* base of map registers */
};
.ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR]; 'u
struct ifxmt {
	struct	ifrw ifrw;
	caddr_t	ifw_base;	/* virt addr of buffer */
	struct	pte ifw_wmap[IF_MAXNUBAMR];	/* base pages for output */
	struct	mbuf *ifw_xtofree;	/* pages being dma'd out */
	short	ifw_xswapd;	/* mask of clusters swapped */
	short	ifw_nmr;	/* number of entries in wmap */
};
.ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR]; 'u
#define	ifw_addr	ifrw.ifrw_addr
#define	ifw_bdp	ifrw.ifrw_bdp
#define	ifw_flags	ifrw.ifrw_flags
#define	ifw_info	ifrw.ifrw_info
#define	ifw_proto	ifrw.ifrw_proto
#define	ifw_mr	ifrw.ifrw_mr
539 One of each of these structures is conveniently packaged for interfaces
540 with single buffers for each direction, as follows:
.ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR]; 'u
struct ifuba {
	struct	ifubinfo ifu_info;
	struct	ifrw ifu_r;
	struct	ifxmt ifu_xmt;
};
.ta \w'#define 'u +\w'ifw_xtofree 'u
#define	ifu_uban	ifu_info.iff_uban
#define	ifu_hlen	ifu_info.iff_hlen
#define	ifu_uba	ifu_info.iff_uba
#define	ifu_flags	ifu_info.iff_flags
#define	ifu_w	ifu_xmt.ifrw
#define	ifu_xtofree	ifu_xmt.ifw_xtofree
The \fIifubinfo\fP structure contains the general information needed
to characterize the I/O-mapped buffers for the device.
559 In addition, there is a structure describing each buffer, including
560 UNIBUS resources held by the interface.
561 Sufficient memory pages and bus map registers are allocated to each buffer
562 upon initialization according to the maximum packet size and header length.
563 The kernel virtual address of the buffer is held in \fIifrw_addr\fP,
564 and the map registers begin
565 at \fIifrw_mr\fP. UNIBUS map register \fIifrw_mr\fP\^[\-1]
566 maps the local network header
567 ending on a page boundary. UNIBUS data paths are
568 reserved for read and for
569 write, given by \fIifrw_bdp\fP. The prototype of the map
570 registers for read and for write is saved in \fIifrw_proto\fP.
572 When write transfers are not at least half-full pages on page boundaries,
573 the data are just copied into the pages mapped on the UNIBUS
574 and the transfer is started.
575 If a write transfer is at least half a page long and on a page
576 boundary, UNIBUS page table entries are swapped to reference
577 the pages, and then the initial pages are
578 remapped from \fIifw_wmap\fP when the transfer completes.
579 The mbufs containing the mapped pages are placed on the \fIifw_xtofree\fP
580 queue to be freed after transmission.
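.PP
The copy-or-swap rule for write transfers reduces to a test on the
address and length of the data.
The sketch below is an illustration of that rule only: the page size
SK_PAGE, the enumeration and \fIsk_write_plan\fP are hypothetical, and
the real routines work in units that depend on the machine's page and
cluster sizes.

```c
#include <assert.h>

#define SK_PAGE	512	/* assumed page size in bytes; illustrative only */

/* Hypothetical outcomes for one piece of an outgoing packet. */
enum sk_wplan { SK_COPY, SK_SWAP };

/*
 * Choose how len bytes starting at addr reach the UNIBUS-mapped
 * buffer: swap page table entries when the data start on a page
 * boundary and fill at least half a page, otherwise copy.
 */
enum sk_wplan
sk_write_plan(unsigned long addr, unsigned len)
{
	assert(len > 0);
	if (addr % SK_PAGE == 0 && len >= SK_PAGE / 2)
		return SK_SWAP;
	return SK_COPY;
}
```

Swapped pages must be kept until the device has finished with them,
which is the purpose of the \fIifw_xtofree\fP queue described above.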
582 When read transfers give at least half a page of data to be input, page
583 frames are allocated from a network page list and traded
584 with the pages already containing the data, mapping the allocated
585 pages to replace the input pages for the next UNIBUS data input.
587 The following utility routines are available for use in
588 writing network interface drivers; all use the
589 structures described above.
if_ubaminit(ifubinfo, uban, hlen, nmr, ifr, nr, ifx, nx);
if_ubainit(ifuba, uban, hlen, nmr);
595 \fIif_ubaminit\fP allocates resources on UNIBUS adapter \fIuban\fP,
596 storing the information in the \fIifubinfo\fP, \fIifrw\fP and \fIifxmt\fP
597 structures referenced.
598 The \fIifr\fP and \fIifx\fP parameters are pointers to arrays
599 of \fIifrw\fP and \fIifxmt\fP structures whose dimensions
600 are \fInr\fP and \fInx\fP, respectively.
601 \fIif_ubainit\fP is a simpler, backwards-compatible interface used
602 for hardware with single buffers of each type.
603 They are called only at boot time or after a UNIBUS reset.
604 One data path (buffered or unbuffered,
605 depending on the \fIifu_flags\fP field) is allocated for each buffer.
606 The \fInmr\fP parameter indicates
607 the number of UNIBUS mapping registers required to map a maximal
608 sized packet onto the UNIBUS, while \fIhlen\fP specifies the size
609 of a local network header, if any, which should be mapped separately
from the data (see the description of trailer protocols in section 14).
611 Sufficient UNIBUS mapping registers and pages of memory are allocated
612 to initialize the input data path for an initial read. For the output
613 data path, mapping registers and pages of memory are also allocated
614 and mapped onto the UNIBUS. The pages associated with the output
615 data path are held in reserve in the event a write requires copying
616 non-page-aligned data (see \fIif_wubaput\fP below).
617 If \fIif_ubainit\fP is called with memory pages already allocated,
618 they will be used instead of allocating new ones (this normally
619 occurs after a UNIBUS reset).
A 1 is returned when allocation and initialization are successful,
0 otherwise.
m = if_ubaget(ifubinfo, ifr, totlen, off0, ifp);
m = if_rubaget(ifuba, totlen, off0, ifp);
627 \fIif_ubaget\fP and \fIif_rubaget\fP pull input data
628 out of an interface receive buffer and into an mbuf chain.
629 The first interface passes pointers to the \fIifubinfo\fP structure
630 for the interface and the \fIifrw\fP structure for the receive buffer;
631 the second call may be used for single-buffered devices.
632 \fItotlen\fP specifies the length of data to be obtained, not counting the
633 local network header. If \fIoff0\fP is non-zero, it indicates
634 a byte offset to a trailing local network header which should be
635 copied into a separate mbuf and prepended to the front of the resultant mbuf
chain. When the data amount to at least half a page,
the previously mapped data pages are remapped
into the mbufs and swapped with fresh pages, thus avoiding
any copy.
640 The receiving interface is recorded as \fIifp\fP, a pointer to an \fIifnet\fP
641 structure, for the use of the receiving network protocol.
642 A 0 return value indicates a failure to allocate resources.
if_ubaput(ifubinfo, ifx, m);
if_wubaput(ifuba, m);
\fIif_ubaput\fP and \fIif_wubaput\fP map a chain of mbufs
onto a network interface in preparation for output.
The first routine is used by devices with multiple transmit buffers.
The chain includes any local network
header, which is copied so that it resides in the mapped and
aligned I/O space.
Page-sized data that are page-aligned in the output buffer
are mapped to the UNIBUS in place of the normal buffer page,
and the corresponding mbuf is placed on a queue to be freed after transmission.
Any other mbufs which contained non-page-sized
data portions are copied to the I/O space and then freed.
659 Pages mapped from a previous output operation (no longer needed)