   1 .\" Copyright (c) 1986, 1993
   2 .\"     The Regents of the University of California.  All rights reserved.
   3 .\"
   4 .\" Redistribution and use in source and binary forms, with or without
   5 .\" modification, are permitted provided that the following conditions
   6 .\" are met:
   7 .\" 1. Redistributions of source code must retain the above copyright
   8 .\"    notice, this list of conditions and the following disclaimer.
   9 .\" 2. Redistributions in binary form must reproduce the above copyright
  10 .\"    notice, this list of conditions and the following disclaimer in the
  11 .\"    documentation and/or other materials provided with the distribution.
  12 .\" 3. All advertising materials mentioning features or use of this software
  13 .\"    must display the following acknowledgement:
  14 .\"     This product includes software developed by the University of
  15 .\"     California, Berkeley and its contributors.
  16 .\" 4. Neither the name of the University nor the names of its contributors
  17 .\"    may be used to endorse or promote products derived from this software
  18 .\"    without specific prior written permission.
  19 .\"
  20 .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  21 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  22 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  23 .\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  24 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  25 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  26 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  27 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  28 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  29 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  30 .\" SUCH DAMAGE.
  31 .\"
  32 .\"     @(#)2.t 8.1 (Berkeley) 8/14/93
  33 .\"
  34 .\".ds RH "Basics
  35 .bp
  36 .nr H1 2
  37 .nr H2 0
  38 .\" The next line is a major hack to get around internal changes in the groff
  39 .\" implementation of .NH.
  40 .nr nh*hl 1
  41 .bp
  42 .LG
  43 .B
  44 .ce
  45 2. BASICS
  46 .sp 2
  47 .R
  48 .NL
  49 .PP
  50 The basic building block for communication is the \fIsocket\fP.
  51 A socket is an endpoint of communication to which a name may
  52 be \fIbound\fP.  Each socket in use has a \fItype\fP
  53 and one or more associated processes.  Sockets exist within
  54 \fIcommunication domains\fP.
  55 A communication domain is an
  56 abstraction introduced to bundle common properties of
  57 processes communicating through sockets.
  58 One such property is the scheme used to name sockets.  For
  59 example, in the UNIX communication domain sockets are
  60 named with UNIX path names; e.g. a
  61 socket may be named \*(lq/dev/foo\*(rq.  Sockets normally
  62 exchange data only with
  63 sockets in the same domain (it may be possible to cross domain
  64 boundaries, but only if some translation process is
  65 performed).  The
  66 4.4BSD IPC facilities support four separate communication domains:
  67 the UNIX domain, for on-system communication;
  68 the Internet domain, which is used by
  69 processes which communicate
  70 using the Internet standard communication protocols;
  71 the NS domain, which is used by processes which
  72 communicate using the Xerox standard communication
  73 protocols*;
  74 .FS
  75 * See \fIInternet Transport Protocols\fP, Xerox System Integration
  76 Standard (XSIS)028112 for more information.  This document is
  77 almost a necessity for one trying to write NS applications.
  78 .FE
  79 and the ISO OSI protocols, which are not documented in this tutorial.
  80 The underlying communication
  81 facilities provided by these domains have a significant influence
  82 on the internal system implementation as well as the interface to
  83 socket facilities available to a user.  An example of the
  84 latter is that a socket \*(lqoperating\*(rq in the UNIX domain
  85 sees a subset of the error conditions which are possible
  86 when operating in the Internet (or NS) domain.
  87 .NH 2
  88 Socket types
  89 .PP
  90 Sockets are
  91 typed according to the communication properties visible to a
  92 user.
  93 Processes are presumed to communicate only between sockets of
  94 the same type, although there is
  95 nothing that prevents communication between sockets of different
  96 types should the underlying communication
  97 protocols support this.
  98 .PP
  99 Four types of sockets currently are available to a user.
 100 A \fIstream\fP socket provides for the bidirectional, reliable,
 101 sequenced, and unduplicated flow of data without record boundaries.
 102 Aside from the bidirectionality of data flow, a pair of connected
 103 stream sockets provides an interface nearly identical to that of pipes\(dg.
 104 .FS
 105 \(dg In the UNIX domain, in fact, the semantics are identical and,
 106 as one might expect, pipes have been implemented internally
 107 as simply a pair of connected stream sockets.
 108 .FE
 109 .PP
 110 A \fIdatagram\fP socket supports bidirectional flow of data which
 111 is not promised to be sequenced, reliable, or unduplicated.
 112 That is, a process
 113 receiving messages on a datagram socket may find messages duplicated,
 114 and, possibly,
  115 in an order different from the order in which they were sent.
 116 An important characteristic of a datagram
 117 socket is that record boundaries in data are preserved.  Datagram
 118 sockets closely model the facilities found in many contemporary
 119 packet switched networks such as the Ethernet.
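.PP
The difference in record boundaries between the two types can be
observed directly.  The following is a minimal sketch, not part of the
original tutorial: the helper name \fIfirst_read_size\fP is hypothetical,
and it assumes the \fIsocketpair\fP call, which this section does not
introduce, to obtain a connected pair in the UNIX domain.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Write two 3-byte messages into one end of a connected AF_UNIX pair
 * of the given type, then return how many bytes a single read() on
 * the other end yields (or -1 on failure).
 */
static long first_read_size(int type)
{
	int sv[2];
	char buf[16];
	long n = -1;

	if (socketpair(AF_UNIX, type, 0, sv) < 0)
		return -1;
	if (write(sv[0], "abc", 3) == 3 && write(sv[0], "def", 3) == 3)
		n = (long)read(sv[1], buf, sizeof(buf));
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

With SOCK_DGRAM the read returns exactly one 3-byte message.  With
SOCK_STREAM the read may return both messages run together, since a
stream carries no record boundaries.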
 120 .PP
 121 A \fIraw\fP socket provides users access to
 122 the underlying communication
 123 protocols which support socket abstractions.
 124 These sockets are normally datagram oriented, though their
 125 exact characteristics are dependent on the interface provided by
 126 the protocol.  Raw sockets are not intended for the general user; they
 127 have been provided mainly for those interested in developing new
 128 communication protocols, or for gaining access to some of the more
 129 esoteric facilities of an existing protocol.  The use of raw sockets
 130 is considered in section 5.
 131 .PP
 132 A \fIsequenced packet\fP socket is similar to a stream socket,
 133 with the exception that record boundaries are preserved.  This
 134 interface is provided only as part of the NS socket abstraction,
 135 and is very important in most serious NS applications.
 136 Sequenced-packet sockets allow the user to manipulate the
 137 SPP or IDP headers on a packet or a group of packets either
 138 by writing a prototype header along with whatever data is
 139 to be sent, or by specifying a default header to be used with
  140 all outgoing data, and allow the user to receive the headers
 141 on incoming packets.  The use of these options is considered in
 142 section 5.
 143 .PP
 144 Another potential socket type which has interesting properties is
 145 the \fIreliably delivered
 146 message\fP socket.
 147 The reliably delivered message socket has
 148 similar properties to a datagram socket, but with
 149 reliable delivery.  There is currently no support for this
 150 type of socket, but a reliably delivered message protocol
 151 similar to Xerox's Packet Exchange Protocol (PEX) may be
 152 simulated at the user level.  More information on this topic
 153 can be found in section 5.
 154 .NH 2
 155 Socket creation
 156 .PP
 157 To create a socket the \fIsocket\fP system call is used:
 158 .DS
 159 s = socket(domain, type, protocol);
 160 .DE
 161 This call requests that the system create a socket in the specified
 162 \fIdomain\fP and of the specified \fItype\fP.  A particular protocol may
 163 also be requested.  If the protocol is left unspecified (a value
 164 of 0), the system will select an appropriate protocol from those
 165 protocols which comprise the communication domain and which
 166 may be used to support the requested socket type.  The user is
 167 returned a descriptor (a small integer number) which may be used
 168 in later system calls which operate on sockets.  The domain is specified as
 169 one of the manifest constants defined in the file <\fIsys/socket.h\fP>.
 170 For the UNIX domain the constant is AF_UNIX*;  for the Internet
 171 .FS
 172 * The manifest constants are named AF_whatever as they indicate
 173 the ``address format'' to use in interpreting names.
 174 .FE
 175 domain AF_INET; and for the NS domain, AF_NS.
 176 The socket types are also defined in this file
 177 and one of SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, or SOCK_SEQPACKET
 178 must be specified.
 179 To create a stream socket in the Internet domain the following
 180 call might be used:
 181 .DS
 182 s = socket(AF_INET, SOCK_STREAM, 0);
 183 .DE
 184 This call would result in a stream socket being created with the TCP
 185 protocol providing the underlying communication support.  To
 186 create a datagram socket for on-machine use the call might
 187 be:
 188 .DS
 189 s = socket(AF_UNIX, SOCK_DGRAM, 0);
 190 .DE
 191 .PP
 192 The default protocol (used when the \fIprotocol\fP argument to the
  193 \fIsocket\fP call is 0) should be correct for almost every
 194 situation.  However, it is possible to specify a protocol
 195 other than the default; this will be covered in
 196 section 5.
 197 .PP
 198 There are several reasons a socket call may fail.  Aside from
 199 the rare occurrence of lack of memory (ENOBUFS), a socket
 200 request may fail due to a request for an unknown protocol
 201 (EPROTONOSUPPORT), or a request for a type of socket for
 202 which there is no supporting protocol (EPROTOTYPE).
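.PP
A caller will usually want to check the returned descriptor and
examine \fIerrno\fP when the call fails.  The sketch below shows one
way to do so; the wrapper name \fImake_socket\fP and the wording of the
messages are illustrative assumptions, not part of the system interface.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Create a socket, reporting the reason on stderr if the call fails. */
static int make_socket(int domain, int type, int protocol)
{
	int s = socket(domain, type, protocol);

	if (s < 0) {
		if (errno == EPROTONOSUPPORT)
			fprintf(stderr, "socket: unknown protocol\n");
		else if (errno == EPROTOTYPE)
			fprintf(stderr, "socket: no protocol supports this type\n");
		else
			fprintf(stderr, "socket: %s\n", strerror(errno));
	}
	return s;		/* a small non-negative integer on success */
}
```

A call such as make_socket(AF_INET, SOCK_STREAM, 0) then behaves like
the plain \fIsocket\fP call, but leaves a diagnostic behind on failure.
The exact error number returned for a given bad request may differ
between systems.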
 203 .NH 2
 204 Binding local names
 205 .PP
 206 A socket is created without a name.  Until a name is bound
 207 to a socket, processes have no way to reference it and, consequently,
 208 no messages may be received on it.
 209 Communicating processes are bound
 210 by an \fIassociation\fP.  In the Internet and NS domains,
 211 an association
 212 is composed of local and foreign
 213 addresses, and local and foreign ports,
 214 while in the UNIX domain, an association is composed of
 215 local and foreign path names (the phrase ``foreign pathname''
 216 means a pathname created by a foreign process, not a pathname
 217 on a foreign system).
 218 In most domains, associations must be unique.
 219 In the Internet domain there
 220 may never be duplicate <protocol, local address, local port, foreign
 221 address, foreign port> tuples.  UNIX domain sockets need not always
 222 be bound to a name, but when bound
 223 there may never be duplicate <protocol, local pathname, foreign
 224 pathname> tuples.
 225 The pathnames may not refer to files
 226 already existing on the system
 227 in 4.3; the situation may change in future releases.
 228 .PP
 229 The \fIbind\fP system call allows a process to specify half of
 230 an association, <local address, local port>
 231 (or <local pathname>), while the \fIconnect\fP
 232 and \fIaccept\fP primitives are used to complete a socket's association.
 233 .PP
 234 In the Internet domain,
 235 binding names to sockets can be fairly complex.
 236 Fortunately, it is usually not necessary to specifically bind an
 237 address and port number to a socket, because the
 238 \fIconnect\fP and \fIsend\fP calls will automatically
 239 bind an appropriate address if they are used with an
 240 unbound socket.  The process of binding names to NS
 241 sockets is similar in most ways to that of
 242 binding names to Internet sockets.
 243 .PP
 244 The \fIbind\fP system call is used as follows:
 245 .DS
 246 bind(s, name, namelen);
 247 .DE
 248 The bound name is a variable length byte string which is interpreted
 249 by the supporting protocol(s).  Its interpretation may vary from
 250 communication domain to communication domain (this is one of
 251 the properties which comprise the \*(lqdomain\*(rq).
 252 As mentioned, in the
 253 Internet domain names contain an Internet address and port
 254 number.  NS domain names contain an NS address and
 255 port number.  In the UNIX domain, names contain a path name and
 256 a family, which is always AF_UNIX.  If one wanted to bind
 257 the name \*(lq/tmp/foo\*(rq to a UNIX domain socket, the
 258 following code would be used*:
 259 .FS
 260 * Note that, although the tendency here is to call the \*(lqaddr\*(rq
 261 structure \*(lqsun\*(rq, doing so would cause problems if the code
 262 were ever ported to a Sun workstation.
 263 .FE
 264 .DS
 265 #include <sys/un.h>
 266  ...
 267 struct sockaddr_un addr;
 268  ...
 269 strcpy(addr.sun_path, "/tmp/foo");
 270 addr.sun_family = AF_UNIX;
 271 bind(s, (struct sockaddr *) &addr, strlen(addr.sun_path) +
 272     sizeof (addr.sun_len) + sizeof (addr.sun_family));
 273 .DE
 274 Note that in determining the size of a UNIX domain address null
 275 bytes are not counted, which is why \fIstrlen\fP is used.  In
 276 the current implementation of UNIX domain IPC,
 277 the file name
 278 referred to in \fIaddr.sun_path\fP is created as a socket
 279 in the system file space.
 280 The caller must, therefore, have
 281 write permission in the directory where
 282 \fIaddr.sun_path\fP is to reside, and this file should be deleted by the
 283 caller when it is no longer needed.  Future versions of 4BSD
 284 may not create this file.
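.PP
The steps above can be collected into a small routine.  This is a
sketch under stated assumptions: the name \fIbind_unix_dgram\fP is
hypothetical, and the length is computed with \fIoffsetof\fP rather than
by adding up the sizes of the header fields, so that the same code
compiles on systems whose \fIsockaddr_un\fP has no \fIsun_len\fP member.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a UNIX domain datagram socket bound to `path`.
 * Returns the descriptor, or -1 on failure.
 */
static int bind_unix_dgram(const char *path)
{
	struct sockaddr_un addr;
	socklen_t len;
	int s;

	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;		/* name would not fit */

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);
	/* Header fields plus the name; the null byte is not counted. */
	len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + strlen(path));

	if ((s = socket(AF_UNIX, SOCK_DGRAM, 0)) < 0)
		return -1;
	if (bind(s, (struct sockaddr *)&addr, len) < 0) {
		close(s);		/* e.g. the name already exists */
		return -1;
	}
	return s;
}
```

When the socket is no longer needed, the caller closes the descriptor
and removes the name with \fIunlink\fP, as described above.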
 285 .PP
 286 In binding an Internet address things become more
 287 complicated.  The actual call is similar,
 288 .DS
 289 #include <sys/types.h>
 290 #include <netinet/in.h>
 291  ...
 292 struct sockaddr_in sin;
 293  ...
 294 bind(s, (struct sockaddr *) &sin, sizeof (sin));
 295 .DE
 296 but the selection of what to place in the address \fIsin\fP
 297 requires some discussion.  We will come back to the problem
 298 of formulating Internet addresses in section 3 when
 299 the library routines used in name resolution are discussed.
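.PP
Until then, the simplest case can be sketched without any name
resolution: binding to the wildcard address INADDR_ANY and, optionally,
letting the system choose the port.  The routine name
\fIbind_inet_any\fP is hypothetical, and \fIgetsockname\fP, used here to
learn which port was assigned, is a call this section does not
otherwise discuss.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a socket of the given type bound to the wildcard address.
 * If `port` is 0 the system picks one.  Returns the descriptor and
 * stores the bound port (host byte order) in *bound, or returns -1.
 */
static int bind_inet_any(int type, unsigned short port, unsigned short *bound)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);	/* any local address */
	sin.sin_port = htons(port);			/* network byte order */

	if ((s = socket(AF_INET, type, 0)) < 0)
		return -1;
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) < 0) {
		close(s);
		return -1;
	}
	*bound = ntohs(sin.sin_port);
	return s;
}
```

Both the address and the port are stored in network byte order, which
is why the conversion routines appear.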
 300 .PP
 301 Binding an NS address to a socket is even more
 302 difficult,
 303 especially since the Internet library routines do not
 304 work with NS hostnames.  The actual call is again similar:
 305 .DS
 306 #include <sys/types.h>
 307 #include <netns/ns.h>
 308  ...
 309 struct sockaddr_ns sns;
 310  ...
 311 bind(s, (struct sockaddr *) &sns, sizeof (sns));
 312 .DE
 313 Again, discussion of what to place in a \*(lqstruct sockaddr_ns\*(rq
 314 will be deferred to section 3.
 315 .NH 2
 316 Connection establishment
 317 .PP
 318 Connection establishment is usually asymmetric,
 319 with one process a \*(lqclient\*(rq and the other a \*(lqserver\*(rq.
 320 The server, when willing to offer its advertised services,
 321 binds a socket to a well-known address associated with the service
 322 and then passively \*(lqlistens\*(rq on its socket.
 323 It is then possible for an unrelated process to rendezvous
 324 with the server.
 325 The client requests services from the server by initiating a
 326 \*(lqconnection\*(rq to the server's socket.
 327 On the client side the \fIconnect\fP call is
 328 used to initiate a connection.  Using the UNIX domain, this
 329 might appear as,
 330 .DS
 331 struct sockaddr_un server;
 332  ...
  333 connect(s, (struct sockaddr *)&server, strlen(server.sun_path) +
  334     sizeof (server.sun_len) + sizeof (server.sun_family));
 335 .DE
 336 while in the Internet domain,
 337 .DS
 338 struct sockaddr_in server;
 339  ...
 340 connect(s, (struct sockaddr *)&server, sizeof (server));
 341 .DE
 342 and in the NS domain,
 343 .DS
 344 struct sockaddr_ns server;
 345  ...
 346 connect(s, (struct sockaddr *)&server, sizeof (server));
 347 .DE
  348 where \fIserver\fP in the examples above would contain either the UNIX
 349 pathname, Internet address and port number, or NS address and
 350 port number of the server to which the
 351 client process wishes to speak.
 352 If the client process's socket is unbound at the time of
 353 the connect call,
 354 the system will automatically select and bind a name to
  355 the socket if necessary; cf. section 5.4.
 356 This is the usual way that local addresses are bound
 357 to a socket.
 358 .PP
 359 An error is returned if the connection was unsuccessful
 360 (any name automatically bound by the system, however, remains).
 361 Otherwise, the socket is associated with the server and
 362 data transfer may begin.  Some of the more common errors returned
 363 when a connection attempt fails are:
 364 .IP ETIMEDOUT
 365 .br
 366 After failing to establish a connection for a period of time,
 367 the system decided there was no point in retrying the
 368 connection attempt any more.  This usually occurs because
 369 the destination host is down, or because problems in
 370 the network resulted in transmissions being lost.
 371 .IP ECONNREFUSED
 372 .br
 373 The host refused service for some reason.
 374 This is usually
 375 due to a server process
 376 not being present at the requested name.
 377 .IP "ENETDOWN or EHOSTDOWN"
 378 .br
 379 These operational errors are
 380 returned based on status information delivered to
 381 the client host by the underlying communication services.
 382 .IP "ENETUNREACH or EHOSTUNREACH"
 383 .br
 384 These operational errors can occur either because the network
 385 or host is unknown (no route to the network or host is present),
 386 or because of status information returned by intermediate
 387 gateways or switching nodes.  Many times the status returned
 388 is not sufficient to distinguish a network being down from a
 389 host being down, in which case the system
 390 indicates the entire network is unreachable.
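.PP
A client can turn the errors listed above into something a user can
act on.  The following sketch is illustrative only: the routine names
and the message texts are assumptions, and EHOSTDOWN in particular is
not defined on every system.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Map the connect errors listed above to a short explanation. */
static const char *connect_failure_hint(int err)
{
	switch (err) {
	case ETIMEDOUT:
		return "timed out: host down or packets lost";
	case ECONNREFUSED:
		return "refused: no server at that name";
	case ENETDOWN:
	case EHOSTDOWN:
		return "network or host reported down";
	case ENETUNREACH:
	case EHOSTUNREACH:
		return "no route to network or host";
	default:
		return strerror(err);	/* anything not listed above */
	}
}

/* Connect `s` to `name`; on failure report why and return -1. */
static int connect_verbose(int s, const struct sockaddr *name, socklen_t len)
{
	if (connect(s, name, len) < 0) {
		fprintf(stderr, "connect: %s\n", connect_failure_hint(errno));
		return -1;
	}
	return 0;
}
```

Note that any name the system bound to the socket during the failed
attempt remains bound afterwards.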
 391 .PP
 392 For the server to receive a client's connection it must perform
 393 two steps after binding its socket.
 394 The first is to indicate a willingness to listen for
 395 incoming connection requests:
 396 .DS
 397 listen(s, 5);
 398 .DE
 399 The second parameter to the \fIlisten\fP call specifies the maximum
 400 number of outstanding connections which may be queued awaiting
 401 acceptance by the server process; this number
 402 may be limited by the system.  Should a connection be
 403 requested while the queue is full, the connection will not be
 404 refused, but rather the individual messages which comprise the
 405 request will be ignored.  This gives a harried server time to
 406 make room in its pending connection queue while the client
 407 retries the connection request.  Had the connection been returned
 408 with the ECONNREFUSED error, the client would be unable to tell
 409 if the server was up or not.  As it is now it is still possible
 410 to get the ETIMEDOUT error back, though this is unlikely.  The
 411 backlog figure supplied with the listen call is currently limited
 412 by the system to a maximum of 5 pending connections on any
 413 one queue.  This avoids the problem of processes hogging system
 414 resources by setting an infinite backlog, then ignoring
 415 all connection requests.
 416 .PP
 417 With a socket marked as listening, a server may \fIaccept\fP
 418 a connection:
 419 .DS
 420 struct sockaddr_in from;
 421  ...
 422 fromlen = sizeof (from);
 423 newsock = accept(s, (struct sockaddr *)&from, &fromlen);
 424 .DE
 425 (For the UNIX domain, \fIfrom\fP would be declared as a
 426 \fIstruct sockaddr_un\fP, and for the NS domain, \fIfrom\fP
 427 would be declared as a \fIstruct sockaddr_ns\fP,
 428 but nothing different would need
 429 to be done as far as \fIfromlen\fP is concerned.  In the examples
 430 which follow, only Internet routines will be discussed.)  A new
 431 descriptor is returned on receipt of a connection (along with
 432 a new socket).  If the server wishes to find out who its client is,
 433 it may supply a buffer for the client socket's name.  The value-result
 434 parameter \fIfromlen\fP is initialized by the server to indicate how
 435 much space is associated with \fIfrom\fP, then modified on return
 436 to reflect the true size of the name.  If the client's name is not
 437 of interest, the second parameter may be a null pointer.
 438 .PP
 439 \fIAccept\fP normally blocks.  That is, \fIaccept\fP
 440 will not return until a connection is available or the system call
 441 is interrupted by a signal to the process.  Further, there is no
 442 way for a process to indicate it will accept connections from only
 443 a specific individual, or individuals.  It is up to the user process
 444 to consider who the connection is from and close down the connection
 445 if it does not wish to speak to the process.  If the server process
 446 wants to accept connections on more than one socket, or wants to avoid blocking
 447 on the accept call, there are alternatives; they will be considered
 448 in section 5.
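.PP
The server and client halves described in this section can be put
together as follows.  This is a sketch, not a complete server: the
three routine names are hypothetical, the loopback address and a
system-chosen port stand in for a real well-known address, and
\fIgetsockname\fP is used only so that the caller can learn the port.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Server: listening stream socket on loopback, system-chosen port. */
static int listen_loopback(unsigned short *port)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;			/* let the system choose */

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		return -1;
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    listen(s, 5) < 0 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) < 0) {
		close(s);
		return -1;
	}
	*port = ntohs(sin.sin_port);
	return s;
}

/* Client: connect an unbound stream socket to loopback:port. */
static int connect_loopback(unsigned short port)
{
	struct sockaddr_in server;
	int s;

	memset(&server, 0, sizeof(server));
	server.sin_family = AF_INET;
	server.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	server.sin_port = htons(port);

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		return -1;
	if (connect(s, (struct sockaddr *)&server, sizeof(server)) < 0) {
		close(s);
		return -1;
	}
	return s;
}

/* Server: accept one connection; store the client's port in *peer_port. */
static int accept_one(int lsock, unsigned short *peer_port)
{
	struct sockaddr_in from;
	socklen_t fromlen = sizeof(from);	/* value-result parameter */
	int newsock = accept(lsock, (struct sockaddr *)&from, &fromlen);

	if (newsock >= 0)
		*peer_port = ntohs(from.sin_port);
	return newsock;
}
```

The descriptor returned by \fIaccept_one\fP refers to the new socket;
the listening socket remains available for further connections.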
 449 .NH 2
 450 Data transfer
 451 .PP
 452 With a connection established, data may begin to flow.  To send
 453 and receive data there are a number of possible calls.
 454 With the peer entity at each end of a connection
 455 anchored, a user can send or receive a message without specifying
  456 the peer.  As one might expect, in this case
  457 the normal \fIread\fP and \fIwrite\fP system calls are usable:
 458 .DS
 459 write(s, buf, sizeof (buf));
 460 read(s, buf, sizeof (buf));
 461 .DE
 462 In addition to \fIread\fP and \fIwrite\fP,
 463 the new calls \fIsend\fP and \fIrecv\fP
 464 may be used:
 465 .DS
 466 send(s, buf, sizeof (buf), flags);
 467 recv(s, buf, sizeof (buf), flags);
 468 .DE
  469 While \fIsend\fP and \fIrecv\fP are virtually identical to
  470 \fIwrite\fP and \fIread\fP, respectively,
 471 the extra \fIflags\fP argument is important.  The flags,
 472 defined in \fI<sys/socket.h>\fP, may be
 473 specified as a non-zero value if one or more
 474 of the following is required:
 475 .DS
 476 .TS
 477 l l.
  478 MSG_OOB	send/receive out of band data
  479 MSG_PEEK	look at data without reading
  480 MSG_DONTROUTE	send data without routing packets
 481 .TE
 482 .DE
 483 Out of band data is a notion specific to stream sockets, and one
 484 which we will not immediately consider.  The option to have data
 485 sent without routing applied to the outgoing packets is currently
 486 used only by the routing table management process, and is
 487 unlikely to be of interest to the casual user.  The ability
 488 to preview data is, however, of interest.  When MSG_PEEK
 489 is specified with a \fIrecv\fP call, any data present is returned
 490 to the user, but treated as still \*(lqunread\*(rq.  That
 491 is, the next \fIread\fP or \fIrecv\fP call applied to the socket will
 492 return the data previously previewed.
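.PP
The effect of MSG_PEEK can be demonstrated with a connected pair of
sockets.  The sketch below is an illustration only; the routine name
\fIpeek_matches_read\fP is hypothetical, and \fIsocketpair\fP is assumed
as a convenient way to obtain two connected stream sockets.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <string.h>
#include <unistd.h>

/*
 * Send `msg` across a connected pair, preview it with MSG_PEEK, then
 * receive it normally.  Returns 1 if both receives saw the same bytes,
 * 0 if they differed, -1 on a system call failure.
 */
static int peek_matches_read(const char *msg)
{
	int sv[2], same = -1;
	char peeked[64], got[64];
	size_t len = strlen(msg);
	ssize_t n1, n2;

	if (len == 0 || len > sizeof(got))
		return -1;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	if (send(sv[0], msg, len, 0) == (ssize_t)len) {
		/* The data is returned but stays queued as unread. */
		n1 = recv(sv[1], peeked, sizeof(peeked), MSG_PEEK);
		/* The same data is returned again and now consumed. */
		n2 = recv(sv[1], got, sizeof(got), 0);
		if (n1 > 0 && n2 > 0)
			same = (n1 == n2 && memcmp(peeked, got, (size_t)n1) == 0);
	}
	close(sv[0]);
	close(sv[1]);
	return same;
}
```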
 493 .NH 2
 494 Discarding sockets
 495 .PP
 496 Once a socket is no longer of interest, it may be discarded
 497 by applying a \fIclose\fP to the descriptor,
 498 .DS
 499 close(s);
 500 .DE
 501 If data is associated with a socket which promises reliable delivery
 502 (e.g. a stream socket) when a close takes place, the system will
 503 continue to attempt to transfer the data.
 504 However, after a fairly long period of
 505 time, if the data is still undelivered, it will be discarded.
 506 Should a user have no use for any pending data, it may
 507 perform a \fIshutdown\fP on the socket prior to closing it.
 508 This call is of the form:
 509 .DS
 510 shutdown(s, how);
 511 .DE
 512 where \fIhow\fP is 0 if the user is no longer interested in reading
 513 data, 1 if no more data will be sent, or 2 if no data is to
 514 be sent or received.
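.PP
As an illustration, shutting down the sending side of a stream socket
causes the peer to see end-of-file.  This sketch assumes
\fIsocketpair\fP for the connected pair and uses SHUT_WR, the symbolic
name later standardized for the \*(lqno more data will be sent\*(rq
value of \fIhow\fP; the routine name is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Shut down the sending side of one end of a connected pair and
 * return what read() on the other end reports (0 means end-of-file).
 */
static long read_after_shutdown(void)
{
	int sv[2];
	char c;
	long n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	if (shutdown(sv[0], SHUT_WR) < 0)	/* no more data will be sent */
		n = -1;
	else
		n = (long)read(sv[1], &c, 1);	/* peer sees end-of-file */
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

The first socket may still be read from after the call; only its
sending side has been shut down.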
 515 .NH 2
 516 Connectionless sockets
 517 .PP
 518 To this point we have been concerned mostly with sockets which
 519 follow a connection oriented model.  However, there is also
 520 support for connectionless interactions typical of the datagram
 521 facilities found in contemporary packet switched networks.
 522 A datagram socket provides a symmetric interface to data
 523 exchange.  While processes are still likely to be client
 524 and server, there is no requirement for connection establishment.
 525 Instead, each message includes the destination address.
 526 .PP
 527 Datagram sockets are created as before.
 528 If a particular local address is needed,
 529 the \fIbind\fP operation must precede the first data transmission.
 530 Otherwise, the system will set the local address and/or port
 531 when data is first sent.
 532 To send data, the \fIsendto\fP primitive is used,
 533 .DS
 534 sendto(s, buf, buflen, flags, (struct sockaddr *)&to, tolen);
 535 .DE
 536 The \fIs\fP, \fIbuf\fP, \fIbuflen\fP, and \fIflags\fP
 537 parameters are used as before.
 538 The \fIto\fP and \fItolen\fP
 539 values are used to indicate the address of the intended recipient of the
 540 message.  When
 541 using an unreliable datagram interface, it is
 542 unlikely that any errors will be reported to the sender.  When
  543 information is present locally to recognize a message that cannot
  544 be delivered (for instance when a network is unreachable),
 545 the call will return \-1 and the global value \fIerrno\fP will
 546 contain an error number.
 547 .PP
 548 To receive messages on an unconnected datagram socket, the
 549 \fIrecvfrom\fP primitive is provided:
 550 .DS
 551 recvfrom(s, buf, buflen, flags, (struct sockaddr *)&from, &fromlen);
 552 .DE
 553 Once again, the \fIfromlen\fP parameter is handled in
 554 a value-result fashion, initially containing the size of
 555 the \fIfrom\fP buffer, and modified on return to indicate
 556 the actual size of the address from which the datagram was received.
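.PP
The two calls can be exercised together on a single machine.  The
following is a sketch under stated assumptions: the routine name
\fIdgram_roundtrip\fP is hypothetical, the loopback address stands in
for a real destination, and the SO_RCVTIMEO socket option, not
discussed in this section, is set only so that the example gives up
rather than blocking forever if the datagram is lost.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Send one datagram from an unbound socket to a receiver bound on the
 * loopback address.  Copies the payload into buf and returns its
 * length, or -1 on failure.  *from_port receives the sender's port.
 */
static long dgram_roundtrip(const char *msg, char *buf, size_t buflen,
    unsigned short *from_port)
{
	struct sockaddr_in to, from;
	socklen_t tolen = sizeof(to), fromlen = sizeof(from);
	struct timeval tv = { 2, 0 };	/* give up after two seconds */
	int rx, tx;
	long n = -1;

	memset(&to, 0, sizeof(to));
	to.sin_family = AF_INET;
	to.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx >= 0 && tx >= 0 &&
	    bind(rx, (struct sockaddr *)&to, sizeof(to)) == 0 &&
	    getsockname(rx, (struct sockaddr *)&to, &tolen) == 0 &&
	    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0 &&
	    sendto(tx, msg, strlen(msg), 0, (struct sockaddr *)&to, tolen) >= 0) {
		/* fromlen is value-result: buffer size in, address size out. */
		n = (long)recvfrom(rx, buf, buflen, 0,
		    (struct sockaddr *)&from, &fromlen);
		if (n >= 0)
			*from_port = ntohs(from.sin_port);
	}
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return n;
}
```

The sending socket is never bound explicitly; the system assigns its
local port when the first datagram is sent, and that port is what the
receiver finds in \fIfrom\fP.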
 557 .PP
 558 In addition to the two calls mentioned above, datagram
 559 sockets may also use the \fIconnect\fP call to associate
 560 a socket with a specific destination address.  In this case, any
 561 data sent on the socket will automatically be addressed
 562 to the connected peer, and only data received from that
 563 peer will be delivered to the user.  Only one connected
 564 address is permitted for each socket at one time;
 565 a second connect will change the destination address,
 566 and a connect to a null address (family AF_UNSPEC)
 567 will disconnect.
 568 Connect requests on datagram sockets return immediately,
 569 as this simply results in the system recording
 570 the peer's address (as compared to a stream socket, where a
 571 connect request initiates establishment of an end to end
 572 connection).  \fIAccept\fP and \fIlisten\fP are not
 573 used with datagram sockets.
 574 .PP
  575 While a datagram socket is connected,
 576 errors from recent \fIsend\fP calls may be returned
 577 asynchronously.
 578 These errors may be reported on subsequent operations
 579 on the socket,
 580 or a special socket option used with \fIgetsockopt\fP, SO_ERROR,
 581 may be used to interrogate the error status.
 582 A \fIselect\fP for reading or writing will return true
 583 when an error indication has been received.
 584 The next operation will return the error, and the error status is cleared.
  585 Other, less
 586 important details of datagram sockets are described
 587 in section 5.
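.PP
A connected datagram socket, together with the SO_ERROR query, might
be used as in the sketch below.  The routine names are hypothetical,
the loopback address stands in for a real peer, and the SO_RCVTIMEO
option is an added assumption that merely keeps the example from
blocking forever.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Fetch (and thereby clear) the pending error on `s`; -1 if the query fails. */
static int pending_error(int s)
{
	int err = 0;
	socklen_t len = sizeof(err);

	if (getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return -1;
	return err;
}

/*
 * Connect a datagram socket to a receiver on the loopback address and
 * send `msg` with plain send().  Returns the number of bytes the
 * receiver got, or -1; *err receives the sender's pending error.
 */
static long connected_dgram(const char *msg, char *buf, size_t buflen, int *err)
{
	struct sockaddr_in peer;
	socklen_t len = sizeof(peer);
	struct timeval tv = { 2, 0 };	/* give up after two seconds */
	int rx, tx;
	long n = -1;

	memset(&peer, 0, sizeof(peer));
	peer.sin_family = AF_INET;
	peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx >= 0 && tx >= 0 &&
	    bind(rx, (struct sockaddr *)&peer, sizeof(peer)) == 0 &&
	    getsockname(rx, (struct sockaddr *)&peer, &len) == 0 &&
	    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0 &&
	    connect(tx, (struct sockaddr *)&peer, len) == 0 &&
	    send(tx, msg, strlen(msg), 0) >= 0) {
		n = (long)recv(rx, buf, buflen, 0);
		*err = pending_error(tx);	/* 0 if nothing was reported */
	}
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return n;
}
```

Here the \fIconnect\fP merely records the peer's address, so no
destination needs to be supplied with each message.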
 588 .NH 2
 589 Input/Output multiplexing
 590 .PP
 591 One last facility often used in developing applications
 592 is the ability to multiplex i/o requests among multiple
 593 sockets and/or files.  This is done using the \fIselect\fP
 594 call:
 595 .DS
 596 #include <sys/time.h>
 597 #include <sys/types.h>
 598  ...
 599
 600 fd_set readmask, writemask, exceptmask;
 601 struct timeval timeout;
 602  ...
 603 select(nfds, &readmask, &writemask, &exceptmask, &timeout);
 604 .DE
 605 \fISelect\fP takes as arguments pointers to three sets, one for
  606 the set of file descriptors from which the caller wishes to
  607 be able to read data, one for those descriptors to which
 608 data is to be written, and one for which exceptional conditions
 609 are pending; out-of-band data is the only
  610 exceptional condition currently implemented by the socket abstraction.
 611 If the user is not interested
 612 in certain conditions (i.e., read, write, or exceptions),
 613 the corresponding argument to the \fIselect\fP should
 614 be a null pointer.
 615 .PP
 616 Each set is actually a structure containing an array of
 617 long integer bit masks; the size of the array is set
 618 by the definition FD_SETSIZE.
  619 The array is
 620 long enough to hold one bit for each of FD_SETSIZE file descriptors.
 621 .PP
 622 The macros FD_SET(\fIfd, &mask\fP) and
 623 FD_CLR(\fIfd, &mask\fP)
 624 have been provided for adding and removing file descriptor
 625 \fIfd\fP in the set \fImask\fP.  The
 626 set should be zeroed before use, and
 627 the macro FD_ZERO(\fI&mask\fP) has been provided
 628 to clear the set \fImask\fP.
 629 The parameter \fInfds\fP in the \fIselect\fP call specifies the range
  630 of file descriptors (i.e. one plus the value of the largest
 631 descriptor) to be examined in a set.
 632 .PP
 633 A timeout value may be specified if the selection
 634 is not to last more than a predetermined period of time.  If
 635 the fields in \fItimeout\fP are set to 0, the selection takes
 636 the form of a
 637 \fIpoll\fP, returning immediately.  If the last parameter is
 638 a null pointer, the selection will block indefinitely*.
 639 .FS
 640 * To be more specific, a return takes place only when a
 641 descriptor is selectable, or when a signal is received by
 642 the caller, interrupting the system call.
 643 .FE
 644 \fISelect\fP normally returns the number of file descriptors selected;
 645 if the \fIselect\fP call returns due to the timeout expiring, then
 646 the value 0 is returned.
 647 If the \fIselect\fP terminates because of an error or interruption,
 648 a \-1 is returned with the error number in \fIerrno\fP,
 649 and with the file descriptor masks unchanged.
 650 .PP
 651 Assuming a successful return, the three sets will
 652 indicate which
 653 file descriptors are ready to be read from, written to, or
 654 have exceptional conditions pending.
 655 The status of a file descriptor in a select mask may be
 656 tested with the \fIFD_ISSET(fd, &mask)\fP macro, which
 657 returns a non-zero value if \fIfd\fP is a member of the set
 658 \fImask\fP, and 0 if it is not.
 659 .PP
 660 To determine if there are connections waiting
 661 on a socket to be used with an \fIaccept\fP call,
 662 \fIselect\fP can be used, followed by
  663 an \fIFD_ISSET(fd, &mask)\fP macro to check for read
 664 readiness on the appropriate socket.  If \fIFD_ISSET\fP
 665 returns a non-zero value, indicating permission to read, then a
 666 connection is pending on the socket.
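.PP
Such a test might be packaged as shown below.  This is a sketch; the
routine name \fIconnection_pending\fP and its millisecond argument are
assumptions made for the example, and the header <sys/select.h> is
included because more recent systems declare \fIselect\fP there.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>

/*
 * Wait up to `ms` milliseconds for the socket `lsock` to become
 * readable, which for a listening socket means a connection is queued.
 * Returns 1 if accept() should not block, 0 on timeout, -1 on error.
 */
static int connection_pending(int lsock, long ms)
{
	fd_set readmask;
	struct timeval wait;
	int nb;

	FD_ZERO(&readmask);
	FD_SET(lsock, &readmask);
	wait.tv_sec = ms / 1000;
	wait.tv_usec = (ms % 1000) * 1000;

	/* nfds is one plus the largest descriptor examined. */
	nb = select(lsock + 1, &readmask, (fd_set *)0, (fd_set *)0, &wait);
	if (nb < 0)
		return -1;
	return nb > 0 && FD_ISSET(lsock, &readmask);
}
```

With \fIms\fP set to 0 the call is a poll and returns immediately.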
 667 .PP
 668 As an example, to read data from two sockets, \fIs1\fP and
 669 \fIs2\fP as it is available from each and with a one-second
 670 timeout, the following code
 671 might be used:
 672 .DS
 673 #include <sys/time.h>
 674 #include <sys/types.h>
 675  ...
 676 fd_set read_template;
 677 struct timeval wait;
 678  ...
 679 for (;;) {
 680         wait.tv_sec = 1;                /* one second */
 681         wait.tv_usec = 0;
 682
 683         FD_ZERO(&read_template);
 684
 685         FD_SET(s1, &read_template);
 686         FD_SET(s2, &read_template);
 687
 688         nb = select(FD_SETSIZE, &read_template, (fd_set *) 0, (fd_set *) 0, &wait);
 689         if (nb <= 0) {
 690                 \fIAn error occurred during the \fPselect\fI, or
 691                 the \fPselect\fI timed out.\fP
 692         }
 693
 694         if (FD_ISSET(s1, &read_template)) {
 695                 \fISocket #1 is ready to be read from.\fP
 696         }
 697
 698         if (FD_ISSET(s2, &read_template)) {
 699                 \fISocket #2 is ready to be read from.\fP
 700         }
 701 }
 702 .DE
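.PP
The outline above leaves the handling of each case to the reader.  One
way of filling it in, for a single pass through the loop, is sketched
below.  The routine name \fIwait_readable\fP and the READY_S1 and
READY_S2 constants are hypothetical; the sketch also passes one plus
the larger descriptor as \fInfds\fP rather than FD_SETSIZE, and returns
before testing the set when \fIselect\fP fails.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>

#define READY_S1 1
#define READY_S2 2

/*
 * Wait up to `seconds` for s1 or s2 to become readable.  Returns a bit
 * mask of READY_S1 and READY_S2, 0 on timeout, or -1 on error.
 */
static int wait_readable(int s1, int s2, long seconds)
{
	fd_set read_template;
	struct timeval wait;
	int nfds = (s1 > s2 ? s1 : s2) + 1;	/* largest descriptor plus one */
	int nb, ready = 0;

	wait.tv_sec = seconds;
	wait.tv_usec = 0;
	FD_ZERO(&read_template);
	FD_SET(s1, &read_template);
	FD_SET(s2, &read_template);

	nb = select(nfds, &read_template, (fd_set *)0, (fd_set *)0, &wait);
	if (nb < 0)
		return -1;	/* the set does not describe readiness */
	if (nb == 0)
		return 0;	/* timed out */
	if (FD_ISSET(s1, &read_template))
		ready |= READY_S1;
	if (FD_ISSET(s2, &read_template))
		ready |= READY_S2;
	return ready;
}
```

A caller would invoke the routine inside its own loop, rebuilding
nothing itself, since the set and the timeout are reinitialized on
every call.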
 703 .PP
 704 In 4.2, the arguments to \fIselect\fP were pointers to integers
 705 instead of pointers to \fIfd_set\fPs.  This type of call
 706 will still work as long as the number of file descriptors
 707 being examined is less than the number of bits in an
 708 integer; however, the methods illustrated above should
 709 be used in all current programs.
 710 .PP
 711 \fISelect\fP provides a synchronous multiplexing scheme.
 712 Asynchronous notification of output completion, input availability,
 713 and exceptional conditions is possible through use of the
 714 SIGIO and SIGURG signals described in section 5.