   1 .\" Copyright (c) 1986, 1993
   2 .\"     The Regents of the University of California.  All rights reserved.
   3 .\"
   4 .\" Redistribution and use in source and binary forms, with or without
   5 .\" modification, are permitted provided that the following conditions
   6 .\" are met:
   7 .\" 1. Redistributions of source code must retain the above copyright
   8 .\"    notice, this list of conditions and the following disclaimer.
   9 .\" 2. Redistributions in binary form must reproduce the above copyright
  10 .\"    notice, this list of conditions and the following disclaimer in the
  11 .\"    documentation and/or other materials provided with the distribution.
  12 .\" 3. Neither the name of the University nor the names of its contributors
  13 .\"    may be used to endorse or promote products derived from this software
  14 .\"    without specific prior written permission.
  15 .\"
  16 .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  17 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  18 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  19 .\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  20 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  21 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  22 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  23 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  24 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  25 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  26 .\" SUCH DAMAGE.
  27 .\"
  28 .\"     @(#)2.t 8.1 (Berkeley) 8/14/93
  29 .\"
  30 .\".ds RH "Basics
  31 .bp
  32 .nr H1 2
  33 .nr H2 0
  34 .\" The next line is a major hack to get around internal changes in the groff
  35 .\" implementation of .NH.
  36 .nr nh*hl 1
  37 .bp
  38 .LG
  39 .B
  40 .ce
  41 2. BASICS
  42 .sp 2
  43 .R
  44 .NL
  45 .PP
  46 The basic building block for communication is the \fIsocket\fP.
  47 A socket is an endpoint of communication to which a name may
  48 be \fIbound\fP.  Each socket in use has a \fItype\fP
  49 and one or more associated processes.  Sockets exist within
  50 \fIcommunication domains\fP.
  51 A communication domain is an
  52 abstraction introduced to bundle common properties of
  53 processes communicating through sockets.
  54 One such property is the scheme used to name sockets.  For
  55 example, in the UNIX communication domain sockets are
  56 named with UNIX path names; e.g. a
  57 socket may be named \*(lq/dev/foo\*(rq.  Sockets normally
  58 exchange data only with
  59 sockets in the same domain (it may be possible to cross domain
  60 boundaries, but only if some translation process is
  61 performed).  The
  62 4.4BSD IPC facilities support four separate communication domains:
  63 the UNIX domain, for on-system communication;
  64 the Internet domain, which is used by
  65 processes which communicate
  66 using the Internet standard communication protocols;
  67 the NS domain, which is used by processes which
  68 communicate using the Xerox standard communication
  69 protocols*;
  70 .FS
  71 * See \fIInternet Transport Protocols\fP, Xerox System Integration
  72 Standard (XSIS)028112 for more information.  This document is
  73 almost a necessity for one trying to write NS applications.
  74 .FE
  75 and the ISO OSI protocols, which are not documented in this tutorial.
  76 The underlying communication
  77 facilities provided by these domains have a significant influence
  78 on the internal system implementation as well as the interface to
  79 socket facilities available to a user.  An example of the
  80 latter is that a socket \*(lqoperating\*(rq in the UNIX domain
  81 sees a subset of the error conditions which are possible
  82 when operating in the Internet (or NS) domain.
  83 .NH 2
  84 Socket types
  85 .PP
  86 Sockets are
  87 typed according to the communication properties visible to a
  88 user.
  89 Processes are presumed to communicate only between sockets of
  90 the same type, although there is
  91 nothing that prevents communication between sockets of different
  92 types should the underlying communication
  93 protocols support this.
  94 .PP
  95 Four types of sockets currently are available to a user.
  96 A \fIstream\fP socket provides for the bidirectional, reliable,
  97 sequenced, and unduplicated flow of data without record boundaries.
  98 Aside from the bidirectionality of data flow, a pair of connected
  99 stream sockets provides an interface nearly identical to that of pipes\(dg.
 100 .FS
 101 \(dg In the UNIX domain, in fact, the semantics are identical and,
 102 as one might expect, pipes have been implemented internally
 103 as simply a pair of connected stream sockets.
 104 .FE
 105 .PP
 106 A \fIdatagram\fP socket supports bidirectional flow of data which
 107 is not promised to be sequenced, reliable, or unduplicated.
That is, a process
receiving messages on a datagram socket may find messages duplicated,
and, possibly,
in an order different from the order in which they were sent.
 112 An important characteristic of a datagram
 113 socket is that record boundaries in data are preserved.  Datagram
 114 sockets closely model the facilities found in many contemporary
 115 packet switched networks such as the Ethernet.
 116 .PP
 117 A \fIraw\fP socket provides users access to
 118 the underlying communication
 119 protocols which support socket abstractions.
 120 These sockets are normally datagram oriented, though their
 121 exact characteristics are dependent on the interface provided by
 122 the protocol.  Raw sockets are not intended for the general user; they
 123 have been provided mainly for those interested in developing new
 124 communication protocols, or for gaining access to some of the more
 125 esoteric facilities of an existing protocol.  The use of raw sockets
 126 is considered in section 5.
 127 .PP
 128 A \fIsequenced packet\fP socket is similar to a stream socket,
 129 with the exception that record boundaries are preserved.  This
 130 interface is provided only as part of the NS socket abstraction,
 131 and is very important in most serious NS applications.
Sequenced-packet sockets allow the user to manipulate the
SPP or IDP headers on a packet or a group of packets either
by writing a prototype header along with whatever data is
to be sent, or by specifying a default header to be used with
all outgoing data, and allow the user to receive the headers
on incoming packets.  The use of these options is considered in
section 5.
 139 .PP
 140 Another potential socket type which has interesting properties is
 141 the \fIreliably delivered
 142 message\fP socket.
 143 The reliably delivered message socket has
 144 similar properties to a datagram socket, but with
 145 reliable delivery.  There is currently no support for this
 146 type of socket, but a reliably delivered message protocol
 147 similar to Xerox's Packet Exchange Protocol (PEX) may be
 148 simulated at the user level.  More information on this topic
 149 can be found in section 5.
 150 .NH 2
 151 Socket creation
 152 .PP
 153 To create a socket the \fIsocket\fP system call is used:
 154 .DS
 155 s = socket(domain, type, protocol);
 156 .DE
 157 This call requests that the system create a socket in the specified
 158 \fIdomain\fP and of the specified \fItype\fP.  A particular protocol may
 159 also be requested.  If the protocol is left unspecified (a value
 160 of 0), the system will select an appropriate protocol from those
 161 protocols which comprise the communication domain and which
 162 may be used to support the requested socket type.  The user is
 163 returned a descriptor (a small integer number) which may be used
 164 in later system calls which operate on sockets.  The domain is specified as
 165 one of the manifest constants defined in the file <\fIsys/socket.h\fP>.
 166 For the UNIX domain the constant is AF_UNIX*;  for the Internet
 167 .FS
 168 * The manifest constants are named AF_whatever as they indicate
 169 the ``address format'' to use in interpreting names.
 170 .FE
 171 domain AF_INET; and for the NS domain, AF_NS.
 172 The socket types are also defined in this file
 173 and one of SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, or SOCK_SEQPACKET
 174 must be specified.
 175 To create a stream socket in the Internet domain the following
 176 call might be used:
 177 .DS
 178 s = socket(AF_INET, SOCK_STREAM, 0);
 179 .DE
 180 This call would result in a stream socket being created with the TCP
 181 protocol providing the underlying communication support.  To
 182 create a datagram socket for on-machine use the call might
 183 be:
 184 .DS
 185 s = socket(AF_UNIX, SOCK_DGRAM, 0);
 186 .DE
 187 .PP
The default protocol (used when the \fIprotocol\fP argument to the
\fIsocket\fP call is 0) should be correct for almost every
situation.  However, it is possible to specify a protocol
other than the default; this will be covered in
section 5.
 193 .PP
 194 There are several reasons a socket call may fail.  Aside from
 195 the rare occurrence of lack of memory (ENOBUFS), a socket
 196 request may fail due to a request for an unknown protocol
 197 (EPROTONOSUPPORT), or a request for a type of socket for
 198 which there is no supporting protocol (EPROTOTYPE).
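.PP
The error handling described above can be sketched as follows.
This is a minimal sketch rather than a definitive implementation:
the helper name \fIopen_socket\fP is hypothetical, and the use of
\fIperror\fP from <\fIstdio.h\fP> to report the failure is an assumption
about the C library, not something prescribed by this tutorial.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * open_socket is a hypothetical helper: create a socket in the given
 * domain and of the given type, using the default protocol (0).
 * It returns the new descriptor, or -1 after reporting the reason
 * for the failure on the standard error output.
 */
int
open_socket(int domain, int type)
{
	int s;

	s = socket(domain, type, 0);
	if (s < 0) {
		/* errno holds the cause, e.g. EPROTONOSUPPORT or ENOBUFS */
		perror("socket");
		return (-1);
	}
	return (s);
}
```

A caller might then write \fIs = open_socket(AF_UNIX, SOCK_STREAM)\fP
and test the result before using the descriptor.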
 199 .NH 2
 200 Binding local names
 201 .PP
 202 A socket is created without a name.  Until a name is bound
 203 to a socket, processes have no way to reference it and, consequently,
 204 no messages may be received on it.
 205 Communicating processes are bound
 206 by an \fIassociation\fP.  In the Internet and NS domains,
 207 an association
 208 is composed of local and foreign
 209 addresses, and local and foreign ports,
 210 while in the UNIX domain, an association is composed of
 211 local and foreign path names (the phrase ``foreign pathname''
 212 means a pathname created by a foreign process, not a pathname
 213 on a foreign system).
 214 In most domains, associations must be unique.
 215 In the Internet domain there
 216 may never be duplicate <protocol, local address, local port, foreign
 217 address, foreign port> tuples.  UNIX domain sockets need not always
 218 be bound to a name, but when bound
 219 there may never be duplicate <protocol, local pathname, foreign
 220 pathname> tuples.
 221 The pathnames may not refer to files
 222 already existing on the system
 223 in 4.3; the situation may change in future releases.
 224 .PP
 225 The \fIbind\fP system call allows a process to specify half of
 226 an association, <local address, local port>
 227 (or <local pathname>), while the \fIconnect\fP
 228 and \fIaccept\fP primitives are used to complete a socket's association.
 229 .PP
 230 In the Internet domain,
 231 binding names to sockets can be fairly complex.
 232 Fortunately, it is usually not necessary to specifically bind an
 233 address and port number to a socket, because the
 234 \fIconnect\fP and \fIsend\fP calls will automatically
 235 bind an appropriate address if they are used with an
 236 unbound socket.  The process of binding names to NS
 237 sockets is similar in most ways to that of
 238 binding names to Internet sockets.
 239 .PP
 240 The \fIbind\fP system call is used as follows:
 241 .DS
 242 bind(s, name, namelen);
 243 .DE
 244 The bound name is a variable length byte string which is interpreted
 245 by the supporting protocol(s).  Its interpretation may vary from
 246 communication domain to communication domain (this is one of
 247 the properties which comprise the \*(lqdomain\*(rq).
 248 As mentioned, in the
 249 Internet domain names contain an Internet address and port
 250 number.  NS domain names contain an NS address and
 251 port number.  In the UNIX domain, names contain a path name and
 252 a family, which is always AF_UNIX.  If one wanted to bind
 253 the name \*(lq/tmp/foo\*(rq to a UNIX domain socket, the
 254 following code would be used*:
 255 .FS
 256 * Note that, although the tendency here is to call the \*(lqaddr\*(rq
 257 structure \*(lqsun\*(rq, doing so would cause problems if the code
 258 were ever ported to a Sun workstation.
 259 .FE
 260 .DS
 261 #include <sys/un.h>
 262  ...
 263 struct sockaddr_un addr;
 264  ...
 265 strcpy(addr.sun_path, "/tmp/foo");
 266 addr.sun_family = AF_UNIX;
 267 bind(s, (struct sockaddr *) &addr, strlen(addr.sun_path) +
 268     sizeof (addr.sun_len) + sizeof (addr.sun_family));
 269 .DE
 270 Note that in determining the size of a UNIX domain address null
 271 bytes are not counted, which is why \fIstrlen\fP is used.  In
 272 the current implementation of UNIX domain IPC,
 273 the file name
 274 referred to in \fIaddr.sun_path\fP is created as a socket
 275 in the system file space.
 276 The caller must, therefore, have
 277 write permission in the directory where
 278 \fIaddr.sun_path\fP is to reside, and this file should be deleted by the
 279 caller when it is no longer needed.  Future versions of 4BSD
 280 may not create this file.
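.PP
Because the bound name appears in the file system, a program
usually has to remove a stale name before binding and check the
result of the call.
The following is a minimal sketch of that housekeeping, under stated
assumptions: the helper name \fIbind_unix\fP is hypothetical, and the
whole \fIsockaddr_un\fP structure, zero filled, is passed as the name
so that the sketch does not depend on the presence of a
\fIsun_len\fP field.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * bind_unix is a hypothetical helper: create a UNIX domain socket of
 * the given type and bind it to path.  It returns the descriptor, or
 * -1 with errno set.
 */
int
bind_unix(const char *path, int type)
{
	struct sockaddr_un addr;
	int s, saved;

	if (strlen(path) >= sizeof (addr.sun_path)) {
		errno = ENAMETOOLONG;
		return (-1);
	}
	s = socket(AF_UNIX, type, 0);
	if (s < 0)
		return (-1);

	/* Zero the structure so that the path is null terminated. */
	memset(&addr, 0, sizeof (addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	/* Remove a name left by an earlier run; bind would fail on it. */
	(void) unlink(path);

	/* Assumption: the size of the whole structure is an acceptable length. */
	if (bind(s, (struct sockaddr *) &addr, sizeof (addr)) < 0) {
		saved = errno;
		(void) close(s);
		errno = saved;
		return (-1);
	}
	return (s);
}
```

The caller remains responsible for removing the name with \fIunlink\fP
once the socket is no longer needed.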
 281 .PP
 282 In binding an Internet address things become more
 283 complicated.  The actual call is similar,
 284 .DS
 285 #include <sys/types.h>
 286 #include <netinet/in.h>
 287  ...
 288 struct sockaddr_in sin;
 289  ...
 290 bind(s, (struct sockaddr *) &sin, sizeof (sin));
 291 .DE
 292 but the selection of what to place in the address \fIsin\fP
 293 requires some discussion.  We will come back to the problem
 294 of formulating Internet addresses in section 3 when
 295 the library routines used in name resolution are discussed.
 296 .PP
 297 Binding an NS address to a socket is even more
 298 difficult,
 299 especially since the Internet library routines do not
 300 work with NS hostnames.  The actual call is again similar:
 301 .DS
 302 #include <sys/types.h>
 303 #include <netns/ns.h>
 304  ...
 305 struct sockaddr_ns sns;
 306  ...
 307 bind(s, (struct sockaddr *) &sns, sizeof (sns));
 308 .DE
 309 Again, discussion of what to place in a \*(lqstruct sockaddr_ns\*(rq
 310 will be deferred to section 3.
 311 .NH 2
 312 Connection establishment
 313 .PP
 314 Connection establishment is usually asymmetric,
 315 with one process a \*(lqclient\*(rq and the other a \*(lqserver\*(rq.
 316 The server, when willing to offer its advertised services,
 317 binds a socket to a well-known address associated with the service
 318 and then passively \*(lqlistens\*(rq on its socket.
 319 It is then possible for an unrelated process to rendezvous
 320 with the server.
 321 The client requests services from the server by initiating a
 322 \*(lqconnection\*(rq to the server's socket.
 323 On the client side the \fIconnect\fP call is
 324 used to initiate a connection.  Using the UNIX domain, this
 325 might appear as,
 326 .DS
struct sockaddr_un server;
 ...
connect(s, (struct sockaddr *)&server, strlen(server.sun_path) +
    sizeof (server.sun_len) + sizeof (server.sun_family));
 331 .DE
 332 while in the Internet domain,
 333 .DS
 334 struct sockaddr_in server;
 335  ...
 336 connect(s, (struct sockaddr *)&server, sizeof (server));
 337 .DE
 338 and in the NS domain,
 339 .DS
 340 struct sockaddr_ns server;
 341  ...
 342 connect(s, (struct sockaddr *)&server, sizeof (server));
 343 .DE
 344 where \fIserver\fP in the example above would contain either the UNIX
 345 pathname, Internet address and port number, or NS address and
 346 port number of the server to which the
 347 client process wishes to speak.
If the client process's socket is unbound at the time of
the connect call,
the system will automatically select and bind a name to
the socket if necessary; cf. section 5.4.
 352 This is the usual way that local addresses are bound
 353 to a socket.
 354 .PP
 355 An error is returned if the connection was unsuccessful
 356 (any name automatically bound by the system, however, remains).
 357 Otherwise, the socket is associated with the server and
 358 data transfer may begin.  Some of the more common errors returned
 359 when a connection attempt fails are:
 360 .IP ETIMEDOUT
 361 .br
 362 After failing to establish a connection for a period of time,
 363 the system decided there was no point in retrying the
 364 connection attempt any more.  This usually occurs because
 365 the destination host is down, or because problems in
 366 the network resulted in transmissions being lost.
 367 .IP ECONNREFUSED
 368 .br
 369 The host refused service for some reason.
 370 This is usually
 371 due to a server process
 372 not being present at the requested name.
 373 .IP "ENETDOWN or EHOSTDOWN"
 374 .br
 375 These operational errors are
 376 returned based on status information delivered to
 377 the client host by the underlying communication services.
 378 .IP "ENETUNREACH or EHOSTUNREACH"
 379 .br
 380 These operational errors can occur either because the network
 381 or host is unknown (no route to the network or host is present),
 382 or because of status information returned by intermediate
 383 gateways or switching nodes.  Many times the status returned
 384 is not sufficient to distinguish a network being down from a
 385 host being down, in which case the system
 386 indicates the entire network is unreachable.
 387 .PP
 388 For the server to receive a client's connection it must perform
 389 two steps after binding its socket.
 390 The first is to indicate a willingness to listen for
 391 incoming connection requests:
 392 .DS
 393 listen(s, 5);
 394 .DE
 395 The second parameter to the \fIlisten\fP call specifies the maximum
 396 number of outstanding connections which may be queued awaiting
 397 acceptance by the server process; this number
 398 may be limited by the system.  Should a connection be
 399 requested while the queue is full, the connection will not be
 400 refused, but rather the individual messages which comprise the
 401 request will be ignored.  This gives a harried server time to
 402 make room in its pending connection queue while the client
 403 retries the connection request.  Had the connection been returned
 404 with the ECONNREFUSED error, the client would be unable to tell
 405 if the server was up or not.  As it is now it is still possible
 406 to get the ETIMEDOUT error back, though this is unlikely.  The
 407 backlog figure supplied with the listen call is currently limited
 408 by the system to a maximum of 5 pending connections on any
 409 one queue.  This avoids the problem of processes hogging system
 410 resources by setting an infinite backlog, then ignoring
 411 all connection requests.
 412 .PP
 413 With a socket marked as listening, a server may \fIaccept\fP
 414 a connection:
 415 .DS
 416 struct sockaddr_in from;
 417  ...
 418 fromlen = sizeof (from);
 419 newsock = accept(s, (struct sockaddr *)&from, &fromlen);
 420 .DE
 421 (For the UNIX domain, \fIfrom\fP would be declared as a
 422 \fIstruct sockaddr_un\fP, and for the NS domain, \fIfrom\fP
 423 would be declared as a \fIstruct sockaddr_ns\fP,
 424 but nothing different would need
 425 to be done as far as \fIfromlen\fP is concerned.  In the examples
 426 which follow, only Internet routines will be discussed.)  A new
 427 descriptor is returned on receipt of a connection (along with
 428 a new socket).  If the server wishes to find out who its client is,
 429 it may supply a buffer for the client socket's name.  The value-result
 430 parameter \fIfromlen\fP is initialized by the server to indicate how
 431 much space is associated with \fIfrom\fP, then modified on return
 432 to reflect the true size of the name.  If the client's name is not
 433 of interest, the second parameter may be a null pointer.
 434 .PP
 435 \fIAccept\fP normally blocks.  That is, \fIaccept\fP
 436 will not return until a connection is available or the system call
 437 is interrupted by a signal to the process.  Further, there is no
 438 way for a process to indicate it will accept connections from only
 439 a specific individual, or individuals.  It is up to the user process
 440 to consider who the connection is from and close down the connection
 441 if it does not wish to speak to the process.  If the server process
 442 wants to accept connections on more than one socket, or wants to avoid blocking
 443 on the accept call, there are alternatives; they will be considered
 444 in section 5.
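.PP
The steps of this section (bind, listen, connect, accept) can be
put together in one short sketch.
It is an illustration under stated assumptions, not a definitive
implementation: the helper names \fIunix_name\fP and \fIrendezvous\fP
are hypothetical, the type \fIsocklen_t\fP is assumed to be provided by
<\fIsys/socket.h\fP>, and the sketch assumes that a connect to a
listening UNIX domain stream socket returns as soon as the request has
been queued, which lets a single process play both roles.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill in a UNIX domain name; hypothetical helper. */
static void
unix_name(struct sockaddr_un *addr, const char *path)
{
	memset(addr, 0, sizeof (*addr));
	addr->sun_family = AF_UNIX;
	strncpy(addr->sun_path, path, sizeof (addr->sun_path) - 1);
}

/*
 * rendezvous is a hypothetical helper that acts as server and client.
 * On success it returns 0, storing the client's descriptor in *client
 * and the descriptor returned by accept in *accepted.
 */
int
rendezvous(const char *path, int *client, int *accepted)
{
	struct sockaddr_un name, from;
	socklen_t fromlen;
	int ls, cs, as;

	unix_name(&name, path);
	(void) unlink(path);

	/* Server: bind a well-known name and listen on it. */
	ls = socket(AF_UNIX, SOCK_STREAM, 0);
	if (ls < 0)
		return (-1);
	if (bind(ls, (struct sockaddr *) &name, sizeof (name)) < 0 ||
	    listen(ls, 5) < 0) {
		(void) close(ls);
		return (-1);
	}

	/* Client: initiate a connection to the server's name. */
	cs = socket(AF_UNIX, SOCK_STREAM, 0);
	if (cs < 0 ||
	    connect(cs, (struct sockaddr *) &name, sizeof (name)) < 0) {
		if (cs >= 0)
			(void) close(cs);
		(void) close(ls);
		(void) unlink(path);
		return (-1);
	}

	/* Server: accept yields a new descriptor for this connection. */
	fromlen = sizeof (from);
	as = accept(ls, (struct sockaddr *) &from, &fromlen);
	(void) close(ls);		/* no further connections wanted */
	(void) unlink(path);		/* the name is no longer needed */
	if (as < 0) {
		(void) close(cs);
		return (-1);
	}
	*client = cs;
	*accepted = as;
	return (0);
}
```

Data written on \fI*client\fP can then be read from \fI*accepted\fP,
and vice versa.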
 445 .NH 2
 446 Data transfer
 447 .PP
 448 With a connection established, data may begin to flow.  To send
 449 and receive data there are a number of possible calls.
 450 With the peer entity at each end of a connection
 451 anchored, a user can send or receive a message without specifying
the peer.  As one might expect, in this case
the normal \fIread\fP and \fIwrite\fP system calls are usable:
 454 .DS
 455 write(s, buf, sizeof (buf));
 456 read(s, buf, sizeof (buf));
 457 .DE
 458 In addition to \fIread\fP and \fIwrite\fP,
 459 the new calls \fIsend\fP and \fIrecv\fP
 460 may be used:
 461 .DS
 462 send(s, buf, sizeof (buf), flags);
 463 recv(s, buf, sizeof (buf), flags);
 464 .DE
While \fIsend\fP and \fIrecv\fP are virtually identical to
\fIwrite\fP and \fIread\fP,
 467 the extra \fIflags\fP argument is important.  The flags,
 468 defined in \fI<sys/socket.h>\fP, may be
 469 specified as a non-zero value if one or more
 470 of the following is required:
 471 .DS
 472 .TS
 473 l l.
 474 MSG_OOB send/receive out of band data
 475 MSG_PEEK        look at data without reading
 476 MSG_DONTROUTE   send data without routing packets
 477 .TE
 478 .DE
 479 Out of band data is a notion specific to stream sockets, and one
 480 which we will not immediately consider.  The option to have data
 481 sent without routing applied to the outgoing packets is currently
 482 used only by the routing table management process, and is
 483 unlikely to be of interest to the casual user.  The ability
 484 to preview data is, however, of interest.  When MSG_PEEK
 485 is specified with a \fIrecv\fP call, any data present is returned
 486 to the user, but treated as still \*(lqunread\*(rq.  That
 487 is, the next \fIread\fP or \fIrecv\fP call applied to the socket will
 488 return the data previously previewed.
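.PP
The effect of MSG_PEEK can be sketched as follows.
This is an illustration only.
The helper name \fIpeek_then_read\fP is hypothetical, and the sketch
assumes the \fIsocketpair\fP call, which is not described in this
section, as a convenient way to obtain two already connected stream
sockets in one process.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * peek_then_read is a hypothetical helper.  It sends msg over one end
 * of a connected pair, previews it at the other end with MSG_PEEK,
 * and then reads it for real.  Both buffers must hold len bytes.
 * It returns the byte count of the final recv, or -1 on error.
 */
ssize_t
peek_then_read(const char *msg, char *peeked, char *consumed, size_t len)
{
	int sv[2];
	ssize_t n;

	/* Assumption: socketpair supplies two connected stream sockets. */
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return (-1);

	n = send(sv[0], msg, strlen(msg), 0);
	if (n >= 0)
		n = recv(sv[1], peeked, len, MSG_PEEK);	/* data stays queued */
	if (n >= 0)
		n = recv(sv[1], consumed, len, 0);	/* same data again */

	(void) close(sv[0]);
	(void) close(sv[1]);
	return (n);
}
```

After a successful call, \fIpeeked\fP and \fIconsumed\fP hold the same
bytes, since the first \fIrecv\fP left the data \*(lqunread\*(rq.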
 489 .NH 2
 490 Discarding sockets
 491 .PP
 492 Once a socket is no longer of interest, it may be discarded
 493 by applying a \fIclose\fP to the descriptor,
 494 .DS
 495 close(s);
 496 .DE
 497 If data is associated with a socket which promises reliable delivery
 498 (e.g. a stream socket) when a close takes place, the system will
 499 continue to attempt to transfer the data.
 500 However, after a fairly long period of
 501 time, if the data is still undelivered, it will be discarded.
 502 Should a user have no use for any pending data, it may
 503 perform a \fIshutdown\fP on the socket prior to closing it.
 504 This call is of the form:
 505 .DS
 506 shutdown(s, how);
 507 .DE
 508 where \fIhow\fP is 0 if the user is no longer interested in reading
 509 data, 1 if no more data will be sent, or 2 if no data is to
 510 be sent or received.
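.PP
A common use of \fIshutdown\fP with a \fIhow\fP of 1 is to tell the
peer that no more data will follow while still keeping the socket
open for reading.
The following minimal sketch illustrates the effect; the helper name
\fIhalf_close\fP is hypothetical, and \fIsocketpair\fP is assumed as a
way of obtaining two connected stream sockets in a single process.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * half_close is a hypothetical helper.  One end of a connected pair
 * writes three bytes and then shuts down its sending side (how = 1).
 * The peer reads the data and then sees end-of-file.  The helper
 * returns the result of the peer's second read (0 at end-of-file),
 * or -1 on error.
 */
ssize_t
half_close(void)
{
	int sv[2];
	char buf[16];
	ssize_t n = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return (-1);

	if (write(sv[0], "bye", 3) == 3 &&
	    shutdown(sv[0], 1) == 0 &&			/* no more data will be sent */
	    read(sv[1], buf, sizeof (buf)) == 3)	/* queued data still arrives */
		n = read(sv[1], buf, sizeof (buf));	/* end-of-file */

	(void) close(sv[0]);
	(void) close(sv[1]);
	return (n);
}
```

Present-day systems typically also provide the names SHUT_RD, SHUT_WR
and SHUT_RDWR for the values 0, 1 and 2.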
 511 .NH 2
 512 Connectionless sockets
 513 .PP
 514 To this point we have been concerned mostly with sockets which
 515 follow a connection oriented model.  However, there is also
 516 support for connectionless interactions typical of the datagram
 517 facilities found in contemporary packet switched networks.
 518 A datagram socket provides a symmetric interface to data
 519 exchange.  While processes are still likely to be client
 520 and server, there is no requirement for connection establishment.
 521 Instead, each message includes the destination address.
 522 .PP
 523 Datagram sockets are created as before.
 524 If a particular local address is needed,
 525 the \fIbind\fP operation must precede the first data transmission.
 526 Otherwise, the system will set the local address and/or port
 527 when data is first sent.
 528 To send data, the \fIsendto\fP primitive is used,
 529 .DS
 530 sendto(s, buf, buflen, flags, (struct sockaddr *)&to, tolen);
 531 .DE
 532 The \fIs\fP, \fIbuf\fP, \fIbuflen\fP, and \fIflags\fP
 533 parameters are used as before.
 534 The \fIto\fP and \fItolen\fP
 535 values are used to indicate the address of the intended recipient of the
 536 message.  When
 537 using an unreliable datagram interface, it is
 538 unlikely that any errors will be reported to the sender.  When
 539 information is present locally to recognize a message that can
 540 not be delivered (for instance when a network is unreachable),
 541 the call will return \-1 and the global value \fIerrno\fP will
 542 contain an error number.
 543 .PP
 544 To receive messages on an unconnected datagram socket, the
 545 \fIrecvfrom\fP primitive is provided:
 546 .DS
 547 recvfrom(s, buf, buflen, flags, (struct sockaddr *)&from, &fromlen);
 548 .DE
 549 Once again, the \fIfromlen\fP parameter is handled in
 550 a value-result fashion, initially containing the size of
 551 the \fIfrom\fP buffer, and modified on return to indicate
 552 the actual size of the address from which the datagram was received.
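.PP
A complete exchange with \fIsendto\fP and \fIrecvfrom\fP might look
like the following minimal sketch.
It uses UNIX domain datagram sockets so that it can run on a single
machine; the helper names \fIbound_dgram\fP and \fIdgram_exchange\fP
are hypothetical, and the type \fIsocklen_t\fP is assumed to be
provided by <\fIsys/socket.h\fP>.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Create a UNIX domain datagram socket bound to path; hypothetical. */
static int
bound_dgram(const char *path, struct sockaddr_un *name)
{
	int s;

	memset(name, 0, sizeof (*name));
	name->sun_family = AF_UNIX;
	strncpy(name->sun_path, path, sizeof (name->sun_path) - 1);
	(void) unlink(path);

	s = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (s >= 0 && bind(s, (struct sockaddr *) name, sizeof (*name)) < 0) {
		(void) close(s);
		s = -1;
	}
	return (s);
}

/*
 * dgram_exchange is a hypothetical helper.  It sends msg from a socket
 * bound to spath to a socket bound to rpath and receives it there,
 * storing the sender's name in *from.  It returns the number of bytes
 * received, or -1 on error.
 */
ssize_t
dgram_exchange(const char *spath, const char *rpath, const char *msg,
    char *buf, size_t buflen, struct sockaddr_un *from)
{
	struct sockaddr_un sname, rname;
	socklen_t fromlen;
	ssize_t n = -1;
	int ss, rs;

	ss = bound_dgram(spath, &sname);
	rs = bound_dgram(rpath, &rname);
	if (ss >= 0 && rs >= 0 &&
	    sendto(ss, msg, strlen(msg), 0,
	    (struct sockaddr *) &rname, sizeof (rname)) >= 0) {
		memset(from, 0, sizeof (*from));
		fromlen = sizeof (*from);	/* in: size of the buffer */
		n = recvfrom(rs, buf, buflen, 0,
		    (struct sockaddr *) from, &fromlen);
	}
	if (ss >= 0)
		(void) close(ss);
	if (rs >= 0)
		(void) close(rs);
	(void) unlink(spath);
	(void) unlink(rpath);
	return (n);
}
```

Note that the sender is bound to a name of its own; without one the
receiver would have no address to which a reply could be sent.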
 553 .PP
 554 In addition to the two calls mentioned above, datagram
 555 sockets may also use the \fIconnect\fP call to associate
 556 a socket with a specific destination address.  In this case, any
 557 data sent on the socket will automatically be addressed
 558 to the connected peer, and only data received from that
 559 peer will be delivered to the user.  Only one connected
 560 address is permitted for each socket at one time;
 561 a second connect will change the destination address,
 562 and a connect to a null address (family AF_UNSPEC)
 563 will disconnect.
 564 Connect requests on datagram sockets return immediately,
 565 as this simply results in the system recording
 566 the peer's address (as compared to a stream socket, where a
 567 connect request initiates establishment of an end to end
 568 connection).  \fIAccept\fP and \fIlisten\fP are not
 569 used with datagram sockets.
 570 .PP
While a datagram socket is connected,
 572 errors from recent \fIsend\fP calls may be returned
 573 asynchronously.
 574 These errors may be reported on subsequent operations
 575 on the socket,
 576 or a special socket option used with \fIgetsockopt\fP, SO_ERROR,
 577 may be used to interrogate the error status.
 578 A \fIselect\fP for reading or writing will return true
 579 when an error indication has been received.
 580 The next operation will return the error, and the error status is cleared.
Other, less
important details of datagram sockets are described
in section 5.
 584 .NH 2
 585 Input/Output multiplexing
 586 .PP
 587 One last facility often used in developing applications
 588 is the ability to multiplex i/o requests among multiple
 589 sockets and/or files.  This is done using the \fIselect\fP
 590 call:
 591 .DS
 592 #include <sys/time.h>
 593 #include <sys/types.h>
 594  ...
 595
 596 fd_set readmask, writemask, exceptmask;
 597 struct timeval timeout;
 598  ...
 599 select(nfds, &readmask, &writemask, &exceptmask, &timeout);
 600 .DE
\fISelect\fP takes as arguments pointers to three sets: one for
the file descriptors from which the caller wishes to
read data, one for those descriptors to which
data is to be written, and one for those on which exceptional conditions
are pending; out-of-band data is the only
exceptional condition currently implemented by the socket abstraction.
If the user is not interested
in certain conditions (i.e., read, write, or exceptions),
the corresponding argument to the \fIselect\fP should
be a null pointer.
 611 .PP
 612 Each set is actually a structure containing an array of
 613 long integer bit masks; the size of the array is set
 614 by the definition FD_SETSIZE.
The array is
long enough to hold one bit for each of FD_SETSIZE file descriptors.
 617 .PP
The macros FD_SET(\fIfd, &mask\fP) and
FD_CLR(\fIfd, &mask\fP)
have been provided for adding file descriptor
\fIfd\fP to, and removing it from, the set \fImask\fP.  The
set should be zeroed before use, and
the macro FD_ZERO(\fI&mask\fP) has been provided
to clear the set \fImask\fP.
The parameter \fInfds\fP in the \fIselect\fP call specifies the range
of file descriptors (i.e. one plus the value of the largest
descriptor) to be examined in a set.
 628 .PP
 629 A timeout value may be specified if the selection
 630 is not to last more than a predetermined period of time.  If
 631 the fields in \fItimeout\fP are set to 0, the selection takes
 632 the form of a
 633 \fIpoll\fP, returning immediately.  If the last parameter is
 634 a null pointer, the selection will block indefinitely*.
 635 .FS
 636 * To be more specific, a return takes place only when a
 637 descriptor is selectable, or when a signal is received by
 638 the caller, interrupting the system call.
 639 .FE
 640 \fISelect\fP normally returns the number of file descriptors selected;
 641 if the \fIselect\fP call returns due to the timeout expiring, then
 642 the value 0 is returned.
 643 If the \fIselect\fP terminates because of an error or interruption,
 644 a \-1 is returned with the error number in \fIerrno\fP,
 645 and with the file descriptor masks unchanged.
 646 .PP
 647 Assuming a successful return, the three sets will
 648 indicate which
 649 file descriptors are ready to be read from, written to, or
 650 have exceptional conditions pending.
 651 The status of a file descriptor in a select mask may be
 652 tested with the \fIFD_ISSET(fd, &mask)\fP macro, which
 653 returns a non-zero value if \fIfd\fP is a member of the set
 654 \fImask\fP, and 0 if it is not.
 655 .PP
 656 To determine if there are connections waiting
 657 on a socket to be used with an \fIaccept\fP call,
 658 \fIselect\fP can be used, followed by
the \fIFD_ISSET(fd, &mask)\fP macro to check for read
 660 readiness on the appropriate socket.  If \fIFD_ISSET\fP
 661 returns a non-zero value, indicating permission to read, then a
 662 connection is pending on the socket.
 663 .PP
 664 As an example, to read data from two sockets, \fIs1\fP and
 665 \fIs2\fP as it is available from each and with a one-second
 666 timeout, the following code
 667 might be used:
 668 .DS
 669 #include <sys/time.h>
 670 #include <sys/types.h>
 671  ...
 672 fd_set read_template;
 673 struct timeval wait;
 674  ...
 675 for (;;) {
 676         wait.tv_sec = 1;                /* one second */
 677         wait.tv_usec = 0;
 678
 679         FD_ZERO(&read_template);
 680
 681         FD_SET(s1, &read_template);
 682         FD_SET(s2, &read_template);
 683
 684         nb = select(FD_SETSIZE, &read_template, (fd_set *) 0, (fd_set *) 0, &wait);
        if (nb <= 0) {
                \fIAn error occurred during the \fPselect\fI, or
                the \fPselect\fI timed out.\fP
                continue;       /* on error the masks are unchanged */
        }
 689
 690         if (FD_ISSET(s1, &read_template)) {
 691                 \fISocket #1 is ready to be read from.\fP
 692         }
 693
 694         if (FD_ISSET(s2, &read_template)) {
 695                 \fISocket #2 is ready to be read from.\fP
 696         }
 697 }
 698 .DE
 699 .PP
 700 In 4.2, the arguments to \fIselect\fP were pointers to integers
 701 instead of pointers to \fIfd_set\fPs.  This type of call
 702 will still work as long as the number of file descriptors
 703 being examined is less than the number of bits in an
 704 integer; however, the methods illustrated above should
 705 be used in all current programs.
 706 .PP
 707 \fISelect\fP provides a synchronous multiplexing scheme.
 708 Asynchronous notification of output completion, input availability,
 709 and exceptional conditions is possible through use of the
 710 SIGIO and SIGURG signals described in section 5.