2 .\" Must use -- tbl -- with this one
4 .\" @(#)nfs.rfc.ms 2.2 88/08/05 4.0 RPCSRC
7 .if \\n%=1 .tl ''- % -''
10 .\" prevent excess underlining in nroff
12 .OH 'Network File System: Version 2 Protocol Specification''Page %'
13 .EH 'Page %''Network File System: Version 2 Protocol Specification'
16 \&Network File System: Version 2 Protocol Specification
17 .IX NFS "" "" "" PAGE MAJOR
18 .IX "Network File System" "" "" "" PAGE MAJOR
19 .IX NFS "version-2 protocol specification"
20 .IX "Network File System" "version-2 protocol specification"
23 \&Status of this Standard
25 Note: This document specifies a protocol that Sun Microsystems, Inc.,
26 and others are using. It specifies it in standard ARPA RFC form.
31 The Sun Network Filesystem (NFS) protocol provides transparent remote
32 access to shared filesystems over local area networks. The NFS
33 protocol is designed to be machine, operating system, network architecture,
34 and transport protocol independent. This independence is
35 achieved through the use of Remote Procedure Call (RPC) primitives
36 built on top of an External Data Representation (XDR). Implementations
37 exist for a variety of machines, from personal computers to
40 The supporting mount protocol allows the server to hand out remote
41 access privileges to a restricted set of clients. It performs the
operating system-specific functions that allow a client, for example, to
attach remote directory trees to a local file system.
45 \&Remote Procedure Call
46 .IX "Remote Procedure Call"
48 Sun's remote procedure call specification provides a procedure-
49 oriented interface to remote services. Each server supplies a
50 program that is a set of procedures. NFS is one such "program".
51 The combination of host address, program number, and procedure
52 number specifies one remote service procedure. RPC does not depend
53 on services provided by specific protocols, so it can be used with
54 any underlying transport protocol. See the
55 .I "Remote Procedure Calls: Protocol Specification"
56 chapter of this manual.
58 \&External Data Representation
59 .IX "External Data Representation"
61 The External Data Representation (XDR) standard provides a common
62 way of representing a set of data types over a network.
64 Protocol Specification is written using the RPC data description
66 For more information, see the
.I "External Data Representation Standard: Protocol Specification."
68 Sun provides implementations of XDR and
69 RPC, but NFS does not require their use. Any software that
70 provides equivalent functionality can be used, and if the encoding
71 is exactly the same it can interoperate with other implementations
75 .IX "stateless servers"
78 The NFS protocol is stateless. That is, a server does not need to
79 maintain any extra state information about any of its clients in
80 order to function correctly. Stateless servers have a distinct
81 advantage over stateful servers in the event of a failure. With
82 stateless servers, a client need only retry a request until the
83 server responds; it does not even need to know that the server has
crashed, or that the network temporarily went down. The client of a
85 stateful server, on the other hand, needs to either detect a server
86 crash and rebuild the server's state when it comes back up, or
87 cause client operations to fail.
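The retry behavior described above can be sketched in a few lines. This is a minimal illustration, not part of the protocol: the transport hook "send_request", the doubling backoff, and the parameter names are all assumptions made for the example.

```python
def call_with_retry(send_request, request, timeout_s=1.0,
                    max_backoff_s=30.0, max_tries=None):
    """Resend the same request until the server answers.

    send_request(request, timeout_s) is a hypothetical transport hook
    that returns the reply or raises TimeoutError. The client keeps no
    record of why a try failed: a crashed server and a lost packet
    look the same and are handled the same way.
    """
    tries = 0
    while max_tries is None or tries < max_tries:
        tries += 1
        try:
            return send_request(request, timeout_s)
        except TimeoutError:
            # Back off (an assumption of this sketch) and send again.
            timeout_s = min(timeout_s * 2, max_backoff_s)
    raise TimeoutError("no reply after %d tries" % tries)
```

With "max_tries" left as None the loop never gives up, which is safe only because the server holds no per-client state that a repeated request could corrupt.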
89 This may not sound like an important issue, but it affects the
90 protocol in some unexpected ways. We feel that it is worth a bit
91 of extra complexity in the protocol to be able to write very simple
92 servers that do not require fancy crash recovery.
94 On the other hand, NFS deals with objects such as files and
95 directories that inherently have state -- what good would a file be
96 if it did not keep its contents intact? The goal is to not
97 introduce any extra state in the protocol itself. Another way to
98 simplify recovery is by making operations "idempotent" whenever
99 possible (so that they can potentially be repeated).
101 \&NFS Protocol Definition
102 .IX NFS "protocol definition"
105 Servers have been known to change over time, and so can the
106 protocol that they use. So RPC provides a version number with each
107 RPC request. This RFC describes version two of the NFS protocol.
108 Even in the second version, there are various obsolete procedures
109 and parameters, which will be removed in later versions. An RFC
110 for version three of the NFS protocol is currently under
116 NFS assumes a file system that is hierarchical, with directories as
117 all but the bottom-level files. Each entry in a directory (file,
118 directory, device, etc.) has a string name. Different operating
119 systems may have restrictions on the depth of the tree or the names
120 used, as well as using different syntax to represent the "pathname",
121 which is the concatenation of all the "components" (directory and
122 file names) in the name. A "file system" is a tree on a single
123 server (usually a single disk or physical partition) with a specified
124 "root". Some operating systems provide a "mount" operation to make
125 all file systems appear as a single tree, while others maintain a
126 "forest" of file systems. Files are unstructured streams of
127 uninterpreted bytes. Version 3 of NFS uses a slightly more general
130 NFS looks up one component of a pathname at a time. It may not be
131 obvious why it does not just take the whole pathname, traipse down
132 the directories, and return a file handle when it is done. There are
133 several good reasons not to do this. First, pathnames need
134 separators between the directory components, and different operating
135 systems use different separators. We could define a Network Standard
136 Pathname Representation, but then every pathname would have to be
137 parsed and converted at each end. Other issues are discussed in
138 \fINFS Implementation Issues\fP below.
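As a rough sketch of component-at-a-time lookup, the client can split the pathname with its own local separator and issue one LOOKUP per component. The function "lookup" below is a hypothetical stand-in for a call to NFSPROC_LOOKUP; the file handles are plain strings for illustration only.

```python
def walk(lookup, root_fh, local_path, separator="/"):
    """Resolve local_path by issuing one LOOKUP per component.

    lookup(dir_fh, name) is a hypothetical stand-in for NFSPROC_LOOKUP
    that returns the file handle of "name" in directory dir_fh.
    The separator never crosses the wire; only component names do.
    """
    fh = root_fh
    for name in local_path.split(separator):
        if name:  # skip empty components from leading or doubled separators
            fh = lookup(fh, name)
    return fh


# A toy server-side table: (directory handle, name) -> child handle.
tree = {("root", "usr"): "fh-usr", ("fh-usr", "lib"): "fh-lib"}
toy_lookup = lambda dir_fh, name: tree[(dir_fh, name)]
```

A client with a different pathname syntax would call the same function with a different "separator", and the server would see identical requests.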
140 Although files and directories are similar objects in many ways,
141 different procedures are used to read directories and files. This
142 provides a network standard format for representing directories. The
143 same argument as above could have been used to justify a procedure
144 that returns only one directory entry per call. The problem is
145 efficiency. Directories can contain many entries, and a remote call
146 to return each would be just too slow.
149 .IX NFS "RPC information"
150 .IP \fIAuthentication\fP
157 authentication, except in the NULL procedure where
160 .IP "\fITransport Protocols\fP"
161 NFS currently is supported on UDP/IP only.
162 .IP "\fIPort Number\fP"
163 The NFS protocol currently uses the UDP port number 2049. This is
164 not an officially assigned port, so later versions of the protocol
165 use the \*QPortmapping\*U facility of RPC.
167 \&Sizes of XDR Structures
168 .IX "XDR structure sizes"
170 These are the sizes, given in decimal bytes, of various XDR
171 structures used in the protocol:
173 /* \fIThe maximum number of bytes of data in a READ or WRITE request\fP */
174 const MAXDATA = 8192;
176 /* \fIThe maximum number of bytes in a pathname argument\fP */
177 const MAXPATHLEN = 1024;
179 /* \fIThe maximum number of bytes in a file name argument\fP */
180 const MAXNAMLEN = 255;
182 /* \fIThe size in bytes of the opaque "cookie" passed by READDIR\fP */
183 const COOKIESIZE = 4;
185 /* \fIThe size in bytes of the opaque file handle\fP */
191 .IX NFS "basic data types"
193 The following XDR definitions are basic structures and types used
194 in other structures described further on.
198 .IX "NFS data types" stat "" \fIstat\fP
214 NFSERR_NAMETOOLONG=63,
225 type is returned with every procedure's results. A
228 indicates that the call completed successfully and
229 the results are valid. The other values indicate some kind of
230 error occurred on the server side during the servicing of the
231 procedure. The error values are derived from UNIX error numbers.
232 .IP \fBNFSERR_PERM\fP:
233 Not owner. The caller does not have correct ownership
234 to perform the requested operation.
235 .IP \fBNFSERR_NOENT\fP:
236 No such file or directory. The file or directory
237 specified does not exist.
239 Some sort of hard error occurred when the operation was
240 in progress. This could be a disk error, for example.
241 .IP \fBNFSERR_NXIO\fP:
242 No such device or address.
243 .IP \fBNFSERR_ACCES\fP:
244 Permission denied. The caller does not have the
245 correct permission to perform the requested operation.
246 .IP \fBNFSERR_EXIST\fP:
247 File exists. The file specified already exists.
248 .IP \fBNFSERR_NODEV\fP:
250 .IP \fBNFSERR_NOTDIR\fP:
251 Not a directory. The caller specified a
252 non-directory in a directory operation.
253 .IP \fBNFSERR_ISDIR\fP:
Is a directory. The caller specified a directory in
a non-directory operation.
256 .IP \fBNFSERR_FBIG\fP:
257 File too large. The operation caused a file to grow
258 beyond the server's limit.
259 .IP \fBNFSERR_NOSPC\fP:
260 No space left on device. The operation caused the
261 server's filesystem to reach its limit.
262 .IP \fBNFSERR_ROFS\fP:
263 Read-only filesystem. Write attempted on a read-only filesystem.
264 .IP \fBNFSERR_NAMETOOLONG\fP:
265 File name too long. The file name in an operation was too long.
266 .IP \fBNFSERR_NOTEMPTY\fP:
267 Directory not empty. Attempted to remove a
268 directory that was not empty.
269 .IP \fBNFSERR_DQUOT\fP:
270 Disk quota exceeded. The client's disk quota on the
271 server has been exceeded.
272 .IP \fBNFSERR_STALE\fP:
273 The "fhandle" given in the arguments was invalid.
274 That is, the file referred to by that file handle no longer exists,
275 or access to it has been revoked.
276 .IP \fBNFSERR_WFLUSH\fP:
277 The server's write cache used in the
279 call got flushed to disk.
284 .IX "NFS data types" ftype "" \fIftype\fP
298 gives the type of a file. The type
300 indicates a non-file,
306 is a block-special device,
308 is a character-special device, and
314 .IX "NFS data types" fhandle "" \fIfhandle\fP
316 typedef opaque fhandle[FHSIZE];
321 is the file handle passed between the server and the client.
322 All file operations are done using file handles to refer to a file or
323 directory. The file handle can contain whatever information the server
324 needs to distinguish an individual file.
328 .IX "NFS data types" timeval "" \fItimeval\fP
331 unsigned int seconds;
332 unsigned int useconds;
338 structure is the number of seconds and microseconds
339 since midnight January 1, 1970, Greenwich Mean Time. It is used to
340 pass time and date information.
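For illustration, the two fields can be packed as XDR unsigned integers (four bytes each, most significant byte first) and converted back to a calendar time. The helper names below are assumptions of this sketch.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def pack_timeval(seconds, useconds):
    """Encode a timeval as two big-endian 32-bit unsigned integers."""
    return struct.pack(">II", seconds, useconds)


def unpack_timeval(buf):
    """Decode a timeval and return it as a timezone-aware datetime."""
    seconds, useconds = struct.unpack(">II", buf[:8])
    return EPOCH + timedelta(seconds=seconds, microseconds=useconds)
```

Because "seconds" is unsigned and 32 bits wide, times before 1970 cannot be represented in this structure.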
344 .IX "NFS data types" fattr "" \fIfattr\fP
353 unsigned int blocksize;
366 structure contains the attributes of a file; "type" is the type of
367 the file; "nlink" is the number of hard links to the file (the number
368 of different names for the same file); "uid" is the user
369 identification number of the owner of the file; "gid" is the group
370 identification number of the group of the file; "size" is the size in
371 bytes of the file; "blocksize" is the size in bytes of a block of the
372 file; "rdev" is the device number of the file if it is type
376 "blocks" is the number of blocks the file takes up on disk; "fsid" is
377 the file system identifier for the filesystem containing the file;
378 "fileid" is a number that uniquely identifies the file within its
379 filesystem; "atime" is the time when the file was last accessed for
380 either read or write; "mtime" is the time when the file data was last
381 modified (written); and "ctime" is the time when the status of the
382 file was last changed. Writing to the file also changes "ctime" if
383 the size of the file changes.
385 "mode" is the access mode encoded as a set of bits. Notice that the
386 file type is specified both in the mode bits and in the file type.
387 This is really a bug in the protocol and will be fixed in future
388 versions. The descriptions given below specify the bit positions
396 0040000&This is a directory; "type" field should be NFDIR.
397 0020000&This is a character special file; "type" field should be NFCHR.
398 0060000&This is a block special file; "type" field should be NFBLK.
399 0100000&This is a regular file; "type" field should be NFREG.
400 0120000&This is a symbolic link file; "type" field should be NFLNK.
401 0140000&This is a named socket; "type" field should be NFNON.
402 0004000&Set user id on execution.
403 0002000&Set group id on execution.
404 0001000&Save swapped text even after use.
405 0000400&Read permission for owner.
406 0000200&Write permission for owner.
407 0000100&Execute and search permission for owner.
408 0000040&Read permission for group.
409 0000020&Write permission for group.
410 0000010&Execute and search permission for group.
411 0000004&Read permission for others.
412 0000002&Write permission for others.
413 0000001&Execute and search permission for others.
418 The bits are the same as the mode bits returned by the
420 system call in the UNIX system. The file type is specified both in
421 the mode bits and in the file type. This is fixed in future
424 The "rdev" field in the attributes structure is an operating system
425 specific device specifier. It will be removed and generalized in
426 the next revision of the protocol.
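The mode bits tabulated above can be decoded with a few masks. This is a sketch only: the mask 0170000 used to isolate the type bits is an assumption taken from common UNIX practice (it covers every type value in the table), and the function name is hypothetical.

```python
TYPE_MASK = 0o170000  # assumed mask; it covers all type values in the table
TYPE_BITS = {
    0o040000: "NFDIR",
    0o020000: "NFCHR",
    0o060000: "NFBLK",
    0o100000: "NFREG",
    0o120000: "NFLNK",
    0o140000: "NFNON",
}


def describe_mode(mode):
    """Return (expected "type" field, rwx string) for a mode value."""
    ftype = TYPE_BITS.get(mode & TYPE_MASK, "unknown")
    perms = ""
    for shift in (6, 3, 0):  # owner, group, others
        bits = (mode >> shift) & 0o7
        perms += "r" if bits & 4 else "-"
        perms += "w" if bits & 2 else "-"
        perms += "x" if bits & 1 else "-"
    return ftype, perms
```

A client could use such a check to detect a server whose "type" field disagrees with the type encoded in "mode".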
432 .IX "NFS data types" sattr "" \fIsattr\fP
446 structure contains the file attributes which can be set
447 from the client. The fields are the same as for
449 above. A "size" of zero means the file should be truncated.
450 A value of -1 indicates a field that should be ignored.
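As a sketch of the "-1 means ignore" convention: on the wire every field is an unsigned 32-bit integer, so -1 appears as 0xFFFFFFFF. The field order used here (mode, uid, gid, size, atime, mtime) is assumed from the full structure definition, and the function name is hypothetical.

```python
import struct

IGNORE = 0xFFFFFFFF  # -1 viewed as a 32-bit unsigned integer


def pack_sattr(mode=IGNORE, uid=IGNORE, gid=IGNORE, size=IGNORE,
               atime=(IGNORE, IGNORE), mtime=(IGNORE, IGNORE)):
    """Encode settable attributes; unspecified fields are sent as -1.

    atime and mtime are (seconds, useconds) pairs. The field order is
    an assumption of this sketch.
    """
    return struct.pack(">8I", mode, uid, gid, size, *atime, *mtime)


# Truncate a file and leave every other attribute untouched.
truncate_request = pack_sattr(size=0)
```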
455 .IX "NFS data types" filename "" \fIfilename\fP
457 typedef string filename<MAXNAMLEN>;
462 is used for passing file names or pathname components.
467 .IX "NFS data types" path "" \fIpath\fP
469 typedef string path<MAXPATHLEN>;
474 is a pathname. The server considers it as a string
475 with no internal structure, but to the client it is the name of a
476 node in a filesystem tree.
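Both "filename" and "path" are XDR variable-length strings: a four-byte length, the bytes themselves, then zero padding up to a multiple of four bytes. A minimal encoder, with a hypothetical function name, might look like this:

```python
import struct

MAXNAMLEN = 255
MAXPATHLEN = 1024


def pack_string(data, max_len):
    """Encode bytes as an XDR string<max_len>."""
    if len(data) > max_len:
        raise ValueError("string exceeds %d bytes" % max_len)
    pad = (4 - len(data) % 4) % 4  # pad to a four-byte boundary
    return struct.pack(">I", len(data)) + data + b"\x00" * pad
```

The encoder takes bytes rather than text because the protocol treats names as uninterpreted byte strings.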
481 .IX "NFS data types" attrstat "" \fIattrstat\fP
483 union attrstat switch (stat status) {
493 structure is a common procedure result. It contains
494 a "status" and, if the call succeeded, it also contains the
495 attributes of the file on which the operation was done.
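Decoding such a result follows the shape of the union: read the status first, and read the attributes only when the status reports success. The sketch below assumes NFS_OK is zero and assumes the attribute field order of the full "fattr" definition (eleven integers followed by three timeval pairs); the function name is hypothetical.

```python
import struct

NFS_OK = 0
# Assumed field order of the fattr structure.
FATTR_FIELDS = ("type", "mode", "nlink", "uid", "gid", "size",
                "blocksize", "rdev", "blocks", "fsid", "fileid")


def unpack_attrstat(buf):
    """Return (status, attributes); attributes is None on error."""
    (status,) = struct.unpack_from(">I", buf, 0)
    if status != NFS_OK:
        return status, None  # the error arm of the union carries no data
    words = struct.unpack_from(">17I", buf, 4)
    attrs = dict(zip(FATTR_FIELDS, words[:11]))
    attrs["atime"] = words[11:13]  # (seconds, useconds)
    attrs["mtime"] = words[13:15]
    attrs["ctime"] = words[15:17]
    return status, attrs
```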
500 .IX "NFS data types" diropargs "" \fIdiropargs\fP
510 structure is used in directory operations. The
511 "fhandle" "dir" is the directory in which to find the file "name".
512 A directory operation is one in which the directory is affected.
517 .IX "NFS data types" diropres "" \fIdiropres\fP
519 union diropres switch (stat status) {
530 The results of a directory operation are returned in a
532 structure. If the call succeeded, a new file handle "file" and the
533 "attributes" associated with that file are returned along with the
537 .IX "NFS server procedures" "" "" "" PAGE MAJOR
539 The protocol definition is given as a set of procedures with
540 arguments and results defined using the RPC language. A brief
541 description of the function of each procedure should provide enough
542 information to allow implementation.
544 All of the procedures in the NFS protocol are assumed to be
545 synchronous. When a procedure returns to the client, the client
546 can assume that the operation has completed and any data associated
547 with the request is now on stable storage. For example, a client
549 request may cause the server to update data blocks,
550 filesystem information blocks (such as indirect blocks), and file
551 attribute information (size and modify times). When the
553 returns to the client, it can assume that the write is safe, even
554 in case of a server crash, and it can discard the data written.
555 This is a very important part of the statelessness of the server.
556 If the server waited to flush data from remote requests, the client
557 would have to save those requests so that it could resend them in
558 case of a server crash.
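On a UNIX-like server, the stable-storage requirement might be met by forcing the data to disk before the reply is built. The following is a minimal sketch using local file operations; the function name and its return value are assumptions, and a real server would also return the full file attributes.

```python
import os


def stable_write(path, offset, data):
    """Write data at offset and return only once it is on stable storage."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        view = memoryview(data)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # do not reply until the data has been flushed
    finally:
        os.close(fd)
    return os.stat(path).st_size  # new size, as part of the reply attributes
```

Only after "stable_write" returns would the server send its reply, so the client may discard its copy of the data as soon as the reply arrives.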
564 * Remote file service routines
567 program NFS_PROGRAM {
568 version NFS_VERSION {
569 void NFSPROC_NULL(void) = 0;
570 attrstat NFSPROC_GETATTR(fhandle) = 1;
571 attrstat NFSPROC_SETATTR(sattrargs) = 2;
572 void NFSPROC_ROOT(void) = 3;
573 diropres NFSPROC_LOOKUP(diropargs) = 4;
574 readlinkres NFSPROC_READLINK(fhandle) = 5;
575 readres NFSPROC_READ(readargs) = 6;
576 void NFSPROC_WRITECACHE(void) = 7;
577 attrstat NFSPROC_WRITE(writeargs) = 8;
578 diropres NFSPROC_CREATE(createargs) = 9;
579 stat NFSPROC_REMOVE(diropargs) = 10;
580 stat NFSPROC_RENAME(renameargs) = 11;
581 stat NFSPROC_LINK(linkargs) = 12;
582 stat NFSPROC_SYMLINK(symlinkargs) = 13;
583 diropres NFSPROC_MKDIR(createargs) = 14;
584 stat NFSPROC_RMDIR(diropargs) = 15;
585 readdirres NFSPROC_READDIR(readdirargs) = 16;
586 statfsres NFSPROC_STATFS(fhandle) = 17;
593 .IX "NFS server procedures" NFSPROC_NULL() "" \fINFSPROC_NULL()\fP
596 NFSPROC_NULL(void) = 0;
599 This procedure does no work. It is made available in all RPC
600 services to allow server response testing and timing.
603 \&Get File Attributes
604 .IX "NFS server procedures" NFSPROC_GETATTR() "" \fINFSPROC_GETATTR()\fP
607 NFSPROC_GETATTR (fhandle) = 1;
610 If the reply status is
612 then the reply attributes contains
613 the attributes for the file given by the input fhandle.
616 \&Set File Attributes
617 .IX "NFS server procedures" NFSPROC_SETATTR() "" \fINFSPROC_SETATTR()\fP
625 NFSPROC_SETATTR (sattrargs) = 2;
628 The "attributes" argument contains fields which are either -1 or
629 are the new value for the attributes of "file". If the reply
632 then the reply attributes have the attributes of
633 the file after the "SETATTR" operation has completed.
635 Note: The use of -1 to indicate an unused field in "attributes" is
636 changed in the next version of the protocol.
639 \&Get Filesystem Root
640 .IX "NFS server procedures" NFSPROC_ROOT "" \fINFSPROC_ROOT\fP
643 NFSPROC_ROOT(void) = 3;
646 Obsolete. This procedure is no longer used because finding the
647 root file handle of a filesystem requires moving pathnames between
648 client and server. To do this right we would have to define a
649 network standard representation of pathnames. Instead, the
650 function of looking up the root file handle is done by the
653 .I "Mount Protocol Definition"
654 later in this chapter for details).
658 .IX "NFS server procedures" NFSPROC_LOOKUP() "" \fINFSPROC_LOOKUP()\fP
661 NFSPROC_LOOKUP(diropargs) = 4;
664 If the reply "status" is
666 then the reply "file" and reply
667 "attributes" are the file handle and attributes for the file "name"
668 in the directory given by "dir" in the argument.
671 \&Read From Symbolic Link
672 .IX "NFS server procedures" NFSPROC_READLINK() "" \fINFSPROC_READLINK()\fP
674 union readlinkres switch (stat status) {
682 NFSPROC_READLINK(fhandle) = 5;
685 If "status" has the value
687 then the reply "data" is the data in
688 the symbolic link given by the file referred to by the fhandle argument.
690 Note: since NFS always parses pathnames on the client, the
691 pathname in a symbolic link may mean something different (or be
692 meaningless) on a different client or on the server if a different
693 pathname syntax is used.
697 .IX "NFS server procedures" NFSPROC_READ "" \fINFSPROC_READ\fP
706 union readres switch (stat status) {
709 opaque data<NFS_MAXDATA>;
715 NFSPROC_READ(readargs) = 6;
718 Returns up to "count" bytes of "data" from the file given by
719 "file", starting at "offset" bytes from the beginning of the file.
720 The first byte of the file is at offset zero. The file attributes
721 after the read takes place are returned in "attributes".
723 Note: The argument "totalcount" is unused, and is removed in the
724 next protocol revision.
728 .IX "NFS server procedures" NFSPROC_WRITECACHE() "" \fINFSPROC_WRITECACHE()\fP
731 NFSPROC_WRITECACHE(void) = 7;
734 To be used in the next protocol revision.
738 .IX "NFS server procedures" NFSPROC_WRITE() "" \fINFSPROC_WRITE()\fP
742 unsigned beginoffset;
745 opaque data<NFS_MAXDATA>;
749 NFSPROC_WRITE(writeargs) = 8;
752 Writes "data" beginning "offset" bytes from the beginning of
753 "file". The first byte of the file is at offset zero. If the
754 reply "status" is NFS_OK, then the reply "attributes" contains the
755 attributes of the file after the write has completed. The write
756 operation is atomic. Data from this call to
758 will not be mixed with data from another client's calls.
760 Note: The arguments "beginoffset" and "totalcount" are ignored and
761 are removed in the next protocol revision.
765 .IX "NFS server procedures" NFSPROC_CREATE() "" \fINFSPROC_CREATE()\fP
773 NFSPROC_CREATE(createargs) = 9;
776 The file "name" is created in the directory given by "dir". The
777 initial attributes of the new file are given by "attributes". A
778 reply "status" of NFS_OK indicates that the file was created, and
779 reply "file" and reply "attributes" are its file handle and
780 attributes. Any other reply "status" means that the operation
781 failed and no file was created.
783 Note: This routine should pass an exclusive create flag, meaning
784 "create the file only if it is not already there".
788 .IX "NFS server procedures" NFSPROC_REMOVE() "" \fINFSPROC_REMOVE()\fP
791 NFSPROC_REMOVE(diropargs) = 10;
794 The file "name" is removed from the directory given by "dir". A
795 reply of NFS_OK means the directory entry was removed.
797 Note: possibly non-idempotent operation.
801 .IX "NFS server procedures" NFSPROC_RENAME() "" \fINFSPROC_RENAME()\fP
809 NFSPROC_RENAME(renameargs) = 11;
812 The existing file "from.name" in the directory given by "from.dir"
813 is renamed to "to.name" in the directory given by "to.dir". If the
816 the file was renamed. The
819 atomic on the server; it cannot be interrupted in the middle.
821 Note: possibly non-idempotent operation.
824 \&Create Link to File
825 .IX "NFS server procedures" NFSPROC_LINK() "" \fINFSPROC_LINK()\fP
833 NFSPROC_LINK(linkargs) = 12;
836 Creates the file "to.name" in the directory given by "to.dir",
837 which is a hard link to the existing file given by "from". If the
840 a link was created. Any other return value
841 indicates an error, and the link was not created.
843 A hard link should have the property that changes to either of the
844 linked files are reflected in both files. When a hard link is made
845 to a file, the attributes for the file should have a value for
846 "nlink" that is one greater than the value before the link.
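Both properties can be observed on a local UNIX-like filesystem, which is the behavior a server is expected to expose through LINK. The sketch below uses ordinary local system calls, not NFS requests, and the function name is hypothetical.

```python
import os
import tempfile


def link_demo():
    """Create a hard link locally and report (nlink before, after, contents)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "from")
        alias = os.path.join(d, "to.name")
        with open(original, "w") as f:
            f.write("v1")
        before = os.stat(original).st_nlink
        os.link(original, alias)  # second name for the same file
        after = os.stat(original).st_nlink
        with open(alias, "a") as f:  # change made through the new name
            f.write("+v2")
        with open(original) as f:  # is visible through the old name
            seen = f.read()
        return before, after, seen
```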
848 Note: possibly non-idempotent operation.
851 \&Create Symbolic Link
852 .IX "NFS server procedures" NFSPROC_SYMLINK() "" \fINFSPROC_SYMLINK()\fP
861 NFSPROC_SYMLINK(symlinkargs) = 13;
864 Creates the file "from.name" with ftype
867 given by "from.dir". The new file contains the pathname "to" and
868 has initial attributes given by "attributes". If the return value
871 a link was created. Any other return value indicates an
872 error, and the link was not created.
874 A symbolic link is a pointer to another file. The name given in
875 "to" is not interpreted by the server, only stored in the newly
876 created file. When the client references a file that is a symbolic
877 link, the contents of the symbolic link are normally transparently
878 reinterpreted as a pathname to substitute. A
880 operation returns the data to the client for interpretation.
882 Note: On UNIX servers the attributes are never used, since
883 symbolic links always have mode 0777.
887 .IX "NFS server procedures" NFSPROC_MKDIR() "" \fINFSPROC_MKDIR()\fP
890 NFSPROC_MKDIR (createargs) = 14;
893 The new directory "where.name" is created in the directory given by
894 "where.dir". The initial attributes of the new directory are given
895 by "attributes". A reply "status" of NFS_OK indicates that the new
896 directory was created, and reply "file" and reply "attributes" are
897 its file handle and attributes. Any other reply "status" means
898 that the operation failed and no directory was created.
900 Note: possibly non-idempotent operation.
904 .IX "NFS server procedures" NFSPROC_RMDIR() "" \fINFSPROC_RMDIR()\fP
907 NFSPROC_RMDIR(diropargs) = 15;
910 The existing empty directory "name" in the directory given by "dir"
911 is removed. If the reply is
913 the directory was removed.
915 Note: possibly non-idempotent operation.
918 \&Read From Directory
919 .IX "NFS server procedures" NFSPROC_READDIR() "" \fINFSPROC_READDIR()\fP
934 union readdirres switch (stat status) {
945 NFSPROC_READDIR (readdirargs) = 16;
948 Returns a variable number of directory entries, with a total size
949 of up to "count" bytes, from the directory given by "dir". If the
950 returned value of "status" is
952 then it is followed by a
953 variable number of "entry"s. Each "entry" contains a "fileid"
954 which consists of a unique number to identify the file within a
955 filesystem, the "name" of the file, and a "cookie" which is an
956 opaque pointer to the next entry in the directory. The cookie is
959 call to get more entries starting at a
960 given point in the directory. The special cookie zero (all bits
961 zero) can be used to get the entries starting at the beginning of
962 the directory. The "fileid" field should be the same number as the
963 "fileid" in the attributes of the file. (See the
964 .I "Basic Data Types"
966 The "eof" flag has a value of
968 if there are no more entries in the directory.
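A client typically lists a directory by feeding the last cookie it received into the next call until "eof" is set. The loop below is a sketch: "readdir" is a hypothetical stand-in for NFSPROC_READDIR that returns a list of (fileid, name, cookie) entries and the eof flag.

```python
def list_directory(readdir, dir_fh, count=4096):
    """Collect all names in a directory using repeated READDIR calls.

    readdir(dir_fh, cookie, count) is a hypothetical stand-in for
    NFSPROC_READDIR returning (entries, eof), where each entry is a
    (fileid, name, cookie) tuple.
    """
    cookie = b"\x00" * 4  # COOKIESIZE bytes of zero: start of directory
    names = []
    while True:
        entries, eof = readdir(dir_fh, cookie, count)
        for fileid, name, next_cookie in entries:
            names.append(name)
            cookie = next_cookie  # resume after the last entry seen
        if eof or not entries:  # the empty-reply guard is defensive only
            break
    return names
```

The client never interprets the cookie; it only stores it and hands it back, which lets the server choose any encoding for directory positions.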
971 \&Get Filesystem Attributes
972 .IX "NFS server procedures" NFSPROC_STATFS() "" \fINFSPROC_STATFS()\fP
union statfsres switch (stat status) {
988 NFSPROC_STATFS(fhandle) = 17;
991 If the reply "status" is
993 then the reply "info" gives the
attributes for the filesystem that contains the file referred to by the
995 input fhandle. The attribute fields contain the following values:
997 The optimum transfer size of the server in bytes. This is
998 the number of bytes the server would like to have in the
999 data part of READ and WRITE requests.
1001 The block size in bytes of the filesystem.
1003 The total number of "bsize" blocks on the filesystem.
1005 The number of free "bsize" blocks on the filesystem.
1007 The number of "bsize" blocks available to non-privileged users.
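Since the three block counts are all expressed in units of "bsize", a client can derive byte totals by simple multiplication. The argument names below ("blocks", "bfree", "bavail") and the function name are assumptions chosen for this sketch.

```python
def filesystem_usage(bsize, blocks, bfree, bavail):
    """Convert block counts from a STATFS reply into byte totals."""
    return {
        "total_bytes": bsize * blocks,
        "free_bytes": bsize * bfree,
        "available_bytes": bsize * bavail,  # for non-privileged users
        "used_fraction": (blocks - bfree) / blocks if blocks else 0.0,
    }
```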
1009 Note: This call does not work well if a filesystem has variable
1012 \&NFS Implementation Issues
1013 .IX NFS implementation
1015 The NFS protocol is designed to be operating system independent, but
1016 since this version was designed in a UNIX environment, many
1017 operations have semantics similar to the operations of the UNIX file
1018 system. This section discusses some of the implementation-specific
1021 \&Server/Client Relationship
1022 .IX NFS "server/client relationship"
1024 The NFS protocol is designed to allow servers to be as simple and
1025 general as possible. Sometimes the simplicity of the server can be a
1026 problem, if the client wants to implement complicated filesystem
1029 For example, some operating systems allow removal of open files. A
1030 process can open a file and, while it is open, remove it from the
1031 directory. The file can be read and written as long as the process
1032 keeps it open, even though the file has no name in the filesystem.
1033 It is impossible for a stateless server to implement these semantics.
1034 The client can do some tricks such as renaming the file on remove,
1035 and only removing it on close. We believe that the server provides
1036 enough functionality to implement most file system semantics on the
1039 Every NFS client can also potentially be a server, and remote and
1040 local mounted filesystems can be freely intermixed. This leads to
1041 some interesting problems when a client travels down the directory
1042 tree of a remote filesystem and reaches the mount point on the server
1043 for another remote filesystem. Allowing the server to follow the
1044 second remote mount would require loop detection, server lookup, and
1045 user revalidation. Instead, we decided not to let clients cross a
1046 server's mount point. When a client does a LOOKUP on a directory on
1047 which the server has mounted a filesystem, the client sees the
1048 underlying directory instead of the mounted directory. A client can
1049 do remote mounts that match the server's mount points to maintain the
1053 \&Pathname Interpretation
1054 .IX NFS "pathname interpretation"
1056 There are a few complications to the rule that pathnames are always
1057 parsed on the client. For example, symbolic links could have
1058 different interpretations on different clients. Another common
1059 problem for non-UNIX implementations is the special interpretation of
1060 the pathname ".." to mean the parent of a given directory. The next
1061 revision of the protocol uses an explicit flag to indicate the parent
1065 .IX NFS "permission issues"
1067 The NFS protocol, strictly speaking, does not define the permission
1068 checking used by servers. However, it is expected that a server
1069 will do normal operating system permission checking using
1071 style authentication as the basis of its protection mechanism. The
1072 server gets the client's effective "uid", effective "gid", and groups
1073 on each call and uses them to check permission. There are various
problems with this method that have been resolved in interesting ways.
1076 Using "uid" and "gid" implies that the client and server share the
1077 same "uid" list. Every server and client pair must have the same
1078 mapping from user to "uid" and from group to "gid". Since every
1079 client can also be a server, this tends to imply that the whole
1080 network shares the same "uid/gid" space.
1083 revision of the NFS protocol) uses string names instead of numbers,
1084 but there are still complex problems to be solved.
1086 Another problem arises due to the usually stateful open operation.
1087 Most operating systems check permission at open time, and then check
1088 that the file is open on each read and write request. With stateless
1089 servers, the server has no idea that the file is open and must do
1090 permission checking on each read and write call. On a local
1091 filesystem, a user can open a file and then change the permissions so
1092 that no one is allowed to touch it, but will still be able to write
1093 to the file because it is open. On a remote filesystem, by contrast,
1094 the write would fail. To get around this problem, the server's
1095 permission checking algorithm should allow the owner of a file to
1096 access it regardless of the permission setting.
1098 A similar problem has to do with paging in from a file over the
1099 network. The operating system usually checks for execute permission
1100 before opening a file for demand paging, and then reads blocks from
1101 the open file. The file may not have read permission, but after it
is opened it doesn't matter. An NFS server cannot tell the
1103 difference between a normal file read and a demand page-in read. To
1104 make this work, the server allows reading of files if the "uid" given
1105 in the call has execute or read permission on the file.
In most operating systems, a particular user (on UNIX, user ID zero)
1108 has access to all files no matter what permission and ownership they
1109 have. This "super-user" permission may not be allowed on the server,
1110 since anyone who can become super-user on their workstation could
1111 gain access to all remote files. The UNIX server by default maps
1112 user id 0 to -2 before doing its access checking. This works except
1113 for NFS root filesystems, where super-user access cannot be avoided.
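The three rules in this section (owner always allowed, read granted on read or execute permission, user id 0 mapped to -2) can be combined into one check. This is a simplified sketch, not the algorithm of any particular server; the function names and the way group membership is tested are assumptions.

```python
NOBODY_UID = -2
R, W, X = 4, 2, 1


def map_uid(uid):
    """Map the super-user to an unprivileged id before checking access."""
    return NOBODY_UID if uid == 0 else uid


def may_access(uid, gids, want, file_uid, file_gid, mode):
    """Decide whether a caller may perform "want" (R, W or X) on a file."""
    uid = map_uid(uid)
    if uid == file_uid:
        return True  # the owner is allowed regardless of the mode bits
    if file_gid in gids:
        bits = (mode >> 3) & 7  # group permission bits
    else:
        bits = mode & 7  # permission bits for others
    if want == R:
        # Execute permission also grants read, so demand paging works.
        return bool(bits & (R | X))
    return bool(bits & want)
```

Note that after mapping, a remote super-user is no longer the owner of files owned by user id 0, so it falls through to the group or other bits.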
1115 \&Setting RPC Parameters
1116 .IX NFS "setting RPC parameters"
1118 Various file system parameters and options should be set at mount
1119 time. The mount protocol is described in the appendix below. For
example, both "soft" and "hard" mounts are usually
provided. Soft mounted file systems return errors when RPC
1122 operations fail (after a given number of optional retransmissions),
1123 while hard mounted file systems continue to retransmit forever.
1124 Clients and servers may need to keep caches of recent operations to
1125 help avoid problems with non-idempotent operations.
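The difference between soft and hard mounts amounts to a bound on the retransmission loop. The sketch below assumes a caller-supplied send_once function that performs one transmission and raises RpcTimeout when no reply arrives; both names, and the default of three retransmissions, are hypothetical.

```python
import itertools


class RpcTimeout(Exception):
    """Raised when one transmission receives no reply in time."""


def call_with_retries(send_once, soft: bool, retrans: int = 3):
    """Send a request, retransmitting on timeout.

    A soft mount gives up after `retrans` retransmissions and reports
    the error; a hard mount keeps retransmitting until a reply arrives.
    """
    attempts = range(1 + retrans) if soft else itertools.count()
    for _ in attempts:
        try:
            return send_once()
        except RpcTimeout:
            continue
    raise RpcTimeout("soft mount: no reply after %d retransmissions" % retrans)
```

Because a retransmitted request may be executed twice, a loop of this kind is the reason the caches of recent operations mentioned above matter for non-idempotent procedures.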
1127 \&Mount Protocol Definition
1128 .IX "mount protocol" "" "" "" PAGE MAJOR
1132 .IX "mount protocol" introduction
1134 The mount protocol is separate from, but related to, the NFS
1135 protocol. It provides operating-system-specific services to get
1136 NFS off the ground -- looking up server path names, validating user
1137 identity, and checking access permissions. Clients use the mount
1138 protocol to get the first file handle, which allows them entry into a
1139 remote filesystem.
1141 The mount protocol is kept separate from the NFS protocol to make it
1142 easy to plug in new access checking and validation methods without
1143 changing the NFS server protocol.
1145 Notice that the protocol definition implies stateful servers because
1146 the server maintains a list of clients' mount requests. The mount
1147 list information is not critical for the correct functioning of
1148 either the client or the server. It is intended for advisory use
1149 only, for example, to warn possible clients when a server is going
1150 down.
1152 Version one of the mount protocol is used with version two of the NFS
1153 protocol. The only connecting point is the
1154 .I fhandle
1155 structure, which is the same for both protocols.
1158 .IX "mount protocol" "RPC information"
1159 .IP \fIAuthentication\fP
1160 The mount service uses
1161 .I AUTH_UNIX
1162 and
1163 .I AUTH_NONE
1164 style authentication only.
1165 .IP "\fITransport Protocols\fP"
1166 The mount service is currently supported on UDP/IP only.
1167 .IP "\fIPort Number\fP"
1168 Consult the server's portmapper, described in the chapter
1169 .I "Remote Procedure Calls: Protocol Specification",
1170 to find the port number on which the mount service is registered.
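A portmapper lookup for the mount service might encode its arguments as sketched below. The constants are taken from the RPC and portmapper specifications rather than from this section and should be read as assumptions here: mount program number 100005, mount version 1, and IP protocol number 17 for UDP. The function name getport_args is hypothetical.

```python
import struct

MOUNTPROG = 100005   # mount program number (assumed from the RPC program registry)
MOUNTVERS = 1        # version one of the mount protocol
IPPROTO_UDP = 17     # the mount service is supported on UDP/IP only


def getport_args(prog: int, vers: int, prot: int = IPPROTO_UDP) -> bytes:
    """XDR-encode the portmapper mapping (prog, vers, prot, port).

    The port field is ignored in a lookup request and is sent as zero.
    """
    return struct.pack(">IIII", prog, vers, prot, 0)
```

The reply to such a request carries a single unsigned integer, the port number, with zero meaning that the program is not registered.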
1172 \&Sizes of XDR Structures
1173 .IX "mount protocol" "XDR structure sizes"
1175 These are the sizes, given in decimal bytes, of various XDR
1176 structures used in the protocol:
1178 /* \fIThe maximum number of bytes in a pathname argument\fP */
1179 const MNTPATHLEN = 1024;
1181 /* \fIThe maximum number of bytes in a name argument\fP */
1182 const MNTNAMLEN = 255;
1184 /* \fIThe size in bytes of the opaque file handle\fP */
1185 const FHSIZE = 32;
1189 .IX "mount protocol" "basic data types"
1190 .IX "mount data types"
1192 This section presents the data types used by the mount protocol.
1193 In many cases they are similar to the types used in NFS.
1197 .IX "mount data types" fhandle "" \fIfhandle\fP
1199 typedef opaque fhandle[FHSIZE];
1202 The type
1203 .I fhandle
1204 is the file handle that the server passes to the
1205 client. All file operations are done using file handles to refer
1206 to a file or directory. The file handle can contain whatever
1207 information the server needs to distinguish an individual file.
1209 This is the same as the "fhandle" XDR definition in version 2 of
1210 the NFS protocol; see
1211 .I "Basic Data Types"
1212 in the definition of the NFS protocol, above.
1216 .IX "mount data types" fhstatus "" \fIfhstatus\fP
1218 union fhstatus switch (unsigned status) {
1219 case 0:
1220     fhandle directory;
1221 default:
1222     void;
1223 };
1226 The type
1227 .I fhstatus
1228 is a union. If a "status" of zero is returned,
1229 the call completed successfully, and a file handle for the
1230 "directory" follows. A non-zero status indicates some sort of
1231 error. In this case the status is a UNIX error number.
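A client might decode an fhstatus reply as sketched below. The sketch assumes the file handle is FHSIZE = 32 bytes, as in version 2 of the NFS protocol; the function name decode_fhstatus is hypothetical.

```python
import struct

FHSIZE = 32  # size of the opaque file handle (assumed from NFS version 2)


def decode_fhstatus(data: bytes):
    """Return (status, fhandle); fhandle is None when status is non-zero."""
    (status,) = struct.unpack_from(">I", data, 0)
    if status != 0:
        # Non-zero status: a UNIX error number, with no handle following.
        return status, None
    fhandle = data[4:4 + FHSIZE]
    if len(fhandle) != FHSIZE:
        raise ValueError("truncated file handle")
    return 0, fhandle
```

The handle is returned as raw bytes because it is opaque to the client, which only passes it back to the server in later calls.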
1235 .IX "mount data types" dirpath "" \fIdirpath\fP
1237 typedef string dirpath<MNTPATHLEN>;
1240 The type
1241 .I dirpath
1242 is a server pathname of a directory.
1246 .IX "mount data types" name "" \fIname\fP
1248 typedef string name<MNTNAMLEN>;
1251 The type
1252 .I name
1253 is an arbitrary string used for various names.
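Both dirpath and name are XDR variable-length strings with a maximum size. A minimal encoder, assuming the standard XDR string layout (a four-byte big-endian length, the bytes themselves, then zero padding to a multiple of four), might look like this; the function name xdr_string is hypothetical.

```python
import struct

MNTPATHLEN = 1024  # maximum bytes in a dirpath
MNTNAMLEN = 255    # maximum bytes in a name


def xdr_string(value: bytes, maxlen: int) -> bytes:
    """Encode a bounded XDR string, rejecting values over the limit."""
    if len(value) > maxlen:
        raise ValueError("string exceeds %d bytes" % maxlen)
    pad = (4 - len(value) % 4) % 4
    return struct.pack(">I", len(value)) + value + b"\x00" * pad
```

A dirpath would be encoded with xdr_string(path, MNTPATHLEN) and a name with xdr_string(host, MNTNAMLEN).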
1256 .IX "mount server procedures"
1258 The following sections define the RPC procedures supplied by a
1259 mount server.
1264 * Protocol description for the mount program
1271 * Version 1 of the mount protocol used with
1272 * version 2 of the NFS protocol.
1276 void MOUNTPROC_NULL(void) = 0;
1277 fhstatus MOUNTPROC_MNT(dirpath) = 1;
1278 mountlist MOUNTPROC_DUMP(void) = 2;
1279 void MOUNTPROC_UMNT(dirpath) = 3;
1280 void MOUNTPROC_UMNTALL(void) = 4;
1281 exportlist MOUNTPROC_EXPORT(void) = 5;
1288 .IX "mount server procedures" MNTPROC_NULL() "" \fIMNTPROC_NULL()\fP
1290 void
1291 MNTPROC_NULL(void) = 0;
1294 This procedure does no work. It is made available in all RPC
1295 services to allow server response testing and timing.
1299 .IX "mount server procedures" MNTPROC_MNT() "" \fIMNTPROC_MNT()\fP
1301 fhstatus
1302 MNTPROC_MNT(dirpath) = 1;
1305 If the reply "status" is 0, then the reply "directory" contains the
1306 file handle for the directory named by the "dirpath" argument. This file
1307 handle may be used in the NFS protocol. This procedure also adds a new
1308 entry to the mount list for this client mounting that directory.
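A complete call message for this procedure might be assembled as sketched below. The RPC header layout (transaction ID, message type, RPC version 2, program, version, procedure, credentials, verifier) and the mount program number 100005 come from the RPC specification and are assumptions as far as this section is concerned. Null credentials are used for brevity; a real client would more likely send UNIX-style credentials. The function names are hypothetical.

```python
import struct

MOUNTPROG, MOUNTVERS, MNTPROC_MNT = 100005, 1, 1
RPC_CALL, RPC_VERSION, AUTH_NONE = 0, 2, 0


def xdr_string(value: bytes) -> bytes:
    """Encode an XDR string: length, bytes, zero padding to 4 bytes."""
    pad = (4 - len(value) % 4) % 4
    return struct.pack(">I", len(value)) + value + b"\x00" * pad


def mnt_call(xid: int, dirpath: bytes) -> bytes:
    """Build the bytes of an RPC call to MNTPROC_MNT for dirpath."""
    header = struct.pack(">IIIIII", xid, RPC_CALL, RPC_VERSION,
                         MOUNTPROG, MOUNTVERS, MNTPROC_MNT)
    null_auth = struct.pack(">II", AUTH_NONE, 0)  # flavor, zero-length body
    # Credentials and verifier are both null in this sketch.
    return header + null_auth + null_auth + xdr_string(dirpath)
```

The resulting bytes would be sent in a single UDP datagram to the port obtained from the server's portmapper.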
1311 \&Return Mount Entries
1312 .IX "mount server procedures" MNTPROC_DUMP() "" \fIMNTPROC_DUMP()\fP
1314 struct *mountlist {
1315     name hostname;
1316     dirpath directory;
1317     mountlist nextentry;
1318 };
1320 mountlist
1321 MNTPROC_DUMP(void) = 2;
1324 Returns the list of remote mounted filesystems. The "mountlist"
1325 contains one entry for each "hostname" and "directory" pair.
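The reply can be read as an XDR optional-data chain: each entry is preceded by a boolean that is non-zero when an entry follows and zero at the end of the list. The sketch below assumes that encoding and the field order "hostname" then "directory"; the function names are hypothetical.

```python
import struct


def _read_string(data: bytes, offset: int):
    """Read one XDR string and return (value, offset after padding)."""
    (length,) = struct.unpack_from(">I", data, offset)
    offset += 4
    value = data[offset:offset + length]
    offset += length + (4 - length % 4) % 4
    return value, offset


def decode_mountlist(data: bytes):
    """Return a list of (hostname, directory) pairs from a DUMP reply."""
    entries = []
    offset = 0
    while True:
        (present,) = struct.unpack_from(">I", data, offset)
        offset += 4
        if not present:
            return entries
        hostname, offset = _read_string(data, offset)
        directory, offset = _read_string(data, offset)
        entries.append((hostname, directory))
```

Since the mount list is advisory, a client should treat the decoded pairs as a hint about who may have the filesystem mounted, not as an authoritative record.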
1328 \&Remove Mount Entry
1329 .IX "mount server procedures" MNTPROC_UMNT() "" \fIMNTPROC_UMNT()\fP
1331 void
1332 MNTPROC_UMNT(dirpath) = 3;
1335 Removes the mount list entry for the input "dirpath".
1338 \&Remove All Mount Entries
1339 .IX "mount server procedures" MNTPROC_UMNTALL() "" \fIMNTPROC_UMNTALL()\fP
1341 void
1342 MNTPROC_UMNTALL(void) = 4;
1345 Removes all of the mount list entries for this client.
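The server-side bookkeeping behind the mount, dump, and unmount procedures can be sketched as a set of (hostname, directory) pairs. The class and method names are hypothetical, and the sketch assumes that an unmount request affects only the calling client's entry.

```python
class MountList:
    """Advisory record of which client has mounted which directory."""

    def __init__(self):
        self._entries = set()  # (hostname, directory) pairs

    def mnt(self, hostname: str, dirpath: str) -> None:
        """Record a successful mount request."""
        self._entries.add((hostname, dirpath))

    def umnt(self, hostname: str, dirpath: str) -> None:
        """Remove one entry; an unknown entry is silently ignored."""
        self._entries.discard((hostname, dirpath))

    def umntall(self, hostname: str) -> None:
        """Remove every entry that belongs to this client."""
        self._entries = {e for e in self._entries if e[0] != hostname}

    def dump(self):
        """Return all entries in a stable order."""
        return sorted(self._entries)
```

Losing this state, for instance in a server crash, does not affect access through file handles that clients already hold, which is why the list is described as advisory.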
1348 \&Return Export List
1349 .IX "mount server procedures" MNTPROC_EXPORT() "" \fIMNTPROC_EXPORT()\fP
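1351 struct *groups {
1352     name grname;
1353     groups grnext;
1354 };
1356 struct *exportlist {
1357     dirpath filesys;
1358     groups groups;
1359     exportlist next;
1360 };
1362 exportlist
1363 MNTPROC_EXPORT(void) = 5;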
1366 Returns a variable number of export list entries. Each entry
1367 contains a filesystem name and a list of groups that are allowed to
1368 import it. The filesystem name is in "filesys", and the group names
1369 are in the list "groups".
1371 Note: The exportlist should contain
1372 more information about the status of the filesystem, such as a