INTRO(5)
NAME
intro - introduction to the Plan 9 File Protocol 9P in Inferno
SYNOPSIS
#include <lib9.h>
#include <styx.h>
DESCRIPTION
Inferno uses the Plan 9 protocol 9P to make resources visible as file
hierarchies. There have been two versions of 9P, the most recent being
9P2000. Inferno originally used its own protocol called Styx, which
was a simplified version of the first 9P protocol. When 9P was
subsequently revised as 9P2000, Inferno adopted it but kept the
name Styx. To reduce confusion and to emphasise that Inferno and Plan
9 use the same protocol, the name Styx is being replaced by 9P. Refer‐
ences to `Styx' remain in the programs and documentation, but will
eventually be eliminated.
An Inferno server is an agent that provides one or more hierarchical
file systems — file trees — that may be accessed by Inferno processes.
A server responds to requests by clients to navigate the hierarchy, and
to create, remove, read, and write files. The prototypical server is a
separate machine that stores large numbers of user files on permanent
media. Another possibility for a server is to synthesize files on
demand, perhaps based on information in data structures inside the
kernel; the prog(3) kernel device is a part of the Inferno kernel that
does this. User programs can also act as servers. One easy way is to
serve a set of files using the sys-file2chan(2) interface. More com‐
plex Limbo file service applications can use styx(2) to handle the pro‐
tocol messages directly, or use styxservers(2) to provide a higher-
level framework for file serving.
A connection to a server is a bidirectional communication path from the
client to the server. There may be a single client or multiple clients
sharing the same connection. A server's file tree is attached to a
process group's name space by bind and mount calls; see intro(2) and
sys-bind(2). Processes in the group are then clients of the server:
system calls operating on files are translated into requests and
responses transmitted on the connection to the appropriate service.
The Plan 9 File Protocol, 9P, is used for messages between clients and
servers. A client transmits requests (T-messages) to a server, which
subsequently returns replies (R-messages) to the client. The combined
act of transmitting (receiving) a request of a particular type and
receiving (transmitting) its reply is called a transaction of that
type.
Each message consists of a sequence of bytes. Two-, four-, and eight-
byte fields hold unsigned integers represented in little-endian order
(least significant byte first). Data items of larger or variable
lengths are represented by a two-byte field specifying a count, n, fol‐
lowed by n bytes of data. Text strings are represented this way, with
the text itself stored as a UTF-8 encoded sequence of Unicode charac‐
ters (see utf(6)). Text strings in 9P messages are not NUL-terminated:
n counts the bytes of UTF-8 data, which include no final zero byte.
The NUL character is illegal in all text strings in 9P, and is there‐
fore excluded from file names, user names, and so on.
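The encoding rules above can be sketched in C as follows. This is a minimal illustration, not the styx(10.2) conversion routines: the names put16, put32, put64, and putstr are hypothetical, and the caller is assumed to supply a buffer large enough for the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not the <styx.h> routines. */

/* Store a two-byte unsigned integer, least significant byte first. */
unsigned char *put16(unsigned char *p, uint16_t v)
{
	p[0] = (unsigned char)(v & 0xFF);
	p[1] = (unsigned char)((v >> 8) & 0xFF);
	return p + 2;
}

/* Store a four-byte unsigned integer, least significant byte first. */
unsigned char *put32(unsigned char *p, uint32_t v)
{
	int i;

	for (i = 0; i < 4; i++)
		p[i] = (unsigned char)((v >> (8 * i)) & 0xFF);
	return p + 4;
}

/* Store an eight-byte unsigned integer, least significant byte first. */
unsigned char *put64(unsigned char *p, uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		p[i] = (unsigned char)((v >> (8 * i)) & 0xFF);
	return p + 8;
}

/*
 * Store a text string as a two-byte count n followed by n bytes of
 * UTF-8. No terminating NUL is written. Returns the next free
 * position, or NULL if the string does not fit a two-byte count.
 */
unsigned char *putstr(unsigned char *p, const char *s)
{
	size_t n;

	assert(p != NULL && s != NULL);
	n = strlen(s);
	if (n > 0xFFFF)
		return NULL;
	p = put16(p, (uint16_t)n);
	memcpy(p, s, n);
	return p + n;
}
```

Each helper returns the position just past what it wrote, so calls can be chained to lay out the fields of a message in order.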
Each 9P message begins with a four-byte size field specifying the
length in bytes of the complete message including the four bytes of the
size field itself. The next byte is the message type, one of the con‐
stants in the module styx(2), and in the C include file <styx.h> (see
styx(10.2)). The next two bytes are an identifying tag, described
below. The remaining bytes are parameters of different sizes. In the
message descriptions, the number of bytes in a field is given in brack‐
ets after the field name. The notation parameter[n] where n is not a
constant represents a variable-length parameter: n[2] followed by n
bytes of data forming the parameter. The notation string[s] (using a
literal s character) is shorthand for s[2] followed by s bytes of UTF-8
text. (Systems may choose to reduce the set of legal characters to
reduce syntactic problems, for example to remove slashes from name com‐
ponents, but the protocol has no such restriction. Inferno names may
contain any printable character (that is, any character outside hexa‐
decimal 00-1F and 80-9F) except slash.) Messages are transported in
byte form to allow for machine independence; styx(10.2) describes rou‐
tines that convert to and from this form into a machine-dependent C
structure.
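As an illustration of the common header described above, the following sketch extracts size, type, and tag from a byte buffer. The structure hdr9p and the function parsehdr are hypothetical names chosen for this example; the routines in styx(10.2) are the supported interface. Treating a buffer shorter than the declared size as an error is a simplification made here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HDRSIZE = 4 + 1 + 2 };	/* size[4] type[1] tag[2] */

/* Hypothetical holder for the fields common to every message. */
struct hdr9p {
	uint32_t size;	/* length of whole message, including size[4] */
	uint8_t type;	/* message type */
	uint16_t tag;	/* identifying tag */
};

/*
 * Decode the header at buf, where n bytes are available.
 * Returns 0 on success. Returns -1 if the header is incomplete,
 * if the size field is smaller than the header itself, or if
 * fewer than size bytes are present in the buffer.
 */
int parsehdr(const unsigned char *buf, size_t n, struct hdr9p *h)
{
	assert(buf != NULL && h != NULL);
	if (n < (size_t)HDRSIZE)
		return -1;
	h->size = (uint32_t)buf[0] | (uint32_t)buf[1] << 8 |
		(uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24;
	h->type = buf[4];
	h->tag = (uint16_t)(buf[5] | buf[6] << 8);
	if (h->size < (uint32_t)HDRSIZE || h->size > n)
		return -1;
	return 0;
}
```

A reader on a stream connection would typically read the four size bytes first, then read the remaining size-4 bytes before decoding the parameters.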
MESSAGES
size[4] Tversion tag[2] msize[4] version[s]
size[4] Rversion tag[2] msize[4] version[s]
size[4] Tauth tag[2] afid[4] uname[s] aname[s]
size[4] Rauth tag[2] aqid[13]
size[4] Rerror tag[2] ename[s]
size[4] Tflush tag[2] oldtag[2]
size[4] Rflush tag[2]
size[4] Tattach tag[2] fid[4] afid[4] uname[s] aname[s]
size[4] Rattach tag[2] qid[13]
size[4] Twalk tag[2] fid[4] newfid[4] nwname[2]
nwname*(wname[s])
size[4] Rwalk tag[2] nwqid[2] nwqid*(wqid[13])
size[4] Topen tag[2] fid[4] mode[1]
size[4] Ropen tag[2] qid[13] iounit[4]
size[4] Tcreate tag[2] fid[4] name[s] perm[4] mode[1]
size[4] Rcreate tag[2] qid[13] iounit[4]
size[4] Tread tag[2] fid[4] offset[8] count[4]
size[4] Rread tag[2] count[4] data[count]
size[4] Twrite tag[2] fid[4] offset[8] count[4] data[count]
size[4] Rwrite tag[2] count[4]
size[4] Tclunk tag[2] fid[4]
size[4] Rclunk tag[2]
size[4] Tremove tag[2] fid[4]
size[4] Rremove tag[2]
size[4] Tstat tag[2] fid[4]
size[4] Rstat tag[2] stat[n]
size[4] Twstat tag[2] fid[4] stat[n]
size[4] Rwstat tag[2]
Each T-message has a tag field, chosen and used by the client to iden‐
tify the message. The reply to the message will have the same tag.
Clients must arrange that no two outstanding messages on the same con‐
nection have the same tag. An exception is the tag NOTAG, defined as
16rFFFF in styx(2) and styx(10.2): the client can use it, when estab‐
lishing a connection, to override tag matching in version messages.
The type of an R-message will either be one greater than the type of
the corresponding T-message or Rerror, indicating that the request
failed. In the latter case, the ename field contains a string describ‐
ing the reason for failure.
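The rule relating reply types to request types can be checked mechanically. The sketch below assumes the numeric value 107 for Rerror, as conventionally assigned in 9P2000; a real program should take the constant from <styx.h>. The names classifyreply and REPLY_* are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { Rerror = 107 };	/* assumed value; use the <styx.h> constant in practice */

enum replykind {
	REPLY_OK,	/* the expected R-message for the request */
	REPLY_ERROR,	/* Rerror: the request failed, see ename */
	REPLY_BAD	/* neither: the reply does not match the request */
};

/* Classify the type of a reply against the type of its T-message. */
enum replykind classifyreply(uint8_t ttype, uint8_t rtype)
{
	if (rtype == Rerror)
		return REPLY_ERROR;
	if (rtype == ttype + 1)
		return REPLY_OK;
	return REPLY_BAD;
}

/* Usage sketch: a request of type 100 expects a reply of type 101. */
void classifyexample(void)
{
	assert(classifyreply(100, 101) == REPLY_OK);
	assert(classifyreply(100, Rerror) == REPLY_ERROR);
}
```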
The version message identifies the version of the protocol and indi‐
cates the maximum message size the system is prepared to handle. It
also initializes the connection and aborts all outstanding I/O on the
connection. The set of messages between version requests is called a
session.
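A Tversion message, with the layout size[4] Tversion tag[2] msize[4] version[s], might be assembled as sketched below. The function packversion is a hypothetical name, and the value 100 for Tversion is the conventional 9P2000 assignment, assumed here rather than taken from <styx.h>. NOTAG is 16rFFFF as stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
	Tversion = 100,	/* assumed value; use the <styx.h> constant in practice */
	NOTAG = 0xFFFF
};

/*
 * Write a Tversion message into buf, which holds cap bytes.
 * Returns the message length, or 0 if the version string is too
 * long for a two-byte count or the message does not fit in buf.
 */
size_t packversion(unsigned char *buf, size_t cap, uint32_t msize,
	const char *version)
{
	size_t vlen, size;
	unsigned char *p = buf;
	int i;

	assert(buf != NULL && version != NULL);
	vlen = strlen(version);
	size = 4 + 1 + 2 + 4 + 2 + vlen;
	if (vlen > 0xFFFF || size > cap)
		return 0;
	for (i = 0; i < 4; i++)			/* size[4] */
		*p++ = (unsigned char)((size >> (8 * i)) & 0xFF);
	*p++ = Tversion;			/* type[1] */
	*p++ = NOTAG & 0xFF;			/* tag[2] */
	*p++ = (NOTAG >> 8) & 0xFF;
	for (i = 0; i < 4; i++)			/* msize[4] */
		*p++ = (unsigned char)((msize >> (8 * i)) & 0xFF);
	*p++ = (unsigned char)(vlen & 0xFF);	/* version[s]: count */
	*p++ = (unsigned char)((vlen >> 8) & 0xFF);
	memcpy(p, version, vlen);		/* version[s]: text */
	return size;
}
```

For the version string "9P2000" the message is 19 bytes long: 13 bytes of fixed fields and count, plus 6 bytes of text.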
Most T-messages contain a fid, a 32-bit unsigned integer that the
client uses to identify a ``current file'' on the server. Fids are
somewhat like file descriptors in a user process, but they are not
restricted to files open for I/O: directories being examined, files
being accessed by sys-stat(2) calls, and so on — all files being manip‐
ulated by the operating system — are identified by fids. Fids are cho‐
sen by the client. All requests on a connection share the same fid
space; when several clients share a connection, the agent managing the
sharing must arrange that no two clients choose the same fid.
The fid supplied in an attach message will be taken by the server to
refer to the root of the served file tree. The attach identifies the
user to the server and may specify a particular file tree served by the
server (for those that supply more than one).
Permission to attach to the service is proven by providing a special
fid, called afid, in the attach message. This afid is established by
exchanging auth messages and subsequently manipulated using read and
write messages to exchange authentication information not defined
explicitly by 9P. Once the authentication protocol is complete, the
afid is presented in the attach to permit the user to access the ser‐
vice.
A walk message causes the server to associate a fid with a file reached
from the current file of an existing fid: a file in that directory, or
in one of its subdirectories. The client chooses the new fid (newfid in
the Twalk message), which then refers to the resulting file; it may be
the same as the original fid. Usually, a client maintains a fid for the
root,
and navigates by walks from the root fid.
A client can send multiple T-messages without waiting for the corre‐
sponding R-messages, but all outstanding T-messages must specify dif‐
ferent tags. The server may delay the response to a request and
respond to later ones; this is sometimes necessary, for example when
the client reads from a file that the server synthesizes from external
events such as keyboard characters.
Replies (R-messages) to auth, attach, walk, open, and create requests
convey a qid field back to the client. The qid represents the server's
unique identification for the file being accessed: two files on the
same server hierarchy are the same if and only if their qids are the
same. (The client may have multiple fids pointing to a single file on
a server and hence having a single qid.) The thirteen-byte qid fields
hold a one-byte type, specifying whether the file is a directory,
append-only file, etc., and two unsigned integers: first the four-byte
qid version, then the eight-byte qid path. The path is an integer
unique among all files in the hierarchy. If a file is deleted and
recreated with the same name in the same directory, the old and new
path components of the qids should be different. The version is a ver‐
sion number for a file; typically, it is incremented every time the
file is modified.
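The thirteen-byte qid layout described above (type[1], version[4], path[8], integers in little-endian order) can be decoded as in the following sketch. The structure qid9p and the function getqid are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { QIDSIZE = 1 + 4 + 8 };	/* type[1] version[4] path[8] */

/* Hypothetical in-memory form of a qid. */
struct qid9p {
	uint8_t type;	/* directory, append-only, and so on */
	uint32_t vers;	/* version number of the file */
	uint64_t path;	/* unique among all files in the hierarchy */
};

/* Decode the QIDSIZE bytes at p into q. */
void getqid(const unsigned char *p, struct qid9p *q)
{
	int i;

	assert(p != NULL && q != NULL);
	q->type = p[0];
	q->vers = 0;
	for (i = 0; i < 4; i++)
		q->vers |= (uint32_t)p[1 + i] << (8 * i);
	q->path = 0;
	for (i = 0; i < 8; i++)
		q->path |= (uint64_t)p[5 + i] << (8 * i);
}
```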
An existing file can be opened, or a new file may be created in the
current (directory) file. I/O of a given number of bytes at a given
offset on an open file is done by read and write.
A client should clunk any fid that is no longer needed. The remove
transaction deletes files.
The stat transaction retrieves information about the file. The stat
field in the reply includes the file's name, access permissions (read,
write and execute for owner, group and public), access and modification
times, and owner and group identifications (see sys-stat(2)). The
owner and group identifications are textual names. The wstat transac‐
tion allows some of a file's properties to be changed.
A request can be aborted with a flush request. When a server receives
a Tflush, it should not reply to the message with tag oldtag (unless it
has already replied), and it should immediately send an Rflush. The
client must wait until it gets the Rflush (even if the reply to the
original message arrives in the interim), at which point oldtag may be
reused.
Because the message size is negotiable and some elements of the proto‐
col are variable length, it is possible (although unlikely) to have a
situation where a valid message is too large to fit within the
negotiated size. For example, a very long file name may cause an Rstat of the
file or Rread of its directory entry to be too large to send. In most
such cases, the server should generate an error rather than modify the
data to fit, such as by truncating the file name. The exception is
that a long error string in an Rerror message should be truncated if
necessary, since the string is only advisory and in some sense arbi‐
trary.
Most programs do not see the 9P protocol directly; instead calls to
library routines that access files are translated by the mount driver,
mnt(3), into 9P messages.
DIRECTORIES
Directories are created by create with DMDIR set in the permissions
argument (see stat(5)). The members of a directory can be found with
read(5). All directories must support walks to the directory .. (dot-
dot) meaning parent directory, although by convention directories con‐
tain no explicit entry for .. or . (dot). The parent of the root
directory of a server's tree is itself.
ACCESS PERMISSIONS
Each file server maintains a set of user and group names. Each user
can be a member of any number of groups. Each group has a group leader
who has special privileges (see stat(5)). Every file request has an
implicit user id (copied from the original attach) and an implicit set
of groups (every group of which the user is a member).
Each file has an associated owner and group id and three sets of per‐
missions: those of the owner, those of the group, and those of
``other'' users. When the owner attempts to do something to a file,
the owner, group, and other permissions are consulted, and if any of
them grant the requested permission, the operation is allowed. For
someone who is not the owner, but is a member of the file's group, the
group and other permissions are consulted. For everyone else, the
other permissions are used. Each set of permissions says whether read‐
ing is allowed, whether writing is allowed, and whether executing is
allowed. A walk in a directory is regarded as executing the directory,
not reading it. Permissions are kept in the low-order bits of the file
mode: owner read/write/execute permission represented as 1 in bits 8,
7, and 6 respectively (using 0 to number the low order). The group
permissions are in bits 5, 4, and 3, and the other permissions are in
bits 2, 1, and 0.
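The permission rule above can be expressed as a small function. This is a sketch of the rule as stated here, not the code of any particular server; the names allowed, AREAD, AWRITE, and AEXEC are hypothetical, and the caller is assumed to have already decided whether the requesting user is the owner or a member of the file's group.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical request bits, matching one three-bit permission set. */
enum { AREAD = 4, AWRITE = 2, AEXEC = 1 };

/*
 * Return 1 if a user may perform the operations in want on a file
 * with the given mode, 0 otherwise.
 *   owner:        owner, group, and other permissions are consulted
 *   group member: group and other permissions are consulted
 *   anyone else:  only other permissions are consulted
 */
int allowed(uint32_t mode, int isowner, int ingroup, unsigned want)
{
	unsigned granted;

	assert((want & ~7u) == 0);
	granted = mode & 7;			/* other: bits 2, 1, 0 */
	if (isowner || ingroup)
		granted |= (mode >> 3) & 7;	/* group: bits 5, 4, 3 */
	if (isowner)
		granted |= (mode >> 6) & 7;	/* owner: bits 8, 7, 6 */
	return (granted & want) == want;
}
```

With mode 0640, for example, the owner may read and write, a group member may only read, and other users have no access. A walk would be checked with AEXEC on the directory.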
The file mode contains some additional attributes besides the permis‐
sions. If bit 31 (DMDIR) is set, the file is a directory; if bit 30
(DMAPPEND) is set, the file is append-only (offset is ignored in
writes); if bit 29 (DMEXCL) is set, the file is exclusive-use (only one
client may have it open at a time); if bit 27 (DMAUTH) is set, the file
is an authentication file established by auth messages; if bit 26
(DMTMP) is set, the contents of the file (or directory) are not
included in nightly archives. (Bit 28 is skipped for historical rea‐
sons.) These bits are reproduced, from the top bit down, in the type
byte of the Qid: QTDIR, QTAPPEND, QTEXCL, (skipping one bit) QTAUTH,
and QTTMP. The name QTFILE, defined to be zero, identifies the value
of the type for a plain file.
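Because the attribute bits occupy the top byte of the mode in the same order as in the qid type byte, the type can be derived by a shift. The sketch below spells out the constants from the bit numbers given above; in practice they come from <styx.h>. The function modetoqidtype is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Mode bits, from the bit numbers given in the text. */
#define DMDIR		0x80000000u	/* bit 31 */
#define DMAPPEND	0x40000000u	/* bit 30 */
#define DMEXCL		0x20000000u	/* bit 29 */
#define DMAUTH		0x08000000u	/* bit 27 */
#define DMTMP		0x04000000u	/* bit 26 */

/* Qid type bits: the same bits, from the top bit of the byte down. */
enum {
	QTDIR = 0x80,
	QTAPPEND = 0x40,
	QTEXCL = 0x20,
	QTAUTH = 0x08,
	QTTMP = 0x04,
	QTFILE = 0x00
};

/* Derive the qid type byte from a file mode. */
uint8_t modetoqidtype(uint32_t mode)
{
	return (uint8_t)((mode >> 24) &
		(QTDIR | QTAPPEND | QTEXCL | QTAUTH | QTTMP));
}

/* Usage sketch: a directory with permissions 0755 has type QTDIR. */
void qidtypeexample(void)
{
	assert(modetoqidtype(DMDIR | 0755) == QTDIR);
	assert(modetoqidtype(0644) == QTFILE);
}
```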
SEE ALSO
intro(2), styx(2), styxservers(2), sys-bind(2), sys-stat(2), mnt(3),
prog(3), read(5), stat(5), styx(10.2)