9P(7) Miscellaneous Information Manual 9P(7)

9P - Simple Distributed File System

9P is a protocol that implements a distributed file system. It provides primitives to manage (create, read, write and delete) sets of files remotely. These files don't necessarily need to be stored on a disk; they may be, for example, synthesised on demand from external sources.

A client transmits requests (T-messages) to a server, which returns replies (R-messages) to the client. The combined act of transmitting a request of a particular type and receiving a reply is called a transaction of that type.

Each message consists of a sequence of bytes, mostly grouped into one-, two- or four-byte integer fields transmitted in little-endian order (least significant byte first). Data items of larger or variable length are represented by a two-byte field specifying the length followed by the actual data. The only exception to this rule is the QID, a thirteen-byte object that is sent as-is.

Text strings are represented with a two-byte count followed by the sequence of Unicode codepoints encoded in UTF-8. Text strings in 9P are not NUL-terminated. The NUL character is illegal in all text strings and is thus excluded from paths, user names and so on.

Fields are hereafter denoted as

type[1] tag[2] fid[4]

to indicate that type is one byte long, tag two and fid four. Strings are denoted as name[s] and are sent on the wire as

length[2] string[length]
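
As an illustration of the string layout above, the following Python sketch encodes and decodes a name[s] field. The helper names pack_string and unpack_string are made up for this example and do not come from any 9P library.

```python
import struct

def pack_string(text: str) -> bytes:
    """Encode text as length[2] string[length] (UTF-8, no NUL terminator)."""
    data = text.encode("utf-8")
    if b"\x00" in data:
        raise ValueError("NUL is not allowed in 9P strings")
    if len(data) > 0xFFFF:
        raise ValueError("string does not fit a two-byte length")
    # "<H" is a little-endian unsigned 16-bit integer.
    return struct.pack("<H", len(data)) + data

def unpack_string(buf: bytes, offset: int = 0):
    """Return (text, next_offset) for the string that starts at offset."""
    (length,) = struct.unpack_from("<H", buf, offset)
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise ValueError("truncated string")
    return buf[start:end].decode("utf-8"), end

encoded = pack_string("glenda")
```

Note that the length counts bytes, not codepoints, so a non-ASCII name occupies more bytes than it has characters.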

A qid, described later, is a 13-byte value that is sent on the wire as

type[1] version[4] path[8]
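
The 13-byte layout can be checked with a small Python sketch. The Qid tuple and the helper names are assumptions made for this example only.

```python
import struct
from typing import NamedTuple

class Qid(NamedTuple):
    type: int      # type[1]
    version: int   # version[4]
    path: int      # path[8]

# "<" selects little-endian with no padding, so B + I + Q is exactly 13 bytes.
QID_FORMAT = "<BIQ"

def pack_qid(qid: Qid) -> bytes:
    return struct.pack(QID_FORMAT, qid.type, qid.version, qid.path)

def unpack_qid(buf: bytes, offset: int = 0) -> Qid:
    return Qid(*struct.unpack_from(QID_FORMAT, buf, offset))

example = pack_qid(Qid(type=0x80, version=1, path=2))
```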

Every message has a header with the following fields:

len[4] type[1] tag[2]

where len indicates the overall length of the message, including itself; type is one byte indicating the type of the message; and tag is a number chosen by the client that uniquely identifies the request. An optional body follows, whose structure depends on the type of the message.
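
A minimal Python sketch of the header framing follows. The helper names are hypothetical, and the numeric message type used in the example (120 for a clunk request) is the value found in Plan 9's fcall.h, which this page does not list.

```python
import struct

HEADER = struct.Struct("<IBH")  # len[4] type[1] tag[2], little-endian

def pack_message(mtype: int, tag: int, body: bytes = b"") -> bytes:
    """Prefix body with a header; len covers the header itself too."""
    return HEADER.pack(HEADER.size + len(body), mtype, tag) + body

def unpack_header(buf: bytes):
    """Return (len, type, tag) from the first seven bytes of buf."""
    length, mtype, tag = HEADER.unpack_from(buf)
    if length < HEADER.size:
        raise ValueError("len is smaller than the header")
    return length, mtype, tag

TCLUNK = 120  # assumption: value taken from Plan 9's fcall.h
message = pack_message(TCLUNK, tag=1, body=struct.pack("<I", 5))  # fid[4] = 5
```

A reader of a byte stream can use len to know how many more bytes to wait for before decoding the body.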

The message types are as follows: (the header is omitted for brevity)

Negotiate the version and maximum message size.
Tversion msize[4] version[s]
Rversion msize[4] version[s]

The version request must be the first message sent, and the client cannot issue further requests until receiving the Rversion reply. tag should be NOTAG, defined as (uint16_t)~0, that is 0xFFFF. The client suggests an msize (the maximum size for packets) and the protocol version used; the server replies with an msize smaller than or equal to the one proposed by the client. The version string must always begin with the two characters “9P”. If the server doesn't understand the version requested by the client, it should reply with an Rversion using the version string “unknown” rather than an Rerror.
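
The negotiation can be sketched in Python as below. This is only an illustration: the function names are invented, the type number 100 comes from Plan 9's fcall.h rather than from this page, and “9P2000” is assumed as the version string a client would propose.

```python
import struct

TVERSION = 100   # assumption: value from Plan 9's fcall.h
NOTAG = 0xFFFF   # (uint16_t)~0, the tag used for version messages

def tversion(msize: int, version: str = "9P2000") -> bytes:
    """Build a complete version request: header, msize[4], version[s]."""
    v = version.encode("utf-8")
    body = struct.pack("<IH", msize, len(v)) + v
    return struct.pack("<IBH", 7 + len(body), TVERSION, NOTAG) + body

def check_rversion(proposed_msize: int, msize: int, version: str) -> int:
    """Validate the fields of a reply and return the msize to use."""
    if version == "unknown":
        raise ValueError("server does not understand the proposed version")
    if not version.startswith("9P"):
        raise ValueError("version string must begin with 9P")
    if msize > proposed_msize:
        raise ValueError("server msize exceeds the proposed one")
    return msize

request = tversion(8192)
```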

Populate the namespace
Tattach fid[4] afid[4] uname[s] aname[s]
Rattach qid[13]

The attach message binds the given fid to the root of the file tree identified by aname. uname identifies the user and afid specifies a fid previously established by an auth message, or the special NOFID value (defined as (uint32_t)~0) if authentication is not required.
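
As a sketch, an attach request without authentication could be assembled like this in Python. The function name and the example user and tree names are hypothetical, and the type number 104 is taken from Plan 9's fcall.h, not from this page.

```python
import struct

TATTACH = 104        # assumption: value from Plan 9's fcall.h
NOFID = 0xFFFFFFFF   # (uint32_t)~0, meaning "no authentication fid"

def _string(text: str) -> bytes:
    data = text.encode("utf-8")
    return struct.pack("<H", len(data)) + data

def tattach(tag: int, fid: int, uname: str, aname: str,
            afid: int = NOFID) -> bytes:
    """Build header + fid[4] afid[4] uname[s] aname[s]."""
    body = struct.pack("<II", fid, afid) + _string(uname) + _string(aname)
    return struct.pack("<IBH", 7 + len(body), TATTACH, tag) + body

request = tattach(tag=1, fid=0, uname="op", aname="/")
```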

Close fids.
Tclunk fid[4]
Rclunk ⟨empty response⟩

Once a fid has been clunked (closed) it becomes “free” and the same value can be used for subsequent walk or attach requests.

The actual file on the disk is not removed unless it was opened with the ORCLOSE flag.

Return an error string.
⟨no request⟩
Rerror ename[s]

The Rerror message is used to return an error string describing the failure of a request. The tag indicates the failed request.

Note that there isn't a Terror request, for obvious reasons, and that it's not possible for a server to reply to a Tversion or Tflush using Rerror.

Abort an ongoing operation.
Tflush oldtag[2]
Rflush ⟨empty response⟩

Given the asynchronous nature of the protocol, the server may respond to the pending request before responding to the Tflush, and it is possible for a client to send multiple Tflush requests for the same operation. The client must wait to receive a corresponding Rflush before reusing oldtag for subsequent messages.

If a response for oldtag is received before the Rflush reply, the client must assume that the operation was completed with success (fid allocated, files created, ...). If no response is received before the Rflush then the transaction is considered to have been successfully cancelled.

Note that the tag of this request and the corresponding reply is NOT oldtag but a new tag value.
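
The client-side rule for interpreting a flush can be sketched as a small Python function. It is a simplification that only looks at the order in which reply tags arrive; the function name and the returned labels are invented for this example.

```python
def flush_outcome(reply_tags, oldtag: int, flushtag: int) -> str:
    """Classify a flushed request from the tags of replies in arrival order.

    flushtag is the new tag used for the flush request itself.
    """
    for tag in reply_tags:
        if tag == oldtag:
            # The original request was answered before the flush reply.
            return "completed"
        if tag == flushtag:
            # The flush reply came first: the request was cancelled
            # and oldtag may be reused from now on.
            return "cancelled"
    return "pending"  # neither reply has arrived yet
```

For instance, with oldtag 3 and flushtag 9, the arrival order 7, 3, 9 means the request completed, while 7, 9 means it was cancelled.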

Traverse a file tree.
Twalk fid[4] newfid[4] nwname[2] nwname*(wname[s])
Rwalk nwqid[2] nwqid*(qid[13])

The nwname components are walked in order starting from fid (which must point to a directory) and, if successful, newfid is associated to the reached file.

fid and newfid may be equal, in which case the fid is “mutated”; otherwise newfid must be unused. As a special case, a walk of zero components duplicates the fid.

If the first element cannot be walked for any reason an Rerror is returned. Otherwise, Rwalk is returned with a number of qids equal to the number of files visited by the walk. A client can thus detect a failed walk when the nwqid count in the reply is not equal to the nwname field in the request. newfid is affected only when the whole walk succeeds.

A maximum of 16 components can be used per walk request.
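
A hedged Python sketch of building a walk request and checking the reply count is shown below. The names are hypothetical, and the type number 110 is the value from Plan 9's fcall.h, which this page does not state.

```python
import struct

TWALK = 110          # assumption: value from Plan 9's fcall.h
MAX_COMPONENTS = 16  # limit per walk request, as stated above

def twalk(tag: int, fid: int, newfid: int, names) -> bytes:
    """Build header + fid[4] newfid[4] nwname[2] nwname*(wname[s])."""
    if len(names) > MAX_COMPONENTS:
        raise ValueError("too many path components for one walk")
    body = struct.pack("<IIH", fid, newfid, len(names))
    for name in names:
        data = name.encode("utf-8")
        body += struct.pack("<H", len(data)) + data
    return struct.pack("<IBH", 7 + len(body), TWALK, tag) + body

def walk_complete(nwname: int, nwqid: int) -> bool:
    """True when the reply carries one qid per requested component."""
    return nwqid == nwname

request = twalk(tag=1, fid=0, newfid=1, names=["usr", "glenda"])
clone = twalk(tag=2, fid=0, newfid=2, names=[])  # zero components: duplicate fid
```

A path longer than 16 components would have to be split across several walk requests.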

Prepare a fid for I/O.
Topen fid[4] mode[1]
Ropen qid[13] iounit[4]

mode determines the type of I/O:

0x00 (OREAD)
Open the file for reading.
0x01 (OWRITE)
Open the file for writing.
0x02 (ORDWR)
Open the file for both reading and writing.
0x03 (OEXEC)
Open for exec.

Additionally, the following flags can be or'ed to mode:

0x10 (OTRUNC)
Truncate the file before opening.
0x40 (ORCLOSE)
Remove the file upon clunk.

The returned iounit is the optimal blocksize for I/O.
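
The way the access type and the flags share the single mode byte can be illustrated with a short Python sketch. The constant values are those listed above; describe_mode is a made-up helper.

```python
OREAD, OWRITE, ORDWR, OEXEC = 0x00, 0x01, 0x02, 0x03
OTRUNC, ORCLOSE = 0x10, 0x40

ACCESS_NAMES = {OREAD: "OREAD", OWRITE: "OWRITE",
                ORDWR: "ORDWR", OEXEC: "OEXEC"}

def describe_mode(mode: int) -> str:
    """Split a mode byte into its access type and or'ed flags."""
    names = [ACCESS_NAMES[mode & 0x03]]  # low two bits select the I/O type
    if mode & OTRUNC:
        names.append("OTRUNC")
    if mode & ORCLOSE:
        names.append("ORCLOSE")
    return "|".join(names)

mode = OWRITE | OTRUNC  # open for writing, truncating first
```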

Create a file
Tcreate fid[4] name[s] perm[4] mode[1]
Rcreate qid[13] iounit[4]

The call attempts to create a file named name in the directory identified by fid according to perm and then to open it with mode into the given fid.

It is illegal to use an already opened fid or to attempt to create the “.” or “..” entries.

Read data at offset
Tread fid[4] offset[8] count[4]
Rread count[4] data[count]

fid must have been prepared for I/O with a previous open call. The returned count is zero when reaching end-of-file and may be less than what was requested.

Directories are a stream of stat structures, as described in stat, and for them the read request message must have offset equal to zero or to the value of offset in the previous read on the directory plus the number of bytes returned in the previous read. Thus, it is not possible to seek into directories except for rewinding.
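
The following Python sketch builds a read request and encodes the directory offset rule as a predicate. The helper names are hypothetical, and the type number 116 comes from Plan 9's fcall.h rather than from this page.

```python
import struct

TREAD = 116  # assumption: value from Plan 9's fcall.h

def tread(tag: int, fid: int, offset: int, count: int) -> bytes:
    """Build header + fid[4] offset[8] count[4]."""
    body = struct.pack("<IQI", fid, offset, count)
    return struct.pack("<IBH", 7 + len(body), TREAD, tag) + body

def valid_dir_offset(offset: int, prev_offset: int, prev_count: int) -> bool:
    """A directory read may only rewind or continue where the last one ended."""
    return offset == 0 or offset == prev_offset + prev_count

request = tread(tag=1, fid=0, offset=0, count=4096)
```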

Write data at offset
Twrite fid[4] offset[8] count[4] data[count]
Rwrite count[4]

fid must have been prepared for I/O with a previous open or create call. The returned count is the amount of data actually written and may differ from the one in the request.
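
Because the returned count may be smaller than the requested one, a client that needs all of its data written has to loop. The Python sketch below shows one way to do so; write_once is a hypothetical callback standing in for a single write transaction, and splitting the data by iounit is an assumption of this example, not a requirement stated here.

```python
def write_all(write_once, offset: int, data: bytes, iounit: int) -> int:
    """Write all of data starting at offset, tolerating short writes.

    write_once(offset, chunk) performs one write transaction and
    returns the count reported by the server.
    """
    total = 0
    while total < len(data):
        chunk = data[total:total + iounit]
        written = write_once(offset + total, chunk)
        if written <= 0:
            raise IOError("server wrote no data")
        total += written  # resume right after the bytes actually written
    return total
```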

Get file status
Tstat fid[4]
Rstat stat[n]

The stat structure is made up of the following fields:

size[2]
total byte count of the following data
type[2]
for kernel use
dev[4]
for kernel use
qid[13]
server unique identifier of the file
mode[4]
permissions and flags
atime[4]
last access time
mtime[4]
last modification time
length[8]
length of file in bytes
name[s]
file name (must be “/” if the file is the root directory of the server)
uid[s]
owner name
gid[s]
group name
muid[s]
name of the user who last modified the file.

Note that the size is always present, even in the wstat call. While it may be considered redundant, it's kept to simplify the parsing of the stat entries in a directory.
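
To show how the leading size helps when several entries are concatenated, as in a directory read, here is a Python sketch that packs and unpacks a stat structure following the field list above. All helper names and the example values are invented for illustration.

```python
import struct

# type[2] dev[4] qid(type[1] version[4] path[8])
# mode[4] atime[4] mtime[4] length[8]: 39 bytes, little-endian, unpadded
FIXED = struct.Struct("<HIBIQIIIQ")

def pack_stat(fixed, strings) -> bytes:
    """fixed: the 9 integers above; strings: (name, uid, gid, muid)."""
    body = FIXED.pack(*fixed)
    for text in strings:
        data = text.encode("utf-8")
        body += struct.pack("<H", len(data)) + data
    return struct.pack("<H", len(body)) + body  # size excludes itself

def unpack_stat(buf: bytes, offset: int = 0):
    """Return (fixed, strings, next_offset) for the entry at offset."""
    (size,) = struct.unpack_from("<H", buf, offset)
    pos = offset + 2
    end = pos + size
    if end > len(buf):
        raise ValueError("truncated stat")
    fixed = FIXED.unpack_from(buf, pos)
    pos += FIXED.size
    strings = []
    for _ in range(4):
        (n,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        strings.append(buf[pos:pos + n].decode("utf-8"))
        pos += n
    if pos != end:
        raise ValueError("size does not match the entry contents")
    return fixed, tuple(strings), end

entry = pack_stat((0, 0, 0, 0, 1, 0o644, 0, 0, 12), ("motd", "op", "op", ""))
listing = entry + entry  # two entries back to back, as in a directory
```

The returned next_offset lets a caller step through listing one entry at a time without decoding anything beyond the size field first.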

Change file attributes
Twstat fid[4] stat[n]
Rwstat ⟨empty response⟩

fid must have been prepared for writing with a previous open or create call.

The stat structure is the same as described in stat.

The stat structure sent reflects the changes the client wishes to make to the given fid. To leave a field unchanged, use an empty string for text fields or the maximum allowed value for integral fields. For example, to avoid changing the permissions of the fid use 0xFFFFFFFF, or (uint32_t)-1.
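
A Python sketch of a stat structure that changes only selected fields follows. The helper name and its parameters are hypothetical, and treating the qid members like the other integral fields is an assumption of this example.

```python
import struct

# type[2] dev[4] qid(type[1] version[4] path[8])
# mode[4] atime[4] mtime[4] length[8]
FIXED = struct.Struct("<HIBIQIIIQ")

# Maximum value of every integral field means "leave unchanged".
NO_CHANGE = (0xFFFF, 0xFFFFFFFF, 0xFF, 0xFFFFFFFF, 0xFFFFFFFFFFFFFFFF,
             0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFFFFFFFFFF)

def wstat_stat(mode=None, mtime=None, length=None, name="") -> bytes:
    """Build a stat that only touches the fields given as arguments."""
    fixed = list(NO_CHANGE)
    if mode is not None:
        fixed[5] = mode
    if mtime is not None:
        fixed[7] = mtime
    if length is not None:
        fixed[8] = length
    body = FIXED.pack(*fixed)
    for text in (name, "", "", ""):  # name, uid, gid, muid; "" = unchanged
        data = text.encode("utf-8")
        body += struct.pack("<H", len(data)) + data
    return struct.pack("<H", len(body)) + body

untouched = wstat_stat()        # changes nothing
rename = wstat_stat(name="new") # changes only the file name
```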

Remove and clunk fid
Tremove fid[4]
Rremove ⟨empty response⟩

After a remove call, even if an error is returned, the fid is closed.

utf8(7), kamid(8)

July 30, 2021 OpenBSD 7.2