tahoe-lafs/docs/webapi.txt

== connecting to the tahoe node ==
Writing "8011" into $NODEDIR/webport causes the node to run a webserver on
port 8011. Writing "tcp:8011:interface=127.0.0.1" into $NODEDIR/webport does
the same but binds to the loopback interface, ensuring that only the programs
on the local host can connect. Using
"ssl:8011:privateKey=mykey.pem:certKey=cert.pem" would run an SSL server. See
twisted.application.strports for more details.
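As a small illustration of the configuration step above, the following sketch writes a webport file. The helper name write_webport and its loopback_only flag are made up for this example, and it assumes the node reads the file's contents exactly as written (no trailing newline is added here).

```python
import os
import tempfile

def write_webport(nodedir, port, loopback_only=True):
    """Write a strports-style description into NODEDIR/webport.

    Hypothetical helper: only the two plain-TCP forms shown in the
    text above are produced.
    """
    if loopback_only:
        spec = "tcp:%d:interface=127.0.0.1" % port
    else:
        spec = str(port)
    with open(os.path.join(nodedir, "webport"), "w") as f:
        f.write(spec)
    return spec

# Demonstrate against a throwaway directory rather than a real node.
demo_dir = tempfile.mkdtemp()
print(write_webport(demo_dir, 8011))
```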
If $NODEDIR/webpassword exists, it will be used (somehow) to require HTTP
Digest Authentication for all webserver connections. XXX specify how
== vdrive ==
The node provides some small number of "virtual drives". In the 0.5
release, this number is two: the first is the global shared vdrive, the
second is the private non-shared vdrive. We will call these "global" and
"private" for now.
For the purpose of this document, let us assume that the vdrives currently
contain the following directories and files:
global/
global/Documents/
global/Documents/notes.txt
private/
private/Pictures/
private/Pictures/tractors.jpg
Within the webserver, there is a tree of resources. The top-level "vdrive"
resource gives access to files and directories in all of the user's virtual
drives. For example, the URL that corresponds to notes.txt would be:
http://localhost:8011/vdrive/global/Documents/notes.txt
and the URL for tractors.jpg would be:
http://localhost:8011/vdrive/private/Pictures/tractors.jpg
In addition, each directory has a corresponding URL. The Pictures URL is:
http://localhost:8011/vdrive/private/Pictures
Now, what can we do with these URLs? By varying the HTTP method
(GET/PUT/POST/DELETE) and by appending a type-indicating query argument, we
control what we want to do with the data and how it should be presented.
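The mapping from vdrive paths to URLs described above can be sketched in a few lines of Python. The function name vdrive_url is invented for this example, the base address is the one used in the examples above, and percent-quoting each name is an assumption about how unusual characters should be handled.

```python
from urllib.parse import quote

BASE = "http://localhost:8011"  # node address used in the examples above

def vdrive_url(drive, *names, base=BASE):
    """Build a name-based URL: vdrive/<drive>/<name>/<name>/..."""
    parts = ["vdrive", drive] + list(names)
    # Quote each component on its own so that characters inside a name
    # are never mistaken for path separators (an assumption, see above).
    return base + "/" + "/".join(quote(p, safe="") for p in parts)

print(vdrive_url("global", "Documents", "notes.txt"))
print(vdrive_url("private", "Pictures"))
```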
=== files and directories by name ===
In the following examples "$URL" is a shorthand for a URL like the ones
described above, with "vdrive/" as the top level, followed by a
slash-separated sequence of file or directory names. "$NEWURL" is a
shorthand for a URL pointing to a location in the vdrive where currently
nothing exists.
GET $URL
If the given place in the vdrive contains a file, then this simply
retrieves the contents of the file. The Content-Type is set according to
the vdrive's metadata (if available) or by using the usual
filename-extension-magic built into most webservers. The file's contents
are provided in the body of the HTTP response.
If the given place contains a directory, then this returns an HTML page,
intended to be used by humans, which contains HREF links to all files and
directories reachable from this dirnode. These HREF links do not have a t=
argument, meaning that a human who follows them will get pages also meant
for a human. It also contains forms to upload new files, and to delete
files and directories. These forms use POST methods to do their job.
You can add the "save=true" argument, which adds a 'Content-Disposition:
attachment' header to prompt most web browsers to save the file to disk
rather than attempting to display it.
GET $URL?t=json
This returns machine-parseable information about the named file or
directory in the HTTP response body. This information contains a flag that
indicates whether the thing is a file or a directory.
If it is a file, then the information includes file size, metadata (like
Content-Type), and URIs, like this:
[ 'filenode', { 'mutable': bool, 'uri': file_uri, 'size': bytes } ]
If it is a directory, then it includes a flag to indicate whether this is a
read-write dirnode or a read-only dirnode, and information about the
children of this directory, as a mapping from child name to a set of
metadata about the child (the same data that would appear in a
corresponding GET?t=json of the child itself). Like this:
[ 'dirnode', { 'mutable': bool, 'uri': uri, 'children': children } ]
where 'children' is a dictionary in which the keys are child names
and the values depend upon whether the child is a file or a directory:
'foo.txt': [ 'filenode', { 'mutable': bool, 'uri': uri, 'size': bytes } ]
'subdir': [ 'dirnode', { 'mutable': bool, 'uri': uri } ]
Note that each value is the same as the JSON representation returned by a
"GET $URL?t=json" of the corresponding child (except that dirnodes do not
recurse -- the "children" entry of the child is omitted).
Before writing code that uses these results, please see the important note
below about TOCTTOU bugs.
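To make the shape of the t=json response concrete, here is a minimal sketch that parses a response body and distinguishes files from directories. The function name summarize_node and the placeholder URI strings are invented for the example. The sample body is built by hand from the layout shown above rather than captured from a real node.

```python
import json

def summarize_node(body):
    """Return (kind, summary) for the body of a GET $URL?t=json response."""
    kind, info = json.loads(body)
    if kind == "filenode":
        return kind, {"size": info["size"], "uri": info["uri"]}
    if kind == "dirnode":
        # 'children' maps child name -> [kind, info]; keep name -> kind.
        children = {name: child[0]
                    for name, child in info.get("children", {}).items()}
        return kind, {"mutable": info["mutable"], "children": children}
    raise ValueError("unexpected node type: %r" % (kind,))

# Hand-written sample following the layout above; the URIs are placeholders.
sample = json.dumps(["dirnode", {
    "mutable": True,
    "uri": "PLACEHOLDER-DIR-URI",
    "children": {
        "foo.txt": ["filenode", {"mutable": False,
                                 "uri": "PLACEHOLDER-FILE-URI",
                                 "size": 1234}],
        "subdir": ["dirnode", {"mutable": True,
                               "uri": "PLACEHOLDER-SUBDIR-URI"}],
    },
}])
print(summarize_node(sample))
```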
GET $URL?t=uri
This returns the URI of the given file or directory in the HTTP response
body. If you have read-write access to that resource then this returns a
URI which provides read-write access. If you have read-only access to that
resource then this returns a URI which provides read-only access.
GET $URL?t=readonly-uri
This returns the URI providing read-only access to the given file or
directory (whether you have read-only or read-write access).
(Currently all files are immutable so everyone has read-only access to all
files.)
GET $URL?t=download&localfile=$LOCALPATH
This instructs the node to download the given file or directory and write
it into the local filesystem at $LOCALPATH. This request will only be
accepted from an HTTP client connection originating at 127.0.0.1. This
request is most useful when the client node and the HTTP client are
operated by the same user. $LOCALPATH should be an absolute pathname.
PUT $URL?t=uri
This attaches a child (either a file or a directory) to the vdrive at the
given location. The URI of the child is provided in the body of the HTTP
request. This can be used to attach a shared directory to the
vdrive. Intermediate directories are created on-demand just like with the
regular PUT command.
PUT $NEWURL?t=upload&localfile=$LOCALPATH
This uploads a file or directory from the node's local filesystem to the
vdrive. As with "GET $URL?t=download&localfile=$LOCALPATH", this request
will only be accepted from an HTTP connection originating from 127.0.0.1.
If $LOCALPATH points to a directory on the node's local filesystem, then
the node performs a recursive upload of the directory into the vdrive at
the given location. $NEWURL will be created if necessary. When the
operation is complete, the directory referenced by $NEWURL will contain all
of the files and directories that were present in $LOCALPATH, so this is
equivalent to the unix commands:
mkdir -p $NEWURL; cp -r $LOCALPATH/* $NEWURL/
Note that the "curl" utility can be used to provoke this sort of recursive
upload, since the -T option will make it use an HTTP 'PUT':
curl -T /dev/null 'http://localhost:8011/vdrive/global/newdir?t=upload&localfile=/home/user/directory-to-upload'
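The same kind of request can be prepared from Python. This sketch only builds the request object and does not send it. It uses the localfile= argument as written in the method description above, and the function name local_upload_request is invented for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

def local_upload_request(new_url, local_path):
    """Build (but do not send) PUT $NEWURL?t=upload&localfile=$LOCALPATH."""
    query = urlencode({"t": "upload", "localfile": local_path})
    # Like 'curl -T /dev/null', the request carries an empty body: the
    # node reads the data from its own local filesystem.
    return Request(new_url + "?" + query, data=b"", method="PUT")

req = local_upload_request("http://localhost:8011/vdrive/global/newdir",
                           "/home/user/directory-to-upload")
print(req.get_method(), req.full_url)
# To actually send it to a running node: urllib.request.urlopen(req)
```

Note that urlencode percent-escapes the slashes in $LOCALPATH. A web server decodes query arguments before use, so this should be equivalent to the unescaped form in the curl example.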
DELETE $URL
This deletes the given file or directory from the vdrive. If it is a
directory then this deletes all of its children. Note that this *does not*
delete any parent directories, so a sequence of 'PUT $NEWURL' and 'DELETE
$NEWURL' does not necessarily return the vdrive to its original state (it
may leave some intermediate directory nodes).
=== files by name ===
PUT $NEWURL
This uploads a file to the given place in the vdrive. It will create
intermediate directory nodes as necessary. The file's contents are taken
from the body of the HTTP request. For convenience, the HTTP response
contains the URI that results from uploading the file, although the client
is not obligated to do anything with the URI. According to the HTTP/1.1
specification (rfc2616), this should return a 200 (OK) code when modifying
an existing file, and a 201 (Created) code when creating a new file.
To use this, run 'curl -T localfile http://localhost:8011/vdrive/global/newfile'
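A rough Python equivalent of that curl command is sketched below. It builds the PUT request without sending it, and adds a small helper that interprets the two status codes named above. Both function names are invented for this example.

```python
from urllib.request import Request

def put_file_request(new_url, contents):
    """Build (but do not send) a PUT whose body is the file's contents."""
    return Request(new_url, data=contents, method="PUT")

def describe_put_status(status):
    """Interpret the status codes the text above says PUT should return."""
    if status == 201:
        return "created a new file"
    if status == 200:
        return "replaced an existing file"
    return "unexpected status %d" % status

req = put_file_request("http://localhost:8011/vdrive/global/newfile",
                       b"hello\n")
print(req.get_method(), req.full_url, len(req.data))
# With a running node, urllib.request.urlopen(req) would send the request;
# the response body would then hold the new file's URI.
```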
=== directories by name ===
PUT $NEWURL?t=mkdir
Create a new empty directory at the given path. The HTTP response contains
the URI of the given directory, although the client is not obligated to do
anything with it.
GET $URL?t=rename-form&name=$CHILDNAME
This provides a useful facility to browser-based user interfaces. It
returns a page containing a form targeting the "POST $URL t=rename"
functionality described below, with the provided $CHILDNAME present in the
'from_name' field of that form. I.e. this presents a form offering to
rename $CHILDNAME, requesting the new name, and submitting POST rename.
== URIs ==
A separate top-level resource namespace ("uri/" instead of "vdrive/") is used
to get access to files and dirnodes that are indexed directly by URI, rather
than by going through the vdrive. The resource thus referenced is used the
same way as if it were accessed through the vdrive (including accessing a
directory's children with "$URI/childname").
For example, this identifies a file or directory:
http://localhost:8011/uri/$URI
And this identifies a file or directory "foo" in a subdirectory "somedir" of
the identified directory:
http://localhost:8011/uri/$URI/somedir/foo
In the following examples, "$URI_URL" is a shorthand for a URL like the one
above, with "uri/" as the top level, followed by a URI.
Note that since tahoe URIs may contain slashes (in particular, dirnode URIs
contain a FURL, which resembles a regular HTTP URL and starts with pb://),
when URIs are used in this form, they must be specially quoted. All slashes
in the URI must be replaced by '!' characters. XXX consider changing the
allmydata.org uri format to relieve the user of this requirement.
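The quoting rule above is simple enough to show directly. In this sketch the function name uri_to_url is invented, and the sample URI is a made-up placeholder that merely contains slashes. It is not meant to show the real URI format.

```python
BASE = "http://localhost:8011"  # node address used in the examples above

def uri_to_url(uri, *children, base=BASE):
    """Build a uri/ URL, replacing every '/' in the URI with '!'."""
    quoted = uri.replace("/", "!")
    # Child names are appended as ordinary slash-separated path segments.
    return "/".join([base, "uri", quoted] + list(children))

# Made-up placeholder URI containing slashes (not a real tahoe URI).
fake_uri = "URI:DIR:pb://tubid@example.org:1234/swissnum"
print(uri_to_url(fake_uri))
print(uri_to_url(fake_uri, "somedir", "foo"))
```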
GET $URI_URL
GET $URI_URL?t=json
GET $URI_URL?t=readonly-uri
These each behave the same way that their name-based URL equivalent does,
described in the "files and directories by name" section above. The difference is
that which file or directory you access does not depend on the contents of
parent directories as it does with the name-based URLs, since a URI
uniquely identifies an object regardless of its location.
Since files accessed this way do not have a filename (from which a
MIME-type can be derived), one can be specified using a 'filename=' query
argument. This filename is also the one used if the 'save=true' argument is
set.
Note that since the $URI_URL already contains the URI, the only use for the
"?t=readonly-uri" command is if the thing identified is a directory and you
have read-write access to it and you want to get a URI which provides
read-only access to it.
GET http://localhost:8011/uri?uri=$URI
This causes a redirect to /uri/$URI, and retains any additional query
arguments (like filename= or save=). This is for the convenience of web
forms which allow the user to paste in a URI (obtained through some
out-of-band channel, like IM or email).
Note that this form only redirects to the specific node indicated by the
URI: unlike the GET /uri/$URI form, you cannot traverse to child nodes by
appending additional path segments to the URL.
The $URI provided as a query argument is allowed to contain slashes. The
redirection provided will escape the slashes with exclamation points, as
described above.
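For the query-argument form, the URI can be passed through ordinary query-string encoding instead of the '!' quoting. A minimal sketch, with an invented function name and a made-up placeholder URI:

```python
from urllib.parse import urlencode

def uri_redirect_url(uri, base="http://localhost:8011", **extra):
    """Build the /uri?uri=$URI form, passing along extra query arguments."""
    params = {"uri": uri}
    params.update(extra)  # e.g. filename=..., save=...
    return base + "/uri?" + urlencode(params)

# The URI below is a placeholder, not a real tahoe URI.
print(uri_redirect_url("URI:x/y", filename="a.txt", save="true"))
```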
== TOCTTOU bugs ==
Note that since directories are mutable you can get surprises if you query
the vdrive, e.g. "GET $URL?t=json", examine the resulting JSON-encoded
information, and then fetch from or update the vdrive using a name-based URL.
This is because the actual state of the vdrive could have changed after you
did the "GET $URL?t=json" query and before you did the subsequent fetch or
update.
For example, suppose you query to find out that "vdrive/private/somedir/foo"
is a file which has a certain number of bytes, and then you issue a "GET
vdrive/private/somedir/foo" to fetch the file. The file that you get might
have a different number of bytes than the one that you chose to fetch,
because the "foo" entry in the "somedir" directory may have been changed to
point to a different file between your query and your fetch, or because the
"somedir" entry in the private vdrive might have been changed to point to a
different directory.
Potentially more damaging, suppose that the "foo" entry was changed to point
to a directory instead of a file. Then instead of receiving the expected
file, you receive a file containing an HTML page describing the directory
contents!
These are examples of TOCTTOU bugs ( http://en.wikipedia.org/wiki/TOCTTOU ).
A good way to avoid these bugs is to issue your second request, not with a
URL based on the sequence of names that lead to the object, but instead with
the URI of the object. For example, in the case that you query a directory
listing (with "GET vdrive/private/somedir?t=json"), find a file named "foo"
therein that you want to download, and then download the file, if you
download it with its URI ("GET uri/$URI") instead of its URL ("GET
vdrive/private/somedir/foo") then you will get the file that was in the
"somedir/" directory under the name "foo" when you queried that directory,
even if the "somedir/" directory has since been changed so that its "foo"
child now points to a different file or to a directory.
In general, use names if you want "whatever object (whether file or
directory) is found by following this sequence of names when my request
reaches the server". Use URIs if you want "this particular object".
If you are basing your decision to fetch from or update the vdrive on
filesystem information that was returned by an earlier query, then you
usually intend to fetch or update the particular object that was in that
location when you queried it, rather than whatever object is going to be in
that location when your request reaches the server.
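The recommended pattern (query by name once, then fetch by URI) might look like the following sketch. It takes the body of a "GET $URL?t=json" directory listing and returns the uri/ URL for one child, so that the later fetch refers to the object that was listed. The function name and the sample listing are invented for the example.

```python
import json

def pinned_child_url(listing_body, child_name, base="http://localhost:8011"):
    """From a t=json directory listing, build a uri/ URL for a child file."""
    kind, info = json.loads(listing_body)
    if kind != "dirnode":
        raise ValueError("listing is not a directory")
    child_kind, child_info = info["children"][child_name]
    if child_kind != "filenode":
        raise ValueError("%r is not a file" % (child_name,))
    # Slashes in the URI are replaced by '!' as the URIs section requires.
    return base + "/uri/" + child_info["uri"].replace("/", "!")

# Hand-written sample listing; the URI values are placeholders.
listing = json.dumps(["dirnode", {
    "mutable": True,
    "uri": "PLACEHOLDER/DIR",
    "children": {
        "foo": ["filenode", {"mutable": False,
                             "uri": "PLACEHOLDER/FILE",
                             "size": 10}],
        "bar": ["dirnode", {"mutable": True, "uri": "PLACEHOLDER/SUB"}],
    },
}])
print(pinned_child_url(listing, "foo"))
```

Fetching the printed URL retrieves the file that was listed, even if "foo" has since been re-pointed in the directory.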
== POST forms ==
POST $URL
t=upload
name=childname (optional)
file=newfile
This instructs the node to upload a file into the given dirnode. We need
this because forms are the only way for a web browser to upload a file
(browsers do not know how to do PUT or DELETE). The file's contents and the
new child name will be included in the form's arguments. This can only be
used to upload a single file at a time. To avoid confusion, name= is not
allowed to contain a slash (a 400 Bad Request error will result).
POST $URL
t=mkdir
name=childname
This instructs the node to create a new empty directory. The name of the
new child directory will be included in the form's arguments.
POST $URL
t=uri
name=childname
uri=newuri
This instructs the node to attach a child that is referenced by URI (just
like the PUT $URL?t=uri method). The name and URI of the new child
will be included in the form's arguments.
POST $URL
t=delete
name=childname
This instructs the node to delete a file from the given dirnode. The name
of the child to be deleted will be included in the form's arguments.
POST $URL
t=rename
from_name=oldchildname
to_name=newchildname
This instructs the node to rename a child within the given dirnode. The
child specified by 'from_name' is removed, and reattached as a child named
by 'to_name'. This is unconditional and will replace any child already
present under 'to_name', akin to 'mv -f' in unix parlance.
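A non-browser client can submit the same form fields itself. The sketch below builds (without sending) a rename request. It assumes the node accepts an application/x-www-form-urlencoded body as well as whatever encoding a browser form would use, which this document does not specify. The function name is invented for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

def rename_request(dir_url, from_name, to_name):
    """Build (but do not send) the POST t=rename form submission."""
    fields = {"t": "rename", "from_name": from_name, "to_name": to_name}
    body = urlencode(fields).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(dir_url, data=body, headers=headers, method="POST")

req = rename_request("http://localhost:8011/vdrive/private/Pictures",
                     "tractors.jpg", "old-tractors.jpg")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

The t=upload form carries file contents and would normally need a multipart/form-data body, which this sketch does not attempt.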
== XMLRPC ==
http://localhost:8011/xmlrpc
This resource provides an XMLRPC server on which all of the previous
operations can be expressed as function calls taking a "pathname" argument.
This is provided for applications that want to think of everything in terms
of XMLRPC.
listdir(vdrivename, path) -> dict of (childname -> (stuff))
put(vdrivename, path, contents) -> URI
get(vdrivename, path) -> contents
mkdir(vdrivename, path) -> URI
put_localfile(vdrivename, path, localfilename) -> URI
get_localfile(vdrivename, path, localfilename)
put_localdir(vdrivename, path, localdirname) # recursive
get_localdir(vdrivename, path, localdirname) # recursive
put_uri(vdrivename, path, URI)
etc.
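As an illustration, the following sketch uses Python's standard xmlrpc.client module to show what a listdir call would look like on the wire, without contacting a node. The argument values ("private" and "Pictures") are assumptions about how vdrivename and path are spelled. This document does not say, for instance, whether the path should carry a leading slash.

```python
import xmlrpc.client

# Encode the call listdir("private", "Pictures") as an XMLRPC request body.
payload = xmlrpc.client.dumps(("private", "Pictures"), methodname="listdir")
print(payload)

# Decoding the payload recovers the arguments and the method name.
params, method = xmlrpc.client.loads(payload)
print(method, params)

# Against a running node, the call itself would look like this:
#   proxy = xmlrpc.client.ServerProxy("http://localhost:8011/xmlrpc")
#   children = proxy.listdir("private", "Pictures")
```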