== connecting to the tahoe node ==

Writing "8011" into $NODEDIR/webport causes the node to run a webserver on
port 8011. Writing "tcp:8011:interface=127.0.0.1" into $NODEDIR/webport does
the same but binds to the loopback interface, ensuring that only programs
on the local host can connect. Using
"ssl:8011:privateKey=mykey.pem:certKey=cert.pem" would run an SSL server. See
twisted.application.strports for more details.

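As a minimal sketch of the configuration step above, the following Python
helper writes a strports description into the node's webport file. The
function name write_webport is hypothetical, and writing the value without a
trailing newline is an assumption, since this document does not say how the
node parses the file.

```python
import os


def write_webport(nodedir, strport="tcp:8011:interface=127.0.0.1"):
    """Write a strports description into NODEDIR/webport.

    The default binds port 8011 to the loopback interface, as described
    above. Returns the path of the file that was written.
    """
    path = os.path.join(nodedir, "webport")
    with open(path, "w") as f:
        # Assumption: the node accepts the bare description string.
        f.write(strport)
    return path
```

The node presumably reads this file at startup, so it would likely need to be
restarted for a change to take effect.
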
If $NODEDIR/webpassword exists, it will be used (somehow) to require HTTP
Digest Authentication for all webserver connections. XXX specify how


== vdrive ==

The node provides some small number of "virtual drives". In the 0.5
release, this number is two: the first is the global shared vdrive, the
second is the private non-shared vdrive. We will call these "global" and
"private" for now.

For the purpose of this document, let us assume that the vdrives currently
contain the following directories and files:

  global/
  global/Documents/
  global/Documents/notes.txt

  private/
  private/Pictures/
  private/Pictures/tractors.jpg
  private/Pictures/family/
  private/Pictures/family/bobby.jpg


Within the webserver, there is a tree of resources. The top-level "vdrive"
resource gives access to files and directories in all of the user's virtual
drives. For example, the URL that corresponds to notes.txt would be:

  http://localhost:8011/vdrive/global/Documents/notes.txt

and the URL for tractors.jpg would be:

  http://localhost:8011/vdrive/private/Pictures/tractors.jpg

In addition, each directory has a corresponding URL. The Pictures URL is:

  http://localhost:8011/vdrive/private/Pictures

Now, what can we do with these URLs? By varying the HTTP method
(GET/PUT/POST/DELETE) and by appending a type-indicating query argument, we
control what we want to do with the data and how it should be presented.


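The URL layout described above can be sketched as a small helper. This is
only an illustration: the name vdrive_url and the BASE constant are
hypothetical, and percent-encoding each path segment is an assumption about
how the node expects unusual characters in names to be sent.

```python
from urllib.parse import quote

# Assumed node address, taken from the examples in this document.
BASE = "http://localhost:8011"


def vdrive_url(*segments, t=None):
    """Build a name-based URL under the top-level "vdrive" resource.

    segments are the vdrive name followed by directory and file names;
    t is the optional type-indicating query argument (e.g. "json").
    """
    path = "/".join(quote(s, safe="") for s in segments)
    url = BASE + "/vdrive/" + path
    if t is not None:
        url += "?t=" + quote(t, safe="")
    return url
```

For instance, vdrive_url("private", "Pictures", t="json") yields the Pictures
URL with "?t=json" appended.
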
=== Manipulating files and directories by name ===

In the following examples "$URL" is a shorthand for a URL like the ones
described above, with "vdrive/" as the top level, followed by a
slash-separated sequence of file or directory names. "$NEWURL" is a
shorthand for a URL pointing to a location in the vdrive where currently
nothing exists.

GET $URL

If the given place in the vdrive contains a file, then this simply
retrieves the contents of the file. The Content-Type is set according to
the vdrive's metadata (if available) or by using the usual
filename-extension-magic built into most webservers. The file's contents
are provided in the body of the HTTP response.

If the given place contains a directory, then this returns an HTML page,
intended to be used by humans, which contains HREF links to all files and
directories reachable from this dirnode. These HREF links do not have a t=
argument, meaning that a human who follows them will get pages also meant
for a human. It also contains forms to upload new files, and to delete
files and directories. These forms use POST methods to do their job.

When retrieving a file, you can add the "save=true" argument, which adds a
'Content-Disposition: attachment' header to prompt most web browsers to save
the file to disk rather than attempting to display it.

GET $URL?t=json

This returns machine-parseable information about the named file or
directory in the HTTP response body. This information contains a flag that
indicates whether the thing is a file or a directory.

If it is a file, then the information includes file size, metadata (like
Content-Type), and URIs, like this:

  [ 'filenode', { 'mutable': bool, 'uri': file_uri, 'size': bytes } ]

If it is a directory, then it includes a flag to indicate whether this is a
read-write dirnode or a read-only dirnode, and information about the
children of this directory, as a mapping from child name to a set of
metadata about the child (the same data that would appear in a
corresponding GET?t=json of the child itself). Like this:

  [ 'dirnode', { 'mutable': bool, 'uri': uri, 'children': children } ]

where 'children' is a dictionary in which the keys are child names
and the values depend upon whether the child is a file or a directory:

  'foo.txt': [ 'filenode', { 'mutable': bool, 'uri': uri, 'size': bytes } ]
  'subdir': [ 'dirnode', { 'mutable': bool, 'uri': uri } ]

Note that the value is the same as the JSON representation of the
corresponding file or directory URL (except that dirnodes do not recurse --
the "children" entry of the child is omitted).

Before writing code that uses these results, please see the important note
below about TOCTTOU bugs.

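A minimal sketch of consuming a t=json response follows. It assumes the
response body is valid JSON (double-quoted strings) with the two-element
[type, info] shape described above; the function name describe and the URI
strings in SAMPLE are made-up placeholders, not real tahoe URIs.

```python
import json

# Hypothetical response body for a directory, following the shape above.
SAMPLE = """
["dirnode", {"mutable": true, "uri": "URI:DIR:placeholder",
  "children": {
    "foo.txt": ["filenode", {"mutable": false,
                             "uri": "URI:CHK:placeholder", "size": 1234}],
    "subdir": ["dirnode", {"mutable": true, "uri": "URI:DIR:placeholder2"}]
  }}]
"""


def describe(body):
    """Return ("file", size) or ("dir", sorted child names) for a response."""
    nodetype, info = json.loads(body)
    if nodetype == "filenode":
        return ("file", info["size"])
    if nodetype == "dirnode":
        # Children listed inside a parent omit their own "children" entry,
        # so treat a missing key as "not listed" rather than "empty".
        return ("dir", sorted(info.get("children", {})))
    raise ValueError("unexpected node type: %r" % (nodetype,))
```

Dispatching on the first element keeps the file/directory decision tied to
what the node reported at the time of the request.
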
GET $URL?t=uri

This returns the URI of the given file or directory in the HTTP response
body. If you have read-write access to that resource then this returns a
URI which provides read-write access. If you have read-only access to that
resource then this returns a URI which provides read-only access.

GET $URL?t=readonly-uri

This returns the URI providing read-only access to the given file or
directory (regardless of whether you have read-only or read-write access).
(Currently all files are immutable so everyone has read-only access to all
files.)

PUT $URL?t=uri

This attaches a child (either a file or a directory) to the vdrive at the
given location. The URI of the child is provided in the body of the HTTP
request. This can be used to attach a shared directory to the
vdrive. Intermediate directories are created on-demand just like with the
regular PUT command.

DELETE $URL

This deletes the given file or directory from the vdrive. If it is a
directory then this deletes all of its children. Note that this *does not*
delete any parent directories, so a sequence of 'PUT $NEWURL' and 'DELETE
$NEWURL' does not necessarily return the vdrive to its original state (it
may leave some intermediate directory nodes).


=== Manipulating files by name ===

PUT $NEWURL

This uploads a file to the given place in the vdrive. It will create
intermediate directory nodes as necessary. The file's contents are taken
from the body of the HTTP request. For convenience, the HTTP response
contains the URI that results from uploading the file, although the client
is not obligated to do anything with the URI. According to the HTTP/1.1
specification (RFC 2616), this should return a 200 (OK) code when modifying
an existing file, and a 201 (Created) code when creating a new file.

To use this, run:

  curl -T localfile http://localhost:8011/vdrive/global/newfile


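The same upload can be sketched in Python with the standard library. The
function names build_put and upload are hypothetical. Reading the response
body as the new URI follows the description above, but the exact body format
(encoding, surrounding whitespace) is an assumption.

```python
import urllib.request


def build_put(url, data):
    """Build (but do not send) a PUT request carrying the file contents."""
    return urllib.request.Request(url, data=data, method="PUT")


def upload(url, data):
    """Send the PUT and return (status code, response body as text).

    Per the text above, the status should be 201 for a new file and 200
    when an existing file is modified; the body should hold the new URI.
    """
    with urllib.request.urlopen(build_put(url, data)) as resp:
        return resp.status, resp.read().decode("utf-8").strip()
```

Keeping request construction separate from sending makes it possible to
inspect the request without a running node.
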
=== Manipulating directories by name ===

PUT $NEWURL?t=mkdir

Create a new empty directory at the given path. The HTTP response contains
the URI of the given directory, although the client is not obligated to do
anything with it.

GET $URL?t=rename-form&name=$CHILDNAME

This provides a useful facility to browser-based user interfaces. It
returns a page containing a form targeting the "POST $URL t=rename"
functionality described below, with the provided $CHILDNAME present in the
'from_name' field of that form. I.e. this presents a form offering to
rename $CHILDNAME, requesting the new name, and submitting POST rename.


== URIs ==

A separate top-level resource namespace ("uri/" instead of "vdrive/") is used
to get access to files and dirnodes that are indexed directly by URI, rather
than by going through the vdrive. The resource thus referenced is used the
same way as if it were accessed through the vdrive (including accessing a
directory's children with "$URI/childname").

For example, this identifies a file or directory:

  http://localhost:8011/uri/$URI

And this identifies a file or directory "foo" in a subdirectory "somedir" of
the identified directory:

  http://localhost:8011/uri/$URI/somedir/foo

In the following examples, "$URI_URL" is a shorthand for a URL like the one
above, with "uri/" as the top level, followed by a URI.

Note that since tahoe URIs may contain slashes (in particular, dirnode URIs
contain a FURL, which resembles a regular HTTP URL and starts with pb://),
when URIs are used in this form, they must be specially quoted. All slashes
in the URI must be replaced by '!' characters. XXX consider changing the
allmydata.org uri format to relieve the user of this requirement.

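The slash-quoting rule above can be sketched as follows. The function names
and the BASE constant are hypothetical, and the URI in the usage note is a
made-up placeholder that merely contains a pb:// FURL-like part. Only the
'/' to '!' replacement comes from this document; percent-encoding the child
names is an additional assumption.

```python
from urllib.parse import quote

# Assumed node address, taken from the examples in this document.
BASE = "http://localhost:8011"


def quote_uri(uri):
    """Replace every slash in a tahoe URI with '!' for use in a URL path."""
    return uri.replace("/", "!")


def uri_url(uri, *children):
    """Build a URL under "uri/" for the object, or for a child below it."""
    parts = [quote_uri(uri)] + [quote(c, safe="") for c in children]
    return BASE + "/uri/" + "/".join(parts)
```

With the placeholder URI "URI:DIR:pb://tubid@example.org:1234/vdrive",
quote_uri turns the three slashes into '!' characters, so the only slashes
left in the resulting URL path are the ones that separate child names.
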
GET $URI_URL
GET $URI_URL?t=json
GET $URI_URL?t=uri
GET $URI_URL?t=readonly-uri

These each behave the same way that their name-based URL equivalent does,
described in the "Manipulating files and directories by name" section above.
The difference is that which file or directory you access does not depend on
the contents of parent directories as it does with the name-based URLs, since
a URI uniquely identifies an object regardless of its location.

Since files accessed directly this way do not have a filename (from which a
MIME-type can be derived), one can be specified using a 'filename=' query
argument. This filename is also the one used if the 'save=true' argument is
set. For example:

  GET http://localhost:8011/uri/$TRACTORS_URI?filename=tractors.jpg

If the URI represents a directory, you can append additional path segments
to $URI_URL to access children of that directory. For example, if we first
obtained the URI of the "private/Pictures" directory by doing:

  GET http://localhost:8011/vdrive/private/Pictures?t=uri -> PICTURES_URI

then we could download "private/Pictures/family/bobby.jpg" by fetching:

  GET http://localhost:8011/uri/$PICTURES_URI/family/bobby.jpg

Note that since the $URI_URL already contains the URI, the only use for the
"?t=readonly-uri" command is if the thing identified is a directory and you
have read-write access to it and you want to get a URI which provides
read-only access to it. "?t=uri" is completely redundant but included for
completeness.

GET http://localhost:8011/uri?uri=$URI

This causes a redirect to /uri/$URI, and retains any additional query
arguments (like filename= or save=). This is for the convenience of web
forms which allow the user to paste in a URI (obtained through some
out-of-band channel, like IM or email).

Note that this form merely redirects to the specific node indicated by the
URI: unlike the GET /uri/$URI form, you cannot traverse to child nodes by
appending additional path segments to the URL.

The $URI provided as a query argument is allowed to contain slashes. The
redirection provided will escape the slashes with exclamation points, as
described above.


== names versus identifiers ==

The vdrive provides a mutable filesystem, but the ways that the filesystem
can change are limited. The only thing that can change is the mapping from
child names to child objects that each directory contains: a new child name
can be added pointing to an object, an existing child name can be removed, or
an existing child name can be changed to point to a different object.

Obviously if you query tahoe for information about the filesystem and then
act upon the filesystem (such as by getting a listing of the contents of a
directory and then adding a file to the directory), then the filesystem might
have been changed after you queried it and before you acted upon it.
However, if you use the URI instead of the pathname of an object when you act
upon the object, then the only change that can happen is that, if the object
is a directory, the set of child names it has might be different. If, on the
other hand, you act upon the object using its pathname, then a different
object might be in that place, which can result in more kinds of surprises.

For example, suppose you are writing code which recursively downloads the
contents of a directory. The first thing your code does is fetch the listing
of the contents of the directory. For each child that it fetched, if that
child is a file then it downloads the file, and if that child is a directory
then it recurses into that directory. Now, if the download and the recurse
actions are performed using the child's name, then the results might be
wrong. For example, a child name that pointed to a sub-directory when you
listed the directory might have been changed to point to a file, in which
case your attempt to recurse into it would result in an error and the file
would be skipped. Or a child name that pointed to a file when you listed the
directory might now point to a sub-directory, in which case your attempt to
download the child would result in a file containing HTML text describing the
sub-directory!

If your recursive algorithm uses the URI of the child instead of the name of
the child, then those kinds of mistakes just can't happen. Note that both the
child's name and the child's URI are included in the results of listing the
parent directory, so it isn't harder to use the URI for this purpose.

In general, use names if you want "whatever object (whether file or
directory) is found by following this name (or sequence of names) when my
request reaches the server". Use URIs if you want "this particular object".

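The URI-based recursion recommended above can be sketched as follows. The
function walk_by_uri is hypothetical, and so is the fetch callable: it is
assumed to take a URI and return the text of the corresponding
"GET /uri/$URI?t=json" response, in valid JSON with the shape described in
the t=json section. The sketch has no protection against directories that
contain themselves.

```python
import json


def walk_by_uri(fetch, dir_uri, prefix=""):
    """Yield (relative path, file URI) for every file below a directory.

    Each step follows the URI recorded in the parent's listing, never the
    child's name, so a name that is re-pointed between listing and
    traversal cannot redirect the walk to a different object.
    """
    nodetype, info = json.loads(fetch(dir_uri))
    if nodetype != "dirnode":
        raise ValueError("not a directory: %r" % (dir_uri,))
    for name, (childtype, childinfo) in sorted(info["children"].items()):
        path = prefix + name
        if childtype == "dirnode":
            # Recurse using the child's URI taken from this listing.
            yield from walk_by_uri(fetch, childinfo["uri"], path + "/")
        else:
            yield path, childinfo["uri"]
```

A caller would then download each file through its URI rather than through
the path, using the path only to decide where to store the result locally.
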
== POST forms ==

POST $URL
  t=upload
  name=childname (optional)
  file=newfile

This instructs the node to upload a file into the given dirnode. We need
this because forms are the only way for a web browser to upload a file
(browsers do not know how to do PUT or DELETE). The file's contents and the
new child name will be included in the form's arguments. This can only be
used to upload a single file at a time. To avoid confusion, name= is not
allowed to contain a slash (a 400 Bad Request error will result).

POST $URL
  t=mkdir
  name=childname

This instructs the node to create a new empty directory. The name of the
new child directory will be included in the form's arguments.

POST $URL
  t=uri
  name=childname
  uri=newuri

This instructs the node to attach a child that is referenced by URI (just
like the PUT $URL?t=uri method). The name and URI of the new child
will be included in the form's arguments.

POST $URL
  t=delete
  name=childname

This instructs the node to delete a child from the given dirnode. The name
of the child to be deleted will be included in the form's arguments.

POST $URL
  t=rename
  from_name=oldchildname
  to_name=newchildname

This instructs the node to rename a child within the given dirnode. The
child specified by 'from_name' is removed, and reattached as a child named
'to_name'. This is unconditional and will replace any child already
present under 'to_name', akin to 'mv -f' in unix parlance.


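As an illustration of the form fields above, the following sketch builds a
rename request outside a browser. The function name build_rename is
hypothetical, and sending the fields as application/x-www-form-urlencoded is
an assumption: this document does not say which form encodings the node
accepts, and a file upload (t=upload) would need a multipart body instead.

```python
import urllib.request
from urllib.parse import urlencode


def build_rename(dir_url, from_name, to_name):
    """Build (but do not send) a POST that renames a child of dir_url."""
    fields = {"t": "rename", "from_name": from_name, "to_name": to_name}
    body = urlencode(fields).encode("ascii")
    return urllib.request.Request(
        dir_url,
        data=body,
        method="POST",
        # Assumed encoding; see the note above.
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Because the rename replaces any existing child named 'to_name', a caller
that cares about overwriting would need to check the directory listing
first, subject to the race described in the "names versus identifiers"
section.
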
== XMLRPC ==

  http://localhost:8011/xmlrpc

This resource provides an XMLRPC server on which all of the previous
operations can be expressed as function calls taking a "pathname" argument.
This is provided for applications that want to think of everything in terms
of XMLRPC.

  listdir(vdrivename, path) -> dict of (childname -> (stuff))
  put(vdrivename, path, contents) -> URI
  get(vdrivename, path) -> contents
  mkdir(vdrivename, path) -> URI
  put_localfile(vdrivename, path, localfilename) -> URI
  get_localfile(vdrivename, path, localfilename)
  put_localdir(vdrivename, path, localdirname) # recursive
  get_localdir(vdrivename, path, localdirname) # recursive
  put_uri(vdrivename, path, URI)

etc.


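A client for this interface could be sketched with Python's standard
xmlrpc.client module. The method name listdir and its (vdrivename, path)
arguments come from the list above; the helper name listdir_request is
hypothetical, and the path format ("Pictures" here) is an assumption, since
this document does not specify it. The helper only serializes the call so
that it can be inspected without a running node.

```python
import xmlrpc.client

# Endpoint taken from this document.
XMLRPC_URL = "http://localhost:8011/xmlrpc"


def listdir_request(vdrivename, path):
    """Return the XML-RPC request body for listdir(vdrivename, path)."""
    return xmlrpc.client.dumps((vdrivename, path), methodname="listdir")


# Against a live node, the equivalent call would be made through a proxy:
#   proxy = xmlrpc.client.ServerProxy(XMLRPC_URL)
#   children = proxy.listdir("private", "Pictures")
```

ServerProxy does not open a connection until a method is called, so
constructing it is cheap; errors only appear at call time.
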
== Testing/Debugging Commands ==

GET $URL?t=download&localfile=$LOCALPATH
GET $URL?t=download&localdir=$LOCALPATH

The localfile= form instructs the node to download the given file and write
it into the local filesystem at $LOCALPATH. The localdir= form instructs
the node to recursively download everything from the given directory and
below into the local filesystem. To avoid surprises, the localfile= form
will signal an error if $URL actually refers to a directory, and the
localdir= form will signal an error if $URL refers to a file.

This request will only be accepted from an HTTP client connection
originating at 127.0.0.1. This request is most useful when the client node
and the HTTP client are operated by the same user. $LOCALPATH should be an
absolute pathname.

This form is only implemented for testing purposes, because of a trivially
easy attack: any web server that the local browser visits could serve an
IMG tag that causes the local node to modify the local filesystem.
Therefore this form is only enabled if you create a file named
'webport_allow_localfile' in the node's base directory.

PUT $NEWURL?t=upload&localfile=$LOCALPATH
PUT $NEWURL?t=upload&localdir=$LOCALPATH

This uploads a file or directory from the node's local filesystem to the
vdrive. As with "GET $URL?t=download&localfile=$LOCALPATH", this request
will only be accepted from an HTTP connection originating from 127.0.0.1.

The localfile= form expects that $LOCALPATH will point to a file on the
node's local filesystem, and causes the node to upload that one file into
the vdrive at the given location. Any parent directories will be created in
the vdrive as necessary.

The localdir= form expects that $LOCALPATH will point to a directory on the
node's local filesystem, and it causes the node to perform a recursive
upload of the directory into the vdrive at the given location, creating
parent directories as necessary. When the operation is complete, the
directory referenced by $NEWURL will contain all of the files and
directories that were present in $LOCALPATH, so this is equivalent to the
unix commands:

  mkdir -p $NEWURL; cp -r $LOCALPATH/* $NEWURL/

Note that the "curl" utility can be used to provoke this sort of recursive
upload, since the -T option will make it use an HTTP 'PUT':

  curl -T /dev/null 'http://localhost:8011/vdrive/global/newdir?t=upload&localdir=/home/user/directory-to-upload'

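The URL used in the curl example above can also be assembled
programmatically. This is a minimal sketch: the function name
local_upload_url is hypothetical, and leaving '/' unescaped in the query
value simply mirrors the curl example rather than any stated requirement of
the node.

```python
from urllib.parse import urlencode


def local_upload_url(newurl, localdir):
    """Append the t=upload and localdir= arguments to a vdrive URL."""
    # safe="/" keeps the slashes of the local path readable, as in the
    # curl example; other special characters are still percent-encoded.
    query = urlencode({"t": "upload", "localdir": localdir}, safe="/")
    return newurl + "?" + query
```

The resulting URL would then be sent with an HTTP PUT from the local host,
and only works if 'webport_allow_localfile' exists in the node's base
directory.
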
This form is only implemented for testing purposes, because any attacker's
web server that a local browser visits could serve an IMG tag that causes
the local node to modify the local filesystem. Therefore this form is only
enabled if you create a file named 'webport_allow_localfile' in the node's
base directory.

GET $URL?t=manifest

Return an HTML-formatted manifest of the given directory, for debugging.