== connecting to the tahoe node ==

Writing "8011" into $NODEDIR/webport causes the node to run a webserver on
port 8011. Writing "tcp:8011:interface=127.0.0.1" into $NODEDIR/webport does
the same but binds to the loopback interface, ensuring that only the programs
on the local host can connect. Using
"ssl:8011:privateKey=mykey.pem:certKey=cert.pem" would run an SSL server. See
twisted.application.strports for more details.

In this release, anyone who can connect to this port will be able to use the
vdrive. Authentication will be added in a near-future release, probably by
having the node generate an unguessable prefix which should be inserted
before the 'vdrive' segment in the URLs described below, and writing this
nonce to a read-by-owner-only file in $NODEDIR. Please see ticket #98 for
details.

== vdrive ==

The node provides some small number of "virtual drives". In the 0.5 release,
this number is two: the first is the global shared vdrive, the second is the
private non-shared vdrive. We will call these "global" and "private" for now.

For the purpose of this document, let us assume that the vdrives currently
contain the following directories and files:

  global/
  global/Documents/
  global/Documents/notes.txt

  private/
  private/Pictures/
  private/Pictures/tractors.jpg
  private/Pictures/family/
  private/Pictures/family/bobby.jpg

Within the webserver, there is a tree of resources. The top-level "vdrive"
resource gives access to files and directories in all of the user's virtual
drives. For example, the URL that corresponds to notes.txt would be:

  http://localhost:8011/vdrive/global/Documents/notes.txt

and the URL for tractors.jpg would be:

  http://localhost:8011/vdrive/private/Pictures/tractors.jpg

In addition, each directory has a corresponding URL. The Pictures URL is:

  http://localhost:8011/vdrive/private/Pictures

Now, what can we do with these URLs?
By varying the HTTP method (GET/PUT/POST/DELETE) and by appending a
type-indicating query argument, we control what we want to do with the data
and how it should be presented.

=== Manipulating files and directories by name ===

In the following examples "$URL" is a shorthand for a URL like the ones
described above, with "vdrive/" as the top level, followed by a
slash-separated sequence of directory names, ending with the name of a file
or a directory. "$NEWURL" is a shorthand for a URL pointing to a location in
the vdrive where currently nothing exists.

  GET $URL

   If the given place in the vdrive contains a file, then this simply
   retrieves the contents of the file. The Content-Type is set according to
   the vdrive's metadata (if available) or by using the usual
   filename-extension-magic built into most webservers. The file's contents
   are provided in the body of the HTTP response.

   If the given place contains a directory, then this returns an HTML page,
   intended to be used by humans, which contains HREF links to all files and
   directories reachable from this dirnode. These HREF links do not have a t=
   argument, meaning that a human who follows them will get pages also meant
   for a human. It also contains forms to upload new files, and to delete
   files and directories. These forms use POST methods to do their job.

   You can add the "save=true" argument, which adds a 'Content-Disposition:
   attachment' header to prompt most web browsers to save the file to disk
   rather than attempting to display it.

  GET $URL?t=json

   This returns machine-parseable information about the named file or
   directory in the HTTP response body. This information contains a flag that
   indicates whether the thing is a file or a directory.

   If it is a file, then the information includes file size, metadata (like
   Content-Type), and URIs, like this:

    [ 'filenode', { 'mutable': bool, 'uri': file_uri, 'size': bytes } ]

   If it is a directory, then it includes a flag to indicate whether this is
   a read-write dirnode or a read-only dirnode, and information about the
   children of this directory, as a mapping from child name to a set of
   metadata about the child (the same data that would appear in a
   corresponding GET?t=json of the child itself). Like this:

    [ 'dirnode', { 'mutable': bool, 'uri': uri, 'children': children } ]

   where 'children' is a dictionary in which the keys are child names and the
   values depend upon whether the child is a file or a directory:

    'foo.txt': [ 'filenode', { 'mutable': bool, 'uri': uri, 'size': bytes } ]
    'subdir':  [ 'dirnode', { 'mutable': bool, 'uri': uri } ]

   Note that the value is the same as the JSON representation of the
   corresponding FILEURL or DIRURL (except that directories do not recurse --
   the "children" entry of the child is omitted).

   Before writing code that uses these results, please see the important note
   below about TOCTTOU bugs.

  GET $URL?t=uri

   This returns the URI of the given file or directory in the HTTP response
   body. If you have read-write access to that resource then this returns a
   URI which provides read-write access. If you have read-only access to that
   resource then this returns a URI which provides read-only access.

  GET $URL?t=readonly-uri

   This returns the URI providing read-only access to the given file or
   directory (regardless of whether you have read-only or read-write access).
   (Currently all files are immutable so everyone has read-only access to all
   files.)

  PUT $URL?t=uri

   This attaches a child (either a file or a directory) to the vdrive at the
   given location. The URI of the child is provided in the body of the HTTP
   request. This can be used to attach a shared directory to the vdrive.
   Intermediate directories are created on-demand just like with the regular
   PUT command.

   If there was already a child at the given name, this command will replace
   the old child with the new one, and will return an HTTP 200 (OK) response
   code. If there was not already a child there, it will return 201
   (Created). If you add a "replace=false" query argument, the command will
   return a 409 (Conflict) error rather than replacing an existing child.

  DELETE $URL

   This deletes the given file or directory from the vdrive. If it is a
   directory then this deletes all of its children. Note that this *does not*
   delete any parent directories, so a sequence of 'PUT $NEWURL' and 'DELETE
   $NEWURL' does not necessarily return the vdrive to its original state (it
   may leave some intermediate directories).

=== Manipulating files by name ===

In these examples, $NEWURL is specifically defined to point to a location in
the vdrive where currently nothing exists, and will be used to refer to a
file rather than a directory.

  PUT $NEWURL

   This uploads a file to the given place in the vdrive. It will create
   intermediate directories as necessary. The file's contents are taken from
   the body of the HTTP request. For convenience, the HTTP response contains
   the URI that results from uploading the file, although the client is not
   obligated to do anything with the URI. According to the HTTP/1.1
   specification (rfc2616), this should return a 200 (OK) code when modifying
   an existing file, and a 201 (Created) code when creating a new file.

   If there was already a child at the given name, this command will replace
   the old child with the new one, and will return an HTTP 200 (OK) response
   code. If there was not already a child there, it will return 201
   (Created). If you add a "replace=false" query argument, the command will
   return a 409 (Conflict) error rather than replacing an existing child.
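
As a rough illustration of the response codes described above, the following
Python sketch sends a file body with PUT and interprets the 200, 201, and 409
results. The helper names (put_url, describe_put_status, upload) and the base
URL are assumptions made for this example; they are not part of the node's
interface.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumed base URL, matching the port used in this document's examples.
BASE = "http://localhost:8011/vdrive/"


def put_url(path, replace=True):
    """Build the PUT URL for a vdrive path such as 'global/newfile'."""
    url = BASE + urllib.parse.quote(path)
    if not replace:
        # Ask the node to refuse to overwrite an existing child.
        url += "?replace=false"
    return url


def describe_put_status(code):
    """Map the response codes documented above to a short description."""
    meanings = {
        200: "replaced existing child",
        201: "created new child",
        409: "conflict: child exists and replace=false was given",
    }
    return meanings.get(code, "unexpected status %d" % code)


def upload(path, data, replace=True):
    """PUT `data` (bytes) to the vdrive; return (status, response body)."""
    req = urllib.request.Request(put_url(path, replace), data=data, method="PUT")
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as e:
        # urllib raises HTTPError for 409 and other error codes.
        return e.code, e.read()
```

With a node running, upload("global/newfile", b"hello", replace=False) would
be expected to give 201 on the first call and 409 on a second call, going by
the rules described above.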

   To use this PUT form from the command line, run:

    curl -T localfile http://localhost:8011/vdrive/global/newfile

=== Manipulating directories by name ===

In this section, $URL and $NEWURL specifically refer to directories, rather
than files.

  PUT $NEWURL?t=mkdir

   Create a new empty directory at the given path. The HTTP response contains
   the URI of the given directory, although the client is not obligated to do
   anything with it.

   If there was already a child at the given name, this command will replace
   the old child with the new one, and will return an HTTP 200 (OK) response
   code. If there was not already a child there, it will return 201
   (Created). If you add a "replace=false" query argument, the command will
   return a 409 (Conflict) error rather than replacing an existing child.

  GET $URL?t=rename-form&name=$CHILDNAME

   This provides a useful facility to browser-based user interfaces. It
   returns a page containing a form targeting the "POST $URL t=rename"
   functionality described below, with the provided $CHILDNAME present in the
   'from_name' field of that form. I.e. this presents a form offering to
   rename $CHILDNAME, requesting the new name, and submitting POST rename.

   Note that this can be used to rename both files and directories, but the
   GET request itself is always directed to the directory containing the
   object to be renamed.

== URIs ==

A separate top-level resource namespace ("uri/" instead of "vdrive/") is used
to get access to files and directories that are indexed directly by URI,
rather than by going through the vdrive. The resource thus referenced is used
the same way as if it were accessed through the vdrive (including accessing a
directory's children with "$URI/childname").

For example, this identifies a file or directory:

  http://localhost:8011/uri/$URI

And this identifies a file or directory "foo" in a subdirectory "somedir" of
the identified directory:

  http://localhost:8011/uri/$URI/somedir/foo

In the following examples, "$URI_URL" is a shorthand for a URL like the one
above, with "uri/" as the top level, followed by a URI.

Note that since tahoe URIs may contain slashes (in particular, dirnode URIs
contain a FURL, which resembles a regular HTTP URL and starts with pb://),
when URIs are used in this form, they must be specially quoted. All slashes
in the URI must be replaced by '!' characters. The intent is to remove this
unpleasant requirement in a future release: please see ticket #102 for
details.

  GET $URI_URL
  GET $URI_URL?t=json
  GET $URI_URL?t=uri
  GET $URI_URL?t=readonly-uri

   These each behave the same way that their name-based URL equivalent does,
   described in the "files and directories" section above.

   The difference is that which file or directory you access does not depend
   on the contents of parent directories as it does with the name-based URLs,
   since a URI uniquely identifies an object regardless of its location.

   Since files accessed directly this way do not have a filename (from which
   a MIME-type can be derived), one can be specified using a 'filename='
   query argument. This filename is also the one used if the 'save=true'
   argument is set. For example:

    GET http://localhost:8011/uri/$TRACTORS_URI?filename=tractors.jpg

   If the URI represents a directory, you can append additional path segments
   to $URI_URL to access children of that directory.
   For example, if we first obtained the URI of the "private/Pictures"
   directory by doing:

    GET http://localhost:8011/vdrive/private/Pictures?t=uri -> PICTURES_URI

   then we could download "private/Pictures/family/bobby.jpg" by fetching:

    GET http://localhost:8011/uri/$PICTURES_URI/family/bobby.jpg

   Note that since the $URI_URL already contains the URI, the only use for
   the "?t=readonly-uri" command is if the thing identified is a directory
   and you have read-write access to it and you want to get a URI which
   provides read-only access to it. "?t=uri" is completely redundant but
   included for completeness.

  GET http://localhost:8011/uri?uri=$URI

   This causes a redirect to /uri/$URI, and retains any additional query
   arguments (like filename= or save=). This is for the convenience of web
   forms which allow the user to paste in a URI (obtained through some
   out-of-band channel, like IM or email).

   Note that this form merely redirects to the specific node indicated by the
   URI: unlike the GET /uri/$URI form, you cannot traverse to children by
   appending additional path segments to the URL.

   The $URI provided as a query argument is allowed to contain slashes. The
   redirection provided will escape the slashes with exclamation points, as
   described above.

== names versus identifiers ==

The vdrive provides a mutable filesystem, but the ways that the filesystem
can change are limited. The only thing that can change is the mapping from
child names to child objects that each directory contains: it can be changed
by adding a new child name pointing to an object, removing an existing child
name, or changing an existing child name to point to a different object.

Obviously if you query tahoe for information about the filesystem and then
act upon the filesystem (such as by getting a listing of the contents of a
directory and then adding a file to the directory), then the filesystem might
have been changed after you queried it and before you acted upon it.
However, if you use the URI instead of the pathname of an object when you act
upon the object, then the only change that can happen is that, if the object
is a directory, the set of child names it has might be different. If, on the
other hand, you act upon the object using its pathname, then a different
object might be in that place, which can result in more kinds of surprises.

For example, suppose you are writing code which recursively downloads the
contents of a directory. The first thing your code does is fetch the listing
of the contents of the directory. For each child that it fetched, if that
child is a file then it downloads the file, and if that child is a directory
then it recurses into that directory.

Now, if the download and the recurse actions are performed using the child's
name, then the results might be wrong. For example, a child name that pointed
to a sub-directory when you listed the directory might have been changed to
point to a file, in which case your attempt to recurse into it would result
in an error and the file would be skipped. Or a child name that pointed to a
file when you listed the directory might now point to a sub-directory, in
which case your attempt to download the child would result in a file
containing HTML text describing the sub-directory!

If your recursive algorithm uses the URI of the child instead of the name of
the child, then those kinds of mistakes just can't happen. Note that both the
child's name and the child's URI are included in the results of listing the
parent directory, so it isn't harder to use the URI for this purpose.

In general, use names if you want "whatever object (whether file or
directory) is found by following this name (or sequence of names) when my
request reaches the server". Use URIs if you want "this particular object".

== POST forms ==

  POST $URL
  t=upload
  name=childname (optional)
  file=newfile

   This instructs the node to upload a file into the given directory.
   We need this because forms are the only way for a web browser to upload a
   file (browsers do not know how to do PUT or DELETE). The file's contents
   and the new child name will be included in the form's arguments. This can
   only be used to upload a single file at a time. To avoid confusion, name=
   is not allowed to contain a slash (a 400 Bad Request error will result).

   If there was already a child at the given name, this command will replace
   the old child with the new one. But if you add a "replace=false" argument,
   the command will refuse to replace the child, signalling an error instead.

  POST $URL
  t=mkdir
  name=childname

   This instructs the node to create a new empty directory. The name of the
   new child directory will be included in the form's arguments. Existing
   children are replaced unless a "replace=false" argument is provided.

  POST $URL
  t=uri
  name=childname
  uri=newuri

   This instructs the node to attach a child that is referenced by URI (just
   like the PUT $URL?t=uri method). The name and URI of the new child will be
   included in the form's arguments. Existing children are replaced unless a
   "replace=false" argument is provided.

  POST $URL
  t=delete
  name=childname

   This instructs the node to delete a file from the given directory. The
   name of the child to be deleted will be included in the form's arguments.

  POST $URL
  t=rename
  from_name=oldchildname
  to_name=newchildname

   This instructs the node to rename a child within the given directory. The
   child specified by 'from_name' is removed, and reattached as a child named
   for 'to_name'. An existing child at 'to_name' is replaced unless a
   "replace=false" argument is provided, making the default behavior similar
   to the unix 'mv -f' command.

== XMLRPC ==

  http://localhost:8011/xmlrpc

   This resource provides an XMLRPC server on which all of the previous
   operations can be expressed as function calls taking a "pathname"
   argument. This is provided for applications that want to think of
   everything in terms of XMLRPC.

    listdir(vdrivename, path) -> dict of (childname -> (stuff))
    put(vdrivename, path, contents) -> URI
    get(vdrivename, path) -> contents
    mkdir(vdrivename, path) -> URI
    put_localfile(vdrivename, path, localfilename) -> URI
    get_localfile(vdrivename, path, localfilename)
    put_localdir(vdrivename, path, localdirname)  # recursive
    get_localdir(vdrivename, path, localdirname)  # recursive
    put_uri(vdrivename, path, URI)

    etc.

== Testing/Debugging Commands ==

  GET $URL?t=download&localfile=$LOCALPATH
  GET $URL?t=download&localdir=$LOCALPATH

   The localfile= form instructs the node to download the given file and
   write it into the local filesystem at $LOCALPATH. The localdir= form
   instructs the node to recursively download everything from the given
   directory and below into the local filesystem. To avoid surprises, the
   localfile= form will signal an error if $URL actually refers to a
   directory, likewise if localdir= is used with a $URL that refers to a
   file.

   This request will only be accepted from an HTTP client connection
   originating at 127.0.0.1. This request is most useful when the client node
   and the HTTP client are operated by the same user. $LOCALPATH should be an
   absolute pathname.

   This form is only implemented for testing purposes, because of a trivially
   easy attack: any web server that the local browser visits could serve an
   IMG tag that causes the local node to modify the local filesystem.
   Therefore this form is only enabled if you create a file named
   'webport_allow_localfile' in the node's base directory.

  PUT $NEWURL?t=upload&localfile=$LOCALPATH
  PUT $NEWURL?t=upload&localdir=$LOCALPATH

   This uploads a file or directory from the node's local filesystem to the
   vdrive. As with "GET $URL?t=download&localfile=$LOCALPATH", this request
   will only be accepted from an HTTP connection originating from 127.0.0.1.

   The localfile= form expects that $LOCALPATH will point to a file on the
   node's local filesystem, and causes the node to upload that one file into
   the vdrive at the given location. Any parent directories will be created
   in the vdrive as necessary.

   The localdir= form expects that $LOCALPATH will point to a directory on
   the node's local filesystem, and it causes the node to perform a recursive
   upload of the directory into the vdrive at the given location, creating
   parent directories as necessary. When the operation is complete, the
   directory referenced by $NEWURL will contain all of the files and
   directories that were present in $LOCALPATH, so this is equivalent to the
   unix commands:

    mkdir -p $NEWURL; cp -r $LOCALPATH/* $NEWURL/

   Note that the "curl" utility can be used to provoke this sort of recursive
   upload, since the -T option will make it use an HTTP 'PUT':

    curl -T /dev/null 'http://localhost:8011/vdrive/global/newdir?t=upload&localdir=/home/user/directory-to-upload'

   This form is only implemented for testing purposes, because any attacker's
   web server that a local browser visits could serve an IMG tag that causes
   the local node to modify the local filesystem. Therefore this form is only
   enabled if you create a file named 'webport_allow_localfile' in the node's
   base directory.

  GET $URL?t=manifest

   Return an HTML-formatted manifest of the given directory, for debugging.
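
The curl invocation shown earlier for the localdir= upload can also be issued
from a script. The Python sketch below builds the same kind of URL, with the
query arguments percent-encoded, and sends an empty-bodied PUT. The function
names and the base URL are assumptions for illustration only, and the request
is subject to the same restrictions described above (a connection from
127.0.0.1 and the 'webport_allow_localfile' file in the node's base
directory).

```python
import urllib.parse
import urllib.request

# Assumed base URL, matching the port used in this document's examples.
BASE = "http://localhost:8011/vdrive/"


def local_upload_url(vdrive_path, local_path, recursive=False):
    """Build a PUT URL using t=upload with localfile= or localdir=."""
    key = "localdir" if recursive else "localfile"
    # urlencode percent-encodes the local path, including its slashes.
    query = urllib.parse.urlencode({"t": "upload", key: local_path})
    return BASE + urllib.parse.quote(vdrive_path) + "?" + query


def upload_from_local(vdrive_path, local_path, recursive=False):
    """Send the PUT with an empty body, like 'curl -T /dev/null'."""
    url = local_upload_url(vdrive_path, local_path, recursive)
    req = urllib.request.Request(url, data=b"", method="PUT")
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

$LOCALPATH is interpreted by the node, not by the script, so an absolute
pathname is the safer choice here too.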