== connecting to the tahoe node ==

Writing "8011" into $NODEDIR/webport causes the node to run a webserver on
port 8011. Writing "tcp:8011:interface=127.0.0.1" into $NODEDIR/webport does
the same but binds to the loopback interface, ensuring that only the programs
on the local host can connect. Using
"ssl:8011:privateKey=mykey.pem:certKey=cert.pem" would run an SSL server. See
twisted.application.strports for more details.

If $NODEDIR/webpassword exists, it will be used (somehow) to require HTTP
Digest Authentication for all webserver connections. XXX specify how

== vdrive ==

The node provides some small number of "virtual drives". In the 0.5 release,
this number is two: the first is the global shared vdrive, the second is the
private non-shared vdrive. We will call these "global" and "private" for now.

For the purpose of this document, let us assume that the vdrives currently
contain the following directories and files:

  global/
  global/Documents/
  global/Documents/notes.txt

  private/
  private/Pictures/
  private/Pictures/tractors.jpg
  private/Pictures/family/
  private/Pictures/family/bobby.jpg

Within the webserver, there is a tree of resources. The top-level "vdrive"
resource gives access to files and directories in all of the user's virtual
drives. For example, the URL that corresponds to notes.txt would be:

  http://localhost:8011/vdrive/global/Documents/notes.txt

and the URL for tractors.jpg would be:

  http://localhost:8011/vdrive/private/Pictures/tractors.jpg

In addition, each directory has a corresponding URL. The Pictures URL is:

  http://localhost:8011/vdrive/private/Pictures

Now, what can we do with these URLs? By varying the HTTP method
(GET/PUT/POST/DELETE) and by appending a type-indicating query argument, we
control what we want to do with the data and how it should be presented.
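
As an illustration of how these name-based URLs are put together, here is a
minimal Python sketch. The helper name vdrive_url and the BASE_URL constant
are hypothetical and not part of the node; the sketch also assumes that the
webserver accepts percent-encoded names, which this document does not specify.

```python
from urllib.parse import quote, urlencode

# Assumption: the node's webport file contains "8011" and we connect locally.
BASE_URL = "http://localhost:8011"


def vdrive_url(vdrive, names, **query):
    """Build a name-based URL under the top-level "vdrive" resource.

    vdrive is "global" or "private"; names is the sequence of directory and
    file names below it. Keyword arguments become query arguments, for
    example t="json".
    """
    segments = ["vdrive", vdrive] + list(names)
    # Percent-encode each name separately so a name cannot add extra slashes.
    path = "/".join(quote(segment, safe="") for segment in segments)
    url = "%s/%s" % (BASE_URL, path)
    if query:
        url += "?" + urlencode(query)
    return url


print(vdrive_url("global", ["Documents", "notes.txt"]))
print(vdrive_url("private", ["Pictures"], t="json"))
```

The first call reproduces the notes.txt URL shown above; the second adds a
type-indicating query argument to the Pictures directory URL.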

=== Manipulating files and directories by name ===

In the following examples "$URL" is a shorthand for a URL like the ones
described above, with "vdrive/" as the top level, followed by a
slash-separated sequence of file or directory names. "$NEWURL" is a shorthand
for a URL pointing to a location in the vdrive where currently nothing exists.

  GET $URL

   If the given place in the vdrive contains a file, then this simply
   retrieves the contents of the file. The Content-Type is set according to
   the vdrive's metadata (if available) or by using the usual
   filename-extension-magic built into most webservers. The file's contents
   are provided in the body of the HTTP response.

   If the given place contains a directory, then this returns an HTML page,
   intended to be used by humans, which contains HREF links to all files and
   directories reachable from this dirnode. These HREF links do not have a t=
   argument, meaning that a human who follows them will get pages also meant
   for a human. It also contains forms to upload new files, and to delete
   files and directories. These forms use POST methods to do their job.

   You can add the "save=true" argument, which adds a 'Content-Disposition:
   attachment' header to prompt most web browsers to save the file to disk
   rather than attempting to display it.

  GET $URL?t=json

   This returns machine-parseable information about the named file or
   directory in the HTTP response body. This information contains a flag that
   indicates whether the thing is a file or a directory.

   If it is a file, then the information includes file size, metadata (like
   Content-Type), and URIs, like this:

    [ 'filenode', { 'mutable': bool, 'uri': file_uri, 'size': bytes } ]

   If it is a directory, then it includes a flag to indicate whether this is
   a read-write dirnode or a read-only dirnode, and information about the
   children of this directory, as a mapping from child name to a set of
   metadata about the child (the same data that would appear in a
   corresponding GET?t=json of the child itself). Like this:

    [ 'dirnode', { 'mutable': bool, 'uri': uri, 'children': children } ]

   where 'children' is a dictionary in which the keys are child names and the
   values depend upon whether the child is a file or a directory:

    'foo.txt': [ 'filenode', { 'mutable': bool, 'uri': uri, 'size': bytes } ]
    'subdir':  [ 'dirnode', { 'mutable': bool, 'uri': uri } ]

   Note that the value is the same as the JSON representation of the
   corresponding FILEURL or DIRURL (except that dirnodes do not recurse --
   the "children" entry of the child is omitted).

   Before writing code that uses these results, please see the important note
   below about TOCTTOU bugs.

  GET $URL?t=uri

   This returns the URI of the given file or directory in the HTTP response
   body. If you have read-write access to that resource then this returns a
   URI which provides read-write access. If you have read-only access to that
   resource then this returns a URI which provides read-only access.

  GET $URL?t=readonly-uri

   This returns the URI providing read-only access to the given file or
   directory (whether you have read-only or read-write access). (Currently
   all files are immutable so everyone has read-only access to all files.)

  PUT $URL?t=uri

   This attaches a child (either a file or a directory) to the vdrive at the
   given location. The URI of the child is provided in the body of the HTTP
   request. This can be used to attach a shared directory to the vdrive.
   Intermediate directories are created on-demand just like with the regular
   PUT command.

  DELETE $URL

   This deletes the given file or directory from the vdrive. If it is a
   directory then this deletes all of its children. Note that this *does not*
   delete any parent directories, so a sequence of 'PUT $NEWURL' and
   'DELETE $NEWURL' does not necessarily return the vdrive to its original
   state (it may leave some intermediate directory nodes).

=== Manipulating files by name ===

  PUT $NEWURL

   This uploads a file to the given place in the vdrive. It will create
   intermediate directory nodes as necessary. The file's contents are taken
   from the body of the HTTP request. For convenience, the HTTP response
   contains the URI that results from uploading the file, although the node
   is not obligated to do anything with the URI. According to the HTTP/1.1
   specification (RFC 2616), this should return a 200 (OK) code when
   modifying an existing file, and a 201 (Created) code when creating a new
   file.

   To use this, run
   'curl -T localfile http://localhost:8011/vdrive/global/newfile'

=== Manipulating directories by name ===

  PUT $NEWURL?t=mkdir

   Create a new empty directory at the given path. The HTTP response contains
   the URI of the given directory, although the client is not obligated to do
   anything with it.

  GET $URL?t=rename-form&name=$CHILDNAME

   This provides a useful facility to browser-based user interfaces. It
   returns a page containing a form targeting the "POST $URL t=rename"
   functionality described below, with the provided $CHILDNAME present in the
   'from_name' field of that form. I.e. this presents a form offering to
   rename $CHILDNAME, requesting the new name, and submitting POST rename.

== URIs ==

A separate top-level resource namespace ("uri/" instead of "vdrive/") is used
to get access to files and dirnodes that are indexed directly by URI, rather
than by going through the vdrive.
The resource thus referenced is used the same way as if it were accessed
through the vdrive (including accessing a directory's children with
"$URI/childname").

For example, this identifies a file or directory:

  http://localhost:8011/uri/$URI

And this identifies a file or directory "foo" in a subdirectory "somedir" of
the identified directory:

  http://localhost:8011/uri/$URI/somedir/foo

In the following examples, "$URI_URL" is a shorthand for a URL like the one
above, with "uri/" as the top level, followed by a URI.

Note that since tahoe URIs may contain slashes (in particular, dirnode URIs
contain a FURL, which resembles a regular HTTP URL and starts with pb://),
when URIs are used in this form, they must be specially quoted. All slashes
in the URI must be replaced by '!' characters. XXX consider changing the
allmydata.org uri format to relieve the user of this requirement.

  GET $URI_URL
  GET $URI_URL?t=json
  GET $URI_URL?t=uri
  GET $URI_URL?t=readonly-uri

   These each behave the same way that their name-based URL equivalent does,
   described in the "files and directories" section above. The difference is
   that the file or directory you access does not depend on the contents of
   parent directories, as it does with the name-based URLs, since a URI
   uniquely identifies an object regardless of its location.

   Since files accessed directly this way do not have a filename (from which
   a MIME-type can be derived), one can be specified using a 'filename='
   query argument. This filename is also the one used if the 'save=true'
   argument is set. For example:

    GET http://localhost:8011/uri/$TRACTORS_URI?filename=tractors.jpg

   If the URI represents a directory, you can append additional path segments
   to $URI_URL to access children of that directory.
   For example, if we first obtained the URI of the "private/Pictures"
   directory by doing:

    GET http://localhost:8011/vdrive/private/Pictures?t=uri -> PICTURES_URI

   then we could download "private/Pictures/family/bobby.jpg" by fetching:

    GET http://localhost:8011/uri/$PICTURES_URI/family/bobby.jpg

   Note that since the $URI_URL already contains the URI, the only use for
   the "?t=readonly-uri" command is if the thing identified is a directory
   and you have read-write access to it and you want to get a URI which
   provides read-only access to it. "?t=uri" is completely redundant but
   included for completeness.

  GET http://localhost:8011/uri?uri=$URI

   This causes a redirect to /uri/$URI, and retains any additional query
   arguments (like filename= or save=). This is for the convenience of web
   forms which allow the user to paste in a URI (obtained through some
   out-of-band channel, like IM or email).

   Note that this form merely redirects to the specific node indicated by the
   URI: unlike the GET /uri/$URI form, you cannot traverse to child nodes by
   appending additional path segments to the URL.

   The $URI provided as a query argument is allowed to contain slashes. The
   redirection provided will escape the slashes with exclamation points, as
   described above.

== Time-Of-Check-To-Time-Of-Use ("TOCTTOU") bugs ==

Note that since directories are mutable you can get surprises if you query
the vdrive, e.g. "GET $URL?t=json", examine the resulting JSON-encoded
information, and then fetch from or update the vdrive using a name-based URL.
This is because the actual state of the vdrive could have changed after you
did the "GET $URL?t=json" query and before you did the subsequent fetch or
update.

For example, suppose you query to find out that "vdrive/private/somedir/foo"
is a file which has a certain number of bytes, and then you issue a
"GET vdrive/private/somedir/foo" to fetch the file.
The file that you get might have a different number of bytes than the one
that you chose to fetch, because the "foo" entry in the "somedir" directory
may have been changed to point to a different file between your query and
your fetch, or because the "somedir" entry in the private vdrive might have
been changed to point to a different directory.

Potentially more damaging, suppose that the "foo" entry was changed to point
to a directory instead of a file. Then instead of receiving the expected
file, you receive a file containing an HTML page describing the directory
contents!

These are examples of TOCTTOU bugs ( http://en.wikipedia.org/wiki/TOCTTOU ).

A good way to avoid these bugs is to issue your second request, not with a
URL based on the sequence of names that lead to the object, but instead with
the URI of the object. For example, suppose you query a directory listing
(with "GET vdrive/private/somedir?t=json"), find a file named "foo" therein
that you want to download, and then download the file. If you download it
with its URI ("GET uri/$URI") instead of its URL
("GET vdrive/private/somedir/foo"), then you will get the file that was in
the "somedir/" directory under the name "foo" when you queried that
directory, even if the "somedir/" directory has since been changed so that
its "foo" child now points to a different file or to a directory.

In general, use names if you want "whatever object (whether file or
directory) is found by following this sequence of names when my request
reaches the server". Use URIs if you want "this particular object".

If you are basing your decision to fetch from or update the vdrive on
filesystem information that was returned by an earlier query, then you
usually intend to fetch or update the particular object that was in that
location when you first queried it, rather than whatever object is going to
be in that location when your subsequent fetch request finally reaches the
server.
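
The query-then-fetch-by-URI pattern can be sketched in Python as follows.
This is only an illustration under stated assumptions: the helper names
child_uri and uri_url are hypothetical, the URI strings are placeholders
rather than real tahoe URIs, and the sample listing is written as strict JSON
(double-quoted strings), whereas the examples earlier in this document use a
schematic notation.

```python
import json

# Assumption: a local node listening on port 8011, as in the examples above.
BASE_URL = "http://localhost:8011"


def child_uri(listing_json, child_name):
    """Return (node_type, uri) for one child of a "?t=json" directory listing."""
    node_type, info = json.loads(listing_json)
    if node_type != "dirnode":
        raise ValueError("expected a dirnode, got %r" % (node_type,))
    child_type, child_info = info["children"][child_name]
    return child_type, child_info["uri"]


def uri_url(uri):
    """Build a "uri/" URL, replacing every slash in the URI with '!'."""
    return "%s/uri/%s" % (BASE_URL, uri.replace("/", "!"))


# Stand-in for the body returned by "GET vdrive/private/somedir?t=json".
# The URI values are made-up placeholders.
SAMPLE_LISTING = json.dumps(
    ["dirnode", {
        "mutable": True,
        "uri": "URI:placeholder-dir",
        "children": {
            "foo": ["filenode", {"mutable": False,
                                 "uri": "URI:placeholder/file",
                                 "size": 1234}],
        },
    }]
)

child_type, uri = child_uri(SAMPLE_LISTING, "foo")
if child_type == "filenode":
    # Fetch this URL rather than vdrive/private/somedir/foo: it names the
    # object that was listed, not whatever "foo" points to later.
    print(uri_url(uri))
```

Because the second request names the object by URI, a later change to the
"foo" entry in "somedir" cannot redirect it to a different file or directory.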

== POST forms ==

  POST $URL
  t=upload
  name=childname  (optional)
  file=newfile

   This instructs the node to upload a file into the given dirnode. We need
   this because forms are the only way for a web browser to upload a file
   (browsers do not know how to do PUT or DELETE). The file's contents and
   the new child name will be included in the form's arguments. This can only
   be used to upload a single file at a time. To avoid confusion, name= is
   not allowed to contain a slash (a 400 Bad Request error will result).

  POST $URL
  t=mkdir
  name=childname

   This instructs the node to create a new empty directory. The name of the
   new child directory will be included in the form's arguments.

  POST $URL
  t=uri
  name=childname
  uri=newuri

   This instructs the node to attach a child that is referenced by URI (just
   like the PUT $URL?t=uri method). The name and URI of the new child will be
   included in the form's arguments.

  POST $URL
  t=delete
  name=childname

   This instructs the node to delete a file from the given dirnode. The name
   of the child to be deleted will be included in the form's arguments.

  POST $URL
  t=rename
  from_name=oldchildname
  to_name=newchildname

   This instructs the node to rename a child within the given dirnode. The
   child specified by 'from_name' is removed, and reattached as a child named
   for 'to_name'. This is unconditional and will replace any child already
   present under 'to_name', akin to 'mv -f' in unix parlance.

== XMLRPC ==

  http://localhost:8011/xmlrpc

   This resource provides an XMLRPC server on which all of the previous
   operations can be expressed as function calls taking a "pathname"
   argument. This is provided for applications that want to think of
   everything in terms of XMLRPC.

    listdir(vdrivename, path) -> dict of (childname -> (stuff))
    put(vdrivename, path, contents) -> URI
    get(vdrivename, path) -> contents
    mkdir(vdrivename, path) -> URI
    put_localfile(vdrivename, path, localfilename) -> URI
    get_localfile(vdrivename, path, localfilename)
    put_localdir(vdrivename, path, localdirname)  # recursive
    get_localdir(vdrivename, path, localdirname)  # recursive
    put_uri(vdrivename, path, URI)

    etc.

== Testing/Debugging Commands ==

  GET $URL?t=download&localfile=$LOCALPATH
  GET $URL?t=download&localdir=$LOCALPATH

   The localfile= form instructs the node to download the given file and
   write it into the local filesystem at $LOCALPATH. The localdir= form
   instructs the node to recursively download everything from the given
   directory and below into the local filesystem. To avoid surprises, the
   localfile= form will signal an error if $URL actually refers to a
   directory, likewise if localdir= is used with a $URL that refers to a
   file.

   This request will only be accepted from an HTTP client connection
   originating at 127.0.0.1. This request is most useful when the client node
   and the HTTP client are operated by the same user. $LOCALPATH should be an
   absolute pathname.

   This form is only implemented for testing purposes, because of a trivially
   easy attack: any web server that the local browser visits could serve an
   IMG tag that causes the local node to modify the local filesystem.
   Therefore this form is only enabled if you create a file named
   'webport_allow_localfile' in the node's base directory.

  PUT $NEWURL?t=upload&localfile=$LOCALPATH
  PUT $NEWURL?t=upload&localdir=$LOCALPATH

   This uploads a file or directory from the node's local filesystem to the
   vdrive. As with "GET $URL?t=download&localfile=$LOCALPATH", this request
   will only be accepted from an HTTP connection originating from 127.0.0.1.

   The localfile= form expects that $LOCALPATH will point to a file on the
   node's local filesystem, and causes the node to upload that one file into
   the vdrive at the given location. Any parent directories will be created
   in the vdrive as necessary.

   The localdir= form expects that $LOCALPATH will point to a directory on
   the node's local filesystem, and it causes the node to perform a recursive
   upload of the directory into the vdrive at the given location, creating
   parent directories as necessary. When the operation is complete, the
   directory referenced by $NEWURL will contain all of the files and
   directories that were present in $LOCALPATH, so this is equivalent to the
   unix commands:

    mkdir -p $NEWURL; cp -r $LOCALPATH/* $NEWURL/

   Note that the "curl" utility can be used to provoke this sort of recursive
   upload, since the -T option will make it use an HTTP 'PUT':

    curl -T /dev/null 'http://localhost:8011/vdrive/global/newdir?t=upload&localdir=/home/user/directory-to-upload'

   This form is only implemented for testing purposes, because any attacker's
   web server that a local browser visits could serve an IMG tag that causes
   the local node to modify the local filesystem. Therefore this form is only
   enabled if you create a file named 'webport_allow_localfile' in the node's
   base directory.

  GET $URL?t=manifest

   Return an HTML-formatted manifest of the given directory, for debugging.
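
For test scripts that prefer not to shell out to curl, the recursive
"PUT $NEWURL?t=upload&localdir=$LOCALPATH" request from this section could be
prepared with the Python standard library roughly as follows. This is a
sketch only: the function names are hypothetical, and it assumes a local node
on port 8011 that has 'webport_allow_localfile' in its base directory.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: a local node on port 8011 with webport_allow_localfile present.
BASE_URL = "http://localhost:8011"


def localdir_upload_request(vdrive_path, localdir):
    """Prepare (but do not send) a PUT that asks the node to upload localdir.

    vdrive_path is the part of $NEWURL after "vdrive/", for example
    "global/newdir"; localdir should be an absolute path on the node's host.
    """
    # safe="/" keeps the path readable, matching the curl example above.
    query = urlencode({"t": "upload", "localdir": localdir}, safe="/")
    url = "%s/vdrive/%s?%s" % (BASE_URL, vdrive_path, query)
    # Empty body: the node reads the data from its own filesystem.
    return Request(url, data=b"", method="PUT")


def send(request):
    """Send a prepared request and return (status code, response body)."""
    with urlopen(request) as response:
        return response.status, response.read()


req = localdir_upload_request("global/newdir", "/home/user/directory-to-upload")
print(req.get_method(), req.full_url)
```

Calling send(req) would issue the request; it is left out of the example so
that the sketch runs without a node listening.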