webapi.txt: further refactoring and add a section explaining TOCTTOU bugs and how to avoid them by using URIs

Zooko O'Whielacronx 2007-08-10 12:04:30 -07:00
parent e68a0e07de
commit 887240e7a3


@ -48,12 +48,14 @@ Now, what can we do with these URLs? By varying the HTTP "method"
(GET/PUT/POST/DELETE) and by appending a type-indicating query argument, we
control what we want to do with the data and how it should be presented.
=== files and directories by name ===
In the following examples "$URL" is a shorthand for a URL like the ones
described above. "$NEWURL" is a shorthand for a URL pointing to a location
in the vdrive where currently nothing exists.
=== files or directories ===
described above, with "vdrive/" as the top level, followed by a
slash-separated sequence of file or directory names. "$NEWURL" is a
shorthand for a URL pointing to a location in the vdrive where currently
nothing exists.
GET $URL
@ -70,6 +72,10 @@ in the vdrive where currently nothing exists.
for a human. It also contains forms to upload new files, and to delete
files and directories. These forms use POST methods to do their job.
You can add the "save=true" argument, which adds a 'Content-Disposition:
attachment' header to prompt most web browsers to save the file to disk
rather than attempting to display it.
GET $URL?t=json
This returns machine-parseable information about the named file or
@ -99,6 +105,9 @@ in the vdrive where currently nothing exists.
corresponding FILEURL or DIRURL (except that dirnodes do not recurse --
the "children" entry of the child is omitted).
Before writing code that uses these results, please see the important note
below about TOCTTOU bugs.
DELETE $URL
This deletes the given file or directory from the vdrive. If it is a
@ -150,8 +159,7 @@ in the vdrive where currently nothing exists.
curl -T /dev/null 'http://localhost:8011/vdrive/global/newdir?t=upload&localdir=/home/user/directory-to-upload'
=== just for files ===
=== files by name ===
GET $URL?t=file
@ -173,7 +181,7 @@ in the vdrive where currently nothing exists.
To use this, run 'curl -T localfile http://localhost:8011/vdrive/global/newfile'
=== just for directories ===
=== directories by name ===
GET $URL?t=manifest
@ -194,6 +202,115 @@ in the vdrive where currently nothing exists.
rename $CHILDNAME, requesting the new name, and submitting POST rename.
== URIs ==
A separate top-level resource namespace ("uri" instead of "vdrive") is used
to get access to files and dirnodes that are indexed directly by URI, rather
than by going through the vdrive. The resource thus referenced is used the
same way as if it were accessed through the vdrive (including accessing a
directory's children with "$URI/childname").
For example, this identifies a file or directory:
http://localhost:8011/uri/$URI
And this identifies a file or directory in a subdirectory of the identified
directory:
http://localhost:8011/uri/$URI/subdir/foo
In the following examples, "$URI_URL" is a shorthand for a URL like the one
above, with "uri/" as the top level, followed by a URI.
Note that since tahoe URIs may contain slashes (in particular, dirnode URIs
contain a FURL, which resembles a regular HTTP URL and starts with pb://),
when URIs are used in this form, they must be specially quoted. All slashes
in the URI must be replaced by '!' characters. XXX consider changing the
allmydata.org uri format to relieve the user of this requirement.
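The quoting rule above can be sketched in a few lines of Python. This is only
an illustration, not part of the node's code: the helper names quote_uri and
uri_url are hypothetical, and the example URI is a made-up placeholder rather
than a real tahoe URI. Child names are appended as-is.

```python
BASE = "http://localhost:8011"  # node address used in this document's examples


def quote_uri(uri: str) -> str:
    """Replace every slash in a URI with '!' so it fits in one path segment."""
    return uri.replace("/", "!")


def uri_url(uri: str, *children: str) -> str:
    """Build a $URI_URL, optionally followed by child names (hypothetical helper)."""
    return "/".join([BASE + "/uri", quote_uri(uri), *children])


# Made-up placeholder URI, chosen only because it contains slashes.
example_uri = "pb://example/dirnode"
print(uri_url(example_uri, "subdir", "foo"))
```

Keeping the quoting in a single helper means that only one place has to
change if the URI format is later revised to drop this requirement.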
GET $URI_URL
This behaves the same way as "GET $URL", described in the "files and
directories" section above, except that which file or directory you get does
not depend on the contents of parent directories as it does with the
name-based URLs, since a URI uniquely identifies an object regardless of its
location.
If the URI identifies a file, then this retrieves the contents of the
file. Since files accessed this way do not have a filename (from which a
MIME-type can be derived), one can be specified using a 'filename=' query
argument. This filename is also the one used if the 'save=true' argument is
set.
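As a rough sketch, a client could assemble such a download URL as follows.
The function name download_url and its parameters are assumptions made for
this example; only the 'filename=' and 'save=true' arguments and the '!'
quoting come from this document.

```python
from urllib.parse import urlencode


def download_url(uri, filename=None, save=False, base="http://localhost:8011"):
    """Build a GET URL for a file identified by URI (hypothetical helper)."""
    quoted = uri.replace("/", "!")  # slashes in the URI must become '!'
    args = {}
    if filename is not None:
        args["filename"] = filename  # lets the node derive a MIME type
    if save:
        args["save"] = "true"  # asks for 'Content-Disposition: attachment'
    url = f"{base}/uri/{quoted}"
    return url + ("?" + urlencode(args) if args else "")


# Placeholder URI, not a real one.
print(download_url("pb://example/file", filename="foo.jpg", save=True))
```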
PUT $URL?t=uri
This attaches a child (either a file or a directory) to the vdrive at the
given location. The URI is provided in the body of the HTTP request. This
can be used to attach a shared directory to the vdrive. Intermediate
directories are created on-demand just like with the regular PUT command.
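A minimal sketch of preparing such a request with Python's standard library
is shown here. It assumes that the URI is sent verbatim as the request body
(this document does not say whether any quoting is needed there), and the
helper name attach_request and the target path are made up for illustration.
The request is only constructed, not sent.

```python
import urllib.request


def attach_request(url, uri):
    """Prepare 'PUT $URL?t=uri' with the URI as the body (hypothetical helper)."""
    # Assumption: the body carries the URI unmodified, encoded as ASCII.
    return urllib.request.Request(url + "?t=uri", data=uri.encode("ascii"), method="PUT")


req = attach_request("http://localhost:8011/vdrive/global/shared", "pb://example/dirnode")
print(req.get_method(), req.full_url)
# Passing req to urllib.request.urlopen() would send it to a running node.
```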
GET http://localhost:8011/uri?uri=$URI
This causes a redirect to /uri/$URI, and retains any additional query
arguments (like filename= or save=). This is for the convenience of web
forms which allow the user to paste in a URI (obtained through some
out-of-band channel, like IM or email).
Note that this form only redirects to the specific node indicated by the
URI: unlike the GET /uri/$URI form, you cannot traverse to child nodes by
appending additional path segments to the URL.
The $URI provided as a query argument is allowed to contain slashes. The
redirection provided will escape the slashes with exclamation points, as
described above.
== TOCTTOU bugs ==
Note that since directories are mutable, you can get surprises if you query
the vdrive (e.g. with "GET $URL?t=json"), examine the resulting JSON-encoded
information, and then fetch from or update the vdrive using a name-based URL.
This is because the actual state of the vdrive could have changed after you
did the "GET $URL?t=json" query and before you did the subsequent fetch or
update.
For example, suppose you query to find out that "vdrive/private/somedir/foo"
is a file which has a certain number of bytes, and then you issue a "GET
vdrive/private/somedir/foo" to fetch the file. The file that you get might
have a different number of bytes than the one that you chose to fetch,
because the "foo" entry in the "somedir" directory may have been changed to
point to a different file between your query and your fetch, or because the
"somedir" entry in the private vdrive might have been changed to point to a
different directory.
Potentially more damaging, suppose that the "foo" entry was changed to point
to a directory instead of a file. Then instead of receiving the expected
file, you receive a file containing an HTML page describing the directory
contents!
These are examples of TOCTTOU (time-of-check-to-time-of-use) bugs
( http://en.wikipedia.org/wiki/TOCTTOU ).
A good way to avoid these bugs is to issue your second request, not with a
URL based on the sequence of names that leads to the object, but instead with
the URI of the object. For example, suppose you query a directory listing
(with "GET vdrive/private/somedir?t=json"), find a file named "foo" therein
that you want to download, and then download that file. If you download it
with its URI ("GET uri/$URI") instead of its URL ("GET
vdrive/private/somedir/foo"), then you will get the file that was in the
"somedir/" directory under the name "foo" when you queried that directory,
even if the "somedir/" directory has since been changed so that its "foo"
child now points to a different file or to a directory.
In general, use names if you want "whatever object (whether file or
directory) is found by following this sequence of names when my request
reaches the server". Use URIs if you want "this particular object".
If you are basing your decision to fetch from or update the vdrive on
filesystem information that was returned by an earlier query, then you
usually intend to fetch or update the particular object that was in that
location when you queried it, rather than whatever object is going to be in
that location when your request reaches the server.
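The difference between the two kinds of lookup can be illustrated with a toy
in-memory model. This is not the tahoe API: the dictionaries, the "URI:1" and
"URI:2" strings, and the function names are all invented for this sketch. A
directory is modelled as a mutable mapping from names to URIs, and objects
are looked up by URI.

```python
# Toy model only: objects indexed by URI, and one mutable directory.
objects = {"URI:1": b"old contents", "URI:2": b"different file"}
somedir = {"foo": "URI:1"}  # name -> URI; anyone with write access may change it


def get_by_name(directory, name):
    """Resolve the name at request time, like a name-based URL."""
    return objects[directory[name]]


def get_by_uri(uri):
    """Fetch one particular object, like 'GET uri/$URI'."""
    return objects[uri]


checked_uri = somedir["foo"]   # time of check: remember the URI from the listing
somedir["foo"] = "URI:2"       # meanwhile, someone re-points "foo"

by_name = get_by_name(somedir, "foo")  # time of use: follows the new entry
by_uri = get_by_uri(checked_uri)       # time of use: still the checked object
print(by_name, by_uri)
```

The name-based fetch returns whatever "foo" points to when the request is
made, while the URI-based fetch returns the object that was examined.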
== POST forms ==
POST $URL
@ -242,64 +359,6 @@ in the vdrive where currently nothing exists.
present under 'to_name', akin to 'mv -f' in unix parlance.
== URIs ==
http://localhost:8011/uri/$URI
A separate top-level resource namespace ("uri" instead of "vdrive") is used
to get access to files and dirnodes that are indexed directly by URI,
rather than by going through the vdrive. The resource thus referenced is
used the same way as if it were accessed through the vdrive, including
child-resource-traversal behavior. For example, if the URI corresponds to a
file, then
GET http://localhost:8011/uri/$URI
would retrieve the contents of the file. Since files accessed this way do
not have a naturally-occurring filename (from which a MIME-type can be
derived), one can be specified using a 'filename=' query argument. This
filename is also the one used if the 'save=true' argument is set, which
adds a 'Content-Disposition: attachment' header to prompt most web browsers
to save the file to disk rather than attempting to display it:
GET http://localhost:8011/uri/$URI?filename=foo.jpg
GET http://localhost:8011/uri/$URI?filename=foo.jpg&save=true
If the URI corresponds to a directory, then:
PUT http://localhost:8011/uri/$URI/subdir/newfile?localfile=$FILENAME
would upload a file (with contents taken from the local filesystem) to a
new file in a subdirectory of the referenced dirnode.
Note that since tahoe URIs may contain slashes (in particular, dirnode URIs
contain a FURL, which resembles a regular HTTP URL and starts with pb://),
when URIs are used in this form, they must be specially quoted. All slashes
in the URI must be replaced by '!' characters.
PUT $URL?t=uri
This attaches a child (either a file or a directory) to the vdrive at the
given location. The URI is provided in the body of the HTTP request. This
can be used to attach a shared directory to the vdrive. Intermediate
directories are created on-demand just like with the regular PUT command.
GET http://localhost:8011/uri?uri=$URI
This causes a redirect to /uri/$URI, and retains any additional query
arguments (like filename= or save=). This is for the convenience of web
forms which allow the user to paste in a URI (obtained through some
out-of-band channel, like IM or email).
Note that this form only redirects to the specific node indicated by the
URI: unlike the GET /uri/$URI form, you cannot traverse to child nodes by
appending additional path segments to the URL.
The $URI provided as a query argument is allowed to contain slashes. The
redirection provided will escape the slashes with exclamation points, as
described above.
== XMLRPC ==
http://localhost:8011/xmlrpc
@ -320,3 +379,4 @@ in the vdrive where currently nothing exists.
put_uri(vdrivename, path, URI)
etc.