mirror of
https://github.com/tahoe-lafs/tahoe-lafs.git
synced 2024-12-19 04:57:54 +00:00
webapi.txt: further refactoring and add a section explaining TOCTTOU bugs and how to avoid them by using URIs
parent e68a0e07de
commit 887240e7a3
192
docs/webapi.txt
@@ -48,12 +48,14 @@ Now, what can we do with these URLs? By varying the HTTP "method"
 (GET/PUT/POST/DELETE) and by appending a type-indicating query argument, we
 control what we want to do with the data and how it should be presented.
 
+
+=== files and directories by name ===
+
 In the following examples "$URL" is a shorthand for a URL like the ones
-described above. "$NEWURL" is a shorthand for a URL pointing to a location
-in the vdrive where currently nothing exists.
-
-
-=== files or directories ===
+described above, with "vdrive/" as the top level, followed by a
+slash-separated sequence of file or directory names. "$NEWURL" is a
+shorthand for a URL pointing to a location in the vdrive where currently
+nothing exists.
 
 GET $URL
 
@@ -70,6 +72,10 @@ in the vdrive where currently nothing exists.
 for a human. It also contains forms to upload new files, and to delete
 files and directories. These forms use POST methods to do their job.
 
+You can add the "save=true" argument, which adds a 'Content-Disposition:
+attachment' header to prompt most web browsers to save the file to disk
+rather than attempting to display it.
+
 GET $URL?t=json
 
 This returns machine-parseable information about the named file or
@@ -99,6 +105,9 @@ in the vdrive where currently nothing exists.
 corresponding FILEURL or DIRURL (except that dirnodes do not recurse --
 the "children" entry of the child is omitted).
 
+Before writing code that uses these results, please see the important note
+below about TOCTTOU bugs.
+
 DELETE $URL
 
 This deletes the given file or directory from the vdrive. If it is a
@@ -150,8 +159,7 @@ in the vdrive where currently nothing exists.
 curl -T /dev/null 'http://localhost:8011/vdrive/global/newdir?t=upload&localdir=/home/user/directory-to-upload'
 
 
-
-=== just for files ===
+=== files by name ===
 
 GET $URL?t=file
 
@@ -173,7 +181,7 @@ in the vdrive where currently nothing exists.
 To use this, run 'curl -T localfile http://localhost:8011/vdrive/global/newfile'
 
 
-=== just for directories ===
+=== directories by name ===
 
 GET $URL?t=manifest
 
@@ -194,6 +202,115 @@ in the vdrive where currently nothing exists.
 rename $CHILDNAME, requesting the new name, and submitting POST rename.
 
 
+== URIs ==
+
+A separate top-level resource namespace ("uri" instead of "vdrive") is used
+to get access to files and dirnodes that are indexed directly by URI, rather
+than by going through the vdrive. The resource thus referenced is used the
+same way as if it were accessed through the vdrive (including accessing a
+directory's children with "$URI/childname").
+
+For example, this identifies a file or directory:
+
+http://localhost:8011/uri/$URI
+
+And this identifies a file or directory in a subdirectory of the identified
+directory:
+
+http://localhost:8011/uri/$URI/subdir/foo
+
+In the following examples, "$URI_URL" is a shorthand for a URL like the one
+above, with "uri/" as the top level, followed by a URI.
+
+Note that since tahoe URIs may contain slashes (in particular, dirnode URIs
+contain a FURL, which resembles a regular HTTP URL and starts with pb://),
+when URIs are used in this form, they must be specially quoted. All slashes
+in the URI must be replaced by '!' characters. XXX consider changing the
+allmydata.org uri format to relieve the user of this requirement.
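
As a rough illustration of the quoting rule above, the following Python
sketch builds a "uri/" URL from a raw URI. The helper names (quote_uri,
uri_url) and the sample URI are made up for this example and are not part of
the webapi; the base address is the localhost:8011 one used elsewhere in this
document. Only the slash replacement described above is applied; whether any
further escaping is needed is not covered here.

```python
BASE = "http://localhost:8011"  # node address used in this document's examples


def quote_uri(uri):
    """Replace every slash in a tahoe URI with '!' so it fits in one path segment."""
    return uri.replace("/", "!")


def uri_url(uri, *child_names):
    """Build a uri/-based URL, optionally descending into named children."""
    segments = [quote_uri(uri)] + list(child_names)
    return BASE + "/uri/" + "/".join(segments)


if __name__ == "__main__":
    # Hypothetical dirnode URI, for illustration only.
    example = "URI:DIR:pb://example@host:1234/swissnum"
    print(uri_url(example, "subdir", "foo"))
```

Child names are appended with ordinary slashes, since only the URI itself has
to be squeezed into a single path segment.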
+
+GET $URI_URL
+
+This behaves the same way as a "GET $URL", described in the "files and
+directories" section above, but which file or directory you get does not
+depend on the contents of parent directories as it does with the name-based
+URLs, since a URI uniquely identifies an object regardless of its location.
+
+If the URI identifies a file, then this retrieves the contents of the
+file. Since files accessed this way do not have a filename (from which a
+MIME-type can be derived), one can be specified using a 'filename=' query
+argument. This filename is also the one used if the 'save=true' argument is
+set.
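
A small Python sketch of how the 'filename=' and 'save=true' arguments might
be attached to a uri/-based URL. The function name download_url is
hypothetical, and the URI passed in is assumed to have already had its slashes
replaced by '!'.

```python
from urllib.parse import urlencode


def download_url(quoted_uri, filename=None, save=False):
    """Build a GET URL for a file addressed by URI, with optional query arguments."""
    query = {}
    if filename is not None:
        query["filename"] = filename  # lets the server derive a MIME type
    if save:
        query["save"] = "true"  # asks for a Content-Disposition: attachment header
    url = "http://localhost:8011/uri/" + quoted_uri
    if query:
        url += "?" + urlencode(query)
    return url


if __name__ == "__main__":
    print(download_url("$URI", filename="foo.jpg", save=True))
```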
+
+PUT $URL?t=uri
+
+This attaches a child (either a file or a directory) to the vdrive at the
+given location. The URI is provided in the body of the HTTP request. This
+can be used to attach a shared directory to the vdrive. Intermediate
+directories are created on-demand just like with the regular PUT command.
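
One way to prepare such a request from Python is sketched below. The function
name attach_request and the example path are hypothetical; the sketch also
assumes the URI is sent in the body exactly as given, with no '!' quoting,
which this document does not state explicitly. The request is only
constructed here, not sent.

```python
import urllib.request


def attach_request(name_url, uri):
    """Prepare (but do not send) a PUT that attaches `uri` at the name-based URL."""
    return urllib.request.Request(
        name_url + "?t=uri",
        data=uri.encode("ascii"),  # the URI travels in the request body
        method="PUT",
    )


if __name__ == "__main__":
    req = attach_request("http://localhost:8011/vdrive/private/shared", "$URI")
    print(req.get_method(), req.full_url)
    # To actually send it against a running node:
    # urllib.request.urlopen(req)
```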
+
+GET http://localhost:8011/uri?uri=$URI
+
+This causes a redirect to /uri/$URI, and retains any additional query
+arguments (like filename= or save=). This is for the convenience of web
+forms which allow the user to paste in a URI (obtained through some
+out-of-band channel, like IM or email).
+
+Note that this form only redirects to the specific node indicated by the
+URI: unlike the GET /uri/$URI form, you cannot traverse to child nodes by
+appending additional path segments to the URL.
+
+The $URI provided as a query argument is allowed to contain slashes. The
+redirection provided will escape the slashes with exclamation points, as
+described above.
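
For a client that builds this URL itself rather than through a web form, a
minimal Python sketch follows. The name paste_url is hypothetical, and the
sketch assumes the node accepts standard percent-encoding of the query value,
which is what a browser submitting a form would send.

```python
from urllib.parse import urlencode


def paste_url(uri, **extra):
    """Build the /uri?uri=... redirect URL; extra query arguments are passed along."""
    query = {"uri": uri}  # slashes are allowed here; urlencode percent-encodes them
    query.update(extra)
    return "http://localhost:8011/uri?" + urlencode(query)


if __name__ == "__main__":
    print(paste_url("$URI", filename="foo.jpg"))
```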
+
+
+== TOCTTOU bugs ==
+
+Note that since directories are mutable you can get surprises if you query
+the vdrive, e.g. "GET $URL?t=json", examine the resulting JSON-encoded
+information, and then fetch from or update the vdrive using a name-based URL.
+This is because the actual state of the vdrive could have changed after you
+did the "GET $URL?t=json" query and before you did the subsequent fetch or
+update.
+
+For example, suppose you query to find out that "vdrive/private/somedir/foo"
+is a file which has a certain number of bytes, and then you issue a "GET
+vdrive/private/somedir/foo" to fetch the file. The file that you get might
+have a different number of bytes than the one that you chose to fetch,
+because the "foo" entry in the "somedir" directory may have been changed to
+point to a different file between your query and your fetch, or because the
+"somedir" entry in the private vdrive might have been changed to point to a
+different directory.
+
+Potentially more damaging, suppose that the "foo" entry was changed to point
+to a directory instead of a file. Then instead of receiving the expected
+file, you receive a file containing an HTML page describing the directory
+contents!
+
+These are examples of TOCTTOU bugs ( http://en.wikipedia.org/wiki/TOCTTOU ).
+
+A good way to avoid these bugs is to issue your second request, not with a
+URL based on the sequence of names that lead to the object, but instead with
+the URI of the object. For example, in the case that you query a directory
+listing (with "GET vdrive/private/somedir?t=json"), find a file named "foo"
+therein that you want to download, and then download the file, if you
+download it with its URI ("GET uri/$URI") instead of its URL ("GET
+vdrive/private/somedir/foo") then you will get the file that was in the
+"somedir/" directory under the name "foo" when you queried that directory,
+even if the "somedir/" directory has since been changed so that its "foo"
+child now points to a different file or to a directory.
+
+In general, use names if you want "whatever object (whether file or
+directory) is found by following this sequence of names when my request
+reaches the server". Use URIs if you want "this particular object".
+
+If you are basing your decision to fetch from or update the vdrive on
+filesystem information that was returned by an earlier query, then you
+usually intend to fetch or update the particular object that was in that
+location when you queried it, rather than whatever object is going to be in
+that location when your request reaches the server.
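
The advice above can be sketched in Python. This is only an illustration
under stated assumptions: the "children" mapping, its (kind, URI) tuples with
kind "file" or "directory", and the helper name pinned_url are hypothetical
stand-ins for whatever your code extracts from the "GET $URL?t=json"
response; the actual JSON layout is not assumed here.

```python
def pinned_url(children, name, base="http://localhost:8011"):
    """Return a uri/-based URL for the object seen under `name` at query time.

    `children` maps child names to (kind, uri) tuples taken from one earlier
    directory query (hypothetical shape, see the note above).
    """
    kind, uri = children[name]
    if kind != "file":
        # Decide on the type from the same snapshot that supplied the URI.
        raise ValueError("%r was a %s, not a file, when queried" % (name, kind))
    # Address the object by URI, so later changes to the directory entries
    # cannot redirect this request to a different object.
    return base + "/uri/" + uri.replace("/", "!")


if __name__ == "__main__":
    snapshot = {"foo": ("file", "$URI")}  # as seen when the directory was queried
    print("GET", pinned_url(snapshot, "foo"))
```

Both the type check and the URI come from the same query result, so the
follow-up fetch refers to the object that was examined rather than to
whatever the name "foo" points at when the request arrives.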
+
+
 == POST forms ==
 
 POST $URL
@@ -242,64 +359,6 @@ in the vdrive where currently nothing exists.
 present under 'to_name', akin to 'mv -f' in unix parlance.
 
 
-== URIs ==
-
-http://localhost:8011/uri/$URI
-
-A separate top-level resource namespace ("uri" instead of "vdrive") is used
-to get access to files and dirnodes that are indexed directly by URI,
-rather than by going through the vdrive. The resource thus referenced is
-used the same way as if it were accessed through the vdrive, including
-child-resource-traversal behavior. For example, if the URI corresponds to a
-file, then
-
-GET http://localhost:8011/uri/$URI
-
-would retrieve the contents of the file. Since files accessed this way do
-not have a naturally-occurring filename (from which a MIME-type can be
-derived), one can be specified using a 'filename=' query argument. This
-filename is also the one used if the 'save=true' argument is set, which
-adds a 'Content-Disposition: attachment' header to prompt most web browsers
-to save the file to disk rather than attempting to display it:
-
-GET http://localhost:8011/uri/$URI?filename=foo.jpg
-GET http://localhost:8011/uri/$URI?filename=foo.jpg&save=true
-
-If the URI corresponds to a directory, then:
-
-PUT http://localhost:8011/uri/$URI/subdir/newfile?localfile=$FILENAME
-
-would upload a file (with contents taken from the local filesystem) to a
-new file in a subdirectory of the referenced dirnode.
-
-Note that since tahoe URIs may contain slashes (in particular, dirnode URIs
-contain a FURL, which resembles a regular HTTP URL and starts with pb://),
-when URIs are used in this form, they must be specially quoted. All slashes
-in the URI must be replaced by '!' characters.
-
-PUT $URL?t=uri
-
-This attaches a child (either a file or a directory) to the vdrive at the
-given location. The URI is provided in the body of the HTTP request. This
-can be used to attach a shared directory to the vdrive. Intermediate
-directories are created on-demand just like with the regular PUT command.
-
-GET http://localhost:8011/uri?uri=$URI
-
-This causes a redirect to /uri/$URI, and retains any additional query
-arguments (like filename= or save=). This is for the convenience of web
-forms which allow the user to paste in a URI (obtained through some
-out-of-band channel, like IM or email).
-
-Note that this form only redirects to the specific node indicated by the
-URI: unlike the GET /uri/$URI form, you cannot traverse to child nodes by
-appending additional path segments to the URL.
-
-The $URI provided as a query argument is allowed to contain slashes. The
-redirection provided will escape the slashes with exclamation points, as
-described above.
-
-
 == XMLRPC ==
 
 http://localhost:8011/xmlrpc
@@ -320,3 +379,4 @@ in the vdrive where currently nothing exists.
 put_uri(vdrivename, path, URI)
 
 etc..
+