This makes it so that an optional file which is unreadable or is rm'ed
at the wrong moment will be ignored instead of raising an exception.
It also bums out a couple of unnecessary lines of code (the explicit
".close()" call).
It doesn't exit properly afterward, and it might not do the best things with non-success responses from the server.
(See tahoe-put-web2ish.py for an example of better response handling.)
There are actually two versions in this patch, one of which requires twisted.web2 and the other of which uses the Python standard library's socket module. The socketish one doesn't know when the web server is done so it hangs after doing its thing. (Oh -- maybe I should add an HTTP header asking the web server to close the connection when finished.) The web2ish one works better in that respect. Neither of them handle error responses from the server very well yet.
After lunch I intend to finish the socketish one.
To try one, mv src/allmydata/scripts/tahoe_put-{socketish,web2ish}.py src/allmydata/scripts/tahoe_put.py .
If you want to try the web2ish one, and you can't find a web2 package to install, you can get one from:
http://allmydata.org/~zooko/repos/twistedweb2snarf/
We need to look in the fields because that's how we build the mkdir/upload
forms. Without this, uploading or creating directories would leave us on a
page that had just a URI, instead of something actually useful to a human.
The original twisted.web.http.Request class has a requestReceived method
that parses the form body (in the request's .content filehandle) using
the stdlib cgi.parse_multipart() function. parse_multipart() consumes a
lot of memory when handling large file uploads, because it holds the
arguments entirely in RAM. Nevow's subclass of Request uses cgi.FieldStorage
instead, which is much more memory-efficient.
This patch uses a local subclass of Request and a modified copy of the
requestReceived method. It disables the cgi.parse_multipart parsing and
instead relies upon Nevow's use of FieldStorage. Our code must look for
form elements (which come from the body of the POST request) in req.fields,
rather than assuming that they will be copied into req.args (which now
only contains the query arguments that appear in the URL).
As a result, memory usage is no longer inflated by the size of the file
being uploaded in a POST upload request. Note that cgi.FieldStorage uses
temporary files (tempfile.TemporaryFile) to hold the data.
This closes#29.