* repairer (really the uploader) reads beyond end of input file (Uploadable)
* new-downloader does not tolerate overreads
* uploader does lots of tiny reads (inefficient)
This fixes the last two. The uploader still does a single overread at the end
of the input file, but now that's ok so we can leave it in place. The
uploader now expects the Uploadable to behave like a normal disk
file (reading beyond EOF will return less data than was asked for), and now
the new-downloadable behaves that way.
This removes the need to use a locally-built (dependency) bin/twistd, and
removes a big chunk of behavior differences between unix and windows. It
also happens to resolve the "client node probably started" uncertainty.
Might help with #1190, #602, and #71.
fixes#530. I earlier tried this twice (see #530 for history) and then twice rolled it back due to some problems that arose. However, I didn't write down what the problems were in enough detail on the ticket that I can tell today whether those problems are still issues, so here goes the third attempt. (I did write down on the ticket that it would not create site.py or .pth files in the target directory with --multi-version mode, but I didn't explain why *that* was a problem.)
fixes#1191
Patch by Brian. This patch description was actually written by Zooko, but I forged Brian's name on the "author" field so that he would get credit for this patch in revision control history.
If you are investigating the bug in new-downloader, one way to investigate might be to change this ordering to a different fixed order (e.g. rotate by 4 instead of rotate by 5) and observe how the behavior of new-downloader differs in that case.
Kyle's OpenBSD buildslave used 41 reads when doing this test. The fact that I'm blindly bumping this number up to match the observed behavior probably means this isn't a good criterion to be testing for anyway. But perhaps someone else (Brian) could investigate why that run on Kyle's OpenBSD box took four more reads than we expected, and whether the fact that it took 41 reads to do this operation is indicative of an actual problem.
deliver all shares at once instead of feeding them out one-at-a-time.
Also fix distribution of real-number-of-segments information: now all
CommonShares (not just the ones used for the first segment) get a
correctly-sized hashtree. Previously, the late ones might not, which would
make them crash and get dropped (causing the download to fail if the initial
set were insufficient, perhaps because one of their servers went away).
Update tests, add some TODO notes, improve variable names and comments.
Improve logging: add logparents, set more appropriate levels.
This avoids spamming the "recent uploads and downloads" /status page from
FileNode instances that were created for a directory read but which nobody is
ever going to read from. I also cleaned up the way DownloadStatus instances
are made to only ever do it in the CiphertextFileNode, not in the
higher-level plaintext FileNode. Also fixed DownloadStatus handling of read
size, thanks to David-Sarah for the catch.
seems to avoid the #1155 log message which reveals the URI (and filecap).
Also add an [ERROR] marker to the flog entry, since unregisterProducer also
makes interrupted downloads appear "200 OK"; this makes it more obvious that
the download did not complete.
The lost-progress bug occurred when two simultanous read() calls fetched
different segments, and the first one failed (due to corruption, or the other
bugs in #1154): the second read() would never complete. While in this state,
cancelling the second read by having its consumer call stopProducing) would
trigger the cancel-intolerance bug. Finally, in downloader.node.Cancel,
prevent late cancels by adding an 'active' flag
The Range header causes n.read() to be called with an offset= of type 'long',
which eventually got used in a Spans/DataSpans object's __len__ method.
Apparently python doesn't permit __len__() to return longs, only ints.
Rewrote Spans/DataSpans to use s.len() instead of len(s) aka s.__len__() .
Added a test in test_download. Note that test_web didn't catch this because
it uses mock FileNodes for speed: it's probably time to rewrite that.
There is still an unresolved error-recovery problem in #1154, so I'm not
closing the ticket quite yet.
The fixed 10-second timer will eventually be replaced with a per-server
value, calculated based on observed response times.
test_hung_server.py: enhance to exercise DYHB=OVERDUE state. Split existing
mutable+immutable tests into two pieces for clarity. Reenabled several tests.
Deleted the now-obsolete "test_failover_during_stage_4".