The code for validating the share hash tree and the block hash tree has been rewritten to handle all cases, to share metadata about the file (such as the share hash tree, block hash trees, and UEB) among the different share downloads, and to avoid requiring hashes to be stored on the server unnecessarily: the roots of the block hash trees are not needed (they are also the leaves of the share hash tree), and the root of the share hash tree is not needed (it is already included in the UEB). It also passes the latest tests, including handling corrupted shares well.
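A minimal sketch (plain hashlib, not the actual Tahoe-LAFS hash-tree code, which uses tagged double-SHA-256) of why neither kind of root needs to be stored separately:

    import hashlib

    def _h(data):
        # stand-in hash; the real code uses tagged, double SHA-256
        return hashlib.sha256(data).digest()

    def merkle_root(leaves):
        # pad the leaf list to a power of two, then hash pairwise up to the root
        layer = list(leaves)
        while len(layer) & (len(layer) - 1):
            layer.append(_h(b""))
        while len(layer) > 1:
            layer = [_h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        return layer[0]

    def check_share(blocks, block_tree_roots, sharenum, share_root_from_ueb):
        # the root of this share's block hash tree is just leaf `sharenum`
        # of the share hash tree, so it never needs to be stored on its own
        assert merkle_root([_h(b) for b in blocks]) == block_tree_roots[sharenum]
        # and the share hash tree's root is already recorded in the validated UEB
        assert merkle_root(block_tree_roots) == share_root_from_ueb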
ValidatedReadBucketProxy's constructor takes a share_hash_tree argument: a reference to a single share hash tree shared by all ValidatedReadBucketProxies for that immutable file download.
ValidatedReadBucketProxy requires the block_size and share_size to be provided to its constructor, and it then uses those to compute the offsets and lengths of blocks when it needs them, instead of reading those values out of the share. The user of ValidatedReadBucketProxy therefore has to have first used a ValidatedExtendedURIProxy to compute those two values from the validated contents of the URI. This pleasingly simplifies the safety analysis: the client knows which span of bytes corresponds to a given block from the validated URI data, rather than from unvalidated data stored on the storage server. It also simplifies unit testing of the verifier/repairer, because they no longer care about the contents of the "share size" and "block size" fields in the share. It does not remove the need for the share data v2 layout, because we still need to store and retrieve the offsets of the fields which come after the share data, so we still need share data v2 with its 8-byte offset fields if we want to store share data larger than about 2^32 bytes.
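For example (an illustrative helper, not the actual method names), the span of a block follows directly from those two validated numbers:

    def block_span(blocknum, block_size, share_size):
        # block_size and share_size come from the validated URI/UEB, never from
        # the (unvalidated) fields inside the share on the storage server
        offset = blocknum * block_size
        if offset >= share_size:
            raise ValueError("block %d is past the end of the share data" % blocknum)
        length = min(block_size, share_size - offset)  # the last block may be short
        return (offset, length)

    # e.g. block_span(3, 65536, 200000) == (196608, 3392): the short tail block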
Specify which subset of the block hashes and share hashes you need while downloading a particular share. In the future this will hopefully be used to fetch only a subset, for network efficiency, but currently all of them are fetched, regardless of which subset you specify.
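As a rough illustration (not the hashtree module's real API) of why the needed subset for a single block is small: verifying one leaf of a binary hash tree only requires the sibling of each node on the path from that leaf up to the root.

    def needed_hash_indices(leafnum, num_leaves):
        # flat-array numbering: root is node 0, children of node i are 2i+1 and 2i+2,
        # and (for num_leaves a power of two) the leaves are the last num_leaves nodes
        node = (num_leaves - 1) + leafnum
        needed = set()
        while node > 0:
            needed.add(node + 1 if node % 2 == 1 else node - 1)  # the sibling
            node = (node - 1) // 2                               # the parent
        return needed

    # needed_hash_indices(2, 4) == {1, 6}: leaf 2 is node 5, whose sibling is node 6;
    # its parent is node 2, whose sibling is node 1; node 0 is the (already known) root.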
ReadBucketProxy hides the question of whether it has "started" or not (sent a request to the server to get metadata) from its user.
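The idea, as a hedged sketch (the names here are made up, not ReadBucketProxy's real interface, and the real code is asynchronous):

    class LazilyStartedProxy:
        def __init__(self, fetch_offset_table):
            self._fetch = fetch_offset_table   # callable that does the remote read
            self._offsets = None               # None until the one-time "start" happens

        def _offset_table(self):
            if self._offsets is None:
                self._offsets = self._fetch()  # the hidden "start" step
            return self._offsets

        def block_offset(self, blocknum, block_size):
            # callers just ask for what they want; they never call start() or
            # check whether the metadata request has already been sent
            return self._offset_table()["data"] + blocknum * block_size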
Download is optimized to do as few roundtrips and as few requests as possible, hopefully speeding up download a bit.
I'd implemented stats gathering hooks in the helper a while back. Brian did the same without reference to my changes. This reconciles the two, encompassing all the stats from both, implemented through the stats_provider interface. It also provides templates for all 10 helper graphs in the tahoe-stats munin plugin.
Adds a stats_producer for the upload helper, which provides a series of counters to the stats gatherer under the name 'chk_upload_helper'. It examines both the 'incoming' directory and the 'encoding' directory, providing inc_count, inc_size, inc_size_old, enc_count, enc_size, and enc_size_old: for each directory, the number of files, their total size, and the aggregate size of all files older than 48 hours.
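A hedged sketch of how one such set of counters can be computed (illustrative code, not the actual stats_producer):

    import os, time

    def dir_stats(dirname, old_after=48 * 3600):
        count = size = size_old = 0
        now = time.time()
        for name in os.listdir(dirname):
            path = os.path.join(dirname, name)
            if not os.path.isfile(path):
                continue
            st = os.stat(path)
            count += 1
            size += st.st_size
            if now - st.st_mtime > old_after:
                size_old += st.st_size
        return {"count": count, "size": size, "size_old": size_old}

    # e.g. dir_stats(incoming_dir) yields the values reported as the inc_* counters
    # under 'chk_upload_helper'; the encoding dir yields the enc_* counters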
base62 encoding fits more information into alphanumeric chars while avoiding the troublesome non-alphanumeric chars of base64 encoding. In particular, this allows us to work around the ext3 "32,000 entries in a directory" limit while retaining the convenient property that the intermediate directory names are leading prefixes of the storage index file names.
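A rough sketch of the encoding (the exact alphabet ordering and zero-padding rules are assumptions here, not necessarily what Tahoe uses):

    ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

    def b62encode(data):
        # NB: a real encoder also pads to a fixed width so leading zero bytes survive
        num = int.from_bytes(data, "big")
        digits = []
        while num:
            num, rem = divmod(num, 62)
            digits.append(ALPHABET[rem])
        return "".join(reversed(digits)) or ALPHABET[0]

    # a 16-byte storage index becomes ~22 alphanumeric chars (vs 26 for base32), and
    # its first two chars can still serve as the intermediate directory name, giving
    # 62*62 == 3844 prefix dirs instead of 32*32 == 1024, so each prefix dir holds
    # fewer entries before bumping into ext3's 32,000-entries-per-directory limit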
In a discussion the other day, Brian had asked me to try removing this fix, since
it leads to double-closing the reader. Since the test failures I'd experienced on my
Windows box were related to the ConnectionLost exception problem, and this close
didn't seem to make a difference to the test results, I agreed.
It turns out that the buildbot's environment does fail without this fix, even with
the exception fix, as I'd kind of expected.
It makes sense, because the reader (specifically the file handle) must be closed
before it can be unlinked. At any rate, I'm reinstating this in order to fix the
Windows build.
Unlinking a file before closing it is not portable: it works on unix, but fails on
Windows, where an open file holds a lock. This change closes the reader before
trying to unlink the encoding file within the CHKUploadHelper.
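In other words (a sketch of the ordering, not the actual CHKUploadHelper code):

    import os

    def remove_encoding_file(reader, encoding_path):
        reader.close()            # must come first: Windows locks files that are open
        os.remove(encoding_path)  # now the unlink works on both unix and Windows

    # calling os.remove() while `reader` is still open happens to work on unix,
    # which allows unlinking an open file, but raises an error on Windows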