tahoe-lafs

mirror of https://github.com/tahoe-lafs/tahoe-lafs.git synced 2024-12-27 08:22:32 +00:00

Author	SHA1	Message	Date
Zooko O'Whielacronx	ecabcc674c	immutable: Make more parts of download use logging mixins and know what their "parent msg id" is.	2009-01-08 11:25:30 -07:00
Zooko O'Whielacronx	2a443cd049	trivial: M-x whitespace-cleanup on src/immutable/download.py	2009-01-08 10:49:01 -07:00
Zooko O'Whielacronx	7d15928faa	immutable: ValidatedExtendedURIProxy computes and stores the tail data size as a convenience to its caller. The "tail data size" is how many of the bytes of the tail segment are data (as opposed to padding).	2009-01-08 10:41:39 -07:00
Zooko O'Whielacronx	83b97ee79f	immutable: fix error in validation of ciphertext hash tree and add test for that code pyflakes pointed out to me that I had committed some code that is untested, since it uses an undefined name. This patch exercises that code -- the validation of the ciphertext hash tree -- by corrupting some of the share files in a very specific way, and also fixes the bug.	2009-01-07 23:40:12 -07:00
Zooko O'Whielacronx	6011f4522f	immutable: do not catch arbitrary exceptions/failures from the attempt to get a crypttext hash tree -- catch only ServerFailure, IntegrityCheckReject, LayoutInvalid, ShareVersionIncompatible, and DeadReferenceError Once again I inserted a bug into the code, and once again it was hidden by something catching arbitrary exception/failure and assuming that it means the server failed to provide valid data.	2009-01-07 22:25:51 -07:00
Zooko O'Whielacronx	e598ca2f3f	download: make sure you really get all the crypttext hashes We were not making sure that we really got all the crypttext hashes during download. If a server were to return less than the complete set of crypttext hashes, then our subsequent attempt to verify the correctness of the ciphertext would fail. (And it wouldn't be obvious without very careful debugging why it had failed.) This patch makes it so that you keep trying to get ciphertext hashes until you have a full set or you run out of servers to ask.	2009-01-07 20:26:38 -07:00
Zooko O'Whielacronx	d5a6eed407	trivial: fix redefinition of name "log" in imports (pyflakes)	2009-01-06 22:08:29 -07:00
Zooko O'Whielacronx	c85f75bb08	immutable: refactor uploader to do just encoding-and-uploading, not encryption This makes Uploader take an EncryptedUploadable object instead of an Uploadable object. I also changed it to return a verify cap instead of a tuple of the bits of data that one finds in a verify cap. This will facilitate hooking together an Uploader and a Downloader to make a Repairer. Also move offloaded.py into src/allmydata/immutable/.	2009-01-06 21:48:22 -07:00
Zooko O'Whielacronx	81add135dc	trivial: whitespace and docstring tidyups	2009-01-06 21:41:04 -07:00
Zooko O'Whielacronx	5e6f90a015	rename "checker results" to "check results", because it is more parallel to "check-and-repair results"	2009-01-06 13:37:03 -07:00
Zooko O'Whielacronx	c35a6ee3a2	trivial: fix a bunch of pyflakes complaints	2009-01-06 08:00:54 -07:00
Zooko O'Whielacronx	6a12f316a4	immutable: new checker and verifier New checker and verifier use the new download class. They are robust against various sorts of failures or corruption. They return detailed results explaining what they learned about your immutable files. Some grotesque sorts of corruption are not properly handled yet, and those ones are marked as TODO or commented-out in the unit tests. There is also a repairer module in this patch with the beginnings of a repairer in it. That repairer is mostly just the interface to the outside world -- the core operation of actually reconstructing the missing data blocks and uploading them is not in there yet. This patch also refactors the unit tests in test_immutable so that the handling of each kind of corruption is reported as passing or failing separately, can be separately TODO'ified, etc. The unit tests are also improved in various ways to require more of the code under test or to stop requiring unreasonable things of it. :-)	2009-01-05 18:28:18 -07:00
Zooko O'Whielacronx	206ab2b44d	immutable: handle another form of share corruption with LayoutInvalid exception instead of AssertionError	2009-01-05 17:46:45 -07:00
Zooko O'Whielacronx	c84bb795f3	trivial: remove unused import (pyflakes)	2009-01-05 17:31:20 -07:00
Zooko O'Whielacronx	f4fab23bf6	immutable: raise a LayoutInvalid exception instead of an AssertionError if the share is corrupted so that the sharehashtree is the wrong size	2009-01-05 14:01:14 -07:00
Zooko O'Whielacronx	98b28c1d5e	immutable: stop reading past the end of the sharefile in the process of optimizing download -- Tahoe storage servers < 1.3.0 return an error if you read past the end of the share file	2009-01-05 13:40:57 -07:00
Zooko O'Whielacronx	8a840469c3	immutable: tidy up the notification of waiters for ReadBucketProxy	2009-01-05 13:35:22 -07:00
Zooko O'Whielacronx	778167c2b1	immutable: refactor downloader to be more reusable for checker/verifier/repairer (and better) The code for validating the share hash tree and the block hash tree has been rewritten to make sure it handles all cases, to share metadata about the file (such as the share hash tree, block hash trees, and UEB) among different share downloads, and not to require hashes to be stored on the server unnecessarily, such as the roots of the block hash trees (not needed since they are also the leaves of the share hash tree), and the root of the share hash tree (not needed since it is also included in the UEB). It also passes the latest tests including handling corrupted shares well. ValidatedReadBucketProxy takes a share_hash_tree argument to its constructor, which is a reference to a share hash tree shared by all ValidatedReadBucketProxies for that immutable file download. ValidatedReadBucketProxy requires the block_size and share_size to be provided in its constructor, and it then uses those to compute the offsets and lengths of blocks when it needs them, instead of reading those values out of the share. The user of ValidatedReadBucketProxy therefore has to have first used a ValidatedExtendedURIProxy to compute those two values from the validated contents of the URI. This pleasingly simplifies safety analysis: the client knows which span of bytes corresponds to a given block from the validated URI data, rather than from the unvalidated data stored on the storage server. It also simplifies unit testing of verifier/repairer, because now it doesn't care about the contents of the "share size" and "block size" fields in the share. It does not relieve the need for share data v2 layout, because we still need to store and retrieve the offsets of the fields which come after the share data, therefore we still need to use share data v2 with its 8-byte fields if we want to store share data larger than about 2^32. Specify which subset of the block hashes and share hashes you need while downloading a particular share. In the future this will hopefully be used to fetch only a subset, for network efficiency, but currently all of them are fetched, regardless of which subset you specify. ReadBucketProxy hides the question of whether it has "started" or not (sent a request to the server to get metadata) from its user. Download is optimized to do as few roundtrips and as few requests as possible, hopefully speeding up download a bit.	2009-01-05 09:51:45 -07:00
Zooko O'Whielacronx	8f5cc24948	trivial: remove unused import (pyflakes)	2009-01-03 12:22:15 -07:00
Zooko O'Whielacronx	5954ab456d	immutable: fix test for truncated reads of URI extension block size	2009-01-03 11:44:27 -07:00
Zooko O'Whielacronx	54787771c3	immutable: fix detection of truncated shares to take into account the fieldsize -- either 4 or 8	2009-01-02 18:57:45 -07:00
Zooko O'Whielacronx	21e0ff97f2	immutable: raise LayoutInvalid instead of struct.error when a share is truncated To fix this error from the Windows buildslave: [ERROR]: allmydata.test.test_immutable.Test.test_download_from_only_3_remaining_shares Traceback (most recent call last): File "C:\Documents and Settings\buildslave\windows-native-tahoe\windows\build\src\allmydata\immutable\download.py", line 135, in _bad raise NotEnoughSharesError("ran out of peers, last error was %s" % (f,)) allmydata.interfaces.NotEnoughSharesError: ran out of peers, last error was [Failure instance: Traceback: <class 'struct.error'>: unpack requires a string argument of length 4 c:\documents and settings\buildslave\windows-native-tahoe\windows\build\support\lib\site-packages\foolscap-0.3.2-py2.5.egg\foolscap\call.py:667:_done c:\documents and settings\buildslave\windows-native-tahoe\windows\build\support\lib\site-packages\foolscap-0.3.2-py2.5.egg\foolscap\call.py:53:complete c:\Python25\lib\site-packages\twisted\internet\defer.py:239:callback c:\Python25\lib\site-packages\twisted\internet\defer.py:304:_startRunCallbacks --- <exception caught here> --- c:\Python25\lib\site-packages\twisted\internet\defer.py:317:_runCallbacks C:\Documents and Settings\buildslave\windows-native-tahoe\windows\build\src\allmydata\immutable\layout.py:374:_got_length C:\Python25\lib\struct.py:87:unpack ] ===============================================================================	2009-01-02 18:48:06 -07:00
Zooko O'Whielacronx	e26cec2502	immutable: add more detailed tests of download, including testing the count of how many reads different sorts of downloads take	2009-01-02 16:54:59 -07:00
Zooko O'Whielacronx	cc70c163ba	trivial: a few improvements to in-line doc and code, and renaming of test/test_immutable_checker.py to test/test_immutable.py That file currently tests checker and verifier and repairer, and will soon also test downloader.	2009-01-02 16:49:41 -07:00
Zooko O'Whielacronx	a52b5542e9	immutable: fix name change from BadOrMissingShareHash to BadOrMissingHash One of the instances of the name accidentally didn't get changed, and pyflakes noticed. The new downloader/checker/verifier/repairer unit tests would also have noticed, but those tests haven't been rolled into a patch and applied to this repo yet...	2009-01-02 13:27:09 -07:00
Zooko O'Whielacronx	c72be1c553	trivial: remove unused import -- thanks, pyflakes	2009-01-02 13:21:28 -07:00
Zooko O'Whielacronx	d8c9c3dc99	immutable: download.py: Raise the appropriate type of exception to indicate the cause of failure, e.g. BadOrMissingHash, ServerFailure, IntegrityCheckReject (which is a supertype of BadOrMissingHash). This helps users (such as verifier/repairer) catch certain classes of reasons for "why did this download not work". The tests of verifier/repairer test this code and rely on this code.	2009-01-02 12:58:58 -07:00
Zooko O'Whielacronx	fa5c1d8326	immutable: ReadBucketProxy defines classes of exception: LayoutInvalid and its two subtypes RidiculouslyLargeURIExtensionBlock and ShareVersionIncompatible. This helps users (such as verifier/repairer) catch certain classes of reasons for "why did this download not work". This code gets exercised by the verifier/repairer unit tests, which corrupt the shares on disk in order to trigger problems like these.	2009-01-02 12:15:54 -07:00
Zooko O'Whielacronx	0ee027c180	immutable: ValidatedExtendedURIProxy computes and stores block_size and share_size for the convenience of its users	2009-01-02 11:43:17 -07:00
Zooko O'Whielacronx	0687f692b0	trivial: "M-x whitespace-cleanup" on immutable/layout.py	2008-12-31 15:07:02 -07:00
Zooko O'Whielacronx	c54783f5e1	immutable: don't catch all exceptions when downloading, catch only DeadReferenceError and IntegrityCheckReject	2008-12-21 17:41:35 -07:00
Zooko O'Whielacronx	ad58f8b693	immutable: invent download.BadOrMissingHashError which is raised if either hashtree.BadHashError or hashtree.NotEnoughHashesError is raised, and which is a subclass of IntegrityCheckReject	2008-12-21 17:41:30 -07:00
Zooko O'Whielacronx	8b7ce325d7	immutable, checker, and tests: improve docstrings, assertions, tests No functional changes, but remove unused code, improve or fix docstrings, etc.	2008-12-21 15:07:52 -07:00
Zooko O'Whielacronx	ec86563326	immutable: when downloading an immutable file, use primary shares if they are available Primary shares require no erasure decoding so the more primary shares you have, the less CPU is used.	2008-12-20 07:14:56 -07:00
Zooko O'Whielacronx	a71a68b31e	trivial: remove unused import (thanks, pyflakes)	2008-12-19 13:46:29 -07:00
Zooko O'Whielacronx	471e1f1b9b	try to tidy up uri-as-string vs. uri-as-object I get confused about whether a given argument or return value is a uri-as-string or uri-as-object. This patch adds a lot of assertions that it is one or the other, and also changes CheckerResults to take objects not strings. In the future, I hope that we generally use Python objects except when importing into or exporting from the Python interpreter e.g. over the wire, the UI, or a stored file.	2008-12-19 08:39:24 -07:00
Zooko O'Whielacronx	7b285ebcb1	immutable: remove the last bits of code (only test code or unused code) which did something with plaintext hashes or plaintext hash trees	2008-12-19 08:18:07 -07:00
Zooko O'Whielacronx	d67a3fe4b1	immutable: use new logging mixins to simplify logging	2008-12-16 18:04:50 -07:00
Zooko O'Whielacronx	d511941136	immutable: refactor ReadBucketProxy a little	2008-12-16 17:53:25 -07:00
Zooko O'Whielacronx	db566db31a	immutable: remove unused code to produce plaintext hashes	2008-12-09 16:45:46 -07:00
Zooko O'Whielacronx	c3edae5158	finish renaming 'subshare' to 'block' in immutable/encode.py and in docs/	2008-12-09 16:33:18 -07:00
Zooko O'Whielacronx	c456ff8591	rename "get_verifier()" to "get_verify_cap()"	2008-12-08 12:44:11 -07:00
Zooko O'Whielacronx	60bbc46a53	minor: fix unused imports -- thanks, pyflakes	2008-12-05 13:07:23 -07:00
Zooko O'Whielacronx	b315619d6b	download: refactor handling of URI Extension Block and crypttext hash tree, simplify things Refactor into a class the logic of asking each server in turn until one of them gives an answer that validates. It is called ValidatedThingObtainer. Refactor the downloading and verification of the URI Extension Block into a class named ValidatedExtendedURIProxy. The new logic of validating UEBs is minimalist: it doesn't require the UEB to contain any unnecessary information, but of course it still accepts such information for backwards compatibility (so that this new download code is able to download files uploaded with old, and for that matter with current, upload code). The new logic of validating UEBs follows the practice of doing all validation up front. This practice advises one to isolate the validation of incoming data into one place, so that all of the rest of the code can assume only valid data. If any redundant information is present in the UEB+URI, the new code cross-checks and asserts that it is all fully consistent. This closes some issues where the uploader could have uploaded inconsistent redundant data, which would probably have caused the old downloader to simply reject that download after getting a Python exception, but perhaps could have caused greater harm to the old downloader. I removed the notion of selecting an erasure codec from codec.py based on the string that was passed in the UEB. Currently "crs" is the only such string that works, so "_assert(codec_name == 'crs')" is simpler and more explicit. This is also in keeping with the "validate up front" strategy -- now if someone sets a different string than "crs" in their UEB, the downloader will reject the download in the "validate this UEB" function instead of in a separate "select the codec instance" function. I removed the code to check plaintext hashes and plaintext Merkle Trees. Uploaders do not produce this information any more (since it potentially exposes confidential information about the file), and the unit tests for it were disabled. The downloader before this patch would check that plaintext hash or plaintext merkle tree if they were present, but not complain if they were absent. The new downloader in this patch complains if they are present and doesn't check them. (We might in the future re-introduce such hashes over the plaintext, but encrypt the hashes which are stored in the UEB to preserve confidentiality. This would be a double-check on the correctness of our own source code -- the current Merkle Tree over the ciphertext is already sufficient to guarantee the integrity of the download unless there is a bug in our Merkle Tree or AES implementation.) This patch increases the lines-of-code count by 8 (from 17,770 to 17,778), and reduces the uncovered-by-tests lines-of-code count by 24 (from 1408 to 1384). Those numbers would be more meaningful if we omitted src/allmydata/util/ from the test-coverage statistics.	2008-12-05 08:17:54 -07:00
Brian Warner	3e25efc010	upload: when using a Helper, insist that it provide protocols/helper/v1 . Related to #538 .	2008-11-21 20:29:32 -07:00
Brian Warner	0fab511be5	upload: don't use servers which can't support the share size we need. This ought to avoid #439 problems. Some day we'll have a storage server which advertises support for a larger share size. No tests yet.	2008-11-21 20:28:12 -07:00
Brian Warner	bf06492a90	#538 : fetch version and attach to the rref. Make IntroducerClient demand v1 support.	2008-11-21 20:07:27 -07:00
Brian Warner	7932fadb5e	webapi: add 'summary' string to checker results JSON	2008-11-18 18:28:26 -07:00
Brian Warner	dfa2408157	checker: add is_recoverable() to checker results, make our stub immutable-verifier not throw an exception on unrecoverable files, add tests	2008-11-06 22:35:47 -07:00
Brian Warner	6fa41e738b	immutable: tolerate filenode.read() with a size= that's too big, rather than hanging	2008-11-04 15:29:19 -07:00
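
Several of the commits above (b315619d6b, c54783f5e1, 6011f4522f) describe one recurring pattern: ask each server in turn until one gives an answer that validates, and catch only the exception types that mean "this server could not provide valid data" so that bugs in the client code are not silently swallowed. The following is a minimal synchronous sketch of that idea, not the Tahoe-LAFS implementation (which is asynchronous and lives in src/allmydata/immutable/download.py). The exception classes are local stand-ins named after the ones mentioned in the commit messages, and obtain_validated and the fetcher callables are hypothetical names introduced only for illustration.

```python
# Local stand-ins for the exception names mentioned in the commit messages.
# They are NOT imported from Tahoe-LAFS; the hierarchy follows the log text.
class ServerFailure(Exception):
    """The server failed to answer or returned an error."""

class IntegrityCheckReject(Exception):
    """The server's answer failed validation."""

class BadOrMissingHash(IntegrityCheckReject):
    """A hash was wrong or absent (subtype of IntegrityCheckReject)."""

class LayoutInvalid(Exception):
    """The share's layout is corrupted or truncated."""

class ShareVersionIncompatible(LayoutInvalid):
    """The share uses a version this client cannot read."""

class NotEnoughSharesError(Exception):
    """Every server was tried and none gave a valid answer."""

# Only these types are treated as "try the next server".
EXPECTED_FAILURES = (ServerFailure, IntegrityCheckReject, LayoutInvalid)

def obtain_validated(fetchers):
    """Call each fetcher in turn; return the first result that validates.

    `fetchers` is a hypothetical list of zero-argument callables, one per
    server, each of which fetches and validates a value or raises.
    """
    last_error = None
    for fetch in fetchers:
        try:
            return fetch()
        except EXPECTED_FAILURES as e:
            # Expected failure mode: remember it and move on.
            last_error = e
        # Anything else (NameError, AssertionError, ...) propagates,
        # so a bug in the client is not mistaken for a bad server.
    raise NotEnoughSharesError(
        "ran out of peers, last error was %r" % (last_error,))

def corrupt_share():
    raise LayoutInvalid("share hash tree is the wrong size")

def good_share():
    return b"validated UEB"

def buggy_client_code():
    raise NameError("bug in client code")

result = obtain_validated([corrupt_share, good_share])
```

With a catch-all except clause, the NameError from buggy_client_code would be reported as a server failure; with the narrow tuple it surfaces immediately, which is the failure mode the commit messages describe wanting.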


88 Commits