Commit Graph

51 Commits

Author SHA1 Message Date
Brian Warner
1d9a58977f uri: implement URI-processing classes, IFileURI/IDirnodeURI, use internally 2007-07-21 15:40:36 -07:00
Brian Warner
c6f52e379a rename storageserver.py to just storage.py, since it has both server and client sides now 2007-07-13 17:25:45 -07:00
Brian Warner
7589a8ee82 storage: we must truncate short segments. Now most tests pass (except uri_extension) 2007-07-13 16:38:25 -07:00
Brian Warner
cd8648d39b storage: use one file per share instead of 7 (#85). work-in-progress, tests still fail 2007-07-13 14:04:49 -07:00
Brian Warner
53cf757be3 make it possible to download LIT uris. oops. 2007-07-12 16:16:59 -07:00
Brian Warner
dce1dc2730 storage: wrap buckets in a local proxy
This will make it easier to change RIBucketWriter in the future to reduce the wire
protocol to just open/write(offset,data)/close, and do all the structuring on the
client end. The ultimate goal is to store each bucket in a single file, to reduce
the considerable filesystem-quantization/inode overhead on the storage servers.
2007-07-08 23:27:46 -07:00
Brian Warner
c4a8db3eb2 webish: provide a valid Content-Length header on downloads 2007-07-03 15:09:00 -07:00
Brian Warner
622acc690a webish: improve reporting of web download errors that occur early enough
If the error occurs before any data has been sent, we can give a sensible
error message (code 500, stack trace, etc). This will cover most of the error
cases. The ones that aren't covered are when we run out of good peers after
successfully decoding the first segment, either because they go away or
because their shares are corrupt.
2007-07-03 13:47:37 -07:00
Brian Warner
f15bb302a1 webish.py: handle errors during download better. Addresses #65.
Previously, exceptions during a web download caused a hang rather than some
kind of exception or error message. This patch improves the situation by
terminating the HTTP download rather than letting it hang forever. The
behavior still isn't ideal, however, because the error can occur too late to
abort the HTTP request cleanly (i.e. with an error code). In fact, the
Content-Type header and response code have already been set by the time any
download errors have been detected, so the browser is committed to displaying
an image or whatever (thus any error message we put into the stream is
unlikely to be displayed in a meaningful way).
2007-07-03 13:18:14 -07:00
Brian Warner
382888899b refactor URI_extension handlers out of encode/download and into uri.py 2007-06-11 18:25:18 -07:00
Brian Warner
956d5ae256 rename fileid/verifierid to plaintext_hash/crypttext_hash 2007-06-09 20:46:04 -07:00
Brian Warner
584dc4ae94 handle uri_extension with a non-bencode serialization scheme 2007-06-08 16:17:54 -07:00
Brian Warner
c9ef291c02 rename thingA to 'uri extension' 2007-06-08 15:59:16 -07:00
Brian Warner
72bc8627de consolidate multiple definitions of NotEnoughPeersError 2007-06-07 22:20:55 -07:00
Brian Warner
f62a544b93 remove several leftover defintions of netstring() 2007-06-07 22:13:18 -07:00
Brian Warner
c049941529 move almost all hashing to SHA256, consolidate into hashutil.py
The only SHA-1 hash that remains is used in the permutation of nodeids,
where we need to decide if we care about performance or long-term security.
I suspect that we could use a much weaker hash (and faster) hash for
this purpose. In the long run, we'll be doing thousands of such hashes
for each file uploaded or downloaded (one per known peer).
2007-06-07 21:47:21 -07:00
Brian Warner
053109b28b add tests for bad/inconsistent plaintext/crypttext merkle tree hashes 2007-06-07 19:32:29 -07:00
Brian Warner
b2caf7fb9a encode/download: reduce memory footprint by deleting large intermediate buffers as soon as possible, improve hash tree usage 2007-06-07 13:15:58 -07:00
Brian Warner
e04ff3adac fetch plaintext/crypttext merkle trees during download, but don't check the segments against them yet 2007-06-07 00:15:41 -07:00
Brian Warner
fae4e8f9a3 download.py: refactor get-thingA-from-somebody to reuse the logic for other things 2007-06-06 23:50:02 -07:00
Brian Warner
3dfd26970b move validation data to thingA, URI has storage_index plus thingA hash
This (compatibility-breaking) change moves much of the validation data and
encoding parameters out of the URI and into the so-called "thingA" block
(which will get a better name as soon as we find one we're comfortable with).
The URI retains the "storage_index" (a generalized term for the role that
we're currently using the verifierid for, the unique index for each file
that gets used by storage servers to decide which shares to return), the
decryption key, the needed_shares/total_shares counts (since they affect
peer selection), and the hash of the thingA block.

This shortens the URI and lets us add more kinds of validation data without
growing the URI (like plaintext merkle trees, to enable strong incremental
plaintext validation), at the cost of maybe 150 bytes of alacrity. Each
storage server holds an identical copy of the thingA block.

This is an incompatible change: new messages have been added to the storage
server interface, and the URI format has changed drastically.
2007-06-01 18:48:01 -07:00
Brian Warner
7124f94461 download.py: refactor bucket_failed() a bit, add some docs 2007-05-31 18:31:36 -07:00
Brian Warner
05163ec8e1 change uri-packer-unpacker to deal with dictionaries, not fragile tuples 2007-05-23 11:18:49 -07:00
Brian Warner
4b2298937b use real encryption, generate/store/verify verifierid and fileid 2007-04-25 17:53:10 -07:00
Brian Warner
80cf789817 download: remove unused import 2007-04-17 21:11:20 -07:00
Brian Warner
b76aa1ce17 download: oops, NotEnoughHashesError comes from hashtree, not hashutil 2007-04-17 20:37:51 -07:00
Brian Warner
e7ec4ff4e5 factor out the tagged hash function used for subshares/blocks 2007-04-17 20:27:56 -07:00
Brian Warner
76e28b3484 comment out some verbose log messages, add commented-out new ones 2007-04-17 20:25:52 -07:00
Brian Warner
c3268ca394 download.py: don't truncate tail segments that are the same size as all the others 2007-04-17 13:39:35 -07:00
Brian Warner
6bdabd2cea download: remove some leftover (and not very useful) debug logging 2007-04-16 17:17:57 -07:00
Brian Warner
42179e5ae2 download: verify that bad blocks or hashes are caught by the download process 2007-04-16 16:30:21 -07:00
Brian Warner
2fef5dac1f download: log more information when hashtree checks fail 2007-04-16 13:08:19 -07:00
Brian Warner
7dabb68a51 download: improve test coverage on our IDownloadTarget classes, make FileHandle return the filehandle when its done so that it is easier to close 2007-04-16 13:07:36 -07:00
Brian Warner
30133a7cdf hash trees: further cleanup, to make sure we're validating the right thing
hashtree.py: improve the methods available for finding out which hash nodes
 are needed. Change set_hashes() to require that every hash provided can
 be validated up to the root.
download.py: validate from the top down, including the URI-derived roothash
 in the share hash tree, and stashing the thus-validated share hash for use
 in the block hash tree.
2007-04-12 19:41:48 -07:00
Brian Warner
d351cd7674 download: always validate the blockhash, and don't let the bucket trick us into not validating hashes 2007-04-12 15:18:46 -07:00
Brian Warner
d8215e0c6f rename chunk.py to hashtree.py 2007-04-12 13:13:25 -07:00
Brian Warner
8f58b30db9 verify hash chains on incoming blocks
Implement enough of chunk.IncompleteHashTree to be usable.
Rearrange download: all block/hash requests now go through
a ValidatedBucket instance, which is responsible for retrieving
and verifying hashes before providing validated data. Download
was changed to use ValidatedBuckets everywhere instead of
unwrapped RIBucketReader references.
2007-04-12 13:07:40 -07:00
Brian Warner
d8b71b85f8 download: retrieve share hashes when downloading. We don't really do much validation with them yet, though. 2007-04-06 22:51:19 -07:00
Zooko O'Whielacronx
077eb7507c assert that only dicts get passed to _got_response() 2007-03-30 18:00:40 -07:00
Brian Warner
7cd9ef3bbf finish making the new encoder/decoder/upload/download work 2007-03-30 16:50:50 -07:00
Brian Warner
234b2f354e add new test for doing an encode/decode round trip, and make it almost work 2007-03-30 13:20:01 -07:00
Zooko O'Whielacronx
f4a718c5b6 finish storage server and write new download 2007-03-30 10:52:19 -07:00
Zooko O'Whielacronx
17299fc96e new upload and storage server 2007-03-29 20:19:52 -07:00
Zooko O'Whielacronx
c427b880d2 update the use of the encoder API in download.py 2007-02-01 16:30:13 -07:00
Brian Warner
ef73ebaf0a download: update all users to match Zooko's change to ICodecDecoder.decode (as it now returns a list instead of a single string) 2007-01-24 17:23:22 -07:00
Brian Warner
430b3a03fc move upload/download interfaces to allmydata.interfaces, let SubTreeMaker assert IDownloader-ness of its 'downloader' argument 2007-01-21 15:01:34 -07:00
Brian Warner
ca3fda3e22 download.py: fix IDownloader to take a URI 2007-01-19 02:17:48 -07:00
Brian Warner
4101bcf218 update URI format, include codec name 2007-01-16 21:29:59 -07:00
Brian Warner
3209fd5e09 rearrange encode/upload, add URIs, switch to ReplicatingEncoder
Added metadata to the bucket store, which is used to hold the share number
(but the bucket doesn't know that, it just gets a string).

Modified the codec interfaces a bit.

Try to pass around URIs to/from download/upload instead of verifierids.
URI format is still in flux.

Change the current (primitive) file encoder to use a ReplicatingEncoder
because it provides ICodecEncoder. We will be moving to the (less primitive)
file encoder (currently in allmydata.encode_new) eventually, but for now
this change lets us test out PyRS or zooko's upcoming C-based RS codec in
something larger than a single unit test. This primitive file encoder only
uses a single segment, and has no merkle trees.

Also added allmydata.util.deferredutil for a DeferredList wrapper that
errbacks (but only when all component Deferreds have fired) if there were
any errors, which unfortunately is not a behavior available from the standard
DeferredList.
2007-01-15 21:22:22 -07:00
Brian Warner
417c17755b use the word 'codec' for erasure coding, for now. 'encode' is used for file-level segmentation/hashing 2007-01-11 20:51:27 -07:00