Commit Graph

79 Commits

Author SHA1 Message Date
Brian Warner
15ef9f624d encode: log a plaintext hash and SI for each upload. This will allow the log gatherer to correlate the two, to better measure the benefits of convergence 2008-03-24 18:55:37 -07:00
Brian Warner
1e097766c9 disable plaintext hashes in shares, but leave a switch to turn it back on 2008-03-24 13:39:51 -07:00
Brian Warner
7b21054c33 UNDO: upload: stop putting plaintext and ciphertext hashes in shares.
This removes the guess-partial-information attack vector, and reduces
the amount of overhead that we consume with each file. It also introduces
a forwards-compability break: older versions of the code (before the
previous download-time "make hashes optional" patch) will be unable
to read files uploaded by this version, as they will complain about the
missing hashes. This patch is experimental, and is being pushed into
trunk to obtain test coverage. We may undo it before releasing 1.0.
2008-03-23 15:35:54 -07:00
Brian Warner
7996131a0a upload: stop putting plaintext and ciphertext hashes in shares.
This removes the guess-partial-information attack vector, and reduces
the amount of overhead that we consume with each file. It also introduces
a forwards-compability break: older versions of the code (before the
previous download-time "make hashes optional" patch) will be unable
to read files uploaded by this version, as they will complain about the
missing hashes. This patch is experimental, and is being pushed into
trunk to obtain test coverage. We may undo it before releasing 1.0.
2008-03-23 15:35:54 -07:00
Zooko O'Whielacronx
fc0d637523 docs: update install and usage docs, improve cli "usage" output, make new example directories, add unit test that fails code which prints out sentences that don't end with punctuation marks 2008-02-15 13:11:02 -07:00
Zooko O'Whielacronx
7c6de95bc6 switch from base62 to base32 for storage indices, switch from z-base-32 to rfc 3548 base-32 for everything, separate out base32 encoding from idlib 2008-02-14 19:27:47 -07:00
Zooko O'Whielacronx
3f8df27063 use base62 encoding for storage indexes, on disk and in verifier caps, and in logging and diagnostic tools
base62 encoding fits more information into alphanumeric chars while avoiding the troublesome non-alphanumeric chars of base64 encoding.  In particular, this allows us to work around the ext3 "32,000 entries in a directory" limit while retaining the convenient property that the intermediate directory names are leading prefixes of the storage index file names.
2008-02-12 20:48:37 -07:00
Brian Warner
d0ce8694c1 add upload-status objects, to track upload progress 2008-02-12 15:36:05 -07:00
Brian Warner
a2cace9cfb helper: return full uri-extension data to the client, so it can compare encoding parameters 2008-02-06 17:30:58 -07:00
Brian Warner
124fb5ecdf add upload-results timing info for helper uploads. This changes the Helper protocol, and introduces a compatibility break 2008-02-06 01:52:25 -07:00
Brian Warner
93d45abb02 add upload timings and rates to the POST /uri?t=upload results page 2008-02-06 00:41:51 -07:00
Brian Warner
504bbe4a16 encode.py: update logging levels 2008-01-28 12:15:27 -07:00
Brian Warner
8f1212edac encode.py: don't allow a shareholder which dies in start() to kill the whole upload 2008-01-28 12:14:48 -07:00
Brian Warner
a36ed9b5b7 encode.py: don't record BAD log event unless there is actually a problem 2008-01-28 11:59:10 -07:00
Brian Warner
081d27a65d encode.py: improve error message when segment lengths come out wrong 2008-01-24 21:51:09 -07:00
Brian Warner
46fe024612 offloaded uploader: don't use a huge amount of memory when skipping over previously-uploaded data 2008-01-24 17:25:33 -07:00
Brian Warner
a0945de0e0 encode.py: log the contents of the uri_extension block 2008-01-23 18:08:04 -07:00
Brian Warner
e9307d3fda offloaded: close the local filehandle after encoding is done, otherwise windows fails 2008-01-17 01:52:33 -07:00
Brian Warner
c597e67c2b offloaded: improve logging across the board 2008-01-17 01:11:35 -07:00
Brian Warner
51321944f0 megapatch: overhaul encoding_parameters handling: now it comes from the Uploadable, or the Client. Removed options= too. Also move helper towards resumability. 2008-01-16 03:03:35 -07:00
Brian Warner
7bb9307871 encode: actually define the UploadAborted exception 2008-01-14 21:27:02 -07:00
Brian Warner
a6ca98ac53 upload: add Encoder.abort(), to abandon the upload in progress. Add some debug hooks to enable unit tests. 2008-01-14 21:22:55 -07:00
Brian Warner
60090fb9f2 upload: improve logging 2008-01-14 21:19:20 -07:00
Brian Warner
ea24864544 offloaded: more code, fix pyflakes problems, change IEncryptedUploader a bit 2008-01-09 17:58:47 -07:00
Zooko O'Whielacronx
1ac09840a4 a few documentation and naming convention updates
Notable: the argument to make REPORTER has been renamed to TRIALARGS.
2007-12-12 19:34:08 -07:00
Brian Warner
1f91e856ef encode.py: trivial whitespace change 2007-11-19 23:00:29 -07:00
Brian Warner
33a5f8ba6b more hierarchical logging: download/upload/encode 2007-11-19 19:33:41 -07:00
Brian Warner
6160af5f50 encode.py: update comments, max_segment_size is now 2MiB 2007-10-16 11:00:29 -07:00
Zooko O'Whielacronx
426721f3f2 update a few documents, comments, and defaults to mention 3-of-10 instead of 25-of-100 2007-10-15 19:53:59 -07:00
Brian Warner
8b14ad1673 encode.py: log a percentage complete as well as the raw byte counts 2007-08-09 18:28:45 -07:00
Brian Warner
08ee80176a Encoder.__repr__: mention the file being encoded 2007-08-09 18:26:56 -07:00
Brian Warner
684966d103 encode.py: add a reactor turn barrier between segments, to allow Deferreds to retire and free their arguments, all in the name of reducing memory footprint 2007-08-09 18:26:17 -07:00
Brian Warner
e6e9ddc588 refactor upload/encode, to split encrypt and encode responsibilities 2007-07-23 19:31:53 -07:00
Brian Warner
9af506900b upload: refactor to enable streaming upload. not all tests pass yet 2007-07-19 18:21:44 -07:00
Brian Warner
20c980d02b reduce MAX_SEGMENT_SIZE from 2MB to 1MB, to compensate for the large blocks that 3-of-10 produces 2007-07-16 13:48:34 -07:00
Brian Warner
d206843625 encode.py: minor typo 2007-07-13 17:00:06 -07:00
Brian Warner
7589a8ee82 storage: we must truncate short segments. Now most tests pass (except uri_extension) 2007-07-13 16:38:25 -07:00
Brian Warner
1f8e407d9c more #85 work, system test still fails 2007-07-13 15:09:01 -07:00
Brian Warner
cd8648d39b storage: use one file per share instead of 7 (#85). work-in-progress, tests still fail 2007-07-13 14:04:49 -07:00
Brian Warner
5399395c27 allow the introducer to set default encoding parameters. Closes #84.
By writing something like "25 75 100" into a file named 'encoding_parameters'
in the central Introducer's base directory, all clients which use that
introducer will be advised to use 25-out-of-100 encoding for files (i.e.
100 shares will be produced, 25 are required to reconstruct, and the upload
process will be happy if it can find homes for at least 75 shares). The
default values are "3 7 10". For small meshes, the defaults are probably
good, but for larger ones it may be appropriate to increase the number of
shares.
2007-07-12 15:33:30 -07:00
Brian Warner
dce1dc2730 storage: wrap buckets in a local proxy
This will make it easier to change RIBucketWriter in the future to reduce the wire
protocol to just open/write(offset,data)/close, and do all the structuring on the
client end. The ultimate goal is to store each bucket in a single file, to reduce
the considerable filesystem-quantization/inode overhead on the storage servers.
2007-07-08 23:27:46 -07:00
Brian Warner
382888899b refactor URI_extension handlers out of encode/download and into uri.py 2007-06-11 18:25:18 -07:00
Brian Warner
584dc4ae94 handle uri_extension with a non-bencode serialization scheme 2007-06-08 16:17:54 -07:00
Brian Warner
c9ef291c02 rename thingA to 'uri extension' 2007-06-08 15:59:16 -07:00
Brian Warner
f62a544b93 remove several leftover defintions of netstring() 2007-06-07 22:13:18 -07:00
Brian Warner
c049941529 move almost all hashing to SHA256, consolidate into hashutil.py
The only SHA-1 hash that remains is used in the permutation of nodeids,
where we need to decide if we care about performance or long-term security.
I suspect that we could use a much weaker hash (and faster) hash for
this purpose. In the long run, we'll be doing thousands of such hashes
for each file uploaded or downloaded (one per known peer).
2007-06-07 21:47:21 -07:00
Brian Warner
cabba59fe7 test_encode.py: even more testing of merkle trees, getting fairly comprehensive now 2007-06-07 21:24:39 -07:00
Brian Warner
f3846da4ab encode.py: hush pyflakes warnings 2007-06-07 13:18:55 -07:00
Brian Warner
b2caf7fb9a encode/download: reduce memory footprint by deleting large intermediate buffers as soon as possible, improve hash tree usage 2007-06-07 13:15:58 -07:00
Brian Warner
c81f2b01ff encode.py: fix generation of plaintext/crypttext merkle trees 2007-06-07 13:14:14 -07:00