254 Commits

Author SHA1 Message Date
Zooko O'Whielacronx
b315619d6b download: refactor handling of URI Extension Block and crypttext hash tree, simplify things
Refactor into a class the logic of asking each server in turn until one of them gives an answer 
that validates.  It is called ValidatedThingObtainer.

Refactor the downloading and verification of the URI Extension Block into a class named 
ValidatedExtendedURIProxy.

The new logic of validating UEBs is minimalist: it doesn't require the UEB to contain any 
unnecessary information, but of course it still accepts such information for backwards 
compatibility (so that this new download code is able to download files uploaded with old, and 
for that matter with current, upload code).

The new logic of validating UEBs follows the practice of doing all validation up front.  This 
practice advises one to isolate the validation of incoming data into one place, so that all of 
the rest of the code can assume only valid data.

If any redundant information is present in the UEB+URI, the new code cross-checks and asserts 
that it is all fully consistent.  This closes some issues where the uploader could have 
uploaded inconsistent redundant data, which would probably have caused the old downloader to 
simply reject that download after getting a Python exception, but perhaps could have caused 
greater harm to the old downloader.

I removed the notion of selecting an erasure codec from codec.py based on the string that was 
passed in the UEB.  Currently "crs" is the only such string that works, so 
"_assert(codec_name == 'crs')" is simpler and more explicit.  This is also in keeping with the 
"validate up front" strategy -- now if someone sets a different string than "crs" in their UEB, 
the downloader will reject the download in the "validate this UEB" function instead of in a 
separate "select the codec instance" function.

I removed the code to check plaintext hashes and plaintext Merkle Trees.  Uploaders do not 
produce this information any more (since it potentially exposes confidential information about 
the file), and the unit tests for it were disabled.  The downloader before this patch would 
check that plaintext hash or plaintext Merkle Tree if they were present, but not complain if 
they were absent.  The new downloader in this patch complains if they are present and doesn't 
check them.  (We might in the future re-introduce such hashes over the plaintext, but encrypt 
the hashes which are stored in the UEB to preserve confidentiality.  This would be a double-
check on the correctness of our own source code -- the current Merkle Tree over the ciphertext 
is already sufficient to guarantee the integrity of the download unless there is a bug in our 
Merkle Tree or AES implementation.) 

This patch increases the lines-of-code count by 8 (from 17,770 to 17,778), and reduces the 
uncovered-by-tests lines-of-code count by 24 (from 1408 to 1384).  Those numbers would be more 
meaningful if we omitted src/allmydata/util/ from the test-coverage statistics.
2008-12-05 08:17:54 -07:00
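The "ask each server in turn until one answer validates" and "validate up front" ideas described in the commit above can be illustrated with a minimal synchronous sketch. This is not the project's code: the real downloader is asynchronous, and the names `obtain_validated`, `validate_ueb`, `BadOrMissingThing`, and the dictionary keys used here are assumptions made for illustration only.

```python
class BadOrMissingThing(Exception):
    """Raised when no server returned an answer that validates (hypothetical name)."""


def validate_ueb(ueb):
    """Do all validation in one place so later code can assume valid data.

    The keys checked here are illustrative assumptions, not the real UEB schema.
    """
    if not isinstance(ueb, dict):
        raise ValueError("UEB must be a dict")
    # Only one codec string works, so check it explicitly and up front.
    if ueb.get("codec_name") != "crs":
        raise ValueError("unsupported codec: %r" % (ueb.get("codec_name"),))
    # Plaintext hashes are no longer produced, so complain if they are present.
    for forbidden in ("plaintext_hash", "plaintext_root_hash"):
        if forbidden in ueb:
            raise ValueError("unexpected field: %s" % forbidden)
    return ueb


def obtain_validated(fetchers, validate):
    """Call each fetcher in turn and return the first answer that validates."""
    failures = []
    for fetch in fetchers:
        try:
            return validate(fetch())
        except Exception as e:  # a bad answer from one server is not fatal
            failures.append(e)
    raise BadOrMissingThing(failures)
```

Each element of `fetchers` stands in for "ask one server"; a server that returns garbage is skipped, and the caller only ever sees data that has already passed `validate`.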
Brian Warner
bc53c24003 dirnode manifest: add verifycaps, both to internal API and to webapi. This will give the manual-GC tools more to work with, so they can estimate how much space will be freed. 2008-11-24 14:40:46 -07:00
Brian Warner
bf06492a90 #538: fetch version and attach to the rref. Make IntroducerClient demand v1 support. 2008-11-21 20:07:27 -07:00
Brian Warner
0eb6b324a4 #538: add remote_get_version() to four main Referenceable objects: Introducer Service, Storage Server, Helper, CHK Upload Helper. Remove unused storage-server get_versions(). 2008-11-21 17:43:52 -07:00
Brian Warner
b84c2c6541 manifest: add storage-index strings to the json results 2008-11-19 16:00:27 -07:00
Brian Warner
815e0673e6 manifest: include stats in results. webapi is unchanged. 2008-11-19 15:03:47 -07:00
Brian Warner
dfa2408157 checker: add is_recoverable() to checker results, make our stub immutable-verifier not throw an exception on unrecoverable files, add tests 2008-11-06 22:35:47 -07:00
Brian Warner
b1db6d9ff2 web: add 'Repair' button to checker results when they indicate unhealthiness. Also add the object's uri to the CheckerResults instance. 2008-10-29 18:09:17 -07:00
Brian Warner
37e3d8e47c #527: support HTTP 'Range:' requests, using a cachefile. Adds filenode.read(consumer, offset, size) method. Still needs: cache expiration, reduced alacrity. 2008-10-28 13:41:04 -07:00
Brian Warner
914655c52b interfaces.py: promote immutable.encode.NotEnoughSharesError: it isn't just for immutable files any more 2008-10-27 13:34:49 -07:00
Brian Warner
4b48d94c52 interfaces.IMutableFileNode.download_best_version(): fix return value 2008-10-27 13:20:46 -07:00
Brian Warner
fca158e83a dirnode lookup: use distinct NoSuchChildError instead of the generic KeyError when a child can't be found 2008-10-27 13:15:25 -07:00
Brian Warner
db37c14ab7 storage: add remote_advise_corrupt_share, for clients to tell storage servers about share corruption that they've discovered. The server logs the report. 2008-10-24 11:52:48 -07:00
Brian Warner
c455d52453 deep-check: add webapi links to detailed per-file/dir results 2008-10-23 16:00:31 -07:00
Brian Warner
819d6b3d03 interface.py: fix typo 2008-10-23 15:59:36 -07:00
Brian Warner
977c6ac510 more #514: pass a Monitor to all checker operations, make mutable-checker honor the cancel flag 2008-10-22 01:38:18 -07:00
Brian Warner
ad3d9207a9 Change deep-size/stats/check/manifest to a start+poll model instead of a single long-running synchronous operation. No cancel or handle-expiration yet. #514. 2008-10-21 17:03:07 -07:00
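The start+poll model mentioned in the commit above can be sketched roughly as follows. This is a toy illustration under stated assumptions: `OperationTable`, its method names, and the handle format are hypothetical, and the work is advanced by hand rather than by the node's event loop. Like the commit, it has no cancel or handle expiration.

```python
import itertools


class OperationTable:
    """Toy registry of long-running operations, addressed by handle (hypothetical)."""

    def __init__(self):
        self._ops = {}
        self._next_id = itertools.count(1)

    def start(self, work):
        """Register an iterable of work steps and return a handle immediately."""
        handle = "op-%d" % next(self._next_id)
        self._ops[handle] = {"work": iter(work), "results": [], "finished": False}
        return handle

    def run_one_step(self, handle):
        """Do one unit of work; a real node would schedule this itself."""
        op = self._ops[handle]
        try:
            op["results"].append(next(op["work"]))
        except StopIteration:
            op["finished"] = True

    def poll(self, handle):
        """Return a snapshot of progress without blocking."""
        op = self._ops[handle]
        return {"finished": op["finished"], "results": list(op["results"])}
```

The caller keeps only the handle and asks for progress whenever it likes, instead of holding one request open for the whole traversal.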
Zooko O'Whielacronx
8a6d1e5da6 repairer: test all different kinds of corruption that can happen to share files on disk 2008-10-14 16:09:20 -07:00
Zooko O'Whielacronx
86e22b8add interfaces: loosen a few max-size constraints which would limit us to a mere 1.09 TB maximum file size
These constraints were originally intended to protect against attacks on the
storage server protocol layer which exhaust memory in the peer.  However,
defending against that sort of DoS is hard -- probably it isn't completely
achieved -- and it costs development time to think about it, and it sometimes
imposes limits on legitimate users which we don't necessarily want to impose.
So, for now we forget about limiting the amount of RAM that a foolscap peer can
cause you to start using.
2008-10-09 12:13:57 -07:00
Brian Warner
3ffaded809 web: change t=manifest to return a list of (path,read/writecap) tuples, instead of a list of verifycaps. Add output=html,text,json. 2008-10-06 21:36:18 -07:00
Brian Warner
41bacca3f1 interfaces: fix minor typo 2008-10-02 17:52:49 -07:00
Brian Warner
d0bdf9a611 dirnode: add get_child_and_metadata_at_path 2008-10-02 17:52:03 -07:00
Zooko O'Whielacronx
1e8d37cc2d repairer: add basic test of repairer, move tests of immutable checker/repairer from test_system to test_immutable_checker, remove obsolete test helper code from test_filenode
Hm...  "Checker" ought to be renamed to "CheckerRepairer" or "Repairer" at some point...
2008-09-25 10:16:53 -07:00
Brian Warner
f570ad7ba5 disallow deep-check on non-directories, simplifies the code a bit 2008-09-10 13:44:58 -07:00
Brian Warner
4bb88fd2ee dirnode: refactor recursive-traversal methods, add stats to deep_check() method results and t=deep-check webapi 2008-09-10 01:45:04 -07:00
Brian Warner
1d2d6a35a6 checker results: add output=JSON to webapi, add tests, clean up APIs
to make the internal ones use binary strings (nodeid, storage index) and
the web/JSON ones use base32-encoded strings. The immutable verifier is
still incomplete (it returns imaginary healthy results).
2008-09-09 19:45:17 -07:00
Brian Warner
84a5778507 checker results: more tests, update interface docs 2008-09-09 17:30:10 -07:00
Brian Warner
137750eca6 interfaces.py: minor improvement to IDirectoryNode.set_node 2008-09-09 16:34:16 -07:00
Brian Warner
3408d552cd checker: overhaul checker results, split check/check_and_repair into separate methods, improve web displays 2008-09-07 12:44:56 -07:00
Brian Warner
d43baa2ad7 mutable: add get_size_of_best_version to the interface, to simplify the web HEAD code, and tests 2008-08-12 19:02:52 -07:00
Brian Warner
c80e352951 IFilesystemNode: add get_storage_index(), it makes tests easier 2008-08-12 16:14:07 -07:00
Brian Warner
d106e411af checker: add information to results, add some deep-check tests, fix a bug in which unhealthy files were not counted 2008-08-11 21:03:26 -07:00
Zooko O'Whielacronx
29255568df storage: make storage servers declare oldest supported version == 1.0, and storage clients declare oldest supported version == 1.0
See comments in patch for intended semantics.
2008-07-30 15:51:07 -07:00
Brian Warner
afda2a43e4 storage: remove update_write_enabler method, it won't serve the desired purpose, and I have a better scheme in mind. See #489 for details 2008-07-21 17:28:28 -07:00
Brian Warner
879fefe5f3 first pass at a mutable repairer. not tested at all yet, but of course all existing tests pass 2008-07-17 21:09:23 -07:00
Brian Warner
3e95681bad interfaces: add IRepairable 2008-07-17 17:32:17 -07:00
Brian Warner
67db0a4967 deep-check: add webapi, add 'DEEP-CHECK' button to wui, add tests, rearrange checker API a bit 2008-07-17 16:47:09 -07:00
Brian Warner
9289433ba3 first pass at deep-checker, no webapi yet, probably big problems with it, only minimal tests 2008-07-16 18:20:57 -07:00
Brian Warner
3e9322bcb6 checker: re-enable checker web results (although they just say 'Healthy' right now) 2008-07-16 15:42:56 -07:00
Brian Warner
94e619c1f6 overhaul checker invocation
Removed the Checker service, removed checker results storage (both in-memory
and the tiny stub of sqlite-based storage). Added ICheckable, all
check/verify is now done by calling the check() method on filenodes and
dirnodes (immutable files, literal files, mutable files, and directory
instances).

Checker results are returned in a Results instance, with an html() method for
display. Checker results have been temporarily removed from the wui directory
listing until we make some other fixes.

Also fixed client.create_node_from_uri() to create LiteralFileNodes properly,
since they have different checking behavior. Previously we were creating full
FileNodes with LIT uris inside, which were downloadable but not checkable.
2008-07-15 17:23:25 -07:00
Brian Warner
fd465b4aaf download: fix stopProducing failure ('self._paused_at not defined'), add tests 2008-07-14 15:25:21 -07:00
Brian Warner
60725ed065 storage: add add_lease/update_write_enabler to remote API, revamp lease handling 2008-07-09 18:06:55 -07:00
Brian Warner
45405d85c4 interfaces: add verify= and repair= args to check() 2008-07-07 14:37:36 -07:00
Brian Warner
4e5b9ee63e introducer: move the relevant interfaces out to introducer/interfaces.py 2008-06-18 17:04:41 -07:00
Brian Warner
5289064dcf move FileTooLargeError out to a common location 2008-06-03 00:01:15 -07:00
Brian Warner
87c1e8e066 dirnode: add overwrite= to most API calls, defaulting to True. When False, this raises ExistingChildError rather than overwriting an existing child 2008-05-16 16:09:47 -07:00
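The `overwrite=` behaviour described in the commit above can be shown with a small in-memory stand-in. `ToyDirNode` is a hypothetical class for illustration; the real dirnode API is asynchronous and stores children in a mutable file, which this sketch does not attempt to model.

```python
class ExistingChildError(Exception):
    """Raised when a child already exists and overwrite=False was requested."""


class ToyDirNode:
    """In-memory stand-in for a directory node (illustrative only)."""

    def __init__(self):
        self._children = {}

    def set_uri(self, name, child_uri, overwrite=True):
        # overwrite defaults to True, matching the default named in the commit.
        if not overwrite and name in self._children:
            raise ExistingChildError(name)
        self._children[name] = child_uri

    def get(self, name):
        return self._children[name]
```

With the default, a second `set_uri()` for the same name replaces the child; with `overwrite=False` it raises `ExistingChildError` and leaves the existing child untouched.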
Brian Warner
6c00a70dbc dirnode: add a deep_stats(), like deep-size but with more information. webish adds t=deeps-size too. 2008-05-08 13:21:14 -07:00
Zooko O'Whielacronx
c1184f51e4 docs: fix a few stale comments in code 2008-05-07 08:39:03 -07:00
Brian Warner
a379690b04 mutable: replace MutableFileNode API, update tests. Changed all callers to use overwrite(), but that will change soon 2008-04-17 17:51:38 -07:00
Brian Warner
a1670497a8 mutable WIP: add servermap update status pages 2008-04-16 19:05:41 -07:00
Brian Warner
1334a251ca remove size constraint on ShareData: large directories caused errors which triggered massive memory usage. See #379 for details 2008-04-11 22:51:54 -07:00
robk-tahoe
5578559b85 added offloaded key generation
this adds a new service to pre-generate RSA key pairs.  This allows
the expensive (i.e. slow) key generation to be placed into a process
outside the node, so that the node's reactor will not block when it
needs a key pair, but instead can retrieve them from a pool of already
generated key pairs in the key-generator service.

it adds a tahoe create-key-generator command which initialises an 
empty dir with a tahoe-key-generator.tac file which can then be run
via twistd.  it stashes its .pem and portnum for furl stability and
writes the furl of the key gen service to key_generator.furl, also
printing it to stdout.

by placing a key_generator.furl file into the node's config directory
(e.g. ~/.tahoe) a node will attempt to connect to such a service, and
will use that when creating mutable files (i.e. directories) whenever
possible.  if the keygen service is unavailable, it will perform the
key generation locally instead, as before.
2008-04-01 18:45:13 -07:00
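The pool-plus-fallback behaviour described in the commit above might look roughly like this in miniature. Everything here is a sketch under assumptions: `KeyPool`, `KeyGeneratorUnavailable`, and `get_key_pair` are hypothetical names, the pool lives in the same process rather than behind a remote service, and the "keys" are whatever the supplied `generate` callable returns.

```python
from collections import deque


class KeyGeneratorUnavailable(Exception):
    """Raised when the (hypothetical) key-generator service cannot be used."""


class KeyPool:
    """Holds pre-generated key pairs so callers need not wait for generation."""

    def __init__(self, generate, pool_size=2):
        self._generate = generate
        self._pool_size = pool_size
        self._pool = deque()

    def refill(self):
        # The slow generation happens here, away from the caller's critical path.
        while len(self._pool) < self._pool_size:
            self._pool.append(self._generate())

    def get_key_pair(self):
        if not self._pool:
            raise KeyGeneratorUnavailable("pool is empty")
        return self._pool.popleft()


def get_key_pair(service, generate_locally):
    """Prefer the service; fall back to local generation, as before."""
    if service is not None:
        try:
            return service.get_key_pair()
        except KeyGeneratorUnavailable:
            pass
    return generate_locally()
```

Passing `service=None` models a node with no key_generator.furl configured, in which case key generation is always local.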
Zooko O'Whielacronx
fc3bd0c987 use added secret to protect convergent encryption
Now upload or encode methods take a required argument named "convergence" which can be either None, indicating no convergent encryption at all, or a string, which is the "added secret" to be mixed in to the content hash key.  If you want traditional convergent encryption behavior, set the added secret to be the empty string.

This patch also renames "content hash key" to "convergent encryption" in argument names and variable names.  (A different and larger renaming is needed in order to clarify that Tahoe supports immutable files which are not encrypted content-hash-key a.k.a. convergent encryption.)

This patch also changes a few unit tests to use non-convergent encryption, because it doesn't matter for what they are testing and non-convergent encryption is slightly faster.
2008-03-24 09:46:06 -07:00
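The three cases for the "convergence" argument described in the commit above (None, a non-empty added secret, or the empty string) can be sketched as follows. This is an illustration only: the hash construction, the key length, and the function name `derive_key` are assumptions and are not the project's actual key-derivation scheme. The secret is given as bytes here because the sketch targets Python 3.

```python
import hashlib
import os

KEY_SIZE = 16  # assumed key length in bytes, for illustration


def derive_key(plaintext, convergence):
    """Return an encryption key for plaintext.

    convergence=None   -> random key, no convergent encryption at all
    convergence=b""    -> traditional convergent encryption
    convergence=secret -> convergent only among holders of the same secret
    """
    if convergence is None:
        return os.urandom(KEY_SIZE)
    h = hashlib.sha256()
    # Length-prefix the secret so (secret, data) pairs cannot be confused.
    h.update(len(convergence).to_bytes(8, "big"))
    h.update(convergence)
    h.update(plaintext)
    return h.digest()[:KEY_SIZE]
```

Two uploaders who share the same added secret derive the same key for the same file, while someone without the secret cannot confirm a guess about the file's contents by hashing it.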
Brian Warner
2ef70ab814 mutable.py: split replace() into update() and overwrite(). Addresses #328. 2008-03-12 18:00:43 -07:00
Brian Warner
c21d30c320 client: publish a 'stub client' announcement to the introducer, to provide version/nickname information for each client 2008-03-11 19:20:10 -07:00
Brian Warner
10d3ea5045 increase remote-interface size limits to 16EiB by not casually using 'int' as a constraint 2008-03-11 10:50:31 -07:00
Brian Warner
ca1a1762e2 web: status: add 'started' timestamps to all operations 2008-03-04 18:50:44 -07:00
Brian Warner
68fbd89e66 webish: add primitive publish/retrieve status pages 2008-03-04 01:07:44 -07:00
Brian Warner
18eb00d136 webish: download-results: add per-server response times 2008-03-03 20:53:45 -07:00
Brian Warner
886ef22335 webish: download-results: add server_problems 2008-03-03 20:30:35 -07:00
Brian Warner
def910c391 webish download results: add servermap, decrypt time 2008-03-03 20:09:32 -07:00
Brian Warner
2b49605c51 webish: add 'download results', with some basic timing information 2008-03-03 19:19:21 -07:00
Brian Warner
c8e24f0904 webish: make upload timings visible on the recent uploads/downloads status page 2008-03-03 14:48:52 -07:00
Brian Warner
1a7651ce82 retain 10 most recent upload/download status objects, show them in /status . Prep for showing individual status objects 2008-02-29 22:19:03 -07:00
Zooko O'Whielacronx
99f006c584 wapi: add POST /uri/$DIRECTORY?t=set_children
Unfinished bits: doc in webapi.txt, test handling of badly formed JSON, return reasonable HTTP response, examination of the effect of this patch on code coverage -- but I'm committing it anyway because MikeB can use it and I'm being called to dinner...
2008-02-29 18:40:27 -07:00
Brian Warner
301dd3d489 webish status: distinguish active uploads/downloads from recent ones 2008-02-26 15:35:28 -07:00
Brian Warner
7927495cbe unicode handling: declare dirnodes to contain unicode child names, update webish to match 2008-02-14 15:45:56 -07:00
Brian Warner
e6af3b845c make current upload/download status objects available from the client 2008-02-12 15:39:45 -07:00
Brian Warner
94097affc3 add download-status objects, to track download progress 2008-02-12 15:38:39 -07:00
Brian Warner
d0ce8694c1 add upload-status objects, to track upload progress 2008-02-12 15:36:05 -07:00
Brian Warner
622c477e31 dirnode: add ctime/mtime to metadata, update metadata-modifying APIs. Needs more testing and sanity checking. 2008-02-08 18:43:47 -07:00
Brian Warner
81c5ceae16 upload: rework passing of default encoding parameters: move more responsibility into BaseUploadable 2008-02-06 18:39:03 -07:00
Brian Warner
6cd32c2f5c interfaces: remove spurious line that counted against the figleaf coverage 2008-02-06 16:41:26 -07:00
Brian Warner
124fb5ecdf add upload-results timing info for helper uploads. This changes the Helper protocol, and introduces a compatibility break 2008-02-06 01:52:25 -07:00
Brian Warner
66f33ee504 upload: return an UploadResults instance (with .uri) instead of just a URI 2008-02-05 21:01:38 -07:00
Brian Warner
d146ef7e09 webish: add extra introducer data (version, timestamps) to Welcome page 2008-02-05 17:32:27 -07:00
Brian Warner
daecca6589 big introducer refactoring: separate publish+subscribe. Addresses #271. 2008-02-05 13:05:13 -07:00
Brian Warner
a01f9ce9cc introducer: allow nodes to refrain from publishing themselves, by passing furl=None. This would be useful for clients who do not run storage servers. 2008-02-01 19:48:38 -07:00
robk-tahoe
e5487bbe21 stats: added IStatsProducer interface, fixed stats provider startup
this adds an interface, IStatsProducer, defining the get_stats() method
which the stats provider calls upon any registered producer, and makes the
register_producer() method check that interface is implemented.

also refine the startup logic, so that the stats provider doesn't try and
connect out to the stats gatherer until after the node declares the tub
'ready'.  this is to address an issue whereby providers would attach to
the gatherer without providing a valid furl, and hence the gatherer would
be unable to determine the tubid of the connected client, leading to lost
samples.
2008-01-31 21:10:15 -07:00
robk-tahoe
7b9f3207d0 stats: add a simple stats gathering system
We have a desire to collect runtime statistics from multiple nodes primarily
for server monitoring purposes.   This patch provides a simple implementation of
such a system, as a skeleton to build more sophistication upon.

Each client now looks for a 'stats_gatherer.furl' config file.  If it has
been configured to use a stats gatherer, then it instantiates internally
a StatsProvider.  This is a central place for code which wishes to offer
stats up for monitoring to report them to, either by calling 
stats_provider.count('stat.name', value) to increment a counter, or by
registering a class as a stats producer with sp.register_producer(obj).

The StatsProvider connects to the StatsGatherer server and provides its
provider upon startup.  The StatsGatherer is then responsible for polling
the attached providers periodically to retrieve the data provided.
The provider queries each registered producer when the gatherer queries
the provider.  Both the internal 'counters' and the queried 'stats' are
then reported to the gatherer.

This provides a simple gatherer app, (c.f. make stats-gatherer-run)
which prints its furl and listens for incoming connections.  Once a
minute, the gatherer polls all connected providers, and writes the
retrieved data into a pickle file.

Also included is a munin plugin which knows how to read the gatherer's
stats.pickle and output data munin can interpret.  This plugin, 
tahoe-stats.py, can be symlinked as multiple different names within
tahoe-stats.py can be symlinked as multiple different names within
munin's 'plugins' directory, and inspects argv to determine which
data to display, doing a lookup in a table within that file.
It looks in the environment for 'statsfile' to determine the path to
the gatherer's stats.pickle.  An example plugins-conf.d file is
provided.
2008-01-30 20:11:07 -07:00
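The provider side of the stats system described in the two commits above can be sketched in a few lines. This is a simplified illustration: `count()`, `register_producer()`, and `get_stats()` are named in the commit messages, but the `get()` method, the duck-typed interface check, and the shape of the returned dictionary are assumptions, and the connection to a remote gatherer is omitted entirely.

```python
class StatsProvider:
    """Collects counters and queries registered producers on demand."""

    def __init__(self):
        self.counters = {}
        self.producers = []

    def count(self, name, delta=1):
        # Increment a named counter, creating it on first use.
        self.counters[name] = self.counters.get(name, 0) + delta

    def register_producer(self, producer):
        # Stand-in for an interface check: require a callable get_stats().
        if not callable(getattr(producer, "get_stats", None)):
            raise TypeError("producer must implement get_stats()")
        self.producers.append(producer)

    def get(self):
        """What a gatherer would receive when it polls this provider."""
        stats = {}
        for producer in self.producers:
            stats.update(producer.get_stats())
        return {"counters": dict(self.counters), "stats": stats}


class DiskProducer:
    """Example producer; the stat name and value are made up."""

    def get_stats(self):
        return {"storage.disk_free": 12345}
```

Producers are queried only when the provider itself is polled, so a producer pays no cost between polls.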
Brian Warner
22071c00e0 upload: oops, fix breakage after removing upload_file/upload_data/etc 2008-01-30 19:41:43 -07:00
Brian Warner
492cb92dc8 speedcheck: track SSK creation time separately 2008-01-29 20:44:32 -07:00
Brian Warner
46fe024612 offloaded uploader: don't use a huge amount of memory when skipping over previously-uploaded data 2008-01-24 17:25:33 -07:00
Brian Warner
e9307d3fda offloaded: close the local filehandle after encoding is done, otherwise windows fails 2008-01-17 01:52:33 -07:00
Brian Warner
51321944f0 megapatch: overhaul encoding_parameters handling: now it comes from the Uploadable, or the Client. Removed options= too. Also move helper towards resumability. 2008-01-16 03:03:35 -07:00
Brian Warner
a6ca98ac53 upload: add Encoder.abort(), to abandon the upload in progress. Add some debug hooks to enable unit tests. 2008-01-14 21:22:55 -07:00
Brian Warner
7ac2b94aba remove wait_for_numpeers and the when_enough_peers call in mutable.Publish 2008-01-14 14:55:59 -07:00
Brian Warner
964edadf44 offloaded: add a system test, make it pass. files are now being uploaded through the helper. 2008-01-11 05:42:55 -07:00
Brian Warner
6ac01fde4c offloaded: more test coverage on client side, change interfaces a bit 2008-01-11 04:53:37 -07:00
Brian Warner
e825406fc2 offloaded: move interfaces to interfaces.py, start implementing backend 2008-01-09 21:25:47 -07:00
Brian Warner
ea24864544 offloaded: more code, fix pyflakes problems, change IEncryptedUploader a bit 2008-01-09 17:58:47 -07:00
Brian Warner
9a8f68c41f dirnode: add set_uris() and set_nodes() (plural), to set multiple children at once. Use it to set up a new webapi test for issue #237. 2007-12-18 23:30:02 -07:00
Zooko O'Whielacronx
a5a54ac5ca remove the DirnodeURI foolscap schema and mv those regexes into uri.py
We currently do not pass dirnode uris over foolscap.
2007-12-18 17:44:24 -07:00
Zooko O'Whielacronx
9848d2043d make more precise regexp for WriteableSSKFileURI and DirnodeURI and use it in unit tests
Also allow an optional leading "http://127.0.0.1:8123/uri/".
Also fix a few unit tests to generate bogus Dirnode URIs of the modern form instead of the former form.
2007-12-18 13:15:08 -07:00
Brian Warner
f6b2072af1 check-speed: test SSK upload/download speed too. SDMF imposes a limit on the file sizes, no 10MB or 100MB test 2007-12-14 02:05:31 -07:00
Brian Warner
0dc84963f1 the wait_for_numpeers= argument to client.upload() is optional: make both the code and the Interface reflect this 2007-12-06 18:36:58 -07:00
Brian Warner
f190382d5e refactor web tests, and interfaces.IFileNode 2007-12-04 23:01:37 -07:00
Brian Warner
0f5ef5184d test_dirnode.py: obtain full coverage of dirnode.py 2007-12-04 14:32:04 -07:00
Zooko O'Whielacronx
59d6c3c822 decentralized directories: integration and testing
 * use new decentralized directories everywhere instead of old centralized directories
 * provide UI to them through the web server
 * provide UI to them through the CLI
 * update unit tests to simulate decentralized mutable directories in order to test other components that rely on them
 * remove the notion of a "vdrive server" and a client thereof
 * remove the notion of a "public vdrive", which was a directory that was centrally published/subscribed automatically by the tahoe node (you can accomplish this manually by making a directory and posting the URL to it on your web site, for example)
 * add a notion of "wait_for_numpeers" when you need to publish data to peers, which is how many peers should be attached before you start.  The default is 1.
 * add __repr__ for filesystem nodes (note: these reprs contain a few bits of the secret key!)
 * fix a few bugs where we used to equate "mutable" with "not read-only".  Nowadays all directories are mutable, but some might be read-only (to you).
 * fix a few bugs where code wasn't aware of the new general-purpose metadata dict that comes with each filesystem edge
 * sundry fixes to unit tests to adjust to the new directories, e.g. don't assume that every share on disk belongs to a chk file.
2007-12-03 14:52:42 -07:00
Zooko O'Whielacronx
ae727a550a IMutableFileNode is a subtype of IFileNode
I'm not 100% sure that this is correct, but it looks reasonable, it passes unit
tests (although note that unit tests are currently not covering the new mutable
files very well), and it makes the "view JSON" link on a directory work instead
of raising an assertion error.
2007-11-10 16:37:18 -07:00