Commit Graph

126 Commits

Author SHA1 Message Date
Brian Warner
6e57576f2e dirnode deep_traverse: insert a turn break (fireEventually) at least once every 100 files, otherwise a CHK followed by more than 158 LITs can overflow the stack, sort of like #237. 2009-03-13 16:31:35 -07:00
Brian Warner
1a741fdb03 dirnode.py: when doing deep-traverse, walk each directory in alphabetical order, to make things like 'manifest' more predictable 2009-03-12 23:50:46 -07:00
Brian Warner
bce4a5385b add --add-lease to 'tahoe check', 'tahoe deep-check', and webapi. 2009-02-17 19:32:43 -07:00
Brian Warner
72adeccf2d dirnode: add get_repair_cap() 2009-01-22 21:44:49 -07:00
Brian Warner
94ab90273d dirnode.deep_traverse: fix docstring to describe the correct return value 2009-01-22 21:39:50 -07:00
Brian Warner
39a089dc7e dirnode deep-traversal: remove use of Limiter, stick with strict depth-first-traversal, to reduce memory usage during very large (300k+ dirnode) traversals 2009-01-08 19:41:16 -07:00
Zooko O'Whielacronx
5e6f90a015 rename "checker results" to "check results", because it is more parallel to "check-and-repair results" 2009-01-06 13:37:03 -07:00
Zooko O'Whielacronx
f1fbd4feae dirnode: don't check MAC on entries in dirnodes
In an ancient version of directories, we needed a MAC on each entry.  In modern times, the entire dirnode comes with a digital signature, so the MAC on each entry is redundant.
With this patch, we no longer check those MACs when reading directories, but we still produce them so that older readers will accept directories that we write.
2008-12-21 17:35:18 -07:00
Zooko O'Whielacronx
471e1f1b9b try to tidy up uri-as-string vs. uri-as-object
I get confused about whether a given argument or return value is a uri-as-string or uri-as-object.  This patch adds a lot of assertions that it is one or the other, and also changes CheckerResults to take objects not strings.
In the future, I hope that we generally use Python objects except when importing into or exporting from the Python interpreter e.g. over the wire, the UI, or a stored file.
2008-12-19 08:39:24 -07:00
Zooko O'Whielacronx
c456ff8591 rename "get_verifier()" to "get_verify_cap()" 2008-12-08 12:44:11 -07:00
Zooko O'Whielacronx
b58875fe43 mutable: rename mutable/node.py to mutable/filenode.py and mutable/repair.py to mutable/repairer.py
To be more consistent with the immutable layout that I am working on.
2008-12-07 08:20:08 -07:00
Brian Warner
7a0afb59a4 dirnode.py: dirnode.delete which hits UCWE should not fail with NoSuchChildError. Fixes #550. 2008-12-05 22:08:37 -07:00
Brian Warner
fb9af2c7a0 MutableFileNode.modify: pass first_time= and servermap= to the modifier callback 2008-12-05 22:07:10 -07:00
Zooko O'Whielacronx
b315619d6b download: refactor handling of URI Extension Block and crypttext hash tree, simplify things
Refactor into a class the logic of asking each server in turn until one of them gives an answer 
that validates.  It is called ValidatedThingObtainer.

Refactor the downloading and verification of the URI Extension Block into a class named 
ValidatedExtendedURIProxy.

The new logic of validating UEBs is minimalist: it doesn't require the UEB to contain any 
unncessary information, but of course it still accepts such information for backwards 
compatibility (so that this new download code is able to download files uploaded with old, and 
for that matter with current, upload code).

The new logic of validating UEBs follows the practice of doing all validation up front.  This 
practice advises one to isolate the validation of incoming data into one place, so that all of 
the rest of the code can assume only valid data.

If any redundant information is present in the UEB+URI, the new code cross-checks and asserts 
that it is all fully consistent.  This closes some issues where the uploader could have 
uploaded inconsistent redundant data, which would probably have caused the old downloader to 
simply reject that download after getting a Python exception, but perhaps could have caused 
greater harm to the old downloader.

I removed the notion of selecting an erasure codec from codec.py based on the string that was 
passed in the UEB.  Currently "crs" is the only such string that works, so 
"_assert(codec_name == 'crs')" is simpler and more explicit.  This is also in keeping with the 
"validate up front" strategy -- now if someone sets a different string than "crs" in their UEB, 
the downloader will reject the download in the "validate this UEB" function instead of in a 
separate "select the codec instance" function.

I removed the code to check plaintext hashes and plaintext Merkle Trees.  Uploaders do not 
produce this information any more (since it potentially exposes confidential information about 
the file), and the unit tests for it were disabled.  The downloader before this patch would 
check that plaintext hash or plaintext merkle tree if they were present, but not complain if 
they were absent.  The new downloader in this patch complains if they are present and doesn't 
check them.  (We might in the future re-introduce such hashes over the plaintext, but encrypt 
the hashes which are stored in the UEB to preserve confidentiality.  This would be a double-
check on the correctness of our own source code -- the current Merkle Tree over the ciphertext 
is already sufficient to guarantee the integrity of the download unless there is a bug in our 
Merkle Tree or AES implementation.) 

This patch increases the lines-of-code count by 8 (from 17,770 to 17,778), and reduces the 
uncovered-by-tests lines-of-code count by 24 (from 1408 to 1384).  Those numbers would be more 
meaningful if we omitted src/allmydata/util/ from the test-coverage statistics.
2008-12-05 08:17:54 -07:00
Brian Warner
bc53c24003 dirnode manifest: add verifycaps, both to internal API and to webapi. This will give the manual-GC tools more to work with, so they can estimate how much space will be freed. 2008-11-24 14:40:46 -07:00
Brian Warner
b84c2c6541 manifest: add storage-index strings to the json results 2008-11-19 16:00:27 -07:00
Brian Warner
815e0673e6 manifest: include stats in results. webapi is unchanged. 2008-11-19 15:03:47 -07:00
Brian Warner
d6a67cd566 dirnode manifest/stats: process more than one LIT file per tree; we were accidentally ignoring all but the first 2008-11-14 22:50:49 -07:00
Brian Warner
fca158e83a dirnode lookup: use distinct NoSuchChildError instead of the generic KeyError when a child can't be found 2008-10-27 13:15:25 -07:00
Brian Warner
977c6ac510 more #514: pass a Monitor to all checker operations, make mutable-checker honor the cancel flag 2008-10-22 01:38:18 -07:00
Brian Warner
8178b10ef1 dirnode.py: check for cancel during deep-traverse operations, and don't initiate any new ones if we've been cancelled. Gets us closer to #514. 2008-10-22 00:55:52 -07:00
Brian Warner
ad3d9207a9 Change deep-size/stats/check/manifest to a start+poll model instead of a single long-running synchronous operation. No cancel or handle-expiration yet. #514. 2008-10-21 17:03:07 -07:00
Brian Warner
9d4749d546 dirnode.build_manifest: include node.list in the limiter, that's the most important thing to slow down 2008-10-07 13:19:29 -07:00
Brian Warner
3ffaded809 web: change t=manifest to return a list of (path,read/writecap) tuples, instead of a list of verifycaps. Add output=html,text,json. 2008-10-06 21:36:18 -07:00
Brian Warner
09341a969a dirnode: fix my remarkably-consistent 'metdadata' typo 2008-10-02 18:08:45 -07:00
Brian Warner
d0bdf9a611 dirnode: add get_child_and_metadata_at_path 2008-10-02 17:52:03 -07:00
Brian Warner
e8cf581e3f move netstring() and split_netstring() into a separate util.netstring module 2008-09-25 21:38:24 -07:00
Brian Warner
f570ad7ba5 disallow deep-check on non-directories, simplifies the code a bit 2008-09-10 13:44:58 -07:00
Brian Warner
4bb88fd2ee dirnode: refactor recursive-traversal methods, add stats to deep_check() method results and t=deep-check webapi 2008-09-10 01:45:04 -07:00
Brian Warner
f6eeb3161f dirnode: cleanup, make get_verifier() always return a URI instance, not a string 2008-09-10 01:37:55 -07:00
Brian Warner
3408d552cd checker: overhaul checker results, split check/check_and_repair into separate methods, improve web displays 2008-09-07 12:44:56 -07:00
Brian Warner
c80e352951 IFilesystemNode: add get_storage_index(), it makes tests easier 2008-08-12 16:14:07 -07:00
Brian Warner
1306b11a76 dirnode: add some deep-check logging 2008-08-11 21:23:38 -07:00
Brian Warner
67db0a4967 deep-check: add webapi, add 'DEEP-CHECK' button to wui, add tests, rearrange checker API a bit 2008-07-17 16:47:09 -07:00
Brian Warner
69156aeb28 dirnode deep-check: add tests of cycles, fix failures 2008-07-17 14:37:04 -07:00
Brian Warner
acf3180fac dirnode deep-check: rearrange traversal approach, simplify code a bit 2008-07-17 14:25:04 -07:00
Brian Warner
9289433ba3 first pass at deep-checker, no webapi yet, probably big problems with it, only minimal tests 2008-07-16 18:20:57 -07:00
Brian Warner
94e619c1f6 overhaul checker invocation
Removed the Checker service, removed checker results storage (both in-memory
and the tiny stub of sqlite-based storage). Added ICheckable, all
check/verify is now done by calling the check() method on filenodes and
dirnodes (immutable files, literal files, mutable files, and directory
instances).

Checker results are returned in a Results instance, with an html() method for
display. Checker results have been temporarily removed from the wui directory
listing until we make some other fixes.

Also fixed client.create_node_from_uri() to create LiteralFileNodes properly,
since they have different checking behavior. Previously we were creating full
FileNodes with LIT uris inside, which were downloadable but not checkable.
2008-07-15 17:23:25 -07:00
Brian Warner
87c1e8e066 dirnode: add overwrite= to most API calls, defaulting to True. When False, this raises ExistingChildError rather than overwriting an existing child 2008-05-16 16:09:47 -07:00
Brian Warner
fabdc28c06 deep-stats: add file-size histogram 2008-05-08 16:19:42 -07:00
Brian Warner
e6ae7a2c60 dirnode: refactor deep-stats a bit 2008-05-08 13:33:07 -07:00
Brian Warner
6c00a70dbc dirnode: add a deep_stats(), like deep-size but with more information. webish adds t=deeps-size too. 2008-05-08 13:21:14 -07:00
Brian Warner
3cb361e233 dirnode: use the concurrency limiter in t=manifest and t=deep-size, allow 10 retrievals in parallel 2008-05-07 18:36:37 -07:00
Brian Warner
d7b82f73c5 dirnode: return to 'delete fails if the child wasn't actually there' semantics, to make tests pass. There's a switch to enable/disable this 2008-04-17 20:06:06 -07:00
Brian Warner
1bff08a7df dirnode: update to use MutableFileNode.modify 2008-04-17 19:57:04 -07:00
Brian Warner
a379690b04 mutable: replace MutableFileNode API, update tests. Changed all callers to use overwrite(), but that will change soon 2008-04-17 17:51:38 -07:00
Brian Warner
d4230d1781 mutable WIP: split mutable.py into separate files. All tests pass. 2008-04-11 14:31:16 -07:00
robk-tahoe
5578559b85 added offloaded key generation
this adds a new service to pre-generate RSA key pairs.  This allows
the expensive (i.e. slow) key generation to be placed into a process
outside the node, so that the node's reactor will not block when it
needs a key pair, but instead can retrieve them from a pool of already
generated key pairs in the key-generator service.

it adds a tahoe create-key-generator command which initialises an 
empty dir with a tahoe-key-generator.tac file which can then be run
via twistd.  it stashes its .pem and portnum for furl stability and
writes the furl of the key gen service to key_generator.furl, also
printing it to stdout.

by placing a key_generator.furl file into the nodes config directory
(e.g. ~/.tahoe) a node will attempt to connect to such a service, and
will use that when creating mutable files (i.e. directories) whenever
possible.  if the keygen service is unavailable, it will perform the
key generation locally instead, as before.
2008-04-01 18:45:13 -07:00
Brian Warner
2ef70ab814 mutable.py: split replace() into update() and overwrite(). Addresses #328. 2008-03-12 18:00:43 -07:00
Zooko O'Whielacronx
99f006c584 wapi: add POST /uri/$DIRECTORY?t=set_children
Unfinished bits: doc in webapi.txt, test handling of badly formed JSON, return reasonable HTTP response, examination of the effect of this patch on code coverage -- but I'm committing it anyway because MikeB can use it and I'm being called to dinner...
2008-02-29 18:40:27 -07:00
Brian Warner
7927495cbe unicode handling: declare dirnodes to contain unicode child names, update webish to match 2008-02-14 15:45:56 -07:00
Brian Warner
e6ddd9f3da dirnode.py: add metadata= to add_file(), add tests 2008-02-11 14:53:28 -07:00
Brian Warner
622c477e31 dirnode: add ctime/mtime to metadata, update metadata-modifying APIs. Needs more testing and sanity checking. 2008-02-08 18:43:47 -07:00
Brian Warner
66f33ee504 upload: return an UploadResults instance (with .uri) instead of just a URI 2008-02-05 21:01:38 -07:00
Brian Warner
7ac2b94aba remove wait_for_numpeers and the when_enough_peers call in mutable.Publish 2008-01-14 14:55:59 -07:00
Brian Warner
9a8f68c41f dirnode: add set_uris() and set_nodes() (plural), to set multiple children at once. Use it to set up a new webapi test for issue #237. 2007-12-18 23:30:02 -07:00
Brian Warner
0f5ef5184d test_dirnode.py: obtain full coverage of dirnode.py 2007-12-04 14:32:04 -07:00
Brian Warner
cca166a4f5 rename dirnode2.py to dirnode.py 2007-12-04 11:45:20 -07:00
Zooko O'Whielacronx
59d6c3c822 decentralized directories: integration and testing
* use new decentralized directories everywhere instead of old centralized directories
 * provide UI to them through the web server
 * provide UI to them through the CLI
 * update unit tests to simulate decentralized mutable directories in order to test other components that rely on them
 * remove the notion of a "vdrive server" and a client thereof
 * remove the notion of a "public vdrive", which was a directory that was centrally published/subscribed automatically by the tahoe node (you can accomplish this manually by making a directory and posting the URL to it on your web site, for example)
 * add a notion of "wait_for_numpeers" when you need to publish data to peers, which is how many peers should be attached before you start.  The default is 1.
 * add __repr__ for filesystem nodes (note: these reprs contain a few bits of the secret key!)
 * fix a few bugs where we used to equate "mutable" with "not read-only".  Nowadays all directories are mutable, but some might be read-only (to you).
 * fix a few bugs where code wasn't aware of the new general-purpose metadata dict the comes with each filesystem edge
 * sundry fixes to unit tests to adjust to the new directories, e.g. don't assume that every share on disk belongs to a chk file.
2007-12-03 14:52:42 -07:00
Brian Warner
1f22768dc7 webish: add preliminary mutable file support: upload, download, listings, JSON, URI, RO-URI. No replace yet. 2007-11-09 03:54:27 -07:00
Brian Warner
63233ecf37 consolidate dirnode/filenode-creation code into Client 2007-11-09 02:54:51 -07:00
Brian Warner
fb3eddafdb move NotMutableError from dirnode.py into interfaces.py 2007-11-01 15:03:07 -07:00
Brian Warner
b24c2925e8 checkpointing mutable-file work. Storage layer is 80% in place. 2007-10-30 19:47:36 -07:00
Brian Warner
046bda2b47 webish: add checker results and a 'Check' button to the web interface 2007-10-23 17:23:57 -07:00
Brian Warner
9da1d70676 add a simple checker, for both files and directories 2007-10-15 16:16:39 -07:00
Brian Warner
fe06b3be8b dirnode: change the defined behavior of RIVirtualDriveServer.set to allow replace-in-place without raising an exception 2007-08-16 17:03:19 -07:00
Brian Warner
42dcc3088e IDirectoryNode: add has_child() method 2007-08-15 13:22:01 -07:00
Brian Warner
2bc3c163b6 uri.py: get keys and index from the URI instance 2007-07-21 17:45:00 -07:00
Brian Warner
1d9a58977f uri: implement URI-processing classes, IFileURI/IDirnodeURI, use internally 2007-07-21 15:40:36 -07:00
Brian Warner
32fcf0b405 dirnode.build_manifest(): tolerate cycles in the directory graph 2007-07-21 15:40:13 -07:00
Brian Warner
ba7e14a870 fix several methods to handle LIT URIs correctly, rather than assuming that all filenodes are CHK URIs 2007-07-12 16:17:49 -07:00
Brian Warner
0cd730a7b3 web: more test work, now all tests either pass or are skipped (POST, XMLRPC, and URI/) 2007-07-07 10:34:05 -07:00
Brian Warner
71c04fc2e7 web: use KeyError (rather than IndexError) to signal a missing child 2007-07-06 19:40:08 -07:00
Brian Warner
9e42dda6a4 add IDirectoryNode.get_child_at_path 2007-07-06 19:38:37 -07:00
Brian Warner
18ab5ce837 dirnode: add build_manifest() and introduce 'refresh capabilities' 2007-06-26 19:41:20 -07:00
Brian Warner
b11fa20191 merge vdrive.py and filetable.py into a single dirnode.py 2007-06-26 17:16:58 -07:00