Commit Graph

94 Commits

Author SHA1 Message Date
kevan
18a80d99b1 Alter Adder + Adder tests to look for 'only-files' instead of 'only_files' 2009-07-19 20:43:18 -07:00
kevan
c476c66b0e Add 'only_files' option to the overwrite field in Adder 2009-07-17 20:00:10 -07:00
Brian Warner
531cc7899f rename NewDirectoryNode to DirectoryNode, NewDirectoryURI to DirectoryURI 2009-07-17 17:15:49 -05:00
Zooko O'Whielacronx
2f704ed001 dirnode: finish renaming "iv" to "salt" in the code and the hash tag 2009-07-12 17:13:20 -07:00
Brian Warner
7f1d8b7c46 dirnode.py/_encrypt_rwcap: rename IV to "salt", which is more accurate 2009-07-13 00:50:25 +01:00
Brian Warner
c1d5717cf0 dirnode.py: security bug: also use child writecap to derive child enc key,
not just the dirnode writecap. The previous code (which only hashed the
dirnode writecap) would use the same key for all children, which is very bad.
This is the correct implementation of #750.
2009-07-13 00:47:50 +01:00
kevan
d71adaf1ca Use CachingDict instead of dict in dirnode.py 2009-07-03 20:43:01 -07:00
Zooko O'Whielacronx
34213cd2c7 directories: fix semantic conflict between my "keep track of position" optimization patch and Kevan's "cache serialized entries" optimization patch 2009-07-09 20:20:28 -07:00
Zooko O'Whielacronx
0e2d005146 trivial: removed unused import noticed by pyflakes 2009-07-09 06:05:13 -07:00
Zooko O'Whielacronx
786ed012b3 directories: make the IV for the writecaps in directory entries be computed from the secure hash of the writecap itself
This makes encoding of directory entries deterministic, and it is also a tad faster on Macbook Pro than getting a random IV with os.urandom(16).
2009-07-04 19:48:15 -07:00
kevan
903005a528 Add CachingDict dict subclass to dirnode.py 2009-07-05 14:23:45 -07:00
Zooko O'Whielacronx
efafcfb91a directories: keep track of your position as you decode netstring after netstring from an input buffer instead of copying the trailing part
This makes decoding linear in the number of netstrings instead of O(N^2).
2009-07-04 19:51:09 -07:00
Brian Warner
ef1b6ae8e3 Tolerate unknown URI types in directory structures. Part of #683.
The idea is that future versions of Tahoe will add new URI types that this
version won't recognize, but might store them in directories that we *can*
read. We should handle these "objects from the future" as best we can.
Previous releases of Tahoe would just explode. With this change, we'll
continue to be able to work with everything else in the directory.

The code change is to wrap anything we don't recognize as an UnknownNode
instance (as opposed to a FileNode or DirectoryNode). Then webapi knows how
to render these (mostly by leaving fields blank), deep-check knows to skip
over them, deep-stats counts them in "count-unknown". You can rename and
delete these things, but you can't add new ones (because we wouldn't know how
to generate a readcap to put into the dirnode's rocap slot, and because this
lets us catch typos better).
2009-07-02 18:07:49 -07:00
Brian Warner
6237aeabd7 create_node_from_uri: take both writecap+readcap, move logic out of dirnode.py 2009-07-02 15:25:37 -07:00
Brian Warner
656277ac98 dirnode.py: prepare to preserve both rwcap+rocap when copying
This will make it easier to tolerate unknown nodes safely.
2009-07-02 14:12:54 -07:00
Brian Warner
52fa421430 use 522-bit RSA keys in all unit tests (except one)
This reduces the total test time on my laptop from 400s to 283s.
* src/allmydata/test/test_system.py (SystemTest.test_mutable._test_debug):
  Remove assertion about container_size/data_size, this changes with keysize
  and was too variable anyways.
* src/allmydata/mutable/filenode.py (MutableFileNode.create): add keysize=
* src/allmydata/dirnode.py (NewDirectoryNode.create): same
* src/allmydata/client.py (Client.DEFAULT_MUTABLE_KEYSIZE): add default,
  this overrides the one in MutableFileNode
2009-06-29 15:31:24 -07:00
Brian Warner
c9803d5217 switch all foolscap imports to use foolscap.api or foolscap.logging 2009-05-21 17:38:23 -07:00
Zooko O'Whielacronx
9729753692 dirnode: add 'tahoe'/'linkcrtime' and 'tahoe'/'linkmotime' to take the place of what 'mtime'/'ctime' originally did, and make the 'tahoe' subdict be unwritable through the set_children API
Also add extensive documentation in docs/frontends/webapi.txt about the behaviors of these values.  See ticket #628.
2009-04-11 15:52:05 -07:00
Brian Warner
6e57576f2e dirnode deep_traverse: insert a turn break (fireEventually) at least once every 100 files, otherwise a CHK followed by more than 158 LITs can overflow the stack, sort of like #237. 2009-03-13 16:31:35 -07:00
Brian Warner
1a741fdb03 dirnode.py: when doing deep-traverse, walk each directory in alphabetical order, to make things like 'manifest' more predictable 2009-03-12 23:50:46 -07:00
Brian Warner
bce4a5385b add --add-lease to 'tahoe check', 'tahoe deep-check', and webapi. 2009-02-17 19:32:43 -07:00
Brian Warner
72adeccf2d dirnode: add get_repair_cap() 2009-01-22 21:44:49 -07:00
Brian Warner
94ab90273d dirnode.deep_traverse: fix docstring to describe the correct return value 2009-01-22 21:39:50 -07:00
Brian Warner
39a089dc7e dirnode deep-traversal: remove use of Limiter, stick with strict depth-first-traversal, to reduce memory usage during very large (300k+ dirnode) traversals 2009-01-08 19:41:16 -07:00
Zooko O'Whielacronx
5e6f90a015 rename "checker results" to "check results", because it is more parallel to "check-and-repair results" 2009-01-06 13:37:03 -07:00
Zooko O'Whielacronx
f1fbd4feae dirnode: don't check MAC on entries in dirnodes
In an ancient version of directories, we needed a MAC on each entry.  In modern times, the entire dirnode comes with a digital signature, so the MAC on each entry is redundant.
With this patch, we no longer check those MACs when reading directories, but we still produce them so that older readers will accept directories that we write.
2008-12-21 17:35:18 -07:00
Zooko O'Whielacronx
471e1f1b9b try to tidy up uri-as-string vs. uri-as-object
I get confused about whether a given argument or return value is a uri-as-string or uri-as-object.  This patch adds a lot of assertions that it is one or the other, and also changes CheckerResults to take objects not strings.
In the future, I hope that we generally use Python objects except when importing into or exporting from the Python interpreter e.g. over the wire, the UI, or a stored file.
2008-12-19 08:39:24 -07:00
Zooko O'Whielacronx
c456ff8591 rename "get_verifier()" to "get_verify_cap()" 2008-12-08 12:44:11 -07:00
Zooko O'Whielacronx
b58875fe43 mutable: rename mutable/node.py to mutable/filenode.py and mutable/repair.py to mutable/repairer.py
To be more consistent with the immutable layout that I am working on.
2008-12-07 08:20:08 -07:00
Brian Warner
7a0afb59a4 dirnode.py: dirnode.delete which hits UCWE should not fail with NoSuchChildError. Fixes #550. 2008-12-05 22:08:37 -07:00
Brian Warner
fb9af2c7a0 MutableFileNode.modify: pass first_time= and servermap= to the modifier callback 2008-12-05 22:07:10 -07:00
Zooko O'Whielacronx
b315619d6b download: refactor handling of URI Extension Block and crypttext hash tree, simplify things
Refactor into a class the logic of asking each server in turn until one of them gives an answer 
that validates.  It is called ValidatedThingObtainer.

Refactor the downloading and verification of the URI Extension Block into a class named 
ValidatedExtendedURIProxy.

The new logic of validating UEBs is minimalist: it doesn't require the UEB to contain any 
unncessary information, but of course it still accepts such information for backwards 
compatibility (so that this new download code is able to download files uploaded with old, and 
for that matter with current, upload code).

The new logic of validating UEBs follows the practice of doing all validation up front.  This 
practice advises one to isolate the validation of incoming data into one place, so that all of 
the rest of the code can assume only valid data.

If any redundant information is present in the UEB+URI, the new code cross-checks and asserts 
that it is all fully consistent.  This closes some issues where the uploader could have 
uploaded inconsistent redundant data, which would probably have caused the old downloader to 
simply reject that download after getting a Python exception, but perhaps could have caused 
greater harm to the old downloader.

I removed the notion of selecting an erasure codec from codec.py based on the string that was 
passed in the UEB.  Currently "crs" is the only such string that works, so 
"_assert(codec_name == 'crs')" is simpler and more explicit.  This is also in keeping with the 
"validate up front" strategy -- now if someone sets a different string than "crs" in their UEB, 
the downloader will reject the download in the "validate this UEB" function instead of in a 
separate "select the codec instance" function.

I removed the code to check plaintext hashes and plaintext Merkle Trees.  Uploaders do not 
produce this information any more (since it potentially exposes confidential information about 
the file), and the unit tests for it were disabled.  The downloader before this patch would 
check that plaintext hash or plaintext merkle tree if they were present, but not complain if 
they were absent.  The new downloader in this patch complains if they are present and doesn't 
check them.  (We might in the future re-introduce such hashes over the plaintext, but encrypt 
the hashes which are stored in the UEB to preserve confidentiality.  This would be a double-
check on the correctness of our own source code -- the current Merkle Tree over the ciphertext 
is already sufficient to guarantee the integrity of the download unless there is a bug in our 
Merkle Tree or AES implementation.) 

This patch increases the lines-of-code count by 8 (from 17,770 to 17,778), and reduces the 
uncovered-by-tests lines-of-code count by 24 (from 1408 to 1384).  Those numbers would be more 
meaningful if we omitted src/allmydata/util/ from the test-coverage statistics.
2008-12-05 08:17:54 -07:00
Brian Warner
bc53c24003 dirnode manifest: add verifycaps, both to internal API and to webapi. This will give the manual-GC tools more to work with, so they can estimate how much space will be freed. 2008-11-24 14:40:46 -07:00
Brian Warner
b84c2c6541 manifest: add storage-index strings to the json results 2008-11-19 16:00:27 -07:00
Brian Warner
815e0673e6 manifest: include stats in results. webapi is unchanged. 2008-11-19 15:03:47 -07:00
Brian Warner
d6a67cd566 dirnode manifest/stats: process more than one LIT file per tree; we were accidentally ignoring all but the first 2008-11-14 22:50:49 -07:00
Brian Warner
fca158e83a dirnode lookup: use distinct NoSuchChildError instead of the generic KeyError when a child can't be found 2008-10-27 13:15:25 -07:00
Brian Warner
977c6ac510 more #514: pass a Monitor to all checker operations, make mutable-checker honor the cancel flag 2008-10-22 01:38:18 -07:00
Brian Warner
8178b10ef1 dirnode.py: check for cancel during deep-traverse operations, and don't initiate any new ones if we've been cancelled. Gets us closer to #514. 2008-10-22 00:55:52 -07:00
Brian Warner
ad3d9207a9 Change deep-size/stats/check/manifest to a start+poll model instead of a single long-running synchronous operation. No cancel or handle-expiration yet. #514. 2008-10-21 17:03:07 -07:00
Brian Warner
9d4749d546 dirnode.build_manifest: include node.list in the limiter, that's the most important thing to slow down 2008-10-07 13:19:29 -07:00
Brian Warner
3ffaded809 web: change t=manifest to return a list of (path,read/writecap) tuples, instead of a list of verifycaps. Add output=html,text,json. 2008-10-06 21:36:18 -07:00
Brian Warner
09341a969a dirnode: fix my remarkably-consistent 'metdadata' typo 2008-10-02 18:08:45 -07:00
Brian Warner
d0bdf9a611 dirnode: add get_child_and_metadata_at_path 2008-10-02 17:52:03 -07:00
Brian Warner
e8cf581e3f move netstring() and split_netstring() into a separate util.netstring module 2008-09-25 21:38:24 -07:00
Brian Warner
f570ad7ba5 disallow deep-check on non-directories, simplifies the code a bit 2008-09-10 13:44:58 -07:00
Brian Warner
4bb88fd2ee dirnode: refactor recursive-traversal methods, add stats to deep_check() method results and t=deep-check webapi 2008-09-10 01:45:04 -07:00
Brian Warner
f6eeb3161f dirnode: cleanup, make get_verifier() always return a URI instance, not a string 2008-09-10 01:37:55 -07:00
Brian Warner
3408d552cd checker: overhaul checker results, split check/check_and_repair into separate methods, improve web displays 2008-09-07 12:44:56 -07:00
Brian Warner
c80e352951 IFilesystemNode: add get_storage_index(), it makes tests easier 2008-08-12 16:14:07 -07:00