* "cap" means a python instance which encapsulates a filecap/dircap (uri.py)
* "uri" means a string with a "URI:" prefix
* FileNode instances are created with (and retain) a cap instance, and
generate uri strings on demand
* .get_cap/get_readcap/get_verifycap/get_repaircap return cap instances
* .get_uri/get_readonly_uri return uri strings
* add filenode.download_to_filename() for control.py, should find a better way
* use MutableFileNode.init_from_cap, not .init_from_uri
* directory URI instances: use get_filenode_cap, not get_filenode_uri
* update/cleanup bench_dirnode.py to match, add Makefile target to run it
This is safer: in the earlier API, an old webapi server would silently ignore
the initial children, and clients trying to set them would have to fetch the
newly-created directory to discover the incompatibility. In the new API,
clients using t=mkdir-with-children against an old webapi server will get a
clear error.
instead of creating an empty file and then adding the children later.
This should speed up mkdir(initial_children) considerably, removing two
roundtrips and an entire read-modify-write cycle, probably bringing it down
to a single roundtrip. A quick test (against the volunteergrid) suggests a
30% speedup.
test_dirnode: add new tests to enforce the restrictions that interfaces.py
claims for create_new_mutable_directory(): no UnknownNodes, metadata dicts
interfaces.py: define INodeMaker, document argument values, change
create_new_mutable_directory() to take dict-of-nodes. Change
dirnode.set_nodes() and dirnode.create_subdirectory() too.
nodemaker.py: use INodeMaker, update create_new_mutable_directory()
client.py: have create_dirnode() delegate initial_children= to nodemaker
dirnode.py (Adder): take dict-of-nodes instead of list-of-nodes, which
updates set_nodes() and create_subdirectory()
web/common.py (convert_initial_children_json): create dict-of-nodes
web/directory.py: same
web/unlinked.py: same
test_dirnode.py: update tests to match
invoked with the new MutableFileNode and is supposed to return the initial
contents. This can be used by e.g. a new dirnode which needs the filenode's
writekey to encrypt its initial children.
create_mutable_file() still accepts a bytestring too, or None for an empty
file.
We need to carefully document the licence of figleaf in order to get Tahoe-LAFS into Ubuntu Karmic Koala. However, figleaf isn't really a part of Tahoe-LAFS per se -- this is just a "convenience copy" of a development tool. The quickest way to make Tahoe-LAFS acceptable for Karmic then, is to remove figleaf from the Tahoe-LAFS tarball itself. People who want to run figleaf on Tahoe-LAFS (as everyone should want) can install figleaf themselves. I haven't tested this -- there may be incompatibilities between upstream figleaf and the copy that we had here...
This makes it more obvious that the Helper currently generates leases with
the Helper's own secrets, rather than getting values from the client, which
is arguably a bug that will likely be resolved with the Accounting project.
child of the client, access with client.downloader instead of
client.getServiceNamed("downloader"). The single "Downloader" instance is
scheduled for demolition anyways, to be replaced by individual
filenode.download calls.
* stop using IURI as an adapter
* pass cap strings around instead of URI instances
* move filenode/dirnode creation duties from Client to new NodeMaker class
* move other Client duties to KeyGenerator, SecretHolder, History classes
* stop passing Client reference to dirnode/filenode constructors
- pass less-powerful references instead, like StorageBroker or Uploader
* always create DirectoryNodes by wrapping a filenode (mutable for now)
* remove some specialized mock classes from unit tests
Detailed list of changes (done one at a time, then merged together)
always pass a string to create_node_from_uri(), not an IURI instance
always pass a string to IFilesystemNode constructors, not an IURI instance
stop using IURI() as an adapter, switch on cap prefix in create_node_from_uri()
client.py: move SecretHolder code out to a separate class
test_web.py: hush pyflakes
client.py: move NodeMaker functionality out into a separate object
LiteralFileNode: stop storing a Client reference
immutable Checker: remove Client reference, it only needs a SecretHolder
immutable Upload: remove Client reference, leave SecretHolder and StorageBroker
immutable Repairer: replace Client reference with StorageBroker and SecretHolder
immutable FileNode: remove Client reference
mutable.Publish: stop passing Client
mutable.ServermapUpdater: get StorageBroker in constructor, not by peeking into Client reference
MutableChecker: reference StorageBroker and History directly, not through Client
mutable.FileNode: removed unused indirection to checker classes
mutable.FileNode: remove Client reference
client.py: move RSA key generation into a separate class, so it can be passed to the nodemaker
move create_mutable_file() into NodeMaker
test_dirnode.py: stop using FakeClient mockups, use NoNetworkGrid instead. This simplifies the code, but takes longer to run (17s instead of 6s). This should come down later when other cleanups make it possible to use simpler (non-RSA) fake mutable files for dirnode tests.
test_mutable.py: clean up basedir names
client.py: move create_empty_dirnode() into NodeMaker
dirnode.py: get rid of DirectoryNode.create
remove DirectoryNode.init_from_uri, refactor NodeMaker for customization, simplify test_web's mock Client to match
stop passing Client to DirectoryNode, make DirectoryNode.create_with_mutablefile the normal DirectoryNode constructor, start removing client from NodeMaker
remove Client from NodeMaker
move helper status into History, pass History to web.Status instead of Client
test_mutable.py: fix minor typo
webapi.txt: clarify replace=only-files argument, mention replace= on POST t=uri
test_cli.py: insert whitespace between logical operations
web.common.parse_replace_arg: make it case-insensitive, to match the docs
we actually exercise during tests) into more specific exceptions, so they
don't get optimized away. The best rule to follow is probably this: if an
exception is worth testing, then it's part of the API, and AssertionError
should never be part of the API. Closes#749.
The idea is that future versions of Tahoe will add new URI types that this
version won't recognize, but might store them in directories that we *can*
read. We should handle these "objects from the future" as best we can.
Previous releases of Tahoe would just explode. With this change, we'll
continue to be able to work with everything else in the directory.
The code change is to wrap anything we don't recognize as an UnknownNode
instance (as opposed to a FileNode or DirectoryNode). Then webapi knows how
to render these (mostly by leaving fields blank), deep-check knows to skip
over them, deep-stats counts them in "count-unknown". You can rename and
delete these things, but you can't add new ones (because we wouldn't know how
to generate a readcap to put into the dirnode's rocap slot, and because this
lets us catch typos better).
This reduces the total test time on my laptop from 400s to 283s.
* src/allmydata/test/test_system.py (SystemTest.test_mutable._test_debug):
Remove assertion about container_size/data_size, this changes with keysize
and was too variable anyways.
* src/allmydata/mutable/filenode.py (MutableFileNode.create): add keysize=
* src/allmydata/dirnode.py (NewDirectoryNode.create): same
* src/allmydata/client.py (Client.DEFAULT_MUTABLE_KEYSIZE): add default,
this overrides the one in MutableFileNode
Instead of testing to see that the previous SDMF filesize limit was being
obeyed, we now test to make sure that we can insert files larger than that
limit.
ticketed in http://divmod.org/trac/ticket/2830 and doesn't need a Tahoe-side
change, plus this test fails on win32 for unrelated reasons (and test_client
is the place to think about the win32 issue).
* emit lease expiry date in ISO-8601'ish format as well as Brian's format
* rename iso_utc_time_to_localseconds() to iso_utc_time_to_seconds()
* add iso_utc_date()
* simplify the body of iso_utc_time_to_seconds()
This walks slowly through all shares, examining their leases, deciding which
are still valid and which have expired. Once enabled, it will then remove the
expired leases, and delete shares which no longer have any valid leases. Note
that there is not yet a tahoe.cfg option to enable lease-deletion: the
current code is read-only. A subsequent patch will add a tahoe.cfg knob to
control this, as well as docs. Some other minor items included in this patch:
tahoe debug dump-share has a new --leases-only flag
storage sharefile/leaseinfo code is cleaned up
storage web status page (/storage) has more info, more tests coverage
space-left measurement on OS-X should be more accurate (it was off by 2048x)
(use stat .f_frsize instead of f_bsize)
A test failed on draco (MacPPC) because it took 49 seconds to get around to running the test, and the node had already stopped itself when the hotline file was 40 seconds old.
It is currently hardcoded in setup.py to be 'allmydata-tahoe'. Ticket #556 is to make it configurable by a runtime command-line argument to setup.py: "--appname=foo", but I suddenly wondered if we really wanted that and at the same time realized that we don't need that for tahoe-1.3.0 release, so this patch just hardcodes it in setup.py.
setup.py inspects a file named 'src/allmydata/_appname.py' and assert that it contains the string "__appname__ = 'allmydata-tahoe'", and creates it if it isn't already present. src/allmydata/__init__.py import _appname and reads __appname__ from it. The rest of the Python code imports allmydata and inspects "allmydata.__appname__", although actually every use it uses "allmydata.__full_version__" instead, where "allmydata.__full_version__" is created in src/allmydata/__init__.py to be:
__full_version__ = __appname + '-' + str(__version__).
All the code that emits an "application version string" when describing what version of a protocol it supports (introducer server, storage server, upload helper), or when describing itself in general (introducer client), usese allmydata.__full_version__.
This fixes ticket #556 at least well enough for tahoe-1.3.0 release.
Obviously requiring the code under test to perform within some limit isn't very meaningful if we raise the limit whenever the test goes outside of it.
But I still don't want to remove the test code which measures how many writes (and, elsewhere, how many reads) a client does in order to fulfill these duties.
Let this number -- now 20 -- stand as an approximation of the inefficiency of our code divided by my mental model of how many operations are actually optimal for these duties.
This is important, because if the repairer doesn't completely repair all kinds of corruption (as the current one doesn't), then the successive tests get messed up by assuming that the shares were uncorrupted when the test first set about to corrupt them.
This means that the tests still work if you are executing them from a CWD other than the src dir -- *if* the "bin/tahoe" is found at os.path.dirname(os.path.dirname(os.path.dirname(allmydata.__file__))).
If no file is found at that location, then just skip the tests of executing the "tahoe" executable, because we don't want to accidentally run those tests against an executable from a different version of tahoe.
This is necessary because loading allmydata code now depends on PYTHONPATH manipulation which is done in the "bin/tahoe" script. Unfortunately it makes test_runner slower since it launches and waits for many subprocesses.
This reverses some, but not all, of the changes that were committed in the following set of patches.
rolling back:
Sun Jan 18 09:54:30 MST 2009 toby.murray
* add 'web.ambient_upload_authority' as a paramater to tahoe.cfg
M ./src/allmydata/client.py -1 +3
M ./src/allmydata/test/common.py -7 +9
A ./src/allmydata/test/test_ambient_upload_authority.py
M ./src/allmydata/web/root.py +12
M ./src/allmydata/webish.py -1 +4
Sun Jan 18 09:56:08 MST 2009 zooko@zooko.com
* trivial: whitespace
I ran emacs's "M-x whitespace-cleanup" on the files that Toby's recent patch had touched that had trailing whitespace on some lines.
M ./src/allmydata/test/test_ambient_upload_authority.py -9 +8
M ./src/allmydata/web/root.py -2 +1
M ./src/allmydata/webish.py -2 +1
Mon Jan 19 14:16:19 MST 2009 zooko@zooko.com
* trivial: remove unused import noticed by pyflakes
M ./src/allmydata/test/test_ambient_upload_authority.py -1
Mon Jan 19 21:38:35 MST 2009 toby.murray
* doc: describe web.ambient_upload_authority
M ./docs/configuration.txt +14
M ./docs/frontends/webapi.txt +11
Mon Jan 19 21:38:57 MST 2009 zooko@zooko.com
* doc: add Toby Murray to the CREDITS
M ./CREDITS +4
This implements an immutable repairer by marrying a CiphertextDownloader to a CHKUploader. It extends the IDownloadTarget interface so that the downloader can provide some metadata that the uploader requires.
The processing is incremental -- it uploads the first segments before it finishes downloading the whole file. This is necessary so that you can repair large files without running out of RAM or using a temporary file on the repairer.
It requires only a verifycap, not a readcap. That is: it doesn't need or use the decryption key, only the integrity check codes.
There are several tests marked TODO and several instances of XXX in the source code. I intend to open tickets to document further improvements to functionality and testing, but the current version is probably good enough for Tahoe-1.3.0.
Rename "downloadable" to "target".
Rename "u" to "v" in FileDownloader.__init__().
Rename "_uri" to "_verifycap" in FileDownloader.
Rename "_downloadable" to "_target" in FileDownloader.
Rename "FileDownloader" to "CiphertextDownloader".
FileDownloader takes a verify cap and produces ciphertext, instead of taking a read cap and producing plaintext.
FileDownloader does all integrity checking including the mandatory ciphertext hash tree and the optional ciphertext flat hash, rather than expecting its target to do some of that checking.
Rename immutable.download.Output to immutable.download.DecryptingOutput. An instance of DecryptingOutput can be passed to FileDownloader to use as the latter's target. Text pushed to the DecryptingOutput is decrypted and then pushed to *its* target.
DecryptingOutput satisfies the IConsumer interface, and if its target also satisfies IConsumer, then it forwards and pause/unpause signals to its producer (which is the FileDownloader).
This patch also changes some logging code to use the new logging mixin class.
Check integrity of a segment and decrypt the segment one block-sized buffer at a time instead of copying the buffers together into one segment-sized buffer (reduces peak memory usage, I think, and is probably a tad faster/less CPU, depending on your encoding parameters).
Refactor FileDownloader so that processing of segments and of tail-segment share as much code is possible.
FileDownloader and FileNode take caps as instances of URI (Python objects), not as strings.
This was somewhat sad; the assertion didn't say what path caused the
error, what went wrong. So... silently skip over things that are
neither dirs nor files.
pyflakes pointed out to me that I had committed some code that is untested, since it uses an undefined name. This patch exercises that code -- the validation of the ciphertext hash tree -- by corrupting some of the share files in a very specific way, and also fixes the bug.
This makes Uploader take an EncryptedUploadable object instead of an Uploadable object. I also changed it to return a verify cap instead of a tuple of the bits of data that one finds in a verify cap.
This will facilitate hooking together an Uploader and a Downloader to make a Repairer.
Also move offloaded.py into src/allmydata/immutable/.
Maybe it already got one of the corrupted hashes from a different server and it doesn't double-check that the hash from every server is correct. Or another problem. But in any case I'm marking this as TODO because an even better (more picky) verifier is less urgent than repairer.