Commit Graph

53 Commits

Author SHA1 Message Date
Brian Warner
9acf5beebd immutable repairer: populate servers-responding properly
If a server did not respond to the pre-repair filecheck, but did respond
to the repair, that server was not correctly added to the
RepairResults.data["servers-responding"] list. (This resulted from a
buggy usage of DictOfSets.union() in filenode.py).

In addition, servers to which filecheck queries were sent, but did not
respond, were incorrectly added to the servers-responding list
anyawys. (This resulted from code in the checker.py not paying attention
to the 'responded' flag).

The first bug was neatly masked by the second: it's pretty rare to have
a server suddenly start responding in the one-second window between a
filecheck and a subsequent repair, and if the server was around for the
filecheck, you'd never notice the problem. I only spotted the smelly
code while I was changing it for IServer cleanup purposes.

I added coverage to test_repairer.py for this. Trying to get that test
to fail before fixing the first bug is what led me to discover the
second bug. I also had to update test_corrupt_file_verno, since it was
incorrectly asserting that 10 servers responded, when in fact one of
them throws an error (but the second bug was causing it to be reported
anyways).
2012-05-16 16:55:09 -07:00
Brian Warner
4b8876c5da checker.py: minor simplifications 2012-04-04 12:05:31 -07:00
david-sarah
c7f65ee8ad verifier: correct a bug introduced in changeset [5106] that caused us to only verify the first block of a file. refs #1395 2011-08-02 10:24:37 -07:00
Zooko O'Whielacronx
f426e82287 verifier: serialize the fetching of blocks within a share so that we don't use too much RAM
Shares are still verified in parallel, but within a share, don't request a
block until the previous block has been verified and the memory we used to hold
it has been freed up.

Patch originally due to Brian. This version has a mockery-patchery-style test
which is "low tech" (it implements the patching inline in the test code instead
of using an extension of the mock.patch() function from the mock library) and
which unpatches in case of exception.

fixes #1395
2011-08-01 23:37:03 -07:00
Brian Warner
550d67f51f remove get_serverid() from ReadBucketProxy and customers, including Checker
and debug.py dump-share commands
refs #1363
2011-08-01 15:43:07 -07:00
Brian Warner
0f11d35f85 replace IServer.name() with get_name(), and get_longname()
This patch was originally written by Brian, but was re-recorded by Zooko to use
darcs replace instead of hunks for any file in which it would result in fewer
total hunks.
refs #1363
2011-08-01 10:44:28 -07:00
Brian Warner
bdc5cfbdad immutable/checker.py: remove some uses of s.get_serverid(), not all 2011-02-26 19:11:34 -07:00
Brian Warner
ffd296fc5a Refactor StorageFarmBroker handling of servers
Pass around IServer instance instead of (peerid, rref) tuple. Replace
"descriptor" with "server". Other replacements:

 get_all_servers -> get_connected_servers/get_known_servers
 get_servers_for_index -> get_servers_for_psi (now returns IServers)

This change still needs to be pushed further down: lots of code is now
getting the IServer and then distributing (peerid, rref) internally.
Instead, it ought to distribute the IServer internally and delay
extracting a serverid or rref until the last moment.

no_network.py was updated to retain parallelism.
2011-02-20 17:58:04 -08:00
Brian Warner
797828f47f Rewrite immutable downloader (#798). This patch rearranges the rest of src/allmydata/immutable/ . 2010-08-04 00:26:39 -07:00
david-sarah
973f0afdd3 Change direct accesses to an_uri.storage_index to calls to .get_storage_index() (fixes #948) 2010-02-21 18:45:04 -08:00
Brian Warner
d888bf3377 Clean up log.err calls, for one of the issues in #889.
allmydata.util.log.err() either takes a Failure as the first positional
argument, or takes no positional arguments and must be invoked in an
exception handler. Fixed its signature to match both foolscap.logging.log.err
and twisted.python.log.err . Included a brief unit test.
2010-01-11 17:33:43 -08:00
Brian Warner
bacb6fe5aa tidy up DeadReferenceError handling, ignore them in add_lease calls
Stop checking separately for ConnectionDone/ConnectionLost, since those have
been folded into DeadReferenceError since foolscap-0.3.1 . Write
rrefutil.trap_deadref() in terms of rrefutil.trap_and_discard() to improve
code coverage.
2010-01-11 16:07:23 -08:00
Brian Warner
db19b62702 immutable/checker.py: oops, forgot some imports. Also hush pyflakes. 2009-12-29 15:39:09 -08:00
Brian Warner
794e32738f checker: don't let failures in add-lease affect checker results. Closes #875.
Mutable servermap updates and the immutable checker, when run with
add_lease=True, send both the do-you-have-block and add-lease commands in
parallel, to avoid an extra round trip time. Many older servers have problems
with add-lease and raise various exceptions, which don't generally matter.
The client-side code was catching+ignoring some of them, but unrecognized
exceptions were passed through to the DYHB code, concealing the DYHB results
from the checker, making it think the server had no shares.

The fix is to separate the code paths. Both commands are sent at the same
time, but the errback path from add-lease is handled separately. Known
exceptions are ignored, the others (both unknown-remote and all-local) are
logged (log.WEIRD, which will trigger an Incident), but neither will affect
the DYHB results.

The add-lease message is sent first, and we know that the server handles them
synchronously. So when the checker is done, we can be sure that all the
add-lease messages have been retired. This makes life easier for unit tests.
2009-12-29 15:01:08 -08:00
Brian Warner
f4aa418086 Verifier: check the full cryptext-hash tree on each share. Removed .todos
from the last few test_repairer tests that were waiting on this.
2009-10-05 15:18:49 -07:00
Brian Warner
504c767d03 Verifier: check the full block-hash-tree on each share
Removed the .todo from two test_repairer tests that check this. The only
remaining .todos are on the three crypttext-hash-tree tests.
2009-10-05 14:48:44 -07:00
Brian Warner
e8f56af5a7 Verifier: check the full share-hash chain on each share
Removed the .todo from two test_repairer tests that check this.
2009-10-05 14:34:43 -07:00
Brian Warner
be95129833 immutable/checker.py: rearrange code a little bit, make it easier to follow 2009-10-05 13:02:52 -07:00
Brian Warner
0d5dc51617 Overhaul IFilesystemNode handling, to simplify tests and use POLA internally.
* stop using IURI as an adapter
* pass cap strings around instead of URI instances
* move filenode/dirnode creation duties from Client to new NodeMaker class
* move other Client duties to KeyGenerator, SecretHolder, History classes
* stop passing Client reference to dirnode/filenode constructors
  - pass less-powerful references instead, like StorageBroker or Uploader
* always create DirectoryNodes by wrapping a filenode (mutable for now)
* remove some specialized mock classes from unit tests

Detailed list of changes (done one at a time, then merged together)

always pass a string to create_node_from_uri(), not an IURI instance
always pass a string to IFilesystemNode constructors, not an IURI instance
stop using IURI() as an adapter, switch on cap prefix in create_node_from_uri()
client.py: move SecretHolder code out to a separate class
test_web.py: hush pyflakes
client.py: move NodeMaker functionality out into a separate object
LiteralFileNode: stop storing a Client reference
immutable Checker: remove Client reference, it only needs a SecretHolder
immutable Upload: remove Client reference, leave SecretHolder and StorageBroker
immutable Repairer: replace Client reference with StorageBroker and SecretHolder
immutable FileNode: remove Client reference
mutable.Publish: stop passing Client
mutable.ServermapUpdater: get StorageBroker in constructor, not by peeking into Client reference
MutableChecker: reference StorageBroker and History directly, not through Client
mutable.FileNode: removed unused indirection to checker classes
mutable.FileNode: remove Client reference
client.py: move RSA key generation into a separate class, so it can be passed to the nodemaker
move create_mutable_file() into NodeMaker
test_dirnode.py: stop using FakeClient mockups, use NoNetworkGrid instead. This simplifies the code, but takes longer to run (17s instead of 6s). This should come down later when other cleanups make it possible to use simpler (non-RSA) fake mutable files for dirnode tests.
test_mutable.py: clean up basedir names
client.py: move create_empty_dirnode() into NodeMaker
dirnode.py: get rid of DirectoryNode.create
remove DirectoryNode.init_from_uri, refactor NodeMaker for customization, simplify test_web's mock Client to match
stop passing Client to DirectoryNode, make DirectoryNode.create_with_mutablefile the normal DirectoryNode constructor, start removing client from NodeMaker
remove Client from NodeMaker
move helper status into History, pass History to web.Status instead of Client
test_mutable.py: fix minor typo
2009-08-15 04:28:46 -07:00
Brian Warner
1863aee0aa switch to using RemoteException instead of 'wrapped' RemoteReferences. Should fix #653, the rref-EQ problem 2009-05-21 17:46:32 -07:00
Brian Warner
c9803d5217 switch all foolscap imports to use foolscap.api or foolscap.logging 2009-05-21 17:38:23 -07:00
Brian Warner
400c04c19a immutable checker add-lease: catch remote IndexError here too 2009-02-27 01:17:24 -07:00
Brian Warner
f95e9b5964 immutable/checker.py: trap ShareVersionIncompatible too. Also, use f.check
instead of examining the value returned by f.trap, because the latter appears
to squash exception types down into their base classes (i.e. since
ShareVersionIncompatible is a subclass of LayoutInvalid,
f.trap(Failure(ShareVersionIncompatible)) == LayoutInvalid).

All this resulted in 'incompatible' shares being misclassified as 'corrupt'.
2009-02-23 22:14:05 -07:00
Brian Warner
bce4a5385b add --add-lease to 'tahoe check', 'tahoe deep-check', and webapi. 2009-02-17 19:32:43 -07:00
Brian Warner
38ee95fec4 immutable/checker: wrap comments to 80cols, my laptop does not have a wide screen. No functional changes. 2009-02-07 14:04:39 -07:00
Zooko O'Whielacronx
25063688b4 immutable repairer
This implements an immutable repairer by marrying a CiphertextDownloader to a CHKUploader.  It extends the IDownloadTarget interface so that the downloader can provide some metadata that the uploader requires.
The processing is incremental -- it uploads the first segments before it finishes downloading the whole file.  This is necessary so that you can repair large files without running out of RAM or using a temporary file on the repairer.
It requires only a verifycap, not a readcap.  That is: it doesn't need or use the decryption key, only the integrity check codes.
There are several tests marked TODO and several instances of XXX in the source code.  I intend to open tickets to document further improvements to functionality and testing, but the current version is probably good enough for Tahoe-1.3.0.
2009-01-12 11:00:22 -07:00
Zooko O'Whielacronx
ef60e85ec6 naming: finish renaming "CheckerResults" to "CheckResults" 2009-01-09 18:00:52 -07:00
Brian Warner
f8de336039 immutable/checker: include a summary (with 'Healthy' or 'Not Healthy' and a count of shares) in the checker results 2009-01-08 20:01:45 -07:00
Zooko O'Whielacronx
e598ca2f3f download: make sure you really get all the crypttext hashes
We were not making sure that we really got all the crypttext hashes during download.  If a server were to return less than the complete set of crypttext hashes, then our subsequent attempt to verify the correctness of the ciphertext would fail.  (And it wouldn't be obvious without very careful debugging why it had failed.)
This patch makes it so that you keep trying to get ciphertext hashes until you have a full set or you run out of servers to ask.
2009-01-07 20:26:38 -07:00
Zooko O'Whielacronx
5e6f90a015 rename "checker results" to "check results", because it is more parallel to "check-and-repair results" 2009-01-06 13:37:03 -07:00
Zooko O'Whielacronx
c35a6ee3a2 trivial: fix a bunch of pyflakes complaints 2009-01-06 08:00:54 -07:00
Zooko O'Whielacronx
6a12f316a4 immutable: new checker and verifier
New checker and verifier use the new download class.  They are robust against various sorts of failures or corruption.  They return detailed results explaining what they learned about your immutable files.  Some grotesque sorts of corruption are not properly handled yet, and those ones are marked as TODO or commented-out in the unit tests.
There is also a repairer module in this patch with the beginnings of a repairer in it.  That repairer is mostly just the interface to the outside world -- the core operation of actually reconstructing the missing data blocks and uploading them is not in there yet.
This patch also refactors the unit tests in test_immutable so that the handling of each kind of corruption is reported as passing or failing separately, can be separately TODO'ified, etc.  The unit tests are also improved in various ways to require more of the code under test or to stop requiring unreasonable things of it.  :-)
2009-01-05 18:28:18 -07:00
Zooko O'Whielacronx
471e1f1b9b try to tidy up uri-as-string vs. uri-as-object
I get confused about whether a given argument or return value is a uri-as-string or uri-as-object.  This patch adds a lot of assertions that it is one or the other, and also changes CheckerResults to take objects not strings.
In the future, I hope that we generally use Python objects except when importing into or exporting from the Python interpreter e.g. over the wire, the UI, or a stored file.
2008-12-19 08:39:24 -07:00
Zooko O'Whielacronx
d67a3fe4b1 immutable: use new logging mixins to simplify logging 2008-12-16 18:04:50 -07:00
Zooko O'Whielacronx
60bbc46a53 minor: fix unused imports -- thanks, pyflakes 2008-12-05 13:07:23 -07:00
Zooko O'Whielacronx
b315619d6b download: refactor handling of URI Extension Block and crypttext hash tree, simplify things
Refactor into a class the logic of asking each server in turn until one of them gives an answer 
that validates.  It is called ValidatedThingObtainer.

Refactor the downloading and verification of the URI Extension Block into a class named 
ValidatedExtendedURIProxy.

The new logic of validating UEBs is minimalist: it doesn't require the UEB to contain any 
unncessary information, but of course it still accepts such information for backwards 
compatibility (so that this new download code is able to download files uploaded with old, and 
for that matter with current, upload code).

The new logic of validating UEBs follows the practice of doing all validation up front.  This 
practice advises one to isolate the validation of incoming data into one place, so that all of 
the rest of the code can assume only valid data.

If any redundant information is present in the UEB+URI, the new code cross-checks and asserts 
that it is all fully consistent.  This closes some issues where the uploader could have 
uploaded inconsistent redundant data, which would probably have caused the old downloader to 
simply reject that download after getting a Python exception, but perhaps could have caused 
greater harm to the old downloader.

I removed the notion of selecting an erasure codec from codec.py based on the string that was 
passed in the UEB.  Currently "crs" is the only such string that works, so 
"_assert(codec_name == 'crs')" is simpler and more explicit.  This is also in keeping with the 
"validate up front" strategy -- now if someone sets a different string than "crs" in their UEB, 
the downloader will reject the download in the "validate this UEB" function instead of in a 
separate "select the codec instance" function.

I removed the code to check plaintext hashes and plaintext Merkle Trees.  Uploaders do not 
produce this information any more (since it potentially exposes confidential information about 
the file), and the unit tests for it were disabled.  The downloader before this patch would 
check that plaintext hash or plaintext merkle tree if they were present, but not complain if 
they were absent.  The new downloader in this patch complains if they are present and doesn't 
check them.  (We might in the future re-introduce such hashes over the plaintext, but encrypt 
the hashes which are stored in the UEB to preserve confidentiality.  This would be a double-
check on the correctness of our own source code -- the current Merkle Tree over the ciphertext 
is already sufficient to guarantee the integrity of the download unless there is a bug in our 
Merkle Tree or AES implementation.) 

This patch increases the lines-of-code count by 8 (from 17,770 to 17,778), and reduces the 
uncovered-by-tests lines-of-code count by 24 (from 1408 to 1384).  Those numbers would be more 
meaningful if we omitted src/allmydata/util/ from the test-coverage statistics.
2008-12-05 08:17:54 -07:00
Brian Warner
7932fadb5e webapi: add 'summary' string to checker results JSON 2008-11-18 18:28:26 -07:00
Brian Warner
dfa2408157 checker: add is_recoverable() to checker results, make our stub immutable-verifier not throw an exception on unrecoverable files, add tests 2008-11-06 22:35:47 -07:00
Brian Warner
b1db6d9ff2 web: add 'Repair' button to checker results when they indicate unhealthyness. Also add the object's uri to the CheckerResults instance. 2008-10-29 18:09:17 -07:00
Brian Warner
80d8f3e862 hush pyflakes 2008-09-09 19:50:17 -07:00
Brian Warner
1d2d6a35a6 checker results: add output=JSON to webapi, add tests, clean up APIs
to make the internal ones use binary strings (nodeid, storage index) and
the web/JSON ones use base32-encoded strings. The immutable verifier is
still incomplete (it returns imaginary healty results).
2008-09-09 19:45:17 -07:00
Brian Warner
04513e3ac5 immutable verifier: provide some dummy results so deep-check works, make the tests ignore these results until we finish it off 2008-09-09 18:08:27 -07:00
Brian Warner
f895e39d48 checker results: more tests, more results. immutable verifier tests are disabled until they emit more complete results 2008-09-09 17:15:46 -07:00
Brian Warner
af2231563e immutable/checker: make log() tolerate the format= form 2008-09-07 20:03:08 -07:00
Brian Warner
3408d552cd checker: overhaul checker results, split check/check_and_repair into separate methods, improve web displays 2008-09-07 12:44:56 -07:00
Zooko O'Whielacronx
def9fc8cf0 checker: make the log() function of SimpleCHKFileVerifier compatible with the log() function of its superclasses and subclasses 2008-08-25 14:44:07 -07:00
Brian Warner
97852cd626 immutable checker: add a status_report field 2008-08-12 20:35:30 -07:00
Brian Warner
d106e411af checker: add information to results, add some deep-check tests, fix a bug in which unhealthy files were not counted 2008-08-11 21:03:26 -07:00
Brian Warner
923c9c242a oops, fix import/pyflakes problems 2008-07-17 17:06:20 -07:00
Brian Warner
67db0a4967 deep-check: add webapi, add 'DEEP-CHECK' button to wui, add tests, rearrange checker API a bit 2008-07-17 16:47:09 -07:00