Commit Graph

353 Commits

Author SHA1 Message Date
Brian Warner
957a5315aa CheckResults.get_sharemap() now returns IServers
Remove temporary get_new_sharemap().
2012-06-02 11:39:11 -07:00
Brian Warner
1883d393c6 CheckResults: replace get_data() with a bunch of individual getters 2012-06-02 11:39:10 -07:00
Brian Warner
3a1c02cfdf change UploadResults to return IServers, update users to match
This finally changes all callers of get_servermap()/get_sharemap() to
accept IServers, and changes UploadResults to provide them.
2012-05-21 21:18:37 -07:00
Brian Warner
97a1eb6ebf split IDisplayableServer from IServer, add sb.get_stub_server()
IDisplayableServer includes just enough functionality to call
.get_name() and friends, which is all that the UploadResults really
need. IServer is a superset that includes actual share-manipulation
methods. StubServer instances provide only IDisplayableServer, while
actual NativeStorageServer instances provide the full IServer interface.

When the Helper sends a serverid (so we know what to call the server but
nothing else about it, and have no corresponding NativeStorageServer
object to reference), but we want to store an IDisplayableServer in the
UploadResults, we create a synthetic StubServer "server" and store that
instead.
2012-05-21 21:17:27 -07:00
Brian Warner
3d771132a8 switch UploadResults to use get_uri(), hide internal ._uri
Complete the getter-based transformation, by hiding ".uri" and updating
callers to use get_uri(). Also don't set a dummy self._uri, leave it
undefined until someone calls set_uri().
2012-05-21 21:14:44 -07:00
Brian Warner
29b11531b5 switch UploadResults to use getters, hide internal data, for all but .uri
This hides attributes with e.g. _sharemap, and creates getters like
get_sharemap() to access them, for every field except .uri . This will
make it easier to modify the internal representation of .sharemap
without requiring callers to adjust quite yet.

".uri" has so many users that it seemed better to update it in a
subsequent patch.
2012-05-21 21:14:28 -07:00
Brian Warner
e60982c851 helper: remove timings["existence_check"], aka "Already-In-Grid Check"
This measured how long the Helper took to do a filecheck before asking
for ciphertext. The "Contacting Helper" report includes both
existence_check and the client-helper RTT.

For non-overlapping uploads, it was being returned correctly. But when
multiple upload requests overlapped, and the file was not already in the
grid, the filecheck would only run once, and its existence_check time
would be reported for all uploaders (even if they didn't have to wait
for that time). Cleaning that up proved too difficult: the only correct
place to report this time is from the initial remote_upload_chk() call,
but the return value of that is too constrained to accomodate it in the
needs-upload case.

So I'm removing it altogether. Eventually I plan to add a proper
events/times field and record more data, including this check, in a form
that can be drawn on a nice zoomable timeline view.

Old clients talking to a new Helper (which doesn't supply the value)
will tolerate the loss (they'll just display an empty field on the web
view).
2012-05-21 21:13:11 -07:00
Brian Warner
b38cfd0235 move IServer from storage_client.py to interfaces.py 2012-04-04 11:13:59 -07:00
david-sarah
93e50f5e16 interfaces.py: ensure that NoSuchChildError can be converted to str even when it is for a non-ASCII name. fixes #1483 2011-08-14 22:59:59 +00:00
Brian Warner
a56e639346 Fix mutable status (mapupdate/retrieve/publish) to use serverids, not tubids
This still leaves immutable-publish results incorrectly using tubids instead
of serverids. That will need some more work, since it might change the Helper
interface.
2012-03-17 17:01:35 -07:00
Brian Warner
bc21726dfd new introducer: signed extensible dictionary-based messages! refs #466
This introduces new client and server halves to the Introducer (renaming the
old one with a _V1 suffix). Both have fallbacks to accomodate talking to a
different version: the publishing client switches on whether the server's
.get_version() advertises V2 support, the server switches on which
subscription method was invoked by the subscribing client.

The V2 protocol sends a three-tuple of (serialized announcement dictionary,
signature, pubkey) for each announcement. The V2 server dispatches messages
to subscribers according to the service-name, and throws errors for invalid
signatures, but does not otherwise examine the messages. The V2 receiver's
subscription callback will receive a (serverid, ann_dict) pair. The
'serverid' will be equal to the pubkey if all of the following are true:

  the originating client is V2, and was told a privkey to use
  the announcement went through a V2 server
  the signature is valid

If not, 'serverid' will be equal to the tubid portion of the announced FURL,
as was the case for V1 receivers.

Servers will create a keypair if one does not exist yet, stored in
private/server.privkey .

The signed announcement dictionary puts the server FURL in a key named
"anonymous-storage-FURL", which anticipates upcoming Accounting-related
changes in the server advertisements. It also provides a key named
"permutation-seed-base32" to tell clients what permutation seed to use. This
is computed at startup, using tubid if there are existing shares, otherwise
the pubkey, to retain share-order compatibility for existing servers.
2012-03-13 18:24:32 -07:00
david-sarah
cf9bf2ea10 interfaces.py: remove get_extension_params and set_extension_params methods from IMutableFileURI. refs #393, #1526 2011-10-10 12:48:42 -07:00
david-sarah
de00b277cc interfaces.py: fix a typo in the name of IMutableSlotWriter.put_encprivkey. refs #393 2011-10-10 12:46:42 -07:00
david-sarah
c10099f982 interfaces: document that the 'fills-holes-with-zero-bytes' key should be used to detect whether a storage server has that behavior. refs #1528 2011-09-12 17:28:43 -07:00
Zooko O'Whielacronx
32f80625c9 storage: more paranoid handling of bounds and palimpsests in mutable share files
* storage server ignores requests to extend shares by sending a new_length
* storage server fills exposed holes (created by sending a write vector whose offset begins after the end of the current data) with 0 to avoid "palimpsest" exposure of previous contents
* storage server zeroes out lease info at the old location when moving it to a new location
ref. #1528
2011-09-12 15:26:55 -07:00
Zooko O'Whielacronx
5476f67dc1 storage: remove the storage server's "remote_cancel_lease" function
We're removing this function because it is currently unused, because it is dangerous, and because the bug described in #1528 leaks the cancellation secret, which allows anyone who knows a file's storage index to abuse this function to delete shares of that file.
fixes #1528 (there are two patches that are each a sufficient fix to #1528 and this is one of them)
2011-09-12 15:23:31 -07:00
Brian Warner
748e419a9b move DownloadStopped from download.common to interfaces 2011-09-09 11:11:50 -07:00
Brian Warner
48544a251d MDMF: s/Writable/Writeable/g, for consistency with existing SDMF code 2011-08-27 11:33:57 -07:00
david-sarah
3d7a32647c Implementation, tests and docs for blacklists. This version allows listing directories containing a blacklisted child. Inclusion of blacklist.py fixed. fixes #1425 2011-08-24 08:59:28 -07:00
Kevan Carstensen
126d1ad010 interfaces: change interfaces to work with MDMF
A lot of this work concerns #993, in that it unifies (to an extent) the
interfaces of mutable and immutable files.
2011-08-01 18:41:19 -07:00
Zooko O'Whielacronx
155d048d17 docs: three minor fixes
CREDITS for arc for stats tweak
fix link to .zip file in quickstart.rst (thanks to ChosenOne for noticing)
English usage tweak
2011-06-10 05:16:56 -07:00
wilcoxjg
67ad0175cd server.py: get_latencies now reports percentiles _only_ if there are sufficient observations for the interpretation of the percentile to be unambiguous.
interfaces.py:  modified the return type of RIStatsProvider.get_stats to allow for None as a return value
NEWS.rst, stats.py: documentation of change to get_latencies
stats.rst: now documents percentile modification in get_latencies
test_storage.py:  test_latencies now expects None in output categories that contain too few samples for the associated percentile to be unambiguously reported.
fixes #1392
2011-05-27 05:01:35 -07:00
Brian Warner
70f9f89c66 control.py: remove all uses of s.get_serverid() 2011-02-26 19:12:03 -07:00
Brian Warner
ffd296fc5a Refactor StorageFarmBroker handling of servers
Pass around IServer instance instead of (peerid, rref) tuple. Replace
"descriptor" with "server". Other replacements:

 get_all_servers -> get_connected_servers/get_known_servers
 get_servers_for_index -> get_servers_for_psi (now returns IServers)

This change still needs to be pushed further down: lots of code is now
getting the IServer and then distributing (peerid, rref) internally.
Instead, it ought to distribute the IServer internally and delay
extracting a serverid or rref until the last moment.

no_network.py was updated to retain parallelism.
2011-02-20 17:58:04 -08:00
Brian Warner
c18953c169 fix #1223, crash+inefficiency during repair due to read overrun
* repairer (really the uploader) reads beyond end of input file (Uploadable)
* new-downloader does not tolerate overreads
* uploader does lots of tiny reads (inefficient)

This fixes the last two. The uploader still does a single overread at the end
of the input file, but now that's ok so we can leave it in place. The
uploader now expects the Uploadable to behave like a normal disk
file (reading beyond EOF will return less data than was asked for), and now
the new-downloadable behaves that way.
2010-10-29 01:20:36 -07:00
Zooko O'Whielacronx
0c2397523b doc: add explanation of the motivation for the surprising and awkward API to erasure coding 2010-10-14 23:02:02 -07:00
Zooko O'Whielacronx
cb83f2e41c minor: remove unused interface declaration, change allmydata.org to tahoe-lafs.org in email address, fix wording in relnotes.txt 2010-09-30 08:37:08 -07:00
Brian Warner
919938dd95 copy the rest of David-Sarah's changes to make my tree match 1.8.0beta 2010-08-04 00:27:52 -07:00
Brian Warner
7b7b0c9709 Rewrite immutable downloader (#798). This patch includes higher-level
integration into the NodeMaker, and updates the web-status display to handle
the new download events.
2010-08-04 00:27:02 -07:00
david-sarah
29a06457d2 dirnode.py: fix a bug in the no-write change for Adder, and improve test coverage. Add a 'metadata' argument to create_subdirectory, with documentation. Also update some comments in test_dirnode.py made stale by the ctime/mtime change. 2010-06-01 20:26:41 -07:00
david-sarah
5974773969 SFTP: Fix error in support for getAttrs on an open file, to index open files by directory entry rather than path. Extend that support to renaming open files. Also, implement the extposix-rename@openssh.org extension, and some other minor refactoring. 2010-05-21 20:58:36 -07:00
david-sarah
9214dbda50 Add must_exist, must_be_directory, and must_be_file arguments to DirectoryNode.delete. This will be used to fixes a minor condition in the SFTP frontend. 2010-05-27 12:45:29 -07:00
david-sarah
027e7701bd Change doc comments in interfaces.py to take into account unknown nodes. 2010-05-28 10:19:22 -07:00
Kevan Carstensen
e225f573b9 Fix up the behavior of #778, per reviewers' comments
- Make some important utility functions clearer and more thoroughly 
    documented.
  - Assert in upload.servers_of_happiness that the buckets attributes
    of PeerTrackers passed to it are mutually disjoint.
  - Get rid of some silly non-Pythonisms that I didn't see when I first
    wrote these patches.
  - Make sure that should_add_server returns true when queried about a 
    shnum that it doesn't know about yet.
  - Change Tahoe2PeerSelector.preexisting_shares to map a shareid to a set
    of peerids, alter dependencies to deal with that.
  - Remove upload.should_add_servers, because it is no longer necessary
  - Move upload.shares_of_happiness and upload.shares_by_server to a utility
    file.
  - Change some points in Tahoe2PeerSelector.
  - Compute servers_of_happiness using a bipartite matching algorithm that 
    we know is optimal instead of an ad-hoc greedy algorithm that isn't.
  - Change servers_of_happiness to just take a sharemap as an argument,
    change its callers to merge existing_shares and used_peers before 
    calling it.
  - Change an error message in the encoder to be more appropriate for 
    servers of happiness.
  - Clarify the wording of an error message in immutable/upload.py
  - Refactor a happiness failure message to happinessutil.py, and make
    immutable/upload.py and immutable/encode.py use it.
  - Move the word "only" as far to the right as possible in failure 
    messages.
  - Use a better definition of progress during peer selection.
  - Do read-only peer share detection queries in parallel, not sequentially.
  - Clean up logging semantics; print the query statistics whenever an
    upload is unsuccessful, not just in one case.
2010-05-13 17:49:17 -07:00
Kevan Carstensen
8bcc771e26 Change "UploadHappinessError" to "UploadUnhappinessError" 2009-12-04 22:30:37 -07:00
Kevan Carstensen
68fb556e93 Alter the error message returned when peer selection fails
The Tahoe2PeerSelector returned either NoSharesError or NotEnoughSharesError
for a variety of error conditions that weren't informatively described by them.
This patch creates a new error, UploadHappinessError, replaces uses of 
NoSharesError and NotEnoughSharesError with it, and alters the error message
raised with the errors to be more in line with the new servers_of_happiness
behavior. See ticket #834 for more information.
2009-11-22 18:24:05 -07:00
Kevan Carstensen
4e29060847 Change stray "shares_of_happiness" to "servers_of_happiness" 2009-11-16 15:24:59 -07:00
Kevan Carstensen
b2d8a7cec2 Alter the signature of set_shareholders in IEncoder to add a 'servermap' parameter, which gives IEncoders enough information to perform a sane check for servers_of_happiness. 2009-11-03 21:32:41 -07:00
"Kevan Carstensen"
5fe125ed74 Alter wording in 'interfaces.py' to be correct wrt #778 2009-12-04 21:40:05 -07:00
david-sarah
56c00cb381 Miscellaneous documentation, test, and code formatting tweaks. 2010-01-26 23:03:09 -08:00
david-sarah
6057bc02cc Prevent mutable objects from being retrieved from an immutable directory, and associated forward-compatibility improvements. 2010-01-26 22:44:30 -08:00
Brian Warner
ba0690c9d7 mutable repair: return successful=False when numshares<k (thus repair fails),
instead of weird errors. Closes #874 and #786.

Previously, if the file had 0 shares, this would raise TypeError as it tried
to call download_version(None). If the file had some shares but fewer than
'k', it would incorrectly raise MustForceRepairError.

Added get_successful() to the IRepairResults API, to give repair() a place to
report non-code-bug problems like this.
2009-12-29 15:37:46 -08:00
Brian Warner
acd211765c node.py/interfaces.py: minor docs fixes 2009-12-29 15:04:09 -08:00
Brian Warner
499add09e6 webapi: don't accept zero-length childnames during traversal. Closes #358, #676.
This forbids operations that would implicitly create a directory with a
zero-length (empty string) name, like what you'd get if you did "tahoe put
local /oops/blah" (#358) or "POST /uri/CAP//?t=mkdir" (#676). The error
message is fairly friendly too.

Also added code to "tahoe put" to catch this error beforehand and suggest the
correct syntax (i.e. without the leading slash).
2009-12-27 15:10:43 -05:00
Brian Warner
96834da0a2 Simplify immutable download API: use just filenode.read(consumer, offset, size)
* remove Downloader.download_to_data/download_to_filename/download_to_filehandle
* remove download.Data/FileName/FileHandle targets
* remove filenode.download/download_to_data/download_to_filename methods
* leave Downloader.download (the whole Downloader will go away eventually)
* add util.consumer.MemoryConsumer/download_to_data, for convenience
  (this is mostly used by unit tests, but it gets used by enough non-test
   code to warrant putting it in allmydata.util)
* update tests
* removes about 180 lines of code. Yay negative code days!

Overall plan is to rewrite immutable/download.py and leave filenode.read() as
the sole read-side API.
2009-12-01 17:53:30 -05:00
Brian Warner
0cf320c2ab interface name cleanups: IFileNode, IImmutableFileNode, IMutableFileNode
The proper hierarchy is:
 IFilesystemNode
 +IFileNode
 ++IMutableFileNode
 ++IImmutableFileNode
 +IDirectoryNode

Also expand test_client.py (NodeMaker) to hit all IFilesystemNode types.
2009-11-19 23:52:55 -08:00
Brian Warner
d2badbea78 class name cleanups: s/FileNode/ImmutableFileNode/
also fix test/bench_dirnode.py for recent dirnode changes
2009-11-19 23:22:39 -08:00
Brian Warner
e046744f40 make get_size/get_current_size consistent for all IFilesystemNode classes
* stop caching most_recent_size in dirnode, rely upon backing filenode for it
* start caching most_recent_size in MutableFileNode
* return None when you don't know, not "?"
* only render None as "?" in the web "more info" page
* add get_size/get_current_size to UnknownNode
2009-11-18 11:16:24 -08:00
Brian Warner
f85690697a Add t=mkdir-immutable to the webapi. Closes #607.
* change t=mkdir-with-children to not use multipart/form encoding. Instead,
  the request body is all JSON. t=mkdir-immutable uses this format too.
* make nodemaker.create_immutable_dirnode() get convergence from SecretHolder,
  but let callers override it
* raise NotDeepImmutableError instead of using assert()
* add mutable= argument to DirectoryNode.create_subdirectory(), default True
2009-11-17 23:09:00 -08:00
Brian Warner
5fe713fc52 nodemaker: implement immutable directories (internal interface), for #607
* nodemaker.create_from_cap() now handles DIR2-CHK and DIR2-LIT
* client.create_immutable_dirnode() is used to create them
* no webapi yet
2009-11-11 16:22:33 -08:00
Brian Warner
131e05b155 clean up uri-vs-cap terminology, emphasize cap instances instead of URI strings
* "cap" means a python instance which encapsulates a filecap/dircap (uri.py)
 * "uri" means a string with a "URI:" prefix
 * FileNode instances are created with (and retain) a cap instance, and
   generate uri strings on demand
 * .get_cap/get_readcap/get_verifycap/get_repaircap return cap instances
 * .get_uri/get_readonly_uri return uri strings

* add filenode.download_to_filename() for control.py, should find a better way
* use MutableFileNode.init_from_cap, not .init_from_uri
* directory URI instances: use get_filenode_cap, not get_filenode_uri
* update/cleanup bench_dirnode.py to match, add Makefile target to run it
2009-11-11 14:26:19 -08:00
Brian Warner
b4ec86c95a update many dirnode interfaces to accept dict-of-nodes instead of dict-of-caps
interfaces.py: define INodeMaker, document argument values, change
               create_new_mutable_directory() to take dict-of-nodes. Change
               dirnode.set_nodes() and dirnode.create_subdirectory() too.
nodemaker.py: use INodeMaker, update create_new_mutable_directory()
client.py: have create_dirnode() delegate initial_children= to nodemaker
dirnode.py (Adder): take dict-of-nodes instead of list-of-nodes, which
                    updates set_nodes() and create_subdirectory()
web/common.py (convert_initial_children_json): create dict-of-nodes
web/directory.py: same
web/unlinked.py: same
test_dirnode.py: update tests to match
2009-10-17 12:28:29 -07:00
Brian Warner
c2520e4ec7 client.create_mutable_file(contents=) now accepts a callable, which is
invoked with the new MutableFileNode and is supposed to return the initial
contents. This can be used by e.g. a new dirnode which needs the filenode's
writekey to encrypt its initial children.

create_mutable_file() still accepts a bytestring too, or None for an empty
file.
2009-10-12 20:12:32 -07:00
Brian Warner
cf65cc2ae3 replace dirnode.create_empty_directory() with create_subdirectory(), which
takes an initial_children= argument
2009-10-12 19:15:20 -07:00
Brian Warner
d079eb45f6 dirnode.set_children: change return value: fire with self instead of None 2009-10-12 18:50:26 -07:00
Brian Warner
f871c3bb3d dirnode.set_nodes: change return value: fire with self instead of None 2009-10-12 18:45:46 -07:00
Brian Warner
304aadd4f7 dirnode.set_children: take a dict, not a list 2009-10-12 17:24:40 -07:00
Brian Warner
e2ffc3dc03 dirnode.set_uri/set_children: change signature to take writecap+readcap
instead of a single cap. The webapi t=set_children call benefits too.
2009-10-12 16:51:26 -07:00
Brian Warner
3ee740628a replace Client.create_empty_dirnode() with create_dirnode(), in anticipation
of adding initial_children= argument.

Includes stubbed-out initial_children= support.
2009-10-12 15:45:06 -07:00
Brian Warner
0d5dc51617 Overhaul IFilesystemNode handling, to simplify tests and use POLA internally.
* stop using IURI as an adapter
* pass cap strings around instead of URI instances
* move filenode/dirnode creation duties from Client to new NodeMaker class
* move other Client duties to KeyGenerator, SecretHolder, History classes
* stop passing Client reference to dirnode/filenode constructors
  - pass less-powerful references instead, like StorageBroker or Uploader
* always create DirectoryNodes by wrapping a filenode (mutable for now)
* remove some specialized mock classes from unit tests

Detailed list of changes (done one at a time, then merged together)

always pass a string to create_node_from_uri(), not an IURI instance
always pass a string to IFilesystemNode constructors, not an IURI instance
stop using IURI() as an adapter, switch on cap prefix in create_node_from_uri()
client.py: move SecretHolder code out to a separate class
test_web.py: hush pyflakes
client.py: move NodeMaker functionality out into a separate object
LiteralFileNode: stop storing a Client reference
immutable Checker: remove Client reference, it only needs a SecretHolder
immutable Upload: remove Client reference, leave SecretHolder and StorageBroker
immutable Repairer: replace Client reference with StorageBroker and SecretHolder
immutable FileNode: remove Client reference
mutable.Publish: stop passing Client
mutable.ServermapUpdater: get StorageBroker in constructor, not by peeking into Client reference
MutableChecker: reference StorageBroker and History directly, not through Client
mutable.FileNode: removed unused indirection to checker classes
mutable.FileNode: remove Client reference
client.py: move RSA key generation into a separate class, so it can be passed to the nodemaker
move create_mutable_file() into NodeMaker
test_dirnode.py: stop using FakeClient mockups, use NoNetworkGrid instead. This simplifies the code, but takes longer to run (17s instead of 6s). This should come down later when other cleanups make it possible to use simpler (non-RSA) fake mutable files for dirnode tests.
test_mutable.py: clean up basedir names
client.py: move create_empty_dirnode() into NodeMaker
dirnode.py: get rid of DirectoryNode.create
remove DirectoryNode.init_from_uri, refactor NodeMaker for customization, simplify test_web's mock Client to match
stop passing Client to DirectoryNode, make DirectoryNode.create_with_mutablefile the normal DirectoryNode constructor, start removing client from NodeMaker
remove Client from NodeMaker
move helper status into History, pass History to web.Status instead of Client
test_mutable.py: fix minor typo
2009-08-15 04:28:46 -07:00
Brian Warner
531cc7899f rename NewDirectoryNode to DirectoryNode, NewDirectoryURI to DirectoryURI 2009-07-17 17:15:49 -05:00
Brian Warner
8536db4e64 interfaces: remove dead code, FileNode_ and EncryptedThing constraints 2009-07-17 17:11:39 -05:00
Brian Warner
d8ba8c2eb5 Allow tests to pass with -OO by turning some AssertionErrors (the ones that
we actually exercise during tests) into more specific exceptions, so they
don't get optimized away. The best rule to follow is probably this: if an
exception is worth testing, then it's part of the API, and AssertionError
should never be part of the API. Closes #749.
2009-07-14 23:45:10 -07:00
Brian Warner
ef1b6ae8e3 Tolerate unknown URI types in directory structures. Part of #683.
The idea is that future versions of Tahoe will add new URI types that this
version won't recognize, but might store them in directories that we *can*
read. We should handle these "objects from the future" as best we can.
Previous releases of Tahoe would just explode. With this change, we'll
continue to be able to work with everything else in the directory.

The code change is to wrap anything we don't recognize as an UnknownNode
instance (as opposed to a FileNode or DirectoryNode). Then webapi knows how
to render these (mostly by leaving fields blank), deep-check knows to skip
over them, deep-stats counts them in "count-unknown". You can rename and
delete these things, but you can't add new ones (because we wouldn't know how
to generate a readcap to put into the dirnode's rocap slot, and because this
lets us catch typos better).
2009-07-02 18:07:49 -07:00
Brian Warner
6237aeabd7 create_node_from_uri: take both writecap+readcap, move logic out of dirnode.py 2009-07-02 15:25:37 -07:00
Brian Warner
3dedfed9de interfaces.py: wrap some lines to 80cols 2009-07-01 18:57:28 -07:00
Brian Warner
4194565b3d interfaces.py: improve ICheckAndRepairResults docs a bit 2009-06-30 17:19:25 -07:00
Brian Warner
bd6ecc9f44 Split out NoSharesError, stop adding attributes to NotEnoughSharesError, change humanize_failure to include the original exception string, update tests, behave better if humanize_failure fails. 2009-06-24 19:17:07 -07:00
Brian Warner
8df15e9f30 big rework of introducer client: change local API, split division of responsibilites better, remove old-code testing, improve error logging 2009-06-22 19:10:47 -07:00
Brian Warner
711c09bc5d clean up storage_broker interface: should fix #732 2009-06-21 16:51:19 -07:00
Brian Warner
4177a3616b remove plaintext-hashing code from the helper interface, to close #722
and deny the Helper the ability to mount a partial-information-guessing
attack. This will probably break compatibility between new clients and very
old (pre-1.0) helpers.
2009-06-01 15:49:16 -07:00
Brian Warner
c9803d5217 switch all foolscap imports to use foolscap.api or foolscap.logging 2009-05-21 17:38:23 -07:00
Brian Warner
67571eb033 add more information to NotEnoughSharesError, split out new exceptions for no-servers and no-source-of-ueb-hash 2009-03-03 19:37:15 -07:00
Brian Warner
2810de32b1 test_web: add (disabled) test to see what happens when deep-check encounters an unrecoverable directory. We still need code changes to improve this behavior. 2009-02-24 15:40:17 -07:00
Brian Warner
2346d8621d interfaces.py: allow add/renew/cancel-lease to return Any, so that 1.3.1 clients (the first to use these calls) can tolerate future storage servers which might return something other than None 2009-02-18 13:29:03 -07:00
Brian Warner
bce4a5385b add --add-lease to 'tahoe check', 'tahoe deep-check', and webapi. 2009-02-17 19:32:43 -07:00
Brian Warner
e9563ebc02 change RIStorageServer.remote_add_lease to exit silently in case of no-such-bucket, instead of raising IndexError, because that makes the upcoming --add-lease feature faster and less noisy 2009-02-17 19:30:53 -07:00
Brian Warner
0e78b2587c interfaces.py: document behavior of add_lease/renew_lease/cancel_lease, before I change it 2009-02-17 13:48:09 -07:00
Brian Warner
8ff76c6269 interfaces.py: minor docstring edit 2009-02-16 14:58:16 -07:00
Brian Warner
13a3ef5ec1 #620: storage: allow mutable shares to be deleted, with a writev where new_length=0 2009-02-10 23:37:56 -07:00
Brian Warner
d8b3505cf5 filenode: add get_repair_cap(), which uses the read-write filecap for immutable files, and the verifycap for immutable files 2009-01-22 21:38:36 -07:00
Brian Warner
bf56e2bb51 deep-check-and-repair: improve results and their HTML representation 2009-01-12 18:56:19 -07:00
Zooko O'Whielacronx
25063688b4 immutable repairer
This implements an immutable repairer by marrying a CiphertextDownloader to a CHKUploader.  It extends the IDownloadTarget interface so that the downloader can provide some metadata that the uploader requires.
The processing is incremental -- it uploads the first segments before it finishes downloading the whole file.  This is necessary so that you can repair large files without running out of RAM or using a temporary file on the repairer.
It requires only a verifycap, not a readcap.  That is: it doesn't need or use the decryption key, only the integrity check codes.
There are several tests marked TODO and several instances of XXX in the source code.  I intend to open tickets to document further improvements to functionality and testing, but the current version is probably good enough for Tahoe-1.3.0.
2009-01-12 11:00:22 -07:00
Zooko O'Whielacronx
b496eba072 trivial: minor changes to in-line comments -- mark plaintext-hash-tree as obsolete 2009-01-10 14:56:01 -07:00
Zooko O'Whielacronx
0f9c11cfde immutable: fix edit-o in interfaces.py documentation introduced in recent patch 2009-01-10 12:54:08 -07:00
Zooko O'Whielacronx
6e3396fb88 immutable: redefine the "sharemap" member of the upload results to be a map from shnum to set of serverids
It used to be a map from shnum to a string saying "placed this share on XYZ server".  The new definition is more in keeping with the "sharemap" object that results from immutable file checking and repair, and it is more useful to the repairer, which is a consumer of immutable upload results.
2009-01-10 11:46:23 -07:00
Zooko O'Whielacronx
157e365d2b naming: Rename a few things which I touched or changed in the recent patch to download-without-decrypting.
Rename "downloadable" to "target".
Rename "u" to "v" in FileDownloader.__init__().
Rename "_uri" to "_verifycap" in FileDownloader.
Rename "_downloadable" to "_target" in FileDownloader.
Rename "FileDownloader" to "CiphertextDownloader".
2009-01-08 12:13:07 -07:00
Zooko O'Whielacronx
2e762f39f6 immutable: define a new interface IImmutableFileURI and declare that CHKFileURI and LiteralFileURI provide it 2009-01-07 12:24:51 -07:00
Zooko O'Whielacronx
c85f75bb08 immutable: refactor uploader to do just encoding-and-uploading, not encryption
This makes Uploader take an EncryptedUploadable object instead of an Uploadable object.  I also changed it to return a verify cap instead of a tuple of the bits of data that one finds in a verify cap.
This will facilitate hooking together an Uploader and a Downloader to make a Repairer.
Also move offloaded.py into src/allmydata/immutable/.
2009-01-06 21:48:22 -07:00
Zooko O'Whielacronx
81add135dc trivial: whitespace and docstring tidyups 2009-01-06 21:41:04 -07:00
Zooko O'Whielacronx
5e6f90a015 rename "checker results" to "check results", because it is more parallel to "check-and-repair results" 2009-01-06 13:37:03 -07:00
Zooko O'Whielacronx
778167c2b1 immutable: refactor downloader to be more reusable for checker/verifier/repairer (and better)
The code for validating the share hash tree and the block hash tree has been rewritten to make sure it handles all cases, to share metadata about the file (such as the share hash tree, block hash trees, and UEB) among different share downloads, and not to require hashes to be stored on the server unnecessarily, such as the roots of the block hash trees (not needed since they are also the leaves of the share hash tree), and the root of the share hash tree (not needed since it is also included in the UEB).  It also passes the latest tests including handling corrupted shares well.
  
ValidatedReadBucketProxy takes a share_hash_tree argument to its constructor, which is a reference to a share hash tree shared by all ValidatedReadBucketProxies for that immutable file download.
  
ValidatedReadBucketProxy requires the block_size and share_size to be provided in its constructor, and it then uses those to compute the offsets and lengths of blocks when it needs them, instead of reading those values out of the share.  The user of ValidatedReadBucketProxy therefore has to have first used a ValidatedExtendedURIProxy to compute those two values from the validated contents of the URI.  This is pleasingly simplifies safety analysis: the client knows which span of bytes corresponds to a given block from the validated URI data, rather than from the unvalidated data stored on the storage server.  It also simplifies unit testing of verifier/repairer, because now it doesn't care about the contents of the "share size" and "block size" fields in the share.  It does not relieve the need for share data v2 layout, because we still need to store and retrieve the offsets of the fields which come after the share data, therefore we still need to use share data v2 with its 8-byte fields if we want to store share data larger than about 2^32.
  
Specify which subset of the block hashes and share hashes you need while downloading a particular share.  In the future this will hopefully be used to fetch only a subset, for network efficiency, but currently all of them are fetched, regardless of which subset you specify.
  
ReadBucketProxy hides the question of whether it has "started" or not (sent a request to the server to get metadata) from its user.

Download is optimized to do as few roundtrips and as few requests as possible, hopefully speeding up download a bit.
2009-01-05 09:51:45 -07:00
Zooko O'Whielacronx
3a47031a51 immutable: more detailed tests for checker/verifier/repairer
There are a lot of different ways that a share could be corrupted, or that attempting to download it might fail.  These tests attempt to exercise many of those ways and require the checker/verifier/repairer to handle each kind of failure well.
2008-12-31 14:18:38 -07:00
Zooko O'Whielacronx
872e4fc84d doc: sundry amendments to docs and in-line code comments 2008-12-28 16:59:54 -07:00
Zooko O'Whielacronx
8b7ce325d7 immutable, checker, and tests: improve docstrings, assertions, tests
No functional changes, but remove unused code, improve or fix docstrings, etc.
2008-12-21 15:07:52 -07:00
Zooko O'Whielacronx
471e1f1b9b try to tidy up uri-as-string vs. uri-as-object
I get confused about whether a given argument or return value is a uri-as-string or uri-as-object.  This patch adds a lot of assertions that it is one or the other, and also changes CheckerResults to take objects not strings.
In the future, I hope that we generally use Python objects except when importing into or exporting from the Python interpreter e.g. over the wire, the UI, or a stored file.
2008-12-19 08:39:24 -07:00
Zooko O'Whielacronx
7b285ebcb1 immutable: remove the last bits of code (only test code or unused code) which did something with plaintext hashes or plaintext hash trees 2008-12-19 08:18:07 -07:00
Zooko O'Whielacronx
c456ff8591 rename "get_verifier()" to "get_verify_cap()" 2008-12-08 12:44:11 -07:00
Brian Warner
fb9af2c7a0 MutableFileNode.modify: pass first_time= and servermap= to the modifier callback 2008-12-05 22:07:10 -07:00
Zooko O'Whielacronx
b315619d6b download: refactor handling of URI Extension Block and crypttext hash tree, simplify things
Refactor into a class the logic of asking each server in turn until one of them gives an answer 
that validates.  It is called ValidatedThingObtainer.

Refactor the downloading and verification of the URI Extension Block into a class named 
ValidatedExtendedURIProxy.

The new logic of validating UEBs is minimalist: it doesn't require the UEB to contain any 
unncessary information, but of course it still accepts such information for backwards 
compatibility (so that this new download code is able to download files uploaded with old, and 
for that matter with current, upload code).

The new logic of validating UEBs follows the practice of doing all validation up front.  This 
practice advises one to isolate the validation of incoming data into one place, so that all of 
the rest of the code can assume only valid data.

If any redundant information is present in the UEB+URI, the new code cross-checks and asserts 
that it is all fully consistent.  This closes some issues where the uploader could have 
uploaded inconsistent redundant data, which would probably have caused the old downloader to 
simply reject that download after getting a Python exception, but perhaps could have caused 
greater harm to the old downloader.

I removed the notion of selecting an erasure codec from codec.py based on the string that was 
passed in the UEB.  Currently "crs" is the only such string that works, so 
"_assert(codec_name == 'crs')" is simpler and more explicit.  This is also in keeping with the 
"validate up front" strategy -- now if someone sets a different string than "crs" in their UEB, 
the downloader will reject the download in the "validate this UEB" function instead of in a 
separate "select the codec instance" function.

I removed the code to check plaintext hashes and plaintext Merkle Trees.  Uploaders do not 
produce this information any more (since it potentially exposes confidential information about 
the file), and the unit tests for it were disabled.  The downloader before this patch would 
check that plaintext hash or plaintext merkle tree if they were present, but not complain if 
they were absent.  The new downloader in this patch complains if they are present and doesn't 
check them.  (We might in the future re-introduce such hashes over the plaintext, but encrypt 
the hashes which are stored in the UEB to preserve confidentiality.  This would be a double-
check on the correctness of our own source code -- the current Merkle Tree over the ciphertext 
is already sufficient to guarantee the integrity of the download unless there is a bug in our 
Merkle Tree or AES implementation.) 

This patch increases the lines-of-code count by 8 (from 17,770 to 17,778), and reduces the 
uncovered-by-tests lines-of-code count by 24 (from 1408 to 1384).  Those numbers would be more 
meaningful if we omitted src/allmydata/util/ from the test-coverage statistics.
2008-12-05 08:17:54 -07:00