This will make it easier to change RIBucketWriter in the future to reduce the wire
protocol to just open/write(offset,data)/close, and do all the structuring on the
client end. The ultimate goal is to store each bucket in a single file, to reduce
the considerable filesystem-quantization/inode overhead on the storage servers.
Also include the encoder portion of Bob Ippolito's simplejson-1.7.1 as
allmydata.util.json_encoder . simplejson is distributed under a more liberal
license than Tahoe (looks to be modified BSD), so redistributing it should be ok.
To use this, write a number like 10MB or 5Gb or 5000000000 to a file
named 'sizelimit' in the client's base directory. The node will not grant
leases for shares that would take it much beyond this many bytes of
storage. Note that metadata is not included in the allocation count until
a restart, so the actual space consumed may grow beyond the limit if
the node is not restarted very frequently and the amount of metadata is
significant.
If the error occurs before any data has been sent, we can give a sensible
error message (code 500, stack trace, etc). This will cover most of the error
cases. The ones that aren't covered are when we run out of good peers after
successfully decoding the first segment, either because they go away or
because their shares are corrupt.
Previously, exceptions during a web download caused a hang rather than some
kind of exception or error message. This patch improves the situation by
terminating the HTTP download rather than letting it hang forever. The
behavior still isn't ideal, however, because the error can occur too late to
abort the HTTP request cleanly (i.e. with an error code). In fact, the
Content-Type header and response code have already been set by the time any
download errors have been detected, so the browser is committed to displaying
an image or whatever (thus any error message we put into the stream is
unlikely to be displayed in a meaningful way).
These allow client-side code to conveniently retrieve the IDirectoryNode
instances for both the global shared public root directory, and the per-user
private root directory.
The only SHA-1 hash that remains is used in the permutation of nodeids,
where we need to decide if we care about performance or long-term security.
I suspect that we could use a much weaker hash (and faster) hash for
this purpose. In the long run, we'll be doing thousands of such hashes
for each file uploaded or downloaded (one per known peer).
This (compatibility-breaking) change moves much of the validation data and
encoding parameters out of the URI and into the so-called "thingA" block
(which will get a better name as soon as we find one we're comfortable with).
The URI retains the "storage_index" (a generalized term for the role that
we're currently using the verifierid for, the unique index for each file
that gets used by storage servers to decide which shares to return), the
decryption key, the needed_shares/total_shares counts (since they affect
peer selection), and the hash of the thingA block.
This shortens the URI and lets us add more kinds of validation data without
growing the URI (like plaintext merkle trees, to enable strong incremental
plaintext validation), at the cost of maybe 150 bytes of alacrity. Each
storage server holds an identical copy of the thingA block.
This is an incompatible change: new messages have been added to the storage
server interface, and the URI format has changed drastically.
Add a new method to RIIntroducer, to allow the central introducer node to
remove peers from the active set after they've gone away. Without this,
client nodes accumulate stale peer FURLs forever. This introduces a
compatibility break, as old introducers won't know about the 'lost_peers'
message, although the errors produced are probably harmless.
Add a new method to RIIntroducer, to allow the central introducer node to
remove peers from the active set after they've gone away. Without this,
client nodes accumulate stale peer FURLs forever. This introduces a
compatibility break, as old introducers won't know about the 'lost_peers'
message, although the errors produced are probably harmless.
It does indeed take longer than 2400 seconds to run test_upload_and_download on a virtual windows machine when the underlying real machine is heavily loaded down with filesystem analysis runs...
This is a potentially disruptive and potentially ugly change to the code base,
because I renamed the object that serves in both roles from "Queen" to
"IntroducerAndVdrive", which is a bit of an ugly name.
However, I think that clarity is important enough in this release to make this
change. All unit tests pass. I'm now darcs recording this patch in order to
pull it to other machines for more testing.