Mirror of https://github.com/tahoe-lafs/tahoe-lafs.git (synced 2024-12-19)

commit e0a18d12af (parent 44f68c55a9)

    globally search and replace "mesh" with "grid" and adjust description of
    the effect of NAT on the topology

README: 18 lines changed
@@ -1,11 +1,11 @@
 Welcome to the AllMyData "tahoe" project. This project implements a
-secure, distributed, fault-tolerant storage mesh.
+secure, distributed, fault-tolerant storage grid.
 
-The basic idea is that the data in this storage mesh is spread over all
+The basic idea is that the data in this storage grid is spread over all
 participating nodes, using an algorithm that can recover the data even if a
 majority of the nodes are no longer available.
 
-The interface to the storage mesh allows you to store and fetch files, either
+The interface to the storage grid allows you to store and fetch files, either
 by self-authenticating cryptographic identifier or by filename and path.
 
 
@@ -157,12 +157,12 @@ RUNNING:
 directory, inside of which you can add files to configure and control the
 node. Nodes also read and write files within that directory.
 
-A mesh consists of a single central 'introducer and vdrive' node and a large
-number of 'client' nodes. If you are joining an existing mesh, the
+A grid consists of a single central 'introducer and vdrive' node and a large
+number of 'client' nodes. If you are joining an existing grid, the
 introducer-and-vdrive node will already be running, and you'll just need to
-create a client node. If you're creating a brand new mesh, you'll need to
+create a client node. If you're creating a brand new grid, you'll need to
 create both an introducer-and-vdrive and a client (and then invite other
-people to create their own client nodes and join your mesh).
+people to create their own client nodes and join your grid).
 
 The introducer (-and-vdrive) node is constructed by running 'allmydata-tahoe
 create-introducer --basedir $HERE'. Once constructed, you can start the
@@ -177,7 +177,7 @@ RUNNING:
 from the introducer into this new directory, then run 'allmydata-tahoe start
 --basedir $HERE'. After that, the client node should be off and running. The
 first thing it will do is connect to the introducer and introduce itself to
-all other nodes on the mesh. You can follow its progress by looking at the
+all other nodes on the grid. You can follow its progress by looking at the
 $HERE/twistd.log file.
 
 To actually use the client, enable the web interface by writing a port
@@ -193,6 +193,6 @@ RUNNING:
 To stop it again, use 'make stop-client'. Similar makefile targets exist for
 making and running an introducer node.
 
-There is a public mesh available for testing. Look at the wiki page
+There is a public grid available for testing. Look at the wiki page
 (http://allmydata.org) for the necessary .furl data.
 
@@ -3,14 +3,14 @@
 
 OVERVIEW
 
-The high-level view of this system consists of three layers: the mesh, the
+The high-level view of this system consists of three layers: the grid, the
 virtual drive, and the application that sits on top.
 
-The lowest layer is the "mesh", basically a DHT (Distributed Hash Table)
+The lowest layer is the "grid", basically a DHT (Distributed Hash Table)
 which maps URIs to data. The URIs are relatively short ascii strings
 (currently about 140 bytes), and each is used as references to an immutable
 arbitrary-length sequence of data bytes. This data is distributed around the
-mesh in a large number of nodes, such that a statistically unlikely number
+grid in a large number of nodes, such that a statistically unlikely number
 of nodes would have to be unavailable for the data to become unavailable.
 
 The middle layer is the virtual drive: a tree-shaped data structure in which
@@ -32,9 +32,9 @@ actual code present in the current release. Please take a look at roadmap.txt
 to get an idea of how much of this has been implemented so far.
 
 
-THE BIG MESH OF PEERS
+THE BIG GRID OF PEERS
 
-Underlying the mesh is a large collection of peer nodes. These are processes
+Underlying the grid is a large collection of peer nodes. These are processes
 running on a wide variety of computers, all of which know about each other in
 some way or another. They establish TCP connections to one another using
 Foolscap, an encrypted+authenticated remote message passing library (using
@@ -47,7 +47,7 @@ that would cause it to consume more space than it wants to provide. When a
 lease expires, the data is deleted. Peers might renew their leases.
 
 This storage is used to hold "shares", which are themselves used to store
-files in the mesh. There are many shares for each file, typically around 100
+files in the grid. There are many shares for each file, typically around 100
 (the exact number depends upon the tradeoffs made between reliability,
 overhead, and storage space consumed). The files are indexed by a piece of
 the URI called the "verifierid", which is derived from the contents of the
@@ -64,16 +64,19 @@ In this release, peers learn about each other through the "introducer". Each
 peer connects to this central introducer at startup, and receives a list of
 all other peers from it. Each peer then connects to all other peers, creating
 a fully-connected topology. Future versions will reduce the number of
-connections considerably, to enable the mesh to scale to larger sizes: the
+connections considerably, to enable the grid to scale to larger sizes: the
 design target is one million nodes. In addition, future versions will offer
 relay and NAT-traversal services to allow nodes without full internet
-connectivity to participate. In the current release, only one node may be
-behind a NAT box and still permit the mesh to achieve full connectivity.
+connectivity to participate. In the current release, nodes behind NAT boxes
+will connect to all nodes that they can open connections to, but they cannot
+open connections to other nodes behind NAT boxes. Therefore, the more nodes
+there are behind NAT boxes the less the topology resembles the intended
+fully-connected mesh topology.
 
 
 FILE ENCODING
 
-When a file is to be added to the mesh, it is first encrypted using a key
+When a file is to be added to the grid, it is first encrypted using a key
 that is derived from the hash of the file itself. The encrypted file is then
 broken up into segments so it can be processed in small pieces (to minimize
 the memory footprint of both encode and decode operations, and to increase
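As an aside, the encoding step described in the hunk above (a key derived from
the hash of the file itself, a scheme commonly called convergent encryption,
followed by segmentation) can be sketched in a few lines of Python. The names,
the choice of SHA-256, and the segment size are illustrative assumptions, not
the release's actual parameters:

    import hashlib

    SEGMENT_SIZE = 1024 * 1024  # illustrative segment size, not the real default

    def derive_convergent_key(plaintext: bytes) -> bytes:
        # The encryption key is a hash of the file itself, so identical
        # files encrypt to identical ciphertexts.
        return hashlib.sha256(plaintext).digest()

    def segments(ciphertext: bytes, size: int = SEGMENT_SIZE):
        # Break the encrypted file into pieces small enough that encoding
        # and decoding never need the whole file in memory at once.
        for offset in range(0, len(ciphertext), size):
            yield ciphertext[offset:offset + size]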
@@ -131,9 +134,9 @@ it to validate that these potential bytes are indeed the ones that you were
 looking for.
 
 URIs refer to an immutable set of bytes. If you modify a file and upload the
-new version to the mesh, you will get a different URI. URIs do not represent
+new version to the grid, you will get a different URI. URIs do not represent
 filenames at all, just the data that a filename might point to at some given
-point in time. This is why the "mesh" layer is insufficient to provide a
+point in time. This is why the "grid" layer is insufficient to provide a
 virtual drive: an actual filesystem requires human-meaningful names and
 mutability, while URIs provide neither. URIs sit on the "global+secure" edge
 of Zooko's Triangle[1]. They are self-authenticating, meaning that nobody can
@@ -165,7 +168,7 @@ which ones? The "peer selection" algorithm is used to make this choice.
 In the current version, the verifierid is used to consistently-permute the
 set of all peers (by sorting the peers by HASH(verifierid+peerid)). Each file
 gets a different permutation, which (on average) will evenly distribute
-shares among the mesh and avoid hotspots.
+shares among the grid and avoid hotspots.
 
 This permutation places the peers around a 2^256-sized ring, like the rim of
 a big clock. The 100-or-so shares are then placed around the same ring (at 0,
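The consistent-permutation idea in the hunk above fits in a short sketch;
SHA-256 stands in for whatever hash the release actually uses, and the names
are made up for illustration:

    import hashlib

    def permuted_peers(verifierid: bytes, peerids):
        # Each file's verifierid induces its own stable ordering of the same
        # peer set, so shares spread evenly across the grid and no single
        # peer becomes a hotspot for every upload.
        return sorted(peerids,
                      key=lambda peerid: hashlib.sha256(verifierid + peerid).digest())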
@@ -188,7 +191,7 @@ shares we were able to place. The current parameters try to place 100 shares,
 of which 25 must be retrievable to recover the file, and the peer selection
 algorithm is happy if it was able to place at least 75 shares. These numbers
 are adjustable: 25-out-of-100 means an expansion factor of 4x (every file in
-the mesh consumes four times as much space when totalled across all
+the grid consumes four times as much space when totalled across all
 StorageServers), but is highly reliable (the actual reliability is a binomial
 distribution function of the expected availability of the individual peers,
 but in general it goes up very quickly with the expansion factor).
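The "binomial distribution function" mentioned above can be made concrete with
a small sketch, assuming (purely for illustration) that each peer is
independently available with probability p:

    from math import comb

    def survival_probability(k: int, n: int, p: float) -> float:
        # Probability that at least k of the n shares sit on available peers,
        # i.e. that the file can still be reconstructed.
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

    # With the default 25-out-of-100 encoding and 90%-available peers,
    # survival_probability(25, 100, 0.9) is indistinguishable from 1.0.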
@@ -217,11 +220,11 @@ routine will eventually find all the peers that have shares, and will find
 them quickly if there is significant overlap between the set of peers that
 were present when the file was uploaded and the set of peers that are present
 as it is downloaded (i.e. if the "peerlist stability" is high). Some limits
-may be imposed in large meshes to avoid querying a million peers; this
+may be imposed in large grids to avoid querying a million peers; this
 provides a tradeoff between the work spent to discover that a file is
 unrecoverable and the probability that a retrieval will fail when it could
 have succeeded if we had just tried a little bit harder. The appropriate
-value of this tradeoff will depend upon the size of the mesh, and will change
+value of this tradeoff will depend upon the size of the grid, and will change
 over time.
 
 Other peer selection algorithms are being evaluated. One of them (known as
@@ -328,7 +331,7 @@ is to finish up with a full set of 100 shares.
 
 There are a number of engineering issues to be resolved here. The bandwidth,
 disk IO, and CPU time consumed by the verification/repair process must be
-balanced against the robustness that it provides to the mesh. The nodes
+balanced against the robustness that it provides to the grid. The nodes
 involved in repair will have very different access patterns than normal
 nodes, such that these processes may need to be run on hosts with more memory
 or network connectivity than usual. The frequency of repair runs directly
@@ -366,7 +369,7 @@ match the originally uploaded data) is provided by the hashes embedded the
 URI. Data security (the promise that the data is only readable by people with
 the URI) is provided by the encryption key embedded in the URI. Data
 availability (the hope that data which has been uploaded in the past will be
-downloadable in the future) is provided by the mesh, which distributes
+downloadable in the future) is provided by the grid, which distributes
 failures in a way that reduces the correlation between individual node
 failure and overall file recovery failure.
 
@@ -425,7 +428,7 @@ different goals. Each choice results in a number of properties; there are
 many tradeoffs.
 
 First, some terms: the erasure-coding algorithm is described as K-out-of-N
-(for this release, the default values are K=25 and N=100). Each mesh will
+(for this release, the default values are K=25 and N=100). Each grid will
 have some number of peers; this number will rise and fall over time as peers
 join, drop out, come back, and leave forever. Files are of various sizes,
 some are popular, others are rare. Peers have various capacities, variable
@@ -438,7 +441,7 @@ segments of a given maximum size, which affects memory usage.
 
 The ratio of N/K is the "expansion factor". Higher expansion factors improve
 reliability very quickly (the binomial distribution curve is very sharp), but
-consumes much more mesh capacity. The absolute value of K affects the
+consumes much more grid capacity. The absolute value of K affects the
 granularity of the binomial curve (1-out-of-2 is much worse than
 50-out-of-100), but high values asymptotically approach a constant that
 depends upon 'P' (i.e. 500-of-1000 is not much better than 50-of-100).
@@ -452,11 +455,11 @@ will take out all of them at once. The "Sybil Attack" is where a single
 attacker convinces you that they are actually multiple servers, so that you
 think you are using a large number of independent peers, but in fact you have
 a single point of failure (where the attacker turns off all their machines at
-once). Large meshes, with lots of truly-independent peers, will enable the
+once). Large grids, with lots of truly-independent peers, will enable the
 use of lower expansion factors to achieve the same reliability, but increase
 overhead because each peer needs to know something about every other, and the
 rate at which peers come and go will be higher (requiring network maintenance
-traffic). Also, the File Repairer work will increase with larger meshes,
+traffic). Also, the File Repairer work will increase with larger grids,
 although then the job can be distributed out to more peers.
 
 Higher values of N increase overhead: more shares means more Merkle hashes
@@ -467,10 +470,10 @@ must be held in memory while erasure coding runs) and increases "alacrity"
 to the target sooner), but also increase overhead (because more blocks means
 more Merkle hashes to validate them).
 
-In general, small private meshes should work well, but the participants will
-have to decide between storage overhead and reliability. Large stable meshes
+In general, small private grids should work well, but the participants will
+have to decide between storage overhead and reliability. Large stable grids
 will be able to reduce the expansion factor down to a bare minimum while
-still retaining high reliability, but large unstable meshes (where nodes are
+still retaining high reliability, but large unstable grids (where nodes are
 coming and going very quickly) may require more repair/verification bandwidth
 than actual upload/download traffic.
 
relnotes.txt: 20 lines changed
@@ -1,6 +1,6 @@
 Allmydata, Inc. [1], provider of the "Allmydata" consumer backup product, is
 pleased announce the first public release of "Tahoe", a secure, distributed
-storage mesh with a free-software licence.
+storage grid with a free-software licence.
 
 The source code that we are releasing is the current working prototype for
 Allmydata's next-generation product. This release is targeted at hackers who
@@ -9,7 +9,7 @@ interface.
 
 This prototype is not recommended for storage of confidential data nor for
 data which is not otherwise backed up, but it already implements a functional
-distributed storage mesh and is useful for experimentation, prototyping, and
+distributed storage grid and is useful for experimentation, prototyping, and
 extension.
 
 
@@ -26,8 +26,8 @@ USAGE
 
 Once installed, create a "client node" as described in the README. Instruct
 this client node to connect to a specific "introducer node" by means of
-config files in the client node's working directory. To join a public mesh,
-copy in the .furl files for that mesh. To create a private mesh, run your
+config files in the client node's working directory. To join a public grid,
+copy in the .furl files for that grid. To create a private grid, run your
 own introducer, and copy its .furl files.
 
 Each client node runs a local webserver (enabled by writing the desired port
@@ -82,14 +82,14 @@ mapping from pathnames/filenames to URIs.
 
 We are well aware of the limitations of decentralization and scalability
 inherent in this prototype. In particular, the completely-connected property
-of the mesh and the requirement of a single distinct introducer and vdrive
-server limits the possible size of the mesh. We have plans to loosen these
+of the grid and the requirement of a single distinct introducer and vdrive
+server limits the possible size of the grid. We have plans to loosen these
 limitations (see roadmap.txt [5]). Currently it should be noted that the
-mesh already depends as little as possible on the accessibility and
+grid already depends as little as possible on the accessibility and
 correctness of the introduction server and the vdrive server. Also note that
 the choice of which servers to use is easily configured -- you should be able
-to set up a private mesh for you and your friends almost as easily as to
-connect to our public test mesh.
+to set up a private grid for you and your friends almost as easily as to
+connect to our public test grid.
 
 
 SOFTWARE ARCHITECTURE
@@ -99,7 +99,7 @@ consumer backup service. It is primarily written in Python.
 
 It uses the Foolscap library [10] which provides a remote object protocol
 inspired by the capability-secure "E" programming language [11]. Foolscap
-allows us to express the intended behavior of the distributed mesh directly
+allows us to express the intended behavior of the distributed grid directly
 in object-oriented terms while relying on a well-engineered, secure transport
 layer.
 
setup.py: 8 lines changed
@@ -1,6 +1,6 @@
 #! /usr/bin/env python
 
-# Allmydata Tahoe -- secure, distributed storage mesh
+# Allmydata Tahoe -- secure, distributed storage grid
 #
 # Copyright (C) 2007 Allmydata, Inc.
 #
@@ -57,11 +57,11 @@ trove_classifiers=[
 
 setup(name='allmydata-tahoe',
 version='0.1.4b2',
-description='secure, distributed storage mesh',
+description='secure, distributed storage grid',
 long_description="""Welcome to the AllMyData "tahoe" project. This project implements a
-secure, distributed, fault-tolerant storage mesh.
+secure, distributed, fault-tolerant storage grid.
 
-The basic idea is that the data in this storage mesh is spread over all
+The basic idea is that the data in this storage grid is spread over all
 participating nodes, using an algorithm that can recover the data even if a
 majority of the nodes are no longer available.""",
 author='Allmydata, Inc.',
@@ -1,5 +1,5 @@
 """
-Decentralized storage mesh.
+Decentralized storage grid.
 
 maintainer web site: U{http://allmydata.com/}
 
@@ -14,7 +14,7 @@ from allmydata.interfaces import IEncoder
 
 The goal of the encoder is to turn the original file into a series of
 'shares'. Each share is going to a 'shareholder' (nominally each shareholder
-is a different host, but for small meshes there may be overlap). The number
+is a different host, but for small grids there may be overlap). The number
 of shares is chosen to hit our reliability goals (more shares on more
 machines means more reliability), and is limited by overhead (proportional to
 numshares or log(numshares)) and the encoding technology in use (Reed-Solomon
@@ -135,7 +135,7 @@ class _DirectorySubTree(object):
 # self.populate_from_node must be defined by the subclass (CHK or
 # SSK), since it controls how the spec is interpreted. It will
 # probably use the contents of the node to figure out what to
-# download from the mesh, then pass this downloaded serialized data
+# download from the grid, then pass this downloaded serialized data
 # to populate_from_data()
 raise NotImplementedError
 
@@ -106,7 +106,7 @@ class ISubTree(Interface):
 
 Each subtree's populate_from_node() method is expected to use the
 downloader to obtain a file with the subtree's serialized contents
-(probably by pulling data from some source, like the mesh, the vdrive
+(probably by pulling data from some source, like the grid, the vdrive
 server, an HTTP server, or somewhere on the local filesystem), then
 unserialize them and populate the subtree's state.
 
@@ -175,7 +175,7 @@ class ISubTree(Interface):
 def serialize_subtree_to_file(f):
 """Create a string which describes my structure and write it to the
 given filehandle (using only .write()). This string should be
-suitable for uploading to the mesh or storing in a local file."""
+suitable for uploading to the grid or storing in a local file."""
 
 def update_now(uploader):
 """Perform whatever work is necessary to record this subtree to
@@ -185,7 +185,7 @@ class ISubTree(Interface):
 the subtree has been persisted.
 
 For directory subtrees, this will cause the subtree to serialize
-itself to a file, then upload this file to the mesh, then create an
+itself to a file, then upload this file to the grid, then create an
 INode-providing instance which describes where the file wound up. For
 redirections, this will cause the subtree to modify the redirection's
 persistent storage, then return the (unmodified) INode that describes
@@ -512,7 +512,7 @@ class IWorkQueue(Interface):
 generated uri."""
 
 def add_upload_chk(source_filename, stash_uri_in_boxname):
-"""This step uploads a file to the mesh and obtains a content-based
+"""This step uploads a file to the grid and obtains a content-based
 URI which can be used to later retrieve the same contents ('CHK'
 mode). This URI includes unlink rights. It does not mark the file for
 retention.
@@ -526,7 +526,7 @@ class IWorkQueue(Interface):
 """
 
 def add_upload_ssk(write_capability, previous_version, source_filename):
-"""This step uploads a file to the mesh in a way that replaces the
+"""This step uploads a file to the grid in a way that replaces the
 previous version and does not require a change to the ID referenced
 by the parent.
 """
@@ -618,15 +618,15 @@ class NotCapableError(Exception):
 
 class RIControlClient(RemoteInterface):
 def upload_from_file_to_uri(filename=str):
-"""Upload a file to the mesh. This accepts a filename (which must be
+"""Upload a file to the grid. This accepts a filename (which must be
 absolute) that points to a file on the node's local disk. The node
-will read the contents of this file, upload it to the mesh, then
+will read the contents of this file, upload it to the grid, then
 return the URI at which it was uploaded.
 """
 return URI
 
 def download_from_uri_to_file(uri=URI, filename=str):
-"""Download a file from the mesh, placing it on the node's local disk
+"""Download a file from the grid, placing it on the node's local disk
 at the given filename (which must be absolute[?]). Returns the
 absolute filename where the file was written."""
 return str
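For orientation, here is a hedged sketch of how the two RIControlClient methods
above might be exercised, assuming you already hold a Foolscap RemoteReference
(rref) to a node's control client; the helper name and paths are made up for
illustration:

    def upload_then_download(rref, source_path, dest_path):
        # Both paths must be absolute, per the docstrings above.
        d = rref.callRemote("upload_from_file_to_uri", filename=source_path)
        # The Deferred fires with the new URI; feed it back in to fetch the
        # same contents onto local disk.
        d.addCallback(lambda uri: rref.callRemote("download_from_uri_to_file",
                                                  uri=uri, filename=dest_path))
        return d  # a Twisted Deferred firing with the filename that was written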
@@ -11,7 +11,7 @@ from allmydata.interfaces import IDownloader, IUploader
 from allmydata import workqueue
 from cStringIO import StringIO
 
-class FakeMesh(object):
+class FakeGrid(object):
 implements(IDownloader, IUploader)
 
 """
@@ -333,7 +333,7 @@ class Utils(unittest.TestCase):
 pairs = list(directory.in_pairs(l))
 self.failUnlessEqual(pairs, [(0,1), (2,3), (4,5), (6,7)])
 
-class FakeMesh(object):
+class FakeGrid(object):
 implements(IDownloader, IUploader)
 debug = False
 
@@ -343,7 +343,7 @@ class FakeMesh(object):
 def upload(self, uploadable):
 uri = "stub-uri-%d" % len(self.files)
 if self.debug:
-print "FakeMesh.upload -> %s" % uri
+print "FakeGrid.upload -> %s" % uri
 assert upload.IUploadable.providedBy(uploadable)
 f = uploadable.get_filehandle()
 data = f.read()
@@ -353,17 +353,17 @@ class FakeMesh(object):
 
 def upload_filename(self, filename):
 if self.debug:
-print "FakeMesh.upload_filename(%s)" % filename
+print "FakeGrid.upload_filename(%s)" % filename
 return self.upload(upload.FileName(filename))
 
 def upload_data(self, data):
 if self.debug:
-print "FakeMesh.upload_data(%s)" % data
+print "FakeGrid.upload_data(%s)" % data
 return self.upload(upload.Data(data))
 
 def download(self, uri, target):
 if self.debug:
-print "FakeMesh.download(%s)" % uri
+print "FakeGrid.download(%s)" % uri
 target.open()
 target.write(self.files[uri])
 target.close()
@@ -372,16 +372,16 @@ class FakeMesh(object):
 
 class VDrive(unittest.TestCase):
 
-def makeVirtualDrive(self, basedir, root_node=None, mesh=None):
+def makeVirtualDrive(self, basedir, root_node=None, grid=None):
 wq = workqueue.WorkQueue(os.path.join("test_filetree",
 "VDrive",
 basedir, "1.workqueue"))
-if mesh:
-assert IUploader.providedBy(mesh)
-assert IDownloader.providedBy(mesh)
-dl = ul = mesh
+if grid:
+assert IUploader.providedBy(grid)
+assert IDownloader.providedBy(grid)
+dl = ul = grid
 else:
-dl = ul = FakeMesh()
+dl = ul = FakeGrid()
 if not root_node:
 root_node = directory.LocalFileSubTreeNode()
 root_node.new("rootdirtree.save")
@@ -403,17 +403,17 @@ class VDrive(unittest.TestCase):
 def makeCHKTree(self, basename):
 # create a LocalFileRedirection pointing at a CHKDirectorySubTree.
 # Returns a VirtualDrive instance.
-mesh = FakeMesh()
+grid = FakeGrid()
 topdir = directory.CHKDirectorySubTree().new()
-d = topdir.update_now(mesh)
+d = topdir.update_now(grid)
 def _updated(topnode):
 root = redirect.LocalFileRedirection()
 root.new("%s-root" % basename, topnode)
-return root.update_now(mesh)
+return root.update_now(grid)
 d.addCallback(_updated)
 d.addCallback(lambda rootnode:
 self.makeVirtualDrive("%s-vdrive" % basename,
-rootnode, mesh))
+rootnode, grid))
 return d
 
 def failUnlessListsAreEqual(self, list1, list2):
@@ -425,7 +425,7 @@ class VDrive(unittest.TestCase):
 self.failUnlessEqual(c1a, c2a)
 
 def testDirectory(self):
-stm = vdrive.SubTreeMaker(FakeMesh())
+stm = vdrive.SubTreeMaker(FakeGrid())
 
 # create an empty directory (stored locally)
 subtree = directory.LocalFileSubTree()
@@ -12,7 +12,7 @@
 
 <p>To view the global shared filestore, <a href="../vdrive">Click Here!</a></p>
 
-<h2>Mesh Status</h2>
+<h2>Grid Status</h2>
 
 <div>My nodeid: <span n:render="string" n:data="my_nodeid" /></div>
 <div>Introducer: <span n:render="string" n:data="introducer_pburl" /></div>