globally search and replace "mesh" with "grid" and adjust description of the effect of NAT on the topology

This commit is contained in:
Zooko O'Whielacronx 2007-04-30 13:06:09 -07:00
parent 44f68c55a9
commit e0a18d12af
11 changed files with 80 additions and 77 deletions

18
README
View File

@ -1,11 +1,11 @@
Welcome to the AllMyData "tahoe" project. This project implements a
secure, distributed, fault-tolerant storage mesh.
secure, distributed, fault-tolerant storage grid.
The basic idea is that the data in this storage mesh is spread over all
The basic idea is that the data in this storage grid is spread over all
participating nodes, using an algorithm that can recover the data even if a
majority of the nodes are no longer available.
The interface to the storage mesh allows you to store and fetch files, either
The interface to the storage grid allows you to store and fetch files, either
by self-authenticating cryptographic identifier or by filename and path.
@ -157,12 +157,12 @@ RUNNING:
directory, inside of which you can add files to configure and control the
node. Nodes also read and write files within that directory.
A mesh consists of a single central 'introducer and vdrive' node and a large
number of 'client' nodes. If you are joining an existing mesh, the
A grid consists of a single central 'introducer and vdrive' node and a large
number of 'client' nodes. If you are joining an existing grid, the
introducer-and-vdrive node will already be running, and you'll just need to
create a client node. If you're creating a brand new mesh, you'll need to
create a client node. If you're creating a brand new grid, you'll need to
create both an introducer-and-vdrive and a client (and then invite other
people to create their own client nodes and join your mesh).
people to create their own client nodes and join your grid).
The introducer (-and-vdrive) node is constructed by running 'allmydata-tahoe
create-introducer --basedir $HERE'. Once constructed, you can start the
@ -177,7 +177,7 @@ RUNNING:
from the introducer into this new directory, then run 'allmydata-tahoe start
--basedir $HERE'. After that, the client node should be off and running. The
first thing it will do is connect to the introducer and introduce itself to
all other nodes on the mesh. You can follow its progress by looking at the
all other nodes on the grid. You can follow its progress by looking at the
$HERE/twistd.log file.
To actually use the client, enable the web interface by writing a port
@ -193,6 +193,6 @@ RUNNING:
To stop it again, use 'make stop-client'. Similar makefile targets exist for
making and running an introducer node.
There is a public mesh available for testing. Look at the wiki page
There is a public grid available for testing. Look at the wiki page
(http://allmydata.org) for the necessary .furl data.

View File

@ -3,14 +3,14 @@
OVERVIEW
The high-level view of this system consists of three layers: the mesh, the
The high-level view of this system consists of three layers: the grid, the
virtual drive, and the application that sits on top.
The lowest layer is the "mesh", basically a DHT (Distributed Hash Table)
The lowest layer is the "grid", basically a DHT (Distributed Hash Table)
which maps URIs to data. The URIs are relatively short ascii strings
(currently about 140 bytes), and each is used as references to an immutable
arbitrary-length sequence of data bytes. This data is distributed around the
mesh in a large number of nodes, such that a statistically unlikely number
grid in a large number of nodes, such that a statistically unlikely number
of nodes would have to be unavailable for the data to become unavailable.
The middle layer is the virtual drive: a tree-shaped data structure in which
@ -32,9 +32,9 @@ actual code present in the current release. Please take a look at roadmap.txt
to get an idea of how much of this has been implemented so far.
THE BIG MESH OF PEERS
THE BIG GRID OF PEERS
Underlying the mesh is a large collection of peer nodes. These are processes
Underlying the grid is a large collection of peer nodes. These are processes
running on a wide variety of computers, all of which know about each other in
some way or another. They establish TCP connections to one another using
Foolscap, an encrypted+authenticated remote message passing library (using
@ -47,7 +47,7 @@ that would cause it to consume more space than it wants to provide. When a
lease expires, the data is deleted. Peers might renew their leases.
This storage is used to hold "shares", which are themselves used to store
files in the mesh. There are many shares for each file, typically around 100
files in the grid. There are many shares for each file, typically around 100
(the exact number depends upon the tradeoffs made between reliability,
overhead, and storage space consumed). The files are indexed by a piece of
the URI called the "verifierid", which is derived from the contents of the
@ -64,16 +64,19 @@ In this release, peers learn about each other through the "introducer". Each
peer connects to this central introducer at startup, and receives a list of
all other peers from it. Each peer then connects to all other peers, creating
a fully-connected topology. Future versions will reduce the number of
connections considerably, to enable the mesh to scale to larger sizes: the
connections considerably, to enable the grid to scale to larger sizes: the
design target is one million nodes. In addition, future versions will offer
relay and NAT-traversal services to allow nodes without full internet
connectivity to participate. In the current release, only one node may be
behind a NAT box and still permit the mesh to achieve full connectivity.
connectivity to participate. In the current release, nodes behind NAT boxes
will connect to all nodes that they can open connections to, but they cannot
open connections to other nodes behind NAT boxes. Therefore, the more nodes
there are behind NAT boxes the less the topology resembles the intended
fully-connected mesh topology.
FILE ENCODING
When a file is to be added to the mesh, it is first encrypted using a key
When a file is to be added to the grid, it is first encrypted using a key
that is derived from the hash of the file itself. The encrypted file is then
broken up into segments so it can be processed in small pieces (to minimize
the memory footprint of both encode and decode operations, and to increase
@ -131,9 +134,9 @@ it to validate that these potential bytes are indeed the ones that you were
looking for.
URIs refer to an immutable set of bytes. If you modify a file and upload the
new version to the mesh, you will get a different URI. URIs do not represent
new version to the grid, you will get a different URI. URIs do not represent
filenames at all, just the data that a filename might point to at some given
point in time. This is why the "mesh" layer is insufficient to provide a
point in time. This is why the "grid" layer is insufficient to provide a
virtual drive: an actual filesystem requires human-meaningful names and
mutability, while URIs provide neither. URIs sit on the "global+secure" edge
of Zooko's Triangle[1]. They are self-authenticating, meaning that nobody can
@ -165,7 +168,7 @@ which ones? The "peer selection" algorithm is used to make this choice.
In the current version, the verifierid is used to consistently-permute the
set of all peers (by sorting the peers by HASH(verifierid+peerid)). Each file
gets a different permutation, which (on average) will evenly distribute
shares among the mesh and avoid hotspots.
shares among the grid and avoid hotspots.
This permutation places the peers around a 2^256-sized ring, like the rim of
a big clock. The 100-or-so shares are then placed around the same ring (at 0,
@ -188,7 +191,7 @@ shares we were able to place. The current parameters try to place 100 shares,
of which 25 must be retrievable to recover the file, and the peer selection
algorithm is happy if it was able to place at least 75 shares. These numbers
are adjustable: 25-out-of-100 means an expansion factor of 4x (every file in
the mesh consumes four times as much space when totalled across all
the grid consumes four times as much space when totalled across all
StorageServers), but is highly reliable (the actual reliability is a binomial
distribution function of the expected availability of the individual peers,
but in general it goes up very quickly with the expansion factor).
@ -217,11 +220,11 @@ routine will eventually find all the peers that have shares, and will find
them quickly if there is significant overlap between the set of peers that
were present when the file was uploaded and the set of peers that are present
as it is downloaded (i.e. if the "peerlist stability" is high). Some limits
may be imposed in large meshes to avoid querying a million peers; this
may be imposed in large grids to avoid querying a million peers; this
provides a tradeoff between the work spent to discover that a file is
unrecoverable and the probability that a retrieval will fail when it could
have succeeded if we had just tried a little bit harder. The appropriate
value of this tradeoff will depend upon the size of the mesh, and will change
value of this tradeoff will depend upon the size of the grid, and will change
over time.
Other peer selection algorithms are being evaluated. One of them (known as
@ -328,7 +331,7 @@ is to finish up with a full set of 100 shares.
There are a number of engineering issues to be resolved here. The bandwidth,
disk IO, and CPU time consumed by the verification/repair process must be
balanced against the robustness that it provides to the mesh. The nodes
balanced against the robustness that it provides to the grid. The nodes
involved in repair will have very different access patterns than normal
nodes, such that these processes may need to be run on hosts with more memory
or network connectivity than usual. The frequency of repair runs directly
@ -366,7 +369,7 @@ match the originally uploaded data) is provided by the hashes embedded the
URI. Data security (the promise that the data is only readable by people with
the URI) is provided by the encryption key embedded in the URI. Data
availability (the hope that data which has been uploaded in the past will be
downloadable in the future) is provided by the mesh, which distributes
downloadable in the future) is provided by the grid, which distributes
failures in a way that reduces the correlation between individual node
failure and overall file recovery failure.
@ -425,7 +428,7 @@ different goals. Each choice results in a number of properties; there are
many tradeoffs.
First, some terms: the erasure-coding algorithm is described as K-out-of-N
(for this release, the default values are K=25 and N=100). Each mesh will
(for this release, the default values are K=25 and N=100). Each grid will
have some number of peers; this number will rise and fall over time as peers
join, drop out, come back, and leave forever. Files are of various sizes,
some are popular, others are rare. Peers have various capacities, variable
@ -438,7 +441,7 @@ segments of a given maximum size, which affects memory usage.
The ratio of N/K is the "expansion factor". Higher expansion factors improve
reliability very quickly (the binomial distribution curve is very sharp), but
consumes much more mesh capacity. The absolute value of K affects the
consumes much more grid capacity. The absolute value of K affects the
granularity of the binomial curve (1-out-of-2 is much worse than
50-out-of-100), but high values asymptotically approach a constant that
depends upon 'P' (i.e. 500-of-1000 is not much better than 50-of-100).
@ -452,11 +455,11 @@ will take out all of them at once. The "Sybil Attack" is where a single
attacker convinces you that they are actually multiple servers, so that you
think you are using a large number of independent peers, but in fact you have
a single point of failure (where the attacker turns off all their machines at
once). Large meshes, with lots of truly-independent peers, will enable the
once). Large grids, with lots of truly-independent peers, will enable the
use of lower expansion factors to achieve the same reliability, but increase
overhead because each peer needs to know something about every other, and the
rate at which peers come and go will be higher (requiring network maintenance
traffic). Also, the File Repairer work will increase with larger meshes,
traffic). Also, the File Repairer work will increase with larger grids,
although then the job can be distributed out to more peers.
Higher values of N increase overhead: more shares means more Merkle hashes
@ -467,10 +470,10 @@ must be held in memory while erasure coding runs) and increases "alacrity"
to the target sooner), but also increase overhead (because more blocks means
more Merkle hashes to validate them).
In general, small private meshes should work well, but the participants will
have to decide between storage overhead and reliability. Large stable meshes
In general, small private grids should work well, but the participants will
have to decide between storage overhead and reliability. Large stable grids
will be able to reduce the expansion factor down to a bare minimum while
still retaining high reliability, but large unstable meshes (where nodes are
still retaining high reliability, but large unstable grids (where nodes are
coming and going very quickly) may require more repair/verification bandwidth
than actual upload/download traffic.

View File

@ -1,6 +1,6 @@
Allmydata, Inc. [1], provider of the "Allmydata" consumer backup product, is
pleased announce the first public release of "Tahoe", a secure, distributed
storage mesh with a free-software licence.
storage grid with a free-software licence.
The source code that we are releasing is the current working prototype for
Allmydata's next-generation product. This release is targeted at hackers who
@ -9,7 +9,7 @@ interface.
This prototype is not recommended for storage of confidential data nor for
data which is not otherwise backed up, but it already implements a functional
distributed storage mesh and is useful for experimentation, prototyping, and
distributed storage grid and is useful for experimentation, prototyping, and
extension.
@ -26,8 +26,8 @@ USAGE
Once installed, create a "client node" as described in the README. Instruct
this client node to connect to a specific "introducer node" by means of
config files in the client node's working directory. To join a public mesh,
copy in the .furl files for that mesh. To create a private mesh, run your
config files in the client node's working directory. To join a public grid,
copy in the .furl files for that grid. To create a private grid, run your
own introducer, and copy its .furl files.
Each client node runs a local webserver (enabled by writing the desired port
@ -82,14 +82,14 @@ mapping from pathnames/filenames to URIs.
We are well aware of the limitations of decentralization and scalability
inherent in this prototype. In particular, the completely-connected property
of the mesh and the requirement of a single distinct introducer and vdrive
server limits the possible size of the mesh. We have plans to loosen these
of the grid and the requirement of a single distinct introducer and vdrive
server limits the possible size of the grid. We have plans to loosen these
limitations (see roadmap.txt [5]). Currently it should be noted that the
mesh already depends as little as possible on the accessibility and
grid already depends as little as possible on the accessibility and
correctness of the introduction server and the vdrive server. Also note that
the choice of which servers to use is easily configured -- you should be able
to set up a private mesh for you and your friends almost as easily as to
connect to our public test mesh.
to set up a private grid for you and your friends almost as easily as to
connect to our public test grid.
SOFTWARE ARCHITECTURE
@ -99,7 +99,7 @@ consumer backup service. It is primarily written in Python.
It uses the Foolscap library [10] which provides a remote object protocol
inspired by the capability-secure "E" programming language [11]. Foolscap
allows us to express the intended behavior of the distributed mesh directly
allows us to express the intended behavior of the distributed grid directly
in object-oriented terms while relying on a well-engineered, secure transport
layer.

View File

@ -1,6 +1,6 @@
#! /usr/bin/env python
# Allmydata Tahoe -- secure, distributed storage mesh
# Allmydata Tahoe -- secure, distributed storage grid
#
# Copyright (C) 2007 Allmydata, Inc.
#
@ -57,11 +57,11 @@ trove_classifiers=[
setup(name='allmydata-tahoe',
version='0.1.4b2',
description='secure, distributed storage mesh',
description='secure, distributed storage grid',
long_description="""Welcome to the AllMyData "tahoe" project. This project implements a
secure, distributed, fault-tolerant storage mesh.
secure, distributed, fault-tolerant storage grid.
The basic idea is that the data in this storage mesh is spread over all
The basic idea is that the data in this storage grid is spread over all
participating nodes, using an algorithm that can recover the data even if a
majority of the nodes are no longer available.""",
author='Allmydata, Inc.',

View File

@ -1,5 +1,5 @@
"""
Decentralized storage mesh.
Decentralized storage grid.
maintainer web site: U{http://allmydata.com/}

View File

@ -14,7 +14,7 @@ from allmydata.interfaces import IEncoder
The goal of the encoder is to turn the original file into a series of
'shares'. Each share is going to a 'shareholder' (nominally each shareholder
is a different host, but for small meshes there may be overlap). The number
is a different host, but for small grids there may be overlap). The number
of shares is chosen to hit our reliability goals (more shares on more
machines means more reliability), and is limited by overhead (proportional to
numshares or log(numshares)) and the encoding technology in use (Reed-Solomon

View File

@ -135,7 +135,7 @@ class _DirectorySubTree(object):
# self.populate_from_node must be defined by the subclass (CHK or
# SSK), since it controls how the spec is interpreted. It will
# probably use the contents of the node to figure out what to
# download from the mesh, then pass this downloaded serialized data
# download from the grid, then pass this downloaded serialized data
# to populate_from_data()
raise NotImplementedError

View File

@ -106,7 +106,7 @@ class ISubTree(Interface):
Each subtree's populate_from_node() method is expected to use the
downloader to obtain a file with the subtree's serialized contents
(probably by pulling data from some source, like the mesh, the vdrive
(probably by pulling data from some source, like the grid, the vdrive
server, an HTTP server, or somewhere on the local filesystem), then
unserialize them and populate the subtree's state.
@ -175,7 +175,7 @@ class ISubTree(Interface):
def serialize_subtree_to_file(f):
"""Create a string which describes my structure and write it to the
given filehandle (using only .write()). This string should be
suitable for uploading to the mesh or storing in a local file."""
suitable for uploading to the grid or storing in a local file."""
def update_now(uploader):
"""Perform whatever work is necessary to record this subtree to
@ -185,7 +185,7 @@ class ISubTree(Interface):
the subtree has been persisted.
For directory subtrees, this will cause the subtree to serialize
itself to a file, then upload this file to the mesh, then create an
itself to a file, then upload this file to the grid, then create an
INode-providing instance which describes where the file wound up. For
redirections, this will cause the subtree to modify the redirection's
persistent storage, then return the (unmodified) INode that describes

View File

@ -512,7 +512,7 @@ class IWorkQueue(Interface):
generated uri."""
def add_upload_chk(source_filename, stash_uri_in_boxname):
"""This step uploads a file to the mesh and obtains a content-based
"""This step uploads a file to the grid and obtains a content-based
URI which can be used to later retrieve the same contents ('CHK'
mode). This URI includes unlink rights. It does not mark the file for
retention.
@ -526,7 +526,7 @@ class IWorkQueue(Interface):
"""
def add_upload_ssk(write_capability, previous_version, source_filename):
"""This step uploads a file to the mesh in a way that replaces the
"""This step uploads a file to the grid in a way that replaces the
previous version and does not require a change to the ID referenced
by the parent.
"""
@ -618,15 +618,15 @@ class NotCapableError(Exception):
class RIControlClient(RemoteInterface):
def upload_from_file_to_uri(filename=str):
"""Upload a file to the mesh. This accepts a filename (which must be
"""Upload a file to the grid. This accepts a filename (which must be
absolute) that points to a file on the node's local disk. The node
will read the contents of this file, upload it to the mesh, then
will read the contents of this file, upload it to the grid, then
return the URI at which it was uploaded.
"""
return URI
def download_from_uri_to_file(uri=URI, filename=str):
"""Download a file from the mesh, placing it on the node's local disk
"""Download a file from the grid, placing it on the node's local disk
at the given filename (which must be absolute[?]). Returns the
absolute filename where the file was written."""
return str

View File

@ -11,7 +11,7 @@ from allmydata.interfaces import IDownloader, IUploader
from allmydata import workqueue
from cStringIO import StringIO
class FakeMesh(object):
class FakeGrid(object):
implements(IDownloader, IUploader)
"""
@ -333,7 +333,7 @@ class Utils(unittest.TestCase):
pairs = list(directory.in_pairs(l))
self.failUnlessEqual(pairs, [(0,1), (2,3), (4,5), (6,7)])
class FakeMesh(object):
class FakeGrid(object):
implements(IDownloader, IUploader)
debug = False
@ -343,7 +343,7 @@ class FakeMesh(object):
def upload(self, uploadable):
uri = "stub-uri-%d" % len(self.files)
if self.debug:
print "FakeMesh.upload -> %s" % uri
print "FakeGrid.upload -> %s" % uri
assert upload.IUploadable.providedBy(uploadable)
f = uploadable.get_filehandle()
data = f.read()
@ -353,17 +353,17 @@ class FakeMesh(object):
def upload_filename(self, filename):
if self.debug:
print "FakeMesh.upload_filename(%s)" % filename
print "FakeGrid.upload_filename(%s)" % filename
return self.upload(upload.FileName(filename))
def upload_data(self, data):
if self.debug:
print "FakeMesh.upload_data(%s)" % data
print "FakeGrid.upload_data(%s)" % data
return self.upload(upload.Data(data))
def download(self, uri, target):
if self.debug:
print "FakeMesh.download(%s)" % uri
print "FakeGrid.download(%s)" % uri
target.open()
target.write(self.files[uri])
target.close()
@ -372,16 +372,16 @@ class FakeMesh(object):
class VDrive(unittest.TestCase):
def makeVirtualDrive(self, basedir, root_node=None, mesh=None):
def makeVirtualDrive(self, basedir, root_node=None, grid=None):
wq = workqueue.WorkQueue(os.path.join("test_filetree",
"VDrive",
basedir, "1.workqueue"))
if mesh:
assert IUploader.providedBy(mesh)
assert IDownloader.providedBy(mesh)
dl = ul = mesh
if grid:
assert IUploader.providedBy(grid)
assert IDownloader.providedBy(grid)
dl = ul = grid
else:
dl = ul = FakeMesh()
dl = ul = FakeGrid()
if not root_node:
root_node = directory.LocalFileSubTreeNode()
root_node.new("rootdirtree.save")
@ -403,17 +403,17 @@ class VDrive(unittest.TestCase):
def makeCHKTree(self, basename):
# create a LocalFileRedirection pointing at a CHKDirectorySubTree.
# Returns a VirtualDrive instance.
mesh = FakeMesh()
grid = FakeGrid()
topdir = directory.CHKDirectorySubTree().new()
d = topdir.update_now(mesh)
d = topdir.update_now(grid)
def _updated(topnode):
root = redirect.LocalFileRedirection()
root.new("%s-root" % basename, topnode)
return root.update_now(mesh)
return root.update_now(grid)
d.addCallback(_updated)
d.addCallback(lambda rootnode:
self.makeVirtualDrive("%s-vdrive" % basename,
rootnode, mesh))
rootnode, grid))
return d
def failUnlessListsAreEqual(self, list1, list2):
@ -425,7 +425,7 @@ class VDrive(unittest.TestCase):
self.failUnlessEqual(c1a, c2a)
def testDirectory(self):
stm = vdrive.SubTreeMaker(FakeMesh())
stm = vdrive.SubTreeMaker(FakeGrid())
# create an empty directory (stored locally)
subtree = directory.LocalFileSubTree()

View File

@ -12,7 +12,7 @@
<p>To view the global shared filestore, <a href="../vdrive">Click Here!</a></p>
<h2>Mesh Status</h2>
<h2>Grid Status</h2>
<div>My nodeid: <span n:render="string" n:data="my_nodeid" /></div>
<div>Introducer: <span n:render="string" n:data="introducer_pburl" /></div>