move historical docs from wiki pages into the source tree, clearly marked as historical
This commit is contained in:
parent 313876263d
commit aa2c693764
docs/historical/peer-selection-tahoe3.txt (normal file, 66 lines added)
@@ -0,0 +1,66 @@
= THIS PAGE DESCRIBES HISTORICAL ARCHITECTURE CHOICES: THE CURRENT CODE DOES NOT WORK AS DESCRIBED HERE. =

When a file is uploaded, the encoded shares are sent to other peers. But to
which ones? The PeerSelection algorithm is used to make this choice.

In the old (May 2007) version, the verifierid is used to consistently permute
the set of all peers (by sorting the peers by HASH(verifierid+peerid)). Each
file gets a different permutation, which (on average) will evenly distribute
shares among the grid and avoid hotspots.

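A minimal sketch of that permutation (assuming SHA-256 stands in for HASH,
and treating verifierid and the peerids as plain byte strings):

  import hashlib

  def permute_peers(verifierid, peerids):
      # Each file's verifierid yields a different, but deterministic,
      # ordering of the same peer set: sort by HASH(verifierid + peerid).
      return sorted(peerids,
                    key=lambda peerid: hashlib.sha256(verifierid + peerid).digest())
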
This permutation places the peers around a 2^256-sized ring, like the rim of
a big clock. The 100-or-so shares are then placed around the same ring (at 0,
1/100*2^256, 2/100*2^256, ... 99/100*2^256). Imagine that we start at 0 with
an empty basket in hand and proceed clockwise. When we come to a share, we
pick it up and put it in the basket. When we come to a peer, we ask that peer
if they will give us a lease for every share in our basket.

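One way to sketch that layout (again assuming SHA-256 for the permutation
hash; positions are integers on a 2^256 ring):

  import hashlib

  RING = 2 ** 256

  def ring_positions(verifierid, peerids, num_shares=100):
      # Peers sit at HASH(verifierid + peerid); shares sit at evenly spaced
      # points 0, 1/100, 2/100, ... of the way around the ring.
      peers = [(int.from_bytes(hashlib.sha256(verifierid + p).digest(), "big"),
                "peer", p) for p in peerids]
      shares = [((i * RING) // num_shares, "share", i) for i in range(num_shares)]
      # Walking clockwise from 0 means visiting these points in sorted order.
      return sorted(peers + shares)
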
The peer will grant us leases for some of those shares and reject others (if
they are full or almost full). If they reject all our requests, we remove
them from the ring, because they are full and thus unhelpful. Each share they
accept is removed from the basket. The remainder stay in the basket as we
continue walking clockwise.

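A sketch of the walk itself, building on ring_positions() above;
ask_for_leases(peerid, shares) is a hypothetical stand-in for the remote
lease request, returning whichever subset of shares the peer accepted:

  def place_shares(events, ask_for_leases):
      # events: the clockwise-sorted (position, kind, value) list produced
      # by ring_positions().
      basket = []                # share numbers we are currently carrying
      placements = {}            # share number -> peerid that granted a lease
      ring = list(events)
      while True:
          survivors = []         # peers still on the ring after this lap
          for position, kind, value in ring:
              if kind == "share":
                  basket.append(value)            # pick the share up in passing
                  continue
              accepted = ask_for_leases(value, list(basket)) if basket else []
              if basket and not accepted:
                  continue                        # rejected everything: drop peer
              for sharenum in accepted:
                  placements[sharenum] = value
                  basket.remove(sharenum)
              survivors.append((position, kind, value))
          ring = survivors
          if not basket or not ring:
              # every share found a home, or no peers are left to ask
              return placements, basket
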
We keep walking, accumulating shares and distributing them to peers, until
either we find a home for all shares, or there are no peers left in the ring
(because they are all full). If we run out of peers before we run out of
shares, the upload may be considered a failure, depending upon how many
shares we were able to place. The current parameters try to place 100 shares,
of which 25 must be retrievable to recover the file, and the peer selection
algorithm is happy if it was able to place at least 75 shares. These numbers
are adjustable: 25-out-of-100 means an expansion factor of 4x (every file in
the grid consumes four times as much space when totalled across all
StorageServers), but is highly reliable (the actual reliability is a binomial
distribution function of the expected availability of the individual peers,
but in general it goes up very quickly with the expansion factor).

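The reliability claim can be illustrated with a simplified model (each of the
100 shares lands on a distinct peer, and each peer is independently available
with probability p):

  from math import comb

  def recovery_probability(p, needed=25, total=100):
      # Probability that at least `needed` of `total` independently
      # available shares survive.
      return sum(comb(total, i) * p ** i * (1 - p) ** (total - i)
                 for i in range(needed, total + 1))

  # Even with only 80% peer availability, 25-out-of-100 is essentially
  # certain to be recoverable: recovery_probability(0.8) rounds to 1.0
  # in double precision.
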
If the file has been uploaded before (or if two uploads are happening at the
same time), a peer might already have shares for the same file we are
proposing to send to them. In this case, those shares are removed from the
list and assumed to be available (or will be soon). This reduces the number
of uploads that must be performed.

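In code this is just a set difference (a trivial sketch; already_held would
come from the peer's response to our proposal):

  def shares_still_needed(proposed, already_held):
      # Shares the peer already holds do not need to be uploaded again;
      # we simply record them as (soon to be) available.
      return [sharenum for sharenum in proposed if sharenum not in already_held]
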
When downloading a file, the current release just asks all known peers for
any shares they might have, chooses the minimal necessary subset, then starts
downloading and processing those shares. A later release will use the full
algorithm to reduce the number of queries that must be sent out. This
algorithm uses the same consistent-hashing permutation as on upload, but
instead of one walker with one basket, we have 100 walkers (one per share).
They each proceed clockwise in parallel until they find a peer, and put that
one on the "A" list: out of all peers, this one is the most likely to be the
same one to which the share was originally uploaded. The next peer that each
walker encounters is put on the "B" list, etc.

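A sketch of how those per-share walkers might build the candidate lists,
reusing the (position, kind, value) events from ring_positions() above
(index 0 of the result is the "A" list, index 1 the "B" list, and so on):

  def candidate_lists(events):
      peers = [(pos, value) for pos, kind, value in events if kind == "peer"]
      shares = [(pos, value) for pos, kind, value in events if kind == "share"]
      # lists[rank] maps share number -> the rank-th peer encountered when
      # walking clockwise from that share's position on the ring.
      lists = [dict() for _ in peers]
      for share_pos, sharenum in shares:
          # rotate the peer ring so it starts just past this share's position
          ordered = sorted(peers, key=lambda pp: (pp[0] - share_pos) % (2 ** 256))
          for rank, (_, peerid) in enumerate(ordered):
              lists[rank][sharenum] = peerid
      return lists
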
All the "A" list peers are asked for any shares they might have. If enough of
|
||||
them can provide a share, the download phase begins and those shares are
|
||||
retrieved and decoded. If not, the "B" list peers are contacted, etc. This
|
||||
routine will eventually find all the peers that have shares, and will find
|
||||
them quickly if there is significant overlap between the set of peers that
|
||||
were present when the file was uploaded and the set of peers that are present
|
||||
as it is downloaded (i.e. if the "peerlist stability" is high). Some limits
|
||||
may be imposed in large grids to avoid querying a million peers; this
|
||||
provides a tradeoff between the work spent to discover that a file is
|
||||
unrecoverable and the probability that a retrieval will fail when it could
|
||||
have succeeded if we had just tried a little bit harder. The appropriate
|
||||
value of this tradeoff will depend upon the size of the grid, and will change
|
||||
over time.
|
docs/historical/peer-selection.txt (normal file, 28 lines added)
@@ -0,0 +1,28 @@
When a file is uploaded, the encoded shares are sent to other peers. But to
which ones? Likewise, when we want to download a file, which peers should we
ask for shares? The "peer selection" algorithm is used to make these choices.

During the first tahoe meeting (actually on the drive back home), we designed
the now-abandoned "tahoe1" algorithm, which involved a "cabal" for each file,
where the peers involved would check up on each other to make sure the data
was still available. The big limitation was the expense of tracking which
nodes were part of which cabals.

Formerly (until v0.6, ticket #132), we used the "tahoe3" algorithm (see
peer-selection-tahoe3.txt), but now we use the "tahoe2" algorithm (see the
PEER SELECTION section of docs/architecture.txt), which uses a permuted
peerid list and packs the shares into the first 10 or so members of this
list. (It is named "tahoe2" because it was designed before "tahoe3" was.)

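A rough sketch of that idea (hedged: the real uploader also copes with full
or unreachable servers; storage_index here is just whatever per-file value
keys the permutation, and we simply deal shares onto the front of the list):

  import hashlib

  def tahoe2_packing(storage_index, peerids, num_shares=10):
      # Permute the peerid list once per file, then hand shares to the first
      # few members of that list, wrapping around if shares outnumber peers.
      permuted = sorted(peerids,
                        key=lambda p: hashlib.sha256(storage_index + p).digest())
      return {sharenum: permuted[sharenum % len(permuted)]
              for sharenum in range(num_shares)}
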
In the future, we might move to an algorithm known as "denver airport", which
uses Chord-like routing to minimize the number of active connections.

Different peer selection algorithms result in different properties:
* how well do we handle nodes leaving or joining the mesh (differences in the
  peer list)?
* how many connections do we need to keep open?
* how many nodes must we speak to when uploading a file?
* if a file is unrecoverable, how long will it take for us to discover this
  fact?
* how expensive is a file-checking operation?
* how well can we accommodate changes to encoding parameters?