.. -*- coding: utf-8-with-signature -*-

====================
Servers of Happiness
====================

When you upload a file to a Tahoe-LAFS grid, you expect that it will
stay there for a while, and that it will do so even if a few of the
peers on the grid stop working, or if something else goes wrong. An
upload health metric helps to make sure that this actually happens.
An upload health metric is a test that looks at a file on a Tahoe-LAFS
grid and says whether or not that file is healthy; that is, whether it
is distributed on the grid in such a way as to ensure that it will
probably survive in good enough shape to be recoverable, even if a few
things go wrong between the time of the test and the time that it is
recovered. Our current upload health metric for immutable files is called
'servers-of-happiness'; its predecessor was called 'shares-of-happiness'.

shares-of-happiness used the number of encoded shares generated by a
file upload to say whether or not it was healthy. If there were more
shares than a user-configurable threshold, the file was reported to be
healthy; otherwise, it was reported to be unhealthy. In normal
situations, the upload process would distribute shares fairly evenly
over the peers in the grid, and in that case shares-of-happiness
worked fine. However, because it only considered the number of shares,
and not where they were on the grid, it could not detect situations
where a file was unhealthy because most or all of the shares generated
from the file were stored on one or two peers.

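To make this blind spot concrete, here is a minimal Python sketch. The
layout dictionaries, the peer names, and the function
``shares_of_happiness_ok`` are hypothetical illustrations, not
Tahoe-LAFS code; the comparison simply follows the "more shares than the
threshold" wording above.

```python
# Hypothetical illustration of a count-only health check; not Tahoe-LAFS code.
def shares_of_happiness_ok(layout, threshold):
    """Return True if more than `threshold` shares were placed in total.

    `layout` maps a peer name to the set of share numbers it holds.
    Where the shares live is deliberately ignored.
    """
    placed = sum(len(shares) for shares in layout.values())
    return placed > threshold


# Ten shares spread over ten peers.
spread = {"peer%d" % i: {i} for i in range(10)}
# The same ten shares piled onto only two peers.
clumped = {"peerA": {0, 1, 2, 3, 4}, "peerB": {5, 6, 7, 8, 9}}

print(shares_of_happiness_ok(spread, 7))   # True
print(shares_of_happiness_ok(clumped, 7))  # True
```

Both layouts pass the count-only check, although in the second one the
loss of two peers destroys every share.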
servers-of-happiness addresses this by extending the share-focused
upload health metric to also consider the location of the shares on
the grid. servers-of-happiness looks at the mapping of peers to the
shares that they hold, and compares the cardinality of the largest happy
subset of those to a user-configurable threshold. A happy subset of
peers has the property that any k (where k is as in k-of-n encoding)
peers within the subset can reconstruct the source file. This definition
of file health provides a stronger assurance of file availability over
time; with 3-of-10 encoding, and happy=7, a healthy file is still
guaranteed to be available even if 4 peers fail, because the 3 peers
that remain in the happy subset can still reconstruct it.

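The definition of a happy subset can be checked by brute force on small
examples. The sketch below is illustrative only: the function
``is_happy_subset`` and the layout are hypothetical, and it assumes that
a group of peers can reconstruct the file exactly when the peers jointly
hold at least k distinct shares.

```python
from itertools import combinations


def is_happy_subset(layout, peers, k):
    """Brute-force check: every choice of k peers from `peers` must
    jointly hold at least k distinct shares.

    `layout` maps a peer name to the set of share numbers it holds.
    """
    peers = list(peers)
    if len(peers) < k:
        return False  # assumption: fewer than k peers is never "happy"
    for group in combinations(peers, k):
        held = set().union(*(layout[p] for p in group))
        if len(held) < k:
            return False
    return True


# 3-of-10 encoding with happy=7: seven peers, each with its own share.
k = 3
layout = {"peer%d" % i: {i} for i in range(7)}

# Remove every possible set of 4 peers and re-check the survivors.
survives_any_4_failures = all(
    is_happy_subset(layout, set(layout) - set(failed), k)
    for failed in combinations(layout, 4)
)
print(is_happy_subset(layout, layout, k))  # True
print(survives_any_4_failures)             # True
```

Enumerating every group of k peers is only practical for toy layouts;
it is shown here to spell out the definition, not as a way to measure
happiness on a real grid.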
Measuring Servers of Happiness
==============================

We calculate servers-of-happiness by computing a matching on a
bipartite graph that is related to the layout of shares on the grid.
One set of vertices is the peers on the grid, and one set of vertices is
the shares. An edge connects a peer and a share if the peer will (or
does, for existing shares) hold the share. The size of the maximum
matching on this graph is used as the size of the largest happy peer set
that exists for the upload.

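As a rough sketch of this computation, the following Python function
finds the size of a maximum matching with a simple augmenting-path
search. The function name ``servers_of_happiness`` and the layout format
(a dictionary from peer name to a set of share numbers) are assumptions
made for this example; the algorithm that Tahoe-LAFS itself uses may
differ.

```python
# Illustrative sketch; the names and layout format are assumptions.
def servers_of_happiness(layout):
    """Return the size of a maximum matching between peers and shares.

    `layout` maps a peer name to the set of share numbers it holds
    (or will hold). Each edge of the bipartite graph is one
    (peer, share) pair taken from this mapping.
    """
    owner = {}  # share number -> peer currently matched to that share

    def try_assign(peer, visited):
        # Depth-first search for an augmenting path that starts at `peer`.
        for share in layout[peer]:
            if share in visited:
                continue
            visited.add(share)
            # Take a free share, or evict its owner if that owner can
            # be moved to some other share.
            if share not in owner or try_assign(owner[share], visited):
                owner[share] = peer
                return True
        return False

    return sum(1 for peer in layout if try_assign(peer, set()))


one_share_each = {"A": {0}, "B": {1}, "C": {2}}
two_peers_hold_everything = {"A": set(range(5)), "B": set(range(5, 10))}

print(servers_of_happiness(one_share_each))             # 3
print(servers_of_happiness(two_peers_hold_everything))  # 2
```

The second layout holds ten shares but scores only 2, because a
matching can pair each peer with at most one share.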
First, note that a bipartite matching of size n corresponds to a happy
subset of size n. This is because a bipartite matching of size n implies
that there are n peers, each of which can be paired with a share that it
holds and that is paired with no other peer. Then any k of those peers
collectively hold at least k distinct shares, and can restore the file.

A bipartite matching of size n is not necessary for a happy subset of
size n, however (so it is not correct to say that the size of the
maximum matching on this graph is the size of the largest happy subset
of peers that exists for the upload). For example, consider a file with
k = 3, and suppose that each peer holds the same three shares. Then,
since any peer from the original upload can restore the file, if there
are 10 peers holding shares, and the happiness threshold is 7, the
upload should be declared happy, because there is a happy subset of size
10, and 10 > 7. However, since a maximum matching on the bipartite graph
related to this layout has only 3 edges, Tahoe-LAFS declares the upload
unhealthy. Though it is not unhealthy, a share layout like this example
is inefficient; for k = 3 and 10 peers, it corresponds to an expansion
factor of 10x. Layouts that are declared healthy by the bipartite graph
matching approach have the property that they correspond to uploads that
are either already relatively efficient in their utilization of space,
or can be made to be so by deleting shares; and that place all of the
shares that they generate, enabling redistribution of shares later
without having to re-encode the file. Also, it is computationally
reasonable to compute a maximum matching in a bipartite graph, and there
are well-studied algorithms to do that.

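The expansion factor in this example can be worked out directly. The
sketch below is an illustration under a stated assumption, namely that
each share is about 1/k of the size of the file (per-share overhead is
ignored); the helper ``expansion_factor`` is hypothetical.

```python
from fractions import Fraction


def expansion_factor(layout, k):
    """Stored data divided by file size, assuming each share is 1/k of
    the file. `layout` maps a peer name to the set of shares it holds."""
    stored_shares = sum(len(shares) for shares in layout.values())
    return Fraction(stored_shares, k)


k = 3
# Ten peers that each hold the same three shares (the example above).
replicated = {"peer%d" % i: {0, 1, 2} for i in range(10)}
# Ten peers that each hold one distinct share of a 3-of-10 encoding.
one_each = {"peer%d" % i: {i} for i in range(10)}

print(expansion_factor(replicated, k))  # 10
print(expansion_factor(one_each, k))    # 10/3
```

Under that assumption the replicated layout stores ten times the file
size, while the one-share-per-peer layout stores about 3.3 times the
file size and still offers ten peers to a matching.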
Issues
======

The uploader is good at detecting unhealthy upload layouts, but it
doesn't always know how to make an unhealthy upload into a healthy
upload if it is possible to do so; it attempts to redistribute shares to
achieve happiness, but only in certain circumstances. The redistribution
algorithm isn't optimal, either, so even in these cases it will not
always find a happy layout if one can be arrived at through
redistribution. We are investigating improvements to address these
issues.

We don't use servers-of-happiness for mutable files yet; this fix will
likely come in Tahoe-LAFS version 1.8.