.. -*- coding: utf-8-with-signature -*-

====================
Servers of Happiness
====================

When you upload a file to a Tahoe-LAFS grid, you expect that it will
stay there for a while, and that it will do so even if a few of the
peers on the grid stop working, or if something else goes wrong. An
upload health metric helps to make sure that this actually happens.

An upload health metric is a test that looks at a file on a Tahoe-LAFS
grid and says whether or not that file is healthy; that is, whether it
is distributed on the grid in such a way as to ensure that it will
probably survive in good enough shape to be recoverable, even if a few
things go wrong between the time of the test and the time that it is
recovered. Our current upload health metric for immutable files is called
'servers-of-happiness'; its predecessor was called 'shares-of-happiness'.

shares-of-happiness used the number of encoded shares generated by a
file upload to say whether or not it was healthy. If there were more
shares than a user-configurable threshold, the file was reported to be
healthy; otherwise, it was reported to be unhealthy. In normal
situations, the upload process would distribute shares fairly evenly
over the peers in the grid, and in that case shares-of-happiness
worked fine. However, because it only considered the number of shares,
and not where they were on the grid, it could not detect situations
where a file was unhealthy because most or all of the shares generated
from the file were stored on one or two peers.

servers-of-happiness addresses this by extending the share-focused
upload health metric to also consider the location of the shares on the
grid. servers-of-happiness looks at the mapping of peers to the shares
that they hold, and compares the cardinality of the largest happy subset
of those peers to a user-configurable threshold. A happy subset of peers
has the property that any k (where k is as in k-of-n encoding) peers
within the subset can collectively reconstruct the source file. This
definition of file health provides a stronger assurance of file
availability over time; with 3-of-10 encoding, and happy=7, a healthy
file is still guaranteed to be available even if 4 peers fail.
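
To make the arithmetic in that example concrete, here is a minimal
check in plain Python (the variable names are illustrative assumptions,
not Tahoe-LAFS code)::

    # 3-of-10 encoding with a happiness threshold of 7, and 4 peers
    # failing, as in the example above.
    k, happy, failures = 3, 7, 4

    # The peers in a happy subset hold pairwise-distinct shares, so at
    # least happy - failures >= k distinct shares survive, which is
    # enough to reconstruct the file.
    assert happy - failures >= k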

Measuring Servers of Happiness
==============================

We calculate servers-of-happiness by computing a matching on a
bipartite graph that is related to the layout of shares on the grid.
One set of vertices is the peers on the grid, and one set of vertices is
the shares. An edge connects a peer and a share if the peer will (or
does, for existing shares) hold the share. The size of the maximum
matching on this graph is the measure we use; as explained below, it is
a lower bound on the size of the largest happy peer set that exists for
the upload.

First, note that a bipartite matching of size n corresponds to a happy
subset of size n. This is because a bipartite matching of size n implies
that there are n peers such that each peer holds a share that no other
peer holds. Then any k of those peers collectively hold k distinct
shares, and can restore the file.
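
To make this concrete, here is a minimal sketch in Python of that
computation, using the classic augmenting-path algorithm for maximum
bipartite matching. The function name servers_of_happiness and its
peers_to_shares argument are illustrative assumptions, not Tahoe-LAFS's
actual interface::

    def servers_of_happiness(peers_to_shares):
        """Return the size of a maximum matching on the bipartite
        graph whose edges connect each peer to the shares it holds
        (or will hold).  peers_to_shares maps a peer id to a set of
        share numbers."""
        match_for_share = {}  # share number -> peer matched to it

        def try_augment(peer, visited):
            # Try to match `peer` to some share, recursively moving
            # previously matched peers onto alternative shares.
            for share in peers_to_shares[peer]:
                if share not in visited:
                    visited.add(share)
                    other = match_for_share.get(share)
                    if other is None or try_augment(other, visited):
                        match_for_share[share] = peer
                        return True
            return False

        return sum(try_augment(peer, set()) for peer in peers_to_shares)

    # One distinct share per peer: happiness is 10.
    assert servers_of_happiness({i: {i} for i in range(10)}) == 10
    # All ten shares piled onto a single peer: happiness is only 1.
    assert servers_of_happiness({0: set(range(10))}) == 1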

A bipartite matching of size n is not necessary for a happy subset of
size n, however (so it is not correct to say that the size of the
maximum matching on this graph is always the size of the largest happy
subset of peers that exists for the upload). For example, consider a
file with k = 3, and suppose that each peer holds the same three
distinct shares. Then any single peer can restore the file, so if there
are 10 peers holding shares, and the happiness threshold is 7, the
upload should be declared happy: there is a happy subset of size 10, and
10 > 7. However, since a maximum matching on the bipartite graph related
to this layout has only 3 edges, Tahoe-LAFS declares the upload
unhealthy.

Though it is not unhealthy, a share layout like this example is
inefficient; for k = 3 and 10 peers, each holding a full set of shares,
it corresponds to an expansion factor of 10x, since every peer
effectively stores a complete copy of the file. Layouts that are
declared healthy by the bipartite graph matching approach have two
useful properties: they correspond to uploads that are either already
relatively efficient in their utilization of space, or can be made to be
so by deleting shares; and they place all of the shares that they
generate, enabling redistribution of shares later without having to
re-encode the file. Also, it is computationally reasonable to compute a
maximum matching in a bipartite graph, and there are well-studied
algorithms to do that.
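
Using the servers_of_happiness sketch above, the degenerate layout from
this example can be checked directly::

    # k = 3: each of 10 peers holds the same three distinct shares, so
    # any single peer can reconstruct the file...
    layout = {peer: {0, 1, 2} for peer in range(10)}

    # ...but the maximum matching has only 3 edges (one per share), so
    # with happy=7 this layout is declared unhealthy.
    assert servers_of_happiness(layout) == 3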

Issues
======

The uploader is good at detecting unhealthy upload layouts, but it
doesn't always know how to make an unhealthy upload into a healthy
upload if it is possible to do so; it attempts to redistribute shares to
achieve happiness, but only in certain circumstances. The redistribution
algorithm isn't optimal, either, so even in these cases it will not
always find a happy layout if one can be arrived at through
redistribution. We are investigating improvements to address these
issues.

We don't use servers-of-happiness for mutable files yet; this fix will
likely come in Tahoe-LAFS version 1.13.

============================
Upload Strategy of Happiness
============================

As mentioned above, the uploader is good at detecting instances which
do not pass the servers-of-happiness test, but the share distribution
algorithm is not always successful in instances where happiness can be
achieved. A new placement algorithm designed to pass the
servers-of-happiness test, called 'Upload Strategy of Happiness', is
meant to fix these instances in which the uploader is otherwise unable
to achieve happiness.

Calculating Share Placements
============================

We calculate share placement like so (a sketch in Python follows the
list):

1. Query 2n servers for existing shares.

2. Construct a bipartite graph of readonly servers to shares, where an
   edge exists between an arbitrary readonly server s and an arbitrary
   share n if and only if s holds n.

3. Calculate the maximum matching graph of the bipartite graph. The
   maximum matching is the matching which contains the largest possible
   number of edges.

4. Construct a bipartite graph of servers to shares, removing any
   servers and shares used in the maximum matching graph from step 3.
   Let an edge exist between server s and share n if and only if s
   holds n.

5. Calculate the maximum matching graph of the new graph.

6. Construct a bipartite graph of servers to shares, removing any
   servers and shares used in the maximum matching graphs from steps 3
   and 5. Let an edge exist between server s and share n if and only if
   s can hold n.

7. Calculate the maximum matching graph of the new graph.

8. Renew the shares on their respective servers from steps 3 and 5.

9. Place share n on server s if an edge exists between s and n in the
   maximum matching graph from step 7.

10. If any placements from step 7 fail, remove the server from the set
    of possible servers and regenerate the matchings.
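
The following is a minimal sketch of steps 2 through 9 in Python. The
helper maximum_matching and the function calculate_share_placement,
along with their argument names, are illustrative assumptions rather
than the uploader's actual interfaces, and step 10's retry loop and
real server errors are omitted::

    def maximum_matching(edges):
        """Kuhn's augmenting-path algorithm.  `edges` maps each server
        to the set of shares it may be matched with; returns a dict
        mapping each matched server to its share."""
        match_for_share = {}  # share -> server currently matched to it

        def try_augment(server, visited):
            for share in edges.get(server, ()):
                if share not in visited:
                    visited.add(share)
                    other = match_for_share.get(share)
                    if other is None or try_augment(other, visited):
                        match_for_share[share] = server
                        return True
            return False

        for server in edges:
            try_augment(server, set())
        return {server: share
                for share, server in match_for_share.items()}

    def calculate_share_placement(all_shares, readonly_holdings,
                                  writable_holdings, writable_servers):
        """Steps 2-7: three successive maximum matchings, each computed
        after removing the servers and shares used by the previous
        ones.  Returns (renewals, new_placements)."""
        # Steps 2-3: match readonly servers to shares they already hold.
        renewals = maximum_matching(readonly_holdings)
        used_servers = set(renewals)
        used_shares = set(renewals.values())

        # Steps 4-5: match remaining servers to remaining existing
        # shares.
        second = maximum_matching({
            server: shares - used_shares
            for server, shares in writable_holdings.items()
            if server not in used_servers})
        renewals.update(second)
        used_servers |= set(second)
        used_shares |= set(second.values())

        # Steps 6-7: match still-unplaced shares to servers that can
        # hold them (any remaining writable server can hold any share).
        remaining_shares = all_shares - used_shares
        new_placements = maximum_matching({
            server: remaining_shares
            for server in writable_servers
            if server not in used_servers})

        # Step 8 renews the shares in `renewals` in place; step 9
        # uploads shares according to `new_placements`.
        return renewals, new_placements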

Properties of Upload Strategy of Happiness
==========================================

The size of a maximum bipartite matching is bounded by the size of the
smaller set of vertices. Therefore, in a situation where the set of
servers is smaller than the set of shares, the matching generates no
placement for some subset of the shares. In this case the remaining
shares are distributed as evenly as possible across the set of writable
servers.
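
A minimal sketch of that final leveling step, assuming a hypothetical
distribute_leftovers helper (not the uploader's real interface)::

    from itertools import cycle

    def distribute_leftovers(leftover_shares, writable_servers):
        """Spread shares that received no matching-based placement as
        evenly as possible over the writable servers, round-robin."""
        placements = {}
        servers = cycle(sorted(writable_servers))
        for share in sorted(leftover_shares):
            placements.setdefault(next(servers), set()).add(share)
        return placements

    # Example: 6 leftover shares over 2 writable servers, 3 each.
    assert distribute_leftovers(range(6), ["A", "B"]) == \
        {"A": {0, 2, 4}, "B": {1, 3, 5}}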

If the servers-of-happiness criteria can be met, the upload strategy of
happiness guarantees that H shares will be placed on the network. During
file repair, if the set of servers is larger than N, the algorithm will
only attempt to spread shares over N distinct servers. For both initial
file upload and file repair, N should be viewed as the maximum number of
distinct servers that shares can be placed on, and H as the minimum
number. The uploader will fail if the number of distinct servers is less
than H, and it will never attempt to exceed N.
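
As a minimal illustration of those H and N bounds (the function name
and error handling are assumptions for this sketch, not Tahoe-LAFS's
actual API)::

    def select_target_servers(reachable_servers, h, n):
        """Fail if fewer than H distinct servers are available, and
        never target more than N of them."""
        servers = sorted(reachable_servers)
        if len(servers) < h:
            raise RuntimeError(
                "unhappy: only %d distinct servers, need at least H=%d"
                % (len(servers), h))
        return servers[:n]  # never spread over more than N servers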