convergence secret doc by CtB, marlowe, zooko

Zooko O'Whielacronx 2013-04-08 23:33:42 -06:00 committed by Brian Warner
parent 389251860e
commit 07f7d50afa

@ -0,0 +1,75 @@

What Is It?
-----------

The identifier of a file (also called the "capability" to a file) is derived
from two pieces of information when the file is uploaded: the content of the
file and the upload node's "convergence secret". By default, the convergence
secret is randomly generated by the node when it first starts up, then stored
in the node's base directory (<Tahoe's node dir>/private/convergence) and
re-used after that. So the same file content uploaded from the same node will
always have the same cap. Uploading the file from a different node with a
different convergence secret would result in a different cap, and in a second
copy of the file's contents stored on the grid. If you want files you upload
to converge (also known as "deduplicate") with files uploaded by someone
else, just make sure you're using the same convergence secret when you upload
files as they do.
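
The convergence behavior described above can be sketched in a few lines of
Python. This is a simplified illustration only: the ``sketch_cap`` function
and its use of HMAC-SHA256 are assumptions made for the example, not the
derivation Tahoe-LAFS actually uses. It only shows that the identifier depends
on both the secret and the content.

```python
import hashlib
import hmac
import os


def sketch_cap(convergence_secret: bytes, content: bytes) -> str:
    """Hypothetical stand-in for cap derivation: a keyed hash of the content.

    NOT Tahoe-LAFS's real derivation; it only demonstrates that the result
    is determined by the (secret, content) pair.
    """
    return hmac.new(convergence_secret, content, hashlib.sha256).hexdigest()


secret_a = os.urandom(32)  # stands in for node A's randomly generated secret
secret_b = os.urandom(32)  # stands in for node B's randomly generated secret
content = b"the same file content"

cap_a1 = sketch_cap(secret_a, content)
cap_a2 = sketch_cap(secret_a, content)
cap_b = sketch_cap(secret_b, content)

print(cap_a1 == cap_a2)  # same secret and content: the uploads converge
print(cap_a1 == cap_b)   # different secret: a different identifier
```

In this sketch, two nodes deduplicate with each other only if they pass the
same secret to ``sketch_cap``.
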

Deduplication saves storage space, but keep in mind that the convergence
secret was created to protect confidentiality. There are two attacks that can
be used against you by someone who knows the convergence secret you use.

The first one is called the "Confirmation-of-a-File Attack". Someone who
knows the convergence secret that you used when you uploaded a file, and who
has a copy of that file themselves, can check whether you have a copy of that
file. This is usually not a problem, but it could be if that file is, for
example, a book or movie that is banned in your country.
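
A minimal sketch of this attack, under stated assumptions: ``sketch_cap`` is a
hypothetical keyed hash standing in for the real cap derivation, and the
attacker is assumed to be able to learn which identifiers you have stored
(modeled here as the set ``stored_identifiers``).

```python
import hashlib
import hmac


def sketch_cap(convergence_secret: bytes, content: bytes) -> str:
    """Hypothetical stand-in for cap derivation (not Tahoe-LAFS's real one)."""
    return hmac.new(convergence_secret, content, hashlib.sha256).hexdigest()


def attacker_confirms(known_secret: bytes, candidate: bytes, stored: set) -> bool:
    """Check whether `candidate`, uploaded with `known_secret`, is already stored."""
    return sketch_cap(known_secret, candidate) in stored


# Illustrative values: the victim uploads one file with their secret.
victim_secret = b"victim-secret (illustrative)"
banned_book = b"full text of a banned book"
stored_identifiers = {sketch_cap(victim_secret, banned_book)}

# Attacker knows the secret and holds the same file: the match is confirmed.
print(attacker_confirms(victim_secret, banned_book, stored_identifiers))
# Attacker holds the file but not the secret: nothing can be confirmed.
print(attacker_confirms(b"wrong-secret", banned_book, stored_identifiers))
```
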

The second attack is more subtle. It is called the
"Learn-the-Remaining-Information Attack". Suppose you've received a
confidential document, such as a PDF from your bank which contains many pages
of boilerplate text along with your bank account number and balance. Someone
who knows your convergence secret can generate a file with all of the
boilerplate text (perhaps they would open an account with the same bank so
they receive the same document with their own account number and balance).
Then they can try a "brute force search" to find your account number and your
balance.
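
The brute-force search can be sketched as follows. Everything here is
illustrative: ``sketch_cap`` is a hypothetical keyed hash rather than the real
cap derivation, the document template is invented, and the unknown part is
reduced to a four-digit account number so the search stays small.

```python
import hashlib
import hmac


def sketch_cap(convergence_secret: bytes, content: bytes) -> str:
    """Hypothetical stand-in for cap derivation (not Tahoe-LAFS's real one)."""
    return hmac.new(convergence_secret, content, hashlib.sha256).hexdigest()


BOILERPLATE = b"Dear customer, many pages of terms follow. Account number: "


def make_statement(account_number: int) -> bytes:
    """Build the invented bank document for a given 4-digit account number."""
    return BOILERPLATE + f"{account_number:04d}".encode("ascii")


def recover_account_number(known_secret: bytes, observed_cap: str):
    """Try every possible account number until the identifier matches."""
    for guess in range(10_000):
        if sketch_cap(known_secret, make_statement(guess)) == observed_cap:
            return guess
    return None  # no candidate matched


victim_secret = b"victim-secret (illustrative)"
observed_cap = sketch_cap(victim_secret, make_statement(4271))

print(recover_account_number(victim_secret, observed_cap))
print(recover_account_number(b"wrong-secret", observed_cap))
```

The search succeeds only because the unknown part of the file has few possible
values and the attacker holds the secret; with a wrong secret the same loop
finds nothing.
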

The defense against these attacks is that only someone who knows the
convergence secret that you used on each file can perform these attacks on
that file.

Both of these attacks and the defense are described in more detail in `Drew
Perttula's Hack Tahoe-LAFS Hall Of Fame entry`_.

.. _`Drew Perttula's Hack Tahoe-LAFS Hall Of Fame entry`:
     https://tahoe-lafs.org/hacktahoelafs/drew_perttula.html

What If I Change My Convergence Secret?
---------------------------------------

All your old file capabilities will still work, but the new data that you
upload will not be deduplicated with the old data. If you upload all of the
same things to the grid, you will end up using twice the space until garbage
collection kicks in (if it's enabled). Changing the convergence secret that a
node uses for uploads can be thought of as moving the node to a new
"deduplication domain".
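
The effect on storage can be modeled with a toy example. The ``grid``
dictionary, the ``upload`` helper and the ``sketch_cap`` keyed hash are all
hypothetical stand-ins chosen for illustration; they do not reflect how
Tahoe-LAFS stores data.

```python
import hashlib
import hmac


def sketch_cap(convergence_secret: bytes, content: bytes) -> str:
    """Hypothetical stand-in for cap derivation (not Tahoe-LAFS's real one)."""
    return hmac.new(convergence_secret, content, hashlib.sha256).hexdigest()


grid = {}  # toy model of the grid: identifier -> stored content


def upload(convergence_secret: bytes, content: bytes) -> str:
    """Store content under its identifier; re-uploading the same pair is a no-op."""
    cap = sketch_cap(convergence_secret, content)
    grid[cap] = content
    return cap


files = [b"photo", b"notes", b"backup"]
old_secret = b"old-secret (illustrative)"
new_secret = b"new-secret (illustrative)"

old_caps = [upload(old_secret, f) for f in files]
[upload(old_secret, f) for f in files]  # same secret again: deduplicated
copies_before_change = len(grid)

new_caps = [upload(new_secret, f) for f in files]  # after changing the secret
copies_after_change = len(grid)

print(copies_before_change, copies_after_change)
print(all(cap in grid for cap in old_caps))  # old identifiers still resolve
```
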

How To Use It
-------------

To enable deduplication between different clients, **securely** copy the
convergence secret file from one client to all the others.

For example, if you are on host A and have an account on host B and you have
scp installed, run:

  *scp ~/.tahoe/private/convergence
  my_other_account@B:.tahoe/private/convergence*

If you have two different nodes on a single computer, say one for each disk,
you would do:

  *cp /tahoe1/private/convergence /tahoe2/private/convergence*
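
After copying, it can be useful to check that two node directories really hold
the same secret. The sketch below is one possible way to do that; the helper
names are hypothetical, and ignoring surrounding whitespace when comparing is
an assumption about the file's format. The demonstration uses throwaway
directories rather than real node directories.

```python
import tempfile
from pathlib import Path


def read_convergence_secret(node_dir) -> bytes:
    """Read <node dir>/private/convergence, ignoring surrounding whitespace."""
    return (Path(node_dir) / "private" / "convergence").read_bytes().strip()


def share_dedup_domain(node_dir_a, node_dir_b) -> bool:
    """Return True if both node directories hold the same convergence secret."""
    return read_convergence_secret(node_dir_a) == read_convergence_secret(node_dir_b)


# Demonstration with two temporary node directories (illustrative contents).
with tempfile.TemporaryDirectory() as tmp:
    node1, node2 = Path(tmp, "tahoe1"), Path(tmp, "tahoe2")
    for node in (node1, node2):
        (node / "private").mkdir(parents=True)
    (node1 / "private" / "convergence").write_bytes(b"secret-one\n")
    (node2 / "private" / "convergence").write_bytes(b"secret-two\n")
    before_copy = share_dedup_domain(node1, node2)

    # Python equivalent of copying node1's convergence file over node2's.
    (node2 / "private" / "convergence").write_bytes(
        (node1 / "private" / "convergence").read_bytes()
    )
    after_copy = share_dedup_domain(node1, node2)

print(before_copy, after_copy)
```
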