What Is It?
-----------

The identifier of a file (also called the "capability" to a file) is derived
from two pieces of information when the file is uploaded: the content of the
file and the upload client's "convergence secret". By default, the
convergence secret is randomly generated by the client when it first starts
up, then stored in the client's base directory (<Tahoe's node
dir>/private/convergence) and re-used after that. So the same file content
uploaded from the same client will always have the same cap. Uploading the
file from a different client with a different convergence secret would result
in a different cap -- and in a second copy of the file's contents stored on
the grid. If you want files you upload to converge (also known as
"deduplicate") with files uploaded by someone else, just make sure you're
using the same convergence secret as they are when you upload files.

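
The relationship between content, convergence secret, and cap can be
illustrated with a minimal sketch. It assumes a keyed hash (HMAC-SHA256) as a
stand-in for the real derivation, which differs in detail; the function name
``derive_identifier`` and the sample values are hypothetical and are not part
of Tahoe-LAFS.

```python
import hashlib
import hmac
import os


def derive_identifier(convergence_secret: bytes, content: bytes) -> str:
    """Hypothetical stand-in for cap derivation: a keyed hash of the content.

    This is NOT the derivation Tahoe-LAFS uses; it only shows that the result
    depends on both the convergence secret and the file content.
    """
    return hmac.new(convergence_secret, content, hashlib.sha256).hexdigest()


secret_a = os.urandom(32)  # stands in for client A's private/convergence
secret_b = os.urandom(32)  # a different client with its own random secret
content = b"the same file content"

# Same secret and same content: same identifier, so the uploads converge.
assert derive_identifier(secret_a, content) == derive_identifier(secret_a, content)

# Different secret, same content: different identifier, so a second copy
# of the same content would be stored on the grid.
assert derive_identifier(secret_a, content) != derive_identifier(secret_b, content)
```
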

The advantages of deduplication should be clear, but keep in mind that the
convergence secret was created to protect confidentiality. There are two
attacks that can be used against you by someone who knows the convergence
secret you use.

The first one is called the "Confirmation-of-a-File Attack". Someone who
knows the convergence secret that you used when you uploaded a file, and who
has a copy of that file themselves, can check whether you have a copy of that
file. This is usually not a problem, but it could be if that file is, for
example, a book or movie that is banned in your country.

The second attack is more subtle. It is called the
"Learn-the-Remaining-Information Attack". Suppose you've received a
confidential document, such as a PDF from your bank which contains many pages
of boilerplate text as well as your bank account number and balance. Someone
who knows your convergence secret can generate a file with all of the
boilerplate text (perhaps they would open an account with the same bank so
they receive the same document with their account number and balance). Then
they can try a "brute force search" to find your account number and your
balance, generating a candidate file for each possible value until one
matches the file you uploaded.

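
The brute force search described above can be sketched as follows. This is
an illustration under stated assumptions, not the real attack code: the
keyed hash stands in for the real cap derivation, the document template and
the five-digit account number are invented for the example, and the attacker
is assumed to be able to observe the identifier of the stored file.

```python
import hashlib
import hmac


def derive_identifier(convergence_secret: bytes, content: bytes) -> str:
    # Hypothetical stand-in for cap derivation (not Tahoe-LAFS's real scheme).
    return hmac.new(convergence_secret, content, hashlib.sha256).hexdigest()


def render_document(account_number: int) -> bytes:
    # Hypothetical bank letter: known boilerplate plus one unknown field.
    text = "Boilerplate terms and conditions...\nAccount number: {:05d}\n"
    return text.format(account_number).encode("ascii")


def brute_force(known_secret: bytes, observed_identifier: str, candidates):
    """Return the candidate whose document matches the observed identifier."""
    for candidate in candidates:
        guess = derive_identifier(known_secret, render_document(candidate))
        if hmac.compare_digest(guess, observed_identifier):
            return candidate
    return None


victim_secret = b"convergence secret known to the attacker"
observed = derive_identifier(victim_secret, render_document(31415))

# With the right convergence secret, the search recovers the hidden field.
print(brute_force(victim_secret, observed, range(100_000)))  # prints 31415

# Without it, no candidate matches, which is the defense described here.
print(brute_force(b"some other secret", observed, range(100_000)))  # prints None
```

The Confirmation-of-a-File Attack is the special case with a single
candidate: the attacker derives the identifier of a file they already hold
and checks whether it matches.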

The defense against these attacks is that only someone who knows the
convergence secret that you used on each file can perform these attacks on
that file.

Both of these attacks and the defense are described in more detail in `Drew
Perttula's Hack Tahoe-LAFS Hall Of Fame entry`_.

.. _`Drew Perttula's Hack Tahoe-LAFS Hall Of Fame entry`:
   https://tahoe-lafs.org/hacktahoelafs/drew_perttula.html

What If I Change My Convergence Secret?
---------------------------------------

All your old file capabilities will still work, but the new data that you
upload will not be deduplicated with the old data. If you upload all of the
same things to the grid, you will end up using twice the space until garbage
collection kicks in (if it's enabled). Changing the convergence secret that a
storage client uses for uploads can be thought of as moving the client to a
new "deduplication domain".

How To Use It
-------------

To enable deduplication between different clients, **securely** copy the
convergence secret file from one client to all the others.

For example, if you are on host A and have an account on host B and you have
scp installed, run:

*scp ~/.tahoe/private/convergence
my_other_account@B:.tahoe/private/convergence*

If you have two different clients on a single computer, say one for each
disk, you would do:

*cp /tahoe1/private/convergence /tahoe2/private/convergence*

After you change the convergence secret file, you must restart the client
before it will stop using the old one and read the new one from the file.
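
After copying the file, it can be useful to confirm that two local node
directories really hold the same secret before restarting the clients. The
following is a minimal sketch, not a Tahoe-LAFS tool: the helper name
``same_convergence_secret`` is hypothetical, and it assumes that surrounding
whitespace in the file (such as a trailing newline) is not significant. The
demonstration uses temporary directories in place of real node directories.

```python
import hmac
import shutil
import tempfile
from pathlib import Path


def same_convergence_secret(node_dir_a: str, node_dir_b: str) -> bool:
    """Compare the private/convergence files of two node directories."""
    secret_a = (Path(node_dir_a) / "private" / "convergence").read_bytes().strip()
    secret_b = (Path(node_dir_b) / "private" / "convergence").read_bytes().strip()
    # Constant-time comparison, since the values are secrets.
    return hmac.compare_digest(secret_a, secret_b)


with tempfile.TemporaryDirectory() as tmp:
    tahoe1 = Path(tmp) / "tahoe1"
    tahoe2 = Path(tmp) / "tahoe2"
    for node_dir in (tahoe1, tahoe2):
        (node_dir / "private").mkdir(parents=True)

    # Two clients that each generated their own (made-up) secret.
    (tahoe1 / "private" / "convergence").write_text("examplesecretone\n")
    (tahoe2 / "private" / "convergence").write_text("examplesecrettwo\n")
    print(same_convergence_secret(str(tahoe1), str(tahoe2)))  # prints False

    # Equivalent of the cp command above.
    shutil.copy(tahoe1 / "private" / "convergence",
                tahoe2 / "private" / "convergence")
    print(same_convergence_secret(str(tahoe1), str(tahoe2)))  # prints True
```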