The Tahoe-LAFS decentralized secure filesystem.

Welcome to the Allmydata-Tahoe project.  This project implements a secure,
distributed, fault-tolerant storage grid.  All of the source code is available 
under a Free Software licence.

The basic idea is that the data in this storage grid is spread over all
participating nodes, using an algorithm that can recover the data even if a
majority of the nodes are no longer available.
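
As a rough illustration of how such recovery is possible, here is a toy
"any k of n" erasure code in modern Python 3.  It is only a sketch: the field
(integers modulo 257), the function names, and the 3-of-10 parameters are all
assumptions chosen for readability, and none of it reflects the API or the
implementation of the zfec library that Tahoe actually uses.

```python
# Toy k-of-n erasure code: systematic Reed-Solomon over the prime field GF(257).
# Illustration only; this is NOT zfec's API or implementation.

P = 257  # smallest prime above 255, so every byte value is a field element


def interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, P - 2, P) is the modular inverse of den (Fermat's little theorem)
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total


def encode(block, n):
    """Turn a block of k bytes into n shares (x, y); any k of them suffice."""
    points = [(x, byte) for x, byte in enumerate(block, start=1)]
    return [(x, interpolate_at(points, x)) for x in range(1, n + 1)]


def decode(shares, k):
    """Rebuild the original k bytes from any k distinct shares."""
    return bytes(interpolate_at(shares[:k], x) for x in range(1, k + 1))


shares = encode(b"abc", n=10)                   # k = 3 data bytes, 10 shares
survivors = [shares[3], shares[7], shares[9]]   # 7 of the 10 shares are lost
recovered = decode(survivors, k=3)
print(recovered)                                # prints b'abc'
```

Here seven of the ten shares (a majority) are discarded and the three-byte
block is still rebuilt from the remaining three.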

The interface to the storage grid allows you to store and fetch files, either
by self-authenticating cryptographic identifier or by filename and path.

See the web site for information, news, community contributions, and
prebuilt packages for Debian-like systems:

http://allmydata.org


LICENCE:

 Tahoe is offered under the GNU General Public License (v2 or later), with
 the added permission that, if you become obligated to release a derived work
 under this licence (as per section 2.b), you may delay the fulfillment of
 this obligation for up to 12 months.  See the COPYING file for details.


GETTING THE SOURCE CODE:

The code is available via darcs by running the following command:

darcs get http://allmydata.org/source/tahoe/trunk tahoe

Tarballs of sources are available at:

http://allmydata.org/source/tahoe/


DEPENDENCIES:

Note: All of the following dependencies can probably be installed through
your standard package management tool if you are running on a modern Unix
operating system.

For example, on a Debian-like system, you can do "sudo apt-get install
gcc make python-dev python-twisted python-nevow python-pyopenssl".

 + a C compiler (language)

 + GNU make (build tool)

 + Python 2.4 or newer (tested against 2.4 and 2.5.1; on Windows-native,
   Python 2.5 or higher is required), including development headers (language)

   http://python.org/

 + Python Twisted (tested against both 2.4 and 2.5) (network and operating
   system integration library)

   http://twistedmatrix.com/

   You need the following subpackages, which are included in the default
   Twisted distribution:

   * core (the standard Twisted package)
   * web, trial, conch

   Twisted requires zope.interface, a copy of which is included in the
   Twisted distribution.

 + Python Nevow (0.9.18 or later) (web presentation language)

   http://divmod.org/trac/wiki/DivmodNevow

 + Python setuptools (build and distribution tool)

   Note: The build process will automatically download and install
   setuptools if it is not present.  However, if an old, incompatible
   version of setuptools (< v0.6c3) is present, then the build will fail.
   Therefore, if the build fails due to setuptools not being compatible,
   you can either upgrade or uninstall your version of setuptools and try
   again.

   http://peak.telecommunity.com/DevCenter/EasyInstall#installation-instructions

 + Python PyOpenSSL (0.6 or later) (secure transport layer)

   http://pyopenssl.sourceforge.net

   To install PyOpenSSL on Windows-native, download this:
   http://allmydata.org/source/pyOpenSSL-0.6.win32-py2.5.exe

   To install PyOpenSSL on Windows-cygwin, install the OpenSSL development
   libraries with the cygwin package management tool, then get the pyOpenSSL
   source code, cd into it, and run "python ./setup.py install".

 + the pywin32 package: only required on Windows

   http://sourceforge.net/projects/pywin32/

   (Tested with build 210, and known not to work with build 204.
   Feedback with details of other builds is greatly appreciated.)


Tahoe uses a few additional libraries which are included in this source
distribution for convenience. These will be automatically built when you type
'make', but if you have separate installations of them you may wish to modify
the makefile to use those in preference to the included versions. They
include Foolscap (a secure remote-object-invocation library), zfec (erasure
coding), and a modified version of PyCrypto (enhanced to provide a faster
CTR-mode API).


BUILDING:

 Just type 'make' in the top-level tahoe directory.  This works on Windows
 too, provided that you have the dependencies mentioned above.  (Either a
 normal cygwin build or a mingw-style native build will be done by the
 makefile, depending on whether the version of python that you have installed
 is the Windows-native python or the cygwin python.)

 If the desired version of 'python' is not already on your PATH, then type
 'make PYTHON=/path/to/your/preferred/python'.

 'make test-all' runs the unit test suites.  (This can take a long time on
 slow computers.  There are a lot of tests and some of them do a lot of
 public-key cryptography.)


INSTALLING:

There are three ways to install Tahoe.  Choose one:

 The Debian Way:

  The Debian Way is to build .deb files which you can then install with
  "dpkg".

  This requires the Debian packages build-essential, fakeroot, devscripts,
  and the packages listed as "Build-Depends" in the DIST/debian/control file
  in the top-level tahoe directory, replacing the word DIST with etch,
  dapper, edgy, or feisty as appropriate.

  If you're running on a Debian system, run 'make deb-dapper', 'make
  deb-sid', 'make deb-edgy', or 'make deb-feisty' from within the tahoe
  top-level directory to construct two Debian packages named
  'allmydata-tahoe' and 'python-foolscap', which you can then install with
  dpkg.

 The Python Way:

  The Python Way is to execute "setup.py install" for each Python package.

  You'll need to run "setup.py install" four separate times, one for each of
  the four subpackages (allmydata, allmydata.Crypto, foolscap, and zfec).

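    for PACKAGE in zfec Crypto foolscap ; do
      # use a subshell so that a failed install cannot leave you stranded
      # in the wrong directory for the next iteration
      (cd src/${PACKAGE} && python setup.py install)
    done

    # the setup.py script for the allmydata (tahoe) package itself is in
    # the root directory
    python setup.py install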

 The Running-In-Place Way:

  The Running-In-Place Way is to add a directory to your PYTHONPATH.

  To run from a source tree (without installing first), type 'make', which
  will put all the necessary libraries into a local directory named
  "./instdir/lib", which you can then add to your PYTHONPATH.  (It will put
  executables into "./instdir/bin".)
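
  For a single script, a similar effect can be had by extending sys.path at
  startup instead of setting PYTHONPATH.  The sketch below is written for a
  modern Python 3 interpreter; the helper name use_instdir is hypothetical,
  and the only path it relies on is the "./instdir/lib" directory named
  above.  Unlike PYTHONPATH, this changes only the current interpreter, so
  child processes will not see it.

```python
import os
import sys


def use_instdir(source_root="."):
    """Prepend <source_root>/instdir/lib to sys.path and return that path."""
    libdir = os.path.abspath(os.path.join(source_root, "instdir", "lib"))
    if libdir not in sys.path:
        sys.path.insert(0, libdir)
    return libdir


if __name__ == "__main__":
    # Assumes the script is started from the top of the source tree.
    print(use_instdir("."))
```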


TESTING THAT IT IS PROPERLY INSTALLED:

 To test that all the modules got installed properly, start a python
 interpreter and import modules as follows.  If each one imports successfully
 instead of raising ImportError then it is correctly installed.

   % python
   Python 2.4.4 (#2, Jan 13 2007, 17:50:26)
   [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)] on linux2
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import zfec
   >>> import allmydata.Crypto
   >>> import foolscap
   >>> import allmydata.interfaces


RUNNING:

 If you installed one of the Debian packages constructed by "make deb-*",
 then the 'allmydata-tahoe' executable is installed, usually in /usr/bin.
 Otherwise, you can find allmydata-tahoe in ./instdir/bin/.  This tool is used to
 create, start, and stop nodes.  Each node lives in a separate base
 directory, inside of which you can add files to configure and control the
 node.  Nodes also read and write files within that directory.

 A grid consists of a single central 'introducer and vdrive' node and one or
 more 'client' nodes.  If you are joining an existing grid, the
 introducer-and-vdrive node will already be running, and you'll just need to
 create a client node.  If you're creating a brand new grid, you'll need to
 create both an introducer-and-vdrive and a client (and then invite other
 people to create their own client nodes and join your grid).

 The introducer (-and-vdrive) node is constructed by running 'allmydata-tahoe
 create-introducer --basedir $HERE'.  Once constructed, you can start the
 introducer by running 'allmydata-tahoe start --basedir $HERE' (or, if you
 are already in the introducer's base directory, just type 'allmydata-tahoe
 start').  Inside that base directory, there will be a pair of files
 'introducer.furl' and 'vdrive.furl'.  Make a copy of these, as they'll be
 needed on the client nodes.

 To construct a client node, pick a new working directory for it, then run
 'allmydata-tahoe create-client --basedir $HERE'.  Copy the two .furl files
 from the introducer into this new directory, then run 'allmydata-tahoe start
 --basedir $HERE'.  After that, the client node should be off and running.
 The first thing it will do is connect to the introducer and introduce itself
 to all other nodes on the grid.  You can follow its progress by looking at
 the $HERE/twistd.log file.
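
 The copying step can be scripted.  This is a minimal sketch for a modern
 Python 3 interpreter; the helper name copy_furls and the example directory
 names are hypothetical, while the two file names are the ones given above.
 It does not run 'allmydata-tahoe' itself.

```python
import os
import shutil

# The two files the introducer writes into its base directory.
FURL_FILES = ("introducer.furl", "vdrive.furl")


def copy_furls(introducer_dir, client_dir):
    """Copy both .furl files into client_dir and return the new paths."""
    copied = []
    for name in FURL_FILES:
        src = os.path.join(introducer_dir, name)
        dst = os.path.join(client_dir, name)
        shutil.copyfile(src, dst)   # raises an OSError subclass if src is missing
        copied.append(dst)
    return copied


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        # usage (hypothetical): python copy_furls.py INTRODUCER_DIR CLIENT_DIR
        print(copy_furls(sys.argv[1], sys.argv[2]))
```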

 To actually use the client, enable the web interface by writing a port
 number (like "8080") into a file named $HERE/webport and then restarting the
 node with 'allmydata-tahoe restart --basedir $HERE'. This will prompt the
 client node to run a webserver on the desired port, through which you can
 view, upload, download, and delete files.  The 'webport' file actually holds
 a "strports specification", so you can have it listen only on a local
 interface by writing "tcp:8080:interface=127.0.0.1" to this file, or make it
 use SSL by writing "ssl:8443:privateKey=mykey.pem:certKey=cert.pem" instead.
 The strports format is defined in
 http://twistedmatrix.com/documents/current/api/twisted.application.strports.html

 A client node directory can also be created without installing the code
 first.  Just use 'make create-client', and a new directory named 'CLIENTDIR'
 will be created inside the top of the source tree.  Copy the relevant .furl
 files in, set the webport, then start the node by using 'make start-client'.
 To stop it again, use 'make stop-client'.  Similar makefile targets exist
 for making and running an introducer node.
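
 Setting the webport, as described earlier, amounts to writing one short
 string into a file in the node's base directory.  A minimal sketch for a
 modern Python 3 interpreter follows; the helper names write_webport and
 local_only are hypothetical, and the strings written are the examples from
 this README.  The node must still be restarted afterwards.

```python
import os


def local_only(port):
    """Build a spec that listens only on 127.0.0.1, in the form shown earlier."""
    return "tcp:%d:interface=127.0.0.1" % port


def write_webport(basedir, spec):
    """Write spec (e.g. "8080") to <basedir>/webport and return the file's path."""
    path = os.path.join(basedir, "webport")
    with open(path, "w") as f:
        f.write(spec)   # written exactly as given, with no trailing newline
    return path


if __name__ == "__main__":
    print(local_only(8080))
```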

 If you are behind a firewall and you can configure your firewall to forward
 TCP connections on a port to the computer running your Tahoe node, then you
 can configure the Tahoe node to announce itself as being available on that
 IP address and port.  The way to do this is to create a file named
 $HERE/advertised_ip_addresses, in which you can put IP addresses and port
 numbers in "dotted-quad:port" form, e.g. "209.97.232.113:1345".  You can put
 multiple IP-address-and-port-number entries into this file, on separate
 lines.
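
 A small script can validate entries before they are written.  This is a
 sketch for a modern Python 3 interpreter; the helper names format_entry and
 write_advertised are hypothetical, and the validation rules (a valid IPv4
 address, a port between 1 and 65535) are assumptions about what counts as a
 sensible entry rather than checks that Tahoe is known to perform.

```python
import ipaddress
import os


def format_entry(ip, port):
    """Return "dotted-quad:port", raising ValueError on malformed input."""
    address = ipaddress.IPv4Address(ip)   # ValueError if not a valid IPv4 address
    if not 0 < int(port) < 65536:
        raise ValueError("port out of range: %r" % (port,))
    return "%s:%d" % (address, int(port))


def write_advertised(basedir, entries):
    """Write one "ip:port" entry per line to <basedir>/advertised_ip_addresses."""
    lines = [format_entry(ip, port) for ip, port in entries]
    path = os.path.join(basedir, "advertised_ip_addresses")
    with open(path, "w") as f:
        f.write("\n".join(lines))
    return path


if __name__ == "__main__":
    print(format_entry("209.97.232.113", 1345))   # prints 209.97.232.113:1345
```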

 There is a public grid available for testing.  Look at the wiki page
 (http://allmydata.org) for the necessary .furl data.