NEW VERSION RELEASED -- Allmydata-Tahoe version 0.6

We are pleased to announce the release of version 0.6 of
allmydata.org "Tahoe", a secure, decentralized storage grid under a
free-software licence. This is the successor to v0.5.1, which was
released August 23, 2007 (see [1]).

Since v0.5.1 we've made the following changes:

 * Package Tahoe with setuptools/easy_install. Other libraries that
   Tahoe depends upon are now installed automatically when Tahoe is
   installed. It also means that people who have Python and the
   easy_install tool can execute "easy_install allmydata-tahoe" on
   the command line (including on Windows), and it will download and
   install Tahoe. (tickets #82, #93, #130)

 * We did performance profiling of various kinds -- upload/download
   throughput, memory usage, CPU usage, storage efficiency. The
   results showed that the current version is reasonably efficient
   on those metrics, for the loads that we tested. See The
   Performance Page [2] for details.

 * Distribute shares more evenly onto servers -- this makes files
   more reliable when there are few servers. (ticket #132)

 * Memory usage during download now remains low, even if your node
   is streaming the downloaded content to a slow web browser over
   HTTP. (ticket #129)

 * Shares have a version number in them so that in the future we can
   upgrade the share format without losing old data. (ticket #90)

 * Improved logging, thanks to Arno.

 * Shares now contain leases, which give us the information to
   compute which shares are safe to delete, but we haven't yet
   implemented deletion itself. Eventually, this will enable client
   quota tracking. (tickets #119, #67)

We also fixed other bugs and implemented other improvements. For
complete details, see this web page, which shows all ticket changes,
repository checkins, and wiki changes from August 24 to today,
September 24: [3].

Allmydata.org Tahoe v0.6 is incompatible with Allmydata.org Tahoe
v0.5.1 because the share format now includes a version number and
leases.
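
To illustrate why a version number inside each share makes later
format upgrades possible, here is a minimal sketch. It assumes a
hypothetical layout (a 4-byte big-endian version number followed by
the payload); this is not Tahoe's actual share format, and the names
pack_share, unpack_share, and UnknownShareVersion are invented for
the example.

```python
import struct

# Hypothetical layout, for illustration only: a 4-byte big-endian
# version number followed by the share payload. Tahoe's real share
# format is different.
HEADER = struct.Struct(">L")
SUPPORTED_VERSIONS = {1}


class UnknownShareVersion(Exception):
    """Raised when a share carries a version this reader cannot parse."""


def pack_share(version, payload):
    """Prefix the payload bytes with the format version."""
    return HEADER.pack(version) + payload


def unpack_share(data):
    """Return (version, payload), refusing versions we do not support."""
    if len(data) < HEADER.size:
        raise ValueError("share is too short to contain a version header")
    (version,) = HEADER.unpack_from(data)
    if version not in SUPPORTED_VERSIONS:
        raise UnknownShareVersion(version)
    return version, data[HEADER.size:]
```

A reader built this way can keep a parser per supported version and
reject anything newer cleanly, instead of misreading the bytes. A
share written with no version field at all offers no such hook.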

WHAT IS IT GOOD FOR?

With Tahoe, you can store your files in a distributed way across a
set of computers, such that if some of the computers fail or become
unavailable, you can still retrieve your data from the remaining
computers. You can also securely share your files with other users.

This release is targeted at hackers and users who are willing to use
a text-oriented web user interface, or a command-line user
interface. (Or a RESTful API. Just telnet to localhost and type HTTP
requests to get started.)

Because this software is new, it is not yet recommended for storage
of highly confidential data nor for important data which is not
otherwise backed up. Given that caveat, this software works and
there are no known security flaws which would compromise
confidentiality or data integrity.

This release of Tahoe is suitable for the "friendnet" use case [4].
It is easy to set up a private grid which is securely shared among a
specific, limited set of friends. Files uploaded to this shared grid
will be available to all friends, even when some of the computers
are unavailable. It is also easy to encrypt individual files and
directories so that only designated recipients can read them.

LICENCE

Tahoe is offered under the GNU General Public License (v2 or later),
with the added permission that, if you become obligated to release a
derived work under this licence (as per section 2.b), you may delay
the fulfillment of this obligation for up to 12 months. If you are
obligated to release code under section 2.b of this licence, you are
obligated to release it under these same terms, including the
12-month grace period clause.

INSTALLATION

Tahoe works on Linux, Mac OS X, Windows, Cygwin, and Solaris. For
installation instructions please see the README [5].

USAGE - web interface

Once installed, create a "client node". Instruct this client node to
connect to a specific "introducer node" by means of config files in
the client node's working directory.
To join a grid, copy in the .furl files for that grid. To create a
private grid, run your own introducer, and copy its .furl files. See
the README for step-by-step instructions.

Each client node can run a local webserver (enabled by writing the
desired port number into a file called 'webport'). The welcome page
of this webserver shows the node's status, including which
introducer is being used and which other nodes are connected.

Links from the welcome page lead to other pages that give access to
a virtual filesystem, in which each directory is represented by a
separate page. Each directory page shows a list of the files
available there, with download links, and forms to upload new files.

USAGE - command-line interface

Run "allmydata-tahoe ls [VIRTUAL PATH NAME]" to list the contents of
a virtual directory.
Run "allmydata-tahoe get [VIRTUAL FILE NAME] [LOCAL FILE NAME]" to
download a file.
Run "allmydata-tahoe put [LOCAL FILE NAME] [VIRTUAL FILE NAME]" to
upload a file.
Run "allmydata-tahoe rm [VIRTUAL PATH NAME]" to unlink a file or
directory in the virtual drive.

USAGE - other

You can control the filesystem through the RESTful web API [6].
Other ways to access the filesystem are planned: please see the
roadmap.txt [7] for some plans.

HACKING AND COMMUNITY

Please join the mailing list [8] to discuss the ideas behind Tahoe
and extensions of and uses of Tahoe. Patches that extend and improve
Tahoe are gratefully accepted -- roadmap.txt [7] shows the next
improvements that we plan to make and CREDITS [9] lists the names of
people who've contributed to the project. The wiki Dev page [10]
collects various hacking resources including revision history
browsing, automated test results (including code coverage),
automated performance tests, graphs of how many people are using the
public test grid for how many files, and more.

NETWORK ARCHITECTURE

Each peer maintains a connection to each other peer.
A single distinct server called an "introducer" is used to discover
other peers with which to connect.

To store a file, the file is encrypted and erasure coded, and each
resulting share is uploaded to a different peer. The secure hash of
the encrypted file and the encryption key are packed into a URI,
knowledge of which is necessary and sufficient to recover the file.

To fetch a file, starting with the URI, a subset of shares is
downloaded from peers, the file is reconstructed from the shares,
and then decrypted.

A single distinct server called a "vdrive server" maintains a global
mapping from pathnames/filenames to URIs.

We are acutely aware of the limitations on decentralization and
scalability inherent in this version. In particular, the
completely-connected property of the grid and the requirement of a
single distinct introducer and vdrive server limit the possible size
of the grid. We have plans to loosen these limitations (see
roadmap.txt). Note that the grid already depends as little as
possible on the accessibility and correctness of the introducer and
the vdrive server. Also note that the choice of which servers to use
is easily configured -- you can set up a private grid for you and
your friends as easily as connecting to our public test grid.

SOFTWARE ARCHITECTURE

Tahoe is a "from the ground up" rewrite, inspired by Allmydata's
existing consumer backup service as well as by its p2p ancestor Mojo
Nation. It is primarily written in the Python programming language.

Tahoe is based on the Foolscap library [11] which provides a remote
object protocol inspired by the capability-secure "E" programming
language [12]. Foolscap allows us to express the intended behavior
of the distributed grid directly in object-oriented terms while
relying on a well-engineered, secure transport layer.

The network layer is provided by the Twisted library [13].
Computationally intensive operations are performed in native
compiled code, such as the "zfec" library for fast erasure coding
(also available separately: [14]).

SPONSORSHIP

Tahoe is sponsored by Allmydata, Inc. [15], a provider of consumer
backup services. Allmydata, Inc. contributes hardware, software,
ideas, bug reports, suggestions, demands, and money (employing
several allmydata.org Tahoe hackers and allowing them to spend part
of their work time on the next-generation, free-software project).
We are eternally grateful!


Zooko O'Whielacronx and Brian Warner
on behalf of the allmydata.org Tahoe team
September 24, 2007
Boulder, Colorado and San Francisco, California

[1] http://allmydata.org/trac/tahoe/browser/relnotes.txt?rev=1154
[2] http://allmydata.org/trac/tahoe/wiki/Performance
[3] http://allmydata.org/trac/tahoe/timeline?from=2007-09-24&daysback=30&changeset=on&milestone=on&ticket=on&ticket_details=on&wiki=on&update=Update
[4] http://allmydata.org/trac/tahoe/wiki/UseCases
[5] http://allmydata.org/trac/tahoe/browser/README?rev=1338
[6] http://allmydata.org/trac/tahoe/browser/docs/webapi.txt?rev=1151
[7] http://allmydata.org/trac/tahoe/browser/roadmap.txt
[8] http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
[9] http://allmydata.org/trac/tahoe/browser/CREDITS?rev=1270
[10] http://allmydata.org/trac/tahoe/wiki/Dev
[11] http://twistedmatrix.com/trac/wiki/FoolsCap
[12] http://erights.org/
[13] http://twistedmatrix.com/
[14] http://allmydata.org/source/zfec/zfec/
[15] http://allmydata.com