mirror of
https://github.com/tahoe-lafs/tahoe-lafs.git
synced 2024-12-21 13:57:51 +00:00
1148 lines
56 KiB
Plaintext
1148 lines
56 KiB
Plaintext
User visible changes in Tahoe-LAFS. -*- outline; coding: utf-8 -*-
|
||
|
||
* Release 1.8.0α (2010-07-31)
|
||
|
||
** Bugfixes and Improvements
|
||
|
||
- Fix bug in WUI in Python 2.5 when the system clock is set back to
|
||
1969. (#1055)
|
||
|
||
* Release 1.7.1 (2010-07-18)
|
||
|
||
** Bugfixes and Improvements
|
||
|
||
- Fix bug in which uploader could fail with AssertionFailure or
|
||
report that it had achieved servers-of-happiness when it
|
||
hadn't. (#1118)
|
||
- Fix bug in which servers could get into a state where they would
|
||
refuse to accept shares of a certain file (#1117)
|
||
- Add init scripts for managing the gateway server on Debian/Ubuntu
|
||
(#961)
|
||
- Fix bug where server version number was always 0 on the welcome
|
||
page (#1067)
|
||
- Add new command-line command "tahoe unlink" as a synonym for "tahoe
|
||
rm" (#776)
|
||
- The FTP frontend now encrypts its temporary files, protecting their
|
||
contents from an attacker who is able to read the disk. (#1083)
|
||
- Fix IP address detection on FreeBSD 7, 8, and 9 (#1098)
|
||
- Fix minor layout issue in the Web User Interface with Internet
|
||
Explorer (#1097)
|
||
- Fix rarely-encountered incompatibility between Twisted logging
|
||
utility and the new unicode support added in v1.7.0 (#1099)
|
||
- Forward-compatibility improvements for non-ASCII caps (#1051)
|
||
|
||
** Code improvements
|
||
|
||
- Simplify and tidy-up directories, unicode support, test code (#923, #967,
|
||
#1072)
|
||
|
||
* Release 1.7.0 (2010-06-18)
|
||
|
||
** New Features
|
||
|
||
*** SFTP support
|
||
|
||
Your Tahoe-LAFS gateway now acts like a full-fledged SFTP server. It has been
|
||
tested with sshfs to provide a virtual filesystem in Linux. Many users have
|
||
asked for this feature. We hope that it serves them well! See the
|
||
docs/frontends/FTP-and-SFTP.txt document to get started.
|
||
|
||
*** support for non-ASCII character encodings
|
||
|
||
Tahoe-LAFS now correctly handles filenames containing non-ASCII characters on
|
||
all supported platforms:
|
||
|
||
- when reading files in from the local filesystem (such as when you run "tahoe
|
||
backup" to back up your local files to a Tahoe-LAFS grid);
|
||
|
||
- when writing files out to the local filesystem (such as when you run "tahoe
|
||
cp -r" to recursively copy files out of a Tahoe-LAFS grid);
|
||
|
||
- when displaying filenames to the terminal (such as when you run "tahoe ls"),
|
||
subject to limitations of the terminal and locale;
|
||
|
||
- when parsing command-line arguments, except on Windows.
|
||
|
||
*** Servers of Happiness
|
||
|
||
Tahoe-LAFS now measures during immutable file upload to see how well
|
||
distributed it is across multiple servers. It aborts the upload if the pieces
|
||
of the file are not sufficiently well-distributed.
|
||
|
||
This behavior is controlled by a configuration parameter called "servers of
|
||
happiness". With the default settings for its erasure coding, Tahoe-LAFS
|
||
generates 10 shares for each file, such that any 3 of those shares are
|
||
sufficient to recover the file. The default value of "servers of happiness" is
|
||
7, which means that Tahoe-LAFS will guarantee that there are at least 7 servers
|
||
holding some of the shares, such that any 3 of those servers can completely
|
||
recover your file.
|
||
|
||
The new upload code also distributes the shares better than the previous
|
||
version in some cases and takes better advantage of pre-existing shares (when a
|
||
file has already been previously uploaded). See the architecture.txt document
|
||
[3] for details.
|
||
|
||
** Bugfixes and Improvements
|
||
|
||
- Premature abort of upload if some shares were already present and some
|
||
servers fail. (#608)
|
||
- python ./setup.py install -- can't create or remove files in install
|
||
directory. (#803)
|
||
- Network failure => internal TypeError. (#902)
|
||
- Install of Tahoe on CentOS 5.4. (#933)
|
||
- CLI option --node-url now supports https url. (#1028)
|
||
- HTML/CSS template files were not correctly installed under Windows. (#1033)
|
||
- MetadataSetter does not enforce restriction on setting "tahoe" subkeys.
|
||
(#1034)
|
||
- ImportError: No module named setuptools_darcs.setuptools_darcs. (#1054)
|
||
- Renamed Title in xhtml files. (#1062)
|
||
- Increase Python version dependency to 2.4.4, to avoid a critical CPython
|
||
security bug. (#1066)
|
||
- Typo correction for the munin plugin tahoe_storagespace. (#968)
|
||
- Fix warnings found by pylint. (#973)
|
||
- Changing format of some documentation files. (#1027)
|
||
- the misc/ directory was tied up. (#1068)
|
||
- The 'ctime' and 'mtime' metadata fields are no longer written except by
|
||
"tahoe backup". (#924)
|
||
- Unicode filenames in Tahoe-LAFS directories are normalized so that names
|
||
that differ only in how accents are encoded are treated as the same. (#1076)
|
||
- Various small improvements to documentation. (#937, #911, #1024, #1082)
|
||
|
||
** Removals
|
||
|
||
The 'tahoe debug consolidate' subcommand (for converting old allmydata Windows
|
||
client backups to a newer format) has been removed.
|
||
|
||
** Dependency Updates
|
||
|
||
the Python version dependency is raised to 2.4.4 in some cases (2.4.3 for
|
||
Redhat-based Linux distributions, 2.4.2 for UCS-2 builds) (#1066)
|
||
pycrypto >= 2.0.1
|
||
pyasn1 >= 0.0.8a
|
||
mock (only required by unit tests)
|
||
|
||
* Release 1.6.1 (2010-02-27)
|
||
|
||
** Bugfixes
|
||
|
||
*** Correct handling of Small Immutable Directories
|
||
|
||
Immutable directories can now be deep-checked and listed in the web UI in
|
||
all cases. (In v1.6.0, some operations, such as deep-check, on a directory
|
||
graph that included very small immutable directories, would result in an
|
||
exception causing the whole operation to abort.) (#948)
|
||
|
||
** Usability Improvements
|
||
|
||
Improved user interface messages and error reporting. (#681, #837, #939)
|
||
|
||
The timeouts for operation handles have been greatly increased, so that
|
||
you can view the results of an operation up to 4 days after it has
|
||
completed. After viewing them for the first time, the results are
|
||
retained for a further day. (#577)
|
||
|
||
* Release 1.6.0 (2010-02-01)
|
||
|
||
** New Features
|
||
|
||
*** Immutable Directories
|
||
|
||
Tahoe-LAFS can now create and handle immutable directories. (#607, #833, #931)
|
||
These are read just like normal directories, but are "deep-immutable", meaning
|
||
that all their children (and everything reachable from those children) must be
|
||
immutable objects (i.e. immutable or literal files, and other immutable
|
||
directories).
|
||
|
||
These directories must be created in a single webapi call that provides all
|
||
of the children at once. (Since they cannot be changed after creation, the
|
||
usual create/add/add sequence cannot be used.) They have URIs that start with
|
||
"URI:DIR2-CHK:" or "URI:DIR2-LIT:", and are described on the human-facing web
|
||
interface (aka the "WUI") with a "DIR-IMM" abbreviation (as opposed to "DIR"
|
||
for the usual read-write directories and "DIR-RO" for read-only directories).
|
||
|
||
Tahoe-LAFS releases before 1.6.0 cannot read the contents of an immutable
|
||
directory. 1.5.0 will tolerate their presence in a directory listing (and
|
||
display it as "unknown"). 1.4.1 and earlier cannot tolerate them: a DIR-IMM
|
||
child in any directory will prevent the listing of that directory.
|
||
|
||
Immutable directories are repairable, just like normal immutable files.
|
||
|
||
The webapi "POST t=mkdir-immutable" call is used to create immutable
|
||
directories. See docs/frontends/webapi.txt for details.
|
||
|
||
*** "tahoe backup" now creates immutable directories, backupdb has dircache
|
||
|
||
The "tahoe backup" command has been enhanced to create immutable directories
|
||
(in previous releases, it created read-only mutable directories) (#828). This
|
||
is significantly faster, since it does not need to create an RSA keypair for
|
||
each new directory. Also "DIR-IMM" immutable directories are repairable, unlike
|
||
"DIR-RO" read-only mutable directories at present. (A future Tahoe-LAFS release
|
||
should also be able to repair DIR-RO.)
|
||
|
||
In addition, the backupdb (used by "tahoe backup" to remember what it has
|
||
already copied) has been enhanced to store information about existing immutable
|
||
directories. This allows it to re-use directories that have moved but still
|
||
contain identical contents, or that have been deleted and later replaced. (The
|
||
1.5.0 "tahoe backup" command could only re-use directories that were in the
|
||
same place as they were in the immediately previous backup.) With this change,
|
||
the backup process no longer needs to read the previous snapshot out of the
|
||
Tahoe-LAFS grid, reducing the network load considerably. (#606)
|
||
|
||
A "null backup" (in which nothing has changed since the previous backup) will
|
||
require only two Tahoe-side operations: one to add an Archives/$TIMESTAMP
|
||
entry, and a second to update the Latest/ link. On the local disk side, it
|
||
will readdir() all your local directories and stat() all your local files.
|
||
|
||
If you've been using "tahoe backup" for a while, you will notice that your
|
||
first use of it after upgrading to 1.6.0 may take a long time: it must create
|
||
proper immutable versions of all the old read-only mutable directories. This
|
||
process won't take as long as the initial backup (where all the file contents
|
||
had to be uploaded too): it will require time proportional to the number and
|
||
size of your directories. After this initial pass, all subsequent passes
|
||
should take a tiny fraction of the time.
|
||
|
||
As noted above, Tahoe-LAFS versions earlier than 1.5.0 cannot list a directory
|
||
containing an immutable subdirectory. Tahoe-LAFS versions earlier than 1.6.0
|
||
cannot read the contents of an immutable directory.
|
||
|
||
The "tahoe backup" command has been improved to skip over unreadable objects
|
||
(like device files, named pipes, and files with permissions that prevent the
|
||
command from reading their contents), instead of throwing an exception and
|
||
terminating the backup process. It also skips over symlinks, because these
|
||
cannot be represented faithfully in the Tahoe-side filesystem. A warning
|
||
message will be emitted each time something is skipped. (#729, #850, #641)
|
||
|
||
*** "create-node" command added, "create-client" now implies --no-storage
|
||
|
||
The basic idea behind Tahoe-LAFS's client+server and client-only processes is
|
||
that you are creating a general-purpose Tahoe-LAFS "node" process, which has
|
||
several components that can be activated. Storage service is one of these
|
||
optional components, as is the Helper, FTP server, and SFTP server. Web gateway
|
||
functionality is nominally on this list, but it is always active; a future
|
||
release will make it optional. There are three special purpose servers that
|
||
can't currently be run as a component in a node: introducer, key-generator,
|
||
and stats-gatherer.
|
||
|
||
So now "tahoe create-node" will create a Tahoe-LAFS node process, and after
|
||
creation you can edit its tahoe.cfg to enable or disable the desired
|
||
services. It is a more general-purpose replacement for "tahoe create-client".
|
||
The default configuration has storage service enabled. For convenience, the
|
||
"--no-storage" argument makes a tahoe.cfg file that disables storage
|
||
service. (#760)
|
||
|
||
"tahoe create-client" has been changed to create a Tahoe-LAFS node without a
|
||
storage service. It is equivalent to "tahoe create-node --no-storage". This
|
||
helps to reduce the confusion surrounding the use of a command with "client" in
|
||
its name to create a storage *server*. Use "tahoe create-client" to create a
|
||
purely client-side node. If you want to offer storage to the grid, use
|
||
"tahoe create-node" instead.
|
||
|
||
In the future, other services will be added to the node, and they will be
|
||
controlled through options in tahoe.cfg . The most important of these
|
||
services may get additional --enable-XYZ or --disable-XYZ arguments to
|
||
"tahoe create-node".
|
||
|
||
** Performance Improvements
|
||
|
||
Download of immutable files begins as soon as the downloader has located the K
|
||
necessary shares (#928, #287). In both the previous and current releases, a
|
||
downloader will first issue queries to all storage servers on the grid to
|
||
locate shares before it begins downloading the shares. In previous releases of
|
||
Tahoe-LAFS, download would not begin until all storage servers on the grid had
|
||
replied to the query, at which point K shares would be chosen for download from
|
||
among the shares that were located. In this release, download begins as soon as
|
||
any K shares are located. This means that downloads start sooner, which is
|
||
particularly important if there is a server on the grid that is extremely slow
|
||
or even hung in such a way that it will never respond. In previous releases
|
||
such a server would have a negative impact on all downloads from that grid. In
|
||
this release, such a server will have no impact on downloads, as long as K
|
||
shares can be found on other, quicker, servers. This also means that
|
||
downloads now use the "best-alacrity" servers that they talk to, as measured by
|
||
how quickly the servers reply to the initial query. This might cause downloads
|
||
to go faster, especially on grids with heterogeneous servers or geographical
|
||
dispersion.
|
||
|
||
** Minor Changes
|
||
|
||
The webapi acquired a new "t=mkdir-with-children" command, to create and
|
||
populate a directory in a single call. This is significantly faster than
|
||
using separate "t=mkdir" and "t=set-children" operations (it uses one
|
||
gateway-to-grid roundtrip, instead of three or four). (#533)
|
||
|
||
The t=set-children (note the hyphen) operation is now documented in
|
||
docs/frontends/webapi.txt, and is the new preferred spelling of the old
|
||
t=set_children (with an underscore). The underscore version remains for
|
||
backwards compatibility. (#381, #927)
|
||
|
||
The tracebacks produced by errors in CLI tools should now be in plain text,
|
||
instead of HTML (which is unreadable outside of a browser). (#646)
|
||
|
||
The [storage]reserved_space configuration knob (which causes the storage
|
||
server to refuse shares when available disk space drops below a threshold)
|
||
should work on Windows now, not just UNIX. (#637)
|
||
|
||
"tahoe cp" should now exit with status "1" if it cannot figure out a suitable
|
||
target filename, such as when you copy from a bare filecap. (#761)
|
||
|
||
"tahoe get" no longer creates a zero-length file upon error. (#121)
|
||
|
||
"tahoe ls" can now list single files. (#457)
|
||
|
||
"tahoe deep-check --repair" should tolerate repair failures now, instead of
|
||
halting traversal. (#874, #786)
|
||
|
||
"tahoe create-alias" no longer corrupts the aliases file if it had
|
||
previously been edited to have no trailing newline. (#741)
|
||
|
||
Many small packaging improvements were made to facilitate the "tahoe-lafs"
|
||
package being included in Ubuntu. Several mac/win32 binary libraries were
|
||
removed, some figleaf code-coverage files were removed, a bundled copy of
|
||
darcsver-1.2.1 was removed, and additional licensing text was added.
|
||
|
||
Several DeprecationWarnings for python2.6 were silenced. (#859)
|
||
|
||
The checker --add-lease option would sometimes fail for shares stored
|
||
on old (Tahoe v1.2.0) servers. (#875)
|
||
|
||
The documentation for installing on Windows (docs/install.html) has been
|
||
improved. (#773)
|
||
|
||
For other changes not mentioned here, see
|
||
<http://allmydata.org/trac/tahoe/query?milestone=1.6.0&keywords=!~news-done>.
|
||
To include the tickets mentioned above, go to
|
||
<http://allmydata.org/trac/tahoe/query?milestone=1.6.0>.
|
||
|
||
|
||
* Release 1.5.0 (2009-08-01)
|
||
|
||
** Improvements
|
||
|
||
Uploads of immutable files now use pipelined writes, improving upload speed
|
||
slightly (10%) over high-latency connections. (#392)
|
||
|
||
Processing large directories has been sped up, by removing a O(N^2) algorithm
|
||
from the dirnode decoding path and retaining unmodified encrypted entries.
|
||
(#750, #752)
|
||
|
||
The human-facing web interface (aka the "WUI") received a significant CSS
|
||
makeover by Kevin Reid, making it much prettier and easier to read. The WUI
|
||
"check" and "deep-check" forms now include a "Renew Lease" checkbox,
|
||
mirroring the CLI --add-lease option, so leases can be added or renewed from
|
||
the web interface.
|
||
|
||
The CLI "tahoe mv" command now refuses to overwrite directories. (#705)
|
||
|
||
The CLI "tahoe webopen" command, when run without arguments, will now bring
|
||
up the "Welcome Page" (node status and mkdir/upload forms).
|
||
|
||
The 3.5MB limit on mutable files was removed, so it should be possible to
|
||
upload arbitrarily-sized mutable files. Note, however, that the data format
|
||
and algorithm remains the same, so using mutable files still requires
|
||
bandwidth, computation, and RAM in proportion to the size of the mutable file.
|
||
(#694)
|
||
|
||
This version of Tahoe-LAFS will tolerate directory entries that contain filecap
|
||
formats which it does not recognize: files and directories from the future.
|
||
This should improve the user experience (for 1.5.0 users) when we add new cap
|
||
formats in the future. Previous versions would fail badly, preventing the user
|
||
from seeing or editing anything else in those directories. These unrecognized
|
||
objects can be renamed and deleted, but obviously not read or written. Also
|
||
they cannot generally be copied. (#683)
|
||
|
||
** Bugfixes
|
||
|
||
deep-check-and-repair now tolerates read-only directories, such as the ones
|
||
produced by the "tahoe backup" CLI command. Read-only directories and mutable
|
||
files are checked, but not repaired. Previous versions threw an exception
|
||
when attempting the repair and failed to process the remaining contents. We
|
||
cannot yet repair these read-only objects, but at least this version allows
|
||
the rest of the check+repair to proceed. (#625)
|
||
|
||
A bug in 1.4.1 which caused a server to be listed multiple times (and
|
||
frequently broke all connections to that server) was fixed. (#653)
|
||
|
||
The plaintext-hashing code was removed from the Helper interface, removing
|
||
the Helper's ability to mount a partial-information-guessing attack. (#722)
|
||
|
||
** Platform/packaging changes
|
||
|
||
Tahoe-LAFS now runs on NetBSD, OpenBSD, ArchLinux, and NixOS, and on an
|
||
embedded system based on an ARM CPU running at 266 MHz.
|
||
|
||
Unit test timeouts have been raised to allow the tests to complete on
|
||
extremely slow platforms like embedded ARM-based NAS boxes, which may take
|
||
several hours to run the test suite. An ARM-specific data-corrupting bug in
|
||
an older version of Crypto++ (5.5.2) was identified: ARM-users are encouraged
|
||
to use recent Crypto++/pycryptopp which avoids this problem.
|
||
|
||
Tahoe-LAFS now requires a SQLite library, either the sqlite3 that comes
|
||
built-in with python2.5/2.6, or the add-on pysqlite2 if you're using
|
||
python2.4. In the previous release, this was only needed for the "tahoe backup"
|
||
command: now it is mandatory.
|
||
|
||
Several minor documentation updates were made.
|
||
|
||
To help get Tahoe-LAFS into Linux distributions like Fedora and Debian,
|
||
packaging improvements are being made in both Tahoe-LAFS and related libraries
|
||
like pycryptopp and zfec.
|
||
|
||
The Crypto++ library included in the pycryptopp package has been upgraded to
|
||
version 5.6.0 of Crypto++, which includes a more efficient implementation of
|
||
SHA-256 in assembly for x86 or amd64 architectures.
|
||
|
||
** dependency updates
|
||
|
||
foolscap-0.4.1
|
||
no python-2.4.0 or 2.4.1 (2.4.2 is good)
|
||
(they contained a bug in base64.b32decode)
|
||
avoid python-2.6 on windows with mingw: compiler issues
|
||
python2.4 requires pysqlite2 (2.5,2.6 does not)
|
||
no python-3.x
|
||
pycryptopp-0.5.15
|
||
|
||
|
||
* Release 1.4.1 (2009-04-13)
|
||
|
||
** Garbage Collection
|
||
|
||
The big feature for this release is the implementation of garbage collection,
|
||
allowing Tahoe storage servers to delete shares for old deleted files. When
|
||
enabled, this uses a "mark and sweep" process: clients are responsible for
|
||
updating the leases on their shares (generally by running "tahoe deep-check
|
||
--add-lease"), and servers are allowed to delete any share which does not
|
||
have an up-to-date lease. The process is described in detail in
|
||
docs/garbage-collection.txt .
|
||
|
||
The server must be configured to enable garbage-collection, by adding
|
||
directives to the [storage] section that define an age limit for shares. The
|
||
default configuration will not delete any shares.
|
||
|
||
Both servers and clients should be upgraded to this release to make the
|
||
garbage-collection as pleasant as possible. 1.2.0 servers have code to
|
||
perform the update-lease operation but it suffers from a fatal bug, while
|
||
1.3.0 servers have update-lease but will return an exception for unknown
|
||
storage indices, causing clients to emit an Incident for each exception,
|
||
slowing the add-lease process down to a crawl. 1.1.0 servers did not have the
|
||
add-lease operation at all.
|
||
|
||
** Security/Usability Problems Fixed
|
||
|
||
A super-linear algorithm in the Merkle Tree code was fixed, which previously
|
||
caused e.g. download of a 10GB file to take several hours before the first
|
||
byte of plaintext could be produced. The new "alacrity" is about 2 minutes. A
|
||
future release should reduce this to a few seconds by fixing ticket #442.
|
||
|
||
The previous version permitted a small timing attack (due to our use of
|
||
strcmp) against the write-enabler and lease-renewal/cancel secrets. An
|
||
attacker who could measure response-time variations of approximatly 3ns
|
||
against a very noisy background time of about 15ms might be able to guess
|
||
these secrets. We do not believe this attack was actually feasible. This
|
||
release closes the attack by first hashing the two strings to be compared
|
||
with a random secret.
|
||
|
||
** webapi changes
|
||
|
||
In most cases, HTML tracebacks will only be sent if an "Accept: text/html"
|
||
header was provided with the HTTP request. This will generally cause browsers
|
||
to get an HTMLized traceback but send regular text/plain tracebacks to
|
||
non-browsers (like the CLI clients). More errors have been mapped to useful
|
||
HTTP error codes.
|
||
|
||
The streaming webapi operations (deep-check and manifest) now have a way to
|
||
indicate errors (an output line that starts with "ERROR" instead of being
|
||
legal JSON). See docs/frontends/webapi.txt for details.
|
||
|
||
The storage server now has its own status page (at /storage), linked from the
|
||
Welcome page. This page shows progress and results of the two new
|
||
share-crawlers: one which merely counts shares (to give an estimate of how
|
||
many files/directories are being stored in the grid), the other examines
|
||
leases and reports how much space would be freed if GC were enabled. The page
|
||
also shows how much disk space is present, used, reserved, and available for
|
||
the Tahoe server, and whether the server is currently running in "read-write"
|
||
mode or "read-only" mode.
|
||
|
||
When a directory node cannot be read (perhaps because of insufficent shares),
|
||
a minimal webapi page is created so that the "more-info" links (including a
|
||
Check/Repair operation) will still be accessible.
|
||
|
||
A new "reliability" page was added, with the beginnings of work on a
|
||
statistical loss model. You can tell this page how many servers you are using
|
||
and their independent failure probabilities, and it will tell you the
|
||
likelihood that an arbitrary file will survive each repair period. The
|
||
"numpy" package must be installed to access this page. A partial paper,
|
||
written by Shawn Willden, has been added to docs/proposed/lossmodel.lyx .
|
||
|
||
** CLI changes
|
||
|
||
"tahoe check" and "tahoe deep-check" now accept an "--add-lease" argument, to
|
||
update a lease on all shares. This is the "mark" side of garbage collection.
|
||
|
||
In many cases, CLI error messages have been improved: the ugly HTMLized
|
||
traceback has been replaced by a normal python traceback.
|
||
|
||
"tahoe deep-check" and "tahoe manifest" now have better error reporting.
|
||
"tahoe cp" is now non-verbose by default.
|
||
|
||
"tahoe backup" now accepts several "--exclude" arguments, to ignore certain
|
||
files (like editor temporary files and version-control metadata) during
|
||
backup.
|
||
|
||
On windows, the CLI now accepts local paths like "c:\dir\file.txt", which
|
||
previously was interpreted as a Tahoe path using a "c:" alias.
|
||
|
||
The "tahoe restart" command now uses "--force" by default (meaning it will
|
||
start a node even if it didn't look like there was one already running).
|
||
|
||
The "tahoe debug consolidate" command was added. This takes a series of
|
||
independent timestamped snapshot directories (such as those created by the
|
||
allmydata.com windows backup program, or a series of "tahoe cp -r" commands)
|
||
and creates new snapshots that used shared read-only directories whenever
|
||
possible (like the output of "tahoe backup"). In the most common case (when
|
||
the snapshots are fairly similar), the result will use significantly fewer
|
||
directories than the original, allowing "deep-check" and similar tools to run
|
||
much faster. In some cases, the speedup can be an order of magnitude or more.
|
||
This tool is still somewhat experimental, and only needs to be run on large
|
||
backups produced by something other than "tahoe backup", so it was placed
|
||
under the "debug" category.
|
||
|
||
"tahoe cp -r --caps-only tahoe:dir localdir" is a diagnostic tool which,
|
||
instead of copying the full contents of files into the local directory,
|
||
merely copies their filecaps. This can be used to verify the results of a
|
||
"consolidation" operation.
|
||
|
||
** other fixes
|
||
|
||
The codebase no longer rauses RuntimeError as a kind of assert(). Specific
|
||
exception classes were created for each previous instance of RuntimeError.
|
||
|
||
Many unit tests were changed to use a non-network test harness, speeding them
|
||
up considerably.
|
||
|
||
Deep-traversal operations (manifest and deep-check) now walk individual
|
||
directories in alphabetical order. Occasional turn breaks are inserted to
|
||
prevent a stack overflow when traversing directories with hundreds of
|
||
entries.
|
||
|
||
The experimental SFTP server had its path-handling logic changed slightly, to
|
||
accomodate more SFTP clients, although there are still issues (#645).
|
||
|
||
|
||
* Release 1.3.0 (2009-02-13)
|
||
|
||
** Checker/Verifier/Repairer
|
||
|
||
The primary focus of this release has been writing a checker / verifier /
|
||
repairer for files and directories. "Checking" is the act of asking storage
|
||
servers whether they have a share for the given file or directory: if there
|
||
are not enough shares available, the file or directory will be
|
||
unrecoverable. "Verifying" is the act of downloading and cryptographically
|
||
asserting that the server's share is undamaged: it requires more work
|
||
(bandwidth and CPU) than checking, but can catch problems that simple
|
||
checking cannot. "Repair" is the act of replacing missing or damaged shares
|
||
with new ones.
|
||
|
||
This release includes a full checker, a partial verifier, and a partial
|
||
repairer. The repairer is able to handle missing shares: new shares are
|
||
generated and uploaded to make up for the missing ones. This is currently the
|
||
best application of the repairer: to replace shares that were lost because of
|
||
server departure or permanent drive failure.
|
||
|
||
The repairer in this release is somewhat able to handle corrupted shares. The
|
||
limitations are:
|
||
|
||
* Immutable verifier is incomplete: not all shares are used, and not all
|
||
fields of those shares are verified. Therefore the immutable verifier has
|
||
only a moderate chance of detecting corrupted shares.
|
||
* The mutable verifier is mostly complete: all shares are examined, and most
|
||
fields of the shares are validated.
|
||
* The storage server protocol offers no way for the repairer to replace or
|
||
delete immutable shares. If corruption is detected, the repairer will
|
||
upload replacement shares to other servers, but the corrupted shares will
|
||
be left in place.
|
||
* read-only directories and read-only mutable files must be repaired by
|
||
someone who holds the write-cap: the read-cap is insufficient. Moreover,
|
||
the deep-check-and-repair operation will halt with an error if it attempts
|
||
to repair one of these read-only objects.
|
||
* Some forms of corruption can cause both download and repair operations to
|
||
fail. A future release will fix this, since download should be tolerant of
|
||
any corruption as long as there are at least 'k' valid shares, and repair
|
||
should be able to fix any file that is downloadable.
|
||
|
||
If the downloader, verifier, or repairer detects share corruption, the
|
||
servers which provided the bad shares will be notified (via a file placed in
|
||
the BASEDIR/storage/corruption-advisories directory) so their operators can
|
||
manually delete the corrupted shares and investigate the problem. In
|
||
addition, the "incident gatherer" mechanism will automatically report share
|
||
corruption to an incident gatherer service, if one is configured. Note that
|
||
corrupted shares indicate hardware failures, serious software bugs, or malice
|
||
on the part of the storage server operator, so a corrupted share should be
|
||
considered highly unusual.
|
||
|
||
By periodically checking/repairing all files and directories, objects in the
|
||
Tahoe filesystem remain resistant to recoverability failures due to missing
|
||
and/or broken servers.
|
||
|
||
This release includes a wapi mechanism to initiate checks on individual
|
||
files and directories (with or without verification, and with or without
|
||
automatic repair). A related mechanism is used to initiate a "deep-check" on
|
||
a directory: recursively traversing the directory and its children, checking
|
||
(and/or verifying/repairing) everything underneath. Both mechanisms can be
|
||
run with an "output=JSON" argument, to obtain machine-readable check/repair
|
||
status results. These results include a copy of the filesystem statistics
|
||
from the "deep-stats" operation (including total number of files, size
|
||
histogram, etc). If repair is possible, a "Repair" button will appear on the
|
||
results page.
|
||
|
||
The client web interface now features some extra buttons to initiate check
|
||
and deep-check operations. When these operations finish, they display a
|
||
results page that summarizes any problems that were encountered. All
|
||
long-running deep-traversal operations, including deep-check, use a
|
||
start-and-poll mechanism, to avoid depending upon a single long-lived HTTP
|
||
connection. docs/frontends/webapi.txt has details.
|
||
|
||
** Efficient Backup
|
||
|
||
The "tahoe backup" command is new in this release, which creates efficient
|
||
versioned backups of a local directory. Given a local pathname and a target
|
||
Tahoe directory, this will create a read-only snapshot of the local directory
|
||
in $target/Archives/$timestamp. It will also create $target/Latest, which is
|
||
a reference to the latest such snapshot. Each time you run "tahoe backup"
|
||
with the same source and target, a new $timestamp snapshot will be added.
|
||
These snapshots will share directories that have not changed since the last
|
||
backup, to speed up the process and minimize storage requirements. In
|
||
addition, a small database is used to keep track of which local files have
|
||
been uploaded already, to avoid uploading them a second time. This
|
||
drastically reduces the work needed to do a "null backup" (when nothing has
|
||
changed locally), making "tahoe backup' suitable to run from a daily cronjob.
|
||
|
||
Note that the "tahoe backup" CLI command must be used in conjunction with a
|
||
1.3.0-or-newer Tahoe client node; there was a bug in the 1.2.0 webapi
|
||
implementation that would prevent the last step (create $target/Latest) from
|
||
working.
|
||
|
||
** Large Files
|
||
|
||
The 12GiB (approximate) immutable-file-size limitation is lifted. This
|
||
release knows how to handle so-called "v2 immutable shares", which permit
|
||
immutable files of up to about 18 EiB (about 3*10^14). These v2 shares are
|
||
created if the file to be uploaded is too large to fit into v1 shares. v1
|
||
shares are created if the file is small enough to fit into them, so that
|
||
files created with tahoe-1.3.0 can still be read by earlier versions if they
|
||
are not too large. Note that storage servers also had to be changed to
|
||
support larger files, and this release is the first release in which they are
|
||
able to do that. Clients will detect which servers are capable of supporting
|
||
large files on upload and will not attempt to upload shares of a large file
|
||
to a server which doesn't support it.
|
||
|
||
** FTP/SFTP Server
|
||
|
||
Tahoe now includes experimental FTP and SFTP servers. When configured with a
|
||
suitable method to translate username+password into a root directory cap, it
|
||
provides simple access to the virtual filesystem. Remember that FTP is
|
||
completely unencrypted: passwords, filenames, and file contents are all sent
|
||
over the wire in cleartext, so FTP should only be used on a local (127.0.0.1)
|
||
connection. This feature is still in development: there are no unit tests
|
||
yet, and behavior with respect to Unicode filenames is uncertain. Please see
|
||
docs/frontends/FTP-and-SFTP.txt for configuration details. (#512, #531)
|
||
|
||
** CLI Changes
|
||
|
||
This release adds the 'tahoe create-alias' command, which is a combination of
|
||
'tahoe mkdir' and 'tahoe add-alias'. This also allows you to start using a
|
||
new tahoe directory without exposing its URI in the argv list, which is
|
||
publicly visible (through the process table) on most unix systems. Thanks to
|
||
Kevin Reid for bringing this issue to our attention.
|
||
|
||
The single-argument form of "tahoe put" was changed to create an unlinked
|
||
file. I.e. "tahoe put bar.txt" will take the contents of a local "bar.txt"
|
||
file, upload them to the grid, and print the resulting read-cap; the file
|
||
will not be attached to any directories. This seemed a bit more useful than
|
||
the previous behavior (copy stdin, upload to the grid, attach the resulting
|
||
file into your default tahoe: alias in a child named 'bar.txt').
|
||
|
||
"tahoe put" was also fixed to handle mutable files correctly: "tahoe put
|
||
bar.txt URI:SSK:..." will read the contents of the local bar.txt and use them
|
||
to replace the contents of the given mutable file.
|
||
|
||
The "tahoe webopen" command was modified to accept aliases. This means "tahoe
|
||
webopen tahoe:" will cause your web browser to open to a "wui" page that
|
||
gives access to the directory associated with the default "tahoe:" alias. It
|
||
should also accept leading slashes, like "tahoe webopen tahoe:/stuff".
|
||
|
||
Many esoteric debugging commands were moved down into a "debug" subcommand:
|
||
|
||
tahoe debug dump-cap
|
||
tahoe debug dump-share
|
||
tahoe debug find-shares
|
||
tahoe debug catalog-shares
|
||
tahoe debug corrupt-share
|
||
|
||
The last command ("tahoe debug corrupt-share") flips a random bit of the
|
||
given local sharefile. This is used to test the file verifying/repairing
|
||
code, and obviously should not be used on user data.
|
||
|
||
The cli might not correctly handle arguments which contain non-ascii
|
||
characters in Tahoe v1.3 (although depending on your platform it
|
||
might, especially if your platform can be configured to pass such
|
||
characters on the command-line in utf-8 encoding). See
|
||
http://allmydata.org/trac/tahoe/ticket/565 for details.
|
||
|
||
** Web changes
|
||
|
||
The "default webapi port", used when creating a new client node (and in the
|
||
getting-started documentation), was changed from 8123 to 3456, to reduce
|
||
confusion when Tahoe accessed through a Firefox browser on which the
|
||
"Torbutton" extension has been installed. Port 8123 is occasionally used as a
|
||
Tor control port, so Torbutton adds 8123 to Firefox's list of "banned ports"
|
||
to avoid CSRF attacks against Tor. Once 8123 is banned, it is difficult to
|
||
diagnose why you can no longer reach a Tahoe node, so the Tahoe default was
|
||
changed. Note that 3456 is reserved by IANA for the "vat" protocol, but there
|
||
are argueably more Torbutton+Tahoe users than vat users these days. Note that
|
||
this will only affect newly-created client nodes. Pre-existing client nodes,
|
||
created by earlier versions of tahoe, may still be listening on 8123.
|
||
|
||
All deep-traversal operations (start-manifest, start-deep-size,
|
||
start-deep-stats, start-deep-check) now use a start-and-poll approach,
|
||
instead of using a single (fragile) long-running synchronous HTTP connection.
|
||
All these "start-" operations use POST instead of GET. The old "GET
|
||
manifest", "GET deep-size", and "POST deep-check" operations have been
|
||
removed.
|
||
|
||
The new "POST start-manifest" operation, when it finally completes, results
|
||
in a table of (path,cap), instead of the list of verifycaps produced by the
|
||
old "GET manifest". The table is available in several formats: use
|
||
output=html, output=text, or output=json to choose one. The JSON output also
|
||
includes stats, and a list of verifycaps and storage-index strings.
|
||
|
||
The "return_to=" and "when_done=" arguments have been removed from the
|
||
t=check and deep-check operations.
|
||
|
||
The top-level status page (/status) now has a machine-readable form, via
|
||
"/status/?t=json". This includes information about the currently-active
|
||
uploads and downloads, which may be useful for frontends that wish to display
|
||
progress information. There is no easy way to correlate the activities
|
||
displayed here with recent wapi requests, however.
|
||
|
||
Any files in BASEDIR/public_html/ (configurable) will be served in response
|
||
to requests in the /static/ portion of the URL space. This will simplify the
|
||
deployment of javascript-based frontends that can still access wapi calls
|
||
by conforming to the (regrettable) "same-origin policy".
|
||
|
||
The welcome page now has a "Report Incident" button, which is tied into the
|
||
"Incident Gatherer" machinery. If the node is attached to an incident
|
||
gatherer (via log_gatherer.furl), then pushing this button will cause an
|
||
Incident to be signalled: this means recent log events are aggregated and
|
||
sent in a bundle to the gatherer. The user can push this button after
|
||
something strange takes place (and they can provide a short message to go
|
||
along with it), and the relevant data will be delivered to a centralized
|
||
incident-gatherer for later processing by operations staff.
|
||
|
||
The "HEAD" method should now work correctly, in addition to the usual "GET",
|
||
"PUT", and "POST" methods. "HEAD" is supposed to return exactly the same
|
||
headers as "GET" would, but without any of the actual response body data. For
|
||
mutable files, this now does a brief mapupdate (to figure out the size of the
|
||
file that would be returned), without actually retrieving the file's
|
||
contents.
|
||
|
||
The "GET" operation on files can now support the HTTP "Range:" header,
|
||
allowing requests for partial content. This allows certain media players to
|
||
correctly stream audio and movies out of a Tahoe grid. The current
|
||
implementation uses a disk-based cache in BASEDIR/private/cache/download ,
|
||
which holds the plaintext of the files being downloaded. Future
|
||
implementations might not use this cache. GET for immutable files now returns
|
||
an ETag header.
|
||
|
||
Each file and directory now has a "Show More Info" web page, which contains
|
||
much of the information that was crammed into the directory page before. This
|
||
includes readonly URIs, storage index strings, object type, buttons to
|
||
control checking/verifying/repairing, and deep-check/deep-stats buttons (for
|
||
directories). For mutable files, the "replace contents" upload form has been
|
||
moved here too. As a result, the directory page is now much simpler and
|
||
cleaner, and several potentially-misleading links (like t=uri) are now gone.
|
||
|
||
Slashes are discouraged in Tahoe file/directory names, since they cause
|
||
problems when accessing the filesystem through the wapi. However, there are
|
||
a couple of accidental ways to generate such names. This release tries to
|
||
make it easier to correct such mistakes by escaping slashes in several
|
||
places, allowing slashes in the t=info and t=delete commands, and in the
|
||
source (but not the target) of a t=rename command.
|
||
|
||
** Packaging
|
||
|
||
Tahoe's dependencies have been extended to require the "[secure_connections]"
|
||
feature from Foolscap, which will cause pyOpenSSL to be required and/or
|
||
installed. If OpenSSL and its development headers are already installed on
|
||
your system, this can occur automatically. Tahoe now uses pollreactor
|
||
(instead of the default selectreactor) to work around a bug between pyOpenSSL
|
||
and the most recent release of Twisted (8.1.0). This bug only affects unit
|
||
tests (hang during shutdown), and should not impact regular use.
|
||
|
||
The Tahoe source code tarballs now come in two different forms: regular and
|
||
"sumo". The regular tarball contains just Tahoe, nothing else. When building
|
||
from the regular tarball, the build process will download any unmet
|
||
dependencies from the internet (starting with the index at PyPI) so it can
|
||
build and install them. The "sumo" tarball contains copies of all the
|
||
libraries that Tahoe requires (foolscap, twisted, zfec, etc), so using the
|
||
"sumo" tarball should not require any internet access during the build
|
||
process. This can be useful if you want to build Tahoe while on an airplane,
|
||
a desert island, or other bandwidth-limited environments.
|
||
|
||
Similarly, allmydata.org now hosts a "tahoe-deps" tarball which contains the
|
||
latest versions of all these dependencies. This tarball, located at
|
||
http://allmydata.org/source/tahoe/deps/tahoe-deps.tar.gz, can be unpacked in
|
||
the tahoe source tree (or in its parent directory), and the build process
|
||
should satisfy its downloading needs from it instead of reaching out to PyPI.
|
||
This can be useful if you want to build Tahoe from a darcs checkout while on
|
||
that airplane or desert island.
|
||
|
||
Because of the previous two changes ("sumo" tarballs and the "tahoe-deps"
|
||
bundle), most of the files have been removed from misc/dependencies/ . This
|
||
brings the regular Tahoe tarball down to 2MB (compressed), and the darcs
|
||
checkout (without history) to about 7.6MB. A full darcs checkout will still
|
||
be fairly large (because of the historical patches which included the
|
||
dependent libraries), but a 'lazy' one should now be small.
|
||
|
||
The default "make" target is now an alias for "setup.py build", which itself
|
||
is an alias for "setup.py develop --prefix support", with some extra work
|
||
before and after (see setup.cfg). Most of the complicated platform-dependent
|
||
code in the Makefile was rewritten in Python and moved into setup.py,
|
||
simplifying things considerably.
|
||
|
||
Likewise, the "make test" target now delegates most of its work to "setup.py
|
||
test", which takes care of getting PYTHONPATH configured to access the tahoe
|
||
code (and dependencies) that gets put in support/lib/ by the build_tahoe
|
||
step. This should allow unit tests to be run even when trial (which is part
|
||
of Twisted) wasn't already installed (in this case, trial gets installed to
|
||
support/bin because Twisted is a dependency of Tahoe).
|
||
|
||
Tahoe is now compatible with the recently-released Python 2.6 , although it
|
||
is recommended to use Tahoe on Python 2.5, on which it has received more
|
||
thorough testing and deployment.
|
||
|
||
Tahoe is now compatible with simplejson-2.0.x . The previous release assumed
|
||
that simplejson.loads always returned unicode strings, which is no longer the
|
||
case in 2.0.x .
|
||
|
||
** Grid Management Tools
|
||
|
||
Several tools have been added or updated in the misc/ directory, mostly munin
|
||
plugins that can be used to monitor a storage grid.
|
||
|
||
The misc/spacetime/ directory contains a "disk watcher" daemon (startable
|
||
with 'tahoe start'), which can be configured with a set of HTTP URLs
|
||
(pointing at the wapi '/statistics' page of a bunch of storage servers),
|
||
and will periodically fetch disk-used/disk-available information from all the
|
||
servers. It keeps this information in an Axiom database (a sqlite-based
|
||
library available from divmod.org). The daemon computes time-averaged rates
|
||
of disk usage, as well as a prediction of how much time is left before the
|
||
grid is completely full.
|
||
|
||
The misc/munin/ directory contains a new set of munin plugins
|
||
(tahoe_diskleft, tahoe_diskusage, tahoe_doomsday) which talk to the
|
||
disk-watcher and provide graphs of its calculations.
|
||
|
||
To support the disk-watcher, the Tahoe statistics component (visible through
|
||
the wapi at the /statistics/ URL) now includes disk-used and disk-available
|
||
information. Both are derived through an equivalent of the unix 'df' command
|
||
(i.e. they ask the kernel for the number of free blocks on the partition that
|
||
encloses the BASEDIR/storage directory). In the future, the disk-available
|
||
number will be further influenced by the local storage policy: if that policy
|
||
says that the server should refuse new shares when less than 5GB is left on
|
||
the partition, then "disk-available" will report zero even though the kernel
|
||
sees 5GB remaining.
|
||
|
||
The 'tahoe_overhead' munin plugin interacts with an allmydata.com-specific
|
||
server which reports the total of the 'deep-size' reports for all active user
|
||
accounts, compares this with the disk-watcher data, to report on overhead
|
||
percentages. This provides information on how much space could be recovered
|
||
once Tahoe implements some form of garbage collection.
|
||
|
||
** Configuration Changes: single INI-format tahoe.cfg file
|
||
|
||
The Tahoe node is now configured with a single INI-format file, named
|
||
"tahoe.cfg", in the node's base directory. Most of the previous
|
||
multiple-separate-files are still read for backwards compatibility (the
|
||
embedded SSH debug server and the advertised_ip_addresses files are the
|
||
exceptions), but new directives will only be added to tahoe.cfg . The "tahoe
|
||
create-client" command will create a tahoe.cfg for you, with sample values
|
||
commented out. (ticket #518)
|
||
|
||
tahoe.cfg now has controls for the foolscap "keepalive" and "disconnect"
|
||
timeouts (#521).
|
||
|
||
tahoe.cfg now has controls for the encoding parameters: "shares.needed" and
|
||
"shares.total" in the "[client]" section. The default parameters are still
|
||
3-of-10.
|
||
|
||
The inefficient storage 'sizelimit' control (which established an upper bound
|
||
on the amount of space that a storage server is allowed to consume) has been
|
||
replaced by a lightweight 'reserved_space' control (which establishes a lower
|
||
bound on the amount of remaining space). The storage server will reject all
|
||
writes that would cause the remaining disk space (as measured by a '/bin/df'
|
||
equivalent) to drop below this value. The "[storage]reserved_space="
|
||
tahoe.cfg parameter controls this setting. (note that this only affects
|
||
immutable shares: it is an outstanding bug that reserved_space does not
|
||
prevent the allocation of new mutable shares, nor does it prevent the growth
|
||
of existing mutable shares).
|
||
|
||
** Other Changes
|
||
|
||
Clients now declare which versions of the protocols they support. This is
|
||
part of a new backwards-compatibility system:
|
||
http://allmydata.org/trac/tahoe/wiki/Versioning .
|
||
|
||
The version strings for human inspection (as displayed on the Welcome web
|
||
page, and included in logs) now includes a platform identifer (frequently
|
||
including a linux distribution name, processor architecture, etc).
|
||
|
||
Several bugs have been fixed, including one that would cause an exception (in
|
||
the logs) if a wapi download operation was cancelled (by closing the TCP
|
||
connection, or pushing the "stop" button in a web browser).
|
||
|
||
Tahoe now uses Foolscap "Incidents", writing an "incident report" file to
|
||
logs/incidents/ each time something weird occurs. These reports are available
|
||
to an "incident gatherer" through the flogtool command. For more details,
|
||
please see the Foolscap logging documentation. An incident-classifying plugin
|
||
function is provided in misc/incident-gatherer/classify_tahoe.py .
|
||
|
||
If clients detect corruption in shares, they now automatically report it to
|
||
the server holding that share, if it is new enough to accept the report.
|
||
These reports are written to files in BASEDIR/storage/corruption-advisories .
|
||
|
||
The 'nickname' setting is now defined to be a UTF-8 -encoded string, allowing
|
||
non-ascii nicknames.
|
||
|
||
The 'tahoe start' command will now accept a --syslog argument and pass it
|
||
through to twistd, making it easier to launch non-Tahoe nodes (like the
|
||
cpu-watcher) and have them log to syslogd instead of a local file. This is
|
||
useful when running a Tahoe node out of a USB flash drive.
|
||
|
||
The Mac GUI in src/allmydata/gui/ has been improved.
|
||
|
||
|
||
* Release 1.2.0 (2008-07-21)
|
||
|
||
** Security
|
||
|
||
This release makes the immutable-file "ciphertext hash tree" mandatory.
|
||
Previous releases allowed the uploader to decide whether their file would
|
||
have an integrity check on the ciphertext or not. A malicious uploader could
|
||
use this to create a readcap that would download as one file or a different
|
||
one, depending upon which shares the client fetched first, with no errors
|
||
raised. There are other integrity checks on the shares themselves, preventing
|
||
a storage server or other party from violating the integrity properties of
|
||
the read-cap: this failure was only exploitable by the uploader who gives you
|
||
a carefully constructed read-cap. If you download the file with Tahoe 1.2.0
|
||
or later, you will not be vulnerable to this problem. #491
|
||
|
||
This change does not introduce a compatibility issue, because all existing
|
||
versions of Tahoe will emit the ciphertext hash tree in their shares.
|
||
|
||
** Dependencies
|
||
|
||
Tahoe now requires Foolscap-0.2.9 . It also requires pycryptopp 0.5 or newer,
|
||
since earlier versions had a bug that interacted with specific compiler
|
||
versions that could sometimes result in incorrect encryption behavior. Both
|
||
packages are included in the Tahoe source tarball in misc/dependencies/ , and
|
||
should be built automatically when necessary.
|
||
|
||
** Web API
|
||
|
||
Web API directory pages should now contain properly-slash-terminated links to
|
||
other directories. They have also stopped using absolute links in forms and
|
||
pages (which interfered with the use of a front-end load-balancing proxy).
|
||
|
||
The behavior of the "Check This File" button changed, in conjunction with
|
||
larger internal changes to file checking/verification. The button triggers an
|
||
immediate check as before, but the outcome is shown on its own page, and does
|
||
not get stored anywhere. As a result, the web directory page no longer shows
|
||
historical checker results.
|
||
|
||
A new "Deep-Check" button has been added, which allows a user to initiate a
|
||
recursive check of the given directory and all files and directories
|
||
reachable from it. This can cause quite a bit of work, and has no
|
||
intermediate progress information or feedback about the process. In addition,
|
||
the results of the deep-check are extremely limited. A later release will
|
||
improve this behavior.
|
||
|
||
The web server's behavior with respect to non-ASCII (unicode) filenames in
|
||
the "GET save=true" operation has been improved. To achieve maximum
|
||
compatibility with variously buggy web browsers, the server does not try to
|
||
figure out the character set of the inbound filename. It just echoes the same
|
||
bytes back to the browser in the Content-Disposition header. This seems to
|
||
make both IE7 and Firefox work correctly.
|
||
|
||
** Checker/Verifier/Repairer
|
||
|
||
Tahoe is slowly acquiring convenient tools to check up on file health,
|
||
examine existing shares for errors, and repair files that are not fully
|
||
healthy. This release adds a mutable checker/verifier/repairer, although
|
||
testing is very limited, and there are no web interfaces to trigger repair
|
||
yet. The "Check" button next to each file or directory on the wapi page
|
||
will perform a file check, and the "deep check" button on each directory will
|
||
recursively check all files and directories reachable from there (which may
|
||
take a very long time).
|
||
|
||
Future releases will improve access to this functionality.
|
||
|
||
** Operations/Packaging
|
||
|
||
A "check-grid" script has been added, along with a Makefile target. This is
|
||
intended (with the help of a pre-configured node directory) to check upon the
|
||
health of a Tahoe grid, uploading and downloading a few files. This can be
|
||
used as a monitoring tool for a deployed grid, to be run periodically and to
|
||
signal an error if it ever fails. It also helps with compatibility testing,
|
||
to verify that the latest Tahoe code is still able to handle files created by
|
||
an older version.
|
||
|
||
The munin plugins from misc/munin/ are now copied into any generated debian
|
||
packages, and are made executable (and uncompressed) so they can be symlinked
|
||
directly from /etc/munin/plugins/ .
|
||
|
||
Ubuntu "Hardy" was added as a supported debian platform, with a Makefile
|
||
target to produce hardy .deb packages. Some notes have been added to
|
||
docs/debian.txt about building Tahoe on a debian/ubuntu system.
|
||
|
||
Storage servers now measure operation rates and latency-per-operation, and
|
||
provides results through the /statistics web page as well as the stats
|
||
gatherer. Munin plugins have been added to match.
|
||
|
||
** Other
|
||
|
||
Tahoe nodes now use Foolscap "incident logging" to record unusual events to
|
||
their NODEDIR/logs/incidents/ directory. These incident files can be examined
|
||
by Foolscap logging tools, or delivered to an external log-gatherer for
|
||
further analysis. Note that Tahoe now requires Foolscap-0.2.9, since 0.2.8
|
||
had a bug that complained about "OSError: File exists" when trying to create
|
||
the incidents/ directory for a second time.
|
||
|
||
If no servers are available when retrieving a mutable file (like a
|
||
directory), the node now reports an error instead of hanging forever. Earlier
|
||
releases would not only hang (causing the wapi directory listing to get
|
||
stuck half-way through), but the internal dirnode serialization would cause
|
||
all subsequent attempts to retrieve or modify the same directory to hang as
|
||
well. #463
|
||
|
||
A minor internal exception (reported in logs/twistd.log, in the
|
||
"stopProducing" method) was fixed, which complained about "self._paused_at
|
||
not defined" whenever a file download was stopped from the web browser end.
|
||
|
||
|
||
* Release 1.1.0 (2008-06-11)
|
||
|
||
** CLI: new "alias" model
|
||
|
||
The new CLI code uses an scp/rsync -like interface, in which directories in
|
||
the Tahoe storage grid are referenced by a colon-suffixed alias. The new
|
||
commands look like:
|
||
tahoe cp local.txt tahoe:virtual.txt
|
||
tahoe ls work:subdir
|
||
|
||
More functionality is available through the CLI: creating unlinked files and
|
||
directories, recursive copy in or out of the storage grid, hardlinks, and
|
||
retrieving the raw read- or write- caps through the 'ls' command. Please read
|
||
docs/CLI.txt for complete details.
|
||
|
||
** wapi: new pages, new commands
|
||
|
||
Several new pages were added to the web API:
|
||
|
||
/helper_status : to describe what a Helper is doing
|
||
/statistics : reports node uptime, CPU usage, other stats
|
||
/file : for easy file-download URLs, see #221
|
||
/cap == /uri : future compatibility
|
||
|
||
The localdir=/localfile= and t=download operations were removed. These
|
||
required special configuration to enable anyways, but this feature was a
|
||
security problem, and was mostly obviated by the new "cp -r" command.
|
||
|
||
Several new options to the GET command were added:
|
||
|
||
t=deep-size : add up the size of all immutable files reachable from the directory
|
||
t=deep-stats : return a JSON-encoded description of number of files, size
|
||
distribution, total size, etc
|
||
|
||
POST is now preferred over PUT for most operations which cause side-effects.
|
||
|
||
Most wapi calls now accept overwrite=, and default to overwrite=true .
|
||
|
||
"POST /uri/DIRCAP/parent/child?t=mkdir" is now the preferred API to create
|
||
multiple directories at once, rather than ...?t=mkdir-p .
|
||
|
||
PUT to a mutable file ("PUT /uri/MUTABLEFILECAP", "PUT /uri/DIRCAP/child")
|
||
will modify the file in-place.
|
||
|
||
** more munin graphs in misc/munin/
|
||
|
||
tahoe-introstats
|
||
tahoe-rootdir-space
|
||
tahoe_estimate_files
|
||
mutable files published/retrieved
|
||
tahoe_cpu_watcher
|
||
tahoe_spacetime
|
||
|
||
** New Dependencies
|
||
|
||
zfec 1.1.0
|
||
foolscap 0.2.8
|
||
pycryptopp 0.5
|
||
setuptools (now required at runtime)
|
||
|
||
** New Mutable-File Code
|
||
|
||
The mutable-file handling code (mostly used for directories) has been
|
||
completely rewritten. The new scheme has a better API (with a modify()
|
||
method) and is less likely to lose data when several uncoordinated writers
|
||
change a file at the same time.
|
||
|
||
In addition, a single Tahoe process will coordinate its own writes. If you
|
||
make two concurrent directory-modifying wapi calls to a single tahoe node,
|
||
it will internally make one of them wait for the other to complete. This
|
||
prevents auto-collision (#391).
|
||
|
||
The new mutable-file code also detects errors during publish better. Earlier
|
||
releases might believe that a mutable file was published when in fact it
|
||
failed.
|
||
|
||
** other features
|
||
|
||
The node now monitors its own CPU usage, as a percentage, measured every 60
|
||
seconds. 1/5/15 minute moving averages are available on the /statistics web
|
||
page and via the stats-gathering interface.
|
||
|
||
Clients now accelerate reconnection to all servers after being offline
|
||
(#374). When a client is offline for a long time, it scales back reconnection
|
||
attempts to approximately once per hour, so it may take a while to make the
|
||
first attempt, but once any attempt succeeds, the other server connections
|
||
will be retried immediately.
|
||
|
||
A new "offloaded KeyGenerator" facility can be configured, to move RSA key
|
||
generation out from, say, a wapi node, into a separate process. RSA keys
|
||
can take several seconds to create, and so a wapi node which is being used
|
||
for directory creation will be unavailable for anything else during this
|
||
time. The Key Generator process will pre-compute a small pool of keys, to
|
||
speed things up further. This also takes better advantage of multi-core CPUs,
|
||
or SMP hosts.
|
||
|
||
The node will only use a potentially-slow "du -s" command at startup (to
|
||
measure how much space has been used) if the "sizelimit" parameter has been
|
||
configured (to limit how much space is used). Large storage servers should
|
||
turn off sizelimit until a later release improves the space-management code,
|
||
since "du -s" on a terabyte filesystem can take hours.
|
||
|
||
The Introducer now allows new announcements to replace old ones, to avoid
|
||
buildups of obsolete announcements.
|
||
|
||
Immutable files are limited to about 12GiB (when using the default 3-of-10
|
||
encoding), because larger files would be corrupted by the four-byte
|
||
share-size field on the storage servers (#439). A later release will remove
|
||
this limit. Earlier releases would allow >12GiB uploads, but the resulting
|
||
file would be unretrievable.
|
||
|
||
The docs/ directory has been rearranged, with old docs put in
|
||
docs/historical/ and not-yet-implemented ones in docs/proposed/ .
|
||
|
||
The Mac OS-X FUSE plugin has a significant bug fix: earlier versions would
|
||
corrupt writes that used seek() instead of writing the file in linear order.
|
||
The rsync tool is known to perform writes in this order. This has been fixed.
|