tahoe-lafs/NEWS
2008-12-30 02:17:28 -07:00

User visible changes in Tahoe. -*- outline -*-
* Release 1.3.0 (?)
** Checker/Verifier/Repairer
The focus of this release has been improving the functionality and the
usability of code which checks/verifies/repairs files and directories.
"Checking" is the act of asking storage servers whether they have a share for
the given file or directory: if there are not enough shares available, the
file/directory will be unrecoverable. "Verifying" is the act of downloading
and/or cryptographically asserting that the server's share is undamaged: it
requires more work (bandwidth and CPU) than checking, but can catch problems
that simple checking cannot. "Repair" is the act of replacing missing/damaged
shares with new ones.
For mutable files (and therefore directories), missing shares can be
regenerated, and corrupted shares can be repaired in place. For immutable
files, missing shares are regenerated, and corrupted shares are handled by
uploading new shares to other servers. The storage server protocol does not
allow clients to change or remove immutable shares, so if persistent
corruption is detected, the user and the storage server operator must work
together to remove the damaged share. Note that corrupted shares indicate
hardware failures, serious software bugs, or malice on the part of the
storage server operator, so a corrupted share should be considered highly
unusual. The "incident gatherer" mechanism will automatically report share
corruption to an incident gatherer service, if one is configured.
By periodically checking/repairing all files and directories, objects in the
Tahoe filesystem remain resistant to recoverability failures due to missing
and/or broken servers.
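The relationship between share counts and recoverability can be illustrated with a minimal sketch. This is not Tahoe's checker code: the function name `classify_health` and its labels are hypothetical, and it assumes a k-of-N encoding in which any `needed` distinct, undamaged shares are enough to recover the file.

```python
def classify_health(shares_found, needed=3, total=10):
    """Classify a file from the number of distinct good shares found.

    Hypothetical helper for illustration only; 3-of-10 is the default
    encoding mentioned in this file.
    """
    if shares_found >= total:
        return "healthy"        # every share is present
    if shares_found >= needed:
        return "recoverable"    # readable, but repair should replace missing shares
    return "unrecoverable"      # fewer than `needed` shares remain


for count in (10, 7, 3, 2):
    print(count, classify_health(count))
```

A periodic check/repair pass aims to move files from the middle category back to the first before they can fall into the last.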
This release includes a webapi mechanism to initiate checks on individual
files and directories (with or without verification, and with or without
automatic repair). A related mechanism is used to initiate a "deep-check" on
a directory: recursively traversing the directory and its children, checking
(and/or verifying/repairing) everything underneath. Both mechanisms can be
run with an "output=JSON" argument, to obtain machine-readable check/repair
status results. These results include a copy of the filesystem statistics
from the "deep-stats" operation (including total number of files, size
histogram, etc). If repair is necessary, a "Repair" button will appear on the
results page.
The client web interface now features some extra buttons to initiate check
and deep-check operations. When these operations finish, they display a
results page that summarizes any problems that were encountered. All
long-running deep-traversal operations, including deep-check, use a
start-and-poll mechanism, to avoid depending upon a single long-lived HTTP
connection. docs/frontends/webapi.txt has details.
** Configuration Changes: single INI-format tahoe.cfg file
The Tahoe node is now configured with a single INI-format file, named
"tahoe.cfg", in the node's base directory. Most of the previous
multiple-separate-files are still read for backwards compatibility (the
embedded SSH debug server and the advertised_ip_addresses files are the
exceptions), but new directives will only be added to tahoe.cfg . The "tahoe
create-client" command will create a tahoe.cfg for you, with sample values
commented out. (ticket #518)
tahoe.cfg now has controls for the foolscap "keepalive" and "disconnect"
timeouts (#521).
tahoe.cfg now has controls for the encoding parameters: "shares.needed" and
"shares.total" in the "[client]" section. The default parameters are still
3-of-10.
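As a rough illustration of the encoding parameters, the sketch below parses a sample "[client]" section with Python's standard configparser module. The sample text is hypothetical (it simply spells out the defaults named above), and the node's own parsing code may differ.

```python
import configparser

# Hypothetical sample: the values are the defaults described above.
SAMPLE = """
[client]
shares.needed = 3
shares.total = 10
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

needed = config.getint("client", "shares.needed")
total = config.getint("client", "shares.total")

# With k-of-N erasure coding, stored data is roughly N/k times the file size.
expansion = total / needed
print(f"{needed}-of-{total} encoding, expansion factor about {expansion:.2f}")
```

Raising shares.total relative to shares.needed tolerates more lost servers at the cost of more stored data.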
The inefficient storage 'sizelimit' control (which established an upper bound
on the amount of space that a storage server is allowed to consume) has been
replaced by a lightweight 'reserved_space' control (which establishes a lower
bound on the amount of remaining space). The storage server will reject all
writes that would cause the remaining disk space (as measured by a '/bin/df'
equivalent) to drop below this value. The "[storage]reserved_space="
tahoe.cfg parameter controls this setting. (Note that this only affects
immutable shares: it is an outstanding bug that reserved_space does not
prevent the allocation of new mutable shares, nor does it prevent the growth
of existing mutable shares.)
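The reserved_space rule can be sketched as a simple comparison. This is an illustration under stated assumptions, not the storage server's code: `would_accept_write` and `free_space` are hypothetical names, and `shutil.disk_usage` stands in for the '/bin/df' equivalent.

```python
import shutil


def free_space(path="."):
    """Free bytes on the partition holding `path` (a df-like measurement)."""
    return shutil.disk_usage(path).free


def would_accept_write(free_bytes, share_size, reserved_space):
    """Accept a write only if free space stays at or above the reserve.

    Hypothetical helper: a write is rejected when it would cause the
    remaining space to drop below `reserved_space`.
    """
    return free_bytes - share_size >= reserved_space


print(would_accept_write(free_bytes=10_000, share_size=4_000, reserved_space=5_000))
print(would_accept_write(free_bytes=10_000, share_size=6_000, reserved_space=5_000))
```

Unlike a size limit, this check needs only one cheap free-space query per write rather than a scan of everything already stored.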
** CLI Changes
This release adds the 'tahoe create-alias' command, which is a combination of
'tahoe mkdir' and 'tahoe add-alias'. This also allows you to start using a
new tahoe directory without exposing its URI in the argv list, which is
publicly visible (through the process table) on most unix systems. Thanks to
Kevin Reid for bringing this issue to our attention.
The single-argument form of "tahoe put" was changed to create an unlinked
file. I.e. "tahoe put bar.txt" will take the contents of a local "bar.txt"
file, upload them to the grid, and print the resulting read-cap; the file
will not be attached to any directories. This seemed a bit more useful than
the previous behavior (copy stdin, upload to the grid, attach the resulting
file into your default tahoe: alias in a child named 'bar.txt').
"tahoe put" was also fixed to handle mutable files correctly: "tahoe put
bar.txt URI:SSK:..." will read the contents of the local bar.txt and use them
to replace the contents of the given mutable file.
The "tahoe webopen" command was modified to accept aliases. This means "tahoe
webopen tahoe:" will cause your web browser to open to a "wui" page that
gives access to the directory associated with the default "tahoe:" alias. It
should also accept leading slashes, like "tahoe webopen tahoe:/stuff".
Many esoteric debugging commands were moved down into a "debug" subcommand:
 tahoe debug dump-cap
 tahoe debug dump-share
 tahoe debug find-shares
 tahoe debug catalog-shares
 tahoe debug corrupt-share
The last command ("tahoe debug corrupt-share") flips a random bit of the
given local sharefile. This is used to test the file verifying/repairing
code, and obviously should not be used on user data.
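For readers curious what "flipping a random bit" amounts to, here is a minimal sketch that operates on an in-memory byte string. It is not the code behind "tahoe debug corrupt-share"; the function name `flip_random_bit` is hypothetical.

```python
import random


def flip_random_bit(data, rng=random):
    """Return a copy of `data` with exactly one randomly chosen bit inverted."""
    if not data:
        raise ValueError("cannot corrupt empty data")
    corrupted = bytearray(data)
    bit = rng.randrange(len(corrupted) * 8)   # pick one bit position
    corrupted[bit // 8] ^= 1 << (bit % 8)     # invert that bit in its byte
    return bytes(corrupted)


original = b"example share contents"
damaged = flip_random_bit(original)
print(original != damaged)
```

A verifier should detect even this single-bit difference, which is why such a tool is useful for testing.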
The CLI might not correctly handle arguments which contain non-ascii
characters in Tahoe v1.3 (although depending on your platform it
might, especially if your platform can be configured to pass such
characters on the command-line in utf-8 encoding). See
http://allmydata.org/trac/tahoe/ticket/565 for details.
** Web changes
The "default webapi port", used when creating a new client node (and in the
getting-started documentation), was changed from 8123 to 3456, to reduce
confusion when Tahoe is accessed through a Firefox browser on which the
"Torbutton" extension has been installed. Port 8123 is occasionally used as a
Tor control port, so Torbutton adds 8123 to Firefox's list of "banned ports"
to avoid CSRF attacks against Tor. Once 8123 is banned, it is difficult to
diagnose why you can no longer reach a Tahoe node, so the Tahoe default was
changed. Note that 3456 is reserved by IANA for the "vat" protocol, but there
are arguably more Torbutton+Tahoe users than vat users these days. Note that
this will only affect newly-created client nodes. Pre-existing client nodes,
created by earlier versions of tahoe, may still be listening on 8123.
All deep-traversal operations (start-manifest, start-deep-size,
start-deep-stats, start-deep-check) now use a start-and-poll approach,
instead of using a single (fragile) long-running synchronous HTTP connection.
All these "start-" operations use POST instead of GET. The old "GET
manifest", "GET deep-size", and "POST deep-check" operations have been
removed.
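The client side of a start-and-poll exchange can be sketched without committing to any particular URL layout. In the sketch below, `fetch_status` is a hypothetical callable that performs one short status request and returns a dict; the "finished" key and the "count" field are assumptions about the response shape, so consult docs/frontends/webapi.txt for the real details.

```python
import time


def poll_until_finished(fetch_status, interval=1.0, max_polls=60, sleep=time.sleep):
    """Call `fetch_status()` repeatedly until it reports completion.

    Each call is a separate short request, so no single long-lived
    connection is needed. The "finished" key is an assumed field name.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("finished"):
            return status
        sleep(interval)
    raise TimeoutError("operation did not finish in time")


# Simulated server that reports completion on the third poll.
responses = iter([{"finished": False},
                  {"finished": False},
                  {"finished": True, "count": 42}])
result = poll_until_finished(lambda: next(responses), sleep=lambda seconds: None)
print(result)
```

Because each poll is independent, a dropped connection costs only one retry instead of the whole traversal.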
The new "POST start-manifest" operation, when it finally completes, results
in a table of (path,cap), instead of the list of verifycaps produced by the
old "GET manifest". The table is available in several formats: use
output=html, output=text, or output=json to choose one. The JSON output also
includes stats, and a list of verifycaps and storage-index strings.
The "return_to=" and "when_done=" arguments have been removed from the
t=check and deep-check operations.
The top-level status page (/status) now has a machine-readable form, via
"/status/?t=json". This includes information about the currently-active
uploads and downloads, which may be useful for frontends that wish to display
progress information. There is no easy way to correlate the activities
displayed here with recent webapi requests, however.
Any files in BASEDIR/public_html/ (configurable) will be served in response
to requests in the /static/ portion of the URL space. This will simplify the
deployment of javascript-based frontends that can still access webapi calls
by conforming to the (regrettable) "same-origin policy".
The welcome page now has a "Report Incident" button, which is tied into the
"Incident Gatherer" machinery. If the node is attached to an incident
gatherer (via log_gatherer.furl), then pushing this button will cause an
Incident to be signalled: this means recent log events are aggregated and
sent in a bundle to the gatherer. The user can push this button after
something strange takes place (and they can provide a short message to go
along with it), and the relevant data will be delivered to a centralized
incident-gatherer for later processing by operations staff.
The "HEAD" method should now work correctly, in addition to the usual "GET",
"PUT", and "POST" methods. "HEAD" is supposed to return exactly the same
headers as "GET" would, but without any of the actual response body data. For
mutable files, this now does a brief mapupdate (to figure out the size of the
file that would be returned), without actually retrieving the file's
contents.
The "GET" operation on files can now support the HTTP "Range:" header,
allowing requests for partial content. This allows certain media players to
correctly stream audio and movies out of a Tahoe grid. The current
implementation uses a disk-based cache in BASEDIR/private/cache/download ,
which holds the plaintext of the files being downloaded. Future
implementations might not use this cache. GET for immutable files now returns
an ETag header.
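To make the partial-content idea concrete, the following sketch parses a single-range "Range:" header into inclusive byte offsets. It is a simplified illustration, not the node's implementation: `parse_byte_range` is a hypothetical name, and multi-range requests are deliberately not handled.

```python
def parse_byte_range(header, size):
    """Parse a single 'bytes=first-last' Range header.

    Returns (start, end) as inclusive offsets clamped to `size`, or None
    if the header is absent, malformed, multi-range, or unsatisfiable.
    """
    if not header or not header.startswith("bytes=") or "," in header:
        return None
    first, sep, last = header[len("bytes="):].partition("-")
    if not sep:
        return None
    try:
        if first == "":                      # suffix form: the last N bytes
            length = int(last)
            if length <= 0 or size == 0:
                return None
            return max(size - length, 0), size - 1
        start = int(first)
        end = int(last) if last else size - 1
    except ValueError:
        return None
    if start >= size or end < start:
        return None
    return start, min(end, size - 1)


print(parse_byte_range("bytes=0-99", 1000))
print(parse_byte_range("bytes=500-", 1000))
print(parse_byte_range("bytes=-100", 1000))
```

A media player seeking into a file would typically send the open-ended form ("bytes=500-") and read until it has enough data.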
Each file and directory now has a "Show More Info" web page, which contains
much of the information that was crammed into the directory page before. This
includes readonly URIs, storage index strings, object type, buttons to
control checking/verifying/repairing, and deep-check/deep-stats buttons (for
directories). For mutable files, the "replace contents" upload form has been
moved here too. As a result, the directory page is now much simpler and
cleaner, and several potentially-misleading links (like t=uri) are now gone.
Slashes are discouraged in Tahoe file/directory names, since they cause
problems when accessing the filesystem through the webapi. However, there are
a couple of accidental ways to generate such names. This release tries to
make it easier to correct such mistakes by escaping slashes in several
places, allowing slashes in the t=info and t=delete commands, and in the
source (but not the target) of a t=rename command.
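The general escaping technique looks like the sketch below, which percent-encodes a child name so that an embedded slash is not mistaken for a path separator. This is an illustration, not Tahoe's code: `child_url` is a hypothetical helper and DIRCAP is a placeholder.

```python
from urllib.parse import quote, unquote


def child_url(dir_url, child_name):
    """Build a URL for `child_name` under `dir_url`, escaping every
    reserved character (including '/') in the name."""
    return dir_url.rstrip("/") + "/" + quote(child_name, safe="")


url = child_url("http://127.0.0.1:3456/uri/DIRCAP", "notes/2008.txt")
print(url)
print(unquote(url.rsplit("/", 1)[1]))
```

Passing safe="" matters here: by default quote() leaves "/" unescaped, which would split the name into two path components.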
** Packaging
Tahoe's dependencies have been extended to require the "[secure_connections]"
feature from Foolscap, which will cause pyOpenSSL to be required and/or
installed. If OpenSSL and its development headers are already installed on
your system, this can occur automatically. Tahoe now uses pollreactor
(instead of the default selectreactor) to work around a bug between pyOpenSSL
and the most recent release of Twisted (8.1.0). This bug only affects unit
tests (hang during shutdown), and should not impact regular use.
The Tahoe source code tarballs now come in two different forms: regular and
"sumo". The regular tarball contains just Tahoe, nothing else. When building
from the regular tarball, the build process will download any unmet
dependencies from the internet (starting with the index at PyPI) so it can
build and install them. The "sumo" tarball contains copies of all the
libraries that Tahoe requires (foolscap, twisted, zfec, etc), so using the
"sumo" tarball should not require any internet access during the build
process. This can be useful if you want to build Tahoe while on an airplane,
a desert island, or other bandwidth-limited environments.
Similarly, allmydata.org now hosts a "tahoe-deps" tarball which contains the
latest versions of all these dependencies. This tarball, located at
http://allmydata.org/source/tahoe/deps/tahoe-deps.tar.gz, can be unpacked in
the tahoe source tree (or in its parent directory), and the build process
should satisfy its downloading needs from it instead of reaching out to PyPI.
This can be useful if you want to build Tahoe from a darcs checkout while on
that airplane or desert island.
Because of the previous two changes ("sumo" tarballs and the "tahoe-deps"
bundle), most of the files have been removed from misc/dependencies/ . This
brings the regular Tahoe tarball down to 2MB (compressed), and the darcs
checkout (without history) to about 7.6MB. A full darcs checkout will still
be fairly large (because of the historical patches which included the
dependent libraries), but a 'lazy' one should now be small.
The default "make" target is now an alias for "setup.py build_tahoe", which
itself is a wrapper around "setup.py develop --prefix support/lib", with some
extra work before and after. Most of the complicated platform-dependent code
in the Makefile was rewritten in Python and moved into setup.py, simplifying
things considerably.
Likewise, the "make test" target now delegates most of its work to "setup.py
trial", which takes care of getting PYTHONPATH configured to access the tahoe
code (and dependencies) that gets put in support/lib/ by the build_tahoe
step. This should allow unit tests to be run even when trial (which is part
of Twisted) wasn't already installed (in this case, trial gets installed to
support/bin because Twisted is a dependency of Tahoe).
Tahoe itself is now compatible with the recently-released Python 2.6 . Most
of its dependencies are too; however, the most recent release of Nevow
(0.9.31) has an unused function that uses 'with' as a variable name, which
runs afoul of the new 'with' reserved word in 2.6, so to run Tahoe against
python2.6 you will need to install Nevow from its SVN trunk, or comment out
the offending function.
Tahoe is now compatible with simplejson-2.0.x . The previous release assumed
that simplejson.loads always returned unicode strings, which is no longer the
case in 2.0.x .
setup.py now includes /System/Library in site-dirs when building on a Mac,
which should help it find previously-installed libraries like Twisted (#229).
** Grid Management Tools
Several tools have been added or updated in the misc/ directory, mostly munin
plugins that can be used to monitor a storage grid.
The misc/spacetime/ directory contains a "disk watcher" daemon (startable
with 'tahoe start'), which can be configured with a set of HTTP URLs
(pointing at the webapi '/statistics' page of a bunch of storage servers),
and will periodically fetch disk-used/disk-available information from all the
servers. It keeps this information in an Axiom database (a sqlite-based
library available from divmod.org). The daemon computes time-averaged rates
of disk usage, as well as a prediction of how much time is left before the
grid is completely full.
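The kind of calculation the disk watcher performs can be sketched as follows. This is a simplified illustration, not the daemon's code: the function names are hypothetical, and the rate is a plain average between the first and last samples rather than whatever windowing the daemon actually uses.

```python
def growth_rate(samples):
    """Average bytes per second between the first and last samples.

    `samples` is a list of (timestamp_seconds, bytes_used) pairs.
    """
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need samples spanning a positive time interval")
    return (used1 - used0) / (t1 - t0)


def seconds_until_full(samples, bytes_available):
    """Predict the time remaining, or None if usage is not growing."""
    rate = growth_rate(samples)
    if rate <= 0:
        return None
    return bytes_available / rate


history = [(0, 1_000), (3_600, 4_600)]     # one hour apart: 3600 bytes added
print(growth_rate(history))                # bytes per second
print(seconds_until_full(history, 7_200))  # seconds left at that rate
```

A longer sampling window smooths out bursts of uploads at the cost of reacting more slowly to a real change in usage.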
The misc/munin/ directory contains a new set of munin plugins
(tahoe_diskleft, tahoe_diskusage, tahoe_doomsday) which talk to the
disk-watcher and provide graphs of its calculations.
To support the disk-watcher, the Tahoe statistics component (visible through
the webapi at the /statistics/ URL) now includes disk-used and disk-available
information. Both are derived through an equivalent of the unix 'df' command
(i.e. they ask the kernel for the number of free blocks on the partition that
encloses the BASEDIR/storage directory). In the future, the disk-available
number will be further influenced by the local storage policy: if that policy
says that the server should refuse new shares when less than 5GB is left on
the partition, then "disk-available" will report zero even though the kernel
sees 5GB remaining.
The 'tahoe_overhead' munin plugin interacts with an allmydata.com-specific
server which reports the total of the 'deep-size' reports for all active user
accounts, compares this with the disk-watcher data, to report on overhead
percentages. This provides information on how much space could be recovered
once Tahoe implements some form of garbage collection.
** Other Changes
Clients now declare their "oldest-supported version" to be 1.0.0 . This is
part of a backwards-compatibility system that has not yet been fully
specified. Previous releases declared their oldest-supported-version to be
the same as their current version number.
The version strings (as displayed on the Welcome web page, and included in
logs) now include a platform identifier (frequently including a linux
distribution name, processor architecture, etc).
Several bugs have been fixed, including one that would cause an exception (in
the logs) if a webapi download operation was cancelled (by closing the TCP
connection, or pushing the "stop" button in a web browser).
The 12GiB (approximate) immutable-file-size limitation is slowly being
lifted. This release knows how to handle so-called "v2 immutable shares",
which permit immutable files of up to about 18 EiB (about 3*10^14). These v2
shares are not yet created by default, so that files created with tahoe-1.3.0
can still be read by earlier versions. In the next release we will switch to
generating v2 shares, so that files created with tahoe-1.4.0 can be read by
tahoe-1.3.0 and later. Note that the storage server must also be changed to
support files larger than 12GiB, and that these changes have not yet been
implemented. (ticket #346)
Tahoe now uses Foolscap "Incidents", writing an "incident report" file to
logs/incidents/ each time something weird occurs. These reports are available
to an "incident gatherer" through the flogtool command. For more details,
please see the Foolscap logging documentation. An incident-classifying plugin
function is provided in misc/incident-gatherer/classify_tahoe.py .
If clients detect corruption in shares, they now automatically report it to
the server holding that share, if it is new enough to accept the report.
These reports are written to files in BASEDIR/storage/corruption-advisories .
The 'nickname' setting is now defined to be a UTF-8 -encoded string, allowing
non-ascii nicknames.
The 'tahoe start' command will now accept a --syslog argument and pass it
through to twistd, making it easier to launch non-Tahoe nodes (like the
cpu-watcher) and have them log to syslogd instead of a local file. This is
useful when running a Tahoe node out of a USB flash drive.
Tahoe now includes experimental FTP and SFTP servers. When configured with a
suitable method to translate username+password into a root directory cap, they
provide simple access to the virtual filesystem. Remember that FTP is
completely unencrypted: passwords, filenames, and file contents are all sent
over the wire in cleartext, so FTP should only be used on a local (127.0.0.1)
connection. This feature is still in development: there are no unit tests
yet, and behavior with respect to Unicode filenames is uncertain. Please see
docs/frontends/FTP-and-SFTP.txt for configuration details. (#512, #531)
The Mac GUI in src/allmydata/gui/ has been improved.
* Release 1.2.0 (2008-07-21)
** Security
This release makes the immutable-file "ciphertext hash tree" mandatory.
Previous releases allowed the uploader to decide whether their file would
have an integrity check on the ciphertext or not. A malicious uploader could
use this to create a readcap that would download as one file or a different
one, depending upon which shares the client fetched first, with no errors
raised. There are other integrity checks on the shares themselves, preventing
a storage server or other party from violating the integrity properties of
the read-cap: this failure was only exploitable by the uploader who gives you
a carefully constructed read-cap. If you download the file with Tahoe 1.2.0
or later, you will not be vulnerable to this problem. #491
This change does not introduce a compatibility issue, because all existing
versions of Tahoe will emit the ciphertext hash tree in their shares.
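The general idea of a hash tree over ciphertext can be shown with a generic Merkle-root sketch. This is not Tahoe's construction: the "leaf:" and "node:" prefixes, the use of plain SHA-256, and the padding rule for odd levels are all assumptions chosen for illustration.

```python
import hashlib


def merkle_root(segments):
    """Compute a Merkle root over a non-empty list of byte-string segments."""
    if not segments:
        raise ValueError("need at least one segment")
    # Distinct prefixes keep leaf hashes and interior hashes from colliding.
    level = [hashlib.sha256(b"leaf:" + s).digest() for s in segments]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])          # pad odd levels by repeating the last node
        level = [hashlib.sha256(b"node:" + level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


good = merkle_root([b"segment-0", b"segment-1", b"segment-2"])
bad = merkle_root([b"segment-0", b"tampered!", b"segment-2"])
print(good.hex())
print(good == bad)
```

Because the root commits to every segment, a downloader that knows the expected root can reject any substituted ciphertext regardless of which shares it fetched.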
** Dependencies
Tahoe now requires Foolscap-0.2.9 . It also requires pycryptopp 0.5 or newer,
since earlier versions had a bug that interacted with specific compiler
versions that could sometimes result in incorrect encryption behavior. Both
packages are included in the Tahoe source tarball in misc/dependencies/ , and
should be built automatically when necessary.
** Web API
Web API directory pages should now contain properly-slash-terminated links to
other directories. They have also stopped using absolute links in forms and
pages (which interfered with the use of a front-end load-balancing proxy).
The behavior of the "Check This File" button changed, in conjunction with
larger internal changes to file checking/verification. The button triggers an
immediate check as before, but the outcome is shown on its own page, and does
not get stored anywhere. As a result, the web directory page no longer shows
historical checker results.
A new "Deep-Check" button has been added, which allows a user to initiate a
recursive check of the given directory and all files and directories
reachable from it. This can cause quite a bit of work, and has no
intermediate progress information or feedback about the process. In addition,
the results of the deep-check are extremely limited. A later release will
improve this behavior.
The web server's behavior with respect to non-ASCII (unicode) filenames in
the "GET save=true" operation has been improved. To achieve maximum
compatibility with variously buggy web browsers, the server does not try to
figure out the character set of the inbound filename. It just echoes the same
bytes back to the browser in the Content-Disposition header. This seems to
make both IE7 and Firefox work correctly.
** Checker/Verifier/Repairer
Tahoe is slowly acquiring convenient tools to check up on file health,
examine existing shares for errors, and repair files that are not fully
healthy. This release adds a mutable checker/verifier/repairer, although
testing is very limited, and there are no web interfaces to trigger repair
yet. The "Check" button next to each file or directory on the webapi page
will perform a file check, and the "deep check" button on each directory will
recursively check all files and directories reachable from there (which may
take a very long time).
Future releases will improve access to this functionality.
** Operations/Packaging
A "check-grid" script has been added, along with a Makefile target. This is
intended (with the help of a pre-configured node directory) to check upon the
health of a Tahoe grid, uploading and downloading a few files. This can be
used as a monitoring tool for a deployed grid, to be run periodically and to
signal an error if it ever fails. It also helps with compatibility testing,
to verify that the latest Tahoe code is still able to handle files created by
an older version.
The munin plugins from misc/munin/ are now copied into any generated debian
packages, and are made executable (and uncompressed) so they can be symlinked
directly from /etc/munin/plugins/ .
Ubuntu "Hardy" was added as a supported debian platform, with a Makefile
target to produce hardy .deb packages. Some notes have been added to
docs/debian.txt about building Tahoe on a debian/ubuntu system.
Storage servers now measure operation rates and latency-per-operation, and
provide results through the /statistics web page as well as the stats
gatherer. Munin plugins have been added to match.
** Other
Tahoe nodes now use Foolscap "incident logging" to record unusual events to
their NODEDIR/logs/incidents/ directory. These incident files can be examined
by Foolscap logging tools, or delivered to an external log-gatherer for
further analysis. Note that Tahoe now requires Foolscap-0.2.9, since 0.2.8
had a bug that complained about "OSError: File exists" when trying to create
the incidents/ directory for a second time.
If no servers are available when retrieving a mutable file (like a
directory), the node now reports an error instead of hanging forever. Earlier
releases would not only hang (causing the webapi directory listing to get
stuck half-way through), but the internal dirnode serialization would cause
all subsequent attempts to retrieve or modify the same directory to hang as
well. #463
A minor internal exception (reported in logs/twistd.log, in the
"stopProducing" method) was fixed, which complained about "self._paused_at
not defined" whenever a file download was stopped from the web browser end.
* Release 1.1.0 (2008-06-11)
** CLI: new "alias" model
The new CLI code uses an scp/rsync -like interface, in which directories in
the Tahoe storage grid are referenced by a colon-suffixed alias. The new
commands look like:
 tahoe cp local.txt tahoe:virtual.txt
 tahoe ls work:subdir
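The colon-suffixed alias syntax can be illustrated with a small parsing sketch. This is not the CLI's actual parser: `split_alias` is a hypothetical name, and edge cases such as Windows drive letters ("C:\...") are ignored here.

```python
def split_alias(target):
    """Split 'alias:path' into (alias, path).

    Returns (None, target) when no alias prefix is present, for example
    for a plain local filename.
    """
    alias, sep, path = target.partition(":")
    if not sep:
        return None, target
    return alias, path


for arg in ("tahoe:virtual.txt", "work:subdir", "local.txt"):
    print(arg, "->", split_alias(arg))
```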
More functionality is available through the CLI: creating unlinked files and
directories, recursive copy in or out of the storage grid, hardlinks, and
retrieving the raw read- or write- caps through the 'ls' command. Please read
docs/CLI.txt for complete details.
** webapi: new pages, new commands
Several new pages were added to the web API:
 /helper_status : to describe what a Helper is doing
 /statistics : reports node uptime, CPU usage, other stats
 /file : for easy file-download URLs, see #221
 /cap == /uri : future compatibility
The localdir=/localfile= and t=download operations were removed. These
required special configuration to enable anyway, but this feature was a
security problem, and was mostly obviated by the new "cp -r" command.
Several new options to the GET command were added:
 t=deep-size : add up the size of all immutable files reachable from the directory
 t=deep-stats : return a JSON-encoded description of number of files, size
                distribution, total size, etc
POST is now preferred over PUT for most operations which cause side-effects.
Most webapi calls now accept overwrite=, and default to overwrite=true .
"POST /uri/DIRCAP/parent/child?t=mkdir" is now the preferred API to create
multiple directories at once, rather than ...?t=mkdir-p .
PUT to a mutable file ("PUT /uri/MUTABLEFILECAP", "PUT /uri/DIRCAP/child")
will modify the file in-place.
** more munin graphs in misc/munin/
 tahoe-introstats
 tahoe-rootdir-space
 tahoe_estimate_files
 mutable files published/retrieved
 tahoe_cpu_watcher
 tahoe_spacetime
** New Dependencies
 zfec 1.1.0
 foolscap 0.2.8
 pycryptopp 0.5
 setuptools (now required at runtime)
** New Mutable-File Code
The mutable-file handling code (mostly used for directories) has been
completely rewritten. The new scheme has a better API (with a modify()
method) and is less likely to lose data when several uncoordinated writers
change a file at the same time.
In addition, a single Tahoe process will coordinate its own writes. If you
make two concurrent directory-modifying webapi calls to a single tahoe node,
it will internally make one of them wait for the other to complete. This
prevents auto-collision (#391).
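The idea of making one writer wait for another can be sketched with a per-directory lock. This is only an analogy: Tahoe is built on Twisted rather than threads, and the class name `DirectoryWriteCoordinator`, its `modify` method, and the dict-based store are all hypothetical.

```python
import threading
from collections import defaultdict


class DirectoryWriteCoordinator:
    """Serialize read-modify-write cycles on a per-directory basis."""

    def __init__(self):
        self._guard = threading.Lock()               # protects the lock table itself
        self._locks = defaultdict(threading.Lock)

    def modify(self, dir_id, store, modifier):
        with self._guard:
            lock = self._locks[dir_id]
        with lock:                                   # a second writer waits here
            store[dir_id] = modifier(store.get(dir_id, ()))


store = {}
coordinator = DirectoryWriteCoordinator()


def add_child(name):
    coordinator.modify("dir1", store, lambda children: children + (name,))


threads = [threading.Thread(target=add_child, args=(f"file{i}",)) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(store["dir1"]))
```

Writers to different directories use different locks, so only modifications of the same directory are forced to take turns.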
The new mutable-file code also detects errors during publish better. Earlier
releases might believe that a mutable file was published when in fact it
failed.
** other features
The node now monitors its own CPU usage, as a percentage, measured every 60
seconds. 1/5/15 minute moving averages are available on the /statistics web
page and via the stats-gathering interface.
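A moving average over periodic samples can be sketched as below. It is an illustration, not the node's monitoring code: the class name `CpuAverages` and the key names are hypothetical, and one sample is assumed to arrive every 60 seconds.

```python
from collections import deque


class CpuAverages:
    """Keep the last 15 one-minute samples and average over 1, 5, and 15."""

    WINDOWS = {"1min": 1, "5min": 5, "15min": 15}    # window sizes in samples

    def __init__(self):
        self._samples = deque(maxlen=15)

    def add_sample(self, percent):
        self._samples.append(percent)

    def averages(self):
        """Return averages only for windows that have enough samples."""
        recent = list(self._samples)
        result = {}
        for name, count in self.WINDOWS.items():
            if len(recent) >= count:
                result[name] = sum(recent[-count:]) / count
        return result


monitor = CpuAverages()
for percent in (10, 20, 30, 40, 50):
    monitor.add_sample(percent)
print(monitor.averages())
```

The 15-minute figure is omitted until fifteen samples exist, which avoids reporting a misleading average shortly after startup.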
Clients now accelerate reconnection to all servers after being offline
(#374). When a client is offline for a long time, it scales back reconnection
attempts to approximately once per hour, so it may take a while to make the
first attempt, but once any attempt succeeds, the other server connections
will be retried immediately.
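The reconnection behaviour can be approximated by a capped backoff that resets whenever any connection succeeds. The sketch below is not Foolscap's reconnector: the class name, the initial delay, and the growth factor are assumptions; only the roughly one-hour ceiling comes from the description above.

```python
class ReconnectSchedule:
    """Per-server retry delays that grow on failure and reset on any success."""

    def __init__(self, servers, initial=5.0, factor=2.0, ceiling=3600.0):
        self.initial = initial
        self.factor = factor
        self.ceiling = ceiling                       # about once per hour
        self.delays = {server: initial for server in servers}

    def connection_failed(self, server):
        """Lengthen the delay for `server`, up to the ceiling, and return it."""
        self.delays[server] = min(self.delays[server] * self.factor, self.ceiling)
        return self.delays[server]

    def connection_succeeded(self, server):
        """One success suggests we are back online: retry everyone promptly."""
        for name in self.delays:
            self.delays[name] = self.initial


schedule = ReconnectSchedule(["A", "B"])
for _ in range(20):                                  # a long offline period
    schedule.connection_failed("A")
    schedule.connection_failed("B")
print(schedule.delays)
schedule.connection_succeeded("A")
print(schedule.delays)
```

Resetting every delay on a single success is what lets the remaining servers reconnect quickly instead of each waiting out its own long timer.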
A new "offloaded KeyGenerator" facility can be configured, to move RSA key
generation out from, say, a webapi node, into a separate process. RSA keys
can take several seconds to create, and so a webapi node which is being used
for directory creation will be unavailable for anything else during this
time. The Key Generator process will pre-compute a small pool of keys, to
speed things up further. This also takes better advantage of multi-core CPUs,
or SMP hosts.
The node will only use a potentially-slow "du -s" command at startup (to
measure how much space has been used) if the "sizelimit" parameter has been
configured (to limit how much space is used). Large storage servers should
turn off sizelimit until a later release improves the space-management code,
since "du -s" on a terabyte filesystem can take hours.
The Introducer now allows new announcements to replace old ones, to avoid
buildups of obsolete announcements.
Immutable files are limited to about 12GiB (when using the default 3-of-10
encoding), because larger files would be corrupted by the four-byte
share-size field on the storage servers (#439). A later release will remove
this limit. Earlier releases would allow >12GiB uploads, but the resulting
file would be unretrievable.
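The 12GiB figure follows from simple arithmetic, shown below as a short calculation. It assumes that each share holds roughly 1/needed of the file and ignores per-share overhead such as hashes, which is why the limit is only approximate.

```python
SHARE_SIZE_FIELD_BYTES = 4                     # the four-byte share-size field
NEEDED = 3                                     # "3" in the default 3-of-10 encoding

max_share_size = 2 ** (8 * SHARE_SIZE_FIELD_BYTES)    # largest size the field can express
max_file_size = NEEDED * max_share_size               # each share carries about 1/3 of the file

print(max_share_size / 2 ** 30, "GiB per share")
print(max_file_size / 2 ** 30, "GiB per file")        # 12.0
```

With a different shares.needed value the same reasoning gives a different ceiling, since the limit applies to each share rather than to the file as a whole.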
The docs/ directory has been rearranged, with old docs put in
docs/historical/ and not-yet-implemented ones in docs/proposed/ .
The Mac OS-X FUSE plugin has a significant bug fix: earlier versions would
corrupt writes that used seek() instead of writing the file in linear order.
The rsync tool is known to write files in this non-linear way. This has been
fixed.