User visible changes in Tahoe. -*- outline -*-

* Release 1.3.0 (?)

** Checker/Verifier/Repairer

The focus of this release has been improving the functionality and the
usability of code which checks/verifies/repairs files and directories.
"Checking" is the act of asking storage servers whether they have a share for
the given file or directory: if there are not enough shares available, the
file/directory will be unrecoverable. "Verifying" is the act of downloading
or cryptographically asserting that the server's share is undamaged: it
requires more work (bandwidth and CPU) than checking, but can catch problems
that simple checking cannot. "Repair" is the act of replacing missing/damaged
shares with new ones.

For mutable files (and therefore directories), missing shares can be
regenerated, and corrupted shares can be repaired in place. For immutable
files, missing shares are regenerated, and corrupted shares are handled by
uploading new shares to other servers. The storage server protocol does not
allow clients to change or remove immutable shares, so if persistent
corruption is detected, the user and the storage server operator must work
together to remove the damaged share. Note that corrupted shares indicate
hardware failures, serious software bugs, or malice on the part of the
storage server operator, so a corrupted share should be considered highly
unusual. The "incident gatherer" mechanism will automatically report share
corruption to a pre-configured incident gatherer service.

By periodically checking/repairing all files and directories, objects in the
Tahoe filesystem remain resistant to recoverability failures due to missing
and/or broken servers.

This release includes a webapi mechanism to initiate checks on individual
files and directories (with or without verification, and with or without
automatic repair). A related mechanism is used to initiate a "deep-check" on
a directory: recursively traversing the directory and its children, checking
(and/or verifying/repairing) everything underneath. Both mechanisms can be
run with an "output=JSON" argument, to obtain machine-readable check/repair
status results. These results include a copy of the filesystem statistics
from the "deep-stats" operation (including total number of files, size
histogram, etc).

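As a rough illustration of the check mechanism described above, the
following Python sketch builds a check URL for a single file or directory.
Only the "output=JSON" argument is named in these notes; the "t=check",
"verify=" and "repair=" argument names, the /uri/ prefix, and the example
node address are assumptions modelled on the webapi documentation, so
consult docs/webapi.txt for the authoritative form (including which HTTP
method to use).

```python
from urllib.parse import quote, urlencode


def check_url(node_url, cap, verify=False, repair=False, json_output=True):
    """Build a webapi URL asking a node to check one file or directory.

    Assumption: the query arguments t=check, verify=true and repair=true
    are illustrative names; only output=JSON appears in the release notes.
    """
    params = [("t", "check")]
    if verify:
        params.append(("verify", "true"))
    if repair:
        params.append(("repair", "true"))
    if json_output:
        params.append(("output", "JSON"))
    # Caps contain ':' characters, so percent-encode the whole cap.
    return "%s/uri/%s?%s" % (
        node_url.rstrip("/"),
        quote(cap, safe=""),
        urlencode(params),
    )


if __name__ == "__main__":
    # Hypothetical node address and cap, for illustration only.
    print(check_url("http://127.0.0.1:3456/", "URI:CHK:example", verify=True))
```

Keeping the URL construction in a pure function makes it easy to test
without a running node.
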
The client web interface now features some extra buttons to initiate check
and deep-check operations. When these operations finish, they display a
results page that summarizes any problems that were encountered.

** CLI Changes

This release adds the 'tahoe create-alias' command, which is a combination of
'tahoe mkdir' and 'tahoe add-alias'. This also allows you to start using a
new tahoe directory without exposing its URI in the argv list, which is
publicly visible (through the process table) on most unix systems.

The single-argument form of "tahoe put" was changed to create an unlinked
file. I.e. "tahoe put bar.txt" will take the contents of a local "bar.txt"
file, upload them to the grid, and print the resulting read-cap; the file
will not be attached to any directories. This seemed a bit more useful than
the previous behavior (copy stdin, upload to the grid, attach the resulting
file into your default tahoe: alias in a child named 'bar.txt').

"tahoe put" was also fixed to handle mutable files correctly: "tahoe put
bar.txt URI:SSK:..." will read the contents of the local bar.txt and use them
to replace the contents of the given mutable file.

The "tahoe webopen" command was modified to accept aliases. This means "tahoe
webopen tahoe:" will cause your web browser to open to a "wui" page that
gives access to the directory associated with the default "tahoe:" alias.

Many esoteric debugging commands were moved down into a "debug" subcommand:

 tahoe debug dump-cap
 tahoe debug dump-share
 tahoe debug find-shares
 tahoe debug catalog-shares
 tahoe debug corrupt-share

The last command ("tahoe debug corrupt-share") flips a random bit of the
given local sharefile. This is used to test the file verifying/repairing, and
obviously should not be used on user data.

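The idea behind flipping a random bit can be sketched in a few lines of
Python. This is an illustration of the technique only, not the code used by
"tahoe debug corrupt-share"; the function name and the in-memory interface
are hypothetical.

```python
import random


def flip_random_bit(data, rng=random):
    """Return a copy of `data` with exactly one randomly chosen bit inverted."""
    if not data:
        raise ValueError("cannot corrupt an empty share")
    corrupted = bytearray(data)
    # Pick one bit position across the whole buffer, then toggle it.
    bit = rng.randrange(len(corrupted) * 8)
    corrupted[bit // 8] ^= 1 << (bit % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    original = b"example share contents"
    damaged = flip_random_bit(original)
    print(original != damaged)
```

A verifier that checks hashes over the share data should notice such a
single-bit change, which is what makes this a useful test tool.
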
** Web changes

The top-level status page (/status) now has a machine-readable form, via
"/status/?t=json". This includes information about the currently-active
uploads and downloads, which may be useful for frontends that wish to display
progress information. There is no easy way to correlate the activities
displayed here with recent webapi requests, however.

The welcome page now has a "Report Incident" button, which is tied into the
"Incident Gatherer" machinery. If the node is attached to an incident
gatherer (via log_gatherer.furl), then pushing this button will cause an
Incident to be signalled: this means recent log events are aggregated and
sent in a bundle to the gatherer. The user can push this button after
something strange takes place (and they can provide a short message to go
along with it), and the relevant data will be delivered to a centralized
incident-gatherer for later processing by operations staff.

The "HEAD" method should now work correctly, in addition to the usual "GET",
"PUT", and "POST" methods. "HEAD" is supposed to return exactly the same
headers as "GET" would, but without any of the actual response body data. For
mutable files, this now does a brief mapupdate (to figure out the size of the
file that would be returned), without actually retrieving the file's contents.

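A client can use "HEAD" to learn a file's size without downloading it. The
sketch below uses only the Python standard library; the node address and cap
in the example are hypothetical, and the assumption that the size is reported
in the Content-Length header is ours rather than something these notes state.

```python
import urllib.request


def head_request(url):
    """Build (but do not send) a HEAD request for `url`."""
    return urllib.request.Request(url, method="HEAD")


def content_length(url):
    """Send the HEAD request and return the Content-Length header, if any.

    Assumption: the node reports the file size in Content-Length.
    """
    with urllib.request.urlopen(head_request(url)) as response:
        return response.headers.get("Content-Length")


if __name__ == "__main__":
    # Hypothetical URL; building the request does not touch the network.
    req = head_request("http://127.0.0.1:3456/uri/URI%3ASSK%3Aexample")
    print(req.get_method(), req.full_url)
```
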
Each file and directory now has a "Show More Info" web page, which contains
much of the information that was crammed into the directory page before. This
includes readonly URIs, storage index strings, object type, buttons to
control checking/verifying/repairing, and deep-check/deep-stats buttons (for
directories). For mutable files, the "replace contents" upload form has been
moved here too. As a result, the directory page is now much simpler and
cleaner.

** Packaging

The Tahoe dependencies have been extended to require the
"[secure_connections]" feature from Foolscap, which will cause pyOpenSSL to
be required and/or installed. If OpenSSL and its development headers are
already installed on your system, this can occur automatically. Tahoe now
uses pollreactor (instead of the default selectreactor) to work around a bug
between pyOpenSSL and the most recent release of Twisted (8.1.0). This bug
only affects unit tests (hang during shutdown), and should not impact regular
use.

The Tahoe source code tarballs now come in two different forms: regular and
"sumo". The regular tarball contains just Tahoe, nothing else. When building
from the regular tarball, the build process will download any unmet
dependencies from the internet (starting with the index at PyPI) so it can
build and install them. The "sumo" tarball contains copies of all the
libraries that Tahoe requires (foolscap, twisted, zfec, etc), so using the
"sumo" tarball should not require any internet access during the build
process. This can be useful if you want to build Tahoe while on an airplane,
a desert island, or other bandwidth-limited environments.

Similarly, allmydata.org now hosts a "tahoe-deps" tarball which contains the
latest versions of all these dependencies. This tarball, located at
http://allmydata.org/source/tahoe/deps/tahoe-deps.tar.gz, can be unpacked in
the tahoe source tree (or in its parent directory), and the build process
should satisfy its downloading needs from it instead of reaching out to PyPI.
This can be useful if you want to build Tahoe from a darcs checkout while on
that airplane or desert island.

As a result of the previous two changes ("sumo" tarballs and the "tahoe-deps"
bundle), most of the files have been removed from misc/dependencies/ . This
brings the regular Tahoe tarball down to 2MB (compressed), and the darcs
checkout (without history) to about 7.6MB. A full darcs checkout will still
be fairly large (because of the historical patches which included the
dependent libraries), but a 'lazy' one should now be small.

The default "make" target is now an alias for "setup.py build_tahoe", which
itself is a wrapper around "setup.py develop --prefix support/lib", with some
extra work before and after. Most of the complicated platform-dependent code
in the Makefile was rewritten in Python and moved into setup.py, simplifying
things considerably.

Likewise, the "make test" target now delegates most of its work to "setup.py
trial", which takes care of getting PYTHONPATH configured to access the tahoe
code (and dependencies) that gets put in support/lib/ by the build_tahoe
step. This should allow unit tests to be run even when trial (which is part
of Twisted) wasn't already installed (in this case, trial gets installed to
support/bin because Twisted is a dependency of Tahoe).

** Grid Management Tools

Several tools have been added or updated in the misc/ directory, mostly munin
plugins that can be used to monitor a storage grid.

The misc/spacetime/ directory contains a "disk watcher" daemon (startable
with 'tahoe start'), which can be configured with a set of HTTP URLs
(pointing at the webapi '/statistics' page of a bunch of storage servers),
and will periodically fetch disk-used/disk-available information from all the
servers. It keeps this information in an Axiom database (a sqlite-based
library available from divmod.org). The daemon computes time-averaged rates
of disk usage, as well as a prediction of how much time is left before the
grid is completely full.

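The kind of calculation the disk watcher performs can be sketched as follows.
This is a simplified illustration, not the daemon's actual code: the function
names are hypothetical, and the rate is taken as a plain average between the
oldest and newest sample, which may differ from the averaging the daemon
really uses.

```python
def usage_rate(samples):
    """Average growth in bytes/second between the first and last sample.

    `samples` is a list of (timestamp_seconds, bytes_used) pairs,
    ordered oldest first.
    """
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need two samples taken at different times")
    return (used1 - used0) / (t1 - t0)


def seconds_until_full(samples, bytes_available):
    """Predict the time left before the grid is full.

    Returns None when usage is flat or shrinking, since no meaningful
    prediction can be made in that case.
    """
    rate = usage_rate(samples)
    if rate <= 0:
        return None
    return bytes_available / rate


if __name__ == "__main__":
    # Two samples an hour apart, during which 3600 bytes were added.
    history = [(0, 1000), (3600, 4600)]
    print(usage_rate(history), seconds_until_full(history, 86400))
```
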
The misc/munin/ directory contains a new set of munin plugins
(tahoe_diskleft, tahoe_diskusage, tahoe_doomsday) which talk to the
disk-watcher and provide graphs of its calculations.

To support the disk-watcher, the Tahoe statistics component (visible through
the webapi at the /statistics/ URL) now includes disk-used and disk-available
information. Both are derived through an equivalent of the unix 'df' command
(i.e. they ask the kernel for the number of free blocks on the partition that
encloses the BASEDIR/storage directory). In the future, the disk-available
number will be further influenced by the local storage policy: if that policy
says that the server should refuse new shares when less than 5GB is left on
the partition, then "disk-available" will report zero even though the kernel
sees 5GB remaining.

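The planned interaction between the kernel's free-space figure and the local
storage policy can be sketched like this. The function names and the
"reserved" parameter are hypothetical; the sketch assumes a unix system
(os.statvfs) and is not the code used by the statistics component.

```python
import os


def kernel_free_bytes(path):
    """Free space on the partition enclosing `path`, as 'df' would see it.

    Unix only: uses the free-block count available to unprivileged users.
    """
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def disk_available(free_bytes, reserved_bytes):
    """Space the server is willing to offer once the reserve is set aside.

    Never negative: at or below the reserve, the answer is zero.
    """
    return max(0, free_bytes - reserved_bytes)


if __name__ == "__main__":
    GB = 10 ** 9
    # With a 5GB reserve, 5GB of real free space is reported as zero.
    print(disk_available(5 * GB, 5 * GB))
```
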
The 'tahoe_overhead' munin plugin interacts with an allmydata.com-specific
server which reports the total of the 'deep-size' reports for all active user
accounts, and compares this with the disk-watcher data to report on overhead
percentages. This provides information on how much space could be recovered
once Tahoe implements some form of garbage collection.

** Other Changes

Clients now declare their "oldest-supported version" to be 1.0.0 . This is
part of a backwards-compatibility system that has not yet been fully
specified. Previous releases declared their oldest-supported-version to be
the same as their current version number.

Several bugs have been fixed, including one that would cause an exception (in
the logs) if a webapi download operation was cancelled (by closing the TCP
connection, or pushing the "stop" button in a web browser).

Tahoe now uses Foolscap "Incidents", writing an "incident report" file to
logs/incidents/ each time something weird occurs. These reports are available
to an "incident gatherer" through the flogtool command. For more details,
please see the Foolscap logging documentation.

* Release 1.2.0 (2008-07-21)

** Security

This release makes the immutable-file "ciphertext hash tree" mandatory.
Previous releases allowed the uploader to decide whether their file would
have an integrity check on the ciphertext or not. A malicious uploader could
use this to create a readcap that would download as one file or a different
one, depending upon which shares the client fetched first, with no errors
raised. There are other integrity checks on the shares themselves, preventing
a storage server or other party from violating the integrity properties of
the read-cap: this failure was only exploitable by the uploader who gives you
a carefully constructed read-cap. If you download the file with Tahoe 1.2.0
or later, you will not be vulnerable to this problem. #491

This change does not introduce a compatibility issue, because all existing
versions of Tahoe will emit the ciphertext hash tree in their shares.

** Dependencies

Tahoe now requires Foolscap-0.2.9 . It also requires pycryptopp 0.5 or newer,
since earlier versions had a bug that interacted with specific compiler
versions that could sometimes result in incorrect encryption behavior. Both
packages are included in the Tahoe source tarball in misc/dependencies/ , and
should be built automatically when necessary.

** Web API

Web API directory pages should now contain properly-slash-terminated links to
other directories. They have also stopped using absolute links in forms and
pages (which interfered with the use of a front-end load-balancing proxy).

The behavior of the "Check This File" button changed, in conjunction with
larger internal changes to file checking/verification. The button triggers an
immediate check as before, but the outcome is shown on its own page, and does
not get stored anywhere. As a result, the web directory page no longer shows
historical checker results.

A new "Deep-Check" button has been added, which allows a user to initiate a
recursive check of the given directory and all files and directories
reachable from it. This can cause quite a bit of work, and has no
intermediate progress information or feedback about the process. In addition,
the results of the deep-check are extremely limited. A later release will
improve this behavior.

The web server's behavior with respect to non-ASCII (unicode) filenames in
the "GET save=true" operation has been improved. To achieve maximum
compatibility with variously buggy web browsers, the server does not try to
figure out the character set of the inbound filename. It just echoes the same
bytes back to the browser in the Content-Disposition header. This seems to
make both IE7 and Firefox work correctly.

** Checker/Verifier/Repairer

Tahoe is slowly acquiring convenient tools to check up on file health,
examine existing shares for errors, and repair files that are not fully
healthy. This release adds a mutable checker/verifier/repairer, although
testing is very limited, and there are no web interfaces to trigger repair
yet. The "Check" button next to each file or directory on the webapi page
will perform a file check, and the "deep check" button on each directory will
recursively check all files and directories reachable from there (which may
take a very long time).

Future releases will improve access to this functionality.

** Operations/Packaging

A "check-grid" script has been added, along with a Makefile target. This is
intended (with the help of a pre-configured node directory) to check upon the
health of a Tahoe grid, uploading and downloading a few files. This can be
used as a monitoring tool for a deployed grid, to be run periodically and to
signal an error if it ever fails. It also helps with compatibility testing,
to verify that the latest Tahoe code is still able to handle files created by
an older version.

The munin plugins from misc/munin/ are now copied into any generated debian
packages, and are made executable (and uncompressed) so they can be symlinked
directly from /etc/munin/plugins/ .

Ubuntu "Hardy" was added as a supported debian platform, with a Makefile
target to produce hardy .deb packages. Some notes have been added to
docs/debian.txt about building Tahoe on a debian/ubuntu system.

Storage servers now measure operation rates and latency-per-operation, and
provide results through the /statistics web page as well as the stats
gatherer. Munin plugins have been added to match.

** Other

Tahoe nodes now use Foolscap "incident logging" to record unusual events to
their NODEDIR/logs/incidents/ directory. These incident files can be examined
by Foolscap logging tools, or delivered to an external log-gatherer for
further analysis. Note that Tahoe now requires Foolscap-0.2.9, since 0.2.8
had a bug that complained about "OSError: File exists" when trying to create
the incidents/ directory for a second time.

If no servers are available when retrieving a mutable file (like a
directory), the node now reports an error instead of hanging forever. Earlier
releases would not only hang (causing the webapi directory listing to get
stuck half-way through), but the internal dirnode serialization would cause
all subsequent attempts to retrieve or modify the same directory to hang as
well. #463

A minor internal exception (reported in logs/twistd.log, in the
"stopProducing" method) was fixed, which complained about "self._paused_at
not defined" whenever a file download was stopped from the web browser end.

* Release 1.1.0 (2008-06-11)

** CLI: new "alias" model

The new CLI code uses an scp/rsync -like interface, in which directories in
the Tahoe storage grid are referenced by a colon-suffixed alias. The new
commands look like:
 tahoe cp local.txt tahoe:virtual.txt
 tahoe ls work:subdir

More functionality is available through the CLI: creating unlinked files and
directories, recursive copy in or out of the storage grid, hardlinks, and
retrieving the raw read- or write- caps through the 'ls' command. Please read
docs/CLI.txt for complete details.

** webapi: new pages, new commands

Several new pages were added to the web API:

 /helper_status : to describe what a Helper is doing
 /statistics : reports node uptime, CPU usage, other stats
 /file : for easy file-download URLs, see #221
 /cap == /uri : future compatibility

The localdir=/localfile= and t=download operations were removed. These
required special configuration to enable anyway, but this feature was a
security problem, and was mostly obviated by the new "cp -r" command.

Several new options to the GET command were added:

 t=deep-size : add up the size of all immutable files reachable from the directory
 t=deep-stats : return a JSON-encoded description of number of files, size
                distribution, total size, etc

POST is now preferred over PUT for most operations which cause side-effects.

Most webapi calls now accept overwrite=, and default to overwrite=true .

"POST /uri/DIRCAP/parent/child?t=mkdir" is now the preferred API to create
multiple directories at once, rather than ...?t=mkdir-p .

PUT to a mutable file ("PUT /uri/MUTABLEFILECAP", "PUT /uri/DIRCAP/child")
will modify the file in-place.

** more munin graphs in misc/munin/

 tahoe-introstats
 tahoe-rootdir-space
 tahoe_estimate_files
 mutable files published/retrieved
 tahoe_cpu_watcher
 tahoe_spacetime

** New Dependencies

 zfec 1.1.0
 foolscap 0.2.8
 pycryptopp 0.5
 setuptools (now required at runtime)

** New Mutable-File Code

The mutable-file handling code (mostly used for directories) has been
completely rewritten. The new scheme has a better API (with a modify()
method) and is less likely to lose data when several uncoordinated writers
change a file at the same time.

In addition, a single Tahoe process will coordinate its own writes. If you
make two concurrent directory-modifying webapi calls to a single tahoe node,
it will internally make one of them wait for the other to complete. This
prevents auto-collision (#391).

The new mutable-file code also detects errors during publish better. Earlier
releases might believe that a mutable file was published when in fact it
failed.

** other features

The node now monitors its own CPU usage, as a percentage, measured every 60
seconds. 1/5/15 minute moving averages are available on the /statistics web
page and via the stats-gathering interface.

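One simple way to produce 1/5/15 minute moving averages from per-minute
samples is shown below. The class and method names are hypothetical, and a
plain windowed mean is assumed; these notes do not say how the node computes
its averages.

```python
from collections import deque


class CPUAverager:
    """Keep one CPU-usage sample per minute and report windowed averages."""

    WINDOWS = (1, 5, 15)  # window lengths in minutes

    def __init__(self):
        # Only the most recent 15 samples are ever needed.
        self.samples = deque(maxlen=max(self.WINDOWS))

    def add_sample(self, percent):
        """Record the CPU percentage measured over the last 60 seconds."""
        self.samples.append(percent)

    def averages(self):
        """Return {minutes: mean} over the samples collected so far."""
        recent = list(self.samples)
        result = {}
        for minutes in self.WINDOWS:
            window = recent[-minutes:]
            if window:
                result[minutes] = sum(window) / len(window)
        return result


if __name__ == "__main__":
    cpu = CPUAverager()
    for percent in (10, 20, 30, 40, 50):
        cpu.add_sample(percent)
    print(cpu.averages())
```

Until 15 samples have been collected, the longer windows simply average
whatever is available.
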
Clients now accelerate reconnection to all servers after being offline
(#374). When a client is offline for a long time, it scales back reconnection
attempts to approximately once per hour, so it may take a while to make the
first attempt, but once any attempt succeeds, the other server connections
will be retried immediately.

A new "offloaded KeyGenerator" facility can be configured, to move RSA key
generation out from, say, a webapi node, into a separate process. RSA keys
can take several seconds to create, and so a webapi node which is being used
for directory creation will be unavailable for anything else during this
time. The Key Generator process will pre-compute a small pool of keys, to
speed things up further. This also takes better advantage of multi-core CPUs,
or SMP hosts.

The node will only use a potentially-slow "du -s" command at startup (to
measure how much space has been used) if the "sizelimit" parameter has been
configured (to limit how much space is used). Large storage servers should
turn off sizelimit until a later release improves the space-management code,
since "du -s" on a terabyte filesystem can take hours.

The Introducer now allows new announcements to replace old ones, to avoid
buildups of obsolete announcements.

Immutable files are limited to about 12GiB (when using the default 3-of-10
encoding), because larger files would be corrupted by the four-byte
share-size field on the storage servers (#439). A later release will remove
this limit. Earlier releases would allow >12GiB uploads, but the resulting
file would be unretrievable.

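The 12GiB figure follows from simple arithmetic, sketched below. The
function name is hypothetical, and the sketch assumes that each share holds
roughly 1/k of the file and ignores per-share overhead such as hash trees,
which is why the real limit is only "about" 12GiB.

```python
GiB = 2 ** 30


def max_file_size(k, size_field_bytes=4):
    """Approximate largest file whose shares fit the share-size field.

    A field of `size_field_bytes` bytes can describe shares of up to
    2**(8 * size_field_bytes) bytes, and a k-of-N encoding spreads the
    file so that each share carries about 1/k of it.
    """
    max_share_size = 2 ** (8 * size_field_bytes)
    return k * max_share_size


if __name__ == "__main__":
    # Default 3-of-10 encoding with a four-byte size field.
    print(max_file_size(3) // GiB, "GiB")
```
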
The docs/ directory has been rearranged, with old docs put in
docs/historical/ and not-yet-implemented ones in docs/proposed/ .

The Mac OS-X FUSE plugin has a significant bug fix: earlier versions would
corrupt writes that used seek() instead of writing the file in linear order.
The rsync tool is known to write files this way (seeking rather than writing
linearly). This has been fixed.