docs: more formatting cleanups and corrections. Spell webapi and wapi as web-API.

david-sarah 2010-12-26 21:05:33 -08:00
parent 7be2c73f08
commit 45dd8b910a
14 changed files with 244 additions and 235 deletions

View File

@@ -14,9 +14,8 @@ To speed up backup operations, Tahoe maintains a small database known as the
 uploaded recently.
 This database lives in ``~/.tahoe/private/backupdb.sqlite``, and is a SQLite
-single-file database. It is used by the "tahoe backup" command. In the
-future, it will also be used by "tahoe mirror", and by "tahoe cp" when the
-``--use-backupdb`` option is included.
+single-file database. It is used by the "``tahoe backup``" command. In the
+future, it may optionally be used by other commands such as "``tahoe cp``".
 The purpose of this database is twofold: to manage the file-to-cap
 translation (the "upload" step) and the directory-to-cap translation (the
@@ -24,7 +23,7 @@ translation (the "upload" step) and the directory-to-cap translation (the
 The overall goal of optimizing backup is to reduce the work required when the
 source disk has not changed (much) since the last backup. In the ideal case,
-running "tahoe backup" twice in a row, with no intervening changes to the
+running "``tahoe backup``" twice in a row, with no intervening changes to the
 disk, will not require any network traffic. Minimal changes to the source
 disk should result in minimal traffic.
@@ -32,12 +31,12 @@ This database is optional. If it is deleted, the worst effect is that a
 subsequent backup operation may use more effort (network bandwidth, CPU
 cycles, and disk IO) than it would have without the backupdb.
-The database uses sqlite3, which is included as part of the standard python
-library with python2.5 and later. For python2.4, Tahoe will try to install the
+The database uses sqlite3, which is included as part of the standard Python
+library with Python 2.5 and later. For Python 2.4, Tahoe will try to install the
 "pysqlite" package at build-time, but this will succeed only if sqlite3 with
 development headers is already installed. On Debian and Debian derivatives
 you can install the "python-pysqlite2" package (which, despite the name,
-actually provides sqlite3 rather than sqlite2), but on old distributions such
+actually provides sqlite3 rather than sqlite2). On old distributions such
 as Debian etch (4.0 "oldstable") or Ubuntu Edgy (6.10) the "python-pysqlite2"
 package won't work, but the "sqlite3-dev" package will.
@@ -84,11 +83,11 @@ The database contains the following tables::
 Upload Operation
 ================
-The upload process starts with a pathname (like ~/.emacs) and wants to end up
-with a file-cap (like URI:CHK:...).
+The upload process starts with a pathname (like ``~/.emacs``) and wants to end up
+with a file-cap (like ``URI:CHK:...``).
 The first step is to convert the path to an absolute form
-(/home/warner/.emacs) and do a lookup in the local_files table. If the path
+(``/home/warner/.emacs``) and do a lookup in the local_files table. If the path
 is not present in this table, the file must be uploaded. The upload process
 is:
@@ -150,8 +149,8 @@ checked and found healthy, the 'last_upload' entry is updated.
 Relying upon timestamps is a compromise between efficiency and safety: a file
 which is modified without changing the timestamp or size will be treated as
-unmodified, and the "tahoe backup" command will not copy the new contents
-into the grid. The ``--no-timestamps`` can be used to disable this
+unmodified, and the "``tahoe backup``" command will not copy the new contents
+into the grid. The ``--no-timestamps`` option can be used to disable this
 optimization, forcing every byte of the file to be hashed and encoded.
 Directory Operations
@@ -162,17 +161,17 @@ dircap for each directory), the backup process must find or create a tahoe
 directory node with the same contents. The contents are hashed, and the hash
 is queried in the 'directories' table. If found, the last-checked timestamp
 is used to perform the same random-early-check algorithm described for files
-above, but no new upload is performed. Since "tahoe backup" creates immutable
+above, but no new upload is performed. Since "``tahoe backup``" creates immutable
 directories, it is perfectly safe to re-use a directory from a previous
 backup.
-If not found, the webapi "mkdir-immutable" operation is used to create a new
+If not found, the web-API "mkdir-immutable" operation is used to create a new
 directory, and an entry is stored in the table.
 The comparison operation ignores timestamps and metadata, and pays attention
 solely to the file names and contents.
-By using a directory-contents hash, the "tahoe backup" command is able to
+By using a directory-contents hash, the "``tahoe backup``" command is able to
 re-use directories from other places in the backed up data, or from old
 backups. This means that renaming a directory and moving a subdirectory to a
 new parent both count as "minor changes" and will result in minimal Tahoe
@@ -184,7 +183,7 @@ directories from backup #1.
 The best case is a null backup, in which nothing has changed. This will
 result in minimal network bandwidth: one directory read and two modifies. The
-Archives/ directory must be read to locate the latest backup, and must be
-modified to add a new snapshot, and the Latest/ directory will be updated to
+``Archives/`` directory must be read to locate the latest backup, and must be
+modified to add a new snapshot, and the ``Latest/`` directory will be updated to
 point to that same snapshot.
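The upload decision described in this file boils down to a stat() of the local file plus one SQLite lookup. A minimal sketch of that check, assuming a ``local_files`` table with roughly the columns described in the schema above (this is an illustration, not the actual Tahoe code)::

    import os, sqlite3

    def backupdb_check(dbpath, localpath):
        """Return the stored fileid if 'localpath' appears unmodified,
        or None if it must be (re-)uploaded."""
        abspath = os.path.abspath(localpath)      # same normalization as above
        st = os.stat(abspath)
        db = sqlite3.connect(dbpath)
        row = db.execute("SELECT size, mtime, ctime, fileid"
                         " FROM local_files WHERE path=?",
                         (abspath,)).fetchone()
        db.close()
        if row is None:
            return None                            # never seen: upload it
        size, mtime, ctime, fileid = row
        if (size, mtime, ctime) != (st.st_size, st.st_mtime, st.st_ctime):
            return None                            # size/timestamp changed
        return fileid                              # treat as unmodified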

View File

@@ -149,7 +149,7 @@ set the ``tub.location`` option described below.
 tub.port = 8098
 tub.location = external-firewall.example.com:7912
-* Run a node behind a Tor proxy (perhaps via torsocks), in client-only
+* Run a node behind a Tor proxy (perhaps via ``torsocks``), in client-only
 mode (i.e. we can make outbound connections, but other nodes will not
 be able to connect to us). The literal '``unreachable.example.org``' will
 not resolve, but will serve as a reminder to human observers that this
@@ -186,7 +186,7 @@ set the ``tub.location`` option described below.
 a "log gatherer", which will be granted access to the logport. This can
 be used by centralized storage grids to gather operational logs in a
 single place. Note that when an old-style ``BASEDIR/log_gatherer.furl`` file
-exists (see 'Backwards Compatibility Files', below), both are used. (For
+exists (see `Backwards Compatibility Files`_, below), both are used. (For
 most other items, the separate config file overrides the entry in
 ``tahoe.cfg``.)
@@ -208,12 +208,12 @@ set the ``tub.location`` option described below.
 each connection to another node, if nothing has been heard for a while,
 we will drop the connection. The duration of silence that passes before
 dropping the connection will be between DT-2*KT and 2*DT+2*KT (please see
-ticket #521 for more details). If we are sending a large amount of data
+ticket `#521`_ for more details). If we are sending a large amount of data
 to the other end (which takes more than DT-2*KT to deliver), we might
 incorrectly drop the connection. The default behavior (when this value is
 not provided) is to disable the disconnect timer.
-See ticket #521 for a discussion of how to pick these timeout values.
+See ticket `#521`_ for a discussion of how to pick these timeout values.
 Using 30 minutes means we'll disconnect after 22 to 68 minutes of
 inactivity. Receiving data will reset this timeout, however if we have
 more than 22min of data in the outbound queue (such as 800kB in two
@@ -221,6 +221,8 @@ set the ``tub.location`` option described below.
 contact us, our ping might be delayed, so we may disconnect them by
 accident.
+.. _`#521`: http://tahoe-lafs.org/trac/tahoe-lafs/ticket/521
 ``ssh.port = (strports string, optional)``
 ``ssh.authorized_keys_file = (filename, optional)``
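The DT/KT relationship quoted above is easy to sanity-check with a couple of lines of arithmetic. The 4-minute keepalive figure below is inferred from the 22-to-68-minute example in the text, not read out of the Foolscap source::

    def disconnect_window(dt, kt):
        """Seconds of silence before the connection may be dropped,
        using the DT-2*KT .. 2*DT+2*KT bounds described above."""
        return (dt - 2*kt, 2*dt + 2*kt)

    # 30-minute disconnect timer, 4-minute keepalive timer (assumed):
    lo, hi = disconnect_window(dt=30*60, kt=4*60)
    print(lo // 60, hi // 60)   # -> 22 68, matching the example above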
@@ -236,8 +238,8 @@ set the ``tub.location`` option described below.
 ``tempdir = (string, optional)``
-This specifies a temporary directory for the webapi server to use, for
-holding large files while they are being uploaded. If a webapi client
+This specifies a temporary directory for the web-API server to use, for
+holding large files while they are being uploaded. If a web-API client
 attempts to upload a 10GB file, this tempdir will need to have at least
 10GB available for the upload to complete.
@@ -400,10 +402,11 @@ and pays attention to the ``[node]`` section, but not the others.
 The Introducer node maintains some different state than regular client nodes.
-``BASEDIR/introducer.furl`` : This is generated the first time the introducer
-node is started, and used again on subsequent runs, to give the introduction
-service a persistent long-term identity. This file should be published and
-copied into new client nodes before they are started for the first time.
+``BASEDIR/introducer.furl``
+This is generated the first time the introducer node is started, and used
+again on subsequent runs, to give the introduction service a persistent
+long-term identity. This file should be published and copied into new client
+nodes before they are started for the first time.
 Other Files in BASEDIR
@@ -572,14 +575,17 @@ these are not the default values), merely a legal one.
 ssh.port = 8022
 ssh.authorized_keys_file = ~/.ssh/authorized_keys
 [client]
 introducer.furl = pb://ok45ssoklj4y7eok5c3xkmj@tahoe.example:44801/ii3uumo
 helper.furl = pb://ggti5ssoklj4y7eok5c3xkmj@helper.tahoe.example:7054/kk8lhr
 [storage]
 enabled = True
 readonly_storage = True
 sizelimit = 10000000000
 [helper]
 run_helper = True
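Since ``tahoe.cfg`` uses the .INI-style syntax shown in the sample above, it can be read with Python's stock ConfigParser. A small sketch using only the keys visible in the sample file (this helper is not part of Tahoe itself)::

    import os
    try:
        from configparser import ConfigParser                       # Python 3
    except ImportError:
        from ConfigParser import SafeConfigParser as ConfigParser   # Python 2

    def read_sample_settings(basedir):
        """Pull a few of the settings shown in the sample tahoe.cfg above."""
        cfg = ConfigParser()
        cfg.read(os.path.join(basedir, "tahoe.cfg"))
        return {
            "introducer.furl": cfg.get("client", "introducer.furl"),
            "storage.enabled": cfg.getboolean("storage", "enabled"),
            "helper.run_helper": cfg.getboolean("helper", "run_helper"),
        }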

View File

@@ -12,7 +12,7 @@ Debian Support
 Overview
 ========
-One convenient way to install Tahoe-LAFS is with debian packages.
+One convenient way to install Tahoe-LAFS is with Debian packages.
 This document attempts to explain how to complete a desert island build for
 people in a hurry. It also attempts to explain more about our Debian packaging
 for those willing to read beyond the simple pragmatic packaging exercises.
@@ -21,7 +21,7 @@ TL;DR supporting package building instructions
 ==============================================
 There are only four supporting packages that are currently not available from
-the debian apt repositories in Debian Lenny::
+the Debian apt repositories in Debian Lenny::
 python-foolscap python-zfec argparse zbase32
@@ -99,23 +99,23 @@ a source release, do the following::
 sudo dpkg -i ../allmydata-tahoe_1.6.1-r4262_all.deb
 You should now have a functional desert island build of Tahoe with all of the
-supported libraries as .deb packages. You'll need to edit the Debian specific
-/etc/defaults/allmydata-tahoe file to get Tahoe started. Data is by default
-stored in /var/lib/tahoelafsd/ and Tahoe runs as the 'tahoelafsd' user.
+supported libraries as .deb packages. You'll need to edit the Debian-specific
+``/etc/defaults/allmydata-tahoe`` file to get Tahoe started. Data is by default
+stored in ``/var/lib/tahoelafsd/`` and Tahoe runs as the 'tahoelafsd' user.
 Building Debian Packages
 ========================
-The Tahoe source tree comes with limited support for building debian packages
+The Tahoe source tree comes with limited support for building Debian packages
 on a variety of Debian and Ubuntu platforms. For each supported platform,
 there is a "deb-PLATFORM-head" target in the Makefile that will produce a
-debian package from a darcs checkout, using a version number that is derived
+Debian package from a darcs checkout, using a version number that is derived
 from the most recent darcs tag, plus the total number of revisions present in
 the tree (e.g. "1.1-r2678").
-To create debian packages from a Tahoe tree, you will need some additional
+To create Debian packages from a Tahoe tree, you will need some additional
 tools installed. The canonical list of these packages is in the
-"Build-Depends" clause of misc/sid/debian/control , and includes::
+"Build-Depends" clause of ``misc/sid/debian/control``, and includes::
 build-essential
 debhelper
@@ -127,13 +127,13 @@ tools installed. The canonical list of these packages is in the
 python-twisted-core
 In addition, to use the "deb-$PLATFORM-head" target, you will also need the
-"debchange" utility from the "devscripts" package, and the "fakeroot" package.
+"``debchange``" utility from the "devscripts" package, and the "fakeroot" package.
 Some recent platforms can be handled by using the targets for the previous
 release, for example if there is no "deb-hardy-head" target, try building
 "deb-gutsy-head" and see if the resulting package will work.
-Note that we haven't tried to build source packages (.orig.tar.gz + dsc) yet,
+Note that we haven't tried to build source packages (``.orig.tar.gz`` + dsc) yet,
 and there are no such source packages in our APT repository.
 Using Pre-Built Debian Packages
@@ -146,16 +146,16 @@ describes this repository.
 The ``tahoe-lafs.org`` APT repository also includes Debian packages of support
 libraries, like Foolscap, zfec, pycryptopp, and everything else you need that
-isn't already in debian.
+isn't already in Debian.
 Building From Source on Debian Systems
 ======================================
 Many of Tahoe's build dependencies can be satisfied by first installing
-certain debian packages: simplejson is one of these. Some debian/ubuntu
-platforms do not provide the necessary .egg-info metadata with their
+certain Debian packages: simplejson is one of these. Some Debian/Ubuntu
+platforms do not provide the necessary ``.egg-info`` metadata with their
 packages, so the Tahoe build process may not believe they are present. Some
-Tahoe dependencies are not present in most debian systems (such as foolscap
+Tahoe dependencies are not present in most Debian systems (such as foolscap
 and zfec): debs for these are made available in the APT repository described
 above.
@@ -164,9 +164,9 @@ that it needs to run and which are not already present in the build
 environment).
 We have observed occasional problems with this acquisition process. In some
-cases, setuptools will only be half-aware of an installed debian package,
+cases, setuptools will only be half-aware of an installed Debian package,
 just enough to interfere with the automatic download+build of the dependency.
-For example, on some platforms, if Nevow-0.9.26 is installed via a debian
+For example, on some platforms, if Nevow-0.9.26 is installed via a Debian
 package, setuptools will believe that it must download Nevow anyways, but it
 will insist upon downloading that specific 0.9.26 version. Since the current
 release of Nevow is 0.9.31, and 0.9.26 is no longer available for download,

View File

@@ -21,4 +21,4 @@ automatically, but older filesystems may not have it enabled::
 If "dir_index" is present in the "features:" line, then you're all set. If
 not, you'll need to use tune2fs and e2fsck to enable and build the index. See
-<http://wiki.dovecot.org/MailboxFormat/Maildir> for some hints.
+`<http://wiki.dovecot.org/MailboxFormat/Maildir>`_ for some hints.

View File

@@ -82,7 +82,7 @@ clients.
 "key-generation" service, which allows a client to offload their RSA key
 generation to a separate process. Since RSA key generation takes several
 seconds, and must be done each time a directory is created, moving it to a
-separate process allows the first process (perhaps a busy webapi server) to
+separate process allows the first process (perhaps a busy web-API server) to
 continue servicing other requests. The key generator exports a FURL that can
 be copied into a node to enable this functionality.
@@ -96,8 +96,8 @@ same way as "``tahoe run``".
 "``tahoe stop [NODEDIR]``" will shut down a running node.
-"``tahoe restart [NODEDIR]``" will stop and then restart a running node. This is
-most often used by developers who have just modified the code and want to
+"``tahoe restart [NODEDIR]``" will stop and then restart a running node. This
+is most often used by developers who have just modified the code and want to
 start using their changes.
@@ -107,15 +107,15 @@ Filesystem Manipulation
 These commands let you examine a Tahoe-LAFS filesystem, providing basic
 list/upload/download/delete/rename/mkdir functionality. They can be used as
 primitives by other scripts. Most of these commands are fairly thin wrappers
-around webapi calls, which are described in `<webapi.rst>`_.
+around web-API calls, which are described in `<webapi.rst>`_.
-By default, all filesystem-manipulation commands look in ``~/.tahoe/`` to figure
-out which Tahoe-LAFS node they should use. When the CLI command makes webapi
-calls, it will use ``~/.tahoe/node.url`` for this purpose: a running Tahoe-LAFS
-node that provides a webapi port will write its URL into this file. If you want
-to use a node on some other host, just create ``~/.tahoe/`` and copy that node's
-webapi URL into this file, and the CLI commands will contact that node instead
-of a local one.
+By default, all filesystem-manipulation commands look in ``~/.tahoe/`` to
+figure out which Tahoe-LAFS node they should use. When the CLI command makes
+web-API calls, it will use ``~/.tahoe/node.url`` for this purpose: a running
+Tahoe-LAFS node that provides a web-API port will write its URL into this
+file. If you want to use a node on some other host, just create ``~/.tahoe/``
+and copy that node's web-API URL into this file, and the CLI commands will
+contact that node instead of a local one.
 These commands also use a table of "aliases" to figure out which directory
 they ought to use as a starting point. This is explained in more detail below.
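A rough sketch of what these thin wrappers do with ``~/.tahoe/node.url``: read the web-API root URL, then issue a plain HTTP request. The ``/uri/$CAP?t=json`` form is the directory-listing request documented in webapi.rst; error handling and URL-quoting are omitted here, and this is not Tahoe's own CLI code::

    import os
    try:
        from urllib.request import urlopen   # Python 3
    except ImportError:
        from urllib2 import urlopen          # Python 2

    def node_url(nodedir="~/.tahoe"):
        """Return the web-API root URL written by a running Tahoe-LAFS node."""
        path = os.path.expanduser(os.path.join(nodedir, "node.url"))
        with open(path) as f:
            return f.read().strip().rstrip("/")

    def list_directory(dircap):
        """Fetch the JSON listing for a directory cap via the web-API."""
        return urlopen("%s/uri/%s?t=json" % (node_url(), dircap)).read()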
@@ -258,7 +258,8 @@ In these summaries, ``PATH``, ``TOPATH`` or ``FROMPATH`` can be one of::
 * ``[SUBDIRS/]FILENAME`` for a path relative to the default ``tahoe:`` alias;
 * ``ALIAS:[SUBDIRS/]FILENAME`` for a path relative to another alias;
-* ``DIRCAP/[SUBDIRS/]FILENAME`` or ``DIRCAP:./[SUBDIRS/]FILENAME`` for a path relative to a directory cap.
+* ``DIRCAP/[SUBDIRS/]FILENAME`` or ``DIRCAP:./[SUBDIRS/]FILENAME`` for a
+  path relative to a directory cap.
 Command Examples

View File

@@ -37,7 +37,7 @@ All Tahoe-LAFS client nodes can run a frontend FTP server, allowing regular FTP
 clients (like /usr/bin/ftp, ncftp, and countless others) to access the
 virtual filesystem. They can also run an SFTP server, so SFTP clients (like
 /usr/bin/sftp, the sshfs FUSE plugin, and others) can too. These frontends
-sit at the same level as the webapi interface.
+sit at the same level as the web-API interface.
 Since Tahoe-LAFS does not use user accounts or passwords, the FTP/SFTP servers
 must be configured with a way to first authenticate a user (confirm that a

View File

@@ -30,7 +30,7 @@ What's involved in a download?
 ==============================
 Downloads are triggered by read() calls, each with a starting offset (defaults
-to 0) and a length (defaults to the whole file). A regular webapi GET request
+to 0) and a length (defaults to the whole file). A regular web-API GET request
 will result in a whole-file read() call.
 Each read() call turns into an ordered sequence of get_segment() calls. A

View File

@@ -109,7 +109,7 @@ actions to upload, rename, and delete files.
 When an error occurs, the HTTP response code will be set to an appropriate
 400-series code (like 404 Not Found for an unknown childname, or 400 Bad Request
-when the parameters to a webapi operation are invalid), and the HTTP response
+when the parameters to a web-API operation are invalid), and the HTTP response
 body will usually contain a few lines of explanation as to the cause of the
 error and possible responses. Unusual exceptions may result in a 500 Internal
 Server Error as a catch-all, with a default response body containing
@@ -231,9 +231,9 @@ contain unicode filenames, and cannot contain binary strings that are not
 representable as such.
 All Tahoe operations that refer to existing files or directories must include
-a suitable read- or write- cap in the URL: the webapi server won't add one
+a suitable read- or write- cap in the URL: the web-API server won't add one
 for you. If you don't know the cap, you can't access the file. This allows
-the security properties of Tahoe caps to be extended across the webapi
+the security properties of Tahoe caps to be extended across the web-API
 interface.
 Slow Operations, Progress, and Cancelling
@@ -436,22 +436,22 @@ Creating A New Directory
 }
 For forward-compatibility, a mutable directory can also contain caps in
-a format that is unknown to the webapi server. When such caps are retrieved
+a format that is unknown to the web-API server. When such caps are retrieved
 from a mutable directory in a "ro_uri" field, they will be prefixed with
 the string "ro.", indicating that they must not be decoded without
 checking that they are read-only. The "ro." prefix must not be stripped
-off without performing this check. (Future versions of the webapi server
+off without performing this check. (Future versions of the web-API server
 will perform it where necessary.)
 If both the "rw_uri" and "ro_uri" fields are present in a given PROPDICT,
-and the webapi server recognizes the rw_uri as a write cap, then it will
+and the web-API server recognizes the rw_uri as a write cap, then it will
 reset the ro_uri to the corresponding read cap and discard the original
 contents of ro_uri (in order to ensure that the two caps correspond to the
 same object and that the ro_uri is in fact read-only). However this may not
-happen for caps in a format unknown to the webapi server. Therefore, when
-writing a directory the webapi client should ensure that the contents
+happen for caps in a format unknown to the web-API server. Therefore, when
+writing a directory the web-API client should ensure that the contents
 of "rw_uri" and "ro_uri" for a given PROPDICT are a consistent
-(write cap, read cap) pair if possible. If the webapi client only has
+(write cap, read cap) pair if possible. If the web-API client only has
 one cap and does not know whether it is a write cap or read cap, then
 it is acceptable to set "rw_uri" to that cap and omit "ro_uri". The
 client must not put a write cap into a "ro_uri" field.
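The rw_uri/ro_uri rules above amount to a small decision procedure on the client side. An illustrative sketch (not Tahoe's own code) of building one PROPDICT for the create-directory requests described in this section::

    def make_propdict(writecap=None, readcap=None, unknown_cap=None, metadata=None):
        """Follow the rules above: a known (write cap, read cap) pair goes
        into "rw_uri"/"ro_uri"; a cap of unknown strength goes into "rw_uri"
        alone, and never into "ro_uri"."""
        propdict = {"metadata": metadata or {}}
        if writecap is not None:
            propdict["rw_uri"] = writecap
        if readcap is not None:
            propdict["ro_uri"] = readcap
        if unknown_cap is not None:
            assert writecap is None and readcap is None
            propdict["rw_uri"] = unknown_cap   # strength unknown: rw_uri only
        return propdict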
@@ -462,7 +462,7 @@ Creating A New Directory
 Also, if the "no-write" field is set to true in the metadata of a link to
 a mutable child, it will cause the link to be diminished to read-only.
-Note that the webapi-using client application must not provide the
+Note that the web-API-using client application must not provide the
 "Content-Type: multipart/form-data" header that usually accompanies HTML
 form submissions, since the body is not formatted this way. Doing so will
 cause a server error as the lower-level code misparses the request body.
@@ -480,18 +480,18 @@ Creating A New Directory
 immutable files, literal files, and deep-immutable directories.
 For forward-compatibility, a deep-immutable directory can also contain caps
-in a format that is unknown to the webapi server. When such caps are retrieved
+in a format that is unknown to the web-API server. When such caps are retrieved
 from a deep-immutable directory in a "ro_uri" field, they will be prefixed
 with the string "imm.", indicating that they must not be decoded without
 checking that they are immutable. The "imm." prefix must not be stripped
-off without performing this check. (Future versions of the webapi server
+off without performing this check. (Future versions of the web-API server
 will perform it where necessary.)
 The cap for each child may be given either in the "rw_uri" or "ro_uri"
 field of the PROPDICT (not both). If a cap is given in the "rw_uri" field,
-then the webapi server will check that it is an immutable read-cap of a
+then the web-API server will check that it is an immutable read-cap of a
 *known* format, and give an error if it is not. If a cap is given in the
-"ro_uri" field, then the webapi server will still check whether known
+"ro_uri" field, then the web-API server will still check whether known
 caps are immutable, but for unknown caps it will simply assume that the
 cap can be stored, as described above. Note that an attacker would be
 able to store any cap in an immutable directory, so this check when
@@ -729,7 +729,7 @@ In Tahoe earlier than v1.4.0, 'mtime' and 'ctime' keys were populated
 instead of the 'tahoe':'linkmotime' and 'tahoe':'linkcrtime' keys. Starting
 in Tahoe v1.4.0, the 'linkmotime'/'linkcrtime' keys in the 'tahoe' sub-dict
 are populated. However, prior to Tahoe v1.7beta, a bug caused the 'tahoe'
-sub-dict to be deleted by webapi requests in which new metadata is
+sub-dict to be deleted by web-API requests in which new metadata is
 specified, and not to be added to existing child links that lack it.
 From Tahoe v1.7.0 onward, the 'mtime' and 'ctime' fields are no longer
@@ -829,7 +829,7 @@ Attaching an existing File or Directory by its read- or write-cap
 Note that this operation does not take its child cap in the form of
 separate "rw_uri" and "ro_uri" fields. Therefore, it cannot accept a
-child cap in a format unknown to the webapi server, unless its URI
+child cap in a format unknown to the web-API server, unless its URI
 starts with "ro." or "imm.". This restriction is necessary because the
 server is not able to attenuate an unknown write cap to a read cap.
 Unknown URIs starting with "ro." or "imm.", on the other hand, are
@@ -1138,7 +1138,7 @@ Attaching An Existing File Or Directory (by URI)
 directory, with a specified child name. This behaves much like the PUT t=uri
 operation, and is a lot like a UNIX hardlink. It is subject to the same
 restrictions as that operation on the use of cap formats unknown to the
-webapi server.
+web-API server.
 This will create additional intermediate directories as necessary, although
 since it is expected to be triggered by a form that was retrieved by "GET
@@ -1796,7 +1796,7 @@ This is the "Welcome Page", and contains a few distinct sections::
 Static Files in /public_html
 ============================
-The webapi server will take any request for a URL that starts with /static
+The web-API server will take any request for a URL that starts with /static
 and serve it from a configurable directory which defaults to
 $BASEDIR/public_html . This is configured by setting the "[node]web.static"
 value in $BASEDIR/tahoe.cfg . If this is left at the default value of
@@ -1804,7 +1804,7 @@ value in $BASEDIR/tahoe.cfg . If this is left at the default value of
 served with the contents of the file $BASEDIR/public_html/subdir/foo.html .
 This can be useful to serve a javascript application which provides a
-prettier front-end to the rest of the Tahoe webapi.
+prettier front-end to the rest of the Tahoe web-API.
 Safety and security issues -- names vs. URIs
@@ -1850,7 +1850,7 @@ parent directory, so it isn't any harder to use the URI for this purpose.
 The read and write caps in a given directory node are separate URIs, and
 can't be assumed to point to the same object even if they were retrieved in
-the same operation (although the webapi server attempts to ensure this
+the same operation (although the web-API server attempts to ensure this
 in most cases). If you need to rely on that property, you should explicitly
 verify it. More generally, you should not make assumptions about the
 internal consistency of the contents of mutable directories. As a result
@@ -1895,7 +1895,7 @@ Coordination Directive.
 Tahoe nodes implement internal serialization to make sure that a single Tahoe
 node cannot conflict with itself. For example, it is safe to issue two
-directory modification requests to a single tahoe node's webapi server at the
+directory modification requests to a single tahoe node's web-API server at the
 same time, because the Tahoe node will internally delay one of them until
 after the other has finished being applied. (This feature was introduced in
 Tahoe-1.1; back with Tahoe-1.0 the web client was responsible for serializing

View File

@@ -28,7 +28,7 @@ next renewal pass.
 There are several tradeoffs to be considered when choosing the renewal timer
 and the lease duration, and there is no single optimal pair of values. See
-the "lease-tradeoffs.svg" diagram to get an idea for the tradeoffs involved.
+the `<lease-tradeoffs.svg>`_ diagram to get an idea for the tradeoffs involved.
 If lease renewal occurs quickly and with 100% reliability, then any renewal
 time that is shorter than the lease duration will suffice, but a larger ratio
 of duration-over-renewal-time will be more robust in the face of occasional
@@ -48,14 +48,14 @@ Client-side Renewal
 If all of the files and directories which you care about are reachable from a
 single starting point (usually referred to as a "rootcap"), and you store
-that rootcap as an alias (via "tahoe create-alias"), then the simplest way to
-renew these leases is with the following CLI command::
+that rootcap as an alias (via "``tahoe create-alias``" for example), then the
+simplest way to renew these leases is with the following CLI command::
 tahoe deep-check --add-lease ALIAS:
 This will recursively walk every directory under the given alias and renew
 the leases on all files and directories. (You may want to add a ``--repair``
-flag to perform repair at the same time). Simply run this command once a week
+flag to perform repair at the same time.) Simply run this command once a week
 (or whatever other renewal period your grid recommends) and make sure it
 completes successfully. As a side effect, a manifest of all unique files and
 directories will be emitted to stdout, as well as a summary of file sizes and
@@ -78,7 +78,7 @@ Server Side Expiration
 Expiration must be explicitly enabled on each storage server, since the
 default behavior is to never expire shares. Expiration is enabled by adding
-config keys to the "[storage]" section of the tahoe.cfg file (as described
+config keys to the ``[storage]`` section of the ``tahoe.cfg`` file (as described
 below) and restarting the server node.
 Each lease has two parameters: a create/renew timestamp and a duration. The
@@ -89,7 +89,7 @@ at 31 days, and the "nominal lease expiration time" is simply $duration
 seconds after the $create_renew timestamp. (In a future release of Tahoe, the
 client will get to request a specific duration, and the server will accept or
 reject the request depending upon its local configuration, so that servers
-can achieve better control over their storage obligations).
+can achieve better control over their storage obligations.)
 The lease-expiration code has two modes of operation. The first is age-based:
 leases are expired when their age is greater than their duration. This is the
@@ -99,7 +99,7 @@ active files and directories will be preserved, and the garbage will be
 collected in a timely fashion.
 Since there is not yet a way for clients to request a lease duration of other
-than 31 days, there is a tahoe.cfg setting to override the duration of all
+than 31 days, there is a ``tahoe.cfg`` setting to override the duration of all
 leases. If, for example, this alternative duration is set to 60 days, then
 clients could safely renew their leases with an add-lease operation perhaps
 once every 50 days: even though nominally their leases would expire 31 days
@@ -117,22 +117,22 @@ for a long period of time: once the lease-checker has examined all shares and
 expired whatever it is going to expire, the second and subsequent passes are
 not going to find any new leases to remove.
-The tahoe.cfg file uses the following keys to control lease expiration::
-[storage]
-expire.enabled = (boolean, optional)
-If this is True, the storage server will delete shares on which all
+The ``tahoe.cfg`` file uses the following keys to control lease expiration:
+``[storage]``
+``expire.enabled = (boolean, optional)``
+If this is ``True``, the storage server will delete shares on which all
 leases have expired. Other controls dictate when leases are considered to
-have expired. The default is False.
-expire.mode = (string, "age" or "cutoff-date", required if expiration enabled)
+have expired. The default is ``False``.
+``expire.mode = (string, "age" or "cutoff-date", required if expiration enabled)``
 If this string is "age", the age-based expiration scheme is used, and the
-"expire.override_lease_duration" setting can be provided to influence the
+``expire.override_lease_duration`` setting can be provided to influence the
 lease ages. If it is "cutoff-date", the absolute-date-cutoff mode is
-used, and the "expire.cutoff_date" setting must be provided to specify
+used, and the ``expire.cutoff_date`` setting must be provided to specify
 the cutoff date. The mode setting currently has no default: you must
 provide a value.
@@ -140,24 +140,24 @@ The tahoe.cfg file uses the following keys to control lease expiration::
 this release it was deemed safer to require an explicit mode
 specification.
-expire.override_lease_duration = (duration string, optional)
+``expire.override_lease_duration = (duration string, optional)``
 When age-based expiration is in use, a lease will be expired if its
-"lease.create_renew" timestamp plus its "lease.duration" time is
+``lease.create_renew`` timestamp plus its ``lease.duration`` time is
 earlier/older than the current time. This key, if present, overrides the
-duration value for all leases, changing the algorithm from:
+duration value for all leases, changing the algorithm from::
 if (lease.create_renew_timestamp + lease.duration) < now:
 expire_lease()
-to:
+to::
 if (lease.create_renew_timestamp + override_lease_duration) < now:
 expire_lease()
 The value of this setting is a "duration string", which is a number of
 days, months, or years, followed by a units suffix, and optionally
-separated by a space, such as one of the following:
+separated by a space, such as one of the following::
 7days
 31day
@@ -175,14 +175,14 @@ The tahoe.cfg file uses the following keys to control lease expiration::
 31days" had been passed.
 This key is only valid when age-based expiration is in use (i.e. when
-"expire.mode = age" is used). It will be rejected if cutoff-date
+``expire.mode = age`` is used). It will be rejected if cutoff-date
 expiration is in use.
-expire.cutoff_date = (date string, required if mode=cutoff-date)
+``expire.cutoff_date = (date string, required if mode=cutoff-date)``
 When cutoff-date expiration is in use, a lease will be expired if its
 create/renew timestamp is older than the cutoff date. This string will be
-a date in the following format:
+a date in the following format::
 2009-01-16 (January 16th, 2009)
 2008-02-02
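The two expiration modes reduce to a few lines of arithmetic. A sketch combining the age-based pseudocode above with the cutoff-date rule (times are POSIX timestamps; this is an illustration, not the storage server's actual code)::

    import calendar, time

    def lease_is_expired(create_renew, duration, mode,
                         override_lease_duration=None, cutoff_date=None,
                         now=None):
        """Apply the age-based or cutoff-date rules described above."""
        if now is None:
            now = time.time()
        if mode == "age":
            if override_lease_duration is not None:
                duration = override_lease_duration
            return (create_renew + duration) < now
        if mode == "cutoff-date":
            # cutoff_date is a "YYYY-MM-DD" string like the examples above
            cutoff = calendar.timegm(time.strptime(cutoff_date, "%Y-%m-%d"))
            return create_renew < cutoff
        raise ValueError("expire.mode must be 'age' or 'cutoff-date'")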

View File

@ -23,33 +23,33 @@ record information about what is happening inside the Tahoe node. This is
primarily for use by programmers and grid operators who want to find out what primarily for use by programmers and grid operators who want to find out what
went wrong. went wrong.
The foolscap logging system is documented here: The foolscap logging system is documented at
`<http://foolscap.lothar.com/docs/logging.html>`_.
http://foolscap.lothar.com/docs/logging.html The foolscap distribution includes a utility named "``flogtool``" (usually
at ``/usr/bin/flogtool`` on Unix) which is used to get access to many
The foolscap distribution includes a utility named "flogtool" (usually at foolscap logging features.
/usr/bin/flogtool) which is used to get access to many foolscap logging
features.
Realtime Logging Realtime Logging
================ ================
When you are working on Tahoe code, and want to see what the node is doing, When you are working on Tahoe code, and want to see what the node is doing,
the easiest tool to use is "flogtool tail". This connects to the tahoe node the easiest tool to use is "``flogtool tail``". This connects to the Tahoe
and subscribes to hear about all log events. These events are then displayed node and subscribes to hear about all log events. These events are then
to stdout, and optionally saved to a file. displayed to stdout, and optionally saved to a file.
"flogtool tail" connects to the "logport", for which the FURL is stored in "``flogtool tail``" connects to the "logport", for which the FURL is stored
BASEDIR/private/logport.furl . The following command will connect to this in ``BASEDIR/private/logport.furl`` . The following command will connect to
port and start emitting log information: this port and start emitting log information::
flogtool tail BASEDIR/private/logport.furl flogtool tail BASEDIR/private/logport.furl
The "--save-to FILENAME" option will save all received events to a file, The ``--save-to FILENAME`` option will save all received events to a file,
where then can be examined later with "flogtool dump" or "flogtool where then can be examined later with "``flogtool dump``" or
web-viewer". The --catch-up flag will ask the node to dump all stored events "``flogtool web-viewer``". The ``--catch-up`` option will ask the node to
before subscribing to new ones (without --catch-up, you will only hear about dump all stored events before subscribing to new ones (without ``--catch-up``,
events that occur after the tool has connected and subscribed). you will only hear about events that occur after the tool has connected and
subscribed).
Incidents Incidents
========= =========
@ -57,41 +57,41 @@ Incidents
Foolscap keeps a short list of recent events in memory. When something goes Foolscap keeps a short list of recent events in memory. When something goes
wrong, it writes all the history it has (and everything that gets logged in wrong, it writes all the history it has (and everything that gets logged in
the next few seconds) into a file called an "incident". These files go into the next few seconds) into a file called an "incident". These files go into
BASEDIR/logs/incidents/ , in a file named ``BASEDIR/logs/incidents/`` , in a file named
"incident-TIMESTAMP-UNIQUE.flog.bz2". The default definition of "something "``incident-TIMESTAMP-UNIQUE.flog.bz2``". The default definition of
goes wrong" is the generation of a log event at the log.WEIRD level or "something goes wrong" is the generation of a log event at the ``log.WEIRD``
higher, but other criteria could be implemented. level or higher, but other criteria could be implemented.
The typical "incident report" we've seen in a large Tahoe grid is about 40kB The typical "incident report" we've seen in a large Tahoe grid is about 40kB
compressed, representing about 1800 recent events. compressed, representing about 1800 recent events.
These "flogfiles" have a similar format to the files saved by "flogtool tail These "flogfiles" have a similar format to the files saved by
--save-to". They are simply lists of log events, with a small header to "``flogtool tail --save-to``". They are simply lists of log events, with a
indicate which event triggered the incident. small header to indicate which event triggered the incident.
The "flogtool dump FLOGFILE" command will take one of these .flog.bz2 files The "``flogtool dump FLOGFILE``" command will take one of these ``.flog.bz2``
and print their contents to stdout, one line per event. The raw event files and print their contents to stdout, one line per event. The raw event
dictionaries can be dumped by using "flogtool dump --verbose FLOGFILE". dictionaries can be dumped by using "``flogtool dump --verbose FLOGFILE``".
The "flogtool web-viewer" command can be used to examine the flogfile in a The "``flogtool web-viewer``" command can be used to examine the flogfile
web browser. It runs a small HTTP server and emits the URL on stdout. This in a web browser. It runs a small HTTP server and emits the URL on stdout.
view provides more structure than the output of "flogtool dump": the This view provides more structure than the output of "``flogtool dump``":
parent/child relationships of log events is displayed in a nested format. the parent/child relationships of log events is displayed in a nested format.
"flogtool web-viewer" is still fairly immature. "``flogtool web-viewer``" is still fairly immature.
Working with flogfiles
======================

The "``flogtool filter``" command can be used to take a large flogfile
(perhaps one created by the log-gatherer, see below) and copy a subset of
events into a second file. This smaller flogfile may be easier to work with
than the original. The arguments to "``flogtool filter``" specify filtering
criteria: a predicate that each event must match to be copied into the
target file. ``--before`` and ``--after`` are used to exclude events outside
a given window of time. ``--above`` will retain events above a certain
severity level. ``--from`` retains events sent by a specific tubid.
``--strip-facility`` removes events that were emitted with a given facility
(like ``foolscap.negotiation`` or ``tahoe.upload``).

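As a purely illustrative sketch (the filenames and tubid here are made up,
and "``flogtool filter --help``" describes the exact syntax each option
expects), an invocation might look something like::

  flogtool filter --above WEIRD --from abc123def456 \
      --strip-facility foolscap.negotiation \
      gathered.flog.bz2 interesting.flog
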
Gatherers
=========

In a deployed Tahoe grid, it is useful to get log information automatically
transferred to a central log-gatherer host. This offloads the (admittedly
modest) storage requirements to a different host and provides access to
logfiles from multiple nodes (web-API, storage, or helper) in a single place.

There are two kinds of gatherers. Both produce a FURL which needs to be
placed in the ``NODEDIR/log_gatherer.furl`` file (one FURL per line) of
each node that is to publish its logs to the gatherer. When the Tahoe node
starts, it will connect to the configured gatherers and offer its logport:
the gatherer will then use the logport to subscribe to hear about events.

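For example, a node that should publish its logs to both a log-gatherer and
an incident-gatherer would have a ``log_gatherer.furl`` containing two lines
(the FURLs below are made-up placeholders)::

  pb://sok7yvlqbut3x5cmjeqpwzk4fenr2hia@gatherer.example.com:3117/logport
  pb://f6bqwlgz5jxwmeqakftnp3ydrc7s4vho@gatherer.example.com:3118/incidents
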
The gatherer will write to files in its working directory, which can then be
examined with tools like "``flogtool dump``" as described above.

Incident Gatherer
-----------------

functions are written after examining a new/unknown incident. The idea is to
recognize when the same problem is happening multiple times.

A collection of classification functions that are useful for Tahoe nodes is
provided in ``misc/incident-gatherer/support_classifiers.py``. There is
roughly one category for each ``log.WEIRD``-or-higher level event in the
Tahoe source code.

The incident gatherer is created with the "``flogtool create-incident-gatherer
WORKDIR``" command, and started with "``tahoe start``". The generated
"``gatherer.tac``" file should be modified to add classifier functions.

The incident gatherer writes incident names (which are simply the relative
pathname of the ``incident-*.flog.bz2`` file) into ``classified/CATEGORY``.
For example, the ``classified/mutable-retrieve-uncoordinated-write-error``
file contains a list of all incidents which were triggered by an uncoordinated
write that was detected during mutable file retrieval (caused when somebody
changed the contents of the mutable file in between the node's mapupdate step
and the retrieve step). The ``classified/unknown`` file contains a list of all
incidents that did not match any of the classification functions.

At startup, the incident gatherer will automatically reclassify any incident
report which is not mentioned in any of the ``classified/*`` files. So the
usual workflow is to examine the incidents in ``classified/unknown``, add a
new classification function, delete ``classified/unknown``, then bounce the
gatherer with "``tahoe restart WORKDIR``". The incidents which can be
classified with the new functions will be added to their own ``classified/FOO``
lists, and the remaining ones will be put in ``classified/unknown``, where
the process can be repeated until all events are classifiable.

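A classification function is just a Python function added to the generated
``gatherer.tac``. As a rough sketch (the function name, the matched ``umid``
value, and the assumed calling convention are illustrative; see
``support_classifiers.py`` and the comments in the generated ``gatherer.tac``
for the exact interface), such a function examines the incident's triggering
log event and returns a category name, or ``None`` if it does not recognize
the event::

  def classify_ucwe(trigger):
      # 'trigger' is the log event (a dict) that caused the incident; the
      # umid= value is a good thing to match on, since it is unique to a
      # single log.msg() call site.
      if trigger.get("umid") == "abc123":   # hypothetical umid
          return "mutable-retrieve-uncoordinated-write-error"
      return None
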
The incident gatherer is still fairly immature: future versions will have a
web interface and an RSS feed, so operations personnel can track problems in
the storage grid.

In our experience, each incident takes about two seconds to transfer from
the node that generated it to the gatherer. The gatherer will automatically
catch up to any incidents which occurred while it was offline.

Log Gatherer
------------

the connected nodes, regardless of severity. This server writes these log
events into a large flogfile that is rotated (closed, compressed, and
replaced with a new one) on a periodic basis. Each flogfile is named
according to the range of time it represents, with names like
"``from-2008-08-26-132256--to-2008-08-26-162256.flog.bz2``". The flogfiles
contain events from many different sources, making it easier to correlate
things that happened on multiple machines (such as comparing a client node
making a request with the storage servers that respond to that request).

The Log Gatherer is created with the "``flogtool create-gatherer WORKDIR``"
command, and started with "``tahoe start``". The ``log_gatherer.furl`` it
creates then needs to be copied into the ``BASEDIR/log_gatherer.furl`` file
of all nodes that should be sending it log events.

The "``flogtool filter``" command, described above, is useful to cut down the
potentially-large flogfiles into a more narrowly-focussed form.

Busy nodes, particularly web-API nodes which are performing recursive
deep-size/deep-stats/deep-check operations, can produce a lot of log events.
To avoid overwhelming the node (and using an unbounded amount of memory for
the outbound TCP queue), publishing nodes will start dropping log events when
the outbound queue grows too large. When this occurs, there will be gaps

Local twistd.log files
======================

[TODO: not yet true, requires foolscap-0.3.1 and a change to ``allmydata.node``]

In addition to the foolscap-based event logs, certain high-level events will
be recorded directly in human-readable text form, in the
``BASEDIR/logs/twistd.log`` file (and its rotated old versions:
``twistd.log.1``, ``twistd.log.2``, etc). This form does not contain as much
information as the flogfiles available through the means described
previously, but it is immediately available to the curious developer, and is
retained until the ``twistd.log.NN`` files are explicitly deleted.

Only events at the ``log.OPERATIONAL`` level or higher are bridged to
``twistd.log`` (i.e. not the ``log.NOISY`` debugging events). In addition,
foolscap internal events (like connection negotiation messages) are not
bridged to ``twistd.log``.

Adding log messages
===================

When adding new code, the Tahoe developer should add a reasonable number of
new log events. For details, please see the Foolscap logging documentation,
but a few notes are worth stating here:

* use a facility prefix of "``tahoe.``", like "``tahoe.mutable.publish``"

* assign each severe (``log.WEIRD`` or higher) event a unique message
  identifier, as the ``umid=`` argument to the ``log.msg()`` call. The
  ``misc/coding_tools/make_umid`` script may be useful for this purpose.
  This will make it easier to write a classification function for these
  messages.

* use the ``parent=`` argument whenever the event is causally/temporally
  clustered with its parent. For example, a download process that involves
  three sequential hash fetches could announce the send and receipt of those
  hash-fetch messages with a ``parent=`` argument that ties them to the
  overall download process. However, each new web-API download request
  should be unparented.

* use the ``format=`` argument in preference to the ``message=`` argument.
  E.g. use ``log.msg(format="got %(n)d shares, need %(k)d", n=n, k=k)``
  instead of ``log.msg("got %d shares, need %d" % (n,k))``. This will allow
  later tools to analyze the event without needing to scrape/reconstruct
  the structured data out of the formatted string (see the sketch after
  this list).

* Pass extra information as extra keyword arguments, even if they aren't
  included in the ``format=`` string. This information will be displayed in
  the "``flogtool dump --verbose``" output, as well as being available to
  other tools. The ``umid=`` argument should be passed this way.

* use ``log.err`` for the catch-all ``addErrback`` that gets attached to
  the end of any given Deferred chain. When used in conjunction with
  ``LOGTOTWISTED=1``, ``log.err()`` will tell Twisted about the error-nature
  of the log message, causing Trial to flunk the test (with an "ERROR"
  indication that prints a copy of the Failure, including a traceback).
  Don't use ``log.err`` for events that are ``BAD`` but handled (like hash
  failures: since these are often deliberately provoked by test code, they
  should not cause test failures): use ``log.msg(level=BAD)`` for those
  instead.

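A sketch pulling these conventions together (the storage index, share counts,
and ``umid`` value below are made up for illustration; the facility name
simply follows the "``tahoe.``" convention above)::

  from foolscap.logging import log

  si, n, k = "b2tpwxwjbut3", 2, 3   # made-up values for illustration

  # Parent event for the overall operation; log.msg() returns an event
  # identifier which related events can reference via parent=.
  dl = log.msg(format="starting download of %(si)s", si=si,
               facility="tahoe.download", level=log.OPERATIONAL)

  # A severe child event: structured data via format= and keyword
  # arguments, a unique umid=, and parent= tying it to the download.
  log.msg(format="got %(n)d shares, need %(k)d", n=n, k=k,
          parent=dl, facility="tahoe.download",
          level=log.WEIRD, umid="abc123")
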
Log Messages During Unit Tests
==============================

If a test is failing and you aren't sure why, start by enabling
``FLOGTOTWISTED=1`` like this::

  make test FLOGTOTWISTED=1

With ``FLOGTOTWISTED=1``, sufficiently-important log events will be written
into ``_trial_temp/test.log``, which may give you more ideas about why the
test is failing. Note, however, that ``_trial_temp/log.out`` will not receive
messages below the ``level=OPERATIONAL`` threshold, due to this issue:
`<http://foolscap.lothar.com/trac/ticket/154>`_

If that isn't enough, look at the detailed foolscap logging messages instead,
by running the tests like this::

  make test FLOGFILE=flog.out.bz2 FLOGLEVEL=1 FLOGTOTWISTED=1

The first environment variable will cause foolscap log events to be written
to ``./flog.out.bz2`` (instead of merely being recorded in the circular buffers
for the use of remote subscribers or incident reports). The second will cause
all log events to be written out, not just the higher-severity ones. The
third will cause twisted log events (like the markers that indicate when each
unit test is starting and stopping) to be copied into the flogfile, making it
easier to correlate log events with unit tests.

Enabling this form of logging appears to roughly double the runtime of the
unit tests. The ``flog.out.bz2`` file is approximately 2MB.

You can then use "``flogtool dump``" or "``flogtool web-viewer``" on the
resulting ``flog.out`` file.

("``flogtool tail``" and the log-gatherer are not useful during unit tests,
since there is no single Tub to which all the log messages are published).

It is possible for setting these environment variables to cause spurious test
failures in tests with race condition bugs. All known instances of this have

(A related SVG diagram was also updated so that its label reads "Tahoe-LAFS
web-API" instead of "Tahoe-LAFS WAPI".)

contents of an authority string. These authority strings can be shared with
others just like filecaps and dircaps: knowledge of the authority string is
both necessary and complete to wield the authority it represents.

Web-API requests will include the authority necessary to complete the
operation. When used by a CLI tool, the authority is likely to come from
~/.tahoe/private/authority (i.e. it is ambient to the user who has access to
that node, just like aliases provide similar access to a specific "root
directory"). When used by the browser-oriented WUI, the authority will [TODO]
somehow be retained on each page in a way that minimizes the risk of CSRF
attacks and allows safe sharing (cut-and-paste of a URL without sharing the
storage authority too). The client node receiving the web-API request will
extract the authority string from the request and use it to build the storage
server messages that it sends to fulfill that request.

using a foreign tahoe node, or when asking a Helper to upload a specific
file. Attenuations (see below) should be used to limit the delegated
authority in these cases.

In the programmatic web-API, any operation that consumes storage will accept
a storage-authority= query argument, the value of which will be the printable
form of an authority string. This includes all PUT operations, POST t=upload
and t=mkdir, and anything which creates a new file, creates a directory
(perhaps an intermediate one), or modifies a mutable file.

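As a purely illustrative sketch of this proposed interface (the design is not
yet implemented, and the port and authority string shown are placeholders),
an unlinked upload that consumes storage might look like::

  PUT /uri?storage-authority=PRINTABLE-AUTHORITY-STRING HTTP/1.1
  Host: 127.0.0.1:3456

  (file contents)
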
Alternatively, the authority string can also be passed through an HTTP
header. A single "X-Tahoe-Storage-Authority:" header can be used with the

servers in a single grid and sum them together, providing a grid-wide usage
number for each account. This could be used by e.g. clients in a commercial
grid to report overall-space-used to the end user.

There will be web-API URLs available for all of these reports.

TODO: storage servers might also have a mechanism to apply space-usage limits
to specific account ids directly, rather than requiring that these be

beginning with the storage-authority data structure and working upwards. This
section is organized to follow the storage authority, starting from the point
of grant. The discussion will thus begin at the storage server (where the
authority is first created), work back to the client (which receives the
authority as a web-API argument), then follow the authority back to the
servers as it is used to enable specific storage operations. It will then
detail the accounting tables that the storage server is obligated to
maintain, and describe the interfaces through which these tables are accessed

<p>The <a href="http://tahoe-lafs.org/trac/tahoe-lafs/wiki/SftpFrontend">SftpFrontend</a> page
on the wiki has more information about using SFTP with Tahoe-LAFS.</p>

<h3>The Web-API</h3>

<p>Want to program your Tahoe-LAFS node to do your bidding? Easy! See <a
href="frontends/webapi.rst">webapi.rst</a>.</p>

overwrite() tells the client to ignore this cached version information, and
to unconditionally replace the mutable file's contents with the new data.
This should not be used in delta application, but rather in situations where
you want to replace the file's contents with completely unrelated ones. When
raw files are uploaded into a mutable slot through the Tahoe-LAFS web-API
(using POST and the ?mutable=true argument), they are put in place with
overwrite().

The peer-selection and data-structure manipulation (and signing/verification)
steps will be implemented in a separate class in ``allmydata/mutable.py``.