Merge PR380: LAFS = "Least-Authority File Store"

Closes tahoe-lafs/tahoe-lafs#380
refs ticket:2345
Brian Warner 2017-06-06 18:01:52 +01:00
commit 2cad19961f
15 changed files with 175 additions and 175 deletions

@ -67,7 +67,7 @@ Here's how it works:
A "storage grid" is made up of a number of storage servers. A storage server
has direct attached storage (typically one or more hard disks). A "gateway"
communicates with storage nodes, and uses them to provide access to the
file store over protocols such as HTTP(S), SFTP or FTP.
grid over protocols such as HTTP(S), SFTP or FTP.
Note that you can find "client" used to refer to gateway nodes (which act as
a client to storage servers), and also to processes or programs connecting to
@ -94,8 +94,8 @@ An alternate deployment mode is that the gateway runs on a remote machine and
the user connects to it over HTTPS or SFTP. This means that the operator of
the gateway can view and modify the user's data (the user *relies on* the
gateway for confidentiality and integrity), but the advantage is that the
user can access the file store with a client that doesn't have the gateway
software installed, such as an Internet kiosk or cell phone.
user can access the Tahoe-LAFS grid with a client that doesn't have the
gateway software installed, such as an Internet kiosk or cell phone.
Access Control
==============

@ -10,7 +10,7 @@ Tahoe-LAFS Architecture
4. `Capabilities`_
5. `Server Selection`_
6. `Swarming Download, Trickling Upload`_
7. `The Filesystem Layer`_
7. `The File Store Layer`_
8. `Leases, Refreshing, Garbage Collection`_
9. `File Repairer`_
10. `Security`_
@ -22,7 +22,7 @@ Overview
(See the `docs/specifications directory`_ for more details.)
There are three layers: the key-value store, the filesystem, and the
There are three layers: the key-value store, the file store, and the
application.
The lowest layer is the key-value store. The keys are "capabilities" -- short
@ -33,19 +33,19 @@ values, but there may be performance issues with extremely large values (just
due to the limitation of network bandwidth). In practice, values as small as
a few bytes and as large as tens of gigabytes are in common use.
The middle layer is the decentralized filesystem: a directed graph in which
The middle layer is the decentralized file store: a directed graph in which
the intermediate nodes are directories and the leaf nodes are files. The leaf
nodes contain only the data -- they contain no metadata other than the length
in bytes. The edges leading to leaf nodes have metadata attached to them
about the file they point to. Therefore, the same file may be associated with
different metadata if it is referred to through different edges.
The top layer consists of the applications using the filesystem.
The top layer consists of the applications using the file store.
Allmydata.com used it for a backup service: the application periodically
copies files from the local disk onto the decentralized filesystem. We later
copies files from the local disk onto the decentralized file store. We later
provide read-only access to those files, allowing users to recover them.
There are several other applications built on top of the Tahoe-LAFS
filesystem (see the RelatedProjects_ page of the wiki for a list).
file store (see the RelatedProjects_ page of the wiki for a list).
.. _docs/specifications directory: https://github.com/tahoe-lafs/tahoe-lafs/tree/master/docs/specifications
.. _RelatedProjects: https://tahoe-lafs.org/trac/tahoe-lafs/wiki/RelatedProjects
@ -157,7 +157,7 @@ The "key-value store" layer doesn't include human-meaningful names.
Capabilities sit on the "global+secure" edge of `Zooko's Triangle`_. They are
self-authenticating, meaning that nobody can trick you into accepting a file
that doesn't match the capability you used to refer to that file. The
filesystem layer (described below) adds human-meaningful names atop the
file store layer (described below) adds human-meaningful names atop the
key-value layer.
.. _`Zooko's Triangle`: https://en.wikipedia.org/wiki/Zooko%27s_triangle
@ -319,15 +319,15 @@ in the same facility, so the helper-to-storage-server bandwidth is huge.
See :doc:`helper` for details about the upload helper.
The Filesystem Layer
The File Store Layer
====================
The "filesystem" layer is responsible for mapping human-meaningful pathnames
The "file store" layer is responsible for mapping human-meaningful pathnames
(directories and filenames) to pieces of data. The actual bytes inside these
files are referenced by capability, but the filesystem layer is where the
files are referenced by capability, but the file store layer is where the
directory names, file names, and metadata are kept.
The filesystem layer is a graph of directories. Each directory contains a
The file store layer is a graph of directories. Each directory contains a
table of named children. These children are either other directories or
files. All children are referenced by their capability.
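The directory graph described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not Tahoe-LAFS code: the class names, the `resolve` helper, and the `cap-...` strings are all hypothetical placeholders. It shows directories as tables of named children, children referenced by capability, and metadata attached to the edge rather than to the file.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class FileNode:
    cap: str      # placeholder capability string, not a real Tahoe-LAFS cap
    data: bytes   # leaf nodes hold only the data


@dataclass
class Edge:
    child_cap: str                                # children are referenced by capability
    metadata: dict = field(default_factory=dict)  # metadata lives on the edge


@dataclass
class DirNode:
    cap: str
    children: Dict[str, Edge] = field(default_factory=dict)  # table of named children


Node = Union[FileNode, DirNode]


def resolve(store: Dict[str, Node], root: DirNode, path: str) -> Node:
    """Walk a slash-separated path from root, following one edge per component."""
    node: Node = root
    for name in filter(None, path.split("/")):
        if not isinstance(node, DirNode):
            raise NotADirectoryError(name)
        node = store[node.children[name].child_cap]
    return node


report = FileNode("cap-file-1", b"hello")
docs = DirNode("cap-dir-docs")
root = DirNode("cap-dir-root")
store = {n.cap: n for n in (report, docs, root)}

# The same file is linked twice, with different metadata on each edge.
root.children["docs"] = Edge(docs.cap)
docs.children["report.txt"] = Edge(report.cap, {"mtime": 1000})
root.children["latest"] = Edge(report.cap, {"mtime": 2000})
```

Resolving `docs/report.txt` and `latest` yields the same `FileNode`, while the metadata seen along each path differs, which is the behaviour the architecture text attributes to edge-attached metadata.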
@ -353,11 +353,11 @@ that are globally visible.
Leases, Refreshing, Garbage Collection
======================================
When a file or directory in the virtual filesystem is no longer referenced,
the space that its shares occupied on each storage server can be freed,
making room for other shares. Tahoe-LAFS uses a garbage collection ("GC")
mechanism to implement this space-reclamation process. Each share has one or
more "leases", which are managed by clients who want the file/directory to be
When a file or directory in the file store is no longer referenced, the space
that its shares occupied on each storage server can be freed, making room for
other shares. Tahoe-LAFS uses a garbage collection ("GC") mechanism to
implement this space-reclamation process. Each share has one or more
"leases", which are managed by clients who want the file/directory to be
retained. The storage server accepts each share for a pre-defined period of
time, and is allowed to delete the share if all of the leases are cancelled
or allowed to expire.
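As a rough illustration of the lease rule in this paragraph, the sketch below models a share that may be deleted only once every lease has been cancelled or has expired. The `Share` class, its method names, and the numeric times are assumptions made for the example; they are not the storage server's actual data structures.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Share:
    # Maps a lease holder (hypothetical client id) to the lease expiry time.
    leases: Dict[str, float] = field(default_factory=dict)

    def add_or_renew(self, holder: str, now: float, duration: float) -> None:
        """Create a lease for holder, or push an existing lease's expiry forward."""
        self.leases[holder] = now + duration

    def cancel(self, holder: str) -> None:
        """Drop the holder's lease, if it has one."""
        self.leases.pop(holder, None)

    def is_garbage(self, now: float) -> bool:
        """True when no unexpired lease remains, so the space may be reclaimed."""
        return all(expiry <= now for expiry in self.leases.values())


share = Share()
share.add_or_renew("alice", now=0, duration=100)
share.add_or_renew("bob", now=0, duration=50)
```

With these two leases the share is still retained at time 60 because Alice's lease is live; if Alice then cancels, only Bob's expired lease remains and the share becomes eligible for deletion.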
@ -378,7 +378,7 @@ Shares may go away because the storage server hosting them has suffered a
failure: either temporary downtime (affecting availability of the file), or a
permanent data loss (affecting the preservation of the file). Hard drives
crash, power supplies explode, coffee spills, and asteroids strike. The goal
of a robust distributed filesystem is to survive these setbacks.
of a robust distributed file store is to survive these setbacks.
To work against this slow, continual loss of shares, a File Checker is used
to periodically count the number of shares still available for any given
@ -486,12 +486,7 @@ validate-capability, but not vice versa). These capabilities may be expressly
delegated (irrevocably) by simply transferring the relevant secrets.
The application layer can provide whatever access model is desired, built on
top of this capability access model. The first big user of this system so far
is allmydata.com. The allmydata.com access model currently works like a
normal web site, using username and password to give a user access to her
"virtual drive". In addition, allmydata.com users can share individual files
(using a file sharing interface built on top of the immutable file read
capabilities).
top of this capability access model.
Reliability

@ -676,10 +676,10 @@ Client Configuration
Frontend Configuration
======================
The Tahoe client process can run a variety of frontend file-access protocols.
You will use these to create and retrieve files from the virtual filesystem.
Configuration details for each are documented in the following
protocol-specific guides:
The Tahoe-LAFS client process can run a variety of frontend file access
protocols. You will use these to create and retrieve files from the
Tahoe-LAFS file store. Configuration details for each are documented in
the following protocol-specific guides:
HTTP
@ -695,7 +695,7 @@ HTTP
CLI
The main ``tahoe`` executable includes subcommands for manipulating the
filesystem, uploading/downloading files, and creating/running Tahoe
file store, uploading/downloading files, and creating/running Tahoe
nodes. See :doc:`frontends/CLI` for details.
SFTP, FTP

@ -10,7 +10,7 @@ The Tahoe-LAFS CLI commands
1. `Unicode Support`_
3. `Node Management`_
4. `Filesystem Manipulation`_
4. `File Store Manipulation`_
1. `Starting Directories`_
2. `Command Syntax Summary`_
@ -24,7 +24,7 @@ Overview
========
Tahoe-LAFS provides a single executable named "``tahoe``", which can be used
to create and manage client/server nodes, manipulate the filesystem, and
to create and manage client/server nodes, manipulate the file store, and
perform several debugging/maintenance tasks. This executable is installed
into your virtualenv when you run ``pip install tahoe-lafs``.
@ -35,7 +35,7 @@ CLI Command Overview
The "``tahoe``" tool provides access to three categories of commands.
* node management: create a client/server node, start/stop/restart it
* filesystem manipulation: list files, upload, download, unlink, rename
* file store manipulation: list files, upload, download, unlink, rename
* debugging: unpack cap-strings, examine share files
To get a list of all commands, just run "``tahoe``" with no additional
@ -120,15 +120,15 @@ is most often used by developers who have just modified the code and want to
start using their changes.
Filesystem Manipulation
File Store Manipulation
=======================
These commands let you examine a Tahoe-LAFS filesystem, providing basic
These commands let you examine a Tahoe-LAFS file store, providing basic
list/upload/download/unlink/rename/mkdir functionality. They can be used as
primitives by other scripts. Most of these commands are fairly thin wrappers
around web-API calls, which are described in :doc:`webapi`.
By default, all filesystem-manipulation commands look in ``~/.tahoe/`` to
By default, all file store manipulation commands look in ``~/.tahoe/`` to
figure out which Tahoe-LAFS node they should use. When the CLI command makes
web-API calls, it will use ``~/.tahoe/node.url`` for this purpose: a running
Tahoe-LAFS node that provides a web-API port will write its URL into this
@ -142,7 +142,7 @@ they ought to use a starting point. This is explained in more detail below.
Starting Directories
--------------------
As described in :doc:`../architecture`, the Tahoe-LAFS distributed filesystem
As described in :doc:`../architecture`, the Tahoe-LAFS distributed file store
consists of a collection of directories and files, each of which has a
"read-cap" or a "write-cap" (also known as a URI). Each directory is simply a
table that maps a name to a child file or directory, and this table is turned
@ -179,12 +179,14 @@ and later will use it if necessary. However, once you've set a ``tahoe:``
alias with "``tahoe set-alias``", that will override anything in the old
``root_dir.cap`` file.
The Tahoe-LAFS CLI commands use the same path syntax as ``scp`` and
The Tahoe-LAFS CLI commands use a similar path syntax to ``scp`` and
``rsync`` -- an optional ``ALIAS:`` prefix, followed by the pathname or
filename. Some commands (like "``tahoe cp``") use the lack of an alias to
mean that you want to refer to a local file, instead of something from the
Tahoe-LAFS filesystem. Another way to indicate this is to start the
pathname with "./", "~/", "~username/", or "/".
Tahoe-LAFS file store. Another way to indicate this is to start the
pathname with "./", "~/", "~username/", or "/". On Windows, aliases
cannot be a single character, so that it is possible to distinguish a
path relative to an alias from a path starting with a local drive specifier.
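The path rules in this paragraph can be approximated with a small parser. This is a hedged sketch, not the parser the ``tahoe`` CLI actually uses: the function name `split_alias`, the `windows` flag, and the treatment of edge cases (an empty alias, or a colon that appears after a slash) are assumptions made for illustration.

```python
from typing import Optional, Tuple


def split_alias(arg: str, windows: bool = False) -> Tuple[Optional[str], str]:
    """Return (alias, path); alias is None when arg is taken to be a local path."""
    # Prefixes that the text lists as explicit markers of a local path.
    if arg.startswith(("./", "~/", "/")):
        return None, arg
    if arg.startswith("~") and "/" in arg:  # the "~username/" form
        return None, arg

    alias, sep, rest = arg.partition(":")
    # Assumption: no colon, an empty alias, or a colon after a slash means local.
    if not sep or not alias or "/" in alias:
        return None, arg
    # On Windows a single character before the colon is a drive specifier.
    if windows and len(alias) == 1:
        return None, arg
    return alias, rest
```

For example, `split_alias("tahoe:docs/a.txt")` gives `("tahoe", "docs/a.txt")`, while `split_alias("c:/data", windows=True)` is treated as a local path because a one-character alias is not permitted there.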
When you're dealing with a single starting directory, the ``tahoe:`` alias is
all you need. But when you want to refer to something that isn't yet
@ -333,7 +335,7 @@ Command Examples
``tahoe ls subdir``
This lists a subdirectory of your filesystem.
This lists a subdirectory of your file store.
``tahoe webopen``

@ -42,8 +42,8 @@ Tahoe-LAFS Support
All Tahoe-LAFS client nodes can run a frontend SFTP server, allowing regular
SFTP clients (like ``/usr/bin/sftp``, the ``sshfs`` FUSE plugin, and many
others) to access the virtual filesystem. They can also run an FTP server,
so FTP clients (like ``/usr/bin/ftp``, ``ncftp``, and others) can too. These
others) to access the file store. They can also run an FTP server, so FTP
clients (like ``/usr/bin/ftp``, ``ncftp``, and others) can too. These
frontends sit at the same level as the web-API interface.
Since Tahoe-LAFS does not use user accounts or passwords, the SFTP/FTP

@ -71,8 +71,8 @@ port 3456, on the loopback (127.0.0.1) interface.
Basic Concepts: GET, PUT, DELETE, POST
======================================
As described in :doc:`../architecture`, each file and directory in a Tahoe
virtual filesystem is referenced by an identifier that combines the
As described in :doc:`../architecture`, each file and directory in a
Tahoe-LAFS file store is referenced by an identifier that combines the
designation of the object with the authority to do something with it (such as
read or modify the contents). This identifier is called a "read-cap" or
"write-cap", depending upon whether it enables read-only or read-write
@ -93,7 +93,7 @@ Other variations (generally implemented by adding query parameters to the
URL) will return information about the object, such as metadata. GET
operations are required to have no side-effects.
PUT is used to upload new objects into the filesystem, or to replace an
PUT is used to upload new objects into the file store, or to replace an
existing link or the contents of a mutable file. DELETE is used to unlink
objects from directories. Both PUT and DELETE are required to be idempotent:
performing the same operation multiple times must have the same side-effects
@ -107,12 +107,12 @@ unlinking), because otherwise a regular web browser has no way to accomplish
these tasks. In general, everything that can be done with a PUT or DELETE can
also be done with a POST.
Tahoe's web API is designed for two different kinds of consumer. The first is
a program that needs to manipulate the virtual file system. Such programs are
Tahoe-LAFS' web API is designed for two different kinds of consumer. The
first is a program that needs to manipulate the file store. Such programs are
expected to use the RESTful interface described above. The second is a human
using a standard web browser to work with the filesystem. This user is given
a series of HTML pages with links to download files, and forms that use POST
actions to upload, rename, and unlink files.
using a standard web browser to work with the file store. This user is
presented with a series of HTML pages with links to download files, and forms
that use POST actions to upload, rename, and unlink files.
When an error occurs, the HTTP response code will be set to an appropriate
400-series code (like 404 Not Found for an unknown childname, or 400 Bad Request
@ -332,7 +332,7 @@ Programmatic Operations
=======================
Now that we know how to build URLs that refer to files and directories in a
Tahoe virtual filesystem, what sorts of operations can we do with those URLs?
Tahoe-LAFS file store, what sorts of operations can we do with those URLs?
This section contains a catalog of GET, PUT, DELETE, and POST operations that
can be performed on these URLs. This set of operations is aimed at programs
that use HTTP to communicate with a Tahoe node. A later section describes
@ -419,7 +419,7 @@ Writing/Uploading a File
``PUT /uri``
This uploads a file, and produces a file-cap for the contents, but does not
attach the file into the filesystem. No directories will be modified by
attach the file into the file store. No directories will be modified by
this operation. The file-cap is returned as the body of the HTTP response.
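A client could issue this unlinked upload with nothing but the Python standard library. The sketch below is an assumption-laden example rather than official client code: the helper names are hypothetical, and the base URL assumes a node listening on loopback port 3456 (a real client would normally take the URL from the node's ``node.url`` file).

```python
import urllib.request

# Assumed base URL for a locally running node's web-API.
NODE_URL = "http://127.0.0.1:3456"


def build_unlinked_upload(data: bytes, node_url: str = NODE_URL) -> urllib.request.Request:
    """Build (but do not send) a PUT /uri request carrying the file contents."""
    return urllib.request.Request(node_url.rstrip("/") + "/uri", data=data, method="PUT")


def upload_unlinked(data: bytes, node_url: str = NODE_URL) -> str:
    """Send the request and return the response body, which is the file-cap."""
    with urllib.request.urlopen(build_unlinked_upload(data, node_url)) as resp:
        return resp.read().decode("utf-8").strip()
```

Calling `upload_unlinked(b"hello")` requires a running node; the returned string is the file-cap, and no directory is modified, so the caller has to keep the cap or link it somewhere in order to find the file again.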
This method accepts format= and mutable=true as query string arguments, and
@ -435,7 +435,7 @@ Creating a New Directory
Create a new empty directory and return its write-cap as the HTTP response
body. This does not make the newly created directory visible from the
filesystem. The "PUT" operation is provided for backwards compatibility:
file store. The "PUT" operation is provided for backwards compatibility:
new code should use POST.
This supports a format= argument in the query string. The format=
@ -807,8 +807,8 @@ child is set. The value of the 'tahoe':'linkcrtime' key is updated whenever
a link to a child is created -- i.e. when there was not previously a link
under that name.
Note however, that if the edge in the Tahoe filesystem points to a mutable
file and the contents of that mutable file is changed, then the
Note however, that if the edge in the Tahoe-LAFS file store points to a
mutable file and the contents of that mutable file is changed, then the
'tahoe':'linkmotime' value on that edge will *not* be updated, since the
edge itself wasn't updated -- only the mutable file was.
@ -835,8 +835,8 @@ The reason we added the new fields in Tahoe v1.4.0 is that there is a
values of the 'mtime'/'ctime' pair, and this API is used by the
"tahoe backup" command (in Tahoe v1.3.0 and later) to set the 'mtime' and
'ctime' values when backing up files from a local filesystem into the
Tahoe filesystem. As of Tahoe v1.4.0, the set_children API cannot be used
to set anything under the 'tahoe' key of the metadata dict -- if you
Tahoe-LAFS file store. As of Tahoe v1.4.0, the set_children API cannot be
used to set anything under the 'tahoe' key of the metadata dict -- if you
include 'tahoe' keys in your 'metadata' arguments then it will silently
ignore those keys.
@ -864,8 +864,8 @@ When an edge is created or updated by "tahoe backup", the 'mtime' and
There are several ways that the 'ctime' field could be confusing:
1. You might be confused about whether it reflects the time of the creation
of a link in the Tahoe filesystem (by a version of Tahoe < v1.7.0) or a
timestamp copied in by "tahoe backup" from a local filesystem.
of a link in the Tahoe-LAFS file store (by a version of Tahoe < v1.7.0)
or a timestamp copied in by "tahoe backup" from a local filesystem.
2. You might be confused about whether it is a copy of the file creation
time (if "tahoe backup" was run on a Windows system) or of the last
@ -895,7 +895,7 @@ Attaching an Existing File or Directory by its read- or write-cap
``PUT /uri/$DIRCAP/[SUBDIRS../]CHILDNAME?t=uri``
This attaches a child object (either a file or directory) to a specified
location in the virtual filesystem. The child object is referenced by its
location in the Tahoe-LAFS file store. The child object is referenced by its
read- or write- cap, as provided in the HTTP request body. This will create
intermediate directories as necessary.
@ -1008,9 +1008,9 @@ Browser Operations: Human-oriented interfaces
This section describes the HTTP operations that provide support for humans
running a web browser. Most of these operations use HTML forms that use POST
to drive the Tahoe node. This section is intended for HTML authors who want
to write web pages that contain forms and buttons which manipulate the Tahoe
filesystem.
to drive the Tahoe-LAFS node. This section is intended for HTML authors who
want to write web pages containing user interfaces for manipulating the
Tahoe-LAFS file store.
Note that for all POST operations, the arguments listed can be provided
either as URL query arguments or as form body fields. URL query arguments are
@ -1107,8 +1107,8 @@ Creating a Directory
``POST /uri?t=mkdir``
This creates a new empty directory, but does not attach it to the virtual
filesystem.
This creates a new empty directory, but does not attach it to any other
directory in the Tahoe-LAFS file store.
If a "redirect_to_result=true" argument is provided, then the HTTP response
will cause the web browser to be redirected to a /uri/$DIRCAP page that
@ -1150,8 +1150,8 @@ Uploading a File
``POST /uri?t=upload``
This uploads a file, and produces a file-cap for the contents, but does not
attach the file into the filesystem. No directories will be modified by
this operation.
attach the file to any directory in the Tahoe-LAFS file store. That is, no
directories will be modified by this operation.
The file must be provided as the "file" field of an HTML encoded form body,
produced in response to an HTML form like this::
@ -1699,7 +1699,7 @@ incorrectly.
for debugging. This is a table of (path, filecap/dircap), for every object
reachable from the starting directory. The path will be slash-joined, and
the filecap/dircap will contain a link to the object in question. This page
gives immediate access to every object in the virtual filesystem subtree.
gives immediate access to every object in the file store subtree.
This operation uses the same ophandle= mechanism as deep-check. The
corresponding /operations/$HANDLE page has three different forms. The
@ -1833,9 +1833,9 @@ Other Useful Pages
==================
The portion of the web namespace that begins with "/uri" (and "/named") is
dedicated to giving users (both humans and programs) access to the Tahoe
virtual filesystem. The rest of the namespace provides status information
about the state of the Tahoe node.
dedicated to giving users (both humans and programs) access to the Tahoe-LAFS
file store. The rest of the namespace provides status information about the
state of the Tahoe-LAFS node.
``GET /`` (the root page)
@ -1843,11 +1843,11 @@ This is the "Welcome Page", and contains a few distinct sections::
Node information: library versions, local nodeid, services being provided.
Filesystem Access Forms: create a new directory, view a file/directory by
File store access forms: create a new directory, view a file/directory by
URI, upload a file (unlinked), download a file by
URI.
Grid Status: introducer information, helper information, connected storage
Grid status: introducer information, helper information, connected storage
servers.
``GET /status/``
@ -1994,23 +1994,24 @@ Safety and Security Issues -- Names vs. URIs
============================================
Summary: use explicit file- and dir- caps whenever possible, to reduce the
potential for surprises when the filesystem structure is changed.
potential for surprises when the file store structure is changed.
Tahoe provides a mutable filesystem, but the ways that the filesystem can
change are limited. The only thing that can change is that the mapping from
child names to child objects that each directory contains can be changed by
adding a new child name pointing to an object, removing an existing child name,
or changing an existing child name to point to a different object.
Tahoe-LAFS provides a mutable file store, but the ways that the store can
change are limited. The only things that can change are:
* the mapping from child names to child objects inside mutable directories
(by adding a new child, removing an existing child, or changing an
existing child to point to a different object)
* the contents of mutable files
Obviously if you query Tahoe for information about the filesystem and then act
to change the filesystem (such as by getting a listing of the contents of a
directory and then adding a file to the directory), then the filesystem might
have been changed after you queried it and before you acted upon it. However,
if you use the URI instead of the pathname of an object when you act upon the
object, then the only change that can happen is if the object is a directory
then the set of child names it has might be different. If, on the other hand,
you act upon the object using its pathname, then a different object might be in
that place, which can result in more kinds of surprises.
Obviously if you query for information about the file store and then act
to change it (such as by getting a listing of the contents of a mutable
directory and then adding a file to the directory), then the store might
have been changed after you queried it and before you acted upon it.
However, if you use the URI instead of the pathname of an object when you
act upon the object, then it will be the same object; only its contents
can change (if it is mutable). If, on the other hand, you act upon the
object using its pathname, then a different object might be in that place,
which can result in more kinds of surprises.
For example, suppose you are writing code which recursively downloads the
contents of a directory. The first thing your code does is fetch the listing
@ -2018,15 +2019,14 @@ of the contents of the directory. For each child that it fetched, if that
child is a file then it downloads the file, and if that child is a directory
then it recurses into that directory. Now, if the download and the recurse
actions are performed using the child's name, then the results might be
wrong, because for example a child name that pointed to a sub-directory when
wrong, because for example a child name that pointed to a subdirectory when
you listed the directory might have been changed to point to a file (in which
case your attempt to recurse into it would result in an error and the file
would be skipped), or a child name that pointed to a file when you listed the
directory might now point to a sub-directory (in which case your attempt to
download the child would result in a file containing HTML text describing the
sub-directory!).
case your attempt to recurse into it would result in an error), or a child
name that pointed to a file when you listed the directory might now point to
a subdirectory (in which case your attempt to download the child would result
in a file containing HTML text describing the subdirectory!).
If your recursive algorithm uses the uri of the child instead of the name of
If your recursive algorithm uses the URI of the child instead of the name of
the child, then those kinds of mistakes just can't happen. Note that both the
child's name and the child's URI are included in the results of listing the
parent directory, so it isn't any harder to use the URI for this purpose.
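The recursive download described above might look like the following when it is driven by URIs. This is a minimal sketch against a hypothetical in-memory store, not the web-API: `list_dir`, `download_tree`, and the `URI:...` strings are placeholders invented for the example, and a real client would fetch listings and file contents over HTTP instead.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical stand-in for a grid: URI -> ("file", bytes) or ("dir", {name: child_uri}).
Store = Dict[str, Tuple[str, Any]]


def list_dir(store: Store, dir_uri: str) -> List[Tuple[str, str, str]]:
    """Return one snapshot of (name, child_uri, kind) for each child of a directory."""
    kind, children = store[dir_uri]
    if kind != "dir":
        raise NotADirectoryError(dir_uri)
    return [(name, uri, store[uri][0]) for name, uri in sorted(children.items())]


def download_tree(store: Store, dir_uri: str, prefix: str = "") -> Dict[str, bytes]:
    """Recursively collect file contents, addressing every child by its URI."""
    out: Dict[str, bytes] = {}
    for name, uri, kind in list_dir(store, dir_uri):
        path = prefix + name  # the name is used only to label the result
        if kind == "dir":
            out.update(download_tree(store, uri, path + "/"))
        else:
            out[path] = store[uri][1]
    return out


store: Store = {
    "URI:root": ("dir", {"a.txt": "URI:f1", "sub": "URI:d1"}),
    "URI:d1": ("dir", {"b.txt": "URI:f2"}),
    "URI:f1": ("file", b"one"),
    "URI:f2": ("file", b"two"),
}
tree = download_tree(store, "URI:root")
```

Because each recursive step and each download uses the URI taken from the listing, re-pointing a child name after the listing was fetched cannot turn an expected file into a directory or the reverse; the name only decides where the result is stored locally.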

@ -13,7 +13,7 @@ Garbage Collection in Tahoe
Overview
========
When a file or directory in the virtual filesystem is no longer referenced,
When a file or directory in a Tahoe-LAFS file store is no longer referenced,
the space that its shares occupied on each storage server can be freed,
making room for other shares. Tahoe currently uses a garbage collection
("GC") mechanism to implement this space-reclamation process. Each share has

@ -1,7 +1,7 @@
.TH TAHOE 1 "July 2011" "Tahoe-LAFS \[em] tahoe command" "User Commands"
.SH NAME
.PP
tahoe - Secure distributed filesystem.
tahoe - Secure distributed file store.
.SH SYNOPSIS
.PP
tahoe \f[I]COMMAND\f[] [\f[I]OPTION\f[]]... [\f[I]PARAMETER\f[]]...
@ -130,7 +130,7 @@ other than \f[B]run\f[]: `$HOME/.tahoe/').
Display help and exit
.RS
.RE
.SS USING THE FILESYSTEM
.SS USING THE FILE STORE
.TP
.B \f[B]mkdir\f[]
Create a new directory.

@ -1,6 +1,6 @@
The lossmodel.lyx file is the source document for an in-progress paper
that analyzes the probability of losing files stored in a Tahoe
Least-acces File System under various scenarios. It describes:
that analyzes the probability of losing files stored in a Tahoe-LAFS
file store under various scenarios. It describes:
1. How to estimate peer reliabilities, based on peer MTBF failure
data.

@ -116,10 +116,10 @@ The CLI
Prefer the command-line? Run “``tahoe --help``” (the same command-line
tool that is used to start and stop nodes serves to navigate and use the
decentralized filesystem). To get started, create a new directory and
decentralized file store). To get started, create a new directory and
mark it as the 'tahoe:' alias by running “``tahoe create-alias tahoe``”.
Once you've done that, you can do “``tahoe ls tahoe:``” and “``tahoe cp
LOCALFILE tahoe:foo.txt``” to work with your filesystem. The Tahoe-LAFS
LOCALFILE tahoe:foo.txt``” to work with your file store. The Tahoe-LAFS
CLI uses similar syntax to the well-known scp and rsync tools. See
:doc:`frontends/CLI` for more details.

@ -8,17 +8,17 @@ As explained in the architecture docs, Tahoe-LAFS can be roughly viewed as
a collection of three layers. The lowest layer is the key-value store: it
provides operations that accept files and upload them to the grid, creating
a URI in the process which securely references the file's contents.
The middle layer is the filesystem, creating a structure of directories and
filenames resembling the traditional unix/windows filesystems. The top layer
is the application layer, which uses the lower layers to provide useful
The middle layer is the file store, creating a structure of directories and
filenames resembling the traditional Unix or Windows filesystems. The top
layer is the application layer, which uses the lower layers to provide useful
services to users, like a backup application, or a way to share files with
friends.
This document examines the middle layer, the "filesystem".
This document examines the middle layer, the "file store".
1. `Key-value Store Primitives`_
2. `Filesystem goals`_
3. `Dirnode goals`_
2. `File Store Goals`_
3. `Dirnode Goals`_
4. `Dirnode secret values`_
5. `Dirnode storage format`_
6. `Dirnode sizes, mutable-file initial read sizes`_
@ -53,10 +53,10 @@ contents of a pre-existing slot, and the third retrieves the contents::
replace(mutable_uri, new_data)
data = get(mutable_uri)
Filesystem Goals
File Store Goals
================
The main goal for the middle (filesystem) layer is to give users a way to
The main goal for the middle (file store) layer is to give users a way to
organize the data that they have uploaded into the grid. The traditional way
to do this in computer filesystems is to put this data into files, give those
files names, and collect these names into directories.
@ -113,7 +113,7 @@ dirnodes is such that read-only access is transitive: i.e. if you grant Bob
read-only access to a parent directory, then Bob will get read-only access
(and *not* read-write access) to its children.
Relative to the previous "vdrive-server" based scheme, the current
Relative to the previous "vdrive server"-based scheme, the current
distributed dirnode approach gives better availability, but cannot guarantee
updateness quite as well, and requires far more network traffic for each
retrieval and update. Mutable files are somewhat less available than
@ -289,7 +289,7 @@ shorter than read-caps and write-caps, the attacker can use the length of the
ciphertext to guess the number of children of each node, and might be able to
guess the length of the child names (or at least their sum). From this, the
attacker may be able to build up a graph with the same shape as the plaintext
filesystem, but with unlabeled edges and unknown file contents.
file store, but with unlabeled edges and unknown file contents.
Integrity failures in the storage servers
@ -339,11 +339,11 @@ directory-creation effort to a bare minimum (picking a random number instead
of generating two random primes).
When a backup program is run for the first time, it needs to copy a large
amount of data from a pre-existing filesystem into reliable storage. This
means that a large and complex directory structure needs to be duplicated in
the dirnode layer. With the one-object-per-dirnode approach described here,
this requires as many operations as there are edges in the imported
filesystem graph.
amount of data from a pre-existing local filesystem into reliable storage.
This means that a large and complex directory structure needs to be
duplicated in the dirnode layer. With the one-object-per-dirnode approach
described here, this requires as many operations as there are edges in the
imported filesystem graph.
Another approach would be to aggregate multiple directories into a single
storage object. This object would contain a serialized graph rather than a
@ -404,7 +404,7 @@ storage index, but do *not* include the readkeys or writekeys, so the
repairer does not get to read the files or directories that it is helping to
keep alive.
After each change to the user's vdrive, the client creates a manifest and
After each change to the user's file store, the client creates a manifest and
looks for differences from their previous version. Anything which was removed
prompts the client to send out lease-cancellation messages, allowing the data
to be deleted.
@ -422,27 +422,29 @@ Mounting and Sharing Directories
================================
The biggest benefit of this dirnode approach is that sharing individual
directories is almost trivial. Alice creates a subdirectory that she wants to
use to share files with Bob. This subdirectory is attached to Alice's
filesystem at "~alice/share-with-bob". She asks her filesystem for the
read-write directory URI for that new directory, and emails it to Bob. When
Bob receives the URI, he asks his own local vdrive to attach the given URI,
perhaps at a place named "~bob/shared-with-alice". Every time either party
writes a file into this directory, the other will be able to read it. If
Alice prefers, she can give a read-only URI to Bob instead, and then Bob will
be able to read files but not change the contents of the directory. Neither
Alice nor Bob will get access to any files above the mounted directory: there
are no 'parent directory' pointers. If Alice creates a nested set of
directories, "~alice/share-with-bob/subdir2", and gives a read-only URI to
share-with-bob to Bob, then Bob will be unable to write to either
share-with-bob/ or subdir2/.
directories is almost trivial. Alice creates a subdirectory that she wants
to use to share files with Bob. This subdirectory is attached to Alice's
file store at "alice:shared-with-bob". She asks her file store for the
read-only directory URI for that new directory, and emails it to Bob. When
Bob receives the URI, he attaches the given URI into one of his own
directories, perhaps at a place named "bob:shared-with-alice". Every time
Alice writes a file into this directory, Bob will be able to read it.
(It is also possible to share read-write URIs between users, but that makes
it difficult to follow the `Prime Coordination Directive`_ .) Neither
Alice nor Bob will get access to any files above the mounted directory:
there are no 'parent directory' pointers. If Alice creates a nested set of
directories, "alice:shared-with-bob/subdir2", and gives a read-only URI to
shared-with-bob to Bob, then Bob will be unable to write to either
shared-with-bob/ or subdir2/.
.. _`Prime Coordination Directive`: ../write_coordination.rst
A suitable UI needs to be created to allow users to easily perform this
sharing action: dragging a folder their vdrive to an IM or email user icon,
for example. The UI will need to give the sending user an opportunity to
indicate whether they want to grant read-write or read-only access to the
recipient. The recipient then needs an interface to drag the new folder into
their vdrive and give it a home.
sharing action: dragging a folder from their file store to an IM or email
user icon, for example. The UI will need to give the sending user an
opportunity to indicate whether they want to grant read-write or read-only
access to the recipient. The recipient then needs an interface to drag the
new folder into their file store and give it a home.
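
The sharing flow described above can be sketched with a toy model. This is an illustration under stated assumptions, not Tahoe-LAFS code: ``ToyDirectory``, ``ToyCap``, and ``ReadOnlyError`` are hypothetical names, and real caps are cryptographic strings rather than in-memory objects. The sketch shows only the access rules: a read-only cap lets Bob read what Alice writes, write access does not reappear further down the tree, and a directory holds no pointer to its parent.

```python
class ReadOnlyError(Exception):
    """Raised when a write is attempted through a read-only cap."""


class ToyDirectory:
    """Hypothetical directory: child names map to values or caps.

    There is deliberately no reference to a parent directory.
    """

    def __init__(self):
        self.children = {}


class ToyCap:
    """Hypothetical capability: a directory plus an access level."""

    def __init__(self, target, writeable):
        self.target = target
        self.writeable = writeable

    def readonly(self):
        # Attenuate: same directory, but without write authority.
        return ToyCap(self.target, writeable=False)

    def list(self):
        return sorted(self.target.children)

    def get(self, name):
        child = self.target.children[name]
        # Read-only access is transitive: children reached through a
        # read-only cap are themselves read-only.
        if isinstance(child, ToyCap) and not self.writeable:
            return child.readonly()
        return child

    def set(self, name, value):
        if not self.writeable:
            raise ReadOnlyError(name)
        self.target.children[name] = value


# Alice creates "alice:shared-with-bob" with a nested "subdir2".
shared = ToyCap(ToyDirectory(), writeable=True)
shared.set("subdir2", ToyCap(ToyDirectory(), writeable=True))

# Alice sends Bob the read-only form; Bob attaches it in his own tree.
bob_view = shared.readonly()
bob_root = ToyCap(ToyDirectory(), writeable=True)
bob_root.set("shared-with-alice", bob_view)

# Alice writes; Bob can read the result but cannot write anywhere below.
shared.set("notes.txt", b"hello")
```

In this model, ``bob_root.get("shared-with-alice").get("notes.txt")`` returns Alice's bytes, while calling ``set`` through ``bob_view`` or through ``bob_view.get("subdir2")`` raises ``ReadOnlyError``.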
Revocation
==========

View File

@ -13,9 +13,9 @@ Tahoe URIs
2. `Directory URIs`_
3. `Internal Usage of URIs`_
Each file and directory in a Tahoe filesystem is described by a "URI". There
are different kinds of URIs for different kinds of objects, and there are
different kinds of URIs to provide different kinds of access to those
Each file and directory in a Tahoe-LAFS file store is described by a "URI".
There are different kinds of URIs for different kinds of objects, and there
are different kinds of URIs to provide different kinds of access to those
objects. Each URI is a string representation of a "capability" or "cap", and
there are read-caps, write-caps, verify-caps, and others.
@ -41,9 +41,10 @@ herein.
File URIs
=========
The lowest layer of the Tahoe architecture (the "grid") is responsible for
mapping URIs to data. This is basically a distributed hash table, in which
the URI is the key, and some sequence of bytes is the value.
The lowest layer of the Tahoe architecture (the "key-value store") is
responsible for mapping URIs to data. This is basically a distributed
hash table, in which the URI is the key, and some sequence of bytes is
the value.
There are two kinds of entries in this table: immutable and mutable. For
immutable entries, the URI represents a fixed chunk of data. The URI itself
@ -53,10 +54,10 @@ to locate and download that data from the grid at some time in the future.
For mutable entries, the URI identifies a "slot" or "container", which can be
filled with different pieces of data at different times.
It is important to note that the "files" described by these URIs are just a
bunch of bytes, and that **no** filenames or other metadata is retained at
this layer. The vdrive layer (which sits above the grid layer) is entirely
responsible for directories and filenames and the like.
It is important to note that the values referenced by these URIs are just
sequences of bytes, and that **no** filenames or other metadata is retained at
this layer. The file store layer (which sits above the key-value store layer)
is entirely responsible for directories and filenames and the like.
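
As a rough illustration of the two kinds of entries described above, the following toy model stores immutable values under a key derived from their content and mutable values in a randomly named slot. It is a sketch under assumptions: ``ToyKeyValueStore``, its method names, and the ``toy-imm:`` / ``toy-mut:`` key formats are hypothetical and are not Tahoe-LAFS URIs or APIs. A real grid also encrypts, erasure-codes, and distributes the data, none of which is modelled here.

```python
import hashlib
import os


class ToyKeyValueStore:
    """Hypothetical in-memory stand-in for the key-value store layer.

    Values are plain byte strings; no filenames or metadata are kept.
    """

    def __init__(self):
        self._immutable = {}
        self._mutable = {}

    def put_immutable(self, data):
        # The key is derived from the content, so it always names the
        # same fixed chunk of data.
        key = "toy-imm:" + hashlib.sha256(data).hexdigest()
        self._immutable[key] = data
        return key

    def create_mutable(self, data=b""):
        # The key names a slot; its contents may change over time.
        key = "toy-mut:" + os.urandom(16).hex()
        self._mutable[key] = data
        return key

    def overwrite_mutable(self, key, data):
        if key not in self._mutable:
            raise KeyError(key)
        self._mutable[key] = data

    def get(self, key):
        if key.startswith("toy-imm:"):
            return self._immutable[key]
        return self._mutable[key]


store = ToyKeyValueStore()
imm_key = store.put_immutable(b"fixed chunk of data")
slot_key = store.create_mutable(b"version 1")
store.overwrite_mutable(slot_key, b"version 2")
```

Storing the same bytes twice yields the same immutable key, whereas the slot key stays constant while the value behind it changes. Names and directory structure would have to be built on top of such a store, which is the role the text assigns to the file store layer.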
CHK URIs
--------
@ -164,10 +165,10 @@ structure to provide mutable file access.
Directory URIs
==============
The grid layer provides a mapping from URI to data. To turn this into a graph
of directories and files, the "vdrive" layer (which sits on top of the grid
layer) needs to keep track of "directory nodes", or "dirnodes" for short.
:doc:`dirnodes` describes how these work.
The key-value store layer provides a mapping from URI to data. To turn this
into a graph of directories and files, the "file store" layer (which sits on
top of the key-value store layer) needs to keep track of "directory nodes",
or "dirnodes" for short. :doc:`dirnodes` describes how these work.
Dirnodes are contained inside mutable files, and are thus simply a particular
way to interpret the contents of these files. As a result, a directory

View File

@ -14,7 +14,7 @@ directory at a time. One convenient way to accomplish this is to make
a different file or directory for each person or process that wants to
write.
If mutable parts of a filesystem are accessed via sshfs, only a single
If mutable parts of a file store are accessed via sshfs, only a single
sshfs mount should be used. There may be data loss if mutable files or
directories are accessed via two sshfs mounts, or written both via sshfs
and from other clients.

View File

@ -9,7 +9,7 @@ NODEURL_RE=re.compile("http(s?)://([^:]*)(:([1-9][0-9]*))?")
_default_nodedir = get_default_nodedir()
class FilesystemOptions(BaseOptions):
class FileStoreOptions(BaseOptions):
optParameters = [
["node-url", "u", None,
"Specify the URL of the Tahoe gateway node, such as "
@ -46,7 +46,7 @@ class FilesystemOptions(BaseOptions):
self.aliases = aliases # maps alias name to dircap
class MakeDirectoryOptions(FilesystemOptions):
class MakeDirectoryOptions(FileStoreOptions):
optParameters = [
("format", None, None, "Create a directory with the given format: SDMF or MDMF (case-insensitive)"),
]
@ -61,7 +61,7 @@ class MakeDirectoryOptions(FilesystemOptions):
synopsis = "[options] [REMOTE_DIR]"
description = """Create a new directory, either unlinked or as a subdirectory."""
class AddAliasOptions(FilesystemOptions):
class AddAliasOptions(FileStoreOptions):
def parseArgs(self, alias, cap):
self.alias = argv_to_unicode(alias)
if self.alias.endswith(u':'):
@ -71,7 +71,7 @@ class AddAliasOptions(FilesystemOptions):
synopsis = "[options] ALIAS[:] DIRCAP"
description = """Add a new alias for an existing directory."""
class CreateAliasOptions(FilesystemOptions):
class CreateAliasOptions(FileStoreOptions):
def parseArgs(self, alias):
self.alias = argv_to_unicode(alias)
if self.alias.endswith(u':'):
@ -80,14 +80,14 @@ class CreateAliasOptions(FilesystemOptions):
synopsis = "[options] ALIAS[:]"
description = """Create a new directory and add an alias for it."""
class ListAliasesOptions(FilesystemOptions):
class ListAliasesOptions(FileStoreOptions):
synopsis = "[options]"
description = """Display a table of all configured aliases."""
optFlags = [
("readonly-uri", None, "Show read-only dircaps instead of readwrite"),
]
class ListOptions(FilesystemOptions):
class ListOptions(FileStoreOptions):
optFlags = [
("long", "l", "Use long format: show file sizes, and timestamps."),
("uri", None, "Show file/directory URIs."),
@ -124,11 +124,11 @@ class ListOptions(FilesystemOptions):
Otherwise the size of the file, when known, is given in bytes.
The size of mutable files or unknown objects is shown as '?'.
The date/time shows when this link in the Tahoe filesystem was
last modified.
The date/time shows when this link in the Tahoe grid was last
modified.
"""
class GetOptions(FilesystemOptions):
class GetOptions(FileStoreOptions):
def parseArgs(self, arg1, arg2=None):
# tahoe get FOO |less # write to stdout
# tahoe get tahoe:FOO |less # same
@ -156,7 +156,7 @@ class GetOptions(FilesystemOptions):
% tahoe get tahoe:FOO bar # same
"""
class PutOptions(FilesystemOptions):
class PutOptions(FileStoreOptions):
optFlags = [
("mutable", "m", "Create a mutable file instead of an immutable one (like --format=SDMF)"),
]
@ -202,7 +202,7 @@ class PutOptions(FilesystemOptions):
% tahoe put bar MUTABLE-FILE-WRITECAP # modify the mutable file in-place
"""
class CpOptions(FilesystemOptions):
class CpOptions(FileStoreOptions):
optFlags = [
("recursive", "r", "Copy source directory recursively."),
("verbose", "v", "Be noisy about what is happening."),
@ -249,7 +249,7 @@ class CpOptions(FilesystemOptions):
contents.
"""
class UnlinkOptions(FilesystemOptions):
class UnlinkOptions(FileStoreOptions):
def parseArgs(self, where):
self.where = argv_to_unicode(where)
@ -260,7 +260,7 @@ class RmOptions(UnlinkOptions):
synopsis = "[options] REMOTE_FILE"
description = "Remove a named file from its parent directory."
class MvOptions(FilesystemOptions):
class MvOptions(FileStoreOptions):
def parseArgs(self, frompath, topath):
self.from_file = argv_to_unicode(frompath)
self.to_file = argv_to_unicode(topath)
@ -279,7 +279,7 @@ class MvOptions(FilesystemOptions):
the grid -- use 'tahoe cp' for that.
"""
class LnOptions(FilesystemOptions):
class LnOptions(FileStoreOptions):
def parseArgs(self, frompath, topath):
self.from_file = argv_to_unicode(frompath)
self.to_file = argv_to_unicode(topath)
@ -311,7 +311,7 @@ class LnOptions(FilesystemOptions):
class BackupConfigurationError(Exception):
pass
class BackupOptions(FilesystemOptions):
class BackupOptions(FileStoreOptions):
optFlags = [
("verbose", "v", "Be noisy about what is happening."),
("ignore-timestamps", None, "Do not use backupdb timestamps to decide whether a local file is unchanged."),
@ -380,7 +380,7 @@ class BackupOptions(FilesystemOptions):
--link-dest=TO/Archives/(previous) FROM TO/Archives/(new); ln -sf
TO/Archives/(new) TO/Latest'."""
class WebopenOptions(FilesystemOptions):
class WebopenOptions(FileStoreOptions):
optFlags = [
("info", "i", "Open the t=info page for the file"),
]
@ -394,7 +394,7 @@ class WebopenOptions(FilesystemOptions):
directory on the grid. When run without arguments, open the Welcome
page."""
class ManifestOptions(FilesystemOptions):
class ManifestOptions(FileStoreOptions):
optFlags = [
("storage-index", "s", "Only print storage index strings, not pathname+cap."),
("verify-cap", None, "Only print verifycap, not pathname+cap."),
@ -409,7 +409,7 @@ class ManifestOptions(FilesystemOptions):
Print a list of all files and directories reachable from the given
starting point."""
class StatsOptions(FilesystemOptions):
class StatsOptions(FileStoreOptions):
optFlags = [
("raw", "r", "Display raw JSON data instead of parsed"),
]
@ -421,7 +421,7 @@ class StatsOptions(FilesystemOptions):
Print statistics about all files and directories reachable from the
given starting point."""
class CheckOptions(FilesystemOptions):
class CheckOptions(FileStoreOptions):
optFlags = [
("raw", None, "Display raw JSON data instead of parsed."),
("verify", None, "Verify all hashes, instead of merely querying share presence."),
@ -437,7 +437,7 @@ class CheckOptions(FilesystemOptions):
verify their hashes. Optionally repair the file if any problems were
found."""
class DeepCheckOptions(FilesystemOptions):
class DeepCheckOptions(FileStoreOptions):
optFlags = [
("raw", None, "Display raw JSON data instead of parsed."),
("verify", None, "Verify all hashes, instead of merely querying share presence."),

View File

@ -44,7 +44,7 @@ class Options(usage.Options):
+ startstop_node.subCommands
+ GROUP("Debugging")
+ debug.subCommands
+ GROUP("Using the filesystem")
+ GROUP("Using the file store")
+ cli.subCommands
+ magic_folder_cli.subCommands
)