mirror of
https://github.com/tahoe-lafs/tahoe-lafs.git
synced 2025-04-28 06:49:46 +00:00
Update docs, notably performance.rst, to include MDMF. fixes #1772
parent
c1faaa2ca2
commit
514fb096be
@ -365,10 +365,13 @@ Client Configuration
|
|||||||
mutable-type parameter in the webapi. If you do not specify a value here,
|
mutable-type parameter in the webapi. If you do not specify a value here,
|
||||||
Tahoe-LAFS will use SDMF for all newly-created mutable files.
|
Tahoe-LAFS will use SDMF for all newly-created mutable files.
|
||||||
|
|
||||||
Note that this parameter only applies to mutable files. Mutable
|
Note that this parameter applies only to files, not to directories.
|
||||||
directories, which are stored as mutable files, are not controlled by
|
Mutable directories, which are stored in mutable files, are not
|
||||||
this parameter and will always use SDMF. We may revisit this decision in
|
controlled by this parameter and will always use SDMF. We may revisit
|
||||||
future versions of Tahoe-LAFS.
|
this decision in future versions of Tahoe-LAFS.
|
||||||
|
|
||||||
|
See `<frontends/specifications/mutable.rst>`_ for details about mutable
|
||||||
|
file formats.
|
||||||
|
|
||||||
Frontend Configuration
|
Frontend Configuration
|
||||||
======================
|
======================
|
||||||
|
@ -10,8 +10,8 @@ Performance costs for some common operations
|
|||||||
6. `Inserting/Removing B bytes in an A-byte mutable file`_
|
6. `Inserting/Removing B bytes in an A-byte mutable file`_
|
||||||
7. `Adding an entry to an A-entry directory`_
|
7. `Adding an entry to an A-entry directory`_
|
||||||
8. `Listing an A entry directory`_
|
8. `Listing an A entry directory`_
|
||||||
9. `Performing a file-check on an A-byte file`_
|
9. `Checking an A-byte file`_
|
||||||
10. `Performing a file-verify on an A-byte file`_
|
10. `Verifying an A-byte file (immutable)`_
|
||||||
11. `Repairing an A-byte file (mutable or immutable)`_
|
11. `Repairing an A-byte file (mutable or immutable)`_
|
||||||
|
|
||||||
``K`` indicates the number of shares required to reconstruct the file
|
``K`` indicates the number of shares required to reconstruct the file
|
||||||
@ -23,7 +23,7 @@ Performance costs for some common operations
|
|||||||
|
|
||||||
``A`` indicates the number of bytes in a file
|
``A`` indicates the number of bytes in a file
|
||||||
|
|
||||||
``B`` indicates the number of bytes of a file which are being read or
|
``B`` indicates the number of bytes of a file that are being read or
|
||||||
written
|
written
|
||||||
|
|
||||||
``G`` indicates the number of storage servers on your grid
|
``G`` indicates the number of storage servers on your grid
|
||||||
@ -179,8 +179,8 @@ directory be downloaded from the grid. So listing an A entry
|
|||||||
directory requires downloading a (roughly) 330 * A byte mutable
|
directory requires downloading a (roughly) 330 * A byte mutable
|
||||||
file, since each directory entry is about 300-330 bytes in size.
|
file, since each directory entry is about 300-330 bytes in size.
|
||||||
|
|
||||||
Performing a file-check on an ``A``-byte file
|
Checking an ``A``-byte file
|
||||||
=============================================
|
===========================
|
||||||
|
|
||||||
cpu: ~G
|
cpu: ~G
|
||||||
|
|
||||||
@ -193,8 +193,8 @@ about. Note that neither of these values directly depend on the size
|
|||||||
of the file. This is relatively inexpensive, compared to the verify
|
of the file. This is relatively inexpensive, compared to the verify
|
||||||
and repair operations.
|
and repair operations.
|
||||||
|
|
||||||
Performing a file-verify on an ``A``-byte file
|
Verifying an A-byte file (immutable)
|
||||||
==============================================
|
====================================
|
||||||
|
|
||||||
cpu: ~N/K*A
|
cpu: ~N/K*A
|
||||||
|
|
||||||
@ -204,9 +204,24 @@ memory footprint: N/K*S
|
|||||||
|
|
||||||
notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext
|
notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext
|
||||||
shares that were originally uploaded to the grid and integrity checks
|
shares that were originally uploaded to the grid and integrity checks
|
||||||
them. This is (for well-behaved grids) more expensive than downloading
|
them. This is (for grids with good redundancy) more expensive than
|
||||||
an A-byte file, since only a fraction of these shares are necessary to
|
downloading an A-byte file, since only a fraction of these shares would
|
||||||
recover the file.
|
be necessary to recover the file.
|
||||||
|
|
||||||
|
Verifying an A-byte file (mutable)
|
||||||
|
==================================
|
||||||
|
|
||||||
|
cpu: ~N/K*A
|
||||||
|
|
||||||
|
network: N/K*A
|
||||||
|
|
||||||
|
memory footprint: N/K*A
|
||||||
|
|
||||||
|
notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext
|
||||||
|
shares that were originally uploaded to the grid and integrity checks
|
||||||
|
them. This is (for grids with good redundancy) more expensive than
|
||||||
|
downloading an A-byte file, since only a fraction of these shares would
|
||||||
|
be necessary to recover the file.
|
||||||
|
|
||||||
Repairing an ``A``-byte file (mutable or immutable)
|
Repairing an ``A``-byte file (mutable or immutable)
|
||||||
===================================================
|
===================================================
|
||||||
|
@ -2,8 +2,6 @@
|
|||||||
Mutable Files
|
Mutable Files
|
||||||
=============
|
=============
|
||||||
|
|
||||||
This describes the "RSA-based mutable files" which were shipped in Tahoe v0.8.0.
|
|
||||||
|
|
||||||
1. `Mutable Formats`_
|
1. `Mutable Formats`_
|
||||||
2. `Consistency vs. Availability`_
|
2. `Consistency vs. Availability`_
|
||||||
3. `The Prime Coordination Directive: "Don't Do That"`_
|
3. `The Prime Coordination Directive: "Don't Do That"`_
|
||||||
@ -19,33 +17,38 @@ This describes the "RSA-based mutable files" which were shipped in Tahoe v0.8.0.
|
|||||||
6. `Large Distributed Mutable Files`_
|
6. `Large Distributed Mutable Files`_
|
||||||
7. `TODO`_
|
7. `TODO`_
|
||||||
|
|
||||||
Mutable File Slots are places with a stable identifier that can hold data
|
Mutable files are places with a stable identifier that can hold data that
|
||||||
that changes over time. In contrast to CHK slots, for which the
|
changes over time. In contrast to immutable slots, for which the
|
||||||
URI/identifier is derived from the contents themselves, the Mutable File Slot
|
identifier/capability is derived from the contents themselves, the mutable
|
||||||
URI remains fixed for the life of the slot, regardless of what data is placed
|
file identifier remains fixed for the life of the slot, regardless of what
|
||||||
inside it.
|
data is placed inside it.
|
||||||
|
|
||||||
Each mutable slot is referenced by two different URIs. The "read-write" URI
|
Each mutable file is referenced by two different caps. The "read-write" cap
|
||||||
grants read-write access to its holder, allowing them to put whatever
|
grants read-write access to its holder, allowing them to put whatever
|
||||||
contents they like into the slot. The "read-only" URI is less powerful, only
|
contents they like into the slot. The "read-only" cap is less powerful, only
|
||||||
granting read access, and not enabling modification of the data. The
|
granting read access, and not enabling modification of the data. The
|
||||||
read-write URI can be turned into the read-only URI, but not the other way
|
read-write cap can be turned into the read-only cap, but not the other way
|
||||||
around.
|
around.
|
||||||
|
|
||||||
The data in these slots is distributed over a number of servers, using the
|
The data in these files is distributed over a number of servers, using the
|
||||||
same erasure coding that CHK files use, with 3-of-10 being a typical choice
|
same erasure coding that immutable files use, with 3-of-10 being a typical
|
||||||
of encoding parameters. The data is encrypted and signed in such a way that
|
choice of encoding parameters. The data is encrypted and signed in such a way
|
||||||
only the holders of the read-write URI will be able to set the contents of
|
that only the holders of the read-write cap will be able to set the contents
|
||||||
the slot, and only the holders of the read-only URI will be able to read
|
of the slot, and only the holders of the read-only cap will be able to read
|
||||||
those contents. Holders of either URI will be able to validate the contents
|
those contents. Holders of either cap will be able to validate the contents
|
||||||
as being written by someone with the read-write URI. The servers who hold the
|
as being written by someone with the read-write cap. The servers who hold the
|
||||||
shares cannot read or modify them: the worst they can do is deny service (by
|
shares are not automatically given the ability read or modify them: the worst
|
||||||
deleting or corrupting the shares), or attempt a rollback attack (which can
|
they can do is deny service (by deleting or corrupting the shares), or
|
||||||
only succeed with the cooperation of at least k servers).
|
attempt a rollback attack (which can only succeed with the cooperation of at
|
||||||
|
least k servers).
|
||||||
|
|
||||||
|
|
||||||
Mutable Formats
|
Mutable Formats
|
||||||
===============
|
===============
|
||||||
|
|
||||||
|
History
|
||||||
|
-------
|
||||||
|
|
||||||
When mutable files first shipped in Tahoe-0.8.0 (15-Feb-2008), the only
|
When mutable files first shipped in Tahoe-0.8.0 (15-Feb-2008), the only
|
||||||
version available was "SDMF", described below. This was a
|
version available was "SDMF", described below. This was a
|
||||||
limited-functionality placeholder, intended to be replaced with
|
limited-functionality placeholder, intended to be replaced with
|
||||||
@ -75,8 +78,11 @@ SDMF a clean subset of MDMF, where any single-segment MDMF file could be
|
|||||||
handled by the old SDMF code). In the fall of 2011, Kevan's code was finally
|
handled by the old SDMF code). In the fall of 2011, Kevan's code was finally
|
||||||
integrated, and first made available in the Tahoe-1.9.0 release.
|
integrated, and first made available in the Tahoe-1.9.0 release.
|
||||||
|
|
||||||
The main improvement of MDMF is the use of multiple segments: individual
|
SDMF vs. MDMF
|
||||||
128KiB sections of the file can be retrieved or modified independently. The
|
-------------
|
||||||
|
|
||||||
|
The improvement of MDMF is the use of multiple segments: individual 128-KiB
|
||||||
|
sections of the file can be retrieved or modified independently. The
|
||||||
improvement can be seen when fetching just a portion of the file (using a
|
improvement can be seen when fetching just a portion of the file (using a
|
||||||
Range: header on the webapi), or when modifying a portion (again with a
|
Range: header on the webapi), or when modifying a portion (again with a
|
||||||
Range: header). It can also be seen indirectly when fetching the whole file:
|
Range: header). It can also be seen indirectly when fetching the whole file:
|
||||||
@ -84,12 +90,14 @@ the first segment of data should be delivered faster from a large MDMF file
|
|||||||
than from an SDMF file, although the overall download will then proceed at
|
than from an SDMF file, although the overall download will then proceed at
|
||||||
the same rate.
|
the same rate.
|
||||||
|
|
||||||
We've decided to make it opt-in for the first release while we shake out the
|
We've decided to make it opt-in for now: mutable files default to
|
||||||
bugs, just in case a problem is found which requires an incompatible format
|
SDMF format unless explicitly configured to use MDMF, either in ``tahoe.cfg``
|
||||||
change. All new mutable files will be in SDMF format unless the user
|
(see `<configuration.rst>`__) or in the WUI or CLI command that created a
|
||||||
specifically chooses to use MDMF instead. The code can read and modify
|
new mutable file.
|
||||||
existing files of either format without user intervention. We expect to make
|
|
||||||
MDMF the default in a subsequent release, perhaps 2.0.
|
The code can read and modify existing files of either format without user
|
||||||
|
intervention. We expect to make MDMF the default in a subsequent release,
|
||||||
|
perhaps 2.0.
|
||||||
|
|
||||||
Which format should you use? SDMF works well for files up to a few MB, and
|
Which format should you use? SDMF works well for files up to a few MB, and
|
||||||
can be handled by older versions (Tahoe-1.8.3 and earlier). If you do not
|
can be handled by older versions (Tahoe-1.8.3 and earlier). If you do not
|
||||||
@ -114,8 +122,9 @@ As we develop more sophisticated mutable slots, the API may expose multiple
|
|||||||
read versions to the application layer. The tahoe philosophy is to defer most
|
read versions to the application layer. The tahoe philosophy is to defer most
|
||||||
consistency recovery logic to the higher layers. Some applications have
|
consistency recovery logic to the higher layers. Some applications have
|
||||||
effective ways to merge multiple versions, so inconsistency is not
|
effective ways to merge multiple versions, so inconsistency is not
|
||||||
necessarily a problem (i.e. directory nodes can usually merge multiple "add
|
necessarily a problem (i.e. directory nodes can usually merge multiple
|
||||||
child" operations).
|
"add child" operations).
|
||||||
|
|
||||||
|
|
||||||
The Prime Coordination Directive: "Don't Do That"
|
The Prime Coordination Directive: "Don't Do That"
|
||||||
=================================================
|
=================================================
|
||||||
@ -697,38 +706,30 @@ Medium Distributed Mutable Files
|
|||||||
|
|
||||||
These are just like the SDMF case, but:
|
These are just like the SDMF case, but:
|
||||||
|
|
||||||
* we actually take advantage of the Merkle hash tree over the blocks, by
|
* We actually take advantage of the Merkle hash tree over the blocks, by
|
||||||
reading a single segment of data at a time (and its necessary hashes), to
|
reading a single segment of data at a time (and its necessary hashes), to
|
||||||
reduce the read-time alacrity
|
reduce the read-time alacrity.
|
||||||
* we allow arbitrary writes to the file (i.e. seek() is provided, and
|
* We allow arbitrary writes to any range of the file.
|
||||||
O_TRUNC is no longer required)
|
* We add more code to first read each segment that a write must modify.
|
||||||
* we write more code on the client side (in the MutableFileNode class), to
|
This looks exactly like the way a normal filesystem uses a block device,
|
||||||
first read each segment that a write must modify. This looks exactly like
|
or how a CPU must perform a cache-line fill before modifying a single word.
|
||||||
the way a normal filesystem uses a block device, or how a CPU must perform
|
* We might implement some sort of copy-based atomic update server call,
|
||||||
a cache-line fill before modifying a single word.
|
|
||||||
* we might implement some sort of copy-based atomic update server call,
|
|
||||||
to allow multiple writev() calls to appear atomic to any readers.
|
to allow multiple writev() calls to appear atomic to any readers.
|
||||||
|
|
||||||
MDMF slots provide fairly efficient in-place edits of very large files (a few
|
MDMF slots provide fairly efficient in-place edits of very large files (a few
|
||||||
GB). Appending data is also fairly efficient, although each time a power of 2
|
GB). Appending data is also fairly efficient.
|
||||||
boundary is crossed, the entire file must effectively be re-uploaded (because
|
|
||||||
the size of the block hash tree changes), so if the filesize is known in
|
|
||||||
advance, that space ought to be pre-allocated (by leaving extra space between
|
|
||||||
the block hash tree and the actual data).
|
|
||||||
|
|
||||||
MDMF1 uses the Merkle tree to enable low-alacrity random-access reads. MDMF2
|
|
||||||
adds cache-line reads to allow random-access writes.
|
|
||||||
|
|
||||||
Large Distributed Mutable Files
|
Large Distributed Mutable Files
|
||||||
===============================
|
===============================
|
||||||
|
|
||||||
LDMF slots use a fundamentally different way to store the file, inspired by
|
LDMF slots (not implemented) would use a fundamentally different way to store
|
||||||
Mercurial's "revlog" format. They enable very efficient insert/remove/replace
|
the file, inspired by Mercurial's "revlog" format. This would enable very
|
||||||
editing of arbitrary spans. Multiple versions of the file can be retained, in
|
efficient insert/remove/replace editing of arbitrary spans. Multiple versions
|
||||||
a revision graph that can have multiple heads. Each revision can be
|
of the file can be retained, in a revision graph that can have multiple heads.
|
||||||
referenced by a cryptographic identifier. There are two forms of the URI, one
|
Each revision can be referenced by a cryptographic identifier. There are two
|
||||||
that means "most recent version", and a longer one that points to a specific
|
forms of the URI, one that means "most recent version", and a longer one that
|
||||||
revision.
|
points to a specific revision.
|
||||||
|
|
||||||
Metadata can be attached to the revisions, like timestamps, to enable rolling
|
Metadata can be attached to the revisions, like timestamps, to enable rolling
|
||||||
back an entire tree to a specific point in history.
|
back an entire tree to a specific point in history.
|
||||||
@ -736,6 +737,7 @@ back an entire tree to a specific point in history.
|
|||||||
LDMF1 provides deltas but tries to avoid dealing with multiple heads. LDMF2
|
LDMF1 provides deltas but tries to avoid dealing with multiple heads. LDMF2
|
||||||
provides explicit support for revision identifiers and branching.
|
provides explicit support for revision identifiers and branching.
|
||||||
|
|
||||||
|
|
||||||
TODO
|
TODO
|
||||||
====
|
====
|
||||||
|
|
||||||
|