mirror of https://github.com/tahoe-lafs/tahoe-lafs.git (synced 2025-01-31 08:25:35 +00:00)
docs: update webapi.txt with write-coordination issues, add TODO note to recovery section of mutable.txt
parent 2aaf0d551a
commit 01469433ef
@@ -78,10 +78,18 @@ versions of the file that different parties are trying to establish as the
 one true current contents. Each simultaneous writer counts as a "competing
 version", as does the previous version of the file. If the count "S" of these
 competing versions is larger than N/k, then the file runs the risk of being
-lost completely. If at least one of the writers remains running after the
-collision is detected, it will attempt to recover, but if S>(N/k) and all
+lost completely. [TODO] If at least one of the writers remains running after
+the collision is detected, it will attempt to recover, but if S>(N/k) and all
 writers crash after writing a few shares, the file will be lost.
 
+Note that Tahoe uses serialization internally to make sure that a single
+Tahoe node will not perform simultaneous modifications to a mutable file. It
+accomplishes this by using a weakref cache of the MutableFileNode (so that
+there will never be two distinct MutableFileNodes for the same file), and by
+forcing all mutable file operations to obtain a per-node lock before they
+run. The Prime Coordination Directive therefore applies to inter-node
+conflicts, not intra-node ones.
+
 
 == Small Distributed Mutable Files ==
 
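The intra-node serialization described in the hunk above (a weakref cache so that a file never has two distinct MutableFileNodes, plus a per-node lock) can be sketched roughly as follows. This is a minimal illustration using Python threads, not Tahoe's actual code: the `NodeCache` class, the `modify` method, and the cap string used to key the cache are all hypothetical, and the real node is built on Twisted rather than `threading`.

```python
import threading
import weakref


class MutableFileNode:
    """Hypothetical stand-in for a mutable file node (not Tahoe's real class)."""

    def __init__(self, cap):
        self.cap = cap
        self.contents = b""
        # Per-node lock: every mutating operation must hold it while it runs.
        self._lock = threading.Lock()

    def modify(self, modifier):
        """Apply modifier(old_contents) -> new_contents under the node lock."""
        with self._lock:
            self.contents = modifier(self.contents)
            return self.contents


class NodeCache:
    """Weak-value cache: at most one live node object per cap."""

    def __init__(self):
        # Entries disappear automatically once nothing else references the node.
        self._nodes = weakref.WeakValueDictionary()
        self._guard = threading.Lock()

    def get(self, cap):
        with self._guard:
            node = self._nodes.get(cap)
            if node is None:
                node = MutableFileNode(cap)
                self._nodes[cap] = node
            return node
```

Because every caller that asks for the same cap receives the same object, the per-node lock is enough to keep two operations on one file from interleaving inside a single process. It does nothing for writers running on other nodes, which is why the directive still applies to inter-node conflicts.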
@@ -1,14 +1,13 @@
 
 = The Tahoe REST-ful Web API =
 
-This document has six sections:
-
-1. the basic API for how to programmatically control your tahoe node
-2. convenience methods
-3. safety and security issues
-4. features for controlling your tahoe node from a standard web browser
-5. debugging and testing features
-6. XML-RPC (coming soon)
+1. Enabling the web-API port
+2. Basic Concepts: GET, PUT, DELETE, POST
+3. URLs, Machine-Oriented Interfaces
+4. Browser Operations: Human-Oriented Interfaces
+5. Welcome / Debug / Status pages
+6. Safety and security issues -- names vs. URIs
+7. Concurrency Issues
 
 
 == Enabling the web-API port ==
@@ -800,7 +799,7 @@ GET / (introducer status)
 clients over time.
 
 
-3. safety and security issues -- names vs. URIs
+== safety and security issues -- names vs. URIs ==
 
 Summary: use explicit file- and dir- caps whenever possible, to reduce the
 potential for surprises when the virtual drive is changed while you aren't
@@ -844,3 +843,45 @@ parent directory, so it isn't any harder to use the URI for this purpose.
 In general, use names if you want "whatever object (whether file or
 directory) is found by following this name (or sequence of names) when my
 request reaches the server". Use URIs if you want "this particular object".
+
+== Concurrency Issues ==
+
+Tahoe uses both mutable and immutable files. Mutable files can be created
+explicitly by doing an upload with ?mutable=true added, or implicitly by
+creating a new directory (since a directory is just a special way to
+interpret a given mutable file).
+
+Mutable files suffer from the same consistency-vs-availability tradeoff that
+all distributed data storage systems face. It is not possible to
+simultaneously achieve perfect consistency and perfect availability in the
+face of network partitions (servers being unreachable or faulty).
+
+Tahoe tries to achieve a reasonable compromise, but there is a basic rule in
+place, known as the Prime Coordination Directive: "Don't Do That". What this
+means is that if write-access to a mutable file is available to several
+parties, then those parties are responsible for coordinating their activities
+to avoid multiple simultaneous updates. This could be achieved by having
+these parties talk to each other and using some sort of locking mechanism, or
+by serializing all changes through a single writer.
+
+The consequences of performing uncoordinated writes can vary. Some of the
+writers may lose their changes, as somebody else wins the race condition. In
+many cases the file will be left in an "unhealthy" state, meaning that there
+are not as many redundant shares as we would like (reducing the reliability
+of the file against server failures). In the worst case, the file can be left
+in such an unhealthy state that no version is recoverable, even the old ones.
+It is this small possibility of data loss that prompts us to issue the Prime
+Coordination Directive.
+
+Tahoe nodes implement internal serialization to make sure that a single Tahoe
+node cannot conflict with itself. For example, it is safe to issue two
+directory modification requests to a single tahoe node's webapi server at the
+same time, because the Tahoe node will internally delay one of them until
+after the other has finished being applied. (This feature was introduced in
+Tahoe-1.1; back with Tahoe-1.0 the web client was responsible for serializing
+web requests themselves).
+
+For more details, please see the "Consistency vs Availability" and "The Prime
+Coordination Directive" sections of mutable.txt, in the same directory as
+this file.
+
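One way to follow the Prime Coordination Directive from client code is the last option the text names: serialize all changes through a single writer. The sketch below is only an illustration of that idea under stated assumptions. The `SingleWriter` class and the `apply_update` callback are hypothetical names, not part of Tahoe; in practice the callback would be whatever function issues the write request to one node's webapi.

```python
import queue
import threading


class SingleWriter:
    """Funnel every update for one mutable file through one worker thread.

    `apply_update` is a hypothetical callback supplied by the caller; it is
    invoked for one update at a time, in submission order.
    """

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self, apply_update):
        self._apply = apply_update
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, update):
        """Queue an update; safe to call from any thread."""
        self._queue.put(update)

    def close(self):
        """Apply everything already queued, then stop the worker."""
        self._queue.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            update = self._queue.get()
            if update is self._STOP:
                break
            # Only this thread ever performs a write, so writes cannot overlap.
            self._apply(update)
```

Any number of producers can call `submit`, but only the worker thread ever performs a write, so two updates never run at the same time. This coordinates writers within one process only; parties on separate machines would still need to agree on which of them is the writer.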