An interesting side effect of this Tahoe snapshot object is therefore that a snapshot has no
author. The only notion of identity in the Magic-Folder system is the write capability of the user's DMD.
The snapshot object is an immutable directory which looks like this::

    content -> immutable cap to file content
    parent0 -> immutable cap to a parent snapshot object
    parent1..N -> more parent snapshots

Snapshot Author Identity
------------------------
Snapshot identity might become an important feature so that bad actors
can be recognized and other clients can stop "subscribing" to (polling for) updates from them.
Perhaps snapshots could be signed by the user's Magic-Folder write key for this purpose?
It is probably a bad idea to reuse the write-cap key for this. It would be better to
introduce ed25519 identity keys which can (optionally) sign snapshot contents, with the
signature stored as another member of the immutable directory.
Conflict Resolution
-------------------
detection of conflicts
``````````````````````
A Magic-Folder client updates a given file's current snapshot link to a snapshot which is a descendant
of the previous snapshot. For a given file, let's say "file1", Alice can detect that Bob's DMD has a "file1"
that links to a snapshot which conflicts. Two snapshots conflict if neither is an ancestor of the other.
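The ancestry test can be sketched in a few lines. This is a minimal illustration and not Magic-Folder code: the ``Snapshot`` class, ``is_ancestor`` and ``conflicts`` are hypothetical names, plain strings stand in for capabilities, and parents are held in memory rather than downloaded.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical in-memory stand-in for an immutable snapshot object."""
    content: str                          # stands in for the content cap
    parents: Tuple["Snapshot", ...] = ()  # parent0..parentN


def is_ancestor(older: Snapshot, newer: Snapshot) -> bool:
    """True if `older` is `newer` itself or reachable via parent links."""
    stack, seen = [newer], set()
    while stack:
        snap = stack.pop()
        if snap == older:
            return True
        if snap not in seen:
            seen.add(snap)
            stack.extend(snap.parents)
    return False


def conflicts(ours: Snapshot, theirs: Snapshot) -> bool:
    """Two snapshots conflict if neither is an ancestor of the other."""
    return not is_ancestor(ours, theirs) and not is_ancestor(theirs, ours)


# Alice and Bob both edit "file1" starting from the same snapshot.
base = Snapshot("v1")
alice = Snapshot("alice's edit", (base,))
bob = Snapshot("bob's edit", (base,))
merged = Snapshot("merged", (alice, bob))
```

Here ``conflicts(alice, bob)`` is true because the two edits only share ``base``, while ``conflicts(merged, bob)`` is false because ``merged`` names both edits as parents.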
a possible UI for resolving conflicts
`````````````````````````````````````
If Alice links a conflicting snapshot object for a file named "file1",
Bob and Carole will see a file in their Magic-Folder called "file1.conflicted.Alice".
Alice conversely will see an additional file called "file1.conflicted.previous".
If Alice wishes to resolve the conflict with her new version of the file then
she simply deletes the file called "file1.conflicted.previous". If she wants to
choose the other version then she moves it into place::

    mv file1.conflicted.previous file1

This scheme works for any number of conflicts. Bob, for instance, could choose
the same resolution for the conflict, like this::

    mv file1.conflicted.Alice file1

Deletion propagation and eventual Garbage Collection
3. modify mutable directory (DMD) to link to the immutable snapshot object
remote changes
``````````````
Our old scheme requires one remote Tahoe-LAFS operation per remote file modification (not counting the polling of the DMD):
1. Download new file content
Our new scheme requires a minimum of two remote operations (not counting the polling of the DMD) for conflicting downloads, or three remote operations for overwrite downloads:
1. Download new snapshot object
2. Download the content it points to
3. If the download is an overwrite, modify the DMD to indicate that the downloaded version is their current version.
If the new snapshot is not a direct descendant of our current snapshot or the other party's previous snapshot we saw, we will also need to download more snapshots to determine if it is a conflict or an overwrite. However, those can be done in
parallel with the content download since we will need to download the content in either case.
While the old scheme is obviously more efficient, we think that the properties provided by the new scheme make it worth the additional cost.
Physical updates to the DMD obviously need to be serialized, so multiple logical updates should be combined when an update is already in progress.
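One way to combine logical updates is to queue them while a write is running and flush the queue as a single batch afterwards. The sketch below is an assumption about how that could look, not the Magic-Folder implementation: ``DMDUpdater`` and the ``write_dmd`` callable are hypothetical, and the code is synchronous, whereas a real client would do this around asynchronous Tahoe-LAFS operations.

```python
class DMDUpdater:
    """Hypothetical helper that coalesces logical DMD updates."""

    def __init__(self, write_dmd):
        self._write_dmd = write_dmd  # callable taking {filename: snapshot_cap}
        self._pending = {}           # logical updates not yet written
        self._in_progress = False

    def update(self, filename, snapshot_cap):
        """Record a logical update; start a write unless one is running."""
        self._pending[filename] = snapshot_cap  # later update wins per file
        if not self._in_progress:
            self._flush()

    def _flush(self):
        """Perform physical writes one at a time until nothing is pending."""
        self._in_progress = True
        try:
            while self._pending:
                batch, self._pending = self._pending, {}
                # Updates that arrive during this call land in the new
                # self._pending and go out together in the next write.
                self._write_dmd(batch)
        finally:
            self._in_progress = False
```

With this shape, any number of logical updates arriving during one physical write cost only one further physical write.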
conflict detection and local caching
````````````````````````````````````
Local caching of snapshots is important for performance.
We refer to the client's local snapshot cache as the ``magic-folder db``.
Conflict detection can be expensive because it may require the client
to download many snapshots from the other user's DMD in order to try
and find its own current snapshot or a descendant. The cost of scanning
the remote DMDs should not be very high unless the client conducting the
scan has lots of history to download because of being offline for a long
time while many new snapshots were distributed.
local cache purging policy
``````````````````````````
The client's current snapshot for each file should be cached at all times.
When all clients' views of a file are synchronized (they all have the same
snapshot for that file), no ancestry for that file needs to be cached.
When clients' views of a file are *not* synchronized, the most recent
common ancestor of all clients' snapshots must be kept cached, as must
all intermediate snapshots.
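The purging policy can be expressed as a function from the clients' current snapshots to the set of snapshots that must stay cached. The following is only a sketch under stated assumptions: snapshot identifiers are plain strings, ``parents`` is a hypothetical mapping from a snapshot to its parent snapshots, and where merges make the "most recent common ancestor" non-unique it keeps every such ancestor.

```python
def ancestors_inclusive(snap, parents):
    """Return `snap` together with everything reachable via parent links."""
    seen, stack = set(), [snap]
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(parents.get(s, ()))
    return seen


def snapshots_to_keep(current, parents):
    """`current` maps client nickname -> current snapshot for one file."""
    heads = set(current.values())
    if len(heads) == 1:
        return heads  # all views synchronized: no ancestry is needed
    histories = [ancestors_inclusive(h, parents) for h in heads]
    common = set.intersection(*histories)
    # Common ancestors that lie strictly behind another common ancestor
    # are older than the most recent one(s) and can be purged.
    older = set()
    for c in common:
        older |= ancestors_inclusive(c, parents) - {c}
    return set.union(*histories) - older


# Hypothetical history: X0 <- X <- XA <- XA2, and X <- XB.
parents = {"X0": [], "X": ["X0"], "XA": ["X"], "XA2": ["XA"], "XB": ["X"]}
keep = snapshots_to_keep({"A": "XA2", "B": "XB", "C": "XA2"}, parents)
# keep == {"X", "XA", "XA2", "XB"}; X0 is older than the common ancestor X
```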
Local Merge Property
--------------------
Bob can, in fact, set a pre-existing directory (with files) as his new Magic-Folder directory, resulting
in a merge of the Magic-Folder with Bob's local directory. Filename collisions will result in conflicts
because Bob's new snapshots are not descendants of the existing Magic-Folder file snapshots.
Example: simultaneous update with four parties:
1. A, B, C, D are in sync for file "foo" at snapshot X
2. A and B simultaneously change the file, creating snapshots XA and XB (both descendants of X).
3. C hears about XA first, and D hears about XB first. Both accept an overwrite.
4. All four parties hear about the other update they hadn't heard about yet.
5. Result:
- everyone's local file "foo" has the content pointed to by the snapshot in their DMD's "foo" entry
- A and C's DMDs each have the "foo" entry pointing at snapshot XA
- B and D's DMDs each have the "foo" entry pointing at snapshot XB
- A and C have a local file called foo.conflict-B,D with XB's content
- B and D have a local file called foo.conflict-A,C with XA's content
Later:
- Everyone ignores the conflict and continues updating their local "foo", but slowly enough that there are no further conflicts, so that A and C remain in sync with each other, and B and D remain in sync with each other.
- A and C's foo.conflict-B,D file continues to be updated with the latest version of the file B and D are working on, and vice-versa.
- A and C edit the file at the same time again, causing a new conflict.
- Local files are now:
A: "foo", "foo.conflict-B,D", "foo.conflict-C"
C: "foo", "foo.conflict-B,D", "foo.conflict-A"
B and D: "foo", "foo.conflict-A", "foo.conflict-C"
- Finally, D decides to look at "foo.conflict-A" and "foo.conflict-C", and they manually integrate (or decide to ignore) the differences into their own local file "foo".
- D deletes their conflict files.
- D's DMD now points to a snapshot that is a descendant of everyone else's current snapshot, resolving all conflicts.
- The conflict files on A, B, and C disappear, and everyone's local file "foo" contains D's manually-merged content.
Daira: I think it is too complicated to include multiple nicknames in the .conflict files
(e.g. "foo.conflict-B,D"). It should be sufficient to have one file for each other client,
reflecting that client's latest version, regardless of who else it conflicts with.
Zooko's Design (as interpreted by Daira)
========================================
A version map is a mapping from client nickname to version number.
Definition: a version map M' strictly-follows a mapping M iff for every entry c->v
in M, there is an entry c->v' in M' such that v' >= v, with v' > v for at least one entry.
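The definition can be checked mechanically. The helper below is a hypothetical sketch, with version maps as plain dicts from nickname to integer. It reads the comparison as "no entry of M goes backwards in M' and at least one entry advances", which is an assumption: it is the reading under which an update from (10, 20, 30) to (11, 20, 30) counts as strictly following.

```python
def strictly_follows(m_new, m_old):
    """True if m_new covers every entry of m_old and advances at least one."""
    # A nickname missing from m_new, or a lower version, rules it out.
    if any(c not in m_new or m_new[c] < v for c, v in m_old.items()):
        return False
    # Identical versions everywhere do not count as following.
    return any(m_new[c] > v for c, v in m_old.items())
```

Two maps that each advance a different entry do not follow one another in either direction, which is what marks a conflict.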
Each client maintains a 'local version map' and a 'conflict version map' for each file
in its magic folder db.
If it has never written the file, then the entry for its own nickname in the local version
map is zero. The conflict version map only contains entries for nicknames B where
"$FILENAME.conflict-$B" exists.
When a client A uploads a file, it increments the version for its own nickname in its
local version map for the file, and includes that map as metadata with its upload.
A download by client A from client B is an overwrite iff the downloaded version map
strictly-follows A's local version map for that file; in this case A replaces its local
version map with the downloaded version map. Otherwise it is a conflict, and the
download is put into "$FILENAME.conflict-$B"; in this case A's
local version map remains unchanged, and the entry B->v taken from the downloaded
version map is added to its conflict version map.
If client A deletes or renames a conflict file "$FILENAME.conflict-$B", then A copies
the entry for B from its conflict version map to its local version map, deletes
the entry for B in its conflict version map, and performs another upload (with
incremented version number) of $FILENAME.
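The upload, download and resolution rules above can be put together as a small per-file state object. This is a sketch under assumptions rather than a definitive implementation: ``FileVersionState`` and its method names are hypothetical, no file content is moved, and ``strictly_follows`` uses the reading "no entry goes backwards and at least one advances".

```python
def strictly_follows(m_new, m_old):
    """Assumed reading: covers every entry of m_old and advances at least one."""
    if any(c not in m_new or m_new[c] < v for c, v in m_old.items()):
        return False
    return any(m_new[c] > v for c, v in m_old.items())


class FileVersionState:
    """Hypothetical version-map state kept by one client for one file."""

    def __init__(self, nickname, local_map=None):
        self.nickname = nickname
        # Own entry is zero if this client has never written the file.
        self.local = dict(local_map or {nickname: 0})
        self.conflict = {}  # B -> version held in "$FILENAME.conflict-$B"

    def upload(self):
        """Increment our own version; return the map sent as metadata."""
        self.local[self.nickname] = self.local.get(self.nickname, 0) + 1
        return dict(self.local)

    def download(self, sender, their_map):
        """Classify a download from `sender` and update the maps."""
        if strictly_follows(their_map, self.local):
            self.local = dict(their_map)
            return "overwrite"
        self.conflict[sender] = their_map[sender]
        return "conflict"

    def resolve(self, sender):
        """Called when "$FILENAME.conflict-$sender" is deleted or renamed."""
        self.local[sender] = self.conflict.pop(sender)
        return self.upload()


c = FileVersionState("C", {"A": 10, "B": 20, "C": 30})
first = c.download("A", {"A": 11, "B": 20, "C": 30})    # "overwrite"
second = c.download("B", {"A": 10, "B": 21, "C": 30})   # "conflict"
uploaded = c.resolve("B")   # {"A": 11, "B": 21, "C": 31}
```

The second download is a conflict because its entry for A (10) is behind the 11 that C has already accepted.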
Example:
A, B, C = (10, 20, 30) everyone agrees.
A updates: (11, 20, 30)
B updates: (10, 21, 30)
C will see either A or B first. Both would be an overwrite, if considered alone.