tahoe-lafs/roadmap.txt


['*' means complete]

Connection Management:
*v1: foolscap, no relay, live=connected-to-introducer, broadcast updates, fully connected topology
*v2: configurable IP address -- http://allmydata.org/trac/tahoe/ticket/22
 v3: live != connected-to-introducer, connect on demand
 v4: decentralized introduction -- http://allmydata.org/trac/tahoe/ticket/68
 v5: relay?

File Encoding:
*v1: single-segment, no merkle trees
*v2: multiple-segment (LFE)
*v3: merkle tree to verify each share
*v4: merkle tree to verify each segment
*v5: merkle tree on plaintext and crypttext: incremental validation
 v6: only retrieve the minimal number of hashes instead of all of them
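
The v4 and v6 items can be illustrated together. The sketch below builds a
merkle tree over segments and then verifies one segment using only the
sibling hashes on its path to the root, which is the "minimal number of
hashes" idea in v6. SHA-256, the "leaf:"/"node:"/"pad" tags, and the
pad-to-a-power-of-two rule are assumptions made for this example, not the
project's actual hash layout.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(segments):
    """Return tree levels: levels[0] holds leaf hashes, levels[-1] is [root].

    Assumes at least one segment.
    """
    level = [h(b"leaf:" + s) for s in segments]
    # Pad to a power of two so every node has exactly one sibling.
    while len(level) & (len(level) - 1):
        level.append(h(b"pad"))
    levels = [level]
    while len(level) > 1:
        level = [h(b"node:" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def hash_chain(levels, index):
    """Sibling hashes needed to check leaf `index`: one per level below the root."""
    chain = []
    for level in levels[:-1]:
        chain.append(level[index ^ 1])  # sibling differs in the lowest bit
        index //= 2
    return chain

def verify_segment(segment, index, chain, root):
    """Recompute the root from one segment plus its sibling chain."""
    node = h(b"leaf:" + segment)
    for sibling in chain:
        if index % 2 == 0:
            node = h(b"node:" + node + sibling)
        else:
            node = h(b"node:" + sibling + node)
        index //= 2
    return node == root
```

With N leaves, a downloader that trusts the root needs only about log2(N)
hashes per segment rather than all of them.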

Share Encoding:
*v1: fake it (replication)
*v2: PyRS
*v2.5: ICodec-based codecs, but still using replication
*v3: C-based Reed-Solomon

URI:
*v1: really big
*v2: store URI Extension with shares
*v3: derive storage index from readkey
 v4: perhaps derive more information from version and filesize, to remove
     codec_name, codec_params, tail_codec_params, needed_shares,
     total_shares, segment_size from the URI Extension
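
A minimal sketch of the v3 idea, deriving the storage index from the
readkey with a one-way hash so the URI does not have to carry both. The
hash function, the tag string, and the 16-byte truncation are all
assumptions chosen for illustration; the roadmap does not specify them.

```python
import hashlib

# Hypothetical domain-separation tag, not taken from the project.
TAG = b"example-storage-index-v1:"

def storage_index_from_readkey(readkey: bytes, length: int = 16) -> bytes:
    """Derive a fixed-length storage index from the readkey.

    The derivation is one-way: a server that sees the storage index
    learns nothing useful about the readkey.
    """
    return hashlib.sha256(TAG + readkey).digest()[:length]
```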

Upload Peer Selection:
*v1: permuted peer list, consistent hash
*v2: permute peers by verifierid and arrange around ring, intermixed with
     shareids on the same range, each share goes to the
     next-clockwise-available peer
 v3: reliability/goodness-point counting?
 v4: denver airport (chord)?
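
The v2 scheme might look roughly like the sketch below. It makes several
assumptions that the roadmap does not state: peers are positioned by
SHA-256 of the verifierid and peerid, shares are spaced evenly around the
ring, and "available" simply means "has not yet been given a share of this
file".

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 256  # size of the SHA-256 output space

def ring_position(verifierid: bytes, peerid: bytes) -> int:
    """Permute a peer onto the ring; the permutation differs per file."""
    digest = hashlib.sha256(verifierid + b"|" + peerid).digest()
    return int.from_bytes(digest, "big")

def place_shares(peerids, verifierid, total_shares):
    """Map share number -> peerid, walking clockwise from each share."""
    ring = sorted(peerids, key=lambda pid: ring_position(verifierid, pid))
    positions = [ring_position(verifierid, pid) for pid in ring]
    placement = {}
    used = set()
    for sharenum in range(total_shares):
        share_pos = sharenum * RING_SIZE // total_shares
        start = bisect_left(positions, share_pos)
        # Next-clockwise-available peer, wrapping around the ring.
        for step in range(len(ring)):
            pid = ring[(start + step) % len(ring)]
            if pid not in used:
                used.add(pid)
                placement[sharenum] = pid
                break
    return placement
```

If there are more shares than peers, the extra shares are left unplaced in
this sketch; a real uploader would need a policy for that case.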

Download Peer Selection:
*v1: ask all peers
 v2: permute peers and shareids as in upload, ask next-clockwise peers first
     (the "A" list), if necessary ask the ones after them, etc.
 v3: denver airport?
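
One possible reading of v2, as a sketch: the downloader rebuilds the same
ring the uploader used, asks the first clockwise peer after each share
position (the "A" list), and falls back to the peers after them in later
rounds. The hash, the even share spacing, and the `rounds` parameter are
assumptions for illustration.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 256

def ring_position(verifierid: bytes, peerid: bytes) -> int:
    digest = hashlib.sha256(verifierid + b"|" + peerid).digest()
    return int.from_bytes(digest, "big")

def query_rounds(peerids, verifierid, total_shares, rounds=2):
    """Return a list of rounds; rounds[0] is the "A" list of peers to ask."""
    ring = sorted(peerids, key=lambda pid: ring_position(verifierid, pid))
    if not ring:
        return []
    positions = [ring_position(verifierid, pid) for pid in ring]
    asked = set()
    result = []
    for depth in range(rounds):
        this_round = []
        for sharenum in range(total_shares):
            share_pos = sharenum * RING_SIZE // total_shares
            start = bisect_left(positions, share_pos)
            pid = ring[(start + depth) % len(ring)]
            if pid not in asked:  # never ask the same peer twice
                asked.add(pid)
                this_round.append(pid)
        result.append(this_round)
    return result
```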

Directory/Filesystem Maintenance:
*v1: vdrive-based tree of MutableDirectoryNodes, persisted to vdrive's disk,
     no accounts
*v2: single-host dirnodes, one tree per user, plus one global mutable space
*v3: distributed storage for dirnodes
 v4: maintain file manifest, delete on remove
 v5: figure out accounts, users, quotas, snapshots, versioning, etc

Checker/Repairer:
*v1: none
 v1.5: maintain file manifest
 v2: centralized checker, repair agent
 v3: nodes also check their own files

Storage:
*v1: no deletion, one directory per verifierid, no owners of shares,
     leases never expire
*v2: multiple shares per verifierid [zooko]
*v3: disk space limits on storage servers -- ticket #34
 v4: deletion
 v5: leases expire, delete expired data on demand, multiple owners per share

UI:
*v1: read-only webish (nevow, URLs are filepaths)
*v2: read/write webish, mkdir, del (files)
*v2.5: del (directories)
*v3: CLI tool
 v4: FUSE (linux) -- http://allmydata.org/trac/tahoe/ticket/36
 v5: WebDAV

Operations/Deployment/Doc/Free Software/Community:
 - move this file into the wiki?

back pocket ideas:
 when nodes are unable to reach storage servers, make a note of it and
 eventually inform the verifier/checker. the verifier/checker then puts the
 server under observation, or otherwise looks for differences between the
 server's self-reported availability and the experiences of other nodes

 store filetable URI in the first 10 peers that appear after your own nodeid
 each entry has a sequence number, maybe a timestamp
 on recovery, find the newest
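
 A rough sketch of this idea, assuming nodeids are byte strings that sort
 into ring order and that each stored entry is a (seqnum, timestamp, uri)
 tuple. Both helper names are hypothetical.

```python
from bisect import bisect_right

def successor_peers(all_nodeids, my_nodeid, count=10):
    """The first `count` peers that appear after my_nodeid, wrapping around."""
    ring = sorted(n for n in all_nodeids if n != my_nodeid)
    start = bisect_right(ring, my_nodeid)
    return [ring[(start + i) % len(ring)]
            for i in range(min(count, len(ring)))]

def newest_entry(entries):
    """Pick the entry with the highest sequence number.

    The timestamp only breaks ties between equal sequence numbers.
    """
    return max(entries, key=lambda e: (e[0], e[1]))
```

 On recovery, a node would recompute successor_peers() from its own nodeid,
 fetch whatever entries those peers still hold, and keep newest_entry().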

 multiple categories of leases:
  1: committed leases -- we will not delete these in any case, but will instead
     tell an uploader that we are full
   1a: active leases
   1b: in-progress leases (partially filled, not closed, pb connection is
       currently open)
  2: uncommitted leases -- we will delete these in order to make room for new
     lease requests
   2a: interrupted leases (partially filled, not closed, pb connection is
       currently not open, but they might come back)
   2b: expired leases

  (I'm not sure about the precedence of these last two. Probably deleting
  expired leases instead of deleting interrupted leases would be okay.)
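
  The categories above could be sketched as follows. The eviction order
  (expired before interrupted) follows the tentative preference in the note
  above; the class names, fields, and byte-count sizes are assumptions for
  illustration.

```python
from dataclasses import dataclass
from enum import Enum

class LeaseState(Enum):
    ACTIVE = "1a"        # committed
    IN_PROGRESS = "1b"   # committed, connection currently open
    INTERRUPTED = "2a"   # uncommitted, uploader might come back
    EXPIRED = "2b"       # uncommitted

# Only uncommitted leases are ever deleted; expired ones go first.
EVICTION_ORDER = [LeaseState.EXPIRED, LeaseState.INTERRUPTED]

@dataclass
class Lease:
    name: str
    state: LeaseState
    size: int  # bytes occupied on disk

def make_room(leases, needed):
    """Return the leases to delete to free `needed` bytes.

    Returns None when deleting every uncommitted lease would still not
    free enough space, in which case the server tells the uploader that
    it is full.
    """
    victims = []
    freed = 0
    for state in EVICTION_ORDER:
        for lease in leases:
            if freed >= needed:
                return victims
            if lease.state is state:
                victims.append(lease)
                freed += lease.size
    return victims if freed >= needed else None
```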

big questions:
 convergence?
 peer list maintenance: lots of entries