.. -*- coding: utf-8-with-signature-unix; fill-column: 77 -*-

********************************
Using Tahoe as a key-value store
********************************

There are several ways you could use Tahoe-LAFS as a key-value store.

Looking only at things that are *already implemented*, there are three
options:

1. Immutable files

   API:

    * key ← put(value)

      This is spelled "`PUT /uri`_" in the API.

      Note: the user (client code) of this API does not get to choose the key!
      The key is determined programmatically using secure hash functions and
      encryption of the value and of the optional "added convergence secret".

    * value ← get(key)

      This is spelled "`GET /uri/$FILECAP`_" in the API. "$FILECAP" is the
      key.

   For details, see "immutable files" in :doc:`performance`, but in summary:
   the performance is not great but not bad.

   That document doesn't mention that if the size of the A-byte mutable file
   is less than or equal to `55 bytes`_ then the performance cost is much
   smaller, because the value gets packed into the key. Added a ticket:
   `#2226`_.

2. Mutable files

   API:

    * key ← create()

      This is spelled "`PUT /uri?format=mdmf`_".

      Note: again, the key cannot be chosen by the user! The key is
      determined programmatically using secure hash functions and RSA public
      key pair generation.

    * set(key, value)

    * value ← get(key)

      This is spelled "`GET /uri/$FILECAP`_". Again, the "$FILECAP" is the
      key. This is the same API as for getting the value from an immutable,
      above. Whether the value you get this way is immutable (i.e. it will
      always be the same value) or mutable (i.e. an authorized person can
      change what value you get when you read) depends on the type of the
      key.

   Again, for details, see "mutable files" in :doc:`performance` (and
   `these tickets`_ about how that doc is incomplete), but in summary, the
   performance of the create() operation is *terrible*! (It involves
   generating a 2048-bit RSA key pair.) The performance of the set and get
   operations are probably merely not great but not bad.

3. Directories

   API:

    * directory ← create()

      This is spelled "`POST /uri?t=mkdir`_".

      :doc:`performance` does not mention directories (`#2228`_), but in order
      to understand the performance of directories you have to understand how
      they are implemented. Mkdir creates a new mutable file, exactly the
      same, and with exactly the same performance, as the "create() mutable"
      above.

    * set(directory, key, value)

      This is spelled "`PUT /uri/$DIRCAP/[SUBDIRS../]FILENAME`_". "$DIRCAP"
      is the directory, "FILENAME" is the key. The value is the body of the
      HTTP PUT request. The part about "[SUBDIRS../]" in there is for
      optional nesting which you can ignore for the purposes of this
      key-value store.

      This way, you *do* get to choose the key to be whatever you want (an
      arbitrary unicode string).

      To understand the performance of ``PUT /uri/$directory/$key``,
      understand that this proceeds in two steps: first it uploads the value
      as an immutable file, exactly the same as the "put(value)" API from the
      immutable API above. So right there you've already paid exactly the
      same cost as if you had used that API. Then after it has finished
      uploading that, and it has the immutable file cap from that operation
      in hand, it downloads the entire current directory, changes it to
      include the mapping from key to the immutable file cap, and re-uploads
      the entire directory. So that has a cost which is easy to understand:
      you have to download and re-upload the entire directory, which is the
      entire set of mappings from user-chosen keys (Unicode strings) to
      immutable file caps. Each entry in the directory occupies something on
      the order of 300 bytes.

      So the "set()" call from this directory-based API has obviously much
      worse performance than the the equivalent "set()" calls from the
      immutable-file-based API or the mutable-file-based API. This is not
      necessarily worse overall than the performance of the
      mutable-file-based API if you take into account the cost of the
      necessary create() calls.

    * value ← get(directory, key)

      This is spelled "`GET /uri/$DIRCAP/[SUBDIRS../]FILENAME`_". As above,
      "$DIRCAP" is the directory, "FILENAME" is the key.

      The performance of this is determined by the fact that it first
      downloads the entire directory, then finds the immutable filecap for
      the given key, then does a GET on that immutable filecap. So again,
      it is strictly worse than using the immutable file API (about twice
      as bad, if the directory size is similar to the value size).

What about ways to use LAFS as a key-value store that are not yet
implemented? Well, Zooko has lots of ideas about ways to extend Tahoe-LAFS to
support different kinds of storage APIs or better performance. One that he
thinks is pretty promising is just the Keep It Simple, Stupid idea of "store a
sqlite db in a Tahoe-LAFS mutable". ☺

.. _PUT /uri: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#writing-uploading-a-file

.. _GET /uri/$FILECAP: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#viewing-downloading-a-file

.. _55 bytes: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/immutable/upload.py?rev=196bd583b6c4959c60d3f73cdcefc9edda6a38ae#L1504

.. _PUT /uri?format=mdmf: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#writing-uploading-a-file

.. _#2226: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2226

.. _these tickets: https://tahoe-lafs.org/trac/tahoe-lafs/query?status=assigned&status=new&status=reopened&keywords=~doc&description=~performance.rst&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=milestone&order=priority

.. _POST /uri?t=mkdir: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#creating-a-new-directory

.. _#2228: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2228

.. _PUT /uri/$DIRCAP/[SUBDIRS../]FILENAME: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#creating-a-new-directory

.. _GET /uri/$DIRCAP/[SUBDIRS../]FILENAME: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#reading-a-file