diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst
index fb327b0a8..eb5154e45 100644
--- a/docs/proposed/http-storage-node-protocol.rst
+++ b/docs/proposed/http-storage-node-protocol.rst
@@ -24,11 +24,21 @@ Glossary
    storage server
      a Tahoe-LAFS process configured to offer storage and reachable over the network for store and retrieve operations
 
+   storage service
+     a Python object held in memory in the storage server which provides the implementation of the storage protocol
+
    introducer
      a Tahoe-LAFS process at a known location configured to re-publish announcements about the location of storage servers
 
    fURL
      a self-authenticating URL-like string which can be used to locate a remote object using the Foolscap protocol
+     (the storage service is an example of such an object)
+
+   NURL
+     a self-authenticating URL-like string almost exactly like a fURL but without being tied to Foolscap
+
+   swissnum
+     a short random string which is part of a fURL and which acts as a shared secret to authorize clients to use a storage service
 
    lease
      state associated with a share informing a storage server of the duration of storage desired by a client
@@ -45,7 +55,7 @@ Glossary
      (sometimes "slot" is considered a synonym for "storage index of a slot")
 
    storage index
-     a short string which can address a slot or a bucket
+     a 16 byte string which can address a slot or a bucket
      (in practice, derived by hashing the encryption key associated with contents of that slot or bucket)
 
    write enabler
@@ -128,6 +138,8 @@ The Foolscap-based protocol offers:
 * A careful configuration of the TLS connection parameters *may* also offer **forward secrecy**.
   However, Tahoe-LAFS' use of Foolscap takes no steps to ensure this is the case.
 
+* **Storage authorization** by way of a capability contained in the fURL addressing a storage service.
+
 Discussion
 !!!!!!!!!!
 
@@ -158,6 +170,10 @@ there is no way to write data which appears legitimate to a legitimate client).
 Therefore, **message confidentiality** is necessary when exchanging these secrets.
 **Forward secrecy** is preferred so that an attacker recording an exchange today cannot launch this attack at some future point after compromising the necessary keys.
 
+A storage service offers service only to some clients.
+A client proves their authorization to use the storage service by presenting a shared secret taken from the fURL.
+In this way **storage authorization** is performed to prevent disallowed parties from consuming any storage resources.
+
 Functionality
 -------------
 
@@ -214,6 +230,10 @@ Additionally,
 by continuing to interact using TLS,
 Bob's client and Alice's storage node are assured of both **message authentication** and **message confidentiality**.
 
+Bob's client further inspects the fURL for the *swissnum*.
+When Bob's client issues HTTP requests to Alice's storage node it includes the *swissnum* in its requests.
+**Storage authorization** has been achieved.
+
 .. note::
 
    Foolscap TubIDs are 20 bytes (SHA1 digest of the certificate).
@@ -343,6 +363,12 @@ one branch contains all of the share data;
 another branch contains all of the lease data;
 etc.
 
+Authorization is required for all endpoints.
+The standard HTTP authorization protocol is used.
+The authentication *type* used is ``Tahoe-LAFS``.
+The swissnum from the NURL used to locate the storage service is used as the *credentials*.
+If credentials are not presented or the swissnum is not associated with a storage service then no storage processing is performed and the request receives an ``UNAUTHORIZED`` response.
+
 General
 ~~~~~~~
 
@@ -380,6 +406,10 @@ then the expiration time of that lease will be changed to 31 days after the time
 If it does not match an existing lease
 then a new lease will be created with this ``renew-secret`` which expires 31 days after the time of this operation.
 
+``renew-secret`` and ``cancel-secret`` values must be 32 bytes long.
+The server treats them as opaque values.
+:ref:`Share Leases` gives details about how the Tahoe-LAFS storage client constructs these values.
+
 In these cases the response is ``NO CONTENT`` with an empty body.
 
 It is possible that the storage server will have no shares for the given ``storage_index`` because:
diff --git a/docs/specifications/derive_renewal_secret.py b/docs/specifications/derive_renewal_secret.py
new file mode 100644
index 000000000..75009eda4
--- /dev/null
+++ b/docs/specifications/derive_renewal_secret.py
@@ -0,0 +1,87 @@
+
+"""
+This is a reference implementation of the lease renewal secret derivation
+protocol in use by Tahoe-LAFS clients as of 1.16.0.
+"""
+
+from allmydata.util.base32 import (
+    a2b as b32decode,
+    b2a as b32encode,
+)
+from allmydata.util.hashutil import (
+    tagged_hash,
+    tagged_pair_hash,
+)
+
+
+def derive_renewal_secret(lease_secret: bytes, storage_index: bytes, tubid: bytes) -> bytes:
+    assert len(lease_secret) == 32
+    assert len(storage_index) == 16
+    assert len(tubid) == 20
+
+    bucket_renewal_tag = b"allmydata_bucket_renewal_secret_v1"
+    file_renewal_tag = b"allmydata_file_renewal_secret_v1"
+    client_renewal_tag = b"allmydata_client_renewal_secret_v1"
+
+    client_renewal_secret = tagged_hash(lease_secret, client_renewal_tag)
+    file_renewal_secret = tagged_pair_hash(
+        file_renewal_tag,
+        client_renewal_secret,
+        storage_index,
+    )
+    peer_id = tubid
+
+    return tagged_pair_hash(bucket_renewal_tag, file_renewal_secret, peer_id)
+
+def demo():
+    secret = b32encode(derive_renewal_secret(
+        b"lease secretxxxxxxxxxxxxxxxxxxxx",
+        b"storage indexxxx",
+        b"tub idxxxxxxxxxxxxxx",
+    )).decode("ascii")
+    print("An example renewal secret: {}".format(secret))
+
+def test():
+    # These test vectors were created by instrumenting Tahoe-LAFS
+    # bb57fcfb50d4e01bbc4de2e23dbbf7a60c004031 to emit `self.renew_secret` in
+    # allmydata.immutable.upload.ServerTracker.query and then uploading a
+    # couple files to a couple different storage servers.
+    test_vector = [
+        dict(lease_secret=b"boity2cdh7jvl3ltaeebuiobbspjmbuopnwbde2yeh4k6x7jioga",
+             storage_index=b"vrttmwlicrzbt7gh5qsooogr7u",
+             tubid=b"v67jiisoty6ooyxlql5fuucitqiok2ic",
+             expected=b"osd6wmc5vz4g3ukg64sitmzlfiaaordutrez7oxdp5kkze7zp5zq",
+        ),
+        dict(lease_secret=b"boity2cdh7jvl3ltaeebuiobbspjmbuopnwbde2yeh4k6x7jioga",
+             storage_index=b"75gmmfts772ww4beiewc234o5e",
+             tubid=b"v67jiisoty6ooyxlql5fuucitqiok2ic",
+             expected=b"35itmusj7qm2pfimh62snbyxp3imreofhx4djr7i2fweta75szda",
+        ),
+        dict(lease_secret=b"boity2cdh7jvl3ltaeebuiobbspjmbuopnwbde2yeh4k6x7jioga",
+             storage_index=b"75gmmfts772ww4beiewc234o5e",
+             tubid=b"lh5fhobkjrmkqjmkxhy3yaonoociggpz",
+             expected=b"srrlruge47ws3lm53vgdxprgqb6bz7cdblnuovdgtfkqrygrjm4q",
+        ),
+        dict(lease_secret=b"vacviff4xfqxsbp64tdr3frg3xnkcsuwt5jpyat2qxcm44bwu75a",
+             storage_index=b"75gmmfts772ww4beiewc234o5e",
+             tubid=b"lh5fhobkjrmkqjmkxhy3yaonoociggpz",
+             expected=b"b4jledjiqjqekbm2erekzqumqzblegxi23i5ojva7g7xmqqnl5pq",
+        ),
+    ]
+
+    for n, item in enumerate(test_vector):
+        derived = b32encode(derive_renewal_secret(
+            b32decode(item["lease_secret"]),
+            b32decode(item["storage_index"]),
+            b32decode(item["tubid"]),
+        ))
+        assert derived == item["expected"] , \
+            "Test vector {} failed: {} (expected) != {} (derived)".format(
+                n,
+                item["expected"],
+                derived,
+            )
+    print("{} test vectors validated".format(len(test_vector)))
+
+test()
+demo()
diff --git a/docs/specifications/index.rst b/docs/specifications/index.rst
index 2029c9e5a..e813acf07 100644
--- a/docs/specifications/index.rst
+++ b/docs/specifications/index.rst
@@ -14,5 +14,6 @@ the data formats used by Tahoe.
    URI-extension
    mutable
    dirnodes
+   lease
    servers-of-happiness
    backends/raic
diff --git a/docs/specifications/lease.rst b/docs/specifications/lease.rst
new file mode 100644
index 000000000..16adef0a7
--- /dev/null
+++ b/docs/specifications/lease.rst
@@ -0,0 +1,69 @@
+.. -*- coding: utf-8 -*-
+
+.. _share leases:
+
+Share Leases
+============
+
+A lease is a marker attached to a share indicating that some client has asked for that share to be retained for some amount of time.
+The intent is to allow clients and servers to collaborate to determine which data should still be retained and which can be discarded to reclaim storage space.
+Zero or more leases may be attached to any particular share.
+
+Renewal Secrets
+---------------
+
+Each lease is uniquely identified by its **renewal secret**.
+This is a 32 byte string which can be used to extend the validity period of that lease.
+
+To a storage server a renewal secret is an opaque value which is only ever compared to other renewal secrets to determine equality.
+
+Storage clients will typically want to follow a scheme to deterministically derive the renewal secret for a particular share from information the client already holds about that share.
+This allows a client to maintain and renew a single long-lived lease without maintaining additional local state.
+
+The scheme in use in Tahoe-LAFS as of 1.16.0 is as follows.
+
+* The **netstring encoding** of a byte string is the concatenation of:
+
+  * the ASCII encoding of the base 10 representation of the length of the string
+  * ``":"``
+  * the string itself
+  * ``","``
+
+* The **sha256d digest** is the **sha256 digest** of the **sha256 digest** of a string.
+* The **sha256d tagged digest** is the **sha256d digest** of the concatenation of the **netstring encoding** of one string with one other unmodified string.
+* The **sha256d tagged pair digest** is the **sha256d digest** of the concatenation of the **netstring encodings** of each of three strings.
+* The **bucket renewal tag** is ``"allmydata_bucket_renewal_secret_v1"``.
+* The **file renewal tag** is ``"allmydata_file_renewal_secret_v1"``.
+* The **client renewal tag** is ``"allmydata_client_renewal_secret_v1"``.
+* The **lease secret** is a 32 byte string, typically randomly generated once and then persisted for all future uses.
+* The **client renewal secret** is the **sha256d tagged digest** of (**lease secret**, **client renewal tag**).
+* The **storage index** is constructed using a capability-type-specific scheme.
+  See ``storage_index_hash`` and ``ssk_storage_index_hash`` calls in ``src/allmydata/uri.py``.
+* The **file renewal secret** is the **sha256d tagged pair digest** of (**file renewal tag**, **client renewal secret**, **storage index**).
+* The **base32 encoding** is ``base64.b32encode`` lowercased and with trailing ``=`` stripped.
+* The **peer id** is the **base32 encoding** of the SHA1 digest of the server's x509 certificate.
+* The **renewal secret** is the **sha256d tagged pair digest** of (**bucket renewal tag**, **file renewal secret**, **peer id**).
+
+A reference implementation is available.
+
+.. literalinclude:: derive_renewal_secret.py
+   :language: python
+   :linenos:
+
+Cancel Secrets
+--------------
+
+Lease cancellation is unimplemented.
+Nevertheless,
+a cancel secret is sent by storage clients to storage servers and stored in lease records.
+
+The scheme for deriving the **cancel secret** in use in Tahoe-LAFS as of 1.16.0 is similar to that used to derive the **renewal secret**.
+
+The differences are:
+
+* Use of **client renewal tag** is replaced by use of **client cancel tag**.
+* Use of **file renewal tag** is replaced by use of **file cancel tag**.
+* Use of **bucket renewal tag** is replaced by use of **bucket cancel tag**.
+* **client cancel tag** is ``"allmydata_client_cancel_secret_v1"``.
+* **file cancel tag** is ``"allmydata_file_cancel_secret_v1"``.
+* **bucket cancel tag** is ``"allmydata_bucket_cancel_secret_v1"``.
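Review note: the cancel secret derivation described in the Cancel Secrets section above can be sketched with only ``hashlib``, without the ``allmydata`` helpers. This is a minimal sketch that follows the prose definitions in ``lease.rst``, not the reference implementation, and it has not been checked against the test vectors in ``derive_renewal_secret.py``. The function names (``netstring``, ``sha256d``, ``tagged_digest``, ``tagged_pair_digest``, ``derive_cancel_secret``) are illustrative assumptions, and ``peer_id`` is used exactly as the caller supplies it.

```python
import hashlib


def netstring(s: bytes) -> bytes:
    # Base 10 length as ASCII, then ":", the string itself, then ",".
    return str(len(s)).encode("ascii") + b":" + s + b","


def sha256d(s: bytes) -> bytes:
    # SHA-256 applied to the (binary) SHA-256 digest of the input.
    return hashlib.sha256(hashlib.sha256(s).digest()).digest()


def tagged_digest(first: bytes, second: bytes) -> bytes:
    # Netstring-encode the first string and append the second unmodified.
    return sha256d(netstring(first) + second)


def tagged_pair_digest(tag: bytes, a: bytes, b: bytes) -> bytes:
    # Netstring-encode all three strings and hash their concatenation.
    return sha256d(netstring(tag) + netstring(a) + netstring(b))


def derive_cancel_secret(lease_secret: bytes, storage_index: bytes, peer_id: bytes) -> bytes:
    # Hypothetical helper: same chain as the renewal secret, with each
    # renewal tag swapped for the corresponding cancel tag.
    client_cancel_secret = tagged_digest(
        lease_secret, b"allmydata_client_cancel_secret_v1"
    )
    file_cancel_secret = tagged_pair_digest(
        b"allmydata_file_cancel_secret_v1", client_cancel_secret, storage_index
    )
    return tagged_pair_digest(
        b"allmydata_bucket_cancel_secret_v1", file_cancel_secret, peer_id
    )
```

Because every step is a pure function of the lease secret, the storage index, and the peer id, a client can recompute the same 32 byte value on demand instead of storing it.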
diff --git a/newsfragments/3774.documentation b/newsfragments/3774.documentation
new file mode 100644
index 000000000..d58105966
--- /dev/null
+++ b/newsfragments/3774.documentation
@@ -0,0 +1 @@
+There is now a specification for the scheme which Tahoe-LAFS storage clients use to derive their lease renewal secrets.
diff --git a/newsfragments/3785.documentation b/newsfragments/3785.documentation
new file mode 100644
index 000000000..4eb268f79
--- /dev/null
+++ b/newsfragments/3785.documentation
@@ -0,0 +1 @@
+The Great Black Swamp specification now describes the required authorization scheme.
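Review note: the authorization scheme added to the Great Black Swamp specification in this patch (authentication type ``Tahoe-LAFS``, swissnum as credentials) could look roughly like the following on the wire. This is a sketch under stated assumptions, not the project's implementation: the patch does not say whether the swissnum is encoded further before it is placed in the header, so it is passed verbatim here, and the function names ``authorization_header`` and ``is_authorized`` are hypothetical.

```python
import hmac


def authorization_header(swissnum: str) -> dict:
    # Assumption: the credentials are the swissnum exactly as it appears
    # in the NURL, with no additional encoding.
    return {"Authorization": "Tahoe-LAFS " + swissnum}


def is_authorized(headers: dict, expected_swissnum: str) -> bool:
    # A server would answer UNAUTHORIZED whenever this returns False.
    value = headers.get("Authorization")
    if value is None:
        return False
    scheme, _, credentials = value.partition(" ")
    # Exact scheme match is used here for simplicity.
    if scheme != "Tahoe-LAFS":
        return False
    # Constant-time comparison avoids leaking how much of the secret matched.
    return hmac.compare_digest(
        credentials.encode("utf-8"), expected_swissnum.encode("utf-8")
    )
```

Comparing with ``hmac.compare_digest`` is a design choice of this sketch: the swissnum is a shared secret, so a timing-safe comparison is the cautious default.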