Merge remote-tracking branch 'origin/master' into 3777.gbs-immutable-download-spec-improvements

Jean-Paul Calderone 2021-09-07 14:10:35 -04:00
commit 9def6e79c9
6 changed files with 190 additions and 1 deletions

@ -24,11 +24,21 @@ Glossary
storage server
a Tahoe-LAFS process configured to offer storage and reachable over the network for store and retrieve operations
storage service
a Python object held in memory in the storage server which provides the implementation of the storage protocol
introducer
a Tahoe-LAFS process at a known location configured to re-publish announcements about the location of storage servers
fURL
a self-authenticating URL-like string which can be used to locate a remote object using the Foolscap protocol
(the storage service is an example of such an object)
NURL
a self-authenticating URL-like string almost exactly like a fURL but without being tied to Foolscap
swissnum
a short random string which is part of a fURL and which acts as a shared secret to authorize clients to use a storage service
lease
state associated with a share informing a storage server of the duration of storage desired by a client
@ -45,7 +55,7 @@ Glossary
(sometimes "slot" is considered a synonym for "storage index of a slot")
storage index
a short string which can address a slot or a bucket
a 16 byte string which can address a slot or a bucket
(in practice, derived by hashing the encryption key associated with contents of that slot or bucket)
write enabler
@ -128,6 +138,8 @@ The Foolscap-based protocol offers:
* A careful configuration of the TLS connection parameters *may* also offer **forward secrecy**.
However, Tahoe-LAFS' use of Foolscap takes no steps to ensure this is the case.
* **Storage authorization** by way of a capability contained in the fURL addressing a storage service.
Discussion
!!!!!!!!!!
@ -158,6 +170,10 @@ there is no way to write data which appears legitimate to a legitimate client).
Therefore, **message confidentiality** is necessary when exchanging these secrets.
**Forward secrecy** is preferred so that an attacker recording an exchange today cannot launch this attack at some future point after compromising the necessary keys.
A storage service offers service only to some clients.
A client proves their authorization to use the storage service by presenting a shared secret taken from the fURL.
In this way **storage authorization** is performed to prevent disallowed parties from consuming any storage resources.
Functionality
-------------
@ -214,6 +230,10 @@ Additionally,
by continuing to interact using TLS,
Bob's client and Alice's storage node are assured of both **message authentication** and **message confidentiality**.
Bob's client further inspects the fURL for the *swissnum*.
When Bob's client issues HTTP requests to Alice's storage node it includes the *swissnum* in its requests.
**Storage authorization** has been achieved.
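As a rough illustration of the inspection step, the sketch below pulls the swissnum out of a fURL-like string.
It assumes the conventional Foolscap layout ``pb://<tubid>@<location hints>/<swissnum>``; the function name ``split_furl`` and the example values are hypothetical and not part of Tahoe-LAFS.

```python
from typing import List, Tuple


def split_furl(furl: str) -> Tuple[str, List[str], str]:
    """Split a fURL-like string into (tub id, location hints, swissnum).

    Assumption: the string looks like pb://<tubid>@<hint>[,<hint>...]/<swissnum>.
    """
    scheme, sep, rest = furl.partition("://")
    if scheme != "pb" or not sep:
        raise ValueError("not a pb:// fURL")
    tubid, at, rest = rest.partition("@")
    hints, slash, swissnum = rest.partition("/")
    if not (tubid and at and slash and swissnum):
        raise ValueError("fURL is missing a tub id or a swissnum")
    return tubid, hints.split(","), swissnum


# Made-up example values, for illustration only.
example = "pb://v67jiisoty6ooyxlql5fuucitqiok2ic@tcp:storage.example.invalid:3456/exampleswissnum"
tubid, hints, swissnum = split_furl(example)
print(swissnum)
```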
.. note::
Foolscap TubIDs are 20 bytes (SHA1 digest of the certificate).
@ -343,6 +363,12 @@ one branch contains all of the share data;
another branch contains all of the lease data;
etc.
Authorization is required for all endpoints.
The standard HTTP authorization protocol is used.
The authentication *type* used is ``Tahoe-LAFS``.
The swissnum from the NURL used to locate the storage service is used as the *credentials*.
If credentials are not presented or the swissnum is not associated with a storage service then no storage processing is performed and the request receives an ``UNAUTHORIZED`` response.
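A minimal sketch of this exchange follows.
It assumes the swissnum is sent verbatim as the credentials (this passage does not say whether any further encoding is applied) and that ``UNAUTHORIZED`` corresponds to HTTP status 401.
The helper names ``authorization_header`` and ``check_authorization`` are hypothetical.

```python
import hmac
from http import HTTPStatus
from typing import Dict, Iterable

AUTH_TYPE = "Tahoe-LAFS"


def authorization_header(swissnum: str) -> Dict[str, str]:
    """Client side: build the header carrying the swissnum as credentials.

    Assumption: the swissnum is sent as-is, with no extra encoding.
    """
    return {"Authorization": "{} {}".format(AUTH_TYPE, swissnum)}


def check_authorization(headers: Dict[str, str], known_swissnums: Iterable[str]) -> HTTPStatus:
    """Server side: decide whether any storage processing may happen."""
    auth_type, _, credentials = headers.get("Authorization", "").partition(" ")
    if auth_type != AUTH_TYPE or not credentials:
        return HTTPStatus.UNAUTHORIZED
    presented = credentials.encode("utf-8")
    for known in known_swissnums:
        # Compare in constant time so timing does not leak the secret.
        if hmac.compare_digest(presented, known.encode("utf-8")):
            return HTTPStatus.OK
    return HTTPStatus.UNAUTHORIZED
```

A request without the header, with a different authentication type, or with an unknown swissnum is rejected before any storage code runs.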
General
~~~~~~~
@ -380,6 +406,10 @@ then the expiration time of that lease will be changed to 31 days after the time
If it does not match an existing lease
then a new lease will be created with this ``renew-secret`` which expires 31 days after the time of this operation.
``renew-secret`` and ``cancel-secret`` values must be 32 bytes long.
The server treats them as opaque values.
:ref:`Share Leases` gives details about how the Tahoe-LAFS storage client constructs these values.
In these cases the response is ``NO CONTENT`` with an empty body.
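The renew-or-create rule and the length requirement can be sketched as follows.
This is an illustration under assumed names (``Lease``, ``add_or_renew_lease``), not the storage server's actual data model.

```python
import hmac
from dataclasses import dataclass
from typing import List

LEASE_DURATION = 31 * 24 * 60 * 60  # 31 days, in seconds
SECRET_LENGTH = 32


@dataclass
class Lease:
    renew_secret: bytes
    cancel_secret: bytes
    expiration: float  # seconds since the epoch


def add_or_renew_lease(leases: List[Lease], renew_secret: bytes,
                       cancel_secret: bytes, now: float) -> Lease:
    """Renew the lease matching renew_secret, or create a new one."""
    for secret in (renew_secret, cancel_secret):
        if len(secret) != SECRET_LENGTH:
            raise ValueError("secrets must be exactly 32 bytes long")
    for lease in leases:
        # Secrets are opaque: they are only ever compared for equality.
        if hmac.compare_digest(lease.renew_secret, renew_secret):
            lease.expiration = now + LEASE_DURATION
            return lease
    lease = Lease(renew_secret, cancel_secret, now + LEASE_DURATION)
    leases.append(lease)
    return lease
```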
It is possible that the storage server will have no shares for the given ``storage_index`` because:

@ -0,0 +1,87 @@
"""
This is a reference implementation of the lease renewal secret derivation
protocol in use by Tahoe-LAFS clients as of 1.16.0.
"""

from allmydata.util.base32 import (
    a2b as b32decode,
    b2a as b32encode,
)
from allmydata.util.hashutil import (
    tagged_hash,
    tagged_pair_hash,
)


def derive_renewal_secret(lease_secret: bytes, storage_index: bytes, tubid: bytes) -> bytes:
    assert len(lease_secret) == 32
    assert len(storage_index) == 16
    assert len(tubid) == 20

    bucket_renewal_tag = b"allmydata_bucket_renewal_secret_v1"
    file_renewal_tag = b"allmydata_file_renewal_secret_v1"
    client_renewal_tag = b"allmydata_client_renewal_secret_v1"

    client_renewal_secret = tagged_hash(lease_secret, client_renewal_tag)
    file_renewal_secret = tagged_pair_hash(
        file_renewal_tag,
        client_renewal_secret,
        storage_index,
    )
    peer_id = tubid

    return tagged_pair_hash(bucket_renewal_tag, file_renewal_secret, peer_id)


def demo():
    secret = b32encode(derive_renewal_secret(
        b"lease secretxxxxxxxxxxxxxxxxxxxx",
        b"storage indexxxx",
        b"tub idxxxxxxxxxxxxxx",
    )).decode("ascii")
    print("An example renewal secret: {}".format(secret))


def test():
    # These test vectors created by instrumenting Tahoe-LAFS
    # bb57fcfb50d4e01bbc4de2e23dbbf7a60c004031 to emit `self.renew_secret` in
    # allmydata.immutable.upload.ServerTracker.query and then uploading a
    # couple files to a couple different storage servers.
    test_vector = [
        dict(lease_secret=b"boity2cdh7jvl3ltaeebuiobbspjmbuopnwbde2yeh4k6x7jioga",
             storage_index=b"vrttmwlicrzbt7gh5qsooogr7u",
             tubid=b"v67jiisoty6ooyxlql5fuucitqiok2ic",
             expected=b"osd6wmc5vz4g3ukg64sitmzlfiaaordutrez7oxdp5kkze7zp5zq",
        ),
        dict(lease_secret=b"boity2cdh7jvl3ltaeebuiobbspjmbuopnwbde2yeh4k6x7jioga",
             storage_index=b"75gmmfts772ww4beiewc234o5e",
             tubid=b"v67jiisoty6ooyxlql5fuucitqiok2ic",
             expected=b"35itmusj7qm2pfimh62snbyxp3imreofhx4djr7i2fweta75szda",
        ),
        dict(lease_secret=b"boity2cdh7jvl3ltaeebuiobbspjmbuopnwbde2yeh4k6x7jioga",
             storage_index=b"75gmmfts772ww4beiewc234o5e",
             tubid=b"lh5fhobkjrmkqjmkxhy3yaonoociggpz",
             expected=b"srrlruge47ws3lm53vgdxprgqb6bz7cdblnuovdgtfkqrygrjm4q",
        ),
        dict(lease_secret=b"vacviff4xfqxsbp64tdr3frg3xnkcsuwt5jpyat2qxcm44bwu75a",
             storage_index=b"75gmmfts772ww4beiewc234o5e",
             tubid=b"lh5fhobkjrmkqjmkxhy3yaonoociggpz",
             expected=b"b4jledjiqjqekbm2erekzqumqzblegxi23i5ojva7g7xmqqnl5pq",
        ),
    ]

    for n, item in enumerate(test_vector):
        derived = b32encode(derive_renewal_secret(
            b32decode(item["lease_secret"]),
            b32decode(item["storage_index"]),
            b32decode(item["tubid"]),
        ))
        assert derived == item["expected"], \
            "Test vector {} failed: {} (expected) != {} (derived)".format(
                n,
                item["expected"],
                derived,
            )
    print("{} test vectors validated".format(len(test_vector)))


test()
demo()

@ -14,5 +14,6 @@ the data formats used by Tahoe.
URI-extension
mutable
dirnodes
lease
servers-of-happiness
backends/raic

@ -0,0 +1,69 @@
.. -*- coding: utf-8 -*-
.. _share leases:
Share Leases
============
A lease is a marker attached to a share indicating that some client has asked for that share to be retained for some amount of time.
The intent is to allow clients and servers to collaborate to determine which data should still be retained and which can be discarded to reclaim storage space.
Zero or more leases may be attached to any particular share.
Renewal Secrets
---------------
Each lease is uniquely identified by its **renewal secret**.
This is a 32 byte string which can be used to extend the validity period of that lease.
To a storage server a renewal secret is an opaque value which is only ever compared to other renewal secrets to determine equality.
Storage clients will typically want to follow a scheme to deterministically derive the renewal secret for a particular share from information the client already holds about that share.
This allows a client to maintain and renew a single long-lived lease without maintaining additional local state.
The scheme in use in Tahoe-LAFS as of 1.16.0 is as follows.
* The **netstring encoding** of a byte string is the concatenation of:
  * the ascii encoding of the base 10 representation of the length of the string
  * ``":"``
  * the string itself
  * ``","``
* The **sha256d digest** is the **sha256 digest** of the **sha256 digest** of a string.
* The **sha256d tagged digest** is the **sha256d digest** of the concatenation of the **netstring encoding** of one string (the tag) followed by one other unmodified string.
* The **sha256d tagged pair digest** is the **sha256d digest** of the concatenation, in order, of the **netstring encodings** of each of three strings.
* The **bucket renewal tag** is ``"allmydata_bucket_renewal_secret_v1"``.
* The **file renewal tag** is ``"allmydata_file_renewal_secret_v1"``.
* The **client renewal tag** is ``"allmydata_client_renewal_secret_v1"``.
* The **lease secret** is a 32 byte string, typically randomly generated once and then persisted for all future uses.
* The **client renewal secret** is the **sha256d tagged digest** of (**lease secret**, **client renewal tag**).
* The **storage index** is constructed using a capability-type-specific scheme.
See ``storage_index_hash`` and ``ssk_storage_index_hash`` calls in ``src/allmydata/uri.py``.
* The **file renewal secret** is the **sha256d tagged pair digest** of (**file renewal tag**, **client renewal secret**, **storage index**).
* The **base32 encoding** is ``base64.b32encode`` lowercased and with trailing ``=`` stripped.
* The **peer id** is the **base32 encoding** of the SHA1 digest of the server's x509 certificate.
* The **renewal secret** is the **sha256d tagged pair digest** of (**bucket renewal tag**, **file renewal secret**, **peer id**).
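The definitions above can also be sketched using only the Python standard library, which may help when checking an independent implementation.
The function names here are illustrative and are not the ``allmydata`` API.
The sketch takes ``peer_id`` as bytes; note that the reference implementation passes the 20-byte tub id directly.

```python
import base64
import hashlib

BUCKET_RENEWAL_TAG = b"allmydata_bucket_renewal_secret_v1"
FILE_RENEWAL_TAG = b"allmydata_file_renewal_secret_v1"
CLIENT_RENEWAL_TAG = b"allmydata_client_renewal_secret_v1"


def netstring(s: bytes) -> bytes:
    """Base 10 length, a colon, the string itself, then a comma."""
    return str(len(s)).encode("ascii") + b":" + s + b","


def sha256d(data: bytes) -> bytes:
    """The sha256 digest of the sha256 digest of data."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def tagged_digest(tag: bytes, value: bytes) -> bytes:
    """sha256d over netstring(tag) followed by the unmodified value."""
    return sha256d(netstring(tag) + value)


def tagged_pair_digest(tag: bytes, first: bytes, second: bytes) -> bytes:
    """sha256d over the netstrings of all three strings, in order."""
    return sha256d(netstring(tag) + netstring(first) + netstring(second))


def base32_encode(data: bytes) -> str:
    """Lowercased base32 with trailing '=' padding stripped."""
    return base64.b32encode(data).decode("ascii").lower().rstrip("=")


def derive_renewal_secret(lease_secret: bytes, storage_index: bytes, peer_id: bytes) -> bytes:
    # Argument order follows the list above: the lease secret takes the
    # "tag" position and the client renewal tag is the unmodified string.
    client_secret = tagged_digest(lease_secret, CLIENT_RENEWAL_TAG)
    file_secret = tagged_pair_digest(FILE_RENEWAL_TAG, client_secret, storage_index)
    return tagged_pair_digest(BUCKET_RENEWAL_TAG, file_secret, peer_id)
```

Deriving the secret this way means a client needs only its lease secret, the storage index, and the server's identity to renew a lease; nothing per-share has to be stored locally.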
A reference implementation is available.
.. literalinclude:: derive_renewal_secret.py
   :language: python
   :linenos:
Cancel Secrets
--------------
Lease cancellation is unimplemented.
Nevertheless,
a cancel secret is sent by storage clients to storage servers and stored in lease records.
The scheme for deriving the **cancel secret** in use in Tahoe-LAFS as of 1.16.0 is similar to that used to derive the **renewal secret**.
The differences are:
* Use of **client renewal tag** is replaced by use of **client cancel tag**.
* Use of **file renewal tag** is replaced by use of **file cancel tag**.
* Use of **bucket renewal tag** is replaced by use of **bucket cancel tag**.
* **client cancel tag** is ``"allmydata_client_cancel_secret_v1"``.
* **file cancel tag** is ``"allmydata_file_cancel_secret_v1"``.
* **bucket cancel tag** is ``"allmydata_bucket_cancel_secret_v1"``.
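Because only the tags differ, both derivations can share one code path.
The following is a sketch under that reading of the list above; ``lease_tag`` and ``derive_lease_secret`` are hypothetical names, and the hashing helpers are repeated so that the block stands alone.

```python
import hashlib


def _netstring(s: bytes) -> bytes:
    return str(len(s)).encode("ascii") + b":" + s + b","


def _sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def lease_tag(scope: str, kind: str) -> bytes:
    """Build a tag such as b"allmydata_file_cancel_secret_v1"."""
    if scope not in ("client", "file", "bucket"):
        raise ValueError("unknown scope: {!r}".format(scope))
    if kind not in ("renewal", "cancel"):
        raise ValueError("unknown kind: {!r}".format(kind))
    return "allmydata_{}_{}_secret_v1".format(scope, kind).encode("ascii")


def derive_lease_secret(kind: str, lease_secret: bytes,
                        storage_index: bytes, peer_id: bytes) -> bytes:
    """Derive the renewal or cancel secret; only the three tags differ."""
    client_secret = _sha256d(_netstring(lease_secret) + lease_tag("client", kind))
    file_secret = _sha256d(
        _netstring(lease_tag("file", kind))
        + _netstring(client_secret)
        + _netstring(storage_index)
    )
    return _sha256d(
        _netstring(lease_tag("bucket", kind))
        + _netstring(file_secret)
        + _netstring(peer_id)
    )
```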

@ -0,0 +1 @@
There is now a specification for the scheme which Tahoe-LAFS storage clients use to derive their lease renewal secrets.

@ -0,0 +1 @@
The Great Black Swamp specification now describes the required authorization scheme.