From c612fb7075a8662590ec26e275ce8f94884458d4 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 14 May 2018 14:30:34 -0400 Subject: [PATCH 01/76] initial pass over security section --- docs/proposed/http-storage-node-protocol.rst | 105 +++++++++++++++++++ docs/proposed/index.rst | 1 + 2 files changed, 106 insertions(+) create mode 100644 docs/proposed/http-storage-node-protocol.rst diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst new file mode 100644 index 000000000..60e1afd9e --- /dev/null +++ b/docs/proposed/http-storage-node-protocol.rst @@ -0,0 +1,105 @@ +.. -*- coding: utf-8 -*- + +HTTP Storage Node Protocol +========================== + +The target audience for this document is Tahoe-LAFS developers. +After reading this document, +one should expect to understand how Tahoe-LAFS clients interact over the network with Tahoe-LAFS storage nodes. + +Security +-------- + +Requirements +~~~~~~~~~~~~ + +A client node relies on a storage node to persist certain data until a future retrieval request is made. +In this way, the node is vulnerable to attacks which cause the data not to be persisted. +Though this vulnerability can be mitigated by including redundancy in the share encoding parameters for stored data, it is still sensible to attempt to minimize unnecessary vulnerability to this attack. + +One way to do this is for the client to be confident it the storage node with which it is communicating is really the expected node. +Therefore, the protocol must include some means for cryptographically verifying the identify of the storage node. +The initialization of the client with the correct identity information is out of scope for this protocol +(the system may be trust-on-first-use, there may be a third-party identity broker, etc). + +With confidence that communication is proceeding with the intended storage node, +it must also be possible to trust that data is exchanged without modification. 
+That is, the protocol must include some means to cryptographically verify the integrity of exchanged messages. + +Solutions +~~~~~~~~~ + +Communication with the storage node will take place using TLS 1.2 [#]_. + + * The storage node will present a certificate proving its identity. + * The certificate will include a ``subjectAltName`` containing ... [#]_. + * The certificate will be signed by an entity known to and trusted by the client. + This entity will *not* be a standard web-focused Certificate Authority. + +When connecting to a storage node, +the client will take the following steps to gain confidence it has reached the intended peer: + + * It will perform the usual cryptographic verification of the certificate presented by the storage server + (that is, + that the certificate itself is well-formed, + that the signature it carries is valid, + that the signature was created by a "trusted entity"). + * It will consider the only "trusted entity" to be an entity explicitly configured for the intended storage node + (specifically, it will not considered the standard web-focused Certificate Authorities to be trusted). + * It will check the ``subjectAltName`` against ... [#]_. + +To further clarify, consider this example. +Alice operates a storage node. +Alice generates a Certificate Authority certificate and secures the private key appropriately. +Alice generates a Storage Node certificate and signs it with the Certificate Authority certificate's private key. +Alice prints out the Certificate Authority certificate and storage node URI [#]_ and hands it to Bob. +Bob creates a client node. +Bob configures the client node with the storage node URI and the Certificate Authority certificate received from Alice. + +Bob's client node can now perform a TLS handshake with a server at the address indicated by the storage node URI. +Following the above described validation procedures, +Bob's client node can determine whether it has reached Alice's storage node or not. 
+ +Additionally, +by continuing to interact using TLS, +Bob's client and Alice's storage node are assured of the integrity of the communication. + +Transition +~~~~~~~~~~ + +Storage nodes already possess an x509 certificate. +This is used with Foolscap to provide the same security properties described in the above requirements section. +There are some differences. + + * The certificate is self-signed. + * The certificate has a ``commonName`` of "newpb_thingy". + * The validity of the certificate is determined by checking the certificate digest against a value carried in the fURL. + Only a correctly signed certificate with a matching digest is accepted. + +A mixed-protocol storage node should: + + * Start the Foolscap server as it has always done. + * Start a TLS server dispatching to an HTTP server. + * Use the same certificate as the Foolscap server uses. + * Accept anonymous client connections. + +A mixed-protocol client node should: + + * If it is configured with a storage URI, connect using HTTP over TLS. + * If it is configured with a storage fURL, connect using Foolscap. + If the server version indicates support for the new protocol: + * Attempt to connect using the new protocol. + * Drop the Foolscap connection if this new connection succeeds. + +Client node implementations could cache a successful protocol upgrade. +This would avoid the double connection on subsequent startups. +This is left as a decision for the implementation, though. + +.. [#] What are best practices regarding TLS version? + Would a policy of "use the newest version shared between the two endpoints" be better? + Is it necessary to specify more than a TLS version number here? + For example, should we be specifying a set of ciphers as well? + Or is that a quality of implementation issue rather than a protocol specification issue? +.. [#] TODO +.. [#] TODO +.. [#] URL? IRI? 
diff --git a/docs/proposed/index.rst b/docs/proposed/index.rst index 3211b317f..a052baeff 100644 --- a/docs/proposed/index.rst +++ b/docs/proposed/index.rst @@ -18,3 +18,4 @@ index only lists the files that are in .rst format. magic-folder/remote-to-local-sync magic-folder/user-interface-design magic-folder/multi-party-conflict-detection + http-storage-node-protocol From 53dce7eafc808c3519d91f095a2224460bdb5ed4 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 14 May 2018 15:58:21 -0400 Subject: [PATCH 02/76] first pass over read and write api --- docs/proposed/http-storage-node-protocol.rst | 54 ++++++++++++++++++++ 1 file changed, 54 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 60e1afd9e..2205e2b3a 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -95,6 +95,60 @@ Client node implementations could cache a successful protocol upgrade. This would avoid the double connection on subsequent startups. This is left as a decision for the implementation, though. +Reading +------- + +``GET /v1/storage/:storage_index``: + +Retrieve a mapping describing buckets for the indicated storage index. +The mapping is returned as an encoded structured object +(JSON is used for the example here but is not necessarily the true encoding). +The mapping has share numbers as keys and bucket identifiers as values. +For example:: + + .. XXX Share numbers are logically integers and probably sequential starting from 0. + But JSON cannot encode them as integers if they are keys in a mapping. + Is this really a mapping or would an array (with share number implied by array index) work as well? + + {0: "abcd", 1: "efgh"} + +``GET /v1/buckets/:bucket_id`` + +Read data from the indicated bucket. +The data is returned raw (i.e., ``application/octet-stream``). +Range requests may be made to read only part of a bucket. 
+ +``POST /v1/buckets/:bucket_id/corrupt`` + +Advise the server the share data read from the indicated bucket was corrupt. +The request body includes an human-meaningful string with details about the corruption. +It also includes potentially important details about the share. + +For example:: + + {"share_type": "mutable", "storage_index": "abcd", "share_number": 3, + "reason": "expected hash abcd, got hash efgh"} + +Writing +------- + +``POST /v1/buckets`` + +Create some new buckets in which to store some shares. +Details of the buckets to create are encoded in the request body. +For example:: + + {"storage_index": "abcd", "renew_secret": "efgh", "cancel_secret": "ijkl", + "sharenums": [1, 7, ...], "allocated_size": 12345} + +The response body includes encoded information about the created buckets. +For example:: + + .. XXX Same deal about share numbers as integers/strings here. + But here it's clear we can't just use an array as mentioned above. + {"already_have": [1, ...], + "allocated": {"7": "bucket_id", ...}} + .. [#] What are best practices regarding TLS version? Would a policy of "use the newest version shared between the two endpoints" be better? Is it necessary to specify more than a TLS version number here? From 8e9ba5211864244250d5024a06dbab5a6efe53d7 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 09:07:07 -0400 Subject: [PATCH 03/76] spurious indentation --- docs/proposed/http-storage-node-protocol.rst | 26 ++++++++++---------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 2205e2b3a..ff7cb853c 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -71,25 +71,25 @@ Storage nodes already possess an x509 certificate. This is used with Foolscap to provide the same security properties described in the above requirements section. There are some differences. 
- * The certificate is self-signed. - * The certificate has a ``commonName`` of "newpb_thingy". - * The validity of the certificate is determined by checking the certificate digest against a value carried in the fURL. - Only a correctly signed certificate with a matching digest is accepted. +* The certificate is self-signed. +* The certificate has a ``commonName`` of "newpb_thingy". +* The validity of the certificate is determined by checking the certificate digest against a value carried in the fURL. + Only a correctly signed certificate with a matching digest is accepted. A mixed-protocol storage node should: - * Start the Foolscap server as it has always done. - * Start a TLS server dispatching to an HTTP server. - * Use the same certificate as the Foolscap server uses. - * Accept anonymous client connections. +* Start the Foolscap server as it has always done. +* Start a TLS server dispatching to an HTTP server. + * Use the same certificate as the Foolscap server uses. + * Accept anonymous client connections. A mixed-protocol client node should: - * If it is configured with a storage URI, connect using HTTP over TLS. - * If it is configured with a storage fURL, connect using Foolscap. - If the server version indicates support for the new protocol: - * Attempt to connect using the new protocol. - * Drop the Foolscap connection if this new connection succeeds. +* If it is configured with a storage URI, connect using HTTP over TLS. +* If it is configured with a storage fURL, connect using Foolscap. + If the server version indicates support for the new protocol: + * Attempt to connect using the new protocol. + * Drop the Foolscap connection if this new connection succeeds. Client node implementations could cache a successful protocol upgrade. This would avoid the double connection on subsequent startups. 
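Review note on the mixed-protocol client rules re-indented in this patch: the decision procedure can be sketched as below. This is a minimal illustration, not code from the series. `StorageConfig`, `plan_connections`, and `settle` are hypothetical names, and how the client learns that the server supports the new protocol is reduced to a boolean parameter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StorageConfig:
    """Hypothetical client-side configuration for one storage node."""
    storage_uri: Optional[str] = None   # new-style HTTP-over-TLS location
    storage_furl: Optional[str] = None  # existing Foolscap fURL


def plan_connections(config: StorageConfig,
                     server_supports_http: bool = False) -> List[str]:
    """Return the protocols to try, in order, per the transition rules."""
    if config.storage_uri is not None:
        # Configured with a storage URI: HTTP over TLS only.
        return ["http"]
    if config.storage_furl is not None:
        # Configured with a fURL: start with Foolscap, then attempt the
        # new protocol only if the server version advertises it.
        if server_supports_http:
            return ["foolscap", "http"]
        return ["foolscap"]
    raise ValueError("no storage URI or fURL configured")


def settle(plan: List[str], http_connected: bool) -> str:
    """Pick the connection to keep; Foolscap is dropped once HTTP succeeds."""
    if "http" in plan and http_connected:
        return "http"
    if "foolscap" in plan:
        return "foolscap"
    raise ConnectionError("no usable connection")
```

A client that caches a successful upgrade, as the proposal suggests, would in effect store a `storage_uri` and skip the Foolscap step on later startups.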
From 599bf074e39c846e02110aab4994b389ba92971d Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 09:07:58 -0400 Subject: [PATCH 04/76] more spurious indentation --- docs/proposed/http-storage-node-protocol.rst | 24 ++++++++++---------- 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index ff7cb853c..e55156420 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -31,22 +31,22 @@ Solutions Communication with the storage node will take place using TLS 1.2 [#]_. - * The storage node will present a certificate proving its identity. - * The certificate will include a ``subjectAltName`` containing ... [#]_. - * The certificate will be signed by an entity known to and trusted by the client. - This entity will *not* be a standard web-focused Certificate Authority. +* The storage node will present a certificate proving its identity. +* The certificate will include a ``subjectAltName`` containing ... [#]_. +* The certificate will be signed by an entity known to and trusted by the client. + This entity will *not* be a standard web-focused Certificate Authority. When connecting to a storage node, the client will take the following steps to gain confidence it has reached the intended peer: - * It will perform the usual cryptographic verification of the certificate presented by the storage server - (that is, - that the certificate itself is well-formed, - that the signature it carries is valid, - that the signature was created by a "trusted entity"). - * It will consider the only "trusted entity" to be an entity explicitly configured for the intended storage node - (specifically, it will not considered the standard web-focused Certificate Authorities to be trusted). - * It will check the ``subjectAltName`` against ... [#]_. 
+* It will perform the usual cryptographic verification of the certificate presented by the storage server + (that is, + that the certificate itself is well-formed, + that the signature it carries is valid, + that the signature was created by a "trusted entity"). +* It will consider the only "trusted entity" to be an entity explicitly configured for the intended storage node + (specifically, it will not considered the standard web-focused Certificate Authorities to be trusted). +* It will check the ``subjectAltName`` against ... [#]_. To further clarify, consider this example. Alice operates a storage node. From b6572e2856009644bcb680c5df28e44f7f8d2552 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 09:41:45 -0400 Subject: [PATCH 05/76] clear now they are not necessarily consecutive --- docs/proposed/http-storage-node-protocol.rst | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index e55156420..f991e5bd6 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -106,9 +106,9 @@ The mapping is returned as an encoded structured object The mapping has share numbers as keys and bucket identifiers as values. For example:: - .. XXX Share numbers are logically integers and probably sequential starting from 0. - But JSON cannot encode them as integers if they are keys in a mapping. - Is this really a mapping or would an array (with share number implied by array index) work as well? + .. XXX Share numbers are logically integers. + JSON cannot encode integer mapping keys. + So this is not valid JSON but you know what I mean. {0: "abcd", 1: "efgh"} @@ -145,9 +145,9 @@ The response body includes encoded information about the created buckets. For example:: .. XXX Same deal about share numbers as integers/strings here. 
- But here it's clear we can't just use an array as mentioned above. + {"already_have": [1, ...], - "allocated": {"7": "bucket_id", ...}} + "allocated": {7: "bucket_id", ...}} .. [#] What are best practices regarding TLS version? Would a policy of "use the newest version shared between the two endpoints" be better? From 5b35f591f1e3638f6e960b2073d94b872887a198 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 09:42:10 -0400 Subject: [PATCH 06/76] write share data --- docs/proposed/http-storage-node-protocol.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index f991e5bd6..7005c8775 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -149,6 +149,11 @@ For example:: {"already_have": [1, ...], "allocated": {7: "bucket_id", ...}} +``PUT /v1/buckets/:bucket_id`` + +Write the share data to the indicated bucket. +The request body is the raw share data (i.e., ``application/octet-stream``). + .. [#] What are best practices regarding TLS version? Would a policy of "use the newest version shared between the two endpoints" be better? Is it necessary to specify more than a TLS version number here? 
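Review note on the XXX comment about share numbers: the proposal notes that JSON cannot encode integer mapping keys. If JSON (or a similar encoding) ends up being used, one possible handling is to let the keys travel as strings and convert them back on receipt. The sketch below uses only Python's standard `json` module; the helper names `encode_share_map` and `decode_share_map` are hypothetical.

```python
import json
from typing import Dict


def encode_share_map(buckets: Dict[int, str]) -> str:
    """Serialize a share-number -> bucket-id mapping.

    json.dumps converts integer keys to strings, because JSON object
    keys must be strings.
    """
    return json.dumps(buckets, sort_keys=True)


def decode_share_map(text: str) -> Dict[int, str]:
    """Parse the mapping and restore integer share numbers."""
    return {int(key): value for key, value in json.loads(text).items()}


wire = encode_share_map({0: "abcd", 1: "efgh"})
print(wire)                    # {"0": "abcd", "1": "efgh"}
print(decode_share_map(wire))  # {0: 'abcd', 1: 'efgh'}
```

The round trip is lossless for non-negative integer share numbers, so the wire format question does not have to change the logical model.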
From 73d903ad961bd36f75694b2a0001a6a30b68e96f Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 09:42:20 -0400 Subject: [PATCH 07/76] client-selected resource identifier -> PUT --- docs/proposed/http-storage-node-protocol.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 7005c8775..87d383cfd 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -132,13 +132,13 @@ For example:: Writing ------- -``POST /v1/buckets`` +``PUT /v1/storage/:storage_index`` Create some new buckets in which to store some shares. Details of the buckets to create are encoded in the request body. For example:: - {"storage_index": "abcd", "renew_secret": "efgh", "cancel_secret": "ijkl", + {"renew_secret": "efgh", "cancel_secret": "ijkl", "sharenums": [1, 7, ...], "allocated_size": 12345} The response body includes encoded information about the created buckets. From a3d4edca7dbb21cdcb9833b06899f2d5062df516 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 10:10:21 -0400 Subject: [PATCH 08/76] retrieve server version and info --- docs/proposed/http-storage-node-protocol.rst | 23 ++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 87d383cfd..c524d31bd 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -95,6 +95,29 @@ Client node implementations could cache a successful protocol upgrade. This would avoid the double connection on subsequent startups. This is left as a decision for the implementation, though. +Server Details +-------------- + +``GET /v1/version`` + +Retrieve information about the version of the storage server. +Information is returned as an encoded mapping. 
+For example:: + + { "http://allmydata.org/tahoe/protocols/storage/v1" : + { "maximum-immutable-share-size": 1234, + "maximum-mutable-share-size": 1235, + "available-space": 123456, + "tolerates-immutable-read-overrun": true, + "delete-mutable-shares-with-zero-length-writev": true, + "fills-holes-with-zero-bytes": true, + "prevents-read-past-end-of-share-data": true, + "http-protocol-available": true + }, + "application-version": "1.13.0" + } + + Reading ------- From 23242266dcc4ac13650f08836240b9d7bb5b1daf Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 14:16:02 -0400 Subject: [PATCH 09/76] consistent style --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index c524d31bd..7fa12569b 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -121,7 +121,7 @@ For example:: Reading ------- -``GET /v1/storage/:storage_index``: +``GET /v1/storage/:storage_index`` Retrieve a mapping describing buckets for the indicated storage index. The mapping is returned as an encoded structured object From 465489fd0b696eb80e52ec6284c5d3892e777bed Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 14:46:18 -0400 Subject: [PATCH 10/76] re-organize --- docs/proposed/http-storage-node-protocol.rst | 93 ++++++++++++-------- 1 file changed, 56 insertions(+), 37 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 7fa12569b..f8f242183 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -118,10 +118,64 @@ For example:: } +Shares +------ + +Shares are immutable data stored in buckets. + +Writing +~~~~~~~ + +``POST /v1/buckets/:storage_index`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
+ +Create some new buckets in which to store some shares. +Details of the buckets to create are encoded in the request body. +For example:: + + {"renew_secret": "efgh", "cancel_secret": "ijkl", + "sharenums": [1, 7, ...], "allocated_size": 12345} + +The response body includes encoded information about the created buckets. +For example:: + + {"already_have": [1, ...], + "allocated": {7: "bucket_id", ...}} + + + +Discussion +`````````` + +We considered making this ``POST /v1/storage`` instead. +The motivation was to keep *storage index* out of the request URL. +Request URLs have a mildly elevated chance of being logged by something. +We were concerned that having the *storage index* logged may increase some risks. +However, we decided this does not matter because the *storage index* can only be used to read the share (which is ciphertext). +TODO Verify this conclusion. + +``PUT /v1/buckets/:bucket_id`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! + +Write the share data to the indicated bucket. +The request body is the raw share data (i.e., ``application/octet-stream``). + +``POST /v1/buckets/:bucket_id/corrupt`` + +Advise the server the share data read from the indicated bucket was corrupt. +The request body includes an human-meaningful string with details about the corruption. +It also includes potentially important details about the share. + +For example:: + + {"share_type": "mutable", "storage_index": "abcd", "share_number": 3, + "reason": "expected hash abcd, got hash efgh"} + Reading -------- +~~~~~~~ ``GET /v1/storage/:storage_index`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Retrieve a mapping describing buckets for the indicated storage index. The mapping is returned as an encoded structured object @@ -136,47 +190,12 @@ For example:: {0: "abcd", 1: "efgh"} ``GET /v1/buckets/:bucket_id`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Read data from the indicated bucket. The data is returned raw (i.e., ``application/octet-stream``). Range requests may be made to read only part of a bucket. 
-``POST /v1/buckets/:bucket_id/corrupt`` - -Advise the server the share data read from the indicated bucket was corrupt. -The request body includes an human-meaningful string with details about the corruption. -It also includes potentially important details about the share. - -For example:: - - {"share_type": "mutable", "storage_index": "abcd", "share_number": 3, - "reason": "expected hash abcd, got hash efgh"} - -Writing -------- - -``PUT /v1/storage/:storage_index`` - -Create some new buckets in which to store some shares. -Details of the buckets to create are encoded in the request body. -For example:: - - {"renew_secret": "efgh", "cancel_secret": "ijkl", - "sharenums": [1, 7, ...], "allocated_size": 12345} - -The response body includes encoded information about the created buckets. -For example:: - - .. XXX Same deal about share numbers as integers/strings here. - - {"already_have": [1, ...], - "allocated": {7: "bucket_id", ...}} - -``PUT /v1/buckets/:bucket_id`` - -Write the share data to the indicated bucket. -The request body is the raw share data (i.e., ``application/octet-stream``). - .. [#] What are best practices regarding TLS version? Would a policy of "use the newest version shared between the two endpoints" be better? Is it necessary to specify more than a TLS version number here? From 357820357ca990f2a1db0773d041438bc1f9885b Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 15:04:20 -0400 Subject: [PATCH 11/76] front matter --- docs/proposed/http-storage-node-protocol.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index f8f242183..569a2a0b4 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -7,6 +7,11 @@ The target audience for this document is Tahoe-LAFS developers. 
After reading this document, one should expect to understand how Tahoe-LAFS clients interact over the network with Tahoe-LAFS storage nodes. +The primary goal of the introduction of this protocol is to simplify the task of implementing a Tahoe-LAFS storage server. +Specifically, it should be possible to implement a Tahoe-LAFS storage server without a Foolscap implementation +(substituting an HTTP server implementation). +The Tahoe-LAFS client will also need to change but it is not expected that it will be noticably simplified by this change. + Security -------- From 5f88cd068740f5e639722058ca12d7cd26813dc1 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 15:16:01 -0400 Subject: [PATCH 12/76] rework the security section --- docs/proposed/http-storage-node-protocol.rst | 41 ++++++++++---------- 1 file changed, 20 insertions(+), 21 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 569a2a0b4..bd6ccbcfa 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -34,34 +34,31 @@ That is, the protocol must include some means to cryptographically verify the in Solutions ~~~~~~~~~ -Communication with the storage node will take place using TLS 1.2 [#]_. - -* The storage node will present a certificate proving its identity. -* The certificate will include a ``subjectAltName`` containing ... [#]_. -* The certificate will be signed by an entity known to and trusted by the client. - This entity will *not* be a standard web-focused Certificate Authority. +Communication with the storage node will take place using TLS. +The TLS version and configuration will be dictated by an ongoing understanding of best practices. +The only requirement is that the certificate have a valid signature. +The storage node will publish the corresponding public key +(e.g., via an introducer). +The public key will constitute the storage node's identity. 
When connecting to a storage node, the client will take the following steps to gain confidence it has reached the intended peer: * It will perform the usual cryptographic verification of the certificate presented by the storage server (that is, - that the certificate itself is well-formed, - that the signature it carries is valid, - that the signature was created by a "trusted entity"). -* It will consider the only "trusted entity" to be an entity explicitly configured for the intended storage node - (specifically, it will not considered the standard web-focused Certificate Authorities to be trusted). -* It will check the ``subjectAltName`` against ... [#]_. + that the certificate itself is well-formed + and that the signature it carries is valid. +* It will compare the hash of the public key of the certificate to the expected public key. To further clarify, consider this example. Alice operates a storage node. -Alice generates a Certificate Authority certificate and secures the private key appropriately. -Alice generates a Storage Node certificate and signs it with the Certificate Authority certificate's private key. -Alice prints out the Certificate Authority certificate and storage node URI [#]_ and hands it to Bob. -Bob creates a client node. -Bob configures the client node with the storage node URI and the Certificate Authority certificate received from Alice. +Alice generates a key pair and secures it properly. +Alice generates a self-signed storage node certificate with the key pair. +Alice's storage node announces a fURL containing (among other information) the public key to an introducer. +Bob creates a client node pointed at the same introducer. +Bob's client node receives the announcement from Alice's storage node. -Bob's client node can now perform a TLS handshake with a server at the address indicated by the storage node URI. +Bob's client node can now perform a TLS handshake with a server at the address indicated by the storage node fURL. 
Following the above described validation procedures, Bob's client node can determine whether it has reached Alice's storage node or not. @@ -74,17 +71,20 @@ Transition Storage nodes already possess an x509 certificate. This is used with Foolscap to provide the same security properties described in the above requirements section. -There are some differences. * The certificate is self-signed. + This remains the same. * The certificate has a ``commonName`` of "newpb_thingy". + This is not harmful to the new protocol. * The validity of the certificate is determined by checking the certificate digest against a value carried in the fURL. Only a correctly signed certificate with a matching digest is accepted. + This validation will be replaced with a public key hash comparison. A mixed-protocol storage node should: * Start the Foolscap server as it has always done. * Start a TLS server dispatching to an HTTP server. + * Use the same certificate as the Foolscap server uses. * Accept anonymous client connections. @@ -93,6 +93,7 @@ A mixed-protocol client node should: * If it is configured with a storage URI, connect using HTTP over TLS. * If it is configured with a storage fURL, connect using Foolscap. If the server version indicates support for the new protocol: + * Attempt to connect using the new protocol. * Drop the Foolscap connection if this new connection succeeds. @@ -206,6 +207,4 @@ Range requests may be made to read only part of a bucket. Is it necessary to specify more than a TLS version number here? For example, should we be specifying a set of ciphers as well? Or is that a quality of implementation issue rather than a protocol specification issue? -.. [#] TODO -.. [#] TODO .. [#] URL? IRI? 
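Review note on the reworked validation step ("compare the hash of the public key of the certificate to the expected public key"): a rough sketch of that comparison follows. The patch does not name a hash function or an encoding, so SHA-256 and unpadded URL-safe base64 are assumptions here, as are the function names. Extracting the DER-encoded public key from the presented certificate is left out; the sketch takes those bytes as input.

```python
import base64
import hashlib
import hmac


def spki_fingerprint(spki_der: bytes) -> str:
    """Hash DER-encoded public key bytes (SHA-256 is an assumption)."""
    digest = hashlib.sha256(spki_der).digest()
    # Unpadded URL-safe base64 keeps the value easy to embed in a URI.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def is_expected_node(spki_der: bytes, expected_fingerprint: str) -> bool:
    """Compare the presented key's fingerprint to the announced one."""
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(spki_fingerprint(spki_der), expected_fingerprint)
```

In the Alice and Bob example, the announcement Bob receives would carry the expected fingerprint, and `spki_der` would come from the certificate Alice's node presents during the TLS handshake.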
From 6b72750397eb6aac5152bb426ca971b5b42a34b1 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 15:27:26 -0400 Subject: [PATCH 13/76] reduce verticality --- docs/proposed/http-storage-node-protocol.rst | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index bd6ccbcfa..6c01645c0 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -145,10 +145,7 @@ For example:: The response body includes encoded information about the created buckets. For example:: - {"already_have": [1, ...], - "allocated": {7: "bucket_id", ...}} - - + {"already_have": [1, ...], "allocated": {7: "bucket_id", ...}} Discussion `````````` From 178cb58a5753585151ceac7c65b7752268e31154 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 15:27:33 -0400 Subject: [PATCH 14/76] dunno how much the risk is elevated --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 6c01645c0..7df25a2df 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -152,7 +152,7 @@ Discussion We considered making this ``POST /v1/storage`` instead. The motivation was to keep *storage index* out of the request URL. -Request URLs have a mildly elevated chance of being logged by something. +Request URLs have an elevated chance of being logged by something. We were concerned that having the *storage index* logged may increase some risks. However, we decided this does not matter because the *storage index* can only be used to read the share (which is ciphertext). TODO Verify this conclusion. 
From 6d84cd8179c4f2e41463ec5ef4d642974b7b9a2b Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 15:27:53 -0400 Subject: [PATCH 15/76] these are gone --- docs/proposed/http-storage-node-protocol.rst | 7 ------- 1 file changed, 7 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 7df25a2df..b1662b863 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -198,10 +198,3 @@ For example:: Read data from the indicated bucket. The data is returned raw (i.e., ``application/octet-stream``). Range requests may be made to read only part of a bucket. - -.. [#] What are best practices regarding TLS version? - Would a policy of "use the newest version shared between the two endpoints" be better? - Is it necessary to specify more than a TLS version number here? - For example, should we be specifying a set of ciphers as well? - Or is that a quality of implementation issue rather than a protocol specification issue? -.. [#] URL? IRI? From c824bcd8b232d7a05d86ca20f2064a9586300ce6 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 15:28:03 -0400 Subject: [PATCH 16/76] make the share a logical child of the bucket? --- docs/proposed/http-storage-node-protocol.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index b1662b863..6f642bb95 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -163,7 +163,8 @@ TODO Verify this conclusion. Write the share data to the indicated bucket. The request body is the raw share data (i.e., ``application/octet-stream``). -``POST /v1/buckets/:bucket_id/corrupt`` +``POST /v1/buckets/:bucket_id/:share_number/corrupt`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
Advise the server the share data read from the indicated bucket was corrupt. The request body includes an human-meaningful string with details about the corruption. @@ -171,7 +172,7 @@ It also includes potentially important details about the share. For example:: - {"share_type": "mutable", "storage_index": "abcd", "share_number": 3, + {"share_type": "mutable", "storage_index": "abcd", "reason": "expected hash abcd, got hash efgh"} Reading From 4ad5b5ab461752317429d81a7575f4a33ff6c1f6 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 15 May 2018 16:00:40 -0400 Subject: [PATCH 17/76] address slots --- docs/proposed/http-storage-node-protocol.rst | 75 ++++++++++++++++++++ 1 file changed, 75 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 6f642bb95..9e51b4753 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -199,3 +199,78 @@ For example:: Read data from the indicated bucket. The data is returned raw (i.e., ``application/octet-stream``). Range requests may be made to read only part of a bucket. + +Slots +----- + +Slots are mutable data. + +Writing +~~~~~~~ + +``POST /v1/slot/:storage_index`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! + +General purpose test-read-and-set operation for mutable slots. +The request body includes the secrets necessary to write to the slot +and the test, read, and write vectors for the operation. +For example:: + + { + "secrets": { + "write-enabler": "abcd", + "lease-renew": "efgh", + "lease-cancel": "ijkl" + }, + "test-write-vectors": { + 0: { + "test": [{ + "offset": 3, + "size": 5, + "operator": "eq", + "specimen": "hello" + }, ...], + "write": [{ + "offset": 9, + "data": "world" + }, ...], + "new-length": 5 + } + }, + "read-vector": [{"offset": 3, "size": 12}, ...] 
+ } + +The response body contains a boolean indicating whether the tests all succeed +(and writes were applied) and a mapping giving read data (pre-write). +For example:: + + { + "success": true, + "data": { + 0: ["foo"], + 5: ["bar"], + ... + } + } + +Reading +~~~~~~~ + +``POST /v1/slot/:storage_index`` + +Read a vector from the numbered shares associated with the given storage index. +The request body contains the share numbers and read vector. +For example:: + + { + "shares": [3, 5, 7], + "read-vector": ["offset": 3, "size": 12}, ...] + } + +The response body contains a mapping giving the read data. +For example:: + + { + 3: ["foo"], + 7: ["bar"] + } From 4bed6363a3ae713007b2e340d1d1f49288eae319 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 16 May 2018 09:49:48 -0400 Subject: [PATCH 18/76] be specific about public key comparison --- docs/proposed/http-storage-node-protocol.rst | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 9e51b4753..09f8ba89c 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -49,6 +49,7 @@ the client will take the following steps to gain confidence it has reached the i that the certificate itself is well-formed and that the signature it carries is valid. * It will compare the hash of the public key of the certificate to the expected public key. + The specifics of the comparison are the same as for the comparison specified by `RFC 7469`_ with "sha256" [#]_. To further clarify, consider this example. Alice operates a storage node. @@ -274,3 +275,21 @@ For example:: 3: ["foo"], 7: ["bar"] } + + +.. _RFC 7469: https://tools.ietf.org/html/rfc7469#section-2.4 + +.. 
[#] + More simply:: + + from hashlib import sha256 + from cryptography.hazmat.primitives.serialization import ( + Encoding, + SubjectPublicKeyInfo, + ) + from foolscap import base32 + + spki_bytes = cert.public_key().public_bytes(DER, SubjectPublicKeyInfo) + spki_sha256 = sha256(spki_bytes).digest() + spki_digest32 = base32.encode(spki_sha256) + assert spki_digest32 == tub_id From 67ff44039f2fe5e377b8e5f9ad0e703dccf0f34a Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 16 May 2018 09:49:58 -0400 Subject: [PATCH 19/76] add values to the example --- docs/proposed/http-storage-node-protocol.rst | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 09f8ba89c..5985ee118 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -56,12 +56,17 @@ Alice operates a storage node. Alice generates a key pair and secures it properly. Alice generates a self-signed storage node certificate with the key pair. Alice's storage node announces a fURL containing (among other information) the public key to an introducer. +For example, ``pb://i5xb...@example.com:443/g3m5...``. Bob creates a client node pointed at the same introducer. Bob's client node receives the announcement from Alice's storage node. -Bob's client node can now perform a TLS handshake with a server at the address indicated by the storage node fURL. +Bob's client node can now perform a TLS handshake with a server at the address indicated by the storage node fURL +(``example.com:443`` in this example). Following the above described validation procedures, Bob's client node can determine whether it has reached Alice's storage node or not. +If and only if the public key hash matches the value in the published fURL +(``i5xb...`` in this example) +then Alice's storage node has been contacted. 
Additionally, by continuing to interact using TLS, From 5fa71484e32778ce5cf0b3822a4ff4c86e67b147 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 16 May 2018 10:16:58 -0400 Subject: [PATCH 20/76] call out the base32/base64 mismatch --- docs/proposed/http-storage-node-protocol.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 5985ee118..c6236930d 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -298,3 +298,5 @@ For example:: spki_sha256 = sha256(spki_bytes).digest() spki_digest32 = base32.encode(spki_sha256) assert spki_digest32 == tub_id + + Note we use the Tahoe-LAFS-preferred base32 encoding rather than base64. From 1d3f9715f8c90fdd9688a46df3ff08cab444aa18 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Thu, 17 May 2018 14:01:18 -0400 Subject: [PATCH 21/76] trivial json markup fix --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index c6236930d..171a76abf 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -270,7 +270,7 @@ For example:: { "shares": [3, 5, 7], - "read-vector": ["offset": 3, "size": 12}, ...] + "read-vector": [{"offset": 3, "size": 12}, ...] } The response body contains a mapping giving the read data. 
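Note on the public key comparison introduced in the patches above: the check hashes the certificate's SubjectPublicKeyInfo with SHA-256 and compares a base32 rendering of the digest with the value carried in the fURL. The sketch below shows only the hashing, encoding, and comparison steps using the standard library. It assumes the caller already has the DER-encoded SPKI bytes (for instance from ``public_bytes`` in the ``cryptography`` package, as in the footnote) and assumes the Tahoe-LAFS base32 form is the lowercase RFC 4648 alphabet with padding stripped. The names ``spki_pin_base32`` and ``pin_matches`` are hypothetical and not part of the proposal.

```python
import base64
import hmac
from hashlib import sha256


def spki_pin_base32(spki_der: bytes) -> str:
    """Hash DER-encoded SubjectPublicKeyInfo bytes and base32-encode the digest."""
    digest = sha256(spki_der).digest()
    # Assumption: lowercase alphabet, '=' padding removed.
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


def pin_matches(spki_der: bytes, expected_pin: str) -> bool:
    """Compare the computed pin with the expected value in constant time."""
    computed = spki_pin_base32(spki_der).encode("ascii")
    expected = expected_pin.lower().encode("utf-8")
    return hmac.compare_digest(computed, expected)
```

A 32-byte SHA-256 digest yields a 52-character pin under this encoding, which is why a pin built this way is longer than a 32-character Foolscap TubID.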
From 4e99f22c2b7b53e7a7deeee0607c16dc44e391c0 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Thu, 17 May 2018 14:01:36 -0400 Subject: [PATCH 22/76] make containers plural I suppose --- docs/proposed/http-storage-node-protocol.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 171a76abf..7850868f5 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -214,7 +214,7 @@ Slots are mutable data. Writing ~~~~~~~ -``POST /v1/slot/:storage_index`` +``POST /v1/slots/:storage_index`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! General purpose test-read-and-set operation for mutable slots. @@ -262,7 +262,7 @@ For example:: Reading ~~~~~~~ -``POST /v1/slot/:storage_index`` +``POST /v1/slots/:storage_index`` Read a vector from the numbered shares associated with the given storage index. The request body contains the share numbers and read vector. From eb9b44885e6e2270bb1ace751f66d942ab054d54 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Thu, 17 May 2018 14:01:46 -0400 Subject: [PATCH 23/76] simple naming mistake this must be a different endpoint or it is ambiguous with bucket interactions. plus it makes more sense that "place where storage indexes are" is different from "place where buckets are" although I am still uncomfortable with the idea that "storage indexes" are things and not ... indexes ... --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 7850868f5..e9a37278d 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -138,7 +138,7 @@ Shares are immutable data stored in buckets. 
Writing ~~~~~~~ -``POST /v1/buckets/:storage_index`` +``POST /v1/storage/:storage_index`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Create some new buckets in which to store some shares. From d011c2f93607a2a8c062f2f11a60f30d4d9f1b2d Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Thu, 17 May 2018 14:11:32 -0400 Subject: [PATCH 24/76] rst twiddles --- docs/proposed/http-storage-node-protocol.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index e9a37278d..b06b3c72f 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -204,7 +204,7 @@ For example:: Read data from the indicated bucket. The data is returned raw (i.e., ``application/octet-stream``). -Range requests may be made to read only part of a bucket. +*Range* requests may be made to read only part of a bucket. Slots ----- @@ -215,7 +215,7 @@ Writing ~~~~~~~ ``POST /v1/slots/:storage_index`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! General purpose test-read-and-set operation for mutable slots. The request body includes the secrets necessary to write to the slot @@ -263,6 +263,7 @@ Reading ~~~~~~~ ``POST /v1/slots/:storage_index`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Read a vector from the numbered shares associated with the given storage index. The request body contains the share numbers and read vector. 
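Note on the slot read request described above: the request names some share numbers and a read vector, and the response maps share numbers to the data read. A minimal sketch of how a server might apply a read vector follows. It assumes shares are held in memory as ``bytes`` keyed by share number and that shares the server does not hold are simply left out of the response (which is one reading of the example, where share 5 is requested but not returned). ``read_shares`` is a hypothetical helper, not part of the proposal.

```python
from typing import Dict, List


def read_shares(
    stored: Dict[int, bytes],
    share_numbers: List[int],
    read_vector: List[Dict[str, int]],
) -> Dict[int, List[bytes]]:
    """Apply every (offset, size) pair in read_vector to each requested share."""
    result: Dict[int, List[bytes]] = {}
    for number in share_numbers:
        data = stored.get(number)
        if data is None:
            # Assumption: shares the server does not hold are omitted.
            continue
        result[number] = [
            data[v["offset"]:v["offset"] + v["size"]] for v in read_vector
        ]
    return result
```

Each share contributes one list entry per read vector element, in the order the elements were given.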
From 2bbe51a01d239620bb1d4059fbdf22a709c45ad1 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Thu, 17 May 2018 14:11:37 -0400 Subject: [PATCH 25/76] Discuss Range requests for uploads --- docs/proposed/http-storage-node-protocol.rst | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index b06b3c72f..b97b3c35f 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -168,6 +168,13 @@ TODO Verify this conclusion. Write the share data to the indicated bucket. The request body is the raw share data (i.e., ``application/octet-stream``). +*Range* requests are encouraged for large transfers. +For example, +for a 1MiB share the data can be broken in to 8 128KiB chunks. +Each chunk can be *PUT* separately with the appropriate *Range* headers. +The server must recognize when all of the data has been received and mark the bucket as filled. +Clients should upload chunks in re-assembly order. +Servers may reject out-of-order chunks for implementation simplicity. ``POST /v1/buckets/:bucket_id/:share_number/corrupt`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! From 113af95984beccabe001d5f934a95804903baf84 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Thu, 17 May 2018 15:08:03 -0400 Subject: [PATCH 26/76] when you are sending a range, you use Content-Range when you are _asking_ for a range, you use Range --- docs/proposed/http-storage-node-protocol.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index b97b3c35f..d6c336ba8 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -168,10 +168,10 @@ TODO Verify this conclusion. Write the share data to the indicated bucket. 
The request body is the raw share data (i.e., ``application/octet-stream``). -*Range* requests are encouraged for large transfers. +*Content-Range* requests are encouraged for large transfers. For example, for a 1MiB share the data can be broken in to 8 128KiB chunks. -Each chunk can be *PUT* separately with the appropriate *Range* headers. +Each chunk can be *PUT* separately with the appropriate *Content-Range* headers. The server must recognize when all of the data has been received and mark the bucket as filled. Clients should upload chunks in re-assembly order. Servers may reject out-of-order chunks for implementation simplicity. From d3f9ee2406b94c469e9cf3a3c019015111379e0f Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Thu, 17 May 2018 15:09:50 -0400 Subject: [PATCH 27/76] link to a different upload resume strategy --- docs/proposed/http-storage-node-protocol.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index d6c336ba8..3c88ac8a1 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -176,6 +176,8 @@ The server must recognize when all of the data has been received and mark the bu Clients should upload chunks in re-assembly order. Servers may reject out-of-order chunks for implementation simplicity. +.. think about copying https://developers.google.com/drive/api/v2/resumable-upload + ``POST /v1/buckets/:bucket_id/:share_number/corrupt`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
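Note on the chunked upload described above: a 1 MiB share split into 128 KiB chunks produces 8 *PUT* requests, each with its own *Content-Range* header. The sketch below computes those header values and chunk bodies in re-assembly order. It assumes the usual ``bytes first-last/total`` form with an inclusive end offset. ``content_range_chunks`` is a hypothetical helper, and actually sending the requests is left out.

```python
from typing import Iterator, Tuple


def content_range_chunks(data: bytes, chunk_size: int) -> Iterator[Tuple[str, bytes]]:
    """Yield (Content-Range header value, chunk body) pairs in upload order."""
    total = len(data)
    for start in range(0, total, chunk_size):
        chunk = data[start:start + chunk_size]
        end = start + len(chunk) - 1  # the end offset is inclusive
        yield f"bytes {start}-{end}/{total}", chunk
```

For a 1 MiB share the first header is ``bytes 0-131071/1048576`` and the last is ``bytes 917504-1048575/1048576``. If one *PUT* fails, only that 128 KiB chunk needs to be sent again.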
From 00ae3b56633228a6e4c68260cc8ed410d9dffd90 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 09:05:25 -0400 Subject: [PATCH 28/76] discuss encoded hash length --- docs/proposed/http-storage-node-protocol.rst | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 3c88ac8a1..37c03aee4 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -72,6 +72,15 @@ Additionally, by continuing to interact using TLS, Bob's client and Alice's storage node are assured of the integrity of the communication. +.. I think Foolscap TubIDs are 20 bytes and base32 encode to 32 bytes. + SPKI information discussed here is 32 bytes and base32 encodes to 52 bytes. + https://tools.ietf.org/html/rfc7515#appendix-C may prove a better choice for encoding the information into a fURL. + It will encode 32 bytes into merely 43... + We could also choose to reduce the hash size of the SPKI information through use of another cryptographic hash (replacing sha256). + A 224 bit hash function (SHA3-224, for example) might be suitable - + improving the encoded length to 38 bytes. 
+ + Transition ~~~~~~~~~~ From fb51c1df40832d0f1305908d741ff05265895422 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 09:05:37 -0400 Subject: [PATCH 29/76] correct the sample code --- docs/proposed/http-storage-node-protocol.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 37c03aee4..86f9a98d1 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -309,11 +309,11 @@ For example:: from hashlib import sha256 from cryptography.hazmat.primitives.serialization import ( Encoding, - SubjectPublicKeyInfo, + PublicFormat, ) from foolscap import base32 - spki_bytes = cert.public_key().public_bytes(DER, SubjectPublicKeyInfo) + spki_bytes = cert.public_key().public_bytes(Encoding.DER, PublicFormat.SubjectPublicKeyInfo) spki_sha256 = sha256(spki_bytes).digest() spki_digest32 = base32.encode(spki_sha256) assert spki_digest32 == tub_id From dd78fe81d07aa36c2516d0fb7fc3bdd1b79ccc3f Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 09:05:46 -0400 Subject: [PATCH 30/76] note the failure case --- docs/proposed/http-storage-node-protocol.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 86f9a98d1..23efc5c93 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -184,6 +184,7 @@ Each chunk can be *PUT* separately with the appropriate *Content-Range* headers. The server must recognize when all of the data has been received and mark the bucket as filled. Clients should upload chunks in re-assembly order. Servers may reject out-of-order chunks for implementation simplicity. +If an individual *PUT* fails then only a limited amount of effort is wasted on the necessary retry. .. 
think about copying https://developers.google.com/drive/api/v2/resumable-upload From 3ef1ceeead07b506b5f86c4d711106d80d81796f Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 09:13:57 -0400 Subject: [PATCH 31/76] markup --- docs/proposed/http-storage-node-protocol.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 23efc5c93..55de4d88f 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -120,6 +120,7 @@ Server Details -------------- ``GET /v1/version`` +!!!!!!!!!!!!!!!!!!! Retrieve information about the version of the storage server. Information is returned as an encoded mapping. From 943b389d775f7e9bcf0b51c75de7ad74f7c16299 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 11:09:17 -0400 Subject: [PATCH 32/76] Banish slots and deemphasize buckets --- docs/proposed/http-storage-node-protocol.rst | 120 ++++++++++--------- 1 file changed, 65 insertions(+), 55 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 55de4d88f..e6ef5015d 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -119,6 +119,9 @@ This is left as a decision for the implementation, though. Server Details -------------- +JSON is used throughout for the examples but is likely not the preferred encoding. +The structure of the examples should nevertheless be representative. + ``GET /v1/version`` !!!!!!!!!!!!!!!!!!! @@ -139,19 +142,17 @@ For example:: "application-version": "1.13.0" } - -Shares ------- - -Shares are immutable data stored in buckets. +Immutable +--------- Writing ~~~~~~~ -``POST /v1/storage/:storage_index`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +``POST /v1/immutable/:storage_index`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
-Create some new buckets in which to store some shares. +Initialize an immutable storage index with some buckets. +The buckets may have share data written to them once. Details of the buckets to create are encoded in the request body. For example:: @@ -161,85 +162,98 @@ For example:: The response body includes encoded information about the created buckets. For example:: - {"already_have": [1, ...], "allocated": {7: "bucket_id", ...}} + .. XXX Share numbers are logically integers. + JSON cannot encode integer mapping keys. + So this is not valid JSON but you know what I mean. + + {"already_have": [1, ...], "allocated": [7, ...]} Discussion `````````` -We considered making this ``POST /v1/storage`` instead. +We considered making this ``POST /v1/immutable`` instead. The motivation was to keep *storage index* out of the request URL. Request URLs have an elevated chance of being logged by something. We were concerned that having the *storage index* logged may increase some risks. However, we decided this does not matter because the *storage index* can only be used to read the share (which is ciphertext). TODO Verify this conclusion. -``PUT /v1/buckets/:bucket_id`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +``PUT /v1/immutable/:storage_index/:share_num`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Write the share data to the indicated bucket. +Write data for the indicated share. +The share number must belong to the storage index. The request body is the raw share data (i.e., ``application/octet-stream``). *Content-Range* requests are encouraged for large transfers. For example, for a 1MiB share the data can be broken in to 8 128KiB chunks. -Each chunk can be *PUT* separately with the appropriate *Content-Range* headers. -The server must recognize when all of the data has been received and mark the bucket as filled. +Each chunk can be *PUT* separately with the appropriate *Content-Range* header. 
+The server must recognize when all of the data has been received and mark the share as complete +(which it can do because it was informed of the size when the storage index was initialized). Clients should upload chunks in re-assembly order. Servers may reject out-of-order chunks for implementation simplicity. If an individual *PUT* fails then only a limited amount of effort is wasted on the necessary retry. .. think about copying https://developers.google.com/drive/api/v2/resumable-upload -``POST /v1/buckets/:bucket_id/:share_number/corrupt`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +``POST /v1/immutable/:storage_index/:share_number/corrupt`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Advise the server the share data read from the indicated bucket was corrupt. +Advise the server the data read from the indicated share was corrupt. The request body includes an human-meaningful string with details about the corruption. It also includes potentially important details about the share. For example:: - {"share_type": "mutable", "storage_index": "abcd", - "reason": "expected hash abcd, got hash efgh"} + {"reason": "expected hash abcd, got hash efgh"} + +.. share_type, storage_index, and share number are inferred from the URL Reading ~~~~~~~ -``GET /v1/storage/:storage_index`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +``GET /v1/immutable/:storage_index/shares`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Retrieve a mapping describing buckets for the indicated storage index. -The mapping is returned as an encoded structured object -(JSON is used for the example here but is not necessarily the true encoding). -The mapping has share numbers as keys and bucket identifiers as values. +Retrieve a list indicating all shares available for the indicated storage index. For example:: - .. XXX Share numbers are logically integers. - JSON cannot encode integer mapping keys. - So this is not valid JSON but you know what I mean. 
+ [1, 5] - {0: "abcd", 1: "efgh"} +``GET /v1/immutable/:storage_index?share=s0&share=sN`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -``GET /v1/buckets/:bucket_id`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +Read data from the indicated shares. +If no ``share`` query arguments are present, +read data from all shares present. +The data is returned in a multipart container. +*Range* requests may be made to read only parts of the shares. -Read data from the indicated bucket. -The data is returned raw (i.e., ``application/octet-stream``). -*Range* requests may be made to read only part of a bucket. +.. Blech, multipart! + We know the data size. + How about implicit size-based framing, instead? + Maybe HTTP/2 server push is a better solution. + For example, request /shares and get a push of the first share with the result? + (Then request the rest, if you want, while reading the first.) -Slots ------ - -Slots are mutable data. +Mutable +------- Writing ~~~~~~~ -``POST /v1/slots/:storage_index`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +``POST /v1/mutable/:storage_index`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -General purpose test-read-and-set operation for mutable slots. -The request body includes the secrets necessary to write to the slot -and the test, read, and write vectors for the operation. +Initialize a mutable storage index with some buckets. +Essentially the same as the API for initializing an immutable storage index. + +``POST /v1/read-test-write/:storage_index`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! + +General purpose read-test-and-write operation for mutable storage indexes. +The request body includes the secrets necessary to rewrite to the shares +along with test, read, and write vectors for the operation. For example:: { @@ -282,18 +296,15 @@ For example:: Reading ~~~~~~~ -``POST /v1/slots/:storage_index`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +``GET /v1/mutable/:storage_index?share=s0&share=sN`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
+ +Read mutable shares (like the immutable version). + +``GET /v1/mutable/:storage_index?share=:s1&share=:sN&offset=o1&size=z1&offset=oN&size=zN`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Read a vector from the numbered shares associated with the given storage index. -The request body contains the share numbers and read vector. -For example:: - - { - "shares": [3, 5, 7], - "read-vector": [{"offset": 3, "size": 12}, ...] - } - The response body contains a mapping giving the read data. For example:: @@ -302,7 +313,6 @@ For example:: 7: ["bar"] } - .. _RFC 7469: https://tools.ietf.org/html/rfc7469#section-2.4 .. [#] From 9402698918cabad2939fae7e9a68c2c5f7021768 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 11:11:04 -0400 Subject: [PATCH 33/76] Harmonize hyphens --- docs/proposed/http-storage-node-protocol.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index e6ef5015d..14b9dc6d7 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -156,8 +156,8 @@ The buckets may have share data written to them once. Details of the buckets to create are encoded in the request body. For example:: - {"renew_secret": "efgh", "cancel_secret": "ijkl", - "sharenums": [1, 7, ...], "allocated_size": 12345} + {"renew-secret": "efgh", "cancel-secret": "ijkl", + "share-numbers": [1, 7, ...], "allocated-size": 12345} The response body includes encoded information about the created buckets. For example:: @@ -166,7 +166,7 @@ For example:: JSON cannot encode integer mapping keys. So this is not valid JSON but you know what I mean. - {"already_have": [1, ...], "allocated": [7, ...]} + {"already-have": [1, ...], "allocated": [7, ...]} Discussion `````````` @@ -207,7 +207,7 @@ For example:: {"reason": "expected hash abcd, got hash efgh"} -.. 
share_type, storage_index, and share number are inferred from the URL +.. share-type, storage-index, and share-number are inferred from the URL Reading ~~~~~~~ From 6c664d69a826b2e9e128702a838bbbbd4c1530a5 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 13:01:03 -0400 Subject: [PATCH 34/76] consistent non-abbreviation --- docs/proposed/http-storage-node-protocol.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 14b9dc6d7..b0907b106 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -178,8 +178,8 @@ We were concerned that having the *storage index* logged may increase some risks However, we decided this does not matter because the *storage index* can only be used to read the share (which is ciphertext). TODO Verify this conclusion. -``PUT /v1/immutable/:storage_index/:share_num`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +``PUT /v1/immutable/:storage_index/:share_number`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Write data for the indicated share. The share number must belong to the storage index. From 69195e0a5af4c57c6e0bac21ce47659de6934fcf Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 13:01:10 -0400 Subject: [PATCH 35/76] maybe we don't even want Range --- docs/proposed/http-storage-node-protocol.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index b0907b106..b558869f6 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -232,6 +232,7 @@ The data is returned in a multipart container. .. Blech, multipart! We know the data size. How about implicit size-based framing, instead? + Or frame it all in a CBOR array (drop *Range* and use query args!). 
Maybe HTTP/2 server push is a better solution. For example, request /shares and get a push of the first share with the result? (Then request the rest, if you want, while reading the first.) From c6a8e4535c1a93034cf89a5a1334948668444e7c Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 13:01:19 -0400 Subject: [PATCH 36/76] mount this beneath the storage index resource --- docs/proposed/http-storage-node-protocol.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index b558869f6..03b7395c9 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -249,8 +249,8 @@ Writing Initialize a mutable storage index with some buckets. Essentially the same as the API for initializing an immutable storage index. -``POST /v1/read-test-write/:storage_index`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +``POST /v1/mutable/:storage_index/read-test-write`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! General purpose read-test-and-write operation for mutable storage indexes. The request body includes the secrets necessary to rewrite to the shares From 93889035156b38bb3cb125afb3cedbe16d094606 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 13:01:45 -0400 Subject: [PATCH 37/76] need a way to advise of corrupt mutable shares --- docs/proposed/http-storage-node-protocol.rst | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 03b7395c9..644eb294f 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -314,6 +314,12 @@ For example:: 7: ["bar"] } +``POST /v1/mutable/:storage_index/:share_number/corrupt`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
+ +Advise the server the data read from the indicated share was corrupt. +Just like the immutable version. + .. _RFC 7469: https://tools.ietf.org/html/rfc7469#section-2.4 .. [#] From f09ed91ab6a981d529f51ab74ee71debf6149b40 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 13:01:57 -0400 Subject: [PATCH 38/76] collapse these two APIs, they are the same also add mutable .../shares listing --- docs/proposed/http-storage-node-protocol.rst | 22 +++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 644eb294f..00754bbe9 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -297,21 +297,29 @@ For example:: Reading ~~~~~~~ -``GET /v1/mutable/:storage_index?share=s0&share=sN`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +``GET /v1/mutable/:storage_index/shares`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Read mutable shares (like the immutable version). +Retrieve a list indicating all shares available for the indicated storage index. +For example:: -``GET /v1/mutable/:storage_index?share=:s1&share=:sN&offset=o1&size=z1&offset=oN&size=zN`` + [1, 5] + +``GET /v1/mutable/:storage_index?share=:s0&share=:sN&offset=o1&size=z0&offset=oN&size=zN`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Read a vector from the numbered shares associated with the given storage index. +Read some data from some shares associated with the given storage index. +If ``share`` query parameters are given, only data from those shares is read. +Otherwise, data from all shares is read. +If ``size`` and ``offset`` query parameters are given, +only the portions of the selected shares thus identified are returned. + The response body contains a mapping giving the read data. 
For example:: { - 3: ["foo"], - 7: ["bar"] + 3: ["foo", "bar"], + 7: ["baz", "quux"] } ``POST /v1/mutable/:storage_index/:share_number/corrupt`` From 3898911fcc2ba592d2bb51cc9e5a639ac71303a6 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 13:07:18 -0400 Subject: [PATCH 39/76] consistent title levels --- docs/proposed/http-storage-node-protocol.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 00754bbe9..e84c81bcf 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -122,6 +122,9 @@ Server Details JSON is used throughout for the examples but is likely not the preferred encoding. The structure of the examples should nevertheless be representative. +General +~~~~~~~ + ``GET /v1/version`` !!!!!!!!!!!!!!!!!!! From f4b59b166da9af47fe0f56d19157f12bc41c72e6 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 13:08:13 -0400 Subject: [PATCH 40/76] no more int-key mappings --- docs/proposed/http-storage-node-protocol.rst | 4 ---- 1 file changed, 4 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index e84c81bcf..c7c868d4a 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -165,10 +165,6 @@ For example:: The response body includes encoded information about the created buckets. For example:: - .. XXX Share numbers are logically integers. - JSON cannot encode integer mapping keys. - So this is not valid JSON but you know what I mean. 
- {"already-have": [1, ...], "allocated": [7, ...]} Discussion From d09b613d593516ce31574fcffae1cbc9bd7a1e31 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 18 May 2018 15:45:22 -0400 Subject: [PATCH 41/76] make mutable and immutable read the same --- docs/proposed/http-storage-node-protocol.rst | 44 ++++++++------------ 1 file changed, 17 insertions(+), 27 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index c7c868d4a..f7ba924d6 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -219,22 +219,23 @@ For example:: [1, 5] -``GET /v1/immutable/:storage_index?share=s0&share=sN`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +``GET /v1/immutable/:storage_index?share=:s0&share=:sN&offset=o1&size=z0&offset=oN&size=zN`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Read data from the indicated shares. -If no ``share`` query arguments are present, -read data from all shares present. -The data is returned in a multipart container. -*Range* requests may be made to read only parts of the shares. +Read data from the indicated immutable shares. +If ``share`` query parameters are given, select only those shares for reading. +Otherwise, select all shares present. +If ``size`` and ``offset`` query parameters are given, +only the portions thus identified of the selected shares are returned. +Otherwise, all data from the selected shares is returned. -.. Blech, multipart! - We know the data size. - How about implicit size-based framing, instead? - Or frame it all in a CBOR array (drop *Range* and use query args!). - Maybe HTTP/2 server push is a better solution. - For example, request /shares and get a push of the first share with the result? - (Then request the rest, if you want, while reading the first.) +The response body contains a mapping giving the read data. 
+For example:: + + { + 3: ["foo", "bar"], + 7: ["baz", "quux"] + } Mutable ------- @@ -307,19 +308,8 @@ For example:: ``GET /v1/mutable/:storage_index?share=:s0&share=:sN&offset=o1&size=z0&offset=oN&size=zN`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Read some data from some shares associated with the given storage index. -If ``share`` query parameters are given, only data from those shares is read. -Otherwise, data from all shares is read. -If ``size`` and ``offset`` query parameters are given, -only the portions of the selected shares thus identified are returned. - -The response body contains a mapping giving the read data. -For example:: - - { - 3: ["foo", "bar"], - 7: ["baz", "quux"] - } +Read data from the indicated mutable shares. +Just like ``GET /v1/immutable/:storage_index``. ``POST /v1/mutable/:storage_index/:share_number/corrupt`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! From 931ffec0051f2c553805166223289e2912420a53 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 21 May 2018 13:31:10 -0400 Subject: [PATCH 42/76] semantic newlines --- docs/proposed/http-storage-node-protocol.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index f7ba924d6..63257e18e 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -20,7 +20,8 @@ Requirements A client node relies on a storage node to persist certain data until a future retrieval request is made. In this way, the node is vulnerable to attacks which cause the data not to be persisted. -Though this vulnerability can be mitigated by including redundancy in the share encoding parameters for stored data, it is still sensible to attempt to minimize unnecessary vulnerability to this attack. 
+Though this vulnerability can be mitigated by including redundancy in the share encoding parameters for stored data, +it is still sensible to attempt to minimize unnecessary vulnerability to this attack. One way to do this is for the client to be confident it the storage node with which it is communicating is really the expected node. Therefore, the protocol must include some means for cryptographically verifying the identify of the storage node. From 4626a092248caf729c58b5a5f9428e36f15ab409 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 21 May 2018 13:31:16 -0400 Subject: [PATCH 43/76] elaborate on reputation-based assumptions --- docs/proposed/http-storage-node-protocol.rst | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 63257e18e..681bb6006 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -23,7 +23,9 @@ In this way, the node is vulnerable to attacks which cause the data not to be pe Though this vulnerability can be mitigated by including redundancy in the share encoding parameters for stored data, it is still sensible to attempt to minimize unnecessary vulnerability to this attack. -One way to do this is for the client to be confident it the storage node with which it is communicating is really the expected node. +One way to do this is for the client to be confident it the storage node with which it is communicating is really the expected node +(because this allows it to develop a notion of that node's reputation over time; +the more retrieval requests it satisfies correctly the more it probably will). Therefore, the protocol must include some means for cryptographically verifying the identify of the storage node. 
The initialization of the client with the correct identity information is out of scope for this protocol (the system may be trust-on-first-use, there may be a third-party identity broker, etc). From cea0ae800487b79540235efa68323814a1c05e5a Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 21 May 2018 14:14:39 -0400 Subject: [PATCH 44/76] tahoe-lafs is already good at redundant storage --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 681bb6006..c5a915741 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -20,7 +20,7 @@ Requirements A client node relies on a storage node to persist certain data until a future retrieval request is made. In this way, the node is vulnerable to attacks which cause the data not to be persisted. -Though this vulnerability can be mitigated by including redundancy in the share encoding parameters for stored data, +Though this vulnerability can be (and typically is) mitigated by including redundancy in the share encoding parameters for stored data, it is still sensible to attempt to minimize unnecessary vulnerability to this attack. One way to do this is for the client to be confident it the storage node with which it is communicating is really the expected node From fa4384e36e006e0c64d9438594108e013ae2f6b6 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 21 May 2018 14:14:53 -0400 Subject: [PATCH 45/76] add a security summary (noting foolscap features) also, expanded discussion. 
--- docs/proposed/http-storage-node-protocol.rst | 37 +++++++++++++++++--- 1 file changed, 32 insertions(+), 5 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index c5a915741..0a4c98c51 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -18,21 +18,48 @@ Security Requirements ~~~~~~~~~~~~ +Summary +!!!!!!! + +An HTTP-based protocol should offer at minimum the security properties offered by the Foolscap-based protocol. +The Foolscap-based protocol offers: + +* **Peer authentication** by way of checked x509 certificates +* **Message authentication** by way of TLS +* **Message confidentiality** by way of TLS + + * A careful configuration of the TLS connection parameters *may* also offer **forward secrecy**. + However, Tahoe-LAFS' use of Foolscap takes no steps to ensure this is the case. + +Discussion +!!!!!!!!!! + A client node relies on a storage node to persist certain data until a future retrieval request is made. In this way, the node is vulnerable to attacks which cause the data not to be persisted. Though this vulnerability can be (and typically is) mitigated by including redundancy in the share encoding parameters for stored data, it is still sensible to attempt to minimize unnecessary vulnerability to this attack. -One way to do this is for the client to be confident it the storage node with which it is communicating is really the expected node -(because this allows it to develop a notion of that node's reputation over time; -the more retrieval requests it satisfies correctly the more it probably will). -Therefore, the protocol must include some means for cryptographically verifying the identify of the storage node. +One way to do this is for the client to be confident the storage node with which it is communicating is really the expected node. 
+That is, for the client to perform **peer authentication** of the storage node it connects to. +This allows it to develop a notion of that node's reputation over time. +The more retrieval requests the node satisfies correctly the more it probably will satisfy correctly. +Therefore, the protocol must include some means for verifying the identity of the storage node. The initialization of the client with the correct identity information is out of scope for this protocol (the system may be trust-on-first-use, there may be a third-party identity broker, etc). With confidence that communication is proceeding with the intended storage node, it must also be possible to trust that data is exchanged without modification. -That is, the protocol must include some means to cryptographically verify the integrity of exchanged messages. +That is, the protocol must include some means to perform **message authentication**. +This is most likely done using cryptographic MACs (such as those used in TLS). + +The messages which enable the mutable shares feature include secrets related to those shares. +For example, the write enabler secret is used to restrict the parties with write access to mutable shares. +It is exchanged over the network as part of a write operation. +An attacker learning this secret and overwrite share data with garbage +(lacking a separate encryption key, +there is no way to write data which appears legitimate to a legitimate client). +Therefore, **message confidentiality** is necessary when exchanging these secrets. +**Forward security** is preferred so that an attacker recording an exchange today cannot launch this attack at some future point after compromising the necessary keys. 
Solutions ~~~~~~~~~ From 11184939e86b81069bee371741ca5753259ffcb4 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 21 May 2018 14:59:10 -0400 Subject: [PATCH 46/76] It's SPKI not public key --- docs/proposed/http-storage-node-protocol.rst | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 0a4c98c51..a42f916a6 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -67,9 +67,9 @@ Solutions Communication with the storage node will take place using TLS. The TLS version and configuration will be dictated by an ongoing understanding of best practices. The only requirement is that the certificate have a valid signature. -The storage node will publish the corresponding public key +The storage node will publish the corresponding Subject Public Key Information hash (SPKI hash) (e.g., via an introducer). -The public key will constitute the storage node's identity. +The SPKI hash will constitute the storage node's identity. When connecting to a storage node, the client will take the following steps to gain confidence it has reached the intended peer: @@ -78,7 +78,7 @@ the client will take the following steps to gain confidence it has reached the i (that is, that the certificate itself is well-formed and that the signature it carries is valid. -* It will compare the hash of the public key of the certificate to the expected public key. +* It will compare the SPKI hashof the certificate to the expected value. The specifics of the comparison are the same as for the comparison specified by `RFC 7469`_ with "sha256" [#]_. To further clarify, consider this example. @@ -94,7 +94,7 @@ Bob's client node can now perform a TLS handshake with a server at the address i (``example.com:443`` in this example). 
Following the above described validation procedures, Bob's client node can determine whether it has reached Alice's storage node or not. -If and only if the public key hash matches the value in the published fURL +If and only if the SPKI hash matches the value in the published fURL (``i5xb...`` in this example) then Alice's storage node has been contacted. @@ -123,7 +123,14 @@ This is used with Foolscap to provide the same security properties described in This is not harmful to the new protocol. * The validity of the certificate is determined by checking the certificate digest against a value carried in the fURL. Only a correctly signed certificate with a matching digest is accepted. - This validation will be replaced with a public key hash comparison. + This validation will be replaced with an SPKI hash comparison. + This introduces a difference from the Foolscap protocol: + it allows the generation and use of new certificates using the same key pair. + This does not seem likely to pose any new risks. + On the contrary, + it may remove certain risks by allowing certificate renewal at certificate expiration time. + This will allow the certificate validation code to be simplified somewhat + (compared to the current implementation which must make an exception for validity-period-related validation errors). A mixed-protocol storage node should: From 16076f9bd71cfe92161e84504ec0dd390b25b391 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 21 May 2018 14:59:28 -0400 Subject: [PATCH 47/76] be explicit about the security goals being achieved --- docs/proposed/http-storage-node-protocol.rst | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index a42f916a6..580a284eb 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -85,7 +85,7 @@ To further clarify, consider this example. 
Alice operates a storage node. Alice generates a key pair and secures it properly. Alice generates a self-signed storage node certificate with the key pair. -Alice's storage node announces a fURL containing (among other information) the public key to an introducer. +Alice's storage node announces (to an introducer) a fURL containing (among other information) the SPKI hash. For example, ``pb://i5xb...@example.com:443/g3m5...``. Bob creates a client node pointed at the same introducer. Bob's client node receives the announcement from Alice's storage node. @@ -97,12 +97,15 @@ Bob's client node can determine whether it has reached Alice's storage node or n If and only if the SPKI hash matches the value in the published fURL (``i5xb...`` in this example) then Alice's storage node has been contacted. +**Peer authentication** has been achieved. Additionally, by continuing to interact using TLS, -Bob's client and Alice's storage node are assured of the integrity of the communication. +Bob's client and Alice's storage node are assured of both **message authentication** and **message confidentiality**. -.. I think Foolscap TubIDs are 20 bytes and base32 encode to 32 bytes. +.. note:: + + I think Foolscap TubIDs are 20 bytes and base32 encode to 32 bytes. SPKI information discussed here is 32 bytes and base32 encodes to 52 bytes. https://tools.ietf.org/html/rfc7515#appendix-C may prove a better choice for encoding the information into a fURL. It will encode 32 bytes into merely 43... 
From 176732dcaff94d4095e4e20b01c78f6f145d62e4 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 21 May 2018 14:59:46 -0400 Subject: [PATCH 48/76] gotta announce the new fURL sometime --- docs/proposed/http-storage-node-protocol.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 580a284eb..8c7a0a56f 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -141,6 +141,7 @@ A mixed-protocol storage node should: * Start a TLS server dispatching to an HTTP server. * Use the same certificate as the Foolscap server uses. + * Announce both its Foolscap fURL and its HTTP fURL. * Accept anonymous client connections. A mixed-protocol client node should: From 4592bf3de2c6c69a229ac9017cfd172ccad4237c Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 21 May 2018 16:23:53 -0400 Subject: [PATCH 49/76] wip - more edits of the security material & transition plan --- docs/proposed/http-storage-node-protocol.rst | 63 ++++++++++++-------- 1 file changed, 37 insertions(+), 26 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 8c7a0a56f..90d94935b 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -74,11 +74,12 @@ The SPKI hash will constitute the storage node's identity. When connecting to a storage node, the client will take the following steps to gain confidence it has reached the intended peer: -* It will perform the usual cryptographic verification of the certificate presented by the storage server - (that is, - that the certificate itself is well-formed +* It will perform the usual cryptographic verification of the certificate presented by the storage server. 
+ That is, + it will check that the certificate itself is well-formed, + that it is currently valid [#]_, and that the signature it carries is valid. -* It will compare the SPKI hashof the certificate to the expected value. +* It will compare the SPKI hash of the certificate to the expected value. The specifics of the comparison are the same as for the comparison specified by `RFC 7469`_ with "sha256" [#]_. To further clarify, consider this example. @@ -117,32 +118,25 @@ Bob's client and Alice's storage node are assured of both **message authenticati Transition ~~~~~~~~~~ -Storage nodes already possess an x509 certificate. -This is used with Foolscap to provide the same security properties described in the above requirements section. +To provide a seamless user experience during this protocol transition, +there should be a period during which both protocols are supported by storage nodes. +The HTTP protocol announcement will be introduced in a way that updated client software can recognize. +Its introduction will also be made in such a way that non-updated client software disregards the new information +(of which it cannot make any use). -* The certificate is self-signed. - This remains the same. -* The certificate has a ``commonName`` of "newpb_thingy". - This is not harmful to the new protocol. -* The validity of the certificate is determined by checking the certificate digest against a value carried in the fURL. - Only a correctly signed certificate with a matching digest is accepted. - This validation will be replaced with an SPKI hash comparison. - This introduces a difference from the Foolscap protocol: - it allows the generation and use of new certificates using the same key pair. - This does not seem likely to pose any new risks. - On the contrary, - it may remove certain risks by allowing certificate renewal at certificate expiration time. 
- This will allow the certificate validation code to be simplified somewhat - (compared to the current implementation which must make an exception for validity-period-related validation errors). +Therefore, concurrent with the following, storage nodes will continue to operate their Foolscap server unaltered compared to their previous behavior. -A mixed-protocol storage node should: +Storage nodes will begin to operate a new HTTP-based server. +They may re-use their existing x509 certificate or generate a new one. +Generation of a new certificate allows for certain non-optimal conditions to be addressed: +* The ``commonName`` of ``newpb_thingy`` may be changed to a more descriptive value. +* A ``notValidAfter`` field with a timestamp in the past may be updated. -* Start the Foolscap server as it has always done. -* Start a TLS server dispatching to an HTTP server. +Storage nodes will announce a new fURL for this new HTTP-based server. +This fURL will be announced alongside their existing Foolscap-based server's fURL. - * Use the same certificate as the Foolscap server uses. - * Announce both its Foolscap fURL and its HTTP fURL. - * Accept anonymous client connections. +Non-updated clients will see the Foolscap fURL and continue with their current behavior. +Updated clients will see the Foolscap fURL *and* the HTTP fURL and prefer the HTTP fURL. A mixed-protocol client node should: @@ -360,6 +354,23 @@ Just like the immutable version. .. _RFC 7469: https://tools.ietf.org/html/rfc7469#section-2.4 +.. [#] + The security value of checking ``notValidBefore`` and ``notValidAfter`` is not entirely clear. + There is an argument to make that letting an existing TLS implementation which wants to make these checks just make them reduces overall complexity + (and, at least in general, reducing complexity is good for security). + On the other hand, checking the validity time period forces certificate regeneration. 
+ A possible compromise is to recommend very long-lived certificates + (many years, perhaps many decades?). + "Recommend" may be read as "provide software encouraging the generation of". + But what about key theft? + If certificates are valid for years then a successful attacker can pretend to be a valid storage node for years. + An introducer *might* eventually recognize such a node as an attacker and blacklist their announcements... + It's likely not all clients configured to use compromised storage server identities will be updated + (if only because there are many of them + but possibly also because there is no automatic mechanism for fixing this state). + Such clients may go on placing shares on an attacker's storage server for a long time. + Would short-validity-period certificates with automatic certificate renewal not be better? + .. [#] More simply:: From 17ae8a191b678953fe7d7727ec3a89373d814c0b Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 08:27:15 -0400 Subject: [PATCH 50/76] I like it --- docs/proposed/http-storage-node-protocol.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 90d94935b..cae08a673 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -1,7 +1,7 @@ .. -*- coding: utf-8 -*- -HTTP Storage Node Protocol -========================== +HTTP Storage Node Protocol ("Great Black Swamp", "GBS") +======================================================= The target audience for this document is Tahoe-LAFS developers. 
After reading this document, From b73e95ec306d48817a3578a389c685a53db118f6 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 08:27:24 -0400 Subject: [PATCH 51/76] discuss protocol identification --- docs/proposed/http-storage-node-protocol.rst | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index cae08a673..d30a9b42b 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -87,11 +87,12 @@ Alice operates a storage node. Alice generates a key pair and secures it properly. Alice generates a self-signed storage node certificate with the key pair. Alice's storage node announces (to an introducer) a fURL containing (among other information) the SPKI hash. -For example, ``pb://i5xb...@example.com:443/g3m5...``. +For example, ``pb://i5xb...@example.com:443/g3m5...#v=2`` [#]_. Bob creates a client node pointed at the same introducer. Bob's client node receives the announcement from Alice's storage node. -Bob's client node can now perform a TLS handshake with a server at the address indicated by the storage node fURL +Bob's client node recognizes the fURL as referring to an HTTP-dialect server due to the ``v=2`` fragment. +Bob's client node can now perform a TLS handshake with a server at the address in the fURL location hints (``example.com:443`` in this example). Following the above described validation procedures, Bob's client node can determine whether it has reached Alice's storage node or not. @@ -387,3 +388,10 @@ Just like the immutable version. assert spki_digest32 == tub_id Note we use the Tahoe-LAFS-preferred base32 encoding rather than base64. + +.. [#] + Other schemes for differentiating between the two server types are possible. + If the tubID length remains different, + that provides an unambiguous (if obscure) signal about which protocol to use. 
+ Or a different scheme could be adopted + (``[x-]pb+http``, ``x-tahoe+http``, ``x-gbs`` come to mind). From c321c937f62458c091d966951e51cbfb5fc52421 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 08:27:31 -0400 Subject: [PATCH 52/76] copy edits and another option for tubID length --- docs/proposed/http-storage-node-protocol.rst | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index d30a9b42b..228d08f29 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -107,13 +107,16 @@ Bob's client and Alice's storage node are assured of both **message authenticati .. note:: - I think Foolscap TubIDs are 20 bytes and base32 encode to 32 bytes. - SPKI information discussed here is 32 bytes and base32 encodes to 52 bytes. - https://tools.ietf.org/html/rfc7515#appendix-C may prove a better choice for encoding the information into a fURL. + Foolscap TubIDs are 20 bytes (SHA1 digest of the certificate). + They are presented with base32 encoding at a length of 32 bytes. + SPKI information discussed here is 32 bytes (SHA256 digest). + They will present in base32 as 52 bytes. + https://tools.ietf.org/html/rfc7515#appendix-C may prove a better (more compact) choice for encoding the information into a fURL. It will encode 32 bytes into merely 43... We could also choose to reduce the hash size of the SPKI information through use of another cryptographic hash (replacing sha256). A 224 bit hash function (SHA3-224, for example) might be suitable - improving the encoded length to 38 bytes. + Or we could stick with the Foolscap digest function - SHA1. 
Transition From ff48e6741825afffcc5fda6a0686e329b3d6f364 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 08:42:16 -0400 Subject: [PATCH 53/76] flop some heading levels around --- docs/proposed/http-storage-node-protocol.rst | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 228d08f29..b6e23ba4d 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -12,11 +12,12 @@ Specifically, it should be possible to implement a Tahoe-LAFS storage server wit (substituting an HTTP server implementation). The Tahoe-LAFS client will also need to change but it is not expected that it will be noticably simplified by this change. -Security --------- Requirements -~~~~~~~~~~~~ +------------ + +Security +~~~~~~~~ Summary !!!!!!! @@ -62,7 +63,7 @@ Therefore, **message confidentiality** is necessary when exchanging these secret **Forward security** is preferred so that an attacker recording an exchange today cannot launch this attack at some future point after compromising the necessary keys. Solutions -~~~~~~~~~ +--------- Communication with the storage node will take place using TLS. The TLS version and configuration will be dictated by an ongoing understanding of best practices. From 44afc1de038bd56fd117f65e45b067599922a62e Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 08:42:28 -0400 Subject: [PATCH 54/76] talk about a non-security requirement! 
--- docs/proposed/http-storage-node-protocol.rst | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index b6e23ba4d..5429e9dda 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -62,6 +62,15 @@ there is no way to write data which appears legitimate to a legitimate client). Therefore, **message confidentiality** is necessary when exchanging these secrets. **Forward security** is preferred so that an attacker recording an exchange today cannot launch this attack at some future point after compromising the necessary keys. +Functionality +------------- + +Tahoe-LAFS application-level information must be transferred using this protocol. +This information is exchanged with a dozen or so request/response-oriented messages. +Some of these messages carry large binary payloads. +Others are small structured-data messages. +Some facility for expansion to support new information exchanges should also be present. + Solutions --------- From 5ede9662bb22415f232f3f47fc9435e7b6e1161e Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 08:42:39 -0400 Subject: [PATCH 55/76] fix typo --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 5429e9dda..62308efa7 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -56,7 +56,7 @@ This is most likely done using cryptographic MACs (such as those used in TLS). The messages which enable the mutable shares feature include secrets related to those shares. For example, the write enabler secret is used to restrict the parties with write access to mutable shares. It is exchanged over the network as part of a write operation. 
-An attacker learning this secret and overwrite share data with garbage +An attacker learning this secret can overwrite share data with garbage (lacking a separate encryption key, there is no way to write data which appears legitimate to a legitimate client). Therefore, **message confidentiality** is necessary when exchanging these secrets. From bf305b91e43daa6dcd74cac1ddeb89fa3dd4dd60 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 08:42:55 -0400 Subject: [PATCH 56/76] HTTP *per se* is not a requirement --- docs/proposed/http-storage-node-protocol.rst | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 62308efa7..bb2a40149 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -1,7 +1,7 @@ .. -*- coding: utf-8 -*- -HTTP Storage Node Protocol ("Great Black Swamp", "GBS") -======================================================= +Storage Node Protocol ("Great Black Swamp", "GBS") +================================================== The target audience for this document is Tahoe-LAFS developers. After reading this document, @@ -9,9 +9,9 @@ one should expect to understand how Tahoe-LAFS clients interact over the network The primary goal of the introduction of this protocol is to simplify the task of implementing a Tahoe-LAFS storage server. Specifically, it should be possible to implement a Tahoe-LAFS storage server without a Foolscap implementation -(substituting an HTTP server implementation). -The Tahoe-LAFS client will also need to change but it is not expected that it will be noticably simplified by this change. - +(substituting a simpler GBS server implementation). +The Tahoe-LAFS client will also need to change but it is not expected that it will be noticably simplified by this change +(though this may be the first step towards simplifying it). 
Requirements ------------ @@ -22,7 +22,7 @@ Security Summary !!!!!!! -An HTTP-based protocol should offer at minimum the security properties offered by the Foolscap-based protocol. +The storage node protocol should offer at minimum the security properties offered by the Foolscap-based protocol. The Foolscap-based protocol offers: * **Peer authentication** by way of checked x509 certificates From 97176e88d42518cf99e419ef8fa3ae3d5fceac2b Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 08:43:12 -0400 Subject: [PATCH 57/76] but it is part of this proposed solution --- docs/proposed/http-storage-node-protocol.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index bb2a40149..796bfaac0 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -74,6 +74,9 @@ Some facility for expansion to support new information exchanges should also be Solutions --------- +An HTTP-based protocol, dubbed "Great Black Swamp" (or "GBS"), is described below. +This protocol aims to satisfy the above requirements at a lower level of complexity than the current Foolscap-based protocol. + Communication with the storage node will take place using TLS. The TLS version and configuration will be dictated by an ongoing understanding of best practices. The only requirement is that the certificate have a valid signature. 
From 65103445ea1bd552aa834f0551f7f66ec16fa8e8 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 08:43:19 -0400 Subject: [PATCH 58/76] secrecy is the kind of security we're talking about here --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 796bfaac0..eb5e8d849 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -60,7 +60,7 @@ An attacker learning this secret can overwrite share data with garbage (lacking a separate encryption key, there is no way to write data which appears legitimate to a legitimate client). Therefore, **message confidentiality** is necessary when exchanging these secrets. -**Forward security** is preferred so that an attacker recording an exchange today cannot launch this attack at some future point after compromising the necessary keys. +**Forward secrecy** is preferred so that an attacker recording an exchange today cannot launch this attack at some future point after compromising the necessary keys. Functionality ------------- From ab37b5eabb7cf8a6883fb2466dc87fadbefd25ac Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 09:00:10 -0400 Subject: [PATCH 59/76] clean up the description of the tls usage --- docs/proposed/http-storage-node-protocol.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index eb5e8d849..691ede2be 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -79,10 +79,10 @@ This protocol aims to satisfy the above requirements at a lower level of complex Communication with the storage node will take place using TLS. 
The TLS version and configuration will be dictated by an ongoing understanding of best practices. -The only requirement is that the certificate have a valid signature. -The storage node will publish the corresponding Subject Public Key Information hash (SPKI hash) -(e.g., via an introducer). -The SPKI hash will constitute the storage node's identity. +The storage node will present an x509 certificate during the TLS handshake. +Storage clients will require that the certificate have a valid signature. +The Subject Public Key Information (SPKI) hash of the certificate will constitute the storage node's identity. +The **tub id** portion of the storage node fURL will be replaced with the SPKI hash. When connecting to a storage node, the client will take the following steps to gain confidence it has reached the intended peer: From 504452f1fd1e9eaab00e104d394df130c715fb7c Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 09:00:30 -0400 Subject: [PATCH 60/76] clean up description of certificate validity period --- docs/proposed/http-storage-node-protocol.rst | 35 ++++++++++++-------- 1 file changed, 22 insertions(+), 13 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 691ede2be..502c016bb 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -373,20 +373,29 @@ Just like the immutable version. .. [#] The security value of checking ``notValidBefore`` and ``notValidAfter`` is not entirely clear. - There is an argument to make that letting an existing TLS implementation which wants to make these checks just make them reduces overall complexity - (and, at least in general, reducing complexity is good for security). - On the other hand, checking the validity time period forces certificate regeneration. - A possible compromise is to recommend very long-lived certificates - (many years, perhaps many decades?). 
- "Recommend" may be read as "provide software encouraging the generation of". - But what about key theft? + The arguments which apply to web-facing certificates do not seem to apply + (due to the decision for Tahoe-LAFS to operate independently of the web-oriented CA system). + + There is an argument to make that complexity is reduced by allowing an existing TLS implementation which wants to make these checks make them + (compared to including additional code to either bypass them or disregard their results). + Reducing complexity, at least in general, is often good for security. + + On the other hand, checking the validity time period forces certificate regeneration + (which comes with its own set of complexity). + + A possible compromise is to recommend very certificates with validity periods of many years or decades. + "Recommend" may be read as "provide software supporting the generation of". + + What about key theft? If certificates are valid for years then a successful attacker can pretend to be a valid storage node for years. - An introducer *might* eventually recognize such a node as an attacker and blacklist their announcements... - It's likely not all clients configured to use compromised storage server identities will be updated - (if only because there are many of them - but possibly also because there is no automatic mechanism for fixing this state). - Such clients may go on placing shares on an attacker's storage server for a long time. - Would short-validity-period certificates with automatic certificate renewal not be better? + However, short-validity-period certificates are no help in this case. + The attacker can generate new, valid certificates using the stolen keys. + + Therefore, the only recourse to key theft + (really *identity theft*) + is to burn the identity and generate a new one. + Burning the identity is a non-trivial task. + It is worth solving but it is not solved here. .. 
[#] More simply:: From acf541a0be5f6cf62c18f0862655481ff767de85 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 09:08:59 -0400 Subject: [PATCH 61/76] try to make the example more useful --- docs/proposed/http-storage-node-protocol.rst | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 502c016bb..23dc5b948 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -100,7 +100,8 @@ Alice operates a storage node. Alice generates a key pair and secures it properly. Alice generates a self-signed storage node certificate with the key pair. Alice's storage node announces (to an introducer) a fURL containing (among other information) the SPKI hash. -For example, ``pb://i5xb...@example.com:443/g3m5...#v=2`` [#]_. +Imagine the SPKI hash is ``i5xb...``. +This results in a fURL of ``pb://i5xb...@example.com:443/g3m5...#v=2`` [#]_. Bob creates a client node pointed at the same introducer. Bob's client node receives the announcement from Alice's storage node. @@ -109,9 +110,7 @@ Bob's client node can now perform a TLS handshake with a server at the address i (``example.com:443`` in this example). Following the above described validation procedures, Bob's client node can determine whether it has reached Alice's storage node or not. -If and only if the SPKI hash matches the value in the published fURL -(``i5xb...`` in this example) -then Alice's storage node has been contacted. +If and only if the validation procedure is successful does Bob's client node conclude it has reached Alice's storage node. **Peer authentication** has been achieved. 
Additionally, From 534b8db3180c6d3afefbbdc4d478724e831f3702 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 09:57:18 -0400 Subject: [PATCH 62/76] markup and spelling --- docs/proposed/http-storage-node-protocol.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 23dc5b948..9f1df9a89 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -144,7 +144,8 @@ Therefore, concurrent with the following, storage nodes will continue to operate Storage nodes will begin to operate a new HTTP-based server. They may re-use their existing x509 certificate or generate a new one. -Generation of a new certificate allows for certain non-optimal conditions to be address:: +Generation of a new certificate allows for certain non-optimal conditions to be addressed: + * The ``commonName`` of ``newpb_thingy`` may be changed to a more descriptive value. * A ``notValidAfter`` field with a timestamp in the past may be updated. From a592053b18454a3fe3b558cea1f4916cd4524927 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 09:57:29 -0400 Subject: [PATCH 63/76] refer to GBS more than HTTP --- docs/proposed/http-storage-node-protocol.rst | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 9f1df9a89..556901014 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -136,13 +136,11 @@ Transition To provide a seamless user experience during this protocol transition, there should be a period during which both protocols are supported by storage nodes. -The HTTP protocol announcement will be introduced in a way that updated client software can recognize. 
-Its introduction will also be made in such a way that non-updated client software disregards the new information +The GBS announcement will be introduced in a way that *updated client* software can recognize. +Its introduction will also be made in such a way that *non-updated client* software disregards the new information (of which it cannot make any use). -Therefore, concurrent with the following, storage nodes will continue to operate their Foolscap server unaltered compared to their previous behavior. - -Storage nodes will begin to operate a new HTTP-based server. +Storage nodes will begin to operate a new GBS server. They may re-use their existing x509 certificate or generate a new one. Generation of a new certificate allows for certain non-optimal conditions to be addressed: From 3d3c3d2eb4208bf9297a0f9d6bb0441efafa1fef Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 22 May 2018 09:57:39 -0400 Subject: [PATCH 64/76] elaborate on the transition stages talk about cases of each stage and desired behavior --- docs/proposed/http-storage-node-protocol.rst | 57 ++++++++++++++++---- 1 file changed, 46 insertions(+), 11 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 556901014..6934eb78e 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -149,22 +149,57 @@ Generation of a new certificate allows for certain non-optimal conditions to be Storage nodes will announce a new fURL for this new HTTP-based server. This fURL will be announced alongside their existing Foolscap-based server's fURL. +Such an announcement will resemble this:: -Non-updated clients will see the Foolscap fURL and continue with their current behavior. -Updated clients will see the Foolscap fURL *and* the HTTP fURL and prefer the HTTP fURL. 
+ { + "anonymous-storage-FURL": "pb://...", # The old key + "anonymous-storage-gbs-FURL": "pb://...#v=2" # The new key + } -A mixed-protocol client node should: +The transition process will proceed in three stages: -* If it is configured with a storage URI, connect using HTTP over TLS. -* If it is configured with a storage fURL, connect using Foolscap. - If the server version indicates support for the new protocol: +1. The first stage represents the starting conditions in which clients and servers can speak only Foolscap. +#. The intermediate stage represents a condition in which some clients and servers can both speak Foolscap and GBS. +#. The final stage represents the desired condition in which all clients and servers speak only GBS. - * Attempt to connect using the new protocol. - * Drop the Foolscap connection if this new connection succeeds. +During the first stage only one client/server interaction is possible: +the storage server announces only Foolscap and speaks only Foolscap. +During the final stage there is only one supported interaction: +the client and server are both updated and speak GBS to each other. -Client node implementations could cache a successful protocol upgrade. -This would avoid the double connection on subsequent startups. -This is left as a decision for the implementation, though. +During the intermediate stage there are four supported interactions: + +1. Both the client and server are non-updated. + The interaction is just as it would be during the first stage. +#. The client is updated and the server is non-updated. + The client will see the Foolscap announcement and the lack of a GBS announcement. + It will speak to the server using Foolscap. +#. The client is non-updated and the server is updated. + The client will see the Foolscap announcement. + It will speak Foolscap to the storage server. +#. Both the client and server are updated. + The client will see the GBS announcement and disregard the Foolscap announcement. 
+ It will speak GBS to the server. + +There is one further complication: +the client maintains a cache of storage server information +(to avoid continuing to rely on the introducer after it has been introduced). +The follow sequence of events is likely: + +1. The client connects to an introducer. +#. It receives an announcement for a non-updated storage server (Foolscap only). +#. It caches this announcement. +#. At some point, the storage server is updated. +#. The client uses the information in its cache to open a Foolscap connection to the storage server. + +Ideally, +the client would not rely on an update from the introducer to give it the GBS fURL for the updated storage server. +Therefore, +when an updated client connects to a storage server using Foolscap, +it should request the server's version information. +If this information indicates that GBS is supported then the client should cache this GBS information. +On subsequent connection attempts, +it should make use of this GBS information. Server Details -------------- From 4e10f7971a43227049f1b56887bdc6ec8ffd434c Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Tue, 29 May 2018 10:52:37 -0400 Subject: [PATCH 65/76] discuss decision to use query args --- docs/proposed/http-storage-node-protocol.rst | 23 ++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 6934eb78e..523b2094a 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -322,6 +322,20 @@ For example:: 7: ["baz", "quux"] } +Discussion +`````````` + +Offset and size of the requested data are specified here as query arguments. +Instead, this information could be present in a ``Range`` header in the request. +This is the more obvious choice and leverages an HTTP feature built for exactly this use-case. 
+However, HTTP requires that the ``Content-Type`` of the response to "range requests" be ``multipart/...``. +The ``multipart`` major type brings along string sentinel delimiting as a means to frame the different response parts. +There are many drawbacks to this framing technique: + +1. It is resource-intensive to generate. +2. It is resource-intensive to parse. +3. It is complex to parse safely [#]_ [#]_ [#]_ [#]_. + Mutable ------- @@ -453,3 +467,12 @@ Just like the immutable version. that provides an unambiguous (if obscure) signal about which protocol to use. Or a different scheme could be adopted (``[x-]pb+http``, ``x-tahoe+http``, ``x-gbs`` come to mind). + +.. [#] + https://www.cvedetails.com/cve/CVE-2017-5638/ +.. [#] + https://pivotal.io/security/cve-2018-1272 +.. [#] + https://nvd.nist.gov/vuln/detail/CVE-2017-5124 +.. [#] + https://efail.de/ From b8cfee79e38d85131e574d96f5e6b722d7b54f9a Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 6 Jun 2018 13:31:34 -0400 Subject: [PATCH 66/76] frame it a little more --- docs/proposed/http-storage-node-protocol.rst | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 523b2094a..47b56390f 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -204,6 +204,12 @@ it should make use of this GBS information. Server Details -------------- +The protocol primarily enables interaction with "resources" of two types: +storage indexes +and shares. +A particular resource is addressed by the HTTP request path. +Details about the interface are encoded in the HTTP message body. + JSON is used throughout for the examples but is likely not the preferred encoding. The structure of the examples should nevertheless be representative. 
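Note on the query-argument decision in PATCH 65: the following is a minimal sketch, not part of the proposal, of how a client might assemble the repeated ``share``, ``offset`` and ``size`` arguments. The helper name ``build_read_query`` is hypothetical, and the argument order simply follows the ``?share=...&offset=...&size=...`` shape used for mutable reads elsewhere in this series.

```python
from urllib.parse import parse_qs, urlencode


def build_read_query(shares, ranges):
    """Build a query string for a ranged read (hypothetical helper).

    shares: iterable of share numbers
    ranges: iterable of (offset, size) pairs
    """
    # One "share" argument per requested share number.
    params = [("share", str(share)) for share in shares]
    for offset, size in ranges:
        if offset < 0 or size < 0:
            raise ValueError("offset and size must be non-negative")
        # Offsets and sizes are paired by position, so order matters.
        params.append(("offset", str(offset)))
        params.append(("size", str(size)))
    # urlencode preserves the order of a list of pairs.
    return urlencode(params)


query = build_read_query([1, 5], [(0, 1024), (4096, 512)])
parsed = parse_qs(query)
print(query)
```

Because the ranges travel as plain query arguments, the response can carry the requested bytes in an ordinary body, with none of the ``multipart`` framing that the discussion argues against.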
From c3011a434b5cf70f1ff507ea7002b5ec4649e7f1 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 6 Jun 2018 13:46:45 -0400 Subject: [PATCH 67/76] Specify preferred encoding and encoding negotiation --- docs/proposed/http-storage-node-protocol.rst | 26 ++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 47b56390f..3057a8194 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -210,8 +210,26 @@ and shares. A particular resource is addressed by the HTTP request path. Details about the interface are encoded in the HTTP message body. -JSON is used throughout for the examples but is likely not the preferred encoding. -The structure of the examples should nevertheless be representative. +Message Encoding +~~~~~~~~~~~~~~~~ + +The preferred encoding for HTTP message bodies is `CBOR`_. +A request may be submitted using an alternate encoding by declaring this in the ``Content-Type`` header. +A request may indicate its preference for an alternate encoding in the response using the ``Accept`` header. +These two headers are used in the typical way for an HTTP application. + +The only other encoding support for which is currently recommended is JSON. +For HTTP messages carrying binary share data, +this is expected to be a particularly poor encoding. +However, +for HTTP messages carrying small payloads of strings, numbers, and containers +it is expected that JSON will be more convenient than CBOR for ad hoc testing and manual interaction. + +For this same reason, +JSON is used throughout for the examples presented here. +Because of the simple types used throughout +and the equivalence described in `RFC 7049`_ +these examples should be representative regardless of which of these two encodings is chosen. General ~~~~~~~ @@ -424,6 +442,10 @@ Just like the immutable version. .. 
_RFC 7469: https://tools.ietf.org/html/rfc7469#section-2.4 +.. _RFC 7049: https://tools.ietf.org/html/rfc7049#section-4 + +.. _CBOR: http://cbor.io/ + .. [#] The security value of checking ``notValidBefore`` and ``notValidAfter`` is not entirely clear. The arguments which apply to web-facing certificates do not seem to apply From c43eacc3a9b976e69e9757f3ddecedf3c2fbbc1b Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 13 Jun 2018 08:27:45 -0400 Subject: [PATCH 68/76] clarify which party is vulnerable --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 3057a8194..d6591217b 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -36,7 +36,7 @@ Discussion !!!!!!!!!! A client node relies on a storage node to persist certain data until a future retrieval request is made. -In this way, the node is vulnerable to attacks which cause the data not to be persisted. +In this way, the client node is vulnerable to attacks which cause the data not to be persisted. Though this vulnerability can be (and typically is) mitigated by including redundancy in the share encoding parameters for stored data, it is still sensible to attempt to minimize unnecessary vulnerability to this attack. 
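Note on the encoding negotiation described in PATCH 67: a rough sketch of how a server might pick a response encoding from the ``Accept`` header, preferring CBOR and falling back to JSON. This is an illustration under assumptions, not the proposal's specified behavior. The function name ``choose_response_type`` is hypothetical, the parsing is deliberately simplified (it handles only ``q`` parameters and two wildcard forms), and the tie-break in favor of CBOR is an assumption drawn from the "preferred encoding" wording.

```python
CBOR = "application/cbor"
JSON = "application/json"
# Ordered by server preference: CBOR first, JSON second.
SUPPORTED = (CBOR, JSON)


def choose_response_type(accept_header=None):
    """Return the media type to respond with, or None if nothing fits."""
    if not accept_header:
        # No stated preference: use the preferred encoding.
        return CBOR
    best = None
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media = parts[0].lower()
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality <= 0:
            continue
        wildcard = media in ("*/*", "application/*")
        for rank, supported in enumerate(SUPPORTED):
            if wildcard or media == supported:
                # Higher q wins; on equal q the server preference wins.
                key = (quality, -rank)
                if best is None or key > best[0]:
                    best = (key, supported)
    return best[1] if best else None
```

A ``None`` result would presumably map to an HTTP 406 response. Decoding of CBOR bodies is left out because it needs a third-party library.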
From 145ee3b6ab15a7d2fee584cf0f03d30d992deed9 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 27 Jun 2018 16:39:02 -0400 Subject: [PATCH 69/76] mention the introducer --- docs/proposed/http-storage-node-protocol.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index d6591217b..0c6803ef2 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -103,7 +103,8 @@ Alice's storage node announces (to an introducer) a fURL containing (among other Imagine the SPKI hash is ``i5xb...``. This results in a fURL of ``pb://i5xb...@example.com:443/g3m5...#v=2`` [#]_. Bob creates a client node pointed at the same introducer. -Bob's client node receives the announcement from Alice's storage node. +Bob's client node receives the announcement from Alice's storage node +(indirected through the introducer). Bob's client node recognizes the fURL as referring to an HTTP-dialect server due to the ``v=2`` fragment. Bob's client node can now perform a TLS handshake with a server at the address in the fURL location hints From 4e5ec27d50263cb3440ba19e7dba4555016f291d Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 27 Jun 2018 16:49:45 -0400 Subject: [PATCH 70/76] Use that : notation consistently here --- docs/proposed/http-storage-node-protocol.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 0c6803ef2..14935ad14 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -429,8 +429,8 @@ For example:: [1, 5] -``GET /v1/mutable/:storage_index?share=:s0&share=:sN&offset=o1&size=z0&offset=oN&size=zN`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
+``GET /v1/mutable/:storage_index?share=:s0&share=:sN&offset=:o1&size=:z0&offset=:oN&size=:zN`` +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Read data from the indicated mutable shares. Just like ``GET /v1/mutable/:storage_index``. From 4cd018fc11d699d6cc2bd6e57aea59dbd35f2f16 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 27 Jun 2018 16:51:47 -0400 Subject: [PATCH 71/76] Consistently name the gbs information And replace the flag with the full information otherwise the client cannot find the gbs server without talking to the introducer again. --- docs/proposed/http-storage-node-protocol.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 14935ad14..a1ea262a5 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -154,7 +154,7 @@ Such an announcement will resemble this:: { "anonymous-storage-FURL": "pb://...", # The old key - "anonymous-storage-gbs-FURL": "pb://...#v=2" # The new key + "gbs-anonymous-storage-url": "pb://...#v=2" # The new key } The transition process will proceed in three stages: @@ -250,7 +250,7 @@ For example:: "delete-mutable-shares-with-zero-length-writev": true, "fills-holes-with-zero-bytes": true, "prevents-read-past-end-of-share-data": true, - "http-protocol-available": true + "gbs-anonymous-storage-url": "pb://...#v=2" }, "application-version": "1.13.0" } From 209c8694f94f689159078fbdef0190ad9154e484 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 27 Jun 2018 16:53:17 -0400 Subject: [PATCH 72/76] Simplify language --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index a1ea262a5..cc1dc6c1f 100644 --- 
a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -452,7 +452,7 @@ Just like the immutable version. The arguments which apply to web-facing certificates do not seem to apply (due to the decision for Tahoe-LAFS to operate independently of the web-oriented CA system). - There is an argument to make that complexity is reduced by allowing an existing TLS implementation which wants to make these checks make them + Arguably, complexity is reduced by allowing an existing TLS implementation which wants to make these checks make them (compared to including additional code to either bypass them or disregard their results). Reducing complexity, at least in general, is often good for security. From ff12263ed536e5704b27071d0c35859d4c179c75 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 27 Jun 2018 16:53:37 -0400 Subject: [PATCH 73/76] remove an extra extra word --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index cc1dc6c1f..0df4cb743 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -459,7 +459,7 @@ Just like the immutable version. On the other hand, checking the validity time period forces certificate regeneration (which comes with its own set of complexity). - A possible compromise is to recommend very certificates with validity periods of many years or decades. + A possible compromise is to recommend certificates with validity periods of many years or decades. "Recommend" may be read as "provide software supporting the generation of". What about key theft? From 250465f810eccac53410002ebb78d0da9bd6b3f7 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 29 Jun 2018 11:11:30 -0400 Subject: [PATCH 74/76] Discard base32 and SHA1. 
--- docs/proposed/http-storage-node-protocol.rst | 31 ++++++++++++-------- 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 0df4cb743..fcc6a648a 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -121,15 +121,17 @@ Bob's client and Alice's storage node are assured of both **message authenticati .. note:: Foolscap TubIDs are 20 bytes (SHA1 digest of the certificate). - They are presented with base32 encoding at a length of 32 bytes. + They are encoded with Base32 for a length of 32 bytes. SPKI information discussed here is 32 bytes (SHA256 digest). - They will present in base32 as 52 bytes. - https://tools.ietf.org/html/rfc7515#appendix-C may prove a better (more compact) choice for encoding the information into a fURL. - It will encode 32 bytes into merely 43... - We could also choose to reduce the hash size of the SPKI information through use of another cryptographic hash (replacing sha256). + They would be encoded in Base32 for a length of 52 bytes. + `base64url`_ provides a more compact encoding of the information while remaining URL-compatible. + This would encode the SPKI information for a length of merely 43 bytes. + SHA1, + the current Foolscap hash function, + is not a practical choice at this time due to advances made in `attacking SHA1`_. + The selection of a safe hash function with output smaller than SHA256 could be the subject of future improvements. A 224 bit hash function (SHA3-224, for example) might be suitable - improving the encoded length to 38 bytes. - Or we could stick with the Foolscap digest function - SHA1. Transition @@ -481,14 +483,15 @@ Just like the immutable version. 
       Encoding,
       PublicFormat,
     )
-    from foolscap import base32
+    from pybase64 import urlsafe_b64encode
 
-    spki_bytes = cert.public_key().public_bytes(Encoding.DER, PublicFormat.SubjectPublicKeyInfo)
-    spki_sha256 = sha256(spki_bytes).digest()
-    spki_digest32 = base32.encode(spki_sha256)
-    assert spki_digest32 == tub_id
+    def check_tub_id(tub_id):
+        spki_bytes = cert.public_key().public_bytes(Encoding.DER, PublicFormat.SubjectPublicKeyInfo)
+        spki_sha256 = sha256(spki_bytes).digest()
+        spki_encoded = urlsafe_b64encode(spki_sha256)
+        assert spki_encoded == tub_id
 
-   Note we use the Tahoe-LAFS-preferred base32 encoding rather than base64.
+   Note we use `base64url`_ rather than the Foolscap- and Tahoe-LAFS-preferred Base32.
 
 .. [#]
    Other schemes for differentiating between the two server types is possible.
@@ -505,3 +508,7 @@ Just like the immutable version.
    https://nvd.nist.gov/vuln/detail/CVE-2017-5124
 .. [#]
    https://efail.de/
+
+.. _base64url: https://tools.ietf.org/html/rfc7515#appendix-C
+
+.. _attacking SHA1: https://en.wikipedia.org/wiki/SHA-1#Attacks

From 635c0c5db0606d41a950a93711e92188846dd116 Mon Sep 17 00:00:00 2001
From: Jean-Paul Calderone
Date: Fri, 29 Jun 2018 11:30:28 -0400
Subject: [PATCH 75/76] Slots are not separately, explicitly created

---
 docs/proposed/http-storage-node-protocol.rst | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst
index fcc6a648a..68b5e727c 100644
--- a/docs/proposed/http-storage-node-protocol.rst
+++ b/docs/proposed/http-storage-node-protocol.rst
@@ -369,12 +369,6 @@ Mutable
 Writing
 ~~~~~~~
 
-``POST /v1/mutable/:storage_index``
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-
-Initialize a mutable storage index with some buckets.
-Essentially the same as the API for initializing an immutable storage index.
-
 ``POST /v1/mutable/:storage_index/read-test-write``
 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 
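The encoded lengths quoted in the PATCH 74/76 note (32, 52, 43 and 38 bytes) can be checked with a short sketch.
This is only an illustration using the Python standard library, not the code from the patch:
the helper names ``base32_unpadded``, ``base64url_unpadded`` and ``encode_spki_hash`` are hypothetical,
the input is placeholder bytes rather than a real SPKI structure,
and it assumes the quoted lengths refer to encodings with the trailing ``=`` padding removed.

```python
from base64 import b32encode, urlsafe_b64encode
from hashlib import sha1, sha256, sha3_224


def base32_unpadded(data: bytes) -> bytes:
    """Base32-encode data and drop the trailing "=" padding."""
    return b32encode(data).rstrip(b"=")


def base64url_unpadded(data: bytes) -> bytes:
    """base64url-encode data and drop the trailing "=" padding (RFC 7515, Appendix C)."""
    return urlsafe_b64encode(data).rstrip(b"=")


def encode_spki_hash(spki_der: bytes) -> bytes:
    """Hypothetical helper: SHA256 the DER-encoded SPKI bytes, then base64url-encode."""
    return base64url_unpadded(sha256(spki_der).digest())


# Placeholder input only; a real caller would pass DER-encoded SPKI bytes.
example = b"placeholder bytes, not a real SPKI structure"

lengths = {
    "SHA1 + Base32": len(base32_unpadded(sha1(example).digest())),
    "SHA256 + Base32": len(base32_unpadded(sha256(example).digest())),
    "SHA256 + base64url": len(encode_spki_hash(example)),
    "SHA3-224 + base64url": len(base64url_unpadded(sha3_224(example).digest())),
    "SHA256 + base64url, padded": len(urlsafe_b64encode(sha256(example).digest())),
}

for name, length in lengths.items():
    print(f"{name}: {length} bytes")
```

The last entry shows why padding matters: the standard library's ``urlsafe_b64encode`` keeps the ``=`` padding, so a 32 byte digest encodes to 44 bytes rather than 43.
If the identifier in the fURL is meant to be the 43 byte form, a check such as ``check_tub_id`` would presumably need to strip padding from the encoded hash (or add it to the identifier) before comparing.
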
From 687c4c8f4f7905a11375fc4b2d8411d9f38a7e39 Mon Sep 17 00:00:00 2001
From: Jean-Paul Calderone
Date: Fri, 29 Jun 2018 11:30:45 -0400
Subject: [PATCH 76/76] Talk about lack of creation

---
 docs/proposed/http-storage-node-protocol.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst
index 68b5e727c..d0bd8cfd6 100644
--- a/docs/proposed/http-storage-node-protocol.rst
+++ b/docs/proposed/http-storage-node-protocol.rst
@@ -373,6 +373,12 @@ Writing
 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 
 General purpose read-test-and-write operation for mutable storage indexes.
+A mutable storage index is also called a "slot"
+(particularly by the existing Tahoe-LAFS codebase).
+The first write operation on a mutable storage index creates it
+(that is,
+there is no separate "create this storage index" operation as there is for the immutable storage index type).
+
 The request body includes the secrets necessary to rewrite to the shares along with test, read, and write vectors for the operation.
 For example::