From a8832b11b6e365564d1b53f35bd885353d550841 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 20 Jan 2023 14:29:17 -0500 Subject: [PATCH 01/51] Start adapting language to narrow down possible interpretations --- docs/proposed/http-storage-node-protocol.rst | 113 ++++++++++++++----- 1 file changed, 84 insertions(+), 29 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index aee201cf5..397d64ec2 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -3,7 +3,7 @@ Storage Node Protocol ("Great Black Swamp", "GBS") ================================================== -The target audience for this document is Tahoe-LAFS developers. +The target audience for this document is developers working on Tahoe-LAFS or on an alternate implementation intended to be interoperable. After reading this document, one should expect to understand how Tahoe-LAFS clients interact over the network with Tahoe-LAFS storage nodes. @@ -64,6 +64,10 @@ Glossary lease renew secret a short secret string which storage servers required to be presented before allowing a particular lease to be renewed +The key words +"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" +in this document are to be interpreted as described in RFC 2119. + Motivation ---------- @@ -119,8 +123,8 @@ An HTTP-based protocol can make use of TLS in largely the same way to provide th Provision of these properties *is* dependant on implementers following Great Black Swamp's rules for x509 certificate validation (rather than the standard "web" rules for validation). -Requirements ------------- +Design Requirements +------------------- Security ~~~~~~~~ @@ -189,6 +193,9 @@ Solutions An HTTP-based protocol, dubbed "Great Black Swamp" (or "GBS"), is described below. 
This protocol aims to satisfy the above requirements at a lower level of complexity than the current Foolscap-based protocol. +Summary (Non-normative) +!!!!!!!!!!!!!!!!!!!!!!! + Communication with the storage node will take place using TLS. The TLS version and configuration will be dictated by an ongoing understanding of best practices. The storage node will present an x509 certificate during the TLS handshake. @@ -240,7 +247,7 @@ When Bob's client issues HTTP requests to Alice's storage node it includes the * They are encoded with Base32 for a length of 32 bytes. SPKI information discussed here is 32 bytes (SHA256 digest). They would be encoded in Base32 for a length of 52 bytes. - `base64url`_ provides a more compact encoding of the information while remaining URL-compatible. + `unpadded base64url`_ provides a more compact encoding of the information while remaining URL-compatible. This would encode the SPKI information for a length of merely 43 bytes. SHA1, the current Foolscap hash function, @@ -332,12 +339,15 @@ Details about the interface are encoded in the HTTP message body. Message Encoding ~~~~~~~~~~~~~~~~ -The preferred encoding for HTTP message bodies is `CBOR`_. -A request may be submitted using an alternate encoding by declaring this in the ``Content-Type`` header. -A request may indicate its preference for an alternate encoding in the response using the ``Accept`` header. -These two headers are used in the typical way for an HTTP application. +Clients and servers MUST use the ``Content-Type`` and ``Accept`` header fields as specified in `RFC 9110`_ for message body negotiation. -The only other encoding support for which is currently recommended is JSON. +The encoding for HTTP message bodies SHOULD be `CBOR`_. +Clients submitting requests using this encoding MUST include a ``Content-Type: application/cbor`` request header field. +A request MAY be submitted using an alternate encoding by declaring this in the ``Content-Type`` header field. 
+A request MAY indicate its preference for an alternate encoding in the response using the ``Accept`` header field. +A request which includes no ``Accept`` header field MUST be interpreted in the same way as a request including an ``Accept: application/cbor`` header field. + +Clients and servers SHOULD support ``application/json`` request and response message body encoding. For HTTP messages carrying binary share data, this is expected to be a particularly poor encoding. However, @@ -350,10 +360,19 @@ Because of the simple types used throughout and the equivalence described in `RFC 7049`_ these examples should be representative regardless of which of these two encodings is chosen. -The one exception is sets. -For CBOR messages, any sequence that is semantically a set (i.e. no repeated values allowed, order doesn't matter, and elements are hashable in Python) should be sent as a set. -Tag 6.258 is used to indicate sets in CBOR; see `the CBOR registry `_ for more details. -Sets will be represented as JSON lists in examples because JSON doesn't support sets. +One exception to this rule is for sets. +For CBOR messages, +any sequence that is semantically a set (i.e. no repeated values allowed, order doesn't matter, and elements are hashable in Python) should be sent as a set. +Tag 6.258 is used to indicate sets in CBOR; +see `the CBOR registry `_ for more details. +The JSON encoding does not support sets. +Sets MUST be represented as arrays in JSON-encoded messages. + +Another exception to this rule is for bytes. +The CBOR encoding natively supports a bytes type while the JSON encoding does not. +Bytes MUST be represented as strings giving the `Base64`_ representation of the original bytes value. + +Clients and servers MAY support additional request and response message body encodings. HTTP Design ~~~~~~~~~~~ @@ -368,29 +387,49 @@ one branch contains all of the share data; another branch contains all of the lease data; etc.
-An ``Authorization`` header in requests is required for all endpoints. -The standard HTTP authorization protocol is used. -The authentication *type* used is ``Tahoe-LAFS``. -The swissnum from the NURL used to locate the storage service is used as the *credentials*. -If credentials are not presented or the swissnum is not associated with a storage service then no storage processing is performed and the request receives an ``401 UNAUTHORIZED`` response. +Clients and servers MUST use the ``Authorization`` header field, +as specified in `RFC 9110`_, +for authorization of all requests to all endpoints specified here. +The authentication *type* MUST be ``Tahoe-LAFS``. +Clients MUST present the swissnum from the NURL used to locate the storage service as the *credentials*. -There are also, for some endpoints, secrets sent via ``X-Tahoe-Authorization`` headers. -If these are: +If credentials are not presented or the swissnum is not associated with a storage service then the server MUST issue a ``401 UNAUTHORIZED`` response and perform no other processing of the message. + +Requests to certain endpoints MUST include additional secrets in ``X-Tahoe-Authorization`` header fields. +The endpoints which require these secrets are: + +* ``PUT /storage/v1/lease/:storage_index``: + The secrets included MUST be ``lease-renew-secret`` and ``lease-cancel-secret``. + +* ``POST /storage/v1/immutable/:storage_index``: + The secrets included MUST be ``lease-renew-secret``, ``lease-cancel-secret``, and ``upload-secret``. + +* ``PATCH /storage/v1/immutable/:storage_index/:share_number``: + The secrets included MUST be ``upload-secret``. + +* ``PUT /storage/v1/immutable/:storage_index/:share_number/abort``: + The secrets included MUST be ``upload-secret``. + +* ``POST /storage/v1/mutable/:storage_index/read-test-write``: + The secrets included MUST be ``lease-renew-secret``, ``lease-cancel-secret``, and ``write-enabler``. + +If these secrets are: 1. Missing. 2. The wrong length. 3.
Not the expected kind of secret. 4. They are otherwise unparseable before they are actually semantically used. -the server will respond with ``400 BAD REQUEST``. +the server MUST respond with ``400 BAD REQUEST`` and perform no other processing of the message. 401 is not used because this isn't an authorization problem, this is a "you sent garbage and should know better" bug. -If authorization using the secret fails, then a ``401 UNAUTHORIZED`` response should be sent. +If authorization using the secret fails, +then the server MUST send a ``401 UNAUTHORIZED`` response and perform no other processing of the message. Encoding ~~~~~~~~ -* ``storage_index`` should be base32 encoded (RFC3548) in URLs. +* ``storage_index`` MUST be `Base32`_ encoded in URLs. General ~~~~~~~ @@ -398,11 +437,14 @@ General ``GET /storage/v1/version`` !!!!!!!!!!!!!!!!!!!!!!!!!!! -Retrieve information about the version of the storage server. -Information is returned as an encoded mapping. -For example:: +This endpoint allows clients to retrieve some basic metadata about a storage server from the storage service. +The response MUST represent a mapping from schema identifiers to the metadata. - { "http://allmydata.org/tahoe/protocols/storage/v1" : +The only schema identifier specified is ``"http://allmydata.org/tahoe/protocols/storage/v1"``. +The server MUST include an entry in the mapping with this key. +The value for the key MUST be another mapping with the following keys and value types:: + + { "http://allmydata.org/tahoe/protocols/storage/v1": { "maximum-immutable-share-size": 1234, "maximum-mutable-share-size": 1235, "available-space": 123456, @@ -414,6 +456,11 @@ For example:: "application-version": "1.13.0" } +The server SHOULD populate as many fields as possible with accurate information about itself. + +XXX Document every single field + + ``PUT /storage/v1/lease/:storage_index`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
@@ -926,10 +973,18 @@ otherwise it will read a byte which won't match `b""`:: 204 NO CONTENT +.. _Base64: https://www.rfc-editor.org/rfc/rfc4648#section-4 + +.. _Base32: https://www.rfc-editor.org/rfc/rfc4648#section-6 + +.. _RFC 4648: https://tools.ietf.org/html/rfc4648 + .. _RFC 7469: https://tools.ietf.org/html/rfc7469#section-2.4 .. _RFC 7049: https://tools.ietf.org/html/rfc7049#section-4 +.. _RFC 9110: https://tools.ietf.org/html/rfc9110 + .. _CBOR: http://cbor.io/ .. [#] @@ -974,7 +1029,7 @@ otherwise it will read a byte which won't match `b""`:: spki_encoded = urlsafe_b64encode(spki_sha256) assert spki_encoded == tub_id - Note we use `base64url`_ rather than the Foolscap- and Tahoe-LAFS-preferred Base32. + Note we use `unpadded base64url`_ rather than the Foolscap- and Tahoe-LAFS-preferred Base32. .. [#] https://www.cvedetails.com/cve/CVE-2017-5638/ @@ -985,6 +1040,6 @@ otherwise it will read a byte which won't match `b""`:: .. [#] https://efail.de/ -.. _base64url: https://tools.ietf.org/html/rfc7515#appendix-C +.. _unpadded base64url: https://tools.ietf.org/html/rfc7515#appendix-C .. _attacking SHA1: https://en.wikipedia.org/wiki/SHA-1#Attacks From 7b207383088ce1f866bc6442e07b5f675ceea4b7 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 23 Jan 2023 10:40:41 -0500 Subject: [PATCH 02/51] some more edits --- docs/proposed/http-storage-node-protocol.rst | 128 ++++++++++--------- 1 file changed, 70 insertions(+), 58 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 397d64ec2..cff6dc67b 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -438,27 +438,28 @@ General !!!!!!!!!!!!!!!!!!!!!!!!!!! This endpoint allows clients to retrieve some basic metadata about a storage server from the storage service. -The response MUST represent a mapping from schema identifiers to the metadata. 
+The response MUST validate against this CDDL schema:: -The only schema identifier specified is ``"http://allmydata.org/tahoe/protocols/storage/v1"``. -The server MUST include an entry in the mapping with this key. -The value for the key MUST be another mapping with the following keys and value types:: + {'http://allmydata.org/tahoe/protocols/storage/v1' => { + 'maximum-immutable-share-size' => uint + 'maximum-mutable-share-size' => uint + 'available-space' => uint + 'tolerates-immutable-read-overrun' => bool + 'delete-mutable-shares-with-zero-length-writev' => bool + 'fills-holes-with-zero-bytes' => bool + 'prevents-read-past-end-of-share-data' => bool + } + 'application-version' => bstr + } - { "http://allmydata.org/tahoe/protocols/storage/v1": - { "maximum-immutable-share-size": 1234, - "maximum-mutable-share-size": 1235, - "available-space": 123456, - "tolerates-immutable-read-overrun": true, - "delete-mutable-shares-with-zero-length-writev": true, - "fills-holes-with-zero-bytes": true, - "prevents-read-past-end-of-share-data": true - }, - "application-version": "1.13.0" - } +The server SHOULD populate as many fields as possible with accurate information about its behavior. -The server SHOULD populate as many fields as possible with accurate information about itself. +For fields which relate to a specific API +the semantics are documented below in the section for that API. +For fields that are more general than a single API the semantics are as follows: -XXX Document every single field +* available-space: + The server SHOULD use this field to advertise the amount of space that it currently considers unused and is willing to allocate for client requests. ``PUT /storage/v1/lease/:storage_index`` @@ -518,21 +519,23 @@ Writing !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Initialize an immutable storage index with some buckets. -The buckets may have share data written to them once. -A lease is also created for the shares. 
+The server MUST allow share data to be written to the buckets at most one time. +The server MAY create a lease for the buckets. Details of the buckets to create are encoded in the request body. For example:: {"share-numbers": [1, 7, ...], "allocated-size": 12345} -The request must include ``X-Tahoe-Authorization`` HTTP headers that set the various secrets—upload, lease renewal, lease cancellation—that will be later used to authorize various operations. +The server SHOULD accept a value for **allocated-size** that is less than or equal to the value for the server's version message's **maximum-immutable-share-size** value. + +The request MUST include ``X-Tahoe-Authorization`` HTTP headers that set the various secrets—upload, lease renewal, lease cancellation—that will be later used to authorize various operations. For example:: X-Tahoe-Authorization: lease-renew-secret X-Tahoe-Authorization: lease-cancel-secret X-Tahoe-Authorization: upload-secret -The response body includes encoded information about the created buckets. +The response body MUST include encoded information about the created buckets. For example:: {"already-have": [1, ...], "allocated": [7, ...]} @@ -589,26 +592,28 @@ Rejected designs for upload secrets: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Write data for the indicated share. -The share number must belong to the storage index. -The request body is the raw share data (i.e., ``application/octet-stream``). -*Content-Range* requests are required; for large transfers this allows partially complete uploads to be resumed. +The share number MUST belong to the storage index. +The request body MUST be the raw share data (i.e., ``application/octet-stream``). +The request MUST include a *Content-Range* header field; +for large transfers this allows partially complete uploads to be resumed. + For example, a 1MiB share can be divided in to eight separate 128KiB chunks. Each chunk can be uploaded in a separate request. 
Each request can include a *Content-Range* value indicating its placement within the complete share. If any one of these requests fails then at most 128KiB of upload work needs to be retried. -The server must recognize when all of the data has been received and mark the share as complete +The server MUST recognize when all of the data has been received and mark the share as complete (which it can do because it was informed of the size when the storage index was initialized). -The request must include a ``X-Tahoe-Authorization`` header that includes the upload secret:: +The request MUST include a ``X-Tahoe-Authorization`` header that includes the upload secret:: X-Tahoe-Authorization: upload-secret Responses: -* When a chunk that does not complete the share is successfully uploaded the response is ``OK``. - The response body indicates the range of share data that has yet to be uploaded. +* When a chunk that does not complete the share is successfully uploaded the response MUST be ``OK``. + The response body MUST indicate the range of share data that has yet to be uploaded. That is:: { "required": @@ -620,11 +625,12 @@ Responses: ] } -* When the chunk that completes the share is successfully uploaded the response is ``CREATED``. +* When the chunk that completes the share is successfully uploaded the response MUST be ``CREATED``. * If the *Content-Range* for a request covers part of the share that has already, and the data does not match already written data, - the response is ``CONFLICT``. - At this point the only thing to do is abort the upload and start from scratch (see below). + the response MUST be ``CONFLICT``. + In this case the client MUST abort the upload. + The client MAY then restart the upload from scratch. Discussion `````````` @@ -650,34 +656,32 @@ From RFC 7231:: This cancels an *in-progress* upload. 
-The request must include a ``X-Tahoe-Authorization`` header that includes the upload secret:: +The request MUST include a ``X-Tahoe-Authorization`` header that includes the upload secret:: X-Tahoe-Authorization: upload-secret -The response code: - -* When the upload is still in progress and therefore the abort has succeeded, - the response is ``OK``. - Future uploads can start from scratch with no pre-existing upload state stored on the server. -* If the uploaded has already finished, the response is 405 (Method Not Allowed) - and no change is made. +If there is an incomplete upload with a matching upload-secret then the server MUST consider the abort to have succeeded. +In this case the response MUST be ``OK``. +The server MUST respond to all future requests as if the operations related to this upload did not take place. +If there is no incomplete upload with a matching upload-secret then the server MUST respond with ``Method Not Allowed`` (405). +The server MUST make no client-visible changes to its state in this case. ``POST /storage/v1/immutable/:storage_index/:share_number/corrupt`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Advise the server the data read from the indicated share was corrupt. The -request body includes an human-meaningful text string with details about the -corruption. It also includes potentially important details about the share. +Advise the server the data read from the indicated share was corrupt. +The request body includes an human-meaningful text string with details about the corruption. +It also includes potentially important details about the share. For example:: {"reason": "expected hash abcd, got hash efgh"} -.. share-type, storage-index, and share-number are inferred from the URL - -The response code is OK (200) by default, or NOT FOUND (404) if the share -couldn't be found. +The report pertains to the immutable share with a **storage index** and **share number** given in the request path. 
+If the identified **storage index** and **share number** are known to the server then the report SHOULD be accepted and made available to server administrators. +In this case the response SHOULD be ``OK``. +If the report is not accepted then the response SHOULD be ``Not Found`` (404). Reading ~~~~~~~ @@ -685,26 +689,34 @@ Reading ``GET /storage/v1/immutable/:storage_index/shares`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Retrieve a list (semantically, a set) indicating all shares available for the -indicated storage index. For example:: +Retrieve a list (semantically, a set) indicating all shares available for the indicated storage index. +For example:: [1, 5] -An unknown storage index results in an empty list. +If the **storage index** in the request path is not known to the server then the response MUST include an empty list. ``GET /storage/v1/immutable/:storage_index/:share_number`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Read a contiguous sequence of bytes from one share in one bucket. -The response body is the raw share data (i.e., ``application/octet-stream``). -The ``Range`` header may be used to request exactly one ``bytes`` range, in which case the response code will be 206 (partial content). -Interpretation and response behavior is as specified in RFC 7233 § 4.1. -Multiple ranges in a single request are *not* supported; open-ended ranges are also not supported. +The response body MUST be the raw share data (i.e., ``application/octet-stream``). +The ``Range`` header MAY be used to request exactly one ``bytes`` range, +in which case the response code MUST be ``Partial Content`` (206). +Interpretation and response behavior MUST be as specified in RFC 7233 § 4.1. +Multiple ranges in a single request are *not* supported; +open-ended ranges are also not supported. +Clients MUST NOT send requests using these features. -If the response reads beyond the end of the data, the response may be shorter than the requested range.
-The resulting ``Content-Range`` header will be consistent with the returned data. +If the response reads beyond the end of the data, +the response MUST be shorter than the requested range. +It MUST contain all data in the share and then end. +The resulting ``Content-Range`` header MUST be consistent with the returned data. -If the response to a query is an empty range, the ``NO CONTENT`` (204) response code will be used. +The server MUST indicate this behavior by specifying **True** for **tolerates-immutable-read-overrun** in its version response. + +If the response to a query is an empty range, +the server MUST send a ``No Content`` (204) response. Discussion `````````` @@ -743,13 +755,13 @@ The first write operation on a mutable storage index creates it (that is, there is no separate "create this storage index" operation as there is for the immutable storage index type). -The request must include ``X-Tahoe-Authorization`` headers with write enabler and lease secrets:: +The request MUST include ``X-Tahoe-Authorization`` headers with write enabler and lease secrets:: X-Tahoe-Authorization: write-enabler X-Tahoe-Authorization: lease-cancel-secret X-Tahoe-Authorization: lease-renew-secret -The request body includes test, read, and write vectors for the operation. +The request body MUST include test, read, and write vectors for the operation. 
For example:: { From 98a3691891bfbe4af2871a58fba0e586400e90cf Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 25 Jan 2023 09:55:40 -0500 Subject: [PATCH 03/51] Add more CDDL to the spec; remove some server version flags from it --- docs/proposed/http-storage-node-protocol.rst | 95 ++++++++++++++++---- src/allmydata/storage/http_client.py | 16 +++- src/allmydata/storage/http_server.py | 21 ++++- 3 files changed, 109 insertions(+), 23 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index cff6dc67b..838f88426 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -444,10 +444,6 @@ The response MUST validate against this CDDL schema:: 'maximum-immutable-share-size' => uint 'maximum-mutable-share-size' => uint 'available-space' => uint - 'tolerates-immutable-read-overrun' => bool - 'delete-mutable-shares-with-zero-length-writev' => bool - 'fills-holes-with-zero-bytes' => bool - 'prevents-read-past-end-of-share-data' => bool } 'application-version' => bstr } @@ -522,11 +518,18 @@ Initialize an immutable storage index with some buckets. The server MUST allow share data to be written to the buckets at most one time. The server MAY create a lease for the buckets. Details of the buckets to create are encoded in the request body. +The request body MUST validate against this CDDL schema:: + + { + share-numbers: #6.258([0*256 uint]) + allocated-size: uint + } + For example:: {"share-numbers": [1, 7, ...], "allocated-size": 12345} -The server SHOULD accept a value for **allocated-size** that is less than or equal to the value for the server's version message's **maximum-immutable-share-size** value. +The server SHOULD accept a value for **allocated-size** that is less than or equal to the lesser of the values of the server's version message's **maximum-immutable-share-size** or **available-space** values. 
The request MUST include ``X-Tahoe-Authorization`` HTTP headers that set the various secrets—upload, lease renewal, lease cancellation—that will be later used to authorize various operations. For example:: @@ -536,6 +539,13 @@ For example:: X-Tahoe-Authorization: upload-secret The response body MUST include encoded information about the created buckets. +The response body MUST validate against this CDDL schema:: + + { + already-have: #6.258([0*256 uint]) + allocated: #6.258([0*256 uint]) + } + For example:: {"already-have": [1, ...], "allocated": [7, ...]} @@ -614,7 +624,13 @@ Responses: * When a chunk that does not complete the share is successfully uploaded the response MUST be ``OK``. The response body MUST indicate the range of share data that has yet to be uploaded. - That is:: + The response body MUST validate against this CDDL schema:: + + { + required: [0* {begin: uint, end: uint}] + } + + For example:: { "required": [ { "begin": @@ -673,6 +689,11 @@ The server MUST make no client-visible changes to its state in this case. Advise the server the data read from the indicated share was corrupt. The request body includes an human-meaningful text string with details about the corruption. It also includes potentially important details about the share. +The request body MUST validate against this CDDL schema:: + + { + reason: tstr + } For example:: @@ -690,6 +711,10 @@ Reading !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Retrieve a list (semantically, a set) indicating all shares available for the indicated storage index. +The response body MUST validate against this CDDL schema:: + + #6.258([0*256 uint]) + For example:: [1, 5] @@ -710,11 +735,9 @@ Clients MUST NOT send requests using these features. If the response reads beyond the end of the data, the response MUST be shorter than the requested range. -It MUST contain all data in the share and then end. +It MUST contain all data up to the end of the share and then end. 
The resulting ``Content-Range`` header MUST be consistent with the returned data. -The server MUST indicate this behavior by specifying **True** for **tolerates-immutable-read-overrun** in its version response. - If the response to a query is an empty range, the server MUST send a ``No Content`` (204) response. @@ -762,6 +785,20 @@ The request MUST include ``X-Tahoe-Authorization`` headers with write enabler an X-Tahoe-Authorization: lease-renew-secret The request body MUST include test, read, and write vectors for the operation. +The request body MUST validate against this CDDL schema:: + + { + "test-write-vectors": { + 0*256 share_number : { + "test": [0*30 {"offset": uint, "size": uint, "specimen": bstr}] + "write": [* {"offset": uint, "data": bstr}] + "new-length": uint / null + } + } + "read-vector": [0*30 {"offset": uint, "size": uint}] + } + share_number = uint + For example:: { @@ -784,6 +821,14 @@ For example:: The response body contains a boolean indicating whether the tests all succeed (and writes were applied) and a mapping giving read data (pre-write). +The response body MUST validate against this CDDL schema:: + + { + "success": bool, + "data": {0*256 share_number: [0* bstr]} + } + share_number = uint + For example:: { @@ -795,7 +840,7 @@ For example:: } } -A test vector or read vector that read beyond the boundaries of existing data will return nothing for any bytes past the end. +A server MUST return nothing for any bytes beyond the end of existing data for a test vector or read vector that tries to read such data. As a result, if there is no data at all, an empty bytestring is returned no matter what the offset or length. Reading @@ -805,23 +850,34 @@ Reading !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Retrieve a set indicating all shares available for the indicated storage index.
-For example (this is shown as list, since it will be list for JSON, but will be set for CBOR):: +The response body MUST validate against this CDDL schema:: + + #6.258([0*256 uint]) + +For example:: [1, 5] ``GET /storage/v1/mutable/:storage_index/:share_number`` !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Read data from the indicated mutable shares, just like ``GET /storage/v1/immutable/:storage_index`` +Read data from the indicated mutable shares, just like ``GET /storage/v1/immutable/:storage_index/:share_number``. -The ``Range`` header may be used to request exactly one ``bytes`` range, in which case the response code will be 206 (partial content). -Interpretation and response behavior is as specified in RFC 7233 § 4.1. -Multiple ranges in a single request are *not* supported; open-ended ranges are also not supported. +The response body MUST be the raw share data (i.e., ``application/octet-stream``). +The ``Range`` header MAY be used to request exactly one ``bytes`` range, +in which case the response code MUST be ``Partial Content`` (206). +Interpretation and response behavior MUST be as specified in RFC 7233 § 4.1. +Multiple ranges in a single request are *not* supported; +open-ended ranges are also not supported. +Clients MUST NOT send requests using these features. -If the response reads beyond the end of the data, the response may be shorter than the requested range. -The resulting ``Content-Range`` header will be consistent with the returned data. +If the response reads beyond the end of the data, +the response MUST be shorter than the requested range. +It MUST contain all data up to the end of the share and then end. +The resulting ``Content-Range`` header MUST be consistent with the returned data. -If the response to a query is an empty range, the ``NO CONTENT`` (204) response code will be used. +If the response to a query is an empty range, +the server MUST send a ``No Content`` (204) response.
 ``POST /storage/v1/mutable/:storage_index/:share_number/corrupt``
@@ -833,6 +889,9 @@ Just like the immutable version.
 
 Sample Interactions
 -------------------
 
+This section contains examples of client/server interactions to help illuminate the above specification.
+This section is non-normative.
+
 Immutable Data
 ~~~~~~~~~~~~~~
diff --git a/src/allmydata/storage/http_client.py b/src/allmydata/storage/http_client.py
index 90bda7fc0..2f4f8398e 100644
--- a/src/allmydata/storage/http_client.py
+++ b/src/allmydata/storage/http_client.py
@@ -70,15 +70,14 @@ class ClientException(Exception):
 # indicates a set.
 _SCHEMAS = {
     "get_version": Schema(
+        # Note that the single-quoted (`'`) string keys in this schema
+        # represent *byte* strings - per the CDDL specification. Text strings
+        # are represented using strings with *double* quotes (`"`).
         """
         response = {'http://allmydata.org/tahoe/protocols/storage/v1' => {
                  'maximum-immutable-share-size' => uint
                  'maximum-mutable-share-size' => uint
                  'available-space' => uint
-                 'tolerates-immutable-read-overrun' => bool
-                 'delete-mutable-shares-with-zero-length-writev' => bool
-                 'fills-holes-with-zero-bytes' => bool
-                 'prevents-read-past-end-of-share-data' => bool
                  }
                  'application-version' => bstr
               }
@@ -447,6 +446,15 @@ class StorageClientGeneral(object):
         decoded_response = yield self._client.decode_cbor(
             response, _SCHEMAS["get_version"]
         )
+        # Add some features we know are true because the HTTP API
+        # specification requires them and because other parts of the storage
+        # client implementation assume they will be present.
+ decoded_response[b"http://allmydata.org/tahoe/protocols/storage/v1"].update({ + b'tolerates-immutable-read-overrun': True, + b'delete-mutable-shares-with-zero-length-writev': True, + b'fills-holes-with-zero-bytes': True, + b'prevents-read-past-end-of-share-data': True, + }) returnValue(decoded_response) @inlineCallbacks diff --git a/src/allmydata/storage/http_server.py b/src/allmydata/storage/http_server.py index fd7fd1187..b7ca2d971 100644 --- a/src/allmydata/storage/http_server.py +++ b/src/allmydata/storage/http_server.py @@ -592,7 +592,26 @@ class HTTPServer(object): @_authorized_route(_app, set(), "/storage/v1/version", methods=["GET"]) def version(self, request, authorization): """Return version information.""" - return self._send_encoded(request, self._storage_server.get_version()) + return self._send_encoded(request, self._get_version()) + + def _get_version(self) -> dict[str, Any]: + """ + Get the HTTP version of the storage server's version response. + + This differs from the Foolscap version by omitting certain obsolete + fields. 
+ """ + v = self._storage_server.get_version() + v1_identifier = b"http://allmydata.org/tahoe/protocols/storage/v1" + v1 = v[v1_identifier] + return { + v1_identifier: { + b"maximum-immutable-share-size": v1[b"maximum-immutable-share-size"], + b"maximum-mutable-share-size": v1[b"maximum-mutable-share-size"], + b"available-space": v1[b"available-space"], + }, + b"application-version": v[b"application-version"], + } ##### Immutable APIs ##### From 48a2d4d31d86bc39c2e1caa06f6e9c3e6baac741 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 17 Feb 2023 13:58:58 -0500 Subject: [PATCH 04/51] ``Authorization`` is the right header field --- src/allmydata/storage/http_common.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/allmydata/storage/http_common.py b/src/allmydata/storage/http_common.py index 123ce403b..e5f07898e 100644 --- a/src/allmydata/storage/http_common.py +++ b/src/allmydata/storage/http_common.py @@ -28,7 +28,7 @@ def get_content_type(headers: Headers) -> Optional[str]: def swissnum_auth_header(swissnum: bytes) -> bytes: - """Return value for ``Authentication`` header.""" + """Return value for ``Authorization`` header.""" return b"Tahoe-LAFS " + b64encode(swissnum).strip() From 8645462f4e1b44507e9dcde63c05ff3ef9f30453 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 17 Feb 2023 14:00:03 -0500 Subject: [PATCH 05/51] Base64 encode the swissnum Typically swissnums themselves are base32 encoded but there's no requirement that this is the case. Base64 encoding in the header ensures we can represent whatever the value was. 
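A minimal sketch of the header value this change describes. ``swissnum_auth_header`` mirrors the helper in ``http_common.py``; ``parse_auth_header`` and the sample bytes are hypothetical and exist only to show that an arbitrary swissnum survives the round trip:

```python
from base64 import b64decode, b64encode


def swissnum_auth_header(swissnum: bytes) -> bytes:
    """Return the value for the ``Authorization`` header field."""
    # Base64 can carry arbitrary bytes, so the swissnum does not have to be
    # Base32 (or even printable) to be represented in a header.
    return b"Tahoe-LAFS " + b64encode(swissnum).strip()


def parse_auth_header(value: bytes) -> bytes:
    """Hypothetical inverse, for illustration only: recover the swissnum."""
    scheme, _, credentials = value.partition(b" ")
    if scheme != b"Tahoe-LAFS":
        raise ValueError("unexpected authorization type")
    return b64decode(credentials)


# A made-up swissnum containing bytes that could not appear raw in a header.
raw = b"\x00\xffnot base32\n"
header = swissnum_auth_header(raw)
recovered = parse_auth_header(header)
```

The encoded value contains no control characters, and decoding it returns the original bytes unchanged.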
--- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 838f88426..7f678d271 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -391,7 +391,7 @@ Clients and servers MUST use the ``Authorization`` header field, as specified in `RFC 9110`_, for authorization of all requests to all endpoints specified here. The authentication *type* MUST be ``Tahoe-LAFS``. -Clients MUST present the swissnum from the NURL used to locate the storage service as the *credentials*. +Clients MUST present the `Base64`_-encoded representation of the swissnum from the NURL used to locate the storage service as the *credentials*. If credentials are not presented or the swissnum is not associated with a storage service then the server MUST issue a ``401 UNAUTHORIZED`` response and perform no other processing of the message. From 369d26f0f8c4975c7855e43ce9033caf893c50f0 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 10 Mar 2023 11:17:09 -0500 Subject: [PATCH 06/51] There is a limit to the size of the corruption report a server must accept --- docs/proposed/http-storage-node-protocol.rst | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 7f678d271..4f5a53906 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -692,7 +692,7 @@ It also includes potentially important details about the share. The request body MUST validate against this CDDL schema:: { - reason: tstr + reason: tstr .size (1..32765) } For example:: @@ -704,6 +704,11 @@ If the identified **storage index** and **share number** are known to the server In this case the response SHOULD be ``OK``. 
If the response is not accepted then the response SHOULD be ``Not Found`` (404). +Discussion +`````````` + +The seemingly odd length limit on ``reason`` is chosen so that the *encoded* representation of the message is limited to 32768. + Reading ~~~~~~~ From b27946c3c6b590667ee54f43f61bc72e57780d6d Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Fri, 10 Mar 2023 11:17:23 -0500 Subject: [PATCH 07/51] trim overlong section marker --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 4f5a53906..6e5b85716 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -864,7 +864,7 @@ For example:: [1, 5] ``GET /storage/v1/mutable/:storage_index/:share_number`` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Read data from the indicated mutable shares, just like ``GET /storage/v1/immutable/:storage_index``. From c3afab15ed43a729a8517e7ded6a6877b3c765f0 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 13 Mar 2023 09:22:42 -0400 Subject: [PATCH 08/51] correct version type annotation --- src/allmydata/storage/http_server.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/allmydata/storage/http_server.py b/src/allmydata/storage/http_server.py index b7ca2d971..5560d3a73 100644 --- a/src/allmydata/storage/http_server.py +++ b/src/allmydata/storage/http_server.py @@ -594,7 +594,7 @@ class HTTPServer(object): """Return version information.""" return self._send_encoded(request, self._get_version()) - def _get_version(self) -> dict[str, Any]: + def _get_version(self) -> dict[bytes, Any]: """ Get the HTTP version of the storage server's version response. 
From 7859ba733717dbc75b98554311cf7a59733ed5f7 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 13 Mar 2023 09:25:49 -0400 Subject: [PATCH 09/51] fix title level inconsistency --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 6e5b85716..c9bdf3013 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -194,7 +194,7 @@ An HTTP-based protocol, dubbed "Great Black Swamp" (or "GBS"), is described belo This protocol aims to satisfy the above requirements at a lower level of complexity than the current Foolscap-based protocol. Summary (Non-normative) -!!!!!!!!!!!!!!!!!!!!!!! +~~~~~~~~~~~~~~~~~~~~~~~ Communication with the storage node will take place using TLS. The TLS version and configuration will be dictated by an ongoing understanding of best practices. From 5facd06725d2f0c11e497c84e4d90e90bc37dd95 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 13 Mar 2023 10:42:30 -0400 Subject: [PATCH 10/51] adjust markup to clarify the encoding exceptions --- docs/proposed/http-storage-node-protocol.rst | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index c9bdf3013..f6d90526e 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -347,6 +347,8 @@ A request MAY be submitted using an alternate encoding by declaring this in the A request MAY indicate its preference for an alternate encoding in the response using the ``Accept`` header field. A request which includes no ``Accept`` header field MUST be interpreted in the same way as a request including a ``Accept: application/cbor`` header field. 
+Clients and servers MAY support additional request and response message body encodings. + Clients and servers SHOULD support ``application/json`` request and response message body encoding. For HTTP messages carrying binary share data, this is expected to be a particularly poor encoding. @@ -360,7 +362,11 @@ Because of the simple types used throughout and the equivalence described in `RFC 7049`_ these examples should be representative regardless of which of these two encodings is chosen. -One exception to this rule is for sets. +There are two exceptions to this rule. + +1. Sets +!!!!!!! + For CBOR messages, any sequence that is semantically a set (i.e. no repeated values allowed, order doesn't matter, and elements are hashable in Python) should be sent as a set. Tag 6.258 is used to indicate sets in CBOR; @@ -368,12 +374,12 @@ see `the CBOR registry Date: Mon, 13 Mar 2023 10:44:09 -0400 Subject: [PATCH 11/51] nail it down --- docs/proposed/http-storage-node-protocol.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index f6d90526e..f9f2cd868 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -436,6 +436,7 @@ Encoding ~~~~~~~~ * ``storage_index`` MUST be `Base32`_ encoded in URLs. 
+* ``share_number`` MUST be a decimal representation General ~~~~~~~ From 6dc6d6f39f35fe6f51002b46b90727381b142e04 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 13 Mar 2023 11:06:16 -0400 Subject: [PATCH 12/51] inline the actual base32 alphabet we use --- docs/proposed/http-storage-node-protocol.rst | 100 ++++++++++++++++++- 1 file changed, 96 insertions(+), 4 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index f9f2cd868..21e27d7dd 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -244,9 +244,9 @@ When Bob's client issues HTTP requests to Alice's storage node it includes the * .. note:: Foolscap TubIDs are 20 bytes (SHA1 digest of the certificate). - They are encoded with Base32 for a length of 32 bytes. + They are encoded with `Base32_` for a length of 32 bytes. SPKI information discussed here is 32 bytes (SHA256 digest). - They would be encoded in Base32 for a length of 52 bytes. + They would be encoded in `Base32`_ for a length of 52 bytes. `unpadded base64url`_ provides a more compact encoding of the information while remaining URL-compatible. This would encode the SPKI information for a length of merely 43 bytes. SHA1, @@ -336,6 +336,100 @@ and shares. A particular resource is addressed by the HTTP request path. Details about the interface are encoded in the HTTP message body. +String Encoding +~~~~~~~~~~~~~~~ + +.. _Base32: + +Where the specification refers to Base32 the meaning is *unpadded* Base32 encoding as specified by `RFC 4648`_ using a *lowercase variation* of the alphabet from Section 6. + +That is, the alphabet is: + +.. 
list-table:: Base32 Alphabet + :header-rows: 1 + + * - Value + - Encoding + - Value + - Encoding + - Value + - Encoding + - Value + - Encoding + + * - 0 + - a + - 9 + - j + - 18 + - s + - 27 + - 3 + * - 1 + - b + - 10 + - k + - 19 + - t + - 28 + - 4 + * - 2 + - c + - 11 + - l + - 20 + - u + - 29 + - 5 + * - 3 + - d + - 12 + - m + - 21 + - v + - 30 + - 6 + * - 4 + - e + - 13 + - n + - 22 + - w + - 31 + - 7 + * - 5 + - f + - 14 + - o + - 23 + - x + - + - + * - 6 + - g + - 15 + - p + - 24 + - y + - + - + * - 7 + - h + - 16 + - q + - 25 + - z + - + - + * - 8 + - i + - 17 + - r + - 26 + - 2 + - + - + Message Encoding ~~~~~~~~~~~~~~~~ @@ -1058,8 +1152,6 @@ otherwise it will read a byte which won't match `b""`:: .. _Base64: https://www.rfc-editor.org/rfc/rfc4648#section-4 -.. _Base32: https://www.rfc-editor.org/rfc/rfc4648#section-6 - .. _RFC 4648: https://tools.ietf.org/html/rfc4648 .. _RFC 7469: https://tools.ietf.org/html/rfc7469#section-2.4 From 6771ca8ce4caf34cceacd806c8c7c45eb80af315 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 13 Mar 2023 13:29:53 -0400 Subject: [PATCH 13/51] fix table markup --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 21e27d7dd..ebe39578c 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -346,7 +346,7 @@ Where the specification refers to Base32 the meaning is *unpadded* Base32 encodi That is, the alphabet is: .. list-table:: Base32 Alphabet - :header-rows: 1 + :header-rows: 1 * - Value - Encoding From fe0e159e52712c14557e0c188798f8286e33ca65 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 13 Mar 2023 13:30:32 -0400 Subject: [PATCH 14/51] Give base32 a section heading We don't have any other sections but ... 
:shrug: --- docs/proposed/http-storage-node-protocol.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index ebe39578c..f81b2bc79 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -341,6 +341,9 @@ String Encoding .. _Base32: +Base32 +!!!!!! + Where the specification refers to Base32 the meaning is *unpadded* Base32 encoding as specified by `RFC 4648`_ using a *lowercase variation* of the alphabet from Section 6. That is, the alphabet is: From 6a0a895ee88e34da3c798acc19c5800af3fda414 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Mon, 13 Mar 2023 13:37:01 -0400 Subject: [PATCH 15/51] Encode the reason limit in the implementation as well --- src/allmydata/storage/http_server.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/allmydata/storage/http_server.py b/src/allmydata/storage/http_server.py index 5560d3a73..3ae16ae5c 100644 --- a/src/allmydata/storage/http_server.py +++ b/src/allmydata/storage/http_server.py @@ -273,7 +273,7 @@ _SCHEMAS = { "advise_corrupt_share": Schema( """ request = { - reason: tstr + reason: tstr .size (1..32765) } """ ), From 44f5057ed39cba4f853ad3aaf862244323b29858 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 22 Mar 2023 08:07:59 -0400 Subject: [PATCH 16/51] fix link markup --- docs/proposed/http-storage-node-protocol.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index f81b2bc79..493cf8f58 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -244,7 +244,7 @@ When Bob's client issues HTTP requests to Alice's storage node it includes the * .. note:: Foolscap TubIDs are 20 bytes (SHA1 digest of the certificate). 
- They are encoded with `Base32_` for a length of 32 bytes. + They are encoded with `Base32`_ for a length of 32 bytes. SPKI information discussed here is 32 bytes (SHA256 digest). They would be encoded in `Base32`_ for a length of 52 bytes. `unpadded base64url`_ provides a more compact encoding of the information while remaining URL-compatible. From 7c0b21916f376f139e2569242e443fea60c40723 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 22 Mar 2023 08:35:17 -0400 Subject: [PATCH 17/51] specify the unit of `available-space` --- docs/proposed/http-storage-node-protocol.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 493cf8f58..3e74c94d6 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -560,6 +560,7 @@ For fields that are more general than a single API the semantics are as follows: * available-space: The server SHOULD use this field to advertise the amount of space that it currently considers unused and is willing to allocate for client requests. + The value is a number of bytes. ``PUT /storage/v1/lease/:storage_index`` From e7ed17af17c7c77daa24203954ff2ef2198875c6 Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 22 Mar 2023 08:42:32 -0400 Subject: [PATCH 18/51] fix some editing errors about overreads and generally try to clarify --- docs/proposed/http-storage-node-protocol.rst | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/docs/proposed/http-storage-node-protocol.rst b/docs/proposed/http-storage-node-protocol.rst index 3e74c94d6..5009a992e 100644 --- a/docs/proposed/http-storage-node-protocol.rst +++ b/docs/proposed/http-storage-node-protocol.rst @@ -950,8 +950,17 @@ For example:: } } -A server MUST return nothing for any bytes beyond the end of existing data for a test vector or read vector that reads tries to read such data. 
-As a result, if there is no data at all, an empty bytestring is returned no matter what the offset or length. +A client MAY send a test vector or read vector that refers to bytes beyond the end of existing data. +In this case a server MUST behave as if the test or read vector referred to exactly as much data as exists. + +For example, +consider the case where the server has 5 bytes of data for a particular share. +If a client sends a read vector with an ``offset`` of 1 and a ``size`` of 4 then the server MUST respond with all of the data except the first byte. +If a client sends a read vector with the same ``offset`` and a ``size`` of 5 (or any larger value) then the server MUST respond in the same way. + +Similarly, +if there is no data at all, +an empty byte string is returned no matter what the offset or length. Reading ~~~~~~~ From c49aa446552f3060b4f53bddd300e288be1eb21d Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 22 Mar 2023 09:04:15 -0400 Subject: [PATCH 19/51] Update the raw number and give a reference for interpretation --- docs/performance.rst | 5 +++-- docs/specifications/dirnodes.rst | 10 +++++----- src/allmydata/client.py | 2 -- 3 files changed, 8 insertions(+), 9 deletions(-) diff --git a/docs/performance.rst b/docs/performance.rst index 6ddeb1fe8..a0487c72c 100644 --- a/docs/performance.rst +++ b/docs/performance.rst @@ -82,8 +82,9 @@ network: A memory footprint: N/K*A -notes: Tahoe-LAFS generates a new RSA keypair for each mutable file that it -publishes to a grid. This takes up to 1 or 2 seconds on a typical desktop PC. +notes: +Tahoe-LAFS generates a new RSA keypair for each mutable file that it publishes to a grid. +This takes around 100 milliseconds on a relatively high-end laptop from 2021. Part of the process of encrypting, encoding, and uploading a mutable file to a Tahoe-LAFS grid requires that the entire file be in memory at once. For larger
For larger diff --git a/docs/specifications/dirnodes.rst b/docs/specifications/dirnodes.rst index 88fcd0fa9..c53d28a26 100644 --- a/docs/specifications/dirnodes.rst +++ b/docs/specifications/dirnodes.rst @@ -267,7 +267,7 @@ How well does this design meet the goals? value, so there are no opportunities for staleness 9. monotonicity: VERY: the single point of access also protects against retrograde motion - + Confidentiality leaks in the storage servers @@ -332,8 +332,9 @@ MDMF design rules allow for efficient random-access reads from the middle of the file, which would give the index something useful to point at. The current SDMF design generates a new RSA public/private keypair for each -directory. This takes considerable time and CPU effort, generally one or two -seconds per directory. We have designed (but not yet built) a DSA-based +directory. This takes some time and CPU effort (around 100 milliseconds on a +relatively high-end 2021 laptop) per directory. +We have designed (but not yet built) a DSA-based mutable file scheme which will use shared parameters to reduce the directory-creation effort to a bare minimum (picking a random number instead of generating two random primes). @@ -363,7 +364,7 @@ single child, looking up a single child) would require pulling or pushing a lot of unrelated data, increasing network overhead (and necessitating test-and-set semantics for the modification side, which increases the chances that a user operation will fail, making it more challenging to provide -promises of atomicity to the user). +promises of atomicity to the user). It would also make it much more difficult to enable the delegation ("sharing") of specific directories. Since each aggregate "realm" provides @@ -469,4 +470,3 @@ Preventing delegation between communication parties is just as pointless as asking Bob to forget previously accessed files. 
However, there may be value to configuring the UI to ask Carol to not share files with Bob, or to removing all files from Bob's view at the same time his access is revoked. - diff --git a/src/allmydata/client.py b/src/allmydata/client.py index 2adf59660..8a10fe9e7 100644 --- a/src/allmydata/client.py +++ b/src/allmydata/client.py @@ -175,8 +175,6 @@ class KeyGenerator(object): """I return a Deferred that fires with a (verifyingkey, signingkey) pair. The returned key will be 2048 bit""" keysize = 2048 - # RSA key generation for a 2048 bit key takes between 0.8 and 3.2 - # secs signer, verifier = rsa.create_signing_keypair(keysize) return defer.succeed( (verifier, signer) ) From c1de2efd2d97d4bc79afb40fe0f9dfe6c450b01b Mon Sep 17 00:00:00 2001 From: Jean-Paul Calderone Date: Wed, 22 Mar 2023 09:04:31 -0400 Subject: [PATCH 20/51] news fragment --- newsfragments/3993.minor | 0 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 newsfragments/3993.minor diff --git a/newsfragments/3993.minor b/newsfragments/3993.minor new file mode 100644 index 000000000..e69de29bb From 1f29d5a23a6c772c35588f01b1c2a853691a4f5c Mon Sep 17 00:00:00 2001 From: Itamar Turner-Trauring Date: Fri, 24 Mar 2023 11:38:15 -0400 Subject: [PATCH 21/51] News fragment. --- newsfragments/3996.minor | 0 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 newsfragments/3996.minor diff --git a/newsfragments/3996.minor b/newsfragments/3996.minor new file mode 100644 index 000000000..e69de29bb From ce6b7aeb828e4cdd2f2056e18ae7872dd53d6787 Mon Sep 17 00:00:00 2001 From: Itamar Turner-Trauring Date: Fri, 24 Mar 2023 11:38:22 -0400 Subject: [PATCH 22/51] More modern pylint and flake8 and friends. 
--- setup.py | 2 +- tox.ini | 7 +++---- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/setup.py b/setup.py index 152c49f0e..6528b01ed 100644 --- a/setup.py +++ b/setup.py @@ -400,7 +400,7 @@ setup(name="tahoe-lafs", # also set in __init__.py # disagreeing on what is or is not a lint issue. We can bump # this version from time to time, but we will do it # intentionally. - "pyflakes == 2.2.0", + "pyflakes == 3.0.1", "coverage ~= 5.0", "mock", "tox ~= 3.0", diff --git a/tox.ini b/tox.ini index 382ba973e..447745784 100644 --- a/tox.ini +++ b/tox.ini @@ -100,10 +100,9 @@ commands = [testenv:codechecks] basepython = python3 deps = - # Newer versions of PyLint have buggy configuration - # (https://github.com/PyCQA/pylint/issues/4574), so stick to old version - # for now. - pylint < 2.5 + # Make sure we get a version of PyLint that respects config, and isn't too + # old. + pylint < 2.18, >2.14 # On macOS, git inside of towncrier needs $HOME. passenv = HOME setenv = From 56e3aaad03f1839f50fce1a526f1b969517cc538 Mon Sep 17 00:00:00 2001 From: Itamar Turner-Trauring Date: Fri, 24 Mar 2023 11:41:25 -0400 Subject: [PATCH 23/51] Lint fix and cleanup. --- src/allmydata/immutable/upload.py | 18 ++++-------------- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/src/allmydata/immutable/upload.py b/src/allmydata/immutable/upload.py index 1d70759ff..9d6842f44 100644 --- a/src/allmydata/immutable/upload.py +++ b/src/allmydata/immutable/upload.py @@ -2,22 +2,12 @@ Ported to Python 3. 
""" -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function -from __future__ import unicode_literals +from __future__ import annotations -from future.utils import PY2, native_str -if PY2: - from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401 +from future.utils import native_str from past.builtins import long, unicode from six import ensure_str -try: - from typing import List -except ImportError: - pass - import os, time, weakref, itertools import attr @@ -915,8 +905,8 @@ class _Accum(object): :ivar remaining: The number of bytes still expected. :ivar ciphertext: The bytes accumulated so far. """ - remaining = attr.ib(validator=attr.validators.instance_of(int)) # type: int - ciphertext = attr.ib(default=attr.Factory(list)) # type: List[bytes] + remaining : int = attr.ib(validator=attr.validators.instance_of(int)) + ciphertext : list[bytes] = attr.ib(default=attr.Factory(list)) def extend(self, size, # type: int From eb1cb84455883660301f51c7783497963c58007e Mon Sep 17 00:00:00 2001 From: Itamar Turner-Trauring Date: Fri, 24 Mar 2023 11:42:38 -0400 Subject: [PATCH 24/51] Lint fix and cleanup. --- src/allmydata/introducer/server.py | 19 ++++--------------- 1 file changed, 4 insertions(+), 15 deletions(-) diff --git a/src/allmydata/introducer/server.py b/src/allmydata/introducer/server.py index f0638439a..98136157d 100644 --- a/src/allmydata/introducer/server.py +++ b/src/allmydata/introducer/server.py @@ -2,24 +2,13 @@ Ported to Python 3. 
""" -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function -from __future__ import unicode_literals +from __future__ import annotations - -from future.utils import PY2 -if PY2: - from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401 from past.builtins import long from six import ensure_text import time, os.path, textwrap - -try: - from typing import Any, Dict, Union -except ImportError: - pass +from typing import Any, Union from zope.interface import implementer from twisted.application import service @@ -161,11 +150,11 @@ class IntroducerService(service.MultiService, Referenceable): # v1 is the original protocol, added in 1.0 (but only advertised starting # in 1.3), removed in 1.12. v2 is the new signed protocol, added in 1.10 # TODO: reconcile bytes/str for keys - VERSION = { + VERSION : dict[Union[bytes, str], Any]= { #"http://allmydata.org/tahoe/protocols/introducer/v1": { }, b"http://allmydata.org/tahoe/protocols/introducer/v2": { }, b"application-version": allmydata.__full_version__.encode("utf-8"), - } # type: Dict[Union[bytes, str], Any] + } def __init__(self): service.MultiService.__init__(self) From 958c08d6f577fa97e3b5a146405b4329a14c6235 Mon Sep 17 00:00:00 2001 From: Itamar Turner-Trauring Date: Fri, 24 Mar 2023 11:44:14 -0400 Subject: [PATCH 25/51] Lint fix and cleanup. --- src/allmydata/node.py | 17 +++-------------- 1 file changed, 3 insertions(+), 14 deletions(-) diff --git a/src/allmydata/node.py b/src/allmydata/node.py index 34abb307f..58ee33ef5 100644 --- a/src/allmydata/node.py +++ b/src/allmydata/node.py @@ -4,14 +4,8 @@ a node for Tahoe-LAFS. Ported to Python 3. 
""" -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function -from __future__ import unicode_literals +from __future__ import annotations -from future.utils import PY2 -if PY2: - from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401 from six import ensure_str, ensure_text import json @@ -23,11 +17,7 @@ import errno from base64 import b32decode, b32encode from errno import ENOENT, EPERM from warnings import warn - -try: - from typing import Union -except ImportError: - pass +from typing import Union import attr @@ -281,8 +271,7 @@ def _error_about_old_config_files(basedir, generated_files): raise e -def ensure_text_and_abspath_expanduser_unicode(basedir): - # type: (Union[bytes, str]) -> str +def ensure_text_and_abspath_expanduser_unicode(basedir: Union[bytes, str]) -> str: return abspath_expanduser_unicode(ensure_text(basedir)) From 76ecdfb7bcd9e64bb191409c33a10fd5621a7102 Mon Sep 17 00:00:00 2001 From: Itamar Turner-Trauring Date: Fri, 24 Mar 2023 11:44:59 -0400 Subject: [PATCH 26/51] Fix lint. --- src/allmydata/scripts/admin.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/allmydata/scripts/admin.py b/src/allmydata/scripts/admin.py index 579505399..02fd9a143 100644 --- a/src/allmydata/scripts/admin.py +++ b/src/allmydata/scripts/admin.py @@ -255,9 +255,9 @@ def do_admin(options): return f(so) -subCommands = [ +subCommands : SubCommands = [ ("admin", None, AdminCommand, "admin subcommands: use 'tahoe admin' for a list"), - ] # type: SubCommands + ] dispatch = { "admin": do_admin, From e1839ff30d629129b2aed9f0462a3f1bae1df9de Mon Sep 17 00:00:00 2001 From: Itamar Turner-Trauring Date: Fri, 24 Mar 2023 11:45:56 -0400 Subject: [PATCH 27/51] Fix lints. 
--- src/allmydata/scripts/cli.py | 22 +++++----------------- 1 file changed, 5 insertions(+), 17 deletions(-) diff --git a/src/allmydata/scripts/cli.py b/src/allmydata/scripts/cli.py index 579b37906..6e1f28d11 100644 --- a/src/allmydata/scripts/cli.py +++ b/src/allmydata/scripts/cli.py @@ -1,22 +1,10 @@ """ Ported to Python 3. """ -from __future__ import unicode_literals -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -from future.utils import PY2 -if PY2: - from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401 - import os.path, re, fnmatch -try: - from allmydata.scripts.types_ import SubCommands, Parameters -except ImportError: - pass +from allmydata.scripts.types_ import SubCommands, Parameters from twisted.python import usage from allmydata.scripts.common import get_aliases, get_default_nodedir, \ @@ -29,14 +17,14 @@ NODEURL_RE=re.compile("http(s?)://([^:]*)(:([1-9][0-9]*))?") _default_nodedir = get_default_nodedir() class FileStoreOptions(BaseOptions): - optParameters = [ + optParameters : Parameters = [ ["node-url", "u", None, "Specify the URL of the Tahoe gateway node, such as " "'http://127.0.0.1:3456'. " "This overrides the URL found in the --node-directory ."], ["dir-cap", None, None, "Specify which dirnode URI should be used as the 'tahoe' alias."] - ] # type: Parameters + ] def postOptions(self): self["quiet"] = self.parent["quiet"] @@ -484,7 +472,7 @@ class DeepCheckOptions(FileStoreOptions): (which must be a directory), like 'tahoe check' but for multiple files. 
     Optionally repair any problems found."""
-subCommands = [
+subCommands : SubCommands = [
     ("mkdir", None, MakeDirectoryOptions, "Create a new directory."),
     ("add-alias", None, AddAliasOptions, "Add a new alias cap."),
     ("create-alias", None, CreateAliasOptions, "Create a new alias cap."),
@@ -503,7 +491,7 @@ subCommands = [
     ("check", None, CheckOptions, "Check a single file or directory."),
     ("deep-check", None, DeepCheckOptions, "Check all files/directories reachable from a starting point."),
     ("status", None, TahoeStatusCommand, "Various status information."),
-    ] # type: SubCommands
+    ]
 def mkdir(options):
     from allmydata.scripts import tahoe_mkdir

From 0cd197d4d0b156124f73df7b9b16607c846208ee Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 11:46:46 -0400
Subject: [PATCH 28/51] Update another instance of List.

---
 src/allmydata/immutable/upload.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/allmydata/immutable/upload.py b/src/allmydata/immutable/upload.py
index 9d6842f44..0421de4e0 100644
--- a/src/allmydata/immutable/upload.py
+++ b/src/allmydata/immutable/upload.py
@@ -910,7 +910,7 @@ class _Accum(object):
     def extend(self,
                size, # type: int
-               ciphertext, # type: List[bytes]
+               ciphertext, # type: list[bytes]
                ):
         """
         Accumulate some more ciphertext.

From ae29ea2b23cda231a953e9b1a8c92016c3ac0f53 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 11:47:43 -0400
Subject: [PATCH 29/51] Fix lint, and some Python 3 cleanups.

---
 src/allmydata/scripts/common.py | 36 ++++++++-------------------------
 1 file changed, 8 insertions(+), 28 deletions(-)

diff --git a/src/allmydata/scripts/common.py b/src/allmydata/scripts/common.py
index c9fc8e031..d6ca8556d 100644
--- a/src/allmydata/scripts/common.py
+++ b/src/allmydata/scripts/common.py
@@ -4,29 +4,13 @@
 Ported to Python 3.
 """
-from __future__ import unicode_literals
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
-from future.utils import PY2
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
-else:
-    from typing import Union
-
+from typing import Union, Optional
 import os, sys, textwrap
 import codecs
 from os.path import join
 import urllib.parse
-try:
-    from typing import Optional
-    from .types_ import Parameters
-except ImportError:
-    pass
-
 from yaml import (
     safe_dump,
 )
@@ -37,6 +21,8 @@ from allmydata.util.assertutil import precondition
 from allmydata.util.encodingutil import quote_output, \
     quote_local_unicode_path, argv_to_abspath
 from allmydata.scripts.default_nodedir import _default_nodedir
+from .types_ import Parameters
+
 def get_default_nodedir():
     return _default_nodedir
@@ -59,7 +45,7 @@ class BaseOptions(usage.Options):
     def opt_version(self):
         raise usage.UsageError("--version not allowed on subcommands")
-    description = None # type: Optional[str]
+    description : Optional[str] = None
     description_unwrapped = None # type: Optional[str]
     def __str__(self):
@@ -80,10 +66,10 @@ class BaseOptions(usage.Options):
 class BasedirOptions(BaseOptions):
     default_nodedir = _default_nodedir
-    optParameters = [
+    optParameters : Parameters = [
         ["basedir", "C", None, "Specify which Tahoe base directory should be used. [default: %s]"
          % quote_local_unicode_path(_default_nodedir)],
-    ] # type: Parameters
+    ]
     def parseArgs(self, basedir=None):
         # This finds the node-directory option correctly even if we are in a subcommand.
@@ -283,9 +269,8 @@ def get_alias(aliases, path_unicode, default):
                                 quote_output(alias))
     return uri.from_string_dirnode(aliases[alias]).to_string(), path[colon+1:]
-def escape_path(path):
-    # type: (Union[str,bytes]) -> str
-    u"""
+def escape_path(path: Union[str, bytes]) -> str:
+    """
     Return path quoted to US-ASCII, valid URL characters.
     >>> path = u'/føö/bar/☃'
@@ -302,9 +287,4 @@ def escape_path(path):
         ]),
         "ascii"
     )
-    # Eventually (i.e. as part of Python 3 port) we want this to always return
-    # Unicode strings. However, to reduce diff sizes in the short term it'll
-    # return native string (i.e. bytes) on Python 2.
-    if PY2:
-        result = result.encode("ascii").__native__()
     return result

From 29a66e51583f549c6d080dc0dd25cf2a77b7039a Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 12:01:12 -0400
Subject: [PATCH 30/51] Fix lint.

---
 src/allmydata/scripts/debug.py | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/src/allmydata/scripts/debug.py b/src/allmydata/scripts/debug.py
index 6201ce28f..b6eba842a 100644
--- a/src/allmydata/scripts/debug.py
+++ b/src/allmydata/scripts/debug.py
@@ -1,19 +1,8 @@
 """
 Ported to Python 3.
 """
-from __future__ import unicode_literals
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-from future.utils import PY2, bchr
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
-
-try:
-    from allmydata.scripts.types_ import SubCommands
-except ImportError:
-    pass
+from future.utils import bchr
 import struct, time, os, sys
@@ -31,6 +20,7 @@ from allmydata.mutable.common import NeedMoreDataError
 from allmydata.immutable.layout import ReadBucketProxy
 from allmydata.util import base32
 from allmydata.util.encodingutil import quote_output
+from allmydata.scripts.types_ import SubCommands
 class DumpOptions(BaseOptions):
     def getSynopsis(self):
@@ -1076,9 +1066,9 @@ def do_debug(options):
     return f(so)
-subCommands = [
+subCommands : SubCommands = [
     ("debug", None, DebugCommand, "debug subcommands: use 'tahoe debug' for a list."),
-    ] # type: SubCommands
+    ]
 dispatch = {
     "debug": do_debug,

From 0e6825709dbe28178e58d0d66820b139531a24b5 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 12:03:04 -0400
Subject: [PATCH 31/51] Fix lints.
---
 src/allmydata/scripts/create_node.py | 48 ++++++++++------------------
 src/allmydata/scripts/runner.py | 21 +++---------
 2 files changed, 21 insertions(+), 48 deletions(-)

diff --git a/src/allmydata/scripts/create_node.py b/src/allmydata/scripts/create_node.py
index 5d9da518b..7d15b95ec 100644
--- a/src/allmydata/scripts/create_node.py
+++ b/src/allmydata/scripts/create_node.py
@@ -1,25 +1,11 @@
-# Ported to Python 3
-
-from __future__ import print_function
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import unicode_literals
-
-from future.utils import PY2
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
-
 import io
 import os
-try:
-    from allmydata.scripts.types_ import (
-        SubCommands,
-        Parameters,
-        Flags,
-    )
-except ImportError:
-    pass
+from allmydata.scripts.types_ import (
+    SubCommands,
+    Parameters,
+    Flags,
+)
 from twisted.internet import reactor, defer
 from twisted.python.usage import UsageError
@@ -48,7 +34,7 @@ def write_tac(basedir, nodetype):
     fileutil.write(os.path.join(basedir, "tahoe-%s.tac" % (nodetype,)), dummy_tac)
-WHERE_OPTS = [
+WHERE_OPTS : Parameters = [
     ("location", None, None,
      "Server location to advertise (e.g. tcp:example.org:12345)"),
     ("port", None, None,
@@ -57,29 +43,29 @@ WHERE_OPTS = [
      "Hostname to automatically set --location/--port when --listen=tcp"),
     ("listen", None, "tcp",
      "Comma-separated list of listener types (tcp,tor,i2p,none)."),
-] # type: Parameters
+]
-TOR_OPTS = [
+TOR_OPTS : Parameters = [
     ("tor-control-port", None, None,
      "Tor's control port endpoint descriptor string (e.g. tcp:127.0.0.1:9051 or unix:/var/run/tor/control)"),
     ("tor-executable", None, None,
      "The 'tor' executable to run (default is to search $PATH)."),
-] # type: Parameters
+]
-TOR_FLAGS = [
+TOR_FLAGS : Flags = [
     ("tor-launch", None, "Launch a tor instead of connecting to a tor control port."),
-] # type: Flags
+]
-I2P_OPTS = [
+I2P_OPTS : Parameters = [
     ("i2p-sam-port", None, None,
      "I2P's SAM API port endpoint descriptor string (e.g. tcp:127.0.0.1:7656)"),
     ("i2p-executable", None, None,
      "(future) The 'i2prouter' executable to run (default is to search $PATH)."),
-] # type: Parameters
+]
-I2P_FLAGS = [
+I2P_FLAGS : Flags = [
     ("i2p-launch", None, "(future) Launch an I2P router instead of connecting to a SAM API port."),
-] # type: Flags
+]
 def validate_where_options(o):
     if o['listen'] == "none":
@@ -508,11 +494,11 @@ def create_introducer(config):
     defer.returnValue(0)
-subCommands = [
+subCommands : SubCommands = [
     ("create-node", None, CreateNodeOptions, "Create a node that acts as a client, server or both."),
     ("create-client", None, CreateClientOptions, "Create a client node (with storage initially disabled)."),
     ("create-introducer", None, CreateIntroducerOptions, "Create an introducer node."),
-] # type: SubCommands
+]
 dispatch = {
     "create-node": create_node,
diff --git a/src/allmydata/scripts/runner.py b/src/allmydata/scripts/runner.py
index d9fbc1b0a..18387cea5 100644
--- a/src/allmydata/scripts/runner.py
+++ b/src/allmydata/scripts/runner.py
@@ -1,28 +1,15 @@
-from __future__ import print_function
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import unicode_literals
-
-from future.utils import PY2
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
-
 import os, sys
-from six.moves import StringIO
+from io import StringIO
 from past.builtins import unicode
 import six
-try:
-    from allmydata.scripts.types_ import SubCommands
-except ImportError:
-    pass
-
 from twisted.python import usage
 from twisted.internet import defer, task, threads
 from allmydata.scripts.common import get_default_nodedir
 from allmydata.scripts import debug, create_node, cli, \
     admin, tahoe_run, tahoe_invite
+from allmydata.scripts.types_ import SubCommands
 from allmydata.util.encodingutil import quote_local_unicode_path, argv_to_unicode
 from allmydata.util.eliotutil import (
     opt_eliot_destination,
@@ -47,9 +34,9 @@ if _default_nodedir:
     NODEDIR_HELP += " [default for most commands: " + quote_local_unicode_path(_default_nodedir) + "]"
-process_control_commands = [
+process_control_commands : SubCommands = [
     ("run", None, tahoe_run.RunOptions, "run a node without daemonizing"),
-] # type: SubCommands
+]
 class Options(usage.Options):

From aea748a890e16b845b477f39eeb2d186fdd40ea9 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 12:03:43 -0400
Subject: [PATCH 32/51] Fix lint.

---
 src/allmydata/scripts/tahoe_invite.py | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/src/allmydata/scripts/tahoe_invite.py b/src/allmydata/scripts/tahoe_invite.py
index b62d6a463..b44efdeb9 100644
--- a/src/allmydata/scripts/tahoe_invite.py
+++ b/src/allmydata/scripts/tahoe_invite.py
@@ -1,19 +1,6 @@
 """
 Ported to Python 3.
 """
-from __future__ import unicode_literals
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
-from future.utils import PY2
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
-
-try:
-    from allmydata.scripts.types_ import SubCommands
-except ImportError:
-    pass
 from twisted.python import usage
 from twisted.internet import defer, reactor
@@ -21,6 +8,7 @@ from twisted.internet import defer, reactor
 from allmydata.util.encodingutil import argv_to_abspath
 from allmydata.util import jsonbytes as json
 from allmydata.scripts.common import get_default_nodedir, get_introducer_furl
+from allmydata.scripts.types_ import SubCommands
 from allmydata.client import read_config
@@ -112,10 +100,10 @@ def invite(options):
     print("Completed successfully", file=out)
-subCommands = [
+subCommands : SubCommands = [
     ("invite", None, InviteOptions,
      "Invite a new node to this grid"),
-] # type: SubCommands
+]
 dispatch = {
     "invite": invite,

From 494a977525f6db91c3a3f5be2a7b75dd91b0ce22 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 12:06:01 -0400
Subject: [PATCH 33/51] Fix lint.

---
 src/allmydata/storage/http_server.py | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/allmydata/storage/http_server.py b/src/allmydata/storage/http_server.py
index 3ae16ae5c..7437b3ec7 100644
--- a/src/allmydata/storage/http_server.py
+++ b/src/allmydata/storage/http_server.py
@@ -4,7 +4,7 @@ HTTP server for storage.
 from __future__ import annotations
-from typing import Dict, List, Set, Tuple, Any, Callable, Union, cast
+from typing import Any, Callable, Union, cast
 from functools import wraps
 from base64 import b64decode
 import binascii
@@ -67,8 +67,8 @@ class ClientSecretsException(Exception):
 def _extract_secrets(
-    header_values, required_secrets
-): # type: (List[str], Set[Secrets]) -> Dict[Secrets, bytes]
+    header_values: list[str], required_secrets: set[Secrets]
+) -> dict[Secrets, bytes]:
     """
     Given list of values of ``X-Tahoe-Authorization`` headers, and required
     secrets, return dictionary mapping secrets to decoded values.
@@ -173,7 +173,7 @@ class UploadsInProgress(object):
     _uploads: dict[bytes, StorageIndexUploads] = Factory(dict)
     # Map BucketWriter to (storage index, share number)
-    _bucketwriters: dict[BucketWriter, Tuple[bytes, int]] = Factory(dict)
+    _bucketwriters: dict[BucketWriter, tuple[bytes, int]] = Factory(dict)
     def add_write_bucket(
         self,
@@ -798,7 +798,9 @@ class HTTPServer(object):
         # The reason can be a string with explanation, so in theory it could be
         # longish?
         info = await self._read_encoded(
-            request, _SCHEMAS["advise_corrupt_share"], max_size=32768,
+            request,
+            _SCHEMAS["advise_corrupt_share"],
+            max_size=32768,
         )
         bucket.advise_corrupt_share(info["reason"].encode("utf-8"))
         return b""
@@ -973,7 +975,7 @@ def listen_tls(
     endpoint: IStreamServerEndpoint,
     private_key_path: FilePath,
     cert_path: FilePath,
-) -> Deferred[Tuple[DecodedURL, IListeningPort]]:
+) -> Deferred[tuple[DecodedURL, IListeningPort]]:
     """
     Start a HTTPS storage server on the given port, return the NURL and the
     listening port.

From 3212311bbe3dd7f17bd8b7a7d74ad70fa503a7a1 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 12:06:49 -0400
Subject: [PATCH 34/51] Fix lint.
---
 src/allmydata/storage/lease_schema.py | 17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/src/allmydata/storage/lease_schema.py b/src/allmydata/storage/lease_schema.py
index 7e604388e..63d3d4ed8 100644
--- a/src/allmydata/storage/lease_schema.py
+++ b/src/allmydata/storage/lease_schema.py
@@ -2,19 +2,7 @@
 Ported to Python 3.
 """
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-from __future__ import unicode_literals
-
-from future.utils import PY2
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
-
-try:
-    from typing import Union
-except ImportError:
-    pass
+from typing import Union
 import attr
@@ -95,8 +83,7 @@ class HashedLeaseSerializer(object):
             cls._hash_secret,
         )
-    def serialize(self, lease):
-        # type: (Union[LeaseInfo, HashedLeaseInfo]) -> bytes
+    def serialize(self, lease: Union[LeaseInfo, HashedLeaseInfo]) -> bytes:
         if isinstance(lease, LeaseInfo):
             # v2 of the immutable schema stores lease secrets hashed. If
             # we're given a LeaseInfo then it holds plaintext secrets. Hash

From 8d84e8a19f66cc05421608e3f25378f96ddad68c Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 12:08:04 -0400
Subject: [PATCH 35/51] Fix lint.

---
 src/allmydata/storage/server.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/allmydata/storage/server.py b/src/allmydata/storage/server.py
index 2bf99d74c..6099636f8 100644
--- a/src/allmydata/storage/server.py
+++ b/src/allmydata/storage/server.py
@@ -2,8 +2,9 @@
 Ported to Python 3.
 """
 from __future__ import annotations
+
 from future.utils import bytes_to_native_str
-from typing import Dict, Tuple, Iterable
+from typing import Iterable, Any
 import os, re
@@ -823,7 +824,7 @@ class FoolscapStorageServer(Referenceable): # type: ignore # warner/foolscap#78
         self._server = storage_server
         # Canaries and disconnect markers for BucketWriters created via Foolscap:
-        self._bucket_writer_disconnect_markers = {} # type: Dict[BucketWriter,Tuple[IRemoteReference, object]]
+        self._bucket_writer_disconnect_markers : dict[BucketWriter, tuple[IRemoteReference, Any]] = {}
         self._server.register_bucket_writer_close_handler(self._bucket_writer_closed)

From 74e3e27bea3309b33dc153114aad151baf7a4dd2 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:06:27 -0400
Subject: [PATCH 36/51] Fix lint.

---
 src/allmydata/test/cli/test_create.py | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/src/allmydata/test/cli/test_create.py b/src/allmydata/test/cli/test_create.py
index 609888fb3..1d1576082 100644
--- a/src/allmydata/test/cli/test_create.py
+++ b/src/allmydata/test/cli/test_create.py
@@ -1,21 +1,11 @@
 """
 Ported to Python 3.
 """
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-from __future__ import unicode_literals
-
-from future.utils import PY2
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
+from __future__ import annotations
 import os
-try:
-    from typing import Any, List, Tuple
-except ImportError:
-    pass
+from typing import Any
 from twisted.trial import unittest
 from twisted.internet import defer, reactor
@@ -356,8 +346,7 @@ class Config(unittest.TestCase):
         self.assertIn("is not empty", err)
         self.assertIn("To avoid clobbering anything, I am going to quit now", err)
-def fake_config(testcase, module, result):
-    # type: (unittest.TestCase, Any, Any) -> List[Tuple]
+def fake_config(testcase: unittest.TestCase, module: Any, result: Any) -> list[tuple]:
     """
     Monkey-patch a fake configuration function into the given module.

From 0c92fe554ddc8ce8b4c1b4efed943ff71158efab Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:07:22 -0400
Subject: [PATCH 37/51] Fix lint.

---
 src/allmydata/test/eliotutil.py | 22 +++-------------------
 1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/src/allmydata/test/eliotutil.py b/src/allmydata/test/eliotutil.py
index dd21f1e9d..bdc779f1d 100644
--- a/src/allmydata/test/eliotutil.py
+++ b/src/allmydata/test/eliotutil.py
@@ -3,18 +3,6 @@ Tools aimed at the interaction between tests and Eliot.
 Ported to Python 3.
 """
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-from __future__ import unicode_literals
-
-# Python 2 compatibility
-# Can't use `builtins.str` because it's not JSON encodable:
-# `exceptions.TypeError: is not JSON-encodeable`
-from past.builtins import unicode as str
-from future.utils import PY2
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, max, min # noqa: F401
 from six import ensure_text
@@ -23,11 +11,7 @@ __all__ = [
     "EliotLoggedRunTest",
 ]
-try:
-    from typing import Callable
-except ImportError:
-    pass
-
+from typing import Callable
 from functools import (
     partial,
     wraps,
@@ -147,8 +131,8 @@ class EliotLoggedRunTest(object):
 def with_logging(
-        test_id, # type: str
-        test_method, # type: Callable
+    test_id: str,
+    test_method: Callable,
 ):
     """
     Decorate a test method with additional log-related behaviors.

From 1668b2fcf6c6c65250922853047059786096add6 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:09:11 -0400
Subject: [PATCH 38/51] Fix lint.

---
 src/allmydata/test/no_network.py | 47 ++++++++++++--------------------
 1 file changed, 17 insertions(+), 30 deletions(-)

diff --git a/src/allmydata/test/no_network.py b/src/allmydata/test/no_network.py
index 66748e4b1..2346d96c1 100644
--- a/src/allmydata/test/no_network.py
+++ b/src/allmydata/test/no_network.py
@@ -1,34 +1,23 @@
 """
-Ported to Python 3.
+This contains a test harness that creates a full Tahoe grid in a single
+process (actually in a single MultiService) which does not use the network.
+It does not use an Introducer, and there are no foolscap Tubs. Each storage
+server puts real shares on disk, but is accessed through loopback
+RemoteReferences instead of over serialized SSL. It is not as complete as
+the common.SystemTestMixin framework (which does use the network), but
+should be considerably faster: on my laptop, it takes 50-80ms to start up,
+whereas SystemTestMixin takes close to 2s.
+
+This should be useful for tests which want to examine and/or manipulate the
+uploaded shares, checker/verifier/repairer tests, etc. The clients have no
+Tubs, so it is not useful for tests that involve a Helper.
 """
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-from __future__ import unicode_literals
-# This contains a test harness that creates a full Tahoe grid in a single
-# process (actually in a single MultiService) which does not use the network.
-# It does not use an Introducer, and there are no foolscap Tubs. Each storage
-# server puts real shares on disk, but is accessed through loopback
-# RemoteReferences instead of over serialized SSL. It is not as complete as
-# the common.SystemTestMixin framework (which does use the network), but
-# should be considerably faster: on my laptop, it takes 50-80ms to start up,
-# whereas SystemTestMixin takes close to 2s.
+from __future__ import annotations
-# This should be useful for tests which want to examine and/or manipulate the
-# uploaded shares, checker/verifier/repairer tests, etc. The clients have no
-# Tubs, so it is not useful for tests that involve a Helper.
-
-from future.utils import PY2
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
-from past.builtins import unicode
 from six import ensure_text
-try:
-    from typing import Dict, Callable
-except ImportError:
-    pass
+from typing import Callable
 import os
 from base64 import b32encode
@@ -251,7 +240,7 @@ def create_no_network_client(basedir):
     :return: a Deferred yielding an instance of _Client subclass which
         does no actual networking but has the same API.
     """
-    basedir = abspath_expanduser_unicode(unicode(basedir))
+    basedir = abspath_expanduser_unicode(str(basedir))
     fileutil.make_dirs(os.path.join(basedir, "private"), 0o700)
     from allmydata.client import read_config
@@ -577,8 +566,7 @@ class GridTestMixin(object):
                     pass
         return sorted(shares)
-    def copy_shares(self, uri):
-        # type: (bytes) -> Dict[bytes, bytes]
+    def copy_shares(self, uri: bytes) -> dict[bytes, bytes]:
         """
         Read all of the share files for the given capability from the storage
         area of the storage servers created by ``set_up_grid``.
@@ -630,8 +618,7 @@ class GridTestMixin(object):
                 with open(i_sharefile, "wb") as f:
                     f.write(corruptdata)
-    def corrupt_all_shares(self, uri, corruptor, debug=False):
-        # type: (bytes, Callable[[bytes, bool], bytes], bool) -> None
+    def corrupt_all_shares(self, uri: Callable, corruptor: Callable[[bytes, bool], bytes], debug: bool=False):
         """
         Apply ``corruptor`` to the contents of all share files associated with a
         given capability and replace the share file contents with its result.

From 9d45cd85c712c9ee857d79032e026b1b149bbf0f Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:12:16 -0400
Subject: [PATCH 39/51] Fix lint.
---
 src/allmydata/test/test_download.py | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/src/allmydata/test/test_download.py b/src/allmydata/test/test_download.py
index 85d89cde6..4d57fa828 100644
--- a/src/allmydata/test/test_download.py
+++ b/src/allmydata/test/test_download.py
@@ -1,23 +1,14 @@
 """
 Ported to Python 3.
 """
-from __future__ import print_function
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import unicode_literals
-from future.utils import PY2, bchr
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
+from future.utils import bchr
 # system-level upload+download roundtrip test, but using shares created from
 # a previous run. This asserts that the current code is capable of decoding
 # shares from a previous version.
-try:
-    from typing import Any
-except ImportError:
-    pass
+from typing import Any
 import six
 import os
@@ -1197,8 +1188,7 @@ class Corruption(_Base, unittest.TestCase):
         return d
-    def _corrupt_flip_all(self, ign, imm_uri, which):
-        # type: (Any, bytes, int) -> None
+    def _corrupt_flip_all(self, ign: Any, imm_uri: bytes, which: int) -> None:
         """
         Flip the least significant bit at a given byte position in all share files
         for the given capability.

From 0bdea026f0023b318389ed26b025bc3d5d1f5355 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:13:20 -0400
Subject: [PATCH 40/51] Fix lint.

---
 src/allmydata/test/test_helper.py | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/src/allmydata/test/test_helper.py b/src/allmydata/test/test_helper.py
index 933a2b591..b280f95df 100644
--- a/src/allmydata/test/test_helper.py
+++ b/src/allmydata/test/test_helper.py
@@ -1,14 +1,7 @@
 """
 Ported to Python 3.
 """
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-from __future__ import unicode_literals
-
-from future.utils import PY2
-if PY2:
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
+from __future__ import annotations
 import os
 from struct import (
@@ -17,13 +10,8 @@ from struct import (
 from functools import (
     partial,
 )
-import attr
-try:
-    from typing import List
-    from allmydata.introducer.client import IntroducerClient
-except ImportError:
-    pass
+import attr
 from twisted.internet import defer
 from twisted.trial import unittest
@@ -35,6 +23,7 @@ from eliot.twisted import (
     inline_callbacks,
 )
+from allmydata.introducer.client import IntroducerClient
 from allmydata.crypto import aes
 from allmydata.storage.server import (
     si_b2a,
@@ -132,7 +121,7 @@ class FakeCHKCheckerAndUEBFetcher(object):
         ))
 class FakeClient(service.MultiService):
-    introducer_clients = [] # type: List[IntroducerClient]
+    introducer_clients : list[IntroducerClient] = []
     DEFAULT_ENCODING_PARAMETERS = {"k":25,
                                    "happy": 75,
                                    "n": 100,

From 0377f858c2aa4628a2cc86018fcdea01a3ccc00e Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:14:23 -0400
Subject: [PATCH 41/51] Correct type.
---
 src/allmydata/test/no_network.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/allmydata/test/no_network.py b/src/allmydata/test/no_network.py
index 2346d96c1..ee1f48b17 100644
--- a/src/allmydata/test/no_network.py
+++ b/src/allmydata/test/no_network.py
@@ -618,7 +618,7 @@ class GridTestMixin(object):
                 with open(i_sharefile, "wb") as f:
                     f.write(corruptdata)
-    def corrupt_all_shares(self, uri: Callable, corruptor: Callable[[bytes, bool], bytes], debug: bool=False):
+    def corrupt_all_shares(self, uri: bytes, corruptor: Callable[[bytes, bool], bytes], debug: bool=False):
         """
         Apply ``corruptor`` to the contents of all share files associated with a
         given capability and replace the share file contents with its result.

From 0d92aecbf3b97237f611628404efb24d45a4196e Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:14:59 -0400
Subject: [PATCH 42/51] Fix lint.

---
 src/allmydata/test/test_istorageserver.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/allmydata/test/test_istorageserver.py b/src/allmydata/test/test_istorageserver.py
index 9e7e7b6e1..ded9ac1ac 100644
--- a/src/allmydata/test/test_istorageserver.py
+++ b/src/allmydata/test/test_istorageserver.py
@@ -8,9 +8,9 @@ reused across tests, so each test should be careful to generate unique
 storage indexes.
 """
-from future.utils import bchr
+from __future__ import annotations
-from typing import Set
+from future.utils import bchr
 from random import Random
 from unittest import SkipTest
@@ -1041,7 +1041,7 @@ class IStorageServerMutableAPIsTestsMixin(object):
 class _SharedMixin(SystemTestMixin):
     """Base class for Foolscap and HTTP mixins."""
-    SKIP_TESTS = set() # type: Set[str]
+    SKIP_TESTS : set[str] = set()
     def _get_istorage_server(self):
         native_server = next(iter(self.clients[0].storage_broker.get_known_servers()))

From f5d9947368d799581e65609da2bd18dfb5352509 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:15:51 -0400
Subject: [PATCH 43/51] Fix lint.

---
 src/allmydata/uri.py | 19 ++-----------------
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/src/allmydata/uri.py b/src/allmydata/uri.py
index 5641771d3..fccf05db9 100644
--- a/src/allmydata/uri.py
+++ b/src/allmydata/uri.py
@@ -6,26 +6,11 @@ Ported to Python 3.
 Methods ending in to_string() are actually to_bytes(), possibly should be fixed
 in follow-up port.
 """
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-from __future__ import unicode_literals
-
-from future.utils import PY2
-if PY2:
-    # Don't import bytes or str, to prevent future's newbytes leaking and
-    # breaking code that only expects normal bytes.
-    from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, dict, list, object, range, max, min # noqa: F401
-    from past.builtins import unicode as str
 from past.builtins import unicode, long
 import re
-
-try:
-    from typing import Type
-except ImportError:
-    pass
+from typing import Type
 from zope.interface import implementer
 from twisted.python.components import registerAdapter
@@ -707,7 +692,7 @@ class DirectoryURIVerifier(_DirectoryBaseURI):
     BASE_STRING=b'URI:DIR2-Verifier:'
     BASE_STRING_RE=re.compile(b'^'+BASE_STRING)
-    INNER_URI_CLASS=SSKVerifierURI # type: Type[IVerifierURI]
+    INNER_URI_CLASS : Type[IVerifierURI] = SSKVerifierURI
     def __init__(self, filenode_uri=None):
         if filenode_uri:

From 63549c71efee7c952ea0a9d8b3a80e4a53fa7236 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:18:46 -0400
Subject: [PATCH 44/51] Fix lints, remove some Python 2 junk.

---
 src/allmydata/util/base32.py | 60 +++++++++++-------------------------
 1 file changed, 18 insertions(+), 42 deletions(-)

diff --git a/src/allmydata/util/base32.py b/src/allmydata/util/base32.py
index ab65beeac..19a3bbe26 100644
--- a/src/allmydata/util/base32.py
+++ b/src/allmydata/util/base32.py
@@ -3,30 +3,11 @@ Base32 encoding.
 Ported to Python 3.
 """
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-from __future__ import unicode_literals
-
-from future.utils import PY2
-if PY2:
-    from builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
-
-if PY2:
-    def backwardscompat_bytes(b):
-        """
-        Replace Future bytes with native Python 2 bytes, so % works
-        consistently until other modules are ported.
-        """
-        return getattr(b, "__native__", lambda: b)()
-    import string
-    maketrans = string.maketrans
-else:
-    def backwardscompat_bytes(b):
-        return b
-    maketrans = bytes.maketrans
-    from typing import Optional
+def backwardscompat_bytes(b):
+    return b
+maketrans = bytes.maketrans
+from typing import Optional
 import base64
 from allmydata.util.assertutil import precondition
@@ -34,7 +15,7 @@ from allmydata.util.assertutil import precondition
 rfc3548_alphabet = b"abcdefghijklmnopqrstuvwxyz234567" # RFC3548 standard used by Gnutella, Content-Addressable Web, THEX, Bitzi, Web-Calculus...
 chars = rfc3548_alphabet
-vals = backwardscompat_bytes(bytes(range(32)))
+vals = bytes(range(32))
 c2vtranstable = maketrans(chars, vals)
 v2ctranstable = maketrans(vals, chars)
 identitytranstable = maketrans(b'', b'')
@@ -61,16 +42,16 @@ def get_trailing_chars_without_lsbs(N):
     d = {}
     return b''.join(_get_trailing_chars_without_lsbs(N, d=d))
-BASE32CHAR = backwardscompat_bytes(b'['+get_trailing_chars_without_lsbs(0)+b']')
-BASE32CHAR_4bits = backwardscompat_bytes(b'['+get_trailing_chars_without_lsbs(1)+b']')
-BASE32CHAR_3bits = backwardscompat_bytes(b'['+get_trailing_chars_without_lsbs(2)+b']')
-BASE32CHAR_2bits = backwardscompat_bytes(b'['+get_trailing_chars_without_lsbs(3)+b']')
-BASE32CHAR_1bits = backwardscompat_bytes(b'['+get_trailing_chars_without_lsbs(4)+b']')
-BASE32STR_1byte = backwardscompat_bytes(BASE32CHAR+BASE32CHAR_3bits)
-BASE32STR_2bytes = backwardscompat_bytes(BASE32CHAR+b'{3}'+BASE32CHAR_1bits)
-BASE32STR_3bytes = backwardscompat_bytes(BASE32CHAR+b'{4}'+BASE32CHAR_4bits)
-BASE32STR_4bytes = backwardscompat_bytes(BASE32CHAR+b'{6}'+BASE32CHAR_2bits)
-BASE32STR_anybytes = backwardscompat_bytes(bytes(b'((?:%s{8})*') % (BASE32CHAR,) + bytes(b"(?:|%s|%s|%s|%s))") % (BASE32STR_1byte, BASE32STR_2bytes, BASE32STR_3bytes, BASE32STR_4bytes))
+BASE32CHAR = b'['+get_trailing_chars_without_lsbs(0)+b']'
+BASE32CHAR_4bits = b'['+get_trailing_chars_without_lsbs(1)+b']'
+BASE32CHAR_3bits = b'['+get_trailing_chars_without_lsbs(2)+b']'
+BASE32CHAR_2bits = b'['+get_trailing_chars_without_lsbs(3)+b']'
+BASE32CHAR_1bits = b'['+get_trailing_chars_without_lsbs(4)+b']'
+BASE32STR_1byte = BASE32CHAR+BASE32CHAR_3bits
+BASE32STR_2bytes = BASE32CHAR+b'{3}'+BASE32CHAR_1bits
+BASE32STR_3bytes = BASE32CHAR+b'{4}'+BASE32CHAR_4bits
+BASE32STR_4bytes = BASE32CHAR+b'{6}'+BASE32CHAR_2bits
+BASE32STR_anybytes = bytes(b'((?:%s{8})*') % (BASE32CHAR,) + bytes(b"(?:|%s|%s|%s|%s))") % (BASE32STR_1byte, BASE32STR_2bytes, BASE32STR_3bytes, BASE32STR_4bytes)
 def b2a(os): # type: (bytes) -> bytes
     """
@@ -80,7 +61,7 @@ def b2a(os): # type: (bytes) -> bytes
     """
     return base64.b32encode(os).rstrip(b"=").lower()
-def b2a_or_none(os): # type: (Optional[bytes]) -> Optional[bytes]
+def b2a_or_none(os: Optional[bytes]) -> Optional[bytes]:
     if os is not None:
         return b2a(os)
     return None
@@ -100,8 +81,6 @@ NUM_OS_TO_NUM_QS=(0, 2, 4, 5, 7,)
 NUM_QS_TO_NUM_OS=(0, 1, 1, 2, 2, 3, 3, 4)
 NUM_QS_LEGIT=(1, 0, 1, 0, 1, 1, 0, 1,)
 NUM_QS_TO_NUM_BITS=tuple([_x*8 for _x in NUM_QS_TO_NUM_OS])
-if PY2:
-    del _x
 # A fast way to determine whether a given string *could* be base-32 encoded data, assuming that the
 # original data had 8K bits for a positive integer K.
@@ -135,8 +114,6 @@ def a2b(cs): # type: (bytes) -> bytes
     """
     @param cs the base-32 encoded data (as bytes)
     """
-    # Workaround Future newbytes issues by converting to real bytes on Python 2:
-    cs = backwardscompat_bytes(cs)
     precondition(could_be_base32_encoded(cs), "cs is required to be possibly base32 encoded data.", cs=cs)
     precondition(isinstance(cs, bytes), cs)
@@ -144,9 +121,8 @@ def a2b(cs): # type: (bytes) -> bytes
     # Add padding back, to make Python's base64 module happy:
     while (len(cs) * 5) % 8 != 0:
         cs += b"="
-    # Let newbytes come through and still work on Python 2, where the base64
-    # module gets confused by them.
-    return base64.b32decode(backwardscompat_bytes(cs))
+
+    return base64.b32decode(cs)
 
 
 __all__ = ["b2a", "a2b", "b2a_or_none", "BASE32CHAR_3bits", "BASE32CHAR_1bits", "BASE32CHAR", "BASE32STR_anybytes", "could_be_base32_encoded"]

From 6ce53000f0382cf53b72006d1df48648a2e8f651 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:19:39 -0400
Subject: [PATCH 45/51] Fix lint.

---
 src/allmydata/util/deferredutil.py | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/allmydata/util/deferredutil.py b/src/allmydata/util/deferredutil.py
index 83de411ce..70ce8dade 100644
--- a/src/allmydata/util/deferredutil.py
+++ b/src/allmydata/util/deferredutil.py
@@ -208,10 +208,9 @@ class WaitForDelayedCallsMixin(PollMixin):
 
 @inline_callbacks
 def until(
-    action, # type: Callable[[], defer.Deferred[Any]]
-    condition, # type: Callable[[], bool]
-):
-    # type: (...) -> defer.Deferred[None]
+    action: Callable[[], defer.Deferred[Any]],
+    condition: Callable[[], bool],
+) -> defer.Deferred[None]:
     """
     Run a Deferred-returning function until a condition is true.
 

From 06dc32a6c0e625c7188321aa6b6f5a2b2d2c7e89 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:20:11 -0400
Subject: [PATCH 46/51] Fix lint.

---
 src/allmydata/util/pollmixin.py | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/src/allmydata/util/pollmixin.py b/src/allmydata/util/pollmixin.py
index 582bafe86..b23277565 100644
--- a/src/allmydata/util/pollmixin.py
+++ b/src/allmydata/util/pollmixin.py
@@ -4,22 +4,10 @@ Polling utility that returns Deferred.
 Ported to Python 3.
 """
 
-from __future__ import print_function
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import unicode_literals
-
-from future.utils import PY2
-if PY2:
-    from builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401
+from __future__ import annotations
 
 import time
 
-try:
-    from typing import List
-except ImportError:
-    pass
-
 from twisted.internet import task
 
 class TimeoutError(Exception):
@@ -29,7 +17,7 @@ class PollComplete(Exception):
     pass
 
 class PollMixin(object):
-    _poll_should_ignore_these_errors = [] # type: List[Exception]
+    _poll_should_ignore_these_errors : list[Exception] = []
 
     def poll(self, check_f, pollinterval=0.01, timeout=1000):
         # Return a Deferred, then call check_f periodically until it returns

From ee75bcd26bb2336c275224988feffdc531c36d05 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:20:48 -0400
Subject: [PATCH 47/51] Fix lint.

---
 src/allmydata/web/common.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/allmydata/web/common.py b/src/allmydata/web/common.py
index 3d85b1c4d..bd1e3838e 100644
--- a/src/allmydata/web/common.py
+++ b/src/allmydata/web/common.py
@@ -117,7 +117,7 @@ def boolean_of_arg(arg): # type: (bytes) -> bool
     return arg.lower() in (b"true", b"t", b"1", b"on")
 
 
-def parse_replace_arg(replace): # type: (bytes) -> Union[bool,_OnlyFiles]
+def parse_replace_arg(replace: bytes) -> Union[bool,_OnlyFiles]:
     assert isinstance(replace, bytes)
     if replace.lower() == b"only-files":
         return ONLY_FILES

From 51c7ca8d2cee964c94c8bce689520eb25c8325ee Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:22:21 -0400
Subject: [PATCH 48/51] Workaround for incompatibility.

---
 setup.py | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/setup.py b/setup.py
index 6528b01ed..854a333f1 100644
--- a/setup.py
+++ b/setup.py
@@ -63,7 +63,11 @@ install_requires = [
     # Twisted[conch] also depends on cryptography and Twisted[tls]
     # transitively depends on cryptography. So it's anyone's guess what
     # version of cryptography will *really* be installed.
-    "cryptography >= 2.6",
+
+    # * cryptography 40 broke constants we need; should really be using them
+    # * via pyOpenSSL; will be fixed in
+    # * https://github.com/pyca/pyopenssl/issues/1201
+    "cryptography >= 2.6, < 40",
 
     # * The SFTP frontend depends on Twisted 11.0.0 to fix the SSH server
     # rekeying bug

From 796fc5bdc532d3809fcda2a50175ebacd9eb0504 Mon Sep 17 00:00:00 2001
From: Itamar Turner-Trauring
Date: Fri, 24 Mar 2023 15:27:51 -0400
Subject: [PATCH 49/51] Fix lint.

---
 misc/checkers/check_load.py | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/misc/checkers/check_load.py b/misc/checkers/check_load.py
index 21576ea3a..d509b89ae 100644
--- a/misc/checkers/check_load.py
+++ b/misc/checkers/check_load.py
@@ -33,20 +33,11 @@ a mean of 10kB and a max of 100MB, so filesize=min(int(1.0/random(.0002)),1e8)
 
 
 """
+from __future__ import annotations
 
 import os, sys, httplib, binascii
 import urllib, json, random, time, urlparse
 
-try:
-    from typing import Dict
-except ImportError:
-    pass
-
-# Python 2 compatibility
-from future.utils import PY2
-if PY2:
-    from future.builtins import str # noqa: F401
-
 if sys.argv[1] == "--stats":
     statsfiles = sys.argv[2:]
     # gather stats every 10 seconds, do a moving-window average of the last
@@ -54,9 +45,9 @@ if sys.argv[1] == "--stats":
     DELAY = 10
     MAXSAMPLES = 6
     totals = []
-    last_stats = {} # type: Dict[str, float]
+    last_stats : dict[str, float] = {}
     while True:
-        stats = {} # type: Dict[str, float]
+        stats : dict[str, float] = {}
         for sf in statsfiles:
             for line in open(sf, "r").readlines():
                 name, str_value = line.split(":")

From 226da2fb2afa5961f8580619002905e3aeec580d Mon Sep 17 00:00:00 2001
From: Jean-Paul Calderone
Date: Sun, 26 Mar 2023 11:49:17 -0400
Subject: [PATCH 50/51] Add missing pyyaml dependency

It worked without this because we got the pyyaml dependency transitively but we should declare it directly since it is a direct dependency.
---
 nix/tahoe-lafs.nix | 1 +
 1 file changed, 1 insertion(+)

diff --git a/nix/tahoe-lafs.nix b/nix/tahoe-lafs.nix
index 2e1c4aa39..bf3ea83d3 100644
--- a/nix/tahoe-lafs.nix
+++ b/nix/tahoe-lafs.nix
@@ -34,6 +34,7 @@ let
     magic-wormhole
     netifaces
     psutil
+    pyyaml
     pycddl
     pyrsistent
     pyutil

From 6bf1f0846a9858e50988cf235562b5f92c11ebd5 Mon Sep 17 00:00:00 2001
From: Jean-Paul Calderone
Date: Sun, 26 Mar 2023 12:56:26 -0400
Subject: [PATCH 51/51] additional news fragment

---
 newsfragments/3997.installation | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 newsfragments/3997.installation

diff --git a/newsfragments/3997.installation b/newsfragments/3997.installation
new file mode 100644
index 000000000..186be0fc2
--- /dev/null
+++ b/newsfragments/3997.installation
@@ -0,0 +1 @@
+Tahoe-LAFS is incompatible with cryptography >= 40 and now declares a requirement on an older version.
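
Reviewer note (illustrative, not part of the patch series): the range declared in setup.py, "cryptography >= 2.6, < 40", can be sanity-checked against a version string with a few lines of standard-library Python. This is only a rough sketch. The helper names `release_tuple` and `satisfies_cryptography_pin` are hypothetical and do not exist in Tahoe-LAFS, and the parsing looks only at the leading numeric release segment, so it is not a full PEP 440 implementation.

```python
from __future__ import annotations

import re


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. "39.0.2" -> (39, 0, 2).

    Hypothetical helper: suffixes such as "rc1" or ".dev0" are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release segment in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies_cryptography_pin(version: str) -> bool:
    """Check a version string against the range "cryptography >= 2.6, < 40"."""
    release = release_tuple(version)
    # Tuple comparison is lexicographic: (40, 0, 0) is not below (40,),
    # so every 40.x release falls outside the range.
    return (2, 6) <= release < (40,)
```

For example, `satisfies_cryptography_pin("39.0.2")` is true and `satisfies_cryptography_pin("40.0.0")` is false. A real installer resolves the specifier itself; a check like this would only be useful in a diagnostic script.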