.. -*- coding: utf-8-with-signature -*-

============================================
Performance costs for some common operations
============================================

1. `Publishing an A-byte immutable file`_
2. `Publishing an A-byte mutable file`_
3. `Downloading B bytes of an A-byte immutable file`_
4. `Downloading B bytes of an A-byte mutable file`_
5. `Modifying B bytes of an A-byte mutable file`_
6. `Inserting/Removing B bytes in an A-byte mutable file`_
7. `Adding an entry to an A-entry directory`_
8. `Listing an A entry directory`_
9. `Checking an A-byte file`_
10. `Verifying an A-byte file (immutable)`_
11. `Verifying an A-byte file (mutable)`_
12. `Repairing an A-byte file (mutable or immutable)`_

``K`` indicates the number of shares required to reconstruct the file
(default: 3)

``N`` indicates the total number of shares produced (default: 10)

``S`` indicates the segment size (default: 128 KiB)

``A`` indicates the number of bytes in a file

``B`` indicates the number of bytes of a file that are being read or
written

``G`` indicates the number of storage servers on your grid

Most of these cost estimates may have a further constant multiplier: when a
formula says ``N/K*S``, the cost may actually be ``2*N/K*S`` or ``3*N/K*S``.
Also note that all references to mutable files are for SDMF-formatted files;
this document has not yet been updated to describe the MDMF format.

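As a rough illustration of how the ``N/K`` factor behaves with the defaults
listed above, the following sketch evaluates ``N/K*S`` and ``N/K*A`` for
concrete sizes. The helper name ``expanded`` is hypothetical and is not part
of Tahoe-LAFS; the result ignores the further constant multiplier mentioned
above.

```python
import math
from fractions import Fraction

# Defaults quoted in the definitions above.
K = 3             # shares required to reconstruct the file
N = 10            # total shares produced
S = 128 * 1024    # segment size in bytes (128 KiB)


def expanded(size, k=K, n=N):
    """Return ceil(n/k * size), the size after erasure-coding expansion.

    Hypothetical helper for illustration only; it is not a Tahoe-LAFS API.
    """
    if k <= 0 or n < k:
        raise ValueError("need 0 < k <= n")
    # Fraction keeps the arithmetic exact before rounding up.
    return math.ceil(Fraction(n, k) * size)


if __name__ == "__main__":
    print("N/K*S for one default segment:", expanded(S), "bytes")
    print("N/K*A for a 3,000,000-byte file:", expanded(3_000_000), "bytes")
```

With the defaults, one 128 KiB segment expands to roughly 427 KiB, which is
the figure that formulas such as ``N/K*S`` refer to.
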
Publishing an ``A``-byte immutable file
=======================================

when the file is already uploaded
---------------------------------

If the file is already uploaded with the exact same contents, same
erasure coding parameters (K, N), and same added convergence secret,
then Tahoe-LAFS reads the whole file from disk one time while hashing it
to compute the storage index, then contacts about N servers to ask each
one to store a share. All of the servers reply that they already have
a copy of that share, and the upload is done.

disk: A

cpu: ~A

network: ~N

memory footprint: S

when the file is not already uploaded
-------------------------------------

If the file is not already uploaded with the exact same contents, same
erasure coding parameters (K, N), and same added convergence secret,
then Tahoe-LAFS reads the whole file from disk one time while hashing it
to compute the storage index, then contacts about N servers to ask each
one to store a share. The file is then read a second time (hence the
``2*A`` disk cost) while it is encrypted and encoded, and each share is
uploaded to a storage server.

disk: 2*A

cpu: ~2*A

network: N/K*A

memory footprint: N/K*S

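The two cases above can be put side by side with a small sketch. The function
name ``immutable_upload_cost`` and the dictionary layout are assumptions made
for illustration; the values are simply the formulas quoted in this section,
without any constant multiplier. Note that in the already-uploaded case the
network figure counts server queries (about N), not bytes.

```python
def immutable_upload_cost(a, already_uploaded, k=3, n=10, s=128 * 1024):
    """Return the cost formulas from this section for an a-byte immutable file.

    Hypothetical helper, not a Tahoe-LAFS API. Units: "disk", "cpu" and
    "memory" are in bytes (cpu meaning bytes processed); "network" is bytes
    for a real upload, but a count of server queries when the file is
    already present on the grid.
    """
    if already_uploaded:
        # One hashing pass over the file, then about N "do you have it?" queries.
        return {"disk": a, "cpu": a, "network": n, "memory": s}
    # One hashing pass plus one encoding pass; N/K expansion on the wire.
    return {
        "disk": 2 * a,
        "cpu": 2 * a,
        "network": n * a / k,
        "memory": n * s / k,
    }


if __name__ == "__main__":
    size = 3_000_000
    print("already uploaded:", immutable_upload_cost(size, True))
    print("new upload:      ", immutable_upload_cost(size, False))
```
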
Publishing an ``A``-byte mutable file
=====================================

cpu: ~A + a large constant for RSA keypair generation

network: A

memory footprint: N/K*A

notes:
Tahoe-LAFS generates a new RSA keypair for each mutable file that it
publishes to a grid. This takes around 100 milliseconds on a relatively
high-end laptop from 2021.

Part of the process of encrypting, encoding, and uploading a mutable file to a
Tahoe-LAFS grid requires that the entire file be in memory at once. For larger
files, this may cause Tahoe-LAFS to have an unacceptably large memory footprint
(at least when uploading a mutable file).

Downloading ``B`` bytes of an ``A``-byte immutable file
=======================================================

cpu: ~B

network: B

notes: When Tahoe-LAFS 1.8.0 or later is asked to read an arbitrary
range of an immutable file, only the S-byte segments that overlap the
requested range will be downloaded.

(Earlier versions would download from the beginning of the file up
until the end of the requested range, and then continue to download
the rest of the file even after the request was satisfied.)

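The segment-overlap behaviour described in the note can be sketched as plain
index arithmetic. This is only an illustration of which S-byte segments a
byte range touches; the function ``segments_for_range`` is hypothetical and
is not taken from the Tahoe-LAFS downloader.

```python
def segments_for_range(offset, length, file_size, segment_size=128 * 1024):
    """Return the range of segment indices overlapping [offset, offset+length).

    Hypothetical helper for illustration; it models only the arithmetic,
    not how Tahoe-LAFS schedules or fetches segments.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if length == 0 or offset >= file_size:
        return range(0)  # nothing to read
    end = min(offset + length, file_size)    # clamp the read to the file
    first = offset // segment_size           # segment holding the first byte
    last = (end - 1) // segment_size         # segment holding the last byte
    return range(first, last + 1)


if __name__ == "__main__":
    segs = segments_for_range(offset=100_000, length=100_000, file_size=1_000_000)
    print("segments touched:", list(segs))
    print("plaintext bytes covered:", len(segs) * 128 * 1024)
```

A read that straddles a segment boundary therefore touches two segments even
when ``B`` is much smaller than ``S``, which is one reason the real cost can
exceed the nominal ``B``.
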
Downloading ``B`` bytes of an ``A``-byte mutable file
=====================================================

cpu: ~A

network: A

memory footprint: A

notes: As currently implemented, mutable files must be downloaded in
their entirety before any part of them can be read. We are
exploring fixes for this; see ticket #393 for more information.

Modifying ``B`` bytes of an ``A``-byte mutable file
===================================================

cpu: ~A

network: A

memory footprint: N/K*A

notes: If you upload a changed version of a mutable file that you
earlier put onto your grid with, say, ``tahoe put --mutable``,
Tahoe-LAFS will replace the old file with the new file on the
grid, rather than attempting to modify only those portions of the
file that have changed. Modifying a file in this manner is
essentially uploading the file over again, except that it re-uses
the existing RSA keypair instead of generating a new one.

Inserting/Removing ``B`` bytes in an ``A``-byte mutable file
============================================================

cpu: ~A

network: A

memory footprint: N/K*A

notes: Modifying any part of a mutable file in Tahoe-LAFS requires that
the entire file be downloaded, modified, held in memory while it is
encrypted and encoded, and then re-uploaded. A future version of the
mutable file layout ("LDMF") may provide efficient inserts and
deletes. Note that this sort of modification is mostly used internally
for directories, and isn't something that the WUI, CLI, or other
interfaces will do; instead, they will simply overwrite the file to
be modified, as described in `Modifying B bytes of an A-byte mutable
file`_.

Adding an entry to an ``A``-entry directory
===========================================

cpu: ~A

network: ~A

memory footprint: N/K*~A

notes: In Tahoe-LAFS, directories are implemented as specialized mutable
files. So adding an entry to a directory is essentially adding B
(actually, 300-330) bytes somewhere in an existing mutable file.

Listing an ``A`` entry directory
================================

cpu: ~A

network: ~A

memory footprint: N/K*~A

notes: Listing a directory requires that the mutable file storing the
directory be downloaded from the grid. So listing an A entry
directory requires downloading a (roughly) 330 * A byte mutable
file, since each directory entry is about 300-330 bytes in size.

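The per-entry figure in the note gives a quick way to estimate how large the
mutable file behind a directory is. The sketch below only multiplies out the
300-330 byte range quoted above; the function name ``directory_size_bounds``
is hypothetical, and real entry sizes depend on the entry names and
capabilities stored.

```python
def directory_size_bounds(entries, min_entry_bytes=300, max_entry_bytes=330):
    """Return (low, high) byte estimates for a directory with `entries` entries.

    Hypothetical helper; the 300 and 330 byte defaults are the per-entry
    sizes quoted in the note above, not measured values.
    """
    if entries < 0:
        raise ValueError("entries must be non-negative")
    return entries * min_entry_bytes, entries * max_entry_bytes


if __name__ == "__main__":
    low, high = directory_size_bounds(1000)
    print(f"a 1000-entry directory is roughly {low} to {high} bytes")
```

Because the directory is an SDMF mutable file, this whole estimate is
downloaded for every listing, regardless of how many entries the caller
actually needs.
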
Checking an ``A``-byte file
===========================

cpu: ~G

network: ~G

memory footprint: negligible

notes: To check a file, Tahoe-LAFS queries all the servers that it knows
about. Note that neither the cpu cost nor the network cost depends
directly on the size of the file. This is relatively inexpensive, compared
to the verify and repair operations.

Verifying an A-byte file (immutable)
====================================

cpu: ~N/K*A

network: N/K*A

memory footprint: N/K*S

notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext
shares that were originally uploaded to the grid and integrity checks
them. This is (for grids with good redundancy) more expensive than
downloading an A-byte file, since only a fraction of these shares would
be necessary to recover the file.

Verifying an A-byte file (mutable)
==================================

cpu: ~N/K*A

network: N/K*A

memory footprint: N/K*A

notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext
shares that were originally uploaded to the grid and integrity checks
them. This is (for grids with good redundancy) more expensive than
downloading an A-byte file, since only a fraction of these shares would
be necessary to recover the file.

Repairing an ``A``-byte file (mutable or immutable)
===================================================

cpu: variable, between ~A and ~N/K*A

network: variable, between A and N/K*A

memory footprint: (1+N/K)*S for immutable files, (1+N/K)*A for SDMF
mutable files

notes: To repair a file, Tahoe-LAFS downloads the file, and
generates/uploads missing shares in the same way as when it initially
uploads the file. So, depending on how many shares are missing, this
can cost as little as a download or as much as a download followed by
a full upload.

Since SDMF files have only one segment, which must be processed in its
entirety, repair requires a full-file download followed by a full-file
upload.

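One way to see where the ``A`` to ``N/K*A`` network range comes from is to
interpolate on the number of missing shares. The sketch below assumes that
each share is about ``A/K`` bytes, that the download costs ``A``, and that
only missing shares are re-uploaded; these are simplifying assumptions for
illustration, and the function ``repair_network_estimate`` is hypothetical
rather than part of Tahoe-LAFS. It does not model the SDMF case described
above, where the full file is uploaded again.

```python
def repair_network_estimate(a, missing_shares, k=3, n=10):
    """Estimate network bytes to repair an a-byte immutable file.

    Hypothetical helper. Assumes each share is about a/k bytes, a download
    of a bytes, and an upload of only the missing shares. At most n-k shares
    may be missing, otherwise the file cannot be reconstructed at all.
    """
    if not 0 <= missing_shares <= n - k:
        raise ValueError("file is unrecoverable or share count is invalid")
    download = a                        # fetch enough shares to rebuild the file
    upload = missing_shares * a / k     # regenerate and send the missing shares
    return download + upload


if __name__ == "__main__":
    size = 3_000_000
    for missing in (0, 3, 7):
        print(missing, "missing shares ->", repair_network_estimate(size, missing), "bytes")
```

With no shares missing the estimate is ``A``; with the maximum recoverable
loss of ``N-K`` shares it is ``A + (N-K)/K*A``, which equals ``N/K*A``.
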