[CORDA-1629]: Documented backup strategy recommendations for Corda platform components. (#3377)

Michele Sollecito 2018-06-19 11:05:23 +01:00 committed by GitHub
parent fe705ef93d
commit 2dded2aa28


@ -12,7 +12,7 @@ to time. You can have logging printed to the console as well by passing the ``--
The default logging level is ``INFO`` which can be adjusted by the ``--logging-level`` command line argument. This configuration
option will affect all modules.
It may be the case that you require to amend the log level of a particular subset of modules (e.g., if you'd like to take a
closer look at hibernate activity). So, for more bespoke logging configuration, the logger settings can be completely overridden
with a `Log4j 2 <https://logging.apache.org/log4j/2.x>`_ configuration file assigned to the ``log4j.configurationFile`` system property.
@ -46,7 +46,7 @@ Now start the node as usual but with the additional parameter ``log4j.configurat
``java <Your existing startup options here> -Dlog4j.configurationFile=sql.xml -jar corda.jar``
To determine the name of the logger, for Corda objects, use the fully qualified name (e.g., to look at node output
in more detail, use ``net.corda.node.internal.Node`` although be aware that as we have marked this class ``internal`` we
reserve the right to move and rename it as it's not part of the public API as yet). For other libraries, refer to their
logging name construction. If you can't find what you need to refer to, use the ``--logging-level`` option as above and
@ -158,3 +158,40 @@ node is running out of memory, you can give it more by running the node like thi
The example command above would give a 1 gigabyte Java heap.

.. note:: Unfortunately the JVM does not let you limit the total memory usage of a Java program, just the heap size.

Backup recommendations
----------------------

Various components of the Corda platform read their configuration from the file system, and persist data to a database or into files on disk.
Given that hardware can fail, operators of IT infrastructure must have a sound backup strategy in place. Whilst blockchain platforms can sometimes recover some lost data from their peers, it is rarely the case that a node can recover its full state in this way because real-world blockchain applications invariably contain private information (e.g., customer account information). Moreover, this private information must remain in sync with the ledger state. As such, we strongly recommend implementing a comprehensive backup strategy.

The following elements of a backup strategy are recommended:

Database replication
++++++++++++++++++++

When properly configured, database replication prevents data loss if the database host fails.
In general, the more replicas there are, and the more widely they are spread across regions and availability zones, the more resilient the setup is to disasters.
The trade-off is that replication should ideally happen synchronously, so a large number of replicas and considerable network latency between them will degrade the performance of the Corda nodes connecting to the cluster.
Synchronous replication is strongly advised to prevent data loss.
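
The latency cost of synchronous replication can be illustrated with a deliberately simple model. This is a sketch under stated assumptions, not a description of any particular database: it assumes a commit is acknowledged only after every replica has confirmed it, so the slowest replica sets the pace. The function name and the round-trip figures are hypothetical.

```python
from typing import Sequence


def sync_commit_latency_ms(local_commit_ms: float,
                           replica_rtts_ms: Sequence[float]) -> float:
    """Estimate commit latency when ALL replicas must acknowledge.

    Assumption: replicas are contacted in parallel, so the commit waits
    for the slowest round trip, not for the sum of all round trips.
    """
    return local_commit_ms + max(replica_rtts_ms, default=0.0)


# Hypothetical figures: two replicas in the same zone versus one nearby
# replica plus one in a distant region.
same_zone = sync_commit_latency_ms(2.0, [1.0, 1.5])
cross_region = sync_commit_latency_ms(2.0, [1.0, 80.0])
print(f"same zone: {same_zone} ms, cross region: {cross_region} ms")
```

Real database clusters may acknowledge after a quorum rather than after all replicas, which changes the estimate. The point of the sketch is only that a single distant replica can dominate commit time.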

Database snapshots
++++++++++++++++++

Database replication is a powerful technique, but it is very sensitive to destructive SQL updates. A destructive SQL statement, whether malicious or unintentional, is propagated to all replicas and can compromise every copy of the data.
Without rolling snapshots, data loss due to such destructive updates will be irreversible.
Restoring from a snapshot always implies losing the data written since that snapshot was taken. The trade-off is between frequent backups, which minimise such loss, and less frequent backups, which consume fewer resources.
At present, Corda does not offer online updates with regard to transactions.
Should states in the vault ever be lost, partial or total recovery might be achieved by asking third-party companies and/or notaries to provide all data relevant to the affected legal identity.
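
The snapshot frequency trade-off described above can be made concrete with some simple arithmetic. The sketch below is illustrative only: the ``SnapshotPlan`` class and the intervals and retention periods are hypothetical, and it assumes snapshots are taken at a fixed interval and kept for a fixed number of days.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotPlan:
    interval_minutes: int  # time between two consecutive snapshots
    retention_days: int    # how long each snapshot is kept

    @property
    def worst_case_loss_minutes(self) -> int:
        # A failure just before the next snapshot loses one full interval.
        return self.interval_minutes

    @property
    def snapshots_retained(self) -> int:
        # Number of snapshots held at any time, a rough proxy for cost.
        return (self.retention_days * 24 * 60) // self.interval_minutes


for plan in (SnapshotPlan(15, 7), SnapshotPlan(24 * 60, 7)):
    print(f"every {plan.interval_minutes} min: "
          f"lose up to {plan.worst_case_loss_minutes} min of data, "
          f"keep {plan.snapshots_retained} snapshots")
```

Under these assumptions, moving from daily to 15-minute snapshots cuts the worst-case loss window from a day to a quarter of an hour, at the price of retaining 96 times as many snapshots.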

File backups
++++++++++++

Corda components read and write information from and to the file system. The advice is to back up the entire root directory of the component, plus any external directories and files optionally specified in the configuration.
Corda assumes the file system is reliable. You must ensure that it is configured to provide this assurance, which means you must configure it to synchronously replicate to your backup/DR site.
If the above holds, Corda components will benefit from the following:

* Guaranteed eventual processing of acknowledged client messages, provided that the backlog of persistent queues is not lost irremediably.
* A timely recovery from deletion or corruption of configuration files (e.g., ``node.conf`` and ``node-info`` files), database drivers, CorDapp binaries and configuration, and certificate directories, provided backups are available to restore from.

.. warning:: Private keys used to sign transactions should be preserved with the utmost care. The recommendation is to keep at least two separate copies on storage that is not connected to the Internet.
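
As a minimal sketch of the file backup advice above, the following function archives a component's root directory together with any external paths named in its configuration. It is not a Corda tool: the function name, the archive layout and the ``external/`` prefix are assumptions made for illustration. It also assumes the component is stopped, or that the files are otherwise quiescent, while the archive is written.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable


def backup_component(root: Path, extra_paths: Iterable[Path],
                     dest_dir: Path) -> Path:
    """Write a timestamped .tar.gz of `root` plus `extra_paths`.

    `dest_dir` must lie outside `root`; otherwise the archive would be
    written into the very directory that is being archived.
    """
    root_abs, dest_abs = root.resolve(), dest_dir.resolve()
    if dest_abs == root_abs or root_abs in dest_abs.parents:
        raise ValueError("dest_dir must be outside the component root")

    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = dest_dir / f"{root.name}-{stamp}.tar.gz"

    with tarfile.open(archive, "w:gz") as tar:
        # The whole root directory, stored under its own name.
        tar.add(root, arcname=root.name)
        # External files and directories referenced by the configuration.
        for extra in extra_paths:
            tar.add(extra, arcname=f"external/{extra.name}")
    return archive
```

A call might look like ``backup_component(Path("/opt/corda/node"), [Path("/opt/corda/drivers")], Path("/backups"))``, where all three paths are hypothetical. Because the resulting archive would normally include the certificate directories, it deserves the same care as the private keys themselves.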