mirror of
https://github.com/corda/corda.git
synced 2025-02-20 17:33:15 +00:00
TWP: Address review comments
parent
46a305602b
commit
d5f6d90b37
@ -386,4 +386,10 @@ publisher = {USENIX Association},
|
||||
author = {Fabian Vogelsteller and Vitalik Buterin},
|
||||
howpublished = {\url{https://eips.ethereum.org/EIPS/eip-20}},
|
||||
year = {2015}
|
||||
}
|
||||
|
||||
@misc{ISDACDM,
|
||||
author = {ISDA},
|
||||
howpublished = {\url{https://portal.cdm.rosetta-technology.io/}},
|
||||
year = {2018}
|
||||
}
|
@ -119,7 +119,7 @@ information to node administrators and users and may interact with people as wel
|
||||
to enable developers to re-use common protocols such as notarisation, membership broadcast and so on.
|
||||
\item The data model allows for arbitrary object graphs to be stored in the ledger. These graphs are called \emph{states} and are the atomic unit of data.
|
||||
\item Nodes are backed by a relational database and data placed in the ledger can be queried using SQL as well as joined
|
||||
with private tables. States can declare a relational mapping using the JPA standard.
|
||||
with private tables. States can declare a relational mapping using the Java Persistence API (JPA) standard~\cite{JPA}.
|
||||
\item The platform provides a rich type system for the representation of things like dates, currencies, legal entities and
|
||||
financial entities such as cash, issuance, deals and so on.
|
||||
\item The network can support rapid bulk data imports from other database systems without placing load on the network.
|
||||
@ -319,7 +319,7 @@ channel protocol\cite{PaymentChannels} involves two parties putting money into a
|
||||
iterating with your counterparty a shared transaction that spends that pot, with extra transactions used for the
|
||||
case where one party or the other fails to terminate properly. Such protocols typically involve reliable private
|
||||
message passing, checkpointing to disk, signing of transactions, interaction with the p2p network, reporting
|
||||
progress to the user, maintaining a complex state machine with timeouts and error cases, and possibly interaction
|
||||
progress to the user, maintaining a complex state machine with timeouts and error cases, and possibly interacting
|
||||
with internal systems on either side. All this can become quite involved. The implementation of payment channels in
|
||||
the \texttt{bitcoinj} library is approximately 9000 lines of Java, very little of which involves cryptography.
|
||||
|
||||
@ -348,8 +348,8 @@ bytecode-to-bytecode transformation occurs that rewrites the classes into a form
|
||||
machine. These state machines are sometimes called coroutines, and the transformation engine Corda uses (Quasar) is
|
||||
capable of rewriting code arbitrarily deep in the stack on the fly. The developer may thus break his or her logic
|
||||
into multiple methods and classes, use loops, and generally structure their program as if it were executing in a
|
||||
single blocking thread. There's only a small list of things they should not do: sleeping, directly accessing the
|
||||
network APIs, or doing other tasks that might block outside of the framework.
|
||||
single blocking thread. There's only a small list of things they should not do: sleeping, accessing the
|
||||
network outside of the framework, and blocking for long periods of time (upgrades require in-flight flows to finish).
|
||||
|
||||
\paragraph{Transparent checkpointing.}When a flow wishes to wait for a message from another party (or input from a
|
||||
human being) the underlying stack frames are suspended onto the heap, then crawled and serialized into the node's
|
||||
@ -373,9 +373,10 @@ hierarchical and steps can have sub-trackers for invoked sub-flows.
|
||||
\paragraph{Flow hospital.}Flows can pause if they throw exceptions or explicitly request human assistance. A flow
|
||||
that has stopped appears in the \emph{flow hospital} where the node's administrator may decide to kill the flow or
|
||||
provide it with a solution. Some flows that end up in the hospital will be retried automatically by the node
|
||||
itself, for example in case of database deadlocks that require a retry. The ability to request manual solutions is
|
||||
useful for cases where the other side isn't sure why you are contacting them, for example, the specified reason for
|
||||
sending a payment is not recognised, or when the asset used for a payment is not considered acceptable.
|
||||
itself, for example in case of database deadlocks that require a retry. Future versions of the framework may add
|
||||
the ability to request manual solutions, which would be useful for cases where the other side isn't sure why
|
||||
you are contacting them, for example if the specified reason for sending a payment is not recognised or
|
||||
when the asset used for a payment is not considered acceptable.
|
||||
|
||||
For performance reasons messages sent over flows are protected only with TLS. This means messages sent via flows
|
||||
are deniable unless explicitly signed by the application. Automatic signing and recording of flow contents may be
|
||||
@ -624,9 +625,9 @@ verify functions to use is the union of the contracts specified by each state, w
|
||||
combined with a \emph{constraint} (see~\cref{sec:contract-constraints}). Embedding the JVM specification in the
|
||||
Corda specification enables developers to write code in a variety of languages, use well developed toolchains, and
|
||||
to reuse code already authored in Java or other JVM compatible languages. A good example of this feature in action
|
||||
is the ability to embed the ISDA Common Domain Model directly into CorDapps. The CDM is a large collection of types
|
||||
mapped to Java classes that model derivatives trading in a standardised way. It is common for industry groups to
|
||||
define such domain models and for them to have a Java mapping.
|
||||
is the ability to embed the ISDA Common Domain Model\cite{ISDACDM} directly into CorDapps. The CDM is
|
||||
a large collection of types mapped to Java classes that model derivatives trading in a standardised way. It is
|
||||
common for industry groups to define such domain models and for them to have a Java mapping.
|
||||
|
||||
Current versions of the platform only execute attachments that have been previously installed (and thus
|
||||
whitelisted), or attachments that are signed by the same signer as a previously installed attachment. Thus nodes
|
||||
@ -708,7 +709,7 @@ transaction in which two file paths overlap between attachments is invalid. A sm
|
||||
expected to overlap normally, such as files in the \texttt{META-INF} directory, are excluded.
|
||||
|
||||
\paragraph{Package namespace ownership.} Corda allows parts of the Java package namespace to be reserved for
|
||||
particular developers, identified by a public key (which may or may not be an identity on the node's zone). Any JAR
|
||||
particular developers within a network, identified by a public key (which may or may not be linked to an identity). Any JAR
|
||||
that exports a class in an owned package namespace but which is not signed by the owning key is considered to be
|
||||
invalid. Reserving a package namespace is optional but can simplify the data model and make applications more
|
||||
secure.
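
The ownership rule described above can be sketched as a simple check. This is a minimal illustration in Python rather than the platform's actual implementation: the function names, the use of plain strings in place of public keys, and the first-match handling of claims are all assumptions made for the example.

```python
def owning_key(class_name, owned_packages):
    """Return the key that owns the package of class_name, or None if unowned.

    owned_packages maps a reserved package prefix to its owner's key.
    Strings stand in for real public keys (an assumption for this sketch).
    """
    package = class_name.rpartition(".")[0]
    for owned, key in owned_packages.items():
        # A claim covers the package itself and all of its sub-packages,
        # but not unrelated packages that merely share a textual prefix.
        if package == owned or package.startswith(owned + "."):
            return key
    return None


def jar_is_valid(exported_classes, signer_keys, owned_packages):
    """A JAR is invalid if it exports a class in an owned namespace
    without being signed by the owning key."""
    for class_name in exported_classes:
        key = owning_key(class_name, owned_packages)
        if key is not None and key not in signer_keys:
            return False
    return True


owned = {"com.example.finance": "KEY_A"}  # hypothetical claim
print(jar_is_valid(["com.example.finance.Cash"], {"KEY_A"}, owned))
print(jar_is_valid(["com.example.finance.Cash"], {"KEY_B"}, owned))
```

Classes outside any reserved namespace are unaffected by the check, which reflects the statement that reserving a namespace is optional.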
|
||||
@ -799,9 +800,9 @@ counterparty with the data elements that are needed along with the Merkle branch
|
||||
as seen in the diagrams below, that counterparty can sign the entire transaction whilst only being able to see some
|
||||
of it. Additionally, if the counterparty needs to be convinced that some third party has already signed the
|
||||
transaction, that is also straightforward. Typically an oracle will be presented with the Merkle branches for the
|
||||
command or state that contains the data, and the timestamp field, and nothing else. The resulting signature
|
||||
contains flag bits indicating which parts of the structure were presented for signing to avoid a single signature
|
||||
covering more than expected.
|
||||
command or state that contains the data, and the timestamp field, and nothing else. If an oracle also takes part
|
||||
in the ledger as a direct participant it should therefore derive a separate key for oracular usage, to avoid
|
||||
being tricked into blind-signing a transaction that might also affect its own states.
|
||||
|
||||
\begin{figure}[H]
|
||||
\includegraphics[width=\textwidth]{tearoffs1}
|
||||
@ -813,8 +814,6 @@ covering more than expected.
|
||||
\caption{Construction of a Merkle branch}
|
||||
\end{figure}
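
As a rough illustration of how a counterparty or oracle can check a revealed component against the transaction's root hash, the following Python sketch builds a Merkle branch and verifies it. It is not the reference implementation: the helper names are hypothetical, SHA-256 is assumed as the hash function, and odd-sized levels are padded by duplicating the last node, which may differ from the padding scheme the platform actually uses.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def next_level(level):
    # Pad odd-sized levels by duplicating the last node (an assumption),
    # then hash adjacent pairs to form the parent level.
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = next_level(level)
    return level[0]


def merkle_branch(leaves, index):
    """Return the sibling hashes needed to recompute the root from one leaf.

    Each entry is (sibling_hash, sibling_is_left).
    """
    level = [h(leaf) for leaf in leaves]
    branch = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        branch.append((level[sibling], sibling < index))
        level = next_level(level)
        index //= 2
    return branch


def verify_branch(leaf, branch, root):
    node = h(leaf)
    for sibling, sibling_is_left in branch:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Hypothetical transaction components; only the command is revealed.
components = [b"input", b"output", b"command: fx rate", b"timestamp"]
root = merkle_root(components)
branch = merkle_branch(components, 2)
print(verify_branch(components[2], branch, root))
```

The verifier sees only the revealed component and the sibling hashes in the branch, yet a signature over the root still covers the whole transaction.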
|
||||
|
||||
% TODO: The flag bits are unused in the current reference implementation.
|
||||
|
||||
There are several reasons to take this more indirect approach. One is to keep a single signature checking code
|
||||
path. By ensuring there is only one place in a transaction where signatures may be found, algorithmic agility and
|
||||
parallel/batch verification are easy to implement. When a signature may be found in any arbitrary location in a
|
||||
@ -1009,7 +1008,7 @@ each block contains a reward of newly issued bitcoins, an unrecognised block rep
|
||||
block typically represents a profit.
|
||||
|
||||
Bitcoin uses proof-of-work because it has a design goal of allowing an unlimited number of identityless parties to
|
||||
join and leave the network at will, whilst simultaneously making it hard to execute Sybil attacks (attacks in which
|
||||
join and leave the consensus forming process at will, whilst simultaneously making it hard to execute Sybil attacks (attacks in which
|
||||
one party creates multiple identities to gain undue influence over the network). This is an appropriate design to
|
||||
use for a peer to peer network formed of volunteers who can't/won't commit to any long term relationships up front,
|
||||
and in which identity verification is not done. Using proof-of-work then leads naturally to a requirement to
|
||||
@ -1095,31 +1094,6 @@ propagate far and the only entities who will learn their transaction hashes are
|
||||
select to keep the data from the notary. For liquid assets a validating notary should always be used to prevent
|
||||
value destruction and theft if the transaction identifiers leak.
|
||||
|
||||
\subsection{Merging networks}
|
||||
|
||||
Because there is no single block chain it becomes possible to merge two independent networks together by simply
|
||||
establishing two-way connectivity between their nodes then configuring each side to trust each other's notaries and
|
||||
certificate authorities.
|
||||
|
||||
This ability may seem pointless: isn't the goal of a decentralised ledger to have a single global database for
|
||||
everyone? It is, but a practical route to reaching this end state is still required. It is often the case that
|
||||
organisations perceived by consumers as being a single company are in fact many different entities cross-licensing
|
||||
branding, striking deals with each other and doing internal trades with each other. This sort of setup can occur
|
||||
for regulatory reasons, tax reasons, due to a history of mergers or just through a sheer masochistic love of
|
||||
paperwork. Very large companies can therefore experience all the same synchronisation problems a decentralised
|
||||
ledger is intended to fix but purely within the bounds of that organisation. In this situation the main problem to
|
||||
tackle is not malicious actors but rather heterogeneous IT departments, varying software development practices,
|
||||
unlinked user directories and so on. Such organisations can benefit from gaining experience with the technology
|
||||
internally and cleaning up their own internal views of the world before tackling the larger problem of
|
||||
synchronising with the wider world as well.
|
||||
|
||||
When merging networks, both sides must trust that each other's notaries have never signed double spends. When
|
||||
merging an organisation-private network into the global ledger it should be possible to simply rely on incentives
|
||||
to provide this guarantee: there is no point in a company double spending against itself. However, if more evidence
|
||||
is desired, a standalone notary could be run against a hardware security module with audit logging enabled. The
|
||||
notary itself would simply use a private database and run on a single machine, with the logs exported to the people
|
||||
running a global network for asynchronous post-hoc verification.
|
||||
|
||||
\subsection{Guaranteed data distribution}
|
||||
|
||||
In any global consensus system the user is faced with the question of whether they have the latest state of the
|
||||
@ -1159,7 +1133,7 @@ in the state to the notary cluster, which then stores it in the local databases
|
||||
cluster has committed the transaction, key identities are looked up and any which resolve successfully are sent
|
||||
copies of the transaction. In normal operation the notary is not provided with the certificates linking the random
|
||||
keys to the long term identity keys and thus does not know who is involved with the operation (assuming source IP
|
||||
address obfuscation is in use, see~\cref{sec:privacy}).
|
||||
address obfuscation is implemented, see~\cref{subsec:privacy-upgrades}).
|
||||
|
||||
\section{The vault}\label{sec:vault}
|
||||
|
||||
@ -1206,8 +1180,7 @@ features are therefore highly desirable for improving the productivity of app de
|
||||
\end{itemize}
|
||||
|
||||
Corda states are defined using a subset of the JVM bytecode language which includes annotations. The vault
|
||||
recognises annotations from the \emph{Java Persistence Architecture} (JPA) specification defined in JSR
|
||||
338\cite{JPA}. These annotations define how a class maps to a relational table schema including which member is the
|
||||
recognises annotations from the JPA specification defined in JSR 338\cite{JPA}. These annotations define how a class maps to a relational table schema including which member is the
|
||||
primary key, what SQL types to map the fields to and so on. When a transaction is submitted to the vault by a flow,
|
||||
the vault finds states it considers relevant (i.e. which contain a key owned by the node) and, if the relevant CorDapp
|
||||
has been installed into the node as a plugin, the states are fed through an object relational mapper which
|
||||
@ -1226,8 +1199,6 @@ features of their chosen database engine that they like. They can also create th
|
||||
views of the underlying data for end user applications, as long as they don't impose any constraints that would
|
||||
prevent the node from syncing the database with the actual contents of the ledger.
|
||||
|
||||
% TODO: Artemis stores message queues separately right now, although it does have a JDBC backend we don't use it.
|
||||
|
||||
States are arbitrary object graphs. Whilst nothing stops a state from containing multiple classes intended for
|
||||
different tables, it is typical that the relational representation will not be a direct translation of the
|
||||
object-graph representation. States are queried by the vault for the ORM mapped class to use, which will often skip
|
||||
@ -1431,58 +1402,6 @@ issuer to re-issue the asset onto the ledger with a new reference field. This op
|
||||
unlinks the new version of the asset from the old, meaning that nodes won't attempt to explore the original dependency
|
||||
graph during verification.
|
||||
|
||||
Corda has been designed with the future integration of additional privacy technologies in mind. Of all potential
|
||||
upgrades, three are particularly worth a mention.
|
||||
|
||||
\paragraph{Secure hardware.}Although we narrow the scope of data propagation to only nodes that need to see that
|
||||
data, `need' can still be an unintuitive concept in a decentralised database where often data is required only to
|
||||
perform security checks. We have successfully experimented with running contract verification inside a secure
|
||||
enclave protected JVM using Intel SGX\texttrademark{}, an implementation of the `trusted computing'
|
||||
concept\cite{mitchell2005trusted}. Secure hardware platforms allow computation to be performed in an undebuggable
|
||||
tamper-proof execution environment, for the software running inside that environment to derive encryption keys
|
||||
accessible only to that instance, and for the software to \emph{remotely attest} to a third party over the internet
|
||||
that it is indeed running in the secure state. By having nodes remotely attest to each other that they are running
|
||||
smart contract verification logic inside an enclave it becomes possible for the dependencies of a transaction to be
|
||||
transmitted to a peer encrypted under an enclave key, thus allowing them to verify the dependencies using software
|
||||
they have audited themselves, but without being able to see the data on which it operates.
|
||||
|
||||
Secure hardware opens up the potential for a one-shot privacy model that would dramatically simplify the task of
|
||||
writing smart contracts. However, it does still require the sensitive data to be sent to the peer who may then
|
||||
attempt to attack the hardware or exploit side channels to extract business intelligence from inside the encrypted
|
||||
container.
|
||||
|
||||
\paragraph{Mix networks.}Some nodes may be in the position of learning about transactions that aren't directly
|
||||
related to trades they are doing, for example notaries or regulator nodes. Even when key randomisation is used
|
||||
these nodes can still learn valuable identity information by simply examining the source IP addresses or the
|
||||
authentication certificates of the nodes sending the data for notarisation. The traditional cryptographic solution
|
||||
to this problem is a \emph{mix network}\cite{Chaum:1981:UEM:358549.358563}. The most famous mix network is Tor, but
|
||||
a more appropriate design for Corda would be that of an anonymous remailer. In a mix network a message is
|
||||
repeatedly encrypted in an onion-like fashion using keys owned by a small set of randomly selected nodes. Each
|
||||
layer in the onion contains the address of the next `hop'. Once the message is delivered to the first hop, it
|
||||
decrypts it to reveal the next encrypted layer and forwards it onwards. The return path operates in a similar
|
||||
fashion. Adding a mix network to the Corda protocol would allow users to opt-in to a privacy upgrade, at the cost
|
||||
of higher latencies and more exposure to failed network nodes.
|
||||
|
||||
\paragraph{Zero knowledge proofs.}The holy grail of privacy in decentralised database systems is the use of zero
|
||||
knowledge proofs to convince a peer that a transaction is valid, without revealing the contents of the transaction
|
||||
to them. Although these techniques are not yet practical for execution of general purpose smart contracts, enormous
|
||||
progress has been made in recent years and we have designed our data model on the assumption that we will one day
|
||||
wish to migrate to the use of \emph{zero knowledge succinct non-interactive arguments of knowledge}\cite{184425}
|
||||
(`zkSNARKs'). These algorithms allow for the calculation of a fixed-size mathematical proof that a program was
|
||||
correctly executed with a mix of public and private inputs. Programs can be expressed either directly as a system
|
||||
of low-degree multivariate polynomials encoding an algebraic constraint system, or by execution on a simple
|
||||
simulated CPU (`vnTinyRAM') which is itself implemented as a large pre-computed set of constraints. Because the
|
||||
program is shared the combination of an agreed upon function (i.e. a smart contract) along with private input data
|
||||
is sufficient to verify correctness, as long as the prover's program may recursively verify other proofs, i.e. the
|
||||
proofs of the input transactions. The BCTV zkSNARK algorithms rely on recursive proof composition for the execution
|
||||
of vnTinyRAM opcodes, so this is not a problem. The most obvious integration with Corda would require tightly
|
||||
written assembly language versions of common smart contracts (e.g. cash) to be written by hand and aligned with the
|
||||
JVM versions. Less obvious but more powerful integrations would involve the addition of a vnTinyRAM backend to an
|
||||
ahead of time JVM bytecode compiler, such as Graal\cite{Graal}, or a direct translation of Graal's graph based
|
||||
intermediate representation into systems of constraints. Direct translation of an SSA-form compiler IR to
|
||||
constraints would be best integrated with recent research into `scalable probabilistically checkable
|
||||
proofs'\cite{cryptoeprint:2016:646}, and is an open research problem.
|
||||
|
||||
\section{Future work}
|
||||
|
||||
Corda has a long term roadmap with many planned extensions. In this section we explore a variety of planned upgrades
|
||||
@ -1762,6 +1681,86 @@ such a requirement.
|
||||
|
||||
% TODO: Nothing related to data distribution groups is implemented.
|
||||
|
||||
\subsection{Merging networks}
|
||||
|
||||
Because there is no single block chain, it is theoretically possible to merge two independent networks together by simply
|
||||
establishing two-way connectivity between their nodes then configuring each side to trust each other's network operators
|
||||
(and by extension their network parameters, certificate authorities and so on).
|
||||
|
||||
This ability may seem pointless: isn't the goal of a decentralised ledger to have a single global database for
|
||||
everyone? It is, but a practical route to reaching this end state is still required. It is often the case that
|
||||
organisations perceived by consumers as being a single company are in fact many different entities cross-licensing
|
||||
branding, striking deals with each other and doing internal trades with each other. This sort of setup can occur
|
||||
for regulatory reasons, tax reasons, due to a history of mergers or just through a sheer masochistic love of
|
||||
paperwork. Very large companies can therefore experience all the same synchronisation problems a decentralised
|
||||
ledger is intended to fix but purely within the bounds of that organisation. In this situation the main problem to
|
||||
tackle is not malicious actors but rather heterogeneous IT departments, varying software development practices,
|
||||
unlinked user directories and so on. Such organisations can benefit from gaining experience with the technology
|
||||
internally and cleaning up their own internal views of the world before tackling the larger problem of
|
||||
synchronising with the wider world as well.
|
||||
|
||||
When merging networks, both sides must trust that each other's notaries have never signed double spends. When
|
||||
merging an organisation-private network into the global ledger it should be possible to simply rely on incentives
|
||||
to provide this guarantee: there is no point in a company double spending against itself. However, if more evidence
|
||||
is desired, a standalone notary could be run against a hardware security module with audit logging enabled. The
|
||||
notary itself would simply use a private database and run on a single machine, with the logs exported to the people
|
||||
running a global network for asynchronous post-hoc verification.
|
||||
|
||||
\subsection{Privacy upgrades}\label{subsec:privacy-upgrades}
|
||||
|
||||
Corda has been designed with the future integration of additional privacy technologies in mind. Of all potential
|
||||
upgrades, three are particularly worth a mention.
|
||||
|
||||
\paragraph{Secure hardware.}Although we narrow the scope of data propagation to only nodes that need to see that
|
||||
data, `need' can still be an unintuitive concept in a decentralised database where often data is required only to
|
||||
perform security checks. We have successfully experimented with running contract verification inside a secure
|
||||
enclave protected JVM using Intel SGX\texttrademark{}, an implementation of the `trusted computing'
|
||||
concept\cite{mitchell2005trusted}. Secure hardware platforms allow computation to be performed in an undebuggable
|
||||
tamper-proof execution environment, for the software running inside that environment to derive encryption keys
|
||||
accessible only to that instance, and for the software to \emph{remotely attest} to a third party over the internet
|
||||
that it is indeed running in the secure state. By having nodes remotely attest to each other that they are running
|
||||
smart contract verification logic inside an enclave it becomes possible for the dependencies of a transaction to be
|
||||
transmitted to a peer encrypted under an enclave key, thus allowing them to verify the dependencies using software
|
||||
they have audited themselves, but without being able to see the data on which it operates.
|
||||
|
||||
Secure hardware opens up the potential for a one-shot privacy model that would dramatically simplify the task of
|
||||
writing smart contracts. However, it does still require the sensitive data to be sent to the peer who may then
|
||||
attempt to attack the hardware or exploit side channels to extract business intelligence from inside the encrypted
|
||||
container.
|
||||
|
||||
\paragraph{Mix networks.}Some nodes may be in the position of learning about transactions that aren't directly
|
||||
related to trades they are doing, for example notaries or regulator nodes. Even when key randomisation is used
|
||||
these nodes can still learn valuable identity information by simply examining the source IP addresses or the
|
||||
authentication certificates of the nodes sending the data for notarisation. The traditional cryptographic solution
|
||||
to this problem is a \emph{mix network}\cite{Chaum:1981:UEM:358549.358563}. The most famous mix network is Tor, but
|
||||
a more appropriate design for Corda would be that of an anonymous remailer. In a mix network a message is
|
||||
repeatedly encrypted in an onion-like fashion using keys owned by a small set of randomly selected nodes. Each
|
||||
layer in the onion contains the address of the next `hop'. Once the message is delivered to the first hop, it
|
||||
decrypts it to reveal the next encrypted layer and forwards it onwards. The return path operates in a similar
|
||||
fashion. Adding a mix network to the Corda protocol would allow users to opt-in to a privacy upgrade, at the cost
|
||||
of higher latencies and more exposure to failed network nodes.
|
||||
|
||||
\paragraph{Zero knowledge proofs.}The holy grail of privacy in decentralised database systems is the use of zero
|
||||
knowledge proofs to convince a peer that a transaction is valid, without revealing the contents of the transaction
|
||||
to them. Although these techniques are not yet practical for execution of general purpose smart contracts, enormous
|
||||
progress has been made in recent years and we have designed our data model on the assumption that we will one day
|
||||
wish to migrate to the use of \emph{zero knowledge succinct non-interactive arguments of knowledge}\cite{184425}
|
||||
(`zkSNARKs'). These algorithms allow for the calculation of a fixed-size mathematical proof that a program was
|
||||
correctly executed with a mix of public and private inputs. Programs can be expressed either directly as a system
|
||||
of low-degree multivariate polynomials encoding an algebraic constraint system, or by execution on a simple
|
||||
simulated CPU (`vnTinyRAM') which is itself implemented as a large pre-computed set of constraints. Because the
|
||||
program is shared the combination of an agreed upon function (i.e. a smart contract) along with private input data
|
||||
is sufficient to verify correctness, as long as the prover's program may recursively verify other proofs, i.e. the
|
||||
proofs of the input transactions. The BCTV zkSNARK algorithms rely on recursive proof composition for the execution
|
||||
of vnTinyRAM opcodes, so this is not a problem. The most obvious integration with Corda would require tightly
|
||||
written assembly language versions of common smart contracts (e.g. cash) to be written by hand and aligned with the
|
||||
JVM versions. Less obvious but more powerful integrations would involve the addition of a vnTinyRAM backend to an
|
||||
ahead of time JVM bytecode compiler, such as Graal\cite{Graal}, or a direct translation of Graal's graph based
|
||||
intermediate representation into systems of constraints. Direct translation of an SSA-form compiler IR to
|
||||
constraints would be best integrated with recent research into `scalable probabilistically checkable
|
||||
proofs'\cite{cryptoeprint:2016:646}, and is an open research problem.
|
||||
|
||||
|
||||
\section{Conclusion}
|
||||
|
||||
We have presented Corda, a decentralised database designed for the financial sector. It allows for a unified data
|
||||
|