Mirror of https://github.com/corda/corda.git (synced 2025-02-21 09:51:57 +00:00)
Tech white paper: new sections on privacy, data distribution groups (aka clubs), notary involvement with data distribution.
Commit b19c6de69a (parent eff2f38949)
@ -276,9 +276,42 @@ publisher = {USENIX Association},
2014, Valencia, - Spain, September 28, 2014.},
pages = {7--16},
year = {2014},
crossref = {DBLP:conf/models/2014gemoc},
url = {http://ceur-ws.org/Vol-1236/paper-03.pdf},
timestamp = {Mon, 30 May 2016 16:28:38 +0200},
biburl = {http://dblp2.uni-trier.de/rec/bib/conf/models/VoelterL14},
bibsource = {dblp computer science bibliography, http://dblp.org}
}

@misc{FinneyAttack,
author = {Hal Finney},
title = {Best practice for fast transaction acceptance - how high is the risk?},
howpublished = {\url{https://bitcointalk.org/index.php?topic=3441.msg48384#msg48384}}
}

@article{Chaum:1981:UEM:358549.358563,
author = {Chaum, David L.},
title = {Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms},
journal = {Commun. ACM},
issue_date = {Feb. 1981},
volume = {24},
number = {2},
month = feb,
year = {1981},
issn = {0001-0782},
pages = {84--90},
numpages = {7},
url = {http://doi.acm.org/10.1145/358549.358563},
doi = {10.1145/358549.358563},
acmid = {358563},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {digital signatures, electronic mail, privacy, public key cryptosystems, security, traffic analysis},
}

@misc{cryptoeprint:2016:646,
author = {Eli Ben-Sasson and Iddo Ben-Tov and Alessandro Chiesa and Ariel Gabizon and Daniel Genkin and Matan Hamilis and Evgenya Pergament and Michael Riabzev and Mark Silberstein and Eran Tromer and Madars Virza},
title = {Computational integrity with a public random string from quasi-linear PCPs},
howpublished = {Cryptology ePrint Archive, Report 2016/646},
year = {2016},
note = {\url{http://eprint.iacr.org/2016/646}},
}
@ -6,7 +6,6 @@
\usepackage{amsfonts}
\usepackage{minted}
\usemintedstyle{vs}

\newminted{kotlin}{%
breakbytoken,%
breaklines,%
@ -27,11 +26,11 @@
\usepackage{textcomp}
\usepackage{scrextend}
\usepackage{cleveref}
\usepackage{csquotes}
\crefformat{section}{\S#2#1#3}
\addtokomafont{labelinglabel}{\sffamily}
%\usepackage[natbibapa]{apacite}
\renewcommand{\thefootnote}{\alph{footnote}}

%\epigraphfontsize{\small\itshape}
\setlength\epigraphwidth{4.5cm}
\setlength\epigraphrule{0pt}
@ -43,9 +42,9 @@

%\renewcommand{\abstractname}{An introduction}
\begin{center}
Version 0.3
Version 0.4

\emph{Confidential: For R3 DLG only - INCOMPLETE}
\emph{Confidential: For R3 DLG members only}
\end{center}

\vspace{10mm}
@ -199,6 +198,8 @@ More complex notions of identity that may attest to many time-varying attributes
system: the base identity is always just an X.500 name. Note that even though messaging is always identified, transactions
themselves may still contain anonymous public keys.

% TODO: Currently the node only lets you pick the CN and the rest of the X.500 name is dummy data.

\subsection{The network map}

Every network requires a network map service, which may itself be composed of multiple cooperating nodes. This is
@ -255,7 +256,7 @@ of its content. The purpose of the receipts is to give a node undeniable evidenc
notification that would stand up later in a dispute mediation process. Corda does not attempt to support deniable
messaging.

\section{Flow framework}
\section{Flow framework}\label{sec:flows}

It is common in decentralised ledger systems for complex multi-party protocols to be needed. The Bitcoin payment channel
protocol\cite{PaymentChannels} involves two parties putting money into a multi-signature pot, then iterating with your
@ -391,10 +392,10 @@ have occurred. This is discussed in more detail below.
is useful for secure signing devices (see \cref{sec:secure-signing-devices}).
\end{labeling}

% TODO: Update this one transaction types are separated.
% TODO: Update this once transaction types are separated.
% TODO: This description ignores the participants field in states, because it probably needs a rethink.
% TODO: Specify the curve used here once we decide how much we care about BIP32 public derivation.
% TODO: Messages aren't implemented.
% TODO: Specify the elliptic curve used here once we finalise our choice.
% TODO: Summaries aren't implemented.

Signatures are appended to the end of a transaction and transactions are identified by the hash used for signing, so
signature malleability is not a problem. There is never a need to identify a transaction including its accompanying
@ -560,7 +561,7 @@ lag between the ledger becoming inaccurate and it catching up with reality. In t
can be used in which the involved parties minus the uncooperative party agree to mark the relevant states as
no longer consumed/spent. This is essentially a limited form of database rollback.

\subsection{Identity lookups}
\subsection{Identity lookups}\label{sec:identity-lookups}

In all block chain inspired systems there exists a tension between wanting to know who you are dealing with and
not wanting others to know. A standard technique is to use randomised public keys in the shared data, and keep
@ -886,7 +887,6 @@ to, such as file IO or external entropy.
\item Sets the \texttt{strictfp} flag on all methods, which requires the JVM to do floating point arithmetic in a hardware
independent fashion. Whilst we anticipate that floating point arithmetic is unlikely to feature in most smart contracts
(big integer and big decimal libraries are available), it is available for those who want to use it.
% TODO: The sandbox code doesn't flip the strictfp flag yet.
\item Forbids \texttt{invokedynamic} bytecode except in special cases, as the libraries that support this functionality have
historically had security problems and it is primarily needed only by scripting languages. The specific
lambda and string concatenation metafactories used by Java code itself are allowed.
@ -939,6 +939,12 @@ ensure protocol compliance a higher performance algorithm like RAFT may be used.
a single network may provide a single global BFT notary for general use and region-specific RAFT notaries for low
latency trading within a unified regulatory area, for example London or New York.

Notaries accept transactions submitted to them for processing and either return a signature over the transaction, or
a rejection error that states that a double spend has occurred. The presence of a notary signature from the state's
chosen notary indicates transaction finality. An app developer triggers notarisation by invoking the
\texttt{Finality} flow on the transaction once all other necessary signatures have been gathered. Once the finality flow
returns successfully, the transaction can be considered committed to the database.
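
The sign-or-reject behaviour of a notary can be illustrated with a minimal sketch. It is not Corda's notary
implementation: \texttt{StateRef}, \texttt{NotaryResult} and \texttt{ToyNotary} are hypothetical names, a single
in-memory map stands in for the replicated commit log, and the returned `signature' is a placeholder string rather
than a real digital signature.

\begin{kotlincode}
// Hypothetical types for illustration only; none of these are Corda APIs.
data class StateRef(val txId: String, val index: Int)

sealed class NotaryResult {
    data class Signed(val txId: String, val placeholderSignature: String) : NotaryResult()
    data class DoubleSpend(val conflicts: Map<StateRef, String>) : NotaryResult()
}

class ToyNotary(private val name: String) {
    // Maps each consumed input state to the ID of the transaction that consumed it.
    private val consumed = mutableMapOf<StateRef, String>()

    fun notarise(txId: String, inputs: List<StateRef>): NotaryResult {
        // An input conflicts if a *different* transaction has already consumed it.
        val conflicts = inputs
            .filter { it in consumed && consumed[it] != txId }
            .associateWith { consumed.getValue(it) }
        if (conflicts.isNotEmpty()) return NotaryResult.DoubleSpend(conflicts)
        // Record all inputs as consumed; resubmitting the same transaction is harmless.
        inputs.forEach { consumed[it] = txId }
        return NotaryResult.Signed(txId, "signed-by-$name")
    }
}

fun main() {
    val notary = ToyNotary("notary-A")
    val cash = StateRef("issue-1", 0)
    println(notary.notarise("pay-bob", listOf(cash)))   // a Signed result
    println(notary.notarise("pay-carol", listOf(cash))) // a DoubleSpend result
}
\end{kotlincode}

In this sketch the second submission is rejected because it tries to consume a state that the first transaction has
already consumed, which is the condition the rejection error reports.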

\subsection{Comparison to Nakamoto block chains}

Bitcoin organises the timeline into a chain of blocks, with each block pointing to a previous block the miner has chosen
@ -1058,6 +1064,47 @@ standalone notary could be run against a hardware security module with audit log
use a private database and run on a single machine, with the logs exported to the people running a global network for
asynchronous post-hoc verification.

\subsection{Guaranteed data distribution}

In any global consensus system the user is faced with the question of whether they have the latest state of the database.
Programmers working with block chains often make the simplifying assumption that, because there is no formal map
of miner locations and transactions are therefore distributed to miners via broadcast, they can listen to the
stream of broadcasts and learn whether they have the latest data. Alas, nothing stops someone privately providing a
miner who has a known location with a transaction that they agree not to broadcast. The first time the rest of
the network finds out about this transaction is when a block containing it is broadcast. When used for double
spending fraud, this type of attack is known as a Finney attack\cite{FinneyAttack}. Proof-of-work based systems
rely on aligned incentives to discourage such attacks: to quote the Bitcoin white paper, \blockquote{He ought to
find it more profitable to play by the rules ... than to undermine the system and the validity of his own wealth.}
In practice this approach appears to work well enough most of the time, given that miners typically do not accept
privately submitted transactions.

In a system without global broadcast things are very different: the notary clusters \emph{must} accept transactions
directly and there is no mechanism to ensure that everyone sees that the transaction is occurring. Sometimes this
doesn't matter: most transactions are irrelevant to you and having to download them just wastes resources. But
occasionally you do wish to become aware that the ledger state has been changed by someone else. A simple example
is an option contract in which you wish to expire the option unless the counterparty has already exercised it. The
buyer exercising the option must not require the seller to sign off on it, as it may be advantageous for the seller to
refuse if it would cause them to lose money. Whilst the seller would discover that the buyer had exercised the option
when they attempted to expire it, due to the notary informing them that their expiry transaction was a double spend, it is
preferable to find out immediately.

The obvious way to implement this is to give notaries the responsibility for ensuring all interested parties find out
about a transaction. However, this would require the notaries to know who the involved parties actually are, which
would create an undesirable privacy leak. It would also place extra network load on the notaries, who would frequently
be sending transaction data to parties that may already have it, or may simply not care. In many cases there may be
no requirement for the notary to act as a trusted third party for data distribution purposes, as game-theoretic
assumptions or legal assurances are sufficiently strong that peers can be trusted to deliver transaction data as part
of their regular flows.

To solve this, app developers can choose whether to request transaction distribution by the notary or not. This works
by simply piggybacking on the standard identity lookup flows (see \cref{sec:identity-lookups}). If a node wishes to be
informed by the notary when a state is consumed, it can send the certificates linking the random keys in the state
to the notary cluster, which then stores them in its local databases as usual. Once the notary cluster has committed
the transaction, key identities are looked up and any which resolve successfully are sent copies of the transaction. In
normal operation the notary is not provided with the certificates linking the random keys to the long term identity keys
and thus does not know who is involved with the operation (assuming source IP address obfuscation is in use, see
\cref{sec:privacy}).
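
The opt-in lookup performed by the notary after commit can be sketched as follows. This is only an illustration under
simplifying assumptions: \texttt{DistributingNotary}, \texttt{registerLinkage} and \texttt{recipientsFor} are
hypothetical names, and plain strings stand in for public keys, linkage certificates and node identities.

\begin{kotlincode}
// Hypothetical sketch: strings stand in for public keys, certificates and node identities.
class DistributingNotary {
    // Populated only by nodes that chose to reveal a key-to-identity linkage.
    private val knownIdentities = mutableMapOf<String, String>()

    fun registerLinkage(anonymousKey: String, nodeIdentity: String) {
        knownIdentities[anonymousKey] = nodeIdentity
    }

    // Called after a transaction has been committed: returns the nodes to notify.
    // Keys with no registered linkage are skipped, so those parties stay anonymous.
    fun recipientsFor(keysInConsumedStates: List<String>): Set<String> =
        keysInConsumedStates.mapNotNull { knownIdentities[it] }.toSet()
}

fun main() {
    val notary = DistributingNotary()
    // The option seller opts in to notifications; the buyer does not.
    notary.registerLinkage("key-seller-7f3a", "O=Seller Bank,L=London,C=GB")
    val notified = notary.recipientsFor(listOf("key-seller-7f3a", "key-buyer-91c2"))
    println(notified) // only the seller's identity is listed
}
\end{kotlincode}

Because unresolved keys are simply dropped, a notary in this sketch learns only about the parties that explicitly asked
to be informed.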

\section{The vault}\label{sec:vault}

In any blockchain based system most nodes have a wallet, or as we call it, a vault.
@ -1135,23 +1182,60 @@ annotated in other ways, for instance to customise its mapping to XML/JSON, or t
\cite{BeanValidation}. These annotations won't affect the behaviour of the node directly but may be useful when working
with states in surrounding software.

%\section{Integration with market infrastructure}
%
%Trade is the lifeblood of the economy. A distributed ledger needs to provide a vibrant platform on which trading may
%take place. However, the decentralised nature of such a network makes it difficult to build competitive
%market infrastructure on top of it, especially for highly liquid assets like securities. Markets typically provide
%features like a low latency orderbook, integrated regulatory compliance, price feeds and other things that benefit
%from a central meeting point.
%
%The Corda data model allows for integration of the ledger with existing markets and exchanges. A sell order for
%an asset that exists on-ledger can have a \emph{partially signed transaction} attached to it. A partial
%signature ... % TODO
\subsection{Key randomisation}\label{sec:key-randomisation}

% In many markets, central infrastructures such as clearing houses (also known as Central Counterparties, or CCPs)
% and Central Securities Depositories (CSD) have been created. They provide governance, rules definition and
% enforcement, risk management and shared data and processing services. The partial data visibility, flexible
% transaction verification logic and pluggable notary design means Corda could be a particularly good fit for
% future distributed ledger services contemplated by CCPs and CSDs.
A standard privacy technique in block chain systems is the use of randomised unlinkable public keys to stand in for
actual verified identities. Ownership of these pseudonyms may be revealed to a counterparty using a simple interactive
protocol in which Alice selects a random nonce (`number used once') and sends it to Bob, who then signs the nonce with
the private key corresponding to the public key he is proving ownership of.

Generating fresh keys for each new deal or asset transfer rapidly results in many private keys being created. These
keys must all be backed up and kept safe, which poses a significant management problem when done at scale. The canonical
way to resolve this problem is through the use of deterministic key derivation, as pioneered by the Bitcoin community in
BIP 32 `Hierarchical Deterministic Wallets'\cite{BIP32}. Deterministic key derivation allows all private key
material needed to be derived from a single, small pool of entropy (e.g. a carefully protected and backed up 128 bits of
random data). More importantly, when the full BIP 32 technique is used in combination with an elliptic curve that supports
it, public keys may also be deterministically derived \emph{without} access to the underlying private key material. This
allows devices to provide fresh public keys to counterparties without being able to sign with those keys, enabling
better security along with operational efficiencies.

Corda does not place any constraints on the mathematical properties of the digital signature algorithms parties use.
However, it is recommended that implementations use hierarchical deterministic key derivation when possible.
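
The interactive ownership proof described at the start of this subsection can be sketched with the standard JDK
cryptography classes. This is a minimal sketch under stated assumptions: the paper does not fix a signature scheme, so
the choice of ECDSA over the JDK's default 256-bit curve is ours, and \texttt{newNonce}, \texttt{proveOwnership} and
\texttt{verifyOwnership} are hypothetical helper names rather than Corda APIs.

\begin{kotlincode}
import java.security.KeyPair
import java.security.KeyPairGenerator
import java.security.PublicKey
import java.security.SecureRandom
import java.security.Signature

// Alice's side: a fresh random challenge.
fun newNonce(): ByteArray = ByteArray(32).also { SecureRandom().nextBytes(it) }

// Bob's side: sign the challenge with the private key behind the pseudonym.
fun proveOwnership(pseudonym: KeyPair, nonce: ByteArray): ByteArray =
    Signature.getInstance("SHA256withECDSA").run {
        initSign(pseudonym.private)
        update(nonce)
        sign()
    }

// Alice's side: check the response against the public key Bob claims to own.
fun verifyOwnership(claimedKey: PublicKey, nonce: ByteArray, response: ByteArray): Boolean =
    Signature.getInstance("SHA256withECDSA").run {
        initVerify(claimedKey)
        update(nonce)
        verify(response)
    }

fun main() {
    val generator = KeyPairGenerator.getInstance("EC").apply { initialize(256) }
    val bobPseudonym = generator.generateKeyPair()
    val nonce = newNonce()
    val response = proveOwnership(bobPseudonym, nonce)
    // The response verifies against Bob's pseudonymous key...
    println(verifyOwnership(bobPseudonym.public, nonce, response))
    // ...but not against an unrelated key.
    println(verifyOwnership(generator.generateKeyPair().public, nonce, response))
}
\end{kotlincode}

Because Alice picks a fresh nonce for every challenge, a response recorded from an earlier exchange cannot be replayed
to claim ownership later.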

\section{Integration with market infrastructure}

Trade is the lifeblood of the economy. A distributed ledger needs to provide a vibrant platform on which trading may
take place. However, the decentralised nature of such a network makes it difficult to build competitive
market infrastructure on top of it, especially for highly liquid assets like securities. Markets typically provide
features like a low latency order book, integrated regulatory compliance, price feeds and other things that benefit
from a central meeting point.

The Corda data model allows for integration of the ledger with existing markets and exchanges. A sell order for
an asset that exists on-ledger can have a \emph{partially signed transaction} attached to it. A partial
signature is a signature that allows the signed data to be changed in controlled ways after signing. Partial signatures
are directly equivalent to Bitcoin's \texttt{SIGHASH} flags and work in the same way: signatures contain metadata
describing which parts of the transaction are covered. Normally all of a transaction would be covered, but using this
metadata it is possible to create a signature that only covers some inputs and outputs, whilst allowing more to be
added later.

This feature is intended for integration of the ledger with the order books of markets and exchanges. Consider a stock
exchange. A buy order can be submitted along with a partially signed transaction that signs a cash input state
and an output state representing some quantity of the stock owned by the buyer. By itself this transaction is invalid,
as the cash does not appear in the outputs list and there is no input for the stock. A sell order can be combined with
a mirror-image partially signed transaction that has a stock state as the input and a cash state as the output. When
the two orders cross on the order book, the exchange itself can take the two partially signed transactions and merge
them together, creating a valid transaction that it then notarises and distributes to both buyer and seller. In this
way trading and settlement become atomic, with the ownership of assets on the ledger being synchronised with the view
of market participants. Note that in this design the distributed ledger itself is \emph{not} a marketplace, and does
not handle distribution or matching of orders. Rather, it focuses on management of the pre- and post-trade lifecycles.
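
The stock exchange example can be made concrete with a small model. This is a hedged sketch rather than a description
of any implemented feature: \texttt{State}, \texttt{PartialTx}, \texttt{MergedTx}, \texttt{balances} and
\texttt{merge} are hypothetical names, signatures are reduced to the name of the signer, and the validity rule (each
asset's consumed quantity equals its created quantity) is a deliberately simplified stand-in for real contract
verification.

\begin{kotlincode}
// Hypothetical model: a "partial transaction" is the subset of inputs and outputs
// that one party's signature covers.
data class State(val asset: String, val owner: String, val quantity: Long)
data class PartialTx(val signer: String, val inputs: List<State>, val outputs: List<State>)
data class MergedTx(val signers: Set<String>, val inputs: List<State>, val outputs: List<State>)

// Toy validity rule: for every asset, the quantity consumed equals the quantity created.
fun balances(inputs: List<State>, outputs: List<State>): Boolean {
    fun totals(states: List<State>) =
        states.groupBy { it.asset }.mapValues { (_, s) -> s.sumOf { it.quantity } }
    return totals(inputs) == totals(outputs)
}

// The exchange concatenates both halves; each signature still covers its own components.
fun merge(buy: PartialTx, sell: PartialTx): MergedTx? {
    val inputs = buy.inputs + sell.inputs
    val outputs = buy.outputs + sell.outputs
    return if (balances(inputs, outputs))
        MergedTx(setOf(buy.signer, sell.signer), inputs, outputs)
    else null
}

fun main() {
    val buy = PartialTx("buyer",
        inputs = listOf(State("GBP", "buyer", 1000)),
        outputs = listOf(State("ACME", "buyer", 10)))
    val sell = PartialTx("seller",
        inputs = listOf(State("ACME", "seller", 10)),
        outputs = listOf(State("GBP", "seller", 1000)))
    println(balances(buy.inputs, buy.outputs)) // false: the buy half is invalid on its own
    println(merge(buy, sell) != null)          // true: the crossed orders balance
}
\end{kotlincode}

Neither half is valid alone, but the merged transaction is, which is what makes trading and settlement atomic in this
design.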

\paragraph{Central counterparties.}In many markets, central infrastructures such as clearing houses (also known as
Central Counterparties, or CCPs) and Central Securities Depositories (CSDs) have been created. They provide governance,
rules definition and enforcement, risk management and shared data and processing services. The partial data visibility,
flexible transaction verification logic and pluggable notary design mean that Corda could be a particularly good fit for
future distributed ledger services contemplated by CCPs and CSDs.

% TODO: Partial signatures are not implemented.

\section{Domain specific languages}

@ -1242,6 +1326,15 @@ of smart contracts. A good example of this is the Whiley language by Dr David Pe
checks program-integrated proofs at compile time. By building on industry-standard platforms, we gain access to
cutting edge research from the computer science community outside of the distributed systems world.

\subsection{Projectional editing}

Custom languages and type systems for the expression of contract logic can be naturally combined with \emph{projectional
editing}, in which source code is not edited textually but rather through a structure-aware
editor\cite{DBLP:conf/models/VoelterL14}. Such languages can consist not only of traditional grammar-driven text
oriented structures but also diagrams, tables and recursive compositions of these. Given the frequent occurrence
of data tables and the English-oriented nature of many financial contracts, a dedicated environment for the construction of
smart contract logic may be appreciated by the users.

\section{Secure signing devices}\label{sec:secure-signing-devices}

\subsection{Background}
@ -1400,18 +1493,121 @@ are ideal for the task.
Being able to connect live data structures directly to UI toolkits also contributes to the avoidance
of XSS exploits, XSRF exploits and similar security problems based on losing track of buffer boundaries.

\section{Privacy}

TODO

\section{Data distribution groups}

TODO
By default, distribution of transaction data is defined by app-provided flows (see \cref{sec:flows}). Flows specify
when and to which peers transactions should be sent. Typically these destinations will be calculated from the content
of the states and the available identity lookup certificates, because financial data (the intended use case) usually
contains the identities of the relevant parties. Sometimes though, the set of parties that should receive
data isn't known ahead of time and may change after a transaction has been created. For these cases partial data
visibility is not a good fit and an alternative mechanism is needed.

\section{Future work}
A data distribution group (DDG) is created by generating a keypair and a self-signed certificate for it. Groups are
identified internally by their public key and may be given string names in the certificate, but nothing in the
software assumes the name is unique: it's intended only for human consumption and it may conflict with other independent
groups. In case of conflict user interfaces disambiguate by appending a few characters of the base58 encoded public key
to the name like so: `My popular group name (a4T)'. As groups are not globally visible anyway, it is unlikely that
conflicts will be common or require many code letters to deconflict, and some groups may not even be intended for
human consumption at all.

Although intended to be a production-ready platform for building decentralised financial databases, there are
multiple areas of research remaining to be explored.
Once a group is created other nodes can be invited to join it by using an invitation flow. Membership can be either
read-only or read/write. To add a node as read-only, the certificate (i.e. the public key) alone is sent. To add a node as
read/write, the certificate and private key are sent. A future elaboration on the design may support giving each member a
separate private key which would allow tracing who added transactions to a group, but this is left for future work.
In either case the node records in its local database which other nodes it has invited to the group once they accept
the invitation.

When the invite is received the target node runs the other side of the flow as normal, which may either automatically
accept membership if it's configured to trust the inviting node, or send a message to a message queue for processing by an
external system, or kick it up to a human administrator for approval. Invites to groups the node is already a
member of are rejected. The accepting node also records which node invited it. So, there ends up being a two-way
recorded relationship between inviter and invitee stored in their vaults. Finally the inviter side of the
invitation flow pushes a list of all the transaction IDs that exist in the group and the invitee side resolves all of
them. The end result is that all the transactions that are in the group are sent to the new node (along with all
dependencies).

Note that this initial download is potentially infinite if transactions are added to the group as fast as or faster than
the new node is downloading and checking them. Thus whilst it may be tempting to try to expose a notion of `doneness' to
the act of joining a group, it's better to see the act of joining as happening at a specific point in time and the
resultant flood of transaction data as an ongoing stream, rather than being like a traditional file download.

When a transaction is sent to the vault, it always undergoes a relevancy test, regardless of whether it is in a group
or not (see \cref{sec:vault}). This test is extended to check also for the
signatures of any groups the node is a member of. If there's a match then the transaction's states are all considered
relevant. In addition, the vault looks up which nodes it invited to this group, and also which nodes invited it, removes
any nodes that have recently sent it this transaction and then kicks off a \texttt{PropagateTransactionToGroup} flow
with each of them. The other side of this flow checks if the transaction is already known, if not requests it, checks
that it is indeed signed by the group in question, resolves it and then, assuming success, sends it to the vault. In this
way a transaction added by any member of the group propagates up and down the membership tree until all the members have
seen it. Propagation is idempotent: if the vault has already seen a transaction before then it isn't processed again.
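
The idempotent propagation rule can be simulated in a few lines. This is an illustrative sketch only, since nothing
related to data distribution groups is implemented: \texttt{GroupNode}, \texttt{receive} and \texttt{link} are
hypothetical names, direct method calls stand in for the \texttt{PropagateTransactionToGroup} flow, and signature
checking and dependency resolution are omitted.

\begin{kotlincode}
// Hypothetical simulation of propagation along invitation links; not a Corda API.
class GroupNode(val name: String) {
    val links = mutableSetOf<GroupNode>()   // nodes we invited plus the node that invited us
    val seen = mutableSetOf<String>()       // transaction IDs already processed by the vault
    var deliveries = 0                      // how many transactions were actually processed

    fun receive(txId: String, from: GroupNode?) {
        if (!seen.add(txId)) return          // idempotent: already known, do nothing
        deliveries++
        // Forward to every linked node except the one that just sent it to us.
        links.filter { it !== from }.forEach { it.receive(txId, this) }
    }
}

fun link(inviter: GroupNode, invitee: GroupNode) {
    inviter.links += invitee
    invitee.links += inviter
}

fun main() {
    val (a, b, c, d) = listOf("A", "B", "C", "D").map(::GroupNode)
    link(a, b); link(a, c); link(c, d)   // membership tree rooted at A
    link(b, d)                           // an extra link, so the graph now contains a cycle
    d.receive("tx-1", from = null)       // D adds a transaction to the group
    println(listOf(a, b, c, d).all { "tx-1" in it.seen })   // true
    println(listOf(a, b, c, d).sumOf { it.deliveries })     // 4
}
\end{kotlincode}

Every node processes the transaction exactly once even though the extra link creates a cycle, because a node that has
already seen a transaction ignores further copies.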

The structure we have so far has some advantages and one big disadvantage. The advantages are:

\begin{itemize}
\item [Simplicity] The core data model is unchanged. Access control is handled using existing tools like signatures, certificates and flows.
\item [Privacy] It is possible to join a group without the other members being aware that you have done so. It is possible to create groups without non-members knowing the group exists.
\item [Scalability] Groups are not registered in any central directory. A group that exists between four parties imposes costs only on those four.
\item [Performance] Groups can be created as fast as you can generate keypairs and invite other nodes to join you.
\item [Responsibility] For every member of the group there is always a node that has a responsibility for sending you
new data under the protocol (the inviting node). Unlike with Kademlia style distributed hash tables, or Bitcoin style
global broadcast, you can never find yourself in a position where you didn't receive data yet nobody has violated the
protocol. There are no points at which you pick a random selection of nodes and politely ask them to do something for
you, hoping that they'll choose to stick around.
\end{itemize}

The big disadvantage is that it's brittle. If you have a membership tree and a node goes offline for a while,
then propagation of data will split and back up in the outbound queues of the parents and children of the offline
node until it comes back.

To strengthen groups we can add a new feature, membership broadcasts. Members of the group that have write access may
choose to sign a membership announcement and propagate it through the tree. These announcements are recorded in the
local database of each node in the group. Nodes may include these announced members when sending newly added
transactions. This converts the membership tree to a graph that may contain cycles, but infinite propagation loops are
not possible because nodes ignore announcements of new transactions/attachments they've already received. Whether a group
prefers privacy or availability may be hinted in the certificate that defines it: if availability is preferred, this is
a signal that members should always announce themselves (which would lead to a mesh).

The network map for a network defines the event horizon, the span of time that is allowed to elapse before an offline
node is considered to be permanently gone. Once a peer has been offline for longer than the event horizon any nodes that
invited it remove it from their local tables. If a node was invited to a group by a gone peer and there are no other
nodes that announced their membership it can use, the node should post a message to a queue and/or notify the
administrator, as it's now effectively been evicted from the group.

The resulting arrangement may appear similar to a gossip network. However the underlying membership tree structure
remains. Thus when all nodes are online (or online enough) messages are guaranteed to propagate to everyone in the
network. You can't get situations where a part of the group has become split from the rest without anyone being aware of
that fact; an unlikely but possible occurrence in a gossip network. It also isn't like a distributed hash table where
data isn't fully replicated, so we avoid situations where data has been added to the group but stops being available due
to node outages. It is always possible to reason about the behaviour of the network and always possible to assign
responsibility if something goes wrong.

Note that it is not possible to remove members after they have been added to a group. We could provide a remove
announcement but it'd be advisory only: nothing stops nodes from ignoring it. It is also not possible to enumerate
members of a group because there is no requirement to do a membership broadcast when you join and no way to enforce such
a requirement.

% TODO: Nothing related to data distribution groups is implemented.

\section{Privacy}\label{sec:privacy}

Privacy is not a standalone feature in the way that many other aspects described in this paper are, so this section
summarises features described elsewhere. Corda exploits multiple techniques to improve user privacy over other
distributed ledger systems:

\paragraph{Partial data visibility.}Transactions are not globally broadcast as in many other systems.
\paragraph{Transaction tear-offs.}Transactions are structured as Merkle trees, and individual subcomponents may be
revealed to parties who already know the Merkle root hash. Additionally, those parties may sign the transaction without
being able to see all of it. See \cref{sec:tear-offs}.
\paragraph{Key randomisation.}The vault generates and uses random keys that are unlinkable to an identity without the
corresponding linkage certificate. See \cref{sec:vault}.
\paragraph{Graph pruning.}Large transaction graphs that involve liquid assets can be `pruned' by requesting the asset
issuer to re-issue the asset onto the ledger with a new reference field. This operation is not atomic, but effectively
unlinks the new version of the asset from the old, meaning that nodes won't attempt to explore the original dependency
graph during verification.

Corda has been designed with the future integration of additional privacy technologies in mind. Of all potential
upgrades, three are particularly worth a mention.

\paragraph{Secure hardware.}Although we narrow the scope of data propagation to only nodes that need to see that
data, `need' can still be an unintuitive concept in a decentralised database where often data is required only
@ -1431,28 +1627,36 @@ of writing smart contracts. However, it does still require the sensitive data to
who may then attempt to attack the hardware or exploit side channels to extract business intelligence from
inside the encrypted container.

\paragraph{Mix networks.}Some nodes may be in a position to learn about transactions that aren't directly related
to trades they are doing, for example notaries or regulator nodes. Even when key randomisation is used, these nodes can
still learn valuable identity information by simply examining the source IP addresses or the authentication certificates
of the nodes sending the data for notarisation. The traditional cryptographic solution to this problem is a
\emph{mix network}\cite{Chaum:1981:UEM:358549.358563}. The most famous mix network is Tor, but a more appropriate design
for Corda would be that of an anonymous remailer. In a mix network a message is repeatedly encrypted in an onion-like
fashion using keys owned by a small set of randomly selected nodes. Each layer in the onion contains the address of the
next `hop'. Once the message is delivered to the first hop, that node decrypts it to reveal the next encrypted layer and
forwards it onwards. The return path operates in a similar fashion. Adding a mix network to the Corda protocol
would allow users to opt in to a privacy upgrade, at the cost of higher latencies and more exposure to failed network
nodes.
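
The onion layering used by a mix network can be sketched for a hypothetical three-hop route. Here $E_k$ denotes
public-key encryption to the hop key $k$, $a_i$ is the address of hop $i$, $d$ is the final destination and $m$ is the
payload; the hop count and the notation are assumptions made purely for illustration:
\[
m_3 = E_{k_3}(d \| m), \quad m_2 = E_{k_2}(a_3 \| m_3), \quad m_1 = E_{k_1}(a_2 \| m_2).
\]
The sender transmits $m_1$ to the first hop, which decrypts it to obtain $a_2$ and $m_2$ and forwards $m_2$ onwards.
Provided the hops do not collude, each hop learns only its predecessor and its successor, never both the original
sender and the final destination.
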
\paragraph{Zero knowledge proofs.}The holy grail of privacy in decentralised database systems is the use of zero
knowledge proofs to convince a peer that a transaction is valid, without revealing the contents of the transaction to
them. Although these techniques are not yet practical for execution of general purpose smart contracts, enormous
progress has been made in recent years and we have designed our data model on the assumption that we will one day wish
to migrate to the use of \emph{zero knowledge succinct non-interactive arguments of knowledge}\cite{184425}
(`zkSNARKs'). These algorithms allow for the calculation of a fixed-size mathematical proof that a program was correctly
executed with a mix of public and private inputs. Programs can be expressed either directly as a system of low-degree
multivariate polynomials encoding an algebraic constraint system, or by execution on a simple simulated CPU
(`vnTinyRAM') which is itself implemented as a large pre-computed set of constraints. Because the program is shared,
the combination of an agreed upon function (i.e. a smart contract) along with private input data is sufficient to
verify correctness, as long as the prover's program may recursively verify other proofs, i.e. the proofs of the input
transactions. The BCTV zkSNARK algorithms rely on recursive proof composition for the execution of vnTinyRAM opcodes,
so this is not a problem. The most obvious integration with Corda would require tightly written assembly language
versions of common smart contracts (e.g. cash) to be written by hand and aligned with the JVM versions. Less obvious
but more powerful integrations would involve the addition of a vnTinyRAM backend to an ahead of time JVM bytecode
compiler, such as Graal\cite{Graal}, or a direct translation of Graal's graph based intermediate representation into
systems of constraints. Direct translation of an SSA-form compiler IR to constraints would be best integrated with
recent research into `scalable probabilistically checkable proofs'\cite{cryptoeprint:2016:646}, and is an open
research problem.

\paragraph{New domain specific languages.}Custom languages and type systems for the expression
of contract logic can be naturally combined with \emph{projectional editing}, in which source code is not edited
textually but rather through a structure aware editor\cite{DBLP:conf/models/VoelterL14}. Such languages can consist not
only of traditional grammar-driven text oriented structures but also diagrams, tables and recursive compositions of
them together. Given the frequent occurrence of data tables and the English-oriented nature of many financial
contracts, a dedicated environment for the construction of smart contract logic may be appreciated by users.
Additionally, DSLs for contract development may choose to explore approaches that trade off ease of use to gain
correctness, for example, total languages, formally verifiable languages, a subset of Haskell or Idris etc.

\section{Conclusion}
@ -1470,9 +1674,8 @@ length-prefixed buffers throughout for the systematic avoidance of common buffer
ledger data relevant to them by issuing ordinary SQL queries against mature database engines, and may craft complex
multi-party transactions with ease in programming languages that are already familiar to them.

% TODO: Write a section on integration with market infrastructure.
Finally, the platform defines standard ways to integrate the global ledger with financial infrastructure like high
performance markets and netting services.

\section{Acknowledgements}