mirror of
https://github.com/corda/corda.git
synced 2025-01-18 10:46:38 +00:00
Delete the notes directory, it is long since obsoleted by the wiki and the docs site.
This commit is contained in:
parent
e3cfe0ae49
commit
d6cdc8b8de
@ -1,195 +0,0 @@
General design scratchpad

Do we need blocks at all? Blocks are an artifact of proof-of-work, which isn't acceptable on private block chains
due to the excessive energy usage, unclear incentives model and so on. They're also very useful for SPV operation,
but we have no such requirements here.

Possible alternative, blend of ideas from:

* Google Spanner
* Hawk
* Bitcoin
* Ethereum
* Intel/TCG

+ some of my own ideas

# Blockless operation

* A set of timestampers is set up around the world with clocks synchronised to GPS time (the most accurate clock
  available as it's constantly recalibrated against the US Naval Observatory atomic clock). Public timestampers
  are available already and can be easily used in the prototyping phase, but as they're intended for low traffic
  applications, eventually we'd want our own.

  There is a standard protocol for timestamp servers (RFC 3161). It appears to include everything that we might want
  and little more, i.e. it's a good place to start. A more modern version of it with the same features can be easily
  created later.

* All transactions submitted to the global network must be timestamped by a recognised TSP (i.e. signed by a root cert
  owned by R3).

* Transactions are ordered according to these timestamps. They are assumed to be precise enough that conflicts where
  two transactions have exactly equal times can ~never happen: a trivial resolution algorithm (e.g. whichever
  hash is lower wins) can be used if that ever happens by fluke.

* If need be, clock uncertainty can be measured and overlapping intervals can result in conflict/reject states, as in
  Spanner's TrueTime. The timestamping protocol (RFC 3161) exposes clock uncertainty.

* Transactions are timestamped as a group. This ensures that if multiple transactions are needed to carry out a trade,
  individual transactions cannot be extracted by a malicious party and executed independently, as the original bundle
  will always be able to win when broadcast.

* Nodes listen to a broadcast feed of timestamped transactions. They know how to roll back transactions and replay
  new ones in case of conflict, but this is done independently of any block construct.

* Nodes that are catching up simply download from peers all the transactions that occurred after the time they shut
  down. They can be sure they didn't miss any by asking peers to calculate a UTXO set summary at a given time and then
  verifying it against their own local calculations (this is slow, but shouldn't normally flag any issues so it can
  be done asynchronously).

* Individual transactions/UTXOs can specify time bounds, e.g. "30 seconds". A node compares a transaction timestamp
  to its own local clock and applies the specified bound to the local clock: if the transaction is out of bounds and
  the node isn't catching up, then it is dropped. This prevents people timestamping a malicious transaction X and
  keeping it private, then broadcasting a publicly timestamped transaction Y, then overriding Y with X long after the
  underlying trade has become irreversible. Because time bounds are specified on a _per transaction_ basis, they are
  arbitrarily controllable: traders that want very, very fast clearing can specify a small time bound and it's up
  to them to ensure their own systems are capable of getting an accurate trusted timestamp and broadcasting it within
  that tight bound. Traders that care less, e.g. because the trade represents physical movement of real goods, can use
  a much larger time bound and get more robustness against transient network/hardware hiccups.

* Like in Ethereum, transactions can update stored state (contracts? receipts? what is the right term?)

This can be called transaction-chains. All transactions are public.

For political expedience, we may wish to impose a (not strictly necessary) block quantisation anyway, both so the
popular term 'block chain' can be applied and for auditing/reporting convenience.

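The ordering rule and the per-transaction time bound from the list above can be sketched roughly as follows. This is
only an illustration: the `Stamped` type, its fields and the method names are made up for the example, hashes are
assumed to be equal-length lowercase hex strings (so string order matches numeric order), and rollback/replay is left
out entirely.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class BlocklessSketch {
    /** Hypothetical timestamped bundle, holding only what ordering and bound checks need. */
    static final class Stamped {
        final String hashHex;    // hex-encoded hash of the bundle (assumed fixed length, lowercase)
        final Instant timestamp; // time asserted by the timestamper
        final Duration bound;    // per-transaction time bound, e.g. 30 seconds

        Stamped(String hashHex, Instant timestamp, Duration bound) {
            this.hashHex = hashHex;
            this.timestamp = timestamp;
            this.bound = bound;
        }
    }

    /** Order by timestamp; on an exact tie the lower hash sorts first. */
    static final Comparator<Stamped> ORDER =
            Comparator.comparing((Stamped s) -> s.timestamp).thenComparing(s -> s.hashHex);

    /** Returns a sorted copy, leaving the caller's list untouched. */
    static List<Stamped> ordered(List<Stamped> txs) {
        List<Stamped> copy = new ArrayList<>(txs);
        copy.sort(ORDER);
        return copy;
    }

    /** True if the timestamp is within the transaction's own bound of the local clock. */
    static boolean withinBound(Stamped tx, Instant localNow) {
        Duration skew = Duration.between(tx.timestamp, localNow).abs();
        return skew.compareTo(tx.bound) <= 0;
    }

    /** A node drops out-of-bound transactions unless it is catching up. */
    static boolean accept(Stamped tx, Instant localNow, boolean catchingUp) {
        return catchingUp || withinBound(tx, localNow);
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2016-01-01T00:00:00Z");
        Stamped x = new Stamped("0b", t, Duration.ofSeconds(30));
        Stamped y = new Stamped("0a", t, Duration.ofSeconds(30));
        // Equal timestamps, so the tie-break on the hash decides which one sorts first.
        System.out.println(ordered(Arrays.asList(x, y)).get(0).hashHex);
        // A live node checks x against a local clock that is 31 seconds ahead.
        System.out.println(accept(x, t.plusSeconds(31), false));
    }
}
```

Measured clock uncertainty (the TrueTime-style variant) would replace the single `timestamp` with an interval and
treat overlapping intervals as a conflict; that is not shown here.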
# Privacy

* Transactions can have two halves: the public side and the private side. The public side is a "normal" transaction that
  includes a program of sufficient power to verify various kinds of signatures and proofs. The optional private side
  is an arbitrary program which is executed by a third party. Various techniques are used to lower the trust required
  in the third parties. We can call these third parties notaries.

* It's up to the contract designer to decide how much they rely on notaries, if at all. They are technically not
  required: the system would work (and scale) without them. But they can be used to improve privacy.

* The simplest "dummy" notary is just a machine that signs the output of the program to state it ran it properly. The
  notary is trusted to execute the program correctly and privately. The signature is checked by the public side. This
  allows traders to perform e.g. a Dutch auction with only the final results being reflected on the public network.

* Next best is an SGX based notary. This can provide both privacy and assurance that the code is executed correctly,
  assuming Intel is trustworthy. Note: it's a safe assumption that if R3 becomes very popular with financial networks,
  intelligence agencies will attempt to gain covert access to it given the NSA/GCHQ hacking of Western Union and clear
  interest in SWIFT data. Thus care must be used to ensure the (entirely unprovable) SGX computers are not interdicted
  during delivery.

* In addition, zero knowledge proofs can be considered as a supplement to SGX. They can give extra assurance against
  corrupted notaries calculating incorrect results. However, unlike SGX, they cannot reduce the amount of information
  the notary sees, and thus they are strictly a "backup". In addition they have _severe_ caveats: in particular, a
  complex and expensive setup phase that must be executed for each contract (in fact for each version of each contract),
  and extremely slow execution of the private side.

  This makes them suitable only for contracts that are basically finalised, in which the highest levels of assurance
  are required and fast or frequent trading is not. The technology may well improve over time.

* In some cases homomorphic encryption could be used as a privacy supplement to SGX.

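A rough sketch of the "dummy" notary from the list above: the notary signs the output of the private program and the
public side only checks that signature. The class and method names are invented for this example, RSA with
`SHA256withRSA` is just a convenient choice from the standard JDK, and the private program itself is reduced to a byte
array of its output.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

final class DummyNotary {
    private final KeyPair keys = generateKeys();

    private static KeyPair generateKeys() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    PublicKey publicKey() {
        return keys.getPublic();
    }

    /** The notary vouches for the output of the private side by signing it. */
    byte[] attest(byte[] programOutput) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(keys.getPrivate());
            signer.update(programOutput);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The check the public side would run: is this output signed by the notary's key? */
    static boolean publicSideAccepts(PublicKey notaryKey, byte[] claimedOutput, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(notaryKey);
            verifier.update(claimedOutput);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            return false; // malformed key or signature: treat as not accepted
        }
    }

    public static void main(String[] args) {
        DummyNotary notary = new DummyNotary();
        byte[] result = "auction cleared at 101".getBytes(StandardCharsets.UTF_8);
        byte[] signature = notary.attest(result);
        System.out.println(publicSideAccepts(notary.publicKey(), result, signature));
    }
}
```

Only the final result and the signature would appear on the public network; the bids that produced it stay with the
notary, which is exactly why the notary has to be trusted in this simplest variant.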
# Scaling

* Global broadcast systems are frequently attacked for 'not scaling'. But this is an absolute statement in a world of
  tradeoffs: technically speaking the NASDAQ is a broadcast system as you can subscribe to data feeds via e.g. OPRA.
  Some of these feeds can reach millions of messages per second. Nonetheless, financial firms are capable of digesting
  them without issue. Even the largest feeds have finite traffic and predictable growth patterns.

* We can assume powerful hardware, as the primary users of this system would be financial institutions. There is no
  requirement to run on people's laptops, outside of testing/devnet scenarios. For instance it's safe to assume SSD
  based storage: we can simply tell institutions that want to get on the network to buy a proper server.

* There is no requirement for lightweight/mobile clients, unlike in Bitcoin.

* Transaction checking is highly parallelisable.

* Therefore, as long as transactions are kept computationally cheap, there should be no problem reaching even very high
  levels of traffic.

Conclusion: scaling in a Bitcoin style manner should not be a problem, even if high level languages like Java or Kotlin
are in use.

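As a small illustration of the parallelism point, assuming each transaction can be checked without looking at the
others in the batch (conflict detection between transactions is a separate step and not shown), a batch check could be
as simple as the following. `ParallelCheck` and `allValid` are names made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class ParallelCheck {
    /** Checks every transaction in the batch, spreading the work across the common fork/join pool. */
    static <T> boolean allValid(List<T> batch, Predicate<? super T> check) {
        return batch.parallelStream().allMatch(check);
    }

    public static void main(String[] args) {
        // Stand-in "transactions": amounts that must be positive to be valid.
        List<Long> batch = Arrays.asList(10L, 25L, 3L);
        System.out.println(allValid(batch, amount -> amount > 0));
    }
}
```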
# Programmability

* The public side of a transaction must use a globally agreed execution environment, like the EVM is for Ethereum.
  The private sides can run anything: as the public side checks a proof of execution of the private side, there is
  no requirement that the private side use any particular language or runtime.

* Inventing a custom VM and language doesn't make sense: there is only one special requirement that differs
  from most VMs and that's the ability to impose hard CPU usage limits. But existing VMs can be extended to deliver
  this functionality much more easily than entirely new VMs+languages can be created.

* For prototyping and possibly for production use, we should use the JVM:

    * Sandboxing is already available and easy to use.
    * Several languages are available and developers are familiar with them.
    * If the host environment also runs on the JVM, no time is wasted on interop issues (see the Ethereum ABI issues).
    * HotSpot already has a CPU/memory tracking API and can interrupt threads (but lacks the ability to hard shut down
      malicious code).
    * Code annotations can be used to customise whatever languages are used for contract-specific use cases.
    * Can be forced to run in interpreted mode at first, but if we need the extra performance later due to high traffic
      the JIT compiler will automatically make contract code fast.
    * Has industrial strength debugging/monitoring tools.
    * Banks are already deeply familiar with it.

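The CPU tracking and thread interruption mentioned for HotSpot can be combined into a crude budget enforcer along
these lines. This is a sketch using the standard `java.lang.management.ThreadMXBean` API; the `CpuBudget` class, the
5 ms polling interval and the idea of running each contract on its own thread are assumptions for the example. As the
list notes, interruption only works for code that cooperates, so this is not a hard limit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class CpuBudget {
    /**
     * Runs the task on its own thread and interrupts it once it has used more than
     * cpuBudgetNanos of CPU time. Returns true if the task finished within budget.
     */
    static boolean runWithCpuBudget(Runnable task, long cpuBudgetNanos) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported()) {
            throw new UnsupportedOperationException("per-thread CPU time is not available on this JVM");
        }
        Thread worker = new Thread(task, "contract-worker");
        worker.setDaemon(true); // a runaway task must not keep the JVM alive
        worker.start();
        while (worker.isAlive()) {
            // Returns -1 if the thread has already died, which never exceeds the budget.
            long usedNanos = threads.getThreadCpuTime(worker.getId());
            if (usedNanos > cpuBudgetNanos) {
                worker.interrupt(); // cooperative only: the task has to notice the interrupt
                return false;
            }
            try {
                worker.join(5); // poll roughly every 5 ms
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                worker.interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A task that spins until interrupted, given a 20 ms CPU budget.
        Runnable spinner = () -> {
            while (!Thread.currentThread().isInterrupted()) {
                // burn CPU
            }
        };
        System.out.println(runWithCpuBudget(spinner, 20_000_000L));
    }
}
```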
# Transaction design

Use a vaguely bitcoin-like design with "states" which are consumed and generated by "contracts" (programs). Everyone
runs the same programs simultaneously in order to verify state transitions. Transactions consist of input states,
output states and "commands" which represent signed auxiliary inputs to the transitions.

------

# Useful technologies

FIX SBE is a very (very) efficient binary encoding designed for HFT:

http://real-logic.github.io/simple-binary-encoding/

It's mostly analogous to protocol buffers but imposes some additional constraints and has an uglier API, in return for
much higher performance. It probably isn't useful during the prototyping phase. But it may be a useful optimisation
later.

CopyCat is an implementation of Raft (similar to Paxos), as an embeddable framework. Raft/Paxos type algorithms are not
suitable as the basis for a global distributed ledger due to tiny throughput, but may be useful as a subcomponent of
other algorithms. For instance possibly a multi-step contract protocol could use Raft/Paxos between a limited number of
counterparties to synchronise changes.

http://kuujo.github.io/copycat/user-manual/introduction/

------

# Prototyping

Stream 1:

1. Implement a simple star topology for message routing (full p2p can come later). Ensure it's got a clean modular API.
2. Implement a simple chat app on top of it. This will be useful later for sending commands to faucets, bots, etc.
3. Add auto-update
4. Design a basic transaction/transaction bundle abstraction and implement timestamping of the bundles. Make chat lines
   into "transactions", so they are digitally signed and timestamped properly.
5. Implement detection of conflicts and rollbacks.

Stream 2: Design straw-man contracts and data structures (in Java or Kotlin) for

1. payments
2. simplified bond auctions
3. maybe a CDS
@ -1,19 +0,0 @@
# Simple payment

CashState:
- Issuing institution
- Deposit reference (pointer into internal ledger)
- Currency code
- Claim size (initial state = size of original deposit)
- Public key of current owner

ExitCashState:
- Amount to reduce claim size by
- Signature signed by ownerPubKey

State transition function (contract):
1. If the input states contain an ExitCashState, set reduceByAmount=state.amount (otherwise reduceByAmount=0)
2. All proposed output states must be instances of CashState
3. All proposed input states must be instances of CashState
4. Sum claim sizes in all predecessor states. Sum claim sizes in all successor states
5. Accept if outputSum == inputSum - reduceByAmount
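The transition function above could look roughly like this in Java. It is a sketch under a few assumptions that the
notes leave open: an ExitCashState is allowed to sit among the inputs alongside CashStates, several exit states simply
add up, the owner's signature on the exit state has already been checked elsewhere, and issuer/currency matching is
ignored. Field and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

final class CashContract {
    static final class CashState {
        final String issuer;       // issuing institution
        final String depositRef;   // pointer into the issuer's internal ledger
        final String currency;     // currency code
        final long claimSize;      // size of the claim, in minor units
        final String ownerPubKey;  // public key of the current owner

        CashState(String issuer, String depositRef, String currency, long claimSize, String ownerPubKey) {
            this.issuer = issuer;
            this.depositRef = depositRef;
            this.currency = currency;
            this.claimSize = claimSize;
            this.ownerPubKey = ownerPubKey;
        }
    }

    static final class ExitCashState {
        final long amount; // amount to reduce the claim size by (signature assumed verified elsewhere)

        ExitCashState(long amount) {
            this.amount = amount;
        }
    }

    /** Accepts if outputSum == inputSum - reduceByAmount and every state has an allowed type. */
    static boolean accepts(List<?> inputs, List<?> outputs) {
        long reduceByAmount = 0;
        long inputSum = 0;
        long outputSum = 0;
        for (Object in : inputs) {
            if (in instanceof ExitCashState) {
                reduceByAmount += ((ExitCashState) in).amount;
            } else if (in instanceof CashState) {
                inputSum += ((CashState) in).claimSize;
            } else {
                return false; // unknown input state type
            }
        }
        for (Object out : outputs) {
            if (!(out instanceof CashState)) {
                return false; // outputs must all be CashState
            }
            outputSum += ((CashState) out).claimSize;
        }
        return outputSum == inputSum - reduceByAmount;
    }

    public static void main(String[] args) {
        CashState hundred = new CashState("BankA", "dep-1", "USD", 100, "alice");
        CashState seventy = new CashState("BankA", "dep-1", "USD", 70, "alice");
        // Alice exits 30 of her 100 claim, leaving a single 70 output.
        System.out.println(accepts(Arrays.asList(hundred, new ExitCashState(30)), Arrays.asList(seventy)));
    }
}
```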
@ -1,19 +0,0 @@
How to represent pointers to states in the type system? Opaque or exposed as hashes?

# Create states vs check states?

1. Derive output states entirely from input states + signed commands, *or*
2. Be given the output states and check they're valid

The advantage of 1 is that it feels safer: you can't forget to check something in the output state by accident. On
the other hand, it is then up to the platform to validate equality between the derived and proposed states (probably
by serializing them and comparing bit strings), and that would make unit testing harder as the generic machinery can't
give good error messages for a given mismatch. Also it means you can't do an equivalent of OP_RETURN and insert extra
no-op states in the output list that are ignored by all the input contracts. Does that matter if extensibility/tagging
is built in more elegantly? Is it better to prevent this for the usual spam reasons?

The advantage of 2 is that it seems somehow more extensible: old contracts would ignore fields added to new states if
they didn't understand them (or is that a disadvantage?)

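To make the two options concrete, here is one possible shape for each as a Java interface, plus the equality check the
platform would need under option 1. All names are hypothetical, and `equals` on the state lists stands in for
"serialize and compare bit strings".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ContractStyles {
    /** Option 1: the contract derives the outputs from the inputs and the signed commands. */
    interface DerivingContract<S, C> {
        List<S> derive(List<S> inputs, List<C> commands);
    }

    /** Option 2: the contract is handed the proposed outputs and decides whether they are valid. */
    interface CheckingContract<S, C> {
        boolean verify(List<S> inputs, List<S> outputs, List<C> commands);
    }

    /** Under option 1 the platform, not the contract, compares derived and proposed outputs. */
    static <S, C> boolean platformAccepts(DerivingContract<S, C> contract, List<S> inputs,
                                          List<C> commands, List<S> proposedOutputs) {
        return contract.derive(inputs, commands).equals(proposedOutputs);
    }

    public static void main(String[] args) {
        // Toy contract: merge all integer "states" into a single output holding their sum.
        DerivingContract<Integer, String> merge = (inputs, commands) ->
                Collections.singletonList(inputs.stream().mapToInt(Integer::intValue).sum());
        List<String> noCommands = Collections.emptyList();
        System.out.println(platformAccepts(merge, Arrays.asList(1, 2), noCommands, Arrays.asList(3)));
        // An extra no-op output makes the lists unequal, which is the OP_RETURN limitation noted above.
        System.out.println(platformAccepts(merge, Arrays.asList(1, 2), noCommands, Arrays.asList(3, 0)));
    }
}
```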
# What precisely is signed at each point?