Instant Sync Full-Validation Bitcoin Nodes
A cornerstone of the Bitcoin scaling debates is the ability of nodes to fully validate the chain state. Without validating the entire state, a cartel of >50% of miners could do whatever they want, like steal coins or increase inflation. But more adoption makes it more difficult to validate the chain which pushes users to delegate trust to miners (SPV) or companies (APIs).
This puts adoption and security at odds with each other. I think the best way to fight this is to make full validation more efficient than delegating trust. I’ve been thinking about how to do this for a long time and think I’ve finally figured it out. Along the way I’ve found a project working on the same ideas, but as a standalone cryptocurrency called Coda. I’ve refined some of my ideas based on theirs and figured out how it could be applied to any cryptocurrency, like Bitcoin, without making any consensus changes.
The result is a Bitcoin client that can theoretically sync and validate the chain state in constant time and with a constant data size, from 1 transaction to trillions.
Coda is a succinct blockchain which means that no matter how many transactions or blocks it has it can always be represented as a 1kb zero-knowledge proof of integrity. Syncing the chain means getting the latest proof and verifying it, not downloading blocks and executing transactions. It can happen in seconds and will only get faster.
I highly recommend everybody check out their testnet explorer. This page contains an entire working full node and performs a full initial sync of the chain state every time you load the page. This could be the future of every API, SPV, and full Bitcoin node.
How does this work?
What makes a state valid?
The heart of every cryptocurrency is its state. In Bitcoin we call the state the UTXO set and it’s represented as a Replicated State Machine. Every node agrees on an initial state (Genesis) and applies a sequence of state transitions (transactions) to arrive at the correct UTXO set.
The Transition function is what makes cryptocurrencies different from one another. It defines what is and isn’t valid. Validating the UTXO set is equivalent to calculating Transition for every block in history. In other words, a valid state is one that 1) starts at Genesis and 2) consists only of valid transitions. This gets expensive to check as the number of blocks and transactions grows.
Examples of rules that Transition is enforcing are: The first block must be built on top of Genesis, all transactions must have valid signatures, PoW must be valid, etc.
Zero-Knowledge proofs and trustless computation
Zero-Knowledge proofs are a relatively new cryptographic idea that allows anyone to calculate a function (called a “circuit” in ZK-parlance) that generates an output and a proof. With the proof, anybody can verify that the prover faithfully executed the function and obtained that output. The verifier does not need to know the input to the function and does not need to trust the prover. This is an immensely powerful technique.
By implementing the Transition function as a ZK circuit, provers can create fixed-size proofs that an output state is valid and they can be validated in constant-time. If it outputs the hash of the state then blocks can now be validated without processing their transactions or knowing the entire state.
I’m not going to go much into how ZK proofs themselves work in this article. For more information start with this great article by Vitalik Buterin.
Recursive proofs to validate back to genesis
Being able to validate blocks in constant-time is great, but the demo had O(1) proofs for the entire chain state, how? It turns out that ZK circuits can implement any function, including the validation of other ZK circuits or even themselves. A ZK circuit that validates proofs from itself is called a recursive ZK circuit.
If the Transition function takes in the proof of the previous state and validates it, then the resulting proof attests that 1) the block contains only valid transactions and 2) the initial state was valid. Every proof, P, now witnesses that its transactions and every transaction it builds on is valid.
ZK-sync: Instant full validation in Bitcoin without consensus changes
I’ve given an overview of how Coda achieves a succinct blockchain, but how can we bring this to Bitcoin without the risk of consensus changes that rely on developing technology? How can we enable teams to start experimenting without messing with the existing battle-tested protocol?
Validating full chain state and syncing state
Instead of baking a particular ZK construction into the Bitcoin protocol we can build an overlay protocol where anybody can produce and publish proofs by processing mined blocks with a ZK circuit that implements Bitcoin’s transition rules. This process would be just like a standard full node syncing, except the state they calculate would be authenticated by the circuit.
Clients connect to this network, download and validate the most worked state hash, and are now fully synced and have an “authenticated state” hash with its integrity verified. Clients that want a piece of the state could request it from state-holding nodes on the network. A mechanism would transmit state fragments and proofs-of-inclusion to clients on demand. Coda achieves this by representing the state as a merkle tree. The UTXO set could be represented however is necessary in this overlay network.
Users would choose which circuit they trust to be correct in the same way they choose which node software (ABC, BU, bchd, etc) to use currently. Existing style full nodes would only need to be used by proof producers.
Checking cumulative PoW
Coda uses Proof-of-Stake, but that’s orthogonal to the state validation. We can utilize the same system with PoW.
In addition to the UTXO set, we can add block height and cumulative work to our authenticated state. Transition would validate the block’s PoW and add the work proved to an accumulator in the state. A client will be presented with multiple valid states while syncing and if it takes the one with the most work and validates the state proof then it’ll know it’s on the best available chain.
This creates a type of Non-Interactive Proof-of-Proof-of-Work which permits clients to skip validating the PoW for every block individually, giving us the O(1) syncing property!
Who is producing these proofs?
Anybody can produce them because verifiers do not need to trust the provers. Businesses like exchanges are incentivized to ensure proofs are being produced in order to provide a friendly UX to customers. Miners are incentivized to utilize this scheme because they can transmit only a state hash and proof for other miners to start mining on.
In the legacy model the entire transaction history needs to be processable by a large portion of the network in order to remain secure against invalid state manipulation. In this proposed model, just a single entity providing proofs is sufficient and could be operated by institutions (i.e. non-profits, universities etc) or businesses trustlessly. If all provers are compromised the chain is still secure, but clients that rely on proofs can no longer validate updates. This is a much better failure mode than SPV because nodes are not tricked into accepting an invalid state, they just can’t continue to validate new states until a prover is back online.
The whitepaper’s SPV, but more secure and efficient
The Bitcoin whitepaper describes a system that has become known as “fraud proofs”. The idea is that honest nodes could create a proof that a particular block is invalid:
[T]he simplified method can be fooled by an attacker’s fabricated transactions for as long as the attacker can continue to overpower the network. One strategy to protect against this would be to accept alerts from network nodes when they detect an invalid block
Fraud proofs never materialized like we all hoped, but instead of finding a way to prove that blocks are invalid we have found the opposite: a way to prove that blocks are valid. I think this is far more powerful as it affords clients the security of full validation with far fewer resources than SPV.
Zero-Knowledge about the future
I think ZK-sync is viable, but there is some ground work to be done. Different ZK constructions have different security and scaling implications that need to be further investigated. Software support for creating ZK circuits needs a lot of infrastructure development. The cryptography could use some performance enhancements.
A lot of the problems are being tackled by Coda’s parent company, O(1) labs, with things like the Snarky language and potential hardware support for ZK operations. Because my proposal is an overlay system we can easily experiment with different technologies without disturbing anybody and there doesn’t need to be buy-in from anybody but the participants of the overlay. Different teams can compete to create the best protocol.
Incentives to toward full reliance on proofs
What could things look like if ZK-sync proves to be successful and able to provide the same level of security as existing full nodes? If the community accepted these changes they could hard fork to utilize them more completely. Transactions wouldn’t need to be known to anybody but miners. Miners would process the block and publish the new state only.
As blocks grow in size miners should want to encourage the community to hard fork to using proofs exclusively instead of manually processing transactions. This would let them relay blocks in a fixed-size and fixed-time; giving them more fee revenue without increasing orphan risk.
As the chain grows in size and the reliance on proofs increases, who is incentivized to keep the entire chain any more? Do we need it for anything? Or are individual transactions worthless as long as they valid?
The community will have to decide how it wants to handle these issues.
I hope I have provided a convincing sketch of how Bitcoin, and any blockchain, could move to a new paradigm with full-validation security at nearly instant speeds. And do so without jeopardizing security or decentralization. This would incentivize users to use a more secure client, eliminate a huge swath of scaling concerns, and enable truly global and trustless p2p concensus.
If you have any feedback or interest in these ideas please let me know. Please check out the Coda testnet to see what we could have for Bitcoin.