Trustless Trust with Verifiable Computation
An investigation on the consensus algorithms of blockchain systems
By Derek Zhang
Ethereum has implemented a blockchain system which creates consensus on the state of a general purpose virtual machine, the EVM, allowing users to execute functions to support dApps beyond simple token transactions. While these functions can reach consensus by Proof-of-Work(PoW) consensus algorithm, we find out that when the execution requires non-trivial computation effort, practical attacks exist which lead to incorrect results. The reason is that the PoW system is not designed with Incentive Compatibility. We call this the verifier’s dilemma. And a resolution is proposed to solve this dilemma by a prover-verifier consensus scheme. We further discuss multiparty computation and how to construct information theoretic MAC as a computation proof. Finally we compare the related work, and discuss how ARPA solves this problem.
The Verifier’s Dilemma
In blockchain systems based on Nakamoto consensus, miners receive reward for solving computationally hard problems. In order to race ahead of others, miners are incentivized to allocate their computation resource on the puzzle game written in the consensus algorithm. When a miner who finds the correct hash value(the leader), it will broadcast it to the entire network. All other mining nodes will check the validity of this block before mining the next block. The design philosophy of the system is that if miners skips checking, they risk of investing on invalid forked chain and jeopardize their own fate of mining (on the wrong chain).
If the block is simply composed of transactions, it requires negligible computation resources for verification. Miners can do this extra check without spending any significant computation work. In this scenario, if one skips the validation, the risk of including an invalid transaction and ending up in an invalid fork overweights the benefit of saving the computation. On the other hand, if the broadcast block contains complex smart contracts that require significant computation resources to validate it, the miner can decide to skip the validation to gain advantage on mining the next block.To prevent this, Ethereum introduced the gas system to limit resource exhaustive attack in the system.
However, this still does not prevent the attacker from creating and including resource-intensive contracts in his newly mined block. If following the consensus protocol, other miners will validate the block for free. As a consequence, the attack not only exhausts miners’ resource, but also give the attacker some time ahead of other miners in the race for the next block.
Having learned that validating an exhaustive contract may lead an inferior position at mining, miners have a strong incentive to skip expensive contracts. For example, the task requester wants to outsource a heavy computation. Let’s say a complex computation task in the form of a smart contract was initiated by requester R, a prover P found has just mined the current block, and other validators V (miners) are required to check the new block. In this case, P can pick a random number as the result of the contract, and Vs will skip the validation, resulting an inconsistent state in the blockchain. In addition, R will get an incorrect answer as well as wasted its money.
As we mentioned earlier, Ethereum introduced constraints using gasLimit. Unfortunately, however, it is not a fool-proof mitigation against this attack. Thus, PoW consensus permits unverified blocks when working on consensus of computation-intensive tasks.
Autonomy on classic blockchain systems
Ethereum and other turing-complete computation blockchain system assume that the computation and verification of a function is equivalent,e.g., to validate a result from peers in the network, one needs to compute the function again, using the same amount of computation. Indeed, in a trustless network, the only way to agree with the result is to run the Smart Contract (the function) again. Furthermore, to trust others’ validation work, one node needs to validate it again. E.g., if a node trust no one, it has to validate both computation and validation work from others. This leads to never-ending cycle of validation and the no consensus can be converged from it.
The biggest contribution of Nakamoto consensus, to decades of research on BFT algorithm, is that it transformed the endless validation cycle to a hash prover-validator model, where the proof is extremely hard to find, but the validation work has complexity of O(1). Therefore, blockchain network can reach consensus by validating the only one block mined from a leader. In Bitcoin, the block only contains transactions, which is very easy to verify, and thus verifiers are not subject to the dilemma introduced earlier. However, in Ethereum, turing-complete smart contract can include resource-intensive computation. This results a situation where the verifiers need to run every computation in order to verify the block, and thus raise the issue of verifier’s dilemma.
Formally, the advantage for a rational miner by skipping the verification of a computation f is:
Where V_f is the amount of work required to validate the function, and V_df is the work required by deviating from the honest protocol(skipping the validation). Generally, V_df=0, which is the cost of skipping the validation. In the cases mentioned before, if a computation-heavy task was included in a block by the leading miner (P), or the task requester (R), the maximum advantage a miner (V) can gain by skipping the validation is Adv(f).
We can claim that by skipping validation, V also suffers from mining on the wrong chain. Knowing other miners are faithfully working on the due diligence work, the probability that the block is invalid is small, but still non-zero. The optimal decision of whether skipping the validation, largely depend on how computation-intense the task is, if the percentage of faithful miner in the network is constant.
To verify a transaction, this is a simple matter of checking a number of hashes, signatures, and account balances in Bitcoin network. For Ethereum, things are more complex, as it is to verify the state of a powerful virtual machine, therefore it is done by recomputing the steps the virtual machine must take .
Verifiable computation provides a means to prove the correctness of a computation, such that verifying this proof is computationally less complex that performing the computation itself . Furthermore, verifiable computation enables a weak computer to offload the computation of some function, to other perhaps untrusted clients, while maintaining verifiable results. The other clients evaluate the function and return the result with a proof that the computation of the function was carried out correctly.
Formally, if a cloud service has been tasked with computing the result of a function f, how can the client be sure that the result for an input x they receive is correct? They could recompute f(x) themselves and compare the values, but this would defeat the purpose of using a cloud service for the computation in the first place. It is possible, of course, that there is an easy way to check if the output is correct for some particular f. For instance, if f=h’ for some hash function h, f would be very difficult to compute, while h would be easy. And this is exactly how Nakamoto consensus solved this verification problem: by transforming a W(f) = V(f) problem to a W(f) >> V(f) hash puzzle.
However, blockchain-based transaction verification does not work for arbitrary computation, and is situation specific. The high level solution is to delegate the computation to someone, and the rest of the network nodes only need to verify the computation with a small computation overhead. We call this process the off-chain computation. To achieve this goal, a computation proof is needed in the delegation process. As a metaphor, if you want to prove that you have been to somewhere, the best proof is an item that is unique to that place. We can think off-chain computation as a delegation process, and the envoy has to prove that he actually finished the task. Similar to this concept, an off-chain computation scheme needs a verification scheme that:
- Can prove that the computation is faithfully done,
- The amount of work done cannot be exaggerated, and optionally,
- The privacy of the content is guaranteed and nothing is revealed to the participants.
This relationship is illustrated in Figure 1.
With verifiable computation, instead of just computing y=f(x), a computation node will compute a proof p as well: (y, p) = f(x, p). And the verifier can check the proof using a verification function V(f, x, y, p). The verification function V must be such that it returns true only if y=f(x) and (y, p)=P(f, x). And the executing V should be computationally less complex than f.
Multiparty Computation(MPC) is a practical way by which multiple parties can compute some functions of their individual secret inputs, without any party revealing anything to the other parties about their inputs other than what can be learned from the output. As mentioned earlier, the current state of smart contract execution is duplicated across all nodes on blockchain. However, using MPC, the computation is verifiable and can be safely delegated to other nodes. In other words, the process can be trusted by a third party, as long as they provide proof that the process is done correctly. In a trustless network, this proof can be further broken down in two conceptual parts:
- The computation is conducted with prescribed function and correct input data,
- The proof cannot be forged.
In ARPA MPC protocol, Message Authentication Code (MAC) is a value that can be used to confirm a message has been created by a certain party, and to detect if a message has been changed. The main tool to achieve active security in modern, secret sharing-based protocols is information-theoretic MAC. MACs convey the correctness of computation, so that the computation is verifiable. The validation complexity is O(1), which means it consumes constant time to compute. Therefore, it is possible to be executed in any VM. A typical information-theoretic MAC scheme is listed as Algorithm 3.
If the computation nodes deviates from the protocol, such as sending incorrect result, the proof-carrying MAC will not be reconciled and the validation process cannot be passed. Therefore, part 1 is guaranteed by MPC protocol cryptographically.
After the computation nodes finish their work, they sign their MACs with their own private keys, and store them on the DHT-based distributed storage. This prevents the MAC from being forged by third parties. Therefore, the proof cannot be forged and can be trusted that it is the authentic proof even on a public-accessible storage. These produces desired property of part 2.
In summary, we now can conclude that the secure computation can be broken down into computation and verification parts. With part 1 and part 2 solved, it is clear that the computation work can be delegated to any party, without preexisting trust. Thus, the trustless trust is achieved with MAC verification scheme using MPC.
Several technologies existed as candidates of secure computation., namely Fully Homomorphic Encryption(FHE), Zero-Knowledge Proof(ZKP), and Trust Execution Environment. Our discoveries indicate that, while some technologies have advantage such as computation efficiency, some of them are not offering security and verifiable features needed in a permissionless network. In essence, we need to be able to verify the secureness, correctness and privacy-preservingness of computation. The details of each item are explained below:
- Efficiency: The speed of computation
- Privacy-preserving: Privacy-preserving here refers to the capability of evaluate a function on a dataset while not revealing the detail
- Proof of Correctness: Prove the computation work is actually using the prescribed function
- Proof of Computation: Prove the amount of work the participating node has done
- Proof of Secureness: Prove that the computation is actually carried out in the secured environment
The main contribution of Bitcoin is that it created a decentralized public ledger and it can reach consensus in a trustless network. We discovered that with Nakamoto consensus, the system is not Incentive Compatible, especially when the block contains computation-intensive transaction:the best strategy for a rational miner is to skip the verification in order to catch up with the leading miner. We call this The Verifier’s Dilemma. Further analysis has shown that classic blockchain system assume the Proving and Validation has same amount of work, and PoW consensus solved this issue by transform the work to a hash verification, where the validation has a constant O(1) complexity. With verifiable computation, where the proof-carrying result is significantly easier to validate, we found a path to solve this dilemma. Multiparty Computation, first created to solve secure computation between multiple untrusted parties, was modified to support computation proof. We discussed the properties of MAC and illustrated how to use MAC to support verifiable computation. And further discussed the possibility of using computation delegation to reach trustless trust even for resource-intensive task.
We at ARPA are a multi-national team focused on secure multiparty computation and its blockchain solution for a general purpose privacy-preserving computation network. While it is a challenging work on the cryptography side, we aim to improve blockchain in the following aspects:
- Secure Computation Computation is carried out securely so that no participating nodes should learn anything more than its prescribed output.
- Verifiable Computation The computation can be audited publicly, and its correctness can be proved. Therefore, it is possible to outsource the computation from the blockchain network.
- Layer2 Solution Combining secure and verifiable computation, the heavy-lifting work of computation is done off-chain, essentially making our secure computation protocol adaptable to any existing blockchain network.
- Scalability ARPA is designed as a layer2 solution. The on-chain network will never reach its computation (gas) limit. Therefore, we can improve the computation scalability and TPS(transaction per second) of any network. The computation capacity is increased linearly to participating nodes.
- Efficiency State-of-the-art implementation of MPC protocol is used to speed-up the secure computation.
- Availability World’s first general purpose MPC network for secure computation. With high availability and low cost, we promote data security and privacy practice that is difficult to achieve otherwise.
If you are interested in our project and current progress, please join our telegram@arpa_community.