The Road Ahead for Ethereum: Three Hard Problems
UPDATE: This post is from June 2016 and discusses the design only at a high level. I plan to take a deeper dive into various aspects and link them from here. This Reddit discussion has more details. It’s great to see innovation in decentralized computing.
Ethereum has generated a lot of excitement lately as a single world computer that can’t be shut down and can verifiably run applications.
What I Love About Ethereum:
First, let’s separate Ethereum the technology from the Ethereum community. The success of an open-source project depends on community involvement as much as the technology. Compared to Bitcoin, Ethereum has done a much better job at building a community of developers with multiple implementations, great communication, and quick decision making.
I love the Ethereum community and the culture of fast experimentation.
Hard Computer Science Problems:
Ethereum, in my view, has taken on at least three hard Computer Science problems. It’s unclear if any of them can be solved in a practical sense. What’s worse is that the broader Ethereum community believes that Ethereum already has the solutions or is close to having them.
1. Scalability of Distributed Systems:
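Ethereum, in its current form, works at small scale and, by design, breaks at large scale. Every new user/node adds state for every other user/node at a rate that is not sustainable in a distributed system. The CAP theorem frames the trade-off: a distributed system cannot guarantee all three of the following properties at once: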
- Consistency: all nodes see the same data at the same time.
- Availability: every request receives a success/failure response.
- Partition Tolerance: the system can operate under network partitions.
“P” is usually a given, so the real question is what a system chooses when the network does partition: consistency or availability. Blockchains give up on consistency. That’s the design space that blockchains operate in: they remain available while they are not yet consistent, and they try to limit the time (latency) it takes for data to become consistent (with high probability).
Bitcoin very explicitly makes the decision that (a) it is eventually consistent (data becomes consistent after a certain time or number of blocks), and (b) it keeps the amount of data that needs to become consistent as small as possible (hence it’s not trivial to just increase the Bitcoin block size and push more data).
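To make "consistent with high probability" concrete, here is a minimal sketch of the catch-up calculation from the Bitcoin whitepaper (section 11). It estimates the chance that an attacker controlling a share q of the hash power ever overtakes the honest chain from z blocks behind. The function name and the sample value q = 0.10 are my own choices for illustration.

```python
import math

def attacker_success_probability(q: float, z: int) -> float:
    """Chance that an attacker with hash-power share q ever catches up
    from z blocks behind (formula from the Bitcoin whitepaper)."""
    p = 1.0 - q  # honest share of the hash power
    if q >= p:
        return 1.0  # a majority attacker always catches up eventually
    lam = z * (q / p)  # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total

# Illustrative attacker share; not a claim about any real network.
for z in (0, 1, 2, 6):
    print(z, attacker_success_probability(0.10, z))
```

The probability falls off quickly as z grows, which is why waiting a few more blocks is the usual way to trade latency for confidence.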
Ethereum is pushing the boundaries of scalability in two conflicting directions. It wants to put more data in the blockchain, and it wants to reduce the time it takes for that data to become globally consistent. Given the underlying hardware and bandwidth limitations, these two goals work directly against each other.
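A back-of-envelope model shows why the two goals collide. This is only a sketch: the bandwidth, hop count, latency, block sizes, and block intervals below are hypothetical numbers chosen for illustration, not measurements of Bitcoin or Ethereum, and the store-and-forward model ignores many real-world optimizations.

```python
def propagation_seconds(block_bytes: int, bandwidth_bps: float,
                        hops: int, latency_s: float) -> float:
    """Time for a block to cross `hops` store-and-forward relays."""
    per_hop = block_bytes * 8 / bandwidth_bps + latency_s
    return hops * per_hop

def inconsistent_fraction(block_bytes: int, block_interval_s: float,
                          bandwidth_bps: float = 10_000_000,
                          hops: int = 5, latency_s: float = 0.1) -> float:
    """Share of each block interval during which some nodes lag behind.
    A value above 1.0 means the network never finishes catching up."""
    return propagation_seconds(block_bytes, bandwidth_bps,
                               hops, latency_s) / block_interval_s

# Hypothetical small blocks with a long interval.
print(inconsistent_fraction(1_000_000, 600))
# Hypothetical 10x larger blocks with a much shorter interval.
print(inconsistent_fraction(10_000_000, 15))
```

More data per block raises the numerator and a shorter interval lowers the denominator, so both changes push the inconsistent fraction up at once.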
Boot up for New Nodes: Let’s look at a new node that wants to boot up and independently verify the Ethereum blockchain from the beginning. That node will need to run every single computation that every Ethereum user ever ran. An analogy is a web browser that, on startup, needs to run every single website on the Internet. It’d take a long time to start.
People seem to think Ethereum can solve these scalability problems. I hope they can, but to put things in perspective, it would require new theoretical advances in distributed systems that researchers have not discovered in the last thirty years or so.
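The boot-up cost can be sketched with a toy ledger. This is not how an Ethereum client is implemented; the `Tx` type, the `replay` function, and the balance-transfer rule are hypothetical stand-ins for arbitrary contract execution. The point is only that independent verification means re-executing the full history.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tx:
    sender: str
    recipient: str
    amount: int

def replay(blocks, genesis):
    """Rebuild the current state by re-executing every transaction ever made.
    Work grows with the total history, not with the size of the final state."""
    state = dict(genesis)
    executed = 0
    for block in blocks:
        for tx in block:
            if state.get(tx.sender, 0) < tx.amount:
                raise ValueError(f"invalid transaction: {tx}")
            state[tx.sender] -= tx.amount
            state[tx.recipient] = state.get(tx.recipient, 0) + tx.amount
            executed += 1
    return state, executed

genesis = {"alice": 10}
blocks = [
    [Tx("alice", "bob", 4)],
    [Tx("bob", "carol", 1), Tx("alice", "carol", 2)],
]
state, executed = replay(blocks, genesis)
print(state, executed)
```

Here each transaction is a cheap transfer. With general-purpose contract code, every step of every past computation takes its place in that inner loop.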
2. Running Untrusted Code:
Ethereum argues that you can run untrusted code because you pay for compute steps, so you can only do a limited number of computations and can’t have infinite loops. But infinite loops are only a subset of a larger problem. Even within limited compute steps there can be attacks. The recent DAO attack is just one example. Attacks on VMs are a heavily studied area, and all sorts of vulnerabilities are frequently discovered.
The problem basically boils down to: can we run untrusted remote code with a 100% guarantee that it will not crash my program? If Ethereum can’t solve this problem, then a single attack can effectively stop the network from making forward progress (imagine all Ethereum nodes crashing after processing a single transaction). If Ethereum can solve this problem, then they’ve opened a brave new world of remote code execution from untrusted parties, a problem that people have been wrestling with for decades.
With a limited (i.e., non-Turing-complete) scripting language, the scope of this problem narrows drastically, and that’s why Bitcoin uses a limited scripting language.
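The pay-per-step idea can be sketched with a toy stack machine. This is an illustration only, not the Ethereum VM: the three opcodes, the flat cost of one unit per step, and the `run` and `OutOfGas` names are all assumptions made for the example.

```python
class OutOfGas(Exception):
    """Raised when a program exhausts its step budget."""

def run(program, gas_limit):
    """Execute a list of (op, arg) tuples, charging one gas unit per step.
    Returns the final stack and the gas actually used."""
    stack, pc, gas = [], 0, gas_limit
    while pc < len(program):
        if gas == 0:
            raise OutOfGas(f"halted at pc={pc}")
        gas -= 1
        op, arg = program[pc]
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "JUMP":
            pc = arg  # unconditional jump, enough to build an infinite loop
            continue
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return stack, gas_limit - gas

print(run([("PUSH", 2), ("PUSH", 3), ("ADD", None)], gas_limit=10))

try:
    run([("JUMP", 0)], gas_limit=1000)  # would loop forever without metering
except OutOfGas as err:
    print("stopped:", err)
```

Metering guarantees termination, and nothing more. It says nothing about whether the code that ran within its budget did something harmful, or whether the interpreter itself has a bug an attacker can trigger.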
3. Complexity in the Network:
Ethereum, by design, wants to put all complexity in the network. This is the exact opposite of the design choice that the end-to-end principle made for the Internet. Complexity is not only an enemy of security (as seen in the DAO attack and in running Turing-complete code), but also an enemy of building scalable networks. People have tried putting complexity in large networks before, e.g., active networking, only to realize how hard it is.
The Internet was able to survive and scale because of a fundamental design choice that was made at the core of the network and Ethereum needs to find a way to survive while making the exact opposite design choice.
I believe that Ethereum adds a lot of value to the blockchain ecosystem. Their bold experiments are enabling new applications and bringing attention to critical problems. In the long run, we will all benefit from their efforts. However, we need a much deeper understanding of the technical challenges, and we need to explicitly differentiate between the experimentation phase and production deployments.