Specializing to solve the Blockchain Trilemma
Intro:
The modular blockchain thesis has risen to prominence as the scalability limitations of the Ethereum blockchain have become apparent. Layer 2 solutions have been rolled out to separate the execution layer while continuing to use the Ethereum layer 1 chain for consensus, data availability, and settlement. Execution, consensus, data availability, and settlement are the four layers that comprise the thesis, and the central idea is that separating their functions into different chains yields a multitude of benefits. Ethereum is taking advantage of some aspects of the thesis, but it was fundamentally not designed with it in mind, which limits its ability to execute on it. Meanwhile Celestia, whose team originated the thesis, is designed from the ground up to be modular and provides an interesting lens on how blockchains may scale in the future. I’ll go through a couple of different examples and how they align to the thesis, and highlight some general properties, to illustrate how a modular approach can be used to solve the challenges of scalability, security, and decentralization.
Layers:
Consensus is at the very bottom of the stack and is a frequently used term that can encompass all kinds of ideas, protocols, and incentive structures. At the most fundamental level, consensus is taking in messages and agreeing on the order of those messages. All the other aspects of consensus can be extrapolated once you have that agreement in place. Consensus is reached by using the data in the block header. The header contains metadata about the block as well as a Merkle root, which can be thought of as a commitment to the raw data. The header is used because processing the actual raw data would be too slow to be usable when reaching consensus. The best analogy I’ve heard, though it feels somewhat inadequate to me, is to think of the header as a shipping label: the various package handlers have what they need to do their job without each one needing to open the package.
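To make the "commitment" idea concrete, here is a minimal sketch of how a list of raw transactions can be folded into a single Merkle root that sits in a header. The `BlockHeader` fields, the helper names, and the choice to duplicate the last node on odd levels are assumptions for illustration, not the header layout of any particular chain.

```python
import hashlib
from dataclasses import dataclass


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold a list of raw transactions into a single 32-byte commitment."""
    if not leaves:
        return sha256(b"")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


@dataclass(frozen=True)
class BlockHeader:
    """Hypothetical header: a little metadata plus a commitment to the data."""
    height: int
    parent_hash: bytes
    data_root: bytes


transactions = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1"]
header = BlockHeader(
    height=1,
    parent_hash=b"\x00" * 32,
    data_root=merkle_root(transactions),
)

# Changing any transaction changes the root, so agreeing on the small header
# is enough to agree on exactly which data it commits to.
tampered = [b"alice->bob:500", b"bob->carol:2", b"carol->dave:1"]
assert merkle_root(tampered) != header.data_root
```

The header stays the same size no matter how many transactions are in the block, which is the property the shipping label analogy is reaching for.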
Data availability is the question of whether the data behind a proposed block was actually made available for nodes to verify. There are two types of nodes on a blockchain network. Full nodes hold the full history of the blockchain and download and check the validity of every transaction that is made. This can require a significant amount of resources, but these nodes cannot be tricked into accepting a block that has invalid transactions. The other type of node is the light client. Light clients have significantly fewer resources and take a pragmatic approach, relying on just the block headers to find the state of an account, watch for events, or check that a transaction was confirmed. Light clients must connect to intermediary full nodes in order to request data and interact with the blockchain. Meaningful decentralization of the data availability layer on the biggest chains relies on having many full nodes. During the Bitcoin block size wars, keeping blocks small so that people could download the full chain and reasonably run their own full nodes was a critical consideration.
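As a rough illustration of how a light client can check that a transaction was confirmed using only a header, the sketch below has a full node build a Merkle inclusion proof and a light client verify it against the root. The function names and the tree layout (SHA-256, last node duplicated on odd levels, non-empty input) are assumptions for this example rather than any chain's real format.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Full-node side: every level of the tree, hashed leaves first, root last.

    Assumes at least one leaf.
    """
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels by duplicating the last node
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Full-node side: sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Light-client side: needs only the leaf, the proof, and the root from the header."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


txs = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
levels = build_levels(txs)
root = levels[-1][0]  # this is what the light client reads from the header

proof = make_proof(levels, 3)
assert verify_proof(b"tx3", proof, root)
assert not verify_proof(b"forged", proof, root)
```

The light client still depends on a full node to hand it the proof, which is why the number of independent full nodes matters for decentralization.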
Execution sits at the top of the stack and is most closely associated with the processing of transactions that interact with applications. This is the layer most people are familiar with, since it’s where most people interact with the blockchain. There has been particular growth in Ethereum “Layer 2” rollup solutions, which are execution layers that Ethereum is relying on to provide a scalable way to access the broader Ethereum ecosystem. A rollup is, for all intents and purposes, just a normal blockchain. Instead of having a validator set doing its own consensus, there are rollup operators that collect all the transactions, build a block, compute the next state, and post that block to another blockchain that does the consensus and data availability. To provide the efficiency needed to scale, these execution layers must post something to the consensus and data availability layer that is more compressed than the raw transaction data of what has happened. This is where fraud proofs and validity proofs come in. Instead of writing out many transactions in full, the rollup produces a hash that represents the new state of the chain, and the proofs are how that hash can be checked.
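The operator loop described above (collect transactions, compute the next state, post something compact) can be sketched roughly as follows. Everything here is a toy under stated assumptions: the balance-map state, the `state_root` helper that hashes a JSON encoding instead of a real state tree, and the `build_batch` output format are all hypothetical.

```python
import hashlib
import json
import zlib


def state_root(balances: dict[str, int]) -> str:
    """Hash a canonical encoding of the state; a stand-in for a real state tree."""
    encoded = json.dumps(balances, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def apply_tx(balances: dict[str, int], tx: tuple[str, str, int]) -> dict[str, int]:
    """Execute one transfer and return the new state without mutating the old one."""
    sender, receiver, amount = tx
    if balances.get(sender, 0) < amount:
        raise ValueError(f"{sender} cannot afford {amount}")
    new = dict(balances)
    new[sender] = new.get(sender, 0) - amount
    new[receiver] = new.get(receiver, 0) + amount
    return new


def build_batch(balances: dict[str, int], txs: list[tuple[str, str, int]]):
    """Operator side: execute the batch and produce what gets posted to the base layer."""
    pre_root = state_root(balances)
    for tx in txs:
        balances = apply_tx(balances, tx)
    batch = {
        "pre_state_root": pre_root,
        "post_state_root": state_root(balances),
        "tx_data": zlib.compress(json.dumps(txs).encode()),  # compressed, not executed, by the base layer
    }
    return balances, batch


start = {"alice": 10, "bob": 0}
txs = [("alice", "bob", 4), ("bob", "carol", 1)]
new_state, batch = build_batch(start, txs)
print(new_state)
print(batch["pre_state_root"][:16], "->", batch["post_state_root"][:16])
```

The base layer in this sketch only stores the batch; it never runs `apply_tx` itself, which is the division of labor the rollup design relies on.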
Settlement is by far the most difficult concept to grasp, since it has been taken for granted under the monolithic approach to blockchains that has been taken so far. The settlement layer can do proof verification and dispute resolution, facilitate bridging between multiple execution layers, and act as a liquidity hub. There are clever ways to have nodes do proof verification and dispute resolution themselves, and the other use cases are not strictly required, so a settlement layer is technically optional in the modular blockchain paradigm. The overwhelming practical benefits of a settlement layer mean that the majority of chains will leverage this layer in some form. Settlement layers serve to reduce the amount of overhead that is required to validate all the rollups. Normally you would need a separate light client for each of the other chains you’re interacting with, but by using a single light client for a shared settlement layer you can trust all the other chains that also settle there, which creates a compatibility network effect. Another helpful way to think of a settlement layer is as an “in ecosystem” bridge.
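A back-of-the-envelope way to see the overhead reduction is to count light clients. The sketch below assumes the simplest possible model, where every chain wants to verify every other chain; the function names are made up for this example.

```python
def pairwise_light_clients(num_chains: int) -> int:
    """Without a shared layer: every chain runs a light client for every other chain."""
    return num_chains * (num_chains - 1)


def shared_settlement_light_clients(num_chains: int) -> int:
    """With a shared layer: every chain runs one light client for the settlement layer."""
    return num_chains


for n in (2, 10, 100):
    print(n, pairwise_light_clients(n), shared_settlement_light_clients(n))
```

Under this model, 10 chains need 90 light clients in total when connected pairwise and 10 when they share a settlement layer, and the gap widens quadratically as more chains join.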
Chains:
The Ethereum community has been preaching the modular blockchain thesis since they realized they had a massive scalability problem. Much the same way Ethereum backed into proof of stake to solve ESG concerns as well as numerous other problems, they’re taking a similar tack with the modular approach to solve problems that were not accounted for in their original design philosophy. It’s also worth taking a shot at the Ethereum community, who love to hammer Bitcoin for shifting narratives, for having effectively pivoted from a “world computer for dapps” narrative to a “global settlement layer” narrative. Due to the size of the Ethereum ecosystem, there is a plethora of examples we can look at that illustrate the strengths of a modular approach, as well as where it will likely encounter problems due to the reality of Ethereum’s initial design as a monolithic blockchain. The pure modular approach is being pioneered by Celestia, whose team also claims to have originated the thesis, and I haven’t found anything to contradict that claim. Having been developed to be modular from the outset, the Celestia approach offers significantly more flexibility in where and how chains can be plugged in across the broader crypto ecosystem.
Ethereum is focused on scaling by adding a bunch of execution layers, commonly referred to as rollups. The two main flavors of rollups are optimistic, which uses fraud proofs to run a validity check only if a block is contested, and zero knowledge, which relies on validity proofs to show up front that a block is valid. One such rollup is “Aztec”, which is designed to apply privacy to transactions. Sparing you the technical details, making transactions private is computationally expensive, which is why this use case is a perfect candidate to be separated out into its own execution layer. The privacy can be provided in an environment where it can be optimized, and the ultimate results can be settled down to Ethereum without the base layer being impacted by that computationally expensive operation. You’ll see many rollups, some of which are specialized like Aztec and others that are general purpose in an attempt to capture the network effects of developers who are already comfortable with the EVM. One of the benefits of separate execution layers, though, is that they can have their own execution environment, which provides more flexibility for developers. One such example is Fuel, which utilizes the FuelVM while striving to be a modular execution layer on top of Ethereum.
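The optimistic flavor can be sketched as "accept now, allow challenges for a while". In the toy below, a claim becomes final only once a challenge window has passed without anyone showing by re-execution that the claimed result is wrong. The `CHALLENGE_WINDOW` value, the `Claim` fields, and the trivial `execute` function are all assumptions for illustration, and the sketch assumes at least one honest party actually performs the check. A zero-knowledge rollup would instead attach a cryptographic validity proof to each claim, which is not modeled here.

```python
from dataclasses import dataclass

CHALLENGE_WINDOW = 3  # hypothetical window length, measured in base-layer blocks


def execute(balance: int, deposits: list[int]) -> int:
    """Toy state transition: the new state is the old balance plus all deposits."""
    return balance + sum(deposits)


@dataclass(frozen=True)
class Claim:
    """What an operator posts: the inputs and the result it claims they produce."""
    pre_state: int
    deposits: list[int]
    claimed_post_state: int
    posted_at: int  # base-layer block height when the claim was posted


def fraud_proven(claim: Claim) -> bool:
    """A challenger re-executes the inputs; a mismatch is the fraud proof."""
    return execute(claim.pre_state, claim.deposits) != claim.claimed_post_state


def is_final(claim: Claim, current_block: int) -> bool:
    """Final only if the window has elapsed and no fraud was shown."""
    window_passed = current_block >= claim.posted_at + CHALLENGE_WINDOW
    return window_passed and not fraud_proven(claim)


honest = Claim(pre_state=100, deposits=[5, 10], claimed_post_state=115, posted_at=7)
dishonest = Claim(pre_state=100, deposits=[5, 10], claimed_post_state=1000, posted_at=7)

print(is_final(honest, current_block=9))      # still inside the challenge window
print(is_final(honest, current_block=10))     # window elapsed, no fraud shown
print(is_final(dishonest, current_block=10))  # re-execution exposes the bad claim
```

The cost of the optimistic design is visible in the sketch: even an honest claim has to wait out the window before it counts as final.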
The biggest challenge Ethereum faces in its attempt at modularization is that it was initially designed as a monolithic chain, which means it has properties enshrined in the protocol that limit its ability to optimize the layers. The glaring example is that even as layer 2s gain adoption, there still exists an execution layer on Ethereum itself. This execution capability at the base layer means it cannot be optimized purely for proof verification and dispute resolution, because it must also support the ability to execute smart contracts. Continuing this line of thinking, the fee markets for these two distinct functions are also muddled together. Execution activity on the base chain can drive up the fees that layer 2s pay to settle, which is another area where the ability to optimize is lost.
The core difference of the Celestia approach is that it was designed to be modular, meaning there are no execution or settlement layers built into it. Celestia strictly provides consensus and data availability functions, and optimizes for that purpose. One of the interesting effects of this is that modular chains that leverage Celestia can be sovereign, acting unilaterally to respond to hacks or push upgrades. On Ethereum, a DAO or multisig would be needed to interact with the smart contract that governs the rules of the layer 2, which introduces centralization concerns. The modular approach allows new chains to effectively outsource the functions they don’t want to build themselves, which allows them to specialize and provides benefits such as reduced time to deployment and lower costs. The Celestia founders have put out a ton of content, they have excellent documentation, and I would recommend focusing on their efforts if you want to continue investigating the modular approach to blockchains. One of their particularly interesting solutions is called data availability sampling, which utilizes light clients to improve decentralization. A couple of other cool solutions are Cevmos and Celestiums, which along with Fuel aim to merge the Ethereum and Celestia ecosystems.
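The intuition behind data availability sampling is that a light client downloads a handful of random pieces of a block instead of the whole thing, and a block producer who is withholding data is very likely to be caught by at least one of those requests. The sketch below computes that chance under a simplified model of independent uniform samples. The withheld fraction of 25% and the sample counts are illustrative assumptions, not Celestia's actual parameters, and the erasure coding that a real scheme depends on is not modeled.

```python
import random


def detection_probability(withheld_fraction: float, samples: int) -> float:
    """Chance that at least one of `samples` independent uniform draws hits withheld data."""
    return 1.0 - (1.0 - withheld_fraction) ** samples


def simulate(withheld_fraction: float, samples: int, trials: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo check of the formula above using a seeded generator."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(trials):
        # Each sample independently lands on withheld data with the given probability.
        if any(rng.random() < withheld_fraction for _ in range(samples)):
            detected += 1
    return detected / trials


for s in (1, 4, 8, 16):
    print(s, round(detection_probability(0.25, s), 4), round(simulate(0.25, s), 4))
```

Because the miss probability shrinks exponentially with the number of samples, many light clients each doing a little work can collectively give strong assurance, which is how light clients end up contributing to decentralization rather than just consuming data.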
Summary:
The consensus layer is centered around the validators who agree on ordering and produce blocks, the data availability layer is concerned with ensuring nodes can verify those blocks, and the execution layer is designed for the applications that produce the transactions that ultimately fill the block space. Those three are all pretty straightforward. Then there’s the settlement layer, which holds a lot of potential around scalability and network effects depending on adoption and implementation. The classic blockchain trilemma is that with a monolithic approach a chain must choose between scalability, security, and decentralization, and can only effectively provide two of the three. The modular thesis of separating these functions into different chains allows each chain to optimize for its layer, which has the net effect of solving the trilemma: scalability is tackled at the execution layer, security at the settlement layer with its network effects, and decentralization at a purpose-optimized consensus and data availability layer.
There are monolithic chains such as Solana which have no intention of modularizing. There’s Ethereum in the middle of the spectrum, adopting the modular approach out of necessity and doing its best given some limitations enshrined in its original design. Finally, there’s the modular maximalist approach that Celestia is taking. The scalability that this approach provides from an infrastructure perspective, combined with the speed and cost optimization for developers and the general flexibility across all layers, is unmatched in the crypto ecosystem currently. Overall I think the modular approach has a future, as it facilitates innovation and growth, and we can use as much of both as we can get at the early life stage that crypto is in.