Modular Blockchain Architecture — Rollups, Sharding, "DankSharding"

Guram Kashmadze
WEB3CHRONICLES

--

In this article, we will look at the architecture of Modular Blockchains, the scalability issues of today's platforms, and the techniques proposed to address them.

Table of Contents

  • Intro
  • Problem
  • Blockchain Responsibilities
  • Proposal
  • Data Sharding
  • Rollups
  • Optimistic Rollup
  • ZK-Rollup
  • Rollup Finality
  • Rollups + Data Shards
  • Danksharding
  • Historical Data and State Data
  • Challenges
  • Outro

Intro

We, as a society, rarely develop new needs. Instead, we have new techniques that solve existing needs more efficiently.

We can look at Blockchain as one more iteration of this attempt: a new technique for improving existing solutions to existing problems.

The main improvements we get from this technology are Trust and the Commonality of Knowledge.

Blockchain makes it nearly impossible to tamper with the records, and we get the Trust component. Blockchain makes the knowledge available to everyone, and we get the Commonality of Knowledge.

The Problem

Current Blockchain platforms have a big problem: high fees. Users have to pay a fee to interact with the Blockchain. We have to pay a fee to a Blockchain network because participant nodes store, validate, and secure the data in the Blockchain.

Why do we have high fees? The answer is supply-demand economics. There is a lot of demand for block space in the Blockchain, and its throughput is small. For example, Ethereum, the leading platform for DApps, has a throughput of roughly 15 TPS (Transactions Per Second).

Suppose we increase the throughput by increasing block sizes and packing more transactions into each block. Why wouldn't this work? For example, Ethereum's average block size is between 50–100 kilobytes. Let's make it 1 MB. Why is this a problem?

As we mentioned, every participant node of the Blockchain has to download, validate, and store every piece of information. This naive solution would dramatically increase participant nodes' storage and network requirements. It would therefore favor large companies that run big data centers with far more computational and storage capabilities than a regular user, which would lead us to a more centralized blockchain. That's not an option, because we would lose the Trust aspect. Even though many speculative projects don't care about it, the underlying platform should not allow it.
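To get a feel for the numbers, here is a rough back-of-the-envelope sketch in Python. The roughly 13-second block time and both block sizes are illustrative assumptions, not exact protocol parameters:

    # Rough estimate of how much data a full node must download and store per year
    # for different average block sizes. All numbers are illustrative assumptions.

    SECONDS_PER_YEAR = 365 * 24 * 60 * 60
    BLOCK_TIME_SECONDS = 13  # assumed average block interval

    def yearly_storage_gb(avg_block_size_kb: float) -> float:
        """Data added to the chain per year, in gigabytes."""
        blocks_per_year = SECONDS_PER_YEAR / BLOCK_TIME_SECONDS
        return blocks_per_year * avg_block_size_kb / (1024 * 1024)

    for size_kb in (75, 1024):  # ~75 KB blocks today vs. a hypothetical 1 MB block
        print(f"{size_kb:>5} KB blocks -> ~{yearly_storage_gb(size_kb):,.0f} GB per year")

With ~75 KB blocks the chain grows by roughly 170 GB per year; with 1 MB blocks it would be well over 2 TB per year, before counting state and indexes.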

The final thought is that we have to improve some responsibilities of the Blockchain without sacrificing the other ones.

Blockchain Responsibilities

Blockchain has three primary responsibilities:

  • Consensus
  • Data Availability
  • Computation

The Consensus component is responsible for collective agreement about the state of the Blockchain, so it's the heart of the Trust aspect of the Blockchain. Consensus is crucial in distributed systems with multiple participants. It defines the agreement rules that participant nodes use to identify and agree on the valid state of the Blockchain. Every Blockchain has its own Consensus rules/methods, and the Consensus component makes sure that every participant node follows them.

The Data Availability component is responsible for storing and propagating the data to everyone: basically, every piece of data, every record, every piece of information that moves into the Blockchain. The Data Availability component makes sure that the data is published, stored, and propagated through the network. It delivers the Commonality of Knowledge aspect of the Blockchain.

The Computation component executes various operations on the records, leading to the system's new, global state. For example, if I have ten coins, and someone transfers me another ten coins, some computation will validate this operation. It will ensure that the coins exist and that the sender is the actual owner. After that, it will update the corresponding accounts and balances.
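As a minimal illustration of the Computation responsibility, here is a toy version of that ten-coin transfer in Python. The account model and the checks are invented for the example and heavily simplified (no signatures, nonces, or fees):

    # A toy state-transition function: validate a transfer and produce the new global state.

    def apply_transfer(state: dict, sender: str, receiver: str, amount: int) -> dict:
        # 1. Make sure the coins exist and the sender really owns them.
        if amount <= 0:
            raise ValueError("amount must be positive")
        if state.get(sender, 0) < amount:
            raise ValueError("sender does not own enough coins")

        # 2. Compute the new global state (balances after the transfer).
        new_state = dict(state)
        new_state[sender] -= amount
        new_state[receiver] = new_state.get(receiver, 0) + amount
        return new_state

    state = {"alice": 10, "bob": 10}
    state = apply_transfer(state, sender="bob", receiver="alice", amount=10)
    print(state)  # {'alice': 20, 'bob': 0}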

Monolith Blockchain — everything is done by one big coupled system

The Root of the Problem

The problem is that all the responsibilities mentioned above are tightly coupled. That's why we call them Monoliths.

A Monolith is a system where one central component holds all the responsibilities. It has several jobs to take care of, and it does everything in one place. Sub-components are absent or tightly coupled.

Why is this bad? — Monoliths don't scale.

If we could decouple the Blockchain responsibilities into subcomponents, we would be able to scale them independently. In addition, we would be able to try various techniques on different subcomponents without breaking one another. We can call this type of system Modular.

A Modular system consists of separate subcomponents. Subcomponents are connected but not coupled. In a Modular system, one can alter, swap or add any subcomponent without affecting the whole system.

Decoupling was the central concept behind moving to n-tier applications, microservice architectures, and separating the data/logic/presentation layers from one another 15–20 years ago in Web2 software development.

Decoupled vs. Coupled

The Proposal

Let's try to turn a Monolithic Blockchain into a Modular one.
We should decouple all three responsibilities.

The central platform (Mainchain, Layer 1) should be decomposed so that it remains responsible only for the Consensus and Data Availability subcomponents.

The Consensus should be turned into a subcomponent of Layer 1 and guarantee the collective agreement of the data.

The Data Availability should be turned into a subcomponent of Layer 1 and be able to store a higher amount of data, thus handling higher throughput.

The Computation responsibility should be separated from the Mainchain and carried into efficient external solutions.

Layer 1 should not depend on the concrete implementations/variations of the external computation solutions.

After the decoupling, we will get the following system (a minimal code sketch of this decomposition follows the list):

  • Consensus — Mainchain's responsibility
  • Data Availability — Sharding's responsibility (on-chain solution)
  • Computation — Rollups' responsibility (off-chain solution)
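Here is that decomposition as a minimal code sketch. The interface names and method signatures are invented to illustrate the decoupling; they are not taken from any real client:

    # Sketch of a Modular blockchain as three independent interfaces.

    from abc import ABC, abstractmethod

    class Consensus(ABC):
        @abstractmethod
        def agree_on_block(self, block: bytes) -> bool:
            """Collective agreement: is this block the next valid one?"""

    class DataAvailability(ABC):
        @abstractmethod
        def publish(self, data: bytes) -> str:
            """Store and propagate data; return a reference (e.g., a hash)."""

    class Computation(ABC):
        @abstractmethod
        def execute(self, state: dict, transactions: list) -> dict:
            """Apply transactions off-chain and return the new state."""

    # Layer 1 keeps Consensus + DataAvailability; Computation can be swapped
    # for any Rollup implementation without touching the other two.

The point is not the code itself but the boundary: each interface can be scaled or replaced independently.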

Data Availability Layer — Sharding

We want to scale the Data Availability Layer with Data Sharding. Sharding means splitting the data into different parts and storing them in separate physical locations. It is a well-known horizontal-scaling practice that has been used in relational database architectures for decades.

Let's explore data Sharding with some real-life analogies.
Let's think about a football stadium with 90 000 seats.
Let's imagine that this stadium has only one entrance with one ticket validator. There would be a lot of traffic in front of the entrance, correct? So the overall throughput is equal to the throughput of this single entrance.

Now think about our current stadiums: they are divided into sections — Section A, Section B, etc. Let's say ten sections, each with 10 000 seats and a separate entrance.
So if a user bought a ticket in Section B, they would go to the Section B entrance.
The ticket validator will validate only tickets for Section B and nothing more. The validator will not even have any information about other sections.

Basically, the stadium is Sharded. The seats/records are separated across sections/shards, and each section/shard has its own gate(s)/validator(s).

In general, if the gate/validator is not a bottleneck any more, you can even add more seats/records to the stadium/system, meaning that sharding lets you increase the total storage capacity.

After Sharding, the overall throughput will equal the sum of all entrance throughputs. There are a lot of Sharding techniques. Some of them are complex, some are not, but the idea behind all of them is the same.

Sharding splits the records and stores them in the Blockchain separately from each other. The Mainchain keeps track of its shards and acts as a coordinator. Each shard has its own validator set. Validators are responsible for validating and propagating the records inside a shard.

Let's overview Ethereum's initial version of the Data Sharding approach.
Ethereum planned to spread its data layer across 64 shards. Each shard would have its own set of validators, responsible for validating only their assigned shard. For security, these validators would be shuffled from time to time and reassigned to specific shard committees.
The main chain (Beacon Chain) would keep links to the shards so it knows which accounts and related data live on which shard. Applications would store data on a shard, not on the main chain (the Beacon Chain).
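To make the idea concrete, here is a toy sketch of shard assignment and validator shuffling. The 64-shard count comes from the plan above; everything else (the hashing, the per-epoch reshuffle seeded by the epoch number) is an illustrative assumption, not the real protocol:

    # Toy sharding sketch: deterministic record-to-shard assignment plus
    # periodic shuffling of validators across shard committees.

    import hashlib
    import random

    NUM_SHARDS = 64  # Ethereum's initially planned shard count

    def shard_for(account: str) -> int:
        """Assign an account/record to a shard deterministically."""
        digest = hashlib.sha256(account.encode()).digest()
        return int.from_bytes(digest[:8], "big") % NUM_SHARDS

    def shuffle_validators(validators: list, epoch: int) -> dict:
        """Reassign validators to shard committees each epoch (illustrative seed)."""
        rng = random.Random(epoch)  # real designs derive randomness from the Beacon Chain
        shuffled = validators[:]
        rng.shuffle(shuffled)
        committees = {shard: [] for shard in range(NUM_SHARDS)}
        for i, validator in enumerate(shuffled):
            committees[i % NUM_SHARDS].append(validator)
        return committees

    print(shard_for("alice"))  # the same account always lands on the same shard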

As we've said, there are a lot of Sharding techniques. The more complex solutions need more sophisticated architectures. Changing an existing platform on the go without breaking anything is extremely hard, even in centralized systems. In distributed open systems, it's even harder. It is far easier to start from scratch and build a new Blockchain that is sharded from day one than to migrate a live, non-sharded Blockchain to a sharded one. That's why Ethereum struggled with the Sharding update and decided to temporarily abandon this approach and focus on a more simplified version of Sharding. It's a sign that the Ethereum Foundation is maturing as a product team with a customer-centric approach rather than a platform-first one.

We'll talk about Ethereum's new approach to Sharding in the upcoming sections. Stay tuned.

Computation Layer — Rollup

We want to decouple the computation responsibility from the Mainchain by introducing Rollups.
What is a Rollup?
Let's get back to our stadium.
Let's say you are going to a stadium with your five friends. You bought tickets in Section B, you are in the queue at the entrance, and the ticket validator scans your tickets with a mobile phone and validates that they are authentic. The validator has to validate each ticket independently, one by one.

Currently, we have the same solution in Monolithic blockchains. Layer 1 (every participant node) processes every transaction, one by one. If the transaction is valid, then we get the new global state. The transaction itself and the new global state are propagated through the network.

Typical Monolith Blockchain

Now, let's imagine that you bought tickets for your friends (6 tickets), and this batch of tickets has one QR Code. In this case, the validator scans grouped tickets and validates the batch purchase with one operation. If the QR code is valid, you are ready to enter the stadium without additional checks.

Rollups do the same thing — they Roll-Up or batch a lot of transactions into one piece and send it as a single transaction. The Mainchain has to check only this Rolled-Up transaction. This approach saves time and complexity on the work that needs to be done on the Mainchain by moving complex computations off-chain.

A Rollup is a scaling technology that operates on top of Layer 1. A Rollup computes/executes and stores transactions independently from the Mainchain, but sends the batched data in one transaction to the underlying Layer 1 to propagate the output of these computations, i.e., the new rollup state.

A Rollup communicates with Layer 1 using a Rollup smart contract. The Rollup smart contract resides on the Mainchain and holds the inner state of the Rollup, i.e., accounts and balances, alongside the compressed transactions that resulted in this internal state. Every time a Rollup makes off-chain computations and batches some data, it sends them to the Rollup smart contract. The Rollup smart contract checks whether the proposed new Rollup state is valid and propagates it through the network. In simple terms, the Rollup smart contract is an entry point and a bridge between the Mainchain and the Rollup. The smart contract's primary responsibilities are (see the sketch after this list):

  • Store the Rollup inner state
  • Store compressed transaction data
  • Manage light computations/disputes (more on this later)
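Here is a minimal sketch of such a contract's bookkeeping, written in Python rather than Solidity purely for readability. The field names and the shape of a batch are assumptions made for the example:

    # Toy model of a Rollup smart contract living on the Mainchain: it stores the
    # rollup's state commitment and the compressed batch data, and keeps batches
    # around so that (in the optimistic case) they can still be disputed.

    from dataclasses import dataclass, field

    @dataclass
    class Batch:
        new_state_root: str    # commitment to the rollup's inner state after the batch
        compressed_txs: bytes  # the rolled-up transaction data
        disputed: bool = False

    @dataclass
    class RollupContract:
        state_root: str = "genesis"
        batches: list = field(default_factory=list)

        def submit_batch(self, new_state_root: str, compressed_txs: bytes) -> None:
            """Entry point used by the rollup to post a new batch."""
            self.batches.append(Batch(new_state_root, compressed_txs))
            self.state_root = new_state_root

        def dispute(self, batch_index: int) -> None:
            """Light dispute hook: mark a batch as challenged (more on this later)."""
            self.batches[batch_index].disputed = True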

Currently, there is research and development on two types of Rollups, Optimistic Rollup and Zero-Knowledge Rollup, and both of them share the same core idea.

The main question is: if we move the computation subcomponent off-chain, how can Layer 1 be sure that the Rollup validates transactions honestly? Is the batched data trustworthy?

The previously mentioned rollups solve this problem in different ways:

  • Optimistic Rollup proves fraud (fraud-proof)
  • ZK-Rollup proves correctness (validity-proof)

Let's explore these rollups in detail.

Optimistic Rollup
In the Optimistic Rollup world, the Rollup validates transactions, batches/compresses them, rolls them up, and sends them to the underlying Layer 1 in one transaction. The Mainchain assumes that this batch is valid by default. That's why this type of Rollup is called Optimistic.

What happens if the Rollup node that published the batched data misbehaves and publishes incorrect/invalid data?
Every participant node can check the data submitted by a Rollup. If some node thinks it's not correct, the node will raise the alarm, meaning it will create a cryptographic element called a fraud-proof. By providing a fraud-proof, the challenger node says: I checked the validity of the batched data, checked the proposed new state, and it is incorrect.

In this case, the smart contract, which resides on the Mainchain, will replay the batched operations provided by the Optimistic Rollup to determine the correct state. If the challenger node is right, it will receive some of the stake, and the node that published the data will lose its stake. Besides that, the contract will discard the challenged batch and every batch after it.

If the challenger node submitted a false fraud-proof, it would lose its own stake. This mechanism discourages fake challenges.
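In the same toy style, here is a sketch of that dispute resolution. The names (`challenger`, and `sequencer` for the node that published the batch), the replay helper, and the stake handling are all illustrative assumptions:

    # Toy fraud-proof resolution: the contract replays the challenged batch and
    # slashes whichever side turns out to be wrong.

    import hashlib
    import json

    def state_root(state: dict) -> str:
        """Stand-in for a real Merkle state root."""
        return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

    def resolve_fraud_proof(claimed_root, compressed_txs, prev_state, replay,
                            challenger, sequencer, stakes):
        correct_state = replay(prev_state, compressed_txs)  # the contract re-executes the batch
        if state_root(correct_state) != claimed_root:
            # Challenger was right: the publisher loses its stake, the challenger gains it,
            # and the bad batch (plus every batch after it) is discarded.
            stakes[challenger] += stakes[sequencer]
            stakes[sequencer] = 0
            return "batch reverted"
        # False alarm: the challenger loses its stake, which discourages fake challenges.
        stakes[sequencer] += stakes[challenger]
        stakes[challenger] = 0
        return "batch stands"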

ZK-Rollup

In the ZK-Rollup world, the Rollup validates transactions, batches/compresses them, rolls them up, performs very complex computations to generate a validity-proof of this batched data, and sends it to the underlying Layer 1. The Mainchain checks the provided validity-proof, and if it's correct, it propagates the new rollup state.

Let's explore the validity-proof. As the name suggests, a validity-proof is an element that proves the validity of some piece of information. There are different ways to generate a validity-proof. One of them is a SNARK.

A SNARK is a cryptographic proof that allows you to prove that you've honestly performed a computation over some data and got some output. If a node provides a SNARK, a validator can quickly check it and know that the provided data is trustworthy. Creating a valid SNARK from invalid data or invalid computations is practically impossible. That's why Layer 1 only checks the SNARK; the Mainchain doesn't have to re-run the transactions.

It's computationally hard and time-consuming to compute a SNARK proof but very easy to check its validity. As a result, ZK-Rollups are very efficient: the heavy lifting is done off-chain in the Rollup, and the easy part, the verification, is done in the smart contract on the Mainchain.
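Conceptually, the division of labor looks like the sketch below. `generate_snark` and `verify_snark` are placeholders passed in as parameters, standing in for a real proving system; no actual cryptography is implemented or named here:

    # Conceptual flow only: the proving system itself is abstracted away.

    def rollup_side(prev_state, transactions, replay, generate_snark):
        """Off-chain and expensive: execute the batch and produce a validity-proof."""
        new_state = replay(prev_state, transactions)  # the heavy lifting
        proof = generate_snark(statement=(prev_state, new_state), witness=transactions)
        return new_state, proof

    def mainchain_side(prev_state, new_state, proof, verify_snark) -> bool:
        """On-chain and cheap: check the proof; never re-run the transactions."""
        return verify_snark(statement=(prev_state, new_state), proof=proof)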

P.S. Don't get confused by the name ZK (Zero-Knowledge) Rollup. A ZK-Rollup doesn't hide the transactions made in the Rollup from the Mainchain. ZK-Rollups even post the complete set of transactions on the Mainchain. The Rollup publishes these transactions to give other participant nodes the ability to recreate the rollup's inner state if they need to.

Rollup Finality

Let's stress a crucial difference between Optimistic and ZK-Rollups.

In the case of ZK-Rollups, the node submitting new operations has to prove the validity of the batched data by performing some complex computations on its side. So there is no need to wait for another participant to provide proof that the submitter is lying.

In the case of Optimistic Rollups, the rollup node provides only the batched data. So we need other nodes (validators) to come up with the fraud-proof and accuse the owner of the data of misbehavior. Besides that, we have to give these validators some time to examine the Optimistic Rollup's submitted batched operations and raise a dispute. We have to ensure that some nodes are actively monitoring the rollup network. If there are no validator nodes, the Optimistic Rollup will not work.

That's a big difference here, and we have to underline it. With this feature, ZK-Rollup provides one significant advantage compared to Optimistic Rollup — finality.

Finality is the assurance or guarantee that transactions cannot be altered, reversed, or canceled after they are completed.

In technical terms, the finalized transaction is a transaction that is included in some block that is already added to the existing Blockchain, and there is a very low probability that this block will be revoked/reverted.

ZK-Rollups provide instant finality, in contrast with the 7–10 days in the case of Optimistic Rollups. For a casual user, it means that if you are interacting with a DApp that uses some sort of Optimistic Rollup (Optimism, Arbitrum, etc.), you have to wait 7–10 days after the operation to be sure that it is confirmed and will not be reverted.

On the other hand, if you are interacting with a DApp that uses some sort of ZK-Rollup (zkSync, Loopring, etc.), your transaction is confirmed nearly instantly.

P.S. To mitigate this disadvantage, some Optimistic Rollup nodes provide certain guarantees to the Mainchain that their batched operations are valid and ask for immediate confirmation, but that's a topic for another article.
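To put the difference in plain terms, here is a tiny sketch of when a batched operation can be treated as final, using a 7-day challenge window as an assumed value (real rollups configure their own):

    from datetime import datetime, timedelta

    CHALLENGE_WINDOW = timedelta(days=7)  # assumed; the article cites 7–10 days

    def finality_time(submitted_at: datetime, rollup_type: str) -> datetime:
        """When can the Mainchain treat a batched operation as final?"""
        if rollup_type == "zk":
            return submitted_at                     # final once the validity-proof is verified
        if rollup_type == "optimistic":
            return submitted_at + CHALLENGE_WINDOW  # must survive the dispute window first
        raise ValueError("unknown rollup type")

    submitted = datetime(2022, 9, 1, 12, 0)
    print(finality_time(submitted, "zk"))          # 2022-09-01 12:00:00
    print(finality_time(submitted, "optimistic"))  # 2022-09-08 12:00:00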

Rollups + Data Shards (RADs)

We made an overview of Data Sharding and Rollups.
Data Sharding scales the Data Availability Layer: it increases storage and throughput without increasing fees or the centralization of the Blockchain. With Data Sharding, participant nodes will have to store only a portion of the Blockchain data, thus lowering the hardware requirements of a node.

Rollups scale the Computation Layer. They increase the throughput and save storage on the Mainchain. In addition, the transaction fee decreases dramatically thanks to batching.

Let's explore one more scaling technique, which is pursued by the Ethereum Foundation — Danksharding.

"Danksharding"

Danksharding is a term coined within Ethereum, and it originates from one of its research scientists, Dankrad Feist, who proposed a simplified version of Sharding.

Some users will state that it's not even Sharding. It depends on which side of maximalism you are on.

In my opinion, Danksharding is Ethereum's acknowledgment that it will be nearly impossible to turn Ethereum into the perfect platform even in 3–5 years. So Danksharding is a way to make things better, not perfect.

nearly every motivational quote on the web

P.S. The Ethereum Foundation chose to introduce Danksharding in several steps. Version 1 of Danksharding is called "Proto-Danksharding." This name is likewise derived from one of the researchers, nicknamed Protolambda. Proto-Danksharding is just the first stage of Danksharding.

Danksharding will give off-chain solutions the ability to increase their performance without increasing the gas fee. Danksharding provides additional cheap storage room on Layer 1.

Danksharding has the following idea: each block on the Mainchain will have additional cheap BLOB storage. The computation layer of the Mainchain will completely ignore BLOB storage and will not try to validate its content. On the other hand, the Data Availability Layer will propagate it through the network. Applications will be able to store anything they wish in this area.

Beacon Chain (Layer 1 chain in Ethereum) will be responsible for the Consensus and Data Availability, but not the Computation of the BLOB storage content.
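A toy model of the idea: a block that carries blob data which consensus commits to but execution never touches. The field names are invented for the example, and a real design would use KZG polynomial commitments rather than plain hashes:

    # Toy Layer 1 block with cheap BLOB storage attached. The execution layer
    # validates only the regular transactions; blob contents are merely committed
    # to and propagated, never executed.

    from dataclasses import dataclass, field
    import hashlib

    @dataclass
    class BeaconBlock:
        transactions: list                         # validated/executed by the Computation layer
        blobs: list = field(default_factory=list)  # ignored by execution, only made available

        def blob_commitments(self) -> list:
            """What consensus signs off on: a fingerprint of each blob, not the blob itself."""
            return [hashlib.sha256(blob).hexdigest() for blob in self.blobs]

    block = BeaconBlock(transactions=[b"tx-1", b"tx-2"],
                        blobs=[b"rollup batch #42: compressed transactions ..."])
    print(block.blob_commitments())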

Theoretically, we can call it one giant shard, which will reside in the Layer 1 blocks (Beacon blocks). Accordingly, all blockchain participants will have to download and store it. So basically, the initial version of this scaling technique is vertical scaling of the data storage, not horizontal.

The Ethereum Foundation decided to introduce Danksharding in several steps. The idea is to make these changes as seamless and painless as possible. Of course, simplicity has some tradeoffs. In later updates, Danksharding will gain more features of traditional Sharding, like true separation of the data and splitting it across different validator networks, as we talked about in the Data Sharding section. Still, for now, this is a place to start.

With this update, the Data and Computation layers will be decoupled but composable. Some solutions will use only the new BLOB storage, others will use only Rollups, and some will benefit from both. That's the idea of Modular systems.

Historical Data and State Data

What about participant nodes and their storage requirements? As we've said, increasing the Block size makes the platform more centralized.

In the case of BLOB storage, participant nodes will store it for 1–3 months. After that, the Mainchain will delete it. This gives other participants enough time to get the data if needed (to raise a dispute in the case of an Optimistic Rollup, or to recreate a Rollup's inner state). The Blockchain will act as a public bulletin board, guaranteeing that the data is available to every participant for the required time.
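A small sketch of that retention policy, with a 90-day window picked as an assumed value inside the article's 1–3 month range:

    from datetime import datetime, timedelta

    RETENTION = timedelta(days=90)  # assumed value

    def prune_blobs(blobs: dict, now: datetime) -> dict:
        """Keep only the blobs that are still inside the retention window."""
        return {ref: published_at for ref, published_at in blobs.items()
                if now - published_at <= RETENTION}

    blobs = {"batch-41": datetime(2022, 5, 1), "batch-42": datetime(2022, 8, 20)}
    print(prune_blobs(blobs, now=datetime(2022, 9, 1)))  # only "batch-42" survives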

What about Historical data?

Even without this new proposal, Blockchains will not be able to store all historical data for 50–100 years. It's simply not practical to do that.

All high-transactional centralized enterprises like Amazon, Facebook, Google Drive, and Visa have policies about data retention.

They keep the last two to three years of data in fast-accessible storage and move everything else to archive storage, which is still available, but at a slower speed or higher cost.

Currently, blockchains don't have this type of problem because adoption is still low, but we will reach a point when some historical data is no longer stored by any of the client nodes and is no longer accessible.

The solutions will probably be separate decentralized or semi-decentralized projects that will fill this niche and provide a complete archive of the selected Blockchains. DApp smart contracts will connect to these resources, pay some fee, and query historical data.

We have to underline here that we are talking about the historical data, not the current state data. Of course, Mainchain will permanently store the current global state. However, the Mainchain will not keep every operation, every changeset that resulted in the current global state. So, NO, it will not forget that you own some NFT!

Challenges

There are a lot of challenges in the proposed solution. Decoupled systems scale perfectly but introduce some other issues. One of the main issues is interoperability.

How can we communicate between different Rollups or Shards? Do we have to exchange messages through Layer 1, or can we do it without Layer 1? What about the finalization of bridged operations? There are a lot of projects that try to create bridges between Blockchains or Rollups. All of the latest big hacks in the Crypto industry took place in bridging protocols. Bridging is a significant aspect of the Crypto industry in its own right. One thing we can say for sure is that without bridging, a modular ecosystem will not be able to operate. We will address these topics in future articles.

Outro
So, what have we done?
How have we decoupled our Monolithic Blockchain?

First,
We added shards and horizontally scaled the data storage.
Second,
We took away the computation from Layer 1 and brought it into the hands of a Layer 2 Rollup.

Rollups and other Layer 2 solutions are data-hungry, and they will benefit from the additional storage provided by some or another form of Sharding.

As a result,

Layer 1 got lighter, and the information stored on it got smaller.

P.S. There are already proposals in R&D that move the Data Availability responsibility away from Layer 1 and handle it separately, off-chain, but Rollups and Sharding techniques are more production-ready than off-chain data-availability solutions. More on that in upcoming articles.

Thanks to George Kapanadze for the review
