Celestia (Data Availability as a Service) — Blockchain Roadmap

Burak Tahtacıoğlu
11 min read · Feb 11, 2022


Photo by Norbert Kowalczyk on Unsplash

Blockchains are distributed networks that perform State Machine Replication (SMR). SMR has three basic phases: Data, Consensus, and Execution.

Data is the transmission and storage of the commands users submit to the network.
Consensus is the ordering of these commands.
Execution is the execution of the ordered commands.
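The three phases can be sketched as a tiny pipeline. All names and the toy "command" format here are illustrative, not the data structures of any real protocol:

```python
# Minimal state machine replication sketch: Data -> Consensus -> Execution.
# The command format and ordering rule are hypothetical, for illustration only.

def collect_data(submitted):
    """Data phase: gather the raw commands users submitted to the network."""
    return list(submitted)

def order_commands(commands):
    """Consensus phase: agree on one canonical ordering.
    Here we simply sort deterministically by a (sender, nonce) key."""
    return sorted(commands, key=lambda c: (c["sender"], c["nonce"]))

def execute(state, commands):
    """Execution phase: apply the ordered commands to the state."""
    for c in commands:
        state[c["sender"]] = state.get(c["sender"], 0) + c["amount"]
    return state

# Users submit commands in arbitrary order...
mempool = [
    {"sender": "bob", "nonce": 0, "amount": 5},
    {"sender": "alice", "nonce": 1, "amount": 2},
    {"sender": "alice", "nonce": 0, "amount": 3},
]
ordered = order_commands(collect_data(mempool))
final_state = execute({}, ordered)
print(final_state)  # {'alice': 5, 'bob': 5}
```

Note that every replica that runs the same three phases on the same inputs ends in the same state, which is the whole point of SMR.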

The main problem in creating a native currency for the Internet was building a Consensus system that cannot be stopped by outside intervention. The solution, proposed by Satoshi Nakamoto, was Nakamoto Consensus, which people from all over the world can contribute to and maintain.

In SMR, Consensus is strictly required only for ordering data. Looking at Bitcoin's design, however, Consensus is given one more responsibility: eliminating invalid commands and ordering only the valid ones.

In SMR systems, it is a design choice whether to perform Execution during Consensus or to defer Execution until after Consensus. Placing this load on Consensus invites many attack types and capacity problems stemming from the Verifier's Dilemma. In short, the Verifier's Dilemma is that the cost of verification outweighs the reward for verifying. As a result, nodes that should verify may fail to do so, or attackers may profit by keeping validators unnecessarily busy with this verification requirement.

Despite these disadvantages, Satoshi Nakamoto's main reason for this choice was to ensure that users can use Bitcoin securely without running a Bitcoin Full Node. This technique, called Simplified Payment Verification (SPV), is the most important foundation of Bitcoin's design. Contrary to how it is popularly understood today, Bitcoin is not merely a system whose defining feature is a fixed supply.
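The core of SPV is checking a Merkle branch against the transaction root stored in a block header, without downloading the full block. A simplified sketch follows; real Bitcoin uses double SHA-256 and a specific serialization, which this toy version does not reproduce:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Build a Merkle root; duplicate the last node on odd-sized levels."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Collect the sibling hashes an SPV client needs for leaf `index`."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, is_left_sibling)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def spv_verify(leaf, proof, root):
    """SPV check: recompute the root from a leaf and its Merkle branch."""
    node = h(leaf)
    for sibling, is_left in proof:
        node = h(sibling + node) if is_left else h(node + sibling)
    return node == root

txs = [b"tx-a", b"tx-b", b"tx-c", b"tx-d"]
root = merkle_root(txs)       # all an SPV client stores is in the header
proof = merkle_proof(txs, 2)  # a full node supplies the branch for tx-c
print(spv_verify(b"tx-c", proof, root))  # True
print(spv_verify(b"tx-x", proof, root))  # False
```

The asymmetry the article describes is visible here: the full node holds all transactions, while the SPV client stores one 32-byte root and verifies with a logarithmic number of hashes.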

Satoshi Nakamoto made two main assumptions about Bitcoin:
Because of game-theoretic incentives, the majority of miners will always behave in a way that benefits the network.

Users will be able to use SPV securely, since the majority of miners act in the network's interest. Block production, meanwhile, will over time become the work of large miners specialized in it.

By loading Execution onto Consensus in this way, Satoshi Nakamoto wanted to create an asymmetry between the burden on users and the burden on service providers. While it puts an extra load on the miners providing the service, it lifts a significant load off users. In this way, as miners' hardware improved, Bitcoin would be able to rival Visa in capacity.

Satoshi Nakamoto left the development of Bitcoin to Gavin Andresen, and Gavin Andresen passed it on to others. After a while, as miners came to be demonized, the assumption on which Bitcoin was built, that 51% of miners would correctly follow the Bitcoin protocol, was abandoned.

With the collapse of Bitcoin's first foundation, its second foundation, SPV, which needed the first in order to work, was also targeted. With SPV targeted, the asymmetry between service providers and users disappeared as well. Requiring every user to perform Execution made users the bottleneck in scaling, and with users as the limit, scaling by increasing the capacity of service providers was shelved.

In effect, Satoshi Nakamoto's architecture was thrown away entirely, while its remnants, processes that no longer fulfill their original purpose, remain in the design. If the Execution load was added to Consensus only to make SPV possible, then in a design that leaves Nakamoto Consensus untouched, Consensus should not be given the Execution task at all. That way, miners would not have to change their code, splitting the network into two different blockchains, every time the Execution logic changes. Peter Todd expressed this idea at the Scaling Bitcoin conference in 2016 under the name Client-side Validation.

If the 51% assumption, Bitcoin's first foundation, is dropped, can the second foundation, SPV, still be used? In fact, the 51% assumption can never be fully dropped because of the Double Spend attack. But even setting Double Spend aside, the reason for insisting on a Full Node instead of SPV under the 51% assumption is to make sure that every block contains only valid commands. Since SPV users only follow the longest chain, they will not notice an invalid command even if one appears in it. If we want SPV users to notice, an honest user or miner running a Full Node can forward the invalid command to SPV users so they realize there is a faulty transaction in the longest chain. This is exactly what a Fraud Proof is.
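The mechanism can be sketched as follows: a full node replays a block, and if it finds an invalid transaction it hands the light client just that transaction and the state it was applied against, so the light client rejects the block without executing everything. The validity rule here (no overdrafts) is hypothetical, and real fraud proofs additionally commit to intermediate state roots so the disputed pre-state itself can be verified:

```python
# Toy fraud proof: a full node flags the first invalid transaction in a block.

def is_valid(state, tx):
    """Illustrative validity rule: a tx may not overdraw the sender."""
    return state.get(tx["from"], 0) >= tx["amount"]

def apply_tx(state, tx):
    state[tx["from"]] = state.get(tx["from"], 0) - tx["amount"]
    state[tx["to"]] = state.get(tx["to"], 0) + tx["amount"]

def build_fraud_proof(pre_state, block):
    """Full node: replay the block; report the first invalid tx, if any."""
    state = dict(pre_state)
    for i, tx in enumerate(block):
        if not is_valid(state, tx):
            return {"index": i, "tx": tx, "pre_state": dict(state)}
        apply_tx(state, tx)
    return None  # block is fully valid

def light_client_check(proof):
    """SPV user: re-check only the single disputed tx, not the whole block."""
    return not is_valid(proof["pre_state"], proof["tx"])

genesis = {"alice": 10}
block = [
    {"from": "alice", "to": "bob", "amount": 6},
    {"from": "alice", "to": "carol", "amount": 7},  # overdraft: only 4 left
]
proof = build_fraud_proof(genesis, block)
print(proof["index"])             # 1
print(light_client_check(proof))  # True -> reject the block
```

The key property is that verifying a fraud proof costs the light client one transaction's worth of work, regardless of block size.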

This issue seems resolved, but another problem remains: honest nodes must be able to access the blockchain data in order to notify SPV users of a faulty transaction on the chain. Attacking miners with more than 51% of the power may publish only the headers of the blocks they create, which is all that SPV users store, and withhold the rest of the block data. This is called the Data Availability Problem.

Data Availability is basically the question of whether the data can actually be retrieved. Some participants do not want to hold the data themselves, while others hold it and claim it is accessible. If everyone requests random parts of the data, the problem with this method is not the 99.9% of the data that is accessible but the 0.1% that is not. The real danger is that the data is deemed accessible simply because the random samplers happened not to hit the withheld portion.
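This worry can be quantified. If a block is split into n chunks and an attacker withholds just one, a client taking k distinct random samples detects the withholding with probability only k/n; the erasure coding discussed below fixes this by forcing the attacker to withhold a large fraction of the chunks. A quick check of the arithmetic:

```python
# Detection probability for data-withholding under random sampling, in two
# scenarios: (a) 1 of n chunks withheld (no coding), (b) half withheld
# (what a 2x erasure-coded block forces an attacker into).

from fractions import Fraction

def detect_prob(n, withheld, k):
    """P(at least one of k distinct random samples hits a withheld chunk)."""
    miss = Fraction(1)
    available = n - withheld
    for i in range(k):
        miss *= Fraction(available - i, n - i)  # this sample also misses
    return 1 - miss

n, k = 1000, 20
print(float(detect_prob(n, 1, k)))       # (a) only 0.02: sampling alone fails
print(float(detect_prob(n, n // 2, k)))  # (b) ~0.999999: coding makes it work
```

With coding, each extra sample halves the chance of being fooled, so a handful of samples per light client is enough.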

Erasure Coding
The inspiration for solving the problem of a tiny missing piece of data is the Erasure Coding scheme used on CDs. What Erasure Coding provides is exactly that: the data you care about remains recoverable. With Erasure Coding, redundant data is added to the original data, and the expanded whole is divided in such a way that if a certain number of parts can be read, regardless of which parts they are, the entire data can be reconstructed.
Divide the data into 100 parts and write them to the CD. If any 35 of the 100 parts on this CD are readable, the entire data can be recreated.
In a system using Erasure Coding, only a sufficient portion of the data needs to be accessible. Using those accessible parts, the small withheld part can be reconstructed. The disadvantage is that the entire data block must be reconstructed in order to recover even a small piece of it.
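The CD analogy can be made concrete with a toy Reed-Solomon code: treat the k data values as coefficients of a polynomial over a prime field, publish n evaluations of it, and recover the polynomial (hence all the data) from any k of them by Lagrange interpolation. This is a sketch of the principle only, not Celestia's actual 2D scheme, and the tiny field here is for readability:

```python
# Toy Reed-Solomon erasure coding over GF(P): any k of n shares rebuild the data.
P = 257  # small prime field; real systems use much larger fields

def eval_poly(coeffs, x):
    """Evaluate a polynomial (lowest coefficient first) at x, mod P."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % P
    return y

def encode(data, n):
    """Interpret the k data values as coefficients; emit n evaluation shares."""
    return [(x, eval_poly(data, x)) for x in range(1, n + 1)]

def decode(shares, k):
    """Lagrange-interpolate any k shares back to the k coefficients."""
    shares = shares[:k]
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(shares):
        # Build the i-th Lagrange basis polynomial, scaled by yi.
        basis = [1]
        denom = 1
        for j, (xj, _) in enumerate(shares):
            if j == i:
                continue
            # Multiply basis by (x - xj).
            basis = [(-xj * basis[0]) % P] + [
                (basis[t - 1] - xj * basis[t]) % P for t in range(1, len(basis))
            ] + [basis[-1]]
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P
        for t in range(k):
            coeffs[t] = (coeffs[t] + scale * basis[t]) % P
    return coeffs

data = [42, 7, 99]                             # k = 3 original values
shares = encode(data, 6)                       # n = 6 shares on the "CD"
survivors = [shares[1], shares[4], shares[5]]  # any 3 readable shares suffice
print(decode(survivors, 3))                    # [42, 7, 99]
```

It also shows the drawback noted above: `decode` rebuilds the whole polynomial even if you only wanted one of the three values back.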

This scheme is used in systems such as Ethereum and Polkadot to protect and scale the data integrity of their isolated sub-regions. In addition, there are projects such as Polygon Avail and Celestia that provide Data Availability as a service.

Celestia founder Mustafa Al-Bassam sums these ideas up in Fraud and Data Availability Proofs: Maximizing Light Client Security and Scaling Blockchains with Dishonest Majorities, written with contributions from Alberto Sonnino and Vitalik Buterin.

Dishonest Majority: The assumption that the majority of miners are honest, the first foundation of Bitcoin mentioned above, is treated as absent.
Maximizing Light Client Security: The aim is to maximize the security of users running SPV, that is, Light Clients.
Fraud and Data Availability Proofs: This is achieved using Fraud Proofs and Data Availability Proofs.
Scaling Blockchains: Blockchains can be scaled even where the honest majority assumption does not hold.

In this work he combines the principle that Consensus should not interfere in Execution with the Data Availability problem he had been working on, and publishes his new design.

Blockchains are also called Distributed Ledgers. The reason Mustafa Al-Bassam chose the name LazyLedger is that this network does not perform Execution: it is a lazy ledger. Later, its name was changed to Celestia.

Since Celestia does not perform Execution, it has no limitations on that front. Celestia not only separates out Execution but delegates it entirely to others: anyone who wants to can bring their own Execution logic on top of Celestia. Celestia focuses on Consensus and Data Availability and scales those features.

So why would these blockchains use Celestia as their Consensus layer? To answer that, we need the concepts of Shared Security and Inter-Blockchain Communication.

Cosmos IBC and other Trustless Bridges use the same Light Client logic that Bitcoin SPV runs. When two chains want to connect, they run each other's Light Clients. The underlying bet is the same as Bitcoin's: for miners who have invested billions of dollars to profit from attacking Bitcoin, they would need to print at least hundreds of millions of dollars in Bitcoin, have no one notice that much newly created Bitcoin, and cash it out somewhere no Full Node is running.

Inter-chain communication, however, is not so convenient or secure. When multiple chains are connected, the resulting system is only as secure as its weakest chain. When the majority in that weakest chain stops behaving honestly, the honest-majority assumption behind your Light Client connections breaks. Solving the Data Availability Problem that Fraud Proofs require solves these problems. The reason Polkadot Parachains are not merely as secure as the weakest link, unlike Cosmos IBC, but as secure as Polkadot itself, is that Polkadot provides Data Availability to the Parachains. Because Parachain block data is kept accessible by Polkadot at all times, Parachains can interact with each other securely even without an honest majority.

When chains interact using the Data Availability service offered by Celestia, Cosmos is upgraded from IBC-level security to Polkadot-style Shared Security. Celestia is Cosmos' answer to providing Shared Security like Polkadot: one of Celestia's first investors is Cosmos' Interchain Foundation, and Celestia uses the Tendermint infrastructure. So Celestia will be part of the Cosmos ecosystem.

Another example of Shared Security is Ethereum and its Rollups. The claim that Rollups derive their security from Ethereum rests on their dependence on Ethereum for Data Availability. However, Ethereum does not have a Data Availability solution and will not have one for a long time. If it tries to increase data capacity instead, it has to fall back on the honest-majority assumption about validators. In that case there is no escaping the paradigm in which users are the scaling bottleneck, and the only gain from Rollups is offloading Execution.

Since Ethereum cannot provide enough capacity, Rollups that adopt their own data solutions are no longer Rollups, because the security assumption has changed again. For this reason they are no longer called Rollups but go by names such as Plasma or Validium. Implementing Rollups as smart contracts also causes problems, such as being unable to fork independently of Ethereum when something goes wrong.

Chains built on Celestia can hardfork and softfork easily. To enable this, Celestia modified Cosmos' Tendermint infrastructure to create Optimint, so that Cosmos chains can be built that depend on another network for Data Availability. These chains can use Celestia as their Data Availability layer, and by using Ethereum or Cosmos instead, they can even become Rollups running on those chains.

The reason Celestia is so flexible here is that, since networks such as Ethereum and Cosmos cannot offer sufficient data capacity, it wants to offer its data service even to the networks connected to those networks.

Evmos is a Cosmos network that runs the Ethereum Virtual Machine (EVM), so that Ethereum smart contracts can run on Cosmos' Tendermint infrastructure. Evmos brings several benefits to Celestia. The first is that, with an Evmos infrastructure using Optimint instead of Tendermint, any Ethereum-based application can both migrate to its own application-specific network and come under Celestia's security umbrella.

Another product of the Evmos and Celestia partnership is Cevmos: Celestia and Evmos are building a network on a customized Optimint. Its distinguishing feature is that, thanks to EVM support, Ethereum-style Rollups will run on it. In other words, Ethereum Rollups will run on it without any changes, and the data of the whole system will be kept on Celestia. Unlike the Evmos mainnet, Cevmos will provide a structure specialized for Rollups to run on.

Compared to its rivals: Ethereum has abandoned its Homogeneous Sharding plan and returned to a Rollup-centric plan, and Celestia appears to be much further ahead on that path. Given Ethereum's current state, Avalanche already outperforms it, but Ethereum pushes back with Rollups and with the data-layer approach, developed for those Rollups, that is meant to carry it into the future.
Compared to Polkadot, it matters for freedom that Celestia leaves the Execution part to the networks that run on it. But building a Polkadot-like system around Celestia requires more than one team.

Mapping Celestia onto Polkadot: against Substrate, Polkadot's blockchain development framework, stands Optimint, modified from Cosmos' Tendermint. Corresponding to the Polkadot Relay Chain is Cevmos. And in place of Polkadot's Parachains come Rollups from Ethereum.

Compared to Polkadot, Cevmos offers Celestia's Data Availability service through a fee market, which is not as smooth as Polkadot's slot-rental system. Cevmos relies on Optimistic Rollup logic, and the long challenge periods of Optimistic Rollups are a real cost. Polkadot, by contrast, reaches finality immediately through validators assigned to verify the Parachains.

The spread of Zk-Rollups would solve this problem, but the team appears to favor Optimistic Rollups for now. Networks under Celestia can interact securely under Shared Security, but there are currently no standards like Polkadot's. Giving the applications on it freedom and providing a neutral service may lead to somewhat detached relationships. Since its income depends on the fee market, serious fee revenue will not materialize until usage pushes against capacity, and as soon as fee revenue does appear, networks can quickly switch to an alternative or to their own fork.

Despite all this, Celestia is a very well-constructed system. The broad freedom it offers the systems that will run on it, and the size of the ecosystems it addresses, are significant. It is highly modular in the freedom it provides, and it takes its place as the first project to introduce the concept of the modular blockchain.

See you in the next article…
