Evolving Topics in Data Availability for the Ethereum Ecosystem

Takens Theorem
Published in Etherscan Blog
7 min read · Feb 7, 2024


Crypto news, podcasts, blog posts and 𝕏 threads are awash in “Dencun” — the forthcoming Ethereum upgrade that will probably land next month. It is already live on some testnets, where the upgrades have been clicking along without major issues. The most significant part of this upgrade is EIP-4844, which creates a new transaction type that carries data payloads called blobs, stored on Ethereum’s consensus-layer Beacon chain. Blobs will help the operation of Ethereum’s second layers (L2s) by reducing the fees they pay to store data. They are also a major contribution to the evolving topic of data availability. We discussed EIP-4844 in a post last summer, observing that this upgrade is an important step in Ethereum’s “new data economy.”

What is Data Availability?

Explanations abound for the concept, but let’s revisit the basics. In the simplest terms, data availability means ensuring access to transaction history as a ledger grows. Indeed, data availability is typically “solved” in the simplest way: install an Ethereum client and make sure you have enough disk capacity to sync its full history.

My favorite, stark statement of the data problem from Nic Carter years ago.

This is a useful first way to think about data availability. But the topic leads to a lot of exciting technical and socioeconomic detail. For one, it is costly to sync and store data forever. Slow syncing also restricts participation in consensus. And storing availability data inside the execution environment can congest that same environment.

So data availability can be considered a more abstract set of questions: What services can offload data storage but ensure liveness of a ledger? How can we decentralize and speed up access to historical chain data? How do we give cryptographic assurances of its truth? Relatedly, how do we avoid attacks on that history? How can we better incentivize and reward participation in data preservation?

Data availability really refers to this rich array of technical, security and economic questions associated with ensuring the availability of information needed to preserve a ledger.

EIP-4844 in the Dencun upgrade addresses this issue because most Ethereum second layers (L2s) currently use mainnet calldata (L1) for data availability. By doing so, they “inherit the security” of L1, as it is sometimes said. And as L2 activity has grown, this approach has contributed to an increase in Ethereum’s base fee, because the data is being squeezed into the mainnet environment. At the time of this writing, several L2 platforms appear on Etherscan’s “gas guzzlers” list, illustrating the rising mainnet consumption for this purpose:

Linea, zkSync and Arbitrum consuming over 6% of gas together

This concern is even clearer in gas expenditure itself. You can see what L2s and L2-related scaling solutions are spending; these costs are passed onto L2 users, making both L1 and L2 slower and more expensive to use:

Here: zkSync, Linea, Arbitrum, Base, Optimism, Scroll, Layer Zero

One of the goals of EIP-4844 is to make data availability cheaper for L2s. To do this, Dencun introduces a new transaction type that commits blobs of data, stored temporarily by the consensus (Beacon chain) clients. Blob capacity, cost and expiry period should, in principle, be sufficient for the cryptographic needs of L2s (e.g., optimistic rollups need about a week of data during a challenge period). So after Dencun, L2s can shift from more expensive mainnet calldata to the (theoretically cheaper) blob data market on the Beacon chain.
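To make this concrete, here is a minimal sketch (in Python, following the EIP-4844 specification) of how a blob-carrying transaction references its blobs: the execution layer never stores blob contents, only 32-byte “versioned hashes” of their KZG commitments. The commitment below is a zero-filled placeholder for illustration, not a real KZG commitment.

```python
import hashlib

# Per EIP-4844: a versioned hash is one version byte plus sha256(commitment)[1:].
VERSIONED_HASH_VERSION_KZG = b"\x01"

def kzg_to_versioned_hash(kzg_commitment: bytes) -> bytes:
    """32-byte reference to a blob, carried in a type-3 (blob) transaction.

    The blob itself lives on the consensus layer; the execution layer
    sees only this commitment hash in the blob_versioned_hashes field.
    """
    return VERSIONED_HASH_VERSION_KZG + hashlib.sha256(kzg_commitment).digest()[1:]

# Placeholder: a real KZG commitment is a 48-byte BLS G1 point.
fake_commitment = b"\x00" * 48
vh = kzg_to_versioned_hash(fake_commitment)
print(len(vh), vh[:1].hex())  # 32 bytes, version byte 0x01
```

Alongside `blob_versioned_hashes`, the new transaction type also adds a `max_fee_per_blob_gas` field, since blobs are priced in their own fee market.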

Let’s consider a few main topics being discussed in the context of the Dencun upgrade and data availability.

Topic 1: Effect on Mainnet Fees

The blobs introduced by EIP-4844 are temporary and stored separately on the Beacon chain. They live for only about 18 days (4,096 epochs) before consensus clients prune them. Blob fees use the same kind of demand-based structure as the very successful fee market now live on Ethereum mainnet (EIP-1559), but in a separate market with its own base fee. Blobs address the data availability problem for L2s in a way that may reduce fees, because they offset calldata expenses on mainnet.
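The demand-based pricing works much like EIP-1559, but via an exponential update rule over “excess blob gas.” Here is a rough sketch using the constants from the EIP-4844 specification; treat it as illustrative, not as a client implementation:

```python
# Blob fee constants from the EIP-4844 spec (as of the Dencun upgrade).
MIN_BLOB_GASPRICE = 1                          # floor price, in wei
BLOB_GASPRICE_UPDATE_FRACTION = 3338477        # controls fee-change speed
GAS_PER_BLOB = 2**17                           # 131,072 blob gas per blob
TARGET_BLOB_GAS_PER_BLOCK = 3 * GAS_PER_BLOB   # target of 3 blobs per block

def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer Taylor-series approximation of factor * e**(numerator/denominator)."""
    i, output, numerator_accum = 1, 0, factor * denominator
    while numerator_accum > 0:
        output += numerator_accum
        numerator_accum = (numerator_accum * numerator) // (denominator * i)
        i += 1
    return output // denominator

def blob_gasprice(excess_blob_gas: int) -> int:
    """Blob base fee, given accumulated blob gas above the per-block target."""
    return fake_exponential(MIN_BLOB_GASPRICE, excess_blob_gas,
                            BLOB_GASPRICE_UPDATE_FRACTION)

print(blob_gasprice(0))                              # 1 wei with no excess demand
print(blob_gasprice(10 * TARGET_BLOB_GAS_PER_BLOCK))  # rises as excess accumulates
```

Because the excess counter only grows when blocks exceed the three-blob target, sustained demand drives the blob fee up exponentially, and slack demand lets it decay back toward the 1-wei floor.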

Naturally, one major question is how the base fee on mainnet will be affected by the availability of blobs. L2s that use Ethereum mainnet for data availability may enjoy savings of well over 10x by using blobs together with data compression schemes. These savings should translate into a significant reduction in cost for users transacting on L2s.
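A back-of-envelope comparison shows where such savings could come from. The fee levels below (30 gwei on L1, 1 gwei for blob gas) are purely assumed for illustration; real savings depend on market conditions and on the mix of zero and nonzero calldata bytes:

```python
# Illustrative cost comparison: posting one blob's worth of data (128 KiB)
# as L1 calldata vs. as a blob. Assumed prices, not predictions.
CALLDATA_GAS_PER_BYTE = 16     # nonzero calldata byte cost (EIP-2028)
BLOB_SIZE_BYTES = 131072       # one blob; blob gas is 1 per byte

l1_base_fee_gwei = 30          # assumed L1 execution base fee
blob_fee_gwei = 1              # assumed blob base fee

calldata_cost_gwei = BLOB_SIZE_BYTES * CALLDATA_GAS_PER_BYTE * l1_base_fee_gwei
blob_cost_gwei = BLOB_SIZE_BYTES * blob_fee_gwei

print(f"calldata: {calldata_cost_gwei / 1e9:.4f} ETH")
print(f"blob:     {blob_cost_gwei / 1e9:.6f} ETH")
print(f"savings:  ~{calldata_cost_gwei / blob_cost_gwei:.0f}x")
```

Under these assumed prices the gap is roughly 480x, and it widens further if blob demand stays below target and the blob fee sits near its 1-wei floor.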

General users of mainnet may benefit from this offset, too. The gas guzzler data above suggests that somewhere between 5% and 10% of gas is associated with L2s using L1 for data availability. A correlated drop in the base fee will likely accompany the shift of that activity to blobs.

Rising fees spent for L2 Optimism’s batcher; after EIP-4844, this fee expense will vanish

EIP-4844 and blobs are just one part of this solution though. Several platforms are emerging to address this problem and contribute to data availability in a cross-chain fashion.

Topic 2: Competition in Data Availability Platforms

There are numerous platforms under development to supply data assurances. Many are blockchain-agnostic in that they can serve as a general data solution for any ledger. Because the goal is to decouple data and other functions from L1, you can think of the overall strategy here as one of modularizing components of a blockchain: settlement, execution, data, etc. Celestia is perhaps the most prominent early platform offering generic, cross-chain solutions for modularization.

But there are many related data availability platforms, including Avail, EigenDA, NEAR DA, KYVE and others likely in development.

In a recent post about data availability, Bridget Harris cleverly refers to this development as “cheap [data availability]” that “will inevitably fuel a Cambrian explosion of new custom rollup chains.”

This has already led to some emerging competition among platforms. Anthony Sassano, on a Daily Gwei Refuel episode, described the competition as a “race to zero,” suggesting the stable endpoint for these solutions is simply to reduce data costs to near zero. An example chart shared widely on 𝕏, shown below, illustrates this. The chart comes from the NEAR data availability project (“NEAR DA”), and it suggests the primary axis of competition is shrinking costs toward zero.

From NEAR DA project announcement & summary

Ethereum scaling solutions like L2s could make use of these services exclusively or in addition to blobs. For example, several projects have stated commitment to data availability using EigenDA, summarized here, including Celo which is moving from being a standalone L1 to becoming an L2 in the Ethereum ecosystem.

Topic 3: Blobs as Speculative Platform?

The third and final topic is a less discussed one. But its potential significance justifies inclusion: Will there be playful speculation using the temporary blob storage? A new source for “ethscriptions”? This interesting concern was also raised on Anthony Sassano’s The Daily Gwei Refuel recently. He discussed how the blob market may be spammed for various reasons, likening it to a new avenue for speculation. I wonder if a temporary NFT in the form of a “blobscription” may intrigue users (side note: this is a term in use already, apparently). Anthony contemplated how the 18-day expiry interval for blob data may deter such usage. But I also wonder: Can temporary data pose curious challenges for creative new projects, such as interacting within an 18-day period to sustain and preserve an asset? There is artistic precedent for such fun on-chain projects, such as Sarah Friend’s lifeforms.

Friend’s lifeforms require maintenance for survival

One could imagine the blob fee market for L2s being consumed by such playfulness, in a manner similar to the rise of Bitcoin ordinals and its accompanying debates. Blobs could be a new canvas, their evanescence held at a premium: “I owned that for those 18 days!” Crypto denizens will voraciously consume any chain resource that can be tokenized and stylized as a financial product. If this occurs, the third-party modular data availability solutions may play an important role here: these platforms may be better optimized for, and more resilient to, such activity, because they can use various strategies to segment and process data for specialized purposes. It will be exciting to see how it all unfolds.

Sample blob from Sepolia testnet: What else will blob data encode?

Further Reading

  • The Daily Gwei’s several recent episodes are excellent for summarizing and linking to Dencun discussion (with tons of linked material). Highly recommended.
  • A very detailed technical post from Scroll on EIP-4844 and data availability. The focus is on Scroll’s own platform, but there is lots of other detail.
  • EigenDA also has a nice technical summary of advances.
  • Check out L2Beat for fantastic related data on various L2s (we featured L2Beat in a blog post some time ago).
  • A great intro and survey by L2IV Research.
  • Fantastic recent survey of data availability layers by Bridget Harris.
  • Fantastic, authoritative technical detail from Domothy.

