Hemera BlobArchive for Ethereum: Permanent Data Availability for Layer 2

Sammi Shu
Hemera Protocol
Published May 23, 2024

Hemera is pleased to introduce BlobArchive, built in collaboration with 0G Labs, which makes it possible to store all data blobs created since the introduction of Ethereum’s EIP-4844.

Below, we cover:

  • An introduction to data blobs
  • The need for data blob storage
  • BlobArchive’s architecture

Background

The introduction of blobs in the Dencun hard fork allows L2s to post their transaction data as “data blobs”, significantly reducing gas fees. However, Ethereum nodes store blobs for only ~18 days before they expire and are deleted to save on storage costs. While blobs are not needed to reconstruct Ethereum mainnet transactions, each L2 needs them to reconstruct its own history (e.g. for verification purposes).
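The ~18 day figure follows from the consensus-layer retention window, which is defined in epochs rather than days. A minimal sketch of the arithmetic, assuming the mainnet constants from the Deneb consensus specs (the function name is ours, for illustration only):

```python
# Mainnet consensus-layer constants (Deneb); retention is defined in epochs.
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS = 4096


def blob_retention_days() -> float:
    """Minimum time nodes must serve blob sidecars, converted to days."""
    seconds = (
        MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS * SLOTS_PER_EPOCH * SECONDS_PER_SLOT
    )
    return seconds / 86_400  # seconds per day


print(f"{blob_retention_days():.1f} days")  # 18.2 days
```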

Any L2 that uses Ethereum mainnet as its Data Availability layer needs transaction data to be available when a new full node is set up and joins the L2 network without P2P sync. Existing L2s mainly sync transaction history from the L2 network itself, which carries centralization risk (e.g. in the rare case that peer nodes are corrupted and transaction history is lost).

Thus, the Ethereum ecosystem needs an alternative form of blob preservation to ensure that L2s can access blob data whenever needed, which in turn upholds the Data Availability guarantee. As discussed in Paradigm’s recent blog post, a robust blob storage infrastructure would also apply to L1 history storage, providing a viable answer in the future to the question raised by EIP-4444.

Architectural design

There are two main approaches for preserving blobs off-chain permanently.

The first is to use P2P networks such as torrents, which can be designed to periodically package blobs and share them as public torrent files for node clients to access. The advantage of this option is that torrents are a widely adopted open standard with a mature ecosystem and tooling support.

The second is to combine cloud hosts with decentralized storage to create a sufficiently decentralized solution with high-performance guarantees. Cloud hosts allow highly performant and affordable services to be built. To avoid centralization risk, one or more decentralized storage solutions can be added to provide a decentralization guarantee.

We chose the second approach, although future implementations could include a Torrent network as an additional option to maximize the likelihood of preservation.

This approach has the following design advantages:

  1. High performance: history can be synced quickly.
  2. Permanent storage: blob data is proactively backed up to decentralized storage providers.
  3. Easy verification: access is open, and anyone can verify the archive’s validity by downloading the full blob data and checking that the respective KZG commitment is correct.
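To make the verification step more concrete, here is a minimal sketch of one part of it: checking that a KZG commitment matches the versioned hash recorded in the blob transaction, as defined in EIP-4844. It does not recompute the commitment from the blob itself; that step needs a KZG library and the trusted setup, and is left out here. The function names mirror the EIP’s pseudocode, and the sample commitment is placeholder input, not data from BlobArchive.

```python
import hashlib

VERSIONED_HASH_VERSION_KZG = b"\x01"  # version byte defined in EIP-4844


def kzg_to_versioned_hash(commitment: bytes) -> bytes:
    """Derive the 32-byte versioned hash that a blob transaction carries."""
    if len(commitment) != 48:
        raise ValueError("a KZG commitment is a 48-byte compressed G1 point")
    return VERSIONED_HASH_VERSION_KZG + hashlib.sha256(commitment).digest()[1:]


def commitment_matches(commitment: bytes, versioned_hash: bytes) -> bool:
    """True if the archived commitment matches the on-chain versioned hash."""
    return kzg_to_versioned_hash(commitment) == versioned_hash


# Placeholder input: the 48-byte compressed encoding of the point at infinity.
sample_commitment = bytes.fromhex("c0" + "00" * 47)
print(kzg_to_versioned_hash(sample_commitment).hex())
```

A full check would first recompute the commitment from the downloaded blob and then apply the comparison above against the versioned hash found in the transaction.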

How Hemera BlobArchive Works:

Figure 1. Architecture of Hemera BlobArchive services

Account-centric Indexing layer: BlobArchive’s indexer indexes Ethereum mainnet transaction history and retrieves blob data from the Consensus Layer in real-time.
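As an illustration of what retrieving blobs from the Consensus Layer can look like, the sketch below builds a request URL for the standard Beacon API blob sidecars endpoint and parses its JSON response. This is not BlobArchive’s indexer code; the `BlobSidecar` record and both helper functions are hypothetical names introduced for this example, and the node address in the usage note is an assumption.

```python
import json
from dataclasses import dataclass


@dataclass
class BlobSidecar:
    """Subset of the fields returned for each blob (hypothetical record type)."""
    index: int
    slot: int
    kzg_commitment: str
    blob_size: int  # decoded blob length in bytes


def blob_sidecars_url(beacon_node: str, block_id: str) -> str:
    """URL of the Beacon API endpoint; block_id may be a slot, root, or 'head'."""
    return f"{beacon_node.rstrip('/')}/eth/v1/beacon/blob_sidecars/{block_id}"


def parse_blob_sidecars(payload: str) -> list[BlobSidecar]:
    """Turn the endpoint's JSON body into a list of BlobSidecar records."""
    sidecars = []
    for item in json.loads(payload)["data"]:
        blob_hex = item["blob"]  # 0x-prefixed hex string
        sidecars.append(
            BlobSidecar(
                index=int(item["index"]),
                slot=int(item["signed_block_header"]["message"]["slot"]),
                kzg_commitment=item["kzg_commitment"],
                blob_size=(len(blob_hex) - 2) // 2,
            )
        )
    return sidecars
```

With a beacon node reachable at, say, `http://localhost:5052`, the body returned by `urllib.request.urlopen(blob_sidecars_url("http://localhost:5052", "head"))` could be decoded and passed to `parse_blob_sidecars`.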

Storage Layer: Blob data could accumulate at a rate of terabytes per year, so BlobArchive’s storage infrastructure is designed for resilience, using decentralized storage in collaboration with our partner 0G.
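A rough estimate shows where the terabytes-per-year figure comes from. The sketch below assumes the Dencun-era parameters (a target of 3 and a maximum of 6 blobs per block, 128 KiB per blob) and that every 12-second slot contains a block; real volumes depend on actual blob usage.

```python
BYTES_PER_BLOB = 4096 * 32  # 4096 field elements of 32 bytes = 128 KiB
SECONDS_PER_SLOT = 12
SLOTS_PER_YEAR = 365 * 24 * 3600 // SECONDS_PER_SLOT


def annual_blob_bytes(blobs_per_block: int) -> int:
    """Bytes of blob data per year if every slot carries this many blobs."""
    return blobs_per_block * BYTES_PER_BLOB * SLOTS_PER_YEAR


for label, blobs in (("target", 3), ("max", 6)):
    print(f"{label}: {annual_blob_bytes(blobs) / 1e12:.2f} TB/year")
# target: 1.03 TB/year
# max: 2.07 TB/year
```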

0G Storage uses erasure coding to store data redundantly across its Storage Nodes, and 0G can quickly query this data to prove data availability. As a result, blob data is stored securely and remains readily accessible for fast retrieval, with security, scalability, and decentralization as priorities.
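For readers unfamiliar with erasure coding, the toy example below shows the basic idea with a single XOR parity shard: data is split into shards, one extra shard is computed, and any one lost shard can be rebuilt from the rest. This is not 0G’s scheme, which the text above does not specify; it is only a minimal sketch of the concept, and all function names are ours.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal shards (zero-padded) plus one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\x00")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    return shards + [reduce(xor_bytes, shards)]


def recover(shards: list) -> list:
    """Rebuild the single missing shard (marked None) by XOR-ing the others."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost shard")
    repaired = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired


shards = encode(b"blob data for rollup", k=4)
damaged = shards[:2] + [None] + shards[3:]  # simulate one node losing its shard
assert recover(damaged) == shards
```

Codes that tolerate several simultaneous losses follow the same principle with more parity shards.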

Full Data Restore Service:

BlobArchive offers an open-source module that users can install and run on their local machines. The module interfaces with BlobArchive’s API to retrieve blob transactions and the corresponding data for designated rollups. It organizes the retrieved data into batches for efficient processing and restoration of full historical data.
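The batching step described above might look roughly like the following. This is a sketch under assumptions, not the module’s actual code: the `BlobTx` record, its field names, and the idea of identifying a rollup by its batch-submitter address are all hypothetical, and no BlobArchive API call is shown.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class BlobTx:
    """Hypothetical record for one retrieved blob transaction."""
    block_number: int
    tx_hash: str
    sender: str  # assumed to be the rollup's batch-submitter address
    blob_versioned_hashes: tuple


def batches_for_rollup(
    txs: Iterable[BlobTx], sender: str, batch_size: int
) -> Iterator[list]:
    """Yield one rollup's blob transactions in block order, batch_size at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(
        (tx for tx in txs if tx.sender.lower() == sender.lower()),
        key=lambda tx: tx.block_number,
    )
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]
```

Processing in block order keeps the restore deterministic, so an interrupted run can resume from the last completed batch.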

About 0G

0G is an infinitely scalable data availability (DA) layer and modular AI blockchain. 0G is highly programmable and interoperable, capable of supporting a wide variety of industry needs. Its data availability solution is built on top of a general storage system capable of storing vast amounts of both structured and unstructured data, enabling the future data needs of Web3.

The team includes 10+ PhD recipients and one of the world’s top cryptographers, while the business team has extensive fundraising and growth success, winning numerous awards such as the top Y Combinator project and the top entrepreneur award.

To learn more, visit 0g.ai

About Hemera

Hemera Protocol is a decentralized, high-performance indexing protocol that tackles account-level data querying problems, which are becoming exponentially more expensive as the number of chains increases and parallel EVMs appear on the horizon. Underpinned by a decentralized account-centric indexing network that aggregates every transaction across ecosystems, Hemera keeps each account’s social and asset states updated in real time and acts as foundational infrastructure for on-chain intelligence. Hemera empowers developers to create and deploy web3-optimized LLM agents via an open-source toolkit, enhancing the blockchain ecosystem with real-time, omni-chain interoperability.

Website: https://thehemera.com/main/home

Twitter: https://twitter.com/HemeraProtocol
