Indexing the universe of blockchains with Covalent

Nichanan Kesonpat
Published in 1kxnetwork
10 min read · Jul 20, 2022

Over the past 2 years, DeFi and NFTs have ushered new users into crypto by the millions, driving demand for more complex use cases and creating an ecosystem rich in activity and ever-growing on-chain histories. The Ethereum blockchain alone is approaching 1TB in size.

While blockchain data is open for anyone to verify and use, in practice it is costly to derive semantic value from it in a reliable, programmatic fashion. This problem is exacerbated by more L1 chains coming online and the advent of L2 “app-chains”, each with propositions optimized for different use cases and markets. This is reminiscent of how there are over 800 options in the traditional database world, some designed to be jacks of all trades while others are more specialized.

While speed, decentralization, and security often dominate the conversation when it comes to blockchain scaling, these factors pertain to write-scalability, which only takes us halfway towards realizing the technology’s potential. Often overlooked is the problem of read-scalability: how efficiently we can pull useful information from on-chain data and maintain transparency as more chains and dapps enter the market.

Without solving for read-scalability, blockchain data remains public in theory but inaccessible in practice, posing an opportunity cost borne by the whole ecosystem. Developers would not be able to perform real-time multichain analysis for assets and protocols. Users would not have access to ROI data in their wallets to make better-informed investment decisions. Traditional FinTech players looking to integrate web3 features would face a higher barrier to do so.

A thriving multichain ecosystem needs a data readability layer. Democratized data access reduces information asymmetry, accelerates product innovation, enables better-informed capital allocation, and makes the market efficient as a whole.

This is why we are excited about Covalent.

Covalent is an indexing and querying middleware for blockchains that ensures chain data remains accessible and readable at scale as web3 matures. Covalent makes on-chain data queryable via its Unified API, enabling developers and business users to pull balances, positions, and granular historical transaction data across blockchain networks.

Covalent asynchronously exports data from a blockchain client to produce Block Specimens, a canonical storage format that facilitates re-execution and enrichment outside of a node. Data extraction occurs via a system in which cryptographic proofs are published to the Covalent proof chain and compared, so any deviations in the data export can be found and addressed.
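
To make the idea of publishing and comparing proofs concrete, here is a minimal sketch. It is not Covalent's actual proof format: the specimen fields, the use of SHA-256 over canonical JSON, and the majority rule for spotting deviations are all assumptions made for illustration.

```python
import hashlib
import json
from collections import Counter


def specimen_proof(specimen):
    """Hash a canonical JSON encoding of a block export (hypothetical format)."""
    canonical = json.dumps(specimen, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_deviations(submissions):
    """Return the most common proof and the producers whose proof differs from it."""
    majority_proof, _ = Counter(submissions.values()).most_common(1)[0]
    deviating = sorted(
        producer for producer, proof in submissions.items() if proof != majority_proof
    )
    return majority_proof, deviating


# Hypothetical export of one block, plus a copy with a transaction missing.
block = {"height": 100, "hash": "0xabc", "transactions": ["0x01", "0x02"]}
tampered = {**block, "transactions": ["0x01"]}

submissions = {
    "producer_a": specimen_proof(block),
    "producer_b": specimen_proof(block),
    "producer_c": specimen_proof(tampered),
}
majority, deviating = find_deviations(submissions)
print(deviating)
```

Because only the hashes are compared, producers do not need to exchange the full export to detect that one of them disagrees.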

Block Specimens are traced, decoded, and enriched by an indexing engine. Query nodes then serve this data up to the Covalent Unified API to developers and analysts requesting it.

The indexing layer for blockchains should be trust-minimized and maximally flexible to accommodate the diversity of products and services that will be brought to the market over the coming years. Covalent’s design philosophy allows for:

  • Data Verifiability. Trustless data verification via cryptographic proofs lets Covalent decentralize its infrastructure: anyone can enter the network as a Block Specimen producer, and stake-and-slash mechanisms incentivize producers to remain honest.
  • Data Composability. Covalent’s standardized data model allows developers to wrap, remix, and fork data chain-agnostically, just like with assets.
  • No-Code Solution. The no-code feature of Covalent’s Unified API unlocks the solution space for analysts, developers, and other non-technical users working at the BI and visualization layers. Users can create the equivalent of pivot tables on top of base layer data to pull information like 24-hour trading volumes, NFT historical floor prices, and aggregated wallet token balances across networks.
  • Extensibility. Covalent is a turnkey solution for new EVM blockchains and app chains, with an onboarding time of less than a week. For non-EVM blockchains, it is a matter of adhering to the Block Specimen standard to extract data from the chain clients.
  • Flexibility. Covalent follows the Extract-Load-Transform (ELT) data integration paradigm, in which the network extracts data from blockchains and loads it into a data warehouse. Customers then transform it at query time to fit the use case they want to support. In contrast, subgraph-based indexers follow the Extract-Transform-Load (ETL) paradigm, where the extracted data is first transformed into a subgraph for a specific use case.
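
The difference between the two paradigms can be shown with a toy example. The sketch below is an illustration only: the event fields and the helper names (`etl_pipeline`, `elt_pipeline`, `query`) are hypothetical and do not correspond to Covalent's or any subgraph indexer's real interfaces.

```python
# Raw on-chain events as they might be extracted (hypothetical fields).
RAW_EVENTS = [
    {"block": 1, "type": "swap", "pool": "A", "amount": 50},
    {"block": 2, "type": "transfer", "pool": None, "amount": 20},
    {"block": 3, "type": "swap", "pool": "B", "amount": 30},
    {"block": 4, "type": "swap", "pool": "A", "amount": 10},
]


def etl_pipeline(events):
    """ETL: transform before loading, keeping only swap volume per pool."""
    store = {}
    for event in events:
        if event["type"] == "swap":
            store[event["pool"]] = store.get(event["pool"], 0) + event["amount"]
    return store  # transfers were dropped and cannot be queried later


def elt_pipeline(events):
    """ELT: load every event unchanged; transformation is deferred to query time."""
    return list(events)


def query(warehouse, predicate, key):
    """Query-time transform: filter rows, then sum amounts grouped by `key`."""
    result = {}
    for row in warehouse:
        if predicate(row):
            result[row[key]] = result.get(row[key], 0) + row["amount"]
    return result


etl_store = etl_pipeline(RAW_EVENTS)
warehouse = elt_pipeline(RAW_EVENTS)

# The original question is answered identically by both approaches.
swap_volume = query(warehouse, lambda r: r["type"] == "swap", "pool")

# A new question only needs a new query under ELT; the ETL store
# would have to be rebuilt from the chain to answer it.
volume_by_type = query(warehouse, lambda r: True, "type")
print(swap_volume, volume_by_type)
```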

Empowering data-driven decisions and data-powered products with ELT

Since the 1970s, Extract-Transform-Load (ETL) has been the dominant method for businesses to integrate data for analysis. Because data is consolidated (transformed) before being loaded into the target system, ETL helps companies save on storage costs. This method only retains the necessary data for analytical use by a company’s operations arm. As new business questions and demand for new metrics arose, ETL pipelines had to be refactored to generate the data needed for further analyses.

Over the past decade, cloud data warehousing technologies have made it cost-effective to store all raw data in a central location, enabling companies to adopt Extract-Load-Transform (ELT). With this approach, engineers could directly load raw data into a data warehouse without having to refactor data beforehand.

ELT produces a better separation of concerns between data engineering and business intelligence. Because the Extract and Load steps are agnostic to the end use case, data engineers can focus on producing the single source of truth that could answer any business question. This empowers the consumers of this data (analysts, operations) to flexibly and iteratively transform it at query time without the engineering bottleneck.

Today, web2 companies are able to glean insights from proprietary data thanks to the $200 billion data infrastructure market that ingests, integrates, and processes data for downstream analysis. For blockchains, this infrastructure is still in its infancy. But the public nature of chain data, the growing heterogeneity of the L1 ecosystem, and the accelerated innovation in permissionless, composable networks collectively call for a data accessibility layer that abstracts away and minimizes redundant data wrangling work. Because of this, we believe that ELT is a superior approach for indexing web3, as it is:

  • Flexible to handle changes in downstream requirements. Because the data is non-destructively normalized (Extract-Load) before query (Transform) time, users can pull data for different types of analyses by simply changing their query. If a protocol’s smart contract is upgraded, Covalent users would only need to requery the comprehensive dataset instead of having to modify and reindex their subgraph, as they would under the ETL paradigm.
  • Flexible to handle changes in upstream requirements. As blockchains grow more heterogeneous, the Block Specimen standard ensures that Covalent’s Unified API remains queryable out of the box even for chains with different transaction receipt formats and non-EVM chains. While customers in the ETL paradigm need to rewrite subgraphs for chains like EVMOS or Solana, on Covalent the hard work is done once patches for these chains adhere to the Block Specimen specification.
  • Less operationally costly for customers. With Covalent’s ELT-based approach, the cost center for customers is analysts and IT operators writing queries, as opposed to developers building and maintaining subgraphs. Covalent removes the redundant data engineering work, enabling teams to focus on the analytics work that helps them make better-informed business decisions.
  • Able to handle queries that require the entire chain data. Because historical trades can be observed anywhere on the blockchain, ELT is the only solution to answer queries pertaining to token balances. This is why Covalent has been popular among portfolio trackers and tax reporting applications. The fact that all data is made available via a Unified API makes Covalent suitable for machine learning efforts, such as address classification and clustering. With this, developers can also use the API as a single point of integration to build a range of data-powered applications.
  • Enabling truly chain-agnostic data consumption. Covalent’s Block Specimen Producers will be full node operators for source blockchains who will run an extract-and-normalize job on the chain data before publishing it to storage nodes. Query nodes read from the storage nodes to respond to external API requests. This means customers can use identical queries on data that is very different from its original form, a powerful property as dapps become cross-chain native.
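
Two of the points above, that balances depend on the full transfer history and that normalized data can be queried identically across chains, can be sketched together. The record layouts and field names below are invented for illustration and are not the Block Specimen specification.

```python
from collections import defaultdict


def normalize_evm(log):
    """Map a hypothetical EVM-style transfer log to a common shape."""
    return {
        "chain": log["chain"],
        "token": log["token"],
        "sender": log["from"],
        "recipient": log["to"],
        "amount": log["value"],
    }


def normalize_other(record):
    """Map a hypothetical non-EVM transfer record to the same common shape."""
    return {
        "chain": record["network"],
        "token": record["asset"],
        "sender": record["source"],
        "recipient": record["destination"],
        "amount": record["quantity"],
    }


def balances(transfers, address):
    """Replay every transfer: a balance is the net of all inflows and outflows."""
    totals = defaultdict(int)
    for t in transfers:
        key = (t["chain"], t["token"])
        if t["recipient"] == address:
            totals[key] += t["amount"]
        if t["sender"] == address:
            totals[key] -= t["amount"]
    return dict(totals)


evm_logs = [
    {"chain": "evm-chain", "token": "TKN", "from": "0x0", "to": "alice", "value": 100},
    {"chain": "evm-chain", "token": "TKN", "from": "alice", "to": "bob", "value": 40},
]
other_records = [
    {"network": "other-chain", "asset": "COIN", "source": "faucet",
     "destination": "alice", "quantity": 7},
]

# Normalize once, then run the same query over both chains.
normalized = [normalize_evm(l) for l in evm_logs] + [normalize_other(r) for r in other_records]
alice = balances(normalized, "alice")
print(alice)
```

Skipping any part of the history (for example, indexing only one contract's events) would give a wrong balance, which is why this kind of query needs the complete dataset.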

The separation of concerns between data engineering and business intelligence becomes even more powerful in a world of public blockchain data and decentralized networks. Through ELT, Covalent equips the ecosystem with a canonical, unified picture of multichain data without introducing a bias of how it should be used, providing an unopinionated data infrastructure that makes it as easy as possible for builders and analysts to focus on what they do best.

Unifying Multichain Data with Covalent

Making blockchain data accessible to all

The Covalent Unified API currently supports more than 1,800 applications across 37 blockchains, is trusted by more than 27,000 developers, and serves billions of queries per month. This represents growth of more than 31,000% in the number of developers in the ecosystem and more than 30,000% in the number of active projects querying the Covalent API in under 24 months.

Users include:

  • Popular wallets like Rainbow and Zerion, which use the Covalent API to aggregate historical balances and PnL across DeFi and NFT assets for their users.
  • Dashboards like CoinGecko, which show price trends, liquidity, and asset ROI.
  • Cross-chain liquidity aggregators like Li Finance, which retrieve asset price information across different networks.
  • Portfolio trackers like Rotki, which pull historical balances and pricing data across chains for tax reporting.
  • DeFi apps like Aave and Balancer, which integrate user data from different chains.
  • Exchanges that pull users’ historical transaction data to generate reports for tax compliance.

Covalent’s Unified API will also be a crucial go-to-market tool for web2 apps that want to add crypto features such as displaying NFTs and DeFi positions across different networks. These customers will be able to tap into web3 without investing in additional infrastructure (e.g. running nodes, writing smart contracts). Instead, they will be able to access on-chain data using SQL.
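
As a rough picture of what SQL access to on-chain data could look like, the sketch below uses Python's built-in `sqlite3` module on an in-memory table. The `nft_transfers` table, its columns, and the sample rows are hypothetical, not Covalent's real schema, and the query assumes at most one transfer per token per block.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE nft_transfers (
        chain TEXT, collection TEXT, token_id INTEGER,
        from_addr TEXT, to_addr TEXT, block_height INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO nft_transfers VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("chain-a", "Apes", 1, "0x0", "alice", 10),
        ("chain-a", "Apes", 1, "alice", "bob", 20),
        ("chain-a", "Apes", 2, "0x0", "alice", 11),
        ("chain-b", "Cats", 7, "0x0", "bob", 5),
        ("chain-b", "Cats", 7, "bob", "alice", 9),
    ],
)

# A wallet currently holds a token if it received the token's latest transfer.
HOLDINGS_SQL = """
    SELECT t.chain, t.collection, t.token_id
    FROM nft_transfers AS t
    WHERE t.to_addr = ?
      AND t.block_height = (
          SELECT MAX(u.block_height)
          FROM nft_transfers AS u
          WHERE u.chain = t.chain
            AND u.collection = t.collection
            AND u.token_id = t.token_id
      )
    ORDER BY t.chain, t.collection, t.token_id
"""

alice_nfts = conn.execute(HOLDINGS_SQL, ("alice",)).fetchall()
print(alice_nfts)
```

The same query covers both networks because the rows share one schema, which is the property a web2 team would rely on when displaying a user's NFTs across chains.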

Source. Covalent offers code templates such as a customizable widget to display top tokens or DEX LP statistics.

Covalent’s no-code solution reduces the friction for analysts to build complex dashboards and perform downstream analytics in compliance, risk, or taxation using Analyst Mode, where all requests and responses become an Excel-like experience allowing exports to CSV and Tableau.

Decentralizing the indexing layer of blockchains

Covalent completed its first milestone towards decentralization through the launch of the Covalent Query Token (CQT), a staking asset and settlement token for the network. Network operators who wish to participate in Block Specimen production, indexing, or querying need to stake CQT to run a node. The Block Specimen patch is designed to be easily integrated with live nodes (currently geth), allowing node operators to become Block Specimen producers while continuing to validate Ethereum mainnet.

When customers are charged for calls to the Covalent API in stablecoin, a smart contract market-buys CQT and distributes it to network operators pro rata to their stake. At the same time, passive token holders can delegate to node operators and earn a portion of the rewards.
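
The pro rata distribution described above is simple arithmetic. The sketch below assumes hypothetical operator names and stake sizes and ignores delegation, fees, and the market-buy step itself; it is not the network's actual contract logic.

```python
from fractions import Fraction


def distribute_pro_rata(purchased_cqt, stakes):
    """Split the purchased CQT among operators in proportion to their stake."""
    total_stake = sum(stakes.values())
    if total_stake <= 0:
        raise ValueError("total stake must be positive")
    return {
        operator: Fraction(purchased_cqt) * Fraction(stake, total_stake)
        for operator, stake in stakes.items()
    }


# Hypothetical stakes (in CQT) and an amount of CQT bought with API fees.
stakes = {"operator_a": 600, "operator_b": 300, "operator_c": 100}
payouts = distribute_pro_rata(50, stakes)
print({operator: float(amount) for operator, amount in payouts.items()})
```

Exact fractions are used here so that the payouts always sum to the purchased amount; an on-chain implementation would more likely work in integer token units and handle the rounding remainder explicitly.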

The Covalent team has seeded the API with core endpoints such as wallet balances, NFT metadata, and integrations with top DeFi protocols like Balancer, Aave, and Uniswap. With Class C endpoints, community members will be able to write their own custom endpoints for others to consume, creating a marketplace for long-tail endpoints and an incentive to make the API richer with more coverage across on-chain ecosystems.

Covalent Unified API Endpoints

With new chain integrations every month and more applications leveraging the API, Covalent’s flywheel has already begun to spin. Over the course of their progressive decentralization, index and query nodes will improve on the performance and coverage of the API, supporting overall network growth.

Covalent Network Flywheel

What’s next for Covalent

  • Single-sided CQT Staking went live in May 2022. Block Specimens mined on the mainnet proof chain.
  • Launch of Indexing Nodes to improve searchability of Block Specimens.
  • Launch of Query Nodes, which serve the Covalent API.
  • Class C Endpoints. Community members can write custom endpoints to the Covalent API, creating a marketplace where the creators of popular endpoints earn a portion of the fees from API calls.

Covalent has established a strong community of builders via the Alchemist program, which has onboarded over 2,500 members onto web3 through education initiatives, networking, Quests, and building out the Covalent ecosystem. Alchemists have become champions of the protocol. As Covalent continues to execute towards becoming a fully decentralized network, Alchemists will continue to be essential stakeholders, driving initiatives and ensuring that governance decisions are aligned with the values of the community.

By making data across blockchain networks accessible, enabling no-code solutions, and democratizing API endpoint generation, Covalent keeps the design space open for all types of stakeholders to help ensure that transparency remains resilient in the wider crypto ecosystem. The token network will enable Covalent to scale as the leading indexer of blockchains, paralleling the likes of Google in web2, all the while ensuring that ownership and power are redistributed back to those who add the most value.

Team

Covalent is founded and led by Ganesh Swami and Levi Aul. Ganesh is a physicist by training and has over 10 years of data analytics experience; his first company is listed on the NYSE. Levi built one of the first Bitcoin exchanges in Canada and was part of the team that built CouchDB at IBM. The team has since grown to 55 members, consisting of network architects, data scientists, and software engineers.

If you are passionate about blockchain scalability and data accessibility, Covalent is actively hiring for both technical and non-technical roles including DevOps, Site Reliability Engineers and Technical Writers.

If you’re a developer looking for inspiration about what you can build using the Covalent API, check out the project showcases and protocol documentation. Sign up to become an Alchemist to jumpstart your career in web3 and help build out the ecosystem.

Join the community on Discord, and keep up to date with Covalent on Twitter.

Nichanan Kesonpat
1kxnetwork

Platform & Content @1kxnetwork | Co-Founder @lastofours | Smart Contracts @upstate-interactive @mochi.game | 🏠. nichanank.com