How to choose a data provider for a web3 project?

Alexandr Kumancev
Coinmonks
5 min read · Feb 23, 2023

Summary of the live stream in which Boris Godlin explains what to consider when working with historical data, how to move from raw data to abstractions, and what the options for API integration are.

He also explains the approach Footprint Analytics offers to connect web applications and APIs and the project’s mission in blockchain analytics.

Requirements

Storing historical data on the blockchain itself is expensive, so indexers are needed. Take a cross-chain NFT gallery as an example: it wants to work with many chains and show statistics and analytics at different levels, from a whole-market overview down to a specific collection. Such a dApp has several criteria worth considering:

  1. the data must meet your goals and objectives;
  2. the data must be of good quality;
  3. the number of networks supported should be appropriate for your cases;
  4. the indexer should offer easy ways to integrate, so you can fetch the data without much effort. You probably also want low infrastructure costs, support and, of course, legal compliance.

Indexing Pipeline

There are a lot of blockchains, which means a lot of raw data. Once the raw data is obtained, most use cases require an abstraction on top of it. The indexing process itself consists of simple steps: you take the data from somewhere, transform it and store it. There are many ways to store data and many kinds of databases. But once the data is stored, you want to provide different interfaces for working with it.

indexing pipeline
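
The take, transform, store steps above can be sketched in a few lines. This is only a minimal illustration: the raw records are synthetic, their field names are assumptions, and an in-memory SQLite database stands in for whatever storage a real indexer would use.

```python
import sqlite3

# Synthetic raw records standing in for data fetched from a node.
# The field names here are assumptions made for this sketch.
RAW_BLOCKS = [
    {"number": "0x10", "timestamp": "0x5f5e100", "tx_count": 2},
    {"number": "0x11", "timestamp": "0x5f5e10c", "tx_count": 5},
]


def extract():
    """Extract: a real indexer would call an RPC node here."""
    return RAW_BLOCKS


def transform(raw):
    """Transform: decode hex-encoded quantities into plain integers."""
    return [
        (int(b["number"], 16), int(b["timestamp"], 16), b["tx_count"])
        for b in raw
    ]


def load(rows, conn):
    """Load: store the cleaned rows in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blocks "
        "(number INTEGER PRIMARY KEY, timestamp INTEGER, tx_count INTEGER)"
    )
    conn.executemany("INSERT INTO blocks VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline():
    conn = sqlite3.connect(":memory:")
    load(transform(extract()), conn)
    return conn


# Once stored, the data can be served through any interface; here, plain SQL.
conn = run_pipeline()
total = conn.execute("SELECT SUM(tx_count) FROM blocks").fetchone()[0]
print(total)
```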

Types of RPC Nodes

There are many different types of nodes which store different sets of data. If you want to retrieve historical data from the blockchain, you have to work with archive nodes.

rpc node type
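
To make the difference concrete, here is a small sketch of a historical query. It only builds the JSON-RPC payload for the standard `eth_getBalance` method and sends nothing; the address and block number are placeholders. A node that has pruned old state typically cannot answer such a request for a block far in the past, which is why an archive node is needed.

```python
import json


def historical_balance_request(address, block_number, request_id=1):
    """Build a JSON-RPC payload asking for an account balance at a past block."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        # The second parameter pins the query to a specific block
        # instead of the usual "latest".
        "params": [address, hex(block_number)],
        "id": request_id,
    }


# Placeholder address and block number, chosen only for illustration.
payload = historical_balance_request(
    "0x0000000000000000000000000000000000000000", 1_000_000
)
body = json.dumps(payload)
print(body)
```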

Data mapping for Etherscan

Taking EVM as the reference (since it is popular and many people know how it works), there are 4 primitives: blocks, transactions, logs and traces. This is the raw data that can be retrieved from the blockchain. If you have ever worked with Etherscan or written smart contracts, you should know how it works.

data mapping for etherscan
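
One way to picture how the four primitives relate is as nested records. The sketch below is a simplified model, not an actual node or Etherscan schema: the class and field names are assumptions, and real records carry many more fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Log:
    address: str       # contract that emitted the event
    topics: List[str]  # indexed parameters; topics[0] is usually the event signature hash
    data: str          # non-indexed parameters, hex-encoded


@dataclass
class Trace:
    call_type: str     # e.g. "call", "delegatecall", "create"
    from_address: str
    to_address: str
    value: int


@dataclass
class Transaction:
    hash: str
    from_address: str
    to_address: str
    value: int
    logs: List[Log] = field(default_factory=list)
    traces: List[Trace] = field(default_factory=list)


@dataclass
class Block:
    number: int
    timestamp: int
    transactions: List[Transaction] = field(default_factory=list)


# A block contains transactions; each transaction produces logs and traces.
tx = Transaction(
    hash="0xaaa", from_address="0x1", to_address="0x2", value=0,
    logs=[Log(address="0x2", topics=["0xsig"], data="0x")],
    traces=[Trace(call_type="call", from_address="0x1", to_address="0x2", value=0)],
)
block = Block(number=16, timestamp=100_000_000, transactions=[tx])
log_count = sum(len(t.logs) for t in block.transactions)
print(log_count)
```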

Why raw data is not enough

Raw data may not be enough. If you are an NFT gallery that wants to work with networks beyond EVM, such as Polkadot or Solana, keep in mind that these networks are architecturally completely different. Their raw data differs, yet you want to fetch it across networks in a single query, so the raw data has to be abstracted away. That is why abstractions exist.

why raw data is not enough
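
A rough sketch of what such an abstraction can look like: records with different shapes are mapped into one common structure, so they can be handled together. The raw field names for both chains and the `NftTransfer` schema are hypothetical and exist only for this illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class NftTransfer:
    """Chain-agnostic record (hypothetical schema)."""
    chain: str
    collection: str
    token_id: str
    sender: str
    receiver: str


def from_evm(raw):
    # Assumed EVM-style input: token id arrives hex-encoded.
    return NftTransfer(
        "ethereum", raw["contract_address"],
        str(int(raw["token_id_hex"], 16)), raw["from"], raw["to"],
    )


def from_solana(raw):
    # Assumed Solana-style input: the mint address identifies the token.
    return NftTransfer(
        "solana", raw["collection_key"],
        raw["mint"], raw["source_owner"], raw["destination_owner"],
    )


NORMALIZERS = {"ethereum": from_evm, "solana": from_solana}


def normalize(chain, raw):
    return NORMALIZERS[chain](raw)


transfers = [
    normalize("ethereum", {
        "contract_address": "0xabc", "token_id_hex": "0x2a",
        "from": "0x1", "to": "0x2",
    }),
    normalize("solana", {
        "collection_key": "CoLLKey", "mint": "MintKey",
        "source_owner": "OwnerA", "destination_owner": "OwnerB",
    }),
]

# After normalization, one pass covers every network.
per_chain = Counter(t.chain for t in transfers)
print(per_chain)
```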

EVM data provided by Footprint

Footprint Analytics aggregates raw data and classifies it by rules. For example, if a transaction is executed by a smart contract that follows the ERC-721 or ERC-1155 standard (that is, implements its methods), it is considered an NFT transaction. Footprint Analytics has many such rules, and they vary from EVM to Solana. The data is aggregated from the “bronze” level up to the “silver” level, where data from all networks is already in place.

data provider
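
To give a feel for what a rule of this kind might look like, here is a simplified heuristic based on event logs. It is not Footprint Analytics' actual rule set. It relies on the well-known event signature hashes of `Transfer(address,address,uint256)` and ERC-1155 `TransferSingle`, and on the fact that an ERC-721 `Transfer` indexes all three parameters (four topics in total), while an ERC-20 `Transfer` indexes only two.

```python
# keccak256("Transfer(address,address,uint256)"), shared by ERC-20 and ERC-721
TRANSFER_TOPIC = (
    "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
)
# keccak256("TransferSingle(address,address,address,uint256,uint256)"), ERC-1155
TRANSFER_SINGLE_TOPIC = (
    "0xc3d58168c5ae7397731d063d5bbf3d657854427343f4c083240f7aacaa2d0f62"
)


def classify_log(topics):
    """Return a label for NFT transfer logs, or None for everything else."""
    if not topics:
        return None
    signature = topics[0].lower()
    # ERC-721: signature + from + to + tokenId are all topics.
    if signature == TRANSFER_TOPIC and len(topics) == 4:
        return "erc721_transfer"
    if signature == TRANSFER_SINGLE_TOPIC:
        return "erc1155_transfer_single"
    return None


def is_nft_transaction(logs):
    """A transaction counts as an NFT transaction if any log matches a rule."""
    return any(classify_log(topics) is not None for topics in logs)


erc721_log = [TRANSFER_TOPIC, "0x01", "0x02", "0x2a"]
erc20_log = [TRANSFER_TOPIC, "0x01", "0x02"]
print(classify_log(erc721_log), classify_log(erc20_log))
```

A production classifier would need more than this, for example batch transfers and contracts that deviate from the standards.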

What are the options?

There are two options: either you trust someone and get the data, or you are the indexer.

Third-party products are much easier to use, because the infrastructure has already been built for you. The company that develops this API often has high computing power and employs professionals. However, you are dependent on the indexer, you have less customization and it may not be cheap.

With an SDK, it is all in your hands: you build the abstractions and triggers yourself. It is easy to customize but hard to set up and maintain, which means more resources.

options

REST API

Footprint Analytics has two interfaces for two audiences: developers and analysts. Underneath, there is a single data set, so anything you can get through the web application you can also get through the API.

Footprint Analytics offers several ways to integrate with its API. One of them is REST: a set of predefined endpoints for common scenarios from which you can fetch data.

rest api
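
As a rough idea of how a predefined REST scenario is called, the sketch below builds a request URL from a resource path and query parameters. The base URL, the resource path and the parameter names are all placeholders invented for this example, not the real Footprint Analytics API; check the provider's documentation for the actual endpoints.

```python
from urllib.parse import urlencode

# Placeholder base URL, not a real provider endpoint.
BASE_URL = "https://api.example.com/v1"


def build_rest_url(resource, **params):
    """Compose the URL for a predefined scenario with its query parameters."""
    # Sorting keeps the URL stable regardless of keyword order.
    query = urlencode(sorted(params.items()))
    return f"{BASE_URL}/{resource}?{query}"


# Hypothetical scenario: statistics for one collection on one chain.
url = build_rest_url(
    "nft/collection/stats", collection="0xabc", chain="ethereum"
)
print(url)
```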

SQL API

At its core, Footprint Analytics has a relational database, because a huge number of tools and use cases already work with relational databases. The data is queried with SQL. This is useful because any query you run in the browser to get a nice chart or table is one you can validate: you can make sure you are getting what you want and that the data comes in correctly. Then just copy the SQL code and paste it into the API. Footprint Analytics promotes this ecosystem approach of connecting the web application and the API.

Footprint Analytics’ mission is to lower the barrier to entry for blockchain analytics. In the web application, the team provides an abstraction over SQL in the form of a query builder, and it takes the same approach with the APIs.

sql api
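
The copy-and-paste workflow described above could look roughly like this on the client side. The sketch only prepares a POST request and sends nothing. The endpoint URL, the header name, the JSON body shape and the `nft_transactions` table are assumptions made for illustration, not the documented Footprint Analytics API.

```python
import json
from urllib.request import Request

# Placeholder endpoint, not a real provider URL.
SQL_ENDPOINT = "https://api.example.com/v1/sql"

# SQL validated in the web application, then pasted here as-is.
# The table and column names are hypothetical.
VALIDATED_SQL = (
    "SELECT chain, COUNT(*) AS transfers "
    "FROM nft_transactions GROUP BY chain"
)


def build_sql_request(sql, api_key):
    """Wrap a SQL string in a JSON body and prepare a POST request."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return Request(
        SQL_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )


req = build_sql_request(VALIDATED_SQL, "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it, assuming a real endpoint and key are substituted.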

Supported networks

Footprint Analytics supports 24 networks, over 700,000 NFT collections, 17 NFT marketplaces, and 519 DeFi protocols.

support networks

Thank you for your attention! Hope it was helpful)

Alexandr Kumancev
Coinmonks

Software engineer (Full-stack, Web3). ✍️ Write about blockchain and other amazing stuff