Review of whitepaper of The Graph

Rithuraj Nambiar
Published in Nerd For Tech
May 4, 2022

Introduction

The growing number of security breaches and privacy problems in the Web2 application stack, along with other issues such as frequent government interference, has made it clear that Web2 needs to improve and that Web3 must be introduced. The move towards Web3 began with blockchain technology, which lets us create a decentralized network for our transactions. Unlike Web2 applications, Web3 applications are not deployed on a single server and their data is not stored in a single database. Instead, they are hosted and run on blockchains, using decentralized networks of many peer-to-peer nodes that play the role of servers.
Cryptocurrencies serve as the financial incentive for those who help create, govern, or maintain blockchain projects. Blockchains currently hold a huge amount of data. But just as a book loses readers when it has no index or table of contents to guide them, much blockchain data goes to waste because there is no efficient way to index it on the network.
The data on blockchains is growing rapidly, and there is no efficient way to access it without exposing it to third-party organizations. With the number of dApps increasing, a standard way of querying data is needed. The Graph is a decentralized query protocol that organizes blockchain data and makes it easily accessible. It uses GraphQL to query the data, which it indexes and sorts. For Web3, The Graph proposes a Query Execution Layer in the application stack, so that every dApp can rely on efficient indexing and sorting.

Web3 Application Stack

Providing a decentralized query execution layer would allow dApp developers to ship more reliable dApps faster and with fewer resources. It would also enable dApps to become fully decentralized.

Decentralized Query Protocol

dApps allow users to stay in control of their data. The growth of dApps also lets users earn financial incentives for contributing to the improvement of larger, further-reaching public commons. For this to work, the apps being developed need to agree on certain standardized names and on a common way of querying data. dApps already cover services in domains such as finance, arts, technology, and gaming. The decentralized query protocol is defined as the collection of rules by which clients pay a decentralized network of nodes for indexing, caching, and querying data stored on public blockchains and decentralized storage networks such as IPFS and Swarm. The basic mission of The Graph is to create a full-stack decentralized environment in which applications are powered entirely by public infrastructure. Full-stack decentralization lets users know an application inside and out, and assures them that it will not disappear into thin air.
Currently, we pay third parties for running, maintaining, and providing the various services an application depends on. To shift to decentralization, we need to move towards paying networks of decentralized service providers for granular usage of these resources. The Graph Network decentralizes the API and query layer of the internet application stack, which lets us query blockchain data without relying on a centralized third party.

The query protocol that The Graph provides is meant to meet the following requirements, which help us move towards Web3:

  1. A client using The Graph can trust the results of a query without verifying them individually.
  2. A client can pay efficiently for each query processed by the network, without risk to either the client or the nodes.
  3. A client can pay for predictable performance of queries run against specific data sources.
  4. A client can pay to keep data available for running queries on specific data sources.
  5. A client can pay for queries, performance, and data availability efficiently.
  6. Financial incentives in the form of tokens encourage the growth of the network.

Design

The Graph implements a protocol that synthesizes ideas from distributed computing and crypto-economics to produce a network that is self-organizing, robust, and secure. The protocol can be divided into several sub-protocols, which can be treated as distinct, interoperable layers.

Sub-Protocols in the Graph

The layers here are the various sub-protocols, and each plays a crucial role of its own.

  1. Query Execution Marketplace: The Query Marketplace lets end users pay Query Nodes for individual queries issued against a specific data source. The price of these queries is set by Indexers and varies with the cost of indexing the subgraph, the demand for queries, the amount of curation signal, and the market rate for blockchain queries.
  2. Indexing and Caching Marketplace: While the Query Marketplace incentivizes Query Nodes to respond to individual queries, it does not guarantee that Query Nodes will be available to process a query with good performance. To address this, the Indexing and Caching Marketplace compensates Query Nodes for providing a specific service-level agreement: a promise to be available to process queries for a specific data source within certain latency and cost bounds.
  3. Governance: Because The Graph indexes data for dApps, the output data matters to end users, so The Graph provides a way to vote on what data should be included and what should be excluded.
  4. Payment Channels: To keep transaction costs low at high volume, The Graph uses payment channels for micro-transactions. A channel can connect two nodes one-to-one, or it can be used many-to-many between groups of nodes.
  5. Query Execution: The query execution sub-protocol is split into five distinct stages:
    a. Query Splitting: The query is first split into fragments, which may correspond to multiple data sources.
    b. Service Discovery: The service-addressable network is then used to find a P2P Node with a routing table for the specific service group.
    c. Query Routing: The gateway node that originated the query decides which fragment should be forwarded to which Query Node.
    d. Nested Query Processing: Each query fragment corresponds to a single entity, so if the user issues a nested query, its nested parts are treated as separate query fragments.
    e. Response Collation: Once all the query fragments have been executed, some in series and some in parallel, the results are collated into a response that conforms to the GraphQL specification.
  6. Consensus Layer: This layer guarantees that the mechanisms in the protocol are immutable and irreversible, and that they can be carried out without interference from centralized parties. Existing blockchains such as Ethereum are used for this purpose.
  7. P2P Network: The Graph's P2P network is used to locate nodes capable of providing a particular service, which can be any arbitrary computational work. The design of the P2P network sub-protocol is modular with respect to the service being provided.
  8. Storage Layer: The protocol can work with various kinds of storage through Storage Adapters, an idea inspired by IPLD. Supported backends may include blockchains such as Ethereum and storage networks such as IPFS.
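
The query execution stages in item 5 can be illustrated with a small in-memory sketch. This is not The Graph's implementation: the routing table, the node names, the sample data, and the dictionary-based query format are all hypothetical stand-ins, and nested query processing is left out for brevity. The sketch only shows the shape of the flow: split a query into one fragment per data source, route each fragment to a node, execute the fragments in parallel, and collate the results under a single `data` key, as a GraphQL response does.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical routing table: data source name -> query node serving it.
ROUTING_TABLE = {"tokens": "query-node-a", "transfers": "query-node-b"}

# Hypothetical in-memory data standing in for what each node has indexed.
NODE_DATA = {
    "query-node-a": {"tokens": [{"id": "1", "symbol": "GRT"}]},
    "query-node-b": {"transfers": [{"id": "7", "amount": 10}]},
}


def split_query(query):
    """Query splitting: one fragment per top-level data source."""
    return [{"source": source, "fields": fields} for source, fields in query.items()]


def route_fragment(fragment):
    """Service discovery and routing: find the node for this data source."""
    return ROUTING_TABLE[fragment["source"]]


def execute_fragment(fragment):
    """Run one fragment on its node, keeping only the requested fields."""
    node = route_fragment(fragment)
    rows = NODE_DATA[node][fragment["source"]]
    selected = [{field: row[field] for field in fragment["fields"]} for row in rows]
    return fragment["source"], selected


def run_query(query):
    """Split, route, execute in parallel, then collate into one response."""
    fragments = split_query(query)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(execute_fragment, fragments))
    return {"data": dict(results)}


response = run_query({"tokens": ["symbol"], "transfers": ["id", "amount"]})
```

Here the gateway role is played by `run_query`, which decides where each fragment goes and assembles the final response.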

Query Language

GraphQL is a query language developed and open-sourced by Facebook. It is mostly used to make APIs fast, flexible, and developer-friendly. The Graph is built to support queries written in GraphQL. SQL is the most popular query language, but it is better suited to centralized API servers and data access layers. dApps must function without centralized infrastructure, so what matters is that dApp clients can query data directly from the front end in a flexible way. GraphQL meets this criterion and has seen accelerating adoption in the web and mobile communities.
A dataset is defined as data that exists on a public blockchain or decentralized storage network, and a data source is composed of a schema and one or more datasets. All queries in The Graph are executed against a particular data source. The data component identifies the raw data in the decentralized storage layer: it contains an identifier for the storage system being used and the location of the raw data in that system. The schema is a GraphQL SDL schema that defines the entities, values, types, and relationships that may be queried. The mappings define how the data maps to a particular schema, and include metadata about the format in which the data is stored, such as CSV or a custom binary format.
The protocol also assigns the following functional roles:

  1. dApp Client: A front-end application running on the user's machine that queries The Graph.
  2. Gateway Node: A node that acts as an HTTP, WebSocket, or JSON-RPC endpoint for dApp clients to query The Graph.
  3. P2P Node: A node that participates in the P2P Network.
  4. Query Node: A node that participates in query processing.
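
To make the relationship between a dApp client and a gateway node more concrete, here is a minimal sketch of how a client might prepare a GraphQL query for an HTTP gateway. The gateway URL, the `tokens` entity, and its fields are hypothetical and only for illustration. The JSON body with `query` and `variables` keys follows the common convention for sending GraphQL over HTTP. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Hypothetical gateway endpoint; a real deployment would publish its own URL.
GATEWAY_URL = "https://gateway.example/graphql"

# Hypothetical query: the `tokens` entity and its fields are illustrative only.
TOKENS_QUERY = """
query Tokens($first: Int!) {
  tokens(first: $first) {
    id
    symbol
  }
}
"""


def build_graphql_request(gateway_url, query, variables=None):
    """Build (but do not send) an HTTP POST carrying a GraphQL query."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        gateway_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(GATEWAY_URL, TOKENS_QUERY, {"first": 5})
```

Passing `request` to `urllib.request.urlopen` would send it to the gateway, which would then hand the query over to query nodes on the client's behalf.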

Token

The Graph introduces a new token, the Graph Token (GRT), which plays a vital role in securing and governing the network, in line with the role of cryptocurrencies in blockchain networks discussed earlier. GRT is an ERC-20 token on the Ethereum blockchain that is used to allocate resources on the network. The Graph Network also lets diverse, active participants earn income for providing data services. Both technical and non-technical people can contribute to The Graph Network and the open data economy in a variety of ways.

  1. Indexers: Indexers are node operators in The Graph Network that provide indexing and query processing services. In exchange, they earn query fees and indexing rewards paid in Graph Tokens (GRT). This role requires advanced technical skills.
  2. Curators: Curators are subgraph developers, data consumers, or community members who signal to Indexers which APIs should be indexed by The Graph Network. Curators deposit GRT into a bonding curve to signal on a specific subgraph and earn a portion of the query fees for the subgraphs they signal on, which incentivizes them to identify the highest-quality data sources. This role requires a moderate level of technical knowledge.
  3. Delegators: Delegators are individuals who would like to contribute to securing the network but do not want to run a Graph Node themselves. They contribute by delegating GRT to existing Indexers and earn a portion of query fees and indexing rewards in return. Delegators select Indexers based on performance measures such as query fee rates, past slashing, and uptime, as well as delegation parameters such as the Indexer's cut of fees and rewards. Very little technical knowledge is required for this role.
  4. Consumers: Consumers are the end users of The Graph, who query subgraphs and pay query fees that go to Indexers, Curators, and Delegators. Consumers are likely to be developers or projects that cover query fees for their applications, much as they would cover AWS or other cloud service costs. However, some applications will pass query fees on to users or bundle the cost into product fees. Consumers will pay query fees via “gateways” or wallets built on top of open-source contracts in The Graph Network.
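
The relationship between an Indexer's cut and what its Delegators earn can be illustrated with a simplified sketch. The rules below are assumptions made for illustration, not the protocol's actual reward formula: the Indexer keeps a fixed percentage of the fees (the hypothetical `indexer_cut_percent` parameter), the rest is shared among Delegators in proportion to the GRT they delegated, and any rounding remainder stays with the Indexer. The Curators' share of query fees is ignored here.

```python
def split_query_fees(total_fees, indexer_cut_percent, delegations):
    """Split fees between an Indexer and its Delegators (simplified).

    total_fees: fees collected, as an integer number of token units.
    indexer_cut_percent: share the Indexer keeps, from 0 to 100
        (hypothetical parameter name).
    delegations: mapping of delegator name -> amount of GRT delegated.
    Returns (indexer_share, payouts_by_delegator).
    """
    if not 0 <= indexer_cut_percent <= 100:
        raise ValueError("indexer_cut_percent must be between 0 and 100")

    indexer_cut = total_fees * indexer_cut_percent // 100
    delegator_pool = total_fees - indexer_cut
    total_delegated = sum(delegations.values())

    payouts = {}
    if total_delegated > 0:
        for name, amount in delegations.items():
            # Pro-rata share of the pool, rounded down to whole units.
            payouts[name] = delegator_pool * amount // total_delegated

    # The Indexer keeps its cut plus whatever was not paid out
    # (rounding remainders, or the whole pool if nobody delegated).
    indexer_share = total_fees - sum(payouts.values())
    return indexer_share, payouts


indexer_share, payouts = split_query_fees(1000, 20, {"alice": 300, "bob": 100})
```

With these example numbers the Indexer keeps 200 units and the remaining 800 are split three to one, giving 600 to `alice` and 200 to `bob`. A lower cut leaves more for Delegators, which is one reason the cut matters when they choose an Indexer.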

Conclusion

The Graph, a decentralized query protocol, is an efficient way of indexing blockchain data without interference from centralized parties. It helps us move towards a Web3 in which dApps dominate, putting us back in control of our own data. Many products and services can then be built on interchangeable datasets, and users will be able to switch between dApps easily. The Graph could mark the beginning of the end of data monopolies.

References

The Graph: A Decentralized Query Protocol for Blockchains.
IPLD/specs. GitHub, accessed March 5, 2018. https://github.com/ipld

If you require any further information, please do not hesitate to contact me.

Portfolio : rithurajnambiar17.github.io

LinkedIn: https://www.linkedin.com/in/rithuraj-nambiar/

GitHub: https://www.github.com/rithurajnambiar17/

E-Mail: rithurajnambiar17@gmail.com
