The Web3 RPC Problem: A Diary Reflection on Crypto’s Infrastructure Dilemma

KagemniKarimu
Published in Lava Network
7 min read · Mar 16, 2023
Convenience or decentralization: who’s going to run the nodes?

The internet has evolved to become user-friendly, to hide the “machinery”, if you will. Most people browse websites without ever thinking deeply about the servers humbly humming while dishing out byte streams. In today’s web, the term “Cloud” evokes a magical quality: when something is on the cloud, it is hosted somewhere, everywhere, or nowhere! It is not always clear, from the terms users use and the approaches developers take, exactly how and where the web does what it does.

Blockchains, as distributed ledgers and consensus-making state machines, are even more mystical. A blockchain is said to be distributed, but distributed how and where? Some marketing speak would lead you to believe that blockchains are scaffolded out of thin air and hang nebulously in the “Ecosystem.” You’ll note the similarity to the hand-waviness of the “Cloud.” However, there is a simple law of the virtual world to observe, and it holds fast, even in web3! When it all boils down, everything is hosted somewhere.

What’s in a blockchain?

So, what about a blockchain? A blockchain is hoisted (held up) by nodes, and nodes are hosted on servers. There are nodes everywhere: full nodes, validator nodes, light nodes, even archival nodes! Each node serves a consensus-bearing copy of a particular blockchain, specialized with its specific architecture and interfaces. Like servers under the traditional model, these nodes respond to calls, queries, and requests. But each node only answers to the blockchain(s) around which it is built. Nodes bear a tremendous technological burden, and they do it with very little recognition of how critical they are to the success of Web3.

Types of blockchain nodes

The problem is that these nodes, by and large, are expensive to start, difficult to host, and tedious to maintain. Think of the idea of hosting your own server to access the web or create a web application. Under web2, this is solved by data and hosting centers. Data and hosting centers are ubiquitous because they are both profitable and affordable. By providing Infrastructure as a Service (IaaS) or Platform as a Service (PaaS), data and hosting service providers allow commercial upstarts to rely upon them as opposed to creating expensive on-premises infrastructure and platforms. This is the exact tradeoff which has led to the dominance of cloud hosting!

Right now, web3 does not have an elegant solution to this problem. Because of this, web3 currently experiences downtime during NFT minting events, privacy leaks caused by unsafe RPC, DNS hijacking at RPC gateways, censorship by centralized RPC providers, and a lack of reliable RPC support for many chains and ecosystems. In this environment, web3 developers can sometimes be seen on community Discords calling into the void for RPC access on their favorite chains!

Amidst all this, there are umpteen remote procedure calls made to nodes of different sizes, architectures, and geolocations. Each one must humbly return blockchain data without grumbling, sputtering, or failing. Web3 needs to be hosted somewhere. It needs its fancy tradeoffs… It needs the magic of what cloud has done for web2… At the end of the day, someone is going to be responsible for hosting these damn nodes.
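For readers who have never looked at one of these calls up close, here is a minimal sketch of what a single request and reply can look like. It assumes an Ethereum-style JSON-RPC 2.0 interface and its `eth_blockNumber` method; other chains expose different methods. The helper names and the sample reply are made up for illustration, and nothing here touches the network.

```python
import json


def build_request(method, params=None, request_id=1):
    """Serialize a JSON-RPC 2.0 request body (hypothetical helper)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params or [],
    })


def parse_block_number(response_body):
    """Decode the hex-encoded quantity in an eth_blockNumber reply."""
    reply = json.loads(response_body)
    if "error" in reply:
        raise RuntimeError(f"RPC error: {reply['error']}")
    return int(reply["result"], 16)


# The body a client would POST to a node's RPC endpoint.
request_body = build_request("eth_blockNumber")

# A made-up reply with the shape a node would send back.
sample_reply = '{"jsonrpc": "2.0", "id": 1, "result": "0x10d4f"}'
block_number = parse_block_number(sample_reply)
```

Every wallet balance, contract read, and transaction submission is some variation on this exchange, which is why the node on the other end matters so much.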

Shoutout to those who run their own nodes ❤

To that end, we’ve tried several solutions:

1. The Paid Host: Centralized providers with or without SLAs

We’ve tried the centralized route of big providers of RPC. These providers offer nodes and endpoints at a premium to paying customers. This works pretty well but does not remove the central danger. This is essentially a hardened single point of failure: a proprietary RPC platform or node secured by a single team’s best efforts. Centralized providers control the infrastructure they provide, which means that if there is a problem with their infrastructure, it affects everyone.

If a centralized provider goes down, experiences an outage, or acts unethically, it can be catastrophic for anyone reliant on their service or those reliant on those reliant on their service (read: everyone). This can even happen accidentally, as when an entire country’s IP range is forbidden by a provider (see above)! Additionally, this approach is essentially undemocratic. Small but growing, passionate communities may have difficulty getting support for the addition of their respective chains and projects of interest.
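One common way applications soften this single point of failure is to keep more than one provider on hand and fall back when the first one fails. The following is only a sketch of that idea: the endpoint URLs are placeholders, and `send` is a hypothetical transport function injected by the caller, so the example runs without any network access.

```python
def call_with_failover(endpoints, send, payload):
    """Try each endpoint in order and return (url, reply) from the first that answers."""
    errors = {}
    for url in endpoints:
        try:
            return url, send(url, payload)
        except Exception as exc:  # any failure moves on to the next endpoint
            errors[url] = exc
    raise RuntimeError(f"all endpoints failed: {errors}")


# Stand-in transport for the demo: the first provider is "down".
def fake_send(url, payload):
    if url == "https://rpc-a.example":
        raise ConnectionError("provider outage")
    return {"jsonrpc": "2.0", "id": payload["id"], "result": "0x1"}


served_by, reply = call_with_failover(
    ["https://rpc-a.example", "https://rpc-b.example"],
    fake_send,
    {"jsonrpc": "2.0", "id": 1, "method": "eth_blockNumber", "params": []},
)
```

Note that this only moves the problem around: the application now depends on two providers instead of one, and it still has to trust whichever one answers.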

2. The Free Host: A volunteer army, armed with the Web3 Ethos

We’ve tried the altruistic approach. There are many volunteer nodes which give information about and access to blockchains on demand at no cost. They’re good and we’re very appreciative of their service. While this can be a cost-effective solution for the requestor in the short term, it does not provide any guarantees of security, availability, confidentiality, or reliability. In fact, building on public RPC can lead to significant performance bottlenecks and can be a major hindrance to the development of more complex applications.

As of today, many of these RPC endpoints are hosted (begrudgingly) by blockchain projects themselves, who want to ensure enduring access to their chains! Maintaining infrastructure is a full-time endeavor that teams will do for the good of their project, but as a supplementary service, in hopes that the community will eventually take up the mantle. Alas, running an RPC operation for free has proven unsustainable: it takes lots of resources and work to maintain, which could, in most cases, be better spent elsewhere. Because of this, public RPC often has significant latency, uptime, and other performance issues, plus no assurances of quality of service or privacy. All that to say, there’s certainly a need for plenty of public RPC nodes, but they cannot be the only solution.

3. The Mavericks: RPC Consumers hungry for RPC who decide to bootstrap their own nodes

We’ve tried the bootstrap approach. Developers can and, in some cases, should run their own nodes. However, developers running their own nodes can face real technical issues. Running a node and writing a smart contract are not necessarily overlapping skills; running a node requires significant domain expertise and resources, more akin to system administration. You need to constantly monitor and update a node to ensure that it is functioning correctly. If a developer does not have the necessary technical expertise or resources to maintain their node properly, it can lead to downtime, security vulnerabilities, or other technical issues.

Besides that, the minimum requirements to run a node can be quite high. You’ll need server-level hardware available to accomplish the task. Historically, very few people who want to develop applications are interested in hosting the infrastructure necessary to do so. This is what caused Vitalik Buterin to notoriously quip that running a full node is a “weird mountain man fantasy.” We can depend on the mavericks to exist! However, expecting every developer to be a maverick is akin to requiring software engineers to build the desktop they program on: it will appeal to the most hardcore, but it will discourage and shut out many more than it attracts.

4. Aggregators: Indexes, Indexers, and Public Lists

A final and noteworthy approach is to create tooling that climbs across space and uptime, and aggregates usable nodes and RPC endpoints. This approach is of personal interest to me because it is a clever way of blending the prior 3 approaches. There are some pretty cool solutions here, here and here. An issue remains: outside of the weak quality check that comes with ensuring a node’s endpoints are responsive, there is no means to protect against fraud or to incentivize the best performers on a list. In fact, some of these aggregators can have the opposite of the intended effect on public RPC. Rewarding the best performers with higher traffic but no incentive may cause them to rate-limit access and reduce performance in order to maintain uptime. Still, they are a somewhat elegant solution to a hard problem, and with some work (and combined with the other three) they can help close the gap.
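To make the "weak quality check" concrete, here is a rough sketch of the kind of responsiveness ranking an aggregator might perform. It is not taken from any particular tool: the endpoint URLs and latencies are invented, and `probe` is a hypothetical function that returns a latency in seconds, or `None` when an endpoint does not answer.

```python
def rank_endpoints(endpoints, probe):
    """Return the responsive endpoints, fastest first."""
    results = []
    for url in endpoints:
        latency = probe(url)  # seconds, or None if the endpoint did not answer
        if latency is not None:
            results.append((latency, url))
    return [url for latency, url in sorted(results)]


# Invented measurements standing in for real probes.
fake_latencies = {
    "https://node-1.example": 0.42,
    "https://node-2.example": None,  # unresponsive, so it is dropped
    "https://node-3.example": 0.08,
}

ranked = rank_endpoints(list(fake_latencies), fake_latencies.get)
```

Nothing in this check verifies that the data an endpoint returns is correct, and nothing rewards the endpoint at the top of the list for the extra traffic it is about to receive. That is exactly the gap described above.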

So, we have tried solutions. Our solutions are usable but sub-optimal. This is the Web3 RPC Problem. As developers under web3 today, we’re all digital nomads in search of reliable nodes to pull blockchain data from. We deserve better solutions to this problem, with clearer ties and commitments to privacy, security, availability, and reliability. In my next article on this topic, I’ll discuss Lava and why it could be a solution to the Web3 RPC problem. If you’re interested in solving the Web3 RPC problem together, stay tuned!

A closing thought ;-)

We should accept the premise that people will not run their own servers by designing systems that can distribute trust without having to distribute infrastructure. — Moxie
