Building a Web 3.0 system that will fail just like you planned — Part 1 — RPCs

Or Kazaz
5 min read · Aug 24, 2023


Your Web 3.0 system will fail. Be prepared for it.

Like any modern system, your Web 3.0 DeFi system must be built with failures in mind, and failing to do so can cost you a lot of money.

What will your users experience if your DB is currently down? You’re getting quotes of the user’s favorite stocks from Yahoo Finance; what will happen if that fails?
Most of you will probably answer that you have a backup—either a DB replication or another API source for quotes. You understand the importance of having fallbacks.

But what are some critical fallbacks required in Web 3.0 and DeFi development? And why will failing to build them cost you a lot of money?
In this part, we’ll review the pitfalls of using an RPC server with examples and solutions from our day-to-day Web 3.0 and DeFi development at ToK Labs.

You can also check out part 2, in which I’m diving into an interesting problem we had with DEX aggregators and how we solved it.

I will not talk about everything.

This article is not intended to teach you Web 3.0 principles. I’ll do my best to shed some light on certain terms, but you need to have some notion of the concept of swaps & providing liquidity on AMMs (DEXs).

What do we even do?

At ToK Labs, we’ve developed an automated market-making system to provide liquidity in the most profitable markets while taking an impermanent loss risk.

As such, we’re performing basic tasks every second:

  • Perform read-only (“get”) RPC calls to query the blockchain.
  • Get quotes for different token pairs.
  • Find the most profitable swap between multiple DEXs on the blockchain when entering/exiting a position. We do this using DEX aggregators.
  • Submit blockchain transactions via an RPC server.

In this article, we’ll focus on the RPC servers. But first, let’s understand a bit more about them.

RPC servers

To communicate with the blockchain, an RPC server is needed.
The slightest delay in getting the correct information or submitting the transaction costs us money.
So this is one of the most critical decisions we had to make.

You have a few options to achieve such access, each with its pros and cons:

Public RPC servers

These are free servers that require no setup. Just pick one and get started. The problem here is that you have no guarantee of the quality of the server. We couldn’t afford that limitation.

Set up your own node

You can spin up your own node in your own VPC. This will save you networking costs with your cloud provider, and you will enjoy the fastest responses. This sounded like the ideal solution for us.
But nothing comes for free.
Maintaining nodes can be expensive in both human hours and cloud provider bills. Starting out with this approach would have meant taking our eyes off the ball.
This will be the best option for us in the future, but not for now.

Use an RPC provider

Like using a cloud provider instead of setting up your own data center and services, choosing an RPC provider is an ideal balance between speed, quality, and costs.
This is why we’ve chosen this approach.

The leading RPC providers out there are Alchemy, QuickNode, and Infura.

What can go wrong?

A lot. Here are some of the cases we ran into and how we’ve dealt with them.

Illegitimate failed RPC requests

These can be a 503 from the server, a 429 rate limit, a timeout, or any other failure that does not reflect a legitimately failed transaction or gas estimation.
These errors can be categorized as “it happened only on that RPC provider; if I tried it on a different one, it would have worked.”
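
To make that category concrete, here is a minimal sketch of a classifier that decides whether a failure is worth retrying elsewhere. The RpcError shape, the status list, and the "TIMEOUT" label are assumptions for illustration; real libraries expose this information differently, so adapt the checks to the errors you actually see.

```typescript
// Hypothetical error shape: real libraries expose these fields differently.
interface RpcError {
  httpStatus?: number; // e.g. 429 or 503 when the HTTP layer failed
  code?: string; // e.g. "TIMEOUT" (an assumed label for this sketch)
}

// HTTP statuses that usually mean "this provider is unhealthy or throttling us".
const PROVIDER_SIDE_STATUSES = new Set([429, 500, 502, 503, 504]);

// Returns true when the same request has a chance of succeeding on another provider.
function isProviderSideFailure(err: RpcError): boolean {
  if (err.httpStatus !== undefined && PROVIDER_SIDE_STATUSES.has(err.httpStatus)) {
    return true;
  }
  return err.code === "TIMEOUT";
}
```

Anything this function rejects (a reverted call, for example) would most likely fail on every provider, so retrying it only wastes time.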

So what we wanted was a system to perform the RPC request, and upon failure:

  • Retry it with the same provider X amount of times
  • Wait Y ms between retries
  • If all retries fail, try the next RPC provider in the list
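
The three steps above can be sketched as a small helper. This is not our production code, just a minimal illustration: each provider is modeled as a plain async function, and callWithFallback, retries, and delayMs are names made up for this example.

```typescript
// A provider is modeled as any async function that performs one RPC request.
type RpcCall<T> = () => Promise<T>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Tries each provider in order. Each one gets 1 attempt + `retries` retries,
// with `delayMs` between attempts, before we move on to the next provider.
async function callWithFallback<T>(
  providers: RpcCall<T>[],
  retries: number,
  delayMs: number
): Promise<T> {
  let lastError: unknown = new Error("No RPC providers configured");
  for (const call of providers) {
    for (let attempt = 0; attempt <= retries; attempt++) {
      try {
        return await call();
      } catch (err) {
        lastError = err;
        // Only wait if this provider still has attempts left.
        if (attempt < retries) await sleep(delayMs);
      }
    }
  }
  // Every provider exhausted its attempts: surface the last error seen.
  throw lastError;
}
```

A call could look like callWithFallback([() => primary.getBlockNumber(), () => backup.getBlockNumber()], 2, 250), where primary and backup are two provider instances. Blind retries like this are only safe for read requests; retrying a transaction submission needs extra care.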

There are some solutions out there that we tried, like ethers’s FallbackProvider (which has the perfect name for our use case).
Unfortunately, it has many, many issues. It wasn’t usable for us.

So, we built one that does all of that.
Now, we have confidence that we’re immune to these types of errors.

For those trying to do something similar, the best place to start is extending ethers’s JsonRpcProvider and overriding its perform, send, and detectNetwork methods.

Maybe one day we’ll open-source it 😉

Failed RPC request, BUT the tx was actually successful

Take a look at this code here using ethers:

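import { Wallet } from "ethers";

const rpc = getRpcProvider();
// Pass the provider to the constructor: wallet.connect() returns a new Wallet and does not modify the original
const wallet = new Wallet(WALLET_PRIVATE_KEY, rpc);

await (await wallet.sendTransaction(tx)).wait();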

Suppose this throws an error from the RPC server even though the tx was actually submitted to the blockchain and may still succeed.

Submitting another tx could be problematic for several reasons:

  • You never intended for it to happen. You might lose a lot of money due to that (for example, performing another expensive swap you never meant to do).
  • You might use the same nonce twice, which will lead to another error or a tx replacement.

That was an interesting case that happened a lot.
Here is how we solved it:

const rpc = getRpcProvider();
// Pass the provider to the constructor: wallet.connect() returns a new Wallet and does not modify the original
const wallet = new Wallet(WALLET_PRIVATE_KEY, rpc);

const txResponse = await wallet.sendTransaction(tx);

// Store the txResponse.hash in your DB

await txResponse.wait();

The difference now is that you first store the tx hash in your DB and only then wait for the tx to finish.
Now, before you retry the tx after a failed RPC request, you can do something like this:

const existingTxData = await waitOnExistingTx(txHash);

if (!existingTxData) {
  Logger.log(`Tx ${txHash} does not exist, it might be dropped or replaced. Check if you're using different nonces for each tx`);
} else if (existingTxData.txReceipt.status !== 1) {
  Logger.log(`Tx ${txHash} has failed`);
} else {
  // The tx was successful! Continue from here
}

And waitOnExistingTx will look something like this:

import { ethers } from "ethers";

export async function waitOnExistingTx(
  hash: string
): Promise<{ txResponse: ethers.providers.TransactionResponse; txReceipt: ethers.providers.TransactionReceipt } | undefined> {
  const rpc = getRpcProvider();

  // Resolves to null if the node does not know this tx hash
  const txResponse = await rpc.getTransaction(hash);

  if (txResponse) {
    // The reason we're using waitForTransaction and not txResponse.wait() is that the latter does not know how to handle replaced/dropped tx
    const txReceipt = await rpc.waitForTransaction(hash);

    return { txReceipt, txResponse };
  }

  // The tx is unknown to the node: it was never submitted, or it was dropped or replaced
  return undefined;
}

Problem solved.
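
Putting the pieces together, the whole "check before you resend" flow can be sketched as one guard function. This is a simplified illustration and not our actual implementation: submitOnce, intentId, TxStore, and checkStatus are hypothetical names, the store stands in for your DB, and checkStatus stands in for a lookup like waitOnExistingTx.

```typescript
type TxStatus = "success" | "failed" | "missing";

// Stand-in for your DB: maps a logical operation ("intent") to its tx hash.
interface TxStore {
  get(intentId: string): string | undefined;
  set(intentId: string, txHash: string): void;
}

// Sends a tx for `intentId` unless an earlier attempt already succeeded on-chain.
async function submitOnce(
  intentId: string,
  store: TxStore,
  send: () => Promise<string>, // submits the tx and returns its hash
  checkStatus: (txHash: string) => Promise<TxStatus>
): Promise<string> {
  const previousHash = store.get(intentId);
  if (previousHash !== undefined) {
    const status = await checkStatus(previousHash);
    // The earlier attempt went through: reuse it instead of paying twice.
    if (status === "success") return previousHash;
  }
  const txHash = await send();
  store.set(intentId, txHash); // persist the hash before waiting on the tx
  return txHash;
}
```

With this shape, a retry after a failed RPC request goes through the same function, and a second swap is only sent when the first one is known to have failed or gone missing.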

Delayed blocks

Sometimes the RPC node is far behind the head of the blockchain. That means you’re making decisions based on old data.

I won’t dive deep into how we dealt with it during runtime.
But we have a system in place that constantly tests how far behind each RPC server we work with is. If one falls too far behind, we get a notification about it.
This is one of the reasons we removed GetBlock from our RPCs list.
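
The core of such a check can be very small. Here is a minimal sketch, assuming you have already collected the latest block number from each provider (for example with ethers’s getBlockNumber()); findLaggingProviders and maxLagBlocks are hypothetical names, and the right threshold depends on the chain’s block time.

```typescript
// Given the latest block number reported by each provider, returns the names
// of providers that are more than `maxLagBlocks` behind the highest head seen.
function findLaggingProviders(
  heads: Record<string, number>,
  maxLagBlocks: number
): string[] {
  const values = Object.values(heads);
  if (values.length === 0) return [];
  // Treat the highest reported block as the best estimate of the chain head.
  const best = Math.max(...values);
  return Object.entries(heads)
    .filter(([, head]) => best - head > maxLagBlocks)
    .map(([name]) => name);
}
```

Run on a schedule, the returned list is what you would send to your alerting channel. Note that this only compares providers with each other, so it cannot detect the case where all of them lag together.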

This covers most of the problems we encountered while using an RPC provider and how we overcame them. The same issues apply to public RPCs and to running your own node.

It’s true in every modern system, but mistakes on a Web 3.0 DeFi system could instantly cost you a LOT of money. Always build for failures.

In part 2, I’m diving into an interesting problem we had with DEX aggregators and how we solved it.

Or Kazaz

Chief Architect @ Uplifted.ai | Consulting R&D organizations/startups | Ex Director @ Autodesk | Entrepreneur at ❤️