How we built the best Ethereum node infrastructure service on 🌎.
And what we really mean when we say "best".
In the beginning, there was Infura.
When we started building the open source technology that would ultimately become the beating heart of Rivet, we didn't have plans to offer it as a service. We were building it for OpenRelay, our 0x order book infrastructure project.
Back then, developers had two choices for Ethereum infrastructure: Infura or DIY. And while the truly OG teams frequently took it upon themselves to build and host their own nodes, the vast majority of request traffic was going through Infura.
While we were bound and determined to go the OG route and host our own nodes, we quickly learned what every team attempting to host a high-capacity, high-availability service for a large number of simultaneous users learns:
Ethereum node clusters are brittle, complex, compromise-laden, soul-consuming, time-sucking, anxiety-inducing single points of catastrophic failure that require constant babysitting.
The uninitiated might think, "What's the big deal? You should treat it like any server: just spin up another instance and kill the old one. What a bunch of noobs!" But if you're a seasoned Ethereum developer, you'll know that Ethereum nodes have to sync with the network before they can respond to requests, a process that means saving the entire history of the Ethereum blockchain to disk. And you'll know it often takes half a day or more to do so, presuming you're building from a relatively recent backup. If you're starting from scratch, it can take days.
Even in the best-case scenario, you're in for a devastatingly long outage if all of your nodes happen to fail at once.
And that's just the start of it.
If you're running multiple nodes behind a load balancer, there is almost always a chance that your nodes will not all be synchronized with one another, since new blocks propagate unevenly across a peer-to-peer network like Ethereum.
That means your query results can conflict internally! For example, imagine you run a cluster of two nodes behind a load balancer.
Your dapp executes a function that fetches a token balance at the latest block and displays it in a UI. First, it queries for the latest block number. Then, based on the result, it queries for the token balance at that specific block number.
Now imagine the load balancer sends the first query to Node 1, which is at that moment one block ahead of Node 2. It returns the result and the function fires off a request to get the token balance at that block.
This time, however, the load balancer routes the request to Node 2, which hasn't yet caught up to Node 1. Worse than one might expect, it doesn't return an appropriate error response, just an outdated balance.
Now, this won't always happen, but it will always have some probability of happening. It'll also be unpredictable, because how often that probability is realized depends in part on how much traffic you're getting.
While you might mitigate this problem in several different ways, no mitigation will ever be 100% effective, and all of them will make the user experience inconsistent and, for lack of a better word, a little (or a lot) janky.
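To make the scenario above concrete, here is a minimal toy simulation. It assumes a simple round-robin load balancer and models the stale-read behavior exactly as described in the text. `ToyNode` and `RoundRobinBalancer` are hypothetical names invented for this sketch, not part of any real Ethereum client or library.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class ToyNode:
    """Hypothetical stand-in for an Ethereum node; not a real client API."""
    name: str
    balances_by_block: dict  # block number -> token balance

    def latest_block(self) -> int:
        return max(self.balances_by_block)

    def balance_at(self, block: int) -> int:
        # Assumption taken from the text: a node that has not yet seen
        # `block` answers from its own newest state instead of erroring.
        known = min(block, self.latest_block())
        return self.balances_by_block[known]


class RoundRobinBalancer:
    """Hypothetical load balancer that alternates between nodes."""

    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def next_node(self) -> ToyNode:
        return next(self._nodes)


# Node 1 has already seen block 101; Node 2 is one block behind.
node1 = ToyNode("node-1", {100: 50, 101: 75})
node2 = ToyNode("node-2", {100: 50})
lb = RoundRobinBalancer([node1, node2])

latest = lb.next_node().latest_block()     # routed to node-1
shown = lb.next_node().balance_at(latest)  # routed to node-2, which is stale
true_balance = node1.balance_at(latest)

print(f"block {latest}: UI shows {shown}, actual balance is {true_balance}")
```

In this toy run the UI would display the balance from block 100 while labeling it as block 101. With real traffic, whether the two requests land on differently synced nodes is a matter of timing, which is why the failure is intermittent.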
Topping it all off, if you get a sudden boost in traffic, you're gonna need a lot of idle capacity, since you can't spin up new nodes very quickly. And if you don't have enough idle capacity, your nodes will be overrun, degrading your service and potentially causing them to crash.
And those are just the broad strokes of the challenge.
Nightmare, right? You can see why many developers preferred to make it Infura's problem.
The trouble with Infura.
Given all that can go wrong, you might wonder why we didn't just set up a free endpoint with Infura and call it a day.
Well, there were three main reasons.
1. Infura and Goliath
I probably don't have to explain to you that Web3 is all about correcting the mistakes of Web2. That's why we all got into this to begin with: to build a cure for the moral hazard now reified by Big Tech, inherent to centralized control of vast amounts of user data.
If everyone used Infura, history would just be repeating itself, and we didn't want to contribute to the trend.
2. Infura and Classic Jaguars
If you've ever tried to get replacement parts for a 1969 E-type Roadster, you'll know that the parts take a long time to ship from the UK, that they're expensive AF, and that some parts have to be purchased from a salvage yard because the OEM has discontinued them. Waiting 3-5 weeks for a procurement agent to source and ship you a $100 taillight bulb is no fun, and it's also a great example of why most people don't make classic Jaguars their daily drivers.
When you're trying to build an open source project (OpenRelay, as with all our products, is entirely open source), you don't want to build dependencies on things that may one day be a lot like that classic Jaguar's taillight bulbs: proprietary and not guaranteed to be affordable or readily available down the road.
3. Infura and the Power of Control of the Vertical
We wanted to build OpenRelay to take full advantage of the capabilities of Geth, and we wanted to optimize it to be as efficient as possible without worrying that something might change that would undermine or break our optimizations.
When you build dependencies on proprietary code managed by third parties, you're locked in to their decisions, and the inherent downstream limitations those decisions impose, for as long as the dependency exists.
So all things considered, Infura wasn't really an option we could live with.
Enter the EtherCattle Initiative: our evil master plan to build a no-compromises easy-to-manage open source Ethereum node cluster architecture.
Rather than bite the bullet and resign ourselves to mitigating all of the issues with the "nodes-behind-a-load-balancer" approach, we came up with something different, and the EtherCattle Initiative was born.
Thanks in part to grant funding from the 0x Project, we built a system that enabled us to spin up additional node capacity in minutes. The solution, streaming replication, wasn't especially exotic or even new. It had just never been implemented in an Ethereum client before.
While it took us a number of months and a lot of instrumental innovations, once we got it together, it worked like a charm. With it, we could kill unhealthy instances and spin up new ones with no trouble at all, and add or reduce capacity based on current usage metrics rather than attempted tea-leaf reading based on anticipated capacity.
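For readers unfamiliar with streaming replication, the general idea can be sketched in a few lines. This is a generic toy illustration of the technique, not the EtherCattle implementation: `Primary`, `Replica`, and the in-memory log are all hypothetical stand-ins chosen for this sketch.

```python
class Primary:
    """Hypothetical primary: applies writes and appends them to a log."""

    def __init__(self):
        self.state = {}
        self.log = []  # ordered stream of (key, value) writes

    def write(self, key, value):
        self.state[key] = value
        self.log.append((key, value))

    def snapshot(self):
        # A copy of the current state plus the log offset it corresponds to.
        return dict(self.state), len(self.log)


class Replica:
    """Hypothetical replica: starts from a snapshot, then follows the log."""

    def __init__(self, snapshot_state, offset):
        self.state = snapshot_state
        self.offset = offset

    def catch_up(self, log):
        # Replay only the writes made after the snapshot was taken.
        for key, value in log[self.offset:]:
            self.state[key] = value
        self.offset = len(log)


primary = Primary()
primary.write("block", 100)
primary.write("balance:alice", 50)

# New capacity starts from a recent snapshot rather than a full sync.
replica = Replica(*primary.snapshot())

primary.write("block", 101)
primary.write("balance:alice", 75)

replica.catch_up(primary.log)
print(replica.state == primary.state)
```

The point of the pattern is that a new replica only has to replay the writes it missed since its snapshot, which is what makes adding capacity a matter of minutes instead of a full network sync.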
The only problem? OpenRelay didn't need nearly the kind of capacity afforded by the minimum viable size of a high-availability cluster. What would we do with all the extra capacity?
Rivet is born.
The idea for Rivet was a competitor to Infura that could 1) reduce dependency on a single centralized provider, 2) give projects an open source alternative to Infura and others that were then emerging (such as Alchemy), 3) leverage the remarkable capabilities of the EtherCattle Initiative technology to deliver a service that was qualitatively just plain better, ultimately contributing to the ascendancy of Ethereum-based projects and the advent of Web3, 4) give developers and their supporting teams a tool that would lower barriers to entry into Ethereum development by making their lives less complicated, and 5) help fund the continued development of the open source EtherCattle Initiative.
How we made Rivet the best on 🌎 (and what we mean by "best")
The result, in real terms, is that we took the underlying technology and paired it with a service that:
1. Has predictable, transparent pricing. Nobody likes getting surprised by (or explaining) bigger-than-expected bills, a problem that emerges from complex billing models based on compute credits or t-shirt sizes and overages. In fact, nobody really likes talking about it at all. It's just not that interesting or rewarding to consider the nuances of a byzantine grid of prices.
People building the future have better things to worry about. So our pricing strategy was intentionally designed to be as simple and straightforward as possible.
$1 = 100k requests. It'd be hard to invent something simpler.
2. Self-service, minimal data capture. We didn't want to pepper developers with a ton of webforms or require sales calls and consultations before developers could just get started using the service. We would also prefer not to spend a lot of time doing that kind of stuff. So Rivet is self-service and fast to get started. All you need is an email address or an Ethereum wallet and you're off to the races. No meetings required.
3. Minimalism in design and function. Twiddly bits are sometimes a fun distraction, but the fact remains: they're a distraction. That's why you won't find a whole lot of gizmos, fancy analytics tools, weird metrics, or finicky options in the Rivet dashboard. Because ultimately the most important thing about Rivet is that it does what it's supposed to quietly, and otherwise stays out of your way.
Fun Tidbit: We did consider putting an easter egg in the dashboard at one point, but we held back. Who wants to look up at the clock in horror and realize they've been playing Galaga in their infrastructure provider's dashboard all afternoon, right? 🤪
4. Friendly, collegial expert support by people who know our software inside and out. When you reach out for help, we're prompt and ready to help you with whatever you might need.
If you reach out more than once, we'll remember who you are the second time. And best of all, we don't assume by default you're doing something wrong. We work with you to work out whatever comes up.
5. Security, reliability, privacy, availability, and performance you can't get anywhere else. There are untold nuances glossed over in this short history of how Rivet came to be, and we know it is without equal because we refused to compromise. We weren't rushing. We weren't trying to save on short-term costs to get it on the market fast. We built it for ourselves, for our own project. And then we designed the Rivet service around it to be the one that would've stopped us from building our own had it existed at the time.
The best part? We're still just getting warmed up.
Watch this space. You ain't seen nothin' yet.
- ❤️Rivet