Notes from the Lab 1: You wanted JPEGs. You got 404s.

John Crain
SuperRare 💎
Feb 18, 2023

In recent weeks, superrare.com has had acute site reliability problems — many of you have experienced long loading times, page timeouts, 404s and other issues that made the website unusable on a frequent basis.

We’ve been heads down working to address these issues, but haven’t done a good job of communicating, so understandably many are wondering “WTF is going on?” We hear you, and in an attempt to close this information gap, we’re going to begin posting regular updates like this one to keep everyone in the loop on what’s happening at SuperRare Labs.

So what’s going on at SuperRare Labs?

Big picture, we are ramping up for SuperRare 3.0: a series of comprehensive product updates that will make SuperRare scalable for the long term, with the goal of vastly improving discoverability, curation and the success of both artists and collectors in our ecosystem.

Much more to come on that front soon, but in the meantime, if the website isn’t usable and people see 404s instead of artwork, none of the future stuff really matters. It may be hard to believe from the outside looking in, but our primary engineering focus for the last few months has been on infrastructure and site reliability. This post is dedicated to the details of what’s been happening on that front. In the coming weeks, once these critical issues have been resolved, we’ll get back to more forward-looking updates.

You wanted JPEGs. You got 404s.

The recent site reliability issues have stemmed from two areas, both of which caused periodic critical failures of some SuperRare.com servers.

First, we undertook a large infrastructure migration in January, which had to be rolled back after certain core services started failing. Identifying the root cause of these failures was complicated by the fact that everything initially appeared stable. What started as run-of-the-mill bug reports began to cascade and were compounded by nuances in our data caching, making things very challenging to debug. It became clear that we needed to roll everything back in order to properly address the underlying issues that had been discovered. To prevent anything similar from recurring, we’re now working through this migration in smaller, more observable pieces to ensure that these changes won’t further impact the user experience. We’re disappointed in how this failed migration ended up negatively affecting so many of you. Even so, this infrastructure migration is much needed and will ultimately improve the day-to-day experience of using SuperRare.

Second, during our years as a bootstrapped startup, we had happily been providing API access to many developers and other community members who were interested in building on top of SuperRare’s unique and historic dataset. This wasn’t a problem when the industry was small and requests were easily manageable, but as the market exploded, so too did the appetite for slurping up this data. Unfortunately, without a production-ready API designed to handle the kind of traffic we’re now seeing on a consistent basis, we were subjected to the “hug of death.” As the SuperRare DAO has ramped up over the past year, the demands have outstripped the bandwidth our beta API was capable of providing, which has caused an increasing number of service failures that ultimately result in slow site performance and 404s. The timing of this further compounded the debugging challenges we faced around the failed infrastructure migration (Murphy’s Law is very much alive and well). We are now working closely with these community members to provide interim data solutions while rebuilding and bulletproofing our APIs for better scalability moving forward.

In short, both failures were subtle, hard to identify, and required deep investigation and re-strategizing before we could regain our footing. They were ultimately our own fault, and they taught us some invaluable lessons in how to scale web3 infrastructure.

Regardless of the circumstances, we want everyone to know that we take performance issues like this very seriously. We’re on a mission to make SuperRare the best art collecting platform in the world: a robust, fast, and fun application that can unlock a future of cryptoart with billions of users. Every setback on this path is a personal disappointment, and something we don’t take lightly. And while this was a painful lesson, you can rest assured that we’re channeling this energy into outcomes that will be a net gain for everyone who wants to see a better, higher-functioning SuperRare in the near future.

See you next time for Notes from the Lab 2.

–SuperRare Labs team


John Crain
SuperRare 💎

Founder @SuperRare_co @ConsenSys OG. Founding member @blockapps.