How Three Strategic Moves Rescued Our Backend’s Performance
One of the products I’m responsible for at my current company is an NFT Marketplace. Technically speaking, it’s a complex one. Apart from the usual web development combo of frontend, backend, and database, we also run several auxiliary services that keep different features working as they should: in-app notification systems, blockchain-indexer services, services that monitor traded values, and so on.
A common denominator in the majority of Marketplace-like products is that the amount of data being processed — loaded, filtered, sorted, etc — is overwhelming. So as the products scale, performance plays a huge role in continuous development.
Another common feature in marketplaces is a page where users can browse through a list of products, often with the ability to sort and filter that data to find the best match for their heart’s desire. Our marketplace is no different. We called it the “Explore page”.
Our “Explore page” was a performance nightmare. As we scaled and started offering new products, rendering and sorting the sheer amount of data became more and more challenging. We were already measuring high loading times, in the neighborhood of 10 seconds, but the last straw came with the launch of a particularly heavy NFT collection, which, depending on your connection, pushed loading times up to 27.2 seconds!
In 2023, we wrapped up a refactor targeting this loading time. After some weeks of work, the improvement was outstanding: loading times became nearly instantaneous, a measured improvement of more than 180x.
To get there, we combined a data-source migration, a clever new service, and some computer-architecture-ish changes. And this is the story of how we did it :)
Migrating our data sources
If you have not been living under a rock for the past three years, you have probably already heard about NFTs. Now, to get something out of this story, you don’t need to know in detail what non-fungible tokens are. It’s enough to know that, in their most common form, NFTs have metadata associated with them: a name, a specific set of characteristics of that particular token (called traits), and some media or artwork to represent them, most commonly an image, like those weird apes you have probably seen online.
I say “most commonly” because there isn’t really a rule. It could be a GIF, a video, a document, a song, an interactive videogame, whatever. The “last straw” collection mentioned in the previous section, for example, had a high-resolution video as its artwork.
When we talk about NFTs, another important detail is that there is no standard for where the media is stored. It could live in a centralized medium (think an S3 bucket) that marketplaces query through an API endpoint to fetch the info, or in a decentralized medium, such as an IPFS server.
The Interplanetary File System (dramatic name, btw) is essentially a file system that allows you to store files and track versions over time, like Git, on a distributed P2P network, somewhat like BitTorrent.
In our NFT Marketplace, different NFT collections are indexed from the blockchain: identified and added to our local database. These are then shown in the UI for users to interact with.
Having no standard for where the artwork of different collections comes from creates a problem: even if you save a link/pointer to the image in your local database, the latency to fetch the actual file will vary from NFT to NFT. The worst part is that you have no control over it.
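To make this concrete, here is a minimal TypeScript sketch of what resolving those stored pointers can look like. The helper and the URLs are hypothetical; the point is that the URI scheme alone decides whether the file comes from an HTTPS API or an IPFS gateway, and therefore how fast it arrives.

```typescript
// Hypothetical helper: the stored pointer decides where the artwork actually
// comes from. ipfs.io is just one public gateway; any gateway or local node works.
function resolveArtworkUrl(uri: string): string {
  if (uri.startsWith("ipfs://")) {
    // ipfs://<CID>/<path>  ->  https://ipfs.io/ipfs/<CID>/<path>
    return `https://ipfs.io/ipfs/${uri.slice("ipfs://".length)}`;
  }
  // Already an HTTP(S) URL, e.g. an API backed by an S3 bucket
  return uri;
}

// Two NFTs on the same page, two completely different origins and latencies.
const a = resolveArtworkUrl("ipfs://<cid>/123.png");            // decentralized
const b = resolveArtworkUrl("https://api.example.com/nft/456"); // centralized
```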
With this pickle in your hands, displaying NFTs from different collections on a single page can cause problems, since each artwork preview is fetched from a different place. The latency of one IPFS server may be lower than that of a different NFT’s API endpoint, for example. That makes for a weird, inconsistent user experience, and you can wave goodbye to your customers as they exit through the left door.
This was part of our performance problem. Once we identified it, the solution was straightforward: keep a local copy of the artwork of all these NFTs, since fetching it was the main bottleneck when loading the entire Explore page.
When we index a new collection of NFTs into our Marketplace, we now copy the artwork to our own R2 storage (again, think an S3 bucket, but Cloudflare’s). Having a local copy simplified things: when users access our Explore page, instead of fetching the artwork from a different place for each NFT, all of it now comes from the same place, removing the different-latencies problem from the equation!
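As a rough illustration of that indexing step (not our exact code), here is how mirroring an NFT’s artwork into R2 might look using the S3-compatible API that R2 exposes. The bucket name, key scheme, and environment variables are assumptions.

```typescript
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

// R2 speaks the S3 API, so the standard S3 client works against a Cloudflare endpoint.
const r2 = new S3Client({
  region: "auto",
  endpoint: `https://${process.env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

// Called from the collection-indexing pipeline: fetch the artwork once from its
// original source and keep our own copy under a predictable key.
async function mirrorArtwork(sourceUrl: string, collectionSlug: string, tokenId: string): Promise<string> {
  const response = await fetch(sourceUrl);
  const body = Buffer.from(await response.arrayBuffer());

  const key = `artwork/${collectionSlug}/${tokenId}`;
  await r2.send(
    new PutObjectCommand({
      Bucket: "nft-artwork", // hypothetical bucket name
      Key: key,
      Body: body,
      ContentType: response.headers.get("content-type") ?? "application/octet-stream",
    })
  );
  return key; // stored in the database alongside the NFT
}
```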
A clever new service
The data-source migration already reduced loading times significantly. But there was still one problem: some artwork files are heavier than others. A pixel-art PFP NFT is way lighter than an abstract artwork in a looping video, for example. And here is where the “clever new service” comes into play.
The quality of the artwork is important to the NFT community. If I access the individual page of an NFT, I expect to see the artwork in its fully realized, pure form. Resolution is not something we can compromise on here.
That is, if we’re talking about an NFT’s individual page.
On a page like our “Explore”, where we are just showing a thumbnail of the artwork, the quality doesn’t matter nearly as much.
So we created a new service that downscales all artwork before it is saved to our R2 instance. In our database, we now keep two pointers to the same artwork: one to the original source, wherever that may be, and another to our R2 copy hosting the downscaled version, to be used wherever a thumbnail is enough. This service also runs as part of the pipeline that indexes new collections, so the same standard is applied to every new product added going forward.
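Here is a minimal sketch of the downscaling step for image artwork, using the sharp library as an example. The 512px width, the WebP output, and the quality setting are illustrative rather than our production values, and video artwork would need a different path (grabbing a representative frame, for instance), which I’m leaving out.

```typescript
import sharp from "sharp";

// Produce a lightweight thumbnail from the original artwork bytes before
// uploading it to R2. The full-resolution original stays wherever it lives.
async function makeThumbnail(original: Buffer): Promise<Buffer> {
  return sharp(original)
    .resize({ width: 512, withoutEnlargement: true }) // cap the width, never upscale
    .webp({ quality: 80 })                            // smaller payload for list views
    .toBuffer();
}
```

The database record for each NFT then carries both pointers, something like an `artworkUrl` for the original and a `thumbnailKey` for the downscaled R2 copy (field names hypothetical).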
Now, there was only one problem left: the sheer amount of data we were handling.
New NFT collections are created every day, and it’s not uncommon for them to have 10k to 100k items. If each NFT is a document in a database collection, you can imagine how fast that collection grows. Navigating through it to fetch specific items, filter, and sort them is a heavy process, even if the artwork files are swiftly available from our own storage, at a lower resolution that won’t get in the way.
The solution for the final problem lies in computer architecture theory: memory hierarchy.
Computer Architecture-ish changes
Back in college, I was a big Computer Architecture nerd. As an Electronics Engineering bachelor who spent years studying hardware and software as intrinsically separate things, there was something about finally getting to understand the interface between the two that was just :chef’s kiss:. So when the opportunity appeared to use concepts even remotely resembling the ones I had studied for years, it got me fired up.
So what do I mean by “memory hierarchy”? In a computer, we have several levels of memory, and the speed of accessing the data stored in each level is inversely proportional to its size. Pieces of information that are used more frequently by whatever process is running get stored in cache memory, which is physically closer to the CPU, thus smaller, and thus the fastest level of memory you can have.
Now, these different levels of memory do not contain different data: each smaller level holds a subset of the data in the level immediately above it in size. Think cache vs. RAM, for example.
And this is the exact concept we applied to our Explore page. We created a new collection in our MongoDB, a “cache” of our original, millions-of-documents NFT collection, containing only a subset of its data: the first ten pages of NFTs we show for every combination of sorting and filtering available.
Ten was no arbitrary number. From what we could analyze, it was rare for our users to go deeper than that in their browsing-for-new-products sessions, and it also significantly reduced the size of the collection we consult to run all the sorting/filtering queries.
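A simplified sketch of how such a cache collection can be precomputed with the MongoDB Node.js driver follows. The collection names, sort keys, page size, and filter are illustrative, and the real job would iterate over every sort/filter combination the Explore page exposes.

```typescript
import { MongoClient, type Sort } from "mongodb";

const PAGE_SIZE = 20;    // assumed page size
const CACHED_PAGES = 10; // only the first ten pages are worth caching

// A few of the sort orders the Explore page might expose (illustrative).
const SORTS: Record<string, Sort> = {
  price_asc: { price: 1 },
  price_desc: { price: -1 },
  recently_listed: { listedAt: -1 },
};

async function rebuildExploreCache(client: MongoClient): Promise<void> {
  const db = client.db("marketplace");
  const nfts = db.collection("nfts");           // the millions-of-documents original
  const cache = db.collection("explore_cache"); // the small "cache" collection

  for (const [sortKey, sort] of Object.entries(SORTS)) {
    // Pull only the documents that can appear in the first ten pages.
    const topItems = await nfts
      .find({ listed: true }) // example filter
      .sort(sort)
      .limit(PAGE_SIZE * CACHED_PAGES)
      .toArray();

    // Replace the cached slice for this sort order.
    await cache.deleteMany({ sortKey });
    if (topItems.length > 0) {
      await cache.insertMany(
        topItems.map(({ _id, ...doc }, rank) => ({ ...doc, sortKey, rank }))
      );
    }
  }
}
```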
Additionally, we updated some of our endpoints so that whenever a given NFT has its data updated, both MongoDB collections, the original and the new cache one, are updated with the same values, thus keeping the “memory” coherent.
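A sketch of that dual write, again with illustrative names:

```typescript
import type { Collection, Document } from "mongodb";

// Whenever an NFT changes (price, listing status...), apply the same update to
// the original collection and to the Explore cache so the two never diverge.
async function updateNft(
  nfts: Collection<Document>,
  exploreCache: Collection<Document>,
  tokenId: string,
  changes: Document
): Promise<void> {
  await nfts.updateOne({ tokenId }, { $set: changes });

  // The cache may hold several copies of the same token (one per sort order),
  // so update them all. A miss is fine: it just means the token is not in the
  // first ten pages of any view.
  await exploreCache.updateMany({ tokenId }, { $set: changes });
}
```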
That was the last step of our refactor. Testing all the pieces together, we measured loading times of around 150 ms, a night-and-day change from the loading times we were used to.
Wrapping up
The key takeaway here is that, as your product grows, it will reach new levels of technical maturity, and you and your team must keep pace. New technologies and auxiliary services are important, but the crucial thing is being able to see the different levels of abstraction and understand the whole system, not just being an expert on parts of it.
The latter was particularly important in this refactor. There was no silver bullet (there usually isn’t) to solve the performance problem on our Explore page. It took us time to sit back, review our architecture, analyze our options, and refactor the parts of our Marketplace that were affecting the specific performance issue at hand.
Once we were done with our refactor, we even recorded a video showcasing the difference in performance for the same page for our community. It was a huge delivery from our team and one that I’m particularly proud of :)