A True Story of How Redis Saved Me a Truckload of Money on Compute Costs

On being a “move fast and break things” type of developer.

Austin Starks
CodeX
12 min read · Aug 28, 2024

--

Move Fast and Break Things – Flux

As a solopreneur, I don’t have time to write design docs, draw architectural diagrams, and make the theoretically best decision for every single one of my features.

I just do what works. For example, for the longest time, my AI-Powered Trading app NexusTrade was a complete monolith.

To deploy it, I put all of the features on one instance. This had a number of huge benefits, including:

  • Costs: Having one big instance is a lot cheaper than running 4 or 5 specialized instances.
  • Simplicity: Deploying was literally as easy as the click of a button.
  • Speed: Because everything was on one machine, there was no network latency when launching backtests or starting genetic optimizations.

However, as more people started using the NexusTrade Premium plan, I knew I had to separate my app into different components.

The premium plan for NexusTrade includes access to a feature called genetic optimization. This algorithm is the computational version of natural selection: it takes different variations of one’s trading strategies and iteratively evolves them, creating a diverse population of new trading strategies.

Simply put, by running a genetic optimization on their strategy, a trader can see different optimized variants of that strategy, each with their own strengths and weaknesses.

However, this algorithm is very computationally expensive. If it were to run on the same machine as the server, it would consume all of the server’s resources and could potentially crash it.

Moreover, because of this algorithm, my server could not scale horizontally. The genetic optimizer is currently implemented as a Rust script that polls for ready optimizers. Unless the script was updated to include a locking mechanism, two instances of the genetic optimization engine could end up working on the same portfolio, leading to deadlock, stale data, and wasted computational resources.
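The missing piece can be sketched as an atomic claim step: before running an optimizer, a worker flips the job’s status from ready to claimed and only proceeds if the flip succeeded. Here is a toy in-process version (names are assumed, not the actual NexusTrade code); a real multi-instance setup would need this check-and-set to be atomic across machines, for example via a conditional database update or a distributed lock, not an in-memory Map.

```typescript
// Toy sketch of a job-claiming step (hypothetical names). In production the
// check-and-set must be atomic across instances, not an in-process Map.
type OptimizerJob = { id: string; status: "ready" | "claimed"; owner?: string };

const jobs = new Map<string, OptimizerJob>();

function claimJob(id: string, worker: string): boolean {
  const job = jobs.get(id);
  if (!job || job.status !== "ready") return false; // another instance got it first
  job.status = "claimed";
  job.owner = worker;
  return true;
}
```

With a claim step like this, a second optimizer instance polling the same queue simply skips jobs that are already claimed instead of duplicating work.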

Thus, the end result was me splitting my application into the following jobs:

This separation worked well for over a year. However, I recently started experiencing a host of issues with my setup.

High Costs, Frequent Crashes, and a Slow App

As a move fast, break things developer, I often implemented things in a quick-and-dirty way, but in a way that would be easy to extend in the future. For example, users in NexusTrade have notifications. These can be order updates, notifications about portfolio state, and updates regarding the optimization process.

To implement “real-time” notifications, I basically had a React hook on a setInterval to poll the server every second.

useGetNotifications hook in my app
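The pattern behind that hook looks roughly like this (a generic sketch, not the actual NexusTrade code); in React it lives inside a useEffect, with the returned stop function used as the cleanup:

```typescript
// Generic sketch of the polling pattern (assumed shape, not the real hook):
// call the fetcher immediately, then again on every interval tick.
type StopPolling = () => void;

function pollEvery(ms: number, fetcher: () => Promise<void>): StopPolling {
  void fetcher(); // initial fetch on mount
  const id = setInterval(() => void fetcher(), ms);
  return () => clearInterval(id); // React would call this on unmount
}
```

With a one-second interval, every open browser tab sends roughly 86,400 requests a day whether or not anything changed, which is exactly the load the later Redis work eliminates.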

As the number of users for my platform grew, the app became less responsive and more prone to crashes. Decisions like this ultimately led to the scaling issues that I was seeing.

Everything really came to a head when I noticed that clicking the header buttons in rapid succession crashed the app 100% of the time, redirecting users to a 502 Bad Gateway page.

After I noticed that, I decided to take action.

Costs Before The Big Refactor

I use a managed deployment platform called Render. Render is the perfect mix of cost, ease-of-use, and user experience when it comes to deploying applications. They allow you to deploy apps quickly without having to worry about the underlying infrastructure.

Cost for deploying on Render

After upgrading my NexusTrade Web Server, and then upgrading it again, I was met with the following costs to run my app:

  • Render Team of 1 User at $20/month
  • NexusTrade Web — Two instances of Pro Max at $225/month each
  • NexusTrade Backtesting App (Rust) — One instance of Pro at $85/month
  • NexusTrade LiveTrading App (Rust) — One instance of Standard at $25/month
  • NexusGenAI (TypeScript + NodeJS + React) — One instance of Standard at $25/month

Added together, that’s a whopping $605/month! This doesn’t include my MongoDB instance or my other operational costs (such as the API access to OpenAI and Claude). My costs were exploding!

Combined with the fact that the app felt slow, I decided to take a step back and solve the issues at their core.

Moving Slow and Fixing Things

Analytics for my web server

I knew my app was likely crashing due to the client requesting too many resources from the server. The graphs show that CPU and memory were very low for most of the day, with very large spikes happening at seemingly random times.

In addition, I noticed my app was a little delayed. When I would click on a new tab, a spinner would spin for half a second before the content loaded. This was subtle, but as a frequent user of my own platform, it annoyed me to no end. A financial platform should feel lightning fast.

Redis

To kill two birds with one stone, I decided to integrate Redis and frontend caching. First, let’s discuss Redis.

While I initially planned to have one master Redis instance that handled both caching and real-time notifications, I quickly learned that a single Redis connection can only do one or the other: once it enters subscriber mode, only subscriber commands are allowed.

Error: Connection in subscriber mode, only subscriber commands may be used

So I ended up creating two separate Redis instances.

  • nexustrade-redis-cache: used to cache data on most endpoints to reduce server load and increase speed
  • nexustrade-redis-pubsub: used to create publishers and subscribers to send real-time events across my application

The Redis cache was a TTL cache that served most of the user’s data back to them. Considering the main problem I wanted to solve was my application crashing, I figured caching would be a good way to reduce computational costs.
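The semantics of a TTL cache are simple: values are stored under a key with an expiry, and reads return the value only while it is still fresh. A toy in-memory version of what Redis provides via SET with EX and GET (a sketch of the idea, not the actual NexusTrade cache layer):

```typescript
// Minimal in-memory stand-in for a Redis TTL cache (SET key value EX ttl / GET key).
class TtlCache<T> {
  private store = new Map<string, { value: T; expiresAt: number }>();

  set(key: string, value: T, ttlMs: number): void {
    this.store.set(key, { value, expiresAt: Date.now() + ttlMs });
  }

  get(key: string): T | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() >= entry.expiresAt) {
      this.store.delete(key); // lazy expiry, similar to Redis's passive expiration
      return undefined;
    }
    return entry.value;
  }
}
```

The TTL is what makes this safe for a trading app: stale entries simply age out, so the database is re-queried at most once per key per TTL window.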

Also, the getNotifications poll loop that I mentioned earlier enraged me every time I saw it. So, in conjunction with caching, I decided to implement real-time notifications using websockets.

The end result of these two Redis instances is an app that serves requests faster and receives real-time notifications.
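The notification flow looks roughly like this, with Node’s built-in EventEmitter standing in for the Redis pub/sub channel (names and shapes are my own assumptions; the real setup uses a Redis subscriber connection on the server and a WebSocket per client):

```typescript
import { EventEmitter } from "node:events";

// Stand-in for the nexustrade-redis-pubsub channel (assumed message shape).
const channel = new EventEmitter();

// Worker side: publish a notification instead of storing it for polling.
function publishNotification(userId: string, body: string): void {
  channel.emit("notifications", JSON.stringify({ userId, body }));
}

// Server side: forward each user's messages over their open socket.
function bridgeToSocket(userId: string, send: (msg: string) => void): () => void {
  const onMessage = (raw: string) => {
    const msg = JSON.parse(raw) as { userId: string };
    if (msg.userId === userId) send(raw); // only deliver this user's events
  };
  channel.on("notifications", onMessage);
  return () => channel.off("notifications", onMessage); // cleanup on disconnect
}
```

The key difference from polling: nothing happens until the worker actually publishes, so idle clients generate zero server load.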

Frontend Caching with SWR

Another nitpick I had with my app was that it was a little bit slow.

When using the app, I became accustomed to the white loading icon that always appeared on my screen. Navigating through pages took a quarter of a second, but still felt slow and disjointed. In order to further reduce server load, my idea was to implement frontend caching.

However, when learning about frontend caching libraries, I learned about “stale-while-revalidate” (SWR). SWR is a caching strategy where stale data is served to the user, and a request is done in the background to revalidate the data.

Because of this background request, I understood that the strategy wouldn’t significantly reduce server load, BUT the thought of my UI having that slick feel to it was enticing.

So, after comparing it with other libraries like React Query, I decided to do the quickest, lowest-lift option that would improve my frontend experience. I chose SWR.
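The core idea SWR implements can be sketched in a few lines (a toy illustration of the strategy, not the swr library’s actual internals): serve whatever is cached immediately, and refresh the cache in the background.

```typescript
// Toy stale-while-revalidate: instant stale reads, background refresh.
const swrCache = new Map<string, unknown>();

async function swrGet<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
  // Kick off revalidation unconditionally; it updates the cache when done.
  const revalidation = fetcher().then((fresh) => {
    swrCache.set(key, fresh);
    return fresh;
  });
  if (swrCache.has(key)) return swrCache.get(key) as T; // stale, but instant
  return revalidation; // first load: nothing cached yet, wait for the network
}
```

In React, the swr package wraps this pattern in a hook, `useSWR(key, fetcher)`, and re-renders the component once the revalidated data arrives, which is what gives navigation that instant feel.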

The End Result of this Refactor

Increased application complexity

Because the rest of this article will be overwhelmingly positive, let’s talk about by far the biggest downside of implementing these caching strategies: complexity.

I’ve never learned so much about basic networking in my entire life. Allowing the user to get real-time notification updates required me to set up a socket and caching system from my notification queue worker to my server.

Then, my server needed a socket directly to the client’s computer. I also needed to implement things like authentication to make sure unauthorized actors didn’t have access to my system.

Finally, I’ve significantly increased the number of things I had to think about when implementing a new feature. I had to think about caching and how to make sure my users don’t get stale cache data. A perfect revalidation strategy does not exist, and after these changes, my app just became three times more complex.

Nonetheless, implementing these changes was a lifesaver for my app and my wallet.

Significantly improved application, user experience, and costs

In total, this refactor is probably one of the most exciting things to happen to my application in a while. It’s rare that you do a technically complex project that improves the user experience, saves on cost, and increases app stability all at once.

Completely Eliminated Crashes

The first thing I noticed right out of the gate is that my app no longer crashed!

Even with significantly smaller servers (which I will talk about towards the end), my app has not crashed a single time since being deployed.

This is partially because the Redis cache saves database calls, but also because the caching strategy stopped the app from repeatedly requesting a heavy resource. Let me explain.

How moving fast broke things

Over a year ago, I noticed that after a user logged into the app and clicked a portfolio, it took a while for the portfolio’s history to load on the screen. This led to a frustrating user experience where they had the portfolio values and the strategies, but not the portfolio’s history.

And as a move fast and break things engineer, I decided to implement an in-memory cache for the portfolio history.

The idea was simple: anytime a user logged in, they would have their portfolio’s data ready to go.

The old implementation of getPortfolios

This solved the issue of the delay that my users were experiencing. I hadn’t thought much about this since I implemented it.

Implementing my caching strategy uncovered the flaw in this approach.

I realized that the old implementation of getPortfolios could repeatedly call the database and fetch the portfolio history of every portfolio a user has. This is particularly detrimental if users switch to the portfolios tab repeatedly or if multiple users are online at the same time.

My caching strategy completely prevented this.

The refactored version with caching reduced the number of times updatePortfolioHistory was called
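The shape of the fix looks roughly like this (function and field names are assumed, not the actual NexusTrade code): getPortfolios checks the cache first, so rapid tab switches reuse one result instead of re-running the expensive history query.

```typescript
// Hedged sketch of the cache-aside fix: only hit the database on a cache miss.
type Portfolio = { id: string; history: number[] };

let dbCalls = 0; // instrumentation to show the reduction
async function fetchPortfoliosFromDb(userId: string): Promise<Portfolio[]> {
  dbCalls += 1; // stands in for the expensive portfolio-history query
  return [{ id: `${userId}-main`, history: [100, 101, 103] }];
}

const TTL_MS = 60_000;
const portfolioCache = new Map<string, { value: Portfolio[]; expiresAt: number }>();

async function getPortfolios(userId: string): Promise<Portfolio[]> {
  const hit = portfolioCache.get(userId);
  if (hit && hit.expiresAt > Date.now()) return hit.value; // hit: no DB work
  const value = await fetchPortfoliosFromDb(userId);       // miss: query once
  portfolioCache.set(userId, { value, expiresAt: Date.now() + TTL_MS });
  return value;
}
```

Three back-to-back calls now cost one database round trip instead of three, which is exactly the kind of repeated heavy work that was spiking the CPU graphs earlier.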

So in the end, this project served as a mini audit, which also contributed to the significant savings in computational costs.

Massive Performance Improvements

The reduction in computation to run my app

Another huge benefit of this implementation is that the app is a lot smoother. I could rapidly switch between tabs and the transitions would be seamless.

I wish I had taken a before and after video. The difference in how the app behaves is truly night and day.

Everything loads lightning fast, and yet my database is receiving the lowest amount of requests it ever has.

Saving a Truck-Load of Money

The cost of each Redis instance

The best part about this project is that I’m saving money. A lot of it.

Let’s recalculate my compute costs with this cache implementation:

  • Render Team of 1 User at $20/month
  • NexusTrade Web — Two instances of Pro at $85/month each
  • NexusTrade Backtesting App (Rust) — One instance of Pro at $85/month
  • NexusTrade LiveTrading App (Rust) — One instance of Standard at $25/month
  • NexusGenAI (TypeScript + NodeJS + React) — One instance of Standard at $25/month
  • NexusTrade Redis Cache — One instance of Standard at $10/month
  • NexusTrade PubSub Cache — One instance of Standard at $10/month

That’s a total of $345/month. Recall that I was spending $605/month on compute alone. So in total, I saved 43% on server costs.

This was astronomical! For fun, I decided to downgrade my servers again, reducing the NexusTrade Web instances from Pro to Standard. I let it run for a few hours, and my app was still lightning fast. This would be $225/month, or a 63% cost reduction.

The only reason I re-upgraded was that one of the servers ran out of memory, and I wanted to provide a flawless user experience. The end result was the same: I save hundreds of dollars every single month with this one neat trick!

Lessons Learned

I know this article is getting long, but I did learn a few things from this experience that I will carry with me for the rest of my career.

Don’t be afraid of new technologies.

I tend to be the guy that sticks with what he knows. I knew about Redis but didn’t know how it worked or how to set it up. It wasn’t until I absolutely needed to save money on compute costs that I eventually implemented a Redis solution.

And even with this refactor, I almost gave up! Having to stand up multiple Redis servers and work with websockets across a distributed system was intimidating and new. I also had initial concerns about the cost of Redis (as my compute costs were already climbing every day).

I only finished the implementation because my twin brother told me to stop being a baby (perhaps in harsher terms, as twins tend to speak to each other). So I sucked it up and kept working on the refactor.

I’m SOOO glad I did.

Sometimes, the tortoise wins the race

While my strategy of moving fast and breaking things has worked very well for me to implement NexusTrade, I learned that sometimes you can move faster by slowing down.

Taking the time to carefully analyze Redis in the beginning may have saved me a few hundred dollars in compute this month. But at the same time, having the lived experience of seeing the difference Redis makes is also invaluable.

Concluding Thoughts

Ultimately, I wouldn’t have done it any other way. My new solution saves money, is much faster, has a better user experience, and significantly increases my application’s stability.

These improvements couldn’t have come at a better time. As I begin to implement critical features like algorithmic live trading, server stability is absolutely crucial.

The journey of optimizing NexusTrade has not only improved our platform but has also opened up new possibilities for our users. With over 6,000 people already exploring the power of algorithmic trading on NexusTrade, we’re just getting started.

The Future of Algorithmic Trading is Here

I’m revolutionizing algorithmic trading with cutting-edge AI technology and user-friendly features. My latest updates include AI-generated daily market updates to keep you informed, seamless quicktests for instant strategy performance insights, and enhanced AI collaboration through chat attachments. These innovations allow you to iterate on your ideas faster and make more informed trading decisions than ever before.

Whether you’re a seasoned trader looking to automate complex strategies or a curious beginner wanting to explore data-driven investing, NexusTrade provides the tools you need to succeed. Our platform offers:

  • Customizable algorithmic trading strategies with AI-powered optimization
  • In-depth financial research and comparative analysis tools
  • Risk-free backtesting and paper-trading capabilities

Don’t let another trading opportunity pass you by. Join the thousands of traders who are already using NexusTrade to revolutionize their approach to the markets. Step into the future of trading — where AI meets strategy, and where your investment ideas become automated reality. Start your free trial at NexusTrade.io today and experience the evolution of trading firsthand.
