Optimizing Code to Reduce Cloud Costs

Colum Ferry
Dec 20, 2019 · 5 min read

This article is going to be more of a personal anecdote about an interesting situation (🤯) I faced and still have to cope with whilst doing freelance work that involves Angular and Firebase.

Now, I know some of us have not yet had a chance to use serverless technologies ☁️ in our work, but you may know that these services charge based on some usage metric, such as the uptime of a Function or the number of Reads and Writes per day. 💸💸

With this being the case, surely we have to start considering how expensive the code we write is to actually run on these Cloud Platforms?

If our code is expensive to run, then our company is potentially going to bleed money covering the costs to run our application. 😖😵

How do they get this money back? Do we charge our valued customers more!? Surely not, they want discounts not increases!! 🔥🔥

There’s nothing for it. We have to optimise our code. Yes I did say optimise!

It is certainly a form of optimisation. We optimise our code to be more performant, more scalable and more maintainable, so why should we not optimise our code so that it is cheaper?

The Problem 🤦

At the start I mentioned that this would be slightly anecdotal, so I’ll get into that now.
I am developing a game for a client that uses Angular and Firebase. Specifically it uses Firestore and Firebase Auth.
My client is still at the proof of concept stage, and because of that, they are running on the Firebase Free Tier, which allows Firestore 50K Reads, 20K Writes and 20K Deletes per day.

Part of the game has a 2D map, and this map has at minimum, 10,000 tiles, with each of these tiles containing an entity.

We’re developers, we know a 10K for loop is really nothing to be afraid of, we can handle that no problem! So where is the problem!?

Actually, that’s just it. Those 10,000 entities are stored in Firestore. So when the user goes to the map, we’re faced with a dilemma, that normally, we probably wouldn’t even consider. 🤔

If we try to read the entire map from Firestore, we have suddenly performed 10,000 document Reads on Firestore. AND that is with only ONE User active! One user has just used 1/5th of our total daily Read limit!

Another issue that cropped up is that each of these 10K tiles has its own NPC. The NPCs share standard stats across all tiles, but their health is unique to each tile. Think of it as each tile allowing multiple active users to fight one NPC together, like a Raid in an MMO. ⚔️

So we have to track the health of 10K NPCs in Firestore as well.
That’s not too bad, we only have to update one NPC’s health at a time when users are battling it.

BUT we also have an Admin Panel, that gives the Game Admin CRUD ability over the NPCs.

What if they change the base health for an NPC? All 10K NPCs need updated. That’s 10K Writes! That’s HALF our daily Write Usage. 😱
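The arithmetic behind those two numbers can be checked with a few lines of TypeScript. This is only a rough sketch: the quotas and tile count are the figures quoted above, while the helper names `quotaShare` and `runsPerDay` are made up for illustration.

```typescript
// Free Tier daily quotas and map size, as quoted in this article.
const DAILY_READS = 50000;
const DAILY_WRITES = 20000;
const MAP_TILES = 10000; // one Firestore document per tile

// Fraction of a daily quota used by one operation that touches `docs` documents.
function quotaShare(docs: number, dailyQuota: number): number {
  return docs / dailyQuota;
}

// How many times that operation can run before the daily quota is exhausted.
function runsPerDay(docs: number, dailyQuota: number): number {
  return Math.floor(dailyQuota / docs);
}

// Loading the full map once: a fifth of the Read quota, so five loads per day.
const fullMapLoadShare = quotaShare(MAP_TILES, DAILY_READS);
const fullMapLoadsPerDay = runsPerDay(MAP_TILES, DAILY_READS);

// Rewriting every NPC once: half of the Write quota, so two rebalances per day.
const rebalanceShare = quotaShare(MAP_TILES, DAILY_WRITES);
const rebalancesPerDay = runsPerDay(MAP_TILES, DAILY_WRITES);
```

In other words, five full map loads or two NPC rebalances are enough to use up a whole day of the Free Tier.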

Ok Ok, we have a problem. Our code is perfect, but it’s hitting usage limits. We need to do something, we need a solution.

I used my anecdote as an example, but I’m sure some of you have also had similar issues. I know in years past we had to optimise due to other forms of usage restrictions: CPU ability, RAM availability, Storage Size issues, Network Latency etc, but I feel like this is unique in that whilst we can optimise our code to be more performant, we could still run the risk of it being expensive.

The Solution 🤩

There is no one-size-fits-all solution for this. Rather, you have to really look at what your code is doing and think about what you could do differently to prevent it from incurring increased charges that could potentially cripple smaller businesses if they aren’t expecting it. 😥😭

I’ll explain how I solved my issues in the hopes that it will give you some inspiration on how to combat your own potential issues if they ever arise.

As mentioned in my problem statement, I had an issue where just 5 users loading the full map would hit the Read Usage Cap for the day, meaning any other user would not be able to play the Game.

We sat down and we looked at how users would play the game. We asked ourselves, “Is one user really going to move across all 10,000 tiles in the map?”

Most likely not! Therefore, why do we need all 10,000 tiles to be read from Firestore if >90% may never get used?

Our Solution: We changed how the map system works to only pull the data for the tile the user is currently on. If a user moves across roughly 5–10 map tiles, our Firestore Reads drop from 10K to about 10. That’s a reduction of ~99.9% 🚀🚀🚀
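To make the difference concrete, here is a minimal sketch of the two approaches. It does not use the Firestore SDK: `CountingStore` is a hypothetical in-memory stand-in that only mimics the "one Read per document" billing model, and the `Tile` shape and `tileId` format are assumptions, not the real game schema.

```typescript
// Hypothetical tile shape; the real schema is not shown in this article.
interface Tile {
  id: string;
  npcHealth: number;
}

// In-memory stand-in for a document store that bills one read per document.
// This is NOT the Firestore SDK, just a counter for comparing strategies.
class CountingStore {
  reads = 0;
  private tiles = new Map<string, Tile>();

  constructor(tiles: Tile[]) {
    for (const tile of tiles) this.tiles.set(tile.id, tile);
  }

  // Eager approach: fetching the whole collection costs one read per tile.
  getAll(): Tile[] {
    this.reads += this.tiles.size;
    return Array.from(this.tiles.values());
  }

  // Lazy approach: fetching a single tile costs one read.
  getOne(id: string): Tile | undefined {
    this.reads += 1;
    return this.tiles.get(id);
  }
}

const tileId = (x: number, y: number): string => `${x}_${y}`;

function buildMap(width: number, height: number): Tile[] {
  const tiles: Tile[] = [];
  for (let x = 0; x < width; x++) {
    for (let y = 0; y < height; y++) {
      tiles.push({ id: tileId(x, y), npcHealth: 100 });
    }
  }
  return tiles;
}

// Eager: load all 10,000 tiles when the user opens the map.
const eagerStore = new CountingStore(buildMap(100, 100));
eagerStore.getAll();
const eagerReads = eagerStore.reads;

// Lazy: load only the tile the user is standing on, for a 10-tile walk.
const lazyStore = new CountingStore(buildMap(100, 100));
for (let step = 0; step < 10; step++) {
  lazyStore.getOne(tileId(step, 0));
}
const lazyReads = lazyStore.reads;

// Fraction of reads saved by the lazy approach (roughly 0.999 here).
const readReduction = 1 - lazyReads / eagerReads;
```

One thing the sketch makes visible: in this model a tile that is revisited is read again, so the saving depends on how far users actually wander and on whether tiles are cached on the client.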

Our next problem came from the Admin Panel. We need to update all our NPCs to nerf or buff them. That’s potentially 10K Writes!
But wait…

Do we have a user on every single tile?

We could do, but also, one single user isn’t going to need to know about 90% of the tiles, so do they really need to know about all 10K NPCs?
Our Solution: What we did in the end was create a Collection in Firestore to hold reference data for the NPCs. Then, when a user enters a Map Tile, we read the Map Tile data from Firestore, compare it to the NPC reference data Collection and, if required, update that Map Tile’s NPC data to match the reference data. We are only performing two Writes now: one to write the reference data, and one to write the Map Tile data, if it needs it. For an admin change followed by one tile visit, that’s 2 Writes instead of 10K, a reduction of ~99.98% 🚀🚀🚀
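The flow described above could be sketched roughly like this. Again, `CountingDb` is a hypothetical in-memory stand-in rather than the Firestore SDK, and the field names (`baseHealth`, `health`), the function names, and the choice to reset a stale NPC’s health to the new base value are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real collections are not shown in this article.
interface NpcReference {
  npcId: string;
  baseHealth: number;
}

interface TileNpc {
  tileId: string;
  npcId: string;
  baseHealth: number; // base value this tile was last synced with
  health: number;     // current health, unique to this tile
}

// In-memory stand-in that counts billed reads and writes (NOT the Firestore SDK).
class CountingDb {
  reads = 0;
  writes = 0;
  private refs = new Map<string, NpcReference>();
  private tiles = new Map<string, TileNpc>();

  // Unbilled helpers for setting up the example data.
  seedRef(ref: NpcReference): void { this.refs.set(ref.npcId, ref); }
  seedTile(tile: TileNpc): void { this.tiles.set(tile.tileId, tile); }

  getRef(npcId: string): NpcReference | undefined { this.reads += 1; return this.refs.get(npcId); }
  setRef(ref: NpcReference): void { this.writes += 1; this.refs.set(ref.npcId, ref); }
  getTile(tileId: string): TileNpc | undefined { this.reads += 1; return this.tiles.get(tileId); }
  setTile(tile: TileNpc): void { this.writes += 1; this.tiles.set(tile.tileId, tile); }
}

// Admin Panel: nerf or buff an NPC with a single write to the reference data.
function adminSetBaseHealth(db: CountingDb, npcId: string, baseHealth: number): void {
  db.setRef({ npcId, baseHealth });
}

// Player enters a tile: read the tile and its reference, write only if the tile is stale.
function enterTile(db: CountingDb, tileId: string): TileNpc | undefined {
  const tile = db.getTile(tileId);
  if (!tile) return undefined;
  const ref = db.getRef(tile.npcId);
  if (!ref || ref.baseHealth === tile.baseHealth) return tile; // already up to date
  // Assumption: a stale NPC is reset to the new base health.
  const synced: TileNpc = { ...tile, baseHealth: ref.baseHealth, health: ref.baseHealth };
  db.setTile(synced);
  return synced;
}

// 10,000 tiles that all share one NPC type with 100 base health.
const db = new CountingDb();
db.seedRef({ npcId: "goblin", baseHealth: 100 });
for (let i = 0; i < 10000; i++) {
  db.seedTile({ tileId: `t${i}`, npcId: "goblin", baseHealth: 100, health: 100 });
}

adminSetBaseHealth(db, "goblin", 80);      // 1 write instead of 10,000
const firstVisit = enterTile(db, "t42");   // stale tile: 1 more write
const secondVisit = enterTile(db, "t42");  // already synced: no write
```

The trade-off in this sketch is one extra Read of the reference data on each tile entry, and the Writes are spread out over time rather than eliminated: every stale tile a user enters still costs one Write, but tiles nobody visits cost nothing.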

Conclusion 💥

Whilst I cannot give any exact solutions or tips beyond sharing my experience and what I did to tackle the issues that arose, I do think this is something that needs to become more apparent in the community for anyone who works with Cloud Technologies.

Yes, we could write our code as we always have and let it incur the standard cost that it will take to run that code, but is optimising our code so that it is cheaper to run on Cloud Platforms a paradigm shift that we should consider and discuss at more length?

Should our Code Reviews take into account how expensive the code that is being reviewed is going to be to run? Do we need to have dedicated analysts to determine if we can save money by taking a slightly different coding approach than we might have done otherwise?

Let me know what you think below or reach out to me on Twitter: @FerryColum.


Originally published at https://dev.to on December 20, 2019.
