Scaling a fintech in high volatility markets

Miguel Ángel Fajardo
Ninety Nine Product & Tech
5 min read · May 14, 2021

Hi there!

At Ninety Nine we have been busy with all the recent product features we have released and the 1.5k new companies now available in our apps, but we wanted to take some time to write about an engineering area we have learnt a lot about and subsequently made a lot more robust: scaling our systems when markets suddenly become very hot.

This is indeed one of the core (and interesting) aspects to take into account when building a backend for a product whose usage is highly influenced by external variables: companies reporting earnings can cause a tsunami of orders (buying or selling depending on whether the company reports profits or losses), as can political decisions or economic indicators, among others. Not to mention we live in 2021 and new trends are making the markets more volatile than ever, so 1) if you are an investor, be careful where you put your money, and 2) if you are a fintech engineer, make sure your systems scale with demand and your users don't have to stop selling or buying because of downtime.

The good news is that the cloud is your ally in this battle; you just need to prepare in advance and know your tools.

As mentioned in our previous post, since inception we defined an architecture with scalability in mind, while staying pragmatic to make sure we were building a system that a handful of engineers can maintain easily. We are a startup with a Tech team of 9, including iOS, Android, Backend and Core engineers (that said, we are hiring, so check our careers page if you find these challenges interesting).

In short:

  1. We use a microservices architecture. This has pros and cons, but it allows for easier, cheaper and more effective scaling, as specific services can be implemented, analyzed and diagnosed separately depending on their load. For example, the service that calculates the real-time stock price for a company is used heavily, as the price is pulled each time a user views a company (to display the price), sends an order (to block the money for the transaction) or views their portfolio (to calculate total profit). This is quasi-real-time data and can only be cached for ~2 seconds (see the caching sketch after this list). On the other hand, the service that provides market opening and closing times is only called twice per day.
  2. All of our services run in containers (long live Docker!), from local development through Staging to Production. This lets us easily replicate the same behavior across environments, and also easily create more copies of a service to serve more traffic in Production when the running ones start getting overwhelmed.
  3. To run the containers, we use ECS (yes, we are proud AWS users), the native AWS container service. ECS might not have all the bells and whistles Kubernetes has — but hey, we don’t really need them to run a small set of microservices maintained by a team of our size. In this case we prefer keeping it simple.
  4. We have proper monitoring in place with CloudWatch, with a lot of alerts that get triggered when things go south (a sketch of one such alert follows this list). This provides much-needed peace of mind and confidence that you understand what is going on in your system, including when it is overloading.
  5. Last but not least, MongoDB Atlas is also a key and robust part of our system, providing a similar set of features to the ones above (monitoring, stability, native load distribution), in this case for the database.
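On the caching point above, here is a minimal sketch of what a ~2-second TTL cache can look like in Python. This is illustrative only: fetch_quote is a hypothetical stand-in for a call to the upstream market data provider, not our actual client.

import time

PRICE_TTL_SECONDS = 2.0  # quasi-real-time: a cached price stays valid for ~2 seconds
_price_cache = {}  # symbol -> (price, fetched_at)

def get_price(symbol, fetch_quote):
    # fetch_quote is a hypothetical callable hitting the market data provider
    now = time.monotonic()
    entry = _price_cache.get(symbol)
    if entry is not None and now - entry[1] < PRICE_TTL_SECONDS:
        return entry[0]  # fresh enough, avoid hitting the provider again
    price = fetch_quote(symbol)
    _price_cache[symbol] = (price, now)
    return price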
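And on the monitoring point, an alert like the ones we mention could be defined roughly like this with boto3 (we are not showing our real alarms; the cluster, service and SNS topic names here are made up, and the same alarm can of course be declared in CloudFormation instead):

import boto3

cloudwatch = boto3.client("cloudwatch")

# Fire when a service's average CPU stays above 80% for 3 consecutive minutes.
# ClusterName, ServiceName and the SNS topic ARN are illustrative placeholders.
cloudwatch.put_metric_alarm(
    AlarmName="prices-service-cpu-high",
    Namespace="AWS/ECS",
    MetricName="CPUUtilization",
    Dimensions=[
        {"Name": "ClusterName", "Value": "production"},
        {"Name": "ServiceName", "Value": "prices-service"},
    ],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=3,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:eu-west-1:123456789012:oncall-alerts"],
)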

These elements will help you scale, but using them alone doesn't cover you: you have to configure things properly to adjust them to your budget and use case.

Let's walk through a real-world example.

A pattern we noticed quickly after releasing V1 of our product back in 2020 is the high demand when markets open. This is easily explained by the number of events that happen before the market opens, causing people to rush to send orders the second it does.

This is a pretty standard pattern on a regular day:

Hiking Mont Blanc

Here you can see database CPU holding stable and then rising more than 2x the moment markets open (times are in UTC; the market opens at 14:30 in this graph).

When something is making a stock hot, or any other reason causes a rush, the picture looks more like this:

Now climbing Everest

In this case you can see the number of queries spiking 40x (times also in UTC, but in this case with summer DST, so the market opens at 13:30).

So, in order to be prepared for this, you have two options:

  1. Run your services on huge hardware that can withstand the maximum load.
  2. Run your services in small containers and spin up more instances as load increases, dividing the load among them.

Number 2 is obviously preferred (especially in a cost-sensitive startup with a limited budget), with the drawback that new instances take some time from the moment you launch them until they are connected to the network and start serving traffic. And a lot can happen in those few minutes.

To get the best of both worlds, after some time learning and understanding the usage patterns, what we do is spin up a lot more instances before the market opens and keep those copies running long enough for load to decrease (this is also important to keep costs under control):

ScheduledActions:
  - ScalableTargetAction:
      MaxCapacity: 100
      MinCapacity: 70
    Schedule: "cron(00 14 ? * MON-FRI *)"
    ScheduledActionName: before-market-opens
  - ScalableTargetAction:
      MaxCapacity: 9
      MinCapacity: 3
    Schedule: "cron(30 15 ? * MON-FRI *)"
    ScheduledActionName: after-market-rally

This is a small excerpt of our CloudFormation configuration describing the scaling policy for the number of instances: going up from 3–9 instances to 70–100 thirty minutes before the market opens, and back down to 3–9 after the market rush (if load has gone down by then; otherwise instances keep running).
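If you manage scaling from scripts rather than templates, the equivalent scheduled action can also be registered through the Application Auto Scaling API. Here is a sketch with boto3, where the cluster and service names in ResourceId are placeholders:

import boto3

autoscaling = boto3.client("application-autoscaling")

# Equivalent of the "before-market-opens" action in the template above.
# "production" and "prices-service" are illustrative placeholder names.
autoscaling.put_scheduled_action(
    ServiceNamespace="ecs",
    ScheduledActionName="before-market-opens",
    ResourceId="service/production/prices-service",
    ScalableDimension="ecs:service:DesiredCount",
    Schedule="cron(00 14 ? * MON-FRI *)",
    ScalableTargetAction={"MinCapacity": 70, "MaxCapacity": 100},
)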

And that's it! Next steps on our scaling roadmap include:

  • A lot of fine-tuning, like analyzing more patterns and adding more cases to this auto-scaling setup
  • Researching AWS Predictive Scaling, which in theory covers more cases automatically
  • Using some sort of Machine Learning/AI to anticipate an unexpected load increase in the middle of the day (like a large-scale terrorist attack that would cause a lot of sell orders), using external signals like a news feed

Please let us know in the comments if there's something you would love to know about Ninety Nine and we will try to cover it in future posts. And if you found this article interesting and are looking for new challenges, remember we are hiring :-)
