Horizontal Scaling of a Monolith Service

Less change in code, more change in architecture design

Ashraful Islam Nixon
The Devs Tech
4 min read · Apr 28, 2020


Behind the Story

In early 2019 I helped out a team by looking into a service that had some technical challenges. The most important of those were:

  1. APIs were becoming slow under high traffic
  2. API performance was still slow after multiple rounds of vertical scaling

They asked me to help them find some kind of solution, since they were paying a higher bill with zero performance improvement and traffic was increasing every day :( .

Photo by Hal Gatewood on Unsplash

Initial Problems:

Let’s say they had a single server called ServX at the beginning. After getting access to their system, here is what I saw:

  1. Their DB was executing lots of similar queries.
  2. DB queries were taking a long time joining large tables
  3. They had no caching mechanism
  4. They were running both the API and the database on ServX
  5. Their images were being stored on and served from ServX
  6. Their admin panel was running on ServX
  7. Their background jobs were running on ServX
  8. Their CPU usage was very high
  9. ServX disk I/O was very high
  10. Even when CPU and memory usage were low, API responses were slow for concurrent requests

Initial load-testing report: 1.2k rpm for 30 sec

Initial thoughts:

  1. They are running lots of similar queries. Do they need a caching server, maybe Redis?
  2. Maybe it’s not a CPU or memory issue but a disk issue, since all the services are running on the same ServX?

Do we rewrite the system part by part to move toward microservices?

But at that moment they did not have enough time or resources to rewrite the service while also supporting the increasing traffic on the existing system.
So we took a series of scaling actions on the monolith service instead.

Scaling 101 — implement caching:

Process:

We decided to implement caching to reduce the DB query load. So what we did:

  1. Set up a Redis cache on ServX
  2. Added a flat 2–7 min TTL cache on the major endpoints (a minimal sketch follows this list)
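
In code this is roughly the cache-aside pattern, assuming redis-py; the endpoint, key format, and stub query helper below are illustrative, not their actual code:

    import json
    import random

    import redis

    r = redis.Redis(host="localhost", port=6379, db=0)

    def run_expensive_db_query(category_id):
        # Stand-in for the real slow query with large joins.
        return [{"id": 1, "category": category_id}]

    def get_products(category_id):
        key = f"products:{category_id}"
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)  # cache hit: no DB work at all

        rows = run_expensive_db_query(category_id)

        # Randomize the TTL between 2 and 7 minutes so hot keys
        # don't all expire (and hit the DB together) at the same moment.
        r.set(key, json.dumps(rows), ex=random.randint(120, 420))
        return rows

Every repeat request inside the TTL window is served straight from Redis, which is why the DB query count dropped immediately.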

Output:

  • DB queries decreased significantly
  • API response times improved
  • Concurrent requests were still causing the same issues as before
  • After the cache expired, the API behaved the same as before; performance improved, but it still wasn’t satisfying

Scaling 102 — separate DB server:

Process:

  1. Create a new DB server, dbX
  2. Move the existing data from ServX to dbX
  3. Update the DB connection info on ServX and kill the DB server on ServX (a sketch of the cutover follows this list)
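
Code-wise, the cutover amounts to a one-line connection-string change; a sketch assuming an environment-based config (host name and credentials are made up):

    import os

    # Before: app and DB shared ServX, so the app connected locally:
    #   "mysql://app:secret@localhost:3306/appdb"
    # After: same app, same queries, now pointed at the dedicated dbX host.
    DATABASE_URL = os.environ.get(
        "DATABASE_URL", "mysql://app:secret@dbX.internal:3306/appdb"
    )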

Output:

  1. ServX CPU and memory load decreased
  2. ServX API response times improved
  3. ServX disk I/O decreased a little bit
  4. Concurrent traffic response was still an issue

Scaling 103 — separate Redis server:

Process:

  1. Install a Redis server on dbX
  2. Update the caching connection info on ServX and kill the Redis server on ServX (the same one-line cutover, sketched after this list)
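
Again with an illustrative host name, the cutover is:

    import redis

    # Was host="localhost" while Redis lived on ServX.
    r = redis.Redis(host="dbX.internal", port=6379, db=0)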

Output:

  1. ServX CPU and memory load decreased
  2. ServX API response times improved much more than before
  3. dbX disk I/O increased a bit, but not alarmingly
  4. Concurrent traffic response was still an issue

Scaling 104 — separate file server:

Process:

  1. Create a server, MiniX, to set up MinIO
  2. Change the code to support Amazon S3 file uploads, since MinIO integration is the same as S3 file upload (sketched after this list)
  3. Move the existing files to MiniX and set up proper ACLs
  4. Change the file URLs in the DB
  5. Remove the files from local storage
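
The reason the code change stays small: MinIO speaks the S3 API, so a standard boto3 client works once it is pointed at MiniX. A sketch with made-up endpoint, bucket, and credentials:

    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="http://miniX.internal:9000",  # MinIO instead of AWS
        aws_access_key_id="MINIO_ACCESS_KEY",
        aws_secret_access_key="MINIO_SECRET_KEY",
    )

    def upload_image(local_path, key):
        # public-read ACL so images are served directly from MiniX
        # instead of being proxied through ServX.
        s3.upload_file(
            local_path, "images", key, ExtraArgs={"ACL": "public-read"}
        )
        return f"http://miniX.internal:9000/images/{key}"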

Output:

  1. ServX CPU and memory load decreased
  2. ServX API response times improved
  3. ServX disk I/O decreased significantly
  4. Issues with concurrent traffic response were still there, but much less severe

Scaling 105 — separate APIs from utility tasks:

Process:

  1. Create a server AdminX
  2. Move the admin panel from ServX to AdminX
  3. Move the background tasks from ServX to AdminX (a hedged sketch follows this list)
  4. Kill the admin panel and background tasks on ServX
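
If the background jobs are queue-based, this move needs almost no code change, because the broker already lives off-box on dbX. A hedged sketch assuming a Celery-style setup (the story doesn’t name their actual job runner):

    from celery import Celery

    # Broker on dbX's Redis, so any machine can run the worker.
    app = Celery("tasks", broker="redis://dbX.internal:6379/1")

    @app.task
    def resize_image(image_key):
        # Stand-in for a CPU-heavy job that used to steal
        # cycles from the API while it ran on ServX.
        ...

You start the worker on AdminX (celery -A tasks worker) and simply stop starting it on ServX.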

Output:

  1. ServX CPU and memory load decreased
  2. ServX API response times improved
  3. Issues with concurrent traffic response reduced to almost normal

Scaling 106 — add a load balancer:

Now it was time to add a load balancer, since all the dependencies had been separated onto multiple servers.

Process:

  1. Create a load-balancing server, LoadX, using Nginx
  2. Take a backup image of ServX
  3. Create 6 servers from the ServX backup
  4. Map those 6 servers’ IPs as upstream servers in the LoadX configuration (a sketch follows this list)
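
A minimal sketch of the LoadX Nginx configuration; the upstream IPs are illustrative. Nginx round-robins across the pool by default:

    upstream servx_pool {
        server 10.0.0.11;
        server 10.0.0.12;
        server 10.0.0.13;
        server 10.0.0.14;
        server 10.0.0.15;
        server 10.0.0.16;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://servx_pool;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }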

Output:

  1. API response times improved drastically.
  2. Concurrent traffic issues dropped like magic.

Final Result:

  • API performance improved from 130 sec to 100–350 ms
  • Very few codebase changes compared to the performance gained
  • Server cost reduced to 70%

Final load-testing report: 12k rpm for 60 sec

You may be thinking: we added so many servers, yet cost was still reduced to 70%? We used very low-spec servers, since the plan was to spread the load of one high-powered server across multiple low-powered ones.

Basically, we moved from vertical scaling to horizontal scaling! This allowed us to handle much more traffic with less costly resources.
