10+ Scalability Laws To Follow In Your Next Project.

Don’t touch your keyboard without understanding this

Alexander Obidiegwu
5 min read · Apr 15, 2024

The world itself is a self-scaling phenomenon worth studying. Every living moment, the world is expanding, evolving, and adapting. Within it, there exists independence in dependency and vice versa.

Everything and everyone works in its own best interest while, at the same time, working in the best interest of humanity as a whole.

No mistake or error made by a single person can completely take humanity off course.

Everything and everyone can be made redundant. Everything and everyone is reused. Everything and everyone exists in their most experienced and updated state.

These are some of the heuristics to follow when designing a scalable system.

Before we start any project, we should have a solid idea of how many users or tasks the project will need to handle. If it’s a project at work, have this conversation with your boss.

I prefer to overestimate the number of expected users by a margin of 10 times. If the expected users are 10, I craft the project with 100 in mind. If it’s 100, I expect 1,000, and so on. This helps me prepare for unaccounted-for circumstances, and there almost always are some.

1. Caching

Ask yourself… If I were designing the world, and every time I needed to fetch something I had to fetch it from the very source and compile it before accessing its most valid state, would that be an efficient system? It would essentially mean that for every minute you exist, you would need to be recreated from scratch before you could perform any action.

Obviously not efficient.

This is why caching is useful. It allows us to access the most valid and useful form of an object. Obviously, the world is far better at updating its cache than we are, but you get the point.

We use caching to offload the database as much as possible and to serve frequently requested items as fast as possible without putting additional stress on the system.

Always store your cache in memory (RAM) rather than on disk; reading from RAM is orders of magnitude faster than reading from disk.

Several pieces of software can help us handle caching effectively, including Memcached, Redis, Varnish Cache, and CDNs like Cloudflare.
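As a minimal sketch of the idea, Python’s built-in functools.lru_cache keeps results in memory so repeated lookups never touch the slow source (tools like Redis and Memcached apply the same principle across processes and machines). The fetch_user function and its fake delay are hypothetical stand-ins for a real database call:

```python
import functools
import time

call_count = 0  # tracks how often we actually hit the "database"

@functools.lru_cache(maxsize=256)
def fetch_user(user_id: int) -> dict:
    """Pretend to fetch a user record from a slow database."""
    global call_count
    call_count += 1
    time.sleep(0.01)  # simulate slow I/O
    return {"id": user_id, "name": f"user-{user_id}"}

# The first call hits the "database"; repeated calls are served
# straight from memory without re-running the function body.
fetch_user(42)
fetch_user(42)
fetch_user(42)
```

After these three calls, call_count is still 1: only the first lookup paid the I/O cost.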

2. Modularity

We have to break things down to their MVP (Minimum Viable Program). Breaking the software down into separate functionalities keeps the code clean and maintainable. It allows for easy testing and deployment, and it makes it easy to add more modules and features later. All of these enhance scalability.
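As a toy illustration (the function names here are made up for the example), a report pipeline can be broken into single-purpose pieces that can each be tested, replaced, or reused on their own:

```python
def parse(raw: str) -> list[int]:
    """Turn a comma-separated string into numbers."""
    return [int(x) for x in raw.split(",")]

def summarize(values: list[int]) -> dict:
    """Compute the statistics the report needs."""
    return {"count": len(values), "total": sum(values)}

def render(summary: dict) -> str:
    """Format the summary for display."""
    return f"{summary['count']} items, total {summary['total']}"

def report(raw: str) -> str:
    """The full pipeline is just a composition of the modules above."""
    return render(summarize(parse(raw)))

print(report("1,2,3"))  # → 3 items, total 6
```

Swapping render for an HTML version, or parse for a JSON one, never touches the other pieces — which is exactly what makes modular code easy to grow.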

3. Asynchronous and Non-blocking I/O

Running I/O operations asynchronously is a must. We do not want to have to wait for a task or request to finish before performing another task. Before building our software asynchronously, however, we must confirm that the problem is actually suited to asynchronous methods.

We use asynchronous operations when all the tasks to be performed are independent of one another. That is, the input and result of one task aren’t dependent on the result or input of another.

We could use the asyncio, concurrent.futures, or threading modules in Python to achieve this.
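Here is a minimal asyncio sketch with three hypothetical, independent I/O tasks running concurrently; the total wait is roughly the longest single delay rather than the sum of all three:

```python
import asyncio

async def fetch(name: str, delay: float) -> str:
    """Simulate an independent I/O-bound task (e.g. an HTTP call)."""
    await asyncio.sleep(delay)
    return f"{name} done"

async def main() -> list[str]:
    # The three tasks are independent, so gather() runs them
    # concurrently: total time is ~0.03s (the longest delay),
    # not the 0.06s it would take to run them one after another.
    return await asyncio.gather(
        fetch("a", 0.03), fetch("b", 0.02), fetch("c", 0.01)
    )

results = asyncio.run(main())
print(results)  # results come back in the order they were submitted
```

Note that asyncio.gather preserves submission order in its results even though the tasks finish in a different order.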

4. Load Balancing

Load balancing is the distribution of traffic across different servers. A load balancer can be either software or hardware. It decides which server should handle a particular request, usually based on criteria such as request size, estimated time to complete, or which server is currently least busy.

Using load balancers to distribute incoming traffic across multiple instances of the software ensures that no single instance is overwhelmed.
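Real load balancers such as NGINX or HAProxy also track server health and connection counts, but the simplest strategy, round-robin, can be sketched in a few lines (the server addresses below are hypothetical):

```python
import itertools

class RoundRobinBalancer:
    """Toy software load balancer: hands out servers in strict
    rotation so no single instance receives all the traffic."""

    def __init__(self, servers: list[str]):
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._cycle)

balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
# Six incoming requests are spread evenly: each server gets two.
assignments = [balancer.pick() for _ in range(6)]
```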

5. Fault Tolerance

When designing our application, we must ensure it’s able to withstand and quickly recover from any error it encounters. This is usually done with error-handling constructs such as try/except (or try/catch) blocks.

No single function should have enough power or control to block our entire application from executing or performing other tasks.

We must design the software to be resilient to failures through graceful degradation, failover mechanisms, exponential backoff, and similar techniques.
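As one example, retrying with exponential backoff can be sketched like this; the flaky operation is a made-up stand-in for a network call that fails a couple of times before succeeding:

```python
import time

def with_backoff(operation, retries: int = 3, base_delay: float = 0.01):
    """Run `operation`, retrying with exponentially growing delays.
    If every attempt fails, re-raise the last error rather than
    swallowing it silently."""
    for attempt in range(retries):
        try:
            return operation()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, ...

# Hypothetical flaky call that only succeeds on the third attempt.
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("temporary outage")
    return "ok"

result = with_backoff(flaky)  # survives two failures, returns "ok"
```

The growing delays give a struggling downstream service room to recover instead of hammering it with instant retries.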

6. Database sharding

This involves splitting a database into separate smaller components (usually called shards) to reduce server load and improve response times. A database server can only handle so much load before it begins to slow down your application. When building scalable applications, we must take this into consideration and address it properly.

We can split the database using different methods: by ID range (e.g. 1–10,000, 10,001–100,000, etc.), by using a hashing algorithm to determine which shard a particular record belongs to, or by separating records alphabetically on some column (e.g. Database 1: [1, Alexander], Database 2: [2, Bright]).

When choosing a method, we must ensure it works for both retrieval and insertion. We do not want to pick a scheme only to discover that retrieving a record requires searching through every single shard.

We also want a method that distributes our data evenly across the shards. This usually requires knowledge of the problem context and use case.

Using database sharding allows us to scale horizontally rather than vertically which promotes scalability.

A good sharding strategy addresses all of the above.
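The hash-based approach can be sketched as follows; the shard count and key format are hypothetical, and a stable hash (rather than Python’s per-process salted hash()) is used so every process maps a given key to the same shard for both reads and writes:

```python
import hashlib

NUM_SHARDS = 4  # hypothetical: four database servers

def shard_for(key: str) -> int:
    """Map a record key to a shard index. MD5 is used here only as a
    stable, well-spread hash, not for security."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS

# The same key always lands on the same shard, so retrieval never
# has to search every database...
assert shard_for("user:42") == shard_for("user:42")

# ...and different keys spread across shards instead of piling
# onto one server.
shards = {shard_for(f"user:{i}") for i in range(100)}
```

One caveat worth knowing: with plain modulo hashing, changing NUM_SHARDS remaps almost every key, which is why production systems often use consistent hashing instead.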

7. Use Microservice Architectures

Using a microservice architecture lets us loosely couple our application, which means we can decrease or increase each service’s capacity based on demand.

We can do this by either separating our services/components on different servers from the same provider or on different servers from different providers.

Unlike the traditional monolithic architecture, this lets us easily scale each service on demand.

Loosely coupled components also keep the application running even if one component fails, bringing our application much closer to continuous uptime (though 100% uptime is never truly guaranteed).

The application is also easier to maintain, since developers can remove a component and replace it with a new, updated version without disrupting the entire application.

8. Monitoring and Auto-scaling

It’s critical to have monitoring services integrated into any application. They help us track important metrics such as failures, application errors, outages, performance, and speed.

When I developed my first mobile app, I used Sentry as my monitoring service. But when deploying to the cloud, your cloud provider usually has monitoring services out of the box for you. For instance, AWS has CloudWatch.

9. Choosing the right technology

This has to be the most critical aspect. Whether your entire application can scale depends on the technology it’s built with.

You want to choose stacks like Java or Python, which allow us to follow basic OOP principles; these principles also help keep our projects highly scalable. Node.js is another stack that can enhance the scalability of our application.

They allow us to easily reuse and maintain our code base over time. They also let us separate the concerns of our application, which promotes readability and reusability.

10. Choose a horizontal scale over a vertical one

You might find that the system is sometimes limited in how far it can scale vertically: a single server can only be upgraded so much. With horizontal scaling, you can simply add another server rather than upgrade the existing one, and it will usually save you money too.

Bonus: Take work away from the core.

The number of clients will exceed the number of services you have available. So, to avoid bottlenecks while scaling the application up, push as much work as you can away from the core, for example out to caches, CDNs, and background workers.


Alexander Obidiegwu

Explaining foundational concepts and principles of Python, SQL and Mathematics.