What is a scalable application?
In short, a scalable web application is one that can run as multiple instances, growing or shrinking their number based on the amount of work it has to do.
In the past, and probably still today, you are used to working on a project and putting it on a virtual machine or dedicated server, which works fine. Or does it?
Have you ever been in a situation where you had to restart your server because some requests, or all of them, were failing?
It has happened to all of us, but did restarting the server actually solve the problem?
There are many reasons for this to happen, but the main one is that a single server can handle only a limited number of concurrent connections. Let us go through some points and rules for building an application properly, keeping in mind that it will have to handle a bigger load, and we want to be prepared for it.
1. Load balancer
The first key component is a load balancer: set one up yourself or use a managed one from your hosting provider, be it AWS, Google Cloud, Azure, or any other.
Two good practices here are to run frequent health checks on your instances, and to make sure new and old instances are registered with (and deregistered from) your load balancer, so that you achieve auto-scaling and keep unhealthy instances from receiving traffic.
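For those health checks to work, each instance needs an endpoint the load balancer can probe. Here is a minimal sketch using only the Python standard library; the `/health` path and the bare "ok" response are assumptions, and a real application would also verify its database and cache connections before reporting healthy:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            # The load balancer only needs a 200 to consider us healthy.
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the example quiet

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```

You would then point the load balancer's health check at `/health` with a short interval and a small unhealthy threshold, so a failing instance is pulled out of rotation quickly.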
2. Caching
Even once you have instance scaling, your application has not escaped every limitation.
Most probably your application uses a database, be it MySQL, Postgres, or any other, and databases have limits of their own. This means the database becomes a bottleneck, because it cannot serve your many instances.
Caching comes into play here, because it lets you avoid hitting the database for every request.
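The usual pattern is cache-aside: check the cache first and only query the database on a miss. The sketch below uses an in-process dict with a TTL purely for illustration; in a scalable setup you would use a shared cache such as Redis or Memcached so all instances see the same data, and `fetch_user_from_db` is a hypothetical stand-in for a real query:

```python
import time

_cache: dict = {}  # in-process stand-in for a shared cache like Redis

def fetch_user_from_db(user_id):
    # Hypothetical stand-in for a real SQL query.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id, ttl=60):
    entry = _cache.get(user_id)
    if entry and time.time() - entry[1] < ttl:
        return entry[0]  # cache hit: no database round trip
    user = fetch_user_from_db(user_id)  # cache miss: hit the database once
    _cache[user_id] = (user, time.time())
    return user
```

The TTL keeps stale entries from living forever; tune it per data type, since a user profile can tolerate more staleness than, say, an account balance.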
3. Local storage
This rule is very important to keep in mind: you might have built upload functionality for images or files that stores them in a local folder.
In a scalable application that feature will cause problems. To demonstrate, assume the following scenario:
- You have a scalable application that currently has three instances; let's call them instance A, instance B, and instance C.
- A visitor uploads an image.
- The load balancer routes that request to instance A.
- The image is saved in a folder on instance A.
- When the visitor refreshes the page, they expect to see the uploaded image.
- This time the load balancer routes the request to instance B.
- Only instance A has the image and instance B does not, so the visitor gets an error that the image does not exist.
To avoid this, you can use cloud storage like AWS S3, or a shared disk; but shared disks are usually slower, so cloud storage is the way to go.
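With S3, the upload lands in a store every instance can reach, so it no longer matters which instance handled the request. A minimal sketch using boto3 (AWS's official Python SDK); the bucket name, key prefix, and helper names here are assumptions for illustration:

```python
import uuid

def make_object_key(filename: str) -> str:
    # Prefix with a UUID so concurrent uploads from different
    # instances can never collide on the same key.
    return f"uploads/{uuid.uuid4()}-{filename}"

def upload_to_s3(fileobj, filename: str, bucket: str) -> str:
    # boto3 is imported lazily so the key helper stays stdlib-only.
    import boto3
    key = make_object_key(filename)
    boto3.client("s3").upload_fileobj(fileobj, bucket, key)
    return key  # store this key in the database, not a local path
```

Store the returned key (or the resulting URL) in your database; any instance can then serve the image straight from S3 or via a CDN in front of it.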
By default, your application's logs will probably be written to a file or files. Instead, you should stream them to stdout and ship them from there to another store, use a service like CloudWatch from AWS, or just use Sentry, which is really good for this.
In general, be careful with what you save to local storage.
The rules mentioned above are the basic steps to keep in mind, but the list goes on. More advanced scalability topics are out of scope for this guide, but it is worth researching database indexing for faster queries, queues for long-running jobs to avoid request timeouts, NoSQL databases that support sharding, message brokers, and whether to design your application as a monolith or as microservices.
All of these play a role in the system's capacity to scale.
4. Saturation point
It is good practice to know the limits of your instances, so that you know when the system should scale and add more instances. The load level at which an instance becomes unstable is called its saturation point.
To identify your saturation point, use a so-called ramp-up test: stress test a single instance with gradually increasing load until it becomes unstable.
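The analysis side of a ramp-up test can be sketched as below. The input data, the "2x baseline latency" threshold, and the function name are all assumptions for illustration; in practice you would collect the numbers with a load-testing tool and pick the criterion (latency, error rate, CPU) that fits your application:

```python
def saturation_point(latency_by_load, baseline_ms, factor=2.0):
    """Return the lowest load level (e.g. concurrent users) at which
    median latency exceeds `factor` times the baseline latency,
    or None if the instance never saturated during the test."""
    for load in sorted(latency_by_load):
        if latency_by_load[load] > baseline_ms * factor:
            return load
    return None

# Hypothetical ramp-up results: median latency (ms) per
# number of concurrent users on a single instance.
results = {10: 52, 50: 61, 100: 140, 200: 910}
print(saturation_point(results, baseline_ms=52))  # → 100
```

Knowing that one instance saturates around 100 concurrent users, you can set your auto-scaling policy to add an instance well before that point.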
For more on how to deploy your application on AWS at low cost, check out this article: How to scale Laravel on AWS