Scaling a Serverless Application

Amirhossein Soltani
4 min read · Jan 6, 2023


Scaling a serverless application can be done in a few different ways, depending on the specific needs of your application. Here are a few options:

  1. Auto-scaling: Many cloud providers, such as AWS, Azure, and GCP, offer auto-scaling for serverless applications. This means the cloud provider automatically increases or decreases the number of instances of your application in response to changes in demand (horizontal scaling).
  2. Manual scaling: You can also manually adjust the number of instances of your application. This can be useful if you want to scale up or down based on a schedule, or if you want more control over the scaling process.
  3. Optimizing code: Another way to scale a serverless application is to optimize the code itself. This can include optimizing database queries, reducing the amount of data being transferred, and using caching to reduce the load on the application.
  4. Splitting workloads: If certain parts of your application are experiencing more demand than others, you can split the workload across multiple functions or services. This can help to distribute the load more evenly and improve scalability.
  5. Use a managed service: Some cloud providers offer managed services for serverless applications, which can handle scaling and other operational tasks for you.

In this article we are going to focus on the first approach, auto-scaling, and see how cloud providers such as AWS, Azure, and GCP manage it.

AWS Serverless Auto-scaling

AWS provides autoscaling for its serverless offerings, including AWS Lambda and Amazon Elastic Container Service (ECS). With autoscaling, the number of instances of your application is automatically increased or decreased based on demand.

In the case of AWS Lambda, autoscaling is based on the number of invocations of your function. The number of instances of the function is automatically adjusted based on the incoming request rate and the configured concurrency limits.

Figure: Lambda auto-scaling based on request traffic and the configured maximum size
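For example, with the AWS CLI you can cap how far a single function is allowed to scale by reserving concurrency for it, and optionally keep some instances warm with provisioned concurrency. This is a minimal sketch; the function name and alias below are placeholders:

    # Cap the function at 100 concurrent instances ("my-function" is a placeholder)
    aws lambda put-function-concurrency \
        --function-name my-function \
        --reserved-concurrent-executions 100

    # Optionally keep 10 instances initialized and ready ("prod" is a placeholder alias)
    aws lambda put-provisioned-concurrency-config \
        --function-name my-function \
        --qualifier prod \
        --provisioned-concurrent-executions 10

Within these limits, Lambda itself decides how many instances to run based on the incoming request rate.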

For Amazon ECS, autoscaling is based on the number of tasks in the service, and can be configured to scale based on metrics such as CPU utilization or memory usage.

Figure: ECS auto-scaling based on request traffic and the configured maximum size

To set up autoscaling for a serverless application on AWS, you can use the AWS Management Console or the AWS CLI. You can specify the minimum and maximum number of instances, as well as the conditions under which the number of instances should be increased or decreased.
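For example, with the AWS CLI an ECS service is scaled through Application Auto Scaling: you register the service as a scalable target with minimum and maximum task counts, then attach a scaling policy. This is a minimal sketch; the cluster and service names are placeholders:

    # Register the ECS service with min/max task counts
    # ("my-cluster" and "my-service" are placeholder names)
    aws application-autoscaling register-scalable-target \
        --service-namespace ecs \
        --scalable-dimension ecs:service:DesiredCount \
        --resource-id service/my-cluster/my-service \
        --min-capacity 2 \
        --max-capacity 10

    # Attach a target-tracking policy that keeps average CPU around 70%
    aws application-autoscaling put-scaling-policy \
        --service-namespace ecs \
        --scalable-dimension ecs:service:DesiredCount \
        --resource-id service/my-cluster/my-service \
        --policy-name cpu-target-tracking \
        --policy-type TargetTrackingScaling \
        --target-tracking-scaling-policy-configuration \
        '{"TargetValue": 70.0, "PredefinedMetricSpecification": {"PredefinedMetricType": "ECSServiceAverageCPUUtilization"}}'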

Azure Serverless Auto-scaling

Azure provides autoscaling for its serverless offerings, including Azure Functions and Azure Container Instances (ACI). With autoscaling, the number of instances of your application is automatically increased or decreased based on demand.

In the case of Azure Functions, autoscaling is based on the number of incoming requests to the function. The number of instances of the function is automatically adjusted based on the incoming request rate and the configured concurrency limits.

Figure: Azure Functions auto-scaling based on requests from multiple sources
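For example, on the Consumption plan you can cap how far a function app scales out with an app setting via the Azure CLI. This is a minimal sketch; the app and resource group names are placeholders, and you should check that this setting applies to your plan type:

    # Limit the function app to 5 instances ("my-function-app" and "my-rg" are placeholders)
    az functionapp config appsettings set \
        --name my-function-app \
        --resource-group my-rg \
        --settings WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT=5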

For Azure Container Instances, scaling is based on the number of container groups you run, and it can be driven by metrics such as CPU utilization or memory usage.

To set up autoscaling for a serverless application on Azure, you can use the Azure portal or the Azure CLI. You can specify the minimum and maximum number of instances, as well as the conditions under which the number of instances should be increased or decreased.
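For example, on an Elastic Premium plan the Azure CLI lets you define an always-ready minimum and a maximum burst size on the plan that hosts your function app. This is a minimal sketch; the plan and resource group names are placeholders:

    # Keep at least 1 instance warm and allow bursting up to 10 instances
    # ("my-premium-plan" and "my-rg" are placeholder names)
    az functionapp plan update \
        --name my-premium-plan \
        --resource-group my-rg \
        --min-instances 1 \
        --max-burst 10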

GCP Serverless Auto-scaling

GCP provides autoscaling for its serverless offerings, including Cloud Functions and Cloud Run. With autoscaling, the number of instances of your application is automatically increased or decreased based on demand.

In the case of Cloud Functions, autoscaling is based on the number of invocations of your function. The number of instances of the function is automatically adjusted based on the incoming request rate and the configured concurrency limits.

Figure: Cloud Functions scaling with the number of requests sent by IoT sensors
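For example, gcloud lets you set instance limits when deploying a function. This is a minimal sketch; the function name, runtime, and trigger are placeholders you would replace with your own:

    # Deploy an HTTP-triggered function that scales between 1 and 50 instances
    # ("my-function" is a placeholder name)
    gcloud functions deploy my-function \
        --runtime python311 \
        --trigger-http \
        --min-instances 1 \
        --max-instances 50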

For Cloud Run, autoscaling is based on the number of requests to the service, and can be configured to scale based on metrics such as CPU utilization or memory usage.

To set up autoscaling for a serverless application on GCP, you can use the GCP Console or the gcloud command-line tool. You can specify the minimum and maximum number of instances, as well as the conditions under which the number of instances should be increased or decreased.
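For example, gcloud lets you set instance limits and per-instance concurrency on a Cloud Run service. This is a minimal sketch; the service name is a placeholder:

    # Allow up to 80 concurrent requests per instance, scaling between 1 and 20 instances
    # ("my-service" is a placeholder name)
    gcloud run services update my-service \
        --min-instances 1 \
        --max-instances 20 \
        --concurrency 80

With these settings, Cloud Run adds instances as the existing ones approach their concurrency limit.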

Conclusion

As you might have noticed, the auto-scaling behavior of each cloud provider is pretty much the same for similar services. AWS Lambda, Azure Functions, and GCP Cloud Functions all scale in essentially the same way, since they share the same spirit and offer the same functionality.
The same applies to AWS ECS, Azure Container Instances, and GCP Cloud Run: they are all container-based services, so they take a similar approach to auto-scaling.

I have more articles about the serverless world, and I also write articles and stories about coding, IT lessons, and technology. If you’d care to read more, follow me on Medium :)
