Autoscaling in AWS
In a traditional data center, scaling is reactive and manual. We bring servers up and down by hand as the application workload or traffic changes. Suppose traffic is increasing and we need 2–3 more servers to handle the user requests. Adding them is time-consuming and not sustainable, because it depends on factors such as the availability of servers, cost approval, and the availability of admins. And when traffic goes down, we have to remove the extra servers; otherwise they sit idle and increase the cost.
Cloud computing solves these problems because it is very easy to add and remove resources based on traffic. We use autoscaling to do this.
What is autoscaling?
Autoscaling is a feature of cloud computing that automatically increases and decreases compute, memory, and networking resources according to actual usage and demand. It gives applications flexibility and elasticity as compute demand changes, and it enables consistent application performance at a low cost.
Let’s talk about autoscaling in AWS.
AWS Autoscaling
AWS Autoscaling is a service that monitors your application and automatically adjusts capacity to maintain steady, predictable application performance at the lowest cost. It uses Amazon CloudWatch to monitor application demand and raise alarms that trigger scaling the resources up or down.
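To make the alarm idea concrete, here is a minimal sketch of a threshold check in plain Python. It does not call CloudWatch or any AWS API; the function name `scaling_decision` and the 70% / 30% CPU thresholds are assumptions chosen only for illustration.

```python
from statistics import mean


def scaling_decision(cpu_samples, scale_out_above=70.0, scale_in_below=30.0):
    """Return +1 to add an instance, -1 to remove one, or 0 to hold.

    The thresholds are hypothetical; a real alarm is configured per workload.
    """
    if not cpu_samples:
        return 0  # no data points yet, so do nothing
    average = mean(cpu_samples)
    if average > scale_out_above:
        return 1   # demand is high: scale up
    if average < scale_in_below:
        return -1  # demand is low: scale down
    return 0       # within the comfortable band


print(scaling_decision([82.0, 91.5, 77.3]))  # prints 1
```

A real alarm would typically also require the threshold to be breached for several consecutive periods before acting, which this sketch leaves out.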
Benefits of AWS Autoscaling
AWS Autoscaling offers the following benefits:
- Better Fault Tolerance
- High Availability of the resources
- Flexibility and Elasticity
- Only pay for what you use, better cost management
- High Reliability of the resources
- Automatically maintain performance
Types of Scaling Plan
There are different types of scaling plans to choose from:
Manual Scaling -> We change the size of the group ourselves, and Autoscaling takes care of launching or terminating the instances to match.
Schedule Based -> When traffic follows a predictable pattern, we can schedule the times at which scaling actions run.
Demand Based -> We define scaling policies that adjust capacity as client demand changes.
Current Instance-Level Based -> We configure an Autoscaling group to keep a fixed number of instances running at all times.
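The schedule-based and demand-based plans above can be sketched in a few lines of Python. This is an illustration only: the schedule values, the function names, and the proportional rule are assumptions, not the algorithm AWS uses internally.

```python
import math

# Hypothetical schedule: (start_hour, end_hour, capacity), hours 0-23, end exclusive.
SCHEDULE = [(9, 18, 6), (18, 22, 4)]
DEFAULT_CAPACITY = 2  # assumed capacity outside the scheduled windows


def scheduled_capacity(hour, schedule=SCHEDULE, default=DEFAULT_CAPACITY):
    """Schedule based: pick the capacity configured for the given hour."""
    for start, end, capacity in schedule:
        if start <= hour < end:
            return capacity
    return default


def demand_capacity(current_instances, metric_value, target_value):
    """Demand based: resize so the per-instance metric moves back toward the target.

    A simple proportional rule, used here as a stand-in for a scaling policy.
    """
    return max(1, math.ceil(current_instances * metric_value / target_value))


print(scheduled_capacity(10))       # prints 6
print(demand_capacity(4, 90, 60))   # prints 6
```

The first function depends only on the clock, while the second reacts to a measured metric such as average CPU, which is the essential difference between the two plans.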
Let’s see what an Autoscaling group is.
AWS EC2 Autoscaling Group
An Autoscaling group is a collection of Amazon EC2 instances that are grouped logically for automatic scaling and management. It also lets us use EC2 Autoscaling features such as health check replacements and scaling policies.
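Health check replacement can be pictured as a small reconciliation loop: drop the unhealthy instances, then launch new ones until the group is back at its target count. The sketch below simulates this in memory; the `Instance` class, `launch_instance`, and `reconcile` are hypothetical names and no AWS API is involved.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Instance:
    instance_id: str
    healthy: bool = True


_next_id = count(1)  # simple counter for simulated instance IDs


def launch_instance():
    """Simulate launching a fresh, healthy instance."""
    return Instance(f"i-sim-{next(_next_id):04d}")


def reconcile(instances, desired_capacity):
    """Return a group with exactly desired_capacity healthy instances."""
    # Keep only the instances that pass the health check.
    healthy = [inst for inst in instances if inst.healthy]
    # Trim the surplus if the group is larger than desired.
    healthy = healthy[:desired_capacity]
    # Launch replacements until the group is back at the desired size.
    while len(healthy) < desired_capacity:
        healthy.append(launch_instance())
    return healthy


group = [Instance("a"), Instance("b", healthy=False), Instance("c")]
print([inst.instance_id for inst in reconcile(group, 3)])
```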
There are 3 major parameters:
Minimum Size, Desired Capacity, Maximum Size
Minimum Size -> It ensures that we always have at least this number of instances running. The minimum should be enough to meet the base application load, but not so high that it leaves too much unused capacity.
Desired Capacity -> This is the number of instances we ideally want to run. The Autoscaling group tries to maintain this capacity.
Maximum Size -> It is the maximum number of instances that can run in the Autoscaling group. The Autoscaling group never creates more instances than the maximum size.
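The relationship between the three parameters can be sketched as a simple clamp: whatever capacity is requested, the group stays between the minimum and maximum size. The function name `effective_capacity` is made up for this example.

```python
def effective_capacity(requested, min_size, max_size):
    """Clamp a requested capacity into the range [min_size, max_size]."""
    if not 0 <= min_size <= max_size:
        raise ValueError("expected 0 <= min_size <= max_size")
    return max(min_size, min(requested, max_size))


print(effective_capacity(12, min_size=2, max_size=8))  # prints 8
print(effective_capacity(0, min_size=2, max_size=8))   # prints 2
print(effective_capacity(5, min_size=2, max_size=8))   # prints 5
```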
These parameter values will determine the cost of running the cluster.
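As a rough illustration of how these values bound the cost, the sketch below multiplies the minimum and maximum size by an hourly instance price. The price of 0.10 per hour and the figure of 730 hours per month are assumptions for the example, not real AWS pricing.

```python
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def monthly_cost_bounds(min_size, max_size, hourly_price):
    """Return (lowest, highest) monthly cost if the group sat at min or max size."""
    lowest = min_size * hourly_price * HOURS_PER_MONTH
    highest = max_size * hourly_price * HOURS_PER_MONTH
    return lowest, highest


low, high = monthly_cost_bounds(min_size=2, max_size=8, hourly_price=0.10)
print(f"between {low:.2f} and {high:.2f} per month")
```

The actual bill falls somewhere between the two numbers, depending on how often the group scales up.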
That covers the basics of autoscaling. Besides EC2 instances, we can also scale services such as ECS, RDS, and DynamoDB. Read more about it in depth, including how to create and configure an Autoscaling group.
Reference: “Cloud Computing With AWS” by Pravin Mishra on Udemy