Scaling Dynamically: Mastering Auto-Scaling in the Cloud

Umar Mukthar
6 min read · Feb 15, 2024

Introduction

The ability to scale infrastructure dynamically lets an application absorb traffic spikes without paying for idle capacity the rest of the time. This article walks through auto-scaling on AWS: creating a launch template, configuring an Auto Scaling Group, attaching a scaling policy, and testing the result under load.

Understanding Auto-Scaling

Auto-scaling is a crucial component in cloud computing, allowing your system to automatically adjust resources based on demand. Whether it’s a sudden surge in traffic or a decrease in workload, auto-scaling adds or removes capacity to keep performance steady and costs in check. Its main benefits are:

  • Dynamic Resource Allocation
  • Seamless Performance Optimization
  • Maintaining Consistent User Experience
  • Cost-Efficiency in Real-Time
  • Enhancing System Reliability

Table of Contents

  1. Prerequisites
  2. Create a Launch Template
  3. Configure Auto Scaling Group
  4. Scaling Policies Configuration
  5. Simulating Load for Scale-in and Scale-out Testing
  6. Conclusion

1. Prerequisites

  1. Familiarity with cloud computing concepts and services, particularly AWS EC2 (Elastic Compute Cloud) and Auto Scaling Groups.
  2. Basic understanding of system administration and networking concepts, including configuring instances, setting up security groups, and managing VPCs (Virtual Private Clouds).

2. Create a Launch Template

  • Log in to your AWS Management Console.
  • In the services menu, select EC2 to access the EC2 Dashboard.
  • In the EC2 Dashboard, under Instances, select Launch Templates from the left navigation pane.
  • Click on the Create Launch Template button.
  • Provide a name and description for your Launch Template.
  • In the Version details section, you can optionally add a version description.
  • Choose Amazon Linux 2 as the Amazon Machine Image (AMI) for your instances. This is a reliable and commonly used choice for various applications.
  • Select the instance type that aligns with your application’s requirements. Consider factors such as CPU, memory, and network performance. For example, you might choose a type like t2.micro for basic testing or adjust to a different type based on your application’s needs.
  • Choose your existing key pair to enable secure instance access. This key pair will be associated with the instances, allowing you to connect securely using SSH.

Configure Network Settings

  • Set up the networking details for the template.
  • Select a security group that allows inbound HTTP (port 80) for the web server and SSH (port 22) for the load test later in this article.
  • Enable Auto-assign public IP.

Updating User Data for Instance Initialization

User data lets you run a script when an instance first boots, so every instance the group launches configures itself without manual steps. Here we use a short Bash script that installs and starts Apache and writes a web page identifying the instance.

Here’s a breakdown of the script:

#!/bin/bash

# Update installed packages
yum update -y

# Install the Apache HTTP server
yum install -y httpd

# Start Apache now and enable it at boot
systemctl start httpd
systemctl enable httpd

# Write a default page showing this instance's hostname and private IP.
# The $(...) substitutions are left unescaped so they are evaluated at boot;
# user data already runs as root, so sudo is not needed.
echo "<h1>Server Details</h1>
<p><strong>Hostname:</strong> $(hostname)</p>
<p><strong>IP Address:</strong> $(hostname -I | cut -d' ' -f1)</p>" > /var/www/html/index.html
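One quoting detail matters when a script like this writes command output into a page: inside double quotes, `$(...)` is expanded by the shell, while `\$(...)` is kept as literal text. A minimal sketch of the difference, using `uname -n` as a portable stand-in for `hostname` (an assumption made only for this illustration):

```shell
#!/bin/bash
# Escaped: the backslash stops expansion, so the literal text is kept.
escaped="<p>Hostname: \$(uname -n)</p>"

# Unescaped: the shell runs the command and substitutes its output.
expanded="<p>Hostname: $(uname -n)</p>"

echo "$escaped"
echo "$expanded"
```

If the served page shows the text `$(hostname)` instead of a real hostname, an escaped substitution is the likely cause.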
  • Review all the details you’ve configured for your Launch Template.
  • Click on Create Launch Template to finalize and create the Launch Template.
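The same launch template can also be created from the AWS CLI. The sketch below is one possible way to do it, not the article's console procedure: the template name, AMI ID, key pair name, and security group ID are placeholders, and the user data is shortened. The command is printed rather than executed so the sketch runs without AWS credentials.

```shell
#!/bin/bash
# Placeholder values (assumptions); replace them with your own.
TEMPLATE_NAME="web-asg-template"
AMI_ID="ami-xxxxxxxxxxxxxxxxx"          # an Amazon Linux 2 AMI in your region
INSTANCE_TYPE="t2.micro"
KEY_NAME="my-key-pair"
SECURITY_GROUP_ID="sg-xxxxxxxxxxxxxxxxx"

# Launch template data expects user data to be base64-encoded.
USER_DATA_B64=$(printf '%s\n' '#!/bin/bash' 'yum install -y httpd' \
  'systemctl start httpd' 'systemctl enable httpd' | base64 | tr -d '\n')

# Build the launch template data as JSON.
TEMPLATE_DATA=$(cat <<EOF
{
  "ImageId": "$AMI_ID",
  "InstanceType": "$INSTANCE_TYPE",
  "KeyName": "$KEY_NAME",
  "NetworkInterfaces": [
    {
      "DeviceIndex": 0,
      "AssociatePublicIpAddress": true,
      "Groups": ["$SECURITY_GROUP_ID"]
    }
  ],
  "UserData": "$USER_DATA_B64"
}
EOF
)

# Print the command instead of running it; remove "echo" to execute.
echo aws ec2 create-launch-template \
  --launch-template-name "$TEMPLATE_NAME" \
  --launch-template-data "$TEMPLATE_DATA"
```

Keeping the template in a script like this makes it easier to recreate the setup in another region or account.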

3. Configure Auto Scaling Group

  • Navigate to the AWS EC2 console.
  • Under Auto Scaling in the navigation pane, select Auto Scaling Groups.

Click the “Create Auto Scaling Group” button.

  • Provide a name and select the launch template you created earlier. The template defines the specifications for the instances that the Auto Scaling Group will launch.
  • Hit Next.

Configure Network Settings

  • Choose the Virtual Private Cloud (VPC) for the Auto Scaling Group.
  • Specify the subnets within the chosen VPC where your instances will be launched.
  • Distribute instances across multiple Availability Zones for high availability.

Hit Next

Load Balancing info

  • Select the option to attach to an existing load balancer. This assumes a load balancer and target group have already been created.
  • Under the existing load balancer target groups option, choose the required target group.

Hit Next

Health Check

Grace Period

  • Set a grace period to allow newly launched instances time to initialize and become fully operational before health checks begin.
  • The grace period is essential to prevent premature termination of instances that may still be initializing and not yet ready to handle traffic.

Hit Next

Set Group Size

  • Define the minimum and maximum number of instances in the group.
  • Minimum Instances: The lowest number of instances the Auto Scaling Group should maintain, ensuring basic availability.
  • Maximum Instances: The upper limit of instances the Auto Scaling Group can scale to, preventing excessive resource consumption.
  • For this walkthrough, set the desired capacity to 1, the minimum to 1, and the maximum to 5.
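The group settings from this section (launch template, subnets, target group, grace period, and a size of minimum 1, maximum 5, desired 1) map onto a single AWS CLI call. This is a sketch under assumptions: the group name, template name, subnet IDs, target group ARN, and the 300-second grace period are all placeholders. The command is printed rather than executed so it can be tried without AWS credentials.

```shell
#!/bin/bash
# Placeholder names and IDs (assumptions); replace them with your own.
ASG_NAME="web-asg"
TEMPLATE_NAME="web-asg-template"
SUBNETS="subnet-aaaaaaaa,subnet-bbbbbbbb"   # subnets in different Availability Zones
TARGET_GROUP_ARN="arn:aws:elasticloadbalancing:REGION:ACCOUNT_ID:targetgroup/NAME/ID"
MIN_SIZE=1
MAX_SIZE=5
DESIRED=1
GRACE_PERIOD=300   # seconds; an assumed value

# Check that min <= desired <= max before calling AWS.
valid_sizes() {
  [ "$1" -le "$2" ] && [ "$2" -le "$3" ]
}

if valid_sizes "$MIN_SIZE" "$DESIRED" "$MAX_SIZE"; then
  # Print the command instead of running it; remove "echo" to execute.
  echo aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name "$ASG_NAME" \
    --launch-template "LaunchTemplateName=$TEMPLATE_NAME,Version=\$Latest" \
    --min-size "$MIN_SIZE" --max-size "$MAX_SIZE" --desired-capacity "$DESIRED" \
    --vpc-zone-identifier "$SUBNETS" \
    --target-group-arns "$TARGET_GROUP_ARN" \
    --health-check-type ELB \
    --health-check-grace-period "$GRACE_PERIOD"
else
  echo "Invalid sizes: need min <= desired <= max" >&2
fi
```

The size check is a small guard added for this sketch; the service rejects inconsistent sizes as well, but failing early gives a clearer message.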

4. Scaling Policies Configuration

  • Choose Target tracking scaling policy.
  • Select a metric and a target value; the group adds or removes instances to keep the metric near that target.
  • CPU Utilization Example (metric: average CPU utilization, target value: 50):
  • If average CPU utilization across the group rises above 50%, the policy scales out by launching additional instances.
  • If average CPU utilization stays below the target, the policy scales in by terminating instances, down to the minimum size.
  • Save the changes.
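A target tracking policy on average CPU with a 50% target can be expressed as a small JSON configuration and attached with the AWS CLI. In this sketch the group name and policy name are placeholders, and the command is printed rather than executed so it runs without AWS credentials.

```shell
#!/bin/bash
# Placeholder names (assumptions); replace them with your own.
ASG_NAME="web-asg"
POLICY_NAME="cpu-target-50"
CONFIG_FILE="$(mktemp)"

# Target tracking configuration: keep average CPU across the group near 50%.
cat > "$CONFIG_FILE" <<'EOF'
{
  "TargetValue": 50.0,
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "ASGAverageCPUUtilization"
  }
}
EOF

# Print the command instead of running it; remove "echo" to execute.
echo aws autoscaling put-scaling-policy \
  --auto-scaling-group-name "$ASG_NAME" \
  --policy-name "$POLICY_NAME" \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration "file://$CONFIG_FILE"
```

Note that the configuration holds a single target value rather than separate scale-out and scale-in thresholds.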

Instance maintenance policy

  • Choose mixed behavior

Hit Next

  • Review all the configurations, including the launch template, group size, scaling policies, and health checks.
  • Click Create Auto Scaling Group to finalize the setup.

After creating your Auto Scaling Group with the specified configurations, the system initiates the launch process for the desired number of instances. As the launch progresses, each instance undergoes a series of health checks to ensure it is fully operational before being incorporated into the group.

5. Simulating Load for Scale-in and Scale-out Testing

Before relying on an Auto Scaling Group in production, test how it responds to varying workloads. One effective way to do this is to use the stress tool to generate CPU load on an instance, which triggers a scale-out and, once the load stops, a scale-in. Let's walk through the process.

SSH into EC2 Instance

  • Use the following command to SSH into one of your EC2 instances:
ssh -i path/to/your/key.pem ec2-user@<instance-public-ip>

Install EPEL Repository

sudo amazon-linux-extras install epel -y

Install Stress Tool

sudo yum install stress -y

Simulate High CPU Load (Scale-Out)

stress --cpu 4 --timeout 300s

This command stresses the CPU with 4 workers for 300 seconds.
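The worker count does not have to be hard-coded. As a small sketch, the number of workers can be derived from the instance's vCPU count with `nproc`; the fallback to 1 and the variable names are assumptions of this example. The command is printed rather than executed.

```shell
#!/bin/bash
# Use one stress worker per vCPU; fall back to 1 if nproc is unavailable.
WORKERS=$(nproc 2>/dev/null || echo 1)
DURATION="300s"

# Print the command instead of running it; remove "echo" to execute.
echo stress --cpu "$WORKERS" --timeout "$DURATION"
```

If no scale-out happens within the window, a longer timeout may be needed, since CloudWatch metrics reach the scaling policy with some delay.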

Observe Auto Scaling Group Actions

  • Monitor the AWS Management Console or CloudWatch metrics to observe the Auto Scaling Group reacting to increased CPU load.
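  • Verify that additional instances are launched to handle the simulated load.
  • Once the stress test completes or is stopped, average CPU drops below the target and the group gradually scales in, returning to the minimum of 1 instance. Scale-in is deliberately slower than scale-out, so allow several minutes.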
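To get a feel for how many instances to expect during the test, a rough proportional estimate can help. The function below is a simplified model and an assumption of this sketch, not the exact algorithm AWS uses: it scales the current instance count by the ratio of average CPU to the target and clamps the result to the group's limits.

```shell
#!/bin/bash
# Rough estimate of the capacity a proportional rule would settle on:
#   ceil(current_instances * average_cpu / target_cpu), clamped to [min, max].
# This is a simplified model (an assumption), not the exact AWS algorithm.
estimate_capacity() {
  local current=$1 avg_cpu=$2 target=$3 min=$4 max=$5
  local wanted=$(( (current * avg_cpu + target - 1) / target ))   # integer ceiling
  if [ "$wanted" -lt "$min" ]; then wanted=$min; fi
  if [ "$wanted" -gt "$max" ]; then wanted=$max; fi
  echo "$wanted"
}

estimate_capacity 1 100 50 1 5   # one instance pinned at 100% CPU
estimate_capacity 4 100 50 1 5   # demand beyond the maximum
estimate_capacity 3 10 50 1 5    # load has stopped
```

With a 50% target, this prints 2, 5, and 1. Because the stress tool loads only one instance, the group average falls as new instances join, so a single stressed instance may settle at around two instances rather than climbing to the maximum.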

6. Conclusion

Auto-scaling lets your infrastructure adjust to demand, balancing performance against cost. In this article we created a launch template, configured an Auto Scaling Group across multiple Availability Zones, attached a target tracking policy, and verified scale-out and scale-in with a simulated CPU load. The same steps form a foundation you can adapt to your own application's instance types, limits, and metrics.

Stay tuned for more. Let’s connect on LinkedIn and explore my GitHub for future insights.

Umar Mukthar

Experienced AWS Cloud Engineer & Freelance Medium Blogger. Passionate about crafting robust cloud solutions & sharing insights in tech.