Scale Down EC2 Container Instances in ECS

AWS ECS in Brief

This article explains a sample script that scales down an ECS cluster in a cost-efficient way using the AWS SDK for Python.

Amazon ECS (EC2 Container Service) is a container management service that runs Docker containers while providing high scalability and high performance. The default container orchestration that comes with ECS takes care of everything related to container management, so you do not have to worry about it. You can define a static number of instances for the cluster.

But a static cluster is not very cost friendly, because you pay for resources even when they are not in use. How can we overcome this? AWS provides Auto Scaling groups, which can be attached to the cluster so that we can define how it scales out and scales in.

Problem with Auto Scaling Group Scaling Down

An Auto Scaling group can be attached to the ECS cluster, and we can control the conditions under which it scales. But it has its own limitations. The cluster can be scaled on the following metrics only, and they apply to the cluster as a whole, not to a single node in it.

  1. CPU reservation
  2. Memory reservation
  3. CPU utilization
  4. Memory utilization

How do you use these conditions in auto scaling? As an example, you can configure the cluster to scale out by 1 instance if the average CPU utilization stays above 95% for 5 consecutive minutes.
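As a rough illustration of such a condition, the sketch below builds the parameters for a CloudWatch alarm on the cluster-level `CPUUtilization` metric in the `AWS/ECS` namespace. The function name, the alarm naming scheme, and the scaling policy ARN are hypothetical placeholders. The resulting dictionary could be passed to boto3's `put_metric_alarm`, but nothing here calls AWS.

```python
def cpu_scale_out_alarm_params(cluster_name, scale_out_policy_arn):
    """Build alarm parameters for: average CPU above 95% for 5 minutes."""
    return {
        "AlarmName": cluster_name + "-cpu-high",  # hypothetical naming scheme
        "Namespace": "AWS/ECS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "ClusterName", "Value": cluster_name}],
        "Statistic": "Average",
        "Period": 60,            # each data point covers 60 seconds
        "EvaluationPeriods": 5,  # 5 consecutive periods = 5 minutes
        "Threshold": 95.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [scale_out_policy_arn],
    }

# The ARN below is a placeholder, not a real scaling policy.
params = cpu_scale_out_alarm_params("ECS-Cluster", "arn:aws:autoscaling:example")
# With a configured session this could be applied as:
# session.client("cloudwatch").put_metric_alarm(**params)
```

The other three metrics would presumably follow the same shape with a different `MetricName`.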

Scaling out caused no problems. But when we scaled in, we came across a few difficulties. For example, instances were shut down while containers were still running on them. The other major issue was that even after an instance was shut down, the cluster might scale out again within a few minutes. If this keeps happening in a loop, you pay unnecessarily for AWS resources. At the time we were testing this, AWS billed instances by the hour, so an instance that ran for only a few minutes still cost a full hour.

How to overcome this…

Therefore, we wanted to combine something more with these conditions, for better yet economical scaling. For example, shut down an instance only if it is close to its next billing hour and no containers are running on that EC2 instance.
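To make the billing-hour idea concrete, here is a minimal sketch of the time arithmetic, assuming hourly billing counted from the moment the instance registered. The function names and the 15-minute default are illustrative choices, not part of any AWS API.

```python
import datetime
import math

def next_billing_hour(registered_at, now):
    """Return the moment the billing hour containing `now` ends."""
    elapsed_hours = (now - registered_at).total_seconds() / 3600.0
    return registered_at + datetime.timedelta(hours=int(math.ceil(elapsed_hours)))

def in_termination_window(registered_at, now, lead_minutes=15):
    """True when `now` is within `lead_minutes` of the next billing hour."""
    threshold = next_billing_hour(registered_at, now) - datetime.timedelta(minutes=lead_minutes)
    return now > threshold

registered = datetime.datetime(2017, 1, 1, 10, 20)
print(next_billing_hour(registered, datetime.datetime(2017, 1, 1, 12, 10)))      # 2017-01-01 12:20:00
print(in_termination_window(registered, datetime.datetime(2017, 1, 1, 12, 10)))  # True
print(in_termination_window(registered, datetime.datetime(2017, 1, 1, 11, 0)))   # False
```

An instance registered at 10:20 is billed in hours ending at 11:20, 12:20, and so on, so at 12:10 it is inside the last 15 minutes of a paid hour and is a reasonable candidate for termination.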

What we did was write a custom script that runs as a cron job, so that it runs periodically and scales in if necessary. But how can a custom script handle AWS resources? There is an AWS SDK for Python called boto3, and it helped to solve the problem.

I am not going to describe how to write boto3 scripts in Python here; I will cover that in another article. Instead, I will describe the boto3 script I wrote for scaling down.

The script is shown below. I have commented the code to make it clear as you go through it.

#---------------Auto Scaling Down Instances When There Are No Running Containers---------------------
import boto3
import datetime
import math
import time

#---This method returns the Unix time for the given timestamp
def unix_time(time_stamp):
    return time.mktime(time_stamp.timetuple())

#---This method scales down the cluster
#---It needs the container instance and the current container instance count
#---It returns the container instance count after any termination
def scale_down(containerInstance, container_instances_count):
    #Here "containerInstance" is an EC2 instance in the cluster
    print("\nContainer Instance ID: " + containerInstance['ec2InstanceId'])
    #Check that the instance has no running or pending tasks and that the cluster has more than 1 instance
    #Running tasks = running containers; pending tasks = containers queued to run on this instance
    if (containerInstance['runningTasksCount'] == 0 and containerInstance['pendingTasksCount'] == 0 and container_instances_count > 1):
        #registration time of the container instance
        #This is the start time of the EC2 instance
        time_reg = containerInstance['registeredAt']
        #current time, in the same timezone as the registration time
        #(boto3 returns 'registeredAt' as a timezone-aware datetime)
        current_time = datetime.datetime.now(time_reg.tzinfo)
        #next billing hour of the instance
        #current_time - registered time gives the time difference. Dividing it by 60*60 gives the difference in hours
        #Eg: hours_difference could be 0.5, 1.3, 2.5 etc.
        hours_difference = (unix_time(current_time) - unix_time(time_reg))/(60*60)
        #next billing hour = registered time + hours difference rounded up
        next_billing_hour = time_reg + datetime.timedelta(hours=math.ceil(hours_difference))
        print("Next billing hour begins: %s" % next_billing_hour)
        #check if the current time is past 45 minutes of the current billing hour
        #You can edit this by changing the value 15 below to any value you like.
        #This value actually depends on how often the cron job runs
        threshold_time = next_billing_hour - datetime.timedelta(minutes=15)
        print("Threshold time to kill: %s" % threshold_time)
        print("Current time: %s" % current_time)
        #check if the current time is past the threshold time for killing the instance
        if unix_time(threshold_time) < unix_time(current_time):
            #Terminate the instance; the number of available container instances decreases by 1
            print("Terminating instance " + containerInstance['ec2InstanceId'])
            #auto scaling group API called to terminate the instance by providing the instance ID
            #Desired capacity should be decremented after terminating the instance, hence ShouldDecrementDesiredCapacity=True
            asgClient.terminate_instance_in_auto_scaling_group(InstanceId=containerInstance['ec2InstanceId'], ShouldDecrementDesiredCapacity=True)
            container_instances_count -= 1
            print("Size of the cluster after termination %s\n" % container_instances_count)
    #if there are running/pending containers inside the instance
    else:
        print("Running Containers {} \nPending tasks {} \nCluster size {} \n ".format(
            containerInstance['runningTasksCount'], containerInstance['pendingTasksCount'], container_instances_count))
    return container_instances_count

#choose the AWS profile used to access the resources
#this profile is taken from the AWS CLI configuration on your machine
session = boto3.Session(profile_name='myAccount')
#ecs client and auto scaling group client generation
ecsClient = session.client(service_name='ecs')
asgClient = session.client(service_name='autoscaling')
#list container instances of the cluster
#you will have to provide the cluster name here. eg: ECS-Cluster
clusterListResp = ecsClient.list_container_instances(cluster='ECS-Cluster')
#details of EC2 container instances
containerDetails = ecsClient.describe_container_instances(cluster='ECS-Cluster', containerInstances=clusterListResp['containerInstanceArns'])
#Get the instance count in the cluster
container_instances_count = len(containerDetails['containerInstances'])
#loop through every instance to check if it should be terminated
#keep the updated count so that at least one instance always stays in the cluster
for containerInstance in containerDetails['containerInstances']:
    container_instances_count = scale_down(containerInstance, container_instances_count)
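The decision itself can also be tried without touching AWS. The sketch below is a hypothetical dry-run helper, not part of the script above: it takes dictionaries shaped like the `containerInstances` entries returned by `describe_container_instances` and returns the instance IDs it would terminate, always leaving at least one instance in the cluster. It uses a modulo on the elapsed seconds as a roughly equivalent way to find the position within the current billing hour; the sample data is made up.

```python
import datetime

def select_instances_to_terminate(container_instances, now, lead_minutes=15):
    """Return IDs of idle instances in the last `lead_minutes` of a billing hour."""
    remaining = len(container_instances)
    selected = []
    for instance in container_instances:
        idle = instance['runningTasksCount'] == 0 and instance['pendingTasksCount'] == 0
        if not idle or remaining <= 1:
            continue  # busy instance, or it is the last one left in the cluster
        elapsed = (now - instance['registeredAt']).total_seconds()
        seconds_into_hour = elapsed % 3600
        if seconds_into_hour > (60 - lead_minutes) * 60:
            selected.append(instance['ec2InstanceId'])
            remaining -= 1
    return selected

# Made-up sample data for illustration only
now = datetime.datetime(2017, 1, 1, 12, 50)
sample = [
    {'ec2InstanceId': 'i-aaa', 'runningTasksCount': 0, 'pendingTasksCount': 0,
     'registeredAt': datetime.datetime(2017, 1, 1, 10, 0)},
    {'ec2InstanceId': 'i-bbb', 'runningTasksCount': 2, 'pendingTasksCount': 0,
     'registeredAt': datetime.datetime(2017, 1, 1, 10, 0)},
    {'ec2InstanceId': 'i-ccc', 'runningTasksCount': 0, 'pendingTasksCount': 0,
     'registeredAt': datetime.datetime(2017, 1, 1, 12, 30)},
]
print(select_instances_to_terminate(sample, now))  # ['i-aaa']
```

Here `i-aaa` is idle and 50 minutes into its billing hour, `i-bbb` is still running containers, and `i-ccc` is idle but only 20 minutes into a paid hour, so only `i-aaa` is selected.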

The code is also available on GitHub.

You can simply run this by


To make it more efficient and to get the full benefit from it, you can set this script to run periodically as a cron job.

I hope this article helped you in some way to write a cost-saving scale-down policy with Python and boto3. Hope you enjoyed it.