💸 Deploy low cost ECS tasks based on SQS queue size with AWS CDK

In this post we use AWS CDK to build a system that scales up and down based on the number of messages in an SQS queue. Users make REST API calls to an Amazon API Gateway endpoint from their applications or computers, which adds a new item to the SQS queue. In turn, this triggers your task on ECS. After your task is finished, it deletes the item from the SQS queue, which automatically scales down your ECS cluster and task.

Voila, a low cost autoscaling solution for your high intensity compute jobs.

When you have finished this post, you will have built the following:

  • A CDK project that deploys your resources to AWS
  • A deployment step that bundles your Python code into a Docker image and pushes it to ECR
  • An SQS queue that spins up ECS tasks based on the number of messages in the queue
  • An API Gateway endpoint that users can call to put new messages in the queue
  • Python code that prints a Hello World message

The above items result in the following architecture:

Example architecture of what we are building

This post assumes basic knowledge of the AWS platform and of Docker-based development. We will not describe in detail how Docker or ECS work. Because of that, we advise:

Compute-heavy tasks like automated video generation at minimal cost

This architecture lets users trigger compute-heavy tasks without resources having to run continuously. Generating videos, for example, takes a lot of computing resources, and keeping that running all the time becomes very expensive very fast. With this setup, users can just trigger an endpoint and forget about it until the job is done.

🧠 Tutorial

In this chapter we are walking through the different steps required for creating this architecture.

The very first thing we will need to do is generate a new CDK project for us to work in. We will use the CDK sample application for now, to get a head start.

To begin, open a terminal window wherever you want to store your project and create a new folder in which you generate a new CDK project:

mkdir docker-ecs-sqs-auto-scaling
cd docker-ecs-sqs-auto-scaling
cdk init sample-app --language typescript

You will notice that an SQS queue is already defined in your CDK application under ~/docker-ecs-sqs-auto-scaling/lib/ (in the generated stack file, for example cdk-demo-stack.ts; the exact name depends on your project folder), which we can use.

The sample also adds an SNS topic, which you can remove for now. Keep only the queue, defined as follows. The alarm that monitors the number of available messages in the queue is added in the final step, together with the scaling rules:

const messageQueue = new sqs.Queue(this, 'DockerEcsSqsAutoScalingQueue', {
  visibilityTimeout: cdk.Duration.seconds(300)
});

The CDK code that will deploy our SQS queue

As always, verify your changes by deploying your resources with cdk deploy.

The API Gateway will be the entry point for our users to add messages to the SQS queue. You can add multiple destinations for your API Gateway but in our case we want it to direct requests to our SQS queue.

In order for us to create an API Gateway, we need to have the following:

  1. An IAM role that can be assumed by the API Gateway
  2. An inline policy, attached to the role above, that allows API Gateway to send messages to our SQS queue
  3. The REST API
  4. A GET method on the API that passes the message to the SQS queue (i.e. the endpoint)

When you have CDK’s help on your side, the above becomes the following:

// import * as iam from "@aws-cdk/aws-iam";
// import * as apigateway from "@aws-cdk/aws-apigateway";
// import the above in the top of your file

const credentialsRole = new iam.Role(this, "Role", {
  assumedBy: new iam.ServicePrincipal("apigateway.amazonaws.com"),
});

credentialsRole.attachInlinePolicy(
  new iam.Policy(this, "SendMessagePolicy", {
    statements: [
      new iam.PolicyStatement({
        actions: ["sqs:SendMessage"],
        effect: iam.Effect.ALLOW,
        resources: [messageQueue.queueArn],
      }),
    ],
  })
);

const api = new apigateway.RestApi(this, "Endpoint", {
  deployOptions: {
    stageName: "run",
    tracingEnabled: true,
  },
});

const queue = api.root.addResource("queue");
queue.addMethod(
  "GET",
  new apigateway.AwsIntegration({
    service: "sqs",
    path: `${cdk.Aws.ACCOUNT_ID}/${messageQueue.queueName}`,
    integrationHttpMethod: "POST",
    options: {
      credentialsRole,
      passthroughBehavior: apigateway.PassthroughBehavior.NEVER,
      requestParameters: {
        "integration.request.header.Content-Type": `'application/x-www-form-urlencoded'`,
      },
      requestTemplates: {
        "application/json": `Action=SendMessage&MessageBody=$util.urlEncode("$method.request.querystring.message")`,
      },
      integrationResponses: [
        {
          statusCode: "200",
          responseTemplates: {
            "application/json": `{"done": true}`,
          },
        },
      ],
    },
  }),
  { methodResponses: [{ statusCode: "200" }] }
);

As always, verify your changes by deploying your resources with cdk deploy.
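
To make the request template above more concrete, here is a minimal Python sketch of the form-encoded body that the integration would send to SQS for a given message query parameter. The function name build_send_message_body is hypothetical, and quote_plus is only assumed to approximate what $util.urlEncode produces in the mapping template.

```python
from urllib.parse import quote_plus


def build_send_message_body(message: str) -> str:
    """Approximate the body produced by the API Gateway request template."""
    # Assumption: quote_plus stands in for $util.urlEncode in the template.
    return f"Action=SendMessage&MessageBody={quote_plus(message)}"


if __name__ == "__main__":
    # Prints the body for an example message taken from the query string.
    print(build_send_message_body("render video 42"))
```

The point of the encoding is that characters such as & and = inside the message cannot break out of the MessageBody field.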

Now that we have created the SQS queue and the API Gateway that feeds it, we need to create the cluster that hosts our tasks. The queue size is what we will later use to scale our ECS service up and down.

Unfortunately, it isn’t as simple as just creating the cluster, as we also need some extra foundational AWS resources. For example, for tasks in the private subnets to pull the Docker image from ECR (the repository) they need a route out of the VPC, which here is provided by NAT (a small NAT instance in the code below). This is one of the parts of this solution that will cost money continuously throughout the month.

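// import * as ec2 from "@aws-cdk/aws-ec2";
// import * as ecs from "@aws-cdk/aws-ecs";
// import the above in the top of your file

// Use a small NAT instance instead of a managed NAT gateway
const natGatewayProvider = ec2.NatProvider.instance({
  instanceType: new ec2.InstanceType("t3.nano"),
});

const vpc = new ec2.Vpc(this, "FargateVPC", {
  natGatewayProvider,
  natGateways: 1,
});

const cluster = new ecs.Cluster(this, "Cluster", { vpc });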

As always, verify your changes by deploying your resources with cdk deploy.

You’re doing great and this setup is almost finished already. It’s one of the great advantages of CDK that you don’t need that many lines of code to declare your resources.

In order for us to deploy our code on the ECS cluster we will need to create a Task Definition which contains the information for our task that will run on ECS. This contains configuration for the regular things like cpu and memory limits, but also which containers to run from which registry. Lastly, we will add a Service to the cluster which in turn contains the Task Definition.

// Create a task role that will be used within the container
const EcsTaskRole = new iam.Role(this, "EcsTaskRole", {
  assumedBy: new iam.ServicePrincipal("ecs-tasks.amazonaws.com"),
});

EcsTaskRole.attachInlinePolicy(
  new iam.Policy(this, "SQSAdminAccess", {
    statements: [
      new iam.PolicyStatement({
        actions: ["sqs:*"],
        effect: iam.Effect.ALLOW,
        resources: [messageQueue.queueArn],
      }),
    ],
  })
);

// Create task definition
const fargateTaskDefinition = new ecs.FargateTaskDefinition(
  this,
  "FargateTaskDef",
  {
    memoryLimitMiB: 4096,
    cpu: 2048,
    taskRole: EcsTaskRole
  }
);

// create a task definition with CloudWatch Logs
const logging = new ecs.AwsLogDriver({
  streamPrefix: "myapp",
});

// Create container from local `Dockerfile`
const appContainer = fargateTaskDefinition.addContainer("Container", {
  image: ecs.ContainerImage.fromAsset("./python-app", {}),
  logging,
});

// Create service
const service = new ecs.FargateService(this, "Service", {
  cluster,
  taskDefinition: fargateTaskDefinition,
  desiredCount: 0,
});

As always, verify your changes by deploying your resources with cdk deploy.
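
If you change the cpu and memoryLimitMiB values in the task definition above, keep in mind that Fargate only accepts certain combinations. The following Python sketch checks a pair against a table of common sizes. The table and the helper name is_valid_fargate_size are assumptions made for illustration, and the table is not exhaustive, so check the Fargate documentation for the current list.

```python
# Memory options (MiB) per CPU value for common Fargate task sizes.
# Assumption: illustrative and not exhaustive; larger sizes also exist.
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu: int, memory_mib: int) -> bool:
    """Return True if the cpu/memory pair appears in the table above."""
    return memory_mib in FARGATE_MEMORY_MIB.get(cpu, [])


if __name__ == "__main__":
    # The values used in this post: cpu 2048 with 4096 MiB of memory.
    print(is_valid_fargate_size(2048, 4096))
```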

The ECS Task Definition above expects a folder called ./python-app that contains our Dockerfile. This Docker image is what will get deployed as our ECS task, so let's create it:

mkdir python-app
cd python-app
touch Dockerfile
touch app.py
touch requirements.txt

Open up the Dockerfile in your favorite IDE and let’s add some contents to it. The only thing we need is a simple Python image that runs one file, so we can keep it short:

FROM python:3.6

USER root

WORKDIR /app

ADD . /app

RUN pip install --trusted-host pypi.python.org -r requirements.txt

CMD ["python", "app.py"]

Next, you can fill the app.py file in this folder with anything you like.

If you want the easiest approach, a simple print("Hello World.") is enough for now. Alternatively, you can read the messages from SQS and delete them from the queue once they are processed.

If you ever need extra dependencies, add them to the requirements.txt file; with the first option we don’t have any. As an example of the alternative option, you can use the following app.py (and add boto3 to your requirements.txt):

import boto3

sqs = boto3.client('sqs')

# Add your SQS queue URL from the AWS console here
queue_url = 'https://sqs.eu-central-1.amazonaws.com/.....'


def read_sqs():
    # Fetch up to 10 messages, waiting up to 10 seconds for them to arrive
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=10
    )
    # The 'Messages' key is absent when no messages were received
    return response.get('Messages', [])


def delete_sqs_message(receipt_handle):
    print(f"Deleting message {receipt_handle}")
    # Delete received message from queue
    sqs.delete_message(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle
    )


# Read SQS
messages = read_sqs()
print(f"Found messages {messages}")

for message in messages:
    # Take custom actions based on the message contents
    print(f"Activating {message}")
    print("Said Hello")

    # Delete Message
    delete_sqs_message(message['ReceiptHandle'])
    print(f"Finished for {message}")
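
Instead of pasting the queue URL into app.py, the container could read it from an environment variable. The sketch below assumes a hypothetical variable named QUEUE_URL, which this post does not set up; you would have to pass it to the container yourself, for example through the environment option of addContainer in the CDK stack. The helper name resolve_queue_url is likewise made up for illustration.

```python
import os
from typing import Mapping


def resolve_queue_url(env: Mapping[str, str] = os.environ) -> str:
    """Return the queue URL from the environment, failing early if missing."""
    # QUEUE_URL is a hypothetical variable name chosen for this sketch.
    queue_url = env.get("QUEUE_URL", "").strip()
    if not queue_url:
        raise RuntimeError("QUEUE_URL is not set")
    return queue_url


if __name__ == "__main__":
    # Pass an explicit mapping here so the example runs without any setup.
    print(resolve_queue_url({"QUEUE_URL": "https://sqs.example.invalid/queue"}))
```

Failing at startup makes a missing configuration value show up in the task logs right away, rather than as an error on the first SQS call.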

In this final step we will add the CDK code that creates the step scaling of your ECS tasks based on the SQS queue size. We will create both a scale-out rule (increasing the number of tasks) and a scale-in rule (decreasing the number of tasks) for our CloudWatch alarms.

To do so, we create two scaling steps for our ECS service in a policy named QueueMessagesVisibleScaling, driven by the ApproximateNumberOfMessagesVisible metric of the queue created above. It is actually really simple:

// import * as autoscaling from "@aws-cdk/aws-applicationautoscaling";
// import the above in the top of your file

// Configure task auto-scaling
const scaling = service.autoScaleTaskCount({
  minCapacity: 0,
  maxCapacity: 1,
});

// Setup scaling metric and cooldown period
scaling.scaleOnMetric("QueueMessagesVisibleScaling", {
  metric: messageQueue.metricApproximateNumberOfMessagesVisible(),
  adjustmentType: autoscaling.AdjustmentType.CHANGE_IN_CAPACITY,
  cooldown: cdk.Duration.seconds(300),
  scalingSteps: [
    { upper: 0, change: -1 },
    { lower: 1, change: +1 },
  ],
});
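
To reason about what these two steps do, here is a simplified Python model of the policy: it applies the step that matches the number of visible messages and clamps the result between minCapacity and maxCapacity. The function name next_task_count is hypothetical, and the model deliberately ignores the cooldown and the alarm evaluation periods, so treat it as a sketch of the intent rather than of the exact AWS behavior.

```python
def next_task_count(current: int, visible_messages: float,
                    min_capacity: int = 0, max_capacity: int = 1) -> int:
    """Apply the matching scaling step and clamp to the capacity limits."""
    if visible_messages <= 0:
        change = -1  # step { upper: 0, change: -1 }
    elif visible_messages >= 1:
        change = 1   # step { lower: 1, change: +1 }
    else:
        change = 0   # between the two steps: leave the count unchanged
    return max(min_capacity, min(max_capacity, current + change))


if __name__ == "__main__":
    # One message arrives while no task is running.
    print(next_task_count(current=0, visible_messages=1))
```

With maxCapacity set to 1, extra messages do not start extra tasks; raise it if you want several tasks to work through the queue in parallel.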

And that is it, all done! For the very last time, verify your changes by running cdk deploy in your terminal to deploy your latest resources to the cloud.

Now that all the infrastructure is in place, it is time to call the API Gateway to see the magic in action. You can find your API Gateway endpoint URL either in your terminal after running cdk deploy or in the AWS console under the API Gateway service.

CDK showing you the API Gateway endpoint

With that, you can add a message to your SQS queue in the following way:

https://thmjsgwe1l.execute-api.eu-central-1.amazonaws.com/run/queue?message=test
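
If you would rather trigger the endpoint from code than from a browser, a minimal Python sketch could look like this. The helper names build_trigger_url and trigger_task are hypothetical, and the stage URL in the example is a placeholder that you would replace with the endpoint printed by cdk deploy.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_trigger_url(stage_url: str, message: str) -> str:
    """Build the /queue URL with the message as an encoded query parameter."""
    return f"{stage_url.rstrip('/')}/queue?{urlencode({'message': message})}"


def trigger_task(stage_url: str, message: str) -> str:
    """Send the GET request and return the response body as text."""
    with urlopen(build_trigger_url(stage_url, message), timeout=10) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Placeholder stage URL; only the URL is built here, no request is sent.
    print(build_trigger_url("https://example.invalid/run", "test"))
```

Calling trigger_task against your real endpoint should return the {"done": true} body defined in the integration response, assuming the deployment succeeded.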

After this, you should see the messages in your SQS queue increase on the AWS console to 1 (or more) available messages.

A message becomes available under SQS

In a couple of minutes, this should create an ECS task on your new ECS cluster.

A new task has been spun up and is getting to a running state

✅ Conclusion

I hope you’ve been able to follow along with this post and that you’ve now successfully deployed your resources. You can give that API Gateway URL to anyone for them to be able to trigger new ECS tasks on your cluster.

You can find the complete example in the GitHub repository linked here.

If you think this is interesting and want to build out this project a little more, you can do any of the following:

You can always find more solutions and builds on our not build to last blog.

Builder from the https://www.nbtl.blog
