Tagging EC2 instances created by AutoScaling Group with Lambda and CloudWatch

When using autoscaling, AWS doesn’t apply tags to all resources. Here is a simple solution using Lambda and CloudWatch Events.

Luca Tiozzo
Mar 24 · 5 min read

Tags on AWS are a very useful tool, commonly used to organize resources into groups. AWS provides some useful examples of tagging strategies. For this article, we will consider the tags we use for cost allocation:

  • Environment: the environment in which the resource is located (development, quality, etc.);
  • Automation: whether the resource should be turned off at night or during the weekend, when our dev team is not coding and the resource is therefore not needed;
  • Role: the name of the application or service the resource belongs to (for example, everything related to balancing, such as the AWS balancer, EC2 machines with NGINX, etc.), used to identify all the AWS resources needed for a given application.

Calculating marginal costs: By dividing resources by role, you can calculate the marginal cost of each GB of storage, of traffic, or of each new piece of content uploaded to the platform. AWS can provide cost reports using custom tags directly.

In this article, we will focus on the challenge and the benefits of using tags to perform marginal cost calculations.

Our goal

For dynamic resources, tagging is slightly more complex. In our architecture, a large part of the resources are used for containers. As explained in the previous article, we make extensive use of AutoScalingGroups to manage the EC2 machines on which the containers run.

AWS doesn’t propagate tags to all resources automatically

Why recommended solutions do not suit our needs

  • In this thread, AWS suggests converting our Launch Configurations to EC2 launch templates and then using them as the AutoScalingGroup launch template. Launch templates support tagging EBS volumes directly from the AutoScalingGroup, but they do not provide a solution for Elastic IPs and network interfaces.
  • In the AWS support answer, they recommended creating a custom init script that tags all resources that are not tagged automatically (the previous thread also recommended it as a workaround). This approach works, but it forces you to give each EC2 instance the permissions to tag itself, and you have to maintain the code run by the instances of each AutoScalingGroup, since there is no centralized place to change the configuration.

So we looked for an alternative based on two very versatile tools: Lambda functions and CloudWatch Events.

Our solution

The CloudWatch Event Rule uses the following event pattern:

{
  "source": [
    "aws.ec2"
  ],
  "detail-type": [
    "EC2 Instance State-change Notification"
  ],
  "detail": {
    "state": [
      "running"
    ]
  }
}
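As a rough illustration, a rule with this pattern could be registered from Python along these lines. This is only a sketch: the rule name `tag-asg-resources`, the target id `tagging-lambda`, and the Lambda ARN are hypothetical placeholders, and `events_client` is assumed to be a client created with `boto3.client("events")`. The Lambda function also needs a resource-based permission that lets events.amazonaws.com invoke it, which is not shown here.

```python
import json

# Same pattern as above: match every EC2 instance entering the "running" state.
EVENT_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["running"]},
}


def register_rule(events_client, lambda_arn, rule_name="tag-asg-resources"):
    """Create (or update) the rule and point it at the Lambda function.

    events_client is assumed to be boto3.client("events"); rule_name and
    lambda_arn are placeholders chosen for this example.
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EVENT_PATTERN),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "tagging-lambda", "Arn": lambda_arn}],
    )
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before pointing it at a real account.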

The CloudWatch Event Rule therefore fires whenever any EC2 instance enters the running state and sends the event to a Lambda function. The JSON-encoded event carries all the necessary information, including the instance-id of the instance that has just entered the running state. The following is an example of an event.

{
  "version": "0",
  "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "detail-type": "EC2 Instance State-change Notification",
  "source": "aws.ec2",
  "account": "xxxxxxxxxx",
  "time": "2019-05-22T13:21:37Z",
  "region": "eu-west-1",
  "resources": [
    "arn:aws:ec2:eu-west-1:xxxxxxxxxx:instance/i-xxxxxxxxxxxxx"
  ],
  "detail": {
    "instance-id": "i-xxxxxxxxxxxxx",
    "state": "running"
  }
}
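Inside the Lambda function, the first step is simply to pull the instance-id out of this event. A minimal sketch in Python follows; the helper name `instance_id_from_event` is ours for this example and is not part of any AWS API.

```python
def instance_id_from_event(event):
    """Return the instance id carried by an EC2 state-change event.

    Returns None for anything that is not a "running" notification from
    aws.ec2, so the caller can ignore unexpected events.
    """
    detail = event.get("detail", {})
    if event.get("source") != "aws.ec2" or detail.get("state") != "running":
        return None
    return detail.get("instance-id")


# The example event shown above, reduced to the fields the helper reads.
sample_event = {
    "detail-type": "EC2 Instance State-change Notification",
    "source": "aws.ec2",
    "region": "eu-west-1",
    "detail": {"instance-id": "i-xxxxxxxxxxxxx", "state": "running"},
}
instance_id = instance_id_from_event(sample_event)
```

The rule already filters on the running state, so the check inside the helper is only a safety net in case the function is invoked with a different event.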

With this information, Lambda performs the following operations (the AWS actions used are in parentheses):

  • Retrieve EC2 machine tags (DescribeInstances);
  • Retrieve the volume, network interfaces, and elastic IP information of the EC2 machine (DescribeVolumes, DescribeNetworkInterfaces, DescribeAddresses);
  • Check whether each of these resources already carries the instance's tags;
  • If tags are missing, tag the resource (CreateTags).
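The steps above can be sketched in Python as follows. This is an illustration under some assumptions, not our production code: `ec2` is assumed to be a client created with `boto3.client("ec2")`, the function names are hypothetical, pagination is ignored, and Elastic IPs are assumed to be VPC addresses (so they have an `AllocationId`). Tags whose key starts with `aws:` are skipped because that prefix is reserved and cannot be set through CreateTags.

```python
def user_tags(tags):
    """Convert an AWS tag list to a dict, dropping reserved "aws:" tags."""
    return {t["Key"]: t["Value"] for t in tags or [] if not t["Key"].startswith("aws:")}


def propagate_tags(ec2, instance_id):
    """Copy missing instance tags to its volumes, network interfaces and Elastic IPs.

    Returns {resource_id: [tags added]} so the caller can log what changed.
    """
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    wanted = user_tags(reservations[0]["Instances"][0].get("Tags"))

    attached = [{"Name": "attachment.instance-id", "Values": [instance_id]}]
    existing = {}  # resource id -> tags the resource already has
    for vol in ec2.describe_volumes(Filters=attached)["Volumes"]:
        existing[vol["VolumeId"]] = user_tags(vol.get("Tags"))
    for eni in ec2.describe_network_interfaces(Filters=attached)["NetworkInterfaces"]:
        # Network interfaces report their tags under "TagSet".
        existing[eni["NetworkInterfaceId"]] = user_tags(eni.get("TagSet"))
    addresses = ec2.describe_addresses(
        Filters=[{"Name": "instance-id", "Values": [instance_id]}]
    )["Addresses"]
    for addr in addresses:
        existing[addr["AllocationId"]] = user_tags(addr.get("Tags"))

    applied = {}
    for resource_id, current in existing.items():
        missing = [{"Key": k, "Value": v} for k, v in wanted.items() if k not in current]
        if missing:
            ec2.create_tags(Resources=[resource_id], Tags=missing)
            applied[resource_id] = missing
    return applied
```

Only tags that are absent from a resource are written, so running the function twice on the same instance makes no further CreateTags calls.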

The only problem we encountered was that the Lambda sometimes started before the AutoScalingGroup had propagated the tags to the instance. It was a rare occurrence, but when it happened the Lambda had nothing to copy and the run was useless. We therefore had to either wait for the instance to have tags or implement a retry mechanism.

Lambda, however, provides a built-in retry mechanism for failures and unhandled exceptions, although it is not currently configurable when the function is invoked by a CloudWatch event (with other services you can choose the maximum number of retries). At the time of writing, after the first failure Lambda retries twice more (with a delay of about one minute between attempts, in my tests), which is good enough for our purposes.
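One way to lean on this built-in retry is to let the function fail on purpose when the instance tags have not arrived yet. The sketch below assumes that "not ready" means "no tags other than the reserved `aws:` ones"; the exception name `TagsNotReadyError` and the function names are hypothetical, and `boto3` is imported inside the handler on the assumption that the code runs in the Lambda Python runtime.

```python
class TagsNotReadyError(Exception):
    """Raised when the instance has no user-defined tags yet."""


def require_instance_tags(ec2, instance_id):
    """Return the instance's user tags, or raise so that Lambda retries later."""
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    tags = reservations[0]["Instances"][0].get("Tags", [])
    usable = [t for t in tags if not t["Key"].startswith("aws:")]
    if not usable:
        # Letting the exception escape the handler marks the invocation as
        # failed, which triggers the retry behaviour described above.
        raise TagsNotReadyError(f"no user tags on {instance_id} yet")
    return usable


def handler(event, context):
    import boto3  # imported lazily; assumed to be provided by the Lambda runtime

    ec2 = boto3.client("ec2")
    tags = require_instance_tags(ec2, event["detail"]["instance-id"])
    # ... tag the volumes, network interfaces and Elastic IPs here ...
    return {"instance_tags": len(tags)}
```

A side effect of this choice is that an instance that legitimately has no tags causes a few failed invocations before Lambda gives up.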

Conclusions

Next steps

Written by Luca Tiozzo, DevOps Developer @ THRON

THRON tech blog

THRON’s tech blog . THRON is a Digital Asset Management and Product Information Management SaaS that features automatic content classification (ML), real-time content rendition and real-time data analysis to perform content recommendation. www.thron.com
