Auto tagging with AWS and Terraform

Hotu
Credera Engineering
Apr 21, 2022

In this step-by-step guide, I will show you how to use Terraform to automatically tag AWS resources for cost monitoring purposes.

Terraform 1.1.x is used throughout this tutorial, so some configuration may look different if you are using a different minor version.

Please note that anything in angled brackets (<>) will need to be populated with values for your desired outcome.

Pre-requisites

  • AWS account
  • Basic understanding of Terraform
  • Basic understanding of Python is beneficial but not essential — thank Google for that!

Overview

The idea behind auto tagging is that an AWS User/Role creates a resource, CloudTrail records the creation event, and an event rule matching that event triggers a Lambda to tag the resource for you.

This process is illustrated by the diagram below.

Process of auto-tagging

Getting started

To set up our Infrastructure as Code (IAC), we need to define our Terraform configuration, including:

  • Terraform version
  • Required provider
  • Provider configurations
  • AWS service role
  • S3 bucket
  • Cloudtrail
  • Lambda
    - function
    - permissions
  • Cloudwatch
    - log group
    - event rules
    - event targets

First, set up the provider. This tells Terraform which versions of Terraform itself and of the AWS provider it needs:

# provider.tf
terraform {
  required_version = ">= 0.14.9"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.4.0"
    }
  }
}

Next, create the assume-role (trust) policy for our IAM role. This will be used later on:

# iam_role_policy.tf
data "aws_iam_policy_document" "assume-role-policy" {
  statement {
    sid    = "LambdaAssumeRole"
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
  statement {
    sid    = "CloudwatchAssumeRole"
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["cloudwatch.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
  statement {
    sid    = "CloudtrailAssumeRole"
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
  statement {
    sid    = "EventsAssumeRole"
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["events.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
  statement {
    sid    = "S3AssumeRole"
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["s3.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
}

This policy lets the listed AWS services assume our role. The next step is to attach the policy you’ve created to an IAM role:

# iam_role.tf
resource "aws_iam_role" "auto-tagger-role" {
  name               = "auto-tagger-role"
  assume_role_policy = data.aws_iam_policy_document.assume-role-policy.json
}

We’ll also need a bucket to store our CloudTrail logs in:

# s3_bucket.tf
resource "aws_s3_bucket" "trail-bucket" {
  bucket        = "auto-tagger-trail-bucket"
  force_destroy = true
}

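Setting force_destroy to true saves time when destroying the infrastructure, because Terraform will delete the bucket even if it still contains log files. Otherwise, you will have to empty the bucket manually (for example through the GUI) before it can be destroyed.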

Create a policy for the bucket to allow Cloudtrail to log to the bucket:

# s3_bucket_policy.tf
data "aws_iam_policy_document" "bucket-policy" {
  statement {
    principals {
      type = "Service"
      identifiers = [
        "cloudtrail.amazonaws.com"
      ]
    }
    effect = "Allow"
    actions = [
      "s3:GetBucketAcl"
    ]
    resources = [aws_s3_bucket.trail-bucket.arn]
  }
  statement {
    principals {
      type = "Service"
      identifiers = [
        "cloudtrail.amazonaws.com"
      ]
    }
    effect = "Allow"
    actions = [
      "s3:PutObject"
    ]
    resources = [
      "${aws_s3_bucket.trail-bucket.arn}/auto-tagger-trail-logs/AWSLogs/*"
    ]
  }
}

Then, attach it to the bucket as follows:

# s3_bucket.tf
...
resource "aws_s3_bucket_policy" "trail-bucket-policy" {
  bucket = aws_s3_bucket.trail-bucket.id
  policy = data.aws_iam_policy_document.bucket-policy.json
}
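
CloudTrail writes its log files under keys of the form <prefix>/AWSLogs/<account-id>/CloudTrail/<region>/..., so the s3:PutObject resource in the bucket policy has to cover whatever key prefix the trail is configured with. The sketch below is one hedged way to sanity-check that before applying. The helper name, the sample account ID and region, and the use of fnmatch as a stand-in for IAM's wildcard matching are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def policy_covers_trail_prefix(policy_resource_arn, bucket_name, key_prefix,
                               account_id="123456789012"):
    """Return True if a PutObject resource ARN would match a CloudTrail log key.

    fnmatchcase is only an approximation of IAM's '*' and '?' wildcards, and
    the account ID, region and file name below are made-up sample values.
    """
    sample_key = (
        f"{key_prefix}/AWSLogs/{account_id}/CloudTrail/eu-west-2/"
        "2022/04/21/sample.json.gz"
    )
    sample_arn = f"arn:aws:s3:::{bucket_name}/{sample_key}"
    return fnmatchcase(sample_arn, policy_resource_arn)


resource = "arn:aws:s3:::auto-tagger-trail-bucket/auto-tagger-trail-logs/AWSLogs/*"
bucket = "auto-tagger-trail-bucket"

print(policy_covers_trail_prefix(resource, bucket, "auto-tagger-trail-logs"))  # True
# A prefix that differs by a single character is not covered by the policy.
print(policy_covers_trail_prefix(resource, bucket, "auto-tagger-trail-log"))   # False
```

If the second case describes your configuration, creating the trail is likely to fail because CloudTrail cannot write to the bucket.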

Now that we have our bucket to store our logs and an IAM role, we need to create the Cloudtrail and Lambda.

For Cloudtrail, you will need to associate a Cloudwatch log group with it. This log group will store details of when the Lambda is triggered:

# cloudtrail.tf
resource "aws_cloudwatch_log_group" "auto-tagger-log-group" {
  name = "/aws/lambda/auto-tagger-function"
}

You can view these logs in the GUI under Cloudwatch log groups.

Next, add a statement to the IAM role in-line policy to allow logging to the log group:

# iam_role_policy.tf
...
data "aws_iam_policy_document" "tag-resources-inline-policy" {
  statement {
    sid    = "AllowLoggingToLogGroup"
    effect = "Allow"
    actions = [
      "cloudwatch:*",
      "logs:PutLogEvents",
      "logs:CreateLogStream"
    ]
    resources = [
      "${aws_cloudwatch_log_group.auto-tagger-log-group.arn}:*"
    ]
  }
}

Attach the in-line policy to the IAM role by updating the resource block you created earlier:

# iam_role.tf
resource "aws_iam_role" "auto-tagger-role" {
  name               = "auto-tagger-role"
  assume_role_policy = data.aws_iam_policy_document.assume-role-policy.json
  inline_policy {
    name   = "ResourceTagging"
    policy = data.aws_iam_policy_document.tag-resources-inline-policy.json
  }
}

Next, we will add the Cloudtrail, which is quite an important step. The trail will log management events and allow us to implement triggers for our Lambda.

We’ll cover this in more detail later on, but for now we will create the Cloudtrail:

# cloudtrail.tf
...
resource "aws_cloudtrail" "cloudtrail" {
  name           = "auto-tagger-trail"
  s3_bucket_name = aws_s3_bucket.trail-bucket.bucket
  # Must match the key prefix allowed by the s3:PutObject statement
  # in the bucket policy ("auto-tagger-trail-logs").
  s3_key_prefix              = "auto-tagger-trail-logs"
  cloud_watch_logs_role_arn  = aws_iam_role.auto-tagger-role.arn
  cloud_watch_logs_group_arn = "${aws_cloudwatch_log_group.auto-tagger-log-group.arn}:*"
  event_selector {
    read_write_type           = "WriteOnly"
    include_management_events = true
    exclude_management_event_sources = [
      "kms.amazonaws.com",
      "rdsdata.amazonaws.com"
    ]
  }
  depends_on = [
    aws_s3_bucket.trail-bucket,
    aws_iam_role.auto-tagger-role,
    aws_cloudwatch_log_group.auto-tagger-log-group,
    aws_s3_bucket_policy.trail-bucket-policy
  ]
}

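We need the depends_on block to tell Terraform that Cloudtrail relies on these other resources. The bucket policy in particular is not referenced anywhere in the trail block, so without depends_on Terraform may try to create the trail before the policy is in place, and the apply will fail.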

Next, we will create a Lambda. This will be the logic behind the tagging of our resources — we’ll add the code for that later on. For now, add the following resource block:

# lambda.tf
resource "aws_lambda_function" "auto-tagger-function" {
  filename         = "./lambda_function.zip"
  function_name    = "auto-tagger-function"
  role             = aws_iam_role.auto-tagger-role.arn
  handler          = "lambda_function.lambda_handler"
  runtime          = "python3.9"
  source_code_hash = filebase64sha256("lambda_function.zip")
}

All of the Terraform resources you have written so far are just the basic configuration needed to allow automatic tagging of AWS resources. There are other settings you could apply, but they aren’t needed for this tutorial.

Tagging resources

We’ve now reached the stage where we have most of the infrastructure that we need. For the most part, the IAC that we’ve already written won’t change. Next, we’ll begin writing the stuff that will actually apply tags to the resources we want to tag.

For each different resource you want to tag, there are five components that you will need to provide:

  • Cloudwatch
    - event rules
    - event target
  • Lambda
    - permissions
    - lambda code
  • IAM role inline policy

It’s worth noting that you will be able to add as many of these as you like for the different services that AWS provides!

Event rule

# event_rules.tf
resource "aws_cloudwatch_event_rule" "tag-bucket-rule" {
  name          = "AutoTagBuckets"
  role_arn      = aws_iam_role.auto-tagger-role.arn
  event_pattern = <<EOF
{
  "source": ["aws.s3"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["s3.amazonaws.com"],
    "eventName": ["CreateBucket"]
  }
}
EOF
}

This will look for events logged by the Cloudtrail that we created earlier, specifically CreateBucket calls made to S3. We can also reuse this for other services. To do so, simply copy it and change it to the following:

# event_rules.tf
...
resource "aws_cloudwatch_event_rule" "<name-of-rule>" {
  name          = "<NameOfRuleInAWS>"
  role_arn      = aws_iam_role.auto-tagger-role.arn
  event_pattern = <<EOF
{
  "source": ["aws.<source_of_event>"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["<source_of_event>.amazonaws.com"],
    "eventName": ["<EventName>"]
  }
}
EOF
}
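
To see what a rule like this will and will not match, it can help to model the matching locally. The sketch below builds the same pattern for any service and checks it against an event dict. It only covers exact-value matching (each list in the pattern is a set of allowed values, and nested objects are matched recursively), which is a simplification of what EventBridge supports. build_pattern and matches are hypothetical helper names, not part of any AWS library, and the sample event is trimmed and made up.

```python
import json


def build_pattern(service, event_name):
    """Build the event pattern used in the rules above for one service/event."""
    return {
        "source": [f"aws.{service}"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": [f"{service}.amazonaws.com"],
            "eventName": [event_name],
        },
    }


def matches(pattern, event):
    """Simplified matcher: lists are allowed values, dicts are matched recursively."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Trimmed, made-up event in the shape the Lambda receives.
sample_event = {
    "source": "aws.s3",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "s3.amazonaws.com",
        "eventName": "CreateBucket",
        "requestParameters": {"bucketName": "example-bucket"},
    },
}

pattern = build_pattern("s3", "CreateBucket")
print(json.dumps(pattern, indent=2))
print(matches(pattern, sample_event))                              # True
print(matches(build_pattern("s3", "DeleteBucket"), sample_event))  # False
```

Extra fields in the event (such as requestParameters here) do not prevent a match; only the fields named in the pattern are compared.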

Keep in mind that if you want to view the rules through the GUI, they are now located under Amazon EventBridge and not Cloudwatch.

Event target

# event_target.tf
resource "aws_cloudwatch_event_target" "tag-bucket-target" {
  arn  = aws_lambda_function.auto-tagger-function.arn
  rule = aws_cloudwatch_event_rule.tag-bucket-rule.id
}

This sets the Lambda as the target of the rule, so that matching events are sent to it. Again, you will need to copy and change the code to the following if you want to reuse it for another resource:

# event_target.tf
resource "aws_cloudwatch_event_target" "<name-of-target>" {
  arn  = aws_lambda_function.auto-tagger-function.arn
  rule = aws_cloudwatch_event_rule.<name-of-rule>.id
}

Lambda permission

# lambda_permissions.tf
resource "aws_lambda_permission" "invoke-bucket-rule" {
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.auto-tagger-function.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.tag-bucket-rule.arn
  depends_on = [
    aws_cloudwatch_event_rule.tag-bucket-rule
  ]
}

This gives the event rule permission to invoke the Lambda, so the Lambda is triggered whenever the Cloudwatch event rule is met. To reuse it, change the following:

# lambda_permissions.tf
...
resource "aws_lambda_permission" "<invoke-service-name>" {
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.auto-tagger-function.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.<name-of-rule>.arn
  depends_on = [
    aws_cloudwatch_event_rule.<name-of-rule>
  ]
}

In-line policy to tag resources

data "aws_iam_policy_document" "lambda-tag-resources-inline-policy" ... {
  statement {
    sid    = "AllowRoleToTagResources"
    effect = "Allow"
    actions = [
      "s3:GetBucketTagging",
      "s3:PutBucketTagging"
    ]
    resources = ["*"]
  }
}

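This will allow our role to tag resources for us. Note that the statement only takes effect if the policy document it lives in is the one attached to the role (tag-resources-inline-policy in the earlier iam_role.tf block). If you want to tag other resources, you’ll need to add the tagging action that is specific to each service to the actions block: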

data "aws_iam_policy_document" "lambda-tag-resources-inline-policy" ... {
  statement {
    sid    = "AllowRoleToTagResources"
    effect = "Allow"
    actions = [
      "s3:GetBucketTagging",
      "s3:PutBucketTagging",
      "<service>:<method_of_applying_tags_to_service>"
    ]
    resources = ["*"]
  }
}
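
For reference, a few tagging actions for other services are collected in the sketch below. Treat the dictionary as a starting point to verify against each service's IAM documentation rather than a complete list; TAGGING_ACTIONS and actions_for are illustrative names that do not appear elsewhere in this guide.

```python
# Example IAM actions used to read or apply tags, keyed by service prefix.
TAGGING_ACTIONS = {
    "s3": ["s3:GetBucketTagging", "s3:PutBucketTagging"],
    "ec2": ["ec2:CreateTags"],
    "lambda": ["lambda:TagResource"],
    "rds": ["rds:AddTagsToResource"],
    "sqs": ["sqs:TagQueue"],
}


def actions_for(services):
    """Return a sorted, de-duplicated list of actions for the given services."""
    actions = set()
    for service in services:
        if service not in TAGGING_ACTIONS:
            raise ValueError(f"No tagging actions recorded for {service!r}")
        actions.update(TAGGING_ACTIONS[service])
    return sorted(actions)


print(actions_for(["s3", "ec2"]))
# ['ec2:CreateTags', 's3:GetBucketTagging', 's3:PutBucketTagging']
```

The resulting list is what would go into the actions block of the policy statement.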

Lambda code

We’re almost there!

Next up, we need to add some code to our Lambda. This will mean that when it is triggered, the tags will be added to the resource.

# lambda_function.py
import json
import boto3
import datetime
from botocore.exceptions import ClientError

s3 = boto3.client('s3')


def lambda_handler(event, context):
    # This is optional, it will give you an output in the cloudwatch
    # log group '/aws/lambda/auto-tagger-function'.
    # For actual deployment remove the print statement.
    print(json.dumps(event))

    detail = event['detail']
    event_source = detail['eventSource']
    event_name = detail['eventName']
    user_type = detail['userIdentity']['type']
    principal = detail['userIdentity']['principalId']

    # Gets the user from event
    if user_type == 'IAMUser':
        user = detail['userIdentity']['userName']
    else:
        user = principal.split(':')[1]

    current_date = datetime.date.today().strftime("%Y-%m-%d")
    tag_set = {
        'TagSet': [
            {'Key': 'CreatedBy', 'Value': user},
            {'Key': 'CreatedOn', 'Value': current_date}
        ]
    }

    if event_name == 'CreateBucket':
        # Because of the way s3 applies tags, if you have already supplied
        # tags to the bucket when creating it, applying tags again will
        # overwrite those tags. We use a 'try' + 'except' block to add the
        # existing 'TagSet' to the tag_set we want to automatically apply.
        bucket_name = detail['requestParameters']['bucketName']
        try:
            user_added_tags = s3.get_bucket_tagging(Bucket=bucket_name)['TagSet']
            # tag_set is a dict, so extend the list stored under 'TagSet'
            tag_set['TagSet'] += user_added_tags
        except ClientError:
            # get_bucket_tagging raises an error when the bucket has no
            # tags yet; in that case only the automatic tags are applied.
            pass
        s3.put_bucket_tagging(Bucket=bucket_name, Tagging=tag_set)
    return True
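
The handler above is hard to exercise without AWS credentials, so it can be useful to pull the pure logic out into functions that run locally. The sketch below is one way to do that, under some assumptions: the function names are hypothetical, the creator lookup falls back to the whole principalId when it contains no ':' to split on, and the merge drops a user-supplied tag when it reuses one of the automatic keys, on the basis that a tag set cannot contain the same key twice. The sample events are trimmed and made up.

```python
def creator_from_detail(detail):
    """Work out who created the resource from the CloudTrail event detail."""
    identity = detail["userIdentity"]
    if identity["type"] == "IAMUser":
        return identity["userName"]
    # Assumed roles look like "<role-id>:<session-name>"; if there is no ':'
    # the whole principalId is returned instead of raising an IndexError.
    return identity["principalId"].split(":")[-1]


def merge_tags(auto_tags, existing_tags):
    """Combine automatic and existing tags, keeping automatic ones on key clashes."""
    auto_keys = {tag["Key"] for tag in auto_tags}
    kept = [tag for tag in existing_tags if tag["Key"] not in auto_keys]
    return {"TagSet": auto_tags + kept}


iam_user = {"userIdentity": {"type": "IAMUser", "userName": "alice",
                             "principalId": "AIDAEXAMPLE"}}
assumed_role = {"userIdentity": {"type": "AssumedRole",
                                 "principalId": "AROAEXAMPLE:deploy-session"}}

print(creator_from_detail(iam_user))      # alice
print(creator_from_detail(assumed_role))  # deploy-session

auto = [{"Key": "CreatedBy", "Value": "alice"}]
existing = [{"Key": "Team", "Value": "data"}, {"Key": "CreatedBy", "Value": "someone"}]
print(merge_tags(auto, existing))
# {'TagSet': [{'Key': 'CreatedBy', 'Value': 'alice'}, {'Key': 'Team', 'Value': 'data'}]}
```

The dict returned by merge_tags has the shape expected by the Tagging argument of put_bucket_tagging.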

Applying the Terraform

The final step is to run the following (run terraform init first if you have not yet initialised this working directory):

  • zip lambda_function.zip lambda_function.py
  • terraform apply --auto-approve
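
If the zip command is not available on your machine, the archive can also be built with Python's standard library. This is a minimal sketch: build_lambda_zip is a hypothetical helper name, and the default file names are the ones used in this guide.

```python
import zipfile
from pathlib import Path


def build_lambda_zip(source="lambda_function.py", target="lambda_function.zip"):
    """Write `source` into a new zip archive at `target`, at the archive root."""
    source_path = Path(source)
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname keeps the file at the top level of the archive so the
        # handler "lambda_function.lambda_handler" can be resolved.
        archive.write(source_path, arcname=source_path.name)
    return target
```

Calling build_lambda_zip() from the directory that contains lambda_function.py should produce the same archive layout as the zip command above.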

And there you have it!

If you create an S3 Bucket and check its tags at this point, you should see something like the following:

Tags that have been automatically created

The value of the ‘CreatedBy’ key should be the name of the user associated with the AWS credentials you are using.

Things to consider

If you’re going to expand on this (I assume you will, since there is little point in only tagging buckets!), you will need to change the way you supply tag sets to different services. Annoyingly, the tagging calls are not all the same.
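
To illustrate the point, here is a hedged sketch of how the same two tags might be reshaped for three boto3 calls: put_bucket_tagging (S3), create_tags (EC2) and tag_resource (Lambda). The helper names and sample identifiers are made up for this example, and the functions only build the keyword arguments, so nothing is sent to AWS.

```python
def s3_tagging_kwargs(bucket_name, tags):
    """S3 wants a TagSet list of Key/Value pairs wrapped in a Tagging dict."""
    tag_list = [{"Key": key, "Value": value} for key, value in tags.items()]
    return {"Bucket": bucket_name, "Tagging": {"TagSet": tag_list}}


def ec2_tagging_kwargs(resource_ids, tags):
    """EC2 wants a list of resource IDs and a flat Tags list of Key/Value pairs."""
    tag_list = [{"Key": key, "Value": value} for key, value in tags.items()]
    return {"Resources": list(resource_ids), "Tags": tag_list}


def lambda_tagging_kwargs(function_arn, tags):
    """Lambda wants the function ARN and a plain key-to-value dict."""
    return {"Resource": function_arn, "Tags": dict(tags)}


tags = {"CreatedBy": "alice", "CreatedOn": "2022-04-21"}

print(s3_tagging_kwargs("example-bucket", tags))
print(ec2_tagging_kwargs(["i-0123456789abcdef0"], tags))
print(lambda_tagging_kwargs(
    "arn:aws:lambda:eu-west-2:123456789012:function:example", tags))
```

Each result can be unpacked into the matching client call, for example s3.put_bucket_tagging(**s3_tagging_kwargs(bucket_name, tags)).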

I hope you’ve found this article on auto tagging resources useful and informative. A special mention to Credera UK for allowing me the time to write about auto tagging and share my knowledge.
