Create a Python Lambda to Save On Your Cloud Bill with Terraform

Nick Miller
9 min read · Dec 20, 2022

To use the cloud well, you must speak the language of the cloud. And the language of the cloud is code.

If we want to automate the management of our resources at scale, we need to use code, such as Terraform, to simplify the creation and management of resources. We must prioritize using serverless, such as Lambdas, to keep costs low. And we need to be able to use programming languages, such as Python, to alter running resources dynamically.

In this project, we’ll see how we can use Terraform, Lambdas, and Python to create a repeatable task that protects us from unnecessary charges on our cloud bills. We will use Terraform to create a Lambda running a Python script that will stop all instances with a particular tag, and we will schedule that Lambda to run daily.

Best of all, this will be done entirely in code. So let’s get to it.

Prerequisites

  • Intermediate Understanding of Terraform — I’ll explain my templates, but you should know how Resources and Modules work to get the most out of this walkthrough.
  • Beginner Understanding of Python — We’ll use Conditionals, For Loops, and Lists in the Lambda script.
  • Some knowledge of Boto3 — I’ll use the AWS SDK for Python (Boto3) throughout the Lambda script.

Setup and Structure of Project

The setup consists of several files and folders:

  • providers.tf: This file defines the AWS provider, which is required for Terraform to be able to create and manage AWS resources.
  • main.tf: This file defines the resources for the project, including a module for managing IAM permissions and another module for creating the Lambda function.
  • iam: This Module defines the IAM resources required for the Lambda function, including an IAM role that Lambda can assume and an IAM policy, attached to that role, with permissions to describe and stop EC2 instances.
  • lambda: This Module defines the Lambda function, including its name, code, and the IAM role and policy attachment defined in the iam Module.
  • python: a folder that contains the script (lambda_function.py) used by the lambda Module.

Let’s go through what this project looks like.

root/providers.tf

We’re using us-east-1 for this project, but feel free to change the region depending on where you are located.

# -- root/providers.tf --

#Declare the AWS provider
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# Configure the AWS Provider
provider "aws" {
  region = "us-east-1"
}

root/main.tf

# -- root/main.tf --

#Defines the local value I will use to name the Lambda I'm creating
locals {
  lambda_name = "stop-dev-instances"
}

#Module containing IAM permissions used by the Lambda
module "iam" {
  source = "./iam"
}

#Defines a data resource of type "archive_file" named "zip_the_python_code".
data "archive_file" "zip_the_python_code" {
  type = "zip"
  # Creates a zip archive by combining the contents of the "python" directory and saving it to the specified output path
  source_dir = "${path.module}/python/"
  # The output path for the zip file is a file in the "python" directory
  # named after the value of the "lambda_name" local value
  output_path = "${path.module}/python/${local.lambda_name}.zip"
}

#Module creating the actual Lambda
module "lambda" {
  source      = "./lambda"
  lambda_name = local.lambda_name
  #The filename of the zip file containing the code for the lambda function
  filename = "${path.module}/python/${local.lambda_name}.zip"
  #The IAM role that the lambda function should assume
  lambda_role_arn = module.iam.lambda_role_arn
  #The IAM policy attachment for that role
  role_policy_attachment = module.iam.role_policy_attachment
}

IAM Module

In this Module, we will give the Lambda the permissions it needs to describe and stop EC2 instances. To do this, we must allow Lambda to assume an IAM role with the proper permissions.

iam/main.tf

# -- iam/main.tf --

#IAM role that the Lambda will assume
resource "aws_iam_role" "lambda_role" {
  name = "assume-lambda-role"
  #References a file in the iam directory that holds the Lambda's assume-role (trust) policy
  assume_role_policy = file("${path.module}/iam_role.txt")
}

#Policy for the IAM role to use
resource "aws_iam_policy" "iam_policy_for_lambda_role" {
  name        = "aws-policy-for-assume_role"
  description = "AWS IAM Policy for managing assume-lambda-role"
  #References a file in the iam directory that holds the IAM policy for the assumed role
  policy = file("${path.module}/iam_policy.txt")
}

#Attach the IAM policy to the role
resource "aws_iam_role_policy_attachment" "role_policy_attachment" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = aws_iam_policy.iam_policy_for_lambda_role.arn
}

iam/iam_role.txt

Policy allowing Lambda to assume the IAM role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}

iam/iam_policy.txt

Policy allowing the IAM role to find and stop EC2 instances and to write CloudWatch logs. Swap the account ID in the log ARNs for your own, and make sure the log-group name matches your Lambda’s function name:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:StopInstances"
      ],
      "Resource": "*"
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": "logs:CreateLogGroup",
      "Resource": "arn:aws:logs:us-east-1:235447109042:*"
    },
    {
      "Sid": "VisualEditor2",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:us-east-1:235447109042:log-group:/aws/lambda/stop-dev-instances:*"
    }
  ]
}

iam/outputs.tf

These values will be passed to the lambda Module via variables in the Root Module’s main.tf.

# -- iam/outputs.tf --

output "lambda_role_arn" {
  value = aws_iam_role.lambda_role.arn
}

output "role_policy_attachment" {
  value = aws_iam_role_policy_attachment.role_policy_attachment
}

Lambda Module

This Module consumes the IAM role and policy attachment created above, along with the Python script we will review next, to stop all EC2 instances tagged as Dev. We will also use a CloudWatch Event rule to schedule the Lambda to run daily (rate(1 day); you could swap in a cron expression if you need a specific time of day).

lambda/main.tf

# -- lambda/main.tf --

#Create Lambda function that runs the Python code
#This function will stop the instances tagged as Dev
resource "aws_lambda_function" "lambda_function" {
  #Uses the Python file that is zipped in the Root Module's main.tf
  filename      = var.filename
  function_name = var.lambda_name
  #Attach IAM role to Lambda
  role    = var.lambda_role_arn
  handler = "lambda_function.lambda_handler"
  runtime = "python3.8"
  timeout = 60
  #Wait until the IAM policy is attached to the IAM role before creating
  depends_on = [var.role_policy_attachment]
}

# Create the daily stop schedule
resource "aws_cloudwatch_event_rule" "every_day" {
  name                = "daily"
  schedule_expression = "rate(1 day)"
}

# Allow CloudWatch to invoke the Lambda function
resource "aws_lambda_permission" "allow_cloudwatch_to_invoke" {
  function_name = aws_lambda_function.lambda_function.function_name
  statement_id  = "CloudWatchInvoke"
  action        = "lambda:InvokeFunction"
  #Uses the daily CloudWatch stop schedule we just created
  source_arn = aws_cloudwatch_event_rule.every_day.arn
  principal  = "events.amazonaws.com"
}

# Set the Lambda as the target to run when every_day is triggered
resource "aws_cloudwatch_event_target" "invoke_lambda" {
  rule       = aws_cloudwatch_event_rule.every_day.name
  arn        = aws_lambda_function.lambda_function.arn
  depends_on = [aws_cloudwatch_event_rule.every_day, aws_lambda_function.lambda_function]
}

lambda/variables.tf

Declare the variables that will be passed from the Root Module into the Lambda Module:

# -- lambda/variables.tf --

variable "filename" {
  type = string
}

variable "lambda_name" {
  type = string
}

variable "lambda_role_arn" {
  type = string
}

variable "role_policy_attachment" {}

Python Script

If you remember, in the Root Module we had the below code for creating a zip file that we later used as the filename value in our Lambda Module:

#Defines a data resource of type "archive_file" named "zip_the_python_code".
data "archive_file" "zip_the_python_code" {
  type = "zip"
  # Creates a zip archive by combining the contents of the "python" directory and saving it to the specified output path
  source_dir = "${path.module}/python/"
  # The output path for the zip file is a file in the "python" directory
  # named after the value of the "lambda_name" local value
  output_path = "${path.module}/python/${local.lambda_name}.zip"
}

In the python folder, we have the Python script (lambda_function.py) that the Lambda will execute to stop Dev instances. Here’s what our Lambda code will look like:

import json
import logging

import boto3


def lambda_handler(event, context):
    #make logging work both in Lambda and locally
    if len(logging.getLogger().handlers) > 0:
        # the Lambda environment pre-configures a handler logging to stderr. If a handler is already configured,
        # `.basicConfig` does not execute. Thus we set the level directly.
        # Reference: https://stackoverflow.com/questions/37703609/using-python-logging-with-aws-lambda
        logging.getLogger().setLevel(logging.INFO)
    else:
        logging.basicConfig(level=logging.INFO)

    #Set the boto3 client to modify us-east-1 resources
    #change this if your region isn't us-east-1
    ec2_client = boto3.client("ec2", region_name="us-east-1")

    #capture all the instance reservations in us-east-1
    list_instances = ec2_client.describe_instances()
    reservations = list_instances["Reservations"]

    #declare list that will collect ids of instances to stop
    stop_list = []

    #iterate through instance reservations
    for r in reservations:
        instances = r["Instances"]
        #iterate through instances within the reservation
        for i in instances:
            instance_id = i["InstanceId"]
            #determine whether the instance is running AND has an Environment tag with the value Dev
            #add matching instance_ids to stop_list
            if i["State"]["Name"] == "running":
                #instances with no tags at all have no "Tags" key, so default to an empty list
                tags = i.get("Tags", [])
                to_stop = False
                for t in tags:
                    if t["Key"] == "Environment" and t["Value"] == "Dev":
                        to_stop = True
                if to_stop:
                    logging.info(f"{instance_id} is being stopped")
                    stop_list.append(instance_id)
                else:
                    logging.info(f"{instance_id} is not tagged as Environment:Dev and will not be stopped")
            else:
                logging.info(f"{instance_id} is not running and will not be stopped")
    logging.info(f"Stop List: {stop_list}")

    #stop all instance_ids on the stop_list
    if len(stop_list) > 0:
        result = ec2_client.stop_instances(InstanceIds=stop_list)
        #log the outcome of running ec2_client.stop_instances
        logging.info(f"Result: {result}")
    else:
        #log that the list is empty if there are no Dev instances to stop
        logging.info("Stop List is empty. Nothing to stop")

When Terraform’s data resource block zips this code in the Root Module’s main.tf, the zipped file will also be deposited in the python directory. That zip path is then used to create the Lambda.
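If you want to sanity-check the handler before deploying, a quick driver script is enough. This is just a sketch: it assumes boto3 is installed locally, you run it from inside the python folder, and your local AWS credentials can call EC2 (the file name local_test.py is hypothetical). Be aware that it really calls the EC2 API and will stop any running Environment:Dev instances in your account:

# local_test.py (hypothetical helper, not part of the deployed package)
from lambda_function import lambda_handler

# Lambda normally supplies an event dict and a context object;
# our handler uses neither, so empty placeholders are enough here.
lambda_handler(event={}, context=None)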

Time to Apply

Verify that you’ve created all of the folders and files in your project directory:
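A sketch of the expected layout, using the file names from this walkthrough (the .zip only appears once Terraform’s archive_file data source has run):

.
├── providers.tf
├── main.tf
├── iam
│   ├── main.tf
│   ├── iam_role.txt
│   ├── iam_policy.txt
│   └── outputs.tf
├── lambda
│   ├── main.tf
│   └── variables.tf
└── python
    ├── lambda_function.py
    └── stop-dev-instances.zip   (generated by the archive_file data source)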

Once the directory is complete, it’s time to apply your changes from the terminal. You should know these commands if you’ve used Terraform before, but if not, here they are:


#make sure the directory has been initialized
terraform init

#validate that the configuration files are syntactically valid and internally consistent
terraform validate

#see what changes Terraform will make when you run your template
terraform plan

#deploy your infrastructure
terraform apply

Testing our Lambda

To test this Lambda, navigate to the Lambda menu in your AWS Console. You should see the function we just created:

Click into the Lambda and click on Test:

Keep all the defaults and use whatever you want for the Event Name:

When you hit Test again, you’ll see the Execution results. In the Function Logs section, you should see this:

As we don’t have any Dev instances running, this is the expected result.
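If you’d rather test from code than from the console, you can also invoke the function with a short boto3 script. This is just a sketch: it assumes your local AWS credentials are allowed to invoke Lambda, you deployed to us-east-1, and the function name matches the local.lambda_name value from the Root Module:

import json
import boto3

# Invoke the deployed function with an empty test event
lambda_client = boto3.client("lambda", region_name="us-east-1")
response = lambda_client.invoke(
    FunctionName="stop-dev-instances",
    Payload=json.dumps({}).encode(),
)

# A 200 status code means the invocation succeeded;
# the detailed output still lands in the CloudWatch Function Logs.
print(response["StatusCode"])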

To test against running Dev instances, you can either manually spin up an instance with an Environment:Dev tag or use the below Terraform script to quickly create some test instances:

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_instance" "test" {
  count = 3
  #AMI IDs are region-specific; swap in one that exists in your region
  ami           = "ami-0093a6022697a73aa"
  instance_type = "t2.micro"
  tags = {
    Name        = "Dev-${count.index}"
    Environment = "Dev"
  }
}

If you’ve run the script, you’ll see this in your EC2 Console:

Now go back to Lambda and run Test again.

This time in the Execution results, we’ll see that the Stop List contains the three Instance IDs tagged with the Dev Environment tag:

And if we look at the EC2 Console, the Instances have been stopped:
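You can confirm the same result without the console by asking the EC2 API for the state of every Environment:Dev instance. Again a sketch, under the same assumptions as before (local credentials, us-east-1):

import boto3

# List every instance tagged Environment=Dev and print its current state
ec2_client = boto3.client("ec2", region_name="us-east-1")
response = ec2_client.describe_instances(
    Filters=[{"Name": "tag:Environment", "Values": ["Dev"]}]
)

for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        # Expect "stopping" or "stopped" after the Lambda has run
        print(instance["InstanceId"], instance["State"]["Name"])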

Wrapping Up

As you’ve seen in this demo, using Lambda, Python, and Terraform together, you can create automated processes that respond to events, execute code written in Python, and manage infrastructure resources in a reliable and cost-effective way. We performed a relatively simple task of stopping instances on a daily schedule. Still, you can imagine how these three tools could be used together for tasks such as data processing, backups, and monitoring.

If you would like to see the code used in this walkthrough, the repo can be found here: https://github.com/nickcmiller/tf-stop-instances
