EBS Snapshot Management Using AWS Lambda and CloudWatch

Troy Ingram
Mar 12 · 7 min read

Use Case:

You are the admin over a large number of EC2 instances. You need to ensure that you have EBS Snapshots available for disaster recovery. To save time, you want to automate this practice, along with deleting EBS Snapshots that are more than 10 days old. Automating this task saves your company labor hours that can be better spent elsewhere, which in turn saves money.

Lambda is an AWS serverless compute service, which will allow us to run our code only when we need it. We’ll use a CloudWatch Event to trigger our Lambda function based on a schedule.

Create EC2 Instance

Note: If you already have at least one EC2 Instance, you can skip to Create An IAM Role.

  1. Navigate to EC2. Services > EC2

4. Select t2.micro (Free tier eligible) and click Next: Configure Instance Details.
5. The defaults are fine. Click Next: Add Storage.
6. No changes. Click Next: Add Tags.
7. Optional: Click Add Tag. For Key: Name and for Value: EBS Snapshots. I like to do this so I can track all my resources used in a lab for easier cleanup.
8. Click Next: Configure Security Group
9. For your security group, you’ll want to make sure it’s secure. Either select an existing Security Group or modify the new Security Group. For this lab I recommend simply changing the SSH Source from Custom to My IP. This will prevent the EC2 Instance from being open to the world.

10. Click Review and Launch.
11. Click Launch

Create An IAM Role

  1. Click Services > IAM

7. Navigate back to Roles page and use the search to find your newly created role.
8. Click on your Role name link to go to your Role summary.
9. Under the Permissions tab, click Add inline policy.
10. In the JSON tab, paste the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:*"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:Describe*",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot",
        "ec2:CreateTags",
        "ec2:ModifySnapshotAttribute",
        "ec2:ResetSnapshotAttribute"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

11. Click Review
12. Click Create Policy
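
The policy above grants `logs:*`, which is broader than a function usually needs just to write its own logs. As a hedged sketch (not part of the original walkthrough), the snippet below builds a narrower variant in Python and prints it as JSON for pasting into the JSON tab. The three log actions are an assumption about what this function needs; verify them against your own setup. The helper name `build_policy` is illustrative, and the EC2 statements are copied unchanged from the policy above.

```python
import json

# Assumption: the function only needs to create its log group and stream
# and write log events, so "logs:*" is narrowed to these three actions.
LOG_ACTIONS = [
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents",
]


def build_policy():
    """Return the inline policy as a plain dict (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": LOG_ACTIONS,
                "Resource": "arn:aws:logs:*:*:*",
            },
            {"Effect": "Allow", "Action": "ec2:Describe*", "Resource": "*"},
            {
                # Same EC2 actions as the policy in the walkthrough.
                "Effect": "Allow",
                "Action": [
                    "ec2:CreateSnapshot",
                    "ec2:DeleteSnapshot",
                    "ec2:CreateTags",
                    "ec2:ModifySnapshotAttribute",
                    "ec2:ResetSnapshotAttribute",
                ],
                "Resource": ["*"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(), indent=2))
```

If the function later fails with an access error when logging, widening the log actions again is the first thing to check.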

Create Lambda Snapshot Function

Next we need to create a Lambda function which will create snapshots of our EBS volumes.

  1. Navigate to Lambda using Services or the Search option.

Now that the function was created, it needs to be configured.

  1. In the Code section update the lambda_function.py with the following:
# Back up all in-use volumes in the function's region
import boto3


def lambda_handler(event, context):
    ec2 = boto3.client('ec2')

    # Get all in-use volumes in the current region
    result = ec2.describe_volumes(Filters=[{'Name': 'status', 'Values': ['in-use']}])

    for volume in result['Volumes']:
        print(f"Backing up {volume['VolumeId']} in {volume['AvailabilityZone']}")

        # Create snapshot
        response = ec2.create_snapshot(
            VolumeId=volume['VolumeId'],
            Description='Created by Lambda backup function ebs-snapshots')

        # Get snapshot resource
        ec2resource = boto3.resource('ec2')
        snapshot = ec2resource.Snapshot(response['SnapshotId'])

        # Find name tag for volume, defaulting to 'N/A' when there is none
        volumename = 'N/A'
        for tag in volume.get('Tags', []):
            if tag['Key'] == 'Name':
                volumename = tag['Value']

        # Add volume name to snapshot for easier identification
        snapshot.create_tags(Tags=[{'Key': 'Name', 'Value': volumename}])

2. Click Deploy.

3. The Lambda Function has a default timeout of 3 sec, so it needs to be updated. I went with 1 minute to be safe. The max timeout is 15 minutes (900 seconds), so for future projects, if your code takes longer than 15 minutes to run, Lambda is not the solution. Click Configuration > General configuration > Edit to update the Timeout.

4. Ensure you select the IAM Role created earlier for the Existing role option and Save.
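
The snapshot function above makes two calls per volume (create, then tag) and looks up the Name tag inline. One possible variant, sketched here under assumptions, moves the tag lookup into a helper, tags the snapshot at creation through the `TagSpecifications` parameter of `create_snapshot`, and uses a paginator so accounts with many volumes are fully covered. The names `volume_name` and `backup_volumes` are illustrative and not part of the original function. The client is passed in as an argument so the logic can be tried without AWS access.

```python
def volume_name(volume, default="N/A"):
    """Return the value of the volume's Name tag, or `default` if absent."""
    for tag in volume.get("Tags", []):
        if tag["Key"] == "Name":
            return tag["Value"]
    return default


def backup_volumes(ec2):
    """Snapshot every in-use volume visible to `ec2` (an EC2 client).

    Returns the list of snapshot IDs that were created.
    """
    snapshot_ids = []
    # The paginator follows continuation tokens across result pages.
    paginator = ec2.get_paginator("describe_volumes")
    pages = paginator.paginate(
        Filters=[{"Name": "status", "Values": ["in-use"]}]
    )
    for page in pages:
        for volume in page["Volumes"]:
            response = ec2.create_snapshot(
                VolumeId=volume["VolumeId"],
                Description="Created by Lambda backup function ebs-snapshots",
                # Tag at creation time instead of a separate create_tags call.
                TagSpecifications=[{
                    "ResourceType": "snapshot",
                    "Tags": [{"Key": "Name", "Value": volume_name(volume)}],
                }],
            )
            snapshot_ids.append(response["SnapshotId"])
    return snapshot_ids
```

Inside Lambda, the handler would call something like `backup_volumes(boto3.client('ec2'))`. Tagging at creation still relies on the `ec2:CreateTags` permission already present in the role's policy.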

Add CloudWatch Trigger

  1. In the Function overview section click the Add trigger button. You could also get to the same place using Configuration > Triggers > Add trigger.

Now our CloudWatch trigger is set. We could wait until our event triggers according to our schedule to see if it’s working or we could test it right now.

  1. While in your Lambda Function select Test > New Event

4. Afterwards, an Execution result log will appear. Click Details to view the logs.

5. Next Navigate to Services > EC2 > Elastic Block Store > Snapshots. You should see newly created snapshots of all your in-use EBS volumes.
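
Alongside the console test, a handler can be exercised locally with an event shaped like the one a scheduled rule delivers. The sketch below is illustrative only: every field value is a placeholder, the rule name `example-rule` and region are made up, and `demo_handler` is a stand-in that merely inspects the event. In practice you would call the real `lambda_handler` from `lambda_function.py` instead.

```python
# Placeholder values throughout; only the overall shape matters here.
SCHEDULED_EVENT = {
    "version": "0",
    "id": "00000000-0000-0000-0000-000000000000",
    "detail-type": "Scheduled Event",
    "source": "aws.events",
    "account": "111111111111",
    "time": "1970-01-01T00:00:00Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:events:us-east-1:111111111111:rule/example-rule"
    ],
    "detail": {},
}


def demo_handler(event, context):
    """Stand-in handler: reports which rule triggered the invocation."""
    resources = event.get("resources") or []
    rule_arn = resources[0] if resources else "unknown"
    print(f"Triggered by {rule_arn} at {event.get('time')}")
    return rule_arn


if __name__ == "__main__":
    # The context object is unused by this stand-in, so None is passed.
    demo_handler(SCHEDULED_EVENT, None)
```

Neither snapshot function in this walkthrough reads anything from the event, so an empty test event works just as well in the console.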

Create Lambda Function To Delete Old Snapshots

You don’t want to become a snapshot hoarder, and you don’t want to have to manually delete your snapshots, so we are going to add another Lambda function to delete snapshots that are older than a set number of days.

  1. Navigate back to Lambda. Services > Lambda

You’ll need to replace account_id with your own AWS account ID, and you can adjust retention_days to whatever you want.

# Delete snapshots older than retention period
import boto3
from botocore.exceptions import ClientError
from datetime import datetime, timedelta, timezone


def delete_snapshot(snapshot_id):
    print(f"Deleting snapshot {snapshot_id}")
    try:
        ec2resource = boto3.resource('ec2')
        snapshot = ec2resource.Snapshot(snapshot_id)
        snapshot.delete()
    except ClientError as e:
        print(f"Caught exception: {e}")


def lambda_handler(event, context):
    # Get current timestamp in UTC (timezone-aware)
    now = datetime.now(timezone.utc)

    # AWS Account ID
    account_id = '111111111111'
    # Define retention period in days
    retention_days = 10
    # Create EC2 client
    ec2 = boto3.client('ec2')
    # Filtering by snapshot timestamp comparison is not supported,
    # so we grab all snapshots owned by the account
    result = ec2.describe_snapshots(OwnerIds=[account_id])
    for snapshot in result['Snapshots']:
        print(f"Checking snapshot {snapshot['SnapshotId']} which was created on {snapshot['StartTime']}")
        # StartTime is timezone-aware, so it can be compared with now directly
        snapshot_time = snapshot['StartTime']
        # Subtracting snapshot time from now returns a timedelta
        # Check if the timedelta is greater than retention days
        if (now - snapshot_time) > timedelta(days=retention_days):
            print(f"Snapshot is older than configured retention of {retention_days} days")
            delete_snapshot(snapshot['SnapshotId'])
        else:
            print(f"Snapshot is newer than configured retention of {retention_days} days, so we keep it")

5. Update the Timeout to 1 minute under Configuration > General configuration > Edit. Use the same IAM role we created. I chose to have my trigger scheduled for one hour later than the other Lambda function.

6. You’ll test this Lambda function just like you tested the last. Since we won’t actually see any EBS Snapshots deleted (unless you are testing this with a snapshot outside of the retention time), we will only be able to go off our logs to verify. Looking at the Log output, you should see that the Lambda function checks the Snapshot and since it is not older than 10 days, it is not deleted.
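
Since a fresh lab has no snapshots older than the retention period, the retention check itself is hard to verify from the logs alone. A minimal sketch, under stated assumptions, is to isolate the check as a pure function and try it on made-up data. The names `is_expired`, `created_by_backup`, and `snapshots_to_delete` are illustrative. Restricting deletion to snapshots whose description matches the backup function's is an added assumption; the function above considers every snapshot owned by the account.

```python
from datetime import datetime, timedelta, timezone

# Description string set by the backup function earlier in this article.
BACKUP_DESCRIPTION = "Created by Lambda backup function ebs-snapshots"


def is_expired(start_time, now, retention_days):
    """True if the snapshot started more than `retention_days` before `now`.

    Both datetimes must be timezone-aware.
    """
    return (now - start_time) > timedelta(days=retention_days)


def created_by_backup(snapshot):
    """True if the snapshot carries the backup function's description."""
    return snapshot.get("Description") == BACKUP_DESCRIPTION


def snapshots_to_delete(snapshots, now, retention_days=10):
    """Return IDs of expired snapshots that the backup function created."""
    return [
        s["SnapshotId"]
        for s in snapshots
        if created_by_backup(s)
        and is_expired(s["StartTime"], now, retention_days)
    ]


if __name__ == "__main__":
    # Made-up sample data; the date is arbitrary.
    now = datetime(2024, 1, 15, tzinfo=timezone.utc)
    sample = [
        {"SnapshotId": "snap-old", "StartTime": now - timedelta(days=11),
         "Description": BACKUP_DESCRIPTION},
        {"SnapshotId": "snap-new", "StartTime": now - timedelta(days=2),
         "Description": BACKUP_DESCRIPTION},
        {"SnapshotId": "snap-other", "StartTime": now - timedelta(days=30),
         "Description": "manual"},
    ]
    print(snapshots_to_delete(sample, now))
```

Keeping the date arithmetic separate from the AWS calls means the retention logic can be checked with old timestamps without waiting 10 days for a real snapshot to age.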

Nerd For Tech
