AWS Multi-Account Snapshot management

Esteban Cañizal
Published in etermax technology
5 min read · Feb 9, 2018

AWS offers point-in-time snapshots. These snapshots are incremental backups of EC2 EBS volumes, so the more often you take them, the less time each one needs to complete. So far so good: cheap and simple backups. However, what if we need to do this across multiple accounts and regions? As every mortal would probably do, “we just repeat the strategy that worked, over and over, until it doesn’t make sense anymore”. That is exactly what we did, and it didn’t take long to stop making sense. Copying the script and changing names is not the best idea when you need to manage multiple accounts and regions, and our old crontab proves it:

#### ||| SNAPSHOT MANAGEMENT ||| ####
## Take weekly EC2 volume snapshots
0 5 * * 1 /usr/local/aws_scripts/volume_snapshot_create_weekly.sh
## Snapshot cleaner. Deletes 3 months old weekly snapshots. Mondays
30 5 * * 1 /usr/local/aws_scripts/volume_snapshot_delete_weekly.sh
## Take daily EC2 volume snapshots
0 4 * * * /usr/local/aws_scripts/volume_snapshot_create_daily.sh
## Snapshot cleaner. Deletes 3 months old weekly snapshots. Mondays
00 5 * * 1 /usr/local/aws_scripts/volume_snapshot_delete_daily.sh
## Take daily EC2 São Paulo volume snapshots
15 4 * * * /usr/local/aws_scripts/volume_snapshot_SP_create_daily.sh
## Snapshot cleaner São Paulo. Deletes 3 months old weekly snapshots. Mondays
15 5 * * 1 /usr/local/aws_scripts/volume_snapshot_SP_delete_daily.sh
## Take daily EC2 volume snapshots for COUCHBASE & ElasticSearch
0 3 * * * /usr/local/aws_scripts/couch-elastic_snapshot_create_daily.sh

Solution overview

Our solution is a combination of “Tags” + “Ruby” + “Crontab” (the Unix task scheduler).

Tags: We decided which volumes to back up and added the tag “snapshot-daily:true” to each of them.

Ruby: Scripts for creating snapshots, deleting snapshots and monitoring snapshot creation. We chose Ruby over Bash because API integration is easier as tasks get more complex. We have nothing against Python or any other SDK; we had to choose one, and we had already done some other fun things with Ruby.

Crontab: We scheduled the three tasks accordingly.

Steps

  • Tag volumes. Open the AWS console, go to Volumes, select the volumes you need to back up (one at a time), then open the Tags tab and set the key:value pair “snapshot-daily:true”.
  • Launch an instance from a Linux AMI; a t2.micro should be fine. Install Ruby and the ‘aws-sdk’ gem on it.
    # apt-get update && apt-get install ruby rubygems-integration && gem install aws-sdk
  • Place the following Ruby scripts under a directory such as /usr/local/aws_tasks.

# mkdir -p /usr/local/aws_tasks && cd /usr/local/aws_tasks && vi create_daily_snapshots.rb

#!/usr/bin/env ruby
# Creates snapshots of the volumes tagged snapshot-daily:true in every account.
require_relative './library/aws_ec2_client_for'
require_relative './library/aws_volumes_filter_by_tag'
require_relative './library/aws_get_instance_tagname'
require 'time'
require 'yaml'

today = Time.now.strftime('%Y-%m-%d')
creds = YAML.load(File.read('aws-credentials.yaml'))

creds.each do |account|
  volumes_to_snapshot = aws_volumes_filter_by_tag(account, 'snapshot-daily', 'true')
  ec2 = aws_ec2_client_for(account)
  volumes = ec2.describe_volumes.volumes # fetched once per account
  volumes_to_snapshot.each do |vol|
    volumes.each do |volume|
      next unless volume.volume_id == vol
      # A tagged volume may be detached, so guard against a missing attachment.
      attachment = volume.attachments[0]
      instance_id = attachment ? attachment.instance_id : 'unattached'
      instance_name = attachment ? aws_get_instance_tagname(account, instance_id) : 'no instance'
      ec2.create_snapshot(volume_id: vol, description: "#{instance_name} (#{instance_id}) taken on #{today} by Infra")
    end
  end
end
  • It is important to keep only a limited number of snapshots to avoid incurring unnecessary charges.

# vi remove_daily_snapshots.rb

#!/usr/bin/env ruby
# Removes snapshots older than retention_cutoff for the volumes tagged snapshot-daily:true.
require_relative './library/aws_ec2_client_for'
require_relative './library/aws_volumes_filter_by_tag'
require 'time'
require 'yaml'

retention_cutoff = Time.now - (60 * 60 * 24 * 30) # 30 days ago
creds = YAML.load(File.read('aws-credentials.yaml'))

creds.each do |account|
  volumes_to_snapshot = aws_volumes_filter_by_tag(account, 'snapshot-daily', 'true')
  ec2 = aws_ec2_client_for(account)
  # Restrict the listing to this account's snapshots; without owner_ids, public snapshots are listed too.
  snapshots = ec2.describe_snapshots(owner_ids: ['self']).snapshots
  volumes_to_snapshot.each do |vol|
    snapshots.each do |snap|
      next unless snap.volume_id == vol && snap.start_time < retention_cutoff
      begin
        ec2.delete_snapshot(snapshot_id: snap.snapshot_id)
      rescue Aws::EC2::Errors::ServiceError
        puts "It is NOT possible to remove #{snap.snapshot_id} in account: #{account[0]}"
      else
        puts "#{snap.snapshot_id} successfully removed in account: #{account[0]}"
      end
    end
  end
end
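
The retention rule in this script can be tried out without touching AWS. The following is a minimal sketch, not part of our scripts: `SnapshotInfo` and `expired_snapshots` are hypothetical names, and the struct only mimics the three snapshot attributes the rule needs.

```ruby
# Hypothetical stand-in for the snapshot objects returned by the SDK.
SnapshotInfo = Struct.new(:snapshot_id, :volume_id, :start_time)

# Returns the snapshots of the given volumes that started before the cutoff.
def expired_snapshots(snapshots, volume_ids, now: Time.now, retention_days: 30)
  cutoff = now - (retention_days * 24 * 60 * 60)
  snapshots.select do |snap|
    volume_ids.include?(snap.volume_id) && snap.start_time < cutoff
  end
end

now = Time.utc(2018, 2, 9)
snapshots = [
  SnapshotInfo.new('snap-old', 'vol-1', Time.utc(2018, 1, 1)),    # older than 30 days
  SnapshotInfo.new('snap-new', 'vol-1', Time.utc(2018, 2, 8)),    # recent, kept
  SnapshotInfo.new('snap-other', 'vol-9', Time.utc(2017, 12, 1))  # volume not tagged, kept
]

puts expired_snapshots(snapshots, ['vol-1'], now: now).map(&:snapshot_id).inspect
# prints ["snap-old"]
```

Passing `now` explicitly makes the cutoff reproducible, which helps when checking a new retention period before letting the script delete anything.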
  • Since we didn’t use this t2.micro instance only for these tasks, sometimes we used more CPU credits than we had available and the instance would stall, so we decided to monitor the snapshots every day to make the solution more reliable.

# vi monitor_daily_snapshots.rb

#!/usr/bin/env ruby
# Emails a report of the volumes tagged snapshot-daily:true that have no completed snapshot today.
require_relative './library/aws_ec2_client_for'
require_relative './library/aws_volumes_filter_by_tag'
require_relative './library/aws_ses_send_email'
require 'time'
require 'yaml'

today = Time.now.strftime('%Y-%m-%d')
creds = YAML.load(File.read('aws-credentials.yaml'))

creds.each do |account|
  ec2 = aws_ec2_client_for(account)
  volumes_snapshoted = []

  volumes_to_snapshot = aws_volumes_filter_by_tag(account, 'snapshot-daily', 'true')
  # Restrict the listing to this account's snapshots; without owner_ids, public snapshots are listed too.
  snapshots = ec2.describe_snapshots(owner_ids: ['self']).snapshots
  volumes_to_snapshot.each do |vol|
    snapshots.each do |snap|
      next unless snap.volume_id == vol
      volumes_snapshoted << vol if snap.start_time.strftime('%Y-%m-%d') == today && snap.progress == '100%'
    end
  end
  volumes_not_snapshoted = volumes_to_snapshot - volumes_snapshoted.uniq
  volumes_count = volumes_not_snapshoted.count
  if volumes_count > 0
    aws_ses_send_email("In account: #{account[0]} snapshot could NOT be taken: #{volumes_count} volume/s today", "#{volumes_not_snapshoted}")
  end
end
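
The check at the heart of the monitor is a set difference: tagged volumes minus volumes with a completed snapshot today. Here is a minimal offline sketch of that idea; `SnapshotStatus` and `volumes_missing_snapshot` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for the snapshot attributes the monitor reads.
SnapshotStatus = Struct.new(:volume_id, :start_time, :progress)

# Returns the volume IDs that have no completed snapshot dated `today`.
def volumes_missing_snapshot(volume_ids, snapshots, today)
  done = snapshots.select do |snap|
    snap.start_time.strftime('%Y-%m-%d') == today && snap.progress == '100%'
  end
  volume_ids - done.map(&:volume_id).uniq
end

snapshots = [
  SnapshotStatus.new('vol-1', Time.utc(2018, 2, 9, 9, 0), '100%'), # completed today
  SnapshotStatus.new('vol-2', Time.utc(2018, 2, 9, 9, 0), '45%'),  # still in progress
  SnapshotStatus.new('vol-3', Time.utc(2018, 2, 8, 9, 0), '100%')  # yesterday's snapshot
]

puts volumes_missing_snapshot(%w[vol-1 vol-2 vol-3], snapshots, '2018-02-09').inspect
# prints ["vol-2", "vol-3"]
```

A snapshot that is still in progress counts as missing, which is why the monitor should run well after the creation job.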
  • At this point you should have something like this:
# ls -l /usr/local/aws_tasks
create_daily_snapshots.rb
monitor_daily_snapshots.rb
remove_daily_snapshots.rb
  • These three tasks depend on some smaller scripts we wrote to act as libraries (to reuse code). Put the library scripts in place.

# mkdir -p /usr/local/aws_tasks/library && cd /usr/local/aws_tasks/library

# vi aws_ec2_client_for.rb

#!/usr/bin/env ruby
# Builds an EC2 client for one [name, settings] entry of aws-credentials.yaml.
require 'aws-sdk'
require 'yaml'

def aws_ec2_client_for(account)
  Aws::EC2::Client.new(access_key_id: account[1]['access_key_id'],
                       secret_access_key: account[1]['secret_access_key'],
                       region: account[1]['region'])
end

# vi aws_volumes_filter_by_tag.rb

#!/usr/bin/env ruby
# Given a tag (key, value) already set on volumes, returns the IDs of the volumes that carry it.
require 'aws-sdk'
require_relative 'aws_ec2_resource_for'

def aws_volumes_filter_by_tag(account, tag_key, tag_value)
  key_value = []
  ec2 = aws_ec2_resource_for(account)
  ec2.volumes(filters: [{ name: "tag:#{tag_key}", values: [tag_value] }]).each do |volume|
    key_value << volume.id
  end
  key_value
end
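
The script above requires library/aws_ec2_resource_for.rb, which is not listed in this article. What follows is only a guess at its shape, assuming it mirrors aws_ec2_client_for but returns an `Aws::EC2::Resource`; the helper `aws_options_for` is a hypothetical name introduced here.

```ruby
# Hypothetical library/aws_ec2_resource_for.rb (this file is not shown in the article).

# Turns one [name, settings] entry of aws-credentials.yaml into SDK options.
def aws_options_for(account)
  settings = account[1]
  {
    access_key_id: settings['access_key_id'],
    secret_access_key: settings['secret_access_key'],
    region: settings['region']
  }
end

# Builds the resource-style EC2 interface used for ec2.volumes and ec2.instances.
def aws_ec2_resource_for(account)
  require 'aws-sdk' # loaded on first use, so aws_options_for works without the gem
  Aws::EC2::Resource.new(aws_options_for(account))
end

account = ['account_one', { 'access_key_id' => 'AKIAEXAMPLE',
                            'secret_access_key' => 'xxxxxxxxxx',
                            'region' => 'us-east-1' }]
puts aws_options_for(account)[:region]
# prints us-east-1
```

Keeping the option mapping in one helper would also let aws_ec2_client_for reuse it, so the YAML keys are read in a single place.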

# vim aws_get_instance_tagname.rb

#!/usr/bin/env ruby
# Given an AWS instance_id, returns the value of its Name tag, or nil if it is not set.
require 'aws-sdk'
require_relative 'aws_ec2_resource_for'

def aws_get_instance_tagname(account, instance_id)
  ec2 = aws_ec2_resource_for(account)
  ec2.instances.each do |instance|
    next unless instance.id == instance_id
    tag_name = instance.tags.find { |t| t.key == 'Name' }
    return tag_name ? tag_name.value : nil
  end
  nil # instance not found
end
  • As we mentioned at the very beginning of this article, this solution is for multiple AWS accounts and regions. In this example we use just two accounts, but you can keep adding as many as you need, one after the other. Let’s create the “aws-credentials.yaml” file.

# cd /usr/local/aws_tasks && vim aws-credentials.yaml

account_one:
  access_key_id: ACCESS_KEY_FOR_ACCOUNT_ONE
  secret_access_key: xxxxxxxxxx
  region: us-east-1
account_two:
  access_key_id: ACCESS_KEY_FOR_ACCOUNT_TWO
  secret_access_key: xxxxxxxxxx
  region: us-east-1
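
To see why the scripts index each entry as `account[0]` and `account[1]`, here is a short sketch of how this file loads. It inlines the YAML instead of reading the file, and the third entry (`account_one_sp`, reusing the first account's key in sa-east-1) is an assumption that illustrates one way to cover a second region, not something taken from our setup.

```ruby
require 'yaml'

# Inline copy of aws-credentials.yaml; account_one_sp is a hypothetical extra entry.
creds = YAML.load(<<~CONFIG)
  account_one:
    access_key_id: ACCESS_KEY_FOR_ACCOUNT_ONE
    secret_access_key: xxxxxxxxxx
    region: us-east-1
  account_two:
    access_key_id: ACCESS_KEY_FOR_ACCOUNT_TWO
    secret_access_key: xxxxxxxxxx
    region: us-east-1
  account_one_sp:
    access_key_id: ACCESS_KEY_FOR_ACCOUNT_ONE
    secret_access_key: xxxxxxxxxx
    region: sa-east-1
CONFIG

# Iterating a Hash yields [key, value] pairs: account[0] is the entry name,
# account[1] is the settings hash with string keys.
creds.each do |account|
  puts "#{account[0]}: #{account[1]['region']}"
end
```

Since every entry carries exactly one region, each account and region combination gets its own entry in the file.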
  • Now it’s time to schedule these tasks.

# crontab -e

#### ||| EC2 SNAPSHOT MANAGEMENT ||| ####
# cd first: the scripts read aws-credentials.yaml from the working directory
00 09 * * * cd /usr/local/aws_tasks && ruby create_daily_snapshots.rb
21 13 * * * cd /usr/local/aws_tasks && ruby remove_daily_snapshots.rb
00 13 * * * cd /usr/local/aws_tasks && ruby monitor_daily_snapshots.rb
  • Finally, once create_daily_snapshots.rb has run, you should be able to see the snapshots already taken.

Summary

In this post, we showed our solution to automate and schedule snapshot management: “create, delete and maintain snapshots”.
It can be applied across one or many AWS accounts and regions by running Ruby scripts on a very small Linux instance that only needs to be running during the times scheduled in cron. We are working on another article that shows how to stop and start instances automatically.
