Rein In EBS Snapshots, Before They Get Out of Control!

by Rachel Dines

The Trouble with EBS Snapshots

AWS recently announced a price drop on EBS snapshots. While this is great news (and a long time coming), it’s no excuse for letting old snapshots pile up. This is a more common problem than you may think. Many organizations use EBS snapshots to create point-in-time recovery points to use in case of data loss or disaster. However, snapshot costs can climb quickly if they are not closely monitored. An individual snapshot is not costly, but the total grows quickly when many are created. A compounding factor is that users can configure snapshots to be created automatically every day without ever scheduling older snapshots for deletion.

Setting Sensible Snapshot Retention Cycles

Without naming names, I once heard a tale of woe about a B2B SaaS company that found that, among its millions of EBS snapshots, a large percentage were more than two years old! Traditional on-premises backup schemes, such as a variant of the grandfather-father-son approach, still apply in a cloud world. You don’t need to keep every single point-in-time copy. Most organizations will opt for a retention schedule that looks something like this:

  • Hourly snapshots are kept for 1 day (note that only mission-critical systems with low recovery point objectives will need hourly snapshots)
  • Nightly snapshots are kept for 1 week
  • Weekly snapshots are kept for 1 month
  • Monthly snapshots are kept for 1 year

That’s it. I’ve seen some companies opt for much shorter retention cycles than I describe above, as short as 60 or 90 days. Best practice is to set a standard in your organization for how many snapshots should be retained, according to your data classification strategy. This will vary by environment and criticality. For example, critical production data may get hourly snapshots that are kept for 2 weeks, then daily snapshots that are kept for 4 weeks, and weekly snapshots that are kept for 1 year. In a development environment, however, you may opt for only daily snapshots that are kept for 1 week, and weekly snapshots that are kept for 4 weeks. Whatever corporate standard you decide on, remember that the majority of the time, a recovery will occur from the most recent snapshot.
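To make the tiered schedule above concrete, here is a minimal Python sketch of how a retention rule could decide which snapshots to keep. It is an illustration under stated assumptions, not how any particular tool implements retention: the tier table, the function names, and the choice to keep the newest snapshot in each hour, day, week, or month are all hypothetical, and "1 month" and "1 year" are approximated as 30 and 365 days.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical tier table mirroring the default schedule in this post:
# (granularity, how long snapshots at that granularity are kept).
DEFAULT_TIERS = [
    ("hourly", timedelta(days=1)),
    ("daily", timedelta(weeks=1)),
    ("weekly", timedelta(days=30)),    # "1 month", approximated
    ("monthly", timedelta(days=365)),  # "1 year", approximated
]


def bucket(ts, granularity):
    """Return a key identifying the hour, day, ISO week, or month of ts."""
    if granularity == "hourly":
        return (ts.year, ts.month, ts.day, ts.hour)
    if granularity == "daily":
        return (ts.year, ts.month, ts.day)
    if granularity == "weekly":
        iso = ts.isocalendar()
        return (iso[0], iso[1])  # ISO year and ISO week number
    if granularity == "monthly":
        return (ts.year, ts.month)
    raise ValueError(f"unknown granularity: {granularity}")


def snapshots_to_keep(timestamps, now, tiers=DEFAULT_TIERS):
    """Keep the newest snapshot per bucket, for each tier, within its window."""
    newest_first = sorted(timestamps, reverse=True)
    keep = set()
    for granularity, max_age in tiers:
        seen = set()
        for ts in newest_first:
            if now - ts > max_age:
                break  # everything after this is older still
            key = bucket(ts, granularity)
            if key not in seen:
                seen.add(key)
                keep.add(ts)
    return keep


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    # Simulate an hourly snapshot job that never deletes anything.
    hourly = [now - timedelta(hours=h) for h in range(24 * 400)]
    kept = snapshots_to_keep(hourly, now)
    print(f"{len(hourly)} snapshots taken, {len(kept)} retained")
```

A stricter or looser standard, such as the production and development examples above, would only change the tier table passed in as `tiers`. Anything not returned by `snapshots_to_keep` is a candidate for deletion, subject to whatever approval step your organization requires.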

Taking Back Control

Organizations can get EBS snapshots back under control by monitoring snapshot cost and usage per instance to catch spikes early. If you have a tool like CloudHealth, this is very straightforward. You can subscribe to a report that shows your EBS snapshot costs over time, and set an alert to tell you if they exceed a certain amount.

As you can see, the majority of our snapshots are coming from production assets, which makes sense. Looks like we did a big cleanup in March, good job team! Since then, we’ve kept our snapshot usage under control. Without any automation, that could be a really time-consuming task. The good news is that CloudHealth also has governance policies that can run and automatically apply our unique snapshot policy for us.

In about 5 clicks, I just wrote a policy that looks for any snapshots in production that are older than 4 weeks, emails me to ask for approval to delete each one, and then has CloudHealth delete the snapshot on my behalf. Just for fun, I ran this policy and immediately found 53 snapshots totalling more than 5TB that were eligible for deletion according to my policy! If I deleted these, I could save more than $250 a month. I also noticed that several of these snapshots do not have a volume associated with them. What likely happened is that a volume was deleted, but the snapshots stayed behind. These may also be good candidates for deletion, although I need to be careful not to delete snapshots that are still relied on to create volumes for instances, or snapshots that are meant to be kept for archival purposes.
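For readers who want to see the logic behind a policy like this, the following Python sketch flags snapshots older than a cutoff, flags snapshots whose source volume no longer exists, and estimates the monthly cost of a set of snapshots. It is not CloudHealth's implementation. It assumes each snapshot is a dictionary using the `SnapshotId`, `StartTime`, `VolumeId`, and `VolumeSize` field names that the EC2 DescribeSnapshots API returns; the function names and the $0.05 per GB-month price are assumptions for illustration. The sketch only reports candidates and never deletes anything, in keeping with the approval step described above.

```python
from datetime import datetime, timedelta, timezone

# Assumed price in USD per GB-month; check current pricing for your region.
ASSUMED_PRICE_PER_GB_MONTH = 0.05


def deletion_candidates(snapshots, now, max_age=timedelta(weeks=4),
                        live_volume_ids=frozenset()):
    """Return (expired, orphaned) lists of snapshot dicts.

    expired:  snapshots started more than max_age before now.
    orphaned: snapshots whose source volume is not in live_volume_ids.
    """
    expired = [s for s in snapshots if now - s["StartTime"] > max_age]
    orphaned = [s for s in snapshots if s["VolumeId"] not in live_volume_ids]
    return expired, orphaned


def estimated_monthly_cost(snapshots, price=ASSUMED_PRICE_PER_GB_MONTH):
    """Rough upper bound: VolumeSize is the source volume's size in GiB,
    while snapshots are incremental and billed on data actually stored."""
    return sum(s["VolumeSize"] for s in snapshots) * price


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    # Made-up sample data for illustration only.
    snapshots = [
        {"SnapshotId": "snap-a", "StartTime": now - timedelta(days=40),
         "VolumeId": "vol-1", "VolumeSize": 100},
        {"SnapshotId": "snap-b", "StartTime": now - timedelta(days=2),
         "VolumeId": "vol-gone", "VolumeSize": 50},
    ]
    expired, orphaned = deletion_candidates(
        snapshots, now, live_volume_ids={"vol-1"})
    print("expired:", [s["SnapshotId"] for s in expired])
    print("orphaned:", [s["SnapshotId"] for s in orphaned])
    print("estimated monthly cost of expired:", estimated_monthly_cost(expired))
```

At the assumed price, 5TB of snapshot data works out to roughly 5,120 GB × $0.05, or about $256 a month, which is in line with the savings figure quoted above. Orphaned snapshots are reported separately because some of them may be intentional archives, so they deserve a manual review before anything is removed.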

Next Steps

What are you waiting for? Get started looking for old EBS snapshots that you can delete and quickly slim down your AWS bill. If you’re looking for more quick tips on reducing spend in AWS, I recommend you check out our ebook: 10 Best Practices for Reducing Spend in AWS.

Originally published at