Digging into the New AWS Backup Service

Kenneth Hui
8 min read · Jan 22, 2019

Full disclosure: I work for a vendor that offers a cloud data management/protection solution. However, this is my personal blog; I will be evaluating this new AWS service on its own merits and will not be mentioning my company or any other that may offer an alternative.

Last week, Amazon Web Services (AWS) announced their new AWS Backup service, which is designed “to centralize and automate the back up of data across AWS services in the cloud as well as on premises using the AWS Storage Gateway.” This is Amazon’s first attempt to create a single solution that orchestrates and manages the life cycle of backups across their vast portfolio of services. In this v1 release, AWS Backup supports backing up Amazon Elastic Block Store (EBS), Amazon Relational Database Service (RDS), Amazon DynamoDB, Amazon Elastic File System (EFS), and AWS Storage Gateway.

A solution for centralizing backups across AWS has been in high demand among large enterprise customers who are used to deploying traditional backup solutions in their data centers that schedule backups through “jobs” and handle tasks such as expiring/deleting older backups over time. What enterprises do NOT want is to manually manage their own backups, create their own tools via scripts or AWS Lambda functions, or use a point solution for each application they have to protect. AWS Backup aims to be that single point of management and source of truth that customers can rely on.

In this post, I want to take a deeper look into AWS Backup than what is readily available online. Specifically, I want to look at the architecture (insofar as I can deduce it) of AWS Backup and how it operates under the hood. Note that I will not be providing a product overview or a walk-through of how to configure AWS Backup. Esteemed AWS Technical Evangelist, Jeff Barr, does a great job of that in his blog post, and I would actually recommend reading that first.

Architecture

The AWS Backup product page provides the following very high-level diagram:

Similar to what I did when AWS Secrets Manager was first announced, I dug through the AWS Backup documentation and FAQs to see if I could piece together a more detailed architecture. The diagram below is what I speculate the current AWS Backup architecture looks like.

Some observations on this possible architecture:

  • As you would expect, most of the service is built from existing AWS services and primitives.
  • The policy engine, which seems to be purpose-built for AWS Backup, is what the user interacts with to create their backup plans and backup rules.
  • The policy engine likely interacts with a scheduler, which is probably also purpose-built, that initiates time-based tasks passed to it by the policy engine. This would include things like backup frequency, backup windows, and backup expiration times.
  • Since AWS Backup allows users to define a custom backup frequency using cron expressions (see the sketch after this list), it’s likely that all the pre-built backup frequency options are also implemented as cron jobs that the scheduler enables when a user chooses them.
  • The documentation makes it clear that AWS Backup leverages, when possible, the existing backup capabilities inherent in each service it supports. I dig more into that in the next section. Note that when AWS talks about backups, they typically mean snapshots.
  • When a backup job kicks off, AWS Backup likely spawns some task execution handlers that initiate and manage the various API calls to the service being backed up. These handlers are likely to use AWS Lambda.
  • Backup vaults are essentially Amazon Simple Storage Service (S3) buckets that are hidden from direct management by a user. It’s not clear to me if all snapshots are being stored in the same S3 bucket or if a backup vault is an abstraction that front-ends multiple hidden buckets.
  • Since EFS does not have a native snapshot capability, a different process had to be created to enable EFS backups. More on that later.
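
To make the cron point concrete, here is a minimal boto3 sketch of what a backup plan with a custom cron schedule looks like through the API. The plan name, rule name, and lifecycle values are my own illustrative choices:

    import boto3

    backup = boto3.client("backup")

    # A daily 3 AM rule defined with a cron expression; backups expire
    # after 35 days. All names and values here are illustrative.
    backup.create_backup_plan(
        BackupPlan={
            "BackupPlanName": "daily-ebs-plan",
            "Rules": [
                {
                    "RuleName": "daily-3am",
                    "TargetBackupVaultName": "Default",
                    "ScheduleExpression": "cron(0 3 * * ? *)",
                    "StartWindowMinutes": 60,
                    "Lifecycle": {"DeleteAfterDays": 35},
                }
            ],
        }
    )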

Amazon EBS, RDS, and DynamoDB Backups

AWS Backup uses existing capabilities to back up and restore EBS, RDS, and DynamoDB. We can confirm this very easily by looking at the AWS CloudTrail events.

Here is the sequence of API calls when an EBS backup job is started:

You can see here that after the IAM user (khui), who is the backup admin, initiates a backup job, the AWS Backup service assumes an AWS Backup service role. In the context of this role, a CreateSnapshot call is initiated using the EC2 API.

Above, you can see the CreateSnapshot call being logged, the EBS volume that was protected, and the tag that was applied to the snapshot. Though I don’t show it, a DescribeSnapshots API call is made afterwards to confirm the snapshot has been created. You can see the new snapshot in the EC2 dashboard, just as you would if the snapshot had been initiated outside of AWS Backup.
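
For comparison, here is roughly what the equivalent direct EC2 calls look like in boto3; the volume ID is hypothetical:

    import boto3

    ec2 = boto3.client("ec2")

    # Snapshot an EBS volume directly, as AWS Backup does on our behalf.
    snap = ec2.create_snapshot(
        VolumeId="vol-0123456789abcdef0",  # hypothetical volume
        Description="Snapshot taken outside of AWS Backup",
    )

    # Wait for completion, analogous to AWS Backup's follow-up
    # DescribeSnapshots polling.
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])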

The same sequence occurs when an RDS backup job is initiated, and you can see below a CloudTrail event logged for a standard CreateDBSnapshot RDS API call. A DescribeDBSnapshots API call is made afterwards.

Again, you can see the RDS snapshot in the RDS dashboard, as would be the case if you initiated a backup outside of AWS Backup.
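
The direct RDS equivalent, again with hypothetical identifiers, is just as simple:

    import boto3

    rds = boto3.client("rds")

    # Create an RDS snapshot and then check on it, mirroring the
    # CreateDBSnapshot/DescribeDBSnapshots events above.
    rds.create_db_snapshot(
        DBInstanceIdentifier="mydb",  # hypothetical instance
        DBSnapshotIdentifier="mydb-manual-2019-01-22",
    )
    rds.describe_db_snapshots(DBSnapshotIdentifier="mydb-manual-2019-01-22")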

Just for completeness, here is the CloudTrail event for a DynamoDB backup using DynamoDB’s standard “CreateBackup” API. A DescribeBackup API call is made afterwards.

As with the other two services, you can see the snapshot in the DynamoDB dashboard.
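
And the native DynamoDB calls, with hypothetical table and backup names:

    import boto3

    ddb = boto3.client("dynamodb")

    # DynamoDB's native on-demand backup, the same API AWS Backup calls.
    resp = ddb.create_backup(TableName="orders", BackupName="orders-2019-01-22")

    # Check on the backup afterwards, as AWS Backup does.
    ddb.describe_backup(BackupArn=resp["BackupDetails"]["BackupArn"])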

I don’t have Storage Gateway set up in the lab, but the backup process is the same, with AWS Backup assuming a service role and making the necessary API calls. I won’t bore everyone by walking through the API calls to perform a restore. Essentially, the AWS Backup service assumes a restore service role to make the relevant restore API calls.
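
For anyone who wants to poke at restores themselves, a restore job can be started through the AWS Backup API directly. This is a minimal sketch with hypothetical ARNs; GetRecoveryPointRestoreMetadata returns the resource-specific metadata that StartRestoreJob expects:

    import boto3

    backup = boto3.client("backup")

    # Both ARNs are hypothetical.
    recovery_point = "arn:aws:ec2:us-east-1::snapshot/snap-0123456789abcdef0"
    role = "arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole"

    # Fetch the restore metadata stored with the recovery point...
    meta = backup.get_recovery_point_restore_metadata(
        BackupVaultName="Default",
        RecoveryPointArn=recovery_point,
    )

    # ...and hand it to StartRestoreJob, which assumes the role to make
    # the underlying service's restore API calls.
    backup.start_restore_job(
        RecoveryPointArn=recovery_point,
        Metadata=meta["RestoreMetadata"],
        IamRoleArn=role,
    )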

One thing I do want to call out is how EBS restores are handled by AWS Backup in contrast to RDS and DynamoDB restores. RDS and DynamoDB are managed database services, which means that for restores, not only are new database instances created from the snapshots, but those database instances are also made available and automatically attached to database servers. These servers are not visible to the user, and the process of mounting databases to them is handled by AWS.

With EBS restores, a new EBS volume is created from a snapshot, but the process of attaching the volume to an EC2 instance is the responsibility of the user. The diagram below outlines the workflow for replacing an EBS volume attached to an EC2 instance with the new volume. You can see that AWS Backup handles the first step only.
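
As a sketch of those remaining user-owned steps in boto3 (all IDs are hypothetical, and this assumes the instance can tolerate a stop/start):

    import boto3

    ec2 = boto3.client("ec2")

    instance = "i-0123456789abcdef0"   # hypothetical instance
    old_vol = "vol-0aaaaaaaaaaaaaaaa"  # volume being replaced (hypothetical)
    new_vol = "vol-0bbbbbbbbbbbbbbbb"  # volume AWS Backup restored (hypothetical)

    # Stop the instance so the data volume can be swapped safely.
    ec2.stop_instances(InstanceIds=[instance])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance])

    # Detach the old volume and attach the restored one in its place.
    ec2.detach_volume(VolumeId=old_vol)
    ec2.get_waiter("volume_available").wait(VolumeIds=[old_vol])
    ec2.attach_volume(VolumeId=new_vol, InstanceId=instance, Device="/dev/xvdf")

    ec2.start_instances(InstanceIds=[instance])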

Amazon EFS Backups

Amazon EFS backups are handled differently than the other services because EFS does not have native snapshot or other backup capabilities built into the service. Prior to AWS Backup, Amazon offered two point solutions that have received a mixed reception from customers. For example, Amazon offers an AWS CloudFormation template that deploys an EFS-to-EFS backup solution. This solution uses an EC2 instance that mounts the source file system and copies files to a mounted target file system. The other solution uses AWS Data Pipeline to copy files using rsync.

Both are “side-car” solutions rather than native capabilities built into the EFS service, and neither exposes specific APIs that AWS Backup could call. Looking at the CloudTrail events when an EFS backup is initiated, we can make some reasonable deductions about how AWS Backup handles this.

There is a “StartBackupJob” event initiated by the AWS Backup service, but it is not followed by something like a “CreateEFSSnapshot” API call, since no such call exists. Instead, what we see is a series of “Describe” events indicating that AWS Backup is listing the configuration of the file system to be protected.
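
For reference, the on-demand job that produces this StartBackupJob event can be started directly; this boto3 sketch uses hypothetical ARNs:

    import boto3

    backup = boto3.client("backup")

    # Start an on-demand EFS backup job; both ARNs are hypothetical.
    job = backup.start_backup_job(
        BackupVaultName="Default",
        ResourceArn="arn:aws:elasticfilesystem:us-east-1:123456789012:file-system/fs-01234567",
        IamRoleArn="arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole",
    )

    # Poll the job status; for EFS there is no snapshot API to watch.
    backup.describe_backup_job(BackupJobId=job["BackupJobId"])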

What is worth noting is that there are no other related API calls before the backup job finishes. My speculation is that, behind the scenes, AWS Backup is launching one or more EC2 instances that mount the source file system and back it up. Data movement could be performed using rsync, Data Pipeline, or some other software running on the EC2 instances launched by AWS Backup.

If I am correct, this would be similar to the two point solutions I mentioned earlier except that the backups are stored in S3 and Glacier. I am not sure if files are copied to S3 during the backup process or if files are first copied to an interim target file system and then later moved to S3.

I would speculate that EFS restores work in a similar way, with an AWS Backup restore job launching an EC2 instance that copies files from the recovery point back to the original file system or to a new file system. Unfortunately, I don’t have a way to see if the files are being recovered directly from S3 or if they have to be recovered to an interim file system before being copied to the original file system.

Concluding Thoughts

I consider AWS Backup to be a solid v1 release that Amazon can build on over time. It can already claim a place as the backup product with the broadest integration across the AWS services portfolio. I expect the service to grow broader and deeper, including:

  • Support for additional services such as Amazon Aurora, Amazon DocumentDB, Amazon ElastiCache, and Amazon FSx
  • Deeper restore capabilities, particularly for restores of EBS volumes. As I wrote earlier, the process to restore and attach a volume is mostly the responsibility of the user. I would expect more orchestration to automate the process.
  • Integration with Amazon Machine Image (AMI) creation so that an entire EC2 instance can be protected, including attached EBS volumes
  • Integration with AMIs so that an entire EC2 instance can be restored
  • Cross-Region and cross-account restore of data

AWS Backup aims to solve a huge problem for enterprises that need a simpler approach to managing data protection across their AWS estate. I am looking forward to seeing what future enhancements Amazon will be adding to the service and how third-party vendors will respond.


Kenneth Hui

Ken is the Service Solutions Architect Leader for the Amazon Web Services (AWS) Data Protection Team. He is passionate about Cloud Computing and great food.