Highly Available Downloads Across Clouds

Mike Perttula
Signiant Engineering
4 min read · Jul 4, 2017

We’re always told never to put all our eggs in one basket. If you have used Amazon’s Simple Storage Service (S3) in the past year, you may know why. On the morning of February 28th, 2017, Amazon had a large outage in the S3 subsystem that handles all GET, LIST, PUT, and DELETE requests in the US-East-1 region. If you used object storage in that region, you were out of luck until the outage was resolved, which in this case took hours. At Signiant, we have a redundancy strategy for our user-facing web applications, but the S3 outage highlighted a flaw in how we distribute our installable components. We went into action to address this.

In doing our post mortem, we determined that our ability to distribute software and binaries to customers could be affected by an S3 outage. S3 has its hooks in much of the functionality of AWS, and CloudFront, Amazon’s CDN solution, was specifically affected by the outage. We use CloudFront to serve a number of our software assets, such as the Signiant App (which allows customers to make use of our acceleration technology in Media Shuttle). Not having new versions available, or possibly being offline if nothing could be served by CloudFront, is not a risk we wanted to take.

X - CDN

The simplest way to make sure you are highly available for object storage is to make sure assets are served from more than one region. This is a staple at Signiant, and we already serve these assets from multiple regions. With the S3 outage, though, we found that all of CloudFront was unavailable for a period of time, so we decided to go cross-CDN to ensure our object storage for downloadable assets is not dependent on a single CDN.

Signiant has a great relationship with AWS but also with Azure. Our Media Shuttle and Flight products both allow uploading and downloading from both AWS and Azure cloud storage. It was a no-brainer to use Azure for our cross-cloud redundancy, given that we already run infrastructure there today. Azure’s CDN service makes use of Akamai and Verizon for serving content, so we get the bonus of possibly using up to three CDNs at one time if needed.

Object Storage Copy

When releasing our downloadable software, we make use of our Jenkins promotion tools to get the downloadable artifacts to S3. From there, CloudFront distributes them to its edge nodes for end-user download. Instead of investing time in modifying the promotion tools to also upload to Azure Blob storage, we created a small service called the “CDN Replication Service”. This service polls an Amazon SQS queue for changes to an S3 bucket and replicates them to Azure Blob storage.
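The polling side of such a service can be quite small. The following is a minimal Python sketch and not the actual CDN Replication Service: `drain_once` and the `receive`, `replicate`, and `delete` callables are hypothetical names introduced here for illustration. With boto3, `receive` and `delete` would typically wrap the SQS client's `receive_message` and `delete_message` calls. A message is deleted only after its copy succeeds, so a failed one stays on the queue for a later poll.

```python
from typing import Callable, Dict, Iterable, Tuple

# Assumed message shape, mirroring SQS: {"Body": ..., "ReceiptHandle": ...}
Message = Dict[str, str]


def drain_once(
    receive: Callable[[], Iterable[Message]],
    replicate: Callable[[str], None],
    delete: Callable[[str], None],
) -> Tuple[int, int]:
    """Process one batch of queue messages; return (replicated, failed)."""
    replicated = failed = 0
    for message in receive():
        try:
            replicate(message["Body"])
        except Exception:
            # Do not delete: the message stays on the queue and is
            # picked up again by a later poll.
            failed += 1
            continue
        # Only acknowledge (delete) messages whose copy succeeded.
        delete(message["ReceiptHandle"])
        replicated += 1
    return replicated, failed
```

A runner would call `drain_once` repeatedly, pausing between batches or relying on SQS long polling to avoid busy-waiting.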

S3 Events to SQS

S3 has a handy ability to send events such as PUT/POST/COPY/DELETE to other AWS services. By forwarding the upload events to a queue, our replication service is free to pull from that queue and replicate files as they are added. Since our downloadable artifacts are public, we can call Azure’s Copy Blob REST operation to pull the new or changed files directly from S3 to a storage location on Azure, so the data never passes through the service and no middleman transfer charges are incurred. Messages for failed transfers are left on the queue to be retried on the next poll.
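To make the event-to-copy step concrete, here is a hedged Python sketch of turning an S3 event message into a Copy Blob request. It is an illustration under stated assumptions, not the s3queue2blob implementation: the function names, the storage account and container arguments, and the default `api_version` value are all hypothetical choices. Authentication (a Shared Key `Authorization` header or a SAS token) is deliberately left out, and the request is only described as a dictionary rather than sent.

```python
import json
from typing import Dict, List, Tuple
from urllib.parse import quote, unquote_plus


def created_objects(message_body: str) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs for object-created records in an S3 event."""
    event = json.loads(message_body)
    found = []
    # S3's initial test message has no "Records" key, so default to empty.
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue  # deletions and other events are skipped in this sketch
        s3 = record["s3"]
        # Object keys arrive URL-encoded in event notifications.
        found.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return found


def s3_public_url(bucket: str, key: str) -> str:
    """Virtual-hosted-style URL of a publicly readable S3 object."""
    return "https://{}.s3.amazonaws.com/{}".format(bucket, quote(key))


def copy_blob_request(
    account: str,
    container: str,
    blob_name: str,
    source_url: str,
    api_version: str = "2016-05-31",  # illustrative value, an assumption
) -> Dict[str, object]:
    """Describe a Copy Blob call; authentication is intentionally omitted."""
    return {
        "method": "PUT",
        "url": "https://{}.blob.core.windows.net/{}/{}".format(
            account, container, quote(blob_name)
        ),
        # Azure fetches the source itself, so no bytes pass through the caller.
        "headers": {"x-ms-copy-source": source_url, "x-ms-version": api_version},
    }
```

Because Azure pulls from the `x-ms-copy-source` URL on its own, the replication service only ever handles small JSON messages and one HTTP request per object.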

Currently the service only handles new or changed content. A future enhancement will be to handle messages related to deletion (not required at this time as we don’t prune old versions often).

How available do you need to be?

Our failover strategy for downloadable assets may not be for everyone. Here are some things to consider when determining your failover strategy:

  • Cost: Is hosting across multiple CDNs and cloud storage worth the return you will receive in availability for customers?
  • Complexity: Are you adding additional layers of complexity and another point of failure with your HA solution?

Any cloud solution is bound to have downtime. In Signiant’s case, the cost and complexity were minimal, and we felt this change provides enough benefit to keep our software available for download 24/7.

If you are interested in our solution, you can find the s3queue2blob code on GitHub.
