Kevin Cantwell
Sep 16, 2016 · 2 min read

It was a combination of cost, the lack of a relational model, and our choice not to replicate a backup because doing so was prohibitively expensive. We had something like 100TB of data stored in DynamoDB, which was expensive. We also found it frustrating not to have good inspectability into our data: we never knew how much of it was stale, incorrect, or missing. Nor could we easily or quickly analyze the data, since it’s essentially a key-value store. And then there was the outage last September (https://aws.amazon.com/message/5467D2/) that for us, and nobody else as far as I know, lasted a whole day because of our table size, and required Amazon to provide us with a “special” endpoint to use for several weeks afterwards.

Aurora was pretty new, but we had a relationship with the product lead (who was formerly the product lead of DynamoDB) and felt confident that we’d have someone’s ear if a problem arose, so we began to explore that option right away. What we found was that Aurora gave us essentially the same performance but came with all the benefits of a standard SQL database. We were new to the MySQL interface, but there’s a reason lots of people use it. The learning curve was close to flat.

There were some drawbacks to the migration. We did have to rewrite big portions of our code to interact with Aurora instead of DynamoDB, and it took several weeks to evaluate and tweak that code for best performance. We also found ourselves constrained by the 64TB size limit of an Aurora cluster. That could theoretically be solved by using multiple clusters, but we ended up changing our storage pattern in a way that greatly reduced the amount of data we need to store at any given moment (a pattern that would not have been easy to implement in DynamoDB).

The benefits have been significant. We now have fully relational data with multiple indexes that we can add or change with relative ease; a MySQL interface that “just works” with every database library you can think of; fully managed read replicas; and a much cheaper AWS bill. We were also able to eliminate a few systems that we had put in place solely to get optimal performance from DynamoDB.

I could write another whole blog post on the benefits of Aurora and have been asked to do so on several occasions. Sadly, I haven’t found the time yet to make it happen.

Written by Kevin Cantwell, Data Engineering Manager @ ns1.com