How do I save DynamoDB data relationally at scale?
With S3, Glue and Athena
In this article, I demonstrate a solution that transforms item-level changes in DynamoDB tables and loads them into a relational table in the Athena Data Catalog for analytics. This addresses a pain point of DynamoDB, which does not fully support complex queries, and improves analytical performance on large volumes of data.
DynamoDB is a key-value and document database fully managed by AWS. Its NoSQL nature provides users with flexibility, since it does not require a schema on write. On the other side of the coin, this feature can be an obstacle when we need to run complex queries on the data stored in DynamoDB.
Here are two situations that explain why DynamoDB is not a good fit for analytical queries. First, when we need to filter records on conditions involving non-key attributes, the only feasible solution is to scan the whole table. This is both time-consuming and costly when the data volume is very large. Creating indexes on those attributes can help, but that approach takes extra storage to maintain each index. Second, if we need to search across more than one DynamoDB table, it is laborious because DynamoDB does not support JOIN. In short, despite its many features…
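To make the first situation concrete, here is a minimal sketch of the request parameters for a filtered Scan next to a key-based Query. The table name `orders` and the attributes `order_id` and `status` are hypothetical and used only for illustration. The parameter names follow the low-level DynamoDB API. The functions only build the request dictionaries, so nothing here calls AWS.

```python
from typing import Any, Dict


def build_scan_request(table: str, attr: str, value: str) -> Dict[str, Any]:
    """Build Scan parameters that filter on a non-key attribute."""
    return {
        "TableName": table,
        # The filter is applied after items are read, so every item
        # in the table is still read and billed.
        "FilterExpression": "#a = :v",
        "ExpressionAttributeNames": {"#a": attr},
        "ExpressionAttributeValues": {":v": {"S": value}},
    }


def build_query_request(table: str, key_attr: str, key_value: str) -> Dict[str, Any]:
    """Build Query parameters that look up items by partition key."""
    return {
        "TableName": table,
        # A key condition lets DynamoDB read only the matching partition.
        "KeyConditionExpression": "#k = :v",
        "ExpressionAttributeNames": {"#k": key_attr},
        "ExpressionAttributeValues": {":v": {"S": key_value}},
    }


# Hypothetical table and attribute names, assumed for this example only.
scan_request = build_scan_request("orders", "status", "SHIPPED")
query_request = build_query_request("orders", "order_id", "o-1001")
```

With AWS credentials configured, these dictionaries could be passed to boto3 as `boto3.client("dynamodb").scan(**scan_request)` and `boto3.client("dynamodb").query(**query_request)`. Only the second form avoids reading the whole table, and it is available only for key attributes or indexed attributes.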