The Ultimate AWS Database Guide
With such a variety of database options to choose from on AWS, how do you know which option is the best fit for your use case?
In this blog post, we will explore the length and breadth of database options available within AWS, including their features, advantages and use cases.
Let’s dive straight in!
Relational
Relational Database Management Systems (RDBMS), or SQL databases, store and manage structured data. The table format is specified at the database design stage, with the schema defining the structure of the tables. Relational databases are often used for online transaction processing (OLTP) applications.
Amazon RDS (Relational Database Service) is a managed service that supports popular relational database engines such as MySQL, PostgreSQL, MariaDB, Oracle, and SQL Server. It provides ACID (Atomicity, Consistency, Isolation, Durability) properties. ACID compliance means that each transaction takes the database from one consistent state to another, which makes RDS a reliable option for critical applications.
Within the Amazon RDS suite is Amazon Aurora, a global-scale relational database, compatible with MySQL and PostgreSQL, that offers high performance and high scalability.
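To make the atomicity part of ACID concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a local stand-in for an RDS engine. The `accounts` table, the `transfer` and `balances` helpers and the sample balances are all illustrative assumptions, not part of any AWS API.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception is raised inside it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance, so the credit
        # to dst was rolled back together with the failed debit.
        return False


def balances(conn: sqlite3.Connection) -> dict:
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
```

A transfer that would overdraw an account fails as a whole: neither the credit nor the debit is kept, so the database never sits in a half-updated state.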
Non-relational
Non-relational or NoSQL databases are a set of databases that share certain features. Firstly, they generally do not require a fixed schema, and the data is usually semi-structured or unstructured. They are also used for some online analytical processing (OLAP) workloads, which look for trends and patterns across large volumes of data rather than processing individual transactions.
Key-Value
Key-value databases store data as a collection of records, each identified by a unique key. Looking up a record by its key makes data retrieval quick and efficient.
Amazon DynamoDB is a fully managed NoSQL database service that offers low latency, automatic scaling, and high availability. It is designed to handle massive workloads and scales quickly to accommodate unpredictable demand, making it ideal for applications that require fast and predictable performance, such as high-traffic web applications and gaming applications.
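The key-value access pattern can be sketched with a toy in-memory table. This is not the DynamoDB API: the `KeyValueTable` class and its method names are hypothetical, loosely modelled on the idea of a primary key made up of a partition key and a sort key.

```python
from typing import Any, Optional


class KeyValueTable:
    """Toy in-memory table: each item is stored and fetched by its unique key."""

    def __init__(self) -> None:
        self._items: dict[tuple[str, str], dict[str, Any]] = {}

    def put_item(self, partition_key: str, sort_key: str, attributes: dict[str, Any]) -> None:
        # Writing to an existing key replaces the whole item.
        self._items[(partition_key, sort_key)] = dict(attributes)

    def get_item(self, partition_key: str, sort_key: str) -> Optional[dict[str, Any]]:
        # A lookup by full key is a single hash lookup, not a search.
        return self._items.get((partition_key, sort_key))

    def query(self, partition_key: str) -> list[dict[str, Any]]:
        # Items that share a partition key, ordered by sort key.
        # This toy version checks every key; a real service would not.
        matches = [
            (key, item) for key, item in self._items.items() if key[0] == partition_key
        ]
        return [item for _, item in sorted(matches, key=lambda pair: pair[0])]


table = KeyValueTable()
table.put_item("player#1", "score#2024-01-02", {"score": 150})
table.put_item("player#1", "score#2024-01-01", {"score": 120})
table.put_item("player#2", "score#2024-01-01", {"score": 90})
```

The point of the design is that reads are addressed by key, so the cost of fetching one item does not depend on how many other items the table holds.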
In-Memory
In-memory databases rely primarily on memory for data storage, in contrast to databases that store data on disk or SSDs. The main advantage is fast performance due to the speed of data retrieval. However, this generally comes at a cost, since memory is more expensive per gigabyte than disk.
Amazon ElastiCache is an in-memory data store service powered by Redis or Memcached. It delivers ultra-fast performance, making it ideal for use cases like caching, session management, and real-time analytics. With the Redis engine, it provides automatic failover, Multi-AZ replication, and data persistence options. ElastiCache scales horizontally to handle increasing workload demands, which also makes it a good fit for gaming leaderboards.
A common architecture uses ElastiCache to cache query results in front of Amazon RDS for ultra-fast data retrieval.
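The caching architecture described above is often implemented as a cache-aside read path: check the cache first, and only query the database on a miss. The sketch below is a local illustration under stated assumptions: `TTLCache` is a hypothetical in-memory stand-in for ElastiCache, sqlite3 stands in for RDS, and the `products` table and `get_product_name` function are invented for the example.

```python
import sqlite3
import time
from typing import Optional


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (stand-in for ElastiCache)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Drop expired entries so the next read goes to the database.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (time.monotonic() + self.ttl_seconds, value)


db = sqlite3.connect(":memory:")  # stand-in for an RDS database
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO products VALUES (1, 'keyboard')")
db.commit()

cache = TTLCache(ttl_seconds=60)
stats = {"db_reads": 0}  # counts how often a read falls through to the database


def get_product_name(product_id: int) -> Optional[str]:
    """Cache-aside read: try the cache first, then the database."""
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    stats["db_reads"] += 1
    row = db.execute(
        "SELECT name FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    if row is None:
        return None
    cache.set(key, row[0])  # populate the cache for later reads
    return row[0]
```

Calling `get_product_name(1)` twice reaches the database only once; the second call is served from the cache until the entry expires. The expiry time is the main trade-off, because a longer TTL means fewer database reads but staler data.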
Document
Document databases store data in a semi-structured format, using JSON-like documents.
Amazon DocumentDB is a fully managed and scalable document database service compatible with MongoDB. DocumentDB provides the flexibility to store, query, and index JSON documents while ensuring high availability and durability. It is a great fit for content management, catalogues and user profiles.
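As a rough illustration of the document model, the sketch below keeps JSON-like documents with differing fields in one collection and filters them by a nested field. The `profiles` data and the `get_path` and `find` helpers are hypothetical; this is a toy filter, not the MongoDB or DocumentDB query language.

```python
from typing import Any

# Documents in one collection do not have to share the same fields.
profiles: list[dict[str, Any]] = [
    {"_id": 1, "name": "Asha", "address": {"city": "Dublin"}, "tags": ["admin"]},
    {"_id": 2, "name": "Ben", "address": {"city": "London"}},
    {"_id": 3, "name": "Chloe", "tags": ["editor", "admin"]},
]


def get_path(document: dict[str, Any], dotted_path: str) -> Any:
    """Follow a dotted path such as 'address.city'; return None if any part is missing."""
    current: Any = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(collection: list[dict[str, Any]], dotted_path: str, expected: Any) -> list[dict[str, Any]]:
    """Return the documents whose value at dotted_path equals expected."""
    return [doc for doc in collection if get_path(doc, dotted_path) == expected]
```

Here `find(profiles, "address.city", "Dublin")` returns only the first profile, and the third profile, which has no address at all, is simply skipped rather than causing an error.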
Wide Column
A wide-column database stores data in a tabular format but with a flexible schema. Unlike a relational database, each row in the table can have varying column names and formats.
Amazon Keyspaces is a fully managed, highly scalable, and highly available wide-column database service. It is compatible with Apache Cassandra and allows users to migrate existing Cassandra applications easily to the AWS-managed offering. Keyspaces offers automatic scaling, backups, and regional replication for enhanced durability. It is suitable for IoT applications and fleet management systems.
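The flexible-row idea can be sketched with nested dictionaries. This is a toy model, not the Keyspaces or Cassandra API: the `readings` table, the `upsert` and `rows_for` helpers, and the terms "partition key" and "clustering key" as used here are assumptions made for the illustration.

```python
from typing import Any

# partition key -> clustering key -> {column name: value}
Table = dict[str, dict[str, dict[str, Any]]]

readings: Table = {}


def upsert(table: Table, partition_key: str, clustering_key: str, **columns: Any) -> None:
    """Write only the columns supplied; other columns on the row are left as they are."""
    row = table.setdefault(partition_key, {}).setdefault(clustering_key, {})
    row.update(columns)


def rows_for(table: Table, partition_key: str) -> list[tuple[str, dict[str, Any]]]:
    """Return (clustering key, row) pairs for one partition, ordered by clustering key."""
    partition = table.get(partition_key, {})
    return sorted(partition.items(), key=lambda pair: pair[0])


# Rows in the same partition carry different sets of columns.
upsert(readings, "truck-1", "2024-01-01T10:00", speed_kmh=62)
upsert(readings, "truck-1", "2024-01-01T09:00", speed_kmh=55, fuel_pct=80)
upsert(readings, "truck-1", "2024-01-01T10:00", door_open=True)
```

After these writes, the 09:00 row has `speed_kmh` and `fuel_pct`, while the 10:00 row has `speed_kmh` and `door_open`, which is the kind of per-row variation the paragraph above describes.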
Graph
Graph databases are designed to handle highly connected data, making them ideal for applications involving complex relationships.
Amazon Neptune is a managed graph database service that allows users to build and run applications on highly connected datasets. Neptune supports fast querying of relationships and traversals. It is durable and provides automatic backups and point-in-time restores. Applications such as social networking and fraud detection can leverage Neptune’s capabilities.
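A relationship traversal of the kind a graph database answers can be sketched with a breadth-first search. The `follows` graph and the `within_hops` function are hypothetical and run in plain Python; Neptune itself is queried through graph query languages, which this sketch does not attempt to reproduce.

```python
from collections import deque

# Who follows whom in a tiny social network (directed edges).
follows = {
    "ana": {"ben", "cal"},
    "ben": {"dee"},
    "cal": {"dee", "eve"},
    "dee": {"fay"},
    "eve": set(),
    "fay": set(),
}


def within_hops(graph: dict, start: str, max_hops: int) -> dict:
    """Return {node: hop count} for nodes reachable from start in at most max_hops edges."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    del distances[start]  # the start node is not its own connection
    return distances
```

With a limit of two hops from "ana", the result covers "ben" and "cal" at one hop and "dee" and "eve" at two, while "fay" is three hops away and is left out. This "friends of friends" style of question is where a relational design would need repeated self-joins.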
Time Series
Time series databases are optimised for time-stamped or time series data.
Amazon Timestream is a purpose-built time-series database service. It is optimised to store and query time-series data at scale. Timestream supports automated data retention management and real-time analytics, with millisecond data retrieval times. It can be leveraged for use cases such as IoT, DevOps, financial services, and industrial telemetry.
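Two ideas from this section, retention and time-windowed analytics, can be sketched in a few lines. The helpers `apply_retention` and `average_per_bucket` and the sample points are hypothetical and run locally; they are not the Timestream API, only an illustration of what such a service automates.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def apply_retention(points, now, keep):
    """Drop points older than `keep` relative to `now`."""
    cutoff = now - keep
    return [(ts, value) for ts, value in points if ts >= cutoff]


def average_per_bucket(points, bucket):
    """Group points into fixed-width time buckets and average each bucket."""
    bucket_seconds = int(bucket.total_seconds())
    grouped = defaultdict(list)
    for ts, value in points:
        epoch = int(ts.timestamp())
        # Round the timestamp down to the start of its bucket.
        bucket_start = epoch - (epoch % bucket_seconds)
        grouped[datetime.fromtimestamp(bucket_start, tz=timezone.utc)].append(value)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(grouped.items(), key=lambda pair: pair[0])
    }


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
points = [
    (now - timedelta(hours=30), 10.0),    # older than the retention window
    (now - timedelta(minutes=50), 20.0),  # 11:10
    (now - timedelta(minutes=40), 30.0),  # 11:20
    (now - timedelta(minutes=5), 40.0),   # 11:55
]

recent = apply_retention(points, now, keep=timedelta(hours=24))
averages = average_per_bucket(recent, timedelta(minutes=30))
```

The 30-hour-old reading is discarded by the 24-hour retention rule, and the remaining readings fall into two 30-minute buckets starting at 11:00 and 11:30.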
There are so many options. How do I choose?
AWS provides some great resources to help you learn more about its database services.
They also provide a guide on how to decide on the right database.
AWS provides an extensive range of database options to cater to many data storage and management requirements. Whether it’s maintaining relational data, high-performance caching or handling time-series data, this blog post has hopefully given you some insight into the best database service for the job.
About the author
Laura Gardner is an AWS Architect here at Version 1.