AWS RDS Databases

Dilshan Fernando
8 min read · Jul 23, 2023

This article is based on the concepts I learned from Stephane Maarek’s AWS Solutions Architect Associate course. I would like to thank Stephane for his excellent teaching and for sharing his deep expertise in AWS. Without his guidance, this article would not have been possible. If you’re looking to expand your AWS knowledge, I highly recommend Stephane’s courses.

Choosing the right database for your project is critical, and picking the wrong one can end in disaster. Several types of databases are available today, for example relational databases, NoSQL databases, graph databases, and so on.

So what factors should you consider when choosing a database for your project? Start by answering the questions below.

1. Workload and Throughput:

  • Is the workload predominantly read-heavy, write-heavy, or balanced?
  • What are the current throughput needs for your application?
  • Will the throughput requirements change over time, fluctuate during the day, or need to scale to handle future growth?

2. Data Storage:

  • How much data do you need to store, and for how long will you retain it?
  • Do you anticipate the data volume to grow significantly over time?
  • What is the average object size, and how will these objects be accessed?

3. Data Durability and Source of Truth:

  • How critical is data durability for your application?
  • Do you have a single source of truth for your data, or will you need to synchronize data across multiple databases or regions?

4. Latency and Concurrent Users:

  • What are the required latency levels for data retrieval and updates?
  • How many concurrent users or clients do you expect to access the database simultaneously?

5. Data Model and Querying:

  • What type of data model does your application require (e.g., structured, semi-structured)?
  • How will you query the data, and do you anticipate using complex joins or relationships between entities?

6. Schema Flexibility and Use Cases:

  • Do you need a database with a strong schema to enforce data consistency and integrity?
  • Does your application require more flexibility in terms of data structure?
  • Will you be performing reporting or search operations on the data?

7. RDBMS vs. NoSQL:

  • Consider whether a traditional relational database (RDBMS) or a NoSQL database is more suitable for your application’s data and querying needs.

8. Licensing and Cloud Native Options:

  • Are there any specific licensing costs associated with the databases you are considering?
  • Would it be beneficial to switch to a Cloud Native Database service, such as Amazon Aurora, to take advantage of cloud scalability and managed services?

By carefully answering these questions, you can identify the key requirements and constraints of your architecture, helping you make an informed decision about the most appropriate database solution for your specific use case.

So let’s jump into the main topic: what options are available in AWS?

AWS provides many options, and I have listed them below.

If you need:

  1. RDBMS or Relational Database Management System — RDS, Aurora — great for joins
  2. NoSQL database — no joins, no SQL: DynamoDB (~JSON), ElastiCache (key / value pairs), Neptune (graphs), DocumentDB (for MongoDB) and Keyspaces (for Apache Cassandra)
  3. Object Store: S3 (for big objects) / Glacier (for backups/archives)
  4. Data Warehouse (= SQL Analytics / BI): Redshift (OLAP), Athena, EMR
  5. Search: OpenSearch (JSON) — free text, unstructured searches
  6. Graphs: Amazon Neptune — displays relationships between data
  7. Ledger: Amazon Quantum Ledger Database
  8. Time series: Amazon Timestream

Now that we know what kinds of databases are available in AWS, let’s talk about a few of them. In this article, I’m going to discuss Amazon RDS. I will cover the rest in a separate article. Let’s go!

Amazon RDS

Amazon RDS (Relational Database Service) is a managed relational database service that uses SQL as its query language. It lets you run PostgreSQL, MySQL, MariaDB, Oracle, Microsoft SQL Server, and Aurora (an AWS proprietary database) in the cloud.

So what advantages do you gain by using Amazon RDS over deploying a DB on EC2?

There are many reasons to use Amazon RDS rather than deploying a DB on EC2:

  1. Managed Service: Amazon RDS is a fully managed service, which means AWS handles various aspects of database administration, such as automated provisioning, OS patching, backups, and software updates. This relieves you of many routine maintenance tasks, allowing you to focus on your application’s development and business logic.
  2. Automated Backups and Point-in-Time Restore: RDS automatically performs regular backups and allows you to restore your database to a specific point in time within the retention period. This helps protect against data loss and simplifies disaster recovery.
  3. High Availability with Multi-AZ: RDS offers Multi-AZ deployment, where it automatically replicates your database to a standby instance in another availability zone. If the primary database becomes unavailable, RDS automatically fails over to the standby, ensuring better availability and data redundancy.
  4. Read Replicas: RDS allows you to create read replicas of your database to offload read traffic from the primary instance. This enhances read performance and reduces the load on the primary instance.
  5. Scalability: RDS supports both vertical and horizontal scaling. Vertical scaling involves increasing the instance size, while horizontal scaling involves adding read replicas. AWS handles the complexities of scaling the underlying infrastructure for you.
  6. Monitoring and Metrics: RDS provides comprehensive monitoring dashboards and performance insights, helping you track database performance and identify potential issues proactively.
  7. Security and Compliance: Amazon RDS provides various security features, such as encryption at rest and in transit, network isolation, and IAM-based access control. It also offers compliance with industry standards, making it easier to meet regulatory requirements.
  8. Backup and Maintenance Windows: RDS allows you to specify backup and maintenance windows, ensuring that routine maintenance tasks do not disrupt your application’s availability.
  9. Simplified Setup and Configuration: With RDS, the initial setup and configuration of the database are streamlined, saving time and effort compared to manually setting up and configuring a database on EC2 instances.

Important: because RDS is a managed service, you cannot SSH into the underlying instances.

Now that we understand why to use RDS, let’s look at the features it provides, beginning with Storage Auto Scaling.

RDS — Storage Auto Scaling

RDS (Amazon Relational Database Service) offers Storage Auto Scaling, a feature that allows you to increase storage on your RDS database instance dynamically. It automates the process of scaling your database storage, eliminating the need for manual intervention and ensuring your application can handle increased data storage requirements.

Here are the key aspects of RDS Storage Auto Scaling:

Automated Scaling: RDS constantly monitors your database’s storage utilization. When it detects that you are running out of free database storage, it automatically scales up the storage capacity to accommodate the growing data volume.

Maximum Storage Threshold: To control costs and prevent unexpected scaling, you can set a Maximum Storage Threshold. This defines the upper limit for your database storage; RDS will not scale storage beyond it.

Automated Modification Criteria: RDS Storage Auto Scaling takes into account specific criteria before automatically modifying the storage:

  • Free storage is less than 10% of the allocated storage.
  • Low-storage condition lasts for at least 5 minutes, preventing unnecessary scaling due to temporary spikes.
  • At least 6 hours have passed since the last storage modification, ensuring a stable period before considering another scaling action.
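
The three criteria above can be expressed as a small check. This is only an illustrative sketch of the decision logic, not how RDS is implemented internally; the `StorageState` class, its field names, and `should_autoscale` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class StorageState:
    """Hypothetical snapshot of an instance's storage (not an AWS type)."""
    allocated_gib: float                   # storage currently allocated
    free_gib: float                        # free storage right now
    low_storage_since: Optional[datetime]  # when free space first dropped below 10%, else None
    last_modification: datetime            # time of the last storage modification
    max_storage_gib: float                 # the Maximum Storage Threshold


def should_autoscale(state: StorageState, now: datetime) -> bool:
    """Return True only when every criterion from the list is met."""
    # Free storage is less than 10% of the allocated storage.
    low_on_space = state.free_gib < 0.10 * state.allocated_gib
    # The low-storage condition has lasted at least 5 minutes.
    sustained = (
        state.low_storage_since is not None
        and now - state.low_storage_since >= timedelta(minutes=5)
    )
    # At least 6 hours have passed since the last storage modification.
    cooled_down = now - state.last_modification >= timedelta(hours=6)
    # Never grow past the Maximum Storage Threshold.
    below_cap = state.allocated_gib < state.max_storage_gib
    return low_on_space and sustained and cooled_down and below_cap


now = datetime(2023, 7, 23, 12, 0)
state = StorageState(
    allocated_gib=100,
    free_gib=8,
    low_storage_since=now - timedelta(minutes=10),
    last_modification=now - timedelta(hours=7),
    max_storage_gib=200,
)
decision = should_autoscale(state, now)
```

Here `decision` is `True` because free space is under 10%, the condition has lasted longer than 5 minutes, the last modification was more than 6 hours ago, and the instance is still below its threshold.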

Useful for Unpredictable Workloads: Storage Auto Scaling is particularly beneficial for applications with unpredictable workloads or data growth patterns. It allows your database to accommodate fluctuating storage needs seamlessly without manual intervention.

Support for Multiple Database Engines: Storage Auto Scaling is available for all RDS database engines, including MariaDB, MySQL, PostgreSQL, SQL Server, and Oracle. This means you can leverage this feature regardless of the database engine you are using in RDS.

By enabling Storage Auto Scaling in RDS, you can ensure that your database can dynamically adapt to changing storage requirements, helping you maintain optimal performance and availability for your application without the need for constant monitoring and manual adjustments.
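
As a rough sketch, this is how the Maximum Storage Threshold might be set from Python with boto3, assuming boto3 is installed and AWS credentials are configured. `MaxAllocatedStorage` is the `modify_db_instance` parameter for the threshold; the helper names, the instance identifier, and the region are placeholders for illustration.

```python
def storage_autoscaling_params(db_instance_id: str, max_storage_gib: int) -> dict:
    """Build the request parameters for enabling Storage Auto Scaling."""
    if max_storage_gib <= 0:
        raise ValueError("max_storage_gib must be positive")
    return {
        "DBInstanceIdentifier": db_instance_id,
        # Upper limit (in GiB) that RDS may scale the storage up to.
        "MaxAllocatedStorage": max_storage_gib,
        # Apply now instead of waiting for the next maintenance window.
        "ApplyImmediately": True,
    }


def enable_storage_autoscaling(db_instance_id: str, max_storage_gib: int,
                               region: str = "us-east-1") -> dict:
    """Send the request to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("rds", region_name=region)
    params = storage_autoscaling_params(db_instance_id, max_storage_gib)
    return client.modify_db_instance(**params)


# "my-app-db" is a made-up identifier used only for this example.
params = storage_autoscaling_params("my-app-db", 500)
```

Only `storage_autoscaling_params` runs here; calling `enable_storage_autoscaling` would modify a real instance, so try it against a test database first.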

RDS Read Replicas for read scalability

RDS supports up to 15 read replicas, which can be placed in the same Availability Zone, across Availability Zones, or across regions. Replication from the master instance to the replicas is asynchronous (ASYNC), so reads from a replica are eventually consistent.

What is the use case for read replicas?

Suppose your application already handles a lot of traffic and the client asks for a reporting feature that only reads data, with no updates or deletes. Without read replicas, the reporting feature has to query the master instance as well, which hurts the overall performance of the application. Instead, we can point the reporting feature at a read replica. Refer to the diagram below.
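
To make "asynchronous" and "eventually consistent" concrete, here is a toy in-memory model. It is not how RDS replicates data; the `Primary` and `Replica` classes are hypothetical and only show that a replica can briefly return stale data until the pending changes are applied.

```python
from collections import deque


class Replica:
    """Toy read replica that applies changes some time after the primary."""

    def __init__(self):
        self.data = {}
        self.pending = deque()  # changes received but not yet applied

    def apply_pending(self):
        # Catch up with the primary by applying queued changes in order.
        while self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class Primary:
    """Toy primary that acknowledges writes without waiting for replicas."""

    def __init__(self):
        self.data = {}
        self.replicas = []

    def add_replica(self):
        replica = Replica()
        self.replicas.append(replica)
        return replica

    def write(self, key, value):
        self.data[key] = value
        # Asynchronous: the change is only queued for each replica.
        for replica in self.replicas:
            replica.pending.append((key, value))


primary = Primary()
replica = primary.add_replica()
primary.write("order:1", "paid")

stale = replica.read("order:1")  # replica has not applied the change yet
replica.apply_pending()
fresh = replica.read("order:1")  # replica has now caught up
```

In this model `stale` is `None` and `fresh` is `"paid"`, which is the window of inconsistency an application has to tolerate when it reads from a replica.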

Read replicas are used for SELECT (=read) only kind of statements (not INSERT, UPDATE, DELETE).
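
One way an application could apply this rule is to choose the endpoint per statement. The sketch below is a simplified assumption, not an RDS feature: `ReadWriteRouter` is a hypothetical helper, the endpoint names are made up, and it only looks at the first keyword of the statement.

```python
import itertools


class ReadWriteRouter:
    """Send SELECT statements to read replicas and everything else to the master."""

    def __init__(self, master_endpoint, replica_endpoints):
        self.master = master_endpoint
        # Rotate through the replicas in round-robin order, if there are any.
        self._replicas = itertools.cycle(replica_endpoints) if replica_endpoints else None

    def endpoint_for(self, sql):
        words = sql.lstrip().split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        # INSERT, UPDATE, DELETE and anything unrecognized go to the master.
        return self.master


# Hypothetical endpoint names, for illustration only.
router = ReadWriteRouter(
    "master.example.internal",
    ["replica-1.example.internal", "replica-2.example.internal"],
)
report_target = router.endpoint_for("SELECT * FROM orders")
write_target = router.endpoint_for("UPDATE orders SET status = 'paid' WHERE id = 1")
```

Because replicas are eventually consistent, a read that must see the latest write should still be sent to the master.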

RDS Read Replicas — Network Cost

Normally, AWS charges for data that moves from one availability zone (AZ) to another. For Amazon RDS Read Replicas within the same AWS region, however, you don’t incur any additional network transfer costs: because RDS is a managed service, replication traffic between a DB instance and its read replicas in the same region is not billed as data transfer, even when the replica sits in a different AZ.

However, if you set up RDS Read Replicas across different AWS regions, data replication will involve network transfer across regions, and you may incur additional data transfer charges. AWS treats data transfer between regions as cross-region data transfer, which can be more expensive than data transfer within the same region.
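
The billing rule can be written down as simple arithmetic. In the sketch below the function name is hypothetical and the per-GB price is a placeholder, not a real AWS rate; check the current AWS pricing page for the regions you use.

```python
def replication_transfer_cost(gb_replicated: float, cross_region: bool,
                              price_per_gb: float) -> float:
    """Estimate the data transfer charge for read replica replication."""
    if gb_replicated < 0 or price_per_gb < 0:
        raise ValueError("gb_replicated and price_per_gb must not be negative")
    if not cross_region:
        # Same region (even in a different AZ): replication traffic is free.
        return 0.0
    # Cross-region: every replicated GB is billed at the inter-region rate.
    return gb_replicated * price_per_gb


# 0.02 USD per GB is an assumed example rate, not an official price.
same_region_cost = replication_transfer_cost(500, cross_region=False, price_per_gb=0.02)
cross_region_cost = round(replication_transfer_cost(500, cross_region=True, price_per_gb=0.02), 2)
```

With the assumed rate, replicating 500 GB costs nothing inside one region and roughly 10 USD across regions.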

RDS Multi-AZ (Disaster Recovery)

RDS Multi-AZ provides high availability and automatic failover for your RDS database by using synchronous replication to maintain a standby replica in a different Availability Zone. It increases application availability, handles failover automatically, and is crucial for disaster recovery. Use RDS Read Replicas across regions for additional redundancy in DR scenarios.
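
Failover is automatic on the RDS side, but existing connections are dropped while the standby takes over, so the application usually needs to reconnect. The sketch below assumes the application keeps using the same database endpoint name, which after failover points at the new primary. `connect_with_retry` and the `connect` callable are hypothetical and stand in for whatever database driver you use.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off between attempts."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff: base_delay, then 2x, then 4x, ...
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Simulated driver: the first two attempts fail, as they might during a failover.
calls = {"count": 0}


def fake_connect():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("database is failing over")
    return "connected"


delays = []
result = connect_with_retry(fake_connect, sleep=delays.append)
```

Passing `sleep=delays.append` records the waits instead of actually sleeping, which keeps the example fast; in real code you would leave the default.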

I hope you learned something from this article. See you next time, and keep in touch :)
