AWS Database Specialty Exam — Part 1

A Quick Review for Workload-specific Database Design

Pisit J.
Sum up As A Service
5 min read · Sep 10, 2022

--

Part 1: Workload-specific Database Design (26% of exam)

  • Select appropriate database services for specific types of data and workloads.
  • Determine strategies for disaster recovery and high availability.
  • Design database solutions for performance, compliance, and scalability.
  • Compare the costs of database solutions.

1 — How can you integrate queries between data from an Amazon RDS for PostgreSQL database in one Region and data from an Amazon Redshift cluster in another Region?

  • Create an external schema from the Amazon RDS PostgreSQL database and use Amazon Redshift Federated Query to access both sources.

https://docs.aws.amazon.com/redshift/latest/dg/federated-overview.html

https://aws.amazon.com/blogs/big-data/amazon-redshift-federated-query-best-practices-and-performance-considerations/
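A federated query setup can be sketched in two statements (a hypothetical example — the schema, endpoint, role, and secret ARNs below are placeholders):

```sql
-- Expose the RDS for PostgreSQL schema to Redshift as an external schema.
CREATE EXTERNAL SCHEMA postgres_sales
FROM POSTGRES
DATABASE 'salesdb' SCHEMA 'public'
URI 'my-rds-instance.example.us-west-2.rds.amazonaws.com' PORT 5432
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftFederatedRole'
SECRET_ARN 'arn:aws:secretsmanager:us-west-2:123456789012:secret:rds-secret';

-- Queries can then join local Redshift tables with live RDS data.
SELECT r.customer_id, r.total, p.last_login
FROM local_orders r
JOIN postgres_sales.customers p ON p.customer_id = r.customer_id;
```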

2 — A Database Specialist is migrating an on-premises database to an Amazon RDS DB instance. The application requires a database that can handle high volumes of read operations and always has enough bandwidth to support the I/O throughput.

What is the best solution that meets these requirements?

  • Configure an Amazon RDS read replica.
  • Use an EBS-optimized instance for the Amazon RDS DB instance.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-optimized.html

3 — Using Amazon Aurora, a Database Specialist was asked whether it is possible to create, on request, a copy of the production DB cluster for development, testing, or other purposes.

What is the most cost-effective solution that meets this requirement in the shortest amount of time?

Use Amazon Aurora cloning.

Note — Aurora uses a copy-on-write protocol to create a clone. This mechanism uses minimal additional space to create an initial clone. Additional storage is allocated only when changes are made to data by the source Aurora DB cluster or the Aurora DB cluster clone.

You can use clones for many types of applications, such as the following:

  • Experiment with potential changes (schema changes and parameter group changes, for example) to assess all impacts.
  • Run workload-intensive operations, such as exporting data or running analytical queries.

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Managing.Clone.html
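The copy-on-write idea behind cloning can be pictured with a small sketch (an illustration of the general mechanism only, not Aurora's actual storage implementation): source and clone share the same pages until one of them writes, and only the changed page consumes new storage.

```python
class SharedStorage:
    """Toy model of copy-on-write cloning (illustration, not Aurora internals)."""
    def __init__(self, pages):
        self.shared = dict(pages)   # pages shared by source and clone at clone time

class Cluster:
    def __init__(self, storage):
        self.storage = storage
        self.local = {}             # pages this cluster has modified since cloning

    def read(self, page_id):
        # A modified page is read locally; otherwise fall back to shared storage.
        return self.local.get(page_id, self.storage.shared[page_id])

    def write(self, page_id, data):
        # Copy-on-write: new storage is allocated only for the changed page.
        self.local[page_id] = data

    def extra_pages(self):
        return len(self.local)      # additional storage this cluster consumes

storage = SharedStorage({1: "orders", 2: "users"})
source, clone = Cluster(storage), Cluster(storage)
clone.write(2, "users-v2")          # only now does the clone allocate storage
```

The key property the model shows: the clone starts at essentially zero additional storage, and the source's view of page 2 is unchanged after the clone writes to it.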

4 — A company requires a data warehouse solution that delivers low-latency responses to queries against both the current year's data and the entire 10-year historical dataset. It must scale automatically to support the number of incoming queries.

What is the most cost-effective solution for these requirements?

  • Use Amazon Redshift to store & query the current year's data.
  • Use Redshift Spectrum to query the historical data stored in Amazon S3.
  • Enable the Concurrency Scaling feature to automatically add on-demand cluster capacity when needed.

https://aws.amazon.com/redshift/faqs
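A common way to wire the two tiers together is an external table over the S3 archive plus a late-binding view that spans both (a hypothetical sketch — table, schema, bucket, and role names are placeholders, and the column list is invented for illustration):

```sql
-- External schema backed by the AWS Glue Data Catalog.
CREATE EXTERNAL SCHEMA spectrum_history
FROM DATA CATALOG DATABASE 'history_db'
IAM_ROLE 'arn:aws:iam::123456789012:role/MySpectrumRole';

-- The 10-year archive stays in S3 and is queried in place by Spectrum.
CREATE EXTERNAL TABLE spectrum_history.sales_archive (
    sale_id BIGINT,
    sale_date DATE,
    amount DECIMAL(10,2)
)
STORED AS PARQUET
LOCATION 's3://my-archive-bucket/sales/';

-- A late-binding view lets one query cover the hot and historical tiers.
CREATE VIEW all_sales AS
SELECT sale_id, sale_date, amount FROM current_year_sales
UNION ALL
SELECT sale_id, sale_date, amount FROM spectrum_history.sales_archive
WITH NO SCHEMA BINDING;
```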

5 — What does the Database Specialist need to do to ensure that the more critical, short-running queries are prioritized in Amazon Redshift?

  • Enable automatic workload management and assign the critical queries to a queue with the highest priority.
  • Enable short query acceleration.

Note — You can define the relative importance of queries in a workload by setting a priority value. The priority is specified for a queue and inherited by all queries associated with the queue. You associate queries to a queue by mapping user groups and query groups to the queue.

Short query acceleration (SQA) runs short-running queries in a dedicated queue, so that SQA queries aren’t forced to wait in queues behind longer queries.

https://docs.aws.amazon.com/redshift/latest/dg/query-priority.html

https://docs.aws.amazon.com/redshift/latest/dg/wlm-short-query-acceleration.html
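Both settings live in the cluster's WLM JSON configuration. A sketch of what it could look like, assuming a user group named "critical" (the group name is a placeholder, not something the question defines):

```json
[
  {
    "user_group": ["critical"],
    "query_group": [],
    "priority": "highest",
    "queue_type": "auto",
    "auto_wlm": true
  },
  {
    "priority": "normal",
    "queue_type": "auto",
    "auto_wlm": true
  },
  { "short_query_queue": true }
]
```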

6 — What is the most suitable DynamoDB table design for the player session history of a mobile game app?

Set up a composite primary key where the partition key is player_id and the sort key is timestamp.

Note — The partition key of an item is also known as its hash attribute. The term hash attribute derives from the use of an internal hash function in DynamoDB that evenly distributes data items across partitions, based on their partition key values.

The sort key of an item is also known as its range attribute. The term range attribute derives from the way DynamoDB stores items with the same partition key physically close together, in sorted order by the sort key value.

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComponents.html#HowItWorks.CoreComponents.PrimaryKey

https://aws.amazon.com/blogs/database/amazon-dynamodb-gaming-use-cases-and-design-patterns/
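The behavior this key design buys can be modeled with a toy in-memory sketch (illustration only, not DynamoDB's storage engine): the partition key selects an item collection, and the sort key keeps items within that collection ordered, which makes per-player time-range queries cheap.

```python
from bisect import insort

# Toy model of a DynamoDB composite primary key: player_id partitions the
# data, timestamp orders items inside each partition.
table = {}

def put_session(player_id, timestamp, session):
    # insort keeps each player's sessions sorted by timestamp, mirroring how
    # DynamoDB stores items with the same partition key in sort-key order.
    insort(table.setdefault(player_id, []), (timestamp, session))

def query_history(player_id, start, end):
    # Analogous to a Query with a key condition such as
    # player_id = :p AND timestamp BETWEEN :start AND :end (names assumed).
    return [s for t, s in table.get(player_id, []) if start <= t <= end]

put_session("p1", 1694300000, "level-3")
put_session("p1", 1694200000, "level-2")
put_session("p2", 1694250000, "level-1")
```

A real table would express the same query with a `KeyConditionExpression`; the point of the model is that the history comes back already sorted, without a table scan.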

7 — An application uses the GetItem and PutItem operations to read and write data to its DynamoDB table. The per-request network overhead of this workload affects the application's performance.

How can the Database Specialist improve the application's performance?

Refactor the application to use the BatchGetItem and BatchWriteItem operations when processing related items. If a partial result is returned, use the UnprocessedKeys map (for BatchGetItem) or the UnprocessedItems map (for BatchWriteItem) from the response to retry the failed items.

https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchGetItem.html

https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html
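The retry loop for partial results can be sketched as follows (simplified request/response shapes, not the real boto3 API — `batch_get` here is any callable with the contract shown in the docstring):

```python
BATCH_LIMIT = 100  # BatchGetItem accepts at most 100 keys per call

def batch_get_all(batch_get, keys):
    """Fetch every key, re-driving unprocessed keys until none remain.

    batch_get is a stand-in for the real client call, with the simplified
    contract: batch_get(keys) -> {"Responses": [...], "UnprocessedKeys": [...]}.
    """
    items, pending = [], list(keys)
    while pending:
        resp = batch_get(pending[:BATCH_LIMIT])
        items.extend(resp["Responses"])
        # Retry unprocessed keys along with anything that didn't fit the batch.
        pending = resp.get("UnprocessedKeys", []) + pending[BATCH_LIMIT:]
    return items
```

In production code the retry of unprocessed keys should use exponential backoff, since a flood of immediate retries can keep the table throttled.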

8 — A company successfully launched a mobile application that uses an Amazon DynamoDB table last year. This year, the company plans to release a new core feature and asks the Database Specialist to prepare for a short-term activity surge.

What can the Database Specialist do to meet this requirement at the lowest cost?

Set the selected DynamoDB table to provisioned capacity mode with auto scaling enabled.

Note — If you choose provisioned mode, you specify the number of reads and writes per second that you require for your application. You can use auto scaling to adjust your table’s provisioned capacity automatically in response to traffic changes. This helps you govern your DynamoDB use to stay at or below a defined request rate in order to obtain cost predictability.

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html

https://aws.amazon.com/blogs/database/amazon-dynamodb-auto-scaling-performance-and-cost-optimization-at-any-scale/
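The core of auto scaling's target tracking can be approximated in a few lines (a toy model only — the real service drives scaling through CloudWatch alarms with cooldowns, and the parameter values below are illustrative assumptions, not AWS defaults):

```python
import math

def target_tracking(consumed, target_utilization=0.7, min_cap=5, max_cap=500):
    """Pick a provisioned capacity that keeps consumed/provisioned near the
    target utilization, clamped to the configured min/max (toy model)."""
    desired = math.ceil(consumed / target_utilization)
    return max(min_cap, min(max_cap, desired))
```

For example, a sustained 70 consumed capacity units at a 70% target would settle around 100 provisioned units, which is why provisioned mode with auto scaling stays cheap during normal traffic while still absorbing a surge up to the configured maximum.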

9 — The Database Specialist needs to add a cache layer to an application that only needs to cache simple string data. The cache layer must utilize multiple processing cores so it can easily scale up compute capacity and handle more operations.

What is the most suitable AWS service in this scenario?

Amazon ElastiCache for Memcached.

https://aws.amazon.com/elasticache/redis-vs-memcached/

10 — A company wants to build a content management application that can scale depending on the read throughput. The fully managed database service must store data in JSON-like documents.

Which AWS database service would best fit this requirement?

Amazon DocumentDB

Note — Amazon DocumentDB is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. As a document database, Amazon DocumentDB makes it easy to store, query, and index JSON data.

https://docs.aws.amazon.com/documentdb/latest/developerguide/what-is.html
