A QUICK REFERENCE GUIDE

DynamoDB and Spring Data

Mapping entities with hash and range keys, searching with filters and paging, using global secondary indexes, and more.

Leonardo Carvalho
May 17, 2020

Introduction

Getting used to Spring Data is pretty common: you add it to your project, realize how it frees you from all sorts of tasks involved in moving data in and out of databases, and suddenly you take it for granted.

The core concepts of Spring Data are closely tied to relational databases, so integrating it with DynamoDB tables demands some adaptation and, consequently, some work. Luckily, a lot of this work is already done, and you can find references scattered all over the internet, which brings us to the very point of this piece: putting it all together.

Getting Started

The full code for all the examples covered in this article, along with instructions on how to run it against DynamoDB tables hosted both in a local instance and in AWS, is available here.

To enable Spring Data integration with DynamoDB tables, we’ll use the spring-data-dynamodb library along with the AWS Java SDK for DynamoDB:

Note that we’re using a fork of the original spring-data-dynamodb repository. This is due to an issue that prevents the application from initializing in some circumstances. You can find more details here.

Now all we have to do is create an AmazonDynamoDB bean with configured credentials. In the example below, we are using profile-based credentials, which assumes you have them configured in a ~/.aws/credentials file on your machine. For more information about configuring AWS credentials, have a look at Working with AWS Credentials.

All tables used in this article can be created using the scripts provided in the dynamodb folder of the GitHub project; you can find instructions in the README file. Alternatively, you can create them using the AWS console.

If you are unfamiliar with DynamoDB concepts, reading this article is highly recommended. For now, it’s important to be aware that:

  • Each table has a primary key that can be composed of only a partition key or both a partition and a range key;
  • Items stored in a table are unique by primary key;
  • The partition key is fed to an internal DynamoDB hash function to determine the partition in which the item is stored;
  • The range key determines the sort order of the items inside a partition;
  • DynamoDB tables are schemaless, and you only have to declare the attributes that are either partition or range keys.
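
The rules above can be illustrated with a toy in-memory model. This is only a sketch of the behaviour described in the list, not of how DynamoDB works internally; the ToyTable class and its method names are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy in-memory model of a table with a composite primary key.
// Hypothetical class: it mimics the rules listed above, nothing more.
final class ToyTable {
    // partition key -> (range key -> item payload); the inner map stays sorted by range key
    private final Map<String, NavigableMap<String, String>> partitions = new HashMap<>();

    // Writing an item whose primary key already exists replaces the old item,
    // which is why items are unique by primary key.
    public void put(String partitionKey, String rangeKey, String payload) {
        partitions.computeIfAbsent(partitionKey, k -> new TreeMap<>()).put(rangeKey, payload);
    }

    // Items that share a partition key come back ordered by range key.
    public NavigableMap<String, String> query(String partitionKey) {
        return partitions.getOrDefault(partitionKey, new TreeMap<>());
    }
}
```

Putting two items with the same partition and range key leaves a single item behind, while items that differ only in the range key coexist and are returned in range key order.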

#1 Starting with a Simple Entity

To validate our configuration and the most basic Spring Data concepts, we’ll start by inserting and listing items in a pretty straightforward table, based on this example from the DynamoDB developer guide.

The primary key of the Music table consists of only a partition key attribute named Artist:

In our application we’ll add a SongTitle attribute as well, simply by mapping it in the entity class alongside the attributes declared in the table definition:

At this point, we can insert and list items in the table using the Spring CrudRepository just by creating a MusicRepository:

With the MusicRepository we can perform all sorts of data operations on the table. Here are some usage examples:

#2 Entities With Composite Primary Keys

So far, so good. The simple example above already covers a lot of the functionality we might expect, but there’s a major problem in the data model: as we’ve seen, items in a DynamoDB table are unique by primary key, which means we can’t insert more than one song per artist in our Music table.

To solve this problem, we can follow the example given in the DynamoDB developer guide and turn the SongTitle attribute into the table range key. This change turns the table’s primary key into a composite key comprising both the Artist and SongTitle attributes.

Behold the ImprovedMusic table:

Now we have to adapt our code so that the new primary key can be declared in our repository and used with the generic Spring Data methods.

First, let’s encapsulate it in an ID class:

And then put the ID class in the entity mapping class:
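
A minimal sketch of how the ID class and the entity might fit together, assuming the usual pattern in which the entity holds the ID object and delegates its key getters to it. To keep the sketch compilable with the JDK alone, the mapping annotations are shown as comments; in a real project they would come from com.amazonaws.services.dynamodbv2.datamodeling and org.springframework.data.annotation, and each class would be public in its own file. The class names ImprovedMusicId and ImprovedMusic are assumptions for this illustration.

```java
import java.util.Objects;

// Composite key holder for the ImprovedMusic table (hypothetical class name).
final class ImprovedMusicId {
    private String artist;
    private String songTitle;

    public ImprovedMusicId() { }  // a no-arg constructor lets a mapper instantiate the key

    public ImprovedMusicId(String artist, String songTitle) {
        this.artist = artist;
        this.songTitle = songTitle;
    }

    // @DynamoDBHashKey(attributeName = "Artist")
    public String getArtist() { return artist; }
    public void setArtist(String artist) { this.artist = artist; }

    // @DynamoDBRangeKey(attributeName = "SongTitle")
    public String getSongTitle() { return songTitle; }
    public void setSongTitle(String songTitle) { this.songTitle = songTitle; }

    // Value equality, so two keys with the same parts identify the same item.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ImprovedMusicId)) return false;
        ImprovedMusicId that = (ImprovedMusicId) other;
        return Objects.equals(artist, that.artist) && Objects.equals(songTitle, that.songTitle);
    }

    @Override
    public int hashCode() { return Objects.hash(artist, songTitle); }
}

// Entity class; it would be annotated with @DynamoDBTable(tableName = "ImprovedMusic").
final class ImprovedMusic {
    // @Id (org.springframework.data.annotation.Id) marks the composite key for Spring Data
    private ImprovedMusicId id;

    // @DynamoDBHashKey(attributeName = "Artist")
    public String getArtist() { return id == null ? null : id.getArtist(); }
    public void setArtist(String artist) { key().setArtist(artist); }

    // @DynamoDBRangeKey(attributeName = "SongTitle")
    public String getSongTitle() { return id == null ? null : id.getSongTitle(); }
    public void setSongTitle(String songTitle) { key().setSongTitle(songTitle); }

    // Lazily creates the key so the setters work on a freshly constructed entity.
    private ImprovedMusicId key() {
        if (id == null) id = new ImprovedMusicId();
        return id;
    }
}
```

The delegation keeps the key parts in one place: the repository can be typed on the ID class, while the mapper still sees Artist and SongTitle as ordinary key getters on the entity.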

The repository interface will look like this:

Now we can interact with the ImprovedMusic table:

#3 Converting Keys to DynamoDB Data Types

When we’re declaring the key attributes for a given table, we can use three different key types: String (S), Number (N), or Binary (B).

Suppose we want to use a timestamp as a key attribute. We can certainly choose a format and represent it as a String. However, in our Java application code we’ll want to represent it as a LocalDateTime, and converting it back and forth can become a real burden.

Let’s see an easy way to encapsulate all the conversion logic in a single, isolated component.

First, we’ll create the MuchImprovedMusic table, using a randomized UUID named MusicCode as the table hash key and the release date and time, the ReleaseDateTime attribute, as the table range key.

To automatically convert LocalDateTime objects to a DynamoDB-compatible String format, we’ll create a LocalDateTimeConverter class:
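
A minimal sketch of such a converter, assuming the ISO-8601 local date-time format is an acceptable String representation. To stay compilable with the JDK alone, the class below does not implement any interface; with the AWS SDK on the classpath it would declare "implements DynamoDBTypeConverter<String, LocalDateTime>", which defines the same two methods.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Stand-alone sketch of the conversion logic.
// The choice of ISO_LOCAL_DATE_TIME is an assumption, not necessarily the original format.
final class LocalDateTimeConverter {
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ISO_LOCAL_DATE_TIME;

    // Java object -> value stored in the String (S) attribute
    public String convert(LocalDateTime value) {
        return FORMAT.format(value);
    }

    // Stored String (S) attribute -> Java object
    public LocalDateTime unconvert(String stored) {
        return LocalDateTime.parse(stored, FORMAT);
    }
}
```

The ISO format is a convenient choice for a range key because its Strings sort lexicographically in chronological order. In the mapped classes, the attribute getter would then be annotated with @DynamoDBTypeConverted(converter = LocalDateTimeConverter.class).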

Now we can easily create the ID class for the MuchImprovedMusic table:

The former key attributes, Artist and SongTitle, are now added as plain attributes in the entity mapping class:

This time we’ll improve the repository a little. Since the primary key is now composed of a UUID and a timestamp, it turns out we only need the hash key to uniquely identify an item.

It may seem odd to have the release date and time in the primary key, since it doesn’t effectively play a part in key uniqueness, but remember that the range key also plays a sorting role inside the partition, which can be useful: songs are kept sorted by release date, making it easier to list recent songs.

So, to get a single item from the table, we’ll add a method that finds a song by its code alone:

Some examples of data manipulation in the MuchImprovedMusic table:

#4 Global Secondary Index

Changing our primary key to its current form has advantages: items are distributed uniformly and randomly across partitions, we can easily identify a unique item, and primary key collisions are practically a nonexistent problem.

But it comes with a trade-off we may not want to accept: listing an artist’s songs now requires an inefficient table scan.

To solve this problem we can create a Global Secondary Index (GSI). It’s a data structure where the items of a table are stored using different partition and range keys than the table itself. In fact, we may choose any table attribute for the GSI keys. We may also propagate only some of the original table’s items. To learn more about GSIs, see this documentation.
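
The idea of re-keying the same items can be sketched with a toy in-memory index. This only illustrates the concept and does not use the DynamoDB API: the ToyGsi class and its methods are made up for this illustration, and in DynamoDB the service maintains the index automatically as items are written.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a GSI keyed by (Artist, ReleaseDateTime). Hypothetical class.
final class ToyGsi {
    // Minimal item; field names mirror the attributes used in this article.
    static final class Song {
        final String musicCode;
        final String artist;
        final String releaseDateTime;
        final String songTitle;

        Song(String musicCode, String artist, String releaseDateTime, String songTitle) {
            this.musicCode = musicCode;
            this.artist = artist;
            this.releaseDateTime = releaseDateTime;
            this.songTitle = songTitle;
        }
    }

    // Re-keys every item by (Artist, ReleaseDateTime).
    // Items lacking an index key attribute are skipped, so the index may hold fewer items.
    static Map<String, TreeMap<String, List<Song>>> buildIndex(List<Song> table) {
        Map<String, TreeMap<String, List<Song>>> index = new HashMap<>();
        for (Song song : table) {
            if (song.artist == null || song.releaseDateTime == null) continue;
            index.computeIfAbsent(song.artist, k -> new TreeMap<>())
                 .computeIfAbsent(song.releaseDateTime, k -> new ArrayList<>())
                 .add(song);
        }
        return index;
    }

    // Reads only one artist's entries, already sorted by release date,
    // instead of walking every item in the table.
    static List<String> titlesByArtist(Map<String, TreeMap<String, List<Song>>> index, String artist) {
        List<String> titles = new ArrayList<>();
        for (List<Song> sameInstant : index.getOrDefault(artist, new TreeMap<>()).values()) {
            for (Song song : sameInstant) titles.add(song.songTitle);
        }
        return titles;
    }
}
```

Unlike a table primary key, the index keys do not have to be unique, which is why each (Artist, ReleaseDateTime) pair maps to a list of songs in this sketch.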

Adding a GSI to our MuchImprovedMusic table to enable efficient listing of an artist’s songs would look like this:

Note that in addition to the GSI declaration, we’ve added the Artist attribute to the table’s attribute definitions as well.

The next step is to declare the GSI keys in the table entity mapping:

When a GSI key is also part of a composite table primary key, like the ReleaseDateTime attribute in our table, remember to put the annotation in both the ID and entity classes.

Finally, we can create a method for listing an artist’s songs in our repository:

To verify that the index is actually being used when we call the findByArtist method, we can check the ConsumedReadCapacityUnits metric in the CloudWatch DynamoDB metrics:

#5 Further Filtering Options

Using the Spring Data CrudRepository, we can create all sorts of filter methods to retrieve data from the tables. If the filter matches a table or GSI key, that key is used to retrieve the data efficiently; otherwise, a table scan is performed.

Here are some more filter examples:

In addition, we can add paging options to the methods in a pretty simple way:

When creating the Pageable object, we can specify the size of each page and which page we want to retrieve. The returned page contains a list of the retrieved items and some more useful information for building paged APIs.
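
In Spring Data, a page request such as PageRequest.of(page, size) takes a zero-based page index. The plain-Java sketch below illustrates what page size and page number mean for the returned slice; ToyPage is a made-up class for this illustration, not the Spring Data Page type, and it says nothing about how the library fetches the data from DynamoDB.

```java
import java.util.ArrayList;
import java.util.List;

// Toy page over an in-memory list. Hypothetical class, for illustration only.
final class ToyPage<T> {
    final List<T> content;      // items of the requested page
    final int number;           // zero-based page index
    final int size;             // maximum number of items per page
    final long totalElements;   // number of items across all pages

    private ToyPage(List<T> content, int number, int size, long totalElements) {
        this.content = content;
        this.number = number;
        this.size = size;
        this.totalElements = totalElements;
    }

    // Integer ceiling of totalElements / size
    long totalPages() {
        return (totalElements + size - 1) / size;
    }

    boolean hasNext() {
        return number + 1 < totalPages();
    }

    // Slices the full result list into the requested page.
    static <T> ToyPage<T> of(List<T> all, int page, int size) {
        if (page < 0 || size < 1) {
            throw new IllegalArgumentException("page must be >= 0 and size must be >= 1");
        }
        int from = (int) Math.min((long) page * size, all.size());
        int to = (int) Math.min((long) from + size, all.size());
        return new ToyPage<>(new ArrayList<>(all.subList(from, to)), page, size, all.size());
    }
}
```

For a list of five items and a page size of two, page 1 holds the third and fourth items, and page 2 is the last page with a single item.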

Conclusion

That’s all we need to fulfill this guide’s purpose. Following these steps, we can create tables and indexes that fit most use cases, and we have already explored many of the available possibilities.

If you stumble upon a requirement not covered here, exploring the Spring Data library may provide some clues about the path to follow, and the internet can lead the way.

See you next time.
