Dealing with a field data type change in PynamoDB, an ORM for DynamoDB

Melon
Craftsmen — Software Maestros
Aug 26, 2020
Source (Free to use): https://www.pexels.com/photo/interior-of-office-building-325229/

PynamoDB is a Pythonic ORM for AWS DynamoDB. DynamoDB is a scalable, secure NoSQL database with a well-designed API and SDKs. PynamoDB wraps the underlying DynamoDB APIs and gives you a clean, Pythonic way to work with DynamoDB.

In the real world, our DB fields are not constant. They change as the requirements of the project change. The hard part is supporting the old and the new behaviour at the same time. For example, a field's data type can change from String to Binary. If an API (e.g. a versioned REST API) works with DynamoDB, it has to accept both the old data type (String) and the new one (Binary). This post works through a solution for such a case.

Setting the goal

Let’s start by setting a goal. Assume we have a field whose DynamoDB data type is List, and later it changes to Map. So we will:

  • Accept a Map object and save it to DynamoDB
  • Accept a List object, convert it to a Map as the new requirement demands, and save it to DynamoDB as a Map
  • Always return a Map object, whichever form was stored

Setting up the project

Let’s add a few files for the task.

# Add project folder
mkdir change-field-ddb
cd change-field-ddb
# Add Pipfile to manage requirements with pipenv
touch Pipfile
# For Custom PynamoDB attributes
touch attributes.py
# Add models.py for the PynamoDB model
touch models.py
# To test everything
touch test.py

Add Python requirements

Add the following requirements to the Pipfile

[dev-packages]
pipenv = "==2020.8.13"
moto = "==1.3.14"

[packages]
pynamodb = "==4.3.2"

[requires]
python_version = "3.8"

Now install the requirements with

pipenv install

Now the project will look like

tree
.
├── attributes.py
├── models.py
├── Pipfile
├── Pipfile.lock
└── test.py

The changed datatype

Let's assume our previous datatype (List) for a field result is

[3.75, 3.17, 3.90, 3.67, .......]

Here result stores a student's CGPA for each semester, from the first to the final one: result[0] is the 1st semester, result[1] is the 2nd semester, and so on. In the updated version we make this explicit with a Map

{
    'semester 1': 3.75,
    'semester 2': 3.17,
    'semester 3': 3.90,
    'semester 4': 3.67,
    ..................
}
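The move from the old list layout to the new map layout can be sketched in plain Python. This is only an illustration: the function name `list_to_semester_map` is hypothetical and not one of the project files, and the key format simply follows the example above.

```python
def list_to_semester_map(values):
    """Turn [3.75, 3.17] into {'semester 1': 3.75, 'semester 2': 3.17}."""
    # Hypothetical helper: list index 0 corresponds to 'semester 1'.
    return {f'semester {idx}': val for idx, val in enumerate(values, start=1)}


old_result = [3.75, 3.17, 3.90, 3.67]
new_result = list_to_semester_map(old_result)
print(new_result)
```

The position in the list becomes part of the key, so the meaning of each value no longer depends on its order.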

PynamoDB attributes

Next, add a custom PynamoDB attribute that accepts both the old (List) and the new (Map) form and always saves a Map. On read, it returns a Map whether the stored data is an old List or a new Map. We also use a custom UUIDAttribute as the hash key.

# attributes.py
import uuid

import pynamodb.attributes


# UUID attribute, used as the DynamoDB hash key
class UUIDAttribute(pynamodb.attributes.UnicodeAttribute):

    def serialize(self, value):
        return super().serialize(str(value))

    def deserialize(self, value):
        return uuid.UUID(super().deserialize(value))


# Custom attribute to serialize and deserialize the result data
class ResultAttribute(pynamodb.attributes.MapAttribute):

    @classmethod
    def is_raw(cls):
        # Set to use as AttributeContainer
        # https://pynamodb.readthedocs.io/en/latest/api.html#pynamodb.attributes.MapAttribute
        return True

    @staticmethod
    def _parse_value(values):
        return {
            f'semester {idx+1}': val for idx, val in enumerate(values)
        }

    def serialize(self, values):
        # Convert a Python list or tuple to a map before serializing
        if isinstance(values, (list, tuple)):
            values = self._parse_value(values)
        return super().serialize(values)

    def get_value(self, value):
        try:
            # Convert from
            # {'L': [{'N': '3.75'}, {'N': '3.17'}]}
            # to
            # {'M': {'semester 1': {'N': '3.75'}, 'semester 2': {'N': '3.17'}}}
            value = {'M': self._parse_value(value['L'])}
        except (KeyError, TypeError):
            # Not stored as a list, so leave the value untouched
            pass
        return super().get_value(value)
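To make the read path easier to follow, here is a standalone sketch of the same List-to-Map rewrite on DynamoDB's low-level attribute-value format, using plain dicts and no PynamoDB at all. The function name `list_item_to_map_item` is hypothetical; the `L`, `M`, and `N` keys are DynamoDB's own type descriptors.

```python
def list_item_to_map_item(value):
    """Rewrite a DynamoDB List attribute value as a Map attribute value.

    Values that are not in List form are returned unchanged.
    """
    if isinstance(value, dict) and 'L' in value:
        return {
            'M': {
                f'semester {idx}': item
                for idx, item in enumerate(value['L'], start=1)
            }
        }
    return value


stored_as_list = {'L': [{'N': '3.75'}, {'N': '3.17'}]}
stored_as_map = {'M': {'semester 1': {'N': '3.75'}}}

print(list_item_to_map_item(stored_as_list))
print(list_item_to_map_item(stored_as_map))
```

This sketch uses an explicit type check where the attribute above uses try/except; for the two inputs shown the outcome is the same, and a value already stored as a Map passes through untouched.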

The PynamoDB Model

We are going to start with a simple PynamoDB model that has only two fields

# models.py
import uuid

import pynamodb.models
import pynamodb.attributes

from attributes import UUIDAttribute, ResultAttribute


class ResultModel(pynamodb.models.Model):
    id = UUIDAttribute(hash_key=True, default=uuid.uuid4)
    result = ResultAttribute()

    class Meta:
        table_name = "test-ddb-table"

Time to Test

Add some tests in test.py

from decimal import Decimal

import boto3
import moto

from models import ResultModel

with moto.mock_dynamodb2():
    region = 'eu-west-1'
    ResultModel.Meta.region = region
    ResultModel.create_table(wait=True)

    # Data
    result_map = {
        'semester 1': 3.75,
        'semester 2': 3.17,
        'semester 3': 3.90,
        'semester 4': 3.67
    }
    result_list = [3.75, 3.17, 3.90, 3.67]

    # Insert as Map with PynamoDB
    result1 = ResultModel(result=result_map)
    result1.save()
    assert ResultModel.count() == 1
    result1_id = result1.id
    print(result1_id)

    # Retrieve the data that was inserted as a map
    result1_retr = ResultModel.get(result1_id)
    assert result1_retr.result.attribute_values == result_map

    # Insert as a list, so it is converted and saved as a map
    result2 = ResultModel(result=result_list)
    result2.save()
    assert ResultModel.count() == 2
    result2_id = result2.id
    print(result2_id)

    # Retrieve the data that was inserted as a list
    result2_retr = ResultModel.get(result2_id)
    assert result2_retr.result.attribute_values == result_map

    # Store a list value in result directly with boto3
    dynamodb = boto3.resource('dynamodb', region)
    table = dynamodb.Table(ResultModel.Meta.table_name)

    # boto3 does not support float but does support Decimal, so convert
    item = [Decimal(str(v)) for v in result_list]
    # Update the existing item instead of creating one,
    # because the model itself always writes result as a map
    table.update_item(
        Key={'id': str(result1_id)},
        AttributeUpdates={
            'result': {'Value': item, 'Action': 'PUT'}
        }
    )

    assert table.get_item(Key={'id': str(result1_id)})['Item']['result'] == item

    # Retrieve the data that is stored as a list in DynamoDB
    result1_retr = ResultModel.get(result1_id)
    assert result1_retr.result.attribute_values == result_map
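The test builds each `Decimal` from `str(v)` rather than from the float itself. A small standard-library sketch shows the reason: a `Decimal` built directly from a float keeps the float's exact binary value, which has far more digits than the number you typed, while going through `str` keeps the short decimal form. The variable names here are only for illustration.

```python
from decimal import Decimal

value = 3.17

direct = Decimal(value)        # exact binary value of the float, many digits
via_str = Decimal(str(value))  # the short decimal form, Decimal('3.17')

print(direct)
print(via_str)
```

The short form is the one that matches what was meant to be stored, and it stays well within the 38 digits of precision that DynamoDB supports for numbers.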

Run tests with

pipenv shell
python3 test.py

Full code can be found here:

https://github.com/melon-ruet/change-field-ddb
