Deleting Content in DynamoDB from the CLI

Samuel Cozannet
3 min read · Oct 24, 2018

If you started developing serverless applications using AWS Lambda, chances are you also use DynamoDB as a storage backend.

If so, there are scenarios where you want to replace a table with an empty one:

  • Load & performance testing
  • Resetting a development/testing environment

There are basically two methods to do so, each with its own strengths and weaknesses.

Deletion/Creation

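A quick and dirty way to do so is to delete and recreate the table. If you want to keep the settings you had before, you will first need to extract the table definition in the format that create-table accepts through --cli-input-json (the format printed by the --generate-cli-skeleton flag).

This can be achieved with the following command, which strips the read-only fields that create-table rejects. Tables with indexes or streams return more of these fields (LatestStreamArn, for example), and they need to be deleted in the same way: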

$ aws dynamodb describe-table \
--table-name foobar | \
jq '.Table | del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, .TableStatus, .ProvisionedThroughput.NumberOfDecreasesToday)' | \
tee foobar-schema.json

Once this is done, you can delete the table:

$ aws dynamodb delete-table --table-name foobar

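And finally, once the table has actually disappeared (delete-table returns while the table is still in the DELETING state), recreate it with: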

$ aws dynamodb create-table --cli-input-json file://foobar-schema.json
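Because both delete-table and create-table return before the operation has finished, running the three commands back to back in a script can fail. Here is a minimal sketch of the whole sequence using the CLI's built-in waiters. The function name recreate_table and its two arguments are illustrative, not part of the AWS CLI.

```shell
#!/bin/sh
# Sketch: delete a table, wait until it is gone, recreate it from a saved
# schema, then wait until it is usable again.
# recreate_table is a hypothetical helper name, not an AWS CLI command.
recreate_table() {
  table="$1"    # table name, e.g. foobar
  schema="$2"   # schema file, e.g. foobar-schema.json from the jq command above

  aws dynamodb delete-table --table-name "$table" &&
  # delete-table is asynchronous: block until the table no longer exists
  aws dynamodb wait table-not-exists --table-name "$table" &&
  aws dynamodb create-table --cli-input-json "file://$schema" &&
  # create-table is asynchronous too: block until the table exists again
  aws dynamodb wait table-exists --table-name "$table"
}
```

It would be called as recreate_table foobar foobar-schema.json. The && chain stops at the first failing step, so the table is never recreated from a half-finished deletion.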

This is perfect (and fast!) when the table does not have any streams attached. However, you might have noticed that DynamoDB stream ARNs, which you need in order to process streaming data, include the stream's creation date and time, as in:

arn:aws:dynamodb:eu-west-1:123456789012:table/foobar/stream/2018-10-23T10:58:41.679

This means that if you delete and recreate the table, the stream ARN changes, and you potentially lose the connection to your Lambda functions. Suboptimal.
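If you do go down the delete-and-recreate road with a stream enabled, the new ARN has to be looked up and the consumer attached again. The sketch below shows one way this could look. The helper names latest_stream_arn and reattach_lambda and the function name my-function are made up for illustration, and the sketch assumes the old event source mapping is cleaned up separately.

```shell
#!/bin/sh
# Print the ARN of the table's current stream.
# latest_stream_arn is a hypothetical helper name.
latest_stream_arn() {
  aws dynamodb describe-table --table-name "$1" \
    --query 'Table.LatestStreamArn' --output text
}

# Attach a Lambda function to the table's current stream.
# reattach_lambda is a hypothetical helper name; it creates a NEW mapping
# and does not remove the mapping that pointed at the old stream.
reattach_lambda() {
  table="$1"   # table name, e.g. foobar
  fn="$2"      # Lambda function name (assumed to exist already)

  arn="$(latest_stream_arn "$table")" || return 1
  aws lambda create-event-source-mapping \
    --function-name "$fn" \
    --event-source-arn "$arn" \
    --starting-position LATEST
}
```

It would be called as reattach_lambda foobar my-function once the recreated table is available.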

Only deleting keys

If you need to preserve external connectivity, or for some reason cannot tolerate deleting and recreating the table, you might want to delete all of its content instead.

To do that cleanly, you will first need to extract the table's key attribute names, slightly differently than we did for the schema before:

$ export KEY_SCHEMA="$(aws dynamodb describe-table \
--table-name foobar | \
jq -r '.Table.KeySchema[].AttributeName' | \
tr '\n' ' ')"

Now you will need to scan your table for these keys, then use them to drive the deletion. The data extraction has a trick in the jq and tr commands:

$ aws dynamodb scan \
--table-name foobar \
--attributes-to-get ${KEY_SCHEMA} | \
jq -r ".Items[] | tojson" | \
tr '\n' '\0'

This returns a single, long line containing the key of each item as a JSON object, separated by null characters. We just emulated the -print0 feature of find with jq and tr.

Now we will need to pipe that into xargs:
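To see why the null separator matters, here is a small demo that runs without AWS access. It is only a sketch: the sample_keys function, the attribute name id and its values are made up, and printf stands in for the aws command.

```shell
#!/bin/sh
# Sample keys, one JSON object per line, shaped like the jq output above.
# The attribute name "id" and its values are made up for this demo.
sample_keys() {
  printf '%s\n' '{"id":{"S":"first item"}}' '{"id":{"S":"second"}}'
}

# Newline-separated input: xargs applies its own quote handling,
# so the double quotes are lost and the JSON no longer parses.
sample_keys | xargs -n 1 printf '[%s]\n'

# NUL-separated input: each line is handed over verbatim as one argument.
sample_keys | tr '\n' '\0' | xargs -0 -n 1 printf '[%s]\n'
```

The second pipeline is the one that keeps the JSON keys intact, which is what delete-item needs for its --key argument.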

... | xargs -0 -I keyItem \
aws dynamodb delete-item \
--table-name foobar \
--key=keyItem

The whole command being:

$ aws dynamodb scan \
--table-name foobar \
--attributes-to-get ${KEY_SCHEMA} | \
jq -r ".Items[] | tojson" | \
tr '\n' '\0' | \
xargs -0 -I keyItem \
aws dynamodb delete-item \
--table-name foobar \
--key=keyItem

If you want to debug the command, you can add -t to the xargs options, which prints each command line to stderr before executing it.

The advantage of this method is that it maintains the overall coherence of your AWS environment. However, it is slooooooowwwwww, since it runs one delete-item call per item, especially if you are in underpowered development environments.
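One possible way to speed this up is to group the keys and send them through batch-write-item, which accepts up to 25 delete requests per call. The sketch below is untested against a real table and makes a few assumptions: the helper name delete_all_batched is made up, jq 1.5 or later is installed, and any UnprocessedItems returned by DynamoDB are not retried.

```shell
#!/bin/sh
# Sketch: delete every item of a table in batches of up to 25 keys.
# delete_all_batched is a hypothetical helper name. It does NOT retry
# UnprocessedItems, so check the output of each batch-write-item call.
delete_all_batched() {
  table="$1"   # table name, e.g. foobar
  shift        # remaining arguments: the key attribute names

  # Scan only the key attributes, so every returned item is a complete key.
  aws dynamodb scan --table-name "$table" --attributes-to-get "$@" |
  # Emit one compact JSON request per line, each holding at most 25 deletes.
  jq -c --arg t "$table" '
    .Items as $items
    | range(0; $items | length; 25) as $i
    | {($t): ($items[$i:$i + 25] | map({DeleteRequest: {Key: .}}))}' |
  while IFS= read -r batch; do
    aws dynamodb batch-write-item --request-items "$batch"
  done
}
```

With the variable defined earlier it would be called as delete_all_batched foobar ${KEY_SCHEMA}, leaving ${KEY_SCHEMA} unquoted so that each key name becomes its own argument.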

Conclusion

This is quite a change compared to my previous K8s work. I must say serverless seriously rocks, but there are a number of things I wish I had known before I started working with this technology.

This is just the first of a series of tips and tricks about Serverless on AWS so others don't have to spend the time I did on them. I hope you'll enjoy them!

Any questions or feedback? Don't hesitate to ask in the comments.
