Clojure in AWS Serverless: DynamoDB

Welcome back! Did you miss me? No? Uh… Brushing that aside, it’s time to continue evolving our Clojure Lambda with DynamoDB! In the last tutorial we set up a basic Lambda that generated prime numbers from a test request. While generating primes is a good exercise, generating every single one per request is silly.

First though, I have to admit this will be shorter than my other introductions because Numergent has a much larger tutorial specifically on DynamoDB with Faraday. It seems silly to cover ground that is already well trod. I’ll fill the gap with a sweet segue instead.

As always, the code is on GitHub.

But first, the segue…

Follow me to your dreams!

Docker

I can’t keep this a secret: I Love Docker! Ok, maybe I should qualify that. I love Docker for local development.

One of the hardest parts of software development is setting up the local environment. It’s the source of countless bugs because “it works on my machine!” Even if the process is straightforward, it can still take up a large amount of a developer’s time. Over the years I’ve worked with virtualized environments, stub databases, and Vagrant. In the end Docker won. If this were a car commercial, I would call it best-in-class.

I included a Docker Compose configuration file that sets up the external dependencies with a single command, letting you spin up DynamoDB Local in an isolated container. The README has more information on the requirements and how to run it.

Boot tasks

If you peeked below the Docker section of the README, you might have noticed the local boot task. In Boot you can define your own commands, called tasks. Tasks act a lot like Ring middleware in that each one returns a function. Let’s take a look at a few I’ve defined.

The first helper, with-dev, ropes in the development directory. It does roughly what Leiningen’s :profiles map would. It takes the existing value of :source-paths and conjs the development directory onto it. I use this directory as a scratchpad for development tools.

The fun part is it’s just Clojure functions. You could perform all sorts of operations on your configuration in these tasks.

The second, local, builds on the with-dev task and injects some environment variables using boot-environ, the Boot integration of the environ library. In this case, I’m injecting :development "true" into my configuration.
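As a rough sketch, the two tasks could look like this in build.boot. This is not the repository's exact code: the "dev" directory name and the boot-environ version are assumptions, while `environ` is the task that boot-environ provides in the `environ.boot` namespace.

```clojure
;; build.boot (sketch). The "dev" directory and the dependency version are assumptions.
(set-env!
  :source-paths #{"src"}
  :dependencies '[[boot-environ "1.1.0"]])

(require '[environ.boot :refer [environ]])

(deftask with-dev
  "Add the development directory to :source-paths."
  []
  ;; set-env! accepts a function that receives the current value of the key.
  (set-env! :source-paths #(conj % "dev"))
  ;; A task returns middleware; identity passes the fileset through untouched.
  identity)

(deftask local
  "Development setup: dev sources plus :development \"true\" for environ."
  []
  (comp (with-dev)
        (environ :env {:development "true"})))
```

Because tasks compose with plain `comp`, `local` can be chained in front of any other task on the command line.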

With these two tasks, we can now launch a development repl complete with our configuration environment and development namespace folder.

boot local repl

The nice thing about isolating all of these steps in the local task is we control when to insert the development namespace and environment variables.

The third task you might recognize from the last tutorial. build compiles our Clojure code into a JAR file. Each smaller function is a Boot built-in task for one compilation step. task-options! lets us set some task arguments globally rather than every time we call a task. If you’re a Leiningen fan, you’ll notice similar configuration for :main, the project name and version, as well as :uberjar-name. Even the version is a function!
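For readers without the previous tutorial at hand, a build task along these lines might look like the sketch below. The project name, version, main namespace and JAR file name are placeholders rather than the repository's values; `aot`, `pom`, `uber`, `jar` and `target` are Boot built-ins.

```clojure
;; build.boot (sketch). Every name and value below is a placeholder.
(defn project-version
  "Hypothetical helper; the version could just as well be computed."
  []
  "0.2.0")

(task-options!
  pom {:project 'lambda-sieve
       :version (project-version)}
  aot {:all true}                   ; compile every namespace ahead of time
  jar {:main 'lesson-two.core       ; assumed main namespace
       :file "lambda-sieve.jar"})   ; plays the role of Leiningen's :uberjar-name

(deftask build
  "Compile the project and package it with its dependencies into one JAR."
  []
  (comp (aot) (pom) (uber) (jar) (target)))
```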

Boot has many built-in tasks. If you want to know more or see a list, call boot -h. Even the tasks we defined above are listed!

Ok, the segue is over. Now for the fun part.

I’m ready for the fun part!

DynamoDB with Faraday

The first trick in this pony show is to set up our database. If you are following along from the last tutorial, make sure you include Faraday in your configuration.

DynamoDB interface

Because we will be doing this locally, we need to programmatically set up the database.

Here we set up some configuration. We use the environment variable :development to decide whether or not we are using development credentials. Faraday is intelligent enough to grab the production credentials from the Lambda environment (such as the IAM role) if we pass an incomplete configuration. Make sure your :endpoint matches the region you are targeting; otherwise requests default to US-East-1 and you’ll get errors mentioning that region. In my case I’m using US-West-2.

Notice we also define a table-name with :primes. This is important later when we set up our DynamoDB table in AWS.
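A minimal sketch of that configuration, under a few assumptions: the namespace name is made up, the local endpoint assumes DynamoDB Local on its default port 8000, and the function takes the environ map as an argument so it can be tried without any dependencies. `:access-key`, `:secret-key` and `:endpoint` are the Faraday client option keys.

```clojure
(ns lesson-two.config-sketch)

(def table-name :primes)

(defn client-opts
  "Faraday client options chosen from an environ-style map."
  [env]
  (if (= "true" (:development env))
    ;; DynamoDB Local accepts any credentials; port 8000 is its default.
    {:access-key "local"
     :secret-key "local"
     :endpoint   "http://localhost:8000"}
    ;; No keys here: in Lambda the credentials come from the IAM role.
    {:endpoint "https://dynamodb.us-west-2.amazonaws.com"}))

;; With environ on the classpath this would be (client-opts environ.core/env).
(client-opts {:development "true"})
```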

For our local development, we use Faraday’s create-table to create the table, but we would avoid this call in production. I’ll talk more about that later.
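The local table setup could be sketched as follows. The namespace, the local credentials and the throughput numbers are assumptions; the key definition matches the numeric "index" key used for the AWS table later in this post.

```clojure
(ns lesson-two.local-setup-sketch
  (:require [taoensso.faraday :as far]))

;; Assumed options for DynamoDB Local on its default port.
(def local-opts
  {:access-key "local"
   :secret-key "local"
   :endpoint   "http://localhost:8000"})

(defn create-primes-table!
  "Create the :primes table with a numeric hash key named :index."
  [client-opts]
  (far/create-table client-opts :primes
                    [:index :n]                      ; hash key name and type (:n = number)
                    {:throughput {:read 1 :write 1}  ; small, arbitrary capacity
                     :block?     true}))             ; wait until the table is active

;; Run once from the development REPL against DynamoDB Local:
;; (create-primes-table! local-opts)
```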

Here’s our API for our prime store. Pretty simple. We can list, put and get primes from our store. However, there are some important gotchas about DynamoDB that should be brought up:

  • scan operations are discouraged by AWS on Dynamo because they read the entire table and can eat up many read units. For large data sets, this can get expensive (as in $$$) really quickly.
  • Any read operation is, by default, eventually consistent. This requires only half the read units of a strongly consistent read.
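To make the shape of that API concrete, here is one way the three operations could be written with Faraday. The namespace, the function names and the `:prime` attribute are assumptions; `scan`, `put-item` and `get-item` are Faraday functions, and `:consistent?` is the `get-item` option that requests a strongly consistent read.

```clojure
(ns lesson-two.store-sketch
  (:require [taoensso.faraday :as far]))

(def table-name :primes)

(defn list-primes
  "Items returned by a single scan call (DynamoDB pages results at 1 MB)."
  [client-opts]
  (far/scan client-opts table-name))

(defn put-prime
  "Store a prime under its position in the list of primes."
  [client-opts index prime]
  (far/put-item client-opts table-name {:index index :prime prime}))

(defn get-prime
  "Fetch the item at index; eventually consistent unless consistent? is true."
  ([client-opts index]
   (get-prime client-opts index false))
  ([client-opts index consistent?]
   (far/get-item client-opts table-name
                 {:index index}
                 {:consistent? consistent?})))
```

Passing the client options in as an argument keeps the functions usable against both DynamoDB Local and the real table.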

Sieve changes

Now that we are storing our primes, our sieve needs some adjustment. Why calculate primes that we’ve already found?

The key parts that changed:

  • We now use what’s in our store first, calculating only new primes
  • When we find a new prime, we add it to the store along with its location in the prime list.
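The two changes above can be illustrated with a small stand-in. This sketch uses plain trial division rather than the sieve from the repository, assumes zero-based positions, and takes the save function as an argument (hypothetical name `save!`) so it runs without a database.

```clojure
(ns lesson-two.sieve-sketch)

(defn prime-given?
  "True if no known prime p with p*p <= n divides n."
  [primes n]
  (not-any? #(zero? (rem n %))
            (take-while #(<= (* % %) n) primes)))

(defn primes-up-to
  "Return the primes <= max-n. `known` must be the sorted primes found so far,
  starting from 2 with no gaps. Each newly found prime is handed to
  (save! index prime), where index is its zero-based position."
  [known max-n save!]
  (let [known (vec known)
        start (if (seq known) (inc (peek known)) 2)
        found (reduce (fn [primes n]
                        (if (prime-given? primes n)
                          (do (save! (count primes) n)
                              (conj primes n))
                          primes))
                      known
                      (range start (inc max-n)))]
    ;; The store may already hold primes beyond max-n; trim them off.
    (vec (take-while #(<= % max-n) found))))

;; Only the primes after 7 are computed and saved here.
(primes-up-to [2 3 5 7] 20 (fn [index prime] (println index prime)))
```

In the Lambda, `known` would come from the store's list operation and `save!` would be its put operation.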

Now we can pack everything up with boot build!

DynamoDB in production

We are almost ready to push our code. The last two steps are to procure a DynamoDB table in AWS and set the IAM permissions for our Lambda to talk to it.

Creating a Dynamo table is pretty simple. Since this is a tutorial, we will delete it at the end. If you don’t, it could cost you money!

  1. Open the DynamoDB dashboard in AWS
  2. Click Create Table
  3. Name your table “primes” (remember from above?)
  4. Set the Primary Key to “index” with a Number type.
  5. Keep the default settings for now.
  6. Click Create.

Once the table is created, you will see the table dashboard. Near the bottom of Table Details is the ARN for our new table. We’ll need that for the IAM role.

IAM Dynamo permissions for Lambda

You might be wondering why we chose to manually create the table instead of using our Lambda. The reason is simple: We don’t want our Lambda to have any more permissions than it needs. This is the principle of Least Privilege that Amazon recommends. We will restrict our Lambda to only the actions on our table that are absolutely necessary.

Open up the IAM dashboard and open the Roles tab.

  1. You should see your previous Lambda role from the last tutorial. Click on it.
  2. In the Summary page, click on the Inline Policies at the bottom, and click the create policy link.
  3. Select Policy Generator and click Select.
  4. In the AWS Service dropdown, select Amazon DynamoDB.
  5. In the Actions list, select Scan, PutItem and GetItem.
  6. In the ARN box, enter the ARN of the primes table.
  7. Finally click Add Statement, then Next Step.
  8. You should see a fully formed policy for accessing your table from your Lambda. Optionally change the Policy Name and Sid to “PrimeTableAccess”
  9. Double check your policy with Validate Policy, then click Apply Policy.

You will return to the Lambda IAM role summary for your role. Now you can see the new Policy created just for your Lambda under Inline Policies. Cool!

Upload and run your Lambda

Now that our permissions are set and tables created, it’s time to upload our code! Go to the Lambda dashboard for your Sieve Lambda and upload the JAR.

However, there are two gotchas you need to know:

  1. You probably haven’t updated your test from the timeout input test. Set it to something reasonable like { "max": 100 }
  2. We’ve changed the main namespace from lesson-one to lesson-two. Make sure the name of your deflambdafn in your code is updated, as well as the Lambda Configuration tab → Handler to match.

SUCCESS!

Once run, we should see a list of primes. Try running again. Did you notice how much faster it went? Maybe try growing the test to 200 or 500. Ratcheting up gradually, I got up to 1000, 2000, 5000, and 10000. I got a timeout the first time I tried 20000, but because we save the results every time, the second time it succeeded!

Just in case though, make sure to delete your Dynamo tables in AWS when you are done.

Some key takeaways

  • DynamoDB is awesome!
  • We can set fine-grained permissions on Dynamo operations in our Lambda with IAM.
  • We can make simple configuration changes with Boot and environment variables.
  • Even when a request times out, the Lambda has still been doing (and saving) work, so idempotency is important.

Next time we’ll hook all this up to the API gateway to turn our Prime Lambda into Primes-as-a-Service. As always, you can reach me on Twitter @jamesleonis.
