Serverless reporting v0.0.1

Ian Fuller · Trail Blog · Nov 3, 2016
The timing of the London serverless conference was perfect for the engineering team at Trail. We’re taking our first baby steps into a world without servers and getting feedback from people such as Ben Kehoe is invaluable.

This post describes the implementation of our new reporting feature. I’ll describe the top-level serverless architecture we’ve adopted, and how we’re managing the infrastructure.

The work

Our long-term vision is to migrate our RoR monolith onto a serverless infrastructure. We’re doing this to:

  • Decompose our application, affording faster build times and greater flexibility around technology choices
  • Provide a foundation that scales more easily — in terms of team size and customer throughput
  • Support a move to Event Sourcing / CQRS, which is more easily distributed and fault tolerant
  • Provide an environment that is loosely coupled by design

To de-risk the move to serverless we decided to focus on an upcoming reporting feature. The non-realtime nature of the work gives us breathing space if we find bottlenecks or run into any failures. Given a stream of events from the monolith, we’ll be using lambda functions to generate report data.

High level architecture

Broadly speaking the work is made up of the following moving parts:

  1. CRUD Model observers, and CQRS event listeners that capture changes in the existing monolith.
  2. An API Gateway endpoint that receives these change events as JSON post data.
  3. A DynamoDB store that persists each of the events.
  4. A number of Lambda functions processing DynamoDB event stream updates.
  5. Additional DynamoDB stores used to persist snapshots based on the event stream.
  6. A scheduled reporting lambda which queries the snapshot stores.
  7. A set of final report projections stored on S3 ready to be consumed.
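A minimal sketch of step 4, the stream-processing Lambda: it folds DynamoDB stream records into an in-memory snapshot. The `task_type` field and the counting logic are assumptions for illustration; the real functions write their snapshots to the snapshot tables in step 5.

```python
# Hypothetical sketch of a snapshot-building Lambda. Field names
# (task_type) and the aggregation (a count per type) are assumptions.

def apply_event(snapshot, event):
    """Fold one change event into the running snapshot (count per task type)."""
    task_type = event.get("task_type", "unknown")
    snapshot[task_type] = snapshot.get(task_type, 0) + 1
    return snapshot

def handler(event, context):
    """AWS Lambda entry point for a batch of DynamoDB stream records."""
    snapshot = {}
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue
        image = record["dynamodb"]["NewImage"]
        # DynamoDB stream images wrap each value in a type descriptor,
        # e.g. {"S": "audit"} for a string attribute.
        snapshot = apply_event(snapshot, {"task_type": image["task_type"]["S"]})
    # The real function would persist `snapshot` to a snapshot table here.
    return snapshot
```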

Ops

Compared to on-premise, serverless requires less ops work, but if you’re moving from a single PaaS provider you’re going to have your work cut out.

The serverless landscape is fairly new but terraform.io seems to be making good headway, as does Apex.

After some initial prototyping we’ve opted for a mix of Terraform and some basic scripting to get things up and running. Apex looks like a solid tool, but given our existing scripts covered everything except the infrastructure challenges, including local execution support, we decided to continue with this workflow.

To provide deployment stages we opted for a unique AWS account per environment. This allows for a single declarative Terraform configuration and improved access controls.

Deployment

Serverless on AWS is not a fully managed experience. As such we invested a solid amount of time in automating the delivery of infrastructure and code changes. In both cases we used CircleCI and a collection of bash scripts.

Infrastructure deployments:

  1. We run Terraform’s plan command to validate changes for each PR.
  2. If the plan succeeds then the PR can be merged.
  3. To deploy to any of our environments we configured Circle to use release names. Adding a v0.0.1-preprod tag, for example, causes Circle to deploy the changes to the preprod AWS account.
  4. For each of these steps the Terraform state file is persisted in an encrypted S3 bucket per environment (e.g. a single state file for preprod, another for prod, etc.).
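The tag-based staging in step 3 can be sketched as a small helper that splits a release tag into a version and a target environment. The tag format comes from the post; the function itself is hypothetical:

```python
def parse_release_tag(tag):
    """Split a tag like 'v0.0.1-preprod' into (version, environment).

    Hypothetical helper: the real CircleCI scripts are in-house, but any
    tag-driven deploy needs this kind of parse-and-validate step.
    """
    version, _, env = tag.partition("-")
    if not version.startswith("v") or not env:
        raise ValueError(f"not a release tag: {tag}")
    return version, env
```

In CI, the environment half would select which AWS account’s credentials and Terraform state bucket to use.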

Lambda deployments:

For our lambda functions we used Terraform to provision the function itself, by name, but not to deploy code changes. Our Lambda functions are maintained in a separate repo and deployed by a set of in-house scripts that use the following process.

  1. A CircleCI script deploys each lambda function, by name, using the AWS SDK.
  2. When a PR is raised the build goes green once the unit tests pass.
  3. As with the infrastructure changes, the lambda functions use environment variables to deploy into each stage. If v0.0.1-preprod is provided as a tag then the lambda function is built and deployed.
  4. Finally, the deployment process checks for successful invocation before marking the build as stable.
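Steps 1 and 4 above can be sketched with the AWS SDK for Python: push new code for a function, invoke it once, and only mark the build stable if the invocation succeeds. This is a sketch under assumptions — the real scripts are in-house, and `client` is assumed to be a boto3 Lambda client:

```python
import json

def invocation_ok(response):
    """Treat the deployment as stable only if the smoke invocation succeeded.

    boto3's Lambda invoke response carries StatusCode, and a FunctionError
    key only when the function raised an error.
    """
    return response.get("StatusCode") == 200 and "FunctionError" not in response

def deploy_function(client, name, zip_bytes):
    """Push new code for one function and smoke-test it.

    Hypothetical wrapper: `client` is a boto3 Lambda client, `zip_bytes`
    the packaged deployment artifact.
    """
    client.update_function_code(FunctionName=name, ZipFile=zip_bytes)
    resp = client.invoke(FunctionName=name, Payload=json.dumps({"ping": True}))
    if not invocation_ok(resp):
        raise RuntimeError(f"{name} failed its post-deploy invocation")
```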

Progress

We’re definitely at version 0.0.1. There are likely things we’ll change before going into production (e.g. Kinesis instead of DynamoDB streams), and the deployment process is not without challenges. That said, we’re already enjoying super quick build times and the simplicity of our new architecture. There’s definitely value in moving to v0.0.2…

If you’re building something similar or would like to work on projects like this, please get in touch.
