Photo by EMAR DI on Unsplash

Importing existing AWS Infrastructure into Terraform and Ansible

RJ Zaworski
Aug 18 · 8 min read

You can build things out of pebbles. Working with so many unique pieces isn’t easy, but if you slather them with mortar and fit them together just so, it’s possible to build a house that won’t tumble down in the slightest breeze.

Like many startups, that’s where Koan’s infrastructure started: with lovingly hand-rolled EC2 instances sitting behind lovingly hand-rolled ELBs inside a lovingly — yes — hand-rolled VPC. Each came with its own quirks, software updates, and Linux version. Maintenance was a constant test of our technical acumen and patience (not to mention nerves); scalability was out of the question.

These pebbles carried us from our earliest prototypes to the first public iteration of Koan’s leadership platform. But there comes a day in every startup’s journey when its infrastructure needs to grow up.

Motivations

The wolf chased them down the lane and he almost caught them. But they made it to the brick house and slammed the door closed.

What we wanted were bricks: uniform commodities that can be replicated or replaced at will. Infrastructure built from bricks has some significant advantages over our pebbly roots:

  1. Visibility. Knowing who did what (and when) makes it possible to understand and collaborate on infrastructure. It’s also an absolute must for compliance. Repeatable, version-controlled infrastructure supplements application changelogs with a snapshot of the underlying infrastructure itself.
  2. Confidence. Not knowing (at least, not really knowing) your infrastructure makes every change a nervous one. For our part, we didn’t really know ours, which isn’t a great position to be in when that infrastructure needs to scale.
  3. Consistency. Pebbles come in all shapes and sizes. New environment variables, port allocations, permissions, directory structure, and dependencies must be individually applied and verified on each instance. This consumes development time and increases the risk of “friendly-fire” incidents from any inconsistencies between different hosts (see: #2).
  4. Repeatability. Rebuilding a pebble means replicating all of the natural forces that shaped it over the eons. Restoring our infrastructure after a catastrophic failure seemed like an impossible task—a suspicion that we weren’t in a hurry to verify.
  5. Scalability. Replacing and extending are two sides of the same coin. While it’s possible to snap a machine image and scale it out indefinitely, an eye to upkeep and our own mental health encouraged us to consider a fresh start. From a minimal, reasonably hardened base image.

Since our work at Koan is all about goal achievement, most of our technical projects start exactly where you’d expect. Here: reproducible infrastructure (or something closer to it), documented and versioned as code. We had plenty of expertise with tools like Terraform and Ansible to draw on and felt reasonably confident putting them to use, but even with familiar tooling, our initially shaky foundation didn’t exactly discourage caution.

That meant taking things step by gradual step, establishing and socializing patterns that we intended to eventually adopt across all of our cloud infrastructure. That’s a story for future posts, but the journey had to start somewhere.

Dev today, tomorrow the world

“Somewhere” was our trusty CI environment, dev. Frequent, thoroughly-tested releases are both a reasonable expectation and a point of professional pride for our development team. Dev is where the QA magic happens, and since downtime on dev blocks review, we needed to keep disruptions to a minimum.

Before dev could assume its new form, we needed to be reasonably confident that we could rebuild it:

  1. …in the right VPC
  2. …with the right Security Groups assigned
  3. …with our standard logging and monitoring
  4. …and provisioned with a working instance of the Koan platform

Four little tests, and we’d have both a repeatable environment and a template we could extend out to production.

We planned to tackle dev in two steps. First, we would document (and eventually rebuild) our AWS infrastructure using Terraform. Once we had a reasonably-plausible configuration on our hands, we would then use Ansible to deploy the Koan platform. The two-step approach deferred a longer-term dream of fully-immutable resources, but it allowed us to address one big challenge (the infrastructure) while leaving our existing deployment processes largely intact.

Replacing infrastructure with Terraform

First, the infrastructure. The formula for documenting existing infrastructure in Terraform goes something like this:

  1. Create a stub entry for an existing resource
  2. Use terraform import to attach the stub to the existing infrastructure
  3. Use terraform plan (and/or a close look at the imported state) to reconcile inconsistencies between the stub and reality
  4. Repeat until all resources are documented

Here’s how we documented the VPC's default security group, for example:
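As a minimal sketch of what such a stub could look like, assuming the AWS provider's `aws_default_security_group` resource type. The resource name `default` and the region are placeholders chosen for illustration, not Koan's actual configuration.

```hcl
# Assumed provider setup; the region is a placeholder.
provider "aws" {
  region = "us-east-1"
}

# Step 1: an empty stub. It only gives Terraform an address to attach the
# existing default security group to; its arguments get filled in later.
resource "aws_default_security_group" "default" {
}
```

With the stub saved, a command along the lines of `terraform import aws_default_security_group.default <security-group-id>` (substituting the real group ID) would attach it to the live resource.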

At this point, we could run terraform plan to see the difference between the existing infrastructure and our Terraform config.

Using the diff as an outline, we could then fill in the corresponding entry:
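A filled-in entry might end up looking something like the sketch below. The rules shown are the ones AWS gives a default security group out of the box, used here purely as an illustration; the real values would come from the diff, and `var.vpc_id` is a hypothetical variable.

```hcl
# Hypothetical input: the ID of the VPC that owns the security group.
variable "vpc_id" {
  type = string
}

resource "aws_default_security_group" "default" {
  vpc_id = var.vpc_id

  # Allow all traffic between members of this security group.
  ingress {
    protocol  = "-1"
    self      = true
    from_port = 0
    to_port   = 0
  }

  # Allow all outbound traffic.
  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

The aim is for the configuration to describe the resource exactly as it already exists, so that a later plan has nothing left to change.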

Re-running terraform plan, we could verify that the updated configuration matched the existing resource.

The keen observer will recognize a prosaic formula crying out for automation, a call we soon answered. But for our first, cautious steps, it was helpful to document resources by hand. We wrote the configurations, parameterized resources that weren’t imported yet, and double-checked (triple-checked) our growing Terraform configuration against the infrastructure reported by the CLI.

Sharing Terraform state with a small team

By default, Terraform tracks the state of managed infrastructure in a local file. This file contains both configuration details and a mapping back to the “live” resources (via IDs, resource names, and in Amazon’s case, ARNs) in the corresponding cloud provider. As a small, communicative team in a hurry, we felt comfortable bucking best practices and checking our state file right into source control. In almost no time we ran into collisions across branches—a shadow of collaboration and locking problems to come—but we resolved to adopt more team-friendly practices soon. For now, we were up and running.

Make it work, make it right.

Provisioning an application with Ansible

With most of our infrastructure documented in Terraform, we were ready to fill it out. At this stage our attention shifted from the infrastructure itself to the applications that would be running on it—namely, the Koan platform.

Koan’s platform deploys as a monolithic bundle containing our business logic, interfaces, and the small menagerie of dependent services that consume them. Which services run on a given EC2 instance will vary from one to the next. Depending on its configuration, a production node might be running our REST and GraphQL APIs, webhook servers, task processors, any of a variety of cron jobs, or all of the above.

As a smaller, lighter facsimile, dev has no such differentiation. Its single, inward-facing node plays host to the whole kitchen sink. To simplify testing (and minimize the damage to dev), we took the cautious step of replicating this configuration in a representative local environment.

Building a local Amazon Linux environment

Reproducing cloud services locally is tricky. We can’t run EC2 on a developer’s laptop, but Amazon has helpfully shipped images of Amazon Linux, our bricks’ target distribution. With a little bit of fiddling and a lot of help from Vagrant, we managed to bring up reasonably representative Amazon Linux instances inside a local VirtualBox.

At this point, we could create an inventory assigning the same groups to our "local" environment that we would eventually assign to dev:
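A sketch of such an inventory in Ansible's YAML inventory format follows. The group names (`api`, `webhooks`, `tasks`, `cron`) are hypothetical, loosely modeled on the services described earlier, and the host alias and connection details are placeholders for whatever the local VM exposes.

```yaml
# inventory/local.yml (hypothetical path)
all:
  hosts:
    local:
      ansible_host: 127.0.0.1   # placeholder: address of the local VM
      ansible_port: 2222        # placeholder: forwarded SSH port
  children:
    # Every service group points at the same single node,
    # mirroring a one-box environment.
    api:
      hosts:
        local:
    webhooks:
      hosts:
        local:
    tasks:
      hosts:
        local:
    cron:
      hosts:
        local:
```

Keeping the group names identical across environments means a playbook written against the local VM should not need to change when it is later pointed at a real host.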

If we did it all over again, we could likely save some time by skipping VirtualBox in favor of a detached EC2 instance. Then again, having a local, fast, safe environment to test against has already saved time in developing new playbooks. The jury’s still out on that one.

Ansible up!

With a reasonable facsimile of our “live” environment, we were finally down to the application layer. Ansible approaches hosts in terms of their roles: databases, webservers, or something else entirely. We approached this by separating out two “base” roles, one for our VMs generally and one for our app servers in particular, where:

  • The first role described monitoring, the runtime environment, and a default directory structure and permissions
  • The second role added a (versioned) release of the Koan platform

Additional roles layered on top represent each of our minimally-dependent services, which we then assigned to the local host:
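One way to express that layering is a playbook that maps inventory groups to roles. This is only a sketch: the role names (`base`, `app`, `api`, `tasks`), the group names, and the file name are all hypothetical stand-ins, not the names used in Koan's repository.

```yaml
# site.yml (hypothetical file name)
- name: Baseline for every VM
  hosts: all
  become: true
  roles:
    - base   # monitoring, runtime environment, directory layout

- name: Install a release of the platform on app servers
  hosts: api:tasks   # union of the (hypothetical) service groups
  become: true
  roles:
    - app    # a versioned release of the application

- name: REST and GraphQL APIs
  hosts: api
  become: true
  roles:
    - api

- name: Task processors
  hosts: tasks
  become: true
  roles:
    - tasks
```

Run against any inventory that defines those groups, for example `ansible-playbook -i <inventory> site.yml`, each host picks up only the roles for the groups it belongs to.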

We couldn’t bring EC2 out of the cloud, but bringing up a local instance that quacked a lot like EC2 was now as simple as running a single command.

From pebbles to brickwork

With our infrastructure in Terraform, our deployment in Ansible, and all of the confidence that local testing could buy, we were ready to start making bricks. The plan (and there’s always a plan!) was straightforward enough:

  1. Use Terraform to create a new instance
  2. Add the new host to our Ansible inventory and provision it
  3. Add it to the ELB and wait for it to join (assuming provisioning succeeded and health checks passed)
  4. Verify its behavior and make adjustments as needed
  5. Remove the old instance (our pebble!) from Terraform
  6. Rinse and repeat in production

The entire process was more hands-on than anyone really wanted, but given the indeterminate state of our existing infrastructure and the guiding philosophy of “make it work, make it right,” step one was simply waving it out the door.

Make it work, make it right.

Conclusion

Off it went! With only a little back and forth to sort out previously unnoticed details, our new host took its place as brick #1 in Koan’s growing construction. We extracted the configuration into a reusable module and by the end of the week our brickwork stretched all the way out to production.

In our next post, we'll dive deeper into how we imported volumes of undocumented infrastructure into Terraform.

Big thanks to Ashwin Bhat for early feedback, and Randall Gordon and Andy Beers for helping turn the pets/cattle metaphor into something more humane.

And if you’re into building software to help every team achieve its objectives, Koan is hiring!

Developing Koan

Dispatches from the Koan Engineering team

RJ Zaworski

Written by

Digital Entomologist. Learn, do, teach; iterate. https://rjzaworski.com

Developing Koan

A blog from the team building Koan, the platform where teams achieve their goals.
