Treating Your Terraform like an Application: Part 2

How to Dockerize Terraform

Sven Hans Knecht
Capital One Tech

--

Part Two in a two-part series about running Terraform in Docker

Note: This post covers Terraform 11. Terraform 12 changes how plugins work and requires different commands and files to build a Terraform container. For more info see: https://www.terraform.io/upgrade-guides/0-12.html

One of the most fundamental parts of transforming a team’s deployment processes and culture is the transition from infrastructure as a manual process to infrastructure as code (IaC). As covered in Part 1 of this series, putting Terraform in a Docker container helps alleviate the new pain points that come from running IaC, especially on a centralized build server like Jenkins. In this post we are going to cover the how.

Putting Terraform in Docker requires a few different steps and background in both Docker and Terraform.

How to Put Terraform into a Container

The two most important principles I’ve found in dockerizing Terraform are:

  • Separate the code that builds the base docker image from the code that builds the Terraform docker image.
  • Make the job that applies the Terraform separate from the job that builds the Terraform.

So, in order to fulfill these requirements, let’s write a base Dockerfile that has all the parts required to validate, plan, and apply/destroy.

Dockerfile:

# build step to create a Terraform bundle per our included terraform-bundle.hcl
FROM golang:alpine AS bundler
RUN apk --no-cache add git unzip && \
    go get -d -v github.com/hashicorp/terraform && \
    git -C ./src/github.com/hashicorp/terraform checkout v0.11.11 && \
    go install ./src/github.com/hashicorp/terraform/tools/terraform-bundle
COPY terraform-bundle.hcl .
RUN terraform-bundle package -os=linux -arch=amd64 terraform-bundle.hcl && \
    mkdir -p terraform-bundle && \
    unzip -d terraform-bundle terraform_*.zip

This is the start of our Dockerfile. Let’s walk through each of its steps.

  1. It pulls from a Golang container.
  2. It adds git and unzip, checks out Terraform at v0.11.11, and installs the terraform-bundle tool.
  3. It copies in an HCL file (an HCL file is the config that describes which providers and which version of Terraform to bundle). A minimal example HCL file is sketched just after this list.
  4. It runs terraform-bundle package, which downloads Terraform and the providers listed in the HCL file.
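
For reference, here is a minimal terraform-bundle.hcl sketch. The provider list and versions are illustrative assumptions; adjust them to whatever your teams actually use.

terraform {
  # must match the Terraform version checked out in the Dockerfile above
  version = "0.11.11"
}

providers {
  # providers to bundle alongside Terraform; pin the versions your teams depend on
  aws = ["~> 1.60"]
}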

We’ll pause here for a moment to look at why we are building a base container this way. There are definitely alternative methods of getting Terraform into a container. However, this method has four things going for it:

  1. The providers are standardized. Much like other dependencies, maintaining providers is important and having a single location to be able to update and roll out is incredibly useful. If a team or product needs a different version of a provider it can be done as a modification to this base container.
  2. The Terraform version is standardized. This prevents unnecessary version conflicts, as one of the most aggravating messages in Terraform is the failure to read a terraform_remote_state generated by tf<new-version> with tf<old-version>.
  3. It prevents outside dependencies from being pulled in more than once. You need not worry about the bandwidth to download the providers more than once.
  4. It allows us to standardize the authentication process for Terraform. This is an issue that we will cover at a later point, but having a unified method of authentication allows enforcement of best practices.

Now that we have a base Docker image with Terraform and its providers, we can focus on building a container that actually pulls in and runs Terraform commands.

Terraform Commands Docker Image

For this Dockerfile there are a number of things that have to happen:

  1. We must pull in any Terraform code that we wish to run.
  2. We must pass ENV variables to Terraform.
  3. Terraform must be able to authenticate itself.
  4. Terraform must have somewhere to store the state file.

These are listed in order of importance, or the order in which you should solve them. However, we are going to look at them in reverse order.

Terraform must have somewhere to store the state file

First, Point #4. If you are going to use Docker containers to run your IaC, I recommend using remote state. This prevents you from having to worry about volumes, allows you to prevent multiple actions from taking place simultaneously, and allows you to be confident in your remote state reflecting the actual state.

If you are not using remote_state (and I really recommend you do), you will need to mount a volume, edit the Terraform commands we use in this tutorial to write out a state file, and then have a separate task to copy that state file to wherever you'd like it stored. Note: you cannot keep the state file inside the Docker container itself, since the container's filesystem is thrown away when the container is removed.
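
For reference, a minimal remote state configuration might look like the following. The S3 backend and the bucket, key, and region values are illustrative placeholders, not a prescription.

terraform {
  backend "s3" {
    # placeholder values; point these at your real state bucket
    bucket = "my-terraform-state"
    key    = "ecs-service/terraform.tfstate"
    region = "us-east-1"
  }
}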

Terraform must be able to authenticate itself

Second, Point #3. Whether you use roles (like IAM) or keys to authenticate Terraform against your infrastructure, using a single base container allows you to standardize the method of authentication. If you are using access keys, then you should store them encrypted somewhere and use something like HashiCorp Vault to fetch them at run time. If you are using roles, like IAM, then you need a way to get the environment variables/metadata into the container itself. This is done via a two-step process:

  1. By giving the Docker container all the ENV variables of the machine or slave running the Terraform.
  2. By using those ENV variables to authenticate. With AWS, this step would use something like an AssumeRole call to STS to get credentials after passing in the role from the ENV of the EC2 instance or user running the Terraform (see the sketch after this list).
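
As an illustration, passing the host's credentials through to the container at run time might look like this. The image name and the exact set of variables are assumptions for the sketch, not a prescribed interface.

# pass the host's AWS environment straight through to the container
docker run --rm \
  -e AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY \
  -e AWS_SESSION_TOKEN \
  -e AWS_DEFAULT_REGION \
  my-terraform-image terraform plan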

Passing ENV variables to Terraform

Third, Point #2. This step covers not just ENV variables for authentication, but also Terraform init arguments that allow for remote state management across regions or accounts, and variable files that might change based on the application being deployed. Some tips here:

  • Use the built-in Terraform ENV variables like TF_CLI_ARGS_init, which can be constructed outside the container and then passed in with values, like a secret key ID, that are interpolated at run time even though the Dockerfile is written at build time.
  • Separate out what is being passed based on the command being run, so that the correct args are passed for validate, plan, apply, and destroy.
  • Set -get-plugins=false when running init so that you don't try to re-download the providers bundled earlier. A short sketch of how this fits together follows this list.
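
To make that concrete, here is a rough sketch of wiring init arguments through an ENV variable. The bucket, region, and image name are assumptions for illustration.

# build the init arguments outside the container...
export TF_CLI_ARGS_init="-backend-config=bucket=my-terraform-state -backend-config=region=us-east-1 -get-plugins=false"

# ...and let terraform init pick them up inside the container
docker run --rm \
  -e TF_CLI_ARGS_init \
  my-terraform-image terraform init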

Passing in the Terraform to be run

Finally, Point #1. I recommend storing the Terraform next to the application being deployed, and keeping the infrastructure Terraform separate from the application deployment. For example, if you are deploying to an ECS cluster, put all the Terraform for the ECS cluster and ALB in one repository, and put the Terraform for the ECS service and the ALB listener side by side with the application code. This has two benefits:

  1. The application can be deployed anywhere without needing to impact the underlying infrastructure or worry about where it is. Using Terraform data calls to look up information allows for Terraform that looks like:
data "aws_ecs_cluster" "ecs-cluster" {
  cluster_name = "${var.cluster}-cluster-${var.env}"
}

Now you've removed complexity and are no longer manually updating which environment/cluster is being deployed to.

2. This allows you to version your application's Terraform alongside its code, so that any given version can always be deployed, functionally forever, as long as the data look-ups still resolve.

I recommend copying the Terraform code for the deployment into the image as part of the Dockerfile. This lets you build an artifact containing the Terraform code that can be pulled down to any system and deployed. If you have the need, I also recommend pushing that Docker container to a Docker repository. It makes for great auditing and for very easy automation.
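
Pulling those pieces together, the second (deployment) Dockerfile can stay very small. Here is a minimal sketch, assuming the base image built above was tagged terraform-base; the tag, paths, and entrypoint are illustrative choices rather than the only way to structure it.

# start from the standardized base image that already contains Terraform and its providers
FROM terraform-base:0.11.11
WORKDIR /deploy

# bake the application's Terraform code into the image as a versioned artifact
COPY terraform/ .

# init/plan/apply run at container run time, with backend config and
# credentials supplied through ENV variables as described above
ENTRYPOINT ["terraform"]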

Conclusion

We've now written two Dockerfiles, one explicitly and one implicitly: one for a base Terraform container and a second for a deployment of Terraform. Regardless of the provider you are deploying to, or what you are deploying, standardizing your deployment process inside a container is probably useful for your use case. It allows you to version the infrastructure code, which makes for easy rollbacks and auditing. It also allows you to ensure that infrastructure can be rolled out from your developer's laptop or from a CICD system.

This is useful in the event of a mass outage that takes your CICD system down. In that case, waiting for the CICD system to come back up might not be practical, so being able to deploy from anywhere is useful. It also allows you to manage all your Terraform versioning and providers in a single location, while still allowing teams individual flexibility to add on or upgrade separately.

These opinions are those of the author. Unless noted otherwise in this post, Capital One is not affiliated with, nor is it endorsed by any of the companies mentioned. All trademarks and other intellectual property used or displayed are the ownership of their respective owners. This article is © 2019 Capital One.
