Stop using Terraform remote state blocks!

How we got rid of remote state lookups and made our Terraform multi-region aware

Jonathan Hosmer
Peloton-Engineering
11 min read · Apr 5, 2021


Background

About 6 months ago I discovered a limitation in one of our Terraform repositories that, if not addressed, would seriously hinder the kind of infrastructure growth we needed to continue to scale up.

Nearly all of our AWS infrastructure is managed by Terraform and Terragrunt. We have over 550 Terraform modules, 2800 Terraform files, at least 1400 Terragrunt files, 25 different logical “environments”, and use 7 different Terraform providers, including AWS, DataDog, and CloudFlare.

Originally, all of our AWS resources that were created via Terraform lived in a single AWS region. We had resources in other regions but they were created outside of Terraform and we wanted to import these into our Terraform repository and continue expanding our infrastructure footprint to even more regions. There are many reasons to have infrastructure span multiple regions, including disaster recovery, automated failover, localized edge resources in regions geographically far away, performance testing, etc. However, we had used Terraform in a single region for so long that we had to clean up a lot of code and add some new code before we could even think about creating resources in multiple regions via Terraform.

Peloton is full of exciting technical challenges and our explosive growth has opened up plenty of opportunities to tackle these challenges in ways that account for large-scale and long-term success, not just deadlines and deliverables. I was excited to take on this project because it was not handed down to me from management or some “higher-up”. I discovered the problem, proposed a plan to solve it and was given the green light without being asked how long it would take or how difficult it was. I knew I could handle it and my manager empowered me with the freedom, tools and resources I needed to do it, and do it right! Our leadership recognizes that we are a technology company at the core, and this empowerment supports our incredible growth.

Motivation

Expanding our Terraform to support a multi-region setup was the driving force behind this project. It was also necessary to apply this upgrade one environment at a time, so we could iron out any problems in a test or stage environment before updating production. To accomplish this we needed to refactor each environment, creating directories for the regions we wanted resources in and moving all existing Terragrunt files into their respective region directories, all while about 50 PRs a week continued to land against our Terraform repository! Here is a visual representation of what our repository looked like and what we wanted it to look like:
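A simplified sketch of that change (the service names and regions below are illustrative, not our real ones):

Before: one implicit region per environment

environments/
├── dev/
│   ├── common.yaml
│   ├── service-a/
│   │   └── terragrunt.hcl
│   └── service-b/
│       └── terragrunt.hcl
└── production/
    ├── common.yaml
    ├── service-a/
    │   └── terragrunt.hcl
    └── service-b/
        └── terragrunt.hcl

After: an explicit region directory inside each environment

environments/
├── dev/
│   ├── common.yaml
│   ├── us-east-1/
│   │   ├── service-a/
│   │   │   └── terragrunt.hcl
│   │   └── service-b/
│   │       └── terragrunt.hcl
│   └── us-west-2/
│       └── service-a/
│           └── terragrunt.hcl
└── production/
    └── ...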

Before getting into the details of how we were going to do this, let me give a quick intro to Terragrunt. One of the great things about Terragrunt is the ability to have one Terraform module applied to multiple environments. You can read more about the details here — how Terragrunt helps keep your Terraform code DRY — but this picture should help illustrate the concept:
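As a rough sketch (paths and inputs here are illustrative), two environments instantiate the same Terraform module with different inputs:

# environments/dev/service-a/terragrunt.hcl
terraform {
  source = "../../modules/service-a"
}

inputs = {
  instance_type = "t3.small"
}

# environments/production/service-a/terragrunt.hcl
terraform {
  source = "../../modules/service-a"
}

inputs = {
  instance_type = "m5.large"
}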

We utilize this in many different ways for everything from EC2 instance types to entire IAM role policies. For example, permissions for almost everything in a Production environment should naturally be much more strict than those in a development environment where engineers rapidly iterate on their code and therefore need more access to test out what works and what doesn’t. Here is how this example could look in a Terraform module:

IAM Policy Document for different S3 permissions based on environment
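Roughly, with a hypothetical var.bucket_arn standing in for the real bucket, the module contains something like this:

data "aws_iam_policy_document" "s3_access" {
  statement {
    effect = "Allow"

    # Read access everywhere; write access only outside of production
    actions = concat(
      ["s3:GetObject*"],
      var.env == "production" ? [] : ["s3:PutObject*"],
    )

    resources = ["${var.bucket_arn}/*"]
  }
}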

With the ternary operator, we can use the Terraform concat function to combine different actions based on the value of the env variable, which Terragrunt reads from a common.yaml file in the root of each environment directory and automatically passes as an input to all modules. In the above example, if env is “production”, the [“s3:GetObject*”] list will be combined with an empty list, [], resulting in the original list and granting read-only permissions. For all other values of env, the [“s3:GetObject*”] list will be combined with additional write permissions, [“s3:PutObject*”], to become [“s3:GetObject*”, “s3:PutObject*”]. At Peloton, we have environment-based configuration in literally every single one of our Terraform modules for things like cost-tagging, team ownership, and environment resource grouping, just to name a few.

Now that we’ve seen how Terragrunt works to keep our Terraform repository DRY, let me introduce the first problem we had to solve before we could achieve a multi-region Terraform repository: the terraform_remote_state data block. This data source retrieves module output values from another Terraform configuration, using the latest state snapshot from the remote backend, which in our case is an S3 bucket.

A typical configuration could look like this:

Terraform remote state data block
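Something along these lines, with a hypothetical per-environment bucket naming scheme and a hypothetical var.aws_region:

data "terraform_remote_state" "foo" {
  backend = "s3"

  config = {
    bucket = "${var.env}-terraform-state"   # hypothetical bucket naming
    key    = "iam-roles/foo/terraform.tfstate"
    region = var.aws_region
  }
}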

And you would use it to get output values to pass into other input variables like this:

Remote state output values used as configuration settings for another module
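For example, assuming the foo state exposes a hypothetical role_arn output:

module "my_service" {
  source = "../modules/my-service"   # hypothetical service module

  role_arn = data.terraform_remote_state.foo.outputs.role_arn
}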

When Terragrunt instantiates this Terraform module for the production environment it will read the remote state of another module from the production S3 bucket, stage from the stage bucket, etc. The only constraint is that the state file must be at the same path in all buckets: iam-roles/foo/terraform.tfstate. This presented a problem for us because we needed to be able to update a single environment at a time without breaking other environments. In other words, if we migrated our stage environment to a multi-region layout, our state files in the stage bucket would be at a different path than in all other environments. Using the example above, the block would need to be changed for only stage, but remain the same for all other environments:

Different “key” value for multi-region environment
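For a migrated environment, the key picks up a region prefix (the region name here is illustrative):

data "terraform_remote_state" "foo" {
  backend = "s3"

  config = {
    bucket = "${var.env}-terraform-state"
    key    = "us-east-1/iam-roles/foo/terraform.tfstate"   # stage only: region directory added
    region = var.aws_region
  }
}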

Before getting into the different ways of configuring state files at different paths for different environments while maintaining DRY Terraform code, we need to understand another important factor, the Terragrunt dependencies block. From the Terragrunt documentation:

The dependencies block is used to enumerate all the Terragrunt modules that need to be applied in order for this module to be able to apply. Note that this is purely for ordering the operations when using run-all commands of Terraform. This does not expose or pull in the outputs.

For each terraform_remote_state block, there needs to be a dependencies block in the Terragrunt file that instantiates the module. This is necessary to ensure two things: 1) that the remote state exists, and 2) that it is not stale.

The disadvantage of these dependencies blocks, though, is that they are optional. If your Terraform does a remote state lookup but you’ve forgotten to include dependencies for what you are looking up, there is a chance of pulling in stale state or the state may not even exist yet. This is especially true when Terraform modules are shared across development teams.

A typical workflow for a new Peloton microservice is for an engineer to write a Terraform module for any resources their service might need: S3 buckets, SQS queues, Redis, DynamoDB tables, etc. Then they create those resources in a development environment by creating a Terragrunt file somewhere in the environments/dev directory. Development of a new microservice usually lasts several weeks, sometimes months. When the team feels confident that the service is ready, it goes through a Production Readiness Review (conducted by my Production Reliability SRE team) and is finally ready to have resources created in the production environment.

More often than not, the engineer writing the production Terragrunt is not the one who wrote the original Terraform module. And the required resources may have changed several times during development. The person creating the production Terragrunt file may never have even looked at the actual Terraform module. Forgetting to add dependencies, while not a common mistake, was always a risk, and it only takes one omission to be potentially disastrous. If we could find something to replace this unreliable dependencies directive and eliminate this potential for human error, we could increase infrastructure reliability. As a Site Reliability Engineer specializing in Production Reliability, I am always biased towards improving reliability.

After much debate on how to solve these problems, we came up with two options.

Option 1 — Patch up existing code

Write a conditional ternary operation on the file path.

For example:
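A sketch of what that ternary could have looked like (local.multi_region_envs is the hypothetical curated list described below, and var.aws_region is a hypothetical input):

locals {
  # curated list of environments already migrated to the multi-region layout
  multi_region_envs = ["stage"]
}

data "terraform_remote_state" "foo" {
  backend = "s3"

  config = {
    bucket = "${var.env}-terraform-state"
    key    = contains(local.multi_region_envs, var.env) ? "${var.aws_region}/iam-roles/foo/terraform.tfstate" : "iam-roles/foo/terraform.tfstate"
  }
}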

This option seemed straightforward, but we quickly found out that some of the implementation details were less than ideal. First, it would require maintaining a curated list of environments that are multi-region capable, a compromise we didn’t like but were open to because this was a fairly simple solution that could get us over the finish line quickly. If there were only a few of these terraform_remote_state data blocks in our code it would have been quite easy; however, sprinkled throughout the 4000+ files that make up our Terraform repository were about 150 different terraform_remote_state data blocks! That meant we would have to update 150 files every time we migrated a new environment to the multi-region setup, and it could start to get messy very fast:
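Extending the sketch above to handle cross-environment and cross-region lookups (all of these variable names are hypothetical) hints at where this was heading:

variable "remote_state_env" {
  type    = string
  default = ""   # empty string means "same environment as this module"
}

variable "remote_state_region" {
  type    = string
  default = ""   # empty string means "same region"; global resources would need yet another flag
}

data "terraform_remote_state" "foo" {
  backend = "s3"

  config = {
    bucket = "${coalesce(var.remote_state_env, var.env)}-terraform-state"
    key    = contains(local.multi_region_envs, coalesce(var.remote_state_env, var.env)) ? "${coalesce(var.remote_state_region, var.aws_region)}/iam-roles/foo/terraform.tfstate" : "iam-roles/foo/terraform.tfstate"
  }
}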

The details kept getting more complicated: How do we account for AWS resources that are global, like S3 buckets and IAM policies? What if we wanted to look up the remote state of a resource in a different environment or a different region or both? It seemed like we would need to add additional variables on each module that used a terraform_remote_state block.

This proposal was looking less attractive the more we thought about it, but maybe it was our best option? Let’s take a look at the second option we came up with to see how it compares.

Option 2 — Rip off the bandaid

Remove all terraform_remote_state blocks

But how, you may ask? Well, let me introduce you to the new(ish) Terragrunt dependency block. For several years we were forced to rely on dependencies to keep our state fresh when doing a state lookup, but less than a year prior to the start of this project, Terragrunt released the dependency block:

The dependency block is used to configure module dependencies. Each dependency block exports the outputs of the target module as block attributes you can reference throughout the configuration.

By replacing each dependencies block with dependency blocks, we could remove every terraform_remote_state block and pull in the values from variables. The remote state lookup was effectively moved up a layer from Terraform to Terragrunt.

In principle, the idea was quite simple. All we had to do was convert outputs from remote state lookups into variables, and pass in those variable values in the Terragrunt files.

In code, what we needed to do was change each Terraform module from this:
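Using the same hypothetical example as before:

data "terraform_remote_state" "foo" {
  backend = "s3"

  config = {
    bucket = "${var.env}-terraform-state"
    key    = "iam-roles/foo/terraform.tfstate"
  }
}

module "my_service" {
  source   = "../modules/my-service"
  role_arn = data.terraform_remote_state.foo.outputs.role_arn
}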

into this:
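The lookup becomes an ordinary input variable:

variable "foo_role_arn" {
  description = "ARN of the role created by the iam-roles/foo module (now passed in by Terragrunt)"
  type        = string
}

module "my_service" {
  source   = "../modules/my-service"
  role_arn = var.foo_role_arn
}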

and in the Terragrunt file, change dependencies to dependency:
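Roughly:

# before
dependencies {
  paths = ["../iam-roles/foo"]
}

# after
dependency "foo" {
  config_path = "../iam-roles/foo"
}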

and pass in the new inputs from the exported outputs:
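So the Terragrunt file ends up looking something like this (paths hypothetical):

terraform {
  source = "../../../modules/my-service"
}

dependency "foo" {
  config_path = "../iam-roles/foo"
}

inputs = {
  foo_role_arn = dependency.foo.outputs.role_arn
}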

This seemed to solve both of our problems and increase the reliability of our Terraform code and pipelines because, unlike dependencies, if you forget to include a dependency block in your Terragrunt file you will be met with an immediate error during the Terraform validation or plan phase. You never have to worry about stale or missing state, and it is always better to fail with an error than to apply an uncertain configuration.

Similar to the first option, this also meant updating all 150 Terraform modules that had terraform_remote_state data blocks, but also the 600 Terragrunt files instantiating those 150 modules. The actual Terraform change would not be as easy, either. With the first option, a fairly simple sed search-and-replace would take care of 99% of the modules; this option would require something significantly more complex. We would need to find everywhere a remote state output was used and create new variables for them, making sure not to clash with any existing variable names. Modules could have multiple remote state lookup blocks, and we would have to figure out the type of each output so we could declare the new variables with the same type. It was apparent that this was not going to be easy, but was it worth it? The choice boiled down to: easy to implement but messy and unreliable, versus complex and difficult to implement but clean and reliable.

Like I said, I am heavily biased towards improving reliability, code quality and best practices, regardless of the amount of effort or time it takes to get there. Needless to say, this was a very easy choice.

There were plenty of obstacles along the way and in part two of this post I will be sharing the gory details of the different scripts I wrote to automate the execution of this upgrade (because there was no way I was going to be updating 750 files by hand). Now, enjoy a brief look into some of the hurdles we faced as a result of choosing the path we did and what we needed to do to overcome them.

Hurdles

Variable state file paths

Some terraform_remote_state blocks did not have static key values but instead built the path to a tfstate file dynamically from input variables. We even had a module with a variable that expected a list of AWS Glue job names, which were used to construct the state file locations in multiple remote state blocks:
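A reconstruction of the pattern (the variable name and path layout are hypothetical, and for_each on a data source requires Terraform 0.12.6 or newer):

variable "glue_job_names" {
  type = list(string)
}

data "terraform_remote_state" "glue_jobs" {
  for_each = toset(var.glue_job_names)
  backend  = "s3"

  config = {
    bucket = "${var.env}-terraform-state"
    key    = "glue-jobs/${each.value}/terraform.tfstate"   # one state file per Glue job
  }
}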

The above block will read a different tfstate file for each job in the list, making it incredibly difficult to determine the dependency modules programmatically.

Modules that source other modules

We have a large set of “shared” modules that have certain configuration settings that we want applied to all resources wherever they are used throughout the organization. For example, cost tagging to track production vs non-production resources, team ownership tags, and custom KMS keys to encrypt any S3 bucket by default, just to name a few. Application developers who need a new S3 bucket do not use the actual aws_s3_bucket Terraform resource from the AWS provider. Instead, they create a high-level “service” module that calls the shared S3 module to stand up the actual S3 bucket. This workflow is great because it guarantees that every S3 bucket is encrypted, has proper tags, is not public, etc. However, it also caused a bit of a headache when removing terraform_remote_state blocks from these “shared” modules. To surface all variables from the top all the way down to a shared module, I had to write the script in a way that it would:

  1. add new variables to a shared module for all remote state outputs being used anywhere in the module
  2. look for all “service” modules that source the shared module
    a) add the exact same variables to the service module
    b) pass those variables to the source (shared) module
  3. check for any other “service” modules that source the service module from step 2 (because you can have as many levels of abstraction as you want — module A can source module B which can source module C, etc, etc)
    a) add the exact same variables to the service module
    b) pass those variables to the source (service) module
  4. If there are multiple levels of abstraction, make sure the outputs from the source module are also outputs of the service module
  5. Repeat steps 3 and 4 until all modules that source the shared module or source another service module that sources the shared module have been updated with the new variables and outputs
  6. Find all Terragrunt files that source any shared or service module that was updated in steps 1–5 and
    a) Add a dependency block for the module whose remote state was being looked up in the source module (or a higher-level service module)
    b) use outputs of the new dependency block as the inputs for the variables that were added in step 1.

This was probably the most difficult part of the process and as a result the script had a lot of recursion, nested loops and conditional statements. You can read all about the intricacies of the actual scripts that I wrote to do all this in another blog post coming soon!

References / More Information

Special thanks to Tipene Moss and Pete Clark for all of the help with the project.
