Protecting Secrets And Critical Resources With Terraform

A quick guide to the Terraform “lifecycle” and how to prevent embarrassing deployment incidents with it

Edoardo Nosotti
May 21, 2020 · 5 min read
Photo by Jason Briscoe on Unsplash

When I attended AWS re:Invent 2017 in Las Vegas, I had a clear feeling that everything there was about serverless. Sessions on serverless computing and managed services were scheduled in the biggest and fanciest conference rooms available and were literally packed with people. Seats were reserved several months in advance and people queued outside to wait for unclaimed spots.
Meanwhile, the adoption of containers had already been soaring for three years by then, and it scored an impressive +75% in 2018.

The whole shift to serverless, containerization and DevOps practices means that today we deal with many more configurations than actual systems, and this is clearly reflected in “Infrastructure as Code” scripts. You can deploy a whole microservices-based application running on a serverless computing platform with Terraform alone, or spin up a container orchestration platform and deploy containerized apps on it.
Applications require configurations, and most likely some of the parameters are “sensitive data” (API keys, authorization tokens, encryption keys, shared secrets, certificates…), often referred to as secrets.

Whether you are working alone or in a team, you need to back up your IaC scripts somewhere, hopefully in a Version Control System. Configurations in general and secrets in particular, though, are NOT meant to be stored in VCSs.
Admittedly, some VCS and CI providers do offer secrets-management features on their platforms, but to this date I have only heard people recommend against using them.

There are scripts out there scanning public repositories for misplaced credentials. Leave some AWS credentials there and you will soon be paying for somebody else’s botnet. You might even be questioned by the authorities, should that botnet mess with the wrong systems.

…so what?

The quite obvious first step is to create secrets with Terraform and assign empty or placeholder values to them. The actual values can be set after the initial deployment, by different means.

How, though, do you prevent Terraform from overwriting those values and reverting them to the placeholders each time you run it afterwards, to upgrade the infrastructure or to fix it, perhaps reverting accidental changes made outside of Terraform?
This question applies to other kinds of resources too. Resources such as Auto Scaling Groups are meant to change over time, in response to the needs of the services running on them. When you create an ASG you usually need to specify the minimum, maximum and desired number of instances it should have. A newly deployed ASG will most likely NOT run at its maximum allowed capacity, so you don’t want Terraform to scale it back in while it’s under load and leave your users out in the cold, right?

It’s all about life…cycle choices.

The lifecycle directive changes the behaviour of Terraform for a specific resource. Using the ignore_changes directive within a lifecycle block, you can ask Terraform to create a resource and then ignore changes to all or some of its configuration details. Look at the following code snippet:

resource "aws_ssm_parameter" "database_connection_string" {
  name   = "/myapp/development/DATABASE_CONNECTION_STRING"
  type   = "SecureString"
  tier   = "Standard"
  value  = "temp"
  key_id = ...
  lifecycle {
    ignore_changes = [value]
  }
}

This resource block will create an encrypted parameter in the AWS Systems Manager Parameter Store, named /myapp/development/DATABASE_CONNECTION_STRING, and set its value to temp. temp, of course, is not a valid database connection string: you will set the actual connection string later, by other means. If you run Terraform again after changing the parameter value, it should tell you:

No changes. Infrastructure is up-to-date.

That is because of the:

lifecycle {
  ignore_changes = [value]
}

block. We told Terraform to ignore any changes made to the value outside of the script, so we can safely set the actual value and keep using Terraform to build the infrastructure without it overwriting the value with temp.
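If you want a resource to be created once and then left entirely alone, ignore_changes can also cover every attribute. A minimal sketch, with a hypothetical parameter name (in Terraform 0.12+ this is the all keyword; the older 0.11-style syntax was ["*"]):

```hcl
resource "aws_ssm_parameter" "api_key" {
  name  = "/myapp/development/API_KEY"
  type  = "SecureString"
  value = "temp"

  lifecycle {
    # Create the parameter once, then ignore out-of-band
    # changes to ALL of its attributes, not just "value"
    ignore_changes = all
  }
}
```

Use this sparingly: with all set, Terraform will no longer reconcile any drift on the resource, including changes you might actually want it to fix.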

With Auto Scaling Groups, you can take advantage of lifecycle too:

resource "aws_autoscaling_group" "my_asg" {
  name                      = "my-asg"
  max_size                  = 10
  min_size                  = 2
  desired_capacity          = 2
  health_check_grace_period = 300
  health_check_type         = "EC2"
  default_cooldown          = 300
  launch_configuration      = "my-asg-launch-configuration"
  vpc_zone_identifier       = ["my-subnet-id"]
  termination_policies      = ["Default"]
  lifecycle {
    ignore_changes = [
      desired_capacity,
      max_size
    ]
  }
}

The code above will create a new ASG named my-asg, using a “launch configuration” named my-asg-launch-configuration, with a minimum and desired capacity of 2 instances and allowing it to scale up to 10 instances (provided that you create scaling policies too, which are outside the scope of this tutorial).
Since we have specified:

lifecycle {
  ignore_changes = [
    desired_capacity,
    max_size
  ]
}

we can use Terraform to make incremental changes to the infrastructure regardless of the ASG size when we apply the changes. If it’s running “at full steam” using 10 instances, Terraform will not get in the way.

Setting your priorities right

lifecycle {
  create_before_destroy = true
}

is used to ensure the replacement of a resource is created before the original instance is destroyed. As an example, this can be used to create a new DNS record before removing an old record.

taken from the Terraform Docs.

In the scenario presented above, you can create a new DNS record and migrate traffic to the new resource right away, before removing the old record, thus reducing the impact of the change (just choose your TTLs wisely, ok? ;)).
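Another common use for create_before_destroy is with AWS launch configurations, which are immutable: Terraform must create the replacement before destroying the old one, or an ASG still referencing it would break mid-change. A minimal sketch (the AMI ID is hypothetical, and name_prefix lets the replacement get a fresh, non-conflicting name):

```hcl
resource "aws_launch_configuration" "my_asg_lc" {
  # name_prefix instead of name: the old and new launch
  # configurations briefly coexist during replacement,
  # so they cannot share an identical name
  name_prefix   = "my-asg-"
  image_id      = "ami-0123456789abcdef0" # hypothetical AMI ID
  instance_type = "t3.micro"

  lifecycle {
    create_before_destroy = true
  }
}
```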

You can’t put fat fingers on a treadmill, but…

The prevent_destroy flag provides extra protection against the destruction of a given resource. When it is set to true, any plan that includes a destroy of this resource will return an error message.

This directive only applies to the Terraform script it is put into and won’t prevent any other destructive action from happening outside of it. Another Terraform script could import the same resource into its state and subsequently destroy it. You have been warned.
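For example, to make the parameter from the first snippet destroy-proof within this script, you could combine both directives in one lifecycle block (a sketch reusing the earlier names):

```hcl
resource "aws_ssm_parameter" "database_connection_string" {
  name  = "/myapp/development/DATABASE_CONNECTION_STRING"
  type  = "SecureString"
  value = "temp"

  lifecycle {
    # Keep ignoring the value set out-of-band...
    ignore_changes = [value]
    # ...and make any plan that would destroy this
    # resource (including "terraform destroy") fail
    prevent_destroy = true
  }
}
```

To intentionally destroy the resource later, you first remove (or set to false) the prevent_destroy flag and apply again.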

RockedScience

Tutorials, tips and fast news on Cloud, DevOps and Code

Edoardo Nosotti

Written by

Senior Cloud Solutions Architect and DevOps engineer, passionate about AI, conversational interfaces and IoT.
