10 things I wish I knew before learning Terraform (Part 2)

Published in Contino Engineering · 6 min read · Nov 17, 2021

This is the second part of my “lessons learned” blog post on HashiCorp’s Terraform. You can read the first instalment here if you missed it.

Part 1: 10 things I wish I knew before learning Terraform (medium.com)

6. Keep your code together as much as possible

One of the biggest benefits of Terraform is that you can use values derived from one resource as inputs to another. While this in itself isn’t that exciting, the fact that Terraform works out the dependencies between those resources as part of the terraform apply process is. What this means is that you can reference something like a security group in a rule, or a role in a policy, and Terraform will understand that it needs to create the referenced resources first without any additional input from you. Simply by the way that you wrote the code, it understands that implicit relationship.
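As a minimal sketch of that implicit relationship, assuming the AWS provider is already configured: the rule below refers to the security group's ID, and that reference alone is enough for Terraform to order the two resources. The resource names, the CIDR range, and the vpc_id variable are all made up for illustration.

```hcl
# Hypothetical names throughout; assumes an existing VPC and a configured AWS provider.
variable "vpc_id" {
  type        = string
  description = "ID of an existing VPC."
}

resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = var.vpc_id
}

resource "aws_security_group_rule" "https_in" {
  type        = "ingress"
  from_port   = 443
  to_port     = 443
  protocol    = "tcp"
  cidr_blocks = ["10.0.0.0/16"]

  # This reference is the implicit dependency: Terraform creates
  # aws_security_group.web before it creates this rule.
  security_group_id = aws_security_group.web.id
}
```

No depends_on is needed here; the explicit meta-argument is only for cases where the relationship is not visible in the arguments themselves.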

This isn’t without its exceptions though; you’ll notice that I added “as much as possible”. You are always going to need to reference resources that you didn’t create as part of your pipeline. This might be something like a storage location where all the backups go, or a service account that’s part of your organisation's cloud landing zone. You can work around this by using data sources to achieve the same effect, but doing so creates a dependency on those resources existing before your pipeline runs.
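For the backup location case, a data source lookup might look roughly like the sketch below. The bucket name and policy name are hypothetical, and the plan will fail if the bucket does not already exist, which is exactly the dependency described above.

```hcl
# Look up a bucket that was created outside this configuration.
data "aws_s3_bucket" "backups" {
  bucket = "example-org-backups" # hypothetical bucket name
}

# Build a policy that refers to the existing bucket's ARN.
data "aws_iam_policy_document" "backup_write" {
  statement {
    actions   = ["s3:PutObject"]
    resources = ["${data.aws_s3_bucket.backups.arn}/*"]
  }
}

resource "aws_iam_policy" "backup_write" {
  name   = "backup-write" # hypothetical policy name
  policy = data.aws_iam_policy_document.backup_write.json
}
```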

7. Have clear lines of demarcation on responsibility

You aren’t always going to be responsible for provisioning all of your infrastructure, especially in larger teams or highly regulated industries. Even if you do deploy all of your own RBAC and transit, there’s a reasonable argument for keeping configurations that follow a consistent pattern separate from your application code. One of the worst scenarios that you can have in Terraform is where the same resource is managed by two different state files, because each configuration assumes that it is the authoritative source for that resource.

While this might seem obvious, it can get confusing when you have resources that are used widely but also depend on other resources for their configuration. Consider the iam_role that we created earlier for logging into our bucket, which would most likely be a resource that’s used across the account. If we have an application running on a container that requires access to a CloudWatch Log Group which is created in another application stack, how do we provide fine-grained access control to the log group when the resource has yet to be created? This might be a scenario where the platform team is responsible for the role and the application team is responsible for providing the paths via a parameter or an associated data call.
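One way that split could look, sketched from the platform team's side: the role lives in the platform configuration, and application teams only supply log group ARNs through a variable. Everything named here (the variable, the role name, the ECS tasks principal) is an assumption for illustration, not something taken from a real landing zone.

```hcl
# Platform-owned configuration. Application teams only supply ARNs.
variable "log_group_arns" {
  type        = list(string)
  description = "Log group ARNs supplied by application teams (hypothetical input)."
  default     = []
}

# Assumes the container runs on ECS; swap the principal for other runtimes.
data "aws_iam_policy_document" "assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "app_logging" {
  name               = "app-logging" # hypothetical role name
  assume_role_policy = data.aws_iam_policy_document.assume.json
}

data "aws_iam_policy_document" "write_logs" {
  statement {
    actions = ["logs:CreateLogStream", "logs:PutLogEvents"]
    # ":*" covers the log streams inside each supplied log group.
    resources = [for arn in var.log_group_arns : "${arn}:*"]
  }
}

resource "aws_iam_role_policy" "write_logs" {
  # Skip the policy entirely until at least one ARN has been supplied.
  count  = length(var.log_group_arns) > 0 ? 1 : 0
  name   = "write-app-logs"
  role   = aws_iam_role.app_logging.id
  policy = data.aws_iam_policy_document.write_logs.json
}
```

Because the ARNs arrive as plain strings, the role can be planned before the log groups exist; a data call would give stronger validation at the cost of requiring them to exist first.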

8. Use multiple environment files for the same code

Oh, it’s not like that in production…

We’ve all been there. The tactical solution that becomes permanent, or that 3 am ClickOps fix to get the server back online after it ran out of disk space. And then you’ve got configuration drift, where your development environment isn’t the same as your production one. What’s worse is that everyone knows that the environments are different. This slows down your delivery because it forces your developers to add extra checks to their changes, or flat out discourages change altogether because they worry that something will break.

Terraform can help you handle this by allowing you to provide different variable files for each environment while keeping the core code that delivers your infrastructure exactly the same. Consider a structure where each environment has its own tfvars file in an env directory alongside the shared code.

With a layout like this, changes can be made to the environment-level tfvars files, which are then applied against the same Terraform codebase using the syntax below. You could even go one step further: there’s no reason why the variable files need to live in the same repository as the code. Splitting them means that someone can adjust certain settings of the environment, like volume_size, without being able to change the resource that the volume is attached to, because that source code lives in another repository.

terraform apply -var-file="./env/nonprod.tfvars"
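To make the split between code and values concrete, here is a small sketch. Only volume_size and the env/nonprod.tfvars path come from the text above; the availability_zone variable, the EBS volume resource, and the example values are assumptions for illustration.

```hcl
# Shared code, identical for every environment.
variable "volume_size" {
  type        = number
  description = "Size of the data volume in GiB."
}

variable "availability_zone" {
  type        = string
  description = "Availability zone for the volume (hypothetical input)."
}

resource "aws_ebs_volume" "data" {
  availability_zone = var.availability_zone
  size              = var.volume_size
}

# env/nonprod.tfvars would then hold values only, for example:
#   volume_size       = 20
#   availability_zone = "ap-southeast-2a"
```

A production file would set the same two variables to different values, and the resource block would never need to change.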

9. Familiarise yourself with HCL’s functions and meta-arguments

While at its roots HashiCorp Configuration Language (HCL) is a collection of providers, resources, variables, and outputs, how you plumb them all together can often require a little more finesse than is advertised on the box. This can be because of the way that the vendor API returns a value, the order in which resources get created, or because the Terraform provider is still being developed, but rest assured that before too long you’ll find yourself scratching your head wondering why something won’t work.

Enter functions and meta-arguments, some of which we’ve spoken about previously. HCL has a pretty standard set of tools for working with data types (strings, lists, and so on) that you can use to massage inputs into the shape you want. Some of the best examples of how to do this sort of thing can be found in the modules on the public Terraform Registry; I’m especially fond of the AWS VPC one, an excerpt of which I’ve provided below.

In this case, there’s a conditional on the resource creation, using a count to define whether the resource gets created at all, followed by a dynamic number of ingress rules based on a list of string maps. The for_each goes through each map in the list, performs lookups on the values for certain keys, and creates a number of ingress rules, with some acceptable defaults in the event that a key doesn’t exist, as well as handling multiple values (that’s what the split and compact are for). The beauty of something like this is that it allows you to keep all the configuration changes in your environment variable files, rather than having to make frequent changes to your underlying infrastructure code, which should be a “measure twice, cut once” sort of deal.
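The pattern described above can be sketched as follows. This is written in the same style but is not the module's actual code; the variable names and the default rule are hypothetical.

```hcl
variable "create_security_group" {
  type    = bool
  default = true
}

variable "vpc_id" {
  type = string
}

# Each rule is a map of strings; multiple CIDRs are comma-separated.
variable "ingress_rules" {
  type = list(map(string))
  default = [
    { from_port = "443", to_port = "443", protocol = "tcp", cidr_blocks = "10.0.0.0/16,10.1.0.0/16" },
  ]
}

resource "aws_security_group" "this" {
  # Conditional creation: zero instances when the flag is false.
  count  = var.create_security_group ? 1 : 0
  name   = "example"
  vpc_id = var.vpc_id

  # One ingress block per map in the list.
  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      # lookup() falls back to the default when the key is missing.
      description = lookup(ingress.value, "description", null)
      from_port   = lookup(ingress.value, "from_port", 0)
      to_port     = lookup(ingress.value, "to_port", 0)
      protocol    = lookup(ingress.value, "protocol", "-1")
      # split() turns "a,b" into a list; compact() drops empty strings,
      # so a missing key yields an empty list instead of [""].
      cidr_blocks = compact(split(",", lookup(ingress.value, "cidr_blocks", "")))
    }
  }
}
```

Adding or removing a rule then becomes a change to the ingress_rules value in a tfvars file, with no edit to the resource block.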

A lot of the use cases you’ll have when you first start off won’t be this complicated, so don’t let this scare you off. Just know that there are these and many more ways to get Terraform to give you the outcome that you want, without having to copy and paste blocks from your underlying code.

Terraform meta-arguments (count, lifecycle, depends_on and for_each)
Terraform functions

10. Terraform is not a silver bullet

This might seem like an odd point to finish off on for an article that is pretty clearly “pro Terraform”, but bear with me. When I first started using Terraform, I went on that high of thinking that this was the only tool that I’d ever need to use for configuration management. So much so that when I was working on a project to integrate the business application identity management into Azure AD, I naturally thought about using Terraform to manage the RBAC for applications. It didn’t turn out very well and I ended up writing a much simpler solution using Lambda functions and an API Gateway.

If you were one of the people I was working with on that project and you are reading this, consider this my formal apology :) - Ian

The point of all this is that as great a tool as Terraform is, it’s not the Swiss Army knife of configuration management. And while it can do certain things for managing application configuration, tools like Ansible are probably better suited in many cases. The same can be said about managing credentials, providing a CMDB, or driving role-based access control. While you can make it work, don’t fall into the same trap as I did of treating it as the solution to all your problems.

So there we have it. I hope this has been an interesting read for you if you made it this far. As someone who believes the journey is the worthier part, I really hope that this gives you a leg up into the wider world of HashiCorp’s Terraform.