Working with Terraform: 10 Months In
Terraform is a powerful tool for orchestrating cloud infrastructure, and, arguably, an essential tool once your infrastructure reaches a certain scale (or crosses cloud providers). After having written over 40,000 lines of Terraform configuration in HCL over the past 10 months, I’d like to share some observations about working with Terraform.
Terraform configuration is, by its nature, low level. As such, if you’re creating many similar resources such as Lambda functions or CodePipelines, you can expect fairly repetitive HCL. One potential solution is to use Terraform modules. However, modules don’t always make your overall HCL smaller or more maintainable, because you end up having to pass many variables into each module (and expose just as many outputs).
HCL is not particularly composable. Perhaps an HCL syntactic structure akin to Sass’s @mixin would be helpful, though that approach isn’t without its own issues. I’m also not convinced that DRY is the most important principle when it comes to maintainable Terraform configuration.
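To make the variable-passing overhead concrete, here is a minimal sketch of what a single module call can look like. The module path, variable names, output name, and values are all hypothetical, and the syntax assumes Terraform 0.12 or later. Multiply this by every Lambda function you deploy and the module call is often not much shorter than the resources it wraps.

```hcl
# Hypothetical inputs supplied by the caller; names are illustrative only.
variable "artifact_bucket" {
  type = string
}

variable "lambda_role_arn" {
  type = string
}

# Hypothetical module call. "./modules/lambda_function" is an assumed local
# module that would declare a matching variable for every argument below
# and an "arn" output.
module "thumbnail_lambda" {
  source = "./modules/lambda_function"

  function_name = "thumbnail-generator"
  handler       = "index.handler"
  runtime       = "nodejs18.x"
  memory_size   = 256
  timeout       = 30
  role_arn      = var.lambda_role_arn
  s3_bucket     = var.artifact_bucket
  s3_key        = "thumbnail/build.zip"
}

# Each value the rest of the configuration needs must be re-exported.
output "thumbnail_lambda_arn" {
  value = module.thumbnail_lambda.arn
}
```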
While it may strike some as heresy against DRY, I’m increasingly comfortable with copy-paste as an alternative to premature modularization. One of the risks of prematurely extracting modules from a codebase is that you’re introducing dependencies that could make your entire system more brittle, especially at the early stages of development when your team is making many changes all over the codebase. The more subsystems that depend on a particular module, the larger your blast radius when making a change to that module. Further, if you extract modules too early, you may not yet fully grasp requirements across the entire system, leading to incidental complexity as you incrementally expand a module to handle additional use cases. Better to let your infrastructure stabilize and then extract modules for maintainability once you have a clear understanding of the requirements.
While the ability to query remote state is one of the most valuable Terraform features, it can also introduce additional risk. Typically, I try to draw remote state boundaries around logical services rather than resource types (e.g., Lambda deployment vs. CodePipeline). Following Charity Majors, I think it’s a great idea to keep your remote state as small and decoupled as possible. However, depending on the scale of your infrastructure, a file per environment may not be granular enough (especially when using an AWS account per environment). Remote state can also obfuscate module dependencies, making it difficult to reason about the impact of a particular change. The best way I’ve found to mitigate this obfuscation is to limit the use of remote state to top level modules and pass in necessary variables to any child modules.
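As a rough sketch of that last point, the top-level module below is the only place that reads remote state, and the child module receives plain variables. The bucket, key, output names, and module path are assumptions made for illustration, and the syntax assumes Terraform 0.12 or later with an S3 backend.

```hcl
# Top-level module: the only place remote state is read.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"   # hypothetical bucket name
    key    = "network/terraform.tfstate" # hypothetical state key
    region = "us-east-1"
  }
}

# Hypothetical child module. It knows nothing about remote state; it only
# declares vpc_id and subnet_ids as ordinary input variables.
module "api_service" {
  source = "./modules/api_service"

  vpc_id     = data.terraform_remote_state.network.outputs.vpc_id
  subnet_ids = data.terraform_remote_state.network.outputs.private_subnet_ids
}
```

With this layout, every cross-state dependency is visible in one file, so a search for terraform_remote_state in the top-level module shows what a change to the network state could affect.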
When working with Terraform, try to apply in small increments wherever possible. As you’re building a configuration, work from the inside out (starting with the innermost module), and use an IDE (I’ve had great success with this plugin for IntelliJ IDEA). When working inside out, the IDE’s automatic validation of module variables will prove invaluable. In the absence of an IDE, always run terraform validate (perhaps as a Git pre-commit hook). Never run terraform apply or terraform destroy without first running terraform plan (with -destroy when tearing things down), saving the plan with -out, and applying that saved plan. And when you’re refactoring, don’t be afraid to destroy resources and start fresh if you’re able. Often starting from a blank slate will save you a lot of trouble. In cases where you need the existing resources to stay up because people are using them, simply give your new resources a different name or prefix until they’re ready to go live.
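One way to handle the naming side of this is to thread a prefix variable through resource names. The following is a minimal sketch, not a prescribed pattern: the variable name, the default prefix, and the choice of an SQS queue are illustrative assumptions.

```hcl
# Hypothetical prefix that keeps rebuilt resources from colliding with the
# ones still in use. Set it to "" once the new stack goes live.
variable "name_prefix" {
  type    = string
  default = "v2-"
}

# Created alongside the existing "jobs" queue as "v2-jobs".
resource "aws_sqs_queue" "jobs" {
  name = "${var.name_prefix}jobs"
}
```

Keep in mind that for many resource types, changing the name later forces Terraform to replace the resource, so the cutover itself deserves a careful read of the plan.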
Terraform is still a young tool, and the best practices for using it are evolving rapidly. I’d be curious to hear from anyone else who has used Terraform to build an entire environment from scratch. I suspect that beyond a certain scale, we need some sort of code generation and visualization tool to make sense of it all.