Working with Terraform to streamline our provisioning process
Using declarative infrastructure-as-code to manage cloud resources
Setting the scene
Working for a large and highly scalable e-commerce platform, like ASOS, comes with its challenges. One of these is the management and configuration of a huge number of resources in the cloud.
Using popular infrastructure-as-code techniques and principles, my team and I have created numerous automated deployment pipelines to largely keep this problem at arm’s length. This means that we are able to reliably reproduce environments at the click of a button, while any changes are safely documented in source control. But, as the features and requirements of the business have evolved, the complexity of these pipelines has increased.
Our current provisioning pipelines include combinations of pre-templated steps, custom PowerShell scripts and Microsoft ARM (Azure Resource Manager) templates, to name just a few. Once strung together, these pipelines can be difficult to maintain: steps quickly become out of date and different techniques are often used to achieve the same thing.
So, this is where Terraform comes in.
What is Terraform?
Terraform is an open-source tool developed by HashiCorp for declaratively defining infrastructure as code. It enables users to manage resources and consistently reproduce infrastructure across a variety of cloud providers. Combined with its simple CLI (command-line interface), it makes provisioning easier.
Why is Terraform so good?
In addition to the positives of using infrastructure-as-code, Terraform also provides a host of other benefits.
The positives of a declarative style include increased immutability and reduced side effects. It generally also makes code easier to read and understand. Terraform uses HCL (HashiCorp Configuration Language), which has been specifically designed to be written and modified by humans. This means it is easy to see what has been deployed, and you don’t need to be an experienced programmer to understand the current state of the system.
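To give a flavour of what this looks like, here is a minimal sketch of an HCL definition for an Azure resource group and a storage account. The resource names, location and tags are made up for illustration, and the sketch assumes the Azure provider (azurerm) has already been configured.

```hcl
# Hypothetical example: names, location and tags are illustrative only.
resource "azurerm_resource_group" "shop" {
  name     = "rg-shop-dev"
  location = "West Europe"

  tags = {
    environment = "dev"
  }
}

# The storage account refers to the resource group above, so Terraform
# works out that the group has to be created first.
resource "azurerm_storage_account" "assets" {
  name                     = "stshopassetsdev"
  resource_group_name      = azurerm_resource_group.shop.name
  location                 = azurerm_resource_group.shop.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}
```

Nothing here says how to create the resources or in what order. The file only describes what should exist, and Terraform derives the steps.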
Iteratively updating infrastructure in place is known as mutable infrastructure, as over time different changes are applied to a system to achieve its current state. This can lead to configuration drift between different resources or environments and creates snowflake servers, which can end up being difficult to reproduce.
Due to its declarative style, Terraform’s definition files can be used to produce an immutable infrastructure. This allows systems to be consistently recreated from scratch, which helps to avoid configuration drift, makes it easy to reproduce systems across different environments and prevents changes that go unrecorded.
Thanks to an intuitive CLI and simple definition files, Terraform is very transparent. Again, due to its declarative nature, it is very easy to see the current state of a system and, when making changes, the desired state is just as easy to read.
Before any infrastructure is updated, the Terraform plan and apply commands provide a clear view of what will change in your system. When running them, it is easy to back out if you notice anything unexpected.
The clear HCL definition files combined with the small number of CLI commands make Terraform simple to understand. This means the learning curve is relatively gentle, which makes it easy to get started.
Deploying changes is very fast, as Terraform keeps a record of the current state of the system. When changes are applied, it compares that state with the definition files, so it knows exactly what needs to be created, modified or deleted and can skip everything else.
Terraform supports a wide range of providers, from AWS and Azure all the way through to GitHub and Grafana. Essentially, anything with an API can be built into a Terraform provider.
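As a rough sketch, this is how a single project might declare two quite different providers side by side. It assumes Terraform 0.13 or later (which introduced the source syntax below); the version constraints and the organisation name are illustrative assumptions rather than recommendations.

```hcl
terraform {
  required_providers {
    # Version constraints are illustrative only.
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
    github = {
      source  = "integrations/github"
      version = "~> 6.0"
    }
  }
}

# The Azure provider requires a features block, even if it is empty.
provider "azurerm" {
  features {}
}

# "example-org" is a hypothetical GitHub organisation.
provider "github" {
  owner = "example-org"
}
```

Running terraform init downloads the declared providers, after which resources from both can be defined in the same set of files.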
Terraform is distributed as a single executable, which only needs to be installed on the machine executing the commands. There is no need to install, maintain and configure extra software across your estate compared with a client-server architecture.
Terraform and all of its providers are open source. The repositories on GitHub are very active and open to new features and pull requests from the HashiCorp community.
Are there any cons?
Of course. Terraform is not a one-size-fits-all solution to managing infrastructure. Some Terraform providers are more mature than others - the AWS provider currently exposes many more resources than the Azure provider, for example.
In the case of Azure, if a resource is not yet supported, extra ARM or PowerShell steps need to be forced into the solution, and these don’t sit well alongside HCL. Terraform can only manage the deployment of an ARM template, not the physical resources the template creates, so managing changes to these parts of the infrastructure can be difficult.
The framework is written in Go, so there is a learning curve if you want to contribute to the open-source libraries. Also, once a change has been applied it cannot always be rolled back, so you must be very careful before applying any changes.
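For illustration, wrapping an ARM template in Terraform might look something like the sketch below. In recent versions of the Azure provider the resource for this is azurerm_resource_group_template_deployment (older versions used azurerm_template_deployment). The deployment name, resource group name and template.json file are all hypothetical.

```hcl
# Hypothetical: template.json is an ARM template stored next to this file,
# and "rg-shop-dev" is an existing resource group.
resource "azurerm_resource_group_template_deployment" "unsupported" {
  name                = "unsupported-resource"
  resource_group_name = "rg-shop-dev"
  deployment_mode     = "Incremental"
  template_content    = file("${path.module}/template.json")
}
```

Terraform tracks the deployment itself, so it notices when the template text changes, but it has no view of the individual resources that the template goes on to create.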
Taking all of the above into consideration, my team and I decided to try using Terraform to provision a small subset of our infrastructure to help us fully understand the benefits and build some confidence.
Terraform does not require you to fully migrate all of your existing infrastructure across into a single project, as it only controls the infrastructure which is defined. The import command allows you to include existing resources, so if you are building on pre-existing resources, as we were, you don’t have to start totally from scratch.
Terraform manages infrastructure using small building blocks known as resources, which were very simple to import into our project, so we were soon able to build up a clear view of our current system. Once they were imported, we could start to add the additional resources our system required. All of this was clear and simple, and I liked the way we were able to consolidate all of the different techniques we had previously used in our deployment pipeline into a single project.
The first challenge came when we needed to get our infrastructure deployed using an automation pipeline. Our first iteration was very simple and only consisted of three steps per environment. The first step installed the Terraform executable, the second ran the plan command so we could review changes, and the final step ran the apply command.
Initially this was enough, but the pipeline soon proved unsuitable for a team making numerous changes to our system. It became apparent that it was very easy to apply changes without properly reviewing what they would do, which caused us some issues. To combat this, we added steps that required manual intervention, making it harder to apply a change unreviewed.
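As a sketch of what importing involves: newer Terraform releases (1.5 and later) also let you declare an import in HCL, as an alternative to running the import command by hand. The resource name is hypothetical and the subscription ID below is a placeholder.

```hcl
# Requires Terraform 1.5 or later. With older versions, the equivalent is
# the CLI command: terraform import azurerm_resource_group.existing <id>
import {
  to = azurerm_resource_group.existing
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-existing"
}

# The definition must describe the resource as it already exists;
# otherwise the next plan will propose changes to it.
resource "azurerm_resource_group" "existing" {
  name     = "rg-existing"
  location = "West Europe"
}
```

Either way, the import only records the resource in Terraform's state. You still write the matching definition yourself.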
As the requirements and complexity of our system increased, our first attempt at a project structure was not good enough. With numerous different resources required across many different environments, the project was messy and confusing.
As the Terraform documentation doesn't explicitly provide a recommended project structure, we came up with our own. To improve the development experience and make the project clearer for everyone, we decided to split our project into three distinct groups.
- Modules: Used to create reusable components and set common standards across different resources
- Plan: Used to define blueprints for infrastructure using the modules
- Environments: Used to control environment-specific variables and requirements
This trial has allowed us to gain valuable experience and iron out some of the wrinkles of using Terraform. It has also helped us to refine concepts such as automation, project structure and monitoring, which I will discuss further in later posts.
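A minimal sketch of how these three groups could fit together is shown here. The directory names, file names, module inputs and variable values are assumptions made for illustration, not our exact layout.

```hcl
# Hypothetical layout:
#
#   modules/storage/          reusable component with shared standards
#   plan/main.tf              blueprint that wires the modules together
#   environments/dev.tfvars   environment-specific values

# plan/main.tf
variable "environment" {
  type        = string
  description = "Short environment name, for example dev or prod."
}

variable "location" {
  type    = string
  default = "West Europe"
}

# The storage module is assumed to accept these two inputs.
module "storage" {
  source      = "../modules/storage"
  environment = var.environment
  location    = var.location
}

# environments/dev.tfvars would then contain a single line:
#
#   environment = "dev"
```

With a layout like this, the same blueprint can be planned against each environment by passing a different variables file, for example with the -var-file option of the plan command.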
The process has largely been successful and we will be continuing to use what we have developed so far. I’m sure there is plenty more to learn on our Terraform journey, but I would definitely recommend giving it a go to see if it would work for you.
I hope you liked the article and it gave a helpful introduction to Terraform. Please feel free to drop me a message or give me some claps so more people can see it!