Terraform in Real Life: Writing Modules

Shane Mitchell
Published in Version 1
Sep 22, 2022

This is a follow-on from my first blog post. In this post, I want to talk about my experience writing Terraform modules and some of the tips I have gathered over the years.

What is a module?

A module is a container for multiple resources that are used together.

Every Terraform configuration has at least one module, known as its root module, which consists of the resources defined in the .tf files in the main working directory.

What are modules? The above description from the Terraform docs explains that essentially every Terraform configuration you can init and apply is a module: any directory containing .tf files qualifies. Typically, however, when we talk about modules we are not talking about the root module, but about "child" modules. These are repeatable blocks of Terraform code that we can test, version and reuse to reduce duplication; in a more traditional programming language such as Java, they would equate to classes or methods. Child modules are building blocks that are pieced together to define workloads and environments. Think LEGO. They are an extremely useful feature of Terraform that absolutely should be utilised, although they do introduce additional considerations to your development process.

A Terraform module is simply a directory containing one or more .tf files. When used correctly, it can be a powerful tool, providing a number of benefits including:

  • Version control
  • Reduced blast radius
  • Keeping Terraform code DRY (as much as possible)
  • Predictability and consistency across environments and projects
  • Collaboration across DevOps teams

Types of Modules

Child modules can be referenced in your code in two ways — local and remote. Local modules are stored in a directory beside your root module and allow you to group related parts of your code and reuse them. An example of this might be a default S3 bucket template that you want to deploy a number of times to your environment. Local modules are easy to manage as they are just another part of your (root) Terraform repo.

Remote child modules, on the other hand, are stored in a remote location (e.g. a separate git repository). The primary benefit of remote modules is that they can be reused across multiple Terraform projects; this also means, however, that they can have quite a significant blast radius and therefore need to be managed consistently.
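To make the distinction concrete, here is how the two styles are referenced from a root module (the local path, repository URL and tag below are illustrative placeholders):

```hcl
# Local child module, stored in a directory beside the root module
module "logs_bucket" {
  source = "./modules/s3-bucket"
}

# Remote child module, pulled from its own git repository and pinned to a release tag
module "logs_bucket_remote" {
  source = "git::https://gitlab.com/my-org/tf-aws-s3-bucket.git?ref=v1.2.0"
}
```

Pinning the remote source to a tag is what keeps its larger blast radius manageable: consumers only pick up changes when they deliberately bump the ref.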

The below diagram is an illustration of how remote modules can be used to build multiple environments.

[Diagram: Overview of how modules can be defined in Terraform]

Where to start?

In many cases, you will find what you need in the Terraform Registry. Sometimes you can use modules directly from the registry; other times you can clone them and tweak them for your needs. At Version 1, we have our own Terraform standards that we implement for our customers, and therefore we write our own modules. The following tips are aimed at other Terraform users who are writing their own remote modules.

My tips

The following advice has been gathered from using Terraform on real customer projects over the past six years. This is not an exhaustive guide, but a set of lessons I've learned that will hopefully be useful to others. My focus here is on remote "child" modules which are stored in their own git repositories.

Is a module needed?

This is the first question you should ask yourself; planning out your modules will make your life a lot easier. While modules offer many advantages, they add overhead to maintaining your code base, so you must decide whether a remote module is suited to your needs. I make that decision by asking the following questions:

[Diagram: Module decision flow]

Enforce standards

When you have decided that a remote child module is required, you should have a structure in place: when developing modules to be used by other teams, consistency and standards are essential. Enforcing standards is made much easier by the many Terraform tools available (see my previous blog for examples). Here at Version 1, we establish the following criteria for our modules:

  • Consistent repo structure
  • Good quality readmes
  • Naming conventions — tf-<provider>-<module-name>
  • Semantic versioning (semver)
  • Automated Module testing
  • Protection of master/main branch
  • A full working example

Here is an example of a full module repo:

[Screenshot: Example of a module repo in GitLab]

Note: in the above, the working example is contained in the Terratest directory, which is used both for testing the module and for demonstrating how to use it.

Pin provider and Terraform versions

As recommended by HashiCorp, it is best practice to add constraints to both the Terraform version and provider versions. This is simple to do in the terraform and required_providers blocks and reduces headaches later on.

From experience, however, I have found that pinning providers can make it more complicated to maintain modules — particularly when they are nested (see next section). My preference is to set a minimum provider version in a “child” module and then tighten the constraints in the root module.

# module
terraform {
  required_version = ">= 1.0.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}

In the above snippet, we set a minimum Terraform version to provide flexibility to users of the module. We are stricter with the AWS provider: the tilde constraint (~>) allows users to use any minor version (3.x) but not a different major version such as 4.1.

# root module
terraform {
  required_version = "~> 1.1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "= 3.20"
    }
  }
}

This is an example of a root module (i.e. where our module is consumed). Here we are stricter because the root module is not intended to be a building block, but an environment blueprint. We generally don't want the same level of flexibility; instead, we want to control configuration changes.

What goes into a child module?

It is important to consider what goes into a child module and where to draw the line. As a simple rule of thumb, I would ask “Do these resources always get created together?”.

For example, when creating a module for an AWS Application Load Balancer (ALB), you might add the following resources:

  • Load Balancer
  • HTTPS Listener
  • Default Target Group
  • Route 53 ALIAS record

And then have inputs for the below, as these may be created as part of another shared module for networking:

  • S3 bucket for logging
  • Route 53 hosted zone

The above is just an example and will depend on your overall structure for modules and projects. The point is that we don’t want to include every resource that might be related in a single child module.
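As a sketch of that boundary, the ALB module above would own its own resources but only take references to the shared ones. The arguments shown are illustrative and far from a complete ALB configuration:

```hcl
# variables.tf: references to resources owned by other modules
variable "logging_bucket_id" {
  type        = string
  description = "Existing S3 bucket to receive ALB access logs."
}

variable "hosted_zone_id" {
  type        = string
  description = "Existing Route 53 hosted zone for the ALIAS record."
}

# main.tf: resources this module always creates together
resource "aws_lb" "main" {
  name               = "example-alb"
  load_balancer_type = "application"

  access_logs {
    bucket  = var.logging_bucket_id
    enabled = true
  }
  # subnets, security groups, listener and target group omitted
}

resource "aws_route53_record" "alb" {
  zone_id = var.hosted_zone_id
  name    = "app"
  type    = "A"

  alias {
    name                   = aws_lb.main.dns_name
    zone_id                = aws_lb.main.zone_id
    evaluate_target_health = true
  }
}
```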

Don’t nest and nest

To reiterate my first point, it will make your life a lot easier if you plan out your modules. Terraform allows modules to be nested, which means a group of fundamental modules can be combined to create a more complex pattern (what we call a "core module"). An example of this might be an application module which is made up of smaller ALB, EC2 and RDS modules. This is a good use case for module nesting, but I suggest not nesting more than two layers deep. Any more than that results in a Russian-doll scenario that is painful to maintain: an update to a single attribute in the innermost module requires changes at every level of nesting. It's a lot more trouble than it's worth IMO.
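Under that two-layer rule, the "core" application module composes the fundamental modules, and the fundamental modules contain no further module calls. A sketch, with placeholder repository URLs and tags:

```hcl
# tf-aws-application (core module, layer 1)
# Each child below is a fundamental module (layer 2) with no nested module calls.
module "alb" {
  source = "git::https://gitlab.com/my-org/tf-aws-alb.git?ref=v2.1.0"
  # ...
}

module "ec2" {
  source = "git::https://gitlab.com/my-org/tf-aws-ec2.git?ref=v3.0.2"
  # ...
}

module "rds" {
  source = "git::https://gitlab.com/my-org/tf-aws-rds.git?ref=v1.4.0"
  # ...
}
```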

Don’t complicate inputs

It can seem like a good idea to simplify a module's inputs by creating a small number of variables that each hold many values, such as var.account_config, which may contain region, vpc_id, hosted_zone_name and subnets.

This could be used in a module like so:

# variables.tf -----------------------
variable "account_config" {
  type        = any
  description = "A really useful object, with all my account details"
}

variable "instance_config" {
  type        = any
  description = "A really useful object, with all my instance details"
}

# main.tf ----------------------------
resource "aws_instance" "example" {
  key_name      = var.account_config.key_pair_name
  subnet_id     = var.account_config.subnet_id
  ami           = var.instance_config.ami_id
  instance_type = var.instance_config.instance_type
  ...
}

data "aws_route53_zone" "domain" {
  name         = var.account_config.hosted_zone_name
  private_zone = true
}

resource "aws_route53_record" "ec2_private_dns" {
  zone_id = data.aws_route53_zone.domain.zone_id
  name    = var.instance_config.name
  type    = "A"
  ttl     = "300"
  records = [aws_instance.example.private_ip]
}

In the above example, the variables.tf file is nice and clean: it has just two inputs that populate everything we need. While this may be a tempting approach, IT IS A BAD IDEA… trust me, I've been there. It makes it much more difficult for consumers of the module to understand the required inputs and to use the module. It also makes it more complicated to set default values for inputs and to make some variables optional.

# variables.tf -----------------------
variable "key_pair_name" {
  type        = string
  description = "The name of the EC2 keypair to use for the ec2_user SSH key."
}

variable "subnet_id" {
  type        = string
  description = "The ID of the subnet to create your instance in."
}

variable "ami_id" {
  type        = string
  description = "ID of the AMI to use for your instance."
}

variable "instance_type" {
  type        = string
  description = "The type of instance you want to create."
  default     = "t3.medium" # this makes it optional
}

variable "instance_name" {
  type        = string
  description = "The name of the instance you want to create."
  default     = "my_ec2_instance" # this makes it optional
}

variable "hosted_zone_name" {
  type        = string
  description = "The name of an existing Route 53 Hosted Zone you want to create a record for your instance in."
}

# main.tf ----------------------------
resource "aws_instance" "example" {
  key_name      = var.key_pair_name
  subnet_id     = var.subnet_id
  ami           = var.ami_id
  instance_type = var.instance_type
  ...
}

data "aws_route53_zone" "domain" {
  name         = var.hosted_zone_name
  private_zone = true
}

resource "aws_route53_record" "ec2_private_dns" {
  zone_id = data.aws_route53_zone.domain.zone_id
  name    = var.instance_name
  type    = "A"
  ttl     = "300"
  records = [aws_instance.example.private_ip]
}

A better solution is to have simple inputs, as shown here. While this makes the variables file larger, it makes the module much less opinionated and easier to use. It also means you can easily use terraform-docs to generate a descriptive readme for your module.
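With one variable per input, generating the readme is a one-liner with terraform-docs (the command below is from its standard CLI; check the flags against your installed version):

```shell
# Render an inputs/outputs table and inject it into the module's README
terraform-docs markdown table --output-file README.md .
```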

A really good blog on Terraform variable best practices can be found here.

Set logical default values

As mentioned in the previous point, creating lots of inputs provides flexibility to the consumers of the module. By setting default values for these, they become optional. It is important to align these default values with the standards of your organisation. An example of this is setting the default for instance_type to a reasonable size:

variable "instance_type" {
  type        = string
  description = "The type of instance you want to create."
  default     = "t3.medium" # this makes it optional
}

In cases where your organisation has a strict requirement to enforce (which is often related to security), you can omit the input var completely and hardcode the value in the module. This means that consumers have no option to override the default:

resource "aws_ebs_volume" "example" {
  ...
  encrypted  = true            # always will be encrypted
  kms_key_id = var.kms_key_id  # consumer must provide their KMS key as an input
}

Provide (many) descriptive outputs

Modules are intended to be used as building blocks, which means they often need to provide information to root modules and to other modules. This information is exposed through outputs. Terraform has always allowed you to output attributes of resources, while more recent versions also allow you to output entire resources. I prefer outputting attributes from modules, but both approaches are valid; the key is to be consistent across your modules.

# ec2 module - outputs.tf
output "iam_role_name" {
  description = "Name of EC2 instance IAM Role"
  value       = aws_iam_role.main.name
}

output "instance_id" {
  description = "ID of EC2 instance"
  value       = aws_instance.main.id
}

# root module - main.tf
module "ec2" {
  ...
}

output "ec2_iam_role" {
  description = "IAM Role associated with EC2 Instance"
  value       = module.ec2.iam_role_name
}

output "ec2_instance_id" {
  description = "EC2 Instance ID"
  value       = module.ec2.instance_id
}

The benefit of this approach is that it is clear to consumers of the module what outputs are available to them, and it is easy to document with a tool like terraform-docs.

# ec2 module - outputs.tf
output "iam_role" {
  description = "EC2 instance IAM Role resource"
  value       = aws_iam_role.main
}

output "instance" {
  description = "EC2 instance resource"
  value       = aws_instance.main
}

# root module - main.tf
module "ec2" {
  ...
}

output "ec2_iam_role" {
  description = "IAM Role associated with EC2 Instance"
  value       = module.ec2.iam_role.name
}

output "ec2_instance_id" {
  description = "EC2 Instance ID"
  value       = module.ec2.instance.id
}

This is the other approach, which outputs full resources. It is less clear what outputs are available; however, the consumer can use any attribute of a resource that is output, although they might need to look up the provider docs to find them.

As mentioned, I prefer the first approach, which aligns with the recommendation for simple inputs. Again, you can use terraform-docs to populate the outputs section of the readme for your module. If you choose this route, I suggest you output every attribute that you think might be relevant — otherwise you will spend a lot of time updating your module just to add outputs as they are needed.

Use labels

This is not always going to suit your needs, but it can be useful to take a label as an input to your modules. If your modules all take a var.label which is used as a prefix to your resource names, it is then easy to implement your naming conventions in your root module which calls multiple modules.

For example:

S3 module

# S3 module
resource "aws_s3_bucket" "my_bucket" {
  bucket = "${var.label}-bucket"
  acl    = "private"
  ...
}

VPC module

resource "aws_vpc" "my_vpc" {
  cidr_block = "172.16.0.0/16"

  tags = {
    Name = "${var.label}-vpc"
  }
}

resource "aws_subnet" "my_subnet" {
  vpc_id            = aws_vpc.my_vpc.id
  cidr_block        = "172.16.10.0/24"
  availability_zone = "us-west-2a"

  tags = {
    Name = "${var.label}-subnet"
  }
}

Root Module

# dev.tfvars
label = "project-app-dev"

# test.tfvars
label = "project-app-test"

# main.tf
module "s3" {
  source = "git::<my-repo>/tf-aws-mod-s3.git"
  label  = var.label
  ...
}

module "vpc" {
  source = "git::<my-repo>/tf-aws-mod-vpc.git"
  label  = var.label
  ...
}

HACK — Update (remote) modules locally

As modules are used to decouple your code, they are distributed in nature. This means that it can be time-consuming to update and test code in multiple places. Think of the following setup: your root module calls the module tf-aws-ec2, which in turn calls the module tf-aws-iam-profile. In this scenario, to make a change to the IAM profile template for your environment, you need to update two remote modules, update their tags, reference the new module tag in your root module, and finally test that the updates work as expected.

A shortcut for testing these changes can be found in the .terraform directory on your local machine. When you run terraform init or terraform get, copies of all the modules in your configuration are pulled into the local .terraform/modules directory. This gives you the ability to update a module (such as tf-aws-iam-profile) locally, and test the changes by running terraform plan on your configuration. Once you're confident in the updates, you can copy those changes to the remote module repos and update the version tags.
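As a rough sketch of that workflow (the exact paths under .terraform/modules depend on the names you gave your module blocks):

```shell
# Pull all module sources into .terraform/modules
terraform init

# Edit the locally cached copy of the nested module directly
vi .terraform/modules/ec2/modules/iam-profile/main.tf

# Test the change against your configuration without touching any remote repo
terraform plan
```

Just remember that a later terraform init (particularly with -upgrade) can overwrite these local edits, so copy anything you want to keep back to the module repo promptly.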


Writing modules is an important part of managing enterprise environments with Terraform. Here I have outlined some tips based on my experiences over the years. I hope it will be useful to others writing their own modules, and I look forward to hearing any tips you have from your own experiences.

About the author:

Shane Mitchell is a Senior AWS DevOps Engineer here at Version 1.
