Making Terraform Work a Bit Harder
Lots of people use Terraform, but many only scratch the surface of what it can do. I thought I would explore some of Terraform’s capabilities for keeping your code ‘DRY’.
First of all, consider this little piece of Terraform HCL:
resource "aws_security_group" "basic_sg" {
  name        = "my_security_group"
  description = "just for me, nobody else"
  vpc_id      = "vpc-0a1b2c3de45f12345"

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.11.12.13/20"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "my_security_group"
  }
}
Most people who have done any sort of Terraform will have seen something like this: simple, straightforward, just creating a security group for AWS. It works, so what’s the problem? For anything beyond a very small, simple deployment, writing Terraform like this makes your code balloon and very quickly become unmanageable. It doesn’t scale: even just deploying across multiple regions forces this little piece of code to be replicated for each one.
Then if your app needs another port opened up, you add another ingress block, like this for example:
resource "aws_security_group" "basic_sg" {
  name        = "my_security_group"
  description = "just for me, nobody else"
  vpc_id      = "vpc-0a1b2c3de45f12345"

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.11.12.13/20"]
  }

  ingress {
    from_port   = 8001
    to_port     = 8001
    protocol    = "tcp"
    cidr_blocks = ["10.11.12.13/20"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "my_security_group"
  }
}
It will just keep growing.
What if you want to have some apps listening on different ports because you want to run some on the same instance? Just a couple of different permutations is all it takes for the amount of code to grow very quickly.
The first thing to consider is that there is rarely a need to hardcode any IDs. I won’t say never, because there is always an exception, but so far I haven’t had to do it after thousands of lines of HCL.
Data Sources
Data sources are Terraform’s way of doing a ‘lookup’. Consider the vpc_id in the code snippet above. Why not have Terraform find the id of the VPC you need, rather than trawling through the AWS Console, copying it out and then writing it into your code? This idea is just as applicable to Azure or GCP as it is to AWS; I’m just using AWS for my example.
Let’s say you do something sensible like giving your default VPC in each region a name following a convention, and then tag it with its name. That would allow this data source to be used:
data "aws_vpc" "default" {
  filter {
    name   = "tag:Name"
    values = ["default-vpc"]
  }
}

resource "aws_security_group" "basic_sg" {
  name        = "my_security_group"
  description = "just for me, nobody else"
  vpc_id      = data.aws_vpc.default.id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.11.12.13/20"]
  }

  ingress {
    from_port   = 8001
    to_port     = 8001
    protocol    = "tcp"
    cidr_blocks = ["10.11.12.13/20"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "my_security_group"
  }
}
This same block of code will now (almost) work in any region, without needing to look up the VPC IDs manually and put them in the code. The VPC ID might be used in several different scenarios. So rather than copying and pasting that bit of code into all the places you need it, why not turn that data source into a reusable module?
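As a rough sketch of the multi-region point, the same lookup can be pointed at a second region with a provider alias. The alias name, the region and the data source label here are my own assumptions, picked purely for illustration.

```hcl
# Hypothetical second provider configuration; alias and region are assumptions.
provider "aws" {
  alias  = "ireland"
  region = "eu-west-1"
}

# The same tag-based lookup, run against the aliased provider's region.
data "aws_vpc" "default_ireland" {
  provider = aws.ireland

  filter {
    name   = "tag:Name"
    values = ["default-vpc"]
  }
}
```

This only finds a VPC if the naming convention has actually been followed in that region, which is part of the "almost".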
Terraform Modules
Modules are very useful. They do what they say on the tin. There is a good explanation of modules on the Terraform website, but in short, a module can be as simple as a single file, main.tf.
So making a module out of our data source could just be this code in a main.tf file, in its own folder/directory:
data "aws_vpc" "default" {
  filter {
    name   = "tag:Name"
    values = ["default-vpc"]
  }
}

output "my_id" {
  value = data.aws_vpc.default.id
}
The output block is important: it is the mechanism for exporting values to code outside of your module. In this case the output is called ‘my_id’ and its value is the id from the data source. Terraform’s convention is to put output blocks in their own file (outputs.tf), although that isn’t necessary to make it work.
So how do we use our new module?
module "my_vpc" {
  source = "../modules/default_vpc"
}

resource "aws_security_group" "basic_sg" {
  name        = "my_security_group"
  description = "just for me, nobody else"
  vpc_id      = module.my_vpc.my_id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.11.12.13/20"]
  }

  ingress {
    from_port   = 8001
    to_port     = 8001
    protocol    = "tcp"
    cidr_blocks = ["10.11.12.13/20"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "my_security_group"
  }
}
To use a module, create a module block; the bare minimum it needs is the source, which here is the relative path to the folder containing your module’s main.tf. The vpc_id in the security group resource now references the module and the output we defined, ‘my_id’.
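A module can take inputs as well as the source. As a minimal sketch, the VPC name could be passed in rather than fixed inside the module. The variable name vpc_name is my own invention for this example.

```hcl
# --- ../modules/default_vpc/main.tf ---

# Hypothetical input; defaults to the naming convention used above.
variable "vpc_name" {
  type    = string
  default = "default-vpc"
}

data "aws_vpc" "default" {
  filter {
    name   = "tag:Name"
    values = [var.vpc_name]
  }
}

output "my_id" {
  value = data.aws_vpc.default.id
}

# --- root module ---

module "my_vpc" {
  source   = "../modules/default_vpc"
  vpc_name = "default-vpc" # could be omitted, since the module has a default
}
```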
Let’s say, for example, that the security group should only allow ingress from within the VPC, i.e. from the VPC’s cidr_block. That’s easy too. First modify our module to output the CIDR block of the VPC as well as the id:
data "aws_vpc" "default" {
  filter {
    name   = "tag:Name"
    values = ["default-vpc"]
  }
}

output "my_id" {
  value = data.aws_vpc.default.id
}

output "cidr_block" {
  value = data.aws_vpc.default.cidr_block_associations[0].cidr_block
}
The extra output block gets its value from another of the attributes the VPC data source exports. cidr_block_associations is exported as a list; I have assumed in this case that we’re only interested in the cidr_block value from the first element of that list.
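If only the VPC’s primary CIDR is wanted, the aws_vpc data source also exports it directly as cidr_block, so a slightly shorter version of that output might look like this. It is a sketch of an alternative, not a correction to the version above.

```hcl
# Alternative output using the data source's cidr_block attribute,
# which holds the VPC's primary CIDR.
output "cidr_block" {
  value = data.aws_vpc.default.cidr_block
}
```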
Plugging this into the security group code ….
module "my_vpc" {
  source = "../modules/default_vpc"
}

resource "aws_security_group" "basic_sg" {
  name        = "my_security_group"
  description = "just for me, nobody else"
  vpc_id      = module.my_vpc.my_id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [module.my_vpc.cidr_block]
  }

  ingress {
    from_port   = 8001
    to_port     = 8001
    protocol    = "tcp"
    cidr_blocks = [module.my_vpc.cidr_block]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "my_security_group"
  }
}
We’re getting rid of the hardcoded values, but it’s still a bit messy. How could the ingress and egress rules be cleaned up? What if there was a way to dynamically feed in different sets of ingress and egress rules?
Well, there is: dynamic blocks. Consider this next iteration of our security group HCL:
module "my_vpc" {
  source = "../modules/default_vpc"
}

resource "aws_security_group" "basic_sg" {
  name        = "my_security_group"
  description = "just for me, nobody else"
  vpc_id      = module.my_vpc.my_id

  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      from_port   = ingress.value.from
      to_port     = ingress.value.to
      protocol    = ingress.value.protocol
      cidr_blocks = ingress.value.cidr
    }
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "my_security_group"
  }
}
So what’s going on here? Basically there is now a variable called ‘ingress_rules’ (I’ll cover it in a minute) that contains all the information needed to create the ingress rules. The same approach applies to egress, but I have left egress as it was, for simplicity and for comparison. The multiple ingress blocks have been replaced by a single dynamic block, named for the block it replaces. The for_each argument controls how many times the block is repeated and therefore how many rules are created, so if ingress_rules is empty, no rules will be created. The four attributes we set on an ingress block are specified in a content block. Notice that the values assigned to them come not from var.ingress_rules but from ingress.value. This is just the Terraform syntax at work, with ingress.value being a reference to the current element as Terraform iterates over var.ingress_rules.
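If having the iterator share its name with the block reads confusingly, a dynamic block accepts an optional iterator argument to rename it. A small sketch of the same resource, trimmed down, with the iterator called rule (a name I have picked arbitrarily):

```hcl
resource "aws_security_group" "basic_sg" {
  name   = "my_security_group"
  vpc_id = module.my_vpc.my_id

  dynamic "ingress" {
    for_each = var.ingress_rules
    iterator = rule # rule.value now stands in for ingress.value

    content {
      from_port   = rule.value.from
      to_port     = rule.value.to
      protocol    = rule.value.protocol
      cidr_blocks = rule.value.cidr
    }
  }
}
```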
The way I have set up the variable for ingress_rules is like this:
variable "ingress_rules" {
  type = list(object({
    from     = number
    to       = number
    protocol = string
    cidr     = list(string)
  }))
}
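So using the ‘default’ mechanism for variables, I can then give it a value of something like this. One catch: a variable default has to be a constant, so it can’t refer to module.my_vpc.cidr_block, and the CIDR is written out literally here:
variable "ingress_rules" {
  type = list(object({
    from     = number
    to       = number
    protocol = string
    cidr     = list(string)
  }))

  # Defaults must be literal values; a module output is not allowed here.
  default = [
    {
      from     = 443
      to       = 443
      protocol = "tcp"
      cidr     = ["10.11.12.13/20"]
    },
    {
      from     = 8001
      to       = 8001
      protocol = "tcp"
      cidr     = ["10.11.12.13/20"]
    }
  ]
}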
This value will create the two ingress rules from earlier because there are two objects in the list. OK, so let’s expand on this and turn the security group block into a module as well.
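Variable defaults have to be constants, so they cannot pick up the CIDR from the VPC module. One way to keep the looked-up value, sketched here under the assumption that the rules live in the same root module as the module "my_vpc" block, is a local value. The dynamic block would then iterate over local.ingress_rules instead of var.ingress_rules.

```hcl
locals {
  # Build one rule per port; every rule allows traffic from the VPC's CIDR.
  ingress_rules = [
    for port in [443, 8001] : {
      from     = port
      to       = port
      protocol = "tcp"
      cidr     = [module.my_vpc.cidr_block]
    }
  ]
}
```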
Turning the Security Group resource into its own Module
To do this, all of the hardcoded values need to be replaced by variables, and the egress rules get a dynamic block as well. I have also removed the VPC data source module from the code; the VPC id can be passed in instead.
resource "aws_security_group" "basic_sg" {
  name        = var.name
  description = var.description
  vpc_id      = var.vpc_id

  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      from_port   = ingress.value.from
      to_port     = ingress.value.to
      protocol    = ingress.value.protocol
      cidr_blocks = ingress.value.cidr
    }
  }

  dynamic "egress" {
    for_each = var.egress_rules
    content {
      from_port   = egress.value.from
      to_port     = egress.value.to
      protocol    = egress.value.protocol
      cidr_blocks = egress.value.cidr
    }
  }

  tags = {
    Name = var.name
  }
}

output "id" {
  value = aws_security_group.basic_sg.id
}
That would be the contents of the module’s main.tf. To go with it, a variables.tf in the same module directory could contain the following:
variable "vpc_id" {
  type = string
}

variable "description" {
  type = string
}

variable "name" {
  type = string
}

variable "ingress_rules" {
  type = list(object({
    from     = number
    to       = number
    protocol = string
    cidr     = list(string)
  }))
}

variable "egress_rules" {
  type = list(object({
    from     = number
    to       = number
    protocol = string
    cidr     = list(string)
  }))
}
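Since the module now takes structured input, it may be worth guarding it. The following is a sketch of the ingress_rules declaration with an empty default and a validation block. It assumes a reasonably recent Terraform release, as older ones have neither custom validation nor the alltrue function, and the rule being checked is simply one I chose as an example.

```hcl
variable "ingress_rules" {
  type = list(object({
    from     = number
    to       = number
    protocol = string
    cidr     = list(string)
  }))

  # Callers that need no ingress rules can leave the argument out entirely.
  default = []

  # Reject rules whose port range is back to front.
  validation {
    condition     = alltrue([for r in var.ingress_rules : r.from <= r.to])
    error_message = "Each rule's from port must not be greater than its to port."
  }
}
```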
Then a higher level root module can use it by doing something like this:
module "my_vpc" {
  source = "../modules/default_vpc"
}

module "security_group" {
  source        = "../modules/security_group"
  vpc_id        = module.my_vpc.my_id
  description   = "a basic security group"
  name          = "my_sg"
  ingress_rules = var.ingress_rules
  egress_rules  = []
}
So now, let’s say we have a load of security groups to create. Rather than copying and pasting that block of code for each one, what about something like this:
module "my_vpc" {
  source = "../modules/default_vpc"
}

module "security_group" {
  # count on a module block needs Terraform 0.13 or later
  count         = length(local.security_groups)
  source        = "../modules/security_group"
  vpc_id        = module.my_vpc.my_id
  description   = local.security_groups[count.index].description
  name          = local.security_groups[count.index].name
  ingress_rules = local.security_groups[count.index].ingress_rules
  egress_rules  = local.security_groups[count.index].egress_rules
}
Then everything is just data. A variable default can’t reference a module output, so I have used a locals block in the root module here, maybe something like this:
locals {
  security_groups = [
    {
      description  = "first sg"
      name         = "sg1"
      egress_rules = []
      ingress_rules = [
        {
          from     = 443
          to       = 443
          protocol = "tcp"
          cidr     = [module.my_vpc.cidr_block]
        },
        {
          from     = 8001
          to       = 8001
          protocol = "tcp"
          cidr     = [module.my_vpc.cidr_block]
        }
      ]
    },
    {
      description   = "egress sg"
      name          = "sg2"
      ingress_rules = []
      egress_rules = [
        {
          from     = 0
          to       = 0
          protocol = "-1"
          cidr     = ["0.0.0.0/0"]
        }
      ]
    }
  ]
}
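One thing to be aware of with count is that the instances are addressed by position, so removing an entry from the middle of the list shifts everything after it. A sketch of the same module call keyed by security group name with for_each instead (this also needs Terraform 0.13 or later for modules). It assumes the list is available as local.security_groups; swap in var.security_groups if yours is declared as a variable.

```hcl
module "security_group" {
  # Turn the list into a map keyed by name, e.g. "sg1" and "sg2".
  for_each = { for sg in local.security_groups : sg.name => sg }

  source        = "../modules/security_group"
  vpc_id        = module.my_vpc.my_id
  name          = each.key
  description   = each.value.description
  ingress_rules = each.value.ingress_rules
  egress_rules  = each.value.egress_rules
}
```

An individual group’s id is then referenced as module.security_group["sg1"].id rather than by index.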
This is still a bit messy. One option here is to split the ingress and egress data sets out into maps. There are numerous ways of doing this; one example might look something like this, where the ports are separated from the protocol and the rule blocks are also separate. These are locals rather than variables because they refer to each other and to the module output, which variable defaults can’t do.
locals {
  ports = {
    https = 443
    app1  = 8001
    open  = 0
  }

  ingress = {
    https = {
      from     = local.ports["https"]
      to       = local.ports["https"]
      protocol = "tcp"
      cidr     = [module.my_vpc.cidr_block]
    }
    app1 = {
      from     = local.ports["app1"]
      to       = local.ports["app1"]
      protocol = "tcp"
      cidr     = [module.my_vpc.cidr_block]
    }
  }

  # Referenced from the module block as local.security_groups
  security_groups = [
    {
      description   = "first sg"
      name          = "sg1"
      egress_rules  = []
      ingress_rules = [local.ingress["https"], local.ingress["app1"]]
    },
    {
      description   = "egress sg"
      name          = "sg2"
      ingress_rules = []
      egress_rules = [
        {
          from     = local.ports["open"]
          to       = local.ports["open"]
          protocol = "-1"
          cidr     = ["0.0.0.0/0"]
        }
      ]
    }
  ]
}
Now if you wanted to change the port app1 uses, you just need to change it in one place. There are a few other tricks that Terraform has up its sleeve that I haven’t covered here: ternary expressions are another really useful mechanism for controlling logic and flow, and there are numerous functions for manipulating data.
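To give a flavour of those, here is a tiny sketch combining a ternary expression with the built-in concat function. The variable and local names are hypothetical, made up just for this example.

```hcl
# Hypothetical switch for whether app1's port should be opened at all.
variable "enable_app1" {
  type    = bool
  default = true
}

locals {
  base_ports = [443]

  # Ternary: one extra port when the switch is on, none when it is off.
  extra_ports = var.enable_app1 ? [8001] : []

  # concat joins the two lists into the final set of ports to open.
  all_ports = concat(local.base_ports, local.extra_ports)
}
```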
So there is no excuse for churning out Terraform with everything hardcoded.