Mastering Terraform: A Comprehensive Handbook for Infrastructure as Code

Exploring the fundamental concepts of Terraform: defining providers, declaring resources, managing dependencies, best practices, and more.

Ulises Magana
Cloud Native Daily
17 min read · Jun 2, 2023


Terraform has become a popular infrastructure as code (IaC) tool over the last couple of years, especially in the DevOps, SRE, and Cloud Engineering fields. It automates the provisioning and updating of cloud and on-premises environments in place of manual configuration, bringing consistency and scalability to managing modern infrastructure and deploying applications efficiently. Whether you are a beginner stepping into the world of infrastructure automation or an experienced professional looking to solidify your Terraform skills, this handbook is your ultimate guide.

Introduction to Infrastructure as Code (IaC)

What is IaC and its benefits?

Infrastructure as Code is the practice of defining and provisioning infrastructure through machine-readable configuration files rather than manual processes. Terraform's configuration language is declarative: you describe the desired end state of your infrastructure, and the tool determines the steps needed to reach it. This allows you to automate and orchestrate your infrastructure resources to achieve greater efficiency, scalability, and consistency in your deployments. Some of its benefits are the following:

  • Agility and Flexibility
  • Automation
  • Collaboration
  • Consistency and standardization
  • Reproducibility
  • Scalability
  • Version Control and Auditing

Why use Terraform as an IaC tool?

Terraform is a leading choice for infrastructure automation and management, mainly because its features and capabilities make it a powerful tool for provisioning across different cloud providers, on-premises environments, and hybrid setups. These are some of the key reasons to work with Terraform:

  • Declarative Syntax and Infrastructure Graph
  • Multi-Cloud and Multi-Platform Support
  • Plan and Preview Changes
  • Resource abstraction and Reusability
  • State Management

Getting Started with Terraform

Install Terraform

# Update the package lists
sudo apt update

# Upgrade installed packages to their latest versions
sudo apt upgrade

# Download the GPG key from HashiCorp's website and save it in the keyring
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg

# Add the HashiCorp repository to the package manager's sources.list.d directory
# This command uses the signed-by flag to ensure the package is verified using the downloaded GPG key
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list

# Update the package lists again to include the HashiCorp repository
# Install Terraform from the HashiCorp repository
sudo apt update && sudo apt install terraform
  • Verify the installation by opening a terminal, typing terraform, and pressing Enter; you should see Terraform's usage message and list of subcommands.

Set up Terraform

Set up the provider credentials. In this case we’re going to work with AWS.

  • Log in to AWS and create a new IAM user as shown below or use an existing one that you want to use for Terraform.
User details
  • To manage resources with your Terraform configurations, assign appropriate permissions to your user. While this article grants Administrator access for simplicity, it’s strongly advised to grant permissions only for the specific resources you need. This mitigates the risk of unauthorized access or unintended actions.
Set user permissions
  • If you created a user, download the CSV file with the Access Key ID and the secret access key since these credentials will be used to programmatically access AWS.
Download user’s credentials
aws configure
#Enter your AWS_ACCESS_KEY_ID
#Enter your AWS_SECRET_ACCESS_KEY
#Enter your AWS_DEFAULT_REGION
  • Install the following VS Code extensions, which provide Terraform syntax highlighting and snippets for the most popular resource types of different cloud infrastructure providers:
HashiCorp Terraform extension
Terraform doc snippets extension
  • Configure the Terraform AWS provider by creating a new directory for your projects. Then create the following two files: main.tf and provider.tf.
New Terraform directory
  • To connect Terraform to a specific cloud provider, it is important to know that Terraform has a public registry which contains a lot of providers and modules. In this case, we’re going to use the AWS provider.
  • Note: A provider is a plugin that allows Terraform to interact with a certain cloud platform, service, or component. It translates the infrastructure resources and configurations coded in Terraform into API calls and actions understood by the target platform.
  • In the provider.tf file write the code below.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

# Configure the AWS Provider
provider "aws" {
  region = "us-east-1"
}
  • Then create a VPC in the main.tf file.
# Create a VPC
resource "aws_vpc" "name" {
  cidr_block = "172.16.0.0/16"
}
  • Test and initialize the configuration by running terraform init in your terminal.
Installing the AWS provider with the ‘terraform init’ command
  • Execute terraform plan to check the configuration and ensure the AWS provider credentials are correctly set up. If everything is configured correctly, you should see what is shown below:
Terraform plan detailing the changes that will be made to your infrastructure

Understanding the Terraform command-line interface (CLI)

The Terraform CLI enables users to manage infrastructure as code. It provides a powerful set of commands and options to facilitate the creation, modification, and destruction of infrastructure resources. Some key points to know about this are the following:

  • Initialize a Terraform configuration: As shown above, you must first initialize the configuration directory using the terraform init command. This command downloads the required provider plugins, prepares the directory for Terraform operations, and sets up the backend configuration.
  • Create and modify infrastructure: You can modify any resource, such as virtual machines, networks, storage, etc. There are two essential commands, terraform plan and terraform apply to generate an execution plan and apply changes to the infrastructure, respectively.
  • Manage state: A state file keeps track of the resources Terraform manages and their current state. The CLI creates and updates this file automatically as commands execute. This file must be handled and stored securely, especially in collaborative environments.
  • Additional CLI commands: Other commands appear regularly in the Terraform workflow: terraform validate checks configuration files, terraform refresh syncs the state with real infrastructure, and terraform destroy tears down your resources. The CLI also supports variables, modules, and remote backends to customize and scale infrastructure deployments.
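Put together, a typical iteration through these commands looks something like the following sketch (the saved plan file name is illustrative):

```shell
terraform init               # download provider plugins, set up the backend
terraform fmt                # normalize formatting of .tf files
terraform validate           # catch syntax and basic configuration errors
terraform plan -out=tfplan   # preview changes and save the execution plan
terraform apply tfplan       # apply exactly the plan that was reviewed
terraform destroy            # tear down the managed resources when finished
```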

Terraform Configuration Language (HCL)

Syntax and structure of HCL

There are two languages you can use when defining and managing your infrastructure with Terraform: HCL and JSON. Both provide a structured, declarative way to describe the desired state of your infrastructure. However, HCL is the recommended one, since it offers a human-readable, expressive syntax and is the more commonly used of the two. The structure of HCL consists of blocks, arguments, and values, described below.
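As a point of comparison, a minimal resource written in Terraform's JSON syntax (a main.tf.json file) looks like this; HCL expresses the identical structure far more readably (the values are illustrative):

```json
{
  "resource": {
    "aws_instance": {
      "example": {
        "ami": "ami-0123456789abcdef0",
        "instance_type": "t2.micro"
      }
    }
  }
}
```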

Blocks: The configuration files are organized into blocks and they represent different aspects of your infrastructure.

# Examples of block types: 'resource', 'provider', 'variable', 'data', 'output'
# Block name: an optional label that identifies the block.

block_type "block_name" {
  # Configuration settings: arguments and corresponding values specific to the block type.
  # They define the properties, attributes, and behavior of the resource.
}
# Real AWS resource configuration using HCL

resource "aws_instance" "example" {
  ami           = "ami-0123456789abcdef0"
  instance_type = "t2.micro"
  subnet_id     = aws_subnet.example.id

  tags = {
    Name        = "ExampleInstance"
    Environment = "Production"
  }
}

Arguments: They appear inside blocks and use the key = value syntax. They define the properties and settings of the resource or block.

block_type "block_name" {
  argument1 = value1
  argument2 = value2
  # ...
}

Values: They are the data assigned to the arguments within a block. Values can be strings, numbers, lists, maps, or other data structures supported by HCL.

Variables, data types, and expressions in HCL

Variables: They allow you to parameterize your configurations to make them more flexible and reusable. Values are passed into your Terraform modules or configurations, and variables are usually declared in a file named variables.tf. In the example below, the variable region has the default value us-west-2; var.region is then used to access the variable's value from another configuration block. (Note that region is set on the provider block: aws_instance has no region argument of its own.)

# variables.tf file

variable "region" {
  description = "AWS region to deploy into"
  default     = "us-west-2"
}

# main.tf file

provider "aws" {
  region = var.region
}

resource "aws_instance" "example" {
  ami           = "ami-0123456789abcdef0"
  instance_type = "t2.micro"
}

Data types: Terraform's HCL supports several data types. The example below uses three: a list of strings, a map of strings, and a string. The variable usage in the subnet_id attribute is worth noting: it selects the first subnet ID from the subnet_ids variable by indexing (var.subnet_ids[0]).

variable "subnet_ids" {
  description = "List of subnet IDs"
  type        = list(string)
  default     = ["subnet-12345678", "subnet-abcdefgh"]
}

variable "tags" {
  description = "Map of tags"
  type        = map(string)
  default = {
    Name        = "ExampleInstance"
    Environment = "Production"
  }
}

resource "aws_instance" "example" {
  ami           = "ami-0123456789abcdef0"
  instance_type = "t2.micro"
  subnet_id     = var.subnet_ids[0]
  tags          = var.tags
}

Another example, using additional data types such as numbers and objects, is shown below.

variable "instance_count" {
  description = "Number of instances to launch"
  type        = number
  default     = 2
}

variable "security_group" {
  description = "Security group configuration"
  type = object({
    name        = string
    description = string
    ingress = list(object({
      from_port   = number
      to_port     = number
      protocol    = string
      cidr_blocks = list(string)
    }))
  })
  default = {
    name        = "example-security-group"
    description = "Example security group"
    ingress = [
      {
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_blocks = ["0.0.0.0/0"]
      },
      {
        from_port   = 22
        to_port     = 22
        protocol    = "tcp"
        cidr_blocks = ["10.0.0.0/16"]
      }
    ]
  }
}

resource "aws_instance" "example" {
  ami                    = "ami-0123456789abcdef0"
  instance_type          = "t2.micro"
  count                  = var.instance_count
  vpc_security_group_ids = [aws_security_group.example.id]
}

resource "aws_security_group" "example" {
  name        = var.security_group.name
  description = var.security_group.description

  ingress {
    from_port   = var.security_group.ingress[0].from_port
    to_port     = var.security_group.ingress[0].to_port
    protocol    = var.security_group.ingress[0].protocol
    cidr_blocks = var.security_group.ingress[0].cidr_blocks
  }

  ingress {
    from_port   = var.security_group.ingress[1].from_port
    to_port     = var.security_group.ingress[1].to_port
    protocol    = var.security_group.ingress[1].protocol
    cidr_blocks = var.security_group.ingress[1].cidr_blocks
  }
}

Finally, let's see an example of the boolean data type in HCL. The create_instance variable determines whether an instance should be created. With the conditional expression (? 1 : 0), a value of true sets count = 1 and creates a single instance; otherwise count = 0 and no instance is created. One scenario where you might use a boolean is in testing and development, when you want to enable or disable the creation of specific resources based on your testing needs; this can be done by passing a value during Terraform command execution, as shown later in this article.

variable "create_instance" {
  description = "Whether to create an instance"
  type        = bool
  default     = true
}

resource "aws_instance" "example" {
  count         = var.create_instance ? 1 : 0
  ami           = "ami-0123456789abcdef0"
  instance_type = "t2.micro"
}

Building and Managing Infrastructure

Defining multiple providers

Today's complex IT landscape often involves managing multiple infrastructure platforms, cloud providers, and regions. Terraform supports multiple providers within a single configuration, so you can seamlessly manage resources across different platforms, such as AWS, Azure, GCP, and others. This offers great flexibility if you want to consolidate your infrastructure management under a unified approach.

On the other hand, the use of multiple providers empowers organizations with hybrid or multi-cloud deployments. By embracing this approach, you unlock the full potential of Terraform and gain control over your diverse infrastructure landscape. An example of how to integrate AWS, Azure, and GCP in a Terraform configuration file looks like the following:

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "3.58.0"
    }
    aws = {
      source  = "hashicorp/aws"
      version = "5.0.1"
    }
    google = {
      source  = "hashicorp/google"
      version = "4.67.0"
    }
  }
}

provider "azurerm" {
  # Azure-specific configuration options
}

provider "aws" {
  # AWS-specific configuration options
}

provider "google" {
  # GCP-specific configuration options
}

# Define resources and other Terraform configurations

When defining multiple providers, aliases are useful for managing resources across different providers or regions within a single configuration. Another important concept here is provider overrides, which allow you to customize specific settings within a provider configuration, bringing flexibility and adaptability to your infrastructure management.

Provider aliases: In the example below the two provider blocks have their respective aliases. The aliases are used in the resource blocks to specify the desired provider.

provider "aws" {
  alias  = "us_east"
  region = "us-east-1"
}

provider "azurerm" {
  alias    = "eastus"
  features {}
}

resource "aws_instance" "example" {
  provider = aws.us_east
  # Resource configuration
}

resource "azurerm_virtual_machine" "example" {
  provider = azurerm.eastus
  # Resource configuration
}

Provider overrides: In this second example, in the command line the -var flag is used to override the value of aws_region which might be defined in a variables file and provided in the Terraform configuration. Let’s say that aws_region had the value of us-east-1 and you overrode it with us-west-2. This can also be done at other levels such as in the configuration file or by using environment variables.

provider "aws" {
  alias      = "us_east"
  region     = var.aws_region
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
}

provider "azurerm" {
  alias           = "eastus"
  features {}
  subscription_id = var.azure_subscription_id
  client_id       = var.azure_client_id
  client_secret   = var.azure_client_secret
  tenant_id       = var.azure_tenant_id
}

terraform apply -var 'aws_region=us-west-2'

Managing resource dependencies and order of execution

In this subsection we’ll explore the fundamental concepts and techniques for effectively managing resource dependencies. By understanding this, you can ensure the correct provisioning order, maintain consistency, and avoid potential errors in your infrastructure deployments.

Resource Dependencies (Implicit): They establish relationships between different resources within your infrastructure. The idea is that the creation or configuration of one resource relies on the availability or completion of another. These dependencies ensure that Terraform provisions resources in the correct order, avoiding potential conflicts and maintaining the desired state of your infrastructure.

  • An example of a resource dependency is provisioning an EC2 instance that relies on a security group being created first. In the following Terraform code, the dependency is expressed by referencing the security group within the EC2 instance resource declaration (via vpc_security_group_ids, the attribute aws_instance actually accepts).
resource "aws_security_group" "example" {
  name        = "example"
  description = "Example security group"
  # Security group configuration
}

resource "aws_instance" "example" {
  ami           = "ami-0c94855ba95c71c99"
  instance_type = "t2.micro"

  vpc_security_group_ids = [aws_security_group.example.id]
  # Other instance configuration
}

Explicit Dependencies with depends_on: They are necessary in certain scenarios where implicit dependencies may not be sufficient or accurate. They can be used in cases such as when you’re using conditional dependencies, breaking cyclic dependencies, or enforcing a specific order of resource creation.

  • An example is provisioning an EC2 instance and an S3 bucket, where the instance needs to access the bucket during its startup process. To ensure the instance is provisioned only after the S3 bucket is created, use an explicit dependency as shown below.
resource "aws_s3_bucket" "my_bucket" {
  # S3 bucket configuration
}

resource "aws_instance" "my_instance" {
  # EC2 instance configuration

  depends_on = [
    aws_s3_bucket.my_bucket
  ]
}

Inter-Resource Communication: It allows resources to exchange information during provisioning through attributes or outputs, enabling proper coordination and configuration of the infrastructure.

  • A scenario where this happens is when an EC2 instance needs to be launched within a VPC and requires the VPC's ID as an input. In the code below, the VPC resource exposes its ID both as an attribute (aws_vpc.my_vpc.id) and as an output; referencing that attribute obtains the ID dynamically and establishes the necessary dependency on the VPC during provisioning. (An instance is placed in a VPC via a subnet: aws_instance takes subnet_id, not a VPC ID directly.)
resource "aws_vpc" "my_vpc" {
  # VPC configuration
}

output "vpc_id" {
  value = aws_vpc.my_vpc.id
}

resource "aws_subnet" "my_subnet" {
  vpc_id = aws_vpc.my_vpc.id
  # Subnet configuration
}

resource "aws_instance" "my_instance" {
  subnet_id = aws_subnet.my_subnet.id
  # EC2 instance configuration
}

Using Modules for Dependency Management: Modules allow you to organize and encapsulate groups of resources to facilitate dependency management. With them you can abstract infrastructure components into reusable units that can be provisioned together, promoting modularity, reusability and simplicity on complex infrastructure configurations.

  • Instead of defining several resources directly in your main configuration file, you would define the module blocks and specify their sources as shown below.
module "vpc" {
  source = "./modules/vpc"
}

module "subnets" {
  source = "./modules/subnets"
}

module "security_groups" {
  source = "./modules/security_groups"
}

module "ec2_instances" {
  source = "./modules/ec2_instances"
}
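Dependencies between modules are expressed the same way as between resources: by wiring one module's outputs into another module's inputs. A sketch, assuming each module declares the referenced variables and outputs:

```hcl
module "vpc" {
  source = "./modules/vpc"
}

module "subnets" {
  source = "./modules/subnets"
  vpc_id = module.vpc.vpc_id              # assumes the vpc module outputs "vpc_id"
}

module "ec2_instances" {
  source     = "./modules/ec2_instances"
  subnet_ids = module.subnets.subnet_ids  # assumes the subnets module outputs "subnet_ids"
}
```

Because ec2_instances references subnets, and subnets references vpc, Terraform provisions the three modules in that order automatically.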

Order of Execution: The order of resource provisioning is based on dependencies. Terraform analyzes them, evaluates the dependency graph, and ensures that resources are provisioned in the correct sequence. Understanding this allows for smooth deployments while minimizing the chances of errors or conflicts due to missing dependencies.

  • An example of this is a configuration that includes a VPC, subnets, and EC2 instances. Terraform will make sure that the VPC is created first, then the subnets, and at the end the EC2 instances to make sure the infrastructure is provisioned correctly, as each resource depends on the successful creation of its dependencies.
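That chain can be sketched in HCL; each attribute reference tells Terraform what must exist first (the CIDR ranges and AMI ID are illustrative):

```hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "main" {
  vpc_id     = aws_vpc.main.id         # the subnet waits for the VPC
  cidr_block = "10.0.1.0/24"
}

resource "aws_instance" "app" {
  ami           = "ami-0c94855ba95c71c99"
  instance_type = "t2.micro"
  subnet_id     = aws_subnet.main.id   # the instance waits for the subnet
}
```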

Terraform State

Terraform state is a representation of the resources defined in your configuration and their current state as tracked by Terraform. It is a crucial aspect of managing infrastructure, serving as the source of truth that allows Terraform to understand the existing infrastructure and make the changes necessary to achieve the desired state.

Understanding Terraform state and its purpose

Its purpose is to track and manage the lifecycle of resources and to allow Terraform to plan and apply changes incrementally. The state file records the metadata and attributes of resources, which enables Terraform to understand the current state of the infrastructure and determine the actions required to achieve the desired state.

{
  "version": 4,
  "terraform_version": "1.2.0",
  "serial": 2,
  "lineage": "abcd1234-5678-efgh-ijkl-1234567890ab",
  "outputs": {
    "example_output": {
      "value": "Hello, Terraform!",
      "type": "string"
    }
  },
  "resources": [
    {
      "mode": "managed",
      "type": "aws_instance",
      "name": "example_instance",
      "provider": "provider.aws",
      "instances": [
        {
          "schema_version": 1,
          "attributes": {
            "id": "i-0123456789abcdef0",
            "name": "example-instance",
            "instance_type": "t2.micro",
            "subnet_id": "subnet-0123456789abcdef",
            "security_group_ids": [
              "sg-0123456789abcdef"
            ]
          },
          "private": "false"
        }
      ]
    }
  ]
}

Managing and storing the state file

Proper management and storage of the state file ensure the integrity and consistency of your infrastructure. You have a few options when managing the state file:

  • Local State: By default, the state file is stored locally in the same directory as your Terraform configuration files. Ensure the state file is not accidentally deleted or overwritten, since it is crucial for managing and updating your infrastructure.
  • Remote State: It provides benefits over local state, enabling enhanced collaboration, concurrent access, and centralized management. You can use various remote state backends, such as AWS S3 or Azure Blob Storage, to store the state file securely and access it from multiple environments and team members. Below, a remote backend stores the state file in an S3 bucket.
terraform {
  backend "s3" {
    bucket         = "example-bucket"
    key            = "terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock"
  }
}
  • State Locking: It is a mechanism to prevent concurrent modifications to the state file by multiple users or processes when executing Terraform commands, such as in a team environment or in automated CI/CD pipelines. Only one user or process can make changes at a time to the state file, preventing any potential conflict and data corruption. When state locking is enabled, the lock is stored in a lock table, such as in DynamoDB. Any attempts to modify the state file will be blocked until the lock is released.
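The DynamoDB table referenced by an S3 backend must exist before terraform init is run against it. One way to create it, as a sketch (Terraform's S3 backend requires the table's hash key to be named LockID):

```hcl
resource "aws_dynamodb_table" "terraform_state_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"   # on-demand capacity; no throughput to manage
  hash_key     = "LockID"            # the key name the S3 backend expects

  attribute {
    name = "LockID"
    type = "S"
  }
}
```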

Provisioning Infrastructure

Using provisioners for executing scripts during resource creation

When provisioning infrastructure, you may need to perform additional configuration tasks during resource creation. This can be achieved through Terraform provisioners, which allow you to run scripts or commands on the newly created resources. They are useful for tasks such as initializing databases or configuring software.

resource "aws_instance" "example" {
  ami           = "ami-0c94855ba95c71c99"
  instance_type = "t2.micro"

  provisioner "local-exec" {
    # local-exec takes a single 'command', run on the machine running Terraform
    command = "echo 'Hello, World!' > /tmp/hello.txt"
  }
}

Leveraging Terraform provisioners for configuration management

Terraform provisioners can also be used for configuration management, since there are situations in which you may need to install software, configure services, or perform any other necessary action to ensure that your infrastructure is properly configured.

resource "aws_instance" "example" {
  ami           = "ami-0c94855ba95c71c99"
  instance_type = "t2.micro"

  # remote-exec runs commands on the new instance via its 'inline' list;
  # it also requires a 'connection' block (e.g. SSH) to reach the machine
  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y nginx",
      # Perform additional configuration tasks here
    ]
  }
}

Provisioners can also be integrated with popular configuration management tools like Ansible, Chef, and Puppet. With them you can automate the setup, configuration, and management of your resources in a consistent and scalable manner.
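As a sketch of that integration, a local-exec provisioner can hand a new instance off to Ansible once it exists (the playbook path and SSH user here are hypothetical):

```hcl
resource "aws_instance" "web" {
  ami           = "ami-0c94855ba95c71c99"
  instance_type = "t2.micro"

  provisioner "local-exec" {
    # Run a hypothetical playbook against the instance's public IP
    command = "ansible-playbook -i '${self.public_ip},' -u ubuntu playbooks/nginx.yml"
  }
}
```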

Terraform Best Practices and Tips

Following best practices for Terraform code organization and structure

To keep your Terraform code readable and maintainable, it's important to follow best practices for code organization and structure. Let's explore some of them:

  • Modularization: Make your code reusable by splitting it into modules (for example, for networking, compute instances, and databases) instead of defining all your resources in a single configuration file.
  • Naming conventions: Use consistent and descriptive names for resources, variables, and modules. This improves readability and helps in understanding the purpose and function of each component.
  • Logical grouping: Group related resources together within modules or configuration files. This enhances code organization and makes it easier to navigate and manage.
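A project layout that follows these practices might look like this (the directory and file names are illustrative, not prescribed by Terraform):

```
project/
├── main.tf          # root configuration wiring the modules together
├── variables.tf     # input variables
├── outputs.tf       # exported values
├── provider.tf      # provider and backend configuration
└── modules/
    ├── networking/
    ├── compute/
    └── database/
```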

Version control and collaboration with Terraform

By using a version control system you can track changes, collaborate with team members, and revert to previous versions if needed. Here are some recommendations:

  • Git repository: Set up a Git repository to store your Terraform code to maintain a history of your infrastructure configurations.
  • Branching and merging: Utilize branching and merging strategies to manage concurrent changes and prevent conflicts. Create separate branches for different features or environments, and merge them back into the main branch when ready.
  • Collaboration tools: Leverage collaboration platforms like Terraform Cloud, GitHub, or GitLab to facilitate teamwork, code review, and automated workflows. These tools provide features such as pull requests, code reviews, and integration with CI/CD pipelines.

Security considerations for Terraform deployments

Finally, it’s important to consider security best practices. Take into consideration the following ones:

  • Sensitive data protection: Avoid hardcoding sensitive information; use environment variables, secure storage systems like HashiCorp Vault, or secrets management tools to store and retrieve data securely.
  • Principle of least privilege: Ensure that the credentials used by Terraform have only the necessary permissions required to provision and manage resources. By following this, you limit any potential damage in case of a security breach.
  • Secure communication: Ensure secure communication between Terraform and the cloud provider API by using encrypted connections like HTTPS.
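Within the configuration itself, Terraform can at least keep secrets out of its display: marking a variable as sensitive redacts its value from plan and apply output. A sketch (the variable name is illustrative; the value should come from the environment, e.g. a TF_VAR_db_password environment variable, rather than a hardcoded default):

```hcl
variable "db_password" {
  description = "Database admin password"
  type        = string
  sensitive   = true   # value is redacted in plan/apply output
}
```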

Summary

Overall, in this Terraform Handbook we have covered the fundamental concepts, the importance of defining providers, declaring resources, managing dependencies, and best practices. We also explored the significance of Terraform state, as well as the use of modules for dependency management and code reusability.

By following what’s mentioned in this article and incorporating the tips shared throughout this handbook, you can effectively use Terraform to provision, manage, and scale your infrastructure with ease, while maintaining consistency, reusability, and security.

Do stay up-to-date with the latest features and updates of this tool to unlock the full potential of infrastructure automation.

Happy Terraforming!


Ulises Magana

Cloud & Infrastructure Engineer with diverse experience in software development, database administration, SRE & DevOps.