A Developer’s Guide to Terraform
What is Terraform?
Terraform is an open-source infrastructure as code (IaC) tool that enables you to create, change, and improve infrastructure safely and predictably. With IaC, we can automate the deployment of every service across multiple providers.
Why do we need Terraform?
Terraform lets you use the same workflow to manage multiple providers and handle cross-cloud dependencies. This simplifies management and orchestration for large-scale, multi-cloud infrastructures.
Why do we need a multi or hybrid cloud?
Multi-cloud means combining two or more public cloud providers to deploy our services, while hybrid cloud combines a private cloud with a public cloud. Both are strategies in which an organization leverages two or more cloud computing platforms for different tasks, for example comparing the services each provider offers on some criterion (such as price) and using the best one for each workload.
How to start with Terraform?
Install Terraform on Mac, Linux, or Windows by downloading the binary or using a package manager. Then we start writing code in the HashiCorp Configuration Language (HCL), a configuration language designed for HashiCorp tools, notably Terraform. HCL has since grown into a more general configuration language, and it is visually similar to JSON.
In this blog, I will use AWS as the cloud provider to explain the syntax; the method of writing the code remains the same for any provider.
Terraform has 2805 providers to date; among them, AWS, GCP, Azure, Kubernetes, Alibaba Cloud, and Oracle Cloud are the most widely used.
Once you have installed Terraform, let's learn the main components that will help us get more familiar with the tool.
Providers
Provider plugins like the AWS provider act as a translation layer that allows Terraform to communicate with many different cloud providers, databases, and services.
Terraform uses providers to provision resources, and declaring a provider is the basic first requirement of any IaC code. The following block of code means we want to use AWS to launch our infrastructure. The "~>" constraint allows only the rightmost version component to increment, i.e. any 4.x release at or above 4.16, but not 5.0.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.16"
    }
  }
}
Configure the AWS Provider
The next step is to authenticate to AWS, which we can do with the help of an access key and a secret key.
provider "aws" {
  region     = "us-east-1"
  access_key = "my-access-key"
  secret_key = "my-secret-key"
}
Warning: Hard-coded credentials are not recommended in any Terraform configuration and risk secret leakage should this file ever be committed to a public version control system.
As a best practice, we can use one of Terraform's recommended authentication methods, or we can configure the AWS CLI and reference the configured profile name.
# AWS CLI already configured
provider "aws" {
  region  = "us-east-1"
  profile = "default" # "default" is the default profile name
}
Resources
Resources are a very important element in the Terraform language. Each resource block describes one or more infrastructure objects we desire to configure, such as virtual networks, compute instances, container services, and much more.
Syntax: resource "<resource_type>" "<resource_name>" { ... }
Following is an example that launches an EC2 instance; the resource type used here is aws_instance. We use the provider's argument reference to pass our requirements to the resource.
resource "aws_instance" "foo" {
  ami           = "ami-005e54dee72cc1d00" # us-west-2
  instance_type = "t2.micro"

  tags = {
    Name = "HelloWorld"
  }
}
Argument Reference
Inside each resource block, we set the arguments that the resource supports, giving key-value pairs according to our requirements. This is how Terraform resources are configured.
Search technique: in the provider documentation, search for the resource name in the search box on the left of the screen, open the Resources entry for that resource (e.g. aws_instance), and on the right you will find the Argument Reference and examples.
Provisioners
Provisioners can be used to execute specific actions on the local machine or on a remote machine in order to prepare servers or other infrastructure objects for service.
When we deploy virtual machines or other similar computing resources, we often need to pass in data about other related infrastructure that the software on that server will need to do its job, and this can be done using provisioners.
How to use Provisioners
resource "aws_instance" "web" {
  # ...

  # Establishes connection to be used by all
  # generic remote provisioners (i.e. file/remote-exec)
  connection {
    type     = "ssh"
    user     = "root"
    password = var.root_password
    host     = self.public_ip
  }

  provisioner "remote-exec" {
    inline = [
      "puppet apply",
      "consul join ${aws_instance.web.private_ip}",
    ]
  }
}
remote-exec: This provisioner invokes a script on a remote resource after it is created. This can be used to run a configuration management tool, bootstrap into a cluster, etc.
To invoke a local process instead, we can use the local-exec provisioner. The remote-exec provisioner requires a connection and supports both SSH and WinRM.
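For reference, a minimal local-exec sketch might look like this (the command and file name are illustrative; local-exec runs on the machine executing terraform apply, so no connection block is needed):

```hcl
resource "aws_instance" "web" {
  ami           = "ami-005e54dee72cc1d00"
  instance_type = "t2.micro"

  # Runs locally after the instance is created,
  # recording its private IP in a file
  provisioner "local-exec" {
    command = "echo ${self.private_ip} >> private_ips.txt"
  }
}
```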
Connection Block
We can create one or more connection blocks that can connect and access the remote resource. Connection blocks don’t take a block label and can be nested within either a resource or a provisioner. (Refer to the above code.)
Configuration files can have any name we wish as long as the extension is .tf; Terraform is intelligent enough to load all the .tf files in the directory and work out the order of the code itself.
Variables
We have 3 types:
1. Input Variables
It’s just like user-defined variables. Input variables will let us customize aspects of Terraform modules without altering the module’s own source code. This helps us to share modules across different Terraform configurations, making the module composable and reusable.
Declaring an Input Variable
variable "image_id" {
  type = string
}

variable "availability_zone_names" {
  type    = list(string)
  default = ["us-west-1a"]
}

variable "docker_ports" {
  type = list(object({
    internal = number
    external = number
    protocol = string
  }))
  default = [
    {
      internal = 8300
      external = 8300
      protocol = "tcp"
    }
  ]
}
Here, type means the data type, and default is the value used when no value is supplied.
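Once declared, a variable is referenced elsewhere in the configuration as var.<name>. For example, using the variables declared above (a sketch; the resource name is illustrative):

```hcl
resource "aws_instance" "server" {
  ami               = var.image_id
  instance_type     = "t2.micro"
  availability_zone = var.availability_zone_names[0]
}
```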
2. Attribute references
These refer to information about the infrastructure we are configuring that becomes available only after our code is applied; when we need that information elsewhere in the configuration, we read it through attribute references.
A few examples:
- The ami argument set in the configuration can be used elsewhere with the reference expression aws_instance.example.ami.
- The id attribute exported by this resource type can be read using the same syntax, giving aws_instance.example.id.
Syntax: value = <resource type>.<resource name>.<attribute reference>
output "o1" {
  value = aws_instance.os1.public_ip
}
Search technique: in the provider documentation, search for the resource name in the search box on the left of the screen, open the Resources entry for that resource (e.g. aws_instance), and on the right you will find the Attribute Reference and examples.
3. Filesystem and Workspace Info
We have the following: path.module, path.root, path.cwd, and terraform.workspace.
module "example" {
  # ...

  name_prefix = "app-${terraform.workspace}"
}
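The path symbols work the same way; for example, path.module is commonly used to build file paths relative to the current module. A sketch, assuming a user_data.sh script exists alongside the module's .tf files:

```hcl
resource "aws_instance" "app" {
  ami           = "ami-005e54dee72cc1d00"
  instance_type = "t2.micro"

  # Read a script stored in this module's directory
  user_data = file("${path.module}/user_data.sh")
}
```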
null_resource
This resource implements the standard resource lifecycle but takes no further action. Code that does not belong to any resource would otherwise not re-run on every apply, so it can be placed inside a null_resource.
Provisioner and connection blocks work well inside a null_resource.
Triggers: this is a map of values we can use to trigger the null_resource, i.e., each time any of its values changes, the null_resource is replaced and its provisioners run again.
resource "aws_instance" "cluster" {
  count = 3
  # ...
}

resource "null_resource" "cluster" {
  # Changes to any instance of the cluster requires re-provisioning
  triggers = {
    cluster_instance_ids = join(",", aws_instance.cluster.*.id)
  }

  # Bootstrap script can run on any instance of the cluster
  # So we just choose the first in this case
  connection {
    host = element(aws_instance.cluster.*.public_ip, 0)
  }

  provisioner "remote-exec" {
    # Bootstrap script called with private_ip of each node in the cluster
    inline = [
      "bootstrap-cluster.sh ${join(" ", aws_instance.cluster.*.private_ip)}",
    ]
  }
}
The Meta-Argument
These are some special constructs in Terraform that can be used in modules or resource blocks.
- The depends_on Meta-Argument: This is used to handle hidden resource or module dependencies that Terraform cannot automatically infer.
- The count Meta-Argument: By default, a resource block configures one real infrastructure object; to configure more than one, we can use count.
- The for_each Meta-Argument: If a resource or module block includes a for_each argument whose value is a map or a set of strings, Terraform creates one instance for each member of that map or set.
- The provider Meta-Argument: The provider meta-argument specifies which provider configuration to use for a resource, overriding Terraform's default behavior of selecting one based on the resource type name.
- The lifecycle Meta-Argument: The Resource Behavior page will give us the general lifecycle for resources. Some details of that behavior can be customized using the special nested lifecycle block within a resource block body. We have the following arguments: create_before_destroy, prevent_destroy, ignore_changes, and replace_triggered_by.
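As a quick sketch of count and for_each (the resource names and set members are illustrative):

```hcl
# count: four numbered instances
resource "aws_instance" "server" {
  count         = 4
  ami           = "ami-005e54dee72cc1d00"
  instance_type = "t2.micro"

  tags = {
    Name = "server-${count.index}"
  }
}

# for_each: one IAM user per member of the set
resource "aws_iam_user" "accounts" {
  for_each = toset(["alice", "bob"])
  name     = each.key
}
```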
Failure Behaviour
By default, a provisioner that fails causes the terraform apply itself to fail. The on_failure setting can be used to change this. The possible values are:
continue: Ignore the error and continue applying the code.
fail: Raise the error and stop execution.
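In a configuration, on_failure is set inside the provisioner block, for example (the command is illustrative):

```hcl
resource "aws_instance" "web" {
  # ...

  provisioner "local-exec" {
    command    = "echo The server's IP address is ${self.private_ip}"
    on_failure = continue # apply proceeds even if this command fails
  }
}
```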
Output Values
Output values make information about the infrastructure available on the command line, and can expose information for other Terraform configurations to use. Output values are similar to return values in programming languages.
Declaring an Output Value
output "instance_ip_addr" {
value = aws_instance.server.private_ip
}
State
Terraform keeps complete information about its target infrastructure, and this is called state. State is used by Terraform to map real-world resources to the configuration, keep track of metadata, and improve performance for large infrastructures.
This state is stored by default in a local file named “terraform.tfstate” and this file is called the current state, but it can also be stored remotely, which works better in a team environment.
The code we have written to configure is nothing but the desired state.
Terraform compares the current state with the desired state, decides what changes have to be made, and displays them for us in the plan.
Data Types and Values
Data types: string, number, bool, list, map
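Each of these types can appear in a variable declaration; for instance (names and defaults are illustrative):

```hcl
variable "instance_count" {
  type    = number
  default = 2
}

variable "enable_monitoring" {
  type    = bool
  default = false
}

variable "subnet_ids" {
  type    = list(string)
  default = ["subnet-1a", "subnet-2b"]
}

variable "common_tags" {
  type    = map(string)
  default = {
    Team = "platform"
  }
}
```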
Module Blocks
A module is a container for multiple resources that are used together. It is the best way to reuse code across projects: we write the code once, store it in a folder, and call it from every configuration that needs it.
A module can call other modules, which lets you include the child module’s resources in the configuration in a concise way.
module "servers" {
  source  = "./app-cluster"
  servers = 5
}
Exporting
To consume an output declared inside a module from the calling configuration, we write a new output block with a different syntax.
syntax: value = module.<module_name>.<output_block_name>
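For example, if the app-cluster child module declares an output named instance_ids (an assumed name for illustration), the root configuration that calls the module can re-export it like this:

```hcl
# Inside ./app-cluster (the child module)
output "instance_ids" {
  value = aws_instance.server[*].id
}

# In the root configuration that calls the module
output "cluster_instance_ids" {
  value = module.servers.instance_ids
}
```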
Data Sources
A data source lets Terraform retrieve information defined outside of Terraform, for example from a provider such as AWS; this information is fetched before the code is applied.
We can use this information for further automation: we write a separate block called data and narrow the results by applying filters. Custom condition checks and regular expressions can also be used.
data "aws_ami" "example" {
  most_recent = true
  owners      = ["self"]

  tags = {
    Name   = "app-server"
    Tested = "true"
  }
}
Data Source: information available before deploying resources.
Attribute Reference: information available only after resources are deployed.
Conditional Expressions
It’s just like the ternary operator which uses the value of a boolean expression to select one of two values.
Syntax: condition ? true_val : false_val
Example: var.a != "" ? var.a : "default-a"
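A common use is choosing a value based on an input variable, e.g. (the variable name and instance types are illustrative):

```hcl
resource "aws_instance" "app" {
  ami = "ami-005e54dee72cc1d00"

  # Larger instance in production, smaller one everywhere else
  instance_type = var.environment == "prod" ? "t3.large" : "t2.micro"
}
```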
Built-in Functions
The Terraform language includes a number of built-in functions that can be used within expressions to transform and combine values. The general syntax for function calls is a function name followed by comma-separated arguments in parentheses: max(5, 12, 9) returns the highest number, 12. There are many more functions like this, which can be found in the Terraform documentation.
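A few more examples, as they appear in terraform console:

```
> min(5, 12, 9)
5
> join(", ", ["foo", "bar", "baz"])
"foo, bar, baz"
> length(["a", "b"])
2
```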
Local Values
Local values are just like variables, but they let us assign a name to an expression and define several values within the same locals block.
Declaring local values.
locals {
  service_name = "forum"
  owner        = "Community Team"

  # Common tags assembled from the values above
  common_tags = {
    Service = local.service_name
    Owner   = local.owner
  }
}
Using local values.
resource "aws_instance" "example" {
  # ...

  tags = local.common_tags
}
Variable default values cannot contain function calls, but local values can.
A module's caller can set its input variables, but it cannot change the module's local values.
Element Function
It retrieves a single element from a list. If the index is greater than or equal to the list length, it wraps around, which is why the second example below returns "a". Syntax: element(list, index)
Example:
> element(["a", "b", "c"], 1)
b
> element(["a", "b", "c"], 3)
a
Taint
Sometimes when we re-apply the code, we want certain resources to be rebuilt from scratch; tainting tells Terraform to destroy that resource and create it again on the next apply. (In newer Terraform versions, terraform apply -replace=... is the recommended alternative.)
eg: terraform taint aws_instance.os4
Target
When we re-apply the code but want only specific resources to be deployed, we can target them.
eg: terraform apply -target=aws_instance.os4
Debugging Terraform
Terraform has detailed logs that can be used by setting the TF_LOG environment variable to any value. Enabling this setting causes detailed logs to appear on the output.
TF_LOG can be set to one of the log levels TRACE, DEBUG, INFO, WARN, or ERROR.
for Expressions
A for expression creates a complex type value by transforming another complex type value. It works much like the for loop we have in programming languages.
Syntax: [for s in var.list : upper(s)] (produces a list)
Eg: {for s in var.list : s => upper(s)} (produces a map)
output: {
  foo = "FOO"
  bar = "BAR"
  baz = "BAZ"
}
Dynamic blocks
Dynamic blocks can be used within block constructs like resources, whose arguments normally take values only in the name = expression form. A dynamic block works much like a for expression, but generates one nested block for each element of a complex value.
resource "aws_elastic_beanstalk_environment" "tfenvtest" {
  name                = "tf-test-name"
  application         = "${aws_elastic_beanstalk_application.tftest.name}"
  solution_stack_name = "64bit Amazon Linux 2018.03 v2.11.4 running Go 1.12.6"

  dynamic "setting" {
    for_each = var.settings
    content {
      namespace = setting.value["namespace"]
      name      = setting.value["name"]
      value     = setting.value["value"]
    }
  }
}
State Locking
When a team works on the same code, multiple people may try to apply changes at the same time, and this leads to problems. To avoid this, Terraform has a state locking mechanism: while one apply is running, the same state cannot be modified from a different repo or terminal. Terraform maintains the state in a .tfstate file; locking is the default behavior for backends that support it, and it can be disabled.
Backend Configuration
The backend is where Terraform stores its state file. When working in a team, we make sure the state file lives in shared remote storage.
Eg: S3 stores the state under a given key in an Amazon S3 bucket. S3 alone does not support state locking, so we can use DynamoDB for that.
terraform {
  backend "s3" {
    bucket = "mybucket"
    key    = "path/to/my/key"
    region = "us-east-1"
  }
}
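To get state locking with this backend, the configuration can point at a DynamoDB table via the dynamodb_table argument. A sketch, assuming a pre-created table named terraform-locks with a partition key called LockID:

```hcl
terraform {
  backend "s3" {
    bucket = "mybucket"
    key    = "path/to/my/key"
    region = "us-east-1"

    # Table must already exist with a primary key named LockID
    dynamodb_table = "terraform-locks"
  }
}
```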
Terraform Cloud
Terraform Cloud is a platform provided by HashiCorp where multiple users can work together; version and bug updates are handled automatically. We can create policies with the help of Sentinel and get a price estimate for our deployment. It also provides a graphical interface where setting variables, applying, destroying, and so on can be done in a single click, without running commands.
Splat Expressions
A splat expression expresses a common operation that would otherwise be performed with a for expression.
The special [*] symbol iterates over all elements of the list to its left and accesses, from each one, the attribute named to its right.
Eg: var.list[*].id is equivalent to [for o in var.list : o.id]
alias: Multiple Provider Configurations
We might need to use multiple regions in the same account; we can do this with a provider alias.
# The default provider configuration; resources that begin with `aws_` will use
# it as the default, and it can be referenced as `aws`.
provider "aws" {
  region = "us-east-1"
}

# Additional provider configuration for west coast region; resources can
# reference this as `aws.west`.
provider "aws" {
  alias  = "west"
  region = "us-west-2"
}
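A resource then selects the aliased configuration with the provider meta-argument (the resource name and AMI ID below are illustrative):

```hcl
# Launched in us-west-2 via the aliased provider configuration
resource "aws_instance" "west_server" {
  provider      = aws.west
  ami           = "ami-0c2ab3b8efb09f272" # hypothetical us-west-2 AMI
  instance_type = "t2.micro"
}
```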
zipmap Function
zipmap builds a map from two lists: a list of keys and a list of values.
syntax: zipmap(keyslist, valueslist)
> zipmap(["a", "b"], [1, 2])
{
"a" = 1
"b" = 2
}
time_sleep (Resource)
This resource can be used to insert delays between resources during creation or destruction.
# This resource will destroy (potentially immediately) after null_resource.next
resource "null_resource" "previous" {}

resource "time_sleep" "wait_30_seconds" {
  depends_on = [null_resource.previous]

  create_duration = "30s"
}

# This resource will create (at least) 30 seconds after null_resource.previous
resource "null_resource" "next" {
  depends_on = [time_sleep.wait_30_seconds]
}
Vault Provider
The Vault provider allows Terraform to work with HashiCorp Vault, whose main purpose is security and secrets management.
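As a minimal sketch (the Vault address and secret path are assumptions), the provider is configured and a secret is read through a data source such as vault_generic_secret:

```hcl
provider "vault" {
  # Address of a running Vault server (illustrative)
  address = "https://vault.example.com:8200"
}

# Read a secret stored at secret/db in Vault's KV store
data "vault_generic_secret" "db" {
  path = "secret/db"
}

# Individual fields are then available as
# data.vault_generic_secret.db.data["password"]
```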
Terraform basic commands
1. terraform version
We run this after installing Terraform on top of our base OS to check that it installed properly and to see the version.
2. terraform init
We run this after creating new code; it downloads the required provider plugins.
3. terraform validate
We run this to check the syntax of the code.
4. terraform plan
We run this to get an execution plan, which lets us preview the changes that Terraform plans to make to our infrastructure.
5. terraform apply
We run this to deploy our code. When run without a saved plan file, it creates a new execution plan, prompts you to approve that plan, and then takes the indicated actions.
We can pass the -auto-approve option to instruct Terraform to apply the plan without asking for confirmation.
6. terraform destroy
This command is the best way to destroy all remote objects managed by a particular Terraform configuration.
7. terraform refresh
This command updates the terraform.tfstate file; Terraform-managed objects can change outside Terraform, and this command fetches the fresh information.
8. terraform workspace
We can set the workspace and use it as a parameter inside the code to automate.
Usage: terraform workspace <subcommand> [options] [args]
9. terraform console
This opens an interactive console where we can evaluate and test expressions.
10. terraform fmt
This command rewrites our code into the canonical format and style.
11. terraform taint
Destroys and recreates specific resources on the next apply.
That's the end! Hopefully you now feel more confident starting with Terraform. For any suggestions or queries, please feel free to leave a comment!