Charles Black II
13 min read · Apr 16, 2023

Building a Resilient Infrastructure in AWS: An Introduction to High-Availability with Terraform

In this article, we are continuing our series on Terraform. If you haven’t seen the introduction to Terraform, check it out here: Automating the Cloud. We will go a step further and discuss how Terraform can leverage AWS’s resources to build a fault-tolerant and highly available infrastructure using Infrastructure as Code (IaC). First, let’s cover the prerequisites.

Prerequisites:

  • AWS account
  • AWS Cloud9 IDE
  • Basic Terraform knowledge
  • Basic Knowledge of Linux/CLI

Resources:

Terraform is an open-source IaC tool created by HashiCorp that allows users to define and manage infrastructure resources in a declarative manner, using a high-level configuration language. A huge benefit of Terraform is that it is cloud agnostic and supports multiple cloud platforms, including Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), and many others. It’s a powerful tool for creating and managing infrastructure in a scalable and reproducible manner.

Resilient infrastructure in AWS refers to an architecture that is designed to withstand and recover from various types of disruptions or outages. Fault tolerance and high availability are both important concepts in infrastructure design.

In Terraform, fault tolerance can be achieved by creating redundant resources, such as multiple servers in different Availability Zones, and using load balancers to distribute traffic across multiple instances. High availability can be achieved by using features such as Auto Scaling groups, which automatically replace failed instances. By also configuring a failover mechanism, such as Route 53 health checks or Elastic Load Balancer failover, the infrastructure can ensure that users can access the application or service with minimal interruption. Both fault-tolerant and highly available architectures are critical for ensuring the resiliency of infrastructure.
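
To make that concrete, here is a minimal, hypothetical sketch of how a load balancer can be paired with an Auto Scaling group in Terraform. This is not part of this project’s code — the names, subnet IDs, VPC ID, and launch template ID below are placeholders — but it shows the pattern of spreading instances across zones and routing traffic only to healthy ones.

# Hypothetical sketch: an ALB and target group fronting an Auto Scaling group
resource "aws_lb" "web_alb" {
  name               = "web-alb"
  load_balancer_type = "application"
  subnets            = ["subnet-aaaa1111", "subnet-bbbb2222"] # placeholder subnets in two AZs
}

resource "aws_lb_target_group" "web_tg" {
  name     = "web-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = "vpc-0123456789abcdef0" # placeholder VPC ID

  health_check {
    path = "/" # instances failing this check are taken out of rotation
  }
}

resource "aws_lb_listener" "web_http" {
  load_balancer_arn = aws_lb.web_alb.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web_tg.arn
  }
}

# Registering the Auto Scaling group with the target group lets the ASG
# replace unhealthy instances while the ALB routes around them.
resource "aws_autoscaling_group" "web_asg" {
  min_size            = 2
  max_size            = 5
  vpc_zone_identifier = ["subnet-aaaa1111", "subnet-bbbb2222"] # same placeholder subnets
  target_group_arns   = [aws_lb_target_group.web_tg.arn]

  launch_template {
    id      = "lt-0123456789abcdef0" # placeholder launch template ID
    version = "$Latest"
  }
}

In the project below we keep things simpler and rely on the Auto Scaling group alone, but this is the general shape an ELB-based failover setup would take.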

Tasks:

  • Launch an Auto Scaling group that spans 2 subnets in your default VPC
  • Create a Security Group that allows traffic from the Internet and associate it with the Auto Scaling group instances
  • Include a script in your user data to launch an Apache web server. The Auto Scaling group should have a min of 2 and max of 5.
  • To verify everything is working, check the public IP addresses of the two instances. Manually terminate one of the instances to verify that another spins up to meet the minimum requirement of 2 instances
  • Create an S3 Bucket and set it as your remote backend

Let’s get started!

Set Up the Integrated Development Environment (IDE):
For this series, I will continue to use AWS Cloud9. You can find the documentation here: AWS Cloud9. Log into your AWS console and type “Cloud9” in the search menu. Note: I will be working in the US-West-2 Region (Oregon). If you’re on the East Coast, I recommend using a region close to you; just make sure it has enough Availability Zones for this project.

Once you’re in the Cloud9 menu, hit the “Create environment” button.

Next, fill in the details below. Create a name, description, select your environment and Instance Type. I will be using the (Free Tier) t2.micro on Amazon Linux 2. For network settings, use the default VPC.

I’ve made slight changes and updated my IDE name to “Terraform_Updated,” but you should see a green success banner with your newly created environment. Hit the “Open in Cloud9” button to launch your IDE.

My environment is below.

Create a working directory:
Let’s create a working directory where we will write our code and save our files.

mkdir <name of directory>
mkdir luitweek21 #my working directory name
cd <name of directory>
cd luitweek21

Now that we have our working directory, we can get started with creating our infrastructure using Terraform. Throughout this article, we will be utilizing the documentation from Terraform Registry. We will be creating a few files in the working directory.

providers.tf
apache.sh
main.tf
variables.tf

Create a providers.tf:
In Terraform, a provider is responsible for managing and interacting with a specific cloud platform, and the providers file is where we declare which providers and versions Terraform will use for the infrastructure deployment. Here is an example of a providers.tf file that specifies AWS as the provider:

provider "aws" {
region = "us-west-2"
}

Let’s use the Cloud9 IDE to create our own providers.tf file.

touch providers.tf

Let’s run terraform version in our CLI to check the installed version.

“Out of Date”

First, let’s run a few commands to update our Terraform version, as it’s out of date. Enter the following commands.

sudo yum install -y yum-utils shadow-utils
sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/AmazonLinux/hashicorp.repo
sudo yum -y install terraform
terraform version # should now show Terraform v1.4.4 (or newer)

Head over to the Terraform registry and copy the latest provider updates.

Adjust template to include what you need
#terraform providers
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

# Configure the AWS Provider
provider "aws" {
  region = "us-west-2"
}

Now, let’s copy and paste this code into our providers.tf file, save it, and run the following command.

terraform fmt

If you don’t see any file names in the output after running the command, your formatting was already correct; otherwise, terraform fmt prints the names of the files it rewrote.

Now, let’s run the terraform init command to initialize the working directory containing our configuration files and install the necessary provider plugins.

terraform init

Now that our directory has been initialized, we can run the “terraform validate” command to validate the configuration files in the directory. It’s good practice to run this command as you build out your code — validating as you go helps you catch errors early instead of late.

terraform validate
Config is Valid!

Next, let’s create an Apache script that we will bootstrap onto the EC2 instances as user data. Check out the AWS instructions here: AWS User Guide

#!/bin/bash
#Update all packages and install Apache
sudo yum update -y
sudo yum install -y httpd
sudo systemctl enable httpd
sudo systemctl start httpd
#Write a customized Apache webpage
echo "<html><body><h1>Welcome to my Apache webpage!</h1></body></html>" | sudo tee /var/www/html/index.html

Run “terraform validate” again to ensure there are no errors.

No Errors!

Now, we can start building out blocks of code to incorporate into our main config file. Before we move to the security group, let’s create a block of code for launching the EC2 instances with the Apache configuration. In this launch template, we will reference our apache.sh file so it runs at launch.

resource "aws_launch_template" "my-tf-launch" {
name_prefix = "my-tf-launch"
image_id = "ami-0747e613a2a1ff483"
instance_type = "t2.micro"
key_name = var.key_name
user_data = filebase64("${path.root}/installapache.sh")
vpc_security_group_ids = [aws_security_group.my-tf-sg.id]

tag_specifications {
resource_type = "instance"

tags = {
Name = "terraform_auto_scaling"
}
}
}
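
As a side note, rather than hard-coding the AMI ID (which differs between regions and goes stale over time), you could look up the latest Amazon Linux 2 AMI with a data source. This is an optional sketch, not required for this build:

# Optional: look up the latest Amazon Linux 2 AMI instead of hard-coding the ID
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

# Then, in the launch template, you would reference:
#   image_id = data.aws_ami.amazon_linux_2.id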

Each block of code will move into our main.tf file once completed. Let’s shift to the security group.

Create a Security Group that allows traffic from the Internet and Associate it with the Auto-Scaling group instances:

In this security group, we will configure rules that allow traffic in from the Internet, and we will then associate it with the Auto Scaling group’s instances.

resource "aws_security_group" "my-tf-sg" {
name = "my-tf-sg"
description = "Security group for web server instances"
vpc_id = var.my-dev-vpc

ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}

ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}

egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
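
One security note: opening port 22 to 0.0.0.0/0 is acceptable for a short-lived demo, but in practice you would usually restrict SSH to your own address. Here is a sketch of a tighter alternative for that second ingress rule; the CIDR below is a placeholder, so substitute your own public IP.

# Tighter alternative for the SSH rule - limit access to a single address
ingress {
  description = "SSH from my workstation only"
  from_port   = 22
  to_port     = 22
  protocol    = "tcp"
  cidr_blocks = ["203.0.113.10/32"] # placeholder; replace with your public IP as a /32
}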

You can copy and paste these templates, and adjust the parameters and file names. Now, let’s tackle task 1. In order to create the Auto Scaling group, we will need to create a “variables.tf” file that we can incorporate into our “main.tf” file. The process is similar to creating the “providers.tf” and “apache.sh” files we did earlier. We will also break the code down into chunks and validate as we build.

Let’s create a code block for our Auto Scaling group that spans two subnets. We will set the capacity to a minimum and desired count of two instances, with a maximum of five instances. It’s important that we accurately associate the arguments that are being defined.

resource "aws_vpc" "my-dev-vpc" {
cidr_block = "10.10.0.0/16"
}

resource "aws_autoscaling_group" "my-tf-asg" {
name = "my-tf-asg"
min_size = 2
max_size = 5
desired_capacity = 2
vpc_zone_identifier = [var.subnet-public1-us-west-2a, var.subnet-public2-us-west-2b]

launch_template {
id = aws_launch_template.my-tf-launch.id
version = "$Latest"
}
}

A lot of our values will consist of variables. Variables are great because they add flexibility to your code: you can easily modify values without having to edit the underlying code. Another great benefit is reusability. If you have several resources that use the same value (e.g., region), you can define that value once as a variable and then reference it in each of your resources. Let’s create a variables.tf file.

touch variables.tf

Variables will be extremely useful to plug into our main configuration file, especially when changes to variables are required. Let’s build out the variables file.

variable "aws_region" {
description = "AWS Region to deploy the infrastructure"
default = "us-west-2"
}

variable "s3_bucket_name" {
description = "Unique S3 bucket name to store Terraform state"
default = "cblackii-luitw21-bucket"
}

variable "key_name" {
type = string
default = "GeneralUseKeyPair"
}

variable "instance_type" {
description = "EC2 instance type"
default = "t2.micro"
}

variable "image_id" {
type = string
default = "ami-0747e613a2a1ff483" # use the AMI for Amazon Linux 2
}

variable "my-dev-vpc" {
type = string
default = "vpc-042c4102842beb813"
}

variable "vpc_cidr" {
type = string
default = "10.10.0.0/16"
}

variable "subnet-public1-us-west-2a" {
description = "The VPC subnet the instance(s) will be created in"
default = "subnet-0fb177b1051028c2a"
}

variable "subnet-public2-us-west-2b" {
description = "The VPC subnet the instance(s) will be created in"
default = "subnet-09f98d789f54cdb33"
}

variable "subnet-private1-us-west-2a" {
description = "The VPC subnet the instance(s) will be created in"
default = "subnet-021af5a30af937c9f"
}

variable "subnet-private2-us-west-2b" {
description = "The VPC subnet the instance(s) will be created in"
default = "subnet-02574448c9ec3f0e0"
}
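
A quick aside on overriding these defaults: because the values above are variables, you can change them per environment without touching variables.tf — for example with a terraform.tfvars file. The values below are hypothetical placeholders; substitute your own.

# terraform.tfvars (example values only - replace with your own)
aws_region    = "us-west-2"
instance_type = "t2.micro"
key_name      = "MyOtherKeyPair"
image_id      = "ami-0123456789abcdef0" # placeholder AMI ID

Terraform automatically loads terraform.tfvars from the working directory, and you can also pass one-off values with the -var flag at plan or apply time.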

Now, enter the following command to create our main.tf file. This will be our main source where we will insert all blocks of code.

touch main.tf

Create an S3 Bucket and Set it as your Remote Backend:
The final piece is creating an S3 bucket to use as a remote backend. By default, the Terraform state file is stored locally on your computer, but for this tutorial, the task calls for us to store it in an S3 bucket. We can do all of this through code like the other files.

NOTE: You will need to have the S3 bucket created first, before you can point the backend at it. You can do this manually through the AWS Management Console, or add it to your main.tf file as a resource block.

#create an s3 bucket to be used as remote backend
resource "aws_s3_bucket" "cblackii-luitw21-bucket" {
  bucket        = "cblackii-luitw21-bucket"
  force_destroy = true #this will help to destroy an s3 bucket that is not empty
}

#enable versioning to keep record of any modifications made to s3 bucket files
resource "aws_s3_bucket_versioning" "cblackii-luitw21-bucket" {
  bucket = aws_s3_bucket.cblackii-luitw21-bucket.id
  versioning_configuration {
    status = "Enabled"
  }
}

#s3 bucket access control list will be private
resource "aws_s3_bucket_acl" "cblackii-luitw21-bucket" {
  bucket = aws_s3_bucket.cblackii-luitw21-bucket.id
  acl    = "private"
}

#block s3 bucket objects from public
resource "aws_s3_bucket_public_access_block" "cblackii-luitw21-bucket" {
  bucket                  = aws_s3_bucket.cblackii-luitw21-bucket.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

Need-To-Know: To maintain the confidentiality and integrity of our data, we are incorporating access controls to prevent unauthorized access from the public.

Lastly, a DynamoDB table will be used for state locking of the S3 remote backend. Amazon S3 and Amazon DynamoDB are two popular services that are often used together to build highly scalable and reliable applications. With S3 as the backend, DynamoDB is a great option for locking and managing the state of your Terraform configuration; the backend references the table through its dynamodb_table argument, which we will include in the backend block in main.tf.

#create dynamodb table for file-locking of s3 bucket backend
resource "aws_dynamodb_table" "luitweek21-dynamodb" {
  name           = "luitweek21-dynamodb"
  hash_key       = "LockID" #value "LockID"
  billing_mode   = "PROVISIONED"
  read_capacity  = 10 #free-tier eligible
  write_capacity = 10 #free-tier eligible

  attribute {
    name = "LockID"
    type = "S"
  }
}

We will add all code blocks to our main.tf.

provider "aws" {
region = var.aws_region
}

#create an s3 bucket to be used as remote backend
resource "aws_s3_bucket" "cblackii-luitw21-bucket" {
  bucket        = "cblackii-luitw21-bucket"
  force_destroy = true #this will help to destroy an s3 bucket that is not empty
}

#enable versioning to keep record of any modifications made to s3 bucket files
resource "aws_s3_bucket_versioning" "cblackii-luitw21-bucket" {
  bucket = aws_s3_bucket.cblackii-luitw21-bucket.id
  versioning_configuration {
    status = "Enabled"
  }
}

#s3 bucket access control list will be private
resource "aws_s3_bucket_acl" "cblackii-luitw21-bucket" {
  bucket = aws_s3_bucket.cblackii-luitw21-bucket.id
  acl    = "private"
}

#block s3 bucket objects from public
resource "aws_s3_bucket_public_access_block" "cblackii-luitw21-bucket" {
  bucket                  = aws_s3_bucket.cblackii-luitw21-bucket.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

#create dynamodb table for file-locking of s3 bucket backend
resource "aws_dynamodb_table" "my-tf-dbtable" {
  name           = "my-tf-dbtable"
  hash_key       = "LockID" #value "LockID" is required
  billing_mode   = "PROVISIONED"
  read_capacity  = 10 #free-tier eligible
  write_capacity = 10 #free-tier eligible

  attribute {
    name = "LockID" #name "LockID" is required
    type = "S"
  }
}

resource "aws_security_group" "my-tf-sg" {
  name        = "my-tf-sg"
  description = "Security group for web server instances"
  vpc_id      = var.my-dev-vpc

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_vpc" "my-dev-vpc" {
  cidr_block = "10.10.0.0/16"
}

resource "aws_autoscaling_group" "my-tf-asg" {
  name                = "my-tf-asg"
  min_size            = 2
  max_size            = 5
  desired_capacity    = 2
  vpc_zone_identifier = [var.subnet-public1-us-west-2a, var.subnet-public2-us-west-2b]

  launch_template {
    id      = aws_launch_template.my-tf-launch.id
    version = "$Latest"
  }
}

resource "aws_launch_template" "my-tf-launch" {
name_prefix = "my-tf-launch"
image_id = "ami-0747e613a2a1ff483"
instance_type = "t2.micro"
key_name = var.key_name
user_data = filebase64("${path.root}/installapache.sh")
vpc_security_group_ids = [aws_security_group.my-tf-sg.id]

tag_specifications {
resource_type = "instance"

tags = {
Name = "terraform_auto_scaling"
}
}
}

terraform {
  backend "s3" {
    bucket         = "cblackii-luitw21-bucket"
    key            = "global/s3/terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "my-tf-dbtable" # use the DynamoDB table above for state locking
    # Add any additional backend-specific configuration options
  }
}

Now that all of our code blocks are in main.tf, let’s run the Terraform commands.

Deploy Resources Using Terraform:

The first command is used to initialize Terraform.

terraform init
“Terraform has been successfully initialized!”

The next command will be used to validate that our code syntax is accurate.

terraform validate

The next command is used to plan out all of the resources that will be deployed, and it will also check for errors in the code.

terraform plan

You will be shown a list of all the planned resources, along with a suggestion to run “terraform apply.”

Type in “terraform apply.”

terraform apply

Enter a value: yes
Success!

NOTE: You should only have resources added if everything was done correctly. I had to make several adjustments and troubleshoot, so I ended up destroying some resources. Run the following command to see what’s running.

terraform state list #to see what's running
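
If you’d rather surface key identifiers directly from Terraform instead of hunting through the console, you can also add an optional outputs.tf before applying — a small sketch using this project’s resource names:

#outputs.tf (optional) - print a few identifiers after terraform apply
output "autoscaling_group_name" {
  value = aws_autoscaling_group.my-tf-asg.name
}

output "security_group_id" {
  value = aws_security_group.my-tf-sg.id
}

output "state_bucket" {
  value = aws_s3_bucket.cblackii-luitw21-bucket.bucket
}

After an apply (or by running terraform output), these values print in the CLI.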

Verify Resources Deployed:

Let’s head over to the EC2 console and verify that we have a minimum of 2 EC2 instances running. Also, to test the Auto Scaling group, we will terminate one of the EC2 instances. You should see a new EC2 instance launch shortly after.

After the new EC2 instance has spun up and initialized, we will test that our Apache webpage is configured correctly.

IT WORKS!

Check the security groups. Check the Inbound Rules!

Check that your S3 bucket and DynamoDB table are created.

aws s3 ls --human-readable

aws dynamodb describe-table --table-name <dynamodb_table>

The S3 bucket and DynamoDB table were created successfully, and our backend is configured to store the terraform.tfstate file remotely. If you added the backend block after your first apply, run terraform init again so Terraform can migrate the existing state to S3.

Success!

There you have it. We have our EC2s created with auto-scaling configured. We have our security groups with appropriate inbound/outbound settings. We were able to connect to our public IPv4 address. Our S3 bucket and DynamoDB table were created successfully.

SUCCESS!!! Last step, tear it all down!

terraform destroy

If you made it this far, thank you for reading this article. This was definitely one of the tougher projects, but very rewarding. I’ve learned from all the mistakes and errors. Follow me on LinkedIn for more updates!

Charles Black II

Founder and CEO of Made It Digital. Business | Technology | Personal Growth