Blue/Green and Canary Infrastructure with Terraform
Blue/Green and Canary deployments are popular strategies nowadays. Terraform enables you to create infrastructure with code, and code can be version controlled. In this article, instead of talking about the advantages of Terraform or how to install it (you can check https://www.terraform.io/ for that), we will build a real-world example on AWS.
Prerequisites
I will be using Amazon Web Services (AWS) for this tutorial, but the code and implementation won't vary much with another provider like Google Cloud Platform, Azure, etc.
You need to have some basic knowledge of Terraform. Also, you will need to have:
- A working AWS account. You can sign up and use the free tier offered by AWS
- AWS CLI installed on your local machine
- Terraform installed on your local machine
- An AWS IAM user with proper permissions and a configured AWS profile
What are Blue/Green and Canary deployments?
Blue/Green deployment is a DevOps practice that aims to reduce downtime on updates by creating a new copy of the desired component while keeping the current one running. You end up with two versions of the system: one running the current version (blue) and another running the new one (green). When the new version is up and running, you can seamlessly switch traffic to it. This is useful not only to reduce downtime but also to speed up rollback when something goes wrong.
Canary deployment releases an application or service incrementally to a subset of users. Because of this fine-grained control, it is the least risky of the common deployment strategies.
Blue/Green Infrastructure
While Blue/Green deployment is a technique more commonly used with application deployment, the reduced costs of the cloud, in addition to the tools we have right now, make it possible to have two copies of an entire cloud infrastructure with little to no pain.
It's also important to note that a Blue/Green deployment of the entire cloud infrastructure is not a silver bullet, and it is certainly overkill for small changes, for example adding a new EC2 instance or changing an instance type in your stack. If you plan to make major or breaking changes, however, this deployment type is worth it.
After finishing this, you will be able to create an infrastructure containing:
- A Virtual Private Cloud
- Three Subnets, each one in a different Availability Zone
- An Internet Gateway
- A custom Route Table
- Route table associations between the custom route table and the subnets
- A Security Group
- EC2 instances serving an Apache HTTPd server on port 80
- A Load Balancer pointing to those instances
- Optional: Attach A DNS Record to the Load Balancer
Most of these components will be the same for both the Blue/Green environments with minor differences for each environment.
The full example can be seen here.
Step 1: Setup Provider, Backend with Terraform
We start by creating a folder and opening it in your favourite editor. I will be using VS Code.
The first thing we must create is the provider configuration. Terraform relies on plugins called "providers" to interact with cloud providers, SaaS providers, and other APIs. Terraform configurations must declare which providers they require so that Terraform can install and use them.
Terraform stores the state of the infrastructure in a JSON file. It’s recommended to store that file on an external backend like Amazon S3. As we are using AWS for this tutorial, we will stick to S3, but Terraform supports the equivalent in each provider.
First, you need to create the S3 bucket in which the state will reside. You can do this either via the AWS S3 console or by doing:
aws s3api create-bucket --create-bucket-configuration LocationConstraint=eu-west-1 --bucket blue-green-for-learning-here --region eu-west-1 | jq
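The original embedded snippet is not shown here, but a minimal providers.tf wiring up the AWS provider and the S3 backend might look like this (the key path and provider version are assumptions; the bucket name matches the command above — note that backend blocks cannot reference variables, so their values are hardcoded):

```hcl
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }

  # Remote state lives in the bucket created above
  backend "s3" {
    bucket = "blue-green-for-learning-here"
    key    = "blue-green/terraform.tfstate"
    region = "eu-west-1"
  }
}

provider "aws" {
  region  = var.aws_region
  profile = var.aws_profile
}
```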
We have referenced some variables, e.g. var.aws_profile, so we need to define them. We create a file variables.tf:
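Since the original gist is missing, here is a sketch of variables.tf. The names aws_profile and infrastructure_version appear later in the article; aws_region and the defaults are assumptions:

```hcl
variable "aws_profile" {
  description = "Local AWS CLI profile to use"
  type        = string
  default     = "personal-deployment"
}

variable "aws_region" {
  description = "AWS region to deploy into"
  type        = string
  default     = "eu-west-1"
}

variable "infrastructure_version" {
  description = "Deployment version, interpolated into names and CIDR blocks"
  type        = number
  default     = 1
}
```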
Note: configure your AWS profile. Here the profile is personal-deployment; you can change this to suit yours.
Step 2: Create VPC
Create a file named vpc.tf with this content:
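The original snippet is not shown, but a minimal sketch could be (the resource name "main" and the tags are assumptions):

```hcl
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "blue-green-vpc"
  }
}
```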
cidr_block: 10.0.0.0/16 allows us to use IP addresses that start with 10.0.X.X, giving us 65,536 IP addresses ready to use.
You can check for the remaining arguments here.
Step 3: Create Public Subnets
To do anything useful, we first need subnets. We will create three of them, each in a different availability zone. Create a file named subnets.tf with this content:
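A sketch matching the description below, assuming the VPC resource is called aws_vpc.main and folding the infrastructure_version variable into the third octet of the CIDR:

```hcl
# Look up the AZs available in the configured region
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_subnet" "public" {
  count             = 3
  vpc_id            = aws_vpc.main.id
  availability_zone = element(data.aws_availability_zones.available.names, count.index)

  # Version 1 yields 10.0.10.0/24, 10.0.11.0/24, 10.0.12.0/24;
  # version 2 would not overlap with them
  cidr_block              = "10.0.${var.infrastructure_version}${count.index}.0/24"
  map_public_ip_on_launch = true

  tags = {
    Name = "public-subnet-${count.index}"
  }
}
```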
Here we create three subnets, specifying:
- count: The number of subnets we want to create
- availability_zone: We use the element() function, which takes a list and an index and returns the element at that index, wrapping around when the index exceeds the list length. This lets us assign a different availability zone to each subnet.
- vpc_id: The ID of the VPC we just created.
- cidr_block: We interpolate the previously defined infrastructure_version variable into the CIDR block. This will help later when we create the second version (green).
You can check for the remaining arguments here.
Step 4: Create an Internet Gateway
To enable our VPC to connect to the internet, we need an internet gateway.
Create a file named internet_gw.tf with this content:
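As a minimal sketch (resource name assumed):

```hcl
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "blue-green-igw"
  }
}
```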
You can check for the arguments here.
Step 5: Create a custom route table
Create a custom route table for the public subnets; traffic from the public subnets reaches the internet through it.
Create a file named route_table.tf with this content:
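A sketch, routing all outbound traffic through the internet gateway (resource names assumed):

```hcl
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  # Default route: everything not destined for the VPC goes to the internet
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "public-route-table"
  }
}
```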
Step 6: Associate custom route table and subnet
In the same file route_table.tf, add:
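A sketch of the association, one per subnet:

```hcl
resource "aws_route_table_association" "public" {
  count          = 3
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}
```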
Step 7: Create a Security Group
We add inbound and outbound security group rules for our EC2 instances, opening only the ports we need.
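The article does not show the file, so as a sketch (file could be named security_group.tf; the resource name "web" is an assumption) opening HTTP in and all traffic out:

```hcl
resource "aws_security_group" "web" {
  name        = "blue-green-web-sg"
  description = "Allow HTTP in, all traffic out"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```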
You can check for the remaining arguments here.
Step 8: AMI data source
We will be creating our blue environment soon, adding EC2 instances to it. We will use the Amazon Linux 2 AMI. This step is not strictly necessary for this demo, but I want to show you how to get the latest AMI ID for Amazon Linux 2:
Create a file named ami_datasource.tf with this content:
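The article later references data.aws_ami.amzlinux, so a plausible sketch is:

```hcl
# Latest Amazon Linux 2 AMI published by Amazon
data "aws_ami" "amzlinux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
```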
Step 9: Create Blue Environment
First, let’s add the local values we will need for our environment.
Create a file named locals.tf with this content:
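The article later uses a local called subnets, so a minimal sketch is:

```hcl
locals {
  # IDs of the three public subnets, consumed by the instances and the load balancer
  subnets = aws_subnet.public[*].id
}
```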
Here we define local values for the subnets. For now, only focus on the subnets entry.
Let's add a user data script. Create a file named apache-script.sh with this content:
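The original script is not shown. A sketch that installs Apache and serves a page identifying the environment could look like this, assuming the file is rendered with Terraform's templatefile() so that ${environment} is substituted before the shell runs:

```shell
#!/bin/bash
# Bootstrap script for Amazon Linux 2. ${environment} is a Terraform
# template variable, not a shell variable.
yum update -y
yum install -y httpd
systemctl enable httpd
systemctl start httpd
echo "<h1>Hello from the ${environment} environment</h1>" > /var/www/html/index.html
```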
Now, create a file named blue.tf with this content:
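Reconstructing from the description below (the target group name blue-green-deployment-blue and the variables are named later in the article; the rest is an assumed sketch):

```hcl
resource "aws_instance" "blue" {
  count                  = var.enable_blue_env ? var.blue_instance_count : 0
  ami                    = data.aws_ami.amzlinux.id
  instance_type          = var.instance_type
  subnet_id              = element(local.subnets, count.index)
  vpc_security_group_ids = [aws_security_group.web.id]

  # Render the bootstrap script with the environment name baked in
  user_data = templatefile("${path.module}/apache-script.sh", {
    environment = "blue"
  })

  tags = {
    Name = "blue-${count.index}"
  }
}

# Target group and attachments used by the load balancer in the next step
resource "aws_lb_target_group" "blue" {
  name     = "blue-green-deployment-blue"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  health_check {
    path     = "/"
    port     = 80
    protocol = "HTTP"
    matcher  = "200"
  }
}

resource "aws_lb_target_group_attachment" "blue" {
  count            = length(aws_instance.blue)
  target_group_arn = aws_lb_target_group.blue.arn
  target_id        = aws_instance.blue[count.index].id
  port             = 80
}
```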
Let’s explain a little bit about this file.
We have created a resource of type aws_instance with these parameters:
- count: The number of resources of this type. In this case, it's based on the enable_blue_env and blue_instance_count variables, which default to true and 2 respectively.
- ami: The Amazon Machine Image for the instance. As mentioned in step 8, we use data.aws_ami.amzlinux.id to get the latest Amazon Linux image ID.
- instance_type: The type of the instance, with a default of t2.micro.
- user_data: This allows us to assign an initialization script to the instance. In our case, we are running an Apache httpd server with a custom webpage. There are better ways to define user data scripts, but we'll keep it simple for now.
Don't worry about the target group and target group attachment resources yet; we will come to understand them in the next step.
You can find the remaining arguments here.
Let us update our variables.tf to this:
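In addition to the variables already defined, the blue environment needs (names and defaults taken from the description in step 9; instance_type is an assumed companion):

```hcl
variable "enable_blue_env" {
  description = "Enable the blue environment"
  type        = bool
  default     = true
}

variable "blue_instance_count" {
  description = "Number of instances in the blue environment"
  type        = number
  default     = 2
}

variable "instance_type" {
  description = "EC2 instance type for the web servers"
  type        = string
  default     = "t2.micro"
}
```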
Step 10: Create a Load Balancer
Create a file named load_balancer.tf with this content:
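A sketch matching the description below (resource names and the LB name are assumptions):

```hcl
resource "aws_lb" "main" {
  name               = "blue-green-lb"
  internal           = false
  load_balancer_type = "application"
  subnets            = local.subnets
  security_groups    = [aws_security_group.web.id]
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  # Forward everything to the blue target group created in step 9
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.blue.arn
  }
}
```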
In this file we have created a Load Balancer with:
- name: Self-explanatory
- subnets: The subnets the load balancer is available in
- security_groups: We attach the previously created security group so the load balancer is reachable
- load_balancer_type: Here we are using the application load balancer type
- listener: We have added an aws_lb_listener resource that listens on port 80 of the load balancer. Its default action forwards to the aws_lb_target_group we created in step 9. We also defined a health_check there, a simple HTTP health check that targets port 80 of the instances.
We have created all the components we need for our blue environment. Let’s add one more file for the outputs.
Create a file named outputs.tf with this content:
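The lb_dns_name output referenced later in the article would look like:

```hcl
output "lb_dns_name" {
  description = "Public DNS name of the load balancer"
  value       = aws_lb.main.dns_name
}
```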
Here, we have added an output that displays the Load Balancer Public DNS. All Good now 😁. Let’s now start playing with terraform commands.
The first command we need to run is terraform fmt, to format our configuration files.
Next, run terraform init to initialize the working directory, load the remote state, and download the providers and modules we need.
Then we run terraform validate to validate our configuration files. Validate checks whether a configuration is syntactically valid and internally consistent.
The next command is terraform plan, which creates an execution plan with a preview of the changes to our infrastructure.
As the plan output shows, this run will add 19 resources.
Finally, run terraform apply to execute the actions proposed in the plan. This is where we accept the changes and apply them against real infrastructure.
Apply complete! Resources: 19 added, 0 changed, 0 destroyed.
And you can access the webpage using the lb_dns_name output 😁.
We can do a curl for loop for this lb_dns_name:
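Something like the following, reading the DNS name from the Terraform output:

```shell
# Hit the load balancer ten times to see which environment answers
LB_DNS=$(terraform output -raw lb_dns_name)
for i in $(seq 1 10); do
  curl -s "http://$LB_DNS"
done
```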
Hurray!!! We are done with the Blue Environment configuration.
Let’s now create a Green Environment.
Step 11: Create Green Environment
Our green environment will be almost the same as the blue environment, with one difference: it will run a new version of the deployment.
Create a file named green.tf with this content:
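Mirroring the blue.tf sketch above, with the names swapped (the target group name blue-green-deployment-green is confirmed by the article; the rest is assumed):

```hcl
resource "aws_instance" "green" {
  count                  = var.enable_green_env ? var.green_instance_count : 0
  ami                    = data.aws_ami.amzlinux.id
  instance_type          = var.instance_type
  subnet_id              = element(local.subnets, count.index)
  vpc_security_group_ids = [aws_security_group.web.id]

  user_data = templatefile("${path.module}/apache-script.sh", {
    environment = "green"
  })

  tags = {
    Name = "green-${count.index}"
  }
}

resource "aws_lb_target_group" "green" {
  name     = "blue-green-deployment-green"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  health_check {
    path     = "/"
    port     = 80
    protocol = "HTTP"
    matcher  = "200"
  }
}

resource "aws_lb_target_group_attachment" "green" {
  count            = length(aws_instance.green)
  target_group_arn = aws_lb_target_group.green.arn
  target_id        = aws_instance.green[count.index].id
  port             = 80
}
```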
Let's update the load balancer file load_balancer.tf with this new content:
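A sketch of the updated listener using an ALB weighted forward action, where the weights are looked up from a traffic distribution map in locals (map and variable names match those used below):

```hcl
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "forward"

    forward {
      target_group {
        arn    = aws_lb_target_group.blue.arn
        weight = lookup(local.traffic_dist_map[var.traffic_distribution], "blue", 100)
      }

      target_group {
        arn    = aws_lb_target_group.green.arn
        weight = lookup(local.traffic_dist_map[var.traffic_distribution], "green", 0)
      }

      stickiness {
        enabled  = false
        duration = 1
      }
    }
  }
}
```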
The update here is to include both target groups, blue-green-deployment-blue and blue-green-deployment-green, plus stickiness and a traffic distribution for the two environments, defaulting to 100 for the Blue environment and 0 for the Green environment.
Let's also create a traffic distribution map in the locals. Update the locals.tf file with this content:
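Extending the earlier locals.tf sketch, with one entry per distribution key used in the apply commands below (the "blue" and "green-90" entries are assumed companions):

```hcl
locals {
  subnets = aws_subnet.public[*].id

  # Listener weights keyed by the traffic_distribution variable
  traffic_dist_map = {
    "blue" = {
      blue  = 100
      green = 0
    }
    "blue-90" = {
      blue  = 90
      green = 10
    }
    "split" = {
      blue  = 50
      green = 50
    }
    "green-90" = {
      blue  = 10
      green = 90
    }
    "green" = {
      blue  = 0
      green = 100
    }
  }
}
```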
You can read more about AWS ALB traffic distribution here.
Let's also update the variables.tf to this content:
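The new variables, inferred from the apply commands below (defaults are assumptions):

```hcl
variable "enable_green_env" {
  description = "Enable the green environment"
  type        = bool
  default     = false
}

variable "green_instance_count" {
  description = "Number of instances in the green environment"
  type        = number
  default     = 2
}

variable "traffic_distribution" {
  description = "Key into local.traffic_dist_map selecting the blue/green weights"
  type        = string
  default     = "blue"
}
```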
Let's run terraform apply to create the green environment and send 90% of the traffic to the blue environment:
terraform apply -var traffic_distribution=blue-90 -var enable_green_env=true -auto-approve
Now if we do a curl for loop against lb_dns_name, you can see that most of the traffic (90%) is going to the blue deployment, with the remaining 10% going to green.
Let's run terraform apply to split traffic 50/50 between the two deployments:
terraform apply -var traffic_distribution=split -var enable_green_env=true -auto-approve
You can see the traffic is evenly distributed between the two deployments.
Finally, let's switch traffic completely to the green deployment and disable the blue deployment.
terraform apply -var traffic_distribution=green -var enable_green_env=true -var enable_blue_env=false -auto-approve
Now if we do a curl for loop against lb_dns_name, we can see we are only getting responses from the new green deployment.
This is awesome 🤩. Congratulations 👏.
Optional: Attach A DNS Record to the Load Balancer
I'm not going to cover much of this case, but what I've ended up doing in production is creating a DNS record that points to the load balancer. The Terraform for this could look like:
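A sketch using a Route53 alias record (the domain_name variable is confirmed by the commands below; the record name "app" and resource names are assumptions):

```hcl
variable "domain_name" {
  description = "Root domain hosted in Route53"
  type        = string
  default     = ""
}

data "aws_route53_zone" "main" {
  count = var.domain_name != "" ? 1 : 0
  name  = var.domain_name
}

# Alias record pointing app.<domain> at the load balancer
resource "aws_route53_record" "app" {
  count   = var.domain_name != "" ? 1 : 0
  zone_id = data.aws_route53_zone.main[0].zone_id
  name    = "app.${var.domain_name}"
  type    = "A"

  alias {
    name                   = aws_lb.main.dns_name
    zone_id                = aws_lb.main.zone_id
    evaluate_target_health = true
  }
}
```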
Then do terraform apply:
terraform apply -var traffic_distribution=green -var enable_green_env=true -var enable_blue_env=false -var domain_name=yourdomain.com -auto-approve
Please change the yourdomain.com var to your own domain.
Please note this DNS configuration only works when your domain is hosted in AWS Route53.
If you are doing this just for fun, please run terraform destroy to tear the deployment down:
terraform destroy -var traffic_distribution=green -var enable_green_env=true -var enable_blue_env=false -auto-approve
If you added domain configuration then run:
terraform destroy -var traffic_distribution=green -var enable_green_env=true -var enable_blue_env=false -var domain_name=yourdomain.com -auto-approve
Destroy complete! Resources: 21 destroyed 😁.
That’s all for now. Thanks for reading. Let’s connect on Twitter and LinkedIn 😁.