One-click environment creation with Terraform & Ansible in under 10'

Provision and configure an infrastructure in AWS and deploy applications in one go

Kostas Gkountakos
On The Cloud
11 min read · Apr 20, 2020


When it comes to managing infrastructure, there are many options out there for provisioning, configuration, orchestration & application deployment, as well as a plethora of technologies and tools someone could use for each of those tasks.

From the above list, two of the most popular technologies (probably the leading ones right now) for provisioning and configuring infrastructure are Terraform and Ansible. Both are powerful enough to perform both tasks; at the same time, each one excels at one of them. Terraform is a great tool when it comes to provisioning infrastructure, while Ansible is an excellent choice for orchestrating the configuration of our infra resources.

In this post, we are going to see how we can combine them, in order to create a new environment, in our case at AWS. To make things a bit more interesting, after the creation and during the configuration of our servers, we are going to deploy a Web and a REST API application (we’ll use Ansible for the deployment), just to be able to see an end result after the whole process finishes.

Before we get started

If you would like to follow along, you need to set up/install a few things first.

Install Terraform

The installation of Terraform is pretty straightforward. You just need to find the appropriate package for your system, download it, extract it, and add the binary's location to your PATH. For more details, you can check Terraform's official installation documentation.

Install Ansible

There is more than one way to install Ansible, and the official documentation has detailed instructions for each one of them.

Personally, I preferred using pip to do the installation (although this approach does not create an ansible.cfg file, something that doesn’t cause any issues, at least for this demo).

Create an AWS Account

Finally, you need to either have or create an AWS account, as this is the cloud provider that we’ll be using to create our infrastructure. Instructions on how to do that can be found in the AWS documentation.

You will also have to create a new key pair in the region where you want to build the infrastructure.

What we will build

As mentioned above, in this example we go from resource provisioning, to server configuration, to application deployment. From an infrastructure perspective, we’ll try to keep things simple and build a stack similar to the one we created in a previous post (when we looked at how to create AWS infra with CloudFormation). The end result will look like this:

Figure 1: Our AWS infrastructure

After the creation and configuration of the servers has been completed, we’ll deploy two different applications to them:

  • the Latest News API: it will be deployed to the EC2 instances located in our private subnets, and it is a simple API written in Java/Spring Boot. The app determines the caller’s country based on their IP, and calls News API, a free, JSON-based API, to get the news for that specific country.
  • and the Latest News website: it will be deployed to the EC2 instances located in a public subnet. The web page calls our Latest News API to get the data it will display. It consists of a simple HTML5 page and a JS script.

The reason we’re placing the web app in the public subnet is to show the difference in how the Ansible playbooks are invoked, depending on the subnet the EC2 instance is located in.

The code for the Latest News API can be found in this repository, while the Latest News Website files are included in the repository containing all the Ansible and Terraform code for this demo.

How the whole thing works

One of the advantages of using Terraform & Ansible, is that they can invoke each other, so we can fully automate the process at hand and have one big task that we can run with one click.

Figure 2: Ansible & Terraform invoking each other in the same flow

Ansible has a module for Terraform, which we can use in order to provision our resources. On the other hand, although Terraform does not have an explicit provisioner for Ansible (as it has for Chef or Salt), we can make use of the remote-exec & local-exec provisioners to invoke our Ansible scripts at the time of server creation.

In the above diagram, we can visualize such a flow, with Ansible calling Terraform code, which in turn runs a different set of Ansible playbooks to configure our servers.

To understand this better, let’s start diving into the code and see how each part plays its role.

Our base Ansible script

At the root level of our demo project, we can see the following structure:

The create-staging.yml file is our base Ansible script, which will run a few tasks locally before it runs the Terraform code.

What it basically does is create an .ansible.cfg file, which we use to point to a custom ssh config file. We do that because, in this demo, we want to avoid StrictHostKeyChecking when connecting over ssh to our newly created servers, and we don’t want to modify our ~/.ssh/config file.

Our .ansible.cfg file
Our custom ssh config file
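As a minimal sketch of what these two files might contain (the exact file names and paths are assumptions for this demo):

```ini
; .ansible.cfg — points Ansible's ssh connections at our custom ssh config
[ssh_connection]
ssh_args = -F ./ssh.cfg
```

```
# ssh.cfg — skips strict host key checking for the freshly created hosts,
# so we don't have to touch ~/.ssh/config
Host *
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
```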

The terraform init task is needed in order for the appropriate provider (in our case, the AWS provider) to be downloaded locally, which Terraform will then use to provision our resources. The last task is the call to the Terraform module.
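Putting these pieces together, the base playbook might look roughly like this (the copy task and file paths are assumptions based on the description above):

```yaml
# create-staging.yml — runs locally, then hands off to Terraform
- hosts: localhost
  connection: local
  gather_facts: no
  tasks:
    - name: create the ansible.cfg pointing to our custom ssh config
      copy:
        src: ansible.cfg
        dest: ~/.ansible.cfg

    - name: init terraform
      shell: terraform init
      args:
        chdir: terraform/

    - name: apply terraform script
      terraform:
        project_path: terraform/
        state: present
```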

Our Terraform files

There are a few files that we’ll be using in order to run our Terraform code. Those can be found in the terraform folder.

These are:

  1. The variables.tf file, which contains some variables. Some of them are already initialized, while others will get their values either from the user when the script is run (via a prompt), or from the terraform.tfvars file.
  2. The outputs.tf file, which lists certain values to be output after the completion of the main.tf script.
  3. The terraform.tfvars file, which initializes all of the variables declared in the variables.tf file that the user would otherwise be prompted for (in order to fully automate the process).
  4. The main.tf file, which, as the name implies, is our main code file, containing all resource definitions.

The terraform.tfstate file is produced after the execution of the Terraform code, and in our case it is stored locally.

The “variables.tf” file

In our example, variables include the region that the infrastructure will be created in, the VPC CIDR, the private key pair file name and location, the EC2 instance type to be used, etc.
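Based on that description, the file might look something like this (variable names and defaults are illustrative, not the repository's exact contents):

```hcl
variable "region" {}

variable "my_ip" {
  description = "The IP allowed to reach the bastion host and the website"
}

variable "private_key_file_path" {}

variable "vpc_cidr" {
  default = "10.0.0.0/16"
}

variable "instance_type" {
  default = "t2.micro"
}
```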

The “outputs.tf” file

We’ll use this file to print the DNS name of the Application Load Balancer, as well as the public DNS of the Latest News website app, at the end of our code execution.
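A sketch of such an outputs file, assuming resource names like these (news_website_address is the output we'll look up at the end):

```hcl
output "latest_news_api_alb_dns" {
  value = aws_lb.latest_news_api.dns_name
}

output "news_website_address" {
  value = aws_instance.website.public_dns
}
```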

The “terraform.tfvars” file

In this file, we’ll define the region, the IP from which we’ll access the bastion host and the website, and the name and location of our private key.
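For example (all values here are placeholders, to be replaced with your own):

```hcl
region                = "eu-west-1"
my_ip                 = "203.0.113.10/32"
private_key_file_path = "~/.ssh/my-keypair.pem"
```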

The “main.tf” file

In the beginning of the file, we declare information about the provider to be used, such as its name and, optionally, the credentials that we’ll use to connect to it.

When I run this locally though, I don’t declare the latter, as they’re saved in the ~/.aws/credentials file that I have created locally:

[default]
aws_access_key_id=<YOUR_ACCESS_KEY_ID>
aws_secret_access_key=<YOUR_SECRET_ACCESS_KEY>

We then move on with the provisioning of our infrastructure resources, starting from the VPC and the Internet Gateway.
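As a rough sketch of those first resources (the resource names are assumptions; the credentials are picked up from ~/.aws/credentials as noted above):

```hcl
provider "aws" {
  region = var.region
  # access_key / secret_key omitted — resolved from ~/.aws/credentials
}

resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}
```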

We declare our subnets, creating two in each Availability Zone, and use the cidrsubnet function to dynamically determine each subnet’s CIDR.

The way this function works might be a bit hard to understand, but I found a nice article that explains things in more detail.
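In short, cidrsubnet(prefix, newbits, netnum) carves subnet number netnum out of prefix, extending the mask by newbits; with a 10.0.0.0/16 VPC, cidrsubnet(var.vpc_cidr, 8, 1) yields 10.0.1.0/24. A sketch of the subnet declarations under those assumptions:

```hcl
data "aws_availability_zones" "available" {}

# One public and one private subnet per AZ, with non-overlapping /24 blocks
resource "aws_subnet" "public" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  availability_zone = data.aws_availability_zones.available.names[count.index]
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index)
}

resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  availability_zone = data.aws_availability_zones.available.names[count.index]
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + 2)
}
```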

We move on with the creation of Elastic IPs and NAT Gateways…

…our Route Tables and their associations with subnets…

…and our Security Groups.
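The NAT side of that wiring might be sketched like this (one NAT Gateway per AZ, each routing the private subnet in its AZ out to the internet; names are assumptions):

```hcl
resource "aws_eip" "nat" {
  count = 2
  vpc   = true
}

resource "aws_nat_gateway" "main" {
  count         = 2
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
}

resource "aws_route_table" "private" {
  count  = 2
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }
}

resource "aws_route_table_association" "private" {
  count          = 2
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}
```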

At this point, we have all the resources needed to start coding our EC2 instance definitions.

The first one, which will be our bastion host, will be fairly simple as we don’t need to do any configuration after its creation.

For the other EC2 instances though (the API and the website ones), we want to invoke the configuration part as soon as the server has been created.

To do that, we’ll use Terraform’s local-exec provisioner, through which we’ll call the appropriate Ansible playbooks. The syntax of the ansible-playbook command is a bit different depending on whether the targeted EC2 instances reside in a private or a public subnet.

Let’s start by examining how this is accomplished for the EC2 instances to which we’ll deploy our Latest News API (private subnets).

In this case, when we run the ansible-playbook command, we use --ssh-common-args and pass the ProxyCommand option, since our instance is behind a jump server (bastion host).

Terraform will try to run the code inside local-exec as soon as the server has been created. The issue is that, at that point, our server won’t yet be ready to accept the ssh connections that Ansible needs in order to run. Therefore, the local-exec code will exit, and the playbook won’t run.

For that reason, we delay the local-exec execution by adding a remote-exec block, which will try to ssh to our remote server in order to perform a task (in our case, just echo something). The remote-exec provisioner will keep trying to connect until an ssh connection has been established, ensuring the local-exec block runs only after everything is ready.

In case the private key you’re using to connect to your EC2 instances is loaded in your ssh-agent, you can remove the -i ${var.private_key_file_path} from the ProxyCommand.

Note that since we’re using the bastion host’s public IP to connect to our private EC2 instances, our server’s creation depends_on = [aws_instance.bastion].
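A sketch of how that combination of provisioners might look (assuming Ubuntu AMIs and the file layout described in this post; ami, subnet, and security group arguments are elided):

```hcl
resource "aws_instance" "api" {
  count = 2
  # ... ami, instance_type, subnet_id, key_name, security groups ...

  # bastion must exist first, since we ssh through its public IP
  depends_on = [aws_instance.bastion]

  # wait until the server actually accepts ssh connections
  provisioner "remote-exec" {
    inline = ["echo ready"]

    connection {
      type         = "ssh"
      user         = "ubuntu"
      host         = self.private_ip
      private_key  = file(var.private_key_file_path)
      bastion_host = aws_instance.bastion.public_ip
    }
  }

  # then run the backend playbook through the jump server
  provisioner "local-exec" {
    command = <<EOT
      ansible-playbook -i '${self.private_ip},' -u ubuntu \
        --ssh-common-args '-o ProxyCommand="ssh -W %h:%p -i ${var.private_key_file_path} ubuntu@${aws_instance.bastion.public_ip}"' \
        ansible/backend.yml
    EOT
  }
}
```

Note the trailing comma in -i '${self.private_ip},', which tells Ansible to treat the value as an ad-hoc inventory rather than an inventory file.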

When it comes to configuring our public EC2 instances, the code is pretty much the same, with the main difference being the missing --ssh-common-args option, as this time we can ssh directly to our server using its public IP.

The other difference (which applies to our specific example) is that this time we’re passing --extra-vars to our playbook, containing the public DNS name of the Load Balancer that our website will use to call our API. In order for that info to be available at this point, we make sure that our instance depends_on = [aws_lb.latest_news_api].
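The public-subnet variant of the local-exec call might therefore look like this (the api_url variable name is an assumption for illustration):

```hcl
# inside the website instance resource, which has
# depends_on = [aws_lb.latest_news_api]
provisioner "local-exec" {
  command = <<EOT
    ansible-playbook -i '${self.public_ip},' -u ubuntu \
      --extra-vars "api_url=http://${aws_lb.latest_news_api.dns_name}" \
      ansible/frontend.yml
  EOT
}
```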

After the creation of our EC2 instances, we finally create an Application Load Balancer for our Latest News API, the ALB’s Listeners and the Target Group that it will forward the requests to.
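A sketch of those load balancing resources, assuming the API listens on port 8080 behind an internal ALB (target group attachments for the API instances are omitted for brevity):

```hcl
resource "aws_lb" "latest_news_api" {
  internal           = true
  load_balancer_type = "application"
  subnets            = aws_subnet.private[*].id
}

resource "aws_lb_target_group" "api" {
  port     = 8080
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id
}

resource "aws_lb_listener" "api" {
  load_balancer_arn = aws_lb.latest_news_api.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }
}
```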

Our Ansible playbooks

Before we run our code, let’s have a quick look at the Ansible playbooks that are called from our Terraform script. In our ansible folder, we have backend.yml (for the API) and frontend.yml (for the website). We have also created a few roles, in order to better structure our code.

The backend playbook

What the playbook for our backend servers does is simply install:

  • Java 8
  • Git (to clone the news-api repo)
  • Maven (to build and install the app)

and then creates the appropriate systemd unit file, reloads systemd, and runs the Java app as a service.
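The steps above might be sketched roughly like this (package, unit, and service names are assumptions; in the actual repository this logic is split across roles):

```yaml
# backend.yml — configure an API server and run the app as a service
- hosts: all
  become: yes
  tasks:
    - name: install Java 8, Git and Maven
      apt:
        name: [openjdk-8-jdk, git, maven]
        update_cache: yes

    - name: create the systemd unit file for the API
      template:
        src: news-api.service.j2
        dest: /etc/systemd/system/news-api.service

    - name: reload systemd and run the Java app as a service
      systemd:
        name: news-api
        state: started
        enabled: yes
        daemon_reload: yes
```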

The frontend playbook

On the other hand, the playbook for our frontend servers (the website) installs the apache2 web server…

…copies the html, js and css files under its document root (/var/www/html), and restarts the web server.
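A sketch of that playbook, assuming the website files sit in a local files/ directory:

```yaml
# frontend.yml — install apache2 and deploy the static website
- hosts: all
  become: yes
  tasks:
    - name: install the apache2 web server
      apt:
        name: apache2
        update_cache: yes

    - name: copy the html, js and css files to the document root
      copy:
        src: "{{ item }}"
        dest: /var/www/html/
      with_fileglob:
        - files/*

    - name: restart the web server
      service:
        name: apache2
        state: restarted
```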

Running our code

Now that we have seen how our Terraform code is structured and what it does, it’s time to put it to the test. Although we could run everything with just one click by executing our main Ansible script (create-staging.yml), we’ll comment out its last part (where it executes the Terraform code) and run Terraform using its CLI. This way we can have a better look at the logs that Terraform prints during the execution of the code.

...
- name: init terraform
  shell: terraform init
  args:
    chdir: terraform/
# - name: apply terraform script
#   terraform:
#     project_path: terraform/
#     state: present
Running our main Ansible play

At our project’s root level, we invoke create-staging.yml (with no inventory passed)

$ ansible-playbook create-staging.yml

Then, we cd into the terraform folder, and just before we run the code, we can validate it and see the execution plan.

$ terraform validate
$ terraform plan

After the plan has been evaluated, you’ll be able to see the resources that will be created, changed and destroyed when the code is executed.

Finally we run our Terraform code, using

$ terraform apply --auto-approve

at which point Terraform starts provisioning the resources…

…connects to the created instances using the local-exec provisioner…

…runs the appropriate Ansible playbooks on the servers…

…and finally displays the values defined in our outputs.tf file

Checking the end result

All scripts have run successfully, and it’s now time to check our website. We copy the news_website_address from the Outputs list, paste it into a web browser, and…

Voila! It works!! I’m located in Athens, so when I visit the website, I get to see the latest news from various Greek websites. :)

Cleaning up

After you’ve had a chance to play with the website, or to ssh to the created servers and check that they are configured as expected, don’t forget to destroy the created environment, so you don’t get charged for resources that you’ll no longer be using. Just run:

$ terraform destroy

Besides, you can now re-create the environment with a single click, whenever you need to do so.

I hope the above demo is helpful, thank you for reading!



Software Architect, Cloud enthusiast | Certified PSM