Automated Consul cluster installation

With Packer/Terraform/Ansible on GCP

Larbi Youcef Mohamed Reda
alter way
Mar 26, 2020


In this article, I am going to show my personal method for easily installing a Consul cluster on GCP (without Kubernetes), using Packer, Terraform and Ansible.

What is Consul?

Consul is a service discovery and service mesh tool. With it, we can register services to make them discoverable from anywhere in a datacenter, monitor them (health checking), store their configuration (key-value store) and create connections between services (service mesh with Consul Connect). This is the tool we are going to install.
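For example, a service is typically registered by dropping a small JSON definition into the agent's configuration directory and reloading the agent. The definition below is purely illustrative (the service name, port and health check endpoint are made up):

{
  "service": {
    "name": "web",
    "port": 80,
    "check": {
      "http": "http://localhost:80/health",
      "interval": "10s"
    }
  }
}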

What is Packer?

Packer is a tool for creating machine images easily and automatically. We will use it to build an “out of the box” image that contains a configured Consul server, ready to run.

What is Ansible?

Ansible is an IT automation tool with which we can automate software deployment and system configuration. We will use it to provision the Packer image: installing Consul along with its configuration.

What is Terraform?

Terraform is an Infrastructure as Code tool. With it, we can quickly create and manage cloud infrastructure on different providers such as GCP, Azure or AWS. We will use it to actually create the instances, with Consul installed and ready to join the cluster.

Architecture overview

Here is an illustration of Consul’s architecture:

Consul Architecture

As you can see, there are two types of nodes in a cluster: servers and clients. Servers manage the cluster’s state and data. Clients run alongside services and register them in the cluster.

For this article, we will focus only on the Consul servers part. The installation and configuration are almost the same for Consul clients.
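To give an idea of the difference, here is roughly how the two agent modes are started (the flag values below are illustrative, not the ones used in this article):

# Server mode: takes part in leader election and stores the cluster state
consul agent -server -bootstrap-expect=3 -data-dir=/var/consul -config-dir=/etc/consul.d

# Client mode: lightweight agent that registers local services and forwards queries to the servers
consul agent -data-dir=/var/consul -config-dir=/etc/consul.d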

Consul servers rely on two main protocols: a gossip protocol (Serf) to keep track of cluster membership, and the Raft consensus protocol to elect a leader and replicate the cluster state across servers.

To make a cluster, nodes must join each other, either manually or automatically.
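A manual join is just one command run from any node, pointing at another member's address (the IP below is an example):

# Join an existing member manually
consul join 10.132.0.2

# Check that all nodes see each other
consul members

Doing this by hand for every new instance does not scale well, which is exactly what the Cloud Auto-join feature described later solves.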

Goal

Starting from scratch, the goal is to quickly instantiate a Consul cluster with the number of nodes of our choice. To do that, we will split the installation into two parts:

  • Creation of the consul image (Packer/Ansible).
  • Instantiation of our cluster (Terraform).

The cloud provider I chose to demonstrate this method is GCP.

Note: the exact same thing can be done on AWS, Azure or any other cloud provider. On AWS and Azure, though, Consul is also directly available as a managed service.

Creating the consul image

The first thing we are going to do is create the Consul image for our instances, using Packer and Ansible. Let’s create our Packer template:

{
  "variables": {
    "gcp_account_file": "{{env `GOOGLE_APPLICATION_CREDENTIALS`}}"
  },
  "builders": [
    {
      "type": "googlecompute",
      "account_file": "{{user `gcp_account_file`}}",
      "project_id": "mlarbi-awcc",
      "source_image": "debian-10-buster-v20200210",
      "ssh_username": "packer",
      "zone": "europe-west1-c",
      "image_description": "Consul ready image",
      "image_name": "consul-server-image"
    }
  ],
  "provisioners": [
    {
      "type": "ansible",
      "playbook_file": "../../install.yml",
      "galaxy_file": "../../requirements.yml",
      "galaxy_force_install": true,
      "extra_arguments": [
        "--extra-vars", "consul_cloud_provider=gcp",
        "--extra-vars", "consul_gcp_provider=gce",
        "--extra-vars", "consul_gcp_tag=consul",
        "--extra-vars", "consul_mode=server"
      ]
    }
  ]
}

Note: I use the GOOGLE_APPLICATION_CREDENTIALS environment variable to store the path to the credentials file.

As you can see, we use Debian 10 as our base image and rely on Ansible to provision it with everything we need to run a Consul server in cluster mode. Now let’s take a look at our Ansible playbook.

- hosts: all
  become: yes
  roles:
    - consul-server

This playbook uses an Ansible role I wrote called “consul-server”. For more details, you can check out my personal GitLab repository. Here are the tasks of this role:

- name: apt-get update
  apt:
    update_cache: yes
  become: yes

- name: Ensure that zip and unzip are installed
  apt:
    pkg:
      - zip
      - unzip
    state: present
  become: yes

- name: Ensure that the consul package {{ consul_download_url }} is downloaded to {{ consul_installation_path }}
  get_url:
    url: "{{ consul_download_url }}"
    dest: "{{ consul_installation_path }}"
    sha256sum: "{{ consul_sha256_checksum }}"

- name: Extract the consul package {{ consul_installation_path }} to /usr/bin
  unarchive:
    src: "{{ consul_installation_path }}"
    dest: "/usr/bin"
    remote_src: yes
  become: yes

- name: Ensure that {{ consul_config_path }} exists
  file:
    path: "{{ consul_config_path }}"
    state: directory

- name: Ensure that {{ consul_data_path }} exists
  file:
    path: "{{ consul_data_path }}"
    state: directory

- name: Copy the config-{{ consul_mode }}.json.j2 into "{{ consul_config_path }}/config.json"
  template:
    src: "config-{{ consul_mode }}.json.j2"
    dest: "{{ consul_config_path }}/config.json"

- name: Copy the cloud_auto_join_{{ consul_cloud_provider }}.json.j2 into "{{ consul_config_path }}/cloud_auto_join.json"
  template:
    src: "cloud_auto_join_{{ consul_cloud_provider }}.json.j2"
    dest: "{{ consul_config_path }}/cloud_auto_join.json"

- name: Copy the consul.service.j2 into /etc/systemd/system/consul.service
  template:
    src: consul.service.j2
    dest: /etc/systemd/system/consul.service

- name: Ensure that the consul.service is enabled
  systemd:
    name: consul.service
    enabled: yes
So basically, this role installs Consul, creates a configuration directory and a data directory, drops in a simple configuration file (config.json) and a cloud_auto_join.json file containing the Cloud Auto-join configuration, and installs and enables a systemd unit for the Consul service.
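The consul.service.j2 template is not reproduced in this article, but a minimal sketch of what such a unit could look like, assuming the binary in /usr/bin and the config directory variable used by the role, is:

[Unit]
Description=Consul agent
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/bin/consul agent -config-dir={{ consul_config_path }}
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target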

What is Cloud Auto-join?

Consul supports a very interesting feature called Cloud Auto-join. As mentioned earlier, nodes need to join each other to form a cluster. In a classic approach, we would include the IP address of each Consul server in the configuration, which can be very tricky in a dynamic environment.

With this feature, new nodes use cloud metadata to discover other Consul nodes and automatically join them. This makes the configuration of our Consul cluster much easier.

I created one template per cloud provider containing the Cloud Auto-join configuration:

{
  "retry_join": [
    "provider={{ consul_gcp_provider }} tag_value={{ consul_gcp_tag }}"
  ]
}

With this configuration, Consul is able to discover every GCP instance that has the tag {{ consul_gcp_tag }} (in my example, the value is “consul”). If Consul finds a Consul server on one of these instances, it will automatically join it. We just need to make sure to put this tag on our instances in the Terraform configuration.
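For reference, the AWS version of this template relies on the same mechanism with a different provider string; it would look something like this (the tag key and value here are illustrative):

{
  "retry_join": [
    "provider=aws tag_key=consul tag_value=consul"
  ]
}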

Now we can build our image with Packer.
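In practice, the build boils down to a couple of commands (assuming the template above is saved as consul-server.json, which is my naming, not the article's):

# Point Packer at the GCP service account key used by the template
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json

# Validate the template, then build the image
packer validate consul-server.json
packer build consul-server.json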

Instantiating our cluster with Terraform

Now that we have our image, we can quickly create instances with a running Consul service on them. Thanks to the Cloud Auto-join feature, each time we create a new instance, it is automatically added to the cluster. As mentioned earlier, we will use Terraform for that purpose.

variables.tf:

variable "consul_server_count" {
type = number
default = 3
}
variable "region" {
type = string
default = "europe-west1-c"
}

As you can see, we can configure the number of nodes of our cluster in this file via the consul_server_count variable.

main.tf:

resource "google_compute_firewall" "consul-firewall" {
name = "consul-firewall"
network = "default"
allow {
protocol = "icmp"
}
allow {
protocol = "tcp"
ports = ["22", "8600", "8500", "8300", "8301", "8302"]
}
allow {
protocol = "udp"
ports = ["8600", "8301", "8302"]
}
target_tags = ["consul"]
}
resource "google_compute_instance_group_manager" "consul-cluster" {
name = "consul-cluster-igm"
base_instance_name = "consul-server"
zone = var.region
version {
instance_template = google_compute_instance_template.consul-server.self_link
}
target_size = var.consul_server_count
}
resource "google_compute_instance_template" "consul-server" {name = "consul-server-template"
machine_type = "n1-standard-1"
disk {
source_image = "consul-server-image"
auto_delete = true
boot = true
}
network_interface {
network = "default"
access_config {
nat_ip = ""
}
}
tags = ["consul"]service_account {
scopes = ["compute-ro"]
}
}

So we create an instance group that will contain our Consul instances. These instances are created from an instance template that uses the Consul image we built previously.

As you can see, we also add the “consul” tag to our instance template. This tag is what Consul uses to discover the other Consul servers in our project.

We can now create our Consul cluster with Terraform.
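In practice, this amounts to the usual Terraform workflow, followed by a quick membership check once the instances are up (the variable override is optional):

# Initialize the working directory and create the cluster
terraform init
terraform apply

# Optionally override the cluster size at apply time
terraform apply -var="consul_server_count=5"

# From any of the instances, verify that the nodes joined each other
consul members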

Alternative way

This is my personal method to quickly and easily install a Consul cluster on one of the three major cloud providers (without Kubernetes).

However, HashiCorp maintains Terraform modules to create a Consul cluster on GCP. Another way would be to simply use their module:

module "consul_consul-cluster" {
source = "hashicorp/consul/google//modules/consul-cluster"
version = "0.4.0"
gcp_project_id = "mlarbi-awcc"
gcp_region = "europe-west1-c"
machine_type = "n1-standard-1"
cluster_name = "consul-cluster"
cluster_size = "3"
source_image = "consul-image"
startup_script = ""
cluster_tag_name = "consul"
}

Conclusion

That concludes the automated installation of our Consul cluster. With Ansible, we can easily customize the installation and deploy it on a cloud provider with powerful tools like Packer and Terraform. Consul itself makes forming a cluster on a cloud provider much easier thanks to the Cloud Auto-join feature.

You can find everything needed for this installation in my personal GitLab repository.
