Introduction to Terraform with DigitalOcean

Terraform has excellent support for a wide range of providers, which makes it an invaluable tool for provisioning resources. It isn’t limited to compute resources, either: it can be used to manage all parts of your infrastructure declaratively (DNS, storage, networking, etc.). This tutorial, however, will focus on declaring compute resources (at the end of this post, I’ll include examples of Terraform managing other parts of infrastructure state).

Many public cloud providers also offer metadata services that make post-provisioning installation a snap and, for most users, reduce the need for post-provisioning configuration via whatever configuration management suite your org may use. I mention this because provisioning and configuration management are two separate, distinct processes, but the former can absorb tasks that the latter would otherwise manage (unnecessarily), and this division can cut both ways.

Let’s take the example of a CoreOS Container Linux cluster. Parts of this process rely on old, possibly unsupported features, so it likely won’t be useful in practice for this exact use case, but it demonstrates nicely how Terraform can add value. Consider what should live where if your components are:

  1. The OS
  2. The network overlay
  3. The container runtime
  4. Your application and configurations supporting it

The conventional wisdom is that items 1–3 can be managed through provisioning, and item 4 through configuration management and your deployment/build-release pipeline. Item 3 is your transition point: this is where you begin to consider whether something is an ongoing, stateful task, something that only needs to be configured once and maintained going forward, or something that requires active management from the first interaction with your environment’s automation.

Moving on to actually provisioning the cluster, let’s think about what might normally be required if, for example, our Terraform provider were DigitalOcean:

  1. Create the droplet
  2. Provide a cloud-init script that bootstraps the services connecting the nodes (on older, fleet-centric versions of the OS; the specific version doesn’t actually matter, as this is primarily a demonstration of a concept rather than a time-specific service outline).

So, with that in mind, our first piece is going to be a shared piece of state: your etcd discovery token. You’ll start your Terraform repo with a bash script that provides you with a token:

#!/bin/bash
# Request a new etcd discovery URL and substitute it into the user-data template.
DISCOVERY=$(curl -s https://discovery.etcd.io/new)
sed "s,DISCOVERY_URL,$DISCOVERY,g" sample.user_data.yaml > user_data.yaml
printf 'Copy and Paste this Discovery URL into your terraform.tfvars file:\n\n %s\n\n i.e.\n\n discovery_url = "%s"\n' "$DISCOVERY" "$DISCOVERY"
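If you want to see what that sed substitution does before wiring in a real token, here’s a minimal sketch using a stand-in template and a made-up discovery URL (both are illustrative; the real script pulls the URL from discovery.etcd.io):

```shell
# Stand-in for sample.user_data.yaml, containing the DISCOVERY_URL placeholder
printf 'discovery: DISCOVERY_URL\n' > sample.user_data.yaml

# Made-up token; the bootstrap script gets this from https://discovery.etcd.io/new
DISCOVERY="https://discovery.etcd.io/6a28e078895c5ec737174db2419bb2f3"

# The same substitution the script performs
sed "s,DISCOVERY_URL,$DISCOVERY,g" sample.user_data.yaml > user_data.yaml
cat user_data.yaml
```

The comma-delimited `s,,,g` form is used because the replacement is a URL containing slashes, which would otherwise terminate a `s///` expression early.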

The terraform.tfvars file is optional (you can have Terraform prompt you for these values instead), but this example will use a config file in this format:

digitalocean_token = "<Your DO API Key>"
region = "NYC1"
size = "4GB"
cluster_name = "digitalocean-coreos"
cluster_size = "5"
discovery_url = "https://discovery.etcd.io/<Your Token Generated Above>"
ssh_key_path = "./do-key"
ssh_public_key_path = "./do-key.pub"
ssh_key_fingerprint = "<Your SSH Key Fingerprint>"
user_data_path = "./user_data.yaml"

Take note of the cluster_size parameter; this controls how many droplets are created. Adjust it as needed, and you’ll see the change applied dynamically to the Terraform state.
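Each of the values above also needs a corresponding variable declaration in your Terraform scripts, or Terraform will reject the tfvars file. A minimal sketch of such a variables.tf (declarations only, no defaults) might look like:

```hcl
# variables.tf -- declarations matching the terraform.tfvars keys above
variable "digitalocean_token" {}
variable "region" {}
variable "size" {}
variable "cluster_name" {}
variable "cluster_size" {}
variable "discovery_url" {}
variable "ssh_key_path" {}
variable "ssh_public_key_path" {}
variable "ssh_key_fingerprint" {}
variable "user_data_path" {}
```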

You’ll notice I also reference user_data.yaml; this will be your cloud-init script for the cluster. You can generate this dynamically in the Terraform script itself (where your etcd token is just passed into the string, as you’ll see in the next section), but for the sake of clarity, you’ll generate it manually here:

#cloud-config
coreos:
  etcd2:
    discovery: <Your Discovery URL from the above script>
    advertise-client-urls: http://$private_ipv4:2379,http://$private_ipv4:4001
    initial-advertise-peer-urls: http://$private_ipv4:2380
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://$private_ipv4:2380
  fleet:
    public-ip: $private_ipv4
  units:
    - name: etcd2.service
      command: start
    - name: fleet.service
      command: start

So, with this groundwork in place, let’s move onto the script itself, coreos.tf:

provider "digitalocean" {
  token = "${var.digitalocean_token}"
}

resource "digitalocean_droplet" "core" {
  name               = "${format("${var.cluster_name}-%02d", count.index)}"
  image              = "coreos-stable"
  size               = "${var.size}"
  count              = "${var.cluster_size}"
  private_networking = true
  region             = "${var.region}"
  user_data          = "${file(var.user_data_path)}"
  ssh_keys           = ["${var.ssh_key_fingerprint}"]

  connection {
    user        = "core"
    private_key = "${file(var.ssh_key_path)}"
  }
}

You’ll notice the count variable is there, referencing the value set in the tfvars file. When you destroy or re-apply the configuration, that value can be adjusted to scale up or down. So, if that value is set to 3, for example, your cluster will be provisioned as digitalocean-coreos-00 through digitalocean-coreos-02 (count.index starts at zero), and it can be resized without modifying the Terraform script itself.
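The name attribute above uses a printf-style %02d format; a quick shell sketch shows how the droplet names come out for a cluster_size of 3:

```shell
# Mimic Terraform's format("${var.cluster_name}-%02d", count.index)
# for a cluster_size of 3; count.index runs 0, 1, 2.
cluster_name="digitalocean-coreos"
for i in 0 1 2; do
  printf '%s-%02d\n' "$cluster_name" "$i"
done
# prints digitalocean-coreos-00 through digitalocean-coreos-02
```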

I mentioned being able to generate user_data dynamically, and that’s certainly possible: in coreos.tf, you might set user_data to the contents of your user_data.yaml inline, rather than keeping it in its own file, and replace the discovery URL with something like ${var.discovery_url}, defining discovery_url in your tfvars file. That way, when you update the URL, you can apply it to your entire Terraform-managed environment at once, updating only the vars, not the scripts or supporting configurations.
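As a sketch of that approach, the droplet’s user_data attribute could carry the cloud-config as a heredoc, with the discovery URL interpolated from the variable (the body mirrors the user_data.yaml above; the $private_ipv4 placeholders pass through untouched, since Terraform only interpolates ${...} sequences):

```hcl
  # Inline user_data, with the discovery URL taken from terraform.tfvars
  user_data = <<USERDATA
#cloud-config
coreos:
  etcd2:
    discovery: ${var.discovery_url}
    advertise-client-urls: http://$private_ipv4:2379,http://$private_ipv4:4001
    initial-advertise-peer-urls: http://$private_ipv4:2380
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://$private_ipv4:2380
  fleet:
    public-ip: $private_ipv4
  units:
    - name: etcd2.service
      command: start
    - name: fleet.service
      command: start
USERDATA
```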

One final touch I like to include is an output.tf to provide whatever supporting information I may find useful after each run; in this case, the IPs of the droplets just created:

output "CoreOS Cluster IPs" {
  value = "${join(",", digitalocean_droplet.core.*.ipv4_address)}"
}

My point about configuration management being a poor substitute for provisioning and bootstrapping (and the inverse) is that, at this point, the tasks become very different: do you really want to re-provision servers on each code deploy? Do you want to deploy code every time you make a change to infrastructure? Sometimes these circumstances are unavoidable, but the tasks are different and can be managed as such. There are many alternatives to this paradigm, some more mature in one realm or the other, but good planning can make changes to either side of this equation much simpler. From here, you can use your configuration management and deploy tools to push apps, pull images from a registry, or do whatever else happens above the infrastructure (compute and network, in this case) level.

Additional Resources

Like I said, Terraform can be used to manage various parts of infrastructure state, and I’ll include a few repositories where I did just that. I recommend reading through the scripts in these repos to see how the components fit together (start with the Terraform scripts themselves, and branch out as resources are referenced!), and consider how provisioning and bootstrapping tools can assist in your environment.

Compute & importing supporting configs for bootstrapping infrastructure:

https://gitlab.com/jmarhee/docker-swarm-digitalocean-terraform

Compute & More dynamic user-data usage for cloud-init:

Compute & Block Storage resources: