Everything as Code: Part 1 — DNS records

This is the first article in a series in which we present the tools we use at MindGeek Canada to manage our infrastructure as code.

We started using tools to manage our infrastructure as code a little over a year ago. At first, it seemed a little too complex for some of our engineers compared to using the web-UI, but we quickly realized the value it brings.
Would it surprise you if we told you that our main website had no versioning system at all and that all our developers were coding directly on the production servers?
This would be madness, especially considering our main website has to handle 100M unique visitors every day.

[Image: Use VI directly on the production server]

But when we talk about infrastructure, most people are OK using a web-UI to deploy, reboot or update the settings of a server.

There is a way to deploy and even revert your infrastructure, the same way you deploy your web or mobile application!

[Image: I’m not lazy, I’m just very relaxed]

A few months ago we wrote about OctoDNS, so this article will be a summary of how we use this tool and what it means for us to manage our DNS records with YAML files.

First, we will start with a quick presentation of our DNS setup: we have around 10 000 zones to manage, with some of them having almost 1 000 records. If we keep growing at the same pace, in 3 years we expect to have around 15 000 zones to manage. To host all these zones, we use three different providers: Amazon Route53, DynDNS and UltraDNS.

Before we moved to a centralized system, each provider was handling some of our zones, and we had set up AXFR replication to keep all of them in sync. We had Python scripts to add, update or delete zones and records for each provider, and the DNS zone transfer logic was handling the rest.

Dealing with each provider and hoping the replication doesn’t break sometimes feels like moving a washer/dryer combo using a skateboard over 2 kilometers. At first, you think it is a good idea, and then you quickly realize something will go wrong but you’re not sure when and how bad it will be when it happens.

Because DNS is an essential piece of the web business, we wanted to find a better way. We thought that if we can use a versioning server for our web application, we should be able to use one for our infrastructure too. This way, we would be able to deploy our changes and revert them if something goes wrong.

OctoDNS was the key to a centralized and organized DNS record manager:

  • All our zones and records are committed in one git repository
  • We deploy the changes using a deployment tool and we receive notifications in HipChat and JIRA about each deployment
  • We can revert changes if ever needed
  • We can add or remove a DNS provider very easily
  • We have a history of all the changes applied

Deploy OctoDNS to manage your DNS records

OctoDNS is written in Python, making it quite easy to set up. We also provide a Dockerfile if it is easier for you: https://github.com/MindGeekOSS/octodns/blob/master/Dockerfile

Only a few steps are required:

  • Install Docker and clone our GitHub repository
  • Build the Docker image
  • Start writing the YAML files representing your DNS zones

The main record types (A, AAAA, CNAME, SRV, MX, NS, and more) are all supported. If you need a type that is not currently supported, a couple of lines of Python should be enough to add it.

# Regular zone
---
? ''
: - ttl: 3600
    type: A
    value: 1.2.3.4
  - ttl: 3600
    type: MX
    values:
    - exchange: mxa-77.email-provider.com.
      preference: 10
    - exchange: mxa-42.email-provider.com.
      preference: 20
  - ttl: 86400
    type: NS
    values:
    - ns1.provider1-sdns.net.
    - ns2.provider1-sdns.net.
    - ns3.provider1-sdns.net.
    - ns4.provider1-sdns.net.
    - ns1.provider2-sdns.net.
    - ns2.provider2-sdns.net.
    - ns3.provider2-sdns.net.
    - ns4.provider2-sdns.net.
# A record with multiple values
roundrobin:
  ttl: 3600
  type: A
  values:
  - 2.3.4.5
  - 3.4.5.6
  - 4.5.6.7
  - 5.6.7.8
  - 6.7.8.9
  - 7.8.9.10
  - 8.9.10.11
# CNAME record
www:
  ttl: 10800
  type: CNAME
  value: example.com.

OctoDNS also has built-in support for health checks and geo-distributed records:

---
? ''
: - ttl: 3600
    type: A
    value: 1.2.3.4
  - ttl: 86400
    type: NS
    values:
    - ns1.provider1-sdns.net.
    - ns2.provider1-sdns.net.
    - ns3.provider1-sdns.net.
    - ns4.provider1-sdns.net.
    - ns1.provider2-sdns.net.
    - ns2.provider2-sdns.net.
    - ns3.provider2-sdns.net.
    - ns4.provider2-sdns.net.
# Geo-distributed CNAME record
geobalanced:
  geo:
    A1:
    - us-east-1.example.com.
    A2:
    - us-east-2.example.com.
    A3:
    - us-east-3.example.com.
    AF:
    - eu-west-1.example.com.
    AN:
    - eu-west-1.example.com.
    AS:
    - eu-east-1.example.com.
    EU:
    - eu-west-1.example.com.
    NA:
    - us-east-1.example.com.
    OC:
    - us-west-1.example.com.
    SA:
    - sa-east-1.example.com.
  ttl: 600
  type: CNAME
  value: default-geo.example.com.
# A record with an HTTP health check and a backup address
chat:
  healthcheck:
    backup: 6.5.4.3
    host: chat.example.com
    interval: 900
    path: /
    port: 80
    retries: 2
    type: HTTP
  ttl: 600
  type: A
  value: 3.4.5.6

You can find more information on how to set up OctoDNS in its GitHub repository: https://github.com/github/octodns#getting-started
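The zone files only describe records. A separate OctoDNS configuration file says where those files live and which providers should receive each zone. The fragment below is a minimal sketch modeled on the upstream OctoDNS getting-started guide from that period, not our production file: the class paths, the ./config directory, the customer number, the user name and the environment variable names are all assumptions, and newer OctoDNS releases ship providers as separate packages with different class paths. UltraDNS is left out here; a third provider would be declared the same way.

```yaml
---
providers:
  # Source: the YAML zone files committed in the git repository
  # (the directory name is an assumption)
  config:
    class: octodns.provider.yaml.YamlProvider
    directory: ./config
  # Targets: hypothetical credentials, read from environment variables
  # through the env/ prefix so that no secret is committed
  route53:
    class: octodns.provider.route53.Route53Provider
    access_key_id: env/AWS_ACCESS_KEY_ID
    secret_access_key: env/AWS_SECRET_ACCESS_KEY
  dyn:
    class: octodns.provider.dyn.DynProvider
    customer: 1234
    username: octodns
    password: env/DYN_PASSWORD

zones:
  # One entry per zone: read from the YAML files, push to every target
  example.com.:
    sources:
      - config
    targets:
      - route53
      - dyn
```

With a layout like this, adding or removing a provider for a zone comes down to editing its targets list, and that edit goes through the same pull request and review as any record change. In upstream OctoDNS, running octodns-sync with --config-file pointing at this file prints the planned changes, and they are only applied when --doit is added.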

Once all zones and records have been added to OctoDNS, you can start deploying changes.

Pull Request to apply changes with OctoDNS

You can open a pull request in your versioning server, so your peers can review the changes before they get deployed.
If we try to deploy a zone with the config we saw earlier, OctoDNS first prints a change log listing the records it would create, update or delete.

Once the change-log has been verified, it is just a matter of hitting the merge button and the deployment process kicks in. 
We now feel a lot more confident when we apply DNS updates, as we can first visualize the changes, get them reviewed by a peer, and roll them back if needed.

Our process is the following:

  • Create a JIRA task to describe the changes
  • Code the changes and get them reviewed by a peer
  • Merge the changes into the master branch, and Bamboo deploys them

Conclusion

[Figure: Number of monthly changes deployed with OctoDNS]

Using centralized YAML files to manage our records makes it easy for our developers to open a pull request directly instead of waiting for DevOps to make a change.
It also makes it a lot easier for the DevOps engineers to manage a large number of zones over time.

DNS as code with OctoDNS offers centralization, versioning, history, and visibility for every change deployed.

In the next part, we will talk about how we applied the same principle to our server infrastructure with Terraform.
