Deploying Kubernetes clusters with kops and Terraform

At Bench, we use Kubernetes on AWS to run our production microservices. Something that we learned along the way is that Kubernetes by itself won’t get you very far. We had to figure out how to integrate our Kubernetes clusters with the rest of our codified AWS resources such as VPCs, RDS, DNS records, storage, access controls and so on. Thanks to kops and its excellent integration with Terraform, we are able to provision and manage our Kubernetes clusters in the same way that we manage the rest of our AWS infrastructure: using Terraform commands, remote state backends, and version control.

Before delving deep, let’s illustrate the workflow:

The idea is to use existing AWS resources to create a cluster.yaml template which will contain all of the resource IDs that kops needs to create a Kubernetes cluster. You might be wondering: why not use the imperative kops commands to create the Kubernetes cluster as well as the VPC and any other required resources? While this is a valid approach, we want to manage all these building blocks independently and be able to change AWS components without having to alter our Kubernetes cluster definitions, and vice versa. We also prefer a declarative coding style to manage our infrastructure, so being able to declare the state of our cluster in YAML files fits well with our existing workflows.

Let’s get started. To demonstrate how this works, I will create a Kubernetes cluster with kops using values from AWS resources managed by Terraform. If you want to follow along, I recommend cloning the GitHub repo and installing the following command-line tools:

  • kops
  • terraform
  • kubectl
  • jq

If you are using macOS and Homebrew, you can run brew install kops terraform kubectl jq to install them all in one go.
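Before going further, it can help to confirm that all four tools are actually on your PATH. The helper below is a small sketch of my own (the name check_tools is hypothetical and not part of the repo); it only relies on the standard command -v builtin.

```shell
# Report any of the given command-line tools that are not on the PATH.
# Returns 0 when every tool is found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_tools kops terraform kubectl jq
```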

Creating base AWS resources using Terraform

Here are the AWS resources that I’m about to launch using Terraform:

  • AWS VPC and related network infrastructure (Subnets, NAT Gateways, IGW, etc)
  • Security group for the API server.
  • S3 bucket for the kops state.

Let’s start with main.tf, which contains generic config settings and local variables. Note that we’ve created an S3 remote state backend to remotely store and manage our tfstate. If you are using this in production, I recommend setting the dynamodb_table field to ensure state locking and consistency.

The local variables that you’d need to change are:

  • region: The AWS region you want the resources launched on.
  • backend: S3 bucket and prefix where you want the tf state stored.
  • azs: The Availability Zones within the region that the VPC will span across.
  • environment: String with the environment that the resources belong to, for example, “test”.
  • kops_state_bucket_name: Name of the S3 bucket that will store the kops state
  • kubernetes_cluster_name: Fully qualified domain name of the Kubernetes cluster. Will be used by kops as the cluster DNS name, for example, to host the API server endpoint. Usually maps to a Route53 hosted zone name. You could also use a private hosted zone here.
  • ingress_ips: Array of strings with any CIDR ranges that you want to allow access to your cluster.
  • vpc_name: Name of the VPC, for example, “test-vpc”.
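To make the list above concrete, here is a rough sketch of what such a main.tf could look like. It is not the file from the repo: every value (bucket names, region, domain, CIDR range) is a placeholder I made up, and the syntax assumes Terraform 0.12 or newer. The snippet writes the file into a scratch terraform-sketch directory so it cannot overwrite anything you already have.

```shell
mkdir -p terraform-sketch

# Sketch only: all names and values below are placeholders.
cat > terraform-sketch/main.tf <<'EOF'
terraform {
  # Backend blocks cannot reference locals, so these values are literals.
  backend "s3" {
    bucket = "example-tfstate-bucket"      # hypothetical bucket
    key    = "kops-demo/terraform.tfstate" # hypothetical prefix
    region = "us-west-2"
    # dynamodb_table = "example-tf-locks"  # uncomment for state locking
  }
}

provider "aws" {
  region = local.region
}

locals {
  region                  = "us-west-2"
  azs                     = ["us-west-2a", "us-west-2b", "us-west-2c"]
  environment             = "test"
  kops_state_bucket_name  = "example-kops-state"
  kubernetes_cluster_name = "k8s.example.com"
  ingress_ips             = ["203.0.113.0/24"]
  vpc_name                = "test-vpc"
}
EOF
```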

For the VPC-related resources, I decided to use the official Terraform aws-vpc module just to keep things simple, but any Terraform VPC configuration will work. In this case, I went for a VPC with 6 subnets across 3 AZs: 3 private and 3 public. The module automatically handles the creation of NAT Gateways, route tables, IGW and so on:
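As an illustration, a module call along these lines would produce that layout. Treat it as a sketch under assumptions: I am assuming the module published as terraform-aws-modules/vpc/aws, the CIDR ranges are arbitrary examples, and the local.* names refer to local variables like the ones listed earlier.

```shell
mkdir -p terraform-sketch

# Sketch only: CIDR ranges are arbitrary, local.* values are assumed to exist.
cat > terraform-sketch/vpc.tf <<'EOF'
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = local.vpc_name
  cidr = "10.0.0.0/16"
  azs  = local.azs

  # One private and one public subnet per availability zone.
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway = true

  tags = {
    Environment = local.environment
  }
}
EOF
```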

The last resources we need to provision are the Kubernetes API ELB security group and the kops state S3 bucket. We could manage these in the same template as our kops cluster, but we decided to manage them separately to avoid circular dependencies with the final Kubernetes cluster template.
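A minimal version of those two resources might look like the following. This is my own sketch rather than the repo's definition: the resource names (k8s_api, kops_state) are hypothetical, the rule assumes the API ELB is reached over HTTPS on port 443, and it assumes a module named vpc plus the local variables described earlier.

```shell
mkdir -p terraform-sketch

# Sketch only: resource names are hypothetical.
cat > terraform-sketch/kops_resources.tf <<'EOF'
# Security group for the API ELB: HTTPS from the allowed CIDR ranges only.
resource "aws_security_group" "k8s_api" {
  name   = "${local.environment}-k8s-api"
  vpc_id = module.vpc.vpc_id

  ingress {
    protocol    = "tcp"
    from_port   = 443
    to_port     = 443
    cidr_blocks = local.ingress_ips
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Bucket that holds the kops state.
resource "aws_s3_bucket" "kops_state" {
  bucket = local.kops_state_bucket_name
}
EOF
```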

The last and very important part of our Terraform definition is the outputs. These will be used by kops and other Terraform projects to retrieve resource IDs and any settings resulting from this template.
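For example, an outputs file could look roughly like this. The output names kubernetes_cluster_name and kops_s3_bucket are the ones the shell commands later in this post read with jq; the remaining output names, and the module and resource names they reference (module.vpc, aws_s3_bucket.kops_state), are assumptions made for this sketch.

```shell
mkdir -p terraform-sketch

# Sketch only: apart from kubernetes_cluster_name and kops_s3_bucket,
# the output names are hypothetical.
cat > terraform-sketch/outputs.tf <<'EOF'
output "kubernetes_cluster_name" {
  value = local.kubernetes_cluster_name
}

output "kops_s3_bucket" {
  value = aws_s3_bucket.kops_state.bucket
}

output "vpc_id" {
  value = module.vpc.vpc_id
}

output "availability_zones" {
  value = local.azs
}

output "private_subnet_ids" {
  value = module.vpc.private_subnets
}

output "public_subnet_ids" {
  value = module.vpc.public_subnets
}
EOF
```

Running terraform output -json against a configuration like this wraps every output in an object with a value key, which is why the jq filters and template fields in this post end in .value.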

Feel free to run terraform init, terraform plan and terraform apply.

Templating our Kubernetes cluster definition

The idea is to use our Terraform outputs to construct a cluster definition. For this, we will use a very handy templating tool from kops based on the Go template package.

First of all, we need a cluster template:

The above is a templated kops cluster definition with a baseline setup to create a Kubernetes cluster with 5 nodes running. We will pass it to the kops toolbox template command in order to replace the fields with the Terraform outputs. Ideally, this will be run from within a different directory to avoid mixing up projects. There is an example in the GitHub repository.
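To show how the placeholders line up with the Terraform outputs, here is a small fragment in the same style. It is a sketch, not the full template from the repository: it is not a complete cluster spec, the apiVersion may differ between kops releases, and the field names vpc_id, private_subnet_ids and availability_zones are hypothetical Terraform output names. Each placeholder ends in .value because that is how terraform output -json nests each output.

```shell
mkdir -p kops-sketch

# Sketch only: an incomplete fragment to illustrate the placeholder syntax.
# The quoted 'EOF' stops the shell from expanding $ and {{ }} in the body.
cat > kops-sketch/cluster-template-fragment.yaml <<'EOF'
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  name: {{.kubernetes_cluster_name.value}}
spec:
  configBase: s3://{{.kops_s3_bucket.value}}/{{.kubernetes_cluster_name.value}}
  networkID: {{.vpc_id.value}}
  subnets:
  {{range $i, $id := .private_subnet_ids.value}}
  - id: {{$id}}
    name: {{index $.availability_zones.value $i}}
    type: Private
    zone: {{index $.availability_zones.value $i}}
  {{end}}
EOF
```

The range loop emits one subnet entry per private subnet ID and looks up the matching availability zone by index, so the two output lists are assumed to be in the same order.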

The templating command would look like:

TF_OUTPUT="$(cd ../terraform && terraform output -json)"
CLUSTER_NAME="$(echo "${TF_OUTPUT}" | jq -r .kubernetes_cluster_name.value)"
kops toolbox template --name "${CLUSTER_NAME}" --values <(echo "${TF_OUTPUT}") --template cluster-template.yaml --format-yaml > cluster.yaml

The resulting cluster.yaml file is a fully fledged and usable kops cluster definition and could be applied using kops create. Instead, we are going to import the cluster.yaml definition into the kops S3 state. This makes the future workflow easier, as we plan to use the same commands on subsequent cluster updates:

STATE="s3://$(echo "${TF_OUTPUT}" | jq -r .kops_s3_bucket.value)"
kops replace -f cluster.yaml --state "${STATE}" --name "${CLUSTER_NAME}" --force

The --force flag makes kops replace create any objects that do not exist yet, which is what happens the first time the state is written.

The output should show the instanceGroup resources being created in the remote state.

Last, we will generate a Terraform file that will represent our cluster configuration:

kops update cluster --target terraform --state ${STATE} --name ${CLUSTER_NAME} --out .

You may need to run the following command to create a secret with your public SSH key due to this kops issue. You can still specify a different key pair for SSH access in the kops config though.

kops create secret --name ${CLUSTER_NAME} --state ${STATE} sshpublickey admin -i ~/.ssh/id_rsa.pub

You will see the full cluster spec being created in the S3 bucket. If all went well, a kubernetes.tf file should have been created. This contains all of the AWS infrastructure needed to run your cluster (think ASGs, ELBs, EBS volumes and so on). kops also generates the ASG launch configuration that the EC2 instances use to bootstrap Kubernetes on the nodes.

You are ready to run terraform init, terraform plan and terraform apply!

Testing the setup

If all goes well during terraform apply, your Kubernetes cluster should be up and running. In order to download the config for kubectl you can run:

kops export kubecfg --name ${CLUSTER_NAME} --state ${STATE}
kubectl config set-cluster ${CLUSTER_NAME} --server=https://api.${CLUSTER_NAME}

Assuming you added your public IP to the inbound security group of the Terraform VPC config, you should be able to reach the cluster. For example, run kops validate cluster --state ${STATE} to validate the cluster, or kubectl get pods --all-namespaces to see all of the kube-system pods.
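Because a freshly created cluster takes a few minutes to become healthy, the validation command may fail the first few times. A small retry helper can poll it for you. This is a generic sketch of my own (the name retry is hypothetical), and it assumes that kops validate cluster exits with a non-zero status while the cluster is not yet valid.

```shell
# Re-run a command until it succeeds, up to a maximum number of attempts.
# Arguments: <attempts> <delay-seconds> <command...>
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "gave up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (polls every 30 seconds, for roughly 10 minutes at most):
#   retry 20 30 kops validate cluster --name "${CLUSTER_NAME}" --state "${STATE}"
```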

Give it about 5 minutes to fully start up and then launch & jump into a busybox container:

kubectl run -i --tty busybox --image=busybox -- sh

Also, keep in mind that if you didn’t use a public Route53 zone, you may not be able to interact with the API ELB unless you are doing it from within the VPC.

Workflow

Let’s say we want to make a change to our cluster: maybe the number of nodes, the instance type, or some new kops setting taken from the kops cluster spec. It is as easy as changing or adding values in the cluster-template.yaml file and then running through the same commands, perhaps wrapped in a bash script or a Makefile (here is an example). This ultimately results in an updated kubernetes.tf. We can then run terraform plan and submit a PR, allowing you and the rest of the team to understand what changes would be made to your Kubernetes cluster before hitting terraform apply. This brings the huge benefit of treating changes to our Kubernetes clusters like any other infrastructure change, and integrating them with the rest of our tools and practices such as CI/CD, integration testing, replicated environments and so on.
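As a sketch of what such a wrapper could look like, the function below chains the templating, replace and update steps from earlier in this post. It is not the script from the repo: the function name regenerate_cluster is hypothetical, and instead of process substitution it writes the Terraform outputs to a temporary file so it also works in plain sh.

```shell
# Regenerate cluster.yaml and kubernetes.tf from the current Terraform outputs.
# Argument: directory holding the base Terraform project (default ../terraform).
regenerate_cluster() {
  tf_dir="${1:-../terraform}"
  values="$(mktemp)" || return 1

  # Capture the Terraform outputs once and reuse them for every step.
  (cd "$tf_dir" && terraform output -json) > "$values" || return 1
  cluster_name="$(jq -r .kubernetes_cluster_name.value "$values")"
  state="s3://$(jq -r .kops_s3_bucket.value "$values")"

  status=0
  kops toolbox template --name "$cluster_name" --values "$values" \
    --template cluster-template.yaml --format-yaml > cluster.yaml &&
    kops replace -f cluster.yaml --state "$state" --name "$cluster_name" --force &&
    kops update cluster --target terraform --state "$state" \
      --name "$cluster_name" --out . || status=$?

  rm -f "$values"
  return "$status"
}

# Usage, from the directory that contains cluster-template.yaml:
#   regenerate_cluster ../terraform && terraform plan
```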

After we apply the changes in Terraform, we usually like to perform an orderly rollout of the cluster to ensure that any changes that require new AWS resources actually get rolled out (e.g. new launch configurations), but also to test the resilience of our cluster and ensure it’s able to provision new nodes.

There is a built-in kops rolling-update cluster command but you can also run this manually if you want to have more control. To do this, drain the node using kubectl then terminate the EC2 instance once it’s drained. AWS Auto Scaling will ensure a new one comes up with the most recent launch config:

kubectl drain {node} --ignore-daemonsets --delete-local-data
aws ec2 terminate-instances --instance-ids {instance id}
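If you roll nodes by hand often, the two commands can be combined so that the instance ID is looked up from the node itself. The sketch below is my own (roll_node and instance_id_from_provider_id are hypothetical names) and assumes that nodes on AWS report a spec.providerID of the form aws:///ZONE/INSTANCE_ID.

```shell
# Extract the EC2 instance ID from a node's providerID, e.g.
# "aws:///us-west-2a/i-0123456789abcdef0" -> "i-0123456789abcdef0".
instance_id_from_provider_id() {
  echo "${1##*/}"
}

# Drain a node, then terminate its EC2 instance so the ASG replaces it.
roll_node() {
  node="$1"
  provider_id="$(kubectl get node "$node" -o jsonpath='{.spec.providerID}')" || return 1
  instance_id="$(instance_id_from_provider_id "$provider_id")"

  # Newer kubectl releases rename --delete-local-data to --delete-emptydir-data.
  kubectl drain "$node" --ignore-daemonsets --delete-local-data || return 1
  aws ec2 terminate-instances --instance-ids "$instance_id"
}

# Usage:
#   roll_node ip-10-0-1-23.us-west-2.compute.internal
```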

What about the state?

This approach will work better the more stateless your Kubernetes workloads are. Treating Kubernetes clusters as replaceable infrastructure requires a bit of thinking, especially if you use persistent volumes or want to run workloads such as databases on Kubernetes. We feel pretty confident that we can recreate our workloads by applying each of our service definitions to a given Kubernetes cluster, as long as we keep the state separately on RDS, DynamoDB, and so on. In terms of the etcd state, kops already provisions the etcd volumes independently of the master instances they get attached to. This helps to persist the etcd state after rolling your master nodes without any user intervention. We also use the amazing Heptio Ark to back up persistent resources.

Other considerations

One benefit of having our cluster defined in Terraform is that we can have additional Terraform files in that folder that will complement our cluster definition without being overwritten, as long as they are not named kubernetes.tf :-)

At the very least, we should have a .tf file that defines a remote state similarly to our VPC example, and we should also create an outputs file so our Kubernetes AWS resources can be used across other Terraform projects, for example, to grant access from our k8s nodes, and so on.

Another thing to keep in mind is the high availability of your cluster. The current template defines 3 master and 2 worker nodes. This works well for AWS regions with 3 AZs or more but is not ideal for regions with fewer than 3 AZs. Check out the kops HA best practices for more details on this, and ensure you are limiting etcdMembers to 3 if running across more than 3 AZs!

You can pass extra --values arguments to the same kops toolbox template command, which can help gather outputs from other Terraform projects.

As an additional safeguard, you should also consider enabling S3 versioning on your kops state bucket and implementing tight IAM policies around it.
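One way to turn versioning on is through the AWS CLI, as in this small sketch (the function name enable_versioning and the bucket name in the usage line are placeholders). If the bucket is managed by Terraform, as it is in this post, you would probably prefer to declare versioning in the Terraform configuration instead, so that the two do not drift apart.

```shell
# Enable object versioning on an S3 bucket.
# Argument: bucket name.
enable_versioning() {
  aws s3api put-bucket-versioning \
    --bucket "$1" \
    --versioning-configuration Status=Enabled
}

# Usage:
#   enable_versioning example-kops-state
```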

Cleaning up resources

To clean up any resources created, you can run:

terraform destroy # do this on any directories with terraform configurations
kops delete cluster --name ${CLUSTER_NAME} --state ${STATE}

Conclusion

We have provisioned a Kubernetes cluster using kops and Terraform, reusing outputs from other Terraform projects. This results in a declarative approach that gives us more control and visibility than running imperative commands such as kops create cluster.

We also benefit from being able to visualize the changes that will be made using terraform plan, and having version-controlled and repeatable clusters is also a big plus.


If you are interested in learning more about Bench Accounting or a career with our Engineering team, then please visit us at https://bench.co/careers/.