Matthew Sheppard
Jun 11, 2018 · 20 min read

Written by Matthew Sheppard & Donald Carnegie


Kubernetes (pronounced “koo-burr-NET-eez”) is an open-source platform to automate deploying, scaling, and operating application containers. The cool kids often refer to it by the (IOHO) not very good abbreviation “k8s”. Kubernetes’ job is to make the most efficient use of your infrastructure whilst ensuring that your containerised workloads are available and can scale as needed. Some businesses have reported being able to reduce their cloud infrastructure costs by between 50% and 70% compared to running on a traditional VM-based architecture.

Contain(erise) Your Success

Kubernetes is a massive success story for the Cloud Native Computing Foundation and the open-source community behind it. First announced in mid-2014 as a small project led by a team of Google engineers, it has grown to be the de facto container orchestration platform and a commodity across the public cloud ecosystem. Microsoft, Google, Amazon and IBM all have a managed Kubernetes offering of some form. In terms of contributors and velocity, Kubernetes is second only to the Linux kernel among open-source projects. It’s being used at scale for many cool workloads, from powering Pokemon Go to ensuring HBO GO subscribers can smoothly stream Game of Thrones season 7.

Continuous Kubernetes

As users and engineers of modern web applications, we expect them to be available 24 hours a day, 7 days a week, and to be able to deploy new versions of them many times a day. Kubernetes in itself is not enough to achieve this goal. It ensures our containerised applications run when and where we want and can find the tools and resources they require. To fully empower our engineers, however, we need to build a CI/CD pipeline around Kubernetes.

GitOps is a term that has been coined by Weaveworks to describe using Git as the declarative source of truth for the Kubernetes cluster state. Git becomes our means of tracking system state, and we use constructs within git such as pull requests as a means for merging the state held in git with the state of cloud infrastructure. In a nutshell, approved pull requests result in live changes being made to production. This is a really powerful approach since a proven source code workflow can be applied to manage infrastructure — grasping it fully can rid us of old-school bureaucratic change control processes that are complicated, take too long, and lack accountability. The changes enacted on your infrastructure are always 100% visible, traceable, and accountable. The history is stored perpetually in the git repository, and rolling back to any point in history is a piece of cake.

Why Should I Use Kubernetes?

Containers are already a huge success story in cloud computing. Despite the relatively short period of time they’ve been in our toolbox, they have become a staple of modern cloud computing and are leveraged by many household name apps. However, the burden of orchestrating, managing, and maintaining containerised systems can be huge — that’s where Kubernetes comes in. Kubernetes takes all the great strengths behind containerisation and provides a platform for deploying and managing them with greater ease.

Kubernetes is an enabler for DevOps; it helps implement key DevOps practices and paves the way for organisations to implement DevOps. Wherever you install it, be it on your laptop, a cloud provider, or an on-premise data centre, it provides automated deployments of your containerised applications with fully consistent environments. With Kubernetes, gone are the days of building and successfully testing locally, only to find your application behaves differently in test or production environments!

YATR? (Yet Another Tutorial, Really?)

There are plenty of Kubernetes tutorials out there, so why write another one? Good question! In building Kubernetes clusters for our own fun and for clients, we’ve not come across a tutorial that brings together all the pieces needed to set up a cluster on AWS that could be made production-ready. The documentation is mostly there, but it’s a treasure hunt to track it down and work out how to make it work in each particular situation. This makes it particularly challenging for anyone embarking on their first Kubernetes pilot or making the step up from a local minikube cluster.

The aim of this tutorial is to close this gap and step through the setup of a Kubernetes cluster that:

  • Is highly available: We want to ensure that our environments can deal with failure and that our containerised applications will carry on running if some of our nodes fail or an AWS Availability Zone experiences an outage. To achieve this, we’ll run Kubernetes masters and nodes across 3 AWS Availability Zones.
  • Enforces the “principle of least privilege”: By default all pods should run in a restrictive security context; they should not have the ability to make changes to the Kubernetes cluster or the underlying AWS environment. Any pod that needs to make changes to the Kubernetes cluster should use a named service account with the appropriate role and policies attached. If a pod needs to make calls to the AWS API, its access should be brokered to ensure that the pod has sufficient authorisation to make them and uses only temporary IAM credentials. We’ll achieve this by using Kubernetes Role-based Access Control (RBAC) to ensure that by default pods run with no ability to change the cluster configuration. Where specific cluster services need permissions, we will create a specific service account, bind it to a required permission scope (i.e. cluster wide or in just one namespace), and grant the required permissions to that service account. Access to AWS credentials will be brokered through kube2iam; all traffic from pods destined for the EC2 instance metadata API, which is where AWS clients look up their credentials, will be redirected to kube2iam. Based on annotations in the pod configurations, kube2iam will make a call to the AWS API to retrieve temporary credentials matching the role specified in the annotation and return these to the caller. All other metadata API calls will be proxied through kube2iam to ensure the principle of least privilege is enforced and policy cannot be bypassed.
  • Integrates with Route53 and Classic Load Balancers: When we deploy an application, we want the ability to declare in the configuration how it is made available to the world and where it can be found, and have this automated for us. Kubernetes will automatically provision a Classic Load Balancer for an application and external-dns allows us to assign it a friendly Fully Qualified Domain Name (FQDN), all through Infrastructure as Code.
  • Has a basic CI/CD pipeline wrapped around it: We want to automate the way in which we make changes to the cluster, and how we deploy / update applications. The configuration files specifying our cluster configuration will be committed to a Git repository and the CI/CD pipeline will apply them to the cluster. To achieve this, we’ll use Travis-CI to apply the configuration that gets committed in our master branch to the Kubernetes cluster. This is a first step in the direction of GitOps; however, it doesn’t give us a full GitOps capability.

At the end of the tutorial, we’ll end up with a Kubernetes cluster that looks like this:

[Figure: Our end-state Kubernetes cluster]

Before you get started

We assume you already have some familiarity with Kubernetes. If you are brand new to Kubernetes, it is recommended you review the Kubernetes Basics tutorial and get yourself familiar with its key concepts.

To build our cluster, we need to make sure we have the following tools installed:

  • kubectl
    kubectl (Kubernetes Control) is a command line tool for interacting with a Kubernetes cluster, either running locally on your machine (using minikube) or in the cloud.
  • kops
    The Kubernetes Operations (kops) project provides tooling for building and operating Kubernetes clusters in the cloud. It currently supports Google Cloud & AWS (with other providers in beta). We’ll be using kops to create and manage our cluster in this tutorial.
  • Terraform
    Terraform is an Infrastructure as Code (IAC) tool which allows users to define infrastructure in a high-level configuration language which can then be used to build infrastructure in a service provider such as AWS or Google Cloud Platform. We’ll be using Terraform to create our prerequisites for kops and to modify the IAM policies created by kops.
  • AWS CLI
    AWS CLI is a command line tool for interacting with AWS. This is required by kops & Terraform to perform operations on AWS.

Installation instructions can be found at the links provided.

This tutorial was created using Kubernetes v1.8 and kops v1.8.1.

We’re running Mac OS X with Homebrew, so all we need to do is run the following commands to get these installed:

$ brew update
$ brew install kubectl
$ brew install kops
$ brew install python3
$ easy_install pip
$ pip install awscli --upgrade --user
$ export PATH=~/.local/bin:$PATH
$ brew install terraform
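
Before moving on, it can be worth confirming that everything really did land on your PATH. The snippet below is a minimal sketch rather than part of the original setup: missing_tools is a hypothetical helper name, and it only checks that a command exists, not which version is installed.

```shell
# Print the name of every tool in the argument list that is NOT on PATH.
# Prints nothing when all of the tools are available.
missing_tools() (
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
)

# Usage, with the tool names taken from the install commands above:
#   missing_tools kubectl kops terraform aws
```

If the command prints nothing, all four tools were found; any name it does print needs installing (or your PATH needs fixing) before you continue.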

Create the Cluster

Step 1: Clone our repository

$ git clone

Step 2: Set up an FQDN that will be used for the cluster in Route53

The Kubernetes cluster that we will set up will use an FQDN hosted in Route53 to expose service endpoints and the API control plane. You could register a new FQDN or transfer an existing FQDN. AWS provides a full walkthrough for each of these options.

Step 3: Create the prerequisites needed for kops

For kops to build the cluster, it needs an S3 store to hold the cluster configuration and an IAM user account that has the following policies attached to it:


prereqs/ will create this for you. It will also create an S3 bucket that will be used as a remote store for our Terraform state. This allows multiple users to work with one set of Infrastructure as Code without causing conflicts. You will need to update the file to replace {my_bucket_name} and {my_tf_bucket_name} with your chosen bucket name.

Then run the following commands:

$ cd prereqs
$ terraform init
$ terraform plan
$ terraform apply

If you log into your AWS account, you will now see a newly created kops IAM user, an S3 bucket for the kops state store, and another S3 bucket for the Terraform state store.

Step 4: Use kops to stand up the cluster

In the previous step, we created an IAM account for kops. Now we need to set up our AWS CLI client to use that account. We can grab the kops IAM ID and secret key from the file Terraform uses to store the state of what it has created in the previous step. Open terraform.tfstate in your text editor and look for the section similar to the below:

Make a note of the values in the {iam_id} and {aws_secret_key} fields, and run the following command:

$ aws configure --profile kops
AWS Access Key ID [None]: {iam_id}
AWS Secret Access Key [None]: {aws_secret_key}
Default region name [None]: {your_chosen_aws_region}
Default output format [None]: text

Next we need to set a couple of environment variables so that kops knows which AWS IAM account to use and where it should put its state store:

$ export AWS_PROFILE=kops
$ export KOPS_STATE_STORE=s3://{my_bucket_name}

Now for the main event — let’s use kops to build our cluster. Run the following command, substituting your AWS region, your DNS zone and your chosen cluster name:

$ kops create cluster --cloud aws \
--bastion \
--node-count 3 \
--node-size t2.medium \
--master-size t2.medium \
--zones {your_chosen_aws_region}a,{your_chosen_aws_region}b,{your_chosen_aws_region}c \
--master-zones {your_chosen_aws_region}a,{your_chosen_aws_region}b,{your_chosen_aws_region}c \
--dns-zone {your_dns_zone} \
--topology private \
--networking calico \
--authorization RBAC \
--name {your_cluster_name} \
--out=k8s \
--target=terraform --yes

This command tells kops that we want to build a cluster that:

  • Will use AWS
  • Has a master node of size t2.medium in each of the specified availability zones
  • Has 3 worker nodes of size t2.medium. kops will spread the worker nodes evenly across each of the availability zones
  • Uses a private network topology, meaning that all the nodes have private IP addresses and are not directly accessible from the public Internet
  • Uses Calico as a Container Network Interface replacing kubenet as a result of the requirements of the private network topology
  • Uses RBAC for Kubernetes access permissions
  • Is described in a Terraform configuration file to be written to the directory specified by --out
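
The --zones and --master-zones values repeat the region name three times, which is easy to mistype. As a minimal sketch, the hypothetical helper below builds that list from a region name. It assumes your region exposes zones with the suffixes a, b and c, which is not true of every region or account, so it is worth confirming with aws ec2 describe-availability-zones first.

```shell
# Build a comma-separated Availability Zone list for a region.
# Assumption: the region has zones with the suffixes a, b and c.
zones_for_region() (
  region=$1
  zones=""
  for suffix in a b c; do
    # Append a comma only when the list already has an entry.
    zones="${zones:+$zones,}${region}${suffix}"
  done
  printf '%s\n' "$zones"
)

# Usage:
#   ZONES=$(zones_for_region eu-west-1)
#   kops create cluster ... --zones "$ZONES" --master-zones "$ZONES" ...
```

For eu-west-1 this prints eu-west-1a,eu-west-1b,eu-west-1c, matching the subnets shown in the validation output later in this section.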

kops generates a set of Terraform configuration files in a newly created k8s directory that can be applied to create the cluster. Before we build our cluster, we want to add a configuration file to tell Terraform to keep its state store on the S3 bucket that we just created.

$ cd k8s
$ terraform init
$ terraform plan
$ terraform apply

It will take between 10 and 15 minutes for your cluster to become available. You can check the status of the cluster by running the following command:

$ kops validate cluster

When the cluster is finished building, you should see an output like this:

Using cluster from kubectl context: cluster.zigzag-london.com
Validating cluster cluster.zigzag-london.com
INSTANCE GROUPS
bastions Bastion t2.micro 1 1 utility-eu-west-1a,utility-eu-west-1b,utility-eu-west-1c
master-eu-west-1a Master t2.medium 1 1 eu-west-1a
master-eu-west-1b Master t2.medium 1 1 eu-west-1b
master-eu-west-1c Master t2.medium 1 1 eu-west-1c
nodes Node t2.medium 3 3 eu-west-1a,eu-west-1b,eu-west-1c
NAME ROLE READY
master True
node True
master True
node True
master True
node True
Your cluster is ready
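
Rather than re-running the validation by hand for 10 to 15 minutes, you could poll it. The following is a minimal sketch of a generic retry loop, not something kops provides: retry_until is a hypothetical helper, and the usage line assumes that kops validate cluster exits with a non-zero status until the cluster is ready.

```shell
# Run a command repeatedly until it succeeds.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the command to run
# Returns 0 on the first success, 1 if every attempt fails.
retry_until() (
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      exit 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  exit 1
)

# Usage: try once a minute for up to 20 minutes.
#   retry_until 20 60 kops validate cluster
```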

Standing up a CI/CD Environment

In order to implement GitOps, we need a CI/CD environment for monitoring our repository & executing updates. For this tutorial we will configure our CI/CD environment to execute deployment steps on each push to the master branch of our repository. We’re doing this purely for convenience of demonstration; allowing developers to push straight to master like this is definitely bad practice. In a real-life project we would recommend a feature branching strategy with a code review and a development lead approval step.

For this tutorial, we are going to use TravisCI, a cloud based CI service. TravisCI is free so long as:

  • You host your repository in GitHub
  • The repository is publicly accessible

Step 1: Set up Accounts & Clone repository

  • Navigate to GitHub and sign up/log in
  • Create a new, empty repository and name it “k8s-ci”
  • Clone this repository to your local machine:
$ git clone <URL to new repo>
  • Navigate to TravisCI and sign up using your GitHub account. Navigate to your user profile by clicking on your name on the top right.
  • Click the slider against the GitHub repo to enable TravisCI for this repository.

Step 2: Set up Triggers

As we are practicing GitOps, we only want to deploy on an approved pull request. We can configure this in Travis by clicking More Options → Settings.

  • Ensure “build pushed branches” is set to on
  • Ensure “build pushed pull requests” is set to on

Travis gets the vast majority of its instructions from a yaml file stored in your repository. Create an empty file named .travis.yml in the root of your repository, and we can start configuring it based on the .travis.yml in our repository:

  • Line 1: Specifies that we only want a build to run on the master branch. In a real environment, we would likely also deploy on the pushing of a branch, but to a test environment as opposed to a production environment. One way this could be done is by using environment variables to apply conditional logic to the deploy script, but that is out of scope for this post.
  • Line 4: Specifies that we require root permissions using sudo in order to install our dependencies
  • Line 5: This is the start of the block where we set the permissions on each of our scripts so they are executable.
  • Line 10: This is the start of the block where we specify the scripts that need to run first in order to set up the CI environment before we can execute the scripts that do the actual deployment.
  • Line 13: This is the start of the block where we specify the scripts that will run to execute the deployment tasks.

Step 3: Manage Secrets

We don’t want to keep our AWS secrets in a public repository in clear text; this would be extremely bad information security practice. Handily, Travis provides a CLI tool which can be used for storing your secrets to be injected at build time. Travis generates a new public/private key pair for each new account, and these secrets will be encrypted using that key pair and injected as environment variables each time the build runs. To set this up, run the following commands in the root of your repository and log in using the requested details:

$ sudo gem install travis
$ travis login --org

There are two pre-loaded scripts in the build-scripts directory of our repository ready for your secrets to be input. Copy the build-scripts directory from your local copy of our k8s-tutorial repository to your own k8s-ci repository and update them as follows:

  • large-secrets.txt: Add your Kubernetes access keys. These can be found in ~/.kube/config
  • Add your Kubernetes password (again found in ~/.kube/config) and AWS access keys from ~/.aws/credentials

Then, from the root of your repository, run the script using the following commands:

$ chmod 755 build-scripts/
$ ./build-scripts/

Make a note of the openssl command the script returns for later, as we will need this to decrypt the secrets.

This script will encrypt your secrets using Travis and update your .travis.yml file. Commit the encrypted secrets to your repository:

$ git add build-scripts/large-secrets.txt.enc .travis.yml
$ git commit -m "Committing encrypted secrets"

Now your secrets are secured in Travis, we strongly suggest removing them from all files and scripts. It can be super easy to accidentally commit secrets to source control, and your authors are both guilty of this sin! To catch any accidental commits of secrets, we use git-secrets. You can set it up using the following steps:

$ brew install git-secrets
$ git-secrets --install

Step 4: Install Dependencies

As TravisCI runs each build in a clean Docker container, we need to install our dependencies every time. These dependencies are the same as we set out in the “Before you get started” section of this post. Create a file called in the build-scripts folder and paste the following configuration into it:

Now commit this file to your repository:

$ git add
$ git commit -m "Adding script to install dependencies"

Step 5: Inject Our Secrets

Now we need a script for setting up our secrets in the Docker container that we will do our build steps in. Create a file named in the build-scripts folder. Paste in the script below and update it as follows:

  • Replace {Your cluster url here} with the URL of your Kubernetes cluster
  • Replace {Your openssl command from the encrypt secrets stage here} with the OpenSSL command we made a note of in Step 3 of this section, adding ./build-scripts/ before large-secrets.txt.enc
  • Replace {your-aws-region} with the AWS region you are using

This script will pull our secrets from the Travis environment, decrypt them, and inject them into the pertinent config files.

You will notice in the script above that it refers to a file in the build-scripts directory named kubeconfig — we will need to create this too. Paste in the contents below, swapping out the variable {Your cluster url here} with the URL of your Kubernetes cluster.

Commit both these files to your repository:

$ git add kubeconfig
$ git commit -m "Adding script to inject secrets and kubeconfig file"

Step 6: Environment Setup

Before we are ready to deploy applications, we need to prepare the cluster by deploying the configuration for kube2iam and external-dns. The configuration for each of these tools has to be applied in a set order:

  • Apply Terraform configuration to create a new AWS IAM role (and the required policy grants to that role) and a trust relation back to the AWS IAM role that nodes run under. The trust relationship allows a node to assume the new IAM role.
  • Apply Kubernetes RBAC configuration to create a service account, bind it to a required permission scope, and grant the required permissions to that service account. This service account is then specified as part of the configuration of the pods that are providing each of the specific services.
  • Apply Kubernetes configurations to deploy the services. Depending on the service being deployed, this may be a Kubernetes Deployment or DaemonSet.

We’ll build our deployment script so that the cluster is always configured first.

Copy over the folders containing the templates for external-dns and kube2iam from our repository to your repository.

First, we will create a script that will apply our Terraform configuration. Create a file called in the build-scripts directory and add the following code to it:

This script will traverse the directory structure in our repository and apply any Terraform configuration files it finds.
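
For readers who want a feel for what such a traversal involves, here is a minimal sketch. It is not the script from our repository: terraform_dirs and apply_all_terraform are hypothetical names, and skipping .terraform/ working directories is an assumption about what you would want in CI.

```shell
# List every directory under a root (default: current directory) that
# contains at least one *.tf file, each directory once, in sorted order.
# Terraform's own .terraform/ working directories are skipped.
terraform_dirs() (
  root=${1:-.}
  find "$root" -type f -name '*.tf' ! -path '*/.terraform/*' \
    -exec dirname {} \; | sort -u
)

# Initialise and apply the configuration in each of those directories,
# stopping at the first failure. -input=false and -auto-approve keep
# Terraform from waiting for a prompt that nobody in CI can answer.
apply_all_terraform() (
  terraform_dirs "${1:-.}" | while IFS= read -r dir; do
    (
      cd "$dir" &&
        terraform init -input=false </dev/null &&
        terraform apply -input=false -auto-approve </dev/null
    ) || exit 1
  done
)

# Usage, from the root of the repository:
#   apply_all_terraform .
```

Listing the directories separately from applying them makes it easy to do a dry run: calling terraform_dirs on its own shows exactly what would be applied.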

(NB: In a real production environment, we would add checks in our CI pipeline to ensure that Terraform is not being used maliciously.)

Commit this to your repository:

$ git add
$ git commit -m "Adding Terraform deployment script"

Now we’re ready to update and commit the Terraform configuration for each of our 3 services to the repository:

  • Update external_dns/pod-role-trust-policy.json and replace {your-node-iam-role-arn} with the IAM ARN for Kubernetes nodes in your cluster. This can be found by running the following command:
$ aws iam list-roles | grep node
  • Update external_dns/ to replace {your-aws-region} with the AWS region you are working in and {your-tf-bucket} with the name of the bucket you chose to hold the Terraform state store.
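
With this many placeholders to fill in by hand, it is easy to miss one. The check below is a minimal sketch, assuming placeholders follow the pattern used in this tutorial (an opening brace followed by "your", "Your" or "my_"); check_placeholders is a hypothetical helper name.

```shell
# Search the given files or directories for unfilled template placeholders
# such as {your-aws-region} or {my_bucket_name}. Matching lines are printed
# as file:line:text. Returns 1 if any placeholder is found, otherwise 0.
check_placeholders() (
  if grep -rnE '\{([Yy]our|my_)[^}]*\}' "$@"; then
    echo "Unfilled placeholders found" >&2
    exit 1
  fi
)

# Usage, before committing:
#   check_placeholders external_dns
```

Because the pattern requires "your", "Your" or "my_" straight after the brace, ordinary JSON braces in the policy files are not flagged.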

Commit the service configuration to your repository:

$ git add external_dns/pod-role-trust-policy.json external_dns/ external_dns/external-dns-role-rights.json external_dns/
$ git commit -m "Adding cluster Terraform service configuration"

Travis is configured to apply configuration on each push to master so if we execute a push now:

$ git push

We should be able to see all our Terraform configuration being applied to our AWS account in the Job Log.

We have now created all the IAM roles and trust relationships needed by our Kubernetes environment.

Next, we need a script to apply our Kubernetes configurations for our environment pre-requisites. To complete this step you will need to copy and update the following files from our repository into your repository:

  • external_dns/external_dns.yaml: Replace {your-dns-zone} with the DNS zone you are using, {your-identifier} with something that will differentiate the DNS records that external-dns will produce (e.g. your name), and {your-external-dns-iam-role-arn} with the IAM ARN for the role that was created when the Terraform configuration was applied. This can be found by running the following command:
$ aws iam get-role --role-name external_dns_pod_role
  • kube2iam and rbac/: No updates required

These updates specify which IAM role each of the pods should assume when they need to access the AWS API.

Now commit these files into the repository:

$ git add external_dns/external_dns.yaml rbac/ kube2iam/
$ git commit -m "Adding external-dns k8s config"

Now we’re going to start building our deploy script for our Kubernetes services. Create a file named in the build-scripts folder. Start the file off with a header like this:

Next, add the below steps that deploy the Kubernetes RBAC configuration to the cluster:

These steps are needed because the external-dns service requires Kubernetes API rights in order to run and provide its service to the cluster. As a reminder, RBAC ensures pods have no access to the Kubernetes API by default. This is in line with the “principle of least privilege” and prevents pods being able to alter the cluster settings if they are compromised for any reason.

In order to get TravisCI to apply these changes, we need to add an additional step to our .travis.yml to execute the new script. Add the following to the before_install: section:

- chmod +x ./build-scripts/

And the following into the script: section:

- "./build-scripts/"

Now commit the script and .travis.yml, push your repository to master, and ensure there are no errors in the Travis build log:

$ git add build-scripts/ .travis.yml
$ git commit -m "Adding Travis config to deploy k8s config"
$ git push

Now that we have the Terraform and RBAC configuration added to our CI/CD pipeline, let’s add steps to deploy kube2iam to our script:

kube2iam is deployed first since external-dns will make calls to the AWS API using kube2iam as a broker.

Now push your repository to master and ensure there are no errors in the build log:

$ git add build-scripts/
$ git commit -m "Updating Travis config to deploy k8s config"
$ git push

Now, let’s check on our cluster to make sure all the services have been deployed correctly. external-dns is a Deployment, so we can execute the following command to get its status:

$ kubectl get deployments --namespace=kube-system

If everything has been deployed correctly, we should see something like:

calico-kube-controllers 1 1 1 1 1h
calico-policy-controller 0 0 0 0 1h
dns-controller 1 1 1 1 1h
external-dns 1 1 1 1 1m
kube-dns 2 2 2 2 1h
kube-dns-autoscaler 1 1 1 1 1h

kube2iam is deployed as a DaemonSet since it needs to be running on all nodes to broker calls to the AWS API. We run the following command to get its status:

$ kubectl get ds --namespace=kube-system

If all is well, we should see something like:

calico-node 6 6 6 6 6 <none> 1h
kube2iam 3 3 3 3 3 <none> 7m
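
Reading those columns by eye gets tedious once a cluster has more DaemonSets. The following is a minimal sketch that flags any DaemonSet whose ready count is below par; daemonsets_not_ready is a hypothetical helper, and it assumes the column order NAME, DESIRED, CURRENT, READY, which matches the output above but may differ between kubectl versions.

```shell
# Read `kubectl get ds` rows on stdin and print the name of every DaemonSet
# whose READY count (column 4) differs from its DESIRED count (column 2).
# A header row starting with NAME is ignored.
daemonsets_not_ready() {
  awk '$1 != "NAME" && $2 != $4 { print $1 }'
}

# Usage:
#   kubectl get ds --namespace=kube-system | daemonsets_not_ready
```

No output means every DaemonSet has as many ready pods as it wants.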

Step 7: Deploy A Test Application

Now it’s time to reap the benefits of the hard work of setting up our cluster and see the power of our workflow and Infrastructure as Code to easily deploy a test application!

First, we need to add a deployment step to our script that will deploy our applications:

This step will apply any Kubernetes configuration files in the apps directory of our repository to the cluster. Commit this change and push to master:

$ git add build-scripts/
$ git commit -m "Updating Travis config to deploy k8s config for apps"
$ git push

As we’re beginning our journey towards GitOps, let’s follow a GitOps process flow for deploying a test application:

  • Create a local branch by running the following command:
$ git checkout -b testapp
  • Create a folder called apps in your repository
  • In the apps folder, create a file called hello_app_deployment.yaml and add the following to it:

This configuration has 2 sections:

  1. Deployment: This specifies the details of the container we are going to run, the amount of resources to give it, which port the application inside the container can be accessed on, and the number of replicas of that application we want to run. In this case, we are going to run a simple container that prints “Salutations, globe!” and then the hostname of the container to port 8080. We specify we are going to run 3 replicas — one for each of our cluster nodes.
  2. Service: This specifies how the deployment should be exposed, either internally or externally. In the case of our test application, we are exposing it internally on the cluster IP on port 80. We also specify here the friendly FQDN that our application can be accessed on. You will need to replace {your FQDN here} with your friendly FQDN.

Now commit this file to your local branch and push the branch to the remote repository:

$ git add apps/hello_app_deployment.yaml
$ git commit -m "Adding test app"
$ git push -u origin testapp

If we log into GitHub, we should now see we have a new branch called “testapp”:

We want to raise a pull request and merge to master, so click on “Compare & pull request” and follow the process to complete this activity.

Once the deployment has completed, we can verify our test application has deployed correctly by running the below commands and checking for similar output:

$ kubectl get deployments salutations-deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
salutations-deployment 3 3 3 3 24d
$ kubectl get services salutations-service
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
salutations-service a78b874f74ed0... 80:32439/TCP 6d

The real proof, however, is connecting to our application using our friendly FQDN! If everything has deployed correctly, you should see something like this:

Success! If you now refresh this page, you should see the hostname change when your browser accesses the app running on a different Kubernetes node through the Classic Load Balancer.
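
If you would rather not sit hitting refresh, the same check can be scripted. This is a minimal sketch: count_distinct is a hypothetical helper, APP_FQDN is a variable you would set to your own friendly FQDN, and the loop assumes the app answers plain HTTP on port 80.

```shell
# Count the number of distinct lines read from stdin.
count_distinct() {
  sort -u | wc -l | tr -d ' '
}

# Usage: fetch the page ten times, flatten each response onto one line,
# and count how many different responses came back.
#   for i in 1 2 3 4 5 6 7 8 9 10; do
#     curl -s "http://$APP_FQDN/" | tr '\n' ' '
#     echo
#   done | count_distinct
```

With 3 replicas behind the load balancer, a result of 2 or 3 suggests requests are being spread across pods; a result of 1 may just mean the sample was too small.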


Using this tutorial, we have built a Kubernetes cluster with a good set of security defaults and then wrapped a simple CI/CD pipeline around it. We then specified through Infrastructure as Code how we wanted a simple containerised application to be deployed and used our CI/CD pipeline and Kubernetes cluster to deploy it as defined — automatically.

This is just a simple illustration of the rich benefits Kubernetes can bring to your developers and your DevOps capabilities when included as part of your toolchain!

Slalom Engineering

Insights and opinions from software engineers at Slalom.

Matthew Sheppard

Written by

Solution Architect at Slalom UK. Software engineering, cloud native architecture/infrastructure & infosec. Cannot be trusted alone with cheese. @justatad
