Infrastructure as Code: Kubernetes

Mar 3, 2018 · 7 min read

Building an infrastructure is without a doubt a complex art that has evolved over time. Many aspects of it demand improvement again and again: maintainability, scalability, fault tolerance, and performance, just to name a few. To get a feel for this complexity, we can look at one sub-process of running an infrastructure: deployment.


If you’re a software engineer, you’re probably familiar with the classic one-liner “it works on my machine”. Deployment is complex, and just because something works on your Mac doesn’t mean it will automatically run when you slap it on a Linux server. Replicating everything your machine provides on some God-knows-what-OS machine and infrastructure, and having it behave exactly the same, is very hard. VMs, and VM management tools such as Vagrant, try to tackle this issue, but they are heavy in both resource usage and startup time.

A few years ago, the concept of containerised applications gained popularity, with Docker as the front runner. A container can be thought of as a small package of OS userland plus the configuration needed to get your application up and running. Containers share the kernel of the Docker host itself and can therefore share resources more efficiently, which gives Docker an edge over VMs. Docker sets a neat standard for packaging an application, ensuring that it runs exactly the same way on any machine that has Docker installed. This, however, doesn’t answer the deployment complexities that lie outside of your application. You know how to run an application and easily swap it for a newer or older version. But you still have to deal with how your applications (plural) communicate, scale, keep running, and revive when your server dies; and once you have all of that set up, moving the whole thing to another cloud provider is a giant nightmare of its own.


Kubernetes has been around for a few years now, and it enables having your infrastructure written as code. This is a tremendous benefit in two ways: your infrastructure can be versioned and committed to a Git repository, and it can easily be “deployed” elsewhere.

Kubernetes provides rich functionality through numerous components. To understand how Kubernetes works at a basic level, we’ll only look into the few components that are sufficient to get an application up and running.

  • Cluster: the first abstraction layer of Kubernetes is the cluster, hence “Kubernetes cluster”. If we take AWS as an example, a cluster is simply a collection of EC2 instances. Intuitively, when you have a cluster of something, you need a way of managing it; otherwise your fancy automations and deployments are next to useless if, say, your EC2 instances keep crashing due to CPU overload. Thankfully, the Kubernetes ecosystem provides a tool for this: kops. Kops eases cluster deployment, Kubernetes versioning and upgrades, and makes sure that you always have x cluster members running at any given time. With an additional add-on, it can even add another EC2 instance when necessary. Yes, your cluster can be auto-scaled!
  • Deployment: as the word suggests, a Deployment is an abstraction of your deployment configuration. It specifies which Docker image your application runs, which version, which environment variables it needs, and much more. Essentially, a Deployment governs two other components of Kubernetes: the Pod and the ReplicaSet. A Pod is an abstraction of your application (which we will look at shortly), while a ReplicaSet makes sure that x pods are running at any given time.
  • Pod: as mentioned, a Pod is an abstraction of your application. It runs in its own environment, it knows what application it runs, it has its own network interface and internal IP address, and you can open a shell inside it (with kubectl exec). If you are familiar with Docker, think of it as a Docker container.
  • Service: a collection of pods doesn’t know how to communicate with the outside world, or with the pods of other deployments. That is what a Service is for. It defines where your pods are exposed, and how other pods can discover them. Bear in mind that pods are mortal (they can be killed at any time, for any reason), and that’s why we need more stable abstractions on top of them, which come in the form of Deployments and Services.
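To make the cluster piece concrete, here is a rough sketch of bringing up a kops-managed cluster on AWS. The S3 bucket and cluster names are placeholders, and real usage requires AWS credentials and permissions already configured:

```shell
# Placeholders: replace the state bucket and cluster name with your own.
export KOPS_STATE_STORE=s3://my-kops-state-bucket
export NAME=demo.k8s.local

# Define a small two-node cluster, then actually create it
kops create cluster --zones=us-east-1a --node-count=2 ${NAME}
kops update cluster ${NAME} --yes
```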

To get a better understanding of what this looks like, we’ll take a look at a basic deployment configuration that can get your application up and running. After all, it’s not a technical example unless a classic hello-world is in it!

Setting Up

To get started, you will need a Kubernetes cluster of your own. For the sake of simplicity, you can start with minikube, which spins up a local Kubernetes cluster on your machine, or you can go big and use kops to install a cluster on AWS, for instance. Minikube ships together with kubectl, the command-line tool for communicating with your cluster. kubectl is also a prerequisite for kops, in which case you’ll need to install it separately, simply by doing this:

# If you're on linux
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
# or if you're on mac
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl
# and then
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
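If you choose the minikube route instead, a minimal sketch of bringing up a local cluster and checking that kubectl can talk to it looks like this:

```shell
# Start a local single-node Kubernetes cluster
minikube start

# Verify that kubectl can reach the cluster
kubectl cluster-info
kubectl get nodes
```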

The following is a simple YAML file to get you started:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: hello-world-deployment
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: "hello-world"
    spec:
      containers:
      - name: "hello-world-container"
        image: "google/python-hello"
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
        env:
        - name: NODE_ENV
          value: production
        - name: MESSAGE   # placeholder; the original variable name was lost in formatting
          value: "Kubernetes is awesome!"
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world-lb
  labels:
    run: hello-world
spec:
  selector:
    app: hello-world
  type: LoadBalancer
  ports:
  - name: "http"
    port: 80
    targetPort: 8080
    protocol: "TCP"

This configuration consists of one Deployment and one LoadBalancer Service. If you look at the Deployment, you’ll see that we specify replicas as 2. This means that our deployment will have 2 pods, and a ReplicaSet will automatically be created to ensure that you always have 2 hello-world pods running. It’s worth noting that your application needs to be stateless to be scaled in this way.
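Because the ReplicaSet continuously reconciles toward the declared replica count, scaling is just a matter of changing that number, either by editing replicas in the YAML and re-applying it, or imperatively:

```shell
# Bump the deployment from 2 to 4 pods; the ReplicaSet handles the rest
kubectl scale deployment hello-world-deployment --replicas=4

# Watch the new pods appear
kubectl get pods -w
```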

The next thing to pay attention to is the label app: hello-world. Labels are matched by selectors, which other components of Kubernetes use to identify the pods that belong to hello-world-deployment. A quick example is our load balancer Service, whose selector matches exactly this label. Selectors are a very powerful convenience in Kubernetes, which we will leave for another time to discuss.
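As a quick sketch of how selectors work in practice, the same label can be used to filter resources from the command line:

```shell
# List only the pods carrying the app=hello-world label
kubectl get pods -l app=hello-world

# The same label selector works across resource types
kubectl get all -l app=hello-world
```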

The last thing to look at is the port. Our Deployment declares that the python-hello image exposes port 8080, and we want that port kept open on our pods. That port is then mapped by our Service to another port, in this case 80. This means that if we hit the load balancer ingress on port 80, we will be directed to one of our 2 pods.

Now, you can get it up by running this command:

kubectl apply -f hello-world.yaml

To validate the result, you can run kubectl get deployments and kubectl get pods, and you should see hello-world-deployment with its 2 pods up and running.

If you run kubectl describe services/hello-world-lb you can see the load balancer ingress, and you can hit it to get a Hello World response from one of our pods.
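A minimal sketch of hitting the service, assuming a load balancer ingress hostname from the describe output (the hostname below is a placeholder):

```shell
# Find the load balancer ingress of the service
kubectl describe services/hello-world-lb | grep -i ingress

# Hit it on port 80 (replace with your actual ingress hostname)
curl http://<your-load-balancer-ingress>/
```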

Now let’s take a step back to appreciate how powerful this is. This example runs on AWS, with the cluster spawned and managed by kops. Yet throughout the whole process we never needed to know which EC2 instance holds pod 1 or pod 2, because Kubernetes manages that placement. When we add new deployments to our cluster, Kubernetes spreads the load across the cluster and decides where to put the new pods. You can kill one of the pods, and our ReplicaSet will spawn a new one as a replacement. You can even kill one of the EC2 instances, and kops will spin up a new EC2 instance while Kubernetes recreates the pods from the dead instance on the ones that are still alive. And we have barely scratched the surface of Kubernetes and how powerful it can get.
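You can watch this self-healing yourself. The pod name below is a placeholder; use a real name from kubectl get pods:

```shell
# Note the current pod names
kubectl get pods

# Delete one pod (placeholder name); the ReplicaSet spawns a replacement
kubectl delete pod hello-world-deployment-3271348492-abcde

# A fresh pod shows up shortly afterwards
kubectl get pods
```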

Bear in mind that we have covered just one application deployment. Now imagine writing all the deployments and third-party dependencies we need in this way; then we have exactly what we want: infrastructure written as code.


As the project describes itself, Kubernetes truly is “Production-Grade Container Orchestration”. It leverages containers and defines how they are deployed, how they communicate with each other and with the outside world, and how they are scaled, upgraded, downgraded, auto-revived, and so much more. Above all, Kubernetes frees us from being too tightly coupled to a cloud service provider, which otherwise could become a massive disaster overnight (an extreme example). Having the entire infrastructure written as code means that moving it is as simple as pointing your kubectl at another cluster and running kubectl apply -f yourfolder. You can also commit it to a Git repository and get a clear picture of how your infrastructure changes over time.
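A sketch of that cluster-switching step, assuming both clusters are already registered in your kubeconfig (the context name is a placeholder):

```shell
# See which clusters kubectl knows about
kubectl config get-contexts

# Point kubectl at another cluster and re-apply the entire folder
kubectl config use-context my-other-cluster
kubectl apply -f yourfolder/
```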


Kubernetes is very rich in features and is growing rapidly, which makes it impossible to cover in one sitting. Stay tuned for more on Kubernetes deployment auto-scaling, cluster auto-scaling, zero-downtime deployment, SSL termination, how an Nginx ingress can benefit your deployment, and much more in my next posts!


Written by


Lead Engineer at Style Theory

Style Theory Engineering & Data

