Running Workloads in Kubernetes

Published in Google Cloud - Community, Apr 8, 2017

This post is based on my talk at KubeCon Europe 2017. The recording, slides, and demos of this talk are all available. In this talk, I gave an overview of built-in controllers in Kubernetes. The intended audience is Kubernetes beginners.

Slides of my KubeCon talk: Running Workloads in Kubernetes

You may remember that last month’s AWS outage broke lots of popular websites and applications. After the incident, Rob Scott, VP of Software Architecture at Spire, shared a story of how Spire mitigated the AWS outage with the magic of Kubernetes.

Rob Scott, VP of Software Architecture @Spire, described how Kubernetes recovered from the AWS outage

Kubernetes is a platform for containerized application patterns. These patterns make applications easier to deploy, to administer, to scale, and to recover from failures — that’s the magic.

What’s in a Kubernetes Cluster?

Here is a simplified Kubernetes cluster:

A pod is the smallest and simplest unit that you create or deploy in Kubernetes. A single pod usually has one container, or sometimes several tightly coupled containers, along with their shared resources. A pod represents a single instance of an application in Kubernetes.

A controller is a higher-level abstraction in Kubernetes. Each controller represents one application pattern. A controller (the red square in the above diagram) manages copies of pods for a specific application pattern. Therefore, you don’t need to create pods directly. Instead, you create controllers that run your applications according to these patterns.

A node is a physical or virtual machine. The master node makes global decisions about the cluster, and the worker nodes maintain pods and provide their runtime environment. Controllers run on the master node and manage pods running on the worker nodes.

In this post, I’ll discuss 4 general patterns in Kubernetes: stateless, stateful, daemons, and batch. Again, each pattern is represented by one controller.
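
The controller idea described above can be sketched as a small control loop. This is a toy model rather than Kubernetes code: the `reconcile` function, the pod dictionaries, and the `pod-N` names are all hypothetical and only illustrate how a controller keeps the number of healthy pods equal to a desired count.

```python
import itertools


def reconcile(desired_replicas, pods, counter):
    """One pass of a toy control loop: make the pod list match the desired count."""
    # Unhealthy pods are treated as disposable and dropped.
    healthy = [p for p in pods if p["healthy"]]
    # Create replacements until the desired count is reached.
    while len(healthy) < desired_replicas:
        healthy.append({"name": f"pod-{next(counter)}", "healthy": True})
    # If there are too many pods, keep only the desired number.
    return healthy[:desired_replicas]


counter = itertools.count(1)
pods = reconcile(3, [], counter)      # initial rollout: pod-1, pod-2, pod-3
pods[1]["healthy"] = False            # simulate pod-2 failing
pods = reconcile(3, pods, counter)    # pod-2 is replaced by pod-4
print([p["name"] for p in pods])      # ['pod-1', 'pod-3', 'pod-4']
```

Real controllers run this kind of comparison continuously against the cluster state instead of being called by hand.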

Stateless Pattern: Deployment

Stateless means that you don’t need to keep state (persistent data) in your workloads. The controller for the stateless pattern is called Deployment. If you want to manage and scale stateless workloads, such as your web applications, mobile backends, or API servers, Deployment is the controller for you.

Deployment favors availability over consistency [1]. Deployment provides availability by creating multiple copies of the same pod. Those pods are disposable: if they become unhealthy, Deployment simply creates replacements. What’s more, you can update a Deployment at a controlled rate, without a service outage. When an incident like the AWS outage happens, your workloads can recover automatically.
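
As a rough illustration of what such a Deployment could look like, the sketch below builds a manifest as a Python dictionary and prints it as JSON. It assumes the `apps/v1` API version used by current Kubernetes releases (Deployment was still a beta API when this post was written); the `deployment_manifest` helper, the `nginx:1.25` image tag, and the rolling-update limits are illustrative choices, not taken from the talk.

```python
import json


def deployment_manifest(name, image, replicas):
    """Build a minimal Deployment manifest (assumes the apps/v1 API)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            # Number of identical, disposable pod copies to keep running.
            "replicas": replicas,
            # The selector must match the labels in the pod template below.
            "selector": {"matchLabels": labels},
            # Replace pods gradually so the application stays available.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxUnavailable": 1, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


manifest = deployment_manifest("nginx", "nginx:1.25", replicas=3)
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed output could be saved to a file and passed to `kubectl apply -f`.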

Concrete examples: nginx, Tomcat

Below is a demo of deploying a stateless application with Deployment and then performing a rolling update. In this demo, I used the Kubernetes CLI tool, kubectl, to interact with the cluster. kubectl makes asynchronous requests that tell the controller what the desired state is, and the controller then makes it so. I also created a Service in the demo. Services sit in front of applications: you don’t talk to pods directly, you talk to Services. Services load balance traffic to healthy pods, so that when any pod becomes unhealthy or is being terminated, your service is still available.
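
The way a Service keeps traffic away from unhealthy pods can be pictured with a toy model. The functions and pod records below are hypothetical, and the strict round-robin order is a simplification; the actual distribution of requests depends on how the cluster implements Services.

```python
def ready_endpoints(selector, pods):
    """Names of pods whose labels match the selector and that are ready."""
    return [
        pod["name"]
        for pod in pods
        if pod["ready"]
        and all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


def route(selector, pods, request_count):
    """Spread requests over the ready endpoints in turn (a simplification)."""
    endpoints = ready_endpoints(selector, pods)
    if not endpoints:
        raise RuntimeError("no ready pods behind this Service")
    return [endpoints[i % len(endpoints)] for i in range(request_count)]


pods = [
    {"name": "web-a", "labels": {"app": "web"}, "ready": True},
    {"name": "web-b", "labels": {"app": "web"}, "ready": False},  # unhealthy
    {"name": "web-c", "labels": {"app": "web"}, "ready": True},
    {"name": "db-a", "labels": {"app": "db"}, "ready": True},     # different app
]
print(route({"app": "web"}, pods, 4))  # ['web-a', 'web-c', 'web-a', 'web-c']
```

The unhealthy pod `web-b` never receives a request, which is why clients of the Service do not notice a single pod failing.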

A demo of deploying and rolling out updates to stateless applications with Deployment

Stateful Pattern: StatefulSet

For stateless applications, scaling and recovery are easy. However, some of your applications need to store data, like databases, caches, and message queues. If you’re running distributed stateful workloads, such as Zookeeper, each of your stateful pods will need a stronger notion of identity.

The pods created by Deployment won’t have the same identity after being killed and recreated, and they don’t have unique persistent storage either. Therefore, we need another controller for the stateful pattern: StatefulSet. Unlike Deployment, StatefulSet chooses consistency over availability. StatefulSet also manages multiple pods, but each pod of a StatefulSet has a stable, unique, and sticky identity and its own storage. That is to say, the pods are similar but each is slightly different. With StatefulSet, you can deploy, scale, and delete pods in order. This is safer, and makes it easier for you to reason about your stateful applications.
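
The stable identity and ordering can be made concrete with a small sketch. It assumes the default StatefulSet behavior, where pods are named `<name>-<ordinal>` starting at 0, are created in increasing ordinal order, and are removed from the highest ordinal down. The `scale_plan` and `claim_name` helpers and the `zk` and `datadir` names are hypothetical.

```python
def scale_plan(name, current, desired):
    """List the ordered steps to go from `current` to `desired` replicas."""
    if desired >= current:
        # Scale up: create pods one by one, lowest new ordinal first.
        return [("create", f"{name}-{i}") for i in range(current, desired)]
    # Scale down: delete pods one by one, highest ordinal first.
    return [("delete", f"{name}-{i}") for i in range(current - 1, desired - 1, -1)]


def claim_name(template, name, ordinal):
    """Storage claim for a pod: <claim template>-<set name>-<ordinal>."""
    return f"{template}-{name}-{ordinal}"


print(scale_plan("zk", 0, 3))  # [('create', 'zk-0'), ('create', 'zk-1'), ('create', 'zk-2')]
print(scale_plan("zk", 3, 1))  # [('delete', 'zk-2'), ('delete', 'zk-1')]
print(claim_name("datadir", "zk", 0))  # datadir-zk-0
```

Because a recreated `zk-0` gets the same name and the same storage claim, the other members of the cluster can keep treating it as the same peer.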

Concrete examples: Zookeeper, MongoDB, MySQL

A demo of running a Zookeeper cluster with StatefulSet

Daemon Pattern: DaemonSet

Sometimes, you want to run daemon-like workloads on your nodes, such as log collection daemons or node monitoring daemons. In this case, you use DaemonSet.

DaemonSet makes sure that every node runs a copy of a pod. If you add or remove nodes, pods will be created or removed on them automatically. If you just want to run the daemons on some of the nodes, use node labels to control it — put some labels on nodes, and tell the DaemonSet “Hey, run DaemonSet pods only on nodes with these labels.”
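
The node selection described above can be sketched as a label match. This is a toy model with hypothetical node names and a made-up `disk` label; in a real DaemonSet manifest the labels would go in the pod template’s `nodeSelector` field.

```python
def nodes_for_daemon(nodes, node_selector=None):
    """Return the nodes that should run one copy of the daemon pod."""
    selector = node_selector or {}  # no selector means every node qualifies
    return [
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in selector.items())
    ]


nodes = {
    "node-1": {"disk": "ssd"},
    "node-2": {"disk": "hdd"},
    "node-3": {"disk": "ssd"},
}
print(nodes_for_daemon(nodes))                   # ['node-1', 'node-2', 'node-3']
print(nodes_for_daemon(nodes, {"disk": "ssd"}))  # ['node-1', 'node-3']

nodes["node-4"] = {"disk": "ssd"}                # a new node joins the cluster
print(nodes_for_daemon(nodes, {"disk": "ssd"}))  # ['node-1', 'node-3', 'node-4']
```

When a matching node joins, it simply shows up in the result, which mirrors how a DaemonSet schedules a pod onto new nodes without any extra action from you.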

Concrete examples: fluentd, linkerd

A demo of running one pod per node with DaemonSet

Batch Pattern: Job

You might also need to run batch processing workloads in your clusters. The controller for batch workloads is called Job.

Job creates multiple pods running in parallel. You can specify how many pods need to complete successfully in this Job. Job is designed for parallel processing of independent but related work items, such as emails to be sent or frames to be rendered.
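
A Job along these lines might be described as follows. The sketch builds a `batch/v1` Job manifest as a Python dictionary, using the `completions` and `parallelism` fields to say how many pods must finish and how many may run at once. The `job_manifest` helper, the `busybox` image, and the placeholder command are assumptions made for illustration.

```python
import json


def job_manifest(name, image, command, completions, parallelism):
    """Build a minimal Job manifest (batch/v1 API)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Total number of pods that must finish successfully.
            "completions": completions,
            # Maximum number of pods running at the same time.
            "parallelism": parallelism,
            "template": {
                "spec": {
                    "containers": [
                        {"name": name, "image": image, "command": command}
                    ],
                    # Job pods may not use the default "Always" restart policy.
                    "restartPolicy": "Never",
                }
            },
        },
    }


job = job_manifest(
    "render-frames",
    "busybox",
    ["sh", "-c", "echo rendering one frame"],  # stand-in for real work
    completions=8,
    parallelism=2,
)
print(json.dumps(job, indent=2))
```

With these values, eight pods would run to completion in total, at most two at a time.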

A demo of running pods in parallel to completion with Job

Summary

So these are the only patterns that are supported in Kubernetes, right? No! They’re just the most common, general patterns that are supported in Kubernetes. In summary:

Stateless pattern: use Deployment, which provides availability and makes it easy to scale and recover
Stateful pattern: use StatefulSet for consistency, to give each pod a unique and sticky identity and storage, and to deploy, scale, and terminate pods in order
Daemon pattern: use DaemonSet, which runs one pod per node by default and can be customized using node labels
Batch pattern: use Job to run multiple pods in parallel and run them to completion

Now, a question you may have is “This sounds great, but where do I start? Are there examples or tools for me to move my workloads to Kubernetes?”

Yes! There are lots of great tools to start with, one of which is Helm. Helm is the Kubernetes package manager. With Helm, you can download and install Helm charts, which are curated Kubernetes-native applications, and start from there.

Another question you might have is “What if I need to run other patterns in Kubernetes? How do I customize?”

Kubernetes is extensible. It provides all the essential primitives, so you can write your own controllers or use controllers that others wrote, such as the Elasticsearch Operator and the etcd Operator.

Kubernetes is open. It’s open for suggestions and feedback. You can send a patch or file a bug report in the Kubernetes GitHub repo, or chat with us on Slack. You can also follow the latest Kubernetes news on Twitter @kubernetesio. If you have questions or want to follow up, you’re more than welcome to find me on Twitter as well.

Hope you enjoy the post!

[1] https://en.wikipedia.org/wiki/CAP_theorem