Kubernetes (commonly stylized as K8s) is a powerful system, originally released in 2014, for managing containerized applications in a clustered environment. Its goal is to provide better ways of managing related, distributed components and services across varied infrastructure.
In this tutorial, we’ll discuss some basic Kubernetes concepts. We will talk about the architecture of the system, the problems it solves, and the model that it uses to handle containerized deployments and scaling. We will often reference an example setup I have created; here’s the link to the GitHub repository.
Getting to know Kubernetes
Kubernetes is a system for running and coordinating containerized applications across a cluster of machines. It is a platform that manages the life cycle of containerized applications and services using methods that provide predictability, scalability, and high availability. For the example I have prepared, we are only looking at a single-node Kubernetes cluster run via Minikube.
As Kubernetes users, we can define how our applications should run and the ways they should be able to interact with other apps in Kubernetes as well as the outside world. We can scale our services up or down, perform graceful rolling updates, and switch traffic between different versions of our applications to test features or roll back problematic deployments. Kubernetes provides composable platform primitives that allow us to define and manage our applications with high degrees of flexibility, power, and reliability.
Let’s first take a look at how Kubernetes is designed and organized at a high level. We can think of it as a system built in layers, with each higher layer abstracting the complexity found in the lower levels.
At its base, Kubernetes brings together individual physical or virtual machines into a cluster using a shared network to communicate between each server. This cluster is the physical platform where all Kubernetes components, capabilities, and workloads are configured. As software engineers, we will typically never need to worry about this layer.
The machines in a cluster are each given a role within the Kubernetes ecosystem. One server functions as the master server. This server acts as the gateway and brain for the cluster by exposing an API for users and clients, health checking other servers, deciding how best to split up and assign work, and orchestrating communication between other components. The master server acts as the primary point of contact to the cluster. It is responsible for most of the centralized logic Kubernetes provides, from scheduling to handling kubectl API requests.
The other machines in the cluster are designated as nodes: servers responsible for accepting and running workloads using local and external resources. To help with isolation, management, and flexibility, Kubernetes runs applications and services in containers, so each node needs to be equipped with a container runtime (typically something like Docker). The node receives work instructions from the master server and creates or destroys containers accordingly, adjusting networking rules to route and forward traffic appropriately. For our example, we don’t need to worry about other nodes, as everything runs on a single master node in Minikube.
The final Kubernetes layer is the user-defined applications layer. All of these applications run in a containerized environment such as Docker. To start up an application or service, a declarative plan is submitted (e.g., deployment.yml) defining what to create and how it should be managed. Think of it as infrastructure as code. The master server then takes the plan and figures out how to run it on the infrastructure by examining the requirements and the current state of the system. Usually this is done using the Kubernetes CLI tool, kubectl. Here’s a simple example:
kubectl apply -f deployment.yml
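For reference, here is a minimal sketch of what a deployment.yml plan might contain. The names and image below are illustrative placeholders, not taken from the example repository:

```yaml
# Minimal Deployment manifest; names and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2                # desired number of identical pod copies
  selector:
    matchLabels:
      app: my-app            # must match the pod template labels below
  template:                  # pod template Kubernetes uses to create replicas
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: web
          image: nginx:1.25  # any container image works here
          ports:
            - containerPort: 80
```

Submitting this plan with kubectl apply declares the desired state; Kubernetes then works continuously to make the cluster match it.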
Working with Kubernetes
Although containers are the underlying mechanism used to deploy applications, working with Kubernetes involves additional layers of abstraction over the container interface to provide scaling, resiliency, and life cycle management features. Normally we would not manage containers directly. Instead, we define and interact with instances composed of various primitives provided by the Kubernetes object model. We will go over the different types of objects that can be used to define these workloads below.
A pod is the most basic unit that Kubernetes deals with. Containers themselves are not assigned to hosts. Instead, one or more tightly coupled containers are encapsulated in an object called a pod.
A pod generally represents one or more containers that should be controlled as a single application. Pods consist of containers that operate closely together, share a life cycle, and should always be scheduled on the same node. They are managed entirely as a unit and share their environment, volumes, and IP space. Despite their containerized implementation, we should generally think of pods as a single, monolithic application to best conceptualize how the cluster will manage the pod’s resources and scheduling.
Pods usually consist of a main container that satisfies the general purpose of the workload and, optionally, some helper containers that facilitate closely related tasks. These are programs that benefit from being run and managed in their own containers but are tightly tied to the main application. For example, a pod may have one container running the primary application server and a helper container pulling down files to the shared filesystem when changes are detected in an external repository. Horizontal scaling is generally discouraged at the pod level because there are other, higher-level objects better suited for the task. For our example, we don’t need to worry about having any helper containers in our pod.
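As a sketch of that helper-container pattern (the names, images, and sync command here are hypothetical), a pod can give both containers a shared volume to work against:

```yaml
# Sketch of a pod with a main container and a file-syncing helper.
# Names, images, and the sync loop are illustrative, not a real setup.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sync
spec:
  volumes:
    - name: shared-content       # filesystem shared by both containers
      emptyDir: {}
  containers:
    - name: app                  # main container serving the workload
      image: nginx:1.25
      volumeMounts:
        - name: shared-content
          mountPath: /usr/share/nginx/html
    - name: content-sync         # helper container refreshing the shared volume
      image: alpine/git
      # Assumes a repository was already cloned into /content.
      command: ["sh", "-c", "while true; do git -C /content pull || true; sleep 60; done"]
      volumeMounts:
        - name: shared-content
          mountPath: /content
```

Both containers share the pod’s life cycle, network identity, and the emptyDir volume, which is exactly what makes this coupling work.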
Generally, we should not manage pods themselves, because they do not provide some of the features typically needed in applications (like sophisticated life cycle management and scaling). Instead, we should work with higher level objects that use pods or pod templates as base components but implement additional functionality.
As mentioned above, rather than working with single pods, we will typically be managing groups of identical, replicated pods. These are created from pod templates and can be horizontally scaled by controllers known as replica sets (note: there are many types of controllers that manage pods and other lower-level components).
A replica set is an object that defines a pod template and control parameters to scale identical replicas of a pod horizontally by increasing or decreasing the number of running copies. This is an easy way to distribute load and increase availability natively within Kubernetes. The replica set knows how to create new pods as needed because a template that closely resembles a pod definition is embedded within its configuration.
The replica set is responsible for ensuring that the number of pods running in the cluster matches the number in its configuration. If a pod or underlying host fails, it will start new pods to compensate. If the number of replicas in its configuration changes, it either starts up or kills pods to match the desired number.
Like pods, replica sets are rarely the units we will work with directly. While they build on the pod design to add horizontal scaling and reliability guarantees, they lack some of the fine-grained life cycle management capabilities found in more complex objects.
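A ReplicaSet manifest makes this embedding concrete: the pod template sits directly inside the controller’s spec. This is a hedged sketch with placeholder names:

```yaml
# Sketch of a ReplicaSet; names and image are placeholders.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-app-rs
spec:
  replicas: 3                # the controller keeps exactly this many pods running
  selector:
    matchLabels:
      app: my-app
  template:                  # pod template used whenever new copies are needed
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: web
          image: nginx:1.25
```

Raising or lowering the replicas field is all it takes for the controller to start or kill pods until reality matches the configuration.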
Ah, finally! Deployments are one of the most common workloads to create and manage directly. Deployments use replica sets as a building block, adding flexible life cycle management functionality to the mix.
Deployments solve many of the pain points that existed with rolling updates under the older, now-deprecated replication controllers. To update an application with a replication controller, we had to submit a plan for a new controller that would replace the current one, and tasks like tracking history, recovering from network failures during the update, and rolling back bad changes were either difficult or left entirely as our responsibility.
Deployments are a high level object designed to ease the life cycle management of replicated pods. Deployments can be modified easily by changing the configuration and Kubernetes will adjust the replica sets, manage transitions between different application versions, and optionally maintain event history and undo capabilities automatically. Because of these features, deployments are the type of Kubernetes object we typically work with. They will be the focus of our example as well.
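To sketch what that life cycle management looks like in configuration (all values here are illustrative), a deployment can declare exactly how transitions between versions should happen:

```yaml
# Sketch of a Deployment with rolling-update controls; values are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  revisionHistoryLimit: 5    # how many old replica sets to keep around for rollbacks
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # allow at most one extra pod during an update
      maxUnavailable: 0      # never drop below the desired replica count
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: web
          image: nginx:1.25
```

Changing the image and re-applying the file triggers a controlled rollout; kubectl rollout status and kubectl rollout undo expose the history and undo capabilities mentioned above.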
Jobs and CronJobs
The workloads we’ve described so far have all assumed a long-running, service-like life cycle. Kubernetes uses a workload called jobs to provide a more task-based workflow, where the running containers are expected to exit successfully once they have completed their work. Jobs are useful if we need to perform one-off or batch processing instead of running a continuous service.
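A minimal job sketch looks like the following; the echo-and-sleep command is just a stand-in for real batch work:

```yaml
# Sketch of a one-off Job; the name and command are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: one-off-task
spec:
  backoffLimit: 3            # retry a failing pod up to three times
  template:
    spec:
      restartPolicy: Never   # the container should exit, not be restarted as a service
      containers:
        - name: task
          image: busybox
          command: ["sh", "-c", "echo processing batch && sleep 5"]
```

Once the container exits successfully, the job is marked complete rather than being restarted like a service would be.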
Building on jobs are CronJobs. Like the conventional cron daemons on Linux and Unix-like systems that execute scripts on a schedule, cron jobs in Kubernetes provide an interface to run jobs with a scheduling component. Cron jobs can be used to schedule a job to execute in the future or on a regular, recurring basis. Kubernetes CronJobs are basically a reimplementation of the classic cron behavior, using the cluster as a platform instead of a single operating system. Unfortunately, I did not find a suitable use case for CronJobs when I built my simple Robinhood app, but I leave it as an exercise for the reader to dive into Kubernetes CronJobs and the other types of controllers.
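The resemblance to classic cron extends to the schedule syntax itself. A sketch (names and command are placeholders) that runs a job every five minutes:

```yaml
# Sketch of a CronJob; batch/v1 on recent clusters (older ones used batch/v1beta1).
apiVersion: batch/v1
kind: CronJob
metadata:
  name: periodic-task
spec:
  schedule: "*/5 * * * *"    # classic five-field cron expression
  jobTemplate:               # each run creates a regular Job from this template
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: task
              image: busybox
              command: ["sh", "-c", "date"]
```

Each tick of the schedule spawns an ordinary job, so everything said about jobs above applies to each run.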
A K8s Service is a component that acts as a basic internal load balancer and ambassador for pods. It’s a way to expose an application running on a set of pods that perform the same function. A service presents this set of pods as a single entity.
This allows us to deploy a service that can keep track of and route to all of the backend containers of a particular type. Internal consumers only need to know about the stable endpoint provided by the service. Meanwhile, the service abstraction allows us to scale out or replace the backend work units as necessary. A service’s IP address remains stable regardless of changes to the pods it routes to. By deploying a service, we easily gain discoverability and can simplify our container designs.
Any time we need to provide access to one or more pods to another application or to external consumers, we should configure a service. For instance, if we have a set of pods running web servers that should be accessible from the internet, a service will provide the necessary abstraction. Likewise, if our web servers need to store and retrieve data, we would want to configure an internal service to give them access to our database pods.
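A basic internal service is just a label selector plus a stable port. A sketch with placeholder names:

```yaml
# Sketch of an internal (ClusterIP) Service; names are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: my-app-svc
spec:
  selector:
    app: my-app              # routes to any pod carrying this label
  ports:
    - port: 80               # stable port exposed by the service
      targetPort: 80         # port the pod containers actually listen on
```

Pods come and go, but as long as they carry the matching label, consumers of my-app-svc never need to know.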
Although services, by default, are only available using an internally routable IP address, they can be made available outside of the cluster by choosing one of several strategies. The NodePort configuration works by opening a static port on each node’s external networking interface. Traffic to the external port will be routed automatically to the appropriate pods using an internal cluster IP service.
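Switching a service to NodePort is mostly a matter of adding a type field. In this sketch the port value is illustrative and must fall within the cluster’s node port range (30000–32767 by default):

```yaml
# Sketch of a NodePort Service; names and port values are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: my-app-nodeport
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080        # static port opened on every node's external interface
```

Traffic hitting any node on port 30080 is forwarded to the matching pods via the service’s internal cluster IP.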
Alternatively, the LoadBalancer service type creates an external load balancer to route to the service using a cloud provider’s Kubernetes load balancer integration. The cloud controller manager will create the appropriate resource and configure it using the internal service addresses.
Lastly, we have Ingress, an API resource that manages external access to services in a cluster. Think of it as an external traffic load balancer. This will be how we access our app from a Chrome browser. Like all the other Kubernetes objects, we can define what we want for an Ingress resource via a YAML file.
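An Ingress definition is essentially a set of routing rules mapping hostnames and paths to services. A sketch, where the hostname and service name are placeholders:

```yaml
# Sketch of an Ingress; hostname and backend service are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  rules:
    - host: my-app.example.com    # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-svc  # an internal service like the one above
                port:
                  number: 80
```

An ingress controller in the cluster watches these rules and does the actual traffic routing; the resource itself is only the declaration.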
We’ll need to add Ingress as an addon to our Minikube setup (via the minikube addons enable ingress command). Once set up, we can navigate to our Minikube dashboard and check out our newly created Ingress resource. There is an IP address associated with it under “Endpoints”. We can then copy and paste this address into Chrome and shazaaam. Resilient horizontal scaling on our Minikube cluster.
Kubernetes is awesome. It allows us to run highly available containerized workloads on a highly abstracted platform. Kubernetes has many different building blocks for all our needs and is a very mature tool (open-sourced in 2014). By understanding how these basic building blocks fit together, we can begin to design systems that fully leverage the capabilities of the platform to run and manage our workloads at scale!