Creating a Kubernetes operator using Kubebuilder

Adnan Selimovic
Jan 4, 2022


Kubernetes is the de facto standard for deploying and running applications on modern cloud platforms. Its declarative approach, in which the desired state of the infrastructure is defined in YAML, makes it easy to describe how an application should be deployed. Deploying stateless applications is not a big deal. Deploying, configuring, and operating distributed stateful applications, on the other hand, is a challenging task.

Kubernetes addresses this by letting developers extend it with operators. An operator reacts to changes in a custom resource and reconciles the state of the cluster with the state defined in that resource, using logic embedded in the operator itself.
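To make the idea of reconciliation concrete, here is a minimal sketch in plain Go with no Kubernetes libraries involved. The reconcile function and the pod names are hypothetical; a real operator would read the desired state from a custom resource and the observed state from the cluster API.

```go
package main

import (
	"fmt"
	"sort"
)

// reconcile compares the desired state with the observed state and returns
// the actions needed to make the observed state match the desired one.
// Both maps are used as sets of object names (illustrative only).
func reconcile(desired, observed map[string]bool) []string {
	var actions []string
	for name := range desired {
		if !observed[name] {
			actions = append(actions, "create "+name)
		}
	}
	for name := range observed {
		if !desired[name] {
			actions = append(actions, "delete "+name)
		}
	}
	// Map iteration order is random, so sort for a stable result.
	sort.Strings(actions)
	return actions
}

func main() {
	desired := map[string]bool{"pod-a": true, "pod-b": true}
	observed := map[string]bool{"pod-b": true, "pod-c": true}
	fmt.Println(reconcile(desired, observed)) // [create pod-a delete pod-c]

	// Once the states match, there is nothing left to do.
	observed = map[string]bool{"pod-a": true, "pod-b": true}
	fmt.Println(len(reconcile(desired, observed))) // 0
}
```

The function is level-based: it looks only at the current difference between the two states, so running it again after the actions are applied yields no further work.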

When designing an application intended to run on Kubernetes, one should take into account the capabilities Kubernetes provides out of the box. Doing so can speed up implementation, make the application more reliable, and let the code focus on business logic.

There are multiple ways to create an operator. You could write one from scratch using the libraries maintained by the Kubernetes API Machinery SIG, but that is a tedious task with a steep learning curve. As an alternative, several tools generate boilerplate code and speed up operator development. Popular ones are Operator SDK and Kubebuilder. This article focuses on creating an operator with Kubebuilder. Let’s create an operator that starts a pod running a simple HTTP API and binds some data to that API.

Kubebuilder provides a CLI for creating and managing operator projects. To start a new project, one only needs to run:

kubebuilder init --domain clientmgr.io --repo clientmgr.io/tutorial
kubebuilder create api --group qdnqn --version v1 --kind Client

Kubebuilder will create a directory structure, and you can start developing the operator straight away. The core components of the operator are found in the following locations:

  • api/v1/client_types.go
  • controllers/client_controller.go

The first file, api/v1/client_types.go, holds the structure of the custom resource. The second one holds the controller logic. Important note: there is some confusion about the difference between operators and controllers. An operator is a controller that has domain-specific knowledge about the application it operates.

The custom resource is defined in api/v1/client_types.go as shown below.

// ClientSpec defines the desired state of Client
type ClientSpec struct {
	// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
	// Run "make" to regenerate code after modifying this file
	ClientId            string `json:"clientId,omitempty"`
	ContainerImage      string `json:"containerImage,omitempty"`
	ContainerTag        string `json:"containerTag,omitempty"`
	ContainerEntrypoint string `json:"containerEntrypoint,omitempty"`
}

// ClientStatus defines the observed state of Client
type ClientStatus struct {
	// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
	// Run "make" to regenerate code after modifying this file
	ClientStatus string `json:"clientStatus,omitempty"`
	LastPodName  string `json:"lastPodName,omitempty"`
}

Kubebuilder generates the boilerplate; the user only needs to add custom fields, as in the code block above (ClientId, ContainerImage, ContainerTag, and ContainerEntrypoint in the spec, ClientStatus and LastPodName in the status). With that, the custom resource is defined. To generate the Custom Resource Definition and apply it to the Kubernetes cluster, run make manifests and then make install in the root directory of the project.
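As a rough illustration of how the json tags map Go fields to the keys of the custom resource, the sketch below serializes a ClientSpec with the standard library. The specJSON helper and the sample values are assumptions made for this example and are not part of the generated project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClientSpec mirrors the spec fields defined in the article.
type ClientSpec struct {
	ClientId            string `json:"clientId,omitempty"`
	ContainerImage      string `json:"containerImage,omitempty"`
	ContainerTag        string `json:"containerTag,omitempty"`
	ContainerEntrypoint string `json:"containerEntrypoint,omitempty"`
}

// specJSON is a hypothetical helper that renders a spec as JSON.
func specJSON(spec ClientSpec) (string, error) {
	b, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Sample values are made up; ContainerEntrypoint is left empty on purpose.
	out, err := specJSON(ClientSpec{
		ClientId:       "client-1",
		ContainerImage: "example/http-api",
		ContainerTag:   "latest",
	})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out)
	// {"clientId":"client-1","containerImage":"example/http-api","containerTag":"latest"}
}
```

Because every tag carries omitempty, fields left at their zero value do not appear in the serialized object, which is why containerEntrypoint is absent from the output.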

The custom resource is now defined, but how do we implement custom logic? Let’s check controllers/client_controller.go.

The main code goes in the Reconcile function. As the gist below shows, the code behaves like a state machine. The steps that follow comment on the code, with the matching gist line numbers in parentheses.

First, get the Client custom resource that triggered the event (L4-L8).

Deep copy the Client resource (L10).

If the ClientStatus field in the status of the CR is empty, set it to Pending (L12-L14).

A switch statement implements the state machine logic (L16).

Set ClientStatus to Running and update the status of the resource in the cluster (L20-L27).

Create a pod object and store it in the pod variable (L29).
Check if the pod exists (L32).

If LastPodName is empty, create a new pod (L34).

Create the pod on the cluster from the pod variable (L40).

Trigger a requeue (L47).

If the pod cannot be retrieved, return an error (L53).
If the pod failed or succeeded, switch the custom resource to the Cleaning state (L58).

If the pod is running, try to bind data to the external HTTP API (L60-L77).

If the pod is pending, do nothing (L83).

If the client status has changed, update it (L88-L97).

Check if the pod has any clients via an external HTTP API call (L100).
Get the pod in the query (L102).
If the pod doesn’t have any clients left, delete it (L104-L114).

If LastPodName != NewPodName, update the status accordingly (L116-L122).

Update the status of the Client resource on the cluster if needed (L124-L133).
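The pod lookup part of the walkthrough can be sketched as a small decision function. This is a simplified model in plain Go, not the code from the gist: nextPodStep, podStep, errNotFound, and errLookup are hypothetical names, and treating a missing pod with a non-empty LastPodName as an error is an assumption, since the walkthrough does not spell that case out.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the "not found" error of a cluster lookup.
var errNotFound = errors.New("pod not found")

// errLookup stands in for any other lookup failure.
var errLookup = errors.New("lookup failed")

// podStep names what the reconcile pass should do after looking up the pod.
type podStep string

const (
	stepCreateAndRequeue podStep = "create pod, then requeue"
	stepReturnError      podStep = "return error"
	stepInspectPhase     podStep = "inspect pod phase"
)

// nextPodStep maps the lookup result and the recorded LastPodName to a step.
func nextPodStep(lookupErr error, lastPodName string) podStep {
	switch {
	case lookupErr == nil:
		// The pod exists, so the caller can look at its phase.
		return stepInspectPhase
	case errors.Is(lookupErr, errNotFound) && lastPodName == "":
		// No pod was ever created for this client.
		return stepCreateAndRequeue
	default:
		// Assumption: every other failure is reported to the caller.
		return stepReturnError
	}
}

func main() {
	fmt.Println(nextPodStep(errNotFound, "")) // create pod, then requeue
	fmt.Println(nextPodStep(nil, "pod-1"))    // inspect pod phase
	fmt.Println(nextPodStep(errLookup, ""))   // return error
}
```

Keeping this decision in a pure function makes it testable without a cluster; the reconcile function would then only execute the step that is returned.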

Checks are performed and actions are triggered based on the three states of the custom resource:

  • Pending: Switch to the Running state.
  • Running: Create the pod and bind to the HTTP API, or switch to Cleaning to rebind to another pod.
  • Cleaning: Check through the HTTP API whether the pod has any active clients; if not, delete the pod. Otherwise, if needed, bind the client to the new pod.
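The transitions between these states can be modeled roughly as follows. The nextStatus function and the exact status strings are assumptions for illustration; the pod phase values follow the Kubernetes pod phase names. Unlike the reconcile function described above, this sketch advances one transition per call and leaves Cleaning unchanged, because the work done in that state depends on the HTTP API.

```go
package main

import "fmt"

// Status values of the custom resource (assumed spelling).
const (
	statusPending  = "Pending"
	statusRunning  = "Running"
	statusCleaning = "Cleaning"
)

// nextStatus returns the status the custom resource should move to, given
// its current status and the phase of its pod.
func nextStatus(current, podPhase string) string {
	switch current {
	case "":
		// A freshly created resource has no status yet.
		return statusPending
	case statusPending:
		return statusRunning
	case statusRunning:
		// A finished pod can no longer serve the client.
		if podPhase == "Failed" || podPhase == "Succeeded" {
			return statusCleaning
		}
		return statusRunning
	default:
		// Cleaning (and anything unknown) is left as is in this sketch.
		return current
	}
}

func main() {
	status := ""
	for _, phase := range []string{"", "", "Running", "Failed"} {
		status = nextStatus(status, phase)
		fmt.Println(status)
	}
	// Prints Pending, Running, Running, Cleaning, one per line.
}
```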

It is advisable to keep the Reconcile function as clear as possible. In this example, the code is laid out in one place only to make the demonstration easy to follow. When maintaining operator code, it would be best to extract the logic into separate functions and call them from the reconcile loop to preserve readability.

The HTTP API is a simple API written in Go. The gist is given below.
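For readers without access to the gist, the following is a hypothetical stand-in and not the author's code. It assumes an API that lets a client bind itself and reports how many clients are bound; the /bind and /clients paths, the clientId query parameter, and all type and function names are made up for this sketch. To keep the example self-terminating, main exercises the handler in-process with net/http/httptest; a real server would pass the handler to http.ListenAndServe instead.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// clientAPI tracks which client IDs are bound to this instance.
type clientAPI struct {
	mu      sync.Mutex
	clients map[string]bool
}

func newClientAPI() *clientAPI {
	return &clientAPI{clients: map[string]bool{}}
}

// bind records a client ID; binding the same ID twice has no extra effect.
func (a *clientAPI) bind(id string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.clients[id] = true
}

// count returns the number of bound clients.
func (a *clientAPI) count() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return len(a.clients)
}

// handler exposes POST /bind?clientId=<id> and GET /clients.
func (a *clientAPI) handler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/bind", func(w http.ResponseWriter, r *http.Request) {
		id := r.URL.Query().Get("clientId")
		if r.Method != http.MethodPost || id == "" {
			http.Error(w, "POST with clientId required", http.StatusBadRequest)
			return
		}
		a.bind(id)
		w.WriteHeader(http.StatusNoContent)
	})
	mux.HandleFunc("/clients", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "%d", a.count())
	})
	return mux
}

// do sends one in-process request and returns the status code and body.
func do(h http.Handler, method, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	h := newClientAPI().handler()
	do(h, http.MethodPost, "/bind?clientId=client-1")
	_, body := do(h, http.MethodGet, "/clients")
	fmt.Println("bound clients:", body) // bound clients: 1
}
```

An endpoint that reports the number of bound clients is what would let an operator decide, in the Cleaning state, whether a pod can be deleted safely.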

This API is containerized and operated by the operator written with the help of Kubebuilder. Let’s see the operator in action.

To give the operator something to do, let’s apply some custom resources.

Since we set the controller reference in the Reconcile function when the pod is created, one client holds ownership over that pod. See https://kubernetes.io/docs/concepts/overview/working-with-objects/owners-dependents/.

In the gist, three clients are defined: two pod owners and one extra client, which will bind to the API in the pod created from client-sample-2.

Let’s observe the behavior of the operator before/after applying the custom resources.

Below you can find a terminal session showing the behavior of the operator.

The operator is deployed in the tutorial-system namespace (L3). We can see that the default namespace is empty before the custom resources are applied (L1). After the clients have been applied to the cluster, pods are created automatically (L10); a Deployment or StatefulSet would have created something similar. What makes our operator distinct is that it understands the application running the HTTP API.

This article describes a simple operator use case. Operators can be used to operate distributed applications, e.g. Hazelcast, databases, Kafka, and so on. Such applications already have operators available to end users.

It takes a lot of effort to create reliable operators. The use case here is simplified to demonstrate how one could create an operator using Kubebuilder. This operator was created only for demonstration, and it could be improved a lot.

Using this design pattern to deploy applications to Kubernetes should be justified. Not every application is a good fit for deployment with an operator. Trade-offs should be taken into consideration before starting on this journey.

Complete code can be found on GitHub.
