Simplifying Custom Resource Definition and Controller Creation in Kubernetes using Python

Kubernetes has become the de facto standard for orchestrating containerized applications. One of its key features is extensibility, enabled by Custom Resource Definitions (CRDs) and custom controllers. This post aims to provide a comprehensive yet simplified guide on creating a CRD, implementing a custom controller with Kopf (the Kubernetes Operator Pythonic Framework), and deploying the controller for local development and testing.

Why and When to Use a Custom Resource Definition (CRD)

Think of Kubernetes as a bustling city, and the applications running on it as the city’s inhabitants. The city (Kubernetes) provides basic services like water, electricity, and public transportation (built-in resources like Pods, Services, and Deployments). These services are great and fulfil the needs of most citizens.

However, suppose you’re an artist and you need a studio with specific lighting conditions, or maybe you’re a chef who needs a custom-built kitchen. The city’s basic services aren’t going to cover these specialized needs. That’s where CRDs come in.

A Custom Resource Definition (CRD) is like getting a permit to modify your living or working space to fit your unique needs. It’s a way to tell Kubernetes, “Hey, I have some specific requirements that your standard offerings don’t cover. So, I’m going to create my own resource type that does.”

CRDs allow you to create your own, user-defined resources that behave just like the built-in ones. They’re your ticket to customizing Kubernetes to your application’s needs, making your application a first-class citizen of the Kubernetes city.

Step 1: Create the Resource Definition

Let’s start by creating a simplified CRD for a Greeting resource. This CRD will have only one field, message.

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: greetings.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                message:
                  type: string
  scope: Namespaced
  names:
    plural: greetings
    singular: greeting
    kind: Greeting

Here’s a breakdown of the CRD’s main components:

  • apiVersion: The version of the Kubernetes API you're using to create this CRD.
  • kind: For CRD creation, this is always CustomResourceDefinition.
  • metadata: Data that helps uniquely identify the CRD, including name, which is the name of our CRD.
  • spec: This is where the specification of the CRD is defined. It includes the following fields:
  • group: The name of the API group this CRD belongs to.
  • versions: An array that can hold multiple versions of the CRD. Each version has its own name, plus flags indicating whether the version is served and whether it is the storage version.
  • scope: Defines whether the CRD is cluster-wide (use Cluster) or specific to a namespace (use Namespaced).
  • names: The names used for the created resources. kind is the CamelCase type name used in manifests, plural is the form used in API URLs and by kubectl, and singular is a lowercase alias for the resource name.
  • schema: This is where we define the structure of our custom resource, using an OpenAPI v3 schema. For our Greeting resource, it has a single field, message, which is a string.
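Incidentally, the same openAPIV3Schema block can also enforce validation, so malformed Greetings are rejected at apply time. A hedged sketch, assuming we want message to be mandatory and bounded in length (the maxLength value is illustrative):

```yaml
# Validation sketch: drop-in replacement for the schema section above.
schema:
  openAPIV3Schema:
    type: object
    properties:
      spec:
        type: object
        required: ["message"]   # reject Greetings without a message
        properties:
          message:
            type: string
            maxLength: 256      # illustrative bound, tune to your needs
```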

After creating the CRD definition, you can apply it using the kubectl apply -f <filename.yaml> command.

$ kubectl apply -f greeting.definition.yml
customresourcedefinition.apiextensions.k8s.io/greetings.example.com created

Verify

$ kubectl get crd
NAME                    CREATED AT
greetings.example.com   2023-06-14T15:20:50Z

Step 2: Create the CRD Instance

Next, let’s create an instance of our Greeting CRD:

apiVersion: example.com/v1
kind: Greeting
metadata:
  name: hello-world
spec:
  message: "Hello, World!"
  • apiVersion: The version of the API the object should be created with. In this case we use example.com/v1, meaning version v1 of our custom API group example.com.
  • kind: The kind of resource to be created. Here, Greeting refers to the kind of the custom resource that we defined in our Custom Resource Definition (CRD).
  • metadata: This section uniquely identifies the object. The name field inside metadata is the name of the resource instance, in this case hello-world.
  • spec: This is where the specific configuration for our resource goes. Inside the spec we have a field called message with the value "Hello, World!", which corresponds to the message property defined in the schema of our CRD. The spec will differ between resources based on the schema defined in their CRDs.

So in essence, this YAML file tells Kubernetes to create a new Greeting resource named hello-world with a message saying "Hello, World!". After defining this file, you can create the Greeting resource instance using the kubectl apply -f <filename.yaml> command. It's important to note that the name of the resource instance (in this case, hello-world) must be unique within the namespace for each specific kind of resource.

Apply this instance with kubectl :

$ kubectl apply -f greeting.instance.yaml
greeting.example.com/hello-world created

$ kubectl get greetings
NAME          AGE
hello-world   10s

Step 3: Create the Controller

With our CRD and its instance set up, we can now write a custom controller that logs a greeting message whenever a Greeting resource is created.

But before that, let's understand why we need a controller at all.

Creating a custom resource definition (CRD) is akin to defining a new hardware device; it tells the system what the device is. But that’s not enough. Without a driver, a piece of hardware is just a paperweight. It’s the driver that brings the device to life by enabling it to perform tasks.

In the Kubernetes world, controllers are those drivers.

If CRDs are the hardware, controllers are the software. While a CRD tells Kubernetes about a new kind of resource, it’s the controller that breathes life into it, defining what actions to take when changes occur to instances of the resource. Consider the built-in Kubernetes Job controller. When a Job resource is created, it tells Kubernetes to run a task to completion, even in the event of failures or restarts. But it’s the Job controller that ensures this happens. The controller watches for new Job resources, then schedules the specified task to run on a node in the cluster. If a task fails, the Job controller will reschedule it, ensuring the Job resource’s contract of running to completion is honoured.

Similarly, when you create your own CRD, you’re essentially creating a contract of what that resource is. The controller is what ensures the contract is upheld. It continuously reconciles the desired state specified by the CRD with the actual state in the cluster.
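The reconciliation idea described above can be sketched in a few lines of plain Python. This is conceptual only — there are no real Kubernetes API calls here, just the diffing logic a controller performs between desired and actual state:

```python
# Conceptual reconciliation sketch: given the desired state (from custom
# resources) and the actual state (what exists in the cluster), compute the
# actions a controller would take to close the gap.
def reconcile(desired, actual):
    """Return a list of (action, name, spec) tuples moving actual toward desired."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(('create', name, spec))      # missing entirely
        elif actual[name] != spec:
            actions.append(('update', name, spec))      # drifted from spec
    for name in actual:
        if name not in desired:
            actions.append(('delete', name, None))      # no longer wanted
    return actions
```

A real controller runs this loop continuously, re-evaluating whenever it observes a change, which is exactly what frameworks like Kopf arrange for us.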

Let’s say we create a Greeting CRD and a controller for it. If a user creates a Greeting resource with a message, the controller could be responsible for ensuring that message gets displayed to all active users in a system. If a new user logs in, the controller would reconcile the state and display the greeting message to them, thus fulfilling the Greeting contract.

This is why we need to create a controller for our custom resources. They are what allow our custom resources to truly become first-class citizens in the Kubernetes ecosystem.

In the following section, we’ll take a look at how to create a controller for our Greeting custom resource.

import kopf

@kopf.on.create('example.com', 'v1', 'greetings')
def greeting_created_fn(spec, **kwargs):
    print(f"Greeting: {spec['message']}")

Here, kopf.on.create registers a function to be called whenever a Greeting resource is created. The spec parameter contains the spec of the created resource.

Before we dive into the controller code, let’s make sure we have the necessary prerequisites set up.

Prerequisites

1. Python Environment: Since our controller is going to be written in Python, you’ll need a Python environment set up. It’s recommended to use a virtual environment to avoid package conflicts. Here’s how to set one up:

python3 -m venv myenv
source myenv/bin/activate

2. Kopf: Kopf (Kubernetes Operator Pythonic Framework) is a framework to write Kubernetes operators in Python. You can install it in your virtual environment with pip:

pip install kopf

Now that we have our environment set up, let’s break down our controller code:

import kopf

@kopf.on.create('example.com', 'v1', 'greetings')
def greeting_created_fn(spec, **kwargs):
    print(f"Greeting: {spec['message']}")

Here’s what’s happening in the code:

  • We’re importing the kopf module, which provides the tools to build Kubernetes operators.
  • The @kopf.on.create('example.com', 'v1', 'greetings') decorator tells Kopf to call the following function when a Greeting resource is created.
  • 'example.com' is the group of the CRD, 'v1' is the version, and 'greetings' is the plural name of the CRD.
  • greeting_created_fn is the function that will be executed when a Greeting resource is created. The function name can be anything you like.
  • spec is a parameter that contains the spec of the custom resource, and **kwargs captures any additional context provided by Kopf.
  • Inside the function, we’re printing the message field from the Greeting resource’s spec.

With this setup, every time a Greeting resource is created in our cluster, our operator will print out the greeting message. This is a simple example, but you can imagine more complex operations like updating a database or interacting with other Kubernetes resources.
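One Kopf feature worth knowing at this point: whatever a handler returns, Kopf records in the resource's status under the handler's name, which is a lightweight way to report results back without extra API calls. A minimal sketch of such a handler body — the decorator is shown as a comment so the snippet stays self-contained, and the returned fields are made up for illustration:

```python
# Body of a create-handler that reports back via the resource's status.
# In the operator it would be registered with:
#   @kopf.on.create('example.com', 'v1', 'greetings')
def greeting_created_fn(spec, **kwargs):
    message = spec.get('message', '')
    print(f"Greeting: {message}")
    # Kopf patches this dict into the object's status under the handler name,
    # i.e. status.greeting_created_fn.
    return {'greeted': True, 'length': len(message)}
```

After creation, kubectl get greeting hello-world -o yaml would show these fields under status — assuming the CRD's schema allows them (with structural schemas, unknown status fields may be pruned unless the schema permits them).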

In the next section, we will look at how to deploy and run this operator in your cluster.

Kopf provides a range of decorators to react to different Kubernetes events. Here are some of the most common ones that you might use when creating a Kubernetes operator:

1. @kopf.on.create(…) : React to the creation of a custom resource. The function this decorator is applied to is called whenever a new custom resource is created in the cluster.

2. @kopf.on.update(…) : React to updates to a custom resource. The function this decorator is applied to is called whenever an existing custom resource is modified.

3. @kopf.on.delete(…) : React to the deletion of a custom resource. The function this decorator is applied to is called whenever a custom resource is deleted from the cluster.

4. @kopf.timer(…) : Execute code at regular intervals. This is useful for performing periodic checks or updates that are independent of changes to resources.

5. @kopf.on.resume(…) : Execute code when the operator starts, or when new matching resources are detected. This is useful for bringing the system into a known state at startup, regardless of the state of the custom resources at the time the operator was stopped.

6. @kopf.on.event(…) : React to arbitrary Kubernetes events. This is a lower-level function that can be used to react to any event in the Kubernetes system, not just those related to a specific custom resource.

7. @kopf.daemon(…) : Run code as a background daemon attached to individual resources. This can be useful for long-running tasks or tasks that should run continuously for the lifetime of a resource.

Here is an example of how to use these decorators in a Kopf-based operator:

import kopf

@kopf.on.create('example.com', 'v1', 'greetings')
def create_fn(spec, **kwargs):
    print(f"Create: {spec['message']}")

@kopf.on.update('example.com', 'v1', 'greetings')
def update_fn(spec, **kwargs):
    print(f"Update: {spec['message']}")

@kopf.on.delete('example.com', 'v1', 'greetings')
def delete_fn(spec, **kwargs):
    print(f"Delete: {spec['message']}")

@kopf.timer('example.com', 'v1', 'greetings', interval=60)
def timer_fn(spec, **kwargs):
    print(f"Timer: {spec['message']}")

In this example, we have functions that react to the creation, update, deletion of Greeting resources, and a timer function that runs every 60 seconds.

These decorators give you a powerful toolkit for responding to changes in your Kubernetes environment and maintaining the desired state of your system.
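Beyond choosing the event type, Kopf's decorators also accept filters — for example labels=, annotations=, or a when= callable — so a handler only fires for matching resources. A small sketch of a when= predicate; note the audience field is hypothetical and not part of our Greeting schema:

```python
# Predicate for Kopf's `when=` filter; such callables receive the same
# keyword arguments as handlers. The 'audience' spec field is hypothetical
# (it is not defined in the Greeting CRD above).
def targets_everyone(spec, **_):
    return spec.get('audience', 'everyone') == 'everyone'

# Registration would look roughly like:
# @kopf.on.create('example.com', 'v1', 'greetings', when=targets_everyone)
# def create_fn(spec, **kwargs): ...
```

Filters keep handler bodies free of early-return boilerplate and let one operator serve several resource subsets cleanly.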

Local Development and Testing

To run this controller locally for development and testing, run the script with the kopf CLI (installed along with the package in the prerequisites):

kopf run controller.py --debug

Deploying the Controller

Why to Deploy

The operator (the controller) is the brain behind the operation. It’s the component that actually performs the tasks needed to reach the desired state of your custom resource. Without deploying the operator, you’d have the definition of your custom resource but nothing to manage it. It’s like having a blueprint for a building but no construction workers to build it.

Where to Deploy

The operator is typically run as a Pod within the same Kubernetes cluster where your custom resources are. This is because the operator needs direct access to the Kubernetes API server to watch and manage the custom resources. However, depending on the design of your operator, it could potentially run outside of the cluster, as long as it still has access to the Kubernetes API server.

When to Deploy

You should deploy your operator after you've tested it and are confident that it will behave as expected. You'd typically deploy it before you start creating instances of your custom resource, as the operator needs to be running to manage these instances. However, if you create instances of your custom resource before the operator is running, they will just sit idle until the operator starts up and begins managing them.

What Will Happen After Deployment

Once deployed, the operator will start watching for instances of your custom resource and react to any changes. Depending on how you've written your operator, it may also react to changes in other Kubernetes resources that it's designed to manage. For example, if your operator manages a database, it might watch for changes in Pods, Services, and PersistentVolumes in addition to its own custom resource.

Deploying an operator is a significant step in managing complex applications on Kubernetes, allowing you to extend Kubernetes’ functionality to suit your specific needs.

For deployment, we package our controller in a Docker image. Here’s a simple Dockerfile :

FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install kopf
CMD ["kopf", "run", "/app/controller.py"]

To build and push the Docker image:

docker build -t my-controller:latest .
docker push my-controller:latest

Finally, we deploy our controller to the Kubernetes cluster using a Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-controller
  template:
    metadata:
      labels:
        app: my-controller
    spec:
      containers:
        - name: my-controller
          image: my-controller:latest

Apply this deployment with kubectl :

$ kubectl apply -f controller.deployment.yml
deployment.apps/my-controller created
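One practical caveat: in most clusters the Deployment runs under the namespace's default ServiceAccount, which cannot watch CRDs or patch custom resources, so the operator would fail with permission errors. A minimal RBAC sketch to pair with the Deployment — the names are illustrative, and the exact rules depend on what your operator actually touches:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-controller          # reference via serviceAccountName in the pod spec
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: my-controller
rules:
  - apiGroups: ["example.com"]           # our custom API group
    resources: ["greetings"]
    verbs: ["list", "watch", "get", "patch"]
  - apiGroups: [""]
    resources: ["events"]                # Kopf posts events about handled objects
    verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: my-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: my-controller
subjects:
  - kind: ServiceAccount
    name: my-controller
    namespace: default
```

With this in place, add serviceAccountName: my-controller to the Deployment's pod spec so the operator actually runs under that account.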

And voila! You’ve just created a CRD, an instance of that CRD, a custom controller to watch for changes in that resource, and deployed the controller in your Kubernetes cluster.

Remember that Kubernetes CRDs and controllers are powerful tools that allow you to extend Kubernetes to suit your specific needs. While our example used a simple greeting message, you could create CRDs to represent any kind of resource — from IoT devices to complex application configurations — and controllers to manage these resources in virtually any way you can imagine. Happy Kubernetes-ing!

Some Existing and Popular CRDs/Operators

1. Prometheus Operator: It provides easy monitoring definitions for Kubernetes services and handles the deployment and management of Prometheus instances.

2. Strimzi: It’s an Operator for Apache Kafka, which makes it easy to run Apache Kafka on Kubernetes.

3. Rook: It is an open-source cloud-native storage orchestrator for Kubernetes, providing the platform, framework, and support for a diverse set of storage solutions to natively integrate with cloud-native environments.

4. KubeVirt: It is a virtual machine management add-on for Kubernetes. It aims to provide a common ground for virtualization solutions on top of Kubernetes.

5. Operator SDK: This is a framework designed to make it easy to write Kubernetes Operators. It provides high-level APIs, useful abstractions, and project scaffolding.

6. Jenkins Operator: This operator brings Kubernetes-native support for Jenkins, which makes it easy to manage and control Jenkins instances in a Kubernetes environment.

7. KubeSphere: It offers rich observability from infrastructure to applications, integrating your favorite tools for multi-dimensional monitoring metrics, multi-tenant log query and collection, and alerting and notification.

Crazy CRD Ideas to Work On

1. A CRD for managing AI models: This could include the model version, location of the model in a storage system, and the amount of resources required to run the model.

2. A CRD for managing feature flags: The CRD could specify the name of the feature flag, its current state, and which deployments it should affect.

3. A CRD for managing database migrations: This could specify the version of the database schema, the migrations that need to be run, and the order in which they should be applied.

Remember, the possibilities are vast with Kubernetes and CRDs, and you can create a CRD for almost any kind of resource that you would like to manage declaratively in your system.

What Next?

As we reach the end of this entry in our Kubernetes Operator series, we’ve laid the groundwork for some exciting exploration ahead. In our next post, we’re going to dive deep into the inner workings of an existing, robust operator: the Prometheus Operator. We will dissect its structure, explore its code, and understand how it effectively automates the tasks of deploying and managing Prometheus, a popular open-source system monitoring and alerting toolkit, within a Kubernetes cluster.

Studying the Prometheus Operator will give us invaluable insights into how a real-world operator is written and how it works, and serve as inspiration for our own operator development endeavors. As we unpack the Prometheus Operator's code, we'll identify key strategies and best practices that we can apply in our own operator development projects. So, get ready for an enriching journey into the heart of a successful Kubernetes Operator!

CRD for IoT

But we won’t stop there. In the subsequent installment of this series, we will shift our focus to the Internet of Things (IoT). We’ll discuss how Kubernetes Operators can be leveraged to create more efficient and scalable IoT solutions. We’ll explore the unique challenges that IoT presents and how the development of a custom resource definition (CRD) can address these challenges, leading to more robust, manageable, and powerful IoT deployments.

So, stay tuned for the upcoming posts in this series. We have an exciting journey ahead, filled with valuable insights and hands-on exploration of Kubernetes Operators. See you in the next post!

Looking for consulting, or need high-quality, hands-on expert help?

Lastly, if you’re looking for consulting in DevOps or help converting your deployments into CRDs, reach out to Buildbot Tech. They can provide expert guidance and assistance with all your DevOps needs.

Well that’s enough for today ha?

Buildbot Technologies Private Limited

We build your idea, We operate your product and We transfer the ownership