When and Why Create a Kubernetes Operator?

Rachit Arora
IBM Data Science in Practice
5 min read · May 1, 2019

Introduction: Organizations need to run more and more analytical jobs to gain insight into their data and benefit from it. Moving a big data pipeline to the cloud makes sense for this, because the cloud lets you build and deploy analytical jobs within seconds, with a simplified user experience, scalability, and reliability. But it is a huge challenge for cloud providers to meet all of these expectations, such as availability standards of 99.9999% and reduced operations.

My team has been working to provide such a service on the cloud. Kubernetes and related technologies have been really helpful in bringing down operational effort and failure rates, but adopting Kubernetes is not by itself the mantra that achieves all your goals; how you design your components on top of it also matters. One such component that has been very helpful is the Kubernetes Operator. Kubernetes is the de facto choice for running almost all components of a containerized pipeline. This pipeline may be deploying microservices, running custom applications, or running applications like Spark.

For generic use cases, such as running microservices or `Hello world` programs, your requirements are met by the default resource definitions that Kubernetes provides, like Pod, Deployment, and PVC. For advanced use cases, however, you may need to create Custom Resource Definitions (CRDs).

Kubernetes is a very extensible platform (from version 1.7 onward): admins can define their own resource types, which provide a more domain-specific schema, and after creating CRDs they can deploy instances of them like normal resources. CRDs are only a means of specifying your requirement in declarative form; you still need controllers to monitor a resource's state and reconcile it to match the configuration. This is where Operators come into play.
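
To make "monitor its state and reconcile" a little more concrete, here is a minimal sketch of a single reconcile step in plain Python. It does not call any Kubernetes API. The function name `reconcile`, the replica-count spec, and the instance names are all assumptions made up for illustration; a real controller would watch the API server and act on actual resources.

```python
from __future__ import annotations


def reconcile(desired_replicas: int, running: list[str]) -> dict:
    """Compare the declared spec with the observed state and return a plan.

    `desired_replicas` stands in for the spec of a hypothetical custom
    resource; `running` is the list of instance names observed right now.
    """
    missing = desired_replicas - len(running)
    if missing > 0:
        # Fewer instances than declared: ask for the difference to be created.
        return {"create": missing, "delete": []}
    if missing < 0:
        # More instances than declared: delete the surplus, picked from the
        # end of the sorted names so the choice is deterministic.
        return {"create": 0, "delete": sorted(running)[missing:]}
    # Observed state already matches the spec: nothing to do.
    return {"create": 0, "delete": []}


if __name__ == "__main__":
    print(reconcile(3, ["spark-0"]))
    print(reconcile(1, ["spark-0", "spark-1", "spark-2"]))
```

The point of the sketch is that the controller never receives an instruction like "add two workers"; it only sees the declared state and the observed state, and works out the difference each time it runs.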

Kubernetes Operator on OperatorHub

There are many good articles like this one that explain how you can create an Operator. The purpose of this post is to describe some of the use cases where you can create an Operator to perform the same activity in a more effective, Kubernetes-native way.

  1. When defining custom applications like Spark, Kafka, Cassandra, and ZooKeeper. Without an Operator, you need to create many microservices to manage the lifecycle of these applications, or use command-line interfaces like spark-submit to submit a Spark job. With an Operator, you can easily create instances of a Spark application. Here is a link to try out the Spark operator. With a Kubernetes Operator you can define applications such as a Spark application in declarative form and submit them in standard ways, such as kubectl or calls to the Kubernetes API server.
  2. For using a service mesh: Istio operators can help you set one up easily. Istio addresses the challenges developers and operators face as monolithic applications transition towards a distributed microservice architecture. With the help of the Istio operator you can secure service-to-service communication in a cluster with strong identity-based authentication and authorization. Istio also provides a pluggable policy layer and a configuration API supporting access controls, rate limits, and quotas.
  3. Setting up your own etcd cluster to get a scalable key-value store of your own.
  4. You can create separate operators to manage Kubernetes worker nodes. When a new worker node is added to your cluster, even if the API server says the worker is ready, it is not really ready to take your requests. You can create an operator that performs node management activities such as pulling Docker images, installing volume plugins, and applying labels.
  5. As a replacement for CronJobs: you can create CRDs for caching or clean-up activities.
  6. Kubernetes operators are helpful for stateful applications such as databases. Adding or removing instances may require preparation and post-provisioning steps, like changing internal configuration, communicating with a clustering mechanism, or interacting with external systems such as DNS. This often requires manual steps, which increases the DevOps burden and the likelihood of error.
  7. Operations such as creating backups or standby snapshots of the data require a few extra steps and are often done manually or by a CronJob, but they can be done by an Operator.
  8. Kubernetes Operators can help in managing complex administrative tasks, like granting access and monitoring, easily and in a standardized way. You can create an operator to enforce cluster-level Kubernetes policies, e.g. `do not allow` privileged pods.
  9. Kubernetes operators can be helpful in enforcing security policies, e.g. `do not create` pods from a vulnerable image version.
  10. With Operators one can create templates that can be reused and adapted to automate application management, with no need to reinvent the wheel every time.
  11. Another place an operator can be used is recovering from faults of your Kubernetes worker nodes and handling your response to a pod restart.
  12. If you create pods on demand, as the result of an API call or to serve requests such as creating Spark jobs, a request can sometimes fail and leave behind dangling pods that need to be cleaned up. You can write operators to look for such pods and clean them up. This frees compute resources that can be used by other pods.
  13. Operators are very useful in multi-cloud and hybrid cloud environments. You can create an Operator to carry out management activities or configuration management when you need to work against Kubernetes clusters from different cloud providers. Operators allow developers to do what they do best without necessarily needing domain expertise in a particular infrastructure environment. An Operator can be used to provision multiple applications in a consistent manner while adhering to best practices for that particular installation.
  14. You can create a remedy controller operator that takes corrective action on negative events. Examples of negative events are Docker overlay network failures (sandbox errors), a worker node's disk getting full, mount failures for a pod on specific nodes, and kernel-related errors on a Kubernetes worker. One can have two operators to handle such scenarios: one to detect the problem and a second to take remedial action. Most of the time the remedial action is to cordon the worker node and then either create a new one or reload/restart the existing node.
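
The detect-then-remediate split from the last item can be sketched roughly as follows. This is plain Python with no Kubernetes client involved. The event reasons in `NEGATIVE_REASONS`, the `threshold` of three events, and the function names `detect` and `remediate` are all assumptions for illustration; real clusters report different reason strings, and a real remediator would call the Kubernetes API or the cloud provider instead of returning strings.

```python
from dataclasses import dataclass

# Hypothetical event reasons, chosen to mirror the examples in the list above.
NEGATIVE_REASONS = {"SandboxError", "DiskFull", "MountFailure", "KernelError"}


@dataclass(frozen=True)
class Event:
    node: str    # worker node the event was reported for
    reason: str  # short reason string attached to the event


def detect(events, threshold=3):
    """Detector: return nodes with at least `threshold` negative events."""
    counts = {}
    for event in events:
        if event.reason in NEGATIVE_REASONS:
            counts[event.node] = counts.get(event.node, 0) + 1
    return sorted(node for node, count in counts.items() if count >= threshold)


def remediate(node, can_replace):
    """Remediator: always cordon first, then replace or restart the node."""
    follow_up = "replace" if can_replace else "restart"
    return [f"cordon {node}", f"{follow_up} {node}"]


if __name__ == "__main__":
    events = [Event("worker-1", "DiskFull")] * 3 + [Event("worker-2", "MountFailure")]
    for bad_node in detect(events):
        print(remediate(bad_node, can_replace=False))
```

Keeping detection and remediation apart means the detection rules can be tuned, or run in report-only mode, without touching the code that is allowed to cordon or restart nodes.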

In summary, Operators are controllers working in association with custom resources to perform tasks that DevOps engineers or other humans would otherwise have to handle. Operators can manage the installation, scaling, updates, and overall lifecycle of a stateful clustered application. Operators serve as a design pattern and as reusable templates for various tasks in an application's lifecycle.

Rachit Arora
IBM Data Science in Practice

Software Architect. Expert in building cloud services. Loves Kubernetes, containers and Bigdata.