Extending Kubernetes: Deleting DynamoDB Partitions with a Custom Operator

Jitendra Takalkar
7 min read · Nov 18, 2023


Kubernetes Operator

Kubernetes, an open-source, production-grade container orchestration engine, automates the deployment, scaling, and management of distributed systems packaged as containers in a resilient manner. Its control plane includes the controller manager (kube-controller-manager), which runs the controller processes responsible for managing various Kubernetes resources, including Deployments, Pods, Jobs, and more.

Kubernetes Controller & Operator Pattern

The Kubernetes platform offers multiple ways to extend its capabilities, such as:

  • Plugins that extend client behavior (for example, kubectl plugins)
  • Scheduling extensions
  • API extensions (Custom Resources)

The usual controllers (a combination of a resource and a control loop) in Kubernetes might not fully cater to applications that require precise state maintenance and specialized management. That's where the Operator pattern steps in, extending the Kubernetes API's capabilities to handle these special needs.

A controller is a control loop that watches the shared state of the cluster through the API server and makes changes attempting to move the current state towards the desired state (reconciliation).

The official Kubernetes documentation uses a thermostat as the classic illustration of this control loop and controller pattern: you set a desired temperature, and the thermostat continuously works to bring the current temperature towards it.
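In code, this control loop appears as a reconcile function. Below is a minimal, hypothetical sketch in the style of controller-runtime (the library underpinning Kubebuilder); the Widget type and its API package are placeholders, not taken from the operators discussed later.

package controllers

import (
    "context"
    "time"

    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"

    examplev1 "example.com/widget-operator/api/v1alpha1" // hypothetical API package
)

// WidgetReconciler reconciles hypothetical Widget objects.
type WidgetReconciler struct {
    client.Client
}

// Reconcile is invoked whenever the watched object (or anything it owns) changes.
func (r *WidgetReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var widget examplev1.Widget
    if err := r.Get(ctx, req.NamespacedName, &widget); err != nil {
        // The object may already be gone; nothing left to reconcile.
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // Compare the desired state (widget.Spec) with the observed state in the
    // cluster, then create, update, or delete child resources so they converge.

    // Ask to be called again later so drift is corrected even without events.
    return ctrl.Result{RequeueAfter: time.Minute}, nil
}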

A controller is responsible for managing a resource by:

  • Maintaining the control loop.
  • Tracking at least one resource type (Kubernetes built-in or external).
  • Ensuring the resource maintains the desired state declared in its spec.

Resource

  • An endpoint in the Kubernetes API (apiVersion/group/kind)
  • Contains a collection of objects of one kind
  • Built-in Kubernetes resources such as Pods, ReplicaSets, Services, etc.
  • A Custom Resource is simply an extension of the Kubernetes API (see the sketch below)
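To make that last point concrete: with Kubebuilder, a custom resource is declared as a plain Go struct with marker comments, and the CRD manifest is generated from it. The sketch below is a rough, hypothetical approximation of the DeleteTablePartitionDataJob resource used later in this post (field names are taken from its sample manifest); it is not the repository's actual code, and the deep-copy methods Kubernetes requires are normally code-generated.

package v1alpha1

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// DeleteTablePartitionDataJobSpec mirrors the fields shown in the sample manifest.
type DeleteTablePartitionDataJobSpec struct {
    TableName      string `json:"tableName"`
    PartitionValue string `json:"partitionValue"`
    EndpointURL    string `json:"endpointURL,omitempty"`
    AwsRegion      string `json:"awsRegion"`
}

// +kubebuilder:object:root=true

// DeleteTablePartitionDataJob is served by the API server like any built-in
// resource once its CRD is installed.
type DeleteTablePartitionDataJob struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec DeleteTablePartitionDataJobSpec `json:"spec,omitempty"`
}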

Kubernetes Operator Toolkits

There is a dedicated set of tools for building operators. These toolkits let us extend Kubernetes in line with its design principles while focusing on the core control-loop logic. The best-known frameworks are as follows:

  • Kubebuilder (Go)
  • Operator Framework/operator-sdk (Helm/Ansible/Go)
  • Operator Framework/java-operator-sdk (Java)
  • Charmed/Juju (Go)
  • Kopf (Python)
  • KubeOps (dotnet)

When exploring operator development, I have personally experimented with the Operator Framework’s operator-sdk to build Helm operators and Kubebuilder for Golang-based implementations.

Operator-sdk simplifies the process of building operators by supporting Helm, Ansible, and Go, offering a versatile toolkit for different use cases.

On the other hand, Kubebuilder, known for its popularity and lower-level approach, offers a more direct entry point for implementing the control-loop logic in Go. Choose a framework based on project needs and team expertise.
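As a rough illustration of the Kubebuilder workflow (the domain, repository, group, and kind below are placeholders, not the ones used later in this post), scaffolding a new operator typically looks like this:

$ kubebuilder init --domain example.io --repo github.com/example/my-operator
$ kubebuilder create api --group apps --version v1alpha1 --kind MyJob
# implement the Spec/Status types and the Reconcile function in the generated files
$ make manifests # regenerate CRD and RBAC manifests from the Go markers
$ make install # install the CRDs into the current cluster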

Use-Case — Deleting DynamoDB Table Partitioned Data

I’ve recently developed a library that utilizes Go concurrency primitives, specifically Goroutines and Channels, to facilitate the deletion of partitioned data from a DynamoDB table.

In the process of creating a sophisticated multi-tenant system, we opted for DynamoDB as our NoSQL database, where individual tenant or customer data is stored in their designated partitions.

In scenarios where we need to delete large volumes of data from a particular customer's partition across multiple tables, this library demonstrates how such a task can be accomplished efficiently using the robust concurrency features of Golang.

DynamoDB Table Partition — Data Deletion

The library utilizes DynamoDB's QueryPaginator to efficiently paginate through large partitioned datasets. It employs Go's concurrency primitives, Goroutines and Channels, to parallelize data deletion in batches. Each Goroutine handles item deletion at the page level, breaking the data down into batches (of at most 25 items) and processing them concurrently. The library's primary goal is to delete large-scale partition data with minimal execution time.
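The snippet below is a simplified sketch of that pattern, not the library's actual code: it assumes a table whose partition key is named pk and sort key is named sk, uses the AWS SDK for Go v2, and omits the retry handling for unprocessed batch items that production code would need.

package deleter

import (
    "context"
    "log"
    "sync"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/service/dynamodb"
    "github.com/aws/aws-sdk-go-v2/service/dynamodb/types"
)

// deletePartition pages through one partition and deletes its items in
// concurrent batches of at most 25 (the BatchWriteItem limit).
func deletePartition(ctx context.Context, client *dynamodb.Client, table, partitionValue string) error {
    paginator := dynamodb.NewQueryPaginator(client, &dynamodb.QueryInput{
        TableName:                aws.String(table),
        KeyConditionExpression:   aws.String("#pk = :pk"),
        ExpressionAttributeNames: map[string]string{"#pk": "pk"}, // assumed key attribute names
        ExpressionAttributeValues: map[string]types.AttributeValue{
            ":pk": &types.AttributeValueMemberS{Value: partitionValue},
        },
        ProjectionExpression: aws.String("#pk, sk"), // only key attributes are needed for deletes
    })

    var wg sync.WaitGroup
    for paginator.HasMorePages() {
        page, err := paginator.NextPage(ctx)
        if err != nil {
            return err
        }
        // Split each page into batches of 25 and delete every batch in its own Goroutine.
        for start := 0; start < len(page.Items); start += 25 {
            end := start + 25
            if end > len(page.Items) {
                end = len(page.Items)
            }
            wg.Add(1)
            go func(items []map[string]types.AttributeValue) {
                defer wg.Done()
                requests := make([]types.WriteRequest, 0, len(items))
                for _, key := range items {
                    requests = append(requests, types.WriteRequest{
                        DeleteRequest: &types.DeleteRequest{Key: key},
                    })
                }
                // Real code should retry UnprocessedItems; omitted here for brevity.
                if _, err := client.BatchWriteItem(ctx, &dynamodb.BatchWriteItemInput{
                    RequestItems: map[string][]types.WriteRequest{table: requests},
                }); err != nil {
                    log.Printf("batch delete failed: %v", err)
                }
            }(page.Items[start:end])
        }
    }
    wg.Wait()
    return nil
}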

Kubernetes Local — Job — batch/v1

As outlined in the library's usage documentation, it can be executed in several ways: as a batch/v1 Job resource within a Kubernetes cluster, as a standalone Docker container, or as a local process.

Kubernetes Operator — Delete Table Partition Data (as a hands-on example)

Let's use this library to extend Kubernetes through hands-on experimentation by defining a Kubernetes Operator. Two operators are available to explore: a Helm-based one and a Go-based one.

Pre-requisites — Before Working with Operators

Before you start working with the ddbctl-dtp-helm-operator and ddbctl-dtp-operator, make sure you have the following prerequisites in place:

Kubernetes Local Cluster:

Ensure that you have a local Kubernetes cluster set up using microk8s. If you haven't set it up yet, follow the official documentation to install microk8s and start the cluster with the microk8s start command.

$ microk8s status

$ kubectl get nodes # local user having cluster-admin permissions

DynamoDB Local Kubernetes Deployment:

Follow the steps outlined in the DynamoDB local Kubernetes deployment guide to deploy DynamoDB locally within your microk8s cluster. This local DynamoDB instance will serve as the database for experimenting with the operator deployments and testing the working examples.
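For orientation, such a deployment boils down to something like the sketch below: the public amazon/dynamodb-local image exposed through a Service named aws-dynamodb-local on port 8000, which is the name and port the custom resources later in this post point at. The linked guide remains the authoritative source.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aws-dynamodb-local
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aws-dynamodb-local
  template:
    metadata:
      labels:
        app: aws-dynamodb-local
    spec:
      containers:
        - name: dynamodb-local
          image: amazon/dynamodb-local
          ports:
            - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: aws-dynamodb-local
spec:
  selector:
    app: aws-dynamodb-local
  ports:
    - port: 8000
      targetPort: 8000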

Kubernetes Helm Operator — DdbctlDtpJob (as a hands-on example)

Clone the ddbctl-dtp-helm-operator GitHub repository:

$ cd ~/workspace/git.ws
$ git clone https://github.com/jittakal/ddbctl-dtp-helm-operator.git
$ cd ddbctl-dtp-helm-operator

Let's deploy the operator in the local Kubernetes Cluster

$ make deploy # ~/.kube/config - cluster-admin permissions

Verify operator deployment

$ kubectl get namespaces # new namespace - ddbctl-dtp-helm-operator-system

$ kubectl get deployment -n ddbctl-dtp-helm-operator-system # deployment for controller manager

$ kubectl get pods -n ddbctl-dtp-helm-operator-system # pod for controller-manager

$ kubectl get crd # crd entry for ddbctldtpjobs

Delete Table Partition Data using Kubernetes Helm Operator

Modify the sample custom resource, or create a new YAML file based on it, at config/samples/ddbctl.dtp.charts_v1alpha1_ddbctldtpjob.yaml:

apiVersion: ddbctl.dtp.charts.operators.jittakal.io/v1alpha1
kind: DdbctlDtpJob
metadata:
  name: ddbctldtpjob-sample
spec:
  ddbCtlDtpJob:
    awsRegion: us-east-1
    endpointURL: http://aws-dynamodb-local.default.svc.cluster.local:8000
    partitionValue: TESTTENANTID
    tableName: Orders

Suppose our local DynamoDB instance is running within the same Kubernetes cluster, and we have an “Orders” table with partitions organized by a tenant identifier. In this context, we’ve identified the partition with the tenant identifier value “TESTTENANTID” as the candidate for deletion.

$ kubectl apply -f config/samples/ddbctl.dtp.charts_v1alpha1_ddbctldtpjob.yaml

# Verify the Job deployment
$ kubectl get DdbctlDtpJob # List our custom resource

$ kubectl get pods # Job pod should be running/completed

Validate the Job Pod's logs, where the deletion summary is written:

$ kubectl get pods -n default
$ kubectl logs -f <<pod-name-from-above-command>> -n default # Summary log
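To double-check the result against DynamoDB itself, one option is to port-forward the local DynamoDB service and run a COUNT query against the partition. The partition key attribute name below (tenant_id) is an assumption for illustration; use whatever key attribute your Orders table actually defines.

$ kubectl port-forward svc/aws-dynamodb-local 8000:8000

# in another terminal; the count should be 0 once the Job has completed
$ aws dynamodb query \
    --endpoint-url http://localhost:8000 \
    --region us-east-1 \
    --table-name Orders \
    --key-condition-expression "tenant_id = :p" \
    --expression-attribute-values '{":p": {"S": "TESTTENANTID"}}' \
    --select COUNT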

Kubernetes Operator — DeleteTablePartitionJob (as a hands-on example)

As part of writing a Kubernetes operator, we have to identify the Custom Resource (CR) and its definition (CRD). In our example, the resource managed under the custom resource is the built-in batch/v1 Job, and the plan is to automate, as part of the control loop, housekeeping tasks that would otherwise be manual (a sketch follows the list):

  • Automatically remove successfully completed Jobs after a short retention period (for example, 5 or 30 minutes).
  • Automatically prune failed Jobs so that only the most recent five are retained.
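A rough sketch of how that housekeeping could look inside the controller's reconcile logic is shown below (controller-runtime style). It is not the repository's actual implementation; the owner label, retention period, and history size are assumptions for illustration.

package controller

import (
    "context"
    "sort"
    "time"

    batchv1 "k8s.io/api/batch/v1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

const (
    completedJobTTL  = 5 * time.Minute // retention for succeeded Jobs (could equally be 30m)
    failedJobHistory = 5               // number of failed Jobs to keep
)

// cleanupJobs removes old child Jobs owned by one custom resource instance.
func cleanupJobs(ctx context.Context, c client.Client, namespace, crName string) error {
    var jobs batchv1.JobList
    if err := c.List(ctx, &jobs, client.InNamespace(namespace),
        client.MatchingLabels{"ddbctl.jittakal.io/owner": crName}); err != nil { // assumed label
        return err
    }

    var failed []batchv1.Job
    for i := range jobs.Items {
        job := jobs.Items[i]
        switch {
        case job.Status.Succeeded > 0:
            // Delete successfully completed Jobs once the retention period has elapsed.
            if job.Status.CompletionTime != nil &&
                time.Since(job.Status.CompletionTime.Time) > completedJobTTL {
                if err := c.Delete(ctx, &job); err != nil {
                    return err
                }
            }
        case job.Status.Failed > 0:
            failed = append(failed, job)
        }
    }

    // Keep only the most recent failed Jobs; delete the rest.
    sort.Slice(failed, func(i, j int) bool {
        return failed[i].CreationTimestamp.After(failed[j].CreationTimestamp.Time)
    })
    for i := failedJobHistory; i < len(failed); i++ {
        if err := c.Delete(ctx, &failed[i]); err != nil {
            return err
        }
    }
    return nil
}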

Clone the ddbctl-dtp-operator GitHub repository:

$ cd ~/workspace/git.ws
$ git clone https://github.com/jittakal/ddbctl-dtp-operator.git
$ cd ddbctl-dtp-operator

Let's deploy the operator in the local Kubernetes Cluster

$ make deploy # ~/.kube/config - cluster-admin permissions

Verify operator deployment

$ kubectl get namespaces # new namespace - ddbctl-dtp-operator-system
$ kubectl get deployment -n ddbctl-dtp-operator-system # deployment for controller manager

$ kubectl get pods -n ddbctl-dtp-operator-system # pod for controller-manager

$ kubectl get crd # crd entry for deletetablepartitiondatajob

Delete Table Partition Data using Kubernetes Golang-based Operator

Modify the sample custom resource, or create a new YAML file based on it, at config/samples/ddbctl_v1alpha1_deletetablepartitiondatajob_orders.yaml:

apiVersion: ddbctl.operators.jittakal.io/v1alpha1
kind: DeleteTablePartitionDataJob
metadata:
  labels:
    app.kubernetes.io/name: deletetablepartitiondatajob
    app.kubernetes.io/instance: deletetablepartitiondatajob-orders
    app.kubernetes.io/part-of: ddbctl-dtp-operator
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: ddbctl-dtp-operator
  name: deletetablepartitiondatajob-orders
spec:
  tableName: Orders
  partitionValue: TESTTENANTID
  endpointURL: http://aws-dynamodb-local.default.svc.cluster.local:8000
  awsRegion: us-east-1

$ kubectl apply -f config/samples/ddbctl_v1alpha1_deletetablepartitiondatajob_orders.yaml

# Verify the Job deployment
$ kubectl get DeleteTablePartitionDataJob

$ kubectl get pods # Job pod should be running/completed

Taking a step further, we explored how this library could be transformed into a Kubernetes Operator. The practical walkthrough included the necessary prerequisites, deployment steps, and validation procedures, showcasing the versatility of the library in different Kubernetes environments.

Security Considerations for Kubernetes Clusters

The identification and integration of third-party Kubernetes operators impose an extra responsibility on the security team to validate and certify that these operators do not misuse the cluster’s resources. Default RBAC settings for operators tend to be permissive, potentially granting excessive permissions beyond what is strictly necessary.
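Before installing an operator like the ones above, it is worth reviewing the ClusterRole it ships with and trimming it to what the controller actually needs. As a rough illustration only (the exact rules depend on the operator's implementation), a Go-based operator of this kind should be able to work with rules along these lines rather than wildcard access:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ddbctl-dtp-operator-minimal # illustrative name
rules:
  - apiGroups: ["ddbctl.operators.jittakal.io"]
    resources: ["deletetablepartitiondatajobs", "deletetablepartitiondatajobs/status"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "create", "delete"]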

Conclusion

The journey from the basics of Kubernetes to hands-on experimentation with custom operators demonstrates the power of extending Kubernetes capabilities to address specific application requirements. The ability to leverage custom operators not only enhances automation but also provides a scalable and efficient solution tailored to the unique needs of each application.


Note: if you appreciate the blog content, please consider giving it more claps on Medium. Your support is valuable and encourages the creation of more content.

Thank you!
