Deploy Cassandra on Multiple Kubernetes Clusters and Clouds Using K8ssandra Operator
Author: Raghavan Srinivas
Apache Cassandra® predates the public cloud: it's a powerful, open-source database that scales horizontally to match the needs of an organization. With Kubernetes emerging as the leading platform for deploying and managing containerized systems in the cloud, K8ssandra (Cassandra running on Kubernetes) builds on the advantages and flexibility of Cassandra, combining the best of both worlds.
K8ssandra is an open-source, cloud-native, production-ready platform for deploying Cassandra and the required tools on Kubernetes. In other words, K8ssandra provides a complete Cassandra-Kubernetes ecosystem.
Apart from managing the database, K8ssandra also supports the infrastructure for monitoring and optimizing data management. It offers an ecosystem of tools to provide richer data APIs and automated operations alongside Cassandra, such as Reaper, Medusa, Helm, Prometheus and Grafana, Traefik, and Stargate.
Part of the overall K8ssandra project is the recently introduced K8ssandra Operator, which provides single- and multi-cluster, multi-region support for K8ssandra deployments on Kubernetes. In short, with this operator, all of the components for metrics, backups, and more are installed and wired together for you.
In this post, we’ll discuss how to deploy Cassandra on multiple clusters in multiple clouds using the K8ssandra Operator. You’ll find some related resources for this experiment about the K8ssandra Operator on GitHub. Let’s begin!
Step 1: Set up Aviatrix
You need to set up the VPC, networking, and peering for Cassandra to run on multiple clusters, and we’re using the Aviatrix controller for this exercise.
Aviatrix offers multi-cloud native networking solutions that let you monitor, manage, and automate the handling of your VPC networks. The controller also enables connectivity between different clouds, with support for AWS, Azure, and Google Cloud Platform. This lets you create clusters on these platforms and run Cassandra across the different clouds. Other technologies and products enable this as well, but we used Aviatrix here.
Here’s what the setup looks like:
The details above show that Amazon’s EKS is running on 10.1.0.0/23, Azure’s AKS on 10.2.0.0/23, and Google’s GKE on 10.3.0.0/23. To set this up, use the Terraform scripts on GitHub; they’re an opinionated implementation, but they make it easy to set up and tear down the environment.
Once you’ve onboarded the account (we haven’t covered those steps here), you can configure the controller IP, username, and password. The remaining Terraform scripts use this information to set up the networking and to reference the different cloud accounts. At the end of this setup, when you run “terraform apply”, you’ll have Kubernetes clusters on three clouds, with the pods in each cluster able to talk to each other.
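Assuming the repository’s Terraform layout, the flow looks roughly like this. The variable names below are illustrative, not necessarily the exact ones in the scripts, so check the repo’s variable definitions before running:

```shell
# Supply the Aviatrix controller details collected during onboarding.
# (Illustrative variable names; see the repo's variables.tf for the real ones.)
export TF_VAR_controller_ip="203.0.113.10"
export TF_VAR_controller_username="admin"
export TF_VAR_controller_password="********"

terraform init     # download the Aviatrix and cloud providers
terraform plan     # review the VPCs, transit peering, and cluster resources
terraform apply    # create EKS, AKS, and GKE plus the cross-cloud networking

# When you're done experimenting, tear everything down with:
# terraform destroy
```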
To verify your setup, go to the Aviatrix controller and click each cluster to see which node(s) it was set up in. EKS should be in 10.1.0.0/23, AKS in 10.2.0.0/23, and GKE in 10.3.0.0/23.
Step 2: Install Jetstack Cert Manager
Follow the steps in this documentation binder and adapt them to your multi-cluster setup, as the binder targets kind clusters. All you have to do is install cert-manager in your EKS and GKE clusters.
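Using Helm, the cert-manager installation can be sketched as follows. The kubectl context names here are placeholders for whatever your Terraform setup produced:

```shell
# Install cert-manager into each data plane cluster.
# "eks-cluster" and "gke-cluster" are example context names.
for ctx in eks-cluster gke-cluster; do
  kubectl config use-context "$ctx"
  helm repo add jetstack https://charts.jetstack.io
  helm repo update
  helm install cert-manager jetstack/cert-manager \
    --namespace cert-manager --create-namespace \
    --set installCRDs=true
done
```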
Step 3: Set up control and data planes
Prior to the introduction of the K8ssandra Operator, if you wanted a multi-cluster setup, you had to manually inject the seeds from one cluster into another. With the K8ssandra Operator, you don’t have to do this manually: the control plane takes care of installing and wiring up Cassandra across Kubernetes clusters in multiple regions or clouds.
Install the K8ssandra Operator on your AKS cluster, which will act as the control plane. Once the control plane is set up, install the operator on your EKS and GKE clusters, which will act as your data planes. To mark a cluster as a data plane, remember to set the K8SSANDRA_CONTROL_PLANE environment variable to “false”.
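With the Helm chart, the installation can be sketched like this. The context names are placeholders, and the chart’s `controlPlane` value (which drives the K8SSANDRA_CONTROL_PLANE environment variable) is assumed from the chart’s documented defaults; verify against the version you install:

```shell
helm repo add k8ssandra https://helm.k8ssandra.io/stable
helm repo update

# Control plane on AKS (the chart's controlPlane value defaults to true):
kubectl config use-context aks-cluster
helm install k8ssandra-operator k8ssandra/k8ssandra-operator \
  -n k8ssandra-operator --create-namespace

# Data planes on EKS and GKE, explicitly marked as non-control-plane:
for ctx in eks-cluster gke-cluster; do
  kubectl config use-context "$ctx"
  helm install k8ssandra-operator k8ssandra/k8ssandra-operator \
    -n k8ssandra-operator --create-namespace \
    --set controlPlane=false
done

# Spot-check the environment variable on any cluster:
kubectl get deployment k8ssandra-operator -n k8ssandra-operator \
  -o jsonpath='{.spec.template.spec.containers[0].env}'
```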
Double check your clusters to see if they correctly function as the control and data planes.
Step 4: Install the client configuration
Installing the client configuration(s) in the control plane cluster is a crucial part of the multi-cluster installation, as it tells the control plane which data planes belong to it.
This step takes the client configuration files generated from the data plane clusters and installs them in AKS (the control plane). You’ll find the installation instructions on GitHub.
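The k8ssandra-operator repository ships a helper script for this. A sketch of its use is below; the flag names follow the script’s documented interface, and the context names are the same placeholders as before, so adapt both to your environment:

```shell
# Run once per data plane, from a checkout of the k8ssandra-operator repo.
# --src-context is the data plane; --dest-context is the control plane (AKS).
scripts/create-clientconfig.sh \
  --src-kubeconfig ~/.kube/config --src-context eks-cluster \
  --dest-kubeconfig ~/.kube/config --dest-context aks-cluster \
  --namespace k8ssandra-operator

scripts/create-clientconfig.sh \
  --src-kubeconfig ~/.kube/config --src-context gke-cluster \
  --dest-kubeconfig ~/.kube/config --dest-context aks-cluster \
  --namespace k8ssandra-operator

# The control plane should now list one ClientConfig per data plane:
kubectl --context aks-cluster get clientconfigs -n k8ssandra-operator
```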
Step 5: Deploy a K8ssandra cluster
In this step, the control plane on AKS installs the K8ssandra cluster and the Stargate API according to your configuration specifications. The resulting cluster uses the gossip protocol that Cassandra relies on, allowing the nodes across clouds to talk to one another.
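A minimal K8ssandraCluster manifest, applied to the control plane cluster, might look like the sketch below. The cluster name, datacenter names, Cassandra version, storage class, and sizes are illustrative; the important part is that each datacenter’s `k8sContext` matches a ClientConfig registered in the previous step:

```yaml
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
  namespace: k8ssandra-operator
spec:
  cassandra:
    serverVersion: "4.0.3"        # example version; pick a supported one
    datacenters:
      - metadata:
          name: dc1
        k8sContext: eks-cluster   # must match a ClientConfig context name
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: standard
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi
      - metadata:
          name: dc2
        k8sContext: gke-cluster
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: standard
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi
  stargate:
    size: 1
```

Apply it with something like `kubectl --context aks-cluster apply -f k8c.yaml` and the operator reconciles the datacenters out to the data planes.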
Once the installation is complete, you can verify the cluster status by running cqlsh or “nodetool status” on one of the data nodes.
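For example, assuming the illustrative cluster name `demo` and datacenter `dc1` from above (pod and secret names will follow whatever names you chose in your manifest):

```shell
# List the Cassandra pods created in one of the data planes:
kubectl --context eks-cluster -n k8ssandra-operator get pods

# K8ssandra enables authentication by default; the superuser credentials
# live in a secret named after the cluster:
kubectl --context aks-cluster -n k8ssandra-operator get secret demo-superuser \
  -o jsonpath='{.data.password}' | base64 -d

# Check the ring from inside a Cassandra container. All nodes across
# both datacenters should report UN (Up/Normal):
kubectl --context eks-cluster -n k8ssandra-operator \
  exec -it demo-dc1-default-sts-0 -c cassandra -- nodetool status
```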
Deploying Cassandra on multiple clusters is made a lot easier with the K8ssandra Operator. The tutorial above walks through setting up the Aviatrix controller to enable cross-cloud networking, installing the control plane and the data plane(s), injecting the data plane configs into the control plane, and finally installing, from the control plane, a K8ssandra cluster that spans the data plane(s).
Follow the DataStax Tech Blog for more developer stories. Check out our YouTube channel for more workshops and tutorials and DataStax Developers on Twitter for the latest news about our developer community.
- Apache Cassandra®
- Microsoft Azure
- Google Cloud Platform
- K8ssandra Documentation
- GitHub: K8ssandra Workshop
- DataStax Discord
- What’s New With The K8ssandra Operator
- Integrating Apache Cassandra® and Kubernetes through K8ssandra
- Recording of presentation and demo at Kubernetes Community Days, Chennai