Multi-Region Apache Cassandra on Azure Kubernetes Service with the K8ssandra Operator (Part One)
Author: Jeff DiNoto
This is part one of a two-part series that demonstrates how the K8ssandra Operator simplifies deploying and managing a multi-region Apache Cassandra® cluster within Azure Kubernetes Service (AKS).
Among the many capabilities and features that the new K8ssandra Operator brings to the table, perhaps the most exciting is the ability to deploy and manage Cassandra clusters whose data centers span multiple regions, really bringing the distributed power of Apache Cassandra to the Kubernetes landscape. Let’s take a look at how this can be done on Azure Kubernetes Service (AKS).
It’s worth noting that while we’re going to focus on an example using AKS, the same basic approach applies to any provider environment. What will differ between them is primarily the means by which network connectivity is established between the Kubernetes clusters.
Deployment architecture
In this two-part series, we’ll use the K8ssandra Operator to deploy and manage an Apache Cassandra cluster with three data centers distributed across three different regions within Azure to demonstrate a method for providing a high level of availability and geographic fault tolerance.
Figure 1 depicts a simple outline of the Cassandra cluster we’re going to deploy. In addition to the three regions where our data centers will be deployed, we’ll use a fourth region, East US, to host our K8ssandra control plane, as we’ll see shortly.
This first post will focus on deploying and configuring the four clusters in different regions, then in part two you’ll see how to deploy the K8ssandra Operator itself and build a K8ssandraCluster.
With that, let’s get started.
Building our clusters
We want to deploy a series of AKS clusters, one in each target region. In our example, each cluster is identical except for the way the networking is configured, so we’ll walk through one example that can be replicated as many times as necessary. Figure 2 below provides a high-level view of the Kubernetes cluster deployment we want to create.
Before we start building our clusters, let’s discuss the networking configuration depicted in Figure 2.
Networking can be hard
The basic requirement for any multi-region or multi-cluster deployment to work with K8ssandra Operator is that there must be routable connectivity between the pods within each cluster. Luckily, this is a pretty easy requirement to achieve within AKS. Let’s take a look at how we can do that.
Within Azure, the approach to connectivity that we’ll take is to deploy a series of Virtual Network Gateways and provide VPN connections between them to connect the clusters.
When configuring each cluster, a virtual network will be created. There are a few important things to note for each of these configurations:
- The network spaces of each cluster should not overlap
- A separate subnet should be created to support Pod deployment and Virtual Network Gateway deployment
- The network spaces should be large enough to support the expected number of Pods within each cluster
Figure 2 above depicts the subnets that will be created to provide for these requirements. We’ll talk more about how to configure these networking elements as we go through the cluster deployment process.
Deploying our clusters
We should note here that the deployment you’ll see in this example isn’t scaled to achieve production-quality performance. We’re using this example primarily to demonstrate the regional distribution. For your own environments, the sizing and configuration of these clusters would vary from what you’ll see here.
To achieve the deployment depicted in Figure 2, we will need a series of four clusters, each in a different region: East US, West US, Central US, and Canada Central. Let’s walk through the deployment of the first cluster in the East US region. This cluster will eventually host our K8ssandra Operator control plane.
Here we’ll be using the Azure Portal to create and manage our resources. There are other ways to achieve the same goal, for example with Azure deployment templates or Terraform.
So, within the Azure Portal we navigate to the Kubernetes Service and follow the launch points to create a new cluster.
Figure 3 shows the basic configuration applied to the cluster. Here we’re creating a cluster with four nodes within the East US region.
Continuing with the configuration of our cluster, the most critical aspect to address is the networking configuration. This is where we’ll create the virtual network and subnets to support the connectivity requirements previously discussed.
Figure 4 below depicts the initial networking configuration screen. It’s important at this stage to switch from the “Kubenet” configuration to “Azure CNI”. Azure CNI provides the additional capabilities and configuration options that will be required for multi-region connectivity.
After selecting the “Azure CNI” option, a link to create a new virtual network will appear. Select that option to configure the network for this cluster. This is where the subnet configuration depicted in Figure 2 will be provided.
For this first cluster we’ll be using the 10.1.0.0/16 address space, as shown in Figure 5.
We then create two separate subnets. The default subnet will be used by the pods, while the gateway subnet will be used by the gateway we need to deploy to establish cross-cluster connectivity.
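If you prefer to script this step rather than use the Portal, the same virtual network and subnets can be created with the Azure CLI. The sketch below is only illustrative: the resource group, virtual network name, and subnet prefixes are assumptions, though the 10.1.0.0/16 address space matches what we use for this first cluster, and Azure requires a VPN gateway’s subnet to be named GatewaySubnet.

```bash
# Resource group and names below are illustrative; adjust to your environment.
az group create --name k8ssandra-dev --location eastus

# Virtual network for cluster 1 with a subnet for AKS Pods.
az network vnet create \
  --resource-group k8ssandra-dev \
  --name k8ssandra-dev-vnet-1 \
  --location eastus \
  --address-prefixes 10.1.0.0/16 \
  --subnet-name default \
  --subnet-prefixes 10.1.0.0/17

# Separate subnet for the Virtual Network Gateway (must be named GatewaySubnet).
az network vnet subnet create \
  --resource-group k8ssandra-dev \
  --vnet-name k8ssandra-dev-vnet-1 \
  --name GatewaySubnet \
  --address-prefixes 10.1.128.0/27
```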
Depending on the ingress needs of your deployment, it may also be useful to enable HTTP application routing, as shown in Figure 6.
We’ll continue working through the cluster creation process and kick off the creation of the cluster, as summarized in Figure 7.
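For reference, a roughly equivalent cluster could also be created from the Azure CLI. This is just a sketch under the same illustrative names as above; the important flags are the Azure CNI network plugin and the Pod subnet created earlier.

```bash
# Look up the ID of the Pod subnet created above.
SUBNET_ID=$(az network vnet subnet show \
  --resource-group k8ssandra-dev \
  --vnet-name k8ssandra-dev-vnet-1 \
  --name default \
  --query id -o tsv)

# Create a four-node AKS cluster in East US using Azure CNI networking.
az aks create \
  --resource-group k8ssandra-dev \
  --name k8ssandra-dev-cluster-1 \
  --location eastus \
  --node-count 4 \
  --network-plugin azure \
  --vnet-subnet-id "$SUBNET_ID" \
  --generate-ssh-keys
```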
Next, we repeat this process to deploy the remaining clusters. Each of the clusters will be configured with the appropriate network settings as previously described. Once the deployment is complete, we’ll see the following in the Kubernetes Service:
Establishing cluster connectivity
As a reminder, if we want to establish the required connectivity between each cluster, we’ll need to deploy a series of Virtual Network Gateways and connect them to each other. Let’s take a look at that process within the Azure Portal.
Virtual network gateway deployment
To get started, we’ll navigate to the Virtual Network Gateway home within the portal and select the “Create” option, as shown in Figure 9.
There are a number of settings to properly configure our gateway for connectivity between clusters. Figure 10 below shows all of the settings we’ll use for this first gateway, which will be associated with our first cluster in the East US region. There are a few settings here worth highlighting:
- Gateway Type. We will use a VPN.
- VPN Type. We will use Route-based.
- SKU/Generation. We’ll use the default. As with our modest node configuration, other environments may require or prefer a more capable gateway SKU.
- Virtual Network. Select the virtual network that was configured for the cluster this gateway will support.
- Subnet. Ensure that the gateway subnet created earlier, separate from the subnet used for Pod deployment, is selected; it may be selected by default.
- Public IP Address. A public IP address should be created and configured for each gateway device.
Complete the creation process and deploy the virtual gateway. Repeat this same process for each region and cluster — making the relevant selections for each.
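If you’d rather create the gateways from the CLI, a sketch for the first region might look like the following. The names and SKU are illustrative, but the gateway type, VPN type, and dedicated public IP address mirror the settings described above.

```bash
# Public IP address for the gateway (one per gateway); names are illustrative.
az network public-ip create \
  --resource-group k8ssandra-dev \
  --name k8ssandra-dev-vnet-1-vpn-gateway-ip \
  --sku Standard \
  --allocation-method Static

# Route-based VPN gateway attached to the cluster's virtual network;
# it is deployed into the GatewaySubnet created earlier.
az network vnet-gateway create \
  --resource-group k8ssandra-dev \
  --name k8ssandra-dev-vnet-1-vpn-gateway \
  --location eastus \
  --vnet k8ssandra-dev-vnet-1 \
  --public-ip-address k8ssandra-dev-vnet-1-vpn-gateway-ip \
  --gateway-type Vpn \
  --vpn-type RouteBased \
  --sku VpnGw1 \
  --no-wait
```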
After completing the full set of gateway deployments, we’re left with these gateways:
With a gateway deployed within each virtual network, we can now make the VPN connections that enable communication between each of our clusters.
Establish cross-gateway connections
To do this, we’ll return to the virtual network gateway home within the Azure Portal. We’ll start with our first gateway, k8ssandra-dev-vnet-1-vpn-gateway.
Here we’re looking for the “Connections” menu within the portal.
We’ll need to create connections to each of the other gateways that have been deployed. We’ll then repeat this process across all gateways, establishing a full mesh of connectivity between the gateways. To make a connection, click the “Add” button and get started with the connection configuration.
Figure 13 shows the configuration settings to connect cluster “1” to cluster “2”. The key configuration elements here are the selection of the VNet-to-VNet connection type and the correct virtual network gateway from the remote cluster. Both sides of the connection must provide the same shared secret, which is specified in the Shared key field.
Because the connections must be established bidirectionally, when the first connection is created it’ll initially show with a “not connected” status. To complete the connection, we follow the same procedure, but coming from the other direction.
We’ll create a new connection from within the k8ssandra-dev-vnet-2-vpn-gateway and configure the connection to go to cluster “1”. Here’s where we must provide the same Shared key value from the previous step.
With both connections created symmetrically, the connection will complete deployment and now show as “connected”. With those connections in place, the pods within our first cluster can now route directly to pods in our second cluster.
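For those scripting the setup, the pair of connections between the first two gateways could be created with the Azure CLI roughly as follows. This assumes both gateways live in the same resource group and uses illustrative names; note that the same shared key is supplied in both directions.

```bash
# Shared secret used by both sides of the connection (placeholder value).
SHARED_KEY="<choose-a-strong-shared-secret>"

# Connection from gateway 1 to gateway 2.
az network vpn-connection create \
  --resource-group k8ssandra-dev \
  --name vnet-1-to-vnet-2 \
  --vnet-gateway1 k8ssandra-dev-vnet-1-vpn-gateway \
  --vnet-gateway2 k8ssandra-dev-vnet-2-vpn-gateway \
  --shared-key "$SHARED_KEY"

# Matching connection in the opposite direction.
az network vpn-connection create \
  --resource-group k8ssandra-dev \
  --name vnet-2-to-vnet-1 \
  --vnet-gateway1 k8ssandra-dev-vnet-2-vpn-gateway \
  --vnet-gateway2 k8ssandra-dev-vnet-1-vpn-gateway \
  --shared-key "$SHARED_KEY"
```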
This procedure should be repeated for each cluster until a full mesh of connections has been established. For example, here you can see the full mesh of configuration as seen from the first cluster:
Now that we have our clusters created and connected, let’s start interacting with them and set the stage to deploy a K8ssandraCluster.
Connecting to the clusters
The Azure Portal provides instructions to easily connect to each cluster. Navigate to the home for the cluster that you want to connect to via Kubectl and look for the “Connect” button.
This will launch a dialog providing a set of commands to locally connect to the new cluster’s context.
Let’s try these commands out locally. Note that the system we’ll be using to execute commands against the cluster has a few important tools installed:
- Kubectl: The Kubernetes command line tool
- Kubectx: A useful tool for easily switching between multiple cluster contexts on the command line
- Helm: The package manager for Kubernetes
- Azure CLI: The Azure command line tool
Let’s first add the context for the cluster using the commands provided in the Azure Portal:
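The exact commands come from the Portal’s Connect dialog, but they generally boil down to something like the following, where the subscription, resource group, and cluster names are placeholders:

```bash
# Select the subscription that contains the cluster.
az account set --subscription <subscription-id>

# Merge the cluster's credentials into the local kubeconfig.
az aks get-credentials \
  --resource-group k8ssandra-dev \
  --name k8ssandra-dev-cluster-1
```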
You can now check out the resources of the cluster. Let’s take a look at the node pool:
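For example, listing the nodes should show the four nodes we requested for this cluster:

```bash
kubectl get nodes
```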
Using Kubectx, we can see that the context has been added to our list of available contexts:
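Running kubectx with no arguments prints the contexts known to kubectl, and the newly added context should appear among them:

```bash
kubectx
```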
Now, anytime we want to switch back to this context we can simply run the following:
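The context name is whatever az aks get-credentials registered, which by default matches the cluster name; the name below is illustrative:

```bash
# Switch back to the East US cluster's context (name is illustrative).
kubectx k8ssandra-dev-cluster-1
```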
Lastly, we’ll follow this same process for each of our four clusters so that we ultimately have access to the context for each cluster:
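After repeating the get-credentials step for each region, kubectx should list one context per cluster. A sketch of that sequence, with illustrative resource group and cluster names, might look like this:

```bash
# Add credentials for the remaining clusters (names are illustrative).
az aks get-credentials --resource-group k8ssandra-dev --name k8ssandra-dev-cluster-2
az aks get-credentials --resource-group k8ssandra-dev --name k8ssandra-dev-cluster-3
az aks get-credentials --resource-group k8ssandra-dev --name k8ssandra-dev-cluster-4

# Verify that all four contexts are available.
kubectx
```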
And that’s it for part one! Now we’re ready to deploy the K8ssandra Operator and build a K8ssandraCluster. Keep your eyes peeled for Part Two, or better yet, follow the DataStax Tech Blog to get notified about any new posts on all things cloud-native and open source.
Check out the DataStax Devs YouTube channel for more free tutorials and follow DataStaxDevs on Twitter and LinkedIn to join a buzzing community of developers from around the world.
Resources
- Kubernetes Documentation: Install Tools
- The Kubernetes operator for K8ssandra
- Github: Faster way to switch between clusters and namespaces in kubectl
- Helm
- Azure CLI
- Introducing the Next Generation of K8ssandra
- Get certified in Apache Cassandra and K8ssandra
- Blog: Deploying to Multiple Kubernetes Clusters with the K8ssandra Operator
- Part One: Deploy a Multi-Datacenter Apache Cassandra Cluster in Kubernetes
- Part Two: Multi-cluster Cassandra deployment with Google Kubernetes Engine
- Multi-Region Cassandra on EKS with K8ssandra and Kubefed