How to Manage the Data Challenges of a Cloud-Native Platform

Author: Raghavan Srinivas

DataStax
Building Real-World, Real-Time AI
5 min read · Mar 16, 2022

The daily operations of a cloud-native platform can be challenging and time-consuming, especially when persistent data is involved. One minute you're scaling up nodes to keep pace with your application's growth; the next, you're restoring backups. During Developer Week Austin, I spoke about how you can automate many of these operations with K8ssandra.

A typical cloud-native platform has one or more of the following features:

  • An infrastructure-as-a-service (IaaS) offering computation power and storage
  • A Kubernetes engine to help with management and scaling of containerized nodes and applications
  • A database-as-a-service (DBaaS), such as DataStax’s Astra DB, offering a managed database to help with scaling and operations
  • Microservices that help orchestrate the complete solution

Day-to-day tasks on a cloud-native platform, such as creating backups, accessing your data, and basic monitoring, can consume a lot of your time if you handle them manually. Thankfully, there are automated solutions out there to help you.

K8ssandra

If you’re looking for the benefits of a distributed NoSQL database and the power to automate processes such as deploying, managing, and scaling containerized applications, then K8ssandra is your solution.

K8ssandra is a cloud-native distribution of Apache Cassandra® that runs on Kubernetes. You get Cassandra’s power of scale and high availability, and Kubernetes’ capability to manage containers. Whether you’re installing locally or in the cloud, K8ssandra is easy to install with a Helm command, and you can provide custom values in a k8ssandra-values.yaml file.

Day-to-day challenges

K8ssandra comes preloaded with several components, each designed to help automate your daily operations.

In my talk at Developer Week Austin, we looked at what these components can do and why they might be important.

Monitoring

Monitoring is an essential aspect of all systems and applications. The sooner you know that something is wrong, the faster you can fix it. K8ssandra deploys Metric Collector for Apache Cassandra (MCAC) to the Kubernetes environment.

Figure 1. Visuals of the dashboard that can be created by MCAC.

Paired with Prometheus and Grafana, MCAC collects Cassandra metrics through a Java agent and supports over 100,000 unique metric series per node, covering measurements such as operations per second or heap usage.

Prometheus gathers all the data and stores it as time series in a multi-dimensional data model, where each series is identified by a metric name and a set of labels. You access that model with PromQL, a flexible query language that lets you filter and aggregate series by their labels.
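
To make the PromQL idea concrete, here is a minimal Python sketch that builds an instant-query URL for the Prometheus HTTP API (`/api/v1/query`). The Prometheus address and the metric and label names are assumptions for illustration only; the series names MCAC actually exposes depend on your deployment, so look them up in the Prometheus UI before reusing the query.

```python
from urllib.parse import urlencode

# Assumed address of a port-forwarded Prometheus instance (hypothetical).
PROMETHEUS_URL = "http://localhost:9090"

# Hypothetical metric and label names: write rate over 5 minutes, per cluster.
WRITE_RATE = 'sum by (cluster) (rate(example_cassandra_writes_total{dc="dc1"}[5m]))'


def instant_query_url(promql: str, base_url: str = PROMETHEUS_URL) -> str:
    """Return the URL for a Prometheus instant query (GET /api/v1/query)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # Print the URL; fetch it with any HTTP client once Prometheus is reachable.
    print(instant_query_url(WRITE_RATE))
```

The label selector in braces and the `sum by (...)` aggregation are the "multi-dimensional" part: the same metric can be sliced by any label it carries.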

Having lots of metrics doesn’t do you any good if you can’t read them easily. And that’s where Grafana can help. Grafana turns the data collected into highly customizable dashboards that anyone can use to display precisely what you want to show.

Management API

Managing Cassandra operations is mainly command-line driven and is usually outsourced to a team of experts who understand the operational tools. This approach produces a non-uniform set of best practices and increases the likelihood of cutting corners to get the job done.

With K8ssandra, you get the power of a management API giving you the ability to:

  • Start nodes
  • Stop nodes
  • Customize configuration
  • Perform Kubernetes liveness/readiness checks
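
As a rough illustration of the operations listed above, the sketch below maps them to HTTP requests. The endpoint paths reflect my reading of the Management API for Apache Cassandra README and may differ between versions; the base URL is a hypothetical port-forward to a single node.

```python
import urllib.request

# Assumed base URL of one node's Management API (hypothetical port-forward).
BASE_URL = "http://localhost:8080"

# Operation name -> (HTTP method, path). Verify the paths for your version.
OPERATIONS = {
    "liveness": ("GET", "/api/v0/probes/liveness"),
    "readiness": ("GET", "/api/v0/probes/readiness"),
    "start": ("POST", "/api/v0/lifecycle/start"),
    "stop": ("POST", "/api/v0/lifecycle/stop"),
}


def build_request(operation: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for a management operation."""
    method, path = OPERATIONS[operation]
    return urllib.request.Request(base_url + path, method=method)


def call(operation: str, base_url: str = BASE_URL) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(build_request(operation, base_url), timeout=5) as resp:
        return resp.status


if __name__ == "__main__":
    # Only print the requests here; call() needs a reachable Management API.
    for name in OPERATIONS:
        req = build_request(name)
        print(f"{name:<10} {req.get_method():<5} {req.full_url}")
```

Because every operation is a plain HTTP call, the same checks can back Kubernetes probes and your own tooling, which is what makes the practices uniform.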

Scaling up and down

If your web application is flourishing and you have more people accessing it, your database will likely need to grow very quickly. Because Cassandra is a distributed database, you can add or remove nodes simply by editing k8ssandra-values.yaml.

Figure 2. A graph showing the linear progression in the number of operations per second your Cassandra database can handle as you increase the number of nodes.

These are the relevant properties in k8ssandra-values.yaml:

  • Cluster name for your Cassandra installation
  • Name of the data center you used with your Stargate installation
  • The size of the data center

A simple YAML used in the Helm install could look something like this:
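
As a minimal sketch, the Python snippet below renders such a file from the three properties listed above and writes it to disk. The key layout (`cassandra.clusterName`, and `name` and `size` under `cassandra.datacenters`) is an assumption based on the K8ssandra 1.x Helm chart, and the cluster name, data center name, release name, and chart reference are placeholders; check them against the chart version you install.

```python
from pathlib import Path

# Template for a minimal values file. The key layout is an assumption based on
# the K8ssandra 1.x Helm chart; verify it against the chart you install.
VALUES_TEMPLATE = """\
cassandra:
  clusterName: {cluster_name}
  datacenters:
  - name: {datacenter}
    size: {size}
"""


def render_values(cluster_name: str, datacenter: str, size: int) -> str:
    """Render k8ssandra-values.yaml content for a single data center."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return VALUES_TEMPLATE.format(
        cluster_name=cluster_name, datacenter=datacenter, size=size
    )


if __name__ == "__main__":
    # Placeholder names; replace them with your own.
    text = render_values("demo-cluster", "dc1", 3)
    Path("k8ssandra-values.yaml").write_text(text)
    print(text)
    # Then install with Helm, assuming the k8ssandra chart repository is added:
    #   helm install k8ssandra k8ssandra/k8ssandra -f k8ssandra-values.yaml
```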

Altering the size of the installation is as simple as changing the size property and applying the changes with helm upgrade:
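
A hedged sketch of that workflow in Python: one helper rewrites the `size` entry in the values text, and another assembles the `helm upgrade` command. The release name `k8ssandra` and the chart reference `k8ssandra/k8ssandra` are assumptions, and the sample values text uses the same assumed key layout as a K8ssandra 1.x chart.

```python
import re
import shlex


def set_datacenter_size(values_text: str, new_size: int) -> str:
    """Return values_text with the first `size:` entry set to new_size."""
    updated, count = re.subn(
        r"^([ \t]*size:[ \t]*)\d+[ \t]*$",
        rf"\g<1>{new_size}",
        values_text,
        count=1,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no 'size:' entry found in values text")
    return updated


def helm_upgrade_command(release: str, values_file: str) -> list:
    """Return the argument list for upgrading a release with new values."""
    # Chart reference is an assumption; use the one you installed from.
    return ["helm", "upgrade", release, "k8ssandra/k8ssandra", "-f", values_file]


if __name__ == "__main__":
    sample = (
        "cassandra:\n"
        "  clusterName: demo-cluster\n"
        "  datacenters:\n"
        "  - name: dc1\n"
        "    size: 3\n"
    )
    print(set_datacenter_size(sample, 5))
    print(shlex.join(helm_upgrade_command("k8ssandra", "k8ssandra-values.yaml")))
```

Keeping the size in the values file, rather than passing it on the command line, means the file stays the single record of what the cluster should look like.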

API access

It’s all well and good having a database, but it’s not much use if it’s just sitting there storing data. And how useful will your application be if a user can’t interact with it? You need to be able to complete the four basic operations:

  • Create
  • Read
  • Update
  • Delete

To help you with this, K8ssandra deploys Stargate. Stargate is a unified API gateway that sits between your app and your database and gives you a range of APIs for performing these basic operations. Some of the APIs are:

  • REST
  • GraphQL
  • Document API
  • gRPC

Figure 3. A diagram showing the various API extensions available through Stargate.io.

With a mix of different APIs to choose from, you can either use one that you’re familiar with or one that’s right for the job.
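
To show how the four basic operations map onto one of these APIs, here is a small Python sketch that builds requests for Stargate's REST API (v2). The base URL, keyspace, table, and token are placeholders, and the choice of `PUT` for updates is one option (the API also accepts `PATCH` for partial updates, as far as I know); treat the paths as something to confirm against the Stargate documentation.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote

BASE_URL = "http://localhost:8082"  # assumed Stargate REST address
TOKEN = "<auth-token>"              # placeholder; obtain one from Stargate's auth API

# CRUD action -> HTTP method.
METHODS = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


def crud_request(
    action: str,
    keyspace: str,
    table: str,
    key: Optional[object] = None,
    row: Optional[dict] = None,
    base_url: str = BASE_URL,
    token: str = TOKEN,
) -> urllib.request.Request:
    """Build (but do not send) a REST v2 request for one CRUD action."""
    url = f"{base_url}/v2/keyspaces/{keyspace}/{table}"
    if action != "create":
        if key is None:
            raise ValueError(f"{action} needs a primary key")
        url += "/" + quote(str(key), safe="")
    body = json.dumps(row).encode("utf-8") if row is not None else None
    headers = {"X-Cassandra-Token": token, "Content-Type": "application/json"}
    return urllib.request.Request(
        url, data=body, headers=headers, method=METHODS[action]
    )


if __name__ == "__main__":
    # Hypothetical keyspace "library" with a table "books" keyed by id.
    for req in (
        crud_request("create", "library", "books", row={"id": 1, "title": "Dune"}),
        crud_request("read", "library", "books", key=1),
        crud_request("update", "library", "books", key=1, row={"title": "Dune (1965)"}),
        crud_request("delete", "library", "books", key=1),
    ):
        print(req.get_method(), req.full_url)
```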

Backup and restore

No matter how careful you may be, sometimes the worst happens. Fortunately, K8ssandra allows you to back up and restore your Cassandra database with the help of Medusa.

With Medusa, you can back up and restore single nodes or do a full database backup. It supports Amazon S3, Google Cloud Storage, MinIO, and Azure. You can even create a snapshot of your database so if disaster strikes, you can roll back to your previous backup and continue as if nothing happened.

Repair

In a distributed database, entropy arises naturally from events such as downed nodes and data deletions. Cassandra is no exception: copies of the same data on different nodes can drift out of sync. If left alone, data across nodes may become increasingly inconsistent.

That’s why running repairs is an important task. Running repairs will help keep your nodes healthy and help battle entropy. You can run incremental repairs daily, and full repairs weekly or monthly.

K8ssandra deploys Reaper for anti-entropy repair operations on Cassandra. Reaper helps you schedule and automate repairs. It also has a simple UI where you can quickly see the health of your clusters.
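
For a sense of what a repair schedule looks like in practice, the sketch below builds the URLs for two schedules: a daily incremental repair and a weekly full repair. It assumes Reaper's REST endpoint `POST /repair_schedule` with the query parameters shown, which is my understanding of the Reaper API and should be confirmed against its documentation; the Reaper address, cluster, keyspace, and owner values are placeholders.

```python
from urllib.parse import urlencode

# Assumed address of a port-forwarded Reaper service (hypothetical).
REAPER_URL = "http://localhost:8080"


def repair_schedule_url(
    cluster: str,
    keyspace: str,
    days_between: int,
    incremental: bool,
    owner: str = "ops",
    base_url: str = REAPER_URL,
) -> str:
    """Return the URL to POST to in order to create a repair schedule."""
    params = {
        "clusterName": cluster,
        "keyspace": keyspace,
        "owner": owner,
        "scheduleDaysBetween": days_between,
        "incrementalRepair": str(incremental).lower(),
    }
    return f"{base_url}/repair_schedule?{urlencode(params)}"


if __name__ == "__main__":
    # Daily incremental repair and weekly full repair for a placeholder keyspace.
    print(repair_schedule_url("demo-cluster", "library", 1, incremental=True))
    print(repair_schedule_url("demo-cluster", "library", 7, incremental=False))
```

The same schedules can be created through Reaper's UI; scripting them is mainly useful when you want every cluster to get an identical repair policy.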

While technical issues are a fact of life when running cloud-native apps, they don’t have to be a major headache. Certain tools and technology can greatly simplify many of the operational tasks you come across. Learn more about K8ssandra to see what challenges it can solve for you.

Follow the DataStax Tech Blog for more developer stories. Check out our YouTube channel for tutorials and follow DataStax Developers on Twitter for the latest news about our developer community.

Resources

  1. Developer Week Austin
  2. DataStax Astra DB
  3. Metric Collector for Apache Cassandra
  4. Stargate
  5. Medusa
  6. Reaper
  7. Join our Discord: Fellowship of the (Cassandra) Rings
  8. DataStax Community Platform
  9. DataStax Academy
  10. DataStax Certifications
  11. DataStax Workshops
