Kubernetes StatefulSet for DB applications

Jun Xie
2 min read · Jul 26, 2022

This document describes StatefulSet, a common Kubernetes solution for running stateful applications such as databases. As we build a layer to support ML vector management, an important decision is how to persist the data so that none of it is lost. After reading blog posts from several database companies, we are convinced that StatefulSet is a plausible way to achieve that goal.
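To make the persistence argument concrete, here is a small sketch of the naming scheme Kubernetes applies to a StatefulSet: each replica gets a fixed ordinal name, a PersistentVolumeClaim derived from that name, and a stable DNS entry behind a headless Service. A Pod that is rescheduled keeps its name and therefore reattaches to the same claim. The StatefulSet name `vector-store`, the claim template name `data`, and the `default` namespace are hypothetical, and the DNS suffix assumes the default cluster domain `cluster.local`.

```python
def stateful_identities(name, replicas, claim_template="data",
                        service=None, namespace="default"):
    """List the stable identities Kubernetes assigns to StatefulSet replicas.

    All argument values are illustrative; the naming patterns are the
    documented StatefulSet conventions.
    """
    # Assumption: the governing headless Service shares the StatefulSet name.
    service = service or name
    identities = []
    for ordinal in range(replicas):
        # Pods are named <statefulset-name>-<ordinal>, starting at 0.
        pod = f"{name}-{ordinal}"
        identities.append({
            "pod": pod,
            # One claim per Pod: <claim-template-name>-<pod-name>.
            "pvc": f"{claim_template}-{pod}",
            # Stable DNS name, assuming the default cluster domain.
            "dns": f"{pod}.{service}.{namespace}.svc.cluster.local",
        })
    return identities


if __name__ == "__main__":
    for identity in stateful_identities("vector-store", 3):
        print(identity)
```

Because the claim name depends only on the template name and the Pod name, replica 0 always maps to the same claim, which is the property a database needs to find its data again after a restart.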

We surveyed a list of databases: CockroachDB, MySQL, MongoDB, Cassandra, PostgreSQL, Dgraph and Redis. We also took a look at Kafka and Zookeeper, since both need to persist state as well. We selected these systems based on two criteria:

  1. They are widely used in the industry and can be deployed on Kubernetes. Some, like CockroachDB, are cloud-native databases.
  2. They cover different categories of stateful applications; for example, MySQL is a SQL database and MongoDB is a NoSQL one. Choosing a variety of types makes us more confident in the decision.

Here is a list of articles on how different databases and other stateful applications are deployed on Kubernetes with StatefulSet.

  1. CockroachDB
    * 3 ways to master stateful apps in Kubernetes. It compares two Kubernetes-native approaches, DaemonSet and StatefulSet, and strongly recommends against DaemonSet.
    * How to run CockroachDB on Kubernetes
    * Deploy a local cluster with Kubernetes
  2. MySQL
    * Run a replicated Stateful application
    * Kubernetes StatefulSet — Example & Best Practices. This article gives a good description of the end-to-end flow for deploying MySQL with a Kubernetes StatefulSet.
  3. MongoDB
    * Running MongoDB on Kubernetes with StatefulSets
    * How to run MongoDB on Kubernetes
  4. Cassandra
    * Deploying Cassandra with a StatefulSet
  5. PostgreSQL
    * Deploying PostgreSQL as a StatefulSet in Kubernetes
  6. Dgraph
  7. Redis
    * Deploying Redis Cluster on Kubernetes
  8. Kafka
    * Set-up Kafka cluster using Kubernetes StatefulSet
  9. Zookeeper
    * Running Zookeeper, a distributed system coordinator

Based on the above articles, we will give StatefulSet a try to see whether it supports our use cases, and we will update this post accordingly in the future.
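As a rough idea of what such a trial could look like, the sketch below builds a minimal StatefulSet manifest as a Python dictionary and serializes it to JSON. It is an illustration under stated assumptions, not our actual deployment: the name `vector-store`, the image `example.com/vector-store:latest`, the port, the mount path and the storage size are all hypothetical placeholders, and a headless Service with the same name is assumed to exist already.

```python
import json


def build_statefulset(name, image, replicas=3, storage="10Gi",
                      mount_path="/var/lib/data", port=8080):
    """Return a minimal apps/v1 StatefulSet manifest as a dict.

    Every argument value used here is a placeholder for illustration.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Headless Service that gives each Pod a stable DNS name;
            # assumed to be created separately.
            "serviceName": name,
            "replicas": replicas,
            # The selector must match the Pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        # Mount the per-Pod volume declared below.
                        "volumeMounts": [
                            {"name": "data", "mountPath": mount_path}
                        ],
                    }]
                },
            },
            # One PersistentVolumeClaim is created per replica from this
            # template; it is not deleted when the Pod is rescheduled.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


manifest = build_statefulset("vector-store", "example.com/vector-store:latest")
manifest_json = json.dumps(manifest, indent=2)

if __name__ == "__main__":
    print(manifest_json)
```

The output can be saved to a file and passed to `kubectl apply -f`, which accepts JSON as well as YAML. The key part for persistence is `volumeClaimTemplates`: it is what distinguishes this from a Deployment, where replicas do not each get their own claim.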

Jun Xie

Founder and ex-Snap software engineer. I am interested in machine learning and databases. Feel free to drop me an email: xiejuncs@gmail.com