
Persistent Data for StatefulSets with Containership.io Kubernetes on DigitalOcean

I recently began using ContainerShip’s Kubernetes offering and found the provisioning experience to be excellent; it also makes good use of cloud provider-level resources, like the DigitalOcean Load Balancer for type: LoadBalancer Kubernetes Services.

One thing in particular I liked was the ability to create different node pools right from their UI, for my various scheduling needs:

In this case, since my goal is to retain data in the event of a node failure, I want to focus on the pstore-pool. Volumes were the one thing this offering didn't yet support for my provider: if you're familiar with the AWS EBS storage class in Kubernetes, I wanted something like that, but backed by DigitalOcean's block storage. My approach was something like this:

  1. Provision a Pool (above) of nodes
  2. Attach Volumes to those nodes
  3. Label those nodes as containing block storage
  4. Schedule stateful applications to those nodes using the nodeSelector key in their Kubernetes manifests.

So, let’s go ahead and provision and attach a volume to these nodes:
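If you prefer a CLI over the UI, doctl can provision and attach the volume as well; this is only a sketch, and the volume name, size, region, and droplet ID below are placeholders:

doctl compute volume create kube-data-01 --region nyc1 --size 100GiB --fs-type ext4
doctl compute volume-action attach <volume-id> <droplet-id>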

I used the DigitalOcean UI to automatically create a filesystem and attach the disk, then mounted it at /mnt/kube-data:
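If you'd rather handle the mount yourself on the node, it would look roughly like this; the device path depends on the volume's name, so treat the one below as an assumption:

# DigitalOcean exposes attached volumes under /dev/disk/by-id/ using the volume name
sudo mkdir -p /mnt/kube-data
sudo mount -o defaults /dev/disk/by-id/scsi-0DO_Volume_kube-data-01 /mnt/kube-data

# persist the mount across reboots
echo '/dev/disk/by-id/scsi-0DO_Volume_kube-data-01 /mnt/kube-data ext4 defaults,nofail,discard 0 2' | sudo tee -a /etc/fstab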

Now, so Kubernetes knows that this node matters to me and to my application’s data, we’ll label the host, in this case 058a1624-8ded-4a13-9231-4714d7bafe45:
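Using kubectl, that looks something like this (the node name is the one from my pool above):

kubectl label nodes 058a1624-8ded-4a13-9231-4714d7bafe45 stateful-data-store=true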

and, with the new stateful-data-store label applied, verify that our newly labelled host(s) are found:
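Checking with a label selector on kubectl get nodes is enough for this:

kubectl get nodes -l stateful-data-store=true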

With the host tagged, we can go ahead and deploy an application that needs a data store like this.

For the sake of brevity, let’s assume we’ve completed these tasks thus far on the remaining 2 nodes in our pstore pool, so we have 3 hosts configured with a block storage device mounted to /mnt/kube-data and labelled with stateful-data-store=true in Kubernetes.

We can now do something like run a MongoDB replica set, storing its data under this path on the host’s volume; should the host go down, we can remount the volume on a new node and reimport that data.

We’ll create a PersistentVolume using the hostPath method (since I did not define a storage class for DigitalOcean volumes, we’re accessing this data through the mount point on the host; a more Kube-native approach would be to manage it through a storage class):

kind: PersistentVolume
apiVersion: v1
metadata:
  name: mongo-pv
  labels:
    type: local
spec:
  storageClassName: mongodb-data
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/kube-data/mongo"
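Saved to a file of your choosing (mongo-pv.yaml is just my name for it), this can be applied and verified with something like:

kubectl apply -f mongo-pv.yaml
kubectl get pv mongo-pv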

Next, we’ll define a Service and a StatefulSet for MongoDB that look like:

apiVersion: v1
kind: Service
metadata:
  name: mongo
  labels:
    name: mongo
spec:
  ports:
    - port: 27017
      targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongodb
spec:
  serviceName: "mongo"  # matches the headless Service defined above
  replicas: 3
  template:
    metadata:
      labels:
        role: mongo  # matches the Service selector
    spec:
      containers:
        - name: mongo
          image: mongo
          command:
            - mongod
            - "--replSet"
            - rs0
            - "--smallfiles"
            - "--noprealloc"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongo-pv
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        name: mongo-pv
      spec:
        storageClassName: mongodb-data  # matches the mongo-pv PersistentVolume so the claim binds to it
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi

This makes use of the PersistentVolume we created (mongo-pv), and because it is not dynamically provisioned, we won't be declaring a persistentVolumeReclaimPolicy (however, if you do define this through a Storage Class, like I describe above, one way to enhance this further is to set that behavior for the provisioner to follow).
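For reference, a dynamically provisioned setup would hang that behavior off a StorageClass. This is only a sketch; the provisioner name below is a stand-in, since there wasn't a supported DigitalOcean block storage provisioner for this cluster:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: mongodb-data
# stand-in provisioner name for illustration only
provisioner: example.com/do-block-storage
# keep the underlying volume around after its claim is deleted
reclaimPolicy: Retain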

The above will create your MongoDB replica set and deploy it as a StatefulSet using the persistent volume definition we created. However, this alone won't keep the data on its own external volume, because the pods could still be scheduled onto nodes without that storage; so we need to add a nodeSelector to the pod spec above to target hosts that meet our specifications:

...
    spec:
      nodeSelector:
        stateful-data-store: "true"
      containers:
        - name: mongo
          image: mongo
          command:
            ...

You’ll see in the spec that we’ve added a field targeting hosts where stateful-data-store is set to true, so these pods land on nodes with the host block storage volume and mount its subdirectory into the pod containers.
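From there, applying the manifest and checking pod placement confirms everything landed where we expect; the filename is mine, and -o wide shows the node each pod was scheduled to:

kubectl apply -f mongo-statefulset.yaml
kubectl get pods -l role=mongo -o wide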

Further Reading

Create ContainerShip Cluster on DigitalOcean
MongoDB StatefulSet
Assign Pod Nodes
StatefulSets