Persistent Data for StatefulSets with Containership.io Kubernetes on DigitalOcean
I recently began using Containership's Kubernetes offering and found it to be an excellent provisioning experience; it made good use of cloud provider-level resources like the DigitalOcean Load Balancer for `type: LoadBalancer` Kubernetes Services.
One thing I particularly liked was the ability to create different node pools right from their UI for my various scheduling needs.
In this case, since my goal is to retain data in the event of a node failure, I want to focus on the `pstore-pool`. Volumes were the one thing this offering didn't yet support for my provider; if you're familiar with the AWS EBS storage class in Kubernetes, I wanted to do something similar and make use of DigitalOcean's block storage, so my approach was:
- Provision a Pool (above) of nodes
- Attach Volumes to those nodes
- Label those nodes as containing block storage
- Schedule stateful applications to those nodes using a `nodeSelector` key in their Kubernetes manifests.
So, let’s go ahead and provision and attach a volume to these nodes:
I used the DigitalOcean UI to create a filesystem and attach the disk automatically, and mounted it to `/mnt/kube-data`.
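If you prefer to do this by hand on each node, the steps look roughly like this (the volume name `kube-data`, and therefore the device path, is an assumption; DigitalOcean exposes attached volumes under `/dev/disk/by-id/`):

```shell
# Create a filesystem on the attached volume (only the first time!)
mkfs.ext4 /dev/disk/by-id/scsi-0DO_Volume_kube-data

# Mount it where our pods will expect host data
mkdir -p /mnt/kube-data
mount -o defaults /dev/disk/by-id/scsi-0DO_Volume_kube-data /mnt/kube-data

# Persist the mount across reboots
echo '/dev/disk/by-id/scsi-0DO_Volume_kube-data /mnt/kube-data ext4 defaults,nofail 0 2' >> /etc/fstab
```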
Now, for Kubernetes to know about this node's importance to me and my application's data, we'll label the host with a new label, `stateful-data-store`, and verify our new host(s) are found:
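With `kubectl`, that might look like the following (the node name `pstore-node-01` is a placeholder for your actual host):

```shell
# Add the label to the node carrying the block storage volume
kubectl label node pstore-node-01 stateful-data-store=true

# Verify the labelled host(s) show up
kubectl get nodes -l stateful-data-store=true
```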
With the host tagged, we can go ahead and deploy an application that may need a data store like this.
For the sake of brevity, let's assume we've completed these tasks on the remaining 2 nodes in our `pstore` pool, so we have 3 hosts configured with a block storage device mounted to `/mnt/kube-data` and labelled with `stateful-data-store=true` in Kubernetes.
We can now run something like a MongoDB replica set and store its data on the volume via this path on the host; should the host go down, we can remount the volume on a new node and re-import the data.
We'll create a `PersistentVolume` using the `hostPath` method (since I did not define a storage class for DO volumes, we're accessing this data through the mountpoint on the host; a more Kube-native method would be to manage this through a storage class):
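A minimal sketch of that `PersistentVolume` (the `10Gi` capacity is an assumption; the name and host path match what we set up above):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongo-pv
spec:
  capacity:
    storage: 10Gi            # assumed size; match your DO volume
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/kube-data     # the mountpoint we created on the host
```

With a replicated StatefulSet, you'd create one such PV per labelled host so each replica's claim has a volume to bind to.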
and we'll define a `StatefulSet` for MongoDB, fronted by a headless Service on port `27017`, with a `mongo` container exposing `containerPort: 27017` and a `ReadWriteOnce` volume claim
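A sketch of the StatefulSet and its headless Service (the image tag, mount path, replica count, and storage request are assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mongo
spec:
  clusterIP: None             # headless, for stable per-pod DNS
  selector:
    app: mongo
  ports:
  - port: 27017
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
      - name: mongo
        image: mongo:3.6          # assumed image tag
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-pv
          mountPath: /data/db     # MongoDB's default data directory
  volumeClaimTemplates:
  - metadata:
      name: mongo-pv
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi           # assumed; must fit the PV's capacity
```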
that makes use of the `PersistentVolume` we created (`mongo-pv`). Because this is not dynamically provisioned, we won't be declaring `persistentVolumeReclaimPolicy` (however, if you do define this through a Storage Class, as I describe above, one way to enhance this further is to set this behavior for the provisioner to follow).
The above will create your MongoDB replica set and deploy it as a StatefulSet using the persistent volume definition we created. However, this is not useful to us if we plan to preserve data on its own external volume, so we need to add a `nodeSelector` to the pod template's `spec` to target hosts that meet our specifications:

```yaml
    spec:
      nodeSelector:
        stateful-data-store: "true"
      containers:
      - name: mongo
```
You'll see in the `spec` we've added a field to target hosts where `stateful-data-store` is set to `true`, so these pods are scheduled onto the nodes with the host block storage volume and mount its subdirectory into the pod containers.
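Once applied, you can confirm the pods landed on the labelled hosts (the filename and the `app=mongo` label are placeholders matching the sketches above):

```shell
# Apply the manifests
kubectl apply -f mongo-statefulset.yaml

# Check which node each replica was scheduled on
kubectl get pods -l app=mongo -o wide
```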