Using Ark and Restic to provide DR for IBM Kubernetes Service

Restore your application configurations AND your persistent storage.

Overview

Skill Level: Beginner to Intermediate
Basic understanding of Kubernetes is required.

We will use:

  • IBM Cloud Object Storage
  • IBM Kubernetes Service (IKS)
  • Heptio Ark with Restic Plugin
  • IBM Cloud Private (for an advanced use case)

Heptio Ark provides automated, object storage-based backup and restore for Kubernetes (K8S) cluster configurations. Additional plugins allow Ark to back up and restore Kubernetes persistent volume content. Ark's volume DR capabilities come in two forms: one that uses cloud provider-specific storage snapshot mechanisms, and a more generic open source storage DR plugin called Restic. The advantage of Restic is that it is storage-backend agnostic, allowing one to potentially back up from one cluster and restore to another with a different storage back end.

In this article we will walk through a quick DR scenario using Ark with Restic to back up and restore an IBM Kubernetes Service (IKS) application along with its storage volumes. A simple overview of the process is as follows:

  1. Log in to your IBM Cloud account (or create one first).
  2. Create and configure IBM object storage service.
  3. Provision an IKS instance.
  4. Install Ark Client.
  5. Configure Ark and Restic.
  6. Install Ark and Restic into your IKS cluster.
  7. Deploy an application and make a change to the PV content.
  8. Run Ark backup.
  9. Delete the application and PV, simulating disaster.
  10. Restore the application from the Ark/Restic backup and all is well again.

Step 1. Log in to the IBM Cloud (or create your free account if this is your first time)

https://console.cloud.ibm.com

Step 2. Create an IBM Cloud Object Storage Service Instance

To store Kubernetes backups, you need a destination bucket in an instance of Cloud Object Storage (COS) and you have to configure service credentials to access this instance.

  • If you don’t have a COS instance, you can create a new one, according to the detailed instructions in Creating a new resource instance.
  • The next step is to create a bucket for your backups. Ark and Restic will use the same bucket to store K8S configuration data as well as volume backups. See instructions in Create a bucket to store your data. We are naming the bucket arkbucket and will use this name later to configure the Ark backup location; you will need to choose another name for your bucket, as IBM COS bucket names are globally unique. Choose "Cross Region" resiliency so it is easy to restore anywhere.
[Figure: COS bucket creation (arkbucket shown, but create resticbucket also)]
  • The last step in the COS configuration is to define service credentials that can store data in the bucket. The process of creating service credentials is described in Service credentials. A few comments (a CLI example follows this list):
Your Ark service will write its backups into the bucket, so it requires the "Writer" access role.
Ark uses an AWS S3-compatible API, which means it authenticates using a signature created from a pair of access and secret keys (a set of HMAC credentials). You can create these HMAC credentials by specifying {"HMAC": true} as an optional inline parameter. See step 3 in the Service credentials guide.
  • After successfully creating a service credential, you can view the JSON definition of the credential. Under the cos_hmac_keys entry there are access_key_id and secret_access_key values. We will use them later.
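For reference, a service credential like this can also be created from the command line. A minimal sketch using the ibmcloud CLI, where my-cos-instance and ark-cos-key are hypothetical placeholder names:

# Create a Writer credential with HMAC keys on the COS instance
ibmcloud resource service-key-create ark-cos-key Writer \
  --instance-name my-cos-instance \
  --parameters '{"HMAC": true}'

The resulting JSON should contain the cos_hmac_keys section with the access_key_id and secret_access_key used in Step 5.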

Step 3. Provision an Instance of IKS

https://console.bluemix.net/docs/containers/container_index.html#container_index

In this example we use a single-zone cluster in the US-East region with 3 worker nodes.

Step 4. Download and Install Ark

  • Download Ark as described here: https://heptio.github.io/ark/v0.10.0/. A single tarball download (https://github.com/heptio/ark/releases) contains the Ark client program along with the required configuration files for your cluster (see the sketch after this list).
  • Note that you will need Ark v0.10.0 or above for the Restic integration shown in these instructions.
  • Add the Ark client program (ark) somewhere in your $PATH.
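On a Linux client the download and install might look like the following; the exact tarball name is an assumption based on the v0.10.0 release naming, so check the releases page for the file matching your platform:

# Download and unpack the Ark v0.10.0 release (filename assumed; verify on the releases page)
curl -LO https://github.com/heptio/ark/releases/download/v0.10.0/ark-v0.10.0-linux-amd64.tar.gz
tar -xzf ark-v0.10.0-linux-amd64.tar.gz
# Put the client somewhere on your PATH
sudo mv ark /usr/local/bin/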

Step 5. Configure Ark Setup

  • Configure your kubectl client to access your IKS deployment (a sketch follows the YAML below).
  • From the Ark root directory, edit the file config/ibm/05-ark-backupstoragelocation.yaml. Add your COS keys as a Kubernetes Secret named cloud-credentials as shown below, making sure to update <access_key_id> and <secret_access_key> with the values from your IBM COS service credentials. The remaining changes are in the BackupStorageLocation resource named default: configure access to the bucket arkbucket (or whatever you called yours) by editing the spec.objectStorage.bucket section, and edit the COS region and s3Url to match your choices. The file should look something like this when done:
apiVersion: v1
kind: Secret
metadata:
  namespace: heptio-ark
  name: cloud-credentials
stringData:
  cloud: |
    [default]
    # UPDATE ME: the value of "access_key_id" of your COS service credential
    aws_access_key_id = <access_key_id>
    # UPDATE ME: the value of "secret_access_key" of your COS service credential
    aws_secret_access_key = <secret_access_key>
---
apiVersion: ark.heptio.com/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: heptio-ark
spec:
  provider: aws
  objectStorage:
    bucket: arkbucket
  config:
    s3ForcePathStyle: "true"
    s3Url: http://s3-api.us-geo.objectstorage.softlayer.net
    region: us-geo
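If kubectl is not yet pointed at your IKS cluster (first bullet above), one way to configure it, assuming the ibmcloud CLI with the container-service plugin and a hypothetical cluster named mycluster:

ibmcloud ks cluster-config mycluster
# Export the KUBECONFIG variable printed by the command above, then confirm:
kubectl config current-context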

Step 6. Deploy Ark into your IBM IKS Instance

Run the following commands from the Ark root directory:
kubectl apply -f config/common/00-prereqs.yaml

kubectl apply -f config/ibm/05-ark-backupstoragelocation.yaml

kubectl apply -f config/ibm/10-deployment.yaml

kubectl apply -f config/aws/20-restic-daemonset.yaml

Verify that Ark and Restic are running correctly on your IKS cluster with the following command:

kubectl -n heptio-ark get pods

which should show pods running similar to this:

NAME                   READY   STATUS    RESTARTS   AGE
ark-5464586757-q2crr   1/1     Running   0          5m
restic-7657v           1/1     Running   0          5m
restic-hh677           1/1     Running   0          5m
restic-mb9vh           1/1     Running   0          5m

Note above that the pod count may vary: there is one Ark pod plus a Restic DaemonSet (in this case 3 pods, one per worker node).

Step 7. Deploy a sample Application with a Volume to be Backed Up

From the Ark root directory, copy the YAML below and save it as config/ibm/with-pv.yaml. We are creating a simple nginx deployment in its own namespace, along with a service and a dynamically provisioned PV where we store the nginx logs. Note the annotation backup.ark.heptio.com/backup-volumes: nginx-logs below, which tells Restic the name of the volume we want backed up.

---
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-example
  labels:
    app: nginx
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim-nginx-logs
  namespace: nginx-example
  labels:
    app: nginx
    billingType: "monthly"
  annotations:
    volume.beta.kubernetes.io/storage-class: "ibmc-file-bronze"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 24Gi
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
spec:
  replicas: 1
  template:
    metadata:
      annotations:
        backup.ark.heptio.com/backup-volumes: nginx-logs
      labels:
        app: nginx
    spec:
      volumes:
        - name: nginx-logs
          persistentVolumeClaim:
            claimName: claim-nginx-logs
      containers:
        - image: nginx:1.7.9
          name: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: "/var/log/nginx"
              name: nginx-logs
              readOnly: false
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: my-nginx
  namespace: nginx-example
spec:
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer

Now we can deploy this sample app by running the following from the Ark root directory:

kubectl create -f config/ibm/with-pv.yaml

We can check whether the storage has been provisioned with:

kubectl -n nginx-example get pvc

You may have to run this a few times over the course of a few minutes while the PVC gets bound (or use the watch flag shown after the output below). It will show Pending at first but eventually Bound, similar to:

NAME               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE
claim-nginx-logs   Bound    pvc-cab7c88b-e908-11e8-8afb-c295f183323f   24Gi       RWX            ibmc-file-bronze   3m
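Rather than re-running the command by hand, kubectl's watch flag keeps polling until you interrupt it:

kubectl -n nginx-example get pvc -w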

Now that we have a volume mounted we can find out the nginx pod name and put something in the volume (or just access the nginx web frontend and see access logs grow). Get your pod name with the following command (sample output shown):

kubectl -n nginx-example get pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-68fbbf4d7c-mkfnt   1/1     Running   0          10m

Using the above pod name (yours will differ) we can log into the instance and add a file with the following commands:

kubectl -n nginx-example exec -it nginx-deployment-68fbbf4d7c-mkfnt -- /bin/bash
root@nginx-deployment-68fbbf4d7c-mkfnt:/# cd /var/log/nginx
root@nginx-deployment-68fbbf4d7c-mkfnt:/var/log/nginx# echo "hw" > hw.txt
root@nginx-deployment-68fbbf4d7c-mkfnt:/var/log/nginx# ls -al
total 16
drwxr-xr-x 2 nobody 4294967294 4096 Nov 15 19:22 .
drwxr-xr-x 1 root   root       4096 Jan 27  2015 ..
-rw-r--r-- 1 nobody 4294967294  530 Nov 15 19:13 access.log
-rw-r--r-- 1 nobody 4294967294  219 Nov 15 19:12 error.log
-rw-r--r-- 1 nobody 4294967294    3 Nov 15 19:24 hw.txt
root@nginx-deployment-68fbbf4d7c-mkfnt:/var/log/nginx# exit

We now have some content, our hw.txt file, that we expect to be saved and restored. You can, of course, also hit the nginx front-end service in your browser and watch access.log grow (see the command below to find the service address).
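To find the front end's external address, you can inspect the my-nginx LoadBalancer service from the manifest above; once IKS assigns it an external IP, it appears in the EXTERNAL-IP column:

kubectl -n nginx-example get service my-nginx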

Step 8. Use Ark and Restic to back up the K8S config and volume

We can back up our sample application by scoping the backup to the application's namespace as follows:

ark backup create my-nginx-bu --include-namespaces nginx-example
Backup request "my-nginx-bu" submitted successfully.
Run `ark backup describe my-nginx-bu` for more details.

We can check the result with:

ark backup describe my-nginx-bu --details

After repeating this a few times, the output should show the backup with a Completed status.

If you examine your IBM Cloud COS bucket associated with the backup you will see that a set of files has appeared.
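For real DR you would not trigger one-off backups by hand. Ark also supports recurring backups via schedules; here is a minimal sketch, assuming Ark v0.10's schedule command and cron syntax (verify against `ark schedule create --help`):

# Back up the nginx-example namespace every day at 01:00
ark schedule create my-nginx-daily --schedule "0 1 * * *" --include-namespaces nginx-example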

Step 9. Simulating Disaster

With the following commands we delete our application configuration and its associated PV, then confirm they are removed:

kubectl delete namespace nginx-example
namespace "nginx-example" deleted
kubectl get pvc -n nginx-example
No resources found.
kubectl get pods -n nginx-example
No resources found.

Step 10. Recovering from Disaster

We can restore the application and volume with the following command:

ark restore create --from-backup my-nginx-bu
Restore request "my-nginx-bu-20181115145200" submitted successfully.
Run `ark restore describe my-nginx-bu-20181115145200` for more details.

Restoring takes longer than deleting because we are dynamically provisioning another network drive behind the scenes. If we look at the status of our application, it is Pending:

kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-68fbbf4d7c-mkfnt   0/1     Pending   0          53s
kubectl get pvc -n nginx-example
NAME               STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS       AGE
claim-nginx-logs   Pending                                      ibmc-file-bronze   1m

Within a minute or two our application is up and the volume is recovered, as the commands below show (your pod name will differ). We dump our "hello world" file (hw.txt) and its contents are exactly what we had pre-disaster. Mission accomplished!

kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-68fbbf4d7c-mkfnt   1/1     Running   0          6m
kubectl -n nginx-example exec -it nginx-deployment-68fbbf4d7c-mkfnt -- cat /var/log/nginx/hw.txt
hw

Advanced Use Cases

In the previous example we walked through the straightforward situation where we restored an old version of a cluster configuration and its volumes to the same IKS cluster they were backed up from. That is more of a backup/restore scenario than true DR: in a real catastrophe you likely would not have the same cluster available to restore to, or even the same IBM Cloud region. If the region comes back online quickly, restoring in place is fine as long as RPO and RTO requirements can still be met. Below we discuss how to achieve a few advanced use cases, including multi-region DR, rapid cloning of environments for developer use, and DR to/from on-premise K8S instances like IBM Cloud Private (ICP).

DR across IBM Cloud Regions

A more likely DR scenario is one where you want to recover the backed-up cluster in a completely different IBM Cloud region from the original. For example, I might have my primary cluster in IBM Cloud region US-East but my DR cluster in US-South. The good news is that this scenario is easy with a little preparation and a similar set of steps: rather than running the restore command against the US-East cluster, I simply run it against the US-South cluster (with my kubectl client configured for the US-South cluster, of course). The one caveat with a cross-region backup is that, prior to backing up any PVC, you must remove the labels tying the PVC to a local IBM Cloud region and zone. For each PVC to be backed up, run the following commands before the backup command:

kubectl -n <namespace> label pvc <PVC name> region-
kubectl -n <namespace> label pvc <PVC name> zone-

With many PVCs this can be tiresome, so a small utility is easily written (see the sketch below). Ideally the utility could also be installed as an Ark pre-backup hook.
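As a minimal sketch, assuming bash and kubectl, the label removal can be looped over every PVC in a namespace:

# Strip the region and zone labels from every PVC in the namespace
# before running the backup. NS is a placeholder; set it to your namespace.
NS=nginx-example
for pvc in $(kubectl -n "$NS" get pvc -o jsonpath='{.items[*].metadata.name}'); do
  kubectl -n "$NS" label pvc "$pvc" region- zone-
done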

Developers Cloning Workspaces

Often a developer wants to debug a problematic deployment in a test or production cluster. Rather than taking over that cluster to debug, and potentially derailing testing or even production activities, the developer can simply clone the environment with Ark, as sketched below. Developers can provision their own clusters with lower-cost cloud resources (less CPU, RAM, or even fewer worker nodes) as needed. Developer clusters can be ephemeral, given how quickly the cluster and the restored application can be deployed.
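Because Ark reads backups from object storage, the clone is just a restore run against a different cluster. A hedged sketch, where dev-cluster is a hypothetical cluster name:

# Point kubectl at the developer's own cluster
ibmcloud ks cluster-config dev-cluster
# (export the KUBECONFIG line printed above)
# Ark must already be installed in this cluster as in Step 6,
# pointing at the same COS bucket, then:
ark restore create --from-backup my-nginx-bu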

DR to/from any Kubernetes instances (multi-cloud)

One can imagine many other useful scenarios where the desire is to backup/restore or migrate applications across clouds or to/from on-premise Kubernetes. The good news is that Ark and Restic can support this, as long as you have the same or equivalent volume storage available on the source and the target. One scenario to try is to back up an IBM Cloud Private (ICP) on-prem cluster and restore it into an IKS cluster; think of it as a dev/test cloning scenario with the developer using the cloud and IKS as an on-demand test bed. The destination cluster must have a Kubernetes storageclass with the same name as, and a backend provisioner compatible with, that of the source. For example, if I am using the storageclass glusterfs on ICP and I want to restore to IKS, I will need to provide a storageclass named glusterfs in IKS. This can be done by simply adding a new storageclass definition named glusterfs on IKS that uses the specification of something equivalent, like ibmc-file-bronze (a commonly used file volume provisioner on IKS), as sketched below.
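A hedged sketch of such a shim storageclass follows. The provisioner and parameters below mirror what ibmc-file-bronze used at the time of writing; dump the real one first (kubectl get storageclass ibmc-file-bronze -o yaml) and copy its values:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs              # must match the source cluster's storageclass name
provisioner: ibm.io/ibmc-file  # same provisioner ibmc-file-bronze uses (verify)
parameters:
  type: "Endurance"
  iopsPerGB: "2"
  sizeRange: "[20-12000]Gi"
reclaimPolicy: Delete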

Summary

Ark and Restic have made the lives of Kubernetes developers and administrators a lot easier when it comes to DR. Using ubiquitously available object storage as the back end, a Kubernetes API-aware client, and cluster runtime agents, Ark/Restic solves the Kubernetes DR challenge in an elegant yet completely accessible way. Given its ease of use and rich feature set, Ark/Restic expands the set of achievable use cases to include developer workflows and potentially even cloud-to-cloud migration. The sky is the limit with Ark cloud DR.