Introduction to Heptio Ark
An Ark is a vessel or sanctuary meant to protect against extinction. The most famous Ark in history is Noah’s Ark, which, according to the Bible, was built on God’s instructions to protect Noah’s family and the pairs of animals he was told to bring on board. A more modern Ark is the Svalbard Global Seed Vault, which stores seeds in case of a global crisis.
Why are we talking about Arks on a technology blog? Because last week the folks at Heptio announced the release of Ark. Heptio Ark is a solution that eases the pain of cluster backups and restorations for Kubernetes admins.
As production use of Kubernetes increases, many organizations face challenges backing up and restoring their Kubernetes clusters. At Opcito, we’ve helped clients restore their clusters by dumping cluster state from etcd. This method is hit-or-miss, and the addition of persistent volumes and stateful workloads makes it a much more complex affair. Working out the relationship between a volume’s snapshot and the pods that were using that volume at a point in time is tricky, because pods are dynamic and can be rescheduled to other nodes transparently.
Heptio Ark aims to solve this problem by backing up Kubernetes clusters and making those backups easy to restore. Ark lets you create backups of all cluster components and manages volume snapshotting in a way that maintains pod associations. Restorations are a single-command affair and can even be partial (e.g., scoped to a namespace).
Ark also opens up some interesting use cases around testing, because you can now snapshot your production cluster and stand up a replica in your test and staging environments with a single command. Ark also lets you move between environments, breaking any lock-in you may have to a particular cloud platform. It’s launching in alpha with support for AWS, Azure and Google Cloud Platform. Heptio says the platform is extensible and more environments will be added in the future.
Your cluster needs to be running at least Kubernetes 1.7 to use Ark as your backup and restore solution.
Ark under the hood
Ark leverages CRDs (Custom Resource Definitions), which let you extend the Kubernetes API with custom, user-defined resources. CRDs replace the deprecated TPR (Third Party Resource) and are available in Kubernetes 1.7. Custom controllers complete the loop, allowing developers to define behaviour based on the state stored in those custom resources.
Each of Ark’s operations (Backup, Restore and Schedule) is defined as a CRD and has a custom controller that acts on the data it stores.
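To make the custom resource plus custom controller pattern more concrete, here is a minimal sketch in plain Python. It does not talk to Kubernetes and does not reflect Ark's actual schema or code: the `BackupResource` class, its fields, the `reconcile` function and the sample cluster contents are all hypothetical names invented for illustration. The idea it tries to show is only that a resource records the desired operation, and a controller loop reads that record and moves it toward completion.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BackupResource:
    """Hypothetical stand-in for a custom resource; not Ark's real schema."""
    name: str
    included_namespaces: List[str]                      # desired scope (the "spec")
    phase: str = "New"                                  # status the controller updates
    captured: List[str] = field(default_factory=list)   # objects recorded so far


def reconcile(backup: BackupResource, cluster: Dict[str, List[str]]) -> None:
    """Drive one resource from its declared spec toward a finished state."""
    if backup.phase != "New":
        return  # already handled, so repeated passes change nothing
    for namespace in backup.included_namespaces:
        backup.captured.extend(cluster.get(namespace, []))
    backup.phase = "Completed"


def run_controller(store: List[BackupResource], cluster: Dict[str, List[str]]) -> None:
    """One pass of the control loop over every stored resource."""
    for backup in store:
        reconcile(backup, cluster)


# Made-up cluster contents keyed by namespace, purely for the example.
cluster = {"shop": ["deployment/web", "pvc/db-data"], "ops": ["configmap/alerts"]}
store = [BackupResource(name="nightly", included_namespaces=["shop"])]

run_controller(store, cluster)
print(store[0].phase, store[0].captured)
```

In a real cluster the store would be the Kubernetes API server and the controller would watch it for changes rather than loop over a list, but the division of labour is the same: the resource holds the request and its status, and the controller does the work.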
