Let kapitan take the helm of Kubernetes

Alessandro De Maria
Published in Kapitan Blog · 8 min read · Feb 17, 2019

This story is all about Kapitan, the tool that will help you manage your Kubernetes configuration and make you feel good about it.

In spite of limited documentation and marketing efforts, Kapitan has grown organically thanks to the word of mouth of happy users. We now have a fair share of stars and a couple of mentions from Kubernetes bloggers and evangelists. Kapitan even has its very own section in a book. Most importantly, it has attracted the interest of a number of ambitious companies, which recognised in the workflow Kapitan enables the secret recipe for managing the “configuration spaghetti problem”.

Many also joined us in the #kapitan channel on kubernetes.slack.com (and you should as well!) to create a small but motivated community which makes us proud of what we do :)

However, the majority of people who have heard about Kapitan keep pigeonholing it as the “jsonnet + jinja tool”, clearly missing the whole point of it.

In this post, I will focus on the use of Kapitan to manage Kubernetes deployments, skipping over its flexibility to handle pretty much anything else. But this is still a very important point to understand: Kapitan was developed to be generic, and we avoided any Kubernetes lock-in. Kubernetes is just one of its many possible uses.

Also, this is not meant to be a tutorial (though I promise those will follow), but more of a mild rant explaining why we created it and which problems with deploying Kubernetes configurations we think it solves.

The opinionated bit

In 2015, I started experimenting with Kubernetes and I loved it!
But there were many aspects of it that I didn’t like (and I still do not):

  • context: IMHO, one of the most difficult concepts of Kubernetes to grasp, and one of the most dangerous. Contexts are a client-side concept: they are difficult to understand, inconsistently named, and introduce ambiguity when running kubectl commands. I hate (kubectl) contexts!
  • verbose, nested (yaml!) configuration: Took me some time to understand the meaning of each layer of yaml configuration in a manifest. Why did I have to repeat labels in 2-3 different places to make it work?
  • imperative/declarative mess: Kubernetes novice users are encouraged to learn about it by using imperative commands, when everybody knows that’s not what grown-ups really use! I believe this confuses the person that needs to translate a trial of Kubernetes into a proper deployment strategy to help their business. Spoiler: there is no official definition of “proper strategy”
  • run-time configuration: I also agree with Jesse Suen when they warn against passing configuration options to the command line of helm (or kubectl or anything). Passing parameters makes it difficult to ensure the same command will be run twice the same way.
  • application configuration: Well done, you have learnt to manage your Kubernetes yaml manifests. But let’s remember that a pod/deployment is a mere vessel. You haven’t actually managed the configuration of the application on top of it.
  • developers just wanna have fun: the workflow for a developer interacting with Kubernetes is still broken/undefined. Kubernetes fans still try to convince developers to develop against kubernetes, but are we not just pushing it on them? Listen to Kelsey Hightower!
  • operators: I have mixed feelings about them, but let’s not fight this battle today! :) Let’s just say that I think they are often abused.
  • idempotency: Or rather, “lack of”. Combining some of the above points encourages workflows that lack idempotency, which is a pity in the case of Kubernetes!

The journey bit

While trying to address some of the above problems, I hacked together a tiny templating system that used j2cli and a couple of bash wrapper scripts to manage Kubernetes configurations.

It worked by stuffing everything into an “environmentA.yaml” file and then consuming that file in a Jinja2 template. Deploying a microservices-style application made of multiple components was as simple as running:

bin/apply.sh environments/environmentA.yaml

Cool! The yaml file contained everything about the deployment, which was convenient because I could use the same file as a source of information for other things, say... bash scripts!

I came up with a way to import values defined in the yaml inventory into scripts, so that I could run something like:

bin/create_kafka_topics.sh environments/environmentA.yaml

But then things quickly went wild and out of control:

  • The structure within the yaml file was untameable. Lots of repeated fields, values, mixed configurations.
  • It was impossible to know if deploying to an environment worked until you actually attempted it! This was often caused by changes to the jinja2 templates to accommodate a new inventory value (say, feature_X) which would break environments that didn’t have that feature defined.
  • Same for scripts: you wouldn’t know whether they worked or not until you actually ran them.
  • At the time, Kubernetes kept changing so quickly that adapting the manifests to cope with different versions was super annoying! Especially the annotations to actual manifest value “dance”.
  • External factor: the development team switched from using config files to command line options. Such a small change actually made our system collapse and forced us to think again and look for a new solution.
  • Most importantly: templating yaml with Jinja (or Go Templates) SUCKED! We had a sad riddle at the time: “What looks like text, reads like text, smells like text, but ain’t text?”. Or as Lee Briggs eloquently put it: “Why the fuck are we templating yaml?”
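
To make the riddle concrete, here is a minimal Python sketch of the two approaches. It is not code from our old system: the template, the values and the function names are made up for illustration, and `string.Template` stands in for Jinja2 so that nothing beyond the standard library is needed. In the text version, the caller has to know how deep the placeholder sits and indent by hand. In the data version, nesting is part of the structure, so there is nothing to count.

```python
import json
import textwrap
from string import Template

# Hypothetical values, standing in for an entry in the yaml inventory.
RESOURCES = {"limits": {"cpu": "500m", "memory": "256Mi"}}
RESOURCES_YAML = "limits:\n  cpu: 500m\n  memory: 256Mi"

# Approach 1: treat the manifest as text. The placeholder knows nothing
# about nesting, so whoever fills it in must indent the block by hand.
TEXT_TEMPLATE = Template(
    "spec:\n"
    "  containers:\n"
    "    - name: app\n"
    "      resources:\n"
    "$resources\n"
)


def render_text(indent_by: int) -> str:
    """Fill the text template, indenting the inserted block by `indent_by` spaces."""
    block = textwrap.indent(RESOURCES_YAML, " " * indent_by)
    return TEXT_TEMPLATE.substitute(resources=block)


# Approach 2: treat the manifest as data and serialise it at the very end.
def render_data() -> str:
    """Build the manifest as nested dicts and dump it as JSON (kubectl accepts JSON too)."""
    manifest = {
        "spec": {"containers": [{"name": "app", "resources": RESOURCES}]}
    }
    return json.dumps(manifest, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render_text(8))  # "limits" is nested under "resources", as intended
    print(render_text(6))  # off by two: "limits" becomes a sibling of "resources"
    print(render_data())   # no indentation to get wrong
```

Note that the off-by-two output is still well-formed yaml, just with a different meaning, which is exactly why this class of mistake is so hard to spot in review.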

Kapitan in the making

With all the learning from the failed experience, together with Ricardo Amaro we started thinking about what our ideal configuration management system would look like. We didn’t have a clear idea at the time, but we knew what we liked and what we didn’t like.

Like

  • Git based
  • Templating in general: separate “data/values” from templates
  • Separate values for different aspects (application, kubernetes, runtime..)
  • Object oriented approach
  • Simpler “yaml” as an interface to hide kubernetes complexity
  • Explicit and clear to understand what happens and why
  • Reusing values across different components
  • Scripts should also be able to access the values

Dislike

  • kubectl contexts
  • text template engines to generate yaml
  • having to count indents: i.e. `{{ toYaml .Values.resources | indent 10 }}`
  • Magic: everything had to be explicit. No magic hacks.
  • Having to manage application passwords/secrets manually.
  • tiller approach: we wanted to be in control of applying manifests.
  • git-crypt approach: secrets are normally unencrypted on disk.
  • piping templates directly to kubectl.
  • passing command line options.

Two things then happened:

  1. We discovered jsonnet by Dave Cunningham, which delivered on templating yaml/json using an object oriented language.
  2. Gustavo Buriola introduced us to reclass, which I think was the biggest contribution to the project.

Ricardo Amaro got to work and soon the whole team was collaborating on Kapitan, either on the core functionalities or on the actual use within our internal project. Secret management, gpg/kms support, custom functions support: Kapitan is now a complete product that delivers more than what’s on the tin.

Hey Kapitan: what’s that all about?

Kapitan is an attempt to solve all/most of the issues identified earlier.

From a technology point of view, Kapitan is very simple:

  • inventory: A yaml based hierarchical collection of values that describe your deployment. Based on reclass. Think Hiera.
  • template engines: Currently Jinja2, Jsonnet, Kadet. They consume the inventory and produce files, usually yaml, json, documentation or bash scripts.
  • secrets: Template secrets so that Kapitan can handle them for you.
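
To give a feel for what “hierarchical” means here, the following is a minimal Python sketch of the idea, not reclass’s actual implementation (which also handles things like value interpolation and list merging). The class names, target name and values are hypothetical; the only thing borrowed from reclass is the notion of a target that lists classes and adds its own parameters, with later definitions overriding earlier ones.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` wins and nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical classes: reusable groups of values.
CLASSES = {
    "common": {
        "namespace": "default",
        "mysql": {"image": "mysql:5.7", "replicas": 1},
    },
    "production": {"mysql": {"replicas": 3}},
}

# A hypothetical target: it picks classes, then overrides what it needs.
TARGETS = {
    "target_A": {
        "classes": ["common", "production"],
        "parameters": {"namespace": "target-a"},
    },
}


def resolve(target_name: str) -> dict:
    """Flatten a target into its final parameters: classes in order, target last."""
    target = TARGETS[target_name]
    params: dict = {}
    for class_name in target["classes"]:
        params = deep_merge(params, CLASSES[class_name])
    return deep_merge(params, target["parameters"])


if __name__ == "__main__":
    print(resolve("target_A"))
```

The resolved parameters are what templates and scripts would then consume, so a value such as the MySQL image is defined once and reused everywhere.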

We make massive use of jsonnet to drive the templating part of the manifests, while we use Jinja for everything else.

A problem we sometimes hear about is that there is a disconnect between what a jsonnet file looks like and what the equivalent yaml file (or example) looks like. This makes it difficult for some people to buy into the initial commitment to use jsonnet.

If you are wondering what Kadet is, it is our own attempt to solve this problem by creating a Python wrapper around a yaml file. You can then use your favourite yaml file as a seed, and augment it with Python.
I like to think of it like a Python exoskeleton for yaml! More on this soon.
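
As a rough illustration of the “exoskeleton” idea, here is a minimal Python sketch. It is not Kadet’s API: the `Deployment` class and its methods are hypothetical, and the seed is written as JSON rather than yaml only so that the example runs with the standard library alone. The point is that the seed stays recognisable as a manifest, while Python takes care of the repetitive parts, such as setting the same labels in three places.

```python
import json

# A seed manifest you could have copied from any example.
SEED = """
{
  "apiVersion": "apps/v1",
  "kind": "Deployment",
  "metadata": {"name": "placeholder", "labels": {}},
  "spec": {
    "replicas": 1,
    "selector": {"matchLabels": {}},
    "template": {
      "metadata": {"labels": {}},
      "spec": {"containers": []}
    }
  }
}
"""


class Deployment:
    """Hypothetical wrapper that augments a seed manifest with Python."""

    def __init__(self, name: str, seed: str = SEED) -> None:
        self.root = json.loads(seed)
        self.root["metadata"]["name"] = name
        self.with_labels({"app": name})

    def with_labels(self, labels: dict) -> "Deployment":
        # The same labels must appear in three places for the selector to match.
        spec = self.root["spec"]
        for target in (
            self.root["metadata"]["labels"],
            spec["selector"]["matchLabels"],
            spec["template"]["metadata"]["labels"],
        ):
            target.update(labels)
        return self

    def with_replicas(self, count: int) -> "Deployment":
        self.root["spec"]["replicas"] = count
        return self

    def with_container(self, name: str, image: str) -> "Deployment":
        containers = self.root["spec"]["template"]["spec"]["containers"]
        containers.append({"name": name, "image": image})
        return self

    def dump(self) -> str:
        """Serialise the manifest deterministically."""
        return json.dumps(self.root, indent=2, sort_keys=True)


if __name__ == "__main__":
    web = Deployment("web").with_replicas(2).with_container("web", "nginx:1.25")
    print(web.dump())
```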

From a process/workflow point of view, Kapitan immediately shows how opinionated it is:

  • opt-in: We don’t enforce any particular workflow/technology; the principles below are simply how we tend to operate. You are free to use Kapitan whichever way you prefer. You do not have to use git, you do not have to compile files in, you don’t even need to use jsonnet! Use what you like and ignore the rest.
  • gitops by birth: Everything is in git. Everything is in “master” branch which represents the blueprint, the desired state.
  • declarative: Kapitan encourages you to always compile your manifests templates into their concrete representations. You also compile your scripts.
  • controlled context: We use compiled scripts to simplify tasks like setting contexts and configuring the clusters.
    Configure your kubernetes setup: compiled/target_A/setup/setup.sh
    Apply your changes by running: compiled/target_A/setup/apply.sh
  • idempotent: Kapitan encourages you to make changes to the templates and the inventory to refactor your code. Compiled manifests/code will not change if you did not mean them to, meaning you get reassurance that your refactoring is correct and low risk.
  • cause&effect: We encourage a workflow where both inventory/template changes and compiled files are in the same merge request. This allows the reviewer to assess both the intended change and the actual effect it causes. This is very helpful to understand if a change to a template will affect one, two or many targets!
  • last mile: Kapitan does not speak Kubernetes: all it does is create files. kubectl is in charge of actually deploying the change. All we do is wrap commands so that they are executed in a consistent way.
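
The “idempotent” and “cause&effect” principles can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Kapitan is implemented: the `compile_target` function, the `compiled/<target>/manifest.json` layout and the inventory values are all hypothetical. The idea is that compilation is deterministic, so a pure refactoring produces no diff in the compiled files, while a real change shows up only under the targets it affects.

```python
import difflib
import json


def compile_target(params: dict) -> str:
    """Render one target's parameters into a deterministic manifest."""
    manifest = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": params["name"], "namespace": params["namespace"]},
        "data": {key: str(value) for key, value in params["settings"].items()},
    }
    # sort_keys makes the output independent of the order values were defined in.
    return json.dumps(manifest, indent=2, sort_keys=True) + "\n"


def compile_all(inventory: dict) -> dict:
    """Map each target to the content of its (hypothetical) compiled file."""
    return {
        f"compiled/{name}/manifest.json": compile_target(params)
        for name, params in inventory.items()
    }


def changed_files(before: dict, after: dict) -> dict:
    """Return a unified diff for every compiled file whose content changed."""
    diffs = {}
    for path in sorted(set(before) | set(after)):
        old, new = before.get(path, ""), after.get(path, "")
        if old != new:
            diffs[path] = "".join(
                difflib.unified_diff(
                    old.splitlines(keepends=True),
                    new.splitlines(keepends=True),
                    fromfile=f"a/{path}",
                    tofile=f"b/{path}",
                )
            )
    return diffs


if __name__ == "__main__":
    inventory = {
        "target_A": {"name": "app", "namespace": "a", "settings": {"workers": 2}},
        "target_B": {"name": "app", "namespace": "b", "settings": {"workers": 2}},
    }
    before = compile_all(inventory)
    inventory["target_B"]["settings"]["workers"] = 4
    for path, diff in changed_files(before, compile_all(inventory)).items():
        print(path)
        print(diff)
```

In a merge request, the equivalent of `changed_files` is simply the diff of the compiled directory, which is what lets a reviewer see at a glance how many targets a template change touches.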

Do YOU need it?

Let’s be clear: you probably don’t need Kapitan (yet)!
Depending on what you are trying to do, and how complex your setup is, you might be doing just fine.

Kapitan is a powerful tool, which requires an investment which only makes sense for complex scenarios where you anticipate the need to deploy many applications to many clusters.

If you need off-the-shelf applications, if you are just learning about and exploring Kubernetes, or if you are happy with your current workflow, you are probably fine with Helm or the alternative du jour.

The way I tend to explain it is that I see Helm as the apt-get of Kubernetes, whereas Kapitan aims to be (loosely) a bit more like Puppet.

In my next post, I will show some more concrete examples and do a deep dive on the inventory. Please comment to let me know what you want to expand on or whether you agree/disagree with the content of this post.

Alessandro De Maria
Kapitan Blog

#father #kapitan #devops. Head of SRE at Synthace. Ex DeepMind. Ex Google. Opinions are my own