How Tenable Uses Helm to Template a Microservice Stack, Part 2

Jonathan Lynch · Tenable TechBlog · May 16, 2019

In Part 1, we showed some of the ways Tenable uses Helm features to help manage Kubernetes configuration of our microservices. In this article, we’re taking things up a layer and showing off some of our orchestration “glue code”.

Start-up Order: Defining Inter-service Dependencies

Thank you to Bart Walczak and Brett Au for their work on this feature!

It is a common problem in the cloud that certain dependencies must be available before your service can start correctly. A typical example: the database must be up before the program that uses it can run. Microservices introduce further permutations on this theme, e.g. one service might depend on another service, which itself depends on a database. This is especially true when bootstrapping a new site, as opposed to maintaining an existing site where everything is already in place.

To achieve this ordering we make use of Init Containers. In Kubernetes, an Init Container is a container that must run and exit before the Pod’s main container(s) can start. Our Init Container’s job is to wait until the requisite dependency is met before exiting, which ensures that the required resources are ready before the microservice starts.

Above: Service A’s Init Container prevents it from starting until Service B is ready.

Init Containers are a typical solution to this problem, but what’s interesting is how we expose this option via the Helm values.yaml. At Tenable, we expose a “dependencies” stanza to allow microservices to express their relationships with other services or datastores.
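A minimal sketch of such a stanza follows; the key names and structure here are illustrative, chosen to match the descriptions below rather than reproducing the original excerpt.

```yaml
dependencies:
  - name: cassandra      # datastore reachable at the "cassandra" Service
    type: endpoint
  - name: kafka          # datastore reachable at the "kafka" Service
    type: endpoint
  - name: service-user   # another microservice this one depends on
    type: pod
```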

The above values.yaml excerpt expresses that this particular microservice has dependencies on two datastores (Cassandra and Kafka) and another microservice (service-user). Let’s take a look at the Helm template that interprets these dependencies.
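A simplified sketch of the relevant template logic is shown below; the image tag, script path, and mount point are illustrative assumptions.

```yaml
{{- if .Values.dependencies }}
      initContainers:
        - name: wait-for-dependencies
          image: helm-worker:latest                 # custom utility image pre-loaded with orchestration scripts
          command: ["python", "/opt/waitfor.py"]    # dependency arguments are templated in (shown below)
          volumeMounts:
            - name: kubeconfig                      # the cluster kubeconfig, mounted so the container can query the cluster
              mountPath: /root/.kube
{{- end }}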

This is an excerpt from our _deployment.tpl which is shared by all microservices. It stipulates that if dependencies are defined in a chart’s values.yaml, then Helm should include an Init Container in that chart’s Deployment. The container we run is helm-worker, which is a custom utility image that we pre-load with scripts and libraries we need for orchestration. We mount in the Kubernetes cluster’s kubeconfig so that the container is able to perform administrative functions within the cluster.

The command we run in helm-worker is waitfor.py. The details of waitfor.py are outside the scope of this article, but it’s a simple Python script that loops until the dependencies are available. It receives the list of dependencies as command-line arguments, which are plumbed in via the following section of Helm code:
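A sketch of that section, assuming the dependencies stanza above; the script path and flag spellings are illustrative.

```yaml
          command: ["python", "/opt/waitfor.py"
            {{- range .Values.dependencies }}
            {{- if eq .type "pod" }}
            {{- printf ", \"-p\", \"%s\"" .name }}
            {{- else if eq .type "endpoint" }}
            {{- printf ", \"-e\", \"%s\"" .name }}
            {{- end }}
            {{- end }}]
```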

Here we loop through each dependency defined in the values.yaml. If it’s a pod dependency, we add “-p <POD>” to the command line, and if it’s an endpoint dependency, we add “-e <HOST>”.

Let’s examine one of those printf lines closely. Kubernetes expects a command to be specified as [“a”, “comma-separated”, “list”, “of”, “strings”] as opposed to “a single space-separated string”. We are using Helm’s ability to suppress leading and trailing whitespace in order to produce a single line of output using multiple lines of template. Notice how the closing square bracket for the list we started earlier follows the last {{- end }}, above.

When this template is executed using the example values.yaml from before, it produces:
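Assuming the sketches above, the rendered Deployment would contain a single-line command along these lines:

```yaml
command: ["python", "/opt/waitfor.py", "-e", "cassandra", "-e", "kafka", "-p", "service-user"]
```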

When all the checks come back green, the command exits successfully, causing the Init Container to exit and allowing the Pod to continue coming up!

Database Initialization Self-service with Helm Hooks

Thank you to Lance Dryden and Brett Au for their work on this feature!

Most microservices need to interact with one or more backing datastores. Making this connection requires two basic elements: an endpoint and credentials.

For the endpoints, we utilize “convention over configuration” by using Kubernetes Services to provide DNS addresses matching the type of the datastore. Cassandra is available at “cassandra”, Kafka is available at “kafka”, and so on.

Credentials are trickier because they need to be bootstrapped on a per-microservice basis. Depending on the type of datastore, a schema may need to be created as well. New microservices regularly come into being, and we regularly stand up new sites, so we can’t rely on a manual step here. It would be better if a microservice could declare what datastores it needs and automation could take care of the rest.

To make this work we have a bootstrap script that logs into the database, creates a schema and a user with a randomized password, and inserts that information into Kubernetes Secrets presented to the microservice as environment variables. This script is nothing special and probably exists in some form in every organization that uses a database.

What’s interesting is how we call the script: via a Helm Hook.

Above: A Helm Hook initializes a database and makes the credentials available to the Deployment via a Secret.

Hooks are a Helm feature that allows Helm to launch resources into your cluster in response to Helm lifecycle events (such as a chart being installed). All it takes to turn a regular Kubernetes Object into a Helm Hook is to apply an annotation that indicates which lifecycle event the object should be tied to.

Our Hook template only has a few notable customizations. To start with, we use the range function to loop over the hooks stanza of the values.yaml, a strategy we made extensive use of in Part 1:
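A sketch of the top of such a hook template; the resource naming scheme is illustrative.

```yaml
{{- range $index, $hook := .Values.hooks }}
---
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ $.Chart.Name }}-hook-{{ $index }}   # the index keeps names unique when multiple hooks are defined
```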

Unlike previous examples, this time our range function populates both the index and the value of each list element. The index is automatically filled out by Helm, and gives us a convenient method for providing a unique name for the resource in case multiple hooks are specified in the values.yaml. Next we set the Hook annotations:
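Something along these lines, assuming each hook entry in the values.yaml carries a type field:

```yaml
  annotations:
    "helm.sh/hook": {{ $hook.type | quote }}         # e.g. pre-install,pre-upgrade, taken from the values.yaml entry
    "helm.sh/hook-delete-policy": hook-succeeded     # clean up on success, keep the Job (and its logs) on failure
```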

Rather than statically setting the hook type, we pull it from the values.yaml entry. It’s important to specify a hook-delete-policy, because otherwise the resources created by the Hook will persist in the cluster and prevent future Hooks from running. We chose “hook-succeeded” so that the Hook cleans itself up under normal circumstances but allows us to recover the logs in case of failure. Further down in the template, we define the Job’s container as follows:
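A sketch of the container definition, assuming each hook entry carries a command list and reusing the same helm-worker utility image:

```yaml
spec:
  template:
    spec:
      containers:
        - name: bootstrap
          image: helm-worker:latest        # same utility image the Init Containers use (assumed)
          command:
            {{- range $hook.command }}
            - {{ . | quote }}              # each list element from values.yaml becomes one command argument
            {{- end }}
      restartPolicy: Never
```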

The first interesting thing here is how we template out the command. As mentioned in the previous section, Kubernetes expects the command and its arguments to be a list. We also require a list in the corresponding values.yaml stanza that drives this template.
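A hypothetical hooks stanza along these lines; the bootstrap script paths are illustrative.

```yaml
hooks:
  - type: pre-install,pre-upgrade
    command: ["/opt/bootstrap/postgres.sh", "service-agent"]
  - type: pre-install,pre-upgrade
    command: ["/opt/bootstrap/cassandra.sh", "service-agent"]
```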

The above is from the values.yaml of service-agent, a microservice that needs access to both postgres and cassandra. Before the Deployment fires up, these Hooks run and ensure that each database is initialized with the service’s schema. By defining the Hook for both pre-install and pre-upgrade, we ensure that the bootstrapping occurs both during the initial installation of the vuln-management chart and during future upgrades of the chart. Specifying both gives us a method for making updates to existing sites.

Aside: Hooks vs. Init Containers

Our Hooks and Init Containers both fill the role of executing code before the microservice starts up. However, each has certain advantages that the other lacks.

Hooks, unlike Init Containers, run to completion before the rest of the resources are applied to the cluster. This is critical for our bootstrapping design, where the script creates a previously non-existent Secret that is then consumed by the Pods. You can’t reference a Secret that doesn’t exist yet: Kubernetes would throw a CreateContainerConfigError.

Init Containers, unlike Hooks, are serialized on a per-Pod basis rather than per-Chart. This is critical for ordering startup of multiple microservices defined by the same chart. Otherwise, the Hooks would deadlock the chart in a catch-22: the Hook can’t finish until the Pods are ready, but the Pods aren’t created until the Hook finishes.

Iterating on the Self-service API

Thank you to Brett Au for his work on this feature!

I hinted in Part 1 that we want to provide maximum control to the chart maintainer without bogging them down in the technical details of the platform. At this time the chart maintainer is SRE, but down the road we want to remove SRE as a bottleneck by building a set of guardrails strong and full-featured enough that the developers themselves can maintain their chart (or, more accurately, their values.yaml).

The Hooks template, as defined, is extremely powerful, but it also requires the chart maintainer to know what script to call and what arguments to send. Ideally, the backend infrastructure could be treated as a black box and the microservice would simply declare its dependencies. Let’s revisit the dependencies stanza from before, only now instead of declaring an “endpoint” dependency, we’ll declare a “datastore”.
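A sketch of the revised stanza, reusing the illustrative keys from before:

```yaml
dependencies:
  - name: cassandra
    type: datastore
  - name: kafka
    type: datastore
  - name: service-user
    type: pod
```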

Now the command definition within the Init Container can be expanded to account for this new type of dependency.
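One way to sketch that expansion; treating a datastore as a network endpoint for waiting purposes is an assumption, as is the flag spelling.

```yaml
          command: ["python", "/opt/waitfor.py"
            {{- range .Values.dependencies }}
            {{- if eq .type "pod" }}
            {{- printf ", \"-p\", \"%s\"" .name }}
            {{- else if eq .type "datastore" }}
            {{- printf ", \"-e\", \"%s\"" .name }}
            {{- end }}
            {{- end }}]
```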

We can also add a Hook that keys off this information. Instead of ranging over .Values.hooks, we can range over .Values.dependencies and render output only when a datastore dependency is present.
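A sketch of the top of that template (a fragment showing only the selection and metadata; naming is illustrative):

```yaml
{{- range $index, $dep := .Values.dependencies }}
{{- if eq $dep.type "datastore" }}
---
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ $.Chart.Name }}-bootstrap-{{ $dep.name }}
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
```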

Then, when we construct the command, instead of simply reproducing the command dictated by the hook stanza, we can programmatically determine the bootstrapping that needs to run.
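For example, the container command inside that Job might be derived from the dependency name rather than copied from a hooks stanza; the bootstrap script layout here is an assumption.

```yaml
      containers:
        - name: bootstrap-{{ $dep.name }}
          image: helm-worker:latest
          command:
            - "/opt/bootstrap/{{ $dep.name }}.sh"   # e.g. /opt/bootstrap/cassandra.sh
            - {{ $.Chart.Name | quote }}            # the microservice whose schema and credentials are created
```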

With these adjustments, we now have a simple, non-redundant, and self-explanatory API that allows the infrastructure to build itself automatically as needed.

Helm Hype

If you’ve got a large scale Kubernetes infrastructure to orchestrate, Helm is a great option. Helm has features that cover most things you can’t do with pure Kubernetes. What little glue code remains is unfortunately outside the scope of this article, but hopefully these examples can help you automate all the things!
