Making Kubernetes our own

Hi! My name is Michael Bannister, and I’m the Lead Platform Engineer working within the Digital Platform teams at John Lewis & Partners (JL&P).

At the end of 2018, my colleague Alex wrote about our journey so far, building a platform on Kubernetes and Google Cloud Platform. We’re currently serving about 25 development teams building new customer-facing features and business capabilities, all at various stages from day one of development through to serving production traffic for nearly a year.

Over the last year we’ve seen teams who have largely figured out Kubernetes on their own, referring to existing scripts and config as well as the Kubernetes documentation. But we’ve also seen teams struggle to get their application running, having copied an example Deployment from somewhere and applied a lot of trial and error to get it working.

I’ll describe some of the ways we tried to solve this, and show you why we are excited about our recent decision to create a new Custom Resource Definition in Kubernetes. We call it…

… the Microservice. (Seems obvious? Good – names should be obvious!)

Our platform deliberately imposes a number of requirements on the workloads that run on it. These include security policies (e.g. disallowing running containers as root), exposing HTTP metrics and defining authentication in the Ingress configuration. We had some “onboarding” docs, but during the first few months they changed frequently, and teams fed back that it felt like we were asking them to hit a moving target. We also used a slightly more-complicated-than-average application as the example Kubernetes YAML to copy from.
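
To give a flavour of what those requirements mean in practice, here’s a rough sketch of the kind of container-level settings a workload has to carry to satisfy them. The values and port names are illustrative rather than our exact policy:

# Illustrative only - the exact values and names are not our real conventions
securityContext:
  runAsNonRoot: true              # platform policy: containers must not run as root
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
ports:
  - name: http                    # application traffic
    containerPort: 8080
  - name: metrics                 # HTTP metrics endpoint for Prometheus to scrape
    containerPort: 9090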

So, we created a couple of sample applications – a simple UI and a “backend” API – which we deployed (using GitLab CI) and monitored as if they were a real service, and we referenced this from our “onboarding” documentation. But this had a few drawbacks which emerged over time:

  1. Developers still had to copy a set of files into their repo and then go through and find/replace certain values. They didn’t always find/replace the right things – there was no guidance – and so we ended up with some really jumbled naming of things, or overcomplicated configuration where it wasn’t required.
  2. Whenever we made improvements to our standard configuration – for example, adding a Traefik proxy sidecar to capture common HTTP metrics for a new standard dashboard, or setting pod anti-affinity to ensure replicas didn’t get scheduled on the same node (there’s a sketch of that stanza after this list) – we ended up combing through many repositories to make pull requests, rather than trying to describe to all the teams what changes they should make themselves. We knew this approach wouldn’t scale as more services joined the platform.
  3. A typical application’s Service, Deployment, and ServiceMonitor (used by CoreOS’s Prometheus Operator) ended up being about 150 lines of YAML. That’s a lot for someone to look through and understand which bits are important to their own application, which bits are standard Kubernetes boilerplate, and which bits are platform-mandated boilerplate.
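
To give an idea of what those merge requests contained, here’s roughly the pod anti-affinity stanza that had to be added to every Deployment. The weight and label are made up for the example:

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          topologyKey: kubernetes.io/hostname    # prefer spreading replicas across nodes
          labelSelector:
            matchLabels:
              app: example-service               # illustrative label, not our convention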

To illustrate the amount of boilerplate, here is a simple example application with Service, Deployment and ServiceMonitor, where the highlighted lines are those specific to this particular application:

141 lines of YAML, 46 of them highlighted

On top of this, there’s a lot of repetition here: the names and labels of each resource are the same, and then there are two selectors which reference the labels. We actually recommend teams use Kustomize, which takes care of much of this repetition, but even then only 33% of the lines in these resource definitions are specific to the team’s application. That’s a pretty terrible signal-to-noise ratio. The rest will, at best, be copied and pasted and not touched again. At worst, teams might start fiddling with sections they don’t understand, get into more of a mess, and end up asking the platform team for help anyway. (We’ve seen this happen at least a couple of times that I know about.)
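
For completeness, the Kustomize layer we recommend amounts to a kustomization.yaml along these lines, which stamps a common label onto every resource and its selectors so teams don’t have to repeat it by hand. File and label names are invented for the example:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
commonLabels:
  app: example-service        # added to every resource, selector and pod template
resources:
  - service.yaml
  - deployment.yaml
  - servicemonitor.yaml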

When the Platform Team was first formed, one of our engineers was a big fan of Google App Engine, whose app.yaml has a handful of mandatory fields, while everything else is optional with sensible defaults. We thought that maybe we’d design our own app.yaml format, use it as input to some kind of templating tool to generate the full Kubernetes-native resource YAML, and pass that to kubectl apply.

But then all teams would need to get hold of our templating tool, both locally and in their deployment pipeline jobs, and we’d have to figure out how to distribute it to them and make sure everyone kept up-to-date.

You might be thinking: what about Helm? Well, we’d rejected it a while back based on concerns around Tiller being insecure, or at least difficult to secure, especially in a multi-tenant environment. There is helm template for rendering charts locally, but we didn’t much like the look of the Go template language, and we’d still have the distribution problem, since (at the time of writing) Helm doesn’t support running helm template against remote charts.

Earlier this year, I spent a day on a little proof-of-concept based on a talk from KubeCon/CloudNativeCon December 2018 called Why Are We Copying and Pasting So Much? This talk described how much boilerplate code is involved in writing custom Kubernetes controllers, and how a then-fairly-new Golang library called controller-runtime could greatly simplify your controller code, further supported by tools such as kubebuilder.

By the end of that day, I had defined a very basic Microservice custom resource which I could apply to Kubernetes and have it create a Deployment with the correct number of replicas:

apiVersion: services.jl-digital.net/v1alpha1
kind: Microservice
metadata:
  name: microservice-sample
spec:
  image: kennethreitz/httpbin:latest
  replicas: 2
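
From those few lines the controller generates a full Deployment. The sketch below shows roughly the shape of what it creates – only the fields derived from the Microservice are annotated; everything else comes from our defaults, and the real output carries the platform boilerplate too:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: microservice-sample            # derived from the Microservice name
spec:
  replicas: 2                          # from spec.replicas
  selector:
    matchLabels:
      app: microservice-sample
  template:
    metadata:
      labels:
        app: microservice-sample
    spec:
      containers:
        - name: microservice-sample
          image: kennethreitz/httpbin:latest   # from spec.image
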
Photo by Michael Bannister | A mosaic constructed by Martin Brown during KubeCon Europe 2019

Once we started properly talking about simplifying the interface for teams, I suggested the CRD/controller approach to my team and sketched out a possible format, including optional/defaulted fields. The team liked the idea, though we were a bit wary of encountering unexpected complexity in an unfamiliar language and tools. Only one of us had done any real work with Golang, and he didn’t consider himself an expert.

Nevertheless, we sketched out a series of implementation stages, kicked off the work, and within two weeks we had deployed microservice-manager and an actual Microservice to our test cluster to replace our simplest existing sample application. Challenges along the way included:

  • Using kubebuilder v2.0.0-alpha releases. We decided to brave this, but the v2 docs were rather sparse when we started the work (alpha.2). It also meant using the new Golang module system, and there was (and still is) much cursing and confusion over the fact that running go build can actually update versions of modules and break things.
  • Figuring out how to test locally. The engineers who worked on this came up with a pretty cool approach using VSCode’s Remote – Containers extension and kind (Kubernetes in Docker).
  • Avoiding infinite reconcile loops due to the apiserver setting default values for optional fields (there’s an example of this just after the list). This was helped enormously when we found functions such as SetDefaults_Container in the Kubernetes core API.
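
To make that last point concrete: the spec the controller writes is not the spec it reads back on the next reconcile, because the apiserver fills in defaults for optional fields. Roughly (the exact defaults vary by field and API version):

# Container spec as written by the controller:
containers:
  - name: app
    image: kennethreitz/httpbin:latest

# Roughly what comes back from the apiserver:
containers:
  - name: app
    image: kennethreitz/httpbin:latest
    imagePullPolicy: Always                       # defaulted for a ":latest" tag
    terminationMessagePath: /dev/termination-log  # defaulted
    terminationMessagePolicy: File                # defaulted
    resources: {}

A naive comparison between those two never matches, so the controller keeps “fixing” a Deployment that is already correct; running the same defaulting over our desired state before comparing is what let the loop settle.
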
Problem solved! Photo by NeONBRAND on Unsplash

We published some documentation and announced the new capability to teams at our fortnightly showcase. Meanwhile, we continued adding some features that we felt were going to be widely needed:

  • Support for Google Cloud Endpoints by specifying a single field enableCloudEndpoints: true (there’s a sketch after this list). Previously this required copy/pasting a whole container definition, including volume mounts from a Secret; now we provide a convention-over-configuration approach with a clearly-documented requirement on the Secret’s name and contents, and the Endpoint’s name, based on the Microservice name.
  • Improved validation using a webhook (again, much of the relevant kubebuilder documentation was missing until just after we’d figured it out). This gives teams immediate feedback if they try to specify something invalid, and allows us to be more relaxed about checking preconditions in the microservice-manager code.
  • Allow application engineers to specify a custom Prometheus metrics endpoint on their app container, which we use to configure the ServiceMonitor appropriately.
  • Create a PodDisruptionBudget for the pods in the Deployment. Since PodDisruptionBudget is immutable (before 1.15), kubectl apply doesn’t work for it, so it’s always been annoying to have to handle PDBs specially in our deployment scripts. Now, we just have the logic “if it exists, delete then create” coded into the microservice-manager in one place!
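
Putting a couple of those together, a Microservice that opts into Cloud Endpoints and exposes its metrics on a non-default path might look something like this – enableCloudEndpoints is the real flag described above, while metricsPath is an invented field name standing in for whatever the actual schema calls it:

apiVersion: services.jl-digital.net/v1alpha1
kind: Microservice
metadata:
  name: microservice-sample
spec:
  image: kennethreitz/httpbin:latest
  replicas: 2
  enableCloudEndpoints: true        # Endpoint name and Secret are derived from the Microservice name
  metricsPath: /internal/metrics    # hypothetical field: used to configure the generated ServiceMonitor

The controller derives the Endpoint name and the expected Secret from the Microservice name, and points the generated ServiceMonitor at the custom metrics path.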

Two months on from our first commit to the microservice-manager repo, several teams have started using a Microservice to define their application deployments. They’ve generally been happy, and have already given us some useful feedback. We have a few known issues, and some features we don’t support yet, but it’s working pretty well so far. We want more teams to move over to Microservice, so we’re planning to add some really desirable features and simplifications which will only be available through it.

We’re already starting to see some of the benefits we’d hoped for. Earlier I described the pain of raising merge requests across dozens of repos to ask teams to apply changes to “our” configuration in “their” deployments. Now we can just update the microservice-manager, redeploy that and watch the new configuration roll out to many deployments across the cluster!

To summarise, it feels like we’re finally starting to tap into the real power of Kubernetes: building a model that suits our needs today but allows for easier change in future. The Digital Platform teams can solve problems once to benefit all our teams, who just want to get on with building a great service for our customers and for our business, so it’s totally worth it!

--

Michael Bannister
John Lewis Partnership Software Engineering

Developer-turned-Platform-Engineer on the Digital Platform team at John Lewis & Partners. Mostly working with Kubernetes on Google Cloud.