Creating a Mutating Webhook for Great Good!

Or: how to automatically provision Pods on a specific node pool

Benjamin Tan Wei Hao
DKatalis

--

Until recently, mutating webhooks in Kubernetes were one of those things that I knew existed but didn’t have much use for. Then one fine day, I ran into a problem that was a perfect fit for what mutating webhooks are meant to solve.

The Problem

We wanted to shift all our Kubeflow Pipeline workloads onto a dedicated GKE node pool. Of course, simply provisioning a node pool is not enough. For the pool to run Kubeflow Pipeline workloads and nothing else, each of those Pods has to carry a nodeSelector that targets the pool and tolerations that match the pool's taints. It is not reasonable to ask every data scientist to remember to include these when writing their pipelines, because scheduling isn't their concern.
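To make that concrete, here is a minimal Go sketch of the two scheduling fields each Pod would need. The pool name `kfp-pool` and the taint `dedicated=kubeflow:NoSchedule` are placeholders I made up for illustration; `cloud.google.com/gke-nodepool` is the label GKE puts on nodes to record their pool. The structs are simplified stand-ins for the real Kubernetes types, so that the snippet runs with only the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Toleration is a simplified stand-in for the Kubernetes core/v1 Toleration.
type Toleration struct {
	Key      string `json:"key"`
	Operator string `json:"operator"`
	Value    string `json:"value"`
	Effect   string `json:"effect"`
}

// SchedulingFields holds the parts of a Pod spec that decide where it lands.
type SchedulingFields struct {
	NodeSelector map[string]string `json:"nodeSelector"`
	Tolerations  []Toleration      `json:"tolerations"`
}

// kfpScheduling returns the fields a Pod needs to run on the given pool.
// The taint key, value, and effect are assumptions, not taken from a real cluster.
func kfpScheduling(pool string) SchedulingFields {
	return SchedulingFields{
		// Only consider nodes that belong to the dedicated pool.
		NodeSelector: map[string]string{"cloud.google.com/gke-nodepool": pool},
		// Tolerate the (assumed) taint that keeps other workloads off the pool.
		Tolerations: []Toleration{
			{Key: "dedicated", Operator: "Equal", Value: "kubeflow", Effect: "NoSchedule"},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(kfpScheduling("kfp-pool"), "", "  ")
	if err != nil {
		panic(err)
	}
	// Prints the fragment as it would appear under "spec" in a Pod manifest.
	fmt.Println(string(out))
}
```

The nodeSelector pulls the Pod towards the pool, while the toleration lets it past the taint that keeps everything else out. Both are needed.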

Therefore, the problem becomes: how can we automatically schedule Kubeflow Pipeline Pods, which may come from any number of namespaces, onto the dedicated GKE node pool?

The Solution

Mutating webhooks! In the rest of the post, I’ll describe how I created a mutating webhook that:

  1. Looks for Pods with a specific label (pipelines.kubeflow.org/kfp_sdk_version)
  2. Adds the nodeSelector and tolerations so that those Pods get scheduled on the dedicated node pool
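The two steps above can be sketched as a single function that inspects a Pod and returns a JSON Patch, which is the format a mutating webhook uses to describe its changes. This is a simplified illustration rather than the final webhook: the structs stand in for the real Kubernetes types, the function name `buildPatch` is my own, and the pool name and taint are the same made-up placeholders (`kfp-pool`, `dedicated=kubeflow:NoSchedule`).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Simplified stand-ins for the Kubernetes Pod types (standard library only).
type Toleration struct {
	Key      string `json:"key"`
	Operator string `json:"operator"`
	Value    string `json:"value"`
	Effect   string `json:"effect"`
}

type Pod struct {
	Metadata struct {
		Labels map[string]string `json:"labels"`
	} `json:"metadata"`
	Spec struct {
		NodeSelector map[string]string `json:"nodeSelector"`
		Tolerations  []Toleration      `json:"tolerations"`
	} `json:"spec"`
}

// PatchOp is one RFC 6902 JSON Patch operation.
type PatchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

const (
	kfpLabel = "pipelines.kubeflow.org/kfp_sdk_version"
	poolKey  = "cloud.google.com/gke-nodepool"
	poolName = "kfp-pool" // hypothetical pool name
)

// Hypothetical taint on the dedicated pool: dedicated=kubeflow:NoSchedule.
var kfpToleration = Toleration{Key: "dedicated", Operator: "Equal", Value: "kubeflow", Effect: "NoSchedule"}

// escapePointer escapes "~" and "/" so a key can be used in a JSON Pointer path.
func escapePointer(s string) string {
	s = strings.ReplaceAll(s, "~", "~0")
	return strings.ReplaceAll(s, "/", "~1")
}

// buildPatch returns nil for Pods without the Kubeflow label, and otherwise
// the operations that add the nodeSelector and the toleration.
func buildPatch(pod Pod) []PatchOp {
	if _, ok := pod.Metadata.Labels[kfpLabel]; !ok {
		return nil // step 1: not a Kubeflow Pipeline Pod, leave it alone
	}
	var ops []PatchOp
	// Step 2a: create the nodeSelector map, or add one key to an existing map.
	if pod.Spec.NodeSelector == nil {
		ops = append(ops, PatchOp{"add", "/spec/nodeSelector", map[string]string{poolKey: poolName}})
	} else {
		ops = append(ops, PatchOp{"add", "/spec/nodeSelector/" + escapePointer(poolKey), poolName})
	}
	// Step 2b: create the tolerations list, or append to an existing list.
	if pod.Spec.Tolerations == nil {
		ops = append(ops, PatchOp{"add", "/spec/tolerations", []Toleration{kfpToleration}})
	} else {
		ops = append(ops, PatchOp{"add", "/spec/tolerations/-", kfpToleration})
	}
	return ops
}

func main() {
	raw := `{"metadata":{"labels":{"pipelines.kubeflow.org/kfp_sdk_version":"1.8.0"}},"spec":{}}`
	var pod Pod
	if err := json.Unmarshal([]byte(raw), &pod); err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(buildPatch(pod), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Checking whether the fields already exist matters: an "add" on /spec/nodeSelector would overwrite a selector the Pod already has, and appending with "/-" only works when the tolerations list is already there. In an actual webhook, this patch would be serialized and returned inside the admission response.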

--

Benjamin Tan Wei Hao
DKatalis

Author of The Little Elixir & OTP Guidebook, Mastering Ruby Closures, Building an ML Pipeline in Kubeflow. | Currently: Product Owner at @dkatalis.