Scheduling in Kubernetes, Part 2: Pod Affinity

The previous installment covered scheduling via Node Affinity. Node Affinity facilitates scheduling based on Node labels, which know nothing about the structure of your application. Another construct is needed to support application-aware scheduling.

Previously, we asked:

Where should I run this Pod?

Node Affinity narrowed this question down to:

Should I run my Pod on this Node?

This question is about a Node. The scheduler doesn’t factor in any outside information — just the Node itself.

What if “where should I run this pod” depends on where the rest of the application is running? The “rest of the application” is made of Pods. We want the question to be about these Pods:

Should I run my Pod in the same place as this other Pod?

Photo: Scheduling is about finding hardware to run your code.

Pod Selector

The first step is to define which “other Pod” we’re talking about. This works just like selecting a Node in the previous installment: selection based on labels.

For example, if we’re interested in Pods whose app label is web-frontend, the selector is app=web-frontend.
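To make label selection concrete, here is a minimal Python sketch of an equality-based selector. It is a toy model rather than Kubernetes code: the matches helper, the tier label, and the web-store Pod are assumptions made up for illustration.

```python
def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """Return True if every key=value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical Pods and their labels.
pods = {
    "pod-a": {"app": "web-frontend", "tier": "web"},
    "pod-b": {"app": "web-store"},
}

selector = {"app": "web-frontend"}
selected = [name for name, labels in pods.items() if matches(selector, labels)]
print(selected)  # ['pod-a']
```

Extra labels on a Pod (like tier above) don't prevent a match. Only the keys named in the selector are compared.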


The next step is to define what “in the same place” means. In the diagram below, are the two Pods in the same place?

If we use the hostname label, “in the same place” means “on the same host”. Here, the Pods are in different “places”:

We can use any Node label for this notion of “place”. Another option is the zone label. Here, the Pods are in the same zone:

Custom topologies can be encoded as user-defined Node labels. For example, you might label Nodes with the rack they belong to. Here’s how a generic custom_topology label creates groups of “co-located” Nodes:

Diagram: node-4 and node-3 are co-located in foo. pod-a is in foo. pod-b is in bar.
Diagram: pod-a and pod-b are both in bar.
Diagram: pod-a and pod-b are both in foo.
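The idea of “place” can be sketched in a few lines of Python: two Pods are co-located when their Nodes carry the same value for the chosen label. This is a toy model. The node-1 and node-2 entries, pod-c, and the Pod-to-Node assignments are assumptions for illustration. kubernetes.io/hostname is the well-known Kubernetes hostname label, and custom_topology is the generic user-defined label from the example above.

```python
from typing import Optional

# Node name -> Node labels.
nodes = {
    "node-1": {"kubernetes.io/hostname": "node-1", "custom_topology": "bar"},
    "node-2": {"kubernetes.io/hostname": "node-2", "custom_topology": "bar"},
    "node-3": {"kubernetes.io/hostname": "node-3", "custom_topology": "foo"},
    "node-4": {"kubernetes.io/hostname": "node-4", "custom_topology": "foo"},
}

# Pod name -> the Node it runs on (hypothetical assignments).
pod_to_node = {"pod-a": "node-3", "pod-b": "node-1", "pod-c": "node-4"}


def place(pod: str, topology_key: str) -> Optional[str]:
    """Return the 'place' a Pod is in: its Node's value for the topology label."""
    return nodes[pod_to_node[pod]].get(topology_key)


def co_located(pod_x: str, pod_y: str, topology_key: str) -> bool:
    """Two Pods share a place if their Nodes have the same value for the label."""
    place_x = place(pod_x, topology_key)
    return place_x is not None and place_x == place(pod_y, topology_key)


print(co_located("pod-a", "pod-b", "custom_topology"))         # False (foo vs. bar)
print(co_located("pod-a", "pod-c", "custom_topology"))         # True  (both in foo)
print(co_located("pod-a", "pod-c", "kubernetes.io/hostname"))  # False (different hosts)
```

The same pair of Pods can be “together” under one label and “apart” under another, which is why the choice of label matters.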

Should I run my Pod in the same place as this other Pod?

The first and second steps gave us this:

pod: app=web-frontend

i.e. Should I run my Pod on the same host as a web-frontend Pod?

The third step is to decide whether the answer should be Yes or No. Yes is called Affinity. No is called Anti-Affinity.


Here’s what Yes looks like: (it’s the same as above)

pod: app=web-frontend

i.e. My Pod should run on the same host as a web-frontend Pod.

This rule is useful if you want to run your web-store (“My Pod”) on the same host as a web-frontend instance.


Here’s what No looks like: (note the anti_pod key)

anti_pod: app=web-frontend

i.e. My Pod should not run on the same host as a web-frontend Pod.

This rule is useful if you want to make sure your web-frontend instances all run on different hosts.
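Putting the three steps together, here is a rough Python sketch of how a single rule might be checked against a candidate Node. It is a simplified model, not the scheduler's implementation: the Rule class, the satisfies function, and the two-Node cluster are hypothetical, and the default topology is assumed to be the hostname label.

```python
from dataclasses import dataclass

HOSTNAME = "kubernetes.io/hostname"  # well-known Node label


@dataclass
class Rule:
    selector: dict[str, str]       # labels the "other Pod" must carry
    topology_key: str = HOSTNAME   # Node label that defines "place"
    anti: bool = False             # False: pod (Yes), True: anti_pod (No)


def satisfies(candidate, rule, nodes, running):
    """Check one rule for one candidate Node.

    nodes:   Node name -> Node labels
    running: list of (Pod labels, Node name) for Pods already scheduled
    """
    place = nodes[candidate].get(rule.topology_key)
    # Is there a matching Pod in the same place as the candidate Node?
    found = place is not None and any(
        nodes[node].get(rule.topology_key) == place
        and all(labels.get(k) == v for k, v in rule.selector.items())
        for labels, node in running
    )
    return not found if rule.anti else found


nodes = {"node-1": {HOSTNAME: "node-1"}, "node-2": {HOSTNAME: "node-2"}}
running = [({"app": "web-frontend"}, "node-1")]

affinity = Rule({"app": "web-frontend"})                  # pod: app=web-frontend
anti_affinity = Rule({"app": "web-frontend"}, anti=True)  # anti_pod: app=web-frontend

print([n for n in nodes if satisfies(n, affinity, nodes, running)])       # ['node-1']
print([n for n in nodes if satisfies(n, anti_affinity, nodes, running)])  # ['node-2']
```

Affinity and Anti-Affinity ask the same question about the same Pods. The only difference is which answer makes a Node acceptable.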

Hard, Soft, Combining Rules

Just like Node Affinity rules, Pod Affinity rules come in hard and soft variations, and it’s possible to have any combination of hard and soft rules. The semantics are identical.

e.g. Prefer not to run in the same zone as a web Pod:

anti_pod: app=web:soft # note the ':soft'
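One way to picture the difference between hard and soft rules: hard rules filter out Nodes, while soft rules only rank the Nodes that remain. The Python sketch below is a deliberately simplified model with unweighted rules. The pick_node function, the Node names, and the zone names are assumptions for illustration, and the real scheduler combines weighted preferences with many other scoring factors.

```python
from typing import Callable, Optional

Check = Callable[[str], bool]  # takes a Node name, says whether the rule holds there


def pick_node(candidates: list[str], hard: list[Check], soft: list[Check]) -> Optional[str]:
    """Filter Nodes by hard rules, then prefer the Node satisfying the most soft rules."""
    feasible = [n for n in candidates if all(check(n) for check in hard)]
    if not feasible:
        return None  # no Node qualifies, so the Pod cannot be scheduled yet
    return max(feasible, key=lambda n: sum(check(n) for check in soft))


# Hypothetical cluster: one Node per zone, with a web Pod already in zone-a.
zone_of = {"node-1": "zone-a", "node-2": "zone-b"}
zones_with_web = {"zone-a"}


def not_near_web(node: str) -> bool:
    """Anti-affinity check: no web Pod in this Node's zone."""
    return zone_of[node] not in zones_with_web


# Soft: prefer a zone without a web Pod, but accept one if nothing else is available.
print(pick_node(["node-1", "node-2"], hard=[], soft=[not_near_web]))  # node-2
print(pick_node(["node-1"], hard=[], soft=[not_near_web]))            # node-1

# Hard: the same rule now rules node-1 out entirely.
print(pick_node(["node-1"], hard=[not_near_web], soft=[]))            # None
```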

Here are two co-located Deployments that each spread their Pods across different Nodes:

app: web-store
- anti_pod: app=web-store # spread out store Pods
app: web-frontend
- anti_pod: app=web-frontend # spread out frontend Pods
- pod: app=web-store:soft # co-locate with store Pods
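To see how these rules interact, here is a small Python simulation of placing two replicas of each Deployment on a three-Node cluster. It is a toy greedy model under stated assumptions: the Node names, the replica counts, and the schedule helper are hypothetical, “place” means the host, and ties are broken by Node order. The real scheduler weighs many more factors.

```python
from typing import Optional

nodes = ["node-1", "node-2", "node-3"]
placed: list[tuple[str, str]] = []  # (app label, Node name) for every scheduled Pod


def apps_on(node: str) -> set[str]:
    """App labels of the Pods currently running on a Node."""
    return {app for app, n in placed if n == node}


def schedule(app: str, avoid: str, prefer: Optional[str] = None) -> Optional[str]:
    """Place one Pod: 'avoid' is a hard anti_pod rule, 'prefer' a soft pod rule."""
    feasible = [n for n in nodes if avoid not in apps_on(n)]
    if not feasible:
        return None  # the hard rule cannot be satisfied anywhere
    # Soft rule: rank Nodes that already host the preferred app first.
    chosen = max(feasible, key=lambda n: prefer in apps_on(n))
    placed.append((app, chosen))
    return chosen


store_nodes = [schedule("web-store", avoid="web-store") for _ in range(2)]
frontend_nodes = [
    schedule("web-frontend", avoid="web-frontend", prefer="web-store")
    for _ in range(2)
]

print(store_nodes)     # ['node-1', 'node-2']
print(frontend_nodes)  # ['node-1', 'node-2']
```

Each Deployment ends up spread across two hosts, and every frontend Pod lands next to a store Pod even though node-3 was free.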

What’s Next?

Between Node Affinity and Pod Affinity, we’ve now covered the primary mechanisms for user-defined scheduling. For additional reference material, see the Kubernetes and Koki Short docs on affinities.

This isn’t the end, though. There’s some exciting recent work that adds even greater expressiveness to Kubernetes scheduling. More on that soon!

By Kynan Rilee