K8s: A Closer Look at Kube-Proxy

An example showing how kube-proxy plays with iptables

Luc Juggery
Jan 13 · 5 min read
Examples of iptables rules. Photo by the author.

The Kubernetes network proxy (aka kube-proxy) is a daemon running on each node. It watches the services defined in the cluster and maintains, on its node, the rules that load-balance requests across a service’s backend pods.

A service load-balances incoming requests between the backend pods. Photo by the author.

Quick example: Let’s say we have several pods of an API microservice running in our cluster, with those replicas being exposed by a service. When a request reaches the service virtual IP, how is the request forwarded to one of the underlying pods? Well… simply by using the rules that kube-proxy created. OK, it’s not that simple under the hood, but we get the big picture here.

kube-proxy can run in three different modes:

  • iptables (default mode)
  • ipvs
  • userspace (“legacy” mode, not recommended anymore)

While the iptables mode is totally fine for many clusters and workloads, ipvs can be useful when the number of services is large (more than 1,000). Indeed, because iptables rules are evaluated sequentially, routing performance can degrade when many services exist in the cluster.

Tigera (the creator and maintainer of the Calico networking solution) details the difference between the iptables and ipvs mode in this great article. It also provides a high-level comparison between those two modes.

High-level comparison between iptables and ipvs modes. Credit: Tigera.

In this article, we will focus on the iptables mode (an upcoming article will be dedicated to ipvs mode) and thus illustrate how kube-proxy defines iptables rules.

For that purpose, we will use a two-node cluster that I’ve just created using kubeadm:

$ kubectl get nodes
NAME    STATUS   ROLES                  AGE   VERSION
k8s-1   Ready    control-plane,master   57s   v1.20.0
k8s-2   Ready    <none>                 41s   v1.20.0

In the next part, we will deploy a simple application using a deployment resource and expose it through a service of type NodePort.

Deploy a Sample Application

First, we create a deployment based on the ghost image (ghost is a free and open source blogging platform) and specify two replicas:

$ kubectl create deploy ghost --image=ghost --replicas=2

Next, we expose the pods using a service of type NodePort:

$ kubectl expose deploy/ghost \
--port 80 \
--target-port 2368 \
--type NodePort
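
For readers who prefer manifests, the same service can also be described declaratively. The sketch below is an assumption about what the command above produces, derived only from the flags it was given and from the app=ghost label that kubectl create deploy sets by default. The helper name ghost_service_manifest is purely illustrative.

```shell
# Print a Service manifest intended to be equivalent to the
# `kubectl expose` command above (a sketch, not the exact generated object).
ghost_service_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: ghost
  labels:
    app: ghost
spec:
  type: NodePort
  selector:
    app: ghost
  ports:
    - port: 80          # port exposed on the service virtual IP
      targetPort: 2368  # port the ghost container listens on
      protocol: TCP
EOF
}

# On a machine with access to the cluster, it could be applied with:
#   ghost_service_manifest | kubectl apply -f -
```

No nodePort is set here, so Kubernetes would pick one from its NodePort range, just as it does for kubectl expose.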

Then we get the information related to this newly created service:

$ kubectl describe svc ghost
Name:                     ghost
Namespace:                default
Labels:                   app=ghost
Annotations:              <none>
Selector:                 app=ghost
Type:                     NodePort
IP:                       10.98.141.188
Port:                     <unset>  80/TCP
TargetPort:               2368/TCP
NodePort:                 <unset>  30966/TCP
Endpoints:                10.44.0.3:2368,10.44.0.4:2368
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

The important things to note here:

  • The virtual IP address (VIP) allocated to the service: 10.98.141.188.
  • The NodePort 30966 has been allocated to the service. Through this port, we can access the ghost web interface from any node of the cluster (the cluster’s nodes used in this example have the IP addresses 192.168.64.35 and 192.168.64.36):
Accessing the Ghost interface from one of the cluster’s nodes.
  • The Endpoints property shows the IP addresses of the pods exposed by the service. In other words, each request that reaches the service’s virtual IP (10.98.141.188) on port 80 is forwarded to one of the underlying pods (10.44.0.3 or 10.44.0.4), picked at random, on port 2368.

Note: Endpoints can also be retrieved using the standard kubectl get command:

$ kubectl get endpoints
NAME         ENDPOINTS                       AGE
ghost        10.44.0.3:2368,10.44.0.4:2368   4m
kubernetes   192.168.64.35:6443              6m

Next, we will have a closer look into the iptables rules that kube-proxy has created to route requests towards the backend pods.

A Closer Look Into iptables

Each time a service is created/deleted or the endpoints are modified (e.g. if the number of underlying pods changes due to the scaling of the related deployment), kube-proxy is responsible for updating the iptables rules on each node of the cluster. Let’s see how this is done with the service we defined previously.

As there are quite a lot of iptables chains, we will only consider the main ones involved in routing a request that arrives on the NodePort and is forwarded to one of the underlying pods.

First, the KUBE-NODEPORTS chain handles the packets arriving on services of type NodePort.

KUBE-NODEPORTS chain
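
To look at this chain on your own cluster, one option is to filter the output of iptables-save, which prints each rule as a line starting with -A followed by the chain name. The helper below is a hypothetical convenience (rules_for_chain is not part of any tool) and only assumes that the dump arrives on standard input.

```shell
# Keep only the rules appended to one chain from an iptables-save dump
# read on stdin. $1 is the chain name, e.g. KUBE-NODEPORTS.
rules_for_chain() {
  grep -- "^-A $1 "
}

# On a cluster node (root required), it could be used as:
#   sudo iptables-save -t nat | rules_for_chain KUBE-NODEPORTS
```

The same helper works for the other chains named in this article. The hashed suffixes in the KUBE-SVC and KUBE-SEP chain names will differ on your cluster.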

Each packet arriving on port 30966 is thus first handled by the KUBE-MARK-MASQ chain, which tags the packet with the mark 0x4000.

Note: This mark plays no part in choosing the backend pod. It is matched later, in the KUBE-POSTROUTING chain, where marked packets are masqueraded (source NAT) before they leave the node.

KUBE-MARK-MASQ chain
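
The tagging step is plain bit arithmetic on the packet’s firewall mark. The sketch below assumes that the rule ORs 0x4000 into the existing mark and that a later rule tests that same bit. The function names are illustrative and are not part of kube-proxy or iptables.

```shell
# Bit used to tag packets (value taken from the text above).
MASQ_MARK=0x4000

# Print the mark a packet would carry after tagging: the existing
# mark ($1) with the 0x4000 bit switched on, other bits untouched.
apply_masq_mark() {
  printf '0x%x\n' $(( $1 | $MASQ_MARK ))
}

# Succeed (exit status 0) when the 0x4000 bit is set in mark $1.
has_masq_mark() {
  [ $(( $1 & $MASQ_MARK )) -ne 0 ]
}

# Example:
#   apply_masq_mark 0x1                   # prints 0x4001
#   has_masq_mark 0x4001 && echo tagged
```

Because only one bit is touched, other components can keep using the remaining bits of the mark for their own purposes.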

Next, the packet is handled by the KUBE-SVC-4XJR4EADNBDQKTKS chain (referenced in the KUBE-NODEPORTS chain above). If we take a closer look at that one, we can see two additional iptables chains:

  • KUBE-SEP-7I5NH52DVZSA3QHP
  • KUBE-SEP-PSCUKR75MU2ULAEX
The service iptables chain load-balances requests.

Because of the statistic mode random probability 0.5 statement, each packet getting into the KUBE-SVC-4XJR4EADNBDQKTKS chain is:

  • Handled by KUBE-SEP-7I5NH52DVZSA3QHP 50% of the time, and skipped by that rule the other 50%.
  • Handled by KUBE-SEP-PSCUKR75MU2ULAEX whenever the first rule did not match, which is the remaining 50% of the time.
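
The 0.5 value follows from having two endpoints. The sketch below generalizes the idea to N endpoints, on the assumption that each rule is only evaluated when the previous ones did not match, so rule i has to match with probability 1/(N-i+1) for every pod to receive an equal share. The helper name sep_probabilities is illustrative.

```shell
# Print, for N endpoints ($1), the match probability each rule needs so
# that traffic is split evenly. Rule i is only reached when rules
# 1..i-1 did not match; the last rule catches everything that is left.
sep_probabilities() {
  awk -v n="$1" 'BEGIN {
    for (i = 1; i <= n; i++) {
      remaining = n - i + 1   # endpoints not yet tried
      if (remaining > 1)
        printf "rule %d: probability %.5f\n", i, 1 / remaining
      else
        printf "rule %d: always\n", i
    }
  }'
}

# Example: sep_probabilities 3
```

With two endpoints this prints a probability of 0.50000 for the first rule and "always" for the second, which matches the behaviour described above. With three, the first rule gets 0.33333 and the second 0.50000, so each pod still receives one third of the traffic.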

If we inspect both chains, we can see they define the routing towards one of the underlying pods running the ghost application:

Chain routing towards pod 10.44.0.3.
Chain routing towards pod 10.44.0.4.

Using a couple of iptables chains, we are then able to understand the journey of a request from when it gets to the node port until it reaches the underlying pod. Pretty cool, right?

Conclusion

In this quick article, I hope I managed to clarify the way kube-proxy works under the hood when using the iptables mode (the default one). In an upcoming article, we will see how the routing is done when the load balancing is done with the ipvs mode.


Thanks to Zack Shapiro

Written by Luc Juggery: Docker & Kubernetes trainer (CKA), 中文学生, Learning&Sharing
