Understanding Kubernetes Networking — Part 3

Sumeet Kumar
Published in Microsoft Azure
4 min read · Oct 20, 2020

This is the third post in the ongoing series on understanding Kubernetes Networking.

In this post, we will discuss POD to Service communication.

If you have missed Part 2 on pod to pod communication, you can check it here.

3. POD to Service communication.

  • “Services” are proxies in Kubernetes that transfer requests to a group of PODs.
  • PODs and VM hosts can crash, which can lead to a change in IP addresses. Hence, we cannot send traffic to raw IP addresses; instead, we need “Services”, which help us send traffic to specific groups of PODs (example: frontend and backend) without referring to the exact POD IPs.
  • Services:

1.) Manage the state of PODs,

2.) Keep track of POD IP addresses,

3.) Provide internal and external L3/L4 connectivity,

4.) Expose a specific VIP (which does not change) behind which the PODs are kept.

  • Services are logical.
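To make this concrete, here is a minimal sketch of a Service (the name, label and ports are hypothetical, not taken from this post) that exposes a group of PODs selected by label behind one stable VIP:

```yaml
# Minimal sketch; "frontend" label, Service name and ports are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: frontend-svc
spec:
  type: ClusterIP          # stable virtual IP inside the cluster
  selector:
    app: frontend          # traffic goes to PODs carrying this label
  ports:
    - protocol: TCP
      port: 80             # port exposed on the Service VIP
      targetPort: 8080     # port the POD containers listen on
```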

Labels and Selectors

  • Labels are key-value pairs attached to PODs, ReplicaSets and Services.
  • They are used to specify identifying attributes of an object.
  • Labels do not provide any uniqueness.
  • Label Selectors are the grouping primitive in Kubernetes.
  • They are used to select a set of objects.
  • 2 types:
  1. Equality-based — filtering by key and value. A matching object must satisfy all of the specified labels.
  2. Set-based — allows filtering of keys based on a set of values.

More Information: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
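As an illustration (all labels and names below are hypothetical), a Deployment selector can combine both forms; note that Services only support equality-based selectors, while resources such as Deployments and ReplicaSets also accept set-based matchExpressions:

```yaml
# Hypothetical Deployment combining equality-based and set-based selectors.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 2
  selector:
    # Equality-based: the POD must carry exactly this label.
    matchLabels:
      app: frontend
    # Set-based: the POD's "tier" label must be one of the listed values.
    matchExpressions:
      - key: tier
        operator: In
        values: ["web", "cache"]
  template:
    metadata:
      labels:
        app: frontend
        tier: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
```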

  • Kubernetes must provide a way to load-balance traffic across the PODs exposed via a Service.
  • Kubernetes achieves this via kube-proxy.
  • Kube-proxy watches the Kubernetes control plane for Service and Endpoint changes, and the way it achieves load balancing is:

- either via “userspace” proxy mode. [We will not discuss this, as it is old, slow and not recommended]

- or via “Iptables” proxy mode.

- or via “IPVS” proxy mode.

  • In iptables proxy mode — kube-proxy installs iptables rules, which capture traffic to the Service IP and port and redirect it to one of the Service’s backends.
  • A backend is chosen at random.
  • If the request is sent to a backend POD that does not respond, the connection will fail.
  • Hence, to avoid such issues, use “readiness probes” to send traffic only to healthy backend PODs (a minimal readiness probe sketch appears at the end of this section, just before DNS Services).
  • In IPVS [IP Virtual Server] proxy mode — hash tables are used as the underlying data structure, and the load balancing works in kernel space.
  • Hence, it provides lower latency and better performance than the iptables proxy mode.
  • It provides more options for load balancing traffic to backend PODs:

rr: round-robin

lc: least connection (smallest number of open connections)

dh: destination hashing

sh: source hashing

sed: shortest expected delay

nq: never queue
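As a sketch of how an operator might pick the mode and scheduler (the values below are illustrative, and how this file is supplied to kube-proxy depends on how the cluster was installed), kube-proxy's configuration object looks roughly like this:

```yaml
# Illustrative kube-proxy configuration selecting IPVS mode with the
# round-robin (rr) scheduler.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"   # could also be lc, dh, sh, sed or nq
```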

  • We previously learnt in the POD to POD communication section that Kubernetes requires network connectivity to be implemented without the use of NAT.
  • This is not true for Services: since Services act as proxies, they perform DNAT and SNAT.
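And here is the readiness probe sketch promised earlier (the image name, path and port are hypothetical); the POD only receives Service traffic while this probe is passing:

```yaml
# Hypothetical POD with an HTTP readiness probe; kube-proxy only sends
# Service traffic to this POD while the probe succeeds.
apiVersion: v1
kind: Pod
metadata:
  name: backend-pod
  labels:
    app: backend
spec:
  containers:
    - name: app
      image: myregistry/backend:1.0   # hypothetical image
      ports:
        - containerPort: 8080
      readinessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 10
```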

DNS Services

  • Kubernetes DNS schedules a DNS Pod and Service on the cluster, so that you do not have to know and hardcode the Service IPs.
  • Kubernetes configures the “kubelet” running on each Node so that containers use the DNS Service’s IP to resolve DNS names.
  • Every Service defined in the cluster (including the DNS server itself) is assigned a DNS name.
  • There are 2 DNS servers that can be configured in Kubernetes: (i) kube-dns (ii) CoreDNS
  • Azure uses CoreDNS.

More information: https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/
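For example (the Service name, namespace and port are hypothetical), a client POD can reach a Service called backend in the prod namespace by its cluster DNS name instead of a hardcoded Service IP:

```yaml
# Minimal sketch: a one-shot client POD resolving a Service by DNS name.
# Services resolve as <service>.<namespace>.svc.cluster.local
apiVersion: v1
kind: Pod
metadata:
  name: dns-client
spec:
  restartPolicy: Never
  containers:
    - name: client
      image: busybox:1.36
      command: ["wget", "-qO-", "http://backend.prod.svc.cluster.local:8080"]
```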

Flow from POD to Service

  • The packet first leaves the POD through the eth0 interface attached to the Pod’s network namespace. [Src=Pod1 and dest=svcIP]
  • Then it travels through the veth pair to the bridge.
  • ARP on the bridge does not know about the Service IP, so the bridge sends the packet out through the default route — eth0.
  • Here, before being accepted at eth0, the packet is filtered through iptables.
  • After receiving the packet, iptables uses the rules (installed on the Node by kube-proxy in response to Service or POD events) to rewrite the destination of the packet from the Service IP to a specific POD IP. [src=Pod1 and dest=Pod4]
  • The packet now reaches its destination.
  • On the return path, when the packet comes back to the Node and hits iptables, the src becomes svcIP and the dest becomes Pod1.

[ You can think of it like a normal LB scenario in Azure, accessing an IIS page behind an LB. The client will only see the LB IP communicating, and the backend server will see the client IP communicating. ]

  • “conntrack” is a feature built on top of the Netfilter framework.
  • It is essential for performing complex networking of Kubernetes where nodes need to track connection information between thousands of pods and services.
  • If you expose your Service via ClusterIP, you should be able to access your application from anywhere within the cluster.
  • We would observe our Service having a ClusterIP. [In this example we have the ServiceType as LoadBalancer, which creates the NodePort and ClusterIP entries automatically.]
  • Kube-proxy will configure iptables on the nodes to allow traffic to this ClusterIP.
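As a sketch of that Service type (the name, label and ports are hypothetical), a LoadBalancer Service declares only the type, yet it also receives a ClusterIP and a NodePort:

```yaml
# Hypothetical LoadBalancer Service; Kubernetes also allocates a ClusterIP
# and a NodePort for it in addition to the external load balancer IP.
apiVersion: v1
kind: Service
metadata:
  name: web-lb
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```

Running kubectl get svc on it would show both the CLUSTER-IP and the EXTERNAL-IP columns populated once the cloud load balancer is provisioned.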

And that concludes this post.

See you in the next one!!!
