Exposing TFTP Server as Kubernetes Service — Part 8

Darpan Malhotra
Jun 3, 2022


In the journey to expose the TFTP server as a Kubernetes service, we have so far used a Kubernetes cluster bootstrapped by kubeadm with the default configuration, and the default is to run kube-proxy in iptables mode. In Part 3, we faced a problem while exposing TFTP as a NodePort service. We knew kube-proxy was using iptables, and in turn NAT, to forward traffic to the target pod (the TFTP server). TFTP is not a NAT-friendly protocol, so we used helper modules to fix the problem. Simply put, iptables troubled us. What if Kubernetes (kube-proxy) did not use iptables at all? Is there such an option? If yes, would we then avoid the problem with TFTP?
Well, we can run kube-proxy in IPVS mode. This is another mode of operation of kube-proxy, used for in-cluster service load-balancing. In this article, we will run kube-proxy in IPVS mode and explore whether the TFTP server can be exposed as a ClusterIP service.

IPVS
IPVS (IP Virtual Server) is also built on the Netfilter framework and implements Layer 4 (transport layer) load balancing within the Linux kernel.
As per the documentation:

IPVS running on a host acts as a load balancer before a cluster of real servers, it can direct requests for TCP/UDP based services to the real servers, and makes services of the real servers to appear as a virtual service on a single IP address.

This is precisely the requirement of a Kubernetes service: a service that load-balances TCP/UDP traffic to backend pods. That looks like a match made in heaven!
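To get a feel for what kube-proxy will soon be programming on our behalf, here is a minimal, hand-run sketch of IPVS usage with ipvsadm (the addresses below are placeholders for illustration, not from our cluster): a virtual UDP service is created with a scheduling algorithm, and a real server is attached to it in masquerading (NAT) mode.

# ipvsadm -A -u 10.0.0.10:69 -s rr
# ipvsadm -a -u 10.0.0.10:69 -r 192.168.0.5:69 -m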

Cluster Setup
I have destroyed the Kubernetes cluster used so far and created a new one. There was an option to reconfigure kube-proxy to use IPVS, but I opted to start clean.

Install IPVS on every node (master and workers) and verify:

# yum install ipvsadm -y
# ipvsadm -v
ipvsadm v1.27 2008/5/15 (compiled with popt and IPVS v1.2.1)

We will again use kubeadm for bootstrapping the cluster. As the default init configuration uses iptables mode, we will have to modify it. Let us first fetch the default init configuration and save it to a file for editing:

# kubeadm config print init-defaults > kubeadmConfig.yaml

Edit kubeadmConfig.yaml with the following changes:

  • Modify InitConfiguration to set the correct advertiseAddress (10.10.100.207) and node name (learn-k8s-1).
  • Modify ClusterConfiguration to set podSubnet (192.168.0.0/16).
  • Add a KubeProxyConfiguration section with mode: ipvs (see the sketch right after this list).
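For reference, the relevant portions of the edited kubeadmConfig.yaml look roughly like this; the apiVersion strings depend on the kubeadm version in use, and unrelated default fields are trimmed:

apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.10.100.207
nodeRegistration:
  name: learn-k8s-1
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 192.168.0.0/16
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs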

Initialize the control plane node (learn-k8s-1) using this config:

# kubeadm init --config kubeadmConfig.yaml

With this, kube-proxy should be running in IPVS mode. Before verifying that, we will install Calico CNI. The steps are already described in Part 1.

Let us check the mode in which kube-proxy is running:

  1. Check the logs of the kube-proxy pod.
# kubectl logs kube-proxy-jks2s -n kube-system
I0528 19:36:58.995622 1 node.go:163] Successfully retrieved node IP: 10.10.100.207
I0528 19:36:58.995670 1 server_others.go:138] "Detected node IP" address="10.10.100.207"
I0528 19:36:59.034487 1 server_others.go:269] "Using ipvs Proxier"

2. Retrieve the mode from the HTTP endpoint of kube-proxy.

# curl http://localhost:10249/proxyMode
ipvs

It is clear that kube-proxy is now running in IPVS mode. Let us begin the action by deploying the TFTP server and client pods (as done in Part 1).

We will expose the TFTP server pod as a ClusterIP service (as done in Part 2).
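As a quick refresher, tftp-server-service.yaml is roughly of the following shape; the selector label here is an assumption for illustration and must match the labels actually carried by the TFTP server pod:

apiVersion: v1
kind: Service
metadata:
  name: tftp-server
spec:
  type: ClusterIP
  selector:
    app: tftp-server   # assumed label; must match the server pod's labels
  ports:
    - name: tftp
      protocol: UDP
      port: 69
      targetPort: 69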

# kubectl apply -f tftp-server-service.yaml 
service/tftp-server created

Let us list all the Service and Endpoints objects:
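A command along the following lines shows them both:

# kubectl get service,endpoints -o wide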

The client pod running on node learn-k8s-3 should connect to the ClusterIP (10.103.156.38), and the packets need to reach the TFTP server pod running on node learn-k8s-2. The question is: how does this pod-to-service communication happen? In Part 2, we saw that kube-proxy, running in iptables mode, was programming iptables to DNAT the service IP to the actual pod IP. But now kube-proxy is running in IPVS mode, so it must be programming the virtual server table. Let us see what is in the virtual server table on the learn-k8s-3 node.
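The table can be dumped with ipvsadm; the -n flag keeps addresses and ports numeric:

# ipvsadm -L -n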

The table has an entry for every ClusterIP service, including the one we created. We learn the following from this table:

  • The virtual service is identified by the tuple UDP:10.103.156.38:69.
  • The real server added to this service is 192.168.29.77:69.
  • The algorithm used to load-balance traffic from the virtual service to the real servers is rr (round-robin).
  • The packet-forwarding mode used is Masq (NAT).

That is how IPVS directs traffic from the ClusterIP (10.103.156.38) to the pod IP (192.168.29.77). As the server pod is on a different node (learn-k8s-2) than the client pod, Calico CNI takes care of routing the traffic to the appropriate node using an IP-in-IP tunnel. This part of the communication is the same as in Part 2.
To be sure, we will check the route table on the client node (learn-k8s-3) and also capture packets.
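The route table can be inspected with either of the usual commands; the netmask notation quoted below comes from the classic route -n output:

# route -n
# ip route show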

This means that for any traffic destined to 192.168.29.64/26 (mask 255.255.255.192), the gateway (i.e. the next hop) is 10.10.100.208 and the tunl0 interface is used. Here, 10.10.100.208 is the IP address of node learn-k8s-2, where the server pod is running.

Let us exec into the client container, initiate a TFTP read request to the ClusterIP of the TFTP service, and simultaneously capture packets.

# kubectl exec -it tftp-client-76dfcb55dc-st9vl -- bash
root@tftp-client-76dfcb55dc-st9vl:/# tftp 10.103.156.38
tftp> get dummy.txt
Received 2016 bytes in 0.0 seconds

A. Packets on client node (learn-k8s-3) at veth interface (calidb6017df1a7)
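This capture was taken with a command along these lines (the interface is the Calico veth of the client pod and will be named differently on every setup):

# tshark -i calidb6017df1a7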

Observations:

  • The client (192.168.154.195) is connecting to the ClusterIP of the service (10.103.156.38).
  • The response is received from the TFTP server pod (192.168.29.77).
    This means the outgoing traffic was load-balanced by IPVS; there is no DNAT by iptables happening here. We can confirm this by inspecting the NAT table rules.
# iptables -t nat -S | grep DNAT
#

Nothing found!

B. Packets on client node (learn-k8s-3) at tunnel interface (tunl0)
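A plain capture on the tunnel interface is enough here:

# tshark -i tunl0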

Observations:

  • The traffic is seen by tunnel interface.

This means, IP-in-IP encapsulation/decapsulation is happening here.

C. Packets on client node (learn-k8s-3) at ethernet interface of node (ens160)

# tshark -i ens160 proto 4 -V -w /tmp/clientNodeIPVS.pcap

Due to the verbosity, no frames are shown here. Later, we get a summary of the packets from the clientNodeIPVS.pcap file.

Observations:

  • The request packet from client pod gets IP-in-IP encapsulated.
  • The encapsulated packet has destination IP of server node (10.10.100.208).

Next, let us analyze packets on the server side.

A. Packets on server node (learn-k8s-2) at ethernet interface of node (ens160)

# tshark -i ens160 proto 4 -V -w /tmp/serverNodeIPVS.pcap

Due to the verbosity, no frames are shown here. Later, we get a summary of the packets from the serverNodeIPVS.pcap file.

Observations:

  • The ethernet interface (ens160) on the server node has received the packet from the client node (10.10.100.209).
  • The packet has IP-in-IP encapsulation.

B. Packets on server node (learn-k8s-2) at tunnel interface (tunl0)

Observations:

  • The traffic is seen by tunnel interface.

This means, IP-in-IP encapsulation/decapsulation is happening here.

C. Packets on server node (learn-k8s-2) at veth interface (cali515518d570e)

Observations:

  • The decapsulated packet (i.e. the original packet generated by the client) is received by the TFTP server pod.
  • The server pod generates the response, which takes the reverse path and goes through a similar transformation.

In this article, we used the IPVS mode of kube-proxy to successfully expose the TFTP server as a ClusterIP service. Not only was an entry for the ClusterIP service present in the virtual server table, but there were also no DNAT rules in iptables. Using IPVS mode looks like a good thing, as we know TFTP is not a NAT-friendly protocol.
In the next part, we will expose the TFTP server as a NodePort service in this cluster and see if there is any benefit to switching over from iptables mode to IPVS mode.
