Exposing TFTP Server as Kubernetes Service — Part 10

Darpan Malhotra
5 min read · Jun 3, 2022


In Part 9, we successfully exposed the TFTP server as a NodePort service with externalTrafficPolicy=Local. We saw that no NAT operation was happening and the TFTP server was seeing the original client’s IP address. In this article, we will explore what happens behind the scenes when externalTrafficPolicy=Cluster.

Modify the service manifest to set externalTrafficPolicy to Cluster.
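
The manifest is the same one we have been using in this series; only the externalTrafficPolicy field changes. A minimal sketch of dpe-app-service.yaml is shown below (the selector label and the nodePort value are assumptions carried over from the earlier parts; nodePort 69 matches the member we will see in the KUBE-NODE-PORT-UDP set later in this article and requires the kube-apiserver’s service node port range to include it):

# cat dpe-app-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: tftp-server
spec:
  type: NodePort
  externalTrafficPolicy: Cluster   # changed from Local
  selector:
    app: tftp-server               # assumed pod label
  ports:
    - protocol: UDP
      port: 69
      targetPort: 69
      nodePort: 69                 # assumed; matches the KUBE-NODE-PORT-UDP member seen later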

Apply the manifest:

# kubectl apply -f dpe-app-service.yaml 
service/tftp-server configured

Let us see the impact of configuring this service.
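
kube-proxy in IPVS mode programs the kernel’s IPVS virtual server table, which we can inspect with ipvsadm. A trimmed, illustrative listing is shown below, keeping only the NodePort entry for the TFTP service (the pod IP 192.168.29.77 is taken from the KUBE-LOOP-BACK set we will see later in this article; the counters are illustrative):

# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.10.100.208:69 rr
  -> 192.168.29.77:69             Masq    1      0          0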

Interestingly, there is no change in the virtual server table between externalTrafficPolicy=Local and externalTrafficPolicy=Cluster. So what is different, and where? We will find out by the end of this article.

Now that the service is configured, let us test file transfer.

# tftp 10.10.100.208
tftp> get dummy.txt
Transfer timed out.

Duh… this dreaded failure is back! File transfer has failed with the NodePort service (externalTrafficPolicy=Cluster) and kube-proxy configured in IPVS mode. We have to get back to packet-capturing mode to analyze this failure.

A. Packets on server node (learn-k8s-2) at veth interface (calic0bf1043683)
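
A capture along the lines of the following is what the pod-side veth shows (the output is abridged and the ephemeral ports are illustrative; only the IP addresses matter for the analysis):

# tcpdump -ni calic0bf1043683 udp or icmp
IP 10.10.100.208.52000 > 192.168.29.77.69: TFTP, length 21, RRQ "dummy.txt" netascii
IP 192.168.29.77.37000 > 10.10.100.208.52000: UDP, length 516
IP 10.10.100.208 > 192.168.29.77: ICMP 10.10.100.208 udp port 52000 unreachable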

Observations:

  • These traces are very familiar. We have seen them in Part 3. Remember the ICMP (Port unreachable) error?
  • The TFTP server pod sees the request coming from 10.10.100.208. So, the original client’s IP address (10.10.100.197) is not visible anymore.
  • The IP address of the server node is 10.10.100.208. This means the connection is SNATed, with the source IP set to that of the node. SNAT is back!

B. Packets on server node (learn-k8s-2) at ethernet interface of node (ens160)
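
On the node’s physical interface, a filter on the TFTP port is enough to see the problem (the client’s source port below is illustrative):

# tcpdump -ni ens160 udp port 69
IP 10.10.100.197.57140 > 10.10.100.208.69: TFTP, length 21, RRQ "dummy.txt" netascii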

Observations:

  • The worker node running the TFTP pod (10.10.100.208) receives the RRQ from the client (10.10.100.197), but it never sends a response [1 packet captured].

Clearly, with IPVS mode, the traffic pattern and failure symptoms are the same as with iptables mode:
a. Incoming traffic getting SNATed.
b. ICMP (Port unreachable) error received by TFTP server pod.

If the failure symptoms are the same, then the fix should also be the same! If you have read the previous articles of this series, you know the fix for this failure: add the conntrack and NAT helper modules to the kernel. But before applying the fix, let us analyze how the incoming packet even got SNATed. The hope was that in IPVS mode, iptables would have no role at all. We will again inspect the journey of an incoming packet as it is processed by iptables rules.

A. Every incoming packet goes through the PREROUTING chain. Kubernetes makes use of the PREROUTING chain in the nat table to implement its services.
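
On this node, the relevant nat rule in PREROUTING is the jump installed by kube-proxy (Calico hooks its own cali-PREROUTING chain here as well, which is omitted below for brevity):

# iptables -t nat -S PREROUTING
-P PREROUTING ACCEPT
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES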

Every incoming packet will match the rule to jump to KUBE-SERVICES chain.

B. The KUBE-SERVICES chain is the top-level collection of all Kubernetes services.
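
The exact contents of KUBE-SERVICES depend on kube-proxy’s configuration; an illustrative listing for IPVS mode, assuming a pod CIDR of 192.168.0.0/16 and with rule comments trimmed, looks like this:

# iptables -t nat -S KUBE-SERVICES
-N KUBE-SERVICES
-A KUBE-SERVICES ! -s 192.168.0.0/16 -m set --match-set KUBE-CLUSTER-IP dst,dst -j KUBE-MARK-MASQ
-A KUBE-SERVICES -m addrtype --dst-type LOCAL -j KUBE-NODE-PORT
-A KUBE-SERVICES -m set --match-set KUBE-CLUSTER-IP dst,dst -j ACCEPT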

The incoming packet matches the second rule, so the traffic jumps to the KUBE-NODE-PORT chain.

C. Listing down the rules in KUBE-NODE-PORT chain.
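
An illustrative listing of this chain with our UDP NodePort service in place (rule comments trimmed):

# iptables -t nat -S KUBE-NODE-PORT
-N KUBE-NODE-PORT
-A KUBE-NODE-PORT -p udp -m set --match-set KUBE-NODE-PORT-UDP dst -j KUBE-MARK-MASQ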

To know if the rule applies to the packet, we need to list down the members of KUBE-NODE-PORT-UDP IP set.

# ipset -L KUBE-NODE-PORT-UDP
Name: KUBE-NODE-PORT-UDP
Type: bitmap:port
Revision: 3
Header: range 0-65535
Size in memory: 8300
References: 1
Number of entries: 1
Members:
69

There is only one rule and the packet matches it. So, it jumps to KUBE-MARK-MASQ, where it is MARKed.
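
The KUBE-MARK-MASQ chain itself is tiny; it only tags the packet with kube-proxy’s masquerade mark (0x4000 by default):

# iptables -t nat -S KUBE-MARK-MASQ
-N KUBE-MARK-MASQ
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000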

D. We saw SNAT getting applied to the incoming request. Let us confirm this from the POSTROUTING chain.
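
In POSTROUTING, the rule of interest is again the one installed by kube-proxy (Calico’s cali-POSTROUTING jump is omitted below for brevity):

# iptables -t nat -S POSTROUTING
-P POSTROUTING ACCEPT
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING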

Packet jumps to KUBE-POSTROUTING.

E. Listing down the rules in KUBE-POSTROUTING.
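
An illustrative listing of KUBE-POSTROUTING as kube-proxy programs it in IPVS mode (comments partially trimmed):

# iptables -t nat -S KUBE-POSTROUTING
-N KUBE-POSTROUTING
-A KUBE-POSTROUTING -m set --match-set KUBE-LOOP-BACK dst,dst,src -j MASQUERADE
-A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
-A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE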

Let us check the members of the KUBE-LOOP-BACK IP set.

# ipset -L KUBE-LOOP-BACK
Name: KUBE-LOOP-BACK
Type: hash:ip,port,ip
Revision: 5
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 688
References: 1
Number of entries: 7
Members:
192.168.29.75,tcp:53,192.168.29.75
192.168.29.75,udp:53,192.168.29.75
192.168.29.76,udp:53,192.168.29.76
192.168.29.76,tcp:9153,192.168.29.76
192.168.29.75,tcp:9153,192.168.29.75
192.168.29.77,udp:69,192.168.29.77
192.168.29.76,tcp:53,192.168.29.76

Clearly, the first rule does not match, as the destination does not belong to KUBE-LOOP-BACK. As the packet is MARKed, the second rule (RETURN for unmarked packets) does not match either. The third rule only clears the MARK, and then the fourth rule matches: MASQUERADE (SNAT) gets applied to the packet. That explains the role of iptables even when kube-proxy is running in IPVS mode: incoming traffic is SNATed. These iptables rules are very different from the externalTrafficPolicy=Local case. Here, packets get MARKed to be SNATed later.

Let us get back to our discussion of fixing this failure by adding appropriate helper modules to the kernel.

# modprobe -v nf_conntrack_tftp
insmod /lib/modules/3.10.0-1062.1.2.el7.x86_64/kernel/net/netfilter/nf_conntrack_tftp.ko.xz
# modprobe -v nf_nat_tftp
insmod /lib/modules/3.10.0-1062.1.2.el7.x86_64/kernel/net/netfilter/nf_nat_tftp.ko.xz

Now, connect external TFTP client again to learn-k8s-2 (10.10.100.208) on port 69.

# tftp 10.10.100.208
tftp> get dummy.txt
getting from 10.10.100.208:dummy.txt to dummy.txt [netascii]
Received 2016 bytes in 0.1 seconds [141687 bit/s]

Wow!!! File transfer is successful. Additionally, we can also see the packets on the server node (learn-k8s-2) at the veth interface (cali515518d570e):
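
The same kind of capture as before now shows the full exchange instead of ICMP errors: the RRQ still arrives SNATed to the node’s IP (that does not change with externalTrafficPolicy=Cluster), but the data/ack packets now flow because the TFTP helper modules track and NAT the data connection. An abridged, illustrative trace (ephemeral ports are illustrative):

# tcpdump -ni cali515518d570e udp or icmp
IP 10.10.100.208.52000 > 192.168.29.77.69: TFTP, length 21, RRQ "dummy.txt" netascii
IP 192.168.29.77.37000 > 10.10.100.208.52000: UDP, length 516
IP 10.10.100.208.52000 > 192.168.29.77.37000: UDP, length 4
... (data/ack exchange continues until the 2016-byte file is transferred)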

In this article, we learnt that kube-proxy running in IPVS mode still uses iptables to implement a NodePort service. When externalTrafficPolicy=Cluster, iptables applies SNAT to the incoming request. In order to correctly apply NAT to TFTP traffic, the helper modules must be added to the kernel. Other than the reduced size of the iptables chains, we see no difference between the two modes of kube-proxy when exposing the TFTP server as a NodePort service.

The conclusion is that with kube-proxy in use, iptables is inevitable. I wonder: can Kubernetes run without kube-proxy? After all, kube-proxy is considered an add-on. What if we do not install this add-on? How would Kubernetes services be implemented without kube-proxy? Enter Cilium! We will discuss this option in the next article.
