Exposing TFTP Server as Kubernetes Service — Part 5

Darpan Malhotra
May 26, 2022


In previous parts of this series, we ran through the steps for exposing a TFTP server pod as a NodePort service, where we realized:

  • kube-proxy in iptables mode uses NAT to expose pods as NodePort service.
  • TFTP is not a NAT-friendly protocol.
  • Conntrack and NAT helper modules for the TFTP protocol were developed long back. They need to be added to the kernel to expose the TFTP server pod as a NodePort service.

In this article, we will learn more about conntrack and the impact of adding the helper modules to the Linux kernel.

In order to read connection tracking entries and interact with the connection tracking system of netfilter, we will use the conntrack CLI.
Every conntrack entry represents a flow (or connection) and comprises two tuples (src ip, dst ip, src port, dst port):

  • IP_CT_DIR_ORIGINAL — tuple for direction of original request
  • IP_CT_DIR_REPLY — tuple for direction of reply
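As a quick reference, these are the basic conntrack commands used throughout this article (a minimal sketch, assuming the conntrack-tools package is installed on the nodes):

# conntrack -L    # list entries in the conntrack table
# conntrack -E    # print conntrack events (NEW, UPDATE, DESTROY) as they happen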

To begin with, we will analyze connection tracking entries when tftp helper modules are not loaded.

We will capture packets when an external client makes a TFTP request to the NodePort service and, in parallel, monitor conntrack entries.
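As a rough sketch (the interface, filter and file name here are assumptions, not the exact commands used), the capture and the event monitor can be run side by side on the node:

# tcpdump -ni any udp -w /tmp/tftp-nodeport.pcap    # capture all UDP traffic seen on the node
# conntrack -E -p udp                               # in another terminal: watch UDP conntrack events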
This time, packet captures provide the following information:

  • Request packet from external client to server node (before NAT):
    10.10.100.197:27661 → 10.10.100.208:69
  • Request packet to pod (after NAT):
    10.10.100.208:60539 → 192.168.29.67:69
  • Response (data) packet from pod (which does not reach the external client):
    192.168.29.67:47624 → 10.10.100.208:60539

Checking for events that happened in conntrack on the worker node (learn-k8s-2) where the TFTP server pod runs:

And the list of connection tracking entries:

Normally, IP_CT_DIR_REPLY is populated by conntrack with the values expected in the reply packet, i.e. IP_CT_DIR_REPLY should typically be the inverse of IP_CT_DIR_ORIGINAL. But in the case of NAT, IP_CT_DIR_REPLY is populated as per the reply expected after applying NAT.

Let us analyze the 1st event (it is also the 1st connection):

[NEW] udp 17 30 src=10.10.100.197 dst=10.10.100.208 sport=27661 dport=69 [UNREPLIED] src=192.168.29.67 dst=10.10.100.208 sport=69 dport=60539
  • It is the first packet of this connection as seen by conntrack. So, it is in NEW state with the UNREPLIED flag set.
  • NAT rules have correctly updated the src ip and dst ip in the IP_CT_DIR_REPLY tuple.
  • However, the src port in IP_CT_DIR_REPLY is set to 69, and TFTP does not work this way. When replying, the server will use a random port (47624, in this case). This means that when the server replies, the packet will not match this connection's reply tuple.
  • So, conntrack will never see a reply matching its expectation, and hence this connection remains NEW and UNREPLIED.
  • It will be destroyed after the TTL of this entry in conntrack expires, which is 30s. If the conntrack event monitoring is continued, the deletion event can be seen (a filtered monitor command is sketched below).
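A minimal sketch of watching only for that deletion (the -e option filters by event type; the protocol filter just reduces noise):

# conntrack -E -e DESTROY -p udp    # print only DESTROY events for UDP flows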

Let us analyze the 2nd event (it is also the 2nd connection):

[NEW] udp 17 30 src=192.168.29.67 dst=10.10.100.208 sport=47624 dport=60539 [UNREPLIED] src=10.10.100.208 dst=192.168.29.67 sport=60539 dport=47624
  • This is considered by conntrack as the first packet of a different connection altogether.
  • Conntrack is unable to establish that this flow is related to the previous flow. So, it is in NEW state with the UNREPLIED flag set.
  • IP_CT_DIR_REPLY is the inverse of IP_CT_DIR_ORIGINAL.
  • Conntrack does not see a reply matching its expectation (as this packet never reached the external client, which would have sent a reply), and hence this connection also remains NEW and UNREPLIED.
  • This connection will also be destroyed after the TTL of this entry expires.

Overall, we learn that the TFTP request from the external client creates a flow as per the NAT rules. But, as the TFTP server pod uses a different port to reply, the outgoing packet ends up creating a second flow in conntrack. NAT expected the reply to be sourced from port 69, but it came from port 47624, so reverse-NAT could not happen and the packet was dropped. That explains what happens when the helper modules for TFTP are not added to the kernel.

Now, let us add the conntrack and NAT helper modules for TFTP and analyze the conntrack entries. We again capture packets as done before when an external client makes a TFTP request to the NodePort service and, in parallel, monitor conntrack entries.
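For completeness, here is a sketch of loading the helpers on the worker node. The module names are the standard in-tree ones, but note that on newer kernels (4.7+) automatic helper assignment is disabled by default, so the net.netfilter.nf_conntrack_helper sysctl or explicit CT rules may also be needed, depending on the kernel.

# modprobe nf_conntrack_tftp    # conntrack helper for TFTP
# modprobe nf_nat_tftp          # NAT helper for TFTP
# lsmod | grep tftp             # verify that both modules are loaded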

This time, packet captures provide the following information:

  • Request packet from external client to server node (before NAT):
    10.10.100.197:48491 → 10.10.100.208:69
  • Request packet to pod (after NAT):
    10.10.100.208:38075 → 192.168.29.67:69
  • Response (data) packet from pod to server node:
    192.168.29.67:52697 → 10.10.100.208:38075
  • Reverse-NATed response (data) packet from server node to client:
    10.10.100.208:52697 → 10.10.100.197:48491

Checking for conntrack events:

These events look different from the case when helper modules were not added. There are again 2 connections.

Analysis of 1st event:

[NEW] udp 17 30 src=10.10.100.197 dst=10.10.100.208 sport=48491 dport=69 [UNREPLIED] src=192.168.29.67 dst=10.10.100.208 sport=69 dport=38075 helper=tftp

This flow looks similar to the 1st flow in the case when helper modules were absent. One notable change is that the flow now says helper=tftp. As IP_CT_DIR_ORIGINAL has dport=69, the tftp helper module came into action.

Analysis of 2nd event:

[NEW] udp 17 30 src=192.168.29.67 dst=10.10.100.208 sport=52697 dport=38075 [UNREPLIED] src=10.10.100.197 dst=10.10.100.208 sport=48491 dport=52697
  • The first response packet from the server creates a NEW entry as it does not match any existing conntrack entry.
  • IP_CT_DIR_REPLY is not the inverse of IP_CT_DIR_ORIGINAL. This means the TFTP helper modules are working! Reverse-NAT can now be successfully performed by the Linux kernel of the worker node.

Analysis of 3rd event:

[UPDATE] udp 17 30 src=192.168.29.67 dst=10.10.100.208 sport=52697 dport=38075 src=10.10.100.197 dst=10.10.100.208 sport=48491 dport=52697
  • As the TFTP server pod’s response now reaches the external client, the client sends an ACK for the received block of data.
  • Conntrack sees the reply from the client and the entry gets UPDATED (i.e. the UNREPLIED flag goes away).

Analysis of 4th event:

[UPDATE] udp 17 180 src=192.168.29.67 dst=10.10.100.208 sport=52697 dport=38075 src=10.10.100.197 dst=10.10.100.208 sport=48491 dport=52697 [ASSURED]

Conntrack establishes that the connection is part of a UDP stream, as many packets are flowing in both directions, so the entry gets further UPDATED. The updates performed are:

  • TTL becomes 180s (it was 30s till now); these defaults come from the kernel's UDP conntrack timeouts (see the sketch after this list).
  • The entry is ASSURED (which means this entry will not be evicted even if there is a heavy load of connections on the system).
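The two timeouts are exposed as sysctls; a quick way to check them on the node (the exact defaults vary with kernel version and distribution):

# sysctl net.netfilter.nf_conntrack_udp_timeout          # TTL for unreplied UDP entries (30s here)
# sysctl net.netfilter.nf_conntrack_udp_timeout_stream   # TTL for assured UDP streams (180s here)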

So far, we used the conntrack tool to see connection entries from the “conntrack” table. And we can see that conntrack and NAT helpers make it possible for netfilter to track and NAT complex (read legacy) protocols like TFTP. Behind the scenes, these helpers create expectations in the conntrack system. Those expectations look very similar to conntrack entries, but are stored in a different table called the “expectation” table.

For the above example, let us see the events in the expectation table:
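The expectation table has its own event stream, which can be watched with the same CLI (a minimal sketch, following the same table argument used with -L below):

# conntrack -E expect    # print events from the expectation table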

We can see that there are events in the expectation table which create entries that the kernel should “expect”. However, these entries are short-lived, so if you list the expectation table afterwards, nothing will be seen:

# conntrack -L expect
conntrack v1.4.4 (conntrack-tools): 0 expectations have been shown.

Analysis of 1st event:

[NEW] 300 proto=17 src=192.168.29.67 dst=10.10.100.208 sport=0 dport=38075 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=10.10.100.197 master-dst=10.10.100.208 sport=48491 dport=69 class=0 helper=tftp
  • The first tuple (src, dst, sport, dport) tells us that it is expected to see a UDP connection from 192.168.29.67 and any source port going to 10.10.100.208 and port 38075 (the sport=0/dport=65535 pair after the masks are port masks: 0 matches any source port, 65535 requires an exact destination port).
  • Second tuple (master-src, master-dst, sport, dport) tells us the actual connection which has created this expectation.
  • TTL of this entry is 300s.
  • Finally, helper=tftp tells us that the helper modules are working. These events in the expectation table are a result of the tftp helper modules doing their job.

Analysis of 2nd event:

[DESTROY] 300 proto=17 src=192.168.29.67 dst=10.10.100.208 sport=0 dport=38075 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=10.10.100.197 master-dst=10.10.100.208 sport=48491 dport=69 class=0 helper=tftp
  • The DESTROY event arrives while the TTL is still 300s. That means entries in the expectation table live for a very short time: an expectation is consumed (and removed) as soon as the packet it describes arrives.

Finally, let us list the connections in the conntrack table:

These entries need no explanation after discussing all the events in the conntrack and expectation tables above. They will expire after their TTLs (30s and 180s respectively) run out and be removed from the conntrack table.

In this article, we saw the impact of the TFTP helper modules on connection tracking entries. They create expectations, and that is how two seemingly different flows (incoming and outgoing) become related. So far, we have seen the functional impact of conntrack. In the next article, we will look at its performance-tuning impact.
