Photo credit: https://pixabay.com/

How TCP segment size can affect application traffic flow

Shashank Suresh Kumar
Walmart Global Tech Blog
8 min read · Jul 24, 2019


Whilst migrating our firewall infrastructure, an HTTPS connection to a payment vendor failed and we had to roll back the activity. I’m not going to lie, it was disappointing to have spent all that time planning and still not have a working solution at the end of the day. Such is technology: it doesn’t always work as expected. But when things go wrong, you are presented with excellent learning opportunities!

Let’s start with some TCP concepts before looking at the packet captures.

MTU

Maximum transmission unit is the maximum size of a packet or frame that can flow across the network without being fragmented. For standard Ethernet networks (without jumbo frames), the MTU is 1500 bytes.

MSS

Maximum segment size is the maximum TCP payload size an endpoint is willing to accept within a single segment; it does not count the TCP or IP headers. On standard Ethernet with IPv4, the maximum MSS value is 1460 bytes. The MSS, IP header and TCP header together make up the MTU value. That is, 1500 byte MTU = 1460 byte MSS + 20 byte IP header + 20 byte TCP header. Said another way, MSS = MTU - 40.
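The arithmetic above can be written as a small helper. This is a minimal sketch that assumes IPv4 and TCP headers without options (20 bytes each); the name `mss_for_mtu` is made up for illustration.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def mss_for_mtu(mtu: int) -> int:
    """Return the largest TCP payload that fits in one unfragmented IPv4 packet."""
    mss = mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
    if mss <= 0:
        raise ValueError(f"MTU {mtu} is too small to carry a TCP segment")
    return mss


print(mss_for_mtu(1500))  # 1460
```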

Do note that MSS is only announced during the TCP handshake, in segments with the SYN flag set; it is not a negotiated parameter. This means the client and server can each announce their own, different MSS values [rfc879]. The actual MSS is selected based on the endpoint’s buffer and outgoing interface MTU. This can be represented visually by considering a communication between client A and server B [cisco-ipfrag].

TCP sequence number

TCP uses the sequence number field to keep track of the amount of data sent in a communication stream. TCP uses a random 32-bit number to identify the beginning of a conversation. This is known as the initial sequence number (ISN). Rather than starting all TCP conversations with 1, a random ISN helps to identify and keep traffic separate for each flow [www.tcpipguide.com].

Acknowledgement number

Used by the receiving host to acknowledge successful receipt of a TCP segment. The ACK sent back to the sending host carries the sequence number of the next byte the receiving host expects: the received sequence number plus the payload length, or plus 1 for a SYN or FIN, which carries no data but still consumes one sequence number [www.firewall.cx].

Relative sequence number

Wireshark uses numbers relative to each TCP stream to keep track of each session. This essentially means that the ‘Sequence’ and ‘Acknowledgement’ numbers will always begin with 0 for each new session. Using a small number makes it easier to read the packet captures, as opposed to looking at a large number (since a 32-bit ISN can be anything from 0 to about 4.29 billion). To verify that Wireshark is using this option, go to Wireshark > Preferences > Protocols > TCP and check ‘Relative sequence numbers’.
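The idea behind relative numbering can be illustrated in a few lines. This is only a sketch of the concept, not Wireshark's actual implementation; the function name and the ISN value are hypothetical.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def relative_seq(raw_seq: int, isn: int) -> int:
    """Offset of raw_seq from the initial sequence number, modulo 2**32."""
    return (raw_seq - isn) % SEQ_SPACE


isn = 4_294_967_000  # hypothetical ISN close to the top of the 32-bit range

print(relative_seq(isn, isn))                      # 0   (the SYN itself)
print(relative_seq(isn + 1, isn))                  # 1   (first byte after the SYN)
print(relative_seq((isn + 500) % SEQ_SPACE, isn))  # 500 (still correct after wrap)
```

The modulo keeps the result correct even when the raw sequence number wraps past the top of the 32-bit range.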

Next sequence number

This is the length of the TCP payload + the current sequence number. It indicates the sequence number of the next segment that will be sent by the sender.
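As a rough sketch, the next sequence number can be computed as follows. The helper name and the payload size are illustrative assumptions; the one rule added beyond the text above is that a SYN or FIN flag consumes one sequence number even though it carries no data.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers wrap at 32 bits


def next_seq(seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """Sequence number the sender will use for its next segment.

    SYN and FIN each advance the sequence by one, like a single byte of data.
    """
    return (seq + payload_len + int(syn) + int(fin)) % SEQ_SPACE


# Relative numbers, as Wireshark would display them.
print(next_seq(0, 0, syn=True))  # 1:   a bare SYN advances the sequence by one
print(next_seq(1, 141))          # 142: a 141-byte payload starting at SEQ=1
```

The value returned here is also what the other side is expected to send back as its acknowledgement number once it has received the segment.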

Selective ACK (SACK)

This TCP option is used to identify a block of data that was received by a host, typically one that arrived out of order. The sender does not need to re-transmit the data between the left edge and right edge of a SACK block. This option can be used only if supported by both parties, which is negotiated during the TCP handshake [packetlife-sack].
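To make the left/right edge idea concrete, here is a minimal sketch of how a sender could work out which byte ranges are still missing from a cumulative ACK plus a list of SACK blocks. The function name and example numbers are hypothetical, the right edge is treated as exclusive (the sequence number just past the block), and sequence wraparound is ignored for simplicity.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # (left_edge, right_edge); right edge is exclusive


def missing_ranges(cumulative_ack: int, sack_blocks: List[Block]) -> List[Block]:
    """Byte ranges below the highest SACKed byte that the receiver still lacks."""
    holes = []
    cursor = cumulative_ack  # next byte the receiver expects in order
    for left, right in sorted(sack_blocks):
        if left > cursor:
            holes.append((cursor, left))  # gap before this received block
        cursor = max(cursor, right)
    return holes


print(missing_ranges(1, [(1001, 1501), (2001, 2501)]))
# [(1, 1001), (1501, 2001)]
```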

Duplicate ACK

As part of the TCP fast re-transmit mechanism, duplicate ACKs are used to inform the sender of segments that were either received out of order or lost. Once the sender has received three duplicate ACKs in a row, it re-transmits the missing segment immediately, without waiting for the re-transmission timer to expire [rfc2001-sec3].
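The sender-side counting can be sketched as below. This is a deliberately simplified model: it treats any repeated ACK number as a duplicate, whereas a real TCP stack also checks that the segment carries no data and does not change the window. The function name is made up for illustration.

```python
DUP_ACK_THRESHOLD = 3  # duplicate ACKs needed to trigger fast re-transmit


def fast_retransmit_points(acks):
    """Yield the ACK number each time the duplicate-ACK threshold is reached."""
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == DUP_ACK_THRESHOLD:
                yield ack  # sender would re-transmit the segment starting here
        else:
            last_ack = ack
            duplicates = 0


print(list(fast_retransmit_points([1, 1461, 1461, 1461, 1461, 2921])))
# [1461]
```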

Now that we have the basics covered, let’s take a look at our packet captures to see what was taking place on the wire.

Broken connection

Analyzing the relevant lines from a packet capture taken on the Fortigate firewall:

Capture#1-Fortigate
  • In line# 2139, source sends a SYN advertising the MSS as 1460 bytes. The relative SEQ number is 0.
  • Destination acknowledges this by incrementing the previous SEQ number by 1 and sends an ACK=1. Since this is the 1st segment sent by the destination machine, the SEQ number=0. The MSS announced by the destination is also 1460 bytes. This is represented by line# 2140.
  • The source machine acknowledges the destination’s SYN by incrementing its SEQ number by 1 and sends ACK=1. Because the source’s own SYN consumed one sequence number, its SEQ is now 1. The TCP handshake is now complete (line# 2141).
  • In line# 2142, the SSL handshake process is initiated. The sequence number of the next segment that the client is scheduled to send is 142. As mentioned earlier, this is Len + sequence number.
  • Line# 2143 is where it starts to go wrong. Ideally, the server should have responded with a ‘Server Hello’. Also, the SEQ number is 2921, where it should have been 1. This indicates that 2920 bytes of data have been lost (because SEQ + payload = next SEQ, so 1 + x = 2921 and x = 2920), which is exactly two full 1460-byte segments. ACK=142 indicates receipt of the 141 bytes of data sent by the client.
  • In line# 2144, the client sends a Dup ACK carrying the SACK option, with SEQ=142 (see the next sequence number in line# 2142). The client informs the server that it has only received bytes 2921–3011 and is missing the data before that. ACK is still 1 because no in-order data has arrived to acknowledge.
  • After Dup ACK#2, server sends a RST message to tear down this flow.

It might help to see the corresponding packet capture from the external vendor:

Capture#2-External vendor
  • We see that on line# 1176, server has in fact replied with a ‘Server Hello’. But this never reached us, the client. Client sends a duplicate ACK requesting missing data. The server tries to re-transmit the missing data a few times, does not receive an acknowledgment and finally sends a RST back to the client.
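The numbers from capture #1 can be checked with a short calculation. This is a sketch using only values quoted above (MSS 1460, expected SEQ 1, observed SEQ 2921); the function name is hypothetical. The gap works out to exactly two full-sized segments, which is at least consistent with the largest packets being dropped while a smaller one got through.

```python
def gap_in_segments(expected_seq: int, observed_seq: int, mss: int):
    """Return (missing_bytes, full_segments, leftover_bytes) for a sequence gap."""
    missing_bytes = observed_seq - expected_seq
    full_segments, leftover = divmod(missing_bytes, mss)
    return missing_bytes, full_segments, leftover


# expected SEQ=1, server segment seen with SEQ=2921 (line# 2143), MSS=1460
print(gap_in_segments(1, 2921, 1460))  # (2920, 2, 0)
```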

Rollback activity

At this point, we had to bring the Cisco ASA firewall back into production. Let’s take a look at the packet capture from the ASA:

Capture#3-ASA

Notice the MSS in the SYN-ACK from the server (line#5). It is 1380. But, looking back at the capture#2 (line#1172) from the vendor, we saw the server use 1460 as the MSS. The reason we see a different MSS value here, is because the ASA modifies this to 1380 [ASA-TCP MSS].

The TCP and SSL handshake completes and application data flows between the hosts successfully.

Why did the connection fail with a higher MSS? As we later discovered, a router a couple of hops away terminated an IPSec VPN to the vendor. IPSec may require up to 53 bytes for its header [IPSec-Bytes]. A 1460 byte TCP segment plus 40 bytes of IP and TCP headers already fills a 1500 byte IP packet, so there is simply no room for the extra header information. Therefore, depending on the environment, adjust the MSS value to accommodate the header information of other protocols, for example 4 bytes for MPLS, 8 bytes for GRE and 12 bytes for RTP.
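The budget can be checked with simple arithmetic. The sketch below assumes 40 bytes of IPv4 and TCP headers without options and uses the 53-byte IPSec figure quoted above as an upper bound (the real overhead depends on cipher and mode); the function names are made up for illustration.

```python
IP_TCP_HEADERS = 40   # 20-byte IPv4 header + 20-byte TCP header, no options
IPSEC_OVERHEAD = 53   # upper bound quoted above; varies with cipher and mode


def packet_size(mss: int, tunnel_overhead: int = 0) -> int:
    """Size of a full-sized segment on the wire after encapsulation."""
    return mss + IP_TCP_HEADERS + tunnel_overhead


def fits(mss: int, tunnel_overhead: int, path_mtu: int = 1500) -> bool:
    """True if a full-sized segment crosses the path without fragmentation."""
    return packet_size(mss, tunnel_overhead) <= path_mtu


print(packet_size(1460, IPSEC_OVERHEAD), fits(1460, IPSEC_OVERHEAD))  # 1553 False
print(packet_size(1380, IPSEC_OVERHEAD), fits(1380, IPSEC_OVERHEAD))  # 1473 True
```

With an MSS of 1380, the encapsulated packet stays under 1500 bytes with some headroom to spare, which matches what we saw through the ASA.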

Changes on the Fortigate

Once you identify the firewall security policy matching the traffic flow, set the tcp-mss-sender and tcp-mss-receiver values suitable for your environment.

config firewall policy

edit <policy number>

set tcp-mss-sender 1380

set tcp-mss-receiver 1380

end

This can also be set under the interface, in which case it applies to all traffic traversing the interface.

config system interface

edit <port number>

set tcp-mss 1380

end

Migration attempt #2 — Success!

After putting this theory to test in a lab environment, we attempted the migration with the new MSS values.

In the next couple of packet captures, observe the Fortigate modifying the MSS value. The client SYN comes in with an MSS of 1460 [capture #4: line #1] and goes out the vendor port with a value of 1380 [capture #5: line #1]. Similarly, the SYN-ACK from the vendor comes in with 1460 [capture #5: line #2] and goes out the client port with an MSS value of 1380 [capture #4: line #2].

Capture#4-Fortigate (client-port)
Capture#5-Fortigate (vendor-port)
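The rewriting seen in captures #4 and #5 can be modeled in one line. This sketch assumes the firewall only ever lowers the announced value (the usual meaning of MSS clamping); whether FortiOS behaves exactly this way for smaller announced values is an assumption, and the function name is hypothetical.

```python
def clamp_mss(announced_mss: int, limit: int = 1380) -> int:
    """MSS forwarded by a device that clamps the TCP MSS option to `limit`."""
    return min(announced_mss, limit)


print(clamp_mss(1460))  # 1380: the value seen leaving the firewall
print(clamp_mss(1200))  # 1200: an already smaller MSS is left alone (assumed)
```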

SSL handshake completes, followed by application data exchanged between client and server. The HTTPS connection went through as expected and the cut-over was successful!

A few tips

Fortigate offloads traffic to ASIC processors after a session is established, to achieve higher performance. With this default setting, not all packets might be captured [asic offload]. While troubleshooting, disable ASIC offload under the matching security policy. Don’t forget to re-enable it afterwards!

config firewall policy

edit <policy number>

set auto-asic-offload disable

end

Use the following options for a complete packet capture from the Fortigate, so that it can be viewed in Wireshark correctly [FGT-sniffer-options].

For example:

diagnose sniffer packet <interface name> "host 1.1.1.1 or host 2.2.2.2 and tcp port 443" 6 0 l

where,

6 = complete header and Ethernet data with interface name.

0 = count of captured packets. 0 means packets will be captured till stopped with control + C.

l = local time

Ping can be used to quickly determine the optimal MTU for a network path. Consider the following example on a Windows machine.

ping <ip> -f -l 1500

Pinging <ip> with 1500 bytes of data:

Packet needs to be fragmented but DF set.

Packet needs to be fragmented but DF set.

Packet needs to be fragmented but DF set.

Packet needs to be fragmented but DF set.

The size given to this ping command does not include the 28 bytes of IP and ICMP headers (20 + 8). The packet is therefore 1528 bytes, which is greater than the supported MTU along the path. Because the -f flag sets the DF (Do Not Fragment) bit, the packet cannot be fragmented, and we receive a response stating that it needs to be fragmented but DF is set.

With the payload size reduced to 1472 bytes, we receive ICMP replies without any error. The MTU along this path is therefore 1500 bytes (1472 bytes of payload + 28 bytes of IP and ICMP headers).

ping <ip> -f -l 1472

Pinging <ip> with 1472 bytes of data:

Reply from <ip>: bytes=1472 time=90ms TTL=241

Reply from <ip>: bytes=1472 time=90ms TTL=241

On macOS, use:

ping -D -s <packetsize> <ip>
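Rather than trying payload sizes by hand, the search can be automated with a binary search. The sketch below is not tied to any particular ping implementation: `probe` is a hypothetical callable that you would back with a real DF-set ping, and here it is replaced by a stand-in that pretends the path MTU is 1420 bytes.

```python
from typing import Callable

ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header


def find_path_mtu(probe: Callable[[int], bool], low: int = 0, high: int = 1472) -> int:
    """Binary-search the largest ICMP payload that passes with DF set.

    `probe(payload)` must return True when a ping of that payload size gets a
    reply. Assumes a payload of `low` bytes always succeeds. Returns the
    corresponding MTU (payload + 28 bytes of headers).
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low + ICMP_OVERHEAD


def fake_probe(payload: int) -> bool:
    """Stand-in for a real ping: pretend the path MTU is 1420 bytes."""
    return payload + ICMP_OVERHEAD <= 1420


print(find_path_mtu(fake_probe))  # 1420
```

Subtracting 40 from the result gives a starting point for the MSS, before allowing for any tunnel overhead further along the path.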

References and credits

  1. https://forums.clavister.com/viewtopic.php?t=11915
  2. https://tools.ietf.org/html/rfc879
  3. https://tools.ietf.org/html/rfc6691
  4. http://www.tcpipguide.com/free/t_TCPConnectionEstablishmentSequenceNumberSynchroniz.htm
  5. http://www.firewall.cx/networking-topics/protocols/tcp/134-tcp-seq-ack-numbers.html
  6. http://packetlife.net/blog/2010/jun/17/tcp-selective-acknowledgments-sack/
  7. https://tools.ietf.org/html/rfc2001
  8. https://www.cisco.com/c/en/us/support/docs/security/asa-5500-x-series-next-generation-firewalls/113393-asa-troubleshoot-throughput-00.html#anc14
  9. https://hamwan.org/Standards/Network%20Engineering/IPsec.html
  10. https://fortinetweb.s3.amazonaws.com/docs.fortinet.com/v2/attachments/6db46127-1a1c-11e9-9685-f8bc1258b856/fortigate-hardware-acceleration-56.pdf
  11. https://forum.fortinet.com/tm.aspx?m=156382
  12. https://forum.peplink.com/t/how-to-determine-the-optimal-mtu-and-mss-size/7895
  13. https://help.fortinet.com/fa/cli-olh/5-6-2/Document/1600_diagnose/sniffer.htm
