How to Simulate Network Failures in Linux

Alexander Zakharenko
Oct 28, 2022 · 9 min read


Hello, my name is Alex and I’ve been a backend QA engineer for 9 years. Over these years I’ve tested microservices a lot. On the one hand, it’s easier to test a small separate service; on the other hand, you also have to test how the services communicate with each other.

In this article I’ve tried to describe what to do when you need to test how an application behaves under poor network connectivity.

This article is a translation of my Habr article, originally written in Russian.

Usually we test software in a test environment with good connectivity, but in production the network might be worse, and sometimes it’s mandatory to simulate network issues. Luckily, we have the tc (Traffic Control) tool, which allows us to tune networking in the system. It has been described in many articles here, but I’ll try to show how a client behaves when tc is used and what these network tunings mean in practice. We are interested in traffic scheduling, so we will use a qdisc, and since we need to simulate an unstable network, we will use the classless qdisc netem.

Let’s launch an echo-server (I used nmap-ncat):

ncat -l 127.0.0.1 12345 -k -c 'xargs -n1 -i echo "Response: {}"'

Also I’ve written a simple Python client that prints out timestamps and sends Test to our echo-server:
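The client itself was embedded as a gist in the original; here is a minimal sketch reconstructed from the output below (details such as the receive buffer size are my assumptions):

import socket
import time

# Echo client: print a timestamp around every step of the exchange.
start = time.time()
print('[time before connection: %.5f]' % start)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(('127.0.0.1', 12345))
print('[time after connection, before sending: %.5f]' % time.time())

sock.send(b'Test\n')
print('[time after sending, before receiving: %.5f]' % time.time())

data = sock.recv(1024)
print('[time after receiving, before closing: %.5f]' % time.time())

sock.close()
end = time.time()
print('[time after closing: %.5f]' % end)
print('[total duration: %.5f]' % (end - start))
print(data.decode(), end='')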

Let’s launch it and look at the traffic on the lo interface and TCP port 12345 (I will attach traffic dumps as embedded gist code for convenience).
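The dumps here and below were captured with tcpdump; the original doesn’t show the exact invocation, but something like this works:

tcpdump -i lo -nn 'tcp port 12345'

And here is the client’s output: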

[user@host ~]# python client.py
[time before connection: 1578652979.44837]
[time after connection, before sending: 1578652979.44889]
[time after sending, before receiving: 1578652979.44894]
[time after receiving, before closing: 1578652979.45922]
[time after closing: 1578652979.45928]
[total duration: 0.01091]
Response: Test

Everything is ordinary: a 3-way handshake, a request, a response, and connection closing.

Traffic Delay

Let’s set up a delay of 500 milliseconds:

tc qdisc add dev lo root netem delay 500ms
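As a quick sanity check, ping also shows the effect: on lo both directions pass through the qdisc, so the RTT should be roughly one second, twice the configured delay:

ping -c 3 127.0.0.1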

Now let’s launch our client script and see that its execution time is about 2 seconds (one second to establish the connection, one more for the request and response):

[user@host ~]# ./client.py
[time before connection: 1578662612.71044]
[time after connection, before sending: 1578662613.71059]
[time after sending, before receiving: 1578662613.71065]
[time after receiving, before closing: 1578662614.72011]
[time after closing: 1578662614.72019]
[total duration: 2.00974]
Response: Test

And traffic:

We can see that the expected 0.5 second lag is there now. Things get more interesting when the delay is longer: the kernel starts to resend some segments (or packets; I will use both terms below) in such situations. Let’s set a 1 second delay and look at the traffic:

tc qdisc change dev lo root netem delay 1s

We can see that some TCP segments are duplicated: two SYNs, two SYN-ACKs, etc. This happens because the round-trip time now exceeds the initial retransmission timeout, so the kernel resends segments before the acknowledgements arrive.

Besides a constant delay we can set jitter, a distribution function and correlation (relative to the previous value). This is how it can be done:

tc qdisc change dev lo root netem delay 500ms 400ms 50 distribution normal

Here we’ve set a delay of 500 milliseconds with a jitter of 400 milliseconds, i.e. roughly from 100 to 900 milliseconds; the values will be drawn from a normal distribution, with a 50 percent correlation with the previous delay value.

You might have noticed that in the first command I used add and then change. The meaning of these commands is obvious, so I will only add that there is also del, which can be used to remove the configuration.
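For reference, here is how to inspect the current qdisc configuration and reset it back to the default (standard tc invocations not shown in the original):

tc qdisc show dev lo
tc qdisc del dev lo root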

Packet Loss

Let’s try packet loss now. As the documentation shows, this can be done in three ways: drop TCP segments randomly with some probability, use a Markov chain of 2, 3 or 4 states to model packet loss, or use the Gilbert-Elliott model. In this article I will cover the first (the simplest and most obvious) way; you can read about the others here.

Let’s make a loss of 50% of segments with a correlation of 25%:

tc qdisc add dev lo root netem loss 50% 25%

Unfortunately, tcpdump will not visually show us the packet loss; we can only infer that it really works from the increased and unstable running time of the client.py script (it may finish instantly or take 20 seconds), as well as from the increased number of retransmitted segments.
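While the script runs, one way to see the retransmissions is to query the kernel’s TCP counters (exact counter names vary between kernels and iproute2 versions):

netstat -s | grep -i retrans
nstat -az TcpRetransSegs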

Segment Corruption

In addition to packet loss, it is possible to simulate segment corruption: noise will appear at a random position in a segment (according to the netem documentation, a single-bit error at a random offset). Let’s corrupt segments with a 50% probability and no correlation:

tc qdisc change dev lo root netem corrupt 50%

Run the client script (there is nothing interesting in its output, but it took 2 seconds) and look at the traffic:

It can be seen that some segments are resent and there is even one segment with broken metadata: options [nop,unknown-65 0x0a3dcf62eb3d,[bad opt]>. But the main thing is that everything worked out correctly in the end: TCP coped with its task.

Segment Duplication

What else can we do with netem? For example, simulate the opposite of packet loss: duplication of segments. Like loss, this command takes two arguments: probability and correlation.

tc qdisc change dev lo root netem duplicate 50% 25%

Segment Reordering

You can reorder segments in two ways.

In the first, some of the packets are sent immediately and the rest with a given delay. An example from the documentation:

tc qdisc change dev lo root netem delay 10ms reorder 25% 50%

With a probability of 25% (and a correlation of 50%) a segment will be sent immediately; the rest will be delayed by 10 milliseconds.
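You can also observe this with ping: with the rule above, some probes come back almost instantly, while others pick up the 10 ms delay in one or both directions:

ping -c 10 127.0.0.1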

In the second way, every Nth segment is sent instantly with a given probability (and correlation), while the rest are sent with a given delay. An example from the documentation:

tc qdisc change dev lo root netem delay 10ms reorder 25% 50% gap 5

Every fifth packet has a 25% chance of being sent without delay.

Bandwidth Tuning

Usually TBF (Token Bucket Filter) is used for this, but with netem you can also change the interface bandwidth:

tc qdisc change dev lo root netem rate 56kbit

This command will make browsing localhost as painful as surfing the internet over a dial-up modem. In addition to setting the bitrate, you can also emulate a link-layer protocol: set a per-packet overhead, a cell size and a per-cell overhead. For instance, this is how you can simulate ATM (48-byte cells with a 5-byte header) at a bit rate of 56 kbps:

tc qdisc change dev lo root netem rate 56kbit 0 48 5
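A quick way to feel the serialization delay is ping with a large payload: at 56 kbit/s a packet of about 1400 bytes takes roughly 0.2 seconds to transmit each way, so the RTT should jump to around 0.4 seconds:

ping -c 3 -s 1400 127.0.0.1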

Connection Timeout

Another important item in a software acceptance plan is timeouts. This is important for distributed systems: when one of the services goes down, the rest should fail over in due time or return an error to the client, and in no case should they simply hang, waiting for a response or a connection.

There are several ways to simulate this: for example, use a mock that does not respond, or attach to the process with a debugger, put a breakpoint in the right place and suspend execution (this is probably the most perverse way). But one of the most obvious is to firewall ports or hosts, and iptables will help us with this.

To demonstrate, we will firewall port 12345 and run our client script. You can firewall outgoing packets to this port on the sender or incoming packets on the receiver. In my examples incoming packets will be firewalled (using the INPUT chain and the --dport option). Such packets can be DROPped or REJECTed: either with the TCP RST flag or with an ICMP message (the default is icmp-port-unreachable, and it is also possible to send icmp-net-unreachable, icmp-proto-unreachable, icmp-net-prohibited and icmp-host-prohibited).

DROP

If there is a rule with DROP, the packets will simply “disappear”.

iptables -A INPUT -p tcp --dport 12345 -j DROP

We start the client and see that it hangs at the stage of connecting to the server. Let’s look at the traffic:

It can be seen that the client sends SYN packets with an exponentially increasing timeout. So we’ve found a small bug in the client: we should use the settimeout() method to limit how long the client tries to connect to the server.
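A sketch of the fix, assuming the client from the beginning of the article (the 5 second value is an arbitrary choice):

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5)  # fail after 5 seconds instead of hanging
try:
    sock.connect(('127.0.0.1', 12345))
except socket.timeout:
    print('connect timed out')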

Remove the rule after that:

iptables -D INPUT -p tcp --dport 12345 -j DROP

It’s possible to delete all rules at once:

iptables -F
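To check what is currently configured, list the rules with their numbers, which also lets you delete a specific rule by its number (e.g. iptables -D INPUT 1):

iptables -L INPUT -n --line-numbers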

If you are using Docker and need to firewall all traffic going to the container, then you can do it like this:

iptables -I DOCKER-USER -p tcp -d CONTAINER_IP -j DROP

REJECT

Now let’s add a similar rule, but with REJECT:

iptables -A INPUT -p tcp --dport 12345 -j REJECT

The client terminates in a second with the error [Errno 111] Connection refused. Let’s look at ICMP traffic:

It can be seen that the client received port unreachable twice and then terminated with an error.

REJECT with tcp-reset

Let’s try adding the --reject-with tcp-reset option:

iptables -A INPUT -p tcp --dport 12345 -j REJECT --reject-with tcp-reset

In this case the client exits immediately with an error because it received an RST packet:

REJECT with icmp-host-unreachable

Let’s look at another use case of REJECT:

iptables -A INPUT -p tcp --dport 12345 -j REJECT --reject-with icmp-host-unreachable

The client terminates in a second with the error [Errno 113] No route to host; in the ICMP traffic dump we see ICMP host 127.0.0.1 unreachable.

You can also try the rest of the REJECT options, but I’ll stick with these :)

Request Timeout

Another situation is when a client is able to connect to a server but cannot send it a request. How do we filter packets so that filtering does not kick in immediately? If you look at the traffic of any client-server exchange, you will notice that only the SYN and ACK flags are used when establishing a connection, but the last packet of a request carries the PSH flag (it is set automatically to avoid buffering). You can use this to build a filter that lets through all packets except those containing the PSH flag: the connection will be established, but the client will not be able to send data to the server.

DROP

For DROP, the command would look like this:

iptables -A INPUT -p tcp --tcp-flags PSH PSH --dport 12345 -j DROP

Let’s start our client and watch the traffic:

We can see that the connection is established, and the client cannot send data to the server.

REJECT

In this case, the behavior will be the same: the client will not be able to send a request, but will receive ICMP 127.0.0.1 tcp port 12345 unreachable and increase the time between request retransmissions exponentially. The command looks like this:

iptables -A INPUT -p tcp --tcp-flags PSH PSH --dport 12345 -j REJECT

REJECT with tcp-reset

The command looks like this:

iptables -A INPUT -p tcp --tcp-flags PSH PSH --dport 12345 -j REJECT --reject-with tcp-reset

We already know what --reject-with tcp-reset does: the client receives an RST packet in response, so we can predict the behavior. Receiving an RST packet on an established connection means that the socket was unexpectedly closed on the other side, so the client should get a Connection reset by peer error ([Errno 104]). Let’s run our script and check. And this is what the traffic looks like:

REJECT with icmp-host-unreachable

I think it’s already obvious what the command looks like :) The client’s behavior in this case will differ slightly from that with a plain REJECT: it will not increase the timeout between attempts to resend the packet.
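For completeness, the command follows the same pattern as the previous ones:

iptables -A INPUT -p tcp --tcp-flags PSH PSH --dport 12345 -j REJECT --reject-with icmp-host-unreachable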

You don’t have to write a mock to test how a service interacts with a frozen client or server; sometimes it’s enough to use the standard utilities that are available in Linux.

The utilities discussed in this article have even more features than I have described, so you can come up with your own ways of using them. Personally, what I’ve written about has always been enough for me (in fact, even less). If you use these or similar utilities for testing in your company, please comment below. If not, I hope your software will become better if you decide to test it against network problems using the suggested methods.
