Deep dive into container networking: Part 1

How does a container network work behind the scenes? Can we build one ourselves?

Arpit Khurana
6 min read · Jul 21, 2019

Overview

Hi, we all know about Docker and Kubernetes due to their growing popularity. One might wonder how all of this works behind the scenes. There is a lot that we don’t deal with in day-to-day life while deploying our applications on a container orchestration engine, but knowing a bit about it can help in a lot of ways.

This article is not about how to use Docker or Kubernetes networking. It is focused on how containers use Linux kernel components to implement virtual networks on a node or on a group of nodes.

In this article, we will try to understand the concepts of networking and build our own small container engine with just networking support.

Introduction

As we know, each container in a cluster runs with its own virtual IP address, which is generally assigned by Docker together with an orchestration engine like Kubernetes.

One such simple cluster may look like this:

A few things to notice here:

  1. Each node has a dedicated “/24” subnet. The containers on that node have IPs allocated from that range only.
  2. 10.0.0.1 and 10.0.1.1 are the gateway IPs for the container networks.

In such a cluster, each container can talk to any other container using its IP address, and each container can also talk directly to the internet.
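To make the “/24” concrete: a /24 prefix leaves 8 host bits, so each node's subnet holds 256 addresses. Here is a quick shell sketch of that arithmetic (the function name `subnet_size` is only illustrative and not part of any tool):

```shell
# Number of addresses in an IPv4 subnet with the given prefix length.
subnet_size() {
  echo $(( 1 << (32 - $1) ))
}

# Total addresses in a /24.
subnet_size 24

# Excluding the network and broadcast addresses leaves the assignable ones;
# the gateway (10.0.0.1 on the first node) takes one of those.
echo $(( $(subnet_size 24) - 2 ))
```

This prints 256 and then 254, which is the upper bound on how many addresses a single node could hand out under this layout.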

In this and the upcoming parts of this article, we will implement a cluster of 2 nodes and 4 containers using VMs as shown above.

Containers use Linux kernel namespaces to build an isolated workspace. When you run a container, certain namespaces are created for that container. These namespaces provide a layer of isolation. Each aspect of a container runs in a separate namespace and its access is limited to that namespace. Common namespaces are the PID namespace for process isolation, MNT for filesystem isolation, NET for network isolation, and so on.
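On a Linux machine you can look at these namespaces directly: every process exposes the namespaces it belongs to as symlinks under /proc/<pid>/ns. This is a small sketch, assuming a Linux host with /proc mounted (the helper name `netns_of` is made up for illustration):

```shell
# List the namespaces the current process belongs to (net, pid, mnt, ...).
ls -l /proc/self/ns

# Print the network namespace identifier of a process (default: the caller).
netns_of() {
  readlink "/proc/${1:-self}/ns/net"
}

# Identifier of this shell's network namespace, in the form net:[<inode>].
netns_of $$
```

Two processes that print the same identifier share a network namespace. A process inside a container would print a different one than a process on the host.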

We are going to focus on network namespaces.

Network Namespace

A network namespace is logically another copy of the network stack, with its own routes, firewall rules, and network devices. Devices in a namespace may be virtual or hardware devices. So if you are working in a custom network namespace, you are not making any changes to the host network, unless one or more of your host interfaces are attached to this namespace. A container generally runs in its own network namespace.

But this namespace won’t have any connectivity to the host or to any other namespace (that is, any other container). This is why we need a VETH device.

VETH Devices

Veth devices are created as pairs of connected virtual Ethernet interfaces and can be thought of as a virtual patch cable: what goes in one end comes out the other. They can act as tunnels between network namespaces to create a bridge to a network device in another namespace. You can read more about them in the veth(4) man page.

A VETH pair connects the isolated network namespace with the host and hence with the outside world. Let’s see what that looks like.

Enough theory, let’s do something practical too. For that we need a VM; I am using VirtualBox with Ubuntu installed. We will discuss the VM’s network configuration later as we progress through this article.

We will execute the following commands in the VM to add a namespace and a VETH pair. We will use the ip binary for it.
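A minimal sketch of such commands follows, using the names from this article (containerns, veth0, veth1). The `RUN` variable and the function name are my own additions: `RUN` defaults to `echo`, so the script only prints the commands, and setting `RUN=sudo` in the environment applies them for real (which needs root).

```shell
# Dry run by default: with RUN=echo the commands are only printed.
# Set RUN=sudo in the environment to actually apply them (needs root).
RUN=${RUN:-echo}
CON=containerns   # name of the network namespace

create_namespace_and_veth() {
  # Create the network namespace.
  $RUN ip netns add "$CON"
  # Create the veth pair: veth0 and veth1 are the two ends of one virtual cable.
  $RUN ip link add veth0 type veth peer name veth1
  # Move veth0 into the namespace; veth1 stays on the host.
  $RUN ip link set veth0 netns "$CON"
}

create_namespace_and_veth
```

After running it with `RUN=sudo`, `ip netns list` on the host should show the new namespace.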

“containerns” is the name of the network namespace here. After adding a VETH pair (veth0-veth1), we assigned veth0 to containerns.

Now we have the setup shown in the picture above, but no IP addresses are allocated yet. How do we reach a program running inside the container? And how would a program running inside the container access the outside network? We will solve these problems one by one.

Let's start with adding an IP address and respective routes from and to the container.
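One way this step could be scripted is sketched below. It is not necessarily identical to the original listing: the /24 prefix on the container address and the device-only default route are assumptions on my part, and `RUN` again defaults to `echo` so nothing changes unless you set `RUN=sudo`.

```shell
# Dry run by default; set RUN=sudo to apply the commands (needs root).
RUN=${RUN:-echo}
CON=containerns
IP=10.0.0.2   # container address, taken from the node's 10.0.0.x range

configure_container_network() {
  # Bring up loopback and veth0 inside the namespace, and veth1 on the host.
  $RUN ip netns exec "$CON" ip link set lo up
  $RUN ip netns exec "$CON" ip link set veth0 up
  $RUN ip link set veth1 up
  # Give veth0 its address; veth1 deliberately gets none.
  $RUN ip netns exec "$CON" ip addr add "$IP/24" dev veth0
  # Host -> container: send traffic for the container IP into veth1.
  $RUN ip route add "$IP/32" dev veth1
  # Container -> outside: default route out through veth0.
  $RUN ip netns exec "$CON" ip route add default dev veth0
}

configure_container_network
```

The interfaces are brought up before the routes are added, since a route through an interface that is still down is rejected.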

In this script, we have set an IP address on the veth0 device, which is inside containerns, and we have also added routes. Now let's see what our commands created.

We can check the interfaces created using ifconfig.

Running ‘ifconfig’ on the host will show veth1, which will look something like this:

Notice that there is no IP address allocated to this interface because we did not set any. We set the IP on veth0 which is present inside the network namespace. Let us check that too. We can run ifconfig inside the namespace using ‘ip netns exec’ as shown.
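Both checks can be sketched as follows, assuming ifconfig (from net-tools) is installed in the VM. As in the other sketches, `RUN` is my own helper variable: it defaults to `echo`, so set `RUN=sudo` to actually run the commands.

```shell
# Dry run by default; set RUN=sudo to execute (ip netns exec needs root).
RUN=${RUN:-echo}
CON=containerns

show_interfaces() {
  # Host side: veth1 should be listed without an IP address.
  $RUN ifconfig veth1
  # Namespace side: run ifconfig inside containerns to see lo and veth0.
  $RUN ip netns exec "$CON" ifconfig
}

show_interfaces
```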

Here veth0 has an IP allocated, which we set through our script. A loopback interface is also visible, which we enabled through our script. The loopback interface is not an absolute requirement, but it helps when two applications running in this namespace need to communicate. Now our network looks like this:

Now let’s check the routes on the host which were created as a result of that script:

The default route in my VM goes through the enp0s3 interface; it may be a different interface on your VM. The second route is the one we created, which sends all packets with destination IP 10.0.0.2 to veth1. veth1 then passes them on to veth0 inside the container.

This looks good, but we also need to route packets back out of the container. The script above does that by adding a default route inside the namespace. Let’s check the containerns routes to make sure everything is fine.

So we have a default route which sends all outgoing packets via veth0, and veth0 forwards them to veth1. This looks good too. Let’s test it by pinging the container IP.
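The routing checks and the ping test from the last few paragraphs can be collected into one sketch. The function name and the `RUN` variable are illustrative; `RUN` defaults to `echo` (print only), and `RUN=sudo` runs the commands.

```shell
# Dry run by default; set RUN=sudo to execute (ip netns exec needs root).
RUN=${RUN:-echo}
CON=containerns
IP=10.0.0.2

verify_routing() {
  # Host routing table: expect the default route plus an entry for $IP via veth1.
  $RUN ip route
  # Routing table inside the namespace: expect a default route via veth0.
  $RUN ip netns exec "$CON" ip route
  # Send three echo requests from the host to the container address.
  $RUN ping -c 3 "$IP"
}

verify_routing
```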

Whoa, it worked! We can also run an application inside this namespace to verify. You can try running a Python HTTP server like this:

CON=containerns   # namespace created earlier; run these commands as root
ip netns exec "$CON" python3 -m http.server &
wget 10.0.0.2:8000   # saves the server's response as index.html

Now we can add another such container, and ping will work from host to container. But what about communication between containers? We could create a similar VETH pair between the two containers and add the corresponding routes, and that would work. But what if we have more than 2 containers? Creating a VETH pair for every pair of containers quickly gets messy.

So we will use a Linux bridge here. Let's continue that in the second part of this article.
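To put a number on "messy": connecting every container directly to every other one needs n(n-1)/2 VETH pairs for n containers, on top of each container's link to the host. This is a small sketch of that arithmetic (the function name is made up for illustration):

```shell
# Number of veth pairs needed to connect n containers directly to each other.
veth_pairs_needed() {
  echo $(( $1 * ($1 - 1) / 2 ))
}

for n in 2 4 10; do
  echo "$n containers -> $(veth_pairs_needed "$n") veth pairs"
done
```

This prints 1, 6 and 45 pairs respectively, so the count grows quadratically, and every new container means new routes on all the existing ones.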

That's all for part 1.

Hope you found this article useful. Any feedback is welcome :)

Already want to go to part 2?

Arpit Khurana

Software developer @ Golang | Kubernetes | Android. Cloud and networking enthusiast. arpitkhurana.in