Writing your own Linux Container

Shashank Jain
Jul 25, 2018


In today’s blog we discuss how to create a simple Linux container in Go. The aim of this exercise is to understand how basic Linux primitives can be combined to build a simple container of your own.

Before that, a few words on Linux namespaces.

Namespaces are a Linux kernel mechanism for sandboxing kernel resources such as file systems, process trees, message queues, semaphores, and network components like devices, sockets and routing rules.

The idea of namespaces is to isolate processes within their own execution sandbox, so that they cannot see the resources of processes in other namespaces. This post considers six namespaces:

1. PID namespace

2. Mount namespace

3. UTS namespace

4. Network namespace

5. IPC namespace

6. User namespace

We are not discussing the cgroup namespace here, which allows cgroups themselves to be scoped into their own namespaces.

We will leave a fuller explanation of namespaces for a separate blog. For now it's worth understanding that namespaces provide isolation from a visibility perspective, while cgroups provide resource accounting, such as how much memory, CPU or network a particular process can use.

The sample program is described below.

This program needs a rootfs to be downloaded and placed on the host VM. In my case I put the rootfs in the /home/ubuntu/ubu directory.

The program first checks its first argument. If that argument is run, the program executes /proc/self/exe, which simply means it executes itself (/proc/self/exe is a symbolic link to the binary of the calling process). Before executing, the CLONE flags are set so that the new process is placed in new UTS, PID and mount namespaces.

After that, the command is executed. This runs the program again, this time with the argument fork, which maps to a function defined in the same program.

One thing to understand here is that the new mount namespace starts with a copy of the parent's mount points, so all the mounts are still the same and we can reach the rootfs and do a pivot root. Pivot root switches the root of the filesystem. After the pivot root, proc is mounted from the new root. Now when a process is executed within the namespace, it sees the new root, and its proc comes from the rootfs rather than from the host. pivot_root is a system call that is more secure than chroot for changing the rootfs of a process.
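A rough sketch of the pivot root step in Go is shown here. It is an illustration, not the original code. The directory name .pivot_old and the helper names are assumptions. The step that marks the mounts private is also an addition, since pivot_root fails on hosts where the root mount is shared. The function must only be called from a process that is already in its own mount namespace, so main deliberately does not call it.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// putOldPath returns the directory (name assumed for this sketch) under
// the new root where pivot_root parks the old root.
func putOldPath(newRoot string) string {
	return filepath.Join(newRoot, ".pivot_old")
}

// pivotRoot switches the root filesystem to newRoot and mounts proc there.
// Call it only inside a new mount namespace, as root.
func pivotRoot(newRoot string) error {
	// Stop mount changes from propagating back to the host.
	if err := syscall.Mount("", "/", "", syscall.MS_REC|syscall.MS_PRIVATE, ""); err != nil {
		return fmt.Errorf("make / private: %w", err)
	}
	// pivot_root needs the new root to be a mount point, so bind-mount it onto itself.
	if err := syscall.Mount(newRoot, newRoot, "", syscall.MS_BIND|syscall.MS_REC, ""); err != nil {
		return fmt.Errorf("bind mount rootfs: %w", err)
	}
	putOld := putOldPath(newRoot)
	if err := os.MkdirAll(putOld, 0700); err != nil {
		return fmt.Errorf("create put_old: %w", err)
	}
	if err := syscall.PivotRoot(newRoot, putOld); err != nil {
		return fmt.Errorf("pivot_root: %w", err)
	}
	if err := os.Chdir("/"); err != nil {
		return fmt.Errorf("chdir: %w", err)
	}
	// Mount proc from the new root (assumes the rootfs has a /proc directory).
	if err := syscall.Mount("proc", "/proc", "proc", 0, ""); err != nil {
		return fmt.Errorf("mount proc: %w", err)
	}
	// Detach the old root so the host filesystem is no longer reachable.
	if err := syscall.Unmount("/.pivot_old", syscall.MNT_DETACH); err != nil {
		return fmt.Errorf("unmount old root: %w", err)
	}
	return os.Remove("/.pivot_old")
}

func main() {
	// Only print the plan; calling pivotRoot in the host mount namespace
	// would change the root for the host itself.
	fmt.Println("old root would be parked at", putOldPath("/home/ubuntu/ubu"))
}
```

In a full program this function would be called from the fork branch, before executing the shell.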

Running the above program launches a shell in a sandbox confined by the PID, mount and UTS namespaces.

Running hostname in a different terminal, directly on the host, shows this.

So we can see that the hostname within the container is different from that on the host.

Listing processes within the sandbox

Listing processes on host

One can see different PIDs within the container sandbox, since the container now has an altogether separate process tree.

This is a very simple example of creating a container prototype using Linux namespaces and cgroups. In future articles I intend to touch on some of the Linux data structures that make these containers possible. I will also try to add the network namespace to the above example and go over some sample packet flows.

We can use nsenter to enter the namespaces of the created container. As an example, doing a ps in the created container:

And now, getting the PID (on the host) of the bash shell running within the container:

We see that 4308 is the PID of the shell on the host.

Executing nsenter -a -t 4308 /bin/bash creates another shell in the namespaces of the process with PID 4308, as shown below.

This blog is inspired by the writings of Liz Rice and her presentation at DockerCon (https://www.youtube.com/watch?v=MHv6cWjvQjM&t=1316s)

I have since added a blog on adding a network namespace to the container (https://medium.com/@jain.sm/network-namespace-in-own-container-98461eced8d2)

Disclaimer: The views expressed above are personal and not those of the company I work for.
