Secure Docker Containers Require Secure Applications

Understanding How Containers Work is the First Step to Understanding How to Secure Them

Published in Capital One Tech · 11 min read · May 13, 2019

Application containers are one of those great technologies that come along and reshape an entire industry. Historically, these kinds of disruptions have been rare; to witness in real time how a product like Docker can evolve from the seed of an idea into the must-have backbone of so much of today’s digital landscape is quite remarkable. My own career as a technologist has run parallel to the development and maturation of Docker and its greater container ecosystem. As containers and container platforms have evolved, the communities around them have grown, and container-based products have permeated our tech stacks. Yet, despite all of this, there’s still a bit of mystery around how containers actually work and the security implications they create for the applications that run inside them. This is the topic I want to tackle today.

Note: while I am aware that there are many container runtimes and schemes (lxc, rkt, Docker, etc.), I’m going to focus on Docker specifically since it’s arguably the most popular.

Unpacking Docker Images

To start off, we need to understand what a Docker image is. Images, broadly speaking, are like container templates that can either be run on their own or built upon to create new images. On a more technical level, images are just .tar archives containing a filesystem. That’s it!
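One way to check this for yourself is to flatten an image back into a tar stream and list what is inside. The sketch below assumes a working Docker CLI; `list_image_files` is a name made up for illustration, while `docker create`, `docker export`, and `docker rm` are standard Docker commands.

```shell
# list_image_files IMAGE
# Hypothetical helper: prints every path in an image's flattened filesystem.
list_image_files() {
    # Create (but do not start) a container so its filesystem can be exported.
    cid=$(docker create "$1") || return 1
    # `docker export` writes the container filesystem to stdout as a tar stream.
    docker export "$cid" | tar -tf -
    # Remove the temporary container.
    docker rm "$cid" >/dev/null
}

# Usage (assumes the image is available locally or can be pulled):
#   list_image_files alpine:latest | head
```

Nothing in the image is executed along the way; the listing comes straight out of the archive.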

Once the images are downloaded from an image repository (like hub.docker.com), they are unpacked and stored in the host’s filesystem. We can actually inspect these images on the filesystem by navigating to the image path (/var/lib/docker/overlay2 for hosts using the overlay2 storage driver). Each layer directory is named with a 64-character hexadecimal ID; here’s the base layer for the official Alpine image:


root@ip-XXXXXXXXXXX:/var/lib/docker/overlay2/56abedeb5085c1ad962f3dec89d1e9bc6b584ee06d9bed0897221417bb496c56/diff# ls -la
total 72
drwxr-xr-x 18 root root 4096 Jan 24 16:41 .
drwx------  3 root root 4096 Jan 24 16:41 ..
drwxr-xr-x  2 root root 4096 Dec 20 22:25 bin
drwxr-xr-x  2 root root 4096 Dec 20 22:25 dev
drwxr-xr-x 15 root root 4096 Dec 20 22:25 etc
drwxr-xr-x  2 root root 4096 Dec 20 22:25 home
drwxr-xr-x  5 root root 4096 Dec 20 22:25 lib
drwxr-xr-x  5 root root 4096 Dec 20 22:25 media
drwxr-xr-x  2 root root 4096 Dec 20 22:25 mnt
dr-xr-xr-x  2 root root 4096 Dec 20 22:25 proc
drwx------  2 root root 4096 Dec 20 22:25 root
drwxr-xr-x  2 root root 4096 Dec 20 22:25 run
drwxr-xr-x  2 root root 4096 Dec 20 22:25 sbin
drwxr-xr-x  2 root root 4096 Dec 20 22:25 srv
drwxr-xr-x  2 root root 4096 Dec 20 22:25 sys
drwxrwxrwt  2 root root 4096 Dec 20 22:25 tmp
drwxr-xr-x  7 root root 4096 Dec 20 22:25 usr
drwxr-xr-x 11 root root 4096 Dec 20 22:25 var
root@ip-XXXXXXXXXXX:/var/lib/docker/overlay2/56abedeb5085c1ad962f3dec89d1e9bc6b584ee06d9bed0897221417bb496c56/diff#

As we can see, it really is just a directory that holds a filesystem. So how does a filesystem get turned into a running container?

See cgroup. See cgroup Run

Now, let’s talk about Linux cgroups. Cgroups, or “control groups”, are a feature of the Linux kernel that limits and accounts for the resources (CPU, memory, I/O) used by a group of processes. Paired with kernel namespaces, which give a group of processes its own view of the filesystem, network, and process tree, they allow processes to be isolated from the rest of a machine: a process can have a “virtual” filesystem, resource limits, firewalled networking, and a host of other features. If this sounds suspiciously like a Docker container, you’d be right! Docker leverages cgroups and namespaces to run its containers.

To run a container, the Docker daemon takes the following steps (roughly):

Assembles a virtual filesystem from the image’s layers.
Creates a new cgroup and a set of namespaces for the container.
Mounts the virtual filesystem as the container’s root filesystem.
Sets the cgroup limits to those defined in the container’s configuration (stored locally by the daemon).
Sets up the container’s networking.
Starts the process defined in the image’s Dockerfile CMD or ENTRYPOINT.

Congratulations! You now have a running Docker container. In fact, because a container is just a process on the host, you can find it with ps like any other process.
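To see the kernel primitives behind a running container, you can look its process up under /proc on the host. The following is a minimal sketch, assuming a Linux host; `show_isolation` is a hypothetical helper name, and `docker inspect --format '{{.State.Pid}}'` is the standard way to ask Docker for a container's host PID. It prints the process's cgroup membership along with its namespaces, the kernel feature that gives a process its own view of mounts, networking, and PIDs.

```shell
# show_isolation PID
# Hypothetical helper: prints the cgroup membership and the namespaces of a process.
show_isolation() {
    pid="$1"
    echo "cgroups for PID $pid:"
    cat "/proc/$pid/cgroup"
    echo "namespaces for PID $pid:"
    # Each entry is a symlink such as mnt:[4026531840]; two processes share a
    # namespace when their links point at the same ID.
    for ns in mnt net pid; do
        readlink "/proc/$pid/ns/$ns"
    done
}

# Usage (run as root on the Docker host; <container> is a placeholder):
#   show_isolation "$(docker inspect --format '{{.State.Pid}}' <container>)"
#   show_isolation $$    # compare with your own shell
```

Comparing the two outputs should show different namespace IDs for the container's process and for your shell.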

Exploring How To Build Secure Containers

Docker, when you dig into how it does what it does, isn’t doing anything entirely revolutionary by itself. It simply leverages already supported kernel primitives like cgroups and packages them into a product that is simpler and more streamlined to use. But because Docker doesn’t build much on top of the actual kernel mechanisms, the containers it handles and applications it hosts are only as secure as the kernel itself. This means that the Docker daemon, its images, and its containers have no real built-in security features of their own that would allow for embedded secrets to stay secret or keep unauthorized third parties from gaining access to their processes and filesystems. We can test this by exploring a few of the more popular methods of hiding secrets in a container image: injecting secrets at image build and putting secrets in environment variables at container run time.

Test #1: Injecting secrets into containers at build

Let’s create a new Docker image. It’s rather simple, actually. All we need is a Dockerfile (assuming you’ve already got a working Docker installation).

FROM alpine:latest
ADD super.secret /super.secret
CMD /bin/sh

We’ll create a file called super.secret to inject into the container:

$> echo "this is a secret" > super.secret

And now we can build the image.

root@ip-XXXXXXXXXXX:~/docker# docker build -t secret:test .
Sending build context to Docker daemon  3.072kB
Step 1/3 : FROM alpine:latest
---> 3f53bb00af94
Step 2/3 : ADD super.secret /super.secret
---> 61ea9104ee5d
Step 3/3 : CMD /bin/sh
---> Running in 2a1b90e4b209
Removing intermediate container 2a1b90e4b209
---> c3713649d32e
Successfully built c3713649d32e
Successfully tagged secret:test
root@ip-XXXXXXXXXXX:~/docker#

Great! We have an image that now contains our secret (password, certificate, really anything you want to keep secret). So what happens if we search the filesystem for that secret file?

root@ip-XXXXXXXXXXX:~/docker# find / | grep super.secret
/root/docker/super.secret
/var/lib/docker/overlay2/11567f3bcc1b8e844e22ba37cfef2432ea319247e403707497022edebfd7a7ce/diff/super.secret
root@ip-XXXXXXXXXXX:~/docker#

That’s not good. Not only did we find the file, but we can actually read that file back, too.

root@ip-XXXXXXXXXXX:~/docker# cat /var/lib/docker/overlay2/11567f3bcc1b8e844e22ba37cfef2432ea319247e403707497022edebfd7a7ce/diff/super.secret
this is a secret
root@ip-XXXXXXXXXXX:~/docker#

This is because, as we explored earlier, Docker images are just filesystems that exist on the Docker host. Now, in this case we obviously didn’t expose anything because a) the secret is “this is a secret” and b) the image never left our local host. But, for the sake of argument, let’s say we pushed our image up to Docker Hub, or an internal enterprise repository like Artifactory. Now anyone with “pull” permissions to that image has access to our secret, meaning that secret isn’t really all that secret anymore.
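Pull access alone is enough to get the file back; no access to /var/lib/docker is needed. A rough sketch, assuming a working Docker CLI and a tar that supports -O (GNU and BSD tar both do); `read_file_from_image` is an illustrative name, not a Docker command.

```shell
# read_file_from_image IMAGE PATH
# Hypothetical helper: prints one file from an image without ever running it.
read_file_from_image() {
    cid=$(docker create "$1") || return 1
    # Stream the filesystem as tar and extract a single member to stdout (-O).
    # Tar member names carry no leading slash, so strip it from the path.
    docker export "$cid" | tar -xOf - "${2#/}"
    docker rm "$cid" >/dev/null
}

# Usage, with the image built above:
#   read_file_from_image secret:test /super.secret
```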

So what about environment variables? Surely those are more secure?

Test #2: Injecting secrets in environment variables

Let’s go back and modify our Dockerfile to add an environment variable called SECRET.

FROM alpine:latest
# Blank secret that we'll overwrite at run time.
ENV SECRET=""
CMD /bin/sh

And now we can build this new image.

root@ip-XXXXXXXXXXX:~/docker# docker build -t secret:testenv .
Sending build context to Docker daemon  3.072kB
Step 1/3 : FROM alpine:latest
---> 3f53bb00af94
Step 2/3 : ENV SECRET=""
---> Running in a1872c8078c3
Removing intermediate container a1872c8078c3
---> 178c24f06c44
Step 3/3 : CMD /bin/sh
---> Running in bccf052509b7
Removing intermediate container bccf052509b7
---> cca1f3bd8248
Successfully built cca1f3bd8248
Successfully tagged secret:testenv
root@ip-XXXXXXXXXXX:~/docker#

So, now all we’ve done is shifted the vulnerability from the filesystem to the process environment. While this change eliminates secrets being disseminated via an image repository, this method still doesn’t prevent other containers or users from accessing the secrets in the running container. Let’s explore how.

First, we need to run our new container. Since it defaults to running /bin/sh and wants user input, we can override the command with sleep so that the container runs in the background.

root@ip-XXXXXXXXXXX:~/docker# docker run -d -e SECRET="this is a secret" secret:testenv sleep 300

This will make the container run for five minutes before exiting.

Now, since the container is just another process on the host, we can find the sleep process that’s running in it with ps:

root@ip-XXXXXXXXXXX:~/docker# clear
root@ip-XXXXXXXXXXX:~/docker# ps aux | grep sleep
root     12184  0.0  0.0   1516     4 ?        Ss   17:06   0:00 sleep 300
root     12342  0.0  0.0  12944   940 pts/1    S+   17:06   0:00 grep --color=auto sleep
root@ip-XXXXXXXXXXX:~/docker#

Taking note of the PID (12184), we can then inspect the process environment by navigating to /proc/12184 and looking at the environ file:

root@ip-XXXXXXXXXXX:~/docker# cat /proc/12184/environ
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=6ddf6d9e588d SECRET=this is a secret HOME=/root
root@ip-XXXXXXXXXXX:~/docker#

While slightly more secure (and definitely more obfuscated) than the previous method, it’s still not as secure as we would like it to be.
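The effect is not specific to Docker and can be reproduced with any process on a Linux host. In this sketch `dump_environ` is a hypothetical helper; it relies only on the fact that /proc/<pid>/environ stores its entries separated by NUL bytes, which is why plain cat output tends to run them together.

```shell
# dump_environ PID
# Hypothetical helper: prints a process's initial environment, one variable per line.
dump_environ() {
    # Entries in /proc/<pid>/environ are NUL-separated; turn NULs into newlines.
    tr '\0' '\n' < "/proc/$1/environ"
}

# Usage: start a process with a secret in its environment, then read it back
# as the same user or as root:
#   SECRET="this is a secret" sleep 300 &
#   dump_environ $! | grep '^SECRET='
```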

Secure Containers Require Secure Applications

Images and containers should never be treated as inherently secure -- safe and acceptable use of containers in a secure environment comes from properly implementing security at every level of the application stack. It follows, then, that security should be implemented at the application level. To do this properly, we need to consider some common pitfalls.

Pitfall #1: Hard-coding secrets into applications

Especially in a project with a fast release cycle, the temptation to just put all your secrets into your source code can be great, particularly when you can just rotate secrets whenever a new release is deployed. However, this presents some problems. Chief among them is the fact that you have to put those secrets in the code.

Wherever that code goes, so do your secrets. Assuming that everyone uses some sort of version control system (VCS) like GitHub, Bitbucket, or GitLab, then your secrets will be in that VCS for the life of the project. Even worse, those secrets will continue to persist in the commit history of that project. Even if you rotate your secrets regularly, the commit history provides anyone willing to do the work with a pattern of how those secrets are generated, and once they understand the pattern they can try to predict what the next one will be.

Putting secrets into a VCS also means that their horizon (how far the secrets can travel from those who need to know them) is rather large, since it is possible for everyone in an organization to have read-only access to that organization’s VCS.
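To illustrate how long a committed secret lingers, Git's pickaxe search (git log -S) finds every commit that added or removed a given string, even when no file in the working tree contains it any longer. A minimal sketch, assuming Git is installed; `find_secret_in_history` is a made-up helper name and the search string is a placeholder.

```shell
# find_secret_in_history NEEDLE [REPO_DIR]
# Hypothetical helper: lists commits on any branch whose diff adds or removes NEEDLE.
find_secret_in_history() {
    git -C "${2:-.}" log --all --oneline -S"$1"
}

# Usage (the search string and path are placeholders):
#   find_secret_in_history 'hunter2' /path/to/repo
```

Anyone with read access to the repository can run the same search, which is why removing a secret in a later commit does not un-leak it.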

Pitfall #2: Using environment variables for secrets

As we’ve already seen, you should never put secrets into environment variables. Enough said.

Pitfall #3: Not using unique secrets for each application and environment

One of the problems with having multiple accounts, applications, APIs, networks, and other systems is keeping track of the secrets they ultimately require. Accounts have passwords, applications have encryption keys and certificates, APIs have API keys, networks have packet flags or MAC addresses, the list goes on -- a list that now needs to be kept track of.

There is a very real temptation to simply use the same secret for everything so you only have to remember or keep track of one thing instead of several. This is not just a problem in the consumer space, where people keep using the same password over and over, sharing their one unique secret between their banking, shopping, tax returns, DMV portals, and so on. It is also a problem for developers and enterprise IT. Reusing secrets like certificates or API keys across applications creates a chain of vulnerability: if one application’s secrets are compromised and another application shares them, then the second application is compromised too. The repeat use of secrets also increases their lifespan, and time is the enemy of secrecy. The longer a secret is used, the more likely it is to be compromised.

Pitfall #4: Storing secrets in insecure places

This ties in with pitfall #1 in that you should never store your secrets in places that are not secure themselves. This includes shared storage, unencrypted files, version control systems, e-mail, chat applications, development planning apps, text messages, carrier pigeon, 12th century parchment, databases, and hand-written notebooks, just to name a few. As we’ve seen recently, one common way of accidentally exposing data is user error on Amazon’s S3 service, though through no fault of Amazon’s. Using S3 as a secure repository is a bad idea, as companies like Booz Allen Hamilton found out recently when they accidentally leaked geospatial intelligence imagery via a very poorly configured S3 bucket, as did Verizon and Accenture with their own mishaps. While pitfall #4 is arguably the least severe of the four, there are ways to secure secrets properly that work with applications of all types.

Solution

The solution here is to implement some sort of secrets management, either through a cloud provider (like AWS’s Secrets Manager and KMS), a custom application, or a third-party application like HashiCorp Vault, CyberArk, Salt, or similar. The end result should be a container infrastructure that has no concept of secrets, applications that dynamically acquire secrets on an as-needed basis, and secrets that are managed and rotated separately from the rest of the infrastructure.
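As a rough illustration of acquiring a secret on an as-needed basis, the sketch below asks AWS Secrets Manager for a value at start-up rather than reading it from the image or the container's environment. It assumes the AWS CLI is installed and that the container's role is allowed to read the secret; `fetch_secret` and the secret name `myapp/prod/db-password` are hypothetical.

```shell
# fetch_secret SECRET_ID
# Hypothetical helper: prints the current value of a secret from AWS Secrets Manager.
fetch_secret() {
    aws secretsmanager get-secret-value \
        --secret-id "$1" \
        --query SecretString \
        --output text
}

# Usage: keep the value in an unexported shell variable so that child processes
# do not inherit it through their environment (the secret name is a placeholder).
#   db_password=$(fetch_secret myapp/prod/db-password)
```

Handing the value to the application over stdin or a pipe, rather than as a command-line argument, also keeps it out of ps output.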

It should be noted that applications that strictly adhere to the idea of the “12 Factor App” will not be secure if they use environment variables for all of their config. While the ideas proposed by the 12 Factor App are good starting points (Capital One’s own Jimmy Ray has an excellent write-up on the 12 Factor App and microservices), developers should never adhere to a set of rules without critically evaluating those rules, and this is one of those cases. Perhaps there should be a thirteenth factor added that talks about secrets management?

An Ecosphere of Trust

Real application security should be handled at the application level, not the infrastructure level. When operating an application in an environment where secrets are required, developers should make an effort to leverage secrets management systems to ensure that those secrets don’t proliferate beyond their intended horizon.

We as developers and engineers should endeavor to write applications that we would want to use ourselves. Our role requires that we exist in an ecosphere of trust as we coexist with other developers and their work. We write software that our peers will use, that will manage our peers’ data, and in some cases, have real power over people’s lives. Part of that ecosphere of trust is the understanding that the software we write will be secure and won’t unnecessarily expose data. If those applications are running in containers, we have a duty to make sure those applications are as secure as possible.

But here’s the good news: it’s not that hard! Unlike many things in the DevOps/SRE world, securing containers is pretty simple once you understand how they work. I hope this blog post helped demystify some of that for you.

Happy coding!

DISCLOSURE STATEMENT: These opinions are those of the author. Unless noted otherwise in this post, Capital One is not affiliated with, nor is it endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are the ownership of their respective owners. This article is © 2019 Capital One.