120-days: A Docker Security Journey

Ashish Desai
CloudX at Fidelity
Aug 19, 2020


Fidelity Investments instituted a 120-day image rebuild policy with a label and lifecycle management model for Docker images and containers. This is our journey to improve container security at the firm.

Background

The financial industry has spent years improving IT security by instituting a regular patching process to keep systems secure. With the advent of Docker images and containers, however, we face an interesting challenge: there is no concept of "patching" a Docker image. To "patch" one, you must build a new image containing updated versions of the underlying software. Most teams only build a new Docker image when their business-line application needs new features, yet images bundle a lot of third-party software in which security vulnerabilities are discovered over time. So how do we address the overall security of a Docker container (Figure 1) if teams do not build new images and deploy them as running containers often?

Figure 1 Building block of a running Docker container

Research

We spent a few months learning how our teams build and deploy applications using Docker. Across the company, we discovered processes that varied from sophisticated Jenkins/Concourse CI/CD pipelines to building images by hand, without automation or consistency. Our Docker runtime workloads ranged from thousands of on-premises servers to cloud environments in AWS and Azure using managed container runtime services such as AWS ECS or AWS EKS/Kubernetes. This is not a surprise; financial firms focus on delivering business value to customers using the latest technology, not on running a strong software engineering company.

To secure our container workloads for the long term, we realized that we first had to address the basic building block of software engineering: getting associates to use automation. As a large company with over 50K associates and 10+ large business units, we have a variety of micro-cultures and skill sets to educate and upskill. After a lot of negotiation among key players, we landed on a simple message:

Rebuild Docker images every 120 days

This simple message nudges teams to adopt newer base OS images and third-party packages, improving our security posture with the added benefit of moving the entire company toward consistent automation. Because a running container points to a Docker image that ages over time, the policy also forces teams to kill old running containers, which indirectly improves our disaster recovery process and site reliability.

120, 240, 360: what is the math?

Figure 2 Packaging an application for deployment

In the spirit of being agile, we allow associates to use a variety of base images that come packaged with software, facilitating the creation of tech stack images (JRE, Tomcat) which in turn are used to create application images (Figure 2). Given that an image from any layer can be used to start a running container, we decided to avoid math gymnastics over what 120 days means in this context. We simplified our policy to say that a Docker image can only be used for 120 days from its creation date:

docker images --format '{{.CreatedAt}}\t{{.Repository}}\t{{.Tag}}\t{{.ID}}'

This simplified our reporting and gave every associate a clear way to determine their compliance. Even when teams inherit images from two or three levels up, each level refreshes on a 120-day cadence, ensuring a decent frequency of updates down the chain.
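As a sketch of how a team might self-check locally, the creation date above can be compared against the 120-day limit. This is an illustrative script, not our actual reporting tooling; it assumes GNU `date` for parsing the `-d` timestamp.

```shell
#!/bin/sh
# Illustrative compliance self-check: flag local images older than 120 days.
# Assumes GNU date; not our production reporting pipeline.
LIMIT_DAYS=120
now=$(date +%s)

docker images --format '{{.CreatedAt}}\t{{.Repository}}:{{.Tag}}' |
while IFS="$(printf '\t')" read -r created image; do
  # CreatedAt looks like "2020-08-19 10:15:30 -0400 EDT"; drop the trailing
  # zone name, which GNU date cannot parse reliably.
  created_s=$(date -d "${created% *}" +%s)
  age_days=$(( (now - created_s) / 86400 ))
  [ "$age_days" -gt "$LIMIT_DAYS" ] && echo "NON-COMPLIANT: $image ($age_days days old)"
done
```

The same arithmetic works for container-level checks by swapping in `docker ps` with a `{{.CreatedAt}}` format.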

Reporting

Figure 3 Lifecycle of a Docker image

We gather information about images and running containers from servers, both on-premises and in the cloud, and cross-reference them against the images stored in our central Artifactory repository. We use this to notify teams monthly of non-compliance. Because Docker tends to "cache" images on a machine, teams that had already refreshed to a new image were still being flagged as non-compliant. So we started an educational campaign on how to remove old images from a host, or to institute an image lifecycle policy on managed registries such as AWS Elastic Container Registry (ECR).

docker system prune -f

OR

# You can also consider selectively removing images
# Remove images older than 120 days (120 days * 24 hrs/day = 2880 h)
docker image prune -a --filter "until=2880h" --force
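For ECR specifically, the same 120-day expiry can be expressed as a registry lifecycle policy so old images are cleaned up server-side. A sketch, where the repository name is a placeholder and the JSON follows ECR's lifecycle-policy schema:

```shell
# Sketch: expire ECR images 120 days after push.
# "my-app-repo" is a placeholder repository name, not an actual repo.
cat > lifecycle-policy.json <<'EOF'
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Expire images pushed more than 120 days ago",
      "selection": {
        "tagStatus": "any",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 120
      },
      "action": { "type": "expire" }
    }
  ]
}
EOF

aws ecr put-lifecycle-policy \
  --repository-name my-app-repo \
  --lifecycle-policy-text file://lifecycle-policy.json
```

Unlike host-side pruning, the registry then enforces the policy continuously without any per-machine cron jobs.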

Dockerfile labels

To improve traceability across the firm, we realized we needed to use the LABEL feature (LABEL key=value) within each Dockerfile. This information can easily be retrieved via introspection of an image or container. We discovered that a label value defined in a parent image gets overwritten when a child image reuses the same key. To avoid such collisions, we instituted a naming scheme in which each image creates label keys within its own namespace. The format we follow is:

LABEL <prefix>.<imagename>.applicationid=<value>

LABEL <prefix>.<imagename>.createdby=<value>

LABEL <prefix>.<imagename>.scmurl=<value>

where <prefix> is a business-unit-specific prefix and <imagename> is a symbolic name standing for the Dockerfile/image. "scmurl" points to the Git repository where the Dockerfile resides, giving us the ability to trace any container back to the Dockerfile that created it.
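Putting the scheme together, a Dockerfile for a hypothetical tech-stack image might carry labels like these (the prefix "fmr", the image name "tomcat9", and all values are made-up examples, not our actual conventions):

```dockerfile
# Hypothetical tech-stack image; prefix, name, and values are illustrative.
FROM registry.example.com/base-os:latest

LABEL fmr.tomcat9.applicationid="AP012345"
LABEL fmr.tomcat9.createdby="jenkins-pipeline"
LABEL fmr.tomcat9.scmurl="https://git.example.com/middleware/tomcat9-image"
```

The labels can then be read back off any image or running container, for example with docker inspect --format '{{ index .Config.Labels "fmr.tomcat9.scmurl" }}' tomcat9:latest.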

Project Learnings

This journey required a lot of collaboration across the company. We had to assess lifecycle workflows and ensure we could address the edge cases one might experience in production. We discovered that "assumptions" made by security and application lifecycle management (ALM) teams did not hold with the development community, largely due to a lack of training and searchable documentation. We hope this journey sets the stage for widely adopting containers to serve our 20+ million customers in a more secure manner. Come join us, we are hiring at https://jobs.fidelity.com/ #fidelityassociate


Ashish Desai
CloudX at Fidelity

Cloud Security Architect with over 20 years of experience in the Financial industry focusing on cloud, Docker and systems engineering practices.