Adding Image Security Scanning to a CI/CD pipeline

Published in alter way · 8 min read · Oct 29, 2020

Using GitlabCI and Trivy

Introduction

Image security scanning is becoming more and more popular nowadays. The idea is to analyze a Docker Image and look for vulnerabilities based on CVE databases. This way, we can know before using an image what vulnerabilities it contains, and therefore we can use only “secure” images in production.

There are different ways to analyze a Docker Image (depending on the tool you are using). A security scan can be performed from a CLI, or it can be integrated directly into a Container Registry, or even better (in my opinion), you can integrate the security scan in a CI/CD pipeline. The last approach is pretty cool because it allows us to automate the process and continuously analyze images that we build, fitting the DevOps philosophy.

Here is a quick illustration :

Simple pipeline with security scanning included

So today I am going to show you how to set up an image security scan integrated into a CI/CD pipeline.

Tools

There are multiple tools to perform an Image security scan :

Trivy : Developed by AquaSecurity.
Anchore : Developed by Anchore Inc.
Clair : Developed by Quay.
Docker Trusted Registry : If you use Docker Enterprise and in particular the Docker Trusted Registry, you can use an out-of-the-box security scanner that is directly integrated in the registry.
Azure/AWS/GCP : If you use one of these cloud providers, you can easily set up a security scan. Actually, you don’t need to set up anything, you just need your … credit card. :)

Of course, there are many more open source or proprietary tools to achieve that goal. For this tutorial, I am going to use Trivy on a GitlabCI pipeline.

Quick Trivy overview

Trivy is an easy-to-use and yet accurate image security scanner. The installation is pretty simple :

$ curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/master/contrib/install.sh | sudo sh -s -- -b /usr/local/bin
$ trivy --version

And its usage as well :

$ trivy image nginx:alpine

Which gives us an output like that :

As simple as that.

For more information : Trivy’s Github
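If you only care about the most serious findings, the same command can be narrowed with the “severity” option (we will use it again later in the pipeline). Here is a minimal sketch of a small shell wrapper; scan_image and TRIVY_BIN are names I made up for illustration, they are not part of Trivy :

```shell
# Hypothetical wrapper around "trivy image".
# scan_image and TRIVY_BIN are names invented for this sketch.
TRIVY_BIN="${TRIVY_BIN:-trivy}"

scan_image() {
  # $1: image reference (for example nginx:alpine)
  # $2: optional comma-separated severities (defaults to all of them)
  image="$1"
  severities="${2:-UNKNOWN,LOW,MEDIUM,HIGH,CRITICAL}"
  "$TRIVY_BIN" image --severity "$severities" "$image"
}
```

For example, scan_image nginx:alpine HIGH,CRITICAL runs trivy image --severity HIGH,CRITICAL nginx:alpine.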

Adding a simple Docker image

To illustrate the inclusion of security scanning in a CI/CD pipeline, we need a Docker Image as an example. I am going to use that simple Dockerfile :

FROM debian:buster
RUN apt-get update && apt-get install nginx -y

This Dockerfile is pretty simple. It starts from the official debian buster image and adds the installation of nginx.

We are going to build this image later in our CI/CD pipeline, but we can build it as follows :

$ docker build -t security_scan_example:latest .

For now, we just need to create a Gitlab project and push our Dockerfile into that project.

Creating a simple CI/CD pipeline

Now that we have created a Dockerfile for our example image, we can create a CI/CD pipeline to build the image and scan it with Trivy.

Without any surprise, since we are using Gitlab, we are going to use GitlabCI for our CI/CD pipeline. First, let’s add the build part :

build:
  stage: build
  image: docker:stable
  services:
    - docker:dind
  tags:
    - docker
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:
    - docker build -t $CI_REGISTRY_IMAGE:latest  .
    - docker push $CI_REGISTRY_IMAGE:latest

This job runs on a container based on the docker:stable image. It builds our project’s image based on the Dockerfile we pushed earlier, and then pushes the image into the Gitlab Container Registry.

Now let’s add the interesting part :

security_scan:
  stage: test
  image: 
    name: aquasec/trivy:latest
    entrypoint: [""]
  services:
    - docker:dind
  tags:
    - docker
  script:
    - trivy image --no-progress --output scanning-report.txt $CI_REGISTRY_IMAGE:latest
  artifacts:
    reports:
      container_scanning: scanning-report.txt

This job is our security scanning job. This time it runs on a container based on the official Trivy image. It scans our image with the trivy command and writes the report to a file called “scanning-report.txt”.

Great ! Let’s have a look at our GitlabCI pipeline which should run automatically after our push. We can see that both of our jobs are run (successfully) :

Let’s have a look at the security scan job :

Where is the report ?

As you can see in the scanning job’s result, we have multiple vulnerabilities, and more precisely 114 “Low”, 8 “Medium”, 24 “High” and 1 “Critical” vulnerabilities.

We would like to have more details about these vulnerabilities. By default, Trivy prints the report in stdout. In this example, we told trivy to output the report in a file and created a job artifact from that file. Therefore, the report is downloadable as follows :

Once downloaded, we can have a look at our report for more details :

We can see that we have more information about the vulnerabilities found by our scanner like the library/binary affected, the CVE ID, the severity, the possible fixes etc.

What now ?

Okay, so now that we have integrated our image scanning into our CI/CD pipeline, the question is what to do with this information ?

Currently, the security scanning job never fails, since the trivy command returns 0 by default. We can improve that by making the job fail if the image is “insecure” and succeed otherwise.

The question is, when to fail ? Obviously, we can’t simply say “fail whenever you find a single vulnerability”, because our images are most likely going to have at least some vulnerabilities. The answer is hard to tell, because it depends on the security level you want to achieve. In general, we want to avoid Critical vulnerabilities as much as possible. The answer also depends on the vulnerabilities you get. Can you ignore some of them ? It depends. That’s why working continuously with security teams helps a lot to benefit from these scans.
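When the security team agrees that a finding can be accepted, Trivy can skip it through a “.trivyignore” file that lists one vulnerability ID per line. A minimal sketch, assuming the file sits in the directory where trivy is run; the ignore_cve helper and the CVE-0000-0000 ID are placeholders, not real entries :

```shell
# Hypothetical helper: record an accepted vulnerability ID in .trivyignore.
ignore_cve() {
  # $1: vulnerability ID, $2: optional path to the ignore file
  id="$1"
  file="${2:-.trivyignore}"
  touch "$file"
  # Append the ID only if that exact line is not already present
  grep -qxF -- "$id" "$file" || printf '%s\n' "$id" >> "$file"
}
```

Calling ignore_cve CVE-0000-0000 twice leaves a single line in the file, so the helper can be rerun safely. Keeping that file in the Git project means every accepted risk goes through a review.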

For this example, we will make our CI/CD pipeline fail if we have a single Critical vulnerability, and succeed otherwise.

Fortunately, trivy allows us to look only for vulnerabilities of a certain severity with the “severity” option. We can also control the exit code with the “exit-code” option, telling trivy to return 1 if it finds at least one matching vulnerability and 0 otherwise.
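To get a feeling for how these two options behave together before touching the pipeline, here is a rough sketch that can be tried locally. The gate_on_critical function and the TRIVY_BIN variable are hypothetical names for this example; only the trivy options themselves come from the tool :

```shell
# Hypothetical local gate: pass or fail depending on trivy's exit code.
TRIVY_BIN="${TRIVY_BIN:-trivy}"

gate_on_critical() {
  # $1: image reference
  # With --exit-code 1, trivy exits with 1 when it reports at least one
  # vulnerability of the requested severity (errors are non-zero as well).
  if "$TRIVY_BIN" image --exit-code 1 --no-progress --severity CRITICAL "$1" >/dev/null; then
    echo "PASS: no CRITICAL vulnerability reported for $1"
  else
    echo "FAIL: CRITICAL vulnerability reported (or scan error) for $1"
    return 1
  fi
}
```

A CI job behaves the same way: GitlabCI marks the job as failed as soon as one script line exits with a non-zero code.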

So we are going to change our scanning job to fail if it finds one or more “Critical” vulnerabilities, like this :

script:
  - trivy image --no-progress --output scanning-report.json $CI_REGISTRY_IMAGE:latest
  - trivy image --exit-code 1 --no-progress --severity CRITICAL $CI_REGISTRY_IMAGE:latest

So when our job is executed, we still have our full report available for download, but this time, the CI/CD job will succeed or fail depending on whether trivy finds critical vulnerabilities or not :

One last step …

Okay our CI/CD pipeline looks great ! We need to handle one last thing …

Currently, our images are analyzed ONLY when they are built/pushed. This is cool but insufficient. Indeed, the CVE databases used by our scanning tool evolve every day with new vulnerabilities. An image that is “secure” today might (and most likely will) be insecure tomorrow. So we need to keep scanning our image after its first push.

Okay then, let’s add a scheduled pipeline that scans our image, let’s say, every night at 2AM. We need to go in CI/CD -> Schedules -> New Schedule :

Scheduled pipeline that performs a security scan nightly

Note: We define a variable called SCHEDULED_PIPELINE with the “security_scan” value. We will see later the purpose of this variable.

Doing so, our pipeline is going to be fully executed, including the build part, which is not really what we want. Thus, we are going to modify our gitlabCI file to make the scheduled pipeline execute only the scanning job.

We will add an extra scanning job that contains the exact same definition as the previous one, with an extra “only” option that makes it run only if the variable SCHEDULED_PIPELINE (that we previously defined in our scheduled pipeline) is equal to “security_scan”. To avoid code redundancy, we are going to use job templates.

Therefore, our final gitlabCI file looks like this :

.scanning-template: &scanning-template
  stage: test
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  services:
    - docker:dind
  tags:
    - docker
  script:
    - trivy image --no-progress --output scanning-report.json $CI_REGISTRY_IMAGE:latest
    - trivy image --exit-code 1 --no-progress --severity CRITICAL $CI_REGISTRY_IMAGE:latest
  artifacts:
    reports:
      container_scanning: scanning-report.json

build:
  stage: build
  image: docker:stable
  services:
    - docker:dind
  tags:
    - docker
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:
    - docker build -t $CI_REGISTRY_IMAGE:latest .
    - docker push $CI_REGISTRY_IMAGE:latest
  except:
    variables:
      - $SCHEDULED_PIPELINE

security_scan:
  <<: *scanning-template
  except:
    variables:
      - $SCHEDULED_PIPELINE

security_scan:on-schedule:
  <<: *scanning-template
  only:
    variables:
      - $SCHEDULED_PIPELINE == "security_scan"

This way, our standard pipeline (build + scan) is going to be executed normally when we push some code, and our scheduled pipeline is going to execute our security scan job everyday at 2AM.
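The only/except rules can be hard to read at first, so here is a tiny shell model of the job selection. It is only my reading of the file above, not something GitlabCI runs, and jobs_for_pipeline is a made-up name :

```shell
# Hypothetical model of which jobs are selected, depending on the value
# of SCHEDULED_PIPELINE (empty for a pipeline triggered by a push).
jobs_for_pipeline() {
  # $1: value of SCHEDULED_PIPELINE
  if [ -z "$1" ]; then
    # "except: variables: - $SCHEDULED_PIPELINE" does not match an
    # empty or undefined variable, so both standard jobs run.
    echo "build security_scan"
  elif [ "$1" = "security_scan" ]; then
    # Only the job guarded by the "only" rule matches.
    echo "security_scan:on-schedule"
  fi
  # Any other non-empty value selects no job at all.
}
```

So jobs_for_pipeline "" prints the two standard jobs, while jobs_for_pipeline security_scan prints only the scheduled one.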

How do we fix those vulnerabilities ?

In general by upgrading your image. In our case, we might upgrade the base image (or maybe use another one like Alpine) or upgrade the nginx that we installed.

Another answer could be removing unnecessary stuff from your image, which is a good practice when you build Docker images anyway. The security scan might help you detect components that you are not actually using.

In our case, let’s change the base image and use Alpine instead :

FROM alpine:3.12
RUN apk update && apk add nginx

This time, our pipeline succeeds … :

… Without a single vulnerability.

Conclusion

So we have seen how to integrate a security scanning job into a GitlabCI pipeline, which is pretty straightforward (at least with Trivy). Of course, in my example I did everything in a single master branch. In the real world, we would work on multi-branch projects, which would need some adaptations.
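One possible adaptation, given only as a sketch, is to tag the image per branch instead of always using “latest”, so that each branch builds and scans its own image. CI_COMMIT_REF_SLUG is a predefined GitlabCI variable; the image_ref helper is a hypothetical name, and whether this tagging scheme suits your project is an assumption on my side :

```shell
# Hypothetical helper: build the image reference for the current branch.
image_ref() {
  # Falls back to "latest" when CI_COMMIT_REF_SLUG is not set,
  # for example when the helper is run outside of GitlabCI.
  printf '%s:%s\n' "$CI_REGISTRY_IMAGE" "${CI_COMMIT_REF_SLUG:-latest}"
}
```

The build and scan jobs could then use "$(image_ref)" wherever they currently use $CI_REGISTRY_IMAGE:latest.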

You can find all the resources that I used to make this CI/CD pipeline in my personal gitlab repository.