Discovery of Known Vulnerabilities and Inventories for Modern Applications

Timo Pagel
SDA SE Open Industry Solutions
Feb 24, 2022

Introduction

Nowadays, plenty of software projects use a significant number of third-party components as their base. Third-party components help teams develop complex software efficiently without reinventing the wheel. Open source software is transparent, including the components used within it, and easy to adopt; self-developed software is therefore often composed of up to 80% open source software.

With these basics in mind, we need to be aware of the various risks that are introduced by using third-party software (i.e. code outside of our control) together with self-developed source code. Here is an overview of these risks, which are commonly referred to as “supply chain risks”:

  • Known (and previously unknown) vulnerabilities
  • Malicious code in a dependency
  • Breached dependency publisher
  • Security misconfigurations

In this article, we will focus on the first point, known vulnerabilities, and we will limit our view to open source components in order to address the topic in depth. In case you are interested in the other software supply chain risks mentioned above, please leave a comment.

Throughout the discussion of this topic in this article, we will focus particularly on concepts and technologies of SDA SE, such as:

  • Open source solutions over commercial offerings to reduce vendor lock-in
  • Self-developed microservices
  • Services running in Kubernetes clusters

A short introduction to dependencies

A modern software package is usually published using a versioning process in which a unique version name is assigned to it. When you build your app, the build system compiles your code together with all of its dependencies. A dependency in this case is another package that your package needs in order to work. The dependency versions are specified in a package manager manifest file like package.json, and such dependencies are automatically pulled in during the build process. Please note that at SDA SE, we only use package managers to include third-party software in our services.

There are two types of dependencies: direct and transitive. A direct dependency is a package whose functionality is used in your code directly, while transitive dependencies are second-level dependencies pulled in by those direct dependencies. The overall structure results in a dependency tree that displays all the dependencies of a given project, with your app as the root of the tree.
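
To illustrate, here is a hypothetical dependency tree for a Node.js app, in the style of the output of npm ls (package names and versions are made up for this example):

```text
my-app@1.0.0                 <- root: your application
├── express@4.18.2           <- direct dependency
│   ├── body-parser@1.20.1   <- transitive dependency
│   └── cookie@0.5.0         <- transitive dependency
└── lodash@4.17.21           <- direct dependency
```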

An artefact, on the other hand, is a result produced during the software development process. It may contain the project source code, dependencies, binaries or resources.

Vulnerable and outdated dependencies

Because vulnerable third-party components nearly always pose a significant risk (not least because they usually affect multiple products), OWASP has placed vulnerable and outdated dependencies within its top ten security risks; within the last four years the category has even risen from the #9 to the #6 risk.

Additionally, according to the WhiteSource Annual Report 2021, vulnerable and outdated dependencies in applications are increasing, having reached 9,658 documented cases in 2020, up from only 1,213 in 2015.

Open Source Vulnerabilities per Year from WhiteSource Annual Report 2021

Open source software is transparent, including the components used in it. This transparency allows researchers to better identify vulnerabilities in open source software and enables proper response processes.

When the critical log4shell vulnerability was discovered in December 2021, the first reaction of most organisations was to ask the following questions:

  • What components are running in our environments?
    For example, self-developed software or third-party components such as infrastructure components.
  • Which components are inside the components running in our environments?
    For example, operating system files or application dependencies
  • Are the components vulnerable?

The above mentioned questions could also be asked in reverse order depending on the individual research process.

Companies can treat such vulnerabilities with different methods: avoidance (e.g. removing the vulnerable component), mitigation (e.g. patching the component), or acceptance. Especially in the latter case, the choice can significantly increase the overall risk value of the affected software.

When the decision is made to patch a dependency, the corresponding patch process needs to be applied.

Application Patch Management

Application Patch Management consists of patching a library and deploying it. Patching includes testing, an automated process that is performed after the patch pull request is created. Tools like Dependabot or Renovate scan the repository for new dependency versions. Once a new version is detected, a pull request is created containing the updated version of the dependencies. After merging the pull request into the master branch, a new release version is created. Please refer to this example for more details about a pull request patch.

Afterwards, this new release needs to be deployed to the (production) environments. The component is also published to the component registry to be used by others, e.g. as a software library or base container image.

Organisations often develop internal libraries, which can be more trusted than other third-party components, as the code is within the organisation’s quality control. The SDA SE open source library is available in the sda-dropwizard-commons repository. Newly released versions can be merged automatically if all tests pass in one of our services.

However, if the build or deployment process is broken, this automation won’t help. A broken deployment process might be caused by a broken build due to network unavailability or broken package dependencies. If the product owner is overloaded or unavailable and unable to approve the changes in the CI/CD pipeline, the patch won’t get released.

Identifying Vulnerabilities in Complex Environments

The variety of artefacts makes it difficult to identify which artefacts are used. The following figure shows the components in a container.

Components in a container image

One container image can be built on 0 to n base images. The base images might be aggregated into one layer inside the final container. Therefore, base images will not be considered separately in this article.

In addition to these pure software dependencies, organisations dealing with hardware have to consider devices and firmware as well.

Moreover, the overall build and deployment process is rather complex. One self-developed service can result in multiple component repositories, and one built component can be deployed to different environments. A generic view of component build and deployment is shown in the following figure.

Relations between artefact build steps

When using third-party components, the build process is performed by another organisation or individual. For example, the application dependencies or infrastructure images like Kubernetes and database images are usually not built by the SDA SE.

In this article, we use the term environment for applications installed on the target system (e.g., a cluster, smartphone apps or traditional desktop applications).

To address the question of which components are running in our environments, the Kubernetes APIs can be queried manually to identify all images. Afterwards, the inside of each component is checked to verify whether it contains a vulnerability. An analysis can be performed to determine whether a vulnerable function is actually used; if the function with the vulnerability is never called, the finding is considered a false positive. Often, however, this analysis is more expensive than just patching.
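
A minimal sketch of this first step, assuming the pod specifications have already been fetched from the Kubernetes API (e.g. via kubectl get pods -A -o json). The function name and the stubbed API response below are illustrative only:

```python
def extract_images(pod_list: dict) -> set:
    """Collect the unique container image references from a
    Kubernetes pod list (the JSON structure returned by the API)."""
    images = set()
    for pod in pod_list.get("items", []):
        spec = pod.get("spec", {})
        # Regular containers as well as init containers pull images.
        for container in spec.get("containers", []) + spec.get("initContainers", []):
            images.add(container["image"])
    return images

# Example with a stubbed API response:
pods = {"items": [
    {"spec": {"containers": [{"image": "nginx:1.21"}],
              "initContainers": [{"image": "busybox:1.35"}]}},
    {"spec": {"containers": [{"image": "nginx:1.21"}]}},
]}
print(sorted(extract_images(pods)))  # ['busybox:1.35', 'nginx:1.21']
```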

This approach is suitable for organisations with low maturity. It might require a package manager’s lockfile within the images, or tools to analyse the dependencies between the images. While dependency ranges are specified in the package manager manifest, the package manager’s lockfile (e.g. package-lock.json) declares pinned dependencies (including the exact version used).
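
As a hypothetical npm example, a manifest can declare a caret range like the one below; the generated package-lock.json would then pin the exact version that was resolved at install time (e.g. 4.18.2):

```json
{
  "name": "my-app",
  "version": "1.0.0",
  "dependencies": {
    "express": "^4.18.0"
  }
}
```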

If we instead start at the bottom, the organisation uses the version control repository (or repositories) to identify the components present in the application’s package manager manifest. Afterwards, the environments in which the application is running need to be determined.

Most tools that determine the components’ vulnerabilities cannot answer where the component is used.

Therefore, a better solution would be to use an SBOM, which is described in the following chapter.

Software Bill of Materials

At organisations with higher maturity, pre-filled artefact inventories can be used. A generic definition of a Software Bill of Materials (SBOM) is any metadata associated with a build, such as its dependencies.

Moreover, third-party images can be analysed to create an SBOM and publish it to an artefact inventory. The inventory should be created before a vulnerability is published.

Software buyers should require an SBOM from their software suppliers. This will help buyers to be aware of known vulnerabilities in the software once they are discovered and react quickly if they are affected.

Managed services especially should provide an SBOM so that service users can scan for vulnerabilities. If no SBOM is provided, service users should watch the service provider’s news feeds to stay informed about vulnerabilities in the service.

But there is also another point of view: since vulnerabilities need to be analysed by a developer familiar with the code to evaluate their validity and real exposure, Alex Gantman argues in his article that it is not helpful to provide the SBOM to software buyers, as they are not able to perform such an evaluation themselves.

We believe that software buyers can benefit from the SBOM. For example, they can quickly determine whether the software they bought contains a potential vulnerability like log4shell. Therefore, we are currently making the SBOM for each service available to our customers.

In addition, we offer our clients the option to pre-analyse vulnerabilities in our third-party components. Afterwards, we inform our clients with a timely report.

SBOM Generation Time

Tools can detect dependencies from package manager manifests, source code, and binaries. Generation tools can run at build time or post-build. Post-build means that an image already in use (e.g. in production) is analysed.

An SBOM is generated at build time with the help of plugins or package manager analysers (e.g. cdxgen). cdxgen analyses the software’s package manager lockfile.

In third-party components, package manager information is mostly not available, so the SBOM has to be generated post-build.

SBOM Format Standards

The standard Software Package Data Exchange (SPDX) or OWASP CycloneDX can be used to provide the bill of the used materials.

CycloneDX and SPDX for example are specified as follows:

CycloneDX and SPDX Specifications, based on CycloneDX and SPDX.

A more detailed specification of CycloneDX is available in the GitHub repository. A sample CycloneDX SBOM for the Dropwizard Java framework can be found in this repository.
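
For a first impression, here is a minimal CycloneDX SBOM in JSON format with a single component entry (the component shown is just an example):

```json
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "version": 1,
  "components": [
    {
      "type": "library",
      "group": "org.apache.logging.log4j",
      "name": "log4j-core",
      "version": "2.14.1",
      "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"
    }
  ]
}
```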

Tools that identify known vulnerabilities in software use their own report formats. However, the new CycloneDX Vulnerability Exploitability Exchange (VEX) might be adopted in the future. With a standard format, exchanging vulnerability information will probably become easier.

SBOM Generation Tools

In general there are two ways for a tool to generate the SBOM: either the very precise method at build time, using package manager information as well as code analysis, or post-build analysis, using mainly hashes of files or pattern matching to identify the included dependencies.

A sample language plugin for Java is the cyclonedx-gradle-plugin.

The following figure shows sample generation tools for the corresponding usage time.

Sample SBOM Generation Tools

SBOM generation tools like cdxgen analyse the package manager lockfile to generate the SBOM. cdxgen also scans the code for usages of dependencies and includes only the used libraries in the SBOM. The tool can be used at build time or post-build, depending on the availability of the package manager files (e.g. package.json and package-lock.json) within the image.

To perform a post-build analysis without package manager information, tools like anchore/syft can be used. However, a risk remains, especially with obfuscated or minified applications (e.g. JS frontend applications): a minified artefact is difficult for post-build tools to analyse, so in these cases the creation of a complete and correct SBOM might fail. Therefore, it is better to create the SBOM beforehand and centralise the SBOMs from all products in an artefact inventory.
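
Once the SBOMs from all products are collected centrally, even simple tooling can answer inventory questions. A sketch (standard library only, with a stubbed CycloneDX document; the helper function is hypothetical) that lists the components of an SBOM:

```python
def list_components(sbom: dict) -> list:
    """Return 'name@version' for every component in a CycloneDX SBOM."""
    return sorted(
        f'{c["name"]}@{c.get("version", "?")}'
        for c in sbom.get("components", [])
    )

# Stubbed CycloneDX document instead of reading a real file:
sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.4",
    "components": [
        {"type": "library", "name": "log4j-core", "version": "2.14.1"},
        {"type": "library", "name": "jackson-databind", "version": "2.13.1"},
    ],
}
print(list_components(sbom))  # ['jackson-databind@2.13.1', 'log4j-core@2.14.1']
```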

Artefact Inventory

Artefact inventories provide a continuously updated snapshot of third-party components. As artefacts can run in multiple environments, it is recommended to store the source code repository and the running environments for future use and to ensure environment traceability. Some artefact inventories allow tags or product labels to be applied, which in turn can be used to store the corresponding version control repository and target environments.

When a vulnerability is found in a dependency, the question “Which dependencies are currently in use?” has to be answered.

In case of an exploited vulnerability in an organisation, which is often discovered at an advanced stage, the question “Which dependencies were in use at a given point in time?” needs to be answered to analyse how the exploit happened and whether it can happen again in the same way.

A software inventory can help to answer these questions. Artefact inventory tools often provide a software composition analysis, which will be discussed in the next chapter.

Software Composition Analysis

In addition to an inventory, a Software Composition Analysis (SCA) can estimate the associated risks of third-party components. SCA solutions offer one or multiple functions such as management of business rules, compliance, software licensing, patch information, and known vulnerabilities for third-party components, which, in summary, allow the user to derive a risk value for a third-party component.

These Software Composition Analysis tools use one or multiple database sources to discover vulnerabilities for a given dependency; often the National Vulnerability Database (NVD), a common database for known vulnerabilities, is included. Usually, researchers find a vulnerability, submit it, and receive a Common Vulnerabilities and Exposures (CVE) number. There are also vendor-maintained vulnerability databases such as the GitHub Advisory Database.
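
Conceptually, the core of an SCA tool is a match between the component inventory and the advisories from such databases. A heavily simplified sketch (the advisory structure is made up, and real tools evaluate full version ranges rather than fixed version sets):

```python
def find_known_vulnerabilities(components, advisories):
    """Match (name, version) pairs against a list of advisories.
    Each advisory maps an affected package/version set to a CVE id."""
    findings = []
    for name, version in components:
        for adv in advisories:
            if adv["package"] == name and version in adv["affected_versions"]:
                findings.append((name, version, adv["cve"]))
    return findings

# Stubbed advisory data (CVE-2021-44228 is the real log4shell id):
advisories = [
    {"package": "log4j-core",
     "affected_versions": {"2.14.0", "2.14.1"},
     "cve": "CVE-2021-44228"},
]
components = [("log4j-core", "2.14.1"), ("slf4j-api", "1.7.32")]
print(find_known_vulnerabilities(components, advisories))
# [('log4j-core', '2.14.1', 'CVE-2021-44228')]
```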

In addition, context-specific databases and commercial databases exist. Context-specific databases are, for example, npm’s own security tracker or the debian security bug tracker.

The following list shows some of the more common open source Software Composition Analysis tools:

  • OWASP Dependency Check: A tool to identify known vulnerabilities for various languages.
  • OWASP Dependency Track: Uses a server-side component to store the software bill of materials (SBOM) and periodically analyses it. Supports various languages and uses multiple vulnerability databases.
  • depScan: A tool to identify known vulnerabilities in several languages. The suggest mode can be used to identify the best patch version keeping all existing vulnerabilities in mind.
  • npm audit: A native tool for the npm package manager to scan for known vulnerabilities from their private database.

Findings from tools like SCA should be analysed and responded to. A vulnerability management system can help with this.

Vulnerability Management

As already described in the introduction to this article, vulnerabilities can often be handled within the tool creating the corresponding vulnerability report. But in cases where multiple scanning tools are used, it is harder to address the findings across the different tools, and it is more complicated to get an organisation-wide overview. Using multiple tools with different methods and data sources increases the chance of identifying vulnerabilities and is therefore strongly recommended; at the same time, the complexity of unifying, consolidating and evaluating the different tool outputs increases significantly as well.

OWASP Dependency Track comes with a simple vulnerability management feature, which will not be considered further in this article. Instead, we will look at recommended functions of a Vulnerability Management System (VMS) in a DevOps world. Based on our experience, the following functions should be implemented in a mature VMS:

  • Option to upload various reports from different tools.
  • Option to group reports for one service during upload.
  • Option to accept a finding when the vulnerability is valid but the accountable team doesn’t have the time to fix it immediately. It is helpful to temporarily ignore the vulnerability (i.e. pause the alerts) so that alerts are generated again later.
  • Option to mark a finding as a false positive in case the vulnerability exists in the dependency but doesn’t have an impact on the product. In addition, dependency detection isn’t always accurate (for example, when pattern matching is used), which also results in false positives.
  • Option to automatically mark a finding as mitigated when a re-uploaded report (e.g. from the next day’s scan) no longer contains it because a patch has been applied.
  • An API to interact with the Vulnerability Management System from other tools.
  • In enterprise environments, fine-grained access control and authentication with an Identity Provider like AzureAD is needed.
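
The re-upload behaviour from the list above can be sketched as a simple set comparison between the previous and the current report (the finding identifiers below are made up for the example):

```python
def update_findings(previous: set, current: set):
    """Classify findings when a fresh scan report is uploaded
    over an existing one."""
    new = current - previous          # appeared for the first time
    mitigated = previous - current    # no longer reported -> patched
    still_open = previous & current   # reported again
    return new, still_open, mitigated

prev = {"CVE-2021-44228:log4j-core", "CVE-2020-36518:jackson-databind"}
curr = {"CVE-2020-36518:jackson-databind", "CVE-2022-22965:spring-beans"}
new, still_open, mitigated = update_findings(prev, curr)
print(sorted(mitigated))  # ['CVE-2021-44228:log4j-core']
```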

Based on these required VMS functions and on our tools evaluation, we consider OWASP DefectDojo and Purify to be great open source solutions.

Our Tools Choices

We evaluated the tools to generate SBOMs, to store the SBOM, to identify potential vulnerabilities in third-party components, and to respond to the potential vulnerabilities.

We are using OWASP DefectDojo, syft, and OWASP Dependency Track and developed our own solution to orchestrate them through the ClusterImageScanner.

OWASP Dependency Track

OWASP Dependency Track provides the information whether a new version of a component is available. In addition, it comes with the ability to search for components (dependencies are named “components” within the tool) and to see which product a component belongs to.

The following screenshot shows, as an example, a search for the dependency (called component) umask.

Screenshot of a component search in OWASP Dependency Track

In the current version 4.4.1 of the tool it is, unfortunately, not possible to launch an on-demand scan of all existing products; by default, the analysis is executed every 24 hours.
This can lead to an unwanted delay: when, for example, a vulnerability like log4shell is announced, the organisation must determine which components depend on the vulnerable component. A scan can only be triggered earlier by re-uploading the SBOMs; otherwise it runs every 24 hours when the EventScheduler triggers the analysis. Evaluating the SBOMs takes some time as well, so there might be a delay of a few hours between the trigger and the results.

OWASP DefectDojo

Since we use multiple security scanning tools and want advanced response options (e.g., temporary acceptance of a vulnerability), we use OWASP DefectDojo as a vulnerability management system. The potential vulnerabilities are responded to in OWASP DefectDojo. We contributed, for example, the ability to automatically close findings in case they are not uploaded again due to a patch, and the ability to run OWASP DefectDojo in Kubernetes.

ClusterImageScanner

We developed the ClusterImageScanner to identify vulnerabilities in Kubernetes environments by composing multiple good open source products together, and published the ClusterImageScanner as an open source tool as well. All the scans and used tools are described in detail in the docs for the ClusterImageScanner.

The described open source tools are all good tools, but they need to communicate with each other in order to achieve great results. The ClusterImageScanner provides a process for this. An overview is given in the following figure:

ClusterImageScanner Overview

The ClusterImageScanner works, briefly, as follows: the images running in the Kubernetes clusters are fetched and stored in a git repository. Afterwards, the Orchestrator picks them up to fetch and analyse the images. There is an option to upload the SBOM at build time to Dependency Track and attach an SBOM serial label to each image. If an SBOM serial label is attached to an image, the vulnerabilities for that image are requested from Dependency Track; otherwise, an SBOM is generated with syft and uploaded to Dependency Track to fetch the vulnerabilities.
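
The per-image decision described above can be sketched as follows (the function names, the label key, and the stubbed Dependency Track client are hypothetical; the real implementation lives in the ClusterImageScanner repository):

```python
def vulnerabilities_for_image(image: dict, dependency_track, sbom_generator):
    """Decide how to obtain vulnerabilities for one cluster image:
    reuse a build-time SBOM if the image carries a serial label,
    otherwise generate an SBOM post-build and upload it first."""
    serial = image.get("labels", {}).get("sbom-serial")
    if serial:
        # Build-time SBOM already in Dependency Track: just query it.
        return dependency_track.findings_by_serial(serial)
    # No label: generate the SBOM post-build (e.g. with syft) and upload.
    sbom = sbom_generator(image["ref"])
    serial = dependency_track.upload(sbom)
    return dependency_track.findings_by_serial(serial)

class FakeDependencyTrack:
    """Minimal stand-in for the Dependency Track API used in this sketch."""
    def __init__(self):
        self.findings = {"abc-123": ["CVE-2021-44228"]}
    def findings_by_serial(self, serial):
        return self.findings.get(serial, [])
    def upload(self, sbom):
        self.findings["new-serial"] = []  # freshly uploaded, not yet analysed
        return "new-serial"

dt = FakeDependencyTrack()
labelled = {"ref": "registry.example/app:1.0", "labels": {"sbom-serial": "abc-123"}}
print(vulnerabilities_for_image(labelled, dt, lambda ref: {"components": []}))
# ['CVE-2021-44228']
```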

The earlier mentioned case of a new master release without deployment is addressed by scanning the production Kubernetes environments.

All vulnerabilities identified by OWASP Dependency Track are uploaded to OWASP DefectDojo. Afterwards, the teams that own the corresponding artefact are notified about unhandled vulnerabilities via Slack.

The service owners are thereby prompted to check their software in OWASP DefectDojo for vulnerable components and to verify whether the currently deployed service uses the vulnerable dependency. Alternatively, a scan can be triggered manually for a cluster, giving developers direct feedback via Slack. We recommend performing such automatic scans at least daily. A look into the documentation can help to understand the ClusterImageScanner better.

Summary

As we have shown, the supply chain risk of vulnerable components introduced by using third-party software does exist, but with the proper processes, tools, and methods the risk is manageable, and the overall positive effects can outweigh it. Within this article, we introduced some of the more common methods and tools to address these risks.

SDA SE has a long track record of using such methods and tools, and with this article we tried to give you an impression of how we evaluated and implemented our toolset for this purpose. Finally, we also introduced our own open source solution, the ClusterImageScanner, which includes most of the discussed aspects and has proven to be the most suitable solution for us.

We are interested in your opinion on how the Software Bill of Materials will develop and be used in the future; please leave a comment.

If you have any questions, either regarding this article or regarding our ClusterImageScanner, please do not hesitate to leave a comment or contact us via opensource-security@sda.se.
