One number, one metric. How we made Vulnerability Management easier at QuintoAndar.

Daniel Carlier
Blog Técnico QuintoAndar
9 min read · Feb 7, 2023

The CyberSecurity challenge

I believe that CyberSecurity can be divided into two main categories: identification and remediation. The first is dedicated to finding security-related flaws, developing exploits, and working on ways to gather previously unobtainable information. The latter focuses on developing and implementing countermeasures to either prevent or remediate already exploited vulnerabilities.

As a cybersecurity professional with just shy of 5 years of experience, I’ve seen many commercially available and open-source tools that promise to deliver the best suite of vulnerability identification and tracking. Don’t get me wrong: most of them deliver great user interfaces and powerful vulnerability identification engines. The main issue I find with these solutions is that, most of the time, implementing just one isn’t enough to get the coverage and depth of analysis we want. QuintoAndar, as an example, is a very diverse company stack-wise, with a few hundred repositories written in 10 or more different programming languages. No single tool can cover all of that, meaning we need multiple tools to provide efficient coverage.

The developer’s point-of-view

Now, in the shoes of a developer trying to gather information on their repository’s security status, I would need to visit different pages with inconsistent buzzwords, numbers, letters, and even colors, all trying to display security status information. All of that to answer one question: “How secure is my application/repository?”

As an example, I’ve gathered some different pictures to illustrate what would be the answer to the question above according to each identification tool:

Different security tools and their outputs

Just from the images above, what would your answer be to the question: How secure is my application/repository? “B”, “2 Vulnerabilities”, or even “19H 29M”?

Proposed Solution

With the goal of providing a unified answer to the main question, and to abstract away the many tools’ views on the security situation of a repository, the Application Security team created a unified metric.

Through this unified metric, the AppSec team’s goal was to standardize the company’s understanding of security metrics. With developers instructed to look only at this single metric, whatever happened backstage no longer mattered: the identification tools running in the background could change freely, as long as the front-end metric kept working and its calculation was known.

As an example, if a vulnerable-dependency identification tool needed to be replaced with another, the process would not impact developers directly. Using a single number as a facade for a repository’s security situation abstracts away all the background tool logic.
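This facade idea can be sketched roughly as follows. The adapter interface and the stubbed scanner classes below are my own illustration, not QuintoAndar's actual internals: each tool normalizes its findings into a common shape, and callers never see which tools ran.

```python
from abc import ABC, abstractmethod


class IdentificationTool(ABC):
    """Adapter interface: each scanner normalizes its findings here."""

    @abstractmethod
    def severities(self, repo: str) -> list[str]:
        """Return the severity of each open issue found in `repo`."""


class DependencyScanner(IdentificationTool):
    def severities(self, repo: str) -> list[str]:
        return ["critical", "low"]  # stubbed results for illustration


class SecretScanner(IdentificationTool):
    def severities(self, repo: str) -> list[str]:
        return ["high"]  # stubbed results for illustration


def security_findings(repo: str, tools: list[IdentificationTool]) -> list[str]:
    """Facade: callers see one flat list of severities, never the tools behind it."""
    return [sev for tool in tools for sev in tool.severities(repo)]
```

Swapping one scanner for another then only means changing the list of tools passed in; the shape of the output, and everything built on top of it, stays the same.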

This single metric I’ve been talking about received the name of Security Score. With its principles in place, all the team had to do was come up with a meaningful way to organize all of the vulnerability identification tools’ outputs into a single one.

I’m not gonna pretend that developing a calculation that takes many different formats, colors, numbers, and letters as input and produces a single number was easy. In fact, we did many test runs on different formats before finally arriving at what we use today. In the beginning, for example, we were multiplying the number of issues by a weight per severity and then summing the results. The problem with this approach is that a Security Score of zero was, in fact, very good, as it represented no issues, while a higher Security Score meant bad things. Developers were at first a bit confused by this metric, which made us rethink the calculation method. An illustration of how it worked is as follows:

First Security Score calculation method
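A minimal sketch of that first approach, with made-up weights (the actual multipliers were internal to the team): count the issues per severity, multiply by a severity weight, and sum. Note the counterintuitive direction — zero is perfect, and the score grows as things get worse.

```python
# Hypothetical severity weights for the first (v1) Security Score.
# Illustrative values only; the real multipliers were internal.
V1_WEIGHTS = {"critical": 10, "high": 5, "medium": 3, "low": 1}


def security_score_v1(issue_counts: dict[str, int]) -> int:
    """Weighted sum of open issues per severity.

    0 means no issues (good); a higher score means worse things,
    which is exactly what confused developers at first.
    """
    return sum(V1_WEIGHTS[sev] * count for sev, count in issue_counts.items())
```

With these weights, one critical issue plus two low issues would score 12, while a clean repository scores 0 — "lower is better", the opposite of what most people expect from a score.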

In a much-needed V2, the Security Score implemented a calculation method that would still receive all those very different input types, but return a number from 0 to 100. This time, the closer the Security Score got to 0, the more critical the issues. One way of properly mapping all issues to the Security Score was to create different sections, each with its own severity.

All the vulnerabilities reported by the identification tools had something in common: each had its own way of representing severity. After reading a bit more about each of the identification tools and trying to understand their classification methods (which wasn’t easy! 😅), we decided on the following Score:

Security Score calculation
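One plausible way to implement such a severity-sectioned 0–100 score is sketched below. The band boundaries and the "10 issues hit the floor" rule are my own illustrative assumptions, not the actual internal formula: the worst severity present picks the section, and the number of issues of that severity pushes the score toward the section's floor.

```python
# Hypothetical severity bands mapping a repository onto 0-100
# (closer to 0 = more critical). Illustrative boundaries only.
V2_BANDS = {  # (floor, ceiling) of the score section per severity
    "critical": (0, 25),
    "high": (25, 50),
    "medium": (50, 75),
    "low": (75, 100),
}
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


def security_score_v2(issue_counts: dict[str, int]) -> float:
    """Return a 0-100 score sectioned by the worst severity present.

    More issues of that severity push the score toward the band's
    floor; with these illustrative numbers, 10+ issues reach it.
    """
    for sev in SEVERITY_ORDER:
        n = issue_counts.get(sev, 0)
        if n > 0:
            floor, ceiling = V2_BANDS[sev]
            width = ceiling - floor
            return max(float(floor), ceiling - n * width / 10)
    return 100.0  # no open issues: perfect score
```

The key property is that a repository with any critical issue can never score above a repository whose worst issue is only high severity — the sections keep severities strictly ordered.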

As shown in the picture above, the Security Score used to be entirely severity-based. This decision worked for quite some time and helped us develop a good understanding of where we stood security-wise. Nevertheless, after about a quarter with this metric running, we started receiving feedback that it might be too harsh. With about 800 repositories company-wide, created over the span of five years, it’s safe to say that some lacked proper maintenance. That being the case, the Security Score was due to change one more time.

This new Security Score iteration had to take into account the implicit difficulty of fixing one type of vulnerability versus another. Proposing, developing, and testing a fix for an issue related to business logic may prove harder and more time-demanding than updating a dependency across a major version change. With those points in mind, the newly proposed calculation method for the Security Score is as follows:

New Security Score calculation method
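A sketch of how severity and issue type can be combined, again with invented weights (the real values were internal): each issue's penalty is its severity weight scaled by a type multiplier, so a manual business-logic finding costs more score than a dependency bump of the same severity.

```python
# Hypothetical weights combining severity and issue type.
# Illustrative values only; manual fixes are assumed costlier than
# dependency updates, per the reasoning in the text above.
SEVERITY_WEIGHT = {"critical": 40, "high": 20, "medium": 8, "low": 2}
TYPE_WEIGHT = {"manual": 2.0, "secret": 1.5, "dependency": 1.0}


def security_score_v3(issues: list[tuple[str, str]]) -> float:
    """Score a repo from a list of (severity, issue_type) pairs.

    Starts at 100; each issue subtracts severity_weight * type_weight,
    floored at 0. Critical manual issues hurt the most.
    """
    penalty = sum(SEVERITY_WEIGHT[sev] * TYPE_WEIGHT[kind] for sev, kind in issues)
    return max(0.0, 100.0 - penalty)
```

Under these assumed weights, a single critical manual issue (penalty 80) drags the score down to 20, while a low-severity dependency issue (penalty 2) barely registers at 98 — nudging teams toward exactly the issues that take longest to fix.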

The new calculation method, as shown above, gives more weight to higher-severity vulnerabilities, but also takes the vulnerability’s type into account. This approach has been very interesting in the sense that it made teams focus more on manual and critical/high-severity issues, which are the ones that take more time on average to fix.

Data Visualization

Even though the Security Score may change due to different factors, such as a new interpretation of the significance of each type of issue, one thing we always had in mind was to provide a single visualization experience.

When making use of different identification tools, one may feel a strong push toward adopting each tool’s interface as the default interface for security interaction. Nonetheless, it proved important to move away from this trend, with the goal of providing the same user experience to our customers (developers) regardless of the identification tools being used to gather the metrics.

A single visualization experience is valuable because we, the Security Team, no longer need to worry about providing multiple pieces of training on how to use the different tools we have; just one is needed. The catch is that this single visualization point must encompass all identification tools behind a single delivery point: an Aggregator, which is what we call the solution internally.

Internally, QuintoAndar makes use of Backstage, which is an open-source platform for building developer portals. The engineering teams are all on track to try and migrate their different portals and tools to the solution with the goal of providing a single place for engineering-related information.

Backstage’s front-end component is written in TypeScript, which was in itself a very interesting challenge for me as someone who had pretty much only worked on back-end systems. As the platform’s main design language is card-oriented, we decided to go with that and develop a modular interface to split the information between different blocks. The whole design went through countless reviews and tweaks during development and its use by the engineering teams, so the image below is the result of all the feedback we received and how the page looks today:

Example of a Security tab shown on a project

The image above is mainly divided into three separate cards. The first, Security Tools Results, provides a simple visualization of how many issues of each type a project has. Currently, we divide the issues into Dependency, Secret, and Manual issues. Dependency vulnerabilities are those reported by an identification tool on every new commit to a repository, secret issues are those related to possible keys exposed in the repository, and manual issues are those entered manually by a developer. It’s important to mention that all the different images are clickable and provide a quick way to access more details on the issues. Aside from that, there’s also a button to see the whole list of vulnerabilities straight from the Backstage view:

List of vulnerabilities accessible straight from Backstage

Through the detailed view of the Active Vulnerabilities table, it’s possible to find useful information such as each issue’s reported Severity, SLA expiry time, patched ranges (if applicable), and the commit it was reported in.

Coming back to the previous page, the chart card provides insight into the project’s Security Score evolution over the days. As we can see from the image below, this demonstration project has had a score of 0 and a highest severity of Critical for quite some time:

Security Score History card

The last card on this page provides a tip on the action that has the biggest impact on the Security Score. If developers ever feel a bit lost about what to prioritize, the tip can be a great place to start. The tip card is shown below:

Security Tip card
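Picking such a tip can be sketched as a simple greedy choice: under a severity-and-type weighted score like the one described earlier, suggest the open issue whose fix recovers the most score. The weights and wording below are illustrative assumptions, not the actual tip logic:

```python
# Hypothetical tip logic: recommend the single fix with the biggest
# Security Score impact. Illustrative weights only.
SEVERITY_WEIGHT = {"critical": 40, "high": 20, "medium": 8, "low": 2}
TYPE_WEIGHT = {"manual": 2.0, "secret": 1.5, "dependency": 1.0}


def security_tip(issues: list[tuple[str, str]]) -> str:
    """Given (severity, issue_type) pairs, suggest the highest-impact fix."""
    if not issues:
        return "No open issues - nothing to fix!"
    # Greedy: the issue with the largest penalty yields the largest score gain.
    sev, kind = max(issues, key=lambda i: SEVERITY_WEIGHT[i[0]] * TYPE_WEIGHT[i[1]])
    return f"Fix the {sev} {kind} issue first for the biggest Security Score gain."
```

Because the score is a sum of per-issue penalties under this assumption, the greedy pick is also the globally optimal first step.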

One more thing: we also implemented a company-wide metrics page. The goal of this view is to provide a way for tribe managers to quickly visualize all of the projects simultaneously. The page is easily accessible through Backstage’s sidebar and displays the following table in the Overview tab:

Backstage projects security overview tab

There’s also a Metrics tab, which provides company-wide metrics related to all projects on display:

Security metrics tab

As can be seen from the image above, the top-left card shows the mean Security Score for all tiers of projects. The two cards on the right show the number of repositories by their highest severity and the overall number of issues.

The page also displays a chart to help track the Security Score’s evolution over time:

Security Score Mean History card

Conclusion

After a full year of running the Security Score approach, we’ve collected some interesting results that are worth mentioning. As a recap, the Security Score is a metric that provides an easy way to understand and put into perspective the severity of a vulnerability or issue.

At the beginning of 2022, the Application Security team had two identification tools up and running, each capable of providing a new analysis on every commit made to a repository. At the time of writing, we already have a third identification tool fully operational. Moreover, we have a fourth tool currently being implemented that will only be featured in a 2023 post, as it’s still experimental.

Since January, we saw the number of reported vulnerabilities increase by ~108% due to all the new repositories being created and the new identification tool. The cool and interesting part is that December’s Security Score is 25% higher than January’s.

Even though the overall number of reported vulnerabilities (before triage) has increased, the severity of the issues has decreased. By the end of the year, we can safely say that the company has put an effort into fixing critical severity issues and that we’ve moved forward. QuintoAndar is safer now than it was before.

When working with cybersecurity, it can be hard to create a reference for what it means to move forward. If we were road builders, for example, we’d need a reference point to know how far we’ve gone since the beginning. With so many different identification tools, it’s hard to determine a standard that is valid for all of them. It’s precisely for this scenario that the Security Score was created: with it, we can rest assured that the company’s overall issue severity is decreasing. It’s a homemade solution that is vendor-neutral and flexible enough to allow for the removal and addition of tools without friction.
