ABN AMRO Open Source project: Repository Scanner

Tech

ABN AMRO
ABN AMRO Developer Blog
7 min read, Dec 7, 2022


Today, ABN AMRO published the Open Source project Repository Scanner (RESC) on the ABN AMRO GitHub. It is an Open Source tool that increases the security of your code. In this article, you can read more about the Repository Scanner tool, the motivation behind creating it, why it is open-sourced, its features, and how it works.

Want to learn more and follow our Open Source developments? Read the blog about why we Open Source, go to the ABN AMRO Developer Portal or directly go to the ABN AMRO GitHub.

Introduction

Exposing sensitive data holds a prominent position in the Open Web Application Security Project (OWASP) Top 10. Therefore, the need to detect secrets like passwords, credentials, tokens, or certificates is at an all-time high. ABN AMRO has developed a tool that focuses on detecting such secrets when they are exposed in version control systems: the Repository Scanner tool.

The Repository Scanner tool makes it easier for developers to look for secrets in their source code. Using the Repository Scanner, you can see where you might have missed something or where you could improve, while increasing security. The tool is meant not only for developers, but also for penetration testers, security auditors, security researchers, and enterprises. The Repository Scanner offers a simple, intuitive user interface that displays the gathered information.

The motivation behind making this tool

The Secure Coding team at ABN AMRO noticed that developers were having difficulties keeping secrets out of their source code. Secrets are passwords, tokens, user names, certificates, API keys, and anything else you would want to keep secret. Secrets are often hard-coded inside of applications by developers. This is done as a temporary workaround, on purpose because it is needed at that time, or unknowingly (for instance, when a piece of code is re-used, or because of inexperience in secure software design).

The reasons differ, but the outcome is the same: when an adversary gains access to the source code and observes a hard-coded secret, the adversary can exploit this secret to gain unauthorized access to a system.

Therefore, there was a need for a tool or solution that scans for secrets in the CI pipelines and also continuously monitors repositories for committed secrets and secrets hidden in the git commit history. The team looked at different market-leading commercial static code analysis tools as a solution to the problem. These tools can detect specific types of secrets in code. However, a big gap was noticed: there were many false positives, not all types of secrets were covered, and there was no freedom to set up your own “rules” to find specific secrets. This kickstarted the motivation to build the Repository Scanner tool and give users the freedom to create their own rules, so that the tool can detect a broader range of secrets than the default rules cover.

Why Open Source the Repository Scanner?

Below, the motivations of ABN AMRO to make the Repository Scanner an Open Source project are explained:

  • Increase cyber security everywhere:
    Secrets pose an immediate risk to applications and data when they are exposed. ABN AMRO wants to help every organization and developer achieve the best possible security at minimal cost (in this case, for free).
  • Community collaboration:
    Collaborating with the community on the Repository Scanner will hopefully trigger improvements and feature development. This way, everyone can benefit from even better protection against adversaries.
  • Give back to the Open Source community:
    Within ABN AMRO we make use of Open Source code for multiple purposes. In addition, the Repository Scanner makes use of GitLeaks, an Open Source project. Therefore, ABN AMRO wants to contribute to the Open Source community and create an open-source-friendly development environment.

Features

The tool stands out with its many features, such as its simple but beautiful user interface and the freedom to manage the rule packs to your own liking. A more detailed description of the features of the Repository Scanner tool follows:

  • Support of multiple Version Control Systems (VCS)
    Unlike other tools, the Repository Scanner supports more than a single Version Control System. It offers support for GitHub, BitBucket, and Azure DevOps. Since organizations and developers use different VCS platforms, supporting these different platforms is important.
  • Highly configurable YAML-based deployment setup
    Every environment is different. That’s why the Repository Scanner has an easy-to-configure YAML-based deployment setup. In addition, it offers continuous monitoring of repositories. Since the tool is meant for organizations of all sizes, it is essential that repositories can be continuously monitored. This is done through incremental scans of the repositories and their branches.
  • Ability to manage rule packs
    Given the lack of customization options provided by other tools and their vendors, the Repository Scanner tool prides itself on the ability to fully manage the rule packs that check for secrets in source code. A wide arsenal of rules is already available, but in case you need a specific rule for your repositories or organization, you have full freedom to add it.
  • Beautiful and intuitive dashboard
    The Repository Scanner tool comes with a beautiful user interface in the form of a clear and easy-to-understand dashboard. This dashboard displays all the information you want about the secrets found in the repositories. The dashboard also comes with an analytics and metrics section, which gives an overview of all the rules along with their false- and true-positive rates.
  • SIEM integration to report true positives
    For organizations that have a dedicated team that makes use of a Security Information and Event Management (SIEM) solution, a SIEM integration was implemented. This way the secrets found through the repository scanner are available in one central location for these organizations and teams.
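
The incremental scans mentioned in the deployment feature above can be pictured with a small sketch. This is not RESC's actual implementation: the `ScanState` class, its method names, and the commit identifiers are hypothetical, and the sketch assumes the scanner remembers the last scanned commit per repository and branch and only looks at newer commits on the next run.

```python
from dataclasses import dataclass, field


@dataclass
class ScanState:
    """Hypothetical bookkeeping: last scanned commit per (repository, branch)."""

    last_scanned: dict = field(default_factory=dict)

    def commits_to_scan(self, repo, branch, history):
        """Return the commits in `history` (oldest first) not scanned yet."""
        last = self.last_scanned.get((repo, branch))
        if last is None or last not in history:
            # First scan, or the history was rewritten: scan everything.
            return list(history)
        return history[history.index(last) + 1:]

    def mark_scanned(self, repo, branch, history):
        """Record the newest commit of `history` as scanned."""
        if history:
            self.last_scanned[(repo, branch)] = history[-1]


state = ScanState()
history = ["a1", "b2", "c3"]  # made-up commit ids, oldest first

first = state.commits_to_scan("demo-repo", "main", history)
state.mark_scanned("demo-repo", "main", history)

history.append("d4")  # a new commit arrives
second = state.commits_to_scan("demo-repo", "main", history)

print(first)
print(second)
```

The first run covers the full history, while the second run only covers the commit that arrived afterwards, which is what keeps continuous monitoring cheap.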

How does the Repository Scanner work?

The Repository Scanner tool is a big project that consists of many different components, each serving its purpose to make the tool work as smoothly, quickly, and accurately as possible. The figure below illustrates each component and how it interacts with the other components within RESC.

RESC-VCS-SCRAPER

All the projects and repositories from the different supported VCS providers, such as BitBucket, Azure Repos, and GitHub, are gathered by this component. It consists of two primary modules: vcs-scraper-projects and vcs-scraper-repositories.

The vcs-scraper-projects module runs a scheduled job that scrapes all project information from the VCS providers and sends these projects to the projects queue. The other module, vcs-scraper-repositories, picks up these projects from the projects queue, obtains the relevant branch and repository information, and forwards it to the repositories queue. The projects queue and the repositories queue are the two message queues used in RESC; both are part of our MESSAGE BROKER, RabbitMQ. The message broker is responsible for creating the task queues, dispatching the tasks to these queues, and delivering them from the queues to the workers.
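
The two-stage flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's in-memory `queue.Queue` instead of RabbitMQ, and `FAKE_VCS`, the function names, and the message fields are hypothetical stand-ins rather than RESC's real code or message format.

```python
import queue

# In-memory stand-ins for the two RabbitMQ queues.
projects_queue = queue.Queue()
repositories_queue = queue.Queue()

# Hypothetical data that a VCS provider API might return.
FAKE_VCS = {
    "payments": {"api": ["main", "develop"], "web": ["main"]},
    "tools": {"cli": ["main"]},
}


def scrape_projects():
    """Stage 1: put every project found at the VCS provider on the projects queue."""
    for project in FAKE_VCS:
        projects_queue.put(project)


def scrape_repositories():
    """Stage 2: expand each queued project into repository tasks."""
    while not projects_queue.empty():
        project = projects_queue.get()
        for repository, branches in FAKE_VCS[project].items():
            repositories_queue.put(
                {"project": project, "repository": repository, "branches": branches}
            )


scrape_projects()
scrape_repositories()

# A scanner worker would consume these tasks one by one.
tasks = []
while not repositories_queue.empty():
    tasks.append(repositories_queue.get())

for task in tasks:
    print(task)
```

Splitting the work over two queues means project discovery and repository discovery can run and scale independently of the scanning itself.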

RESC-VCS-SCANNER

The second component is the RESC-VCS-SCANNER. This component runs a Celery worker that picks up the repositories from the aforementioned repositories queue and scans them for secrets. The discovered secrets themselves are never stored in the database. Instead, a record of the author, email address, and commit information is kept, along with a reference to the location of the secret in the source code.
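
To make the "reference instead of secret" idea concrete, here is a minimal sketch. The `Finding` fields, the `scan_text` helper, and the example rule are assumptions made for illustration; they are not RESC's actual data model, and the real scanning is delegated to GitLeaks.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record: where the secret is, never what it is."""

    rule_name: str
    file_path: str
    line_number: int
    commit_id: str
    author: str
    email: str


def scan_text(text, pattern, rule_name, file_path, commit_id, author, email):
    """Return one Finding per line that matches `pattern`, without the match itself."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if pattern.search(line):
            findings.append(
                Finding(rule_name, file_path, line_number, commit_id, author, email)
            )
    return findings


# Illustrative rule and input; all values are made up.
PASSWORD_RULE = re.compile(r"password\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE)
source = 'user = "svc"\npassword = "hunter2"\n'

findings = scan_text(
    source,
    PASSWORD_RULE,
    rule_name="hardcoded-password",
    file_path="app/settings.py",
    commit_id="abc123",
    author="Jane Doe",
    email="jane@example.com",
)
print(findings)
```

Because the record only points at a file, line, and commit, a leak of the findings database does not expose the secrets themselves.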

RESC-BACKEND

As the name suggests, the RESC-BACKEND is the backend of the RESC tool, in which most of the “behind the scenes” magic takes place. It consists of the database models, the RESC Web Service, and the Alembic script for the database migration. In addition, the creation of the RabbitMQ users and queues is handled in this component of the tool.

To create the RESC Web Services, we made use of FastAPI, a framework used to build APIs with Python.

RESC-FRONTEND

The last “big” component is the RESC-FRONTEND. The front end of an application is usually the first thing a user sees, so it must be appealing and invite them to use the tool. The Repository Scanner has a beautiful, simple, and intuitive frontend built around a dashboard that displays all the information you would want from the tool: repository information, information about the secret that has been found, and information about the rule pack. Additional analytics and graphs give a high-level overview of the findings and make this tool stand out.

DATABASE

Other smaller components used within the Repository Scanner tool include the DATABASE, where information about the projects, repositories, branches, scans, and findings is stored. As mentioned before, no secrets are directly stored in the database. For the database, we are using Azure SQL Edge.

SECRET SCAN RULE

The SECRET SCAN RULE is a smaller component that uses a GitLeaks configuration file in the TOML format to specify the secrets you want to scan for. This TOML rule configuration file must be provided at deployment time so that the tool can find secrets based on it.

There is also a set of initialization tasks that are executed during the deployment of the Repository Scanner tool:

  • resc-db-init
    This task makes it possible to push database modifications. It executes the Alembic migration scripts described in the RESC-BACKEND.
  • resc-mq-init
    This task bootstraps the RabbitMQ message broker by creating the queues for projects and repositories and by creating and setting up user access and permissions for the queues.
  • resc-rules-init
    This final noteworthy task parses the rules from the TOML rule configuration file that was provided during the deployment of the tool and saves them in the database.

When performing a secret scan, the VCS scanners use these rules.

Check out the Repository Scanner on GitHub

Check out the Repository Scanner project (RESC) on the ABN AMRO GitHub and see how you can contribute. Read about Open Source on the ABN AMRO Developer Portal or the blog.

Acknowledgment

Since the Repository Scanner makes use of GitLeaks, we want to give Zachary Rice credit for creating and maintaining GitLeaks. GitLeaks has helped many organizations secure their codebases against leaked secrets.
