GitHub Code Scanning

Published in Technogise, Jul 26, 2020 (8 min read)

Have you ever scanned your code for vulnerabilities or errors? No? Don’t worry, I wasn’t aware of the practice either when I started my career as a developer. Many organizations treat it as a best practice and a must before moving their applications to production. It helps catch security vulnerabilities and errors before the application is deployed, thereby avoiding business impact or monetary losses.

There are many code scanning tools available. Some are proprietary licensed products, while others are free. But they all require some effort to integrate with your existing CI/CD setup. The one I have seen is Veracode, a proprietary product that reports security risks and code errors by scanning the entire code base, including (I believe) the dependencies. Then there is SonarQube, which is widely used by many organizations and which many of you may already know. It is a static code analyser that reports security risks and code errors.

GitHub, which is the most popular platform for open source development, has also come up with a new service that allows code scanning of the repository for security vulnerabilities and any coding errors. You can use GitHub code scanning to find, triage, and prioritise fixes for existing problems in your code. It also prevents developers from introducing new problems. You can schedule scans for specific days and times, or trigger scans when a specific event occurs in the repository, such as a push.

If code scanning finds a potential vulnerability or error in your code, GitHub displays an alert in the repository. After you fix the code that triggered the alert, GitHub closes the alert. GitHub Code Scanning is a new service and is still in beta. Many scanning rules are still a work in progress. For example, detecting passwords leaked in a Java properties file is not yet supported. However, as the service becomes more stable, such cases should become available in the CodeQL rules database.

By default, code scanning uses CodeQL, a semantic code analysis engine. CodeQL treats code as data, allowing you to find potential vulnerabilities in your code with greater confidence than traditional static analyzers. You can use CodeQL to find all variants of a vulnerability and remove them from your code.

QL is the query language that powers CodeQL. It is an object-oriented logic programming language. GitHub, language experts, and security researchers create the queries used for code scanning, and the queries are open source. The community maintains and updates the queries to improve analysis and reduce false positives. For more information, see CodeQL on the GitHub Security Lab website.

In this short tutorial, we will see how to configure code scanning for a Java-based project. Code scanning supports both compiled and interpreted languages and can find vulnerabilities and errors in code written in the following supported languages:

C/C++
C#
Go
Java
JavaScript/TypeScript
Python

GitHub Code Scanning is still in beta, and access is limited to users on an invitation basis. Follow the steps below once you have access to the service.

Let’s start 🏃🏻‍♂️

The first thing you need to do is navigate to the Security tab of the repository.

You will then see the screen below. The last option is “Code scanning alerts”; click its Setup code scanning button.

Code scanning uses GitHub Actions. You will see the above screen for CodeQL analysis setup. Along with this, you will also see many other options from the marketplace, such as “Anchore Container Scan” and “OSSAR”. However, for this post we will go with the default action workflow. Click the Setup this workflow button for CodeQL analysis, and you will be presented with the screen below to configure the workflow YAML.

When you see the configuration, click Start commit and create a commit with the basic configuration. We will make the modifications afterwards.

Pull the latest changes from your repository so that the workflow YAML is available locally. Open the file in the .github/workflows directory and modify the configuration as required.

Below is the configuration I used for this post:

I have modified a few things in the generated workflow. For example, I have disabled the cron schedule option, which runs the security scan on a given cron expression. The push option runs the scan on every commit pushed to the remote, and similarly pull_request runs it for every pull request. However, in one attempt, running the full pack of security rules took 19 minutes.

In such a case, I would recommend running the security scan on a cron schedule once a day, preferably as a nightly build, so that your team sees the results first thing next morning and can prioritise fixing the bugs.
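As a rough illustration of the three triggers discussed above, here is a minimal sketch of the trigger section of a workflow file. The file name, the master branch, and the 01:00 schedule are assumptions for illustration, not values taken from this post. GitHub Actions evaluates cron expressions in UTC.

```yaml
# .github/workflows/codeql-analysis.yml (trigger section only; file name is an assumption)
name: "CodeQL"

on:
  # Scan every commit pushed to the branch (can be slow with a large rule set).
  push:
    branches: [master]
  # Scan every pull request that targets the branch.
  pull_request:
    branches: [master]
  # Nightly scan at 01:00 UTC; the time is illustrative.
  schedule:
    - cron: '0 1 * * *'
```

If the full scan takes too long to run on every commit, removing the push and pull_request entries and keeping only schedule gives the nightly-only setup.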

Another change I have made is to add a configuration file, .github/codeql-config.yml, for the Initialize CodeQL step to use.

The CodeQL action codeql-action/init@v1 can detect the programming language automatically; this feature is built into the action by GitHub's developers. However, I have explicitly specified Java so that the step can skip the extra detection work and save time. I also prefer readable configuration over implicit magic, as it helps anyone new to your team figure out how things are configured.

Then there is the queries option, which allows you to specify additional scan rules. Two suites are built into the CodeQL tool, and I have specified both: security-extended and security-and-quality. There are also many additional security rules in GitHub's CodeQL repo. You can find them here. When I first added the entire repo (more than 800 rules), the scan took a long time and many of the rules were not useful for my experiment, so I selected a smaller subset.

The configuration follows a specific syntax, as shown below. To run only the custom queries, disable the default queries by setting disable-default-queries: true.
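A minimal sketch of what the job section might look like with the language set explicitly and the config file wired in. The job name, runner, checkout version, and the autobuild step are assumptions; the @v1 tags follow the version named in this post and may be outdated for current setups.

```yaml
jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Initialize CodeQL
        uses: github/codeql-action/init@v1
        with:
          # Set explicitly so the action does not need to auto-detect languages.
          languages: java
          # Path to the custom CodeQL configuration described in this post.
          config-file: ./.github/codeql-config.yml

      # Tries to build the Java project so CodeQL can observe the compilation.
      - name: Autobuild
        uses: github/codeql-action/autobuild@v1

      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v1
```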

Different ways to provide custom CodeQL queries

In the configuration, I have provided github/codeql/java/ql/src/codeql-suites/[email protected]. This is a path within the GitHub repository, and I have pinned the rules to version 1.24.0. You can also point to the master branch, as shown in the configuration guide above.
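To make the syntax concrete, here is a hedged sketch of a .github/codeql-config.yml along the lines described. The config name and the entry name are hypothetical, and <suite-file> and <tag> are placeholders to replace before use; the exact suite file used in this post is not reproduced here.

```yaml
# .github/codeql-config.yml
name: "Custom CodeQL config"   # hypothetical name

# Run only the queries listed below instead of the default set.
disable-default-queries: true

queries:
  # Suites built into CodeQL.
  - uses: security-extended
  - uses: security-and-quality
  # Queries from a public repository, in the form owner/repo/path@ref.
  # <suite-file> and <tag> are placeholders, not real values.
  - name: Extra Java rules      # hypothetical label
    uses: github/codeql/java/ql/src/codeql-suites/<suite-file>@<tag>
```

Pinning the reference to a tag keeps the rule set stable between runs, while pointing at a branch picks up new rules as they land.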

After you have finished the configuration, create a new commit and push it. You will see a new scan action triggered, which you can view in the Actions section of the GitHub repo page.

Clicking an individual action run shows additional information about the scan process in the log. Navigate to the Perform CodeQL Analysis step to see which security rules were checked against the code.

Once the workflow completes without errors, navigate to the Security tab to check for any security vulnerabilities or errors in the code. The badge also shows the number of security risks present in the code. I liked this because the information is available right on the main page of your repo, so the team can prioritise the bugs to fix.

If you click on it and go to the code scanning alerts, you can see a list of all the vulnerabilities found in the code.

The listing shows the message for each issue found, the file and line number where it was detected (very cool 🙂), and the branch in which it was found. Clicking on any issue lets you drill down into the actual code. The UI highlights the offending line of code and shows whether it is a warning or an actual error, along with details such as the commit that introduced the issue and the rule that failed. The UI also tags issues by rule category. In the image below, you can see the errors tagged as CWE-798 and security.

The details of two errors that pose security risks to my code are shown below (deliberately added for demo purposes 😛). One is a password leaked into the source code, and the other redirects directly to a user-provided URL without first validating it.

If you click the small show more dropdown, you can see helpful guidance on the issue and how to avoid it. There are also links to reference documentation that explains the issue in depth.

Recommendation for the cause of the issue

Once you have fixed the issue and pushed the code, the issue is closed. However, at times there will be false positives. For such cases, there is an option at the top right of the issue to close it manually. It offers three choices, shown in the image below; select the one that suits your use case and close the issues that are not valid.

I hope this post helps you. If you don’t see the option to enable the security scan on your repository, don’t worry: the tool is still in beta and GitHub grants access on an invitation basis. Request an invite for the service if you want to try it out. It should be accessible to everyone soon.

This post gives a basic idea of how the scanning works with a repository. There are many open-source tools and workflows available in the GitHub marketplace that you can try. This post should help you get started with the basics.

Happy Coding!

Originally published at https://iamninad.com on July 26, 2020.

Written by Neenad Ingole