Recently I was asked to dig up an old script I wrote for finding the email addresses tied to GitHub usernames. I decided to rewrite it and add a few features. The script, gitrax, is available on GitHub.
gitrax.py is a tool for searching GitHub usernames via the GitHub API and returning data on that username. This is particularly helpful when analysts come across GitHub accounts containing potentially malicious tools. The data collected helps fingerprint a GitHub user and their connections.
I wanted to test the script by digging into a recent malicious event involving GitHub data, and ended up learning a lot about hunting and pivoting through GitHub’s API. I decided to focus on the Gentoo breach. During this breach, hackers gained access to the Gentoo repository on GitHub and made numerous malicious commits. Let’s walk through this together and learn something about hunting GitHub data.
gitrax.py requires a username as an argument, so we need to find the GitHub usernames behind the malicious commits involved in the breach. This proved to be pretty tricky for several reasons.
- First, there are 656 contributors and almost 200k commits. Sorting through the data is going to be difficult.
- Second, because of the amount of data, the malicious commits are going to be difficult to find via the web interface.
Before we begin, let’s check out the Gentoo Incident Report and narrow our search to useful information. We will focus on the following entries from the report:
20:19 Attacker successfully gains administrative access
20:34 Malicious commit to gentoo/gentoo, 73b72409->fdd8da2e
20:38 Malicious commit to gentoo/gentoo, fdd8da2e->49464b73
20:50 Malicious commit to gentoo/gentoo, 49464b73->afcdc03b
20:55 Malicious commit to gentoo/gentoo, afcdc03b->e6db0eb4, force-push
20:56 Malicious commit to gentoo/musl, 60461ca1->e6db0eb4. Force-push
21:07 Malicious commit to gentoo/systemd, bf0e0a4d->50e3544d
21:11 Malicious commit to gentoo/systemd, 50e3544d->c46d8bbf. Force-push
21:28 GitHub support responds; Gentoo GitHub org frozen.
Based on this information, our search window can be narrowed to 28 June 2018 20:19 to 21:28. At the bottom of the report they list the malicious commits.
gentoo/gentoo, master branch:
gentoo/musl, master branch:
Using a web interface to go back through the commit history (~200k commits) is very tedious, plus the malicious commits were force-pushed over. So let’s begin with the GitHub Developer API.
Mining GitHub API
The data for each commit is available by appending the abbreviated SHA1 from the incident report above to the end of the repo’s commit API URL.
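As a rough sketch of that step, the snippet below builds one commit API URL per abbreviated SHA1. The helper names are hypothetical, and treating the right-hand hash of each timeline entry as the malicious commit is my reading of the report, not something the report spells out. The URL pattern is GitHub's standard "get a commit" endpoint.

```python
# Hypothetical helpers; not part of gitrax.py.
API_ROOT = "https://api.github.com"

# Abbreviated SHA1s read from the incident report timeline
# (assumption: the hash after "->" is the malicious commit).
MALICIOUS_COMMITS = {
    "gentoo/gentoo": ["fdd8da2e", "49464b73", "afcdc03b", "e6db0eb4"],
    "gentoo/musl": ["e6db0eb4"],
    "gentoo/systemd": ["50e3544d", "c46d8bbf"],
}


def commit_url(repo: str, sha: str) -> str:
    """Return the commit API URL for one (possibly abbreviated) SHA1."""
    return f"{API_ROOT}/repos/{repo}/commits/{sha}"


def all_commit_urls(commits=MALICIOUS_COMMITS):
    """Return the commit API URL for every repo/SHA1 pair."""
    return [commit_url(repo, sha)
            for repo, shas in commits.items()
            for sha in shas]


if __name__ == "__main__":
    for url in all_commit_urls():
        print(url)
```

Each printed URL can be opened in a browser or fetched with any HTTP client.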
Starting with the gentoo/gentoo malicious commits, open each of the links above in your browser. Immediately we see obfuscated/fake user data under the committer and author sections.
The committer name and email address are fake. In the author section the login value is
We are unable to find the username in the malicious commit, so let’s try searching the public events API.
A quick note about searching the public events. GitHub’s public events API only goes back 90 days or 300 entries. While doing the searches below, if you cannot find the commits at the links provided, try paginating through older commits by adding
?page=3 and so on to the end of public event links.
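A minimal sketch of that pagination, assuming the elided links point at the standard repo events endpoint (/repos/<owner>/<repo>/events). The function names are hypothetical. The fetch helper is only defined here, not called, so nothing touches the network until you call it yourself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://api.github.com"


def repo_events_pages(repo: str, max_pages: int = 10):
    """Yield the URL for each page of a repo's public events."""
    base = f"{API_ROOT}/repos/{repo}/events"
    for page in range(1, max_pages + 1):
        yield f"{base}?{urlencode({'page': page})}"


def fetch_json(url: str):
    """Fetch one page and decode it (unauthenticated, so rate limited)."""
    with urlopen(url) as response:
        return json.load(response)
```

Looping over repo_events_pages("gentoo/musl") and passing each URL to fetch_json walks back through the events until a page comes back empty or the API limit is reached.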
I’m out of luck. The malicious commits are beyond the reach of the gentoo/gentoo repo’s public events (10 pages x 30/page = 300 events). At the time of this experiment (2018–07–10), the oldest commit I found in the public events for this repo is dated 2018–07–09T13:51:49Z.
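The same check can be written as a timestamp comparison. This is only a sketch: the function names are hypothetical, and it assumes the incident report times are UTC.

```python
from datetime import datetime, timezone

# End of the search window from the incident report (assumed to be UTC).
WINDOW_END = datetime(2018, 6, 28, 21, 28, tzinfo=timezone.utc)


def parse_github_time(stamp: str) -> datetime:
    """Parse a GitHub API timestamp such as 2018-07-09T13:51:49Z."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def window_reachable(oldest_event: str, window_end: datetime = WINDOW_END) -> bool:
    """True if the oldest visible event is at or before the window's end."""
    return parse_github_time(oldest_event) <= window_end
```

With the oldest gentoo/gentoo event quoted above, window_reachable("2018-07-09T13:51:49Z") is False, which matches what the browser showed.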
Let’s check out the other two repos. Start with the malicious commit for gentoo/musl. Open the link to the malicious commit for the gentoo/musl repo. We have the same author information as above. No luck so far.
Let’s check out the public events for the gentoo/musl repo. Open this link in your browser and look for a SHA1 starting with e6db0eb4. Check out the screenshot. We found one of the malicious commits.
Take note of a few things here. The GitHub username that made the commit is gentoogang. The email and name are the same. Save the username, as we’ll use it later with the gitrax.py script.
Since e6db0eb4 is the same hash as one of the malicious gentoo/gentoo repo commits, let’s check the public events for the user gentoogang and see if we can find any further information on the gentoo/gentoo commits. Open this link in your browser. We found the public events for those malicious commits.
We can now confirm that user gentoogang was the login username for all of the malicious commits to the gentoo/gentoo repo.
There are different email and name values associated with both commits. The author and committer sections are
Again, we are unable to gather a username from the malicious commits. Let’s try searching the public events for these two commits to find the usernames.
Open this link to the gentoo/systemd public events in your browser. Look for SHA1s starting with 50e3544d and c46d8bbf. Those are the two malicious commits we have been searching for. Now we’re getting somewhere!
For commit 50e3544d, the GitHub username is dudeweedlma0, and the email and name are the same. For commit c46d8bbf, the GitHub username is dudeweedlma0, and we have a different email and name. The commit message is something I would not expect from a Gentoo contributor. Take note of the usernames, as we will use them with gitrax.py later.
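Scanning the events JSON by eye gets old quickly, so here is a small sketch of the same lookup in code. It assumes each PushEvent carries a payload.commits list with sha, author and message fields, which is how the events looked during this walk-through; newer API responses may differ. The function name is hypothetical, and the sample event uses placeholder values rather than the real data.

```python
def find_commits(events, sha_prefixes):
    """Return one record per pushed commit whose SHA1 starts with a prefix."""
    prefixes = tuple(sha_prefixes)
    hits = []
    for event in events:
        if event.get("type") != "PushEvent":
            continue
        login = (event.get("actor") or {}).get("login")
        for commit in (event.get("payload") or {}).get("commits", []):
            sha = commit.get("sha", "")
            if sha.startswith(prefixes):
                author = commit.get("author") or {}
                hits.append({
                    "sha": sha,
                    "login": login,  # account that pushed the commit
                    "name": author.get("name"),  # free-form, can be faked
                    "email": author.get("email"),  # free-form, can be faked
                    "message": commit.get("message"),
                })
    return hits


# Placeholder event shaped like a PushEvent (values are illustrative only).
SAMPLE_EVENTS = [{
    "type": "PushEvent",
    "actor": {"login": "example-login"},
    "payload": {"commits": [{
        "sha": "50e3544d" + "0" * 32,
        "author": {"name": "Example Name", "email": "name@example.invalid"},
        "message": "placeholder message",
    }]},
}]

if __name__ == "__main__":
    for hit in find_commits(SAMPLE_EVENTS, ["50e3544d", "c46d8bbf"]):
        print(hit["login"], hit["sha"][:8], hit["email"])
```

The key point is that actor.login comes from GitHub itself, while the author name and email are whatever the committer typed into their git config.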
The malicious commits to gentoo/musl and gentoo/systemd are notably different from the first set in gentoo/gentoo. They use a different username and a non-obfuscated email address, and they carry insulting commit messages. We know the attacker invited a second user, as noted in the incident report: “20:30 Attacker invites a second malicious user”. Since the commit tactics are different, it is probable that one user focused on malicious commits to gentoo, while the other focused on musl and systemd.
Back to gitrax.py
A brief description of how to use gitrax.py: the script takes numerous flags as arguments. If no flag is provided, the script will look up all email addresses in the public events for a given GitHub username.
$ python3 gitrax.py -h
usage: gitrax.py [-h] [-a] [-e] [-f] [-F] [-g] [-i] [-o] [-O] [-r] [-s] [-S] [-t TOKEN] username

Search GitHub for User data

positional arguments:
username The GitHub username to search

optional arguments:
-h, --help show this help message and exit
-a, --all Gather all informaiton for GitHub username
-e, --email Find email(s) for GitHub username. This is the default lookup.
-f, --followers List followers for GitHub username
-F, --following List following for GitHub username
-g, --gists List gists for GitHub username
-i, --info List info for GitHub username
-o, --organizations List organizations for GitHub username
-O, --outfile Save results to file
-r, --repos List repos for GitHub username
-s, --starred List starred for GitHub username
-S, --subscriptions List subscriptions for GitHub username
-t TOKEN, --token TOKEN
Use GitHub Personal Access Token. format: -t
The most common usage is to look up all the fields using the -a flag. This will look up all emails, followers, following, gists, organizations, repos, starred, and subscriptions for a GitHub username. This flag will output results to your screen and save them in a JSON file.
$ python3 gitrax.py -a username
So let’s look up the usernames we found above and see if we can find any other email addresses, gists, repos, followers, organizations, etc. tied to these GitHub usernames.
Woof. OK. Nothing additional from gentoogang. Checking out the account page, we see it was created on the same date as the incident.
dudeweedlma0 with gitrax.py
Hrmm. OK. We have a few email addresses, the same ones we already found from the API. Let’s dig around on those. DomainTools does not show them registering any domains. A Google search for the first email shows hxxps[:]//wowana[.]me/ as the top result. The owner of the site claims their name is opal hart, the same author we found for malicious commit 50e3544d. They go on a rant about GitHub on 21 June 2018.
Ok. One last check on the wowaname GitHub account. Let’s go wayback.
We got a positive hit. The account existed at one point in time. The snapshot dates back to April 2014. Let’s check out the dudeweedlma0 GitHub account. Interesting, the account was created on the same date as the Gentoo breach.
The GitHub avatar image looks similar in design to one of the images on the wowana[.]me site above. Doing a little further digging through that site, I found a direct link to the image used in this GitHub profile picture: hxxps[:]//wowana[.]me/files/005[.]jpg. There are a lot of rabbit holes we could go down here.
I have doubts that anyone committing a breach would register accounts with information about themselves that is publicly available via a Google search. ¯\_(ツ)_/¯
Let’s try the last username,
Hey, the script returned data! Digging into that data, though, it looks like pottering is a very legitimate GitHub contributor, and the actual committer was putting their username info into the commit as a jab. A note on the numerous emails returned: the script first tries to grab the username’s email address directly. If no authentication is passed or the value for email is null, the script then searches the public events for that username and grabs every email it finds there. While this will most likely capture the email tied to the username, it will also grab the emails of all authors that have contributed to the user’s repos. If you need to dig into it a bit more, you can search the user’s public events at
https://api.github.com/users/<insert username here>/events/public
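To see why unrelated emails show up, here is a sketch of that kind of harvesting over a user's public events. This is not the code in gitrax.py, just an illustration of the behavior described above. The function name and sample values are hypothetical, and it assumes push events expose commit author emails under payload.commits.

```python
def emails_from_events(events):
    """Collect every commit author email seen in a list of public events."""
    emails = set()
    for event in events:
        for commit in (event.get("payload") or {}).get("commits", []):
            email = (commit.get("author") or {}).get("email")
            if email:
                emails.add(email)
    return sorted(emails)


# Placeholder events: one push holding commits from two different authors.
SAMPLE_USER_EVENTS = [
    {"type": "PushEvent", "payload": {"commits": [
        {"author": {"name": "Owner", "email": "owner@example.invalid"}},
        {"author": {"name": "Guest", "email": "guest@example.invalid"}},
    ]}},
    {"type": "WatchEvent", "payload": {"action": "started"}},
]

if __name__ == "__main__":
    print(emails_from_events(SAMPLE_USER_EVENTS))
```

Both addresses come back even though only one belongs to the account owner, so treat the results as leads to verify rather than as attribution.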
Well, that was a bit of a bust. Hopefully you learned some techniques for pivoting through GitHub API data. Feel free to use the gitrax.py script for your own GitHub hunting adventures.