Removing Keys, Passwords and Other Sensitive Data from Old Github Commits on OSX
Everyone Makes this Mistake Once
When learning to write code, you often work with sensitive data when accessing accounts, APIs, or other applications. To access this data you need to give the server a key or a password to prove you have permission to make the request. Inevitably, at some point you will post your personal key, your application key or your password in one of your tracked GitHub files and not realize it until much later. By the time you recognize your mistake, either your account has been compromised or you have committed and pushed code several times since the original mistake. How can we comb through the history on GitHub to remove this information without completely deleting the repository? With the BFG Repo-Cleaner.
Step 1 — Backup Your Data and Remove Your Passwords!
I am just going to quote the BFG page since they put it about as clearly as possible:
“First clone a fresh copy of your repo, using the --mirror flag:
$ git clone --mirror git://example.com/some-big-repo.git
This is a bare repo, which means your normal files won’t be visible, but it is a full copy of the Git database of your repository, and at this point you should make a backup of it to ensure you don’t lose anything.”
Also make sure the keys/passwords you want removed have been taken out of all files in your repo, and that your most recent commit contains none of the strings you want replaced.
To quote the author of this product “The BFG treats you like a reformed alcoholic: you’ve made some mistakes in the past, but now you’ve cleaned up your act. Thus the BFG assumes that your latest commit is a good one, with none of the dirty files you want removing [sic] from your history still in it. This assumption by the BFG protects your work, and gives you peace of mind knowing that the BFG is only changing your repo history, not meddling with the current files of your project.”
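Since BFG leaves the latest commit alone, it is worth checking that commit yourself before going any further. Here is a minimal sketch of such a check using git grep. The function name head_is_clean is my own invention, not part of Git or BFG, and the strings you pass in are whatever secrets you are hunting for:

```shell
# Hypothetical helper (the name is illustrative): succeeds only if none of
# the given strings appear in the files of the latest commit (HEAD).
head_is_clean() {
  for secret in "$@"; do
    # -F treats the pattern as a fixed string, -q suppresses the matches,
    # and the trailing HEAD makes git grep search that commit's tree.
    if git grep -q -F -e "$secret" HEAD; then
      echo "still present in HEAD: $secret" >&2
      return 1
    fi
  done
}
```

Running head_is_clean snazzy_password abcd1234 from inside the repo prints nothing and returns success when HEAD is clean. If it complains, remove the string, commit, and try again.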
Step 2 — Install BFG (and Homebrew)
The easiest way to install BFG is via Homebrew. If you are using OSX and are a Rubyist then you should definitely have Homebrew. You can find the website and installation instructions here. For the purpose of this post, it is assumed you have Homebrew already installed and configured.
Once you have Homebrew installed, installing BFG is as easy as running a single command in your Terminal:
brew install bfg
Then you should see this response:
Great, now that you have backed up your data and installed BFG, we can begin to set up the strings we need removed from our code.
Step 3 — Building replacements.txt file
OK, so now we need to set up the file that contains the strings we want removed. Let’s look at an example of a replacement file as described by the author of BFG on Stack Overflow:
PASSWORD1 #Replace string 'PASSWORD1' with '***REMOVED***' (default)
PASSWORD2==>examplePass # replace with 'examplePass' instead
PASSWORD3==> # replace with the empty string
regex:password=\w+==>password= # Replace, using a regex
regex:\r(\n)==>$1 # Replace Windows newlines with Unix
So let’s show the first 3 in action. I created a repository and committed to it 4 times. There are 4 files, each with a few lines we want to replace and 1 line we want to leave alone.
We want to replace “snazzy_password” with the default “***REMOVED***”,
“abcd1234” with “replace_with_custom_string”,
“ringostarr” with nothing,
and “leave_this_alone” should go untouched.
You need to create a file in the root directory of your repo. I like to use the name replacements.txt.
***DO NOT ADD/COMMIT/PUSH THIS FILE!***
Here is my screenshot from Atom of the setup:
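For reference, the three rules described above can be written out as follows. This is a sketch that creates the file from the Terminal instead of an editor. The heredoc is just one convenient way to do it, and the rule syntax follows the Stack Overflow example quoted earlier:

```shell
# Write the three replacement rules, one per line.
# The quoted 'EOF' stops the shell from expanding anything inside the heredoc.
cat > replacements.txt <<'EOF'
snazzy_password
abcd1234==>replace_with_custom_string
ringostarr==>
EOF
```

Note that “leave_this_alone” does not appear in the file at all: BFG only touches strings that have a rule.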
Now that the replacement file is created, we are ready to run BFG to replace our text!
Step 4 — Run it.
While in the root directory of your repo, you can run the command:
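Assuming the rules file is named replacements.txt as in Step 3, the invocation would look roughly like the sketch below, which uses BFG's documented --replace-text option. The wrapper function scrub_history and its defaults are my own illustration, not something BFG ships:

```shell
# Hypothetical wrapper around BFG; the function name and defaults are illustrative.
# $1 = rules file (default: replacements.txt)
# $2 = path to the local repo (default: the current directory)
scrub_history() {
  # --replace-text rewrites every matching string in old commits
  # according to the rules file.
  bfg --replace-text "${1:-replacements.txt}" "${2:-.}"
}
```

Calling scrub_history with no arguments from the root of the repo is the same as typing bfg --replace-text replacements.txt . yourself.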
This is the result in the terminal:
This shows my 4 commits and which of them were ‘dirty’ and were therefore cleaned by BFG.
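Next you need to run the git commands BFG suggests and then push the result back to GitHub. From a mirror clone a plain git push is enough; from a regular clone the rewritten history no longer fast-forwards, so Git will reject the push unless you add --force.
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive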
$ git push
Now we check if our commits on github have changed:
Here we can see the first line is the default replacement, the second line is the custom replacement, the third line was not replaced and the fourth line was replaced with an empty string. This worked on all files and all commits. The picture above is the “old” side of the most recent commit. The “new” side is untouched, so again you have to make sure your most recent commit is clean.
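If you would rather confirm the result locally than click through commits on GitHub, Git's pickaxe search (git log -S) can do it. This is a rough sketch; the function name history_mentions is hypothetical:

```shell
# Hypothetical helper: list every commit reachable from any branch or tag
# whose changes add or remove the string given as $1.
history_mentions() {
  git log --all --oneline -S"$1"
}
```

After a successful BFG run, history_mentions snazzy_password should print nothing. Any commit it does list still contains (or removes) the string and needs another look.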
Other Uses and Conclusion
BFG can also be used to strip large files (like videos, images, or giant databases) from your repos. It works in almost exactly the same way, except that you pass it different flags when you run it in your terminal. For example, if we wanted to remove all files over 50MB we would use the command:
$ bfg --strip-blobs-bigger-than 50M my-repo.git
Here 50M is the size above which files are removed, and my-repo.git is the path to your local copy of the repo (for example the mirror clone from Step 1), not the SSH link to it on GitHub.
BFG is a great tool for cleaning your old repos, and according to its author it runs somewhere between 10 and 720 times faster than the git command git-filter-branch. This makes it the go-to choice for many programmers when they need to rewrite the history of their GitHub repositories. I have personally used it once to clear my secret API key from a personal project, and that is what inspired me to write this blog post. I suggest getting familiar with it in advance, so that when the time comes you can run it quickly and painlessly, without accidentally deleting extra information or failing to clear everything on your first try and having to go through the whole process again.
BFG github.io page : link
Github Repo: link