Migrating from a shared Git folder to GitHub

A while ago we bought a licence for GitHub Enterprise and we’re now using it for all our projects.
Before this we used Git for a lot of customer projects, but our origin was a simple shared drive on the network, so we needed to move all of these projects to GitHub. We had a lot of them, and this is where our adventure started.
Note: this is my write-up of an internal wiki post. It might be slightly out of date now and I’ve had to remember the parts I didn’t document well, so if something doesn’t work, or there is a slicker way of doing this now, please let me know!
Branches, branches everywhere
You use a neat branching strategy right? Git Flow is it? Lovely. Nice short lived feature branches? Fab.
Did our old projects? No. They look like this:
Do you need all those dozens of release branches? Unlikely — so let’s get rid of them.
Mirroring Repositories
We don’t want to start destroying things where everyone else might be changing code, so we’d best make a backup.
The correct way to take a backup is to clone the repository as a mirror. This article is ace and takes you through the different options; I’ve detailed what I did below.
Prerequisites
- You’re going to need a local empty folder for this. Let’s pretend you’ve created it here:
c:\temp\git
- You’ve asked people to stop pushing code to the repository you’re about to move
Steps
So this assumes that you’ve got the origin of your repositories sitting on a network drive — we’ll call it drive G and let’s assume you’ve got a customer called “CustomerA”:
In Git bash navigate to your empty folder and run:
$ cd /c/temp/git/
$ git clone --mirror /g/CustomerA/
This will get all the branches and tags that are available in the repository and replicate them into a new folder at:
c:\temp\git\CustomerA.git
Done! Ace! Next!
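If you want some reassurance that the mirror really did pick up everything, you could compare the refs in the two repositories. This is only a rough sketch: list_refs and same_refs are helper names I’ve made up for illustration, and the paths in the usage comment are the ones assumed in this walkthrough.

```shell
# Print every ref (branch, tag, etc.) with its commit id, sorted for comparison.
# $1 is the path to a repository (bare or not).
list_refs() {
  git -C "$1" for-each-ref --format='%(refname) %(objectname)' | sort
}

# Succeeds when both repositories expose exactly the same refs.
same_refs() {
  [ "$(list_refs "$1")" = "$(list_refs "$2")" ]
}

# Usage with the paths assumed in this article (adjust to your own):
# same_refs /g/CustomerA /c/temp/git/CustomerA.git && echo "mirror matches the source"
```

If someone pushes to the shared drive after you cloned, the two lists will differ, which is another reason to ask people to stop pushing first.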
Delete merged branches
If a branch has already been merged into master then we can safely delete it. This Stack Overflow article gives us our starting point, but as it’s littered with warnings from people I’ll break down what it’s telling us.
Prerequisites
You’ve followed the steps in Mirroring Repositories
Steps
1. Create a new folder at c:\temp\git\CustomerA and clone the new mirror into it:
$ cd /c/temp/git/CustomerA
$ git clone /c/temp/git/CustomerA.git
2. Select the master branch
3. Open the Git Bash shell, navigate to your repository and run:
git branch -r --merged
This will give you a list of all branches that have been merged into master. In theory these can be deleted, except…
You’ll probably see some long-lived branches that were branched off master and that you want to keep, like develop for example.
4. Run this:
git branch -r --merged | grep -v '\*\|master\|develop'
5. Look again at the list. Is there anything else you care about losing? No? Good. (If there is, add the name of the branch to the grep -v list and run it again.) Happy? OK…
6. Run this (don’t forget to add any other branches you included in the grep from step 5 above).
WARNING: The following deletes branches on the remote. Don’t do this directly on your original origin drive unless you are absolutely sure you know what you’re doing.
git branch -r --merged | grep -v '\*\|master\|develop' | sed 's/origin\///' | xargs -n 1 git push --delete origin
What’s going on here??
- sed strips the origin/ prefix from each branch name, because git push --delete expects the bare branch name rather than origin/<branch>. Details here: http://www.grymoire.com/Unix/Sed.html#uh-0
- xargs -n 1 takes the branch names that come out of the pipeline one at a time and runs git push --delete origin <branch> for each of them
You’re almost done now…
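If you’d like to see what that one-liner is going to do before letting it loose, here is a rough dry run. The branch names are made up and stand in for the output of git branch -r --merged, the grep is written with repeated -e options (which should behave the same as the \| form above), and xargs runs echo so the delete commands are only printed:

```shell
# Stand-in for the output of `git branch -r --merged` (hypothetical branch names).
sample_branches() {
  printf '%s\n' \
    '  origin/HEAD -> origin/master' \
    '  origin/develop' \
    '  origin/master' \
    '  origin/release/1.0' \
    '  origin/release/1.1'
}

# Filter out the branches to keep, strip the origin/ prefix, then print
# (not run) one delete command per remaining branch.
dry_run_deletes() {
  grep -v -e '\*' -e 'master' -e 'develop' |
    sed 's/origin\///' |
    xargs -n 1 echo git push --delete origin
}

sample_branches | dry_run_deletes
```

On the sample list this should print one git push --delete origin <branch> line for each release branch and nothing for master or develop. Once the printed commands look right for your real branch list, dropping the echo gives you the real thing.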
What’s left?
For a few customers I found half a dozen feature branches that I had to investigate manually. These were branches that had long since been abandoned and so could be safely deleted. If you’re really lucky you’ll just be left with the master and develop branches.
Once you’re happy it’s time to push all your hard work back to the origin and push the code to its new home on GitHub. What can possibly go wrong?
Well…..
You can’t push really large files to GitHub
We, as developers, are idiots. Well, we were idiots in the past and now we’re paying the price. When you push to GitHub you might get an error saying that files you are pushing are larger than 100MB and so can’t be added, and that they should be stored using Git Large File Storage (https://git-lfs.github.com/), so you’re going to need to do some work.
What’s the problem? — just remove the files
Well you can’t, because source control holds versions of those big files right the way through the history of the project, so you need to remove or change the file in every commit ever, in every branch you’re pushing. You can do this in Git but it’s hard and slow. Enter BFG Repo-Cleaner.
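Before reaching for any tooling it can help to know which files are the culprits. A minimal sketch, assuming a 100MB limit expressed in bytes; filter_large_blobs and list_large_blobs are hypothetical helper names, not part of Git or BFG:

```shell
# Read lines of the form "type size path" and print "size path" for every
# blob larger than the byte limit given as $1.
filter_large_blobs() {
  awk -v limit="$1" '$1 == "blob" && $2 > limit {
    size = $2
    sub(/^[^ ]+ [^ ]+ /, "")   # drop the type and size columns, keep the path
    print size, $0
  }'
}

# Walk every object reachable from any ref in the repository at $1 and
# report the blobs bigger than $2 bytes.
list_large_blobs() {
  git -C "$1" rev-list --objects --all |
    git -C "$1" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    filter_large_blobs "$2"
}

# Usage with the path assumed in this article (104857600 bytes = 100 x 1024 x 1024):
# list_large_blobs /c/temp/git/CustomerA.git 104857600
```

Because this looks at every object in the history, it will also find files that were deleted years ago but still live on in old commits, which are exactly the ones that trip up the push.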
BFG Repo-Cleaner
BFG says it:
“Removes large or troublesome blobs like git-filter-branch does, but faster”
It also can be used to purge things like passwords etc. from Git history.
Prerequisites
- You’ve followed the steps in Mirroring Repositories
- You’ve followed the steps in Delete merged branches
Steps
There are two ways to do this — the easy way or the “correct” way:
The easy way
You can do this if you are happy to delete all the files completely and forever from Git
- Download BFG
- Rename the download to bfg.jar
- Navigate to the folder above the mirrored repository you pulled down
- Copy bfg.jar to that location
- Run the following
java -jar bfg.jar --strip-blobs-bigger-than 100M <name-of-repo.git>
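One thing to be aware of: as far as I understand the BFG documentation, BFG rewrites the history but leaves the old blobs on disk until Git garbage-collects them, which is why the “correct” way runs reflog expire and gc. The same clean-up should apply here. A small sketch, where compact_repo is a name I’ve invented and the path in the usage comment is the one assumed in this article:

```shell
# Expire reflog entries and garbage-collect so that blobs no longer referenced
# by any commit are actually removed from disk.
# $1 is the path to the repository (the mirrored .git folder in this walkthrough).
compact_repo() {
  git -C "$1" reflog expire --expire=now --all &&
    git -C "$1" gc --prune=now --aggressive
}

# Hypothetical usage:
# compact_repo /c/temp/git/CustomerA.git
```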
The correct way
Use this method to convert all large files to Git Large File Storage so they can still be source controlled:
- Install Git large file storage
- Download BFG
- Rename the download to bfg.jar
- Navigate to the folder above the mirrored repository you pulled down
- Copy bfg.jar to that location
- Run the following
$ java -jar bfg.jar --convert-to-git-lfs '*.dmp' --no-blob-protection <name-of-repo.git>
$ java -jar bfg.jar --convert-to-git-lfs '*.zip' --no-blob-protection <name-of-repo.git>
$ cd <name-of-repo.git>
$ git reflog expire --expire=now --all && git gc --prune=now
$ git lfs install
The above example just moves .dmp and .zip files to large file storage. You can add as many file types as you need by running the BFG command once per type.
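If you have a long list of file types, a small loop saves some typing. This sketch only prints the BFG commands rather than running them, so you can check them first; print_lfs_commands is a made-up helper and bak is just an example extension, not something from our repositories:

```shell
# Print one BFG invocation per file extension; nothing is executed.
# $1 is the repository folder, the remaining arguments are extensions without the dot.
print_lfs_commands() {
  repo="$1"
  shift
  for ext in "$@"; do
    printf "java -jar bfg.jar --convert-to-git-lfs '*.%s' --no-blob-protection %s\n" "$ext" "$repo"
  done
}

print_lfs_commands CustomerA.git dmp zip bak
```

When the printed lines look right you can paste them into Git Bash, or pipe the output to sh if you’re feeling brave.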
Pushing to GitHub
Now we need to get the code up onto the repository:
- Create your new Organisation in GitHub Enterprise (<Organisation> in code snippet below)
- Add your new repository (<Repo Name> in code snippet below)
- Navigate to the directory you mirrored your repository into (so in this example):
$ cd /c/temp/git/CustomerA.git
- Run the following to push the code up
$ git push --mirror https://github.com/<Organisation>/<Repo Name>.git
If using git-lfs, you should see something like this:

Yay! It’s all done — go grab a tea.