Migrating from a shared Git folder to GitHub

Steve Newstead
Jul 20, 2017 · 5 min read

A while ago we bought a licence for GitHub Enterprise and we’re now using it for all our projects.

Before this we used Git for a lot of customer projects, but our origin was a simple shared drive on the network, so we needed to move all of these projects to GitHub. We had a lot of them, and this is where our adventure started.

Note: this is my write-up of an internal wiki post. It might be slightly out of date now, and I’ve had to remember the parts I didn’t document well. If something doesn’t work, or there is a slicker way of doing this now, please let me know!

Branches, branches everywhere

You use a neat branching strategy right? Git Flow is it? Lovely. Nice short lived feature branches? Fab.

Did our old projects? No. They had dozens of release branches.

Do you need all those dozens of release branches? Unlikely — so let’s get rid of them.

Mirroring Repositories

We don’t want to start destroying things where everyone else might be changing code, so we best make a backup.

The correct way to take a backup is to clone the repository as a mirror. This article is ace and takes you through the different options; I’ve detailed what I did below.

Prerequisites

  • You’re going to need a local empty folder for this — let’s pretend you’ve created this here:
c:\temp\git
  • You’ve asked people to stop pushing code to the repository you’re about to move

Steps

This assumes that the origin of your repositories sits on a network drive (we’ll call it drive G) and that you’ve got a customer called “CustomerA”.

In Git Bash, navigate to your empty folder and run:

$ cd /c/temp/git/
$ git clone --mirror /g/CustomerA/

This will fetch all the branches and tags in the repository and replicate them into a new folder at:

c:\temp\git\CustomerA.git
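
If you want a bit more reassurance that the mirror really has everything, you can compare the refs (branches and tags) in the two repositories. This is only a rough sketch: refs_match is a helper name I’ve made up for illustration, and the paths in the usage comment are the ones assumed in the example above.

```shell
# refs_match is a hypothetical helper: it succeeds when two repositories
# list exactly the same refs (branches, tags) pointing at the same objects.
refs_match() {
  src_refs=$(git -C "$1" for-each-ref) || return 2
  dst_refs=$(git -C "$2" for-each-ref) || return 2
  # An empty listing means something is off, so treat it as a mismatch.
  [ -n "$src_refs" ] && [ "$src_refs" = "$dst_refs" ]
}

# Usage with the example paths from this post:
#   refs_match /g/CustomerA /c/temp/git/CustomerA.git && echo "mirror looks complete"
```

If it fails, someone has probably pushed to the origin since you cloned, so it’s worth re-running the mirror before going any further.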

Done! Ace! Next!

Delete merged branches

If a branch has already been merged into master then we can safely delete it. This Stack Overflow answer gives us our starting point, but as it’s littered with warnings from people I’ll break down what it’s telling us.

Prerequisites

You’ve followed the steps in Mirroring Repositories

Steps

  1. Create a new folder at c:\temp\git\CustomerA and clone the new mirror into it:
$ cd /c/temp/git/CustomerA
$ git clone /c/temp/git/CustomerA.git

2. Select the master branch

3. Open the Git Bash shell, navigate to your repository and run:

git branch -r --merged

This will give you a list of all branches that have been merged into master. In theory these can all be deleted, except...

You’ll probably see some long-lived branches cut from master that you want to keep, like develop for example.

4. Run this:

git branch -r --merged | grep -v '\*\|master\|develop'

5. Look again at the list. Is there anything else you care about losing? No? Good. (If there is, add the name of the branch to the grep -v list and run it again.) Happy? Ok…

6. Run this (don’t forget to add any other branches you included in the grep in step 5 above).

WARNING: the following deletes branches on the remote. Don’t do this directly on your original origin drive unless you are absolutely sure you know what you’re doing.

git branch -r --merged | grep -v '\*\|master\|develop' | sed 's/origin\///' | xargs -n 1 git push --delete origin

What’s going on here??

  • sed strips the origin/ prefix from each branch name, because git push --delete wants the bare branch name rather than origin/branch-name. Details here: http://www.grymoire.com/Unix/Sed.html#uh-0
  • xargs -n 1 takes the output from the sed one branch name at a time and runs git push --delete origin once for each of them (I’m not a UNIX guy…)
  • You’re almost done now….
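
If you’d rather see exactly what that pipeline will do before letting it loose, a dry run helps. The sketch below feeds a made-up branch list through the same grep and sed, but puts echo in front of git push so it only prints the commands instead of running them. The branch names are hypothetical and just stand in for what git branch -r --merged might print.

```shell
# Sample of what `git branch -r --merged` might print (hypothetical names).
branches='  origin/HEAD -> origin/master
  origin/develop
  origin/master
  origin/release/1.0
  origin/feature/old-login'

# Same filter and substitution as the real command, but with `echo` in
# front of `git push`, so nothing is deleted: it only prints the commands.
dry_run=$(printf '%s\n' "$branches" \
  | grep -v '\*\|master\|develop' \
  | sed 's/origin\///' \
  | xargs -n 1 echo git push --delete origin)

printf '%s\n' "$dry_run"
# prints:
#   git push --delete origin release/1.0
#   git push --delete origin feature/old-login
```

Against a real repository you would swap the sample text for git branch -r --merged and only take the echo out once the printed commands look right.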

What’s left?

For a few customers I looked at, I found half a dozen feature branches that I had to investigate manually. These were branches that had long since been abandoned and so could be safely deleted. If you’re really lucky you’ll just be left with the master and develop branches.
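
To help with that manual investigation, it can be handy to see when each leftover branch was last touched and by whom. This is a sketch rather than what I ran at the time: list_remote_branches_by_age is a made-up helper name, and it assumes the remote is called origin as in the steps above.

```shell
# list_remote_branches_by_age is a hypothetical helper: prints every
# remote-tracking branch of the repository in $1, oldest commit first,
# with the date and author of the last commit on it.
list_remote_branches_by_age() {
  git -C "$1" for-each-ref --sort=committerdate \
    --format='%(committerdate:short) %(refname:short) %(authorname)' \
    refs/remotes/origin
}

# Usage: pass the path of the working clone you created in step 1, e.g.
#   list_remote_branches_by_age /path/to/your/clone
```

Branches at the top of the list with dates from years ago are the likely candidates for "abandoned", but it’s still worth asking the last author before deleting.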

Once you’re happy, it’s time to push all your hard work back to the origin and push the code to its new home on GitHub. What can possibly go wrong?

Well…..

You can’t push really large files to GitHub

We, as developers, are idiots. Well, we were idiots in the past and now we’re paying the price. When you push to GitHub you might get an error saying that files you are pushing are over 100MB and so can’t be added, and that they should be stored using Git Large File Storage (https://git-lfs.github.com/), so you’re going to need to do some work.
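
Rather than waiting for the push to fail, you can go looking for the offenders first. The sketch below is one possible way to do it with plain Git commands; list_big_blobs is a made-up helper name, and the 100MB threshold in the usage comment is the limit from the error message above.

```shell
# list_big_blobs is a hypothetical helper: prints "<size-in-bytes> blob <path>"
# for every file version anywhere in the history of repository $1 that is
# larger than $2 bytes.
list_big_blobs() {
  git -C "$1" rev-list --objects --all \
    | git -C "$1" cat-file --batch-check='%(objectsize) %(objecttype) %(rest)' \
    | awk -v min="$2" '$2 == "blob" && $1 > min'
}

# Usage with the mirror from the example above and a 100MB threshold:
#   list_big_blobs /c/temp/git/CustomerA.git $((100 * 1024 * 1024))
```

Note that this walks the whole history, so it will also report files that were deleted long ago but still live in old commits.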

What’s the problem? — just remove the files

Well, you can’t, because source control holds versions of those big files right the way through the history of the project, so you need to remove or change the file in every commit ever, in every branch you’re pushing. You can do this in Git but it’s hard and slow; enter BFG Repo-Cleaner.

BFG Repo-Cleaner

BFG says it:

“Removes large or troublesome blobs like git-filter-branch does, but faster”

It also can be used to purge things like passwords etc. from Git history.

Prerequisites

  • You’ve followed the steps in Mirroring Repositories
  • You’ve followed the steps in Delete merged branches

Steps

There are two ways to do this — the easy way or the “correct” way:

The easy way

You can do this if you are happy to delete all the files completely and forever from Git.

  • Download BFG
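  • Rename the download to bfg.jar
  • Navigate to the folder above the mirrored repository you pulled down
  • Copy bfg.jar to that location
  • Run the following, then cd into the repository and run the git reflog expire and git gc commands shown under the correct way below, because BFG rewrites the history but leaves the old blobs on disk until Git garbage-collects them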

java -jar bfg.jar --strip-blobs-bigger-than 100M <name-of-repo.git>

The correct way

Use this method to convert all large files to Git Large File Storage so they can still be source controlled:

  • Install Git large file storage
  • Download BFG
  • Rename the download to bfg.jar
  • Navigate to the folder above the mirrored repository you pulled down
  • Copy bfg.jar to that location
  • Run the following

$ java -jar bfg.jar --convert-to-git-lfs '*.dmp' --no-blob-protection <name-of-repo.git>

$ java -jar bfg.jar --convert-to-git-lfs '*.zip' --no-blob-protection <name-of-repo.git>

$ cd <name-of-repo.git>

$ git reflog expire --expire=now --all && git gc --prune=now

$ git lfs install

The above example just converts .dmp and .zip files to large file storage; you can add as many file types as you need here.

Pushing to GitHub

Now we need to get the code up onto the repository:

  • Create your new Organisation in GitHub Enterprise (<Organisation> in code snippet below)
  • Add your new repository (<Repo Name> in code snippet below)
  • Navigate to the directory you mirrored your repository into (so in this example):
$ cd /c/temp/git/CustomerA.git
  • Run the following to push the code up

$ git push --mirror https://github.com/<Organisation>/<Repo Name>.git

If using git-lfs, the push output should also show the LFS objects being uploaded.

Yay! It’s all done — go grab a tea.
