Converting a directory-per-version Subversion repository into a Git one in Git style

Sung-Yu Chen
Aug 29, 2017

I recently had to convert a Subversion repository into a Git one. It sounded easy at first: run git-svn and be done. But when I examined the repository, it was not as simple as I expected.

Subversion, but not Subversion

In the repository, each version is placed in its own directory:

project/
1.0/
1.1/
1.2/
...

Scanning through the history shows that most commits added a directory for a specific version:

Adds version 1.2

That’s OK. We have git filter-branch to handle it. What made things worse was that some commits added multiple versions:

Adds version 1.9, 2.0, 2.2

Worse still, some versions were “downgraded”:

Downgrade version 2.2 as 2.1

And then a commit followed that added the same version number again:

Adds version 2.2

Of course, new directories and directory renames between versions happened too.

Tools

Move a directory to repository root

Use mv to place the files of the first version at the repository root.

git filter-branch -f --tree-filter 'mv -f 1.0/* ./' HEAD

Move files in a directory to repository root

Use git ls-files to move the tracked files one by one, recursively. A plain mv would have to be run directory by directory, because the directory structure already exists at the root.

setenv SHELL /bin/bash # Let git use bash if the shell is tcsh
git filter-branch -f --tree-filter \
'for f in `git ls-files 1.1`; do mv -f $f ${f##1.1/}; done' \
HEAD
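
The loop above can be made a little more forgiving. The following is a minimal sketch, not the method used in this post: it runs the same move logic on a scratch directory so that it can be tried without Git. As an assumption made only for the demo, find stands in for git ls-files, and the file names are made up. It adds mkdir -p so nested directories are created on demand, and it quotes every path so names with spaces survive.

```shell
#!/bin/sh
# Simulate the tree-filter body on a scratch directory (no Git needed).
# Assumption: `find` stands in for `git ls-files`; file names are invented.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p 1.1/tests
echo main > 1.1/main.c
echo 'a test' > "1.1/tests/test one.c"   # nested path containing a space

find 1.1 -type f | while IFS= read -r f; do
    dest=${f#1.1/}                     # strip the version prefix
    mkdir -p "$(dirname "$dest")"      # create missing parent directories
    mv -f "$f" "$dest"
done

ls -R "$work"
```

Inside a real tree-filter, the find call would presumably be replaced by git ls-files 1.1, which lists the tracked files under that directory.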

Add a new sub-directory in a commit

This handles a directory that was added in a new version.

git filter-branch -f --tree-filter \
'mv 1.2/new_directory new_directory' one_previous_commit..HEAD

Remove a directory in a commit

This handles a directory that was removed in a new version.

git filter-branch -f --tree-filter \
'rm -rf directory_to_remove' one_previous_commit..HEAD

Split a multi-version commit

This splits a multi-version commit into several commits, one commit per version.

git rebase -i one_previous_commit
# in vim
edit ...
pick ...
pick ...
# stop at the commit to split
git reset HEAD~1
git add first_version
git commit -s -m 'Add first version'
git add second_version
git commit -s -m 'Add second version'
# ... and so on
git rebase --continue

Update commit date

Splitting commits with rebase generates new commits with new commit dates. To restore the original dates, use env-filter on those commits. Note that $GIT_COMMIT holds the full commit hash, so compare it against the full hash.

git filter-branch -f --env-filter \
'if [ "$GIT_COMMIT" = "<TARGET_COMMIT_HASH>" ]; then export GIT_AUTHOR_DATE="<TARGET_AUTHOR_DATE>"; export GIT_COMMITTER_DATE="<TARGET_COMMITTER_DATE>"; fi'
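
Dates usually contain spaces, so the values in the filter need quoting. As a hedged sketch, the helper below (build_env_filter is a hypothetical name, and the hash and dates are placeholders) prints an env-filter body with the quoting already in place. It only generates text and does not touch any repository.

```shell
#!/bin/sh
# Print an env-filter body that restores both dates on one commit.
# build_env_filter, the hash and the dates are illustrative placeholders.
build_env_filter() {
    commit=$1
    author_date=$2
    committer_date=$3
    cat <<EOF
if [ "\$GIT_COMMIT" = "$commit" ]; then
    export GIT_AUTHOR_DATE="$author_date"
    export GIT_COMMITTER_DATE="$committer_date"
fi
EOF
}

filter=$(build_env_filter 0123abc \
    "2017-08-01 10:00:00 +0800" "2017-08-01 10:05:00 +0800")
printf '%s\n' "$filter"
# Real use would look like:
#   git filter-branch -f --env-filter "$filter" HEAD
```

The original dates can be read before the split with git log -1 --format='%ai %ci' followed by the commit hash; %ai is the author date and %ci the committer date.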

Recipe

Assume the commits in the repository are as follows:

1 Version 1.0
2 Version 1.1 (adds tests)
3 Version 1.2
4 Version 1.3
5 Version 1.4 (removes tests)
6 Version 1.5, 1.6, 2.0
7 Version 2.1

Assume we have already converted the Subversion repository into a Git one via git svn. For simplicity, we also pretend that commit hashes do not change in what follows; in reality they change every time filter-branch or rebase rewrites history. So look up the actual commit hash again whenever a command below refers to one.

First we move files in 1.0 to the root directory:

git filter-branch -f --tree-filter 'mv -f 1.0/* ./' HEAD

Next, move the files in 1.1:

git filter-branch -f --tree-filter \
'for f in `git ls-files 1.1`; do mv -f $f ${f##1.1/}; done' \
HEAD

The above cannot move the files in 1.1/tests to tests, because the directory tests does not exist yet. Move it manually:

git filter-branch -f --tree-filter \
'mv 1.1/tests tests' 1..HEAD

The range 1..HEAD covers the commits after 1 (i.e. from 2) up to HEAD.

The range is required because the directory does not exist before commit 2, so the mv would end with an error there.
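
Before running a filter over a range, it can help to check which commits the range covers. The sketch below is an illustration under stated assumptions: it builds a throwaway repository with three empty commits (the identity and messages are made up) and lists first..HEAD, which leaves out the first commit itself.

```shell
#!/bin/sh
# Build a throwaway repository with three empty commits, then list a range.
# The user name, e-mail and commit messages are invented for the demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
for n in 1 2 3; do
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "commit $n"
done

first=$(git rev-list --max-parents=0 HEAD)   # the root commit, i.e. "1"
# Commits reachable from HEAD but not from $first, oldest first.
range_subjects=$(git log --reverse --format=%s "$first..HEAD")
printf '%s\n' "$range_subjects"
```

In the recipe, the same check would be git log --oneline 1..HEAD, with 1 replaced by the real hash.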

Continue with 1.2, 1.3, and 1.4:

git filter-branch -f --tree-filter \
'for f in `git ls-files 1.2`; do mv -f $f ${f##1.2/}; done' \
HEAD
git filter-branch -f --tree-filter \
'for f in `git ls-files 1.3`; do mv -f $f ${f##1.3/}; done' \
HEAD
git filter-branch -f --tree-filter \
'for f in `git ls-files 1.4`; do mv -f $f ${f##1.4/}; done' \
HEAD
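
Typing the same command once per version is easy to get wrong. One possible way around that, shown here only as a sketch, is to generate the commands from a list of versions. The function name flatten_commands is hypothetical; it prints the commands and runs nothing, so the output can be reviewed first.

```shell
#!/bin/sh
# Print (do not run) one filter-branch command per version directory.
# flatten_commands is an illustrative helper, not part of Git.
flatten_commands() {
    for v in "$@"; do
        printf "git filter-branch -f --tree-filter 'for f in \`git ls-files %s\`; do mv -f \$f \${f##%s/}; done' HEAD\n" "$v" "$v"
    done
}

flatten_commands 1.2 1.3 1.4
```

Once the printed commands look right, they could be pasted into the shell or piped to sh from inside the repository.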

Remove tests since 1.4:

git filter-branch -f --tree-filter \
'rm -rf tests' 4..HEAD

Commit 6 has three versions, so we would like to split it into three.

git rebase -i 5
# in vim
edit 6
pick 7
# save and exit vim
# stopped at 6
git reset HEAD~1
git add 1.5
git commit -s -m 'Version 1.5'
git add 1.6
git commit -s -m 'Version 1.6'
git add 2.0
git commit -s -m 'Version 2.0'
git rebase --continue

The resulting commit history:

1 Version 1.0
2 Version 1.1 (adds tests)
3 Version 1.2
4 Version 1.3
5 Version 1.4 (removes tests)
6 Version 1.5
7 Version 1.6
8 Version 2.0
9 Version 2.1

Move files in the remaining commits:

git filter-branch -f --tree-filter \
'for f in `git ls-files 1.5`; do mv -f $f ${f##1.5/}; done' \
HEAD
git filter-branch -f --tree-filter \
'for f in `git ls-files 1.6`; do mv -f $f ${f##1.6/}; done' \
HEAD
git filter-branch -f --tree-filter \
'for f in `git ls-files 2.0`; do mv -f $f ${f##2.0/}; done' \
HEAD
git filter-branch -f --tree-filter \
'for f in `git ls-files 2.1`; do mv -f $f ${f##2.1/}; done' \
HEAD

Commit 6 lost its original commit date and time during the split. Restore them with env-filter:

git filter-branch -f --env-filter \
'if [ "$GIT_COMMIT" = "6" ]; then export GIT_AUTHOR_DATE="<OLD_AUTHOR_DATE_OF_6>"; export GIT_COMMITTER_DATE="<OLD_COMMITTER_DATE_OF_6>"; fi'

Of course, we could assign artificial dates to the newly split commits, 7 and 8, but that would not mean much. So we leave them untouched to record when the commits were split.

Furthermore, you can update the author and the committer if you are not the one who made the commits in Subversion. If needed, use the following environment variables:

GIT_AUTHOR_NAME
GIT_AUTHOR_EMAIL
GIT_AUTHOR_DATE
GIT_COMMITTER_NAME
GIT_COMMITTER_EMAIL
GIT_COMMITTER_DATE

Finally, we have a Git repository in the Git style.

Conclusion

We often need to convert legacy Subversion repositories into Git ones. Normally the conversion is easy with a git svn clone command. Sometimes that conversion alone is not enough, because the Subversion users did not use it in the Subversion way. Thanks to git filter-branch, we can still reshape such repositories into the Git style.
