Cleaning up commit history with git rebase

Have you ever seen a commit history something like this?

(please excuse the colour scheme)

Well I have, and it sucks…

Reading bottom to top: we started off “okay” with the obligatory “initial commit” commit. The feat and fix prefixes that follow are a cute touch; however, by the time we get to the deploy code to staging section I can see the frustration starting to unfold. Ten commits into our project and already there is no real consistency. It’s fine if your code doesn’t work the way you want the first time, and yes, you might get frustrated when it doesn’t (which is also normal), but we should really not be exhibiting this in our Git commit history.

How do we go about avoiding this?

We could just avoid making any mistakes ever, or we could be really creative with our commit messages to mask the reality that the current branch is a disaster that may or may not be fixed at some point. Neither of these options is ideal.

There is a third (and much better) option. We can change history, purely for the sake of clarity in our commit messages.

Why change history when you can just alter the present to suit you?

There is one main reason: clarity, which takes the form of grouping features in a logical way as well as writing better, clearer, more consistent commit messages.

Why do we care about clarity?

  • Isolated features can be rolled out more easily as related commits are grouped together.
  • Control over commit history, combined with the more traditional use of git rebase (rebasing specific commits onto other branches, a topic that I won’t be getting into right now; more information can be found here), is a powerful tool to have in our development arsenal.
  • Less work when describing a pull request. Clear commit messages give you and your reviewer an additional reference to what was done, and when, which makes life easier for both of you in the long run.
  • Establishes good habits when building features: when prioritising clean commit messages we start adding features in a more consistent fashion, grouping pieces of relevant code together by default (as we know that if we don’t, we may need a rebase) and ultimately only using git rebase when necessary. I have found that since starting this approach to my workflow I rebase less and less, because my commit messages are clearer to start with.

A practical example of rebasing

For this entire section I am going to assume that you have Git installed and available in your terminal / cmd, that you understand some basic Git commands such as adding / committing / pushing, and that you can make file edits in your command line editor (usually VIM).

Some important points before diving into rebasing

Team members’ buy in

It’s really important that your team understands you will be rebasing in a project, as caution needs to be taken by them when working with your branches.

Before team members merge or branch off anything that you are currently working on, they should confirm with you that you aren’t in the middle of a rebase. Likewise, if you push a rebased branch to a remote branch that another team member has pushed to in the meantime, and their new changes aren’t in your local copy, those changes will likely be lost. Because of this, I usually keep my own branches separate and only push a rebased branch if I know that I am the only person working on it. I can always create temporary branches that can be merged back into feature branches at a later stage if needed. This approach can be summarised as the “Golden Rule” of rebasing.

Keeping track of what has and hasn’t been done

I would suggest using some sort of Kanban system such as Jira, Trello or GitHub Projects, with tasks/cards that map to features or pull requests when working on feature branches, mainly to track what has and hasn’t been done. You ideally don’t want to find yourself in a situation where important work has been lost due to rebasing, so having another method outside of Git to track your changes (even if at a high level) can be helpful. Likewise, if you intentionally need to revert work for the purpose of refactoring, it is important to keep track of those changes outside of version control, as Git does not keep a concise, shareable record of what a rebase changed (the reflog is local to your machine).

Cleaning up descriptions

In the problematic commit log screenshot seen previously in this article we notice there are 10 commits in total. Depending on the situation, I may want to group these commits differently, which means rewording, squashing or dropping commits.

All three will require us to run the following in our terminal:

git rebase -i HEAD~9

Breaking this command down:

git rebase: tells our terminal we are running Git with the rebase command

-i: tells git rebase to run in interactive mode, which opens the list of commits in your configured editor (VIM in my case)

HEAD~9: we want the ability to rebase the last 9 commits (we don’t want to change our initial commit message).
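To check which commits HEAD~9 will offer before opening the editor, something like the following can be tried in a throwaway repository. This is only a sketch: the repository, file names and commit messages are placeholders invented for illustration, not the history from the screenshot.

```shell
set -e
# Throwaway repository with 10 placeholder commits (names are illustrative only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

for n in 1 2 3 4 5 6 7 8 9 10; do
  echo "change $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "commit $n"
done

# HEAD~9 is the commit nine steps back (here, the very first commit).
# The range HEAD~9..HEAD is everything after it: the commits the rebase will list.
git --no-pager log --oneline HEAD~9..HEAD
selected=$(git rev-list --count HEAD~9..HEAD)
echo "commits offered for rebase: $selected"
```

With 10 commits in total, the range contains 9 of them and leaves the first commit untouched, which matches the intent described above.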

After running this we see a file open in VIM (or whatever your default command line editor is) that lists our commits as well as some options that we can apply to each of them. The default option is pick, meaning that we keep the commit as is.
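If you are not sure which editor Git will open, a quick check along these lines may help. It is a minimal sketch in a throwaway repository; nano is an arbitrary choice made for the example, not a recommendation from this article.

```shell
set -e
# Throwaway repository so that no real configuration is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# The GIT_EDITOR environment variable would take precedence over the config
# value, so clear it for this demonstration.
unset GIT_EDITOR

# Set the editor for this repository only (nano is just an example value).
git config core.editor "nano"

# Ask Git which editor it would launch for commit messages. The rebase todo
# list uses the same editor unless sequence.editor or GIT_SEQUENCE_EDITOR is set.
editor=$(git var GIT_EDITOR)
echo "Git will open: $editor"
```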

I am only going to be covering the drop, reword and fixup options in this article.

drop

For the sake of example, I have decided that the final commit, “FFS”, is not needed: I discovered that “I hope this one works” actually fixed the problem, but I didn’t realise it at the time. I am therefore going to drop the “FFS” commit.

We can do this by replacing pick next to FFS with drop, then saving and exiting the file.

If we have a look at our Git log now (git log), we will see that both the commit message and any code associated with FFS no longer exist. Congratulations, we have changed history.
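The drop step can also be rehearsed safely in a throwaway repository. The sketch below is an assumption-laden stand-in for the manual edit: it uses the GIT_SEQUENCE_EDITOR environment variable to let sed change pick to drop in the todo list, and the three commits and their files are placeholders rather than the real history.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Three placeholder commits, each touching its own file so nothing conflicts.
n=0
for msg in "initial commit" "I hope this one works" "FFS"; do
  n=$((n + 1))
  echo "$msg" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "$msg"
done

# Stand-in for editing by hand: on the todo line ending in " FFS",
# replace the leading "pick" with "drop".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/ FFS\$/s/^pick/drop/'" git rebase -i HEAD~2

# Both the FFS commit message and the file it added are gone.
git --no-pager log --format=%s
```

In day-to-day use you would make the same change by hand in the editor; scripting it here just makes the result easy to reproduce.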

fixup and rewording

I like to use fixup when I want to group a few commits together under a single commit, as either they are all doing the same thing, or I can reword a group of features in a better way that makes more sense under a single commit message.

Let’s go back into interactive rebase mode by running

git rebase -i HEAD~8

(notice that we only have 8 commits to rebase now, as we removed FFS from history)

I have taken the liberty of doing some rebase file changes:

Lines 1 and 2 make sense to be their own commit messages, and they were clear enough to understand and intentionally added to describe the project’s state.

Lines 3 and 4 were mistakes, so I have decided to run fixup on them. This essentially squashes their code into the closest previous commit, which in this case is feat: added api endpoint for adding blogpost, and discards their commit messages (the squash command is similar to fixup, except that it also keeps the squashed commits’ messages).

I am folding lines 6, 7 and 8 into line 5 with the fixup command, so that together they become a single commit called deployment.

I don’t like the wording of line 5, so I am going to reword this to something that encapsulates this commit (as well as all of the later commits that are being squashed into this one) in a better way. Let’s save and exit the file, which will bring up a new editor instance, where it will ask us what we want to reword the commit message of line 5 to. If we wanted to do rewording on more than one commit in our interactive rebase, then we would be prompted multiple times. Update the commit message then save and close VIM.
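To illustrate the difference between squash and fixup mentioned above, here is a small sketch in a throwaway repository. The fix: typo commit is hypothetical, and GIT_EDITOR=true is used only to accept the combined message that Git proposes without opening an editor.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Helper: one commit per file, so replaying commits never conflicts.
n=0
commit() {
  n=$((n + 1))
  echo "$1" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "$1"
}

commit "initial commit"
commit "feat: added api endpoint for adding blogpost"
commit "fix: typo"   # hypothetical follow-up commit

# Mark the second todo line as squash; "true" accepts the proposed message as is.
GIT_EDITOR=true \
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/squash/'" \
  git rebase -i HEAD~2

# The combined commit keeps both messages; fixup would have kept only the first.
git --no-pager log -1 --format=%B
```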

Let’s run git log again to check that everything we expected did indeed happen:

Hooray! It looks like our commits and messages have been cleaned up.
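The fixup and reword steps above can be rehearsed end to end in a throwaway repository. What follows is a sketch under stated assumptions: the six commits are a shortened, partly invented version of the history (the two fix: typo commits are hypothetical), sed stands in for editing the todo list by hand, and a tiny helper script stands in for typing the new message in the editor.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Helper: one commit per file, so replaying commits never conflicts.
n=0
commit() {
  n=$((n + 1))
  echo "$1" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "$1"
}

commit "initial commit"
commit "feat: added api endpoint for adding blogpost"
commit "fix: typo"         # hypothetical mistake
commit "fix: typo again"   # hypothetical mistake
commit "deploy code to staging"
commit "I hope this one works"

# Stand-in for typing the reworded message: overwrite the message file
# that Git passes as the first argument.
msg_editor=$(mktemp)
cat > "$msg_editor" <<'EOF'
echo "deployment" > "$1"
EOF

# Todo lines 2-3: fixup into the feat commit. Line 4: reword. Line 5: fixup into line 4.
GIT_EDITOR="sh $msg_editor" \
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,3s/^pick/fixup/' -e '4s/^pick/reword/' -e '5s/^pick/fixup/'" \
  git rebase -i HEAD~5

git --no-pager log --format=%s
```

Six commits collapse into three, while every file added along the way is still present, because fixup discards messages and keeps the code.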

NB: When pushing back to a remote Git branch you will need to use the force flag (git push origin master -f) to tell Git that you really do want to overwrite the remote branch with your rewritten history. Git will not let us do this without acknowledging that we know what we are doing. Please use it with caution: if any problems around rebasing are going to occur (especially when more than one person is working on the same branch), this is the step where they will likely happen.
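As a hedged aside, Git also offers --force-with-lease, a more cautious variant of the force flag that refuses to overwrite the remote branch if it has moved since you last fetched or pushed. The sketch below is not part of the workflow described in this article: it uses a local bare repository as a pretend remote, and an amended commit stands in for a rebase.

```shell
set -e
work=$(mktemp -d)

# A local bare repository plays the role of the remote.
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/remote.git"
branch=$(git symbolic-ref --short HEAD)   # master or main, depending on your setup

echo "v1" > app.txt
git add app.txt
git commit -q -m "FFS"
git push -q origin "$branch"

# Rewrite the commit that was already pushed (amend stands in for a rebase here).
git commit -q --amend -m "fix: staging deployment"

# A plain push is rejected because local and remote history have diverged.
if git push -q origin "$branch" 2>/dev/null; then
  echo "unexpected: plain push succeeded"
else
  echo "plain push rejected"
fi

# --force-with-lease overwrites the remote branch only if it still points
# where our remote-tracking ref says it does.
git push -q --force-with-lease origin "$branch"
git --no-pager log --format=%s "origin/$branch"
```

If a team member had pushed to the branch in the meantime, the final push would be refused instead of silently discarding their work.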

Conclusion

Rebase is a powerful tool that should be used with caution, but it can be used effectively in the situation described in this article. If you are thinking of using this method of rebasing in your own projects, you should consider consulting the official documentation: https://git-scm.com/book/en/v2/Git-Branching-Rebasing.