Level up git! Rebasing and Squashing

Intro

This article is for those who are comfortable with basic git operations and concepts like staging, committing, pushing, pulling, and merging, but who still find themselves struggling through the occasional merge conflict, or creating embarrassingly long commit histories (“fix typo”, “actually fix typo”, “remove unintentionally committed changes”), or feeling unsure about what their branch history looks like after a merge, or two, or three. Of course, those who have it all under control and just want to learn more about what git offers are welcome too!

In this article we’ll be talking about rebasing and squashing. Rebasing can be used instead of merging to bring in changes from master or another branch and results in a much cleaner and more intuitive branch history. Squashing is the process of taking two or more existing commits and combining them into one commit. This ultimately allows you to change the way you think and talk about commits — you can talk about “the” commit where a bug was fixed or “the” commit where a feature was introduced, but perhaps more importantly for right now it’ll let you clean up those embarrassing commit histories of trying a million times to get the build to pass on the server.

Rebasing theory

What is rebasing? The git documentation states that it is “reapplying commits on top of another base tip”. What does that mean? Let’s bring in some visuals. Let’s say you have a branch/repo that looks like this:

You branched off of master some time ago and made a few commits, and there have been a couple commits to master since you created your branch. If you rebase your branch onto master (and note that with rebasing we usually say “onto”, as opposed to merging in which we usually merge “from”), your branch/repo will now look like this:

So the 3 commits from the hotfix branch were “reapplied” on top of “another base tip”, which in this case is the head of the master branch, as opposed to the previous “base tip”, which was 2 commits behind the head of the master branch. Frankly “base tip” is a bit of a strange phrase, and it might make more sense if you think of rebasing as “re-parenting”, because you’re changing the parent of either one commit or a series of commits. “Reapplying” in this case means that we take the diff between the first commit of hotfix and its original “base tip” and apply it to the new “base tip”, then do the same for each following commit on top of the newly created one.

Another way to think about rebasing is that when you’ve rebased a branch, it’s as if you had just now created your branch off of this latest master and done all of your work on top of it. I personally find it much easier to think about my branches when they’re in this state, as opposed to when I’ve merged changes from master and created a sort of “railroad track” in my commit history:

I find rebasing especially helpful when there are merge conflicts. When you merge, you have to solve the conflicts in the merge commit, so when you look through your history to find the sum total of changes needed to implement your feature or fix your bug, you need to look through both the original commits and the merge commit. When you rebase, you still have to solve the same conflicts, but you solve them in the commits as they’re being re-applied (if the rebase process detects a conflict, it’ll stop and let you resolve it before continuing). When you go back to look at the diffs of the individual commits, they now make sense when you’re comparing them to master, as opposed to having some lines missing because they’re added in a future merge commit.

Rebasing practice

OK enough theory, let’s bust out our terminals and crank out a few commands. For these exercises, feel free to either make a test repo or go ahead and do them within a repo you frequently work with. We won’t be making any destructive changes.

Let’s start with creating a branch a couple of commits behind master. You can do this with one command: git checkout -b mytestbranch master~2 (the branch name is up to you).

The ~2 tells git to create the branch not at master, but 2 commits behind.

Now let’s make some changes to the repo and make a commit.

Any changes are fine. You might want to avoid changing a file that has a high chance of having a merge conflict so as to avoid dealing with merge conflicts for your first rebase, but it’s up to you. Doing something straightforward like adding a file or editing the README is fine.

Now that you’ve got some changes in your branch, let’s rebase it onto master, but first let’s run git log --oneline -2 and note the output. On my machine it looks like this:

So we’ve got the commit I just created, 00e444061, in which I added a new file, and its parent commit, bb25634af, which is 2 commits behind master. Now to rebase onto the latest master we’ll run git rebase master.

And that’s it. Congratulations on your first rebase! Let’s look at the new output of git log:

You’ll notice that both commits have changed. The second commit is 3a29b1c32, which is the tip of my master branch, and the first commit is 1bb3ef7c1. When rebasing your commit the hash will change because you are changing the parent of the commit, which is part of the information used to compute the commit hash.
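If you’d rather watch the whole exercise run start to finish before trying it in a real repo, here is a minimal sketch that replays it in a throwaway repository. The temp directory, file names, commit messages and the name/email are all made up for the example; only the git checkout -b, git log and git rebase steps mirror what we did above.

```shell
set -e
# Throwaway repository (every name below is illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "you@example.com"
git config user.name "Your Name"

# Three commits on master so that master~2 exists.
for n in 1 2 3; do
  echo "line $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "master commit $n"
done

# Branch two commits behind master and add one commit there.
git checkout -q -b mytestbranch master~2
echo "hello" > newfile.txt
git add newfile.txt
git commit -q -m "Add a new file"

echo "Before the rebase:"
git log --oneline -2
before=$(git rev-parse HEAD)

# Reapply the branch's commit on top of the tip of master.
git rebase -q master

echo "After the rebase:"
git log --oneline -2
after=$(git rev-parse HEAD)
```

The two hashes stored in before and after differ, and the parent of the rebased commit is now the tip of master.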

Go ahead and practice a bit more. Maybe try to create a merge conflict on purpose so that you can see how to deal with merge conflicts when doing a rebase.
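Here is one way to set up that on-purpose conflict, again as a sketch in a throwaway repository with made-up file names and messages. Both branches rewrite the same line of notes.txt, so the rebase has to stop. GIT_EDITOR=true is only there so the script doesn’t open an editor for the commit message; at your own terminal you would just run git rebase --continue.

```shell
set -e
# Throwaway repository (every name below is illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "you@example.com"
git config user.name "Your Name"

echo "original" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# The branch and master both change the same line, which guarantees a conflict.
git checkout -q -b mytestbranch
echo "branch version" > notes.txt
git commit -q -am "Change notes on the branch"
git checkout -q master
echo "master version" > notes.txt
git commit -q -am "Change notes on master"
git checkout -q mytestbranch

# The rebase stops at the conflicting commit and exits with a non-zero status.
git rebase master || echo "Rebase paused on a conflict"
git status --short   # notes.txt shows up as unmerged

# Resolve by writing the content we want, stage it, then continue the rebase.
echo "master version plus branch version" > notes.txt
git add notes.txt
GIT_EDITOR=true git rebase --continue

git log --oneline -2
```

If you get lost halfway through a conflict, git rebase --abort puts the branch back the way it was before you started.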

Rebasing pro tips

If you’ve followed along so far, you might be thinking, ‘OK, so if I want to get the latest changes from master into my branch, I update my local copy of master and rebase onto that’, which would look like this: git checkout master, git pull, git checkout mytestbranch, and finally git rebase master.

And this will work, but There’s A Better Way! Run git fetch origin master followed by git rebase origin/master, without ever leaving your branch.

Both ways work. I prefer the second one since it’s quicker and means I don’t have to switch branches (I often find that switching branches switches my mental context, even if I switch right back. Worst case scenario, I get interrupted after switching to master, but before switching back to my branch, what a nightmare!), but I’ve included both to demonstrate that there are multiple approaches and hopefully seeing the same thing accomplished in two different ways will help the reader’s understanding of the topic.

EDIT: Readers write in that There’s An Even Better Way: git pull --rebase origin master.

Just one command! I hadn’t used this one before, but I’m slowly trying to rebuild my habits to take more advantage of this. One less opportunity to be interrupted.
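To see the fetch-then-rebase approach and the one-command approach side by side, here is a sketch that fakes a server with a second local repository. The directory names upstream and mine, the file names and the identities are assumptions for the demo; the commands worth noticing are git fetch origin master, git rebase origin/master and git pull --rebase origin master.

```shell
set -e
work=$(mktemp -d) && cd "$work"

# "upstream" stands in for the server; "mine" is my local clone of it.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.email "teammate@example.com"
git -C upstream config user.name "Teammate"
echo "v1" > upstream/app.txt
git -C upstream add app.txt
git -C upstream commit -q -m "Initial commit"

git clone -q upstream mine
cd mine
git config user.email "you@example.com"
git config user.name "Your Name"
git checkout -q -b mytestbranch
echo "my work" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Meanwhile a teammate lands a commit on the server's master.
echo "v2" > ../upstream/app.txt
git -C ../upstream commit -q -am "Update app"

# Fetch and rebase without ever leaving mytestbranch.
git fetch -q origin master
git rebase -q origin/master

# Another teammate commit, this time picked up with the one-command form.
echo "v3" > ../upstream/app.txt
git -C ../upstream commit -q -am "Update app again"
git pull -q --rebase origin master

git log --oneline -3
```

Note that the local master in mine is never touched here. The rebase goes onto origin/master, the remote-tracking copy that the fetch just updated.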

Rebasing gotchas

When you try to push a rebased branch to your server, you may get an error like this:

This happens when you’ve pushed the branch before rebasing, and are now trying to push the rebased branch. Git sees that the commits you’re trying to push this time are different from the commits associated with this branch on the server and stops you from pushing, fearing that you will overwrite data. In this case though, we know what we’re doing and we know there won’t be any data loss, so we can override git’s judgement by saying git push --force origin mytestbranch. Some might consider this a tradeoff, i.e. to take up rebasing you have to give up some protections that git provides, but I consider this simply a cost of doing business. The benefits of rebase far outweigh losing this protection.
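The rejection and the override are easy to reproduce locally. This sketch uses a bare repository as a pretend server; server.git, mine, the file names and the identity are all assumptions for the demo.

```shell
set -e
work=$(mktemp -d) && cd "$work"

# A bare repository stands in for the server.
git init -q --bare server.git
git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master
git config user.email "you@example.com"
git config user.name "Your Name"
git remote add origin "$work/server.git"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git push -q origin master

# Push the branch, then let master move on.
git checkout -q -b mytestbranch
echo "my work" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
git push -q origin mytestbranch
git checkout -q master
echo "v2" > app.txt
git commit -q -am "Update app"
git checkout -q mytestbranch

# Rebasing rewrites the branch's commit, so a plain push is now rejected.
git rebase -q master
git push origin mytestbranch || echo "Plain push rejected (non-fast-forward)"

# Overwrite the server's copy of the branch with the rebased one.
git push -q --force origin mytestbranch
```

If plain --force makes you nervous, git push also has a --force-with-lease option, which refuses to overwrite the server’s branch if it has moved since you last fetched it.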

I was told you should never rebase a shared branch

This comes up a lot in discussions about rebasing so I wanted to address it. Some people say you should never rebase a branch to which multiple people are contributing. Never is a strong word, but in general they are right. When you rebase a branch we’re sharing without telling me, and afterwards I go to update it on my machine, git pull might have issues, and I’m gonna be like “what the hell?”. If you tell me ahead of time, and I’m familiar enough with rebasing to rebase any branches of mine onto your rebased branch, we can make it work. For larger teams with varied skill sets this coordination gets more difficult, and so it’s easier for shared branches to simply merge in changes from other branches as opposed to rebasing onto them.

Rebasing review

OK, let’s recap what we’ve learned. Rebasing is “reapplying commits on top of another base tip”, or changing the parent of one commit or a series of commits. It’s accomplished via the git rebase command, and the most efficient way to rebase onto a particular branch is git fetch origin particular_branch followed by git rebase origin/particular_branch (assuming you’ve already checked out the branch you want to rebase). EDIT: or git pull --rebase origin master, if you prefer. Rebasing will change your branch history and your commit hashes, which triggers some protection mechanisms within git upon pushing, and these mechanisms can be overridden with git push --force. Rebasing is best done on individual branches; shared branches should use merging to incorporate changes from other branches.

INTERMISSION

This is a pretty long article, and you’re trying to learn a pretty involved concept, so go ahead and take a break! Get a coffee, grab a snack, have a smoke, whatever you need. We’ll be right here waiting for you when you come back!

Squashing practice

For squashing we’ll dive right into practice. Unlike rebase, the theory of squashing is not all that complicated — you’re taking two or more commits and combining them into a single commit.

Like we did for the rebase part, we’ll go ahead and practice within your repo of choice. Go ahead and create a branch off of master or your preferred branch and make at least 2 commits in this new branch. The contents of the commits can be anything. Once that’s done, we will run an interactive rebase (yes, we use the same rebase command for squashing in addition to rebasing) to kick off that process of squashing, but first we’ll run git log like we did for the rebase practice to help us understand the before and after.

So here we see that I started from master, I added a section to the README, and then I fixed a typo in the section I just added. Now we’ll run an interactive rebase in order to squash the added section and the typo fix into a single commit: git rebase --interactive HEAD~2.

What is HEAD~2? It refers to the commit that’s 2 commits behind the tip of my current branch. You can think of this as rebasing onto that commit, despite the fact that the “base tip” is not changing. This will now put me in an editor which will look something like this:

You can see the two commits we saw in the git log command, although they’re in reverse order. The prompt starts at the first commit after the one onto which you are rebasing, and goes line by line until the tip of your current branch. After that, there’s a lot of helpful commented text telling you about all the things that are now in your power to do. You can explore all these different things on your own time, but for now let’s modify this prompt to squash the typo commit into the commit that adds instructions to the README. Despite the fact that there is a squash command, as you see in line 10, we’ll use the one after it, fixup, which, as the comment says, is the same as squash, but discards the commit message. Since this particular typo fix was correcting something in the previous commit, melding the commits will make it like the typo never happened, and so we have no use for that commit message. So we edit line 2 of the prompt to say f instead of pick.

For those who like to play code golf, the most efficient way to do this in Vim is to type ‘3xrf’ with the cursor at the start of the relevant line (‘3x’ deletes 3 characters, leaving the ‘k’ of pick under the cursor, and ‘rf’ replaces that single character with the letter ‘f’). Then we simply save our changes and exit the editor, and we should see a message that everything was successful.

And congratulations, you’ve squashed some commits! Let’s take a look at our git log to see what our branch looks like:

So, there’s only one commit in our branch off of master, and it has a new commit hash since its contents are different. And you can see that the master commit that we were on, 3a29b1c32, hasn’t changed. This is because it wasn’t part of the rebase; we were just rebasing onto it.
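For the curious, the same squash can be replayed without touching an editor. This is a sketch, not how you would normally work: GIT_SEQUENCE_EDITOR is a real git environment variable that names the program used to edit the rebase todo list, and here a sed one-liner plays the part of you changing pick to fixup on line 2. The repository, the README contents and the “Fix typo” message are made up for the example.

```shell
set -e
# Throwaway repository (every name below is illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "you@example.com"
git config user.name "Your Name"
echo "# Project" > README.md
git add README.md
git commit -q -m "Initial commit"

# Two commits on the branch: the real change and the typo fix.
git checkout -q -b mytestbranch
echo "Run make to biuld." >> README.md
git commit -q -am "Add build instructions to README"
printf '# Project\nRun make to build.\n' > README.md
git commit -q -am "Fix typo"

# Stand-in for editing the todo list by hand: line 2 goes from pick to fixup.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/'" git rebase --interactive HEAD~2

git log --oneline -2
```

After this runs, the branch is one commit ahead of master, with the first commit’s message and the typo already fixed.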

Squashing pro tips

Like with rebasing, There’s A Better Way to squash. Instead of making all your various commits and then squashing later on via an interactive rebase, you can use the --amend option of git commit to edit the commit you’re currently on. So in the example above, after I made my commit to “Add build instructions to README”, I could amend that commit to include my typo fix by making the fix, staging it like I would if I was about to create a commit, and then running git commit --amend.
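As a quick sketch of the amend workflow (the repository, file contents and identity are again placeholders), --no-edit keeps the existing commit message so no editor opens:

```shell
set -e
# Throwaway repository (every name below is illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "you@example.com"
git config user.name "Your Name"
echo "# Project" > README.md
git add README.md
git commit -q -m "Initial commit"

# The commit that contains the typo.
echo "Run make to biuld." >> README.md
git commit -q -am "Add build instructions to README"
before=$(git rev-parse HEAD)

# Fix the typo, stage it, and fold it into the commit we are on.
printf '# Project\nRun make to build.\n' > README.md
git add README.md
git commit -q --amend --no-edit

git log --oneline -2
after=$(git rev-parse HEAD)
```

Amending replaces the commit with a new one, so the hash changes just like it does after a rebase, and the same push caveats apply if the original commit was already pushed.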

You can squash and rebase at the same time. This is nothing more complicated than doing an interactive rebase onto another branch as opposed to HEAD~2 or HEAD~3 or whatever. You can give this a try by combining the rebasing practice with the squash practice. Create a branch that starts a couple commits behind master, make a few commits, and then run git rebase --interactive master. You’ll be able to fixup your commits in addition to changing their parents.
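Here is that combined exercise as a sketch in a throwaway repository. As before, the file names and messages are invented, and the GIT_SEQUENCE_EDITOR one-liner is only a scripted stand-in for changing pick to fixup on line 2 of the prompt yourself.

```shell
set -e
# Throwaway repository (every name below is illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "you@example.com"
git config user.name "Your Name"

# Three commits on master so that master~2 exists.
for n in 1 2 3; do
  echo "line $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "master commit $n"
done

# Branch two commits behind master, then make a commit and a typo fix.
git checkout -q -b mytestbranch master~2
echo "Run make to biuld." > BUILD.md
git add BUILD.md
git commit -q -m "Add build instructions"
echo "Run make to build." > BUILD.md
git commit -q -am "Fix typo"

# One interactive rebase both squashes the fix and moves the branch onto master.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/'" git rebase --interactive master

git log --oneline -2
```

The branch ends up as a single commit sitting directly on top of the tip of master.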

One last note about squashing: so far we’ve talked about squashing multiple commits into a single commit. While it’s usually good practice to have one commit per bugfix or feature, this isn’t always the case, and sometimes it makes more sense to have multiple commits. The point here is to use good judgement when it comes to squashing, as opposed to following rigid rules like one commit per whatever.

Squashing review

Quick recap: squashing is the practice of combining multiple commits into a single commit, or at least fewer commits than you started with. It is accomplished with an interactive rebase onto your own or another branch, or via git commit --amend. It can be performed simultaneously with a rebase operation.

Conclusion

Good job, you’ve achieved a big level up in your mastery of git. Hopefully you will find working with git easier and more pleasant with these commands in your toolbelt, and if you liked this article, please share it!

Happy squashing!