A Git Workflow Using Rebase

How I learned to stop worrying and love the rebase

Chris Belyea
SingleStone


Git can be tricky to use, especially in a team setting where stale branches and merge conflicts tend to cause problems when you least have time for them. I’ve seen many teams turn their repository into a tangled mess as they struggle with Git. The result is always a lot of wasted time, an indecipherable log of what actually happened in the repository, and very little confidence that the resulting codebase contains the changes it should. Even worse is that this mess is self-perpetuating. Some people get to the point where they delete their repository and reclone as a last-ditch effort to fix their Git woes.

In this guide, I’ll explain a Git workflow using the oft-ignored rebase feature that addresses these problems. This workflow scales well from a single developer up to a large team and has been successfully used on multiple projects for a variety of clients.

Git provides a tremendous amount of freedom and little guidance, so developers and teams often have to define their own best practices. This article started as a guide for my team. This workflow certainly isn’t the only way to use Git, but its prescriptive nature allows you to spend more time on actual work and less time fiddling with Git. Once you’re confident with this workflow, Git will fade into the background, allowing you to focus on what’s important: the code.

Why rebase?

Rebase is one of several Git commands that integrate changes from one branch into another. (Another is merge.) Rebase can be a very destructive operation: it rewrites Git commit history, which is a big no-no in most cases. It does, however, offer some advantages, including a way to cleanly incorporate new code into a feature branch and a way to keep meaningless commit history out of a repository’s master branch. The end result is a linear commit history on the master branch, which makes it easier to see how the code evolved.

Why squash?

Rebase can be run in “interactive” mode. An interactive rebase allows you to squash your commits, combining many commits into fewer commits, or even a single commit. There are several reasons you might want to do this. If you have many commits in your branch and merge your branch into master, all of these commits will end up in master (possibly with a merge commit). In some cases that’s desirable, but what if your branch has a lot of commits that contain minor fixes? You could end up with a commit history that looks like this:

4d7269c (HEAD -> issue-1) added missing semicolon, lol
ae37a28 my code is broken
30cd73d syntax error ugh
cf5ff68 syntax error2
44962e5 syntax error
26a6423 Broke linter
2a9a75d Fix issue in method
7d0be0e Get tests passing
bd23847 Add some more code
8f34218 Implement feature XYZ
4cb7063 (master) Initial commit

And that’s not the most egregious example of meaningless commit histories I’ve seen!

It’s a good practice to commit often so that you can roll back changes in small increments if needed. But those small, incremental changes are only meaningful in the context of the branch you’re developing on. In other words, each branch should only contain commits relevant to the implementation of a single feature (or bug fix). The consumers of your code (i.e., other developers on the project) are interested in your working code, not the fact that it took you 35 commits to get there. Merging a history like the one above into master results in a lot of non-valuable noise. It’s like handing in a term paper to your teacher with all of your notes and rough drafts stapled to it.

By squashing your commits, you keep your end result but get rid of the extraneous commits. If the commit log above was your “before,” this is your “after:”

c27766b (HEAD -> issue-1) Implement feature XYZ
4cb7063 (master) Initial commit

Now, the entire implementation of your feature is contained in one commit. From the master branch perspective, each commit on master is a complete implementation of one feature. Put another way, the master branch shows a linear history of implemented features. If you introduced a feature but now need to roll it back, you only have to revert one commit.
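
To make the "revert one commit" point concrete, here is a minimal sketch that builds a throwaway repository in a temporary directory and rolls a squashed feature back with a single git revert. The file names and the identity settings are made up for illustration; nothing here touches a real project.

```shell
set -eu

# Throwaway repository in a temporary directory (hypothetical example data)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit --quiet -m "Initial commit"

# The whole feature lives in one (squashed) commit
echo "feature XYZ" > feature.txt
git add feature.txt
git commit --quiet -m "Implement feature XYZ"

# Rolling the feature back takes exactly one revert
git revert --no-edit HEAD
git log --oneline
```

Had the feature been spread across ten noisy commits, you would have had to find and revert each of them.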

And if you write good commit messages, like this:

commit c27766b4006fa5b1680803748ecfae07b17f1454 (HEAD -> issue-1)
Author: Chris Belyea <chris@chrisbelyea.com>
Date:   Fri Jan 12 10:42:08 2018 -0500

    Implement feature XYZ

    Implements feature XYZ which does blah blah blah. This includes:
    - this
    - that, and
    - the other

    Closes #1.

then when you open a pull request your Git platform will automatically fill in the summary and description fields from your commit message and close the tagged issue when the pull request is accepted.

If this doesn’t make sense yet, keep reading. After I walk you through the workflow, the value of this approach should become more obvious.

Ground rules for this workflow:

  • You’re using the Git CLI. The CLI is consistent across platforms. Most Git GUIs, on the other hand, have too many vague abstractions that would make following this guide difficult. If you want, have a GUI tool like Sourcetree, GitUp, or gitk running side-by-side to help you visualize what’s happening.
  • You’re using GitHub, GitLab, Bitbucket, or another Git platform that supports the concept of GitHub-style pull requests.
  • You have a Git repository that contains the code that you and your team are working on. For the purposes of this guide, we’ll call that repository pebble.
  • Every contributor forks the repository and works in their own fork. On public projects where you don’t have write access this is the only way to do it. But forking also works well for internal projects because it gives each contributor their own private workspace. Forks are free, so there isn’t a compelling reason not to use them. Forking prevents two developers from collaborating on the same branch without granting additional permissions, but you shouldn’t do that anyway because you increase your chances of encountering merge conflicts. If you think you need more than one person working in a single feature branch, then your feature is too big and should be broken into smaller units of work.
  • The most important rule: Don’t rebase a branch that multiple people have access to! Only rebase branches in your fork. As I’ve already mentioned, rebasing is a destructive operation. If you’re doing it in your repository, which only you can access, then there’s no issue. If you rebase a branch that other people have access to, you’re going to run into trouble. So only rebase your own branches, and only push those rebased branches to your own fork.

Roles

For the purposes of this workflow, there are two Git-related roles. In practice, one person may fill both roles, especially in a solo/small project. For larger teams or public projects, the role delineation is a necessity.

  • Maintainer(s). These people have write permissions to the repository. They review pull requests and accept or reject them as appropriate. They also create Git tags for releases.
  • Contributor(s). These people have read (and therefore, fork) permissions to the repository. They can view and create issues and submit pull requests for review. Contributors are also responsible for resolving any merge conflicts. A contributor can only push to his or her own fork.

Setup

To get set up you need to fork the project repository (we’ll call this upstream), clone your fork (we’ll call this origin), and then add a remote in your local cloned repository that points back to upstream.

1. Fork the upstream repository. Follow the instructions for your Git platform to do this. Your fork should end up in your private user namespace.

2. Clone your fork to your computer. By default, when you clone a repository Git will automatically create a remote called origin that points back to the clone source.

git clone git@github.com:chrisbelyea/pebble.git
cd pebble

If you fork a private repository, GitHub will also keep your fork private, even if you’re not on a paid plan that allows for private repos.

3. Add a second remote called upstream that points back to the upstream project. The upstream URL is the same one you’d use to clone the repository directly. This will allow you to pull in upstream changes.

git remote add upstream git@github.com:singlestone/pebble.git

4. To confirm your setup, you can run git remote --verbose which should show both remotes.

origin git@github.com:chrisbelyea/pebble.git (fetch)
origin git@github.com:chrisbelyea/pebble.git (push)
upstream git@github.com:singlestone/pebble.git (fetch)
upstream git@github.com:singlestone/pebble.git (push)

The Workflow

The code changes you commit should generally tie back to a story or Issue. These stories and Issues may exist in an external system such as JIRA or VersionOne, but for this guide we’ll assume that you’re using the Issues feature of your Git platform to track work.

I’m using the term Issues to refer to all code-related work, including bug fixes and new feature development.

At a high level, the workflow can be described in a few steps:

  1. Fetch upstream changes.
  2. Merge upstream/master branch into local master branch.
  3. Create a branch.
  4. Write code and commit to your branch as you go.
  5. Fetch from upstream again (in case upstream master has had new commits since you started your branch).
  6. Rebase and squash your branch against upstream/master, resolving any merge conflicts.
  7. Push your branch.
  8. Open a pull request.

Here’s more detail:

Step 1: Fetch upstream changes.

You should always be working with the latest version of the codebase. Since the official code repository is upstream, fetch those changes. Git will store the contents of upstream’s master branch locally in upstream/master.

git fetch upstream

Step 2: Merge upstream/master branch into local master branch.

It’s simplest to create a branch off of your local master branch. Before you do so, however, you should merge upstream/master into master so that you have the latest code.

git checkout master
git merge upstream/master

Because you never commit directly to your local master in this workflow, this will be a fast-forward merge, leaving master and upstream/master pointing at the same commit.

One implication of this is that the master branch on your fork (origin/master from your perspective) has no purpose in this workflow. Upstream has the canonical master branch and you’re periodically updating your local master branch from it.
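
If you want Git to refuse anything other than a fast-forward here, merge accepts the --ff-only flag. The sketch below simulates steps 1 and 2 with two throwaway repositories; the directory names upstream-repo and local are stand-ins I made up, and the "upstream" is just a local directory rather than a real Git platform.

```shell
set -eu
work="$(mktemp -d)"
cd "$work"

# Hypothetical stand-in for the upstream project
git init --quiet upstream-repo
cd upstream-repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "one" > file.txt
git add file.txt
git commit --quiet -m "Initial commit"
cd ..

# Stand-in for your local clone; the remote is named "upstream" directly
git clone --quiet --origin upstream upstream-repo local

# Someone lands a new commit upstream after you cloned
(cd upstream-repo && echo "two" >> file.txt && git commit --quiet -am "Second commit")

# Steps 1 and 2 of the workflow
cd local
git fetch --quiet upstream
git checkout --quiet master
git merge --ff-only upstream/master   # fails instead of creating a merge commit
```

If --ff-only ever fails, it means your local master has picked up commits of its own, which this workflow says should not happen.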

Step 3: Create a branch.

Now that master is up to date, create a branch to track the work for your issue.

git checkout -b issue-1

Step 4: Write code and commit to your branch as you go.

This is where you do your actual development, committing whenever it makes sense. Once you’ve finished coding, proceed to step 5.

Step 5: Fetch from upstream again.

Your coding is complete, but before you open a pull request and merge it into upstream’s master branch, you need to grab any new commits that have appeared upstream. (Remember, you may not be the only person working on this project!)

To get the new upstream commits, use git fetch:

git fetch upstream
# Now your local `upstream/master` branch contains any new commits that are in upstream's master branch.

Step 6: Rebase and squash.

Rebasing will change the original commit on which a branch is based. Rebasing will result in new commits (with the same commit messages) with new SHA-1 hashes. Squashing will condense commits into a new commit (or commits) with a new SHA-1 hash. Typically you’ll want to rebase against the branch that you intend to merge into. When you eventually create your pull request, it will be from your fork’s branch to upstream’s master branch. Therefore, you’ll want to rebase against upstream/master.

To squash, you need to run the rebase in interactive mode:

git rebase --interactive upstream/master

This will open your default editor and present you with a list of all of the commits that will be rebased. It will look something like this:

pick c42629e Add feature XYZ
pick 6fa213d Make an update
pick fdcc8a6 Do some more stuff
# Rebase e342e4d..fdcc8a6 onto e342e4d (3 commands)
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like “squash”, but discard this commit’s log message
# x, exec = run command (the rest of the line) using shell
# d, drop = remove commit
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

Each commit in your branch is listed at the top of the file (from oldest to most recent). The comments at the bottom of the file provide instructions on how to provide the rebase command with directions for each commit. You should keep the top/oldest commit and squash all of the other commits into it. To do this, simply change pick to squash on the second and third commits. It will look like this when you’re done:

pick c42629e Add feature XYZ
squash 6fa213d Make an update
squash fdcc8a6 Do some more stuff
# Rebase e342e4d..fdcc8a6 onto e342e4d (3 commands)
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like “squash”, but discard this commit’s log message
# x, exec = run command (the rest of the line) using shell
# d, drop = remove commit
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

When you save the file and exit your editor, rebase will continue, following the instructions you just provided.

As the instructions explain, squash will preserve the commit message and present it again at the very end of the rebase operation, where you’ll craft the commit message for your new, squashed commit. This is useful when you’re squashing several significant commits together and need the old commit messages to craft a coherent new one. If you are squashing trivial commits (especially those of the “forgot semicolon” variety) and don’t need those commit messages for the last step, you can use fixup instead.
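
If you want to experiment with squash and fixup without risking real work, the following sketch does it in a throwaway repository. To keep it scriptable, it sets the GIT_SEQUENCE_EDITOR environment variable so that sed edits the todo list instead of an interactive editor; in day-to-day use you would make the same edit by hand. The commit messages and file name are invented for the example.

```shell
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

git commit --quiet --allow-empty -m "Initial commit"
git checkout --quiet -b issue-1

# Three commits: one meaningful, two of the "forgot semicolon" variety
for msg in "Implement feature XYZ" "syntax error" "added missing semicolon"; do
  echo "$msg" >> feature.txt
  git add feature.txt
  git commit --quiet -m "$msg"
done

# Keep the first "pick" and change every later "pick" to "fixup".
# sed -i.bak is used because it works with both GNU and BSD sed.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/fixup/'" \
  git rebase --interactive master

git log --oneline
```

Because fixup discards the later messages, Git never prompts for a new commit message here; the surviving commit keeps the message "Implement feature XYZ".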

As rebase processes your commits, it may run into a merge conflict (for example, if you and upstream changed the same part of a file). If this happens, rebase will pause and wait for you to manually resolve the conflict. To do this, simply run git status to find out which file(s) have conflicts and then go into each one and resolve them. Once you’re done, run git add to stage each file and then git rebase --continue. (You do not need to git commit resolved merge conflicts.)
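
Here is a small sketch of that conflict-resolution loop, again in a throwaway repository. Both branches change the same line of a made-up file, greeting.txt, so the rebase is guaranteed to stop. GIT_EDITOR=true is only there so the script does not open an editor when the rebase continues.

```shell
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "original" > greeting.txt
git add greeting.txt
git commit --quiet -m "Initial commit"

# Your feature branch changes the file...
git checkout --quiet -b issue-1
echo "feature change" > greeting.txt
git commit --quiet -am "Change greeting for feature"

# ...and so does master in the meantime
git checkout --quiet master
echo "upstream change" > greeting.txt
git commit --quiet -am "Change greeting upstream"

git checkout --quiet issue-1
# The rebase pauses on the conflict, so a non-zero exit status is expected
git rebase master || true
git status --short                  # conflicted files are listed with a UU status

# Resolve by writing the content you actually want, then stage and continue
echo "feature change on top of upstream change" > greeting.txt
git add greeting.txt
GIT_EDITOR=true git rebase --continue
```

If you get lost partway through, git rebase --abort puts the branch back the way it was before the rebase started.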

Once your commits have been squashed, Git will prompt you to write a commit message for your new squashed commit. You need to write a good commit message by following these seven rules. Since this commit represents the entirety of your work on this feature, it’s important to document what your commit changes. Additionally, your Git platform (e.g., GitHub) will use the first commit message in your feature branch to populate the pull request form.

Step 7: Push your branch.

In order to create a pull request you need to push your branch to origin (your fork of the upstream project). This is simple to do:

git push --set-upstream origin issue-1

If you’ve already pushed your branch and need to update it after rebasing, the above command will fail. Because a rebase rewrites commit history, your local branch no longer descends from the commits on your remote branch, so Git rejects the push as a non-fast-forward update. You must use the --force option to instruct Git to overwrite the branch on your remote:

git push --force origin issue-1
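
A slightly more cautious variant is --force-with-lease, which only overwrites the remote branch if it still points where your local copy of origin/issue-1 says it does. The sketch below shows the rejected push and the forced one, using a bare repository in a temporary directory as a stand-in for your fork; the amend is there simply because it rewrites history the same way a rebase does.

```shell
set -eu
work="$(mktemp -d)"
cd "$work"

git init --quiet --bare fork.git          # hypothetical stand-in for your fork
git init --quiet local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/fork.git"

git commit --quiet --allow-empty -m "Initial commit"
git checkout --quiet -b issue-1
echo "first draft" > feature.txt
git add feature.txt
git commit --quiet -m "Implement feature XYZ"
git push --quiet --set-upstream origin issue-1

# Rewrite the commit that was already pushed
git commit --quiet --amend -m "Implement feature XYZ (reworded)"

# A plain push is now rejected as a non-fast-forward update
if git push --quiet origin issue-1 2>/dev/null; then
  echo "unexpected: plain push succeeded"
fi

# Overwrite the remote branch, but only if nobody else has moved it
git push --quiet --force-with-lease origin issue-1
```

In a fork that only you push to, --force and --force-with-lease behave the same; the lease just gives you a safety net if that assumption turns out to be wrong.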

Now that you’ve seen firsthand how rebase rewrites history, it should be obvious why you should never rebase any branch that is publicly accessible. If two people are working on a branch, and one rebases that branch and pushes it to GitHub, the next time the second person runs git pull it will either fail or attempt to merge two diverged histories, because the branch’s history on GitHub will no longer match their local history. The second person would need to reset their branch to match GitHub (losing any local changes) to get things back in sync. When an entire team has to do that, the resulting disruption and potential data loss become a big problem. As long as you only rebase branches in your private fork, you’ll avoid these issues.

Step 8: Open a pull request.

From here, you’re ready to open a pull request from your fork’s (origin) feature branch (issue-1) to the upstream repository’s master branch. If you make any changes to your branch, just follow steps five through seven. When you push to a branch, your pull request automatically includes all changes.

Once the pull request is accepted, you can delete both your local feature branch and the branch on your fork (origin).
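
The cleanup can be sketched as follows. As before, this runs against a throwaway bare repository standing in for your fork, and a local fast-forward of master stands in for "the pull request was accepted and I updated master from upstream."

```shell
set -eu
work="$(mktemp -d)"
cd "$work"

git init --quiet --bare fork.git          # hypothetical stand-in for your fork
git init --quiet local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/fork.git"

git commit --quiet --allow-empty -m "Initial commit"
git checkout --quiet -b issue-1
git commit --quiet --allow-empty -m "Implement feature XYZ"
git push --quiet --set-upstream origin issue-1

# Stand-in for the accepted pull request arriving in your local master
git checkout --quiet master
git merge --quiet --ff-only issue-1

# Delete the local branch, then the branch on your fork
git branch -d issue-1
git push --quiet origin --delete issue-1
```

git branch -d refuses to delete a branch whose commits Git cannot find elsewhere. If your platform rewrote the commit when it accepted the pull request (for example, with its own squash option), you may need git branch -D instead.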

If your Git platform supports it, consider turning off merge commits for pull requests and having the platform do a fast-forward merge instead.

Conclusion

That’s it! This workflow may seem complicated at first, but once you use it a few times, the mechanics will start to feel natural. The commit history for the project’s master branch will be a linear progression of feature additions. This rebase and squash approach is highly compatible with the popular GitHub Flow workflow. It can also be used with GitFlow for feature branches that you merge into develop. The forking aspect of the workflow is not strictly required; however, if you use this workflow in a shared repository, all contributors must agree not to use each other’s branches. (In practice this is difficult to enforce, which is why I prefer forking.)

I’ve helped teams adopt this rebase workflow and they’ve been able to make code merges non-events. They easily keep their local clones properly synced with remotes, and rare merge conflicts are kept small and quickly resolved. Everyone can make sense of the linear commit history. Overall, they spend less time fighting version control and more time doing productive work. If that doesn’t describe your current experience with Git, consider using this process so that you can get back to work.
