Finding breaking commits with git bisect

Step-by-step guide

The workflow goes like this:

  1. git log to examine the commit history
  2. git checkout <a confirmed working commit>
  3. git checkout <a confirmed broken commit; often HEAD>
  4. git bisect to find the breaking commit

To “bisect” means to cut in two. Git bisect does just that: you pick two commits, one known to be good and one known to be bad, and git checks out a revision about halfway between them. You test that revision and mark it as good or bad, and git repeats the process on the remaining half until it identifies the specific commit that introduced the bug.

In a small project with a handful of developers, it may be trivially easy to identify breaking commits. But when projects grow in complexity and large blocks of changes are committed in rapid succession by a large, distributed team, tools like this become useful.

Important note: Git bisect exists because people are human and make mistakes; this is simply a walkthrough of a tool used to help find those mistakes that slip through despite our best efforts.

Working Example

We will take a real-world example and walk through resolving the issue.

Noticing the problem

A developer attempts to start the exostore and is greeted with this message:

14:29:24.78 <trace> [init][db]start createDataPool (in dblink.js:63)
14:29:24.80 <trace> [kakyo] Init DB OK (in dblink.js:59)
/vagrant/exostore-v4/exo_lib/data/modules/dbapi_admin_device.js:66
).error(errorHandler ).then(funcation(result){
^
SyntaxError: Unexpected token {
at Module._compile (module.js:439:25)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Module.require (module.js:364:17)
at require (module.js:380:17)
at Object.<anonymous> (/vagrant/exostore-v4/exo_lib/data/dbmodules.js:9:23)
at Module._compile (module.js:456:26)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
error: Forever detected script exited with code: 8

While it is immediately obvious where the problem lies (i.e., dbapi_admin_device.js:66), it would be imprudent to simply run in and fix it. We need to examine the commit history and see if other things were changed as part of a batch that may affect other components.

Examining the Log

git log --oneline --decorate
759c997 (HEAD, origin/master, origin/HEAD, msanford, master) Globally refactor SUCCESSED > SUCCESS; MSGID > MSG_ID; datas > data.
47a77a2 EXO-135 #resolve Add admin user account creation API, update stored procedure for error handling, scaffold unit test.
19371ec 1. Added a constraint in admin db tbe_domain. 2. Added two test helper functions. 3. Added a insertion helper function. 4. Added a unit test case set for db_store_app model class.
d38a884 add device related implementations
9f05de2 Remove old fixture generation script
a1f5ff3 Refactor comment style in db/*.sql
ea1b955 Manually regenerate vagrant-v4 fixtures
3248f20 Updated translation support — Added data table sample — Removed unused stuff — Replaced message APIs routes by correct paths in `app.js`
d7f3076 add get msg count apis

Here we see a series of commits, any one of which has the potential to break the system.

Rebasing is a time machine
While it might be tempting to assume the last commit in the history introduced the breaking change if you just saw it break for the first time, that is not necessarily the case!
Because we work in a private branch, rebase onto master, and then rebase master onto our working branch before pushing, commits made later in time may appear earlier in the work history because they were rebased.
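
The effect can be reproduced in a throwaway repository. This is a minimal sketch, not the project's actual history: the dates, file name, and commit messages are made up, and setting GIT_AUTHOR_DATE by hand only imitates what a rebase does when it replays a commit and keeps its original author date.

```shell
# Throwaway repo; dates, file name, and messages are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# A commit written on Tuesday that already sits on master.
echo one > file.txt
git add file.txt
GIT_AUTHOR_DATE="2014-08-26T12:00:00" git commit -q -m "written Tuesday"

# A commit written on Monday in a private branch and replayed on top of
# master afterwards. We imitate the preserved author date by hand.
echo two >> file.txt
git add file.txt
GIT_AUTHOR_DATE="2014-08-25T12:00:00" git commit -q -m "written Monday"

# Newest-first history: the Monday commit is listed above the Tuesday one.
dates=$(git log --format=%ad --date=short)
git log --format='%ad %s' --date=short
```

The top entry of the log is the commit written on Monday, even though the commit below it was written a day later. Position in the history, not the date, is what bisect works with.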

Finding a good state

git checkout 9f05de2

Store starts correctly!

Finding a bad state

git checkout master

This checks out the tip of master (which at this time is 759c997, the same commit as origin/master). Store breaks!

Finding the breaking commit

Set up the initial state

git bisect start
git bisect good 9f05de2

Begin bisecting by marking a bad commit

git bisect bad 759c997
Bisecting: 1 revision left to test after this (roughly 1 step)
[19371eca935657ebaecb0a4e9879ce295a91f5a2] 1. Added a constraint in admin db tbe_domain. 2. Added two test helper functions. 3. Added a insertion helper function. 4. Added a unit test case set for db_store_app model class.

Found bad commit, continue

At this point, git has checked out a specific revision (commit) about halfway between good and bad. After testing it, we see that the store fails to load, so we mark this as a bad commit.

git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[d38a88471524c07f3762921a030887e29460dc3e] add device related implementations

“0 revisions left to test after this” means that this revision is the last one we have to test. We’re getting close! The store fails to start here as well, so we mark this commit as bad.

Found another bad commit, continue

git bisect bad
d38a88471524c07f3762921a030887e29460dc3e is the first bad commit
commit d38a88471524c07f3762921a030887e29460dc3e
Author: [REDACTED]
Date: Thu Aug 28 09:16:14 2014 -0700
add device related implementations
:040000 040000 53c2f55597509733e427063ff7c8c8d9f6d5efeb aa6a09dd673be0a7d73ccf36f25fc6078ee3ae5d M exo_lib
:040000 040000 f1d3b03ff9ffd3a56fa16a7f698c76bbc7883dd0 a75004b9fe513ec4edd27c12d5d7ae25177c3134 M routes

Found it!

The bisection is complete and we have identified the bad commit, so we can reset our working copy to its previous state.

git bisect reset
Previous HEAD position was d38a884... add device related implementations
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'
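
When the good/bad test can be scripted, the same mark-and-test loop can be handed over to git bisect run, which treats exit code 0 as good and most other exit codes as bad. The sketch below is self-contained and uses a throwaway repository rather than the exostore: the file app.js, the eight commits, and the grep-based check are all hypothetical stand-ins for "does the store start?".

```shell
# Throwaway demo: build a tiny repo, break it at a known commit, and let
# `git bisect run` find that commit. File names and commits are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Eight commits; the typo "funcation" first appears in commit 5.
for i in 1 2 3 4 5 6 7 8; do
  if [ "$i" -ge 5 ]; then
    echo "funcation broken() {} // rev $i" > app.js
  else
    echo "function ok() {} // rev $i" > app.js
  fi
  git add app.js
  git commit -q -m "commit $i"
done

# Start with HEAD as the known bad commit and the first commit as known good.
git bisect start HEAD HEAD~7

# The check exits 0 (good) when the typo is absent and 1 (bad) when present.
git bisect run sh -c '! grep -q funcation app.js'

# After the run, refs/bisect/bad points at the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
first_bad_subject=$(git log -1 --format=%s "$first_bad")
git bisect reset

echo "$first_bad_subject"   # prints: commit 5
```

In a real project the check would be whatever command reliably distinguishes a working checkout from a broken one, such as a test suite or a startup script. A check that exits with code 125 tells git to skip a revision that cannot be tested.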

Fixing it

Now that you have found the commit that introduced the breaking change, you can take steps to remedy it, which may include:

  1. Contacting the commit author directly,
  2. Correcting it yourself and pushing an urgent fix commit.

In this case, finding the commit was useful because it surfaced not only the obvious, known issue but also another, related issue in the same commit.

Performance

Strictly speaking, “bisection” is usually the term given to searching over a continuous function, whereas git commits are discrete elements, so git bisect is more properly a binary search algorithm.

Binary search runs in worst-case O(log n) time and O(1) space, where n is the number of commits between the known good and known bad commits.
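
As a rough sanity check on the O(log n) figure, the worst-case number of revisions to test can be estimated as the ceiling of log2(n) for a linear history with n candidate commits. The helper below is only an illustrative sketch: bisect_steps is a made-up name, not a git command, and histories with merges may behave differently.

```shell
# Hypothetical helper: estimate the worst-case number of test steps for
# n candidate commits in a linear history, i.e. the ceiling of log2(n).
bisect_steps() {
  n=$1
  steps=0
  span=1
  # Double the covered span until it reaches n; each doubling is one test.
  while [ "$span" -lt "$n" ]; do
    span=$((span * 2))
    steps=$((steps + 1))
  done
  echo "$steps"
}

# In a real repository, n could be obtained with:
#   git rev-list --count <good>..<bad>
bisect_steps 4      # prints: 2
bisect_steps 1000   # prints: 10
```

This matches the walkthrough: four commits follow the known good commit 9f05de2 (d38a884, 19371ec, 47a77a2, and 759c997), and the bisection needed two tests to isolate d38a884.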

[Figure: binary search for the value 7, starting at 14. Source: Wikimedia Commons.]

Replace 14 and 6 with “known good” and “known bad” commits, and it’s essentially the same process.