Git Bisect: quickly zero in on a bug’s origin

Oooh, what a nasty bug you just noticed! Alas, you can’t seem to find out where it originates just now, and it appears to have been around for a while, too… How can you avoid combing through the entire history?

In this post, we’ll see how Git assists us in isolating a bug’s original commit as fast as possible, even if it ends up being far back in our log.

The right way to comb through your history

A commit log is nothing but a sorted list. What’s the sorting criterion? Time! Commits run from the oldest to the newest, even if they sometimes branch out and merge back in along the way.

When you look for something in a sorted list, it would be a shame to simply start at the beginning and walk your way to the end… You have probably already played the “Higher, Lower” game: you have to find a number between, say, 1 and 100. In such a situation, I would worry about someone starting around 1, or around 100, then picking candidates at random. Instinctively, most people start at the middle, hence 50, and if they are told “lower,” pick the middle of the resulting subset, 25, and so on and so forth.

This kind of algorithm has a name: binary search, also referred to as dichotomic search. It lets you find what you’re looking for using at most ⌈log2(n)⌉ attempts, which for a [1;100] set is 7 tries. It gets even more impressive as the set grows significantly: for [1;1,000,000,000], you’d need at worst only 30 guesses! A massive time saving…
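As a quick sanity check of these figures, here is a minimal shell sketch. The function name max_guesses is made up for this illustration; it simply counts how many times a range of n candidates can be halved (rounding up) before a single candidate remains.

```shell
# Hypothetical helper: number of halvings needed to narrow n candidates to one.
max_guesses() {
  n=$1
  guesses=0
  while [ "$n" -gt 1 ]; do
    n=$(( (n + 1) / 2 ))          # keep the larger half, rounded up
    guesses=$(( guesses + 1 ))
  done
  echo "$guesses"
}

max_guesses 100          # prints 7
max_guesses 1000000000   # prints 30
```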

You can apply this principle to searching for the first commit that, in a commit log (a time-ordered series of commits), introduced a bug.

By the way, the mathematical application of binary search is called bisecting, which gave its name to the git bisect command.

Methodology

The git bisect command has a number of subcommands.

  1. We start with a git bisect start. You can provide a test range on the fly (a bad commit, generally the HEAD, and a good commit), otherwise you’ll define them next:
  2. A git bisect bad states the first known faulty commit (if you don’t give any, it will be assumed to be HEAD, as per usual).
  3. A git bisect good states a known good commit (a commit which doesn’t exhibit the bug). This should be as close as possible to the faulty one, but at worst you can pick a far-away commit to avoid sifting through recent history.
  4. From that point on, bisecting starts: Git checks out a commit in the middle of the range (or thereabouts), tells us where we’re at, and asks for a verdict: depending on the situation, we’ll reply with a git bisect bad or git bisect good (and more rarely, git bisect skip).
  5. After a few rounds, unless we answered garbage or left too many commits unanswered, Git will tell us what the original faulty commit was.
  6. We can then get out of bisecting with a git bisect reset.
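The steps above can be sketched end to end on a throwaway repository. Everything in this example is an assumption made for illustration: the temporary repo, its 16 commits named "commit 1" through "commit 16", and a demo.sh that prints OK until commit 11 turns it into KO. The loop stands in for a human answering good or bad at each round.

```shell
set -e
# Throwaway repo: 16 commits, commit 11 introduces the "bug".
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
for i in $(seq 1 16); do
  if [ "$i" -ge 11 ]; then echo 'echo KO' > demo.sh; else echo 'echo OK' > demo.sh; fi
  echo "$i" > counter.txt
  git add . && git commit -q -m "commit $i"
done
first=$(git rev-list --max-parents=0 HEAD)

git bisect start                    # step 1: enter bisect mode
git bisect bad                      # step 2: HEAD exhibits the bug
out=$(git bisect good "$first")     # step 3: the first commit does not

# Steps 4 and 5: give a verdict on each commit Git checks out,
# until its reply names the first bad commit.
until printf '%s\n' "$out" | grep -q 'is the first bad commit'; do
  if sh ./demo.sh | grep -q OK; then
    out=$(git bisect good)
  else
    out=$(git bisect bad)
  fi
done

culprit=$(git rev-parse refs/bisect/bad)   # bisect's own pointer to the culprit
git bisect reset >/dev/null 2>&1           # step 6: back to where we started
git log -1 --format=%s "$culprit"          # prints "commit 11"
```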

All together now

In order to practice this, let’s use a sample repo I lovingly crafted for you, with plenty of wacky commit messages and 4 contributors many of you will undoubtedly recognize…

Download the sample repo now

Uncompress this wherever you please: it creates a bisect-demo directory in which you then open a command line (if on Windows, prefer Git Bash). This repo contains over 1,000 commits spread across a year or so and, somewhere in there, a bug slipped in.

You see, if you run ./demo.sh, it displays a subdued KO, when it should instead clarion a flippant OK. This issue goes back quite a long way, and we’ll use git bisect to hunt it down.

In this case we have no idea what the latest correct commit was, so let’s take the first commit, d7ffe6a. We first check that demo.sh looked good in it:

Right, this should be fine…

Armed with this knowledge, we can now start bisecting:

Note we could have started this procedure with a single command:
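The single-command form takes the bad commit first, then the good one, so in the sample repo it would presumably read git bisect start HEAD d7ffe6a. The sketch below uses a throwaway repository instead (its layout and commit names are invented for this example) to check that the long form and the short form lead Git to propose the same first candidate.

```shell
set -e
# Throwaway repo with 16 linear commits.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
for i in $(seq 1 16); do
  echo "$i" > counter.txt
  git add . && git commit -q -m "commit $i"
done
first=$(git rev-list --max-parents=0 HEAD)

# Long form: three separate commands.
git bisect start >/dev/null
git bisect bad HEAD >/dev/null
git bisect good "$first" >/dev/null
long_form=$(git rev-parse HEAD)      # commit Git picked for the first test
git bisect reset >/dev/null 2>&1

# Short form: bad commit, then good commit, all on one line.
git bisect start HEAD "$first" >/dev/null
short_form=$(git rev-parse HEAD)
git bisect reset >/dev/null 2>&1

[ "$long_form" = "$short_form" ] && echo "both forms start from the same candidate"
```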

From there, all we have to do is test each proposed commit, and reply with good or bad:

Notice the final display:

And indeed, the listing mentions a modification on demo.sh.

Here, if our prompt is to be believed, we are indeed on bisect/bad, the faulty commit. This isn’t necessarily so when bisect is done: it depends entirely on the path it followed through the commit log. Once the faulty commit is identified, bisect doesn’t automatically check it out.

At any rate, a git show 465194a will prove that this is indeed where the issue got in:

Let’s not forget to stop bisecting and get back to our original HEAD, using a git bisect reset:

And there you go! Although the faulty commit was 881 positions back, it only took us 10 tests to hunt it down! Even with our fast test protocol, we saved a lot of time. Imagine when the test protocol is slower (compiling, driving execution, etc.): the speed gain then becomes enormous.

Untestable/ignorable commits

It can happen that specific commits, or even whole commit ranges, need not be tested, either because you can’t test them (obsolete libs and dependencies, change of processor architecture…) or because you know they will not exhibit a testable behavior. In such situations, you can simply answer with a git bisect skip.

You can actually inform Git from the get-go that specific commits or ranges are to be ignored, by providing git bisect skip with arguments. For instance:

Here we immediately tell Git that it need not bother testing 3f24b5a or the whole range from v0.4 (exclusive, as always) through v1.1 (inclusive).
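To see what such a call records, here is a sketch on a throwaway repository. The tags v0.4 and v1.1 are placed on invented commits 4 and 8, and the variable lone stands in for 3f24b5a; none of this comes from the sample repo. Git keeps one refs/bisect/skip-… reference per skipped commit, which the example lists by commit subject.

```shell
set -e
# Throwaway repo: 16 commits, with v0.4 on commit 4 and v1.1 on commit 8.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
for i in $(seq 1 16); do
  echo "$i" > counter.txt
  git add . && git commit -q -m "commit $i"
  case "$i" in
    4) git tag v0.4 ;;
    8) git tag v1.1 ;;
  esac
done
first=$(git rev-list --max-parents=0 HEAD)
lone=$(git rev-parse HEAD~2)          # a single commit to ignore ("commit 14")

git bisect start HEAD "$first" >/dev/null
git bisect skip "$lone" v0.4..v1.1 >/dev/null   # one commit plus a whole range

# List what Git now considers untestable.
skipped=$(git for-each-ref --format='%(subject)' 'refs/bisect/skip-*' | sort)
git bisect reset >/dev/null 2>&1
printf '%s\n' "$skipped"
```

The listing contains commits 5 through 8 and commit 14, but not commit 4, which matches the exclusive lower bound of the range.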

When you already know which parts of your codebase likely harbor the culprit code, you can considerably reduce bisecting by providing these paths to git bisect start, after regular arguments. It will then only walk the log for said paths.

In our sample repo, this wouldn’t have looked like much, because I only changed demo.sh once after the initial commit: to introduce the bug. We would therefore have moved immediately to the faulty commit, which kinda kills the demo :-)
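The effect of path limiting can be measured on a throwaway repository built like the sample one: demo.sh changes only once after the initial commit, while an unrelated file changes in every commit. The repo layout, the count_steps helper and the call-counting test script are all assumptions made for this sketch.

```shell
set -e
# Throwaway repo: demo.sh changes only in commit 11; counter.txt changes every time.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
for i in $(seq 1 16); do
  if [ "$i" -ge 11 ]; then echo 'echo KO' > demo.sh; else echo 'echo OK' > demo.sh; fi
  echo "$i" > counter.txt
  git add . && git commit -q -m "commit $i"
done
first=$(git rev-list --max-parents=0 HEAD)

# Hypothetical test script, kept outside the repo; it logs every call.
calls=$(mktemp)
test_script=$(mktemp)
cat > "$test_script" <<EOF
echo run >> "$calls"
sh ./demo.sh | grep -q OK
EOF

# Run a full automated bisect and report how many commits were tested.
count_steps() {   # usage: count_steps [-- path...]
  : > "$calls"
  git bisect start HEAD "$first" "$@" >/dev/null
  git bisect run sh "$test_script" >/dev/null
  git bisect reset >/dev/null 2>&1
  grep -c . "$calls"
}

steps_full=$(count_steps)
steps_path=$(count_steps -- demo.sh)
echo "whole history: $steps_full tests, demo.sh only: $steps_path tests"
```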

Suspending and resuming

You may be interrupted while bisecting, especially if test setup is lengthy for each testable commit. If you need to get working on the repo for some reason, it would be a shame to lose your current bisecting streak.

Rather than frantically scribbling your answers somewhere, in order to repeat these later, let Git do the work for you. The git bisect log command details what you’ve done so far; you just need to save its output to a file using a redirection. Later on, you’ll be able to replay this with a git bisect replay.

Suppose we’re interrupted after 5 tests:

Feel free to peek into the log; you’ll see it also contains the start spec:

Once we’re done with our other work, we can resume bisecting right where we left off, thanks to the log:
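Here is a sketch of the whole suspend/resume cycle on a throwaway repository (its contents, and the choice of pausing after two answers rather than five, are assumptions made to keep the example short). It saves the log outside the working tree, resets, replays, and checks that Git proposes the very commit we were about to test.

```shell
set -e
# Throwaway repo: 16 commits, commit 11 introduces the "bug".
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
for i in $(seq 1 16); do
  if [ "$i" -ge 11 ]; then echo 'echo KO' > demo.sh; else echo 'echo OK' > demo.sh; fi
  echo "$i" > counter.txt
  git add . && git commit -q -m "commit $i"
done
first=$(git rev-list --max-parents=0 HEAD)
log=$(mktemp)                        # saved outside the working tree

git bisect start HEAD "$first" >/dev/null
# Answer two rounds by hand, then get interrupted.
for round in 1 2; do
  if sh ./demo.sh | grep -q OK; then git bisect good; else git bisect bad; fi >/dev/null
done
paused_at=$(git rev-parse HEAD)      # the commit still awaiting a verdict
git bisect log > "$log"              # save our answers so far
git bisect reset >/dev/null 2>&1     # back to regular work

# Later on: replay the saved answers and carry on.
git bisect replay "$log" >/dev/null
resumed_at=$(git rev-parse HEAD)
[ "$paused_at" = "$resumed_at" ] && echo "resumed right where we left off"
```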

Ain’t it nice?

Zipping through with a test script

The icing on the cake, when it comes to bisecting, is not even having to attend it. By automating your replies, you remain free to work on the project while bisecting goes on (you’ll need another working tree, though).

The idea is to create a test script (anything executable, basically) that will get called, without any arguments, on each testable commit. It must return with an appropriate exit code:

  • 0 (zero) if the commit is good
  • 125 if the commit can’t be tested (skip)
  • Any other code from 1 to 127 (often 1) if the commit is bad; anything above 127 aborts the bisect altogether

This can be as simple as an npm test or make test if you already have that in place (and if it’s complete enough for each testable commit). Otherwise, just whip up a dedicated script, ideally stored outside the working tree to avoid being overwritten by successive checkouts.
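A dedicated script often has to tell “this commit is broken” apart from “this commit cannot even be built.” One possible shape is sketched below; bisect_verdict is a hypothetical helper, and the build and test commands you would pass it (make and make test, say) are placeholders for whatever your project uses.

```shell
# Hypothetical helper: turn a build step and a test step into a bisect verdict.
bisect_verdict() {
  build_cmd=$1
  test_cmd=$2
  # A commit that cannot be built is untestable: ask Git to skip it.
  sh -c "$build_cmd" >/dev/null 2>&1 || return 125
  # The build works, so the test result decides between good and bad.
  sh -c "$test_cmd" >/dev/null 2>&1 || return 1
  return 0
}
```

A test script could then end with something like bisect_verdict "make" "make test" so that its exit code follows the convention listed above.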

In our case, the script is super simple: it just needs to verify that ./demo.sh displays OK. A simple grep will do, and guess what, it returns zero if it finds a match, 1 otherwise; so we don’t even have to adjust for exit codes.
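If you want to convince yourself of grep’s exit codes, this tiny check feeds it a matching line and a non-matching one (printf stands in for ./demo.sh here):

```shell
# Capture grep's exit status for a match and for a miss.
match=0; printf 'OK\n' | grep -q OK || match=$?
miss=0;  printf 'KO\n' | grep -q OK || miss=$?
echo "match: $match, no match: $miss"   # prints "match: 0, no match: 1"
```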

Let’s create a ../test.sh script and give it execution rights:

Now all we have to do is start bisecting (giving it the commit range), then hand our script to git bisect run. Either we’re fast enough that we will get an answer within seconds (which is our case here), or we can just work on something else in the meantime.
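The whole automated flow can be sketched as follows. Since the sample repo isn’t at hand here, the example builds a stand-in bisect-demo directory in a temporary location, with an invented history where commit 11 breaks demo.sh; the test script is a guess at what such a script might contain, not necessarily the one used for this post.

```shell
set -e
# Stand-in for the sample repo: 16 commits, commit 11 turns OK into KO.
work=$(mktemp -d)
mkdir "$work/bisect-demo" && cd "$work/bisect-demo" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
for i in $(seq 1 16); do
  if [ "$i" -ge 11 ]; then word=KO; else word=OK; fi
  printf '#!/bin/sh\necho %s\n' "$word" > demo.sh
  chmod +x demo.sh
  echo "$i" > counter.txt
  git add . && git commit -q -m "commit $i"
done
first=$(git rev-list --max-parents=0 HEAD)

# The test script lives one level up, outside the working tree.
cat > ../test.sh <<'EOF'
#!/bin/sh
./demo.sh | grep -q OK
EOF
chmod +x ../test.sh

git bisect start HEAD "$first" >/dev/null   # bad commit, then good commit
git bisect run ../test.sh >/dev/null        # Git answers good/bad by itself
culprit=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
git log -1 --format=%s "$culprit"           # prints "commit 11"
```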

There! Under 3 seconds, we’re done! Isn’t life gorgeous?

Want to learn more?

I wrote a number of Git articles, and you might be particularly interested in the following ones:

Also, if you enjoyed this post, say so: upvote it on HN! Thanks a bunch!

Although we don’t publicize it much for now, we do offer English-language Git training across Europe, based on our battle-tested, celebrated Total Git training course. If you fancy one, just let us know!

(We can absolutely come over to US/Canada or anywhere else in the world, but considering you’ll incur our travelling costs, despite us being super-reasonably priced, it’s likely you’ll find a more cost-effective deal using a closer provider, be it GitHub or someone else. Still, if you want us, follow the link above and let’s talk!)
