Git is a Directed Acyclic Graph and What the Heck Does That Mean?

Sharon Cichelli
Oct 13, 2017

The Git version-control system frustrates me because, while it is an excellent productivity tool that lets scores of developers collaborate on a single code base without clobbering each other, it asks its users to understand how it works at its deepest levels. It’s like if you were going to write a letter and first had to read a treatise on UTF-8 encoding. I want to make Git less mysterious.

A core concept — once you can visualize it, you can troubleshoot and recover from those weird Git situations you inevitably find yourself in — is the way that commits relate to each other. And the wildly accurate and incredibly unhelpful description of that way is “a directed acyclic graph.” (Although, when I sat down to write this, my dashboard contained a link to this cheerful deep-dive with illustrations: Spinning Around In Cycles With Directed Acyclic Graphs.)

There’s a much simpler way to understand this.

Whom do you follow?

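Picture customers waiting in a shop without forming a line on the floor: each one just remembers whom they follow. This is just like the way a Git commit history is structured. Each commit remembers which commit came before it. (A merge commit points to two previous commits.)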

Think of commits like people waiting in a shop, each pointing to the person he or she follows.
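
To make the shop picture concrete, here is a minimal Python sketch. The commit names (A, B, C, D, M) and the `parents` dictionary are invented for illustration; real Git records the parent IDs inside each commit object rather than in one big table.

```python
# Toy history: each commit remembers only its parent(s), never its children.
# Commit names are made up for illustration.
parents = {
    "A": [],          # the first commit follows nobody
    "B": ["A"],
    "C": ["B"],       # work on one line of development
    "D": ["B"],       # work on another line
    "M": ["C", "D"],  # a merge commit points to two previous commits
}


def history(commit):
    """Return every commit reachable by following parent pointers."""
    seen = []
    to_visit = [commit]
    while to_visit:
        current = to_visit.pop()
        if current in seen:
            continue
        seen.append(current)
        to_visit.extend(parents[current])
    return seen


print(history("C"))  # ['C', 'B', 'A']
print(sorted(history("M")))  # ['A', 'B', 'C', 'D', 'M']
```

Notice that C has no idea D exists. Nobody in the shop knows who arrived after them; they only know whom they follow.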

So wait, what is a commit then? It’s a snapshot of the state of your files, plus some metadata such as who made the commit and when, a comment, and a pointer to the previous commit(s). That pointer is part of what makes a commit unique: the commit’s ID is a hash of its contents, parent pointer included. This is why actions that change the parent, such as rebase and cherry-pick, create a new commit with a new ID. It is also why rewriting history with those actions on commits you’ve already pushed to a shared repository can cause grief for your collaborators and is a stern no-no. If I’ve built on top of commit A, such that my commit B includes a pointer to A as its parent, and you rebase and replace A with toothpaste, my B is orphaned, pointing to a parent that is no longer part of the history.
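
Here is a rough Python sketch of why a different parent means a different commit. The `commit_id` function, its field layout, and the sample values are all assumptions made for illustration; this is a simplified stand-in, not Git's actual object format. The only idea it borrows is that the ID is a hash over everything the commit records, parents included.

```python
import hashlib


def commit_id(snapshot, author, message, parents):
    """Hash everything the commit records, including its parent IDs.

    Simplified stand-in for illustration; not Git's real object format.
    """
    record = "\n".join([
        "snapshot " + snapshot,
        *("parent " + p for p in parents),
        "author " + author,
        "",
        message,
    ])
    return hashlib.sha1(record.encode("utf-8")).hexdigest()


# Commit A, and a rewritten version of it (say, after an amended message).
a = commit_id("files-v1", "alice", "first commit", parents=[])
a_rewritten = commit_id("files-v1", "alice", "first commit, reworded", parents=[])

# Same snapshot, author, and message for B; only the parent differs.
b = commit_id("files-v2", "bob", "build on A", parents=[a])
b_moved = commit_id("files-v2", "bob", "build on A", parents=[a_rewritten])

print(b == b_moved)  # False: a new parent means a new ID
```

Nothing about B's files or message changed, yet it cannot keep its old ID once its parent is different.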

The drawings are wrong

People who use words like “directed graph” draw commits with each arrow pointing to the preceding commit.

Time marches from left to right; arrows point to the left.

People learning about Git, especially those with a background in an older version control system, draw commits with each arrow pointing forward in time.

Arrows advance to the next point in time; arrows point to the right.

It’s a subtle point, but it means the difference between anticipating what will happen when you merge or rebase, and not. Those history-changing actions are changing where the arrows point; it doesn’t really matter what position the dots are in. Because the dots are not waiting in line by standing in a spot on the floor; they’re relaxing in a quaint shop, remembering which dot precedes them.

The correct mental model is to think of the arrows as identifying a commit’s parent: arrows pointing to the left.

Merges and rebases are just updating pointers.

A rebase updates pointers.

The more natural but unhelpful way to imagine the arrows is depicting the flow of time, but that means you have to imagine a rebase as violating the laws of physics.

Changing the flow of time, itself?

So think of those arrows as pointers to parents, instead.

That said, there is still a shortcut I’m taking in the above drawings, and it is important to call that out to keep from leading you astray. Recall that a commit includes information about its parent or parents. Therefore, changing a parent really means creating a new commit. It will get a new ID. And anything that uses the old ID as its parent will be cast adrift unless it, too, gets a new ID reflecting its amended lineage.

That’s why you’ll find stern warnings not to rewrite shared history. You would be replacing commits someone else might have built upon.
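
A toy Python model shows the "new commit, new ID" effect rippling down a lineage. The `make_commit` and `replay` helpers and the snapshot strings are hypothetical, and the sketch cheats in one important way: a real rebase re-applies each commit's changes, so the snapshots would change too. Here only the parent pointers change, which is enough to see the IDs shift.

```python
import hashlib


def make_commit(store, snapshot, parent):
    """Store a toy commit and return its ID, derived from snapshot and parent."""
    cid = hashlib.sha1(f"{snapshot}|{parent}".encode("utf-8")).hexdigest()
    store[cid] = {"snapshot": snapshot, "parent": parent}
    return cid


def replay(store, commits, new_base):
    """Re-create each commit, oldest first, on top of new_base."""
    parent = new_base
    for old_id in commits:
        parent = make_commit(store, store[old_id]["snapshot"], parent)
    return parent  # ID of the new tip


store = {}
root = make_commit(store, "root", None)
main_tip = make_commit(store, "main work", root)
f1 = make_commit(store, "feature 1", root)
f2 = make_commit(store, "feature 2", f1)

new_tip = replay(store, [f1, f2], main_tip)

print(new_tip != f2)              # True: the replayed tip has a new ID
print(store[f2]["parent"] == f1)  # True: the old commits are untouched
```

The old f1 and f2 are still sitting in the store with their old parents. Anyone who built on them is now following someone who has left the shop.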

HEAD is a label

Branches are drawn as if they were a garden path of stepping stones, leading off to adventure, but a branch is really just a label on a particular commit. That commit has a pointer to its parent, which has a pointer to its parent, and so on until you get back to a commit that is the parent of two lineages, yours and the one that can be traced back from a label called “master”.
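
As a minimal sketch of that walk, assume a toy history where every commit has at most one parent (no merges). The commit names, the `parents` and `branches` dictionaries, and the `fork_point` helper are all made up for illustration.

```python
# Toy data: commit -> parent, and branch label -> commit it sits on.
parents = {"A": None, "B": "A", "C": "B", "X": "B", "Y": "X"}
branches = {"master": "C", "myfeature": "Y"}


def lineage(commit):
    """Follow parent pointers from a commit back to the very first one."""
    chain = []
    while commit is not None:
        chain.append(commit)
        commit = parents[commit]
    return chain


def fork_point(branch_a, branch_b):
    """Return the most recent commit that both branch lineages share."""
    on_a = set(lineage(branches[branch_a]))
    for commit in lineage(branches[branch_b]):
        if commit in on_a:
            return commit
    return None


print(lineage(branches["myfeature"]))     # ['Y', 'X', 'B', 'A']
print(fork_point("master", "myfeature"))  # B
```

The branch itself is just one entry in a dictionary. Everything that looks like a "path" comes from following the parent pointers.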

Calling one branch “master” is just a convention, as is calling the remote repo you stuck on GitHub “origin”. The names don’t mean anything to Git, only to the people using it.

A commit can have lots of labels, for instance, its SHA1 hash identifier, master, the branch you just created, and HEAD. The branch you have checked out, and HEAD along with it, scoot along automatically as you add commits.

When you check out a branch (git checkout myfeature), or any commit (git checkout a30bef), Git makes the files in your working directory look like they did when that commit was made, and it makes the HEAD label point at that commit.

Checking out a branch is like passing a HEAD label to someone else.
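
The label-passing can be sketched with a tiny toy class. `ToyRepo`, its method names, and the `c1`, `c2` commit IDs are invented for illustration, and the model skips working-directory files and checking out a commit by ID entirely. It only tracks which label sits on which commit.

```python
class ToyRepo:
    """Labels on commits: branches name commits, HEAD names a branch."""

    def __init__(self):
        self.parents = {"c1": None}       # commit -> parent
        self.branches = {"master": "c1"}  # branch label -> commit
        self.head = "master"              # HEAD follows the checked-out branch
        self._count = 1

    def current_commit(self):
        return self.branches[self.head]

    def branch(self, name):
        """Stick a new label on the current commit."""
        self.branches[name] = self.current_commit()

    def checkout(self, name):
        """Pass the HEAD label to another branch."""
        self.head = name

    def commit(self):
        """Add a commit; only the checked-out branch label moves forward."""
        self._count += 1
        new_id = f"c{self._count}"
        self.parents[new_id] = self.current_commit()
        self.branches[self.head] = new_id
        return new_id


repo = ToyRepo()
repo.branch("myfeature")
repo.checkout("myfeature")
repo.commit()

print(repo.branches)          # {'master': 'c1', 'myfeature': 'c2'}
print(repo.current_commit())  # c2
```

The master label stays put on c1 because it was not the branch holding HEAD when the commit was made.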

“Detached head,” that rather scary-sounding state you get into when you check out a commit by ID, is just saying “HEAD points directly at a commit ID instead of at a branch name.” If you dig into your repository’s .git folder, you can find its HEAD text file. When you have a branch checked out, that text file contains a reference to the branch, e.g., ref: refs/heads/master. When you check out some arbitrary commit by ID, the text file contains that ID, and that signals Git to tell you it is in a detached head state. Checking out a named branch again gets things back to normal.
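
If you want to poke at this yourself, a short Python sketch can tell the two cases apart from the file's text. The `describe_head` function is a hypothetical helper, and the 40-character commit ID below is made up; the only thing taken from Git is the `ref: ` prefix described above.

```python
def describe_head(head_text):
    """Interpret the text of a HEAD file as a branch reference or a bare ID."""
    content = head_text.strip()
    prefix = "ref: "
    if content.startswith(prefix):
        # HEAD names a branch, and the branch names the commit.
        return {"detached": False, "ref": content[len(prefix):]}
    # Otherwise HEAD holds a commit ID directly: detached head.
    return {"detached": True, "commit": content}


print(describe_head("ref: refs/heads/master\n"))
# {'detached': False, 'ref': 'refs/heads/master'}

# Made-up commit ID, for illustration only.
print(describe_head("0123456789abcdef0123456789abcdef01234567\n"))
# {'detached': True, 'commit': '0123456789abcdef0123456789abcdef01234567'}
```

In an ordinary repository you could feed it the contents of the HEAD file inside the .git folder, read however you like.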

Pointers, not the arrow of time

  • Arrows point to the preceding commit, not the subsequent one.
  • Waiting in a shop, you need to know only whom you follow and whether or not you’re the most recent to arrive.
  • Branches are labels on commits.
  • Checking out a branch passes the HEAD label to that commit.

Do these images resonate? Are there parts that are still confusing?

Girl Writes Code

Articles on tech, C#, Python, IoT, Arduino, Raspberry Pi…

Sharon Cichelli

Written by

.NET and Python developer, open-source contributor, author of GirlWritesCode.com, Arduino enthusiast, and pinball fan.