Climbing Around the Git Tree

Amir Ebrahimi Fard
Data Management for Researchers
5 min read · Jul 26, 2021


Photo by Jaime Spaniol on Unsplash

In the Git workflow, committing one or more changes to any file(s) causes HEAD to move forward. However, sometimes we need to visit past commits in order to revert to older versions of our code or debug an error. There are several commands that function differently to move “backward” within the Git tree. Here, we’ll explain each command in more detail.

Let’s say we spot a mistake in the files we’ve added and/or our commit message after the commit has already been created. To fix the last commit message, we can use the command git commit --amend. In response, Git opens an editor that lets us alter and resave the message of the last commit. When the message is saved, the last commit message is updated. If the mistake requires modifying, adding, or deleting files associated with the commit, we make those changes, stage them, and then run git commit --amend, which folds the staged changes into the last commit and (optionally) lets us change the commit message.
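To make this concrete, here is a minimal sketch of both uses of amend in a throwaway repository. The file name, commit messages, and identity are made up for illustration, and the -m and --no-edit flags are used only so the example runs without opening an editor.

```shell
# Work in a temporary repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

echo "first draft" > notes.txt            # hypothetical file
git add notes.txt
git commit -q -m "Add notse"              # commit message with a typo

# Case 1: fix only the message of the last commit.
git commit -q --amend -m "Add notes"

# Case 2: fix the content as well. Edit, stage, then amend.
echo "second draft" > notes.txt
git add notes.txt
git commit -q --amend --no-edit           # keep the (already fixed) message

git log --oneline                         # still one commit, with the new message
```

Note that amending replaces the last commit with a new one, so it is best reserved for commits that have not been shared with others yet.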

The next command is git revert, which undoes the changes introduced by a specific past commit without rewriting the rest of the Git tree. This command can be a bit tricky for a few reasons. First of all, it only affects one particular commit. For instance, if we have the following history of commits:

commit10 <- HEAD
commit9
commit8
commit7
...
commit1

and we revert commit7 (git revert <commit7_SHA>), the command undoes only the changes made in commit7 and keeps all the commits before and after commit7 intact. This means that git revert does not really go “back in time”. Instead, it undoes the requested commit by creating a new commit that applies the inverse of its changes. For example, if a line was added to a file in commit7 and we run git revert <commit7_SHA>, Git will make a new commit (after the latest commit, commit10) where that line is no longer there (because we’ve reverted the changes made in commit7). It also works the other way around: if a line was deleted in an earlier commit, then reverting that commit adds that content back.
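A small sketch of this behavior, using a three-commit history instead of ten to keep it short. The file names and contents are hypothetical; --no-edit just accepts the default revert message instead of opening an editor.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

printf 'alpha\n' > story.txt
git add story.txt
git commit -q -m "commit1"

printf 'alpha\nbeta\n' > story.txt        # commit2 adds a line
git commit -q -am "commit2"

printf 'unrelated\n' > other.txt          # commit3 touches a different file
git add other.txt
git commit -q -m "commit3"

before=$(git rev-list --count HEAD)
git revert --no-edit HEAD~1               # HEAD~1 is commit2 here
after=$(git rev-list --count HEAD)

echo "$before -> $after"                  # prints "3 -> 4": history grew by one commit
cat story.txt                             # the line added in commit2 is gone
git log --oneline                         # commit1..commit3 are all still listed
```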

Running git revert may raise a conflict, especially if the commit being reverted is not the most recent one. This normally happens when more recent commits depend on the commit you want to revert. Imagine the scenario in the following Git tree:

Figure 1: Reverting a non-terminal commit in a Git tree [1].

Here, git revert <commit2_SHA> adds another commit to the Git tree that reverts Commit2. While Git automatically merges the reverted changes into HEAD, a conflict may arise that you have to fix manually. This may happen because Commit3 depends on the changes from Commit2 that are being reverted.

In other words, if the commits after the one we want to revert depend on content that disappears with the reversion, we create a conflict that requires manual fixing. In the following figure, we create a separate file in each commit and then revert the second commit. This doesn’t raise any issues, because no later commit touches file 2, so Git can remove it cleanly. In this case, running git revert <commit2_SHA> simply opens an editor to save the revert message.

Figure 2: Reverting a non-terminal commit without raising a conflict.
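The conflict-free scenario can be reproduced roughly as follows. The file names (file1, file2, file3) and their contents are assumptions made for this sketch, and --no-edit stands in for saving the default message in the editor.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

# One new file per commit.
for n in 1 2 3; do
  echo "content $n" > "file$n"
  git add "file$n"
  git commit -q -m "commit$n"
done

commit2_sha=$(git rev-parse HEAD~1)       # SHA of the second commit
git revert --no-edit "$commit2_sha"       # no later commit touches file2: no conflict

ls                                        # file1 and file3 remain, file2 is gone
```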

On the other hand, suppose we have a file (called file1) and we add a character to it in every commit. After the third commit we want to revert the second commit, so we run git revert <commit2_SHA>, and we see that there is a conflict. We need to resolve it the same way we resolve merge conflicts: by opening the conflicted files and fixing them manually. After editing and saving the files, we return to the bash environment to add and commit the changes in Git. (Note: if, after manually editing the conflicted file, the outcome is exactly the same as the most recent commit, the working tree is clean and there is nothing to commit, which calls into question the utility of reverting in this scenario [2].)

Figure 3: Reverting a non-terminal commit which leads into a conflict.
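Here is a sketch of the conflicting scenario. The characters a, b, and c and the hand-picked resolution ("ac") are assumptions for illustration; in practice you would open file1 in an editor and decide what the final content should be.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

# Every commit appends one character to the same line of file1.
printf 'a\n' > file1
git add file1
git commit -q -m "commit1"
printf 'ab\n' > file1
git commit -q -am "commit2"
printf 'abc\n' > file1
git commit -q -am "commit3"

# commit3 changed the same line as commit2, so reverting commit2 conflicts.
if ! git revert --no-edit HEAD~1; then
  conflicted=$(git diff --name-only --diff-filter=U)
  echo "Conflict in: $conflicted"

  # Resolve by hand: replace the conflict markers with the content we want.
  printf 'ac\n' > file1
  git add file1
  git commit -q -m 'Revert "commit2" (conflict resolved manually)'
fi

git log --oneline                         # three original commits plus the revert
```

If the resolution were "abc" instead, the file would be identical to commit3 and there would be nothing to commit, which is the situation described in the note above.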

As mentioned earlier, reverting the latest commit will not raise a conflict, as there is no commit after it. The other command that we’ll discuss is reset. In contrast to revert, git reset rewrites the Git tree: it moves the current branch back to <commit_reference>, so the commits after that point drop out of the branch history and can be lost. It is often used with one of three options: --soft, --mixed, or --hard. The option determines where the changes from the removed commits end up once the reset is complete.

The first, git reset --soft <commit_reference>, moves the changes (back) to the stage. This helps if you have recently made a whole bunch of small commits and want to regroup them into one large commit.

Second, git reset --mixed <commit_reference>, or just git reset <commit_reference>, moves the changes back to the working directory (out of both the commit and the stage). If you made some changes that you want to keep but want to regroup or reorder into different commits, this command can help.

The last, git reset --hard <commit_reference>, wipes all the changes and commits after <commit_reference> [3].
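The difference between the three options is easiest to see side by side. The sketch below resets the same last commit three times, recreating it in between; the file names and commit messages are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

for name in a b c; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "add $name.txt"
done

# --soft: the last commit is undone, its changes stay staged.
git reset --soft HEAD~1
soft_status=$(git status --short)
echo "$soft_status"                       # c.txt is listed as staged (A)
git commit -q -m "add c.txt again"

# --mixed (the default): the changes land in the working directory, unstaged.
git reset --mixed HEAD~1
mixed_status=$(git status --short)
echo "$mixed_status"                      # c.txt is listed as untracked (??)
git add c.txt
git commit -q -m "add c.txt once more"

# --hard: the commit and its changes are both discarded.
git reset -q --hard HEAD~1
ls                                        # only a.txt and b.txt are left
```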

The other command in Git that is used to go back in the Git tree is git checkout <commit_reference>. Using this command creates a detached HEAD state. In this state, we can make experimental changes and even commit them, but those commits do not belong to any branch: when we switch back to where we were before (using git checkout <branch_name>), they are left behind unless we first create a branch pointing to them. In essence, git checkout <commit_reference> allows us to explore the state of the working directory and the project files at any specific commit [4].
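As a rough illustration, the sketch below visits the first commit of a small history, makes an experimental commit there, and then returns to the branch. The file names and the suggested branch name keep-experiment are made up for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

for name in a b c; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "add $name.txt"
done

branch=$(git symbolic-ref --short HEAD)   # remember the branch we started on

git checkout -q HEAD~2                    # detached HEAD at the first commit
ls                                        # only a.txt exists at this point in history

# An experimental commit made while detached.
echo "experiment" > exp.txt
git add exp.txt
git commit -q -m "experiment"
exp_sha=$(git rev-parse HEAD)

git checkout -q "$branch"                 # back to the branch
ls                                        # a.txt, b.txt, c.txt; exp.txt is not here
git branch --contains "$exp_sha"          # prints nothing: no branch holds the experiment

# To keep the experiment, point a branch at it before moving on, e.g.:
#   git branch keep-experiment "$exp_sha"
```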
