When You “Git” in Trouble: a Version Control Story
Thank you for laughing at my extremely funny title. But do you know what’s not funny? When you push a commit to your git repo, and you see this in GitHub Desktop:
Yes, I know that the cool people use Git Tower and that the really cool people just use the command line. We’re really cool people, so we’re going to use the command line to solve this problem. In fact, we have no choice — and that’s the adventure you’re joining me on in this article: fixing a git repo that suddenly becomes damaged through absolutely no fault of your own and despite having no command line git expertise whatsoever. But at least you have a visualization of my panic.
Step one is diagnosing the problem. If you’re like me, however, you also have a meta problem, and that comes from relying on a tool you barely understand, which makes diagnosis difficult. Finding sympathetic experts to help you resolve your specific case is also a challenge. You might not even know who to ask or what the questions should be. Nevertheless, you can learn how things work and how to resolve your own dilemmas piecemeal—if you’re patient, systematic, and willing to learn. And that, friends, entailed for me a trip to the git reference documentation, where I discovered the git-fsck command, which I dutifully ran in my repo’s root directory, and which resulted in the following (truncated) output:
> git fsck
...
error: object file .git/objects/67/99ddac675cab54060cdfb066dbfadb6708fc3f is empty
error: object file .git/objects/67/99ddac675cab54060cdfb066dbfadb6708fc3f is empty
fatal: loose object 6799ddac675cab54060cdfb066dbfadb6708fc3f (stored in .git/objects/67/99ddac675cab54060cdfb066dbfadb6708fc3f) is corrupt
So what we have here, and what you probably have if one of your projects comes down with a case of the blinkies, is a corrupted repository. Oh. Good to know. Now what? When you want to know the meaning of life, you ask God. In this case, I consulted an ancient email from Linus Torvalds, which happens to address a similar situation.
A git repo is really a graph of various kinds of binary objects: blobs, trees, and commits. Blob objects are cryptographically-hashed, well, blobs of your data, each representing one of your files. These blob objects are independent of one another, but they are also linked by so-called tree objects, which effectively group the blobs into an arrangement that’s analogous to a file system’s directory structure. Finally, there are commit objects, which contain the information necessary to track the changes in your trees and blobs. The commit objects are also linked, sequentially (as you might expect).
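To make the "cryptographically-hashed" part concrete, here is a minimal sketch of how a blob's name is derived from its content. It assumes a repository using Git's default SHA-1 object format; the function name git_blob_id is my own, not part of Git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID Git gives a blob with this content (SHA-1 repos)."""
    # Git hashes a small header ("blob <size>" plus a NUL byte) followed by the
    # file's bytes, so the ID depends only on the content, not on the filename.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The empty file always hashes to the same well-known ID.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

This should agree with what git hash-object prints for the same bytes, which is why recreating a file byte for byte is enough to recreate its blob.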
Git stores all of these objects in a series of nested directories, located in .git/objects/, according to the first two characters of their object IDs (as you can see above). For example, the object 6799ddac675cab54060cdfb066dbfadb6708fc3f is stored in a directory called 67/ as the file 99ddac675cab54060cdfb066dbfadb6708fc3f; that is, the full object name is a combination of the directory it's stored in and a specific file in that directory.
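The naming scheme above, and the kind of damage git fsck complained about, can be sketched in a few lines. This is an illustration under assumptions, not how Git itself does it: the helper names are hypothetical, it assumes SHA-1 object IDs, and it only looks at loose objects, so an object that lives in a packfile will be reported as "missing" here even though the repo has it.

```python
import hashlib
import zlib
from pathlib import Path


def loose_object_path(git_dir: str, object_id: str) -> Path:
    """Map an object ID to the file where a loose copy of it would be stored."""
    return Path(git_dir) / "objects" / object_id[:2] / object_id[2:]


def check_loose_object(git_dir: str, object_id: str) -> str:
    """Classify a loose object as ok, missing, empty, corrupt, or hash mismatch."""
    path = loose_object_path(git_dir, object_id)
    if not path.exists():
        return "missing"  # not present as a loose object (it may still be packed)
    raw = path.read_bytes()
    if not raw:
        return "empty"  # the situation git fsck reported for 6799dda...
    try:
        data = zlib.decompress(raw)  # loose objects are zlib-compressed
    except zlib.error:
        return "corrupt"
    # The ID is the SHA-1 of the decompressed "type size\0content" bytes.
    if hashlib.sha1(data).hexdigest() != object_id:
        return "hash mismatch"
    return "ok"


print(loose_object_path(".git", "6799ddac675cab54060cdfb066dbfadb6708fc3f"))
```

Calling check_loose_object(".git", "<some id>") from a repo's root directory would give a rough, read-only second opinion on a single object; git fsck remains the real diagnostic.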
So, if one of the commit objects becomes corrupted, your whole repo may turn into a useless pile of bytes, because the chain of linked commits will have been broken. That’s the bad news. The good news is that for the very reason your repo is a collection of discrete files, you may be able to restore it to health, even if one of the objects is corrupted beyond repair — if you can perform a precise enough surgery.
That’s what I tried to do.
Following Linus’s advice, I moved the corrupted commit object file .git/objects/67/99ddac675cab54060cdfb066dbfadb6708fc3f somewhere else. You can park your damaged objects wherever you like. They will probably end up in the trash, anyway.
I happened to get the same error message for blob object 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf, so I moved that one too. Then I tried the file system check again:
> git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (8970/8970), done.
broken link from tree 03a88f876eb3f6157f76461a3ae6cb18bbb86561
to blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf
dangling commit 76814e15074b540bc2f7e78daf3f5175a8759523
missing commit 6799ddac675cab54060cdfb066dbfadb6708fc3f
missing blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf
dangling blob 2a60520000698ad964e4e61fab31f9b862763550
dangling commit 41634cd81964068acb153bfa355d63bd80fc7cef
dangling commit 5bf415e2bdbc47822ae99b64c2a0f6b4f288eefb
Note that Linus suggests using git fsck --full, but this is the default behavior now.
Ignoring the “dangling commit” messages, the “broken link” message tells me which tree object points to the blob object I just removed. In effect, I broke the link on purpose to reveal this information. Tree object 03a88f876eb3f6157f76461a3ae6cb18bbb86561 expects to point to blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf, but the blob isn't there. Commit object 6799ddac675cab54060cdfb066dbfadb6708fc3f, the other one I moved, is also reported missing. So far, so good.
Continuing with Linus’s advice, I now had enough information to use the git-ls-tree command to list the contents of the tree object called out above:
> git ls-tree 03a88f876eb3f6157f76461a3ae6cb18bbb86561
100644 blob 312d8994f1005a9563a9410c592b27000c201101 building-test.js
100644 blob f84006fd14c6d4b2ccc3ef22b2fe02abf535bd1a folds-test.js
100644 blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf index.js
100644 blob 8b1f47bce7ec989dff7e936279d63d1d02f6a92d indexing-test.js
100644 blob 3f2b45f8cd9dfd486c8e821ee672ed66a34768df inf-test.js
100644 blob 0e6d1985aa59d17e2115bc6c7936d2ac88b00457 list-test.js
100644 blob 40310b5df53691d0e1ba4118c0e3ab66ed766990 reducing-test.js
100644 blob 8244d5fb2768ad5c7c33890ee26c797c2df6262b searching-test.js
100644 blob ff34a2f15faf6eec7d9c9635e79d2a0abdadfb42 sub-test.js
100644 blob ac0c029cc64870c7445c4bdd9d7fe20646b5cc33 trans-test.js
100644 blob 99781d303e90b7aa4de8d630c1053a42f87e8331 zip-test.js
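With eleven entries you can find the damaged blob by eye, but in a bigger tree you might prefer to search the listing mechanically. A small sketch, assuming output in the mode/type/ID/name layout shown above; find_path_for_blob is a made-up helper name.

```python
def find_path_for_blob(ls_tree_output: str, blob_id: str):
    """Return the file name that a git ls-tree listing gives for blob_id, or None."""
    for line in ls_tree_output.splitlines():
        # Each entry is "<mode> <type> <object id> <name>"; the name comes last
        # and may itself contain spaces, so split at most three times.
        fields = line.split(None, 3)
        if len(fields) == 4 and fields[1] == "blob" and fields[2] == blob_id:
            return fields[3]
    return None


listing = """\
100644 blob f84006fd14c6d4b2ccc3ef22b2fe02abf535bd1a folds-test.js
100644 blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf index.js
100644 blob 8b1f47bce7ec989dff7e936279d63d1d02f6a92d indexing-test.js
"""
print(find_path_for_blob(listing, "67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf"))
```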
Scanning the list, I found the culprit blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf and its associated file: index.js. So now I knew the source of the problem, but I didn't know which version of that file created the problem to begin with. Back to the command line:
> git log --raw --all
commit cf63a71497e027d96614cfff6ba1d297f1a1a26e
Author: Steven Syrek <steven.syrek@example.com>
Date: Mon Jul 18 11:55:40 2016 -0400
Add tests for set operations on lists
:100644 100644 67a45ac... c1c2f99... M test/list/index.js
:000000 100644 0000000... 23c47fe... A test/list/set-test.js
commit f3bc2c55b22deb889f99cdd45663c20a8e8e79c1
Author: Steven Syrek <steven.syrek@example.com>
Date: Mon Jul 18 11:14:13 2016 -0400
Add tests for list zipping and unzipping functions and remove exponentiation operator from tests and examples
:100644 100644 01af47b... 3b1bf35... M source/list/zip.js
:100644 100644 21206e2... 67a45ac... M test/list/index.js
:000000 100644 0000000... 99781d3... A test/list/zip-test.js
The git-log command, with the --raw and --all options, will show the entire commit history of a repo, along with the old and new blob IDs of every file each commit changed. I only show the relevant parts from mine above. What we can see here is that object 21206e20386e0365bc6f15d0ccd372b1c72b5667 precedes the corrupted object 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf, which is in turn followed in the subsequent commit (they are listed in reverse order) by object c1c2f99072ef41aca89e963cfb0143f897e0de78.
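The before-and-after reading of those raw lines can be sketched as a tiny parser. It assumes the colon-prefixed line layout shown above (old mode, new mode, old ID, new ID, status, path) with abbreviated IDs, and the function name is hypothetical.

```python
def neighbours_of_blob(raw_log: str, bad_id: str):
    """Find the blob IDs recorded just before and just after bad_id.

    Returns (before, after) as abbreviated IDs; either may be None.
    """
    before = after = None
    for line in raw_log.splitlines():
        line = line.strip()
        if not line.startswith(":"):
            continue  # skip commit headers and messages
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        # Older Git versions append "..." to abbreviated IDs; strip it if present.
        old, new = fields[2].rstrip("."), fields[3].rstrip(".")
        if bad_id.startswith(new):
            before = old  # this commit turned `old` into the bad blob
        if bad_id.startswith(old):
            after = new  # this commit replaced the bad blob with `new`
    return before, after


log = """\
:100644 100644 67a45ac... c1c2f99... M test/list/index.js
:000000 100644 0000000... 23c47fe... A test/list/set-test.js
:100644 100644 21206e2... 67a45ac... M test/list/index.js
"""
print(neighbours_of_blob(log, "67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf"))
```

If the same blob ID shows up on more than one path or branch, this sketch simply keeps the last match it sees, so treat the result as a hint rather than an answer.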
At this point, Linus says that I’m done, because I discovered which versions of the file preceded and followed the corrupted object:
If you can do that, you can now recreate the missing object with
git hash-object -w <recreated-file>
and your repository is good again!
Unfortunately, after I tried this, my repository was not good again. Now things started to get hairy. Past Linus was out of advice, and present Linus (now also past Linus) probably had better things to do than help me. I was therefore left to follow the astute troubleshooting process that professional developers use every day:
Google. Google.
Stack Overflow. omg it’s down
This throw-everything-at-the-wall approach led me to try a few things, starting with the git-diff command. If I couldn’t automatically re-create the missing object, I reasoned desperately, perhaps I could do it manually:
> git diff 206e20386e0365bc6f15d0ccd372b1c72b5667..c2f99072ef41aca89e963cfb0143f897e0de78
fatal: ambiguous argument '206e20386e0365bc6f15d0ccd372b1c72b5667..c2f99072ef41aca89e963cfb0143f897e0de78': unknown revision or path not in the working tree.
Oops. I forgot the leading characters:
> git diff 21206e20386e0365bc6f15d0ccd372b1c72b5667..c1c2f99072ef41aca89e963cfb0143f897e0de78
diff --git a/21206e20386e0365bc6f15d0ccd372b1c72b5667..c1c2f99072ef41aca89e963cfb0143f897e0de78 b/c1c2f99072ef41aca89e963cfb0143f897e0de78
index 21206e2..c1c2f99 100644
--- a/21206e20386e0365bc6f15d0ccd372b1c72b5667..c1c2f99072ef41aca89e963cfb0143f897e0de78
+++ b/c1c2f99072ef41aca89e963cfb0143f897e0de78
@@ -25,3 +25,7 @@ export * from './sub-test';
export * from './searching-test';
export * from './indexing-test';
+
+export * from './zip-test';
+
+export * from './set-test';
Above are the lines in index.js that changed between the two versions of the file on either side of the corrupted one. They are marked with a +, with a few surrounding lines also shown for context. Deleted lines, if there had been any, would have been marked with a -. Since two identical files, when hashed, should produce identical hash keys, I thought I'd try to brute force a solution by deleting the changed lines and re-creating the object by hand:
> git hash-object -w ./test/list/index.js
2a60520000698ad964e4e61fab31f9b862763550
Nope. Try again, maybe just deleting the lines marked +.
> git hash-object -w ./test/list/index.js
21206e20386e0365bc6f15d0ccd372b1c72b5667
Nope, but interesting. I managed to recreate the original state of the object before the corrupted commit happened, but I guess what I was really trying to do was re-create the correct intermediate state? There weren’t too many possibilities, fortunately, since I had uncharacteristically been going through a good git hygiene phase. So I changed the file once more, adding to it only those lines marked with a + that I recalled adding before everything went bits up:
> git hash-object -w ./test/list/index.js
67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf
Yay.
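I did this guessing by hand, but the same trial and error can be automated when the list of candidate lines is short. The sketch below is an illustration of the idea, not what I ran: it takes the newer version of the file, drops every possible subset of the lines the diff marked with a +, and stops when a candidate hashes to the missing blob's ID. It assumes SHA-1 object IDs and UTF-8 text, all names are hypothetical, and because it tries 2^n combinations it is only sensible for a handful of lines.

```python
import hashlib
from itertools import combinations


def git_blob_id(content: bytes) -> str:
    """Object ID Git would assign to a blob with this content (SHA-1 repos)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def find_version(new_lines, added_idx, target_id):
    """Search for the intermediate file content whose blob ID is target_id.

    new_lines -- lines of the newer file, each with its trailing newline
    added_idx -- indexes into new_lines of the lines the diff marked with "+"
    """
    for k in range(len(added_idx) + 1):
        for dropped in combinations(added_idx, k):
            skip = set(dropped)
            candidate = "".join(
                line for i, line in enumerate(new_lines) if i not in skip
            )
            if git_blob_id(candidate.encode("utf-8")) == target_id:
                return candidate
    return None  # no combination of the added lines reproduces the blob


# Toy data: the "lost" version had only the first of two added exports.
newer = ["export * from './a';\n", "export * from './b';\n", "export * from './c';\n"]
lost_id = git_blob_id("export * from './a';\nexport * from './b';\n".encode("utf-8"))
print(find_version(newer, [1, 2], lost_id))
```

A match only proves that the bytes are right; you would still write the recovered content to the file and run git hash-object -w on it to put the blob back into the repo.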
> git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (8970/8970), done.
dangling commit 76814e15074b540bc2f7e78daf3f5175a8759523
missing commit 6799ddac675cab54060cdfb066dbfadb6708fc3f
dangling blob 2a60520000698ad964e4e61fab31f9b862763550
dangling commit 41634cd81964068acb153bfa355d63bd80fc7cef
dangling commit 5bf415e2bdbc47822ae99b64c2a0f6b4f288eefb
Oh right, I have a healthy blob now, but I’m still missing the commit object that points to it. Now what? It’s git-gc to the rescue!
> git gc
error: Could not read 6799ddac675cab54060cdfb066dbfadb6708fc3f
error: Could not read 6799ddac675cab54060cdfb066dbfadb6708fc3f
warning: reflog of 'HEAD' references pruned commits
warning: reflog of 'refs/heads/restructure' references pruned commits
error: Could not read 6799ddac675cab54060cdfb066dbfadb6708fc3f
fatal: Failed to traverse parents of commit b267a6a8264c0cdc72d047049610fc91e9f7c06f
error: failed to run repack
Or not. That was supposed to garbage collect all the… garbage. And fix… all the things. I don’t know why I thought that. But I hoped. I really, truly hoped. And then I imprecated. Noting the “reflog” message above, I was on to my next brilliant idea:
> git reflog expire --all --stale-fix
error: Could not read 6799ddac675cab54060cdfb066dbfadb6708fc3f
fatal: Failed to traverse parents of commit b267a6a8264c0cdc72d047049610fc91e9f7c06f
As everyone knows, when using command line tools, the more options you add, the more masculine you are. It doesn’t matter if you don’t know what they do. Real men don’t read man pages: they just move fast and break things. Plus, I rather liked the idea of re-flogging my repo. But no. That didn’t work, either.
I had by now waded well into the waters of trying absolutely anything, without regard for sense or soundness. I turned back to the logs, which always feels one step shy of admitting defeat and finding a corner in which to quietly weep. But perhaps a solution would miraculously present itself, something I missed before but was there the whole time for all to see?
> git log 6799ddac675cab54060cdfb066dbfadb6708fc3f
fatal: bad object 6799ddac675cab54060cdfb066dbfadb6708fc3f
Nope.
> git ls-tree 6799ddac675cab54060cdfb066dbfadb6708fc3f
fatal: not a tree object
Nope. I mean, duh. Somehow, I then had the bright idea of examining the logs for just the restructure branch of my repo, which is the one I had been working on when the fatal blinking cursor entered my life:
> tail -n 40 .git/logs/refs/heads/restructure
...
44dc22e706fb029a9c96f3bd125755fd55ac882b 6799ddac675cab54060cdfb066dbfadb6708fc3f Steven Syrek <steven.syrek@example.com> 1468788351 -0400 commit: Replace isEq function in all tests with should.eql
6799ddac675cab54060cdfb066dbfadb6708fc3f b267a6a8264c0cdc72d047049610fc91e9f7c06f Steven Syrek <steven.syrek@example.com> 1468789759 -0400 commit: Separate out functions in Ord tests
...
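Each line in that file starts with two commit IDs: the value the branch had before the operation and the value it had afterwards. A minimal sketch of pulling the last good commit out of such a log, with the caveat that the "before" value is only the parent commit when the entry records an ordinary commit (as it does here); the function name is my own.

```python
def last_good_before(reflog_text: str, bad_id: str):
    """Return the ID the branch pointed at just before it moved to bad_id."""
    for line in reflog_text.splitlines():
        # Layout: "<old id> <new id> <name> <email> <timestamp> <tz> <message>"
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # skip elided or malformed lines
        old_id, new_id = fields[0], fields[1]
        if new_id == bad_id:
            return old_id
    return None


reflog = """\
44dc22e706fb029a9c96f3bd125755fd55ac882b 6799ddac675cab54060cdfb066dbfadb6708fc3f Steven Syrek <steven.syrek@example.com> 1468788351 -0400 commit: Replace isEq function in all tests with should.eql
6799ddac675cab54060cdfb066dbfadb6708fc3f b267a6a8264c0cdc72d047049610fc91e9f7c06f Steven Syrek <steven.syrek@example.com> 1468789759 -0400 commit: Separate out functions in Ord tests
"""
print(last_good_before(reflog, "6799ddac675cab54060cdfb066dbfadb6708fc3f"))
```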
Huh. “Maybe the same diff thing I did for the blob objects will work on the commit objects,” I thought. So:
> git diff b267a6a8264c0cdc72d047049610fc91e9f7c06f..44dc22e706fb029a9c96f3bd125755fd55ac882b
...
(bunch of irrelevant stuff)
OK. No. But at least I still had a commit hash, 44dc22e706fb029a9c96f3bd125755fd55ac882b, to do something with. It was the last good one before my arch nemesis, 6799ddac675cab54060cdfb066dbfadb6708fc3f, darkened my world. I consulted the docs. I consulted the Internet. And I took one more shot in the dark:
> git branch -l rewrite-tests 44dc22e706fb029a9c96f3bd125755fd55ac882b
What I did here was to create a new branch called rewrite-tests, using the 44dc22e706fb029a9c96f3bd125755fd55ac882b commit (i.e. the last good one) as its start point, in accordance with the git branch [--set-upstream | --track | --no-track] [-l] [-f] <branchname> [<start-point>] pattern specified in the git-branch docs. I am not actually sure what the -l option is for, or even whether it's necessary. Someone said to use it. Shrug. (For the record: at the time, -l told git to create a reflog for the new branch. Since Git 2.20 it is shorthand for --list, so on a current version you would leave it out.)
I then moved all the files out of the repo and did one of these:
> git checkout rewrite-tests
The git-checkout command sets HEAD to the specified branch. In other words, I told git that I wanted to work on the rewrite-tests branch. Then, I just copied all the files back over, re-committed them, and left the restructure branch to wither and die.
And just like that, to my astonishment, I was done. The worst was over and none too soon: I was starting to see in hash keys. I had a new branch to develop, and none of my work was lost (despite the untimely deaths of a few intervening commits). Eventually, I squashed everything back into master, though I now tend to avoid working on that branch directly, in any repo, in the event one of these kerfuffles arises again.
I still visit Ms. Blinky from time to time, just to gloat. Actually, no, I don’t do that. But you can visit my wounded-and-repaired repo yourself, if you like: it contains my maryamyriameliamurphies.js project. It’s a substantial amount of code that I worked on entirely alone. You can imagine how I felt when I thought I might have ruined it. And how I felt when I figured out how to fix it.
At the beginning of this article, I suggested that a damaged git repository — since it is composed of discrete objects — could potentially be recovered through careful surgery. We have seen two possibilities for such an operation. The first, in the case of damaged blob objects, is to excise the offending blob(s) and then suture over the wound through a rehashing of the original file. The second, if the first fails (or if the problem is a damaged commit object, not just a blob), is to amputate the wounded branch at the point of its corruption, graft a new branch onto the stump, and recommit any files that were casualties of the procedure.
These are different solutions but similar in that they both entail repairing a data structure at a rather low level, even if it’s only manipulating files. In fact, the repair operations are possible precisely because a git repo is stored as a series of files. Insofar as a file system is really just a large data structure with an interface, the command line, a git repo is also a file system-like data structure with an interface, the git command and its various sub-commands and options. If you can learn how to use a file system from the command line, in other words, you can learn how to use git, too.
I wish this story had resolved into a set of specific instructions for solving a problem that you yourself might one day encounter. Unfortunately, I only have some trite encouragement to offer: you can do it! Because it’s so hard to know the cause of these sorts of errors, not to mention the best way to fix them, it’s also hard to generalize about them. All you can do is dig in and fight back against entropy. Learn your tools, don’t fear them. Try things yourself, if only for the experience, and before you beg one of your scientician friends to help you. Just remember to back up first!
If there is an obvious moral here, it’s the one that makes recovery much, much easier and far less costly should you ever be confronted by that dreaded, blinking status bar or its obtuse command line equivalent: commit early, and commit often. Preventive care, after all, is often the best medicine.