Is a Git Repository a Blockchain?

tl;dr; It depends on how you define what a blockchain is. But under more general definitions, yes. And under more restrictive definitions, no. Some definitions are more useful than others.

I know, my summary is kind of a cop out. Not kind of a cop out, a real big cop out. But unless you go with one of the narrower definitions, a Git repository looks, smells, and acts like a blockchain. Just not in the ways you would expect. To understand this I think it is more instructive to look at what a blockchain is from the most restrictive definitions to the least. The truth lies in the middle ground, I just don’t know where that middle ground is.

A Git Repository is not Bitcoin

Well, that much should be obvious: a Git repository is not Bitcoin. Capital B and little b. Neither in structure nor in payload is it Bitcoin. There are some purists who insist that when you say “blockchain” what you should mean is “The Blockchain,” rooted in the genesis block mined by Satoshi himself (may his work not be in vain). But I reject those maximalists as narrow-minded. Relegating “blockchain” to be a term like “Champagne” or “Caviar” only diminishes its meaning and value. Unix went down that path and now only really has scars (and enterprise-grade Windows) to show for it.

Perhaps we need a snazzier name for the blockchain that Bitcoin runs on. How about referring to it as the “Satoshi Chain” or the “Nakamoto Chain?” The “Satoshi Nakamoto Chain” is too much of a mouthful, and I fear we may soon have multiple active chains that will be rooted off of Satoshi’s genesis block. But the definition of what a blockchain is should not be judged by its heritage.

A Git Repository is not a CryptoCurrency

So if we cannot judge blockchainness by heritage, how about judging it by the content it carries? Is a blockchain a data structure that holds a ledger of tokens, tracked by pseudonymous cryptography?

By this standard a Git repository is most certainly not a blockchain. The content is not necessarily a ledger (although you could store a ledger), and the data is just blobs of data, a rudimentary file system heuristic, and cryptographic hashing. Well, that puts it part of the way to a CryptoCurrency even though it uses an older hashing algorithm. And it does have public key cryptography (via GPG-signed tags). It has forking, although it’s called branching and considered to be not only a feature but a desirable tool. It even has proof of work if you stretch the definition to include converting caffeine into code that passes unit tests. But none of these features are deeply intermingled the way the mining reward and fork resolution strategies, such as picking the fork with the most work, are in a traditional CryptoCurrency.

But to conflate CryptoCurrencyness with Blockchainness is to diminish both. You can have a CryptoCurrency without requiring a blockchain structure; it is just that they work better on a blockchain. I would say that a CryptoCurrency is a payload for a blockchain, and sometimes the two are deeply entangled. But they are ultimately two severable aspects: each can live without the other, even though their combination creates something larger than the sum of its parts. So the fact that a Git repository is not a CryptoCurrency does not disqualify it.

A Git Repository can be a Distributed Ledger

You could use a Git repository to store a text document stating who owns what. For those with older parents this might be useful to track who gets what when it comes time to settle the estate. For a more formalized approach you could use a comma- or tab-separated file to more closely resemble a spreadsheet. Or even an Excel document if you need programmable functions. And if you need fully executable distributed apps you could store the source code and check in run results. I’ve heard lots of good things about using Git repositories for code. Highly recommended.

But this divorces a Git repository from the underlying protocols that make it work. A Git repository is useful for maintaining a distributed version history of content. But when you focus on the content instead of the mechanisms you quickly realize that you could substitute the whole process with zip files and a deterministic file naming convention.

So a blockchain is not determined by the content it carries, because the content of a blockchain does not change the structure of the blockchain.

A Git Repository is a Chain of Blocks

This is what I view as the least restrictive definition of what a blockchain is while still remaining a blockchain. A blockchain is a data structure that carries some quantum of payload and a reference to the previous block, or to no block if it is the genesis block. So the broadest definition of “what is a blockchain” would then be “it is just a linked list at internet scale.”

We could possibly spruce up the definition by requiring that the reference to the previous block be a cryptographically strong hash. But if the goal is the broadest definition of what a blockchain could be then this is about the only revision we could make. Proof of work? Nope, some chains use proof of stake or anchoring. Merkle trees for payload? Why not just put the whole payload in the block header? Tokens? Nope, we have a word for that already: CryptoCurrency. Distributed consensus? Blockchains are part of a solution for Byzantine Fault Tolerance, but in and of themselves they are not the solution. Is there only a single previous block? Ethereum is doing some cool stuff with what they call Uncles, and I think there’s more there that can be done.
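
To make the "linked list with a strong hash as the back-reference" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Block` class, the JSON encoding, and the choice of SHA-256 are not taken from any particular chain. It only shows the structure described above: a payload, a hash pointing at the previous block, and no pointer at all for the genesis block.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Block:
    """Hypothetical block: some payload plus a reference to the previous block."""
    payload: str
    prev_hash: Optional[str]  # None marks the genesis block

    def block_hash(self) -> str:
        # Hash a canonical encoding of the payload and the back-reference.
        encoded = json.dumps(
            {"payload": self.payload, "prev": self.prev_hash}, sort_keys=True
        ).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()


def append(chain: List[Block], payload: str) -> None:
    """Add a block whose back-reference is the hash of the current tip."""
    prev = chain[-1].block_hash() if chain else None
    chain.append(Block(payload, prev))


def is_valid(chain: List[Block]) -> bool:
    """Check that every block points at the hash of the block before it."""
    if not chain:
        return True
    if chain[0].prev_hash is not None:
        return False
    return all(
        block.prev_hash == prior.block_hash()
        for prior, block in zip(chain, chain[1:])
    )


chain: List[Block] = []
for text in ["genesis", "second", "third"]:
    append(chain, text)

# Rewriting a block in the middle breaks the link held by its successor.
tampered = chain[:1] + [Block("forged", chain[1].prev_hash)] + chain[2:]
print(is_valid(chain), is_valid(tampered))
```

Note what is missing: no proof of work, no tokens, no consensus. The hash link only makes tampering detectable; it says nothing about which of two valid chains anyone should believe.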

I could go on, but let’s get to the question at hand: is a Git repository a blockchain by this definition? Absolutely. The Git commit object would serve as the block header and the tree and blob objects serve as the payload. Truly a linked list, at internet scale.
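
The mapping can be sketched in a few lines of Python. Git (with its default SHA-1 object format) names every object by hashing a short header, the object type and byte length followed by a NUL byte, together with the object body. A commit body names its tree and its parent commits by those same hashes. The helper names and the author line below are hypothetical, and the commit layout is simplified (no signatures or extra headers), so treat this as an illustration rather than a reimplementation of Git.

```python
import hashlib
from typing import List


def git_object_id(kind: str, body: bytes) -> str:
    """Object ID as Git computes it: SHA-1 over '<kind> <size>\\0' + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree_id: str, parent_ids: List[str], who: str, message: str) -> bytes:
    """Simplified commit text: tree, parents, author, committer, blank line, message."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]
    lines += [f"author {who}", f"committer {who}", "", message]
    return "\n".join(lines).encode("utf-8")


# Hypothetical identity and timestamp, for illustration only.
WHO = "A. Example <a@example.com> 0 +0000"

# The payload: an empty tree (a tree object with no entries).
empty_tree = git_object_id("tree", b"")

# The genesis block: a commit with no parent.
root = git_object_id("commit", commit_body(empty_tree, [], WHO, "genesis\n"))

# The next block: its header embeds the hash of the previous commit.
child = git_object_id("commit", commit_body(empty_tree, [root], WHO, "second\n"))

# Rewrite the genesis commit and the child's ID changes with it.
other_root = git_object_id("commit", commit_body(empty_tree, [], WHO, "rewritten\n"))
other_child = git_object_id("commit", commit_body(empty_tree, [other_root], WHO, "second\n"))

print(empty_tree)
print(child != other_child)
```

A merge commit simply lists more than one `parent` line, which is why the parents are a list here; that is Git's answer to the "is there only a single previous block?" question.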


Clearly I think that there is more to a blockchain than just a linked list at internet scale. And for reasons that I can’t quite articulate I don’t think any old Git repository is a blockchain, but you could retrofit one.

But a good formal definition escapes me at this point. There are several things that don’t belong in this definition. First, blockchain is not a brand name of a particular chain. And it doesn’t need to be a distributed ledger, although we are sorely lacking in interesting examples of this. It also doesn’t require a particular proof of work, or proof of stake, or proof of <insert trendy cause>.

What the definition does need to address, beyond the contents of the blocks, are questions of security, consensus (or lack thereof), distribution, propagation, and generation of blocks. And this definition needs to fit in a tweet.
