The Blockchain — A sane explanation
One of the most overused and abused notions in the tech world today is that of the blockchain. Many people say that “the blockchain will revolutionise the world”. Judging by the quality of the discourse, however, it is painfully obvious that few understand what the blockchain really is.
So, what is the blockchain?
The blockchain is an open, distributed database whose data is, by design, expensive to change.
In other words, the blockchain is like any other distributed database (think Apache Cassandra), except that you have to pay (a lot) if you want to insert or update data.
The only sane reaction to the definition above is: how weird! Why would anyone want to pay to be able to add data to a database?
Fair question. After all, the entire purpose of a database is to, well, store data. And indeed, most (if not all) traditional database systems (MySQL, Postgres, Cassandra, you name it) make the implicit assumption that the effort spent on state mutations (inserts, updates, deletes) should be minimised.
Sure, occasional trade-offs are made: sometimes write speed is sacrificed for the sake of read speed, or vice versa, depending on the use case. But at least there is agreement in principle: data changes should be as cheap and as convenient as possible.
So there must be a good reason why the blockchain adopts the opposite philosophy.
A lot of people try to separate the blockchain from its primary implementation, Bitcoin (the cryptocurrency). They claim that the blockchain is the revolutionising technology, while Bitcoin, powered by the blockchain, is only a fad. Bitcoin will fall into oblivion, but the blockchain will change the world. Or so the saying goes.
Yet it is obvious that by making this claim, these people betray their lack of understanding (or the fact that they are trying to sell you on something). You cannot separate the blockchain from Bitcoin, because if you were to do so, all you would be left with would be a very expensive and highly inefficient distributed database.
The blockchain makes it difficult and expensive to change its state because of its primary (and probably only) application. Bitcoin is public, digital money. Most importantly, Bitcoin is not under the jurisdiction of any authority. As opposed to traditional currencies, there is no Bitcoin Central Bank, nor is there any Bitcoin Federal Reserve. It is all “magically” managed somehow, and the blockchain is the database that makes it possible.
The blockchain, however, is an open database, which means that anyone can, in theory, change its data. Since this database powers a money application, that is a big problem. Allowing open access to a money database is a recipe for disaster. In the absence of barriers, people would just add money to their account. I know I would.
So without central authority, how can you make sure that users stay honest? Read on…
There are only 3 entities that a blockchain keeps track of: 1) transaction outputs, which establish ownership of currency; 2) transactions, which group together transaction outputs and establish transfer of ownership; and 3) blocks, which group together and validate transactions and thus establish official history (blocks also give the blockchain its name). Let’s look into these in more detail.
Transaction outputs (or TXOs, for short) are made up of 3 attributes: 1) the value, or the number of Bitcoins associated with this TXO, 2) the public key of an asymmetric cryptographic key pair and 3) the spent or unspent status, which determines whether this output has already been spent.
TXOs establish ownership according to the following rule. If the status is unspent, then the owner of the private key that matches the public key is allowed to spend a quantity of Bitcoins equal to the TXO’s value. On the other hand, if the TXO status is spent, it is history. There is nothing that the owner can do. By the way, you can calculate your Bitcoin balance by summing the values of all unspent transaction outputs for which you possess the corresponding private key.
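To make the balance rule concrete, here is a minimal Python sketch. The `TXO` class, the `balance` function and the key names are all made up for illustration. Holding a private key is modelled, very crudely, as having the matching public key in a set, and values are whole Bitcoins to keep things simple.

```python
from dataclasses import dataclass


@dataclass
class TXO:
    value: int          # number of Bitcoins (whole coins, to keep the sketch simple)
    public_key: str     # stand-in for the public key this output is assigned to
    spent: bool = False


def balance(txos, my_public_keys):
    """Sum the values of all unspent TXOs assigned to keys we control."""
    return sum(
        t.value
        for t in txos
        if not t.spent and t.public_key in my_public_keys
    )


# Hypothetical data: two of these outputs are ours and still unspent.
known_txos = [
    TXO(value=3, public_key="key-A"),
    TXO(value=2, public_key="key-A", spent=True),  # already spent: ignored
    TXO(value=5, public_key="key-B"),
    TXO(value=1, public_key="key-C"),              # belongs to someone else
]

my_balance = balance(known_txos, {"key-A", "key-B"})  # 3 + 5
```

Note that there is no "account" anywhere in this sketch: a balance is just a sum over the unspent outputs you are able to spend.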
Transactions establish transfer of ownership. In other words, the way you spend the unspent TXOs that you own is by creating a transaction. The process goes like this: a) you include the TXOs that you own in a transaction, b) you provide a cryptographic signature proving that you own the private keys which match the TXOs’ public keys, c) you set the status of each TXO included in the transaction to spent, and finally d) you create new unspent TXOs that assign ownership to a different public key.
A transaction is defined by 1) a set of inputs, which point to all the TXOs which are to be spent, 2) a set of outputs, which point to all the newly created TXOs, and 3) a set of signatures, which act as proof of ownership of the input TXOs that you are spending.
When creating a transaction, there is a single validation rule: the sum of the values of the newly created (output) TXOs must be equal to or smaller than the sum of the values of the spent (input) TXOs. In short, sum(outputs) ≤ sum(inputs). Note that if the sum of the outputs is strictly smaller than the sum of the inputs, the difference is called a transaction fee and will be allocated to whoever discovers the block that validates this transaction (more on this later).
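The value rule and the resulting fee can be sketched in a few lines of Python. The class and function names below are hypothetical, and the sketch deliberately skips signature checking: it only illustrates sum(outputs) ≤ sum(inputs).

```python
from dataclasses import dataclass, field


@dataclass
class TXO:
    value: int          # number of Bitcoins (whole coins, for simplicity)
    public_key: str
    spent: bool = False


@dataclass
class Transaction:
    inputs: list        # existing TXOs that are being spent
    outputs: list       # newly created TXOs
    signatures: list = field(default_factory=list)  # not verified in this sketch


def fee(tx):
    """Whatever the outputs leave unclaimed goes to the miner."""
    return sum(t.value for t in tx.inputs) - sum(t.value for t in tx.outputs)


def is_valid(tx):
    """The value rule: sum(outputs) <= sum(inputs)."""
    return fee(tx) >= 0


# Spend a 5 Bitcoin TXO, create a 4 Bitcoin TXO: 1 Bitcoin is left as the fee.
tx = Transaction(inputs=[TXO(5, "key-A")], outputs=[TXO(4, "key-B")])

# Trying to create more value than is being spent breaks the rule.
bad_tx = Transaction(inputs=[TXO(5, "key-A")], outputs=[TXO(6, "key-B")])
```

Here `fee(tx)` evaluates to 1 and `is_valid(bad_tx)` to `False`.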
Transactions are a clever way of transferring ownership of Bitcoins. However, without a validation mechanism, they are subject to the double spend problem. The best way to understand this problem is through the following example.
Imagine that I own the private key for 1 unspent TXO that is worth 1 Bitcoin. Let’s say that I want to buy a laptop and that I have found 2 merchants willing to sell me their laptop in exchange for my 1 Bitcoin. Let’s call these merchants Alice and Bob.
If I am a dishonest person, I could pull the following trick on them. I could create a transaction that transfers the ownership of my 1 Bitcoin to Alice. Very importantly, I would keep this transaction secret from Bob. Alice would see this transaction, believe that she now owns my Bitcoin and send me her laptop.
At the same time, I would create another transaction that would transfer the same Bitcoin to Bob. Since I kept my transaction with Alice hidden from Bob, he would think that I still own this Bitcoin. He wouldn’t know that I had already given it to Alice, so he’d think everything’s fine and would give me his laptop.
When Alice and Bob would later talk to each other, they’d figure out the problem. They’d both own the same Bitcoin, but this cannot be. Ownership must be unique. However, by the time they would figure it out, it would be too late. I’d be long gone with both of their laptops.
The way this problem is solved in the real world is by having a trusted third party witness and validate both transactions. That’s what a public notary does, for example. The trusted third party would first witness my transaction with Alice, record it in the public ledger, and then, when I’d later try to fraudulently transfer the same Bitcoin to Bob, he or she would see that I already gave it to Alice, and would prevent it from happening.
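As a rough illustration of what such a trusted third party does, here is a toy Python sketch. The `Notary` class and the coin identifier are invented for this example; the point is only that a single shared record makes the second transfer of the same coin detectable.

```python
class Notary:
    """A trusted third party that records every transfer it witnesses."""

    def __init__(self):
        self.ledger = {}  # coin id -> recipient of the recorded transfer

    def transfer(self, coin_id, recipient):
        """Record the transfer, unless this coin was already given away."""
        if coin_id in self.ledger:
            return False  # double spend attempt: refuse
        self.ledger[coin_id] = recipient
        return True


notary = Notary()
to_alice = notary.transfer("coin-1", "Alice")  # accepted and recorded
to_bob = notary.transfer("coin-1", "Bob")      # refused: already Alice's
```

The first call succeeds and the second is rejected, so Bob never hands over his laptop. Everything hinges on the notary being honest, though.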
However, this is where it gets tricky. The problem with trusted third parties is that they cannot always be trusted, and this is especially true when it comes to virtual, anonymous money. The temptation to cheat is simply too big and the negative consequences too minor. And this is exactly the problem the blockchain solves. It removes the need for a central authority, and it does so through blocks.
When a transaction is created, this transaction does not, instantly, come into effect. Its inputs are not instantly spent and its outputs are not instantly created.
Instead, the transaction goes into a staging area (called the mempool) and the network treats it as a yet-to-be-fulfilled intention. Only after the transaction is included in a block do its proposed changes actually take effect.
Blocks therefore act as the official transaction validators. They are the blockchain equivalent of a real world notary, so to speak. Blocks are made out of 2 things: 1) a set of transactions which are validated by this block and 2) a secret number.
This secret number (called the nonce) is the solution to the trusted authority problem.
The nonce secret acts as the input of a hash function. When we apply the hash function to a string formed by concatenating the block’s transaction ids with this nonce secret (plus a link to the previous block and some other metadata), we demand a particular result, namely a hexadecimal string that starts with a predetermined number of leading zeroes:
hash(secret + transaction ids + prev_block + metadata) = 00000…00001e8d6829a…
The nice thing about the hash function is that it is not reversible. Given an input “x”, you can always easily compute “y = hash(x)” by simply applying the hash function on x. But given an output “y = hash(x)”, the only way to find out which “x” has generated “y” is to repeatedly apply the hash function on random x’s until we get to the right result.
So by insisting on a particular output, we force the entity trying to generate it to compute a very large number of hashes. In other words, we demand the expenditure of computational resources. When a secret nonce is presented to us, we can easily verify that it is correct (that it generates the proper output) by applying the hash function to it a single time.
What this basically means is that if someone shows us a correct nonce secret, the non-reversibility property of the hash function gives us a (probabilistic, but very strong) guarantee that a lot of time and energy was spent on finding it.
And as a reward for this time and energy, we grant the entity who found the nonce secret the temporary authority to validate mempool transactions. The block (together with the nonce and its validated transactions) is then registered in the blockchain, and the process begins again.
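The asymmetry between finding a nonce and checking one can be sketched in Python. This is a simplified model under a few assumptions of mine: it uses SHA-256 from the standard `hashlib` module as the hash function, joins the inputs with a made-up separator, leaves out the extra metadata, tries nonces in increasing order rather than at random, and demands only 4 leading zeroes so that it finishes quickly.

```python
import hashlib


def block_hash(nonce, tx_ids, prev_block):
    """Hash the nonce together with the block's contents (simplified layout)."""
    payload = f"{nonce}|{','.join(tx_ids)}|{prev_block}".encode()
    return hashlib.sha256(payload).hexdigest()


def mine(tx_ids, prev_block, difficulty):
    """Expensive part: try nonces until the hash has enough leading zeroes."""
    target = "0" * difficulty
    nonce = 0
    while not block_hash(nonce, tx_ids, prev_block).startswith(target):
        nonce += 1
    return nonce


def verify(nonce, tx_ids, prev_block, difficulty):
    """Cheap part: a single hash is enough to check a claimed nonce."""
    return block_hash(nonce, tx_ids, prev_block).startswith("0" * difficulty)


tx_ids = ["tx1", "tx2", "tx3"]          # hypothetical transaction ids
prev_block = "hash-of-previous-block"   # hypothetical link to the previous block

nonce = mine(tx_ids, prev_block, difficulty=4)
is_ok = verify(nonce, tx_ids, prev_block, difficulty=4)
```

With 4 hexadecimal zeroes, `mine` needs tens of thousands of attempts on average, while `verify` always needs exactly one. Each additional zero multiplies the expected work by 16.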
The obvious question now is: why would anyone go to the trouble of discovering these secret numbers? (By the way, this process is called mining, and those who engage in it are referred to as miners.)
The answer? Because it is profitable to do so.
Whenever a miner discovers a valid block, he or she can claim all the transaction fees included in the block. A transaction’s fee, if you remember, is the difference between its total input value and its total output value.
For example, if you create a transaction that spends a TXO worth 5 Bitcoins and creates an unspent TXO worth only 4 Bitcoins, the difference of 1 Bitcoin doesn’t disappear, but will be instead awarded to the miner.
In addition to fees, a block also rewards the miner with freshly minted coins. Whenever a block is created, new monetary mass is introduced into the system and the total quantity of Bitcoins grows (this is also known as inflation).
Unlike fiat currencies though, the rate of inflation in Bitcoin is strictly controlled. It is perfectly known in advance and it is not subject to arbitrary political pressure, since it is embedded in the source code and cannot be changed. For Bitcoin, the inflation rate is scheduled to diminish with time, so that it will eventually drop all the way to 0 (at the time of writing, every newly discovered block rewards the miner with 12.5 Bitcoins).
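The diminishing schedule is simple enough to sketch in Python. The numbers below are details of the Bitcoin network that I am adding for illustration: the reward started at 50 Bitcoins and halves every 210,000 blocks, and amounts are counted in the smallest unit (1 Bitcoin = 100,000,000 satoshis). The function and constant names are my own.

```python
HALVING_INTERVAL = 210_000              # blocks between two halvings
SATOSHIS_PER_BITCOIN = 100_000_000
INITIAL_REWARD = 50 * SATOSHIS_PER_BITCOIN


def block_reward(height):
    """Newly minted satoshis for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    # Integer halving: once the shift exceeds the size of the number, we get 0.
    return INITIAL_REWARD >> halvings


def total_supply():
    """Add up the rewards of every block until the reward reaches 0."""
    total = 0
    height = 0
    while block_reward(height) > 0:
        total += block_reward(height) * HALVING_INTERVAL
        height += HALVING_INTERVAL
    return total


reward_at_420k = block_reward(420_000) / SATOSHIS_PER_BITCOIN  # after 2 halvings
supply = total_supply() / SATOSHIS_PER_BITCOIN                 # just under 21 million
```

After two halvings the reward is the 12.5 Bitcoins mentioned above, and because the reward keeps halving until it rounds down to 0, the total supply stays just below 21 million Bitcoins.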
But what happens if a miner decides to play dirty? For example, a dishonest miner could decide to censor certain transactions (i.e. never include them in a block), or could decide to reverse a number of blocks and thus rewrite history.
The answer is competition. Participation in the mining process is open to anyone, so individual miners are under immense pressure to behave honestly: if they don’t, the network will simply discard their work and follow that of their competitors. And, since mining blocks is a very expensive process, deviating from honest behaviour can have dire financial consequences.
Let me summarise how the blockchain works:
- Owners of unspent TXOs can create transactions in which they transfer the ownership of their currency.
- These transactions go into the staging area, called the mempool, waiting to be included in a block and thus become official.
- Through a computationally expensive process of hashing, miners discover blocks and so validate transactions waiting in the mempool.
- Rinse and repeat
The true innovation of the blockchain lies in the 3rd step: the expensive mining process through which various entities compete to find the right hash. It is through this process (also called proof of work) that the blockchain replaces the conventional central authority model and enables the network to work in a decentralised way.
