What is a blockchain actually?

If you can’t explain it simply, you don’t understand it well enough. (Attributed to Albert Einstein)

I recently joined vChain, a startup in the IT security space whose main selling point is built on blockchain, so I really should know what this technology is and how it works. Honestly, I needed to invest quite some time to thoroughly understand the concepts involved. Most of the time I found it very hard to locate useful information that was neither the half-baked blah blah flooding the media nor an academic paper I would only have understood with a PhD in cryptography.

So here comes my attempt to explain blockchain to someone who’s a bit familiar with IT, wants to look beyond the hype but does not want to invest a lot of time to dive into the topic deeply (like I did).

One more thing: I’m not going to deal with cryptocurrencies as I believe they’re a ghastly stupid application of the underlying technology — fiat money is still one of the best inventions of mankind!

It’s all about (cheap) trust

Forget everything you heard so far about blockchain. The whole thing is just about trust and achieving it cheap(er) — NOTHING else.

For example, I trust my bank to correctly do the bookings on my current account. I don’t really care about the technology they’re using to ensure that my account balance is summed up correctly and that transactions are immutable and kept private from outsiders.

However, having worked in a bank myself, I know the effort (technical and organizational, thus financial) it takes to ensure this level of trust. It results from correctness, immutability and secrecy and, above all, from literally billions spent over decades on legal and organizational ceremonies, so that the bank can eventually proudly advertise all of this to potential customers.

Is it now trust or no trust?

The above example shows that, traditionally, trust is achieved by relying on someone who has enough resources to both ensure and show they’re trustworthy. Hence trust is merely a result of reputation, which is built up at high cost over a long period of time.

Further elaborating on the example, let’s suppose I need to transfer money to someone. Today, I need a bank for that — an organization that sits between me and the beneficiary to ensure a correctly executed credit transfer.

Now, the blockchain is a technological innovation that renders such intermediaries obsolete.

By using blockchains you can skip the intermediaries which, today, bear high costs to ensure trust, costs they eventually pass on to their customers. As a blockchain does not require anyone you need to trust, two often-used terms are trustless and zero-trust. I know, it can be confusing to talk about creating trust by being trustless. But compare the concept to a vaccination: you become immune to a virus because you have been intentionally infected with a (weakened) version of that very virus. The blockchain protocols are all built around dealing with infected members, aka faulty or malicious nodes. So trust is created because you can be sure all of this works even though members could potentially be infected, that is, breaking the rules erroneously or intentionally.

Understanding the economics

Let’s summarize the economic part of the whole story:

  • trust is no longer a result of expensive reputation
  • trust is ensured by the design of the blockchain protocol
  • once you achieve “technical” trust you can skip intermediaries
  • this makes a lot of transactional scenarios much cheaper
  • which is the reason why blockchain will be a key technology
    throughout the next years

But how does it work?

Let’s start with its very name: a blockchain is a chain of blocks. A block is a data structure (imagine a simple text file) that consists of a couple of transactions, a random number and a reference to the previous block (in practice, the previous block’s hash), therefore making up a chained structure because every block points to its predecessor. (The first block is called the genesis block and has no parent.)

A chain of blocks: Imagine a sorted list of documents, each containing a table with transaction data, a signature below and the name of the previous document; hence building up a chain.
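To make the picture a bit more tangible, here is a minimal sketch of such a chain in Python. It is a toy model, not how any real blockchain stores its data: the Block class, its field names and the is_linked helper are my own inventions for illustration, and the transactions are plain strings.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class Block:
    transactions: List[str]    # simplified: plain strings, no signatures
    nonce: int                 # the "random number" from the text
    prev_hash: Optional[str]   # hash of the previous block; None for genesis

    def digest(self) -> str:
        # Serialize the block deterministically, then hash all of it.
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()


# Build a tiny chain: every block stores the hash of its predecessor.
genesis = Block(transactions=["genesis"], nonce=0, prev_hash=None)
block1 = Block(transactions=["alice->bob:5"], nonce=0, prev_hash=genesis.digest())
block2 = Block(transactions=["bob->carol:2"], nonce=0, prev_hash=block1.digest())
chain = [genesis, block1, block2]


def is_linked(blocks: List[Block]) -> bool:
    # Each block (except the genesis block) must point to its predecessor.
    return all(b.prev_hash == a.digest() for a, b in zip(blocks, blocks[1:]))


print(is_linked(chain))
```

The nonce is simply left at 0 here; it only gets a job once mining enters the picture.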

All members of a blockchain, the so-called nodes, store all of these chained blocks. Therefore one of the blockchain’s key properties is (global) distribution and this implicitly means 24/7 availability — because any single node can fail and the data will still be available in abundance in many other places.

The blockchain is also tolerant of partitioning, i.e. when a network connection breaks down, a node malfunctions, etc., as the core communication protocol (the servers’ gossip) is robust against such foreseeable breakdowns.

So far we have availability and partition tolerance. The third and missing piece now is consistency: being sure that all nodes carry the same state of information. According to the so-called CAP theorem you can never achieve all three in a distributed system once the network somehow fails. (And this is easy to imagine: Let’s say your system is tolerant to partitions. After a while one of your nodes gets cut off due to a network outage. If the node now won’t accept any further write operations it becomes unavailable. However, if it still accepts them it will implicitly turn inconsistent with the rest of the unreachable network…)
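The dilemma in the parentheses above can be played through in a few lines. This is only a thought experiment in code, under my own assumptions: the Node class and its accept_writes_when_cut_off flag are hypothetical, and the "network" is just a Python list.

```python
from typing import List


class Node:
    def __init__(self, log: List[str], accept_writes_when_cut_off: bool):
        self.log = list(log)
        self.cut_off = False
        self.accept_writes_when_cut_off = accept_writes_when_cut_off

    def write(self, entry: str) -> bool:
        if self.cut_off and not self.accept_writes_when_cut_off:
            return False          # refuses the write: consistent but unavailable
        self.log.append(entry)    # accepts the write: available, may diverge
        return True


network_log = ["tx1"]                       # state of the rest of the network
cautious = Node(network_log, accept_writes_when_cut_off=False)
eager = Node(network_log, accept_writes_when_cut_off=True)

# The outage happens, and the rest of the network keeps working.
cautious.cut_off = True
eager.cut_off = True
network_log.append("tx2")

cautious_ok = cautious.write("tx3")         # rejected
eager_ok = eager.write("tx3")               # accepted, but only locally
print(cautious_ok, eager_ok, eager.log, network_log)
```

The cautious node gives up availability, the eager node gives up consistency. Neither can have both while the partition lasts.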

But somehow Satoshi Nakamoto, the mysterious inventor of Bitcoin, solved the CAP theorem’s conundrum. (Not really though, but here’s the trick that led to the revolution.)

Consensus

Let’s dive into the above-mentioned write operations, e.g. a Bitcoin transaction from my address (account) to yours. Such operations are broadcast throughout the network and eventually a bundle of them will be written into a new block.

The (public) blockchain is fully decentralized, thus there is no central logic that would control the writing of new data. This property is extremely important as it prevents any kind of censorship and guarantees full transparency. We’re close to reaching the core of it: Trust by design.

Imagine you’re in a telephone conversation and whenever the other person on the line talks (i.e. performs the write operation) you have to listen. When you speak, though, your counterpart needs to remain silent. This way of doing telephone calls (the protocol) works because of invisible social rules (it’s just impolite to interrupt) and the low number of participants.

Ever been in a conference call at work? You see, the protocol is not built for larger networks. So what we need is a selection process which ensures that only one person will be talking for a given period of time. Then you run the next selection process and the next person is allowed to talk.

Although it’s a horrible idea to run your next conference call like this, surprisingly the blockchain consensus algorithms follow such a concept.

Proof of Work, aka mining

There’s a growing array of different consensus algorithms as two things depend on this very choice:

  1. The blockchain’s key properties like transaction throughput, speed or trust level
  2. The economic or social incentive model to join the blockchain as a node that writes into the chain (i.e. mining nodes in Bitcoin and related blockchains involving cryptocurrencies)

Bitcoin uses a proof of work algorithm. You have surely already heard the horror stories about energy consumption resulting from the necessity to solve puzzles that require incredible amounts of computational power.

This computational puzzle is easily explained on a high level: The node knows the current block and all transactions that have been broadcast in the meantime and thus should be written into the next block. It will now pick a random number, create one hash over all of this input data and hope that the output string matches an agreed pattern (for example, a hash that starts with a certain number of zeros).

As hashes are one-way functions you can only solve this puzzle by trying again and again with newly generated random numbers. If you’re lucky and find a matching output pattern then you’ve just created a new block which you will broadcast. The other nodes will now verify that you did not cheat: They will quickly hash the data and accept the block if the output matches, then they move on trying to win the next race to successfully mine a block.
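The asymmetry between mining (many attempts) and verifying (one hash) is easiest to see in code. The following is a minimal sketch, not Bitcoin’s actual algorithm: the function names, the "|"-separated input format and the pattern "starts with 000" are assumptions of mine, and for simplicity the nonce is a counter instead of a freshly drawn random number.

```python
import hashlib


def block_hash(prev_hash: str, transactions: str, nonce: int) -> str:
    # One hash over the previous block's hash, the transactions and the nonce.
    data = f"{prev_hash}|{transactions}|{nonce}".encode()
    return hashlib.sha256(data).hexdigest()


def mine(prev_hash: str, transactions: str, prefix: str = "000") -> int:
    # Brute force: keep trying nonces until the hash matches the pattern.
    nonce = 0
    while not block_hash(prev_hash, transactions, nonce).startswith(prefix):
        nonce += 1
    return nonce


def verify(prev_hash: str, transactions: str, nonce: int, prefix: str = "000") -> bool:
    # Verification is a single hash computation, hence cheap.
    return block_hash(prev_hash, transactions, nonce).startswith(prefix)


nonce = mine("abc123", "alice->bob:5")
print(nonce, verify("abc123", "alice->bob:5", nonce))
```

With a three-character hex prefix the miner needs a few thousand attempts on average. Every additional required character multiplies that effort by 16, while verification stays at exactly one hash.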

There are two more things to mention when it comes to mining:

First, a newly generated block also frees up some coins which will go to the miner as a reward. So much for the economic incentive and why it makes sense to verify and write the data of others.

Second, let’s revisit the node(s) lost due to a network outage that I mentioned earlier when explaining partitioning. Yes, it’s conceivable that two nodes will mine a block simultaneously. But there’s an agreed rule for which of the briefly separated chains will prevail: nodes follow the chain that carries the most accumulated proof of work (usually the longest one), so as soon as one branch is extended by a further block, the other one is abandoned. Although such orphaned blocks will be forgotten eventually, they are one of the reasons why it takes comparatively long to be really sure that a transaction has been successfully written to the blockchain.

This selection process ensures consistency after a while. Such a mechanism therefore is called eventual consistency.
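How a node settles on one of two competing chains can be sketched very roughly as follows. The sketch assumes that every block took the same amount of work, so that "more blocks" can stand in for "more work". The choose_chain function and the block names are made up for illustration.

```python
from typing import List

Chain = List[str]  # toy model: a chain is just a list of block names


def choose_chain(local: Chain, received: Chain) -> Chain:
    # Assumption: equal work per block, so the chain with more blocks
    # wins. On a tie the node keeps the chain it already has.
    return received if len(received) > len(local) else local


shared = ["genesis", "block1"]
branch_a = shared + ["block2a"]   # mined by one node
branch_b = shared + ["block2b"]   # mined elsewhere at the same time

# A node that saw branch A first keeps it while both are equally long.
view = choose_chain(branch_a, branch_b)

# Once someone extends branch B, the node switches and block2a is orphaned.
branch_b_extended = branch_b + ["block3b"]
view_after = choose_chain(view, branch_b_extended)
print(view, view_after)
```

For a short while two nodes can hold different views, but as soon as one branch grows, everybody converges on it. That is the "eventual" in eventual consistency.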

Revisit cryptography

I mentioned the verification done by all nodes already. This is not only important when a node needs to decide whether to include a newly generated block into its local copy of the data. It’s also a means to ensure that older data is correct. As all of the transactions make up the block together with the random number and a reference to the previous block, you can verify this data easily. So finally we reach the point why the blockchain is immutable: If you wanted to change a given block you would have to redo the mining of that block and of all the blocks that follow it, because each of them references its predecessor, to fake a blockchain that meets this integrity criterion… good luck paying your next energy bill!
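Here is a small experiment that shows why tampering is so expensive. Again a toy sketch under my own assumptions: blocks are simple tuples, the puzzle is "hash starts with 00", and helper names like build_chain and first_invalid are hypothetical.

```python
import hashlib
from typing import List, Tuple

PREFIX = "00"          # toy difficulty
GENESIS_PARENT = "0" * 64  # placeholder parent for the first block

Block = Tuple[str, str, int]  # (prev_hash, transactions, nonce)


def block_hash(block: Block) -> str:
    prev_hash, transactions, nonce = block
    return hashlib.sha256(f"{prev_hash}|{transactions}|{nonce}".encode()).hexdigest()


def mine(prev_hash: str, transactions: str) -> Block:
    nonce = 0
    while not block_hash((prev_hash, transactions, nonce)).startswith(PREFIX):
        nonce += 1
    return (prev_hash, transactions, nonce)


def build_chain(all_transactions: List[str]) -> List[Block]:
    chain: List[Block] = []
    prev_hash = GENESIS_PARENT
    for transactions in all_transactions:
        block = mine(prev_hash, transactions)
        chain.append(block)
        prev_hash = block_hash(block)
    return chain


def first_invalid(chain: List[Block]) -> int:
    """Index of the first block that fails verification, or -1 if all pass."""
    prev_hash = GENESIS_PARENT
    for i, block in enumerate(chain):
        if block[0] != prev_hash or not block_hash(block).startswith(PREFIX):
            return i
        prev_hash = block_hash(block)
    return -1


chain = build_chain(["a->b:1", "b->c:2", "c->d:3"])

# Cheat attempt 1: change an amount in the middle block, keep the old nonce.
tampered = list(chain)
tampered[1] = (chain[1][0], "b->c:200", chain[1][2])

# Cheat attempt 2: also re-mine the changed block. Now the block after it
# no longer points to the right hash, so it would have to be re-mined too.
remined = list(chain)
remined[1] = mine(chain[1][0], "b->c:200")

print(first_invalid(chain), first_invalid(tampered), first_invalid(remined))
```

Re-mining one block is cheap in this toy setup. On a real chain every one of those re-mining steps costs as much as the original mining did, and honest nodes keep extending the real chain in the meantime.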

Smart Contracts?

For the sake of completeness let me finish by briefly mentioning smart contracts. I’ve already been using the word transaction quite often and so far, we’ve been thinking here about banking-like transfers only. But transactions in blockchains that came after Bitcoin, e.g. Ethereum, can be any type of write operation, and this includes the storing of computer programs and the execution thereof. Think of it as distributing code to hundreds of thousands of computers via the blockchain with the goal that every node will execute your algorithm, storing the output data with the same properties we’ve already seen: censorship-resistant, immutable, always available… Once you’ve got your head around this you might realize why so many think this is a true revolution.
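The core idea, that every node runs the same code on the same ordered transactions and therefore ends up with the same result, can be sketched like this. It has nothing to do with how Ethereum actually implements contracts. The tally_contract function and run_node helper are purely hypothetical stand-ins.

```python
from typing import Dict, List, Tuple

State = Dict[str, int]
Call = Tuple[str, int]  # (sender, amount)


def tally_contract(state: State, sender: str, amount: int) -> State:
    # A deterministic "contract": no randomness, no clock, no outside input.
    new_state = dict(state)
    new_state[sender] = new_state.get(sender, 0) + amount
    return new_state


def run_node(calls: List[Call]) -> State:
    # Every node replays the same calls in the order fixed by the chain.
    state: State = {}
    for sender, amount in calls:
        state = tally_contract(state, sender, amount)
    return state


calls = [("alice", 3), ("bob", 1), ("alice", 2)]
node_states = [run_node(calls) for _ in range(3)]  # three independent nodes
print(node_states)
```

Because the contract is deterministic and the blockchain fixes the order of the calls, no node needs to trust another node’s result: each one can simply compute it.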

Finish line

I know this probably was a tough read. Trust me (or don’t, if you’ve already switched to trustless): it was even tougher for me to reach a point where I would feel comfortable enough to (at least try to) explain blockchain. I hope it helped; happy to receive feedback.