What you’re going to need to understand this article: patience.
Unless you’ve been on a sabbatical from the world for the past two years, you’ve already heard the term “blockchain” way too many times. If you’ve been following the tech industry a little more closely though, you’ve probably observed “industry changing”, “revolutionary”, “disruptive” and countless other overly optimistic adjectives used in the same sentence as well. Perhaps you’re even sick of the sudden explosion of literature around blockchain. To top it all, this is another article on the topic. But hang on because this one attempts to be the only one you may need to get fairly well acquainted.
To truly understand blockchain we’ll have to examine what makes it unique and what problems it attempts to solve. What’s happening in the world around it now, and what is this mining stuff anyway and do I need it if I want to use blockchain? Now there is no better way to attempt this understanding than by examining the most successful and prominent use of the technology: cryptocurrencies. So, don’t be surprised that they play a central (distributed?) role in this article.
1. The Elephant on the Page: What Exactly is Blockchain?
Without delay: A blockchain is simply a ledger. It’s a ledger of transactions and is immutable, thus providing an irrefutable history of all transactions between all participants on that blockchain network.
That does not sound like a new concept, so what makes blockchain so unique and justifies all the buzz and hype? And why is it called that?
The truly distinguishing feature as well as the core underlying concept of blockchain is that it’s not centralized, nor is it distributed in the traditional sense. While a centralized system is self-explanatory, a typical distributed system consists of computing equipment or digital data split up and spread out over several distinct physical locations. In contrast to either, a blockchain network requires every participant on the network to store a copy of the ledger in its entirety, thus making redundancy its key underlying feature. Updates to the ledger (such as adding a new transaction record) then require participants on the network to reach a consensus as to the validity of the proposed transactions. Consensus must be obtained in a decentralized, fault-tolerant manner, both to maintain the participant-driven nature of blockchain and to ensure that the ledger is accurate.
Oh, and it’s called a blockchain because of the way this ledger is structured: as a sequence of virtual blocks wherein each block holds some part of the transaction history. Think of how a notebook’s pages each have a finite number of lines and are bound together at the seams. In a similar fashion a blockchain is an ordered collection of blocks, with each block pointing to its preceding block, thus intertwining them into a chain of records.
1.1. The Bitcoin Blockchain
Bitcoin’s blockchain consists of a record of every Bitcoin transaction that has ever occurred since its inception. Each block contains some number of transactions and each block on the Bitcoin blockchain can be at most one megabyte (1 MB) in size. The first block is called the Genesis block, and is special as it does not point to a previous block.
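To make the "chain of blocks" idea a little more concrete, here is a minimal Python sketch. It is an illustration only, not Bitcoin's actual block format: the `Block` class, the `append_block` helper and the idea of pointing to the previous block by its SHA-256 hash are assumptions made for this example.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Block:
    index: int
    transactions: tuple        # e.g. ("alice->bob:5",)
    prev_hash: Optional[str]   # None only for the genesis block

    def hash(self) -> str:
        # Hash a canonical JSON encoding of the block's contents.
        payload = json.dumps(
            {
                "index": self.index,
                "transactions": list(self.transactions),
                "prev_hash": self.prev_hash,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()


def append_block(chain: List[Block], transactions) -> Block:
    # Each new block points at the hash of the block before it.
    prev_hash = chain[-1].hash() if chain else None
    block = Block(len(chain), tuple(transactions), prev_hash)
    chain.append(block)
    return block


chain: List[Block] = []
genesis = append_block(chain, ["genesis"])
second = append_block(chain, ["alice->bob:5", "bob->carol:2"])
print(second.prev_hash == genesis.hash())
```

The genesis block is the only one whose `prev_hash` is empty; every later block is tied to its predecessor, which is what turns a pile of blocks into a chain.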
Several challenges, though, are already apparent: how do you ensure consensus is obtained on the validity of new transactions, especially when factoring in malicious actors who may attempt to distribute incorrect copies of the blockchain ledger to benefit themselves? In other words, how do you achieve fault tolerance in Bitcoin and in any other blockchain network?
2. Fault Tolerance
In a blockchain system, every participant maintains a copy of the entire blockchain ledger. For new transactions to be added to this ledger, participants must reach consensus on the validity of these proposed transactions. This is non-trivial for a number of reasons: sometimes vital information may not reach all participants, and sometimes malicious actors may attempt to spread false information that benefits them. Fault tolerance then refers to how capable a system is of operating in the face of such challenges.
Consider the Two Generals’ Problem: General 1 is considered the leader while General 2 is the follower. They are located on opposite sides of an enemy and must decide on a time to mount an attack. General 1 decides a time and sends a messenger to General 2 to convey his decision. General 2 must then send an acknowledgement (ACK) back to General 1. It’s easy to imagine how this system can be problematic: either message may not be delivered if the messenger is captured. Even if both messages are delivered, how can General 2 be certain his acknowledgement was received without devolving into infinite acknowledgements? And we haven’t even factored a malicious messenger or general into the equation yet!
This problem is, in fact, unsolvable: no finite exchange of messages over an unreliable channel can leave both Generals certain that they agree. Why even bring it up then? Because an extension to this problem called the Byzantine Generals Problem plays a critical role in blockchain and other distributed systems.
2.1. The Byzantine Generals Problem
Byzantine refers to the old Byzantine Empire, also known as the Eastern Roman Empire, whose army Generals apparently faced the same problem as above, though with two major changes: there are now more than two Generals, and any of the actors may be malicious. This implies that some of the Generals and/or messengers may hold treacherous intentions, i.e., they may intentionally cast suboptimal votes or deliver the wrong message respectively.
Consider nine Generals with four in favor of attacking and four in favor of retreating. To avoid a disastrous defeat, a final decision for either a collective attack or retreat must be reached. The ninth General though has other plans and with malicious intent casts two votes: one informing the attackers that he’s in favor of their stance and the other informing the retreat-favoring Generals of the same. The resulting effect of this betrayal is not hard to imagine: each camp believes it holds a five-to-four majority, so four Generals attack while four retreat.
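The nine-General scenario can be played out in a few lines of Python. This is a toy model under stated assumptions: the names `G1` to `G8`, the `traitor_vote` function and simple majority counting are all invented for illustration and are not part of any real protocol.

```python
from collections import Counter

# Eight honest Generals: four want to attack, four want to retreat.
honest_votes = {f"G{i}": "attack" for i in range(1, 5)}
honest_votes.update({f"G{i}": "retreat" for i in range(5, 9)})


def traitor_vote(recipient_stance: str) -> str:
    # The ninth General tells each recipient whatever that recipient favors.
    return recipient_stance


def decide(general: str) -> str:
    # Every honest General sees the same eight honest votes...
    tally = Counter(honest_votes.values())
    # ...plus a traitor vote tailored to them personally.
    tally[traitor_vote(honest_votes[general])] += 1
    return tally.most_common(1)[0][0]


decisions = {general: decide(general) for general in honest_votes}
print(Counter(decisions.values()))
```

Every honest General follows the rules and counts a clean majority, yet the army ends up split down the middle. Nobody crashed and no message was lost; a single participant simply told different stories to different listeners.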
Here, while the malicious entity appeared to be functioning correctly to all parties, i.e. there wasn’t a “node failure”, the system was compromised. This makes Byzantine faults the most general and difficult class of failure modes, and a system tolerant to this class of failures is unsurprisingly termed ‘Byzantine fault tolerant’.
It is critical for distributed systems and blockchain networks to attain Byzantine fault tolerant operation and continue operating reliably and correctly even in the face of failed and malicious nodes. Without this, blockchain is not viable for monetary systems such as Bitcoin, nor can any enterprise use case adopt a blockchain-centric solution for business processes.
3. Attaining Fault Tolerant Consensus
Attaining fault tolerance necessitates honest majority participation. That means a blockchain network relies on a majority of its nodes being honest. How many, exactly though? While classical Byzantine fault tolerant protocols require more than two-thirds of the participants to be honest, Bitcoin and most Proof of Work cryptocurrencies require only a simple majority, i.e. as long as honest nodes control more than half of the compute power on the network, the system remains fault tolerant.
To see how this works, let’s get an overview of the process of adding a block of new transactions to a blockchain ledger, and in the process explore consensus algorithms.
3.1. Transaction Overview
Proposals for new transactions are broadcast into the network, and participants on the network gather these proposals into blocks. These blocks are then broadcast to other participants for verification, who need to indulge in some book-checking to confirm that all transactions in that block are valid. For example, for transferring funds in a cryptocurrency-based blockchain network, this would involve confirming that a sender possesses the necessary balance to transfer the requested amount to a receiver’s wallet. Again, as long as a majority of the participants are honest, the network stays fault-tolerant to malicious entities.
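The "book-checking" step can be sketched as a simple balance check. Note the assumptions: the `(sender, receiver, amount)` tuple format, the `validate_block` name and the account-style balance table are made up for this example. Bitcoin itself tracks unspent transaction outputs rather than account balances, so treat this as a simplification of the idea rather than how any particular network does it.

```python
from typing import Dict, List, Tuple

Transaction = Tuple[str, str, int]  # (sender, receiver, amount)


def validate_block(balances: Dict[str, int], transactions: List[Transaction]) -> bool:
    # Work on a copy so a rejected block leaves the real balances untouched.
    working = dict(balances)
    for sender, receiver, amount in transactions:
        # Reject non-positive amounts and any attempt to overspend.
        if amount <= 0 or working.get(sender, 0) < amount:
            return False
        working[sender] -= amount
        working[receiver] = working.get(receiver, 0) + amount
    return True


balances = {"alice": 10, "bob": 0}
good_block = [("alice", "bob", 6), ("bob", "carol", 4)]
bad_block = [("alice", "bob", 6), ("alice", "carol", 6)]  # alice spends 12 of her 10

print(validate_block(balances, good_block))
print(validate_block(balances, bad_block))
```

Transactions are checked in order, so Bob can spend coins he received earlier in the same block, while Alice's second transfer in `bad_block` fails because her first one already used most of her balance.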
Blocks thus verified are then broadcast to all members for appending to their respective copies of the blockchain ledger. The set of transactions on that block are now part of the irrefutable record history that is the blockchain.
Obtaining the required consensus though is a challenging process and also led to the popularization of the whole “mining” term that is often synonymous with blockchain and cryptocurrencies in the minds of many. In reality, mining is just one of the ways for participants to reach consensus and is officially called the Proof of Work (PoW) consensus algorithm. Obtaining valid consensus in a fault-tolerant manner is absolutely critical for blockchain applications though. The following two sections delve deeper into mining and another popular consensus algorithm in use today.
3.2. Proof of Work Consensus, aka “mining”
Mining is the informal name for the Proof of Work (PoW) consensus algorithm. The idea is to have participants compete to solve a computationally hard cryptographic puzzle in order to add new blocks to the chain. Here’s how this works:
1. Nodes on the network receive transaction proposals, which they gather into blocks
2. All transactions in a block are individually hashed to obtain their respective hash values
3. These hash values are then combined, by repeatedly hashing them together in pairs, into a single hash value representing all transactions in that block; this value serves as the root hash value for that block
4. The hash of the previous block is now obtained by referencing the ledger
5. Now, a special value called a nonce must be found which, when hashed together with these two hashes, results in a hash value with a certain number of leading zero bits
6. This nonce, once found, is stored in the block header along with the root hash value of the block and the hash of the block preceding it
7. Once a node has found this nonce and thus generated a complete block, it broadcasts the block to other participants for verification of the transactions within the block
8. The first transaction in every block is always a self-credit by the node that generated that block: when this block is accepted, this transaction becomes valid and new coins are minted and credited to the successful node’s wallet, which can now claim to have successfully “mined” the block
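The steps above can be sketched in Python. This is a toy version under several assumptions: the function names (`sha256_hex`, `merkle_root`, `mine`) and the string-based header layout are invented for illustration, the transaction hashes are combined by pairwise hashing (a Merkle tree, which is what Bitcoin uses), and the difficulty is kept tiny so the search finishes quickly.

```python
import hashlib
from typing import List, Tuple


def sha256_hex(data: str) -> str:
    return hashlib.sha256(data.encode()).hexdigest()


def merkle_root(transactions: List[str]) -> str:
    # Hash every transaction, then hash pairs together until one value remains.
    level = [sha256_hex(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        level = [sha256_hex(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def mine(prev_hash: str, transactions: List[str], difficulty_bits: int) -> Tuple[int, str]:
    root = merkle_root(transactions)
    # A hash below this target has at least `difficulty_bits` leading zero bits.
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        block_hash = sha256_hex(f"{prev_hash}{root}{nonce}")
        if int(block_hash, 16) < target:
            return nonce, block_hash
        nonce += 1


# The first transaction is the miner's self-credit, as described in step 8.
transactions = ["coinbase->miner:50", "alice->bob:5", "bob->carol:2"]
nonce, block_hash = mine("00" * 32, transactions, difficulty_bits=12)
print(nonce, block_hash)
```

Finding the nonce takes thousands of attempts even at this toy difficulty, while checking it takes a single hash. That asymmetry is the whole point: the work is expensive to produce and cheap for everyone else to verify.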
Conflicting chains will arise during this process and are abandoned by nodes in favor of the longest PoW chain at that time, which is always regarded as correct since the most compute power or ‘work’ has been expended on that chain. If two nodes simultaneously generate a valid block and propagate it, receiving nodes will begin extending their chains based on the first block received. These conflicting chains dissolve gracefully when the next block is found, thus extending one chain and making it the longest.
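Here is a small sketch of that fork resolution rule. The `choose_chain` function and the block labels are hypothetical, and chain length is measured simply by counting blocks; real networks such as Bitcoin compare the total work behind each chain, which usually (but not always) amounts to the same thing.

```python
from typing import List


def choose_chain(local: List[str], candidate: List[str]) -> List[str]:
    # Switch only when the candidate is strictly longer; on a tie a node
    # keeps the chain it saw first.
    return candidate if len(candidate) > len(local) else local


common = ["B0", "B1", "B2"]
node_a = common + ["B3a"]  # node A heard block B3a first
node_b = common + ["B3b"]  # node B heard the competing block B3b first

# Both chains have equal length, so neither node switches yet.
tie_result = choose_chain(node_a, node_b)

# The next block is found on top of B3b, making that chain the longest.
extended = node_b + ["B4"]
node_a = choose_chain(node_a, extended)
print(node_a)
```

Block `B3a` is simply dropped once node A switches over. Its transactions are not lost for good: in practice they go back to being pending proposals and can be picked up in a later block.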
3.2.1. Incentivizing Honest Participation
Finding the nonce requires a surprisingly large amount of computational power, and networks such as Bitcoin periodically adjust the difficulty to keep pace with the total compute power on the network. Any attacker who attempts to propagate a false chain must not only compute the correct nonce and headers for many blocks in order to successfully create a fake chain but must also outpace the honest nodes, so that the fake chain becomes the longest and the rest of the network ditches their ledgers in favor of it. Such a 51% attack is extremely unlikely on a sufficiently large network: to do this on the Bitcoin and Ethereum networks today would require more compute power than several supercomputers combined!
Further, if an attacker were to amass more compute power than the rest of the network, they may simply be tempted to use it honestly to find the next block, which they would have a high chance of mining, and so claim the newly minted coins. This would also alleviate risk for the attacker, as their amassed wealth would greatly depreciate in worth if faith and trust in the currency were lost. Further, many networks restrict changes to blocks that are buried beyond a certain depth, for example only the last ten blocks may be alterable with all prior blocks staying completely immutable.
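To put a rough number on how hard it is to overtake the honest chain, the Bitcoin white paper models the race as a gambler's ruin problem: an attacker with a share q of the compute power (honest share p = 1 - q) who is z blocks behind ever catches up with probability (q/p)^z if q < p, and with certainty otherwise. The sketch below just evaluates that formula; the function name is an assumption for this example.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Probability that an attacker ever catches up from `blocks_behind` blocks back."""
    if not 0.0 <= attacker_share <= 1.0 or blocks_behind < 0:
        raise ValueError("share must be in [0, 1] and blocks_behind must be >= 0")
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0  # with half the power or more, catching up is only a matter of time
    return (q / p) ** blocks_behind


for share in (0.10, 0.25, 0.45):
    print(share, catch_up_probability(share, blocks_behind=6))
```

The probability falls off exponentially with every extra block the attacker has to make up, which is why recipients are commonly advised to wait for several blocks to be stacked on top of a transaction before treating it as final.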
3.3. Proof of Stake
Proof of Work is by no means the only consensus mechanism. Proof of Stake (PoS) is an alternative method that uses barely any power to run, thus alleviating the tremendous energy consumed (and wasted) by PoW networks, which discard over 99% of calculations in their attempt to find that correct nonce. In a PoS network, ‘stakeholders’ verify new transactions and hold authority to append new blocks. A stakeholder’s chance to ‘forge’ a block is determined by the amount of assets held and the duration for which they have been held. As more assets are held for longer, the stakeholder has a greater “say” in the network. NXT was the first complete PoS-based cryptocurrency, with Ethereum attempting a complete shift as well.
So, while PoW algorithms reward compute power, PoS networks reward the long-held assets of their stakeholders who incidentally have the most to lose as these assets greatly depreciate if trust in the network is lost.
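A stake-weighted lottery along these lines can be sketched as follows. The weighting used here (amount held multiplied by days held) is an assumption that follows the description above; actual PoS networks differ in how they compute a stakeholder's weight, and the `pick_forger` name and sample stakeholders are invented for the example.

```python
import random
from typing import Dict, Tuple


def pick_forger(stakes: Dict[str, Tuple[int, int]], rng: random.Random) -> str:
    # stakes maps a stakeholder to (amount held, days held).
    names = list(stakes)
    weights = [amount * days for amount, days in stakes.values()]
    # Draw one stakeholder with probability proportional to their weight.
    return rng.choices(names, weights=weights, k=1)[0]


stakes = {"alice": (100, 30), "bob": (100, 3), "carol": (10, 30)}
rng = random.Random(42)  # seeded so the experiment is repeatable

wins = {name: 0 for name in stakes}
for _ in range(10_000):
    wins[pick_forger(stakes, rng)] += 1
print(wins)
```

Alice holds the same amount as Bob but has held it ten times longer, so she wins the right to forge far more often. Bob and Carol carry equal weight despite very different holdings, since Carol's patience offsets her smaller balance.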
4. Why Blockchain is Worth the Fuss
So having a truly decentralized ledger with fault-tolerant consensus mechanisms is great from a technological perspective, but it also sounds like a demanding system to set up from scratch, especially from a business perspective. What then makes blockchain worth the trouble?
1. Truly Decentralized
This is a bigger deal than it may sound like: a blockchain system along with a robust consensus mechanism requires no central servers for co-ordination between participants, for the minting of new coins or assets or for timestamping transactions and yet the end result is a technology fault-tolerant enough to operate monetary systems.
2. Participant Driven
Participants propose new transactions, verify pending transaction proposals, decline incorrect transaction requests and maintain the ledger collectively. No authority is required to oversee any network operations.
3. Transparent and Auditable
Since the ledger records all transactions on the network and every participant maintains a copy of the ledger in its entirety (or even just the block headers in the case of pure-verification nodes), the entire history of transactions can be viewed and verified for auditing and other reconciliation purposes as and when required.
4. Privacy Preserving
The individual privacy of participants is still maintained. In a public currency system such as Bitcoin or Ethereum, anyone and everyone can view all transactions that have occurred, but the parties to these transactions appear only as wallet addresses, which are derived from cryptographic keys rather than real-world identities, thus maintaining privacy (more precisely, pseudonymity) for the individuals behind the transactions. (See the image below).
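As an illustration of how such an audit might work, the sketch below walks a copy of the ledger and checks that every block still matches the hash recorded by its successor. The dictionary-based block layout and the `block_hash`, `build_chain` and `audit` helpers are hypothetical and exist only for this example.

```python
import hashlib
from typing import Dict, List


def block_hash(block: Dict) -> str:
    payload = f"{block['prev_hash']}|{'|'.join(block['transactions'])}"
    return hashlib.sha256(payload.encode()).hexdigest()


def build_chain(batches: List[List[str]]) -> List[Dict]:
    chain, prev = [], ""
    for transactions in batches:
        block = {"prev_hash": prev, "transactions": list(transactions)}
        chain.append(block)
        prev = block_hash(block)
    return chain


def audit(chain: List[Dict]) -> int:
    """Return the index of the first block whose back-link is broken, or -1 if none."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return -1


ledger = build_chain([["genesis"], ["alice->bob:5"], ["bob->carol:2"]])
clean = audit(ledger)

# Quietly rewrite history in block 1...
ledger[1]["transactions"][0] = "alice->bob:500"
tampered = audit(ledger)
print(clean, tampered)
```

Editing block 1 changes its hash, so the link stored in block 2 no longer matches and the audit flags it. Anyone holding a copy of the ledger can run this kind of check without asking a central authority.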
5. Challenges to Adoption
Nothing is ever just a bed of roses, and blockchain is no different. Some major concerns when setting up a network are:
1. Ensuring Honest Participation
For a public system that anyone can join, such as Bitcoin/Ethereum, it really depends on luck with the community at large, as honest participation is crucial in the early stages to ensure longevity of the network. For closed networks set up within business domains, participation must be regulated tightly.
2. Infrastructure Investment
Infrastructure costs both in terms of hardware deployment and software development and subsequent maintenance can be significant, especially initially. But it can be argued that this goes for any new system a business house may venture towards and it really is the long term returns and value that should be deciding factors.
3. Fear of the Unknown
The biggest hurdle comes here. There’s no way to sugarcoat it: blockchain and processes around blockchain are radically different and adoption will take a while. With public networks, most people cannot comprehend let alone digest the concept of a decentralized currency which is truly self-regulating. Further, the enterprise space always tends to be a slow and cautious adopter and given the wide-reaching changes required from business processes at the fundamental level coupled with the mindset changes necessary for true blockchain adoption, you can bet that it will be a while before blockchain truly sets in.
6. Concluding Our Discussion: What We’ve Seen So Far
We’ve covered what blockchain is and have delved into the nitty-gritty of the fault-tolerance and consensus mechanisms it relies on. We’ve even taken an in-depth look at the most famous consensus mechanism, Proof of Work, or mining as it’s better known, and reviewed another great consensus mechanism, Proof of Stake. We’ve looked at why blockchain is worth the fuss and what the main challenges are to its adoption. We’ve taken the help of Bitcoin and cryptocurrencies along the way to better understand blockchain, and that leaves us on a good note to conclude this article:
· A blockchain is a ledger of transactions and is stored by every participant on the network
· It’s structured as a sequence of blocks, with each block holding a certain number of transactions and maintaining a link to its preceding block, thus forming a chain
· Transactions in a block are hashed until a root hash value is obtained, which is subsequently stored in the block header along with the hash of the preceding block
· Participants on the blockchain network must reach consensus as to the validity of proposed transactions before they’re added to the immutable ledger
· Consensus must be reached in a fault-tolerant manner accounting for malicious and absent nodes
· Consensus algorithms are built for this purpose, and several exist
· The Proof-of-Work consensus algorithm emphasizes compute power, with participants dedicating resources towards the computation of a special value, dubbed a nonce
· Once the nonce is found, a new block is generated and the nonce is stored in the block header as well
· Multiple nodes may generate valid blocks simultaneously, leading to conflicting chains
· These conflicts are resolved as one PoW chain eventually grows longer and other nodes switch over to the longest chain
· The longest chain is always considered the correct chain as the most compute power has been expended on it
· Proof of Stake is another consensus algorithm and instead of relying on large compute power being expended behind wasteful computations, it relies on stakeholders of its assets
· A stakeholder’s chance at forging the next block is proportional to the quantity of assets held by them and the duration for which those assets have been held
· Blockchain systems are highly decentralized and self-regulating, are entirely participant driven and facilitate high levels of transaction transparency while maintaining individual privacy
· The main challenges to blockchain adoption remain ensuring honest participation and infrastructure investments, though it can be argued that both these are critical and necessary for any system
· Widespread adoption is hindered by a fear-of-the-unknown, which necessitates greater awareness, hands-on experience and most of all: some more time!
This has been the first part in a multi-part series on blockchain technology. Future parts will focus on the enterprise, on the Hyperledger Fabric framework and on writing your own smart contracts and custom logic. Keep your eyes peeled for Part 2, which will focus on exploring blockchain in the enterprise world!