Introduction to Blockchain

Zaid Khaishagi
13 min read · Jun 16, 2018


Many people are still unfamiliar with the concepts behind blockchain technology. This is a simple introduction to those concepts. It avoids abstractions where possible and goes a little way into the technical details, for readers who want a basic understanding of how a blockchain works.

What is a blockchain?

A blockchain, basically, is a collection of sequential records. These records are divided up into chunks. These chunks are called blocks.

Each block is fixed in its place in the overall sequence of records with the help of the previous block, i.e. the block that comes just before it in the sequence. So, if any block is displaced or changed in this sequence, it leads to a different branch.

As you might have noticed, this structure resembles a chain, with each block acting like a link connected to its neighboring links. Branching fits the analogy too: because each block depends on the block before it, a displaced or modified block is no longer the same block. It is like a link taken out of its place and attached somewhere else, and it starts a new branch of the chain. This resistance to modification, usually called immutability, is one of the key features of a blockchain.

This is pretty much it for the overview of the structure of a blockchain. However, in this form, it does not really serve much useful purpose. Its potential is leveraged by actually implementing it in a distributed fashion across a network (more on that later).

Going deeper

We’ve talked about records in sequence and touched on the immutability feature. But how does all this happen precisely? For a moment, let’s look at what hashes are before we take a deeper look into what goes into a block.

Hashes

A hash function is a function that takes any data as its input and produces an output of a fixed length. This output is called the hash digest of the input. A hash digest may also simply be called a hash.

For any given input, the function always gives the same result. However, even the slightest change to the input completely changes the resulting hash. The goals are that the input cannot be worked out from the output alone, and that every distinct input has its own output, i.e. there are no collisions.

For example, using SHA256 which is a hashing function:

‘abc’ hashes to: ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

‘abcd’ hashes to: 88d4266fd4e6338d13b845fcf289579d209c897823b9217da3e161936f031589
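These digests can be reproduced with Python's standard hashlib module. This is only a minimal sketch, and the helper name sha256_hex is made up for the illustration.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# One extra character in the input gives a completely different digest.
print(sha256_hex("abc"))
print(sha256_hex("abcd"))
```

Running it twice prints the same two digests, which is the determinism property described above.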

Now, technically speaking, the number of possible inputs (input range) is larger than the number of possible outputs (output range), because the output is required to be of a fixed length. So some different inputs are bound to share the same output, and these overlaps are called collisions. Even though collisions exist theoretically, they are of no practical significance because the output range is so large. For example, the number of possible outputs for SHA256 is 2^256 = 115792089237316195423570985008687907853269984665640564039457584007913129639936 ≈ 1.16e+77. For reference, the estimated diameter of the observable universe is 8.8e+26 meters.
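The size of that output range follows directly from the digest length: a SHA256 digest is 256 bits long, so there are 2^256 possible values. A quick way to check the arithmetic in Python (the variable name is just for illustration):

```python
# Every SHA-256 digest is 256 bits long, so there are 2**256 possible outputs.
output_range = 2 ** 256

print(output_range)           # the full integer
print(f"{output_range:.2e}")  # the same number in scientific notation
```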

So, the basic features provided by the hash function are: Determinism, Non-invertibility.

Now, let’s look at a block in more detail.

Block Structure

Each block contains these 3 main things:

  1. Hash of Previous Block
  2. Nonce
  3. Records

Hash of Previous Block: The hash of any data serves as its unique ID because, practically, only the specific data taken as the input could have produced the hash. This is also valid for the blocks in a blockchain. Hence, this serves as the link between the individual blocks. The inclusion of this field inside a block ensures that if even the slightest change occurs, i.e. tampering, in the records that precede this block, then this field would no longer be valid with its current entry.

Suppose you take any block in the sequence of blocks and decide to make some changes to it. As we mentioned earlier, a slight change to the input completely changes the output of a hash function, so the hash of the modified block changes completely. But the hash of this block was stored in the next block in the sequence. So, either that entry is left as it was (and is now invalid) or it too is updated to keep it valid. Updating it changes the next block in turn… and so it goes on. This leads to an entirely different chain than the original. Such a change would be very easy to detect since there are so many users (nodes) constantly monitoring the blockchain. This is what gives the blockchain its immutability.

Records: This field simply contains all the records that you want to store inside the block. The entire blockchain’s records are stored in this field of each block.

Nonce: This field contains some arbitrary value. It only has meaning to us when it is used in this place. It is used for the proof-of-work (more later).

Those are the 3 main fields that are present in a block. However, there can be additional fields, and there usually are.
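To make the three fields and the linking concrete, here is a minimal sketch in Python. The class and function names, the JSON serialisation and the all-zero placeholder hash for the first block are assumptions made for this illustration, not how any particular blockchain encodes its blocks.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass
class Block:
    prev_hash: str  # hash of the previous block
    nonce: int      # arbitrary value, used for proof-of-work
    records: list   # the records stored in this block

    def hash(self) -> str:
        # Serialise the fields deterministically, then hash the result.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

def build_chain(record_batches):
    """Link blocks by storing each block's hash in its successor."""
    chain, prev_hash = [], "0" * 64  # assumed placeholder for the first block
    for records in record_batches:
        block = Block(prev_hash, 0, records)
        chain.append(block)
        prev_hash = block.hash()
    return chain

def first_broken_link(chain):
    """Return the index of the first block whose prev_hash no longer
    matches the block before it, or None if every link is intact."""
    for i in range(1, len(chain)):
        if chain[i].prev_hash != chain[i - 1].hash():
            return i
    return None

chain = build_chain([["a pays b 5"], ["b pays c 2"], ["c pays a 1"]])
print(first_broken_link(chain))       # None: every link is intact

chain[0].records[0] = "a pays b 500"  # tamper with the first block
print(first_broken_link(chain))       # 1: block 1 still stores the old hash
```

Repairing the link by updating block 1 would change block 1's own hash, which breaks the link stored in block 2, and so on down the chain.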

I recommend you have a look at this demonstration: https://anders.com/blockchain/blockchain.html

This is what a blockchain essentially is at the local level, i.e. on a single node of a network. However, the blockchain is not really implemented locally in isolation. As stated before, it is implemented in a distributed fashion.

Distribution

The blockchain is distributed by, firstly, having a network over which to distribute it. Secondly, everyone in the network holds a copy of the blockchain, and any newcomer who joins the network gets a copy as well. These copies need to be consistent among the members of the network (nodes), i.e. the nodes need to agree on one version of the blockchain. Different methods may be used to achieve this consensus. These are called consensus algorithms. Examples include Proof-of-work, Proof-of-stake and Delegated Proof-of-stake, and there are many more. We will focus only on proof-of-work because it is the most basic and was the first, used in bitcoin.

Proof-of-work

In the proof-of-work consensus algorithm (PoW), every node broadcasts the new records that need to be added into the blockchain, e.g. transactions, and simultaneously listens for anyone else broadcasting a new record. The nodes keep these new records for adding them to the new block in the blockchain. Think of it as a pool where records not yet put into the blockchain are temporarily stored. Once they are added, they become as though etched into the blockchain.

Next, every new block that is to be added to the blockchain needs to be mined. So, what is mining? Mining simply means making new blocks that are valid, and being valid means that they satisfy some arbitrary criterion. Mining is done by nodes, and the nodes that do it are called miners. In the bitcoin whitepaper, mining was originally intended to be something in which all nodes would participate. However, it is not necessary for everyone to do so.

Let’s explain what mining means in a bit more detail.

Mining

Mining means making new valid blocks that can be added to the blockchain. The validity criterion must be satisfied by every block that gets added. It can be something like requiring a certain number of leading zeroes in the hash of the block, e.g. at the time of writing, the hashes of valid bitcoin blocks begin with about 18 leading zeroes (written in hexadecimal).

So how does a miner make a block satisfy this criterion? The answer is brute force, and this is where the nonce field comes into play.

The miners just guess which nonce value to fill in so that the criterion is met. Since the nonce field is part of the block, and the hash digest changes completely for even a slight change to the input, the miners can change the resulting hash digest by changing the nonce value. So if a guess doesn’t work out, they simply pick another nonce value and try again until the criterion is met. To check whether a guess was right, they need to calculate the hash of the block with that nonce filled in, and this takes up time and processing power (this is important).
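The guessing loop can be sketched in a few lines of Python. This is a toy under stated assumptions: the block is reduced to a single string, the criterion is 4 leading hexadecimal zeroes, and nonces are tried in order starting from 0. The function names are made up for the example.

```python
import hashlib

def block_hash(prev_hash: str, records: str, nonce: int) -> str:
    """Hash the three block fields together (toy serialisation)."""
    data = f"{prev_hash}|{records}|{nonce}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()

def mine(prev_hash: str, records: str, zeroes: int = 4):
    """Try nonces 0, 1, 2, ... until the hash starts with `zeroes` hex zeroes."""
    prefix = "0" * zeroes
    nonce = 0
    while True:
        digest = block_hash(prev_hash, records, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

nonce, digest = mine("0" * 64, "a pays b 5")
print(nonce, digest)
```

Each extra hexadecimal zero in the criterion multiplies the expected number of guesses by 16, which is why the work grows so quickly with the difficulty.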

The first node to win the race of finding a correct nonce is the miner of that block and is rewarded for it. Once a miner has a valid block, it broadcasts the block over the network. Each node that receives the new block checks whether it is valid. If it is, the node adds it to its own copy of the blockchain; if not, the block is discarded.
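From the receiving node's side, checking a block takes a single hash, whereas finding it took the miner many. A rough sketch of that check follows, assuming a toy rule of 3 leading hexadecimal zeroes and blocks stored as plain dictionaries; all names here are hypothetical.

```python
import hashlib

DIFFICULTY_PREFIX = "000"  # assumed validity rule for this sketch

def block_hash(block: dict) -> str:
    data = f"{block['prev_hash']}|{block['records']}|{block['nonce']}"
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

def mine(prev_hash: str, records: str) -> dict:
    """Brute-force a nonce so that the block meets the validity rule."""
    block = {"prev_hash": prev_hash, "records": records, "nonce": 0}
    while not block_hash(block).startswith(DIFFICULTY_PREFIX):
        block["nonce"] += 1
    return block

def accept_block(chain: list, block: dict) -> bool:
    """Append `block` to `chain` if it is valid; otherwise discard it."""
    extends_tip = block["prev_hash"] == block_hash(chain[-1])
    meets_rule = block_hash(block).startswith(DIFFICULTY_PREFIX)
    if extends_tip and meets_rule:
        chain.append(block)
        return True
    return False

genesis = {"prev_hash": "0" * 64, "records": "genesis", "nonce": 0}
chain = [genesis]

good = mine(block_hash(genesis), "a pays b 5")
print(accept_block(chain, good))   # True: it links to the tip and meets the rule

stale = mine("f" * 64, "b pays c 2")
print(accept_block(chain, stale))  # False: it does not link to the current tip
```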

Each node only adds valid blocks, and producing a valid block takes work (time and processing power). So no single node can suddenly come up with an entirely different version of the blockchain that also satisfies the criterion; it first has to put in the effort. As a result, a malicious node, or a group of nodes working together, needs more than half of the network’s mining resources to out-compete the rest of the network. This is called the 51% attack, because the malicious party needs to control a majority (about 51%) of the network’s hashing power in order to hijack the blockchain. So, the larger the network, the more difficult it is to hijack.
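The bitcoin whitepaper puts a number on this. If the attacker controls a share q of the hashing power and the honest nodes control p = 1 - q, the probability that the attacker ever catches up from z blocks behind is (q/p)^z when p > q, and 1 otherwise. The sketch below evaluates only this catch-up formula, not the fuller calculation in the whitepaper, and the function name is made up for the example.

```python
def catch_up_probability(q: float, z: int) -> float:
    """Probability that an attacker with hash-power share q ever catches up
    from z blocks behind (catch-up formula from the bitcoin whitepaper)."""
    p = 1.0 - q  # share held by the honest nodes
    if q >= p:
        return 1.0
    return (q / p) ** z

for q in (0.10, 0.30, 0.45, 0.51):
    print(q, catch_up_probability(q, z=6))
```

Below half of the hashing power, the attacker's chances shrink exponentially with every block of lead the honest chain has. With more than half, the attacker is certain to catch up eventually.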

This is how consensus about the ‘correct’ version of the blockchain is maintained across the network. A mined block is itself proof of the work that went into mining it, thus the name ‘Proof-of-work’.

One thing to consider is what incentive the miners have to dedicate their precious resources. Incentives matter after all.

Incentive for miners

The miners that mine new blocks need to be rewarded with some sort of an incentive which motivates them to dedicate their resources to the mining process. These incentives are provided in the form of block rewards.

Block rewards go directly to the miner that mined the block. In the case of cryptocurrencies like bitcoin, the reward takes the form of newly minted bitcoins, paid through a transaction that is included in the new block. In bitcoin, this is the only way that new bitcoins can enter circulation, and the block reward is halved at regular intervals so that there can never be more than 21 million bitcoins in circulation. So, in its steady state (when all 21 million bitcoins are in circulation), the total quantity can only go down, e.g. when keys or wallets are lost or coins get locked somewhere.

Branches

In the explanation of the mining process above, the process of how new blocks are produced is explained. This does not explain what happens if at any point, there are two, or rather multiple, valid blocks available to add to the blockchain. This situation is entirely possible since every miner mines independently and there can be clashes between their blocks. Also, this is not the only cause for clashes between blocks. There may be some tampered block that clashes with the ‘proper’ chain. There may also be some hard forks in the blockchain software, i.e. some upgrades are made to the blockchain software that are not backwards compatible and these are not completely adopted across the network. Hard forks may even arise if the community simply chooses to allow multiple branches of a blockchain to exist simultaneously through a community split.

So, the reasons are summarised as:

  1. Clash between multiple new valid blocks.
  2. Tampered block.
  3. Hard forks in the blockchain.

In each of these cases, a new branch emerges in the blockchain. This means that there are multiple versions of the blockchain that are valid. This issue may be resolved by simply keeping these different versions for some duration of time.

Eventually, one of these branches wins out, in the sense that it becomes longer because more nodes of the network adopt it. A longer branch has more blocks in it than the others, which implies that more work has been put into it, i.e. more mining has been done. The nodes therefore discard the shorter chain, which has less work behind it, in favour of the longer one. So, the longer chain wins.
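A node's choice between competing branches can be sketched as picking the branch with the most accumulated work. In the toy below every block carries an assumed 'work' value. When all blocks are equally hard to mine, the most work simply means the most blocks, which is the "longer chain wins" rule described above.

```python
def chain_work(chain):
    """Total work in a chain; each block carries an assumed 'work' value."""
    return sum(block["work"] for block in chain)

def choose_chain(candidates):
    """Pick the candidate branch with the most accumulated work."""
    return max(candidates, key=chain_work)

# Two branches that share the first two blocks and then diverge.
shared = [{"id": "b0", "work": 1}, {"id": "b1", "work": 1}]
branch_a = shared + [{"id": "a2", "work": 1}]
branch_b = shared + [{"id": "b2", "work": 1}, {"id": "b3", "work": 1}]

winner = choose_chain([branch_a, branch_b])
print([block["id"] for block in winner])  # ['b0', 'b1', 'b2', 'b3']
```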

In the case of hard forks however, this is not exactly the case. Here, the community is faced with a choice of either upgrading to the new version or remaining with the old version or simply allowing two or more different branches to both survive. This leads to a split in the community. An appropriate example which illustrates this kind of a branching is that of Ethereum and Ethereum Classic.

Case Study: Bitcoin

In the blockchain implementation of bitcoin, there are a few notable things which have not yet been covered.

Transactions

In the case of bitcoin, the records that are stored on the blockchain are not just any type of record. They are transactions that occur between different nodes of the network. Moreover, these transactions use cryptography for some extra security, so that only the one who owns the bitcoins can actually send them to anyone else, i.e. only the owner of said bitcoins can broadcast a valid transaction.

This is achieved by using public and private keys. The public and private keys are made in pairs. So, a public key corresponds to a certain private key and vice-versa.

Private key: This is something that only the owner holds (it is private). It can be used to authorize (sign) a message.

Public key: This is something that everyone holds (it is public). It can be used to verify a signed message.

The transaction contains the signature of the sender, the public key of the recipient and a reference to the bitcoin held by the sender, i.e. a reference to an unspent transaction where the sender received those bitcoins. This referencing to unspent transactions helps to avoid double spending.
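The role of the reference to an unspent transaction can be shown with a toy ledger. This sketch leaves out signatures, fees and change, and always spends a whole output at once. The class and method names are invented for the illustration.

```python
class Ledger:
    """Toy record of unspent outputs: output id -> (owner, amount)."""

    def __init__(self):
        self.unspent = {}

    def add_output(self, output_id: str, owner: str, amount: int) -> None:
        self.unspent[output_id] = (owner, amount)

    def spend(self, output_id: str, sender: str, recipient: str,
              new_output_id: str) -> bool:
        """Move a whole unspent output from sender to recipient.
        Fails if the output was already spent or belongs to someone else."""
        entry = self.unspent.get(output_id)
        if entry is None or entry[0] != sender:
            return False
        del self.unspent[output_id]  # the referenced output is now spent
        self.unspent[new_output_id] = (recipient, entry[1])
        return True

ledger = Ledger()
ledger.add_output("tx0", "alice", 5)

print(ledger.spend("tx0", "alice", "bob", "tx1"))    # True: tx0 was unspent
print(ledger.spend("tx0", "alice", "carol", "tx2"))  # False: tx0 is already spent
```

Because every transaction must point at an output that is still unspent, the second attempt to spend the same bitcoins is rejected.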

The owner of a private key can use it to sign a transaction. Once signed, the transaction is broadcast like any other record. The other nodes on the network then need to verify it. They do this with the sender’s public key, which lets them check that the matching private key was used to sign the transaction.
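The sign-then-verify idea can be illustrated with a deliberately tiny key pair. Note the assumptions: this is textbook RSA with toy primes, chosen only because it fits in a few lines of standard-library Python. It is completely insecure, and it is not what bitcoin uses (bitcoin signs with ECDSA over the secp256k1 curve). All names are made up for the example.

```python
import hashlib

# Toy key pair built from tiny primes. Insecure, for illustration only.
P, Q = 61, 53
N = P * Q                          # modulus, part of both keys
E = 17                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

def digest(message: str) -> int:
    """Hash the message and reduce it into the toy key's range."""
    h = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(h, "big") % N

def sign(message: str, private_exponent: int) -> int:
    """Only the holder of the private exponent can produce this value."""
    return pow(digest(message), private_exponent, N)

def verify(message: str, signature: int, public_exponent: int) -> bool:
    """Anyone with the public exponent can check the signature."""
    return pow(signature, public_exponent, N) == digest(message)

tx = "alice pays bob 5"
signature = sign(tx, D)
print(verify(tx, signature, E))            # True: signed with the matching key
print(verify(tx, (signature + 1) % N, E))  # False: an altered signature fails
```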

These transactions are also verified by what are called full nodes.

[Figure: taken from the Bitcoin whitepaper]

Block Difficulty

The bitcoin blockchain aims for each block to take approximately 10 minutes to mine. To keep to this, the difficulty of mining a new block is adjusted periodically (every 2016 blocks), in effect by increasing or decreasing the number of leading zeroes required in the hash of a block.
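More precisely, a valid hash has to be below a numeric target, and a smaller target means more leading zeroes. Bitcoin recalculates the target every 2016 blocks by comparing how long those blocks actually took with the two weeks they should have taken, limiting any single change to a factor of 4. The sketch below follows that rule but leaves out details such as the cap on the maximum target. The starting target is a made-up number.

```python
RETARGET_INTERVAL = 2016  # blocks between adjustments
TARGET_SPACING = 10 * 60  # desired seconds per block
EXPECTED_TIMESPAN = RETARGET_INTERVAL * TARGET_SPACING  # two weeks in seconds

def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last interval really took.
    A smaller target means fewer acceptable hashes, i.e. harder mining."""
    # Limit a single adjustment to a factor of 4 in either direction.
    clamped = max(EXPECTED_TIMESPAN // 4,
                  min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN

old_target = 1 << 200  # hypothetical starting target

# Blocks arrived twice as fast as intended: the target halves (harder).
print(retarget(old_target, EXPECTED_TIMESPAN // 2) == old_target // 2)  # True

# Blocks arrived twice as slowly: the target doubles (easier).
print(retarget(old_target, EXPECTED_TIMESPAN * 2) == old_target * 2)    # True
```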

Block Reward

Bitcoin’s block reward is an amount of bitcoins that are given to the miner who mined a given block. This block reward value, however, is not entirely fixed. The block reward value initially started off as 50 bitcoins. This value gets reduced over time though. The block reward is halved every 210,000 blocks. Why 210,000 blocks though? The answer is simply that Satoshi Nakamoto decided.

At the time of writing (June 2018), the block reward is 12.5 bitcoins per block.
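The halving schedule is also where the 21 million limit comes from. The sketch below counts in satoshis (1 bitcoin = 100,000,000 satoshis) and halves the reward with integer division. The function names are chosen for this example.

```python
HALVING_INTERVAL = 210_000         # blocks between halvings
INITIAL_REWARD = 50 * 100_000_000  # 50 bitcoins, in satoshis

def block_reward(height: int) -> int:
    """Reward in satoshis for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_REWARD >> halvings  # halve once per completed interval

def total_supply() -> int:
    """Sum the rewards of every block until the reward reaches zero."""
    total, height = 0, 0
    while block_reward(height) > 0:
        total += HALVING_INTERVAL * block_reward(height)
        height += HALVING_INTERVAL
    return total

print(block_reward(0) / 1e8)        # 50.0
print(block_reward(420_000) / 1e8)  # 12.5
print(total_supply() / 1e8)         # 20999999.9769
```

The total ends up just under 21 million because each halving rounds down to a whole number of satoshis.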

Transaction Fees

Transaction fees refer to a certain amount of bitcoins paid to the miner that mines the block. This is done so that the miner is more likely to include the transaction into the block mined. So, this means that the transaction gets into the blockchain faster.

Since the size of a single block is limited, there can only be so many transactions that fit into it. And as the popularity of bitcoin grows and more people want to transact using bitcoin, the number of transactions that need to be added becomes more than can actually be added. Thus the transaction fees.
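One plausible way for a miner to fill a limited block is to take the transactions that pay the most per byte first. This is a hypothetical greedy policy written for illustration. Miners are free to choose transactions however they like, and the field names and numbers below are invented.

```python
def select_transactions(pending, max_block_size):
    """Greedily pick the transactions with the highest fee per byte
    that still fit into the remaining block space."""
    chosen, used = [], 0
    by_fee_rate = sorted(pending, key=lambda tx: tx["fee"] / tx["size"],
                         reverse=True)
    for tx in by_fee_rate:
        if used + tx["size"] <= max_block_size:
            chosen.append(tx)
            used += tx["size"]
    return chosen

pending = [
    {"id": "a", "fee": 500, "size": 250},  # 2.0 per byte
    {"id": "b", "fee": 100, "size": 200},  # 0.5 per byte
    {"id": "c", "fee": 900, "size": 300},  # 3.0 per byte
]

block = select_transactions(pending, max_block_size=600)
print([tx["id"] for tx in block])  # ['c', 'a']: 'b' has to wait for a later block
```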

Summary

In summary, a blockchain is a collection of records distributed in an immutable and sequential manner. It is tamper resistant and can be used in a variety of applications.

The distributed blockchain is held by everyone. The nodes broadcast their records to everyone. The miners mine new blocks and, when they have one, broadcast it to everyone. Each node that receives a new block checks whether it is valid and, if so, adds it to its copy; otherwise the block is discarded. In case of branching, the individual nodes may pick whichever branch they want, but eventually one branch becomes longer than the others. At this point, the longer branch is adopted by the network since it has more work put into it.

I hope this introductory explanation helped you to understand a bit more about this new blockchain technology. I encourage you to do more research into this as this is something that is being rapidly adopted by various organisations and groups around the world.

Feel free to contact me at zaid960928@gmail.com if you have any questions or even if you just want to say hi.

Suggested Further Reading:

  1. Bitcoin whitepaper
  2. Explanation of Bitcoin and other cryptocurrencies: https://www.youtube.com/watch?v=bBC-nXj3Ng4
  3. Explanation of Bitcoin Whitepaper: https://www.youtube.com/watch?v=MCxOwPlVVgA

References:

  1. Bitcoin whitepaper
  2. https://blockchain.info
  3. https://anders.com/blockchain/blockchain.html
