Blockchain Unblocked — Part 2: Beginner Friendly guide to Hashing, Mining and other scary topics!

By Deepika Karanji. Published in Coinmonks. 13 min read. Feb 26, 2022.

In this article, we will discuss:

- Hashing — SHA256

- Block and Block Validation

- What exactly is mining?

- Mining Paneer :)

- What are consensus protocols?

- Use cases beyond monetary transactions in Private blockchains

Hi there! If you are new here, welcome! Blockchain Unblocked is a series of articles where I attempt to explain the vast concepts of Blockchain in simple language! If you came here from Part 0 and 1, thanks for reading!

This series has literally no prerequisites, except one — Read every previous article in the series please! ^_^

We will first step through some key concepts of blockchain, and then answer the questions we have listed above.

SHA 256 Hash

SHA256 is a hash function used to secure data in a Bitcoin Blockchain network.

In grade school, we studied functions. A function looked like this: f(x) = y, where x was an input, and y was the output we got, after passing x to some function f.

A hash is a function.

Hash(x) = y, where the output y is said to be the Hash of x. There are many types of hash functions out there, but today we’ll look at a Hash called SHA 256.

  1. What does SHA 256 do?

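It converts any given input string into a fixed-length output made of hexadecimal characters (the digits 0 to 9 and the letters a to f), such that the same input always gives the same output. For example, typing the input string “Deepika” into a SHA256 calculator gives a 64-character hash.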

You can experiment here by entering your own input values, and you’ll see that the hash changes for every single letter you change, because each change makes it a new input.

Some cool facts:

  • The SHA256 Hash for a given input is CONSTANT. You can delete the input and re-enter it, and you’ll get the exact same hash output. Hence, it is said to be a “Deterministic” hash function.
  • The SHA256 Hash for an input of ANY size is always 256 bits (or 64 hexadecimal characters) long. Input “Deepika” hashes to 64 characters, and input “Deepika Karanji” hashes to 64 characters. Input the entire Indian constitution, and it will still hash out to 64 characters!
  • A minor change to the original data alters the hash value so much that it’s not apparent the new hash value is derived from similar data; this is known as the avalanche effect. (The hashes of Deepika and DEepika are vastly different!)
  • SHA256 is a one-way cryptographic hash function. I used to mistake it for an encryption function. Encryption is meant to be reversible: anyone holding the right key can “decrypt” the output and get back the original input. A hash has no key and no decrypt step. With SHA256, nobody has found a practical way to get the original data back from the hash alone; the best anyone can do is guess inputs and hash each guess.
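The facts above can be checked in a few lines of Python using the standard `hashlib` module. This is only a sketch: the helper name `sha256_hex` is my own, and I am assuming the text is encoded as UTF-8 before hashing, which may not match every online tool.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA256 hash of a string as hexadecimal characters."""
    # Assumption: the input is encoded as UTF-8 before hashing.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

first = sha256_hex("Deepika")
again = sha256_hex("Deepika")
longer = sha256_hex("Deepika Karanji")
changed = sha256_hex("DEepika")

# Deterministic: hashing the same input twice gives the same output.
print(first == again)

# Fixed size: both outputs are 64 hexadecimal characters (256 bits).
print(len(first), len(longer))

# Avalanche effect: count how many character positions differ
# after changing a single letter of the input.
differing = sum(a != b for a, b in zip(first, changed))
print(differing, "of 64 characters differ")
```

Note that there is no `unhash` function to call: the only way to go from a hash back to an input is to guess inputs and compare.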

BLOCK

At this point, we have some input data given to a hash function, and we get an output. Let us now break our input data into 3 parts:

  • Block number
  • A magic number called nonce
  • Some actual data

The SHA256 Hash of this input (which is now split into 3 parts) gives us some hash as shown below:

This entire box can be thought of as a block. There are many more fields in a block, but let’s ignore them for now.

The Hash is the output of SHA256 Hash of (Block#, Nonce, Data).
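As a rough sketch, a simplified block like this can be hashed in Python as follows. The function name `block_hash` and the way the three parts are joined with a `|` separator are my own assumptions for illustration; a real Bitcoin block hashes a binary header with more fields.

```python
import hashlib

def block_hash(number: int, nonce: int, data: str) -> str:
    """Hash the three parts of our simplified block into one SHA256 hash."""
    # Assumption: the parts are joined into one string with "|" between them.
    # Any fixed, unambiguous layout would work for this illustration.
    payload = f"{number}|{nonce}|{data}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

empty_block = block_hash(1, 0, "")
with_data = block_hash(1, 0, "deepika")

print(empty_block)
print(with_data)  # changing any one part gives a completely different hash
```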

One interesting observation is that the hash actually begins with several consecutive leading zeros! Usually, hashes look like random strings of letters and digits with no specific pattern in them. So we intuitively realise that the likelihood of getting leading zeros is quite low!

Why is this special?

When the hash of a block begins with some “X” no. of leading zeros, it is said to be a VALID block!

Does this mean we have answered the question of what a Valid block is?

Yep! It’s just that simple! It may seem really abstract at this point — like who decides how many leading zeros are required to declare that block as valid? Do we just invalidate inputs that don’t have leading zeros in their hash?? Let us take it slow, we’ll get our answers soon enough!

For this article, let us assume that if a block’s hash starts with 4 zeros, it is valid.

Now, if we change any one of the inputs — say we enter “deepika” in the Data section, (earlier it was empty), the hash will obviously change as shown below:

This hash does NOT have leading zeros! So this block is NOT YET valid.
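Under the 4-zero rule assumed in this article, the validity check is tiny. Here is a sketch with two equivalent versions; the function names and the two sample hashes are made up for illustration. The second version treats the hash as a number, which is closer to how real networks state the rule.

```python
def is_valid(digest: str, zeros: int = 4) -> bool:
    """Valid if the 64-character hex hash starts with the required zeros."""
    return digest.startswith("0" * zeros)

def is_valid_numeric(digest: str, zeros: int = 4) -> bool:
    """Same rule, stated as: the hash, read as a number, is below a limit."""
    limit = 16 ** (64 - zeros)  # smallest number that has fewer leading zeros
    return int(digest, 16) < limit

# Two made-up 64-character hashes, for illustration only.
good = "0000" + "a" * 60
bad = "000f" + "a" * 60

print(is_valid(good), is_valid_numeric(good))
print(is_valid(bad), is_valid_numeric(bad))
```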

MINING

The data stored in a block can be anything; say it contains some transactions. We want this block, which contains a bunch of transactions, to become VALID so that we can add it to a blockchain. Can we try changing any of the 3 inputs, so that the hash actually starts with a bunch of zeros and the block becomes valid?

  1. We don’t want to change the block number, because it’s just sort of an identification system for the blocks in a chain.
  2. We certainly don’t want to modify the data, because hey, it’s the data’s sanctity that must be preserved.
  3. Can we change the nonce? Yes! This is actually the purpose of nonce! We will try to keep changing nonce by trial and error till we discover a nonce which results in a hash starting with 4 zeros! You can actually try it out yourself. Here, the “key” input can be thought of as the nonce for our use case.

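As you experiment on the website above, you’ll quickly see that it is really difficult, nay, practically IMPOSSIBLE to keep doing trial and error by hand to figure out a nonce that hashes the input to 4 leading zeros. Each hash character has a 1 in 16 chance of being a zero, so on average it takes about 16 × 16 × 16 × 16 = 65,536 tries.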

But …………..

A computer can actually do this! It can race through THOUSANDS of nonces every second (specialised mining hardware manages TRILLIONS) with ANY given input data and block number, until it finds a nonce that makes the block valid!

The process of finding a nonce for a given block by using computers that meet some hardware requirements with the purpose of Validating that block is called Mining!

See the below block — has some input and some nonce, but the hash is not valid yet.

Invalid Block

Let us click “Mine”. What happens in the background? The mining node (CPU, GPU or FPGA or an ASIC) will rapidly keep trying out nonces till the output hashes out to 4 leading zeros!

Valid Block, nonce has changed after mining

As soon as it finds the nonce, the Block becomes Valid!
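What the “Mine” button does can be sketched as a simple loop. This is an illustration under the same assumptions as the rest of this article (a 3-part block and a 4-zero rule); the names `block_hash` and `mine` and the `|` separator are my own, not how the demo website or Bitcoin is actually written.

```python
import hashlib
from itertools import count

def block_hash(number: int, nonce: int, data: str) -> str:
    """Hash our simplified block (assumed layout: parts joined with '|')."""
    payload = f"{number}|{nonce}|{data}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def mine(number: int, data: str, zeros: int = 4):
    """Try nonces 0, 1, 2, ... until the hash starts with enough zeros."""
    prefix = "0" * zeros
    for nonce in count():
        digest = block_hash(number, nonce, data)
        if digest.startswith(prefix):
            return nonce, digest

nonce, digest = mine(1, "deepika")
print("nonce found:", nonce)
print("valid hash: ", digest)
```

With 4 zeros this usually finishes in well under a second on a laptop. Every extra zero you ask for makes the search about 16 times longer.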

Fun Fact: nonce expands to “number used once”

Blockchain miners are people who own nodes, that is, hardware running software that searches for this nonce, validates blocks and adds them to a blockchain. The process of mining is extremely power intensive and needs significant computational power and hardware speed, which is why people use GPUs or ASICs for it.

Mining is a competitive process! Like you, there are thousands of other miners all trying to validate a given block first! Whichever node is able to validate the block first, becomes the winner and is awarded some Bitcoins (for this example). These can be other coins or tokens as well! The winning node then broadcasts the valid block to all the other nodes in the network.

Mining is “Work” done by the GPU or ASIC. Proof of this “Work done” is what secures the blockchain. Mining (aka generating Proof of Work) is a very difficult problem to solve, but checking a proposed solution is easy: any node can hash the block once and see whether the result has the required leading zeros. So once a block is actually solved and broadcast across the network, all the other nodes accept or “concede” that this is now a valid block, and they “concede” that the current state of the ledger includes this new block. (Reminder: the ledger is like a database that stores information about the blocks in a blockchain.)

And that is why Proof of Work is called a “consensus protocol”.
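Here is a small sketch of why the other nodes can agree so quickly: finding the nonce takes many hashes, but checking it takes just one. The block layout, the function names and the sample transaction text are all assumptions made for this illustration.

```python
import hashlib
from itertools import count

def block_hash(number: int, nonce: int, data: str) -> str:
    """Hash our simplified block (assumed layout: parts joined with '|')."""
    return hashlib.sha256(f"{number}|{nonce}|{data}".encode("utf-8")).hexdigest()

def mine(number: int, data: str, zeros: int = 4) -> dict:
    """The miner's job: many hashes, until one starts with enough zeros."""
    for nonce in count():
        digest = block_hash(number, nonce, data)
        if digest.startswith("0" * zeros):
            return {"number": number, "nonce": nonce, "data": data, "hash": digest}

def verify(block: dict, zeros: int = 4) -> bool:
    """Every other node's job: a single hash and two comparisons."""
    recomputed = block_hash(block["number"], block["nonce"], block["data"])
    return recomputed == block["hash"] and recomputed.startswith("0" * zeros)

block = mine(1, "alice pays bob 5")
# Someone edits the data but keeps the old nonce and hash.
tampered = dict(block, data="alice pays bob 500")

print(verify(block))     # True
print(verify(tampered))  # False
```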

Let us answer some questions in a logical order now. Some of them are carryovers from Part 1.

If it is getting slightly too technical, don't worry, just keep an open mind and attempt to understand it at a high level!

1. Who decides how many leading zeros are required to declare that block as valid?

  • As mentioned earlier, the block pictures I have attached above are a slight over simplification of what information a block really contains. Henrique Centieiro explains all the fields in a block superbly, but to answer this question, let us say for any block, there exists a block header which looks something like this:
Block header for Block 491133 in Bitcoin network
  • The “bits” field actually indicates the “difficulty level” of a blockchain. Basically, let us say that Paneer, a popular Indian food item, is now in GREAT demand. Whenever Amul creates a block of Paneer, people race to get hold of every block produced by Amul. And since people want to win the race, they procure devices and sensors and sniffer dogs to help them discover Paneer blocks before anyone else does!
  • In other words, as more miners with better and faster resources to find Paneer enter the market or network, the overall speed with which Paneer blocks can be found increases, time taken to find the blocks decreases! This speed is called the Hash Rate of the network.
  • But Bitcoin has defined a rule that mining one Paneer block should take 10 minutes on average. So every 2016 blocks (roughly two weeks), the algorithm looks at how fast blocks were actually found and adjusts the “bits” field, or the difficulty level, so that mining one block takes 10 minutes on average again.
  • The higher the difficulty level, the more resources and power it takes to mine a block. At the time of writing this article, mining bitcoin for small scale miners has become very unprofitable, because of the high difficulty, leading to a somewhat less decentralised BTC because only large scale miners seem to prevail, but that's a discussion for another time.
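  • This difficulty level dictates how many leading zeros need to be present in the output hash in order to declare a block as valid. As of Feb 23rd 2022, the difficulty level is 27,967,152,532,434, and the output hash is expected to have 19 leading zeros. How do we get 19 from the difficulty level? At difficulty 1, a valid hash has to be no larger than a limit of roughly 2²²⁴, which leaves 32 leading zero bits, or 8 leading zeros when the hash is written as 64 hexadecimal characters. Dividing that limit by the current difficulty shrinks it to a 180-bit number, so the top 256 − 180 = 76 bits of a valid hash must be zero. Each hexadecimal character covers 4 bits, and 76 / 4 = 19.
  • Psssst …. Actually, the rule is not really about counting zeros. The “bits” field encodes a target number, and a block is valid when its hash, read as a 256-bit number, is no larger than that target. A small target simply forces the hash to start with a run of zeros, which is why it amounts to the same thing.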
Paneer Blocks — easier to digest than Bitcoin Blocks :P
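For the curious, here is a sketch of the arithmetic linking the difficulty level to leading zeros, plus the 10-minute adjustment rule. It is simplified: I treat the difficulty as a whole number (the real value has decimals), the function names are my own, and the retargeting function leaves out details of the real Bitcoin software.

```python
# The target at difficulty 1: valid hashes must not exceed this number.
MAX_TARGET = 0xFFFF << 208

def target_for(difficulty: int) -> int:
    """Higher difficulty means a smaller target (simplified: integer division)."""
    return MAX_TARGET // difficulty

def leading_hex_zeros(target: int) -> int:
    """Leading zeros of the target when written as 64 hexadecimal characters."""
    return 64 - len(f"{target:x}")

def retarget(old_difficulty: float, actual_minutes: float) -> float:
    """Simplified adjustment made every 2016 blocks.

    If the blocks came faster than 10 minutes each, difficulty goes up;
    if slower, it goes down. One adjustment is capped at a factor of 4.
    """
    expected_minutes = 2016 * 10
    ratio = expected_minutes / actual_minutes
    ratio = max(0.25, min(4.0, ratio))
    return old_difficulty * ratio

print(leading_hex_zeros(target_for(1)))                   # 8
print(leading_hex_zeros(target_for(27_967_152_532_434)))  # 19

# Example: the last 2016 blocks took only half the expected time,
# so the difficulty doubles.
print(retarget(100.0, actual_minutes=2016 * 5))           # 200.0
```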

This is what an actual block looks like:

Explore Bitcoin blocks at https://www.blockchain.com/btc/block/724200

The transactions in a block are stored in a structure called a Merkle Tree, but more on that later!

2. Is mining (aka Proof Of Work) the only way to validate a block? Are there other consensus protocols?

  • Nope, not the only way! Yep there are others!
  • So the reason I focused on PoW and Bitcoin in this article is that usually when someone says “crypto” or “Blockchain”, “mining” and “BTC” are what come to mind. But as we are also gaining some understanding of blockchain basics along the way, I think we are doing a good job!
  • To answer the question — we have already understood that Mining is VERY power intensive. This also means that there are cost, energy and environmental implications to it. If someone is trying to have a private blockchain — say company specific, it does not make sense for the company to validate blocks by making private nodes competitively mine through this cost and energy intensive process right?
  • So people came up with other consensus protocols, like Proof of Stake, PBFT, Proof of Burn, Proof of Authority, etc. Doubtlessly, each of these has its own merits and demerits.

3. We had said that any block tampering will be detected and abandoned in Part 1. Can I alter an old block and then find a corresponding nonce such that the altered block will have same hash and my alteration goes undetected? Basically, can 2 different inputs have the same Hash?

  • In practice, nope! In theory, collisions (different inputs producing the exact same hash) must exist, because there are endlessly many possible inputs but only a fixed number of possible hashes. But a major reason that SHA256 is used as the hashing algorithm is its resistance to collisions: nobody has ever found one, even though the bitcoin network produces quintillions of hashes every second. If you were to find two different inputs that produce the same output to the hashing algorithm, you would have “cracked” SHA256 (which hasn’t been done).

4. Why has no one found any collision in SHA256?

  • With SHA256, there are 2²⁵⁶ possible hashes. That is about 1.16 × 10⁷⁷, a 78-digit number that approaches common estimates of the number of atoms in the known universe. So the likelihood of two hashes being the same is infinitesimally, unimaginably small. To know more about SHA256, I would highly recommend that you follow the hyperlinks given throughout the article :)
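To get a feel for “unimaginably small”, here is a back-of-the-envelope sketch. It uses the standard birthday approximation and treats every hash as a random pick from the 2²⁵⁶ possibilities. The hash rate of 2 × 10²⁰ hashes per second is an illustrative assumption on my part, not a measured figure.

```python
TOTAL_HASHES = 2 ** 256

def collision_probability(n: int) -> float:
    """Birthday approximation for n random hashes: about n*n / (2 * 2**256).

    Only meaningful while the result is tiny, which it is here.
    """
    return n * n / (2 * TOTAL_HASHES)

# How many digits does 2**256 have?
print(len(str(TOTAL_HASHES)))  # 78

# Assumption for illustration: 2 * 10**20 hashes per second for 10 years.
hashes_per_second = 2 * 10 ** 20
seconds_in_ten_years = 10 * 365 * 24 * 3600
n = hashes_per_second * seconds_in_ten_years

p = collision_probability(n)
print(f"chance of any collision after 10 years: about {p:.1e}")
```

Even with those generous numbers, the chance stays many orders of magnitude below anything we would ever notice.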

5. Is SHA256 the ONLY hashing algorithm used in Blockchain?

  • No, but it is the most popular one. Different Blockchain projects may use different hashing algorithms. For example, Ethereum uses Keccak-256 and Dogecoin uses Scrypt.
  • It is worth noting that an ASIC (Application Specific Integrated Circuit) is designed to mine hashes of a specific type; i.e., SHA256 ASICs can only be used for mining on networks which use the SHA256 hash.

6. Since Mining is a Competitive race, does it mean that whoever has more mining power wins?

  • Yep. If I am trying to mine with a GPU, but you have an ASIC, then you have a better chance of finding that nonce than I do, because ASICs are designed solely for mining, while GPUs are multi-purpose cards for gaming, video rendering etc., and are hence not as fast as ASICs.
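“A better chance” is the key phrase: more mining power does not guarantee a win, it only tilts the odds. Here is a toy simulation of that idea, where each block is won with a probability proportional to each miner’s share of the total hash rate. The miner names and the 1 to 99 split are hypothetical numbers chosen only for illustration.

```python
import random

def simulate_race(hash_rates: dict, rounds: int, seed: int = 0) -> dict:
    """Count how many blocks each miner wins over a number of rounds.

    Each round is won with probability proportional to hash rate.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    names = list(hash_rates)
    weights = [hash_rates[name] for name in names]
    wins = dict.fromkeys(names, 0)
    for winner in rng.choices(names, weights=weights, k=rounds):
        wins[winner] += 1
    return wins

# Hypothetical split: my GPU has 1% of the power, your ASIC has 99%.
wins = simulate_race({"my GPU": 1, "your ASIC": 99}, rounds=10_000)
print(wins)
```

The GPU still wins a block now and then, roughly one time in a hundred here, which is why small miners often pool their power together.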

7. But is this fair? Rich people can just own high performance hardware in mining farms and perhaps eventually own 51% of the network and “break the blockchain” so to speak…

  • Fair question, and I too have found myself asking: is Bitcoin REALLY decentralised? BTC expert Andreas Antonopoulos believes that operational, geographic and political constraints, electricity availability, natural disasters and so on all work against centralisation. But do the forces of decentralisation and centralisation balance out? He asks us to view it in a relative sense, i.e., ask yourself this: is Bitcoin more decentralised than any other comparable payment system? The answer is yes! This is a topic of much debate, so a definite answer is still something I am searching for.

8. I mention private blockchains being used by companies — Are there other use cases to Blockchain than just tracking monetary transactions?

  • Yes! For instance, Walmart used Hyperledger Fabric to build a private blockchain network between multiple food partners, to trace food provenance. By using blockchain solutions instead of paper-based ledgers, the suppliers are obligated to add and store all the information in immutable blocks. Blockchain has a transparent nature; the data is open and accessible to every supply chain participant. Walmart has always been known to leverage technology and innovation to ensure Every Day Low Prices for its customers, and I found its case study to be especially fascinating!
  • IBM lists some good use cases of Blockchain beyond crypto

If you have made it till here, kudos! This was a slightly heavier article than the first two parts! We have successfully answered the questions we started off with!

A few open questions at this point:

  1. How are individual blocks chained together?
  2. In Part 1, we said that the Blockchain has a distributed ledger system. How does validation happen across Peers or Nodes of the network?
  3. Exactly how does a miner get their monetary reward for validating a block?

Pending Questions from previous articles:

  1. How does the first block in a blockchain get created?
  2. How do I log my transaction inside a block?
  3. What are coins/ tokens and who creates them?

We will look at the answers to these in the upcoming articles! Stay Tuned!

Huge thanks to Anders Brownworth! I got a lot of clarity because of his website. Do check it out!
