How Does a Blockchain Work? Considering a complicated technology with particular focus on hashing functions:

Ashna Shah-Grover
7 min read · Oct 15, 2019


I am two weeks away from graduating from the Software Engineering course at the Flatiron School at Access Labs, DUMBO, and this is my final blog post for the course.

A few weeks ago, my peer @jasonedlewis wrote a blog post about Bitcoin, specifically about the merits of Bitcoin as a global currency.

This week I’m going to write about the cryptographic technology behind Bitcoin. I became interested in the computer science behind it after browsing courses on Coursera and landing on a free course, hosted by Princeton University, on Bitcoin and Cryptocurrency Technologies. What immediately fascinated me was that the idea of “hashing” actually made sense, since I had already learned about it in the context of OAuth and password digests (technologies we learned and implemented in our course). The fact that it made sense immediately showed me how far I’ve come in my programming education: from zero to somewhere.

Description of the Coursera course

First let’s cover one of the reasons Bitcoin was invented.

Granted, there are multiple reasons: a decentralized distribution system, cheaper transaction costs, no regulation by governments, etc. But the main one I am going to cover in this blog is its security features.

Let’s consider normal bank transactions with normal currencies:

In this scheme, money is essentially transferred from a sender to a bank to a receiver, though it makes multiple other pit stops along the way (see the other three pit stops in the diagram).

The main problem with this scheme is that the log of transactions is tamperable: entries can easily be manipulated or changed. People who know how the banking system works are trying to avoid banks because of this problem. This is where blockchain comes in.

Now I’m going to cover what I’ve learned so far from the course on Coursera about the security features cryptocurrencies use.

The basic technical concepts I would like to cover are cryptographic hash functions, hash pointers and linked lists.

1) Cryptographic hash functions

A hash function is basically a function that takes in a string of information and turns it into a fixed-size output.

Earlier in our software engineering course we used BCrypt in order to hash passwords sent to our Ruby backend.

Bitcoin uses a hashing function called SHA-256, which turns any string of information (regardless of the size of the string) into a 256-bit hash. More information about the advantages of each hash function can be found here; essentially, different hash functions can be slow or fast and are best suited for different purposes.
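To make the "fixed-size output" idea concrete, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex and the sample messages are my own, made up for illustration; this is not how Bitcoin itself is implemented.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# A one-character input and a one-million-character input
# both produce a digest of the same size.
short_digest = sha256_hex("a")
long_digest = sha256_hex("a" * 1_000_000)
print(len(short_digest), len(long_digest))  # 64 hex characters each = 256 bits

# Changing a single character of the input gives a different digest.
print(sha256_hex("pay Alice 1 coin") == sha256_hex("pay Alice 2 coin"))  # False
```

Each hexadecimal character carries 4 bits, so 64 characters is exactly the 256 bits in the name.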

2) Hash Pointers & Linked Lists

There is one crucial difference between the transaction flow in the diagram we saw of normal bank transfers and a cryptocurrency blockchain: in a blockchain, the information in each transaction is hashed with a hashing algorithm. (Hashing is one-way, so it is not quite the same thing as encryption.)

The structure of a blockchain

In this diagram we first see blocks of data. Each block represents a transaction, i.e. an amount of cryptocurrency being passed from one entity to another. Moving forward in this explanation, let’s consider this in terms of the genesis and transaction chain of a single coin.

The next part of this architecture is the pointers in each block. All blocks except the genesis block (i.e. the block where the coin was “created” and passed to someone else for the first time) contain pointers. Each time the coin is transferred, a new block is created. The new block initially contains two pointers: a normal pointer pointing to the raw data of the previous block (i.e. the details of the transaction, who it was sent to, amount sent, date/time etc.) as well as a hash pointer which points to the hash of the information from the previous block.

Now if a hacker comes and tries to modify the information in any block (i.e. the transaction details), they will also have to change the hash pointer in the next block. Let’s say the hacker changes information in block 1. They will have to change the pointer and the hash pointer in block 2 to reflect the changes they made. However, the hash pointer in block 3 covers the contents of block 2, including the hash pointer that was just changed, so it will no longer match the new blockchain. So they will have to change that too, and so on down the chain. They can change all the hash pointers EXCEPT for the one pointing to the final block in the chain, which is kept somewhere the hacker cannot modify it. That pointer will not match the new blockchain and will indicate evidence of tampering.

This structure is called a linked list (here, one built with hash pointers), and it is what allows blockchain technology to identify attempts at tampering with transaction data.
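The tampering scenario above can be sketched in a few lines of Python. This is a toy model under my own assumptions: the Block class, the block_hash and is_untampered helpers, and the sample transactions are all hypothetical, and each block keeps only a hash pointer (the "normal" pointer is implied by the block's position in a Python list).

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    data: str                 # transaction details
    prev_hash: Optional[str]  # hash pointer to the previous block (None for genesis)


def block_hash(block: Block) -> str:
    """Hash a block's data together with its hash pointer."""
    payload = f"{block.prev_hash}|{block.data}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(transactions: List[str]) -> List[Block]:
    """Link each new block to the hash of the block before it."""
    chain: List[Block] = []
    prev_hash = None
    for data in transactions:
        block = Block(data=data, prev_hash=prev_hash)
        chain.append(block)
        prev_hash = block_hash(block)
    return chain


def is_untampered(chain: List[Block], trusted_head_hash: str) -> bool:
    """Check every hash pointer, then compare the final block to the trusted head."""
    if not chain:
        return False
    for previous, current in zip(chain, chain[1:]):
        if current.prev_hash != block_hash(previous):
            return False
    return block_hash(chain[-1]) == trusted_head_hash


chain = build_chain(["mint coin for Alice", "Alice pays Bob", "Bob pays Carol"])
trusted_head = block_hash(chain[-1])  # kept where the hacker cannot reach it
print(is_untampered(chain, trusted_head))  # True

# The hacker edits block 1, then repairs every later hash pointer.
chain[0].data = "mint coin for Mallory"
for previous, current in zip(chain, chain[1:]):
    current.prev_hash = block_hash(previous)
print(is_untampered(chain, trusted_head))  # False: the head hash no longer matches
```

Even though the hacker made every internal pointer consistent again, the hash of the final block changed, so the single trusted value is enough to expose the edit.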

So now to discuss briefly the term “bitcoin mining” and its relationship to the blockchain structure. My understanding of this topic is still shaky, and I am simply explaining my sense of what is going on.

So we have a linked list — we have hash pointers that indicate whether a chain has or has not been tampered with. But someone needs to CHECK each and every hash pointer and compare it to the previous block in order to validate the accuracy of the transaction.

The process of validating transactions must be more complicated than that, given that it is an extremely electricity-intensive practice driven by high computing power used to solve complex mathematical equations that are essentially part of the security mechanism. In fact, anyone in the world can be part of the process of validating the accuracy of Bitcoin transactions. And there is a reward for validating transactions: a sum of Bitcoin. But in order to be rewarded with such a sum, you must be the FIRST to validate the transactions. And in addition to validating them, you must also solve a complicated mathematical puzzle that has nothing to do with validating a transaction.

Here’s the catch. In order for bitcoin miners to actually earn bitcoin from verifying transactions, two things have to occur. First, they must verify 1 megabyte (MB) worth of transactions, which can theoretically be as small as 1 transaction but are more often several thousand, depending on how much data each transaction stores. This is the easy part.

Second, in order to add a block of transactions to the blockchain, miners must solve a complex computational math problem, also called a “proof of work.” What they’re actually doing is trying to come up with a 64-digit hexadecimal number, called a “hash,” that is less than or equal to the target hash. Basically, a miner’s computer spits out hashes at a rate of megahashes per second (MH/s), gigahashes per second (GH/s), or even terahashes per second (TH/s) depending on the unit, guessing all possible 64-digit numbers until they arrive at a solution. In other words, it’s a gamble.
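Here is a toy sketch of that guessing game in Python. It is heavily simplified and rests on my own assumptions: the mine function, the "block data|nonce" input format, and EASY_TARGET are invented for illustration, whereas real Bitcoin miners hash a binary block header twice with SHA-256 against a far smaller target.

```python
import hashlib
from typing import Optional, Tuple


def mine(block_data: str, target: int,
         max_attempts: int = 1_000_000) -> Optional[Tuple[int, str]]:
    """Try nonces in order until the hash, read as a number, is <= target."""
    for nonce in range(max_attempts):
        candidate = f"{block_data}|{nonce}".encode("utf-8")
        digest = hashlib.sha256(candidate).hexdigest()  # 64-digit hexadecimal number
        if int(digest, 16) <= target:
            return nonce, digest
    return None  # gave up without finding a solution


# A deliberately easy target: roughly 1 guess in 4,096 succeeds.
EASY_TARGET = 2 ** 244

result = mine("block of transactions", EASY_TARGET)
if result is not None:
    nonce, digest = result
    print(f"nonce {nonce} gives hash {digest}")
```

Lowering the target makes a winning hash rarer, which is why the real network needs so many guesses per second. Checking a claimed solution, on the other hand, takes just one hash.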

Today, bitcoin mining is so competitive that it can only be done profitably with the most up-to-date ASICs. When using desktop computers, GPUs, or older models of ASICs, the cost of energy consumption actually exceeds the revenue generated. Even with the newest unit at your disposal, one computer is rarely enough to compete with what miners call “mining pools.”

(See the Explain it Like I’m Five (ELI5) from the above article for an amazing explanation of why bitcoin mining is so difficult, computationally-expensive, and unlikely to occur — the equivalent of gold mining in the 1800s).

Can we see how the complexity of both tasks is derived from the technology being rooted in hashing algorithms?

To be completely honest, I still don’t have a clear understanding of the technologies behind Bitcoin and other cryptocurrencies. I feel I have barely scraped the surface of a solid understanding — the details remain fuzzy. I feel I am throwing terms and concepts around without fully understanding them, despite having spent a respectable amount of time researching and writing this blog (as well as having watched the first two hours of videos for the Coursera course).

Despite terms like “blockchain”, “bitcoin mining” and “cryptocurrency” being hot and trending terms, I think the details of the technology behind these innovations remain esoteric subjects to the majority of the world. And rightfully so — these are complicated, abstract technologies. Three months ago, without having 15 weeks of software engineering bootcamp under my belt, I think I would understand 1/3 of what I understand about these terms right now. Deep understanding of the mechanisms at play — in my opinion — requires some computer science/programming experience.

I look forward to completing my course on Coursera to have a firmer understanding of this economic trend.
