Explaining blockchain can be challenging, even for an expert in the field. The difficulty lies less in the definition itself than in the prerequisites needed to understand the technology. For instance, you cannot explain blockchain to someone who doesn’t understand the need for database technology, because you cannot explain a solution to someone who does not know what the problem is. So, blockchain needs to be explained differently to different people, based on how much they know about the components that make up a blockchain.
Here is a video by Wired explaining blockchain technology in five different difficulty levels.
We now know that even though blockchain is a complex technology, it can be explained to anyone, as long as the approach is adapted to the audience. In this article, we target only novice programmers, so that a programmer with a basic understanding of a few technologies can grasp the fundamentals of blockchain.
Although blockchain is a very popular technology, there is no single definition that clearly explains it. A blockchain can simply be described as a data structure of blocks that are linked together using cryptography (cryptographic hashing in particular) to form a collection of records, known as a ledger. These interlinked blocks of data, which form a digital ledger, protect the integrity of the data stored in the blockchain: once a block is recorded, its contents cannot be altered without the change being detectable.
Even though we have already mentioned that blockchain doesn’t have a standard definition, we have picked a couple of definitions that throw some light on this technology.
“A blockchain is a specific form or subset of distributed ledger technologies, which constructs a chronological chain of blocks, hence the name ‘block-chain.’”
- Antony Lewis, the Director of Research at R3
“The blockchain data structure is an ordered, back-linked list of blocks.”
- Andreas Antonopoulos, a popular Bitcoin evangelist
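To make the phrase “back-linked list” concrete, here is a minimal sketch of following back-links from the newest block to the oldest. It is only an illustration: the `blocks` dictionary, the `walk_back` helper, and the use of Python's standard `hashlib` module are assumptions made for this example and are not taken from either definition.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of a string as hexadecimal text."""
    return hashlib.sha256(text.encode()).hexdigest()


# Hypothetical store: every block is kept under its own hash.
blocks = {}
previous_hash = None  # the first block has no predecessor
for data in ["first", "second", "third"]:
    block_hash = sha256_hex(str(previous_hash) + data)
    blocks[block_hash] = {"data": data, "previous_hash": previous_hash}
    previous_hash = block_hash

tip = previous_hash  # hash of the most recently added block


def walk_back(blocks, tip):
    """Follow previous_hash links from the tip back to the first block."""
    order = []
    current = tip
    while current is not None:
        block = blocks[current]
        order.append(block["data"])
        current = block["previous_hash"]
    return order


history = walk_back(blocks, tip)
print(history)  # ['third', 'second', 'first']
```

Each block only knows its predecessor, so the whole history can be recovered from the hash of the latest block alone.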
Now that we have described blockchain at a fairly high level, let’s look at the technical implementation, which will interest a programmer. In the original implementation of Bitcoin, Satoshi Nakamoto proposed building a continuous record by timestamping each block, hashing it, and linking it to the previous block with a hash reference, as shown in the figure below.
A block in Bitcoin or any other blockchain-based platform is a collection of events called transactions.
Timestamping the blocks with the help of cryptographic hashing ensures the integrity of the entire blockchain: the data in a block cannot be modified without the modification being easily detected by anyone in the network.
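As a rough illustration of a block's payload being a collection of transactions, the sketch below hashes a list of transactions as a single unit. The field names (`sender`, `recipient`, `amount`), the `hash_transactions` helper, and the JSON encoding are all assumptions chosen for this example; real platforms such as Bitcoin summarize transactions with a more elaborate scheme.

```python
import hashlib
import json


def hash_transactions(transactions):
    """Hash a list of transaction dicts through a canonical JSON encoding."""
    # sort_keys makes the encoding independent of dictionary key order.
    payload = json.dumps(transactions, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode()).hexdigest()


# Hypothetical transactions, for illustration only.
transactions = [
    {"sender": "alice", "recipient": "bob", "amount": 5},
    {"sender": "bob", "recipient": "carol", "amount": 2},
]
digest = hash_transactions(transactions)

# Same transactions, keys written in a different order: same digest.
reordered_keys = [
    {"amount": 5, "recipient": "bob", "sender": "alice"},
    {"amount": 2, "sender": "bob", "recipient": "carol"},
]
same_digest = hash_transactions(reordered_keys) == digest

# Same transactions in a different sequence: different digest.
different_digest = hash_transactions(list(reversed(transactions))) != digest

print(same_digest, different_digest)  # True True
```

The order of the transactions is part of what gets hashed, so reordering them counts as a change to the block.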
It’s time to dive into how blocks are actually linked in a blockchain to create a tamper-evident record. Although a blockchain can be simulated in any programming language, for the sake of simplicity we are going to use Python to demonstrate a basic one.
Again, for the sake of simplicity, we will create a block with a basic structure. Let’s assume that a block contains a string of data along with a few other fields: index, timestamp, hash, and previous hash value. The block will look something like this:
class Block:
    """A class representing the block for the blockchain"""

    def __init__(self, index, previous_hash, timestamp, data, hash):
        self.index = index
        self.previous_hash = previous_hash
        self.timestamp = timestamp
        self.data = data
        self.hash = hash
Here, the block structure is defined using a Python class named Block. The block structure in an actual blockchain implementation stores information about the transactions in the block. In this example, we store just a string instead of actual transactions. The hash value of the block can be computed using a cryptographic hash function such as SHA-2 or SHA-3 (older functions such as MD5 are no longer considered secure and are best avoided). We will use a particular variant of the SHA-2 hash function called SHA-256. The PyCrypto package provides an implementation of this hash function; PyCrypto is no longer maintained, but its successor PyCryptodome exposes the same Crypto.Hash module:
from datetime import datetime  # used later to timestamp new blocks
from Crypto.Hash import SHA256
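If installing a third-party package is not an option, Python's standard `hashlib` module also implements SHA-256. The sketch below is not part of the original implementation; the `sha256_hex` helper is a hypothetical name used here only to show the digest format and how sensitive it is to small changes in the input.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode()).hexdigest()


digest = sha256_hex("abc")
print(digest)
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

# Changing a single character yields an unrelated digest.
changed = sha256_hex("abd")
print(digest == changed)  # False
```

The digest always has the same length regardless of the input size, which is why it works well as a compact reference to a block.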
Now that we have defined a block structure, we will define how these blocks are linked to form a blockchain. For that, we define a class called Blockchain, which implements a few functions such as calculating a hash and creating a new block:
class Blockchain:
    """A class representing list of blocks"""
    def __init__(self):
        self._chain = [self.get_genesis_block()]
When a blockchain is instantiated, the first block needs to be appended to the chain. The first block of the chain is called the genesis block, and it is often hardcoded. The genesis block is created using the Block structure defined earlier:
    def get_genesis_block(self):
        """creates first block of the chain"""
        return Block(0, "0", 1465154705, "my genesis block!!", "816534932c2b7154836da6afc367695e6337db8a921823784c14378abed4f7d7")
The genesis block is created with the hash value “816534932c2b7154836da6afc367695e6337db8a921823784c14378abed4f7d7”, which is computed using the SHA-256 hash function:
SHA256.new(data=(str(0) + "0"+ str(1465154705) +"my genesis block!!").encode()).hexdigest()
Once the blockchain is instantiated, new blocks can be added to the chain. Each newly added block is chained to the blocks already in the chain:
    def add_block(self, data):
        """appends a new block to the blockchain"""
        self._chain.append(self.create_block(data))
The main ingredient that binds the newly created block to the existing blockchain is its reference to the last block, made through the previous block’s hash:
    def create_block(self, block_data):
        """creates a new block with the given block data"""
        previous_block = self.get_latest_block()
        next_index = previous_block.index + 1
        next_timestamp = int(datetime.now().timestamp())
        next_hash = self.calculate_hash(next_index, previous_block.hash, next_timestamp, block_data)
        return Block(next_index, previous_block.hash, next_timestamp, block_data, next_hash)

    def get_latest_block(self):
        """returns the most recently added block (helper used by create_block)"""
        return self._chain[-1]
The hash value of each new block is computed over the block’s details together with the hash value of the previous block:
    def calculate_hash(self, index, previous_hash, timestamp, data):
        """calculates SHA256 hash value"""
        hash_object = SHA256.new(data=(str(index) + previous_hash + str(timestamp) + data).encode())
        return hash_object.hexdigest()
A link is formed by including the hash value of the previous block, which was computed earlier. This link makes sure that none of the previous blocks can be altered without detection: a simple hash verification reveals any modification to the previous blocks.
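The hash verification mentioned above could look something like the following sketch. It is an assumption about how such a check might be written, not code from the original implementation: `is_chain_valid` and `make_block` are hypothetical helpers, the timestamps are made up, and the standard `hashlib` module stands in for Crypto.Hash so the example runs without extra packages.

```python
import hashlib
from collections import namedtuple

# Same fields as the article's Block class.
Block = namedtuple("Block", "index previous_hash timestamp data hash")


def calculate_hash(index, previous_hash, timestamp, data):
    """Hash the same concatenated fields as the article's calculate_hash."""
    payload = str(index) + previous_hash + str(timestamp) + data
    return hashlib.sha256(payload.encode()).hexdigest()


def make_block(previous_block, timestamp, data):
    """Build a block that points at previous_block (hypothetical helper)."""
    index = previous_block.index + 1
    block_hash = calculate_hash(index, previous_block.hash, timestamp, data)
    return Block(index, previous_block.hash, timestamp, data, block_hash)


def is_chain_valid(chain):
    """Check each block's own hash and its link to the preceding block."""
    for position, block in enumerate(chain):
        expected = calculate_hash(
            block.index, block.previous_hash, block.timestamp, block.data
        )
        if block.hash != expected:
            return False  # the block's contents no longer match its hash
        if position > 0 and block.previous_hash != chain[position - 1].hash:
            return False  # the back-link does not match the preceding block
    return True


genesis_hash = calculate_hash(0, "0", 1465154705, "my genesis block!!")
genesis = Block(0, "0", 1465154705, "my genesis block!!", genesis_hash)
second = make_block(genesis, 1465154800, "test block")
third = make_block(second, 1465154900, "second block data")
chain = [genesis, second, third]
valid_before = is_chain_valid(chain)

# Case 1: change the data but keep the old stored hash.
chain[1] = second._replace(data="forged data")
valid_after_edit = is_chain_valid(chain)

# Case 2: also recompute the forged block's hash; the next block's
# previous_hash still refers to the original hash, so the link breaks.
chain[1] = make_block(genesis, 1465154800, "forged data")
valid_after_rehash = is_chain_valid(chain)

print(valid_before, valid_after_edit, valid_after_rehash)  # True False False
```

The second case shows why the link matters: hiding a change to one block would require recomputing every block that follows it.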
This is how a simple blockchain data structure is implemented. We can add a few more functions to this implementation alongside the methods mentioned above. Here is the complete code of this implementation:
We can create a new blockchain and add blocks with the help of the above implementation by instantiating a Blockchain object:
new_chain = Blockchain()
new_chain.add_block(data="test block")
new_chain.add_block(data="second block data")
The blocks generated by the above calls will look something like this:
"data": "my genesis block!!",
"data": "test block",
"data": "second block data",
As we can see, we have three blocks, including the genesis block, and each block has the attributes defined in the Block class. The important observation here is how the previous_hash of block 2 points to the hash value of block 1 and, similarly, the previous_hash of block 3 points to the hash value of block 2.
This is how a basic blockchain data structure can be explained to a programmer. We have explained how a blockchain data structure is created so that it maintains integrity in a decentralized network. Although this structure lets the integrity of the blockchain be verified in a trustless p2p network, it does not by itself ensure the immutability of the blockchain data; that is, it does not make everyone in the network agree on a single blockchain state. That is achieved with a whole other concept called a consensus mechanism, which we will dive into in the next part of this article.
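One way to print a chain in this form and check the links programmatically is sketched below. The `block_as_dict` helper, the fixed 60-second gap between timestamps, and the use of the standard `hashlib` and `json` modules are assumptions made for this example, so the printed hashes and timestamps will differ from those of the original run.

```python
import hashlib
import json


def calculate_hash(index, previous_hash, timestamp, data):
    """Hash the concatenated block fields, as in the article."""
    payload = str(index) + previous_hash + str(timestamp) + data
    return hashlib.sha256(payload.encode()).hexdigest()


def block_as_dict(index, previous_hash, timestamp, data):
    """Build a block as a plain dictionary so it can be printed as JSON."""
    return {
        "index": index,
        "previous_hash": previous_hash,
        "timestamp": timestamp,
        "data": data,
        "hash": calculate_hash(index, previous_hash, timestamp, data),
    }


chain = [block_as_dict(0, "0", 1465154705, "my genesis block!!")]
for data in ["test block", "second block data"]:
    last = chain[-1]
    # Assumed timestamps: each block is created 60 seconds after the last.
    chain.append(
        block_as_dict(last["index"] + 1, last["hash"], last["timestamp"] + 60, data)
    )

print(json.dumps(chain, indent=2))

# Every block after the first must reference the hash of its predecessor.
links_ok = all(
    chain[i]["previous_hash"] == chain[i - 1]["hash"] for i in range(1, len(chain))
)
print(links_ok)  # True
```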