The Bitcoin Blockchain
I’m sure you’ve heard the word ‘Blockchain’ before; it’s one of the top buzzwords nowadays. In this post we’re going to explain what the Bitcoin Blockchain is and how it works internally.
The Bitcoin Blockchain is a data structure that stores transactions in a series of back-linked blocks. Every block contains a list of transactions, and each block is linked to its ‘parent’ block. The blockchain can be stored in a flat file or in a simple database. A block consists of a block header and the list of transactions.
The block header has a unique identifier called the block header hash, and inside the block header we can find three main components:
- Previous Block Hash
- Timestamp, Difficulty and Nonce (mining information discussed in another post)
- Merkle Tree Root
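To make the layout above concrete, here is a minimal sketch of a block as plain Python data classes. The class names, field names and types are assumptions chosen for illustration. The real Bitcoin header also carries a version field and uses a fixed binary encoding, which this sketch leaves out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BlockHeader:
    # Field names are illustrative, not Bitcoin's wire format.
    previous_block_hash: bytes  # hash of the parent block's header
    timestamp: int              # mining information
    difficulty: int             # mining information
    nonce: int                  # mining information
    merkle_root: bytes          # summary of all transactions in the block


@dataclass
class Block:
    header: BlockHeader
    transactions: List[bytes] = field(default_factory=list)  # raw transaction data


# A placeholder block: every value here is made up for the example.
example_block = Block(
    header=BlockHeader(
        previous_block_hash=bytes(32),
        timestamp=0,
        difficulty=1,
        nonce=0,
        merkle_root=bytes(32),
    ),
    transactions=[b"example transaction"],
)
print(len(example_block.transactions))
```

Note that the block header hash is not a field of the header, since it is computed from the header rather than stored in it.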
Block Header Hash
Each block is uniquely and unambiguously identified by a hash. This hash is obtained by hashing the block header twice with the SHA256 algorithm. The header hash is not stored in the block structure; instead, each node calculates it when the block is received from the network. It might be stored in a separate metadata database for indexing and faster retrieval.
Another way to identify a block is its ‘Block Height’. Each block is ‘piled’ on top of the previous one, adding 1 to the height, so starting from block #0 (the Genesis block) each new block increases the total height. The last recorded block height as of February 2018 is 509415. Block height is a secondary form of identification, and it is not always a unique identifier: two or more blocks may compete for the same height at a given moment. Eventually this situation (a fork) is resolved and only one block remains at that height.
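As a rough illustration of the double hashing, the sketch below hashes the 80-byte header of the Genesis block with Python's standard `hashlib`. The function names are made up for this example. The header bytes are the publicly known Genesis values in their serialized order, including a version field that this post does not cover. The result is byte-reversed at the end because that is how block hashes are conventionally displayed.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA256 twice, as Bitcoin does for block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def block_header_hash(header: bytes) -> str:
    """Return the header hash as a hex string in display (reversed) byte order."""
    return double_sha256(header)[::-1].hex()


# Serialized header of the Genesis block (block #0), field by field.
GENESIS_HEADER = bytes.fromhex(
    "01000000"    # version
    + "00" * 32   # previous block hash: all zeros, the Genesis block has no parent
    + "3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a"  # Merkle root
    + "29ab5f49"  # timestamp
    + "ffff001d"  # difficulty (compact form)
    + "1dac2b7c"  # nonce
)

print(block_header_hash(GENESIS_HEADER))
```

Changing a single byte of the header, for example the nonce, produces a completely different hash, which is why the hash works as an identifier.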
Previous Block Hash
Each block has a previous block hash that identifies its parent block. A block can have only one parent, but when blocks are discovered almost simultaneously, one block may temporarily have more than one child. Eventually the situation (a fork) gets resolved and only one child per block remains.
The previous block hash is a very important part of the block, because it is hashed together with other information to produce the current block hash. This means that the ‘identity’ of the parent is embedded in the child’s ‘identity’. Because of this, any change to the parent’s hash changes the hash of the child, the grandchild and so on. Once a block has many generations following it, it can’t be changed without forcing a recalculation of all subsequent blocks. This is the foundation of the blockchain’s immutability: rewriting a deeply buried block would require an immense amount of computational power, almost ‘impossible’ to obtain. In another post we’ll analyze this ‘almost impossible’ power and the ways it could be broken, with a 51% attack or with quantum computing power.
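The cascading effect can be seen in a toy chain. The sketch below is a simplification, not Bitcoin's real block format: each ‘block’ is just a payload, and its hash is the double SHA256 of the parent hash followed by that payload. The helper names are hypothetical.

```python
import hashlib
from typing import List


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def build_chain(payloads: List[bytes]) -> List[bytes]:
    """Return the hash of every block; each hash covers the parent's hash."""
    hashes = []
    previous_hash = bytes(32)  # the first block has no parent
    for payload in payloads:
        previous_hash = double_sha256(previous_hash + payload)
        hashes.append(previous_hash)
    return hashes


original = build_chain([b"block 0", b"block 1", b"block 2", b"block 3"])
tampered = build_chain([b"block 0", b"block 1 (edited)", b"block 2", b"block 3"])

# Indexes of the blocks whose hash changed after editing block 1.
changed = [i for i, (a, b) in enumerate(zip(original, tampered)) if a != b]
print(changed)  # [1, 2, 3]
```

Block 0 keeps its hash, while the edited block and every descendant get a new one, even though the descendants' own payloads were not touched.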
Merkle Tree Root
A Merkle Tree is a binary tree of cryptographic hashes. The term ‘tree’ is used in computer science to describe a branching data structure. Each block in the Bitcoin Blockchain contains a summary of all the transactions in the block, built using a Merkle tree.
The Merkle tree in Bitcoin is constructed by recursively hashing pairs of nodes until there’s only one hash left: the Merkle Root. Note that ‘nodes’ here means two different things: at the leaf level we are hashing transactions, and at every parent level we are hashing the hashes of the level below. Don’t worry, look at the next graphic and it’ll help you to understand:
Each leaf is a double SHA256 hash of the transaction data. The final result (the Merkle Root) is stored in the block header. As the Merkle Tree is a binary tree, each level needs an even number of nodes. If a level has an odd number, we duplicate its last hash and voilà! An even number of nodes!
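The construction described above can be sketched in a few lines. This is an illustrative version that works on raw byte strings standing in for transactions. The function names are assumptions, and details of the real protocol such as transaction serialization and byte order are ignored.

```python
import hashlib
from typing import List


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions: List[bytes]) -> bytes:
    """Hash pairs of nodes level by level until a single hash remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    # Leaf level: double SHA256 of each transaction.
    level = [double_sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: duplicate the last hash
        # Parent level: hash each pair of concatenated child hashes.
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


print(merkle_root([b"tx A", b"tx B", b"tx C"]).hex())
```

With three transactions, the third hash is paired with a copy of itself, so the root is the hash of `H(A)+H(B)` hashed together with the hash of `H(C)+H(C)`.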
This is a very efficient data structure for checking whether a value is included, in this case whether a transaction is in the tree. By providing a small set of hashes, a node can prove that a transaction hashes up to the Merkle root in the header and is therefore included in the tree (the list of transactions). This set of hashes is called a ‘Merkle Authentication Path’, and SPV nodes use it to validate that a block contains a transaction. The SPV node, as we saw in another post, stores the block header, which contains the Merkle Root. It then asks Full Nodes for the set of hashes that make up the ‘authentication path’, and with them it checks that a given transaction is inside the Merkle Tree that produces the known Merkle Root. The path holds one hash per level of the tree, so its length grows with the base-2 logarithm of the number of transactions: Bitcoin nodes can produce paths of 10–12 hashes to prove that a transaction is in a tree with more than a thousand transactions.
Those are the basics of the Bitcoin blockchain. Now you should have a better understanding of what we mean when we say Blockchain.
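Here is a minimal sketch of how such a path could be produced and checked. It is not the message format SPV nodes actually exchange: the function names are hypothetical, transactions are plain byte strings, and each path entry is a pair of the sibling hash and a flag saying whether that sibling sits on the right.

```python
import hashlib
from typing import List, Tuple

Path = List[Tuple[bytes, bool]]  # (sibling hash, sibling is on the right)


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_and_path(transactions: List[bytes], index: int) -> Tuple[bytes, Path]:
    """Full node side: build the tree and collect the siblings of one leaf."""
    level = [double_sha256(tx) for tx in transactions]
    path: Path = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: duplicate the last hash
        sibling = index ^ 1  # the other node of the pair
        path.append((level[sibling], sibling > index))
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
        index //= 2  # position of the parent in the next level
    return level[0], path


def verify(transaction: bytes, path: Path, root: bytes) -> bool:
    """SPV side: rehash up the tree using only the transaction and the path."""
    current = double_sha256(transaction)
    for sibling, sibling_is_right in path:
        if sibling_is_right:
            current = double_sha256(current + sibling)
        else:
            current = double_sha256(sibling + current)
    return current == root


txs = [f"tx {i}".encode() for i in range(1500)]
root, path = merkle_root_and_path(txs, 1234)
print(len(path))                      # 11
print(verify(txs[1234], path, root))  # True
```

With 1500 transactions the verifier needs only 11 hashes of 32 bytes each, plus the root it already has in the block header, instead of all 1500 transactions.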
See you next time!