Decoding the enigma of Bitcoin Mining — Part I : Mechanism
‘Bitcoin miners’ is somewhat a misleading term. The miners are actually ‘book-keepers’ and ‘validators’ of the network. It is called as Mining because the algorithm somewhat approximates the declining supply of gold and the miner wins an award (which are the new bitcoins created) for their effort.
Functions of Bitcoin mining
Bitcoin miners run a software program (which is the Bitcoin client) on their host machine. In the very initial days it could be done even from a laptop but nowadays you need expensive and dedicated machines worth thousands of dollars and very high processing power.
Functions of mining —
- Book-keeping: The bitcoin client downloads and syncs in real time the entire blockchain of the bitcoin network. Hence the miners are called as book-keepers as the blockchain has list of every transaction processed by the network.
- Network guardians: Miners also safeguard the network against hacks and validate each transaction. Bitcoin mining gives a probabilistic solution to Byzantine’s General problem with the underlying assumption that at least 51% of the miners are honest.
- Settlement and clearing: The bitcoin network works as a settlement and clearing house for all the transactions without depending on any 3rd party service.
- Creation of new bitcoins: As discussed in the Monetary policy of Bitcoin, 12.5 Bitcoins are created out of thin air every 10 minutes. This is the incentive for contributing processing power and keeping the network safe. It is important to understand that the primary function of mining is not for the reward, but rather keeping the network safe and executing transactions smoothly.
Bitcoin Block Construction
I have already posted the structure of a Bitcoin block and individual elements in it. Lets find out how the block is constructed.
- Add high priority transactions: Every transaction that is transmitted in the Bitcoin network is sent to as many nodes as possible. The nodes validate each transaction and add it to a transaction pool. Each transaction is then prioritized depending on the size of the transaction(in kb), its age (the amount of time a transaction has not been picked up by any block) and the value of the input (in simple terms it is the amount of bitcoins as input). The first 50 kb (out of a total 1 MB) are reserved for high priority transactions. This ensures that eventually every transaction is picked up by the network and is part of a block. Transactions do not have a time-out and hence eventually the transaction is ‘old-enough’ that it becomes part of a valid block.
- Generate Coinbase transaction: The very first transaction is a special type of transaction called as a coinbase transaction. Generally, the input of a transaction is the output of parent transaction. However, this is a special case and it contains an arbitrary input . The output is directed to the address of the miner and contains her reward ( i.e the incentive which is 12.5 bitcoins as of Dec 2016). These 12.5 bitcoins are generated out of thin air. It also has total of all transaction fees in that block. Consider any random block — Let say #443333. The first transaction has no inputs and output is 13.15848227 BTC which is the reward. Here 0.65848227 is the amount from the transaction fees and 12.5 are the new bitcoins created. It is awarded to miner with address 1EZBqbJSHFKSkVPNKzc5v26HA6nAHiTXq6. A fun fact is that for the very first block, Satoshi Nakamoto added the the arbitrary input as “ The Times 03/Jan/2009 Chancellor on brink of second bailout for banks” which was the headline of The Times newspaper and talks about the failing financial system.
I have already discussed all the other fields of a Bitcoin block in my previous post, except for Timestamp, Difficulty Target and Nonce . Let us see these 3 fields now.
- Timestamp: This is a 4-byte timestamp, encoded as a Unix ‘Epoch’ timestamp which is based on the number of seconds elapsed from January 1, 1970, midnight UTC/GMT. A timestamp is accepted as valid if it is greater than the median timestamp of previous 11 blocks, and less than the network-adjusted time + 2 hours. “Network-adjusted time” is the median of the timestamps returned by all nodes connected to you. As a result, block timestamps are not exactly accurate, and they do not even need to be in order. Block times are accurate only to within an hour or two. An added feature of this field is to make it more difficult to hash the block and hence more difficult to hack it.
- Difficulty Bits: A numeric representation of the current difficulty in obtaining the expected result.
- Nonce: This is like a counter for each 10 minute iteration (individual block creation). It starts at an initialized number like 0 and is incremental in nature till either the puzzle is solved or else some other node solves it.
We will continue the discussion on Difficulty Bits and Nonce in the next sections.
Explaining Proof of Work with a simple example
This is the algorithm implemented by Bitcoin protocol. It further utilizes cryptographic algorithms like SHA-256. You can revisit it here since the understanding of SHA-256 is very critical for understanding Proof of Work.
Let us take an example that we want to find out the hash value such that it starts with one ‘0’ and our data is the string ‘Make America Great Again’. We apply an incremental approach of adding nonce by the value 1. We start by initializing the nonce to 1.
The SHA-256 ( nonce + input string) = SHA-256(1Make America Great Again) does not start with zero. So we increment the nonce by 1. Finally on the 20th attempt we find the hash value such that it starts with zero. The SHA-256 input is ‘20Make America Great Again’. The number of zeroes that the output has to start with is known as the ‘Difficulty’. Now the goal is readjusted such that it is to find hash starting with 2 leading zeros.
This is achieved when nonce is 120 (nonce being exactly 10 times is purely coincidental). So just by adding one unit of difficulty (i.e one additional zero) the time taken is increased exponentially. Now, if our goal is to have 4 leading zeros then the first such occurrence is only when nonce = 69817. This is called as the Proof of Work. Once the solution is found out then the node immediately sends the information (data + nonce) to all nodes and each can immediately verify and in a distributed consensus agree that it was the winning node and hence award it with the newly created Bitcoins.
Proof of Work — Actual Steps
- Collect transactions from the transaction pool and build a complete block such that its size does not exceed 1 MB.
- Calculate the hash by applying SHA-256 twice to the Block header (Version + Previous Block Hash + Merkle Root + Timestamp + Difficulty Bits + Nonce )
- Compare the result of Step # 2 with the expected number of zeros. If not matched then increment the nonce by 1 and go back to Step # 1. Technically speaking the hash value is compared with a target. The target is a very large number and known to every bitcoin client. For the block to be accepted, the hash value has to be less than the target.
- Keep comparing the result such that the hash is less than target i.e the hash has the expected number of leading zeros.
- Once miner finds a winning block then send it to all participating nodes and they can calculate the result for themselves. Once all agree then the node which calculated the winning block is rewarded with newly created Bitcoins. The winning block is first checked by each node individually for a long checklist of items. So, if one miner is adding, say 10000 bitcoins in coinbase transaction, it is immediately rejected by all.
- As time progresses, more high computational nodes join (or may even drop out of) the network. Hence, the puzzle can be solved much faster and block creation time is reduced. Remember, that the block creation time is set to 10 minutes and this can never change. So after a fixed time of approximately 2 weeks or exactly 2016 blocks the difficulty is re-adjusted. Increase in difficulty means target decreases. Ever since the difficulty has been increasing The current difficulty of the Bitcoin network can be found here. The hash of the genesis block (Block height = 0) has 10 leading zeros, the block with height = 32000 has 9 and it further reduces to 8 leading zeros for block with height = 32256. However these were the initial days with very few users. Since last 3 years, the difficulty has been increasing constantly and dip in difficulty is very minor. The current difficulty is between 17 to 19 zeros. It takes a huge computing power to calculate the nonce.
- If nodes receive 2 blocks at the same time then the one for which more computation power was used (i.e had higher difficulty) is selected.
Proof of Work — Dice Analogy
Consider the example of a game with 2 dice where the goal is to throw dice each time such that their addition (i.e hash value) is always less than an expected number (i.e target). The maximum possible number is 12 (both dice have 6) and the game starts with 12 as the target. If at least 1 dice has value other than 6 then the total will always be less than 6. Winning is fairly easy as difficulty is low. Throwing of dice and obtaining the expected result (hash value < target) which is verified by all players playing the game is called as Proof of Work.
Now the target is reduced to 11 and difficulty slowly increases as 4 outcomes (6,6), (6,5), (5,6) and (6,6) are rejected. As target decreases, the difficulty increases. Eventually the target is the minimum possible number of 2 and the difficulty is highest (1/18 chance). Conceptually, bitcoin proof of work is very similar.
Bitcoin Mining real-world example
We will consider a sample hashing algorithm somewhat similar to the Bitcoin protocol. I have already posted in depth on the protocol so for this example we will reduce complexity by reducing many aspects of the actual algorithm (Eg: Double hashing with SHA-256, merkle trees etc). Algorithm is such that —
- Hash is calculated on fields — Block #, Nonce, Coinbase, Transaction List, Previous Block Hash.
- Hash is calculated by concatenating all fields together and then applying SHA-256 once. This is explained in the figure for each header in Node A
- Difficulty is 4 leading zeros.
Bitcoin mining steps —
- The very first block is called as genesis block and its contents are always hardcoded. The previous hash is a 32 byte field of just zeros. The coinbase transaction i.e the one in which miner gets rewarded is always the first transaction. For genesis block, the transaction list is empty. All values are static ie. they dont change once block is created, except for the nonce. It is initialized at 0 and SHA-256 hash algorithm is applied on all the fields concatenated together. If the result does not have 4 leading zeros then the block is discarded, nonce incremented by 1 and process continues. At nonce = 97713 finally we get a valid block. You can verify the result yourself at this website.
- In real example usually the first few transactions will have just coinbase transaction and the transaction list empty, but for this example we will assume that miners started exchanging bitcoins immediately after genesis block creation.
- Block # 2 is based on same principles as Block # 1 except that the previous block stores the hash of Block # 1.
- Similarly Block # 3 is appended to Block # 2. In Block # 3 the winning miner was ‘Dewey’ and not ‘Satoshi’
- The same copy of blockchain is maintained by all nodes and all are in sync. If there is attempt to corrupt data of any field of any block for one copy then immediately it is detected.
Hope this complex concept is now clear to all of you. In the next post we will see how Bitcoin protocol safeguards itself from hacks and prevents double spending as well as gives a probabilistic solution to Byzantine’s General Problem.
If you are an enthusiast in Bitcoin, Distributed Ledgers/ Blockchain and similar crypto-technologies then do join my Whatsapp group, where we as a community, will answer all your questions. Join by clicking here
I will definitely reply to your comments and clarifications. Do share your thoughts :)
Header stock images from Pixabay, chart is from blockchain.info and all the other images have been designed by myself ©