The Mathematics of Bitcoin — SHA256

Part 4 of a series looking behind the scenes of the worlds most popular cryptocurrency

Toby Chitty
May 25, 2020 · 9 min read
Image for post
Image for post
Photo by Philipp Katzenberger on Unsplash

Disclaimer: If you haven’t already read the first three parts, click here for parts one and two and click here for part three.

4 — SHA-256

A cryptographic hash function is a special type of function that takes an input string of a given length and converts it into an alphanumeric string of fixed length. In the case of Bitcoin, a “Message” is inputted, and a hash function, known as SHA-256 (Secure Hashing Algorithm 256), gives an output known as a “Hash” or “Message Digest”. This means that however long the string of data (limit of 2²⁵⁶- 1 bits), the output will always be 256-bits in length. The process of hashing is not a method of encryption as it is only a one- way process and therefore cannot be reversed (decrypted). By running multiple outputs through SHA-256, we can see how different the output becomes, even when only changing a single character in the message. We can also see that despite having an input of longer length, the output length is the exact same (table 5.1). SHA-256 is also deterministic, meaning given the same input, the output will always remain the same.

Image for post
Image for post
Examples of SHA-256 outputs

To demonstrate how SHA-256 computes a message digest, I will be using the phrase ‘portsmouth’ (my old university), showing each step of the algorithm.

4.1 — Preprocessing Phase

To begin the hashing process, we take our message and convert it into binary using the ASCII (American Standard Code for Information Interchange) as shown below.

Image for post
Image for post
Character to Binary

As we can see each letter is 8-bits long. We then make one long string from these and a ‘1’ is then added to the end:

011100000110111101110010011101000111001101101101011011110111010101110100011010001

We then find the phrase size which in this case is 80-bits (10 letters, each 8-bits long). The phrase size is then also converted to binary:

80 = 01010000

Now we take the phrase size and add it to the end of our message. In between these, we pad the code with zeros to get the block to the correct size (512-bits). If the message is larger than 512 bits, more blocks are created, and we run these through the process independently with the phrase size at the end of the last block.

01110000011011110111001001110100011100110110110101101111011101010111010001101000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001010000

The block is then divided into 16 words of 32-bits each:

0. 01110000011011110111001001110100
1. 01110011011011010110111101110101
2. 01110100011010001000000000000000
.
.
.
15.00000000000000000000000001010000

We call this first block: block 1 = 𝑀(1).

When referring to a specific word in a certain block we use 𝑀(𝑖)(b) where b is the block number and 𝑖 refers to the specific word in the block.

For example:
𝑀(0)(3) = The 1st word in the 3rd block
where (𝑖 = 0, 1, 2,…, 15)

4.2 — Initial Hash Values

Prior to computing the remaining blocks, we must define the initial hash value (𝐻⁰). The initial hash value consists of 8 words, each 32-bits in length. For SHA-256 these are calculated from the first 8 primes. These always remain the same for any message. The primes are firstly square rooted and then taken to the modulus 1. The result is then multiplied by 16⁸ and rounded down to the nearest integer.

Image for post
Image for post
Initial Hash Values (p = 2, 3, 5, 7, 11, 13, 17, 19)

We convert the results from above into hexadecimal form; giving us the following 8 words assigned from a-h:

Image for post
Image for post
Where i = the subscript number and b = the superscript number

4.3 — Hash Computation Phase

Up until now, all the working has been part of the preprocessing phase. From now, we will look at the hash computation phase. In this phase, we will be using data from the preprocessing phase to compute an additional 48 words, totalling the number of words to 64. This then gives us 4 blocks in total. As calculating 48 more words would be too long, I will demonstrate the computation by computing the 17th word (𝑊(16) ). This is computed using the following equation:

Image for post
Image for post
Next Word Equation

Where for the 17th word (Just the t value changes for each word):

Image for post
Image for post

We start by finding the word 15 places back (𝑊 (1)). We then make two copies of this. Firstly, we take the first copy and right rotate by 7 places (𝑅𝑂𝑇𝑅⁷(𝑥)). This means each digit moves to the right one place 7 times and when the numbers fall off the end, they are moved to the front:

𝑾(𝟏) = 01110011011011010110111101110101

Image for post
Image for post
Right rotation 7 places using first copy

With the second copy, we right rotate by 18 ( 𝑅𝑂𝑇𝑅¹⁸(𝑥)):

Image for post
Image for post
Right rotation 18 places using second copy

Lastly, we take the original and right shift it by 3 (𝑆𝐻𝑅³(𝑥)). This means when a number falls off the end it is replaced by zeros at the beginning of the word:

Image for post
Image for post
Right shift 3 places using the original

We then take these 3 words and apply:

Image for post
Image for post
Rotation and Shift XOR for sigma 0

Where:

Image for post
Image for post

The bitwise exclusive-OR operation (XOR ⨁) takes two binary digits and returns 0 if both digits are 0 or 1 and returns 1 otherwise. As we have 3 words, we first use the XOR operation on 𝑅𝑂𝑇𝑅⁷(𝑥) and 𝑅𝑂𝑇𝑅¹⁸(𝑥) and then take this result and use the XOR operation with 𝑆𝐻𝑅³(𝑥).

Image for post
Image for post
Bitwise XOR performed on rotated & shifted words

Therefore, concluding that:

Image for post
Image for post

Converting this result into hexadecimal form we have:

Image for post
Image for post
Binary to hexadecimal conversion

We now take the word 2 places back (𝑊(14) ) and make 2 copies. We then take the first copy and right rotate 17 places (𝑅𝑂𝑇𝑅¹⁷(𝑥)):

𝑾(𝟏𝟒) = 00000000000000000000000000000000

Image for post
Image for post
Right Rotation 17 places using the first copy

Now, we take the 2nd copy and rotate right 19 places ( 𝑅𝑂𝑇𝑅¹⁹(𝑥)):

Image for post
Image for post
Right rotation 19 places using the second copy

Lastly, we take the original and right shift by 10 (𝑆𝐻𝑅¹⁰(𝑥)):

Image for post
Image for post
Right shift 10 places using original

We then take these 3 words and apply:

Image for post
Image for post
Rotation and Shift XOR for sigma 1

Where:

Image for post
Image for post
Image for post
Image for post
Bitwise XOR performed on rotated & shifted words

Therefore, concluding that:

Image for post
Image for post

As the whole word consists of only zeros, the hexadecimal value is 0. We then refer to our next word formula, finding the word 16 places back (𝑊 (0)) and the word 7 places back (𝑊 (9)).

𝑊(0) = 01110000011011110111001001110100 0

𝑊(9) = 00000000000000000000000000000000

We can now calculate 𝑊 (16). For this, we use binary addition (+) which follows the rules shown below:

Image for post
Image for post
Binary Addition Rules
Image for post
Image for post
Pre-Binary Addition variables
Image for post
Image for post
Binary Addition to find 17th word (W(16))

The first column is ignored to maintain the 32-bit format. Now we have our 17th word (𝑊(16)):

𝑾𝟏𝟔 = 𝟎𝟎𝟏𝟎𝟏𝟏𝟏𝟏𝟏𝟏𝟎𝟎𝟎𝟏𝟎𝟏𝟏𝟎𝟎𝟏𝟏𝟏𝟏𝟎𝟎𝟏𝟎𝟏𝟏𝟏𝟏𝟏

Lastly, we convert this from binary into hexadecimal form:

Image for post
Image for post
Binary to hexadecimal conversion of 17th word

This process is then repeated until we are left with 64 words. Computing the remaining words, we are left with the words shown below:

Image for post
Image for post
All 64 calculated words in hexadecimal

To compute the final hash, we must run 64 iterations of the equation below:

Image for post
Image for post
Final Hash Iterations 0 to 63

Using our 64 words (𝑊 ) from the table above and the initial hash values (𝒂 𝒕𝒐 𝒉), where:

Image for post
Image for post

(¬) is the NOT operation that returns 0 if the digit is 1 and 1 if the digit is 0. We also use (∧) which is the AND operation, this returns 1 if both digits are 1 and 0 otherwise.

Example: Computing 𝑡(0)

To compute 𝑡(0) we must compute our new 𝒂 𝒕𝒐 𝒉 hash values. Starting with a, where:

Image for post
Image for post

Finding 𝑻𝟏:
𝑒 = 510𝑒527 = 01010001000011100101001001111111
𝑓 = 9𝑏05688𝑐 = 10011011000001010110100010001100
𝑔 = 1𝑓83𝑑9𝑎𝑏 = 00011111100000111101100110101011
h = 5𝑏𝑒0𝑐𝑑19 = 01011011111000001100110100011001

Image for post
Image for post
Computation of Ch(e, f, g)

Referring back to the final iteration formula we can now calculate T(1):

Image for post
Image for post
Computation of T(1) using binary addition

Giving:

𝑻𝟏 = 01100011111001110101111111011100

Finding 𝑻𝟐:
𝑎 = 6𝑎09𝑒667 = 01101010000010011110011001100111
𝑏 = 𝑏𝑏67𝑎𝑒85 = 10111011011001111010111010000101
𝑐 = 3𝑐6𝑒𝑓372 = 00111100011011101111001101110010

Image for post
Image for post
Binary XOR on rotations
Image for post
Image for post
Computation of Maj(a, b, c)

We can now calculate 𝑇(2) :

Image for post
Image for post
Computation of T(2) using binary addition

Giving:

𝑻𝟐 = 00001000100100001001101011100101

Now we can calculate 𝒂 from 𝑇(1) and 𝑇(2):

Image for post
Image for post
Computation of a using binary addition

Giving:

𝒂 = 01101100011101111111101011000001

Converting to hexadecimal, we have:

𝑎 = 6c77fac1

Referring back to the iteration algorithm, we can also now calculate 𝑒:

Image for post
Image for post

𝑑 = 𝑎54𝑓𝑓53𝑎 = 10100101010011111111010100111010
𝑇(1) = 01100011111001110101111111011100

Image for post
Image for post
Computation of e using binary addition

Giving:

𝑒 = 00001001001101110101010100010110

Converting to hexadecimal we have:

𝑒 = 09375516

Referring back to the iteration algorithm we also find:

𝑏 = 𝑎 = 6𝑎09𝑒667
𝑐 = 𝑏 = 𝑏𝑏67𝑎𝑒85
𝑑 = 𝑐 = 3𝑐6𝑒𝑓372
𝑓 = 𝑒 = 510𝑒527𝑓
𝑔 = 𝑓 = 9𝑏05688𝑐
h = 𝑔 = 1𝑓83𝑑9𝑎𝑏

Leaving us with the final values for the first iteration 𝑡(0):

Image for post
Image for post
a-h values after the first iteration

We then perform an additional 63 iterations of the algorithm, using our 𝒂-𝒉 values from the previous iteration, updating every round. This leaves us with the 64th iteration 𝑡(63):

Image for post
Image for post
a-h values after 64 iterations

Using these values, we compute the ith intermediate hash value 𝐻(i) given by the following:

Image for post
Image for post

Where a-h are the values from our final iteration (𝑡(63)) and 𝐻(0)(0) to 𝐻(7)(0) are the initial hash values.

Giving us:

Image for post
Image for post
Computation of H(0)(i) using binary addition

……

Image for post
Image for post
Computation of H(6)(i) using binary addition

Finally, we can convert the intermediate hash values into hexadecimal form, giving us our parts for the final message digest:

Image for post
Image for post
Intermediate Hash Values in Hexadecimal form

Putting these values together in the following layout, we now have our final message digest (𝑀):

Image for post
Image for post

𝑴=3ccf243960d58d970b38dfbfa68be5c554c7462c960ab480933eb16b0d789597

As noted at the beginning of the chapter, we can see that from our initial message ‘portsmouth’ there is no resemblance to the message digest. The long process of SHA-256 is necessary for retaining the security of Bitcoin, ensuring there is no possible way of reversing the process. If a method of reversal were possible, attackers would have the ability to alter transactions on the blockchain to their choosing.

And that concludes the hashing process. If you were wondering how I was able to get all of these results (I’m afraid I didn’t calculate every iteration myself) or wanted to try it yourself, I’d recommend checking out this google sheets document of the hashing process by David Rabahy.

If you enjoyed this series or had any burning questions, please don’t hesitate to drop a comment below.

Part One & Two
Part Three

The Startup

Medium's largest active publication, followed by +771K people. Follow to join our community.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store