# The Mathematics of Bitcoin — SHA256

## Part 4 of a series looking behind the scenes of the worlds most popular cryptocurrency

**Disclaimer**: If you haven’t already read the first three parts, **click here for parts one and two**** **and **click here for part three**.

# 4 — SHA-256

A cryptographic hash function is a special type of function that **takes an input string** of a given length and converts it into an **alphanumeric string of fixed length**. In the case of Bitcoin, a **“Message”** is inputted, and a hash function, known as **SHA-256** (Secure Hashing Algorithm 256), gives an output known as a **“Hash”** or **“Message Digest”**. This means that however long the string of data (limit of 2²⁵⁶- 1 bits), the **output will always be 256-bits in length**. The process of hashing is **not a method of encryption** as it is only a **one- way process** and therefore **cannot be reversed** (decrypted). By running multiple outputs through SHA-256, we can see how different the output becomes, even when only changing a single character in the message. We can also see that despite having an input of longer length, the output length is the exact same (table 5.1). SHA-256 is also **deterministic**, meaning given the same input, the output will always remain the same.

To demonstrate how SHA-256 computes a message digest, I will be using the phrase **‘portsmouth’ **(my old university), showing each step of the algorithm.

## 4.1 — Preprocessing Phase

To begin the hashing process, we take our message and convert it into **binary** using the ASCII (American Standard Code for Information Interchange) as shown below.

As we can see each letter is **8-bits long**. We then make one long string from these and a ‘1’ is then added to the end:

`011100000110111101110010011101000111001101101101011011110111010101110100011010001`

We then find the **phrase size** which in this case is **80-bits** (10 letters, each 8-bits long). The phrase size is then **also converted to binary**:

`80 = 01010000`

Now we take the **phrase size and add it to the end of our message**. In between these, **we pad the code with zeros to get the block to the correct size** (512-bits). If the message is larger than 512 bits, more blocks are created, and we run these through the process independently with the phrase size at the end of the last block.

`01110000011011110111001001110100011100110110110101101111011101010111010001101000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001010000`

The block is then divided into **16 words of 32-bits each**:

`0. 01110000011011110111001001110100`

1. 01110011011011010110111101110101

2. 01110100011010001000000000000000

.

.

.

15.00000000000000000000000001010000

We call this first block: block 1 = 𝑀(1).

When referring to a specific **word** in a certain **block** we use 𝑀(𝑖)(b) where b is the **block number** and 𝑖 refers to the **specific word** in the block.

For example:

𝑀(0)(3) = The 1st word in the 3rd block

where (𝑖 = 0, 1, 2,…, 15)

## 4.2 — Initial Hash Values

Prior to computing the remaining blocks, we must define the **initial hash value** (𝐻⁰). The initial hash value consists of **8 words**, each 32-bits in length. For SHA-256 these are calculated from the **first 8 primes**. These always remain the same for any message. The primes are firstly square rooted and then taken to the modulus 1. The result is then multiplied by 16⁸ and rounded down to the nearest integer.

We convert the results from above into hexadecimal form; giving us the following 8 words assigned from **a-h**:

## 4.3 — Hash Computation Phase

Up until now, all the working has been part of the **preprocessing phase**. From now, we will look at the **hash computation phase**. In this phase, we will be using data from the preprocessing phase to **compute an additional 48 words**, totalling the number of words to 64. This then gives us **4 blocks** in total. As calculating 48 more words would be too long, I will demonstrate the computation by computing the **17th word** (𝑊(16) ). This is computed using the following equation:

Where for the 17th word (Just the t value changes for each word):

We start by finding the word 15 places back (𝑊 (1)). We then make **two copies** of this. Firstly, we take the first copy and **right rotate by 7 places** (𝑅𝑂𝑇𝑅⁷(𝑥)). This means each digit moves to the right **one place 7 times** and when the numbers fall off the end, they are moved to the front:

𝑾(𝟏) = 01110011011011010110111101110101

With the second copy, we **right rotate by 18** ( 𝑅𝑂𝑇𝑅¹⁸(𝑥)):

Lastly, we take the original and **right shift it by 3** (𝑆𝐻𝑅³(𝑥)). This means when a number falls off the end it is **replaced by zeros** at the **beginning** of the word:

We then take these 3 words and apply:

Where:

The **bitwise exclusive-OR operation** (XOR ⨁) takes **two binary digits and returns 0 if both digits are 0 or 1 and returns 1 otherwise**. As we have 3 words, we first use the XOR operation on 𝑅𝑂𝑇𝑅⁷(𝑥) and 𝑅𝑂𝑇𝑅¹⁸(𝑥) and then take this result and use the XOR operation with 𝑆𝐻𝑅³(𝑥).

Therefore, concluding that:

Converting this result into **hexadecimal form** we have:

We now take the word 2 places back (𝑊(14) ) and make 2 copies. We then take the first copy and right rotate 17 places (𝑅𝑂𝑇𝑅¹⁷(𝑥)):

𝑾(𝟏𝟒)

=00000000000000000000000000000000

Now, we take the 2nd copy and rotate right 19 places ( 𝑅𝑂𝑇𝑅¹⁹(𝑥)):

Lastly, we take the original and right shift by 10 (𝑆𝐻𝑅¹⁰(𝑥)):

We then take these 3 words and apply:

Where:

Therefore, concluding that:

As the whole word consists of only zeros, the hexadecimal value is 0. We then refer to our **next word formula**, finding the word 16 places back (𝑊 (0)) and the word 7 places back (𝑊 (9)).

𝑊(0)= 01110000011011110111001001110100 0

𝑊(9)= 00000000000000000000000000000000

We can now calculate 𝑊 (16). For this, we use binary addition (+) which follows the rules shown below:

The first column is ignored to maintain the 32-bit format. Now we have our 17th word (𝑊(16)):

𝑾𝟏𝟔= 𝟎𝟎𝟏𝟎𝟏𝟏𝟏𝟏𝟏𝟏𝟎𝟎𝟎𝟏𝟎𝟏𝟏𝟎𝟎𝟏𝟏𝟏𝟏𝟎𝟎𝟏𝟎𝟏𝟏𝟏𝟏𝟏

Lastly, we convert this from binary into hexadecimal form:

This process is then repeated until we are left with **64 words**. Computing the remaining words, we are left with the words shown below:

To compute the final hash, we must run 64 iterations of the equation below:

Using our 64 words (𝑊 ) from the table above and the initial hash values (𝒂 𝒕𝒐 𝒉), where:

(¬) is the **NOT** operation that **returns 0 if the digit is 1 and 1 if the digit is 0**. We also use (∧) which is the **AND** operation, this **returns 1 if both digits are 1 and 0 otherwise**.

**Example: Computing 𝑡(0)**

To compute 𝑡(0) we must compute our new 𝒂 𝒕𝒐 𝒉 hash values. Starting with **a,** where:

**Finding **𝑻𝟏**: **𝑒 = 510𝑒527 = 01010001000011100101001001111111

𝑓 = 9𝑏05688𝑐 = 10011011000001010110100010001100

𝑔 = 1𝑓83𝑑9𝑎𝑏 = 00011111100000111101100110101011

h = 5𝑏𝑒0𝑐𝑑19 = 01011011111000001100110100011001

Referring back to the final iteration formula we can now calculate T(1):

Giving:

𝑻𝟏 = 01100011111001110101111111011100

**Finding **𝑻𝟐**: **𝑎 = 6𝑎09𝑒667 = 01101010000010011110011001100111

𝑏 = 𝑏𝑏67𝑎𝑒85 = 10111011011001111010111010000101

𝑐 = 3𝑐6𝑒𝑓372 = 00111100011011101111001101110010

We can now calculate 𝑇(2) :

Giving:

𝑻𝟐

=00001000100100001001101011100101

Now we can calculate 𝒂 from 𝑇(1) and 𝑇(2):

Giving:

𝒂 = 01101100011101111111101011000001

Converting to hexadecimal, we have:

𝑎= 6c77fac1

Referring back to the iteration algorithm, we can also now calculate **𝑒**:

**𝑑** = 𝑎54𝑓𝑓53𝑎 = 10100101010011111111010100111010**𝑇(1)** = 01100011111001110101111111011100

Giving:

𝑒= 00001001001101110101010100010110

Converting to hexadecimal we have:

𝑒= 09375516

Referring back to the iteration algorithm we also find:

𝑏= 𝑎 = 6𝑎09𝑒667𝑐= 𝑏 = 𝑏𝑏67𝑎𝑒85𝑑= 𝑐 = 3𝑐6𝑒𝑓372𝑓= 𝑒 = 510𝑒527𝑓𝑔= 𝑓 = 9𝑏05688𝑐h= 𝑔 = 1𝑓83𝑑9𝑎𝑏

Leaving us with the final values for the first iteration 𝑡(0):

We then perform an additional 63 iterations of the algorithm, using our 𝒂-𝒉 values from the previous iteration, **updating every round**. This leaves us with the 64th iteration 𝑡(63):

Using these values, we compute the ith intermediate hash value 𝐻(i) given by the following:

Where ** a-h **are the values from our final iteration (𝑡(63)) and 𝐻(0)(0) to 𝐻(7)(0) are the initial hash values.

Giving us:

……

Finally, we can convert the intermediate hash values into hexadecimal form, giving us our parts for the final message digest:

Putting these values together in the following layout, we now have our final message digest (𝑀):

𝑴=3ccf243960d58d970b38dfbfa68be5c554c7462c960ab480933eb16b0d789597

As noted at the beginning of the chapter, we can see that from our initial message ‘portsmouth’ there is **no resemblance** to the message digest. The long process of SHA-256 is necessary for retaining the security of Bitcoin, ensuring there is no possible way of reversing the process. If a method of reversal were possible, attackers would have the ability to alter transactions on the blockchain to their choosing.

And that concludes the hashing process. If you were wondering how I was able to get all of these results (I’m afraid I didn’t calculate every iteration myself) or wanted to try it yourself, I’d recommend checking out **this google sheets document** of the hashing process by David Rabahy.

If you enjoyed this series or had any burning questions, please don’t hesitate to **drop a comment below**.