What is SHA 256 and How Does It Work?

Uludag University Blockchain
Coinmonks
11 min readJul 13, 2022

--

(Img from: [0])

1. What is SHA 256?

SHA 256(Secure Hash Algorithm) is a hash algorithm from the set of SHA-2 cryptographic hash functions developed by the NSA (US National Security Agency). The hash algorithm outputs the input as a unique hash of fixed size. Hash algorithms are next to impossible to reverse and are most likely identified with outputting a unique hash. The reason why it is called most likely is because theoretically two different inputs have the possibility of giving identical outputs. Because you can enter the SHA-256 algorithm as long as you want, and all of them are summarized as 256 bits. So there are 2²⁵⁶ different hash values ​​in total. This is genuinely an astronomically high number. With 2²⁵⁶ different hashes available, the probability of finding a conflict is also pretty low.

How big is the difference now that we understand that no two hashes will be the same in practice? For example, if a letter of the entered data is capitalized or not, how much does the hash change compared to the old one? Answer: Completely. More precisely, we cannot find logical and mathematical similarities. Hash values ​​change to appear completely random at the slightest change. This is also called the Avalanche Effect.

(These two hashes do not resemble each other in any way.)

2. What Does SHA 256 Do?

There are basic and very important features that we have obtained from SHA-256 in the name of authentication and cryptography. The most important feature is the part we mentioned at the beginning: It gives a unique output. This is important because imagine that one piece of data that needs to be verified can be verified with another piece of data. Of course, with SHA-256 this is theoretically possible but highly unlikely. Another feature is that it is infeasible to return and is a one-way function. In other words, it is, again, very difficult to find the original value from the hexadecimal value presented as the output. The only way for this is to try the hexadecimal value or the original 256-bit value one by one and compare them continuously with the Brute-Force method.

It would be really sad if such a function were not deterministic. In other words, an SHA-256 function that can vary from device to device would create a big problem in peer-to-peer systems, where it works best. Because it is necessary to verify the value produced on one computer on the other computer, this work could not be done properly because the devices were different. Therefore, this function is a deterministic function that does not change from environment to environment, and is never dependent on the features and hardware of the device in which it is located. If you wanted to do the function with a pen and paper in your hand, the result you will get when you finish it will be the same as the result you will get when you run this function on the computer.

I will talk about how it works in a moment, but it will take a lot of time to do this with paper and pencil. But unlike humans, computers can do such calculations very quickly. Especially when it comes to a function that works with simple bitwise logic functions and formulas such as SHA-256 and has few loops, very fast results can be obtained. This is another important feature. Imagine that a piece of data needs to be verified and it takes hours, not even hours, but minutes. This would make the current system rather cumbersome.

To summarize SHA-256:

1. Is infeasible to reverse,

2. One-way,

3. Unique output,

4. Avalanche Effect,

Thanks to these features, it is commonly used in Bitcoin mining. The miners’ job is to enter transaction data into this function and replace the arbitrary number, Nonce, entered in addition to this data, hoping to see a certain number of 0’s in the hash value.

3. How Does SHA-256 Work?

Let’s see how this algorithm, which is so important, works. Let us warn you from the beginning that this part will be quite technical. But trust me, it’s not that hard. If you can understand and analyze it deeply, maybe you can reverse engineer it too! Even if anyone hasn’t been able to do it yet.

3.1) Basic Operations

In order to perceive the working logic of the algorithm, it is necessary to know a few basic operations.

1. Shift Right(SHR): It is a basic bitwise operation. Its logic consists of shifting existing bits to the right. Since it is shifted to the right, the first bit on the left is set to zero, and the rightmost bit becomes the previous bit.

For example: 10101001 SHR 3 → 00010101/ The amount of right shifting meant by SHR 3. So 3 bits shifted to the right.

2. Rotation Right(ROTR): Its logic is similar to shift right. The only difference is that when we shift it to the right, the missing bits appear on the left, respectively, without a leading 0. We can think of the bits settling in a circular fashion.

For example: 10101001 ROTR 3 → 00110101/ ROTR 3 is shifted so that the last 3 bits come first.

3. Exclusive Or(XOR): It works like this, let there be two propositions p and q, and these propositions are correct or not correct with 1 and 0 such that:

XOR,

Output = 0 for p=1, q=1

Output = 1 for p=1, q=0

Output = 1 for p=0, q=1

Output=0 for p=0, q=0 is determined mathematically.

For example:

p = 11001001

q = 00101010

XOR — — — — — — —

Output = 11100011

4. Binary Addition(ADD): It means addition in the binary system. It is not much different from addition in the decimal number system. Only the digits can be 1 instead of a maximum of 9. In cases where it exceeds 1, we add one to the next digit as in the decimal system. We keep adding until we don’t exceed it.

New to trading? Try crypto trading bots or copy trading

For example: 00001100 + 00010101 = 00100001 In order to understand the accuracy of the result, we need to be able to count in the binary system. For counting in the binary system, a notation such as 1*(2⁰) is applied based on the first digit 0. Here 1 determines whether it is 0 or 1, and 0 determines which digit it is in. In that case:

First number:0*(2⁰) + 0*(2¹) + 1*(2²) + 1*(2³) + 0*(2⁴) + 0*(2⁵) + 0*(2⁶)+ 0*(2⁷) = 12,

Second number: 1*(2⁰) + 0*(2¹) + 1*(2²) + 0*(2³) + 1*(2⁴) + 0*(2⁵) + 0*(2⁶)+ 0*(2⁷) = 21,

Our result is: 1*(2⁰) + 0*(2¹) + 0*(2²) + 0*(2³) + 0*(2⁴) + 1*(2⁵) + 0*(2⁶)+ 0*(2⁷) = 33. Thus it can be converted to decimal system.

5. Compound Operations: These are the combined operations of the above operations. But because they are specific, they have nomenclature. Let’s call the initial current bit ‘X’:

  1. ) σ0 (Lowercase Sigma-0): Lowercase Sigma0 is a compound operation. Applies ROTR 7 to X and puts it on first portion, applies ROTR 18 on X and puts it on second portion, and applies SHR 3 on X and puts it on third portion and applies XOR to all of them.
  2. ) σ1: (Lowercase Sigma-1): In Lowercase Sigma1, function applies ROTR 17 on X and puts it on first portion, applies ROTR 19 on X and puts it on second portion , applies SHR 10 on X and puts it on third, and applies XOR to all of them.
  3. ) Σ0= (Uppercase Sigma-0): UppercaseSigma0 is a compound operation that applies ROTR 2 on X puts it on first portion, applies ROTR 13 on X puts it on second portion, applies ROTR 22 on X puts it on third portion and applies XOR to all of them.
  4. ) Σ1= (Uppercase Sigma-1): In uppercase Sigma1, function applies ROTR 6 on X and puts it on first portion, applies ROTR 11 on X and puts it on second portion, applies ROTR 25 on X and puts it on third portion and applies XOR to all of them.

6. Choice(Ch): Works with X, Y, Z bits. If the current bit of X is 1, the current bit of Y is written to the output, if the current bit of X is 0, the bit of Z is written to the output.

For example:

X: 11001001

Y: 00110110

Z: 11100100

Ch — — — — — —

Output: 00100100

7. Majority(Maj): Works with X, Y, Z bits. Processing bits are checked, and the majority bit is written to the output.

For example:

X: 01100001

Y: 10010110

Z: 01101101

Maj — — — — — —

Output: 01100101

3.2) Constants:

Constants are random-looking numbers used to compress messages in later stages. There are 64 numbers and they are found by taking the cube root of the first 64 prime numbers to give a random appearance. After the cube root is taken, the integer parts are discarded and restricted to 32 bits based on the fractional parts. It is denoted as K[i], with i being the variable:

K0 = ∛2 = 1.2599210498948732 →2599210498948732

→ 01000010100010100010111110011000

K1 = ∛3 = 1.4422495703074083 →4422495703074083

→ 01110001001101110100010010010001

K2 = ∛5 = 1.7099759466766968 →7099759466766968

→ 10110101110000001111101111001111

K3 = ∛7 = 1.912931182772389 →912931182772389

→ 11101001101101011101101110100101

.

.

.

K63 = ∛311 = 6.775168952273312 →775168952273312

→ 11000110011100010111100011110010

3.3) Algorithm of the Function:

Now that we have completed our preparations, let’s examine the functioning of the function step by step:

Step 1: Identifying the Input: In this step, we enter the string whose hash we want to find. For example: abc

Step 2: Equivalent of Input to 8 Bit ASCII Table: Although ASCII is 7-bit in its original form, we base it on its 8-bit version. The ‘abc’ input corresponds to: →[97, 98, 99] = 979899 = 011000010110001001100011

Step 3: Padding: We have 24 bits, but as a result, if we want to get a 256 bit hash, we need to work in 512 bit chunks for reasons in the next steps.

So the 24-bit is taken as is: 11000010110001001100011

Add ‘1’ as a separator at the end: 110000101100010011000111

It is then filled with 0 until it is 448 bits: 110000101100010011000111000000000000000000000000000000000000000000000………………………………..0000000000000000000000000000000000

We have stated that it works in 512 bit parts, and the length of our input is written in bits in the last 64 bits: 110000101100010011000111000000000000000000000000000000000000000000000……………………….….0000000000000000000000000000000000…………11000

‘11000’ is 24 in binary. Thus, the message is split into 512-bit chunks. If our message is longer than 448 bits, a second message block is required. The loop iteration required.

Step 4: Message Schedule, Split Messages into 16 Parts: The 512-bit message we get is divided into 16 parts. Starting from the left, 32 bits are taken each and continue. ‘t’ denoted as variable:

W0 = 11000010110001001100011100000000

W1 = 00000000000000000000000000000000

W2 = 00000000000000000000000000000000

.

.

.

W15 = 00000000000000000000000000011000

Step 5: Completion to 64: If we want the digest value of our message to be 256 bits, 64 messages are required. For this, it is expanded using the following formula (assuming t as the last message): 

W[t] = σ1(t-2) + (t-7) + σ0(t-15) + (t-16)

Let’s not forget that the addition here is in binary!

This total is written to the new incoming message and the expansion continues. W63 is the latest message.

Step 6: Compression: At this point, the bits will begin to be compressed and summarized. For this, it is necessary to initialize the hash values ​​first. To make the initial values ​​of the message look random, this time the square roots of the first eight prime numbers are taken, the integer parts are subtracted and multiplied by 2³² because 32 bits of data is required.

a = √2 =1.4142135624 →0.4142135624 * 2³² = 1779033703

b =√3 = 1.7320508076 →0.7320508076 * 2³² = 3144134277

c = √5 = 2.2360679775 →0.2360679775 * 2³² = 1013904242

d = √7 = 2.6457513111 →0.6457513111 * 2³² = 2773480762

e = √11 = 3.3166247904 →0.3166247904 * 2³² = 1359893119

f = √13 = 3.6055512755 →0.6055512755 * 2³² = 2600822924

g = √17 = 4.1231056256 →0.1231056256 * 2³² = 528734635

h = √19 = 4.3588989434 →0.3588989434 * 2³² = 1541459225

If it is converted to bits:

a, b, c, d, e, f, g, h can be called working variables or initial hash values. Before we change these variables, we need to generate two temporary numbers. For this, the following two formulas are used (T1: 1. Temporary number, T2: 2. Temporary number, i is denoted as variable):

T1 = Σ1(e) + Ch(e, f, g) + h + K[i] + W[i]

T2 = Σ0(a) + Maj(a, b, c)

In order to compress, the initial hash values ​​are shifted down one row with the value ‘a’ to be blank, the sum of T1 and T2 is written in the ‘a’ part, and T1 is added to the ‘e’ part.

(Shifted down)
(Addition applied)

This process continues for each message in the Message Schedule, with the values ​​of K[i] and W[i] increasing one by one. When the compression process is finished for the 64th message, the resulting values ​​are summed with the initial hash values ​​and the final bit values ​​are revealed.

This is done for each message block allocated as 512 bits. Since the first message block is like in our example, the initial hash values ​​are written based on the square roots of the first eight prime numbers. The initial hash values ​​for the block after the first block are written based on the last values ​​of the previous block.

(Multiple message blocks compressed)

After the last message block is compressed, the final hash value is revealed in bits. This value is converted from binary to hexadecimal and written sequentially starting from a.

So our ultimate value is:

→ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

If we had done all the operations with pen and paper, we would have found this result. You can try it from the site I left the link below. Thank you for reading.

Arif Emre Abduşoğlu

Sources:

[0]: https://www.flaticon.com/free-icon/sha-256_2586154

https://www.youtube.com/watch?v=f9EbD6iY9zI&t=1109sHow Does SHA-256 Work?/learnmeabitcoin

https://blog.boot.dev/cryptography/how-sha-2-works-step-by-step-sha-256/How SHA-256 Works Step-By-Step https://andersbrownworth.com/blockchain/hashSHA-256 demo site

Join Coinmonks Telegram Channel and Youtube Channel learn about crypto trading and investing

Also, Read

--

--