SIDEBAR: What is Hashing?

A friend asked me what Hashing was, and why I keep mentioning it.

Hashing is one of the critical functions that makes Blockchains secure.

Lots of data gets hashed in a Blockchain. Your public key gets hashed to create your wallet address (the public key itself is derived from your private key with elliptic-curve math, not with a hash). Each block is identifiable by its hash, which is computed from the data in that block. Each block also contains the hash of the previous block, which is how the blocks are linked to create the Blockchain. Hashes are all over Blockchains, so it’s important to understand the overall concept of what they are and how they work.
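To make the "each block contains the previous block's hash" idea concrete, here is a minimal sketch in Python. The block fields (index, data, prev_hash) and the function names are made up for illustration; this is a toy, not the block format Bitcoin or any real Blockchain uses.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents, serialized with a stable key order."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(payloads: list[str]) -> list[dict]:
    """Link each block to its predecessor by storing the previous block's hash."""
    chain = []
    prev_hash = "0" * 64  # placeholder for the first block, which has no parent
    for index, data in enumerate(payloads):
        block = {"index": index, "data": data, "prev_hash": prev_hash}
        chain.append(block)
        prev_hash = block_hash(block)
    return chain


def is_valid(chain: list[dict]) -> bool:
    """Check that every block's prev_hash matches the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = build_chain(["Alice pays Bob 1", "Bob pays Carol 2", "Carol pays Dan 3"])
print(is_valid(chain))  # True

chain[0]["data"] = "Alice pays Bob 100"  # tamper with an early block
print(is_valid(chain))  # False
```

Changing anything in an early block changes that block's hash, so the link stored in the next block no longer matches. That mismatch is what makes tampering visible.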

A Cryptographic Hash is a mathematical function that maps data of any size to an output of a fixed size. You can take any set of characters (numbers, text, spaces, punctuation), apply a hashing algorithm, and get back a seemingly unrelated set of characters. Any small change in the original set of characters will change the resulting hash completely.

Miracle Salad’s SHA-256 Hash Generator

Give it a try:

Go to Miracle Salad’s SHA-256 Hash Generator, type in anything you like, and then click the SHA-256 button and see what you get. Now, change what you typed ever so slightly — add a space, take away a comma, or capitalize a word. Click the button again. Look at the result.

You could put the text of War and Peace in there, hash it, then change one comma in the whole text, hash it, and it would create a completely different result.
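If you would rather run the experiment on your own computer, here is a small sketch using Python's built-in hashlib module. The sample sentence and the helper names are my own, chosen only for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_positions(a: str, b: str) -> int:
    """Count the character positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


original = "It was a quiet evening, and nothing much happened."
altered = original.replace(",", "", 1)  # remove a single comma

h1 = sha256_hex(original)
h2 = sha256_hex(altered)

print(h1)
print(h2)
print(differing_positions(h1, h2), "of", len(h1), "characters differ")
```

The two inputs differ by one comma, yet the two printed hashes should look unrelated to each other.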

So, what’s happening when you click on that “SHA-256” button?

An algorithm is applied to the information you provided to create a 256-bit hash, which is displayed as 64 hexadecimal characters (each hexadecimal character represents 4 bits).
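The relationship between 256 bits and 64 characters can be checked with a few lines of Python. This is only a quick sketch; the input "hello" is an arbitrary example.

```python
import hashlib

digest = hashlib.sha256(b"hello").digest()         # the raw hash as bytes
hex_digest = hashlib.sha256(b"hello").hexdigest()  # the same hash as hex text

print(len(digest))      # 32 bytes
print(len(digest) * 8)  # 256 bits
print(len(hex_digest))  # 64 hexadecimal characters
```

Each byte is 8 bits and is written as 2 hexadecimal characters, so 32 bytes become 64 characters on screen.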

Today, Bitcoin uses the SHA-256 hashing function. A Hash is a one-way function: there is no practical way to work backward from the output to the input. And since hashing is not encryption, there is nothing to decrypt.

Properties of a Cryptographic Hash:

  1. Computationally Efficient = Computers can perform the hash function quickly
  2. Deterministic = No matter how many times you try, on whatever computer, the same input run through the same function will produce the same output
  3. Pre-Image Resistant = Given only the output, it is not feasible to work out what the input was
  4. Collision Resistant = It is not feasible to find 2 different inputs that produce the same output
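The first two properties in the list above are easy to observe directly. Here is a rough sketch; the 10 MB payload is an arbitrary size I picked, and the timing you see will depend on your machine.

```python
import hashlib
import time


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of some bytes as hexadecimal text."""
    return hashlib.sha256(data).hexdigest()


message = b"the same input"

# Deterministic: hashing the same input twice gives the same output.
first = sha256_hex(message)
second = sha256_hex(message)
print(first == second)

# Computationally efficient: time how long it takes to hash 10 MB of data.
payload = b"x" * 10_000_000
start = time.perf_counter()
sha256_hex(payload)
elapsed = time.perf_counter() - start
print(f"Hashed 10 MB in {elapsed:.3f} seconds")
```

Pre-image resistance and collision resistance can't be shown by running a short script. They are claims about how hard it is to find an input or a matching pair, and no quick demonstration can prove that.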

Some of MD5Calc’s Algorithm Calculators

Is SHA-256 the only type of hash function?

Nope. Check out this page from MD5Calc. There are lots of algorithm calculators listed. If you used SHA-512 to hash the same phrase you used with the SHA-256 function, it would result in a 512-bit hash that would appear as 128 hexadecimal characters. That’s because it is applying a different algorithm.

What do you mean by Algorithm?

You may remember my Sidebar discussing what a Protocol is — a framework of rules that describe how something is structured and how it works. An Algorithm is also a process or set of rules or steps, but these are steps that are followed in a calculation or a problem-solving operation. On the most basic level, an Algorithm is a set of steps for a mathematical function.

You can think of it like this: Something we are familiar with is Addition and Subtraction. You can use the same set of starting numbers, but if you Add, you get 2+2=4, and if you Subtract, you get 2-2=0. You get a different result in the end. Similarly, the Algorithm for a SHA-256 function is different from the one for a SHA-512 function. There are different steps involved, and we know that in part because what they produce is different.

Why is it called SHA-x?

SHA stands for Secure Hash Algorithm. Each type of hashing algorithm has a specific purpose: some are optimized for a type of data, others for speed, others for security. Secure Hash Algorithms are optimized primarily for security.

The first widely used Secure Hash Algorithm was SHA-1, used for digital signatures. However, its 160-bit output did not provide enough possible results, and researchers eventually showed it was practical to make two different sets of data produce the same result:

Image Courtesy Pixabay, cynthiagaytan09

SHA-1 was not Collision Resistant.

Sort of like all the ways you can get to a result of 2. You can have 8-6=2 and 6-4=2. Now that’s fine for math, but if 2 was supposed to be a secure digital signature, it can’t belong to two different data sets and be used reliably.

So, SHA-2, which was first published in 2001, took its place. Around 2015 to 2017 it became the requirement for the secure digital signatures that are widely used online. That required a transition of all the SSL/TLS certificates that were being used. A big effort.

There are a few different flavors of SHA-2, including SHA-224, SHA-256, SHA-384, and SHA-512. They are part of the same family and all have a similar structure, but they produce outputs of different bit lengths.
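You can see the different output lengths of the SHA-2 family by hashing one phrase with each flavor. A short sketch using Python's hashlib follows; the phrase itself is arbitrary.

```python
import hashlib

phrase = b"the same phrase"
sizes = {}

for name in ("sha224", "sha256", "sha384", "sha512"):
    hex_digest = hashlib.new(name, phrase).hexdigest()
    sizes[name] = len(hex_digest) * 4  # each hex character represents 4 bits
    print(f"{name}: {sizes[name]} bits, {len(hex_digest)} hex characters")
```

The number in each name matches the number of bits in its output, and the character count on screen is always that number divided by 4.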

Photo by ian dooley on Unsplash

As you’ve already witnessed, computing power keeps getting more efficient and powerful. SHA-256 has a much larger output space than SHA-1: 2 to the 256th power possible outputs, compared with 2 to the 160th power for SHA-1. That far exceeds the number of grains of sand in the world. That’s a lot of output combinations.

Even so, there are an INFINITE number of combinations of data that can be input. Think about it. Just the number of changes you can make to “War and Peace”. Then all the other novels that you could input into that SHA-256 calculator and the changes you could make to them.

BUT the number of possible SHA-256 outputs, while very very large, is finite. So, just like there are a number of ways to get to a result of 2, there must be more than one input that creates any given output, even for SHA-256. The security comes from how infeasible it is to actually find such a pair. Thus, as computing power increases and new attacks are discovered, more powerful and complex algorithms are necessary.
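Finding two inputs with the same full SHA-256 hash is out of reach, but the principle shows up quickly if you shrink the output. The sketch below keeps only the first 4 hexadecimal characters (16 bits) of each hash. That is a deliberately weak toy hash of my own invention, with only 65,536 possible outputs, so a match is guaranteed within 65,537 inputs. The function names and the "input-N" strings are just illustrative.

```python
import hashlib
from itertools import count


def tiny_hash(text: str, hex_chars: int = 4) -> str:
    """SHA-256 truncated to a few hex characters (a deliberately weak toy hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:hex_chars]


def find_collision(hex_chars: int = 4) -> tuple[str, str, str]:
    """Try inputs in order until two of them share the same truncated hash."""
    seen = {}
    for i in count():
        text = f"input-{i}"
        h = tiny_hash(text, hex_chars)
        if h in seen:
            return seen[h], text, h
        seen[h] = text


first, second, shared = find_collision()
print(f"{first!r} and {second!r} both hash to {shared}")
```

Every extra character kept in the hash makes the search much longer, which is why the full 64-character output is considered safe against this kind of brute-force hunt.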

With this in mind, SHA-3 was developed using a different design, called a sponge construction, that is built on a permutation. Like SHA-2, SHA-3 has different flavors; they were standardized in 2015. Ethereum uses Keccak-256, the original version of the algorithm that SHA-3 is based on, which differs slightly from the final standard. And there are other Cryptographic Hashes available, such as BLAKE2…
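Python's hashlib includes SHA-3 and BLAKE2 alongside SHA-2, so the three can be compared side by side. One caveat: hashlib's sha3_256 implements the standardized SHA-3, so its output will not match the Keccak-256 hashes that Ethereum produces. This is only a quick sketch with an arbitrary input.

```python
import hashlib

message = b"hello"

sha2 = hashlib.sha256(message).hexdigest()    # SHA-2 family
sha3 = hashlib.sha3_256(message).hexdigest()  # standardized SHA-3 (not Ethereum's Keccak-256)
blake = hashlib.blake2s(message).hexdigest()  # BLAKE2s, 256-bit output by default

for name, value in (("SHA-256", sha2), ("SHA3-256", sha3), ("BLAKE2s", blake)):
    print(f"{name:9} {value}")
```

All three produce 64 hexadecimal characters for the same input, but the characters are completely different because each one follows different steps.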

Long story short — security and Cryptographic Hashes will continue to evolve to secure data from strong arm hackers using ever more powerful computers. The specific types of Hashes used in Blockchain, how those Hashes are incorporated, and the architecture of Blockchain Protocols will evolve as security and the Blockchain landscape changes.

Special thanks to Vincent Lynch and his article “Re-Hashed: The Difference Between SHA-1, SHA-2 and SHA-256 Hash Algorithms”. For more information about Hashes, you may want to check out this post on Medium by Raul Jordan. Another great Blockchain resource is Blockgeeks, including their article on hashing. (Of course, any mistakes in this article are mine.)