Ethereum under the hood: Part 6 (Hashing)

디팍
Nov 7 · 6 min read

Back from a very long break and hoping to keep pushing on this journey of going under the hood of the technology and concepts underneath Ethereum.

There are many changes in the world of Ethereum, and I might cover some of those topics as we move along; for the curious, have a look at this excellent link about the next generation of Ethereum, labeled ETH 2.0. One more note: I am dropping the Fortnite theme. This chapter is going to be brief but essential, and I will be adding useful links as we cover:

  1. What is Hashing?
  2. Why Hashing?
  3. Keccak Cryptographic hash function
  4. Ethereum keys and values
  5. Summary
  6. Onward

What is Hashing:

Hashing applies a function that, given an input string of any length, will always return an output of a fixed length called a “hash”. I am simplifying this concept into a few paragraphs, but that is the essence of a hash function, as depicted in the picture below:

Plain text converted to a hash value using a hash function
Source:cheapsslsecurity.com
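
To see the fixed-length property in action, here is a minimal Python sketch. It uses SHA-256 from the standard library purely as an example hash function, and the helper name `sha256_hex` is my own invention, not something from a particular library.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Hash a string with SHA-256 and return the digest as hex text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Inputs of very different sizes all map to a 64-character hex digest (32 bytes).
for text in ["a", "hello", "hello" * 1000]:
    digest = sha256_hex(text)
    print(len(text), len(digest), digest[:16] + "...")
```

Whatever the input length, the digest length stays the same.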

For additional reference regarding hashing, refer to:

Why Hashing?:

Hashing provides some advantages over searching plain lists or arrays, due to its two main properties: 1) speed of retrieving data, which remains almost constant regardless of the size of the data, and 2) using less disk space to store the dataset. Here is an excellent reference that highlights why to use hashing.

Hash algorithms are faster than lists or arrays and use less disk space
Lecture 23: Intro to Hashing (cs.cmu.edu)
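
A rough way to feel the speed difference is to count the work done, sketched here in Python. The helper `linear_search_steps` and the sample keys are made up for illustration; Python's built-in `dict` is a hash table, although it uses a non-cryptographic hash.

```python
def linear_search_steps(items, target):
    """Return how many comparisons a front-to-back scan needs to reach target."""
    for steps, item in enumerate(items, start=1):
        if item == target:
            return steps
    return len(items)

keys = [f"key_{i}" for i in range(100_000)]
table = {k: i for i, k in enumerate(keys)}  # hash-based lookup table

target = keys[-1]
steps_for_list = linear_search_steps(keys, target)  # grows with the size of the list
value_from_table = table[target]                    # located by hashing the key
print(steps_for_list, value_from_table)
```

The scan has to walk the whole list to reach the last key, while the hash-based table jumps to the entry by hashing the key, however large the table gets.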

Keccak Cryptographic hash function:

With our knowledge of hashing, let’s discuss the cryptographic hash function a little and, in some minor detail, the Keccak cryptographic hash function.

A cryptographic hash function is a particular type of hash function with some additional features. Applying a cryptographic hash function to a key gives a result that is:

  • Collision-resistant: it is computationally infeasible to find two different values that produce the same hash
  • Secure: the output looks random, so the input cannot feasibly be recovered from the hash
  • Deterministic: the same input to the hash function will always result in the same message digest

There is a difference between a hash function and a cryptographic hash function; the StackExchange discussion referenced below provides an insightful distinction.

Let’s look at the grumpy cat picture below, which is the input. After applying a cryptographic hash function, the output is a hash:

A Grumpy cat .jpg file as an input to the cryptographic hash function
Source: Manning.com

A slight change in our grumpy cat picture will result in a completely different hash:

Grumpy cat picture with a missing picture will result in different hash
Source: Manning.com
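
The same effect can be sketched in Python with any bytes standing in for the picture. SHA-256 is used here only because it ships with the standard library; the sample byte strings and the helper names `digest_bits` and `differing_bits` are my own.

```python
import hashlib

def digest_bits(data: bytes) -> int:
    """Return the SHA-256 digest of data as one big integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 256 digest bits differ between two inputs."""
    return bin(digest_bits(a) ^ digest_bits(b)).count("1")

original = b"grumpy cat picture bytes"
modified = b"grumpy cat picture byteS"  # only the last character changed

print(hashlib.sha256(original).hexdigest())
print(hashlib.sha256(modified).hexdigest())
print(differing_bits(original, modified), "of 256 bits differ")
```

Changing a single character is enough to produce a digest that looks unrelated to the first one.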

The Ethereum yellow/white paper specifies a cryptographic hash function, “Keccak”, to satisfy those essential properties.

Keccak-256 is a cryptographic hash function built on the sponge construction, with speed as one of its design goals. Similar to a sponge, Keccak absorbs, and is then squeezed to release. For a given message “M”, the Keccak hash function performs a series of “absorb” and “squeeze” steps at a certain rate; the result is the message hash Z, as depicted in the diagram below. Notice the absorb and squeeze operations.

Message flowing into a series of absorb and squeeze function
source: Fig 2.1 Keccak Paper

Here is my representation of the key being absorbed and released

Keccak Sponge function with absorb and release
Keccak sponge function
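
To make the absorb and squeeze steps concrete, here is a toy sponge sketched in Python. It is not Keccak: the rate and capacity sizes are made-up toy values, and the mixing step `toy_f` is a stand-in built on SHA-256, where real Keccak uses its own Keccak-f permutation. All function names are hypothetical.

```python
import hashlib

RATE = 8                 # bytes absorbed or squeezed per step (toy value)
CAPACITY = 24            # hidden part of the state, never output directly (toy value)
WIDTH = RATE + CAPACITY  # total state size: 32 bytes

def toy_f(state: bytes) -> bytes:
    """Stand-in mixing step; real Keccak applies the Keccak-f permutation here."""
    return hashlib.sha256(state).digest()  # 32 bytes, same as WIDTH

def pad(message: bytes) -> bytes:
    """Pad to a multiple of RATE: a 0x01 marker, zeros, and a final 0x80 marker."""
    pad_len = RATE - (len(message) % RATE)
    padding = bytearray(pad_len)
    padding[0] |= 0x01
    padding[-1] |= 0x80
    return message + bytes(padding)

def toy_sponge(message: bytes, out_len: int = 32) -> bytes:
    state = bytes(WIDTH)  # start from an all-zero state
    padded = pad(message)
    # Absorb: XOR each block into the first RATE bytes, then mix the whole state.
    for i in range(0, len(padded), RATE):
        block = padded[i:i + RATE]
        mixed = bytes(s ^ b for s, b in zip(state[:RATE], block))
        state = toy_f(mixed + state[RATE:])
    # Squeeze: read RATE bytes at a time, mixing between reads.
    out = b""
    while len(out) < out_len:
        out += state[:RATE]
        if len(out) < out_len:
            state = toy_f(state)
    return out[:out_len]

print(toy_sponge(b"M").hex())
```

The capacity bytes are mixed on every step but never written to the output, which is the part of the state that a sponge keeps hidden.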

Try to spend some time on this insightful discussion on StackExchange regarding Keccak256 cryptographic hash function for additional details:

Note: Ethereum 2.0 will be switching to SHA-256

Ethereum keys and values:

Ethereum stores its data in a lookup table of keys and values. When performing those lookups, speed and security are important priorities, and cryptographic hash functions provide a way to achieve both. Back to the Yellow Paper specification:

The Yellow Paper, which uses the Keccak hash function for keys and the RLP function for values.
Ref: Yellow Paper( page: 4 )

Let us expand the above expression into a simple key/value table, which translates to something like the table below. The key size, according to the specification, is 32 bytes.

+----------------+------------+
| Keccak( Keys ) | RLP( Val ) |
+----------------+------------+
| HASH 1         | Value 1    |
| HASH 2         | Value 2    |
| HASH 3         | Value 3    |
+----------------+------------+
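
The table above can be sketched as a small Python dictionary. Two assumptions to flag loudly: Python's standard library has no Keccak-256, so `key_hash` uses SHA3-256 as a stand-in (it also returns 32 bytes, but its padding differs from the original Keccak, so the digests will not match Ethereum's), and `rlp_encode_bytes` only covers the RLP rules for a single byte string, not lists. The names `store`, `put`, and `get` are hypothetical.

```python
import hashlib

def key_hash(key: bytes) -> bytes:
    # Stand-in for Keccak-256: SHA3-256 is also 32 bytes but is NOT the same function.
    return hashlib.sha3_256(key).digest()

def rlp_encode_bytes(data: bytes) -> bytes:
    """RLP-encode one byte string (lists are not handled in this sketch)."""
    if len(data) == 1 and data[0] < 0x80:
        return data                              # a single small byte encodes as itself
    if len(data) <= 55:
        return bytes([0x80 + len(data)]) + data  # short string: one prefix byte
    length_bytes = len(data).to_bytes((len(data).bit_length() + 7) // 8, "big")
    return bytes([0xB7 + len(length_bytes)]) + length_bytes + data

store = {}

def put(key: bytes, value: bytes) -> None:
    store[key_hash(key)] = rlp_encode_bytes(value)

def get(key: bytes) -> bytes:
    return store[key_hash(key)]

put(b"key 1", b"Value 1")
put(b"key 2", b"Value 2")
put(b"key 3", b"Value 3")

for hashed_key, encoded_value in store.items():
    print(hashed_key.hex()[:16] + "...", encoded_value)
```

Every key in `store` is 32 bytes long regardless of the original key, which matches the key size mentioned above.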

Assume that the table above keeps growing; one of the visible results will be slower searches. Using a hash function, and specifically a cryptographic hash function, can help perform a quick lookup for values.

In the upcoming articles, we will peek into keys, but for now, let us assume that keys are a combination of a publicly visible value and a secure private hash value.

To illustrate a collection of keys and values, I created a simple k,v pair where k=>{“12345”, “67890”} and v=>{“eth_1”, “eth_2”} in Elixir and created a hash using the SHA256 cryptographic hash function.

SHA 256 hash functions in Elixir using keys and values as a simple example
Sha256 hash function

If we translate this to a simple table of keys, values, and the hash function, it might look something like the table below, which is a loose representation of a table of hashes, aptly referred to as a “hash table”:

+-------+-------+----------------------+
| Keys  | Val   | SHA256(k,v)          |
+-------+-------+----------------------+
| 12345 | eth_1 | 9891cc393003c1...... |
| 67899 | eth_2 | 5c35ec46226817a..... |
+-------+-------+----------------------+
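
For readers without Elixir at hand, here is a rough Python sketch of the same idea. How the original example combined k and v before hashing is not shown, so joining them with a comma is purely my assumption, and the digests printed here need not match the ones in the table. The helper `hash_pair` is hypothetical.

```python
import hashlib

pairs = {"12345": "eth_1", "67890": "eth_2"}

def hash_pair(key: str, value: str) -> str:
    # Assumption: key and value are joined with a comma before hashing.
    return hashlib.sha256(f"{key},{value}".encode("utf-8")).hexdigest()

# A loose "table of hashes": each digest maps back to the pair it came from.
hash_table = {hash_pair(k, v): (k, v) for k, v in pairs.items()}

for digest, (k, v) in hash_table.items():
    print(k, v, digest)
```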

If you would like to know more about “Hash tables” I recommend referring to the article below

Summary:

  • { key, value } table search using a hash function is faster and uses less disk space.
  • A cryptographic hash function is a particular type of hash function.
  • Ethereum currently implements the “Keccak” Cryptographic hash function.
  • Keccak is a sponge-like function with Absorb and Release features.
  • Ethereum maintains a database of keys and values.

Onward:

In the next chapter, we will go in-depth about Ethereum blocks; till then, learn on.

