Understanding One Way Hash Functions #HowToBUIDL (5/n)

Hashing: A Building Block for Every Cryptographic System

Dan Emmons
Published in Coinmonks
5 min read, May 22, 2018


Ethereum makes significant use of the Keccak/SHA-3 algorithm.

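Specifically, Ethereum and Solidity use Keccak-256. This is the original Keccak submission that SHA-3 was based on, not the finalized NIST standard (FIPS 202): the standard changed the padding, so SHA3-256 and Keccak-256 produce different digests for the same input.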

This seems to be a trade secret among Dapp Developers for some reason.
Although hashing is absolutely a key building block of all cryptographic systems, we do not have to reinvent the wheel in order to write systems and smart contracts that run on top of cryptographic platforms.
You are not required to understand how SHA-3 works. I repeat:
You are not required to understand how SHA-3 works. Once more:
You are not required to understand how SHA-3 works. Understood? Good. Now…
You SHOULD understand why hashing is used.

The sponge construction for hash functions. Pi are the input blocks, Zi are the hashed output blocks. The unused “capacity” c should be twice the desired resistance to collision or preimage attacks.

SHA-3 is a subset of the cryptographic primitive family Keccak. SHA-3 uses the Sponge construction in which data is absorbed into the sponge, then the result is squeezed out.
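To make the absorb-then-squeeze idea concrete, here is a minimal pure-Python sketch of a Keccak sponge with a 136-byte rate, which is the parameter set behind both Keccak-256 and SHA3-256. It is a teaching sketch, not a production implementation: the function names (`rol64`, `keccak_f1600`, `permute`, `sponge`, `keccak256`, `sha3_256_sponge`) are my own, and the only thing that distinguishes the two hashes here is the padding suffix byte (0x01 for the original Keccak, 0x06 for the NIST standard).

```python
import hashlib

MASK64 = (1 << 64) - 1

def rol64(a, n):
    """Rotate a 64-bit lane left by n bits."""
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK64

def keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 grid of lanes."""
    r = 1  # LFSR state used to generate the round constants
    for _ in range(24):
        # theta: mix each column's parity into the neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate every lane and move it to a new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: the non-linear step, applied row by row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: XOR a round constant into lane (0, 0)
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes

def permute(state):
    """Run Keccak-f[1600] over a 200-byte state of little-endian 64-bit lanes."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lanes = keccak_f1600(lanes)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out

def sponge(data, suffix, rate_bytes=136, out_len=32):
    """Absorb `data` block by block, pad, then squeeze out `out_len` bytes.

    Only handles out_len <= rate_bytes and the suffixes 0x01 / 0x06.
    """
    state = bytearray(200)
    offset = 0
    # Absorb: XOR each full block into the state, then permute.
    while len(data) - offset >= rate_bytes:
        for i in range(rate_bytes):
            state[i] ^= data[offset + i]
        state = permute(state)
        offset += rate_bytes
    # Absorb the final partial block and apply the padding.
    tail = data[offset:]
    for i, byte in enumerate(tail):
        state[i] ^= byte
    state[len(tail)] ^= suffix
    state[rate_bytes - 1] ^= 0x80
    state = permute(state)
    # Squeeze: the digest is the first out_len bytes of the state.
    return bytes(state[:out_len])

def keccak256(data):
    """Original Keccak-256 padding (suffix 0x01), the variant Ethereum uses."""
    return sponge(data, 0x01)

def sha3_256_sponge(data):
    """NIST SHA3-256 padding (suffix 0x06)."""
    return sponge(data, 0x06)

# Sanity check of the sketch against the standard library's SHA3-256.
assert sha3_256_sponge(b"abc") == hashlib.sha3_256(b"abc").digest()
print(keccak256(b"abc").hex())
```

The cross-check against `hashlib.sha3_256` is the reason for exposing the suffix as a parameter: Python's standard library ships SHA3-256 but not the original Keccak-256, so agreeing with it on the 0x06 suffix is an inexpensive way to gain confidence in the permutation itself.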

A hash function is any function that can be used to map data of arbitrary size to data of fixed size. The values returned by the hash function are called hash values, hash codes, digests, or simply hashes.

A one-way hash function exhibits the following properties:
A) Irreversible: it is computationally infeasible to determine the message from its digest.
B) Collision resistant: it is impractical to find more than one message that produces a given digest.
C) High avalanche effect: any small change in the input causes a significant change in the digest.
D) Deterministic: the same input must always map to the same output.
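Two of these properties, determinism and the avalanche effect, are easy to observe directly. The sketch below is an illustration under a stated assumption: it uses `hashlib.sha3_256` from Python's standard library as a stand-in, so the digests will not match Ethereum's Keccak-256, but the behavior being shown is the same. The sample strings and the helper names `digest_hex` and `differing_bits` are hypothetical.

```python
import hashlib

def digest_hex(message: str) -> str:
    """Hex digest of a UTF-8 string (SHA3-256 as a stand-in for Keccak-256)."""
    return hashlib.sha3_256(message.encode("utf-8")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

first = digest_hex("hashing")
again = digest_hex("hashing")    # same input, hashed a second time
changed = digest_hex("hashinG")  # one character changed

# Deterministic: hashing the same input twice gives the same digest.
print("same digest for same input:", first == again)

# Avalanche: a one-character change flips a large share of the 256 output bits.
print(differing_bits(first, changed), "of 256 bits differ")
```

For a well-behaved hash you would expect roughly half of the output bits to differ between the two digests, which is what makes a small edit to the input impossible to hide.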

Analogy: Baking a Cake
You have butter, sugar, eggs, flour, salt, baking powder, milk, vanilla, and frosting. You mix all those ingredients together and heat the result in the oven. Out comes a cake. But can you take a cake and go back and pull out each of the individual ingredients? Imagine how difficult it would be to re-extract an egg from the cake. For all intents and purposes, impossible.

Because the same input always produces the same output, we can do interesting things like derive a specific wallet address from a given public key: the same public key maps to the same wallet address every time.

Deterministic Digital Fingerprints.

hashing function creates a fingerprint

We can also create digital fingerprints of documents and compare the hash provided by a document’s author with an independently calculated hash of the fully readable document. As long as the hashes match, we know that the document has not been tampered with by an intermediary that delivered the message. Even a tiny change has a large avalanche effect on the fingerprint, making it nearly impossible to slip in a change undetected and very easy to tell when a bad actor is in the system. We’ve discussed hashing as a mechanism for mining and Proof of Work in the past, but now we’re more interested in the context of how we might want to make use of hashing in our smart contracts.
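The fingerprint comparison described above can be sketched in a few lines. This is a minimal illustration, assuming SHA3-256 from Python's standard library as a stand-in for Keccak-256; the function names `fingerprint` and `is_untampered` and the sample document are hypothetical.

```python
import hashlib

def fingerprint(document: bytes) -> str:
    """Hex fingerprint of a document (SHA3-256 as a stand-in for Keccak-256)."""
    return hashlib.sha3_256(document).hexdigest()

def is_untampered(document: bytes, published_fingerprint: str) -> bool:
    """Recompute the fingerprint independently and compare it to the author's."""
    return fingerprint(document) == published_fingerprint.lower()

# The author publishes the fingerprint alongside the document.
original = b"I agree to pay Bob 10 ETH."
published = fingerprint(original)

# The recipient hashes whatever actually arrived and compares.
print(is_untampered(original, published))
print(is_untampered(b"I agree to pay Bob 90 ETH.", published))
```

Note that the recipient never has to trust the delivery channel for the document itself, only the channel over which the fingerprint was published.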

Let’s try this out. Fire up the truffle develop console:

The web3.js library is well documented.
Let’s try out sha3 with various sample inputs:

Some online tools are available for you to try out, if you don’t want to code…
https://emn178.github.io/online-tools/keccak_256.html

Some quick things to note: 1) Even changing the g to G had a massive change in the output. 2) No matter how long the input message to the sha3 function was, the output was always 64 hex characters or 32 bytes. Solidity conveniently provides us with a data type for storing this: bytes32.
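The fixed output size is easy to confirm outside the console too. The sketch below assumes SHA3-256 from Python's standard library as a stand-in for Keccak-256 (same 32-byte output, different digests), and the helper name `digest_of_chunks` is hypothetical. It also feeds the input in chunks, which is how you would typically hash something too large to hold in memory at once.

```python
import hashlib

def digest_of_chunks(chunks) -> bytes:
    """Hash an iterable of byte chunks incrementally and return the raw digest."""
    hasher = hashlib.sha3_256()
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.digest()

# Inputs from empty to one million bytes all produce a 32-byte digest.
for size in (0, 1, 1_000, 1_000_000):
    digest = digest_of_chunks([b"a" * size])
    print(size, "input bytes ->", len(digest), "bytes,", len(digest.hex()), "hex characters")
```

Feeding the data in several chunks gives the same digest as hashing it in one piece, so the choice of chunk size affects memory use only, not the result.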

Take, for example, the entire Adventures of Sherlock Holmes. If you hash the entire book, the resulting keccak256 / SHA3 hash is

This has interesting consequences. Since we know that the hash function is deterministic, storing hashed data in a smart contract’s storage means that the digital fingerprint recorded on the public blockchain can be compared with an original document and independently verified at a later date. Additionally, we know there is a high avalanche effect, so even one bit changed in the input data results in a drastic change in the output data. Perhaps this will be useful for determining the provenance of data, or the record of ownership of a work of art, used as a guide for authenticity/quality.

A very simple example smart contract would record the msg.sender of the first person to record a unique fingerprint passed to the contract. If an entry already exists, a storeFingerprint function would throw an exception. An alternative function would take an arbitrary-length data string and calculate the keccak256 of the data itself. It would cost the client of the contract much less gas to pre-compute the sha3 hash, but we are providing both function options for convenience.
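The logic of such a contract can be modeled off-chain before writing any Solidity. The following is a plain Python model, not a deployable contract, and everything in it is an assumption for illustration: the class name `FingerprintRegistry`, the method names, the `sender` strings standing in for msg.sender, and the use of SHA3-256 from the standard library in place of keccak256.

```python
import hashlib

class FingerprintRegistry:
    """In-memory model of the contract's storage: fingerprint -> first sender."""

    def __init__(self):
        self._owners = {}

    def store_fingerprint(self, sender: str, fingerprint: bytes) -> None:
        """Record a pre-computed 32-byte fingerprint; reject duplicates."""
        if len(fingerprint) != 32:
            raise ValueError("fingerprint must be exactly 32 bytes (bytes32)")
        if fingerprint in self._owners:
            # Models the contract throwing when an entry already exists.
            raise ValueError("fingerprint already recorded")
        self._owners[fingerprint] = sender

    def store_data(self, sender: str, data: bytes) -> bytes:
        """Convenience variant: hash arbitrary-length data, then record it."""
        fingerprint = hashlib.sha3_256(data).digest()  # stand-in for keccak256
        self.store_fingerprint(sender, fingerprint)
        return fingerprint

    def owner_of(self, fingerprint: bytes):
        """Return the first sender who recorded this fingerprint, or None."""
        return self._owners.get(fingerprint)

registry = FingerprintRegistry()
recorded = registry.store_data("0xAlice", b"scan of my painting")
print(registry.owner_of(recorded))
```

In this model both entry points end up in the same storage slot, which mirrors the trade-off in the text: the caller either pays to ship the full data and have it hashed, or hashes locally and sends only 32 bytes.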

Dan Emmons is owner of Emmonspired LLC, a Certified Bitcoin Professional, Certified Ethereum Developer, Full Stack Developer and Advisor on Cryptocurrency projects. He is also the creator of a YouTube Channel and iTunes Podcast called #ByteSizeBlockchain.
