Efficient, Usable, And Cheap Storage of IPFS Hashes In Solidity Smart Contracts

Hello world I am Alexandre Trottier RTrades Technologies CTO. Recently while working on a new project, I encountered the need to store IPFS Hashes, in particular CIDv1 type on the Ethereum blockchain, in a user-friendly but also gas efficient manner.

Why CIDv1? Well CIDv0 uses sha2–256, whose output size does not fit into a single bytes32 storage slot. From what I've seen in the wild, anytime people need to store IPFS hashes in smart contracts, it's almost always a hash of the IPFS CID stored in a bytes32 storage variable, or the IPFS hash itself stored in a string storage variable. Working with CIDv1 does not suffer from this :)

While there is nothing wrong with either of those two approaches, they are suboptimal. Hashing a hash simply to fit into a single storage slot makes it significantly harder to consume and is only worthwhile if you need to do so for security reasons. Storing the IPFS hash in a string storage variable is very expensive in general, and also suboptimal.

The trick here is being able to fit into as few, but fully occupied storage slots as possible (efficient), while being easy to consume (usable) and using minimal gas (cheap). This sounds simple in theory as all you need to do is find a hash function that takes up a single bytes32 storage variable, or two bytes32 storage variables, but because IPFS uses multiformats (or more appropriately, multihash) this isn't as easy as it sounds. To figure this out, we're going to need to talk a bit about multihash.


Multihash

Multihash is a subset of the multiformat specification and allows us to create self describing hashes. The format for multihash looks like (image is from github.com/multiformats/multiformats):

So given the above, this means that even if you were able to find a hash function whose output is 32 bytes, multihash would then add an additional 4 bytes onto that for a total of 36 bytes. This means we then need to store it in a bytes storage variable which has a single word overhead.


Solution

This then lead me to the question, what if we can find a hash function which when in multihash format takes up a single bytes32 storage slot we would be good to go! Well not quite, the only multihash that did this was blake2b-136, which go-ipfs nodes don't accept by default due to security risks... yikes!

After several hours of different experimentation (aka blindly trying multihashes), I was finally able to find a hash function which when in multihash the output is 64 bytes, which means we can store this in exactly two bytes32 storage slots, completely filling two slots and not wasting any. The multihash is blake2b-328. So to use this, all we need to do is take the 64 bytes output, split it in two and we're good to go!


Example

To keep things short, I’ll demonstrate the most optimal solution I found after trying a few different combinations and ways to store two bytes32 storage variables.

The first step is two define two parts of the hash, hashPart1 and hashPart2. In order to store our IPFS hashes here, we need to take the 64 bytes output of the blake2b-328 multihash, split it in half, storing each half within a 2 element array of bytes32 type, passing that into the function updateHash.

Now, whenever we want to consume this data, all we have to do is call getHash which will return the complete hash in bytes type. If you're consuming this in a mobile phone DApp, then all you need to do is convert to string, which in golang would be string(returnedBytes) and you have your IPFS Hash in plaintext!

So does this actually save gas, or did I just waste your time?


Gas Consumption

To test gas consumption, i wrote the following fairly ugly test contract to measure gas consumption. To get gas costs, I would check the CumulativeGasUsed field of the transaction receipt. Tests were ran on a local private PoA chain on my laptop with a 1sec block time, using geth 1.8.27-stable.

The functions updateLink, updateLinkHash and updateLinkParts are used to test gas costs from different ways of storing data in two bytes32 storage slots. The function setCID is used to test gas costs for storing a hashed IPFS hash.. The function setCIDString was used to test gas costs from storing the plaintext (aka string) version of the IPFS hash.

and the results of cumulative gas usage from the above contract is

Initially you might be looking at the gas cost for setCID and start thinking that I just wasted your precious time. However, we need to consider the fact that this isn't actually just the IPFS hash. It is a hash, of the IPFS hash. So while this may be gas efficient, it is not easy to consume outside of smart contracts, and is abysmal at best to consume within other smart contracts because:

  1. We need to store a plaintext copy of the hash somewhere accessible by the smart contract (storage)
  2. We need to read the plaintext data from storage, hash it, then compare the two hashed hashes.

Now after considering that, the gas prices for the hash storage methods being talked about here (66071 -> 66360 depending on the method being used), combined with the fact that there you can store+consume the hashes as is, seems pretty useful in my eyes.


Takeways

  • Use blake2b-328
  • Cast to bytes
  • Store in 2 bytes32 storage slots
  • Profit!

Thank you and a big shout out to everyone contributing to IPFS and all the great work that is be done by many different projects!

RTrade’s online community, Twitter or Telegram and website. Dont forget to show Temporal some love on Github!

Also very happy to announce that v2.1.0 of Temporal is out!
Highlights of release:
- go-ipfs v0.4.20
- ipfs-cluster v0.10.1
- gomod support

Temporal: A versatile easy to use tool for companies with large amounts of data to secure, store and track. The platform can be used as is, or customarily built to manage and deploy blockchain-based applications and non-blockchain data-storage solutions for any enterprise.

Temporal Features:

Full Featured Pinning Service w/ Free 3GB/Monthly, 5 Free IPNS record creation a month, 100 Free pubSub messages a month and 5 Free IPFS keys

Interface walk-through

Full Service IPFS API

Temporal-JS SDK Full public IPFS and IPNS usage

IPFS Gateway

I2P IPFS Gateway access
 
Installing your own Temporal

Also the Usages and Features section of the README.md doc on the GitHub repository covers using the docker compose file to spin up the environment.