Efficient, Usable, And Cheap Storage of IPFS Hashes In Solidity Smart Contracts
Hello world, I am Alexandre Trottier, RTrades Technologies CTO. Recently, while working on a new project, I encountered the need to store IPFS hashes, in particular the CIDv1 type, on the Ethereum blockchain in a user-friendly but also gas-efficient manner.
Why CIDv1? Well, CIDv0 uses sha2-256, and in multihash form (a 2-byte prefix plus the 32-byte digest) it does not fit into a single bytes32 storage slot. From what I've seen in the wild, anytime people need to store IPFS hashes in smart contracts, it's almost always a hash of the IPFS CID stored in a bytes32 storage variable, or the IPFS hash itself stored in a string storage variable. Working with CIDv1 does not suffer from this :)
While there is nothing wrong with either of those two approaches, they are suboptimal. Hashing a hash simply to fit it into a single storage slot makes it significantly harder to consume, and is only worthwhile if you need to do so for security reasons. Storing the IPFS hash in a string storage variable is very expensive in general.
The trick here is to fit into as few storage slots as possible while fully occupying each one (efficient), while being easy to consume (usable) and using minimal gas (cheap). This sounds simple in theory, as all you need to do is find a hash function whose output takes up a single bytes32 storage variable, or two bytes32 storage variables. But because IPFS uses multiformats (or more appropriately, multihash), this isn't as easy as it sounds. To figure this out, we're going to need to talk a bit about multihash.
So given the above, even if you were able to find a hash function whose output is 32 bytes, multihash would then add an additional 4 bytes onto that, for a total of 36 bytes. This means we would then need to store it in a bytes storage variable, which has a single word of overhead.
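To make the slot arithmetic concrete, here is a minimal Go sketch. It assumes Solidity's usual storage layout for a dynamic bytes value: 31 bytes or fewer share one slot with the length, while longer values use one slot for the length plus one slot per 32 bytes of data. The function names are hypothetical and only serve this illustration.

```go
package main

import "fmt"

// wordSize is the size of one EVM storage slot in bytes.
const wordSize = 32

// fixedSlots returns how many bytes32 variables are needed to hold n bytes.
func fixedSlots(n int) int {
	return (n + wordSize - 1) / wordSize
}

// dynamicBytesSlots returns how many slots a Solidity `bytes` storage
// variable occupies for n bytes of data (assumed layout: short values share
// a slot with their length, long values pay one extra slot for the length).
func dynamicBytesSlots(n int) int {
	if n <= 31 {
		return 1
	}
	return 1 + fixedSlots(n)
}

func main() {
	for _, n := range []int{32, 36, 64} {
		fmt.Printf("%d bytes: %d slot(s) as bytes32 values, %d slot(s) as bytes\n",
			n, fixedSlots(n), dynamicBytesSlots(n))
	}
}
```

Under these assumptions, a 36-byte value costs three slots as a bytes variable, while a 64-byte value fits exactly into two bytes32 slots.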
This then led me to the question: what if we could find a hash function which, when in multihash format, takes up a single bytes32 storage slot? Then we would be good to go! Well, not quite: the only multihash that did this was
go-ipfs nodes don't accept by default due to security risks... yikes!
After several hours of experimentation (aka blindly trying multihashes), I was finally able to find a hash function whose output, when in multihash format, is 64 bytes, which means we can store it in exactly two bytes32 storage slots, completely filling both and not wasting any space. The multihash is blake2b-328. So to use this, all we need to do is take the 64-byte output, split it in two, and we're good to go!
To keep things short, I'll demonstrate the best solution I found after trying a few different combinations and ways to store two bytes32 storage variables.
The first step is to define two parts of the hash,
hashPart2. In order to store our IPFS hashes here, we need to take the 64-byte output of the blake2b-328 multihash, split it in half, store each half in a 2-element array of bytes32 type, and pass that into the function
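On the client side, the split described above could look like the following Go sketch. The helper name splitCID and the placeholder input are assumptions made for illustration; the resulting [2][32]byte value has the shape of a 2-element bytes32 array argument.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitCID splits a 64-byte CID string into two 32-byte halves.
// splitCID is a hypothetical helper, not part of any IPFS or Ethereum library.
func splitCID(cid string) ([2][32]byte, error) {
	var parts [2][32]byte
	if len(cid) != 64 {
		return parts, errors.New("expected a 64-byte CID string")
	}
	copy(parts[0][:], cid[:32]) // first half
	copy(parts[1][:], cid[32:]) // second half
	return parts, nil
}

func main() {
	// Placeholder 64-character value standing in for a real CID.
	cid := "z" + strings.Repeat("a", 63)
	parts, err := splitCID(cid)
	if err != nil {
		panic(err)
	}
	fmt.Printf("part 1: 0x%x\n", parts[0])
	fmt.Printf("part 2: 0x%x\n", parts[1])
}
```

Rejecting any input that is not exactly 64 bytes keeps a malformed value from being silently zero-padded into the two slots.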
Now, whenever we want to consume this data, all we have to do is call getHash, which will return the complete hash as a bytes type. If you're consuming this in a mobile phone DApp, then all you need to do is convert it to a string, which in golang would be string(returnedBytes), and you have your IPFS hash in plaintext!
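As a rough sketch of the consuming side, the Go snippet below joins two 32-byte halves into a single byte slice, which is the kind of value a getHash call would be expected to return, and then converts it to a string. The helper joinParts and the placeholder halves are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// joinParts concatenates two 32-byte halves into one 64-byte slice.
// joinParts is a hypothetical helper mirroring what the contract returns.
func joinParts(part1, part2 [32]byte) []byte {
	out := make([]byte, 0, 64)
	out = append(out, part1[:]...)
	out = append(out, part2[:]...)
	return out
}

func main() {
	var p1, p2 [32]byte
	// Placeholder halves standing in for values read from the contract.
	copy(p1[:], "z"+strings.Repeat("a", 31))
	copy(p2[:], strings.Repeat("b", 32))

	returnedBytes := joinParts(p1, p2)
	// The plain conversion is all that is needed to recover the text form.
	fmt.Println(string(returnedBytes))
}
```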
So does this actually save gas, or did I just waste your time?
To test gas consumption, I wrote the following fairly ugly test contract. To get gas costs, I checked the CumulativeGasUsed field of the transaction receipt. Tests were run on a local private PoA chain on my laptop with a 1-second block time, using
updateLinkParts are used to test gas costs of different ways of storing data in two bytes32 storage slots. The function setCID is used to test gas costs for storing a hashed IPFS hash. The function setCIDString is used to test gas costs of storing the plaintext (aka string) version of the IPFS hash.
The cumulative gas usage results from the above contract are
Initially you might look at the gas cost for setCID and start thinking that I just wasted your precious time. However, we need to consider the fact that this isn't actually the IPFS hash itself. It is a hash of the IPFS hash. So while this may be gas efficient, it is not easy to consume outside of smart contracts, and is abysmal at best to consume within other smart contracts because:
- We need to store a plaintext copy of the hash somewhere accessible by the smart contract (storage)
- We need to read the plaintext data from storage, hash it, then compare the two hashed hashes.
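The two points above can be illustrated with a small Go sketch. SHA-256 from Go's standard library is used here purely as a stand-in for whatever hash the contract applies (an assumption, since the hash function is not specified here), and the function names are hypothetical. The point is that a stored digest can only confirm a CID you already hold; it cannot give you the CID back.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// digestOf hashes a plaintext CID. SHA-256 is a stand-in for the
// contract's actual hash function, chosen only for illustration.
func digestOf(cid string) [32]byte {
	return sha256.Sum256([]byte(cid))
}

// matchesStoredDigest reports whether candidate hashes to the stored digest.
// The caller must already know the plaintext candidate to check it.
func matchesStoredDigest(stored [32]byte, candidate string) bool {
	return digestOf(candidate) == stored
}

func main() {
	stored := digestOf("placeholder-cid") // what a hashed-hash slot would hold
	fmt.Println(matchesStoredDigest(stored, "placeholder-cid"))
	fmt.Println(matchesStoredDigest(stored, "some-other-cid"))
}
```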
Now after considering that, the gas costs for the hash storage methods being talked about here (66071 to 66360, depending on the method being used), combined with the fact that you can store and consume the hashes as is, make this approach seem pretty useful in my eyes.
- Cast to bytes
- Store in 2
Thank you, and a big shout out to everyone contributing to IPFS and all the great work that is being done by many different projects!
Also very happy to announce that v2.1.0 of Temporal is out!
Highlights of release:
- go-ipfs v0.4.20
- ipfs-cluster v0.10.1
- gomod support
Temporal: A versatile, easy-to-use tool for companies with large amounts of data to secure, store, and track. The platform can be used as is, or custom built to manage and deploy blockchain-based applications and non-blockchain data-storage solutions for any enterprise.
Also the Usages and Features section of the README.md doc on the GitHub repository covers using the docker compose file to spin up the environment.