EVM Istanbul storage pricing

or how to hack the EVM to spend half the gas when handling data

Agustin Esteban Aguilar

Storing data in smart contracts on the EVM is costly, not only in GAS but also in performance, a cost paid by the users who host Ethereum nodes.

When a smart contract writes data to its storage, the node processing the TX not only has to save that data to its storage space (HDD or SSD), it also has to re-calculate the Merkle root of the state; this process involves reading the disk multiple times and then writing multiple updated values back to the disk.

To keep the TX execution balanced, the EVM charges a significant gas cost for each storage read and write operation (SLOAD and SSTORE).

Currently, the cost is approximately the following:

SLOAD: 200

SSTORE: 20000–5000

However, following the Istanbul hard fork scheduled for next year, the gas costs of those opcodes are going to be:

SLOAD: 800

SSTORE: 20000–5000

So, why is SLOAD increasing its cost by 4x? That’s because the computational cost of reading a value from storage is not constant; it grows with the total size of the state.

In simpler words, it’s more difficult for the node to read and write storage as the blockchain size increases. You can read more about this issue in the EIP that increases the gas cost, https://eips.ethereum.org/EIPS/eip-1884.


But, what if we find another way?

Reading the discussion thread of the EIP1884 implementation, I came across an idea: @jochem-brouwer points out that reading the bytecode of a contract using EXTCODECOPY currently costs only 700 GAS + 3 GAS * word. This is, in fact, lower than the cost of reading storage after the Istanbul update, but it can be easily patched by increasing the cost of executing EXTCODECOPY to 700 GAS + 800 GAS * word.

Still, there is another, more abstruse way of reading data from the Ethereum state. When a contract is executed, its bytecode has to be loaded into memory by the node. This process costs exactly 0 GAS, and it has even been an issue in the past, when someone exploited it to slow down Ethereum nodes; this was addressed by limiting the total size of a contract to 24576 bytes.

This property of the EVM can also be used to load data while spending less GAS. In those 24576 bytes of bytecode, we can store a smart contract whose only purpose is to return 21248 bytes of data; we can then read the data by just executing the code (instead of using SLOAD or EXTCODECOPY). That means that reading each word will cost us only around 14 GAS, instead of 700 GAS or 800 GAS.
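To put rough numbers on this comparison, here is a small per-read gas model in Python. It is only a sketch: the 800 GAS, 700 GAS + 3 GAS * word and 14 GAS per word figures come from the text above, while the 700 GAS base cost for calling the data contract is an assumption added here. Memory expansion and Solidity overhead are ignored, so the model will not reproduce measured numbers exactly.

```python
# Rough gas model for reading n_bytes of data (illustrative only).
SLOAD_GAS_ISTANBUL = 800      # per 32-byte word, from EIP-1884
EXTCODECOPY_BASE_GAS = 700    # from the text above
EXTCODECOPY_WORD_GAS = 3      # per word copied, from the text above
CALL_BASE_GAS = 700           # assumption: base cost of calling the data contract
EXEC_WORD_GAS = 14            # article's estimate for executing one word's opcodes


def words(n_bytes: int) -> int:
    """Number of 32-byte words needed to hold n_bytes."""
    return (n_bytes + 31) // 32


def read_cost_sload(n_bytes: int) -> int:
    """One SLOAD per word of data."""
    return SLOAD_GAS_ISTANBUL * words(n_bytes)


def read_cost_extcodecopy(n_bytes: int) -> int:
    """Copy the data out of another contract's bytecode."""
    return EXTCODECOPY_BASE_GAS + EXTCODECOPY_WORD_GAS * words(n_bytes)


def read_cost_execute(n_bytes: int) -> int:
    """Call the data contract and let it push and return the data."""
    return CALL_BASE_GAS + EXEC_WORD_GAS * words(n_bytes)


if __name__ == "__main__":
    for size in (32, 96, 21248):
        print(size, read_cost_sload(size), read_cost_extcodecopy(size), read_cost_execute(size))
```

Under these assumptions, reading the full 21248 bytes costs 531200 GAS with SLOAD and 9996 GAS by executing the data contract.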

Such a smart contract would look something like this:

Data is pushed into memory using PUSH32, and we return all the data that’s in memory
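The layout described above can be sketched in Python as follows. This is a minimal illustration, not the bytecode the library actually emits: the function name build_data_contract and the choice of PUSH2 for every memory offset are assumptions made here. With this layout each word takes 37 bytes of code (PUSH32 + value, PUSH2 + offset, MSTORE), which is consistent with 21248 bytes of data fitting under the 24576-byte limit.

```python
# EVM opcodes used by the sketch.
PUSH1, PUSH2, PUSH32, MSTORE, RETURN = 0x60, 0x61, 0x7F, 0x52, 0xF3


def build_data_contract(data: bytes) -> bytes:
    """Build runtime bytecode that writes `data` to memory and returns it.

    Hypothetical helper: pads data to a multiple of 32 bytes, then emits
    PUSH32 <word> PUSH2 <offset> MSTORE for each word, followed by
    PUSH2 <size> PUSH1 0 RETURN.
    """
    padded = data + b"\x00" * (-len(data) % 32)
    if len(padded) > 0xFFFF:
        raise ValueError("data too large for PUSH2 offsets")

    code = bytearray()
    for offset in range(0, len(padded), 32):
        code.append(PUSH32)
        code += padded[offset:offset + 32]   # value to store
        code.append(PUSH2)
        code += offset.to_bytes(2, "big")    # memory offset (top of stack)
        code.append(MSTORE)

    # RETURN pops the offset first, then the size, so push size before offset.
    code.append(PUSH2)
    code += len(padded).to_bytes(2, "big")
    code += bytes([PUSH1, 0x00, RETURN])
    return bytes(code)


if __name__ == "__main__":
    runtime = build_data_contract(bytes(21248))
    print(len(runtime), "bytes of runtime code")
```

Note that this is only the runtime code; deploying it still requires creation code that returns these bytes.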

Now let’s test it. I’ve created a Solidity library that can be used to store and retrieve data using this method, so we can compare our custom “ChunkStorage” method against the standard way of storing data using SSTORE and SLOAD.

I used the Ropsten network to test the library. Ropsten has been running the Istanbul branch since September 30, 2019, so the tests include the increased cost of 800 GAS for each SLOAD.

ChunkStorage test contract:

https://ropsten.etherscan.io/address/0x1ffb894f59b9b6288b504f6169bfc50e69d0cbbc

RegularStorage test contract:

https://ropsten.etherscan.io/address/0xb59c1dcc51ac41e10ef66cd93b3dcd3072860273

ChunkStorage (blue) vs Standard (red) — Lower is better

It’s more expensive to use the ChunkStorage library if we are storing chunks below 96 bytes. This happens because the library has to maintain an internal mapping from each storage entry to its corresponding contract, which adds a fixed gas cost every time a chunk is written.

As soon as the stored data crosses the 96-byte threshold, we start to see that our custom storage implementation is cheaper than using the contract storage. When the stored data chunk is 96 bytes long, we save 1834 GAS on each read and 4553 GAS on the write.

Beyond 96 bytes

Let’s say that we need to store a large chunk of data in our smart contract. Most of the time storing data like that is not a good idea, but let’s imagine that we need to store a big struct, a proof of some sort, or even better, the tragedy of Darth Plagueis The Wise…

Did you ever hear the tragedy of Darth Plagueis The Wise? I thought not. It’s not a story the Jedi would tell you. It’s a Sith legend. Darth Plagueis was a Dark Lord of the Sith, so powerful and so wise he could use the Force to influence the midichlorians to create life… He had such a knowledge of the dark side that he could even keep the ones he cared about from dying. The dark side of the Force is a pathway to many abilities some consider to be unnatural. He became so powerful… the only thing he was afraid of was losing his power, which eventually, of course, he did. Unfortunately, he taught his apprentice everything he knew, then his apprentice killed him in his sleep. Ironic. He could save others from death, but not himself.

If we use SSTOREs, we can store the text using 536717 GAS, but if we use the ChunkStorage library, it costs us only 269361 GAS; that’s a difference of 267356 GAS, almost half of what we spend using the regular contract storage.

Also, how much would it cost us to read the text? 25703 GAS; that’s 19368 GAS less each time we need to read the data.
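The savings quoted above can be checked with a few lines of Python. The four gas figures are the measurements reported in the text; the regular-storage read cost is not stated directly and is derived here as the ChunkStorage read cost plus the quoted saving.

```python
# Gas figures reported in the text for the Darth Plagueis example.
SSTORE_WRITE_GAS = 536717   # write with regular contract storage
CHUNK_WRITE_GAS = 269361    # write with ChunkStorage
CHUNK_READ_GAS = 25703      # read with ChunkStorage
READ_SAVING_GAS = 19368     # quoted saving per read

write_saving = SSTORE_WRITE_GAS - CHUNK_WRITE_GAS
write_saving_pct = 100 * write_saving / SSTORE_WRITE_GAS

# Derived, not measured: implied read cost with regular storage.
regular_read_gas = CHUNK_READ_GAS + READ_SAVING_GAS
read_saving_pct = 100 * READ_SAVING_GAS / regular_read_gas

print(f"write saving: {write_saving} GAS ({write_saving_pct:.1f}%)")
print(f"read saving:  {READ_SAVING_GAS} GAS ({read_saving_pct:.1f}%)")
```

So the write is roughly half the price, and each read is a little over 40% cheaper.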

Is it possible to learn this power?

The library ChunkStorage generates the custom bytecode for the “data-contract” and deploys it, making the process transparent and providing a standard interface, rendering it trivial to integrate into already existing code.
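Deploying a data-contract means sending creation code that returns the runtime bytecode. One common pattern for that wrapper is sketched below in Python. This is an assumption about how such a deployment could be done, not the code of ChunkStorage.sol, and wrap_in_init_code is a hypothetical helper name.

```python
# EVM opcodes used by the sketch.
PUSH1, PUSH2, DUP1, CODECOPY, RETURN = 0x60, 0x61, 0x80, 0x39, 0xF3

INIT_LEN = 12            # length in bytes of the prefix built below
MAX_CODE_SIZE = 24576    # contract size limit mentioned in the article


def wrap_in_init_code(runtime: bytes) -> bytes:
    """Prefix `runtime` with creation code that copies it to memory and returns it."""
    if len(runtime) > MAX_CODE_SIZE:
        raise ValueError("runtime code exceeds the contract size limit")

    size = len(runtime).to_bytes(2, "big")
    init = (
        bytes([PUSH2]) + size            # stack: size
        + bytes([DUP1])                  # stack: size, size
        + bytes([PUSH1, INIT_LEN])       # code offset where the runtime starts
        + bytes([PUSH1, 0x00])           # memory destination 0
        + bytes([CODECOPY])              # memory[0:size] = code[INIT_LEN:]
        + bytes([PUSH1, 0x00, RETURN])   # return memory[0:size] as the contract code
    )
    assert len(init) == INIT_LEN
    return init + runtime


if __name__ == "__main__":
    print(wrap_in_init_code(bytes(43)).hex())
```

The node then stores whatever the creation code returns as the code of the new contract.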

ChunkStorage.sol library

Disclaimer: The library code is neither optimized nor audited; it may contain serious bugs, and it’s not production-ready. I discourage anyone from using any hack like this one in a real-life application. The library is provided as an example and for explanatory purposes.


Conclusion

This experiment shows that some opcodes and operations on the Ethereum EVM are not adequately priced gas-wise. This, however, is not a critique: finding the correct value for all the parameters in a system as complex as the EVM is a hard process, and sometimes compromises have to be made.

I am not a core-dev myself, but in my opinion, EIP1884 is the wrong way of fixing the SLOAD pricing; it increases the pressure for this kind of “hack” to save gas, and it may break some dApps in the process (because of the increased gas cost).

