NFT on-chain metadata and why someone would want to pay >37k USD for it

Published in

CreCo XYZ

10 min read · Apr 12, 2022

Let’s talk about Ethereum NFT collectibles & on-chain NFTs — some of the myths and challenges and why we would want “on-chain NFTs”, but don’t have standards yet. Warning: it will get a little bit technical in the middle.

In our last article “Welcome to Crecoland — Where NFTs are fun again. 🐊🎪” we briefly touched on what NFTs are, how they work and their blindspot: the URL they point to, aka “off-chain metadata”.


What is on-chain metadata?

Our crecodile generator and a test crecodile — “naked creco”

To understand NFT metadata and collectible drop mechanics, let’s work with our friend the “naked” crecodile as an example. This friendly-looking buddy has 9 trait types, also called attribute types (“Background, Outfit, …”). For a better intuition, we can think of trait types as sliders (on the right) and traits as slider values. This crecodile has only two non-zero trait values: Mouth:Classic and Eyes:Classic. Trait values correspond to image layers that generate the visual representation on the left.

We can define all attributes together as the DNA of our crecodile. And we can encode this DNA by concatenating all slider positions or trait values into one combined string: “000001000300000000”.

This DNA is an encoding of the crecodile: if we give someone the rendering tool above, the DNA and a set of rules (fields are ordered, slider values have two digits padded with 0, ...), they will deterministically arrive at exactly the same visual representation by following those rules.

In other words: the DNA “000001000300000000” is enough information for someone (spoiler: or a smart contract) to know the main characteristics of the asset: “Mouth:Classic and Eyes:Classic”.
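To make the rules concrete, here is a minimal off-chain sketch in Python of how such a DNA string could be written and read back. The function names `encode_dna` and `decode_dna` are our own for illustration, and we assume exactly 9 ordered fields of two zero-padded digits each, as described above.

```python
FIELD_WIDTH = 2  # digits per trait value, padded with 0
NUM_TRAITS = 9   # trait types of the example crecodile


def encode_dna(values):
    """Concatenate ordered trait values into a fixed-width decimal string."""
    if len(values) != NUM_TRAITS:
        raise ValueError(f"expected {NUM_TRAITS} trait values")
    for value in values:
        if not 0 <= value <= 99:
            raise ValueError("two digits can only represent 0..99")
    return "".join(f"{value:0{FIELD_WIDTH}d}" for value in values)


def decode_dna(dna):
    """Split a DNA string back into its ordered trait values."""
    if len(dna) != NUM_TRAITS * FIELD_WIDTH:
        raise ValueError("unexpected DNA length")
    if not (dna.isascii() and dna.isdigit()):
        raise ValueError("DNA must contain digits only")
    return [int(dna[i:i + FIELD_WIDTH]) for i in range(0, len(dna), FIELD_WIDTH)]


naked_creco = decode_dna("000001000300000000")
print(naked_creco)  # [0, 0, 1, 0, 3, 0, 0, 0, 0]
```

Only the third and fifth fields are non-zero, which matches the two traits of the naked crecodile.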

Expensive Strings

Unfortunately, storing strings like the above is a very expensive operation on the Ethereum blockchain.

The DNA “000001000300000000” consists of 9*2 = 18 characters. The largest trait value that can be numerically represented would be “99”.

Each character is encoded with 1 byte so the whole DNA as string requires 18 bytes of storage.

All characters are numerals 0–9, but string characters can also be ‘a’, ‘A’, … ‘z’, ‘Z’. Since ‘z’ is not a valid slider position, we can already see that we are wasting storage: each byte could distinguish 256 values, yet we only use 10 of them.

Bitfield Encodings

Solidity has a uint64 type, which is a number or “unsigned integer” that is expressed by 8 bytes (a uint8, by contrast, is a single byte). Each byte is 8 bits, so the uint64 for 0 looks like this in binary:

0000 0000 | 0000 0000 | 0000 0000 | 0000 0000 | 0000 0000 | 0000 0000 | 0000 0000 | 0000 0000

Bytes are separated by | in the above example. Each byte can represent 2^8 = 256 values.

We can use the above storage to encode 8 trait types, each with up to 256 different values (1 byte per trait). Therefore, we can (almost) fully encode our crecodile in a single integer value on the blockchain.

If we convert the above DNA into bitfield representation the naked croc becomes:

With this representation we reduced the storage requirement from an 18-byte string to an 8-byte integer. And to be precise, because each byte can now hold 256 different values, a string would need 3 digits per trait and therefore “cost” 9*3 = 27 bytes. So the bitfield representation is much cheaper, but unfortunately we sacrificed our 9th trait. Solidity unsigned integers exist in steps of 8 bits (uint8, uint16, …, uint256), so a uint72 (9 bytes) would be just enough for all nine trait types. If we want room to grow, a uint128 (16 bytes) holds 16 trait types with 256 values each, which as a string would require 16*3 = 48 bytes.
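The packing step can be sketched off-chain in a few lines of Python. This is an illustration under assumptions, not our contract code: the helper name `pack_traits` is made up, and we assume that trait i occupies byte i counted from the least significant end, which matches the shift used in the extraction snippet later in this article.

```python
BITS_PER_TRAIT = 8
MAX_TRAIT_VALUE = 0xFF  # 255, the largest value one byte can hold


def pack_traits(values, total_bytes=8):
    """Pack ordered trait values into one integer, one byte per trait."""
    if len(values) > total_bytes:
        raise ValueError("more traits than bytes available")
    dna = 0
    for i, value in enumerate(values):
        if not 0 <= value <= MAX_TRAIT_VALUE:
            raise ValueError("trait value does not fit into one byte")
        dna |= value << (BITS_PER_TRAIT * i)  # trait i lands in byte i
    return dna


# First 8 fields of the naked crecodile; the 9th does not fit into 8 bytes.
dna = pack_traits([0, 0, 1, 0, 3, 0, 0, 0])
print(hex(dna))                         # 0x300010000
print(dna.to_bytes(8, "big").hex(" "))  # 00 00 00 03 00 01 00 00
```

Passing all nine values with `total_bytes=9` gives the same number here, because the ninth trait of the naked crecodile is 0.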

For the very technical: Solidity also has a bytes32 data type for raw binary data. Moreover, read operations on uint8 can be more expensive than on uint256, because the EVM works on 32-byte words and has to mask smaller types. Picking uint256 over a smaller type can therefore have advantages in the long run, especially in processing- and read-heavy applications. If you have suggestions, please leave a comment!

Trait Extraction

In order to read a certain trait from the bitfield we construct a mask 11111111 where each bit “is set”.

If we want to read which “Eyes” trait the crecodile has, we shift the mask to the position of the Eyes byte and use a bitwise AND (&) to zero out every bit that is not 1 in both the bitfield and the mask.

Shifting the result back to the right then elegantly gives us the value 3, which is the original trait index for “Classic Eyes”.

In Solidity:

uint256 bitMask = TRAIT_MASK << (8 * i);
uint256 value = (dna & bitMask) >> (8 * i);

Where TRAIT_MASK = 255 = 1111 1111 and i is the trait type index for which we want to extract the value.
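The same two lines can be tried out off-chain. The following Python sketch mirrors the Solidity above; the function name `get_trait` and the example value of `dna` are our assumptions (the naked crecodile packed with trait i in byte i, so Mouth sits at index 2 and Eyes at index 4).

```python
TRAIT_MASK = 0xFF  # 255 = 1111 1111


def get_trait(dna, i):
    """Return the value of trait type i from a bitfield-encoded DNA."""
    bit_mask = TRAIT_MASK << (8 * i)     # move the mask over byte i
    return (dna & bit_mask) >> (8 * i)   # keep that byte, shift it back down


dna = 0x0000000300010000  # assumed encoding of the naked crecodile

print(get_trait(dna, 4))  # 3 -> Eyes:Classic
print(get_trait(dna, 2))  # 1 -> Mouth:Classic
print(get_trait(dna, 0))  # 0 -> trait not set
```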

Our smart contracts can now read and understand on-chain metadata! 🎉

Minting Costs 🤑

Equipped with powerful bitfield encoding, we can compress a whole crecodile into a single storage-optimized number, which is pretty amazing.

But even writing the DNA highly compressed will cost some ETH. So how much will it cost?

Consider the following simplified code. In this function we provide an array of token IDs and an array of compressed metadata in the form of uints and associate the two in a mapping.

mapping(uint256 => uint256) private tokenIdToDna;

// Writes one DNA value per token ID; length is the number of entries to write.
function setDna(uint256[] memory ids, uint256[] memory dna, uint8 length) public {
    for (uint256 i = 0; i < length; i++) {
        tokenIdToDna[ids[i]] = dna[i];
    }
}

The cost to run the above function with 10 token IDs and DNA values is 269,178 gas.

~21,000 gas is charged for any transaction as a fixed “base fee”. That leaves ~248k gas, or roughly 25,000 gas per ID->DNA write (including some overhead for the loop).

You can compare it to raw storage cost of SSTORE opcode for un-initialized values here: https://github.com/djrtwo/evm-opcode-gas-costs/blob/master/opcode-gas-costs_EIP-150_revision-1e18248_2017-04-12.csv#L50

Storing metadata for 10k NFTs on-chain can be estimated with

10,000*25,000=250,000,000 gas

If we assume 50 gwei gas price the total cost would be 12,500,000,000 gwei or 12.5 ETH.

At the current ETH price of about 3,000 USD, this write operation would cost roughly 37,500 US dollars.
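The estimate is easy to reproduce and to adapt to other gas or ETH prices. Here is a small Python sketch of the arithmetic; the function name `storage_cost` is ours, and the inputs are simply the numbers assumed in this article (25,000 gas per write, 50 gwei, 3,000 USD per ETH).

```python
GWEI_PER_ETH = 10**9


def storage_cost(num_tokens, gas_per_write, gas_price_gwei, eth_usd):
    """Return (total gas, cost in ETH, cost in USD) for writing all DNA values."""
    total_gas = num_tokens * gas_per_write
    total_eth = total_gas * gas_price_gwei / GWEI_PER_ETH
    return total_gas, total_eth, total_eth * eth_usd


# Gas per write as measured above: total minus the fixed transaction cost, over 10 writes.
measured_per_write = (269_178 - 21_000) / 10
print(measured_per_write)  # 24817.8, rounded up to 25,000 in the estimate

total_gas, total_eth, total_usd = storage_cost(10_000, 25_000, 50, 3_000)
print(total_gas, total_eth, total_usd)  # 250000000 12.5 37500.0

# The same numbers for a single token: about 3.75 USD per ID->DNA write.
print(storage_cost(1, 25_000, 50, 3_000)[2])
```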

We could try to bring down the costs by not using uint256 values (see above) but would subsequently pay more on each read operation.

Storing the same information off-chain can be considered free (short term).

IPFS hosting only requires a node, and the storage costs for JSON files are negligible. However, running the node and “pinning” the files is an ongoing task and creates costs. Pinata is a service that handles most of the IPFS-related tasks, currently at 20 USD per month or 240 USD per year, and is a popular option for NFT projects.

Thinking 10 years ahead, the difference between 2,400 USD and 37,500 USD for metadata storage is huge, and the on-chain cost is due upfront. This is one main reason why projects rather go off-chain. 👋
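To put the two options side by side, here is a rough Python comparison. It assumes, purely for illustration, that the pinning price stays at 240 USD per year and that the on-chain write is a one-time cost of 37,500 USD; real prices will of course move.

```python
ON_CHAIN_USD = 37_500        # one-time cost estimated above
PINNING_USD_PER_YEAR = 240   # 20 USD per month


def pinning_cost(years):
    """Cumulative off-chain pinning cost after a number of years."""
    return years * PINNING_USD_PER_YEAR


ten_years = pinning_cost(10)
break_even_years = ON_CHAIN_USD / PINNING_USD_PER_YEAR

print(ten_years)         # 2400
print(break_even_years)  # 156.25
```

Under these assumptions, pinning would have to run for more than 150 years before it costs as much as the on-chain write.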

On-chain = on-chain?

There are different degrees of “on-chainness” and how much information is stored on-chain. Let’s look at some examples and inspirations and why they might have taken this route.

Autoglyphs is considered the first “on-chain” generative art project on the Ethereum blockchain. It is by Larva Labs, the creators of CryptoPunks and pioneers of NFTs as art on Ethereum.

Probably the first NFT project to encode metadata as bitfields and even store the images as SVG layers and render them on-chain was Avastars.

Avastars used a human proof of work where DNA was generated through scrolling an infinite feed of new generations. The project combined a blockchain-native concept of proof-of-work with NFTs. Bringing in composability through on-chain metadata made this project ahead of its time and very unique.

Another project that recently launched and is very similar is CyberBrokers. They wrote the metadata including SVG layers on-chain, which at that time cost them 91 ETH, or 273,000 USD at today’s price.

There have been other NFT projects to write either metadata, image data or both on-chain.

Artblocks, ChainFaces, ChainRunners, Loot, Anonymice, OnChainMonkeys are some of the projects frequently named as on-chain NFT projects. Not every project has image data on-chain, especially since bitmap encodings are extremely challenging and SVG is not suitable for all artistic goals.

Recent paper on efficient Bitmap encodings

Value-added

Why would anyone pay a couple hundred thousand dollars to write the metadata on chain?

Often, on-chain metadata is just used as a marketing stunt. It is rather hard to verify if and how data is really stored but it can drive prices significantly up — because it’s cool.

Probably the most popular reason, and maybe a myth, is persistence: it is believed that whatever is written on the blockchain will continue to exist forever, like diamonds.

CyberBrokers just recently launched from an artist named Josie Bellini. [The NFT is] a complete person. And I love that they are continuing to innovate by putting the SVG [scalable vector graphics] directly on chain. It’s preserved forever.

Source: https://www.fastcompany.com/90735768/bored-ape-yacht-club-jimmy-mcnelis-king-nft-kingship-future

Enter state rent. Ethereum blockchain state is growing at an unreasonable pace. Many of the off-chain NFT projects launched last year do not exist anymore or will not exist a few months or years from now. Their images and metadata will cease to exist as soon as the project owners turn off the server or stop paying Pinata. But the tokens are still on the blockchain and the URLs will be pointing nowhere. Why would someone want to synchronize so much dead state? That is the question “state rent” is targeting: the core idea of this old, hotly debated topic is to recycle storage space occupied by unused accounts.

And while we don’t have state rent yet it could be naive to assume that people want to store and synchronize “garbage” or dead bytes indefinitely without being paid for it. So we might see new concepts in the future to prove state is necessary and they might involve paying a fee.

Another value proposition is that having on-chain images can be seen as an art form in itself. It is reminiscent of the demoscene, where part of the art is to work with a constrained environment in efficient ways.

Last but not least, games rely on on-chain data to a certain extent. If any logic is written on-chain, this logic needs inputs. It can be a racing game that requires acceleration and top speed of the cars, or it can be a betting app that requires real-world metadata from oracles.

Unfortunately, all these applications seemingly have different needs and handle them in application-specific ways. There is no defined standard for on-chain NFTs.

Crecoland

For Crecoland, which is the on-chain “Adventure Park” for crecodiles, we want to look into applications at the intersection of games, art, DeFi and NFTs that can be created when trait information is available. This means smart contracts do not have the tokenURI blindspot anymore. With on-chain metadata, smart contracts can distinguish assets by their traits and ideally implement complex logic with low gas fees on L2. We believe only then can we unlock the full potential of NFTs.

Applications can range from new NFT mechanics, DNA-based breeding, merging and derivatives to smart lending pools, curation protocols or markets with trait-based offers and distinguishable price feeds. We believe on-chain metadata can unlock a new generation of decentralized NFT applications.

Reveals

On-chain metadata poses a challenge for reveals, mainly because on-chain generative NFT drops either need to generate assets during mint or have all assets available pre-mint, which can even block the contract development.

The easier option is to generate them during mint. In this case, the mint transaction generates the random trait data or DNA and writes it on-chain. Randomness becomes a problem in this case, because someone could try to mint and revert until they find something rare, or even try to influence the randomness directly.

The more difficult option is to prepare all token metadata / DNA upfront and let the minters write smaller chunks of the collection metadata when they mint. In this case the project can curate the drop similar to off-chain NFTs. Earlier we estimated the on-chain metadata cost for a 10k generative NFT drop at ~37.5k USD. When each minter writes a fraction of the collection metadata, the cost is split across 10k mint transactions and adds roughly 3.75 USD to each mint, which can be negligible on a large scale.

The problem, however, is that we run into concurrency and assignment issues: basically scheduling problems where we have a limited, fixed amount of resources (metadata) and potentially many (minting) tasks working on it. The challenge is to facilitate a fair launch where rare tokens (metadata) are not handed to certain accounts (trustless), failed transactions are minimized and metadata does not leak, which has become a lucrative multi-million dollar problem:

Pranksy buying a rare “fished” meebit for 200 ETH after metadata leaked

In our next post we will discuss how we handled this challenge for our crecodiles drop and give an example for our drop mechanic.

Follow us on Twitter: https://twitter.com/creco_xyz and follow this publication if you want to learn more about NFTs, our crecodiles and the future of on-chain NFTs.

And if you liked this article please give us some 👏

Thanks! 🐊🐊🐊