Data-hunting through lootprints

Published in MoonCatRescue, Sep 14, 2022

Blockchain technologies are new and interesting ways of storing data, and for the most part users don’t need to know how the underlying blockchain infrastructure keeps data safe, secure, and available in order to use applications built upon it. But sometimes it is interesting to look “under the hood” and see how the magic happens. This is a fairly technical post about how the Ethereum blockchain (and any other chain that has extended its “EVM” model) saves data. It might be of interest to developers trying to analyze data from other contracts, or to end users curious about how the inner workings actually work.

lootprints (for MoonCats)

One of the key forward-looking initiatives for the MoonCatRescue team at the moment is making information about the various parts of the MoonCatRescue ecosystem accessible on multiple blockchains. A key piece of that is replicating the on-chain trait information for each of the assets, so that smart contracts on other chains can reference those traits.

For MoonCats themselves, the core random value that determines what each MoonCat looks like (what traits it has) was set by the person who first rescued it, through the “seed” value they supplied (a solution to a proof-of-work problem). Now that all 25,440 MoonCats have been rescued, those values are fixed, and can be hard-coded into contracts deployed on other chains. There’s no shortcut for compressing that data, as it’s all effectively random thanks to the proof-of-work requirement.

For lootprints, the randomization method was different: they were revealed after being minted, in grouped batches. Each reveal batch got a random value assigned to it, and each lootprint combined its own identifier with its batch’s random value to get a “seed” value, which is used to determine what traits that lootprint has. You can see each lootprint’s individual seed value by querying the getDetails function on the lootprints contract; MoonCat Zer0’s lootprint has a seed of “2461630022”, for example.

So, to give other chains access to that data, the seed value of each lootprint could be copied as static data into a new contract that serves it. That seems like the most direct way to get this data to another chain, but a key factor of data storage on blockchains is that each byte of data costs fees to store, so it’s very important to use as few bytes as possible. So, let’s calculate what storing those seeds would need:

Each lootprint’s seed is a uint32 number in Solidity (a number stored using 32 bits, or 4 bytes), which means 8 of them fit in each 256-bit “storage slot” Solidity uses. Storing a seed for each of the 25,600 possible IDs in the lootprints collection would therefore take 25,600 ÷ 8 = 3,200 slots. (Only 11,718 lootprints exist in the final collection, but their IDs are spread across the whole range of possible IDs, so every ID needs a position.) That’s rather pricey to store on the blockchain; can we do better than needing to pay for 3,200 storage slots?
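That slot count can be double-checked with a few lines of JavaScript. This is only a sketch of the arithmetic: the helper name slotsNeeded is made up for illustration, and it assumes a value never straddles two slots.

```javascript
// Illustrative helper (not part of any contract): how many 256-bit storage
// slots are needed to hold `count` fixed-width values of `bitsPerValue` bits,
// assuming a value never straddles two slots.
const SLOT_BITS = 256;

function slotsNeeded(count, bitsPerValue) {
  const valuesPerSlot = Math.floor(SLOT_BITS / bitsPerValue);
  return Math.ceil(count / valuesPerSlot);
}

// 25,600 possible lootprint IDs, one 32-bit seed each.
const seedSlots = slotsNeeded(25600, 32);
console.log(seedSlots); // 3200
```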

Algorithms

The seed of each lootprint was not supplied by the person who minted it. Instead, the 25,600 lootprints were set up to be “revealed” in 20 batches of 1,280 lootprints each: as the collection was minted, the first 1,280 to exist were revealed as a block, the next 1,280 were revealed as a second block, and so on. Each of the 20 batches got its own piece of randomness from the block hashes at the time the batch was revealed. Each lootprint’s “seed” value is a combination of its token ID and the piece of randomness assigned to its “reveal batch”.

seed = uint32(
  uint256(keccak256(
    abi.encodePacked(TOKEN_ID, revealBlockHashes[MINT_ORDER / 1280])
  ))
);
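Two steps of that formula are plain arithmetic and can be sketched in JavaScript without any blockchain library: picking the reveal batch from the mint order, and the uint32(...) cast, which keeps only the lowest 32 bits of the 256-bit hash. The keccak256 step is left out here because it needs an external library, and the function names are made up for illustration.

```javascript
// Sketch of the two non-hashing steps of the seed formula.
const BATCH_SIZE = 1280;

// Solidity's integer division MINT_ORDER / 1280 drops the remainder.
function revealBatchIndex(mintOrder) {
  return Math.floor(mintOrder / BATCH_SIZE);
}

// uint32(uint256(hash)) keeps only the lowest 32 bits of the hash.
// `hashHex` is a "0x"-prefixed hex string.
function truncateToUint32(hashHex) {
  return Number(BigInt(hashHex) & 0xffffffffn);
}

console.log(revealBatchIndex(123)); // 0 (mint order 123 is in the first batch)
```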

There were only 20 “reveal batches”, so if we capture the randomness that was picked for each reveal batch, that would only be 20 storage slots of data to save, and then we’d only need each lootprint’s “index” (the order it was minted in) rather than its full seed value. The index values (0–11,718) only need 14 bits each, which is a little less than half the size of the uint32 (32-bit) seed value. At 14 bits each, 18 of them fit into a single 256-bit storage slot. So that’s 1,423 storage slots (25,600 ÷ 18, rounded up) for the index values of all 25,600 lootprint IDs, plus 20 storage slots for the batch randomness: 1,443 slots instead of 3,200 (45.09% of the size)!
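To make the 14-bit packing idea concrete, here is a minimal sketch using JavaScript’s BigInt. The bit order (first index in the lowest bits) and the function names are assumptions chosen for illustration; the metadata contracts may lay the data out differently.

```javascript
// Sketch only: pack 14-bit mint-order indexes into one 256-bit word.
const BITS = 14n;
const PER_SLOT = 18; // floor(256 / 14)
const MAX_VALUE = (1n << BITS) - 1n; // 16383

// Pack up to 18 indexes into one word, first index in the lowest bits.
function packIndexes(indexes) {
  if (indexes.length > PER_SLOT) throw new RangeError("at most 18 indexes per slot");
  let word = 0n;
  indexes.forEach((value, i) => {
    if (!Number.isInteger(value) || value < 0 || BigInt(value) > MAX_VALUE) {
      throw new RangeError("index does not fit in 14 bits: " + value);
    }
    word |= BigInt(value) << (BITS * BigInt(i));
  });
  return word;
}

// Read the index stored at `position` (0 to 17) back out of a packed word.
function unpackIndex(word, position) {
  return Number((word >> (BITS * BigInt(position))) & MAX_VALUE);
}

// The slot totals from the paragraph above.
const indexSlots = Math.ceil(25600 / PER_SLOT); // 1423
const totalSlots = indexSlots + 20;             // 1443
console.log(totalSlots, ((totalSlots / 3200) * 100).toFixed(2) + "%");
```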

Wonderful! Getting the index values for each lootprint is easy (both the Lootprints and tokenByIndex functions can return that), but what are the 20 pieces of randomness from the 20 reveal batches? Looking at the lootprints smart contract, they’re stored in the revealBlockHashes property, which is defined in the source code as:

bytes32[20] revealBlockHashes;

That tells us there are indeed 20 of them, and each of them is 32 bytes long (one “storage slot”), but there’s one hiccup: that property doesn’t have a “public” label on it. When the lootprints contract was created, that property wasn’t needed for anything external, and so it was not designated as “public”. Therefore, if you look at the “Read Contract” tab for the lootprints contract on Etherscan, there’s no “revealBlockHashes” function there to be called.

That means no other smart contract on the Ethereum chain can access that property on that contract. So are we stuck? Or can we as humans get at the data a different way even if other smart contracts cannot? Smart contracts can save data to the blockchain, and can designate it as “public” for other smart contracts to see, but things that are “not public” aren’t really “private”. When creating smart contracts it’s highly advised to not store any truly secret information in them, because all the data a smart contract saves gets written in clear text somewhere (so it can be used later by other parts of the program). If there’s no “public” accessor function for it, the data is not as easy to find with a nice human-friendly label on it, but it’s still there.

Smart Contract Storage

When a smart contract that’s written in Solidity (a relatively human-friendly language) gets compiled down to the actual program (bytecode; computer-friendly language) that gets written to the blockchain, the variables that the developer defined for the smart contract get laid out in the order the developer wrote them in.

Looking at the Solidity code for the lootprints contract, it starts with:

/**
 * @title MoonCatLootprints
 * @dev MoonCats have found some plans for building spaceships
 */
contract MoonCatLootprints is IERC165, IERC721Enumerable, IERC721Metadata {

    /* ERC-165 */
    function supportsInterface(bytes4 interfaceId) public view virtual override(IERC165) returns (bool) {
        return (interfaceId == type(IERC721).interfaceId ||
                interfaceId == type(IERC721Metadata).interfaceId ||
                interfaceId == type(IERC721Enumerable).interfaceId);
    }

    /* External Contracts */
    IMoonCatAcclimator MCA = IMoonCatAcclimator(0xc3f733ca98E0daD0386979Eb96fb1722A1A05E69);
    IMoonCatRescue MCR = IMoonCatRescue(0x60cd862c9C687A9dE49aecdC3A99b74A4fc54aB6);
    IMoonCatLootprintsMetadata public Metadata;

The supportsInterface item is a function, not a property, so it doesn’t take up a “storage slot”. So, the first item stored for the lootprints contract is the address of the Acclimator contract, followed by two other contract addresses. Each address is 20 bytes, so two of them can’t share a 32-byte slot and each one gets a slot of its own.

That’s slots #0, #1, and #2 covered; where’s the revealBlockHashes we’re after? Going a bit further in the source code we see:

/* Name String Data */
string[4] internal honorifics =
    [
     "Legendary",
     "Notorious",
     "Distinguished",
     "Renowned"
     ];

The various string fragments that make up the names of the lootprints are next. But how many storage slots does that honorifics property take up? For arrays that have a fixed size, Solidity packs them in slots one after the other. So those four string values should be slots #3, #4, #5, and #6.

We can spot-check this by using a tool like Ethers.js to simply step through the slots one by one:

const storageData = await ethers.provider.getStorageAt(
  LOOTPRINTS_ADDRESS,
  ethers.utils.hexZeroPad(SLOT_NUMBER, 32)
);

Looking at the first few storage slots in the lootprints contract shows how these values are encoded. An address is stored in the lower bits of the slot. A short string has its ASCII values in the high bits of the slot, with the lowest byte holding the length (Solidity stores it as twice the string’s length in bytes).
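Those two encodings can be decoded by hand. The sketch below assumes the slot value arrives as a “0x”-prefixed, 64-hex-digit string (the shape getStorageAt returns) and relies on Solidity storing length × 2 in the lowest byte of a short string of 31 bytes or fewer. The function names are made up for illustration, and Buffer makes this Node.js-specific.

```javascript
// Sketch: decode a raw 32-byte slot value given as "0x" + 64 hex digits.

// An address occupies the lowest 20 bytes, i.e. the last 40 hex digits.
function decodeAddress(slotHex) {
  return "0x" + slotHex.slice(-40);
}

// A short string (31 bytes or fewer) keeps its bytes at the high end of the
// slot, with length * 2 in the lowest byte. An odd lowest byte marks a long
// string, whose data lives in other slots and is not handled here.
function decodeShortString(slotHex) {
  const hex = slotHex.slice(2);
  const lowestByte = parseInt(hex.slice(62, 64), 16);
  if (lowestByte % 2 === 1) {
    throw new Error("long string: data is stored in other slots");
  }
  const length = lowestByte / 2;
  return Buffer.from(hex.slice(0, length * 2), "hex").toString("utf8");
}

// Build a sample slot value by hand for the honorific "Legendary" (9 bytes).
const sampleSlot =
  "0x" + Buffer.from("Legendary").toString("hex").padEnd(62, "0") + "12";
console.log(decodeShortString(sampleSlot)); // Legendary
```

Here sampleSlot is constructed locally from the encoding rule rather than read from the chain; 0x12 is 18, twice the 9-byte length.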

Following that logic, we can then enumerate out the slots being used, compared to the Solidity code:

      0: Acclimator Address
      1: MoonCatRescue Address
      2: lootprints Metadata Address
    3-6: honorifics strings (x4)
   7-38: adjectives strings (x32)
  39-53: mods strings (x15)
  54-85: main strings (x32)
 86-101: designations strings (x16)
102-501: ColorTable (400-length array of 32-byte words)
    502: Owner Address
    502: frozen (boolean)
    502: minting open (boolean)
    502: reveal count (uint8)
    503: price (uint256)
504-603: No-charge list (100-length array of 32-byte words)
604-623: Reveal Blockhashes (20-length array of 32-byte words)
    624: lootprints by lootprintId/rescueOrder (25,600-length array of struct data)

Note that because an address (20 bytes), two booleans (one byte each), and a uint8 (one byte) can all fit into a single 32-byte storage slot, Solidity packed them all together into slot #502.
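The slot numbering above can be reproduced with a small calculator. This is a simplified model of Solidity’s layout rules that only covers value types and fixed-size arrays: a value moves to the next slot when it doesn’t fit in the current one, and an array always starts on a fresh slot. The variable list is transcribed from the table, and some of the names (owner, frozen, mintingOpen, revealCount, price, noChargeList) are labels chosen for illustration rather than taken from the contract source.

```javascript
// Simplified model of Solidity storage layout (value types and fixed-size
// arrays only). `bytes` is the size of one value or one array element.
function computeLayout(variables) {
  let slot = 0;
  let used = 0; // bytes already used in the current slot
  const layout = [];
  for (const { name, bytes, count } of variables) {
    if (count !== undefined) {
      // Fixed-size arrays always start on a fresh slot and fill whole slots.
      if (used > 0) { slot += 1; used = 0; }
      const perSlot = Math.floor(32 / bytes);
      const slots = Math.ceil(count / perSlot);
      layout.push({ name, first: slot, last: slot + slots - 1 });
      slot += slots;
    } else {
      // A value moves to the next slot if it does not fit in the current one.
      if (used + bytes > 32) { slot += 1; used = 0; }
      layout.push({ name, first: slot, last: slot });
      used += bytes;
    }
  }
  return layout;
}

// Each string in a fixed-size string array takes one 32-byte slot.
const variables = [
  { name: "MCA", bytes: 20 },
  { name: "MCR", bytes: 20 },
  { name: "Metadata", bytes: 20 },
  { name: "honorifics", bytes: 32, count: 4 },
  { name: "adjectives", bytes: 32, count: 32 },
  { name: "mods", bytes: 32, count: 15 },
  { name: "main", bytes: 32, count: 32 },
  { name: "designations", bytes: 32, count: 16 },
  { name: "ColorTable", bytes: 32, count: 400 },
  { name: "owner", bytes: 20 },
  { name: "frozen", bytes: 1 },
  { name: "mintingOpen", bytes: 1 },
  { name: "revealCount", bytes: 1 },
  { name: "price", bytes: 32 },
  { name: "noChargeList", bytes: 32, count: 100 },
  { name: "revealBlockHashes", bytes: 32, count: 20 },
];

const layout = computeLayout(variables);
const reveal = layout.find((entry) => entry.name === "revealBlockHashes");
console.log(reveal.first, reveal.last); // 604 623
```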

If our math is correct, that means the Reveal Blockhashes are in slots #604 to #623. Let’s take a look at the values around there to verify:

#600-#603 should be the end of the “No-charge list” (a data blob that indicates which MoonCats could mint a lootprint for free). The No-charge list is not a public property, but it was set after the contract was deployed. This transaction is the one that set it, and from the data payload of that transaction, we can see the final entry in that array should be 0xfdb57ffe51a772f3c3f929cf0000000000000000000000000000000000000000, which is indeed what #603 has in it.

Then #604-#623 should be the bits of entropy used for each of the 20 reveal batches.
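To pull those 20 values, each slot number first has to be turned into a zero-padded 32-byte key. Here is a small sketch of that step on its own; the helper name slotKey is made up for illustration, and no network call is made.

```javascript
// Build the 32-byte, zero-padded hex key for a storage slot number.
function slotKey(slotNumber) {
  return "0x" + BigInt(slotNumber).toString(16).padStart(64, "0");
}

// Keys for the 20 reveal-batch slots, #604 through #623.
const revealSlotKeys = Array.from({ length: 20 }, (_, i) => slotKey(604 + i));

console.log(revealSlotKeys[0].slice(-3));  // 25c (604 in hexadecimal)
console.log(revealSlotKeys[19].slice(-3)); // 26f (623 in hexadecimal)
```

Each key could then be passed as the position argument to getStorageAt, in the same way as the single-slot lookup shown earlier.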

Then #624 should start the first data entry defining the lootprint assets themselves. The first item in that array should be for lootprint #0 (MoonCat Zer0’s), and it indeed shows 0xeDccC2Ce220f286bF218390Ad16E432D539E6890 as the current owner of lootprint #0, and 0x007b being hexadecimal for “123” (the mint-order for lootprint #0).

The data before and after it checks out, so the data in the middle must be what we’re looking for! That chunk of data was extracted and is now part of the lootprint metadata contract on several layer two blockchains (Optimism, Gnosis, and Matic), for use by contracts on those chains.

Conclusion

Part of developing smart contracts is thinking ahead to what data you might need in the future, and doing your best to expose that data in the contract at launch, even if it’s not directly needed then. But, as this example shows, even if you don’t account for all possibilities, there are sometimes ways to get at the past data (as the blockchain is a great permanent history store!). And this is a practical example of how you definitely don’t want to store secret information in a smart contract, because even if it’s hidden from other smart contracts and from casual observation by humans, it’s not too hard to pull out for those who spend a little bit of effort on it.

Now that the critical data pieces are identified and embedded in additional contracts, making sure the data is transferred correctly is a key part that cannot be easily done by smart contracts (as they’re on two different chains), but can be relatively easily done by a human. Through September 2022, we’re in the midst of a “Testing Party” phase where everyone is invited to help verify these contracts are doing what they’re supposed to — no MoonCats or Metamask required, EVERYONE can help! If you’d like to participate (or just enjoy seeing all the numbers fly around!), you can watch the brief intro video explaining this testing phase, and jump on into the Discord server to share your results.

Written by MoonCat Community