Aesthetics Case Study: Deathbats Club

Ilan Rosen
Published in PopRank
Feb 7, 2022

Before diving into our case study, I want to take a moment to thank all PopRank users. To anyone who’s jumped into our Discord server to “!pop”, filled out our user study form, or contributed to our more than 4 million rounds played: thank you. Your rounds are the core of our product and power both our aesthetic rankings and this case study. We’re working hard on building out our product, with a big focus on data integrity and protecting against bad actors. In parallel, we’re exploring meaningful ways to reward those who visit the site and play our game. Think XP systems, user profiles, competitions, and much more. We’ll dive into this in a future post.

Overview

Our rarity rankings, much like others, use a “sum of traits” approach, wherein the rarity scores of an NFT’s individual traits are summed to determine its overall rarity score and ranking. This is useful for prospective buyers, as they can view the rarity of each individual trait and how much it contributes to the NFT’s overall rarity score.
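To make the “sum of traits” idea concrete, here is a minimal sketch. The per-trait formula (one divided by the share of NFTs holding that trait value) is a common convention and an assumption here, not necessarily the exact formula PopRank uses. The function name and the toy collection are invented for illustration.

```python
from collections import Counter

def rarity_scores(collection):
    """Return {token_id: rarity score} using a sum-of-traits approach.

    `collection` maps token id -> {trait_type: trait_value}.
    Assumption: a trait's score is 1 / (share of NFTs with that value),
    i.e. total / count. The real scoring formula may differ.
    """
    total = len(collection)
    counts = Counter(
        (trait_type, value)
        for traits in collection.values()
        for trait_type, value in traits.items()
    )
    return {
        token_id: sum(total / counts[(t, v)] for t, v in traits.items())
        for token_id, traits in collection.items()
    }

# Toy collection, invented for illustration only.
toy = {
    1: {"Skin": "Ghost", "Eyes": "Bored"},
    2: {"Skin": "Bone", "Eyes": "Bored"},
    3: {"Skin": "Bone", "Eyes": "Monocle"},
    4: {"Skin": "Bone", "Eyes": "Bored"},
}
scores = rarity_scores(toy)
```

Because the score is a plain sum, each trait’s term can be shown on its own, which is what lets a buyer see how much a single trait contributes.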

Up until now, we have only shown aesthetic rankings for NFTs as a whole, with no insight into the rankings of individual traits. Much like rarity, would it not be useful as a collector to know which specific traits of an NFT are the most aesthetically pleasing? What if, instead of treating a round as NFT A vs. NFT B, we play the constituent traits of NFT A and NFT B against each other?

Our Approach

Let’s take a round from Avenged Sevenfold’s Deathbats Club collection. The response from the Deathbats community has been incredible, with over 115,000 rounds played without any giveaway incentive, making them perfect for this case study.

Deathbats Club PopRank round

Here are the individual traits of these two NFT contestants for this round:

Left — #8793
Right — #9102

What was originally #8793 vs. #9102 has now become:

  • Facial Hair: None vs. Fu Manchu
  • Eyes: Bored vs. Monocle
  • Mouth: Bloody vs. Gold Fangs

…and you get the idea. Note that if one NFT doesn’t have a trait type that the other contestant has (e.g. #8793 doesn’t have the “Facial Hair” trait type, whereas #9102 does), we treat that as a “None” trait value.
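The decomposition above can be sketched in a few lines of Python. The function name and the dictionary layout are assumptions made for illustration, and only the three trait types listed above are included (the real NFTs have more).

```python
def trait_matchups(left, right):
    """Split one NFT-vs-NFT round into per-trait-type matchups.

    `left` and `right` map trait type -> trait value. A trait type that
    is missing from one NFT is filled in as "None".
    """
    trait_types = sorted(set(left) | set(right))
    return {
        t: (left.get(t, "None"), right.get(t, "None"))
        for t in trait_types
    }

# Only the trait types shown in the list above.
nft_8793 = {"Eyes": "Bored", "Mouth": "Bloody"}
nft_9102 = {"Facial Hair": "Fu Manchu", "Eyes": "Monocle", "Mouth": "Gold Fangs"}

matchups = trait_matchups(nft_8793, nft_9102)
for trait_type, (left_value, right_value) in matchups.items():
    print(f"{trait_type}: {left_value} vs. {right_value}")
```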

We’re exploring multiple per-trait aesthetic ranking algorithms for the official release of this feature, the most notable being our continued use of the Elo rating system. This comes with additional complexities though, as different traits see vastly different numbers of rounds played. Our algorithm ensures that every NFT in a collection sees a similar number of rounds played, but this doesn’t extend to the individual traits. A trait held by 100 of the 10,000 NFTs would see roughly 10x fewer rounds than one held by 1,000.
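For readers unfamiliar with Elo, here is a minimal sketch of the standard update applied to a single trait matchup. The K-factor of 32 and the starting rating of 1500 are conventional defaults and assumptions here; the values and any adjustments PopRank actually uses are not stated in this article.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a, rating_b, a_won, k=32.0):
    """Return the updated (rating_a, rating_b) after one matchup.

    k=32 is an assumed K-factor, not PopRank's actual setting.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    return rating_a + delta, rating_b - delta

# Two trait values that both start at an assumed rating of 1500.
new_a, new_b = elo_update(1500.0, 1500.0, a_won=True)
print(new_a, new_b)  # 1516.0 1484.0
```

Since each matchup moves a rating by at most K points, a trait that appears in 10x fewer rounds gets 10x fewer updates, which is one way to see why rare traits complicate a per-trait Elo.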

That being said, we wanted to give you a peek behind the curtain ahead of our release. For the purposes of this article, we’re going to look at the win rates of the individual traits. If a round had a 1/1 Deathbat in it, we skipped the round entirely, as we found that the 1/1 Deathbats, which have “None” for all of the conventional traits of the collection, skewed the results. When both NFTs in a round have the same value for a given trait type, we ignore that round for that trait type.
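As a rough sketch of the win-rate calculation with both filters applied, consider the following. The round tuple layout, the `one_of_ones` parameter, and the toy rounds are all hypothetical and exist only to illustrate the rules described above.

```python
from collections import defaultdict

def trait_win_rates(rounds, trait_type, one_of_ones=frozenset()):
    """Return {trait_value: (win_rate, rounds_played)} for one trait type.

    `rounds` is an iterable of
    (left_id, left_traits, right_id, right_traits, left_won) tuples.
    `one_of_ones` is a hypothetical set of 1/1 token ids to exclude.
    """
    wins = defaultdict(int)
    played = defaultdict(int)
    for left_id, left, right_id, right, left_won in rounds:
        # Skip any round that involves a 1/1 NFT.
        if left_id in one_of_ones or right_id in one_of_ones:
            continue
        left_value = left.get(trait_type, "None")
        right_value = right.get(trait_type, "None")
        # Identical values tell us nothing about this trait type.
        if left_value == right_value:
            continue
        played[left_value] += 1
        played[right_value] += 1
        wins[left_value if left_won else right_value] += 1
    return {v: (wins[v] / played[v], played[v]) for v in played}

# Toy rounds, invented for illustration only.
toy_rounds = [
    (1, {"Skin": "Ghost"}, 2, {"Skin": "Bone"}, True),
    (1, {"Skin": "Ghost"}, 3, {"Skin": "Ghost"}, False),  # same value: ignored
    (2, {"Skin": "Bone"}, 1, {"Skin": "Ghost"}, True),
    (3, {"Skin": "Ghost"}, 2, {"Skin": "Bone"}, True),
    (99, {}, 1, {"Skin": "Ghost"}, True),                 # 1/1: skipped
]
rates = trait_win_rates(toy_rounds, "Skin", one_of_ones={99})
```

The second element of each result plays the role of the “rounds played” column in the tables below.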

Rankings

Let’s start with the trait we’re most interested in: Skin. The final column is the number of rounds played by NFTs with that skin type.

Trait type: “Skin” win rates

The top 3 rated traits are Ghost, Spirit, and Shook.

Ghost, Shook, Spirit (left to right)

NFTs with these traits won around 70% of their rounds.

Now, let’s look at another trait: Background.

Trait type: “Background” win rates

Interesting — while Black is slightly in the lead, which feels in line with A7X’s aesthetic, most of the win rates are approximately 50%. Evidently, the background colour doesn’t have a strong impact on the results of the round, and therefore we could posit that it doesn’t strongly affect the aesthetics of the NFT.

Impact

There’s a difference of only 2.56 percentage points between the win rates of the highest and lowest rated Background traits. Looking back at Skin, we can now better understand its impact on the aesthetics of an NFT: there’s a whopping 32.8 percentage point difference between its highest and lowest rated traits.

By using the range of win rates as a proxy for the “impact” of a particular trait type, we end up with the following list of trait types, in order from most to least impactful:

  • Mask — 42.37%
  • Skin — 32.8%
  • Head — 23.04%
  • Eyes — 21.67%
  • Trait Count — 14.07%
  • Mouth — 10.37%
  • Facial Hair — 8.92%
  • Background — 2.56%
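The “impact” figure in the list above is simply the highest win rate minus the lowest within each trait type. A minimal sketch follows; the function name is an assumption and the win rates are made-up numbers, not the actual Deathbats data.

```python
def trait_type_impact(win_rates_by_type):
    """Rank trait types by the range (max - min) of their win rates.

    `win_rates_by_type` maps trait type -> {trait_value: win_rate}.
    Returns (trait_type, range) pairs, most impactful first.
    """
    ranges = (
        (trait_type, max(rates.values()) - min(rates.values()))
        for trait_type, rates in win_rates_by_type.items()
    )
    return sorted(ranges, key=lambda item: item[1], reverse=True)

# Made-up win rates for illustration only.
toy_win_rates = {
    "Background": {"X": 0.51, "Y": 0.49},
    "Skin": {"A": 0.70, "B": 0.55, "C": 0.40},
}
ranking = trait_type_impact(toy_win_rates)
for trait_type, spread in ranking:
    print(f"{trait_type}: {spread:.2%}")
```

A range is easy to read but depends only on the two extreme values, so a trait type with a few rarely played values can show a large spread.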

For anyone wondering why the “Mask” trait type has such a large impact: it’s both a very visually striking trait type and a rare one, with only 29 of the 10,000 Deathbats having a Mask trait at all.

Final Thoughts

NFT aesthetic rankings and even per-trait aesthetic rankings are merely scratching the surface. It’s very exciting for me, and I hope for you reading, to start seeing the insights our data can generate. We’ve just seen the ability to discern which individual traits are desirable, and the extent to which different trait types contribute to whether an NFT wins or loses, but it’s just the beginning.

We’d love to hear your ideas on how we can harness our data, so please do fill out our user study form and jump into our Discord to have a chat!

In the Appendix below, I’ve attached the full per-trait breakdown for Deathbats Club.

LFG,

Ilan Rosen

Acknowledgments to my co-founder Shandy Sulen for reviewing and editing this article.

Appendix: Deathbats Club Per-Trait Data
