Creating a 10K NFT Avatar Collection: ExpansionPunks Step-by-Step

Jeremy Posvar
Published in Geek Culture
15 min read · Jul 28, 2021

A technical dive into the procedural generation of 10K unique avatar assets.

ExpansionPunks (www.ExpansionPunks.com) is a collection of 10,000 unique, procedurally-generated collectible Punks stored as ERC721 tokens on the Ethereum blockchain. In the same way “expansion packs” introduce new characters and storylines in traditional tabletop gaming, ExpansionPunks have arrived to expand the Punkverse to be a more diverse and inclusive community by addressing subtle biases in the original CryptoPunks collection that unintentionally lead to exclusion. Through an expansion approach that respects and honors the ethos of the original, ExpansionPunks feature unique trait combinations without sacrificing cohesiveness and coherence with the broader CryptoPunks community. Ultimately, ExpansionPunks seek to empower everyone to feel welcome, valued and represented in this new blockchain technology frontier.

As we approach launch, we are excited to share an under-the-hood look at how the ExpansionPunks population was born, step by step. While the walkthrough below certainly doesn’t capture the months of background research on Punk trait nuances, or the countless trial-and-error attempts involved in creating a cohesive NFT collection, it should still give a sense of the thoroughness with which this effort was approached.

The ExpansionPunks collection came to life through six primary workstreams:

Figure 1: ExpansionPunks Approach, End-to-End

1. Reverse engineering an image layer library of all traits based on existing CryptoPunks traits to serve as the raw materials for generating new Punks.

2. Validating the accuracy of the trait layer library by using it to recreate the original 10K CryptoPunks and confirming an exact pixel-for-pixel match between “cloned” punks and originals.

3. Building a trait assignment engine to bulk define a unique ExpansionPunks population

4. Bulk generating the ExpansionPunks population

5. Removing Punks that are perceptually equivalent to CryptoPunks (due to hidden traits)

6. Trimming the unique population to the final 10K

1. Establishing a Trait Layer Library

A library of 200 different 24x24 layers was first built in Photoshop, based on existing CryptoPunks traits. Larva Labs hasn’t published (to our knowledge) separate files for each trait layer, so the library had to be reverse engineered, starting with the zero-attribute Female/Male Archetypes (see Figure 2).

Figure 2: Zero-Attribute Female and Male Archetypes

The above 8 Punks already exist — so those were easy. But zero-attribute Aliens/Apes/Zombies don’t exist in the original collection, so these rare archetypes had to be deduced by cross-referencing multiple samples of each to composite a theoretical zero-attribute version — Figure 3 shows how the Zombie zero-attribute archetype was deduced using punks #8553 and #8127. The same approach was used for the zero-attribute Alien and Ape archetypes.

Figure 3: Deducing the Zombie Zero-Attribute Archetype

These rare archetypes are also decidedly male based on their traits, so the female versions of each also had to be deduced, per Figure 4.

Figure 4: Rare Zero-Attribute Archetypes

With base archetypes in place, each of the accessory traits (mouth, hair, facial hair, eyes, etc.) had to be established as separate assignable layers — i.e. separate transparent PNG files that can be stacked to create a new punk.

Figure 5: Stacking of Separate Transparent PNG Layers to Compose a Punk
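The stacking idea can be sketched in a few lines of pure Python. This is a minimal illustration of the standard "over" compositing operator applied layer by layer, not the Photoshop or ImageMagick implementation; the pixel values and the tiny 2-pixel "layers" are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]   # (R, G, B, A), each 0-255
Layer = List[List[Pixel]]           # rows of pixels

def over(top: Pixel, bottom: Pixel) -> Pixel:
    """Composite one pixel over another with the standard 'over' operator."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    a_top = ta / 255
    a_bot = ba / 255
    a_out = a_top + a_bot * (1 - a_top)
    if a_out == 0:
        return (0, 0, 0, 0)  # both pixels fully transparent

    def channel(t: int, b: int) -> int:
        return round((t * a_top + b * a_bot * (1 - a_top)) / a_out)

    return (channel(tr, br), channel(tg, bg), channel(tb, bb), round(a_out * 255))

def flatten(layers: List[Layer]) -> Layer:
    """Stack same-sized layers bottom-to-top into a single image."""
    result = [row[:] for row in layers[0]]
    for layer in layers[1:]:
        for y, row in enumerate(layer):
            for x, px in enumerate(row):
                result[y][x] = over(px, result[y][x])
    return result

# Hypothetical 2x1 layers: an archetype pixel and an accessory pixel.
base = [[(174, 139, 97, 255), (0, 0, 0, 0)]]
accessory = [[(0, 0, 0, 0), (255, 0, 0, 255)]]
punk = flatten([base, accessory])
```

Transparent accessory pixels leave the archetype untouched, opaque ones replace it, and semi-transparent ones blend with whatever sits underneath.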

To define these traits, minimal-attribute Punks featuring each attribute had to be located to extract the pixel detail of each trait. Two primary challenges had to be overcome:

1. Deducing the RGB and opacity values for semi-transparent traits

2. Trait democratization (giving all punks access to gender-exclusive traits)

First, multiple traits (Rosy Cheeks, Mole, Spots, Horned Rim Glasses, and all 3 Eye Shadow colors) are semi-transparent, and when layered atop other traits (e.g. a skin tone) they blend to create a new third color based on a mix of the two RGB values as a function of the transparent layer’s opacity value. For example, RGB values for skin tones (1–5 in Figure 6) and Rosy Cheeks (6–10 in Figure 6) can all be directly obtained from punks with those respective traits. But to generate these blended results (6–10 in Figure 6) with a single transparent layer, the RGB and Opacity values (“?” in Figure 6) for the Rosy Cheeks Semi-Transparent layer had to be calculated, as this information isn’t available anywhere. To complicate it further — the final Rosy Cheeks blended color differs between Female and Male punks of the same skin tone — meaning the transparent layer also had to be calculated separately by Gender.

Figure 6: Rosy Cheeks as a Semi-Transparent Layer Applied to Multiple Skin Tones

To calculate the RGB and Opacity values for the Rosy Cheeks Semi-Transparent Layer, the following formula was used for each of the 3 RGB values, for each of the 5 Skin Tones, and for each of the 2 gender archetypes (i.e. 30 separate calculations):
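As a hedged reconstruction, assuming the layers combine by standard "normal" alpha compositing over an opaque skin tone (an assumption; the symbols are illustrative and not the original notation), each per-channel relationship has the form:

```latex
\[
C_{\text{blend}} = \alpha \, C_{\text{overlay}} + (1 - \alpha) \, C_{\text{skin}},
\qquad 0 \le \alpha \le 1, \quad 0 \le C \le 255
\]
```

Here $C_{\text{blend}}$ and $C_{\text{skin}}$ can be read directly from existing punks, while $C_{\text{overlay}}$ and the opacity $\alpha$ of the semi-transparent layer are the values being solved for.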

Example of one (of thirty) such calculations for Male archetype, RED value of RGB, on Medium Skin Tone:

There are two unknown values here, the RED value and the Opacity Value of the Rosy Cheeks transparent layer. If you’re only working with 1 skin tone, 1 gender, you can achieve the desired “blend” using any Opacity value by simply altering the RGB values as a function of the formula above — e.g. blend 5 = blend 6 = blend 7, per Figure 7, by adjusting RGB and Opacity values of each overlay (overlay 2, overlay 3, and overlay 4).

Figure 7: Multiple RGB/Opacity Combinations of Rosy Cheeks Applied to Medium Skin Tone Only

However, once additional skin tones are considered, it’s clear just “any” opacity value won’t do…

Figure 8: Attempting to use RGB and Opacity Values Calculated from a Single Skin Tone Across Other Skin Tones

If we base the calculation only on Medium skin tone, then we quickly see in Figure 8 that the resultant blends across other skin tones will not be consistent: 1≠2≠3; 4≠5≠6; 7≠8≠9; and 10≠11≠12.
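A small numeric sketch of why this happens, assuming normal alpha blending on a single color channel. The two skin-tone values are hypothetical; the 215 at 20% overlay is the rounded Male Rosy Cheeks result reported later in this article.

```python
def blend(overlay: float, opacity: float, skin: float) -> float:
    """Normal alpha blend of one color channel over an opaque skin tone."""
    return opacity * overlay + (1 - opacity) * skin

def overlay_for(target: float, opacity: float, skin: float) -> float:
    """Overlay value that reproduces `target` on `skin` at a chosen opacity."""
    return (target - (1 - opacity) * skin) / opacity

# Hypothetical red-channel skin values, chosen only for illustration.
medium_skin = 174.0
dark_skin = 113.0
target_on_medium = blend(215.0, 0.20, medium_skin)

# Three opacities, each paired with the overlay value that hits the same
# target on the medium tone.
candidates = [(overlay_for(target_on_medium, a, medium_skin), a)
              for a in (0.20, 0.40, 0.60)]

on_medium = [round(blend(c, a, medium_skin), 6) for c, a in candidates]
on_dark = [round(blend(c, a, dark_skin), 6) for c, a in candidates]

print(on_medium)  # all three candidates agree on the medium tone
print(on_dark)    # the same candidates disagree on the dark tone
```

On one skin tone every opacity has a matching overlay color, so the problem is underdetermined; a second skin tone is what pins the opacity down.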

To find the optimum opacity and RGB value combination that can be used as a single Transparent Rosy Cheeks layer to create the exact “blends” when combined with all 5 skin tones, Excel “Solver” functionality can be used across all calculations simultaneously. Solver can test all possible Opacity values (0–100%) to find the best fit across skin tones, subject to the constraints that RGB values must always be greater than or equal to 0 and less than or equal to 255. Figure 9 shows what that looked like in Excel for the Rosy Cheeks calculation on a Male archetype.

Figure 9: Using Excel “Solver” Plug-in to Calculate RGB and Opacity Values Across Multiple Skin Tones

The optimum solution (A in Figure 9) is an RGB Value of R=214.5, G=0, B=0.2 and Opacity=19.8%. Rounded, we use RGB = 215,0,0 at 20% opacity (B in Figure 9) for our single Rosy Cheeks transparent layer (on Male archetype). When composited on each of the different skin tones, it creates the exact blended RGB value found in original Male CryptoPunks with Rosy Cheeks.

This Solver approach had to be replicated across all attribute layers that contained transparency (Mole, Spots, Horned Rim Glasses, Blue/Green/Purple Eye Shadow), and separately across both gender archetypes, in order to correctly identify the optimum RGB/Opacity values that would produce the targeted “blend” RGB value per the original CryptoPunks collection.

Why did we go through such trouble when we could get “close enough” so that the naked eye wouldn’t be able to perceive differences in RGB values? Ultimately, we wanted the ExpansionPunks process to output Punks as true to original form as possible, as if the original CryptoPunks process were resurrected and re-run today. Only by ensuring an exact pixel-for-pixel mimic of the original process could we confidently say we achieved our goal.
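For readers without Excel, the fit across skin tones can be approximated with ordinary least squares. This is a sketch under the assumption of normal alpha blending, not the Solver setup itself: because blend = opacity × overlay + (1 − opacity) × skin is a straight line in the skin value, the slope gives (1 − opacity) and the intercept gives opacity × overlay. The skin values are hypothetical, and the sketch handles one channel only; a fuller version would share one opacity across R, G and B and enforce the 0 to 255 bounds.

```python
from typing import Sequence, Tuple

def fit_overlay(skins: Sequence[float], blends: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of blend = opacity*overlay + (1 - opacity)*skin.

    Returns (overlay, opacity) for one color channel.
    """
    n = len(skins)
    mean_s = sum(skins) / n
    mean_b = sum(blends) / n
    var_s = sum((s - mean_s) ** 2 for s in skins)
    cov = sum((s - mean_s) * (b - mean_b) for s, b in zip(skins, blends))
    slope = cov / var_s                  # equals (1 - opacity)
    intercept = mean_b - slope * mean_s  # equals opacity * overlay
    opacity = 1 - slope
    overlay = intercept / opacity
    return overlay, opacity

# Hypothetical red-channel values for five skin tones.
skins = [234.0, 219.0, 174.0, 113.0, 80.0]
# Blends synthesized from the article's rounded result (215 at 20% opacity).
blends = [0.20 * 215 + 0.80 * s for s in skins]

overlay, opacity = fit_overlay(skins, blends)
print(round(overlay, 3), round(opacity, 3))
```

Feeding in blends measured from real punks instead of synthesized ones would yield a best-fit pair rather than an exact one, which is why the result is rounded afterwards.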

The second challenge to address was the enablement of gender-exclusive traits (e.g. various hats, hair styles and all the Facial Hair traits, etc.) to fit the opposite gender archetype. Given Male and Female archetypes have different surface areas (e.g. Male is 1 px wider and 2 px taller than Female), gender-exclusive traits can’t simply be applied to the alternate archetype:

Figure 10: Attempting to Directly Apply the Hoodie to Female Archetype

In Figure 10 above, simply applying the Hoodie to the Female archetype creates 3 visually-jarring issues that depart from the CryptoPunks aesthetic: (1) missing pixels around the lower-left neck, (2) a narrowing of the visible female neck from 3 px wide to only 2 px wide, and (3) exposure of too much forehead, losing the hooded look of the original. Moreover, the overall proportions feel off given the smaller frame of the archetype. To accomplish a more proportionate and aesthetic fit, the Hoodie was edited to address these 3 inconsistencies, as seen in Figure 11 below:

Figure 11: Creating a More Proportionate and Aesthetic Hoodie for the Smaller Female Archetype

Caution needed to be applied to ensure dynamics of the original trait were also considered — in this case, the Hoodie “hides” the Earring trait. Check out Punk #269 as an example:

Figure 12: Example of a “Hidden” Earring Trait on Punk #269

Such dynamics needed to be honored in the democratized version of the trait. Each gender-exclusive trait underwent a similar pixel-by-pixel assessment, which is how the overall CryptoPunks aesthetic was successfully carried through to these new Punk permutations that include formerly gender-exclusive traits.

2. Validating the Trait Layer Library

Before attempting to generate even a single ExpansionPunk, we needed to authenticate the accuracy of the trait layer library. We did this by generating (cloning) the entire 10K CryptoPunk population from scratch using only the original CryptoPunk trait data (CSV file of all original punks and their attributes) in concert with our newly established trait layer library. We used Excel to bulk translate the original CryptoPunk metadata into 10K lines of punk syntax that could be fed to ImageMagick via the Windows command line. Below is the sample syntax representing the famous Covid Alien Punk #7523 (Figure 13) that recently sold at Sotheby’s auction:

magick convert um-alien.png um-earring.png um-knittedcap.png um-medicalmask.png -background none -flatten CryptoPunks\punk7523.png

Figure 13: Punk #7523, aka “Covid Alien”
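The metadata-to-syntax translation was done in Excel; as an illustrative alternative, the same string assembly can be sketched in Python. The file-naming rule here (the "um-" prefix plus the lowercased trait name with spaces removed) is inferred from the sample command and is an assumption, as are the function names.

```python
from typing import List

def layer_file(trait: str, prefix: str = "um-") -> str:
    """Map a trait name to a layer file, e.g. 'Knitted Cap' -> 'um-knittedcap.png'."""
    return f"{prefix}{trait.lower().replace(' ', '')}.png"

def build_command(punk_id: int, archetype: str, traits: List[str],
                  out_dir: str = "CryptoPunks") -> str:
    """Build one ImageMagick command line that flattens the layers into a PNG.

    Layers are listed bottom-to-top: archetype first, then each trait in order.
    """
    layers = [layer_file(archetype)] + [layer_file(t) for t in traits]
    return ("magick convert " + " ".join(layers)
            + f" -background none -flatten {out_dir}\\punk{punk_id}.png")

command = build_command(7523, "Alien", ["Earring", "Knitted Cap", "Medical Mask"])
print(command)
```

Looping a function like this over every row of the metadata CSV would produce one command line per punk, ready to be saved as a batch file.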

With 10K generated CryptoPunks clones, we used ImageMagick to compile them into a 100 x 100 punk matrix (2400 x 2400 pixels) that we named PunksCloned.png…

magick montage *.png -geometry 24x24+0+0 -background none PunksCloned.png

We then performed a perceptual hash (pHash) comparison against the original CryptoPunks “punks.png” composite 2400 x 2400 px image (the same .png image that has been SHA-256 hashed into the original CryptoPunks contract)…

magick compare -metric phash PunksCloned.png punks.png -compose src Aggregate_Difference.png

The command above results in the creation of a third 2400 x 2400 px image (Aggregate_Difference.png) that details the location of any pixels that don’t exactly match (across RGB value, Opacity, Location). Here’s an example on a much smaller 24x24 px image, using again our friend Covid Alien Punk #7523 and an altered version without the Medical Mask:

magick compare -metric phash punk7523.png punk7523_nomask.png -compose src Difference.png

Figure 14: Comparing Every Pixel via Perceptual Hashing

As you can see — the Perceptual Hash comparison is able to isolate ONLY the pixels that are not perfect matches across the two images (Figure 14) — in this case only the missing “Medical Mask” pixels. This pHash comparison approach was conducted repeatedly at full collection scale to compare our 2400 x 2400 px “PunksCloned.png” composite with the original 2400 x 2400 px “punks.png” composite, each time exposing minor pixel nuances in trait layers that had to be adjusted in the trait layer library. After the 5th iteration of this cloning and comparison process, no more discrepancies were found between our clones and the original CryptoPunks — we had successfully re-engineered the original CryptoPunks generative process and validated it through creation of a “PunksCloned.png” that is pixel-for-pixel (5.76M pixels) an exact duplicate of the original “punks.png” (identical Perceptual Hash value, zero Hamming Distance between images). As an interesting aside, the Punks you see in the CryptoPunks Explorer are actually the clones generated by our process!
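The pixel-for-pixel check itself is conceptually simple. The following is a stripped-down sketch of that idea in pure Python, not a reimplementation of ImageMagick's compare: it lists the coordinates at which two same-sized RGBA images differ. The tiny 2x2 images are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]   # (R, G, B, A)
Image = List[List[Pixel]]           # rows of pixels

def diff_pixels(a: Image, b: Image) -> List[Tuple[int, int]]:
    """Return (x, y) coordinates where two images differ in any of R, G, B or A."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have identical dimensions")
    return [(x, y)
            for y, (row_a, row_b) in enumerate(zip(a, b))
            for x, (pa, pb) in enumerate(zip(row_a, row_b))
            if pa != pb]

# Hypothetical 2x2 images that differ only in the bottom-right pixel's opacity.
original = [[(0, 0, 0, 255), (1, 1, 1, 255)],
            [(2, 2, 2, 255), (3, 3, 3, 255)]]
clone = [[(0, 0, 0, 255), (1, 1, 1, 255)],
         [(2, 2, 2, 255), (3, 3, 3, 0)]]

mismatches = diff_pixels(original, clone)
print(mismatches)
```

An empty list corresponds to the end state described above: a clone that matches the original at every pixel.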

3. Building a Trait Assignment Engine

With the layer library built and verified, we needed a logic engine that could manage all of the below…

1. Bulk author trait combinations based on directionally-instructed trait-level probabilities

2. Accommodate Archetype-specific trait logic, for example…

a. Aliens/Apes should only ever be assigned “hat” attributes — never “hair” attributes. Logical — as Aliens/Apes don’t grow human hair styles (but Zombies do, of course, as they were formerly humans).

b. Aliens/Apes never have Facial Hair, Blemish, Mouth or Nose traits. It’s important to honor that logic in the ExpansionPunks population to ensure their visual consistency with the original Aliens/Apes.

c. Punks with “Pilot Helmet” must never be assigned an “Eye” trait (e.g. glasses, eyeshadows, etc.). Logical, as the Pilot Helmet has goggles on it already, so having double Eye traits would be an awkward outcome.

d. Punks with “Welding Goggles” must never be assigned a “Hat” attribute — only “Hair” attributes. Again logical, as the Goggles wouldn’t pair well with various Hat traits.

3. Validate trait combinations as unique by de-duping against the trait combinations found in the CryptoPunks collection

4. Bulk create Punk generation syntax to feed to ImageMagick through the Windows command line.

We used Microsoft Excel to accomplish all of the above in a workbook containing 4 primary worksheets…

Worksheet 1: Generator applies “random” integers as multipliers against each trait probability to randomly assign traits, then applies Archetype-specific logic, followed by trait combination de-duping against the original CryptoPunk population — each new row is a new Punk, with an “Expansion” vs. “Duplicate” flag to indicate whether it should be generated. Virtually every cell in Figure 15 below contains a complex formula that brings this all together. The visual layout could be simplified further by baking more formulas into each cell — possibly a future enhancement to make this Excel worksheet configurable for the creation of other collections.

Figure 15: ExpansionPunks Trait Assignment Engine Built in Excel

Worksheet 2: PNG Name Lookup Table for the Generator to use (vlookup) in creating final Punk syntax for generation:

Figure 16: PNG Lookup Table

Worksheet 3: Trait Probabilities for Generator to use for randomization of trait assignment — in Figure 17 you can see the rough instruction for the generator to follow the same Skin Tone probabilities as the original CryptoPunks collection (~10% Albino, and ~30% each across Light/Medium/Dark).

Figure 17: Trait Probabilities to Instruct Random Trait Assignment

Worksheet 4: Original 10K CryptoPunks Metadata for the Generator to cross-reference in deduping ExpansionPunk trait combinations
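For comparison with the Excel workbook, the same three stages (weighted random assignment, archetype rules, de-duping) can be sketched in Python. This is an illustrative reduction, not the actual generator: the trait tables, probabilities and function names are hypothetical, and only a handful of the rules listed above are modeled.

```python
import random
from typing import Dict, FrozenSet, List, Optional, Set

# Hypothetical, heavily reduced trait tables; probabilities are illustrative only.
ARCHETYPES: Dict[str, float] = {"Male": 0.6, "Female": 0.3, "Alien": 0.05, "Ape": 0.05}
CATEGORIES: Dict[str, Dict[Optional[str], float]] = {
    "Hat": {None: 0.5, "Pilot Helmet": 0.1, "Knitted Cap": 0.4},
    "Hair": {None: 0.5, "Wild Hair": 0.5},
    "Eyes": {None: 0.5, "Welding Goggles": 0.2, "Eye Patch": 0.3},
    "Facial Hair": {None: 0.7, "Handlebars": 0.3},
}

def pick(options: dict, rng: random.Random):
    """Weighted random choice among the keys of `options`."""
    names = list(options)
    return rng.choices(names, weights=[options[n] for n in names], k=1)[0]

def assign(rng: random.Random) -> Dict[str, str]:
    """Randomly assign traits, then apply the archetype-specific rules."""
    punk = {"Type": pick(ARCHETYPES, rng)}
    for category, options in CATEGORIES.items():
        choice = pick(options, rng)
        if choice is not None:
            punk[category] = choice
    if punk["Type"] in ("Alien", "Ape"):        # hats only, no hair or facial hair
        punk.pop("Hair", None)
        punk.pop("Facial Hair", None)
    if punk.get("Hat") == "Pilot Helmet":       # helmet already has goggles
        punk.pop("Eyes", None)
    if punk.get("Eyes") == "Welding Goggles":   # goggles pair with hair, not hats
        punk.pop("Hat", None)
    return punk

def generate(count: int, existing: Set[FrozenSet], seed: int = 0,
             max_tries: int = 100_000) -> List[Dict[str, str]]:
    """Collect up to `count` combinations not in `existing` or already generated."""
    rng = random.Random(seed)
    seen = set(existing)
    population: List[Dict[str, str]] = []
    for _ in range(max_tries):
        if len(population) == count:
            break
        punk = assign(rng)
        key = frozenset(punk.items())
        if key not in seen:                     # "Expansion" rather than "Duplicate"
            seen.add(key)
            population.append(punk)
    return population

# One hypothetical pre-existing combination to de-dupe against.
existing = {frozenset({"Type": "Male"}.items())}
population = generate(20, existing, seed=1)
```

In this sketch the `existing` set plays the role of Worksheet 4, and the `seen` set doubles as the "Expansion" vs. "Duplicate" flag.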

4. ExpansionPunks Generation

With the validated image layer library of all traits and a single line of generation syntax per Punk (as output by our Excel generator), we then fed ~14K lines of punk syntax to ImageMagick via the Windows command line. Below is the sample syntax representing Punk #10000:

magick convert um-dark.png um-smile.png um-handlebars.png um-peakspike.png um-cigarette.png -background none -flatten ExpansionPunks\punk10000.png

A few minutes later, we had 14K trait-distinct ExpansionPunks, each a 24x24 pixel PNG file. Jumping into the assets, it was exciting to discover all kinds of ExpansionPunks that still felt right at home amongst the CryptoPunks. In Figure 18 below, we see some sample ExpansionPunks (orange) we like to refer to as “Filler Punks,” since they fill a skin tone gap in an otherwise complete trait combination archetype.

Figure 18: “Gap” Punks Filled by ExpansionPunks

Of course, the other post-generation excitement was the emergence of new, diverse archetypes (Figure 19) that had existed only as Punk trait syntax an hour prior. Some examples are below:

Figure 19: New, Diverse ExpansionPunk Archetypes

5. Perceptual De-Duping Against CryptoPunks Population

With 14K ExpansionPunks — there were sure to be “perceptual” duplicates (e.g. an ExpansionPunk with a hidden trait that on visual inspection would appear pixel-for-pixel as an exact duplicate of an existing CryptoPunk). There are a variety of traits that can hide other traits — just a few as examples below:

> Medical Mask hides all Mouth traits (Lipsticks, Smile, Frown, Buck Teeth)
> Medical Mask hides some Facial Hair traits (Mustache, Handlebars)
> Big Beard hides Gold Chain
> Wild Hair hides Earring
> Straight Hair hides Earring
> Etc.

Given the variety of potential ‘hidden’ trait collisions, relying solely on the trait combination dedupe performed in Excel pre-generation was not enough to ensure a fully unique expansion population. Thus, each of the 14K images then had to be hashed into a “signature” value that could be deduped both within the ExpansionPunk collection itself and against the signature values of the original 10K CryptoPunks. ImageMagick’s %# format escape returns a hash computed from an image’s pixel values, so two Punks that render identically share a signature even when their trait lists differ. Enter ImageMagick again for the hashing of signatures…

magick identify -quiet -format "%f %#\n" *.png > Signature.txt

Now with 14K ExpansionPunks hashed signatures and 10K CryptoPunks hashed signatures, it was a simple task to compare the two lists with some basic Excel functions and remove any duplicate signatures from the ExpansionPunks population. Figure 20 below highlights an example of a trait-distinct Expansion candidate that Perceptual Hashing revealed as a visual collision with CryptoPunk #8149 due to a hidden Earring trait.

Figure 20: Trait-Distinct Punk Exposed as a Visual Collision through Perceptual Hashing

This perceptual hashing process further reduced our ExpansionPunks population from ~14K to ~12K trait-distinct and visually-unique punks.
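The signature-based de-dupe can be sketched as follows. This is a stand-in under stated assumptions: it hashes raw RGBA values with SHA-256 from Python's hashlib, so the resulting strings will not match ImageMagick's signature output, and the punk names and two-pixel "images" are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

Pixel = Tuple[int, int, int, int]   # (R, G, B, A)

def signature(pixels: Iterable[Pixel]) -> str:
    """Hash the rendered pixel values, ignoring any trait metadata."""
    data = bytes(value for px in pixels for value in px)
    return hashlib.sha256(data).hexdigest()

def visually_unique(candidates: Dict[str, List[Pixel]],
                    originals: Dict[str, List[Pixel]]) -> List[str]:
    """Keep candidates whose pixels match no original and no earlier candidate."""
    seen = {signature(px) for px in originals.values()}
    keep: List[str] = []
    for name, px in candidates.items():
        sig = signature(px)
        if sig not in seen:
            seen.add(sig)
            keep.append(name)
    return keep

# Hypothetical two-pixel "images".
skin, hair, hat = (174, 139, 97, 255), (0, 0, 0, 255), (200, 30, 30, 255)
originals = {"cryptopunk": [skin, hair]}
candidates = {
    "candidate_a": [skin, hair],  # hidden trait: renders identically to an original
    "candidate_b": [skin, hat],   # visually new
    "candidate_c": [skin, hat],   # duplicates candidate_b
}

kept = visually_unique(candidates, originals)
print(kept)
```

Because only rendered pixels are hashed, a candidate whose extra trait is hidden collides with the original it resembles, which is exactly the case the trait-level dedupe cannot catch.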

6. Trim Population to Final 10K

A variety of Pivot Tables were built on top of the 12K rows of Punk data to assess the general distributions of punks across “Types” and “Attributes” — and the collection was manually trimmed from 12K down to 10K in a way that optimized trait distributions to feel more in line with the original 10K population (as much as possible at least). For example, after de-duping visual collisions, the final 12K population skewed more towards 4/5/6-attribute punks than the original population, so punk reduction focused first on randomly removing punks with higher attribute counts. This process was more art than science, but it was still done at aggregate views only (e.g. looking at pivot table values and then randomly removing punks with bulk actions vs. singling out individual punks for removal). In Figure 21 below, the full 100 x 100 punk composite is making a first appearance.

Figure 21: All 10K ExpansionPunks
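The bulk, aggregate-level trimming can be loosely sketched as below. The actual trim was manual and guided by pivot tables, so this is only an approximation of the idea: pick a target number of punks per attribute count and randomly drop the surplus in each bucket without inspecting individual punks. The IDs, counts and targets are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List

def trim_to_targets(attribute_counts: Dict[str, int],
                    target_per_count: Dict[int, int],
                    seed: int = 0) -> List[str]:
    """Randomly keep at most target_per_count[n] punks for each attribute count n.

    Buckets without a target are kept whole.
    """
    rng = random.Random(seed)
    buckets: Dict[int, List[str]] = defaultdict(list)
    for punk_id, n in attribute_counts.items():
        buckets[n].append(punk_id)
    kept: List[str] = []
    for n, ids in buckets.items():
        quota = target_per_count.get(n, len(ids))
        kept.extend(ids if len(ids) <= quota else rng.sample(ids, quota))
    return sorted(kept)

# Hypothetical population: punk ID -> number of attributes.
attribute_counts = {"p1": 2, "p2": 2, "p3": 3, "p4": 3, "p5": 3,
                    "p6": 4, "p7": 4, "p8": 4, "p9": 4}
# Hypothetical targets that thin out the higher attribute counts.
kept = trim_to_targets(attribute_counts, {3: 2, 4: 1}, seed=42)
print(kept)
```

Which punks survive inside a bucket depends only on the random seed, mirroring the principle of removing punks in bulk rather than singling any out.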

Some final notes on the above process…

· Admittedly, there are more code-centric ways of building a generator without the use of Excel. We didn’t go that route, and we’re okay with that 😉 — we think our brute force Excel approach is somewhat unique here (vs. leveraging someone else’s generation scripts).

· We fully acknowledge that ExpansionPunks were pre-generated vs. generated on-chain at time of mint. We made the decision to pre-generate to ensure there would be no risk of trait-combination collisions or visual collisions with CryptoPunks. We estimated the computational load to do such de-duping at time of mint, and concluded that pre-generation would ultimately be a better minting experience for collectors.

· While the above process appears pretty straightforward — it’s the result of months of trial and error working on other processes that ultimately weren’t scalable. For example, before we discovered ImageMagick, we were using the “variable” and “dataset” functions in Photoshop to bulk generate ExpansionPunks. While it’s possible to do this, it required more steps (export data sets as PSD files, then bulk convert PSDs to PNGs, etc.) and introduced more room for error than the Windows command line approach with ImageMagick.

What’s Next?

ExpansionPunks are now live and still available for minting @ www.expansionpunks.com — join in the inclusive expansion of the PunkVerse by minting your very own ExpansionPunk!

Meanwhile, we are committed to advancing the entire Punkverse, bringing new experiences and value to the original CryptoPunks community as well. In that spirit, we’re excited to introduce the 10K Collection Explorer, a new experience for navigating 10K NFT collections, available now for the CryptoPunks, ExpansionPunks and World of Women collections.

The 10K Collection Explorer empowers you to unlock further insights and connection through dynamic filtering and sorting of the entire collection in one visually engaging experience. It changes the way attribute data can be used, revealing trends that would otherwise be invisible.

Stay Connected

Please join us in Discord and on Twitter to learn more, stay connected and support our mission to expand the Punkverse.

www.ExpansionPunks.com (still minting)

Jeremy Posvar
Geek Culture

Decentralization enthusiast, Punk #7741, ex-MSFT, views are my own.