In part one of this article I explored the broader concepts of generative art and machine learning art in particular. Make sure to read it if you would like the full context for GAN machine learning models. Here in part two, I will describe my process as a Machine Learning Artist.
T3RRA is a StyleGAN2-ADA model made through network blending: it combines a model trained on textures with a model trained on abstract and surreal art.
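Network blending is commonly done by combining the two generators' weights layer by layer, so one model contributes the coarse, low-resolution structure and the other the fine detail. The idea can be sketched like this, with plain dictionaries standing in for checkpoints; the layer names, values and crossover point are illustrative assumptions, not details of my actual pipeline:

```python
# Minimal sketch of network blending: combine two models' weights layer
# by layer. Real StyleGAN2-ADA blending operates on the generator's
# per-resolution layers; here plain dicts of floats stand in for
# checkpoints (names and values are illustrative, not actual weights).

def blend_models(weights_a, weights_b, blend_at):
    """Take low-resolution layers from model A and high-resolution
    layers from model B, switching over at `blend_at` pixels."""
    blended = {}
    for name, value in weights_a.items():
        resolution = int(name.split("x")[0])  # e.g. "64x64.conv" -> 64
        blended[name] = value if resolution < blend_at else weights_b[name]
    return blended

textures = {"4x4.conv": 0.1, "64x64.conv": 0.5, "1024x1024.conv": 0.9}
art      = {"4x4.conv": 0.2, "64x64.conv": 0.6, "1024x1024.conv": 1.0}

# Coarse structure from the textures model, fine detail from the art model:
hybrid = blend_models(textures, art, blend_at=64)
print(hybrid)  # {'4x4.conv': 0.1, '64x64.conv': 0.6, '1024x1024.conv': 1.0}
```

Sliding the crossover point is one of the main levers in this kind of experimentation: a low crossover keeps little of model A, a high one keeps most of it.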
To begin the process I had to choose and curate the data sets. For the textures model I sourced images from the Flickr Material Database and the Describable Textures Dataset. Out of these 6,000+ images I discarded many that were either low quality or didn’t fit the style of texture I wanted, such as plastic, checkered and spiral patterns. I ran the remaining images through a machine learning resolution-enhancing model to get them to the 1024x1024-pixel resolution that I use for consistent training. For the art model, I curated abstract and surreal images from a WikiArt-sourced dataset and ran them through a similar enhancement process.
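The curation pass boils down to two filters: discard what is too low quality to rescue, and route everything below the training resolution through the upscaler. A minimal sketch of that triage, operating on assumed (filename, width, height) records rather than real image files, and with an assumed quality floor:

```python
# Sketch of the curation pass: keep images that meet a quality floor and
# flag which ones still need the ML upscaler to reach the 1024x1024
# training resolution. The threshold and records are illustrative.

MIN_SIDE = 256      # assumed quality floor; discard anything smaller
TARGET = 1024       # training resolution used for the models

def curate(records):
    kept, needs_upscale = [], []
    for name, w, h in records:
        if min(w, h) < MIN_SIDE:
            continue                      # too low quality to rescue
        kept.append(name)
        if min(w, h) < TARGET:
            needs_upscale.append(name)    # route through the upscaler
    return kept, needs_upscale

records = [("bark.jpg", 800, 600), ("thumb.jpg", 120, 90), ("marble.jpg", 2048, 2048)]
kept, needs_upscale = curate(records)
print(kept)           # ['bark.jpg', 'marble.jpg']
print(needs_upscale)  # ['bark.jpg']
```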
After compiling these data sets I started training and experimenting. Each training session lasted days. As training progressed, the algorithm spit out snapshots, each a random sampling of twenty-eight images, as a way of checking in on the training. Depending on how the sample images turned out, I might tweak the training session. Sometimes I would go back and alter the data sets.
Sometimes I would change training parameters like the augmentation settings. These augmentation settings are what set StyleGAN2-ADA (adaptive discriminator augmentation) apart from other GAN algorithms: they take the initial data set and apply filters, rotations or size alterations to give the model an even richer data source to learn from. Using network blending, I also experimented with different combinations of the various art and texture models I had made. All told I spent over 1,000 hours training these experimental models. Once I had a few finished models I liked, I generated several hundred images per model to get a better sense of what each would produce, eventually settling on the final model that I named T3RRA.
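The "adaptive" part of ADA is that the probability of augmenting a training image is raised when the discriminator starts overfitting and lowered otherwise. The toy sketch below shows the spirit of it; the transforms, the overfitting signal and the update rule are all simplified assumptions, not the real StyleGAN2-ADA internals:

```python
import random

# Toy sketch in the spirit of StyleGAN2-ADA: each training image is
# transformed with probability p, and p is nudged up or down depending
# on an overfitting signal from the discriminator. The transforms and
# update rule here are simplified illustrations.

def hflip(img):                 # img is a 2D list of pixel values
    return [row[::-1] for row in img]

def rot90(img):
    return [list(row) for row in zip(*img[::-1])]

def augment(img, p, rng):
    if rng.random() < p:
        img = rng.choice([hflip, rot90])(img)
    return img

def update_p(p, overfit_signal, step=0.05):
    # Raise p when the discriminator overfits, lower it otherwise,
    # clamped to [0, 1] as in the ADA heuristic.
    return min(1.0, max(0.0, p + (step if overfit_signal > 0 else -step)))

rng = random.Random(0)
img = [[1, 2], [3, 4]]
p = update_p(0.0, overfit_signal=1)  # discriminator memorizing -> augment more
print(p)                             # 0.05
aug = augment(img, p=1.0, rng=rng)   # always transformed when p == 1.0
print(aug)
```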
To get the best pieces for the collection that would represent T3RRA the model, I developed a multistep generation and curation process. I started by generating 777 random images from seeds. Seeds are integers that initialize the model's latent vector: for most StyleGAN2 models, a 512-dimensional array of numbers that determines which image gets generated. These vectors are essentially randomized, but a particular seed will always produce the same vector. Seed one will always generate the same image, yet there is no way to correlate a seed number with any category or type of image within a model.
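The seed-to-vector relationship can be sketched without the generator itself: the same integer always reproduces the same pseudo-random 512 values, which is why seed one always yields the same image. This uses Python's standard `random` module as a stand-in for the actual sampling the StyleGAN2 code performs:

```python
import random

# Sketch of how a seed maps to a latent vector: the same integer always
# reproduces the same 512 pseudo-random Gaussian values. In the real
# pipeline this vector is fed to the StyleGAN2 generator, omitted here.

Z_DIM = 512  # latent dimensionality of most StyleGAN2 models

def latent_from_seed(seed):
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]

z1 = latent_from_seed(1)
z1_again = latent_from_seed(1)
z2 = latent_from_seed(2)

print(len(z1))          # 512
print(z1 == z1_again)   # True  -> the seed fully determines the vector
print(z1 == z2)         # False -> but the number says nothing about content
```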
After choosing around fifty images from this batch that spoke to me, I wrote custom scripts and GUIs to identify and modify what are called the feature vectors of the remaining seeds. Feature vectors are like directions in the model landscape that an algorithm has determined are significant. The significance of these directions is sometimes hard to tell, especially when it comes to abstract art. They are also almost always entangled, meaning that when one aspect of an image changes, other aspects tend to change with it. So even once these feature vectors are produced, a human needs to determine which ones are actually useful. To do this I printed out representations of the first 100 feature vectors of the image I was working on and chose the ones I personally deemed impactful. To make these representations I generated two images along the feature vector path in each of the positive and negative directions, with the original in the middle, to get the best sense of what was changing.
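The comparison strips described above amount to stepping the latent along a direction and generating an image at each step. A minimal sketch of that stepping, using toy 3-dimensional vectors in place of real 512-dimensional latents and an invented direction:

```python
# Sketch of exploring one feature vector: step the original latent along
# a direction in both the positive and negative sense, collecting the
# five latents (-2, -1, 0, +1, +2 steps) that the printed comparison
# strips would be generated from. Vectors here are plain lists; real
# feature directions come from analyzing the model.

def move_along(latent, direction, alpha):
    return [z + alpha * d for z, d in zip(latent, direction)]

def direction_strip(latent, direction, steps=(-2, -1, 0, 1, 2)):
    return [move_along(latent, direction, a) for a in steps]

latent = [0.5, -1.0, 0.25]     # toy 3-d latent (real ones are 512-d)
feature = [1.0, 0.0, -1.0]     # toy direction, e.g. "more ridged texture"

strip = direction_strip(latent, feature)
print(strip[2] == latent)  # True -> the middle image is the original
print(strip[0])            # [-1.5, -1.0, 2.25]
```

Because the directions are entangled, in practice each step changes several things at once, which is exactly why the strips need a human eye.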
Using combinations of these feature vectors I was able to manipulate and modify the original seed-based images, getting some sense of control and direction over them. With all the entanglements and random combinations of feature vectors, it was still more a process of experimentation and wrangling than detailed control and fine-tuning. After experimenting with the initial batch of images in this way and generating more, I again whittled them down to only those that spoke to me most. Here I arrived at the final images that would represent the entire model and comprise the full collection.
From there I took the 1024x1024 images and put them through the resolution-enhancing model to get crisper, more detailed images at 4096x4096 pixels. The final step was to bring them into After Effects for some post-processing. This is different for every image, but generally speaking I do color adjusting and contrast balancing, and add some lighting and additional texturing when the piece calls for it.
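The After Effects pass is manual, but the basic moves it involves can be sketched numerically: scale pixel values around a midpoint for contrast, shift them for brightness, and clamp to the valid range. The factors and sample values below are illustrative assumptions, not actual settings from any piece:

```python
# Toy sketch of a contrast/brightness adjustment on normalized pixel
# values in [0.0, 1.0]. The contrast factor and sample row are
# illustrative, not settings used for the actual collection.

def adjust(pixels, contrast=2.0, brightness=0.0):
    out = []
    for p in pixels:
        p = (p - 0.5) * contrast + 0.5    # expand around mid-gray
        p = p + brightness                # optional overall lift
        out.append(min(1.0, max(0.0, p))) # clamp to valid range
    return out

row = [0.0, 0.25, 0.5, 0.75, 1.0]
print(adjust(row))  # [0.0, 0.0, 0.5, 1.0, 1.0]
```

A contrast factor above 1.0 pushes values away from mid-gray, which is why the outer samples clip to pure black and white here.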
This has been a long and impactful journey for me. It showed me the power of harnessing a machine learning model’s core, not just its generated results. Because I merely set the stage for the models I was also able to have a sense of discovery each time I went through what the models had generated. Even when tweaking the feature vectors I would sometimes come up with completely unexpected results. I had an adventure into uncharted territory with art and came out the other side with something precious.