AdiGAN: A visual adventure into the generative world of machine learning
By Fabio Toste, Senior Creative Developer, and Peter Altamirano, Technical Director, at Jam3.
Part of our bread & butter at Jam3 is to generate innovative ideas, separately from client work, that can sometimes be technically challenging and require intensive research. Jam3 Labs is where we take all those great ideas and make them happen in quick R&D sprints.
AdiGAN Goal: Generate unique customized products
The idea this time was to generate textures using a GAN (Generative Adversarial Network) model, sampling different customizable styles onto the same adidas sneakers to create beautiful modified versions of them.
Part of the goal was to keep it simple and create usable textures, but not be concerned about high resolution (at least at the beginning). The goal was to train the AI fast, understand the concepts, and get ideas on how to generate content using machine learning. This article is not exactly a tutorial, but an in-depth look into how we went from an idea to a working final prototype.
This use case can be applied in real campaigns to generate visuals applied to different products based on different inputs. If you would like to explore this idea further don’t hesitate to reach out to us via firstname.lastname@example.org.
We have executed multiple projects with adidas, such as ComplexCon, Ozweego, a collaboration with Donald Glover at Coachella, an immersive experience for Predator, and many others. In one of our brainstorming sessions the idea of machine learning generated products came up and we decided to create a proof of concept in the Jam3 Labs. That’s also how we came up with the name AdiGAN (adidas + GAN). Cool, right?
To train the machine learning model we used a series of images from ’90s fashion. The idea was to take all their colors and shapes and transfer them to generate different textures as an output.
TensorFlow and Unreal
We used TensorFlow and Unreal to accomplish this: TensorFlow for working with GAN models, and Unreal to finalize the graphics.
TensorFlow is an open-source platform for machine learning created by the Google Brain team.
It’s a tool that allows developers to build and deploy machine learning powered applications with different tools, libraries, and an awesome community. Unreal is a game engine developed by Epic Games used for creating advanced real-time 3D content.
TensorFlow: CPU vs GPU
The first step in playing with TensorFlow is installing it, and the first question that comes with it: CPU or GPU? After a couple of tests we can say that TensorFlow on a GPU can be 10 to 20 times faster, and when you work with image generation that makes a huge difference.
It seems like an easy choice: let’s use the GPU, right? But if you are using a MacBook it’s harder, since macOS doesn’t support Nvidia’s CUDA environment. You can try PlaidML + OpenCL to get a GPU version working without the power of CUDA, or consider running TensorFlow on the Google Colab cloud platform.
In our case we wanted to test locally and also use RTX in Unreal, so we needed an Nvidia card anyway.
The solution was to install Windows on a MacBook and use an external GPU to get full access to the power of RTX. It’s not an easy task; here is a separate tutorial to follow.
Our Hardware and Software Setup
- MacBook Pro 15” (2019, 2.3GHz i9, Radeon Pro 560X, 1TB SSD)
- External Samsung T5 Portable SSD, 1TB (Windows installation)
- Windows 10
- Python 3.7
- CUDA Toolkit 10.1
- TensorFlow >= 2.0
- Unreal >= 4.22
It is not the intent of this article to walk through how to install Python and TensorFlow. You can follow the official TensorFlow Install Guideline, and for more information about installing Unreal you can follow this guideline.
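Once everything is installed, a quick sanity check confirms whether TensorFlow actually sees the GPU before you commit hours to training. This is a minimal sketch using the standard TensorFlow 2.x API (the `tf.config` calls below exist from TF 2.1 onward):

```python
import tensorflow as tf

# An empty GPU list means training will silently fall back to the
# (much slower) CPU.
gpus = tf.config.list_physical_devices("GPU")
print("GPUs found:", gpus)

# Optional: let TensorFlow grow GPU memory on demand instead of
# reserving it all at once (handy on a shared or external GPU).
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
```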
Playing with GAN
Now that we had the proper setup we could start making super cool stuff. First we had to find a good model to start with and GAN (Generative Adversarial Network) was the right choice for image generation, style transfers, and art.
TensorFlow provides a great collection of pre-trained models and tutorials. We wanted to compare results using different models and decided to test DCGAN, Pix2Pix, CycleGAN, and DeepDream as each seemed to help us generate more unique outputs.
As datasets we used a series of images from the ’90s and found some cool vintage adidas pieces.
The final results are textures like these ones:
The machine learning models we used were trained to fit just one texture. This means they can be used only for this prototype, not for generic texture generation or style transfer. Of course, you can use a bigger input dataset to make them more generic, but it will probably take more time to train.
For the first prototype, and to understand the GAN concepts, we used a Deep Convolutional Generative Adversarial Network (DCGAN).
To get started we used a simple GAN and a single texture mapped to the 3D model. We used the following texture:
First we tried to use a GAN to reconstruct the texture as is, or at least get as close to it as possible.
To accomplish that we needed to create two models: a generator and a discriminator. The generator creates an image from random pixels and passes the result to the discriminator, which compares it with the real texture and checks how accurate it is. If it’s very far off, the process goes back to the generator, which produces another random pixel image, until the output gets close to the original.
Using the generator loss to quantify how well the generator tricks the discriminator, and the discriminator loss to quantify how well the discriminator distinguishes real images from fakes, we can set a stopping point for the process and get as close as possible to the expected result.
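The two losses described above can be written in a few lines. This follows the loss setup from TensorFlow's official DCGAN tutorial; the function names are ours:

```python
import tensorflow as tf

# Binary cross-entropy on the discriminator's raw logits.
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    # The discriminator should label real textures 1 and generated ones 0.
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    return real_loss + fake_loss

def generator_loss(fake_output):
    # The generator wins when the discriminator labels its fakes as real (1).
    return cross_entropy(tf.ones_like(fake_output), fake_output)
```

During training, both losses are computed every step on the discriminator's outputs, and two separate optimizers update the two networks.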
First, we did it with a very small image (28x28), so in the end we had a very low-resolution result, as you can see below:
It’s very close in shape to the original, but as the resolution is very low you can’t really see the details on the final result. It did, however, prove that the model was working. The time to generate a small image was around 3 minutes with our setup.
It’s good practice to start small and increase the resolution once you get a good result; remember that the time to train your model can go from minutes to hours just by changing the final resolution of the model’s input/output.
At the end we got an image like this:
It’s not perfect, but good enough to be used as a test texture. This higher-resolution (512x512) image took around 30 minutes to produce over the network, so as you can see, processing time increases a lot as soon as you raise the resolution of the image.
After trying the simple GAN to generate textures, we had to move forward, as we wanted to create new textures, not just a worse version of the texture we already had. It was time to try new GAN models. The next one we tried was Pix2Pix.
Pix2Pix uses the same concept as the plain GAN, but instead of starting the generator from random pixels, it starts from the pixels of another image, like the example below:
For our experiment we used the same texture and random images from our 90’s dataset.
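The other key difference is in the generator's objective: besides fooling the discriminator, pix2pix adds an L1 term that keeps the output close to the paired target image. A sketch following the TensorFlow pix2pix tutorial (the function name is ours; the weight of 100 comes from the pix2pix paper):

```python
import tensorflow as tf

LAMBDA = 100  # weight of the L1 term, from the pix2pix paper

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def pix2pix_generator_loss(disc_fake_output, generated_image, target_image):
    # Adversarial term: try to fool the discriminator, as in a plain GAN.
    gan_loss = bce(tf.ones_like(disc_fake_output), disc_fake_output)
    # L1 term: stay close to the paired target texture, pixel by pixel.
    l1_loss = tf.reduce_mean(tf.abs(target_image - generated_image))
    return gan_loss + LAMBDA * l1_loss
```

The heavy L1 weight is what pulls the output toward the paired target instead of letting the generator drift to any image that fools the discriminator.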
One of the original pix2pix use cases was translating maps (colored images) to satellite images, or, like the example above, colors to buildings, but we were able to get some interesting results applying it to textures. Now it was time to try the next model.
CycleGAN is another type of GAN used to transfer characteristics of one image to another, for example making a horse look like a zebra.
We used the CycleGAN model to verify whether the results would differ between models, and they did. You can observe the different results.
As you can see, it generated a colorful texture, but with a different relationship between the dark and light areas of the image; you can notice the difference between the models (Pix2Pix vs. CycleGAN).
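What lets CycleGAN work without paired images is its cycle-consistency loss: translating an image to the other domain and back should reproduce the original. A minimal sketch (the function name is ours; the weight of 10 comes from the CycleGAN paper):

```python
import tensorflow as tf

LAMBDA = 10  # cycle-consistency weight from the CycleGAN paper

def cycle_consistency_loss(real_image, cycled_image):
    # Translating A -> B -> A should give back the original image;
    # the L1 distance between the two penalizes any drift.
    return LAMBDA * tf.reduce_mean(tf.abs(real_image - cycled_image))
```

This term is added, in both directions, to the usual adversarial losses of the two generator/discriminator pairs.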
As soon as we got some results we were excited to start playing with another model.
Neural style transfer
Style transfer is a very popular machine learning technique used to transfer the characteristics of one image onto another. In the example below you can see how the style is transferred while keeping the shape and composition of the ground-truth image.
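Under the hood, classic neural style transfer compares "style" through Gram matrices of convolutional feature maps: the correlations between channels capture texture and color statistics while discarding layout. A minimal sketch, following TensorFlow's neural style transfer tutorial:

```python
import tensorflow as tf

def gram_matrix(features):
    # features: activations of shape (batch, height, width, channels).
    # The Gram matrix correlates every channel with every other channel,
    # which is what "style" means in neural style transfer.
    result = tf.linalg.einsum("bijc,bijd->bcd", features, features)
    num_locations = tf.cast(
        tf.shape(features)[1] * tf.shape(features)[2], tf.float32)
    return result / num_locations
```

The style loss is then the mean squared difference between the Gram matrices of the style image and the generated image, computed at several layers of a pre-trained network such as VGG.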
For our experiment we started with exactly the same images as before.
As you can see, the final result was again a little different: it keeps the colors, but the distribution is a lot different from the other models.
After that we started playing with the last model: Deep Dream.
Deep Dream is an experimental model described by Alexander Mordvintsev and it visualizes the patterns learned by a neural network creating a kind of crazy dream pattern applied to the original image. The process was dubbed “Inceptionism” (a reference to InceptionNet, and the movie Inception).
The Deep Dream model is built with the help of a pre-trained image classification model. That’s why all the images are filled with the same patterns (theoretically), but for the purpose of our experiment the pre-trained model was more than enough to test with.
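The core of DeepDream is gradient ascent on the input image itself: instead of updating the network's weights, each step nudges the pixels so the chosen layer's activations grow. A minimal sketch of one such step (the function name and step size are ours):

```python
import tensorflow as tf

def deepdream_step(model, image, step_size=0.01):
    # Gradient *ascent* on the pixels: make the chosen layer's
    # activations larger, amplifying whatever the network "sees".
    with tf.GradientTape() as tape:
        tape.watch(image)  # image is a plain tensor, not a variable
        activations = model(image)
        loss = tf.reduce_mean(activations)
    gradients = tape.gradient(loss, image)
    # Normalize so the step size behaves consistently across images.
    gradients /= tf.math.reduce_std(gradients) + 1e-8
    return image + gradients * step_size
```

Running many such steps, typically at several image scales (octaves), produces the characteristic dream patterns.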
After playing with all the models and producing different textures, we started to think about what would be the best way to visualize our results and apply them to 3D models.
Time to go to Unreal
Unreal is a real-time game engine used mostly for games, but it’s getting a lot of new updates and features that help other areas, like architecture, product visualization, and even VFX, benefit from real-time rendering.
The final Unreal project that we used for this prototype uses RTX, so you need an RTX-capable GeForce GPU (10xx or 20xx series) to see it at full quality.
We created a simple 3D scene in Unreal with the shoe in the center and a camera rotating around it. The idea was to center the shoe, and every 360° rotation would switch the material to a new version of the texture.
After we had the scene set up we started importing all the generated textures along with the originals, a normal map to create the bumps, and an ARM texture with the AO, Roughness, and Metallic maps combined.
We created a material for each 360° turn; we chose to make 11 textures plus the 3 originals.
Each material has a different configuration and blends between some of the textures we’ve created.
After setting up the blending options we simply used the “save to texture” feature in Unreal to generate the new blended textures.
This process avoids changing the material in real time; instead we just replace the texture. In the end we had 11 new textures to use in the blueprint, swapped in after each rotation of the shoe.
To make the effect a little cooler, we added a post-processing material to create a warp effect.
And combined with the material blend transition.
With all the setup we created a sequencer to animate the camera and capture the final video.
The process in Unreal was very manual, but for our experimental purposes it was more than enough. Fabio likes to play with Unreal, which made it really fun for him, but for future exploration we could consider connecting TensorFlow to Unreal to make it more real-time.
For now we just wanted to output the textures in a cool way, not play with real-time ML creation.
As a final result we got a video with all the textures applied and an animation blending between them all.
And we got some cool slices of the process as well. Enjoy!
In the end, playing with textures and Tensorflow was really fun and could bring some cool results.
The GAN deep learning process is very interesting for visuals and opens up a huge world of possibilities for creative work and graphics. The results were a mix of all the GAN processes combined in Unreal; it’s a good start for generative visual creation or adaptation.
For this prototype we used only images as inputs, but the input could also be music, sounds, words, or anything else we can convert to numbers, and the output textures can be applied to any 3D model or product, not only sneakers.
Got an idea for an AI-generated product? We’d love to hear it, so let us know in the comments!