DCGAN for kicks — auto-generated basketball shoe styles via machine learning

Jason Salas
3 min read · Jul 25, 2019


After 10,000 training iterations with fairly simple models, the images are starting to capture custom shoe designs. This is all synthetic, learned by the system.

A lot of literature exists about the mechanics and mathematics behind deep convolutional generative adversarial networks (DCGANs), and how they’re the neatest thing to happen to artificial intelligence since sliced bread. Basically, such a system pits two dueling machine learning models against each other to produce images: a generator that invents images and a discriminator that judges whether they look real. The models learn from each other, each getting better at the expense of the other. It’s some real yin-yang stuff.

Both networks continue this co-dependent dance until the generator creates images plausible enough that the discriminator can’t tell which ones are authentic and which are synthetic.
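To make that concrete, here’s a minimal sketch of one adversarial training step in Keras. It assumes a generator, a discriminator, and a combined model (the generator stacked on a frozen discriminator) have already been built and compiled with binary cross-entropy and no extra metrics; the latent dimension and the 0/1 labels are the usual DCGAN conventions, not anything lifted verbatim from Connor’s code or mine.

```python
import numpy as np

def adversarial_step(generator, discriminator, combined, real_images, latent_dim=100):
    """One GAN step: the discriminator learns to separate real from fake,
    then the generator learns to fool the (frozen) discriminator."""
    batch_size = real_images.shape[0]

    # 1) Discriminator: real shoes labeled 1, generated shoes labeled 0
    noise = np.random.normal(0, 1, (batch_size, latent_dim))
    fake_images = generator.predict(noise, verbose=0)
    d_loss_real = discriminator.train_on_batch(real_images, np.ones((batch_size, 1)))
    d_loss_fake = discriminator.train_on_batch(fake_images, np.zeros((batch_size, 1)))

    # 2) Generator: trained through the combined model with "real" labels,
    #    so its gradients push it toward images the discriminator accepts
    noise = np.random.normal(0, 1, (batch_size, latent_dim))
    g_loss = combined.train_on_batch(noise, np.ones((batch_size, 1)))

    return 0.5 * (d_loss_real + d_loss_fake), g_loss
```

That’s really the whole yin-yang: two train_on_batch calls with opposite labels, repeated thousands of times.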

What is real?

CHECK OUT MY REPO FOR THE CODE

I coded up an experiment in Keras for producing completely artificial images of basketball shoes (or “kicks”, if I can try to be cool) with a DCGAN, based on the wonderful repo and blog post by Dr. Connor Shorten (GitHub: @CShorten). Rather than use the stock datasets like MNIST and CIFAR-10, Connor uses a custom dataset of shoes, which I sneakily found hiding on his profile: training and testing sets of 70 images each, covering ADIDAS and Nike shoes.

Photo by Barrett Ward on Unsplash

His project uses a number of shoe images based on the same general style, with varying colors, and trains a DCGAN on them to produce custom designs. It’s a really neat concept, and a much more practical example of grabbing real-world data that needs to be preprocessed and massaged. The source images were also larger and in color, as opposed to MNIST’s grayscale (60000, 28, 28, 1) shape. Even so, the conversion wasn’t at all hard.

So, as a clever twist to get the bespoke dataset into the format the DCGAN expects (a NumPy shape of (140, 45, 45, 3)), the images needed to be decomposed into their raw pixel values and then persisted as an array in NumPy’s uncompressed NPZ format. The 140 images Connor uses, with the training and testing sets combined, come out to just 1.6MB on disk.
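The routine I wrote is roughly the sketch below, not the exact code in the repo; the folder layout, file extension, and the “images” key are placeholders from my setup.

```python
import glob
import numpy as np
from PIL import Image

def images_to_npz(image_dir, out_path="sneakers.npz", size=(45, 45)):
    """Decompose a folder of shoe photos into raw pixel arrays and persist
    them as a single uncompressed .npz file."""
    arrays = []
    for path in sorted(glob.glob(f"{image_dir}/*.jpg")):
        img = Image.open(path).convert("RGB").resize(size)  # force 3-channel RGB at 45x45
        arrays.append(np.asarray(img, dtype=np.uint8))      # raw pixel values, 0-255
    data = np.stack(arrays)             # -> (num_images, 45, 45, 3)
    np.savez(out_path, images=data)     # uncompressed NPZ archive on disk
    return data.shape

# For Connor's combined sets this should report (140, 45, 45, 3)
```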

I moved the demo up to Google Colab to make use of cloud GPUs and run many more training epochs, hopefully getting both the DCGAN’s generator and discriminator models to converge (or close to it). I used Google Drive File Stream to access the .npz file on disk in my share space.
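Pulling the file back in from Colab looks something like this; the Drive path and the “images” key inside the archive come from my storage sketch above, and scaling pixels to [-1, 1] is the usual prep for a tanh-output generator:

```python
import numpy as np
from google.colab import drive

drive.mount('/content/drive')   # prompts for authorization in the notebook

# The path is a placeholder for wherever the .npz sits in your Drive share space
data = np.load('/content/drive/My Drive/sneakers.npz')
X_train = data['images'].astype('float32')
X_train = (X_train - 127.5) / 127.5   # scale 0-255 pixels to [-1, 1]
print(X_train.shape)                   # expect (140, 45, 45, 3)
```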

After training the DCGAN for 10,000 iterations with fairly simple generator and discriminator models, the following progression emerges:

It’s still got a ways to go as far as eliminating noise from the images, so I’ll be tuning hyperparameters and tweaking the models until the generator and discriminator losses settle somewhere near convergence. With deeper models, better GPUs, and more data, you could get very convincing results without obnoxiously long training cycles.
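For reference, the knobs I’m turning mostly live in a driver loop that repeats the adversarial step sketched earlier; the batch size, iteration count, and logging cadence below are illustrative values, not tuned settings:

```python
import numpy as np

ITERATIONS = 10_000
BATCH_SIZE = 32   # worth tweaking when you only have 140 real images

for iteration in range(ITERATIONS):
    # Sample a random mini-batch of real shoes from the (140, 45, 45, 3) array
    idx = np.random.randint(0, X_train.shape[0], BATCH_SIZE)
    d_loss, g_loss = adversarial_step(generator, discriminator, combined, X_train[idx])

    if iteration % 1000 == 0:
        print(f"iter {iteration}: d_loss={d_loss:.4f}, g_loss={g_loss:.4f}")
```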

And you could use the storage routine I wrote to create a dataset of other types of objects of varying sizes: puppies, roses, cars, t-shirts, bumper stickers, movie posters, etc.

Enjoy and have fun!

CHECK OUT MY REPO FOR THE CODE
