Lego Minifigure Gender Classification Using Deep Learning
With CNNs and transfer learning
While working through the convolutional neural network (CNN) section of Udacity’s deep learning nanodegree, I decided to start a project of my own to see whether a CNN could classify Lego minifigures by gender.
I chose this because I’m a Lego fan and have been collecting minifigures for many years. I think I now have over 200 of the little guys, mostly obtained from blind bags.
Oh, and I also take photos of them, which I share on Instagram!
Why use transfer learning?
Transfer learning means taking a neural network that has already been trained on one dataset and reusing it for a different dataset.
Since I have a small dataset, I wanted to use a network pre-trained on ImageNet. ImageNet contains many pictures of people and clothing, so the pre-trained features should make it easier to pick out the features of the minifigures. Given how closely the minifigures resemble human features and clothing, I would categorize my dataset as similar to what is present in ImageNet.
According to Udacity, if the new dataset is small and similar to the original training data, you have to change the neural network as follows:
- slice off the end of the neural network
- add a new fully connected layer that matches the number of classes in the new data set
- randomize the weights of the new fully connected layer
- freeze all the weights from the pre-trained network (to avoid over-fitting)
- train the network to update the weights of the new fully connected layer
You will see how I have done this later in the code.
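As a rough illustration of the steps above, here is a minimal sketch on a tiny stand-in network rather than a real pre-trained model. The name `tiny_net` and the layer sizes are hypothetical, chosen only so the example runs in a moment without downloading any weights.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pre-trained network: a small feature extractor
# followed by a final layer sized for the original task (10 classes here).
tiny_net = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 10),
)

# Freeze every existing weight so the "pre-trained" part stays fixed.
for param in tiny_net.parameters():
    param.requires_grad = False

# Slice off the old final layer and add a new one for 2 classes.
# A freshly constructed nn.Linear has randomly initialized weights
# and requires_grad=True by default.
tiny_net[2] = nn.Linear(16, 2)

# Only the new layer's weight and bias should be trainable now.
trainable = [name for name, p in tiny_net.named_parameters() if p.requires_grad]
print(trainable)

# The network now produces one score per new class.
scores = tiny_net(torch.randn(4, 8))
print(scores.shape)
```

Training would then pass only the trainable parameters to the optimizer, so the frozen layers keep the features they already learned.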
Gathering the data
I took photos of my own minifigures to form the dataset. As I only had a limited number of them (over 200), I photographed each minifigure from several perspectives to get more images for the dataset.
To avoid exposing the CNN to items that don’t clearly distinguish male from female, I took off any accessories the minifigures were wearing, such as these:
I also left out the minifigures that weren’t human, like these guys:
In the end, my dataset looked something like this:
The code
I first loaded in the libraries:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt
import numpy as np
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms, models
I then checked whether CUDA was available so that I could use Google Colab’s GPU.
train_on_gpu = torch.cuda.is_available()

if not train_on_gpu:
    print('CUDA is not available. Training on CPU ...')
else:
    print('CUDA is available! Training on GPU ...')
Loading and transforming the dataset
I saved my finished dataset as a zip file in Dropbox, with the contents split into Train and Test folders, each containing a Boy and a Girl folder.
I then generated a download link so that I could use !wget and !unzip to load the images into Google Colab.
As the images I took were large, I needed to resize them to match the input that the pre-trained model expects (in my case, VGG16). I also used PyTorch’s ImageFolder class to load the data from the train and test folders I created in the zip file.
data_dir = 'Lego (compressed pics)'

# VGG-16 Takes 224x224 images as input, so we resize all of them
data_transform = transforms.Compose([transforms.Resize((224, 224)),
                                     transforms.ToTensor()])

train_data = datasets.ImageFolder(data_dir + '/Train',
                                  transform=data_transform)
test_data = datasets.ImageFolder(data_dir + '/Test',
                                 transform=data_transform)
Here you can see how I allocated the photos into a train and test set.
Create DataLoaders for the train and test datasets:
# how many samples per batch to load
batch_size = 20

# number of subprocesses to use for data loading
num_workers = 0

train_loader = torch.utils.data.DataLoader(train_data,
    batch_size=batch_size, num_workers=num_workers, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_data,
    batch_size=batch_size, num_workers=num_workers, shuffle=True)

# specify the image classes
classes = ['Boy', 'Girl']
Now let’s visualize a batch of training data.
# obtain one batch of training images
dataiter = iter(train_loader)
images, labels = next(dataiter)  # next(dataiter) works across PyTorch versions
images = images.numpy()  # convert images to numpy for display

# plot the images in the batch, along with the corresponding labels
fig = plt.figure(figsize=(25, 4))
for idx in np.arange(20):
    # integer division: add_subplot needs whole numbers for the grid shape
    ax = fig.add_subplot(2, 20 // 2, idx + 1, xticks=[], yticks=[])
    plt.imshow(np.transpose(images[idx], (1, 2, 0)))
    ax.set_title(classes[labels[idx]])
Define the model
This will be accomplished by loading in a pre-trained VGG16 model.
# Load the pretrained model from PyTorch
# (recent torchvision releases prefer weights=models.VGG16_Weights.DEFAULT
# over pretrained=True)
vgg16 = models.vgg16(pretrained=True)

# Freeze training for all feature layers so the model doesn't change
# the parameters it was pre-trained on
for param in vgg16.features.parameters():
    param.requires_grad = False
As described earlier, the pre-trained model’s classifier doesn’t fit what we are trying to achieve because its last layer outputs 1000 class scores, and we only want 2 (since we only have the two classes Boy and Girl).
So we need to remove the last layer and replace it with a linear classifier of our own.
n_inputs = vgg16.classifier[6].in_features
last_layer = nn.Linear(n_inputs, len(classes))
vgg16.classifier[6] = last_layer

if train_on_gpu:
    vgg16.cuda()
Now the classifier is what we want!
Specify the loss function and optimizer.
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(vgg16.classifier.parameters(), lr=0.001)
Training the network
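A training loop consistent with the loss function and optimizer above could be sketched as follows. To keep it runnable without the dataset or the VGG16 download, it uses a tiny stand-in model and random tensors. The names `stand_in_model` and `fake_loader` and the value `n_epochs = 2` are hypothetical and are not the settings behind the results in this post.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-ins for vgg16 and train_loader so the sketch runs anywhere.
stand_in_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))
fake_images = torch.randn(40, 3, 8, 8)
fake_labels = torch.randint(0, 2, (40,))
fake_loader = DataLoader(TensorDataset(fake_images, fake_labels),
                         batch_size=20, shuffle=True)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(stand_in_model.parameters(), lr=0.001)

n_epochs = 2  # assumed value, for illustration only
epoch_losses = []

for epoch in range(1, n_epochs + 1):
    stand_in_model.train()
    running_loss = 0.0
    for images, labels in fake_loader:
        # With a GPU, the batch would first be moved with .cuda().
        optimizer.zero_grad()              # clear gradients from the last step
        output = stand_in_model(images)    # forward pass: one score per class
        loss = criterion(output, labels)   # compare scores with the true labels
        loss.backward()                    # backward pass
        optimizer.step()                   # update the trainable weights
        running_loss += loss.item() * images.size(0)
    epoch_losses.append(running_loss / len(fake_loader.dataset))
    print(f'Epoch {epoch}: training loss {epoch_losses[-1]:.4f}')
```

With the real data, `stand_in_model` and `fake_loader` would be replaced by `vgg16` and `train_loader`.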
Testing
Below, you can see the accuracy for each gender class when testing the trained model on previously unseen data. In other words, we predict the gender of each minifigure using our trained model and compare it to the actual gender (the target).
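The per-class accuracy described above can be computed along these lines. This is a sketch with made-up scores and targets, not the test results from this project; `per_class_accuracy`, `fake_scores`, and `fake_targets` are hypothetical names introduced for the illustration.

```python
import torch

classes = ['Boy', 'Girl']

def per_class_accuracy(predictions, targets, n_classes):
    """Return (correct, total) counts for each class index."""
    correct = [0] * n_classes
    total = [0] * n_classes
    for pred, target in zip(predictions.tolist(), targets.tolist()):
        total[target] += 1
        if pred == target:
            correct[target] += 1
    return correct, total

# Made-up model outputs for six images: one row of class scores per image.
fake_scores = torch.tensor([[2.0, 0.1], [0.3, 1.5], [1.2, 0.4],
                            [0.2, 0.9], [0.8, 0.7], [0.1, 2.2]])
fake_targets = torch.tensor([0, 1, 0, 0, 0, 1])

# The predicted class is the index of the highest score in each row.
_, fake_preds = torch.max(fake_scores, 1)

correct, total = per_class_accuracy(fake_preds, fake_targets, len(classes))
for i, name in enumerate(classes):
    print(f'Test accuracy of {name}: {100 * correct[i] / total[i]:.0f}% '
          f'({correct[i]}/{total[i]})')
```

On the real test set, the scores would come from running the trained model over `test_loader` in evaluation mode, accumulating the counts batch by batch.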
Visualize test results
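One way to visualize a batch of test results is to title each image with the predicted class followed by the actual class, colored green when they agree and red when they don't. The sketch below does this on random made-up images; `plot_predictions` and the `fake_*` variables are hypothetical and stand in for a real test batch and the model's predictions.

```python
import matplotlib.pyplot as plt
import numpy as np

classes = ['Boy', 'Girl']

def plot_predictions(images, preds, labels, n_cols=10):
    """Show each image titled 'predicted (actual)', green if correct, red if not."""
    n_rows = int(np.ceil(len(images) / n_cols))
    fig = plt.figure(figsize=(25, 4))
    for idx in range(len(images)):
        ax = fig.add_subplot(n_rows, n_cols, idx + 1, xticks=[], yticks=[])
        # Images arrive as (channels, height, width); imshow wants (H, W, C).
        ax.imshow(np.transpose(images[idx], (1, 2, 0)))
        color = 'green' if preds[idx] == labels[idx] else 'red'
        ax.set_title(f'{classes[preds[idx]]} ({classes[labels[idx]]})',
                     color=color)
    return fig

# Made-up data: four random "images" with hypothetical predictions and labels.
rng = np.random.default_rng(0)
fake_images = rng.random((4, 3, 16, 16))
fake_preds = [0, 1, 1, 0]
fake_labels = [0, 1, 0, 0]
fig = plot_predictions(fake_images, fake_preds, fake_labels)
```

For a real batch, `images` would be the test images converted with `.numpy()`, and `preds` and `labels` the predicted and true class indices as plain integers.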
Conclusion
The results seem pretty good! Out of the twenty sample images tested, only one was classified as the wrong gender. Keep in mind that I only used a small dataset, so the results would likely change with a larger one.
If you are interested in reading my code in more detail, please visit my GitHub below.
This was a fun project where I was able to apply what I learned from Udacity’s deep learning nanodegree. I hope to write about more of my projects as I progress through the course, so stay tuned!
If you have any questions or comments, feel free to leave your feedback below. You can also follow me on LinkedIn or connect with me here.