Implementing an Image Classifier with PyTorch: Part 3
We conclude our 3-part series exploring a PyTorch project from Udacity’s AI Programming with Python Nanodegree program.
In the first article of this series, we discussed why and how to use a pre-trained network. In the second, we explored the training process. In this third and final piece, we will learn how to use our neural network to predict the flower type in new images.
Predicting the flower type
Once our classifier is trained, we want to avoid retraining it every time we need to analyze an image. So we save the classifier to disk and load it again whenever we need it.
Using the torch.save method we can easily store a dictionary with those values that we need to persist. Similarly, we can use torch.load to retrieve the dictionary when we need to use our classifier.
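A minimal sketch of this save/load cycle might look like the following. The checkpoint keys (`arch`, `class_to_idx`, `state_dict`) and the `build_model` callback are hypothetical names for illustration; adapt them to your own project:

```python
import torch
from torch import nn


def save_checkpoint(model, path, arch, class_to_idx):
    # Persist everything needed to rebuild the classifier later:
    # the architecture name, the class-to-index mapping, and the weights.
    torch.save({
        "arch": arch,
        "class_to_idx": class_to_idx,
        "state_dict": model.state_dict(),
    }, path)


def load_checkpoint(path, build_model):
    # build_model maps the stored architecture name to a fresh model instance.
    checkpoint = torch.load(path)
    model = build_model(checkpoint["arch"])
    model.load_state_dict(checkpoint["state_dict"])
    model.class_to_idx = checkpoint["class_to_idx"]
    return model
```

With this in place, restoring the classifier is a single `load_checkpoint` call instead of a full training run.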
Pro tip: save the architecture type in the checkpoint. That way you will know which initial model architecture to load. You can also easily find the input size of your classifier, depending on the initial architecture chosen:
- ResNet, Inception: `input_size = model.fc.in_features`
- VGG: `input_size = model.classifier[0].in_features`
- DenseNet: `input_size = model.classifier.in_features`
- SqueezeNet: `input_size = model.classifier[1].in_channels`
- AlexNet: `input_size = model.classifier[1].in_features`
To predict the flower type, we need to load the image first. For that, we can use PIL. After resizing and cropping the image to match the required input size of our neural network, 224x224, we will need to convert it to a numpy array.
RGB colors are usually encoded as integers from 0 to 255. As our model expects floats between 0 and 1, we would need to divide by 255. Then we will need to subtract the means and divide by the standard deviation to normalize our data.
As shown in the previous article, in order to normalize our data we will use [0.485, 0.456, 0.406] for the mean and [0.229, 0.224, 0.225] for the standard deviation.
Last but not least, we need to address the fact that our numpy array has the color in its third dimension, while PyTorch expects it in the first dimension. To make our data match the expected input format, we need to reorder its dimensions using transpose.
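The pre-processing steps above can be sketched as a single helper. The function name `process_image` is an assumption for illustration; the mean, standard deviation, and sizes are the ones discussed above:

```python
import numpy as np
from PIL import Image


def process_image(path):
    """Load an image and convert it to the array format the network expects."""
    image = Image.open(path).convert("RGB")

    # Resize so the shorter side is 256 pixels, keeping the aspect ratio.
    width, height = image.size
    if width < height:
        image = image.resize((256, int(256 * height / width)))
    else:
        image = image.resize((int(256 * width / height), 256))

    # Center-crop to 224x224.
    width, height = image.size
    left = int((width - 224) / 2)
    top = int((height - 224) / 2)
    image = image.crop((left, top, left + 224, top + 224))

    # Scale to [0, 1], then normalize with the ImageNet mean and std.
    np_image = np.array(image) / 255.0
    np_image = (np_image - np.array([0.485, 0.456, 0.406])) \
        / np.array([0.229, 0.224, 0.225])

    # PIL gives (H, W, C); PyTorch expects (C, H, W).
    return np_image.transpose((2, 0, 1))
```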
Finally, we can create the tensors for that numpy array using torch.from_numpy, and pass them to our model. Passing the result to torch.topk will give us the top classes with their corresponding probabilities.
Note that the probabilities of all classes should sum to 1 when using Softmax as the final activation function. If we use LogSoftmax instead, we get the logarithm of the probabilities, so we need to exponentiate the output to recover them.
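Putting the last two steps together, a prediction helper could look like this. This sketch assumes a model whose final layer is LogSoftmax, as used earlier in the series; `predict` is a hypothetical name:

```python
import torch


def predict(np_image, model, topk=5):
    """Return the top-k probabilities and class indices for one image."""
    # Build a float tensor and add a batch dimension: (C, H, W) -> (1, C, H, W).
    tensor = torch.from_numpy(np_image).float().unsqueeze(0)

    model.eval()
    with torch.no_grad():
        output = model(tensor)

    # The final layer is LogSoftmax, so exponentiate to recover probabilities.
    probs = torch.exp(output)
    top_probs, top_indices = probs.topk(topk, dim=1)
    return top_probs.squeeze().tolist(), top_indices.squeeze().tolist()
```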
Playing a bit with the result arrays, and thanks to matplotlib, we can represent the prediction like this:
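One way to build such a chart, assuming matplotlib is installed, is a horizontal bar plot of the top classes; `plot_prediction` is a hypothetical helper name:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also works headless
import matplotlib.pyplot as plt


def plot_prediction(probs, class_names, out_path="prediction.png"):
    """Draw a horizontal bar chart of the top predicted classes."""
    fig, ax = plt.subplots()
    # Reverse both lists so the most likely class appears at the top.
    ax.barh(class_names[::-1], probs[::-1])
    ax.set_xlabel("Probability")
    fig.tight_layout()
    fig.savefig(out_path)
```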
Conclusion
In this final article, we have learned how to save and load our classifier. We have also discussed how to predict the flower type in new images, with the primary takeaway being that we need to pre-process the images to match the format that our neural network expects.
Once done, we can create the tensors and find the top classes using the topk method, and plot the results for easier visualization.
Thank you for reading through this series. I hope that you found it useful and that it helps you in your future projects!