AISaturdayLagos: The Torch Panther

Tejumade Afonja
Published in AI Saturdays · 8 min read · Mar 11, 2018

Before I get to the fun stuff about how #teamPyTorch built a classifier to recognize two of Black Panther’s characters, let me run you through some administrative details 😄.

We’re officially nine weeks old — 🔥 🔥 🔥

Part of our goal for AI Saturdays Lagos is to build a CNN architecture using five different frameworks. Thus, we grouped ourselves into five teams: PyTorch, TensorFlow, Keras, Neon and Theano.

While the main project is still ongoing for each group, we thought it’d only be fair for everyone to have a general understanding of how each framework can be used in practice.

This week, we concluded part 1 of the fast.ai Deep Learning for Coders course, which was very practical (building ResNet from scratch). Then we had a short presentation from #teamPyTorch. Later on, we listened to the CS231n lecture on deep learning software.

Below is an overview of #teamPyTorch’s experiments with the PyTorch framework, written by George Igwegbe.

You can also read it here.

This section is written by George Igwegbe, the team lead of #teamPyTorch

How this article is Structured

  • What is PyTorch?
  • Convolutional Neural Networks (LeNet and ResNet) with CIFAR-10
  • Transfer learning with our custom Black Panther dataset
  • Results
  • Challenges

What is PyTorch?

PyTorch is an open-source deep learning framework for Python. It is based on an earlier deep learning library named Torch which, surprisingly, was not written in Python but in Lua, via the LuaJIT scripting engine.

Major users of PyTorch.

PyTorch was initially released in October 2016, followed by a stable release (version 0.3.1) on 14th February 2018. It was primarily developed by Facebook’s artificial-intelligence research group. The framework’s popularity grew rapidly after its stable release because it uses Python as its scripting language. It is also built with the concept of a dynamic computational graph in mind.

Dynamic computational graph.

PyTorch puts Python first, which makes it popular with industry practitioners and academic researchers alike for its simplicity and rapid prototyping. The fastai library and Uber’s “Pyro” software for probabilistic programming are also built upon it.

Convolutional Networks (LeNet-5 and ResNet) with CIFAR-10

Our team was made up of beginners when we tried this experiment out of curiosity. We implemented Convolutional Neural Networks (CNNs), deep learning architectures inspired by the brain’s biological processes. So everything we did was by the book, with a little ingenuity.

We decided to compare an early convolutional network, LeNet-5, with a modern one, ResNet, on the CIFAR-10 dataset. We started with the most basic network model, LeNet-5.

Quick note: the CIFAR-10 dataset consists of 60,000 colour images (32 × 32 pixels) across 10 classes. The dataset is divided into 50,000 training images and 10,000 test images.

LeNet-5

The LeNet-5 model was developed in 1998 by Yann LeCun for handwritten digit recognition, and was used by banks to read cheques. It is a 7-level convolutional neural network that stacks one or more convolutional layers before the fully connected linear layers where the outputs are collated and compared. In our experiment, we made use of 2 convolutional layers, 2 max-pooling layers and 3 fully connected linear layers.

LeNet

The design of our Convolutional Neural Network (CNN) based on LeNet follows the approach below:

  • Import the necessary tools and functions
  • Create a neural network Module class
  • Initialize it by defining the necessary layers
  • Define your forward pass

Our LeNet Model
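The steps above can be sketched as a LeNet-style module; this is a minimal reconstruction assuming the standard LeNet-5 filter counts (6 and 16) adapted to CIFAR-10’s 3-channel 32 × 32 inputs, not the team’s exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # 2 convolutional layers, 2 max-pooling layers, 3 linear layers
        self.conv1 = nn.Conv2d(3, 6, 5)     # CIFAR-10 images have 3 channels
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # 32x32 -> 28x28 -> 14x14
        x = self.pool(F.relu(self.conv2(x)))   # 14x14 -> 10x10 -> 5x5
        x = x.view(x.size(0), -1)              # flatten to 16*5*5 = 400
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)

net = LeNet()
out = net(torch.randn(4, 3, 32, 32))
print(out.shape)  # torch.Size([4, 10])
```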

PyTorch has built-in packages that assist in designing a desired network model. Notable ones include torch.autograd, torch.nn and torch.nn.functional. They help in situations like computing gradients, performing gradient descent, declaring neural networks and selecting activation functions.
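As a tiny illustration of torch.autograd at work:

```python
import torch

# torch.autograd tracks every operation on tensors created with
# requires_grad=True, so gradients come for free
x = torch.tensor([2.0], requires_grad=True)
y = (x ** 2 + 3 * x).sum()   # y = x^2 + 3x
y.backward()                 # computes dy/dx = 2x + 3
print(x.grad)                # tensor([7.])
```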

Criterion and Optimizers

After designing our neural network, we declared our criterion and optimizer. These are vital in computing our gradients, performing backpropagation and updating our weights, since the LeNet model is trained with this flow.
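A sketch of that declaration, with a toy linear model standing in for LeNet (the learning rate and momentum are illustrative values, not necessarily the ones we used):

```python
import torch.nn as nn
import torch.optim as optim

# a toy model stands in for the LeNet module defined earlier
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

criterion = nn.CrossEntropyLoss()  # standard loss for multi-class classification
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
```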

We trained the network for 2 epochs and measured its performance over the dataset. This was carried out on a local system (CPU), with the class “car” having the highest accuracy. The loss fell rapidly in the first epoch, but the fall slowed in the second.
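The training flow can be sketched as follows, with a synthetic batch standing in for the CIFAR-10 loader (in the real run, the inner steps would loop over the DataLoader’s batches):

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# a synthetic batch stands in for a CIFAR-10 DataLoader
inputs = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

losses = []
for epoch in range(2):
    optimizer.zero_grad()              # clear gradients from the last step
    outputs = model(inputs)            # forward pass
    loss = criterion(outputs, labels)  # compute the loss
    loss.backward()                    # backpropagate
    optimizer.step()                   # update the weights
    losses.append(loss.item())
```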

We concluded that the collated results were good, but we felt the accuracy could go higher and the loss lower.

One way this could be achieved was using a deeper neural network than LeNet-5. So we decided to apply the ResNet (Residual Network) model on our dataset.

Deeper with ResNet

ResNet-50 (Residual Network)

We decided to go with the ResNet-50 model not because we thought it was the best, but because it was a model we understood, and this allowed us to fine-tune it on our dataset.

Residual Networks

After fine-tuning layers of the skeletal ResNet-50 model, the results were better than those obtained from the LeNet-5 model. The class “car” had a higher accuracy of 83%, while the lowest loss value dropped by 0.161 over 2 epochs compared to the previous result.

Our ResNet-50 Model

From the experiments above, we discovered that the deeper the network, the better the accuracy, although it is not obvious exactly how deep a network should be. In general, the ResNet model provides better accuracy and lower loss values than the LeNet-5 model.

Therefore, we proceeded to try the same experiment with our custom dataset. We also discovered that the results could be improved by increasing the number of epochs, though this does not help in every case.

Transfer Learning with Black Panther

Our team decided to be a bit adventurous by embarking on our own Wakanda 🙅 conquest. We created a custom image dataset (120 images) of two main characters in the movie, “Okoye” and “Shuri”, and implemented transfer learning on it. But first we had to ask ourselves: what is transfer learning?

What is Transfer Learning?

Transfer learning involves applying knowledge from a tested and trusted source to a new problem or situation. In practice, models are rarely trained from scratch, because doing so is computation-intensive, power-hungry and time-consuming. So pre-trained models were well suited to this experiment.

Why read, when you can download the knowledge?

In this experiment, we will be comparing two transfer learning techniques: Fine-tuning and Feature-Extraction.

Fine-tuning

Here, we initialized our network with a pre-trained ResNet-18, which had been trained on the ImageNet dataset. We are essentially transferring knowledge, such as edge detection or colour-blob detection, learned from ImageNet into our network. The model was then trained on the custom dataset.

Feature Extraction

In this case, we froze the weights of the entire network except the last fully connected layer. We replaced that final layer with a new fully connected layer with 2 output classes (Shuri and Okoye) and trained only that layer.


Result

The accuracy of the fine-tuned model was about 95%, compared to 90% for feature extraction.

Challenges

One of the challenges we encountered was training the model. We naively trained it on a conventional laptop CPU, which took a lot of time, before switching to Google Compute Engine, which took roughly 25 minutes (a lifesaver).

Great care should be taken when selecting your images. As you may have noticed, we did not use pictures of “Prince T’Challa”, because most images of him overlap with the movie’s antagonist (Erik Killmonger). This was done to avoid multi-labelling, which can be a pain during training. Image format is another consideration: PyTorch works better when all images share the same format.

Annotation of any custom dataset is very important as it will help you organize your images. We learnt this the hard way.

Overcoming all these challenges seemed impossible for novices like us, but by addressing each requirement of the experiment in turn, we got our desired results.

In summary, this project was an eye-opener and a worthwhile exercise.

You can watch the presentation below and view our slide here

We applaud #teamPyTorch for all the hard work they’ve put into their presentation. George Igwegbe, Ezerioha Somtochukwu, Ayodeji Oluwajoba, Chidi, Seun Lawal and Victor O, thank you!

AISaturdayLagos wouldn’t have happened without my fellow ambassador Azeez Oluwafemi, our Partners FB Dev Circle Lagos, Vesper.ng and Intel.

A big Thanks to Nurture.AI for this amazing opportunity.

Also read how AI Saturdays is Bringing the World Together with AI

See you next week 😎.

View our pictures here and follow us on twitter :)
