AI6: The Torch Panther.

George Igwegbe
Facebook Developer Circles Lagos
8 min read · Mar 14, 2018

You remember AI6? It has already been 8 weeks, and we are excited to keep you up to date on our quest to democratize AI. The first few weeks have been great, with lectures from Stanford's CS231n, fast.ai, and reviews of academic papers related to deep learning. We were introduced to the deep learning frameworks we would use to build our projects in the subsequent weeks, and were then divided into groups. Each group builds a project with its assigned framework (PyTorch, TensorFlow, Theano, Neon or Keras).

In this article, we will run through the first project by Team PyTorch.

How this article is Structured

  • What is PyTorch?
  • Convolutional Neural Networks (LeNet and ResNet) with CIFAR-10.
  • Transfer learning with our custom Black Panther dataset.
  • Results.
  • Challenges.

PyTorch?

PyTorch is an open-source deep learning framework for Python. It is based on an earlier deep learning library named Torch which, surprisingly, was not written in Python but in Lua (running on the LuaJIT runtime).

Major users of PyTorch.

PyTorch was initially released in October 2016, followed by a stable release (version 0.3.1) on 14th February 2018. It was primarily developed by Facebook's artificial intelligence research group. The framework's popularity grew rapidly after its stable release because it uses Python as its scripting language and was built around the concept of a dynamic computational graph.

Dynamic computational graph.

PyTorch puts Python FIRST, which makes it attractive to both industry practitioners and academic researchers for its simplicity and rapid prototyping. The fast.ai library and Uber's probabilistic programming software "Pyro" are both built on top of it.

Convolutional Neural Networks (LeNet-5 and ResNet) with CIFAR-10.

Our team members were basically beginners when we tried this experiment out of curiosity. We implemented Convolutional Neural Networks (CNNs), deep learning architectures inspired by the brain's biological visual processing. So everything we did was by the book, with a little ingenuity.

We decided to compare an earlier neural network architecture, LeNet-5, with a modern one, ResNet, on the CIFAR-10 dataset, starting with the more basic of the two: the LeNet-5 model.

Quick Note: the CIFAR-10 dataset consists of 60,000 colour images (32 × 32 pixels) in 10 classes. The dataset is split into 50,000 training images and 10,000 test images.

LeNet-5

The LeNet-5 model was developed in 1998 by Yann LeCun for handwritten digit recognition, and was used by banks to read digits on cheques. It is a 7-layer convolutional neural network that stacks one or more convolutional layers before the fully connected linear layers, where the outputs are collated and compared. In our experiment, we used 2 convolutional layers, 2 max-pooling layers and 3 fully connected linear layers.

LeNet

We designed our convolutional neural network (LeNet-5) by:

  • Importing the necessary tools and functions.
  • Creating a neural network Module class.
  • Initializing it by defining the necessary layers.
  • Defining your forward pass.

PyTorch has special built-in packages to assist in designing a network model. Notable ones include torch.autograd, torch.nn and torch.nn.functional. These help in situations like computing gradients, declaring neural network layers and selecting activation functions, while torch.optim handles gradient descent updates.
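Putting those steps together, a minimal LeNet-5-style Module for CIFAR-10's 3-channel 32 × 32 images might look like this (the layer sizes are the classic ones from the tutorial literature, not necessarily our exact configuration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet5(nn.Module):
    def __init__(self):
        super().__init__()
        # Two convolutional layers with 5x5 kernels.
        self.conv1 = nn.Conv2d(3, 6, 5)    # 3x32x32 -> 6x28x28
        self.conv2 = nn.Conv2d(6, 16, 5)   # 6x14x14 -> 16x10x10
        self.pool = nn.MaxPool2d(2, 2)     # halves the spatial size
        # Three fully connected layers collate features into 10 class scores.
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(x.size(0), -1)          # flatten for the linear layers
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)
```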

After designing our neural network, we declared our criterion (loss function) and optimizer. These are vital for computing the loss, performing backpropagation and updating the weights, since the LeNet model works with this flow.
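A minimal sketch of that training flow on a dummy batch (using a stand-in linear model so the snippet is self-contained; in the experiment the model was our LeNet-5):

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Stand-in model; in the real experiment this was the LeNet-5 network.
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

criterion = nn.CrossEntropyLoss()                                # the "criterion"
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)  # the optimizer

# One training step on a dummy batch of 4 CIFAR-sized images.
inputs = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, 10, (4,))

optimizer.zero_grad()              # clear old gradients
outputs = net(inputs)              # forward pass
loss = criterion(outputs, labels)  # compute the loss
loss.backward()                    # backpropagation computes gradients
optimizer.step()                   # gradient descent updates the weights
```

In a real run this step sits inside a loop over the DataLoader, repeated for each epoch.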

We trained the network for 2 epochs and evaluated its performance on the dataset. This was carried out on a local system (CPU), with the class "car" achieving the highest accuracy. The loss fell rapidly in the first epoch, but its rate of decrease slowed in the second.

We concluded that the results were good, but we felt the accuracy could go higher and the loss lower. One way to achieve this was to use a deeper neural network than LeNet-5, so we decided to apply the ResNet (Residual Network) model to our dataset.

Deeper with ResNet

ResNet (Residual Network)

ResNet-50 is a 50-layer Residual Network developed by Microsoft Research that won the 2015 ImageNet and COCO competitions. As plain network depth increases, accuracy saturates and then rapidly degrades, and training is hampered by vanishing and exploding gradients. ResNet introduces skip connections that carry information from earlier layers directly to later ones, which helps gradients flow and mitigates the vanishing gradient problem.

Residual Networks

We decided to go with the ResNet-50 model not because it was the best, but because it was a model we understood and could fine-tune to suit our dataset, since there are no pre-trained models for CIFAR-10.

After fine-tuning layers of the skeletal ResNet-50 model, the results were better than those from the LeNet-5 model. The class "car" again had the highest accuracy, at 83%, while the lowest loss value dropped by 0.161 over 2 epochs compared to the previous result.

From the experiments above, we found that the deeper the network, the better the accuracy, although it is not obvious how deep a network should be. In general, the ResNet model provides better accuracy and lower loss values than the LeNet-5 model.

We therefore proceeded to try the same experiment with our custom dataset. We also found that results can improve with more epochs, though this does not hold in every case.

Transfer Learning with Black Panther.

Our team decided to be a bit adventurous by embarking on our own Wakandan conquest. We created a custom image dataset (120 images) of two main characters from the movie, "Okoye" and "Shuri", and implemented transfer learning using it. But first we had to ask ourselves: what is transfer learning?

What is Transfer Learning?

Transfer learning involves applying knowledge from a tried-and-tested source to a new problem or situation. In practice, models are rarely trained from scratch, because doing so is computation- and power-intensive and time-consuming. Pre-trained models are therefore well suited to this experiment.

Why read, when you can download the knowledge?

In this experiment, we will be comparing two transfer learning techniques:

Fine-tuning: here, we initialized our network with a pre-trained network (ResNet-18) that had been trained on the ImageNet dataset (1,000 classes). We are essentially transferring knowledge such as edge detection and colour-blob detection learned on ImageNet to our network. The whole model was then trained further on the custom dataset.

Feature Extraction: in this case, we froze the weights of the whole network except the last fully connected layer, which we replaced with a new fully connected layer with 2 output classes (Shuri and Okoye), and trained only that layer.

Result

The fine-tuned model reached an accuracy of about 95%, compared to 90% for feature extraction.

Challenges.

One of the challenges we encountered was training the model. We naively trained it on a conventional laptop CPU, which took a lot of time, before switching to Google Compute Engine, which took roughly 25 minutes (a lifesaver).

Great care should be taken when selecting your images. As you may have noticed, we did not use pictures of "Prince T'Challa", because most images of him also feature the movie's antagonist, Erik Killmonger. This was done to avoid multi-labelling, which can prove to be a pain during training. Image format is another consideration: PyTorch works best when all images share the same format.

Annotating any custom dataset is very important, as it helps you organize your images. We learnt this the hard way.

Overcoming all these challenges seemed impossible for novices like us, but by addressing the requirements of the experiment we got our desired results. In summary, this project was an eye-opener and a worthwhile exercise.

Major contributions from team members Ezerioha Somtochukwu, Ayodeji Oluwajoba, Chidi, Seun Lawal, Michael Adejuwon and Victor O.

Dora Milaje with the sauce.

Thanks to Femi Azeez and Tejumade Afonja for the great work. Special thanks to Univelcity for the wonderful venue, Vesper.ng, Intel and Facebook Developer Circles Lagos for the support.

Resources

  1. PyTorch: Transfer Learning tutorial.
  2. Stanford CS class CS231n: Convolutional Neural Networks for Visual Recognition.
  3. fast.ai framework.
  4. Residual Networks.
