Logo Detection Using PyTorch

I wrote this blog to wrap up my first ever public talk at PyCon Thailand 2018 and to add some more details. Download all materials.

To run the notebook on a free Google Colab GPU, read here

Ad Tech

Source: http://cluep.com

Deep Learning

Single Layer Neural Network


Fully Connected Network


Convolutional Neural Network (CNN)


Max Pooling


Create Network


Loss Functions


Gradient Descent

Source: Python Machine Learning 2nd Edition by Sebastian Raschka, Packt Publishing Ltd. 2017

We can update each weight using the equation below. The learning rate is a hyperparameter that we have to define. If it’s too high, the loss will overshoot and never find the minimum (see upper-right picture).

Update weight: w += -learning_rate * gradient
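As a toy illustration (not from the talk), a few steps of this update rule on f(w) = w², whose minimum is at 0, drive the weight toward the minimum:

```python
# Minimal gradient descent on f(w) = w**2, whose gradient is 2*w.
learning_rate = 0.1

w = 5.0
for step in range(100):
    gradient = 2 * w                 # derivative of w**2
    w += -learning_rate * gradient   # the update rule above

print(round(w, 6))  # w has moved very close to the minimum at 0
```

With a learning rate above 1.0 the same loop would diverge, which is the overshoot case pictured above.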

Network Training Loop


PyTorch

Tensor

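As a quick illustration (the values here are arbitrary), a tensor is an n-dimensional array, similar to a NumPy ndarray, that can live on a GPU and track gradients:

```python
import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
print(x.shape)  # torch.Size([2, 2])

# requires_grad=True makes PyTorch record operations for autograd.
w = torch.ones(2, 2, requires_grad=True)
y = (x * w).sum()
y.backward()    # compute gradients of y with respect to w
print(w.grad)   # equals x, since d(x*w)/dw = x
```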

Project Pipelines

1. Get the Data

  • 32 brands + no-logo
  • Adidas, Aldi, Apple, Becks, BMW, Carlsberg, Chimay, Coca-Cola, Corona, DHL, Erdinger, Esso, Fedex, Ferrari, Ford, Foster’s, Google, Guinness, Heineken, HP, Milka, Nvidia, Paulaner, Pepsi, Ritter Sport, Shell, Singha, Starbucks, Stella Artois, Texaco, Tsingtao and UPS.
  • There are 320 logo images for training, 960 logo images for validation, 3,960 images for testing, and 3,000 no-logo images.

  • Import required libraries.
  • Define directories.
  • Create a load_datasets utility function to download the dataset from FLICLLOGOS_URL and unzip it to SOURCE_DIR.
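A minimal sketch of what such a load_datasets helper might look like, built on the standard library (the real URL and directory constants are defined in the notebook; nothing here is the exact original code):

```python
import os
import urllib.request
import zipfile

def load_datasets(url, source_dir):
    """Download the dataset archive (if not already present) and unzip it into source_dir."""
    os.makedirs(source_dir, exist_ok=True)
    zip_path = os.path.join(source_dir, os.path.basename(url))
    if not os.path.exists(zip_path):
        urllib.request.urlretrieve(url, zip_path)  # download the archive
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(source_dir)                  # unzip into source_dir

# load_datasets(FLICLLOGOS_URL, SOURCE_DIR)  # as called in the notebook
```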

2. Prepare Data for Network

  • Add train_logo_relpaths and half of val_logo_relpaths to train_relpaths.
  • Add the other half of val_logo_relpaths to val_relpaths.
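The split might be sketched like this, using stand-in path lists sized to match the dataset counts above:

```python
# Hypothetical path lists standing in for the ones built from the dataset.
train_logo_relpaths = [f'train/img_{i}.jpg' for i in range(320)]
val_logo_relpaths = [f'val/img_{i}.jpg' for i in range(960)]

half = len(val_logo_relpaths) // 2

# Training set: the original training logos plus half of the validation logos.
train_relpaths = train_logo_relpaths + val_logo_relpaths[:half]

# Validation set: the remaining half of the validation logos.
val_relpaths = val_logo_relpaths[half:]

print(len(train_relpaths), len(val_relpaths))  # 800 480
```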

  • We’re going to use datasets.ImageFolder( ), which expects the directory structure dataset/class/img.jpg.
  • Create a prepare_datasets utility function to copy image files into that structure according to the lists of relative paths.
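A minimal sketch of such a prepare_datasets helper, assuming each relative path ends in class/filename so the class folder can be derived from the path (the real function may differ):

```python
import os
import shutil

def prepare_datasets(source_dir, target_dir, relpaths):
    """Copy images into the dataset/class/img.jpg layout that ImageFolder expects."""
    for relpath in relpaths:
        class_name = os.path.basename(os.path.dirname(relpath))
        class_dir = os.path.join(target_dir, class_name)
        os.makedirs(class_dir, exist_ok=True)
        shutil.copy(os.path.join(source_dir, relpath),
                    os.path.join(class_dir, os.path.basename(relpath)))
```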

  • Import the torch and torchvision libraries.
  • Define data_transforms, which resizes inputs, converts them to tensors, and normalizes them by the mean and standard deviation of the training dataset.
  • Create datasets using torchvision.datasets.ImageFolder with the dataset directories and data_transforms as arguments.
  • Create dataloaders using DataLoader with batch size = 32 on each dataset.

  • Create an imshow utility function to display images.
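One possible imshow, which undoes the normalization before displaying (the mean/std here are assumed to match the transforms; returning the array is my addition for convenience):

```python
import numpy as np
import matplotlib.pyplot as plt
import torch

# Assumed normalization statistics; keep these in sync with data_transforms.
mean = np.array([0.5, 0.5, 0.5])
std = np.array([0.25, 0.25, 0.25])

def imshow(tensor, title=None):
    """Display a normalized CxHxW tensor as an image."""
    img = tensor.numpy().transpose((1, 2, 0))  # CxHxW -> HxWxC
    img = std * img + mean                     # undo Normalize
    img = np.clip(img, 0, 1)
    plt.imshow(img)
    if title is not None:
        plt.title(title)
    return img

out = imshow(torch.zeros(3, 4, 4))
print(out.shape)  # (4, 4, 3)
```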

3. Create Network


  • Create our network by subclassing nn.Module.
  • In the __init__ method, create the conv1 layer using nn.Conv2d with 3 in-channels, 6 out-channels, and a 5 x 5-pixel filter (stride=1 by default), and the conv2 layer with 6 in-channels, 16 out-channels, and the same filter size.
  • Create a pool layer using nn.MaxPool2d with a 2 x 2-pixel filter and stride=2.
  • Create fully connected layers: fc1 using nn.Linear with 16 * 53 * 53 in-features and 120 out-features, fc2 with 120 in-features and 84 out-features, and fc3 with 84 in-features and 33 out-features.
  • Define the forward method for inputs.
  • Forward inputs through the conv1 layer, apply nn.functional.relu, and forward through the pool layer.
  • Forward inputs through the conv2 layer, apply nn.functional.relu, and forward through the pool layer.
  • Flatten x to a two-dimensional tensor of shape [number of instances, 16 * 53 * 53] using the view method.
  • Forward through fc1, apply relu, forward through fc2, apply relu, and forward through fc3 without relu because we will use softmax instead.
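Putting those bullets together, the network might look like this sketch (the layer sizes follow the description above; the 33 outputs are the 32 brands plus no-logo, and the 16 * 53 * 53 flatten size corresponds to 224 × 224 inputs):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)       # 3 in-channels, 6 out, 5x5 filter
        self.conv2 = nn.Conv2d(6, 16, 5)      # 6 in-channels, 16 out, 5x5 filter
        self.pool = nn.MaxPool2d(2, 2)        # 2x2 filter, stride 2
        self.fc1 = nn.Linear(16 * 53 * 53, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 33)          # 32 brands + no-logo

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 53 * 53)          # flatten to [batch, features]
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)                    # no relu; softmax comes with the loss

net = Net()
out = net(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 33])
```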

  • Before instantiating our network, use torch.device to detect the GPU if it’s available; otherwise use the CPU.
  • Then move our network to the device.
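A sketch of the device setup, with a stand-in module in place of our network:

```python
import torch
import torch.nn as nn

# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Stand-in module; in the post this is the Net instance created above.
net = nn.Linear(10, 2).to(device)
print(device)
```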

4. Train the Network

  • Define the criterion, or loss function, using nn.CrossEntropyLoss, which already incorporates the softmax function with cross-entropy loss.
  • Use optim.SGD, or Stochastic Gradient Descent, as the optimizer with lr (learning rate) = 0.001.
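For example (with a stand-in model in place of our CNN):

```python
import torch.nn as nn
import torch.optim as optim

# Stand-in model; in the post `net` is the CNN defined earlier.
net = nn.Linear(10, 33)

criterion = nn.CrossEntropyLoss()                  # softmax + cross-entropy in one
optimizer = optim.SGD(net.parameters(), lr=0.001)  # plain SGD, learning rate 0.001
```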

  • Create a train_val function to train the network on the training dataset and evaluate it on the validation dataset.
  • Set the model to train mode for training and to eval mode for evaluating.
Source: https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html

  • Move inputs and labels to the device.
  • Zero the gradients using optimizer.zero_grad to prevent them from accumulating across iterations.
  • Set torch.set_grad_enabled to True only if it’s training.
  • Compute outputs.
  • Compute the loss.
  • If it’s training, compute gradients using loss.backward and update parameters using optimizer.step.
Source: https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html

  • The function returns the model with best accuracy on validation dataset.
Source: https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
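A condensed sketch of such a train_val loop following the steps above (the epoch count and accuracy bookkeeping are my simplifications of the tutorial code, not the exact original):

```python
import copy
import torch

def train_val(model, criterion, optimizer, dataloaders, device, num_epochs=5):
    best_acc = 0.0
    best_weights = copy.deepcopy(model.state_dict())

    for epoch in range(num_epochs):
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()   # train mode for training
            else:
                model.eval()    # eval mode for evaluating
            running_corrects = 0
            total = 0

            for inputs, labels in dataloaders[phase]:
                inputs, labels = inputs.to(device), labels.to(device)
                optimizer.zero_grad()  # don't accumulate gradients across iterations

                # Only track gradients during training.
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    loss = criterion(outputs, labels)
                    if phase == 'train':
                        loss.backward()    # compute gradients
                        optimizer.step()   # update parameters

                preds = outputs.argmax(dim=1)
                running_corrects += (preds == labels).sum().item()
                total += labels.size(0)

            acc = running_corrects / total
            if phase == 'val' and acc > best_acc:
                best_acc = acc
                best_weights = copy.deepcopy(model.state_dict())

    model.load_state_dict(best_weights)  # keep the best weights on validation
    return model
```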

  • Train and validate the network.
  • It took about 11 minutes on one GPU, with a best accuracy of 61.87%, which is not great at all.

5. Evaluate

  • We get better accuracy on the test set.

There are many things we can do to improve our hand-built network. But there’s one practice in deep learning that is very useful and effective: transfer learning.

Transfer Learning

Transfer learning is a machine learning technique where a model trained on one task is re-purposed on a second related task.

Source: https://machinelearningmastery.com/transfer-learning-for-deep-learning/

Simply speaking, we can run our data through another pretrained network with some tweaks. It saves a lot of time and is very effective.

How?

  1. Match our data to the network input’s format.
  2. Replace the output layer.
  3. Retrain the network.

ResNet18

Source: https://arxiv.org/pdf/1512.03385.pdf

Match the network input’s format


Load Pretrained Network

  • The output layer fc is the layer we’re going to replace.

Replace output layer


Retrain the Network

  • The accuracy on the validation dataset is much better.

Evaluate on Test Dataset


As Fixed Feature Extractor


That’s it. I hope you can see how powerful transfer learning is.

Questions?

Recommended resources

Diving in Deep

The journey from machine learning 101 to AlphaGO deep…
