Dog Not Dog

Roshan Noronha
Algorithms For Life
10 min read · Sep 25, 2019

In 2017, I completed my first machine learning project — a number recognizer.

My goal after completing that project was to take what I had learnt and use it to make an app that only recognized hotdogs. Why? Check out the video below.

I started with the best of intentions but I’m ashamed to admit that I slacked off and ended up giving a TED talk.

Dunno if they chose the worst possible thumbnail on purpose

So overall, 2018 was a very busy year.

Jokes aside, another reason for using hotdogs was that hotdog datasets are much smaller. That makes it a lot harder to create and train a model that can accurately classify "hotdog" vs "not hotdog". As such, the approach used to make this classifier can be applied to other problems where a lot of data isn't available.

Examples of machine learning applied to more niche problems, such as finding novel star systems (left) or identifying tumours (right)

Before I made this hotdog recognition app, I wanted to start with an easier example. As such, I went with a classifier that could only recognize dogs, just to keep things simple. I’m not a machine learning expert so this helped to minimize errors. And believe me, even with this “simpler” approach I still ran into a number of issues.

Speaking of approach, I used a different one for this project. Previously, I had used a recurrent neural network. For this project, a convolutional neural network (CNN) was my network of choice. CNNs are a type of neural network that's really good at recognizing images. What I find very cool about them is that their design was inspired by the mammalian visual cortex. They "see" images much like you or I do!

A recurrent neural network
A convolutional neural network

If you’d like to learn more about the differences, there are a number of great explanations available. I’ll link those below.

Before I start any project I usually break it down into a list of manageable steps. Here’s what the steps for dognotdog were.

  1. Implement a basic CNN from scratch that can classify dogs.
  2. Use transfer learning to increase the accuracy of the dognotdog CNN.
  3. Create a Shiny webapp that allows users to input a picture and uses the trained model to classify the image.

It’s important to note that I’m not a machine learning expert. I basically tried a bunch of things out and figured out what to do as I went along. So definitely take anything I mention here with a grain of salt.

Since I don’t have a super powerful computer and I’m too cheap to pay for AWS, I used Google Colab to train my models. The great thing about Colab is that I was able to train my models using a GPU instead of a CPU, for free!

While there were challenges every step of the way, I won’t be explaining things step by step. I found that even with tutorials, I still ran into issues. So I’ll be focusing more on what went wrong, why it went wrong and the thinking needed to overcome those challenges.

Of course, if you don’t like this approach then tough shit. It’s my project and I can do what I want.

Implementing a Basic CNN

The goal for this step was to train a CNN to recognize dogs. So, the dogs vs cats dataset on Kaggle came in handy.

Since there were two categories in this dataset, dog and cat, my thinking was to use binary cross-entropy during training. As you’ll see in the next section, this decision ended up being problematic.

Compared to the other parts of this project, creating this basic CNN was fairly straightforward. After training, it ended up being ~79% accurate. I've attached the Jupyter notebook I used for this part below.
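To give a sense of what "basic" means here, this is a minimal sketch of that kind of CNN in Keras. The layer sizes, image dimensions and single sigmoid output are illustrative assumptions, not the exact architecture from my notebook.

```python
# A minimal dog-vs-cat CNN sketch. Layer counts and sizes are illustrative;
# the sigmoid output pairs with the binary cross-entropy loss used at this stage.
from tensorflow import keras
from tensorflow.keras import layers

def build_basic_cnn(img_width=150, img_height=150):
    model = keras.Sequential([
        layers.Input(shape=(img_width, img_height, 3)),
        layers.Conv2D(32, (3, 3), activation="relu"),  # low-level features (edges, textures)
        layers.MaxPooling2D(2),
        layers.Conv2D(64, (3, 3), activation="relu"),  # higher-level features
        layers.MaxPooling2D(2),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),         # probability of "dog"
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model
```

Feeding this batches from an image generator (as in the notebook) is all that's left; the interesting part of this project starts when this simple setup stops being enough.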

Using Transfer Learning to Optimize a CNN

As someone who appreciates not having to do unnecessary work but is definitely not lazy, the idea of transfer learning really appealed to me.

Basically, you take a model that has been trained on a different dataset and reuse it for the task that you want. Makes things a lot more efficient.

I started by modifying the code used to make the basic CNN. Keras offers a number of pretrained networks to choose from so after a bit of research I went with VGG16.

The layers of the imported VGG16 model

From there, all I needed to do was to freeze the pretrained layers before adding a dense layer with two output nodes at the end.

The final model with the dense layer as the output
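The whole transfer-learning setup can be sketched in a few lines. Note one assumption for the sake of a self-contained example: `weights=None` is used here so nothing has to be downloaded, whereas the real model loaded `weights="imagenet"`.

```python
# Transfer-learning sketch: load VGG16 without its top layers, freeze the
# pretrained base, then add a two-node softmax classifier on the end.
# weights=None keeps this example download-free; the real run used "imagenet".
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

def build_transfer_model(img_width=150, img_height=150):
    base = VGG16(include_top=False, weights=None,
                 input_shape=(img_width, img_height, 3))
    base.trainable = False  # freeze all pretrained layers

    model = models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(2, activation="softmax"),  # dog / not dog
    ])
    model.compile(loss="categorical_crossentropy", optimizer="adam",
                  metrics=["categorical_accuracy"])
    return model
```

Freezing the base means only the final dense layer's weights get updated during training, which is why this is so much cheaper than training VGG16 from scratch.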

This approach resulted in two errors.

Error 1: The trained model was not able to be loaded after training

Error 1

Error 2: During training, the accuracy stayed around 50%.

Error 2

Let’s focus on error 1 for a minute. When I tried to load the trained model back in, it failed. Googling around, it sounded like there was an issue either with the version of Keras I was using or with the first layer of the network.

Downgrading to an earlier version of Keras didn’t work but changing the input layer did. In retrospect, it’s funny how simple the fix ended up being.

vggmodel = VGG16(include_top = False, weights = "imagenet", input_shape = (img_width, img_height, 3))

Simply setting the include_top parameter to False fixed the issue. This drops the pretrained network's fully connected top layers, which lets you define your own input size rather than being stuck with the 224×224 default that VGG16 expects.

Error 1 fixed!

Error 2 was much more complex to figure out. As a refresher, since there were only two categories, dog or not dog, I was using binary classification. However, this resulted in an accuracy of ~50%, no better than a coin flip.

The great thing about having a basic foundation in machine learning is that I had a couple ideas on which parameters to tweak. Thanks Andrew Ng!

The original settings that were potentially causing issues

I played around with changing four settings.

  1. The learning rate.
  2. The batch size.
  3. The number of iterations over the entire training set.
  4. The number of batches used for training per epoch.

To ensure things didn’t get too confusing, only one parameter was changed at a time while the other three stayed fixed. At the end of the day, however, the accuracy still hovered around 50%.
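The one-parameter-at-a-time idea is simple enough to sketch in plain Python. The baseline values below match the settings mentioned in this project; the sweep values are illustrative.

```python
# Sketch of one-at-a-time hyperparameter tweaking: start from a baseline
# config, then vary a single parameter while holding the rest fixed.
def configs_one_at_a_time(baseline, sweeps):
    """Yield one config per (parameter, value) pair, all else at baseline."""
    for name, values in sweeps.items():
        for value in values:
            config = dict(baseline)
            config[name] = value
            yield config

baseline = {"learning_rate": 1e-3, "batch_size": 25,
            "epochs": 10, "steps_per_epoch": 100}
sweeps = {"learning_rate": [1e-2, 1e-4],   # illustrative sweep values
          "batch_size": [16, 64]}

for config in configs_one_at_a_time(baseline, sweeps):
    print(config)  # in practice, each config would be a full training run
```

Changing one thing at a time makes it obvious which parameter (if any) is responsible for a change in accuracy. In my case, none of them were, which is what pointed me at the loss function instead.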

As a last resort I changed the loss function from binary to categorical cross-entropy and… success! But why? I reached out to friends and colleagues but the explanations were varied and didn’t make a lot of sense. So at this point, I was stumped. When this happens I usually stop coding and then do one of three things. Get a drink, go out for a run or grab some spicy dumplings.

After a plate of these you’re practically invincible

A plate of dumplings later… back to the solution.

To figure out the why, experience has taught me to take a closer look at the documentation. Going back and reading the basics can be incredibly helpful and saves time in the long run. The official Keras documentation, as well as the source code on GitHub, helped provide insight.

It turns out that binary cross-entropy, applied to a layer with multiple output nodes, treats each node as an independent yes/no question. The implication is that every input can belong to multiple classes at once. For example, a dog could be grouped with other dogs as well as with other mammals; technically, it fits into both groups.

The assumption with categorical cross-entropy is that given multiple classes, exactly one is correct. For the purposes of dognotdog, there is only one correct answer per image, so categorical cross-entropy has to be used.
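A toy NumPy calculation makes the distinction concrete. The prediction below is an assumed example where the two node activations don't sum to 1, which is exactly what independent sigmoid outputs allow: categorical cross-entropy only scores the correct-class node, while binary cross-entropy scores every node separately.

```python
# Toy comparison of the two losses on a single prediction.
import numpy as np

def categorical_crossentropy(y_true, y_pred):
    # -sum(true * log(pred)); with a one-hot label, only the correct
    # class's node contributes to the loss
    return -np.sum(y_true * np.log(y_pred))

def binary_crossentropy(y_true, y_pred):
    # mean over nodes of an independent yes/no loss per node
    per_node = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return np.mean(per_node)

y_true = np.array([1.0, 0.0])  # one-hot label: this image is a dog
y_pred = np.array([0.8, 0.3])  # independent sigmoid outputs (needn't sum to 1)

print(categorical_crossentropy(y_true, y_pred))  # only the "dog" node matters
print(binary_crossentropy(y_true, y_pred))       # both nodes are penalized
```

The binary loss comes out higher here because the "not dog" node's 0.3 is also penalized, even though the "dog" node already answered the question. With a softmax output and one-hot labels, the categorical loss matches the single-correct-answer setup of dognotdog.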

Based on that, I modified my original approach.

train_gen = datagen.flow_from_directory(train_data, target_size = (img_width, img_height), batch_size = 25, shuffle = True, class_mode = "categorical")

validation_gen = datagen.flow_from_directory(test_data, target_size = (img_width, img_height), batch_size = 25, shuffle = True, class_mode = "categorical")

model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['categorical_accuracy'])

Modifying the class_mode parameter in the training and validation generators to categorical and specifying categorical_crossentropy for the loss function fixed the issue!

Accuracy is now closer to 90%

Although ~90% accuracy was definitely an improvement, I’m sure it could be pushed further with more tuning. If I had the time, I would play around with the learning rate, batch size and a number of other parameters.

But for now I think it’s safe to say….error 2 solved!

A Shiny Webapp that Classifies an Image

Making a webapp can be a daunting task. You have to know HTML, CSS and a bunch of other frameworks. And since I’m not that kind of programmer, learning all of that isn’t the best use of my time.

Fortunately, I have some experience with this. A couple of years ago, I made a webapp using R and Shiny as part of a hackathon, and I’ve made a couple more since then.

Shiny is an R package that lets you develop interactive webapps. I like using it because as long as you know R, you really don’t need to learn HTML, CSS or other frameworks.

That being said, I wasn’t sure how to actually use my trained model in R. All my code so far had been written in Python and I had no idea how to convert it into something R would understand. Luckily, during the ungodly amount of time it took to complete my undergrad, I learnt one very important lesson.

Specific questions get specific answers.

  1. Can I use Keras and Tensorflow in R?
  2. How can I run Python code in R?

Here’s the result of question 1.

And the result for question 2.

… well, that’s lucky.

Before we get into how to use all this you should set up R, RStudio and Python in some kind of Linux/Unix environment. I used a virtual machine running Ubuntu 16.04.

DO NOT USE WINDOWS!

You’ll be shaking your fist at the cold, cruel world when things don’t work.

You may also run into issues where packages do not load due to missing .h files. Running the following code in the command line should help.

sudo apt-get install libxt-dev xvfb xauth xfonts-base libcairo2-dev libgtk2.0-dev

To set this up, I started with installing and importing Shiny, Keras, Tensorflow and Reticulate into R.

install.packages(c("shiny", "keras", "tensorflow", "reticulate"))
library(shiny)
library(keras)
library(tensorflow)
library(reticulate)
py_install("Pillow")
PIL <- import("PIL")

From there I ran the same python functions that were in my Jupyter notebook. And this is where I ran into my first error message.

ImportError: Could not import PIL.Image. The use of array_to_img requires PIL.

Now, ordinarily I like pickles but this one, not so much. Pillow (PIL) is a Python package, but it wasn’t being loaded despite reticulate being installed. My guess was that R didn’t know where to look.

Now I could have given R the file path where Python was installed. However, since this application would eventually be hosted on another server, that would have been problematic. This is where the idea of a virtual environment came in. When the app was run, reticulate would create a local environment that contained Python along with all the needed libraries.

# create the virtual environment with the needed python packages
virtualenv_install("env", packages = c("Pillow", "keras", "tensorflow"), ignore_installed = FALSE)
use_virtualenv("env", required = TRUE)

# import python libraries
PIL <- import(module = "PIL")
keras <- import("keras")

Essentially, virtualenv_install creates an environment containing Pillow, keras and tensorflow. Once that is done, import can be used to load those packages into R.

Everything should work but if not, restart RStudio. If that still doesn’t work, cry for a bit then send me a message.

The completed application was uploaded to ShinyStudio!

This output == SUCCESS!

After all that work, here’s the end result.

Some last minute notes.

  • At the time of writing, TensorFlow 2.0 had just been released. So if you use my code and run into issues, that’s probably why. When installing TensorFlow, I recommend sticking to version 1.14.
  • This app is 92% accurate at distinguishing dogs from cats. If you show it another animal, results may vary. In the future, I’d use a more comprehensive dataset with more animals to get around this.

If you enjoyed this article, and want to read more, just click the link below…

… or drop by my website to check out my research or to send me a message!

If you have any comments or suggestions about this article, feel free to leave a comment!

Thanks for reading :)
