Creating a simple dog 🐶 vs cat 🐱 image classifier using Keras

Anchit Jain
Published in Data Science 101
5 min read · May 25, 2018

I’ve been learning machine learning algorithms such as Linear Regression, Logistic Regression and Decision Trees for a while. Now it’s time for me to explore Neural Networks for more advanced Machine Learning problems.

In this blog I’ll discuss the fundamentals of Machine Learning (ML) and show how we can use it to work on problems where we need to predict something.

So what is Machine Learning?

Copying a few lines from Wikipedia, let’s see what they say.

Machine learning(ML) is a field of computer science that often uses statistical techniques to give computers the ability to “learn” (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.

Well said. Using the same concept, we will train our model, with all its parameters, on a large data set and, at the end, test the model with some inputs.

A little theory for the big picture should not be painful.

I’ll try to cover every linked topic with minimal theory and maximum knowledge. Now, how can we use ML (please excuse my laziness) for our problem statement? This is where the concept of Deep Learning comes into play. In simple words, deep learning is a way to implement ML using algorithms called artificial neural networks, which are inspired by the functioning of the brain.

Deep learning is an aspect of artificial intelligence (AI) that is concerned with emulating the learning approach that human beings use to gain certain types of knowledge.

Did you see that? Artificial Neural Network. Yes, please pay attention here. It is a machine learning algorithm built on the principles of the organization and functioning of biological neural networks. The beauty of ML here is that our neural network model is loosely modeled on the way the human brain works. Let’s see what our dear Google has to say about artificial neural networks.

An artificial neuron network (ANN) is a computational model based on the structure and functions of biological neural networks. Information that flows through the network affects the structure of the ANN because a neural network changes — or learns, in a sense — based on that input and output.

The image below shows how a message is passed from one neuron to another. The neurons are arranged in a series of layers: we feed data to the input layer, pass it through successive hidden layers while training the model, and finally reach the output layer, where we read off the prediction.

Artificial Neural Network
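To make the layer-by-layer flow a little more concrete, here is a minimal sketch in plain Python of data passing from an input layer through one hidden layer to an output layer. The weights, biases and the choice of a sigmoid activation are made-up values for illustration only; a real network learns its weights during training.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute one layer: every neuron takes a weighted sum of all
    inputs, adds its bias, and applies the sigmoid activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs


# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_weights = [[0.5, -0.5], [1.0, 1.0]]
hidden_biases = [0.0, 0.0]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]


def predict(inputs):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = layer_forward(inputs, hidden_weights, hidden_biases)
    return layer_forward(hidden, output_weights, output_biases)[0]


prediction = predict([1.0, 1.0])
print(prediction)  # a single value between 0 and 1
```

Training would consist of nudging the weights and biases so that the printed value moves closer to the correct label.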

I guess this much theory is sufficient to understand our problem statement. It’s time to dive into our coding pool.

For those of you who want to understand Neural Networks in more depth — I’d recommend watching this short, yet exhaustive explanation of neural networks here.

Image classification:

The first step in image classification with a CNN (Convolutional Neural Network) is getting the image into a form the machine can read. It is very easy for the human eye to visualize an image, but how can we make a machine do the same? Sad, but machines can’t see :( They can, however, work with the matrix representation of an image. See how.

Image classification

In the image above we can easily see the digit 8. But what is an image? Nothing but a collection of pixels. Here the image has 28 rows and 28 columns, which gives 28 × 28 = 784 pixels in total, and these 784 pixel values act as the input to the first layer of our network, the input layer. In a similar way, we have taken thousands of images of dogs and cats and used 150 by 150 pixels as the input size. Also, rather than feeding the images one by one, I have grouped them into batches of 16 for faster iteration. Training uses a sample of 2000 images, and the entire model is iterated for 50 epochs.

One Epoch is when an ENTIRE dataset is passed forward and backward through the neural network only ONCE.
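The input sizes mentioned above are easy to check with a few lines of plain Python. This is only a sketch: the blank image stands in for a real digit, and the 3 colour channels for the dog and cat images are my assumption (RGB), since the text only gives the 150 by 150 size.

```python
def flatten(image):
    """Turn a 2-D grid of pixel values (a list of rows) into one flat list."""
    return [pixel for row in image for pixel in row]


# A blank 28 x 28 grayscale image standing in for the digit example.
digit_image = [[0] * 28 for _ in range(28)]
digit_inputs = flatten(digit_image)
print(len(digit_inputs))  # 784

# A 150 x 150 image, assuming 3 colour channels (RGB) per pixel.
cat_dog_values = 150 * 150 * 3
print(cat_dog_values)  # 67500
```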

Okay! Now that we have designed our input layer, let’s move ahead with the further layers. We choose our model to be sequential, since the output of one layer is the input to the next, and so on. Let’s check what these layers look like.

Let us see each function one by one. Here we go.

  1. Sequential() : The sequential model is just a linear stack of layers. The add() method lets you add layers to your model.
  2. Conv2D : This layer creates a convolution kernel that is convolved with the layer input to produce a tensor (a generalization of matrices) of outputs.
  3. Activation : This layer applies an activation function (for example ReLU or sigmoid) to the output of one layer before it is passed on to the next.
  4. MaxPooling2D : This layer down-samples the representation of the image (reduces its dimensions) by keeping only the maximum value in each small window.
MaxPooling2D
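To see what the convolution, activation and max-pooling steps listed above do to the numbers, here is a small plain-Python sketch on a toy 5 x 5 image. It is not the Keras implementation: the image, the 2 x 2 kernel and the helper names are invented for illustration, and real layers learn many kernels at once.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the
    element-wise products. As is conventional in deep learning libraries,
    the kernel is not flipped."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


def relu(feature_map):
    """Activation: replace every negative value with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2 x 2 block."""
    rows = len(feature_map) // 2
    cols = len(feature_map[0]) // 2
    return [
        [
            max(feature_map[2 * r][2 * c], feature_map[2 * r][2 * c + 1],
                feature_map[2 * r + 1][2 * c], feature_map[2 * r + 1][2 * c + 1])
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy 5 x 5 "image" and a hypothetical 2 x 2 kernel.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 1, 0],
    [3, 0, 1, 2, 1],
    [1, 2, 0, 0, 2],
    [0, 1, 3, 1, 0],
]
kernel = [[1, 0], [0, -1]]

feature_map = relu(conv2d(image, kernel))  # 4 x 4 after the convolution
pooled = max_pool_2x2(feature_map)         # 2 x 2 after pooling
print(pooled)  # [[0, 1], [1, 1]]
```

The 5 x 5 input shrinks to 4 x 4 after the convolution and to 2 x 2 after pooling, which is the down-sampling described in item 4.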

Once we have created the input and hidden layers, we need to connect them to produce the output. For that, Flatten() converts the multi-dimensional feature maps into a 1-D array. It is followed by Dense(), a fully connected layer in which every unit is connected to every value coming from the previous layer, and which produces the final output. Finally, Dropout() randomly switches off a fraction of the units during training, which helps to avoid overfitting.
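The full code is linked rather than shown here, so the following is only a sketch of how the layers described above might be stacked, assuming the Keras bundled with TensorFlow (tensorflow.keras). The filter counts (32 and 64), the 3 x 3 kernels, the 64-unit Dense layer, the 0.5 dropout rate and the 3 colour channels are illustrative assumptions, not values taken from the author's code.

```python
from tensorflow.keras import layers, models


def build_model(input_shape=(150, 150, 3)):
    """Stack convolution, pooling and dense layers into a small binary
    classifier. All layer sizes are illustrative assumptions."""
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))  # 150 x 150 images, assumed RGB

    # Hidden block 1: convolution -> activation -> down-sampling
    model.add(layers.Conv2D(32, (3, 3)))
    model.add(layers.Activation("relu"))
    model.add(layers.MaxPooling2D(pool_size=(2, 2)))

    # Hidden block 2: same pattern with more filters
    model.add(layers.Conv2D(64, (3, 3)))
    model.add(layers.Activation("relu"))
    model.add(layers.MaxPooling2D(pool_size=(2, 2)))

    # Classifier head: flatten to 1-D, fully connected layer, dropout
    model.add(layers.Flatten())
    model.add(layers.Dense(64))
    model.add(layers.Activation("relu"))
    model.add(layers.Dropout(0.5))

    # One output unit with sigmoid gives a probability for one of two classes
    model.add(layers.Dense(1))
    model.add(layers.Activation("sigmoid"))
    return model


model = build_model()
model.summary()
```

The single sigmoid output is a natural fit for a two-class problem such as dog versus cat.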

Augmenting the images and compiling the model:

This step helps us train a model that fits well. Augmentation is a pre-processing step in which the model is trained on a wide variety of versions of each image. This variety can be produced by transformations such as scaling, translation, rotation and flipping.
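As a rough illustration of what such transformations do, here is a plain-Python sketch that flips, rotates and shifts a tiny image stored as a list of rows. The helper names are hypothetical and this is not how Keras implements augmentation; it only shows that each transformation yields a new, slightly different training image from the same original.

```python
def flip_horizontal(image):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def shift_right(image, pixels, fill=0):
    """Translate the image to the right, padding the left edge with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


original = [
    [1, 2, 3],
    [4, 5, 6],
]

print(flip_horizontal(original))      # [[3, 2, 1], [6, 5, 4]]
print(rotate_90_clockwise(original))  # [[4, 1], [5, 2], [6, 3]]
print(shift_right(original, 1))       # [[0, 1, 2], [0, 4, 5]]
```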

We then compile the CNN using the compile function. This function expects three parameters: the optimizer, the loss function, and the performance metrics. The optimizer is the gradient descent algorithm we are going to use. We use the binary_crossentropy loss function since we are doing binary classification.
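To get a feel for what the binary cross-entropy loss measures, here is a minimal plain-Python sketch of the formula. It is a hand-written version for illustration, not the Keras implementation, and the labelling (1 for dog, 0 for cat) and the example predictions are assumptions.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted
    probabilities. Predictions are clipped to avoid taking log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]             # assumed encoding: 1 = dog, 0 = cat
confident = [0.9, 0.1, 0.8, 0.2]  # mostly right
unsure = [0.5, 0.5, 0.5, 0.5]     # always says 50/50
wrong = [0.1, 0.9, 0.2, 0.8]      # mostly wrong

loss_confident = binary_crossentropy(labels, confident)
loss_unsure = binary_crossentropy(labels, unsure)
loss_wrong = binary_crossentropy(labels, wrong)
print(loss_confident < loss_unsure < loss_wrong)  # True
```

The loss is small when the predicted probabilities agree with the labels and grows quickly when the model is confidently wrong, which is what the optimizer tries to minimize.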

Last, after gathering the well-structured data, it’s time to train the model. We call model.fit_generator() with our training arguments and train the model, repeating the run until we reach the best accuracy and lowest loss we can. The accuracy of the model can be improved by tuning hyper-parameters such as the number of epochs. Once we are happy with the accuracy, we save the whole model so that we do not have to retrain it for every test.
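Using the numbers quoted earlier (2000 training images, batches of 16, 50 epochs), the training schedule works out as in the sketch below. The helper names are hypothetical; the point is only to show the arithmetic behind the steps-per-epoch and epochs settings that a generator-based training call needs.

```python
def batches(items, batch_size):
    """Yield consecutive slices of `items`, each holding `batch_size` elements
    (the last one may be shorter)."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def training_schedule(num_images, batch_size, epochs):
    """Return (steps per epoch, total number of batches processed)."""
    steps_per_epoch = num_images // batch_size
    total_updates = steps_per_epoch * epochs
    return steps_per_epoch, total_updates


steps_per_epoch, total_updates = training_schedule(2000, 16, 50)
print(steps_per_epoch)  # 125
print(total_updates)    # 6250

# One epoch = one pass over every batch of the (stand-in) dataset.
image_ids = list(range(2000))
batches_in_one_epoch = sum(1 for _ in batches(image_ids, 16))
print(batches_in_one_epoch)  # 125
```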

It’s time to summarize our learnings so far. Let’s go over the necessary steps for building a dog-cat classifier.

  1. Image pre-processing
  2. Creating ANN layers
  3. Model training
  4. Model testing
  5. Model evaluation

You can access the full code here.

Thank you for your patience… Claps (Echoing)

Anchit Jain
Data Science 101

Machine learning engineer. Loves to work on Deep learning based Image Recognition and NLP. Writing to share because I was inspired when others did.