Creating and Deploying a Cat-Dog Image Classifier using TensorFlow and Streamlit- Part 1

Saksham Gulati
Published in Analytics Vidhya · 6 min read · Jul 3, 2021

It all started with an innocent question. I grew up watching a lot of cartoons, including a comedy show called CatDog from Nickelodeon. The show was based on conjoined twins who happened to be a cat and a dog respectively (it sounds really silly when I explain it, but hey, I was a kid back then!).

The question I was trying to answer was: "If I build an image classifier, would it classify my beloved cartoon character as a cat or a dog?" In this article, we shall go over the steps of what turned out to be an interesting project and answer that question: was it a cat or a dog?

Getting Started:

Assumption- The material covered in this blog assumes that the reader has an intermediate understanding of Python and a basic understanding of ML concepts.

I started with a dataset on Kaggle which allowed me to train my model on multiple images of cats and dogs. This is a moderate-sized dataset, providing 4000 training images each of cats and dogs. Although it is not the best dataset to work with, given the small sample size to train our network on, it's a good starting point. More on how I tackled the small size later on…

Let us begin by observing some of the images just to get an idea about the project. This exercise will not only allow us to view the image data but also let us warm up before getting to the heavy lifting.

Basic libraries to load beforehand-

import pandas as pd
import numpy as np
import os
import random
import cv2
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential, Model
from matplotlib import pyplot as plt
import matplotlib.image as mpimg
%matplotlib inline

import PIL
from PIL import Image
print('Pillow Version:', PIL.__version__)

Getting started with some cats and dogs-

plt.figure(figsize=(20, 20))
test_folder = r'/kaggle/input/cat-and-dog/training_set/training_set/cats'
for i in range(5):
    # Pick a random image from the cats folder and display it
    file = random.choice(os.listdir(test_folder))
    image_path = os.path.join(test_folder, file)
    img = mpimg.imread(image_path)
    print(img.shape)
    ax = plt.subplot(1, 5, i + 1)
    ax.title.set_text(file)
    plt.imshow(img)

Similarly, for dogs, the results were as follows-

An important thing to note here is that every image has its own dimensions. The takeaway from this observation is that we need to fix the dimensions of the images so that we can feed a single, uniformly shaped tensor (a matrix of matrices) into our CNNs. For this example, I have fixed my dimensions at (200, 200).

Data Ingestion and Manipulation

IMG_WIDTH = 200
IMG_HEIGHT = 200
img_folder = '/kaggle/input/cat-and-dog/training_set/training_set/'

def create_dataset(img_folder):

    img_data_array = []
    class_name = []

    for dir1 in os.listdir(img_folder):
        print("Collecting images for: ", dir1)
        for file in os.listdir(os.path.join(img_folder, dir1)):

            image_path = os.path.join(img_folder, dir1, file)
            # Read the image, then convert from OpenCV's BGR ordering to RGB
            image = cv2.imread(image_path)
            try:
                image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
                image = cv2.resize(image, (IMG_HEIGHT, IMG_WIDTH), interpolation=cv2.INTER_AREA)
            except:
                # Skip files that could not be read as images
                continue
            image = np.array(image)
            image = image.astype('float32')
            image /= 255
            img_data_array.append(image)
            class_name.append(dir1)
    return img_data_array, class_name

# extract the image array and class name
img_data, class_name = create_dataset('/kaggle/input/cat-and-dog/training_set/training_set/')

A lot of things to cover here. Let's start with the basics: we need to ensure that each image is processed as data, that every image has the same height and width, and that image quality is not forsaken in the resizing process. Finally, we need to collate those images (collections of pixel data, duh!) into one large array for our neural nets to work on!
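As a quick sanity check (my own addition, not in the original notebook), the list of per-image arrays can be stacked into a single NumPy tensor and its shape inspected before handing anything to the network:

# Hypothetical sanity check: stack the per-image arrays into one tensor
img_data = np.array(img_data, dtype='float32')
print("Image tensor shape:", img_data.shape)   # expected: (n_images, 200, 200, 3)
print("Number of labels:", len(class_name))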

I will cover some of the things that might catch your attention here-

- cv2 (OpenCV)- an open-source library for computer vision operations and algorithms.
- cv2.imread- this method loads an image from a given path.
- cv2.COLOR_BGR2RGB- OpenCV reads images in BGR (Blue, Green, Red) channel order, so we have to convert them back to RGB (Red, Green, Blue).
- cv2.INTER_AREA- the interpolation used when resizing the image. We need to ensure that quality is maintained during image manipulations. One quick rule of thumb: if you are enlarging the image, prefer INTER_LINEAR or INTER_CUBIC interpolation; if you are shrinking the image, prefer INTER_AREA interpolation (see the short sketch just after this list).
- image.astype('float32')- for all weights and neuron activations, if you are training with a backpropagation-based method, you need a data type that approximates real numbers, so that you can apply fractional updates based on differentiation. The best weight values are often fractional, non-whole numbers, and non-linearities such as the sigmoid output floats anyway, so after the input layer you have matrices of float values regardless. Hence, we need to ensure that the tensors are read as float values.
- image /= 255- normalizes our input so that pixel values are scaled between 0 and 1 (the max pixel value is 255).
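To make the interpolation rule concrete, here is a small illustrative snippet (my own sketch, not part of the original pipeline; resize_keeping_quality is a hypothetical helper) that picks the interpolation flag based on whether the target size is larger or smaller than the source:

import cv2

def resize_keeping_quality(image, target_w, target_h):
    """Pick an interpolation method based on whether we enlarge or shrink."""
    h, w = image.shape[:2]
    if target_w * target_h > w * h:
        interp = cv2.INTER_CUBIC   # enlarging: cubic (or linear) preserves detail
    else:
        interp = cv2.INTER_AREA    # shrinking: area-based resampling avoids aliasing
    return cv2.resize(image, (target_w, target_h), interpolation=interp)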

Once I had the arrays prepared for cats and dogs, it was time to prepare the output variable as 0 and 1. I chose 1 for dogs and 0 for cats (I just prefer dogs, that's all).

def dog_cat_mapping(a):
    if a == "dogs":
        return 1
    else:
        return 0

class_name = list(map(dog_cat_mapping, class_name))
class_name = np.array(class_name)
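A quick sanity check (my own, not in the original code) is to count how many examples ended up in each class after the mapping:

# Count how many images were labelled 0 (cats) and 1 (dogs)
values, counts = np.unique(class_name, return_counts=True)
print(dict(zip(values, counts)))   # expect roughly 4000 images per class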

Model Building

Finally, I arrived at building the model. I won't go into the details of what a convolutional layer is and how it operates, because that would require more than a few (thousand) pages!

def model():
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPooling2D

    # Input shape matches the resized RGB images: (height, width, channels)
    input_shape = (IMG_HEIGHT, IMG_WIDTH, 3)

    model = Sequential()
    model.add(Conv2D(28, kernel_size=(3, 3), input_shape=input_shape, activation='relu'))
    model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(128, kernel_size=(2, 2), activation='relu'))
    model.add(Conv2D(128, kernel_size=(2, 2), activation='relu'))
    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='sigmoid'))
    return model

The important thing to note here is that you can have as many filters as you want in each layer. I chose 28, 64, 128 and 256 as these are fairly standard choices. Moreover, I decided to go with 3x3 and 2x2 kernel sizes; these kernels are the filters through which features are extracted from the input image.

Once the convolution operations are done, we flatten the output so it can be fed into the dense part of our neural network. I chose a dropout of 0.2, which means that 20% of the layer's outputs are randomly ignored or "dropped out" during training. This helps prevent overfitting.
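To see what dropping 20% of activations means in practice, here is a tiny illustration (purely for intuition, my own addition):

import tensorflow as tf

layer = tf.keras.layers.Dropout(0.2)
x = tf.ones((1, 10))
# Roughly 20% of the values are zeroed; the survivors are scaled by 1/0.8
print(layer(x, training=True))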

Finally, a sigmoid function was used to tell if the input was a cat (0) or a dog (1).
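Putting it together, since the sigmoid outputs a probability between 0 and 1, a single image can be classified by thresholding that probability at 0.5. A minimal sketch (my own illustration, reusing the preprocessing constants above; predict_label is a hypothetical helper, not part of the original notebook):

def predict_label(model, image_path):
    """Load one image, preprocess it like the training data, and classify it."""
    img = cv2.imread(image_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (IMG_HEIGHT, IMG_WIDTH), interpolation=cv2.INTER_AREA)
    img = img.astype('float32') / 255
    prob = model.predict(np.expand_dims(img, axis=0))[0][0]  # sigmoid output in [0, 1]
    return 'dog' if prob >= 0.5 else 'cat'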

Model Compilation and Evaluation

Model training took quite a while. To speed up the process, I would highly recommend switching on the GPU accelerator in Kaggle.

model = model()
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# Stack the list of image arrays into a single tensor before fitting
model.fit(x=np.array(img_data, dtype='float32'), y=class_name, epochs=10)

With just 10 epochs, I was able to achieve a training accuracy of 99%. Please note that there is nothing to flex here; reaching 99% accuracy on a dataset of 8000 samples just goes to show that our model is overfitting the data.

This accuracy should also be taken with a pinch of salt, since training accuracy depends on the initial weight assignments in our neural network (which are random). If I were to start the process all over again, I could very well end up with an accuracy of 80% instead of 99% (because of poor initial weights).
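If you want to make runs more comparable, one common mitigation (not something I did in the original notebook) is to pin the random seeds before building and training the model; note that this still does not make GPU training fully deterministic:

import random
import numpy as np
import tensorflow as tf

# Pin the seeds that control weight initialization and data shuffling
random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)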

Ending Thoughts and Further Steps

The test accuracy wasn't that great (no surprises there): I was able to achieve an accuracy of ~61% on the test data set. Well, at least it's a good start. Our model can make a basic classification between a cat and a dog. However, it struggles with some breeds, such as this one-

Model Classified this as a ‘cat’
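For reference, a test score like the ~61% above can be computed by running the same preprocessing on the held-out images and calling model.evaluate. A sketch, assuming a parallel test_set folder in the Kaggle dataset (the exact path is an assumption):

# Assumed path: the Kaggle dataset ships a parallel test_set folder
test_img_data, test_class_name = create_dataset('/kaggle/input/cat-and-dog/test_set/test_set/')
test_x = np.array(test_img_data, dtype='float32')
test_y = np.array(list(map(dog_cat_mapping, test_class_name)))

loss, accuracy = model.evaluate(test_x, test_y)
print("Test accuracy:", accuracy)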

Further down the road, we shall try to incorporate ‘Transfer Learning’ in our model. Moreover, we shall be able to get an answer to our original question, “Is my beloved cartoon character a Cat or a Dog?”

Find Part 2 of this series here. Please leave your thoughts and comments below. Good vibes only!

You can find my notebooks below-

Kaggle Link

GitHub Code
