Final Project: Natural Image classifier

May 1, 2022

The goal is to build an image classifier for the Kaggle Natural Images dataset, which contains 6,899 images from 8 distinct classes compiled from various sources. The classes are airplane, car, cat, dog, flower, fruit, motorbike and person.

Interested in testing my web app? Follow this link.

Google Colab notebook link

Before going deep into the project, let’s see what image classification is.

Image classification is a machine learning method used to recognize the category of an image: given an image as input, a machine learning algorithm predicts its category. First we train an image classifier by providing images that belong to different categories. Once the classifier has been trained enough and its accuracy is high, it can predict the category of new images reliably.

Convolutional Neural Network

A CNN, or convolutional neural network, is a deep learning neural network designed to interpret structured arrays of data like photographs. Convolutional neural networks are widely utilized in computer vision and have become the state-of-the-art for many visual applications such as image classification, as well as natural language processing for text categorization.

There are four types of layers for a convolutional neural network: the convolutional layer, the pooling layer, the ReLU correction layer and the fully-connected layer.

The convolutional layer

The convolutional layer, which is always at least the first layer in convolutional neural networks, is the most important component. Its objective is to identify a set of features in the photographs provided as input. Convolution filtering does this by dragging a window representing a feature on the image and calculating the convolution product between the feature and each piece of the scanned image. A feature is considered as a filter in this scenario, and the two names are interchangeable.

As a result, the convolutional layer accepts many images as input and calculates the convolution of each image with each filter. The filters correspond exactly to the visual features we want to find in the images.
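To make the sliding-window idea concrete, here is a minimal pure-Python sketch of a single filter applied to a single-channel image. The function name `convolve2d`, the toy image and the edge filter are made up for illustration. Like most deep learning libraries, the sketch does not flip the filter, and it uses no padding and a stride of 1.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the window element-wise with the filter and sum the products.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x4 "image": dark (0) on the left half, bright (1) on the right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, vertical_edge)
print(feature_map)
```

The feature map is largest in the column where the window straddles the edge and zero where the window covers a uniform region.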

The pooling layer

This layer is frequently sandwiched between two convolutional layers: it receives many feature maps and applies the pooling operation to each of them.

The pooling operation reduces the size of the feature maps while preserving their essential properties.
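As a rough illustration, the sketch below assumes max pooling with a 2x2 window, which is a common choice, although the article does not say which pooling variant was used. The function name `max_pool2d` and the sample values are hypothetical.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool2d(feature_map)
print(pooled)
```

A 4x4 map shrinks to 2x2, and each output value is the strongest response in its window.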

The ReLU correction layer

ReLU (Rectified Linear Unit) refers to the real non-linear function defined by ReLU(x) = max(0, x). Its graph is zero for all negative inputs and follows the line y = x for positive inputs.

The ReLU correction layer replaces all negative values received as inputs with zeros. It acts as an activation function.
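A short sketch of the same operation in plain Python; the helper names `relu` and `relu_layer` are chosen here for illustration.

```python
def relu(x):
    """ReLU(x) = max(0, x)."""
    return max(0.0, x)


def relu_layer(feature_map):
    """Apply ReLU to every value of a 2-D feature map."""
    return [[relu(value) for value in row] for row in feature_map]


activated = relu_layer([[-2.0, 0.5], [3.0, -0.1]])
print(activated)
```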

The fully-connected layer

The fully-connected layer is always the last layer of a neural network, whether convolutional or not, so it is not unique to CNNs.

This layer takes an input vector and turns it into a new output vector. It accomplishes this by applying a linear combination and, perhaps, an activation function to the incoming input values.
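The linear combination followed by an optional activation can be sketched as below. The weights, biases and the choice of softmax as the activation are assumptions for the example, not values from the project.

```python
import math


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases, activation=None):
    """Compute one output per weight row: sum_i(w_i * x_i) + b, then the activation."""
    outputs = [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]
    return activation(outputs) if activation else outputs


inputs = [1.0, 2.0, 3.0]
weights = [[0.5, 0.0, -0.5], [1.0, 1.0, 1.0]]  # two output neurons
biases = [0.0, -6.0]

logits = dense(inputs, weights, biases)
probabilities = dense(inputs, weights, biases, activation=softmax)
print(logits, probabilities)
```

In a classifier, the last fully-connected layer typically has one output per class, so the vector of probabilities can be read as the model's confidence in each category.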

Let’s start coding….

The first step is to load the data: we unzip and load the input dataset and split it into training and test datasets
Check the accuracy of the model, and check whether the model is underfitting or overfitting
Apply hyperparameter tuning and test by adding layers and changing other parameters
Change the batch size and epoch values and then compare the accuracy
Try implementing ML models with other algorithms such as Random Forest, KNN and Decision Tree classifiers
Build a web app that lets the user upload an image and then predicts the category of the uploaded image

1. Load and split the input Natural Images dataset

Here I have mainly used TensorFlow to build the image classifier.

What is TensorFlow?

TensorFlow is a machine learning library developed by the Google Brain team and released in 2015; it is primarily designed for deep learning applications. TensorFlow accepts data in the form of multidimensional arrays called tensors, which are very effective for handling large volumes of data.

TensorFlow provides a function called image_dataset_from_directory, which can be used to split our dataset into training and test sets.
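A minimal sketch of how that call might look is shown below. The directory name, image size, batch size, split fraction and seed are all assumptions, since the article does not state them, and the wrapper `load_train_test` is a hypothetical helper. The function expects one sub-folder per class inside the data directory.

```python
DATA_DIR = "natural_images"  # hypothetical path to the unzipped dataset
IMAGE_SIZE = (128, 128)      # assumed target size; the article does not state one
BATCH_SIZE = 32              # assumed batch size


def load_train_test(data_dir=DATA_DIR, test_fraction=0.2, seed=42):
    """Return (train_ds, test_ds) built from a folder with one sub-folder per class."""
    import tensorflow as tf  # third-party; imported lazily inside the helper

    common = dict(
        validation_split=test_fraction,
        seed=seed,  # using the same seed for both calls keeps the subsets disjoint
        image_size=IMAGE_SIZE,
        batch_size=BATCH_SIZE,
    )
    train_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="training", **common
    )
    test_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="validation", **common
    )
    return train_ds, test_ds
```

Calling `load_train_test()` would read the images lazily in batches; class labels are inferred from the sub-folder names and returned as integers by default.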

2. Model Training

I have implemented the below classifiers for image classification.

Convolutional Neural Network
Random Forest classifier
KNN classifier
Decision Tree Classifier

1. Build the CNN model

Model 1

As part of Model 1, I developed a simple machine learning model with only dense layers.

With this initial model we got an accuracy of 76%, but the results make it clear that the model is overfitting.

Overfitting

Overfitting happens when your machine learning model shows high accuracy in the train dataset but very low accuracy in the test dataset.

Here there is a large difference between the model's accuracy on the training data and on the test data. On the training data the model shows an accuracy of 85%, whereas on the test data it shows an accuracy of 46%. This is a sign of overfitting, so we need to find ways to prevent it.

Let’s look into the different ways to prevent overfitting

As per the TensorFlow documentation, two effective ways to prevent overfitting are weight regularization and dropout.
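To show what these two techniques do numerically, here is a small framework-free sketch. It assumes L2 weight regularization and the "inverted" form of dropout (surviving activations are rescaled during training); the function names and sample values are illustrative only.

```python
import random


def l2_penalty(weights, strength):
    """Extra loss term that grows with the squared size of the weights."""
    return strength * sum(w * w for w in weights)


def dropout(activations, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


penalty = l2_penalty([1.0, -2.0], strength=0.01)
dropped = dropout([1.0] * 1000, rate=0.5, rng=random.Random(0))
print(penalty, dropped.count(0.0))
```

The penalty discourages large weights, while dropout forces the network not to rely on any single activation. Dropout is only applied during training and is switched off at prediction time.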

Model 2

Model 2 is built to avoid the overfitting seen in Model 1.

Model 2 reached an accuracy of 78.97%.

Model 3

As part of Model 3, we developed a more complex model with 6 Conv2D layers, which gave an accuracy of 89.48%.
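The exact architecture is not listed here, so the following is only a sketch of what a Keras model with six Conv2D layers could look like. The input size, filter counts, dense layer width, dropout rate and L2 strength are assumptions, and `build_cnn` is a hypothetical helper name.

```python
def build_cnn(input_shape=(128, 128, 3), num_classes=8,
              dropout_rate=0.5, l2_strength=1e-4):
    """Six Conv2D layers in three blocks, followed by a small dense classifier."""
    import tensorflow as tf  # third-party; imported lazily inside the helper
    from tensorflow.keras import layers, regularizers

    def conv(filters):
        # 3x3 convolution with ReLU activation and L2 weight regularization.
        return layers.Conv2D(
            filters, 3, padding="same", activation="relu",
            kernel_regularizer=regularizers.l2(l2_strength),
        )

    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Rescaling(1.0 / 255),  # scale pixel values to [0, 1]
        conv(32), conv(32), layers.MaxPooling2D(),
        conv(64), conv(64), layers.MaxPooling2D(),
        conv(128), conv(128), layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(dropout_rate),
        layers.Dense(num_classes, activation="softmax"),  # one output per class
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model
```

The sketch combines the four layer types described earlier: convolution, ReLU (as the activation of each Conv2D layer), pooling and fully-connected layers.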

Graphical representation of Training and validation loss and accuracy of the above 3 CNN models

Accuracy vs Epoch

Hyperparameter Tuning

Experiment with five hyperparameters in the model:

Dropout rate in the dropout layer
Optimizer
L2 Regularization parameter
Epoch
Dropout layer

While experimenting with the dropout rate in the dropout layer, we got the maximum accuracy when the rate was between 0.4 and 0.5.

We also got the maximum accuracy with TensorFlow's Adam optimizer.

As the number of epochs increased, the accuracy also increased, but beyond a certain point it remained constant or started decreasing.
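One common way to act on this observation is early stopping: stop training once the validation accuracy has not improved for a few epochs. The sketch below applies that rule to a list of per-epoch accuracies; the numbers are made up, and `best_epoch` and `patience` are names chosen for this example.

```python
def best_epoch(val_accuracies, patience=3):
    """Return (epoch_index, accuracy) of the best epoch seen before stopping.

    Scanning stops after `patience` consecutive epochs without improvement.
    """
    best_index, best_value, waited = 0, float("-inf"), 0
    for epoch, value in enumerate(val_accuracies):
        if value > best_value:
            best_index, best_value, waited = epoch, value, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_index, best_value


# Made-up validation accuracies: they rise, flatten out and then decline.
history = [0.52, 0.68, 0.77, 0.84, 0.88, 0.89, 0.89, 0.88, 0.87]
stop_at, accuracy = best_epoch(history, patience=3)
print(stop_at, accuracy)
```

Keras offers the same behaviour through the `tf.keras.callbacks.EarlyStopping` callback, which also takes a `patience` argument.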

The L2 regularization parameter and the dropout layers are used to prevent overfitting in a machine learning model.
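Experiments like these can be organised as a simple grid search. The sketch below is not the procedure used in the notebook: the grid values are assumptions, and `fake_evaluate` is a stand-in scoring function so that the example runs without training a network. In practice it would train a model with the given settings and return its validation accuracy.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Try every combination in `grid` and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space.
grid = {
    "dropout_rate": [0.2, 0.4, 0.5],
    "optimizer": ["adam", "sgd"],
    "l2_strength": [1e-4, 1e-3],
}


def fake_evaluate(dropout_rate, optimizer, l2_strength):
    # Stand-in for "train the model and return validation accuracy".
    score = 0.8 - abs(dropout_rate - 0.4)
    score += 0.05 if optimizer == "adam" else 0.0
    score -= l2_strength
    return score


best_params, best_score = grid_search(fake_evaluate, grid)
print(best_params)
```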

2. Random Forest Classifier

Random forest is a supervised machine learning approach for solving classification and regression problems. It builds decision trees from different samples of the data and combines them using the majority vote for classification and the average for regression.

One of the most important features of the Random Forest algorithm is its ability to handle datasets with both continuous and categorical variables, as in regression and classification. It often performs well on classification problems.

Using the Random Forest classifier, we achieved an accuracy of 45% for our image classifier.
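The aggregation step described above can be sketched in a few lines. This covers only how the trees' outputs are combined, not how the trees are trained; the function names and the sample votes are illustrative.

```python
from collections import Counter
from statistics import mean


def forest_classify(tree_votes):
    """Classification: return the class predicted by the most trees."""
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_outputs):
    """Regression: return the average of the trees' numeric predictions."""
    return mean(tree_outputs)


label = forest_classify(["cat", "dog", "cat", "cat", "flower"])
value = forest_regress([2.0, 3.0, 4.0])
print(label, value)
```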

3. KNN Classifier

The k-nearest neighbors (KNN) algorithm is a data classification approach that estimates which of several groups a data point belongs to, based on the data points closest to it.

KNN is a supervised machine learning algorithm that can address both classification and regression problems. It is, however, mostly used for classification.

Using the KNN classifier, we got an accuracy of around 67%.
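A minimal sketch of the idea on made-up 2-D points is shown below. For images, each point would instead be a long feature vector such as the flattened pixels; that detail, the helper name `knn_predict` and the sample data are assumptions for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points nearest to `query`."""
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (2.0, 0.0)]
labels = ["cat", "cat", "cat", "dog", "dog", "car"]

prediction = knn_predict(points, labels, query=(0.15, 0.15), k=3)
print(prediction)
```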

4. Decision Tree Classifier

Decision Tree is a supervised machine learning algorithm that makes judgments using a set of rules, similar to how people do. In a sense, any machine learning classification method is designed to make such judgments.

The model predicts the class of a new, previously unseen input; behind the scenes, the algorithm must decide which class to assign.

Using the Decision Tree classifier, we got an accuracy of around 56%.
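To illustrate what "a set of rules" means, here is a hand-written toy tree and the loop that follows it from the root to a leaf. The features and rules are invented for the example; a trained decision tree would learn its rules from the image features instead.

```python
# Hand-written toy tree; the features and rules are made up for illustration.
tree = {
    "feature": "has_wheels",
    "yes": {
        "feature": "has_wings",
        "yes": "airplane",
        "no": "car",
    },
    "no": {
        "feature": "has_petals",
        "yes": "flower",
        "no": "cat",
    },
}


def tree_predict(node, sample):
    """Follow yes/no rules from the root until a leaf (a class name) is reached."""
    while isinstance(node, dict):
        node = node["yes"] if sample[node["feature"]] else node["no"]
    return node


sample = {"has_wheels": True, "has_wings": False, "has_petals": False}
prediction = tree_predict(tree, sample)
print(prediction)
```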

Comparison of accuracy of models

Flask Web App

Now that we have seen how the classification of images works, let’s create a web app which has an interface to upload images and uses our model to classify images.

Flask is a micro web framework written in Python that offers useful tools and features that make building web applications easier. Compared with other frameworks, Flask is more accessible to beginners.

There are two pages in this web application:

1) The Home page has a feature where we can upload an image, and when we click the Predict button, the result page appears.

2) The Result page shows the prediction: the category of the image and its probability.

Folder Structure:

models/my_model.pkl

static/images

templates

app.py

Workflow:

The templates folder contains the HTML template files for rendering the application. The rendering engine we use is Jinja2.

We will store uploaded images in the static folder before processing.

The exported model file will be kept in the models folder. The trained model was exported to .pkl format using the pickle package.
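The export and reload round trip with pickle looks roughly like this. The dictionary below is only a stand-in for the trained model, and a temporary directory is used in place of the models folder so the sketch leaves no files behind.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model; any picklable Python object is handled the same way.
model = {"classes": ["airplane", "car", "cat"], "weights": [0.1, 0.2, 0.3]}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "my_model.pkl")

    # Export: serialize the object to a binary file.
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # Load: read the file back, as the web app would do at start-up.
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

print(restored == model)
```

Pickle files can execute code when loaded, so only files from a trusted source should be unpickled.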

The entry point of the app is app.py. This file contains everything needed to get the app running.

I have defined a route for the home page in app.py: @app.route('/', methods=['GET']). This page contains the form to upload the image.

Similarly, @app.route('/', methods=['POST']) is defined as the form action route for uploading the image. Once the image reaches the server, it is preprocessed before being passed to the model for prediction.

The result of the model is processed and rendered using templates with the help of Jinja2.
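Put together, the workflow could be sketched as below. This is not the code from the repository: the template names (`index.html`, `result.html`), the form field name `file`, the allowed extensions and the `predict_fn` callback are all assumptions made for the example.

```python
import os

ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg"}    # assumed set of accepted formats
UPLOAD_DIR = os.path.join("static", "images")  # uploads are stored here first


def allowed_file(filename):
    """Accept only file names with an image extension."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


def create_app(predict_fn):
    """Build the Flask app; `predict_fn(path)` must return (label, probability)."""
    from flask import Flask, render_template, request  # third-party
    from werkzeug.utils import secure_filename

    app = Flask(__name__)

    @app.route("/", methods=["GET"])
    def home():
        # Home page with the upload form.
        return render_template("index.html")

    @app.route("/", methods=["POST"])
    def predict():
        file = request.files.get("file")
        if file is None or not allowed_file(file.filename):
            return render_template("index.html", error="Please upload a PNG or JPEG image.")
        os.makedirs(UPLOAD_DIR, exist_ok=True)
        path = os.path.join(UPLOAD_DIR, secure_filename(file.filename))
        file.save(path)
        label, probability = predict_fn(path)  # preprocessing + model prediction
        return render_template("result.html", label=label, probability=probability)

    return app
```

Passing the prediction function in as an argument keeps the web layer separate from the model, so the routes can be tried out with a dummy function before the real model is loaded.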

Github: https://github.com/rettygeorge/Natural-image-classifier

YouTube Video

Below you can find a YouTube video that describes the image classifier and shows a short demo of the web app.

Data set link : https://www.kaggle.com/datasets/prasunroy/natural-images

Final Project Proposal : https://medium.com/@retty.george/final-project-prop-abfdc5d6f4af

Source code link : https://github.com/rettygeorge/rettygeorge.github.io/blob/master/final_project.ipynb

Web App link: https://flask-image-classifier-retty.herokuapp.com

Web App Source code link : https://github.com/rettygeorge/Natural-image-classifier

My Contribution:

My contribution includes understanding the concepts of convolutional neural networks and experimenting with different hyperparameters to improve the accuracy, such as different batch sizes, epochs, input layers and optimizers. I created 3 CNN models and then compared their accuracy.

Apart from that, I also implemented a random forest classifier, a KNN classifier and a decision tree classifier.

One of my major contributions is building a web app that allows users to upload an image and the web app in return will predict the category of the uploaded image.

Challenges:

The main challenge I faced in this final project was the neural network part. Most of the models were either underfitting or overfitting, so I experimented with different hyperparameters such as the input layer, epochs, batch size and optimizer to arrive at the final neural network model, which gave an accuracy of 89%. I explored and tried different optimization methods, thereby increasing the accuracy to 89%.

The next challenging part was implementing other algorithms such as the random forest classifier, KNN classifier and decision tree classifier.

Written by Retty George