Image-based search engine with CNN and Transfer Learning

Dhruv Shrinet · Published in The Startup · Jun 15, 2020

There is an option in Google search for finding images that are similar to each other in numerous ways, such as shared features or color contrast. The result shows many images that have some of these attributes in common.

Let’s see how we can build one with a CNN and transfer learning. You may ask, why a CNN? It is the most popular deep learning architecture for image classification and feature extraction, and it is very common in practice. I recommend learning about CNNs before going further.

CNN Architecture.

Transfer Learning

Transfer learning is all about reusing previously computed models. It’s a fancy way of saying we take the parameters of another model that was trained on a much bigger data set and use its weights on our own data set, which saves us from training from scratch and gives better accuracy.

But how will we use transfer learning, and why do we need it?

What we actually need in this task is a feature vector for each image so that two images can be compared, and we will get that from a pre-trained CNN. In this task we will use ResNet-50 because of its high accuracy. The weights we will be using are the “imagenet” weights.

Here, 2048 is the length of the feature vector we will need for our task.

Implementation

For this task I have used the Flickr8k data set from Kaggle, as it has a good variety of images.

I am using Google Colab.

Importing the libraries and downloading the weights for ResNet-50.
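The original code was shared as a screenshot; here is a minimal re-creation of the setup, assuming TensorFlow 2.x with Keras bundled in.

```python
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model

# Downloads the ImageNet weights on first use.
model = ResNet50(weights="imagenet")
model.summary()
```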

WHAT WE NEED!!

This is the architecture of ResNet-50; avg_pool is the layer where we get the feature vector (1, 2048).

This is the feature vector of our image.

Let’s see how we can extract the features from this model. Here, model.layers[-2] refers to the avg_pool layer, as it is the second-to-last layer in the ResNet-50 model.

We create model_n, a new model whose input goes to the ResNet-50 model and whose output is what the second-to-last layer gives, which is the 2048-dimensional vector.
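A sketch of that step, reusing the model loaded above:

```python
# New model that outputs the 2048-d activations of the avg_pool layer.
# model.layers[-2] is avg_pool; the last layer is the 1000-way
# ImageNet classifier, which we do not need here.
model_n = Model(inputs=model.input, outputs=model.layers[-2].output)
```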

Before we move on to the feature extraction part, let’s pre-process the image: resize it to 224×224, convert it to an array, and change its dimensions to (1, 224, 224, 3).

We need to do this to keep only the important information and reduce the image to the fixed input size the network expects.

Code for the same
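A minimal re-creation of that screenshot; the helper name preprocess_img is my own:

```python
def preprocess_img(img_path):
    # Load the image and resize it to the 224x224 input ResNet-50 expects.
    img = image.load_img(img_path, target_size=(224, 224))
    # Convert to an array of shape (224, 224, 3).
    x = image.img_to_array(img)
    # Add a batch dimension -> (1, 224, 224, 3).
    x = np.expand_dims(x, axis=0)
    # Apply the channel-wise normalization ResNet-50 was trained with.
    return preprocess_input(x)
```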

This returns a NumPy array of dimension (1, 224, 224, 3).

After pre-processing the image, we pass it to a function that feeds it to our model_n, which gives us the feature vector.
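A sketch of that function; extract_features is a hypothetical name:

```python
def extract_features(img_path):
    x = preprocess_img(img_path)
    # Forward pass through model_n -> array of shape (1, 2048).
    features = model_n.predict(x)
    # Flatten to a plain vector of length 2048.
    return features.reshape(-1)
```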

This returns a NumPy array of length 2048.

Now we have to use both of these functions together and extract features for our whole data set. This is how it looks:
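A minimal sketch of that loop; the directory path is an assumption about where the Flickr8k images were unzipped:

```python
import os

IMG_DIR = "Flickr8k/Images"  # hypothetical path to the data set

# Map each file name to its 2048-d feature vector.
features = {}
for fname in os.listdir(IMG_DIR):
    features[fname] = extract_features(os.path.join(IMG_DIR, fname))
```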

Around 8k feature vectors will be extracted.

How to tell if two images are related?

We have feature vectors for all our images, but how do we tell whether two vectors are related to each other? In this case we will use cosine similarity, which tells how related two vectors are on a scale of 0–1 (0 being least and 1 being the most related). We could also use Euclidean distance or any other distance measure.

How to make it work in Python
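A minimal NumPy implementation (the original code was a screenshot; an equivalent from scipy would also work):

```python
from numpy.linalg import norm

def cosine_similarity(a, b):
    # Dot product of the vectors divided by the product of their norms.
    return np.dot(a, b) / (norm(a) * norm(b))
```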

Let’s now run through our data set with 0.65 as the minimum cosine value and see what images we get. I selected a random image, ran it against the whole data set, and plotted every image whose cosine value came out equal to or greater than 0.65.
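A sketch of that search loop; the query file name is a placeholder:

```python
import matplotlib.pyplot as plt

query = features["some_image.jpg"]  # hypothetical query image

# Plot every image whose similarity to the query is at least 0.65.
for fname, vec in features.items():
    if cosine_similarity(query, vec) >= 0.65:
        plt.imshow(image.load_img(os.path.join(IMG_DIR, fname)))
        plt.axis("off")
        plt.show()
```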

What we got!!

What we selected.
We got the query image back first because its cosine value came out as 1.0.

All of the images are of flying dogs; the bigger and more varied the data set, the better the results.

Further Optimization

One can work with a bigger data set, for example Flickr30k, try different cosine thresholds, try different pre-trained models like GoogLeNet or VGG-16, or maybe a different distance function.
