How to Create an Image Recommendation System

Bernardo Caldas
Published in Analytics Vidhya
4 min read · Dec 6, 2020

A recommender system, or a recommendation system (sometimes replacing ‘system’ with a synonym such as platform or engine), is a subclass of information filtering system that seeks to predict the “rating” or “preference” a user would give to an item.[1][2] They are primarily used in commercial applications.

Recommender systems are utilized in a variety of areas and are most commonly recognized as playlist generators for video and music services, product recommenders for online stores, or content recommenders for social media platforms and open web content recommenders.[3][4] These systems can operate using a single input, like music, or multiple inputs within and across platforms like news, books, and search queries. There are also popular recommender systems for specific topics like restaurants and online dating. Recommender systems have also been developed to explore research articles and experts,[5] collaborators,[6] and financial services.

(Source: https://en.wikipedia.org/wiki/Recommender_system)

Collaborative filtering

A key advantage of the collaborative filtering approach is that it does not rely on machine analyzable content and therefore it is capable of accurately recommending complex items such as movies without requiring an “understanding” of the item itself. Many algorithms have been used in measuring user similarity or item similarity in recommender systems. For example, the k-nearest neighbor (k-NN) approach[38] and the Pearson Correlation as first implemented by Allen.[39]
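To make the user-similarity idea concrete, here is a minimal sketch that ranks users by the Pearson correlation of their ratings and picks the nearest neighbor. The users, the ratings, and the helper names `pearson` and `nearest_neighbors` are all made up for illustration; they do not come from the project described in this article.

```python
import numpy as np

# Hypothetical ratings (1-5) that three users gave to the same five movies
ratings = {
    "alice": [5, 4, 1, 2, 5],
    "bob":   [4, 5, 2, 1, 4],
    "carol": [1, 2, 5, 4, 1],
}

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length rating lists."""
    return float(np.corrcoef(x, y)[0, 1])

def nearest_neighbors(user, k=1):
    """Return the k users whose ratings correlate most strongly with `user`."""
    others = [u for u in ratings if u != user]
    others.sort(key=lambda u: pearson(ratings[user], ratings[u]), reverse=True)
    return others[:k]

print(nearest_neighbors("alice"))  # ['bob']
```

Note that nothing about the movies themselves is inspected: only the rating patterns are compared, which is the point made in the paragraph above.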

When building a model from a user’s behavior, a distinction is often made between explicit and implicit forms of data collection.

Examples of explicit data collection include the following:

  • Asking a user to rate an item on a sliding scale.
  • Asking a user to search.
  • Asking a user to rank a collection of items from favorite to least favorite.
  • Presenting two items to a user and asking him/her to choose the better one of them.
  • Asking a user to create a list of items that he/she likes (see Rocchio classification or other similar techniques).

Cosine similarity
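As a quick refresher, the cosine similarity of two vectors is their dot product divided by the product of their norms, so it compares direction rather than magnitude. Below is a minimal NumPy sketch; the helper name `cosine_sim` and the toy vectors are illustrative only and are not part of the project code.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between vectors a and b (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors, chosen only for illustration
u = [1.0, 2.0, 3.0]
v = [2.0, 4.0, 6.0]   # same direction as u, twice the length
w = [3.0, 0.0, -1.0]  # orthogonal to u: 1*3 + 2*0 + 3*(-1) = 0

print(cosine_sim(u, v))  # very close to 1.0
print(cosine_sim(u, w))  # 0.0
```

Later in the article the same measure is applied to image feature vectors instead of toy vectors.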

Using Keras and the VGG16 CNN, we are going to develop an algorithm that recommends visually similar products:

# imports

from keras.applications import vgg16
from keras.preprocessing.image import load_img,img_to_array
from keras.models import Model
from keras.applications.imagenet_utils import preprocess_input

from PIL import Image
import os
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd

# parameters setup

imgs_path = "../input/style/"
imgs_model_width, imgs_model_height = 224, 224

nb_closest_images = 5 # number of most similar images to retrieve

1. Load the VGG pre-trained model from Keras

The Keras applications module provides several pre-trained models that can be loaded very easily.

For our recommender system based on visual similarity, we need to load a Convolutional Neural Network (CNN) that can interpret the image contents.

In this example we will load the VGG16 model trained on ImageNet, a large database of labeled images.

If we take the whole model, the output is a vector of class probabilities, but that is not what we want.

We want to retrieve all the information that the model was able to extract from the images.

To do so, we remove the final classification layer of the CNN, which is only used for class prediction, and keep the output of the fc2 layer (a 4096-dimensional vector per image) as the image features.

# collect the paths of the JPEG images
files = [imgs_path + x for x in os.listdir(imgs_path) if "jpg" in x]
print("number of images:", len(files))

Example of the file paths used in the project.

# load the model
vgg_model = vgg16.VGG16(weights='imagenet')

# remove the last layers in order to get features instead of predictions
feat_extractor = Model(inputs=vgg_model.input, outputs=vgg_model.get_layer("fc2").output)

# print the layers of the CNN
feat_extractor.summary()
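The similarity step that follows uses an array named `imgs_features`, but the article does not show how it is built. One plausible way to produce it is sketched below, under stated assumptions: the helper `extract_features` is hypothetical, and it takes the model and the preprocessing function as parameters so that the sketch does not depend on a particular network.

```python
import numpy as np

def extract_features(images, extractor, preprocess=None):
    """Turn a list of (height, width, 3) image arrays into one feature row per image.

    `extractor` is any object with a Keras-style predict(batch) method and
    `preprocess` is an optional function applied to the batch first.
    Both names are assumptions made for this sketch.
    """
    # stack the images into a single batch of shape (n_images, height, width, 3)
    batch = np.stack([np.asarray(img, dtype="float32") for img in images], axis=0)
    if preprocess is not None:
        # copy first, because some preprocessing functions modify their input in place
        batch = preprocess(batch.copy())
    return np.asarray(extractor.predict(batch))

# Assumed wiring with the names used in this article (needs Keras and the image files):
# images = [img_to_array(load_img(f, target_size=(imgs_model_height, imgs_model_width)))
#           for f in files]
# imgs_features = extract_features(images, feat_extractor, preprocess_input)
# imgs_features.shape would then be (len(files), 4096) for the fc2 layer of VGG16
```

Keras expects `target_size` as (height, width); since both values are 224 here, the order used elsewhere in the article makes no practical difference.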

# compute cosine similarities between images
# (imgs_features is assumed to hold one feature vector per image in files)
cosSimilarities = cosine_similarity(imgs_features)

# store the results into a pandas dataframe
cos_similarities_df = pd.DataFrame(cosSimilarities, columns=files, index=files)
cos_similarities_df.head()

def retrieve_most_similar_products(given_img):
    print("--------------------------------------------------------")
    print("original product:")
    original = load_img(given_img, target_size=(imgs_model_width, imgs_model_height))
    plt.imshow(original)
    plt.show()

    print("-------------------------------------------------------")
    print("most similar products:")
    # sort by similarity and skip position 0, which is normally the image itself
    sorted_scores = cos_similarities_df[given_img].sort_values(ascending=False)
    closest_imgs_scores = sorted_scores.iloc[1:nb_closest_images+1]
    closest_imgs = closest_imgs_scores.index

    for i in range(len(closest_imgs)):
        similar = load_img(closest_imgs[i], target_size=(imgs_model_width, imgs_model_height))
        plt.imshow(similar)
        plt.show()
        # positional access, because the Series is indexed by file path
        print("similarity score : ", closest_imgs_scores.iloc[i])
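The retrieval logic inside `retrieve_most_similar_products` can also be sketched as a small function that returns the neighbors instead of plotting them, which makes it easier to check. This is a hedged variant, not the article's code: the name `most_similar`, the toy file names, and the scores are all made up for illustration.

```python
import pandas as pd

def most_similar(similarities_df, item, k=5):
    """Return the k items most similar to `item` as a Series of scores."""
    scores = similarities_df[item].drop(labels=item)  # exclude the item itself
    return scores.sort_values(ascending=False).head(k)

# Hypothetical 3x3 similarity matrix; file names and scores are invented
names = ["a.jpg", "b.jpg", "c.jpg"]
toy_df = pd.DataFrame(
    [[1.0, 0.9, 0.2],
     [0.9, 1.0, 0.4],
     [0.2, 0.4, 1.0]],
    index=names, columns=names,
)

top = most_similar(toy_df, "a.jpg", k=2)
print(list(top.index))  # ['b.jpg', 'c.jpg']
```

Dropping the query item by label, rather than skipping the first sorted position, avoids relying on the item always ranking first against itself.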
Results
