Building a Visual Similarity-based Recommendation System Using Python

A comprehensive guide to performing image-based recommendation from scratch

Published in Geek Culture · 6 min read · Apr 15, 2021

In today’s competitive world of technology, it is crucial for a growing e-commerce platform to engage its customers and maintain a consistent brand experience. Instead of making users perform search after search to find the items they want, recommending relevant items directly is more convenient and leaves them more satisfied.

Product recommendations can address such challenges very effectively by analyzing the customer’s previous purchasing behavior and current platform usage.

Product recommendations can help in:

Converting the shoppers to customers
Engaging the customers
Boosting sales and revenue
Delivering the most relevant content
Maintaining the brand experience

Broadly speaking there are two kinds of recommendation approaches:

Content-based recommendations
Collaborative filtering

As the name suggests, the content-based method recommends based on the additional content (metadata) about the customers or products. For products, this content may be product title, description, images, category/subcategory, specification, etc.

So, this approach recommends the products by finding the most similar products to a given product based on the content.

In this post, we will implement a content-based recommendation system by utilising the product images. Basically, the goal is to recommend product images that are very similar to a recently bought/checked product image.

Therefore, this image-based recommendation will be helpful in recommending the most similar products to the customers based on their recent shopping behaviour/platform usage.

Let’s start implementing this using the Fashion Product Images Dataset. The dataset contains 2906 product images across four different gender categories (men, women, boys, and girls). It also contains various product features like title, category, subcategory, colour, gender, type, usage, etc.

Note: the entire Python code is available in the Kaggle kernel here (https://www.kaggle.com/vikashrajluhaniwal/building-visual-similarity-based-recommendation).

1. Basic Data Analysis

The first few records of the dataset are shown below.

a. Basic statistics — Number of products, subcategories & gender

As mentioned earlier, the dataset contains 2906 products of 9 different subcategories across 4 different gender types.
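These counts can be reproduced with a few pandas calls. The snippet below is a minimal sketch on a toy dataframe; the column names (ProductId, SubCategory, Gender) are assumptions and may differ from the ones in the actual dataset.

```python
import pandas as pd

# Toy stand-in for the product metadata; column names are assumptions.
fashion_df = pd.DataFrame({
    "ProductId": [1, 2, 3, 4, 5],
    "Gender": ["Men", "Women", "Men", "Boys", "Girls"],
    "SubCategory": ["Topwear", "Shoes", "Shoes", "Topwear", "Dress"],
})

# Count distinct products, subcategories and gender types.
summary = {
    "products": fashion_df["ProductId"].nunique(),
    "subcategories": fashion_df["SubCategory"].nunique(),
    "genders": fashion_df["Gender"].nunique(),
}
print(summary)
```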

b. Frequency of each gender

From the output, we can observe that most of the products belong to the Men category, then Women, and so on.
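A frequency table like this is typically obtained with value_counts(). Here is a small sketch on made-up rows; the Gender column name is an assumption.

```python
import pandas as pd

# Made-up rows for illustration only; the real dataset has far more products.
genders = pd.Series(
    ["Men", "Men", "Men", "Women", "Women", "Boys"], name="Gender"
)

# value_counts() sorts the categories from most to least frequent.
gender_counts = genders.value_counts()
# normalize=True returns each category's share instead of the raw count.
gender_share = genders.value_counts(normalize=True)

print(gender_counts)
print(gender_share.round(2))
```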

c. Distribution of products gender-wise

The bar chart also shows that Men have the highest number of products. Overall, though, the dataset is almost balanced across the gender categories.

2. Data Preparation

Cross-category recommendations are not preferred (for example, recommending girls’ products to a bachelor), so let’s subset the data gender-wise into 4 different dataframes.
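One way to do this split is a groupby over the gender column. The sketch below uses a toy dataframe and assumed column names (ProductId, Gender); it is an illustration, not the exact code from the kernel.

```python
import pandas as pd

# Toy stand-in for the product metadata; column names are assumptions.
fashion_df = pd.DataFrame({
    "ProductId": [1, 2, 3, 4, 5],
    "Gender": ["Men", "Women", "Men", "Boys", "Girls"],
})

# One dataframe per gender, each with a fresh 0-based index.
frames = {
    gender: group.reset_index(drop=True)
    for gender, group in fashion_df.groupby("Gender")
}

men_df = frames["Men"]
print(men_df)
```

Keeping the subsets in a dictionary avoids four near-identical variables and makes it easy to loop over the genders later.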

3. Feature Extraction using ResNet

Generally, the product image contains a unique pattern along with its colour, shape, and edges.

Images that share such features are likely to be similar. Therefore, extracting these features from the images is very helpful for recommending the most similar products.

How to extract features from the images?

Computer vision techniques can be used to extract features from the images. Since we are limited in data size, compute resources, and time, let’s use a standard pre-trained model like ResNet to extract the features. Such models have already been trained on a huge dataset (like ImageNet), so the features they have learned can be reused on our images without training from scratch. This reuse is known as transfer learning.

ResNet

ResNet is short for Residual Network, first proposed by Kaiming He and colleagues in 2015. Today it is regarded as a classic neural network for many computer vision tasks. In the 2015 ImageNet Challenge, it outperformed earlier models like GoogLeNet, VGGNet, and AlexNet.

The architecture makes it possible to train an extremely deep network, with up to 152 layers, successfully. In our implementation, we will use ResNet50 (a smaller version of ResNet152) to extract the features.

The extract_features() function extracts the features from the given images. As ResNet expects, we first resize each image to 224 x 224 and normalize it using the ImageDataGenerator available in Keras. Finally, each image is represented as a 100352-dimensional feature vector, which matches the 7 x 7 x 2048 output of ResNet50's last convolutional block once it is flattened.
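The feature extraction step can be sketched roughly as follows. This is not the kernel's exact code: it assumes TensorFlow's bundled Keras, uses preprocess_input in place of ImageDataGenerator for normalization, and the helper names build_extractor() and load_batch() are hypothetical. TensorFlow is imported lazily so the flattening logic can be tried without it.

```python
import numpy as np

IMG_SIZE = (224, 224)  # input size used in this post


def build_extractor():
    """Return ResNet50 without its classification head (downloads ImageNet weights)."""
    from tensorflow.keras.applications import ResNet50  # heavy import, done lazily
    return ResNet50(include_top=False, weights="imagenet",
                    input_shape=IMG_SIZE + (3,))


def load_batch(paths):
    """Load image files, resize them to 224 x 224 and apply ResNet50 preprocessing."""
    from tensorflow.keras.applications.resnet50 import preprocess_input
    from tensorflow.keras.utils import img_to_array, load_img
    images = [img_to_array(load_img(p, target_size=IMG_SIZE)) for p in paths]
    return preprocess_input(np.stack(images))


def extract_features(model, batch):
    """Run a batch through the model and flatten each feature map into one vector."""
    feature_maps = np.asarray(model.predict(batch))     # (n, 7, 7, 2048) for ResNet50
    return feature_maps.reshape(len(feature_maps), -1)  # (n, 100352)
```

With TensorFlow installed, a call would look like extract_features(build_extractor(), load_batch(paths)), where paths is a list of image file paths.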

To avoid run-time feature extraction after deployment, the extracted features are persisted in NumPy arrays. We maintain two arrays here for product Ids and extracted features respectively.
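Persisting and reloading the two arrays can be done with np.save() and np.load(). The sketch below uses random stand-in features, a temporary directory, and hypothetical file names; apart from 13683, the product ids are made up.

```python
import os
import tempfile

import numpy as np

# Stand-ins for real data: three product ids and small random feature vectors.
product_ids = np.array([13683, 101, 102])
features = np.random.default_rng(0).random((3, 8), dtype=np.float32)

with tempfile.TemporaryDirectory() as out_dir:
    ids_path = os.path.join(out_dir, "men_product_ids.npy")    # hypothetical name
    features_path = os.path.join(out_dir, "men_features.npy")  # hypothetical name

    # Row i of the features array belongs to product_ids[i].
    np.save(ids_path, product_ids)
    np.save(features_path, features)

    # Later (for example at app start-up) the arrays are loaded back as-is.
    loaded_ids = np.load(ids_path)
    loaded_features = np.load(features_path)

print(loaded_ids.shape, loaded_features.shape)
```

The only contract between the two files is the row order, so both arrays should always be written together.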

The same feature extraction process is then repeated for the product images of each of the other genders.

4. Computing the Euclidean distance and recommending similar products

Distance is a widely used measure for assessing similarity among items/records: the smaller the distance, the higher the similarity, and the larger the distance, the lower the similarity.

There are various distance measures, such as Euclidean distance, Cosine distance, Manhattan distance, etc. We will use Euclidean distance here to compute similarity.
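To make the three measures concrete, here they are computed by hand with NumPy on two small vectors chosen purely for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

# Euclidean: straight-line distance, sqrt(3^2 + 4^2 + 0^2).
euclidean = np.sqrt(np.sum((a - b) ** 2))
# Manhattan: sum of absolute coordinate differences, 3 + 4 + 0.
manhattan = np.sum(np.abs(a - b))
# Cosine distance: 1 minus the cosine of the angle between the vectors.
cosine = 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(euclidean, manhattan, cosine)
```

Euclidean and Manhattan distance depend on the magnitude of the vectors, while cosine distance only looks at their direction.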

Since we have already extracted the image features, the Euclidean distance can be easily computed using the pairwise_distances() function from sklearn.metrics.

Once this distance is computed, we can easily recommend the products as per the ascending order of distance. Let’s do this!
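As a small illustration of these two steps, the sketch below computes distances with pairwise_distances() and sorts them with argsort(). The 2-dimensional feature vectors are toy values, not real image features.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

# Toy 2-D "features" for four products.
features = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0], [6.0, 8.0]])
query = features[0].reshape(1, -1)  # pairwise_distances expects 2-D inputs

# Distance from every product to the query product, as a flat array.
distances = pairwise_distances(features, query, metric="euclidean").ravel()

# Ascending order of distance: most similar products come first.
order = np.argsort(distances)
print(order, distances[order])
```

The query product itself sits at distance 0, so it is always first in the ordering and has to be skipped when recommending.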

a. Loading the extracted features

b. Distance computation and Recommendation

The above get_similar_products_cnn() function recommends the most similar products to the queried product based on the extracted features. The function accepts two arguments: the product id of the recently bought/checked item and the number of products to be recommended.
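A function of this kind could look roughly like the sketch below. It is not the original get_similar_products_cnn(): the name get_similar_products() is hypothetical, and the id and feature arrays are passed in explicitly as an assumption so that the example is self-contained. The features and all ids except 13683 are toy values.

```python
import numpy as np
from sklearn.metrics import pairwise_distances


def get_similar_products(product_id, num_results, product_ids, features):
    """Return (product_id, distance) pairs for the closest products."""
    # Row of the queried product; raises IndexError if the id is unknown.
    query_idx = int(np.where(product_ids == product_id)[0][0])
    distances = pairwise_distances(
        features, features[query_idx].reshape(1, -1), metric="euclidean"
    ).ravel()
    order = np.argsort(distances)          # ascending distance
    order = order[order != query_idx]      # never recommend the query itself
    return [(int(product_ids[i]), float(distances[i])) for i in order[:num_results]]


# Toy data: four products with 2-D features.
toy_ids = np.array([13683, 101, 102, 103])
toy_features = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0], [6.0, 8.0]])

recommendations = get_similar_products(13683, 2, toy_ids, toy_features)
print(recommendations)
```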

The top 5 recommended products for the product id 13683 are shown below.

Likewise, we can recommend products for items from the other gender types. Let’s see the final deployment using Streamlit.

5. Deploying the Solution

Streamlit is a Python library/framework for building interactive data apps and web applications and for deploying machine learning workloads. Most importantly, it does not require any prior knowledge of web design or development; knowing Python is sufficient.

Here, the below built-in functions are used to make an interactive deployment:

st.text_input(): takes dynamic input from the user
st.write(): writes messages/arguments to the app
st.image(): displays an image or list of images

As before, the get_similar_products_cnn() function recommends the most similar products as per the arguments specified.
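A deployment script along these lines might be structured as in the sketch below. It is an assumption-laden illustration rather than the actual script: render_app(), recommend and image_path_for are hypothetical names, and the Streamlit module is passed in as a parameter so the function can be exercised without Streamlit installed. It only uses st.title(), st.text_input(), st.write() and st.image().

```python
def render_app(st, recommend, image_path_for, num_results=5):
    """Draw the page: ask for a product id and show the recommended images.

    st             -- the imported streamlit module
    recommend      -- callable (product_id, num_results) -> [(product_id, distance), ...]
    image_path_for -- callable product_id -> image path or URL
    """
    st.title("Visual similarity-based recommendation")
    raw_id = st.text_input("Enter a product id")
    if not raw_id:
        return  # nothing typed yet
    if not raw_id.strip().isdigit():
        st.write("Please enter a numeric product id.")
        return

    results = recommend(int(raw_id), num_results)
    st.write(f"Top {len(results)} products similar to {raw_id.strip()}:")
    # st.image accepts a list of images with a matching list of captions.
    st.image(
        [image_path_for(pid) for pid, _ in results],
        caption=[f"{pid} (distance {dist:.2f})" for pid, dist in results],
    )


# In the deployment script itself one would then write, for example:
#     import streamlit as st
#     render_app(st, recommend=my_recommender, image_path_for=my_path_lookup)
```

Streamlit reruns the whole script on every interaction, so the early returns simply leave the page empty until a valid id has been typed.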

To execute this deployment script, type the following in the terminal:

streamlit run recom_deployment.py

Tip: This complete deployment code can be downloaded from here.

End Notes

In this post, we discussed product recommendations and implemented a visual similarity-based recommendation system that uses ResNet to extract features from the available product images.