Cosine Similarity between products to recommend similar products

Satish Mistry
Published in Analytics Vidhya
3 min read · Jul 7, 2020

Identifying the most similar products with cosine similarity to power product recommendations on your e-commerce website. It's also known as item-to-item similarity.

I want to introduce a very simple, yet powerful, technique for recommendations: cosine similarity.

Product on the left and its similar products on the right (% indicates the cosine similarity between the leftmost product and each individual product)

Step 1: Identifying features for each product item.

To start, identify the features for each product item.

Features are a few important attributes that identify a product individually and help you categorize it intuitively.

I have taken cosmetic products as an example, categorizing them and recommending other cosmetic products on the basis of their shared attributes. Feature selection depends on which data points are available to you and how much they vary across products. Let's identify a few features first.

Example of feature selection for cosmetic products.

Step 2: Assign labels for each product item.

We will categorize products based on their attributes and data points available to us.

Example of assigning labels to each product in the catalog
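
The labelling step can be sketched in code as turning each product's labels into a 0/1 vector. This is only a minimal illustration: the product names, feature names, and label values below are hypothetical and are not taken from the catalog shown in this article.

```python
# Hypothetical catalog: each product maps feature -> label (illustrative only).
catalog = {
    "item_1": {"category": "lipstick", "finish": "matte", "price_band": "mid"},
    "item_2": {"category": "lipstick", "finish": "glossy", "price_band": "mid"},
    "item_3": {"category": "foundation", "finish": "matte", "price_band": "high"},
}

def build_vocabulary(catalog):
    """Collect every distinct (feature, label) pair in a stable, sorted order."""
    pairs = {(f, v) for labels in catalog.values() for f, v in labels.items()}
    return sorted(pairs)

def encode(labels, vocabulary):
    """Return a 0/1 vector: 1 where the product carries that (feature, label) pair."""
    return [1 if labels.get(f) == v else 0 for f, v in vocabulary]

vocabulary = build_vocabulary(catalog)
vectors = {name: encode(labels, vocabulary) for name, labels in catalog.items()}

for name, vector in vectors.items():
    print(name, vector)
```

Every product ends up with a vector of the same length, which is what the similarity calculation in the next step needs.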

Step 3: Finding similarities of each product item based on the features.

We will use standard cosine similarity, a commonly used approach for matching similar products. When the vectors hold 0/1 feature indicators, the score grows with the number of attributes two products have in common, normalized by how many attributes each product has.

Mathematically, it measures the cosine of the angle between two vectors in a multi-dimensional space. In this context, the two vectors are arrays containing the feature counts of two products.

Cosine Similarity Formula
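
For reference, the standard textbook definition of cosine similarity can be written as follows. The symbols are a notational assumption here: A and B stand for the feature vectors of two products, and n for the number of features.

```latex
\[
\cos(\theta)
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}} \; \sqrt{\sum_{i=1}^{n} B_i^{2}}}
\]
```

For vectors with non-negative entries, such as feature counts, the value lies between 0 (no features in common) and 1 (same direction).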

Let's do the calculation for Product Item 1 & Product Item 2.

Calculating Product Item 1 & Item 2 Cosine Similarity
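
The same kind of calculation can be sketched in a few lines of Python. The two vectors below are hypothetical 0/1 feature vectors chosen for illustration; they are not the feature values of Product Item 1 and Product Item 2 from this article, so the score they produce is different from the one discussed here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: an all-zero vector matches nothing
    return dot / (norm_a * norm_b)

# Hypothetical 0/1 feature vectors (not the article's products).
item_a = [0, 1, 0, 1, 0, 1]
item_b = [0, 1, 1, 0, 0, 1]

score = cosine_similarity(item_a, item_b)
print(f"{score:.0%}")  # two shared features out of three each -> 67%
```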

Now we know that the similarity between the two products is 85%. In the same way, for each product item you can repeat the calculation against every other product item in your catalog:

item 1-to-item 2,
item 1-to-item 3,
item 1-to-item 4,
...
item 1-to-item n,

and then item 2-to-item 1, item 2-to-item 3, and so on.
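
The iteration over all pairs could look roughly like the sketch below. The `vectors` dictionary is a hypothetical three-product catalog of 0/1 feature vectors, and `cosine_similarity` is a plain-Python helper written for this example rather than a library function.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical feature vectors, one per product (illustrative only).
vectors = {
    "item_1": [0, 1, 0, 1, 0, 1],
    "item_2": [0, 1, 1, 0, 0, 1],
    "item_3": [1, 0, 0, 1, 1, 0],
}

# similarity[x][y] holds the score between product x and product y.
similarity = {name: {} for name in vectors}
for left, right in combinations(vectors, 2):
    score = cosine_similarity(vectors[left], vectors[right])
    similarity[left][right] = score
    similarity[right][left] = score  # cosine similarity is symmetric

for name, scores in similarity.items():
    print(name, {other: round(s, 2) for other, s in scores.items()})
```

Because cosine similarity is symmetric, item 2-to-item 1 gives the same score as item 1-to-item 2, so the sketch computes each pair once and stores it in both directions.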

Once cosine similarity has been computed for every pair, you will have a number of similar products to recommend to shoppers based on the product they are currently viewing.
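
As a rough sketch of that last step, the precomputed scores can be sorted to pick the top few recommendations for the product being viewed. The `recommend` function and its `top_n` parameter are hypothetical names for this example; the 0.85 score for item 1 and item 2 echoes the result above, and all other scores are invented for illustration.

```python
def recommend(viewed, similarity, top_n=3):
    """Return up to top_n (product, score) pairs most similar to `viewed`."""
    scores = similarity[viewed]
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]

# Hypothetical precomputed scores: similarity[x][y] is the score between x and y.
similarity = {
    "item_1": {"item_2": 0.85, "item_3": 0.40, "item_4": 0.62},
    "item_2": {"item_1": 0.85, "item_3": 0.55, "item_4": 0.30},
    "item_3": {"item_1": 0.40, "item_2": 0.55, "item_4": 0.71},
    "item_4": {"item_1": 0.62, "item_2": 0.30, "item_3": 0.71},
}

# Products to show next to item_1, best match first.
print(recommend("item_1", similarity, top_n=2))
```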

Please note: The data and product pictures I’ve used are for learning purposes, with no intention to advertise or promote. They are intended to enhance public access to information about cosine similarity and its use in recommendation systems.

Satish Mistry
Analytics Vidhya

Expert in data wrangling, data transformation, data analysis, data visualization, data science, and machine learning techniques. All things DATA!