How do e-commerce websites show products similar to the ones you search for?

Krishna Dheeraj
2 min read · Feb 7, 2018

Let us begin with a familiar online shopping scenario. We are searching for a MacBook Air on Amazon.com. When we scroll to the end of the product page, we see similar products such as the MacBook, the MacBook Pro, and other Apple accessories. How does Amazon.com suggest similar products in the same category, just as a salesperson in a physical store shows you similar items? This is machine learning at work: in this case, a recommender system, which plays a massive role in increasing the sales of e-commerce websites. What is the fuel behind these recommender systems?

E-commerce websites use content-based recommendations and collaborative filtering in their recommendation systems. Many recommendation systems use the K-NN (k-nearest neighbors) algorithm to suggest products to the customer. K-NN works on features extracted from a product's text and images. Text features can include the product name, description, brand, and so on.

Computers are more comfortable with numbers than with text, so the text is converted into an n-dimensional vector that ML algorithms can work with. Names like Apple and MacBook become vector components, and the system then looks for other products in the same category whose vectors are similar. In our laptop scenario, features such as the brand and product name place the products close together, so that similar products are recommended to the customer.

The idea behind K-NN rests on a simple concept: Euclidean distance. The algorithm calculates the Euclidean distance between these vectors and treats the items closest to one another as a group (cluster). The items with the smallest Euclidean distance to the product you are viewing are the ones recommended to you.
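To make the steps above concrete, here is a minimal sketch in plain Python. It is not how Amazon.com or any real store implements recommendations. The catalog, the product IDs, and the helper names (tokenize, build_vocabulary, vectorize, euclidean, nearest_products) are all hypothetical, and a simple word-count vector stands in for whatever text features a production system would use.

```python
import math
from collections import Counter

# Hypothetical toy catalog: product IDs and titles are invented for illustration.
CATALOG = {
    "macbook_air": "Apple MacBook Air 13 inch laptop",
    "macbook_pro": "Apple MacBook Pro 15 inch laptop",
    "thinkpad": "Lenovo ThinkPad 14 inch laptop",
    "running_shoes": "Nike running shoes for men",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_vocabulary(texts):
    """Collect every distinct word; each word becomes one vector dimension."""
    return sorted({token for text in texts for token in tokenize(text)})


def vectorize(text, vocabulary):
    """Turn text into an n-dimensional vector of word counts."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_products(query_id, catalog, k=2):
    """Return the IDs of the k products closest to the query product."""
    vocabulary = build_vocabulary(catalog.values())
    vectors = {pid: vectorize(text, vocabulary) for pid, text in catalog.items()}
    query_vector = vectors[query_id]
    # Brute-force search: measure the distance to every other product.
    distances = [
        (euclidean(query_vector, vector), pid)
        for pid, vector in vectors.items()
        if pid != query_id
    ]
    distances.sort()
    return [pid for _, pid in distances[:k]]


print(nearest_products("macbook_air", CATALOG, k=2))
# → ['macbook_pro', 'thinkpad']
```

The MacBook Pro comes first because its title shares the most words with the MacBook Air, so its vector is the closest one. The shoes share no words with it and end up farthest away. Note that nothing is trained here: the catalog is simply stored, and all the work happens when a query arrives.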

How does your product get classified into a cluster?

The K-NN algorithm has its own limitations. At prediction time it is not as fast as neural networks or Support Vector Machines (SVMs), which are also used for recommendations, because it has to compare each query against the stored items. It is also often less accurate than those approaches, but it has some nice practical features. Its major advantages are that it is easy to implement, requires no training phase, and produces results that are easy to understand. Even with those limitations it is widely used in industry: in practical scenarios, being easy to implement and understand often wins out over other approaches. In upcoming posts I will discuss the common mistakes ML practitioners make while implementing K-NN classification.