5 Must-read Papers on Product Categorization for Data Scientists

Limarc Ambalina
Analytics Vidhya
Published in
4 min readAug 14, 2020

--

Product categorization/product classification is the organization of products into their respective departments or categories. As well, a large part of the process is the design of the product taxonomy as a whole.

Product categorization was initially a text classification task that analyzed the product’s title to choose the appropriate category. However, numerous methods have been developed which take into account the product title, description, images, and other available metadata. The following papers on product categorization represent essential reading in the field and offer novel approaches to product classification tasks.

1. Don’t Classify, Translate

In this paper, researchers from the National University of Singapore and the Rakuten Institute of Technology propose and explain a novel machine translation approach to product categorization. The experiment uses the Rakuten Data Challenge and Rakuten Ichiba datasets. Their method translates or converts a product’s description into a sequence of tokens which represent a root-to-leaf path to the correct category. Using this method, they are also able to propose meaningful new paths in the taxonomy.

The researchers state that their method outperforms many of the existing classification…

--

--

Limarc Ambalina
Analytics Vidhya

Owner of Jpbound.com, VP of Growth at Hackernoon.com | Specializing in AI, tech, VR, and pop culture.