Product categorization AI / API

In this article, we introduce our product categorization tool and API, which classify products with high accuracy against the Shopify, Google Merchant, eBay and other taxonomies.

Here is an example screenshot of 1) categorizing the product “satin dress” and 2) determining its attribute values in the Shopify taxonomy.

Note how many attributes are automatically determined by our AI, along with the main category, which can be used as a breadcrumb and for filtering in your store.

Including these attributes in your online store boosts visits from search engines (especially for long-tail keywords) and improves the user experience through better search.

Our AI automatically determines which Shopify attributes are relevant for your product; a laptop, for example, has different attributes than a t-shirt.

Our AI categorizer predicts multiple relevant categories for a given product, along with its confidence in each prediction. Here is an example output for the product “camera mini uv filter professional lens”:

Our solution is available both as a dashboard and as an API. The latter lets you automate classification at scale: via our API we serve millions of categorizations per day for our clients.

You can also send your own list of categories and get classifications back against that list.

You can try it out for free at:

https://www.productcategorization.com

Our state-of-the-art classifier is used with great success by a large number of companies, ranging from unicorns and multinationals to online stores, ecommerce analytics companies, SaaS companies and individuals.

You can also send images and get back both classifications and a generated description:

Our API generates descriptions of images, which is ideal for many use cases, e.g. marketplaces with user submitted images.

The AI classifier is highly accurate. Here is recent feedback from one of its users: “I am surprised how accurate your product is. I am really shocked. I love it.” (David)

One of its useful features is explainability of the results: the classifier identifies the words that contributed most to the resulting categorization of a product.

In the screenshot below, you can find an example of a product description that the classifier has categorized as Laundry Appliances.

In addition to the categorization, the classifier also explains its result by colouring the words that contributed most to it: “washer”, “dryer”, “clothes”, “wash”, “combo”, “capacity”.

Another example of AI explainability by the classifier, but this time for categorization of website www.cnn.com:

“international”, “news”, “politics”, “world”, “health”, “times” are words that most contributed to cnn.com being classified as “News and Politics”.
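One way such word-level explanations can be computed is sketched below. It assumes a simple linear bag-of-words model, which is not necessarily the model behind our tool, and the weights are made-up illustrative values rather than learned ones. In a linear model, the contribution of a word to a class score is simply its count multiplied by its weight:

```python
from collections import Counter

# Hypothetical per-word weights for the class "Laundry Appliances"
# (illustrative values only, not taken from a real trained model).
CLASS_WEIGHTS = {
    "washer": 2.1,
    "dryer": 1.9,
    "clothes": 1.2,
    "wash": 1.0,
    "combo": 0.6,
    "capacity": 0.5,
}


def word_contributions(text, weights):
    """Return (word, contribution) pairs sorted by decreasing contribution.

    In a linear model, a word's contribution to the class score is its
    count in the text multiplied by the weight assigned to that word.
    """
    counts = Counter(text.lower().split())
    contributions = {
        word: count * weights[word]
        for word, count in counts.items()
        if word in weights
    }
    return sorted(contributions.items(), key=lambda item: item[1], reverse=True)


description = "washer dryer combo with large capacity to wash clothes"
top_words = word_contributions(description, CLASS_WEIGHTS)
print(top_words[:3])
```

The highest-ranked words are the ones a user interface would highlight. For non-linear models, other attribution techniques would be needed.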

Why the need for product categorization?

When we enter a store looking for a new item to purchase, it helps to see the various sections of the store marked with appropriate labels, e.g. Electronics, Household Appliances, etc.

We expect online stores to provide the same kind of categorization to their web visitors. This improves search, discoverability and filtering on the shop's website and ultimately leads to better conversion.

Product categorization is the task of classifying products as belonging to one or more categories from a given taxonomy.

Product categorization AI and website categorization taxonomies

There is no single way to categorize products or websites. Many ecommerce companies have their own set of rules for categorizing products. These rules or definitions are also known as taxonomies.

When it comes to website classification, a well-known standard is the IAB taxonomy. It is especially suitable for categorization in marketing. For example, an advertiser generally wants to advertise only on publisher websites from specific categories, so an industrial company would not want to place an ad on the website of a fashion magazine.

Note that there have been several revisions of the IAB classification over the last decade, and it is better to use the latest revision, from September 2021. Revisions are necessary because new verticals and categories constantly arise and become popular.

Google and Facebook product categorization AI taxonomies

If you are more interested in ecommerce product classification, two of the best taxonomies are those from Google and Facebook.

The Google product taxonomy has several tiers, and you can explore it in more detail at https://www.google.com/basepages/producttype/taxonomy.en-US.txt.

A selection of Google Taxonomy categories:

Of particular interest is building product categorization AI models for the Tier 3 level, because it is quite detailed, with more than 1,000 categories, including many micro niches such as “Bird Cage Food & Water Dishes”, “Baby & Toddler Outerwear”, “Snow Pants & Suits”, etc.

Why are the depth and the large number of categories important?

Because, as we will discuss later, the more categories you have, the better the discoverability and filtering on your website and the more you benefit from increased visits from search engines.

Another ecommerce product taxonomy is the one from Facebook: https://developers.facebook.com/docs/marketing-api/catalog/guides/product-categories/

A specific classification such as:

Apparel & Accessories > Clothing > Underwear & Socks > Shapewear

is also known as a taxonomy path.

A machine learning model can have as its objective either to predict a class at a given tier level or to predict the complete taxonomy path.

The latter was, for example, the objective of the Rakuten 2018 ecommerce challenge: https://sigir-ecom.github.io/ecom18DCPapers/ecom18DC_paper_13.pdf

In our company we have built machine learning models of both kinds: predicting taxonomy paths and predicting single tier-level categories.

What you want mostly depends on your specific use case.
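To make the difference between a full taxonomy path and a single tier-level category concrete, here is a small sketch. The helper names are hypothetical and only illustrate how a path string breaks down into tiers:

```python
def split_taxonomy_path(path, separator=">"):
    """Split a taxonomy path into its tiers, stripping surrounding spaces."""
    return [tier.strip() for tier in path.split(separator) if tier.strip()]


def category_at_tier(path, tier):
    """Return the category at a 1-based tier level, or None if the path is shorter."""
    tiers = split_taxonomy_path(path)
    return tiers[tier - 1] if 1 <= tier <= len(tiers) else None


path = "Apparel & Accessories > Clothing > Underwear & Socks > Shapewear"
print(split_taxonomy_path(path))
print(category_at_tier(path, 3))
```

A model that predicts the full path would output the whole string, while a Tier 3 model would only output “Underwear & Socks” for this product.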

Text and smart product categorization models

In practice, smart product categorization for online stores is usually performed automatically, using machine learning models from the family of text classification models.

There are many ML models available for text classification. Here is a list of possible models, from simple to more complex ones (the list is by no means exhaustive):

  • Naive Bayes classifier
  • K-Nearest Neighbors
  • Support Vector Machines (SVM)
  • Logistic regression
  • Decision Trees
  • Random Forests
  • Deep Neural Networks
  • Recurrent Neural Networks (RNN)
  • Convolutional Neural Networks (CNN)
  • Ensembles of neural nets

At our company, we have used all of the aforementioned models in the past: some as baselines for comparison, others for testing, and others deployed in production.

Which ML model is best suited often depends on the problem. For example, an SVM works well on smaller data sets, but because its training time increases rapidly with the size of the data set, it is not the best choice for problems with very large training sets.
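As an illustration of the simplest model in the list above, here is a minimal sketch of a multinomial Naive Bayes text classifier in plain Python. The class name and the tiny training set are made up for this example, and a production system would normally rely on an established ML library and far more data:

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.vocabulary.update(words)
        return self

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior of the class plus the log likelihood of each word.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Tiny made-up training set, for illustration only.
train_texts = [
    "cotton t shirt short sleeve",
    "slim fit denim jeans",
    "wireless bluetooth headphones",
    "usb c charging cable",
]
train_labels = ["Apparel", "Apparel", "Electronics", "Electronics"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(model.predict("long sleeve cotton shirt"))
print(model.predict("bluetooth speaker with usb cable"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the add-one smoothing keeps unseen words from zeroing out a class.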

Vectorization of texts for product classifier

An important part of a text classification task is the pre-processing of texts and their conversion into a numerical form that the product classifier can work with.

Pre-processing often includes, but is not limited to, the following steps:

  • removal of stop words
  • lowercasing
  • spelling corrections
  • stemming
  • lemmatization
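Some of the steps above can be sketched in a few lines of plain Python. The stop-word list and the suffix-stripping function below are deliberately crude stand-ins chosen for this example; real pipelines typically use a full stop-word list and a proper stemmer or lemmatizer, and spelling correction is left out here:

```python
import re

# A deliberately tiny stop-word list; real projects use a fuller one.
STOP_WORDS = {"a", "an", "the", "and", "or", "for", "with", "of", "in", "to"}


def simple_stem(word):
    """Crude suffix stripping, standing in for a real stemming algorithm."""
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def preprocess(text):
    """Lowercase, tokenize, drop stop words, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [simple_stem(token) for token in tokens if token not in STOP_WORDS]


print(preprocess("Running Shoes for Men with Breathable Mesh"))
```

Note that stemming can produce non-words such as “runn”; this is usually acceptable because the same stem is produced consistently at training and prediction time.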

Once the text is pre-processed, there are several methods to convert it into a numerical format.

TF-IDF is often used and works well in many text classification problems.
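For readers unfamiliar with TF-IDF, here is a minimal sketch in plain Python. It uses one common textbook variant (term frequency as count divided by document length, inverse document frequency as log(N / document frequency)); libraries often apply additional smoothing and normalization, so their numbers will differ:

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF weights for a list of tokenized documents.

    Uses tf = count / document length and idf = log(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        # Count each word once per document.
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        vectors.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return vectors


docs = [
    "red cotton shirt".split(),
    "blue cotton shirt".split(),
    "red leather wallet".split(),
]
vectors = tf_idf(docs)
for vector in vectors:
    print(vector)
```

Words that appear in only one document, such as “wallet”, receive a higher weight than words shared across documents, such as “red”, which is what makes them useful for telling categories apart.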

Where semantic meaning plays an important role, it is useful to explore embedding techniques, including Word2Vec, GloVe, ELMo and other methods.

A useful library for converting text to vectors is fastText, especially when dealing with non-English languages. It provides pre-trained models for over 150 languages: https://github.com/facebookresearch/fastText/blob/master/docs/crawl-vectors.md

FastText differs from other word-level embeddings like Word2Vec in that it also operates at the character level, using character n-grams.

For example, for the word “that”, the character 2-grams (with < and > marking the word boundaries) would be:

  • <t
  • th
  • ha
  • at
  • t>

FastText can thus also be useful when dealing with rare words. We have also used it with great success in developing text classification platforms for social media data, where words are more often misspelled, shortened or modified.
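The character n-grams from the “that” example can be generated with a few lines of Python. This is only an illustration of the idea, not fastText's own implementation, and the function name is made up for this sketch:

```python
def char_ngrams(word, n):
    """Return the character n-grams of a word wrapped in < and > boundary markers."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n] for i in range(len(wrapped) - n + 1)]


print(char_ngrams("that", 2))  # ['<t', 'th', 'ha', 'at', 't>']
print(char_ngrams("that", 3))  # ['<th', 'tha', 'hat', 'at>']
```

Because a misspelled or rare word still shares most of its n-grams with related words, its vector can be built from the n-gram vectors even if the word itself was never seen in training.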

What are the benefits of using ecommerce product categorization or product tagging?

One of the key benefits of automatic ecommerce product categorization in your online shop is that your customers can find relevant products more easily, especially if you sell products in several different verticals.

Another very important benefit is that its implementation:

  • allows you to generate more webpages that are indexed and available on search engines (with correspondingly more opportunities for users to find your website via search)
  • leads to higher rankings due to more relevant keywords on your webpages

Both mean more free visits from organic rankings of your webpages on search engines.

How can software product categorization and product tagging improve your rankings on search engines?

If you want a given webpage to rank higher for a given keyword, e.g. “bead necklace”, then the page should contain words that are semantically and topically related to that keyword.

This is where software product categorization and tagging can help: you can add Tier 2, 3 and 4 level categories to each of your product webpages to make them more relevant for google.com and other search engines. For “bead necklace”, for example, you could add: Jewellery, Necklaces / Jewellery Sets.

You can also go a step further and add not only categories but also highly relevant tags to each of your product pages.

For this purpose, we have built an ML-based product tagging solution that automatically produces highly relevant tags from product names.

For “bead necklace” it produces the following ideas for tags:

necklace, jewelry, beads, bracelet, handmade, beaded, necklaces, jewellery, pendant, necklace set, handmade jewelry, chain, fashion jewelry, long necklace, handcrafted jewelry

What is the product category for a given item?

Here is another set of tag ideas, this time for the phrase “sewing pattern”, produced with our product tagging tool (“value” denotes the relevance):

{
"language": "en",
"classification": [
{
"category": "pattern",
"value": 0.42340898513793945
},
{
"category": "sewing",
"value": 0.18840937316417694
},
{
"category": "sew",
"value": 0.10562954097986221
},
{
"category": "cotton",
"value": 0.07833655178546906
},
{
"category": "quilting",
"value": 0.05662618204951286
},
{
"category": "quilt",
"value": 0.05658867955207825
},
{
"category": "vintage",
"value": 0.054514266550540924
},
{
"category": "embroidery",
"value": 0.05450379103422165
},
{
"category": "handmade",
"value": 0.052232421934604645
},
{
"category": "crochet",
"value": 0.04206467792391777
},
{
"category": "applique",
"value": 0.03805079311132431
},
{
"category": "weaving",
"value": 0.03781686723232269
},
{
"category": "wool",
"value": 0.03761012479662895
},
{
"category": "knitting",
"value": 0.03648590296506882
},
{
"category": "fabric",
"value": 0.0340331606566906
},
{
"category": "cross stitch",
"value": 0.031439878046512604
},
{
"category": "thread",
"value": 0.027697991579771042
},
{
"category": "patterns",
"value": 0.01848462037742138
},
{
"category": "rug",
"value": 0.0134653989225626
},
{
"category": "yarn",
"value": 0.012413100339472294
},
{
"category": "cotton fabric",
"value": 0.012060941196978092
},
{
"category": "textile",
"value": 0.012012960389256477
},
{
"category": "textiles",
"value": 0.010599330067634583
},
{
"category": "stitch",
"value": 0.009857836179435253
},
{
"category": "accessories",
"value": 0.00966225378215313
},
{
"category": "stitched",
"value": 0.009236618876457214
},
{
"category": "batik",
"value": 0.009234720841050148
},
{
"category": "winter",
"value": 0.009045564569532871
},
{
"category": "upholstery",
"value": 0.009000061079859734
},
{
"category": "download",
"value": 0.008843736723065376
},
{
"category": "patterned",
"value": 0.008658943697810173
},
{
"category": "patchwork",
"value": 0.00858057290315628
},
{
"category": "neutrals",
"value": 0.008539760485291481
},
{
"category": "sofa",
"value": 0.007974304258823395
},
{
"category": "handwoven",
"value": 0.007709966041147709
},
{
"category": "antique",
"value": 0.0076023912988603115
},
{
"category": "machine washable",
"value": 0.00661444291472435
},
{
"category": "fall",
"value": 0.006400591693818569
},
{
"category": "summer",
"value": 0.006314214318990707
},
{
"category": "girls",
"value": 0.006126525811851025
},
{
"category": "shawl",
"value": 0.006045812275260687
},
{
"category": "baby",
"value": 0.005956804845482111
},
{
"category": "girl",
"value": 0.005812007002532482
},
{
"category": "colours",
"value": 0.005611352622509003
},
{
"category": "india",
"value": 0.005135116167366505
},
{
"category": "geometric",
"value": 0.005095324013382196
}
]
}
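A response like the one above usually contains more tags than you want to show on a product page. Here is a small sketch of selecting the most relevant ones; the threshold of 0.1, the limit of 10 tags and the function name are arbitrary choices for this example, not recommendations from our API:

```python
import json

# Shortened copy of the response above (first four tags only).
response_text = """
{
  "language": "en",
  "classification": [
    {"category": "pattern", "value": 0.42340898513793945},
    {"category": "sewing", "value": 0.18840937316417694},
    {"category": "sew", "value": 0.10562954097986221},
    {"category": "cotton", "value": 0.07833655178546906}
  ]
}
"""


def select_tags(response, min_value=0.1, max_tags=10):
    """Keep tags whose relevance is at least min_value, most relevant first."""
    items = sorted(
        response["classification"],
        key=lambda item: item["value"],
        reverse=True,
    )
    return [item["category"] for item in items if item["value"] >= min_value][:max_tags]


tags = select_tags(json.loads(response_text))
print(tags)
```

In practice you would tune the threshold per use case, for example a lower value when generating long-tail keyword ideas and a higher one for tags displayed to shoppers.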

Ecommerce product classifications services

In addition to Shopify and Google Merchant, we can set up ecommerce product classification services for the marketplaces listed below. These classifications are obtained by training AI models for the particular marketplace.

  • 1-800-flowers.com
  • ABB
  • Afound
  • Ahlens
  • Albertsons
  • Alltricks
  • Amazon Seller
  • Amazon Vendor
  • Astore Shop
  • B&Q
  • Belk
  • Best Buy Canada
  • BigCommerce
  • Blokker
  • Bol.com
  • Brico Prive
  • Bunnings
  • Carrefour France
  • Carrefour Poland
  • Carrefour Spain
  • Carrefour Taiwan
  • Catch
  • Cdiscount
  • Click Central
  • Cleor
  • Conrad
  • Conforama
  • Coperama
  • Coppel
  • Darty
  • Debenhams UK
  • Decathlon
  • Dia & Co
  • DKE
  • Douglas
  • E.Leclerc
  • eBay
  • El Corte Inglés
  • Empik
  • Eprice
  • Express
  • Fnac
  • FonQ Partners
  • Friends of Joules
  • Galeries Lafayette
  • GO Sport
  • GPA
  • Green Weez
  • Grenier Alpin
  • Hewlett Packard Enterprise
  • Home24
  • Hudson’s Bay
  • IBS
  • Inno
  • Intermarche
  • J.Crew
  • Kaufland
  • Kleertjes en Co
  • Kogan
  • Kroger
  • La Poste
  • La Redoute
  • LE BHV MARAIS
  • Leen Bakker
  • Leroy Merlin Brazil
  • Leroy Merlin Netherlands
  • Liverpool
  • Madewell
  • Magento
  • Maisonette
  • Maisons Du Monde
  • ManoMano
  • Maty
  • MediaMarkt
  • Mercado Libre
  • METRO.fr
  • Motherly Shop
  • MyDeal
  • Nature & Decouvertes
  • NBC Universal
  • Office Depot
  • Once It
  • Orderve
  • Pandacola
  • PcComponentes
  • Phone House
  • Place Des Tendances
  • Plateforme StopCOVID19
  • Premiere Vision
  • Privalia Spain
  • Reitmans
  • Ripley Brazil
  • Ripley Chile
  • Ripley Peru
  • Rue Du Commerce
  • Satair
  • Sears Marketplace
  • Sephora UK
  • Shop Apotheke
  • Shopify
  • ShowroomPrive
  • Sprinter
  • The Iconic
  • Tiendanimal
  • Trade Me
  • TradeSquare
  • Truffaut
  • UBALDI
  • Urban Outfitters Inc.
  • Verishop
  • vidaXL
  • Walmart Mexico
  • Walmart USA
  • Westfield Direct
  • WooCommerce
  • Worten
  • Yoox
  • Zalando

Shopify taxonomy

One of the most popular platforms for online stores is Shopify. It recently (July 2024) launched a new version of its taxonomy. Whereas the previous version had around 5.6k categories and was very similar to Google Merchant, the new one has 10k categories.

You can explore it interactively at https://shopify.github.io/product-taxonomy/releases/2024-07/?categoryId=sg-4-17-2-17

Free product categorization

Product categorization and tagging are by no means limited to online store providers; they can also serve many other use cases.

You can test classifiers for product categorization (using a modified Google Product Taxonomy) and website classification (using the IAB Taxonomy) for free on our website https://www.productcategorization.com/.

Product classification API

Calling our API is simple. Here is example code in Python:

from urllib.parse import quote_plus
import requests

# API endpoint and key (replace the placeholder with your own key)
api_base_url = "https://www.productcategorization.com/api/ecommerce/ecommerce_category6_get.php?"
api_key = "your api key"
query_text = "T Shirt"
# URL-encode the product text so that spaces and special characters are safe in the query string
encoded_query_text = quote_plus(query_text)
final_url = f"{api_base_url}query={encoded_query_text}&api_key={api_key}"
response = requests.get(final_url)
print(response.text)

The resulting response:

{"language": "en", "total_credits": 12, "remaining_credits": 12, "classification": "Apparel & Accessories > Clothing > Shirts & Tops", "buyer_personas": ["Minimalist Enthusiast", "Eco-friendly Specialist", "Budget Conscious Individual", "Fitness Enthusiast", "High-End Designer Devotee", "Luxury Brand Enthusiast", "Vintage Aficionado", "Casual Lifestyle Enthusiast", "Trendy Fashionista", "Comfort Seeker"], "category": "Apparel & Accessories > Clothing > Shirts & Tops", "status": 200}
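Once you have the response text, you will typically want to parse it rather than print it. The sketch below works on a shortened copy of the sample response. It assumes that a "status" of 200 indicates success and treats anything else as an error, which is an assumption based only on this one sample; the function name is made up for the example:

```python
import json

# The sample response shown above, shortened to the fields used here.
response_text = (
    '{"language": "en", "total_credits": 12, "remaining_credits": 12, '
    '"classification": "Apparel & Accessories > Clothing > Shirts & Tops", '
    '"status": 200}'
)


def parse_classification(text):
    """Return (tiers, remaining_credits) from a response, or raise on a non-200 status."""
    data = json.loads(text)
    # Assumption: 200 means success, as in the sample response.
    if data.get("status") != 200:
        raise ValueError(f"Unexpected status: {data.get('status')}")
    tiers = [tier.strip() for tier in data["classification"].split(">")]
    return tiers, data["remaining_credits"]


tiers, remaining = parse_classification(response_text)
print(tiers)
print(remaining)
```

Splitting the classification into tiers makes it easy to store each level separately, for example for breadcrumbs or filters in your store.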

If you are more interested in a solution that categorizes websites, check out our free website categorization tool at: https://www.websitecategorizationapi.com

Using this web classification service, we have categorized 30 million domains into 700+ possible categories, based on the IAB taxonomy.

We offer this as an offline URL database for web content filtering, which is ideal if your app or service needs low latency when checking website categories for millions of domains.

In our next article we will go into more detail on how to build a product categorization and website classification model with over 90% accuracy.

Frequently asked questions

What is smart product categorization?

Smart product categorization is an approach to categorizing products into pre-defined categories (called a taxonomy) using an automated process, usually based on machine learning, deep learning or other AI models, with the goal of improving on-site product search and filtering as well as visibility on search engines.

What is product categorization AI?

Product categorization AI denotes a group of machine learning/AI models developed for the purpose of product categorization, including TF-IDF combined with classical models (e.g. logistic regression), deep learning models (e.g. BERT) and LLMs (e.g. ChatGPT, Gemini, Claude).

What is product categorization API?

This denotes requesting the classification of products via a REST API, which returns the category of each product along with other metadata, e.g. attributes. It is usually used when a large number of products need to be classified.

What are ecommerce product classifications services?

Ecommerce product classifications services are usually SaaS platforms that provide API access, based on AI models, to classifications of ecommerce products, usually for online stores such as those on Shopify, WooCommerce and others.

Last updated: 15th October 2024

Written by SeniorQuant

Ph.D. in Theoretical Physics, Senior Data Scientist
