Prophet API — How Olist tackled product taxonomy for multiple channels

Marcos Rossetti
Published in olist
5 min read · May 15, 2018

As product managers at Olist, we face many challenges, as my colleague Igor explains in “Afinal… o que faz o time de Produto no Olist?” (Portuguese).

I have been working here for about a year now, mainly focused on new product registration: making sure product descriptions are accurate and that we have enough information to publish products on every marketplace we are integrated with, and therefore increase sales.

Not only that, we also strive to deliver the best possible experience to those registering products on our platform, optimizing the process and decreasing the time needed for this task. You can check some of our releases from 2017 (Portuguese).

Among the challenges we faced, I decided to write about how we dealt with one of our biggest problems in product registration so far: categories.

Why is taxonomy a challenge?

Every marketplace that Olist is plugged into relies on its users to categorize their own products. That is how we did it at first: we simply replicated each marketplace’s category tree for our sellers and let them classify their own products on each channel.

For example, a PlayStation 3 DualShock controller would be categorized as follows on each channel:

  • Channel 1: Consoles & Games > Accessories > Controller > PC/PS3
  • Channel 2: Games > PlayStation 3 > Accessories PlayStation 3
  • Channel 3: Games > Playstation 3 > Accessories
  • Channel 4: Games > PlayStation > PlayStation 3 > Accessories > Controllers > Dual Shock Controllers > Original Controllers

Each level needed to be selected from a list, and it worked great for a while. It enabled products to be categorized correctly and allowed us time to focus on other core parts of the platform.

But once we started working on the release of our public APIs, we noticed that our partners would have a nightmare categorizing every product on every channel, since they didn’t have the full category tree information on their side.

Not only that, we also noticed that categorization was one of the most time-consuming steps of registering a new product, and one that generated several complaints. At least 32 mouse clicks were needed, and it took an average of 4 minutes to find and select all the categories.

How product categorization was performed on our web client

Moreover, we knew that this solution was not scalable: for each new channel, another completely different category tree would have to be added to the product registration screen.

Heat map of the product registration page

Discussion about possible solutions to our problems

Since Olist strives to bring simplicity to our sellers, we had to come up with a solution.

This solution would have to fulfill certain requirements:

  • Make it easier to expand to new channels and update their category trees
  • Minimize categorization complexity on our web client
  • Allow product registration through our APIs without the need to categorize products one by one for each channel

Looking at the categorization options on the market, we came across solutions such as category mapping for products already categorized in another marketplace or internally, which basically consists of manually relating those categories to a category on each new channel. Another, more common option was a category search field that autocompletes the category as it is typed, but that still had to be done for every channel. Neither looked good enough for our needs.

While discussing the options, one idea that came up was to use machine learning to categorize these products, and it was definitely a promising one. Machine learning would allow any product to be categorized for every marketplace automatically, whether it was registered on our web client or through the public APIs.

Put simply, machine learning uses algorithms and a training database to predict the most probable output for a given input.

The main idea was to use previously categorized products as training data (each product title paired with its channel category), so that from a title alone we would be able to predict the most probable category.
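The idea of learning categories from titles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the algorithms we actually tested are not named here, so the sketch swaps in a simple multinomial Naive Bayes classifier, and the `TitleClassifier` class, the `tokenize` helper, and the training examples are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(title):
    """Lowercase the title and split it on whitespace."""
    return title.lower().split()


class TitleClassifier:
    """Multinomial Naive Bayes over title words, with add-one smoothing."""

    def fit(self, titles, categories):
        # How many training titles belong to each category.
        self.category_counts = Counter(categories)
        # How often each word appears in the titles of each category.
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for title, category in zip(titles, categories):
            words = tokenize(title)
            self.word_counts[category].update(words)
            self.vocabulary.update(words)
        return self

    def predict(self, title):
        total = sum(self.category_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, float("-inf")
        for category, count in self.category_counts.items():
            counts = self.word_counts[category]
            denominator = sum(counts.values()) + vocab_size
            # Log prior plus the log likelihood of every word in the title.
            score = math.log(count / total)
            for word in tokenize(title):
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Made-up training examples for a single hypothetical channel.
TRAINING = [
    ("PlayStation 3 DualShock controller", "Games > PlayStation 3 > Accessories"),
    ("PS3 wireless controller black", "Games > PlayStation 3 > Accessories"),
    ("Stainless steel kitchen knife set", "Home > Kitchen > Cutlery"),
    ("Chef knife 8 inch", "Home > Kitchen > Cutlery"),
]

titles, categories = zip(*TRAINING)
model = TitleClassifier().fit(titles, categories)
prediction = model.predict("DualShock controller for PlayStation 3")
print(prediction)  # prints: Games > PlayStation 3 > Accessories
```

In a setup like this, one such model would be trained per channel, since every channel has its own category tree.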

A small team (a Product Manager and a Developer) was then dedicated to studying this possibility and how it could work on our platform. After testing several algorithms, we found that we could achieve around 85% accuracy on the category prediction for each marketplace when relying on the product title alone.
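An accuracy figure of this kind is typically obtained by predicting categories for titles that were held out from training and counting how many predictions match the known category. A minimal sketch, assuming a hypothetical `accuracy` helper, a stand-in `keyword_predict` rule in place of a real trained model, and made-up data:

```python
def accuracy(predict, labeled_titles):
    """Return the fraction of titles whose predicted category matches the label."""
    correct = sum(
        1 for title, category in labeled_titles if predict(title) == category
    )
    return correct / len(labeled_titles)


def keyword_predict(title):
    """Hypothetical stand-in for a trained model: a single keyword rule."""
    return "Games" if "controller" in title.lower() else "Home"


# Made-up held-out examples that the model never saw during training.
HELD_OUT = [
    ("PS3 DualShock controller", "Games"),
    ("Xbox wireless controller", "Games"),
    ("Kitchen knife set", "Home"),
    ("PlayStation 3 game disc", "Games"),
]

score = accuracy(keyword_predict, HELD_OUT)
print(f"{score:.0%} of held-out titles were categorized correctly")  # 75%
```

The last example is misclassified because the stand-in rule only knows one keyword, which is why the toy score is 75% and says nothing about the real system.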

So we decided to pursue this approach and build an API that, after some training, would categorize our sellers’ products.

Results

Now at Olist, we have the Prophet API, and we can predict categories for each channel using the product title:

  • In 85% of registrations, a product can be categorized with 0 mouse clicks! In the few cases where a correction is needed, it is simple to fix the suggested category
Product suggestions for each channel and how to correct one of the predictions
  • All products registered by our partners through the API are categorized automatically, and for 85% of them no extra action is required
  • Every time a marketplace updates its categories, we simply have to train the Prophet API with the updated database
  • Categories for new channels can be predicted as long as we have enough training data for the new channel

Discoveries along the way and future steps

  • Our training database was not as good as expected (sellers’ categorization wasn’t always right)
  • Use more information besides the product title for the predictions, such as description, photos and attributes, in order to improve accuracy
  • Train the machine learning algorithm with each suggestion correction
  • Improve the category correction flow on our client
  • Abstract away “category per channel”, perhaps to the point where our sellers do no categorization at all

It was a great experience to start this product along with all its difficulties and I know Olist will keep providing even better experiences every day not only for me, but mainly for our customers.

Are you interested in working on challenges like these? Join us and help us keep building better products. We are hiring!
