Product classification with AI: How machine learning sped up logistics for Aeropost

Anjali Krishnan
The Ecommerce Intelligencer
Jun 26, 2018
Fancy some correctly HS coded pasta?

Pasta. Have you ever thought about importing it? Artisanal pasta maybe?

Well, here is some handy information if you ever do. Importing uncooked plain pasta into the US is easy: you pay nothing, 0% duty. But stuffed pasta is an entirely different story: that’ll be 20% in tariffs. That level of markup is almost enough inspiration to bust out a bag of flour and make a mess of the kitchen stuffing our own pasta. Almost.

This seemingly unreasonable rule stems from a very basic problem in retail: product classification. Somewhere along the way, the Harmonized Tariff Schedule (HTS) codes for pasta got very complicated indeed:

1902.11.20 denotes,

exclusively pasta, Product of a European Union (EU) country:

Subject to the Inward Processing Regime (IPR)

Subject to the EU reduced export refund in accordance with the US-EU Pasta agreement

while 1902.20.00 covers,

Stuffed pasta, whether or not cooked or otherwise prepared.

I’m not sure what the US-EU pasta agreement states exactly but clearly stuffed pasta didn’t meet the criteria for an intercontinental free-pass.
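To make the difference concrete, here is a minimal Python sketch of a duty lookup. It assumes the two codes quoted above carry the rates mentioned earlier (0% for plain uncooked pasta, 20% for stuffed pasta). The table, the function name `duty_owed` and the $1,000 shipment value are made up for illustration, and a real tariff schedule has many more conditions.

```python
from decimal import Decimal

# Rates as quoted in this article; not an authoritative tariff table.
DUTY_RATES = {
    "1902.11.20": Decimal("0.00"),  # plain uncooked pasta
    "1902.20.00": Decimal("0.20"),  # stuffed pasta
}

def duty_owed(hts_code: str, customs_value: Decimal) -> Decimal:
    """Return the duty for a shipment, rounded to cents."""
    rate = DUTY_RATES[hts_code]  # raises KeyError for codes not in the toy table
    return (customs_value * rate).quantize(Decimal("0.01"))

shipment_value = Decimal("1000.00")  # hypothetical shipment value in USD
for code in DUTY_RATES:
    print(code, duty_owed(code, shipment_value))
```

On the same hypothetical shipment, one code costs nothing and the other costs $200, which is why a single misclassified digit matters.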

If pasta is so complicated what about the rest of our stuff?

The United States has well over 17,000 unique classification code numbers for determining tariffs on imported goods.

And customs departments aren’t the only ones with unwieldy product catalogs. Below is Amazon’s first level of categorization. At a glance it seems like each further section goes on to have 20–50 more sub-categorizations.

It’s not easy being the world’s biggest retailer.

Product categorization is a hard science to master. Given the scale and speed of retail, it’s easy to see how mistakes can happen very often, and those mistakes can be very costly.

Enter Aeropost

Aeropost is the perfect example of an efficient free market fulfilling customer needs. They allow anyone in Latin America to order overseas products on a single platform without the hassle of ordering from different sites, with different shipping, logistics and tracking. Their logistics network is optimized to reach over 32 countries in Latin America, shipping packages from its processing warehouse in Miami.

Making ecommerce truly global.

Customers get a unified shopping experience and can buy any product online and have it shipped automatically to their home. They don’t have to worry about import restrictions, unexpected fees, or holdups in customs.

Succinctly put, Aeropost is an online retailer operating without an in-house product catalog: their product catalog is everything and anything sold online.

It’s a tough job

Ensuring an easy and unified shopping experience for customers means Aeropost has to abstract away all the problems with cross-border commerce. They aggregate products from different retailers and ship everything to the customer in one easy order.

But in meeting this need, Aeropost has signed up for a hard task. All of us have experienced missing packages, tracking errors, delays and sometimes customs glitches. To combine different products from different retailers, aggregate them under one account, clear customs and deliver them to their customers, Aeropost needs to jump through a lot of hidden hoops.

  1. To begin with, they need an on-demand product catalog with all of the product and pricing information needed to create product display pages (PDPs) and convert shoppers online.
  2. They need to be accurate with their customs classification so that they don’t overpay or, worse, underpay and get fined.
  3. They also need to process packages quickly and in a scalable fashion. Their bottom line is defined by how many packages they can process (their throughput).

That’s where our Categorization API comes in.

How is product categorization powered by AI?

Product categorization is the backbone on which ecommerce runs smoothly. According to Govind, our AI research lead, the goal of AI-driven product categorization is to tag each of the hundreds of millions of products with a unique category ID. These category IDs could be derived from Semantics3's own category taxonomy or belong to an external company or system (like HS codes). Regardless of the source, product taxonomies generally run to tens of thousands of elements.

So to achieve a high degree of accuracy, two conditions are important:

First, an algorithm that solves the task of categorization has to develop a deep understanding of what a product is, and then determine which of the taxonomy tags is most appropriate for it.

Secondly, categories are hierarchical (e.g., an “Apple Macbook” can be described broadly as an “Electronics” item, more specifically as a “Computer & Accessories” item or even more specifically as a “Laptop”); hence, to get category tagging right for just one product, the algorithm needs to make multiple correct decisions.
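The compounding effect of hierarchical decisions can be sketched with a little arithmetic. The per-level accuracies below are invented numbers, and the calculation assumes each level is decided independently, which a real model need not do.

```python
from math import prod

# Category path from the example above, broadest level first.
path = ["Electronics", "Computer & Accessories", "Laptop"]

# Hypothetical per-level accuracies, one per level of the path.
level_accuracy = [0.98, 0.95, 0.90]

def full_path_accuracy(accuracies):
    """Chance that every level is right, assuming independent decisions."""
    return prod(accuracies)

for depth, (name, acc) in enumerate(zip(path, level_accuracy), start=1):
    print(f"level {depth}: {name} ({acc:.0%} correct on its own)")

print(f"whole path correct: {full_path_accuracy(level_accuracy):.1%}")
```

Even when each level looks accurate on its own, the chance of getting the whole path right is lower than the weakest level, and it keeps dropping as the taxonomy gets deeper.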


The simplified version of our categorization engine

A solution for Aeropost

So how can Aeropost solve these three key issues?

Issue 1: Getting an on-demand product catalog

This one is the easiest. Our on-tap product APIs allow Aeropost to use multiple input values to retrieve product data on demand. Input values can be a URL, a keyword, a UPC or any other unique identifier. Shoppers load any product URL into Aeropost, the website retrieves the data instantly, and shoppers proceed to buy directly on Aeropost.com.

The next two issues get a bit more complicated.

Issue 2: The costs of tariff classification and compliance.

Every country has a classification system for monitoring and taxing the goods coming across its borders. A majority of the world follows the Harmonized Commodity Description and Coding System (its codes are known as HS codes), which is maintained by the World Customs Organization.

Goods are assigned six-digit codes from a classification system of more than 5,000 commodity groups. Countries can also add digits to customize their processes. When goods are shipped from a single retailer, that retailer can identify the product at the time of packaging and issue a receipt that complies with customs regulations.
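The structure of a code can be sketched in Python. In the HS, the first two digits identify the chapter, the first four the heading and the first six the subheading; anything after that is a national extension. The function name `split_hs_code` and the returned field names are illustrative choices.

```python
def split_hs_code(code: str) -> dict:
    """Split a tariff code into its HS levels and any national suffix."""
    digits = code.replace(".", "").replace(" ", "")
    if not digits.isdigit() or len(digits) < 6:
        raise ValueError(f"expected at least six digits, got {code!r}")
    return {
        "chapter": digits[:2],
        "heading": digits[:4],
        "subheading": digits[:6],       # the internationally shared part
        "national_suffix": digits[6:],  # digits a country adds, may be empty
    }

print(split_hs_code("1902.11.20"))  # the US pasta code quoted earlier
print(split_hs_code("190220"))      # a bare six-digit code
```

Two national codes that share the first six digits describe the same HS subheading, even if the countries subdivide it differently afterwards.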

For Aeropost the problem gets much more complicated. Imagine aggregating from many different retailers, who issue uniquely designed invoices, with company-specific SKUs. The varied orders from different retailers have to be mapped correctly onto various country codes, packaged accurately and shipped with accompanying product information.

The risk of error is very high, and no company wants to be saddled with import duty fines. Customers are also unlikely to be pleased about dealing with mislabelled packages. Perhaps worst of all, scalability is constrained by labor and systems costs.

Enter Semantics3’s AI-driven Categorization API. The API uses inputs like name, brand, manufacturer, and description to map products to the Semantics3 Master Category Tree.

Demo of categorization engine

For the online shopping experience, the workflow is as follows:

  1. Shopper loads up the product page on Aeropost.com
  2. The product data is then fed into the Semantics3 neural network
  3. The tariff code and shipping weight are returned from Semantics3
  4. These are used to calculate the shipping cost and import duties, which are added to the product cost to show the fully landed cost to the shopper
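The four steps above can be sketched as follows. The classifier call is replaced by a stub that returns fixed values, since the real request and response formats are not described here. The function names, the per-kilogram shipping rate and the duty rate are all placeholders.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass
class Classification:
    tariff_code: str
    weight_kg: Decimal
    duty_rate: Decimal  # assumed to be derivable from the tariff code

def classify_stub(product: dict) -> Classification:
    """Stand-in for the categorization call; returns made-up values."""
    return Classification("1902.20.00", Decimal("2.5"), Decimal("0.20"))

def landed_cost(price: Decimal, c: Classification,
                shipping_per_kg: Decimal) -> Decimal:
    """Product price plus shipping plus import duty, rounded to cents."""
    shipping = c.weight_kg * shipping_per_kg
    duty = price * c.duty_rate  # simplification: duty charged on price only
    return (price + shipping + duty).quantize(Decimal("0.01"))

product = {"name": "Stuffed pasta, 2.5 kg", "price": Decimal("40.00")}
result = classify_stub(product)
print(landed_cost(product["price"], result, Decimal("4.00")))
```

Real duty calculations often use a different taxable base (for example, one that includes freight), so the formula here is only a starting point.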

This simplifies the process for Aeropost. Setting up an automated classification module based on deep learning, tailored precisely to each country’s HS code system, ensures greater accuracy. Aeropost can afford to be less conservative with tariff code classification, allowing them to optimize the duties paid. Fewer fines for miscategorized goods build up their reputation with customs, which allows them to use the fast lane for imports. Lastly, customers are fully informed because shipping and tariff costs are calculated and displayed before they make the purchase.

Issue 3: Time Taken

Before AI, the shipping process involved manually opening the packages and checking for tariff codes at the warehouse. This meant a significant labor cost for each country that Aeropost wanted to expand to.

With the Categorization API, the workflow changes to:

  1. Package arrives in warehouse.
  2. Warehouse staff open the package and scan the product barcodes.
  3. The UPC lookup hits Semantics3’s UPC API and returns the product metadata.
  4. The product metadata is then fed into Semantics3’s neural network to automatically return the product’s tariff code for the country it’s being shipped to.
  5. The related product data and tariff codes are then loaded into the package documentation in preparation for shipment to the target country.
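A rough sketch of this warehouse flow, with both lookups replaced by in-memory stubs. Nothing here reflects the actual Semantics3 API: `lookup_upc`, `classify_for_country`, the sample barcode, the sample tariff code and the manual-review fallback are all hypothetical.

```python
# Stub data standing in for the UPC lookup and the classifier.
FAKE_UPC_DB = {
    "000000000017": {"name": "Cheese ravioli", "brand": "ExampleBrand"},
}
FAKE_TARIFF_CODES = {("Cheese ravioli", "CR"): "1902.20.00"}

def lookup_upc(upc):
    """Step 3: return product metadata for a barcode, or None if unknown."""
    return FAKE_UPC_DB.get(upc)

def classify_for_country(metadata, country):
    """Step 4: return a tariff code for the destination, or None."""
    return FAKE_TARIFF_CODES.get((metadata["name"], country))

def build_documentation(scanned_upcs, country):
    """Steps 3 to 5: look up each barcode, classify it, collect the paperwork."""
    lines, needs_review = [], []
    for upc in scanned_upcs:
        metadata = lookup_upc(upc)
        code = classify_for_country(metadata, country) if metadata else None
        if code is None:
            needs_review.append(upc)  # assumed fallback: route to manual handling
            continue
        lines.append({"upc": upc, **metadata, "tariff_code": code})
    return lines, needs_review

docs, manual = build_documentation(["000000000017", "999999999999"], "CR")
print(docs)
print(manual)
```

Keeping a separate list of barcodes that could not be resolved is one way to let staff handle exceptions by hand while the rest of the package moves on.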

The cost savings to Aeropost are significant: shorter package processing time at the warehouse, more precise categorization, and a better track record with customs.

These savings are then passed on to customers, which helps Aeropost gain market share, trust in the system, and a reputation for reliability. The savings are immediate and long-term.

Our suite of machine-learning categorization modules helps retailers leverage the power of artificial intelligence to increase accuracy and efficiency when dealing with massive amounts of product data. Companies can also create customized ML-powered product classifier engines targeted to any retailer taxonomy or country-specific HS code system.

If you are looking for a similar solution check out our Categorization API here.

Published in The Ecommerce Intelligencer

A look at how data is shaping the future of e-commerce, gleaned from our stockpile of E-commerce product, pricing and customer metadata. Also see www.semantics3.com/blog