Search, and you shall find

Can your customers find what they want to buy?

Dec 27, 2015


By Joshua Fox

If a customer walked into your showroom and asked for a refrigerator, would you show them a microwave? So why do you see microwaves when you search a top e-commerce giant for “refrigerator with more than 9 cubic feet”? (Go ahead, try it on your favorite sites. Hint: Start with the biggest.)

Your bricks-and-mortar store makes products easy to find with tasteful, well-designed displays. Would you label a smartphone “Samsung Galaxy S4 SCH-I545 16GB Verizon AT&T T-Mobile GSM UNLOCKED Cell Phone”? (Yes, that’s a real title from a top marketplace site.)

You can’t expect customers to find the right products if you don’t even know what you’re selling. And, as it happens, the search engines at most e-commerce sites don’t have a clue. The software just doesn’t make any sense of its jumble of unstructured product descriptions, gathered from multiple manufacturers and distributors.

Marketplaces have it even worse than retailers: They have to accommodate lots of small vendors whose weak IT infrastructure doesn’t let them organize and standardize their product data. The usual result is that vendors cram as much product information as possible into the title or text description, in the hope that a text-based search engine picks up some keywords.

If you want customers to find the products that they are actually going to buy, your search engine needs to understand what your product data means.

The first step to creating this solution is a well-defined product model. It should declare exactly what information is needed to help you sell each product. This includes descriptive information, such as title and brand, and a statement of which fields are required in the product specifications. It also includes offer information for each product, such as price, condition, shipping costs, and inventory.
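To make this concrete, here is a minimal sketch of such a model in Python. The class names, the `category` field, and the `REQUIRED_SPECS` table are illustrative assumptions, not part of any standard; only the descriptive and offer fields come from the list above.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Offer:
    """Commercial terms for a product: the offer fields named in the text."""
    price: Decimal
    condition: str           # e.g. "new" or "used"
    shipping_cost: Decimal
    inventory: int           # units in stock


@dataclass
class Product:
    """Descriptive information, category-specific specs, and offers."""
    title: str
    brand: str
    category: str            # assumed field: selects the required specs
    specs: dict = field(default_factory=dict)
    offers: list = field(default_factory=list)


# Hypothetical declaration of which spec fields each category requires.
REQUIRED_SPECS = {
    "refrigerator": {"capacity_cubic_feet"},
    "smartphone": {"storage_gb"},
}


def missing_specs(product: Product) -> set:
    """Return the required spec fields that this product does not supply."""
    required = REQUIRED_SPECS.get(product.category, set())
    return required - set(product.specs)


fridge = Product(
    title="Acme Frost-Free Refrigerator",
    brand="Acme",
    category="refrigerator",
    specs={"capacity_cubic_feet": 10.5},
    offers=[Offer(Decimal("799.00"), "new", Decimal("49.00"), inventory=3)],
)
incomplete_phone = Product(title="Acme Phone", brand="Acme", category="smartphone")

print(missing_specs(fridge))
print(missing_specs(incomplete_phone))
```

A check like `missing_specs` is one way to catch incomplete vendor data at the door, before it reaches the search index.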

What’s new in this approach is not the information, which usually already exists in the jumbled data. What’s new is the clear declaration of exactly what information should be gathered, and how it is to be laid out.

There are methodologies for this, such as ontologies and the Unified Modeling Language (UML). And much modeling work has already been done: See the Creative-Commons-licensed GoodRelations e-commerce ontology. I suggest you brush up on the basics. But only the basics: You don’t need to use the full formal power of these methodologies and models to get started.

Begin by writing up the model in an ordinary document, using bullet points, clear language, and small diagrams where needed. Start by documenting the aspects of retail products that most concern you, taking guidance from those standard retail product models and methodologies. The goal is to establish a common language that your team, partners, vendors, and customers will understand — including both business and technical players. Comprehension is more important than completeness or formal precision.

This document then serves as a source for the other artifacts you need to get stakeholders on board and implementing your approach. You do this by carving out parts of the model into the formats and languages that different audiences need. This includes diagrams for business presentations; schemas for databases; object-oriented classes; data interchange formats; predictive machine-learning models; and front-end layouts for your website and mobile apps.

Don’t bother trying to generate these technical artifacts automatically from a formalized model, as different formats impose very different technical requirements. But the document will serve as a communication tool to guide the creation of artifacts, providing the common vocabulary and structure which are lacking in most search engines today.

A key point here is that the product model is not a technical artifact like a Java class model or a relational database schema. Those artifacts have their use, but they define software implementations, while the product model defines your retail reality: the products that you sell. The software and databases should follow your business’s search needs, not the other way around, and the model-first approach helps ensure this.

With this document, and the artifacts derived from it, information flows from your distributors, vendors, and your affiliate marketers in a form you can use. Search result pages stop being either an empty wasteland or a cluttered hodge-podge, and start showing customers exactly the information that your marketers want to show them.

Figure: Some artifacts based on the model

Consumers can then search for the products that truly satisfy their requirements. Not just “Smartphone,” but “Smartphone with more than 4 GB.” Amazingly, this is impossible in most e-tail search engines today. With a little bit of structure in your data, this search becomes straightforward.
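As a rough illustration of why structure makes this easy, the sketch below filters a tiny in-memory catalog by category and by a numeric specification. The `Product` class, the `search` function, and the `storage_gb` field name are assumptions made for this example. It also assumes the shopper’s text has already been turned into a structured query, which is a separate problem.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    title: str
    category: str
    specs: dict = field(default_factory=dict)


def search(products, category, spec, more_than):
    """Return products in `category` whose numeric `spec` exceeds `more_than`."""
    return [
        p for p in products
        if p.category == category
        and isinstance(p.specs.get(spec), (int, float))  # skip missing values
        and p.specs[spec] > more_than
    ]


catalog = [
    Product("Phone A", "smartphone", {"storage_gb": 16}),
    Product("Phone B", "smartphone", {"storage_gb": 4}),
    Product("Microwave C", "microwave", {"capacity_cubic_feet": 1.2}),
    Product("Phone D", "smartphone"),  # spec missing: excluded, not guessed
]

# "Smartphone with more than 4 GB", expressed as a structured query.
results = search(catalog, "smartphone", "storage_gb", more_than=4)
print([p.title for p in results])  # prints ['Phone A']
```

No microwave can turn up in these results, because the category is a field to be matched rather than a keyword buried in a title.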

But even better for maximizing conversions is a search in terms of customer value: “Laptops for business travel.” With well-defined data in place, you can rank products for such a query using simple rules that balance multiple factors: in this case, the laptop’s size, weight, and presentation features. Advanced machine-learning algorithms can also leverage the product model to learn how suitable each product is for consumers’ use cases. In this example, the algorithms would discover that laptops with certain combinations of features are most likely to be bought for business travel, and suggest further laptops accordingly.
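Here is one way the simple-rules version might look. Everything in this sketch is an assumption chosen for illustration: the field names, the thresholds, the weights, and the use of a video output as a stand-in for “presentation features.” A real rule set would come from your merchandisers, and a learned model could later replace the hand-picked weights.

```python
from dataclasses import dataclass


@dataclass
class Laptop:
    title: str
    screen_inches: float
    weight_kg: float
    has_video_out: bool  # stand-in for "presentation features"


def business_travel_score(laptop: Laptop) -> float:
    """Score from 0 to 1; thresholds and weights are illustrative only."""
    # Lighter is better: 1.0 at 1 kg or less, falling to 0.0 at 3 kg or more.
    weight_score = min(1.0, max(0.0, (3.0 - laptop.weight_kg) / 2.0))
    # Smaller is better: 1.0 at 13 inches or less, 0.0 at 17 inches or more.
    size_score = min(1.0, max(0.0, (17.0 - laptop.screen_inches) / 4.0))
    presentation_score = 1.0 if laptop.has_video_out else 0.0
    return 0.4 * weight_score + 0.3 * size_score + 0.3 * presentation_score


laptops = [
    Laptop("Ultralight 13", 13.0, 1.0, True),
    Laptop("Desktop Replacement 17", 17.0, 3.0, True),
    Laptop("Budget 15", 15.0, 2.0, False),
]

# Rank the catalog for the query "Laptops for business travel".
ranked = sorted(laptops, key=business_travel_score, reverse=True)
for laptop in ranked:
    print(f"{laptop.title}: {business_travel_score(laptop):.2f}")
```

The rules only work because size, weight, and features are separate, typed fields. Given nothing but a keyword-stuffed title, there would be nothing to score.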

You hire beautiful fashion models to show off your products. Invest in a beautiful product model as well, and your search will show consumers exactly the results they want to see, which means that consumers will find the products that they want to buy.

About the author: Joshua Fox is Principal Software Architect at Twiggle, building the next generation of semantic product search engines. His background includes stints as senior architect and technical lead at IBM, HP, and VC-backed startups. Joshua holds a PhD from Harvard University, and his articles on business and software innovation have appeared at leading publications including ReadWrite, Pando, and the front page of Hacker News.



