Why Natural Search is Awesome and How We Got Here

Natural Language Search — How To Go Beyond The Hype And Build a Useful Experience

The Evolution Of Desti’s Search Interface

This is a story about how one ambitious start-up tackled a problem that has puzzled the likes of Google, Apple, Facebook and others, and came up with some pretty clever conclusions (if I may say so myself). In 2012–2013 we were building Desti, a holistic travel search app (i.e. it would search for everything, from hotels through attractions to restaurants), using post-Siri natural-language-understanding tech and powerful semantic search capabilities on the back end that allowed Desti to reason meaningfully about search results and make highly informed suggestions.

What Were We Trying To Achieve?

Desti’s search was built on a premise that sounds very simple, but it’s actually very hard to pull off. We believe that people should be able to ask specifically for what they’re interested in and get results that match. This sounds reasonable, right? If I’m looking for a beach resort on the Kona Coast in Hawaii, it’s pretty obvious what I want. And if I also want it to be kid friendly and pet friendly, I should just be able to ask for it. Our goal was to get users inputting relevant, specific queries because that’s what people need. That’s where Desti shines — saving you time and effort by delivering exactly what you want.

Now let’s assume that Desti knows which hotels on the Kona Coast are actually beach resorts, are kid friendly and pet friendly. How can we make expressing this query easy and intuitive for the user?

Episode 1: Desti Is Siri’s Sister or Conversational User Interface
When we started, we were very naïve about this. We said — first, let’s just put a search box in there, allowing the user to type or say whatever they want, and let’s make sure we understand this. Then, let’s leave that box there so they can react to what they see and provide more detail (“refine”) or search for something else in that context (e.g. a restaurant near the resort — we called this “pivot”). And let’s run a conversation around it, kind of like Siri. What could be more natural? To do this we used SRI International’s VPA platform, which is almost literally a post-Siri natural-language-interaction platform with which you can have a conversation in context.

This is more or less what it looked like in our beta version:

Search box:

Dead simple search box

A conversational UI:

Conversational interaction

We launched this, monitored use and quickly realized that early users split into two groups:

  • Those who speak Googlese, but not a lot of it. Their queries looked like this: “Hawaii”, “hotel kona” or at most “hotel kona beach.”
  • Those who came to meet Siri. They had questions like “Are you Siri’s sister?”, “what is the best hotel in the world” and “tell me a joke.”

Discarding the 2nd group (we’re busy people), we learned that people don’t know how to interact naturally with computers, or they have no idea what to ask or expect, so they revert to the most primitive queries. Problem is, our goal was to answer interesting, specific queries, because we believe that if we give you a great answer that caters to what you want, your likelihood of buying is that much higher.

Furthermore, absolutely no one got the conversational aspect — the fact you can continue refining and pivoting through conversation. We decided to take away the focus from conversation for the time being.

Episode 2: Vegas Slot Machines or Make It Dead Simple

We realized we had to focus on the first query and give people some cues about what’s possible, so we came up with this interface:

Dynamic spinners for query input

These contextual spinners turned interaction from a totally open-ended query to something closer to multiple-choice questions. In essence these were interchangeable templates, where you could get ideas for “what to search for” as well as easily input your query. What you picked would show up as a textual query in the search bar, which we hoped people would realize they can edit or add to. Hoped…

The results: on the one hand, progress. We saw longer and more interesting queries and more interaction. However, when talking to users, we realized that they assumed the spinner was a kind of menu system, which meant (a) they could only pick what’s in the menu and (b) they had to pick one thing from each menu. So while this was better than what most sites have for search, it was still a far cry from what we wanted to deliver. Here’s what we learned from this:

  • Suggestions give people a feel of “what’s possible” and also reduce the amount of typing needed, which is a boon — especially on a virtual keyboard
  • But, if the suggestions look like a menu, people just assume they are limited to those suggestions and to the template provided. They won’t try to add anything beyond it, or drop anything from it
  • Furthermore, this UI breaks as soon as you want to input quantitative stuff that won’t fit in there visually, like a calendar-based date entry or prices.

Episode 3: Fill In The Blanks — Smartly

At this stage, it was clear that we needed better auto-suggest and smarter auto-complete. This is similar from a UI perspective to Google Instant, but Desti is about semantic search, not keyword matching. In most cases, Google will auto-suggest a phrase that matches what you’ve been typing AND has been typed in by many other people. Desti should suggest something that semantically matches what you entered and makes sense given what we know of the destination and about your trip. Because Desti is new and there haven’t been a million users searching for the same things before you, Desti should reason about what you may ask, not suggest something someone else asked.

We realized we had to build a lot of semantically reasonable and statistically relevant auto-suggestion logic. We still wanted to keep to the template logic because we believed it helps users think about what they are looking for and form the query in their minds. So we came up with a UI that blends form-filling and natural language entry, and focused on building smart auto-suggest and auto-complete.

Combined dynamic templates and free text

This UI was built of a number of rigid fields (e.g. location, type) that adapt to the subject matter (so if the type is “hotel” you’re prompted for dates), and a free text field that allows you to ask for whatever else you want.
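To make the "adapting fields" idea concrete, here is a minimal sketch of a template whose prompts depend on the chosen type. It is an illustration only: the field names and the `EXTRA_FIELDS` table are assumptions made up for this example, not Desti's actual schema.

```python
# Hypothetical sketch: rigid fields that adapt to the chosen type,
# plus one free-text field. All names and field lists are assumptions.

BASE_FIELDS = ["location", "type"]

# Extra prompts triggered by the value of "type" (assumed examples).
EXTRA_FIELDS = {
    "hotel": ["dates", "rooms"],
    "restaurant": ["time", "party_size"],
}


def fields_for(place_type):
    """Return the ordered list of fields the form should show."""
    extras = EXTRA_FIELDS.get(place_type, [])
    # The free-text field always comes last, whatever the type is.
    return BASE_FIELDS + extras + ["free_text"]
```

With this table, picking "hotel" adds the date and room prompts, while a type with no entry (say "museum") shows only the basic fields and the free-text box.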

We iterated a lot over the auto-complete and auto-suggest features. The first thing is to realize they are different. With auto-complete, you have a user who already thought of something to type in, and you have to guess what that is. With auto-suggest, you really want to inspire the user into adding something useful to their query, which means it needs to be relevant to whatever you know about the query and user so far, but not overwhelming for the user.
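The difference between the two can be sketched in a few lines. This is a toy illustration under stated assumptions: the vocabulary and the relevance scores are invented, and plain prefix matching stands in for whatever matching a real system would use.

```python
# Hypothetical sketch contrasting auto-complete and auto-suggest.
# The vocabulary and relevance table are made-up illustrations.

VOCABULARY = ["beach resort", "bed and breakfast", "kid friendly", "pet friendly"]

# Assumed relevance of features per place type (higher = more relevant).
RELEVANCE = {
    "hotel": {"kid friendly": 0.9, "pet friendly": 0.7, "beach resort": 0.8},
    "museum": {"kid friendly": 0.6},
}


def auto_complete(prefix):
    """Guess what the user is already typing: match on the typed prefix."""
    p = prefix.lower()
    return [term for term in VOCABULARY if term.startswith(p)]


def auto_suggest(place_type, already_chosen, limit=2):
    """Propose additions the user has not typed, ranked by context relevance."""
    scores = RELEVANCE.get(place_type, {})
    candidates = [t for t in scores if t not in already_chosen]
    candidates.sort(key=lambda t: scores[t], reverse=True)
    return candidates[:limit]  # keep the list short so it does not overwhelm
```

Note that `auto_complete` only looks at the characters typed so far, while `auto_suggest` ignores the keyboard entirely and works from what is already known about the query.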

All this requires knowing a lot about specific destinations (what do people search for in Hawaii vs. New York?) and specific types (what’s relevant for hotels vs. museums?). Also, on the visual side, what the user is putting in is often quantitative and easier to “set” than “type” — e.g. a date, a price etc. So we came up with our first crack at blending text with visual widgets.

Template filling with dynamic widgets

The result was a big improvement in the quality and relevance of queries over the previous UI, but also a feeling that this was still too stiff and rigid. When people are asked for a “type of place” (e.g. a museum, a park, a hotel), they often can’t really answer, and it’s easier for them to think about a feature of the place instead (e.g. that they can go hiking or biking, see art or eat breakfast). For linguistic reasons it’s easier for people to say that they want a “romantic hotel” than a “hotel that’s romantic”. So while this UI was very expressive, it often felt unnatural and limiting. Furthermore, many users just ended up filling in the basic fields and not adding any depth in the open-text field (despite various visual cues). And editing a query for refining or pivoting was hard.

At the same time, the auto-suggest / auto-complete elements we’d built at this stage were almost enough to allow us to just throw out the limiting “templates” and move to one search field, but this time, a damn clever one.

Episode 4: Search Goes Natural

To the naked eye, this looks like we’ve come full circle: one text box, parsed queries shown as tags. What could be simpler?

On-the-fly parsing and tagging

Well, not exactly, because we still need queries to be meaningful. One thing that the templates gave us was built-in disambiguation. We need a query that has at least a location + a type (or something from which we can derive a type), and without a template telling us that the “hotel” is the type, and the “restaurant” is something you want your hotel to have (vs. maybe the opposite), the system needs to better understand the grammatical structure of the sentence, and cue you into inputting things the right way when it’s suggesting and auto-completing.
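A very rough sketch of the "location + type" check might look like the following. Everything here is an assumption for illustration: the word lists are invented, and a naive word-order rule (the first type word wins, later ones become features) stands in for real grammatical analysis.

```python
# Hypothetical sketch: checking that a parsed query is "meaningful",
# i.e. has a location and a type (stated or derivable from a feature).
# Word lists and the word-order heuristic are illustrative assumptions.

KNOWN_TYPES = {"hotel", "restaurant", "museum", "park"}

# Assumed mapping from a feature to the type it most likely implies.
FEATURE_IMPLIES_TYPE = {"hiking": "park", "art": "museum", "breakfast": "restaurant"}


def resolve_type(words):
    """Pick the place type from an ordered list of query words.

    The first known type word is treated as the thing searched for; any
    later type word (e.g. after "with") is demoted to a feature.
    """
    place_type, features = None, []
    for word in words:
        if word in KNOWN_TYPES and place_type is None:
            place_type = word
        elif word in KNOWN_TYPES or word in FEATURE_IMPLIES_TYPE:
            features.append(word)
    if place_type is None:
        # No explicit type: try to derive one from the features.
        for feature in features:
            if feature in FEATURE_IMPLIES_TYPE:
                place_type = FEATURE_IMPLIES_TYPE[feature]
                break
    return place_type, features


def is_meaningful(location, words):
    """A query is usable once it has a location and a resolvable type."""
    place_type, _ = resolve_type(words)
    return bool(location) and place_type is not None
```

So "hotel with restaurant" resolves to a hotel that has a restaurant, "restaurant in hotel" resolves the other way round, and a bare "hiking" is enough to derive a type. A query that fails the check is a cue for the UI to suggest a missing location or type.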

Typing a query:

Semantic auto-complete

The query is understood — you can add / edit:

After parsing — auto-suggest

With this new user interface, changing queries (“refining and pivoting”) is very natural — add tags, or take away tags. Widgets were contextually integrated using the auto-suggest drop-down menu, so they are naturally suggested at the right time (e.g. after you said you were looking for a hotel, we help you choose when, how many rooms etc.). It’s also very easy to suggest things to search for based on the context. For instance if we know your kids are traveling with you, we’d drop in “family friendly” and you could dismiss it with one click.

Suggesting widgets when relevant
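Treating the query as a set of tags makes refining and pivoting simple operations. The sketch below is a hypothetical illustration: the `TagQuery` class and the `apply_context` helper are names invented for this example, not part of Desti.

```python
# Hypothetical sketch: a query as a list of tags that can be added or removed.
# TagQuery and apply_context are invented names used only for illustration.

from dataclasses import dataclass, field


@dataclass
class TagQuery:
    tags: list = field(default_factory=list)

    def add(self, tag):
        """Refine: add a tag if it is not already present."""
        if tag not in self.tags:
            self.tags.append(tag)

    def remove(self, tag):
        """Dismiss a tag (one click in the UI)."""
        if tag in self.tags:
            self.tags.remove(tag)


def apply_context(query, traveling_with_kids):
    """Pre-fill a tag from known trip context; the user can still dismiss it."""
    if traveling_with_kids:
        query.add("family friendly")
    return query
```

For example, starting from the tags "hotel" and "Kona", `apply_context(query, True)` drops in "family friendly", and `query.remove("family friendly")` takes it away again.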

So Where Is This Going?

So far, Natural Search looks and behaves better than anything else we’ve seen in this space. From now on, most of the focus is on making the guesses even smarter, with more statistical reasoning about what people ask for in different contexts, and more contextual info driving those guesses.

We believe this UI is where vertical search is heading. Consider how nice it would be to input “gifts for 4 year old boys under $30” into target.com’s search bar, or “romantic restaurant with great seafood near Times Square with a table at 8 PM tonight” into OpenTable, and get relevant answers. Then again, answering specific queries is not that easy either, but that’s the other side of Desti…

To be continued.