Cross Language Information Retrieval Via Taste Translation

4SQ Eng
Published in Foursquare
Feb 15, 2016

What’s the best place for lamb in Santiago? If you’re a local, you’d know to hit up Jewel of India for their cordero magallanico or Barrica 54 to try the Garrón de Cordero. But what if you’re an English-speaking traveler visiting the city, and you don’t know a cordero from a cortado?

Whether you are looking for inspiration on where to shop or searching out the best iced coffee in town, Foursquare can help you decide where to go and what to do. Our rich venue data is created by tourists and locals alike and is used to recommend hidden local gems as well as popular tourist attractions.

The Foursquare app and website are translated into 11 different languages, allowing users from around the world to write tips about their favorite places in their native language. More than half of all of our existing tips are in a language other than English. This plethora of data from locals around the world poses a specific problem, though: How do we make use of it to help users who do not speak the language the content was written in? How can we build the best experience for an English user visiting Tokyo, or a Turkish user visiting São Paulo?

Traditional search engines use a variety of information retrieval techniques to find the documents or web pages most relevant to a given query, typically restricted to the detected language of the query. While this model works well for generic search engines, it would leave Foursquare users travelling to destinations without much data in their native language with suboptimal results when searching for specific foods, or with results based only on tips left by tourists rather than locals.

English lamb search in Santiago without translations
Korean gelato search in NYC without translations

The paucity of content in a visitor’s own language within a specific geographic region, however, does not mean that we are completely blind as to the quality and content of the venues within that region. Native non-English speakers in other countries are using Foursquare the same way you are using it at home. Up to this point, our users’ experience when searching for tastes has differed between languages. A Japanese user visiting the US who searched for tastes such as gelato (ジェラート) within the Japanese Foursquare app on their phone would only get back results where other Japanese users had written about ジェラート, even though there is a tremendous amount of English gelato data that would be useful to that user.

To address this problem, we are happy to announce that we have started improving the underlying taste models for each language to include the appropriate translations. Under the hood, you can visualize our taste model as a large ontology of terms represented as directed acyclic graphs. Previously, each language’s taste model was distinct from every other, with no links between them. Adding translation links to these ontologies allows us to make use of the language-specific curations (e.g. 納豆/natto implying 朝食系/breakfast food in the Japanese taste model) that were made in the taste models of each language, which in turn provides an even better, more localized experience when using Foursquare abroad. No longer will your results be limited by the extent to which other tourists who speak your language have left content in the area you are visiting. Instead, you will truly be able to live like a local by leveraging all of the foreign-language data that was previously out of reach of your queries.
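To make the graph picture concrete, here is a minimal Python sketch of a multilingual taste ontology. It is an illustration only, not Foursquare's implementation: the `TasteGraph` class, its method names, and the `(language, term)` node representation are all assumptions made for this example. It shows how a curation that exists only in the Japanese model (納豆 implying 朝食系) becomes reachable from an English term once translation links are added.

```python
from collections import defaultdict


class TasteGraph:
    """Toy multilingual taste ontology (hypothetical). Nodes are (language, term) pairs."""

    def __init__(self):
        self.implies = defaultdict(set)       # node -> broader tastes in the same language
        self.translations = defaultdict(set)  # node -> equivalent nodes in other languages

    def add_implication(self, lang, term, broader):
        self.implies[(lang, term)].add((lang, broader))

    def add_translation(self, a, b):
        # Translation links are treated as symmetric in this sketch.
        self.translations[a].add(b)
        self.translations[b].add(a)

    def equivalents(self, node):
        """The node plus everything reachable through translation links alone."""
        seen, stack = {node}, [node]
        while stack:
            current = stack.pop()
            for nxt in self.translations.get(current, set()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def implied_tastes(self, node):
        """Broader tastes reachable from the node, in any linked language."""
        start = self.equivalents(node)
        seen, stack = set(start), list(start)
        while stack:
            current = stack.pop()
            neighbours = (self.implies.get(current, set())
                          | self.translations.get(current, set()))
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen - start


graph = TasteGraph()
graph.add_implication("ja", "納豆", "朝食系")  # curation made in the Japanese model

# Without translation links, the English term knows nothing about breakfast food.
isolated = graph.implied_tastes(("en", "natto"))

graph.add_translation(("ja", "納豆"), ("en", "natto"))
graph.add_translation(("ja", "朝食系"), ("en", "breakfast food"))

# With links, the Japanese curation is reachable from the English term.
linked = graph.implied_tastes(("en", "natto"))
```

In this sketch `isolated` is empty, while `linked` contains both 朝食系 and its English counterpart, which is the effect the translation links are meant to have.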

More concretely, there are two specific changes that users will see. The first is under-the-hood query expansion for taste matches into the languages for which we have translations. For example, a query for “lamb” in Santiago, Chile, will now be automatically expanded to incorporate Spanish results for “cordero” by traversing the multilingual taste graph.
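The expansion step can be sketched in a few lines of Python. Everything here is hypothetical and chosen for illustration: the `TRANSLATIONS` table, the `expand_query` and `search_tips` functions, and the naive substring matching are assumptions, not a description of the production search stack.

```python
from collections import deque

# Hypothetical translation links between taste terms, keyed by (language, term).
TRANSLATIONS = {
    ("en", "lamb"): {("es", "cordero")},
    ("es", "cordero"): {("en", "lamb")},
}


def expand_query(lang, term, translations=TRANSLATIONS):
    """Return the query term plus every term reachable through translation links."""
    start = (lang, term.lower())
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in translations.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def search_tips(tips, lang, term):
    """Return tips that mention any expanded term (naive substring match)."""
    terms = {t for _, t in expand_query(lang, term)}
    return [tip for tip in tips if any(t in tip.lower() for t in terms)]


tips = [
    "Prueba el Garrón de Cordero",
    "Great cortado here",
    "The lamb curry is excellent",
]
matches = search_tips(tips, "en", "lamb")
```

Here an English query for "lamb" matches both the English tip and the Spanish tip that mentions cordero, while the cortado tip is left out. A real system would match on extracted tastes rather than raw substrings; the substring check only keeps the sketch short.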

English lamb search in Santiago with translations
Japanese gelato search in NYC with translations

This query expansion results in huge cross-language information retrieval gains that continue to make Foursquare search the best in the world.

Beyond retrieval wins, these translation links also give Foursquare the ability to highlight the top tastes at a venue on the venue page, even if there is no content in your language to support that taste.
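As a rough illustration of that second change, the sketch below relabels a venue's tastes into the viewer's language. The `LABELS` table, the `top_tastes_for_viewer` function, and the idea of ranking tastes by raw mention count are all assumptions made for this example; the post does not say how top tastes are actually chosen.

```python
from collections import Counter

# Hypothetical translation table: (language, term) -> {target language: label}.
LABELS = {
    ("en", "gelato"): {"ja": "ジェラート"},
    ("en", "espresso"): {"ja": "エスプレッソ"},
}


def top_tastes_for_viewer(taste_mentions, viewer_lang, labels=LABELS, limit=3):
    """Rank a venue's tastes by mention count, labelled in the viewer's language.

    taste_mentions is a list of (language, term) pairs extracted from tips.
    Tastes with no label in the viewer's language are skipped.
    """
    counts = Counter()
    for lang, term in taste_mentions:
        if lang == viewer_lang:
            label = term
        else:
            label = labels.get((lang, term), {}).get(viewer_lang)
        if label is not None:
            counts[label] += 1
    return [label for label, _ in counts.most_common(limit)]


# A NYC venue whose tips are all in English, viewed by a Japanese user.
mentions = [("en", "gelato"), ("en", "gelato"), ("en", "espresso"), ("en", "cannoli")]
japanese_view = top_tastes_for_viewer(mentions, "ja")
```

Even though no tip at this venue is written in Japanese, the Japanese viewer still sees ジェラート and エスプレッソ as top tastes. The untranslated "cannoli" is dropped in this sketch; a real system might instead fall back to the original term.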

English tip snippets in Japanese
NYC venue as seen by a Japanese user
Taste pile in Japanese
Taste filtering in Japanese

We currently have translations enabled between English, Spanish, Japanese, and Indonesian, but are working to get all of our supported languages enabled. We’re incredibly excited about the improved experience and fidelity of results that this new feature brings to the table.

Interested in helping to continue pushing the envelope and improving Foursquare? Take a look at our job openings.

Kris Concepcion (@kjc9), Ben Mackey, Matt Kamen (@losfumato), Daniel Salinas (@zzorba42)
