Search @ Pelago

Aman Srivastava
Published in Pelago Tech Blog
10 min read · Jul 7, 2023

Pelago is a tour and activities booking platform backed by Singapore Airlines that offers thousands of activities and tours around the world. Our platform is designed to help travelers discover and book exciting experiences in a new city or country. With so many options to choose from, it can be overwhelming for travelers to find the perfect activity or tour that matches their interests and budget. This is where our search engine comes in handy.

In this blog post, we will explore why search is crucial for Pelago and how we use Elasticsearch to provide relevant search results to our users.

Table of Contents

1. The Importance of Intelligent Search in Travel

2. Leveraging Elasticsearch for Powerful Search

3. Multilingual Support: Breaking Language Barriers

4. Boosting Methods for Enhanced Relevance

5. Facet Search: Enhancing Discovery on Pelago

6. Entity Detection: Uncovering Contextual Insights

7. ChatGPT Integration: Search Keywords Expansion

8. Tracking and Analytics: The Search Dashboard

The Importance of Intelligent Search in Travel

Pelago offers a vast range of activities and tours, including adventure sports, food tours, cultural experiences, and more. Our platform’s success depends on how easy it is for travelers to find activities that match their interests and budget. A good search engine can help travelers find the right activity or tour in no time.

We understand that travelers have different preferences and priorities. Some may want to explore the city’s food scene, while others may be more interested in adventure sports. To cater to these diverse interests, we need a search engine that can understand and interpret users’ search queries and provide them with personalized search results.

Moreover, travelers expect instant gratification. They want to find what they are looking for quickly and easily. If they can’t find what they are looking for, they will likely abandon our platform and look elsewhere. Therefore, we need a search engine that can deliver fast and accurate results to meet users’ expectations.

Leveraging Elasticsearch for Powerful Search

At the heart of Pelago’s search engine is Elasticsearch, a highly performant, open-source search and analytics engine, built on Apache Lucene. Elasticsearch arms Pelago with the ability to handle large data sets and execute intricate searches, all while maintaining high speed and efficiency.

The search experience at Pelago stands on three main pillars: Auto-suggest, Full Search, and Fuzzy Search. Each of these caters to specific user needs, together ensuring a comprehensive and seamless user experience.

The Auto-suggest feature aids users by providing real-time recommendations of products and destinations as they type. These instant suggestions lead users directly to the respective product or destination pages, making the discovery process swift and efficient.
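As a rough illustration of the idea, an auto-suggest lookup can be expressed as an Elasticsearch `match_phrase_prefix` query, which treats the last word typed as a prefix. The post does not say which Elasticsearch feature backs Pelago's auto-suggest, so the helper name `build_autosuggest_query`, the `title.<locale>` field, and the default of five suggestions are assumptions made for this sketch.

```python
def build_autosuggest_query(prefix: str, locale: str = "en", size: int = 5) -> dict:
    """Build a search body matching titles that contain the typed words,
    with the last word treated as a prefix (hypothetical helper)."""
    field = f"title.{locale}"  # assumed per-language title field
    return {
        "size": size,  # only a handful of suggestions are needed
        "query": {"match_phrase_prefix": {field: {"query": prefix}}},
        "_source": [field],  # return just the text shown in the dropdown
    }


autosuggest_body = build_autosuggest_query("universal stu")
```

The resulting dictionary could be passed as the body of a search request on each keystroke, ideally with some debouncing on the client side.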

Conversely, Full Search and Fuzzy Search offer a more exploratory path. These search types guide users to a product listing page populated with a range of options relevant to their query. From there, users can further refine their search using filters such as category, tags, and price, providing them with the flexibility to tailor their search results according to their preferences.

Multilingual Support: Breaking Language Barriers

Pelago’s search engine supports multiple languages by using Elasticsearch’s language-specific analyzers. These analyzers process and tokenize text according to the linguistic rules of each language. For instance, English queries are analyzed using the English analyzer, Thai queries with the Thai analyzer, and so on.

To enable real-time querying based on the user’s locale, Pelago has designed a dynamic mapping mechanism. This mechanism assigns language-specific fields to cater to different languages — for instance, title.en is used for English queries, and title.zh for Chinese. This way, users can search in their native language and receive accurate and relevant results.

This process not only makes the platform more accessible and user-friendly but also helps Pelago reach a larger, more diverse global audience. The details of this mapping and its use in real-time querying are covered in the hands-on walkthrough that follows.

Let's dive into some hands-on Python programming to better understand how Pelago leverages Elasticsearch for search with multilingual support.

Firstly, we’ll establish a connection to Elasticsearch and create an index named ‘products’:

from elasticsearch import Elasticsearch

# Connect to Elasticsearch (assumes a local node; recent clients require an explicit URL)
es = Elasticsearch("http://localhost:9200")

# Create an index
index_name = 'products'
es.indices.create(index=index_name)

After creating the index, we define the structure and data types of fields within Elasticsearch through mapping. This step is crucial for returning accurate search results and supporting multiple languages. Here’s an example of mapping fields for various product attributes:

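# Define the field mappings
mapping = {
    "properties": {
        "title": {
            "properties": {
                "en": {"type": "text", "analyzer": "english"},
                "th": {"type": "text", "analyzer": "thai"},
                # "smartcn" requires the analysis-smartcn plugin on the cluster
                "zh": {"type": "text", "analyzer": "smartcn"}
            }
        },
        # Queried later as description.en / description.th
        "description": {
            "properties": {
                "en": {"type": "text", "analyzer": "english"},
                "th": {"type": "text", "analyzer": "thai"},
                "zh": {"type": "text", "analyzer": "smartcn"}
            }
        },
        "destination": {"type": "keyword"},
        "country": {"type": "keyword"},
        "categories": {"type": "keyword"},
        "tags": {"type": "keyword"},
        "search_keywords": {
            "type": "text",
            "analyzer": "english",
            # Exact-match sub-field, queried later as search_keywords.keyword
            "fields": {"keyword": {"type": "keyword"}}
        },
        # Needed by the distance-based boosting query later on
        "location": {"type": "geo_point"}
    }
}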

# Update the mapping for the index
es.indices.put_mapping(index=index_name, body=mapping)

Now that we have the index and field mappings set up, we can insert data into Elasticsearch and conduct searches. Here’s how to add a product and perform a simple search in a different language:

# Insert data into the index
product_data = {
    "id": 67890,
    "title": {
        "en": "Universal Studios Singapore",
        "th": "ยูนิเวอร์แซลส์สิงคโปร์",
        "zh": "新加坡环球影城"
    },
    "categories": ["Theme Park", "Family"],
    "tags": ["Fun", "Adventure", "Thrilling"]
}
es.index(index=index_name, id=product_data["id"], body=product_data)

# Perform a basic search in English
en_search_query = {
    "query": {
        "multi_match": {
            "query": "universal",
            "fields": [
                "title.en",
                "description.en"
            ]
        }
    },
    "_source": ["title.en", "description.en"]
}

en_search_results = es.search(index=index_name, body=en_search_query)

# Perform a basic search in Thai
th_search_query = {
    "query": {
        "multi_match": {
            "query": "สากล",
            "fields": [
                "title.th",
                "description.th"
            ]
        }
    },
    "_source": ["title.th", "description.th"]
}
th_search_results = es.search(index=index_name, body=th_search_query)
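The English and Thai queries differ only in the language suffix of the fields they target, so the per-locale field selection can be derived at request time. The following is a minimal sketch of that idea; the helper name `build_localized_query`, the set of supported locales, and the fallback to English are assumptions for illustration, not Pelago's actual implementation.

```python
SUPPORTED_LANGUAGES = {"en", "th", "zh"}  # languages that have analyzed sub-fields
DEFAULT_LANGUAGE = "en"  # assumed fallback for unsupported locales


def build_localized_query(text, locale, base_fields=("title", "description")):
    """Build a multi_match query against the fields for the user's language."""
    # Reduce locales such as "th-TH" or "zh_CN" to their language code
    language = locale.replace("_", "-").split("-")[0].lower()
    if language not in SUPPORTED_LANGUAGES:
        language = DEFAULT_LANGUAGE
    fields = [f"{name}.{language}" for name in base_fields]
    return {
        "query": {"multi_match": {"query": text, "fields": fields}},
        "_source": fields,
    }


thai_query = build_localized_query("สากล", "th-TH")
fallback_query = build_localized_query("museum", "fr-FR")
```

Keeping the locale handling in one function means adding a language only requires a new analyzed sub-field in the mapping and one more entry in `SUPPORTED_LANGUAGES`.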

In addition to basic searches, Pelago utilizes advanced features provided by Elasticsearch. For example, fuzzy search can handle typographical errors and variations in user queries, ensuring more robust search results.
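A fuzzy search can be sketched by adding the `fuzziness` parameter to a `multi_match` query. With the value `"AUTO"`, Elasticsearch chooses the allowed edit distance from the length of each term, so short words must match more closely than long ones. The helper name and the misspelled example query are illustrative assumptions.

```python
def build_fuzzy_query(text, fields):
    """Build a multi_match query that tolerates small typos in the query text."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
                # Edit distance is chosen per term based on its length
                "fuzziness": "AUTO",
            }
        }
    }


# "univarsal" is a deliberate misspelling of "universal"
fuzzy_query = build_fuzzy_query("univarsal studios", ["title.en", "description.en"])
```

Note that `fuzziness` applies to the default `best_fields` type of `multi_match`; it is not supported for the `phrase`, `phrase_prefix`, or `cross_fields` types.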

Boosting Methods for Enhanced Relevance

To further improve search relevance, Pelago implements boosting methods within Elasticsearch. By assigning weights to specific fields and applying popularity-based and location-based boosting, Pelago improves the ranking of search results.

Field boosting involves assigning higher importance to specific fields, such as titles, descriptions, or search keywords. For instance, the following query demonstrates field boosting applied to the title, search_keywords, and description fields:

# Boosting specific fields in the query
boosted_search_query = {
    "query": {
        "multi_match": {
            "query": "universal studios",
            "fields": [
                "search_keywords.keyword^5",
                "title.en^3",
                "description.en^0.5"
            ]
        }
    }
}

# Perform the boosted search
boosted_search_results = es.search(index=index_name, body=boosted_search_query)

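In this example, the search_keywords.keyword field has a boost factor of 5, the title.en field has a boost factor of 3, and the description.en field is down-weighted with a factor of 0.5. A match on search_keywords therefore counts for more than a match in the title, which in turn counts for more than a match in the description. Note that search_keywords.keyword must exist as a keyword sub-field in the mapping, and because keyword fields are not analyzed, it matches only when the whole query equals a stored keyword exactly.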

Popularity-based boosting factors in metrics such as the number of reviews, bookings, clicks, and page views to prioritize popular products and destinations for users’ search queries. This can be achieved with Elasticsearch’s function_score query, which raises the score of highly popular documents and so improves the visibility of the most sought-after products and destinations in the search results.
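One way to sketch popularity-based boosting is a `function_score` query with a `field_value_factor` function. The field name `booking_count` is hypothetical (the post does not name the popularity fields), and the choice of the `log1p` modifier and `sum` boost mode is an assumption made for this example.

```python
def build_popularity_query(text, popularity_field="booking_count"):
    """Combine text relevance with a dampened popularity signal."""
    return {
        "query": {
            "function_score": {
                "query": {"match": {"title.en": text}},
                "field_value_factor": {
                    "field": popularity_field,  # hypothetical numeric field
                    "modifier": "log1p",  # dampens very large counts
                    "factor": 1.0,
                    "missing": 0,  # treat products without the field as zero
                },
                # Add the popularity bonus to the text score, so products
                # with no bookings keep their relevance score
                "boost_mode": "sum",
            }
        }
    }


popularity_query = build_popularity_query("zoo")
```

Using `sum` rather than the default `multiply` avoids zeroing out the score of new products whose popularity value is 0.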

Furthermore, location-based boosting allows Pelago to boost products relevant to a user’s specific location. By incorporating a user’s location into the search algorithm, Pelago ensures that users are presented with activities and tours that are conveniently located, which makes travel planning easier. Here’s an example of how Pelago can boost products based on location:

# Boosting based on location
# Note: "location" must be mapped as a geo_point field for the decay function to work
location_boosted_query = {
    "query": {
        "function_score": {
            "gauss": {
                "location": {
                    "origin": "1.3521, 103.8198",  # "lat, lon" of the user
                    "scale": "50km",
                    "offset": "10km",
                    "decay": 0.5
                }
            },
            "query": {
                "match": {
                    "title.en": "zoo"
                }
            }
        }
    }
}
location_boosted_results = es.search(index=index_name, body=location_boosted_query)

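In this query, the gauss function scores products by their distance from the specified origin, which can be the user’s browsing location as inferred from their IP address. The scale, offset, and decay parameters define how the score decreases with increasing distance from the origin: products within the offset (10km here) keep the full score, and at a distance of offset plus scale (60km) the score has fallen to the decay value (0.5). The location field must be mapped as a geo_point for this query to work.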
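To make the effect of these parameters concrete, the Gaussian decay can be reproduced in a few lines of Python, following the formula given in the Elasticsearch documentation for decay functions. This is a simplified sketch: it takes the distance in kilometres directly instead of computing it from coordinates, and the function name is made up for this example.

```python
import math


def gauss_decay(distance_km, scale_km=50.0, offset_km=10.0, decay=0.5):
    """Score multiplier of a Gaussian decay for a given distance from the origin."""
    # Variance is chosen so that the score equals `decay` at offset + scale
    sigma_squared = -(scale_km ** 2) / (2.0 * math.log(decay))
    # Distances inside the offset are not penalised at all
    effective_distance = max(0.0, distance_km - offset_km)
    return math.exp(-(effective_distance ** 2) / (2.0 * sigma_squared))


for km in (5, 60, 110):
    print(f"{km:>3} km -> {gauss_decay(km):.4f}")
# prints:
#   5 km -> 1.0000
#  60 km -> 0.5000
# 110 km -> 0.0625
```

A product 5km away keeps its full score, one 60km away is halved, and one 110km away retains about 6% of its score.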

Facet Search: Enhancing Discovery on Pelago

Faceted search is a crucial feature in improving user experience on Pelago. It allows users to narrow down search results by applying multiple filters, making the search process more efficient and less overwhelming.

The term facet refers to specific categories into which results can be classified. These categories can include attributes such as price range, categories of activities, location, reviews, and more. The faceted search thus provides a more interactive search experience as it helps users discover what they want faster.

Elasticsearch supports this feature through aggregations that provide the facets’ values. For instance, we can use the terms aggregation to get a list of distinct values (and their counts) for a specific field.

Let’s look at how we implement faceted search in Python using Elasticsearch:

# Inserting data with various facets into the index
product_data = {
    "id": 12345,
    "title": {
        "en": "Walking Food Tour in Bangkok",
        "th": "ทัวร์อาหารเดินบนถนนในกรุงเทพฯ",
        "zh": "曼谷美食步行游"
    },
    "price": 30.00,
    "categories": [
        "Food",
        "Culture"
    ],
    "tags": [
        "Street Food",
        "Local Experience"
    ],
    "destination": "Bangkok",
    "rating": 4.5
}
es.index(index=index_name, id=product_data["id"], body=product_data)

# Search query with facets
facet_search_query = {
    "query": {
        "multi_match": {
            "query": "Food",
            "fields": [
                "title.en",
                "categories",
                "tags"
            ]
        }
    },
    "aggs": {
        "by_category": {
            "terms": {
                "field": "categories"
            }
        },
        "min_price": {
            "min": {
                "field": "price"
            }
        },
        "max_price": {
            "max": {
                "field": "price"
            }
        },
        "by_rating": {
            "range": {
                "field": "rating",
                "ranges": [
                    {"to": 3},
                    {"from": 3, "to": 4},
                    {"from": 4}
                ]
            }
        }
    }
}
facet_search_results = es.search(index=index_name, body=facet_search_query)

In this example, the user searches for Food activities, and Elasticsearch returns the relevant activities along with aggregated counts of various categories, min & max price values, and ratings. This allows users to quickly filter the search results based on their preferences.
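The aggregation results come back under the `aggregations` key of the response, keyed by the names used in the query. The sketch below turns them into a structure a filter sidebar could render. The `sample_response` fragment is hand-written for illustration, not output captured from a real cluster, and `extract_facets` is a hypothetical helper.

```python
def extract_facets(response):
    """Flatten the aggregation section of a search response into facet data."""
    aggs = response["aggregations"]
    return {
        "categories": {b["key"]: b["doc_count"] for b in aggs["by_category"]["buckets"]},
        "price_range": (aggs["min_price"]["value"], aggs["max_price"]["value"]),
        "ratings": {b["key"]: b["doc_count"] for b in aggs["by_rating"]["buckets"]},
    }


# Hand-written fragment shaped like an Elasticsearch response (illustrative only)
sample_response = {
    "aggregations": {
        "by_category": {
            "buckets": [
                {"key": "Food", "doc_count": 1},
                {"key": "Culture", "doc_count": 1},
            ]
        },
        "min_price": {"value": 30.0},
        "max_price": {"value": 30.0},
        "by_rating": {
            "buckets": [
                {"key": "*-3.0", "to": 3.0, "doc_count": 0},
                {"key": "3.0-4.0", "from": 3.0, "to": 4.0, "doc_count": 0},
                {"key": "4.0-*", "from": 4.0, "doc_count": 1},
            ]
        },
    }
}

facets = extract_facets(sample_response)
```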

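Additionally, Elasticsearch provides the filter aggregation, which restricts the documents a facet is computed over without affecting the search hits, and post_filter, which narrows the hits after the aggregations have been computed. Together they let us keep facet counts stable while a user applies filters, so selecting one category does not make the other category options disappear.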
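A minimal sketch of how facets and hits can be decoupled uses two Elasticsearch features: a `filter` aggregation, which limits the documents one aggregation sees, and `post_filter`, which narrows the hits after aggregations are computed. The helper name, the aggregation names, and the rating threshold are assumptions for illustration.

```python
def build_decoupled_facet_query(text, selected_category):
    """Filter the hits by category while keeping category counts unfiltered."""
    return {
        "query": {"multi_match": {"query": text, "fields": ["title.en", "tags"]}},
        "aggs": {
            # Computed over all documents matching the query, so every
            # category stays visible after one has been selected
            "by_category": {"terms": {"field": "categories"}},
            # A filter aggregation: average price of highly rated products only
            "highly_rated": {
                "filter": {"range": {"rating": {"gte": 4}}},
                "aggs": {"avg_price": {"avg": {"field": "price"}}},
            },
        },
        # Applied to the hits only, after the aggregations have run
        "post_filter": {"term": {"categories": selected_category}},
    }


decoupled_query = build_decoupled_facet_query("tour", "Food")
```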

The power of faceted search, especially when combined with Elasticsearch’s other features like full-text search and relevance ranking, enhances the user’s ability to find the most suitable activities or tours based on their preferences and constraints, thereby elevating their overall search experience on the Pelago platform.

Entity Detection: Uncovering Contextual Insights

Entity detection plays a crucial role in understanding user queries and providing contextually relevant search results. Pelago incorporates advanced natural language processing (NLP) techniques to identify key components of search queries, such as destinations, tags, or categories.

By leveraging NLP algorithms, Pelago’s search engine can extract relevant entities from user queries, enabling more accurate and contextual search results. For instance, if a user searches for theme parks Singapore, the search engine identifies Singapore as the intended destination and uses it as a filter when querying the Elasticsearch index, leaving theme parks as the text to match. This ensures that the search results are more relevant and tailored to the user’s desired travel location.
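The flow from detected entity to filtered query can be sketched as follows. The post does not describe the NLP technique Pelago uses, so this example substitutes a naive dictionary lookup purely to show how a detected destination becomes a filter; `KNOWN_DESTINATIONS` and both function names are hypothetical.

```python
# Hypothetical gazetteer; a real system would use a far richer entity model
KNOWN_DESTINATIONS = {"singapore": "Singapore", "bangkok": "Bangkok"}


def detect_destination(query):
    """Split a query into (destination or None, remaining free text)."""
    destination = None
    remaining = []
    for token in query.split():
        match = KNOWN_DESTINATIONS.get(token.lower())
        if match is not None and destination is None:
            destination = match
        else:
            remaining.append(token)
    return destination, " ".join(remaining)


def build_entity_query(query):
    """Use the detected destination as a filter and the rest as text to match."""
    destination, text = detect_destination(query)
    must = [{"match": {"title.en": text}}] if text else [{"match_all": {}}]
    bool_query = {"must": must}
    if destination is not None:
        # Filters restrict results without influencing the relevance score
        bool_query["filter"] = [{"term": {"destination": destination}}]
    return {"query": {"bool": bool_query}}


entity_query = build_entity_query("theme parks Singapore")
```

Placing the destination in the `filter` clause rather than in `must` keeps it out of the relevance calculation, so ranking depends only on how well the remaining text matches.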

ChatGPT Integration: Search Keywords Expansion

As data scientists at Pelago, we constantly explore new ways to enhance our search functionality and improve its accuracy and usability. To achieve this goal, we have integrated ChatGPT, a state-of-the-art language model developed by OpenAI, into our search system.

We are leveraging ChatGPT’s extensive knowledge of language patterns and user behavior to generate search keywords for popular products in our system. Through its training on vast amounts of data, ChatGPT has acquired the ability to identify common language patterns people use when searching for products. As an example, ChatGPT suggested adding USS as a search keyword for Universal Studios Singapore, based on its understanding of users’ tendency to use abbreviations or short forms.
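Once keyword suggestions are available, they need to be merged into a product's search_keywords field. The sketch below covers only that merge step and deliberately leaves out the call to the language model, whose prompt and API usage the post does not describe. The function names are hypothetical, and the suggestions are passed in as a plain list.

```python
def merge_search_keywords(existing, suggested):
    """Combine keyword lists, dropping blanks and case-insensitive duplicates."""
    seen = set()
    merged = []
    for keyword in list(existing) + list(suggested):
        cleaned = keyword.strip()
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            merged.append(cleaned)
    return merged


def build_keyword_update(existing, suggested):
    """Build the body of a partial document update for the keyword field."""
    return {"doc": {"search_keywords": merge_search_keywords(existing, suggested)}}


update_body = build_keyword_update(
    ["Universal Studios Singapore"],
    ["USS", "uss ", ""],  # illustrative suggestions, including noise
)
```

Reviewing suggestions before indexing them is advisable, since a generated keyword that is too generic could surface a product for unrelated queries.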

By incorporating ChatGPT into our search system, we are able to provide more accurate and relevant search results to our users. This helps to streamline the search process and enables our users to easily find the products they are looking for.

Tracking and Analytics: The Search Dashboard

Pelago is committed to enhancing search functionality and delivering an exceptional user experience. Our powerful search dashboard is instrumental in driving data-driven decisions by providing valuable metrics and insights.

By tracking a diverse range of key metrics, such as click-through rates, conversion rates, and user engagement, we gain deep insights into user behavior and search performance. Analyzing this data helps us identify patterns and trends, enabling us to optimize and improve our algorithms to deliver a search experience that goes beyond expectations.
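As a small worked example, click-through and conversion rates can be computed from a log of search events. The event shape and the definitions used here (clicks per search and bookings per search) are assumptions for illustration; the post does not specify how Pelago defines these metrics.

```python
def search_metrics(events):
    """Compute click-through and conversion rates from a list of event dicts."""
    searches = sum(1 for e in events if e["type"] == "search")
    clicks = sum(1 for e in events if e["type"] == "click")
    bookings = sum(1 for e in events if e["type"] == "booking")
    return {
        "searches": searches,
        # Guard against division by zero when no searches were logged
        "click_through_rate": clicks / searches if searches else 0.0,
        "conversion_rate": bookings / searches if searches else 0.0,
    }


# Illustrative events: four searches, two clicks, one booking
sample_events = [
    {"type": "search", "query": "zoo"},
    {"type": "click", "query": "zoo"},
    {"type": "search", "query": "uss"},
    {"type": "click", "query": "uss"},
    {"type": "booking", "query": "uss"},
    {"type": "search", "query": "food tour"},
    {"type": "search", "query": "museum"},
]

metrics = search_metrics(sample_events)
```

With these sample events the click-through rate is 0.5 and the conversion rate is 0.25.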

We closely monitor popular search queries, frequently clicked results, and user feedback to uncover opportunities for enhancement. With cutting-edge technology and comprehensive metric analysis, our data-driven decisions continuously refine search quality and elevate user satisfaction.

Stay tuned for our upcoming blog post, where we will provide further insights into the inner workings of our search dashboard and how we leverage data to enhance the search experience for our users.
