EXPEDIA GROUP TECHNOLOGY — DATA
The Juggler Model: Balancing Expectations in Lodging Rankings
Expedia Group’s ranking method for keeping travelers, properties and the marketplace satisfied
“You cannot please the Greeks and the Trojans.”
The argument behind this famous saying is that it is impossible — and even counterproductive — to try to fulfill everyone’s expectations. So, why bother trying?
At Expedia Group™️, we’ve been researching machine learning solutions to balance multiple stakeholders’ expectations in our Lodging Marketplace. This blog post introduces the Juggler model, a Multi-Stakeholder Ranking solution so named because it juggles the importance of relevance and business adjustments in the final ranking.
Lodging Ranking basics
The Lodging Search process starts with a traveler’s query expressing their needs. The search request typically contains a destination, check-in and check-out dates, and number of travelers.
The traveler query triggers a series of services that lead to the Property Search Results page. Here, the properties are displayed in decreasing order of importance.
Then, we track the traveler’s interactions with the page, focusing mainly on clicks and bookings. In this example, we know the traveler clicked two properties (green signs) and ended up booking the second one (in blue).
As a result, we can create the Lodging Ranking dataset. An attribution logic connects all tracked events to their originating search, and the data combine multiple sources: Searches, Properties and Travelers.
Lodging Marketplace stakeholders
Most e-commerce marketplaces cater to three broad stakeholder groups: Producers, Consumers and System. In Lodging Ranking, these refer to the properties, travelers and Expedia Group’s marketplace, respectively. The marketplace provides tools so that travelers can find the most suitable property that meets their requirements.
Although the marketplace can be simply explained, its dynamics cannot. Much like the Greeks and Trojans, stakeholders have competing objectives that dictate how the market operates. Consider some examples:
- Travelers want the best available property at the lowest price. However, we know that “best” does not mean the same for everyone, everywhere, all the time. Also, different customers have different price sensitivity. So, we build models that can predict customer needs in any given search context and use them to rank the properties.
- Properties want to have the best exposure while paying the lowest fees for it. However, this is not always possible, depending on the market conditions in which they operate. To aid them, we provide mechanisms to improve their rankings in exchange for fees or special deals, and we coach partners on how to improve their service so that they bring more value to the marketplace.
- Finally, the system wants to sell the most profitable inventory to the largest number of customers. It needs to balance short-term conversion goals (i.e., convert a search) with long-term objectives (increase repeat rate), adapt to different traveler preferences and market conditions and respond to external factors quickly and effectively.
Addressing stakeholder dynamics in a way that ensures collective satisfaction and marketplace health is a complex subject and a matter of continuous research.
Multi-stakeholder ranking
In Lodging Ranking, machine learning models consider traveler needs when ranking properties. However, optimizing for traveler relevance alone hurts long-term marketplace health. Thus, we need to combine stakeholders’ needs for the final ranking.
We tackle this issue by creating a composite score given by the sum of independent components, including the relevance score and other business adjustments used to optimize the ranking for different stakeholder objectives. The formula looks like this:

score(property) = relevance + adjustment₁ + adjustment₂ + … + adjustmentₙ
The final ranking is obtained by sorting the properties in decreasing order of composite score. If all goes well, the most relevant properties stay in the top positions, with slight adjustments that move properties that ensure long-term marketplace health closer to the top.
Good business adjustments are hard to define. On the one hand, they must be effective; on the other, business adjustments must be easy to interpret and explain to stakeholders. For instance, we must be able to explain to property managers why a specific property is assigned a given ranking and advise on what actions to take to improve it.
So, how do we keep all stakeholders happy, while maintaining the current ranking mechanism? Our answer is to use linear scalarization: each adjustment is assigned a weight, dictating how much it can contribute to the final ranking. The scoring formula now becomes:

score(property) = w₀ · relevance + w₁ · adjustment₁ + … + wₙ · adjustmentₙ
For instance, in a particular search where conversion probability is high, we may want to be more aggressive in terms of profitability — in this case, we may want to increase the business adjustments in charge of this objective. In other cases, we may prefer to be more conservative and focus most on relevance, such that we maximize conversion probability. To do this, we need only to increase the relevance weight.
Consider a simple example of three properties, each with specific scores for each adjustment. Notice we represent only one adjustment for simplicity, encoding all remaining adjustments under the “…” column. In this case, the final score is simply the sum of all components. The ranking is then created by sorting the properties’ scores in decreasing order.
With Juggler, we include a pair of weights in the calculations. Although these weights are the same for all properties in a given search, they can affect the overall score substantially. In this example, the new scores produce a completely different final ranking: it changed from 3–1–2 to 2–3–1.
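The mechanics of that example can be sketched in a few lines of Python. The relevance and adjustment scores below are made up for illustration (the real scores and business adjustments are internal), but they reproduce the same 3–1–2 → 2–3–1 flip:

```python
# Hypothetical relevance and (single) business-adjustment scores.
properties = {
    "P1": {"relevance": 0.7, "adjustment": 0.1},
    "P2": {"relevance": 0.4, "adjustment": 0.3},
    "P3": {"relevance": 0.9, "adjustment": 0.2},
}

def rank(props, w_rel=1.0, w_adj=1.0):
    """Sort property names by weighted composite score, best first."""
    def score(p):
        return w_rel * p["relevance"] + w_adj * p["adjustment"]
    return sorted(props, key=lambda name: score(props[name]), reverse=True)

print(rank(properties))                        # → ['P3', 'P1', 'P2']
print(rank(properties, w_rel=0.2, w_adj=2.0))  # → ['P2', 'P3', 'P1']
```

With default weights the plain sum ranks P3 first; shifting weight from relevance to the adjustment promotes P2 to the top, even though no per-property score changed.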
Now, all we have left to do is to find appropriate weight parametrizations for any search the travelers perform on our marketplace. Easy, right?
Juggler: A meta-learning framework
Intuitively, we know there is a relationship between the search context and the ideal adjustment weight parametrization. For instance, when there are fewer properties to choose from, relevance is likely most important. Conversely, when there are several good options, there is added value in enforcing the objectives behind business adjustments.
Although this is certainly true in most cases, it is not the whole story. To explore and leverage the complex patterns that govern Multi-Stakeholder Ranking, we propose the “Juggler” model. Juggler is inspired by the algorithm selection task from Meta-Learning, whose goal is to find models that map dataset characteristics to the best algorithms.
To apply the methodology to the Lodging Ranking domain, we establish the following parallels:
- Instead of datasets, we focus on searches. In our context, a search is a complex data structure, comprising several entities and their relationships: Search, Property and Traveler. We must then find appropriate ways to describe them to derive the search context.
- We do not select the best algorithms, but rather the best weight parametrizations. Like algorithms, a different parametrization can lead to a different ranking and, by extension, to different performance. Therefore, different parametrizations may be useful under different conditions. We just need to discover when …
The resulting model learns the mapping between search context and weight parametrizations. This enables us to predict the best parametrization based on what happened in similar searches and, after plugging it into the scoring formula, re-rank the items accordingly.
Below we can find the flowchart representing the tasks required to prepare the data, build the model and use it in inference.
Juggler @ Expedia Group
Such a framework can be used in other Ranking tasks. We will leave that topic for future blog posts — for now, we focus on the existing Juggler model by addressing some implementation questions.
Defining the search context
Recall that the Lodging Ranking data comprises Search, Property and Traveler data. In this stage, we employ suitable data summarization techniques for each entity to derive a rich search context.
While Search and Traveler features are essential to define a search context and can be used as is, they are not fine-grained enough to enable good predictive performance. Thus, we must go a step further and summarize the properties in a search into single measures. For instance, we describe the price distribution via a histogram. We apply similar operations to other property characteristics to ensure complex patterns are detectable.
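As a sketch of this summarization step, the per-property price column of one search can be reduced to a fixed-length feature vector. The prices, bin edges and chosen statistics below are illustrative, not the production feature set:

```python
import numpy as np

# Hypothetical nightly prices of the properties returned by one search.
prices = [79.0, 85.0, 92.0, 110.0, 150.0, 210.0, 480.0]

# Normalized histogram: fraction of properties in each price bucket.
bin_edges = [0, 100, 200, 300, 500, float("inf")]
counts, _ = np.histogram(prices, bins=bin_edges)
hist_features = counts / counts.sum()

# Append a few distribution statistics to form search-level features.
context_features = list(hist_features) + [
    float(np.median(prices)),
    float(np.std(prices)),
]
```

Whatever the number of properties in a search, the output has a fixed length, which is what lets these summaries be fed to a downstream model alongside the Search and Traveler features.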
Finding the ideal weight parametrizations
The ideal parametrizations per search are found via simulations. To do this, we recreate the rankings and evaluate them against an objective policy. A policy is simply a set of ranking metrics. For instance, NDCG is commonly used to evaluate ranking relevance, but several useful metrics inspect serendipity, fairness, novelty, etc. All of them can be combined into a policy. The simulation involves several steps:
- Load search data. The data coverage should be large enough to cover multiple segments and capture seasonality effects.
- Apply the scoring formula, using different parametrizations. We test a wide range of options, baselining them against the default option.
- Evaluate the new ranking, using the objective policy. In the case of ties, we choose the parametrization closest to the default.
- Select the best parametrization. This is achieved simply by looking up the weight parametrization with the best score in the objective policy.
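The steps above can be sketched as a small grid search. Everything here is a toy stand-in: two weights only, an illustrative candidate grid, and a reciprocal-rank-of-the-booking policy in place of real metrics such as NDCG:

```python
# Toy search: per-property scores plus the booking label used by the policy.
search = [
    {"relevance": 0.7, "adjustment": 0.1, "booked": 0},
    {"relevance": 0.4, "adjustment": 0.3, "booked": 1},
    {"relevance": 0.9, "adjustment": 0.2, "booked": 0},
]

DEFAULT = (1.0, 1.0)
CANDIDATES = [(w_rel, w_adj) for w_rel in (0.1, 1.0, 2.0)
                             for w_adj in (0.2, 1.0, 2.0)]

def policy(ranked):
    """Toy objective policy: reciprocal rank of the booked property."""
    for position, prop in enumerate(ranked, start=1):
        if prop["booked"]:
            return 1.0 / position
    return 0.0

def distance(weights):
    """Squared distance to the default parametrization, for tie-breaking."""
    return sum((a - b) ** 2 for a, b in zip(weights, DEFAULT))

def best_parametrization(props):
    def evaluate(weights):
        w_rel, w_adj = weights
        ranked = sorted(
            props,
            key=lambda p: w_rel * p["relevance"] + w_adj * p["adjustment"],
            reverse=True,
        )
        return policy(ranked)
    # Best policy score wins; ties go to the candidate closest to the default.
    return max(CANDIDATES, key=lambda w: (evaluate(w), -distance(w)))

print(best_parametrization(search))  # → (0.1, 1.0)
```

The selected label for this search is the parametrization itself; repeated over many searches, these labels become the targets of the supervised step described next.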
Defining and using the Juggler model
With features and labels clearly identified, the problem reduces to a classical Supervised Learning task, and finding a Juggler model becomes straightforward. We use a gradient-boosted trees classifier, due to its superior performance in offline testing.
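A minimal sketch of that supervised step, assuming the search-context features and best-parametrization labels from the simulation are already materialized (random toy data below, with scikit-learn's `GradientBoostingClassifier` standing in for the production model):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Toy meta-dataset: one row per historical search.
# X = search-context features; y = index of the best weight parametrization
# found by the simulation for that search.
PARAMETRIZATIONS = [(1.0, 1.0), (0.5, 1.5), (1.5, 0.5)]
X = rng.normal(size=(200, 6))
y = rng.integers(0, len(PARAMETRIZATIONS), size=200)

meta_model = GradientBoostingClassifier(n_estimators=50, random_state=0)
meta_model.fit(X, y)

# Inference: describe the incoming search, predict a parametrization,
# then plug the predicted weights into the scoring formula before sorting.
new_search_context = rng.normal(size=(1, 6))
w_rel, w_adj = PARAMETRIZATIONS[int(meta_model.predict(new_search_context)[0])]
```

Treating the parametrization as a class label keeps the output space small and auditable; each predicted class maps back to one concrete, explainable set of weights.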
Once placed in production, the model inspects new searches and predicts the most appropriate weight parametrizations. It finds generalizable patterns that direct the search results toward more appropriate rankings.
More details about the model implementation are available in the RecSys 2021 workshop paper here.
Conclusions and next steps
The Juggler model has consistently improved business performance over the years and we are now expanding it to other ranking problems. Additionally, we will explore other ways to enhance the model:
- Personalization. Juggler will use more detailed Traveler information and continuously adapt the weights to their needs. For instance, the ranking should be adapted to the customer’s price sensitivity.
- Deep Learning. The search context is the most critical component in the Meta-Learning model. As such, there is great potential in using Deep Learning architectures to derive useful embeddings.
- Reinforcement Learning. Juggler must adapt to changes in the underlying data patterns. Through Reinforcement Learning, it can continuously adapt the model parameters to the traveler’s actions.
The Greeks and Trojans may not be pleased — but we are certainly a step closer.