EXPEDIA GROUP TECHNOLOGY — DATA
The Juggler Model: Balancing Expectations in Lodging Rankings
Expedia Group’s ranking method for keeping travelers, properties and the marketplace satisfied
“You cannot please the Greeks and the Trojans.”
The argument behind this famous saying is that it is impossible — and even counterproductive — to try to fulfill everyone’s expectations. So, why bother trying?
At Expedia Group™️, we’ve been researching machine learning solutions to balance multiple stakeholders’ expectations in our Lodging Marketplace. This blog post introduces the Juggler model, a Multi-Stakeholder Ranking solution so named because it juggles the importance of relevance and business adjustments in the final ranking.
Lodging Ranking basics
The Lodging Search process starts with a traveler’s query expressing their needs. The search request typically contains a destination, check-in and check-out dates, and number of travelers.
The traveler query triggers a series of services that lead to the Property Search Results page. Here, the properties are displayed in decreasing order of importance.
Then, we track the traveler’s interactions with the page, focusing mainly on clicks and bookings. In this example, we know the traveler clicked two properties (green signs) and ended up booking the second one (in blue).
As a result, we can create the Lodging Ranking dataset. An attribution logic connects all tracked events to their originating search, and the data combine multiple sources: Searches, Properties and Travelers.
Lodging Marketplace stakeholders
Most e-commerce marketplaces cater to three broad stakeholder groups: Producers, Consumers and System. In Lodging Ranking, these refer to the properties, travelers and Expedia Group’s marketplace, respectively. The marketplace provides tools so that travelers can find the most suitable property that meets their requirements.
Although the marketplace can be simply explained, its dynamics cannot. Much like the Greeks and Trojans, stakeholders have competing objectives that dictate how the market operates. Consider some examples:
- Travelers want the best available property at the lowest price. However, we know that “best” does not mean the same for everyone, everywhere, all the time. Also, different customers have different price sensitivity. So, we build models that can predict customer needs in any given search context and use them to rank the properties.
- Properties want to have the best exposure while paying the lowest fees for it. However, this is not always possible, depending on the market conditions in which they operate. To aid them, we provide mechanisms to improve their rankings in exchange for fees or special deals, and we coach partners on how to improve their service so that they bring more value to the marketplace.
- Finally, the system wants to sell the most profitable inventory to the largest number of customers. It needs to balance short-term conversion goals (i.e., convert a search) with long-term objectives (increase repeat rate), adapt to different traveler preferences and market conditions and respond to external factors quickly and effectively.
Addressing stakeholder dynamics in a way that ensures collective satisfaction and marketplace health is a complex subject and a matter of continuous research.
Multi-stakeholder ranking
In Lodging Ranking, machine learning models consider traveler needs when ranking properties. However, optimizing for traveler relevance alone hurts long-term marketplace health. Thus, we need to combine stakeholders’ needs for the final ranking.
We tackle this issue by creating a composite score given by the sum of independent components, including the relevance score and other business adjustments used to optimize the ranking for different stakeholder objectives. The formula looks like this:

score(property) = relevance + adjustment₁ + adjustment₂ + … + adjustmentₙ
The final ranking is obtained by sorting the properties in decreasing order of composite score. If all goes well, the most relevant properties stay in the top positions, with slight adjustments that move properties that ensure long-term marketplace health closer to the top.
Good business adjustments are hard to define. On the one hand, they must be effective; on the other, business adjustments must be easy to interpret and explain to stakeholders. For instance, we must be able to explain to property managers why a specific property is assigned a given ranking and advise on what actions to take to improve it.
So, how do we keep all stakeholders happy, while maintaining the current ranking mechanism? Our answer is to use linear scalarization: each adjustment is assigned a weight, dictating how much it can contribute to the final ranking. The scoring formula now becomes:

score(property) = w₀ · relevance + w₁ · adjustment₁ + … + wₙ · adjustmentₙ
For instance, in a particular search where conversion probability is high, we may want to be more aggressive in terms of profitability — in this case, we may want to increase the business adjustments in charge of this objective. In other cases, we may prefer to be more conservative and focus most on relevance, such that we maximize conversion probability. To do this, we need only to increase the relevance weight.
Consider a simple example of three properties, each with specific scores for each adjustment. Notice we represent only one adjustment for simplicity, encoding all remaining adjustments under the “…” column. In this case, the final score is simply the sum of all components. The ranking is then created by sorting the properties’ scores in decreasing order.
With Juggler, we include a pair of weights in the calculations. Although these weights are the same for all properties in a given search, they can affect the overall score substantially. In this example, the new scores produce a completely different final ranking: it changed from 3–1–2 to 2–3–1.
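The mechanics of that example can be sketched in a few lines of Python. The relevance and adjustment scores below are made up for illustration (the real scores and business adjustments are internal), but they reproduce the same 3–1–2 → 2–3–1 flip:

```python
# Hypothetical relevance and (single) business-adjustment scores.
properties = {
    "P1": {"relevance": 0.7, "adjustment": 0.1},
    "P2": {"relevance": 0.4, "adjustment": 0.3},
    "P3": {"relevance": 0.9, "adjustment": 0.2},
}

def rank(props, w_rel=1.0, w_adj=1.0):
    """Sort property names by weighted composite score, best first."""
    def score(p):
        return w_rel * p["relevance"] + w_adj * p["adjustment"]
    return sorted(props, key=lambda name: score(props[name]), reverse=True)

print(rank(properties))                        # → ['P3', 'P1', 'P2']
print(rank(properties, w_rel=0.2, w_adj=2.0))  # → ['P2', 'P3', 'P1']
```

With default weights the plain sum ranks P3 first; shifting weight from relevance to the adjustment promotes P2 to the top, even though no per-property score changed.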
Now, all we have left to do is to find appropriate weight parametrizations for any search the travelers perform on our marketplace. Easy, right?
Juggler: A meta-learning framework
Intuitively, we know there is a relationship between the search context and the ideal adjustment weight parametrization. For instance, when there are fewer properties to choose from, relevance is likely most important. Conversely, when there are several good options, there is added value in enforcing the objectives behind business adjustments.
Although this is certainly true in most cases, it is not the whole story. To explore and leverage the complex patterns that govern Multi-Stakeholder Ranking, we propose the “Juggler” model. Juggler is inspired by the algorithm selection task from Meta-Learning, whose goal is to find models that map dataset characteristics to the best algorithms.
To apply the methodology to the Lodging Ranking domain, we establish the following parallels:
- Instead of datasets, we focus on searches. In our context, a search is a complex data structure, comprising several entities and their relationships: Search, Property and Traveler. We must then find appropriate ways to describe them to derive the search context.
- We do not select the best algorithms, but rather the best weight parametrizations. Like algorithms, a different parametrization can lead to a different ranking and, by extension, to different performance. Therefore, different parametrizations may be useful under different conditions. We just need to discover when …
The resulting model learns the mapping between search context and weight parametrizations. This enables us to predict the best parametrization based on what happened in similar searches and, after plugging it into the scoring formula, re-rank the items accordingly.
Below we can find the flowchart representing the tasks required to prepare the data, build the model and use it in inference.
Juggler @ Expedia Group
Such a framework can be used in other Ranking tasks. We will leave that topic for future blog posts — for now, we focus on the existing Juggler model by addressing some implementation questions.
Defining the search context
Recall that the Lodging Ranking data comprises Search, Property and Traveler data. In this stage, we employ suitable data summarization techniques for each entity to derive a rich search context.
While Search and Traveler features are essential to define a search context and can be used as is, they are not fine-grained enough to enable good predictive performance. Thus, we must go a step further and summarize the properties in a search into single measures. For instance, we describe the price distribution via a histogram. We apply similar operations to other property characteristics to ensure complex patterns are detectable.
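As a sketch of this summarization step, the per-property price column of one search can be reduced to a fixed-length feature vector. The prices, bin edges and chosen statistics below are illustrative, not the production feature set:

```python
import numpy as np

# Hypothetical nightly prices of the properties returned by one search.
prices = [79.0, 85.0, 92.0, 110.0, 150.0, 210.0, 480.0]

# Normalized histogram: fraction of properties in each price bucket.
bin_edges = [0, 100, 200, 300, 500, float("inf")]
counts, _ = np.histogram(prices, bins=bin_edges)
hist_features = counts / counts.sum()

# Append a few distribution statistics to form search-level features.
context_features = list(hist_features) + [
    float(np.median(prices)),
    float(np.std(prices)),
]
```

Whatever the number of properties in a search, the output has a fixed length, which is what lets these summaries be fed to a downstream model alongside the Search and Traveler features.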
Finding the ideal weight parametrizations
The ideal parametrizations per search are found via simulations. To do this, we recreate the rankings and evaluate them against an objective policy. A policy is simply a set of ranking metrics. For instance, NDCG is commonly used to evaluate ranking relevance, but several useful metrics inspect serendipity, fairness, novelty, etc. All of them can be combined into a policy. The simulation involves several steps:
- Load search data. The data coverage should be large enough to cover multiple segments and capture seasonality effects.
- Apply the scoring formula, using different parametrizations. We test a wide range of options, baselining them against the default option.
- Evaluate the new ranking, using the objective policy. In the case of ties, we choose the parametrization closest to the default.
- Select the best parametrization. This is achieved simply by looking up the weight parametrization with the best score in the objective policy.
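The steps above can be sketched as a small grid search. Everything here is a toy stand-in: two weights only, an illustrative candidate grid, and a reciprocal-rank-of-the-booking policy in place of real metrics such as NDCG:

```python
# Toy search: per-property scores plus the booking label used by the policy.
search = [
    {"relevance": 0.7, "adjustment": 0.1, "booked": 0},
    {"relevance": 0.4, "adjustment": 0.3, "booked": 1},
    {"relevance": 0.9, "adjustment": 0.2, "booked": 0},
]

DEFAULT = (1.0, 1.0)
CANDIDATES = [(w_rel, w_adj) for w_rel in (0.1, 1.0, 2.0)
                             for w_adj in (0.2, 1.0, 2.0)]

def policy(ranked):
    """Toy objective policy: reciprocal rank of the booked property."""
    for position, prop in enumerate(ranked, start=1):
        if prop["booked"]:
            return 1.0 / position
    return 0.0

def distance(weights):
    """Squared distance to the default parametrization, for tie-breaking."""
    return sum((a - b) ** 2 for a, b in zip(weights, DEFAULT))

def best_parametrization(props):
    def evaluate(weights):
        w_rel, w_adj = weights
        ranked = sorted(
            props,
            key=lambda p: w_rel * p["relevance"] + w_adj * p["adjustment"],
            reverse=True,
        )
        return policy(ranked)
    # Best policy score wins; ties go to the candidate closest to the default.
    return max(CANDIDATES, key=lambda w: (evaluate(w), -distance(w)))

print(best_parametrization(search))  # → (0.1, 1.0)
```

The selected label for this search is the parametrization itself; repeated over many searches, these labels become the targets of the supervised step described next.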
Defining and using the Juggler model
With features and labels clearly identified, the problem reduces to a classical Supervised Learning task, and finding a Juggler model becomes straightforward. We use a gradient-boosted trees classifier, due to its superior performance in offline testing.
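A minimal sketch of that supervised step, assuming the search-context features and best-parametrization labels from the simulation are already materialized (random toy data below, with scikit-learn's `GradientBoostingClassifier` standing in for the production model):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Toy meta-dataset: one row per historical search.
# X = search-context features; y = index of the best weight parametrization
# found by the simulation for that search.
PARAMETRIZATIONS = [(1.0, 1.0), (0.5, 1.5), (1.5, 0.5)]
X = rng.normal(size=(200, 6))
y = rng.integers(0, len(PARAMETRIZATIONS), size=200)

meta_model = GradientBoostingClassifier(n_estimators=50, random_state=0)
meta_model.fit(X, y)

# Inference: describe the incoming search, predict a parametrization,
# then plug the predicted weights into the scoring formula before sorting.
new_search_context = rng.normal(size=(1, 6))
w_rel, w_adj = PARAMETRIZATIONS[int(meta_model.predict(new_search_context)[0])]
```

Treating the parametrization as a class label keeps the output space small and auditable; each predicted class maps back to one concrete, explainable set of weights.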
Once placed in production, the model inspects new searches and predicts the most appropriate weight parametrizations. It finds generalizable patterns that direct the search results toward more appropriate rankings.
More details about the model implementation are available in the RecSys 2021 workshop paper here.
Conclusions and next steps
The Juggler model has consistently improved business performance over the years and we are now expanding it to other ranking problems. Additionally, we will explore other ways to enhance the model:
- Personalization. Juggler will use more detailed Traveler information and continuously adapt the weights to their needs. For instance, the ranking should be adapted to the customer’s price sensitivity.
- Deep Learning. The search context is the most critical component in the Meta-Learning model. As such, there is great potential in using Deep Learning architectures to derive useful embeddings.
- Reinforcement Learning. Juggler must adapt to changes in the underlying data patterns. Through Reinforcement Learning, it can continuously adapt the model parameters to the traveler’s actions.
The Greeks and Trojans may not be pleased — but we are certainly a step closer.