Unveiling Rerank: Trendyol’s Approach to a Tailored Search Experience

Published in Trendyol Tech · 14 min read · Jul 30, 2024

In the realm of e-commerce, delivering personalized and relevant search results is crucial for both user satisfaction and the success of a large-scale company like Trendyol. As the Search Relevance Team, our mission is to enhance the search experience by ensuring users are efficiently presented with the most pertinent and personalized search results. We aim to assist users in quickly finding products that align closely with their interests and requirements.

Within the Search Relevance Team, developers and data scientists work closely to achieve these objectives.

This article aims to explore Rerank, a central project within our team, and how it empowers us to deliver personalized and relevant search results to Trendyol users.

Search Before Rerank

Before delving into Rerank, it’s important to understand how Search at Trendyol worked before the Rerank project was introduced. When a user initiates a search, the process starts with Matching, followed by the Relevancy and Smartlisting steps.

Matching: The initial phase of our search algorithm matches products to user queries, identifying products based on their relevance to the provided search terms. Currently, we use exact matching between the query and the product’s title, category, brand, and attribute information to retrieve the products related to a given query from Elasticsearch.

Relevancy: Positioned as the second stage, the relevancy algorithm evaluates the alignment between a specific query and each product category. This algorithm leverages click distributions to calculate category relevance scores. For instance, in an “iPhone” search, smartphones are ranked higher due to the application of category boosting with Relevancy. Without category boosting, products like phone cases with higher metrics such as sales and impressions would appear at the top.
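As a rough illustration of how click distributions can yield category relevance scores, the sketch below normalizes per-category click counts for a single query. The function name, the sample click counts, and the simple normalization are assumptions made for illustration; the production formula is not described in this article.

```python
def category_relevance(clicks_by_category):
    """Turn raw click counts for one query into scores that sum to 1."""
    total = sum(clicks_by_category.values())
    if total == 0:
        # No click data for this query: every category gets a neutral score.
        return {category: 0.0 for category in clicks_by_category}
    return {category: count / total
            for category, count in clicks_by_category.items()}


# Hypothetical click counts for the query "iphone".
clicks = {"smartphone": 800, "phone case": 150, "charger": 50}
scores = category_relevance(clicks)

# The category with the largest share of clicks would be boosted.
best_category = max(scores, key=scores.get)
```

With these made-up counts, the smartphone category receives the highest score, which mirrors the “iPhone” example above.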

Gender boosting is also applied in the Relevancy phase to refine the sorting process with query-specific insights, thereby minimizing the potential for displaying irrelevant results. For example, in a search for “t-shirts”, male users are shown men’s t-shirts more prominently, while female users see women’s t-shirts ranked higher. Thus, products from the Matching phase are ranked by relevant category and, where applicable, gender information.

Gender boosting example for a “t-shirt” search of a male user

Smartlisting: As the third step, products within the same category are ordered by their Smartlisting score, an order/impression prediction produced by a predictive algorithm. Although this method captures the overall performance of a product well, it may not adequately assess how relevant a product is to an individual search or to a user’s preferences.

These steps (Matching, Relevancy, and Smartlisting) remain intact, and Rerank is applied after them. While the Matching, Relevancy, and Smartlisting phases run within the Search API, Rerank is handled by a separate project, the Rerank API.

The Need for Rerank

Need for Personalization

No personalization is applied before Rerank. With Rerank, we aim to personalize search results by incorporating user-related features. Our goal is to prioritize products that align with user preferences, based on users’ historical behavior and demographic information.

Overall performance focus

While Smartlisting effectively measures the global performance of products, it may fall short of capturing the nuanced relevance of products to specific user queries and users' intent for a particular product. This singular focus can lead to the dominance of high-performing products across various queries, overshadowing potentially more relevant and personalized products.

The current method of calculating category relevance scores unintentionally creates category bias. This bias arises from the overemphasis on certain categories during searches, resulting in a skewed and incomplete representation of relevant options. This can cause our relevancy algorithm to become fixated on popular categories, neglecting a broader range of potentially relevant items.

Additionally, the existing relevancy logic sometimes results in only one type of product being highlighted, limiting the effectiveness of category-based relevance. Consequently, for specific queries like “kırmızı elbise” (red dress), category relevance may not accurately reflect user needs. From a user’s perspective, featuring unrelated products prominently can be frustrating and may lead to dissatisfaction, such as showing “socks” at the top of a general “clothes” search.

Relying solely on Smartlisting scores and the limitations of category-based relevancy can result in irrelevant search results, affecting overall user experience and satisfaction.

Therefore, we need Rerank for a better experience of personalization and query relevance.

How Rerank Works

With the Matching, Relevancy, and Smartlisting steps, products are already ranked. If Rerank is not applied, products will be displayed in the order of this ranking. The top N = 1000 products are called the candidate products for Rerank. Rerank just re-ranks those candidate products and determines the final display order of products.

Rerank not only helps display personalized results but also enhances the relevance of displayed products to the query by utilizing interaction features between query terms and products.

Let’s say, with the Matching, Relevancy, and Smartlisting steps, the candidate product ID list was created as follows: 1, 2, 3, 4, 5. In the absence of Rerank, the order that will be shown to the user is 1, 2, 3, 4, 5. However, with Rerank, this order can change completely and for 5 products, 120 different sortings may exist. Each product is assigned a score by Rerank and subsequently arranged from highest to lowest score. For instance, the reordered list after applying Rerank might be 5, 2, 1, 3, 4.
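The example above can be sketched in a few lines. The prediction scores here are made up so that the result matches the order in the text; only the "sort by score, highest first" step reflects what Rerank does.

```python
import math


def rerank(candidate_ids, scores):
    """Order candidates from highest to lowest score (ties keep input order)."""
    return sorted(candidate_ids, key=lambda pid: scores[pid], reverse=True)


candidates = [1, 2, 3, 4, 5]
# Hypothetical prediction scores, chosen to reproduce the order in the text.
scores = {1: 0.61, 2: 0.74, 3: 0.40, 4: 0.12, 5: 0.93}

reranked = rerank(candidates, scores)

# Number of distinct orderings of 5 products: 5! = 5 * 4 * 3 * 2 * 1.
possible_orderings = math.factorial(len(candidates))
```

Here `reranked` is `[5, 2, 1, 3, 4]` and `possible_orderings` is 120.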

Consequently, the search results are presented in the order determined by Rerank. Therefore, Rerank is a crucial step as it determines the final order of the products that will be displayed to the user.

So far, we have described Rerank as a black box. Internally, Rerank employs a Machine Learning model for product ranking. This model takes one or more sets of <user, query, item/brand/category/other derivatives> features as input and generates a prediction score for each item. The prediction scores are then used to establish the final order of products. Rerank can also utilize multiple ML models: an equation that integrates the prediction scores from the various models for the same products can be used to compute the final scores of the products.
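One simple form such an equation could take is a weighted sum of the per-model scores. The sketch below assumes this form purely for illustration; the model names, weights, and scores are hypothetical, and the article does not specify the actual equation.

```python
def blend_scores(model_scores, weights):
    """Weighted sum of per-model prediction scores, computed item by item.

    model_scores maps a model name to its list of scores (one per item,
    all in the same item order); weights maps a model name to its weight.
    """
    n_items = len(next(iter(model_scores.values())))
    blended = [0.0] * n_items
    for name, scores in model_scores.items():
        if len(scores) != n_items:
            raise ValueError(
                f"model {name!r} returned {len(scores)} scores, expected {n_items}"
            )
        for i, score in enumerate(scores):
            blended[i] += weights[name] * score
    return blended


# Hypothetical models and weights for three candidate products.
model_scores = {"ctr_model": [0.2, 0.8, 0.5], "cr_model": [0.6, 0.1, 0.5]}
weights = {"ctr_model": 0.7, "cr_model": 0.3}
final_scores = blend_scores(model_scores, weights)
```

The blended list can then be sorted exactly like a single model's output.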

In the current system, following the Matching, Relevancy, and Smartlisting stages, the Search API forwards the 1000 highest-scored products to Rerank. Therefore, Rerank processes and reranks a maximum of 1000 products. If a user scrolls past the first 1000 products of a search, the remaining products are displayed in the order determined before Rerank was applied.
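The candidate cut-off can be sketched as follows: only the head of the list is reordered, and the tail keeps its pre-Rerank order. The function name and the tiny `limit` used in the example are assumptions for readability; in the system described here the limit is 1000.

```python
def apply_rerank(ranked_ids, score_fn, limit=1000):
    """Rerank only the first `limit` products; keep the rest as they were."""
    head, tail = ranked_ids[:limit], ranked_ids[limit:]
    reranked_head = sorted(head, key=score_fn, reverse=True)
    return reranked_head + tail


# Hypothetical pre-Rerank order and prediction scores.
ranked_ids = [10, 11, 12, 13, 14]
scores = {10: 0.10, 11: 0.90, 12: 0.50, 13: 0.99, 14: 0.20}

# limit=3 stands in for the real limit of 1000 to keep the example small.
final_order = apply_rerank(ranked_ids, lambda pid: scores[pid], limit=3)
```

In this example product 13 would have the highest score, but it stays in fourth place because it falls outside the candidate window.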

Although the Search API receives reranked results for up to 1000 products, the user sees only a portion of them in the UI at a time. When the user starts scrolling, another search request is made to the Search API. Does the Search API then send a request to Rerank again? No. For each user search, the Search API stores the reranked products in a distributed cache for 10 minutes under a key that identifies that user’s specific search journey, which we call the Unique Search Journey Key. Thus, when the user scrolls, the Search API uses the cached ranking instead of requesting a rerank again.
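The caching behaviour can be sketched with a small in-memory stand-in for the distributed cache. Everything named here is hypothetical: the `RerankCache` class, the `journey_key` composition (the article does not say what the Unique Search Journey Key is built from), and the `rerank_fn` callback that stands in for the call to the Rerank API. Only the 10-minute lifetime comes from the text.

```python
import time


class RerankCache:
    """In-memory stand-in for a distributed cache with a fixed TTL."""

    def __init__(self, ttl_seconds=600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # entry is older than the TTL
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)


def journey_key(user_id, query):
    # Hypothetical key: the real Unique Search Journey Key may contain more.
    return f"{user_id}|{query.strip().lower()}"


def get_page(cache, user_id, query, page, page_size, rerank_fn):
    """Serve one page of results, calling Rerank only on a cache miss."""
    key = journey_key(user_id, query)
    ranking = cache.get(key)
    if ranking is None:
        ranking = rerank_fn(user_id, query)
        cache.set(key, ranking)
    start = page * page_size
    return ranking[start:start + page_size]
```

With this shape, the first page of a search triggers one rerank call, and later pages of the same journey are served from the cached ranking until the entry expires.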

An Example Flow of a Search Request with Phases

An illustrative flow of a search request unfolds as follows: when the query “kırmızı elbise” (red dress) is entered in the search box, the system retrieves 35,000 products during the Matching phase. Products categorized as “Dress” undergo category-specific refinement to enhance result relevance. Subsequently, the reranking phase focuses on the top 1000 products with the highest adjusted ratings.
During reranking, the system leverages different features to optimize the sortings of displayed products further. This process aims to prioritize highly relevant and personalized products, optimizing user experience by presenting products tailored to the specific search query and individual preferences.
Finally, products are displayed in the order Rerank’s output determines for the “kırmızı elbise” search.

Separation of Rerank by Services

In the search results on Trendyol, certain products are marked with a “sponsored product” badge, such as the third product in the image below. Internally, we refer to these as “ads.” While ads are also products themselves, we distinguish them as ads to denote that they are sponsored products.

Rerank handles products and ads separately. Products are ranked through requests to the /rerank-products endpoint, whereas ads are ranked through separate requests to the /rerank-ads endpoint. Thus, for the image below, the 1st, 2nd, and 4th products are ranked against each other and the other products, while the 3rd product and the other ads are ranked separately among themselves.

A search result showing an ad: “sponsored product”

Ads and Product Reranks are handled through separate requests due to their distinct requirements and objectives. The ML models used for Reranking Ads and Products, along with their feature sets, are tailored differently to meet specific targets. For example, one model may focus on enhancing the Click-Through Rate (CTR), while another may aim to improve the Conversion Rate (CR). Additionally, our services will extend beyond Ads and Products, for example to Reranking on flash-sale pages, which will involve yet another ML model and a separate request process.

Presently, Rerank is only operational in Turkey. However, we aim to extend its application to other international markets such as GULF, CEE, and more. This expansion will also necessitate the use of different ML models customized for each market’s unique requirements and preferences.

Every business and market has unique needs and objectives, so they require different machine learning models and targets.

High-Level Architecture of Rerank

Initially, our client search API transmits the product IDs to Rerank within the request body. Subsequently, the Rerank API executes Lookup operations, which involve retrieving necessary features from the Feature Store for each item for the ML model. The Feature Store, maintained on a NoSQL Database, serves as our repository for these features. Features are extracted and prepared to accommodate the model’s requirements for each item. These features are then forwarded to the Nvidia Triton Inference Server for an on-the-fly inference process.

Inference is the process of using a trained machine learning model to make predictions or draw conclusions from new data.

The inference request sent from Rerank API to Triton is structured such that each array corresponds to the features of an individual item as follows:

[
  [<feature1>, <feature2>, <feature3>, …], 
  [<feature1>, <feature2>, <feature3>, …], 
  …
]
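A minimal sketch of how such a request body could be assembled from looked-up features is shown below. The feature names, the dictionaries standing in for the Feature Store, and the default value for missing features are all assumptions for illustration; the real feature set and lookup logic are not shown in this article.

```python
def build_rows(product_ids, item_features, context_features, feature_order,
               default=0.0):
    """Build one feature array per product, in a fixed feature order.

    item_features:    product ID -> {feature name: value}
    context_features: user/query features shared by every product
    """
    rows = []
    for pid in product_ids:
        merged = {**context_features, **item_features.get(pid, {})}
        rows.append([float(merged.get(name, default)) for name in feature_order])
    return rows


# Hypothetical features for two candidate products.
item_features = {
    101: {"price": 199.9, "item_ctr_1d": 0.04},
    102: {"price": 349.0},  # item_ctr_1d missing, falls back to the default
}
context_features = {"query_ctr_1d": 0.12}
feature_order = ["price", "item_ctr_1d", "query_ctr_1d"]

rows = build_rows([101, 102], item_features, context_features, feature_order)
```

Each inner list lines up with one product ID in request order, so the scores that come back can be matched to products by position.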

The output Rerank API gets from Triton is roughly as follows, which is the prediction score for each item:

{
  ...
  "data": [0.5230713242013007404, 0.4662513542218167203, ...]
  ...
}

Following inference, a post-processing step refines the results, and the prediction score for each item is returned to the Rerank API as the output. The Rerank API then arranges the products in descending order of these prediction scores before responding to the Search API. Ultimately, products are displayed in search results in the order determined by Rerank.

How Do We Serve ML Models?

Our ML models are deployed using Triton. Triton Inference Server, developed by NVIDIA, is an open-source, high-performance software designed for serving machine learning models in production environments.

We have used LightGBM and XGBoost as our Rerank ML models so far. We use Triton’s Business Logic Scripting (BLS) with the models. Business Logic Scripting (BLS) functions as a Python backend that operates immediately prior to and following the inference of the ML model(s). BLS preprocesses features before forwarding them to the ML model for inference. It also facilitates post-processing of the model’s inference results, enhancing our ability to refine outputs effectively. Pre-process and post-process logics also enable us to use and orchestrate ensembled models together.

What Features do ML models Use When Determining Prediction Scores?

Models utilize a variety of features including:

Item Characteristics: Gender, price, review, item order gender ratio, content price group information, and more.
User History: Brand and category propensities, demographics, etc.
Search Query Data: Aggregated information from search queries.
Query Interactions: Features such as Query x Item, Query x Category, Query x Brand, and Query x Category x Brand in different timeframes like Last 1 Hour, Last 1 Day, etc. including metrics like clicks, CTR, and impressions.

Features are updated in scheduled batches (every 10 minutes, 1 hour, 24 hours, etc.) and are obtained mostly from clickstream data. The user and query-related features sent to the inference server are the same for every item in an individual request. To enhance the effectiveness of Rerank, our Data Science team continuously iterates on the ML models, and we aim to increase feature diversity and efficiency.
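To make the time-windowed interaction features more concrete, the sketch below derives impressions, clicks, and CTR for one Query x Item pair over the last hour and the last day. The event format, window labels, and feature names are assumptions for illustration and do not describe the actual batch jobs.

```python
from datetime import datetime, timedelta

# Hypothetical windows, mirroring "Last 1 Hour" and "Last 1 Day".
WINDOWS = {"1h": timedelta(hours=1), "1d": timedelta(days=1)}


def window_features(events, now):
    """Aggregate (timestamp, kind) events for one query-item pair.

    kind is either "impression" or "click".
    """
    features = {}
    for label, span in WINDOWS.items():
        recent = [kind for ts, kind in events if now - span <= ts <= now]
        impressions = recent.count("impression")
        clicks = recent.count("click")
        features[f"impressions_{label}"] = impressions
        features[f"clicks_{label}"] = clicks
        # Avoid dividing by zero when the pair was never shown in the window.
        features[f"ctr_{label}"] = clicks / impressions if impressions else 0.0
    return features


now = datetime(2024, 7, 30, 12, 0)
events = [
    (now - timedelta(minutes=10), "impression"),
    (now - timedelta(minutes=5), "click"),
    (now - timedelta(hours=3), "impression"),
    (now - timedelta(hours=3), "impression"),
    (now - timedelta(days=2), "click"),  # outside both windows
]
features = window_features(events, now)
```

A batch job running every 10 minutes or every hour could recompute such aggregates and write them to the Feature Store.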

How Do We Ensure Rerank’s Success?

The fundamental success criterion for Rerank is an improvement in user actions, such as conversion rates (CR), click-through rates (CTR), and attributed actions.

At Trendyol, A/B testing is extensively employed to compare two versions of a webpage or app to determine which one performs better concerning a specific metric, such as user engagement or conversion rates. By assigning users to either version A or version B, we can measure the impact of changes and make data-driven decisions.

To initially measure the success of Rerank, we conducted an A/B test where group A did not have Rerank applied, while group B did. Observing a significant improvement in business metrics for group B, we enabled Rerank for all users.

Our Data Science team continuously iterates on the machine learning models. When introducing a new ML model, we conduct further A/B testing to evaluate its performance against the current model. If the new model demonstrates significant improvement based on A/B test results, we replace the existing model to serve the Rerank with the new model for all users. Occasionally, we test multiple new models simultaneously, employing a sort of A/B/C/D testing.
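One common way to judge whether a difference in conversion rate between two groups is significant is a two-proportion z-test. The sketch below uses that test as an assumption; the article does not state which statistical method is used, and the visitor and conversion counts are invented.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, nothing to compare
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: group A without the new model, group B with it.
z, p_value = two_proportion_z_test(conv_a=500, n_a=10_000,
                                   conv_b=570, n_b=10_000)
is_significant = p_value < 0.05
```

A positive `z` with a small `p_value` suggests that group B converts better than group A by more than chance alone would explain.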

Some metric results from the testing of the recent Rerank Ads model iteration:

Below you can see a few examples of how we can show more relevant & personalized results with Rerank:

An example of displaying more relevant results thanks to query-item relevance

An example of displaying personalized results with Rerank-Ads

Some Challenges of Rerank

Maintaining Query Relevance in Personalization: While enhancing product results with personalization, it is essential to retain their relevance to the search term, necessitating the selection of appropriate targets and features.
Predicting Performance for Low-Impression Products: Accurately estimating the performance of products with low impressions is crucial, and it is hard because little user interaction data exists for them. Correct predictions in model simulations are therefore imperative.
Improving Candidate Selection: The selection of candidate products for Rerank can be optimized. The more effectively candidates are chosen, the greater the potential for Rerank to succeed. For instance, while Rerank can elevate the position of the 1000th product to the top, it cannot do the same for the 1001st product, even if it would yield a more personalized and relevant result for the user’s search, simply because it was not included in the candidate products list.
Position Bias: Products displayed at the top of the search results tend to have higher metrics such as impressions and clicks, so the model may suffer from position bias. Our Data Science team is working to mitigate this issue within the Rerank model.
Using one ML Model for All Categories: We use a single Rerank model for all categories, which requires comprehensive feature engineering to address the diverse needs of various categories. Handling this broad range of categories effectively is a complex task.

Future Plans

Utilizing Vectors in Search Relevance: Moonshot and Prerank

We aim to implement a two-tower architecture through the Moonshot and Prerank projects, leveraging user and item vector embedding similarities to enhance search results presentation.

The two-tower architecture is composed of two distinct towers: one dedicated to the user and the other to the item. Utilizing deep neural networks, it learns high-level abstract representations for both users and items based on their past interactions. The output measures the similarity between the user embedding and the item embedding, indicating the user’s interest in the given item.
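The scoring step at the end of the two towers can be sketched as a similarity between two vectors. The embeddings below are made up, and cosine similarity is an assumption; the text only says that the output measures similarity, and a plain dot product is another common choice.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical outputs of the user tower and the item tower.
user_embedding = [0.9, 0.1, 0.3]
item_embeddings = {
    "dress": [0.8, 0.2, 0.4],
    "socks": [-0.5, 0.9, 0.0],
}

similarities = {
    item: cosine_similarity(user_embedding, vector)
    for item, vector in item_embeddings.items()
}
ranked_items = sorted(similarities, key=similarities.get, reverse=True)
```

An item whose embedding points in a similar direction to the user embedding gets a higher similarity, which can be read as stronger predicted interest.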

Moonshot: Integrating vector embedding similarity as a feature in the Rerank process.

Prerank: Improving the candidate pool selection of products before Rerank.

We have started testing these projects and are refining them through iterative improvements. The iterations are divided into phases, and upon completion, we plan to phase out the Smartlisting and Relevancy algorithms. The ultimate search process is expected to follow the steps of Matching -> Prerank -> Rank -> Rerank (with a smaller candidate size), utilizing user-query and item vector embeddings at each stage.

Using Real-time Features in Rerank

The features currently stored in our Feature Store are updated in scheduled batches. However, for more accurate prediction scores in Rerank, we aim to incorporate real-time user features such as visits, favorites, baskets, and orders. This approach will enable us to provide more real-time predictions.

The ultimate point we want to reach is to achieve event-based and real-time request predictions. Through these iterations, we aim to establish a high-throughput, low-latency infrastructure for real-time personalization.

Expanding Rerank’s Sphere of Influence

Include Anonymous Users: Currently, only logged-in users can see Reranked results. However, we aim to extend this capability to anonymous users (logged out, so the userID is unknown) as well, for better relevancy and personalization. Even for anonymous users, we aim to apply personalization to some extent by using their in-session activity features.

Rerank in Different International Markets: Presently, Rerank is active only in Turkey. However, given Trendyol’s significant growth in the international market, we aim to enable Rerank in regions such as the GULF, CEE, AZ, and DACH markets. This expansion will allow us to deliver more personalized and relevant search results, thereby enhancing our business metrics in these areas.

Rerank in More Context: Currently, Rerank is enabled only on the main search page. However, we plan to extend Rerank to additional pages, such as Flash Sales and Homepage Widgets that are empowered by search behind the scenes. By doing so, we can apply Rerank to a wider range of search results, providing our users with a more relevant and personalized search experience.

Unified Platforms for Feature and ML Model Serving

We have developed and managed our Feature Store and handle the serving and maintenance of ML models for inference as stated. Currently, we are collaborating with ML Platform and Feature Store teams to develop unified platforms for Feature and ML Model serving that can be utilized across Trendyol. Our goal is to transition these platforms to the relevant teams, thereby enhancing the productivity of both Data Scientists and Developers and contributing to the internal ecosystem.

Conclusion

In this article, I have tried to cover Rerank, a project developed by the Search Relevance team to deliver more personalized and relevant results to our users. Additionally, I have outlined some of the future work we plan to undertake in the next iterations. If you are interested, you can watch the meetup where our Data Science team discusses Rerank and other projects we are working on. Stay tuned for upcoming articles on interesting topics from the Search Relevance team!

Hope you enjoyed reading this article. Thank you for your time. Please do not hesitate to contact me with any questions or feedback.
