Optimizing International Content Eligibility Calculations: Challenges and Solutions at Trendyol

Introduction

As of 2023, Trendyol has enabled Turkish sellers to reach international customers. The journey of a product in e-export requires collaboration across multiple teams and processes. This article shares the scale challenges we faced in calculating the sellability of international content and the optimizations we implemented as the Catalog Rule Management team.

Terminology

Before diving into the problem, let’s define some domain-specific terms for better clarity.

Legal Rule

A Legal Rule verifies whether a product can be legally sold between countries. While a product may be sellable in Turkey, it might not be permitted in other countries for various reasons. To manage this process, Legal Rules are defined based on Category and Brand, ensuring legal compliance in international sales.
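
For illustration, a Legal Rule record could be imagined roughly like this (field names and values are hypothetical, not the actual schema):

legalRules:
  - categoryId: 1234
    brandId: 5678
    storefront: DE
    sellable: false   # products of this category and brand cannot be sold to this country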

Coefficient Rate Rule

This rule accounts for factors such as VAT rates and operational costs when selling a product in different countries. Coefficient Rate Rules can be defined at the content, listing, or category level. If multiple rules overlap, the most specific one takes precedence, following this hierarchy: listing > content > category.
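
For illustration, a hypothetical set of overlapping rules for the same storefront might look like the following (field names and values are made up, not the actual rule schema); here the listing-level rate is the one applied, because it is the most specific:

coefficientRateRules:
  - level: category    # least specific: applies to every content in the category
    categoryId: 1234
    storefront: DE
    rate: 1.25
  - level: content     # overrides the category-level rule for this content
    contentId: 987654321
    storefront: DE
    rate: 1.30
  - level: listing     # most specific: this rate wins
    listingId: 555111222
    storefront: DE
    rate: 1.35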

Storefront Attribute Rule

SAR (Storefront Attribute Required) and SAS (Storefront Attribute Suggested) rules determine which attributes are required or recommended for a product in a given storefront, category, and attribute context. If a seller fails to provide an attribute marked as SAR, the product cannot be listed. This data is maintained at the category and storefront levels.
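
As a small illustration, a SAR/SAS definition could be modeled roughly as follows (field names and values are hypothetical):

storefrontAttributeRules:
  - storefront: DE
    categoryId: 1234
    attribute: plug-type
    type: SAR   # required: without this attribute the product cannot be listed
  - storefront: DE
    categoryId: 1234
    attribute: energy-class
    type: SAS   # suggested: recommended but not blocking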

Eligibility Process

The purpose of eligibility calculation is to manage micro-export compliance by filtering products based on:

  • Low-quality content
  • Prohibited products
  • Legal restrictions
  • Cultural concerns
  • Large or hard-to-ship items

A product with eligible = false cannot be sold internationally, whereas a product with eligible = true is sellable but may still be subject to other business processes.

Problem

Each of the rules above can be updated independently, and after each update the eligibility values of international-content need to be recalculated on both a category and a storefront basis.

These updates are not equally urgent: in terms of business priority, the hierarchy is Legal Rule > Coefficient Rate Rule > Storefront Attribute Rule. For example, when a Legal Rule is updated, the change must be reflected in international-content as soon as possible.

In this process, the content-consumer application manages reindexing operations across 60 pods. It consumes all incoming events and re-runs the eligibility calculations. However, since every rule update goes through the same flow, the application's CPU and memory are heavily consumed, tasks of different priorities cannot be scaled separately, and performance suffers.

Old Architecture

Note: The waiting-room application groups category-based events received from different clients according to their scheduled time and produces them only after a threshold period has passed since the createdDate of the first event in each group. This prevents unnecessary event processing.

As the old architecture diagram illustrated, events from different clients were consumed through a single application, which led to scalability problems and delays in delivering business-critical updates.

Solution

To address this, the applications needed to be redesigned based on the Priority Queue Pattern.

Independent Deployments: Each rule type (e.g., Legal Rule, Coefficient Rate Rule, Storefront Attribute Rule, etc.) was assigned to a separate deployment, allowing independent scaling.

Configurable Consumer Status: A configuration mechanism was implemented to control which consumer group can process which topics, ensuring efficient event consumption.

Auto-Scaling with KEDA: KEDA (Kubernetes Event-driven Autoscaling) was used to automatically scale applications based on event load. This minimized resource consumption during low-load periods while scaling up when needed.

In this pattern, we used the same Docker image for separate independent deployments. Each business rule was deployed separately and scaled according to priority.

Independent Deployments

What does our new architecture look like?

(Diagrams: Legal Rule Content Consumer and Storefront Attribute Content Consumer)

As a team, we run all our workflows through the same Docker image. This means that when a new feature is added, the final version of the developed code must be rolled out to each of these applications.

In other words, despite independent deployments, every application in the production environment always uses the latest image.

We have a single repository where all business logic is coded. When the Deploy Prod process is executed via the pipeline, the newly created image ID (e.g., 0fb6d3d771c713c47b00f1e2777950ed40405ede) is deployed to all applications at once.
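
To make this concrete, here is a minimal sketch of what one of these Deployments could look like, assuming a plain Kubernetes manifest in which only the name and the referenced consumer-status configuration differ between rule types (the deployment name, registry path, and ConfigMap name are illustrative, not our actual manifests):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: legal-rule-content-consumer        # hypothetical name; one Deployment per rule type
spec:
  replicas: 3
  selector:
    matchLabels:
      app: legal-rule-content-consumer
  template:
    metadata:
      labels:
        app: legal-rule-content-consumer
    spec:
      containers:
        - name: content-consumer
          # every rule-type Deployment ships the exact same image ID
          image: registry.example.com/content-consumer:0fb6d3d771c713c47b00f1e2777950ed40405ede
          envFrom:
            - configMapRef:
                # only this reference changes per Deployment: it decides which consumers are enabled
                name: legal-rule-consumer-status-config

The other rule-type deployments would be identical apart from the name and the referenced configuration.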

(Diagram: Analysis Rule Changed Content Consumer)

Configurable Consumer Status

At the application level, we have a mechanism to control the activity status of consumers. This allows us to predefine which application will consume which topic. An application can only activate consumers related to the topics it is responsible for.

For example, below is the consumer-status-config configuration for the Analysis Rule Content Consumer application. This config enables only the analysisRuleChanged and reindexInternationalContentByAnalysisRuleUpdate consumers, preventing access to other consumers.

configA:
  enabled: false
configB:
  enabled: false
analysisRuleChanged:
  enabled: true
reindexInternationalContentByAnalysisRuleUpdate:
  enabled: true

Auto-Scaling with KEDA

Would new deployments lead to higher resource consumption?

Potentially, yes. To address this concern, we gradually reduced the number of pods of the old content-consumer while transitioning to this pattern.

Initially, the pod count was 60, which we gradually reduced to 40, 30, and finally 18. The old application was left with only non-critical consumers.

Running multiple applications at maximum scale simultaneously could waste cluster resources. Moreover, since event traffic is not always heavy, the applications did not need to run at full scale continuously.

To solve this, we enabled auto-scaling.

Our internal tool TBP (Trendyol Builder Platform) leverages KEDA (Kubernetes Event-driven Autoscaling) to manage auto-scaling across 2 active zones and 2 data centers.

• Minimum pod count: 3

• Maximum pod count: 10

• If the event lag exceeds 100K, the application scales up to 40 pods (10 * 2 Data Centers * 2 Active Zones).

Example: Auto-Scaling Configuration for Analysis Rule Content Consumer

In this case, if the lag is below 100K, the application runs with 12 pods (3 * 2 Data Centers * 2 Active Zones).

However, if the lag exceeds 100K, the pod count increases to 40, ensuring the system scales efficiently without constant manual monitoring.
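
As a rough sketch, a KEDA ScaledObject expressing these limits for one cluster could look like the following, assuming a Kafka lag trigger (the topic, consumer group, bootstrap servers, and per-replica lag threshold are placeholders, not our actual values):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: analysis-rule-content-consumer-scaler
spec:
  scaleTargetRef:
    name: analysis-rule-content-consumer   # the Deployment to scale in this cluster
  minReplicaCount: 3                       # 3 pods per cluster -> 12 pods across 2 DCs * 2 active zones
  maxReplicaCount: 10                      # 10 pods per cluster -> 40 pods in total
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.example.com:9092
        consumerGroup: analysis-rule-content-consumer
        topic: reindexInternationalContentByAnalysisRuleUpdate
        # illustrative: at roughly 100K lag the desired replica count reaches the 10-pod maximum
        lagThreshold: "10000"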

Thus, when the system is under high load it operates at maximum performance, and when the load is low the consumers remain active at the minimum pod count but mostly idle, preventing unnecessary CPU and memory usage.

Before this optimization, the content-consumer reached a maximum of 200K RPM in a single data center. Since there are two data centers, this equates to 400K RPM in total.

Content-Consumer RPM data in 1 DC before Priority-Based Deployment (Multiplying this data by 2 gives the correct performance. Max 400K RPM)

However, after implementing Priority-Based Deployment and scaling each consumer separately, our system achieved a new record of 8.1M RPM, demonstrating a massive performance gain.

Consume RPM values of the application after Priority-Based Deployment (Max 8.1M RPM)

This new architecture has successfully eliminated scalability bottlenecks, optimized resource usage, and enabled dynamic event-driven scaling, making our system far more efficient and future-proof. 🚀

Conclusion

In this article, we discussed the optimization performed by Trendyol’s Catalog Rule Management team in the international-content flow. We explored how we optimized the processes that needed to be delivered based on business priority and examined the efficiency gains provided by the Priority Queue Pattern and priority-based deployments.

About Us

Would you like to be a part of our growing company? Join us! Check out our open positions and other media pages from the links below.
