How We Prepare a 2M Sellable Product Excel Report in Seconds with a Scalable and Fault-Tolerant System — Part 1

Eren Arslan · Published in Trendyol Tech · Jun 27, 2022 · 3 min read

At Trendyol, scaling is everything, and it is one of the things we think about the most. Our systems must be scalable and able to tolerate failures.

As the GlobalPlatforms team, we are responsible for product sales on our global websites. Unlike Trendyol, we manage our own product catalog, and we have a Product Management UI for listing, filtering, and so on. Many teams use these screens.

We offer complex search features for fine-grained filtering, and the Business Team wants reports of these filtered products. We handle these requests with the Catalog Read API, which is backed by Elasticsearch and supports quite complex queries.
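As a purely hypothetical sketch of such a filtered search (the field names below are invented for illustration, not Trendyol's actual product schema), an Elasticsearch bool query could be assembled like this:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildQuery assembles a hypothetical filtered product search in
// Elasticsearch's query DSL. The field names ("status", "price") are
// invented for illustration; they are not the real schema.
func buildQuery(from, size int) map[string]any {
	return map[string]any{
		"query": map[string]any{
			"bool": map[string]any{
				"must": []any{
					map[string]any{"term": map[string]any{"status": "SELLABLE"}},
					map[string]any{"range": map[string]any{"price": map[string]any{"gte": 10, "lte": 100}}},
				},
			},
		},
		"from": from, // pagination offset
		"size": size, // page size
	}
}

func main() {
	b, _ := json.MarshalIndent(buildQuery(0, 100), "", "  ")
	fmt.Println(string(b))
}
```

The same query body can then be sent as the payload of an Elasticsearch `_search` request.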

We also support five languages for our products, so these reports need different language options. This is where the Product Translation API comes into play: we manage our translations with this API.

First, let’s think about it very simply: fetch all the products, then fetch the translations of those products, prepare a basic CSV file, and upload that file to OpenStack.

But…

Consider that we have over 2M products, and the number keeps growing. Fetching them all in one request is impossible. With pagination, it is still practically impossible to send ~20K requests without a single error, and even if you managed to, holding all that data in memory or on disk is a problem: the result is a file of almost 2GB. Not to mention the time it would take to prepare the report that way. This is neither a scalable nor a fault-tolerant approach.

Our Solution

We created a Reporting API that manages our Reporting domain. We wrote it in Golang and applied DDD. We produce domain events via the Couchbase Kafka Connector. You can find the details here.

We created another application, the Reporting Consumer, which listens to the API's events. This application is written in Golang too.

After some filtering, a user sends a Report-Create request from the UI. We create a report with the initial status IN_PROCESS. This request includes the complex query information. We then produce a domain event such as report-created and pass the query information into that event.

Now we consume the report-created event. We create a manager object for each report; let's call it a ReportJob. We divide the query into pages of 25 (the size is configurable) and call each page a Task. For example, for 100 products we create 4 tasks. We keep the status of every task in the ReportJob, then produce all tasks to Kafka concurrently as task-created events. Each task represents a different page. I want to stress this: we haven't fetched any data yet, and we haven't called any external API. We have only divided the pages into tasks and produced them.

Don’t forget that all consumers should be idempotent.
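One simple way to get idempotency is to record which event IDs have already been processed and skip redeliveries. This is only a sketch: the in-memory map below stands in for a persistent store, and the function names are assumptions.

```go
package main

import "fmt"

// processedEvents simulates a dedupe store. In production this would be a
// persistent, shared store; an in-memory map is just for illustration.
var processedEvents = map[string]bool{}

// handleEvent runs process at most once per eventID, making the consumer
// idempotent under Kafka's at-least-once delivery.
func handleEvent(eventID string, process func()) bool {
	if processedEvents[eventID] {
		return false // already handled; safe to skip the redelivery
	}
	process()
	processedEvents[eventID] = true
	return true
}

func main() {
	count := 0
	handleEvent("task-1", func() { count++ })
	handleEvent("task-1", func() { count++ }) // duplicate delivery, ignored
	fmt.Println(count) // 1
}
```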

Now we consume the task-created event in the Reporting Consumer.

For each task event, we make a paginated call to the Product Read API with the query that came in the event and get the current page's results. Then we need to translate them, so we call the Product Translation API, which easily gives us the translated products.
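Per task, the flow is just "fetch one page, translate that page". A minimal sketch, assuming hypothetical client interfaces for the two APIs (the types and fakes below are inventions for illustration):

```go
package main

import "fmt"

// Product and TranslatedProduct are simplified; real payloads are richer.
type Product struct{ ID, Name string }
type TranslatedProduct struct{ ID, Name, Language string }

// ProductReader and Translator stand in for the Product Read API and the
// Product Translation API clients; their signatures are assumptions.
type ProductReader interface {
	Search(query string, page, size int) []Product
}
type Translator interface {
	Translate(products []Product, lang string) []TranslatedProduct
}

// processTask handles one task-created event: it fetches exactly one page
// of results and translates it. No other pages are touched by this task.
func processTask(r ProductReader, t Translator, query, lang string, page, size int) []TranslatedProduct {
	products := r.Search(query, page, size)
	return t.Translate(products, lang)
}

// Fake implementations so the sketch runs end to end.
type fakeReader struct{}

func (fakeReader) Search(q string, page, size int) []Product {
	return []Product{{ID: "p1", Name: "kettle"}}
}

type fakeTranslator struct{}

func (fakeTranslator) Translate(ps []Product, lang string) []TranslatedProduct {
	out := make([]TranslatedProduct, 0, len(ps))
	for _, p := range ps {
		out = append(out, TranslatedProduct{ID: p.ID, Name: p.Name, Language: lang})
	}
	return out
}

func main() {
	result := processTask(fakeReader{}, fakeTranslator{}, `{"status":"SELLABLE"}`, "de", 0, 25)
	fmt.Println(len(result), result[0].Language) // 1 de
}
```

Keeping the per-task work this small is what bounds memory use: a consumer only ever holds one page of 25 products, never the full 2M.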

Now we need to create a file, upload it to OpenStack, and mark the task as completed. But there are a lot of good tricks you should know :)

Thanks for reading part 1. You should keep reading with parts 2 and 3.
