Re-platforming of Delivery Search: A Scalability Journey of Trendyol Express

Öner Çiller
Trendyol Tech
5 min read · Nov 4, 2022

Hi everyone,

In this article, we will talk about the evolution of the Trendyol Express delivery search services and their step-by-step transformation from a legacy service into a scalable one.

Background

As Trendyol grows rapidly, delivery plays a critical role in the company’s growth. Trendyol Express is the logistics company of Trendyol; it grows day by day and ships millions of deliveries. It has dozens of management pages and a main Delivery Search Page, shown below.

Delivery Search Page

Re-platforming process of the Delivery Search Service

There are four main pillars in the process of separating the delivery search from the legacy service in the service layer:

  • A dedicated delivery search service
  • Replacement of PostgreSQL with Elasticsearch and Couchbase
  • A delivery search BFF (backend for frontend)
  • The express delivery query service

Legacy Service
In the early days, delivery search was handled by the monolithic Hermes API. Since the delivery volume was low, this search mechanism met our needs and we did not encounter any problems. However, as the delivery volume increased rapidly, the response time of the search page increased and we started to hit timeout errors. We needed to scale the system to keep it manageable. This is the story of how we decided to separate the delivery search from the legacy service and create a dedicated service.

The following diagram shows the new structure of the search service. We separated the delivery search service from the legacy service and replaced PostgreSQL with Elasticsearch and Couchbase as the database stack. The delivery search structure consists of the delivery search service, the delivery search BFF (backend for frontend), and the express delivery query service. The express delivery query service is not a new service; it is another monolithic structure that we use to retrieve all delivery data.

Delivery Search Structure

Why do we use both Couchbase and Elasticsearch for searching?
Since Elasticsearch is a search engine, not a database, it shouldn’t be used as the primary data store. The combination of Elasticsearch and Couchbase gave us an efficient search mechanism.

Elasticsearch and Couchbase mapping examples for delivery search.

As you can see above, potential search fields are mapped to a unique id (deliveryNo) in the Elasticsearch index. When we want to search by a mapped field (e.g. deliveryState), we can retrieve a list of deliveryNos efficiently thanks to the Elasticsearch index. With the list of deliveryNos in hand, the delivery query service (the data source) is called to retrieve the complete delivery data.
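To make the flow more concrete, here is a minimal sketch of what the first step could look like. The host, the index name, and the field names (deliveries, deliveryNo, deliveryState) are illustrative placeholders rather than our actual schema; the request is a plain term query over the Elasticsearch REST API that asks for only the deliveryNo of each hit.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DeliverySearchSketch {

    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // Queries the (hypothetical) "deliveries" index for a given deliveryState
    // and asks Elasticsearch to return only the deliveryNo field of each hit.
    public static String searchDeliveryNosByState(String deliveryState) throws Exception {
        String body = """
            {
              "_source": ["deliveryNo"],
              "query": { "term": { "deliveryState": "%s" } },
              "size": 50
            }
            """.formatted(deliveryState);

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/deliveries/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // The raw JSON response contains hits whose _source holds only deliveryNo;
        // a real service would parse it and pass the ids to the delivery query service.
        return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```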

Since Couchbase stores data as key-value pairs and can efficiently retrieve values for known keys, we used deliveryNo as the key and took advantage of that efficiency.
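The second step, fetching the full documents by key, might look roughly like this with the Couchbase Java SDK; the connection string, credentials, and bucket name are placeholders for illustration.

```java
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.Collection;
import com.couchbase.client.java.json.JsonObject;

public class DeliveryLookupSketch {

    private final Collection deliveries;

    public DeliveryLookupSketch() {
        // Connection string, credentials, and bucket name are placeholders.
        Cluster cluster = Cluster.connect("couchbase://localhost", "user", "password");
        this.deliveries = cluster.bucket("delivery-details").defaultCollection();
    }

    // Key-value lookup: the document key is the deliveryNo coming from Elasticsearch.
    public JsonObject findByDeliveryNo(String deliveryNo) {
        return deliveries.get(deliveryNo).contentAsObject();
    }
}
```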

Before re-platforming the delivery search, the response time was in the range of 2–3 seconds; after re-platforming, we managed to reduce it to the range of 30–70 ms.

Re-platforming process of the delivery data source

The delivery data source was another bottleneck in our system. We were using the express delivery query service to retrieve delivery data (see the delivery search structure diagram above). However, it was a monolithic service connected to a single Kafka topic to handle delivery events.

A growing team
We didn’t have any issues with this service initially, as we were a small team and a handful of services was manageable. However, the team grew rapidly and split into smaller teams. Each team had its own microservices under its responsibility, but since the express delivery query service is a monolithic structure, every team had to modify the same codebase. Therefore, we decided to split the delivery query service into multiple services called translator services. Please see the following diagram for the new data source infrastructure.

As you can see above, each team has a translator service and topics under its own responsibility. A translator service consists of two layers: the data validator and the delivery contracts.

Data Validator
Each team should be able to interact with the data validator from its own translator service (i.e., the data validator needed to be language agnostic). The sidecar application pattern was a natural out-of-the-box solution: the data validator was written as a sidecar application that runs alongside the translator service in the same Kubernetes pod. It reads a ruleset stored in a Kubernetes ConfigMap to determine the validation rules and validates the data coming from the Kafka topics.

Please see the following diagram for the data validator application infrastructure.
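The post does not show the validator’s internals, but as a rough sketch of the idea, the sidecar could load the required fields from a ConfigMap-mounted file and expose a tiny HTTP endpoint so that a translator service written in any language can call it. Everything here (the mount path /etc/validator/ruleset, the port, and the one-required-field-per-line rule format) is an assumption for illustration.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.sun.net.httpserver.HttpServer;

import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class DataValidatorSidecar {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static void main(String[] args) throws Exception {
        // Ruleset mounted from a Kubernetes ConfigMap; path and format are assumptions
        // (here: one required field name per line).
        List<String> requiredFields = Files.readAllLines(Path.of("/etc/validator/ruleset"));

        // Plain HTTP endpoint so translator services in any language can call it.
        HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);
        server.createContext("/validate", exchange -> {
            JsonNode event = MAPPER.readTree(exchange.getRequestBody());
            boolean valid = requiredFields.stream().allMatch(event::hasNonNull);

            byte[] response = ("{\"valid\":" + valid + "}").getBytes();
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(valid ? 200 : 422, response.length);
            exchange.getResponseBody().write(response);
            exchange.close();
        });
        server.start();
    }
}
```

Keeping the contract over plain HTTP is what makes the sidecar language agnostic: each translator service only needs an HTTP client, regardless of the language it is written in.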

Delivery Contracts
After validation, we map the data that comes from the topic to the Delivery Contracts model. Every translator service uses the Delivery Contracts model, so we created a shared library for it.
After the Delivery Contracts mapping, we publish the data to the earth.delivery.cx.delivery-status-updated.0 topic. Every translator service publishes to this same topic.
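As a rough, hypothetical sketch of this flow (the contract fields and the incoming event’s field names are invented here; the real contract type lives in the shared library), a translator service might map a team-specific event into the contract and publish it like this:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class DeliveryContractPublisher {

    // Contract type that would normally come from the shared delivery-contracts library.
    public record DeliveryStatusUpdated(String deliveryNo, String deliveryState, String updatedAt) {}

    private static final String CONTRACT_TOPIC = "earth.delivery.cx.delivery-status-updated.0";
    private static final ObjectMapper MAPPER = new ObjectMapper();

    private final KafkaProducer<String, String> producer;

    public DeliveryContractPublisher(Properties kafkaProps) {
        this.producer = new KafkaProducer<>(kafkaProps);
    }

    // Maps a team-specific event (field names invented for the sketch) into the shared
    // contract and publishes it to the common topic that every translator writes to.
    public void translateAndPublish(JsonNode teamEvent) throws Exception {
        DeliveryStatusUpdated contract = new DeliveryStatusUpdated(
                teamEvent.get("deliveryNo").asText(),
                teamEvent.get("status").asText(),
                teamEvent.get("eventTime").asText());

        producer.send(new ProducerRecord<>(
                CONTRACT_TOPIC, contract.deliveryNo(), MAPPER.writeValueAsString(contract)));
    }
}
```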

Please see the example of the delivery contract mapping below.

Delivery Detail Consumer
The Delivery Detail Consumer is a service that consumes the earth.delivery.cx.delivery-status-updated.0 topic. As mentioned above, every translator service publishes data to this same topic, and the Delivery Detail Consumer stores the data in Couchbase.
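A minimal sketch of such a consumer could look like the following, assuming the record key is the deliveryNo and the record value is the contract JSON; the Kafka and Couchbase connection settings and the bucket name are placeholders.

```java
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.Collection;
import com.couchbase.client.java.json.JsonObject;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class DeliveryDetailConsumerSketch {

    public static void main(String[] args) {
        // Kafka and Couchbase connection settings are placeholders.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "delivery-detail-consumer");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Cluster cluster = Cluster.connect("couchbase://localhost", "user", "password");
        Collection deliveries = cluster.bucket("delivery-details").defaultCollection();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("earth.delivery.cx.delivery-status-updated.0"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // The record key is the deliveryNo, so it doubles as the document key.
                    deliveries.upsert(record.key(), JsonObject.fromJson(record.value()));
                }
            }
        }
    }
}
```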

Conclusion

As a result of this significant re-platforming journey, we have come a long way on both the delivery search side and in the creation of the delivery data source. We have established a sustainable, scalable, and strong structure for the Delivery Search service. With the new delivery data source structure, we have also built a flexible setup where different teams can manage their own topic events and services.
