Trendyol Coupon Journey: Surviving 500M Active Coupons

Süleyman Can
Published in Trendyol Tech
11 min read · Jun 6, 2022

In this series, we will describe how Trendyol Coupon has grown, covering the technical and business details of our features and test processes. In this first post, I will explain how we create and archive coupons, how we find the coupons applicable to a basket, how we survive with 500M active coupons, and how we develop features using data and user feedback.

What is a Trendyol Coupon, and what is it not?

A coupon is one of the discount tools at Trendyol. The coupon discount is applied to the shopping basket amount. A coupon may have conditions that restrict its scope, such as a brand or a category. At Trendyol, a coupon can be created with more than ten different conditions. A promotion code is not a coupon; it is a separate discount tool (e.g., WELCOME123).

How do we work with clients for coupon creation?

Our sellers and the CRM (Customer Relationship Management) team can create coupons for customers. As the coupon team, we provide coupon creation services to both. While sellers create coupons, we track their experience on the coupon creation screen with Google Analytics events and use that data to improve the coupon creation flow.

Coupon creation requests from sellers over the past 24 hours

We monitor the number of daily coupon creation requests from sellers as a KPI. We compare the coupon usage ratio with the number of coupons created, and by interpreting these metrics we recommend more effective coupon definitions. For example, we have developed a matrix that helps sellers choose the lower limit and discount according to the average basket amount.

How do we create seller demand coupons?

Sellers can submit a coupon request, which we call a Demand. With it we receive information such as budget, lower limit, discount, and conditions from the seller. In proportion to the seller's budget, we determine the targeted customers with intelligent algorithms and create coupons. In this section, I will explain how we create customer coupons from the sellers' coupon requests (Demands).

Demand Coupon Create Flow

Demand requests are shared, through automation scripts, with the team responsible for managing customer services (transition 2 in the flow). The CRM team creates the coupon's meta information (name, image links, etc.) and assigns coupons to targeted customers by sending a POST request to coupon-api (transition 4). We use an intelligent algorithm to target the customers who might be interested in the seller's coupons, and the CRM team works with this data to create coupons for them.

We can create an average of 8M coupons per hour with our demand coupon creation architecture. We have a fallback mechanism for producing messages to Kafka and consuming them from Kafka. To see the results, you can check the 'Discount Coupons' page and notifications on Trendyol. Before November 2021, we were creating an average of 30M coupons per day.

How do we archive coupons?

Since the first day of the Coupon domain, we have been moving expired coupons to a second database. We keep them there for the legal retention period and then delete them permanently. Our database team supports us in the archiving process.

At first, we found expired coupons through a secondary index on the coupon bucket in Couchbase. As the coupon bucket grew, the index grew with it and needed more resources, which put extra load on Couchbase. So we needed a different design. After a lot of work, we decided on the following system design:

Coupon Archive Flow

We save the endDate and id of each created coupon to another database, which we call CouponArchiveOps. This lets us track expired coupons outside Couchbase, and with the id we can run key-value queries against Couchbase. With this architecture, we archive an average of 10M coupons a day; when we tested the limits, we archived up to 70M coupons. When we used the secondary index on Couchbase, we had timeout problems and our customers were affected by the archiving process. The new architecture solved this problem.
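The archive flow described above can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: the three stores are plain in-memory Python structures standing in for the Coupon bucket, the CouponArchiveOps database, and the archive database, and the function names and `batch_size` parameter are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory stand-ins for the three stores in the flow.
coupon_bucket = {}    # main Coupon bucket: id -> document (key-value access)
archive_ops = []      # CouponArchiveOps rows: (end_date, coupon_id)
archive_bucket = {}   # second (archive) database: id -> document


def create_coupon(coupon_id, end_date, doc):
    """Write the coupon and record (endDate, id) in the ops store."""
    coupon_bucket[coupon_id] = {**doc, "endDate": end_date}
    archive_ops.append((end_date, coupon_id))


def archive_expired(now, batch_size=1000):
    """Move expired coupons to the archive without any index on the bucket."""
    # Expiry is decided from the ops store, not from the main bucket.
    expired = sorted(row for row in archive_ops if row[0] < now)[:batch_size]
    for row in expired:
        _, coupon_id = row
        doc = coupon_bucket.pop(coupon_id, None)  # key-value get + delete
        if doc is not None:
            archive_bucket[coupon_id] = doc
        archive_ops.remove(row)
    return len(expired)


now = datetime(2022, 6, 6)
create_coupon("c1", now - timedelta(days=1), {"discount": 20})  # expired
create_coupon("c2", now + timedelta(days=7), {"discount": 50})  # still active
moved = archive_expired(now)
```

The point of the design is that the main bucket is only ever touched by id, so the archiving job no longer competes with customer traffic for index resources.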

Timeout problems when living with millions of coupons

We provide coupon creation screens to sellers and give the CRM team the freedom to create coupons. As a result of this freedom, during the November 2020 events we had approximately 300M active coupons in our database. While living with 300M coupons, we experienced timeout problems: as the number of coupons per customer increased, we spent more time in the database, which caused the timeouts.

We had an incident that lasted 12 hours because of the timeouts. 🔥 During the incident, we were unable to apply coupons to baskets, so our customers could not take advantage of discounts. Together, the DBA and Coupon teams investigated the root cause as follows:

  • Using the logging capabilities of the Couchbase SDK, we traced a query's passage from the application layer to the database layer. We saw that we were losing time in the network layer.
  • In addition to examining the SDK level, we met with Couchbase consultants and reviewed our Coupon Couchbase use cases. We listened to their suggestions while trying to find the root cause of the problem, and we scaled at the DB level.
  • We made hardware improvements on the DB side: we moved to another cluster, the one we are currently using.
  • When we query Couchbase, we expect the response to come from RAM. Our goal was to keep the Couchbase bucket's resident ratio above 90%. The resident ratio is the percentage of active items in the bucket that are cached in RAM. When we send a key-value query and the document is in RAM, Couchbase does not go to disk.
  • We were storing multiple types of data in one bucket, so we split the data into separate buckets. We also stopped writing null and empty fields to the database, handling this at the application level. We ended up with a much smaller data size in our buckets, which allowed us to keep more data in RAM.
  • We reviewed our need for indexes and rewrote some queries as key-value queries, which reduced our need for the index service.
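One of the steps above, not writing null and empty fields, can be handled at the application level just before a document is serialized. The sketch below is only an illustration of that idea; the `compact` helper and the field names in the sample document are hypothetical, and code that reads the document back is assumed to treat a missing field as null or empty.

```python
import json

EMPTY_VALUES = (None, "", [], {})


def compact(value):
    """Recursively drop None, empty strings, and empty containers."""
    if isinstance(value, dict):
        cleaned = {key: compact(item) for key, item in value.items()}
        return {k: v for k, v in cleaned.items() if v not in EMPTY_VALUES}
    if isinstance(value, list):
        cleaned = [compact(item) for item in value]
        return [item for item in cleaned if item not in EMPTY_VALUES]
    return value  # numbers (including 0) and booleans are kept as-is


# Hypothetical coupon document with several unused fields.
coupon = {
    "id": "c1",
    "discount": 20,
    "brandIds": [],
    "description": "",
    "categoryId": None,
    "conditions": {"minBasket": 100, "sellerId": None},
}

stored = compact(coupon)
saved_bytes = len(json.dumps(coupon)) - len(json.dumps(stored))
```

Across hundreds of millions of documents, even a few dozen bytes saved per document leaves more of the bucket resident in RAM.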

As a result of these improvements at the scaling and application levels, we reduced our timeouts. Within the scope of Trendyol's Multi DC Project, we observed further considerable improvements: the timeouts are almost gone. Moving to a new data center with new hardware confirmed our finding that the timeouts originated in the network layer.

On the Way to 500M Active Trendyol Coupons

In 2021, we served an average of 300M active coupons. Our target for the November 2021 events was to survive with 500M active coupons. Before the November events, we measured our current state with performance tests. In addition, we increased the number of coupons in the DB to simulate 500M active coupons, as expected at BF (Black Friday) time. By applying a stress test, we observed that our system works healthily with 500M active coupons.

Birth of new coupon ways: We live with data and feedback

In the 2021 Trendyol Customer Satisfaction Surveys, we found that customers were not satisfied with the coupon discount type. At Trendyol's 2021 Award Ceremony Meeting, we received a negative award. This feedback and data pushed us to improve the coupon tools.

The biggest source of our customers' dissatisfaction was that they were not actually interested in the coupons they were given. As mentioned above, coupons were assigned to customers by our targeting algorithm, so customers had no say in what kind of coupons they received. As the coupon team and stakeholders, we dreamed of a new coupon system that gives customers the freedom to choose, letting them collect the coupons they are interested in.

In the second half of 2021, we developed the 'Follow the seller store and win coupon' and 'Collect coupon from product detail' features to give customers the freedom to choose coupons. For Trendyol events (e.g., 11.11 or Black Friday), we created the 'Mega Coupons' page, where the most attractive coupons are listed. We developed the 'Wheel of Fortune Project' so customers can collect coupons by playing a game. Our architecture and the details of the new tools will be covered in the next posts of the series. The 'Collect coupon from product detail' feature increased Trendyol coupon usage 10-fold. The new features improved the experience of shopping with coupons. We are aware that database resources are not infinite: with the new features, we reduced the number of active coupons while increasing coupon usage.

How do we list the applicable coupons for the basket?

Today, we have more than 10 ways to create coupons. Through them, we either create coupons for customers or let customers win coupons. In this section, we will look at how we list the applicable coupons for the basket. As mentioned above, coupons may have conditions, and we list only the coupons that can be applied to the basket. As the number of coupon creation ways increased, we took advantage of CQRS to list applicable coupons in a highly performant way. Our system is write-heavy, so we needed to separate the write and read sources.
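To make the CQRS idea concrete, here is a small sketch of separating the write source from the read source. It is an assumption-heavy illustration rather than the actual Coupon architecture: the in-memory stores, the event list standing in for a message queue, the event shape, and all function names are hypothetical.

```python
from collections import defaultdict

coupon_store = {}                      # write side: id -> full coupon document
user_coupon_view = defaultdict(dict)   # read side: userId -> {couponId: summary}
events = []                            # stand-in for a queue between the sides


def handle_create_coupon(coupon_id, user_id, end_date, detail):
    """Command side: persist the full document and emit an event."""
    coupon_store[coupon_id] = {"userId": user_id, "endDate": end_date, **detail}
    events.append({"type": "CouponCreated", "couponId": coupon_id,
                   "userId": user_id, "endDate": end_date})


def project(event):
    """Projection: maintain a small per-user summary optimized for reads."""
    if event["type"] == "CouponCreated":
        user_coupon_view[event["userId"]][event["couponId"]] = {
            "endDate": event["endDate"], "status": "ACTIVE"}


def active_coupon_ids(user_id, today):
    """Query side: reads only the summary view, never the write store."""
    summaries = user_coupon_view[user_id]
    return sorted(cid for cid, s in summaries.items()
                  if s["status"] == "ACTIVE" and s["endDate"] >= today)


# ISO date strings compare correctly as plain strings.
handle_create_coupon("c1", "u1", "2022-06-30", {"discount": 20})
handle_create_coupon("c2", "u1", "2022-05-01", {"discount": 10})
for event in events:
    project(event)

ids = active_coupon_ids("u1", "2022-06-06")
```

Heavy write traffic lands on the write store, while basket reads are served from a compact view shaped for exactly the question they ask.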

Applicable Coupon Sequence Flow

When you view your basket while shopping on Trendyol, we list the applicable coupons with the architecture above.

  • We keep summary information about your coupons, such as the coupon id, endDate, and status, in the UserCoupon bucket. The UserCoupon bucket has two benefits for us. First, we check status and date as a pre-filtering step. Second, the Coupon bucket holds detail information for over 300M coupons, and we want to access it by key-value rather than by index: instead of using a userId index on the Coupon bucket, we get the active coupon ids from the UserCoupon bucket.
  • After pre-filtering, we fetch the coupon detail information from the Coupon bucket by active coupon id.
  • We store coupon metadata, such as the displayName, product image links, and description fields, in the CouponType bucket. The reason is that thousands of coupons can share the same displayName and description. On updates, we update a single type document instead of thousands of coupons. There is a relationship between CouponType and Coupon.
  • We use the CouponGroup bucket to enforce rules such as 'This coupon is valid for the first X uses.' Here, we keep the limit and currentUsageCount. There is a relationship between CouponGroup and Coupon.
  • After getting the customer's active coupon information from the database, we try to find the coupons applicable to the basket. For this evaluation, we apply the Chain of Responsibility design pattern: we have handlers that check applicability, and we list the coupons that pass all handlers.
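The last step, evaluating coupons with a Chain of Responsibility, might look roughly like the sketch below. The handler classes, the `Coupon` and `Basket` fields, and the three conditions shown are hypothetical examples chosen for illustration; the real system supports more than ten conditions, and its handlers are not described in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Coupon:
    id: str
    lower_limit: float                 # minimum basket amount
    brand: Optional[str] = None        # None means no brand condition
    group_limit: Optional[int] = None  # "first X uses" limit, if any
    group_usage: int = 0               # currentUsageCount of the group


@dataclass
class Basket:
    total: float
    brands: frozenset


class Handler:
    """One link in the chain: check a condition, then delegate to the next."""

    def __init__(self, next_handler=None):
        self.next = next_handler

    def check(self, coupon, basket) -> bool:
        raise NotImplementedError

    def handle(self, coupon, basket) -> bool:
        if not self.check(coupon, basket):
            return False  # stop at the first failed condition
        return self.next.handle(coupon, basket) if self.next else True


class LowerLimitHandler(Handler):
    def check(self, coupon, basket):
        return basket.total >= coupon.lower_limit


class BrandHandler(Handler):
    def check(self, coupon, basket):
        return coupon.brand is None or coupon.brand in basket.brands


class UsageLimitHandler(Handler):
    def check(self, coupon, basket):
        return coupon.group_limit is None or coupon.group_usage < coupon.group_limit


chain = LowerLimitHandler(BrandHandler(UsageLimitHandler()))


def applicable(coupons, basket):
    """Return the ids of coupons that pass every handler in the chain."""
    return [c.id for c in coupons if chain.handle(c, basket)]


basket = Basket(total=250.0, brands=frozenset({"acme"}))
coupons = [
    Coupon("c1", lower_limit=100),
    Coupon("c2", lower_limit=300),
    Coupon("c3", lower_limit=50, brand="other"),
    Coupon("c4", lower_limit=50, brand="acme", group_limit=100, group_usage=100),
    Coupon("c5", lower_limit=50, brand="acme", group_limit=100, group_usage=99),
]
result = applicable(coupons, basket)
```

Adding a new condition then means adding one handler and linking it into the chain, without touching the existing checks.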

The applicable coupon endpoint is our highest-throughput endpoint. Whenever a customer opens their basket, we repeat the operations above. If the customer has recently refreshed the basket, we run the applicability check again with the same coupons. We fetch the customer's coupons from the main Coupon database, across several buckets, regardless of the contents of the basket. We therefore considered storing the customer's coupons object in a cache database, aiming to reduce the number of read requests to the main database that we both write to and read from.

How did we develop the cache?

To back the cache requirement with data, we measured how many times customers revisit their baskets within 15 minutes. We tracked this data during a Trendyol event. On average, customers come to the basket 5 times in 15 minutes, which means we fetch the same customer coupon data from the database an average of 5 times. The basket content does not matter to us here: the data stays the same as long as the customer's coupons have not been updated and no new coupon has been created for the customer.

After backing our opinion with data, we set our priorities for the cache technology. Our top priority was Multi DC support: Trendyol runs on different data centers, and we should be able to serve from each of them. At this stage, we had three candidates: Redis, Hazelcast, and Couchbase.

  • Couchbase's XDCR capability is frequently used at Trendyol for multi-data-center replication. Hazelcast is not widely used at Trendyol; Redis is more common. We reviewed the usage scenarios (use cases, data size, response time) of the teams using Redis.
  • We also examined the Ephemeral Bucket feature of Couchbase. Because we were already using Couchbase, we could do a quick POC. The response time of the Ephemeral Bucket was on par with the other technologies we examined. We chose Couchbase because of its XDCR capability.
  • We use the Ephemeral Bucket with the NRU ejection strategy and a specific TTL. (NRU, Not Recently Used: if the bucket quota is exceeded, old documents are deleted so that new documents can be saved.) The data in an Ephemeral Bucket is kept in RAM.
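The effect of such a cache on database reads can be illustrated with a small cache-aside sketch. Everything here is an assumption for illustration: the post does not state the TTL value, so 15 minutes is used to match the measurement window; invalidating on coupon updates is inferred from the text rather than confirmed; NRU ejection is not modeled; and the class and function names are hypothetical.

```python
class TtlCache:
    """Tiny in-memory cache with per-entry expiry (stand-in for the bucket)."""

    def __init__(self, ttl_seconds, clock):
        self.ttl = ttl_seconds
        self.clock = clock
        self.items = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None or entry[0] <= self.clock():
            self.items.pop(key, None)  # missing or expired
            return None
        return entry[1]

    def set(self, key, value):
        self.items[key] = (self.clock() + self.ttl, value)

    def invalidate(self, key):
        self.items.pop(key, None)


stats = {"db_reads": 0}


def load_coupons_from_db(user_id):
    """Stand-in for the multi-bucket read against the main database."""
    stats["db_reads"] += 1
    return ["c1", "c2"]


def get_customer_coupons(cache, user_id):
    """Cache-aside read: try the cache first, fall back to the database."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    coupons = load_coupons_from_db(user_id)
    cache.set(user_id, coupons)
    return coupons


now = [0]  # simulated clock, in seconds
cache = TtlCache(ttl_seconds=15 * 60, clock=lambda: now[0])

# Five basket visits, three minutes apart, inside one 15-minute window.
for visit_time in (0, 180, 360, 540, 720):
    now[0] = visit_time
    get_customer_coupons(cache, "u1")
reads_for_five_visits = stats["db_reads"]

# A new coupon for the customer invalidates the cached object.
cache.invalidate("u1")
get_customer_coupons(cache, "u1")
reads_after_invalidation = stats["db_reads"]
```

Under these assumptions, the five visits measured in a 15-minute window cost one database read instead of five.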

The cache improved our response time by 8 ms and reduced our timeout problem.

Finally

We have new features in addition to the existing ones, for example applying multiple coupons to a basket and percentage coupons. We continue to improve our current architecture.

We have more than 15 microservices in the Coupon domain. We continually try to apply DDD strategic and tactical patterns to our domain. We are developing a ubiquitous language to communicate with our stakeholders. We organize Event Storming meetings for architectural reviews and new features. We make architecture and technology decisions as a team and hold analysis meetings. Join us to build high-throughput applications that serve millions of coupons. We have a lot of challenges 🚀

Thanks for reading!
