Trendyol Coupon Journey: How We Gamify Coupon Collecting?

Anıl Coşar
Published in Trendyol Tech
6 min read · Dec 20, 2022
Photo by Jeffrey Clayton on Unsplash

Hi everyone! At Trendyol, we give customers coupons to use in their online purchases, and we distribute them in several ways. Not long ago, we added a wheel of fortune game for collecting coupons. In this article, I will share how we designed the service architecture that lets customers win coupons, along with the experiments we ran. I use two terms throughout this post:

Wedge -> The coupon prize on the wheel

Wheel Of Fortune -> The game that consists of wedges

Why do we need gamification?

Customers can get coupons in different ways: a coupon can be assigned by Trendyol, collected from a product detail page, or collected after following a seller. We decided to add a game for collecting coupons for the following reasons:

  • Add luck factor to get the customer’s attention
  • Increase activity in Trendyol
  • Engage customers
Çevir Kazan ("Spin and Win") page

Let’s look at the high-level design.

High-Level Design

1- Trendyol creates wedges on event days.

2- The customer spins the wheel in the mobile app.

3- The mobile client sends a request to the Wheel of Fortune service.

4- The Wheel of Fortune service fetches the wedge data from the Couchbase database. It gets all wedges, checks which ones are active, and randomly chooses one of the active wedges. It then gives the customer the chosen wedge’s coupon and increments the wedge’s collect count.

5- After the collect count is incremented, the Couchbase Eventing service catches the mutation and checks whether the wedge has reached its collect limit. If it has, we change the wedge’s status to passive.

6- The chosen wedge is converted to a coupon document and sent to the create-coupon Kafka topic. If there is a problem with Kafka, we fall back to the outbox pattern: events are collected in an outbox bucket, and when Kafka is available again, we send them from the outbox bucket to the Kafka topic.

7- The Coupon Service consumes the create-coupon events and creates the coupons.

8- These steps are repeated for each user’s collect action.
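
The Kafka fallback in step 6 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual service code: `CouponEventPublisher`, `OutboxFallbackSender`, and the in-memory queue that stands in for the outbox bucket are hypothetical names invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical abstraction over the producer that writes to the create-coupon topic. */
interface CouponEventPublisher {
    void publish(String couponEvent) throws Exception;
}

/** Sketch of an outbox fallback: park events when publishing fails, replay them later. */
class OutboxFallbackSender {
    private final CouponEventPublisher publisher;
    // Assumption: an in-memory queue stands in for the outbox bucket.
    private final Queue<String> outbox = new ArrayDeque<>();

    OutboxFallbackSender(CouponEventPublisher publisher) {
        this.publisher = publisher;
    }

    /** Tries the broker first; on failure the event is parked instead of failing the spin. */
    synchronized void send(String couponEvent) {
        try {
            publisher.publish(couponEvent);
        } catch (Exception e) {
            outbox.add(couponEvent);
        }
    }

    /** Replays parked events in order, stops at the first failure, returns how many were sent. */
    synchronized int drainOutbox() {
        int sent = 0;
        while (!outbox.isEmpty()) {
            try {
                publisher.publish(outbox.peek());
            } catch (Exception e) {
                break; // broker still unavailable, keep the rest for the next attempt
            }
            outbox.remove();
            sent++;
        }
        return sent;
    }

    synchronized int pendingCount() {
        return outbox.size();
    }
}
```

In a setup like this, `drainOutbox()` would be called once the broker is reachable again. Because an event is removed from the outbox only after a successful publish, a failure in the middle of a drain loses nothing.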

How do we choose a wedge randomly?

To add the luck factor, we give out wedges randomly. Every wedge has a collect limit, so we don’t give out unlimited coupons. Imagine a numerical axis (a number line) with the wedges placed along it.

numerical axis

For example, suppose we have three wedges with collect limits of 2, 3, and 5. On the axis, 1 to 2 belongs to wedge one, 3 to 5 to wedge two, and 6 to 10 to wedge three. We sum the collect limits, which gives 10 in this example, and then draw a random number between 1 and 10. The wedge whose range contains that number is the one we give. With this algorithm, wedges with larger collect limits receive proportionally more of the requests, which also helps us avoid exceeding the limits.
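
The selection described above can be sketched in a few lines of Java. This is an illustrative sketch, not the service's actual code; the `Wedge` record and the `WedgePicker` class are hypothetical names, and only the collect limit is modelled.

```java
import java.util.List;
import java.util.Random;

/** Hypothetical wedge model: an id plus its collect limit. */
record Wedge(String id, int collectLimit) {}

class WedgePicker {
    /** Maps a position on the axis (1..sum of limits) to the wedge that owns that range. */
    static Wedge wedgeAt(List<Wedge> wedges, int position) {
        if (position < 1) {
            throw new IllegalArgumentException("position must be >= 1: " + position);
        }
        int upperBound = 0;
        for (Wedge wedge : wedges) {
            upperBound += wedge.collectLimit();
            if (position <= upperBound) {
                return wedge;
            }
        }
        throw new IllegalArgumentException("position outside the axis: " + position);
    }

    /** Draws a uniform position in 1..total and returns the wedge at that position. */
    static Wedge pickRandom(List<Wedge> wedges, Random random) {
        int total = wedges.stream().mapToInt(Wedge::collectLimit).sum();
        if (total <= 0) {
            throw new IllegalArgumentException("no collectable wedges");
        }
        int position = random.nextInt(total) + 1;
        return wedgeAt(wedges, position);
    }
}
```

With limits 2, 3, and 5, positions 1 and 2 map to the first wedge, 3 to 5 to the second, and 6 to 10 to the third, so the third wedge is chosen about half of the time.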

Challenges and Solutions

While developing the wheel of fortune game, we faced some challenges. Let’s start!

Kafka Consumer Lag

In the beginning, we created a topic with 10 partitions. We didn’t expect a huge load when we deployed to production, but lag built up in Kafka and the coupon creation process was delayed. Scaling out consumer instances didn’t solve the problem. (In a Kafka consumer group, each partition is consumed by at most one instance, so adding instances beyond the partition count does not add throughput.) After a couple of load tests, we found the right instance count and increased the partition count from 10 to 60. With that, our service could handle millions of requests.

Race Condition

The wheel of fortune page is enabled only for a limited time, so we receive lots of requests at the same time. As a result, we sometimes gave out more coupons than the collect limit allowed. One of the business rules for the wheel of fortune is “don’t throw exceptions to the customer”: the customer experience shouldn’t break. Rather than throw an exception, we accepted giving additional coupons.

The problem is that our services run across multiple data centers. Couchbase uses XDCR to replicate data between data centers, and replication takes some time. Our service updates the document in one data center while the other is still waiting for the mutation. New collect requests that arrive in that window see a stale count, so we give additional coupons.

To solve this problem, we tried optimistic locking and stopping at a threshold. Optimistic locking caused high response times. We also tried a threshold below the collect limit, for example stopping at 900 for a wedge with a collect limit of 1000, but we still exceeded the limit.

We thought of running Couchbase with server groups instead of running two data centers. The server group feature did not give us automatic failover, so we decided not to apply it.

Finally, we settled on an atomic increment operation in each data center, and we sum the collect counts of the data centers. It no longer matters which data center takes a request, because we check both sides. You can find the details in the Race Condition post.
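
A rough in-memory model of that idea is shown below. It assumes one counter per data center that is only ever incremented locally, with the limit checked against the sum of all counters. `PerDataCenterCounter` and the data center names are hypothetical, `AtomicLong` stands in for a counter document, and real cross-data-center replication lag is not modelled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** In-memory stand-in for one counter per data center (all names are hypothetical). */
class PerDataCenterCounter {
    private final Map<String, AtomicLong> countsByDataCenter = new ConcurrentHashMap<>();
    private final long collectLimit;

    PerDataCenterCounter(long collectLimit) {
        this.collectLimit = collectLimit;
    }

    /**
     * Atomically increments only the local data center's counter, then checks
     * the local value plus the other data centers' counts against the limit.
     */
    boolean tryCollect(String localDataCenter) {
        long localCount = countsByDataCenter
                .computeIfAbsent(localDataCenter, dc -> new AtomicLong())
                .incrementAndGet();
        long remoteCount = 0;
        for (Map.Entry<String, AtomicLong> entry : countsByDataCenter.entrySet()) {
            if (!entry.getKey().equals(localDataCenter)) {
                remoteCount += entry.getValue().get();
            }
        }
        return localCount + remoteCount <= collectLimit;
    }

    /** Sum of all per-data-center counters, including rejected attempts. */
    long totalCollected() {
        return countsByDataCenter.values().stream().mapToLong(AtomicLong::get).sum();
    }
}
```

Because each data center writes only its own counter, no two data centers ever update the same counter, so replication never has to reconcile conflicting writes to it. The sum can still lag behind reality by whatever the remote side has not replicated yet.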

Concurrent Database Connections

Our service runs only on event days at 08:00 pm, so we get a lot of requests at the same moment. At times we couldn’t get data from the database, which pointed to a database bottleneck. We now cache the static wedge data and refresh it every 30 seconds, which increased our throughput. The cache time is short because new wedges can be added during the game.
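
A time-based cache of this kind can be sketched as below. This is an assumption-laden illustration rather than the service's code: `TimedCache` is a hypothetical class, the loader stands in for the Couchbase read, and the clock is injected only to make the behaviour easy to test.

```java
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical cache that reloads its value once the time-to-live has elapsed. */
class TimedCache<T> {
    private final Supplier<T> loader;        // stands in for the database read
    private final long ttlMillis;
    private final LongSupplier clockMillis;  // injected clock, e.g. System::currentTimeMillis
    private T value;
    private long loadedAtMillis;
    private boolean loaded;

    TimedCache(Supplier<T> loader, Duration ttl, LongSupplier clockMillis) {
        this.loader = loader;
        this.ttlMillis = ttl.toMillis();
        this.clockMillis = clockMillis;
    }

    /** Returns the cached value, reloading it on first use or when it is older than the TTL. */
    synchronized T get() {
        long now = clockMillis.getAsLong();
        if (!loaded || now - loadedAtMillis >= ttlMillis) {
            value = loader.get();
            loadedAtMillis = now;
            loaded = true;
        }
        return value;
    }
}
```

With a 30-second TTL, each service instance reads the wedges from the database at most once per 30 seconds no matter how many spin requests arrive, and a wedge added during the game becomes visible within 30 seconds.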

Concurrent Service Request For Same User

Sometimes users send multiple spin requests. We used Istio’s rate-limiting feature to block them. We have two rate limit rules: first, each user can win only one coupon; second, the total number of requests cannot exceed the total number of coupons.
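
The two rules are enforced at the Istio layer, and the snippet below is not Istio configuration. It is only an in-process Java illustration of what the rules mean, with a hypothetical `SpinGate` class and in-memory state.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-memory illustration of the two rate limit rules. */
class SpinGate {
    private final Set<String> usersSeen = ConcurrentHashMap.newKeySet();
    private final AtomicLong acceptedRequests = new AtomicLong();
    private final long totalCoupons;

    SpinGate(long totalCoupons) {
        this.totalCoupons = totalCoupons;
    }

    /** Returns true only if the request satisfies both rules. */
    boolean allow(String userId) {
        // Rule 1: each user gets through once; add() returns false for a repeat.
        if (!usersSeen.add(userId)) {
            return false;
        }
        // Rule 2: the number of accepted requests never exceeds the number of coupons.
        return acceptedRequests.incrementAndGet() <= totalCoupons;
    }
}
```

For example, with two coupons in total, a second request from the same user is rejected, and so is a first request from a third user.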

Performance Overview

The wheel of fortune service uses the following tech stack:

  • Java 17
  • Spring Boot
  • Couchbase

Let’s look at the performance.

Performance for collecting: throughput and response time

Performance for getting wedges: throughput and response time

We tested our services with 2.5 million users and coupons. On event days we get fewer requests than that, so our service has headroom to handle more. We reach high throughput and low response times by using caching, key-value database operations, and minimal dependencies.

Conclusions

Wheel of fortune has helped us reach potential customers who might not otherwise have used a coupon. We also handled many requests at the same time.

We use Couchbase, which is fast for key-value operations, to get data quickly, and we use Couchbase Eventing to catch the collect mutations. Thanks to that, we haven’t needed a separate scheduled service.

Our collection process is asynchronous and uses Kafka as the message broker, which is highly available, secure, and provides delivery guarantees.

We make architecture and technology decisions as a team, and we hold analysis meetings. Join us to build high-throughput applications that create millions of coupons. We have a lot of challenges.

In this article, I tried to explain how our system works. If you have any questions, please don’t hesitate to contact me.
