How redBus built a Scalable Inhouse Referral System (aka Maverick) — using Dgraph.

Neeti Khard
Mar 6 · 6 min read

Referral programs have become a standard product feature in the eCommerce and Internet space, and many of them have been successful. At redBus, we launched one too, and it did quite well.

These programs build brand loyalty within a network of customers. Adding well-targeted rewards and features over time helps long-term acquisition and retention of the customer base.

ReferralCandy explains the pros of Referral Marketing in depth.

Why Inhouse?

redBus relied on a third party for its referral solution for a while, as we wanted to get to market quickly and see how things panned out. Once the numbers became serious and the cases we wanted to solve grew more specific, it made sense to build in-house. A few of the reasons:

  1. Reduce fixed costs of third party solutions.
  2. Any addition to Product features or Gratification Events had a third party dependency.
  3. Maintain Referral Data on our own back end to get more meaningful insights for other horizontals (Hotels, Pilgrimages, BusHire).
  4. We are operational in 6 countries, and every country and region has its own nuances to take into consideration.

Architecture Goals:

  1. Fraud Detection/Prevention: We want to use share and device details to build a fraud engine that helps detect and prevent losses due to fraud.
  2. Scalable, real-time data store (referral graph) to get better insights into influencers, cluster networks, geo-gravity, etc.
  3. Build various gratification events for first-degree, second-degree and nth-degree referrals: a pyramid set-up for loyalty.
  4. Easy integration of this data set with our Transaction data for further analysis.
  5. Apart from the regular referral marketing flows, which revolve around referrals and gratifications, we wanted to extend the product with a few unique capabilities specific to our E-comm product.
  6. Cross Country: We wanted to interlink different geographies and handle the complex gratification process that follows. [Singapore-Malaysia was a good use case]
  7. Cross Vertical: We wanted to build a generic platform that works as plug and play for our different verticals [Bus, Hotel, Bus Hire] to cash in on the capability of referral flows.
  8. Promotions linked to Referrals: Tying offer promotions to referral flows.
  9. Scaling: As we grow, high availability and high performance are architectural needs.
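To make the nth-degree goal concrete, here is a minimal Go sketch of how referees could be grouped by degree with a breadth-first walk. It is an illustration only: the in-memory `ReferralGraph` map, the `ReferralsByDegree` function and the user IDs are assumptions made for this example, not the actual Maverick code, which sits on top of Dgraph.

```go
package main

import "fmt"

// ReferralGraph maps a referrer's user ID to the users they referred directly.
// This in-memory map is a hypothetical stand-in for the graph store.
type ReferralGraph map[string][]string

// ReferralsByDegree walks the graph breadth-first from root and returns the
// referees found at each degree (1 = direct referral), up to maxDegree.
func ReferralsByDegree(g ReferralGraph, root string, maxDegree int) map[int][]string {
	result := make(map[int][]string)
	visited := map[string]bool{root: true}
	frontier := []string{root}
	for degree := 1; degree <= maxDegree && len(frontier) > 0; degree++ {
		var next []string
		for _, referrer := range frontier {
			for _, referee := range g[referrer] {
				if visited[referee] {
					continue // guard against cycles or duplicate edges
				}
				visited[referee] = true
				next = append(next, referee)
			}
		}
		if len(next) > 0 {
			result[degree] = next
		}
		frontier = next
	}
	return result
}

func main() {
	g := ReferralGraph{
		"alice": {"bob", "carol"},
		"bob":   {"dave"},
		"dave":  {"erin"},
	}
	byDegree := ReferralsByDegree(g, "alice", 2)
	fmt.Println("first degree:", byDegree[1])
	fmt.Println("second degree:", byDegree[2])
}
```

A gratification rule for second-degree referrals would then only need to look at the entries under degree 2.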

The key component of this architecture was the choice of our Graph Store. We looked at ArangoDB, OrientDB and Dgraph.

Why Dgraph?

For beginners: why a graph DB? Also check out the famous “join-depth problem” for any graph engine, which is well articulated here.

In short, we wanted to build a scalable, highly reliable data store that treats relationships, rather than the entities themselves, as the major data point. We could do the same with SQL, but it would not be optimized for this and would carry a constant overhead of tuning the system.

We did a thorough case study for our product and understood the need for a scalable Graph Data Layer for our product features.

Sharding is the logical separation of data, which helps minimize query response times. The number of rows in each unit is reduced, which in turn reduces response times. Sharding can be based on certain key attributes, which helps scale systems around a few real-time problems.

For our use cases, sharding on certain campaigns, countries and verticals is of great importance as we scale our product in the future.

Fault tolerance and load sharing are the advantages of having a cluster setup for a database.

A Dgraph cluster provides all the necessary options to build a robust set of nodes for a high-performing, highly available data layer.

Even though we are still working on the cluster set-up, we have gained a lot of confidence in the performance and feature support Dgraph offers on a single-node setup.

What have we built?

Maverick, the in-house referral system, consists of the components below and is expected to expand further. We named this system Maverick internally, as it is quite unorthodox and independent of the gratification events we can power :)

We have different Engines interacting with a central Go API layer on top of Dgraph.

Referral Construct Engine: A rule-based engine that defines the benefit based on

  • Region
  • Vertical
  • Referrer/Referee
  • Event.
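As a rough illustration of such a rule-based lookup, the sketch below keys each rule on the four dimensions listed above. Every name in it (`ConstructKey`, `Benefit`, `ConstructEngine`, the region and event values, the amounts) is hypothetical and chosen for the example; the real engine may well support wildcards, priorities or date ranges that this exact-match table does not.

```go
package main

import "fmt"

// ConstructKey identifies one rule. All field values used below are hypothetical.
type ConstructKey struct {
	Region   string // e.g. "IN", "SG"
	Vertical string // e.g. "bus", "hotel"
	Role     string // "referrer" or "referee"
	Event    string // e.g. "signup", "first_booking"
}

// Benefit is the reward granted by a matching rule.
type Benefit struct {
	Amount   float64
	Currency string
}

// ConstructEngine holds the rules as a simple lookup table.
type ConstructEngine struct {
	rules map[ConstructKey]Benefit
}

// NewConstructEngine returns an engine with no rules.
func NewConstructEngine() *ConstructEngine {
	return &ConstructEngine{rules: make(map[ConstructKey]Benefit)}
}

// AddRule registers or overwrites the benefit for a key.
func (e *ConstructEngine) AddRule(k ConstructKey, b Benefit) {
	e.rules[k] = b
}

// Resolve returns the benefit for an exact match on all four dimensions.
func (e *ConstructEngine) Resolve(k ConstructKey) (Benefit, bool) {
	b, ok := e.rules[k]
	return b, ok
}

func main() {
	engine := NewConstructEngine()
	engine.AddRule(
		ConstructKey{Region: "IN", Vertical: "bus", Role: "referrer", Event: "first_booking"},
		Benefit{Amount: 100, Currency: "INR"},
	)
	b, ok := engine.Resolve(ConstructKey{Region: "IN", Vertical: "bus", Role: "referrer", Event: "first_booking"})
	fmt.Println(b, ok)
}
```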

Fraud Detection Engine: Helps prevent referral fraud by checking for device- and email-related fraud, and stops users from abusing the referral bonus.
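The kind of device and email checks described here could look roughly like the Go sketch below. The specific heuristics (one account per device, stripping a "+tag" suffix from the email's local part) and the `FraudChecker` type are assumptions for illustration; the post does not say which signals the real engine uses.

```go
package main

import (
	"fmt"
	"strings"
)

// NormalizeEmail lowercases the address and drops a "+tag" suffix from the
// local part, so "A+promo@x.com" and "a@x.com" compare equal.
// This normalization rule is an assumption, not a documented redBus rule.
func NormalizeEmail(email string) string {
	email = strings.ToLower(strings.TrimSpace(email))
	at := strings.LastIndex(email, "@")
	if at < 0 {
		return email
	}
	local, domain := email[:at], email[at:]
	if plus := strings.Index(local, "+"); plus >= 0 {
		local = local[:plus]
	}
	return local + domain
}

// FraudChecker remembers which user first appeared with a device or email.
type FraudChecker struct {
	seenDevices map[string]string // device ID -> first user ID
	seenEmails  map[string]string // normalized email -> first user ID
}

// NewFraudChecker returns a checker with empty history.
func NewFraudChecker() *FraudChecker {
	return &FraudChecker{
		seenDevices: make(map[string]string),
		seenEmails:  make(map[string]string),
	}
}

// Check returns the reasons a registration looks suspicious (empty if none).
func (c *FraudChecker) Check(userID, deviceID, email string) []string {
	var reasons []string
	if owner, ok := c.seenDevices[deviceID]; ok && owner != userID {
		reasons = append(reasons, "device already used by another user")
	} else {
		c.seenDevices[deviceID] = userID
	}
	key := NormalizeEmail(email)
	if owner, ok := c.seenEmails[key]; ok && owner != userID {
		reasons = append(reasons, "email matches another user after normalization")
	} else {
		c.seenEmails[key] = userID
	}
	return reasons
}

func main() {
	c := NewFraudChecker()
	fmt.Println(c.Check("u1", "device-1", "a@x.com"))
	fmt.Println(c.Check("u2", "device-1", "A+promo@x.com"))
}
```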

Refcode Engine: Deals with referral codes for users, including custom referral codes and multiple referral codes per user.
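A small Go sketch of how generated and custom codes might coexist is shown below. The alphabet, the code length and the `RefcodeStore` type are illustrative assumptions; the only properties taken from the description above are that a user may hold several codes and may pick a custom one.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
	"math/big"
	"strings"
)

// codeAlphabet omits look-alike characters such as 0/O and 1/I (an assumption).
const codeAlphabet = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"

// ErrCodeTaken is returned when a code is already claimed.
var ErrCodeTaken = errors.New("referral code already taken")

// GenerateCode returns a random code of n characters from codeAlphabet.
func GenerateCode(n int) (string, error) {
	var sb strings.Builder
	limit := big.NewInt(int64(len(codeAlphabet)))
	for i := 0; i < n; i++ {
		idx, err := rand.Int(rand.Reader, limit)
		if err != nil {
			return "", err
		}
		sb.WriteByte(codeAlphabet[idx.Int64()])
	}
	return sb.String(), nil
}

// RefcodeStore maps each code to its owner; one user may own many codes.
type RefcodeStore struct {
	owner map[string]string // upper-cased code -> user ID
}

// NewRefcodeStore returns an empty store.
func NewRefcodeStore() *RefcodeStore {
	return &RefcodeStore{owner: make(map[string]string)}
}

// Claim assigns a generated or custom code to a user, case-insensitively.
func (s *RefcodeStore) Claim(userID, code string) error {
	code = strings.ToUpper(code)
	if _, taken := s.owner[code]; taken {
		return ErrCodeTaken
	}
	s.owner[code] = userID
	return nil
}

// Owner looks up who holds a code.
func (s *RefcodeStore) Owner(code string) (string, bool) {
	u, ok := s.owner[strings.ToUpper(code)]
	return u, ok
}

func main() {
	store := NewRefcodeStore()
	code, err := GenerateCode(8)
	if err != nil {
		panic(err)
	}
	fmt.Println(code, store.Claim("u1", code))
	fmt.Println(store.Claim("u1", "neeti2020")) // custom second code for the same user
}
```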

Gratification Engine: Responsible for crediting the user based on the rules defined in the constructs, with proper validations.
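One way to picture "credit with proper validations" is the Go sketch below, which rejects non-positive amounts and refuses to pay out the same event twice. The `Ledger` type and both validation rules are assumptions for illustration; the post does not list the validations the real engine performs.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	// ErrInvalidAmount rejects zero or negative credits.
	ErrInvalidAmount = errors.New("credit amount must be positive")
	// ErrAlreadyCredited rejects a second payout for the same event.
	ErrAlreadyCredited = errors.New("event already gratified")
)

// Ledger tracks user balances and which events have been paid out.
type Ledger struct {
	balances map[string]float64
	credited map[string]bool // event ID -> already paid out
}

// NewLedger returns an empty ledger.
func NewLedger() *Ledger {
	return &Ledger{
		balances: make(map[string]float64),
		credited: make(map[string]bool),
	}
}

// Credit validates the request and then adds amount to the user's balance.
func (l *Ledger) Credit(userID, eventID string, amount float64) error {
	if amount <= 0 {
		return ErrInvalidAmount
	}
	if l.credited[eventID] {
		return ErrAlreadyCredited
	}
	l.credited[eventID] = true
	l.balances[userID] += amount
	return nil
}

// Balance returns the user's current balance.
func (l *Ledger) Balance(userID string) float64 {
	return l.balances[userID]
}

func main() {
	l := NewLedger()
	fmt.Println(l.Credit("u1", "evt-1", 100)) // first payout succeeds
	fmt.Println(l.Credit("u1", "evt-1", 100)) // replay of the same event is rejected
	fmt.Println(l.Balance("u1"))
}
```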

Data Representation

The User Subject is connected to the User Object via the ReferredTo and ReferredBy Predicates.

The User Subject is connected to the Campaign Object via the Registered Under Predicate.

A user can be a referrer as well as a referee.
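The subject-predicate-object shape described above maps directly onto RDF N-Quad lines. The Go sketch below emits one such triple per relationship, using blank nodes (`_:name`) for the users and the campaign. The predicate names `referred_to`, `referred_by` and `registered_under` and the `Referral` struct are hypothetical spellings chosen for this example; the actual schema is not shown in this post.

```go
package main

import (
	"fmt"
	"strings"
)

// Referral describes one referrer-referee pair and the referee's campaign.
// IDs must be valid blank-node labels (letters, digits, underscores).
type Referral struct {
	Referrer string
	Referee  string
	Campaign string
}

// ToNQuads renders the relationships as N-Quad lines with blank nodes.
// Predicate names here are illustrative, not the real Maverick schema.
func ToNQuads(r Referral) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "_:%s <referred_to> _:%s .\n", r.Referrer, r.Referee)
	fmt.Fprintf(&sb, "_:%s <referred_by> _:%s .\n", r.Referee, r.Referrer)
	fmt.Fprintf(&sb, "_:%s <registered_under> _:%s .\n", r.Referee, r.Campaign)
	return sb.String()
}

func main() {
	fmt.Print(ToNQuads(Referral{Referrer: "u1", Referee: "u2", Campaign: "c1"}))
}
```

Because a user can be both referrer and referee, the same blank node can appear as the subject of `referred_to` in one line and as its object in another.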

A sample referrer-referee relation on Dgraph.


Dgraph definitely provides a platform to extend the performance of our system. We had 7 million records to load into Dgraph. We generated the N-Quads for these records and used the Bulk Loader.

We were able to upload close to 5k records per second with minimal concurrent connections. The schema and the data points were slightly heavy. We did not optimize this further, as we were able to get it done within a couple of hours.

We added the necessary indexes to boost query performance. We did our benchmarking as follows:

Referral details API: Fetches the referral details of an individual user, including the referee list, device details, campaign details and an aggregation of the earnings.

Register API: Involves the mutation of user, share and device nodes.

Check user API: Validates the user's eligibility.

Please note that these results are for the Go APIs over Dgraph and the other engines. Purely in terms of Dgraph, we were able to achieve response times below 50 ms.

All the tests were performed on a single server node with 2 cores and 16 GB of memory.

CPU usage never crossed 20 percent throughout, and memory usage stayed within 30 percent.

What next?

With this set up, we are able to achieve our architectural goals to a good extent. We want to push ourselves to higher levels.

Cluster set up on Dgraph is what we have started working on.

Geo indexes and geo filters on Dgraph are interesting for region-based features, and we are exploring how to make the constructs more fine-grained at the geo level.

Overall, we had great success using Dgraph. Engineering applications that use graph stores needs a different mind-set. Data modelling on graph stores, indexes and traversals will be covered in the next blog. Stay tuned!

redBus India Blog