Designing Data Experiments During a Pandemic

Ritabrata Moitra
Rapido Labs
Published Aug 17, 2020 · 8 min read

These are unprecedented times. With the Covid-19 virus taking the world by storm, pretty much nothing is the same anymore. It goes without saying that the pandemic has brought an unforeseen scale of challenges to humanity: the collapse of entire economies, the plight of migrant workers in India who had to walk thousands of miles back home, and massive layoffs across major industries. While some of us are incredibly fortunate not to have gone through such extreme hardships, the pandemic opened up some unforeseen challenges for the Data Science community as well.

In this series of blog posts, we would like to shed some light on the challenges we faced while designing our experiments in the past few months.

At Rapido, we have consciously decided to take an experiment-first approach. Briefly, what we mean by that is: when we, as the data vertical, take on an initiative, we focus on designing quick and effective experiments to test our mathematical hypotheses. The learnings from these experiments play a major part in how we shape the product we are building.

This paradigm helps us validate our assumptions and hypotheses with a very quick feedback loop from the ground, the actual marketplace.

Primer on A/B Tests

For our experiments, we typically design an A/B test. This testing methodology involves stratifying our population into similar groups and then preferentially exposing one group (referred to as the Test group) to the treatment in question. The treatment could be anything, ranging from an improved dispatch algorithm to intelligent customer-centric communications. The other stratified group (referred to as the Control group) is not exposed to the treatment and continues operating as-is. After the experiment, we measure the shift in metrics caused by the treatment in the Test group as compared to the Control group. The vital assumption here is that all other conditions are exactly the same between the Test and Control groups, so that any change seen in the Test group can be causally attributed to the treatment in question and not to other extraneous factors.

Figure 1. Sample Output of A/B test for one of our funnel metrics
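To make the comparison concrete, the shift in a funnel metric between the Test and Control groups can be estimated with a simple permutation test. This is a minimal sketch, not our production tooling, and the per-user metric values below are invented for illustration:

```python
import random
import statistics

def ab_lift(control, test, n_perm=2000, seed=42):
    """Estimate treatment lift and a two-sided permutation p-value.

    control, test: per-user values of a funnel metric (e.g. ride conversions).
    Returns (lift, p_value), where lift = mean(test) - mean(control).
    """
    observed = statistics.mean(test) - statistics.mean(control)
    pooled = control + test
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # break the group labels to simulate "no effect"
        diff = statistics.mean(pooled[:len(test)]) - statistics.mean(pooled[len(test):])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_perm

# Invented per-user conversion data for two stratified groups.
control_group = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0] * 20   # mean 0.3
test_group    = [1, 1, 0, 1, 1, 0, 1, 0, 1, 0] * 20   # mean 0.6
lift, p_value = ab_lift(control_group, test_group)
```

A small p-value suggests the observed shift is unlikely under the no-effect assumption; in practice the choice of statistical test depends on the metric's distribution and sample size.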

Elephant in the room

Now that we have set some context around who we are and what we do, let’s dive straight into the matter. In the rest of the series, we will discuss some of the challenges we faced while designing these A/B tests, and how that forced us to think out of the box. In terms of structure, for every topic, we will try and lay out a few distinct parts -

1. Context — A lot of what we do in our field is extremely contextual. The gravity of a situation depends quite a bit on the setting. Hence, in order to take readers on the same journey that we went on, we feel it is vital to put them in our shoes by setting the context right.

2. Problem — This part will talk about the crux of the crisis. We hope to give you a sneak peek into the different caveats that cropped up, some of which we could see coming and some that we missed.

3. Solution — This section will discuss how we tackled the problem at hand at that point in time, and consequently what did and did not work out.

4. Learnings — In this section, we look back at the decisions we took and analyze how they worked out, and whether we could have done something better.

Our aim is to give you an idea of the problems we faced in designing data experiments here at Rapido and to shed some light on the solutions and workarounds that we used, which you could probably apply when faced with similar problems in your respective projects. A disclaimer we would like to put out is that we are no experts in the field of A/B tests ourselves. We are a group of young, passionate people trying to wrap their heads around efficient experiment design. So the solutions we propose are by no means the best out there, but are the ones we came up with at that point in time to mitigate the situation and move forward.

Lack of a Logical Time Control

Context

One of the first initiatives that we took up as the Data Science team was to build an intelligent pricing system. We started our R&D around this in early March, just before the pandemic hit us. We were ready to run our first experiment sometime around mid-June, when we were expecting the first phase of “Unlock 1.0” across India.

Rapido’s ride volume, and as a consequence its incoming revenue, was at an all-time low — practically nil, because of the government restrictions on bike taxis operating.

Because of the revenue situation, we were keen to establish the pricing engine’s impact as soon as we hit the ground running again. What that translated into was that we would need to run the A/B test right from the day of our relaunch.

Problem Statement

At Rapido, we have historically observed strong weekly and intra-day seasonality. Hence the most obvious method of control here is time-based control. To elaborate: if we are running an experiment and decide to price a geo-temporal combination (say, HSR Layout on a Monday at 9 AM) differently, the most obvious control group becomes the same combination one week earlier, when the usual pricing system was operational. We then measure how various experiment statistics differ between our test and control groups. In our case, there was no “last week” to compare to, because no rides were happening due to the lockdown.
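As a sketch of this time-based control, each geo-temporal slot can be paired with the same slot exactly one week earlier. The cluster name and ride counts below are made up for illustration:

```python
from datetime import datetime, timedelta

# Hypothetical hourly ride counts keyed by (cluster, timestamp).
rides = {
    ("HSR Layout", datetime(2020, 3, 2, 9)): 120,  # control week, Monday 9 AM
    ("HSR Layout", datetime(2020, 3, 9, 9)): 138,  # test week, Monday 9 AM
}

def weekly_lift(rides, cluster, ts):
    """Compare a geo-temporal slot against the same slot one week earlier."""
    control = rides[(cluster, ts - timedelta(weeks=1))]
    test = rides[(cluster, ts)]
    return (test - control) / control

lift = weekly_lift(rides, "HSR Layout", datetime(2020, 3, 9, 9))  # 0.15, i.e. +15%
```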

Figure 2. Visualization of how our ride volume crashed during the lockdown (axes have been omitted because of the sensitivity of the information)

One logical solution to this was to wait for a couple of weeks for the marketplace to re-open, thereby giving us some ideas about what the volume of rides would be post-lockdown. And then these weeks could serve as logical control sets for our experiments, which we run in later weeks. However, as mentioned earlier, we were eager to prove the impact of the pricing system as early as possible to all relevant stakeholders. Hence, we decided to tackle this problem and find a way to let us run the experiment as soon as we were back up.

Solution

With temporal control deemed infeasible, we turned to geo-spatial control. However, like most things in our field, it was not as easy as it sounds, primarily because different geo-locations are expected to respond differently to changes in pricing. You can easily imagine that an IT park on a Monday morning would be very different in terms of price sensitivity from a remote location. Along similar lines, even the same geo-location behaves differently at different points in time.

We arrived at the understanding that price sensitivity majorly depends on the underlying need for the demand and its characteristics, like urgency, density, volume, etc.

At Rapido, our operational geo entities are referred to as clusters. Looking at the socio-economic characteristics of these areas tells us a lot about the type of commute needs that a particular cluster will have.

Examples of some clusters are-

  1. Dense office areas like BKC/Bellandur/Gurgaon Sector 42,
  2. College districts like Koramangala or North Campus, Delhi,
  3. Posh residential areas like Indiranagar, Banjara Hills or South Delhi.

The transportation needs of these three types differ: IT corridors need fast morning and evening commutes where price is not much of a consideration; colleges need cheap morning and afternoon commute options; and residential areas see more mixed use, with food deliveries and other service movement.

Hence in order to hint at the causality of any observed difference during our A/B test, we needed to account for these underlying differences in nature of demand. Without these considerations, our experiments would be inconclusive because of pre-existing differences between the Test and Control groups.

In order to do this logical stratification, we had to establish similarity on two dimensions -

1. Similarity of the nature of underlying need

2. Similarity in terms of levels of demand and supply

To establish the similarity of need (1), we analyzed the different socio-economic features of these clusters. A few indicative (but not exhaustive) features that we considered were -
A. Presence of an IT Park
B. Presence of a college/university
C. Residential Area
D. Densely Populated

We used a bunch of features like these for the clusters in contention and then used a Jaccard Similarity Index to arrive at demographically similar clusters.
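As an illustration of this step, the Jaccard Similarity Index of two clusters is the size of the intersection of their feature sets divided by the size of the union. The feature tags below are hypothetical, not our actual feature set:

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of binary cluster features."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical binary feature tags per cluster.
features = {
    "Bellandur":   {"it_park", "dense", "residential"},
    "Koramangala": {"college", "dense", "residential"},
    "Indiranagar": {"residential", "posh"},
}

sim = jaccard(features["Bellandur"], features["Koramangala"])  # 2/4 = 0.5
```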

For similarity of the level of demand (2), we analyzed the levels of demand and supply that clusters experienced over the day and found clusters that experience similar demand and supply at a given temporal combination, using Euclidean distance. Finally, we combined these two similarity metrics to arrive at our test-control split for the experiment.
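One simple way to realize this combination is sketched below with hypothetical hourly demand profiles: the Euclidean distance between profiles is squashed into (0, 1] so it lives on the same scale as the Jaccard score, and the two are blended with an illustrative weight. The weighting scheme is our assumption here, not a prescribed formula:

```python
import math

def euclidean(p, q):
    """Euclidean distance between two hourly demand (or supply) profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def combined_similarity(jaccard_sim, profile_a, profile_b, alpha=0.5):
    """Blend demographic similarity with demand-profile similarity.

    alpha is an illustrative weight; the distance is mapped to (0, 1]
    via 1 / (1 + d) so both terms are on a comparable scale.
    """
    demand_sim = 1.0 / (1.0 + euclidean(profile_a, profile_b))
    return alpha * jaccard_sim + (1 - alpha) * demand_sim

# Identical profiles and identical features give a perfect score of 1.0.
score = combined_similarity(1.0, [10, 40, 25], [10, 40, 25])
```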

Learnings

This shift in the way we decided control groups gave us a way to go live with our experiment. Looking back, it mostly worked well and we could arrive at the conclusions we were looking for. One challenge we faced during the experiment was the unpredictability of demand while the marketplace was recovering. As mentioned earlier, we had computed the similarity of demand and supply for clusters using pre-lockdown data. However, when the marketplace reopened, the dynamics of demand were vastly different: IT parks had near-zero demand because of work-from-home guidelines. Hence, the similarity scores went for a toss a couple of weeks into the experiment. As a mitigation, we revised our test-control split based on post-lockdown data two weeks into the experiment, and while drawing inferences from our experiment we discarded test-control combinations where the scales of demand and supply were vastly different.

If you found this informative, please keep an eye on our publication for the next blog in this series. We plan to talk in depth about designing our experiments related to intelligently assigning captains to requested rides. If you would like to know more about what we are cooking here at Rapido, in terms of Data, Engineering, or anything else, feel free to reach out to me on LinkedIn.

About Us

ThoseDataGeeks is a group of data scientists working at the online bike taxi aggregator, Rapido, based in Bangalore, India. We are a diverse bunch of individuals who have had the opportunity to come together and try and shape a data-driven ride-hailing company, right from the ground up. We hope to share more of our experiences, learnings, and challenges with you as we progress on this topsy-turvy journey.
