How Myntra’s Data Platform drives a seamless shopping experience for customers during its marquee events

Sri Nandhini
Published in Myntra Engineering
12 min read · May 26, 2023
Myntra End of Reason Sale (EoRS)

The Data Platform at Myntra has come a long way in recent years. We began our journey of platform modernization a few years ago when we realized that we required a high-performance data platform that could handle the growing amounts of data and enable us to make faster, more informed decisions. Given Myntra’s size, we needed an enterprise-grade solution that could process a lot of data quickly and accurately while also giving us insight into performance metrics and trends in customer behavior. This was not an easy task.

Over the following few years, the team put a lot of effort into developing this robust infrastructure from scratch using open-source technologies. In 2020, after months of development work with internal teams, Myntra launched its new Data Platform, which provided petabyte-scale real-time analytics capabilities.

Our engineering teams continue to work diligently on further enhancements to handle the scale we typically observe during key events like EORS and BFF, while also guiding us on upcoming trends and boosting our AI mechanisms.

After a fabulous edition of BFF in 2021 that created a positive sentiment in the industry, we realised the need for the data platform to bring a lot more scale and resilience to different products and applications for the many more such events to come. As we wrap up last year’s BFF and December’s End of Reason Sale event (EoRS17), the team has taken a moment to reflect on our accomplishments over the past year. Thanks to the robust platform that we now have in place, we managed an incredible amount of data growth without any issues.

To provide some context regarding our volume and velocity accomplishments, during both events’ peak performance times:

  • The number of events related to apps and the web, such as page loads and clicks, increased by 13% year over year, reaching a peak of 18.8 billion events on the first day of BFF 2022.
  • During BFF 2022, the data serving layer processed and served 2.5 petabytes of data per day, handling over 65,000 queries per day — an increase of more than 30% from the End of Reason Sale in the middle of the year.
  • Typically, we use our Customer Engagement and Personalization Platform to raise awareness of such occasions. During BFF, that platform had a combined peak of 2 million sessions solely for customer engagement.
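
To put these figures in perspective, the short calculation below converts the daily totals quoted above into rough averages. It is only a back-of-the-envelope sketch: it assumes decimal petabytes and a uniform spread over 24 hours, so actual peak-hour rates would be considerably higher than these averages. The variable names are ours and purely illustrative.

```python
# Back-of-the-envelope averages derived from the daily figures quoted above.
SECONDS_PER_DAY = 24 * 60 * 60

events_per_day = 18.8e9        # peak-day clickstream events during BFF 2022
bytes_served_per_day = 2.5e15  # 2.5 PB per day, assuming decimal petabytes
queries_per_day = 65_000       # "over 65,000" queries per day

# Average ingest rate if events were spread evenly across the day.
avg_events_per_second = events_per_day / SECONDS_PER_DAY

# Average data volume handled per query by the serving layer, in gigabytes.
avg_gb_per_query = bytes_served_per_day / queries_per_day / 1e9

print(f"Average ingest rate: {avg_events_per_second:,.0f} events/second")
print(f"Average data per query: {avg_gb_per_query:,.1f} GB")
```

Even as plain averages, these work out to more than two hundred thousand events every second and tens of gigabytes per query.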

All this was done via the platformization of key tenets of a modern data platform stack, including the:

  • Ingestion of structured and unstructured data assets and clickstream data via Sourcerer, a powerful enterprise-grade ingestion platform;
  • Real-time enrichment and transformation of raw data into actionable ML interventions and insights via Quicksilver, our near-real-time streaming platform;
  • Petabyte-scale processing of the raw and enriched data into intermediate and final served datasets in Myntra’s data lake via Janus, an in-house cluster orchestrator;
  • Serving at scale via Myntra’s evolving data lakehouse, Bifrost;
  • Orchestration of all data enrichment and movement via Janus’s in-house orchestrator.

It is important to note that the success of these events would not have been possible without the dedicated efforts of the data platform team. Myntra’s reliance on the Microsoft Azure Cloud Platform as a valuable partner, combined with open-source foundations across all stack layers, made these outcomes possible. This gave each layer an unrestricted choice of the tool most suitable for the task at hand. We were able to handle extremely large volumes and varieties of data passing through multiple systems simultaneously during the peak hours of BFF 2022 and EORS17.

How did we get here? None of these accomplishments would have been possible without the work done over the past year. The story behind each of the aforementioned systems follows: where they stood a year ago and what changes they have made since to bring us to this point.

Near Real Time (NRT) Platform

For years, Myntra has been working on continuously improving customer experience through personalization, recommendations, intent identification, etc. However, until 2021, all of this was done using batch jobs.

The relevance of in-session personalization and recommendation has increased dramatically over recent years. As a user’s intent can quickly change from one session to the next, organizations need to identify that intent in real time and capitalize on it by showing relevant ads, recommendations, search results, coupons, etc. According to McKinsey & Co’s study, companies that excel at personalization generate 40 percent more revenue from those activities than average players. According to Accenture’s survey of 1,500 consumers, 65% are more likely to buy from a place where they are recognized, remembered, and receive relevant recommendations.

The Near Real Time (NRT) team at Myntra was constituted to build a platform that captures, in real time, the thousands of actions performed by millions of customers on the Myntra app and website. These actions are then used to improve the customer experience by personalizing the UX, showing relevant recommendations and search results, and generating relevant coupons in real time by leveraging machine learning capabilities.

The NRT team undertook several initiatives during this BFF to increase the efficiency and resiliency of its systems. Consuming, processing, and analyzing a high volume of clickstream data requires a large number of hardware resources. Provisioning and maintaining them also leads to high costs and operational overhead. Our main goal during this BFF was to optimize hardware resource utilization and adopt new technologies for efficient analysis of real-time data. Some of the key areas of focus over the past 12 months were:

  • Deployed in-place scaling of Kafka to handle high traffic while saving cost: Each action performed by a user in the app or on the web is called an event, and Myntra uses these events for a host of purposes, from improving user experience to analyzing data. Myntra’s clickstream Kafka is responsible for ingesting the billions of events that users generate every day. This requires large amounts of hardware resources for storage and compute, which have to be scaled up significantly.
  • Successfully deployed and validated Druid: Real-time analytics helps us get accurate information about various metrics in real time, make inferences, and take appropriate actions based on them. At Myntra, we have traditionally used legacy code to capture and show real-time traffic data (unique session count, active user count, etc.).

To address these challenges, the NRT team adopted Druid, a real-time analytics database designed for sub-second queries on real-time data.

The data aggregated by Druid was made available for visualization through PowerBI. Together, Druid and PowerBI showed real-time traffic data within a few milliseconds, as opposed to the high latency (a few seconds to minutes) seen with MemSQL. With Druid, we were able to achieve the same aggregations and visualizations while consuming ~40% fewer compute resources, thus providing scope for significant cost reduction in the future.
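
To make the kind of aggregation involved more concrete, here is a minimal, purely illustrative sketch of rolling raw clickstream events up into per-minute traffic metrics (event count, unique session count, active user count). It is plain Python and does not reflect how Druid works internally or how our pipeline is built; the event fields (`ts`, `session_id`, `user_id`) and the function name are assumptions made for the example.

```python
from collections import defaultdict


def rollup_per_minute(events):
    """Aggregate raw events into per-minute traffic metrics.

    Each event is a dict with hypothetical fields:
    'ts' (Unix seconds), 'session_id' and 'user_id'.
    """
    sessions = defaultdict(set)  # minute start -> distinct session ids
    users = defaultdict(set)     # minute start -> distinct user ids
    counts = defaultdict(int)    # minute start -> number of events

    for event in events:
        # Truncate the timestamp to the start of its minute.
        minute_start = event["ts"] - event["ts"] % 60
        sessions[minute_start].add(event["session_id"])
        users[minute_start].add(event["user_id"])
        counts[minute_start] += 1

    return {
        minute: {
            "events": counts[minute],
            "unique_sessions": len(sessions[minute]),
            "active_users": len(users[minute]),
        }
        for minute in sorted(counts)
    }


sample_events = [
    {"ts": 0, "session_id": "s1", "user_id": "u1"},
    {"ts": 30, "session_id": "s1", "user_id": "u1"},
    {"ts": 45, "session_id": "s2", "user_id": "u2"},
    {"ts": 65, "session_id": "s3", "user_id": "u1"},
]
metrics = rollup_per_minute(sample_events)
for minute, values in metrics.items():
    print(minute, values)
```

A real-time analytics database performs this kind of rollup continuously at ingestion time and at far greater scale, which is what makes sub-second dashboard queries feasible.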

  • Implemented and expanded the scope of real-time product recommendation use cases: Similar-product recommendations and cross-selling are an important part of e-commerce, as they allow customers to identify the most relevant items that they might want to purchase. Identifying the right recommendations and showing them at the right time plays an important role in driving revenue.

The NRT team worked with Data Science and Machine Learning Platform teams to develop use cases that could show similar/cross-sell products to the customer in real-time.

  • Advertisement Ranking & Frequency Capping for Advertisements: At Myntra, we always strive to improve the experience of our customers as well as our partners. Several brands advertise on our platform, and we must show the most relevant ad to each customer while ensuring that our partners get maximum RoI on their ads. To achieve this, NRT implemented Ads Ranking. Earlier, advertisements were shown to all users in the same pattern, which meant that every user saw the same ad irrespective of their preferences. With Ads Ranking, the NRT team was able to rank ads based on each user’s preferences and show the most relevant ones, thus increasing the CTR. Additionally, showing the same ad to a user over and over again can lead to user fatigue. To prevent this, NRT implemented frequency capping, which limits the number of times a particular ad is shown to a user.
  • Search Auto-suggest Feature: Search is an important feature in any e-commerce application. Tailoring search to the user’s individual preferences improves the user experience and helps the user find results faster. NRT worked on a feature that shows personalized search auto-suggest recommendations based on the user’s preferences and previous searches. For example, if a user typed ‘sh’, the first result would be ‘shoes for men’, as the feature would identify that the user is a male with a preference for shoes over shirts. NRT plays an important role in this feature because the user’s input needs to be analyzed at very low latency to provide instant, personalized results.
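
The frequency capping idea described above for advertisements can be illustrated with a small sliding-window sketch. This is a simplified, in-memory example under our own assumptions, not the production implementation: the class name, the cap of two impressions per hour, and the pre-ranked list of ad IDs are all hypothetical, and a real system would keep this state in a low-latency shared store.

```python
from collections import defaultdict, deque


class FrequencyCapper:
    """Allow at most `max_impressions` of one ad per user per time window."""

    def __init__(self, max_impressions, window_seconds):
        self.max_impressions = max_impressions
        self.window_seconds = window_seconds
        # (user_id, ad_id) -> timestamps of recent impressions, oldest first
        self._shown = defaultdict(deque)

    def allow(self, user_id, ad_id, now):
        """Return True and record an impression if the cap is not reached."""
        impressions = self._shown[(user_id, ad_id)]
        # Drop impressions that have fallen out of the sliding window.
        while impressions and now - impressions[0] >= self.window_seconds:
            impressions.popleft()
        if len(impressions) >= self.max_impressions:
            return False
        impressions.append(now)
        return True


def pick_ad(ranked_ads, user_id, capper, now):
    """Return the best-ranked ad that is not capped for this user, else None."""
    for ad_id in ranked_ads:
        if capper.allow(user_id, ad_id, now):
            return ad_id
    return None


# Hypothetical usage: ads arrive already ranked by user preference.
capper = FrequencyCapper(max_impressions=2, window_seconds=3600)
ranked = ["ad_shoes", "ad_shirts"]
shown = [pick_ad(ranked, "user_1", capper, now) for now in (0, 10, 20, 3600)]
print(shown)
```

In this example the top-ranked ad is shown twice, the next-best ad takes over once the cap is reached, and the top-ranked ad becomes eligible again after its earliest impression leaves the one-hour window.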

Serving Layer

With Myntra’s growth, it became imperative for the business to make data-informed decisions. Post-pandemic changes in the e-commerce competitive landscape and an increase in smartphone penetration in India meant that a lower ‘Time-to-Insight (TTI)’ at scale was no longer just a differentiating factor for the company but a necessity.

“Data leaders are 5.7 times as likely to say their organization almost always makes better decisions than competitors. They are 4.5 times as likely to believe their organization is in a very strong position to compete and succeed in their markets over the next few years.” Economic Impact of Data Innovation 2023, ESG

As our platforms scaled up to store varied data running well into petabytes, collected across an expanding customer base and set of touch-points, processing this data into actionable insights and serving it at lightning speed, in as little as 15 seconds, was no longer an aspirational goal. This was achieved in phases over the years:

Phase 1: Data Democracy Platform

It all started in 2015, when a newly instituted Myntra Data Platform team was tasked with developing a self-serve query platform that allowed analysts to fetch and analyze data from existing data warehouses. The front-end SQL editor workbench was called DDP: Data Democracy Platform.

DDP

While this was a great step towards enabling users to create reports and access data in a self-serve manner, over time it ran into some major performance limitations. As more and more users were onboarded to DDP, Out-of-Memory (OOM) issues became frequent and the median query wait time went above 7 minutes. Storage in ADW became a constraint, with data being duplicated in silos and querying historical data becoming a challenge. Also, since storage was coupled with compute in ADW, scaling up compute alone during HRDs was not feasible. With ~60% YoY growth in queries fired by users and over 150 PB of data being fetched for analysis, there was a definitive need to look for alternatives.

Source: Data Products — Self-Service Query Platform (DDP)

Phase 2: BIfrost, Serving out of the Data Lakehouse

BIfrost (Myntra’s in-house stack built on Apache Superset & Trino) was introduced as an alternative that allows users to query data directly from the data lake. Data from different sources was brought into the data lake, where, unlike in ADW, historical data could be persisted as well.

Phase 3: Multi-version concurrency in case of mutable Transactional Datasets

As we moved to the data lake, updating mutable transactional data, such as the critical Order, Item, and Customer fact tables, became challenging in the non-RDBMS, immutable environment offered by Hadoop. Hive ACID compliance was introduced on the data lake to maintain multiple versions by isolating write operations from read operations, and to provide much better metadata and schema management. Hive ACID was chosen over the more popular Delta option due to its maturity and open-source ecosystem.
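
The core idea of multi-version concurrency, where writers add new versions while readers keep seeing a consistent snapshot, can be shown with a toy example. This is only a conceptual sketch in plain Python: Hive ACID itself works with delta files, write IDs, and compaction on the data lake, none of which is modelled here, and the class and method names are hypothetical.

```python
class VersionedTable:
    """Toy multi-version store: writes append versions, reads pin a snapshot."""

    def __init__(self):
        self._versions = {}       # key -> list of (write_id, value), oldest first
        self._last_committed = 0  # highest write id committed so far

    def write(self, key, value):
        """Append a new version of `key` instead of overwriting in place."""
        write_id = self._last_committed + 1
        self._versions.setdefault(key, []).append((write_id, value))
        self._last_committed = write_id
        return write_id

    def snapshot(self):
        """Return an id that identifies everything committed up to now."""
        return self._last_committed

    def read(self, key, snapshot_id):
        """Return the newest version of `key` visible at `snapshot_id`."""
        visible = None
        for write_id, value in self._versions.get(key, []):
            if write_id <= snapshot_id:
                visible = value
        return visible


orders = VersionedTable()
orders.write("order-1", "PLACED")
reader_snapshot = orders.snapshot()  # a long-running query starts here
orders.write("order-1", "SHIPPED")   # a concurrent update arrives

print(orders.read("order-1", reader_snapshot))    # still sees the old version
print(orders.read("order-1", orders.snapshot()))  # a new reader sees the update
```

Because the long-running reader is pinned to its snapshot, the concurrent update does not change the result it sees, which is the isolation of reads from writes that the paragraph above refers to.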

Phase 4: Handling scale of growth for analytics and storage

Now that handling data versions was no longer a problem, the data serving platform had to evolve further to handle the increased scale. To solve this, compute and storage were isolated on different clusters using the Trino SQL query engine. This made the system resilient for ad hoc, read-heavy workloads. It also offered an on-demand multi-compute layer over Presto, Spark, Hive, etc.

Segmentation Platform

The Internet, e-commerce, and mobile technologies have fundamentally changed the way companies and consumers act, interact, and transact. In a world where business is conducted 24x7 at an increasing pace, a communications-driven customer experience is emerging as a clear differentiator for brands. In a survey conducted by Harvard Business Review, 40% of organizations stated that creating an exceptional and highly relevant customer experience is their top priority. The State of the Connected Consumer report by Salesforce also mentions that 33% of end consumers are more likely to purchase if content and communication are personalized to their needs.

In such a world, personalization plays an increasingly important role, more so in fashion e-commerce. The Personify platform is a customer data management platform that unifies customer data and communications so that the right message reaches the right person at the right time.

In the last 12 months, the team at Personify undertook major enhancements in terms of scale, efficiency, and accessibility. Major highlights include:

  • Ability to target customers in a matter of hours rather than days: Traditionally, segments at Myntra ran on data available up to the previous day. To improve retention, customer recency is now targeted with segments that run on data from the previous hour rather than the previous day.
  • Segmentation has become faster: To keep up with the constant increase in the number of segments running to support messaging campaigns across all channels, the norm was to keep increasing the underlying infrastructure available to execute these segments. In the last 12 months, however, a long-term view was taken to make segmentation more efficient by using newer technologies and removing legacy modules.
    We were able to increase segmentation efficiency by reducing the average turnaround time (TAT) by 50% compared to CY21. This helps us run 8x more segments than in CY21 with only a 2x increase in underlying infrastructure.
  • Enabling programmatic access to the platform: As use cases with targeting needs grew more complex, requiring users to come onto the platform to create custom segments and then consume them in downstream execution systems did not allow Personify to scale efficiently. In line with what is expected of other internal platforms, accessibility via APIs was added. This enables users to perform CRUD+D (Create, Read, Update, Delete & Download) operations on segments from their own platforms/tools without ever coming onto the Personify platform. This directly reduces the learning time needed to get started on creating and accessing segments for newer use cases.
  • Achieved 99.9% platform SLA: Marketing campaigns are a highly time- and context-specific function. Any miss in campaign execution directly impacts the revenue that the campaign is expected to generate. In such a scenario, generating segments and making them available for campaigns on time becomes a revenue-impacting function. The Personify platform has adopted segment completion as its north star metric, with the aim of always maintaining a 99.9% segment completion rate. This becomes even more important during HRD events, as the platform runs more segments, each with more revenue impact than on regular days. This has helped us create a set of priority segments for which the platform aims for a 100% completion rate, while maintaining its 99.9% segment completion rate for the rest.
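
As an illustration of how the segment completion metric described above could be tracked, the sketch below computes completion rates separately for priority segments and for all other segments, then checks them against the 100% and 99.9% targets. It is a minimal example under our own assumptions: the `SegmentRun` record, its fields, and the sample numbers are hypothetical and not taken from the Personify platform.

```python
from dataclasses import dataclass


@dataclass
class SegmentRun:
    segment_id: str
    completed_on_time: bool
    priority: bool = False


def completion_rate(runs):
    """Fraction of runs that completed on time (1.0 when there are no runs)."""
    if not runs:
        return 1.0
    return sum(run.completed_on_time for run in runs) / len(runs)


def sla_report(runs, regular_target=0.999, priority_target=1.0):
    """Compare completion rates of priority and regular segments to targets."""
    priority_runs = [run for run in runs if run.priority]
    regular_runs = [run for run in runs if not run.priority]
    priority_rate = completion_rate(priority_runs)
    regular_rate = completion_rate(regular_runs)
    return {
        "priority_rate": priority_rate,
        "regular_rate": regular_rate,
        "priority_met": priority_rate >= priority_target,
        "regular_met": regular_rate >= regular_target,
    }


# Hypothetical day: 10 priority runs, 1,990 regular runs, one regular miss.
runs = [SegmentRun(f"p{i}", True, priority=True) for i in range(10)]
runs += [SegmentRun(f"r{i}", True) for i in range(1989)]
runs.append(SegmentRun("r-late", False))

report = sla_report(runs)
print(report)
```

Tracking the two groups separately means a single late priority segment is flagged immediately, even when the overall completion rate still looks healthy.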

Conclusion

The customer segmentation platform at Myntra has been revamped to meet the company’s and its customers’ growing demands for scale, efficiency, and accessibility.

Improvements to the Near Real Time Platform made it possible to personalize recommendations, ads, and search suggestions in real time, whereas improvements to the Serving Platform allowed users to query petabyte-scale data directly from the data lakehouse. On the Segmentation Platform, customers could be targeted in a matter of hours rather than days, eight times more segments were run with only a twofold increase in the underlying infrastructure, programmatic access was made possible, and a platform SLA of 99.9% was achieved. Finally, by improving processes through innovative systems, the Finance Engineering platform helped augment partner experience and productivity. It has become simpler and more cost-effective for brands to create highly relevant and personalized customer experiences thanks to the collective efforts of all of these groups.

The dedication, hard work, and commitment of the Myntra engineering team can be seen in the Data Platform. In just a few short years, the platform has evolved from a single-node system into one capable of handling petabyte-scale real-time analytics.

During the Big Fashion Festival and other important events, customers were able to enjoy a superlative shopping experience, thanks to the platform’s ability to quickly analyze customer behavior. But the data platform’s journey at Myntra is far from over. The engineering team is diligently working to further scale the platform and develop new features to make it more powerful and effective as the company continues to grow.

Authors: Richa Singh, Karthik Kamath, Karan Shah, Anirudh Mangipudi, Nandhini Saravanan, Puneet Mahajan
