AI & ML at Flipkart Engineering — CTOtalk with Ravi Garikipati
I have been working on ML and DNN applications for nearly two years, along with some big data analytics work. Most of what I know about ML/AI comes from self-learning, open source implementations, and a few courses on Udacity/Udemy. No doubt ML/AI has opened a new dimension in how data can be used.
But whenever I hear or read ‘we are an AI First company’, I wonder what exactly an ‘AI First’ strategy means for a company. How is it implemented or practiced?
I attended the CTOTalk session with Ravi Garikipati, CTO of Flipkart, and he answered exactly that. The rest of this post covers my takeaways from the session.
‘Data’ is the new oil
Well, all of us know data is king and can do wonders. The amount of data Flipkart generates on a normal day, and on special occasions like the Big Billion Day sale, is really huge.
It is 10+ TB of raw text data on a normal day and 50+ TB on Big Billion Day. Here is another theory in practice: ‘collect every possible piece of data, model ML around it later’. It is not just click streams; they collect all kinds of other information, such as user device, location, product searches, ratings, reviews, and feedback on recommended items.
I had always wondered how feedback or validation for a demographics prediction works, apart from the user confirming it by filling in their profile data.
In fact, I asked him about this. He said the recommendations, plus users' actions on them, are the main feedback for the models, which made a lot of sense.
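To make that idea concrete for myself, here is a minimal sketch of how actions on recommendations could be turned into feedback labels. This is my own illustration, not Flipkart's implementation: the event format, the action names, and the `feedback_labels` helper are all assumptions.

```python
from collections import defaultdict

# Hypothetical action names; which actions count as positive feedback
# is an assumption made for this illustration.
POSITIVE_ACTIONS = {"click", "add_to_cart", "purchase"}


def feedback_labels(events):
    """Turn (user, recommended_item, action) events into labelled rows.

    A (user, item) pair gets label 1 if the user took at least one
    positive action on the recommendation, otherwise 0.
    """
    acted = defaultdict(bool)
    for user, item, action in events:
        key = (user, item)
        acted[key] = acted[key] or action in POSITIVE_ACTIONS
    return [(user, item, int(flag)) for (user, item), flag in sorted(acted.items())]


# Hypothetical event log for two users.
events = [
    ("u1", "shoes", "impression"),
    ("u1", "shoes", "click"),
    ("u1", "watch", "impression"),
    ("u2", "shoes", "impression"),
    ("u2", "shoes", "purchase"),
]

for row in feedback_labels(events):
    print(row)
```

Rows labelled 1 would be recommendations the user acted on, and rows labelled 0 would be ones the user ignored. A model trained on such rows gets feedback without anyone filling in a profile.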
Hadoop Ecosystem — in practice
What I really liked about this part is that Flipkart has built a ‘Big data platform’ on top of open source tools. It supports all kinds of processing, from batch processing to near real-time decision parameters with under two minutes of latency. The platform ingests 10–50 TB of raw data every day and has a major influence on data-driven decisions in personalisation, fraud detection, marketing/sales targeting, and validation. In particular, Flipkart runs 60+ ML models on this data to build 40+ insights on users, ranging from demographics and behaviour to fraud detection.
‘AI First’ vision at Flipkart
This was the most appealing part for me, since it covers how AI is practiced inside a company. Ravi explained how, as a matter of culture, Flipkart has made its ML platform and its data available to everyone in the company. He went on to explain how an internal tool, along the lines of kaggle.com, helps them pick the best performing models for some problems.
They also work with academia to validate new models that are still under research. AI/ML skills training has been made compulsory for all new hires, along with e-commerce domain training. They have also invested in high-performance hardware dedicated to ML modelling and validation, and made it available to all developers and data scientists.
Case Studies
Ravi ended the session with a few case studies on applying AI/ML.
Personalization
He explained how personalised recommendations are more appealing to users and are getting them a 70% retention rate. There was a lot of focus on this part, especially on the 30% bounce rate and areas for improvement. He showed how they have built various conversion funnels for different types of users and recommendations.
Visual Similar Search
He showed us how their ‘Image Feature’ based similarity measure gives better results than DNN models, and how a combination of the two gives them better results.
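The talk did not go into how the two signals are combined, so the following is only a sketch of one common way to do it: a weighted average of two cosine similarities. The function names, the `weight` parameter, and the toy vectors are my assumptions, not anything shown in the session.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combined_similarity(feat_a, feat_b, emb_a, emb_b, weight=0.5):
    """Blend image-feature similarity with DNN-embedding similarity.

    `weight` is the share given to the image-feature score; the rest
    goes to the embedding score. The value 0.5 is an arbitrary default.
    """
    return weight * cosine(feat_a, feat_b) + (1 - weight) * cosine(emb_a, emb_b)


# Toy vectors standing in for image features (e.g. colour histograms)
# and DNN embeddings of two product images.
score = combined_similarity([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
print(score)
```

In practice the weight would have to be tuned against relevance judgements, which is presumably where a comparison like the one he showed comes in.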
Customer Reviews, Sentiment, Aspects
Ravi spoke about how they have automated the moderation of customer reviews, identifying sentiments and aggregating them. They have built features such as auto-titling a review based on its description, as well as rating calculation.
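As a rough picture of what the aggregation step could look like, here is a small sketch that averages per-aspect sentiment scores across reviews. It assumes an upstream model has already tagged each review with aspects and scores in the range -1 to 1; the data shape and the `aggregate_aspects` helper are hypothetical and not taken from the talk.

```python
from collections import defaultdict


def aggregate_aspects(review_aspects):
    """Average the sentiment score of each aspect over all reviews.

    `review_aspects` is a list with one dict per review, mapping an
    aspect name to a sentiment score (assumed to lie in [-1, 1]).
    """
    totals = defaultdict(lambda: [0.0, 0])  # aspect -> [score sum, count]
    for aspects in review_aspects:
        for aspect, score in aspects.items():
            totals[aspect][0] += score
            totals[aspect][1] += 1
    return {aspect: total / count for aspect, (total, count) in totals.items()}


# Hypothetical output of an aspect-level sentiment model for two reviews.
reviews = [
    {"battery": 1.0, "camera": -1.0},
    {"battery": 0.0},
]
print(aggregate_aspects(reviews))
```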
RAPID Design Automation
He explained how their fashion unit ‘Myntra’ uses a vision system based on Generative Adversarial Networks (GANs) to generate new fashion designs. The generated designs look much like human-made ones, and he showed us some of them.
Conclusion
After these case studies, he opened the stage for questions. We had some interesting discussions on Flipkart recommendations, the unsolved problems they are working on, and their ML platform being on par with its open source counterparts. He concluded the session with a point on how the ‘AI First’ vision runs through every feature of their product, with their data science team looking at where AI/ML can be applied.