A Non-Linear Data Science Journey

Published in CommerceIQ

5 min read · Nov 13, 2019

Hi! My name is Shubham Maurya. I’m a Data Scientist at CommerceIQ, where I build intelligent recommendations for e-commerce brands, write algorithms to optimize advertising dollars, and constantly ask annoying questions around data quality. I joined full-time this year, after interning here last summer. Coming back was one of the easiest decisions I’ve made!

My background in Data Science is fairly eclectic. I started out in the Economics sphere, interning at places like J-PAL (of Abhijit Banerjee fame) and EPIC, before trying DS in tech firms. I have degrees in Computer Science and Economics from BITS Pilani, which has given me a strong background in computing and modelling.

I used internships as a way to explore my interests, and got the chance to work on a wide variety of problems.

At EPIC, an Energy Policy institute, I worked with a large electricity dataset and night lights data. This was my first experience working on a ‘data’ problem, and taught me how important modelling is.
At Flipkart, I spent 3–4 months working on one (very interesting) problem called map matching, using location data, with a laser focus on ensuring correctness.
SportStack gave me the opportunity to build custom dashboards for them in Shiny, and write analysis pieces based on football data.

Apart from internships, I’ve also tried Kaggle competitions, and scraped my own football dataset to explore a hypothesis. SportStack actually reached out to me after discovering one of my Kaggle Kernels!

These experiences have given me confidence that I can adapt to any type of data, as long as I follow one first principle: understand your data. While upskilling and learning new methods is obviously required, I believe that EDA and looking for gaps in the data are just as important.

All the places I’ve interned had one thing in common: the presence of extremely strong mentorship. My mentor at EPIC constantly challenged me to learn more and really think about the problem and explore the data. I was given room to experiment, and used the freedom to learn tools like RShiny. At both Flipkart and CommerceIQ, it was drilled into my head to establish functional correctness first, and think about the value your model is expected to deliver — accuracy is not necessarily the only aim.

When I interned at CommerceIQ last summer, I worked towards building a Root-Cause-Analysis model for Market share changes. Two things immediately struck me: building production-quality data science models is extremely complicated, and models that aid decision-making need to be interpretable. We’d constantly discuss how the model needs to generalise well across customers (not easy because of the variety of customers) and make sense (so that the insights are actionable). I pored over the literature to understand what modelling approach would make sense, and had constant discussions on ways to iteratively improve the model. I had wonderful colleagues who helped me through the internship, in terms of understanding the data sources and thinking through approaches. They actively kept me engaged even before I joined full-time, inviting me for Beer Bashes, and helping me understand the type of problems I’d be working on.

Since joining full-time, I’ve had the chance to work on several other problems. We work with a large variety of datasets: Pricing, Promotions, Marketing, Fulfilment, Forecasting and so on. Effectively, everything related to E-Commerce! Some problems were improvements, such as ‘how can we remove the effect of seasonality in sales from our recommendations?’; others are being written from scratch, like implementing an algorithmic strategy to optimize advertising spend on Amazon; and, obviously, there were bug fixes too! All the problems had one thing in common though: they were ambiguous problem statements, and the onus was on me to talk to Product/Engg/Customer Success, figure out exactly what was needed, work on the end-to-end delivery, and release into production. While this is challenging, it is immensely gratifying to see my models go live and receive positive feedback from customers! Often, I’ve found myself stuck on questions like whether I can model something as a Normal Distribution, or which metric or feature makes sense, and my team has always come to the rescue! I’ve also been able to learn about the problems my team is working on, like calculating Price Elasticity, estimating competitor sales, and predicting when Amazon will discontinue a product.

I think start-ups are an extremely interesting place to work in Data Science: you work on a wide variety of problems and wear multiple hats, acting as PM, analyst, and engineer all at once. You’re likely to be responsible for putting your own models into production and maintaining them. You might have to do some reporting or analysis at times, but that’s part of what you sign up for. You’re also likely to get the chance to build a lot of stuff from scratch, which is an incredible learning experience.

After you’ve done the hard work of building your portfolio by doing coursework and working on projects, it makes sense to think about the role and organization to join. These are some lessons I’ve picked up on the way.

Understand the role. Given the ambiguity in the title, make sure you pore over every detail in the job description. Will you be responsible for reporting numbers? What tools do they use? Is it more of an analyst role? Some orgs will look for people with NLP/CV skills, while others might need a strong grasp of Statistics. Make sure you apply to places that suit your interests and skill set.
Understand the organization. Is Data Science a part of their core product, or is it something they’re ‘trying out’ to see if it’ll add value? Go through the company site and their blog, if they have one. Look through LinkedIn to see who your prospective colleagues will be, and reach out to them if you need more information. Most folks are happy to help!
Look for good mentors, especially early in your career. While there is ample scope for self-learning in this field, there’s nothing quite like a good mentor to show you best practices and ensure you’re moving in the right direction. This extends to having good colleagues as well — a significant part of my learning has come from observing the tools and techniques my colleagues were using.

I hope you’ve found this useful! It’s been quite a fun journey, and I feel like I’m just getting started. There’s lots more to learn (as my unfinished books and Coursera courses would testify), and lots of interesting problems to solve here. In case you have any queries, would like to explore opportunities at CommerceIQ, or generally keep in touch, please feel free to reach out to me!

Written by Shubham Maurya