15 Key Learnings for Product Managers in ML Engagements, Part 2

Published in Product @ Publicis Sapient, Dec 17, 2021

We are back to discuss the remaining learnings from our journey as product managers in ML engagements. We hope you liked, and more importantly benefited from, part 1 of this series.

Let’s get going here!

Data Selection & Preprocessing

In data selection, product managers need to identify the various data sources (files, databases, web, etc.) and assess which datasets are available, missing, or irrelevant. They also need to assess the quantity and quality of the datasets. As a rule of thumb, more relevant, good-quality data tends to produce more accurate predictions.

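Product managers need to work closely with data engineers to preprocess the data. Preprocessing is all about making the data fit for purpose, for example by finding missing, duplicate, and inconsistent entries and converting raw data into a usable format, as the case below illustrates.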

Case in point: An event management company wants to increase customer turnout for upcoming events and plans to target the right set of customers via an email campaign. It can deploy an ML model to predict the customers with the highest probability of attending the event. The product manager needs to identify data sources for collecting customer order history, customer demographics, etc. Available datasets need to be verified to find missing, duplicate, and inconsistent entries. Whenever required, the product manager needs to work with the data engineer to convert raw data into a usable format.
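The verification step in this case can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the field names (customer_id, age, email), the sample records, and the 0 to 120 age range are hypothetical and not taken from any engagement.

```python
from collections import Counter

# Hypothetical customer records; field names and values are illustrative only.
records = [
    {"customer_id": 1, "age": 34, "email": "a@example.com"},
    {"customer_id": 2, "age": None, "email": "b@example.com"},  # missing age
    {"customer_id": 2, "age": None, "email": "b@example.com"},  # duplicate row
    {"customer_id": 3, "age": -5, "email": "c@example.com"},    # inconsistent age
]


def audit(rows, key="customer_id"):
    """Count missing, duplicate, and inconsistent entries in a list of dicts."""
    # A row counts as "missing" if any of its fields is empty.
    missing = sum(1 for r in rows if any(v is None for v in r.values()))
    # Every repeat of a key beyond its first occurrence counts as a duplicate.
    key_counts = Counter(r[key] for r in rows)
    duplicates = sum(count - 1 for count in key_counts.values() if count > 1)
    # Assumed sanity rule: a valid age lies between 0 and 120.
    inconsistent = sum(
        1 for r in rows if r["age"] is not None and not 0 <= r["age"] <= 120
    )
    return {"missing": missing, "duplicates": duplicates, "inconsistent": inconsistent}


report = audit(records)
print(report)  # {'missing': 2, 'duplicates': 1, 'inconsistent': 1}
```

A report like this gives the product manager and the data engineer a shared, concrete starting point for deciding what to clean and what to drop.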

To develop a logistic regression model in this scenario, the dependent variable could be y (whether the customer will attend the event or not), and the independent variables could be x1 (event attendance history), x2 (age), x3 (profession), x4 (payment method), x5 (location), x6 (referral), and x7 (income). As a product manager, you should be in a position to discard payment method (x4) as irrelevant, and income (x7) because, although important, the data is unreliable and incomplete.
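To make the scenario concrete, here is a minimal sketch of such a model in plain Python. Everything in it is an assumption for illustration: the eight customer rows are made up, profession (x3) and location (x5) are left out for brevity, the scaling constants are arbitrary, and the model is fitted with simple gradient descent rather than an ML library, which a real team would more likely use.

```python
import math

# Hypothetical rows:
# (past_events_attended, age, payment_method, referred, income, attended)
# income is None where it was never captured.
raw = [
    (5, 28, "card", 1, 52000, 1),
    (0, 45, "cash", 0, None, 0),
    (3, 33, "card", 1, None, 1),
    (1, 51, "wallet", 0, 61000, 0),
    (4, 30, "cash", 0, 48000, 1),
    (0, 39, "card", 0, None, 0),
    (2, 26, "wallet", 1, 39000, 1),
    (1, 60, "card", 0, None, 0),
]


def to_features(row):
    """Keep x1, x2, x6; drop payment method (x4) and income (x7)."""
    past, age, _payment, referred, _income, _label = row
    # Leading 1.0 is the intercept term; the divisors are rough scaling choices.
    return [1.0, past / 5.0, age / 100.0, float(referred)]


X = [to_features(r) for r in raw]
y = [r[-1] for r in raw]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, x):
    """Predicted probability that the customer attends."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def fit(X, y, lr=0.5, epochs=5000):
    """Batch gradient descent on the logistic (log-loss) objective."""
    weights = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, target in zip(X, y):
            error = predict_proba(weights, x) - target
            for j, xj in enumerate(x):
                grad[j] += error * xj
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
    return weights


weights = fit(X, y)

# Rank customers by predicted probability, highest first, for the campaign.
ranked = sorted(raw, key=lambda r: predict_proba(weights, to_features(r)), reverse=True)
for row in ranked[:3]:
    print(row[:2], round(predict_proba(weights, to_features(row)), 2))
```

The point for a product manager is the to_features step: which columns go into the model, and which are left out, is a product decision as much as a technical one.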

We would also like to cite a few other key aspects of data selection.

Role of Bias

One key point to note is that we, as product managers, come with our own biases before we start building a product. A diverse dataset alone cannot remove human bias from the technology being built, because machine learning essentially tries to mimic human behavior and therefore inherits the biases that come with it, such as interaction bias, latent bias, and selection bias. Here’s a simple explanation from Google. The affected decisions can range from trivial ones, such as selecting a type of shoe on a screen, to far more sensitive ones, such as gender imbalance. I am sure most of us are aware of this infamous story. We know that an ideal state is impossible, but we should try to make products that make better decisions than a human can, in much less time. Selecting the right dataset to begin with, and then reviewing it constantly, is highly recommended.

Causation & Correlation

Another relevant point that we would like to mention here comes from the field of economics and statistics: causation and correlation. We often tend to arrive at a wrong conclusion even when it is based purely on data. While it is too long a topic to cover here, we highly recommend that you go through this piece, which will not only sensitize you to the subject but also explain the matter in a fun way. You can relate this to the example above, where a PM might have to discard x4 and x7: the data might show a linear relation between two variables, yet the evidence might prove that one does not cause the other.
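A small numeric illustration may help here. In the sketch below, which uses made-up numbers, customer tenure is assumed to drive both whether a customer saves a card for payment and how many events they attend. The two downstream variables then correlate strongly even though, by construction, neither causes the other.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: tenure (in years) is the hidden common driver.
tenure = [1, 2, 3, 4, 5, 6]
uses_saved_card = [0, 0, 1, 1, 1, 1]  # long-time customers tend to save a card
events_attended = [0, 1, 2, 2, 4, 5]  # long-time customers attend more events

r = pearson(uses_saved_card, events_attended)
print(round(r, 2))  # 0.76: strong correlation, but no causal link was built in
```

Seeing a coefficient like this, it would be tempting to conclude that payment method predicts attendance. Asking what else could be driving both variables is the habit worth building.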

Product managers may also need to work with data engineers in identifying data pipelines that are highly available & reliable and can work seamlessly with scalable data inflow.

Feature Engineering & Modeling

Product managers need to have an experimentation mindset. They should be able to take cognizance of the product goal and then think about the possible modeling techniques at their disposal, say supervised or unsupervised ML, that can help them achieve it. This, in essence, is the craft of experimentation.

Most of the time, product managers are involved in running different experiments, such as A/B testing or multivariate testing, to test a hypothesis based on the initial user research they have done. This requires an appreciation of statistics and design principles.
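As an example of the statistics involved, the sketch below runs a two-proportion z-test, one common way to check whether the difference between two variants in an A/B test is likely to be real. The visitor and conversion counts are hypothetical, and the choice of this particular test is our assumption for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return the z statistic and two-sided p-value for two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: variant A converts 120 of 2000, variant B 156 of 2000.
z, p_value = two_proportion_z_test(conv_a=120, n_a=2000, conv_b=156, n_b=2000)
print(round(z, 2), round(p_value, 3))
```

With these numbers the p-value comes out below the conventional 0.05 threshold, so the uplift of variant B would usually be treated as statistically significant.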

However, the experimentation required while building an ML product might be a little different. Product managers need to guide ML engineers in coming up with different models and help them choose the best model to ensure success.

Case in Point: Consider a past engagement of ours with an automotive client. We were working with a team of ML engineers to build a document ranking model. Our role came in handy there in defining the topic n-grams for the Word2Vec model. We engaged with the team constantly to develop, deploy, and evaluate the model. Again, the results are not binary in nature, and it takes time to arrive at an acceptable level. We were able to leverage an in-house ML-based product named KaaS, which helped us reduce the model training effort through out-of-the-box services such as Semantic Ranking and Paraphrase Detection.

Case in Point: In another engagement with a global CPG client, the business goal was to increase visitor engagement on their websites. The strategy was to first identify the visitors’ needs and then serve them the most relevant content. The product line in this case was hair care, with accompanying content aimed at helping customers find the right product for their specific hair problems: frizzy hair, oily hair, broken and thin hair. ML was a definite go-to strategy for improving this experience via personalized product recommendations. The devised solution was an NLP-based ML model that analyzed the product description data with algorithms such as TF-IDF, LDA, and TextRank to extract the most prominent keywords, followed by another ML model that predicted product scores based on these keywords and the combination of inputs passed by the visitor (via an online quiz). We saw a sizable uplift in visitor engagement because visitors were able to find the products most relevant to their needs.
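To give a feel for the keyword extraction step, here is a minimal TF-IDF sketch in plain Python. It covers only TF-IDF, not LDA or TextRank, and it is not the client solution: the product names, descriptions, and stopword list are hypothetical.

```python
import math
from collections import Counter

# Hypothetical product descriptions, for illustration only.
descriptions = {
    "smooth_serum": "serum for frizzy hair that tames frizzy flyaways",
    "clarifying_shampoo": "shampoo for oily hair that removes oily buildup",
    "repair_mask": "mask for broken and thin hair that repairs broken ends",
}
STOPWORDS = {"for", "that", "and", "the", "a"}


def tokenize(text):
    return [word for word in text.lower().split() if word not in STOPWORDS]


def top_keywords(docs, k=2):
    """Return the k highest-scoring TF-IDF keywords for each document."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many descriptions does each word appear?
    doc_freq = Counter(word for words in tokens.values() for word in set(words))
    result = {}
    for name, words in tokens.items():
        term_counts = Counter(words)
        scores = {
            word: (count / len(words)) * math.log(n_docs / doc_freq[word])
            for word, count in term_counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        result[name] = sorted(scores, key=lambda w: (-scores[w], w))[:k]
    return result


keywords = top_keywords(descriptions)
print(keywords)
```

Note how "hair" scores zero because it appears in every description, while "frizzy", "oily", and "broken" rise to the top. That is exactly the behavior you want when the goal is to match products to a visitor's specific hair problem.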

Similarly, the ML team can come up with n types of models, and product managers should guide them by providing important inputs and communicating the expected outputs.

The diagram below best explains this cycle:

In essence, product managers should help the team in finding the best model and it can involve the following activities:

Deployment

Deployment is the last step where the best version of your ML model goes live. Product managers need to evaluate the model’s performance against the ML goals (ML KPIs) and assess any deviations.

It is also worth noting that model performance needs to be evaluated on an ongoing basis as business requirements change. Product managers must rely on transparent communication when model performance deviates and when formulating the rollout strategy. They need to work closely with all relevant stakeholders to set the deployment cadence: daily, weekly, monthly, quarterly, etc.
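The ongoing evaluation described above can start as something very simple. The sketch below compares observed metrics with agreed targets and flags any shortfall beyond a tolerance. The metric names, target values, and the 0.05 tolerance are hypothetical, and real ML KPIs will differ from product to product.

```python
def kpi_deviations(observed, targets, tolerance=0.05):
    """Return the KPIs whose observed value falls short of target by more than tolerance."""
    return {
        name: round(targets[name] - value, 4)
        for name, value in observed.items()
        if targets[name] - value > tolerance
    }


# Hypothetical targets agreed with stakeholders, and this week's observed metrics.
targets = {"precision": 0.80, "recall": 0.70}
weekly = {"precision": 0.78, "recall": 0.61}

alerts = kpi_deviations(weekly, targets)
print(alerts)  # {'recall': 0.09}
```

A flagged metric is a trigger for the conversation with stakeholders, not a verdict on the model: the product manager still has to judge whether the deviation matters for the business goal.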

In the event of a latency issue, product managers need to work with ML engineers to improve the performance (output lag) of the machine learning models.
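Before that conversation with ML engineers, it helps to have numbers. The sketch below times repeated calls to a prediction function and reports the median and 95th percentile latency. The fake_model function is a stand-in we made up. In practice you would pass in the real model's prediction call.

```python
import statistics
import time


def measure_latency_ms(predict, inputs):
    """Time each call to predict() and return the latencies in milliseconds."""
    timings = []
    for item in inputs:
        start = time.perf_counter()
        predict(item)
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings):
    """Median and 95th percentile (nearest-rank method) of the latencies."""
    ordered = sorted(timings)
    rank = -(-95 * len(ordered) // 100)  # ceil(0.95 * n) using integer arithmetic
    return {"median_ms": statistics.median(ordered), "p95_ms": ordered[rank - 1]}


def fake_model(x):
    # Stand-in for a real model call; the workload is arbitrary.
    return sum(i * i for i in range(1000)) + x


timings = measure_latency_ms(fake_model, range(50))
summary = summarize(timings)
print(summary)
```

Tracking the 95th percentile alongside the median matters because a model that is fast on average can still feel slow to the users who hit the worst cases.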

Case in point: Model deployment cadence varies a lot with user activity patterns and the velocity of new content creation, for example on social media platforms. Platforms such as Facebook and YouTube have been following a real-time deployment cadence.

Well, those were our key learnings! Before we conclude, it is also worth sharing the importance of soft skills while working in ML engagements. More often than not, the team has members who hold master's degrees or PhDs, and it becomes imperative that we understand their perspective to have meaningful conversations. It does require a new mindset to look at things from a scientific or statistical perspective. Having said that, we want to bust the myth that a product manager needs to be as technically sound as a data scientist. Today, product managers can reach the minimum proficiency in AI & ML by using freely available tools and learning resources such as Google Colab, Jupyter Notebook, and the recently launched Amazon ML University. Again, there is no limit to one’s learning, and the more the merrier. However, a product manager’s ultimate goal is to ensure project success by delivering business value.

While it is super exciting and complex at the same time, the ML journey continues to present us with new learnings and challenges every day. We hope that you find these learnings helpful, and we would like to hear from you about your experiences and favourite ML products.

Thank you,

Nithin Subhakar, Shubham Tripathi
