A “How To Product Guide” for Machine Learning

Published in

Augment

6 min read · Jun 9, 2020

“Does it really change how a team works if the product involves machine learning?” The short answer is yes!

Machine learning teams and software engineering teams often work differently. To be an amazing product manager for an ML team, you need to understand the differences between software engineering (SWE) and ML, as well as core ML concepts.

This article aims to provide you with the knowledge and resources you need to tackle your next PM for ML project head-on! I picked up these things during my time as an ML Engineering Intern at Deloitte and in my current work with an ML team from the product perspective at Personifi.

So what is different about Machine Learning compared to software engineering?

The inherent process of development is different.

For example, suppose you are building a to-do web app. You can define the features that need to be built relatively easily: a functional frontend for typing the tasks, a backend to store and retrieve this data and delete tasks when they are done, and perhaps a history of completed tasks. As soon as these features are defined, the team can implement an Agile process, break the features into tickets and get to work. Problems are often well defined, although there can be many ways to solve them.

It’s quite different for an ML team.
When trying to tackle a problem with machine learning, a lot of questions need to be answered before you even start development. What are we trying to achieve? What kind of data is required to build the model? Is there enough data? These questions are a lot harder to answer, as they can mean many different things and will often require hours of research before you can start to approach a solution. ML work is also extremely research-focused. During my time at Deloitte, there were entire weeks where I was only doing research, building up my knowledge to answer questions and then building iterative prototypes to come up with solutions that were applicable.

Data Science has 3 main constituents: Calculus, Statistics and Linear Algebra. But wait, do you even need to know these as a Product Manager? In my opinion, no.

When I was working at Deloitte and tackling a new topic, I would often be knee-deep in research and math, trying to figure out what was applicable to our use case. However, when explaining my findings to the rest of my team, I’d make sure to abstract my key learnings and present them in a way that non-ML team members could understand. Even if the ML team needs to get into the nitty-gritty, as a PM you will not have to dive into very specific math. While knowing the math can be useful, it is not needed.

Core Concepts
Knowing core machine learning concepts will be crucial to your success as a PM when working with ML products and teams. Below, I cover some basic principles with examples as explanations.

Exploratory Data Analysis (EDA): Analyzing data sets to summarize their main characteristics, such as distributions, missing values and relationships between variables, before any modeling starts.

Labeled data: Data that comes with the answer attached. Ex. An image of a cat is stored in a folder called “Cat”, and the program is told that the images in that folder are of cats.

Unlabeled data: Data with no answer attached. Ex. An image of a cat is stored in a folder called “data” which also stores images of dogs, and the program does not know which image is a dog and which is a cat.

Features: A list (technically a vector) of properties of the thing we are making a prediction about, represented as numbers.
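To make these terms concrete, here is a minimal Python sketch. The animals, the two features (weight and ear length) and all of the numbers are made up for illustration; real feature vectors would come from your own data.

```python
from statistics import mean

# Hypothetical toy data: each animal is described by a feature vector
# [weight in kg, ear length in cm]. All values are invented.
labeled_data = [
    ([4.0, 6.5], "cat"),    # features plus the known answer (the label)
    ([30.0, 12.0], "dog"),
]

# Unlabeled data: the same kind of feature vectors, with no answers attached.
unlabeled_data = [features for features, _label in labeled_data]

# A first EDA-style summary: the average of one feature across the data set.
mean_weight = mean(features[0] for features in unlabeled_data)

for features, label in labeled_data:
    print(f"features={features} label={label}")
print("unlabeled:", unlabeled_data)
print("mean weight (kg):", mean_weight)
```

The only difference between the two data sets is whether the answer travels with the features, and that difference is what separates the two families of algorithms described next.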

There are 2 main types of Machine Learning Algorithms: Supervised and Unsupervised Learning. (Semi-supervised learning also exists, but it is out of scope for this article.)

Supervised Learning
Supervised learning applies when labeled data exists: the model learns from examples with known answers so that it can automate a decision or generate a prediction (often a probability) for new data. There are 2 types of problems you will face in the field: Regression and Classification.

Regression
When you need to model the relationship between two or more variables in order to predict a numeric value, regression is a good choice. Ex. Does bicep size increase as you do more reps of bicep curls?

Algorithms to look into:
Linear regression, Bayesian Ridge, Neural Networks (this is a whole topic in itself!)
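As a rough illustration of the first algorithm on that list, the sketch below fits a straight line with ordinary least squares in plain Python. The function name `fit_line` and the reps and bicep measurements are hypothetical, chosen only to mirror the bicep example above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented measurements: weekly bicep curl reps and bicep size in cm.
weekly_reps = [50, 100, 150, 200]
bicep_cm = [30.0, 31.0, 32.0, 33.0]

slope, intercept = fit_line(weekly_reps, bicep_cm)
predicted = slope * 250 + intercept  # predicted size at 250 reps per week
print(f"slope={slope:.3f} intercept={intercept:.1f} predicted={predicted:.1f}")
```

The output of a regression model is a number (here, a size in cm), which is the practical way to tell a regression problem from a classification one.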

Classification
When you need to identify which category a piece of data belongs to, classification is employed. Ex. We need to create a spam filter, so we get labeled data that tells us what spam emails look like.

Algorithms to look into:
K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Decision Trees
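Here is a minimal sketch of KNN applied to the spam example, using only the Python standard library. The two features (link count and exclamation mark count), the sample emails and the helper name `knn_predict` are all assumptions made for illustration, not a production spam filter.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _features, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Invented labeled emails: [number of links, number of exclamation marks].
emails = [
    ([8, 10], "spam"),
    ([7, 12], "spam"),
    ([9, 9], "spam"),
    ([0, 1], "not spam"),
    ([1, 0], "not spam"),
    ([0, 0], "not spam"),
]

busy_email = knn_predict(emails, [6, 11])   # many links and exclamation marks
quiet_email = knn_predict(emails, [1, 1])   # almost none of either
print(busy_email, "/", quiet_email)
```

Unlike regression, the model's output here is a category, and the quality of the labeled examples largely determines how good that output can be.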

Unsupervised Learning
Unsupervised learning makes use of unlabeled data to find hidden patterns in it. For example, I used Latent Dirichlet Allocation (LDA) to generate topics from a set of documents. Using this technique, I was able to draw tags from the documents and identify recurring patterns in them.

Algorithms to look into:
K-means, Principal Component Analysis (PCA)
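To show what finding hidden patterns can look like, the sketch below runs a bare-bones k-means in plain Python. It is not the LDA approach mentioned above. The points, the starting centroids and the function name `k_means` are hypothetical, and the starting centroids are fixed by hand so the result is repeatable.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recenter."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            [sum(axis) / len(cluster) for axis in zip(*cluster)] if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Invented unlabeled points that happen to form two groups.
points = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]]
centroids, clusters = k_means(points, centroids=[[1, 1], [8, 8]])
print("centroids:", centroids)
print("clusters:", clusters)
```

No labels were provided at any point. The algorithm only groups points that sit close together, and it is up to the team to interpret what each group means.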

How to help ML teams prioritize
As a PM, you will need to track how ML teams are doing on a day-to-day basis. Andrew Ng (a very famous data scientist) recommends a daily sprint: while things take a long time to research and understand, goals change daily as more information is collected.

For example, imagine a few college students building a web app for a project. If no one has experience building a web app, they will first do research to understand what to build. It is similar for machine learning teams. Most of the time you will have an idea of things to look into, but you will uncover many different things along the way, and you will prototype, test and validate whether a certain idea, technology or research paper could help build the solution.

As a PM, you need to track the ML teams and understand where they need to focus. A good way to do this is a 30-second to 1-minute presentation per person covering their findings from the past day, whether from research, data analysis or building models. This lets the team and the PM understand what is being worked on and, if dead-ends are being hit, how to go about solving them.

There are times when research will hit a dead-end, a model will not perform as expected in production or in a different environment, or what was built cannot be used because of technology integration constraints.

In these cases, the PM needs to bring the business mindset to the data science team and help the team realign its development goals with those of the product. You also need to facilitate sessions where the team can come up with alternative solutions to the roadblock, and then weigh the feasibility of each proposed solution from a product and business perspective.

User + Product Perspective
Engineers can forget whom they are building for, and this also applies to data scientists. As a PM, you need to translate the vision of the product into requirements and features the development teams can understand. It is important to help the data scientists understand the user well: the better they understand the use case, the better the ML algorithms they can develop.

Conclusion
Understanding the differences between ML teams and regular SWE teams can help you become a better PM. By combining your product sense with knowledge of core ML principles, you give the engineers a better grasp of product expectations so that the team can ultimately ship a coherent product.

Resources:
The Hundred-Page Machine Learning Book by Andriy Burkov
A math-based dive into ML, good for people with a math/CS/engineering background.

Towards Data Science: a Medium publication for learning about ML

How to Be a Great Machine Learning PM by Google Product Manager

Machine Learning for Product Managers (Highly recommend this series)

Written by Parth Sareen