AI Product Management P2 and P3: What priorities should you set for your AI model and how do you know how much training data you need?
Learn about how to track metrics beyond accuracy and how to accelerate gathering training data for your AI model.
This article is part of a series that breaks down AI product management into 5 distinct phases. The introduction to the series starts here.
Phase 2: Priority Setting
I’m sure you know, as a fellow Product Manager, how priorities align everyone around the product. When you build an AI product, there are AI-specific priorities to consider.
So, what priorities should you set and track?
It’s not only about accuracy.
Here are a few thoughts on why accuracy is too limited a metric to optimize for, along with additional priorities that are helpful to track alongside it. These priorities will drive the AI model design and shape model improvement plans.
1. Accuracy
How important will accuracy be for you, and how do you optimize for it?
Your data scientists can tell you accuracy metrics. But it may turn out that the inaccuracies aren’t significant. If you are building an AI chatbot, transcribing “A” instead of “The” may not change the meaning at all! Accuracy will also never be 100%. In fact, if you do get an accuracy metric of 100%, that’s a red flag: your AI model has almost certainly overfit to your data set and will degrade as soon as it sees new data.
Thus, accuracy is just one data point for evaluating the AI model. If the bot is wrong 5% of the time but people are completing their calls and getting what they need, does it matter? How much cost are you willing to take on to improve your model? If 6 months of work makes the AI model 1% better, is it worth it? The accuracy of most AI models will also eventually plateau, and increasing accuracy by X% at that stage can cost as much as the entire project to date.
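To make this concrete, here is a minimal sketch (with made-up labels) of why a single accuracy number can mislead, especially on imbalanced data: a baseline that always predicts the majority class scores high on accuracy while catching zero positive cases.

```python
# Toy illustration: accuracy vs. recall on an imbalanced label set.

def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true label
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    # Fraction of true positives the model actually caught
    positives = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    return sum(p == 1 for _, p in positives) / len(positives)

# 95 negative examples, 5 positive ones
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100          # "always predict negative" baseline

print(accuracy(y_true, y_pred))  # 0.95 -- looks great
print(recall(y_true, y_pred))    # 0.0  -- useless for the positive class
```

The baseline "wins" on accuracy while failing every user who needed the positive case handled, which is exactly why the business KPIs below matter.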
I hope you can see that accuracy is not the be-all and end-all. Depending on the context, it may not even impact your end-users. There are also ways to mitigate inaccuracies that don’t involve re-training the AI model. For instance, you can give users the ability to correct the system themselves or display error messages. In chatbots, a message that says, “Is there anything else I can help you with?” helps if the user asked for many things and the chatbot only found one.
Approach “accuracy metrics” as just one of many priorities to track and improve over time. Read on to learn about other priorities you can set for your AI model.
2. Explainability
Explainability covers the need to understand how the AI model got from point A to point B.
If explainability is important to your users, this can impact the AI model you can use for your product. Some AI models, like neural networks, operate like a ‘black box.’ Data went into the model and some output came out. It is difficult to explain how or why that output occurred.
Additionally, this can impact the way you design the UI of your AI product. My design team spent time developing explanations of the AI model output in the UI. Our product used a natural language processing model that analyzed the quality of requirements. The output of that natural language processing model was a string of entities, which was not very useful at first glance. So, we built a series of business rules that assigned weights to the entities to compute a single score. The UI only showed that score to the end-user, an explanation of why, and suggestions to improve the score (we hid the complex entity output). You may need to iterate on a design that’s most understandable by the end-user.
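A hypothetical sketch of that "business rules" layer might look like the following. All entity names, weights, and suggestion strings here are made up for illustration; the point is collapsing raw NLP entity output into one score plus human-readable guidance.

```python
# Hypothetical weights: each entity found in a requirement nudges the score.
ENTITY_WEIGHTS = {
    "ambiguous_term": -15,
    "passive_voice": -5,
    "measurable_criterion": +10,
}

# Hypothetical user-facing suggestions keyed by entity.
SUGGESTIONS = {
    "ambiguous_term": "Replace vague words like 'fast' with concrete values.",
    "passive_voice": "Rewrite in active voice to clarify the actor.",
}

def score_requirement(entities):
    """Collapse raw entity labels into one 0-100 score with suggestions."""
    score = 100 + sum(ENTITY_WEIGHTS.get(e, 0) for e in entities)
    score = max(0, min(100, score))  # clamp to 0-100
    tips = [SUGGESTIONS[e] for e in entities if e in SUGGESTIONS]
    return score, tips

score, tips = score_requirement(["ambiguous_term", "passive_voice"])
print(score)  # 80
```

The end-user sees "80" and two concrete suggestions instead of a string of entity labels, which is the kind of iteration on understandability described above.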
3. Business KPIs
Business KPIs are important to check your AI model’s accuracy in context.
At the end of the day, the success of your AI product hinges on whether it helps your end-users achieve their goals. If the AI bot is wrong 5% of the time but people are still getting what they need, then does it matter?
Your business KPIs should be your North Star when deciding whether to invest in improving accuracy.
4. Bias mitigation
The AI model’s output is directly related to its input. Thus, it’s important to have good representation and diversity in the training data.
This may or may not be relevant depending on your use case. It will depend on how much this bias will impact your end-users. If you’re trying to identify a dog in a picture, this may not be relevant. But, if you are building an AI tool that can recommend bank loans, having a bias towards gender, race, or age will have a material impact on your end-users. In this case, having a priority in place to keep training until the internal bias is gone is critical. You can use an ethical framework to help brainstorm and mitigate the potential primary, secondary and tertiary effects of your AI model. There are also some Watson services that can help identify and mitigate bias.
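One simple bias check you could run on training data is comparing positive-label rates across a sensitive attribute. The sketch below uses toy rows and made-up field names; a large gap between groups is a signal to investigate before training something like a loan-recommendation model.

```python
from collections import defaultdict

def approval_rate_by_group(rows, group_key="gender", label_key="approved"):
    """Return the positive-label rate for each value of a sensitive attribute."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for row in rows:
        counts[row[group_key]][1] += 1
        counts[row[group_key]][0] += row[label_key]
    return {g: approved / total for g, (approved, total) in counts.items()}

# Toy training rows for illustration only
training_rows = [
    {"gender": "F", "approved": 1}, {"gender": "F", "approved": 0},
    {"gender": "M", "approved": 1}, {"gender": "M", "approved": 1},
]
rates = approval_rate_by_group(training_rows)
print(rates)  # {'F': 0.5, 'M': 1.0} -- a gap worth investigating
```

This only surfaces one kind of disparity in the labels themselves; dedicated toolkits (such as the Watson services mentioned above) go further into model-level bias detection and mitigation.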
- You can’t only optimize for accuracy.
- Align your priorities against your end user’s needs.
- Your priorities will drive algorithm choice, UX design and AI model design.
- Make sure to set and track these priorities.
Phase 3: Training Data
Consideration: How much training data do I need?
If you decide to go ahead, don’t assume that you’ll need a lot of training data. There can be a lot of hesitation to start because there’s an assumption that AI models need a lot of training data. But the amount of data you need depends on the entities you’re labeling and the models you’re building.
It’s hard to generalize… but here are 6 rough rules of thumb.
1. Quality over quantity
The more data, the better, but quality also matters. “Quality” means getting training data as close as possible to what the AI model will see in production.
As a rule of thumb, the size of your future model performance problem is the distance between the synthetic training data you use and the real data your users are providing. You may want to create fake data, but you are not as good at simulating what your users want as you think you are. You can bootstrap this process by building interfaces to gather data or extracting data from sources like logs or search queries.
In our case, I spearheaded a client program where we partnered with our enterprise customers to train the AI model. These customers gave us access to training data and were partners throughout the entire process. Being able to use near-production-level training data helped us speed up this phase. You should consider engaging a few of your trusted customers as well!
2. The algorithm you choose matters
Deep learning algorithms need more data.
3. The more you want, the more you need
If you want to extract 100 entities, that will take more data and time than extracting 1 entity.
4. Pick the right AI tool
Some AI tools might train with less data but they may give less accurate results. You should test different AI tools for your use case and select the tool that performs the best.
5. Gathering minute details may mean longer training cycles
Identifying New York City vs. Taipei City in a photo may take more training cycles than distinguishing a city from a forest.
6. Data organization and clean-up may take more time than you expect
You’ll need to plan for enough time to clean the training data. Many projects can spend 80% of their time in this “data janitorial” work. Be careful to set expectations with the business and give your data scientists enough time to do this work!
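A minimal sketch of that "data janitorial" work might look like the following: deduplicating, dropping empty records, and normalizing text before labeling. Real pipelines are far larger; this just illustrates why the step deserves schedule time.

```python
def clean(records):
    """Normalize, then drop empty and duplicate text records."""
    seen = set()
    cleaned = []
    for text in records:
        text = " ".join(text.split()).lower()  # collapse whitespace, lowercase
        if not text or text in seen:           # drop empties and duplicates
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned

raw = ["  Reset my password ", "reset my password", "", "Check order status"]
print(clean(raw))  # ['reset my password', 'check order status']
```

Even in this toy example, half the raw records are discarded; at production scale, that filtering and the judgment calls behind it are where the 80% goes.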
- The amount of data you need depends on many factors.
- Do not underestimate the time it may take to secure and clean the data.
Read on to Parts 4 and 5 to learn how to build and deploy your AI model.
Stella Liu was a Product Manager at IBM Watson IoT where she helped build her team’s first AI-based product at scale. She loves to talk about AI, product management and environmental sustainability.
Please reach out to her at LinkedIn for any questions or comments!