The Five Levels of Machine Learning Use Cases

A non-technical guide for how to brainstorm potential ML use cases in your business. Recommended read for experienced, new, or hopeful Product Managers, Data Scientists, and entrepreneurs looking to integrate AI/ML into their business.

I hear the following question several times a week:

“I want to use ML in my business. Where should I use it?”

Companies are changing the way they do business because of machine learning. This evolution comes with a lot of excitement, but also some anxiety about potentially falling behind.

Whether or not you’re supposed to be the one figuring this out, it can be stressful to think about where you and your company fit into this changing landscape.

A bit about me: My name is Allie Miller, and I am a Lead Product Manager at IBM Watson. I have worked in three of the most critical areas of artificial intelligence: conversation, computer vision, and data. And what gets me out of bed in the morning? The scale, impact, and humanization of artificial intelligence.

And basset hounds.

I have now worked with over 150 clients in artificial intelligence, and while each industry and vertical presents its own special flavor of considerations and challenges, there are a few persisting themes that together form the five levels of machine learning use cases.

Each level builds on the last. But before we start climbin’…

Level 0: Talk to users

No matter what you build, no matter how much funding you have, no matter what industry you’re in: you must talk to your end users.

Understanding your users will illuminate paths to success — everything from what data you should analyze to where biases might creep in.

Always start with empathy. Always have communication lines open with your users.

Do not even continue reading until you have committed to this.

OK, ready? Back to the ladder.

Level 1: Identification (sometimes called Classification)

Also known as: “tell me what this thing is.”

Examples:

  • I took a picture. Machine learning tells me it’s a dog.
  • I received an email. Machine learning tells me it’s about dog food.
  • I am on a website. Machine learning tells me it has a shopping plug-in.
  • I am looking at a bank statement. Machine learning tells me there is a grocery transaction.

Identification is the foundation of machine learning use cases. It tends to be most helpful in workflow routing, auto-tagging, and trend analysis use cases: knowing what something is is the first step in deciding what to do with it.

Identification is also generally not a binary result. Machine learning will not just tell you whether or not a photo contains a dog; it will tell you how likely it is that the photo contains a dog. This is referred to as a “confidence score”.

Different use cases call for different confidence level thresholds.

For example, if you are creating a stock photo site and want to use ML to identify if a sidewalk is present in a photo, you might require a minimum confidence of 80%. The consequence of a false positive (saying a sidewalk is there when it isn’t) or false negative (saying a sidewalk isn’t there when it is) is fairly low. But if you’re building a self-driving car and want to identify if a sidewalk is present, an 80% threshold is far too low.
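If it helps to see the threshold idea in code, here is a minimal Python sketch. The use case names, the 0.999 figure for the self-driving car, and the 0.87 score are made-up assumptions for illustration; only the 80% figure comes from the example above.

```python
# Hypothetical thresholds per use case. Only the 0.80 value echoes the
# stock photo example; the 0.999 value is an illustrative assumption.
THRESHOLDS = {
    "stock_photo_site": 0.80,
    "self_driving_car": 0.999,
}


def accept_label(confidence: float, use_case: str) -> bool:
    """Return True if the confidence score clears the use case's threshold."""
    return confidence >= THRESHOLDS[use_case]


# A made-up confidence score for the label "sidewalk".
score = 0.87
for use_case in THRESHOLDS:
    print(use_case, accept_label(score, use_case))
```

The same score is good enough for one business and not for the other, which is why the threshold should be chosen per use case rather than once for the model.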

Level 2: Categorization

Also known as: “group these similar things.”

Examples:

  • I have 1,000 pictures. Separate the dog and cat photos.
  • I have 10,000 emails. Group them into hierarchies: which ones are about pets, and of those, which ones are about dog food vs cat food.
  • I have 100,000 websites. Categorize them by brand tone.
  • I have 1M bank statements. Label each transaction as Retail, Services, or Other.

Really, this is just combining multiple classifications and making them relate to each other. All of the “classes” (or “tags”), like dog vs cat, are trained in a group, rather than in individual models, so the system can better learn the relationship, similarities, and differences between A (dog) and B (cat).

Level 3: Assessment

Also known as: “tell me whether I should care about this thing.”

Examples:

  • I have 1,000 pictures. Tell me which ones have a sick dog.
  • I have 10,000 emails. Tell me which ones are high priority.
  • I have 100,000 websites. Tell me which ones have illegal terms and conditions for their respective countries.
  • I have 1B banking transactions. Tell me which ones are fraudulent.

Now we start to add contextual clues: clues specific to a particular company, time, or geography. We’re not just labeling one aspect of one feature; we’re generating a sense of urgency, ranking, and prioritization.

Level 4: Recommendation

Also known as: “tell me what to do about this thing.”

Examples:

  • I think I have a picture of a sick dog. Now what?
  • I think I have an important email. Now what?
  • I think I have an illegal website. Now what?
  • I think I have a fraudulent bank transaction. Now what?

At Level 4, we begin to incorporate AI/ML outputs into business workflows. The system has returned some sort of output (“this bank transaction is 91.7% likely to be fraudulent”), but deciding what to do with that output is where it gets really valuable (“cover up to $10,000 in fraudulent charges”).

There are multiple ways to handle the triage: if it’s an urgent customer email, you can automatically reply with a direct customer support line. If it’s a retail store emergency, you can send 911 alerts to all nearby security guards. If it’s a fraudulent bank transaction, you can send it to a banking expert to manually review it.

Deciding what to do with the output is up to you and your company, but automating the “assess and recommend” portion of your workflow will allow you to triage not only more quickly but also with greater accuracy.
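As a rough illustration of wiring an assessment into a workflow, here is a small Python sketch. The `Assessment` class, the `recommend` function, the 0.9 threshold, and the fallback action are all hypothetical names and values I am assuming for the example; the three actions mirror the triage options described above.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """Hypothetical output of a Level 3 model."""
    kind: str          # e.g. "customer_email", "store_emergency", "bank_transaction"
    confidence: float  # how confident the model is that this item needs attention


def recommend(assessment: Assessment, threshold: float = 0.9) -> str:
    """Map an assessment to a next step; the 0.9 threshold is an assumption."""
    if assessment.confidence < threshold:
        return "no action"
    actions = {
        "customer_email": "auto-reply with a direct customer support line",
        "store_emergency": "alert nearby security guards",
        "bank_transaction": "send to a banking expert for manual review",
    }
    # Anything the workflow does not recognize goes to a person.
    return actions.get(assessment.kind, "escalate to a human")


print(recommend(Assessment("bank_transaction", 0.917)))
```

Keeping the mapping from output to action in one place makes it easy for the business, not the model, to decide what happens next.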

Level 5: Prediction

Also known as: “will this thing happen?”

Examples:

  • Will my vet clinic see more sick dogs next month?
  • Which customers are most likely to purchase a couch in the next 60 days?
  • Can I predict a fraudulent bank transaction? How should I best notify the user/freeze the card?
  • Is a power outage likely to happen in San Francisco this month? How can we mitigate that?

Prediction is the golden ticket of artificial intelligence. The holy grail. And I sometimes refer to this as the “last column” problem.

Picture a spreadsheet, with structured and unstructured data sources functioning as each column.

If you’re a retail company, you may have a customer’s name, age, gender, email, address, emails to the company, in-store shopping behavior, previous items viewed, and previous items purchased. If you’re in agriculture, maybe you have a farm’s location, farm size, local weather patterns, predicted weather, pesticide levels, competitor farms’ performance, watering schedules, and satellite images of the farms.

In each case, you want to analyze all of the factors (e.g., age, farm size, satellite images, customer emails) to best predict the last column of your spreadsheet.

Something like “knowing everything you know about this customer, will they buy during our holiday sale?” or “knowing everything you know about this farm, will it produce enough corn this year?”
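To make the “last column” picture concrete, here is a tiny Python sketch. The column names and rows are made up, and the “model” is only a naive baseline (always guess the most common past outcome), not a real machine learning model. It simply shows how a spreadsheet splits into the columns you know and the one column you want to predict.

```python
from collections import Counter

# Made-up spreadsheet: every column except the last is something we know;
# the last column is the outcome we would like to predict.
header = ["age", "previous_purchases", "opened_sale_email", "bought_in_holiday_sale"]
rows = [
    [34, 5, True, True],
    [22, 0, False, False],
    [45, 2, True, True],
    [51, 1, False, False],
    [29, 3, True, False],
]

# Split each row into the known factors and the last column.
features = [row[:-1] for row in rows]
last_column = [row[-1] for row in rows]


def majority_baseline(labels):
    """Naive baseline: always predict the most common past outcome."""
    return Counter(labels).most_common(1)[0][0]


print(header[-1], "baseline guess:", majority_baseline(last_column))
```

A real prediction model would learn from the feature columns rather than ignore them, but a baseline like this is a useful floor: if a model cannot beat it, the other columns are not yet telling you much.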

If you can predict costly or destructive events before they happen, you can have a huge impact on your company and users. Applying machine learning to your business can cut costs, allow you to redirect your company’s resources toward areas of greatest impact, and most importantly, improve the livelihood of others.

Ascending the Ladder: Final Thoughts

These five levels have served me and my clients well across a variety of projects and applications, from labeling dog photos to predicting train track malfunctions. These are core building blocks to ML use cases, and I hope you find them valuable.

Regardless of your familiarity with machine learning or the size and complexity of the solution you and your business go after, remember: always start small and iterate.

Comment below, and let me know what ML use cases you’re working on!