Supervised machine learning for consultants: Part 1

Joe Feldman
Cervello, a Kearney Company
7 min read · Nov 10, 2020

Understanding machine learning: it’s not magic

Your client wants the power of foresight, and you, the consultant, promise clairvoyance. No, this is not the premise of an Isabel Allende novel, but rather the expectation for consulting casework.

By foresight, I mean your client wants to be able to predict some aspect or quantity that will affect her bottom line. As the consultant, your crystal ball comes in the form of a statistical model — statistical because you’re probably relying on historical data to inform your model, which hopefully can accurately forecast demand or market share or the riskiness of a supplier in a complex supply chain.

You look to the data scientist as a soothsayer who mutters a strange spell with the words “supervised machine learning” and “algorithm” and — poof! — like magic, you have a powerful tool that makes your client very happy.

In today’s world, consultants need to have many skills and understand a wide range of topics, with data science being one of them. Because data science is often used in casework, you may be asked to act as the intermediary between data scientists and your client, or you might have to do the analyses yourself. Either way, machine learning has achieved almost mythical status as the hottest technique for building powerful models that produce the best results.

I’m here to tell you that what goes into much of machine learning is not wizardry reserved for the data scientist who is fluent in Python, R, and Java and dabbles in C#. In fact, much of it is motivated by intuitive concepts that don’t require a PhD to understand.

The goal of these blog posts is to unlock machine learning for the modern consultant. This first installment is a Rosetta Stone of sorts, translating and explaining data science jargon into plain English. By the end of this series, you will shed the inaccessibility of machine learning and be able to offer intelligent, advanced, and data-driven consultations.

Learning to speak data science

We data scientists love to use complex, mathematical words. If you’ve ever wondered, “What in the world does that mean?”, this section will help.

Are AI, predictive modeling, and machine learning the same thing?

Sometimes, you’ll hear AI, predictive modeling, and machine learning used interchangeably. Let’s sort this confusion out.

First, machine learning is a subcategory of artificial intelligence, which is the broader practice of teaching computers to mimic human behaviors. Machine learning is the type of artificial intelligence where computers analyze historical data to learn a human task, like predicting demand or market size or a supplier’s riskiness.

In predictive modeling, we are creating a system that can provide projections for some unknown quantity, such as demand, given circumstances that affect this quantity, such as the state of the market — and hopefully predicting it with some accuracy. If we have collected data, machine learning is often the best way we can create this system.

Let’s train a model

This simply refers to the process by which we use data to build a model.

With that in mind, it is important to distinguish between the two primary types of machine learning problems that I’ll talk about in this blog.

We use unsupervised machine learning when we want to find groupings or patterns in the data. The goal is usually exploratory in nature, but it is also possible to use unsupervised learning to predict. By exploratory, I mean that unsupervised learning is used to discover interesting structures in data when they are not immediately apparent. For example, if our data set consists of the historical purchases of two products for many individuals along with their demographic information, we might want to discover whether there are distinct patterns linking purchase preferences to demographic information.

Our unsupervised learning created two clusters. Now we can ask: what are the common demographic characteristics of these two clusters, based on the data? One cluster represents older, wealthier individuals, and these customers clearly prefer product one. Younger, less affluent consumers tend to buy product two.

As you can see, these groups are graphically distinctive, which tells us that age and income are indicators of purchase preference. If we were to build a predictive model for the purchase preferences of individuals in this population of consumers, our unsupervised machine learning has convinced us that factoring age and income into such a model is probably a good idea.

In addition, we can predict which cluster a new, unseen individual will fall into based on their demographic characteristics. Though individuals in each cluster do not exclusively prefer one product over another in our example, assigning a cluster to an unrecorded individual can give a pretty good projection of their purchase preference.
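To make the clustering idea concrete, here is a minimal sketch of one common clustering method, k-means, in plain Python. The customer figures are invented for illustration, the function names are my own, and the choice of k-means is an assumption rather than a claim about how any particular analysis was produced.

```python
import math

# Hypothetical customers as (age in years, income in $ thousands); invented for illustration.
customers = [
    (24, 35), (27, 40), (22, 30), (30, 45), (26, 38),
    (58, 120), (62, 135), (55, 110), (65, 140), (60, 125),
]

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def k_means(points, centers, iterations=10):
    """Alternate between assigning points to centers and re-centering."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the average of its group (keep it if the group is empty).
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers

# Start from two arbitrary customers as the initial cluster centers.
centers = k_means(customers, centers=[customers[0], customers[-1]])

# Assign a new, unseen individual to the closest cluster.
new_customer = (57, 115)
cluster = nearest(new_customer, centers)
print(centers, cluster)
```

Notice that nobody told the algorithm which customers were "older and wealthier"; it only saw ages and incomes. In practice you would usually rescale the inputs first so that income, with its larger numbers, does not dominate the distance calculation.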

When we specifically seek to predict the value of an outcome based on inputs, we use supervised machine learning.

Prediction and supervised machine learning go hand in hand. When we would like to build that model to predict demand or sales or supplier risk, perhaps the best method is to use supervised machine learning to build it.
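As a rough illustration of what "supervised" means, the sketch below uses k-nearest neighbors, one of the simplest supervised methods, to label a supplier's risk. The key difference from the clustering case is that every historical record comes with a known outcome. The supplier figures, the two input measures, and the function name are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical history: (late-delivery %, defect %) per supplier, with a known risk label.
history = [
    ((2, 1), "low"), ((4, 2), "low"), ((3, 1), "low"), ((5, 3), "low"),
    ((20, 8), "high"), ((25, 10), "high"), ((18, 7), "high"), ((30, 12), "high"),
]

def predict_risk(supplier, labeled, k=3):
    """Label a supplier by majority vote among its k most similar past suppliers."""
    closest = sorted(labeled, key=lambda row: math.dist(supplier, row[0]))[:k]
    votes = Counter(label for _, label in closest)
    return votes.most_common(1)[0][0]

# Predict the outcome for a supplier that is not in the history.
print(predict_risk((22, 9), history))
```

The inputs are the supplier's measurable characteristics, the outcome is the risk label, and the historical labels are what "supervise" the prediction.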

This brings me to the final jargon translations and explanations:

Algorithm to model: the spell that creates the magic

Think of an algorithm as a step-by-step set of directions, the specific way in which we extract information from the data. But algorithms are not restricted to high-performing computers. In fact, as a consultant, you build algorithms all the time. For example, the way that you build your model to size a market relies on some explicit combination of data and business logic. The sequence of steps behind your market sizing model (“discount this, multiply by this, and factor in that”) is an algorithm.
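Written out as code, a market sizing recipe of this kind might look like the sketch below. The function name, its four inputs, and the numbers passed in are all hypothetical; the point is only that an explicit, repeatable sequence of steps is already an algorithm.

```python
def size_market(population, share_in_segment, adoption_rate, annual_spend):
    """Top-down market size: narrow the population, then multiply by spend."""
    addressable_customers = population * share_in_segment    # "discount this"
    expected_buyers = addressable_customers * adoption_rate  # "multiply by this"
    return expected_buyers * annual_spend                    # "factor in that"

# Illustrative inputs only: 1M people, half in segment, a quarter adopt, $200 each per year.
print(size_market(1_000_000, 0.5, 0.25, 200))  # prints 25000000.0
```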

A model, therefore, is a guess for how a feature of the world might behave. Obviously, we don’t want any old guess; we want an educated guess. By looking at historical data surrounding the feature of the world we would like to model, we are minimizing the uncertainty in our guess. The algorithm finds patterns in historical data that educate our guess. We use an algorithm to learn a model.

So what can machine learning do for me?

Now that you understand the jargon, you’re probably thinking, “What makes machine learning so special, and how can it help me make an impact with my clients?”

Here’s a typical scenario: a client comes to you wanting a model that can accurately forecast demand for one of their products in a particular region. You immediately set out to begin your regular case research, interviewing industry specialists and collecting data on what drives demand in this particular region — the state of the market, the season, and competitor sales, to name a few. Then, you begin to put together a model to predict demand: multiply historic sales by this number to account for growth, factor in the holiday season and competitor status, yadda, yadda, yadda.

Then, being newly versed in the terminology, you think: Can I use some sort of supervised machine learning to build this model? Slow your roll, partner. A slight check on your ambition raises a second question: would this even be useful?

We’ll start with the second question. To build your first model, though based on data and business logic, you are summarizing the information from hundreds, thousands, or even millions of data points. These summaries may be simple averages, maximums or minimums, or they may be more complex, and you use these statistics to formulate your model for prediction.

When it comes down to it, this approach is probably pretty good. In fact, the laws of probability theory dictate that when you have this much data, the sort of summaries previously mentioned are usually accurate indicators of what’s actually going on with demand and the drivers you identified. But by taking these summaries, you are ignoring nuanced information in the data.
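The probability result usually invoked for this kind of claim is the law of large numbers: the average of a sample tends toward the true average as the sample grows. The small simulation below is a sketch of that idea, assuming daily demand is drawn from a bell curve with a true mean of 100 and a spread of 20. Both numbers, and the function name, are made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

TRUE_MEAN = 100  # hypothetical "real" average daily demand
SPREAD = 20      # hypothetical day-to-day variation (standard deviation)

def average_of_sample(n):
    """Average of n simulated daily demand figures."""
    return statistics.mean(random.gauss(TRUE_MEAN, SPREAD) for _ in range(n))

# The more data points we summarize, the closer the average sits to the true mean.
for n in (10, 1_000, 100_000):
    print(n, round(average_of_sample(n), 2))
```

With only ten data points the average can land several units away from 100, while with a hundred thousand it typically lands within a small fraction of a unit.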

Supervised machine learning algorithms build models that have the ability to recognize complex patterns across boatloads of data, something that humans simply don’t have the capacity to do. At the end of the day, using your initial method versus machine learning is like using a rowboat versus a speedboat to get across the lake.
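Here is a deliberately tiny sketch of the difference between a summary and a learned model. It compares "always forecast the historical average" with a straight line fitted by ordinary least squares, which is about the simplest model that uses a demand driver at all. The six data points and the "market index" driver are invented, and real machine learning models can capture far richer patterns than a line.

```python
import math

# Hypothetical history: (market index, units sold); invented for illustration.
history = [(1, 120), (2, 135), (3, 160), (4, 170), (5, 195), (6, 205)]
xs = [x for x, _ in history]
ys = [y for _, y in history]

def rmse(predict):
    """Root-mean-square error of a forecasting rule over the history."""
    return math.sqrt(sum((predict(x) - y) ** 2 for x, y in history) / len(history))

# Summary approach: always forecast the historical average.
mean_y = sum(ys) / len(ys)

# Learned approach: ordinary least squares line through the same data.
mean_x = sum(xs) / len(xs)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in history)
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

rmse_average = rmse(lambda x: mean_y)
rmse_fitted = rmse(lambda x: intercept + slope * x)
print(round(rmse_average, 1), round(rmse_fitted, 1))
```

The fitted line has the smaller error here because it uses information the average throws away. Note that both rules are scored on the same data used to build them, which flatters the fitted model; a fair comparison would score them on data neither has seen.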

The short answer to the first question is yes, you can use supervised machine learning to build the model. But I’m going to take a break here since your managing director has emailed you three times in this seven-minute read.

In the next installment, “Clean data: the foundation of machine learning,” I’ll answer that question in more detail, describing the specific requirements of the data needed to use supervised machine learning to build this model to predict demand. Then, in the post after that, “Algorithms: the engine powering machine learning,” I’ll break down what goes on under the hood of machine learning algorithms and the way we evaluate models built using machine learning, and I’ll share one application of these concepts with a concrete business example. See you next time.

About Cervello, a Kearney company

Cervello is a data and analytics consulting firm and part of Kearney, a leading global management consulting firm. We help our leading clients win by offering unique expertise in data and analytics, and in the challenges associated with connecting data. We focus on performance management, customer and supplier relationships, and data monetization and products, serving functions from sales to finance. Find out more at Cervello.com.


Joe Feldman is a third-year Ph.D. student in the Department of Statistics at Rice University and a data scientist at Cervello, a Kearney company.