From Deterministic to Probabilistic:
A Nontechnical Guide to Building Your Company’s Machine Learning Systems

Published in the integrate.ai blog · 6 min read · Dec 6, 2018

Editor’s note: This post is adapted from a keynote that Kathryn Hume, our VP Strategy, gave at the INSEAD AI Forum in Paris on October 13, 2018.

Kathryn Hume demystifies AI at the INSEAD AI Forum in Paris.

Artificial intelligence (AI) is about applying mathematics to mimic what we traditionally consider to be cognitive capabilities. It’s about evolving from having to write out all of the rules to solve a problem (think “if this, then that”) to making educated guesses about what something might be by looking for trends in the underlying data.

That’s transformative because it allows us, as Jeff Bezos put it in his 2016 Letter to Shareholders, to automate tasks where it’s hard to describe the precise rules. There are a lot of things in the world that are too complex to be reduced to the straitjacket of deterministic rules. But these systems can’t be created in a vacuum. Building a successful machine learning product requires the active engagement of stakeholders from business, risk, data, and technology throughout the entire process.

So what does this mean if you’re in a nontechnical role at your company? What do you need to be thinking about if you’re on a business team, in a risk and compliance role, or working as an analyst or developer?

As we’ve articulated in earlier posts, most companies follow the same general process when building machine learning systems, and it doesn’t differ much from the systems development methodology taught in business school. The only difference is the new AI piece of the puzzle. As such, you don’t need to reinvent the wheel to learn critical business skills for AI. You just need to update which questions you ask and when, and gain some sound intuitions for what different AI algorithms do and don’t do.

Let’s consider a real-life example from Kanetix, an integrate.ai customer.

Phase 1: Scope & Design

The first step in building a machine learning product is figuring out what business problem you’re solving. The point is that you shouldn’t apply AI just anywhere, but only in those places where it’s going to drive the most value for your business.

In our example, the problem Kanetix faced was a high drop-off in conversions. People would come to their website and scroll around, but they wouldn’t actually buy anything. The question they were grappling with was how to spend their marketing dollars to incentivize purchases while still maintaining healthy margins. In other words, they didn’t want to waste money incentivizing the wrong people and wind up raising their customer acquisition costs unnecessarily in the process.

That left us with the challenge of helping Kanetix optimize their sales funnel so that they could target incentives at those people where it would make the greatest impact, thus minimizing waste. The way that we approached it was to analyze the data and divide the company’s prospects into three buckets, namely people who were:

Unlikely to ever convert and therefore weren’t worth spending money on.
Slightly likely to purchase and could potentially be motivated to do so with the right incentive.
Highly likely to make a purchase and therefore didn’t need any incentive.

Once we’d made predictions about who would and wouldn’t buy, we needed to create a rule for when we’d offer an incentive and when we wouldn’t. Finally, we set our goal, which was to get people from our customer’s website into a partner program.
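To make the three buckets and the incentive rule concrete, here is a minimal sketch in Python. The cut-off values, bucket names, and visitor scores are all hypothetical, chosen only for illustration. They are not the thresholds used with Kanetix, and in practice they would be tuned against margin and acquisition-cost data.

```python
# Hypothetical cut-offs on a model's predicted propensity to convert (0 to 1).
LOW_CUTOFF = 0.2   # below this: unlikely to ever convert
HIGH_CUTOFF = 0.7  # at or above this: likely to buy without an incentive


def bucket(propensity: float) -> str:
    """Assign a prospect to one of the three buckets described above."""
    if not 0.0 <= propensity <= 1.0:
        raise ValueError("propensity must be between 0 and 1")
    if propensity < LOW_CUTOFF:
        return "unlikely"
    if propensity < HIGH_CUTOFF:
        return "persuadable"
    return "likely"


def should_offer_incentive(propensity: float) -> bool:
    """The rule: only the middle bucket sees an offer."""
    return bucket(propensity) == "persuadable"


# Made-up scores for three visitors.
scores = {"visitor_a": 0.05, "visitor_b": 0.45, "visitor_c": 0.90}
decisions = {visitor: should_offer_incentive(p) for visitor, p in scores.items()}
print(decisions)
```

The model supplies the probabilities, but the cut-offs are a business decision: moving them trades off wasted incentives against missed conversions.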

Phase 2: Data & Models

Next up, you need data. One of the first questions to think through here is where that data is coming from and which sources might yield low-quality data.

Kanetix has a website where it collects data as people answer questions. Some of those questions are easy to answer, such as the make and model of your car. Others, like highly specific insurance information that most people don’t know and can’t be bothered to look up, are much harder. As a result, the quality of the data from those harder questions might suffer.

During this phase, it’s important to make sure that you’re working with a subject matter expert who truly understands the data set. Companies often make the mistake of assuming that Ph.D.s who know a lot about math can just come in and figure things out. Unfortunately, that’s almost always a recipe for failure. Instead, it’s much better to partner with someone who’s been working with the data set for a long time. They can jump-start what the scientist pays attention to, so that the algorithms are designed to pick up the stuff that really matters.

As part of this work with the data, there are some additional issues you need to think through:

Be wary of tricks in the data that might be proxies for sensitive features.
Figure out if the model needs to be explainable. Will you have to be able to explain why it reached the output it did?
Does it make sense to go with a simpler model so that you can be sure you can explain it?

Thinking through all of these issues early on will help you avoid pitfalls further down the track.
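As a rough illustration of the proxy concern above, the sketch below flags features that move closely with a sensitive attribute. The feature names, data, and the 0.8 threshold are all hypothetical. Simple correlation is only a crude first screen, not a complete fairness check, and it is no substitute for the judgment of someone who knows the data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def flag_proxies(features, sensitive, threshold=0.8):
    """Return names of features whose |correlation| with the sensitive
    attribute meets the (hypothetical) threshold."""
    return sorted(
        name
        for name, values in features.items()
        if abs(pearson(values, sensitive)) >= threshold
    )


# Made-up data: one sensitive attribute and two candidate model features.
sensitive = [0, 0, 1, 1, 0, 1]
features = {
    "neighbourhood_index": [1, 2, 8, 9, 1, 9],  # tracks the sensitive attribute
    "vehicle_age": [3, 7, 4, 6, 5, 5],          # unrelated to it
}
print(flag_proxies(features, sensitive))
```

A flagged feature is a prompt for a conversation with risk and compliance, not an automatic exclusion.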

Phase 3: QA & Production

Now that you’ve built out a model and designed your system, you have to decide what to do with it.

First off, you need to figure out what to do with your outputs. That will require collaboration with a creative team that can build new website features that actually change and personalize how the site behaves based upon your model’s predictions. With Kanetix, we defined a threshold for who would see an offer based on their propensity to convert, and then worked with the Kanetix creative team to design offers and update their website to serve up those offers at the right time.

Then you need to be able to evaluate if it’s working by building out a test and control set. The idea here is to demonstrate the value that AI is adding over and above business as usual. You want to be able to convince the rest of the business that AI has value above the baseline performance.
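One simple way to compare a test group against a control group is to look at the difference in conversion rates and ask whether it is larger than chance alone would explain. The sketch below uses a standard two-proportion z-score. The visitor and conversion counts are invented for illustration and are not Kanetix results.

```python
from math import sqrt


def conversion_rate(conversions: int, visitors: int) -> float:
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def lift_report(test_conv, test_n, ctrl_conv, ctrl_n):
    """Compare the group that saw model-driven offers (test) with
    business as usual (control)."""
    p_test = conversion_rate(test_conv, test_n)
    p_ctrl = conversion_rate(ctrl_conv, ctrl_n)
    # Pooled rate and standard error for a two-proportion z-test.
    pooled = (test_conv + ctrl_conv) / (test_n + ctrl_n)
    std_err = sqrt(pooled * (1 - pooled) * (1 / test_n + 1 / ctrl_n))
    z_score = (p_test - p_ctrl) / std_err if std_err > 0 else 0.0
    return {
        "test_rate": p_test,
        "control_rate": p_ctrl,
        "absolute_lift": p_test - p_ctrl,
        "z_score": z_score,
    }


# Made-up numbers: 1,000 visitors in each group.
report = lift_report(test_conv=120, test_n=1000, ctrl_conv=100, ctrl_n=1000)
for key, value in report.items():
    print(f"{key}: {value:.4f}")
```

As a rule of thumb, a z-score above roughly 2 suggests the lift is unlikely to be noise. In this invented example the two-point lift falls short of that bar, which is exactly the sort of result worth knowing before presenting the numbers to the rest of the business.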

Finally, you have to figure out how smart you want your AI system to be. In other words, how often are you going to update the model? Your choice will likely depend on your company’s infrastructure and needs to be considered before you build your system.

How Will Businesses Evolve in this Brave New World?

There are two phases to AI adoption. In phase one, you insert the technology into a business to make a process better. That’s exactly what we did in the Kanetix example described above. But really, it’s in the second phase that things get exciting. That’s where you reframe your processes as data collection mechanisms for whole new revenue streams. In other words, your AI model starts collecting all of this data that you’re then able to monetize in some other way.

This is what you see at big companies like Google and Facebook, and the implications are huge. As businesses adopt AI, they will initially improve their processes. Eventually, however, they may become entirely different businesses as they garner new data assets.

As you think about that, just remember that the best way to learn is by doing. This is new territory for everyone, but the future is ours to build.

Wait, Want Some More?

If you enjoyed this post, you might also like watching this fireside chat that was part of the INSEAD AI Forum. It’s on privacy, ethics, and how AI will impact our work and lives.

Kathryn Hume, VP Strategy at integrate.ai, and Subramanian Rangan, INSEAD Professor of Strategy and Management, dive into the impacts of AI in society and the skill set required to address the challenges ahead.

You can access the video here.
