STATISTICS 101

Prediction vs. Inference: Complementary Approaches to ML

How prediction and inference affect the way you solve an ML problem

Andrea Gaudino
May 18, 2020


When we talk about statistical learning, we talk about a methodology for estimating relations in data. Sound familiar? Yes, you’re right … Machine Learning.

Introduction

First of all, a formal definition.

Given a problem with m observations, where Y is the target, X is the set of n features and f is the relation between X and Y, we have:

Y = f(X) + ϵ

Statistical learning is about estimating f. The term ϵ is an irreducible error that does not depend on X: it remains no matter how good the estimate of f is.
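
To make the idea of an irreducible error concrete, here is a minimal sketch in Python with NumPy. It assumes the usual additive model Y = f(X) + ϵ. The linear f, the noise level sigma and the sample size are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def f(x):
    # Hypothetical "true" relation, chosen only for illustration
    return 2.0 * x + 1.0

m = 10_000      # number of observations
sigma = 0.5     # standard deviation of the noise term (an assumption)

X = rng.uniform(0.0, 1.0, size=m)
eps = rng.normal(0.0, sigma, size=m)  # irreducible error, drawn independently of X
Y = f(X) + eps

# Even if we predict with the true f, the mean squared error does not vanish:
# it stays close to the variance of eps, which is sigma**2.
mse_with_true_f = np.mean((Y - f(X)) ** 2)
print(mse_with_true_f, sigma ** 2)
```

The two printed numbers should be close to each other. No estimate of f, however good, can be expected to push the error below that floor.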

There are two kinds of problems here:

  • prediction: when you want to estimate a target value
  • inference: when you want to understand the relationship between features and targets

As you can see, these are two different goals, and they call for two different approaches to the problem.

Prediction

A prediction problem is about estimating f in order to compute a good approximation of Y for observations never seen before, given their features X. Calling our estimate of f h and the predicted target Ŷ, we have:

Ŷ = h(X)

The exact form of f is not important here. We focus on the difference between the real Y and the predicted one, and a good model minimizes this difference. Prediction is a matter of accuracy.

Inference

When we talk about inference we are trying to understand what’s going on with our data. We may ask:

  • Which features X_j contribute to Y?
  • Which X_j is the most important for a given Y?
  • Is the function h a good estimate of f?

In an inference problem, we focus on simplifying the model, on feature selection, and on the statistical correlation between the features. Inference is a matter of model interpretability.
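
One common way to ask these questions is to fit a simple linear model and look at its coefficients. The sketch below does this with ordinary least squares in NumPy. The data, the true coefficients and the feature names are hypothetical, and the features are drawn on the same scale so that the coefficients can be compared directly.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 500, 3
X = rng.normal(size=(m, n))

# Hypothetical ground truth: X_1 matters a lot, X_2 a little, X_3 not at all
true_beta = np.array([3.0, 0.5, 0.0])
Y = 1.0 + X @ true_beta + rng.normal(0.0, 1.0, size=m)

A = np.column_stack([np.ones(m), X])  # add an intercept column
beta_hat, *_ = np.linalg.lstsq(A, Y, rcond=None)

# Standard errors and t-statistics of the ordinary least squares estimates
residuals = Y - A @ beta_hat
dof = m - A.shape[1]
sigma2_hat = residuals @ residuals / dof
std_err = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(A.T @ A)))
t_stats = beta_hat / std_err

for name, b, t in zip(["intercept", "X_1", "X_2", "X_3"], beta_hat, t_stats):
    print(f"{name}: coef={b:.3f}, t={t:.2f}")
```

A large t-statistic suggests that the feature really contributes to Y, while a value near zero suggests that it could be dropped to simplify the model. Here the model is kept deliberately simple so that each coefficient can be read and interpreted.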

So, which one?

It depends, really … It’s all about our final goal. In most cases, you have to go with both of them to obtain good accuracy and better interpretability of your model.

This was a very introductory post. Most of the reasoning comes from a very good book that is introductory but rigorous: “An Introduction to Statistical Learning”. A free copy of the book is available online.

Thanks for reading and stay tuned … more to come!


Andrea Gaudino

Lifelong learner, machine learning enthusiast, I like to share what I know and learn something new every day.