Feb 24, 2018 · 3 min read

I just started reading the book “The Signal and the Noise” by Nate Silver and am getting inspired, though I am still on the first pages and the following has not appeared there so far. So let me put down some thoughts about prediction and sort them into three categories: interpolation, regression and classification.

How could we put prediction into a formula?

(q1) f(x) is known on an interval [0,t] and we are interested in the value f(t+d) with d>0

Most of the time we look at a discrete time interval, so

(q2) f(x) is known for {0,…,n} and we are interested in the value f(n+d) with d a positive natural number

Depending on the question, it is a one-time effort, like predicting the outcome of an election, or a continuous process, like predicting stock prices for the next day.

This is prediction over time. Another question might be how a function behaves for some unobserved input, so

(q3) f(x_i) is known for i in {1, …, n} and we want to know f(y) with y not in {x_i, i=1, …, n}

There is a little difficulty in both cases: we do not know what information the outcome depends on. In the third question we should ask in what space the x_i live, that is, what the definition space (domain) of f is. The same applies to q1 and q2, since we should split x into the tuple (y,t), with t the time and y from some unknown space.

We should also think about the value space of the function. Is it finite, an interval, real, multi-dimensional?

Now the art of prediction is actually very old. A lot of really smart people have thought about it and come up with several ideas, depending on the definition and value space. Still, prediction algorithms have received a lot of attention in recent years with the massive increase in computational power.

Basically, we organize the prediction methods by

• definition space
• value space
• set of functions

Interpolation

The first simplification is to pretend the definition space of f is known (and has finite dimension). Then we speak of an interpolation problem, that is, fitting a continuous function that passes through the given observations. Usually we have a class of functions we try to fit, e.g. polynomials up to degree n. Another approach is to fit the observations only locally; the most popular method is spline interpolation. Depending on the class of functions, the prediction can be very different.
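To make that last point concrete, here is a minimal sketch in plain Python. It fits the same four observations once with a single polynomial (Lagrange form) and once locally with straight line segments, which is the simplest, degree-one relative of a spline and stands in here for the smoother cubic splines usually meant by the term. The data values and the function names `lagrange_interpolate` and `piecewise_linear` are made up for illustration.

```python
from bisect import bisect_right


def lagrange_interpolate(xs, ys, x):
    """Evaluate the unique polynomial of degree < len(xs) through all points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial: equals 1 at xi and 0 at every other node.
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


def piecewise_linear(xs, ys, x):
    """Connect neighbouring points with straight lines (xs must be sorted).

    Outside the observed range the nearest segment is simply extended.
    """
    # Index of the segment's left node, clamped to a valid segment.
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
    return ys[i] + slope * (x - xs[i])


# Hypothetical observations f(0), ..., f(3); the values are invented.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 0.0, 1.0]

# Both functions reproduce the observations exactly ...
inside_poly = lagrange_interpolate(xs, ys, 2.0)
inside_line = piecewise_linear(xs, ys, 2.0)

# ... but they disagree about the unobserved value f(4).
predict_poly = lagrange_interpolate(xs, ys, 4.0)
predict_line = piecewise_linear(xs, ys, 4.0)
print(predict_poly, predict_line)
```

Both fits agree on every observed point, yet the cubic polynomial predicts roughly 8 for f(4) while the straight-line extension predicts 2. Nothing in the data says which one is right; the choice of function class decides.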

Regression

A more realistic approach is to accept that we do not know everything and to introduce an error term epsilon that follows some random distribution to account for the unknown. For ease of use (though it is not very realistic) the error is assumed to be normally distributed, though there are methods that allow other distributions as well.

y = f(x)+eps

The most popular method, or better class of methods, is “linear regression”: roughly speaking, putting a line (hence linear) into the plot of x against f(x) such that the error is minimized, typically the sum of squared distances between the line and the observations.
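As a rough sketch of what that means in practice, the following plain-Python snippet fits a line y = a + b*x by ordinary least squares, using the closed-form solution for a single input variable. The "true" line y = 1 + 2x, the noise level and the helper name `fit_line` are assumptions chosen only for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); this choice minimizes
    # the sum of squared errors between the line and the observations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Made-up data: y = f(x) + eps with f(x) = 1 + 2x and eps ~ Normal(0, 1).
rng = random.Random(0)
xs = [float(i) for i in range(50)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

a, b = fit_line(xs, ys)
prediction = a + b * 60.0  # predict at an x outside the observed range
print(a, b, prediction)
```

The fitted slope lands close to 2 even though no single observation lies exactly on the line. That is the difference to interpolation: the line is not forced through the points, the noise is averaged out instead.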

Classification

If the value space is finite, we consider it a classification problem. Algorithms tackling this kind of problem get a lot of attention, as we are now in the field of machine learning. Think of image or speech recognition. As said, there are a ton of different algorithms: binomial regression, decision trees, support vector machines, clustering, neural networks and information filtering systems, just to name the most important ones.
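For a feeling of what the simplest such algorithm looks like, here is a sketch of a decision tree with a single split (often called a decision stump) on one numeric feature. It is a toy, not a production method: the data, the two classes {0, 1} and the names `fit_stump` and `predict` are all invented for illustration, and the code assumes at least two distinct feature values.

```python
def fit_stump(xs, labels):
    """Find the threshold on a single feature with the fewest training errors.

    The stump predicts class 1 when x > threshold and class 0 otherwise.
    Returns (threshold, number_of_errors).
    """
    points = sorted(set(xs))
    # Candidate thresholds: midpoints between neighbouring distinct values.
    candidates = [(lo + hi) / 2 for lo, hi in zip(points, points[1:])]
    best = None
    for t in candidates:
        errors = sum((x > t) != bool(label) for x, label in zip(xs, labels))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


def predict(threshold, x):
    """Classify one observation with a fitted stump."""
    return 1 if x > threshold else 0


# Made-up data: one numeric feature, labels from the finite value space {0, 1}.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]

threshold, errors = fit_stump(xs, labels)
print(threshold, errors, predict(threshold, 5.0))
```

On this data the stump puts the split halfway between 3 and 6 and makes no training errors, so an unseen value of 5 is assigned to class 1. Real decision trees repeat this kind of split recursively and over many features.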
