Series
Explaining Machine Learning to my Mother
Part 3: So, you predict something — How do you do that?
This article is part of a series.
If you are an absolute beginner to the world of machine learning, I’d recommend reading Part 2 of the series before you begin this section.
Before you start, keep in mind that I’m writing this for people who are absolute outsiders to the computing world.
So some sections may seem outrageously basic. Feel free to skip them :)
Seeing Patterns — Review
When we teach a computer to learn, we are, in essence, teaching it to see a pattern.
In the first stock price prediction example we discussed in Part 2, we arrived at an answer for the Opening Price of MNO, which is 200 INR.
We did this by arriving at the following pattern: Opening Price = Daily Turnover × 10
When I asked how you saw that pattern, I’m guessing the best answer you could come up with was:
“I’m not really sure, I looked at the numbers and my brain saw a pattern. I noticed that the Opening Price was simply Daily Turnover multiplied by 10”
I’m going to ask that question again — Do you know how you saw the pattern?
I’ll cut to the chase, I don’t know either!
We don’t really know how the brain arrives at the pattern.
Neuroscientists are still trying to understand how our brain sees this pattern.
Nevertheless, based on observations of how we think, we have tried to teach computers to mimic the patterns that our brains see, using — MATHEMATICS!!!
That is sort of the ideal goal of machine learning — being able to create systems that can truly mimic human thinking.
Fair question — If we don’t know how the brain works, how did we figure out the maths that helps us learn patterns?
Honest answer — We made some guesses, and for the most part, it works fairly well!
This will all start making sense soon, so just stay with me, okay?
So what is this magic mathematics that helps us figure out patterns?
Let’s take our earlier example dataset from Part 2
The question is — Based on the data you see, can you determine the opening price of HIJ?
There is a pattern for this, and the pattern is the following formula: Opening Price = 10 × Daily Turnover + 0.05 × Total Trade Quantity
Now, we discussed how this pattern or formula is not easily perceptible to the human brain.
To help a computer figure out this pattern, we pose the question slightly differently.
Before we jump into that, let’s get familiar with a few ideas:
Input Data
We have a set of input data — Daily Turnover and Total Trade Quantity in the example.
Input data is the part of the dataset that we don’t have to predict. These are simply facts, no guesswork.
In this example, we are going to assume that we have no need to predict Daily Turnover or Total Trade Quantity. These are given facts.
Output
This is the part that we must predict, or guess.
Like in this example — Opening Price
Relationship between input and output
This is an important philosophy in Machine Learning — We assume that the input data has some relationship with the output we are trying to predict
For example, if I ask you — “Do you think it will rain in the next hour?”
You’ll look up at the sky, and you’ll probably give one of two answers (or some variation of them)
“The sky is dark with dense clouds, I think it will most likely rain in the next hour”
Or
“The sky is bright and clear, my guess is, it won’t rain in the next hour”
In this case, your input data is — The state of the sky (cloudy or clear)
The output is your answer to the question “Will it rain in the next hour?” (yes/no)
You are assuming that there is a relationship between the input and the output, i.e., between the state of the sky and the chance of rain
In other words, you are assuming that the input influences the output. Like the state of the sky influences the chance of it raining.
This assumption comes from examples that your brain has seen over the years.
This assumption is the first step to making a prediction, for both computers and humans.
Because if we can’t assume a relationship between input and output, we can’t use the input to predict the output.
Okay, so how does a computer assume?
We help the computer assume by setting up a mathematical equation between the inputs and the outputs. We assume that the output is a function of the inputs
The idea of a function
Let’s revise our basic algebra a little shall we?
Question: Can you tell me how the following inputs are related to the output?
I’m sure you guessed that the outputs are squares of the input
Or, output = input²
Or, output = square(input)
What we have written here mathematically is that
The output is a function of input — The function in this case is square()
Functions in maths are a way of representing relationships between numbers.
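If it helps to see this as code, here is a minimal sketch of the square() idea in Python. The input numbers are made up for illustration; they are not the ones from the table in this article.

```python
def square(number):
    """Return the number multiplied by itself."""
    return number * number

# Made-up inputs, purely for illustration.
inputs = [1, 2, 3, 4]
outputs = [square(n) for n in inputs]
print(outputs)  # [1, 4, 9, 16]
```

The function is the "relationship": hand it an input, and it hands back the matching output.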
Question: Can you figure out the relationship between the following two inputs and the output?
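That’s an easy one too, isn’t it? The output is the sum of the two inputs: output = input 1 + input 2
Or, another way of writing it as a function that relates the inputs and the output is: output = sum(input 1, input 2)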
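A function can take more than one input. Here is a small Python sketch of a function that relates two inputs to one output by adding them. I have called it add rather than sum only because Python already has a built-in called sum that works a little differently; the numbers are made up.

```python
def add(first_input, second_input):
    """Relate two inputs to one output by adding them together."""
    return first_input + second_input

# Made-up pairs of inputs, purely for illustration.
pairs = [(1, 2), (5, 5), (10, 3)]
outputs = [add(a, b) for a, b in pairs]
print(outputs)  # [3, 10, 13]
```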
Generic function
Now that we understand the idea of functions in maths, I’ll add one more idea to it.
In the last few examples, we knew what the relationship between the inputs and the output was.
So we named our functions accordingly — square(…), sum(…) etc
In case we do not exactly know what the relationship between inputs and the output is, we use a generic name for a function in maths — f(…)
That’s it — f for function
When we don’t know what the relationship exactly is, but we know for sure that there is a relationship, we write the relationship as: output = f(inputs)
Basically saying, I know there is a relationship, I just don’t quite yet know what to call it, so I’m just going to call it — f
Back to Machine Learning
In the dataset we were looking at, we had two inputs — Daily Turnover and Total Trade Quantity
And one output — Opening Price
We begin by assuming that there is some relationship between the inputs and the output.
And mathematically, we write this as: Opening Price = f(Daily Turnover, Total Trade Quantity)
Basically saying, there is some relationship between Opening Price and the inputs — Daily Turnover and Total Trade Quantity
We don’t yet know what that relationship is, but that’s what we are going to figure out.
This relationship, in other words, is the pattern we are trying to understand.
Now, we don’t really know anything about the relationship. I mean absolutely nothing
Is it —
Or is it —
Or could it be —
The relationship could be any of the above, or even none of the above
So what do we do now?
This is where smart mathematicians figured out a technique.
They said: agreed, we don’t exactly know the relationship, but since we know there is one, let’s just write it as
Opening Price = x × Daily Turnover + y × Total Trade Quantity
If we can figure out values for x and y, we’ll have an answer to the question: what is the relationship between the inputs (Daily Turnover and Total Trade Quantity) and the output (Opening Price)?
Good question —
“How do we know that the relationship is an addition operation?
You just told me, it could be just about anything!
Why aren’t we multiplying Daily Turnover and Total Trade Quantity?”
Right, it need not be an addition operation, it could be just about anything.
There is a step in Machine Learning that involves some mathematics that can help you figure out what kind of operation to put in between the inputs.
It’s pretty interesting, but it’s a little advanced for where we are, so we’ll hold that discussion for later.
For now, let’s assume we’ve done those steps and figured out that the relationship involves adding the two inputs. So:
Opening Price = x × Daily Turnover + y × Total Trade Quantity
Now all we have to do is figure out x and y.
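To make the idea of "a formula with two unknowns" concrete, here is a small Python sketch. It assumes the relationship has the shape x × Daily Turnover + y × Total Trade Quantity. The function name and the sample row are my own inventions for illustration, not part of the dataset.

```python
def predict_opening_price(daily_turnover, total_trade_quantity, x, y):
    """Guess the Opening Price for one row, given guesses for x and y."""
    return x * daily_turnover + y * total_trade_quantity

# One made-up row (NOT from the article's dataset).
turnover, quantity = 20, 3000

# The same row gives very different answers for different x and y.
print(predict_opening_price(turnover, quantity, x=2, y=4))  # 12040
print(predict_opening_price(turnover, quantity, x=1, y=1))  # 3020
```

Notice that the inputs stay the same. Only x and y change, and those are exactly the two numbers the computer has to learn.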
Makes sense, now what do we do?
The overarching idea behind machine learning is this
- The computer makes an absolutely random guess for x and y. It simply begins somewhere. Using this guess, it calculates the output (in this case, the Opening Price).
- The guess will obviously be way off, so it goes back and checks against the outputs we already know the answer for. It calculates how wrong it was; this is called an error calculation.
- It looks at how big or small the error was, and goes back and adjusts its guess for x and y.
- It keeps doing this until it gets as close as possible to the original answer. It learns from its mistakes.
That’s it! That’s all machine learning algorithms do
Sort of, just like us, don’t you think?
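For the curious, the guess, check, adjust, repeat loop can be sketched in a few lines of Python. Please treat this as a sketch under assumptions, not as the method this series will teach: the rows are made up (I built the prices using x = 10 and y = 0.05 so we know the right answer in advance), the starting guess is fixed instead of random so the run is repeatable, the error is the average of the squared differences, and the adjusting step is plain gradient descent, which is one common choice that I have swapped in myself.

```python
# Made-up rows (NOT the article's dataset):
# (daily_turnover, total_trade_quantity, actual_opening_price)
rows = [(1, 40, 12.0), (2, 10, 20.5), (3, 30, 31.5), (4, 20, 41.0)]

def average_error(x, y):
    """Average of the squared differences between actual and guessed prices."""
    return sum((price - (x * t + y * q)) ** 2 for t, q, price in rows) / len(rows)

def learn(x, y, step_size=0.0005, rounds=20000):
    """Repeat: guess, measure the error, adjust (gradient descent, an assumption)."""
    for _ in range(rounds):
        # Slopes: how the average error changes if x or y is nudged up a little.
        slope_x = sum(-2 * t * (price - (x * t + y * q)) for t, q, price in rows) / len(rows)
        slope_y = sum(-2 * q * (price - (x * t + y * q)) for t, q, price in rows) / len(rows)
        # Move each guess a small step in the direction that lowers the error.
        x -= step_size * slope_x
        y -= step_size * slope_y
    return x, y

start_x, start_y = 2.0, 4.0  # stands in for the computer's first random guess
learned_x, learned_y = learn(start_x, start_y)

print(average_error(start_x, start_y))          # large: the first guess is way off
print(round(learned_x, 3), round(learned_y, 3))  # should land very close to 10 and 0.05
```

The step size and the number of rounds were picked by hand for these particular made-up numbers. Different data would need different settings.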
Sounds too abstract — Give me an Example!
Here’s an example —
The computer starts by randomly setting x and y to some value — absolutely randomly
Say it set —
x = 2
y = 4
Now it calculates using the formula: Opening Price = 2 × Daily Turnover + 4 × Total Trade Quantity
So, it’ll calculate the values shown below in the Computer’s Guess column.
It’s way off! Isn’t it?
Well, we all make mistakes and we learn — So does the computer!
For it to learn from the mistakes it made, it will have to first look at how big a mistake it made, so it will also calculate an error value.
The error value is calculated as: error = (actual answer - guessed answer)²
Simply, the square of the difference between the actual answer and the guessed answer.
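Here is that error calculation as a tiny Python sketch. The numbers are made up. The thing to notice is that squaring makes the error positive whether the guess is too high or too low, and that bigger misses are punished much more than small ones.

```python
def squared_error(actual, guess):
    """Square of the difference between the actual answer and the guess."""
    return (actual - guess) ** 2

# Made-up numbers, purely for illustration.
print(squared_error(200, 190))  # 100   (guess was 10 too low)
print(squared_error(200, 210))  # 100   (guess was 10 too high)
print(squared_error(200, 100))  # 10000 (a miss 10 times bigger costs 100 times more)
```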
So, now the computer is looking at a dataset like below
Now we simply take the average of all the errors, which in this case is —
average error = 137198506.3
The computer now looks at that number and sees that its guess is way off.
So now, it goes back and picks an x and y that will reduce this error.
How does it do that? How does it know what x and y to pick next based on the error?
It does that using a very interesting mathematical technique. It’s a little too early for us to talk about it, as it’s a little advanced for where we are right now.
But one of these days, we will discuss that too, I promise!
For now, let’s say we applied the technique, and the computer now thinks that the new values of x and y are
x = 9.5
y = 0.25
So it goes back and calculates again
Can you see that the values are now much closer to the actual Opening Price than before?
We calculate the error again, this time it is
average error = 345720.2
Which is much less than 137198506.3
So the computer knows it’s probably headed in the right direction.
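You can watch the average error shrink with a small Python sketch. The three rows below are made up (they are not the dataset from this article, so the error numbers will not match the ones above), and I built their prices using x = 10 and y = 0.05. The guesses being compared are the ones from this walkthrough.

```python
# Made-up rows for illustration only (NOT the article's dataset).
# Each row: (daily_turnover, total_trade_quantity, actual_opening_price)
rows = [(20, 3000, 350.0), (35, 2000, 450.0), (50, 4000, 700.0)]

def average_error(x, y):
    """Average of the squared differences between actual and guessed prices."""
    total = 0.0
    for turnover, quantity, actual in rows:
        guess = x * turnover + y * quantity
        total += (actual - guess) ** 2
    return total / len(rows)

first_try = average_error(2, 4)         # the first, random-looking guess
second_try = average_error(9.5, 0.25)   # the adjusted guess
final_try = average_error(10, 0.05)     # the values used to build the rows

print(first_try, second_try, final_try)  # each one smaller than the one before
```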
The computer will keep making guesses, until that error becomes 0 or at least close to zero.
At the point where the error becomes zero, the computer would have figured out the actual pattern or formula
Which is: Opening Price = 10 × Daily Turnover + 0.05 × Total Trade Quantity
i.e., it would have eventually guessed that
x = 10
y = 0.05
All that for this? Isn’t this simple algebra?
Right, in this particular case, we could have simply solved it using basic linear algebra.
For those who don’t quite remember how we do that, here’s how
We take any two rows from the dataset. Say —
Then we will create two equations
Now we solve this, like we did in middle school. Remember that?
And voila — we’ll have the values for x and y
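Here is the middle-school method as a small Python sketch. It assumes each row gives one equation of the shape x × Daily Turnover + y × Total Trade Quantity = Opening Price, and it solves the pair by elimination (written out as Cramer's rule). The two rows are made up for illustration, and the function name is my own.

```python
def solve_two_rows(row_a, row_b):
    """Solve for x and y from two rows of (turnover, quantity, opening_price)."""
    t1, q1, p1 = row_a
    t2, q2, p2 = row_b
    # If this is zero, the two rows say the same thing and cannot pin down x and y.
    determinant = t1 * q2 - t2 * q1
    if determinant == 0:
        raise ValueError("These two rows carry the same information; pick different rows.")
    x = (p1 * q2 - p2 * q1) / determinant
    y = (t1 * p2 - t2 * p1) / determinant
    return x, y

# Two made-up rows (NOT from the article's dataset).
x, y = solve_two_rows((20, 3000, 350), (35, 2000, 450))
print(x, y)
```

For these two rows the answer comes out as x = 10 and y = 0.05, with no guessing needed. This only works because the made-up prices follow the formula exactly.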
Alas, the real world is not that simple!
To keep things simple, I gave you an example where Opening Price depends only on two factors — Daily Turnover and Total Trade Quantity
But in reality, stock prices depend on a lot of factors, some of which we don’t even know!
For example, did you know that Trump’s tweets can affect the opening stock price of companies in India? (His tweets related to restricting work visas can affect the stock price of the Indian IT industry.)
Sometimes even the centimeters of rainfall can affect the opening stock price (for example, in agro industries).
We never really know everything that can affect a prediction.
In cases where we don’t have all the data that can directly influence the value we are trying to predict, the best we can do is to keep guessing until the error is as low as possible.
In the real world, the error will practically never touch zero, because we almost never have all the data. If we did, we would have no use for Machine Learning.
I hope you learnt something new — I know I did!