Machine learning concepts for business people
A lot has been written about machine learning lately. Many publications are very technical in nature and hard to follow for non-nerds. Business-oriented publications often describe machine learning as something magic that will solve all of our problems (and take our jobs). But there doesn’t seem to be much in between.
In this post you will learn about the basic concepts underlying machine learning, explained in layman’s terms.
So, what is machine learning?
Let’s first define learning itself. The Google web definition tells us:
Learning: the acquisition of knowledge or skills through study, experience, or being taught
This applies to computers as well as humans. As we will see later, humans and computers learn in a very similar way.
In machines the acquired knowledge resides in what’s called a model. The model learns during the training phase and is used later to perform its final task in the application, for example making predictions.
Model: the system’s view of the world, contains what the system has learned
Types of machine learning
Following the learning definition above there are three main types of machine learning:
- Supervised learning (being taught)
- Reinforcement learning (experience)
- Unsupervised learning (study)
Supervised learning is currently the most common form of machine learning. We know what the result must be and we ‘teach’ the machine to come closer to it by providing good and bad examples of the data.
Reinforcement learning is similar to supervised learning, except that the teacher is not a human. The system learns from its environment by trial and error. Take a look at this hilarious video of a robot learning to flip pancakes.
In unsupervised learning there is no teacher at all, yet a machine can still learn to find structure in the data it receives. See below for an example of dimensionality reduction. The system finds examples of photos that look similar.
Machine learning vs AI and data science
These terms are often used interchangeably.
While machine learning can be defined pretty easily, defining AI and data science is more of a moving target.
Especially the meaning of AI is constantly changing. As Douglas Hofstadter put it:
“AI is whatever hasn’t been done yet” — Douglas Hofstadter
It means that whenever we make something work with AI, people stop calling it AI. There is a whole Wikipedia page dedicated to what is called the AI effect.
Data science is not a crystal clear concept either. After Harvard Business Review called it “The Sexiest Job of the 21st Century” it became a buzzword. Data science is a general term referring to the art (or science) of extracting knowledge from data. Machine learning is therefore not a required ingredient for data science. Statistics, for example, also extracts knowledge from data.
Likewise, not all AI solutions need machine learning. Take a computer chess player for example. With just 200 lines of code you can create an AI player that plays decent chess.
A lot of the recent breakthroughs in machine learning can be attributed to deep learning. It is used for all types of machine learning (supervised, unsupervised, etc).
A deep learning model has a special structure called a (deep) neural network or DNN. We call the network deep because it contains a lot of layers. The advantage of a deep model is that it can learn complex things. The downside is that it also takes a lot of data to train it.
Neural networks have actually been around for decades. The availability of large amounts of data (and the computing power to match) is one of the reasons that they only recently took off.
Types of problems
Within supervised learning there are two main types of problems that we can solve. Which one applies depends on the desired output.
Imagine that you want to determine if a certain image contains a dog or a cat. This would be classification.
Classification: the goal is to predict a class or category
Another example: Let’s say we want to predict the amount of smog in Beijing given the weather prediction and the season. This would be a regression problem.
Regression: the goal is to predict a real number
Now, we take a look at the input data for a machine learning system. The model needs to know the properties of the thing we want to learn about. These properties or attributes are called features.
Let’s use fruit as an example: the weight and color of the fruit are both features.
The model can use these features to make a prediction about the kind of fruit we are observing. This would be classification.
Trying to predict the diameter based on the weight and color is a regression problem.
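For readers who like to see things written down, here is a minimal sketch of how the fruit example could look as data. The `Fruit` class, its field names and all the measurements are made up for illustration; they are assumptions, not part of any particular tool.

```python
from dataclasses import dataclass

@dataclass
class Fruit:
    weight_g: float     # feature: weight in grams
    color: str          # feature: color of the fruit
    kind: str           # what a classification model would predict
    diameter_cm: float  # what a regression model would predict

# Made-up measurements, for illustration only.
fruits = [
    Fruit(weight_g=75.0, color="green", kind="kiwi", diameter_cm=5.0),
    Fruit(weight_g=140.0, color="orange", kind="orange", diameter_cm=7.5),
]

# The features are the inputs the model gets to see.
features = [(f.weight_g, f.color) for f in fruits]
# Classification target: a category.
kinds = [f.kind for f in fruits]
# Regression target: a real number.
diameters = [f.diameter_cm for f in fruits]

print(features, kinds, diameters)
```

The features are the same in both cases. Only the thing we ask the model to predict changes: a category for classification, a number for regression.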
Ok, but how do we get from the features to the prediction? First we have to train our model.
Training a classification model
Let’s stay with our fruit example. We want to determine the type of fruit based on weight and color. First we need to convert the color to a number. We pick the light wavelength in nm. Green becomes 520nm and orange 600nm. We plot both numbers in a graph.
The goal of training is to find a line that separates the kiwis and oranges. The line starts out completely random. With each training step the model tries to move the line a little bit in the right direction. After some time, more and more fruit will end up on the correct side of the line.
After 5 steps, we are satisfied with the results. The training stops and we can now make predictions with our brand new trained model.
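The training loop described above can be sketched in a few lines of Python. This is only one possible way to do it: the sketch uses a classic technique called the perceptron, which is my choice for illustration and not necessarily what a real product would use. The fruit measurements, the `scale` helper and the settings (`learning_rate`, `max_rounds`, `seed`) are all made-up assumptions.

```python
import random

# Made-up training examples: (weight in grams, color wavelength in nm, label)
examples = [
    (70, 520, "kiwi"), (80, 525, "kiwi"), (90, 515, "kiwi"), (75, 530, "kiwi"),
    (130, 600, "orange"), (150, 595, "orange"),
    (140, 605, "orange"), (160, 590, "orange"),
]

def scale(weight_g, wavelength_nm):
    """Bring both features to a similar range so neither dominates."""
    return weight_g / 100.0, (wavelength_nm - 500.0) / 100.0

def predict(line, weight_g, wavelength_nm):
    """Report which side of the line the fruit falls on."""
    w1, w2, b = line
    x1, x2 = scale(weight_g, wavelength_nm)
    return "orange" if w1 * x1 + w2 * x2 + b > 0 else "kiwi"

def train(examples, learning_rate=0.1, max_rounds=1000, seed=0):
    rng = random.Random(seed)
    # The line starts out completely random.
    line = [rng.uniform(-1, 1) for _ in range(3)]
    for _ in range(max_rounds):
        mistakes = 0
        for weight_g, wavelength_nm, label in examples:
            if predict(line, weight_g, wavelength_nm) != label:
                # Nudge the line so this fruit moves towards the correct side.
                direction = 1.0 if label == "orange" else -1.0
                x1, x2 = scale(weight_g, wavelength_nm)
                line[0] += learning_rate * direction * x1
                line[1] += learning_rate * direction * x2
                line[2] += learning_rate * direction
                mistakes += 1
        if mistakes == 0:  # every fruit is on the correct side
            break
    return line

line = train(examples)
correct = sum(predict(line, w, c) == label for w, c, label in examples)
print(f"{correct} of {len(examples)} training fruits on the correct side")
```

Once trained, `predict(line, 85, 520)` asks the model on which side of the line a new 85 g green fruit falls. How many rounds the training needs depends on the random starting line and on the data.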
Training a regression model
This follows the same basic principle. Now we are going to predict the price per orange, based on the weight. The goal is to draw a line that closely matches the examples we used for training.
After training, the model gives us an estimated price per orange for a given weight.
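As a rough sketch, fitting such a line could look like the code below. It repeatedly moves the line a little in the direction that shrinks the average squared error, a technique known as gradient descent. The prices, the `train` and `predict_price` names and the settings (`learning_rate`, `steps`) are made up for illustration, and for simplicity the line starts flat at zero instead of at a random position.

```python
# Made-up training examples: (weight in grams, price per orange)
examples = [(120, 0.35), (140, 0.38), (160, 0.43), (180, 0.46), (200, 0.50)]

def train(examples, learning_rate=0.5, steps=5000):
    slope, intercept = 0.0, 0.0  # start with a flat line at zero
    n = len(examples)
    for _ in range(steps):
        slope_step = 0.0
        intercept_step = 0.0
        for weight_g, price in examples:
            x = weight_g / 100.0  # weight in units of 100 g keeps numbers small
            error = (slope * x + intercept) - price  # how far off the line is
            slope_step += error * x / n
            intercept_step += error / n
        # Move the line a little so the average squared error shrinks.
        slope -= learning_rate * slope_step
        intercept -= learning_rate * intercept_step
    return slope, intercept

def predict_price(model, weight_g):
    slope, intercept = model
    return slope * (weight_g / 100.0) + intercept

model = train(examples)
print(f"Estimated price for a 170 g orange: {predict_price(model, 170):.2f}")
```

The result is just two numbers, a slope and an intercept. That is the entire model: everything it has learned about orange prices is stored in those two values.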
Let’s take a final look at how the concepts are related.
This is the first post in a series about machine learning for business people. In upcoming posts I want to talk about practical applications and the specific dynamics of machine learning projects. The topic list is not final yet, so if you have suggestions, let me know in the comments!