AI Snack #1: Types of machine learning
Supervised. Unsupervised. Semi-Supervised. What does it all mean?
This article is a brief intro to two broad types of machine learning: supervised and unsupervised.
Imagine I gave you five years of data logging:
- the day of the week
- what time I woke up
- the clothes I put on
- my mode of transport
- whether or not I went to work on that given day
You could then use that data to create a model that predicts, going forward, whether or not I am going to work on any given day, simply by observing the data points listed above. This is an example of supervised learning.
Using “labelled” data (i.e. data containing answers, such as whether I did or didn’t go to work), machine learning algorithms are trained to make predictions and classifications going forward.
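To make that concrete, here is a minimal sketch of the idea in Python. Everything in it is an assumption for illustration: the six logged days are invented, and the "model" is a simple nearest-neighbour vote (find the most similar past days and copy their answer), which is just one of many supervised techniques.

```python
from collections import Counter

# Hypothetical labelled log: (day, wake-up hour, clothes, transport) -> went to work?
training_data = [
    (("Mon", 6, "suit", "train"), True),
    (("Tue", 6, "suit", "train"), True),
    (("Wed", 7, "suit", "car"), True),
    (("Sat", 9, "jeans", "bike"), False),
    (("Sun", 10, "jeans", "car"), False),
    (("Fri", 9, "jeans", "bike"), False),
]

def distance(a, b):
    """Count how many observed features differ between two days."""
    return sum(x != y for x, y in zip(a, b))

def predict(features, k=3):
    """Majority vote among the k most similar labelled days."""
    nearest = sorted(training_data, key=lambda row: distance(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Two new days the model has never seen.
went_to_work = predict(("Thu", 6, "suit", "train"))
stayed_home = predict(("Sat", 10, "jeans", "bike"))
```

The labels (True/False) are what make this supervised: without them, there would be nothing for the vote to copy.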
So, the key outcomes of a supervised learning algorithm are prediction and classification. To be effective, the machines need access to enough of the “right answers” to predict or classify with confidence, and those answers can get expensive to source. The great thing for services like Amazon and Netflix is that the “right answers” are being provided to them all the time by users buying different products and streaming content, which translates to smarter “suggestions” on what you should buy or watch.
What if there aren’t any “right answers”? What if all we have is a sea of data and we just want to make sense of it all?
That is where unsupervised learning comes in. Machines are provided unlabelled data such as images, audio, customer databases or search history. They then attempt to make sense of and group the data in ways that may be useful to us in the future.
For example, a machine could trawl through a customer database and create segmentations relating to demographics and buying habits. Using that information, the business may then choose to create targeted advertising campaigns for some of those segments. Note that the outcome of the machine learning in this case is not a specific answer: it doesn’t predict anything. All it is doing is breaking the data apart in ways that may be of interest to us so that we can make more informed decisions.
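The segmentation idea can be sketched with a tiny version of k-means clustering, one common unsupervised technique. The customer figures below are made up, the two starting centroids are an arbitrary choice, and a real project would normally scale the features first; treat this as a toy under those assumptions.

```python
# Hypothetical customers: (age, monthly spend). No labels are provided.
customers = [(22, 40), (25, 55), (24, 35), (58, 210), (61, 190), (55, 230)]

def closest(point, centroids):
    """Index of the centroid nearest to the point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, centroid)) for centroid in centroids]
    return dists.index(min(dists))

def k_means(points, centroids, rounds=10):
    for _ in range(rounds):
        # Assign every point to its nearest centroid.
        groups = [[] for _ in centroids]
        for point in points:
            groups[closest(point, centroids)].append(point)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            tuple(sum(axis) / len(group) for axis in zip(*group)) if group else centroid
            for group, centroid in zip(groups, centroids)
        ]
    # Report which segment each point ended up in.
    return [closest(point, centroids) for point in points]

# Start from two of the customers as initial guesses (an arbitrary choice).
segments = k_means(customers, centroids=[customers[0], customers[3]])
```

Here `segments` holds one group number per customer. The algorithm never says what a group means; deciding that segment 0 looks like "younger, lower spend" is still up to us.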
Unsupervised learning algorithms categorize and cluster, and they are becoming very popular thanks to the vast quantities of data being hoovered up and the availability of cheap computing power to train models.
Hybrids / Semi-Supervised
More and more models are now blending aspects of both supervised and unsupervised machine learning by using partially labelled data for training.
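One simple way to picture partially labelled training is "self-training": start from the few labelled examples, then repeatedly hand the nearest unlabelled point the label of its closest labelled neighbour. The sketch below assumes a single invented feature (wake-up hour) and only two labelled days; it illustrates the idea and is not how any particular production system works.

```python
# Hypothetical 1-D feature (wake-up hour); only two days carry a label.
labelled = {6.0: "work", 10.0: "home"}
unlabelled = [6.5, 7.0, 7.5, 9.5, 9.0, 8.7]

def self_train(labelled, unlabelled):
    labelled = dict(labelled)      # copy so the inputs are left untouched
    remaining = list(unlabelled)
    while remaining:
        # Pick the unlabelled point closest to any labelled point:
        # that is the guess we can be most confident about.
        point, neighbour = min(
            ((p, q) for p in remaining for q in labelled),
            key=lambda pair: abs(pair[0] - pair[1]),
        )
        labelled[point] = labelled[neighbour]  # adopt the neighbour's label
        remaining.remove(point)
    return labelled

result = self_train(labelled, unlabelled)
```

Two human-provided answers end up labelling all eight days, which is the appeal of the hybrid approach when labels are expensive to source.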
A great example can be seen in the following video, which identifies different objects being recorded while driving: