Machine Learning: Supervised and Unsupervised Learning | EXPLAINED
In this article you will learn how to distinguish between supervised and unsupervised learning and when to use each of them. You will also discover some machine learning techniques you can use to design a predictive model.
What is machine learning?
Machine learning is the study of algorithms and statistical models that computers use to improve their performance on a task by learning from sample data or past experience.
In other words, machine learning enables computers to perform a specific task without being explicitly programmed for it: they learn to detect patterns in data and use those patterns to predict future outcomes.
Machine learning is similar to human learning:
Humans build skills by observing their environment.
Machine learning models improve their predictive performance by learning from past data.
Machine Learning, Statistics and Artificial Intelligence
Machine learning is closely associated with statistics and artificial intelligence because:
Statistics is an effective tool for machine learning.
Machine learning is one possible route to achieving AI.
Types of learning: Supervised vs Unsupervised learning
Before choosing your model / predictor, you first need to understand your problem by determining the type of learning you need to implement.
Based on the data we have and the task we want our machine to perform, we can distinguish two main types of learning: supervised and unsupervised learning.
Let’s start with supervised learning:
Suppose that your data is a collection of labelled pictures of cats and dogs, and you want your model to tell whether a new picture shows a cat or a dog.
The model then learns from the data which pictures show cats and which show dogs. This is a supervised learning problem: the inputs are the pictures and the outputs are the labels (dog or cat).
Supervised learning is where our data is composed of inputs and their corresponding outputs. Our model has to learn the mapping function from the input to the output.
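To make the idea of learning a mapping from inputs to outputs more concrete, here is a minimal sketch in Python. It assumes each picture has already been reduced to two numeric features (the names height_cm and ear_length_cm and all values are made up for illustration), and it uses a simple nearest-neighbour rule as a stand-in for a real model. It is one possible illustration, not a recommended way to classify images.

```python
import math

# Hypothetical labelled data: each input is (height_cm, ear_length_cm),
# each output is the label attached to that picture. All values are made up.
training_data = [
    ((25.0, 4.0), "cat"),
    ((23.0, 5.0), "cat"),
    ((60.0, 12.0), "dog"),
    ((55.0, 10.0), "dog"),
]

def predict(features, examples):
    """Return the label of the training example closest to `features`."""
    _, nearest_label = min(
        examples, key=lambda pair: math.dist(pair[0], features)
    )
    return nearest_label

# New, unlabelled inputs: the learned mapping assigns them an output.
print(predict((24.0, 4.5), training_data))   # cat
print(predict((58.0, 11.0), training_data))  # dog
```

The important point is that the labels in training_data are what make this supervised: the model can only answer "cat" or "dog" because it has seen inputs paired with those outputs.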
Now, based on the type of output, we distinguish two types of Supervised Learning techniques:
Classification: when the output is qualitative / categorical. Our previous example is a classification problem, since we had to determine whether the picture shows a cat or a dog. It can be used in medical diagnosis, character recognition, web advertising, and so on.
Regression: when the output is quantitative / numerical, for example if our model had to estimate the weight of the animal from its picture. It can be used in product pricing, weather forecasting, sales forecasting, and so on.
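The regression case can be sketched in a similar way. The example below assumes a single numeric input (a hypothetical body length estimated from the picture) and fits a straight line to it with ordinary least squares. The data, the function names fit_line and predict_weight, and the choice of a linear model are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data (made-up values):
# input = body length in cm, output = weight in kg.
lengths = [30.0, 40.0, 50.0, 60.0]
weights = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(lengths, weights)

def predict_weight(length_cm):
    """Predict a numeric output for a new input."""
    return slope * length_cm + intercept

print(predict_weight(45.0))
```

Unlike classification, the prediction here is a number on a continuous scale rather than one of a fixed set of categories.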
Suppose now that our data consists of pictures of dogs and cats only, with no labels. We provide the pictures to the model and ask it to group them based on similar properties (shape, colour or other interesting characteristics). Our data then contains no output variable, only a set of input variables. In this case, we talk about unsupervised learning.
Unsupervised learning is where we only have input data and no corresponding outputs. The model will then help us understand structures and patterns in the data.
Clustering is the best-known technique for unsupervised learning problems. It consists of assigning each data point to a group so that points in the same group share similar properties.
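As an illustration of clustering, here is a minimal sketch of k-means, one common clustering algorithm (the article does not name a specific one, so this choice is an assumption). The points are made-up two-dimensional values, and the starting centroids are simply the first k points, which is a simplification; real implementations usually pick them more carefully.

```python
import math

def k_means(points, k, iterations=10):
    """Group `points` into k clusters; returns (centroids, clusters)."""
    # Simplification: use the first k points as the starting centroids.
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabelled data: no outputs, only input features.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0),
          (9.0, 9.5), (1.2, 0.8), (8.5, 9.0)]

centroids, clusters = k_means(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Note that the algorithm never sees a label: it only discovers that the points fall into two groups. Deciding what those groups mean (for example, cats and dogs) is left to us.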
To sum up: in supervised learning, our data provides inputs together with their corresponding outputs, and the model trains on this past data to predict the outcome for new data. In unsupervised learning, our data provides only unlabelled variables, and the model has to detect interesting patterns in the dataset.