Supervised Learning: Classification VS Regression
In supervised learning, the data comes with a supervisor (the label) that guides the learning of the mapping function. Each input variable (x) is paired with an output variable (y), and the pairs are fed into the ‘black box’ to learn a function f such that y = f(x).
The goal is to approximate the mapping function well enough that you can predict the output variable (y) for new input data (x). (By contrast, the data fed to unsupervised learning has no labels.)
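To make the idea of "approximating the mapping function" concrete, here is a minimal sketch in plain Python. It is not a real learning algorithm from any library: it simply answers with the label of the closest known input (a 1-nearest-neighbor lookup). The data, the name `training_pairs`, and the numbers are all made up for illustration.

```python
# Labeled training pairs (x, y): hours studied -> exam score.
# These numbers are invented purely for illustration.
training_pairs = [(1.0, 52.0), (2.0, 58.0), (4.0, 71.0), (6.0, 85.0)]


def f(x_new):
    """Approximate the mapping y = f(x) with the label of the closest known x."""
    nearest_x, nearest_y = min(
        training_pairs, key=lambda pair: abs(pair[0] - x_new)
    )
    return nearest_y


# 3.8 never appeared in the training data, but we can still predict for it.
print(f(3.8))  # prints 71.0, the label attached to x = 4.0
```

The point is only that the labels (y) are what make the prediction possible. Without them there would be nothing to return for a new x.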
Supervised learning can be divided into two categories:
- Classification
- Regression
All supervised algorithms belong to at least one of these categories, and before delving into specific algorithms I felt I needed a clear distinction between the two.
Just to note, this whole problem of learning a mapping function from inputs (x) to outputs (y) is called predictive modeling.
One-liner:
- Classification predicts a label
Q: Is the input a picture of a dog?
A: Yup.
- Regression predicts a quantity
Q: How much will a cup of coffee cost in 10 years?
A: $11
More Precisely…
— Classification Problem:
Examples are classified into one of 2 or more classes
Can have real-valued or discrete input variables
Binary classification has exactly 2 classes; multiclass classification has more than 2
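As a rough sketch of the points above, the snippet below uses a nearest-centroid classifier, chosen only because it fits in a few lines of standard-library Python. The function names `fit_centroids` and `classify`, the features (weight in kg, height in cm), and all the numbers are assumptions for illustration. The same code handles the binary case (cat vs. dog) and the multiclass case (cat, dog, human); only the set of labels changes.

```python
from math import dist


def fit_centroids(samples, labels):
    """Average the feature vectors of each class to get one centroid per class."""
    grouped = {}
    for features, label in zip(samples, labels):
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Invented data: (weight in kg, height in cm). Inputs are real-valued,
# but the output is always one label from a fixed set.
samples = [(4.0, 25.0), (5.0, 23.0), (30.0, 60.0), (28.0, 55.0),
           (70.0, 170.0), (80.0, 180.0)]
labels = ["cat", "cat", "dog", "dog", "human", "human"]

# Binary classification: only the cat and dog examples (2 classes).
binary_model = fit_centroids(samples[:4], labels[:4])
print(classify(binary_model, (6.0, 30.0)))  # prints cat

# Multiclass classification: all three classes.
multi_model = fit_centroids(samples, labels)
print(classify(multi_model, (72.0, 172.0)))  # prints human
```

Whatever the input looks like, the answer is always a category, never a number on a scale.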
— Regression Problem: outputs a specific quantity
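The quantity is numeric and usually continuous (a price, a temperature), not a class label
Can have real-valued or discrete input variables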
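For comparison, here is a minimal regression sketch that revisits the coffee question with a straight line fitted by ordinary least squares. The prices are invented, the helper name `fit_line` is my own, and a straight line is just the simplest possible assumption about how prices move, not a serious forecast.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up prices of a cup of coffee (in dollars) by year.
years = [2000, 2005, 2010, 2015, 2020]
prices = [2.0, 2.5, 3.0, 3.5, 4.0]

slope, intercept = fit_line(years, prices)

# The prediction is a quantity on a continuous scale, not a category.
predicted = slope * 2030 + intercept
print(round(predicted, 2))
```

Unlike the classifier, the output here can be any number, including values that never appeared in the training data.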
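Many current ML tasks can be framed in these terms. Object detection, for example, can largely be thought of as a classification problem, because the algorithm is trying to assign each object to a category (dog, cat, human, etc.). Predicting where the object is, as bounding-box coordinates, is the regression part of the same task.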