Supervised and Unsupervised Machine Learning made simple!
When to Use Supervised vs. Unsupervised Learning
Supervised Learning
Think back to when you were a little kid learning colors for the first time. If someone pointed to a green wall and asked you what color it was, you’d have no clue! But then your mom comes along and starts teaching you colors by pointing at objects and saying “This is red”, “This is blue”, over and over again. After seeing many examples of colored objects and being trained by your mom with the correct colors, you start to recognize patterns — green things tend to look like this! Red things tend to look like that! Once you’ve got the hang of things, your mom tests you by pointing to a random object and asking what color it is. If you get it right, you’ve learned! This is how the supervised machine learning approach works.
This whole process is how supervised machine learning algorithms learn. The algorithm is given many labelled training examples, like your mom pointing out the greens and reds. It finds patterns in this labelled data in order to learn a model that can accurately predict labels for new unseen data, just like you learned to label new colours. The labels provide supervision for the algorithm, teaching it the correct outputs for given inputs. Supervised learning is all about learning rules from labelled examples that can be generalized to make predictions on new data!
Unsupervised Learning
Pretend you just moved to a new school. Obviously, you don’t know anyone, but you do want to make friends. On the first day, you talk and observe different groups forming based on interests and personalities — the athletes, the artists, and the geeks. On your own, you figure out the social clusters and where you might fit in based on the patterns you pick up. No one told you explicitly what groups exist, but you used observations to learn the underlying social structure. This is how the unsupervised machine learning approach works!
Technically speaking, unsupervised machine learning algorithms are given data that is not labelled. They have to learn the trends and patterns on their own.
Supervised vs. Unsupervised Learning
In summary, the type of machine learning approach you use depends on what kind of data you have and what you want to do with it.
Supervised learning is best when you have labelled data, like the examples our moms used to teach us colours. The labels allow an algorithm to learn a model that can predict outcomes for new data. Use supervised learning when you need your algorithm to be able to classify new examples accurately or make numerical predictions.
In contrast, unsupervised learning shines when you only have unlabeled data. The algorithm has to find hidden patterns and groupings on its own, just like we observed social clusters at a new school. Use unsupervised learning to discover new insights in complex data, segment customers, or identify anomalies.
Labelled Data? Go Supervised. Unlabelled Data? Go Unsupervised.
I hope these analogies made it easier and more fun for you to learn about this topic. Let me know if you have other fun analogies!