Supervised and Unsupervised Machine Learning

Grifftalan
Apr 28, 2020

The field of information technology changes rapidly compared to most industries. Artificial intelligence, machine learning, neural networks, and quantum computing may all be terms that you have heard frequently yet still have little understanding of. As these data science tools become ever more present across job sectors, it is important for people who won’t be using them directly to at least understand what the terms mean and, ideally, how they might be useful for innovation in their workplace.

Unwrapping the Bigger Picture: What is Artificial Intelligence?

Artificial intelligence is a broad term used to describe anything outside of the “natural” world that exhibits some sort of intelligence, or learning. The term machine learning is often used interchangeably with artificial intelligence, though strictly speaking it refers to the subset of AI in which systems learn from data. The description is necessarily vague because there are many different methods by which a machine can go about learning.

Supervised and unsupervised learning are the two main categories of modern machine learning. Supervised machine learning relies heavily on statistical methods that predate computers and have been made practical by the increase in computing power seen in recent years. This increase in computing power has also made it possible to design unsupervised learning systems that can find trends in data without looking for anything specific.

Quantum computing, a developing technology that expands the binary structure of modern computing (0’s and 1’s), will allow for even more complex and abstract learning methodologies in the future, but it is not a learning method itself.

Supervised Learning

Supervised learning is when a machine learning system uses labeled training data to build a model before making predictions. For example, a data scientist may use data collected from a health survey of patients with and without a specific disease to see which questions highlight the individuals with the disease. If enough questions are asked and enough data is collected, a viable machine learning model can be created that predicts whether or not another individual has the disease, given their answers to the same questions.

These machine learning models can be used to predict both continuous (e.g. Age) and discrete/categorical (e.g. red, blue, green…) information. A major drawback to this type of modeling is that the data must be labeled correctly in order to achieve an acceptable model. A label is just the piece of information that we want to know about, or predict. Health studies require that a number of control and affected patients be gathered in order to use their labels (0 for unaffected, 1 for affected) to create a supervised machine learning model.

A supervised model (red line) is created from training data (blue dots)

There are many different models that analysts use to interpret labeled data. Some of the most common include:

  • Linear Regression
  • Logistic Regression
  • K-Nearest Neighbors
  • Decision Trees / Random Forests
  • Support Vector Machines
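
To make this concrete, here is a minimal sketch of the disease-survey example using logistic regression, one of the models from the list above. The survey answers and labels are invented purely for illustration, and the choice of model is mine rather than anything prescribed here.

```python
# Hypothetical supervised example: predicting a disease label (1 = affected,
# 0 = unaffected) from answers to three yes/no survey questions.
# All data below is made up for illustration.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X = [  # each row: answers to three survey questions (1 = yes, 0 = no)
    [1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 0],
    [1, 1, 0], [0, 0, 0], [1, 0, 0], [0, 1, 1],
]
y = [1, 0, 1, 0, 1, 0, 0, 1]  # labels: does this patient have the disease?

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression()
model.fit(X_train, y_train)              # the labels "supervise" the fit

print(model.predict(X_test))             # predicted labels for unseen patients
print("accuracy:", model.score(X_test, y_test))
```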

Though supervised learning models are far more common at the moment, unsupervised learning models are becoming more prevalent as processing power becomes increasingly accessible to analysts in various job sectors.

Unsupervised Learning

When most people think about artificial intelligence, they have a creative depiction of a robot that can interact in some human-like fashion and make decisions for itself. Though this depiction is not correct, it aligns with the concepts behind unsupervised learning.

Unlike in supervised learning, there is no label, or target, that the machine learning algorithm can use to validate its model. These systems are fed unlabeled data with the goal of finding undefined patterns. Without a set target to find, the “what” that an unsupervised algorithm looks for is loosely defined. Four of the main classes of unsupervised machine learning systems are outlined below.

Clustering

Clustering seeks to group data together by some set of criteria that the model deems appropriate. A simple example of this would be to give an unsupervised clustering model a data set containing handwritten numbers. If the model were successful in grouping all of the information it received, it would find 10 different groups representing the digits 0–9 even though there were no labels to identify them. The machine lacks the common knowledge that there are only 10 digits in our number system, but would be able to discover this regardless.
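
Here is a rough sketch of that exact idea: scikit-learn ships a small handwritten-digit data set that can be clustered with k-means. One caveat worth noting is that k-means needs the number of clusters up front, so in practice “discovering” that there are 10 groups usually means comparing several cluster counts with a score such as silhouette.

```python
# Clustering handwritten digits without ever showing the model their labels.
from sklearn.datasets import load_digits
from sklearn.cluster import KMeans

digits = load_digits()                    # 8x8 images flattened to 64 features
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
groups = kmeans.fit_predict(digits.data)  # cluster id for each image; labels never used

print(groups[:20])                        # cluster ids match digits only up to relabeling
```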

Anomaly Detection

An unsupervised machine learning algorithm designed for anomaly detection is one that can flag a data point that is significantly different from the others or occurs in an unpredictable fashion. These algorithms work under the assumption that most of the samples they are exposed to are normal occurrences. One example would be a model that flags cancerous cells through image analysis: though the model was never trained on pictures of cancerous cells, it has been exposed to so many normal cells that it can determine when one is significantly different from the norm. As the name suggests, these models serve the purpose of identifying infrequent events.
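
A minimal sketch of the same idea on simple numeric data (rather than cell images, to keep it short): scikit-learn’s IsolationForest flags points that are easy to separate from the bulk of the data. The synthetic data and the contamination fraction below are assumptions made for illustration.

```python
# Unsupervised anomaly detection with an isolation forest.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))   # mostly "normal" samples
outliers = rng.uniform(low=6.0, high=8.0, size=(5, 2))   # a handful of unusual points
X = np.vstack([normal, outliers])

detector = IsolationForest(contamination=0.01, random_state=0)
flags = detector.fit_predict(X)          # -1 = anomaly, 1 = normal

print("anomalies flagged:", int((flags == -1).sum()))
```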

Artificial Neural Network

The term neural network is trending heavily in the data world. Neural networks are machine learning models that build network-like associations between pieces of data within a data set; they can be trained in both supervised and unsupervised ways. Though this is hard to conceptualize, the simplest analogy is incredibly complex: a human brain. The idea of happiness in an individual’s brain may be linked to a family memory, a favorite food, a funny movie, or any number of things. Similarly, neural networks loosely associate different pieces of data into a larger framework, creating what we know as “context” in our world.

(Image source: Wikipedia)

Unfortunately for data scientists, at the moment it is impossible to determine what makes a machine learning model “happy”; we can only see that it is or isn’t. Many efforts are being made to see deeper into these black-box models and understand what leads to their predictions. Combined with the increased efficiency that may come with quantum computing, these types of systems may soon find their way into handheld and everyday technology.

Data Compression

A common theme in this discussion has been processing / computing power. As a general rule of thumb, predictive models become more accurate as they are exposed to more and more data. The big drawback to this is that it takes computers more time to process and analyze more data.

For this reason, unsupervised machine learning algorithms are being designed to compress data to increase efficiency. Though the computational concepts behind this can be very complex, they can be summarized as removing redundant pieces of information or combining related pieces of information.
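
One common example of “combining related pieces of information” is principal component analysis (PCA); no specific technique is named above, so treat this as one illustrative choice. PCA merges correlated features into a smaller number of components while keeping most of the variation in the data.

```python
# Compressing the 64-feature digits data down to 16 components with PCA.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

digits = load_digits()
pca = PCA(n_components=16)
compressed = pca.fit_transform(digits.data)

print(digits.data.shape, "->", compressed.shape)
print("variance kept: %.1f%%" % (100 * pca.explained_variance_ratio_.sum()))
```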

Compare and Contrast

When choosing between supervised and unsupervised learning to make predictions about data, it is important to consider the problem.

  • Do we have a large set of labeled data that we can “train” a model with? If yes, choose a supervised model.
  • Are we looking to make a prediction? If yes, choose a supervised model.
  • Are we looking for trends or groupings in our data without a goal in mind? If yes, use an unsupervised model.
  • Do we need to extract information about why our model made its predictions? If yes, then probably use a supervised model.
  • Do we need to detect anomalies in our data, without a specific idea of what they may be? If yes, use an unsupervised model.

Over time, unsupervised learning models will be used to vastly simplify everyday tasks, but they do have limitations. Supervised learning methods are much more conceptually digestible and still have a fundamentally important role in the field of data science.
