AI/ML Introduction: Episode #7: Types of Machine Learning
If you’re like most people, the term “machine learning” probably doesn’t mean much to you. And that’s okay! The field of machine learning is vast and complex, and it’s constantly evolving.
In this blog post, we will be exploring the different types of machine learning and what they mean.
Machine learning algorithms are broadly classified into 4 categories: supervised learning, unsupervised learning, semisupervised learning, and reinforcement learning.
Each of these categories encompasses a variety of different techniques and algorithms.
So what do they all mean? Let’s find out!
#1: Supervised learning:
Supervised learning algorithms use labeled data to learn how to make predictions and to assess the accuracy of those predictions.
With supervised learning, the algorithm looks at the data set and identifies patterns that can be used to predict the label of new data, whether that label is a category or a number. The labels in the data set act as a target or an answer key, allowing the algorithm to assess whether its predictions are correct or incorrect.
Supervised learning is categorised into two subtypes:
Classification:
In classification problems, the algorithm attempts to answer questions such as “Is this data point A or B?” or “What is the most likely class for this observation?”
Regression:
Regression algorithms are used to predict a continuous target value based on features of the data set. Unlike in classification problems, which focus on discrete outputs, regression algorithms output a continuous value or range of values.
In regression problems, the algorithm attempts to predict a continuous value, such as “What will be the price of this stock tomorrow?” or “How many sales will we make next month?”
Some of the typical supervised learning algorithms include:
#1: Linear regression:
A linear regression model attempts to capture the relationship between a continuous target (dependent) variable and one or more independent variables. For example, linear regression can be used to predict the price of a house, given its size and other features.
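To make this concrete, here is a minimal sketch of a one-feature least-squares fit in plain Python. The house sizes and prices are made-up numbers, and `fit_line` is a hypothetical helper written for this post, not a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: house size in square metres -> price in thousands
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]  # exactly 3 * size, so the line fits perfectly

slope, intercept = fit_line(sizes, prices)
predicted = slope * 90 + intercept  # estimated price of a 90 square-metre house
```

Real data is noisy, so the fitted line will usually pass near the points rather than through them.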
#2: Logistic regression:
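Logistic regression is used for classification problems. The algorithm uses an S-shaped logistic function to turn a weighted sum of the input features into a probability between 0 and 1, and that probability is then compared with a threshold to pick a class. Extensions such as multinomial logistic regression handle more than two classes. For example, it can be used to classify an email as spam or not spam, based on its content.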
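A rough sketch of the idea, assuming a single made-up feature (a "suspicious word score" per email) and plain gradient descent on the log loss. The function names, learning rate and step count are illustrative choices, not part of any particular library.

```python
import math

def sigmoid(z):
    """S-shaped logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit weight w and bias b by gradient descent on the mean log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean log loss with respect to w and b
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, x, threshold=0.5):
    """Turn the predicted probability into a 0/1 class label."""
    return 1 if sigmoid(w * x + b) >= threshold else 0

# Made-up data: suspicious word score per email; label 1 = spam, 0 = not spam
scores = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
labels = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(scores, labels)
```

Calling `predict(w, b, 3.5)` then returns the class for a new email, while `sigmoid(w * 3.5 + b)` gives the underlying probability.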
#3: Support vector machines (SVMs):
SVMs are powerful supervised learning algorithms used for both classification and regression problems. For classification, they draw a decision boundary between classes by maximizing the margin of separation between them. For example, an SVM can be used to predict whether a customer is likely to churn or not, based on their past behavior.
#4: Naive Bayes:
Naive Bayes is a type of supervised learning algorithm that uses Bayes' theorem to make predictions, under the "naive" assumption that the features are independent of each other given the class. It is used mainly for classification problems. For example, it can be used to classify an email as spam or not spam.
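Here is a small sketch of a word-count Naive Bayes classifier with add-one smoothing. The four training emails are invented, and `train_naive_bayes` and `classify` are hypothetical helper names used only for this illustration.

```python
import math
from collections import Counter

def train_naive_bayes(docs):
    """docs: list of (text, label). Returns word counts per class, doc counts, vocabulary."""
    word_counts = {}          # label -> Counter of words seen in that class
    class_counts = Counter()  # label -> number of documents
    for text, label in docs:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab

def classify(text, word_counts, class_counts, vocab):
    """Pick the label with the highest log posterior under Bayes' theorem."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # log P(label) + sum of log P(word | label), with add-one smoothing
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training emails
emails = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting schedule today", "ham"),
    ("project meeting notes", "ham"),
]
model = train_naive_bayes(emails)
```

With this toy model, `classify("free money", *model)` picks the spam class because both words were seen more often in the spam emails.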
#5: Decision trees:
Decision trees are supervised learning algorithms that use a tree-like structure of questions about the features to make predictions. They are used for both regression and classification problems. For example, a decision tree can be used to predict whether an applicant will get a loan or not, based on their credit score.
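A full tree is built by repeatedly choosing the best question to ask. The sketch below shows just one such step, a single split (sometimes called a decision stump), assuming Gini impurity as the quality measure. The credit scores and outcomes are made up.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Try each midpoint between sorted values; return the threshold with the lowest weighted Gini."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_impurity = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no midpoint between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity

# Made-up data: credit score -> loan approved (1) or rejected (0)
credit_scores = [520, 580, 610, 660, 700, 750]
approved = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(credit_scores, approved)
```

The learned rule reads as "approve if the credit score is above the threshold". A real tree would keep splitting each side on further features.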
#2: Unsupervised learning:
Unsupervised learning algorithms are those that learn from unlabeled data. This means that the data comes without predefined groups or target values, and the algorithm must discover structure in the data on its own.
In general, unsupervised learning algorithms are used to find patterns in data. For instance, a clustering algorithm might be used to group together images that have similar features. Other unsupervised learning algorithms might be used to detect outliers in data, or to find relationships between variables.
There are many different types of unsupervised learning algorithms, but some of the most popular include:
#1: Clustering algorithms:
Clustering algorithms are used to group data points together based on their similarity. For example, we can use a clustering algorithm to group customers together based on their purchase history.
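As one possible illustration, here is a tiny k-means (Lloyd's algorithm) in one dimension. The spending figures are invented, and the starting centroids are passed in by hand to keep the result repeatable. Real implementations usually pick them randomly.

```python
def k_means_1d(points, centroids, iterations=10):
    """Lloyd's algorithm in one dimension, starting from the given centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = sum(cluster) / len(cluster)
    return centroids, clusters

# Made-up monthly spend per customer
spend = [12, 15, 14, 90, 95, 100]
centroids, clusters = k_means_1d(spend, centroids=[min(spend), max(spend)])
```

Note that no labels were supplied: the two customer groups emerge purely from how close the numbers are to each other.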
#2: Association rule learning:
Association rule learning is used to identify relationships between different variables in a dataset. For example, it might be used to find out which items are often purchased together in a supermarket.
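Two common measures behind such rules are support (how often items appear together) and confidence (how often the rule holds when its left side is present). A minimal sketch with made-up shopping baskets:

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) from the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Made-up supermarket baskets
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]
bread_butter_support = support(baskets, {"bread", "butter"})   # 2 of 4 baskets
bread_to_butter = confidence(baskets, {"bread"}, {"butter"})   # 2 of 3 bread baskets
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen minimums. This sketch only scores one rule that we picked by hand.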
#3: Dimensionality reduction:
Dimensionality reduction algorithms are used to reduce the number of variables in a dataset while preserving its essential characteristics. For example, they can be used to reduce a large image dataset into a smaller set with only the most important features.
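One well-known technique is principal component analysis (PCA), which is not named in this post but serves as a convenient example. The sketch below reduces made-up 2-D points to a single coordinate along their direction of greatest variance, using the closed-form angle that exists for a 2x2 covariance matrix. Real datasets with many features need a general eigenvector routine instead.

```python
import math

def first_principal_axis(points):
    """Mean and direction of greatest variance for 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form angle of the leading eigenvector of a 2x2 covariance matrix
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    return (mean_x, mean_y), (math.cos(angle), math.sin(angle))

def project(points, mean, axis):
    """Reduce each 2-D point to one coordinate along the principal axis."""
    return [(x - mean[0]) * axis[0] + (y - mean[1]) * axis[1] for x, y in points]

# Made-up points that all lie on the line y = x
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
mean, axis = first_principal_axis(points)
reduced = project(points, mean, axis)
```

Because these points sit on a straight line, one number per point is enough to describe them without losing any information.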
#4: Anomaly detection:
Anomaly detection algorithms are used to identify outliers in datasets. For example, they can be used to detect unusual behavior in a network of connected devices.
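A very simple way to flag outliers is the z-score: how many standard deviations a value lies from the mean. This is only one of many approaches, and the readings and the threshold of 2 below are made-up choices for illustration.

```python
import statistics

def z_score_outliers(values, threshold=2.0):
    """Flag values whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]

# Made-up packets-per-second readings from one device
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = z_score_outliers(readings)
```

The z-score works best when normal values cluster around one typical level. Data with several normal levels calls for more robust methods.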
#5: Autoencoders:
Autoencoders are neural networks used for unsupervised feature extraction and dimensionality reduction. They attempt to learn the structure of data by encoding it into a smaller representation and then reconstructing the original input from that representation. For example, they can be used to generate compressed representations of images for efficient storage and retrieval.
#3: Semisupervised learning:
Semisupervised learning is a branch of machine learning that uses both labeled and unlabeled data to improve the accuracy and quality of the model being created. It is beneficial when too little labeled data is available, as the unlabeled data can supplement the supervised learning.
There are several techniques used in semisupervised learning, including:
#1: Self-training:
Self-training is a semisupervised technique that uses a small labeled dataset to train an initial model, which is then used to label the remaining unlabeled data. The newly labeled data (usually only the predictions the model is most confident about) is then added back into the training set for further refinement of the model. For example, businesses can use self-training to create a customer segmentation model with limited labeled data.
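The loop can be sketched as follows, assuming a deliberately simple base model (nearest class centroid on one feature) and a hand-picked confidence margin. The customer numbers, segment names and the `min_margin` value are all made up for illustration.

```python
def centroids(labeled):
    """Mean feature value per class."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def self_train(labeled, unlabeled, min_margin=2.0, max_rounds=10):
    """Repeatedly pseudo-label confident points and retrain on the grown set."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)  # "retrain" the model on the current labels
        confident, remaining = [], []
        for x in unlabeled:
            # Rank classes by distance from x to their centroid
            ranked = sorted(cents, key=lambda label: abs(x - cents[label]))
            margin = abs(x - cents[ranked[1]]) - abs(x - cents[ranked[0]])
            if margin >= min_margin:
                confident.append((x, ranked[0]))  # confident pseudo-label
            else:
                remaining.append(x)               # too ambiguous, leave unlabeled
        if not confident:
            break
        labeled.extend(confident)
        unlabeled = remaining
    return labeled, unlabeled

# Made-up data: store visits per month, with only two customers labeled by hand
seed_labels = [(1.0, "low"), (9.0, "high")]
visits = [2.0, 3.0, 8.0, 7.0, 5.2]
final_labeled, still_unlabeled = self_train(seed_labels, visits)
```

The customer with 5.2 visits sits almost exactly between the two segments, so the sketch leaves that point unlabeled rather than guessing.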
#2: Graph-based methods:
Graph-based methods represent labeled and unlabeled data points as nodes in a graph, connect similar points with edges, and spread the known labels along those edges to the unlabeled nodes. For example, they can be used to identify which parts of an image are related to each other.
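One simple graph-based scheme is label propagation. In the sketch below each unlabeled node repeatedly takes the average score of its neighbours, while labeled nodes stay fixed. The six-node chain (think of six neighbouring image regions) and the scores 1.0 and 0.0 for the two known classes are assumptions made for this example.

```python
def propagate_labels(neighbors, known, iterations=200):
    """Spread known scores through a graph by repeated neighbour averaging.

    neighbors: node -> list of adjacent nodes
    known:     node -> fixed score (1.0 for one class, 0.0 for the other)
    """
    scores = {node: known.get(node, 0.5) for node in neighbors}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbors.items():
            if node in known:
                updated[node] = known[node]  # labeled nodes stay clamped
            else:
                updated[node] = sum(scores[n] for n in adjacent) / len(adjacent)
        scores = updated
    return scores

# Made-up graph: a chain of six regions, with only the two ends labeled
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
known = {0: 1.0, 5: 0.0}
scores = propagate_labels(chain, known)
predicted = {node: ("A" if s >= 0.5 else "B") for node, s in scores.items()}
```

Nodes closer to the labeled "A" end finish with scores above 0.5 and nodes closer to the "B" end finish below it, so each unlabeled region inherits the label of its nearest labeled neighbourhood.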
#3: Generative models:
Generative models create synthetic data from existing labeled and unlabeled samples in order to improve generalisation. For example, they can be used to generate more realistic images from a small set of labeled data.
One key advantage is that semisupervised learning algorithms can require significantly less labeled data than traditional supervised approaches while still achieving satisfactory results. This makes them particularly useful for applications where data labeling is expensive or time consuming, because they learn from both the labeled and unlabeled parts of the dataset.
#4: Reinforcement learning:
Reinforcement learning algorithms are based on a trial and error approach, wherein the agent learns from feedback it receives in the environment. This type of algorithm is commonly used in robotics applications and artificial intelligence systems to develop behaviors that maximise rewards.
The techniques used in this type of learning can be broadly divided into two categories:
#1: Value-based learning:
In value-based learning, the agent estimates the expected long-term reward for each action it can take in each situation. It then uses these value estimates to choose actions that will maximise its overall reward. For example, a robot can use this approach to learn the best path to take in order to reach its destination.
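Q-learning is a common value-based method. The sketch below applies its update rule to a made-up corridor of five cells where the last cell is the goal. To keep the result repeatable it sweeps over every state and action instead of exploring by trial and error, which is a simplification of how an agent would really gather experience. The discount `gamma` and step size `alpha` are arbitrary illustrative values.

```python
def learn_q_values(n_states=5, gamma=0.9, alpha=0.5, sweeps=200):
    """Tabular Q-learning updates on a corridor whose last cell is the goal."""
    actions = (-1, +1)  # index 0 = move left, index 1 = move right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(sweeps):
        for state in range(goal):  # the goal state is terminal, so it is never updated
            for a_index, move in enumerate(actions):
                next_state = min(max(state + move, 0), goal)
                reward = 1.0 if next_state == goal else 0.0
                # The goal row stays at zero, so no future value is added after reaching it
                target = reward + gamma * max(q[next_state])
                q[state][a_index] += alpha * (target - q[state][a_index])
    return q

q = learn_q_values()
# Greedy policy for the non-terminal cells: pick the action with the highest value
policy = [row.index(max(row)) for row in q[:-1]]
```

Each entry of `q` estimates the discounted reward of taking that action and acting well afterwards, so reading off the largest entry per cell gives the path to the goal.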
#2: Policy-based learning:
In policy-based learning, the agent directly learns a policy, which maps each situation to the action it should take (or to a probability for each action), without first estimating values. It adjusts this policy to favour actions that lead to higher reward. For example, a robot can use this approach to learn how to complete a task without being given explicit instructions.
Reinforcement learning algorithms are useful for tasks that require the agent to make decisions in real-time, such as robotics or autonomous driving. They can also be used in more complex applications such as game playing, where the agent must consider a wide range of scenarios and responses.
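A minimal sketch of the policy-gradient idea, assuming a made-up task with a single situation and two actions whose average rewards are known. To stay deterministic it uses the exact gradient of the expected reward rather than sampled episodes, which is what practical methods such as REINFORCE estimate from experience.

```python
import math

def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    highest = max(prefs)
    exps = [math.exp(p - highest) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def train_policy(mean_rewards, lr=0.5, steps=500):
    """Gradient ascent on expected reward for a softmax policy over actions."""
    prefs = [0.0] * len(mean_rewards)
    for _ in range(steps):
        probs = softmax(prefs)
        expected = sum(p * r for p, r in zip(probs, mean_rewards))
        # d(expected reward) / d(pref_a) = pi(a) * (reward_a - expected reward)
        for a, (p, r) in enumerate(zip(probs, mean_rewards)):
            prefs[a] += lr * p * (r - expected)
    return softmax(prefs)

# Made-up task: action 1 pays more on average than action 0
action_probs = train_policy([0.2, 1.0])
```

The policy starts out choosing both actions equally often and gradually shifts its probability toward the better-paying action, without ever storing a value table.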
Conclusion:
Machine learning algorithms are powerful tools that can be used to solve complex problems in a variety of domains. From supervised and unsupervised learning to semisupervised and reinforcement learning, most tasks fit at least one of these algorithm types.
In addition, machine learning has become increasingly important as businesses continue to leverage technology for competitive advantage. By understanding the different types of machine learning algorithms and how they can be used in your business, you can choose the approach that best fits your problem.