Machine Learning
Types of ML Models
We have all heard the term ‘Machine Learning’, and some of us might even be familiar with the details of how it works. Recently, I began studying the various applications of Machine Learning and came across a couple of articles on the different types of ML models developed over the years, classified on the basis of the task they accomplish. In this article, I compile what I have learnt in a comprehensive and crisp manner!
Let us first have an overview of the types of models we have, grouped on the basis of the tasks they perform:
1. Classification
2. Regression
3. Clustering
4. Dimensionality Reduction
5. Deep Learning
Note that this is not an exhaustive list of ML model types, just a compilation of the most common types I have read about so far.
1. Classification
A classification model is a Machine Learning model whose output is always a categorical variable. These models are used whenever we have a set of labelled data and need to assign each datapoint to a certain ‘class’, i.e. a group of objects with similar properties.
One of the most common applications of these models is the image classifier, which sorts images of animals into groups such as cats, dogs, butterflies and so on.
Some common ML Algorithms which serve as classifier models are:
K-Nearest Neighbours
- Simple enough to be understood by beginners
- Computationally expensive
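As an illustration, here is a minimal NumPy sketch of the K-Nearest Neighbours idea, with a toy dataset of my own; it is meant to show the mechanics, not to be an optimised implementation:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Label a query point by majority vote of its k nearest neighbours."""
    # Euclidean distance from the query point to every training point
    dists = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(dists)[:k]
    # Majority vote among their labels
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy 2-D data: class 0 clustered near the origin, class 1 near (5, 5)
X = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X, y, np.array([0.5, 0.5])))  # -> 0
print(knn_predict(X, y, np.array([5.5, 5.5])))  # -> 1
```

The full distance computation against every training point is exactly why KNN is computationally expensive at prediction time.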
Naive Bayes
- Based on Bayes’ Theorem
- Updates probabilities based on gained information
- Makes the idealistic assumption that the features are independent of each other, thus titled ‘Naive’
- Performs well on real world data, despite the assumptions
Logistic Regression
- Named regression, but widely used in Binary Classification
- Linear model for classification
- Can be used for more than 2 classes as well, but becomes computationally expensive as the number of classes increases
Support Vector Machine (SVM)
- Used for binary or multi-class classification
- Searches for the curve or hyperplane that separates the classes with the largest margin
Decision Tree
- Used for binary or multiclass classification
- Robust to outliers
- Overfitting may occur
Ensembles
- A combination of two or more of the above classifiers, used together to get a better result than any single one
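As a rough sketch of how several of the classifiers above can be combined into an ensemble, here is a hard-voting ensemble built with scikit-learn (assumed available); the Iris dataset and the particular mix of estimators are my own illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import VotingClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each estimator predicts a class; the majority label wins (hard voting)
ensemble = VotingClassifier(estimators=[
    ("lr", LogisticRegression(max_iter=1000)),
    ("nb", GaussianNB()),
    ("dt", DecisionTreeClassifier(random_state=0)),
])
ensemble.fit(X_train, y_train)
print(f"test accuracy: {ensemble.score(X_test, y_test):.2f}")
```

Voting is only one way to combine classifiers; bagging and boosting are other common ensemble strategies.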
2. Regression
A regression model is a Machine Learning model whose output can take continuous values. These models are used when we need to establish a relationship between a value we want to predict and other values it may depend on.
Example applications of regression include predicting airplane ticket prices based on seasonal trends, or predicting the temperature on a certain day of the year.
Some common ML Algorithms which serve as regression models are:
Linear Regression
- The simplest of regression models
- Works best when the target has an approximately linear relationship with the predictors
- Issues may arise when multicollinearity is present in the dataset
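To see how simple linear regression really is, here is a minimal NumPy sketch that fits a line using the least-squares normal equations (the toy data is my own, generated from a known line):

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares: solve for w minimising ||X1 w - y||^2."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)  # numerically stabler than inverting X'X
    return w

# Noise-free data generated from y = 2x + 1, so OLS should recover it exactly
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w = fit_ols(X, y)
print(w)  # -> approximately [1. 2.]  (intercept, slope)
```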
Lasso Regression
- Linear Regression with L1 regularisation
- Reduces the number of predictor variables by shrinking some coefficients to exactly zero
- Robust against outliers
Ridge Regression
- Linear Regression with L2 regularisation
- Does not drop variables; instead keeps them all and adjusts their importance in the final outcome
- Works best when output variable is a function of all input variables
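The ridge closed form, w = (XᵀX + αI)⁻¹Xᵀy, can be sketched in a few lines of NumPy; the deliberately collinear toy data below is my own, chosen to show why the penalty helps:

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Ridge regression closed form: w = (X'X + alpha*I)^-1 X'y.
    The alpha*I term shrinks every coefficient toward zero without dropping any."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Two perfectly correlated predictors: plain OLS has no unique solution here,
# but the penalty makes the system solvable and splits the weight between them
X = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])
w = fit_ridge(X, y, alpha=0.1)
print(w)  # both coefficients equal, each just under 1.0
```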
SVM Regression or SVR
- Similar to the SVM
- The objective is to find the best-fit line or curve, i.e. the hyperplane with the maximum number of points inside its margin of tolerance
Decision Tree Regression
- Tree structure like Decision Tree classifier
- Useful when predictions can have virtually infinite values
3. Clustering
A Machine Learning algorithm that groups together unlabelled data and labels them without manual intervention, based on a certain measure of similarity.
A common application would be grouping together customers of a rock climbing club, based on their age, fitness and athleticism to give them the correct kind of course to practice on.
Some common ML Algorithms which serve as clustering models are:
K-Means
- Simple enough for beginners to understand
- Suffers from high variance
- The ‘K’ value must either be pre-determined or calculated, which can be computationally expensive
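The assign-and-update loop at the heart of K-Means fits in a few lines of NumPy; the initialisation and toy data below are naive illustrative choices of mine (K-Means++ would pick the starting centroids more carefully):

```python
import numpy as np

def kmeans(X, k, n_iter=10):
    """Plain K-Means: repeatedly assign points to the nearest centroid,
    then move each centroid to the mean of its assigned points."""
    # Naive initialisation: k data points spread evenly through the dataset
    centroids = X[np.linspace(0, len(X) - 1, k).astype(int)]
    for _ in range(n_iter):
        # Distance of every point to every centroid -> index of nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

# Two well-separated 2-D blobs, centred at (0, 0) and (5, 5)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
labels, centroids = kmeans(X, k=2)
print(centroids.round(1))  # centroids should land near (0, 0) and (5, 5)
```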
K-Means++
- Improved version of K-Means, which selects initial centroids in a smarter manner
- Although initialisation takes a little longer than in K-Means, it reduces the time consumed later in the process
K-Medoids
- K-Means gives centroids that may not actually be a part of the dataset, reducing interpretability
- K-Medoids initialises like K-Means++ and iterates like K-Means, but finally chooses an actual data point as the centre of each group, namely the point that gives the least loss with respect to the computed centroid
Agglomerative Clustering
- Hierarchical clustering, bottom up approach
- Begins with all datapoints in individual clusters, ends with one big cluster containing all datapoints, grouped on similarity measures
- An intermediate state of this hierarchy, chosen by cutting the tree at a suitable level, is taken as the output set of clusters
DBSCAN
- Groups together points that are close to each other, usually measured by Euclidean distance
- Basically has 2 parameters:
  - Maximum distance between 2 points to consider them close
  - Minimum number of points labelled close to each other, to be called a high-density group
- The optimum values of these parameters usually depend on the size of the dataset
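Assuming scikit-learn is available, DBSCAN and its two parameters can be sketched as follows; the two-moons dataset and the parameter values are illustrative choices of mine:

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaving half-moon clusters, a shape K-Means cannot separate
X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)

# eps: maximum distance for two points to count as neighbours
# min_samples: neighbours required for a point to anchor a high-density region
labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

clusters = {label for label in labels if label != -1}  # -1 marks noise points
print(len(clusters))
```

Unlike K-Means, the number of clusters is not specified up front; it falls out of the density parameters.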
4. Dimensionality Reduction
Dimensionality is the number of predictor variables on which the target variable depends.
Often, in real-world cases, we have access to a lot of predictor variables, but only a handful of them considerably affect the target variable, while the others have very little impact. Such low-impact variables may be removed from the equation completely in order to reduce computation costs.
This is known as dimensionality reduction. It can reduce model complexity and the burden on the processor, and increase computational efficiency, while producing similar, sometimes even better, results.
Some common ML Algorithms which serve as dimensionality reduction models are:
Principal Component Analysis (PCA)
- Creates a new, smaller set of predictor variables out of the original set
- Results become less interpretable
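A minimal NumPy sketch of PCA, computed here via the singular value decomposition of the centred data; the toy dataset is my own, built so that the third feature is redundant:

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD: project centred data onto its top principal directions."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # coordinates in the reduced space

# 3-D points that actually lie on a 2-D plane, so two components lose nothing
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 2))
X = np.column_stack([A[:, 0], A[:, 1], A[:, 0] + A[:, 1]])  # third column is redundant
Z = pca(X, n_components=2)
print(Z.shape)  # -> (50, 2)
```

The new axes are linear mixes of the original features, which is exactly why the reduced results are harder to interpret.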
T-distributed Stochastic Neighbour Embedding (t-SNE)
- Models pairwise similarities between points, rather than maximising variance as in PCA
- A visual difference can be observed on the Swiss Roll dataset when processed with t-SNE versus PCA
- First, a point is chosen
- The Gaussian distribution of distances from this point is calculated
- The size of the Gaussian neighbourhood varies with perplexity, roughly the number of neighbours considered to be in the vicinity
- In the low-dimensional space, the Gaussian is replaced with a Student-t distribution (with one degree of freedom, i.e. a Cauchy distribution), which has a sharper peak and heavy tails
- The heavy tails give far-away points non-negligible similarity probabilities, so only highly similar points (high probability) end up in the same cluster in the lower dimension
- Essentially, it produces an M-dimensional embedding of an N-dimensional dataset, where M < N. Often, M is chosen as 2 or 3 for visualisation purposes.
Singular Value Decomposition (SVD)
- Decomposes a large matrix into smaller component matrices (U, Σ and Vᵀ) using linear algebra
- Uses properties of linear transformations to produce results efficiently
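The decomposition itself can be demonstrated in a few lines of NumPy (the small example matrix is my own):

```python
import numpy as np

# SVD factors any matrix A into U @ diag(S) @ Vt; truncating to the largest
# singular values gives the best low-rank approximation of A.
A = np.array([[3.0, 1.0], [1.0, 3.0], [1.0, 1.0]])
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# The factors reconstruct A exactly
A_rebuilt = U @ np.diag(S) @ Vt
print(np.allclose(A, A_rebuilt))  # -> True

# Keeping only the largest singular value gives a rank-1 approximation
A_rank1 = S[0] * np.outer(U[:, 0], Vt[0])
```

Truncated SVD of this kind is what makes SVD useful for dimensionality reduction and matrix compression.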
5. Deep Learning
The subset of Machine Learning that deals with Neural Networks is known as deep learning.
It has taken the internet by storm, as it is widely used in digital marketing, application personalisation, recommendation systems and many other facets of the internet. Often, we see image recognition using neural networks and real-time object labelling, both of which are products of deep learning.
Some common ML Algorithms which serve as deep learning models are:
- Multi-layer perceptron
- Convolutional Neural Networks (CNN)
- Recurrent Neural Networks (RNN)
- Boltzmann Machine
- Autoencoders
- Generative Adversarial Networks (GAN)
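The first of these, the multi-layer perceptron, is simple enough to sketch by hand. Below is a minimal NumPy forward pass with fixed, arbitrary weights of my own choosing; a real network would learn these weights via backpropagation:

```python
import numpy as np

def relu(x):
    """Common non-linearity: zero out negative values."""
    return np.maximum(0, x)

def mlp_forward(x, W1, b1, W2, b2):
    """Input -> hidden layer (ReLU) -> output layer."""
    h = relu(x @ W1 + b1)   # hidden activations
    return h @ W2 + b2      # raw output scores

# A tiny network: 2 inputs -> 3 hidden units -> 1 output, arbitrary weights
W1 = np.array([[1.0, -1.0, 0.5], [0.5, 1.0, -1.0]])
b1 = np.zeros(3)
W2 = np.array([[1.0], [1.0], [1.0]])
b2 = np.zeros(1)

x = np.array([1.0, 2.0])
print(mlp_forward(x, W1, b1, W2, b2))  # -> [3.]
```

CNNs, RNNs and the other architectures listed above build on this same layered idea with more specialised connectivity.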
Conclusion
We have covered a few fundamental and differentiating features of a small set of machine learning model types, as well as the algorithms which fall under those categories. Once I learn more about these models, I shall expand on this in a follow-up article.
If you liked the article so far, you might want to follow me on Medium, here.
If you would like to get in touch, you may do so here.