Machine Learning

Ishaan Sunita Pandita
Apr 10, 2022


Types of ML Models

We have all heard the term ‘Machine Learning’, and some of us may even be familiar with the details of how it works. Recently, I began studying the various applications of Machine Learning, and I came across a couple of articles on the different types of ML models that have been developed over the years, classified on the basis of the task they accomplish. In this article, I compile what I have learnt in a comprehensive and crisp manner!

Photo by Markus Winkler on Unsplash

Let us first have an overview of the types of models we have, grouped on the basis of the tasks they perform:
1. Classification
2. Regression
3. Clustering
4. Dimensionality Reduction
5. Deep Learning

Note that this is not an exhaustive list of ML model types, just a compilation of the most common types I have read about so far.

1. Classification

A Machine Learning model whose output is always a categorical variable. These models are used whenever we have a set of labelled data and need to assign each datapoint to a certain ‘class’, i.e. a group of objects with similar properties.

One of the most common applications of these models is the image classifier, which sorts images of animals into groups such as cats, dogs, and butterflies.

Some common ML Algorithms which serve as classifier models are:

K-Nearest Neighbours

When k = 5. Here, point Z will be labelled as part of the group on the right, despite being the only point of that group lying in Quadrant II.
  • Simple enough to be understood by beginners
  • Computationally expensive
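As a rough illustration, here is a minimal KNN classifier sketch using scikit-learn (assumed available) on a toy two-dimensional dataset:

```python
# Minimal KNN sketch on a toy 2-D dataset (scikit-learn assumed available)
from sklearn.neighbors import KNeighborsClassifier

X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]  # two well-separated groups
y = [0, 0, 0, 1, 1, 1]                                # their class labels

knn = KNeighborsClassifier(n_neighbors=3)  # k = 3
knn.fit(X, y)

# Each query point gets the majority class of its 3 nearest neighbours
pred = knn.predict([[0.5, 0.5], [5.5, 5.5]])
```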

Naive Bayes

  • Based on Bayes’ Theorem
  • Updates probabilities based on gained information
  • Makes the idealistic assumption that features are independent of each other, hence the title ‘Naive’
  • Performs well on real world data, despite the assumptions
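A minimal Gaussian Naive Bayes sketch with scikit-learn (assumed available), where each feature is modelled as an independent Gaussian per class:

```python
# Gaussian Naive Bayes: per-class Gaussian likelihoods, features assumed independent
from sklearn.naive_bayes import GaussianNB

X = [[1.0], [1.2], [0.9], [5.0], [5.2], [4.8]]
y = [0, 0, 0, 1, 1, 1]

nb = GaussianNB()
nb.fit(X, y)
pred = nb.predict([[1.1], [5.1]])
```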

Logistic Regression

Logistic Regression using Sigmoid Function — Source: Wikipedia
  • Named regression, but widely used in Binary Classification
  • Linear model for classification
  • Can be used for more than 2 classes as well, but becomes computationally expensive as classes increase
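A small binary-classification sketch with scikit-learn (assumed available); the sigmoid output is exposed via predict_proba:

```python
# Logistic regression for binary classification
from sklearn.linear_model import LogisticRegression

X = [[0], [1], [2], [8], [9], [10]]
y = [0, 0, 0, 1, 1, 1]

clf = LogisticRegression()
clf.fit(X, y)

pred = clf.predict([[1], [9]])
proba = clf.predict_proba([[5]])  # class probabilities for a borderline point
```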

Support Vector Machine (SVM)

One of the lines that separates this dataset, used for the SVM
  • Used for binary or multi-class classification
  • Searches for the curve or hyperplane that separates the classes with the maximum margin
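A minimal linear SVM sketch with scikit-learn (assumed available) on a linearly separable toy dataset:

```python
# Linear SVM: finds the maximum-margin separating hyperplane
from sklearn.svm import SVC

X = [[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]]
y = [0, 0, 0, 1, 1, 1]

svm = SVC(kernel="linear")
svm.fit(X, y)
pred = svm.predict([[0.5, 0.5], [4.5, 4.5]])
```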

Decision Tree

  • Used for binary or multi-class classification
  • Robust to outliers
  • Overfitting may occur
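A decision tree sketch with scikit-learn (assumed available); capping max_depth is one simple way to curb the overfitting mentioned above:

```python
# Decision tree classifier; limiting depth is a simple guard against overfitting
from sklearn.tree import DecisionTreeClassifier

X = [[0], [1], [2], [10], [11], [12]]
y = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(max_depth=2)
tree.fit(X, y)
pred = tree.predict([[1], [11]])
```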

Ensembles

  • A combination of two or more of the above stated classifiers, to get the desired result
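One common way to build such a combination is a voting ensemble; a sketch with scikit-learn (assumed available), combining three of the classifiers discussed above:

```python
# A simple voting ensemble: the majority class across three classifiers wins
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [8], [9], [10]]
y = [0, 0, 0, 1, 1, 1]

ensemble = VotingClassifier(estimators=[
    ("knn", KNeighborsClassifier(n_neighbors=3)),
    ("nb", GaussianNB()),
    ("lr", LogisticRegression()),
])
ensemble.fit(X, y)
pred = ensemble.predict([[1], [9]])
```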

2. Regression

A Machine Learning model whose output can take up continuous values. These models are used when we need to establish a relationship between a value we want to predict and other values on which it may depend.

An example of application of regression would be the prediction of airplane ticket prices based on seasonal trends, or prediction of temperature on a certain day of the year.

Some common ML Algorithms which serve as regression models are:

Linear Regression

Linear Regression, searching for the line of best fit
  • The simplest of regression models
  • Works best on linearly separable data
  • Issues may arise when multi-collinearity is present in the dataset
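A minimal sketch with scikit-learn (assumed available), recovering the line y = 2x + 1 from four points:

```python
# Linear regression: recover y = 2x + 1 from four exact points
from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4]]
y = [3, 5, 7, 9]  # exactly y = 2x + 1

reg = LinearRegression()
reg.fit(X, y)
pred = reg.predict([[5]])  # should be close to 11
```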

Lasso Regression

  • Linear Regression with the L1 Regularisation
  • Reduces the number of predictor variables
  • Robust against outliers

Ridge Regression

  • Linear Regression with the L2 Regularisation
  • Does not reduce variables, instead keeps them all and adjusts their importance in the final outcome
  • Works best when output variable is a function of all input variables
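To contrast the two penalties, here is a small synthetic sketch (scikit-learn and NumPy assumed available): only the first of five features actually drives the target, so the L1 penalty zeroes out the irrelevant coefficients while the L2 penalty merely shrinks them:

```python
# Lasso (L1) vs Ridge (L2) on data where only feature 0 matters
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=50)  # target depends only on feature 0

lasso = Lasso(alpha=0.1).fit(X, y)  # L1: drives small coefficients to exactly zero
ridge = Ridge(alpha=1.0).fit(X, y)  # L2: shrinks coefficients but keeps them all
```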

SVM Regression or SVR

The line of best fit in this set of points
  • Similar to the SVM
  • Objective is to find the best-fit line/curve
  • This is the hyperplane whose surrounding margin contains the maximum number of points
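A small SVR sketch with scikit-learn (assumed available); the epsilon parameter sets the width of the tube around the fitted line within which points incur no loss:

```python
# Support Vector Regression: fit a line with an epsilon-insensitive tube
from sklearn.svm import SVR

X = [[1], [2], [3], [4], [5]]
y = [2, 4, 6, 8, 10]  # exactly y = 2x

svr = SVR(kernel="linear", C=100, epsilon=0.1)  # points inside the 0.1 tube cost nothing
svr.fit(X, y)
pred = svr.predict([[6]])  # should be roughly 12
```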

Decision Tree Regression

  • Tree structure like Decision Tree classifier
  • Useful when predictions can have virtually infinite values
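A decision tree regression sketch with scikit-learn (assumed available); each prediction is the average of the training targets in the matching leaf:

```python
# Decision tree regression: predictions are leaf-wise averages of training targets
from sklearn.tree import DecisionTreeRegressor

X = [[1], [2], [3], [10], [11], [12]]
y = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]

dtr = DecisionTreeRegressor(max_depth=2)
dtr.fit(X, y)
pred = dtr.predict([[2], [11]])
```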

3. Clustering

A Machine Learning algorithm that groups together unlabelled data and labels them without manual intervention, based on a certain measure of similarity.

A common application would be grouping together customers of a rock climbing club, based on their age, fitness and athleticism to give them the correct kind of course to practice on.

Some common ML Algorithms which serve as clustering models are:

K-Means

K-Means, with randomised centroids. Here the centroids are accurate; often, they may not be.
  • Simple enough for beginners to understand
  • Suffers from high variance
  • ‘K’ value must either be pre-determined or calculated, which can be computationally expensive
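A minimal K-Means sketch with scikit-learn (assumed available); the n_init restarts guard against the high variance of random initial centroids:

```python
# Plain K-Means with random initial centroids; n_init restarts reduce variance
from sklearn.cluster import KMeans

X = [[0, 0], [0, 1], [1, 0], [8, 8], [8, 9], [9, 8]]  # two obvious blobs
km = KMeans(n_clusters=2, init="random", n_init=10, random_state=0)
labels = km.fit_predict(X)
```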

K-Means++

  • Improved version of K-Means, which selects initial centroids in a smarter manner
  • Although initialisation is a little longer than K-Means, it serves well to reduce time consumed later into the process
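In scikit-learn (assumed available), the smarter seeding is exposed through the init parameter, and with it far fewer restarts are typically needed:

```python
# K-Means++ seeding: initial centroids are spread apart, so one run usually suffices
from sklearn.cluster import KMeans

X = [[0, 0], [0, 1], [1, 0], [8, 8], [8, 9], [9, 8]]
km = KMeans(n_clusters=2, init="k-means++", n_init=1, random_state=0)
labels = km.fit_predict(X)
```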

K-Medoids

Green points indicate actual datapoints closest to centroids
  • K-Means gives centroids that may not actually be part of the dataset, reducing interpretability
  • K-Medoids initialises like K-Means++ and iterates like K-Means, but finally chooses an actual data point as the centre (medoid) of each group, picking the point that gives the least loss with respect to the computed centroid

Agglomerative Clustering

  • Hierarchical clustering, bottom up approach
  • Begins with every datapoint in its own cluster and ends with one big cluster containing all datapoints, merging the most similar pair of clusters at each step
  • An intermediate level of this hierarchy, giving the desired number of clusters, is chosen as the output
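A minimal sketch with scikit-learn (assumed available), cutting the bottom-up hierarchy at two clusters:

```python
# Agglomerative (bottom-up) clustering, hierarchy cut at 2 clusters
from sklearn.cluster import AgglomerativeClustering

X = [[0, 0], [0, 1], [1, 0], [8, 8], [8, 9], [9, 8]]
agg = AgglomerativeClustering(n_clusters=2, linkage="average")
labels = agg.fit_predict(X)
```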

DBSCAN

  • Groups together points that are close to each other, usually measured by Euclidean distance
  • Basically has 2 parameters:
    - Maximum distance between 2 points to consider them close
    - Minimum number of points to be labelled close to each other, to be called a high-density group
  • The optimum values of these parameters usually depend on the size of the dataset
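The two parameters above map directly onto scikit-learn's eps and min_samples (library assumed available); isolated points that belong to no dense group are labelled -1:

```python
# DBSCAN: eps is the max neighbour distance, min_samples the density threshold
from sklearn.cluster import DBSCAN

X = [[0, 0], [0.5, 0], [0, 0.5],   # dense group 1
     [8, 8], [8.5, 8], [8, 8.5],   # dense group 2
     [50, 50]]                     # isolated point -> labelled as noise (-1)

db = DBSCAN(eps=1.0, min_samples=3)
labels = db.fit_predict(X)
```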

4. Dimensionality Reduction

Dimensionality is the number of predictor variables on which the target variable depends.

Often, in real-world cases, we have access to a lot of predictor variables, but only a handful of them considerably affect the target variable, while the rest carry very little impact. Such low-impact variables may be removed from the equation completely, in order to reduce computation costs.

This is known as dimensionality reduction. It can reduce model complexity and the burden on the processor while increasing computational efficiency, producing similar results to the original, sometimes even better ones.

Some common ML Algorithms which serve as dimensionality reduction techniques are:

Principal Component Analysis (PCA)

  • Creates a new, smaller set of predictor variables out of the original set
  • Results become less interpretable
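A small sketch with scikit-learn and NumPy (assumed available): three predictor variables that are essentially one underlying variable plus noise collapse to a single component carrying almost all the variance:

```python
# PCA on 3-D data that is essentially 1-D: one component captures the variance
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
t = rng.normal(size=100)
X = np.column_stack([t, 2 * t, -t]) + rng.normal(scale=0.01, size=(100, 3))

pca = PCA(n_components=1)
Z = pca.fit_transform(X)  # one new predictor variable, a mix of the original three
```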

t-distributed Stochastic Neighbour Embedding (t-SNE)

  • Works on pairwise similarities between nearby points, rather than maximising variance as PCA does
  • A visual difference can be observed on the Swiss Roll dataset when it is processed with t-SNE versus PCA
  • First a point is chosen
  • A Gaussian distribution over distances from this point to the others is calculated
  • The width of this Gaussian varies with the perplexity, roughly the effective number of neighbours taken into account
  • In the low-dimensional map, the Gaussian is replaced with a Student-t distribution (one degree of freedom, i.e. a Cauchy distribution), which has a sharper peak and heavy tails
  • The heavy tails push dissimilar points far apart, so only highly similar points (high probability) end up close together in the lower dimension
  • Basically, it works like an M-Dimensional projection of an N-Dimensional dataset, where M < N. Often, M is chosen as 2 or 3 for visualisation purposes.
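The M-dimensional projection idea can be sketched with scikit-learn (assumed available), embedding 5-D points into 2-D for visualisation:

```python
# t-SNE projecting 5-D points down to 2-D for visualisation
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, size=(10, 5)),   # one tight group
               rng.normal(5, 0.1, size=(10, 5))])  # another, far away

emb = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(X)
```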

Singular Value Decomposition (SVD)

  • Decomposes a large matrix into smaller, calculable component matrices based on linear algebra
  • Uses properties of linear transformations to produce results efficiently
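The decomposition itself is a one-liner with NumPy (assumed available); the component matrices multiply back into the original:

```python
# SVD with NumPy: decompose A into U, singular values s, and V^T, then rebuild it
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_rebuilt = U @ np.diag(s) @ Vt  # A = U Sigma V^T
```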

5. Deep Learning

The subset of Machine Learning that deals with Neural Networks is known as deep learning.

It has taken the internet by storm, as it is widely used in digital marketing, application personalisation, recommendation systems and many other facets of the internet. Often, we see image perception using neural networks and real-time object labelling, both of which are products of deep learning.

Some common ML Algorithms which serve as deep learning models are:

  1. Multi-layer perceptron
  2. Convolutional Neural Networks (CNN)
  3. Recurrent Neural Networks (RNN)
  4. Boltzmann Machine
  5. Autoencoders
  6. Generative Adversarial Networks (GAN)

Conclusion

We have understood a few fundamental and differentiating features of a small set of machine learning models, as well as the algorithms which fall under those categories. Once I learn more about these models, I shall expand upon this in a follow-up article.

If you liked the article so far, you might want to follow me on Medium, here.

If you would like to get in touch, you may do so here.

