Probability Distributions and their use in Machine Learning

Jul 29, 2021

Probability is a branch of mathematics that plays a vital role in data science and machine learning. Let’s look at the kinds of data machine learning works with and some commonly used probability distributions.

Different Types Of Data

Variables fall into two broad categories-

Categorical variables- Qualitative data, broadly classified into nominal and ordinal variables.
Numerical variables- Quantitative data, divided into discrete and continuous variables.

Here we will focus on discrete and continuous variables. Discrete data can take only specific, countable values (e.g. number of visits to the dentist), while continuous data can take any real value within a range (e.g. height and weight).

In this article, we will see some of the commonly used distributions in data science.

Discrete probability distributions are used in modeling binary and multi-class classification problems and in natural language processing.

Binomial Distribution- The binomial distribution is a discrete probability distribution that gives the probability of x successes in n trials. Criteria for the binomial distribution-

The number of trials is fixed and denoted by n, and each trial results in either a “success” or a “failure”.
Trials are identical (same probability of success) and independent (the outcome of one trial does not affect any other trial).
x denotes the number of successes.
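
The criteria above lead to the standard binomial formula, P(X = x) = C(n, x) · p^x · (1-p)^(n-x). Below is a minimal Python sketch of it. The name p for the per-trial success probability and the coin-flip numbers are assumptions made for illustration; they do not come from the article.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent, identical trials.

    p is the probability of success in a single trial (an assumed name).
    """
    if not 0 <= x <= n:
        return 0.0
    # C(n, x) counts the ways to place x successes among n trials.
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical example: 10 fair coin flips, exactly 3 heads.
prob_three_heads = binomial_pmf(3, 10, 0.5)

# The probabilities over all possible success counts add up to 1.
total_probability = sum(binomial_pmf(x, 10, 0.5) for x in range(11))
```

Because the trials are independent, the probability of any one sequence with x successes is p^x · (1-p)^(n-x), and comb(n, x) accounts for all the orderings of such sequences.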

Use in Machine Learning-

The binomial distribution is commonly used in quality control and quality assurance.
Industries use it for defect analysis, for example to estimate how many defective items to expect in a batch.

Poisson Distribution- This distribution is used to estimate how many times an event is likely to occur within a fixed period of time.

The formula for the Poisson distribution is-

P(X = x) = (k^x · e^(-k)) / x!

Here:

e is Euler’s number (e = 2.718...)
x is the number of occurrences, x = 0, 1, 2, ...
k is the expected value of x, where k > 0
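
Using the symbols defined above, the Poisson probability of seeing exactly x occurrences is k^x · e^(-k) / x!. The snippet below is a small sketch of that calculation in Python. The rate of 2 events per interval is a made-up value for illustration only.

```python
from math import exp, factorial


def poisson_pmf(x: int, k: float) -> float:
    """Probability of exactly x occurrences when k occurrences are expected."""
    if x < 0 or k <= 0:
        raise ValueError("x must be >= 0 and k must be > 0")
    return k**x * exp(-k) / factorial(x)


# Hypothetical example: on average 2 events per interval (k = 2).
prob_no_events = poisson_pmf(0, 2.0)   # equals e^(-2)
prob_two_events = poisson_pmf(2, 2.0)  # equals 2 * e^(-2)

# Summing over enough values of x brings the total very close to 1.
total_probability = sum(poisson_pmf(x, 2.0) for x in range(50))
```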

Properties of Poisson Distribution-

The events are independent of each other.
An event can occur any number of times (within the defined time period).
Two events can’t take place simultaneously.
The average rate of events occurring is constant.

Use in Machine Learning-

The Poisson distribution is also used in quality control to model the number of defects per standard unit.
It is used to predict how many times an event can occur in a specific interval of time.
Insurance companies often use it to conduct risk analysis and learn patterns to decide insurance pricing.
The Poisson distribution suits datasets of event counts where events occur at a known average rate. The time between events in such a process is modeled by the related exponential distribution.

Continuous probability distributions play an important role in machine learning from the distribution of training datasets to the models and the distribution of models' errors.

Normal (Gaussian) Distribution- It is one of the most widely used distributions in machine learning. It is a continuous distribution with a bell-shaped curve that is symmetric about the mean.

Properties of Normal Distribution-

The mean, mode, and median are all equal.
The curve is symmetric about the center (i.e. around the mean, μ).
Exactly half of the values lie to the left of the center and exactly half lie to the right.
The total area under the curve is 1.
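
These properties can be checked numerically. The sketch below uses NormalDist from Python's standard statistics module (Python 3.8+). The mean of 170 and standard deviation of 10 are hypothetical values chosen for illustration, loosely in the spirit of the height example earlier.

```python
from statistics import NormalDist

# Hypothetical example: heights in cm with mean 170 and standard deviation 10.
heights = NormalDist(mu=170.0, sigma=10.0)

# Half of the probability lies to the left of the mean.
left_half = heights.cdf(170.0)

# Symmetry: the density is equal at the same distance on either side of the mean.
density_below = heights.pdf(160.0)
density_above = heights.pdf(180.0)

# The median (50th percentile) coincides with the mean.
median = heights.inv_cdf(0.5)

# Approximate the total area under the curve with a simple Riemann sum
# over the range mean +/- 6 standard deviations (110 to 230).
step = 0.01
total_area = sum(heights.pdf(110.0 + i * step) * step for i in range(12_000))
```

The area is only approximated here because the curve extends infinitely in both directions; the part beyond six standard deviations is negligibly small.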

Use in Machine Learning-

Some machine learning models assume, or work best on, normally distributed data. These include the Gaussian Naive Bayes classifier, Linear Discriminant Analysis, Quadratic Discriminant Analysis, and least squares based regression models.
Normally distributed data also suit a variety of methods, such as propagation of uncertainty and least squares parameter fitting.

Uniform Distribution- A continuous uniform distribution (also known as a rectangular distribution) is a statistical distribution with an infinite number of equally likely measurable values.

Properties of Uniform Distribution-

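All outcomes of the event have equal probability.
Because every outcome is equally likely, the mean, (a+b)/2, is simply the midpoint of the range and does not indicate a most likely value.
It has no predictive power: knowing the distribution does not favor any one outcome over another.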

The uniform distribution is a probability distribution in which the probability density of x is constant. The formula for the uniform probability density is f(x) = 1/(b-a) for a ≤ x ≤ b, where [a, b] is the range of the distribution.
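
As a rough illustration of the formula f(x) = 1/(b-a), the sketch below evaluates the density and draws samples with Python's standard random module. The range [2, 6], the seed, and the sample size are arbitrary values assumed for this example.

```python
import random


def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of the continuous uniform distribution on [a, b]."""
    if b <= a:
        raise ValueError("b must be greater than a")
    # The density is constant inside the range and zero outside it.
    return 1.0 / (b - a) if a <= x <= b else 0.0


# Hypothetical range [2, 6]: the density is 1 / (6 - 2) = 0.25 everywhere inside.
density_inside = uniform_pdf(3.0, 2.0, 6.0)
density_outside = uniform_pdf(7.0, 2.0, 6.0)

# Draw samples with a seeded generator so the run is repeatable.
rng = random.Random(0)
samples = [rng.uniform(2.0, 6.0) for _ in range(10_000)]

# The sample mean should land near the midpoint of the range, (2 + 6) / 2 = 4.
sample_mean = sum(samples) / len(samples)
```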

Use in Machine Learning- This distribution models an idealized random number generator.

Written by SHREYA GARG CO18554