Statistics Part 2 — Distributions: What’s behind the curves?

Nawin Raj Kumar S
Published in kgxperience
Jun 12, 2023
The Normal Distribution Graph

Hi all 👋,

This is the second part (as the title suggests) of the Statistics Blog; if you haven’t read the first part, check here. This blog discusses the weird graph we have all seen during our school days 👻. Many of us wouldn’t know what it actually represents. Well, we will figure this out. This graph, the bell-shaped curve, is called the Gaussian Distribution, which is just another name for the Normal Distribution. But before going to the Gaussian Distribution, we’ll look at what distributions are.

What is a distribution?

Distributions are used in statistics to describe and analyze the variability and patterns within data. A distribution represents the way in which values are spread out or clustered together in a dataset. By understanding the distribution of data, statisticians can make inferences, draw conclusions, and make predictions about the underlying population or phenomena being studied.

A distribution provides a mathematical function, called the distribution function, which returns the probability or likelihood of an individual observation from the sample space. It can also be used to describe the grouping or density of observations. Depending on the function, the value it returns is either the likelihood of one particular observation or the probability that an observation is equal to or less than a given value.

Credits : datasciencedojo.com

Much real-world data conforms to well-known and well-understood mathematical functions, such as the Gaussian distribution. Such a function can be fitted to the data by adjusting its parameters, such as the mean and standard deviation in the case of the Gaussian.
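Here is a minimal sketch of what "fitting" a Gaussian can look like in Python, using only the standard library. The height data is simulated and purely hypothetical; the idea is simply that estimating the mean and standard deviation from the data gives us a Gaussian that describes it.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical data: 1,000 simulated heights (cm), not real measurements
heights = [random.gauss(170, 8) for _ in range(1000)]

# "Fitting" a Gaussian here means estimating its two parameters from the data
mu = mean(heights)
sigma = stdev(heights)

# A Gaussian with those parameters now stands in for the whole dataset
fitted = NormalDist(mu, sigma)
print(f"fitted mean = {fitted.mean:.2f}, fitted std dev = {fitted.stdev:.2f}")
```

The two printed numbers should land close to the 170 and 8 used to simulate the data, which is the whole point: two parameters are enough to summarise a thousand observations.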

Once a distribution function is known, it can be used as a shorthand for describing and calculating related quantities, such as likelihoods of observations, and plotting the relationship between observations in the domain.

Distribution functions are of two types:

  1. Probability Density Function (PDF): gives the relative likelihood (density) of observing a given value.
  2. Cumulative Distribution Function (CDF): gives the probability of an observation equal to or less than a given value.
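To see the two functions side by side, here is a small sketch using `NormalDist` from Python's built-in `statistics` module. The standard normal (mean 0, standard deviation 1) is chosen only as a convenient example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
z = NormalDist(mu=0, sigma=1)

# PDF: height of the bell curve at x = 0 (a density, not a probability)
density_at_0 = z.pdf(0)

# CDF: probability that an observation is less than or equal to 0
prob_up_to_0 = z.cdf(0)

print(f"pdf(0) = {density_at_0:.4f}")
print(f"cdf(0) = {prob_up_to_0:.4f}")
```

The PDF value is the peak height of the curve (about 0.3989), while the CDF value is 0.5 because exactly half of a symmetric bell curve lies to the left of its centre.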

Probability Density Functions

To determine the likelihood of an individual observation in the distribution, we use the Probability Density Function (PDF).

In statistics, a Probability Density Function (PDF) is a special function that tells us the chances of different values happening for a random thing we’re interested in. It’s like a magic formula that helps us figure out the probabilities associated with different outcomes.

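For a continuous random variable X with PDF f(x), the probability that X falls between a and b is the area under the curve between those two points:

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$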

Credits : BYJU’S

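The PDF is like a bridge between the values of the random thing and their probabilities. By plugging a value into the function, we can find out how likely that value is relative to the others. (For a continuous variable, the number we get back is a density rather than a probability, so actual probabilities come from the area under the curve over a range of values.) It gives us a way to understand the likelihoods of different possibilities.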

So, think of the PDF as a handy tool that helps us calculate and understand the probabilities of different outcomes for a random variable we’re studying. It’s like having a secret code to unlock the chances of things happening!
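Since the probability of landing in a range is the area under the PDF over that range, we can check the idea numerically. The sketch below is only an illustration: `area_under_pdf` is a hypothetical helper written for this post (a simple midpoint-rule sum), and the standard normal is again just an example.

```python
from statistics import NormalDist

z = NormalDist(0, 1)  # standard normal, used only as an example


def area_under_pdf(dist, a, b, steps=10_000):
    """Approximate the area under dist.pdf between a and b (midpoint rule)."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width


# Probability of falling within one standard deviation of the mean
area = area_under_pdf(z, -1, 1)

# The same probability taken from the CDF, for comparison
exact = z.cdf(1) - z.cdf(-1)

# A single point has zero width, so its probability is zero
single_point = area_under_pdf(z, 0, 0)

print(f"area under PDF from -1 to 1: {area:.4f}")
print(f"cdf(1) - cdf(-1):            {exact:.4f}")
print(f"probability of exactly 0:    {single_point}")
```

Both approaches give roughly 0.68, the familiar "about 68% of values lie within one standard deviation" rule. The last line shows why we talk about ranges for continuous variables: a single exact value carries no area.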

Cumulative Distribution Function (CDF)

The Cumulative Distribution Function (CDF) is like a cool tool that helps us understand the likelihood of different values in a distribution. It’s a bit different from the probability density function (PDF). Instead of telling us the likelihood of one specific value, the CDF gives us the cumulative likelihood up to that value. It’s like adding up the probabilities as we go along.

CDF for the Normal Distribution

Imagine plotting the CDF as a curve that starts at 0 and ends at 1. It shows us how much of the distribution is “covered” or accounted for by each value. So, as we move along the curve, we can see how much of the distribution lies before and after a particular value.

This helps us get a quick sense of how likely certain values are in relation to the whole distribution. It’s handy for comparing values, understanding their significance, and making comments about their position in the distribution.

So, next time you see a CDF plot, remember that it’s like a map of the distribution showing us the cumulative likelihood and helping us grasp where different values stand.
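If you would like to see that "map" in numbers, here is a short sketch that walks along the CDF of a standard normal (chosen only as an example) and prints how much of the distribution lies below and above each point.

```python
from statistics import NormalDist

z = NormalDist(0, 1)  # standard normal, used only as an example

# Walk along the x-axis and record the cumulative probability at each point
points = [-3, -2, -1, 0, 1, 2, 3]
cdf_values = [z.cdf(x) for x in points]

for x, below in zip(points, cdf_values):
    print(f"x = {x:+d}: {below:6.1%} below, {1 - below:6.1%} above")

# Going the other way: which value has exactly half the distribution below it?
median = z.inv_cdf(0.5)
print(f"median = {median:.4f}")
```

The values only ever increase, starting near 0 on the far left and approaching 1 on the far right. `inv_cdf` reads the same curve in reverse, turning a cumulative probability back into a value.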

In this blog, we have seen what a distribution is and how distribution functions work. In the upcoming blogs, we’ll see the different types of distributions. Hope it was useful. Thank you 😄
