Mastering Probability Distributions: Understanding PMF, PDF, CDF, and PPF in Just 10 Minutes

Priya
9 min read · Feb 23, 2024


This journey isn’t just for seasoned statisticians or data wizards — it’s for anyone with a curious mind and a thirst for knowledge.

Introduction

Random Variables

Binomial Distributions

Poisson Distributions

Normal Distributions

Uniform Distributions — An Overview

Cumulative Distribution Function (CDF)

Percentage Point Function (PPF)

Introduction

Before we dive into the main content of this blog, let’s take a moment to go over some common terms that we’ll be using frequently. Understanding these terms will help you follow along smoothly.

What are Random Variables?

A random variable is a mathematical representation of the numerical outcomes resulting from a random process or experiment. It represents the variability found in random events like rolling dice or measuring traits of a group.

Let’s consider the example of rolling a fair six-sided die.

In this scenario:

  • The random variable X represents the outcome of rolling the die.
  • X can take on values from 1 to 6, inclusive, since those are the possible outcomes when rolling a standard six-sided die.
  • Each possible outcome (1, 2, 3, 4, 5, or 6) has an equal probability of occurring, assuming the die is fair.

So, if we were to define X as a random variable representing the outcome of rolling a die, we could have:

X = the number rolled on the die (an integer from 1 to 6)

Each of these values represents a possible outcome of the random process (rolling the die), making X a random variable in this context.

Random variables can be discrete or continuous

Discrete random variables take a countable number of distinct values, like the number of heads in a series of coin flips. Continuous random variables take any value within an interval, such as the height of individuals in a population. Understanding random variables is essential in probability theory and statistics for analysing and modelling uncertainty.

Probability Distributions

Probability distributions, on the other hand, describe the probabilities associated with the possible outcomes of a random variable. They specify how likely each outcome of the random variable is to occur. In essence, a probability distribution assigns probabilities to each possible value that a random variable can take.

  • The probability distribution for the random variable X specifies the probabilities associated with each possible outcome. Since the die is fair, each outcome (1, 2, 3, 4, 5, or 6) has an equal probability of 1/6​. So, the probability distribution might look like this:
P(X =1) = 1/6 
...
P(X =5) = 1/6
P(X =6) = 1/6
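As a quick sketch, this fair-die distribution can be expressed with SciPy's discrete uniform distribution, scipy.stats.randint (assuming SciPy is available; note that its upper bound is exclusive):

```python
from scipy.stats import randint

# Discrete uniform distribution over {1, ..., 6}; the upper bound is exclusive
die = randint(1, 7)

# Every face has the same probability, 1/6
for face in range(1, 7):
    print(f"P(X = {face}) = {die.pmf(face):.4f}")  # 0.1667 for each face
```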

Common distributions

Probability distributions can be of two types: discrete and continuous.

Discrete Probability Distributions:

  • Have an associated Probability Mass Function (PMF), which gives the probability that the random variable takes a particular value.

Continuous Probability Distributions:

  • Have an associated Probability Density Function (PDF), which helps determine the probability that the random variable lies between two given numbers.

Let’s discuss some of the commonly occurring distributions around us:

  1. Binomial Distribution

A binomial distribution is a discrete probability distribution that describes the number of successes in a fixed number of independent Bernoulli trials, where each trial has the same probability of success, denoted by p. The distribution is characterised by two parameters: n, the number of trials, and p, the probability of success in each trial. Its probability mass function (PMF) is:

P(X = k) = (nCk) · p^k · (1 − p)^(n − k)

where:

  • P(X=k) is the probability of having k successes in n trials.
  • (nCk) is the binomial coefficient, also known as “n choose k”, which represents the number of ways to choose k successes from n trials.
  • p is the probability of success in each trial.
  • 1−p is the probability of failure in each trial.
  • n is the total number of trials.
  • k is the number of successes.

Binomial distributions are commonly used to model situations such as coin flips, where there are two possible outcomes (success or failure) and each trial is independent of the others. They have applications in various fields, including statistics, biology, economics, and more.
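As a sketch of how this plays out in code (using scipy.stats.binom, which this post introduces later), here is the probability of exactly 5 heads in 10 fair coin flips:

```python
from scipy.stats import binom

n, p = 10, 0.5   # 10 independent flips, probability of heads = 0.5
k = 5            # exactly 5 successes

# PMF: P(X = k) = (n choose k) * p^k * (1 - p)^(n - k)
prob = binom.pmf(k, n, p)
print(f"P(X = {k}) = {prob:.4f}")  # ≈ 0.2461
```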

2. Poisson’s Distribution

The Poisson distribution is a discrete probability distribution that represents the number of events occurring in a fixed interval of time or space, given that these events occur with a constant rate and independently of the time since the last event. It is named after the French mathematician Siméon Denis Poisson.

The probability mass function (PMF) of a Poisson distribution is given by:

P(X = k) = (λ^k · e^(−λ)) / k!

  • P(X = k) is the probability of observing exactly k events in the interval.
  • λ (lambda) is the average number of events in the interval.
  • e is Euler’s number (approximately 2.71828).
  • k is the number of events (k = 0, 1, 2, …).

Let’s look at an example.

Imagine a bakery that, on average, receives 4 customers every hour. We want to understand the probability of different numbers of customers arriving in the next hour.

In this scenario:

  • The number of customers arriving in each hour follows a Poisson distribution.
  • The average rate of customer arrivals, denoted as λ, is 4 customers per hour.

Here’s how we can use the Poisson distribution:

  1. Probability Mass Function (PMF): The PMF of the Poisson distribution gives the probability of observing a specific number of events (in this case, customers) in a fixed interval (in this case, an hour).
  2. Example Calculation: Suppose we want to find the probability of exactly 3 customers arriving in the next hour.
  • X is the random variable representing the number of customers arriving.
  • k is the number of customers we’re interested in (in this case, 3).
  • λ is the average rate of customer arrivals (in this case, 4).

So, the probability of exactly 3 customers arriving in the next hour is approximately 0.1954, or 19.54%.
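A minimal sketch of this calculation with scipy.stats.poisson (assuming SciPy is available):

```python
from scipy.stats import poisson

lam = 4   # average rate: 4 customers per hour
k = 3     # number of arrivals we are interested in

# PMF: P(X = k) = λ^k * e^(-λ) / k!
prob = poisson.pmf(k, lam)
print(f"P(X = {k}) = {prob:.4f}")  # ≈ 0.1954
```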

3. Normal Distribution

The normal distribution, also known as the Gaussian distribution, is a continuous probability distribution that is symmetric about its mean, with the shape of a bell curve.

  • In a normal distribution, the mean, median, and mode are always equal
  • The total area under the curve is always 1
  • The distribution is symmetric around the mean

Probability Density Function (PDF)

Relationship with Probability: Probability Density Function (PDF) provides the density of probability at each point within the distribution. However, it does not directly provide the probability of a specific outcome occurring. Instead, to find the probability of observing a value within a certain range, you need to integrate the PDF over that range.

Empirical Rule: In many cases, the distribution of data follows a normal distribution. According to the empirical rule:

  • Approximately 68% of the data falls within one standard deviation of the mean.
  • Approximately 95% of the data falls within two standard deviations of the mean.
  • Approximately 99.7% of the data falls within three standard deviations of the mean.

For a standard normal distribution (with mean μ = 0 and standard deviation σ = 1), the PDF is given by the Gaussian function:

f(z) = (1 / √(2π)) · e^(−z² / 2)

Let’s consider an example using IQ scores, which are often assumed to follow a normal distribution with a mean of 100 and a standard deviation of 15. Using the empirical rule, we can make some observations about the distribution of IQ scores:

Let’s say we have a population of 1000 individuals. According to the empirical rule:

  • About 680 individuals will have IQ scores between 85 and 115.
  • About 950 individuals will have IQ scores between 70 and 130.
  • About 997 individuals will have IQ scores between 55 and 145.
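These counts can be checked against the normal CDF; a minimal sketch with scipy.stats.norm, using the IQ parameters from the example above:

```python
from scipy.stats import norm

mu, sigma = 100, 15  # IQ scores: mean 100, standard deviation 15

# Probability of falling within k standard deviations of the mean
for k in (1, 2, 3):
    p = norm.cdf(mu + k * sigma, mu, sigma) - norm.cdf(mu - k * sigma, mu, sigma)
    print(f"Within {k} SD: {p:.4f}")  # ≈ 0.6827, 0.9545, 0.9973
```

Multiplying these probabilities by 1,000 recovers roughly the 680, 950, and 997 individuals quoted above.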

Standard Normal Distribution

We also have the standard normal distribution, denoted by z. We can convert any normal distribution to the standard normal distribution by subtracting the mean and dividing by the standard deviation: z = (x − μ) / σ.

When dealing with data that varies across different scales, normalization allows us to standardise the values, making them comparable and facilitating analysis. Let’s say we have two variables: one representing income in dollars and another representing age in years. These variables have different scales, making direct comparison difficult. By normalising the data, we can transform both variables onto a standardised scale, such as z-scores, allowing us to compare them more easily for analysis.
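A minimal sketch of this normalisation with NumPy (the income and age values below are made up for illustration):

```python
import numpy as np

incomes = np.array([30000.0, 45000.0, 60000.0, 120000.0])  # dollars (made-up)
ages = np.array([25.0, 32.0, 47.0, 58.0])                  # years (made-up)

def z_scores(x):
    # z = (x - mean) / standard deviation
    return (x - x.mean()) / x.std()

# After normalisation both variables are unitless, with mean 0 and std 1,
# so they can be compared directly despite their different original scales
print(z_scores(incomes))
print(z_scores(ages))
```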

4. Uniform Distribution

The uniform distribution is a probability distribution in which all outcomes are equally likely within a given range. In other words, each value in the range has an equal probability of occurring.

The uniform distribution is commonly used in various applications, such as modelling random processes where each outcome is equally likely, generating random numbers within a specified range, and in certain sampling techniques.

Imagine you have a box containing slips of paper numbered from 1 to 10. You randomly draw one slip from the box without looking. Since each slip has an equal chance of being drawn, this situation follows a uniform distribution.
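A minimal sketch of this slips-of-paper example, again using scipy.stats.randint for a discrete uniform distribution:

```python
from scipy.stats import randint

# Discrete uniform over slips numbered 1..10 (upper bound exclusive)
slips = randint(1, 11)

print(slips.pmf(7))   # every slip has probability 0.1
print(slips.cdf(3))   # P(slip <= 3) = 0.3
```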

Cumulative Distribution Function (CDF)

The Cumulative Distribution Function (CDF) of a probability distribution provides the probability that a random variable is less than or equal to a specified point.

When utilising the CDF, note that it calculates the probability of the random variable being less than or equal to the specified value; in libraries such as SciPy, the value passed to cdf() is inclusive. If you instead want the probability of a value exceeding the specified threshold, you can readily obtain it by subtracting the CDF value from 1.
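A minimal sketch with scipy.stats.norm, showing both the inclusive CDF and the "greater than" complement:

```python
from scipy.stats import norm

# P(Z <= 1) for a standard normal variable
p_le = norm.cdf(1.0)
print(f"P(Z <= 1) = {p_le:.4f}")      # ≈ 0.8413

# P(Z > 1): subtract the CDF from 1 (also available as the survival function)
print(f"P(Z > 1)  = {1 - p_le:.4f}")  # ≈ 0.1587
print(f"sf check  = {norm.sf(1.0):.4f}")  # same value
```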

Percentage Point Function (PPF)

The Percentage Point Function (PPF), also known as the inverse cumulative distribution function (CDF), is the mathematical function that provides the value for which a given percentage of observations fall below. In simpler terms, it gives the cutoff value for a specified percentile or probability.

For example, if you have a normal distribution and you want to find the value below which 95% of the data falls, you would use the PPF to obtain this value.

The PPF is the inverse of the CDF. While the CDF gives the probability that a random variable is less than or equal to a specified point, the PPF gives the point at or below which a specified proportion of the probability lies.

We can use scipy.stats for all these functions. scipy.stats is a powerful module in Python's SciPy library that provides a wide range of probability distributions along with functions to calculate their probability density function (PDF), cumulative distribution function (CDF), probability mass function (PMF), and percentage point function (PPF).

Here’s a brief overview of the functions available in scipy.stats for various distributions:

  1. PDF (Probability Density Function): Use pdf() function.
  2. CDF (Cumulative Distribution Function): Use cdf() function.
  3. PMF (Probability Mass Function): Use pmf() function (for discrete distributions).
  4. PPF (Percentage Point Function, also known as the inverse CDF): Use ppf() function.

You can use these functions to work with a variety of probability distributions such as normal, binomial, uniform, exponential, and many others.
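A compact sketch exercising all four functions on concrete distributions:

```python
from scipy.stats import binom, norm

# PDF: density of a standard normal at its mean
print(norm.pdf(0))           # ≈ 0.3989, i.e. 1 / sqrt(2π)

# CDF: P(Z <= 1.96) for a standard normal
print(norm.cdf(1.96))        # ≈ 0.9750

# PPF: the inverse CDF, here the 97.5th percentile point
print(norm.ppf(0.975))       # ≈ 1.96

# PMF: for discrete distributions, e.g. P(X = 2) with Binomial(n=4, p=0.5)
print(binom.pmf(2, 4, 0.5))  # 0.375
```

Note how cdf() and ppf() undo each other: feeding the CDF's output back into the PPF returns the original point.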

Thanks for reading ♡
