Gaussian/Normal Distribution and its PDF (Probability Density Function)

ashok .c
5 min read · Aug 28, 2021


This article explains the PDF (probability density function) and the CDF (cumulative distribution function) of the normal distribution.

BELL SHAPED CURVE

The normal distribution is one of the most popular distributions in statistics. Its graph is the bell-shaped curve shown above, and it appears in many disciplines such as economics, business, psychology, medicine, and of course statistics.

RANDOM VARIABLE:

There are many different distributions, such as the normal, binomial, Poisson, and exponential distributions. Each has its own use case, and they differ mainly in the random variable on which the distribution is based.

A random variable describes all possible outcomes of a statistical experiment. Random variables are split into two types, discrete and continuous, based on the possible outcomes of the experiment. A discrete random variable takes values that you can count, typically whole numbers, for example the number of books in a backpack. A continuous random variable can take any value within a range, such as the weight of a book or the heights of people in India.

HOW TO DEFINE RANDOM VARIABLES:

Defining a random variable clearly is very important. An upper case letter such as X denotes the random variable itself, and a lower case letter such as x denotes a particular value that it takes. In other words, X is described in words and x is given as a number. For example, let X = the number of heads you get when you toss a coin a hundred times. Or, if X is the height of a student in cm, then P(X = x) is the probability that the height equals x, where x can be 150, 160, or 170.
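To make the X versus x notation concrete, here is a minimal sketch that simulates the coin example. The helper name `count_heads`, the fair-coin assumption, and the seed are illustrative choices, not part of the original text. Each call produces one observed value x of the random variable X.

```python
import random

def count_heads(n_tosses, rng):
    """One observed value x of X = number of heads in n_tosses fair coin tosses."""
    # rng.random() < 0.5 is True (counted as 1) for a head, assuming a fair coin
    return sum(rng.random() < 0.5 for _ in range(n_tosses))

rng = random.Random(42)  # arbitrary fixed seed so the run is repeatable

# Five repetitions of the experiment give five values x of the same X
observed = [count_heads(100, rng) for _ in range(5)]
print(observed)
```

Every value printed is a whole number between 0 and 100, which is what makes X a discrete random variable.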

PROBABILITY DENSITY FUNCTION:

The probability density function (PDF) is sometimes also called the probability distribution function. A PDF can be thought of as a function that maps each value of the random variable to its relative frequency (its density). If we plot each value of the random variable against its relative frequency, the resulting plot forms a curve. The shape of this curve is characteristic of each distribution, such as the normal distribution or the exponential distribution. So each distribution curve has a function behind it, and that function is the PROBABILITY DENSITY FUNCTION. It is simply the equation of the distribution curve. If the random variable is continuous, the curve of its distribution is a probability density function. The exact shape of the PDF depends on the parameters of the distribution. The normal distribution has two parameters, the mean and the variance: the mean sets where the curve is centered and the variance sets how wide it is.

The PDF of a normal distribution is a mathematical function. Its formula looks daunting, but it is simpler than it seems.

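PDF of normal distribution: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, where $\mu$ is the mean and $\sigma^2$ is the variance.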

In the function above, only the variable x (i.e. the possible values of the random variable) changes; the mean and variance are constants for a given curve. If we drop the constants from the PDF, what remains is essentially f(x) = e^(-x²). So the PDF is nothing but a mathematical function that describes the distribution curve.
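The normal PDF can be written as a few lines of Python. This is a minimal sketch using only the standard library; the function name `normal_pdf` and the example parameters (mean 170 cm, standard deviation 10 cm for student heights) are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))   # constant for a given curve
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)   # the only part that depends on x
    return coeff * math.exp(exponent)

# Hypothetical heights in cm, with an assumed mean of 170 and standard deviation of 10
for x in (150, 160, 170, 180, 190):
    print(x, normal_pdf(x, mu=170, sigma=10))

# A larger standard deviation gives a lower, wider curve with the same center
print(normal_pdf(170, mu=170, sigma=10), normal_pdf(170, mu=170, sigma=20))
```

The density is highest at the mean and symmetric around it, which is where the bell shape comes from.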

The probability density function (PDF) describes probabilities for continuous random variables, while for discrete random variables the PROBABILITY MASS FUNCTION (PMF) is used. The binomial distribution uses a PMF to give the probability of a single discrete value. The binomial distribution and the normal distribution can have similarly shaped curves, especially when the number of trials is large, but the key difference is that the binomial distribution is for discrete values and the normal distribution is for continuous values.
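The similarity between the two curves can be checked numerically. The sketch below reuses the coin example (100 tosses of a fair coin) and compares the binomial PMF with a normal PDF that has the same mean and variance. The function names and the choice of n = 100, p = 0.5 are assumptions for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

n, p = 100, 0.5                      # 100 tosses of a fair coin
mu = n * p                           # mean of the binomial distribution
sigma = math.sqrt(n * p * (1 - p))   # standard deviation of the binomial distribution

for k in (40, 45, 50, 55, 60):
    print(k, binomial_pmf(k, n, p), normal_pdf(k, mu, sigma))
```

The two columns are close, but the PMF values are actual probabilities of single outcomes, while the PDF values are densities.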

The PDF of a normal distribution is used to calculate the probability for a range of values, not for a single value, because the probability of any single value of a continuous random variable is always zero, i.e. P(X = x) = 0. The value of the PDF at a given point is the probability density (the height on the y-axis), not the probability at that point. The reason is that a continuous distribution has infinitely many possible values in any interval, so the probability of hitting any one exact value is zero. That is why we can only assign a probability to an interval and not to a single value.
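One way to see why P(X = x) is zero is to shrink an interval around x and watch its probability vanish. The sketch below assumes heights with mean 170 and standard deviation 10 (illustrative values) and uses the standard closed form of the normal CDF in terms of `math.erf`; the helper names are hypothetical.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_probability(center, width, mu, sigma):
    """Probability that X falls in an interval of the given width around center."""
    half = width / 2.0
    return normal_cdf(center + half, mu, sigma) - normal_cdf(center - half, mu, sigma)

mu, sigma = 170, 10  # assumed mean and standard deviation of heights in cm
for width in (10, 1, 0.1, 0.001, 0.0):
    print(width, interval_probability(170, width, mu, sigma))
```

As the width goes to zero the probability goes to zero, even though the density at 170 stays the same positive number.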

CUMULATIVE DISTRIBUTION FUNCTION:

The probability for a range in a normal distribution is calculated by measuring the area under the curve over that range. With a PDF, probability is represented by area under the curve: the area under the density curve between two points is the probability that the variable falls between those two values. The function that gives the accumulated area under the curve up to a value x, i.e. P(X ≤ x), is called the cumulative distribution function (CDF).

The shaded area represents CDF

Normally, the area under a curve is calculated by integration, that is, by finding the antiderivative. To calculate the area, we need the equation of the curve (i.e. the PDF) and the limits between which the area is measured. Mathematically, the cumulative distribution function is the integral of the PDF, so the probability between two values of a continuous random variable is the integral of the PDF between those two values, i.e. the area under the curve between them.

Mathematically, in a continuous distribution (e.g. continuous uniform, normal, and others), the probability is calculated by integrating the probability density function f(x) over the limits a ≤ x ≤ b.

Integration of PDF: $P(a \le X \le b) = \int_a^b f(x)\,dx$

Here f(x) is the PDF of the distribution, and a and b are the endpoints of the interval over which we calculate the area under the curve. Remember that the area under the PDF over all possible values of the random variable is one, i.e. the total area under the curve is one.
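The area under the curve can be approximated numerically and compared with the CDF. The sketch below uses a simple trapezoidal rule, which is just one of several ways to integrate, and assumes a normal distribution with mean 170 and standard deviation 10. All names and parameter values are illustrative.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def integrate(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

mu, sigma = 170, 10  # assumed mean and standard deviation

def pdf(x):
    return normal_pdf(x, mu, sigma)

a, b = 160, 180
area = integrate(pdf, a, b)                                   # integral of the PDF from a to b
from_cdf = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
total_area = integrate(pdf, mu - 10 * sigma, mu + 10 * sigma)  # wide enough to cover nearly all the mass

print(area, from_cdf, total_area)
```

The first two numbers agree (about 0.68, the familiar share within one standard deviation of the mean), and the total area comes out as one up to numerical error.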

OK then, can we calculate the PDF from the CDF? Yes, by differentiating the CDF. The Fundamental Theorem of Calculus implies that the PDF of a continuous random variable can be found by differentiating its CDF. This relationship between the PDF and the CDF of a continuous random variable is incredibly useful.
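This relationship can be checked numerically: the slope of the CDF at a point should match the PDF at that point. The sketch below uses a central finite difference on the standard normal distribution (mean 0, standard deviation 1). The step size `h` and the helper names are assumptions chosen for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def derivative(f, x, h=1e-5):
    """Central finite-difference approximation of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (-2.0, 0.0, 1.5):
    print(x, derivative(normal_cdf, x), normal_pdf(x))
```

For each x the numerical slope of the CDF and the value of the PDF agree to many decimal places.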
