In this post, we will discuss how to sample data vectors from a distribution.
Consider the case of sampling data points from a normal distribution centred around 0 with a standard deviation of 1.
If you are familiar with libraries such as numpy, then you will recall that this can be trivially done by using functions like numpy.random.normal(). But for now, let us assume we do not have access to such a library function. How might one go about achieving this task?
The inversion sampling method is based on the following property:
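In its usual textbook form, the property can be written as below. The symbols are notation assumed here for illustration: F is the target CDF, taken to be continuous and strictly increasing so that its inverse exists, and U is a uniform random variable on (0, 1).

```latex
U \sim \mathrm{Uniform}(0, 1), \quad X = F^{-1}(U)
\;\Longrightarrow\;
P(X \le x) = F(x) \quad \text{for all } x
```

In words: pushing a uniform random number through the inverse CDF yields a random variable whose CDF is F.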
To prove this theorem, we start by writing down the CDF (cumulative distribution function) of the random variable obtained by applying the inverse function to a uniform random number.
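A sketch of that calculation, under the assumption that F is a continuous, strictly increasing CDF and U is uniform on (0, 1) (both symbols are notation chosen here):

```latex
\begin{aligned}
P\big(F^{-1}(U) \le x\big)
  &= P\big(U \le F(x)\big)
  && \text{(apply the increasing function } F \text{ to both sides)} \\
  &= F(x)
  && \text{(since } P(U \le u) = u \text{ for } u \in [0, 1]\text{)}
\end{aligned}
```

So the transformed variable has exactly the CDF F, which is what the property claims.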
Given the theorem, we now have an algorithm that generates samples from a given pdf (probability density function) f: draw a uniform random number u in (0, 1) and return the inverse CDF of f evaluated at u.
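The algorithm can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than a reference implementation: the helper names `sample_by_inversion` and `exponential_inverse_cdf` are hypothetical, the exponential distribution is used because its inverse CDF has a closed form, and the standard normal case borrows `statistics.NormalDist.inv_cdf` (Python 3.8+) in place of a hand-derived inverse.

```python
import math
import random
from statistics import NormalDist


def sample_by_inversion(inverse_cdf, n, rng=None):
    """Draw n samples by applying inverse_cdf to uniform random numbers."""
    rng = rng or random.Random()
    samples = []
    while len(samples) < n:
        u = rng.random()  # uniform on [0, 1)
        if u == 0.0:
            continue  # skip the endpoint, where many inverse CDFs are undefined
        samples.append(inverse_cdf(u))
    return samples


def exponential_inverse_cdf(u, rate=1.0):
    # F(x) = 1 - exp(-rate * x)  =>  F^{-1}(u) = -ln(1 - u) / rate
    return -math.log(1.0 - u) / rate


rng = random.Random(0)  # fixed seed so the run is repeatable
exp_samples = sample_by_inversion(exponential_inverse_cdf, 10_000, rng)
normal_samples = sample_by_inversion(NormalDist(0.0, 1.0).inv_cdf, 10_000, rng)
```

With enough samples, the mean of `exp_samples` should be close to 1 and `normal_samples` should have mean close to 0 and standard deviation close to 1. The only distribution-specific piece is the inverse CDF passed in.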
In the next post, we will look into some approximate sampling methods that do not require computing the inverse CDF.