In the previous post, we discussed how to sample from a gaussian distribution using the box-muller method. In this post, we will explore rejection sampling, a method of sampling from a generic distribution when inversion sampling is impractical.

Consider the distribution below,

Image for post
Image for post
The distribution we wish to sample from

Rejection sampling involves the sampling from a proposal distribution q(x). The choice of proposal distribution should be such that there is an efficient algorithm to sample from it and it should be close to our target distribution, f(x). In particular, we want the ratio f(x)/g(x) to be upper bounded by some constant c >1. For such a c, we have the constraint 0< f(x)/c*g(x) < 1.

Given two functions g1 and g2 and their corresponding constants c1 and c2, we say g1 is closer to f than g2 if c1 < c2. In practice, we want to choose a g such that c is close to 1.

For the distribution shown above, one “good” proposal distribution is shown below:

Image for post
Image for post
The proposal distribution of choice is a Gaussian(shown in red)

Algorithm

Given the proposal(g) and the target distribution(f), the algorithm for rejection sampling is :

Image for post
Image for post

Intuitively, the accepts points lie in the shaded region shown below:

Image for post
Image for post

Proof

Notations: We will use capital letters F and G for CDFs corresponding to f and g.

To show that rejection sampling does indeed return samples from the target distribution f, it is sufficient to show that the samples points x, have the same CDF as f. Mathematicall we want to show that:

Image for post
Image for post

where x is a sample and X and U are random variables.

We will use the Bayes rule to compute the expression on the write.

Image for post
Image for post

We will now compute each of the terms on the right-hand side of the equation.

Image for post
Image for post

The term in the denominator of Equation 1 corresponds to the probability of a sample being accepted. Given a sample x from g, we can write the probability of x being accepted as:

We can now marginalize the conditional distribution to obtain the required probability.

Image for post
Image for post

Finally, we evaluate the conditional probability in the numerator:

Image for post
Image for post

Combining all these terms together we have,

Image for post
Image for post

Hence, the samples generated by rejection sampling follow the distribution as the target distribution f.

In the next post, we will dive into importance sampling.

Reference

[1] http://www.columbia.edu/~ks20/4703-Sigman/4703-07-Notes-ARM.pdf

[2] Kevin Murphy, Machine Learning: A Probabilistic Perspective

Written by

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store