# Rejection Sampling

In the previous post, we discussed how to sample from a gaussian distribution using the box-muller method. In this post, we will explore rejection sampling, a method of sampling from a generic distribution when inversion sampling is impractical.

Consider the distribution below,

Rejection sampling involves the sampling from a proposal distribution *q(x)*. The choice of proposal distribution should be such that there is an efficient algorithm to sample from it and it should be close to our target distribution, *f(x)*. In particular, we want the ratio *f(x)/g(x)* to be upper bounded by some constant *c* >1. For such a *c*, we have the constraint 0< *f(x)/c*g(x)* < 1.

Given two functions *g1* and *g2* and their corresponding constants* c1* and *c2*, we say *g1* is closer to *f* than *g2* if *c1 < c2*. In practice, we want to choose a *g* such that *c* is close to 1.

For the distribution shown above, one “good” proposal distribution is shown below:

## Algorithm

Given the proposal(g) and the target distribution(f), the algorithm for rejection sampling is :

Intuitively, the accepts points lie in the shaded region shown below:

## Proof

*Notations: We will use capital letters F and G for CDFs corresponding to f and g.*

To show that rejection sampling does indeed return samples from the target distribution *f, *it is sufficient to show that the samples points *x, *have the same CDF as *f. *Mathematicall we want to show that:

where *x *is a sample and **X** and **U** are random variables.

We will use the Bayes rule to compute the expression on the write.

We will now compute each of the terms on the right-hand side of the equation.

The term in the denominator of Equation 1 corresponds to the probability of a sample being accepted. Given a sample *x* from *g, *we can write the probability of *x *being accepted as:

We can now marginalize the conditional distribution to obtain the required probability.

Finally, we evaluate the conditional probability in the numerator:

Combining all these terms together we have,

Hence, the samples generated by rejection sampling follow the distribution as the target distribution *f.*

In the next post, we will dive into importance sampling.

## Reference

[1] http://www.columbia.edu/~ks20/4703-Sigman/4703-07-Notes-ARM.pdf

[2] Kevin Murphy, Machine Learning: A Probabilistic Perspective