This key concept will simplify all your probability calculations

Paolo Molignini, PhD
7 min read · Aug 12, 2024


DALL-E capturing people at a party enjoying themselves way too much while putting balls in boxes (and apparently eating them, too?).

In probability theory, simplifying complex problems often feels like a daunting task. Whether you’re assembling probability distributions from complicated combinatorics or grappling with the expectation of random variables, it’s easy to feel overwhelmed by the calculations involved. However, there’s a powerful concept that can cut through all of this complexity and make these calculations much more manageable: indicator random variables.

Indicator random variables, despite their simplicity, offer an extremely versatile tool for bridging the gap between probability and expectation. By considering binary representations of events, we can transform cumbersome probability problems into straightforward expectation calculations that rely on the elegant principle of linearity.

In this post, we’ll explore how this key concept not only simplifies calculations but also provides deeper insights into probability theory. By the end, you’ll see how indicator random variables can be your go-to method for tackling a wide range of problems, enabling you to navigate through probability spaces with newfound confidence and ease.

Indicator random variables

The definition of an indicator random variable (indicator r.v. for short) is very simple. For any event A, the indicator r.v. I(A) is defined to be one if A occurs, and zero if A does not occur:

$$I(A) = \begin{cases} 1 & \text{if } A \text{ occurs} \\ 0 & \text{if } A \text{ does not occur} \end{cases}$$

This definition also clarifies why these random variables are dubbed “indicator”: they indicate whether the corresponding event happens or not. Another way to look at indicator r.v.s is as a special case of Bernoulli r.v.s (I touch upon Bernoulli trials in my post about the St. Petersburg paradox), where “success” corresponds to A occurring and “failure” corresponds to A not occurring.

Indicator r.v.s thus inherit all the properties of Bernoulli r.v.s, such as idempotency and the multiplicative property of intersections. However, their most powerful property is connected to their expectation.
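Both properties are easy to check numerically. The sketch below (plain Python; the die-roll events are my own illustrative choice, not from the post) verifies idempotency and the multiplicative property on a large sample:

```python
import random

random.seed(0)

# Sample a fair six-sided die and build indicators for two events.
rolls = [random.randint(1, 6) for _ in range(10_000)]

ind_even = [1 if r % 2 == 0 else 0 for r in rolls]  # I(A): roll is even
ind_big = [1 if r >= 5 else 0 for r in rolls]       # I(B): roll is 5 or 6

# Idempotency: I(A)^2 = I(A), because 0^2 = 0 and 1^2 = 1.
assert all(i ** 2 == i for i in ind_even)

# Multiplicative property: I(A) * I(B) = I(A and B).
ind_both = [1 if (r % 2 == 0 and r >= 5) else 0 for r in rolls]
assert all(a * b == c for a, b, c in zip(ind_even, ind_big, ind_both))
```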

The fundamental bridge

Since indicator r.v.s are binary quantities, the expectation of an indicator r.v. is the probability of its corresponding event. This fact is also known as the fundamental bridge between probability and expectation, and it can be written mathematically as

$$\mathbb{E}[I(A)] = \mathbb{P}(A)$$

where the blackboard bold characters P and E indicate probability and expectation, respectively. This property can be proved directly by using the definition of expectation for discrete r.v.s and the fact that the indicator r.v. can only take 0 or 1 as values:

$$\mathbb{E}[I(A)] = 1 \cdot \mathbb{P}(I(A) = 1) + 0 \cdot \mathbb{P}(I(A) = 0) = \mathbb{P}(I(A) = 1) = \mathbb{P}(A)$$

In the last step, we used the 1:1 correspondence between I(A) = 1 and A occurring.

The fundamental bridge is an extremely powerful tool, especially for discrete problems, because it allows us to talk interchangeably about probability and expectation through the use of indicator r.v.s. In other words:

In discrete probability spaces, expectation is probability and probability is expectation.
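The bridge is also easy to verify by simulation. The sketch below (plain Python; the event is an arbitrary choice for illustration) samples a fair die many times and checks that the sample mean of the indicator of “rolling a six” approaches P(A) = 1/6:

```python
import random

random.seed(42)
n_trials = 100_000

# Event A: a fair die shows a six, so P(A) = 1/6.
indicators = [1 if random.randint(1, 6) == 6 else 0 for _ in range(n_trials)]

# By the fundamental bridge, the sample mean of I(A) estimates P(A).
estimate = sum(indicators) / n_trials
print(estimate)  # close to 1/6 ≈ 0.1667
```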

As we will see in the examples below, we are often confronted with situations that require complicated formulations of discrete random variables with cumbersome or even unknown distributions. While the events and probabilities might be hard to write down, we can often express them as sums of indicator r.v.s which are instead extremely simple. Applying the fundamental bridge then allows us to quickly find the probabilities by leveraging linearity of expectation, i.e. the property

$$\mathbb{E}[X + Y] = \mathbb{E}[X] + \mathbb{E}[Y], \qquad \mathbb{E}[cX] = c\,\mathbb{E}[X] \qquad (*)$$

This simple observation is actually quite remarkable. Look carefully at equation (*): linearity applies to expectations, not to probabilities! In general, P(A ∪ B) ≠ P(A) + P(B) unless A and B are disjoint, whereas linearity of expectation holds unconditionally, even for dependent random variables. By choosing to reformulate probabilities as expectations, we can exploit this extra handy property that we have at our disposal.
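To make this contrast concrete, here is a minimal numerical check (plain Python; the events and helper names are my own illustrative choices). For a fair die, take A = “the roll is even” and B = “the roll is at least 4”: the two indicators are dependent, yet linearity of expectation holds exactly, while the probabilities of the events do not simply add up.

```python
from fractions import Fraction

outcomes = range(1, 7)  # a fair six-sided die

def even(r):  # event A: the roll is even, P(A) = 1/2
    return r % 2 == 0

def big(r):   # event B: the roll is at least 4, P(B) = 1/2 (overlaps A)
    return r >= 4

def ind(pred, r):
    return 1 if pred(r) else 0

def E(f):
    # Expectation by direct enumeration over the uniform sample space.
    return Fraction(sum(f(r) for r in outcomes), 6)

# Linearity holds exactly, even though the two indicators are dependent.
lhs = E(lambda r: ind(even, r) + ind(big, r))
rhs = E(lambda r: ind(even, r)) + E(lambda r: ind(big, r))
assert lhs == rhs == 1

# Probability is NOT additive here: P(A or B) = 2/3 < P(A) + P(B) = 1.
p_union = Fraction(sum(1 for r in outcomes if even(r) or big(r)), 6)
assert p_union == Fraction(2, 3)
```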

Let’s now look at a couple of examples of how to harness the power of indicator r.v.s in real contexts.

Example 1: random balls in boxes

The first example involves combinations of distinguishable objects as shown in the sketch below.

A total of k differently colored balls are randomly placed into n differently colored boxes. What is the expected number of empty boxes?

You randomly put k distinguishable balls into n distinguishable boxes. What is the expected number of empty boxes?

This problem is a classic question in quant interviews, and it is quite complicated if attacked head-on. If we determine the probability distribution of the number of empty boxes, we can then assemble the expectation by summing over all the cases. This can work, but we would have to separately compute the probability of zero empty boxes, of one empty box, of two empty boxes, and so on. All this combinatorics can be skipped if we use indicator r.v.s.

First of all, for every ball i, box j has a 1/n chance of receiving it. Complementarily, the probability of ball i not landing in box j is 1 − 1/n. Since all balls are placed independently, the probability of box j being empty is the product of the probabilities of it not hosting any of the k balls. The equations below summarize these facts:

$$\mathbb{P}(\text{ball } i \text{ in box } j) = \frac{1}{n}, \qquad \mathbb{P}(\text{ball } i \text{ not in box } j) = 1 - \frac{1}{n}, \qquad \mathbb{P}(\text{box } j \text{ empty}) = \left(1 - \frac{1}{n}\right)^k$$

Now, let’s define an indicator r.v. for box j being empty:

$$I_j = \begin{cases} 1 & \text{if box } j \text{ is empty} \\ 0 & \text{otherwise} \end{cases}$$

The r.v. describing the number of empty boxes, N, is then simply the sum of the indicator r.v.s over all boxes. To compute the expected number of empty boxes, we construct the expectation of N and use linearity to reduce it to a sum of expectations of indicator r.v.s:

$$\mathbb{E}[N] = \mathbb{E}\left[\sum_{j=1}^{n} I_j\right] = \sum_{j=1}^{n} \mathbb{E}[I_j]$$

At this point, we apply the fundamental bridge to convert the expectations into the probabilities we computed earlier, and we immediately obtain the final result:

$$\mathbb{E}[N] = \sum_{j=1}^{n} \mathbb{P}(\text{box } j \text{ empty}) = n \left(1 - \frac{1}{n}\right)^k$$

To give you some numerical examples: for 3 balls in 2 boxes the expected number of empty boxes is 2 · (1/2)³ = 0.25, whereas for 3 balls in 4 boxes we can expect on average 4 · (3/4)³ = 1.6875 ≈ 1.7 empty boxes.
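If you want to convince yourself of the result, a short simulation will do. The sketch below (function names are mine, not from the post) compares the closed-form answer n(1 − 1/n)^k with a Monte Carlo estimate obtained by actually throwing balls into boxes:

```python
import random

def expected_empty_exact(k, n):
    """Closed-form answer from the indicator argument: n * (1 - 1/n)^k."""
    return n * (1 - 1 / n) ** k

def expected_empty_mc(k, n, trials=200_000, seed=0):
    """Monte Carlo estimate: place k balls uniformly into n boxes."""
    rng = random.Random(seed)
    total_empty = 0
    for _ in range(trials):
        occupied = {rng.randrange(n) for _ in range(k)}  # boxes hit by a ball
        total_empty += n - len(occupied)
    return total_empty / trials

print(expected_empty_exact(3, 2))  # 0.25
print(expected_empty_exact(3, 4))  # 1.6875
```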

Example 2: birthday matches

The classic birthday problem is a well-known and intriguing probability puzzle that asks how likely it is for two people in a group to share the same birthday. This is another fan-favorite among quantitative finance interviews. Extending this problem, we explore a related question:

In a group of n randomly selected people, what is the expected number of birthday matches, i.e. pairs of people with the same birthday?

To solve this problem, we could directly compute the probability that no two people in the group share a birthday and then subtract this from 1 to find the probability of at least one match. This approach leads to a rather complex combinatorial formula, often known as the “complementary probability” method, where you calculate the probability that each successive person does not share a birthday with those already considered. While this method can yield the probability of at least one match, extending it to determine the expected number of matches is not straightforward and becomes increasingly complicated as the group size grows.

Indicator r.v.s are an excellent alternative in this case. They are more efficient, more intuitive, and more broadly applicable to any n. Let’s see how this works.

We first define N as the random variable corresponding to the number of birthday matches, whose expectation we want to determine. We can decompose N into a sum of indicator r.v.s by enumerating all the possible pairs of people. The total number of such pairs is n choose 2. For example, for 5 people, there are 10 possible pairs, as the diagram below shows.

Out of 5 different people we can create 10 possible pairs.

For each pair j, we define a corresponding indicator r.v.:

$$I_j = \begin{cases} 1 & \text{if the two people in pair } j \text{ share a birthday} \\ 0 & \text{otherwise} \end{cases}$$

The total number of birthday matches N is then the sum over all the pairs we enumerated, and from here we can easily reduce the computation of the expected number of matches to the expectations of the indicator r.v.s by exploiting linearity:

$$\mathbb{E}[N] = \mathbb{E}\left[\sum_{j=1}^{\binom{n}{2}} I_j\right] = \sum_{j=1}^{\binom{n}{2}} \mathbb{E}[I_j]$$

Now we know how to calculate the expectation: with the fundamental bridge! The probability of any two people sharing a birthday is simply 1/365, since there are 365 days in a year (we neglect February 29th for simplicity). With this information, we arrive at the final result:

$$\mathbb{E}[N] = \binom{n}{2} \cdot \frac{1}{365} = \frac{n(n-1)}{730}$$
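As before, the result is easy to check by simulation. The following sketch (helper names are illustrative) compares the closed-form answer C(n, 2)/365 with a Monte Carlo estimate that counts matching pairs among randomly drawn birthdays:

```python
import random
from math import comb

def expected_matches_exact(n):
    """Indicator result: C(n, 2) / 365 expected same-birthday pairs."""
    return comb(n, 2) / 365

def expected_matches_mc(n, trials=100_000, seed=1):
    """Monte Carlo: count same-birthday pairs among n random birthdays."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        bdays = [rng.randrange(365) for _ in range(n)]
        total += sum(1 for i in range(n) for j in range(i + 1, n)
                     if bdays[i] == bdays[j])
    return total / trials

print(expected_matches_exact(23))  # 253/365 ≈ 0.693
```

Note that for the famous group size n = 23, the expected number of matches is already about 0.69, consistent with the classic result that a match is more likely than not.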

Conclusions

Indicator random variables stand out as a remarkably effective tool in simplifying calculations of probability and expectation alike. By leveraging their binary nature and the principle of linearity of expectation, complex problems can be transformed into more manageable forms, allowing for straightforward solutions. Whether tackling classic problems like birthday matches or more intricate scenarios, the use of indicator random variables not only streamlines your thought process but also provides a deeper insight into the fundamental relationship between probability and expectation. Embracing this approach can significantly enhance your ability to navigate and solve a wide array of probability challenges with confidence and precision. Be sure to add indicator random variables to your arsenal of probabilistic methods!

If you want to check out other quant interview questions related to probability, take a look at my recent post on an application of Gambler’s ruin!


Paolo Molignini, PhD

Researcher in theoretical quantum physics at Stockholm University with a passion for programming, data science, and probability & statistics.