Decoding probability: how characteristics influence the likelihood of events

6 min read · Jun 23, 2024

DALL-E hallucinating the visual process of calculating probabilities for boys and girls with given characteristics.

Last week, I wrote a post on three family-related probability questions that are often used in interviews for quantitative jobs to probe reasoning around seemingly contradictory probability and statistics questions (coincidentally, I found one of the questions on this interview preparation from a leading UK quant firm). I showed that if we consider the realization of an event with two possible outcomes (e.g. a child being born a boy — assuming sex is binary) and we have two such independent events (e.g. two children), the more information we add about one of these events, the closer we push the probability of both events occurring to 50%.

We saw that the information could be anything. If we specify the gender of the eldest (or youngest) child, we saturate this bound immediately, because there are only two possibilities (the child is either the eldest or the youngest). We therefore have full information on one of the children, and the remaining child has 50:50 probability of being a boy or a girl.

If we have “looser” information, the probability can drop below 50%. For example, if we specify that at least one child is a boy born on a Tuesday, each child has 14 possible combinations of gender and day of the week. A calculation with conditional probabilities reveals that the probability of both children being boys is 13/27, or around 48%. If we reduce the amount of information, we can reduce this probability further. For example, by specifying only that at least one child is a boy, the probability that both are boys drops to 1/3, or around 33%, as there are only three equally likely cases left (boy/girl, girl/boy, or boy/boy).
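These numbers can be double-checked by brute force. The following Python sketch enumerates all equally likely two-child families and counts the relevant cases with exact fractions. The function name and the encoding of days as integers 0 to 6 (with 1 standing for Tuesday) are illustrative choices, not part of the original argument.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 1  # arbitrary label: days are encoded as 0..6 for illustration


def prob_two_boys(condition):
    """P(two boys | condition) over all equally likely two-child families.

    Each child is a (gender, day) pair; `condition` takes the pair of
    children and returns True if the family matches the given information.
    """
    children = list(product("BG", range(7)))      # 14 combinations per child
    families = list(product(children, repeat=2))  # 196 equally likely families
    matching = [f for f in families if condition(f)]
    both_boys = [f for f in matching if all(g == "B" for g, _ in f)]
    return Fraction(len(both_boys), len(matching))


# At least one boy born on a Tuesday.
tuesday_boy = prob_two_boys(lambda f: any(c == ("B", TUESDAY) for c in f))
# At least one boy (no information about the day).
any_boy = prob_two_boys(lambda f: any(g == "B" for g, _ in f))
# The eldest child (taken to be the first in the pair) is a boy.
eldest_boy = prob_two_boys(lambda f: f[0][0] == "B")

print(tuesday_boy, any_boy, eldest_boy)  # 13/27 1/3 1/2
```

Counting works here because every (gender, day) pair is assumed equally likely; for a characteristic with an arbitrary probability p, the outcomes have to be weighted instead.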

Today, I want to show you how we can generalize all these cases into a single formula. With that, we can show that 33% and 50% are indeed the lower and upper bound for such conditional probabilities.

For concreteness, let’s again consider two children who can both be either a boy or a girl. We assume that each child has a characteristic C with probability p, independently of gender and of the other child. To recover one of the cases above, C could be “being born on a Tuesday”. However, we can also extend this to whatever we want, for example “having blue eyes”, “liking soccer”, or “being taller than 5'2"”, as long as we can assign a probability p to these events.

We want to calculate the probability that both children are boys given that at least one is a boy with characteristic C. We can do this in two ways: quickly with Bayes’ rule, or step by step by thinking about events and intersections. I’ll illustrate both.

Method 1 — Bayes’ rule

The quickest way is to use Bayes’ rule for conditional probabilities (see this post for a review of Bayes’ rule):

P(two boys | at least one boy with C) = P(at least one boy with C | two boys)P(two boys)/P(at least one boy with C).

We can calculate immediately P(two boys)=1/4 as there are four gender combinations in total. To calculate the denominator, we use the complement of the event:

P(at least one boy with C) = 1 - P(no boy with C).

Since P(boy with C)=P(boy)P(C)=p/2 (because gender and characteristic C are independent of each other by assumption), we obtain using the complement again

P(at least one boy with C) = 1 - (1 - p/2)².

For the remaining term in the numerator, note that once we know both children are boys, “at least one boy with C” is the same event as “at least one child with C”. Hence P(at least one boy with C | two boys) = P(at least one child with C | two boys) = P(at least one child with C), again by independence of gender and characteristic C. We can now repeat the steps above with probability p instead of p/2 (since gender no longer matters), obtaining

P(at least one boy with C | two boys) = 1 - (1 - p)².

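Putting everything together, and using 1 - (1 - p)² = p(2 - p) and 4*(1 - (1 - p/2)²) = p(4 - p), we obtain (for p > 0)

P(two boys | at least one boy with C) = (1 - (1 - p)²)/(4*(1 - (1 - p/2)²))
= p(2 - p)/(p(4 - p))
= (2 - p)/(4 - p).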
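As a quick sanity check of the algebra, the Bayes expression can be assembled term by term and compared against the simplified result. This is a minimal sketch using exact fractions; the function names and the sample values of p are arbitrary choices made for illustration.

```python
from fractions import Fraction


def two_boys_bayes(p):
    """Assemble P(two boys | at least one boy with C) term by term."""
    p = Fraction(p)
    prior = Fraction(1, 4)           # P(two boys)
    likelihood = 1 - (1 - p) ** 2    # P(at least one boy with C | two boys)
    evidence = 1 - (1 - p / 2) ** 2  # P(at least one boy with C)
    return likelihood * prior / evidence


def two_boys_closed_form(p):
    """The simplified result (2 - p)/(4 - p)."""
    p = Fraction(p)
    return (2 - p) / (4 - p)


# Compare both expressions on a few exact values of p. The value p = 0 is
# excluded because the conditioning event then has probability zero.
for p in [Fraction(1, 100), Fraction(1, 7), Fraction(1, 2), Fraction(1)]:
    assert two_boys_bayes(p) == two_boys_closed_form(p)

print(two_boys_closed_form(Fraction(1, 7)))  # 13/27
```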

To confirm that we did the calculations correctly, let’s examine the same problem in terms of all basic events.

Method 2 — Intersection of events

Let’s define the following events:

C1: Child 1 has characteristic C → P(C1)=p.
C2: Child 2 has characteristic C → P(C2)=p.
B1: Child 1 is a boy → P(B1)=1/2.
B2: Child 2 is a boy → P(B2)=1/2.
A: At least one boy has characteristic C.

We can rewrite event A in terms of the more fundamental events B1, B2, C1, and C2 (see sketch below for a visual aid). Since at least one boy has to have characteristic C, the possibilities are that either child 1 is a boy with C, or child 2 is a boy with C, or both. In terms of events (recall that the union operator ∪ acts as a non-exclusive or):

A=(B1∩C1)∪(B2∩C2).

Illustration of the four events discussed in method 2 and their relevant intersections.

To calculate P(A), we can use the rules of probabilities for unions of events:

P(E1∪E2) = P(E1) + P(E2) - P(E1∩E2).

Applying this formula to E1=B1∩C1 and E2=B2∩C2, we get

P(A) = P(B1∩C1) + P(B2∩C2) - P(B1∩C1∩B2∩C2) = P(B1)P(C1) + P(B2)P(C2) - P(B1)P(C1)P(B2)P(C2) = p/2 + p/2 - p²/4 = p/4*(4-p),

where we have used the independence of all the events in the second equality.

Now that we have P(A), we can calculate the conditional probability corresponding to having two boys given that at least one has characteristic C. This is given by

P(B1∩B2|A) = P(B1∩B2∩A)/P(A).

To calculate the numerator, we can either do the set algebra with unions and intersections, or simply look at a drawing of all the events, which reveals that B1∩B2∩A is the union of three disjoint events: B1∩B2∩C1^c∩C2, B1∩B2∩C1∩C2^c, and B1∩B2∩C1∩C2 (here C1^c denotes the complement of C1, i.e. child 1 does not have C). Again, we can recognize the three cases: two boys and only child 2 has C, two boys and only child 1 has C, or both boys have C. Summing the probabilities of the disjoint events, we get

P(B1∩B2∩A) = P(B1∩B2∩C1^c∩C2) + P(B1∩B2∩C1∩C2^c) + P(B1∩B2∩C1∩C2) = (1-p)*p/4 + p*(1-p)/4 + p²/4 = p/4*(2 - p).

Putting the numerator and denominator together, we obtain the same result as in method 1:

P(B1∩B2|A) = (p/4*(2 - p)) / (p/4*(4 - p)) = (2 - p)/(4 - p).
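The event bookkeeping of method 2 can also be reproduced mechanically. The sketch below sums the probabilities of all 16 atomic outcomes (each of B1, B2, C1, C2 either occurs or not) and picks out those belonging to A and to B1∩B2∩A. The function name and the choice p = 1/7 are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def method2_probabilities(p):
    """Return (P(A), P(B1∩B2∩A), P(B1∩B2|A)) by summing over atomic outcomes."""
    p = Fraction(p)
    half = Fraction(1, 2)
    prob_a = Fraction(0)
    prob_bb_and_a = Fraction(0)
    # Each atomic outcome fixes whether B1, B2, C1, C2 occur (True) or not.
    for b1, b2, c1, c2 in product([True, False], repeat=4):
        # Independence: the weight is the product of the four probabilities.
        weight = half * half * (p if c1 else 1 - p) * (p if c2 else 1 - p)
        in_a = (b1 and c1) or (b2 and c2)  # A = (B1∩C1) ∪ (B2∩C2)
        if in_a:
            prob_a += weight
            if b1 and b2:
                prob_bb_and_a += weight
    return prob_a, prob_bb_and_a, prob_bb_and_a / prob_a


p = Fraction(1, 7)
prob_a, prob_bb_and_a, conditional = method2_probabilities(p)
assert prob_a == p / 4 * (4 - p)         # matches P(A) derived above
assert prob_bb_and_a == p / 4 * (2 - p)  # matches P(B1∩B2∩A) derived above
print(conditional)  # 13/27
```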

Discussion

Now that we have confirmed the result, we can discuss its implications. First of all, you can see in the plot below that the probability is monotonically decreasing in p. This means that the more likely the characteristic C is, the lower the posterior (i.e. conditional) probability of two boys. This makes sense because if p is large, the characteristic does little to distinguish the two children, as both will most likely display it anyway.

The probability of having two boys, given that at least one is a boy with a characteristic C that occurs with probability p, plotted as a function of p.
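The monotonic decrease can also be verified without the plot by differentiating the formula with respect to p. Here f(p) is simply a name introduced for the conditional probability:

```latex
f(p) = \frac{2-p}{4-p}, \qquad
f'(p) = \frac{-(4-p) + (2-p)}{(4-p)^2} = -\frac{2}{(4-p)^2} < 0
\quad \text{for } 0 \le p \le 1 .
```

Since the derivative is negative on the whole interval, f decreases steadily as p grows from 0 to 1.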

In the limit p=1, both children always have characteristic C and the posterior probability is (2 - 1)/(4 - 1) = 1/3 ≈ 33%. In other words, C provides no additional information about either child and we recover the case of P(two boys | at least one is a boy).

At the other end, in the limit p→0, we maximize the additional information that distinguishes one child from the other. Indeed, we recover P = (2 - 0)/(4 - 0) = 1/2 = 50%. This happens because we know a very specific fact, one among a multitude of possibilities, about at least one child. This could be for example being born on December 12th, 2019 at 04:54am in room 401 at the General Maternity Hospital in Stockholm, Sweden. It is an extremely specific event that conveys a lot of information once we know it occurred, but which is extremely unlikely if picked randomly. Therefore, for all practical purposes, it is equivalent to specifying which child is a boy and thus it saturates the bound that we found when calculating the probability of two boys, given that the eldest is a boy. In that case, we were also fully specifying which one was the boy.

Also note that if you insert p=1/7 (the probability of being born on a Tuesday) in the formula, you correctly recover P = (2 - 1/7)/(4 - 1/7) = 13/27, i.e. the probability of having two boys given that at least one is a boy born on a Tuesday. Thus, we have successfully covered all the cases in the previous post.
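For readers who prefer an empirical check, the formula can be compared against a simulation. The following is a minimal Monte Carlo sketch, not an exact computation: the function name, the number of trials, and the fixed seed are arbitrary choices, and the estimates fluctuate around the exact values.

```python
import random


def simulate_two_boys(p, trials=200_000, seed=0):
    """Estimate P(two boys | at least one boy with C) by random sampling."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    matching = 0   # families with at least one boy who has C
    both_boys = 0  # ... among which both children are boys
    for _ in range(trials):
        # Each child is an (is_boy, has_c) pair, drawn independently.
        kids = [(rng.random() < 0.5, rng.random() < p) for _ in range(2)]
        if any(is_boy and has_c for is_boy, has_c in kids):
            matching += 1
            if all(is_boy for is_boy, _ in kids):
                both_boys += 1
    return both_boys / matching


for p in (1.0, 1 / 7):
    estimate = simulate_two_boys(p)
    exact = (2 - p) / (4 - p)
    print(f"p={p:.3f}  simulated={estimate:.3f}  exact={exact:.3f}")
```

For very small p, few simulated families satisfy the condition, so many more trials would be needed to get a stable estimate near 1/2.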

I hope you enjoyed this detailed analysis of a quite general conditional probability problem and how it reproduces several different special cases. Stay tuned for more probability and statistics next week!

Written by Paolo Molignini, PhD