The Birthday Paradox
What is the probability of sharing birthdays?
The other day, a few coworkers and I were having lunch and we found out that two people in the group had the same birthday. Someone exclaimed, “What are the chances!?”
“Welllll,” I declared as a former statistics major, “I don’t know offhand for a group this size, but if you have a group of 23 people, there is a 50% chance that at least two people have the same birthday.” What?? How can that be?
I then decided to ask the data team: How many people do you need before you have over a 50% chance that two people have the same birthday? They all responded with an answer that was around 23. I realized that this may be a biased sample, so I decided to ask non-data friends…
“I haven’t met anyone with my birthday, so maybe 400–500”
“The stats major is testing my probability. Uh…2?”
Hmm…is it that the data team at Ro is just so good at numbers that they have an amazing knack for calculating probabilities of birthdays? Well, I will say that the numerical dexterity on the team is pretty great, but this birthday problem is actually famous and is a favorite in probability classes, so it makes sense that the team had already worked through the math.
The problem is confusing because there are 365 days in a (normal) year, so intuitively, assuming birthdays are spread evenly across all the days, we think that we would need around that many people, or more, before two of them share a birthday. Alas, calculating probabilities is often not as intuitive as one would think, which is why this problem is called a paradox!
Combinations of pairs
It might help to think about this in terms of pairs. Given a group of 23 people, I can compare my birthday to that of Person 2, Person 3, Person 4, etc., and Person 2 can compare her birthday to that of Person 3, Person 4, Person 5, etc., and so on. This results in 22+21+20+19+…+1 different comparisons. In statistics, each such pairing is called a combination: a way of choosing a group of a given size where the order of the members does not matter.
We can count all these different combinations below…
…orrr we can use a handy formula!
$$C(n, r) = \frac{n!}{r!\,(n-r)!}$$
n = number of people
r = number of people that make up a combination
In this scenario, n = 23 and r = 2. Two people make up the combination because we are only comparing two people’s birthdays at a time.
Following the equation we get…
$$\frac{23!}{2!\,(23-2)!} = \frac{23 \times 22}{1 \times 2} = 253$$
Phew. Ok, so what does 253 represent? It is the number of possible combinations of pairs of people. If you are skeptical of this, feel free to count the X’s above!
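If counting by hand sounds tedious, here is a minimal Python sketch that checks the 253 three different ways: listing every pair, using the formula, and adding up 22+21+…+1. The variable names are my own, chosen just for illustration.

```python
from itertools import combinations
from math import comb

# Label the 23 people 1 through 23 (labels are arbitrary).
people = range(1, 24)

# Way 1: list every unordered pair and count them.
pairs = list(combinations(people, 2))
print(len(pairs))

# Way 2: the combinations formula n! / (r! * (n - r)!).
print(comb(23, 2))

# Way 3: the running sum 22 + 21 + ... + 1.
print(sum(range(1, 23)))
```

All three lines should print the same number, 253.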
Probability of x people having different birthdays
Now on to incorporating days of the year…
A normal year has 365 days. As a reminder of basic statistics, the probabilities of all possible outcomes add up to 1, and for readability, I will use the notation P(x) to mean “the probability of x”.
If we have two people:
$$P(\text{2 people have different birthdays}) = 1 - P(\text{2 people have the same birthday}) = 1 - \frac{1}{365} = \frac{364}{365}$$
This makes sense intuitively: whatever day the first person was born, there is only 1 day of the year that gives the second person the same birthday, so on the other 364 days they do not match.
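For the skeptics, the two-person case is small enough to brute-force. This sketch assumes a 365-day year with every day equally likely, and simply counts every possible (birthday, birthday) combination for two people.

```python
from fractions import Fraction

DAYS = 365  # assumes a non-leap year with equally likely birthdays

# Count the ordered (person A's day, person B's day) outcomes that differ.
different = sum(1 for a in range(DAYS) for b in range(DAYS) if a != b)

# Divide by the total number of outcomes; Fraction keeps the answer exact.
p_different = Fraction(different, DAYS * DAYS)
print(p_different)  # 364/365
```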
But, what if we have three people?
We can start to see how this calculation can get out of hand as the number of people gets larger. However, we can simplify the problem by stating:
$$P(\text{no one shares a birthday}) = 1 - P(\text{at least two people share a birthday})$$
It is a lot easier to calculate the statement on the left.
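Continuing with three people, there are three pairs to compare…
$$P(\text{no one shares a birthday}) \approx P(\text{1 and 2 differ}) \times P(\text{1 and 3 differ}) \times P(\text{2 and 3 differ})$$
And using what we worked out with two people…
$$\frac{364}{365} \times \frac{364}{365} \times \frac{364}{365} = \left(\frac{364}{365}\right)^{3}$$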
We are multiplying these probabilities because people’s birthdays are independent of each other. For example, Person 1’s birthday is not affected by Person 3’s birthday…just like your ex’s birthday is not affected by your mom’s birthday. (Strictly speaking, the pairs overlap, so treating every comparison as independent is an approximation, but it is a close one for a group this size.)
Hopefully this is all starting to look familiar! The number of comparisons is exactly what we counted with the combinations calculation above. So the calculation that we want is the probability that two people do not have the same birthday, raised to the power of however many pairs exist.
As we said earlier, with 23 people, there are 253 combinations, so we will multiply (364/365) by itself 253 times, or we can use the notation (364/365)²⁵³, which equals .4995.
Don’t forget: this is the probability that no one has the same birthday. To get the probability that at least two people have the same birthday, we simply subtract this from 1: 1 − .4995 = .5005, which is over 50%!
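To double-check the arithmetic, here is a small Python sketch of the pairs calculation for 23 people. For comparison, it also computes the textbook exact answer, which fills the birthdays in one person at a time (365/365 × 364/365 × 363/365 × …) instead of treating each pair separately. That second method is not part of the walkthrough above; I include it only as a sanity check, and both assume 365 equally likely birthdays.

```python
from math import comb, prod

DAYS = 365  # assumes a non-leap year with equally likely birthdays
n = 23

# Pairs method from the text: (364/365) raised to the number of pairs.
pairs = comb(n, 2)
approx_no_match = (364 / 365) ** pairs
approx_match = 1 - approx_no_match

# Exact method (an addition for comparison): person k must avoid the
# k birthdays already taken, so the factors are 365/365, 364/365, ...
exact_no_match = prod((DAYS - k) / DAYS for k in range(n))
exact_match = 1 - exact_no_match

print(round(approx_match, 4))  # 0.5005
print(round(exact_match, 4))   # 0.5073
```

The two answers differ by less than a percentage point, and both land just over 50%.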
We can graph this calculation to visualize how the probability changes as the group gets larger:
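As more people join the group, the probability climbs quickly at first, because the number of pairs grows much faster than the number of people, and then levels off as it approaches 100%. That steep early climb is part of why this problem is so unintuitive!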
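If you would like to reproduce the curve yourself, here is a rough sketch that tabulates the probability for a few group sizes. The function name `p_shared` is my own, and it uses the exact one-person-at-a-time product rather than the pairs shortcut, again assuming 365 equally likely birthdays.

```python
from math import prod

def p_shared(n, days=365):
    """Probability that at least two of n people share a birthday."""
    # Person k (counting from 0) must avoid the k birthdays already taken.
    p_all_different = prod((days - k) / days for k in range(n))
    return 1 - p_all_different

# Print a small table of group size versus probability.
for size in (5, 10, 20, 23, 30, 50, 70):
    print(f"{size:>3} people: {p_shared(size):.1%}")

# Find the smallest group whose probability is over 50%.
smallest = next(n for n in range(1, 367) if p_shared(n) > 0.5)
print(smallest)  # 23
```

Feeding these values into any plotting tool should give the same shape as the graph.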
Next time you’re in a group of 23 people, if you’re brave enough to ask everyone their birthday, you too can be viewed as a probability guru…or potentially as someone who wasted everyone’s time; you have a 50/50 chance!