The birthday problem
Say you work in an office with around 20 employees. If two of them shared the same birthday, would you remark on it as an interesting coincidence? A rarity?
Many people might, because our instinct tells us that it’s quite unlikely that two people in an office of around 20 people were born on the same day of the year. After all, there are only 20-odd people and there are 365 possible birthdays (for simplicity’s sake, let’s ignore the possibility that anyone was born on February 29 in a leap year).
But are our instincts right? Mathematics allows us to work out just how likely, or probable, such a coincidence is.
The answer might surprise you. Take an office with 23 employees: with 23 randomly selected birthdays, the probability that two or more will be the same is just over 50 per cent, a better than even chance! And with those odds, it’s hardly remarkable when such a coincidence occurs.
Read on if you’d like to get into the nitty-gritty of the probability calculation.
In the meantime, here are some other results for variations on the ‘birthday problem’: in a group of 40 randomly selected people, the chance that two or more would share the same birthday goes up to around 90 per cent; in a group of 14 people, the chance that two or more would have birthdays that are either the same or only one day apart is around 50 per cent; in a group of 88 people, there’s an even chance that three or more people would share the same birthday.
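For readers who like to experiment, the two less familiar variations (birthdays within a day of each other, and three people sharing a birthday) can be checked roughly by simulation. The Python sketch below is only an illustration: the function names, the number of trials and the fixed random seed are arbitrary choices, it assumes 365 equally likely birthdays, and it assumes that 31 December and 1 January count as one day apart. Being a random estimate, it will land near, not exactly on, the quoted figures.

```python
import random
from collections import Counter

DAYS = 365  # assumption: leap days ignored, every day equally likely


def has_near_match(birthdays, max_gap=1):
    """True if any two birthdays are at most max_gap days apart.

    Assumes the year wraps around, so day 364 and day 0 are one day apart.
    """
    days = sorted(birthdays)
    for earlier, later in zip(days, days[1:]):
        if later - earlier <= max_gap:
            return True
    # compare the last and first birthdays across the New Year boundary
    return days[0] + DAYS - days[-1] <= max_gap


def has_triple(birthdays):
    """True if at least three people share one birthday."""
    return max(Counter(birthdays).values()) >= 3


def estimate(group_size, event, trials=10_000, seed=1):
    """Fraction of randomly generated groups in which the event occurs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(DAYS) for _ in range(group_size)]
        if event(birthdays):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("14 people, same or adjacent day:", estimate(14, has_near_match))
    print("88 people, three sharing a day: ", estimate(88, has_triple))
```

Both estimates should come out in the neighbourhood of one half, in line with the figures above.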
So, how do we work this out? By working out the probability that there are no shared birthdays in the group, we can then work out the probability that there are two or more people sharing a birthday.
This works because there are only two possible outcomes: either (a) no one in the group shares a birthday with anyone else in the group, or (b) at least two people share the same birthday. Their respective chances must therefore add up to 100 per cent, so by working out one chance, we can work out the other.
Let’s put our group of 23 people in a line and work our way through from end to end.
The first person will have one of 365 possible birthdays (as described below*). Now we work out the probability that the second person in line does not share the same birthday as the first, which is 364/365. Why this number? Because, out of the 365 possible birthdays for person number two, 364 of them will not match the first person’s birthday, thus giving a probability of 364/365.
Next we take the third person in line and work out the probability that he or she does not share a birthday with either of the first two: this chance is 363/365, following the reasoning of there being 363 out of 365 possible birthdays that do not match the two birthdays of persons one and two.
Applying the same logic to the fourth person, we get a probability of 362/365 that their birthday will not be the same as the three people before them in the line. We then continue assigning probabilities down the line until we get to the 23rd person, who will have a 343/365 chance of not sharing a birthday with the 22 other people in the group.
We now use the calculated chances that each successive person in line will not share a birthday with those preceding to work out the probability that the entire group has no shared birthdays.
In words, this is the chance that i) the second person does not share a birthday with the first (chance = 364/365), AND ii) the third person does not share a birthday with the first or second (363/365), AND iii) the fourth…(and so on)…AND, finally, xxii) the 23rd does not share a birthday with any of the previous 22 (343/365).
The probability that ALL of these things will occur is obtained by multiplying together the chances of each individual event occurring:
Probability that there are no shared birthdays in a group of 23
= 364/365 × 363/365 × 362/365 × 361/365 × … × 344/365 × 343/365 (you can work this out on a calculator)
= 0.4927… = 49.3% (rounded)
Therefore, we know the probability that at least two people share the same birthday is:
100% – 49.3% = 50.7%, which is better than even!
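If you would rather not key 22 fractions into a calculator, the same multiplication can be done in a few lines of Python. This is a minimal sketch of the calculation described above, assuming 365 equally likely birthdays; the function names are our own and purely illustrative.

```python
def prob_no_shared_birthday(group_size, days=365):
    """Chance that no two people in the group share a birthday."""
    prob = 1.0
    for already_in_line in range(1, group_size):
        # the next person must avoid every birthday already taken
        prob *= (days - already_in_line) / days
    return prob


def prob_shared_birthday(group_size, days=365):
    """Chance that at least two people share a birthday (the complement)."""
    return 1.0 - prob_no_shared_birthday(group_size, days)


if __name__ == "__main__":
    for n in (20, 23, 40):
        print(n, "people:", round(prob_shared_birthday(n), 3))
```

Changing the group size shows how quickly the chance climbs: under these assumptions, 23 is the smallest group for which a shared birthday is more likely than not.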
*Assumptions behind the calculation: we assume birthdays are spread evenly throughout the year—that is, each day of the year is equally likely as a birthday. In real life this may not be true, as the birth rate can change month to month, season to season, leading to an uneven distribution of birthdays. Some births are more likely to occur on weekdays than weekends, such as those by caesarean section, and this may affect the distribution of birthdays for certain populations (e.g. a class of children all born in the same year).
The second assumption is that we ignore the possibility of people being born on February 29 and examine the case where a population has their birthdays evenly distributed across the 365 days of a non-leap year. Again, this is not true in reality, but considering that February 29 is just one possible birthday out of every four years—so 1 possible birthday out of 1461 days—the simplified case of 365 possible birthdays is accurate enough to give us a general idea about the probabilities involved in the birthday problem.
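To get a feel for how little the leap-day simplification matters, the calculation can be repeated with February 29 included. The sketch below is an illustration under stated assumptions rather than a definitive model: it gives February 29 a chance of 1 in 1461 and each of the other 365 days a chance of 4 in 1461, and the function names are hypothetical. A group has no shared birthday if either nobody was born on February 29 and everyone has a different ordinary day, or exactly one person was born on February 29 and the rest all have different ordinary days.

```python
P_LEAP = 1 / 1461      # assumed chance of a 29 February birthday
P_ORDINARY = 4 / 1461  # assumed chance of any one of the other 365 days


def prob_distinct_ordinary_days(count):
    """Chance that `count` people all fall on different non-leap days."""
    prob = 1.0
    for already_taken in range(count):
        prob *= (365 - already_taken) * P_ORDINARY
    return prob


def prob_shared_with_leap_day(group_size):
    """Chance of at least one shared birthday when 29 February is allowed."""
    # Case 1: nobody was born on 29 February
    none_on_leap = prob_distinct_ordinary_days(group_size)
    # Case 2: exactly one person (any of the group) was born on 29 February
    one_on_leap = group_size * P_LEAP * prob_distinct_ordinary_days(group_size - 1)
    return 1.0 - (none_on_leap + one_on_leap)


if __name__ == "__main__":
    print("23 people, leap day included:", round(prob_shared_with_leap_day(23), 4))
```

For 23 people the result stays just above 50 per cent and differs from the 365-day figure only in the third decimal place.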
Further reading:
Matthews, R. and Stones, F. (1998). Coincidences: the truth is out there. Teaching Statistics, 20:17–19. Available at http://ts.rsscse.org.uk/gtb/matthews.pdf
Crilly, T. (2007). 50 mathematical ideas you really need to know. Quercus Publishing Plc, London
Written by Sarah White, Science Research Officer in the Office of the Chief Scientist.