Princeton’s 2020 Room Draw is Fair

12 min read · Mar 26, 2020

Last year, I published an article about Princeton’s Room Draw and how it was non-randomized, giving advantage to certain students.

Princeton Room Draw is Not At All Random

Large groups are advantaged over small groups, order of groups with the same participants stay the same across all…

medium.com

I originally did not plan on analyzing this year’s Room Draw data, but due to many requests I decided to investigate and found that this year’s Room Draw resolved the problems associated with last year’s draw.

There were two major problems associated with previous draws:

1. Draw times were correlated with size of groups, with larger groups receiving earlier times:

We can show this by plotting draw time vs. group size. We have to be careful that we’re only comparing groups of the same weight, since randomization occurs within groups of equal weight. I’ve shown the Senior Upperclass Draw for 2019; a similar approach works for other draws.

Figure 1: Swarm (left) and violin (right) plots showing the distribution of draw times per group size

The swarm plot on the left shows every draw group as a single point: the x-coordinate represents the group size and the y-coordinate represents the draw order, with 0 representing the first group to draw and 900 representing the last.

The violin plot on the right shows the distribution of draw times per group using a kernel density estimate (KDE) along with its associated box plot.

Figure 2: Box plot showing the distribution of draw times per group size

We can summarize our data by using a box plot. Here we can see that over half the students who drew alone were in the last 100 students to draw. We can also see that the median draw time decreases significantly as the draw size increases, meaning that larger group sizes tend to receive earlier draw times.

The box plot tells us that over three-quarters of students who drew alone were in the last 200 students to draw, and that any student who drew alone and managed to get a draw time in the top 50% is considered a statistical outlier (it lies more than 1.5 times the interquartile range beyond the nearest quartile).
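The outlier rule used by box plots can be sketched in a few lines. This is a minimal illustration using only Python's standard library; the function name `tukey_fences` and the positions in `solo_positions` are made up for the example and are not the real draw data.

```python
import statistics

def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences; points outside them are drawn as outliers."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Made-up draw positions for students drawing alone, clustered near the end.
solo_positions = [120, 710, 760, 790, 805, 820, 835, 850, 860, 875, 890]

lower, upper = tukey_fences(solo_positions)
outliers = [p for p in solo_positions if p < lower or p > upper]
print(lower, upper, outliers)
```

In this toy data the single early draw position falls below the lower fence, which is the same situation as a solo student landing in the top half of the 2019 draw.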

How can we tell it is non-random?

If draw times were randomized across different draw sizes, then the draw times for each draw size would obey a uniform distribution. We can observe how far our data deviates from a uniform distribution by using a Q–Q plot.

Figure 3: Q-Q plots showing the sample vs theoretical quantiles of draw times

A Q–Q (quantile-quantile) plot compares two probability distributions by plotting their quantiles against each other. Here, the y-axis (sample quantiles) represents the actual draw time of each group within our sample, with 0.0 representing the first group to draw and 1.0 the last group to draw. The x-axis (theoretical quantiles) represents what draw time we expect if the data did actually come from a uniform distribution.

We have the 45-degree line in red, which is what we expect if the data were uniform. For draw groups with sizes 1 and 2, we see that all of our points lie above the red line, meaning that our sampled quantiles of times were much larger than the theoretical (expected) times. This means that students who drew in groups of 1 or 2 received draw times that were much later than expected.

Conversely, for draw groups with sizes 5, 6, 7 and 8, we see that all points lie below the red line, which means that people who drew in large groups received draw times that were much earlier than expected. We can see that the distributions for groups of sizes 3 and 4 were largely uniform.
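To make the Q–Q construction concrete, here is a small sketch of how the plotted points can be computed for one draw size. The function `uniform_qq_points`, the plotting-position convention `(i + 0.5) / n`, and the example positions are assumptions for illustration; the original analysis may have used a different convention.

```python
def uniform_qq_points(positions, total):
    """Pair each sample quantile with its theoretical Uniform(0, 1) quantile.

    positions: 0-based draw positions of the groups of one size
    total: number of groups in the whole draw
    """
    # Normalize so that 0.0 is the first group to draw and 1.0 is the last.
    sample = sorted(p / (total - 1) for p in positions)
    n = len(sample)
    # (i + 0.5) / n is one common choice of plotting position; others exist.
    theoretical = [(i + 0.5) / n for i in range(n)]
    return list(zip(theoretical, sample))

# Hypothetical example: five solo groups that all drew late in a 100-group draw.
points = uniform_qq_points([70, 82, 90, 95, 99], total=100)
above_line = all(sample_q > theory_q for theory_q, sample_q in points)
print(points, above_line)
```

Every point in this example sits above the 45-degree line, which is the pattern described above for draw sizes 1 and 2.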

How do we quantify the result?

So we can see that the points do not lie on the red line, but could this be due to pure chance?

One way to quantitatively test the equality of two distributions is the Kolmogorov–Smirnov (K–S) test. The K–S test measures the distance between the sample distribution and the cumulative distribution function (cdf) of the reference distribution.

In other words, it looks for the largest distance between the points and the red line, giving us the Kolmogorov–Smirnov statistic. This provides us with a p-value, which tells us the probability of obtaining a statistic at least as extreme as the one observed if the data really did come from the reference distribution.

Table 1: K–S test statistics and their associated p-values for each draw size.

The results above tell us that for a draw group of size 5, if we simulate a uniform distribution, there is an 8.71% chance that we will obtain a K–S statistic greater than 0.2586, which is what our sample gives. This means that if the draws were randomized, there's an 8.71% chance that we would see such a distribution in the draw times of groups of size 5.
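The "simulate a uniform distribution" reading of the p-value can be sketched directly. The code below is a standard-library illustration, not the code behind Table 1: it computes the K–S statistic against Uniform(0, 1) by hand and estimates the p-value by Monte Carlo simulation instead of the analytic formula. The function names and both sample lists are made up. For real work, `scipy.stats.kstest(sample, "uniform")` performs the same kind of test with an analytically computed p-value.

```python
import random

def ks_statistic_uniform(sample):
    """Largest gap between the empirical CDF of `sample` and the Uniform(0, 1) CDF."""
    xs = sorted(sample)
    n = len(xs)
    # The empirical CDF jumps from i/n to (i + 1)/n at xs[i]; check both sides.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

def monte_carlo_p_value(sample, trials=10_000, seed=0):
    """Share of simulated uniform samples with a K-S statistic >= the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic_uniform(sample)
    n = len(sample)
    hits = sum(
        ks_statistic_uniform([rng.random() for _ in range(n)]) >= observed
        for _ in range(trials)
    )
    return observed, hits / trials

# Made-up normalized draw times: one sample bunched late, one evenly spread.
late_sample = [0.70, 0.78, 0.85, 0.88, 0.91, 0.93, 0.95, 0.97, 0.98, 0.99]
even_sample = [(i + 0.5) / 10 for i in range(10)]

d_late, p_late = monte_carlo_p_value(late_sample)
d_even, p_even = monte_carlo_p_value(even_sample)
print(d_late, p_late)
print(d_even, p_even)
```

The late-bunched sample gets a large statistic and a tiny estimated p-value, while the evenly spread sample gets the smallest possible statistic for ten points and a p-value near 1.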

If we look at the p-value for a draw size of 1, we see that it is 7.097×10⁻⁴⁰ (!), which means that if the draw times were uniformly distributed, then it is virtually impossible to see our particular draw times!

If we generated a trillion random room draws every nanosecond, it would still take us 1/(7.097×10⁻⁴⁰)/10¹²/10⁹ = 1.4×10¹⁸ seconds on average to find a sample this extreme. Given that the universe is 13.82 billion years (13.82×10⁹×365×24×60×60 = 4.36×10¹⁷ seconds) old, this is over three times the current age of the universe!
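The arithmetic above is easy to check. The snippet below only restates the numbers from the text; the variable names are chosen for the example.

```python
p_value = 7.097e-40                        # p-value quoted for draw size 1
draws_per_second = 1e12 * 1e9              # a trillion draws every nanosecond
seconds_needed = 1 / p_value / draws_per_second

age_of_universe = 13.82e9 * 365 * 24 * 60 * 60   # 13.82 billion years, in seconds
ratio = seconds_needed / age_of_universe

print(f"{seconds_needed:.2e} s needed, {age_of_universe:.2e} s elapsed, ratio {ratio:.2f}")
```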

If we reject at the 5% significance level, then we can reject the hypothesis that draw times are uniformly distributed for draw sizes 1, 2 and 7.

Hence we can conclude that last year’s Room Draw was NOT random, with smaller group sizes penalized and larger draw groups advantaged.

The non-randomness was probably caused by selecting draw groups with weights corresponding to their draw size. For a more in-depth discussion, see https://princetonhousing.github.io/.
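To see why size-weighted selection would produce this pattern, here is a toy simulation. It is a hypothetical model, not Princeton's actual algorithm: it assumes groups are picked one at a time with probability proportional to group size, and compares that with a plain shuffle. The mix of group sizes and all function names are invented for the example.

```python
import random

def weighted_order(sizes, rng):
    """Order groups by repeatedly picking one with probability proportional to its size."""
    remaining = list(range(len(sizes)))
    order = []
    while remaining:
        weights = [sizes[g] for g in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(pick)
        order.append(pick)
    return order

def mean_rank_by_size(sizes, order):
    """Average normalized position (0 = first, 1 = last) for each group size."""
    n = len(order)
    ranks = {}
    for rank, g in enumerate(order):
        ranks.setdefault(sizes[g], []).append(rank / (n - 1))
    return {size: sum(r) / len(r) for size, r in sorted(ranks.items())}

rng = random.Random(0)
sizes = [1] * 150 + [4] * 50 + [8] * 50    # hypothetical mix of group sizes

weighted = mean_rank_by_size(sizes, weighted_order(sizes, rng))

shuffled_order = list(range(len(sizes)))
rng.shuffle(shuffled_order)
uniform = mean_rank_by_size(sizes, shuffled_order)

print("size-weighted:", weighted)
print("plain shuffle:", uniform)
```

Under the size-weighted scheme the large groups end up with much earlier average positions than the solo groups, while under a plain shuffle every size averages close to the middle of the draw.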

How did things change this year?

Again, I’m going to analyze the Senior Upperclass Draw but this time for 2020. The analysis is analogous for any other draw, but we have to be more careful about removing mixed-weight draw groups. The Upperclass Draw is by far the largest in terms of number of students, so it is the easiest example to use.

Figure 4: Swarm (left) and violin (right) plots showing the distribution of draw times per group size

Here the swarm and violin plots look much more uniformly distributed.

Figure 5: Box plot showing the distribution of draw times per group size

The box plots look much more uniformly distributed than before, with no outliers and interquartile ranges that lie much closer to expectation. The median draw time no longer decreases as draw size increases; instead, it hovers around the 500th draw.

Figure 6: Q-Q plots showing the sample vs theoretical quantiles of draw times

In order to truly tell how random our sample was, we need to look at the Q–Q plots. Here we can see that each sample quantile is much closer to its theoretical quantile compared to before, with points closely following the 45-degree red line.

In fact, we can see that students who drew alone received a draw time which is consistent with a uniform distribution. Using the Q–Q plots we can see that no group received a draw time which would be inconsistent with a randomly-generated permutation of draw groups.

We can see that students who drew as a triple received draw times that were slightly better than expected, but this is consistent with randomization and the fact that our sample size is small — not many students drew in groups of 3 and it is natural for some slight variation to occur. This does not suggest that students who drew as a group of three were advantaged in any way.

The Kolmogorov–Smirnov Test

Table 2: K–S test statistics and their associated p-values for each draw size.

Using the K–S test for the 2020 Draw, we cannot reject the hypothesis that draw times are uniformly distributed for any draw size at the 5% significance level. Even the draw size that looks least uniform (size 3) has a p-value above 0.05. Moreover, even if all eight distributions were truly uniform, there would still be a 1–(0.95)⁸ = 33.7% probability that at least one of the p-values falls below 0.05 by pure chance.
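The 33.7% figure comes from a short calculation, sketched here. It assumes the eight tests are independent, which the text implicitly assumes as well; the variable names are chosen for the example.

```python
alpha = 0.05       # significance level of each individual test
num_tests = 8      # one K-S test per draw size, 1 through 8

# Probability that all tests stay above alpha when every null hypothesis is true.
p_no_false_alarm = (1 - alpha) ** num_tests
p_at_least_one = 1 - p_no_false_alarm

print(f"{p_at_least_one:.1%}")
```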

Hence we find no statistical evidence that draw times are affected by draw size.

2. Draw times were correlated across different draws, and also different years:

Last year draw times were correlated across different draws, meaning that students who received unfavorable draw times in one draw (e.g. Upperclass) would also likely receive unfavorable times in other draws (e.g. Independent). Moreover, if the draw group stayed the same, then the order of groups between different draws would also stay identical.

This means that students who received good times in one draw were also likely to receive good times in other draws. Furthermore, this also happened in the 2018 Draw, and the order of groups who stayed the same between 2018 and 2019 also did not change.

Figure 7: Scatter plot showing the correlation of draw positions in 2018 and 2019

This plot shows the draw positions of students who were present in both the 2018 and 2019 draw. Every point is a unique student. A draw position of 1000 means that the student is the 1000th student to draw, and lower numbers are better. We see a large cluster of students in the bottom-right quadrant since these represent the Juniors in 2018, who became Seniors in 2019, meaning they switched from being in the later half of the draw to the earlier half. The small number of points in the top-right and bottom-left corners represent students who repeated Junior and Senior years respectively.

Notice the thick straight line in the bottom-right quadrant: it represents all the students who stayed in exactly the same order between the 2018 and 2019 draws.

How random is this?

We can measure the correlation between the two draws through Spearman’s rank correlation coefficient (Spearman’s ρ), which measures rank correlation, that is, whether the orderings of the two draws are correlated.
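Spearman's ρ is the ordinary (Pearson) correlation applied to ranks instead of raw values. The sketch below spells that out with the standard library only; the helper names and the four-student example are made up. In practice `scipy.stats.spearmanr` returns both the coefficient and its p-value.

```python
def ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Made-up draw positions for four students in two consecutive years.
same_order = spearman_rho([10, 20, 30, 40], [105, 230, 388, 412])
reversed_order = spearman_rho([10, 20, 30, 40], [412, 388, 230, 105])
print(same_order, reversed_order)
```

Only the ordering matters: the positions in the second year are very different numbers, yet ρ is 1 when the order is preserved and -1 when it is reversed.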

Table 3: Spearman’s rank correlation coefficient and associated p-value for 2018 and 2019 draws

The Spearman correlation coefficient is 0.459, with a p-value of 6.04×10⁻⁴⁵ (!!!), which is even smaller than the p-value for draw times being related to draw size. This means that if draw times were random, the probability of seeing a correlation at least this strong would be about 6.04×10⁻⁴⁵, so we can statistically reject the null hypothesis that ρ = 0 (no correlation).

This tells us that there is no way that the 2018 draw times are not correlated to the 2019 draw times.

How did things change this year?

Figure 8: Scatter plot showing the correlation of draw positions in 2019 and 2020

If we plot the 2019 draw times and compare them to the 2020 draw times, we see that the thick straight line completely disappears. But is there still a chance that the draw times are still correlated?

Table 4: Spearman’s rank correlation coefficient and associated p-value for 2019 and 2020 draws

If we look at the correlation between the draw times, we see that Spearman’s ρ = –0.013, which is very close to zero, giving us a p-value of 0.69. This means that we cannot reject that ρ = 0 at the 5% significance level, which says that we cannot statistically conclude that the draw was non-random.
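One way to build intuition for these p-values is a permutation test: shuffle one year's order many times and count how often the shuffled ρ is at least as far from zero as the observed one. This is a swapped-in technique for illustration, not how the p-values in the tables were computed. The sketch assumes rankings without ties, so it can use the shortcut formula for ρ; the function names and both example rankings are hypothetical.

```python
import random

def rho_no_ties(rank_a, rank_b):
    """Spearman's rho for two tie-free rankings: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))

def permutation_p_value(rank_a, rank_b, trials=5_000, seed=0):
    """Two-sided p-value: share of random reorderings with |rho| >= the observed |rho|."""
    rng = random.Random(seed)
    observed = abs(rho_no_ties(rank_a, rank_b))
    shuffled = list(rank_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(rho_no_ties(rank_a, shuffled)) >= observed:
            hits += 1
    return hits / trials

n = 60
year_one = list(range(1, n + 1))

# Hypothetical second year that keeps most of the first year's order.
mostly_kept = year_one[:50] + year_one[50:][::-1]
# Hypothetical second year that zigzags (1, 60, 2, 59, ...), giving rho close to zero.
zigzag = [(i + 1) // 2 if i % 2 else n + 1 - i // 2 for i in year_one]

p_kept = permutation_p_value(year_one, mostly_kept)
p_zigzag = permutation_p_value(year_one, zigzag)
print(rho_no_ties(year_one, mostly_kept), p_kept)
print(rho_no_ties(year_one, zigzag), p_zigzag)
```

The mostly preserved order gives ρ close to 1 and an estimated p-value near zero, as in the 2018 to 2019 comparison, while the near-zero ρ gives a large p-value, as in the 2019 to 2020 comparison.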

Correlation between different draws of the same year

Figure 9: Scatter plot showing the correlation between the Upperclass and Independent Draws in 2019

Last year there was also significant correlation between different draws of the same year. The above scatter plot shows the draw times of Seniors who participated in both the Upperclass and Independent draws.

Table 5: Spearman’s rank correlation coefficient and associated p-value for Upperclass and Independent Draws, 2019

Again, the results show that this cannot be attributed to pure chance. Significant correlations also occur between all other draws, including Spelman and Residential College draws, but I’ve decided to display only the Upperclass vs. Independent draws since these had the largest number of students and it was also relatively easy to control for different weights.

Figure 10: Scatter plot showing the correlation between the Upperclass and Independent Draws in 2020

The same plot for 2020 shows a much more randomized draw order. The dark blue clusters represent students who stayed in the same draw group across both draws — observe how these aren’t in a straight line, in contrast to before.

Table 6: Spearman’s rank correlation coefficient and associated p-value for Upperclass and Independent Draws, 2020

Now we see that we cannot statistically reject that there is zero correlation between the draw times. Hence we can confirm the draw times are indeed randomized this year.

The Average Draw Size

While conducting the analysis I noticed something quite significant — more students were choosing to draw in larger-sized draw groups. This could be a potential response to the phenomenon of larger draw groups receiving earlier times in previous draws.

Figure 11: Histogram showing number of groups for each draw group size in 2018

In 2018, we see that a large number of students chose to draw alone, with not many students choosing groups of size 5 or more.

Figure 12: Histogram showing number of groups for each draw group size in 2019

In 2019, we saw that most groups (for Seniors in the Upperclass Draw) were still fairly small in size, with the largest groups of size 7 and 8 being similar in count to the next-largest groups of size 5 and 6. The biggest change from 2018 is the number of people drawing alone, which decreased significantly, making more draw groups of sizes 5–8.

We know from our previous analysis that the 2018 draw was biased, but this information wasn’t publicly revealed until after the 2019 draw results had already been announced. Perhaps certain students realized the discrepancy between draw group size and draw time, and decided to draw in a large group instead of alone.

Figure 13: Histogram showing number of groups for each draw group size in 2020

In 2020, we can see that the number of groups of size 7 and 8 doubled in count compared to the previous year, with the number of groups of sizes 2–6 all dropping. This is consistent with all the Tiger Confessions and res-college listserv posts asking for people to join their draw groups.

The number of people drawing alone stayed around the same, which makes sense because people who want to draw alone would rather draw alone, while people who were already in draw groups would want their group to increase in size. Hence we see a shift from groups of size 2–6 to groups of size 7–8.

Indeed, the average draw group size (across groups) has increased from 2.60 to 2.84 to 3.12 while the average draw group size (across students) has increased from 4.25 to 4.45 to 5.11. However, just comparing the average does not tell us too much, and we need to look at the full distribution.
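The two averages differ because the per-student average counts each group once per member, so large groups weigh more. A minimal sketch, with a made-up list of group sizes and a hypothetical function name:

```python
def average_group_sizes(group_sizes):
    """Return (mean size across groups, mean size across students)."""
    num_groups = len(group_sizes)
    num_students = sum(group_sizes)
    across_groups = num_students / num_groups
    # A group of size s holds s students who each belong to a group of size s,
    # so it contributes s * s to the per-student total.
    across_students = sum(s * s for s in group_sizes) / num_students
    return across_groups, across_students

sizes = [1, 1, 1, 1, 2, 2, 4, 8]    # made-up draw: 8 groups, 20 students
across_groups, across_students = average_group_sizes(sizes)
print(across_groups, across_students)
```

For this toy draw the average is 2.5 across groups but 4.6 across students, the same kind of gap as between 3.12 and 5.11 in 2020.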

The histograms above showed the distribution of draw groups based on the total number of groups, but now let’s look at the distribution of draw groups based on the total number of students.

Figure 14: Stacked horizontal bar plot showing the percentage of students in each group size in 2018

Figure 15: Stacked horizontal bar plot showing the percentage of students in each group size in 2019

Figure 16: Stacked horizontal bar plot showing the percentage of students in each group size in 2020

In this stacked horizontal bar plot, each colored region represents the proportion of students who are in a draw group of that size. For example, Figure 15 shows that in 2019, 14% of Seniors chose to draw alone, 12% of Seniors were in a draw group of two, 16% of Seniors were in draw groups of size 8 and so on.
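The percentages in such a plot count students rather than groups. A short sketch of the conversion, using a made-up list of group sizes and a hypothetical function name:

```python
from collections import Counter

def student_share_by_size(group_sizes):
    """Percentage of students (not groups) that belong to each group size."""
    groups_per_size = Counter(group_sizes)
    total_students = sum(group_sizes)
    return {
        size: 100 * size * count / total_students
        for size, count in sorted(groups_per_size.items())
    }

sizes = [1, 1, 1, 1, 2, 2, 4, 8]    # made-up draw: 8 groups, 20 students
shares = student_share_by_size(sizes)
print(shares)
```

Half of the groups in this toy draw are solo, yet they account for only 20% of students, while the single group of eight accounts for 40%.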

Comparing the plots above, we can see that the number of students in draw groups of size above 7 almost doubled from 2019 to 2020. The share of students in draw groups of 4 or less decreased from 60% in 2018 to just over 40% in 2020. This year, more than half of all Seniors in the Upperclass Draw are in draw groups of 6 or more.

Closing Remarks

To save space, the analysis I have performed mostly focuses on the Upperclass and Independent draws, but it is analogous for other years and other draws. If folks are interested, I can post the code and a tutorial so that you can perform the analysis yourself.

This analysis was entirely performed in Python, using Jupyter, numpy, pandas, matplotlib, scipy and seaborn.

Let me know if anyone wants any additional analysis, or more plots — in the meantime, stay safe and practice social distancing!

Last year’s code can be found at: https://princetonhousing.github.io/

Connect with me:

Email: yangsong@alumni.princeton.edu
Facebook: https://www.facebook.com/yangsong20
LinkedIn: https://www.linkedin.com/in/yangsong97/

Written by Yang Song