Hypothesis Testing: Are university towns more resilient to recession than non-university towns?

Nathan W. Doctor
Published in Analytics Vidhya
Jun 20, 2020 · 4 min read

For an early project, I used Python to examine whether university towns are more resilient to economic downturns than non-university towns. More specifically, I asked: are housing prices in university towns less affected by recession?

To start, a university town is a city where university students make up a high percentage of the total population.

The hypothesis is that housing prices in such cities should be less affected by recession, mostly because similar numbers of students, staff, and other workers connected to university life can be expected to live there regardless of the economic outlook.

To get a list of university towns, I simply used Wikipedia, which maintains a list of college towns in the United States. For a spreadsheet on housing prices across the United States, I used Zillow, which included data from 1996–2020 in City_Zhvi_AllHomes.csv. And lastly, I used the U.S. Department of Commerce, Bureau of Economic Analysis (BEA) to figure out exactly when the ‘Great Recession’ of 2007–2009 started and when it reached its bottom, i.e., the quarter within the recession that had the lowest GDP. This was necessary because I wanted to compare housing prices at the start of the recession to prices at the bottom.

To match the format of the list of university towns from Wikipedia to the list of all cities on Zillow, I needed to clean the text file derived from Wikipedia a bit.

Not the most elegant solution here, but at least it works.
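
For reference, a cleanup along these lines could look like the minimal sketch below. It assumes the Wikipedia text puts each state on its own line ending in "[edit]", followed by one town per line with the university name in parentheses and optional footnote markers such as "[1]". The function name parse_university_towns and the sample text are hypothetical and only illustrate the idea.

```python
import re

def parse_university_towns(text):
    """Return (state, town) pairs from a Wikipedia-style plain-text list.

    Assumed layout: a state line ends with "[edit]"; every following line
    is a town, optionally followed by " (University ...)" and footnotes.
    """
    pairs = []
    state = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("[edit]"):
            # A new state header: remember it for the towns that follow.
            state = line[: -len("[edit]")].strip()
        else:
            # Keep the text before the first " (" and drop markers like "[1]".
            town = re.sub(r"\[.*?\]", "", line.split(" (")[0]).strip()
            pairs.append((state, town))
    return pairs

# Hypothetical sample in the assumed layout.
sample = """Alabama[edit]
Auburn (Auburn University)[1]
Florence (University of North Alabama)
Alaska[edit]
Fairbanks (University of Alaska Fairbanks)[2]
"""
towns = parse_university_towns(sample)
```

The resulting (state, town) pairs can then be matched against the state and city columns of the housing data.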

Next, to get the start of the recession, let’s load data from the BEA and find the recession’s start. A recession is defined as starting with two consecutive quarters of GDP decline, and ending with two consecutive quarters of GDP growth.

As you can see, we’ll need to clean this dataframe a bit.
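
Once the GDP series is cleaned, the recession definition given earlier can be sketched in plain Python. This is only an illustration: the function name find_recession is hypothetical, the GDP figures are made up, and two conventions are assumptions on my part, namely that the recession start is the first of the two declining quarters and that the end is the second of the two growing quarters.

```python
def find_recession(quarters, gdp):
    """Return (start, bottom, end) quarter labels, or None if none is found.

    Assumed conventions: the start is the first of two consecutive
    declining quarters, the end is the second of two consecutive growing
    quarters, and the bottom is the lowest-GDP quarter in between.
    """
    n = len(gdp)
    start = None
    for i in range(1, n - 1):
        # Quarter i declines, and so does quarter i + 1.
        if gdp[i] < gdp[i - 1] and gdp[i + 1] < gdp[i]:
            start = i
            break
    if start is None:
        return None

    end = None
    for j in range(start + 2, n - 1):
        # Quarter j grows, and so does quarter j + 1.
        if gdp[j] > gdp[j - 1] and gdp[j + 1] > gdp[j]:
            end = j + 1
            break
    if end is None:
        return None

    bottom = min(range(start, end + 1), key=lambda k: gdp[k])
    return quarters[start], quarters[bottom], quarters[end]

# Made-up numbers purely for illustration, not BEA data.
labels = ["q1", "q2", "q3", "q4", "q5", "q6", "q7", "q8"]
values = [100, 101, 99, 97, 95, 96, 98, 99]
recession = find_recession(labels, values)
```

With the real BEA series, the labels would be the quarter names from the cleaned dataframe and the values the corresponding GDP column.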

Now, let’s convert the housing data from Zillow to quarters.
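
One way to do this conversion is sketched below. It assumes the price columns are monthly and labelled like "2008-01", that the town identifiers have already been moved into the index, and that a quarter's value is the mean of its three months. The function name monthly_to_quarterly and the sample prices are hypothetical.

```python
import pandas as pd

def monthly_to_quarterly(prices):
    """Average monthly columns labelled 'YYYY-MM' into columns like '2008q1'."""
    months = pd.to_datetime(list(prices.columns), format="%Y-%m")
    labels = [f"{d.year}q{(d.month - 1) // 3 + 1}" for d in months]
    by_month = prices.T       # rows are now months, columns are towns
    by_month.index = labels   # relabel each month with its quarter
    # Average the rows that share a quarter label, then flip back.
    return by_month.groupby(level=0).mean().T

# Made-up prices purely for illustration.
monthly = pd.DataFrame(
    [[100.0, 110.0, 120.0, 130.0, 140.0, 150.0],
     [200.0, 200.0, 200.0, 230.0, 200.0, 200.0]],
    index=["Town A", "Town B"],
    columns=["2008-01", "2008-02", "2008-03", "2008-04", "2008-05", "2008-06"],
)
quarterly = monthly_to_quarterly(monthly)
```

Averaging is a choice rather than a requirement; taking the last month of each quarter would be another reasonable convention.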

Next, we’ll create new data showing the decline or growth of housing prices between the recession start and the recession bottom.
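
A minimal sketch of this step follows, assuming the change is expressed as a price ratio (price at the recession start divided by price at the bottom), so that a lower ratio means a smaller loss. The function name price_ratio, the column labels, and the prices are hypothetical.

```python
import pandas as pd

def price_ratio(quarterly, start, bottom):
    """Price at the recession start divided by price at the bottom, per town.

    A ratio above 1 means prices fell between the two quarters; the closer
    the ratio is to 1, the smaller the loss.
    """
    return quarterly[start] / quarterly[bottom]

# Made-up prices; "q_start" and "q_bottom" stand in for the real quarter labels.
sample_prices = pd.DataFrame(
    {"q_start": [200.0, 330.0], "q_bottom": [160.0, 300.0]},
    index=["Town A", "Town B"],
)
ratios = price_ratio(sample_prices, "q_start", "q_bottom")
```

In this made-up example, Town A lost more value (ratio 1.25) than Town B (ratio 1.1).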

And finally, we’ll run a t-test comparing the university town values to the non-university town values, returning whether the null hypothesis (that the two groups are the same) can be rejected, along with the p-value of the test.

The function will return the tuple (different, p, better), where different=True if the t-test gives p<0.01 (we reject the null hypothesis) and different=False otherwise (we cannot reject the null hypothesis). The variable p should be equal to the exact p-value returned from scipy.stats.ttest_ind(). The value for better should be either “university town” or “non-university town”, depending on which has the lower mean price ratio (which is equivalent to a reduced market loss).
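
A function with this return value could be sketched as follows. It assumes the price ratios for each group are passed in as plain sequences with missing values already removed; the function name run_ttest, its parameters, and the sample ratios are hypothetical. Only scipy.stats.ttest_ind() comes from the description above.

```python
from scipy import stats

def run_ttest(uni_ratios, non_uni_ratios, alpha=0.01):
    """Return (different, p, better) for two groups of price ratios."""
    # Two-sample t-test; the second element of the result is the p-value.
    _, p = stats.ttest_ind(uni_ratios, non_uni_ratios)
    different = bool(p < alpha)

    uni_mean = sum(uni_ratios) / len(uni_ratios)
    non_uni_mean = sum(non_uni_ratios) / len(non_uni_ratios)
    # A lower mean ratio means a smaller loss during the recession.
    better = "university town" if uni_mean < non_uni_mean else "non-university town"
    return different, float(p), better

# Made-up ratios purely for illustration.
uni = [1.01, 1.02, 1.00, 1.03, 1.01]
non_uni = [1.10, 1.12, 1.09, 1.11, 1.13]
different, p, better = run_ttest(uni, non_uni)
```

The threshold is exposed as alpha so the 0.01 cutoff can be changed without touching the function body.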

As we can see, there is a difference between the mean price ratios of university towns and non-university towns. As the p-value is less than 0.01, we can reject the null hypothesis (that there is no significant difference between university towns and non-university towns). In other words, there is a difference between the two groups, and university towns are, indeed, less affected by recession.
