A guy just transcribed 30 years of for-rent ads. Here’s what it taught us about housing prices
Michael Andersen

I’m afraid that the entire analysis of this article (along with the underlying analysis from Fischer) is fundamentally flawed, because it interprets Fischer’s *predictive* statistical relationships as *causal* when it is far from clear that such an interpretation is warranted. All that Fischer has shown is that his three independent variables (housing units, total wages, and total employment) *predict* the dependent variable of rent. (To use econometric parlance, Fischer has estimated the best linear predictor of the dependent variable from the independent variables.) Correlation is not causation; Fischer has not shown any evidence that his independent variables actually *cause* the dependent variable. Indeed, *reverse causality* seems to be a highly plausible counter-explanation for at least some of the predictive relationship he found. For example, perhaps higher rent is actually causing higher total wages and total employment by spurring formerly unemployed people (e.g. housewives, teenagers) to enter the workforce, and employed people to work longer hours and earn more, so that they can afford the higher rent.

Alternatively, perhaps a heretofore unnamed variable is actually driving one or more of the independent variables as well as the dependent variable. As a case in point, I am sure that one could find a correlation between the total amount of ice cream consumed on a given day and the total incidents of sunburn on that same day, but that obviously doesn’t mean that ice cream actually *causes* sunburn, such that reducing ice cream consumption would actually reduce sunburn. Rather, the daily weather (namely, whether a day is hot and sunny) is causing both ice cream consumption and sunburn incidence. Similarly, changing cultural trends (e.g. the rise of double-income households) might be causing both increased employment and wages along with higher rent.
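To make the ice cream point concrete, here is a minimal simulation sketch. Every number in it is invented for illustration (the share of sunny days, the consumption and sunburn levels, the noise), and the helper names `pearson` and `simulate_days` are hypothetical. Ice cream has no effect on sunburn anywhere in the code, yet the two are strongly correlated until you look within sunny days and within cloudy days separately.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_days(n_days=5000, seed=0):
    """Weather drives both series; ice cream never enters the sunburn formula."""
    rng = random.Random(seed)
    sunny = [rng.random() < 0.5 for _ in range(n_days)]
    # Assumed, made-up magnitudes: sunny days raise both quantities.
    ice_cream = [(50 if s else 10) + rng.gauss(0, 5) for s in sunny]
    sunburn = [(20 if s else 2) + rng.gauss(0, 3) for s in sunny]
    return sunny, ice_cream, sunburn


sunny, ice_cream, sunburn = simulate_days()

# Pooled over all days, the two series move together.
overall = pearson(ice_cream, sunburn)

# Holding the weather fixed, the relationship should largely vanish.
sunny_only = pearson(
    [i for i, s in zip(ice_cream, sunny) if s],
    [b for b, s in zip(sunburn, sunny) if s],
)
cloudy_only = pearson(
    [i for i, s in zip(ice_cream, sunny) if not s],
    [b for b, s in zip(sunburn, sunny) if not s],
)

print(f"all days:    {overall:.2f}")
print(f"sunny days:  {sunny_only:.2f}")
print(f"cloudy days: {cloudy_only:.2f}")
```

A regression of sunburn on ice cream would predict well here, which is exactly the problem: good prediction says nothing about what happens if you intervene on ice cream.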

Ascertaining the causal effect of any of the independent variables upon the dependent variable requires analyzing the effect of *exogenous* variation in the independent variable in question upon the dependent variable. For example, regarding the housing units independent variable, perhaps one might analyze instances where a court surprisingly declared some apartment buildings to be condemned, thereby removing the corresponding housing units from the market, but then an appellate court subsequently overruled the initial court, hence restoring those units to the market. One might then plausibly analyze that induced exogenous variation in housing units to estimate its effect upon rent. Similar exogenous shocks to the system might be found that induce variation in the other independent variables. For example, perhaps a surprise court ruling declares that certain SF city workers, many of whom live in SF, are owed an immediate bonus payout. Again, one might then leverage that to plausibly estimate the causal effect of total wages upon rent.
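Here is a rough sketch of why the exogenous shock matters, again on entirely invented data. I am assuming a hidden "demand" variable that raises both construction and rent, a true effect of housing units on rent of -2 (the constant `TRUE_EFFECT`, which I made up), and a surprise court ruling that removes units independently of demand. The estimator used is the simple instrumental-variable (Wald) ratio, which is one standard way to exploit such a shock, not necessarily the only one; the function names are hypothetical.

```python
import random

# Assumed for illustration: each extra housing unit lowers rent by 2.
TRUE_EFFECT = -2.0


def covariance(xs, ys):
    """Population covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def simulate_markets(n=50000, seed=1):
    """Hidden demand pushes up both units and rent; the ruling is a coin flip."""
    rng = random.Random(seed)
    ruling, units, rent = [], [], []
    for _ in range(n):
        demand = rng.gauss(0, 1)            # unobserved confounder
        z = 1 if rng.random() < 0.5 else 0  # surprise ruling, independent of demand
        u = 100 + 2 * demand - 5 * z + rng.gauss(0, 1)
        r = 1000 + 30 * demand + TRUE_EFFECT * u + rng.gauss(0, 5)
        ruling.append(z)
        units.append(u)
        rent.append(r)
    return ruling, units, rent


ruling, units, rent = simulate_markets()

# Naive regression slope of rent on units: contaminated by hidden demand.
naive_slope = covariance(units, rent) / covariance(units, units)

# Wald / instrumental-variable estimate: uses only the variation in units
# that the ruling induced, which is unrelated to demand by construction.
iv_estimate = covariance(ruling, rent) / covariance(ruling, units)

print(f"true effect:  {TRUE_EFFECT:.2f}")
print(f"naive slope:  {naive_slope:.2f}")
print(f"IV estimate:  {iv_estimate:.2f}")
```

In this toy setup the naive slope even comes out with the wrong sign (more units appear to go with higher rent), while the ruling-based estimate lands near the assumed true effect. Real data would of course require arguing that the shock really is unrelated to everything else that moves rent.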

But without conducting that type of rigorous analysis, all we have is a purely *predictive* regression with no clear causal interpretation.