This tweet from Russ Roberts on the questionable fashion of drawing causal links between variables that clearly have no direct relationship with one another made me think about how I struggle to interpret econometric studies. On the one hand, econometrics is a form of hard evidence, and to say that it has no value is simply dogmatic anti-empiricism. On the other hand, it is an undoubtedly flawed paradigm. Even putting aside the myriad technical issues with misspecification, and how these can yield results that are completely wrong, seeing econometric research in practice has made me skeptical of the results it produces.
Reading an applied econometrics paper could leave you with the impression that the economist (or any social science researcher) first formulated a theory, then built an empirical test based on the theory, then tested it. But in my experience what generally happens is more like the opposite: with some loose ideas in mind, the econometrician runs a lot of different regressions until they get something that looks plausible, then tries to fit it into a theory (existing or new). As Roberts himself has pointed out, statistical theory tells us that if you do this for long enough, you will eventually find something plausible by pure chance!
This is bad news because, as tempting as that final, pristine-looking causal effect is, readers have no way of knowing how it was arrived at. There are several safeguards I’ve seen proposed against this:
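To see how easily a specification search can manufacture a “result”, here is a minimal simulation (illustrative only, plain NumPy, not from any paper discussed here): regress a pure-noise outcome on 200 pure-noise candidate predictors, one at a time, and count how many clear the conventional 5% significance threshold by chance alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 200                        # 100 observations, 200 candidate regressors
y = rng.standard_normal(n)             # outcome is pure noise
X = rng.standard_normal((n, k))        # every predictor is pure noise too

def t_stat(x, y):
    """t-statistic for the slope in a simple OLS regression of y on x."""
    xc = x - x.mean()
    yc = y - y.mean()
    beta = (xc @ yc) / (xc @ xc)
    resid = yc - beta * xc
    sigma2 = (resid @ resid) / (len(y) - 2)   # residual variance
    return beta / np.sqrt(sigma2 / (xc @ xc))

t = np.array([t_stat(X[:, j], y) for j in range(k)])
hits = int((np.abs(t) > 1.96).sum())   # "significant at 5%" by chance
print(f"{hits} of {k} noise regressors look significant")
```

By construction no predictor has any relationship with the outcome, yet roughly 5% of them — around ten here — will typically clear the bar, and any one of them could be written up as a finding if the other 190 regressions go unreported.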
(1) Use a multitude of empirical specifications to test the robustness of the causal links, and pick the one with the best predictive power, similar to this paper on climate and hunger in the Journal of Applied Statistics. (Note also that linear regression comes out as one of the best predictors there, suggesting that it’s often not unreasonable to rely on the Central Limit Theorem to justify linear regression. But this is no excuse for not testing linearity at all.)
(2) As the journal Comparative Political Studies recently tried, have researchers submit their paper for peer review before they carry out the empirical work, detailing the theory they want to test, why it matters and how they’re going to do it. Reasons for inevitable deviations from the research plan should be explained clearly in an appendix by the authors and (re-)approved by referees.
(3) Insist that the paper be replicated. First, by having the authors submit their data and code so that referees can check whether the results can be reproduced (think this is a low bar? Most empirical research in ‘top’ economics journals can’t even manage it). Second, in the truer sense of replication, by waiting until someone else, with another dataset or method, reaches at least qualitatively similar findings. The latter might be too much to ask for every paper, but it is a good standard to keep in mind as a reader before you are convinced by a finding.
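Point (1) can be sketched concretely. The toy example below (illustrative, not from the paper cited above) compares two candidate specifications by out-of-sample predictive power using k-fold cross-validation, on synthetic data whose true relationship is nonlinear. The specification with the squared term wins on held-out error, which is exactly the kind of linearity check argued for above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(-2, 2, n)
y = x**2 + rng.standard_normal(n)      # the true relationship is nonlinear

def cv_mse(design, y, folds=5):
    """Out-of-sample mean squared error of an OLS fit, via k-fold CV."""
    idx = np.arange(len(y))
    errs = []
    for f in range(folds):
        test = idx % folds == f
        train = ~test
        beta, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        errs.append(np.mean((y[test] - design[test] @ beta) ** 2))
    return float(np.mean(errs))

linear = np.column_stack([np.ones(n), x])          # y ~ a + b*x
quadratic = np.column_stack([np.ones(n), x, x**2]) # y ~ a + b*x + c*x^2
print("linear CV MSE:   ", cv_mse(linear, y))
print("quadratic CV MSE:", cv_mse(quadratic, y))
```

The linear specification fits the training data without complaint but predicts held-out observations badly; ranking specifications by predictive power, rather than by which one delivers the desired coefficient, is the point.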
All three of these should, in my opinion, be a prerequisite for research that uses econometrics (and probably statistics more generally, though I do not know the ins and outs of how statistics is used in other disciplines). But for now I think a reasonable rule for skeptics of econometrics — one that won’t result in a fingers-in-your-ears dismissal of all empirical studies — is to demand that empirical findings fulfil at least one of these criteria.
Naturally, this would result in a lot more null findings and probably a lot less research. Perhaps it would also mean fewer papers that attempt to tell the entire story: that is, papers which go all the way from building a new model to finding (surprise!) that even the most rigorous empirical methods support it.
Both of these would be good things, but the latter especially might move economics towards being a subject where, to paraphrase Claud Alexander, economists are happier just to sit back and gather data, making tentative observations and slowly building towards more comprehensive theories. Because to me the whole ‘credibility revolution’ in economics, right down to the attempt to create controlled experiments, just seems like an attempt to run before we can walk.