Know What a P-Value Is (and What It Isn’t)

I don’t know Frank Harrell personally, but he is one of my favorite statisticians. I am a big fan of his book on regression modeling strategies (even if it has been a long time since I read it; I should open it up again soon) and of the fact that he is an active contributor on Stack Exchange, the only famous statistician I see there on a regular basis. His perspectives on p-values and Bayes are worth a close read.

One interesting thing that comes out of his discussion is the inherent linkage between p-values and Bayes’ Rule. The p-value emerges from the conditional distribution of the estimator, assuming that the parameter being tested is precisely equal to 0. Harrell’s observation that “In reality, there is unlikely to exist a treatment that has exactly zero effect” is exactly on the mark. So what we really want to know is something more along the lines of whether the parameter takes a value too small to be interesting: given the model, and the distribution of the noise associated with it, what is the probability that the parameter falls within some range of interest, or outside of it? This is not, in itself, an onerous demand: this sort of exercise is routinely assigned in intro stats courses. I did some when I was young and I gave some when I taught.
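To make that contrast concrete, here is a minimal sketch under assumptions of my own choosing: a normal likelihood with a known standard error, a normal prior on the effect, and a made-up cutoff delta for a “less than interesting” effect. None of the numbers come from a real analysis.

```python
import numpy as np
from scipy import stats

# Assumed setup: an observed effect estimate with a known standard error,
# and a normal prior on the true effect theta (all numbers are illustrative).
est, se = 0.30, 0.15           # estimate and its standard error
prior_mu, prior_sd = 0.0, 1.0  # prior on theta

# Classical two-sided p-value against the point null theta == 0.
p_value = 2 * stats.norm.sf(abs(est / se))

# Normal-normal conjugate update: posterior for theta given the estimate.
post_var = 1 / (1 / prior_sd**2 + 1 / se**2)
post_mu = post_var * (prior_mu / prior_sd**2 + est / se**2)
post_sd = np.sqrt(post_var)

# The question we actually care about: the probability that theta exceeds
# some smallest interesting effect size, here delta = 0.1 (an assumption).
delta = 0.1
p_interesting = stats.norm.sf(delta, loc=post_mu, scale=post_sd)

print(f"p-value vs theta == 0:            {p_value:.3f}")
print(f"P(theta > {delta} | data, prior): {p_interesting:.3f}")
```

The two printed numbers answer different questions: the first conditions on a point null that is almost never literally true, while the second directly states how probable an effect of interesting size is under the stated model and prior.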

Perhaps this is a bit harder to implement when the question is a research problem. Relatively few people try to formally incorporate noise into their models, even when it is theoretically possible, and the resulting empirical evaluations can involve a form of shotgun marriage: empirical tests taken from stats formulas that do not neatly match the description of the model being evaluated. I used to be puzzled by this as a naive undergraduate, but having learned what goes into even half-baked models tied to statistical tests, I have come to understand, even appreciate, weaker-than-desirable empirical tests, though never fully accepting them on principle. Still, even with the additional complexities, this seems like a good idea, and it is increasingly feasible now that the computational cost of simulation keeps shrinking. You can simulate what should take place under your model for varying values of the parameters of interest, and use the results as the basis for statistical tests fairly easily, IF you have enough of a skeleton of your model sketched out to support a meaningful structure. Something logically analogous to a p-value can then be computed, but without the dumb logical straitjacket that the parameter is exactly 0, as in the sketch below. It could even elicit some useful insights that come from taking the noise more seriously.
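Here is one hedged way such a computation might look: simulate the distribution of the statistic at each parameter value across the “uninteresting” range, and report the worst-case tail probability, an interval-null analogue of the p-value. The simulator, the observed statistic, and the ±0.1 uninteresting range are all placeholders standing in for whatever your model’s skeleton actually supports.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: the observed statistic is the sample mean of n noisy
# draws around the true parameter theta. Swap in your own simulator here.
def simulate_statistic(theta, n=50, noise_sd=1.0, n_sims=10_000):
    draws = rng.normal(theta, noise_sd, size=(n_sims, n))
    return draws.mean(axis=1)

observed = 0.35                              # the statistic actually observed
uninteresting = np.linspace(-0.1, 0.1, 21)   # "less than interesting" values

# For each theta in the uninteresting range, how often would simulation
# produce a statistic at least as extreme as the one observed?
tail_probs = [
    np.mean(np.abs(simulate_statistic(theta)) >= abs(observed))
    for theta in uninteresting
]

# Analogue of a p-value against the interval null: the worst case over
# the whole uninteresting range, rather than only at theta == 0.
print(f"max tail probability over the interval: {max(tail_probs):.3f}")
```

Taking the maximum over the interval is the conservative choice: if even the least favorable uninteresting parameter value rarely produces a statistic this extreme, the data are hard to reconcile with the entire uninteresting range, not merely with the point 0.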

P.S. Another worthwhile takeaway is the following:

All probabilities are conditional on something, and to be useful they must condition on the right thing. This usually means that what is conditioned upon must be knowable.

Not just probabilities, but all statistical relationships, distributions, and so on. In some cases, these conditional probabilities reveal information about “causality,” or, more accurately, the “necessary condition” part of a conditional relationship (with a much weaker definition of “necessary” than usual: necessary as a matter of degree, rather than as a dichotomous condition). In many other cases, the conditioning captures something that is not quite “causality” but is nevertheless interesting and potentially more useful.
