Another blog post about why Bayesian Statistical Analysis rocks (but really its application is pretty confusing)

Anders Orn
Published in Human Systems Data
Apr 12, 2017 · 4 min read

The hot, up-and-coming fad in the world of statistics, psychology, and research in general is Bayesian analysis. As more and more people become familiar with the pitfalls of the Frequentist approach to statistics, the ideas of Thomas Bayes are gaining traction as a robust alternative for testing hypotheses. Just as with any other paradigm shift (yes, I believe Bayesian inference will eventually overcome the stranglehold that Null Hypothesis Significance Testing has over the academic community), there are some who are trying to catalyze the propagation of the movement.

One of these people is a noble scholar named John K. Kruschke, a professor of Psychological and Brain Sciences at Indiana University (see Dr. Kruschke’s website here: http://www.indiana.edu/~kruschke/DoingBayesianDataAnalysis/). Kruschke is openly outspoken in his partiality towards Bayesian analysis, and he has produced many videos, papers, and blog posts in support of the cause. Despite this, I found his 2010 article, “What to believe: Bayesian methods for data analysis,” particularly frustrating to read; the author obviously knows his stuff, but as someone who is only tangentially familiar with Bayesian statistics, I was admittedly lost throughout my first read-through. I feel as though the author really could have benefitted from some scientific writing training…

After rereading a few parts and consulting an alternative source (stata.com’s Introduction to Bayesian analysis), I can at least glean some of the basics:

The fact that Bayesian analysis provides a genuinely adequate alternative to Null Hypothesis Significance Testing is a huge draw toward implementing its models of data analysis. The entire first half of Kruschke’s paper discusses the inadequacies of NHST: p-values are practically worthless, replication probabilities and power analyses are little better, and the whole framework is entirely too rigid. Bayesian analysis is the knight in shining armor galloping toward us from the horizon, in the sunset, in slow motion.

Admittedly, Kruschke did lose me a bit in the second half of his paper. I know he talks an awful lot about how awesome Bayesian analysis is, but that’s not a very helpful statement for learning, now is it? Let us persevere!

From what I understand, here are the basics:

Bayesian analysis requires that we build a posterior distribution from a combination of prior knowledge on the subject and observed evidence. The first part of this is actually pretty similar to NHST; the Frequentist approach uses literature reviews to generate hypotheses. The difference here, however, is that while the hypothesis in NHST does not change over the course of a given study (and is not expressed as a concrete probability), the posterior is a probability distribution that continues to change as new evidence comes in. The Bayesian researcher can, and will, adjust the model as they see fit as the data points come in. I think this all makes sense to me; I know it’s very surface level, though.
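
To make that updating idea concrete for myself, I put together a toy sketch in Python. This is entirely my own example, not something from Kruschke’s paper; the Beta-Binomial model is just the simplest conjugate case I could find, and every number in it is made up:

```python
# A toy sketch (my own, not Kruschke's) of a posterior updating as data arrive.
# The unknown quantity is a proportion; the prior is a Beta distribution, and each
# new batch of observations turns the current posterior into the prior for the next.
from scipy import stats

alpha, beta = 1.0, 1.0                # Beta(1, 1): a flat, "know nothing" prior

batches = [(3, 7), (5, 5), (8, 2)]    # made-up (successes, failures) per batch

for successes, failures in batches:
    alpha += successes                # conjugate update: just add the counts
    beta += failures
    mean = alpha / (alpha + beta)
    print(f"after {successes + failures} more observations, "
          f"posterior mean = {mean:.3f}")

low, high = stats.beta.ppf([0.025, 0.975], alpha, beta)
print(f"final 95% credible interval: ({low:.3f}, {high:.3f})")
```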

Just as I have in my last few blog posts, I wanted to attempt to apply what I’ve read to my thesis data.

(I’d like to point out quickly that Kruschke’s paper does not offer instructions on what to do, but rather offers reasons why Bayesian analysis is better than the Frequentist approach. I am therefore deviating from the paper by trying to apply it to my own data, because I would much rather attempt to learn by application than by theory. Anyway, this is really where I’m having trouble. I got stumped early in this endeavor.)

You are supposed to generate a probability based on prior knowledge. How the heck are we supposed to do that? The examples I have found use contexts like “1 in 10 people in City A have Ebola. If we observe 30 people and find that 18 of them have Ebola, the model updates accordingly to probability X (whatever the math is).” That’s easy! It’s just probability!
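
Just to check that I follow even the toy example, here it is worked out with the same kind of conjugate update as in the sketch above. I had to decide how strong to make that “1 in 10” prior, so the Beta(1, 9) below is my own choice, not part of the example I found:

```python
# The Ebola toy example worked out (the prior strength, Beta(1, 9), is my own choice).
from scipy import stats

prior_alpha, prior_beta = 1.0, 9.0   # mean 0.1, i.e., "1 in 10 people are infected"
infected, healthy = 18, 12           # the 30 people we observed

# Conjugate Beta-Binomial update: add the observed counts to the prior parameters
post_alpha, post_beta = prior_alpha + infected, prior_beta + healthy

print(f"Posterior mean: {post_alpha / (post_alpha + post_beta):.3f}")   # 0.475
low, high = stats.beta.ppf([0.025, 0.975], post_alpha, post_beta)
print(f"95% credible interval: ({low:.3f}, {high:.3f})")
```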

But what about a bunch of theories that help to contribute to the hypotheses generated in my thesis? To name a few, Inverted-U Theory, Distraction Theory, Conscious Processing Hypothesis, and Reinvestment Theory are all perfectly reasonable trains of thought that may explain how pressure impacts performance in basketball free throws (my thesis project, for those of you who have missed that). How am I supposed to generate a probability from any of these?
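
The closest I’ve gotten is guessing that “prior knowledge” here means putting a distribution on the effect I expect to see, something like the sketch below. Everything in it, the Normal shape, the numbers, the variable names, is my own invention for illustration, not something pulled from Kruschke or from any of those theories:

```python
# My own guess at a prior for my thesis: the change in free-throw accuracy under
# pressure. The Normal shape and all numbers here are made up purely for illustration.
import numpy as np
from scipy import stats

# Theories like the Inverted-U and Distraction Theory make me expect some drop in
# accuracy under pressure, but they don't say how big, so the prior is wide:
# effect = (accuracy under pressure) - (accuracy without pressure) ~ Normal(-0.05, 0.05)
prior_effect = stats.norm(loc=-0.05, scale=0.05)

# Before seeing any data, how much probability does this prior put on "pressure hurts"?
print(f"P(effect < 0) under the prior: {prior_effect.cdf(0):.2f}")   # about 0.84

# A few effect sizes this prior considers plausible
np.random.seed(1)
print("plausible effects:", np.round(prior_effect.rvs(5), 3))
```

But I have no idea whether that is actually how you are supposed to translate a theory into a prior, let alone how you would fold four competing theories into one distribution.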

Can someone break this down into simple terms and then point me in the right direction here? I know there’s a way to do this type of thing, but I’m clearly 1) not looking at the best resources or 2) somehow so caught up in the Frequentist approach I’m accustomed to that it’s just not going to fit that way.

Okay, so that definitely turned away from the traditional style of a blog. I’m a trendsetter, I guess. Bringing Kruschke back into this, though, he really emphasizes the importance of incorporating prior knowledge in Bayesian analysis and demonstrates that importance with a good example using drug tests (Kruschke, 2010). It was for this reason that I thought it would be good to give it a shot with my own data. But what about actually doing this stuff? All sorts of people are explaining why Bayesian analysis is great, but I would like to see more concrete examples of how the analysis is really implemented from start to finish.
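
For what it’s worth, here is the general shape of that kind of drug-test calculation in code. The particular numbers below are placeholders I picked myself, not the ones Kruschke uses:

```python
# Bayes' rule with a base rate: how likely is someone to actually be a drug user,
# given a positive test? The numbers are placeholders I picked, not Kruschke's.
prior_user = 0.02        # base rate: 2% of the tested population uses the drug
sensitivity = 0.95       # P(positive test | user)
false_positive = 0.05    # P(positive test | non-user)

# Total probability of seeing a positive test at all
p_positive = sensitivity * prior_user + false_positive * (1 - prior_user)

# Posterior probability of being a user given a positive test
posterior_user = sensitivity * prior_user / p_positive
print(f"P(user | positive test) = {posterior_user:.3f}")   # about 0.28 with these numbers
```

Even a tiny example like this makes the base-rate point: most positive tests come from non-users simply because non-users are so much more common.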

Kruschke, J. K. (2010). What to believe: Bayesian methods for data analysis. Trends in Cognitive Sciences, 14(7), 293–300.
