From The Pinnacle To The Pit
With apologies to Ghost.
So, this happened.
Long story short: Brian Wansink is a professor at Cornell. His work has come under scrutiny for being inaccurate. I have made some small personal contributions to that scrutiny. Others have made many more. The inconsistencies uncovered include means and standard deviations which cannot exist, data distributions which cannot exist (or only exist under ludicrous conditions), oddly repeated sample sizes, self-plagiarism, statistical anomalies, partridge, pear tree, etc.
If you aren’t interested in meta-science or forensic statistics, or you don’t follow me on Twitter, you might have heard nothing whatsoever about that.
Because if you don’t care, it’s godawful boring. And there are numbers. So many numbers.
What we didn’t know until now was how the sausage was made. Where are these inconsistencies coming from? What do they represent? I don’t spend too much time on questions like this, because — as I’ve said before — I have neither search warrant nor crystal ball.
Now, to this giddy pile of inaccuracies and confusion, this article adds some insight into the research process, not just the research outcomes that we’ve already dealt with.
Basically, we can see inside the sausage factory now, and it is as pretty as a Bosch painting of a perforated colon. We knew the sausages were bad, but now we can also say why. It is because there are old horses and bike tyres and Bangkok summer garbage middens in the meat.
A lot will be said about this, so I’ll make three points here that other people might not get around to.
(1) Spare Me The Well-Meaning Goof Routine
One quote from this article that will not get the same attention as the others (like ‘data torturing’, for instance… that’s a quote that’s going to get a few miles on it) is from a graduate student who spent time in Wansink’s lab.
This is a mature and empathetic opinion, but it hints at something that grinds my teeth — that all of this research turmoil is the sad result of some kind of Icarian quest to be helpful. “Just trying to help, boss, honest. I’m sorry about the horrifying litany of stuff-ups. I’m doing my best to help people.”
Cool. I’m sure you’re a saint. The fact remains that when you do terrible research with the best will in the world, you are still very much part of the problem. In many ways, you are more dangerous than a complete bastard who might pervert the research process in more direct and less honest ways.
Why? Why is it such a problem to ‘help’ like this?
It is selfish. You are prioritising your own opinions over the evidence of controlled observations. You are saying that you are smarter than the data. And, by extension, better situated to dictate reality than other people who bring their own data, carefully analysed via a research plan which isn’t “beat it until it totters on bloody stumps”. You are literally saying ‘it doesn’t matter what we find, I know what people need’.
It is also selfish because publishing lots of awful research is usually unambiguously good for your career and unambiguously bad for science.
It is monumentally short-sighted. I cannot fully outline here, without descending into madness, how many good ideas subject to carefully conducted experiments failed to work out in nutrition, food science, dietetics, etc. If you think you’re smarter than the data, you ignore the twenty-seven-steps-forward-twenty-six-steps-back which is the frustrating nature of ALMOST ALL behavioural research. If you think the slippery nature of all the other research just doesn’t apply to you, well, you’re a massive donkey.
It allows a very cynical escape. If you are a zealot on the side of the righteous (‘I support children eating vegetables! I support walks in the park!’) people are far more likely to go soft on you when there’s a phase change in the excremental/ventilatory continuum (*) and your research comes under scrutiny.
Done knowingly, this is astonishingly cynical positioning. Scientists descend like a Category 5 hurricane full of knives on cooked research from anti-vaccination lunatics and fossil fuel goons. The same will never be true if you write a paper called ‘Hugs, Fresh Fruit, or Hugs AND Fresh Fruit? Improving The Lives Of Children Because It’s Nice’ or ‘Four Plans To Insert Vegetables In The Poor’. The best of intentions are a marvelous bed for the flowers of total synapse-cracking incompetence to grow in, AND a handy escape into hand-wringing when the flowers scream and turn into necrotic dust.
(2) Narratives. Narratives Everywhere.
So many times in this story we see the role of narrative. Where’s the good story? What sells? What will people enjoy? What will make this story clearer?
We go around and around on this question — what role should a good story play in the communication of science? Is it necessary? Can we take it too far?
My answer to this is generally: usually we don’t have enough material to tell a story. Good research programs ask related, focused questions until information you can narrativise eventually emerges. These days, though, every dataset has its own marvelous story to tell. And, in situations like the present one, any dataset can be made to tell a good old tale with the prior application of a few hundred strategically-placed kicks up the arse.
If you want to tell stories, fine. Buy a Moleskine and an annoying hat, sit in a cafe looking pensive, write nights, enjoy eating Top Ramen, and follow J.K. Rowling on Twitter. A thousand thousand places exist in the world for story-tellers. Find one of them, and put the multi-level cinnamon-flavoured regression models away.
(3) You Failed At Sucking… And That’s Scary
If I do a scientific study recording 20 variables, and then I report only the three that ‘worked’, it is very difficult for anyone to ever find out.
No-one is auditing my initial work, no-one is checking. Generally no-one will see my entire and unvarnished dataset. If I’m positioned between research groups, handling my own data collection, and so on, so much the better.
Oh, and if I’m occasionally asked if my study reporting is accurate, I can always just say yes. There is no burden of proof, or anything even similar. I can simply assert.
If challenged, I can simply produce the data for the three variables of interest. The 17 variables I didn’t report are taken out the back and shot.
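The arithmetic behind this can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that none of the 20 variables has any real effect, that each one is tested independently, and that ‘worked’ means p < 0.05. None of those details are given above, and the function names are made up for this example.

```python
from math import comb


def prob_at_least_one(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n_tests independent null tests looks 'significant'."""
    return 1.0 - (1.0 - alpha) ** n_tests


def prob_at_least_k(n_tests: int, k: int, alpha: float = 0.05) -> float:
    """Chance that k or more of n_tests independent null tests look 'significant'."""
    # Upper tail of a Binomial(n_tests, alpha) distribution.
    return sum(
        comb(n_tests, i) * alpha**i * (1.0 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


if __name__ == "__main__":
    n_variables = 20  # the 20 recorded variables from the example above
    alpha = 0.05      # assumed significance threshold, not stated in the text

    print(f"P(at least 1 false positive):  {prob_at_least_one(n_variables, alpha):.2f}")
    print(f"P(at least 3 false positives): {prob_at_least_k(n_variables, 3, alpha):.2f}")
    print(f"Expected false positives:      {n_variables * alpha:.1f}")
```

Under those assumptions, there is roughly a 64% chance that at least one of the 20 variables ‘works’ by luck alone, and on average one of them will. Real variables in one study are rarely independent, so treat the numbers as a rough guide rather than a verdict.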
What’s remarkable about this case is (A) a journalist had the presence of mind and tenacity to obtain actual evidence about bad lab practices, which is something I could never do and might be unprecedented, and (B) the evidence of terrible inaccuracies in the reported data preceded this.
Basically, while taking advantage of all the marvelous ‘creative’ ways of ginning up research to look good, these people did such a bad job of it that someone noticed. These papers literally failed at bad research practice. Remember the quantum potatoes, where a group pair was simultaneously reported as having 23, 25 or 26 members? Just what kind of a clown car are you driving when you can’t ADD UP?
Ready for the terrifying part?
What does this mean about the people who CAN add?
How many research groups are doing something similar, but are accurately reporting the data they cooked up to make a good story?
Do we know how to find them? Can the Black Flag find them?
The answer is no, we cannot. Dishonest research reported accurately is the hulking mass of ice below this visible tip. We can’t see it, we can only really infer it’s there. The way to fix that is a change in the academic environment and publication practice, and not going out and combing more published papers for goofs and stuff-ups.
I said only a few days ago that this whole sorry unending saga of research woe, this Silmarillion of bollocks, still unaccountably maintained the capacity to surprise me.
And here we are again, and surprised I am.
(*) When the shit hits the fan. It’s late. Indulge me.