Fraud Ain’t The Game

Forget about morality. Wrong is wrong.

In a parallel universe, not unlike our own, a graduate student is working late. She is putting the finishing touches on a paper she hopes will be the cornerstone of her PhD, which — as much as it’s often reasonably soul-destroying — is progressively becoming more exciting. She works late to preserve this excitement. It gives her hope that maybe one day this rotten, unforgiving business will work out, that she will have a life of curiosity and progress. People doubt her. She does not doubt herself. Or, at least, not too often.

In another parallel universe, a tenured professor who is a complete bastard has finished kicking his neighbour’s garbage bins and yelling at the television for the evening, and slopes off to his study. It is working late nights like this, he grouses in a moment of self-pity, that caused his third wife to leave him (it actually wasn’t this, it’s because he’s a miserable wretch who would try the patience of St. Anthony and wipe the smile off the face of a golden retriever). He is a shiny brittle little man. He is a sneer in a cardigan, a tumble-dried faculty Grinch without the fetching skin tone. He is a martinet, a hypocrite, a bastard, and a ruiner.

Her latest study is a model of good scientific practice and prudence. She has tried to be careful, open, honest, forthright. The studies are correctly powered. The interventions are reasonable. The notes are careful. The data is freshly scrubbed and annotated, should anyone request it. She’s a model citizen. It’s important to her to BE a model citizen.

The only problem is: at a crucial point in an analysis, she used the common logarithm rather than the natural logarithm. The numbers for the control group are therefore a lot lower than for the intervention group. Those numbers, never checked or visualised, were the fodder for further analysis. One step amongst many. Her work is built, sadly, on an elementary mistake, hidden amongst enormous screens of numbers. A classic Reinhart-Rogoff.
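For the concretely minded, here is a minimal sketch of how a slip like this could play out, assuming the wrong logarithm was applied to one group only. The data and variable names are invented for illustration and come from no real study. Both groups hold identical raw values, so any difference after the transform is manufactured entirely by the mistake.

```python
import math

# Hypothetical raw measurements, identical in both groups (invented for illustration).
control = [120.0, 150.0, 180.0]
intervention = [120.0, 150.0, 180.0]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# The slip: common log (base 10) for one group, natural log for the other.
control_mean = mean([math.log10(x) for x in control])
intervention_mean = mean([math.log(x) for x in intervention])

# log10(x) == ln(x) / ln(10), so the mismatch shrinks one group by a factor of ln(10).
ratio = intervention_mean / control_mean
print(f"control: {control_mean:.3f}, intervention: {intervention_mean:.3f}, ratio: {ratio:.4f}")
```

The ratio comes out at ln(10), roughly 2.3, even though the underlying data are the same. A single plot of the transformed values would likely have given the game away.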

His studies are bullets in a gun he uses to rob the world. He doesn’t give an empty rattling fuck about any form of accuracy. He is tired, and grey, and mean, and cynical beyond all possible belief. He is quite happy to pollute the global pool of knowledge. In other words, he is quite successful. He could make department chair in two years if he keeps his productivity up. His latest study is a pile of smoking ruin, cobbled together from numbers he made up, numbers he cribbed from another study, numbers he thought looked attractive (he is particularly fond of 17). Plausible, terse, focused… and bollocks. That’s his work. Just enough spit-shine on it to not raise any awkward questions… and bollocks.

Her last study is wrong. Its conclusions are not supported by evidence.

His last study is wrong. Its conclusions are not supported by evidence.

Both of them will be published, both of them will step into the Pass-the-Parcel game of the global pool of knowledge. And then, their influence becomes abstract, distant. They enter a world where they unknowably influence decisions, study plans, lives, policy… and they waste a lot of graduate student time. Doing a PhD is hard enough without someone handing you a sandcastle to build a house on.

So, differences in culpability, intent, recklessness, fraud, malice… but the same outcome. Let’s talk about the outcome.

“So, How About That Fraud, Eh?”

HERE COME DAT DATA BOI

Every time I see someone representing something I’ve written (or participated in) as ‘fraud detection’, I grind my teeth and my insect eyes flash like a rocket.

This is in the context, of course, of my apparently congenital ability to involve myself in investigating problematic research.

A smorgasbord of explanations is possible: bad data management, wishful thinking, garden-variety sloppiness, edit spirals / the ‘Telephone Game’, extreme p-hacking, and more besides. Curiosities abound. It’s often the subject of speculation: how is that thing that wrong?

Who knows.

Just don’t call it ‘fraud detection’.

I stopped thinking about ‘fraud’ a long time ago. The reason is simple: we have no crystal ball. Fraud is determined by actions we can’t observe, and similarly by a state of mind which is unobservable. There’s not enough time in the day to properly litigate why it happened, or more ridiculously still, what your intent was while you were doing it. That’s up to you, your university, the government, and whatever statue of Moloch you pray to.

What we do is error detection. That’s all. Forget about fraud.

We find impossibilities, inconsistencies, weirdness, incompleteness, and all the wonderful company they keep. These are all defined by function, not intent.
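To make ‘impossibility’ concrete, here is a minimal sketch of one kind of check, assuming the underlying data are whole numbers (counts, or single Likert-type items). The mean of n integers must equal some integer total divided by n, so some reported means simply cannot occur. The function name and the example values are hypothetical, and this is an illustration of the general idea rather than any particular published procedure.

```python
def mean_is_possible(reported_mean, n, decimals=2):
    """Return True if some integer total divided by n rounds to reported_mean.

    Assumes integer-valued data and a mean reported to `decimals` places.
    Uses Python's built-in rounding; other rounding conventions may differ
    at exact halves.
    """
    target = round(reported_mean, decimals)
    approx_total = round(reported_mean * n)
    # Only the integer totals nearest to reported_mean * n can match.
    candidates = (approx_total - 1, approx_total, approx_total + 1)
    return any(round(total / n, decimals) == target for total in candidates)


# With 25 participants, means can only move in steps of 1/25 = 0.04.
print(mean_is_possible(3.48, 25))  # True: 87 / 25 = 3.48
print(mean_is_possible(3.47, 25))  # False: no integer total gives 3.47
```

Note what the check does and does not say: a failed result means the reported number cannot be right as described. It says nothing about how it got that way.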

Sometimes, the errors we find are very substantial.

Sometimes, these errors are of absolutely no consequence whatsoever, even if we’re right that they are errors.

Sometimes we are wrong, and what we see as troublesome is our misunderstanding. This is open science — if this happens, if we are wrong, please point it out nice and loud. I will write a whole separate blog post about how I got it wrong. These are excellent learning opportunities.

This framework is fairer both to me and anyone receiving pointed questions. To me, because I am not alleging wrongdoing and exposing myself to charges of harassment, intimidation, libel, or general dickery. To others, because they are not being accused of wrongdoing out of hand.

Basically, scientists are allowed to make mistakes without the assumption that they have done something untoward — no matter how their work looks — and I am allowed to ask if I have identified something which constitutes a mistake.

The only important thing to do when you have found an error (or a bunch of them) is to tell other people. Preferably the author, who should take responsibility for the problems identified if doing so would be justified.

But in the absence of that, telling everyone else works too.

The scientific record is important. Even for research you might think is deeply silly, even when it’s the Southern Maine Journal of Basketweaving, even when it’s not your field, even when you think what’s been written is so facile and arse-backwards that no ‘reasonable’ person would ever believe it (so why get involved?)

Because far more important than the life and times of any individual paper is building a scientific environment where mistakes are located, publicly identified, and corrected. You’ll never know whose time or money you are saving. But money and time are saved. It’s no more abstract than the ‘good’ the work can do if it exists.

So: where the errors are from, I have no opinion.

People have told me previously that this is terribly wishy-washy, which I consider to be bollocks. My suggestion to illuminate this opinion for you is: try it. Put yourself in the public domain and accuse people of bad faith. You will be wrong a lot, and then you will get into trouble. It’s much easier to support the argument that someone can’t add than to litigate their intentions when the cat ate their calculator.

And wrong is wrong.

And if wrong enough, should get fixed, or get gone.

If you want to yell at me, do it here.