Hitchhiker’s Guide to Analytics — Uncertainty
Make sure you know what you don’t know
As we continue our analytic hop across the galaxy, learning from the smartest computers in existence, avoiding our plastic pals who continually fail to be any fun, and searching for the truth, one thing has become painfully clear:
Almost no one around here really seems to know what is happening.
I’m not trying to be rude.
I’m just saying, for a world that generates and shares information at a continuously accelerating rate, we are spending most of our time trying to manage it and not nearly enough time understanding it.
The Four Vs
I’ve been researching data veracity, and I keep coming across the “Four Vs of Big Data”: volume, variety, velocity, and veracity.
One of these things is not like the others.
The volume of data is growing.
The variety of data is growing.
The velocity of data movement, storage, and retrieval is growing.
Veracity? Not so much.
This should not come as a surprise to any data scientists. Or analysts. Or, really, anyone who pays attention to statistics or science fiction.
It’s growing. It’s changing. And it’s out of control.
If anything, the attribute that is actually growing here is the opposite of veracity. That would be… let me think about it…
Uncertainty
Some articles on the “Four Vs of Big Data” use a modified list: volume, variety, velocity, and… uncertainty.
It ruins the alliteration, but it is a pretty meta way to make the point.
But I’m not here to cast aspersions or to tell you that 90% of the data stored and analyzed in the world is wrong. We’re getting much better at capturing it.
I’m just saying that we have no idea what most of it means or how to use it.
Our collective data set is growing too much, too fast, and in too many directions. We capture it because we’re afraid to lose the history, but we have no idea how to verify much of it and no time to do it, anyway.
Control
Funny how a fairly innocuous subtitle could almost make you laugh out loud. Don’t worry; it had the same effect on everyone.
Truth is, before we can begin to get control of our analytics, we need to get control of the data. And the key to the process was stumbled upon by two ingenious morons millions of years ago, standing in front of Deep Thought, the second smartest computer in all of time and space…
“We demand rigidly defined areas of doubt and uncertainty!”
I use this quote a lot when discussing requirements, data or otherwise.
I’m pretty sure Douglas Adams meant it as a joke.
It might be the single most intelligent statement I’ve ever read.
Rigidly Defined Areas of Doubt and Uncertainty
Data keeps growing, changing, and accelerating. I’ve written about it. Others have written about it. Some very smart people make their living saying the same thing.
Thing is, most of the people writing about it are also proposing ‘simple’ or ‘straightforward’ methods of dealing with your data. Of imposing control over it. Of eliminating uncertainty.
That does make me laugh. Might as well try to put the universe in a box.
After all, almost no one in the world today is surprised by the complexity of the data we capture. Scientists of every discipline have been dealing with this issue for years, and there are a number of strategies available for handling vagueness and uncertainty in large data sets.
The real problem comes when we try to assert control with inadequate tools, or worse, try to claim control where we don’t really have it.
And the answer is…
Yeah, I don’t know the ‘answer’ to resolving uncertainty. My specialty is Questions.
But I do know how to address uncertainty. Admit it. Make sure you know what you know, and always delineate what you don’t know.
This know / don’t know comparison is something I’ve heard a lot and will probably be discussing again at some point.
Outlining the gaps you’re aware of (what you know you don’t know) is easy. Getting the bottom squares is the hard part, but you have to try.
Just be clear. Be complete. You can’t address the uncertainty until you define it. And even when you do, it will continue to grow. And change.
Maybe next time, we’ll visit Wonko the Sane. He might have some ideas.