Part 1 of 2: The Big Data Problem
Last year I wrote about how the acquisitions of Tableau and Looker were bright signals shedding light on the early stage of the data analytics market. Emphasis on early. To my surprise, most people in conversations today speak as if the market has already matured. We are getting ahead of ourselves.
Our world is riddled with misconceptions. B2B marketers have done a great job selling the idea that every company (besides yours) has figured out how to make use of its data.
The truth is that almost everyone is wondering what to do with all the data they’ve been hoarding. We’ve collectively hitchhiked on the a priori assumption that ingesting every piece of data in front of us in an infinite game of Data Hungry Hippos is a sound business strategy. At first, we bundled up our data and packaged it away in data warehouses. The world kept producing more data, and once everyone started talking about the possibilities, we gleefully unloaded it into newly formed data lakes. There we could hold massive amounts of structured and unstructured data until we might (or might not) need it.
The data never stopped coming. We lost control and, in exhaustion, gave up on data governance and management. All the time and money poured into our great data lakes left us with murky data swamps filled with polluted data, forcing data scientists to spend more time as data janitors than extracting knowledge. Analytics capabilities sputtered along, but insights were laborious to garner. As we sank to the bottom of our new data tarpit, a hand from the cloud reached down and lifted us out. The cloud promised to untether data from the cold restraints of silicon-based hardware and free it to exist everywhere yet nowhere, dancing in the ether like cosmic pixie dust. We could have all the data in the world, as fast as we wanted it, and finally, the insights we so desperately craved.
In a perfect world, we all moved seamlessly to the cloud and lived happily ever after, spinning data into insights until the sun burnt out. The end.
Except in the real world, the journey to the cloud hasn’t really changed much. We thought the cloud was the solution to our problems, and because of that, we didn’t address the root cause: how we treat data. All data was treated as sacred, and we refused to part with any of it.
We continued with the same disorganized data onboarding and cleaning processes that had gotten us into trouble in the first place. Most people did not understand what data they actually needed, which made it easy to justify our hoarding tendencies, only this time at a much larger scale. So now many of our clouds have just become floating swamps.
So why talk about what’s problematic with data when there are so many other exciting things to discuss, like AI and machine learning?
The latest technology trends get the sexiest headlines. That’s because of a continually perpetuated misconception: the gap between the world we are told we live in, where these technologies work seamlessly for everybody, and the reality we actually experience every day.
Before we can unlock the real potential of AI and machine learning, we have to address the one constant that ties everything together: the exponential explosion of data and our inability to make sense of it all.