What is “Big Data”?

Ami Levin
Published in 97 Things
3 min read · Jul 11, 2019

You’ve heard the term numerous times. Articles and books have been written about it. You might be using a product that claims to be “Big Data.” Perhaps you even have it on your resume. Have you ever stopped to think about what it really means?

The term “Big Data” has no standard, clear, agreed-upon definition. Some use it to describe data volume, others data velocity, and still others data variety. None of these comes with a quantitative metric that can classify a data set as “big” or “small.” Many use the term to refer to specific technologies such as Hadoop, while others use it to describe data from specific sources such as social media or IoT, or so-called “unstructured data.”

There are many conflicting, vague, and ambiguous definitions, but none of them describes either the data or its use. This means that anyone can claim their products, services, technologies, or data sets to be “Big Data,” and the claim can’t be refuted.

The truth is that there is no such thing as “Big Data.”

Large-volume, high-speed data from varied sources has always challenged our ability to derive value from it. Nothing fundamental has changed in the last decade to warrant a new term.

Data management has always been about applying analytical methods to gain insights that improve decision quality. Nothing more, nothing less. By now, you may be asking yourself why so many smart, honest, well-intentioned data practitioners believe that Big Data is a real thing.

Big Data is nothing but a rebranding of existing offerings, contrived by marketing people to revive interest and boost sales. It is not the first time this has happened, and it probably won’t be the last.

In his book “Big Data at Work”, Thomas Davenport says: “It is a well-established phenomenon that vendors and consultants will take any new hot term, and apply it to their existing offerings — and that has already happened in spades with Big Data”.

To find the root cause, all you need to do is follow the money trail. The Big Data pitch is that you must store everything and keep it forever. It convinces organizations that they must use the latest technologies or be left behind. I’m sure you can see where this is going.

Pursuing Big Data is a wild goose chase. It distracts organizations from what is really needed to uncover the value in data: they invest in storage and the latest technologies instead of improving data quality and decision making. Those improvements can only be achieved with domain expertise, modeling skills, critical thinking, and communication, all of which require education, practice, and time. Unfortunately, none of this is as easy, sexy, or appealing as the false promise that Big Data is the silver bullet that will solve all your data challenges.

Data management has always been, and always will be, about applying analytical methods to gain insights that improve decision quality. Nothing more, nothing less. We must refocus our efforts and stop chasing the latest fad, which is what we have been doing over and over again for decades.

Relying on any new technology, or a rebranding of an old one, is bound to fail. In my 25 years in data management, I have yet to see a technology that could beat a bad design.

[This article was inspired by Stephen Few’s book “Big Data, Big Dupe”.
Get it at https://amzn.to/2Wk5nLx to learn more and see how you can transform your organization’s data management culture.]

Ami Levin

A data architect, mentor, and trainer with more years of experience than I would like to admit. BS debunker, vegan, motorcycle rider, and cat hoarder.