Data as a Curse Word

The use of the word “data” has become mind-numbing.

Scott Gehring
Technology Whiteboard

Data. Big Data. Data Lake. Data Warehouse. Data Mart. Data Driven. Data Analytics. Data this. Data that.

You get the point; the use of the word “data” has become mind-numbing.

To me, “data” has developed into a curse word for modern information systems.

Its exploitation has become more egregious than the use of “gluten-free” on food labels or the word “aesthetic” by tweens on social media.

Herd mentality is habit-forming.

Stop the madness!

The Data-Driven Disconnect

In 2021, an estimated 215 billion dollars was spent on IT solutions that involved data[1].

This spending increased by 10% over the previous year, and the trend seems set to continue for the foreseeable future.

The need for companies to invest in becoming data-driven is apparent.

However, where is this money going? Is throwing cash at a problem always the best approach?

Another clear trend is that IT systems involving data are not meeting the needs of their customers.

Here are some staggering statistics[2]:

  • 87% of data science projects never make it into production. (VentureBeat)[3]
  • 85% of big data projects fail. (TechRepublic)[4]
  • 80% of data-driven insights will not deliver measurable business outcomes. (Gartner)

These statistics argue that when it comes to spending patterns on data-driven systems, the juice is not worth the squeeze.

What is the disconnect? How do we keep from making the same mistakes repeatedly? Is this not the definition of insanity?

There is much discussion among the analytics avant-garde on how to solve the spend versus results mismatch.

With no shortage of talking heads, opinions on how to remedy this problem fly like loose items on the dashboard of a moving car.

Much of the material I have seen and heard is surface-based observation and remediation.

Not unlike a dermatologist prescribing an ointment for an unsightly rash without asking the question: what is the underlying cause of the inflammation? Does the cream truly heal the patient, or does it hide a symptom of a deeper, more chronic ongoing problem?

One must go beyond the epidermal surface to fix chronic systemic difficulties.

We must probe deeper than tissue, tendon, and muscle.

Not dermatology, but orthopedics.

An x-ray of the bones.

When an orthopedic x-ray of the IT data-driven anatomy is conducted, we find that the underpinnings are built on flawed notions.

Identifying the Flaw in the Data-driven Anatomy

As mentioned above, a systemic flaw in the IT data-driven mantra leads to disconnect and ultimate failure.

What is that flaw?

To answer this question, first, let’s define what it means to be data-driven.

For purposes of this article, I will use the definition put forth by data scientist Carl Anderson:

“Data-drivenness is about building tools, abilities, and, most crucially, a culture that acts on data.”[5]

The flaw of this definition is not the use of tools, abilities, and culture but the term “data” itself.

Why?

The use of the term data is incomplete and circuitous[6].

What is the ultimate objective of data-drivenness?

The answer: to gain knowledge.

To use anatomical imagery again: if knowledge is the physique, and data is bone, what is the muscle?

Without musculature, we simply have skin on bone — some kind of weird, animated zombie.

What is the missing muscular tissue?

The answer is information.

Information is the muscle of Hercules.

KID — Information, not Data

KID stands for knowledge, information, and data.

These three components constitute an essential hierarchy in analytical systems — each acting as a stepping stone, with data as the base, knowledge as the end goal, and information as the bridge between the two.

The join between knowledge and data is vital to system development.

That is why they are called “IT” systems.

The term IT stands for information technology.

At best, using the term data is a distraction created by marketers, and at worst, an industry regression, a step backward.

Information is the machinery that connects data to knowledge.

Information as an idea is so compelling that in 2019 physicist Melvin Vopson of the University of Portsmouth, in an article for the American Institute of Physics (AIP)[7], proposed that information is actually the fifth state of matter and has mass.

To accent my point: data is not the fifth element. It's information.

The following diagram displays KID:

[Figure: KID = Knowledge + Information + Data]

KID is a subset, the bottom three rows, of the hierarchy in the 2019 Gallón L. study on Systemic Thinking[8], and it expresses the importance of information in the data-to-knowledge transformation.

A hierarchy emerges here that applies to all organizational groups: individuals, societies, and companies. It is a progression of sophisticated intelligence.

This is not to say that data is not valuable.

Data is a foundational prerequisite for knowledge.

Data is the raw material used for thinking, synthesis, and analysis.

By way of illustration, data would be the equivalent of gasoline.

Without gas, a car cannot operate.

Thus, we need to collect and store gas to drive.

Knowledge is the destination for our data-fueled car.

Information is the automobile.

It is the automobile that gets people to their destination, not gasoline.

A problem in modern business systems is that IT tends to overemphasize the collection portion of the data paradigm.

I call this “collect and expect.”

IT staff and data scientists collect and store the data, expose it through some kind of API or application, and expect the end-user to consume it and somehow gain knowledge through some invisible magic.

Build it, and they will come.

This mystical thinking leads to the notoriously unusable data lake.

Gas collection for the sake of gas collection is expensive and useless.

It is the transformation of data into information that creates meaning.

The transformation of data into information and information into knowledge is the systemic, not topical, difference between systems that generate true business outcomes and those that do not.

The next question is: how does one convert data to information, and then information to knowledge?

Connective Tissue

Data is the bone, information is the muscle, and the physique is knowledge.

How do they connect?

As you may notice in the KID diagram, there are two connecting arrows. The arrows are the transformation factors.

The transformation factor between data and information is relationships.

Relationships would be the anatomical equivalent of tendons.

The transformation factor between information and knowledge is patterns.

Patterns are the tissue that connects skin to muscle.
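To make the two transformation factors concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sample orders and customers are invented, and the helper names relate and find_patterns are hypothetical, not taken from any particular system. It only shows the shape of the idea: relationships turn isolated records into information, and patterns found in that information become knowledge.

```python
from collections import defaultdict

# Data: raw, isolated facts (hypothetical sample records).
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 120.0},
    {"order_id": 2, "customer_id": "c2", "amount": 35.0},
    {"order_id": 3, "customer_id": "c1", "amount": 80.0},
    {"order_id": 4, "customer_id": "c3", "amount": 15.0},
]
customers = {
    "c1": {"name": "Acme", "region": "West"},
    "c2": {"name": "Birch", "region": "East"},
    "c3": {"name": "Cobalt", "region": "East"},
}


def relate(orders, customers):
    """Data -> information: attach each order to the customer it belongs to."""
    return [{**order, **customers[order["customer_id"]]} for order in orders]


def find_patterns(information):
    """Information -> knowledge: total revenue per region, highest first."""
    totals = defaultdict(float)
    for row in information:
        totals[row["region"]] += row["amount"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


information = relate(orders, customers)
knowledge = find_patterns(information)
print(knowledge)
```

Neither list of raw records answers a business question on its own. Only after the orders are related to customers, and the related rows are summarized, does something actionable appear: in this made-up sample, the West region ranks ahead of the East in revenue.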

How many IT professionals, data scientists, or engineers take the time to nurture the transformation factors of relationships and patterns surrounding their APIs or applications, rather than spending the bulk of their time “collecting and expecting”?

They are few and far between, that is for sure.

The process of integrating transformation factors involves not only technology but people and processes.

A holistic view is required to develop a fully integrated approach.

The Road to Salvation

The first step on the road to salvation is to stop cursing!

The word “data” should be eliminated from the vocabulary of cultures and replaced with “information.” Information-drivenness.

To build on the definition from Carl Anderson:

“Information-drivenness is about building tools, abilities, and, most crucially, a culture that acts on data, relationships, and patterns to derive knowledge.”[9]

This definition is holistic, integrating KID with tools, abilities, and culture. Furthermore, information-drivenness expands on the full potential of SIS and the Golden Triangle of Analytics.

This article may seem like I am mincing words.

However, words are the vehicle of growth and improvement.

Growth and improvement require force.

Think of a child.

Every year, their bodies exert the physical force of growth outwards against the world.

Many think of force as only kinetic.

While it is true that force has a physical element, to believe it is only kinetic is a misconception.

To cross-reference a field specializing in such matters, martial arts, the force formula used in conflict (growth is a form of conflict) is as follows:

holistic force = movement + firepower + communication [10]

Per the formula, communication is a behavior of force delivered on a social-emotional plane, and words are the delivery mechanism.

Thus, applying the correct words with accurate meaning employs the appropriate force of improvement.

Conversely, misleading words and off-base definitions apply force in the wrong direction, leading to weak results; in the case of the data system paradigm, a severe cost-result mismatch.

In Closing

With information as the hub, you can move up the KID hierarchy into knowledge.

Per Sir Francis Bacon, knowledge is power.

An organization with more knowledge will have superior control over the products and services it provides to its customers, and an edge over its competition.

It will have the ability to solve genuine business problems. Even beyond power enablement, knowledge is the key to understanding and, ultimately, the development of wisdom.

For more reading on this topic, please see the following articles:

Hey Sis! Beyond People, Process, and Technology | by Scott Gehring | Technology Whiteboard | Feb, 2023 | Medium

The Golden Triangle of Analytics | by Scott Gehring | Technology Whiteboard | Oct, 2022 | Medium

Footnotes and References

[1] Global Spending on Big Data and Analytics Solutions Will Reach $215.7 Billion in 2021, According to a New IDC Spending Guide | Business Wire

[2] 6 Reasons Why BI and Analytics Projects Fail — And How to Avoid It | Salesforce Ben

[3] Why do 87% of data science projects never make it into production? | VentureBeat

[4] In 2016, Gartner estimated the failure rate at 65%. A year later, Gartner analyst Nick Heudecker put the failure rate at 85%.

[5] Creating a Data-Driven Organization by Carl Anderson, 2015, Publisher(s): O’Reilly Media, Inc. ISBN: 9781491916919.

[6] This is not a knock on Carl Anderson. Instead, this is a critique of the broader industry that imprisons both him and me.

[7] The mass-energy-information equivalence principle: AIP Advances: Vol 9, No 9 (scitation.org) https://aip.scitation.org/doi/full/10.1063/1.5123794

[8] https://link.springer.com/referenceworkentry/10.1007/978-3-319-69902-8_58-1

[9] Creating a Data-Driven Organization by Carl Anderson, 2015, Publisher(s): O’Reilly Media, Inc. ISBN: 9781491916919

[10] Communication as a Force. Holistic engagement of force is not… | by Scott Gehring | S.E.F. Blog | Medium

About the Author

Scott Gehring has over 30 years of experience in global enterprise information systems and holds several patents for his work in varying industries.

As a pioneer in the field of analytics, he has been an influential industry leader in defining best practices around system design, implementation, integration, and operations.

Scott has built hundreds of solutions for companies ranging from small and mid-size businesses to large-scale enterprise organizations, helping to drive process improvement, tighten the link between business and IT, and provide the latest innovations in information technology.

www.scott-gehring.com
www.linkedin.com/in/scott-gehring/
Scott Gehring — Medium
Technology Whiteboard

Deft in centrifugal force, denim evening wear, velvet ice crushing, and full contact creativity. Founder of the S.E.F Blog and Technology Whiteboard.