The Five Critical Components of Data

Or How To Weigh The Value Of Your Business’s Most Misunderstood Ingredients

Decision-First AI
Published in Comprehension 360
5 min read · Apr 17, 2018


Let me start by repeating myself: data is not information. Information is valuable. Data is just the raw ingredients that valuable information is prepared from. You can’t own data, but you can certainly control and cultivate information.

That assumes you understand it. Sadly, in our world of the Data Illiterati, very few do. Doubly so, if they now don the flashy new title of Data Scientist. If that offends you, it offends me, too. Though likely for opposite reasons. But I digress…

This article is about five components of data that determine its value. That is not to say there are only five or that these five terms are some sort of data gospel. Of course, data would be better understood and fewer people would be data illiterate if someone would only create a data gospel. My goals here are more modest…

Recency

All data has a freshness. It has a born-on date. Or use whatever ingredient analogy you like. Recency has power when it comes to data. Just don’t get carried away.

There are nearly as many examples of companies chasing real-time data to their own detriment as there are companies operating on stale data. Neither is beneficial. Neither is optimal. Recent data is better than the alternative; the question is always how much better.

Recency also has a cost. But note, it is rarely the cost of obtaining. Data is easily obtained in real-time. It is the transmission, storage, and translation of data in real-time that breaks the bank. And sometimes spoils the recipe…

Frequency

Yes, your data is very big. Aren’t you so lucky… Personally, I would rather have it frequently than in large quantities. Frequent data is powerful data. It drives reliability.

When folks think about time series data, they often mistakenly apply recency concerns and question the value. Time series (typically) offsets those issues via frequency. Frequency creates trends, history, and signal. Never underestimate its power.

Intensity

If you were betting on monetary value, you weren’t wrong and you likely spent time in retail. Only not all intensity is measured in dollars (or Euros). Intensity is simply a broader category.

If you are asking how recently and how frequently then you need to add how intensely. It is the purest measure of data value/power. Note — I am not saying the most important. I am also not, not saying that… just saying.

Intensity includes volume, dollar amounts, and time — just to name a few. Persistent frequency can actually morph into intensity — now that is powerful data, just saying.

Change in any of these first three components is typically a powerful indicator for the future. They are a pretty solid threesome of data value but you need to go a bit deeper and farther to capture it all.

Recency, frequency, and intensity distinguish feedback from raw data. They also distinguish active data streams from decaying pools.
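For the hands-on reader, here is a minimal sketch of how those first three components might be measured on a simple event log. Everything in it is an assumption made for illustration: the record layout (a key, a date, an amount), the sample values, and the `summarize` helper are hypothetical, and summing amounts is only one of many ways to express intensity.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event records: (key, event_date, amount).
# The layout and the values are illustrative assumptions only.
events = [
    ("a", date(2018, 4, 10), 40.0),
    ("a", date(2018, 4, 15), 60.0),
    ("b", date(2018, 1, 5), 500.0),
]


def summarize(events, as_of):
    """Return {key: (recency_days, frequency, intensity)}."""
    grouped = defaultdict(list)
    for key, when, amount in events:
        grouped[key].append((when, amount))

    summary = {}
    for key, rows in grouped.items():
        latest = max(when for when, _ in rows)
        recency_days = (as_of - latest).days  # days since the latest event
        frequency = len(rows)  # how many events were seen
        intensity = sum(amount for _, amount in rows)  # here: total amount
        summary[key] = (recency_days, frequency, intensity)
    return summary


result = summarize(events, as_of=date(2018, 4, 17))
```

In this toy log, "a" is recent and frequent but modest in amount, while "b" is intense but stale. Which of those matters more depends entirely on the question you are trying to answer.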

Connection

This is a more nebulous concept. I won’t be able to develop it as far as I would like in the confines of a five-minute article. But connection is critical!

Connection can be defined by concepts like proximity or closeness. Not typically a temporal thing (that was recency) so much as a logistical or analytic concept. How connected is your data to a real-world event, a real action, or a real intention?

To make this more difficult, it is relative. Data can only be connected to more data. That connectivity can only be measured relative to an outcome, a criterion, or a perspective. Connection can be more powerful than any other component… or completely meaningless. But we must move on…

Integrity

It isn’t a complicated concept. It is a critical component. And it is more complicated to summarize than you might otherwise imagine. Integrity is the health and the heritage of your data. It is the value and validity. It is a process and pedigree. All of those terms are different and meaningful. It is complicated.

At the heart, integrity is a matter of trust and documented validation. Sadly, you will likely need to look long and hard to fathom the integrity of your data. Okay, that is NOT true. It is sad that you will need to look longer and harder than you likely should. There is a certain prevailing laziness to data and that is the result. Is that a good note to close on… or the absolute worst? Too late…

So there you have it. Five components for understanding the value of your data. But let me repeat, data is really NOT that valuable. It is the information that is distilled from it that matters. These five components figure prominently in how that information is created, formed, and found. Good luck in your data mining efforts and thanks for reading!


Decision-First AI
Comprehension 360

FKA Corsair's Publishing - Articles that engage, educate, and entertain through analogies, analytics, and … occasionally, pirates!