What is Dark Data?

Jill M. Platts
Mar 9, 2018

Dark data is data that an organization has collected and stored but is not currently using, typically because it is unstructured. It accumulates continuously, yet it has never been organized through categorization, labeling, or any other effective scheme. This large store of unstructured data could yield valuable insights if it were organized and then analyzed, but for now it remains in the “dark”. Although it could strongly influence a business’s decision-making, dark data often waits indefinitely to be evaluated with data analytics.

Examples of Dark Data

One example of dark data is a customer call record. These records can hold valuable information about a customer’s thoughts and geolocation, and they are routinely captured and stored, but they are rarely organized or analyzed. Another example is a website log file. These logs can hold valuable information about visitor behavior and traffic, and they are routinely collected, but they are rarely analyzed in any organized or meaningful way.
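To make the log file example concrete, here is a minimal sketch of how raw web server logs might be turned into a simple traffic summary. It assumes the logs use the Common Log Format; the sample lines, the `LOG_PATTERN` regular expression, and the `summarize_log` function are all hypothetical and only illustrate the idea.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident authuser [timestamp] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)$'
)


def summarize_log(lines):
    """Count requests per path and distinct hosts, skipping unparseable lines."""
    hits = Counter()
    hosts = set()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # not in the assumed format, so leave it out of the summary
        hits[match.group("path")] += 1
        hosts.add(match.group("host"))
    return hits, len(hosts)


# Hypothetical sample data standing in for a stored log file.
SAMPLE_LINES = [
    '203.0.113.5 - - [09/Mar/2018:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120',
    '203.0.113.5 - - [09/Mar/2018:10:00:07 +0000] "GET /signup HTTP/1.1" 200 2048',
    '198.51.100.7 - - [09/Mar/2018:10:01:12 +0000] "GET /pricing HTTP/1.1" 200 5120',
    'not a log line',
]

if __name__ == "__main__":
    hits, visitors = summarize_log(SAMPLE_LINES)
    for path, count in hits.most_common():
        print(path, count)
    print("distinct hosts:", visitors)
```

Even a summary this small (which pages are requested most, and by how many distinct hosts) is the kind of organization that moves a log file out of the “dark”. A real pipeline would likely need to handle other log formats and much larger files.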

Growth of Dark Data

According to a 2011 IDC study, 90% of digital data is unstructured data, or dark data. The study also found that the world’s digital data is doubling every two years, significantly faster than Moore’s Law predicted…
