Evolution of Data and Data Management

Dr. Rupa Mahanti
Published in ILLUMINATION
5 min read · Oct 27, 2022
Evolution: slow and gradual continuous process of change (Image created using Adobe Spark)

Evolution of Data

With the passage of time and the evolution of technologies, civilizations, and culture, the techniques used to capture, store, process, and use facts have evolved. Similarly, data (a representation of facts) and data management have had their own evolution cycles, and they continue to evolve.

Until the advent of computers, few facts were documented, because the resources needed to store and maintain them were scarce and expensive, and documenting them required considerable effort. In ancient times, knowledge was usually transferred from one generation to another by the process of oral learning. The oral tradition of the ancient ages is a contrast to the current digital age, which has elaborate document and content management systems (for example, Documentum and Confluence) that store information and knowledge in the form of documents and records (Mahanti 2021b).

With the advent of computers and subsequent innovations in computing, technology, and industrial automation, data processing shifted markedly toward the electronic recording and processing of data to support business operations. While electronic storage and processing of data started at the end of the 19th century, owing to the cost and limitations of storage, the amount of data that could be stored was relatively small, and data management as a discipline was less complex. Technology was seen as a way to reduce the manual overhead of generating correct reports, and data was seen as a by-product rather than a commodity and asset.

However, advances in technology, the decreasing cost of disk hardware, and the availability of cloud storage have facilitated the storage of large volumes of data at much lower cost. With the growing number of digital data-generating devices, including smart devices (for example, smart phones, smart meters, smart cars, and smart thermostats), the internet of things, and cloud computing, we have ended up capturing a lot of data in a relatively short time, resulting in an explosion of data.

As stated by Eric Schmidt, “there were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days.” From 2010 to 2020, the amount of data created, captured, copied, and consumed in the world increased from 1.2 trillion gigabytes to 59 trillion gigabytes, a growth of almost 5,000% (Press 2020).
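
The growth figure quoted above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the cited research: the helper name percent_growth is hypothetical, and the only inputs are the two volumes quoted from Press (2020).

```python
def percent_growth(start: float, end: float) -> float:
    """Return the percentage increase from start to end."""
    return (end - start) / start * 100.0


# Volumes quoted in the text (Press 2020), in trillions of gigabytes.
data_2010 = 1.2
data_2020 = 59.0

growth = percent_growth(data_2010, data_2020)
multiple = data_2020 / data_2010

print(f"Growth 2010-2020: {growth:.0f}% (about {multiple:.0f} times the 2010 volume)")
```

Under these inputs the increase works out to roughly 4,800%, which is consistent with the "almost 5,000%" figure, and the 2020 volume is about 49 times the 2010 volume.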

Data Deluge Challenges

The digital age is clearly characterized by an over-abundance of information. It can be considered a pandemic in its own right, hence the name infodemic. While prior to the digital age people were starved for knowledge and insights because of limited or no data, the increasing volumes of data in the present age bring challenges of their own. Unfortunately, people are still starved for knowledge as well as insights. The quote by John Naisbitt (2016), “we are drowning in information but starved for knowledge,” very aptly summarizes the current information and knowledge situation in the digital age. Clearly, the old adage “excess of anything is bad” applies to data too.

Some of the challenges with the explosion of data are:

Information Overload

Very few businesses have the time, resources, or expertise to make use of all the data that they capture and store. According to Forbes, “On average, companies only use a fraction of the data they collect and store” (Marr 2016). With too much data, it is a challenge to locate the right data. A survey conducted by the Compliance, Governance and Oversight Council (CGOC) showed that only 1% of information being retained was subject to legal hold requirements (that is, required to be preserved because it is related to the subject matter of actual or reasonably anticipated litigation or regulatory proceedings) (Baker and Sjoberg, 2018). With large amounts of data stored and replicated in multiple repositories across the organization, locating this 1% of data is like trying to find a needle in a haystack (Mahanti, 2021a).

Information versus Misinformation

Misinformation is false information that is spread, regardless of intent to mislead. The abundance of information and advances in communication technologies such as the internet, social media, and the telephone have amplified this problem by allowing misinformation to spread at exceptional speed (Mahanti, 2022). Research indicates that in the first three months of 2020, when the COVID-19 pandemic was spreading across the world, roughly 6,000 people around the globe were hospitalized because of coronavirus misinformation. During this period, researchers state that at least 800 people may have died due to misinformation related to COVID-19 (WHO, 2021). With such huge amounts of data, it is hard to distinguish between information and misinformation. Another problem is the speed with which misinformation travels compared to information (Mahanti, 2022). Misinformation travels much faster and reaches more people than information. The quote “Information walks. Misinformation flies” very aptly summarizes the problem.

Security and Quality

With the huge amounts of data that an organization captures and stores, data security and data quality are a challenge. Security, privacy, and compliance need to be taken into consideration, and appropriate measures and controls need to be implemented to address them. There are also bound to be data quality issues. Decisions based on bad quality data are bad decisions that an organization does not know about until later.

Evolution of Strategy, Data Management, and Governance

With enterprises capturing and storing exponentially growing volumes of data, there needs to be adequate strategy, data management, and data governance to derive the best value and drive competitive advantage. Data management is no longer the simple discipline that existed in the early days of computing. Currently, data management is a multifaceted discipline with several closely interacting sub-disciplines or functions, including but not limited to data quality, data security, data architecture, metadata management, and master data management; data governance is a core function connecting all the other data management functions (Mahanti, 2021b).

Without an adequate strategy, all these data management functions will be implemented through different data initiatives, and the solutions will be addressed piecemeal or in department silos without assessing the enterprise-level implications. The lack of an enterprise view can allow risks to be magnified and alignment with the organization’s strategic objectives to be neglected, resulting in the implementation of sub-optimized solutions, inconsistencies in data and information, or the development of systems that cannot be easily integrated (Mahanti, 2019).

Conclusion

Evolution of data management has many steps. A data strategy containing a proposed balance of defense (control) versus offense (flexibility/growth) coupled with planned execution through data initiatives should be accompanied by periodic review and revision, and measurement of impact and progress through meaningful and actionable metrics. An evolutionary data strategy can deliver a real and tangible impact, drive change, improve operational efficiencies, and provide substantial opportunities for new revenue and competitive advantage.

If you have any questions or any inputs you want to share, just comment or connect on LinkedIn.

This article draws significantly from the research presented in the following books: Data Governance and Compliance: Evolving to Our Current High Stakes Environment and Data Governance and Data Management: Contextualizing Data Governance Drivers, Technologies, and Tools, published by Springer in 2021; Data Quality: Dimensions, Measurement, Strategy, Management and Governance, published by ASQ Quality Press in 2019; and How Data can Manage Global Pandemics: Analysing and Understanding COVID-19, published by Routledge Press in 2022. It is a modified version of the article “Evolution of Data and Data Management” published in Data Management University. Future research will focus on strategies and management to derive maximum value from data.

Dr. Rupa Mahanti

Author of 7 books, mostly on data; Ph.D. in Computer Sc. & Eng.; Digital art designer; Publisher- The Data Pub (https://thedatapub.substack.com/)