The State of Big Data
Volume, Velocity & Variety: The 3Vs That Were, Are And Will Be
We continue to generate ever larger volumes of data. According to a recent report from IBM, 90% of the world’s data has been created in the last two years alone, at a rate of 2.5 quintillion bytes (i.e. 2.5 exabytes or 2.5 billion gigabytes) per day.
With the anticipated rise of autonomous agents and the Internet of Things (IoT), and the looming ubiquity of connected sensors in devices of all shapes, sizes and purposes, the velocity of data generation will likely remain a concern, as will the incredible variety of structured and unstructured data created. For example, according to Intel, a single autonomous vehicle generates and consumes approximately 40 terabytes of data for every eight hours of driving, courtesy of hundreds of embedded sensors, which means that just a million autonomous vehicles will generate 3 billion people’s worth of data.
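To put the cited figures side by side, the short Python sketch below takes them at face value and assumes decimal (SI) units. The variable names are illustrative only, and the comparison is a back-of-the-envelope check rather than a forecast.

```python
# Decimal (SI) unit sizes in bytes; an assumption, since the sources do not
# say whether decimal or binary units are meant.
BYTES_PER_GB = 10**9
BYTES_PER_TB = 10**12
BYTES_PER_EB = 10**18

# Cited IBM figure: 2.5 quintillion bytes created worldwide per day.
daily_world_bytes = 2_500_000_000_000_000_000
daily_world_gb = daily_world_bytes // BYTES_PER_GB
daily_world_eb = daily_world_bytes / BYTES_PER_EB

# Cited Intel figure: 40 terabytes per vehicle per eight hours of driving.
vehicles = 1_000_000
fleet_bytes = vehicles * 40 * BYTES_PER_TB
fleet_eb = fleet_bytes // BYTES_PER_EB

# How many "world days" of data the fleet produces in one eight-hour shift.
ratio = fleet_bytes / daily_world_bytes

print(f"World, per day: {daily_world_eb} EB ({daily_world_gb:,} GB)")
print(f"Fleet, per eight hours: {fleet_eb} EB ({ratio:.0f}x the daily world figure)")
```

On these assumptions, a fleet of one million vehicles would produce 40 exabytes for every eight hours of driving, 16 times the 2.5 exabytes cited as the world's current daily output.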
Even though ingenious data storage and retrieval techniques are swiftly emerging, ranging from DNA-based storage to plant-based techniques, the current scale and exponential growth of both data production and consumption will continue to be a challenge.
Decentralisation: Rise Of The Data-Oriented Organisation
A few years ago, some pundits predicted that Chief Data Officers (CDOs) would quickly rise to prominence as the most powerful executives in organisations, possibly even eclipsing CEOs. Instead, while there has certainly been a rapid proliferation of senior data officers, there has also been a simultaneous decentralisation, and to an extent democratisation, of data, thanks mainly to technological advancements.
Much of an organisation’s data is heterogeneous and distributed across multiple relational and non-relational systems, from Hadoop clusters to NoSQL data stores. The governance of these data assets lies with data and IT executives. However, intermediary self-service platforms and algorithm marketplaces (where one can purchase machine learning algorithms, rather than develop them, and “just add data”) have emerged that are largely agnostic to data type and source. These are not simply descriptive but increasingly prescriptive; they have empowered individual departments and business units (BUs) to become progressively self-sufficient when it comes to putting data to use. These solutions do not require months of planning and preparation, or the establishment of data/IT infrastructure and capabilities. Instead, BUs can simply connect their data sources and get to work, gaining independence, agility and productivity.
As a natural consequence, data-consciousness and analytics-literacy have grown across the enterprise. For example, a Forrester study found that almost half of all B2C marketers use big data and analytics to improve responsiveness. Similarly, according to a report by Deloitte, 35% of companies surveyed said they were actively cultivating data analysis capabilities for HR.
Veracity, Visualisation & Value: Emergence Of The New (Arguably More Important) 3Vs
With data, as with other trends, the initial fascination with size has been supplanted by a focus on quality. Collecting massive amounts of different kinds of data at high speed is worthless if the data is inaccurate. Therefore, it is important to eliminate errors, biases, skews, noise and anomalies in raw data (as well as in subsequent analysis) to ensure veracity, in turn building dependability and inspiring trust. This is especially true of increasingly automated decision-making with little-to-no human intervention or supervision.
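As a rough illustration of what a basic veracity check might look like in practice, the Python sketch below counts missing values, repeated record IDs and extreme outliers in a small, made-up data set. The function name `quality_report`, the record layout and the outlier threshold are all assumptions chosen for illustration; real pipelines would need checks tailored to their own data.

```python
from statistics import median


def quality_report(records, field, threshold=5.0):
    """Summarise simple data-quality problems in a list of dict records.

    Hypothetical helper: 'id' is assumed to identify a record, and an
    outlier is any value more than `threshold` median absolute deviations
    (MADs) from the median. Both choices are illustrative assumptions.
    """
    # Record IDs that appear more than once.
    seen, duplicate_ids = set(), []
    for record in records:
        if record["id"] in seen:
            duplicate_ids.append(record["id"])
        seen.add(record["id"])

    # Values that are present; the rest are counted as missing.
    values = [r[field] for r in records if r.get(field) is not None]
    missing = len(records) - len(values)

    # Robust outlier check based on the median absolute deviation.
    outliers = []
    if values:
        med = median(values)
        mad = median(abs(v - med) for v in values)
        if mad > 0:
            outliers = [v for v in values if abs(v - med) / mad > threshold]

    return {"missing": missing, "duplicate_ids": duplicate_ids, "outliers": outliers}


# Made-up sample data, for illustration only.
sample = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 12.0},
    {"id": 2, "amount": 12.0},   # repeated record
    {"id": 3, "amount": None},   # missing value
    {"id": 4, "amount": 11.0},
    {"id": 5, "amount": 950.0},  # implausibly large value
]
report = quality_report(sample, "amount")
print(report)
```

A report like this does not fix anything by itself; it simply makes problems visible before the data feeds an automated decision.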
Data visualisation is one of the most difficult and yet most rewarding aspects of big data. According to a piece in Wired, “data visualisation is wayfinding, both literally, like the street signs that direct you to a highway, and figuratively, where colours, size, or position of abstract elements convey information … data becomes more malleable, actionable, and, ultimately, more human.” In many ways, successful visualisation helps organisations answer questions they did not even know to ask.
Most importantly, the entire point of big data is value. Of course, data in itself is not valuable. The value comes from the analyses done on it to turn it first into information and then knowledge for better business decision-making.
A 2016 Veritas Global Databerg Survey revealed 52% of all information currently stored and processed by organisations around the world is considered ‘dark’ data, whose value is unknown. Additionally, another 33% of data is considered redundant, obsolete, or trivial (ROT) and known to be useless. Organisations are creating and storing data at an ever-increasing rate due to a ‘data hoarding’ culture and an indifferent attitude to retention. If left unchecked, business data will unnecessarily cost organisations around the world a cumulative $3.3 trillion to manage by the year 2020.
Companies keen on building efficient data-driven businesses need to keep their data lakes from becoming swamps. To that end, value starts and ends with the business use case. Initial and continued investment in big data infrastructure and resources can be justified only if the insights generated lead to measurable, ROI-positive improvements.
Privacy & Security: Critical Elements Of The Overall Premise
Lastly, and most significantly, big data continues to pose formidable challenges around both privacy and data security. With traditional methods of data protection usually falling short, companies are compelled to overhaul their privacy and security procedures, often with guidance from organisations such as the International Association of Privacy Professionals (IAPP) and the Future of Privacy Forum (FPF), among others. While new data management practices and emerging technologies can certainly have a transformative impact on multiple aspects of a business, it is equally important to conduct risk assessments, employ stringent controls, develop company-wide ISM competencies and stay abreast of a rapidly evolving regulatory landscape when participating in the new information economy.
This article was originally published in Business World on 1st June, 2017.