Data Technology Trends: Multi-part series.
In God we trust, all others bring data. — W. Edwards Deming.
We are in the information age, where data is abundant. Every organization generates massive amounts of data and wants to access it easily and on demand, preferably from a single place. Getting more value from Data quickly and with the highest quality is increasingly a challenge for organizations of any size.
With tremendous data growth in organizations, several concerns become pressing:
- security, privacy, and governance of data with a strong data strategy
- the source and authenticity of data
- the ability to gather an explainable summary from disparate data systems
- answering the business question of “so what” and delivering value from data quickly and with the highest quality
The idea that “Data is the new oil” is being put into practice today with the advent of modern cloud databases, data lakes, and Delta Lake. “Data Trends” is a series of analyses, from my perspective, of the latest trends in data and what comes next. Many data technologies are available today, and they allow organizations to put all their data in one basket or spread it across various cloud solution providers. The main idea of this article is to identify the trends in Data Technologies, position them, and look at what the future holds.
I have a couple of decades of experience working with Data and have seen data grow from small data to big data and how that shift took place. This article is purely my personal perspective and does not represent the views of the organization I work for.
Trend 0: Foundational
I divide this foundational section into two parts: (1) Mainstream and (2) Gaining maturity.
Mainstream: Data-related technologies and processes that have reached the plateau of productivity, in Gartner’s terms. Technologies once thought dead are back in business thanks to cloud and modern data platforms. As part of this trend, the Data Warehouse is back with modern cloud databases, Big Data is revived with Spark, and the use of Artificial Intelligence is increasing through multiple PaaS and SaaS providers.
Gaining maturity: In finance and health care, Data protection, security, and privacy are not new concepts. For internet organizations, however, data protection and security are still evolving.
Trend 1: Trusted
With Internet organizations focusing on Data protection and security, I would like to highlight “Trusted” as the first trend: ringfencing Data from generation through archival and beyond.
Data generation: Data comes from anywhere and everywhere. Especially with Blockchain evolving, data or information provenance is important.
Data Preparation: More than 60% of the work goes into preparing/cleaning the data before using it.
Data processing: The main act of data processing is to store and compute. Security while processing is yet another key element as the data gets integrated, stored, and passed onto different systems for processing.
Data store: Encryption and Security at Rest.
Data share: Encryption in transit.
Data publish: File based encryption or block encryption. Data encryption and Data virtualization.
Data archival: Archival security.
Data security / governance / privacy: With the advent of cloud and multi/poly cloud, the importance of security is increasing. Topics here include Zero Trust security, whether a DMZ is still needed, Data personalization, advanced data governance, AI / Analytics Governance, privacy and security, Differential privacy, Data Classification, DataOps, and AI / MLOps.
Monitoring: Machine Learning Enabled Data Quality and Validation
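Differential privacy, one of the techniques named in the list above, can be illustrated with a small sketch. The example below assumes the textbook Laplace mechanism applied to a counting query; the function names, the `epsilon` value, and the patient records are all hypothetical and chosen only for illustration. It is not a production implementation and does not represent any particular platform's API.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with the same
    rate (1 / scale) follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records matching `predicate`.

    A counting query has sensitivity 1 (adding or removing one record
    changes the count by at most 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records, invented for this illustration only.
patients = [
    {"age": 34, "diabetic": True},
    {"age": 51, "diabetic": False},
    {"age": 47, "diabetic": True},
    {"age": 29, "diabetic": False},
    {"age": 63, "diabetic": True},
]

rng = random.Random(7)  # seeded only so the sketch is repeatable
noisy = private_count(patients, lambda p: p["diabetic"], epsilon=0.5, rng=rng)
print(f"noisy count of diabetic patients: {noisy:.2f}")
```

A smaller `epsilon` adds more noise and gives stronger privacy; a larger one gives answers closer to the true count. Choosing that trade-off is a governance decision rather than a purely technical one.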
Trend 2: Strategic
The core of the Strategic trend is “Simplification of Data”. With modern data platforms, more and more organizations want to store and retrieve information from a single place and to have a simple yet efficient data architecture and platform.
Data warehouse makes a big comeback on top of the Data Lake: e.g., Redshift and a Data Lake (Glue and S3) in AWS. Cloud is a given for Data & Analytics.
Data Fabric: Worlds collide: (1) Cloud / On-prem, (2) Relational / Non-Relational (structured / semi-structured / unstructured), (3) Data & Analytics.
Other features that, in my perspective, are part of the Strategic trend include Data Hub, DBaaS (Database as a Service), Delta Lake, Data Mesh, a single database for cross-cutting needs, Data Catalog, self-service data preparation, Data Management solutions, and Data Integration tools.
Trend 3: Accelerated
The core of this trend is “Optimisation of data”.
Long Live Streams (Realtime) and Dead ETL(?): Kafka and Kinesis have put streams on steroids. ETL, which might have died with the rise of streams, has made a comeback with technologies such as Glue. ELT is more predominant than ETL. More managed services are emerging for both ETL and streams. The key advantages of modern real-time streams and ETL are the ability to broker data for multiple purposes, lower latency, better performance, and storage that multiple consumers can access.
Other technologies include Augmented Data Management, better Master Data Management, and a better, cheaper, smarter, and faster conversion of Data into Business Insights.
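The idea of one stream brokered for several purposes and read by multiple consumers can be sketched in a few lines. The `StreamLog` class below is a hypothetical, in-memory toy; it does not use or imitate the Kafka or Kinesis APIs. It only illustrates the assumed core pattern: an append-only log where each consumer keeps its own read offset. The consumer names and order events are invented for the example.

```python
from collections import defaultdict


class StreamLog:
    """A toy append-only log where each consumer tracks its own offset."""

    def __init__(self):
        self._events = []
        # Maps a consumer name to the index of the next event it should read.
        self._offsets = defaultdict(int)

    def publish(self, event):
        """Append an event; existing events are never modified or removed."""
        self._events.append(event)

    def poll(self, consumer, max_events=10):
        """Return the next unread events for `consumer` and advance its offset."""
        start = self._offsets[consumer]
        batch = self._events[start:start + max_events]
        self._offsets[consumer] = start + len(batch)
        return batch


log = StreamLog()
for amount in (120, 80, 300):
    log.publish({"order_amount": amount})

# Consumer 1: near-real-time alerting on large orders.
alerts = [e for e in log.poll("alerting") if e["order_amount"] > 100]

# Consumer 2: a running total for a dashboard, reading the same events.
total = sum(e["order_amount"] for e in log.poll("dashboard"))

# A later event is delivered to each consumer independently.
log.publish({"order_amount": 50})
late_for_dashboard = log.poll("dashboard")
```

Because reading does not remove events, adding a new consumer later does not disturb the existing ones. That is the property that lets a single stream serve alerting, analytics, and storage at the same time.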
Trend 4: Decentralized
There is a constant debate about whether Data should be centralized or decentralized. With the advent of modern data platforms, data as a service, and data as a product, why should it be one or the other? With organizations generating more and more data, there should be a proven ability to switch between centralization and decentralization. This should not be an afterthought; it should be considered from inception and built into the data architecture.
Modern technologies such as Data Mesh, Data Fabric, Data Virtualization and Blockchain Technologies enable this big time.
Trend 5: Democratized
The current and next generations of data are moving towards Data Democratization at a fast pace. There is no technology dependency or demarcation between the sources from which you retrieve data and how you build analytics on top of it. The most useful technologies for the democratization of data include, but are not limited to, DataOps or DataSecOps, MLOps or AIOps, data sharing using Delta Tables, and AutoML.
Trend 6: Actionable Outcomes
Sharing data for actionable outcomes and publishing. After all, that is the purpose of data once it has been cleansed, stored, and analyzed, and Machine Learning algorithms have been applied to it.
Technologies include, but are not limited to: Continuous Intelligence & Explainable AI; Embedded AI; Responsible AI; smarter, faster, and more responsible AI; Blockchain in Data and Analytics; and NLP.
Trend 7: Monetized
If Data is the new oil, then we should be able to sell the oil or create derivatives from it, and so monetize Data. The best ways to go about this are Data & AI marketplaces and exchange platforms, Data as a Service, and Data as a Product.
Trend 8: Data Next
We are already in the next generation of data: unified and enriched Big Data and AI, Databricks, and the marriage of Quantum Computing and AI.
Summary: Along with the trends in Data Technologies, we also see new roles emerging frequently. In this series of articles, I intend to discuss the trends in Data Technologies and how they have changed the way solutions are built so that businesses gain value from data, with a focus on speed to market and top quality. This is a series of 8 articles.
This is a Multi-part series on Data Technology Trends — links are provided in the respective sections.
For other articles refer to luxananda.medium.com.