Considerations for a Unified Modern Data Platform

Rajesh Warrier
Jul 9, 2024

Modern Data Platforms are changing the way enterprises operate and make decisions. In this series of blogs, we will talk about how digital transformations have paved the way for the emergence of next-gen technologies and Modern Data Platforms.

Digital transformations have revolutionized the way Enterprises function, paving the way for:

  • Large-scale end-to-end software automation across various business processes.
  • Skilled talent across the spectrum.
  • An improved overall experience for customers & stakeholders (both internal and external).

Enterprises have, thereby, adopted technology across business units, business functions & user personas. They have greatly evolved across various business functions by leveraging:

  • Specialized tools (SaaS, Custom Solutions, Accelerators, Scripts, Automations etc.).
  • Cloud native services (Serverless, Fully-managed, Provisioned/Dedicated etc.).
  • Deep integrations (Authentication, Authorization, Data access, Resource management/Sharing etc.).

As a result, enterprises have matured and are continuously improving the data-value-chain.

Associates across the enterprise's cross-functional teams have also remodeled themselves by:

  • Accepting the massive change in adopting technologies & crafting robust processes.
  • Aligning towards a significantly cohesive & collaborative environment (more hybrid).
  • Contributing proactively and aggressively towards the wave of change.

This has resulted in the development of better empathy and a great culture within organizations.

Enterprise IT needs are continuously evolving, with data platforms being unified and oriented towards business domains:

  • Dealing with the 4 Vs of data (Volume, Variety, Velocity, Veracity).
  • Managing data silos, orchestration & harmonization to establish a single-source-of-truth.
  • Descriptive analytics & metrics (KPIs, KCIs, KRIs) powered by visualizations (BI/reports/dashboards).
  • Predictive & Prescriptive analytics leveraging ML/AI.
  • Generative AI & RAG for contextual conversations, Q&A and information retrieval.

This empowers data-driven decisions across the enterprise.

Technology innovators have commercialized a plethora of offerings by providing:

  • A variety of cloud services on AWS, Azure, GCP etc. across various regions.
  • Purpose-driven storage options (relational, object, graph, columnar, timeseries etc.).
  • Efficient & scalable compute (serverless, fully-managed, on-demand, provisioned etc.).
  • Data platforms (Snowflake, Databricks, Synapse, Redshift, BigQuery etc.).
  • ELT/ETL tools (Matillion, Fivetran, dbt, Talend, SnapLogic, StreamSets, Informatica etc.).
  • MDM tools (Reltio, Informatica, Verato, Semarchy, Profisee etc.).
  • Cloud Native & Managed Services for ELT, ETL, Orchestration (ADF, Glue, Airflow etc.).
  • Complete data security (at-rest, in-transit, on-access, aggregated data, DPP, DLP etc.).
  • Deep integrations (APIs, SDKs, Auth, Tokens etc.) & technology partner ecosystem.
  • Visualization & Data Apps (Power BI, Tableau, Qlik, Spotfire, Streamlit etc.).
  • DataOps (DataOps.live, Unravel Data, Acceldata, CloudKitchen etc.).

This has brought agility, flexibility, extensibility, scalability, affordability, deep integrations and faster adoption.

The Data Platform Blueprint

Various “Shifts” and their evolution:

Shift Left:

In yet another point of inflection, one that is leading to the creation of new technology verticals, enterprises are realizing that their needs are shifting further to the left:

  • Cataloging the data (tables, columns, objects, artifacts, code components, models etc.) and managing metadata, thereby empowering discovery.
  • Establishing Data Governance to classify sensitive data (PII, PHI, PCI etc.) and provide RBAC & data masking, powered by continuous model learning (human-in-the-loop).
  • Applying effective Data Quality checks & business rules (read: data contracts) powered by Observability, thereby ensuring accuracy, completeness, relevancy, validity, timeliness/freshness, consistency, reliability etc. by fixing data issues much earlier and building trust in the data.
  • Building a unified definition of your metrics using a Semantic Layer, thus enabling metrics standardization (definition) & discovery across BI tools.
  • Adopting DataOps based on the best software engineering & agile practices, automation & collaboration processes throughout the data-value-chain, thereby continuing to improve the overall experience, quality, speed of delivery and time-to-market.

Data Governance & DataOps have, hence, become the #1 priority for all enterprises.
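
As a rough illustration of the Data Quality checks and data contracts from the Shift Left list, the Python sketch below validates a batch of records for completeness, validity and freshness. The contract fields, the sample `orders` records and the `check_contract` helper are hypothetical names made up for this example and do not come from any specific tool.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical data contract: required columns, allowed values, freshness limit.
CONTRACT = {
    "required": ["order_id", "amount", "status", "updated_at"],
    "allowed_status": {"NEW", "SHIPPED", "CANCELLED"},
    "max_age": timedelta(days=1),
}


def check_contract(rows, contract, now):
    """Return a list of (row_index, issue) tuples for rows that break the contract."""
    issues = []
    for i, row in enumerate(rows):
        # Completeness: every required field must be present and non-null.
        missing = [c for c in contract["required"] if row.get(c) is None]
        if missing:
            issues.append((i, f"missing: {', '.join(missing)}"))
            continue
        # Validity: amount must be non-negative, status must be a known value.
        if row["amount"] < 0:
            issues.append((i, "negative amount"))
        if row["status"] not in contract["allowed_status"]:
            issues.append((i, f"unknown status: {row['status']}"))
        # Freshness: the record must have been updated recently enough.
        if now - row["updated_at"] > contract["max_age"]:
            issues.append((i, "stale record"))
    return issues


# Illustrative sample data; the reference time is fixed so results are repeatable.
now = datetime(2024, 7, 9, tzinfo=timezone.utc)
orders = [
    {"order_id": 1, "amount": 120.0, "status": "NEW", "updated_at": now},
    {"order_id": 2, "amount": -5.0, "status": "SHIPPED", "updated_at": now},
    {"order_id": 3, "amount": 40.0, "status": "LOST", "updated_at": now - timedelta(days=3)},
    {"order_id": 4, "amount": None, "status": "NEW", "updated_at": now},
]

for index, issue in check_contract(orders, CONTRACT, now):
    print(f"row {index}: {issue}")
```

Running checks like these at ingestion, rather than at the reporting layer, is one way to surface data issues earlier in the data-value-chain.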

Shift Right:

Until a few years ago, enterprise data & analytics mainly focused on having metrics (KPIs, KCIs, KRIs) and visualizations (reports, dashboards) leveraging aggregate data with deep dive views for descriptive data analytics.

Typically, various Ops teams (like SalesOps, RevenueOps etc.) leveraged the analytics reports & dashboards (summarized & deep-dive views) to:

  • Democratize know-how, improving alignment across the enterprise.
  • Guide and mentor the next generation on driving business growth.
  • Improve targeting (cross-sell, up-sell) & thereby revenue.

Data & analytics divisions have, as a result, become the backbone of enterprises.

Given the success of data & analytics platforms thus far, the need has greatly increased for:

  • More and more comprehensive analytics, reports & dashboards.
  • Improved data governance, data security, data quality & collaboration.
  • Building ML/AI models that augment descriptive analytics with predictive analytics (predictions, classification, recommendations, anomaly detection, fraud detection etc.).

This motivates enterprises to make further significant investments, thereby driving improvement across the board (360°).

Shift Up:

While enterprises are still shifting left and improving data governance, data quality, lineage, transparency, collaboration etc., there is a growing demand for ML/AI integrations. This is especially true for automating much of what humans typically do today "in a semi-automated manner", in spite of best-in-class tools being available.

Think about it:

  1. Auto-classifying similar types of data (orchestrated from a new source system) and suggesting, deriving or augmenting most of the metadata & definitions needed, be it cataloging, classification, quality rules, measures & expressions or similar KPIs/KRIs/KCIs, and auto-augmenting reports & dashboards based on approval workflows.
  2. Developing domain-specific models that solve specific problems efficiently, and additionally combining the output of several models to further improve the quality of results (recommending the best price triangulated with the best rated, information retrieval by data popularity with deep user and access context etc.).
  3. Intent-driven data security & access management!
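
To make the first idea a little more concrete, here is a minimal Python sketch that suggests a sensitivity class for columns arriving from a new source by matching sample values against patterns, and leaves the final decision to a human reviewer. The patterns, the 0.8 threshold and the names `PATTERNS` and `suggest_classification` are illustrative assumptions. A production system would more likely rely on a trained model than on regular expressions.

```python
import re

# Illustrative patterns only; real classifiers need far more robust rules or models.
PATTERNS = {
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "US_SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "CARD_NUMBER": re.compile(r"^\d{4}([ -]?\d{4}){3}$"),
}


def suggest_classification(samples, threshold=0.8):
    """Suggest a label when at least `threshold` of the non-empty samples match a pattern."""
    values = [str(v).strip() for v in samples if v not in (None, "")]
    if not values:
        return None
    for label, pattern in PATTERNS.items():
        matches = sum(1 for v in values if pattern.match(v))
        if matches / len(values) >= threshold:
            return label
    return None


# Hypothetical sample values pulled from a newly onboarded source table.
new_source = {
    "contact": ["a.rao@example.com", "lee@example.org", None, "kim@example.net"],
    "tax_id": ["123-45-6789", "987-65-4321"],
    "payment": ["4111 1111 1111 1111", "5500-0000-0000-0004"],
    "city": ["Pune", "Austin", "Leeds"],
}

# Suggestions would then be routed to a data steward for approval (human-in-the-loop).
suggestions = {col: suggest_classification(vals) for col, vals in new_source.items()}
print(suggestions)
```

Columns that match no pattern come back as `None`, so only the suggested labels need to enter an approval workflow.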

What’s “Happening” in Data & Analytics

Technology vendors continue to innovate tools and solutions (especially over the cloud, ensuring complete data security) across a variety of categories:

  • Low-code/no-code tools for creating (read: developing) apps and business solutions/workflows deployed and fully managed over the cloud.
  • Data platforms, data warehouses, data lakes, lakehouses etc. have evolved and brought in innovation by providing out-of-the-box native features for data acquisition, data governance, data quality, data observability, data orchestration & transformations, data analytics, data virtualization & semantic layers, BI & visualization, ML/AI, chatbots, NLP, GenAI & LLMs etc.
  • DataOps and Data Mesh.

Vendors are thus specializing in one or a few verticals and trying to build a cooperative partner ecosystem via deep integrations (APIs, Auth, Agents & Callbacks, Secure Metadata Sharing, Proxy URLs), thereby abstracting away, and providing hooks for, the required custom implementations.

Conclusion:

Choosing the right technology stack, tools and services for your digital transformation may look like a herculean task. However, evaluating and adopting the technologies and processes that suit your enterprise is the key to success. A well-thought-out methodology, process and tool selection help you stay ahead of technological advances. We will explore this in more detail in part #2 of this blog series.

Let us connect: LinkedIn, Saama
