Unifying Data Management with Data Mesh and Data Fabric

Harish Govindaraju
4 min read · Feb 7, 2023


Image source: AnalyticsInsight.net

A brief overview

In the quest to build the best data architecture for your organization’s current and future needs, you have many options, but certain design patterns that have emerged in recent years can help you on that journey. With that said, let us explore what Data Fabric and Data Mesh are. At first glance the two look quite similar, which causes some confusion, but there are fundamental differences between the approaches, so it is worth taking some time to learn how they differ.

Data Fabric

Conceptually, a big data fabric is essentially a metadata-driven way of connecting a disparate collection of data tools that address key pain points in big data projects in a cohesive and self-service manner. Specifically, data fabric solutions deliver capabilities in the areas of data access, discovery, transformation, integration, security, governance, lineage, and orchestration.
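To make this concrete, here is a minimal Python sketch (the class and field names are illustrative assumptions, not a standard API) of the kind of metadata record a data fabric’s catalog might keep for each dataset, covering ownership, governance classification, and lineage, so consumers work through metadata rather than physical storage details.

from dataclasses import dataclass, field
from typing import List

@dataclass
class CatalogEntry:
    """Hypothetical metadata record a data fabric could keep for one dataset."""
    name: str                        # logical dataset name exposed to consumers
    physical_location: str           # e.g. a table, bucket path, or API endpoint
    owner: str                       # accountable team or data steward
    classification: str              # governance tag, e.g. "PII" or "internal"
    upstream_sources: List[str] = field(default_factory=list)  # lineage links

# Consumers discover data through the catalog, not by knowing where it lives.
orders = CatalogEntry(
    name="sales.orders",
    physical_location="s3://warehouse/sales/orders/",
    owner="sales-engineering",
    classification="internal",
    upstream_sources=["erp.order_events"],
)
print(orders.name, "owned by", orders.owner, "lineage:", orders.upstream_sources)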

Momentum is building behind the data fabric concept to simplify access to, and management of, data in an increasingly heterogeneous environment that includes transactional and operational data stores, data warehouses, data lakes, and lakehouses. Organizations are building more data silos, not fewer, and with the growth of cloud computing, the problems surrounding data diversification are bigger than ever.

The goal of a data fabric is to provide a secure, flexible, and scalable data layer capable of supporting modifications and future expansion, ensuring security, scalability, and data quality while reducing the cost of data management.

Data Fabric builds upon the idea of polyglot persistence: a polyglot data store combines storage approaches such as a relational database, a graph database, and/or a file/blob store. In contrast, reporting and analysis tools are not considered part of the core scope of the Data Fabric concept; they remain the responsibility of the data consumers.
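As a rough illustration of polyglot persistence, the sketch below uses in-memory stand-ins (plain Python dictionaries) for the three storage styles; a real data fabric would place a relational database, a graph database, and a blob store behind a similar facade.

class PolyglotStore:
    """Toy facade over three storage styles; backends are in-memory stand-ins."""

    def __init__(self):
        self.relational = {}   # table name -> list of row dicts
        self.graph = {}        # node -> set of connected nodes
        self.blobs = {}        # key -> raw bytes

    def insert_row(self, table, row):
        self.relational.setdefault(table, []).append(row)

    def link(self, a, b):
        self.graph.setdefault(a, set()).add(b)
        self.graph.setdefault(b, set()).add(a)

    def put_blob(self, key, data):
        self.blobs[key] = data

store = PolyglotStore()
store.insert_row("customers", {"id": 1, "name": "Acme"})
store.link("customer:1", "order:42")
store.put_blob("invoices/42.pdf", b"%PDF-...")
print(store.relational["customers"], store.graph["customer:1"], list(store.blobs))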

Data Fabric

Data Mesh

The Data Fabric concept already introduces the idea of “data as a service”, with datasets treated as “data products”, even though this terminology is not used explicitly. This is what ultimately led to the idea of a Data Mesh.

Existing centralized, monolithic data management platforms, with no clear domain boundaries and no ownership of domain data, fail for large enterprises with a large and diverse set of data sources and consumers. In a Data Mesh, the domains host and serve their datasets as domain data products, which encapsulate both the information and the functionality of the data. While the individual domain teams own the technology needed to store, process, and serve their data products, a common framework is needed to allow homogeneous interactions with those products.

As the diagram shows, data pipelines are also owned by the business domains, i.e., each domain is responsible for its own data transformations, and a domain can consume data products from another domain. Like Data Fabric, there is a strong focus on metadata, with a data catalog providing a cross-domain inventory of available data products. The Data Mesh gives each domain full autonomy to pick the tools of its choice for delivering data products, but at the same time requires that those products be shareable with other domains by adhering to the overall Data Mesh framework.
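One way to picture that common framework is a shared data product contract. The Python sketch below is an assumption for illustration (there is no single standard interface): each domain implements the same small contract, so its internal pipelines and storage stay private while its product remains discoverable and consumable by other domains.

from abc import ABC, abstractmethod
from typing import Dict, Iterable

class DataProduct(ABC):
    """Contract every domain's data product implements for cross-domain sharing."""
    domain: str
    name: str

    @abstractmethod
    def schema(self) -> Dict[str, str]:
        """Published schema other domains can rely on."""

    @abstractmethod
    def read(self) -> Iterable[dict]:
        """Serve the data; how it is produced is the owning domain's concern."""

class OrdersProduct(DataProduct):
    domain, name = "sales", "orders"

    def schema(self):
        return {"order_id": "int", "amount": "float"}

    def read(self):
        # In a real mesh this would invoke the domain-owned pipeline or query.
        yield {"order_id": 1, "amount": 99.5}

# Another domain consumes the sales product only through the shared contract.
product = OrdersProduct()
print(product.schema(), list(product.read()))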

Unifying Data Mesh with Data Fabric

Conclusion

Data Fabric builds upon the idea of polyglot persistence and ultimately aligns with a centralized data platform offering data as a service. Data Mesh, on the other hand, builds upon the decentralization of data platforms under the ownership of business domains: the individual domain teams own the technology needed to store, process, and serve their data products, while a common framework is enforced for sharing and subscribing to data products across domains. Unifying the best of both fabric and mesh takes us to a new realm of data platforms: the fabric’s polyglot persistence combined with the mesh’s decentralization unlocks the power of both designs.

#datamesh #datafabric

Works Cited

Inspired by the article by Priebe, T., Neumaier, S., & Markus, S. (2021). “Finding Your Way Through the Jungle of Big Data Architectures.” 2021 IEEE International Conference on Big Data (Big Data), pp. 5994–5996. doi:10.1109/BigData52589.2021.9671862.
