Image Credits: Fakurian Design (image source)

10 reasons why you are not ready to adopt data mesh

Thinh Ha
Google Cloud - Community
7 min read · Jan 3, 2022

Disclosure: I am employed by Google Cloud Professional Services; however, the opinions expressed here are mine alone and do not reflect the views of my employer.

There is a lot of excitement and scepticism at the moment about Data Mesh. Much has already been written about what a Data Mesh is, so I will not recap it here. In this article, I want to provide a view on why you should not adopt Data Mesh, as I believe this perspective is missing from the existing literature.

To be clear, I did not write this article to dismiss Data Mesh. I have personally found the concepts useful in my work advising GCP customers on large-scale data transformation programmes. The goal of this article is to encourage constructive conversations around Data Mesh adoption by describing where Data Mesh may not be the right solution.

With that, let’s get started:

1. You are not operating at a scale where decentralisation makes sense

Data Mesh is fundamentally about eliminating central bottlenecks in delivering value from data.

However, moving towards a completely decentralised model risks creating data silos and unnecessary duplication of effort, and may require bridging a much larger data analytics and engineering skills gap than can justifiably be bridged by decentralised teams alone.

Therefore, effective decentralisation still requires centralised coordination to align, enable, and support the decentralised data teams.

This coordination overhead may not be justifiable for small to medium-sized organisations, e.g. a 100-person organisation with 2–3 FTEs focusing on data and analytics. Instead, analytics needs across the organisation may be served more quickly and efficiently by a small, centralised data team.

On the other hand, a 100,000-person organisation spread across multiple geographical jurisdictions or legal entities will inevitably reach scaling bottlenecks with a single centralised data team. In these organisations, decentralisation will naturally emerge, and they most likely already have central business functions tasked with coordination and alignment across distributed teams. In that case, Data Mesh may be a suitable framework to structure this coordination and alignment effort.

2. You do not have a strong business-case for how adopting Data Mesh will deliver business value for individual business units

Borrowing ideas from DevOps, Data Mesh proposes federating ownership for generating value from data into individual product-oriented teams in order to enable faster value-creation and learning cycles. Each iteration cycle must therefore be guided by a clear and rapid feedback-loop as to what value each data team is generating for their customers.

Federating ownership often makes sense because business value can be better defined, prioritised, and iterated on by the business unit with the requirement.

However, taking on ownership of data analytics and engineering requires investment that may not be justifiable for some business units without a strong business-case.

Therefore, it is unlikely that you will be able to fully realise the operating model described by Data Mesh from the beginning. Instead, you should focus on rapidly proving the value from adopting Data Mesh over many short iteration cycles with a few early adopters and collaborate with the remaining business units to build the business-case for adopting this change more widely.

3. You treat Data Mesh as a technical solution with a fixed target rather than an operating model that continuously evolves over time

I interpret the core goal of Data Mesh as enabling your organisation to leverage data to adapt to change more effectively and meet customer needs more rapidly. As such, Data Mesh is a means to an end, not an end in itself.

Furthermore, there is no canonical reference implementation of Data Mesh. Therefore, every organisation adopting Data Mesh has to plan to evolve their Data Platform and Operating Model over time. The best way to structure this evolution process is through many small iteration cycles to experiment and learn the best way forward.

4. Your organisational culture does not empower bottom-up decision-making

When moving to a decentralised operating model, it is no longer feasible for any centralised body to maintain sufficient context or influence to make effective decisions in every distributed domain. Instead, you must trust in the distributed teams to autonomously find the best solution to the problems at hand.

Underpinning this mutual trust and individual autonomy is a culture of blamelessness and psychological safety to experiment, share knowledge, and learn together.

This article describes how to implement, improve, and measure cultural capabilities that drive higher software delivery and organisational performance.

5. You do not have clearly established roles & responsibilities and incentive structure for distributed data teams

Adopting Data Mesh requires navigating a number of new job roles such as Data Product Owner/Manager, Data Steward, Data Engineer, Data Scientist, or Analytics Engineer. Each plays a crucial function in enabling the distributed operating model.

These job roles and responsibilities must be standardised in order to create clarity of focus and to establish an incentivisation structure for these roles to work together to build the Data Mesh. Without this clarity and incentivisation structure, individuals may choose to prioritise other parts of their role over performing the necessary activities to build the Data Mesh such as building and sharing high-quality Data as Products.

6. You do not have a critical mass of data talent

To effectively make data and analytics a ubiquitous part of the business, every team will need to become more data-savvy. To achieve this, you may need to make a concerted effort to train your team members in skills such as data analysis, data visualisation, SQL, and machine learning.

As adopting Data Mesh may require a fundamental shift in how a team operates, people who can champion new ways of working and help enable and empower distributed teams are critical to lead the organisation towards adopting these changes.

A distributed organisation with high individual autonomy cannot be successful without a critical mass of data talent and a structured approach for learning and enablement.

7. Your data teams have low engineering maturity

High-performing engineering teams deliver faster, more frequently, fail less often, and recover from failure more quickly. They adopt a number of best-practices such as Continuous Integration and Continuous Delivery (CI/CD) to maintain high delivery velocity.

Automation is necessary for Data Mesh to deliver value at scale, and high-performing engineering teams are the ones most likely to already apply these DevOps practices in their daily work.

Data teams relying on manual, ad-hoc, and one-off processes are unlikely to be able to develop reliable and trustworthy Data as Products. These properties of a Data Product are crucial to enable the Data Mesh to scale.
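As a rough illustration of replacing a manual spot-check with an automated one, the sketch below runs two simple quality checks against a Data Product's output rows. Everything in it is hypothetical and only for illustration: the `CheckResult` type, the `check_not_null`, `check_unique`, and `run_checks` helpers, and the column names are assumptions, not part of any Data Mesh specification or tool.

```python
from dataclasses import dataclass
from typing import Any

# A row is modelled as a plain dict; a real pipeline would read from a table.
Row = dict[str, Any]


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    detail: str


def check_not_null(rows: list[Row], column: str) -> CheckResult:
    """Fail if any row is missing a value for `column`."""
    missing = sum(1 for row in rows if row.get(column) is None)
    return CheckResult(f"not_null:{column}", missing == 0, f"{missing} null value(s)")


def check_unique(rows: list[Row], column: str) -> CheckResult:
    """Fail if `column` contains repeated values (assumes hashable values)."""
    values = [row.get(column) for row in rows]
    duplicates = len(values) - len(set(values))
    return CheckResult(f"unique:{column}", duplicates == 0, f"{duplicates} duplicate value(s)")


def run_checks(rows: list[Row], key_column: str, required_columns: list[str]) -> list[CheckResult]:
    """Run the uniqueness check on the key, then a not-null check per required column."""
    results = [check_unique(rows, key_column)]
    results.extend(check_not_null(rows, column) for column in required_columns)
    return results
```

A CI/CD pipeline could call `run_checks` on every change and block the deployment if any result has `passed` set to `False`, so that quality is enforced by the process rather than by someone remembering to look.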

8. You expect to find off-the-shelf software to help you adopt Data Mesh

Like DevOps, Data Mesh is about more than just technology and tools. It is a mindset and cultural shift where teams adopt new ways of working.

Technology can only be the catalyst for cultural change. Furthermore, Data Mesh emphasises how technology can be used for data integration, rather than what technology to use. Therefore, it is unlikely that adopting any technology solution alone will help you realise value from Data Mesh.

9. You do not have buy-in to “shift-left” security, privacy, and compliance

Distributing ownership over data cannot result in lower security, privacy, and compliance standards. Instead, security, privacy, and compliance must become everyone’s responsibility.

Research from DevOps Research and Assessment (DORA) shows that teams can achieve better outcomes by making security a part of everyone’s daily work instead of testing for security concerns at the end of the process.

“Shifting-left” security, privacy, and compliance into development processes requires collaboration between relevant stakeholders and every embedded data team. Therefore, it is important to get buy-in from all relevant parties and involve security, privacy, and compliance stakeholders as early as possible when adopting Data Mesh.

10. You do not consider Data Governance to be a core activity to be prioritised against other activities in every data team’s backlog

Federated Computational Governance is a fundamental principle of Data Mesh, with an emphasis on automation and standardisation in order to enable more comprehensive and real-time monitoring, detection, and remediation of policy violations.

Like Security & Privacy, Data Governance must “shift-left” to become a part of daily work for every data team. As such, Data Governance concerns such as enhancing Data and Metadata quality need to be prioritised in every data team’s backlog. To embed Data Governance into standard development processes, these activities can be raised as tickets directly to the relevant data team’s backlog, or automated as tests that every code change must pass in order to be integrated and deployed to production.
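To make the idea of governance "automated as tests" more concrete, here is a minimal sketch of a metadata policy check that could run on every code change. The policy rules, the `DataProductMetadata` and `ColumnMetadata` types, the classification labels, and the `find_policy_violations` function are all assumptions made up for this example; your organisation's actual policies and metadata model will differ.

```python
from dataclasses import dataclass, field

# Hypothetical classification labels; substitute your organisation's own scheme.
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential", "restricted"}
PII_SAFE_CLASSIFICATIONS = {"confidential", "restricted"}


@dataclass
class ColumnMetadata:
    name: str
    description: str = ""
    contains_pii: bool = False


@dataclass
class DataProductMetadata:
    name: str
    owner: str = ""
    classification: str = ""
    columns: list[ColumnMetadata] = field(default_factory=list)


def find_policy_violations(product: DataProductMetadata) -> list[str]:
    """Return a human-readable message for each violated (example) policy."""
    violations: list[str] = []
    if not product.owner.strip():
        violations.append(f"{product.name}: missing owner")
    if product.classification not in ALLOWED_CLASSIFICATIONS:
        violations.append(f"{product.name}: unknown classification '{product.classification}'")
    for column in product.columns:
        if not column.description.strip():
            violations.append(f"{product.name}.{column.name}: missing description")
        if column.contains_pii and product.classification not in PII_SAFE_CLASSIFICATIONS:
            violations.append(f"{product.name}.{column.name}: PII in a '{product.classification}' product")
    return violations
```

In this sketch, a pipeline step would fail the build whenever `find_policy_violations` returns a non-empty list, which turns a governance standard into something each data team sees and fixes as part of its normal development loop.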

To drive automation, standardisation, and best-practices, you may need to establish specialist engineering teams who can develop tooling and processes and provide specialist advice to help distributed teams meet Data Governance policies and standards more easily.

Conclusion

Data Mesh is not a silver bullet for all your data management problems. It can be challenging — maybe even unrealistic — to implement for many organisations.

Regardless, I predict that many large organisations looking to scale impact from data and analytics will eventually adopt some version of Data Mesh; however, not every element of Data Mesh will be implemented, or at least not immediately. For example, building a reusable and self-serviceable Data Platform as a Product is already emerging as a best-practice for cloud adoption. This requires implementing the platform using infrastructure-as-code and CI/CD with embedded and continuous controls, setting the foundations for Federated Computational Governance. Once organisations are confident that they can operate the Cloud Data Platform safely and at scale, Decentralised Data Ownership can be adopted widely. This might in turn prompt adjustments to Data Domain boundaries and even the org chart to allow distributed data teams to effectively develop Data as Products.

Instead of trying to implement every element of Data Mesh, I would recommend first focusing on how you can empower your data teams to deliver value faster and more frequently for your customers, then work backwards to identify and adopt specific elements of Data Mesh that will help you to achieve this goal.

I hope you’ve found this article useful, interesting, or thought-provoking. If so, please share it and leave a comment.

Check out a podcast I recently recorded with Scott Hirleman (founder of the Data Mesh Learning Community) where I went into more detail on each point in this article: https://daappod.com/data-mesh-radio/are-you-ready-for-data-mesh-thinh-ha/

Special thanks to my colleagues Magdalena Algawam, Pradeep Bhattiprolu, Razvan Culea, Radek Stankiewicz, and Ada Tagoe for your help reviewing this article.

Thinh Ha

Strategic Cloud Engineer at Google Cloud Professional Services