Data Mesh In Practice — Thoughtworks’ Recommendations from Roche’s Journey

The recommendations given are intended to guide the reader through the complexity of Data Mesh and provide some clarity and orientation.

Axel Schwanke
14 min read · Dec 27, 2023
Table of Contents

· Introduction
· Getting off to the right start (Part I)
· Organizational operating model (Part II)
· Product thinking and development (Part III)
· Technology and the architecture (Part IV)
· Conclusion
· References

Introduction

Data Mesh represents a new approach to data architecture that emphasizes decentralized data ownership and domain-oriented thinking. This article provides a summary of Thoughtworks' findings and recommendations from its work with Roche, a global leader in healthcare. The original four-part series provides practical guidance on achieving transformative business value through the adoption of Data Mesh principles. This paradigm shift requires a nuanced understanding of its principles and a strategic approach to its implementation. In the following sections, we dive deeper into the intricacies of Data Mesh and provide insights and recommendations for adopting and implementing it successfully.

Getting off to the right start (Part I)

As organizations increasingly recognize the transformative potential of Data Mesh, Thoughtworks shares valuable insights from its extensive experience in successfully implementing Data Mesh principles, particularly through its engagement with Roche in 2022. The Data Mesh approach goes beyond a mere architectural shift; it represents a socio-technical paradigm, demanding a comprehensive transformation of processes, operating models, and technology. The series of articles looks at the practical aspects of adopting Data Mesh at scale, highlighting key principles such as ‘domain ownership and architecture’ and ‘data as a product’. The focus is on the critical adaptation of operating models, the intricacies of product thinking, technology evolution, and cross-domain maintenance and evolution. By addressing challenges such as scaling versus continuous learning, defining and empowering domains, and avoiding premature scaling, the series offers a roadmap for organizations to navigate the complexities of Data Mesh adoption and ensure the delivery of significant business value.

Double Diamond Approach to Onboarding a business unit / domain to the Data Mesh process, © Thoughtworks

Recommendations

1.1. Balancing Learning and Scaling

Rapid scaling is tempting, but organizations must avoid the trap of scaling too quickly. Premature scaling can discourage teams from collaborating across the organization and prevent effective learning.

Recommendation: Foster a culture that values continuous learning and implement practices that balance learning and scaling. Enable decentralized teams to work in parallel and share insights.

1.2. Defining and Empowering Domains

Defining clear boundaries for domains is essential, but reorganizing business units solely for data products might be excessive. Autonomy also raises questions about setting precise goals, defining metrics, and ensuring the necessary support.

Recommendation: Leverage existing organizational structures that enable domain teams to make data-related decisions aligned with both business requirements and the data strategy. Use models such as EDGE to ensure clear alignment of goals.

1.3. Premature Scaling and Effective Projects

Some organizations dive into Data Mesh without a clear roadmap, resulting in projects stalling in the initial proof-of-concept phase due to a lack of direction and basic planning.

Recommendation: Apply a structured onboarding approach, such as the Design Council’s double diamond, and emphasize the basic ‘why’ and ‘what’ aspects of Data Mesh before moving on to the ‘how’ of implementation. Start by leveraging current organizational structures and boundaries.

1.4. Evolving Data Mesh Best Practices

Data Mesh best practices are constantly evolving, and there is no one-size-fits-all approach. Organizations need to address evolving practices and tailor solutions to their specific context.

Recommendation: Stay flexible and follow a comprehensive discovery process. Define the “Data Mesh delta” to identify gaps and run concurrent streams for the operating model, product, and technology to ensure a holistic approach.

1.5. Organizational and Operating Model Changes

Implementing Data Mesh requires profound changes in organizational dynamics, processes and operating models, which poses a challenge for alignment, coordination and change management in the various business units and functions.

Recommendation: Start with the organizational boundaries already in place and manage the changes strategically by aligning them with business goals and customer value. Explicitly define the desired outcomes and ensure a seamless transition between the current and the target state.

1.6. Three Types of Change: Operational, Product, and Technological

Successful Data Mesh adoption requires significant changes in operational structure, products, and technology. This holistic transformation poses a challenge for coordination and seamless implementation in these complex areas.

Recommendation: Address all aspects simultaneously to ensure alignment and steady progress. Leveraging Roche’s experience can provide invaluable insights to make strategic decisions and maximize value creation.

Key practice: start your journey with existing organizational boundaries

Defining domains is a key part of the Data Mesh journey, but redefining domain boundaries can introduce a lot of additional challenges, and isn’t always a necessary prerequisite. We recommend starting with the organizational boundaries you already have, and only moving those boundaries to create new domains if you hit significant barriers along the way.

The three-stream discovery process running across multiple domains at Roche, © Thoughtworks

Organizational operating model (Part II)

The second part of the series on successful Data Mesh implementations looks at the critical operating model changes required to support Data Mesh. The three-stream discovery process, which focuses on the operating model stream, begins with a clear vision for Data Mesh that is aligned with strategic goals. Stakeholders answer critical questions to ensure compliance with Data Mesh principles. Organizations are aligned toward the EDGE operating model, which emphasizes autonomy, domain-driven innovation, and flexible governance. The key artifact, the Lean Value Tree (LVT), helps prioritize and communicate goals. Two critical practices emerge: ensuring organizational readiness for decentralization and clearly defining responsibilities. The operating model stream culminates in governance structures, addressing portfolio, domain, and technology aspects. The three principles for operational evolution emphasize the importance of vision, team empowerment and early consideration of governance.

Guiding clients towards operating models based on the EDGE operating model, © Thoughtworks

Recommendations

2.1. Scaling Dynamics and Cross-Domain Learning

Balancing rapid scaling with effective cross-domain learning can lead to knowledge silos. This can hinder collaboration and limit the holistic benefits of implementing Data Mesh.

Recommendation: Integrate design thinking-based onboarding for scalable learning. By emphasizing cross-functional collaboration, teams can better align their strategies, share insights and work together to drive the scalability and effectiveness of the Data Mesh framework.

2.2. Roles and Responsibilities Definition

Lack of clarity about roles and responsibilities within domains can lead to overlap, inefficiency and conflict and hinder productivity. Clear definitions are critical to streamlining operations and ensuring that tasks are assigned and executed appropriately.

Recommendation: In the discovery phase of Data Mesh, explicit definition and communication of roles ensures that team members understand their responsibilities. This clarity promotes accountability, minimizes confusion and optimizes the flow of work between domains.

2.3. Strategic Onboarding for Project Effectiveness

Initiating Data Mesh projects without proper strategic involvement carries the risk of misalignment and may not contribute effectively to overall business objectives.

Recommendation: Adopt a comprehensive, strategic onboarding approach to ensure that projects are harmoniously aligned with the company’s broader strategic goals and that their potential impact and value creation is optimized.

2.4. EDGE Model Integration

When implementing Data Mesh, one often faces challenges such as inconsistent value definitions, disjointed strategies, unclear priorities and misaligned goals.

Recommendation: Use the EDGE operating model for onboarding, together with the Lean Value Tree (LVT) artifact, to establish a consistent definition of value and clear strategic alignment with Data Mesh principles.

2.5. Organizational Readiness for Decentralization

Adapting organizational structures to decentralized innovation can mean a significant change in culture, processes and leadership that challenges established norms.

Recommendation: Execute a holistic transformation program, incorporating change management practices to effectively transition and empower teams for decentralized data-driven initiatives and ensure sustainable success.

2.6. Clarifying Responsibilities

Lack of clarity in communicating responsibilities within domain teams can lead to overlapping tasks and unclear responsibilities, which can result in inefficiencies and misunderstandings during Data Mesh implementations.

Recommendation: During the discovery phase, provide teams with comprehensive training on their roles, responsibilities and dependencies. Offer ongoing mentoring and resources to strengthen understanding and promote collaboration.

2.7. Incentives and Support for New Roles

Introducing Data Mesh can cause resistance due to the unfamiliarity and potential disruption of established roles.

Recommendation: Create rewarding incentives and robust support mechanisms for those transitioning into Data Mesh roles to drive motivation, increase engagement and strengthen accountability within the evolving data ecosystem.

2.8. Effective Governance Structures

Developing effective governance structures is critical to ensure consistent data quality, security and compliance. But it can be complex due to the decentralized nature and differing requirements of each domain.

Recommendation: Establish clear portfolio, domain, and technology governance aligned with the EDGE model to ensure streamlined decision-making and alignment with organizational goals.

Lean Value Tree (LVT) for aligning priorities in the operating model discovery process, © Thoughtworks

3 principles for successful operational evolution

1. Start with a clear vision of what you want to achieve, and work down through the Lean Value Tree to identify specific hypotheses to help you get there.

2. Empower teams to take up and act on their new responsibilities, and ensure that they're able to move on from failed hypotheses quickly, in line with the EDGE operating model's feedback cycles.

3. Consider governance early in the adoption and onboarding process, and build it into the product and technology decisions that will follow, rather than working to build structures around what you've created later.

Interaction between governing bodies and data product teams, © Thoughtworks

Product thinking and development (Part III)

The third part of the series focuses on strategies for unlocking tangible business value. Moving beyond abstract concepts, the emphasis is on the practical aspects of creating data products, with the Lean Value Tree (LVT) playing a central role. The LVT serves as a linchpin to align use cases with business goals and helps stakeholders strategically prioritize goals and hypothetical use cases. This deliberate approach prevents unintended redundancies and ensures that data products meet consumer needs. Key practices include completing value-oriented templates for each use case, using hypothesis and data product templates, and applying product thinking principles. The article emphasizes the importance of defining clear Service-Level Objectives (SLOs) to improve data reusability and highlights a discovery exercise, Product Usage Patterns, to establish SLOs based on stakeholder expectations. By mapping data products to domains, the LVT provides a tangible link between data and business goals, facilitates change management, and ensures clarity around ownership and compliance.

What makes up a data product, © Thoughtworks

Recommendations

3.1. Use Case Clarity

Unclear use cases can lead to ambiguity, making it difficult for teams to decide which data products to prioritize, which in turn complicates strategic decision-making processes within the organization.

Recommendation: Develop a structured approach using value-oriented templates for each use case, ensuring alignment with organizational goals. Engaging relevant stakeholders ensures diverse perspectives and comprehensive input.

3.2. Product Thinking

Insufficient application of product thinking principles can hinder the development of data products and potentially lead to solutions that do not fully meet user needs or business objectives.

Recommendation: Embrace product thinking best practices, emphasizing customer understanding, continuous improvement, cross-domain collaboration, and diverse perspectives. Provide necessary training and support for domain decision-makers.

3.3. Ambiguity in Data Product Definition

Lack of clarity in the definition of data products can lead to misunderstandings that result in inefficiencies, duplication of work and missed opportunities for optimized data usage and value creation.

Recommendation: Introduce a standardized data product template that guides teams in articulating the purpose, functionality, and stakeholders of each data product. Use clear questions to shape and define the specifics of the data product.
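Such a template can be sketched as a simple data structure. The fields below are illustrative assumptions for the purpose of the example, not Thoughtworks' actual template:

```python
from dataclasses import dataclass, field

@dataclass
class DataProductTemplate:
    """Illustrative template for articulating a data product (fields are assumptions)."""
    name: str
    domain: str
    purpose: str                                      # why does this product exist?
    owner: str                                        # accountable domain team
    consumers: list = field(default_factory=list)     # who depends on it?
    output_ports: list = field(default_factory=list)  # how is the data served?

    def is_complete(self) -> bool:
        """A product definition is complete when every core field is filled in."""
        return all([self.name, self.domain, self.purpose, self.owner])

# Example: a hypothetical clinical-trials data product
product = DataProductTemplate(
    name="trial-enrollment",
    domain="clinical-operations",
    purpose="Provide daily enrollment figures per trial site",
    owner="clinical-ops-data-team",
    consumers=["portfolio-reporting"],
    output_ports=["parquet-s3", "rest-api"],
)
print(product.is_complete())  # True: all core fields are set
```

Forcing teams to fill in every field answers the "clear questions" above before any pipeline is built.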

3.4. Overlooking Interconnected Data Products

Failure to recognize overlaps in data usage across different domains can lead to duplication of effort, redundant projects, wasted resources and a disjointed data ecosystem.

Recommendation: Develop a data product interaction map, visualizing how data products interconnect. Regularly update and integrate these maps to create a cohesive data product landscape, avoiding redundancy.
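An interaction map is essentially a directed graph of producer-consumer relationships. A minimal sketch, with hypothetical product names, shows how shared upstream sources can be surfaced automatically:

```python
from collections import defaultdict

# Directed graph: an edge producer -> consumer means the consumer
# data product depends on the producer's output.
interactions = defaultdict(set)

def add_dependency(producer: str, consumer: str) -> None:
    interactions[producer].add(consumer)

# Hypothetical products across two domains
add_dependency("raw-sales", "sales-aggregates")
add_dependency("raw-sales", "finance-forecast")
add_dependency("sales-aggregates", "finance-forecast")

def shared_sources(graph) -> dict:
    """Products consumed by more than one downstream product -- candidates
    for coordination rather than duplicated ingestion."""
    return {p: sorted(c) for p, c in graph.items() if len(c) > 1}

print(shared_sources(interactions))
# {'raw-sales': ['finance-forecast', 'sales-aggregates']}
```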

3.5. Undefined Service-Level Objectives (SLOs)

Without clearly defined SLOs, data product teams struggle to ensure consistent performance, resulting in unreliable data products and hindering their reuse for different organizational needs.

Recommendation: Define SLOs based on service-level indicators (SLIs), ensuring targeted service levels. Utilize a discovery exercise, such as Product Usage Patterns, to understand stakeholder expectations and set appropriate SLOs.
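An SLO is a target value for a measured SLI. The sketch below checks two hypothetical SLIs against their SLOs; the indicators and thresholds are illustrative assumptions, not values from the series:

```python
from dataclasses import dataclass

@dataclass
class SLO:
    """A target for a service-level indicator (SLI); values are illustrative."""
    indicator: str
    target: float
    comparison: str = ">="

    def is_met(self, measured: float) -> bool:
        if self.comparison == ">=":
            return measured >= self.target
        return measured <= self.target

# Hypothetical SLOs for a data product
slos = [
    SLO("completeness", 0.99),                      # >= 99% of expected rows present
    SLO("freshness_hours", 24.0, comparison="<="),  # data no older than 24 hours
]

measured = {"completeness": 0.995, "freshness_hours": 30.0}

for slo in slos:
    status = "met" if slo.is_met(measured[slo.indicator]) else "VIOLATED"
    print(f"{slo.indicator}: {status}")
# completeness: met
# freshness_hours: VIOLATED
```

Publishing the measured SLI values alongside the SLOs is what makes a data product trustworthy enough to be reused by other domains.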

3.6. Mapping Data to Business Goals

Without aligning data products to specific business goals, there is a risk that solutions are developed without clear objectives. This discrepancy can lead to inefficiencies, misallocation of resources and lower overall value for the organization.

Recommendation: Utilize the Lean Value Tree (LVT) to map data products to domain goals. This step is critical for change management, providing a tangible link between data and business outcomes.
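The LVT decomposes a vision into goals, bets, and the data products that support them, which makes the data-to-goal traceability mechanical. A minimal sketch, with a hypothetical tree (the structure and names are illustrative, not Roche's actual LVT):

```python
# A Lean Value Tree: vision -> goals -> bets, with data products
# attached to the bets they support (all names are hypothetical).
lvt = {
    "vision": "Faster, data-informed clinical decisions",
    "goals": [
        {
            "goal": "Reduce trial-reporting lead time",
            "bets": [
                {
                    "bet": "Self-serve enrollment dashboards cut manual reporting",
                    "data_products": ["trial-enrollment", "site-performance"],
                }
            ],
        }
    ],
}

def products_for_goal(tree: dict, goal_name: str) -> list:
    """Trace which data products support a given business goal."""
    for g in tree["goals"]:
        if g["goal"] == goal_name:
            return sorted(p for b in g["bets"] for p in b["data_products"])
    return []

print(products_for_goal(lvt, "Reduce trial-reporting lead time"))
# ['site-performance', 'trial-enrollment']
```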

3.7. Lack of Data Product Ownership and Governance

Undefined ownership and governance structures can lead to confusion. Lack of clarity can encourage the misuse of data, hinder accountability and compromise data integrity and security.

Recommendation: Clearly define ownership and stewardship for each data product. Establish data sharing agreements, access policies, and compliance measures. Use a Data Product MVP Checklist to set minimum requirements for each data product.

The Data Mesh LVT — mapping data products to the domain LVT, © Thoughtworks

Technology and the architecture (Part IV)

The fourth part deals with the technical stream and focuses on the architectural decisions that are essential for the success of Data Mesh. The insights gained from the collaboration with Roche cut through the complexity and provide practical guidance. An outstanding result is the Data Mesh logical architecture, a visual representation that clarifies domain ownership, use cases, user interactions, operational systems, and self-service platform capabilities. Technical practitioners engaged in the Data Mesh discovery process benefit from practices such as treating data products as architectural quanta. Each data product, encapsulated in its own repository, has its own lifecycle and independent deployment capability, fostering robustness within the Data Mesh. The approach emphasizes optimized developer experiences, consistent metamodels, and automated governance to ensure smooth creation of data products and compliance with organizational standards. These technical insights form the foundation for scalable, interoperable and valuable data ecosystems in the evolving landscape of the Data Mesh.

The Data Mesh logical architecture — mapping out each domain’s data products across the Data Mesh, © Thoughtworks

Recommendations

4.1. Bridging Data Mesh Delta

Understanding the complexity of current platform capabilities while seamlessly integrating new technological solutions and architectural designs presents significant difficulties and potential obstacles.

Recommendation: Conduct a thorough analysis of existing capabilities during the onboarding process. Develop a clear plan for bridging the gap and integrating new technology smoothly.

4.2. Data Product Quantum

Enforcing a rigid approach that treats data products as indivisible entities can be limiting, especially when it comes to meeting the diverse and dynamic needs of consumer-oriented data products.

Recommendation: Flexibility can be maintained by assessing the need for providing multiple data sets via output ports. However, favor the creation of new data products if required for better scalability and independence. Building the mesh with data products as its architectural quantum makes the Data Mesh robust.

4.3. Multi-Plane Data Platform Design

An overemphasis on the data infrastructure aspect could narrow the perspective of a Data Mesh implementation and potentially overlook critical organizational, cultural and process-related factors.

Recommendation: Ensure careful assessment at multiple levels, including the data product developer experience and mesh supervision levels. Consider a holistic approach to platform design to enhance overall effectiveness.

4.4. Streamlined Developer Experiences

Friction in the creation and maintenance of data products can hinder the benefits of Data Mesh and affect its flexibility, scalability and overall effectiveness within an organization.

Recommendation: Invest in the development of an intuitive developer experience, including a specification language and a registry of capabilities. Empower domains to build their own products with minimal friction, promoting consistent and interoperable data products.

4.5. Consistent Metamodel

Inconsistencies in the definition and structuring of data product attributes across domains can create significant interoperability challenges, potentially leading to data misalignment and limited collaboration.

Recommendation: Establish a consistent metamodel for data products, enforcing mandatory attributes. Utilize a cataloging process to ensure visibility and searchability, promoting interoperability across the organization.
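Enforcing a consistent metamodel can start as a simple validation of mandatory attributes at catalog-registration time. The attribute set below is an illustrative assumption:

```python
# Mandatory attributes every data product must declare before it enters
# the catalog (this attribute set is an illustrative assumption).
MANDATORY_ATTRIBUTES = {"name", "domain", "owner", "description", "output_ports"}

def validate_metamodel(product: dict) -> list:
    """Return the mandatory attributes that are missing or empty."""
    return sorted(a for a in MANDATORY_ATTRIBUTES if not product.get(a))

descriptor = {
    "name": "trial-enrollment",
    "domain": "clinical-operations",
    "owner": "clinical-ops-data-team",
    "description": "",            # empty -> fails validation
    "output_ports": ["rest-api"],
}

print(validate_metamodel(descriptor))  # ['description']
```

Rejecting registrations that fail this check is what keeps products visible, searchable, and comparable across domains.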

4.6. Automated Governance and Access Control Policies

The lack of an ideal solution for programmatic policy creation, especially in the context of polyglot storage environments, can lead to complexity, making it difficult to enforce consistent governance and access control.

Recommendation: Experiment with existing tools and explore innovative solutions. Consider extending Open Policy Agent (OPA) to meet the specific demands of programmatic policy authoring in a Data Mesh. Keep an eye on emerging tools that align with the principles of Data Mesh.
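With Open Policy Agent, such rules would be written in Rego and evaluated by the OPA engine. The shape of a programmatic access-control check can be sketched in plain Python; the roles and rules here are hypothetical, and a real deployment would delegate this decision to OPA rather than hand-roll it:

```python
# Toy policy evaluator illustrating programmatic access control.
# In practice these rules would live in Rego and be evaluated by OPA.
POLICIES = [
    # (role, domain, allowed actions) -- illustrative rules only
    ("analyst", "clinical-operations", {"read"}),
    ("data-engineer", "clinical-operations", {"read", "write"}),
]

def is_allowed(role: str, domain: str, action: str) -> bool:
    """Allow an action only if some policy grants it for the role and domain."""
    return any(
        role == r and domain == d and action in actions
        for r, d, actions in POLICIES
    )

print(is_allowed("analyst", "clinical-operations", "read"))   # True
print(is_allowed("analyst", "clinical-operations", "write"))  # False
```

Keeping the rules as data (rather than scattering checks through pipelines) is what makes consistent enforcement across polyglot storage feasible.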

4.7. Architectural Fitness Functions

Ensuring seamless interoperability and extracting consistent value from a variety of disparate data products can be a complicated challenge that requires careful orchestration and integration efforts.

Recommendation: Define automated tests (fitness functions) to ensure key characteristics of data products. Provide clear incentives for teams to adhere to governance principles. Utilize organization-wide dashboards for visibility and accountability.
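A fitness function is just an automated test asserting an architectural characteristic. A minimal sketch, checking two characteristics a supervision dashboard might track (the characteristics and thresholds are assumptions for illustration):

```python
# Architectural fitness functions: automated checks on data product
# characteristics (characteristics and thresholds here are illustrative).
def fitness_discoverable(product: dict) -> bool:
    """A product is discoverable if it has a catalog entry."""
    return bool(product.get("catalog_entry"))

def fitness_documented(product: dict) -> bool:
    """A product is adequately documented if its description is non-trivial."""
    return len(product.get("description", "")) >= 20

def run_fitness_functions(product: dict) -> dict:
    checks = {
        "discoverable": fitness_discoverable,
        "documented": fitness_documented,
    }
    return {name: check(product) for name, check in checks.items()}

product = {
    "name": "trial-enrollment",
    "catalog_entry": "catalog://clinical-operations/trial-enrollment",
    "description": "short",  # too short -> fails the documentation check
}

print(run_fitness_functions(product))
# {'discoverable': True, 'documented': False}
```

Running these checks continuously and surfacing the results on an organization-wide dashboard gives teams the visibility and accountability the recommendation calls for.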

4.8. Data Sharing Guidance

Enabling efficient data sharing within a federated architecture characterized by polyglot storage systems is a complex task that requires harmonization efforts for different data sources and ensuring coherent data access mechanisms.

Recommendation: Apply guidance and patterns such as using Virtual DB as an additional output port for simple reporting, leveraging native mechanisms for advanced use cases, and exercising caution when copying data. Be cautious of equating data virtualization to Data Mesh and monitor advancements in the field.

4.9. Defining and Building Your Own Path

Tools required to support federated data architectures are still in the early stages, leading to gaps in functionality and the need for iterative refinement to meet evolving Data Mesh requirements.

Recommendation: Take the opportunity to define your own path and demonstrate innovation. Stay up to date on new tools and frameworks that align with the principles of Data Mesh.

The components that form each layer of the self-service data platform, © Thoughtworks

Key principle: Data products as atomic, functionally-cohesive units

Data products are the architectural quantum of the Data Mesh. They should be designed as the smallest functionally cohesive unit of the mesh, each with an independent life cycle. This is a foundational principle of Data Mesh architecture.

The supervision plane dashboard — monitors the six characteristics of the data products, © Thoughtworks

Conclusion

Data Mesh represents more than just a technological shift; it embodies a holistic change in the way organizations perceive, manage and derive value from data. As Thoughtworks' partnership with Roche demonstrates, key tenets include harmonizing scaling efforts with continuous learning, carefully defining data domains, and driving adaptability across operations. Strategic onboarding, embodied by the EDGE operating model, emphasizes autonomy, innovation and flexible leadership. In addition, the focus on product thinking underscores the importance of aligning data initiatives with tangible business outcomes. Technologically, Data Mesh relies on decentralized data domains that each operate as independent entities. The Data Mesh philosophy goes beyond traditional data management paradigms. It requires companies to recalibrate their approaches by placing data at the center of their strategies while ensuring flexibility, clarity and alignment. Adopting Data Mesh is not just about adopting a new methodology, but also about creating a culture that values data as a strategic asset and leverages its potential for transformative impact.

The recommendations given are intended to guide the reader through the complexity of the Data Mesh and provide some clarity and direction. We hope that this structured approach will help to better understand and successfully implement Data Mesh.

References

Data Mesh in practice: Getting off to the right start (Part I)

Data Mesh in practice: Organizational operating model (Part II)

Data Mesh in practice: Product thinking and development (Part III)

Data Mesh in practice: Technology and the architecture (Part IV)

How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh — Domain data as a product

Zhamak Dehghani: Data Mesh, O’Reilly Media, Inc.

The four principles of Data Mesh (Webinar series)

Domain-Driven Design: Tackling Complexity in the Heart of Software, Addison-Wesley Professional

EDGE: Value-driven digital transformation, Addison-Wesley Professional

The Double Diamond: A universally accepted depiction of the design process

The curse of the data lake monster

Product thinking: Building experiences that deliver results

Building Evolutionary Architectures by Neal Ford, Rebecca Parsons, Patrick Kua, O’Reilly Media — Chapter 4. Architectural Coupling

Data Mesh Principles and Logical Architecture

AWS: Build a data sharing workflow with AWS Lake Formation for your data mesh

Exposing The Data Mesh Blind Side

Axel Schwanke

Senior Data Engineer | Data Architect | Data Science | Data Mesh | Data Governance | Databricks | https://www.linkedin.com/in/axelschwanke/