Reflections on Building a Data Mesh Platform from Scratch: Insights from Novo Nordisk’s Jyotshna Karki
In the ever-evolving field of data management, adaptability is key to success. Jyotshna Karki, a data engineer at the pharmaceutical company Novo Nordisk, has seen technology change rapidly over the past seven years. She has helped spearhead Novo Nordisk’s transition from a traditional centralized data lake system to an innovative data mesh approach and has seen the profound impact data mesh has had on the way data is harnessed for business growth.
When Jyotshna joined Novo Nordisk, the company was using a centralized data lake to manage its data. However, as demands from data consumers and producers grew, Jyotshna’s team realized that a decentralized approach like data mesh offered several benefits: enabling domains to manage their data products independently, promoting data agility, and allowing the central data team to focus on the data mesh platform while supporting less data-proficient domains.
This article covers 10 powerful insights from Jyotshna’s data mesh journey with Novo Nordisk.
10 Insights into Building a Data Mesh
1) Define “Good Enough”
Not every solution has to be perfect. What matters more is identifying the point at which a solution meets current needs. Recognize when to prioritize proofs of concept and manual setups for agility, accepting some technical debt that you will revisit and refine later. An agile approach to development enables steady improvement and avoids wasted effort.
2) Leverage Modularity
Using modular elements in your data mesh architecture sets you up for efficient development and lets multiple teams build on pre-existing solutions. Focus on creating reusable components, such as pipeline blueprints, and design for reusability from the start.
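As a rough illustration of what a reusable pipeline blueprint could look like, the sketch below defines a shared base pipeline that a domain team extends without modifying it. Everything here is hypothetical and not taken from Novo Nordisk’s platform: the PipelineBlueprint class, the step functions, and the "clinical" domain name are assumptions chosen only to show the idea.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable

Record = dict
Step = Callable[[Iterable[Record]], Iterable[Record]]


@dataclass
class PipelineBlueprint:
    """Hypothetical reusable pipeline: an ordered list of steps."""
    name: str
    steps: list[Step] = field(default_factory=list)

    def with_step(self, step: Step) -> "PipelineBlueprint":
        # Return a new blueprint so the shared one is never mutated.
        return PipelineBlueprint(self.name, [*self.steps, step])

    def run(self, records: Iterable[Record]) -> list[Record]:
        # Apply each step in order to the output of the previous one.
        for step in self.steps:
            records = step(records)
        return list(records)


def drop_incomplete(records):
    # Shared step: discard records that contain any missing value.
    return [r for r in records if all(v is not None for v in r.values())]


def normalize_keys(records):
    # Shared step: lower-case all field names.
    return [{k.lower(): v for k, v in r.items()} for r in records]


def add_domain_tag(records):
    # Domain-specific step (assumed example): label records by domain.
    return [{**r, "domain": "clinical"} for r in records]


# The platform team publishes the base; a domain team extends it.
base = PipelineBlueprint("base", [drop_incomplete, normalize_keys])
clinical = base.with_step(add_domain_tag)

result = clinical.run([{"ID": 1, "Dose": 5}, {"ID": 2, "Dose": None}])
print(result)
```

Returning a new blueprint from with_step keeps the shared base unchanged, so one team’s extension cannot alter another team’s pipeline.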
3) Constantly Assess the Data Product
Identify techniques and metrics for monitoring the quality of your data product. Constantly evaluate the architecture to ensure it aligns with business requirements and to identify avenues for ongoing improvements.
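To make the idea of quality metrics concrete, here is a minimal sketch of two common checks, completeness and freshness. The function names, the field names, and the 24-hour threshold are illustrative assumptions; the article does not say which metrics Novo Nordisk tracks.

```python
from datetime import datetime, timedelta, timezone


def completeness(records, required_fields):
    """Fraction of records in which every required field is present and non-null."""
    if not records:
        return 0.0
    complete = sum(
        all(r.get(f) is not None for f in required_fields) for r in records
    )
    return complete / len(records)


def is_fresh(last_updated, max_age, now=None):
    """True if the data product was updated within max_age of now."""
    now = now or datetime.now(timezone.utc)
    return now - last_updated <= max_age


# Assumed sample data: two of the four records are complete.
records = [
    {"id": 1, "value": 10},
    {"id": 2, "value": None},
    {"id": 3, "value": 7},
    {"id": 4},
]
score = completeness(records, ["id", "value"])

# Assumed freshness rule: the product must be at most 24 hours old.
checked_at = datetime(2024, 1, 2, tzinfo=timezone.utc)
updated_at = datetime(2024, 1, 1, 12, tzinfo=timezone.utc)
fresh = is_fresh(updated_at, timedelta(hours=24), now=checked_at)
```

Tracking a few simple numbers like these over time gives a baseline against which changes to the data product can be evaluated.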
4) Evolve the Data Product
Be open to changing your data product based on shifting requirements. A product-focused mentality encourages evolution and ensures that the architecture you’re using for data mesh remains relevant and valuable.
5) Be Curious
Foster curiosity in data engineering. Embrace the wealth of innovations available and explore diverse approaches and technologies. An inquisitive mindset drives the discovery of transformative solutions that can significantly impact your data management strategies.
6) Balance Centralization and Data Mesh
Recognize the balance between a centralized data lake setup and the decentralized principles of a data mesh. You can have happy data producers and consumers within a centralized data lake setup and still have data mesh be the right next step. In the long run, for some larger organizations, it isn’t efficient to have a centralized data team coordinating all data use cases at scale. Find ways to harmonize the needs of data producers and consumers while avoiding overreliance on a centralized team.
7) Address Dependency on Centralized Processing
For many domain teams, centralized data processing and storage can be a black box: data goes in, it gets transformed and stored by the central team, and then it is served out. When developing an architecture for data mesh, focus on increasing transparency and understanding of the data pipeline. Greater transparency empowers domain teams by reducing their reliance on external data experts.
8) Prioritize Specific Tools
When it comes to tool enablement, strive for quality over quantity: concentrate on providing exceptional experiences with widely used tools rather than trying to support every available tool. This lets you invest in a smaller set of high-quality tools that improve the user experience.
9) Embrace Data-Driven Development
Follow a data-driven approach when moving to data mesh: even if your data is raw at the start, you can use it to inform decision-making. With time, your data will become more sophisticated, and your approach will evolve into a refined system that aligns with user needs.
10) Establish Reliability and Trust
Cultivate self-service infrastructure that users can rely upon. Prioritize visibility and transparency in data handling processes to build trust. Showing users how their data is handled gives them the confidence not just to use the platform, but to truly rely on its output.
Learn More about Data Mesh
This article covers key insights from Jyotshna’s extensive experience implementing data mesh at Novo Nordisk. To learn more about her approaches for building effective data sharing at scale, check out this episode of Data Mesh Radio. For more information about how organizations are leveraging data mesh, check out this list of user journey stories.
Originally published at Data Mesh Learning.