Building a Shared Understanding of Data Products with Data Modeling

Seckin Dinc
9 min read · May 2, 2023


Providing a shared understanding of data among stakeholders.

Photo by Jonny Gios on Unsplash

In recent years, there has been an unspoken trend among data teams of ignoring and looking down on business units and stakeholders. These teams believed they did not need any business understanding or support to build data solutions. This solipsistic mindset produced solutions that neglect the end user and ignore User Experience (UX) best practices. Not surprisingly, end users do not use those solutions, and the gap between data and business units has widened over time. This made it clear that data teams badly need a product mindset.

In my previous article, I tried to highlight the inevitable evolution of data products in the age of digitalization. My inspiration for that article was OpenAI's ChatGPT, which is simple for end users yet sophisticated underneath. Like many other data pioneers, for the first time in my life I witnessed a data product being embraced by the whole world without people understanding how it works! This showed that if data teams improve the UX and product understanding, end users, stakeholders, and customers are willing to use data products.

Similarly, to succeed in developing data products, we need to ensure transparency and a shared understanding within the product teams, including stakeholders and business units. Before building our products, we need to answer several critical questions: how will the product function, what are the input and output data, how will the data flow from the input to the output systems, and what constraints apply to the stored information?

With this article series, I will start writing about data modeling, its role as a bridge between data and business units, and the evolution it needs to survive in the modern digital age.

What is Data Modeling?

Image courtesy https://en.wikipedia.org/wiki/Data_modeling#/media/File:4-3_Data_Modelling_Today.svg

Data modeling is a critical aspect of data management and analysis that involves creating a conceptual representation of data structures and relationships within an organization’s systems or processes. This representation allows stakeholders to gain a better understanding of the data and how it can be used to support decision-making processes.

In today’s data-driven world, data modeling has become an essential part of business strategy for organizations of all sizes. By developing a clear and consistent data model, organizations can streamline their data management processes and optimize their data-driven decision-making capabilities.

However, data modeling can often be a complex and technical process, which can make it difficult for stakeholders to fully understand its implications. To bridge this gap, data teams need to communicate the benefits and challenges of data modeling clearly and concisely so that all stakeholders can easily understand them.

As a bridge of common understanding between data teams and stakeholders, data modeling can help organizations achieve their data-driven goals and objectives by ensuring that everyone involved has a shared understanding of the data and its implications.

Creating a data model ensures that everyone involved has a clear understanding of the data's definition, permissible values, business constraints, and allowable integrations of data objects. This helps to ensure consistency and accuracy across the organization.
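To make this concrete, here is a minimal Python sketch of how a definition, a set of permissible values, and a business constraint could be written down in one place. The `Order` entity, its fields, the status values, and the `validate_order` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical permissible values for an order's status field.
ALLOWED_STATUSES = {"pending", "paid", "shipped", "cancelled"}


@dataclass(frozen=True)
class Order:
    """Illustrative entity: one customer order."""
    order_id: str
    status: str        # must be one of ALLOWED_STATUSES
    amount_eur: float  # assumed business constraint: must not be negative


def validate_order(order: Order) -> list[str]:
    """Return human-readable violations; an empty list means the order is valid."""
    problems = []
    if order.status not in ALLOWED_STATUSES:
        problems.append(f"status '{order.status}' is not a permissible value")
    if order.amount_eur < 0:
        problems.append("amount_eur must not be negative")
    return problems


valid_order = Order("o-1", "paid", 19.99)
invalid_order = Order("o-2", "refunded", -5.0)
```

Calling `validate_order(valid_order)` returns an empty list, while `validate_order(invalid_order)` reports both the unknown status and the negative amount. The rules live next to the definition, so data teams and stakeholders can review them together.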

Data Modeling Challenges in the Modern and Agile Data Stack

As organizations increasingly rely on data to drive decision-making, data modeling has become more important than ever. However, the modern data stack and agile software development practices present several unique challenges to data modeling.

Lack of Fixed Schemas

Photo by Alexander Schimmeck on Unsplash

The lack of a fixed schema can create significant challenges for data modeling because a schema is essentially a blueprint for how data is organized and structured within a database or other data storage system. Without a fixed schema, data modeling becomes more difficult because there is no clear definition of how the data should be organized and structured.

In a traditional data modeling process, the data model is designed based on a fixed schema that outlines the structure and relationships of the data. This schema defines the rules for how data is organized and stored, which makes it easier for data teams to create a data model that accurately reflects the organization’s data landscape. The schema provides a clear guide for data teams to follow when designing the data model and ensures that the resulting data model is accurate, consistent, and easy to manage.

However, in modern data systems such as NoSQL databases, the lack of a fixed schema is a deliberate design choice that provides greater flexibility and scalability. While this approach offers significant benefits, it can also create challenges for data modeling because there is no clear blueprint for how the data should be organized and structured. This can make it difficult for data teams to create a cohesive data model that accurately reflects the organization’s data landscape.

Without a fixed schema, data modeling becomes more complex and time-consuming because data teams must carefully analyze the data to identify patterns and relationships. Additionally, changes to the data structure can be more difficult to implement because there is no clear blueprint for how the data should be organized and structured.
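One way to picture this analysis is a small script that scans schema-less records and reports what it finds. The sketch below assumes the records are plain Python dictionaries. The sample documents and the `infer_schema` and `optional_fields` helpers are hypothetical and not taken from any particular database client.

```python
from collections import defaultdict


def infer_schema(documents):
    """Map each field name to the set of Python type names observed for it."""
    observed = defaultdict(set)
    for doc in documents:
        for field_name, value in doc.items():
            observed[field_name].add(type(value).__name__)
    return dict(observed)


def optional_fields(documents):
    """Return the fields that are missing from at least one document."""
    all_fields = set().union(*(doc.keys() for doc in documents))
    return {f for f in all_fields if any(f not in doc for doc in documents)}


# Hypothetical customer records from a schema-less store.
customers = [
    {"customer_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"customer_id": "2", "name": "Lin"},
]

schema = infer_schema(customers)
missing_somewhere = optional_fields(customers)
```

Here `schema["customer_id"]` contains both `int` and `str`, and `missing_somewhere` contains `email`. Both findings are questions to take back to the stakeholders rather than answers the code can provide on its own.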

Agile Software Development

Photo by Eden Constantino on Unsplash

Agile software development is an approach that emphasizes flexibility and rapid iteration in the development process. While this approach can have many benefits for software development, it can create challenges for data modeling.

One of the main challenges of agile development for data modeling is the speed and frequency of changes. With agile development, requirements and specifications can change frequently, which can make it difficult to create a stable and consistent data model. As development teams work on new features and functionalities, they may need to modify the data model to accommodate these changes, leading to frequent updates and revisions. This can make it challenging to maintain a clear and consistent understanding of the data model, especially if changes are made without proper documentation or communication.
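One lightweight way to keep such changes visible is to compare two versions of a model and record the difference. The sketch below assumes each version is a simple mapping from field name to type name. The `diff_models` helper and the two example versions are hypothetical.

```python
def diff_models(old, new):
    """Compare two {field: type} mappings and report what changed."""
    added = {f: new[f] for f in new if f not in old}
    removed = {f: old[f] for f in old if f not in new}
    retyped = {
        f: (old[f], new[f])
        for f in old
        if f in new and old[f] != new[f]
    }
    return {"added": added, "removed": removed, "retyped": retyped}


# Hypothetical versions of a customer model from two consecutive sprints.
model_v1 = {"customer_id": "int", "name": "str", "phone": "str"}
model_v2 = {"customer_id": "str", "name": "str", "email": "str"}

changes = diff_models(model_v1, model_v2)
```

In this example, `changes` shows that `email` was added, `phone` was removed, and `customer_id` changed from `int` to `str`. A summary like this could be attached to a pull request or release note so the change does not go undocumented.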

Another challenge is the need for collaboration and communication between data teams and development teams. With agile development, teams work in a highly collaborative and iterative environment, which can be beneficial for software development. However, it can be challenging for data teams, who may be working in a separate siloed environment, to keep up with the pace of development and ensure that the data model is properly integrated with the software being developed. Without effective communication and collaboration, data models may not be aligned with the needs of the development team or the broader business objectives.

The focus on speed and flexibility in agile development can make it difficult to prioritize data quality and accuracy. Data modeling requires careful analysis and planning to ensure that the data model accurately represents the organization’s data landscape. However, in an agile environment, the need for speed and flexibility can sometimes lead to shortcuts and compromises in data modeling, which can ultimately result in inaccurate or inconsistent data.

Domain-Driven Design & Data Mesh

Photo by shark ovski on Unsplash

Domain-driven design (DDD) and data mesh both present unique challenges for data modeling, despite their shared emphasis on domain modeling.

One of the main challenges of DDD for data modeling is the complexity of domain models. DDD encourages developers to create complex, highly specialized models that reflect the unique needs of each domain. While this approach can be effective for software development, it can make it challenging to create a consistent and cohesive data model that covers the entire organization. In some cases, domain models may be so complex that they are difficult to integrate with other parts of the organization, leading to silos and fragmentation.

Similarly, data mesh presents challenges for data modeling due to its decentralized approach to data ownership. With data mesh, each domain or business unit is responsible for managing its data, which can lead to inconsistencies in data models across different parts of the organization. Without a centralized data governance framework in place, it can be difficult to ensure that data models are consistent and aligned with the broader goals and objectives of the organization.

Another challenge of data mesh for data modeling is the need for interoperability. As data is managed and governed by different domains within the organization, it can be challenging to ensure that data is properly integrated and interoperable across different systems and applications. This can lead to data quality issues, duplication of effort, and inefficiencies in data management.
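As a rough illustration of an interoperability check, the sketch below looks for fields that several domains declare with different types. It assumes each domain publishes its model as a mapping from field name to type name. The domain names, fields, and the `find_conflicts` helper are hypothetical.

```python
def find_conflicts(domain_models):
    """Return fields whose declared type differs between domains."""
    seen = {}
    for domain, model in domain_models.items():
        for field_name, type_name in model.items():
            seen.setdefault(field_name, {})[domain] = type_name
    return {
        field_name: by_domain
        for field_name, by_domain in seen.items()
        if len(set(by_domain.values())) > 1
    }


# Hypothetical models published by two domains.
published_models = {
    "sales": {"customer_id": "int", "order_total": "float"},
    "support": {"customer_id": "str", "ticket_id": "int"},
}

conflicts = find_conflicts(published_models)
```

Here `conflicts` flags `customer_id`, which sales declares as `int` and support declares as `str`. A check like this does not replace governance, but it can surface mismatches before two data products are joined.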

Both DDD and data mesh present challenges for data modeling due to the need for collaboration and communication across different parts of the organization. Without effective communication and collaboration, it can be difficult to ensure that data models are properly aligned with the needs of the organization and that changes and updates are properly documented and communicated.

How to Integrate Data Modeling into Data Product Development?

Photo by Jo Szczepanska on Unsplash

The product development process is a collaborative process that requires product, software engineering, and other cross-functional members to work together. This collaboration and brainstorming takes place in dedicated events called product discovery sessions.

A product discovery session is a collaborative process that is used to identify, define, and prioritize the requirements for a new product. This session is typically facilitated by a product manager or a cross-functional team that includes representatives from different parts of the organization, such as engineering, design, marketing, and sales.

The purpose of a product discovery session is to gather input from stakeholders, identify the problem or opportunity that the product is intended to address, and define the product’s scope and features. The session may involve a variety of activities, such as brainstorming, user research, competitive analysis, and prototyping.

The outcome of a product discovery session is typically a set of product requirements or user stories that describe the key features and functionality of the product. These requirements are then used to guide the development of the product and ensure that it meets the needs of the target users and stakeholders.

Data modeling can be used in product discovery sessions to help identify the data requirements and inform the design of the data product.

During these sessions, data modeling can be used to create a conceptual representation of the data that will be used by the product. This can help to ensure that the product is designed to leverage the data effectively and that any data-related issues or constraints are identified early on in the product development process.

For example, data modeling can be used to identify the different types of data that will be used by the product, such as customer data, transaction data, or sensor data. It can also help to identify the relationships between the data, such as the relationships between customers and transactions. By modeling the data in this way, stakeholders can gain a better understanding of the data requirements and ensure that the product is designed to make the most effective use of the available data.
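The customer and transaction example can be sketched as a small conceptual model in Python. The class names, fields, and the `total_spent` helper below are hypothetical and only illustrate a one-to-many relationship. A real discovery session would settle the actual attributes.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Illustrative entity: a single purchase."""
    transaction_id: str
    amount: float


@dataclass
class Customer:
    """Illustrative entity: a customer who owns zero or more transactions."""
    customer_id: str
    name: str
    # One customer can have many transactions (one-to-many relationship).
    transactions: list[Transaction] = field(default_factory=list)

    def total_spent(self) -> float:
        """Sum the amounts of all transactions linked to this customer."""
        return sum(t.amount for t in self.transactions)


ada = Customer("c-1", "Ada")
ada.transactions.append(Transaction("t-1", 10.0))
ada.transactions.append(Transaction("t-2", 15.0))
```

With the two sample transactions, `ada.total_spent()` returns `25.0`. Even a sketch this small gives stakeholders something concrete to challenge, for example whether a transaction can ever belong to more than one customer.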

Conclusion

Data modeling is an essential process for building successful data products. However, as the technology landscape continues to evolve, data modeling needs to evolve to meet the requirements of the modern and agile data stack. This means that data modeling must be designed to support the rapid development and deployment of data products while ensuring that the data is accurate, consistent, and complete. With the increasing demand for real-time data and the need to leverage advanced technologies like machine learning and artificial intelligence, data modeling must be adapted to support these new requirements. By emphasizing the importance of data modeling in the modern data stack, organizations can ensure that they are building data products that are effective, efficient, and scalable, and that can deliver value to the business and its stakeholders.

