Introduction to Data Product Blueprint Model

--

The evolution of data management practices into data products involves several key aspects. Traditional data management focused on internal operational needs, with data stored in siloed systems. Data catalogs have transformed into interactive platforms, facilitating data discovery and consumption. Data product management emphasizes delivering value to external customers through iterative development and a customer-centric approach.

Moore’s Chasm highlights the challenge of transitioning from early adopters to the broader market. Data reuse and monetization are crucial, with internal data products laying the groundwork for external commercialization.

Feedback loops are essential throughout the data product lifecycle, enabling continuous improvement and adaptation to customer needs. Streamlining processes for both internal and external data products involves optimizing development, delivery, and maintenance to ensure agility and efficiency, with feedback loops informing iterative enhancements at every stage.

Discover the intricacies of modern data management and its evolution into data products in this article. At the end, we briefly discuss support for the model below from a standardization point of view.

Let’s have a look at the model above by discussing its numbered sections.

1. Traditional Data Management

Before the emergence of data products, traditional data management practices revolved around the collection, storage, and processing of data primarily for internal operational needs within organizations. Data was often managed in siloed systems, with different departments maintaining their databases tailored to their specific requirements. These databases typically served transactional or operational purposes, such as managing customer information, inventory, or financial records.

Data catalogs represent a pivotal step towards the evolution of data management practices into data products. Traditionally, organizations maintained data catalogs primarily for internal reference, providing metadata descriptions of datasets stored within their systems. These catalogs served as repositories of information about data assets, including their structure, usage, and ownership.

However, with the rise of data products, data catalogs have transformed into dynamic and interactive platforms designed not only to catalog data assets but also to facilitate their discovery, exploration, and consumption by a wider audience. Modern data catalogs leverage advanced metadata management capabilities, such as automated metadata extraction and enrichment, to provide comprehensive insights into the organization’s data landscape.

As a result, we get “data-in-a-catalog”, which contains rich metadata and sometimes even the data itself in a refined format, ready to be packaged into data products either as is or in combination with other assets.
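
To make the idea concrete, below is a minimal sketch of what a “data-in-a-catalog” entry could look like. The field names are purely illustrative assumptions and do not follow any particular catalog product or standard.

```python
# Illustrative sketch of a "data-in-a-catalog" entry: rich metadata plus a
# pointer to refined data, ready to be packaged into a data product.
# All field names are hypothetical.
catalog_entry = {
    "name": "customer_orders_curated",
    "description": "Cleaned and deduplicated order records.",
    "owner": "sales-data-team",
    "schema": [
        {"column": "order_id", "type": "string"},
        {"column": "customer_id", "type": "string"},
        {"column": "order_total", "type": "decimal"},
    ],
    "lineage": ["raw.orders", "raw.customers"],   # upstream sources
    "refresh": "daily",                           # how often the refined data is updated
    "quality": {"completeness": 0.98, "freshness_hours": 24},
    "ready_for_productization": True,             # can be packaged as is or combined
}
```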

2. Data Product Management

Data product management differs from traditional data management in its focus, approach, and objectives. Traditional data management primarily revolves around the collection, storage, and processing of data for internal operational needs within organizations. It often involves managing data in siloed systems, with a primary emphasis on ensuring data quality, consistency, and compliance with regulatory requirements. The goal of traditional data management is to support internal business operations, such as customer relationship management, inventory management, and financial reporting, by providing accurate and reliable data for decision-making and analysis.

In contrast, data product management extends beyond traditional data management practices by emphasizing the development, delivery, and optimization of data-driven products and services. Rather than solely focusing on internal operational needs, data product management is customer-centric, with a primary focus on delivering value to external customers or end-users through data-driven solutions. This involves identifying market opportunities, understanding customer needs, and translating those needs into data products that address specific pain points or deliver tangible benefits to users.

3. Moore’s Chasm

Moore’s Chasm, also known as the “Chasm Theory” or “Crossing the Chasm,” is a concept introduced by Geoffrey Moore in his book “Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers.”

The chasm represents a significant gap or barrier between early adopters of a technology or product and the broader mainstream market. According to Moore, the technology adoption lifecycle can be divided into five stages: Innovators, Early Adopters, Early Majority, Late Majority, and Laggards. The chasm occurs between the Early Adopters and the Early Majority.

This chasm is a critical phase for technology companies because crossing it often determines whether a product will achieve mass-market success or remain confined to a niche market. Crossing the chasm requires a different marketing and sales approach compared to the strategies used to appeal to early adopters. Companies must address the concerns and requirements of mainstream customers, who may be more risk-averse and have different needs and expectations than early adopters.

Moore’s Chasm exists in both internal and external data product cases. A data product is developed for a purpose, and the customer can be internal or external. Regardless of the target audience, you need to cross the chasm.

4. Data Reuse and Monetization

In focus points 4 and 5 we discuss data products that serve either internal or external purposes. The terms data monetization and data commercialization are commonly used in this context. Data commercialization involves generating revenue by selling or licensing data products or services to external customers, while data monetization encompasses all activities aimed at extracting value from data assets, whether through direct revenue generation or other means. While the two concepts are related and often overlap, they represent different approaches to leveraging data for business purposes and achieving different objectives.

Internal data products are essential for promoting data reuse, fostering collaboration, and driving informed decision-making within organizations. While they focus on delivering value internally, they also play a crucial role in laying the foundation for data commercialization initiatives by validating data assets, refining analytics capabilities, and providing a pathway for monetizing data externally.

5. Data Exchange and Commercialization

Data commercialization with externally focused data products involves the process of transforming internal data assets into marketable products or services that are sold or licensed to external customers or partners. This typically starts with identifying valuable data assets within the organization that have the potential to address specific market needs or opportunities. These data assets could include proprietary datasets, analytics capabilities, or insights derived from internal operations or customer interactions.

This is also the moment when Data Contracts (as technical contracts) no longer suffice. When the data product is exposed to an external customer, liabilities and other legal aspects must be taken into account as part of risk management. In this case the Data Contract is complemented with legal elements and becomes a Data Agreement. Thus, in the picture above we have a separate box for the data agreement for external value realization data products.
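
As a minimal sketch, assuming hypothetical field names that are not taken from the Open Data Contract Standard or any other specification, the relationship could be modeled like this: the data agreement wraps the technical contract and adds the legal elements.

```python
from dataclasses import dataclass
from typing import Optional

# Sketch only: a Data Agreement is a Data Contract complemented with legal
# elements. Field names are hypothetical, not from any standard.

@dataclass
class DataContract:
    # Technical commitments between data provider and consumer.
    product_name: str
    schema_version: str
    quality_sla: dict   # e.g. {"completeness": 0.98, "freshness_hours": 24}
    access: dict        # e.g. {"protocol": "https", "auth": "oauth2"}

@dataclass
class DataAgreement:
    # The technical contract plus the legal aspects needed for external customers.
    contract: DataContract
    liability_terms: str            # limits of liability if data is wrong or late
    licensing: str                  # how the customer may use the data
    jurisdiction: str               # governing law
    pricing_plan: Optional[str] = None

agreement = DataAgreement(
    contract=DataContract("customer_orders", "1.2.0",
                          {"completeness": 0.98}, {"protocol": "https"}),
    liability_terms="Provider liability capped at annual fees.",
    licensing="Non-exclusive, non-transferable use.",
    jurisdiction="EU",
    pricing_plan="monthly-subscription",
)
```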

6. From Reuse to Exchange

Although data products can be initially designed to serve internal needs, some of them might become business assets that are commercialized (value realization happens externally). This should always be kept in mind.

Streamlining processes for both internal and external data products involves optimizing the entire lifecycle of data product development, delivery, and maintenance to improve efficiency, effectiveness, and agility. Ideally, you should have one process model that enables you to expose an internal data product to external commercialization with minimal effort at any moment. Treat every data product as if it might need to be made public if business so dictates.
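
One way to read the “one process model” idea is sketched below. The helper function and its fields are hypothetical assumptions, intended only to show that exposing an internal data product externally should be an additive step rather than a rebuild.

```python
# Hypothetical sketch: promoting an internal data product to external
# commercialization by adding the missing pieces, not by rebuilding it.

def promote_to_external(internal_product: dict,
                        pricing_plan: str,
                        legal_terms: dict) -> dict:
    """Return an externally exposable product built on top of the internal one."""
    external_product = dict(internal_product)          # reuse everything defined so far
    external_product["pricing_plan"] = pricing_plan    # add commercial metadata
    external_product["data_agreement"] = {             # contract + legal elements = agreement
        **internal_product.get("data_contract", {}),
        **legal_terms,
    }
    external_product["audience"] = "external"
    return external_product
```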

7. Feedback loops

Feedback loops are crucial, much like in any successful business model. Data product development operates in iterative cycles, forming a cornerstone of its lifecycle. Ideally, feedback loops should permeate every phase, allowing the process to step back one or more phases when needed. For instance, during data product sales, it may surface that essential data is missing and currently unavailable. In such instances, the feedback loop extends back to data creation. Similarly, when pricing plans fail to meet customer expectations, the feedback loop transitions from data product value realization to data product offering, prompting business adjustments to the plans.

Requirements for data product management

Given that the above somewhat naive and heavily simplified thought process is accepted, we can draw some requirements for data product management.

First of all, you must be able to reuse metadata. The Data Product Blueprint is the sketch for both internal and external value realization. All data products have a Data Contract (technical focus).

Depending on the business goals and opportunities (see the sketch after this list):

  1. a data product blueprint containing minimal business metadata is combined with a data contract into a data product for internal purposes.
  2. a data product blueprint with full business metadata, including pricing plans, is combined with a data contract extended with legal aspects (becoming a data agreement) into a data product for external purposes.
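
The sketch below illustrates the two paths. All field names are assumptions made for illustration; they do not come from the Open Data Product Specification or the Open Data Contract Standard.

```python
# Hypothetical sketch of the two composition paths described above.

blueprint = {
    "name": "customer_orders",
    "description": "Curated order data for analytics and partners.",
    "business_metadata": {"domain": "sales", "owner": "sales-data-team"},
}

data_contract = {
    "schema_version": "1.2.0",
    "quality_sla": {"completeness": 0.98},
    "access": {"protocol": "https", "auth": "oauth2"},
}

# 1. Internal purposes: minimal business metadata + technical data contract.
internal_product = {**blueprint, "data_contract": data_contract}

# 2. External purposes: full business metadata (including pricing plans) and the
#    contract complemented with legal aspects, i.e. a data agreement.
external_product = {
    **blueprint,
    "business_metadata": {**blueprint["business_metadata"],
                          "pricing_plans": ["freemium", "monthly-subscription"]},
    "data_agreement": {**data_contract,
                       "licensing": "Non-exclusive use",
                       "liability_terms": "Capped at annual fees"},
}
```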

Define once and reuse

Now, we do not want to define the same metadata as part of the data product blueprint, data contract, and data agreement. That would be redundant. We want all metadata defined once and reused in the respective parts of the data product. Take Data Quality as an example: it would be defined once in a unified model and used as a component. The value proposition of Data Quality can differ (the values and level committed to by the provider), but the framework and metadata model remain the same. This principle has been discussed in more detail in the previous article (“Data Economy Interoperability Framework — shared standardized components and extensions”), but here is a brief recap with a drawing and explanation.

In the above example, we utilize two standardized core components: Data Quality and Access. Both the data product description in the catalog and the data contract define these two aspects of the artifact with the same structure (schema). Now at least two parts of the metadata needed in the business processes are defined in the same way, and there is no need for conversions. Ideally, data quality and access component metadata are defined once and reused in data contracts, data products, and data agreements.
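
A brief sketch of the idea, with illustrative field names that are not taken from any standard: the Data Quality and Access components are defined once and then referenced from both the data product description and the data contract, so the structure stays identical everywhere.

```python
# Sketch only: shared core components reused by reference, so the same
# structure appears in the catalog description, the data contract, and
# (for external products) the data agreement. Field names are illustrative.

data_quality_component = {
    "dimensions": {"completeness": 0.98, "accuracy": 0.95, "freshness_hours": 24},
}

access_component = {
    "protocol": "https",
    "auth": "oauth2",
    "endpoint": "https://api.example.com/v1/customer-orders",
}

# Both artifacts point to the same component objects instead of redefining them,
# so no conversions between differing metadata models are needed.
data_product_description = {
    "name": "customer_orders",
    "data_quality": data_quality_component,
    "access": access_component,
}

data_contract = {
    "schema_version": "1.2.0",
    "data_quality": data_quality_component,   # same structure; provider-committed values
    "access": access_component,
}
```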

More cooperation between standards is needed

The above is not currently supported by emerging data economy standards such as the Open Data Contract Standard and the Open Data Product Specification. The standards are still being developed in isolation from each other, and they overlap. As a result, we have different standards for the same objects such as access, data quality, and pricing. Both mentioned standards would benefit from closer cooperation.

As an example, the Open Data Product Specification has long had a mature data pricing plan model with 12 pricing plan options, and yet the Open Data Contract Standard has recently started to develop an alternative from scratch. Likewise, the Open Data Contract Standard has done extensive work on defining data content that could be adopted by the Open Data Product Specification.

The standards should cooperate rather than compete. This is a task an organization like the Linux Foundation could host and drive.

--

Jarkko Moilanen (PhD)
Exploring the Frontier of Data Products

API, Data and Platform Economy professional. Author of "Deliver Value in the Data Economy" and "API Economy 101" books.