Two Adoption-Related Chasms of Data Products

--

The concept of the “chasm,” introduced by Geoffrey A. Moore in his book Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers (1991), describes the significant gap between early adopters and the early majority in the technology adoption lifecycle. This gap often prevents high-tech products from achieving widespread adoption due to differing expectations and needs between these groups.

In the context of data products, internal data reuse is a common challenge, with Data Contracts being the latest attempt to standardize and simplify this process. As organizations aim to extend the value of their data by selling it externally, the limitations of traditional Data Sharing Agreements (DSAs) become apparent. The evolving marketplace requires more sophisticated Commercial Data Contracts (CDCs) to address these new complexities.

To successfully cross Moore’s Chasm, organizations must create data products and Data Contracts for internal use. For external commercialization, they need CDCs. This shift necessitates a clear distinction between DSAs and CDCs, focusing on the comprehensive management of data as a commercial asset. Consequently, a scalable data product ecosystem needs a well-defined metadata space and standardized components to support the creation of these agreements, ensuring clear terms for the sale and use of data products.

Original Chasm by Moore

The concept of the “chasm” originates from Geoffrey A. Moore’s influential book, Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers. Published in 1991, the book addresses the challenges faced by high-tech companies in transitioning their products from early adopters to the mainstream market. Moore’s framework is a refinement of the technology adoption lifecycle.

The “chasm” refers to a significant gap between the early adopters and the early majority. Moore argues that many high-tech products fail to gain widespread adoption because companies are unable to bridge this gap. The early adopters and early majority have different expectations, needs, and risk tolerances, making it difficult to transition from one group to the other.

Data products reused by internal customers

Organizations typically still struggle with internal data reuse and sharing. The Data Contract is the latest attempt to make that easier, managed, and standardized. Before data can be shared, even internally, it is often productized: harmonized, raised in quality to match business needs, packaged into a format that makes reuse easier, and so on. A Data Contract offers an approach focused on technical aspects to enable internal reuse of the productized data. Such a Data Contract can have very few elements yet be rich in content, especially in describing the included data and how to access it.

When the situation described above is applied to Moore’s Chasm and the technology adoption curve, it is logical to place the need for a Data Contract at the spot where Moore said the chasm exists. Your early adopters have done the initial productization by collecting and refining the data with the help of pipelines, tested it, and found value in it. At this point, you want the other units in your company to gain value from the data product as well. However, this new audience is less familiar with the technical nuances of accessing the data and understands the data less well than the early adopters do. On top of that, the internal reusers expect ease of use and quick value realization. Offering a Data Contract solves at least part of the problem.

Things change when your data product crosses the company border and you decide to offer it to external users for value realization. You still have the initial internal reusers as customers, but on top of that, you have a new customer segment. From the perspective of data consumption and value realization, very little might change at the technical level, but the Data Contract is no longer enough. Since I lack creativity and time, I labeled this spot “Chasm 2”.

Data Products reused by internal and external customers

Data products are currently sold in a relatively small-scale marketplace, but this is expected to grow as the value of internal data reuse diminishes and organizations seek to gain more value. Data is no longer just exchanged; it is being sold as a commodity. At this juncture, the traditional Data Sharing Agreement or Data Contract will no longer suffice. We need more sophisticated contracts to address the complexities and demands of this evolving market. To distinguish these from both the technically oriented Data Contracts we see now and traditional Data Sharing Agreements, I decided to use the term Commercial Data Contract (CDC). A CDC is needed when data is sold outside the company borders for a third party to use. This is Chasm 2.

When an organization reaches Chasm 2 with data products, it keeps creating value internally, but on top of that the data is commercialized. While data monetization is about generating revenue from data, data commercialization is about creating and selling marketable products or services derived from data. Monetization can be seen as a subset of commercialization, where the latter encompasses a more comprehensive approach to leveraging data for business growth and market presence. Thus, I use the terms commercialization and commercial agreement below.

Since organizations have been sharing data for a long time, the practice of creating Data Sharing Agreements (DSAs) has emerged to manage these exchanges. However, traditional DSAs are not always suitable for the evolving business use of data products. In many cases, companies are not just sharing data; they are selling data products, transforming data into a commodity that can be traded and monetized. This shift necessitates more nuanced agreements that address the complexities of data as a commercial asset, ensuring clear terms for both the sale and use of these data products.

Commercial Data Contract and Data Sharing Agreement

While both types of agreements deal with the transfer of data, a commercial contract for a data product is primarily a business transaction with a focus on financial terms and deliverables. In contrast, a Data Sharing Agreement is more collaborative and focuses on the proper use, security, and compliance aspects of data sharing. Let’s look at the differences from eight perspectives: purpose, scope, ownership and rights, financial terms, liability and indemnification, data security, termination, and regulatory compliance.

The purpose of a commercial contract is primarily centered around the sale and purchase of a data product, typically aimed at achieving commercial gain. In contrast, a Data Sharing Agreement (DSA) focuses on the mutual sharing of data, often for collaborative purposes, research, or to meet regulatory requirements.

The scope of the commercial contract includes detailed terms on pricing, payment, and deliverables. The Data Sharing Agreement (DSA) emphasizes data usage, access controls, and compliance with legal and regulatory requirements.

Ownership and rights in agreements can vary significantly depending on the type of contract. A commercial contract clearly delineates ownership rights, intellectual property rights, and licensing terms, ensuring that all parties understand who owns what and under what conditions they can use it. On the other hand, a Data Sharing Agreement (DSA) may not always involve the transfer of ownership. Instead, it focuses more on usage rights and restrictions, specifying how data can be used and shared, without necessarily changing who owns the data.

A commercial contract typically includes detailed financial terms, outlining payment schedules and penalties for non-compliance. In contrast, a Data Sharing Agreement (DSA) often features less detailed financial terms, as they may not be a primary focus of the agreement.

In commercial contracts, the section on liability and indemnification is often more comprehensive, detailing liabilities, indemnification clauses, and warranties extensively due to the commercial nature of the transaction. In contrast, a Data Sharing Agreement (DSA) emphasizes data protection and compliance with data privacy laws, placing less emphasis on commercial warranties.

A commercial contract includes data security clauses, although these may be less detailed compared to those in a Data Sharing Agreement (DSA). In contrast, a DSA typically has extensive provisions for data security and privacy, designed to ensure compliance with legal obligations.

In a commercial contract, termination clauses typically include specific conditions and consequences for ending the contract. These conditions are clearly defined, ensuring that both parties understand the circumstances under which the contract can be terminated and the resulting consequences. In contrast, a Data Sharing Agreement (DSA) may have more flexible termination clauses, with a strong emphasis on ensuring the return or destruction of shared data. This flexibility is designed to accommodate the unique nature of data sharing arrangements, where the focus is on protecting the data and ensuring its proper handling even after the agreement ends.

In a commercial contract, regulatory compliance primarily focuses on the sale and use of the data product. The terms are structured to ensure that the transaction and subsequent use of the data adhere to relevant laws and regulations, safeguarding the interests of both parties. On the other hand, a Data Sharing Agreement (DSA) places significant emphasis on regulatory compliance related to data protection and sharing. The DSA is designed to ensure that the sharing and handling of data comply with data protection laws and regulations, reflecting the importance of maintaining data integrity and confidentiality in these agreements.

In conclusion, data product governance will have three fundamental elements: data products, data contracts, and commercial contracts. How can these be managed, and their metadata reused, while crossing the chasms?

Data Product Metadata Space

Based on the above, I claim that a scalable, data-product-driven ecosystem needs a clearly defined data product metadata space and standardized components. I will discuss this early-stage idea only briefly now and cover it in detail in a later post.

To cross Chasm 1, an organization needs data products and data contracts. To cross Chasm 2, the organization needs data products and commercial contracts. The contents of both contracts can be constructed from the components of the data product metadata space. Depending on which contract is constructed, a different selection of reusable metadata space elements is used.
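The idea of building both contract types from one shared set of components can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the component names, the placeholder values, the split between the two contract types, and the helpers `build_contract` and `shared_components` are all hypothetical and are not taken from any published standard.

```python
# Hypothetical metadata space: reusable components keyed by name.
# All names and values are illustrative placeholders, not part of any standard.
METADATA_SPACE = {
    "data_description": {"dataset": "example_dataset", "format": "example_format"},
    "access": {"method": "example_method"},
    "data_quality": {"rule": "example_rule"},
    "sla": {"availability": "example_target"},
    "pricing": {"plan": "example_plan"},
    "licensing": {"ownership": "example_owner"},
    "liability": {"cap": "example_cap"},
    "termination": {"notice": "example_notice"},
}

# Assumed selection of components per contract type.
SELECTIONS = {
    "data_contract": ["data_description", "access", "data_quality", "sla"],
    "commercial_contract": [
        "data_description", "sla", "pricing",
        "licensing", "liability", "termination",
    ],
}


def build_contract(kind, space=METADATA_SPACE):
    """Assemble a contract by copying the selected components from the space."""
    if kind not in SELECTIONS:
        raise ValueError(f"unknown contract kind: {kind}")
    missing = [name for name in SELECTIONS[kind] if name not in space]
    if missing:
        raise KeyError(f"metadata space lacks components: {missing}")
    return {
        "kind": kind,
        "components": {name: dict(space[name]) for name in SELECTIONS[kind]},
    }


def shared_components():
    """Return the component names that both contract types reuse."""
    return sorted(
        set(SELECTIONS["data_contract"]) & set(SELECTIONS["commercial_contract"])
    )


if __name__ == "__main__":
    print(build_contract("data_contract")["components"].keys())
    print(build_contract("commercial_contract")["components"].keys())
    print(shared_components())
```

In this sketch, the data description and the SLA are maintained once and reused by both contracts, while pricing, licensing, liability, and termination appear only in the commercial contract. Where the real boundary should sit is exactly the open question of the metadata space.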

The middle layer is already in progress for the Data Contract and the Data Product. The Bitol project is driving the development of the machine-readable Open Data Contract Standard. There are also Data Product oriented standardization projects such as the Open Data Product Specification. Both projects are developed under the umbrella of the Linux Foundation.

What we lack is a Commercial Contract Specification. Of course, one might ask whether we need yet another specification just for Commercial Contracts. It could eventually be part of the Data Contract, but that would split the focus of Data Contract development between two very different aspects: the technical description of the data with its needed elements, and the commercial aspects. Some Commercial Contract elements are already scattered across the Data Contract and Data Product standards. Perhaps decoupling those elements into a separate standard would bring more clarity to the models and keep the scope of each standard nimble and compact.

The above is not that far from a concept created by Gartner. They use the term “templatize data products” as part of the product life cycle. Their model also includes more than just the technical aspects of data contracts.

What next?

As said, I will keep working on the Data Product Metadata Governance Model described above, even if it now seems a bit silly and lacks a lot of elements. I will publish a detailed post about it in the coming weeks. I am exploring the opportunities and looking for new ways to organize the data product value chain.

My dream is to define what modern data product governance looks like, built upon nimble standards that apply Everything as Code to SLA, Data Quality, and Pricing plans. This model draws influence from the “terraforming” concept and principles of Infrastructure as Code. These concepts, often applied to data management, are extended to the business layer, which consists of data products, contracts, and commercial agreements. Thus, the working title of the next book is “Terraforming Data Product Governance.” The described model will contain what is possible now, as well as gaps that need to be filled.

--

Jarkko Moilanen (PhD)
Exploring the Frontier of Data Products

Open Data Product Specification igniter and maintainer (Linux Foundation project). Author of business-oriented data economy books. AI/Product Lead professional.