What the heck is a “data product” anyway — 4 Helpful Heuristics

Clayton Karges
5 min read · Jun 5, 2022

TL;DR — “Data Product” means different things to different people in different contexts. While it would be convenient to have a single definition, it’s simply not practical. Instead, I’ve found some heuristics especially helpful in trying to wrap my mind around the scope and meaning of the “data product” in context.

Over the past couple of years, the concept of a “data product” has seen a rise in both interest and confusion. This is no doubt due to its adoption within the business and technology communities under different contexts. This has resulted in the concept having multiple meanings. To make things even more difficult, vendors often push the version that suits them best, adding more noise to an already noisy space.

Although there is no hard and fast definition of a data product, I have found some heuristics especially helpful for finding some sense in the ambiguity:

Heuristic #1 — Does the data facilitate value directly or indirectly?

DJ Patil, former U.S. Chief Data Scientist, defines a data product as “a product that facilitates an end goal through the use of data” (from his book Data Jujitsu: The Art of Turning Data into Product, 2012).

Not super helpful, DJ… So, pretty much everything on the web is a data product then, right? Not so fast.

Simon O’Regan, current head of Monetization at TikTok, draws an important distinction: “the distinction between products that use data to facilitate an end goal and products whose primary objective is to use data to facilitate an end goal.” In other words, true data products are the medium by which end users interact with and extract value from data. The data product’s purpose is to facilitate this interaction with data in a way that maximizes value for the consumer.

Heuristic #2 — Who is referring to the “data product”?

One’s “level” within the organization does a lot to frame their understanding and intent. Oftentimes, the “higher” one sits on the organizational pyramid, the broader their scope of context. This is no different with “data products”.

Organizational-Level

At a leadership level, an executive may view the organization’s data product as the collection of every piece of data, and the tools used to generate, access, and analyze that data, within that organization.

Management-Level

At a management level, a VP of marketing may view their Dashboard/Visualization as a data product.

Product-Level

At a product level, a Product Manager may view the algorithm data scientists are exposing for them to consume as a data product.

Developer-Level

At an engineering level, a developer may view the set of APIs for a specific domain as a data product.

Heuristic #3 — Does the data product fit on the hierarchy of abstraction?

Simon O’Regan introduced a useful framework for categorizing data products. In his post, Designing Data Products, he introduced 5 types of data products. I’ve further elaborated on this concept with the “Functional Hierarchy of Data Products”.

This model proposes that we can organize data products into 5 broad functions, listed in terms of increasing internal complexity and (ideally) decreasing user complexity. In other words, the higher we ascend the data product functional hierarchy, the more complexity the data product internalizes on the user’s behalf.

Let’s ascend the hierarchy from the bottom-up:

  1. Raw Data → At this level we are simply collecting and exposing data from the source. While this provides significant flexibility, most of the work is left to be done on the user’s side. Example: A data engineer connecting an API to a data warehouse.
  2. Derived Data → With derived data, some processing is done within the product. Consider the case of sales data; we could add additional attributes like market sales segmentation or opportunity conversion rates. Example: A data analyst connecting to an executive dashboard.
  3. Algorithms → With algorithms, data is “run through” the algorithm (linear regression, K-means clustering, etc.) to return new information or insights for the consumer. Example: A product manager leveraging a KNN algorithm owned by the data science team to intelligently bucket cohorts of users based on specified features.
  4. Decision Support → As the name implies, these data products are specifically designed to help users with decision-making. It is important to note that the data product is not making the decision but rather presenting the most relevant data in a way that assists the user in making better decisions. Example: The Executive team’s KPI dashboard they use for quarterly planning.
  5. Automated Decision-making → At the top of our functional hierarchy, we totally outsource our intelligence to allow the algorithm to determine the final output for the user. Machine learning models are often employed for these types of data products. Example: The recommender system for an organization’s intranet.
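
To make the five levels a bit more concrete, here is a minimal Python sketch that walks one toy sales dataset up the hierarchy. Everything in it is an assumption for illustration only: the records, the field names, and the function names are hypothetical, and the "algorithm" is a simple above/below-average bucketing standing in for something like K-means or KNN.

```python
from statistics import mean

# Hypothetical toy records; names and numbers are invented for illustration.
RAW_SALES = [
    {"rep": "ana", "opportunities": 20, "wins": 5},
    {"rep": "ben", "opportunities": 10, "wins": 6},
    {"rep": "cy", "opportunities": 40, "wins": 4},
]


def raw_data():
    """Level 1, Raw Data: expose the source records untouched."""
    return RAW_SALES


def derived_data(rows):
    """Level 2, Derived Data: add a computed attribute (conversion rate)."""
    return [{**r, "conversion": r["wins"] / r["opportunities"]} for r in rows]


def algorithm(rows):
    """Level 3, Algorithms: return new information about each record.

    A simple above/below-average bucketing, used here as a stand-in
    for a real clustering or classification model.
    """
    avg = mean(r["conversion"] for r in rows)
    return {r["rep"]: ("above" if r["conversion"] >= avg else "below") for r in rows}


def decision_support(rows):
    """Level 4, Decision Support: rank the data; a human still decides."""
    return sorted(rows, key=lambda r: r["conversion"], reverse=True)


def automated_decision(rows):
    """Level 5, Automated Decision-making: the product picks the outcome."""
    return decision_support(rows)[0]["rep"]


if __name__ == "__main__":
    derived = derived_data(raw_data())
    print(algorithm(derived))
    print(automated_decision(derived))
```

Notice how the caller's job shrinks at each step: with `raw_data()` the consumer does all the work, while `automated_decision()` hands back a single answer (here, the rep with the highest conversion rate) and hides everything underneath it.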

Heuristic #4 — How is the data product being consumed?

While most traditional digital products are consumed by end customers (B2C) or businesses (B2B) in the form of applications, data products offer a spectrum of interfaces for consuming this value. Each interface comes with its own distinct consumers and its own product design and development considerations.

Basic Interfaces for Data Product Consumption

APIs.
While many would consider APIs purely technical, properly designing them requires a mix of technical know-how and product best practices. A solid API Data Product must be intentionally designed to provide a desirable and valuable experience to API consumers in light of many constraints and trade-offs. This means more than just providing a connection to a data source, but also:
— Providing users frictionless onboarding
— Creating useful API documentation & tutorials
— Providing an API sandbox/playground
— Documenting and sharing an API roadmap
— Etc.

Dashboards & Visualizations.
Dashboards and visualizations are most often positioned as decision support tools for managers and executives. Selecting what data to share, and in what format, is especially important for these types of data products. Understanding the questions consumers would like to answer, along with their data/statistical proficiency, is critical when designing dashboards, as design decisions can significantly influence how the data is perceived.

Web Elements.
Web elements are the least technical (from a consumer standpoint) of the data product interfaces. These types of data products can include search, AR, embedded analytics, dynamic maps and more. These data products are especially powerful as they can meet the user where they are, providing tools in context. Additionally, their abstracted nature often means that both technical and non-technical users can engage with the data product. Finally, ease of access also lowers barriers to engagement. While embedded data products are still in their early years, you can expect to see more of these in the years to come.

While there is no single definition for a “data product”, heuristics like the ones defined above can be helpful for finding some clarity in the ambiguity.

Have any useful methods for defining data products? Please leave them in the comments below and, if they resonate, I’ll include them in a future post. Thanks for reading!

Up next → Data Projects vs. Data Products — Why Mindset Matters
