Introduction to Data Product

Bryan Yang
A multi hyphen life
3 min read · Nov 3, 2022

Preface

Big data and AI have been hot topics in recent years. Both companies and their customers have started using data analysis or machine learning to improve their products and businesses. Over the same period, however, many of them have stumbled in developing and managing these data projects. To help you avoid the pitfalls I have run into, this series of articles will walk through them, starting from the project management side of data products. This is the first time I’ve organized this topic, so I hope I can finish it successfully!

Data + Product

Data products are a very new concept, so let’s clarify the two terms, data and product, one at a time.

Information

Before we talk about data, let’s talk about “information” (https://zh.wikipedia.org/zh-tw/%E4%BF%A1%E6%81%AF).
Life is full of information in various forms: some of it tangible, such as weather, temperature, and weight, and some of it intangible, such as cognition, concepts, and experience. Wikipedia puts it well: “information can reduce uncertainty”. By collecting information, we build an awareness of the world, judge the state of our environment, and even predict what may happen in the future.

Data

Data can be described as the materialization of information. For example, we usually rely on our skin to sense the outside temperature, and that sensation is still just information. When we measure the temperature with a thermometer and record it, it becomes data. Today this data is usually stored on electronic media (a hard disk, RAM, or tape), and we can use computers to access it and perform computations on it.
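To make the thermometer example concrete, here is a minimal sketch of what "recording" might look like in code. The `TemperatureReading` class, the `record` function, and the sample values are all hypothetical names and numbers made up for illustration; they are not part of any particular library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TemperatureReading:
    """One recorded measurement: the point where information becomes data."""
    measured_at: datetime
    celsius: float


def record(celsius: float, log: list) -> TemperatureReading:
    """Append a timestamped reading to the log and return it."""
    reading = TemperatureReading(datetime.now(timezone.utc), celsius)
    log.append(reading)
    return reading


# Hypothetical readings taken from a thermometer.
log = []
record(21.5, log)
record(23.0, log)

# Once recorded, the readings can be accessed and computed on.
average = sum(r.celsius for r in log) / len(log)
print(f"{len(log)} readings, average {average:.2f} °C")
```

The feeling of warmth on your skin cannot be averaged, but the recorded readings can, which is the practical difference between information and data described above.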

Product

A product is defined by Wikipedia as “a tangible or intangible vehicle that satisfies a person’s needs or desires”. Putting these definitions together: data is a carrier of information, and information helps people reduce uncertainty, so a data product can be defined as “something made of data or information that helps people reduce uncertainty”.

Types of Data Products

People today rely heavily on data to make decisions. Data products can be grouped into five levels according to how much the data has been processed.

Raw data: Information exactly as it was recorded. This is the most basic form of data.

Processed data: Raw data that has been adjusted or organized to meet specific requirements.

Model: The result of running an algorithm over the data, which describes the distribution of that data. From this level on, the data can no longer be seen in its original form.

Decision support system: A system that presents data for people to consult when making decisions, such as BI reports and alerting systems.

Automatic decision-making system: A system that makes decisions automatically based on data, such as a recommendation system or an autonomous driving system.

What characterizes data products is that they have to be built up layer by layer. Just as the rice cakes of French cuisine still start from the basic ingredient, rice, a data product cannot skip layers. If you want to build a model (the third layer), you first need to collect the raw data of the first layer, and then transform it into the features the model needs.
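The layering can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the hourly temperature readings, the straight-line fit chosen as the "model", the 20 °C threshold, and every name in the code (`fit_line`, `fan_should_run`, and so on) are hypothetical and stand in for whatever a real project would use.

```python
from statistics import mean

# Layer 1: raw data, exactly as recorded (hypothetical hourly readings).
raw = [
    {"hour": 0, "temp": "18.5"},
    {"hour": 1, "temp": "19.0"},
    {"hour": 2, "temp": None},  # a failed measurement
    {"hour": 3, "temp": "20.0"},
    {"hour": 4, "temp": "20.5"},
]

# Layer 2: processed data. Drop missing values and convert strings to numbers.
processed = [(r["hour"], float(r["temp"])) for r in raw if r["temp"] is not None]


# Layer 3: model. A least-squares line is used here purely as an example.
def fit_line(points):
    """Return (slope, intercept) of the least-squares line through the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(processed)

# Layer 4: decision support. Present a forecast for a person to act on.
forecast = slope * 5 + intercept
report = f"Forecast for hour 5: {forecast:.1f} °C"
print(report)


# Layer 5: automatic decision-making. Act on the forecast without a person.
def fan_should_run(predicted_celsius, threshold=20.0):
    """Decide automatically whether to switch a (hypothetical) fan on."""
    return predicted_celsius > threshold


print("Fan on:", fan_should_run(forecast))
```

Each layer consumes the output of the one before it: the model cannot be fitted until the raw readings have been cleaned, and neither the report nor the automatic decision exists without the model.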

Later articles in this series will introduce examples of the data product types in each layer, so that you can gain a deeper understanding of data products.


Bryan Yang
A multi hyphen life

Data Engineer, Data Producer Manager, Data Solution Architect