Product Management for Big Data

Published in Agile Insider, Aug 15, 2018

These are the earliest days of Big Data, and the vast troves of information available to product managers represent our greatest opportunity and our most challenging puzzle. Brilliant insights, features, and products can emerge from the petabytes, exabytes, and zettabytes ingested daily. But great product development based on data does not occur if the product manager can’t confidently interrogate a store of information.

That confidence arises from a deep understanding of the questions that the end user is asking, or has not yet thought to ask, and a grasp of the pivotal insights that would create real change in the behavior of that user.

Here we aren’t going to emphasize the visual representation of data, but instead the coaxing of useful conclusions from the raw (or classified, or mapped and reduced) data at hand. These insights can be used for the greater good, or sold, or both.

For example, a product manager might create a customer insights report based on advertising or sales data, or the two types of data combined, to allow retailers to understand the effectiveness of their marketing efforts. Such reports would be sold as products, and would offer real value to the client.

Or a product manager might give a team of medical researchers a new way to examine clinical trial data, so that they can pick out the significant anomalies from a field of others, most of which might be meaningless.

But how can we distinguish useful data from its opposite? For a product manager to harness information properly, it should be evaluated against four key criteria:

1. Timeliness
2. Statistical Significance
3. Accuracy and Precision
4. Ability to Guide Valuable Action

We will examine each of these to arrive at a coherent understanding of our data sets.

1. Timeliness

Data that is old does not always matter less, even though old data might have been collected less frequently, or at a less detailed level. When changes over intervals of elapsed time are particularly important, we refer to the information as time series data. Data of this kind is crucial to the financial industry.

For cyclical periods of observation (weekly, monthly, or yearly cycles), historical data is of course essential. The same seasonal advertising campaign can reveal performance differences (or similarities) when its results from different years are overlaid on the same part of the year, for instance.
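As a rough illustration of overlaying a seasonal campaign by year, here is a minimal Python sketch. The function name `overlay_by_year` and the sales figures are hypothetical, invented for this example, and aligning years by ISO week number is only one assumption about how the periods might be matched.

```python
from collections import defaultdict
from datetime import date


def overlay_by_year(observations):
    """Group (date, value) pairs into {iso_week: {iso_year: total}}.

    Aligning on the ISO week number lets the same part of the
    calendar be compared across different years.
    """
    table = defaultdict(lambda: defaultdict(float))
    for day, value in observations:
        iso_year, iso_week, _weekday = day.isocalendar()
        table[iso_week][iso_year] += value
    # Convert nested defaultdicts to plain dicts, sorted by week.
    return {week: dict(years) for week, years in sorted(table.items())}


# Hypothetical daily sales for the same late-November campaign in two years.
sales = [
    (date(2016, 11, 21), 120.0),
    (date(2016, 11, 23), 80.0),
    (date(2017, 11, 20), 150.0),
    (date(2017, 11, 22), 95.0),
]

overlay = overlay_by_year(sales)
for week, totals in overlay.items():
    print(f"ISO week {week}: {totals}")
```

Each row of the result lines up one week of the campaign across years, which is the comparison a seasonal overlay is meant to make easy.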

As noted in this wonderful article, Bitcoin’s price has been charted against the price of gold sold on the commodities markets to reveal an uncanny echo when the timescale is adjusted to months for the cryptocurrency.

[Figure: Bitcoin price charted against the price of gold. The Bitcoin price intervals are months, not years.]

2. Statistical Significance

Too small a set of data, meaning a sample size that does not offer a representative distribution of the larger population of data, can tell a deceptive story. The particular sample you’re working with might contain more outliers than the general population, or fewer. There are many distortions possible.

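Margins of error and confidence intervals offer a way to quantify how far an estimate drawn from a sample might fall from the true value in the larger population. (The margin of error is half the width of the confidence interval.) Factors determining the width of the confidence interval include sample size, the chosen confidence level, the degree of data homogeneity (did 99% of people in the sample set answer Yes?), and, when the sample is a sizable share of it, population size.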
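To make those factors concrete, here is a minimal Python sketch of the margin of error for a yes/no survey question. It assumes the common normal approximation for a proportion, with an optional finite population correction. The function name `margin_of_error` and the sample numbers are illustrative only, and the approximation becomes unreliable for small samples or for proportions very close to 0 or 1.

```python
import math


def margin_of_error(p, n, population=None, z=1.96):
    """Approximate margin of error for a sample proportion.

    p          -- observed proportion answering "Yes" (0.0 to 1.0)
    n          -- sample size
    population -- optional population size, for the finite
                  population correction
    z          -- z-score for the confidence level (1.96 ~ 95%)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    standard_error = math.sqrt(p * (1.0 - p) / n)
    if population is not None:
        if population < n:
            raise ValueError("population cannot be smaller than the sample")
        if population > 1:
            # Finite population correction: shrinks the error when the
            # sample covers a large share of the population.
            standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error


# Hypothetical survey of 400 respondents.
print(margin_of_error(0.50, 400))                   # evenly split answers
print(margin_of_error(0.99, 400))                   # nearly unanimous answers
print(margin_of_error(0.50, 400, population=1000))  # small population
```

Under these assumptions, a more homogeneous sample (99% answering Yes) gives a narrower margin than an evenly split one, and a sample that covers a large share of a small population narrows it further.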

3. Accuracy and Precision

Precision is the nearness of two or more measurements to one another, which is a virtue if they are measurements of the same or similar phenomena. Accuracy is the nearness of a measurement to a known comparison point.
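The distinction can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a standard method: accuracy is summarized here as the distance of the mean from a known reference value, and precision as the sample standard deviation. The function names and the scale readings are hypothetical.

```python
import statistics


def accuracy_error(measurements, reference):
    """Distance between the mean measurement and a known reference value.

    Smaller is more accurate.
    """
    return abs(statistics.mean(measurements) - reference)


def precision_spread(measurements):
    """Sample standard deviation of the measurements.

    Smaller is more precise. Requires at least two measurements.
    """
    return statistics.stdev(measurements)


# Hypothetical readings of a 100.0 g reference weight on two scales.
reference_weight = 100.0
scale_a = [102.1, 102.0, 101.9, 102.0]  # tightly grouped, but off target
scale_b = [97.0, 103.0, 99.0, 101.0]    # centered on target, but scattered

for name, readings in (("A", scale_a), ("B", scale_b)):
    print(
        f"Scale {name}: accuracy error = "
        f"{accuracy_error(readings, reference_weight):.2f}, "
        f"spread = {precision_spread(readings):.2f}"
    )
```

In this made-up data, scale A is precise but not accurate, while scale B is accurate on average but not precise.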

We want data to be accurate, and when no comparison points are available to judge accuracy, we want the data to have a high level of precision. But doing real justice to these concepts is beyond the scope of this article. See a high-level explanation of them here.

4. Can We Act on This Data?

Data that doesn’t guide us to a specific useful action is data that might be irrelevant, or that hasn’t told its story well. To ascertain whether actionable insights have been uncovered, ask yourself these questions:

Does this data change your hypotheses about the population in question?
Does this data change the segmentation you will perform on the population?
Does this data let you approach the population in a different way, whether theoretically or in practice?

Conclusion

Applying these four criteria allows us to turn large, intimidating data sets into actionable, pertinent insights that a product manager can use to build meaningful new products.

Philip Hopkins is a product manager working in marketing and ad tech, and in finance.