Data Product Culture

Stefan Hermanek
People.ai Engineering
8 min read · Jul 9, 2019

This post highlights opinions that are my own and not the views of my employer.

tl;dr — Building Data Products requires rethinking your product; just relying on data driven product development won’t cut it; aim to build a product culture that sets you up for success — and perhaps even call it ‘data product culture’.

Since joining People.ai, I have had the great pleasure of working alongside some of the industry’s brightest engineers, data scientists, and product managers; and together, we’ve set out to solve some of the industry’s most exciting problems for our customers.

On this journey, I’ve realized that building data products requires a fundamental rethinking of product development, and it must start with a data product culture. This post highlights the evolution of products on the data maturity curve. I share some lessons learned, along with practical advice for when your organization and product portfolio mature. The goal? To shape your culture into a data product culture.

Minimum viable data quality is use case dependent

In Silicon Valley, we love to talk about use cases and minimum viable [fill-in-the-blank]. In this spirit, and to gather the highest number of possible buzzwords, let’s look at some definitions:

  • Use cases are something your users do (no surprise here), based on a need they have.
  • Capabilities power one or more use cases. For example, if your users have a need to find things and take an action, you may wish to offer them the capability of search. Your capability should strive to delight users.
  • In offering some capabilities (and, increasingly, any capability), data is key. Some capabilities are more dependent on data than others; and some capabilities are virtually impossible to achieve without a certain degree of data quality. Let’s call this minimum viable data quality.

Here’s a chart to illustrate the above points:

As if a chart isn’t enough, here’s an example:

Say you are building an online store selling five different kinds of sneakers to a reasonable number of people. In the offline world, you could just display your product in the physical store (or so my ancestors have told me). On the web, your store will at least have to show the sneakers being sold, perhaps with the price point for each. You have just built a data display product. The data you display should be accurate, but other than displaying it, you don’t do much with it. If it’s uneven, imbalanced, or incomplete in certain areas, you may even be able to work with your designers to still enable the core capability of buying sneakers.

Your site then gains massive traction, and you peek at a competitor called Amazon.

They seem to be recommending products to their user base; that’s cool. You decide to personalize the experience per user cohort, or even per user. You’ve just evolved into a data enhanced product.

Your user experience now depends on the data quality more strongly than when you were just selling five pairs of sneakers; but when stuff breaks (and it will), there’s always the fall-back option of your unpersonalized sneaker-selling industry behemoth.
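To make the fall-back idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the sneaker names, the `UserProfile` fields, the `completeness` score, and the `MIN_COMPLETENESS` threshold are hypothetical and not taken from any real system.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical catalog for the five-sneaker store in the example.
DEFAULT_RANKING = [
    "classic-white", "runner-blue", "high-top-red", "slip-on-black", "trail-green",
]

# Assumed threshold: below this, we do not trust the data enough to personalize.
MIN_COMPLETENESS = 0.6


@dataclass
class UserProfile:
    user_id: str
    completeness: float  # assumed quality signal: share of populated fields, 0.0 to 1.0
    favorite: Optional[str] = None


def rank_sneakers(profile: Optional[UserProfile]) -> List[str]:
    """Personalize when the data supports it; otherwise serve the default ranking."""
    if profile is None or profile.completeness < MIN_COMPLETENESS:
        return list(DEFAULT_RANKING)  # fall back to the unpersonalized store
    if profile.favorite not in DEFAULT_RANKING:
        return list(DEFAULT_RANKING)  # unknown or missing favorite: fall back as well
    # Move the user's favorite to the front and keep the rest in default order.
    rest = [s for s in DEFAULT_RANKING if s != profile.favorite]
    return [profile.favorite] + rest
```

The point of the sketch is that the unpersonalized path is always available, so a data quality problem degrades the experience instead of breaking it.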

At some point, you’re tired of walking around in sneakers and wish to drive instead. But you still love your sneakers, and you want to watch them while driving, so you decide to build a self driving car. (Yes, I firmly believe this is the reason behind and genesis of self driving cars.)

Building a self driving car is hard, so you shoot for a car that “simply” detects if the traffic light is red or green, and alerts you when a traffic light turns green. You can watch your sneakers while the light is red. At this point, the capability you built is critically dependent on data. Your product has turned into a data critical product.

If you get it wrong, someone may die. To even achieve a minimum viable product, you require a large set of high quality (labeled) data points of red and green lights. You obtain it, and you build an amazing red-light-look-at-your-sneakers-apparatus.

You keep driving around in your car, adoring your sneakers at every red light, and decide to buy a dash camera. You wish to record everything you see, and label objects on the car (seems like something someone may naturally wish to do, right?). You already have hundreds of self driving car startups waiting for your labeled dataset.

At this point, data is your product. To sell it, your minimum viable data quality threshold is off the charts — any mistake may lead to errors that propagate; and your customers are certainly building more advanced use cases than the red-light-look-at-your-sneakers-apparatus.

Even, and dare I say especially, in the age of “Big Data”, it is crucial to understand the minimum viable data quality required to offer your users a certain capability.
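One way to make this explicit is to write the threshold down per product type and check your data against it. The sketch below is hypothetical throughout: the `ProductTier` names mirror the four stages in the sneaker story, the numbers in `MIN_VIABLE_QUALITY` are illustrative placeholders rather than recommendations, and "quality" is reduced to field completeness, which is only one of many possible measures.

```python
from enum import Enum


class ProductTier(Enum):
    DATA_DISPLAY = "data display"
    DATA_ENHANCED = "data enhanced"
    DATA_CRITICAL = "data critical"
    DATA_AS_PRODUCT = "data is the product"


# Illustrative placeholder thresholds; real values depend on your use case.
MIN_VIABLE_QUALITY = {
    ProductTier.DATA_DISPLAY: 0.80,
    ProductTier.DATA_ENHANCED: 0.90,
    ProductTier.DATA_CRITICAL: 0.99,
    ProductTier.DATA_AS_PRODUCT: 0.999,
}


def measured_quality(records, required_fields):
    """Share of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(r.get(f) not in (None, "") for f in required_fields) for r in records
    )
    return complete / len(records)


def supports(tier, records, required_fields):
    """True if the measured quality clears the assumed threshold for this tier."""
    return measured_quality(records, required_fields) >= MIN_VIABLE_QUALITY[tier]
```

With 19 complete records out of 20, this check would pass for a data enhanced product and fail for a data critical one, which is exactly the conversation you want to have before you commit to the capability.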

Strategies and tactics for building data products

I wish I could recommend fancy transfer learning techniques to leapfrog from data display products to data critical products. Perhaps that is possible in some cases or domains; perhaps not. Instead, below are some strategies and tactics that have proven rather helpful during past experiences, and they amount to what I’d call a “Data Product Culture”.

Know where you stand and where you should stand

This should go without saying, but make sure you know where you fall in the above chart (or any other chart that makes sense to your product). If you can serve your users sufficiently well with data display or data enhanced products, go celebrate. Your job just got a whole lot easier. Don’t stumble into building a red-light-look-at-your-sneakers-apparatus when you really enjoy selling sneakers. And if you do decide to “pivot”, do it deliberately rather than through happenstance. Data Product Culture necessitates you consciously make decisions about how and where you choose to depend on data.

External messaging matters

No product should be built in isolation from your users. Your products should serve user needs. You should keep that in mind as a guiding light. And you should test if they actually do. But there’s more (with data products in particular): You should test different ways of externally messaging your capability.

Let’s use three features on the Amazon.com website as an example:

Source: Amazon.com

You can see there are three capabilities offered here:

  • What other items do customers buy after viewing this item?
  • Frequently bought together, and
  • Customers who viewed this item also viewed

Note that none of these capabilities is prescriptive. None of them makes an explicit recommendation or urges you to buy product A over product B. Each just states facts (based on a large data asset), and it enhances your user experience by giving you new information as well as social proof.
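To show how "stating facts" differs from prescribing, here is a minimal sketch of a "customers who viewed this item also viewed" capability built from plain co-occurrence counts. This is not Amazon's method, which I don't know; the function names, the session format, and the item IDs are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_view_counts(sessions):
    """Count how often each pair of items was viewed in the same session."""
    counts = Counter()
    for items in sessions:
        # Deduplicate and sort so each pair is counted once per session, in a stable order.
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts


def also_viewed(item, counts, top_n=3):
    """Items most often viewed together with `item`, with the raw counts."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return related.most_common(top_n)


# Toy sessions, invented for this example.
sessions = [["hp1", "hp2"], ["hp1", "hp2", "lotr"], ["hp1", "lotr"], ["hp2"]]
print(also_viewed("hp2", co_view_counts(sessions)))  # [('hp1', 2), ('lotr', 1)]
```

The output is a count, not an opinion. How you phrase it to the user ("also viewed" versus "you should buy") is a separate decision that you can test.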

In an alternate world (and if Amazon were a bit more in your face), one could easily imagine a capability like: “We really think you should buy a Lord of the Rings novel instead of Harry Potter”. That would be concerning to some users, and quite truly, a bit too in your face.

The bottom line: External messaging of your product matters, particularly when you build a data product. To embrace Data Product Culture, be open to testing out different messaging techniques, particularly in areas where you still need to build out full confidence in your minimum viable data quality.

Fast Baselines + Error Analysis + Targeting = A good starting point

Obtain a baseline the moment you start working on a project or capability. The speed to baseline is crucial. Get it on day zero. Prove the feasibility before anything else.

Having this baseline will allow you to perform error analysis. Which segment of users or customers are you able to delight? Which customers or users would you prioritize during roll-out? And where do you need to do further research?

When you couple your baseline and your findings from the error analysis with targeting, you have a good starting point. You can validate if there is actual interest. And if there is, close the gap for the subset of users where rolling out a feature isn’t feasible yet, and improve over the baseline.

Data Product Culture emphasizes speed of experimentation, analyzing your product’s (data) weaknesses, and overcoming them through a mix of targeting and fallback options.
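The baseline, error analysis, and targeting loop can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the rule in `baseline_predict`, the `segment` and `past_purchases` fields, the example rows, and the 0.7 rollout threshold are all invented for this sketch.

```python
from collections import defaultdict


def baseline_predict(example):
    """Day-zero, rule-based baseline (hypothetical rule): repeat buyers buy again."""
    return example["past_purchases"] >= 1


def error_analysis(examples, predict):
    """Accuracy of `predict` per user segment."""
    hits, totals = defaultdict(int), defaultdict(int)
    for ex in examples:
        totals[ex["segment"]] += 1
        hits[ex["segment"]] += int(predict(ex) == ex["label"])
    return {seg: hits[seg] / totals[seg] for seg in totals}


def rollout_targets(accuracy_by_segment, min_accuracy):
    """Segments where the baseline is good enough to roll out first."""
    return sorted(s for s, acc in accuracy_by_segment.items() if acc >= min_accuracy)


# Toy labeled examples, invented for this sketch.
examples = [
    {"segment": "returning", "past_purchases": 3, "label": True},
    {"segment": "returning", "past_purchases": 1, "label": True},
    {"segment": "returning", "past_purchases": 2, "label": False},
    {"segment": "returning", "past_purchases": 4, "label": True},
    {"segment": "new", "past_purchases": 0, "label": True},
    {"segment": "new", "past_purchases": 0, "label": False},
]

accuracy = error_analysis(examples, baseline_predict)
print(accuracy)                        # {'returning': 0.75, 'new': 0.5}
print(rollout_targets(accuracy, 0.7))  # ['returning']
```

Here the baseline is good enough for returning customers and no better than a coin flip for new ones, so you would target the former for roll-out and focus further research on the latter.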

Building Data Products > Data Driven Product Development

You’ve almost made it to the bottom of this article, so here comes the incendiary part.

Building data products requires more than just data-driven product development.

Data-driven product development is awesome. It allows you to prioritize areas for improvement, run experiments, and know rather than assume if you’ve actually made a dent.

The paradox here is that when building your initial set of data products, you may not have all that much data; it may seem daunting to build something that is implicitly dependent on data without high velocity data to support it; and you may not know where to start.

That’s why there are a couple of additional things involved in building data products:

  1. Strong business sense to inform rules for initial baselines.
  2. Insane customer focus to know if the baselines cut it.
  3. Trust and conviction that you can improve over baselines.
  4. The passion and drive to actually do (3).

In other words, when you are just starting out with a product, be prepared to forgo building the fanciest machine learning models and listen to the customer instead.

Data product culture

When you put all of the above together, a realization emerges: Your old way of running combined Engineering-DataScience-Product teams likely won’t cut it, for many reasons outlined above.

What I’ve found helpful to prioritize instead are:

  • Experimental velocity and repeatability: how quickly can you generate a baseline or a baseline improvement, evaluate it with customers, and decide whether you’ve improved?
  • Frequent communication: your ideas alone won’t cut it; get everyone on the same page and fast forward a hundred pages together.
  • End to end ownership: how close can you get to the customer or user in solving their problems?
  • Building awareness among consumers of your data: this is a direct consequence of end to end ownership, but too important not to call out. Make sure that internal consumers of your data know the caveats behind your data. Make sure they don’t assume your service is perfect. And make sure that errors don’t compound.

It doesn’t matter if you’re an engineer, a data scientist, a product manager, or a product marketer. What matters is that you create a data product culture that will best serve the needs of your customers and your business.
