Don’t Sell Your Data

Two decades ago, software was so hard to build that making simple workflow apps put you far ahead of other companies. That’s why Microsoft had a business worth $879 billion[1] in December 1999. Today, such apps are much easier to make thanks to a multitude of open source software, libraries, APIs and elastic cloud infrastructure. But you can’t accrue a competitive advantage by doing what’s easy. Technology companies today must build their advantage some other way, such as through network effects or systems that learn over data to deliver insights. We say “don’t sell your data” because feeding that data into an intelligent application is one of the few ways to accrue a competitive advantage in technology today.

The New Value Chain

Customers today demand software that delivers insights to help them make high-level decisions. To do that, the software must collect data, learn over that data and provide predictions. Collecting clean, workable and relevant data is key to creating a product that learns over time and can predict outcomes of business value. The best companies in this era vertically integrate so that their data, learning models and product are all geared towards developing the best ‘learning loop’ for a given business problem.

Companies that give away their data, on the other hand, allow others to capture the value at the top of the chain; selling your data gives birth to competitors.

Follow Google’s Example

Google provides many examples of a sound data strategy. Google Translate is free, but Google doesn’t sell the lexicon it is developing as users make translations. TensorFlow is free, but Google doesn’t sell the data people feed into their models. Google Search is free, but Google doesn’t (directly) sell search data to marketers or other search companies. Google is a durable business because it has consistently pursued a strategy of collecting valuable data to build intelligent products.

“Google has a huge new moat. In fact I’ve probably never seen such a wide moat.” — Charlie Munger

Google and Facebook both created a huge advantage in their core business — advertising — by keeping their data.

Google and Facebook run first party ads on their core websites, and use that information to power marketplace and also run an ad network. With a God’s eye view of the entire ad ecosystem, Google and Facebook can train the best algorithms to optimize revenues. Everyone else is at a colossal and unsurmountable disadvantage. — Tom Tunguz

It’s Not Too Late

Some companies take a while to protect their data. Facebook and LinkedIn, for example, realized partway through their lifetimes that their data was extremely valuable and cut off API access to it. They did this to protect their ability to capture the full value of their assets by selling high-value applications (LinkedIn’s Sales Navigator and Recruiting Solutions, and Facebook’s advertising platform). That would not have been possible had they simply sold the underlying data (profiles) to other companies, which would then have captured the value at the application layer.

The Markets Don’t Value Data Merchants

Counter-examples to today’s intelligent application companies include the data merchants Acxiom, Alliance Data, Dun & Bradstreet and Nielsen. These companies essentially sell raw data and are valued far less than Facebook and LinkedIn in the public markets. Their valuations are not bad, but not great.[2]

There are other examples of data merchants that are solid businesses but haven’t broken through to be public companies of consequence. These include ‘scraping’ companies such as Spokeo, Whitepages and ZoomInfo, and data brokers such as Datalogix and Factual.

Data Acquisition ‘Tricks’

We see a lot of companies with interesting data acquisition techniques hoping to be more valuable than the previous generation of data merchants (listed above). One example is a company that gets developers to install an SDK that collects background location data, then aggregates that data and sells it to retailers, real estate developers and marketers. This isn’t necessarily easy, and one can build a lucrative business by selling such data. However, we doubt that one can build a durable business this way, and such companies are likely to end up in the same position as the companies listed above. Firstly, the ‘trick’ may not last: another company could figure out the same trick, or a different trick to get the same data, and charge a lower price for it. Secondly, the trick may involve paying money out to data conduits (e.g. paying developers to install your SDK), and your profit from selling the data may disintegrate if those developers charge you more or your customers pay you less. Thirdly, one can’t capture value and earn recurring revenue from static data, because that data only provides value to customers at a point in time before decaying. By contrast, the reinforcing cycle of collecting data, learning over that data, providing valuable predictions, getting more usage, collecting more data, and so on compounds a competitive advantage over time.

Owning Your Data Is Just Step One

Accruing a proprietary dataset is hard and valuable, but it’s just Step One in creating a durable business. Feeding that data into a self-learning software system that produces insights for customers is the next step towards building a durable technology business today. Don’t sell your data, lest someone else take that next step before you do.

Thanks

Thanks to the team at Clearbit for reading a draft of this post.

Footnotes

  1. Microsoft’s market capitalization (in today’s dollars), an all-time record at the time.
  2. Arguing over the basis or date of comparison is futile given the order-of-magnitude difference between companies that sell their data and those that build intelligent applications with their data.