The Future of Data Sales

What does the future of data sales look like?

If nothing else, it’s clear we’re on the cusp of big change — there’s never been such a proliferation of data before. Everywhere you look there’s a new company sprouting up around a data stream that didn’t exist (or wasn’t organized) a year or two ago, or building tools to help companies manage this data explosion.

A few examples:

  • Planet Labs
  • Satori
  • The Million Song Dataset (hosted as an AWS public dataset)
  • SafeGraph
  • Google’s Public Data Explorer
  • PopWallet
  • Vertical Mass
  • So many others…

It’s easy to look at this and conclude that over time, there will be more and more data companies, some with large footprints (in the sense of providing lots of different kinds of data) and many with small, specialized footprints.

In lots of ways that’s a great thing, since diversity like this is often a positive for innovation. The Data Buyer of Tomorrow (DBoT) will be able to buy just about any sort of data they can think of and integrate it seamlessly into their marketing, QA, investing, and other processes. Sounds great!

It’s not all rosy in the future, though. Two potential stumbling blocks:

  • You’ll need to work with many different vendors to get all the data you might want.
  • It’ll be hard to separate really useful data from useless wastes of time at the margin.

So if we accept that there will be lots of small data companies and sources of data in the future, the DBoT will need to be good at working with lots of different vendors, and also good at figuring out which vendors are worth the effort of working with, negotiating contracts with, integrating with, etc.

Wherever there’s a need for a relatively rare combination of skills like that, there’s room for specialized companies to do that work for the DBoT. That’s why I think the intuition above is actually wrong: it’s not a foregone conclusion that the DBoT will end up juggling dozens of small vendors directly.

I think the future of data sales actually looks like consolidation, but on the aggregation side rather than on the sourcing side. There will still be tons of small-ish data companies that source different kinds of data from different places. Economies of scale matter less on the sourcing side because different data streams look so different.

If you’re good at sourcing, cleaning up, packaging, and selling financial data to hedge funds, it’s not obvious to me that you’re also going to be good at sourcing, cleaning up, packaging, and selling movement and location data to city planners. That leaves room for smaller, more specialized shops to work on the sourcing side.

Aggregation is another matter. If you aggregate data from a bunch of different places and package it for easy syndication to lots of clients, the more data you have, the more value you’re providing to those clients. Scale matters.

Someone like LiveRamp, which pulls together data from a bunch of different sources, provides immense value to its clients. Instead of buying the ten different kinds of data it cares about from ten different sellers, Macy’s can go to LiveRamp and get it all in one place. The fewer data aggregators there are, the better for Macy’s, at least from an efficiency perspective. Pricing is perhaps another story, but I’d say companies are increasingly willing to pay more for convenience.
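To put rough numbers on that efficiency argument, here is a back-of-the-envelope sketch. It is a toy model, not a description of how LiveRamp or anyone else actually works: the function names and the buyer and source counts are made up for illustration, and it assumes every buyer wants data from every source and that sources are split across aggregators with no overlap.

```python
def direct_relationships(buyers: int, sources: int) -> int:
    """No aggregators: every buyer contracts with every source separately."""
    return buyers * sources


def aggregated_relationships(buyers: int, sources: int, aggregators: int) -> int:
    """Toy model (assumption): each source feeds exactly one aggregator,
    and a buyer who wants everything must sign with every aggregator."""
    return sources + buyers * aggregators


if __name__ == "__main__":
    buyers, sources = 50, 10  # hypothetical market size
    print("direct:", direct_relationships(buyers, sources))
    for n in (1, 3, 10):
        print(f"{n} aggregator(s):", aggregated_relationships(buyers, sources, n))
```

In this model each buyer's own contract count equals the number of aggregators, which is why fewer aggregators means less overhead for a buyer like Macy’s. It says nothing about price.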

So what will the market look like in 5 or 10 years?

  • Many more “sourcers” of data
  • Only a few major data aggregators. Probably one or two per vertical, e.g. adtech, finance, self-driving cars, mapping, medicine.
  • Either another layer of “de-bullshitters,” or aggregators starting to get into the curation game. In other words, a recommendation layer for data buyers.

None of these layers has reached maturity yet, except maybe in a few specific verticals. For example, LiveRamp has a good shot at being the data aggregator of choice in the adtech ecosystem. There isn’t yet an equivalent in a bunch of other verticals, especially newer ones where “Big Data” is still a recent development.

The third layer barely exists yet because curation is so hard in a world with thousands of data sources. Most aggregators don’t want to tell customers what data to buy because they can’t afford to play favorites. As a result, their clients still have to figure out on their own (or by working with a bunch of different vendors) what data is most relevant.

There’s room for someone with a deep understanding of what data is available (or a better way of evaluating data sources) to step in and provide immense value to brands.

Keep an eye out.

Clapping shows how much you appreciated Matej Horava’s story.