A Primer on Opportunities for Decentralized Data Platforms

Daniel Mason
Published in Elemetric
7 min read · Dec 15, 2017

This post explores the opportunities for data platforms that leverage decentralized or blockchain technology, either replacing existing platforms or creating entirely new paradigms for data.

Since the launch of Ethereum, a recurring claim among decentralists and blockchain enthusiasts has been that blockchain applications are going to “eat” centralized software: executable code running on decentralized global computers would erase the need for centralized infrastructure while inheriting the characteristics of decentralized systems.

Scalable, cost-effective ways of storing and leveraging data, though, are at the center of the most significant software companies in the world. These companies extract value through targeted advertising and artificial intelligence, with heaps of consumer data as the principal ingredient that powers value creation.

The development of blockchain and decentralized technology provides an alternative to the centralized infrastructures that dominate software development today. A decentralized infrastructure is attractive because of its resistance to downtime, censorship, fraud, and interference by third parties.

The decentralized web tools for gathering, storing, and monetizing data are still in their infancy, but the use cases these platforms enable are becoming increasingly apparent.

At their simplest, decentralized solutions serve as one-to-one replacements for existing centralized platforms. In the long term, blockchain and decentralized technology could unlock use cases that are impossible today, creating fundamentally new ways for data to be collected, shared, and leveraged.

This post aims to highlight and discuss some of those use cases and to briefly touch on the companies that are working to bring them to reality.

Use-Cases for Data on Blockchain

A few compelling use cases are as follows:

  1. Smart Contracts as New Application Backends (and the Need for Oracles)
  2. Decentralized Data Storage
  3. Industry Solutions for Data Sharing and Contributing
  4. Consumer Data Reclamation
  5. Open AI Initiatives

1. Smart Contracts as New Application Backends

The continued progress of Ethereum as a software ecosystem has generated thousands of use cases that could be enabled by “Smart Contracts.” Smart contracts are decentralized cloud functions, and you can think of them as the logical extension of the “serverless architecture” that many startups use to build web apps today with AWS Lambda. A good list of these Ethereum projects using smart contracts, from proposal phase to production, can be found here. Despite hundreds of active projects, the “Smart Contract” ecosystem suffers from a lack of access to third-party data: a contract cannot fetch information from outside the blockchain on its own. That gap needs to be filled by oracles, services that publish external data on-chain, as proposed by Ethereum co-founder Vitalik Buterin back in 2014.

A lack of access to data, coupled with scalability issues, has confined most of the “live” Smart Contract applications, also called “dApps”, to insular use cases that don’t require third-party data. If dApps are poised to supplant centralized application backends, though, then data enablement, through oracles or otherwise, will be a necessary component.

It’s worth noting, though, that in a future where all important data originates from decentralized applications, this becomes less of a concern than it is today.
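To make the oracle idea concrete, here is a minimal sketch in plain Python. It is a simulation, not Ethereum code, and every name in it (PriceOracle, InsuranceContract, the "trusted-feed" reporter) is hypothetical. The point it illustrates is that the contract never reaches outside the chain itself; it can only read values that a designated reporter has already published.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PriceOracle:
    """Hypothetical on-chain data store that only one reporter may update."""
    reporter: str
    _prices: Dict[str, float] = field(default_factory=dict)

    def report(self, sender: str, symbol: str, price: float) -> None:
        # Reject anyone other than the designated reporter.
        if sender != self.reporter:
            raise PermissionError("only the designated reporter may post data")
        self._prices[symbol] = price

    def latest(self, symbol: str) -> Optional[float]:
        # Returns None when nothing has been reported for this symbol yet.
        return self._prices.get(symbol)


@dataclass
class InsuranceContract:
    """Toy contract that pays out when the reported price drops below a floor."""
    oracle: PriceOracle
    symbol: str
    floor: float

    def should_pay_out(self) -> bool:
        # The contract cannot call an external API; it reads the oracle only.
        price = self.oracle.latest(self.symbol)
        if price is None:
            raise LookupError("no oracle data available yet")
        return price < self.floor


oracle = PriceOracle(reporter="trusted-feed")
contract = InsuranceContract(oracle=oracle, symbol="WHEAT", floor=5.0)
oracle.report("trusted-feed", "WHEAT", 4.2)
print(contract.should_pay_out())
```

The sketch also shows where the trust sits: whoever controls the reporter controls the outcome of the contract, which is why oracle design matters so much for dApps that depend on outside data.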

2. Decentralized Data Storage

The marketplace for data storage today is dominated by a few giant, centralized players that have created cost-effective consumer products and, in doing so, have become gatekeepers for the world’s data. Dropbox and Google Drive provide “pure” data storage solutions, giving consumers the equivalent of an external hard drive in the cloud, while other companies like Facebook or Netflix provide “free” data storage but derive tremendous profit from their usage of the data.

One of the most substantial early opportunities for decentralized technology is replacing centralized storage providers with decentralized providers, by creating a “sharing economy” marketplace for data storage. Heavily-funded decentralized projects like Filecoin / IPFS Consortium, Sia Tech, and Storj are chasing this opportunity.

3. Industry Solutions for Data Sharing and Contributing

Some opportunities enabled by blockchain and decentralized technology are profoundly new, creating unprecedented use cases rather than replacing existing solutions to today’s problems. New opportunities within the data sector often revolve around re-thinking how data will be contributed, shared, and monetized among a consortium of individuals on public blockchains or companies on private blockchains (or in an encrypted fashion on public blockchains).

Historically, data had one “owner” and could be sold directly or monetized through a select few channels like advertising, AI, or the creation of market research reports. Decentralized technology, especially blockchains, creates new, secure ways that groups of participants can monetize data, creating a new paradigm for contributing and leveraging information that is not restricted by the need for an “owner.”

A few examples of these types of new data sharing arrangements are:

  • Personal Identity platforms, driven by 3rd party attestations, that are decentralized but strengthened by collective contributions (Civic is an example)
  • New approaches to credit scoring and reporting that remove unreliable gatekeepers like Experian or TransUnion, in favor of decentralized networks (Bloom is an example)
  • Sharing lists of known money launderers or fraudulent transactions among banks to reduce fraud without eroding competitive advantage

4. Consumer Data Reclamation

It should be no surprise that the leaders in AI and targeted advertising are the same companies that have the global reach to gather and store troves of valuable consumer data. Whether this data is used to predict shopping behaviors (Amazon), train self-driving cars (Google, Uber), or keep you enthralled by your favorite shows (Netflix), the existing paradigm is only going to make rich companies richer.

Decentralized technology, though, has the potential to revolutionize ownership of data for individuals and access to data for startups, providing smaller companies and individuals with ways to securely share and profit from their existing data. In aggregate, these pooled contributions could reach the scale of the datasets held by today’s tech giants.

This would mean a consumer reclamation of personal data, aided by companies that make the process smoother than it has been in the past. A shift in data ownership does not necessarily prevent companies from profiting from consumer data; however, it means that companies would have to “rent” data from users for a price, or at least ask for explicit authorization to use consumer data for targeted advertising or to train AI models. This extra friction would likely reduce profitability and would actively work against monopolistic behavior.

Blockstack Inc is a company working to facilitate this future by creating an application ecosystem where consumers retain their data with a BYO approach to data storage (currently supporting Dropbox and IPFS, among others). Applications built on Blockstack use a shared data store that belongs to the consumer and is entirely portable from one application to the next. This approach means that applications are “thinner” and more modular, lacking the same stickiness they have on the internet today.
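A rough way to picture this user-owned, portable data store is the Python sketch below. It is not Blockstack's API; the class and method names (UserDataStore, authorize, revoke) are hypothetical, and an in-memory dictionary stands in for whatever storage backend the user brings. It shows two applications reading and writing the same user-controlled record, with access that the user can grant or withdraw.

```python
from typing import Dict, Set


class UserDataStore:
    """Hypothetical store owned by the user, not by any single application."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self._records: Dict[str, str] = {}   # stands in for the user's backend
        self._authorized_apps: Set[str] = set()

    def authorize(self, app_name: str) -> None:
        # The user explicitly grants an application access to the store.
        self._authorized_apps.add(app_name)

    def revoke(self, app_name: str) -> None:
        # The data stays with the user; only the app's access goes away.
        self._authorized_apps.discard(app_name)

    def write(self, app_name: str, key: str, value: str) -> None:
        self._check(app_name)
        self._records[key] = value

    def read(self, app_name: str, key: str) -> str:
        self._check(app_name)
        return self._records[key]

    def _check(self, app_name: str) -> None:
        if app_name not in self._authorized_apps:
            raise PermissionError(f"{app_name} is not authorized by {self.owner}")


store = UserDataStore(owner="alice")
store.authorize("photo-app")
store.write("photo-app", "display_name", "Alice")

# A second application reuses the same record instead of collecting it again.
store.authorize("chat-app")
print(store.read("chat-app", "display_name"))

# Leaving the first application does not take the data with it.
store.revoke("photo-app")
```

Because the record lives in the user's store rather than in either application, switching applications costs the user nothing, which is the loss of "stickiness" described above.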

5. Open AI Initiatives

One of the most simultaneously futuristic and compelling use cases for decentralized data is the ability to build shared datasets that can be used to securely train AI without ever exposing the unencrypted data to the AI (and vice versa). This vision for the future is being worked on by a few projects like OpenMined, Enigma Project, and BigchainDB, and it has the potential to decouple AI excellence from data wealth. As of today, this coupling has perpetuated enterprise profitability and hindered AI development outside of large companies with robust data sets.

In an “Open AI” world, individuals or companies with access to data that could provide value to other organizations could “lease” that data without revealing the data itself. They could distribute this data through distributed, collaborative data marketplaces that broker the data for use by other companies. These new shared datasets, at scale, could contend with the datasets collected by Facebook, Uber, Google, and similar companies, except with data coming from thousands of smaller sources instead of a single, centralized source. In this way, startups could gain access to valuable data, rebalancing the opportunity for AI innovation. Additionally, this new framework would let startups and individuals who lack the tools or scale to profit from their data on their own create shared pools that distribute the wealth derived from data more broadly.

This use case combines a series of cutting-edge technologies, including decentralized data storage, blockchains, AI, and homomorphic encryption. While likely years away from implementation, it is fascinating to consider the long-term data opportunities that are unlocked by the decentralized web.
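To give a feel for what "computing on data without seeing it" means, here is a toy Python sketch of the Paillier cryptosystem. Paillier is only additively homomorphic, so it is a much simpler relative of the schemes these projects would need, and it is our illustrative choice rather than something any of the projects above is confirmed to use. The primes are tiny and the code is not secure; the function names are our own. It requires Python 3.8 or later for the modular inverse via pow.

```python
import math
import secrets


def generate_keypair(p: int, q: int):
    """Build a toy Paillier key pair from two distinct primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                 # standard simple choice of generator
    mu = pow(lam, -1, n)      # modular inverse; valid because g = n + 1
    return (n, g), (lam, mu)


def encrypt(public_key, m: int) -> int:
    n, g = public_key
    n_sq = n * n
    while True:               # pick random r in [1, n) that is coprime to n
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(public_key, private_key, c: int) -> int:
    n, _ = public_key
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(public_key, c1: int, c2: int) -> int:
    """Multiplying ciphertexts yields an encryption of the sum of plaintexts."""
    n, _ = public_key
    return (c1 * c2) % (n * n)


public_key, private_key = generate_keypair(1009, 1013)

# Three data owners each encrypt a private value.
ciphertexts = [encrypt(public_key, value) for value in (120, 340, 95)]

# An aggregator combines the ciphertexts without ever seeing the plaintexts.
total = ciphertexts[0]
for c in ciphertexts[1:]:
    total = add_encrypted(public_key, total, c)

# Only the private key holder can read the result.
print(decrypt(public_key, private_key, total))  # prints 555
```

Training a real model this way requires far more than addition, which is part of why we consider this use case years away, but the sketch shows the core property: useful results can be computed by a party that never holds the raw data.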

Conclusion

If blockchain and decentralized technologies are going to “eat” centralized software development in the way that they have begun “eating” the financial services sector already, then the incorporation of more scalable, efficient ways to store data becomes an industry prerequisite.

It’s easy to look at the scalability issues being faced by blockchains today and conclude that these types of data use cases will never be possible; however, with countless projects underway to scale data storage capabilities and transaction throughputs, we’re strong believers that, with time, these visions will become reality.

We predict that even the most basic use-cases (where existing centralized data tools and processes are replaced with decentralized solutions) are still 2–3 years from being competitive production-grade options. The paradigmatically new use-cases, like consumer data reclamation and open AI initiatives, are even further from production but require the kind of revolutionary planning and pre-work that is being done across the industry today.

Nonetheless, the infusion of capital into the sector is fueling development at a rapid rate. Many large ICOs for data storage platforms occurred throughout 2017, including Filecoin ($257M), Enigma ($45M), and Storj ($30M).

Regardless of the timeline, decentralized technology has the potential first to replace, then to transform how consumers and businesses interact with data, reshaping how software companies (and beyond!) derive value and interact with their customers. We are excited to continue exploring, building and participating in this space as these future-focused visions rapidly become a reality.

Daniel Mason
Elemetric

Founder @ Spring Labs; Re-inventing credit and identity for financial services. Formerly @Techstars, @IDEO, @Red Hat