Introducing Data Unions: making crowdselling a reality
How do people sell their data? Individuals have certainly tried — offering it up on eBay is one tactic that goes all the way back to the year 2000. But almost two decades on from that early attempt, and despite the fact that the personal data market is worth tens of billions, there is still no convenient and scalable way for people to retail their digital information. And when we’re talking about real-time data, this problem is amplified.
What’s missing is:
- A very simple way to push data to those who want it
- An obvious venue for product discovery
- A way for individuals to pool their data to attract commercial buyers and compete with professional data brokers.
This lack of a decent solution for individuals has meant that, until now, data giants like Facebook, Twitter, Google and Amazon have been able to capture the personal data market by aggregating large volumes of data from individual users, and then amalgamating these feeds into products they can easily retail to big data buyers, usually in a closed shop fashion.
Far too often, this data oligarchy has operated without the informed consent of its users. And of course, even though it’s their data that’s being sold, users don’t get to see the money in their pockets.
Many passionate thinkers and technologists, including Jaron Lanier, Glen Weyl and Eric Posner, Jeremy Rifkin and, most recently, Yuval Noah Harari and Tim Berners-Lee, have advocated for a fairer data economy where users are directly remunerated for the value they create¹. So far, the biggest change to data markets has come from lawmakers in the shape of GDPR. The legislation’s intention was to rebalance control between these data giants and end users. But the law won’t change the underlying fundamentals. Only technology can, because this is fundamentally a technological problem.
Welcome to Crowdselling with Data Unions
Streamr project contributors, including myself, have been openly talking about Data Unions (formerly Community Products), our solution to these issues, for a few months now. This is something that wasn’t on Streamr’s original project roadmap, but we knew at the start of last year that if our platform was going to give people genuine ownership of their data, this kind of feature would have to be built.
So what is a Data Union? First and foremost, a Data Union is a data product that sits on the Streamr Marketplace. Like other products, such as this real-time pollution feed from over 60 countries, it will contain a number of data streams that are pushed from multiple sources.
What makes a Data Union different from other Marketplace products is the underlying opt-in data sourcing and revenue sharing mechanics. For the first time, end users can push the data they create to a larger saleable product and receive automatic payment every time that product sells. This represents a new and unique feature for the real-time data industry — what we’re calling crowdselling.
Crowdsource — people pool their knowledge to solve a common problem
Crowdfund — people pool their resources to fund a common endeavour
Crowdsell — people pool their property to enable a common transaction
Let me give you a few concrete examples of how crowdselling through a Data Union on the Streamr Marketplace will work. Over the last few months, one of the Streamr project’s talented community members, Gang Liu, has been figuring out how to push data from his personal Fitbit to a product he created on the Streamr Marketplace. You can view his early output here. Gang then went on to develop an app with his code as its backend. That app now allows anyone to push their own personal Fitbit data to their own Marketplace product in a few easy steps.
Without a way of combining all these Fitbit streams into one easy-to-purchase data product, there is unlikely to be any future for these multiple individual streams on the Marketplace. But by combining them into one product, multiple Fitbit users could produce big enough data sets to interest serious data buyers. This is what a Data Union achieves.
Streamr Labs has been working on a second example: a mobile prototype app that can pseudonymously send your phone’s GPS location data to the Marketplace. You can see it being tested by its co-developer Riku Ruokonen in the video below. All users will have to do is download the app to their smartphone and follow a few basic steps. After that, their data will be pushed to a dedicated Data Union where it will be aggregated with other streams and offered for sale. (We expect to distribute an MVP app to members of the Streamr community in early summer.)
These are just a few examples out of potentially thousands. (We expect our partnership with Electrify and their PowerPod to lead to another such Data Union.) Anything that generates digital information on behalf of a decent number of individuals, such as browsers, smart watches or Teslas, can be usefully integrated into a Data Union.
Setting up a Data Union
There will be two steps to establishing a Data Union on the Marketplace:
- Creating a special kind of product using the Marketplace
- Developing Streamr connectivity into the associated software
Step 1 is easy: tools will be added to the Marketplace to enable the creation of Data Unions in a similar fashion to how ordinary products are created today. The product creator would usually take on the role of Product Admin. They are incentivised (via an admin fee or simply the value added to the device they manufacture) to manage the product and ensure the data is as saleable as possible. (Admins are quite like the MIDs, or mediators of individual data, described by Lanier and Weyl here.)
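To make the admin fee idea concrete, here is a minimal sketch of how one sale could be divided between a Product Admin and the data providers. The equal split between members, the fee expressed in basis points, and the function name `split_revenue` are all assumptions made for illustration; this post does not specify how a Data Union will actually apportion revenue.

```python
from __future__ import annotations


def split_revenue(amount: int, members: list[str], admin_fee_bps: int) -> tuple[int, dict[str, int]]:
    """Divide `amount` (in the token's smallest unit) between admin and members.

    Hypothetical helper: equal shares and a basis-point admin fee are assumptions.
    """
    if not members:
        raise ValueError("a Data Union needs at least one member")
    if not 0 <= admin_fee_bps <= 10_000:
        raise ValueError("admin_fee_bps must be between 0 and 10000")

    # Integer arithmetic only, so no value is lost to floating-point rounding.
    admin_cut = amount * admin_fee_bps // 10_000
    per_member, remainder = divmod(amount - admin_cut, len(members))

    # Whatever cannot be divided evenly is added to the admin's share.
    return admin_cut + remainder, {member: per_member for member in members}


# A sale of 1000 units with a 10% (1000 bps) admin fee and three members:
admin_share, payouts = split_revenue(1_000, ["alice", "bob", "carol"], 1_000)
# admin_share == 100 and each member is credited 300
```

Working in whole token units means the admin share and member payouts always add up to exactly the amount paid.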
We expect that providing data to a Data Union will happen via an end-user application associated with a connected device, so Step 2 requires some development work from the app creator.
Nonetheless, the process will be quite straightforward and will utilise the Streamr libraries (SDKs), which will be updated to support Data Unions when the feature launches. Official libraries are currently available for JS and Java, and libraries for other languages, including Python, are being worked on by the community.
To join a new user to the Data Union, the connecting app would (use the library to):
- Generate and store an Ethereum private key to establish a unique digital identity for the data-providing device
- Join the Data Union by either authenticating with an application secret or waiting for manual approval
- Start contributing data by digitally signing it and publishing it into a stream
- Allow the user to withdraw the earned tokens from the Data Union to whichever Ethereum address they want
While the above may sound complicated at first sight, it will be a simple matter of calling appropriate functions on the Streamr library. We will publish detailed developer documentation, along with examples, when the feature becomes available in around five months’ time.
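Until that documentation exists, the four steps listed above can be pictured with a toy, in-memory model. Nothing here is the Streamr library: `ToyDataUnion`, `ToyDevice` and their methods are hypothetical names, HMAC-SHA256 stands in for an Ethereum (secp256k1) signature, and the member id is a placeholder rather than a real Ethereum address. It is only a sketch of the flow.

```python
import hashlib
import hmac
import json
import secrets
from dataclasses import dataclass, field


@dataclass
class ToyDataUnion:
    """In-memory stand-in for a Data Union (hypothetical, not a Streamr API)."""
    app_secret: str
    members: dict = field(default_factory=dict)  # member id -> earned tokens
    stream: list = field(default_factory=list)   # (member id, payload, signature)

    def join(self, member_id: str, secret: str) -> bool:
        # Step 2: admit automatically when the application secret matches.
        # The manual-approval path is left out of this sketch.
        if hmac.compare_digest(secret, self.app_secret):
            self.members.setdefault(member_id, 0)
            return True
        return False

    def publish(self, member_id: str, payload: dict, signature: str) -> None:
        # Step 3: only members may contribute signed data points.
        if member_id not in self.members:
            raise PermissionError("not a member of this Data Union")
        self.stream.append((member_id, payload, signature))

    def withdraw(self, member_id: str, to_address: str) -> dict:
        # Step 4: pay out everything earned so far to the chosen address.
        amount = self.members[member_id]
        self.members[member_id] = 0
        return {"to": to_address, "amount": amount}


class ToyDevice:
    def __init__(self) -> None:
        # Step 1: 32 random bytes stand in for an Ethereum private key.
        self.private_key = secrets.token_bytes(32)
        # Placeholder id; a real address is derived from the public key.
        self.member_id = hashlib.sha256(self.private_key).hexdigest()[:40]

    def sign(self, payload: dict) -> str:
        # HMAC is a stand-in for an ECDSA signature over the message.
        message = json.dumps(payload, sort_keys=True).encode()
        return hmac.new(self.private_key, message, hashlib.sha256).hexdigest()


union = ToyDataUnion(app_secret="demo-secret")
device = ToyDevice()
joined = union.join(device.member_id, "demo-secret")

reading = {"steps": 1200, "ts": 1550000000}
union.publish(device.member_id, reading, device.sign(reading))

union.members[device.member_id] += 5  # pretend a sale credited 5 tokens
receipt = union.withdraw(device.member_id, "0xRecipientAddress")
```

In a real integration the key would be persisted on the device, and signing, joining and withdrawing would go through whatever functions the Streamr library eventually exposes.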
Distributing payments with Monoplasma
When buyers purchase Data Unions, those payments will be automatically distributed to the individual data suppliers. Doing this directly on the Ethereum blockchain would not scale beyond a few dozen participants, yet we envision the largest Data Unions having data providers numbering in the millions. The necessary scalability will be achieved by leveraging an already developed scaling solution called Monoplasma.
Monoplasma is a uniquely built layer 2 (off-chain) scaling framework for recurring one-to-many payments. While it was developed by Streamr contributors with Data Unions in mind, it can be applied to other use cases of a similar shape, such as airdrops or revenue sharing models. Monoplasma will be available in February: we’ll officially introduce it at the ETHDenver #buidlthon, as well as publish an in-depth blog post about it.
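This post does not describe Monoplasma’s internals, so the following is only a generic sketch of one common way to make off-chain one-to-many payments verifiable: keep every member’s balance off-chain, commit a single Merkle root on-chain, and let each member prove their own balance when withdrawing. Treat it as an illustration of the pattern, not as Monoplasma’s documented design. SHA-256 is used in place of the Keccak-256 hash an Ethereum contract would use, and the addresses and equal split are made up.

```python
import hashlib


def h(data: bytes) -> bytes:
    # SHA-256 as a stand-in for Keccak-256.
    return hashlib.sha256(data).digest()


def leaf(address: str, balance: int) -> bytes:
    return h(f"{address}:{balance}".encode())


def build_levels(leaves):
    """Return every level of the tree, from the leaves up to the root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        if len(current) % 2 == 1:
            current = current + [current[-1]]  # duplicate last node if odd
        levels.append([h(current[i] + current[i + 1])
                       for i in range(0, len(current), 2)])
    return levels


def prove(levels, index):
    """Collect the sibling hashes needed to rebuild the root from one leaf."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling_is_right = index % 2 == 0
        proof.append((level[index ^ 1], sibling_is_right))
        index //= 2
    return proof


def verify(root, leaf_hash, proof):
    node = leaf_hash
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


# One sale of 900 tokens shared equally between three members (assumed split).
members = ["0xaaa", "0xbbb", "0xccc"]
share = 900 // len(members)
balances = {member: share for member in members}

leaves = [leaf(addr, bal) for addr, bal in sorted(balances.items())]
levels = build_levels(leaves)
root = levels[-1][0]          # the only value that would need to go on-chain

proof_for_bbb = prove(levels, 1)
ok = verify(root, leaf("0xbbb", 300), proof_for_bbb)
forged = verify(root, leaf("0xbbb", 301), proof_for_bbb)
```

The appeal of this shape is that the on-chain cost per sale stays constant (one root) no matter how many members share the revenue, while each proof grows only logarithmically with the number of members.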
Data Unions Roadmap
The benefits of crowdselling real-time data
Finally, let’s outline four advantages to using Data Unions:
1. Individual stream owners will get paid to provide data
The advent of crypto makes this new era of data ownership possible, and this is clearly the biggest benefit of Data Unions. Without crypto, distributing value in a one-to-many pattern would mean using regular fiat bank accounts, which would prove far too costly given the small sums and potentially high user numbers involved.
2. Gadget and app owners are given a real commercial edge
Whether it’s Streamr or another technology that delivers it, crowdselling as a feature of the data economy can radically change things for application developers and device manufacturers. Connected devices — everything from fridges to smart glasses, pollution monitors, smart meters, mobile phones or cars — will have the potential to make money for their owners.
3. Opening up an ecosystem for third party app developers/data brokers
Manufacturers of consumer devices tend to be big businesses and as a result, they tend to move slowly. This means that there will be real opportunities for third parties to lead the way and create revenue streams of their own by:
- Building their own integrations to help end users and gadget manufacturers to monetise their community’s data
- Buying new types of real-time data sets
- Enriching big data sets to allow companies from multiple industries to enter the real-time data economy
- Helping manage community data with extra moderation tools and artificial intelligence
4. Digital ethics
Opaque terms and conditions, which users almost never read, have created a dark market in data. Data Unions will make the data economy transparent to all users. The best way to do this is by helping them actively permission the sale of their data and enabling them to receive the funds in return.
¹ See also Lanier, J. and Weyl, E.G., ‘A Blueprint for a Better Digital Society’, Harvard Business Review, Sept 2018, https://hbr.org/2018/09/a-blueprint-for-a-better-digital-society