For a functioning Data Market — guest article series by Aurel Stenzel
In this blog post series, we review and analyze recent academic publications in order to draw conclusions on how important components of a data ecosystem should be defined. In this article, we take a closer look at:
Acemoglu, D., Makhdoumi, A., Malekian, A., & Ozdaglar, A. (2019). Too much data: Prices and inefficiencies in data markets (No. w26296). National Bureau of Economic Research.
In another blog post, we explained why data is a social construct. Your data is correlated with other individuals’ data. If you fully share your data, you compromise not only your own privacy but also the privacy of others who are correlated with you. The correlation can be either explicit (e.g. your Facebook friends) or implicit (e.g. being a male Berliner in his 30s). For example, the Cambridge Analytica scandal was based on 270,000 Facebook users who voluntarily downloaded an app that accessed their Facebook data (news feed, timeline, posts, and messages). Based on that information, Cambridge Analytica could draw meaningful conclusions about more than 50 million Facebook users.
The article extends models that consider the externality costs (1) of data sharing and their consequences. The basic argument is that excessive data sharing by others leads each individual to disregard her own privacy concerns, because the others’ sharing decisions have already revealed so much about her. We probably all know this feeling: “They (the digital platforms) already know all about me. So why should I even care anymore?” As a consequence, we too share all of our data and impose additional externality costs on the others, so the unraveling effect continues.
Let us consider a very simple ecosystem with just two users, Alice and Bob, each owning personal data. The two users share their main socio-economic characteristics (e.g. both live in Berlin and have a similar income). Therefore, their data is correlated (e.g. both like the same kind of restaurants). The ecosystem contains one platform that wants to acquire the users’ data in order to better estimate their preferences (e.g. to offer personalized products). The platform makes an offer to the users to purchase their data. Each user sells her/his data if the offer exceeds her/his valuation of privacy. We assume that Alice values her privacy highly while Bob does not, and that the platform’s offer lies between the two users’ valuations. In the absence of externality costs, Bob would sell his data while Alice would not.
However, given Bob’s sharing decision and the correlation between the users’ data, the platform now has a fairly good estimate of Alice’s personal data as well. This undermines Alice’s willingness to protect her data. As Bob has already revealed a lot about her, Alice no longer really cares. Despite her high valuation of privacy, she is now also willing to sell her own data for a very low price. But once she has decided to sell her data at a low price, she has also revealed a lot about Bob, which depresses the price he can ask for his data. In this simple example, the platform is able to acquire both users’ data at a very low price. As Alice values her privacy highly but only receives a very low price for her data, the data sharing leads to a decrease in social welfare (2).
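The price effect in the Alice and Bob example can be sketched with a small toy model. Everything numeric here is an assumption made for illustration, not something taken from the article: each user's type is a Gaussian variable with variance 1, the two types have correlation `RHO`, a shared data point is the type plus independent noise of variance 1, and "leaked information" is measured as the reduction in the platform's mean squared error about a user. The names `leaked_info`, `min_price`, `RHO`, `V_ALICE`, `V_BOB` and `OFFER` are hypothetical.

```python
# Toy model (illustrative assumption, not the article's own code):
# unit-variance Gaussian types with correlation rho, unit-variance signal noise.

def leaked_info(rho, own_shared, other_shared):
    """Reduction in the platform's mean squared error about one user."""
    if own_shared and other_shared:
        return 2.0 / (4.0 - rho ** 2)   # both signals observed
    if own_shared:
        return 0.5                      # only the user's own signal
    if other_shared:
        return rho ** 2 / 2.0           # only the correlated user's signal
    return 0.0


def min_price(valuation, rho, other_shared):
    """Lowest payment that compensates a user for the extra leak from sharing."""
    extra_leak = (leaked_info(rho, True, other_shared)
                  - leaked_info(rho, False, other_shared))
    return valuation * extra_leak


RHO = 0.9                    # hypothetical correlation between Alice and Bob
V_ALICE, V_BOB = 2.0, 0.2    # hypothetical privacy valuations (Alice high, Bob low)
OFFER = 0.5                  # hypothetical platform offer between the two

alice_alone = min_price(V_ALICE, RHO, other_shared=False)
alice_after_bob = min_price(V_ALICE, RHO, other_shared=True)
bob_alone = min_price(V_BOB, RHO, other_shared=False)
bob_after_alice = min_price(V_BOB, RHO, other_shared=True)

print(f"Alice asks {alice_alone:.3f} alone, {alice_after_bob:.3f} after Bob shared")
print(f"Bob asks   {bob_alone:.3f} alone, {bob_after_alice:.3f} after Alice shared")
```

Under these assumed numbers, Alice's asking price starts above the offer and drops below it once Bob has shared, and Bob's own asking price then falls as well. Different values of `RHO` or the valuations change the magnitudes, so the sketch only illustrates the direction of the effect.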
For Bob to realize a meaningful price for his data, he depends on Alice deciding not to share hers. However, as soon as he sells his data, Alice no longer has any incentive to withhold hers. The article shows that a data market is inefficient when a subset of users is willing to share data that is correlated with the data of other users whose valuation of privacy is high. The findings can be extended to ecosystems with competing platforms and incomplete information, with the same (unfortunate) outcome.
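The inefficiency can be made concrete with a short welfare calculation. This is a minimal sketch under stated assumptions that are not from the article: unit-variance Gaussian types with correlation `rho`, unit-variance signal noise, a platform that gains one unit of value per unit of leaked information, and users who lose their privacy valuation times the information leaked about them. Payments are transfers between participants, so they cancel in the sum. All names and numbers are hypothetical.

```python
# Toy welfare calculation (illustrative assumption, not the article's own code).

def leaked_info(rho, own_shared, other_shared):
    """Reduction in the platform's mean squared error about one user."""
    if own_shared and other_shared:
        return 2.0 / (4.0 - rho ** 2)
    if own_shared:
        return 0.5
    if other_shared:
        return rho ** 2 / 2.0
    return 0.0


def social_welfare(valuations, shares, rho):
    """Sum over users of (platform gain - privacy loss); payments cancel out."""
    total = 0.0
    for me in valuations:
        # In this two-user sketch, "other" is simply the remaining user.
        other = next(name for name in valuations if name != me)
        leak = leaked_info(rho, shares[me], shares[other])
        total += (1.0 - valuations[me]) * leak
    return total


RHO = 0.9                                   # hypothetical correlation
VALUATIONS = {"alice": 2.0, "bob": 0.2}     # hypothetical privacy valuations

no_sharing = social_welfare(VALUATIONS, {"alice": False, "bob": False}, RHO)
bob_only = social_welfare(VALUATIONS, {"alice": False, "bob": True}, RHO)
both_share = social_welfare(VALUATIONS, {"alice": True, "bob": True}, RHO)

print(f"no sharing: {no_sharing:.3f}")
print(f"Bob only:   {bob_only:.3f}")
print(f"both share: {both_share:.3f}")
```

With these assumed values, welfare is highest when nobody shares: even Bob's sharing alone leaks enough about Alice to offset the platform's gain, and welfare falls further once both share. If the correlation were zero, Bob's sharing alone would raise welfare in this sketch.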
The article’s findings define important features of data unions. Intuitively, we would form a data union based on basic attributes, e.g. a group of friends. However, based on the argument above, if the members value their privacy differently, the data union ultimately fails (especially because, as in the example, their data sets are strongly correlated). If we form data unions, we need to form them based on the members’ privacy valuations. We further need to ensure that there is no correlation between different data unions. Otherwise, we would face the same market breakdown as described above. With the latest cryptographic technologies (e.g. zk-SNARKs, Secure Multi-Party Computation, homomorphic encryption), we could decouple the data unions’ data sets. A data union that places a low value on its privacy (Bob’s data union) could then sell its data without revealing anything about another data union that values its privacy highly (Alice’s data union). As a consequence, Bob’s union does not undermine the willingness of Alice’s union to protect its data. As Alice’s union does not sell its data, Bob’s union can still charge a high price. Therefore, Bob’s union actually depends on Alice’s union being privacy sensitive.
With the simple example above, the article illustrates that the data privacy concerns of Alice’s data union are a prerequisite for Bob’s data union to sell its data at a meaningful price. In our next blog post, we will show that we can design an incentive scheme for people to join a data union, especially for the Bobs (low valuation of privacy) out there.
(1) In economics, externality costs describe a negative effect imposed on a third party due to the consumption or production of a good.
(2) Social welfare is the sum of all participants’ (Alice, Bob, platform) welfare. The increase in the platform’s welfare is smaller than the decrease in Alice’s welfare, leading to an overall decrease.
About Fractal Protocol
Built on Polkadot, Fractal Protocol is an open-source, zero-margin protocol that defines a basic standard to exchange user information in a fair and open way, ensuring a high-quality version of the free internet. In its first version, it is designed to replace the ad cookie and give users back control over their data.
This article does not include elements of any contractual relationship. This article shall not be deemed to constitute a prospectus of any sort or a solicitation for investment or investment advice; nor does it in any way pertain to an offering or a solicitation of an offer to buy any securities in any jurisdiction.
For the avoidance of doubt, please note that the Protocol has not been fully developed. Any statements made about the Protocol are forward-looking statements that merely reflect Fractal’s intention for the functioning of the Protocol. There are known and unknown risks that can cause the results to differ from the forward-looking statements.
Fractal does not intend to express investment, financial, legal, tax, or any other advice, and any conclusions drawn from statements in this article or otherwise made by Fractal shall not be deemed to constitute advice in any jurisdiction.