Blockchain’s role in creating a fair data marketplace.

Dilya Zhanispayeva
Published in ZeroTerra
7 min read · Dec 29, 2017

Who collects your personal information and how is it being used?

In our increasingly digitized society, people’s personal data has become a precious resource that can generate huge value for those who own and control it. For this reason, organizations invest heavily in building extensive infrastructure to store and process large amounts of data. Big companies like Google and Facebook have succeeded in creating centrally located and centrally managed data repositories that hold large amounts of private user information. Owning all that data brings a lot of power, but with great power comes great responsibility.

Think about the number of people who have Google accounts (such as Gmail), who store information on Google’s cloud, or who have, in one form or another, entered confidential information through Google. Were even one of Google’s data centers to be breached, it could undermine the security and integrity of Google and severely hurt its billions of users.

Apart from security risks, there is the possibility of sensitive information being shared with third parties, including governments and advertisers. According to BitClave, in the past year alone, Google has satisfied more than half of government requests to disclose users’ private information.

The main problem with the emergence of big aggregated pools of data is the concentration of power that is developed around its ownership and management. The individuals who actually own and produce that data do not have direct control over how it is being used, what security precautions guard it, and who gets to extract value from it. Despite putting consumers’ data out in the open, data companies do not disclose the intricacies and inner workings of their business models to the general public.

Behind the scenes of big data companies:

Source: https://www.theguardian.com/technology/2015/apr/19/google-dominates-search-real-problem-monopoly-data

Nowadays, more than ever before, people show interest in protecting their privacy online. Pew Research reported the following results from a 2015 consumer survey on online privacy:

“93% of adults say that being in control of who can get information about them is important; 74% feel this is “very important,” while 19% say it is “somewhat important.”

90% say that controlling what information is collected about them is important — 65% think it is “very important” and 25% say it is “somewhat important.””

Many are wondering how to address the long-standing issues of centralized power, fading privacy, and minimal control over personal information. Blockchain technology offers a solution.

What is blockchain and why is it so powerful in the realm of data management?

Blockchain is a decentralized system of record keeping that is maintained by a distributed network of computers. This means that records are replicated across the participants of the network, rather than being exclusively owned and managed by a central party like Google. Unlike a centralized database, a blockchain has no single central location, main copy, or central administrator. Rather, the database is collectively managed by a network of computers distributed across the world.

Given the absence of a trusted coordinator, one may wonder what makes this type of system sustainable in the long run, as it would seem to fail unless all the participants coordinated and acted in unison. To prevent such failures, blockchain implements enforceable consensus mechanisms, such as proof of work (PoW) and proof of stake (PoS), which ensure that all participants of the network collectively agree on the content of the database and adhere to the same rules when adding new records to it.
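To make proof of work a little more concrete, here is a minimal sketch in Python. It is a toy, not how any particular blockchain implements mining: the function names `mine` and `verify`, the sample record, and the difficulty rule (a hash starting with a number of zero hex digits) are all assumptions chosen for illustration. The point it shows is the asymmetry: finding a valid nonce takes many attempts, while checking one takes a single hash.

```python
import hashlib
from typing import Tuple


def mine(block_data: str, difficulty: int = 4) -> Tuple[int, str]:
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty: int = 4) -> bool:
    """Checking a proposed nonce costs one hash, however long the search took."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


# Hypothetical record; any string works.
nonce, digest = mine("alice pays bob 5", difficulty=4)
print(nonce, digest)
print(verify("alice pays bob 5", nonce))
```

Every participant can run `verify` cheaply, which is what lets a network without a coordinator agree on which new records to accept.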

Because the blockchain protocol is designed to resist fraud, it makes for a tamper-resistant, transparent, and secure store of data. Blockchain’s secure design removes the need for a central authority or a third party to guarantee the integrity of data.

A safer and more efficient alternative to centralized servers.

One of the main features of blockchain is its reliance on a distributed network of computers in which all participants, or peers, share their computers’ resources, including bandwidth and CPU. This contrasts with the client-server model used by companies like Dropbox, Google, and Amazon, in which one central agent provides services to many users.

The data management market has recently seen a number of blockchain-powered data storage solutions, such as Storj, that challenge the existing cloud storage infrastructure with more secure, efficient, and cost-effective alternatives. With the use of blockchain, files are encrypted and distributed across a global network of computers. The peer-to-peer nature of blockchain provides for faster download speeds and increased data availability.

1. Higher speed.

James Prestwich, Storj co-founder, compares Storj to BitTorrent because of its P2P design. With Storj, files are broken down into smaller fragments, called shards, and then distributed across multiple computers, called farmers. Storj achieves faster downloads by following the same approach as BitTorrent. As Prestwich describes it, “shards of the file come in from different farmers all at once” instead of arriving as a single download, which allows the file to be rebuilt very quickly.

2. Data availability.

Earlier this year, the Amazon S3 cloud storage service went offline for nearly 4 hours, affecting a number of websites, apps, and services that rely upon it, such as Quora, Giphy, Business Insider, and Slack. Storj, a decentralized cloud storage network, has deployed a redundancy mechanism that prevents this type of system failure. Storj creates replicas of every piece of an uploaded file and stores them across farmers, so even if the original holder of a shard goes offline, there are still, by default, 6 other farmers from which to download and retrieve it.
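The sharding and redundancy described in points 1 and 2 can be sketched in a few lines of Python. This is an illustrative model only, not Storj’s actual protocol: the helper names, the farmer labels, the shard size, and the replica count of 3 are all assumptions, and encryption is left out to keep the example short.

```python
import random


def split_into_shards(data: bytes, shard_size: int) -> list:
    """Cut a file into fixed-size fragments (the last one may be shorter)."""
    return [data[i:i + shard_size] for i in range(0, len(data), shard_size)]


def distribute(shards: list, farmers: list, replicas: int) -> dict:
    """Store each shard on `replicas` distinct farmers. Returns {farmer: {index: shard}}."""
    storage = {farmer: {} for farmer in farmers}
    for index, shard in enumerate(shards):
        for farmer in random.sample(farmers, replicas):
            storage[farmer][index] = shard
    return storage


def retrieve(storage: dict, shard_count: int, offline=frozenset()) -> bytes:
    """Rebuild the file from whichever online farmers still hold each shard."""
    pieces = []
    for index in range(shard_count):
        holders = [f for f, held in storage.items()
                   if f not in offline and index in held]
        if not holders:
            raise RuntimeError(f"shard {index} is unavailable")
        pieces.append(storage[holders[0]][index])
    return b"".join(pieces)


farmers = [f"farmer-{n}" for n in range(10)]          # hypothetical node names
data = b"hello decentralized storage " * 4
shards = split_into_shards(data, shard_size=16)
storage = distribute(shards, farmers, replicas=3)

# With 3 replicas per shard, any 2 farmers can go offline without losing data.
restored = retrieve(storage, len(shards), offline={"farmer-0", "farmer-1"})
print(restored == data)  # True
```

A real client would also fetch shards from several farmers in parallel, which is where the speed benefit described above comes from; this sketch retrieves them one at a time for clarity.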

3. Cost effectiveness.

Using blockchain also brings down the operating costs of storage because it does not require building and maintaining the same infrastructure as traditional cloud computing solutions. Storj explains in this blog post that it will not have employees, offices, heating or cooling expenses, or shareholders, and that it is able to do this because it runs on existing computers.

Safer storage for data.

One of the major risks that comes with centralized data management is the potential for records to be compromised internally. Blockchain makes it extremely difficult to modify records or otherwise change the contents of the record book without being noticed, because a blockchain is transparent and easily auditable.

A blockchain stores data in blocks linked to one another in chronological order, meaning that everything is permanently timestamped. Every subsequent block contains an identifier, called a cryptographic hash, of the previous block, which keeps the blocks linked. Altering or removing information in an earlier block changes that block’s hash and breaks the link to every block after it, so the rest of the network can detect and reject the change; the same check stops false information from being slipped into existing blocks. Together, these properties make blockchain highly transparent and auditable.
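The hash linking described above can be illustrated with a short Python sketch. The block layout, the field names, and the sample records are assumptions made for this example and do not mirror any specific blockchain’s format.

```python
import hashlib
import json
import time


def block_hash(block: dict) -> str:
    """SHA-256 over the block's contents, serialized in a stable key order."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def add_block(chain: list, data: str) -> None:
    """Append a timestamped block that stores the hash of the block before it."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({
        "index": len(chain),
        "timestamp": time.time(),
        "data": data,
        "prev_hash": previous,
    })


def is_valid(chain: list) -> bool:
    """Every block must reference the current hash of its predecessor."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))


chain = []
for record in ["alice shares age with bar", "bar pays alice", "alice revokes access"]:
    add_block(chain, record)

print(is_valid(chain))                 # True
chain[1]["data"] = "bar pays nobody"   # tamper with an earlier record
print(is_valid(chain))                 # False: block 2 no longer matches block 1
```

In this simplified version a change to the very last block would go unnoticed, because nothing links to it yet. In a real network, consensus rules and the copies held by other participants cover that gap.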

One may ask, “What happens if an outsider tries to attack a blockchain?” Such an attack is very hard to carry out because the decentralized nature of blockchain removes any single point of failure that could be used to compromise the system. In order to undermine the integrity and legitimacy of a blockchain, one would need to take control of more than half of the entire network’s computing power.

The Bitcoin blockchain, for example, is estimated by the WEF to soon be consuming as much electricity as the rest of the world. For one entity to control more than half of the Bitcoin network’s computing power, it would therefore need a tremendous amount of infrastructure and energy. Although not impossible (especially with smaller networks), compromising a blockchain would be an extremely difficult task, which makes blockchain a secure and tamper-resistant database.

Bringing control over personal data to individuals.

Data stored via blockchain can be protected with end-to-end encryption, meaning that it is kept in a non-readable form and holds no value for non-authorized users. Even if a malicious third party were to obtain the stored data, they could not understand it without decrypting it, which only the holder of the private key (the equivalent of a password) can do. This encryption relies on public key cryptography: data is encrypted with a public key, and every public key has a corresponding private key that is used to decrypt it. The private key is accessible only to the owner of the data. Data owners therefore have exclusive control over access to their data, because only they can “unlock” it.
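To show how a public key and a private key relate, here is a toy version of textbook RSA in Python. It is strictly an illustration: the primes are tiny, there is no padding, and the function names are made up for this example. Real systems use vetted cryptographic libraries with much larger keys, and typically encrypt the data itself with a symmetric key that is in turn protected by the public key.

```python
# Toy textbook RSA with tiny primes: for illustration only, never for real data.
p, q = 61, 53
n = p * q                    # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)          # can be shared with anyone
private_key = (d, n)         # stays with the data owner


def encrypt(message: int, key: tuple) -> int:
    """Anyone holding the public key can encrypt a number smaller than n."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    """Only the holder of the private key can reverse the encryption."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


secret = 65                                # stand-in for a piece of personal data
ciphertext = encrypt(secret, public_key)
print(ciphertext != secret)                # True: the stored value is unreadable
print(decrypt(ciphertext, private_key))    # 65
```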

Furthermore, users can rely on a computer program that specifies, in detail, who can access the data, which pieces of it can be accessed, and for how long. Such a program is called a smart contract: code built on top of a blockchain to automate and enforce the exchange of data between parties that do not trust each other. Jerry Cuomo, Vice President of Blockchain Technologies at IBM, explains how smart contracts work in this article with an example: if somebody were asked to present identification at a bar, a smart contract would ensure that only information about their age could be seen, while other details such as their address or phone number would stay hidden.
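The bar example can be mimicked with a small Python sketch of the access rules such a contract might enforce. Real smart contracts run on a blockchain and are usually written in a contract language; the class names, field names, and time handling here are assumptions used only to show the logic of who may see what, and for how long.

```python
import time
from dataclasses import dataclass


@dataclass
class Grant:
    fields: frozenset      # which pieces of the record may be read
    expires_at: float      # when the permission lapses


class DataVault:
    """Hypothetical stand-in for a contract that guards one person's record."""

    def __init__(self, record: dict):
        self._record = dict(record)
        self._grants = {}

    def grant(self, requester: str, fields, ttl_seconds: float, now=None) -> None:
        now = time.time() if now is None else now
        self._grants[requester] = Grant(frozenset(fields), now + ttl_seconds)

    def read(self, requester: str, now=None) -> dict:
        """Return only the granted fields, and nothing once the grant has expired."""
        now = time.time() if now is None else now
        grant = self._grants.get(requester)
        if grant is None or now >= grant.expires_at:
            return {}
        return {k: v for k, v in self._record.items() if k in grant.fields}


vault = DataVault({"name": "Alice", "over_21": True,
                   "address": "1 Main St", "phone": "555-0100"})
vault.grant("bar", {"over_21"}, ttl_seconds=60, now=0)

print(vault.read("bar", now=30))         # {'over_21': True}
print(vault.read("bar", now=120))        # {} (grant expired)
print(vault.read("advertiser", now=30))  # {} (never granted)
```

The `now` parameter exists only so the example is deterministic; an on-chain contract would rely on block timestamps or block heights instead.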

By giving individuals full ownership of their personal data, blockchain not only allows people to control how and where their data is used, but also makes it possible for them to monetize that data. Smart contracts can be used to set up automatic compensation systems for the usage of data. One such project, called BitClave, is creating a decentralized online search engine that stores data on users’ online behavior, such as websites visited, in a decentralized and encrypted fashion. This means that every customer exclusively and individually controls what information is being collected when they use the BitClave search engine. An article published on BitClave’s Medium page explains that users can choose which, if any, of that data “they share with which business, and then receive fair compensation … each time a business uses a user’s data to target them with an ad.” This approach eliminates the need for ad middlemen by connecting consumers with businesses directly and establishing a fair data marketplace.

At present, the data management sphere has a lot of issues, but this will change soon. The people who own and create data will soon benefit from it rather than suffer when some third party loses control of it. Through blockchain technology, we are seeing control over data shift from corporations to the people, reshaping the future of data management.
