Safeguarding Identity and Personal Data in a Web3 World

If Web3 is emerging as a serious technology, how can we protect against the potential harms?

By Irving Wladawsky-Berger

Transformative technologies are generally accompanied by a mixture of excitement and confusion in their early years. Something important is going on out there, although there’s no consensus on what it is yet: there’s no single dimension around which to define an emerging technology or business model.

This was the case with the advent of the commercial internet in the early 1990s. It was pretty clear that a communications revolution was under way: after all, the internet was fundamentally a network of networks, and e-mail was one of its earliest and most popular applications. It was also an information revolution: anyone with a browser, a PC, and an internet connection could now access all kinds of content on the new World Wide Web. And, above all, it promised to be an economic revolution: the internet ushered in a historic transition to a new kind of digital economy, including many innovative e-business applications.

Over the past few decades, the term internet has come to encompass a number of related technologies including broadband networks, mobile devices, social media, cloud computing, e-commerce platforms, big data, AI and more. More recently, we’ve seen the emergence of a new set of technologies and business models that are once more generating excitement, confusion, and multiple opinions on what they’re all about: Web3.

I recently wrote about Web3, drawing on two books: the just-published Digital Asset Revolution by Alex Tapscott, and the soon-to-be-published Think Blockchain by Jerry Cuomo.

I wrote that Web3 aims to usher in a more open, entrepreneurial internet and digital economy by replacing today’s corporate mega-platforms with blockchain-based decentralized networks.

Web3 would give creators, developers and users a way to monetize their contributions, involve them in the governance and decision-making of the platforms supporting their work, and give individuals more privacy and control over their data.

Now I want to discuss another perspective on Web3 based on The Emerging New Economy: Causes and Consequences of Web 3.0, a recent Stanford seminar by Alex (Sandy) Pentland, MIT professor and faculty director of the MIT Connection Science Research initiative. I’ve long been affiliated as a Connection Science Fellow at MIT.

In the seminar, Pentland cited a number of Web3-related projects that his research group has been involved with over the past few years. I’d like to focus my discussion on two key, closely intertwined projects in particular: safeguarding an individual’s digital identity and protecting their personal data.


Protecting Digital Identities

Identity plays a major role in everyday life. Think about logging on to a website, making an online purchase, or getting on a plane. As explained in A Blueprint for Digital Identity, a report by the World Economic Forum (WEF), identity is essentially a collection of data attributes associated with an individual, enabling them to participate in specific transactions by proving that they have the attributes required to do so. Identity attributes fall into three main categories: inherent (e.g., height, age, date of birth, biometrics); accumulated (e.g., job history, health records, home addresses, education); and assigned (e.g., email IDs, phone numbers, Social Security number, driver’s license, passport).
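To make the WEF definition concrete, here is a minimal sketch in Python of identity as a collection of categorized attributes, with a check for whether a person can present the attributes a given transaction requires. The class names, the `missing_attributes` helper, and the sample values are hypothetical illustrations, not part of the WEF report.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Set


class Category(Enum):
    """The three attribute categories described in the WEF report."""
    INHERENT = "inherent"        # e.g., date of birth, biometrics
    ACCUMULATED = "accumulated"  # e.g., job history, health records
    ASSIGNED = "assigned"        # e.g., email ID, passport number


@dataclass(frozen=True)
class Attribute:
    name: str
    category: Category
    value: str


def missing_attributes(identity: Dict[str, Attribute], required: Set[str]) -> Set[str]:
    """Return the required attribute names this identity cannot present."""
    return required - set(identity)


# Hypothetical traveler with one inherent and one assigned attribute.
traveler = {
    "date_of_birth": Attribute("date_of_birth", Category.INHERENT, "1990-04-02"),
    "passport_number": Attribute("passport_number", Category.ASSIGNED, "X1234567"),
}

# Boarding a plane (in this toy example) requires both attributes.
boarding_gaps = missing_attributes(traveler, {"date_of_birth", "passport_number"})
# A transaction that also needs an accumulated attribute the traveler lacks.
clinic_gaps = missing_attributes(traveler, {"passport_number", "health_record"})
print(boarding_gaps, clinic_gaps)
```

An empty result means the person can take part in the transaction; a non-empty one names exactly what is still missing.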

“In the Web 2 paradigm, third parties like banks, social media companies, and digital conglomerates give us our identities and allow us to access their services,” wrote Tapscott in Digital Asset Revolution.

“Web 2’s Faustian bargain was signing our own data over to these intermediaries (via their terms of use and service). We gave them rights to use our data for their own gain, and they undermined our privacy in the process. We never get to own our identity. Rather, we simply rent it in the walled gardens.”

Self-sovereign identity, one of the most important objectives of the Web3 paradigm, gives individuals control over their digital identity. “Anonymous, single sign-on will allow one user name and authentication method across all web sites and accounts, rather than individual logins for each site,” wrote Cuomo in Think Blockchain. “This login would not require you to relinquish control of sensitive personal data.” With Web3 wallets backed by the appropriate type of blockchain network, users always retain control of their personally identifiable information (PII) and login credentials.

However, the various data attributes necessary to establish a self-sovereign digital identity are siloed within different private and public sector institutions. These institutions will not want to give up their data for a variety of competitive and legal reasons.

Thus, to achieve the level of privacy and security envisioned in a Web3 framework, it’s necessary to establish a federated ecosystem of institutions that can access the attributes necessary to validate an identity while preserving the privacy of the data.

The more data sources such an ecosystem has access to, the higher the probability of detecting fraud and identity theft while reducing false positives.

Open Algorithms (OPAL) is a governance framework for validating identities developed by Pentland and his students and collaborators. OPAL enables the institutions in a federated ecosystem to jointly run computations on the data while keeping the data completely private. The OPAL framework is described in Open Algorithms for Identity Federation, a 2017 paper by Pentland and Connection Science CTO Thomas Hardjono.

“The identity problem today is a data-sharing problem,” wrote the authors. “Today the fixed attributes approach adopted by the consumer identity management industry provides only limited information about an individual, and therefore is of limited value to the service providers and other participants in the identity ecosystem. This paper proposes the use of the Open Algorithms (OPAL) paradigm to address the increasing need for individuals and organizations to share data in a privacy-preserving manner.

“Instead of exchanging static or fixed attributes, participants in the ecosystem will be able to obtain better insight through a collective sharing of algorithms, governed through a trust network.

“Algorithms for specific datasets must be vetted to be privacy-preserving, fair and free from bias.”

Seven Ways to Safeguard Data

OPAL is the kind of technical and governance innovation that’s required to develop a trustworthy Web3 framework. It is based on several key principles, including:

  • Move the algorithm to the data. Instead of gathering raw data into a central location for processing, the algorithms or queries should be sent to the repositories and be processed there.
  • Decentralized data architecture. Raw data must always remain in its permanent repository under the control of the repository owners. Only the results of applying the algorithm or query against the data are returned.
  • Open, vetted algorithms. Algorithms must be openly published, agreed to, and vetted by experts to be safe from privacy violations, bias, and other unintended consequences.
  • Subject consent. Data repositories must obtain explicit consent from the subjects whose data they hold for the execution of an algorithm against their data; the vetted algorithms should be made available and understandable to subjects.
  • Data federation. In a group-based trust network ecosystem, algorithms must be vetted collectively by all the members of the ecosystem; each member must observe the OPAL principles and legal frameworks.
  • Data is always in an encrypted state. Data must be encrypted while it is stored, while it is transmitted, and while algorithms are applied against it.
  • Transparency and regulatory compliance. All requests and responses must be stored in a public blockchain to provide a shared, immutable log of events that enables the auditing of all interactions, as well as proof of regulatory compliance.
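Several of these principles can be illustrated with a short Python sketch: the query travels to the repository, only vetted algorithms run, subject consent is checked, and every request is written to a tamper-evident log. This is a toy model under stated assumptions, not the OPAL implementation. The names `Repository`, `AuditLog`, `query`, and `is_adult` are hypothetical, a hash-chained list stands in for the public blockchain, the log records only request metadata, and encryption of data at rest and during computation is omitted entirely.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


def is_adult(record: dict) -> bool:
    """Example vetted algorithm: returns a yes/no answer, never the raw age."""
    return record["age"] >= 18


# Hypothetical registry of algorithms the federation has published and vetted.
VETTED_ALGORITHMS: Dict[str, Callable[[dict], bool]] = {"is_adult": is_adult}


@dataclass
class Repository:
    """A data holder that runs algorithms locally; raw records never leave it."""
    name: str
    records: Dict[str, dict]
    consents: Dict[str, Set[str]]  # subject id -> algorithms the subject approved

    def run(self, algorithm: str, subject: str) -> bool:
        if algorithm not in VETTED_ALGORITHMS:
            raise PermissionError(f"{algorithm!r} is not a vetted algorithm")
        if algorithm not in self.consents.get(subject, set()):
            raise PermissionError(f"no consent on file for {algorithm!r}")
        return VETTED_ALGORITHMS[algorithm](self.records[subject])


@dataclass
class AuditLog:
    """Append-only, hash-chained log standing in for a shared ledger."""
    entries: List[dict] = field(default_factory=list)

    @staticmethod
    def _digest(prev: str, event: dict) -> str:
        body = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev + body).encode()).hexdigest()

    def append(self, event: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        self.entries.append({"event": event, "prev": prev,
                             "hash": self._digest(prev, event)})

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks it."""
        prev = "0" * 64
        for entry in self.entries:
            if entry["prev"] != prev or entry["hash"] != self._digest(prev, entry["event"]):
                return False
            prev = entry["hash"]
        return True


def query(repo: Repository, log: AuditLog, algorithm: str, subject: str) -> bool:
    """Send the algorithm to the data and log the request either way."""
    event = {"repository": repo.name, "algorithm": algorithm}
    try:
        result = repo.run(algorithm, subject)
    except PermissionError:
        log.append({**event, "outcome": "denied"})
        raise
    log.append({**event, "outcome": "answered"})
    return result


registry = Repository("registry", {"alice": {"age": 34}}, {"alice": {"is_adult"}})
ledger = AuditLog()
print(query(registry, ledger, "is_adult", "alice"), ledger.verify())
```

The caller learns only the answer to the vetted question, while the repository keeps the underlying record. What a real ledger should record about each request, and how to keep that record itself privacy-preserving, is a design question this sketch does not settle.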

“The OPAL paradigm offers a possible way forward for industry and government to begin addressing the core issues around privacy preserving data sharing,” noted Hardjono and Pentland. “Some of these challenges include siloed data, the limited type/domain of data, and the prohibitive situation of cross-organization sharing of raw data.

“Instead of sharing fixed-attributes regarding a user or subject, the OPAL paradigm offers a way for Identity Providers, Relying Parties and Data Providers to share vetted algorithms.

“This in turn provides better insight into the user’s behavior, with their consent.

“It also allows for the development of a trust network ecosystem consisting of these entities, providing new revenue sources, governed by relevant legal agreements and contracts that form the basis for an information sharing legal trust framework. Finally, a new set of legal rules and system-specific rules must be devised that must clearly articulate the required combination of technical standards and systems, business processes and procedures, and legal rules that, taken together, establish a trustworthy system for information sharing in a federation based on the OPAL model.”

This blog first appeared July 21 here.





