Understanding Decentralized IDs (DIDs)
Decentralized identifiers (DIDs) came to my attention at the last Internet Identity Workshop (IIW), where it seemed like 30% of all presentations were about DIDs. I feel like I’m a latecomer to the party, but seeing as I can’t find a Wikipedia page on DIDs or anything about their history or when they started, maybe I’m not late at all.
DIDs propose to use blockchain to register identifiers that users can use to identify themselves. This article documents what I have learned about DIDs over the last several months, both to help other interested parties and so that people can tell me where I’m wrong and I can learn more. To be clear, I am neither for nor against DIDs; I am also neither for nor against blockchain and / or centralized governments. I say that up front because these are polarizing topics on which many people hold very strong opinions. I hope what follows comes across as objective and neutral.
This article starts off with an overview of DIDs, DID Documents, Verifiable Claims and DIDAuth — basically laying out how the technology works. It then explores the economics of DIDs to try and understand what problems they propose to solve, for whom, and how they go about solving them.
DID Technology Overview
I’m not going to go too deep into the architecture of DIDs — partially because I’m no expert at it, but partially because I think I can explain it at a high level without confusing people with all the bits and bytes. The W3C Decentralized Identifiers (DIDs) specification currently stands at version 0.10 — so I’m sure this is all very fluid and subject to change.
To start with, a user can create a new DID at any time and for any reason. A DID is two things: a unique identifier and an associated DID Document. The unique identifier looks something like “did:example:123456789abcdefghi” and it’s safe enough to think of it as the unique ID for looking up a DID Document. A DID Document is a JSON-LD object that is stored in some central location so that it can be easily looked up. The DID is expected to be “persistent and immutable” so that it is outside of the influence of anyone other than its owner.
The DID Document must include a DID. It might include things like:
- a timestamp of when it was created
- a cryptographic proof that the DID Document is valid
- a list of cryptographic public keys
- a list of ways that the DID can be used to authenticate
- a list of services where the DID can be used
- any number of externally defined extensions
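To make the list above concrete, here is a rough sketch of what a DID Document might look like, written as a Python dictionary and printed as JSON. The member names, type strings, context URL, and key value are my own illustrative assumptions loosely modeled on the draft specification; they are not a normative or complete layout. The small helper only illustrates the point that nothing personally identifiable needs to appear in the document.

```python
import json

# Illustrative only: member names and values are assumptions loosely modeled
# on the draft spec, not a normative or complete DID Document.
DID = "did:example:123456789abcdefghi"

did_document = {
    "@context": "https://w3id.org/did/v1",   # JSON-LD context (assumed URL)
    "id": DID,                                # the one member that must be present
    "created": "2018-01-01T00:00:00Z",        # placeholder timestamp
    "publicKey": [
        {
            "id": DID + "#keys-1",
            "type": "ExampleKeyType",          # hypothetical key type
            "publicKeyHex": "00" * 32,         # placeholder, not a real key
        }
    ],
    "authentication": [
        {"type": "ExampleAuthType", "publicKey": DID + "#keys-1"}
    ],
    "service": [
        {"type": "ExampleService", "serviceEndpoint": "https://service.example/"}
    ],
}

# Hypothetical list of member names that would count as personal data.
PII_MEMBERS = {"name", "username", "address", "phone", "email", "birthDate"}


def contains_pii(document: dict) -> bool:
    """Return True if any top-level member name looks like personal data."""
    return any(member in PII_MEMBERS for member in document)


print(json.dumps(did_document, indent=2))
print("contains PII:", contains_pii(did_document))
```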
Note that there is no personally identifiable information in the DID Document: no username, no address, no phone number. That comes through Verifiable Claims, which I will explain in the next section. But before we get there I should mention two more things.
First, it seems like most people are excited about the public key(s) in the DID Document — I haven’t heard much conversation about how the other fields get used or why people would be excited about them.
Second, I lied about the format of a DID. It is actually “did:” + <method> + “:” + <method-specific-identifier>. Above, my method was “example” and my method-specific-identifier was “123456789abcdefghi”. In reality, the method defines how and where you are going to find the DID. There are currently 9 registered methods, including Bitcoin, Ethereum, Sovrin, IPFS, and Veres One, all of which are blockchains (or “Distributed Ledger Technologies” to be more accurate). The method in the DID tells you which one of those you are using, and you are expected to know that method’s protocol for using the DID to get the DID Document. If you hear about “resolvers”, this is just a generic term for the software that looks up DIDs, and a “universal resolver” is something that can look up a DID by any method.
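The DID format and the idea of a universal resolver can be sketched in a few lines. This is a minimal illustration under my own assumptions: `parse_did`, `universal_resolve`, the `RESOLVERS` registry, and the toy `resolve_example` function are hypothetical names, and a real resolver would query the ledger for the given method rather than build a document locally.

```python
from typing import Callable, Dict, NamedTuple


class ParsedDID(NamedTuple):
    method: str
    method_specific_id: str


def parse_did(did: str) -> ParsedDID:
    """Split "did:<method>:<method-specific-identifier>" into its parts."""
    scheme, _, rest = did.partition(":")
    # Only the first colon after the method separates the two parts; the
    # method-specific identifier may itself contain colons.
    method, _, method_specific_id = rest.partition(":")
    if scheme != "did" or not method or not method_specific_id:
        raise ValueError(f"not a valid DID: {did!r}")
    return ParsedDID(method, method_specific_id)


Resolver = Callable[[str], dict]


def resolve_example(method_specific_id: str) -> dict:
    # Hypothetical resolver for the "example" method. A real resolver would
    # speak the protocol of the underlying ledger to fetch the document.
    return {"id": f"did:example:{method_specific_id}"}


# One resolver per method; a "universal resolver" is this dispatch table.
RESOLVERS: Dict[str, Resolver] = {"example": resolve_example}


def universal_resolve(did: str) -> dict:
    """Look up the DID Document for a DID using the resolver for its method."""
    parsed = parse_did(did)
    resolver = RESOLVERS.get(parsed.method)
    if resolver is None:
        raise LookupError(f"no resolver registered for method {parsed.method!r}")
    return resolver(parsed.method_specific_id)


print(parse_did("did:example:123456789abcdefghi"))
print(universal_resolve("did:example:123456789abcdefghi"))
```

Supporting another method would mean adding one more entry to the registry, which is why the method name sits inside the identifier itself.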
Verifiable Claims
Most of the conversation around DIDs and how they get used seems to be about Verifiable Claims. There are typically three pieces to a Verifiable Claim:
- Subject: probably a user (like you), but it could be a company, a pet, or anything else that can be described
- Issuer: probably an organization of some sort, like the DMV, a university, a bank, etc.
- Claim: any statement that can be made. The usual examples are about people and include things like “is over 21 years old”, “lives at this address…”, or “has this name”, but it could be any descriptive statement.
So a Verifiable Claim is when some Issuer makes a Claim about a Subject. The “verifiable” part is that it is trustworthy and tamperproof because it has been cryptographically signed by the issuer.
Note that a Verifiable Claim is just a JSON document and the specification only defines the data model. There is no specified protocol for how the document gets from one party to the next. The Verifiable Claims specification defines two more roles:
- Holder: Someone that receives and holds on to a Verifiable Claim, probably a user (like you) that got a Verifiable Claim from some issuer.
- Inspector-Verifier: Probably a service of some sort, like Facebook or BevMo, that receives Verifiable Claims and uses them as part of its service.
Perhaps an example would help clarify all these terms. Let’s say that I would like to buy liquor from BevMo, and BevMo would like to know if I am 21 years or older. I will use a Verifiable Claim from the DMV (the Issuer) that says Adam Powers (the Subject) is over 21 (the Claim). I will download that Verifiable Claim document from the DMV (Adam Powers is now the Holder) and upload it to BevMo (the Inspector-Verifier). BevMo will validate the claim and then allow me to buy liquor.
There are a few interesting points from that example that are worth highlighting:
- Now I’m downloading a bunch of Verifiable Claims and I have to figure out how to manage them. For that, there is the Credential Handler API, which allows users to create online wallets of Verifiable Claims and easily use them at different services.
- The claim from the DMV is that I am “over 21”, not my birthday. Both would enable BevMo to make the same decision, but my birthday would give BevMo detailed information about me that could allow them to violate my privacy. For example, my birthday (in combination with my name) could be used to uniquely identify me and share information about me between services.
The example above also doesn’t specify how BevMo would validate the claim. Note that the “id” in a Verifiable Claim is a URI; in the example above it is a URL that points back to “http://example.gov/credentials/3732”. Getting the public key for this Verifiable Claim is outside the scope of the Verifiable Claims specification, so it is up to the Inspector-Verifier to know where to get that public key; however, it should be noted that DIDs are also a form of URI. It is conceivable that the Issuer of the Verifiable Claim used a DID for the URI, enabling the Inspector-Verifier to resolve that DID to a DID Document containing the Issuer’s public key.
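The BevMo example, including the idea of resolving the Issuer’s DID to find its public key, can be sketched end to end. Everything here is an assumption made for illustration: the member names (`issuer`, `subject`, `claim`, `ageOver`), the in-memory `LEDGER` dictionary standing in for a ledger, and the `TRUSTED_ISSUERS` set. To keep the sketch runnable with only the Python standard library, it signs with Lamport one-time signatures built on SHA-256. That is a genuine public-key signature scheme, but a key pair may sign only one message, so a real issuer would use something like Ed25519 or ECDSA instead.

```python
import hashlib
import json
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes) -> list:
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def generate_keypair():
    """Lamport key pair: 256 pairs of secrets, and the hashes of those secrets."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(zero).hex(), _h(one).hex()) for zero, one in private]
    return private, public


def sign(private, message: bytes) -> str:
    # Reveal one secret per message bit. A key pair must sign only once.
    return b"".join(private[i][bit] for i, bit in enumerate(_bits(message))).hex()


def verify(public, message: bytes, signature_hex: str) -> bool:
    try:
        signature = bytes.fromhex(signature_hex)
    except ValueError:
        return False
    if len(signature) != 256 * 32:
        return False
    return all(
        _h(signature[i * 32:(i + 1) * 32]).hex() == public[i][bit]
        for i, bit in enumerate(_bits(message))
    )


def canonical(obj) -> bytes:
    """Deterministic JSON encoding so issuer and verifier hash the same bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Hypothetical stand-in for a ledger: maps a DID to its DID Document.
LEDGER = {}
# Hypothetical policy: the issuers this Inspector-Verifier chooses to accept.
TRUSTED_ISSUERS = {"did:example:dmv"}

# Issuer (the DMV) publishes its key and signs a claim about the Subject.
dmv_private, dmv_public = generate_keypair()
LEDGER["did:example:dmv"] = {"id": "did:example:dmv", "publicKey": dmv_public}
claim = {
    "issuer": "did:example:dmv",
    "subject": "did:example:adam",
    "claim": {"ageOver": 21},  # "over 21", not a birthday
}
verifiable_claim = {"claim": claim, "signature": sign(dmv_private, canonical(claim))}

# Holder (Adam) keeps the document and later presents it to BevMo as JSON.
presented = json.loads(json.dumps(verifiable_claim))


def inspector_accepts(document: dict) -> bool:
    """Inspector-Verifier (BevMo): resolve the issuer, check signature and claim."""
    issuer = document["claim"]["issuer"]
    issuer_document = LEDGER.get(issuer)
    if issuer not in TRUSTED_ISSUERS or issuer_document is None:
        return False
    signed = verify(
        issuer_document["publicKey"], canonical(document["claim"]), document["signature"]
    )
    return signed and document["claim"]["claim"].get("ageOver", 0) >= 21


print("BevMo accepts:", inspector_accepts(presented))
```

Note what this sketch does not do: nothing ties the person presenting the document to the subject named in the claim.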
We also skipped over a very important point in Verifiable Claims: how does the Issuer know that the Holder has the right to request a claim? And how does the Inspector-Verifier know that the Holder is associated with the ID in the claim?
For that there is DID-based Authentication, known as DIDAuth.
DIDAuth
The details around DIDAuth are still emerging and a specification is still being put together. The most recent work comes from Markus Sabadello, who is currently putting together a list of use cases and architecture.
At its heart, DIDAuth is a challenge-response authentication protocol: the service that you are logging in to sends a random challenge, which you sign with your private key; you then send the challenge, the signature, and your DID back to the service.
The service that receives the DIDAuth response can validate that the user is associated with the provided DID by resolving the DID to a DID Document and using the public key in that document to validate the signature over the challenge.
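The challenge-response flow described above can be sketched as follows. Since the DIDAuth specification is still emerging, this is not its wire format: the `Service` class, the in-memory `LEDGER` dictionary standing in for DID resolution, and the `publicKey` member name are all assumptions of mine. As a stand-in for a production signature scheme such as Ed25519, the sketch uses Lamport one-time signatures, which need only the standard library but allow each key pair to sign a single message.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes) -> list:
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def generate_keypair():
    """Lamport key pair: 256 pairs of secrets, and the hashes of those secrets."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(zero), _h(one)) for zero, one in private]
    return private, public


def sign(private, message: bytes) -> list:
    # Reveal one secret per message bit. A key pair must sign only once.
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature: list) -> bool:
    if len(signature) != 256:
        return False
    return all(_h(signature[i]) == public[i][bit] for i, bit in enumerate(_bits(message)))


# Hypothetical stand-in for a ledger: maps a DID to its DID Document.
LEDGER = {}


class Service:
    """The party being logged in to."""

    def __init__(self):
        self._pending = set()  # challenges issued but not yet answered

    def new_challenge(self) -> bytes:
        challenge = secrets.token_bytes(32)
        self._pending.add(challenge)
        return challenge

    def check_response(self, challenge: bytes, signature: list, did: str) -> bool:
        if challenge not in self._pending:
            return False  # unknown or already used challenge (replay)
        self._pending.discard(challenge)
        document = LEDGER.get(did)  # "resolve" the DID to its DID Document
        if document is None:
            return False
        return verify(document["publicKey"], challenge, signature)


# The user creates a DID and publishes the public key in its DID Document.
user_private, user_public = generate_keypair()
user_did = "did:example:" + secrets.token_hex(8)
LEDGER[user_did] = {"id": user_did, "publicKey": user_public}

# Login: the service sends a challenge, the user signs it and sends it back.
service = Service()
challenge = service.new_challenge()
signature = sign(user_private, challenge)
print("authenticated:", service.check_response(challenge, signature, user_did))
```

Tracking pending challenges is a design choice of this sketch: it lets the service reject a captured response that is sent a second time.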
Economics of DIDs
With that technology overview in place, we can now dig into more exciting questions around DIDs, such as:
- What problems do DIDs solve and how do they solve them?
- Is blockchain necessary for DIDs? Can existing technologies accomplish the same goals?
- Who will benefit from DIDs?
When I was working with Disney, Sony Pictures and the like, one of my favorite books was Entertainment Industry Economics, which laid out how the entertainment industry makes decisions. Following the same spirit, I think it is worth looking at the ecosystem of DIDs from an economic perspective. This involves identifying the stakeholders involved in the production, distribution, and consumption of DIDs and the rational decisions that they make. Note, I’m a technology guy, not an economist, so hopefully this analysis doesn’t offend any economists.
Note that this is where things can get contentious and people have very strong feelings about the things I am about to discuss. Remember at the beginning of this article I said that I was going to try to be neutral and objective. If I factually misrepresent anything, please let me know.
Actors in the DID ecosystem seem to be:
- Users: the owners of DIDs and the Holders and Subjects of Verifiable Claims
- Service Providers: the websites, mobile apps, and platforms that act as Inspector-Verifiers of claims
- Claims Issuers: the government agencies, companies, and organizations that provide Claims about any number of Subjects
- Technology Providers: the companies that would be required to build out the DID ecosystem on behalf of the other actors
To be clear, these categories are not distinct in the real world: companies like Google and Facebook are Service Providers, Issuers, and Technology Providers; and frequently government agencies (such as the DMV) act as both Service Providers and Issuers. This segmentation is still useful though, since the choices made when performing one of these roles drive common decision making.
We can start by considering the point of view of the DID Users.
DID Users
The most common argument for the value DIDs offer users is that they return control to users. This can come in a few different varieties:
- Control of Identifier: Commonly used identifiers today include email addresses and domain names. Unfortunately, these are controlled by third parties whose interests may not align with those of users. For example, email providers may revoke or suspend email addresses; and domains may be shut down or taken over. So long as a DID private key stays private, a user remains in control of their DID.
- Variety of Authority & Claims: Currently there are very few authorities that can assert claims about people. Examples include Drivers Licenses, Passports, Social Security Number, Student IDs, Bank Accounts, etc. — all of which can be used to prove identity to a third party. However, the authorities controlling these identifiers and claims are not infallible and having more authorities that can issue claims creates open market competition around identifiers and claims, hopefully increasing the quality and availability of claims.
- Centralized Identity Providers: Today, many people use Facebook, Google, Twitter, GitHub or other identity providers to log in to third-party services. This gives these companies an incredible amount of control and insight into people’s lives. By using DID, users would not be dependent on these third-party accounts anymore.
- Control of Privacy: Users frequently don’t have control over which data they share or when they share it. One aspect of the centralized accounts is that users must share lots of private information with corporations that they may not choose to share otherwise.
Let’s consider each of these in turn to see if and how DIDs would achieve these goals.
First, users do get control over their unique identifiers. Because blockchain is the underlying technology, the ledger algorithms prevent interference from those who would otherwise have the access and control to make changes. Even if authorities requested changes to a ledger, it would not be possible for those responsible for the governance of the ledger to make those changes.
A follow-on question to this is: how much does control of identifiers really matter? The examples of losing email addresses or domain names are not things that the majority of the population can relate to; however, in authoritarian regimes where revoking identifiers could silence journalists or freedom-seekers, having control over identifiers could have significant impact.
Second, will DIDs provide users with a variety of authorities and claims? A major argument for DIDs seems to be that authorities are brittle and corrupt, and that the government agencies that provide identifiers and claims are over-trusted and over-empowered. Verifiable Claims seem to hold the promise of more authorities popping up, most likely in the form of commercial entities (note: this seems like a similar vision to the mobile money “wakala” that act as agents for Tigo Pesa, Mpesa, and other providers).
It is not entirely clear whether this vision would come to pass. I can’t think of many economic systems, especially in developed countries, that remain largely fragmented for very long. It would seem that marketing, network externalities, acquisitions, etc. would cause the market to converge to just a few commercial authorities that would be providing claims. Should that come to pass, it is not clear that large commercial authorities and claim providers would be better for users; perhaps it is a matter of perspective, depending on whether one trusts the government or commercial entities more.
Third, users would benefit from not being locked in to large services that act as their identity providers. This seems like an especially problematic argument: nearly any service could set up OAuth or OpenID Connect today and act as an identity provider, so it doesn’t seem like technology is what prevents more commercial identity providers from emerging. Instead, it seems like users have selected their identity providers based on how frequently they use the services and the confidence they have in their service providers, thus driving them to the companies that have gotten large through other means.
Fourth and finally, DIDs give users control over their privacy. It is not entirely clear that DIDs fundamentally change privacy for users, since users already have control over which information they give to which service providers. Service providers will frequently demand more information from users than is necessary, and there is nothing about DIDs that would limit the quantity, scope, or detail of information that would be required of users.
Another privacy concern is the linkability of users’ information across services. If DIDs could prevent users from providing detailed information (like name and birthday), then DIDs would have promise in reducing the amount of privacy-violating data sharing that occurs between services. In addition, the low cost and high reliability of creating new DIDs makes it easier to ensure that unique identifiers (such as email addresses or social security numbers) aren’t used to link users across services. In fact, some DID implementations recommend creating a new DID for every service (called “pairwise DIDs”) to help reinforce this privacy. Pairwise DIDs aren’t an absolute protection, but they are an incremental step towards improved privacy.
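The pairwise idea is simple enough to sketch. The `Wallet` class below is hypothetical, and it mints identifiers from random bytes purely for illustration; in a real implementation each DID would be tied to its own key pair and registered through the chosen DID method.

```python
import secrets


class Wallet:
    """Hypothetical wallet that keeps a separate DID for every service."""

    def __init__(self):
        self._dids = {}  # service name -> DID used only with that service

    def did_for(self, service: str) -> str:
        # Mint a fresh identifier the first time a service is seen, then
        # keep returning the same one so the service recognizes the user.
        if service not in self._dids:
            self._dids[service] = "did:example:" + secrets.token_hex(16)
        return self._dids[service]


wallet = Wallet()
bevmo_did = wallet.did_for("bevmo.example")
bank_did = wallet.did_for("bank.example")

# Each service sees a stable identifier, but the two services cannot link
# their records by comparing identifiers alone.
print(bevmo_did == wallet.did_for("bevmo.example"))
print(bevmo_did != bank_did)
```

This only removes the identifier as a linking key. If the user hands both services the same name and birthday, the records can still be joined on those values.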
DID Service Providers
The current decisions facing service providers are: 1) whether or not they should implement DID-based identities and authentication; and 2) if they implement DID-based identity, what sorts of claims they will accept.
Perhaps the best argument for a service provider to adopt DIDs is risk shifting. By using DIDs, the risk of identity proofing, validity of claims, etc. is shifted back to users and claims issuers. Because the claims are cryptographically verifiable, the service provider has a strong argument that they were justified in believing claims in the event that the claims end up being wrong. In addition, the use of Verifiable Claims provides the service providers with the option not to retain the claim data, reducing their exposure to GDPR and other data and privacy risks.
Should a service provider choose to integrate with the DID ecosystem, the most difficult challenge seems to be choosing which authorities and claims they will accept. Integrating with a large number of claims providers may be more convenient for users, but may require significant work around understanding the quality of the claim issuer, establishing legal relationships with issuers, and perhaps developing the technological interfaces to different providers.
Claims Issuers
Should users and service providers choose to adopt DIDs, there would be little reason for Claims Issuers not to enable the ecosystem. By providing cryptographically verifiable claims into a system that makes it convenient to transmit them for any number of use cases, the Claims Issuers would only increase their influence and the value of the claims they are providing.
DID Technology Providers
The final actors in the DID ecosystem are the technology providers that are building out blockchains, identity management systems, authentication systems, etc. Some technology providers are enthusiastically in favor of DIDs, both because of their belief in the value DIDs provide to users and because of the lucrative rewards of building out a new market.
Other technology providers have proven to be more skeptical of the DID ecosystem. At first blush, it would seem like DID is making significant investments in new technologies where existing technologies would perform just fine: blockchain’s functions can be performed by PKI; Verifiable Claims might just be a new data model on top of JSON Web Tokens (JWTs); the exchange of user information could take place through OpenID Connect, especially using its Discovery and UserInfo mechanisms.
Moving away from existing technologies isn’t simply a matter of pride and / or wasted effort. Existing technologies have known security and privacy models that have been well established through years of maturation and third-party research. Discarding known technologies introduces new risks that may be difficult for some markets to take.
Understanding whether existing technologies can achieve the same outcomes as DIDs is one of the areas of future work.
Future Work
There is a great deal of enthusiasm around DIDs and the technology and ecosystem are rapidly evolving. The W3C Credentials Community Group has a very active conversation around DIDs, their use cases, and the future of the specifications that have been developed. Much of the conversation is still ongoing and it isn’t entirely clear how the adoption of DIDs will play out; however, it is an interesting approach to identity and identifiers and one that merits watching.