Verifiable Claims spec does not champion privacy

Tristan Hoy
Published in TOKENIZE.IO
3 min read · Jul 25, 2017

The current draft architecture for Verifiable Claims describes a single point of privacy failure: the identifier registry.

Here is a set of cascading requirements, assuming the registry is some kind of “server”:

  • The registry MUST be resilient to denial of service attacks
  • The registry MUST be able to discriminate between high-volume inspectors (e.g. Walmart, government agencies) and DDoS attackers
  • Therefore, the registry MUST authenticate inspectors

Authenticating inspectors gives the registry access to all of the metadata about who an identity holder is interacting with, because every lookup ties a known inspector to a specific identifier. While this is perfect in a government or corporate environment where every interaction will be logged regardless, it is not good for privacy.
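
To make the cascade concrete, here is a minimal Python sketch of a server-style registry. Everything in it (the Registry class, the API keys, the did:example identifier) is hypothetical and not taken from the draft. It only illustrates that once lookups are authenticated, the access log becomes a map of the holder's relationships.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Registry:
    """Hypothetical server-style identifier registry (not from the draft)."""
    api_keys: dict[str, str]   # API key -> inspector name
    records: dict[str, str]    # identifier -> key material
    access_log: list[tuple[str, str, str]] = field(default_factory=list)

    def resolve(self, api_key: str, identifier: str) -> str:
        # Authentication lets the registry tell a big inspector from a DDoS bot...
        inspector = self.api_keys.get(api_key)
        if inspector is None:
            raise PermissionError("unknown inspector")
        # ...and it also ties every lookup to a named inspector.
        timestamp = datetime.now(timezone.utc).isoformat()
        self.access_log.append((timestamp, inspector, identifier))
        return self.records[identifier]


registry = Registry(
    api_keys={"key-walmart": "Walmart", "key-clinic": "Clinic"},
    records={"did:example:alice": "alice-public-key"},
)
registry.resolve("key-walmart", "did:example:alice")
registry.resolve("key-clinic", "did:example:alice")

# The registry operator can now list everyone Alice has dealt with.
seen_by = [inspector for _, inspector, ident in registry.access_log
           if ident == "did:example:alice"]
print(seen_by)  # ['Walmart', 'Clinic']
```

Nothing here requires a malicious registry: the log is a side effect of the requirements above.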

Unless of course, the registry is a blockchain, and each inspector is running their own node. There is some very specific jargon that indicates that this may be the not-so-opaque intention of the architecture:

“The registry MUST manage identifiers in a self-sovereign way”

However, this is speculation.

What isn’t speculation is that this architecture cannot possibly support interaction privacy unless the identity registry is a decentralized name service or some other flavour of blockchain.

The working group charter states:

“The Working Group will not…attempt to lead the creation of a specific style of supporting infrastructure”

But that’s exactly what’s happening: the registry is a required component, and if you want interaction privacy, the registry has to be a blockchain.

And this comes with potentially fatal drawbacks: decentralized name services critically lack the ability to block or revoke fake, hacked, spam and lost identities in the way that SSL, DNSSEC and Estonia's e-residency can. And because the blockchain is public, any implementation flaw (which is probable, considering the high complexity) will permanently break privacy for all users. [Edit: this statement is somewhat hyperbolic. However: 1) privacy in a public database has one chance at success; 2) blockchain is highly complex; 3) this complexity makes failure more likely.]

If the architecture were agnostic about the issuance and verification of authenticating subject identifiers, then you could have privacy without a blockchain.

The FAQ states in the answer to Q7:

“The proposed data model and syntaxes are designed to be storage system and transaction protocol agnostic”

But it’s not transaction protocol agnostic. The use of a registry implies a specific transaction protocol that is either technology specific or privacy violating.

Buried in the details, the data model recommends the use of short-lived or single-use bearer tokens (e.g. a public-key signed JWT) for high-privacy applications. These bearer tokens would not require a central registry, although this is not stated.

Another alternative is to simply use per-claim public/private keypairs, which are self-sovereign, self-authenticating and stateless (no central store required). Upon presenting a claim, the claim holder can sign a challenge issued by the claim inspector to verify ownership (rather than just possession) of the claim.
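
The challenge step can be sketched in Python as follows. This is a sketch under stated assumptions, not the spec's method: the claim layout, the inspect function and the toy Schnorr-style signature are invented for illustration, and the signature parameters are deliberately simple and not secure.

```python
import hashlib
import json
import secrets

# TOY Schnorr-style signature, for illustration only. NOT secure;
# a real system would use a vetted scheme such as Ed25519.
P = 2**521 - 1   # a Mersenne prime, used as the modulus
G = 3            # base element (an assumption of this sketch)


def _hash_to_int(r: int, message: bytes) -> int:
    data = r.to_bytes(66, "big") + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def keygen() -> tuple[int, int]:
    secret = secrets.randbelow(P - 2) + 1
    return secret, pow(G, secret, P)


def sign(secret: int, message: bytes) -> tuple[int, int]:
    k = secrets.randbelow(P - 2) + 1
    r = pow(G, k, P)
    s = (k + secret * _hash_to_int(r, message)) % (P - 1)
    return r, s


def verify(public: int, message: bytes, signature: tuple[int, int]) -> bool:
    r, s = signature
    if not 0 < r < P:
        return False
    return pow(G, s, P) == (r * pow(public, _hash_to_int(r, message), P)) % P


issuer_secret, issuer_public = keygen()

# The holder generates a fresh keypair for this one claim.
holder_secret, holder_public = keygen()

# The issuer binds the claim to that per-claim public key and signs it.
claim = json.dumps({"over_18": True, "holder_key": holder_public},
                   sort_keys=True).encode()
issuer_sig = sign(issuer_secret, claim)


def inspect(claim: bytes, issuer_sig: tuple[int, int], respond) -> bool:
    """Inspector checks the issuer's signature, then challenges the presenter."""
    if not verify(issuer_public, claim, issuer_sig):
        return False
    challenge = secrets.token_bytes(32)   # fresh nonce, so replies cannot be replayed
    holder_key = json.loads(claim)["holder_key"]
    return verify(holder_key, challenge, respond(challenge))


# The real holder can answer the challenge.
owner_ok = inspect(claim, issuer_sig, lambda c: sign(holder_secret, c))

# Someone who merely copied the claim cannot: they lack the private key.
thief_secret, _ = keygen()
thief_ok = inspect(claim, issuer_sig, lambda c: sign(thief_secret, c))
```

The only state involved is the issuer's public key held by the inspector and the private key held by the claim holder.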

But why isn’t a high-privacy option (e.g. bearer tokens, public keys) the default configuration?

Why does the front-and-centre diagram include an identity registry that is either technology specific or privacy violating?

Why is it stated nowhere that the registry is optional?

Why does it seem like the spec places the needs of specific stakeholder groups above the absolute need for privacy? [Edit: this statement refers to the placement of particular configurations within the spec: it takes digging to figure out if this spec even supports high-privacy applications.]

Recommendations for the working group:

  • The “brochure” version of the spec is the most important, and it should place zero-registry, high-privacy options first and foremost to encourage privacy-first adoption
  • The high-level architecture draft and interaction diagrams use singular language when referring to identifiers, indicating that a claim holder has only one identifier. This should be pluralized to indicate multiple identifiers by default, to encourage privacy-first adoption
  • Fix the link on your proposal to point to the current home of the data model
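
One way to picture the second recommendation is a holder-side wallet that hands out a different identifier for every relationship. The Python sketch below is an assumption about how that might look, not something the draft specifies: the Wallet class, the "id:" prefix and the HMAC-based derivation are all hypothetical.

```python
import hashlib
import hmac
import secrets


class Wallet:
    """Hypothetical holder wallet: one identifier per relationship."""

    def __init__(self) -> None:
        # The master secret never leaves the wallet.
        self._master_secret = secrets.token_bytes(32)

    def identifier_for(self, relationship: str) -> str:
        # Deterministic per relationship, unlinkable across relationships
        # without the master secret.
        digest = hmac.new(self._master_secret, relationship.encode(),
                          hashlib.sha256).hexdigest()
        return "id:" + digest[:32]


wallet = Wallet()
walmart_id = wallet.identifier_for("walmart")
clinic_id = wallet.identifier_for("clinic")
```

Two inspectors comparing notes see two unrelated identifiers, while each inspector sees a stable one on repeat visits.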