The Sidetree Protocol: Scalable DPKI for Decentralized Identity

Back in 2017, a few members of the Decentralized Identity Foundation (DIF) began discussing how decentralized identity systems could achieve global scale. At the lowest layer of most decentralized identity systems is a blockchain/ledger (‘Layer 1’ or L1) that is used in some fashion to support a Decentralized Public Key Infrastructure (DPKI) and W3C Decentralized Identifiers (DIDs). Blockchain scalability is a non-trivial problem, but there is now a promising strategy for scaling blockchain-based systems known as ‘Layer 2’ protocols, or L2s for short (examples include State Channels, Sidechains, and Bitcoin’s Lightning Network). L2s achieve scalability through deterministic processing and transaction schemes that take place ‘above’ the blockchain, with zero or minimal consensus requirements beyond small points of interaction with the underlying chain. For decentralized identity to become a reality, it requires a system that operates at massive scale while still delivering important features, such as deterministic state resolution and differential persistence. Over the past 18 months, what started as one member posing an idea to others at an IIW happy hour gathering turned into a full-fledged technical exploration and, more recently, the development of a new Layer 2 protocol: Sidetree.

The Sidetree protocol is not itself a DID Method. It is a composition of code-level components (deterministic processing logic, a content addressable storage abstraction, and state validation procedures) that can be deployed atop Layer 1 decentralized ledger systems (e.g. public blockchains) to produce permissionless Layer 2 DID networks. The protocol can be used to create distinct L2 DID networks on different chains by combining its core components with a chain-specific adapter, which handles reading from and writing to the underlying L1. Almost all of Sidetree’s protocol implementation code remains the same regardless of the target L1 system it is applied to.

Here’s an overview of the system, which shows Bitcoin as the target chain (but as noted above, it can be applied to others):

How it Works

The Sidetree protocol is implemented using a collection of modular components, the most critical being:

  • Sidetree Core: the main logic module that observes incoming transactions from the target blockchain, fetches any DID operation batches it finds (via the CAS module below), and assembles/validates the state of individual DIDs.
  • Content Addressable Storage: the CAS module is a hash-based storage interface that L2 nodes in the network use to circulate DID operation batches between each other for local persistence and network-wide propagation. The interface is abstracted from the specific CAS protocol used within it, but it’s worth noting that DIF members have selected IPFS for this functionality.
  • Blockchain/Ledger Adapter: the component that contains any unique code required to read and write Sidetree-encoded transactions to a specific chain. This is the primary scope of code an implementer must write (assuming there is no existing adapter to address the chain they wish to use).
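
To make the division of labor between these components more concrete, here is a minimal Python sketch of the two pluggable interfaces. It is an illustration only, not the Sidetree reference code: the class and method names (ContentAddressableStore, LedgerAdapter, put, get, write_anchor, read_anchors) are hypothetical, the in-memory classes stand in for IPFS and a real chain, and a hex-encoded SHA-256 digest is assumed as the content address.

```python
import hashlib
from typing import Dict, List, Protocol, Tuple


class ContentAddressableStore(Protocol):
    """Hash-based storage interface (hypothetical method names)."""

    def put(self, data: bytes) -> str: ...

    def get(self, address: str) -> bytes: ...


class LedgerAdapter(Protocol):
    """Chain-specific reads and writes (hypothetical method names)."""

    def write_anchor(self, anchor: str) -> int: ...

    def read_anchors(self, since: int) -> List[Tuple[int, str]]: ...


class InMemoryCas:
    """Toy stand-in for a CAS network such as IPFS."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself (assumed: SHA-256 hex).
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]  # raises KeyError if the content is unknown
        # Re-hash on read so corrupted or substituted content is rejected.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


class InMemoryLedger:
    """Toy stand-in for a chain: an append-only list of anchor strings."""

    def __init__(self) -> None:
        self._anchors: List[str] = []

    def write_anchor(self, anchor: str) -> int:
        # Returns the position of the anchor in the ledger's total order.
        self._anchors.append(anchor)
        return len(self._anchors) - 1

    def read_anchors(self, since: int = 0) -> List[Tuple[int, str]]:
        # Yields (position, anchor) pairs starting at position `since`.
        return list(enumerate(self._anchors))[since:]
```

In this sketch, the core module would depend only on the two Protocol classes, so targeting a new chain means supplying another LedgerAdapter implementation while the rest stays unchanged.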

Sidetree-based L2 nodes perform the following high-level steps to create, read, and process DID operations:

  1. Nodes that wish to write batches of operations into a Sidetree-based L2 network gather together as many DID/DPKI operations as they wish (up to limits enforced through a variety of deterministic protocol rules), and create an L1 on-chain transaction embedded with a hash of the operation batch.
  2. Source data for DID operation batches is stored locally by the originating node, and pushed to the wider IPFS network. When other nodes become aware of the Sidetree-embedded transaction in the underlying chain, they begin requesting the batch data from the originating node, or any other active IPFS nodes across the wider network that may have it.
  3. When a batch is received by a node, it pins the source data locally for retention, then the core logic module decompresses the batch to parse and validate each operation. The target chain’s block/transactional lineage is the only consensus mechanism the protocol requires: there are no additional blockchains, sidechains, or authorities that must be consulted to arrive at the correct PKI state for the DIDs in the network.
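
The steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual Sidetree wire format: the JSON encoding, zlib compression, SHA-256 hex anchor, the MAX_OPERATIONS_PER_BATCH value, and the is_well_formed check are all hypothetical stand-ins, and a plain dict plays the role of the CAS network. Real operation validation (for example, signature checks) is omitted.

```python
import hashlib
import json
import zlib
from typing import Dict, List

# Hypothetical batch size limit; the real limits are set by protocol rules.
MAX_OPERATIONS_PER_BATCH = 10_000

# Hypothetical set of operation types used only for this illustration.
KNOWN_TYPES = {"create", "update", "delete"}


def build_batch(operations: List[dict]) -> bytes:
    """Step 1: serialize and compress a batch of operations."""
    if len(operations) > MAX_OPERATIONS_PER_BATCH:
        raise ValueError("too many operations for one batch")
    # Canonical JSON (sorted keys, no extra whitespace) so equal inputs
    # always serialize to the same bytes.
    encoded = json.dumps(operations, sort_keys=True, separators=(",", ":"))
    return zlib.compress(encoded.encode("utf-8"))


def batch_hash(batch: bytes) -> str:
    """The value embedded in the L1 transaction (assumed: SHA-256 hex)."""
    return hashlib.sha256(batch).hexdigest()


def is_well_formed(operation: dict) -> bool:
    """Placeholder validation; a real node would also verify signatures."""
    return isinstance(operation.get("did"), str) and operation.get("type") in KNOWN_TYPES


def process_anchor(anchor: str, cas: Dict[str, bytes]) -> List[dict]:
    """Steps 2 and 3: fetch the batch for an on-chain anchor and validate it."""
    batch = cas.get(anchor)
    if batch is None:
        return []  # batch data not available yet; a node would retry later
    if batch_hash(batch) != anchor:
        raise ValueError("batch data does not match the on-chain hash")
    operations = json.loads(zlib.decompress(batch))
    return [op for op in operations if is_well_formed(op)]
```

For example, a writer would call build_batch, store the result under batch_hash(batch) in the CAS, and embed that same hash in a chain transaction; any observer holding the hash can then call process_anchor to recover the valid operations.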

This is a more detailed diagram of how batches and operations are embedded in a target chain:

The Sidetree protocol makes a few key assumptions as a part of its design:

  1. DIDs are assumed to be non-transferable, and the protocol provides no means for the entity that originated a DID to give or sell it to another logical entity, or for another entity to acquire it. This assumption works for the DID/DPKI case, but would not work for the monetary double-spend case (if it did, we wouldn’t need a blockchain to ascertain decentralized, deterministic lineage).
  2. Late reveal of embedded batch data is permitted, and handled via a deterministic rule set that provably converges on a correct state, regardless of when a node learns about a given DID’s operations.
  3. DID states are siloed from each other, thus a DID owner can only affect the state of their own DID, not of any others.
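
The second and third assumptions can be illustrated with a toy resolver. This is a sketch under simplifying assumptions, not Sidetree's actual rule set: operations are modeled as hypothetical tuples ordered by their on-chain transaction number and their index within the batch, only "create" and "update" are handled, and ownership checks (signatures) are left out. The point is that replaying operations in chain order yields the same result no matter the order in which a node received the batch data, and that each operation touches only the entry for its own DID.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical shape:
# (transaction_number, index_in_batch, did, operation_type, public_key)
AnchoredOperation = Tuple[int, int, str, str, str]


def resolve(operations: Iterable[AnchoredOperation]) -> Dict[str, str]:
    """Replay operations in chain order and return each DID's current key."""
    state: Dict[str, str] = {}
    # Sorting by (transaction_number, index_in_batch) makes the result
    # independent of the order in which batch data happened to arrive.
    for _, _, did, op_type, public_key in sorted(operations):
        if op_type == "create" and did not in state:
            state[did] = public_key
        elif op_type == "update" and did in state:
            state[did] = public_key
        # Anything else (duplicate create, update of an unknown DID) is ignored.
    return state


early_node = [
    (1, 0, "did:example:alice", "create", "key-1"),
    (2, 0, "did:example:alice", "update", "key-2"),
    (2, 1, "did:example:bob", "create", "key-9"),
]
# A node that learned about the same operations in the opposite order.
late_node = list(reversed(early_node))

print(resolve(early_node) == resolve(late_node))
```

Because the state is a mapping keyed by DID and every operation names exactly one DID, an operation on one identifier cannot alter another identifier's entry in this model.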

On the Horizon

Currently, two groups within DIF are developing DID Layer 2 networks for two different chains (Bitcoin and Ethereum) using the Sidetree protocol. Microsoft is primarily focused on the Bitcoin variant, while Transmute Industries is leading the charge (with collaborators from Consensys) on developing an Ethereum version. These groups will soon detail their experiences implementing the Sidetree protocol on different blockchains, and disclose more about the public, permissionless, Sidetree-based DID networks they are working on.

The following contributors have been instrumental in the development and maturation of this protocol: