In the digital world, we cannot live without identity. Proving my identity (that I am who I claim to be) is mostly done with a simple username and password, and sometimes with more advanced mechanisms such as multi-factor authentication or biometrics. This does not change the reality that we have too many siloed identities, and the situation is only getting worse. Federated identity, in which an identity provider serves as a single point of access to multiple services, helps a lot from the user’s perspective. However, our access to multiple services then becomes centralized in one or two big companies. They hold my identity, and holding so many identities also makes them an attractive target for attack.
Sovrin Foundation brings up the idea of Self-Sovereign Identity (SSI). I don’t think I can describe it better than the material created by Sovrin, but in short, we no longer rely on any single identity provider. Instead, actors in an SSI identity network communicate in a peer-to-peer fashion. Authentication is composed of a credential (issued by one party) and proof verification (done by another party). When a verifier needs to check whether someone is who they claim to be, or holds certain attributes, that person presents credentials obtained from trusted parties, and the verifier accepts those credentials as proof. An individual now discloses only what is needed for authentication, and in many cases can prove that a condition holds rather than disclosing the actual value or fact. Think of the classic example: “I only need to prove I am over 21 without disclosing my actual birthday”.
Each actor in this identity network is given decentralized identifiers (DIDs). A DID is “decentralized” because it is recorded in a decentralized environment (for example, a blockchain network). For privacy, each entity can also have multiple DIDs, used in different contexts. This largely reduces the chance of entity correlation, as each party only holds some of the information about the entity.
Sovrin’s code base was open sourced and contributed as the initial code for Hyperledger Indy, a blockchain project now under the Linux Foundation. We will mainly refer to it as Hyperledger Indy in this article.
Hyperledger Indy comes with a good example (indy-dev) that simulates a real-life case. There is a good and in-depth getting started guide (Indy Walkthrough), which contains very comprehensive code showing how things work. Here I would like to explore the example in an illustrative way and help the audience grasp the ideas behind it.
We first take a look at the storyline of this example.
There are five actors in this example: Government, Faber College, Acme Corp, Thrift Bank and Alice. The first four are organizations, while Alice is interacting with them as an individual during the story.
Government is responsible for setting the “standard” schemas. This makes sense, as it is preferable to have standard schemas of various types that all organizations can follow. Two schemas are created in this example: Transcript Schema and Job-Certificate Schema. We will see how they are used later.
Alice is applying for a job at Acme Corp. Acme Corp requires proof of education level as one requirement of the job application. As a graduate of Faber College, Alice first applies for a transcript from Faber College, which is issued as a credential that Alice keeps. Faber College adopts Government’s Transcript Schema. When applying for the job at Acme Corp, she presents this credential to Acme Corp. Acme Corp accepts the credential as proof of her education level from Faber College.
In this case,
- Alice: prover and credential holder
- Faber College: credential issuer
- Acme Corp: credential verifier
Alice is lucky enough to get the job, and she is now an employee of Acme Corp.
Alice is applying for a car loan from Thrift Bank. Thrift Bank requires proof of both her employment status and her identity, as part of what is known as the Know Your Customer (KYC) process. Alice now applies for a job certificate from Acme Corp, which is again a credential, issued by Acme Corp following Government’s Job-Certificate Schema. Now Alice has two credentials: one from Faber College, which contains her identity information, and one from Acme Corp for her employment status. She submits these credentials in her car loan application.
In this case,
- Alice: prover and credential holder
- Faber College and Acme Corp: credential issuers
- Thrift Bank: credential verifier
This is the whole storyline of this example.
Run The Example
Before we go into detail on how the storyline is implemented, we will highlight the files in this repository and show how to run the example.
A Docker runtime is required to run this example. We will build the docker images and then run the containers. In this article I run everything on my local machine.
First, git clone the indy-dev example.
$ git clone https://github.com/sovrin-foundation/indy-dev.git
$ cd indy-dev
The Makefile helps us run the scripts defined inside the scripts directory. The two we use are build and start.
Build Docker Images
We issue the make build command to build the docker images.
$ sudo make build
If we take a look at scripts/build.sh, we see that this command builds two docker images, corresponding to the two dockerfiles: indy-pool.dockerfile and indy-dev.dockerfile.
From indy-pool.dockerfile, a docker image indy_dev_pool is built. When running, this container is a pool of four Indy nodes, which is where the ledger is kept. From indy-dev.dockerfile, a docker image indy_dev is built. When running, this container is where we play out the whole story, with records being written to the ledger.
Start the Dev Environment
We issue the make start command to run the example.
$ sudo make start
If we take a look at scripts/start.sh, we see that the docker containers are instantiated. Effectively, both the indy_dev_pool container and the indy_dev container are brought up, and we are dropped into the shell of indy_dev, where we can run the python code for the story.
If we list the running docker containers (from another terminal), we will see the two containers running.
Inside the shell of indy_dev we can run the python code to simulate the whole storyline.
$ cd python
$ python3 getting_started.py
When we execute getting_started.py, the whole process runs automatically, with good logging information. From the log we can understand the sequence of tasks the different actors perform. If we wish to see more detail, we can take a look at the getting_started.py file itself, which also details the steps.
We will detail the implementation in the next section.
While the code is running, we observe that some wallet files are created (in the .indy_client/wallet directory). Before code execution completes, however, all these wallets are removed. This is done on purpose in the python code as cleanup. See the last few lines of the log.
After we exit the shell, we also see that the two docker containers are killed.
The storyline above is something that happens in real life. Let’s see how it is implemented in Hyperledger Indy.
Indy Nodes: Source of Trust
First, we need a source of trust, which holds the information that helps to build trust across the whole network. Distributed Ledger Technology (blockchain) is a good source of trust, given that the data stored is secured against tampering and has good traceability. Hyperledger Indy comes with a permissioned blockchain that provides the trust required for the whole system. The blockchain is Indy Plenum.
In the indy-dev example, the network is implemented as four nodes, all of which share the ledger. The example does not go into detail on how the four nodes work. Let’s assume they are working well and that what is written in the ledger provides the source of trust for the whole network.
In this example the first actor is called the Steward. The Steward can onboard new actors into the system and assign roles to them. All the organizations mentioned in the storyline are “created” by the Steward with the role Trust Anchor before they can perform any activities.
In real life, in a well-established Hyperledger Indy network, Stewards (and Trustees) are important members of a governing body, which holds the ultimate responsibility for maintaining the level of trust and credibility of the whole network. The governing body defines the rules and regulations for choosing the right Stewards. Again, in this example, the Steward is simply “created” in the code.
Steward Creates All Actors
Once created, the Steward is responsible for creating the other actors (Government, Faber College, Acme Corp and Thrift Bank). Two steps are needed before an actor can perform the tasks in the storyline: onboarding, and being granted a Verinym with the Trust Anchor role.
Onboarding involves creating a pairwise-unique identity (DID) between two parties. A pairwise-unique identity is a pair of DIDs, each owned by one party. The pair is unique in the sense that it is only used for communication between these two parties. Each DID comes as a set of three items: a signing key, a verification key (verkey), and the DID itself. The signing key is the private key, kept secret and stored in the wallet. Both the verkey and the DID are public information and are recorded on the ledger for public access.
The pairwise-unique DID is not used for interacting with the ledger. For that, each actor needs another DID (called a Verinym) that identifies the actor on the ledger, together with the role of Trust Anchor.
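To make the “signing key, verkey, DID” triple more tangible, here is a minimal Python sketch of onboarding between two parties. It is a toy model and does not use the Indy SDK: the SHA-256 step is only a stand-in for real public-key derivation, the rule that the DID is the base58 encoding of the first 16 bytes of the verkey is my assumption about Indy’s default, and the names new_did_record, public_part and connect are hypothetical.

```python
import hashlib
import secrets

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Base58-encode bytes, keeping leading zero bytes as '1' characters."""
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def new_did_record() -> dict:
    """Create a toy (signing key, verkey, DID) triple."""
    signing_key = secrets.token_bytes(32)  # secret, never leaves the wallet
    # Stand-in only: a real implementation derives a public key from the
    # signing key; hashing is used here just to get 32 public-looking bytes.
    verkey = hashlib.sha256(signing_key).digest()
    did = b58encode(verkey[:16])  # assumption: DID = first 16 bytes of verkey
    return {"did": did, "verkey": b58encode(verkey), "signing_key": signing_key}


def public_part(record: dict) -> dict:
    """What would be written to the ledger: DID and verkey, no secret."""
    return {"did": record["did"], "verkey": record["verkey"]}


def connect(wallet_a: dict, name_a: str, wallet_b: dict, name_b: str) -> None:
    """Each side creates a fresh DID used only for this one relationship."""
    wallet_a[name_b] = new_did_record()
    wallet_b[name_a] = new_did_record()


steward_wallet, government_wallet = {}, {}
connect(steward_wallet, "Steward", government_wallet, "Government")
print(public_part(steward_wallet["Government"]))
print(public_part(government_wallet["Steward"]))
```

Each wallet is keyed by the other party’s name, which mirrors the idea that a DID belongs to one relationship only.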
Granting Actor a Verinym and a Trust Anchor Role
Take Government as an example. After the Steward onboards Government, Government creates another new DID (again a signing key, a verkey and a DID). As the Steward now has a communication channel to Government (through the pairwise-unique DID), Government passes this new DID and verkey to the Steward. The Steward records the DID and verkey on the ledger and sets the role to Trust Anchor.
From now on, Government is fully functioning. Government now has a DID representing itself, and this DID has the Trust Anchor role.
The Steward repeats the same “onboarding” and “granting a Verinym” steps for each actor: Government, Faber College, Acme Corp and Thrift Bank. Now these actors all have the same capability to act on the ledger.
We will not see the Steward any more in the example. All actors are ready, and we can continue the storyline.
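The two steps can be sketched with a toy in-memory ledger. This is not the Indy SDK: ToyLedger, write_nym, the role strings and all DID and verkey values below are hypothetical placeholders, and the rule about which roles may write is my simplifying assumption.

```python
from dataclasses import dataclass, field

# Assumption: only these roles may add identity records to the ledger.
WRITER_ROLES = {"STEWARD", "TRUST_ANCHOR"}


@dataclass
class ToyLedger:
    nyms: dict = field(default_factory=dict)  # did -> {"verkey", "role"}

    def write_nym(self, submitter_did, target_did, verkey, role=None):
        """Record target_did on the ledger if the submitter is allowed to."""
        submitter = self.nyms.get(submitter_did)
        if submitter is None or submitter["role"] not in WRITER_ROLES:
            raise PermissionError(f"{submitter_did} may not write to the ledger")
        self.nyms[target_did] = {"verkey": verkey, "role": role}


ledger = ToyLedger()
# The Steward is simply "created": seeded directly, as in the example.
ledger.nyms["steward-did"] = {"verkey": "steward-verkey", "role": "STEWARD"}

for org in ["government", "faber", "acme", "thrift"]:
    # Step 1 (onboarding): pairwise DID for the Steward-to-org channel, no role.
    ledger.write_nym("steward-did", f"{org}-pairwise-did", f"{org}-pairwise-verkey")
    # Step 2 (Verinym): the DID the org uses on the ledger, as Trust Anchor.
    ledger.write_nym(
        "steward-did", f"{org}-verinym-did", f"{org}-verinym-verkey",
        role="TRUST_ANCHOR",
    )

for did, record in ledger.nyms.items():
    print(did, record["role"])
```

Note that a pairwise DID carries no role in this sketch, so it cannot be used to write to the ledger, which is why each organization needs its Verinym.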
Step 1: Government creating schemas
Government issues the Transcript Schema and Job-Certificate Schema, and records them onto the ledger. The schemas are accessible by everyone.
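A schema is essentially a named, versioned list of attribute names. Here is a small Python sketch of Step 1 using a plain dictionary as the ledger. The identifier format, the version numbers and the exact attribute lists are my assumptions, based on the attributes mentioned in this article; check getting_started.py for the real values.

```python
def publish_schema(ledger, issuer_did, name, version, attr_names):
    """Store a schema on the toy ledger and return its id."""
    schema_id = f"{issuer_did}:{name}:{version}"  # hypothetical id format
    ledger[schema_id] = {
        "id": schema_id,
        "name": name,
        "version": version,
        "attr_names": list(attr_names),
    }
    return schema_id


toy_ledger = {}
GOVERNMENT_DID = "government-verinym-did"  # placeholder value

transcript_schema_id = publish_schema(
    toy_ledger, GOVERNMENT_DID, "Transcript", "1.2",
    ["first_name", "last_name", "degree", "status", "year", "average", "ssn"],
)
job_certificate_schema_id = publish_schema(
    toy_ledger, GOVERNMENT_DID, "Job-Certificate", "0.2",
    ["first_name", "last_name", "salary", "employee_status", "experience"],
)

# Anyone can read the schemas back from the ledger.
for schema in toy_ledger.values():
    print(schema["name"], schema["attr_names"])
```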
Step 2: Faber College and Acme Corp create their own credential definitions
The schemas defined in Step 1 are useful to any college and any company that adopts them. In our example, Faber College and Acme Corp create their own credential definitions based on the schemas issued by Government.
A credential definition contains the schema it uses, plus the necessary information about the issuer of the credential definition.
Faber College issues the “Faber Transcript Credential Definition”, which is based on the “Transcript Schema”, and records it onto the ledger. Similarly, Acme Corp issues the “Acme Job-Certificate Credential Definition”, based on the “Job-Certificate Schema”, and records it onto the ledger.
Since both of them are now recorded on the ledger, everyone can access these credential definitions later.
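The relationship between a schema and a credential definition can be sketched as follows. Again this is a toy model: make_cred_def, the id format, the tag and the public-key placeholder are all hypothetical, and “Other College” is an invented second issuer used only to show that one schema can back many credential definitions.

```python
transcript_schema = {
    "id": "government-verinym-did:Transcript:1.2",  # placeholder id
    "attr_names": ["first_name", "last_name", "degree", "status",
                   "year", "average", "ssn"],
}


def make_cred_def(issuer_did, schema, tag):
    """Bind a schema to one issuer; this record is what goes on the ledger."""
    return {
        "id": f"{issuer_did}|{schema['id']}|{tag}",  # hypothetical id format
        "schema_id": schema["id"],
        "issuer_did": issuer_did,
        "tag": tag,
        # Placeholder for the issuer's public key material, which verifiers
        # later use to check that a proof traces back to this issuer.
        "public_key": f"<public key of {issuer_did}>",
    }


faber_cred_def = make_cred_def("faber-verinym-did", transcript_schema, "TAG1")
other_cred_def = make_cred_def("other-college-did", transcript_schema, "TAG1")

# Same schema, different issuers, therefore different credential definitions.
print(faber_cred_def["id"])
print(other_cred_def["id"])
```

This is why a verifier can ask not just for “a transcript” but for a transcript issued under a specific credential definition.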
In the first case, Alice is getting the transcript (credential) from Faber College, and later provides the credential to Acme Corp as proof of education qualification, when she applies for a job.
Step 3: Alice obtains a Credential from Faber College regarding the transcript
Here are the detailed steps:
- Build a connection between Faber College and Alice (onboarding process).
- Faber College creates and sends a Credential Offer to Alice.
- Alice retrieves the “Faber Transcript Credential Definition” from the ledger, and creates a Credential Request and sends it to Faber College.
- Faber College creates the Credential for Alice. The Credential contains the values of the items listed in the “Faber Transcript Credential Definition” (and Transcript Schema), plus the cryptographic material Alice needs to build a proof later when requested by Acme.
- Alice now receives the Credential and stores it in her wallet.
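The offer, request and issue exchange above can be sketched as three small functions. This is a toy model of the message flow only: there is no cryptography, the signature is a placeholder string, the function names and ids are hypothetical, and Alice’s attribute values are sample data for illustration (only the average of 5 is taken from this article).

```python
import secrets

CRED_DEF_ID = "faber-verinym-did|Transcript|TAG1"  # hypothetical id
toy_ledger = {
    CRED_DEF_ID: {
        "id": CRED_DEF_ID,
        "attr_names": ["first_name", "last_name", "degree", "status",
                       "year", "average", "ssn"],
    }
}


def create_offer(cred_def_id):
    """Issuer side: offer a credential under a given credential definition."""
    return {"cred_def_id": cred_def_id, "nonce": secrets.token_hex(8)}


def create_request(offer, prover_did, ledger):
    """Holder side: look up the credential definition, then answer the offer."""
    cred_def = ledger[offer["cred_def_id"]]
    return {"cred_def_id": cred_def["id"], "nonce": offer["nonce"],
            "prover_did": prover_did}


def issue_credential(offer, request, values, ledger):
    """Issuer side: check the request matches the offer, then fill in values."""
    if (request["nonce"], request["cred_def_id"]) != (offer["nonce"], offer["cred_def_id"]):
        raise ValueError("request does not answer this offer")
    attr_names = ledger[offer["cred_def_id"]]["attr_names"]
    missing = set(attr_names) - set(values)
    if missing:
        raise ValueError(f"missing attribute values: {sorted(missing)}")
    return {
        "cred_def_id": offer["cred_def_id"],
        "values": {name: values[name] for name in attr_names},
        "signature": "<issuer signature placeholder>",
    }


alice_wallet = {"credentials": []}
offer = create_offer(CRED_DEF_ID)                                 # Faber -> Alice
request = create_request(offer, "alice-faber-did", toy_ledger)    # Alice -> Faber
credential = issue_credential(offer, request, {                   # Faber -> Alice
    "first_name": "Alice", "last_name": "Garcia", "degree": "Bachelor of Science",
    "status": "graduated", "year": 2015, "average": 5, "ssn": "123-45-6789",
}, toy_ledger)
alice_wallet["credentials"].append(credential)  # Alice stores it in her wallet
print(len(alice_wallet["credentials"]), "credential(s) in wallet")
```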
Step 4: Acme Corp requests proof of education level from Alice
Here are the detailed steps:
- Build a connection between Acme Corp and Alice (onboarding process)
- Acme Corp creates a Proof Request, which lists the items and the conditions required. In this case Acme requires proof of degree, status and ssn from Faber College, and proof that the average is over 4.
- Alice receives this Proof Request, and creates a Proof based on the credential she obtains from Faber College. The Proof contains information such that the requirement of Acme’s Proof Request can be satisfied.
- Acme Corp receives the Proof from Alice. Inside the Proof, Acme Corp sees the information and condition required, and verifies that they are coming from Faber College.
- Acme Corp accepts this Proof.
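The difference between a revealed attribute and a condition (predicate) is easier to see in code. The sketch below models only the shape of the data: the keys requested_attributes, requested_predicates, p_type and p_value follow my understanding of Indy’s proof request layout, the “>” operator follows the “over 4” wording here (the walkthrough may use a different operator such as “>=”), and the functions are hypothetical. Most importantly, there is no zero-knowledge cryptography here; in Indy each item in the proof is backed by cryptographic evidence tied to the issuer, whereas this toy simply trusts the prover.

```python
import operator

COMPARATORS = {">": operator.gt, ">=": operator.ge,
               "<": operator.lt, "<=": operator.le}

proof_request = {
    "name": "Job-Application",
    "requested_attributes": {
        "attr1_referent": {"name": "degree"},
        "attr2_referent": {"name": "status"},
        "attr3_referent": {"name": "ssn"},
    },
    "requested_predicates": {
        "predicate1_referent": {"name": "average", "p_type": ">", "p_value": 4},
    },
}

# Sample credential values for illustration.
alice_transcript = {
    "first_name": "Alice", "last_name": "Garcia", "degree": "Bachelor of Science",
    "status": "graduated", "year": 2015, "average": 5, "ssn": "123-45-6789",
}


def build_toy_proof(request, credential):
    """Prover side: reveal requested values, answer predicates with a flag."""
    revealed = {ref: credential[item["name"]]
                for ref, item in request["requested_attributes"].items()}
    predicates = {ref: COMPARATORS[item["p_type"]](credential[item["name"]], item["p_value"])
                  for ref, item in request["requested_predicates"].items()}
    return {"revealed_attrs": revealed, "predicates": predicates}


def verify_toy_proof(request, proof):
    """Verifier side: every requested item is answered, every predicate holds."""
    attrs_ok = set(proof["revealed_attrs"]) == set(request["requested_attributes"])
    preds_ok = (set(proof["predicates"]) == set(request["requested_predicates"])
                and all(proof["predicates"].values()))
    return attrs_ok and preds_ok


proof = build_toy_proof(proof_request, alice_transcript)
print(proof)
print("accepted:", verify_toy_proof(proof_request, proof))
```

Acme learns the degree, status and ssn, and learns that the condition on the average holds, but the average itself never appears in the proof.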
In the second case, Alice is getting the job-certificate (credential) from Acme, and provides the credential to Thrift Bank as proof of employment, when she applies for a car loan. Also a KYC (Know Your Customer) process requires her name as well. Alice uses the credential issued by Faber College to fulfil this process.
Step 5: Alice obtains a Credential from Acme Corp regarding the employment status
Here are the detailed steps:
- Acme Corp creates and sends a Credential Offer to Alice.
- Alice retrieves the “Acme Job-Certificate Credential Definition” from the ledger, and creates and sends a Credential Request to Acme Corp.
- Acme Corp creates the Credential for Alice. The Credential contains the values of the items listed in the “Acme Job-Certificate Credential Definition” (and Job-Certificate Schema), plus the cryptographic material Alice needs to build a proof later when requested by the bank.
- Alice now receives the Credential and stores it in her wallet. Now she has two credentials in the wallet.
Step 6: Thrift Bank requests proof of employment status and KYC from Alice
Here are the detailed steps:
- Build a connection between Thrift Bank and Alice (onboarding process).
- Thrift Bank creates two Proof Requests, which list the items and conditions required. In this case Thrift Bank requires proof that her employment status is Permanent, that her salary is over 2,000 and that she has more than 1 year of experience. As part of the KYC process she is also asked for her name and SSN.
- Alice receives both Proof Requests, and creates Proofs based on the credentials she obtained from Faber College and Acme Corp. The Proofs contain the information needed to satisfy Thrift Bank’s Proof Requests.
- Thrift Bank receives the Proofs from Alice. Inside the Proofs, Thrift Bank sees the information and conditions required, and verifies that they come from Faber College and Acme Corp.
- Thrift Bank accepts these Proofs.
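With two credentials in the wallet, Alice has to decide which credential answers which requested item. A minimal sketch of that selection step follows, assuming each requested item may carry a restrictions list naming an acceptable credential definition. The ids, the select_credential function and the sample values are hypothetical, and the restriction on the KYC items is my own illustration rather than something taken from the walkthrough.

```python
FABER_CRED_DEF = "faber-verinym-did|Transcript|TAG1"       # hypothetical ids
ACME_CRED_DEF = "acme-verinym-did|Job-Certificate|TAG1"

# Alice's wallet after Step 5, with sample values for illustration.
wallet = {
    ACME_CRED_DEF: {"first_name": "Alice", "last_name": "Garcia",
                    "employee_status": "Permanent", "salary": 2400,
                    "experience": 10},
    FABER_CRED_DEF: {"first_name": "Alice", "last_name": "Garcia",
                     "degree": "Bachelor of Science", "status": "graduated",
                     "year": 2015, "average": 5, "ssn": "123-45-6789"},
}

loan_basic_items = [
    {"name": "employee_status", "restrictions": [{"cred_def_id": ACME_CRED_DEF}]},
    {"name": "salary", "restrictions": [{"cred_def_id": ACME_CRED_DEF}]},
    {"name": "experience", "restrictions": [{"cred_def_id": ACME_CRED_DEF}]},
]
loan_kyc_items = [
    {"name": "first_name", "restrictions": [{"cred_def_id": FABER_CRED_DEF}]},
    {"name": "last_name", "restrictions": [{"cred_def_id": FABER_CRED_DEF}]},
    {"name": "ssn"},  # no restriction: any credential holding an ssn will do
]


def select_credential(item, wallet):
    """Return the id of the first credential that may answer this item."""
    allowed = [r["cred_def_id"] for r in item.get("restrictions", [])]
    for cred_def_id, values in wallet.items():
        if allowed and cred_def_id not in allowed:
            continue
        if item["name"] in values:
            return cred_def_id
    raise LookupError(f"no credential can answer {item['name']!r}")


for item in loan_basic_items + loan_kyc_items:
    print(item["name"], "<-", select_credential(item, wallet))
```

The first_name attribute exists in both credentials, so the restriction is what steers that item to the Faber College credential.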
What is in the Ledger?
If we walk through the whole example, the items recorded in the ledger are
- All Decentralized Identifiers (DIDs), both verinyms and pairwise-unique DIDs, and their corresponding verification keys (verkeys)
- Schemas, which define the structure of the items referred to by both credential definitions and credentials
- Credential Definitions, each built on top of a schema, plus the issuer’s information for proof creation
All these items are publicly accessible, and are therefore NOT secret at all to any organization or individual.
These items in the ledger serve as a good source of trust when constructing any proof later. We keep seeing these items being queried during the whole process.
Personally Identifiable Information (PII) Not in the Ledger
Then where is the PII? Over the course of the story, we see some PII for Alice, such as her name, employment status, salary, year of graduation, etc. However, this information is never exposed in the ledger, and is therefore not readily accessible to anyone. This information is communicated via peer connections, such as Alice-to-Faber, Alice-to-Acme, and Alice-to-Thrift. These connections are secured through authenticated encryption, and the information shared over them is known only to the two parties. Therefore no PII is disclosed to the public, while the public information (in the ledger) provides strong trust in the proof.
So how is Alice identified in this network? First, there is no single party that holds all of Alice’s information. In fact, Alice uses a different DID for each organization she communicates with.
A secure connection is first established between Alice and an organization. This secure connection is built on a pairwise-unique DID, and that DID is only used for that connection.
If we open Alice’s wallet and see how many DIDs she has after going through the example, we see there are THREE, one for each organization she has communicated with (Faber College, Acme Corp and Thrift Bank). As mentioned above, they are different DIDs, and all three are recorded in the ledger with their corresponding verkeys. Nothing else about Alice is seen in the ledger.
Alice’s attributes are only disclosed to, and kept by, the organizations requiring proof, on an as-needed basis. If the overall network is well designed, organizations do not keep more information than needed, and by leveraging proofs of conditions, the information stored in organizations is further reduced. For example, Acme Corp knows that Alice’s average is over 4, but does not know the actual figure. Hence Acme does not store Alice’s actual average (which is 5 in the example).
SSI is a very interesting topic. I hope this article provides a first step towards understanding what SSI is, using indy-dev as a real-life example. Those who are interested can go to the indy repository, or begin with some good videos from SSI Meetup.