Why did the PGP Web of Trust fail?

Following up on a proposal for an Institutional Web of Trust, Bryan Ford replied in a short Tweet with a question that will be at the back of many cyber security specialists’ minds on reading the words “Web of Trust”.

Discussing this on the Gitter SoLiD (Social Linked Data) Channel, Tim Berners-Lee suggested that (typos corrected):

[It] would be interesting to review PGP and come up with a list of reasons why it didn’t take off more seriously.

Having started thinking about it, he listed a number of points that could have stood in the way of PGP’s success, which I will render here without the channel noise:

Some reasons why PGP didn’t take off:
• it wasn’t bundled within Mac mail etc
• the workflow was horrible in many places like “you need to do x” in a dialogue box but no link to [a button to or an explanation on how to] do x
• you could with a tedious multi-step process download someone’s key and sign it and upload it but never explained the semantics of the action. “Do you vouch for this person being x” or something like “why do you say that?” Confirm. Would maybe have been [better had it] spoken the user’s language
• exercise: design the GPGTools key management using Facebook vocabulary
• it always breaks with each new OS and you have to wait for the plugins to catch-up
• just as PKI forced users to adopt a hierarchical trust model, GPG forced users to adopt a peer-to-peer key signing party model when the code would also have been able to support hierarchical trust models where appropriate

Note that in trying to assess the security of a tool or protocol, Tim does not just look at the strength of the cryptography but at the whole process of using it: how well it is integrated into the user experience, how a novice gets going, how helpful the dialogues are, how one works with it across devices, what kind of data structure is used (hierarchical, p2p or both), whether that data structure allows one to explain how one came to a conclusion, …

The UI questions depend on the technical ones, because if the technical side is not rich enough then any amount of UI improvement will be stymied by protocol or data structure limitations. On the other hand, a bad User Interface, or should we say Human Interface, will start introducing bugs at the psychological layer and open the door to social engineering attacks.

There is bound to be a whole literature on this. But it would be helpful to know if there is a widely recognised document covering this. Failing that we need to work from the high level concepts defined in the PGP specs, to see if we can deduce some limitations to that Web of Trust model in particular, since that was the specific question asked by Bryan Ford.

1. The PGP Certificate

The PGP Web of Trust is based on public key certificates, a binary format described in RFC 4880 §11.1 “Transferable Public Keys”. A certificate consists of a public key followed by a number of User IDs and User Attributes, each of which may be signed by a number of keys. Each signing key supposedly belongs to some other person who, by signing with their private key, vouches that they verified that attribute.

This format is very strict and quite limited in its expressiveness. To show this let us look very quickly at the specification, in order to then make clearer why it may be worth moving on to something a lot better…

The User ID packet with Tag 13 is defined in RFC 4880 §5.11:

A User ID packet consists of UTF-8 text that is intended to represent the name and email address of the key holder.  By convention, it includes an RFC2822 mail name-addr, but there are no restrictions on its content.  The packet length in the header specifies the length of the User ID.

The User Attribute packet with Tag 14 is defined in §5.12. It is made up of attribute subpackets, each of which starts with a subpacket length of 1, 2 or 5 octets and a subpacket type of 1 octet. That one-octet type is a hard-coded limit of 256 on the number of attribute types. The only attribute that is defined is the image attribute, numbered 1.
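The header layout just described can be sketched in a few lines of Python. This is only an illustration: it assumes the length octets use the same encoding that RFC 4880 §5.2.3.1 gives for signature subpackets, and that the length counts the type octet. The function name `parse_attribute_subpacket` is hypothetical and not taken from any PGP library.

```python
def parse_attribute_subpacket(data: bytes, offset: int = 0):
    """Read one attribute subpacket starting at `offset`.

    Returns (subpacket_type, body, next_offset).
    Assumption: length encoding as in RFC 4880 section 5.2.3.1,
    with the length covering the type octet plus the body.
    """
    first = data[offset]
    if first < 192:
        # One-octet length
        length, header_size = first, 1
    elif first < 255:
        # Two-octet length
        length = ((first - 192) << 8) + data[offset + 1] + 192
        header_size = 2
    else:
        # Five-octet length: 0xFF followed by a 4-octet big-endian number
        length = int.from_bytes(data[offset + 1:offset + 5], "big")
        header_size = 5

    type_pos = offset + header_size
    subpacket_type = data[type_pos]              # a single octet: 0..255
    body = data[type_pos + 1:type_pos + length]  # length includes the type octet
    return subpacket_type, body, type_pos + length


# A made-up subpacket: length 4, type 1 (image), three octets of body
subpacket_type, body, next_offset = parse_attribute_subpacket(bytes([4, 1]) + b"abc")
```

Whatever the body contains, the type is always read from one octet, which is where the ceiling of 256 attribute types comes from.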

It is clear from this that there is something very brittle and limited about the PGP format. It is also very likely that different software vendors use an attribute number to mean something different when the attribute is not 1, since extending the format would require going through a centralised process to agree on the meaning of one of the remaining numbers and on the format that the attribute value could take.

Here it is clear that a lot would be gained for PGP to move to an extensible format such as that worked on by the Verifiable Claims Working Group at the W3C. This comes with a JSON serialisation and because it also has a JSON-LD one (LD, for Linked Data) it brings with it 20 years of standardisation efforts organised through the W3C with universities and companies worldwide on query languages, reasoning, publication, and much more, all part of the semantic web stack, which is designed to be syntax agnostic. That means it would be easy to also produce a binary representation using for example RDF HDT (Header Dictionary Triples), which may feel a lot more comfortable for people coming from the crypto space.

Note that both the PGP and the Verifiable Claims syntaxes are algebraic structures (examples of such are lists, trees and graphs), which is another way of saying that they are constituted of parts and have well-defined identity criteria. They are serialisable things which can be signed, and so can be copied and distributed over the net. They are immutable.
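One way to picture what “well-defined identity criteria” means for a serialisable structure is to derive an identifier from its canonical serialisation. The sketch below is an assumption-laden illustration only: it uses sorted-key JSON and SHA-256 from the Python standard library, neither of which is what PGP or the Verifiable Claims work prescribes, and `content_id` is a hypothetical helper name.

```python
import hashlib
import json


def content_id(structure) -> str:
    """Hash a canonical serialisation of a nested dict/list structure.

    Illustrative assumption: sorted-key JSON stands in for a real
    canonicalisation scheme.
    """
    canonical = json.dumps(structure, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two structures built in a different order but with the same parts
claim_a = {"subject": "P2", "attribute": {"name": "Bruno"}}
claim_b = {"attribute": {"name": "Bruno"}, "subject": "P2"}

same_identity = content_id(claim_a) == content_id(claim_b)

# Changing any part yields a different structure, and so a different identifier
changed = content_id({"subject": "P2", "attribute": {"name": "Henry"}})
```

Because the identifier depends only on the parts, a copy made anywhere on the net is the same structure, and a signature over it remains checkable.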

2. Key Signing Parties

Image of the Key Signing Event at a Key Signing Party (Picture from Wikipedia)

Based on the above certificate format it is easy to understand what a Key Signing Party is about. Since few people have that large a circle of technologically savvy friends, and since the aim is to grow one’s network of verified keys beyond one’s local circle, the idea soon emerged to organise parties where people could meet and sign each other’s attributes, whilst giving them an opportunity to talk about cryptography and socialise.

At the party people can check each other’s IDs or attributes by verifying that the name, e-mail or photo on the certificate matches the person in front of them, which then allows them to sign the attribute later. This requires doing things like comparing the picture in the certificate with the person’s face, verifying the name in an official document such as a passport (by verifying that the passport is valid and the photo in the passport matches the person’s face) or verifying their e-mail address.

There is a Keysigning Party Guide and a HOWTO, but neither of them mentions how to verify the e-mail address or the photo, perhaps because verifying the photo would require the party organisers to print it out, which would have been costly until recently. Verifying the e-mail address would require an extra step after the meeting, such as having the person whose e-mail address is being verified send an encrypted or signed e-mail from that address to the verifier.

If the only thing that is verified is the name, then that is well known to be ambiguous: a number of people can have the same name. The procedure also assumes that people can recognise fake passports or driver’s licenses. Yet with modern technology these are getting easier to fake, and most people have no idea how to verify their own country’s passports, let alone foreign ones. How would someone even be able to recognise a name in a foreign passport using a different alphabet? But if we are to have international cooperation we need to have links between people of different nations, as we often do in software communities that span the globe.

For cross-regional links to grow one would need people who can play the role of linkers between countries or between localities. It was thought that key signing parties at international conferences could help there… But at some point the idea of a Trusted Third Party tends to appear, which means that one reverts to relying on institutions which one was in any case already relying on, since it is those that give out passports, driving licenses, …

3. The PGP Web of Trust (WoT)

Key Signing parties help grow the number of people each individual has verified personally. But the idea behind the PGP Web of Trust is that one can then rely on the set of signatures between people’s attributes to form a web of links, which should allow one to grow one’s own network and trusted attributes indirectly. If Jane signed that the name of the person with public key P2 was “Bruno”, and the person with public key P2 signed that the name of the person with public key P3 was “Henry”, then Jane can have some confidence in attributing the name “Henry” to the person with public key P3.
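The Jane, Bruno and Henry example amounts to a bounded path search over a graph of signatures. Here is a minimal sketch of that idea, not of how any PGP implementation computes trust. It assumes Jane’s own key is called P1 (the text does not name it), and `find_chain`, `max_hops` and the dictionary layout are all hypothetical.

```python
from collections import deque

# signer key -> list of (signed key, attribute the signer vouched for)
# "P1" standing for Jane's key is an assumption made for this sketch.
signatures = {
    "P1": [("P2", "name=Bruno")],
    "P2": [("P3", "name=Henry")],
}


def find_chain(signatures, start, target, max_hops):
    """Breadth-first search for a chain of signatures from start to target.

    Returns the list of keys on the shortest chain, or None if no chain
    of at most `max_hops` signatures exists.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        if len(path) - 1 >= max_hops:
            continue  # this path already uses the allowed number of hops
        for signed_key, _attribute in signatures.get(path[-1], []):
            if signed_key not in seen:
                seen.add(signed_key)
                queue.append(path + [signed_key])
    return None


chain = find_chain(signatures, "P1", "P3", max_hops=2)
too_short = find_chain(signatures, "P1", "P3", max_hops=1)
```

With two hops allowed the search finds the chain through P2; with one hop allowed it finds nothing, which is the knob a cautious user would want to keep small.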

When a small number of hops are involved this can work quite well. And this is indeed how we work with centralised Social Networks such as Facebook or LinkedIn: we have a bit more confidence that we know the attributes of someone because they are linked to someone we know. If I find out that someone does not have the claimed attribute I can alert the people who have signed that attribute so that they can retract their signatures by releasing a revocation statement.

Revocation statements are needed because PGP works with signed documents, statements that cannot really be deleted for certain. But given that there is no centralised repository, one needs to be able to find statements and revocations around the internet. Every time one looks for a statement one also has to look for a revocation of it. PGP resolved that problem by developing key servers, and in order to avoid all key servers having to synchronise all the keys, statements and revocations, the certificates allow for a preferred keyserver to be specified (§5.2.3.18 Preferred Key Server).

But if we want the PGP network to grow beyond a few hops we need to be a lot more precise as to how we come to believe that someone has an attribute, since if PGP or its successor is successful we would have all the world in our network, and so all the crooks too. The PGP certificate format only allows one type of relation between two keys: that someone signed someone else’s attribute, i.e. a statement of the form “P1 signs that P2 has attribute A”. But the fact that I know your name does not make you a good name verifier. For someone who trusted me to rely on you for verifying other people’s names, I would need to sign you as having at least the attribute of being a good name verifier too. And also perhaps of having the attribute of being honest, which is not something that is in any way easy to verify, and so would be suspicious if it appeared anywhere. But people relying on me would need to know that I was a good name verifier too. It may seem like a basic skill of all humans to be able to verify names, though I doubt it, not just because people have very different skills but also because of time limitations. But things obviously get a lot more complicated for any other skill, such as driving ability, mathematical ability, etc…
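The distinction between “I signed your name” and “I vouch for you as a verifier of names” can be made concrete with a small sketch. It is a thought experiment rather than anything PGP offers: the triple layout, the attribute string `good-name-verifier` and the function `accepts_name` are all hypothetical, and only a single intermediate hop is considered.

```python
def accepts_name(claims, truster, subject, name,
                 verifier_attr="good-name-verifier"):
    """Decide whether `truster` should accept `name` for `subject`.

    `claims` is a set of (signer, subject, attribute) triples.
    The name is accepted if the truster signed it directly, or if it was
    signed by someone the truster explicitly certified as a name verifier.
    """
    wanted = f"name={name}"
    if (truster, subject, wanted) in claims:
        return True
    for signer, signed, attribute in claims:
        if signed == subject and attribute == wanted:
            if (truster, signer, verifier_attr) in claims:
                return True
    return False


# I know your name, and you signed that P3 is called Henry
claims = {
    ("me", "you", "name=Alice"),
    ("you", "P3", "name=Henry"),
}
before = accepts_name(claims, "me", "P3", "Henry")

# Only once I also certify you as a good name verifier does the chain count
claims_with_skill = claims | {("me", "you", "good-name-verifier")}
after = accepts_name(claims_with_skill, "me", "P3", "Henry")
```

Knowing your name alone gives no grounds to accept what you say about P3; the extra statement about your verification skill is what the PGP certificate format has no place for.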

4. Limitations of the PGP WoT

4.1 Each Attribute requires a different verification skill

We already saw that the number of attributes on a PGP certificate is very limited, and that only two are widely used: the name and e-mail pair, and the picture. And just with those two we noticed that each type of attribute requires a different verification skill. An attribute for names requires identity document verification skills, something the passport office is very good at. Identifying photos would require having good facial recognition skills, which not everyone has, as well as having a good quality picture, and not being confronted with an identical twin.

Were one to extend these certificates with an extensible format such as the W3C verifiable claims format then one could end up with any number of new attributes. Say that we have a driving ability attribute: verifying that would require having driving verification skills, which the Department of Motor Vehicles has. Or say we were to add the birth date of someone, then verifying that would require checking birth certificate validity and then how does one verify that the big bearded person in front of one was that cute little baby? Someone has a degree in Mathematics? That would perhaps be best for a University to verify. We may know that our neighbor is a good baker, because we can taste the bread he makes. But even there some people are very poor judges.

So the skills of the person verifying a property and the legal obligations that person is under in making a claim need to be taken into consideration when assessing the verification of a skill. That clearly is not part of the PGP WoT infrastructure. That is what an institutional web of trust would bring, since institutions are social knowledge machines.

Illustration of an Institutional Web of Trust with explanatory text here.

4.2 The missing Institutions

Knowing the name of someone is not what we are always, or perhaps even mostly, concerned with. If we try to book a holiday on the other side of the world, we don’t care at all what the hotel owner’s name is, or indeed what his e-mail address is, or whether it is that person who is sending us the e-mail. What we care about is that we are dealing with a hotel at that location in that country, bound by its country’s laws and in relation with our embassy. We care that the web server of the hotel we are looking at is the one of the hotel and not a fake one. So a pure peer to peer network between me and the hotel owner is not what we are after. We are interested rather in whether the hotel as a business is the institution we think it is. As explained in our proposal for an institutional Web of Trust, that means that there is a peer to peer relation between the two countries and a hierarchical one between me as a citizen of mine and the hotel as a member organisation of the other.

5. Conclusion

The hyper-data web of institutional trust is a web of trust in the sense of “web” in world wide web, and in semantic web. It is a declarative, reference-based web of relations based on URLs (R for Resource) that can cross institutional boundaries, on resources that can be created, changed and deleted. In my reply to Bryan Ford I go into the differences between this co-algebraic, state-based (as in REST: Representational State Transfer) web and the algebraic, PGP-inspired one that is only concerned with signed documents. But I also show how they are complementary.

As the PGP movement discovers and perhaps even adopts the much richer Verifiable Claims data structure, it will find it necessary to work within a web of trusted institutions, since much of the time the appropriate way to deem a claim valid will be for us to know that it was produced by an institution that has the procedures in place to be able to make verified claims. In a globalised environment, we can only do that through a number of people cooperating within and across institutions.