Usability of crypto software

A review of the literature

The security community has a problem, and we all know it. Too often, the people we wish were using our software can’t figure it out. Over the last ten years or so, multiple usability studies have been performed on common crypto systems like PGP and SSL. The results have been dire. We ignore them at our peril.

This short article reviews a few of the best-known papers.


In the now-notorious 1999 study “Why Johnny can’t encrypt”, a series of volunteers were asked to step through a sequence of tasks using PGP 5.0, then considered a good example of well-designed security software. Quoting from the paper’s abstract:

The analysis found a number of user interface design flaws that may contribute to security failures, and the user test demonstrated that when our test participants were given 90 minutes in which to sign and encrypt a message using PGP 5.0, the majority of them were unable to do so successfully.

Given that PGP’s marketing literature at the time emphasised the effort that had gone into its design, you’d think this embarrassing study would have prompted serious reform. But you’d be wrong. In a 2006 follow-up study called “Why Johnny still can’t encrypt”, researchers discovered that PGP 9.0, despite many changes since version 5.0, still confused people:

We found that key verification and signing is still severely lacking, such that no user was able to successfully verify their keys. Similar to PGP 5, users had difficulty with signing keys. Three of our users were not able to verify the validity of the key successfully and did not understand the reasoning to do so. Four users were not able to sign the key, these users attempted to but struggled with the interface. They did not understand that in order to ‘verify,’ they must ‘sign’ the key rather than just click ‘verify’ … Digital signing of messages is more problematic in PGP 9 than PGP 5 as none of the users were able to sign messages using PGP 9.
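For concreteness, the core task the participants kept failing at — encrypting a message to a recipient’s public key and signing it — corresponds to something like the following. This is a hedged sketch driven from Python via the modern GnuPG command line (GnuPG postdates the PGP versions studied, and alice@example.com is a placeholder for a key already in the keyring):

```python
# A minimal sketch of 'sign and encrypt a message' using the gpg CLI.
# GnuPG stands in for the PGP versions studied; the recipient address
# is a placeholder and must match a public key in the local keyring.
import subprocess

subprocess.run(
    ["gpg",
     "--encrypt",                         # encrypt to the recipient's public key
     "--sign",                            # and sign with our own secret key
     "--recipient", "alice@example.com",  # placeholder recipient
     "message.txt"],                      # writes message.txt.gpg
    check=True,                           # raise if gpg reports an error
)
```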

During the early ‘90s the so-called crypto wars raged between governments and mathematicians. Governments feared a new era of unreadable communications and the crime it would enable. Cryptographers feared government overreach and surveillance. History records that cryptographers won the crypto wars and eventually the FBI accepted the new world order.

But did governments really lose? Or did they simply lose interest?

At a talk in 2012, investigative journalist Duncan Campbell discussed a series of successful terrorism prosecutions and how encryption had been used:

The hijackers who attacked New York and Washington did not use encryption at all. On the 18th of September 2001, FBI Assistant Director Ronald L. Dick, who was head of the U.S. National Infrastructure Protection Center, told reporters at an FBI briefing that records of the Internet messages between the 19 hijackers had not involved any encryption or concealment. He said it was simple e-mails back and forth. That was it: no encryption. A decade of fear, the biggest plot ever, the biggest loss of life — no encryption. And exactly the same conclusion was reached in the large official 9/11 report. The hijackers had used regular e-mail services like Hotmail, they’d been taught to use simple codebook substitutions, such as calling the White House “The Faculty of Politics”, or the World Trade Center “The Faculty of Commerce”. They did not encrypt.

Campbell goes on to examine more than ten further terrorism cases. PGP surfaces in only one of them, a case involving cybercrime in aid of terrorist fundraising. So perhaps a more accurate interpretation of history is that governments simply stopped caring once it became clear just how terrible the cryptographers’ software really was. The Big Fear was that terrorists would encrypt their communications and become invisible to the authorities. That fear receded when it became apparent that PGP was so bad terrorists would literally rather die than use it.
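The “codebook substitution” Campbell describes is barely cryptography at all: just an agreed phrase list applied to otherwise ordinary e-mail. A toy sketch (the phrases are taken from the quoted examples; everything else is illustrative):

```python
# A toy illustration of codebook substitution: no keys, no ciphertext,
# just pre-agreed innocuous phrases swapped into plain messages.
CODEBOOK = {
    "the White House": "the Faculty of Politics",
    "the World Trade Center": "the Faculty of Commerce",
}

def encode(message: str) -> str:
    for plain, code in CODEBOOK.items():
        message = message.replace(plain, code)
    return message

print(encode("Visit the World Trade Center on Tuesday."))
# -> 'Visit the Faculty of Commerce on Tuesday.'
```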


SSL fared little better. Eye-tracking studies discovered that users did not notice browser security indicators. From the 2006 study “Why phishing works”:

We then assessed these hypotheses with a usability study in which 22 participants were shown 20 web sites and asked to determine which ones were fraudulent. We found that 23% of the participants did not look at browser-based cues such as the address bar, status bar and the security indicators, leading to incorrect choices 40% of the time. We also found that some visual deception attacks can fool even the most sophisticated users.

Even when users do notice security indicators, it turns out to be difficult to make them unforgeable because of the prevalence of overlapping windows in desktop operating systems:

In this usability study of phishing attacks and browser anti-phishing defenses, 27 users each classified 12 web sites as fraudulent or legitimate. By dividing these users into three groups, our controlled study measured both the effect of extended validation certificates that appear only at legitimate sites and the effect of reading a help file about security features in Internet Explorer 7. Across all groups, we found that picture-in-picture attacks showing a fake browser window were as effective as the best other phishing technique, the homograph attack. Extended validation did not help users identify either attack. Additionally, reading the help file made users more likely to classify both real and fake web sites as legitimate when the phishing warning did not appear.

This problem of overlapping security indicators has been repeatedly exploited by phishers trying to confuse people about the real identity they’re interacting with. Not only is the clever “picture of a web browser inside a web page” attack possible; phishers have also exploited the confusing and hostile format of URLs to their advantage, by creating sites with addresses like this:

http://www.bankofamerica.com.netsystem.cn/

Users would start reading left to right and stop when they reached .com, failing to notice that the address was really a subdomain of a Chinese website.
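The trick is easy to demonstrate: what matters in a hostname is the right-hand end, not the familiar-looking labels at the start. A minimal sketch using only Python’s standard library (the last-two-labels heuristic at the end is a simplification; real code must consult the Public Suffix List to handle suffixes like co.uk):

```python
from urllib.parse import urlsplit

url = "http://www.bankofamerica.com.netsystem.cn/"
host = urlsplit(url).hostname        # 'www.bankofamerica.com.netsystem.cn'

# A naive left-to-right check is fooled, just like the users were:
print("bankofamerica.com" in host)   # True -- but the bank doesn't own this host

# The controlling domain comes from the right-hand labels. (Simplified:
# production code needs the Public Suffix List for suffixes like co.uk.)
print(".".join(host.split(".")[-2:]))  # 'netsystem.cn' -- the real owner
```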

Another study found that error messages were also a weak point:

We evaluated warnings used in three popular web browsers and our two warnings in a 100-participant, between-subjects laboratory study. Our warnings performed significantly better than existing warnings, but far too many participants exhibited dangerous behavior in all warning conditions. Our results suggest that, while warnings can be improved, a better approach may be to minimize the use of SSL warnings altogether by blocking users from making unsafe connections and eliminating warnings in benign situations.

What’s interesting about these studies is that they often find that technically sophisticated users do no better than ordinary non-technical users.

Unlike in the PGP case, however, usability studies of SSL did prompt changes across the browser industry. The above papers led to many warnings becoming un-dismissable, the wording of the remaining warnings being clarified, and a decreased tolerance for self-signed certificates across the industry. Popup alerts were phased out. In January 2007, 62% of web servers had SSL setups that caused browser errors; in 2014 that figure is negligible. Browser security indicators also became more consistent across competing products, with the padlock icon moving inside the address bar and the EV “green bar” paradigm becoming standard UI, so users became better trained about where to look.
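What “decreased tolerance for self-signed certificates” means in practice is that clients now verify the certificate chain and hostname by default, and hard-fail otherwise. A minimal sketch with Python’s standard library (self-signed.badssl.com is a public test host serving a self-signed certificate; any such endpoint behaves the same):

```python
# A sketch of default-strict TLS verification: the handshake is aborted
# unless the server's chain terminates at a trusted CA and the hostname matches.
import socket
import ssl

HOST = "self-signed.badssl.com"      # public test host with a self-signed cert

ctx = ssl.create_default_context()   # verifies CA chain and hostname by default

try:
    with socket.create_connection((HOST, 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname=HOST):
            print("connected")       # not reached: verification fails first
except ssl.SSLCertVerificationError as err:
    print("rejected:", err)          # e.g. 'self-signed certificate'
```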