Strength of a password — the problem between theory and practice

Published in Tresorit Engineering, Aug 10, 2020

The strength of a password is an important measure of security for any system that uses password or PIN authentication. In this article, I will discuss the theoretical strength of a password and how it relates to the password's entropy. Then I will compare the practical side with the theoretical approach and highlight the big gap between the two. This is an important prerequisite for understanding how password authentication is attacked in practice, and what strategies we have to defend ourselves against those attacks.

A little bit of combinatorics

To understand the strength of a password, we need to go back to high school math and a bit of combinatorics. Basically, the strength of a password is the number of variations the attacker needs to try on average to find the password. If a password is chosen randomly, the attacker needs to go through half of all the existing variations on average. Let’s take an example: a 4-digit passcode. Don’t be confused, passcodes are basically just numerical passwords.

There are 10 variations for each digit. We can also say that the size of the valid symbol set is 10, because we have 0, 1, 2, 3, …, 9 as symbols. So, the total number of potential variations of the passcode is 10 x 10 x 10 x 10, or 10^4 = 10,000.

A brute-force attack means trying out all variations. If you can try out passcodes with a computer, 10,000 is not a lot: without any protection, all variations could be tried in a fraction of a second.

I wrote that, on average, the attacker needs to brute-force half of the variations. That is because the attacker might get lucky and find the passcode after 100 tries, or they might need to go through all variations. These cases even out: if the number of variations is 10,000, the attacker needs about 5,000 tries on average.
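The "half of the variations" claim can be checked with a small calculation. The sketch below assumes the passcode is uniformly random and that the attacker tries each candidate exactly once in some fixed order; the function name `average_guesses` is made up for this illustration.

```python
from fractions import Fraction


def average_guesses(variations: int) -> Fraction:
    """Exact mean number of tries for a uniformly random secret.

    Assumption: the attacker tries every candidate once, in a fixed order,
    so a secret at position k (1-based) is found on try number k.
    """
    total_tries = sum(range(1, variations + 1))  # 1 + 2 + ... + N
    return Fraction(total_tries, variations)     # equals (N + 1) / 2


print(float(average_guesses(10_000)))  # 5000.5
```

The exact average is (N + 1) / 2, which for 10,000 variations is 5,000.5, so "half of all variations" is a very close approximation.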

To strengthen this passcode, we could increase the length or the number of symbols. If we increase the length of the passcode to 6 digits, that would mean 10⁶ variations, or 1,000,000. If we instead allow English alphabet characters (lowercase only), then we add 26 more symbols. That means 36 symbols altogether (including digits), which would mean 36⁴ = 1,679,616 variations for a 4-character password. You can see that even though the 4-character password is shorter, it has more variations than the 6-digit passcode, so in this example the one including letters is stronger.

All in all, the strength of the password, in theory, can be calculated with the following formula:

Strength = Number of valid symbols^Length
or just simply
S = N^L
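As a quick illustration, the formula can be evaluated for the three examples discussed so far. This is a minimal sketch; the function name `variations` is chosen here for illustration only.

```python
def variations(symbols: int, length: int) -> int:
    """Number of possible passwords: S = N^L."""
    return symbols ** length


print(variations(10, 4))  # 4-digit passcode: 10000
print(variations(10, 6))  # 6-digit passcode: 1000000
print(variations(36, 4))  # 4 characters, lowercase letters + digits: 1679616
```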

The entropy of a password

From a security perspective, there is no real difference between 1,000,000 and 1,001,000 variations. Both numbers are of the same order of magnitude, while there is a significant difference between 10,000 (4-digit passcode) and 1,000,000 (6-digit passcode). For that reason, in practice, we use the magnitude instead of the exact number of variations for password strength, that is, the logarithm of the number of variations. We could use a base-10 logarithm (e.g. log₁₀(10,000) = 4), but the industry prefers the binary logarithm. This number is called the entropy (H) and is measured in bits.

H = log₂(Variations of passwords) = log₂ (N^L)

Why? First, we want to show off our nerdiness to the world: we work with computers, so our brains also use only 0s and 1s. Second, for numbers stored in a computer, base 2 is much easier to use. Third, this number indicates what the length of an equally strong, binary-only password would be.

Ha = log₂(10⁴)= log₂(10,000)=13.29 bits
Hb = log₂(10⁶)= log₂(1,000,000)=19.93 bits
Hc = log₂(36⁴)= log₂(1,679,616)=20.68 bits
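The three values above can be reproduced with a few lines of Python. This is a sketch using the identity log₂(N^L) = L · log₂(N); the function name `entropy_bits` is an illustrative choice.

```python
import math


def entropy_bits(symbols: int, length: int) -> float:
    """Entropy of a uniformly random password: log2(N^L) = L * log2(N)."""
    return length * math.log2(symbols)


examples = [
    ("Ha (4 digits)", 10, 4),
    ("Hb (6 digits)", 10, 6),
    ("Hc (4 chars, a-z and 0-9)", 36, 4),
]
for label, symbols, length in examples:
    print(f"{label}: {entropy_bits(symbols, length):.2f} bits")
# Ha (4 digits): 13.29 bits
# Hb (6 digits): 19.93 bits
# Hc (4 chars, a-z and 0-9): 20.68 bits
```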

There might be cases when the formula for calculating the variations of the password is not that simple. For example, the first character of the password might have to be a lowercase letter of the English alphabet (not a number), while the rest of the password can be lowercase letters or numbers. You calculate the entropy in the same way: take the binary logarithm of the number of potential variations. Sticking to the previous example of a 4-character password, this is log₂(26*36³).
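One way to handle such rules is to list how many symbols are allowed at each position and add up the per-position logarithms, since the logarithm of a product is the sum of the logarithms. The sketch below assumes each position is chosen independently and uniformly; `entropy_from_positions` is a name made up for this example.

```python
import math


def entropy_from_positions(symbols_per_position: list[int]) -> float:
    """Entropy in bits when each position has its own symbol set.

    Assumption: positions are independent and uniformly random, so
    log2(n1 * n2 * ... * nk) = log2(n1) + log2(n2) + ... + log2(nk).
    """
    return sum(math.log2(n) for n in symbols_per_position)


# First character: lowercase letter only (26); remaining three: letter or digit (36).
print(round(entropy_from_positions([26, 36, 36, 36]), 2))  # 20.21
```

The restriction on the first character costs about half a bit compared with the unrestricted 4-character case (20.68 bits).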

The problem in practice

Unfortunately, in practice, it works differently. The strength of a password is still the average number of variations an attacker has to try out before succeeding, but in reality, it cannot be easily calculated from the number of valid symbols and the potential length of the password. Why? Here is a big secret: average users don’t choose their passwords randomly.

The combinatorial formulas above could only be used for measuring the strength of a password if an average user were as likely to choose “%2tK2” as “apple” for a 5-character password. But that is never going to happen. Never.

By not choosing their passwords randomly, users lower the security of the system. But the question is, by how much? While there’s no simple answer, it’s definitely orders of magnitude. Here is an example: the Oxford English Dictionary has 171,476 words, the longest of which is 30 characters. If a user just picks any English word as a password, then the number of variations is 171,476 (note: we assume that a user picks “apple” with the same probability as “pseudopseudohypoparathyroidism”, which is a rather strong assumption about the average user’s vocabulary). If a user instead chose a random string of lowercase letters that is at most 30 characters long (meaning it can also be shorter), the number of variations is 26³⁰ + 26²⁹ + … + 26² + 26 = 2925726857336135756028965870800610381571030. Compared to 171,476, that is a pretty big difference.

To demonstrate how much larger this number is than the number of words in the dictionary, let’s imagine the observable universe as a giant ball. Let’s imagine an even bigger, super-giant ball which is 2 million times larger in diameter. The ratio of the super-giant ball’s diameter to the width of a human hair is the same as the ratio calculated above for the word-based password vs. randomly chosen password.

Working with such huge numbers makes life difficult, and that is why we use entropy, measured in bits, to compare the number of variations: the dictionary has 17.4 bits of entropy (log₂(171,476) = 17.4), while the randomly chosen string of at most 30 characters has 141.1 bits of entropy. A difference of 123.7 bits doesn’t seem like much, but remember the previous example about the width of a human hair and the super-giant ball: that is the magnitude of the difference.
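The comparison between the dictionary and random strings can be reproduced as follows. This sketch takes the word count quoted above at face value and assumes a 26-letter lowercase alphabet; the constant names are chosen for illustration.

```python
import math

DICTIONARY_WORDS = 171_476  # word count quoted in the article
ALPHABET_SIZE = 26          # lowercase English letters
MAX_LENGTH = 30

# All random strings of length 1..30: 26 + 26^2 + ... + 26^30
random_strings = sum(ALPHABET_SIZE ** n for n in range(1, MAX_LENGTH + 1))

dictionary_bits = math.log2(DICTIONARY_WORDS)
random_bits = math.log2(random_strings)  # math.log2 accepts arbitrarily large ints

print(f"dictionary word: {dictionary_bits:.1f} bits")                # 17.4 bits
print(f"random string:   {random_bits:.1f} bits")                    # 141.1 bits
print(f"difference:      {random_bits - dictionary_bits:.1f} bits")  # 123.7 bits
```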

You could say that users may use more complex passwords than just one word from a dictionary: they can add numbers (which are not entirely random), special characters, etc. While that doesn’t make the password as strong as a randomly generated one, it definitely strengthens it. But by how much?

Measuring password entropy in practice

The National Institute of Standards and Technology (NIST) issued a rule of thumb to estimate the entropy of a user-chosen password (Special Publication 800-63-2), based on the length of the password:

char 1 → 4 bits

char 2–8 → 2 bits (per char)

char 9–20 → 1.5 bits (per char)

char 21 and above → 1 bit (per char)

As an example, the entropy of an 8-character user-selected password is 4+2+2+2+2+2+2+2=18 bits, and the entropy of a 30-character user-selected password is 4 + 7×2 + 12×1.5 + 10×1 = 46 bits.
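The length-based table above translates directly into code. The sketch below implements only the per-character values quoted in this article and nothing else from the NIST publication; the function name `nist_entropy_estimate` is hypothetical.

```python
def nist_entropy_estimate(length: int) -> float:
    """Estimated entropy (bits) of a user-chosen password, by length only.

    Implements just the per-character table quoted in the article:
    char 1 -> 4 bits, chars 2-8 -> 2 bits, chars 9-20 -> 1.5 bits, 21+ -> 1 bit.
    """
    bits = 0.0
    for position in range(1, length + 1):
        if position == 1:
            bits += 4
        elif position <= 8:
            bits += 2
        elif position <= 20:
            bits += 1.5
        else:
            bits += 1
    return bits


print(nist_entropy_estimate(8))   # 18.0
print(nist_entropy_estimate(30))  # 46.0
```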

The NIST estimation is just one heuristic. It gives a high-level picture of users, but it doesn’t help in measuring one particular user’s password. Even NIST dropped this approach in 2017 and advised against using it to estimate the strength of one particular user’s password.

There are other heuristics for measuring the strength of a user-chosen password, which really measure the effort the user puts in. These heuristics look for words from a dictionary, even partial matches, check for simple numbers that might be birthdays, etc. A great example is the zxcvbn library. It’s worth noting that it uses an English dictionary by default, so non-English words will get a higher entropy estimate. If you want to use it with users from all around the globe, you need to add the extra dictionaries yourself.

[Figure: zxcvbn library compared with NIST estimation]

These low entropy numbers suggest that we cannot design a truly secure system based only on user passwords, and if we do not replace them, we are all doomed. Well, not entirely. One study found that an average user chooses a password with ~40 bits of entropy. While it’s not a big number, it’s definitely better than the numbers we see above.

If an attacker tries to brute-force a ~40-bit password at a rate of 100,000 passwords/second, it will still take about 64 days on average, that is, to go through half of all the combinations. If we can slow the attacker down to 1,000 passwords/second, the average rises to 17.4 years. If we slow it down even further, to 1 password/second, we end up with about 17,400 years. But if the attacker can try passwords faster, let’s say 100M passwords/second, then it will take only about 1.5 hours to crack the password.
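These times correspond to the average case, in which the attacker finds the password after searching half of the 2⁴⁰ combinations. The sketch below reproduces them under that assumption, using a 365.25-day year; the function name and the constants are illustrative choices.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY  # assumption: Julian year


def average_crack_seconds(entropy_bits: int, guesses_per_second: float) -> float:
    """Average brute-force time: half of the 2^H combinations at the given rate."""
    return (2 ** entropy_bits / 2) / guesses_per_second


print(f"{average_crack_seconds(40, 100_000) / SECONDS_PER_DAY:.1f} days")        # 63.6 days
print(f"{average_crack_seconds(40, 1_000) / SECONDS_PER_YEAR:.1f} years")        # 17.4 years
print(f"{average_crack_seconds(40, 1) / SECONDS_PER_YEAR:,.0f} years")           # 17,421 years
print(f"{average_crack_seconds(40, 100_000_000) / SECONDS_PER_HOUR:.1f} hours")  # 1.5 hours
```

Going through all 2⁴⁰ combinations (the worst case) takes twice as long in each scenario.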

The bottom line is: the attacker’s capabilities really do matter. Cracking a password in 90 minutes is an entirely different story than cracking it in 17,400 years. 40 bits of entropy is low, but we can work with it if we manage to slow the attacker down significantly. Before that, it is essential to understand how an attacker can reach extremely high guessing speeds against user-chosen passwords, because you need to know these tactics if you want to design a system that can resist the attack. But that is a tale for a different post.


Written by Istvan Lam