Cost-based Password Hashing

Brian Davis
8 min read · Aug 21, 2022
[Image: bank vault, photo by Jason Dent on Unsplash]

After spending some time at the password village at DEFCON 30, and after reading an article several weeks ago about a new vulnerability in both Intel and AMD processors that “can covertly leak password data and other sensitive material,” one thing caught my eye:

The exploit can extract kernel memory from AMD CPUs with a bandwidth of 3.9 kB per second. The researchers said that it’s capable of locating and leaking a Linux computer’s root password hash from physical memory in about 28 minutes when running the Intel CPUs and in about 6 minutes for AMD CPUs.

Unfortunately, even in 2022, there are 2 things we still suck at.

  • Creating secure passwords
  • Using secure password hashing algorithms

We have been trying for years to convince people to create secure passwords, but still, the most common passwords are qwerty, password, and 123456. I am not going to spend much time talking about creating strong passwords, since we have been preaching that for decades and will keep doing so. That's a topic for a whole other discussion. So I will just say this: use strong passwords that are unique for each site, use a trusted password manager, and ALWAYS set up Multi-factor Authentication whenever possible.

Instead, let’s talk about the password hashing algorithms that securely protect passwords on our systems.

Cost-Based Password Hashes

Almost all of the recommended password hashing algorithms include some kind of difficulty factor, or cost, to increase the resources needed to hash a password. The cost can take the form of additional memory, additional CPU, more parallel threads, or more iterations (rounds).

If we leverage the tuning options of these hashing algorithms, we can increase the difficulty of computing each hash. Passwords are stored using a one-way function: after the initial hash is created and stored, we validate a login by running the same process on the user-provided input and checking whether we get the same output. If the values match, we know we have the right password.

When we create the hash, we take the difficulty settings (memory, CPU, iterations, etc.), the salt, and the plaintext user input, and generate the hash from them. If we take a look at a stored hash, we can see the breakdown of this info.

$2b$10$KRGxLBS0LxevKhKyLRW1M.w0cqB396CYug0ztDrYZB6govAc62ggK

The $ is a delimiter to split apart the values.

  • 2b is the cipher type. In this case, 2 is the value for bcrypt and b is the version
  • 10 is the number of log rounds. (2¹⁰)
  • KRGxLBS0LxevKhKyLRW1M is the salt. (1234567812345678)
  • w0cqB396CYug0ztDrYZB6govAc62ggK is the hashed password. (random_password)

To validate that a password is correct, we take the user input (random_password), look up the value we have stored, and read the settings for that password (cipher, rounds, and salt) out of it. We then re-compute the hash from the user input and the stored settings and confirm it matches the value we have on record.
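
As a concrete sketch (this code is not from the article), here is roughly how that create-and-check flow looks with the Python bcrypt package; the password value is illustrative:

import bcrypt

password = b"random_password"

# gensalt(rounds=10) sets the log2 cost (2^10 iterations) and picks a random salt
stored = bcrypt.hashpw(password, bcrypt.gensalt(rounds=10))
print(stored)  # b"$2b$10$<22-char salt><31-char hash>"

# checkpw() reads the cost and salt back out of the stored value,
# re-computes the hash from the user input, and compares
assert bcrypt.checkpw(password, stored)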

We can increase the rounds, cost, or difficulty (depending on the hashing algorithm) to make the hash require more resources and more time to compute. Increasing the difficulty makes hashing more computationally expensive both for ourselves and for any potential attacker who gets a copy of the hashed value. The difference is the number of times the hash must be performed: a valid user hashes once per login, while an attacker has to hash every password they guess.

The goal is to make it easy enough that our systems can compute the password in an acceptable amount of time and resources, but difficult enough that others can not brute-force the hashes to find the matching passwords without substantially more resources.
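
For a rough sense of scale: if a single hash takes 100 ms, a legitimate user pays 100 ms once per login, but an attacker testing 10 billion candidate passwords against a stolen hash needs about 10⁹ seconds of compute, or roughly 30 years on comparable hardware.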

Linux passwords

Most Linux systems default to a sha512crypt password hash, but with a very low number of rounds (5000). Some newer distributions with PAM version 1.4.0 or greater default to yescrypt, which is a better hashing algorithm but still ships with a default cost setting of 5.

The default settings hash the password very quickly.

$ time echo "123456qwerty" | mkpasswd -m sha512crypt -S "poiuyt098765" -s -R 5000
$6$rounds=5000$poiuyt098765$5DI.fUNj3RZxXlNzjH9umbJLKTkbldWSrlYcCoOgfcdxlD5ZzRN9SHhn.dJ957uaDgI.Z2LEV7s8zpiGH050v/
real 0m0.015s
user 0m0.005s
sys 0m0.014s

However, if we increase the number of rounds we can increase the amount of time to hash a single password. This means we can fine-tune the amount of time that is acceptable based on the sensitivity of the system. The longer it takes for us to hash the password, the longer it takes for an attacker to hash the password.

$ time echo "123456qwerty" | mkpasswd -m sha512crypt -S "poiuyt098765" -s -R 3000000
$6$rounds=3000000$poiuyt098765$B4k3B6Iqsbi1SS.R1Cir/l/gV1dTMCoBALETK24A1qek2p.obaTYERHkrgqse5cubbXHPt5v1IfigM0/YzuTD1
real 0m0.934s
user 0m0.931s
sys 0m0.005s
$ time echo "123456qwerty" | mkpasswd -m yescrypt -s -R 11
$y$jFT$W8OiEtbvEP6i6zEdUr5vT.$KS6CpfRCq/b0cM0s8YmQvEjRiu0UbOP.j.p9KUAw8U2
real 0m0.915s
user 0m0.710s
sys 0m0.214s

This default can be set in Linux using PAM. The config lives in one of several places depending on the operating system and version:

  • Amazon Linux 2: /etc/pam.d/password-auth (sha512)
  • Ubuntu 22.10: /etc/pam.d/common-password (sha512)
  • Ubuntu Latest: /etc/pam.d/common-password (yescrypt)
  • Debian Buster: /etc/pam.d/common-password (sha512)
  • Debian Bullseye: /etc/pam.d/common-password (yescrypt)
  • Arch Linux: /etc/pam.d/system-auth (sha512)
password  required  pam_unix.so sha512 shadow nullok rounds=300000

or

password [success=1 default=ignore] pam_unix.so obscure yescrypt rounds=11

Updating the values and then resetting the password will cause it to be hashed using the new settings.
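
For example (the username and the elided shadow entry below are illustrative):

$ sudo passwd appuser
$ sudo grep appuser /etc/shadow
appuser:$6$rounds=300000$...:...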

Note: for yescrypt, the current limit for rounds is 11, with a default of 5 (logarithmic). If you set the number above 11, it will just fall back to 5.

Web Applications

For any custom applications, you should be using a secure hashing algorithm like Argon2 or scrypt.

Using the Python script below, we can run it with different settings to see how long hashing a password takes and how much it costs. We will start with the default settings, then update them and compare the differences. Only the PasswordHasher() arguments change between runs.

Note: if you want to test this locally, you will need the psutil and argon2-cffi Python packages installed.
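
The script itself is not reproduced in this post, so here is a minimal sketch of what hash_pass.py plausibly looks like; the exact memory and time reporting is an assumption:

# hash_pass.py -- minimal sketch; the reporting approach is assumed
import os
import time

import psutil
from argon2 import PasswordHasher

# Default settings: time_cost=3, memory_cost=65536 (64 MB), parallelism=4
ph = PasswordHasher()

def hash_pass(password: str) -> str:
    start = time.time()
    hashed = ph.hash(password)
    elapsed = time.time() - start
    # Resident memory of this process in MB, a rough proxy for the memory cost
    mem_mb = psutil.Process(os.getpid()).memory_info().rss / (1024 * 1024)
    print(hashed)
    print(f"hash_pass: memory used: {mem_mb}; exec time: {elapsed}")
    return hashed

if __name__ == "__main__":
    hash_pass("random_password")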

With the default settings, we can see that it creates a hash using argon2id with a memory_cost (m) of 65536, a time_cost (t) of 3, and a parallelism (p) value of 4. We can increase these values to make the password harder to hash.

$ python3 hash_pass.py
$argon2id$v=19$m=65536,t=3,p=4$pIFZByVQozr2AAlBnpxQ1A$1M0lA+JnqNQ+Mwn4l3gVNASwa6RMfnXs4zv7zFcR3+A
hash_pass: memory used: 64.0625; exec time: 0.028205156326293945

Now if we update the memory and the iterations (time_cost), the hash takes much longer to compute:

ph = PasswordHasher(time_cost=6, memory_cost=2**17)

This raises the memory requirement from 64 MB to 128 MB and doubles the iterations. The run still only took a little over 100 ms, but that is 4 times longer than the default.

$ python3 hash_pass.py
$argon2id$v=19$m=131072,t=6,p=4$aIOnQE1agNRWguLb5qq0xA$JH+k5ilxvGMZw9uy18PenWGBgTg1ohdAA3zy3D3g0MQ
hash_pass: memory used: 128.046875; exec time: 0.11450004577636719

We can update it again to increase it even more, just to see how it does. This time we will go much higher than you normally would for any type of large application:

ph = PasswordHasher(time_cost=13, memory_cost=2**19)

With this run, it required 512 MB of memory, and with the iterations set at 13, it took just over 1 second to complete. That is over 35 times longer than the default configuration.

$ python3 hash_pass.py
$argon2id$v=19$m=524288,t=13,p=4$Cak8XzZTchf0AdBjFn4lPg$cpg/ysiiDwZBXc/j/7Zo/gde7Zs/YUuGuPrZ03GwWuE
hash_pass: memory used: 512.0625; exec time: 1.0690608024597168

So by tuning these knobs, we can adjust the resources required to compute each hash. Because of the memory requirements of ciphers like argon2id, this also limits how effectively GPUs or FPGAs can be leveraged to crack the passwords. GPU cracking is many times faster than CPUs for simple hashes, but memory-hard hashes like argon2id make that much more difficult.

Hash Verification

We can leverage a similar flow as above for verifying the hash. With most common trusted libraries, the provided verification functions do the heavy lifting; we just pass the stored hash and the plaintext user input. In the case of argon2-cffi, we use the verify function from the library.
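
The verification snippet is not reproduced here; a minimal sketch with argon2-cffi (the password and settings are illustrative) looks like this:

from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

user_input = "random_password"

# Hashes created with two different difficulty settings
stored_hashes = [
    PasswordHasher().hash(user_input),                                # m=65536, t=3, p=4
    PasswordHasher(time_cost=6, memory_cost=2**17).hash(user_input),  # m=131072, t=6, p=4
]

# One verifier, no settings supplied: verify() reads m, t, and p
# back out of each stored hash string
verifier = PasswordHasher()
for stored in stored_hashes:
    try:
        verifier.verify(stored, user_input)
        print("match:", stored.split("$")[3])  # e.g. m=65536,t=3,p=4
    except VerifyMismatchError:
        print("no match")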

If you look at the verification above, you can see we are verifying different hashes with the same function and no additional input to the verification process. Since verification can validate a password without the settings being supplied separately, we get two advantages:

  • Allows transparently updating the hash if the cost or memory requirements change
  • Allows different users’ passwords to be hashed at different difficulties

Rehashing

If the hardware running password validation is replaced with newer, faster hardware, we can update the cipher options and automatically rehash to the new parameters upon the next successful login, without any additional action from the user. With a little example code, we can update the hash and replace the stored value transparently.
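
The rehash script is not shown in full; a minimal sketch is below. check_needs_rehash() is a real argon2-cffi helper, but the surrounding flow is an assumption:

# rehash.py -- minimal sketch of transparent rehashing on login
from argon2 import PasswordHasher

user_input = "random_password"  # illustrative

# The stored hash was created under the old default settings
stored = PasswordHasher().hash(user_input)
print(stored)

# The service has since moved to stronger settings
ph = PasswordHasher(time_cost=6, memory_cost=2**17)

# Verify first; the parameters come from the stored hash, not from ph
ph.verify(stored, user_input)

# On successful login, upgrade the hash if its parameters no longer
# match the current settings -- we already hold the plaintext input
if ph.check_needs_rehash(stored):
    stored = ph.hash(user_input)  # persist this new value
    print(stored)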

$ python3 rehash.py
$argon2id$v=19$m=65536,t=3,p=4$B/meRi8WYfkPO/RFkG7fGw$whbmpC96PVhKvcNMTNzZlZtVpUhN/Rxbg7BL+Ekufk8
$argon2id$v=19$m=131072,t=6,p=4$AG/nEyYFBqV0oXs2q9dQtg$oR8c/qCL7GWhUuPYsKC3EvrBYdw/5dagRgSXPFRAwxA

You can see that after we verify the password, we check whether the hash needs to be recomputed at a higher difficulty. In this example, the time_cost and memory_cost were both updated, so a rehash is needed. Since we already have the user input from the verify step, we can recompute and store the updated hash, improving the security of the passwords any time we want.

Different strengths per user

Since the verification doesn’t need any of the settings, we can keep a single verification process but set different difficulties based on the permission level of the user, as sketched below. Since the number of users decreases as permissions increase, using a higher difficulty for higher permission levels limits the impact of the additional resource requirements. Organizations may have thousands of users but only dozens of admins.
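
A sketch of the idea (the role names and settings are illustrative, not from the article):

from argon2 import PasswordHasher

# Pick the hashing difficulty by permission level at hash time
HASHERS = {
    "user": PasswordHasher(),                                  # defaults for thousands of users
    "admin": PasswordHasher(time_cost=13, memory_cost=2**19),  # far costlier for a few admins
}

def hash_for_role(password: str, role: str) -> str:
    return HASHERS[role].hash(password)

# Verification stays role-agnostic, since the parameters are
# embedded in each stored hash
verifier = PasswordHasher()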

This does mean that, if the hashes leak, an attacker can see which accounts use stronger settings and could target specific hashes individually when going after a particular site or application. But if they have all the hashes, chances are they also have additional data like roles and other user information anyway.

Different hardware

If you have a microservice architecture, you can run your authentication services on different hardware, allowing you to increase the difficulty while keeping the time low. Attackers try to leverage the fastest hardware possible to break passwords, since faster hardware computes hashes in a shorter period of time.

If we compare EC2 instances in AWS, an M5 instance has a CPU frequency of 3.1 GHz vs. an M5zn instance at 4.5 GHz. This can make a substantial difference in the time to hash a password. In the test below, just by changing the instance type from M5 to M5zn, we saw a 26.275% decrease in the time it took to compute a single hash. That headroom means we can increase the difficulty of the hash without additional impact on user login time.

# EC2 m5 instance
$argon2id$v=19$m=524288,t=14,p=8$iST5wMzirkMqidVSBzymKQ$YxioBey8Amo6EvGU6VtThiUBrwy7YH8vd2YZyXrP3Z8
hash_pass: exec time: 1.247293472290039
# EC2 m5zn instance
$argon2id$v=19$m=524288,t=14,p=8$BKuY7TSTkjoa3mTTylzEPw$pAYjMgsgcuMmXjKIFEKaoWyEZY+BevbEekECeRA4y9A
hash_pass: exec time: 0.9195597171783447

Summary

While things like Single Sign-On and Multi-Factor Authentication are becoming more popular, passwords are not going away anytime soon, so we need to do what we can to properly protect them. We know users are not getting better at creating secure passwords, so we need to make sure the hashes are difficult for attackers to brute force should they get leaked. Ensure that you are using a strong, trusted hashing algorithm, and DO NOT try to write your own.

Well-known and trusted libraries are incredibly easy to implement and use. Tuning them can take a bit of time, but the protection pays off in the event of a leak of the password hashes. Take the time to ensure you use a well-tested, industry-recommended hashing algorithm, and make sure your systems are not using hashes that are easy to compute or brute-force.


Brian Davis

Distinguished Engineer, VMware by Broadcom | Security & Compliance | System Design | views are mine https://www.linkedin.com/in/bdavis001/