How to securely hash and store passwords in your next application
Are you hashing your users’ passwords? More importantly, are you doing it correctly? There’s a lot of information out there on password hashing, and there are more than a few hash algorithms available for you to use.
As a full-stack engineer, I’ve spent plenty of time building password-based authentication mechanisms. As an ethical hacker, I’ve spent plenty of time trying to break those mechanisms and crack password hashes.
In this article, I’m going to provide a brief overview of secure password hashing and storage, and then I’m going to show you how to securely hash your passwords for your next application.
Password hashing: a 30-second summary
There’s been a lot written about what a hash algorithm is, so I won’t waste your time reiterating all of it. In short, a hash algorithm is a one-way function. Let’s call the hash function H. Given some data d, it's trivial to compute H(d). But given only the hash H(d), it's computationally infeasible to recover d. It's also important to note that even a one-bit difference in d will result in a completely different hash.
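To see both properties in action, here is a minimal sketch using SHA-256 from Python's standard hashlib module. SHA-256 is only standing in for H here; that choice is mine for illustration, and it is not a recommendation for password storage.

```python
import hashlib


def H(d: bytes) -> str:
    """Return the SHA-256 digest of d as a hex string."""
    return hashlib.sha256(d).hexdigest()


# The two inputs differ by a single bit: 'e' is 0x65 and 'd' is 0x64.
first = H(b"correct horse")
second = H(b"correct horsd")

print(first)
print(second)
print(first == second)  # False: the digests do not match
```

Computing either digest is instant, but nothing in the output tells you which input produced it.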
Instead of storing passwords in “plaintext” (i.e. storing passwords directly in our database), it is more secure to hash passwords before storing them in the database. Given a password p, we compute H(p) and store that value in the database. When a user tries to log in, we hash the password that the user tried to log in with, and compare it to the hash in the database. If the two match, then the password is valid, and we log the user in. If they don't, then the user provided the wrong password.
Why do we do this? It protects passwords from hackers, lazy or malicious system administrators, and data leaks. If the database is leaked or hacked, then hackers can’t easily determine what all of the users’ passwords are simply by looking at them. This is even more important considering that many people use the same or similar passwords for most of their accounts. Without password hashing, one breached service could lead to a user’s accounts on many other services being compromised.
A brief example
Below I’ve provided some Python pseudo-code to give you an idea of what your login function might look like using password hashing.
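A minimal sketch of such a login function might look like this. The in-memory USERS dictionary, the function names, and the use of SHA-256 as H are all assumptions made so the sketch runs on its own; SHA-256 is a placeholder here, not a suitable password hash.

```python
import hashlib
import hmac

# Hypothetical in-memory "database": username -> stored password hash.
USERS = {}


def H(password: str) -> str:
    # Placeholder hash so the sketch runs; not a suitable choice for real passwords.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def register_user(username: str, password: str) -> None:
    # Store only the hash, never the password itself.
    USERS[username] = H(password)


def login_user(username: str, password: str) -> bool:
    stored_hash = USERS.get(username)
    if stored_hash is None:
        return False  # unknown user
    # Hash the supplied password and compare it to the stored hash.
    # hmac.compare_digest performs the comparison in constant time.
    return hmac.compare_digest(H(password), stored_hash)


register_user("alice", "hunter2")
print(login_user("alice", "hunter2"))  # True
print(login_user("alice", "wrong"))    # False
```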
The above code is a little simplified since it’ll depend on how you’re storing and retrieving users from your database. In this example, I also didn’t touch on the actual hash function you should use, so let’s dig into the details a little more.
There are a myriad of hash functions out there, and many offer different advantages. For example, some hash functions are fast, and others are slow. Fast hashing algorithms are great for building data structures like hash tables, but we want to use slow hash functions for password hashing, because every guess in a brute-force attack then costs the attacker more time. Let’s look at a few common hash functions.
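To get a feel for the difference, here is a rough timing sketch using only Python's standard library. I'm using PBKDF2 as a convenient stand-in for a deliberately slow hash because it ships with hashlib; the iteration count and the fixed salt are illustrative values, not recommendations.

```python
import hashlib
import time

password = b"hunter2"
salt = b"0123456789abcdef"  # fixed salt, for this timing demo only

# A single pass of a fast, general-purpose hash.
start = time.perf_counter()
fast = hashlib.sha256(salt + password).digest()
fast_seconds = time.perf_counter() - start

# A deliberately slow derivation: 200,000 iterations of HMAC-SHA-256.
start = time.perf_counter()
slow = hashlib.pbkdf2_hmac("sha256", password, salt, 200_000)
slow_seconds = time.perf_counter() - start

print(f"one SHA-256 call:           {fast_seconds:.6f} s")
print(f"PBKDF2, 200,000 iterations: {slow_seconds:.6f} s")
```

The exact numbers depend on your machine, but the slow variant costs an attacker that much more for every single guess, while a legitimate login only pays the price once.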
Some common hash functions
What on earth is a salt?
A “salt” is a random piece of data that is often added to the data you want to hash before you actually hash it. Adding a salt to your data before hashing it will make the output of the hash function different than it would be if you had only hashed the data.
When a user sets their password (often on signing up), a new random salt should be generated for that password and used to compute the password hash. The salt should then be stored with the password hash. When the user tries to log in, combine the stored salt with the supplied password, hash the combination of the two, and compare it to the hash in the database.
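The flow described above can be sketched in a few lines of Python. The function names are hypothetical, and PBKDF2 from the standard library is again only a stand-in hash with an illustrative iteration count.

```python
import hashlib
import hmac
import os


def hash_with_salt(password: str, salt: bytes) -> bytes:
    # PBKDF2 is a stdlib stand-in; 100,000 iterations is an illustrative value.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def set_password(password: str):
    # Generate a fresh random salt for every new password.
    salt = os.urandom(16)
    # Both values are stored alongside the user record.
    return salt, hash_with_salt(password, salt)


def check_password(password: str, salt: bytes, stored_hash: bytes) -> bool:
    # Recompute the hash with the stored salt and compare in constant time.
    return hmac.compare_digest(hash_with_salt(password, salt), stored_hash)


salt_a, hash_a = set_password("hunter2")
salt_b, hash_b = set_password("hunter2")

print(hash_a == hash_b)                             # same password, different salts
print(check_password("hunter2", salt_a, hash_a))    # correct password
print(check_password("not-it", salt_a, hash_a))     # wrong password
```

Two users who pick the same password end up with different stored hashes, because each one gets a different salt.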
Why should you use a salt?
Without going into too much detail, hackers commonly use rainbow table attacks, dictionary attacks, and brute-force attacks to try and crack password hashes. While hackers can’t compute the original password given only a hash, they can take a long list of possible passwords and compute hashes for them to try and match them with the passwords in the database. This is effectively how these types of attacks work, although each of the above works somewhat differently.
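A salt makes it much more difficult for hackers to perform these types of attacks. Because every password gets its own salt, precomputed tables (including rainbow tables) become impractical: an attacker would need a separate table for every possible salt. Salts also mean that identical passwords no longer produce identical hashes, so an attacker can’t check one guess against every user at once and has to attack each hash separately. Note that a salt doesn’t make a single hash any slower to guess; that is the job of a slow hash function. It’s therefore important to always use salts in your hashes.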
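Here is a toy illustration of why precomputation stops working. It uses a plain lookup table rather than a real rainbow table, a four-entry password list I made up, and fast SHA-256 purely to keep the demo short.

```python
import hashlib
import os

# Hypothetical list of candidate passwords an attacker might try.
COMMON_PASSWORDS = ["123456", "password", "qwerty", "hunter2"]


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Precomputed table: hash -> password. Built once, reusable against any unsalted leak.
lookup = {sha256_hex(p.encode("utf-8")): p for p in COMMON_PASSWORDS}

# Unsalted leak: one dictionary lookup recovers the password.
unsalted_leak = sha256_hex(b"hunter2")
print(lookup.get(unsalted_leak))  # hunter2

# Salted leak: the precomputed table no longer matches anything.
salt = os.urandom(16)
salted_leak = sha256_hex(salt + b"hunter2")
print(lookup.get(salted_leak))  # None

# The attacker has to redo the hashing for this one salt, and again for every other user.
recovered = next(
    (p for p in COMMON_PASSWORDS if sha256_hex(salt + p.encode("utf-8")) == salted_leak),
    None,
)
print(recovered)  # hunter2
```

The salted hash can still be cracked if the password is weak, but the work has to be repeated for every user instead of being done once in advance.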
Which hash function should you use?
Lots of people will tell you that there’s no “right” or “wrong” answer to this question, only trade-offs. That’s true, but some hash functions make better trade-offs than others.
Personally, I’m a big fan of Bcrypt and Argon2 because both are extremely secure, both require salts, and both are slow (which as we discussed above, is a property we want for password hashing functions). Argon2 is a lot more complicated than Bcrypt, and can be more difficult to implement. Bcrypt is also a lot more common and more languages have libraries for it, so it’s what I tend to use. I recommend that you use one of these two as well.
Putting it all together
Below, I have provided two examples of how to implement everything that we’ve discussed today. The first example is pseudo-code, and the second one is in Python.
Password hash authentication pseudo-code
Most common languages should provide a bcrypt module or package, but the interface to it will invariably look different, so I’ve tried to be as language-agnostic as possible.
# should be called when a user signs up or changes their password
salt = random_bytes(16) # bcrypt uses a 16-byte salt
hash = bcrypt_hash(password, salt)
# store this with the user in your database

# called whenever a user tries to login
function login_user(username, password)
    user = get_user_from_database(username)
    # bcrypt stores the salt with the hash, your library should manage this for you
    salt = get_salt(user.hash)
    new_hash = bcrypt_hash(password, salt)
    # use a constant-time comparison here if your language provides one
    if new_hash == user.hash
        # the password is correct, log the user in
    else
        # the password is wrong, reject the login
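Note that bcrypt itself uses a fixed 16-byte (128-bit) salt, and most bcrypt libraries will generate it for you. If you use a hash function that lets you choose the salt length, 16 bytes is a common choice.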
Password hash authentication Python code
A bcrypt package is available for Python and can be installed with pip, and I’m going to use that for this example. The bcrypt module handles the computation behind the scenes for you, so it’s super easy to use.
Summary
- Always store your passwords as hashes, and never as plain text.
- Use a salt for extra security.
- Use Bcrypt or Argon2 for your hash function.
I hope you find this article useful! Let me know what you think in the comments below.