
Security: Password Storage Jargon
TL;DR: Use PBKDF2 or another key stretching algorithm when storing passwords, use a salt, and don’t just hash them.
I'm going to take the example of securing a user's password and try to break down the jargon that you would find from a Google search. Search Google for “store user password server” and many results will be returned.
If you take a quick read you will notice several common themes: hashing, salt and PBKDF2. In fact the usual instructions are to hash the password using PBKDF2 with a salt, an IV and X number of rounds.
Hash/Hashing
Hashes are a method of transforming a given input (message) into a fixed-size output (digest). They are usually known as a way of “fingerprinting” an input. Coming back to our example, a password will usually be hashed before it is stored anywhere, as hashes have several properties that make them useful for this:
- Technically non-reversible: hashes are one way.
- Fast: we could be hashing a lot of things.
- Repeatable: given the same input the digest will be the same.
- Collision resistance: it should be infeasible to find two different inputs that produce the same digest.
- Avalanche effect: a small change to the input results in a large difference in the digest, such that you cannot guess how the input change affected the digest.
So let's say I want to store a password in a database. Given the password “catdog” as the input, the digest is:
163e65be076bbea20ab8275969700373a6179a3c
If I change the password to “catDog” the digest is:
758e0b9141ec58d98a79139f5c5d9faf5c0b654c
As you can see, a small change to the input produces a massively different digest. Now that the password has been hashed it is no longer recognisable from its plaintext equivalent, meaning no-one (us or a hacker) can read it directly from the database.
Unfortunately it’s not quite that simple. As hashes are fast and always produce the same output, an attacker can pre-compute a table of hashes for likely passwords and look up the hashes from your database in it to recover the plaintext. This is known as a dictionary or rainbow table attack. This brings me on to my next term.
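To make the repeatable and avalanche properties concrete, here is a minimal sketch in Python. It assumes SHA-256 from the standard hashlib module purely for illustration, so the values it prints will not match the digests quoted above, and the helper names are my own.

```python
import hashlib


def digest(password: str) -> str:
    """Return the SHA-256 digest of a password as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count how many bits differ between two equal-length hex digests."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = digest("catdog")
second = digest("catDog")

print(first)
print(second)
# Repeatable: hashing the same input again gives the same digest.
print(digest("catdog") == first)
# Avalanche effect: one changed letter flips a large share of the 256 bits.
print(differing_bits(first, second), "of 256 bits differ")
```

Swapping the algorithm name for another one supported by hashlib changes the digest length but not the behaviour being shown.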
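The pre-computation can be sketched as a plain lookup table, which is the simplest form of the attack (a rainbow table is a more space-efficient variant of the same idea). The word list below is hypothetical and tiny, and SHA-256 is assumed as the hash.

```python
import hashlib
from typing import Optional


def digest(password: str) -> str:
    """Return the SHA-256 digest of a password as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical word list; real attacks use lists with millions of entries.
WORDLIST = ["password", "letmein", "catdog", "qwerty"]

# Pre-compute digest -> plaintext once, then reuse it against any leaked database.
LOOKUP = {digest(word): word for word in WORDLIST}


def crack(stolen_digest: str) -> Optional[str]:
    """Return the plaintext if the digest is in the table, otherwise None."""
    return LOOKUP.get(stolen_digest)


print(crack(digest("catdog")))  # found, because "catdog" is in the word list
print(crack(digest("catDog")))  # not found, because "catDog" is not
```

The expensive part (building the table) happens once, and every unsalted hash in every leaked database can then be checked against it almost for free.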
Salt
As previously mentioned, hashes are a fantastic way of transforming data to be unrecognisable from its original form, but they are open to attack. A way of defeating pre-computed dictionary and rainbow table attacks is to add a salt to the hash.
The salt is added to a password before it is hashed. It has several properties which make it useful in our quest to secure a password:
- Randomly generated.
- Globally unique: this salt shouldn’t appear anywhere else in the world.
- Same size as the output of the hash function: this is a rule of thumb, so if the hash function is SHA-256 then the salt would be 32 bytes (256 bits).
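As a rough sketch of those properties, a salt can be drawn from the operating system's secure random source and sized to match the hash. The function name new_salt is my own, and the default of SHA-256 is an assumption for the example.

```python
import hashlib
import secrets


def new_salt(hash_name: str = "sha256") -> bytes:
    """Generate a random salt as long as the digest of the given hash."""
    size = hashlib.new(hash_name).digest_size  # 32 bytes for SHA-256
    return secrets.token_bytes(size)


salt = new_salt()
print(len(salt))   # size follows the rule of thumb above
print(salt.hex())  # different on every run
```

With this many random bytes, two salts colliding is so unlikely that uniqueness is effectively guaranteed without checking.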
Given a table in which two users have both chosen the password “catdog”, both rows hold exactly the same hash.
An attacker would be able to reverse one hash to its plaintext equivalent (catdog) and, as the hashes are the same, they would know the password for two accounts. A salt is used to stop this exact situation. If we were to add a salt with the properties above, each row would hold its own salt alongside a hash of that salt and the password.
The two password hashes are now completely different. The salt is stored with the password hash so that when a user attempts to log back in we can recalculate the hash and check it matches. Using a salt means an attacker faces having to recompute a dictionary for each user to work out their password, significantly slowing them down.
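Here is a small sketch of storing and checking a salted hash. It assumes the salt is simply prepended to the password and that SHA-256 is the hash; the record layout and the user names are hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_with_salt(password: str, salt: bytes) -> bytes:
    """Hash the salt followed by the password with SHA-256."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def create_record(password: str) -> dict:
    """Build what would be stored in the database row for one user."""
    salt = secrets.token_bytes(32)
    return {"salt": salt, "hash": hash_with_salt(password, salt)}


def check_password(record: dict, attempt: str) -> bool:
    """Recalculate the hash with the stored salt and compare."""
    candidate = hash_with_salt(attempt, record["salt"])
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, record["hash"])


alice = create_record("catdog")
bob = create_record("catdog")

print(alice["hash"] != bob["hash"])      # same password, different hashes
print(check_password(alice, "catdog"))   # correct password
print(check_password(alice, "catDog"))   # wrong password
```

Because the salt sits next to the hash, it is not a secret. Its job is only to make each user's hash unique.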
So we have increased the difficulty of cracking the passwords in our database through the use of a hashing algorithm and a salt. However, as previously mentioned, a hash is very fast, so computing a dictionary for each salt will still be fast. This leads us to our next method.
PBKDF2
Just a disclaimer here: PBKDF2 is what’s known as a “key stretching” algorithm and is one of many. I picked it because it comes up most often.
PBKDF2, or Password-Based Key Derivation Function 2, is a method used to significantly increase the amount of time it takes to generate a hash. PBKDF2 does this by essentially hashing the hash over and over (strictly, it repeatedly applies a keyed hash such as HMAC) to produce what is known as a derived key.
Common implementations of PBKDF2 usually take several parameters: the password, a salt, the number of rounds and the algorithm. You will often see an IV listed alongside them too.
Salt is exactly as previously mentioned: it adds randomness to the password.
IV, or initialisation vector, is not an input to PBKDF2 itself. It provides randomisation to a cipher, and depending on the cipher used it is or is not required to be random. It tends to appear in search results because the derived key is often used as an encryption key; for simply storing a password it is not needed.
Rounds is the number of times the hash is re-hashed and is the component which defines how long each derived key will take to generate. For example, if the number of rounds is 10000 then the hashing is iterated 10000 times, so it takes roughly 10000 times as long as a single hash. The number of rounds to use is really up for debate, so you will have to decide for yourself where the trade-off between security and usability lies.
Finally, the algorithm is the underlying hash algorithm to use when performing the rounds.
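As one concrete example of these parameters, Python's standard library exposes PBKDF2 as hashlib.pbkdf2_hmac, which takes the hash name, the password, the salt and the number of rounds (this particular function has no IV parameter). The sketch below uses the 10000 rounds from the example above as an illustration, not a recommendation, and the wrapper name derive_key is my own.

```python
import hashlib
import secrets

ROUNDS = 10_000  # figure from the example above; an illustration, not a recommendation


def derive_key(password: str, salt: bytes,
               rounds: int = ROUNDS, algorithm: str = "sha256") -> bytes:
    """Stretch a password into a derived key using PBKDF2-HMAC."""
    return hashlib.pbkdf2_hmac(algorithm, password.encode("utf-8"), salt, rounds)


salt = secrets.token_bytes(32)
key = derive_key("catdog", salt)

print(key.hex())
# Same password, salt and rounds always give the same derived key.
print(derive_key("catdog", salt) == key)
# Changing the number of rounds changes the derived key, so it must be stored too.
print(derive_key("catdog", salt, rounds=ROUNDS + 1) == key)
```

Raising the rounds makes every guess an attacker tries slower by the same factor that it slows down a legitimate login.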
Linking it Together
So a hash is a great way of fingerprinting something and stopping it being reverse engineered easily. A salt introduces randomness to a hash to prevent an attacker from gaining multiple passwords from a single crack. PBKDF2 is thrown into the mix to increase the amount of time it takes to mount an attack against a salted and hashed password.
All of the above are interlinked, and it can be confusing trying to work out what to use in which situation. I have glossed over some elements of hashing, such as the algorithms used and the modes for algorithms, as this is a whole other area to explore.
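Putting the three pieces together might look something like the sketch below. The "$"-separated record format is a hypothetical layout of my own for keeping the algorithm, rounds, salt and derived key in a single column, and the round count is again only the illustrative figure from earlier.

```python
import hashlib
import hmac
import secrets

ALGORITHM = "sha256"
ROUNDS = 10_000  # illustrative only; tune for your own hardware


def store_password(password: str) -> str:
    """Return a single string holding everything needed to verify later."""
    salt = secrets.token_bytes(32)
    key = hashlib.pbkdf2_hmac(ALGORITHM, password.encode("utf-8"), salt, ROUNDS)
    return "$".join([ALGORITHM, str(ROUNDS), salt.hex(), key.hex()])


def verify_password(stored: str, attempt: str) -> bool:
    """Re-derive the key from the attempt using the stored parameters."""
    algorithm, rounds, salt_hex, key_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        algorithm, attempt.encode("utf-8"), bytes.fromhex(salt_hex), int(rounds)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(key_hex))


stored = store_password("catdog")
print(stored)
print(verify_password(stored, "catdog"))  # correct password
print(verify_password(stored, "catDog"))  # wrong password
```

Storing the rounds and algorithm with each record means they can be raised later for new passwords without breaking the old ones.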
Thanks for reading.
P.S. If you want to read more about this, http://en.wikipedia.org/wiki/Cryptographic_hash_function and http://crypto.stackexchange.com are great places to start.