Salting our wounds: If web sites aren’t storing our passwords, how do passwords get stolen anyway?

Snigdha Sur
Jul 28, 2017 · 5 min read
From the NYTimes; Virgin America just experienced a password breach

One of the most surprising discoveries I made this week at Flatiron School is that secure Web sites, following what is now the industry standard, never store your passwords directly. This is why, when you “forget” your password, you’re usually asked to create a new one rather than being sent the old one.

Well, if this is the case, how do infamous password leaks even happen? Virgin America was the latest to lose corporate account login usernames and passwords in a recent cybersecurity breach. Over 3,000 employees had their information jeopardized.

First things first. What happens when you sign up and create a username and password?

When you sign up and create a username and password, the site’s backend runs your password through a hash function, which maps it to a string of numbers and letters that is very difficult to turn back into your literal password. The database then stores the hash, NOT your password.

What does hashing look like? Examples below, from CrackStation.

hash("hello") = 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
hash("hbllo") = 58756879c05c68dfac9866712fad6a93f8146f337a69afe7dd238f3364946366
hash("waltz") = c0e81794384491161f1777c232bc6bd9ec38f616560b120fda8e90f383853542

This looks similar to encryption, but there is an important difference: a hash has no key and cannot be decrypted back to the original text. It is a ‘one-way’ cryptographic function, and its output is a fixed size no matter how long the source text is.
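
The fixed-size, one-way behavior can be tried out with Python's standard `hashlib` module. This is a minimal sketch, assuming the examples above use SHA-256 (the digest for "hello" matches SHA-256); the helper name `sha256_hex` is made up for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = sha256_hex("hello")
similar_digest = sha256_hex("hbllo")        # one letter changed
long_digest = sha256_hex("hello" * 10_000)  # a 50,000-character input

print(short_digest)
print(similar_digest)
print(len(short_digest), len(long_digest))  # both digests have the same length
```

Changing a single letter gives a completely different digest, and a very long input still produces a digest of the same length as a short one.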

For comparison, a lot of public-key encryption uses RSA. To keep things simple, RSA is hard to reverse because of really, really, really, really large prime numbers. When you multiply two large prime numbers, say p and q, together, you get p*q. The only numbers that divide this large number are 1, p, q, and p*q itself. However, the amount of time and computing power it would take a computer to “factor” it, that is, to find p or q, is so large (at least with today’s computing power) that hackers don’t waste their time trying. Hash functions such as SHA-256 do not rely on prime numbers; they are one-way because they scramble the input through many rounds of bit mixing that nobody knows how to run backwards.
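
To get a feel for why multiplying is easy and factoring is hard, here is a toy sketch using trial division. The primes 10,007 and 10,009 are tiny, chosen only so the example finishes instantly; real RSA primes are hundreds of digits long, and the function name `smallest_factor` is made up for illustration.

```python
def smallest_factor(n: int) -> int:
    """Find the smallest factor of n greater than 1 by trial division."""
    if n % 2 == 0:
        return 2
    candidate = 3
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate
        candidate += 2
    return n  # no smaller factor exists, so n itself is prime


p, q = 10_007, 10_009  # two small primes, for illustration only
n = p * q              # multiplying is a single cheap operation

recovered_p = smallest_factor(n)  # factoring takes thousands of trial divisions
recovered_q = n // recovered_p
print(n, recovered_p, recovered_q)
```

Even here, undoing the multiplication takes thousands of steps while the multiplication took one. The gap grows enormously as the primes get longer.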

Widely used hash functions today include SHA-256 and SHA-512, designed by the US National Security Agency (NSA) and published in 2001, and bcrypt, developed in 1999 and based on the Blowfish cipher. (When you hear “128-bit encryption,” that refers to the key length of a cipher such as AES, not to a hash function.) The SHA-2 series is required for most government applications; SHA-1 is being phased out because practical collision attacks now exist. A collision is when two different inputs produce the exact same hash under the same hash function. This would let an attacker craft a malicious file that has the same hash as a legitimate one and pass it off as the original.

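Bcrypt uses a salt and an adjustable cost factor (an iteration count), so every guess an attacker makes is deliberately slow, and the cost can be raised as computing power increases. This makes attacks very costly for hackers in terms of time and resources. So, unlike fast general-purpose hashes such as SHA-256, bcrypt is built specifically for storing passwords.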
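
Bcrypt is not part of Python's standard library, so the sketch below swaps in PBKDF2-HMAC-SHA256 from `hashlib` to illustrate the same idea of a salt plus a tunable iteration count. It is a stand-in, not bcrypt itself; the function name `slow_hash` and the iteration counts are assumptions for illustration.

```python
import hashlib
import os
import time


def slow_hash(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte key; more iterations means more work per guess."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


salt = os.urandom(16)  # random salt, generated once per user

for iterations in (1_000, 100_000):
    start = time.perf_counter()
    slow_hash("hello", salt, iterations)
    elapsed = time.perf_counter() - start
    print(f"{iterations:>7} iterations: {elapsed:.4f} s per guess")
```

The legitimate site pays this cost once per login, while an attacker pays it for every single guess. Raising the iteration count over time keeps guessing expensive as hardware gets faster.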

Imagine a world where many people are using the same passwords [reality check: many people already do this]. Without salts, identical passwords produce identical hashes. I could do a simple frequency analysis after stealing all these hashes, make an educated guess about what these users’ passwords are, and then use this information to log into other sites.
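
A rough sketch of that frequency analysis, assuming a stolen table of unsalted SHA-256 hashes. The usernames and passwords are invented for illustration.

```python
import hashlib
from collections import Counter

# Hypothetical users and passwords; only the hashes would be in the stolen table.
users = {
    "alice": "qwerty",
    "bob": "qwerty",
    "carol": "correct horse battery staple",
    "dave": "qwerty",
}
leaked = {
    name: hashlib.sha256(pw.encode("utf-8")).hexdigest()
    for name, pw in users.items()
}

# Identical passwords give identical hashes, so the most frequent hash
# points at the most popular (and probably weakest) password.
hash_counts = Counter(leaked.values())
most_common_hash, count = hash_counts.most_common(1)[0]
affected = sorted(name for name, digest in leaked.items() if digest == most_common_hash)
print(count, affected)
```

Once the attacker guesses the one password behind the most common hash, every account in `affected` falls at once.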


Salts help prevent this. A salt is a random string (in bcrypt, the stored salt string, including its version and cost prefix, is 29 characters long) that is concatenated with the user password before this longer string is hashed. Both the hash and each individual’s salt are stored in the website database. Because each salt is generated randomly for each user, the web site can check whether you have the right password by testing whether that password + the appended salt, when hashed, equals the hash stored in the database. But for a crook, it will be very difficult to recover that password even if a bunch of people are using the password “qwerty,” because every guess has to be hashed again with each user’s salt, one user at a time. Very expensive indeed.

What a salt looks like, from CrackStation:

hash("hello")                    = 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
hash("hello" + "QxLUF1bgIAdeQX") = 9e209040c863f84a31e719795b2577523954739fe5ed3b58a75cff2127075ed1
hash("hello" + "bv5PehSMfV11Cd") = d1d3ec2e6f20fd420d50e2642992841d8338a314b8ea157c9e18477aaef226ab
hash("hello" + "YYLmfY6IehjZMQ") = a49670c3c18b9e079b9cfaf51634f563dc8ae3070db2c4a8544305df1b60f007
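
The sign-up and login check described above can be sketched as follows. This is a minimal illustration that mirrors the password + salt concatenation with plain SHA-256; a real site would use a deliberately slow function such as bcrypt instead. The function names (`hash_with_salt`, `sign_up`, `check_password`) are made up for this example.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hash_with_salt(password: str, salt: str) -> str:
    """Hash password + salt, mirroring the concatenation described above."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


def sign_up(password: str) -> Tuple[str, str]:
    """Return the (salt, hash) pair a site would store; the password is discarded."""
    salt = secrets.token_hex(16)  # fresh random salt for each user
    return salt, hash_with_salt(password, salt)


def check_password(attempt: str, salt: str, stored_hash: str) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_with_salt(attempt, salt), stored_hash)


# Two users pick the same password but end up with different stored hashes.
alice_salt, alice_hash = sign_up("qwerty")
bob_salt, bob_hash = sign_up("qwerty")
print(alice_hash != bob_hash)
print(check_password("qwerty", alice_salt, alice_hash))
print(check_password("hello", alice_salt, alice_hash))
```

Because the two stored hashes differ, a frequency count over the table no longer reveals who shares a password.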

So how are all these hackers succeeding despite these super cool practices? Many Web sites are still not as secure as they should be because they are using fast, outdated hash functions, such as MD5 and SHA-1. Even with a salt, processing power is far cheaper than it used to be, so hackers can test guesses against these fast hashes extremely quickly; without a salt, they can attack every user’s password at the same time, en masse. Bcrypt is designed to prevent both. The bad list includes Yahoo, LinkedIn, and Adobe. In the 2012 LinkedIn breach, for example, LinkedIn used SHA-1 and didn’t individually salt each password.
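
Here is a rough sketch of why unsalted, fast hashes fall en masse: the attacker hashes a list of common passwords once and looks up every stolen hash in that table. The word list, usernames, and passwords are invented for illustration, and SHA-1 is used only because it is the function named in the LinkedIn example.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the unsalted SHA-1 digest of `text` in hexadecimal."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical word list and leaked unsalted hashes, for illustration only.
common_passwords = ["123456", "password", "qwerty", "letmein"]
leaked_hashes = {
    "alice": sha1_hex("qwerty"),
    "bob": sha1_hex("password"),
    "carol": sha1_hex("Tr0ub4dor&3"),
}

# Hash each candidate once; the same table then works against every user.
lookup = {sha1_hex(pw): pw for pw in common_passwords}
cracked = {user: lookup[h] for user, h in leaked_hashes.items() if h in lookup}
print(cracked)
```

With a per-user salt, this single lookup table would be useless, and the attacker would have to redo the hashing for each user separately.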

From quora:

In the past RC4, MD5 and SHA1 (with or without salt) were used (a salt or nonce is always recommended). They are no longer considred [sic] secure, because they are all easily to fairly crackable now (e.g., SHA1 signed crypto certificates are being phased out in major browsers over the next year or so). MD5 has been considered insecure for some time, RC4 even longer.

The massive Adobe breach in 2013 turned out to be because they had 3DES encrypted all the user passwords with the same key.

At least one of the massive Yahoo breaches in 2013 has been revealed to be because the company was still storing MD5 hashes of its passwords.

MD5 and SHA1 are likely still being widely used even though they shouldn’t be. Encryption is likely widely used despite the known danger. If your service can email you your password or tell it to you over the phone, then they are encrypting all the passwords (bad) or storing them in plain text (very very bad).

So what should you be doing?

  • Try to figure out whether the Web site is using state-of-the-art security practices. If you forget your password and the site is able to email you your plaintext password, you may want to delete your account, or at least make sure the password you use there is random and not reused anywhere else.
  • Use programs such as LastPass, one of the best-rated password managers, which generates and remembers random passwords for you.
  • Use best-practice security, such as ‘bcrypt’, when you develop your own sites to ensure that cracking passwords is very, very costly for attackers.

At the end of the day, fighting hacking is like fighting a virus. As technology improves and as the virus gains experience hacking into different systems, it gets stronger. So it’s up to hash function scientists and creators to make hashes even more breach-proof. Let the evolution games begin.

More resources here:

http://dustwell.com/how-to-handle-passwords-bcrypt.html

