As a web developer I have always missed some basic, clear information about how to deal with passwords and sensitive data storage in general.
During my years working on different projects, huge and small ones, with different scripting languages, old and brand new ones, the lack of documentation and simple guidelines about how to handle passwords has always been a constant.
This lack of information in teams and (for some time at least) even online can be linked to the common belief that “the less you share about your cryptography strategies, the better it is for your system”.
I can understand this idea (and I am not gonna share my cryptographic code in detail either), but I also think that if we don’t start sharing some basic information in a clear way, each new generation of developers will be just as likely to repeat the mistakes the old ones made.
Just as a reference, you can have a look at https://haveibeenpwned.com/ to understand how important it is to manage passwords and sensitive data in a way that makes it hard for attackers to re-use them in case of data breaches.
In this article I will try to summarize my (very basic) knowledge using PHP and MySQL, hoping this can be a helpful start for young programmers, and a motivation to further study this complex and fascinating matter.
Basically I will cover three concepts here:
- hashing passwords
- encrypting and decrypting data
- data at rest encryption to protect your database tables
Knowing how to deal with these three basic concepts in a proper way will help you build secure web applications: obviously web security is not just this, but learning how to store your users’ passwords is a good start ;)
1. Hashing passwords
Dealing with passwords in web applications doesn’t really involve encryption: when we store users’ passwords we don’t want to be able to “decrypt” them. We just want to check if an input given by a user (who wants to log in to our application, for instance) is equal to her/his password, which we are storing somewhere, typically in a database table.
What we really want in this case is a strong hash function: a one-way function that nobody can invert in a reasonable amount of time (if you want to read more about these functions: Cryptographic hash function on Wikipedia says it all).
General-purpose hash functions such as MD5, SHA-1 and SHA-256 are not designed to store a password hash, however.
They are built to be fast: they can quickly produce a hash out of a huge amount of data, which is what you want when checking data integrity, but do not use them to store passwords in your web applications. Ever.
The reason is quite simple: precisely because they are fast, someone interested in your data can try an enormous number of candidate passwords per second. A salt defeats precomputed lookup tables, but it does not slow down guessing against a single hash, so this holds even if you use salt, yes.
Once an attacker has access to your hashed data, s/he can try a brute force attack on all combinations of 6-character passwords in… less than a minute. And get your users’ passwords. And try them on their social accounts, to check if they are using the same password, for instance. Your users won’t be happy.
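To get a feeling for the numbers, here is a small back-of-the-envelope sketch (PHP 7+). The guess rate is an assumption I picked for illustration, not a measurement: real figures depend on the hash function and on the attacker’s hardware. The function names are made up for this example.

```php
<?php
// Rough estimate of how long an exhaustive search takes.
// ASSUMPTION: the guess rate used below is illustrative, not a benchmark.

/** Number of candidate passwords of exactly $length characters. */
function keyspace(int $alphabetSize, int $length): int
{
    return $alphabetSize ** $length;
}

/** Worst-case time, in seconds, needed to try every candidate. */
function secondsToExhaust(int $alphabetSize, int $length, float $guessesPerSecond): float
{
    return keyspace($alphabetSize, $length) / $guessesPerSecond;
}

$printableAscii = 95;   // letters, digits, symbols and space
$assumedRate    = 20e9; // hypothetical: 20 billion fast hashes per second

printf(
    "%d candidates, about %.1f seconds in the worst case\n",
    keyspace($printableAscii, 6),
    secondsToExhaust($printableAscii, 6, $assumedRate)
);
```

Try changing the length from 6 to 10, or the rate to a few thousand guesses per second (the order of magnitude a deliberately slow password hash aims for), and see how the result changes.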
Let’s run some code to better understand what’s going on here, and to see which is the best option that our language, in this case PHP, offers us to solve our problem.
The following lines hash a sample string, hypothetically a password, using various hashing functions.
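A script along these lines could look like the sketch below. This is my reconstruction for illustration, not necessarily the exact listing: the sample string and the way the crypt() salt is built are assumptions, and it needs PHP 7+ for random_bytes().

```php
<?php
// Sample value, purely hypothetical.
$password = 'secret';

// crypt() needs an explicit salt on recent PHP versions, so build a bcrypt one:
// 16 random bytes give 24 base64 characters; keep 22 of them and swap '+'
// for '.', which matches the alphabet used by bcrypt salts.
$salt = substr(strtr(base64_encode(random_bytes(16)), '+', '.'), 0, 22);

$hashes = [
    'md5'           => md5($password),
    'sha1'          => sha1($password),
    'sha256'        => hash('sha256', $password),
    'crypt'         => crypt($password, '$2y$10$' . $salt),
    'password_hash' => password_hash($password, PASSWORD_BCRYPT),
];

// Print one "name  hash" pair per line.
foreach ($hashes as $name => $hash) {
    echo str_pad($name, 14), $hash, PHP_EOL;
}
```

Run it twice: the first three lines never change, while the last two are different on every run because a new random salt is used each time.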
Here’s the output, in case you cannot run it on a web server on your own:
Now we can grab all the generated output and try to crack it as any script kiddie could do, by simply googling “crack sha1” or “crack md5”. I land on a website which lets me process up to 20 hashes per click. Here’s the instant result I get:
In less than a minute, without any knowledge about cryptography, I have been able to crack md5, sha1 and sha256 hashes. Neither of the hashes created with crypt() and password_hash() could be recognized by the same online hash cracker.
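Why? Because they rely on algorithms designed for passwords: every hash includes its own random salt and is deliberately slow to compute, so lookup sites and brute force attacks make very little progress. Hence my recommendation is to use password_hash() to store all your hashed passwords, and password_verify() to check a login attempt against the stored hash. The second parameter of password_hash(), the algorithm, is mandatory: PASSWORD_DEFAULT currently selects bcrypt and may switch to a stronger algorithm in future PHP versions, while PASSWORD_BCRYPT pins bcrypt explicitly. Note that bcrypt is a password hashing function derived from the Blowfish cipher, not an encryption of your password, and it is far more robust than the old DES-based schemes.
The crypt() function is less versatile and easier to misuse, so there is really no reason to use it instead of its more advanced sibling (which is built on it, by the way).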
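Putting this section together, a minimal sketch of storing and checking a password could look like this (PHP 7+). The function names and the cost value are my own assumptions for the example; tune the cost to your hardware.

```php
<?php
// Registration: hash the password before storing it.
function hashPassword(string $plain): string
{
    // ASSUMPTION: cost 12 is just an example value, not a recommendation.
    return password_hash($plain, PASSWORD_BCRYPT, ['cost' => 12]);
}

// Login: compare the attempt with the stored hash.
// The salt and the cost are read from the stored hash itself.
function checkPassword(string $attempt, string $storedHash): bool
{
    return password_verify($attempt, $storedHash);
}

$stored = hashPassword('correct horse battery staple');

var_dump(checkPassword('correct horse battery staple', $stored)); // bool(true)
var_dump(checkPassword('wrong guess', $stored));                  // bool(false)
```

The stored value is a single string that you can save in one database column; password_needs_rehash() can later tell you whether an old hash should be regenerated with new settings.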
2. Encrypting and decrypting data
When managing users’ passwords, one-way cryptographic functions are the right choice, since they let us check if a password is valid without ever knowing it. Sometimes we need two-way functions instead, to store encrypted data somewhere, and to be able to decrypt it somewhere else.
In a simple hypothetical web application, it could be reasonable to store users’ email addresses in an encrypted form, for instance. Since we probably need to show their own email addresses to the users somewhere (for example in a profile page, so that they can edit the email if they want), we need a two-way function to encrypt and decrypt data when needed. We could have a similar need if we want to store messages between users in our web application, being sure that only the sender and the receiver can read them.
Encryption is a process that makes content not understandable, unless you own a key. If the encryption process is symmetric, the same key is used both to encrypt and to decrypt the content.
Asymmetric encryption, instead, is a process where the encryption is done with a public key, and only the matching private key is able to decrypt the content.
Apart from encryption and decryption, a serious cryptographic system should also consider authentication: encrypting data provides confidentiality (no one can read data without a key), while authentication provides integrity (no one can tamper with data undetected) and authenticity (no one can claim to be the source of the message if he/she is not). Since we are here, the fourth basic requisite of a strong cryptographic system is non-repudiation, i.e. the author of a message cannot deny its existence and authorship once s/he sent it.
Let’s start by building a basic symmetric encryption tool by looking at the PHP online manual. Our tool will let us write some text in a form, save it as an encrypted text file on the server, and expose a unique URL that will let us decrypt what we wrote with the correct key.
Here’s a basic form written in HTML:
You can try it here:
Some PHP is used to generate a key at line 32. We could let the user choose a password and use it as a key, but this way we wouldn’t be sure the key is complex enough. Hence I use PHP’s native openssl_random_pseudo_bytes() function and then convert the result from binary to hex. The user is asked to save the key somewhere, or store it as a cookie (not exactly the best choice security-wise, but it makes things faster for testing purposes). Both the key and the plain text are posted to another script, which receives the data and encrypts it into a txt file. In a production environment this makes no sense, and you will want to store the encrypted data in a database, but again, I am keeping things simple to focus on the encryption part.
Also please note that I am using openssl_random_pseudo_bytes() to keep my code compatible with PHP 5. If you are using PHP 7 please switch to random_bytes(), which is a safer function.
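As a rough sketch of the key generation step on PHP 7+, using random_bytes() instead of the PHP 5 function mentioned above. The function name and the 32-byte length (256 bits, to match an AES-256 cipher) are my assumptions for this example.

```php
<?php
// Generate a random key and return it as a hex string,
// which is easy to show in a form field or copy by hand.
function generateKeyHex(int $bytes = 32): string
{
    // random_bytes() returns cryptographically secure random bytes (PHP 7+).
    return bin2hex(random_bytes($bytes));
}

$keyHex = generateKeyHex(); // 64 hex characters for a 32-byte key
$key    = hex2bin($keyHex); // back to raw bytes before using it as a key

echo $keyHex, PHP_EOL;
```

The hex form is only for transport and display: the encryption functions expect the raw bytes, so remember the hex2bin() step on the receiving side.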
The interesting part here is the fact that the key is not stored anywhere on the server. This is a basic requirement to design a not-too-weak cryptographic system. Storing the key somewhere on the same server where encryption & decryption happen is BAD. If you are interested in key management strategies this is not the right place, but here’s a nice forum post to start.
Let’s look at the source code of aes-encrypt.php, the script that actually does the encryption.
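For this basic demo, we choose to use AES-256-CBC as the cipher method (line 4) to pass to the openssl_encrypt() function. In a production environment, depending on what your server supports, you would want to look at AES-256-CTR or, better, at an authenticated mode such as AES-256-GCM (available in openssl_encrypt() since PHP 7.1), which takes care of encryption and authentication together.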
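$iv stands for initialization vector: a random sequence of bytes of a given length, which makes two encryptions of the same text with the same key produce different outputs. It does not need to be secret, so it is usually stored next to the encrypted data, but it must be freshly generated for every message. For further readings, start here. Important to know: the length of an initialization vector is linked to the cipher method, and this is why we calculate it via openssl_cipher_iv_length() on line 5.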
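A tiny sketch to see how the initialization vector length depends on the cipher method (PHP 7+ with the OpenSSL extension; the list of ciphers is just my pick for the example):

```php
<?php
$cipher   = 'aes-256-cbc';
$ivLength = openssl_cipher_iv_length($cipher); // 16 bytes for AES in CBC mode

// A fresh, random IV of the right length for this cipher.
$iv = random_bytes($ivLength);

// Compare a few cipher methods; the values depend on the method.
foreach (['aes-256-cbc', 'aes-256-ctr', 'aes-256-gcm'] as $method) {
    echo $method, ': ', openssl_cipher_iv_length($method), ' bytes', PHP_EOL;
}
```

Asking the library for the length, instead of hard-coding a number, keeps the code correct if you later switch to another cipher method.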
Encryption is done at line 7 through the previously named openssl_encrypt(), while lines 8–9 provide authentication through a calculated hash.
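To make these steps concrete, here is a minimal sketch of an encrypt-then-authenticate function, loosely following the example in the PHP manual. It is not the aes-encrypt.php script itself, so the line numbers mentioned in the text do not apply to it. The function name, the output layout (IV, then HMAC, then ciphertext, all base64 encoded) and the use of a single key for both encryption and HMAC are simplifying assumptions; a stricter design would derive two separate keys.

```php
<?php
// Encrypt a message with AES-256-CBC and protect it with an HMAC.
// ASSUMED layout of the result: base64( IV . HMAC . ciphertext ).
function encryptMessage(string $plaintext, string $key): string
{
    if (strlen($key) !== 32) {
        throw new InvalidArgumentException('Key must be exactly 32 bytes');
    }

    $cipher = 'aes-256-cbc';
    $iv     = random_bytes(openssl_cipher_iv_length($cipher));

    // OPENSSL_RAW_DATA returns raw bytes instead of base64 text.
    $raw = openssl_encrypt($plaintext, $cipher, $key, OPENSSL_RAW_DATA, $iv);
    if ($raw === false) {
        throw new RuntimeException('Encryption failed');
    }

    // Authenticate both the IV and the ciphertext with a keyed hash.
    $mac = hash_hmac('sha256', $iv . $raw, $key, true);

    return base64_encode($iv . $mac . $raw);
}

$key = random_bytes(32);
echo encryptMessage('Meet me at noon', $key), PHP_EOL;
```

Computing the HMAC over the encrypted bytes (and not over the plain text) means a tampered message can be rejected before any decryption is attempted.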
Line 10 creates a random string, used to name the file where we are going to store the encrypted message (lines 12–13), and subsequently a URL (line 15) that our webpage shows to the user once the file is written on the server.
Aes-decrypt.php hosts a simple form that the user needs to fill with her/his password to access the encrypted content. To speed up the testing process, we added a feature that checks if the user has previously saved the password and in this case it fills in the field automatically.
The password is then posted to aes-decrypted.php.
The script looks for the encrypted content (line 9), and decrypts it using the password given by the user (line 19 and following).
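The decryption side could be sketched as follows (PHP 7.1+). Again, this is an illustration and not the actual aes-decrypted.php: the function names and the layout base64( IV . HMAC . ciphertext ) are assumptions, and makeDemoToken() exists only so the example can be run on its own.

```php
<?php
const DEMO_CIPHER    = 'aes-256-cbc';
const DEMO_MAC_BYTES = 32; // length of a raw SHA-256 HMAC

// Helper for this demo only: builds base64( IV . HMAC . ciphertext ).
function makeDemoToken(string $plaintext, string $key): string
{
    $iv  = random_bytes(openssl_cipher_iv_length(DEMO_CIPHER));
    $raw = openssl_encrypt($plaintext, DEMO_CIPHER, $key, OPENSSL_RAW_DATA, $iv);
    return base64_encode($iv . hash_hmac('sha256', $iv . $raw, $key, true) . $raw);
}

// Returns the plain text, or null if the token is malformed,
// was tampered with, or the key is wrong.
function decryptMessage(string $token, string $key): ?string
{
    $data  = base64_decode($token, true);
    $ivLen = openssl_cipher_iv_length(DEMO_CIPHER);
    if ($data === false || strlen($data) <= $ivLen + DEMO_MAC_BYTES) {
        return null;
    }

    $iv  = substr($data, 0, $ivLen);
    $mac = substr($data, $ivLen, DEMO_MAC_BYTES);
    $raw = substr($data, $ivLen + DEMO_MAC_BYTES);

    // Check the HMAC first, with a timing-safe comparison.
    $expected = hash_hmac('sha256', $iv . $raw, $key, true);
    if (!hash_equals($expected, $mac)) {
        return null;
    }

    $plain = openssl_decrypt($raw, DEMO_CIPHER, $key, OPENSSL_RAW_DATA, $iv);
    return $plain === false ? null : $plain;
}

$key   = random_bytes(32);
$token = makeDemoToken('Meet me at noon', $key);

var_dump(decryptMessage($token, $key));             // string(15) "Meet me at noon"
var_dump(decryptMessage($token, random_bytes(32))); // NULL
```

Returning null for every kind of failure keeps the caller from learning whether the key or the content was the problem, which is usually what you want to show to an untrusted visitor.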
This is some basic symmetric cryptography with a reasonable level of security, built with standard PHP, without using any external library. Asymmetric cryptography can be built in a similar way using standard PHP, but it would probably require an article on its own, and a more complicated prototype, so I am avoiding it for now.
Of course, since cryptography is a complicated and fast-changing discipline, you will want to use a ready-made library in your real life projects (Libsodium is my favourite and the standard as I am writing), but I still believe it is useful to play a little bit with the prototype I’ve just summarized to understand how things work under the hood, at least at a basic level.
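For comparison, here is a minimal sketch of the same encrypt and decrypt round trip with Libsodium, assuming the sodium extension is available (it is bundled with PHP since 7.2). The two wrapper function names and the layout base64( nonce . box ) are my own choices; the sodium_* calls are the library’s.

```php
<?php
// Authenticated symmetric encryption with libsodium's "secretbox".
// ASSUMED layout of the result: base64( nonce . box ).
function sodiumEncrypt(string $plaintext, string $key): string
{
    // The nonce plays the role of the IV: random, not secret, one per message.
    $nonce = random_bytes(SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);
    return base64_encode($nonce . sodium_crypto_secretbox($plaintext, $nonce, $key));
}

// Returns the plain text, or null if the token is malformed or was tampered with.
function sodiumDecrypt(string $token, string $key): ?string
{
    $data = base64_decode($token, true);
    $min  = SODIUM_CRYPTO_SECRETBOX_NONCEBYTES + SODIUM_CRYPTO_SECRETBOX_MACBYTES;
    if ($data === false || strlen($data) < $min) {
        return null;
    }

    $nonce = substr($data, 0, SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);
    $box   = substr($data, SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);

    // Verification and decryption happen in a single call.
    $plain = sodium_crypto_secretbox_open($box, $nonce, $key);
    return $plain === false ? null : $plain;
}

$key   = sodium_crypto_secretbox_keygen();
$token = sodiumEncrypt('Meet me at noon', $key);

var_dump(sodiumDecrypt($token, $key)); // string(15) "Meet me at noon"
```

There is no cipher method to pick and no separate hash step to remember here, which is exactly why a ready-made library leaves less room for mistakes.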
Here you can download the whole code and try it on your server, if you are interested: basic_cryptographic_tool.zip
3. Data at rest encryption
Encrypting the physical files of your database can be a good strategy to prevent data leaks in case your infrastructure is compromised. Most modern databases feature some kind of encryption, and if you are working in the EU, GDPR strongly encourages you to encrypt your data at rest. Moreover, data at rest encryption also covers backups made by copying the data files, making it difficult for an attacker to retrieve data even from old abandoned backups (logical dumps made with tools such as mysqldump are plain text, and need their own protection).
Once the encryption features are set up, encrypting a table on MariaDB is really simple. All you need is something like:
ALTER TABLE mytable ENCRYPTED=YES
Before running this SQL statement, be sure you have safe access to the key (or passphrase) needed to decrypt the data: without it, the content of the table cannot be recovered. And basically there is nothing else to know as a database user.