Basic encryption with AWS KMS (part 1 of 2)

Dimitris Dovinos
codefully.io

--

Encryption is no longer a feature for the select few. Data breaches are common, and government snooping, or just plain old snooping, is on the rise. We, the software developers, need to advise our customers to use encryption at every possible level rather than rely solely on the fact that we deploy applications and store data on safe platforms.

Encryption gives us control of our data now and, hopefully, in the future too. Judging from Snowden’s revelations, encryption looks like the last line of defense against organizations like the NSA. If the NSA struggles against cryptographic mechanisms, then your average hacker or hacker organization does not stand a chance. Which brings us to algorithms:

AES-256

It is silly to try to invent your own encryption algorithm when your job is to put together software. This is an intellectual task best left to mathematicians and the spy agencies around the world (in particular, the NSA in the USA, GCHQ in the UK, and Unit 8200 in Israel). So don’t try to be too smart; accept that you have to use what is already available. These people have done their homework, and you more or less have to accept the result of their efforts as the de facto truth. After all, we are dealing with mathematics, which is under the scrutiny of an academic community that is not necessarily connected to any of these institutions. Do some research and you will find that there is consensus (as of early 2020) as to which algorithms are safe and which are not. Go with that consensus.

The standard for symmetric encryption is the AES algorithm. There are libraries that implement it in pretty much every language. Take care over which library you use and make sure that its implementation is correct. You cannot realistically review all the code of your libraries, but you can avoid libraries that are released by individual developers. Let’s look at Ruby: you could get an AES implementation from a random GitHub developer, or you could use the OpenSSL library released by the people who maintain Ruby itself. The choice is clear. This is too serious a matter to allow any room for doubt. Your tools must be up to date and maintained by the best.
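To make the point concrete, here is a minimal sketch of leaning on Ruby’s bundled OpenSSL binding to produce a 256-bit key. The choice of GCM mode is my assumption for illustration; the text above only settles on AES.

```ruby
require 'openssl'

# Ask the bundled OpenSSL binding for AES-256. GCM mode is an assumption
# made for this sketch, not something the article prescribes.
cipher = OpenSSL::Cipher.new('aes-256-gcm')
cipher.encrypt

# random_key generates a key of the correct length for the cipher and sets it.
key = cipher.random_key

puts cipher.key_len        # prints 32 (bytes), i.e. 256 bits
puts [key].pack('m0')      # the key as Base64, convenient for printing or storing
```

The Base64 output is 44 characters long, since 32 raw bytes do not print well as they are.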

How it works

The concept is trivially simple (perhaps a major understatement). Imagine you have a database table with the details of military personnel. This is plaintext information that you have to protect. By ‘protecting’ I mean that nobody other than authorized users should get access, and no one should be able to alter the information without being detected.

Plaintext data

and here is the encrypted version:

Encrypted Data

The way to move from plaintext (unencrypted) to ciphertext (encrypted), and back again, is the key, which is nothing more than a sequence of random bytes, shown here as a Base64 string. The same key is used both for encrypting and for decrypting, which is why the algorithm is called symmetric, as opposed to asymmetric, where you use separate keys for encrypting and decrypting. In this case, the key was:

4VZS/oCY73fTENgF8LyLq00B+bsXEHLlAFpRhoDXTKc=
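With a key like the one above, a round trip could look roughly like the sketch below. It assumes AES-256 in GCM mode (my choice, since it also detects tampering) and a made-up layout of nonce, then tag, then ciphertext. The sample name is invented for illustration.

```ruby
require 'openssl'

# The Base64 key shown above decodes to 32 raw bytes (256 bits).
key = '4VZS/oCY73fTENgF8LyLq00B+bsXEHLlAFpRhoDXTKc='.unpack1('m')

# Returns Base64 of nonce (12 bytes) + auth tag (16 bytes) + ciphertext.
# This layout is an assumption of the sketch, not a standard format.
def encrypt(plaintext, key)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv          # fresh nonce for every message
  cipher.auth_data = ''          # no additional authenticated data
  ciphertext = cipher.update(plaintext) + cipher.final
  [iv + cipher.auth_tag + ciphertext].pack('m0')
end

# Reverses encrypt; raises OpenSSL::Cipher::CipherError if anything was altered.
def decrypt(encoded, key)
  raw = encoded.unpack1('m')
  iv, tag, ciphertext = raw[0, 12], raw[12, 16], raw[28..-1]
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.decrypt
  cipher.key = key
  cipher.iv = iv
  cipher.auth_tag = tag
  cipher.auth_data = ''
  (cipher.update(ciphertext) + cipher.final).force_encoding(Encoding::UTF_8)
end

token = encrypt('Sgt. Jane Doe', key)
puts decrypt(token, key)         # prints the original name

# Flip one bit of the stored value to simulate someone altering the data.
tampered = token.unpack1('m')
tampered.setbyte(-1, tampered.getbyte(-1) ^ 1)
begin
  decrypt([tampered].pack('m0'), key)
rescue OpenSSL::Cipher::CipherError
  puts 'tampering detected'
end
```

Because a new nonce is drawn on every call, encrypting the same name twice gives two different ciphertexts, so identical rows do not look identical in the database.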

So now we are in a position to put together the ‘encrypt’ and ‘decrypt’ methods, and we have managed to safeguard our data. Which brings us to the next problem: what do we do with the key? Ideally the key should be easily accessible, since we will need it for every row we pull from the database. Here are some alternatives:

Keep it with the client (e.g. ask for it in every call to the webserver)

Pros:

  • You are no longer responsible for the key
  • The customer has full control. They can stop supplying the key and the data will no longer be accessible to anyone.

Cons:

  • You have to identify the client. In the case of a military personnel database, the client could be the soldier, or the person using the computer, or the platoon leader. It boils down to who owns the data, which may not always be clear.
  • You have to manage the safe delivery of the key from the client’s computer to your server. Sending the key from the browser exposes it to several pieces of equipment that manage internet traffic and are prone to attacks or owned by actors that cannot be trusted. You will constantly expose the key to third parties. (Yes, there will be HTTPS, but there are still ways for your data to be exposed.)
  • Should there be one key for each person, one key for all, or one key per department? Sharing the key between different users is not secure.
  • The client will soon tire of having to supply the key on every request. In fact, it is not realistic to ask for the key more than once every few hours.

Soon the disadvantages of this approach pile up and make the system both unusable and insecure. It is very nice, though, that the client has the option of revoking the key and making their data inaccessible. This would be very useful if the client were a company that stored its files on the system: it would allow the company to make its data inaccessible instantaneously.
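One way to picture the “ask once every few hours” compromise is a server that keeps the client’s key only in memory and forgets it after a while. The sketch below is purely illustrative: SessionKeyCache, its method names and the client id are all hypothetical, and a real system would also have to worry about delivering the key safely.

```ruby
# Hypothetical in-memory holder for client-supplied keys. Nothing is written
# to disk, so a restart or an expiry makes the data unreadable again.
class SessionKeyCache
  def initialize(ttl_seconds:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl_seconds
    @clock = clock        # injectable so the expiry logic can be tested
    @entries = {}
  end

  # Remember the key the client just supplied.
  def store(client_id, key)
    @entries[client_id] = { key: key, expires_at: @clock.call + @ttl }
  end

  # Returns the key, or nil when it has expired or was never supplied.
  def fetch(client_id)
    entry = @entries[client_id]
    return nil if entry.nil?
    if @clock.call >= entry[:expires_at]
      @entries.delete(client_id)
      return nil
    end
    entry[:key]
  end

  # Lets the client cut off access immediately.
  def revoke(client_id)
    @entries.delete(client_id)
  end
end

cache = SessionKeyCache.new(ttl_seconds: 4 * 60 * 60)  # "every few hours"
cache.store('platoon-7', 'key-bytes-supplied-by-client')
cache.fetch('platoon-7')    # the key, until it expires or is revoked
cache.revoke('platoon-7')
cache.fetch('platoon-7')    # nil, so the client has to supply the key again
```

This softens the usability problem but does nothing about the others: the key still travels over the network, and you still have to decide who the client is.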

Keep it inside the database right next to the data

Pros:

  • You can encrypt and decrypt with minimal overhead, since you get the key in the same database query that fetches the data.
  • You no longer expose the key to unnecessary journeys over the internet.
  • You do not bother the user with supplying the key on every request.
  • You have flexibility and control over which data you encrypt with which key. Should you decide you want one key per database row then you can do it.

Cons:

  • The system is no more secure than it used to be! If a malicious actor gains access to your database, they have both the key and the encrypted data.

The disadvantage of this approach is so critical that it renders it unusable. But there are some great advantages to having the key close to the data. If only I could keep it close to the data and yet make it inaccessible if the database is compromised…
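To see why, here is a rough sketch of the one-key-per-row variant, with a plain Hash standing in for a database row. The column names and the use of GCM mode are assumptions made for the example.

```ruby
require 'openssl'

# Encrypts a value with a brand new key and stores that key in the same row.
# A Hash stands in for the database row; column names are hypothetical.
def encrypt_row(name)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  key = cipher.random_key
  iv = cipher.random_iv
  cipher.auth_data = ''
  ciphertext = cipher.update(name) + cipher.final
  { key: key, iv: iv, tag: cipher.auth_tag, name_encrypted: ciphertext }
end

# Needs nothing but the row itself, which is exactly the problem:
# an attacker holding a database dump can run the very same code.
def decrypt_row(row)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.decrypt
  cipher.key = row[:key]
  cipher.iv = row[:iv]
  cipher.auth_tag = row[:tag]
  cipher.auth_data = ''
  (cipher.update(row[:name_encrypted]) + cipher.final).force_encoding(Encoding::UTF_8)
end

row = encrypt_row('Sgt. Jane Doe')
puts decrypt_row(row)   # prints the original name, no outside secret required
```

The convenience is real, since one query returns everything needed, but so is the weakness: the row is self-decrypting.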

Keep it with a Key Management Service!

We will look into this approach in the next post.

--