Data Encryption in AWS (Part 1)
We don’t usually revisit and discuss Encryption much, at least not at a low level. Usually, Encryption and Decryption are abstracted away by the libraries, methods and protocols we use. However, last week I had the chance to look at a feature that required a bit of work around Encryption, and I had to refresh my knowledge. So I thought it could be an opportunity to share the learnings.
This is a series of 2 posts. In this first post, we will go over some of the theory behind Encryption, use some examples to provide more clarity, and highlight how Encryption works in AWS. In the next post, we will implement client-side encryption together using AWS-provided libraries.
What is Encryption?
Encryption is the process that converts information into an alternative form that hides the true meaning of that information. Encryption existed long before computers, as a way to facilitate secret communication. Rearranging the order of the letters in a message is a typical example of cryptography (the study of Encryption, Decryption, etc.): “Hello World” could become “ehlol owrdl”. Another example is the usage of invisible ink (More info on the history of Cryptography can be found here).
In the computing world, the “Hello World” example would look as follows:
In the previous image, we used a simple algorithm that rearranges the given text and displays it in an unreadable format. Here’s the algorithm in JavaScript:
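As a minimal sketch, assume the scheme is "lower-case the text, then swap the first two and the last two letters of each word". That rule is an assumption: it is simply one rule that reproduces the ehlol owrdl output. The function names are purely illustrative.

```javascript
// Hypothetical helper: swaps the first two and the last two characters of a word.
// Words shorter than 4 characters are returned unchanged.
function swapEnds(word) {
  if (word.length < 4) return word;
  const chars = [...word];
  const n = chars.length;
  [chars[0], chars[1]] = [chars[1], chars[0]];
  [chars[n - 2], chars[n - 1]] = [chars[n - 1], chars[n - 2]];
  return chars.join('');
}

// Scrambles every space-separated word of the (lower-cased) text.
function scramble(text) {
  return text.toLowerCase().split(' ').map(swapEnds).join(' ');
}

console.log(scramble('Hello World'));           // prints "ehlol owrdl"
console.log(scramble(scramble('Hello World'))); // prints "hello world"
```

Running the same function twice gives the (lower-cased) text back, so there is no secret involved at all: anyone who knows the rule can undo it.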
However, this is not very secure, as anyone who knows how to write code could reverse that algorithm and decrypt the cypher text. Hence, a key is usually used as part of the Encryption algorithms. So an algorithm would need both the plain text and a key to perform the Encryption, and the cypher text cannot be decrypted without the right key even if the algorithm is known. Think of a key as a password.
We need a Key
Alright, we know what a key is, and we visualized it as a password, but who provides this key?
Symmetric Encryption
Suppose two people want to exchange some information securely so that nobody else can read it. They could agree on a key and an algorithm to use and store the key on both their computers. Before sending the data, the sender uses the key and the algorithm to generate the cypher text. When the receiver receives the cypher text, they use the same key and the algorithm (or a reversed version of it) to decrypt that data.
There is a slight problem here: how do they share the key? If somebody intercepted that communication and stole the key, the interceptor would be able to decrypt any future data. They could exchange the key physically on a piece of paper, but that ties the sharing operation to the location of these individuals.
This is called symmetric encryption, where the same key is used for both the encryption and decryption processes. It is a good way of encrypting local data on a computer but less practical when transferring data between machines, because the key itself has to be shared securely first.
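To make the "same key on both sides" idea concrete, here is a minimal sketch using Node's built-in crypto module with AES-256-GCM. The function names and the shape of the returned object are assumptions made for illustration, not a prescribed format.

```javascript
const crypto = require('crypto');

// Encrypts UTF-8 text with AES-256-GCM. The result holds everything the
// receiver needs except the key itself.
function encryptSymmetric(key, plainText) {
  const iv = crypto.randomBytes(12); // fresh random nonce for every message
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const cypherText = Buffer.concat([cipher.update(plainText, 'utf8'), cipher.final()]);
  return { iv, cypherText, tag: cipher.getAuthTag() };
}

// Decrypts with the same key; throws if the key is wrong or the data was altered.
function decryptSymmetric(key, { iv, cypherText, tag }) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(cypherText), decipher.final()]).toString('utf8');
}

// Both parties must already hold this 32-byte key.
const sharedKey = crypto.randomBytes(32);
const message = encryptSymmetric(sharedKey, 'Hello World');
console.log(decryptSymmetric(sharedKey, message)); // prints "Hello World"
```

Notice that nothing in the sketch says how sharedKey reaches the other person, which is exactly the weak spot described above.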
Some examples of symmetric encryption algorithms include:
- AES (Advanced Encryption Standard)
- DES (Data Encryption Standard)
- IDEA (International Data Encryption Algorithm)
- Blowfish (Drop-in replacement for DES or IDEA)
Asymmetric Encryption
As the symmetric encryption method is not very practical for data transfer, since sharing the key securely is problematic, we could use the Asymmetric Encryption method for this type of operation. This Encryption type uses two keys: a public key and a private key.
For two people (Jane and John) to exchange data in an encrypted manner, both need to agree on an algorithm and each needs to generate a public and a private key.
The public key is used to encrypt the data (from plain text to cypher text), but it cannot decrypt the generated cypher text. Only the associated private key can decrypt that data.
How could these two people transfer data to each other?
- Jane could share her public key publicly, and there is no risk in doing that; anyone who obtains that public key can only use it to encrypt data, which can only be decrypted by Jane’s private key (the private key needs to be stored securely and only accessible by Jane)
- John uses the public key of Jane to encrypt some data through the agreed-upon algorithm
- Once that data is transmitted to Jane, only she can decrypt it, using her private key through the same algorithm
For Jane to transfer data to John, she’d need to use a public key generated by John to encrypt the data, then transfer the cypher text to John, where he can only decrypt that data using his private key.
Asymmetric Encryption is used when multiple parties need to exchange data without first having to share a secret key.
The following diagram shows the process to get one person transferring encrypted data to the other and decrypting received cypher text:
In the previous diagram, Jane shared a public key. John used that public key with an algorithm to generate cypher text and sent that cypher text to Jane. Jane then used her private key and the algorithm to decrypt the cypher text.
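The Jane and John exchange can be sketched with Node's built-in crypto module. This is a minimal sketch that assumes RSA with OAEP padding as the agreed-upon algorithm; the helper names encryptFor and decryptWith are hypothetical.

```javascript
const crypto = require('crypto');

// Shared settings for both directions: RSA-OAEP with SHA-256.
const oaep = { padding: crypto.constants.RSA_PKCS1_OAEP_PADDING, oaepHash: 'sha256' };

// Jane generates a key pair. Only jane.publicKey is handed out.
const jane = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// John's side: encrypt with the recipient's public key.
function encryptFor(publicKey, plainText) {
  return crypto.publicEncrypt({ key: publicKey, ...oaep }, Buffer.from(plainText, 'utf8'));
}

// Jane's side: decrypt with her own private key.
function decryptWith(privateKey, cypherText) {
  return crypto.privateDecrypt({ key: privateKey, ...oaep }, cypherText).toString('utf8');
}

const cypherText = encryptFor(jane.publicKey, 'Hello Jane');
console.log(decryptWith(jane.privateKey, cypherText)); // prints "Hello Jane"
```

RSA can only encrypt small payloads directly (a couple of hundred bytes with a 2048-bit key), so in practice it is usually used to protect a symmetric key rather than the data itself.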
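Some examples of asymmetric (public-key) algorithms include:
- RSA (Rivest-Shamir-Adleman)
- DH (Diffie-Hellman), which is used to agree on a shared key rather than to encrypt data directly
- ECC (Elliptic-curve cryptography)
- DSA (Digital Signature Algorithm), which is used for signing rather than encryption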
Keys Protection
As you’ve already noticed, protecting the keys we’re using in our encryption is critical. The best way to maximize key security is to use an HSM (hardware security module). This is a device that has multiple security controls built into it that prevent the keys from being leaked or stolen.
Encryption in AWS
AWS encourages encryption as part of a defence-in-depth strategy. They also offer 2 services that use HSMs to protect customer keys: KMS (Key Management Service) and AWS CloudHSM. KMS manages HSMs on the customer’s behalf, while AWS CloudHSM allows the user to manage their own HSM.
A method called “Envelope Encryption” is used in AWS services that encrypt data on behalf of the customers (Server-side encryption). With this method, the data is encrypted with a data key, and the data key is in turn encrypted with another key.
How does that happen? A customer master key (CMK) would live on KMS (or AWS CloudHSM). The CMK is used to encrypt another key (Data key), which is used to encrypt the data. The application receives the data key and uses it to encrypt the data locally, while the “Encrypted Data Key” is stored alongside the cypher text so that it can be decrypted later when the data needs to be read. This is good as it enables encryption or decryption of the data to happen on the customer’s machine rather than sending the data to the HSM (KMS/CloudHSM) every time, which improves performance.
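The flow can be imitated locally to see how the pieces fit. This is only a sketch under loud assumptions: masterKey stands in for the CMK (which in AWS never leaves KMS or CloudHSM), and generateDataKey is a hypothetical local function that imitates a key service returning a plain text data key together with its encrypted copy. No AWS call is made here.

```javascript
const crypto = require('crypto');

// AES-256-GCM helpers. Output layout: iv (12 bytes) | auth tag (16 bytes) | cypher text.
function seal(key, plain) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const body = Buffer.concat([cipher.update(plain), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

function unseal(key, sealed) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, sealed.subarray(0, 12));
  decipher.setAuthTag(sealed.subarray(12, 28));
  return Buffer.concat([decipher.update(sealed.subarray(28)), decipher.final()]);
}

// Assumption: local stand-in for the CMK held by the key service.
const masterKey = crypto.randomBytes(32);

// Hypothetical imitation of the key service: a fresh data key plus its encrypted copy.
function generateDataKey() {
  const plaintextKey = crypto.randomBytes(32);
  return { plaintextKey, encryptedKey: seal(masterKey, plaintextKey) };
}

// Encrypt locally with the data key; keep only the encrypted copy of that key.
function envelopeEncrypt(plainText) {
  const { plaintextKey, encryptedKey } = generateDataKey();
  const cypherText = seal(plaintextKey, Buffer.from(plainText, 'utf8'));
  return { encryptedKey, cypherText };
}

// Unwrap the data key first (in AWS this step would go to the key service),
// then decrypt the data locally.
function envelopeDecrypt({ encryptedKey, cypherText }) {
  const dataKey = unseal(masterKey, encryptedKey);
  return unseal(dataKey, cypherText).toString('utf8');
}

const envelope = envelopeEncrypt('Hello World');
console.log(envelopeDecrypt(envelope)); // prints "Hello World"
```

Only the 32-byte data key ever needs to travel to the key holder, no matter how large the data is, which is where the performance benefit comes from.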
It is advised to use the same “Envelope Encryption” process to perform client-side encryption with KMS or CloudHSM using the AWS Encryption SDK.
How are you using Encryption to secure your data at rest or in transit?
Stay tuned for the next post.