AWS Encryption SDK in Baby Steps

Albert Niderhofer
CyberArk Engineering
5 min read · Oct 6, 2020

How to protect customer data from physical and application-level data breaches is a challenge that every developer will face sooner or later.

When developing a cloud application on AWS, the responsibility for protecting customer data is shared between AWS and the application developer. AWS supplies the infrastructure for encryption at rest (protection against physical data breaches) and for data protection in transit, while the application developer implements client-side encryption.

In reality, designing a client-side encryption solution is a challenging task with many considerations, such as managing the encryption keys’ lifecycle, complying with encryption industry standards, and choosing an encryption library.

Luckily, AWS developed an open source solution, the AWS Encryption SDK, which applies encryption industry best practices and innovations while hiding most of the complexity behind a simple set of APIs and configurations. In addition, the SDK integrates natively with the AWS Key Management Service (KMS).

In this post you will learn, step by step, how to encrypt customer data using a customer master key (CMK), followed by an explanation of the code flow and how it interacts with the AWS services behind the scenes.

Envelope Encryption

Before we dive into code, there is an encryption key concept called Envelope Encryption that you should be familiar with, as the SDK relies on it in its default implementation.

The concept divides the encryption keys into 2 types:

1. Data Encryption Key — DEK in short.

2. Key Encryption Key — KEK in short.

Envelope encryption is the practice of encrypting data with a data encryption key (DEK) and then encrypting the DEK with a key encryption key (KEK) that you can fully manage in a Hardware Security Module (HSM). Conceptually, you can encrypt with a hierarchy of DEKs and encrypt the last DEK with the KEK, the root key that is safely protected in the HSM. So if stored data (the ciphertext together with its encrypted DEK) is revealed to an attacker, it will mean nothing to them without access to the KEK.
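To make the idea concrete, here is a minimal, standard-library-only sketch of the envelope pattern. It is an illustration under loud assumptions: `toy_encrypt` and `toy_decrypt` are hypothetical stand-ins (an HMAC-SHA256 keystream XORed with the data, with no authentication) and must not be used to protect real data. The in-memory `kek` stands in for a root key that in practice never leaves the HSM.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes (HMAC-SHA256 in counter mode)."""
    blocks = []
    for counter in range((length + 31) // 32):
        message = nonce + counter.to_bytes(8, "big")
        blocks.append(hmac.new(key, message, hashlib.sha256).digest())
    return b"".join(blocks)[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """TOY cipher for illustration only: keystream XOR, no integrity protection."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Inverse of toy_encrypt: split off the nonce and XOR again."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def envelope_encrypt(kek: bytes, plaintext: bytes) -> dict:
    """Encrypt the data with a fresh DEK, then encrypt the DEK with the KEK."""
    dek = secrets.token_bytes(32)
    return {
        "encrypted_dek": toy_encrypt(kek, dek),   # safe to store next to the data
        "ciphertext": toy_encrypt(dek, plaintext),
    }


def envelope_decrypt(kek: bytes, envelope: dict) -> bytes:
    """Recover the DEK with the KEK, then decrypt the data with the DEK."""
    dek = toy_decrypt(kek, envelope["encrypted_dek"])
    return toy_decrypt(dek, envelope["ciphertext"])


kek = secrets.token_bytes(32)  # stands in for the root key kept inside an HSM
envelope = envelope_encrypt(kek, b"customer record")
print(envelope_decrypt(kek, envelope))
```

Only the `envelope` dictionary is stored; without the KEK, neither the DEK nor the data can be recovered from it.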

AWS Envelope Encryption

AWS implements envelope encryption with several artifacts: a MasterKey as the KEK, a DataKey as the DEK, and the KMS service, which manages a dedicated HSM behind the scenes.

Figure: AWS Envelope Encryption

The AWS implementation of envelope encryption enables a set of important capabilities in the encryption process, for instance:

· Adding user access control policies into the encryption process

· Handling encryption scale challenges with large data objects

· Allowing customers to manage their encryption keys

· Encrypting the same data under multiple keys across regions

· Creating strong encryption combinations

Sample Code

Pre-Requisites

  • Basic Python knowledge to understand the sample code
  • vscode or any other preferred IDE
  • Installed Python version >= 3.7.6
  • Access to an AWS account
  • Understanding of the AWS KMS concept (read documentation)

Step 1

In the AWS console:

  • Create a symmetric CMK (see the AWS KMS guidelines for creating keys)
  • Ensure you have valid IAM user access keys (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) with permission to use the created symmetric CMK
  • Copy the CMK ID and keep it aside; you will use it later in the code sample.

Step 2

In your computer’s console, run the following commands:

$ mkdir aws-encryption-test
$ cd aws-encryption-test

Open VS Code, select Terminal > New Terminal from the menu, and type the following in the terminal window:

$ pipenv install aws-encryption-sdk
$ export AWS_ACCESS_KEY_ID=<your access Key ID>
$ export AWS_SECRET_ACCESS_KEY=<your secret access Key>

Step 3

You’re ready to start coding.

  • In VS Code, select File > New File from the menu and name the file app.py
  • Paste the following code and replace cmk_id with the CMK ID you copied in Step 1:
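A minimal sketch of what app.py could look like, assuming a 2.x or later release of aws-encryption-sdk (older 1.x releases exposed module-level `encrypt`/`decrypt` functions instead of a client object). The function name `encrypt_and_decrypt`, the placeholder value of `cmk_id`, and the context key/value pairs are illustrative choices, not part of the SDK.

```python
"""app.py: encrypt and decrypt a string with the AWS Encryption SDK (sketch)."""

# Replace with the CMK ID (preferably the full key ARN) you copied in Step 1.
cmk_id = "REPLACE_WITH_YOUR_CMK_ID"


def encrypt_and_decrypt(cmk_id: str, plaintext: bytes) -> bytes:
    """Round-trip `plaintext` through the SDK using the given CMK."""
    # Imported here so the file can be loaded before the SDK is installed.
    import aws_encryption_sdk
    from aws_encryption_sdk import CommitmentPolicy

    client = aws_encryption_sdk.EncryptionSDKClient(
        commitment_policy=CommitmentPolicy.REQUIRE_ENCRYPT_REQUIRE_DECRYPT
    )
    key_provider = aws_encryption_sdk.StrictAwsKmsMasterKeyProvider(key_ids=[cmk_id])

    # Encryption context: non-secret key/value pairs bound to the ciphertext.
    # These particular pairs are made up for the example.
    context = {"purpose": "demo", "tenant": "customer-123"}

    # encrypt() returns the encrypted message and a copy of its header metadata.
    ciphertext, _encrypt_header = client.encrypt(
        source=plaintext, key_provider=key_provider, encryption_context=context
    )

    # The encrypted message is self-contained, so decrypt() needs no context argument.
    decrypted, decrypt_header = client.decrypt(source=ciphertext, key_provider=key_provider)

    # Check that the context we supplied is the one recorded in the message.
    for key, value in context.items():
        if decrypt_header.encryption_context.get(key) != value:
            raise ValueError("unexpected encryption context")
    return decrypted


if __name__ == "__main__":
    if cmk_id.startswith("REPLACE_"):
        print("Set cmk_id to your CMK ID before running this sample.")
    else:
        print(encrypt_and_decrypt(cmk_id, b"Hello, Encryption SDK!"))
```

If `cmk_id` is a bare key ID rather than an ARN, the SDK has no region to read from it, so a default region (for example via the AWS_DEFAULT_REGION environment variable) probably needs to be set as well.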

Run the sample code by typing the following in the VS Code terminal window (inside the pipenv environment, for example after running pipenv shell):

$ python ./app.py

Even with these few lines of code, the SDK and AWS services do a lot of work behind the scenes.

Here is an illustration of the actual flow of the sample code (step numbers marked in yellow).

  • Steps 1–3: Initialization of your code, so you will eventually be able to call the encrypt method of the SDK.
  • Step 4: The SDK calls the KMS API to generate a data key under the given CMK, passing along the given encryption context.
  • Step 5: KMS verifies that the calling code is permitted to use the CMK, according to the CMK policy. You can also bind the encryption context to the CMK policy, to limit encryption/decryption operations according to the context content.
  • Step 6: KMS writes the context info into the audit trail (CloudTrail) logs, so you will be able to investigate the audit records in case of a security event.
  • Step 7: KMS returns the data key in 2 formats: a plaintext data key (plaindatakey) and a copy of the same key encrypted under the CMK (cipherdatakey).
  • Step 8: The SDK uses the plaindatakey to encrypt the plaintext (using the default algorithm, AES-256 GCM). Then it clears the plaindatakey from memory.
  • Step 9: In this last step, the SDK returns a ciphertext that is actually a self-contained structure consisting of ciphertext + cipherdatakey + metadata.
    The SDK knows how to open this structure when it needs to decrypt the text.
    The 2nd returned value is a copy of the encryption metadata that already exists in the ciphertext. It’s for optional use, so let’s ignore it for this post.
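Steps 4 to 7 can be approximated with direct KMS calls. The sketch below is not the SDK's internal code: `request_data_key` and `recover_data_key` are hypothetical helper names, and the KMS client is passed in as a parameter (with boto3 installed, it would be `boto3.client("kms")`). The two KMS operations it uses, `generate_data_key` and `decrypt`, are the standard boto3 KMS client calls.

```python
def request_data_key(kms_client, cmk_id: str, encryption_context: dict):
    """Steps 4-7: ask KMS for a new data key under the CMK.

    KMS checks the caller's permissions, logs the encryption context,
    and returns the key in two formats.
    """
    response = kms_client.generate_data_key(
        KeyId=cmk_id,
        KeySpec="AES_256",
        EncryptionContext=encryption_context,
    )
    plaindatakey = response["Plaintext"]        # used locally, then discarded
    cipherdatakey = response["CiphertextBlob"]  # stored alongside the ciphertext
    return plaindatakey, cipherdatakey


def recover_data_key(kms_client, cipherdatakey: bytes, encryption_context: dict) -> bytes:
    """On decryption: ask KMS to unwrap the stored data key.

    The same encryption context must be supplied, otherwise KMS rejects the call.
    """
    response = kms_client.decrypt(
        CiphertextBlob=cipherdatakey,
        EncryptionContext=encryption_context,
    )
    return response["Plaintext"]
```

Passing the client in as a parameter keeps the helpers easy to exercise with a stub object, without AWS credentials.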

TIP: to turn on SDK tracing, enable DEBUG-level logging in your code. This also activates the Encryption SDK's trace output.

Paste the code below between the imports and the cmk_id variable:
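A minimal sketch using only Python's standard logging module. The logger name "aws_encryption_sdk" is an assumption based on the package name (Python libraries conventionally name their loggers after their modules).

```python
import logging

# Send log records to the console and let DEBUG messages through.
logging.basicConfig(level=logging.DEBUG)

# Explicitly enable DEBUG for the SDK's loggers (name assumed from the package name).
logging.getLogger("aws_encryption_sdk").setLevel(logging.DEBUG)
```

With the root logger at DEBUG, other libraries in the call chain will likely become verbose too, so expect a lot of output.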

Top ten key facts about the SDK you should be aware of:

  1. The SDK supports 4 development languages: C, Java, JavaScript and Python. All language implementations are interoperable, meaning you can encrypt in one language and decrypt in another. This is convenient if you have a polyglot multi-service architecture
  2. When using Python or Java, the SDK only supports symmetric key encryption. With JavaScript or C, you can define your own keyring that also supports asymmetric encryption
  3. The SDK has APIs for encrypting strings or streams (files)
  4. Extensible: you can plug in your own Master Key Provider, so you do not actually need to rely on any AWS services and can even use this SDK on-premises
  5. The SDK comes with built-in support for KMS as the Master Key Provider
  6. The SDK has data key caching functionality to reduce the latency and KMS costs that come with heavy use
  7. The SDK can encrypt and decrypt with multiple CMKs, which is a killer feature to support different encryption per region and/or create stronger encryption combinations
  8. The SDK supplies a built-in encryption context to support Encryption AAD, context based encryption policies and audit encryption records in logs & CloudTrail
  9. The SDK uses a best practice encryption concept called Envelope Encryption, also known as KEK & DEK
  10. The SDK has a CLI interface that is pretty handy to build tools and perform quick checks
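Facts 3, 6 and 7 can be sketched together, again assuming a 2.x or later release of the Python aws-encryption-sdk. The helper names `build_caching_manager` and `encrypt_file`, the cache capacity, and the age and message limits are illustrative values, not recommendations.

```python
def build_caching_manager(key_arns, max_age_seconds=300.0, max_messages=100):
    """Facts 6 and 7: a materials manager that reuses data keys and wraps
    each data key under every CMK in `key_arns` (possibly in different regions)."""
    import aws_encryption_sdk  # imported lazily; requires aws-encryption-sdk

    # The first key generates the data key; the others encrypt additional copies of it.
    key_provider = aws_encryption_sdk.StrictAwsKmsMasterKeyProvider(key_ids=list(key_arns))

    # Keep up to 10 sets of cryptographic materials in local memory.
    cache = aws_encryption_sdk.LocalCryptoMaterialsCache(10)

    return aws_encryption_sdk.CachingCryptoMaterialsManager(
        master_key_provider=key_provider,
        cache=cache,
        max_age=max_age_seconds,              # seconds a cached data key may be reused
        max_messages_encrypted=max_messages,  # messages per cached data key
    )


def encrypt_file(client, materials_manager, source_path, target_path):
    """Fact 3: stream a file through the SDK instead of loading it into memory.

    `client` is expected to be an aws_encryption_sdk.EncryptionSDKClient.
    """
    with open(source_path, "rb") as plaintext, open(target_path, "wb") as ciphertext:
        with client.stream(
            mode="e", source=plaintext, materials_manager=materials_manager
        ) as encryptor:
            for chunk in encryptor:
                ciphertext.write(chunk)
```

A message encrypted under several CMKs can be decrypted by a caller with access to any one of them, which is what makes the per-region setup practical.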

Happy Encrypting!
