Data Encryption in the Cloud, Part 4: Comparing AWS, Azure, and Google Cloud

Kenneth Hui
Mar 9, 2018

Due to the length of this blog post (20 pages), I’ve decided to make it available as a downloadable PDF, which you can grab here. If you plan to switch to the PDF, I suggest reading the first section of this page before you do.

I’ve written previously about the role of data encryption as a critical component of any company’s security posture and the potential pitfalls of not using encryption properly. These concerns are magnified when data is stored outside of customer data centers, as is the case when archiving data to public cloud storage repositories such as Amazon S3, Azure Blob Storage and Google Cloud Storage. It is important to understand that while public cloud providers are responsible for securing the infrastructure and provide tools for protecting the data stored in it, the user is ultimately responsible for using those tools to secure their data.

I want to continue this blog series by giving an overview of how encryption at rest is implemented among the big three public clouds — Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP). In the interest of keeping this post to a manageable scope, I will focus specifically on the following:

  • Data-at-rest encryption only, since the three providers’ implementations of TLS for encrypting data in transit do not differ greatly
  • Encryption of object storage data since that is where the largest amount of data is stored in the cloud. I will probably be covering services such as block storage, file storage and databases in the future.
  • How each provider implements their cryptographic system, including encryption methods, ciphers and key management.

I assume, in this blog post, that readers are familiar with the basics of data encryption and encryption key management. If you want to learn more about encryption generally or need a refresher on terms such as envelope encryption and key encryption key, I strongly suggest reading my encryption primer and key management blog posts. They will provide a foundation for understanding the concepts discussed in this post.

Server-Side vs. Client-Side Encryption

I will be talking about server-side vs. client-side encryption throughout the post, so it might be helpful here to review the differences.

With server-side encryption, data is not encrypted until it is transferred to the target, in this case the object storage service. All three providers offer server-side encryption with some differences in implementation details, particularly with regard to key management.

With client-side encryption, data is encrypted at the source and prior to it being transferred to the target, in this case the object storage service. All three public cloud providers allow for client-side encryption with some offering varying levels of integration.

So now let’s take a look at each of the public cloud providers.

Amazon Web Services

Amazon S3 is the AWS object storage service and is, by far, the most widely used object storage service in the world. As the public cloud provider with the most longevity, AWS has the encryption services and key management offerings with the most options. Details on the AWS services and options are also the most clearly explained and the most accessible of the three providers.

Encryption Methods

Amazon S3 supports both server-side and client-side encryption with a number of options for each. Customers have the option of enabling server-side encryption by default for all objects uploaded to S3. For both server-side and client-side encryption, AWS utilizes AES-256 with Galois/Counter Mode (GCM) for any symmetric key encryption operations. Without getting into detail, GCM provides authenticated encryption by adding a unique tag to the ciphertext which verifies that the encrypted data has not been tampered with in any way. Envelope encryption is used for all client-side options and for all server-side options except when the customer provides the encryption key.

For server-side encryption, Amazon S3 supports three options:

  • Amazon S3-managed keys (SSE-S3)
  • AWS Key Management Service (KMS) managed keys (SSE-KMS)
  • Customer-provided keys (SSE-C)

With SSE-S3, both the KEKs and the DEKs are stored and managed by the S3 service. All key management functions, including the periodic rotation of keys, are performed by the service without input required from the user. S3 does this by using an AWS-managed Key Management Service (KMS). The encryption workflow for SSE-S3 is as follows:

  • Data is uploaded to Amazon S3
  • The S3 service generates a unique one-time Data Encryption Key (DEK)
  • The uploaded data is encrypted using the DEK
  • The DEK is then encrypted using a KEK that is stored and managed by the S3 service
  • The encrypted DEK is stored, as metadata, alongside the ciphertext data while the plaintext version of the DEK is deleted from memory

The decryption workflow is as follows:

  • Amazon S3 retrieves the encrypted DEK for the requested object and decrypts it using the associated KEK
  • S3 decrypts the ciphertext object using the decrypted DEK and then deletes the key from memory
  • The decrypted object is downloaded to the requesting client or application
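
The two workflows above can be sketched in a few lines of Python. This is a minimal illustration of the envelope pattern only, not how S3 is implemented: `toy_cipher`, `envelope_encrypt` and `envelope_decrypt` are hypothetical names, and the cipher is a deliberately simple SHA-256 keystream XOR standing in for AES-256 GCM so that the example runs with the standard library alone. It is not secure and provides no authentication tag.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (stand-in for AES-256 GCM).

    NOT secure and NOT authenticated; it only makes the key flow runnable.
    Applying it twice with the same key and nonce returns the input.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def envelope_encrypt(kek: bytes, plaintext: bytes) -> dict:
    dek = secrets.token_bytes(32)          # unique one-time DEK
    data_nonce = secrets.token_bytes(12)
    key_nonce = secrets.token_bytes(12)
    stored = {
        "ciphertext": toy_cipher(dek, data_nonce, plaintext),
        "data_nonce": data_nonce,
        # Only the encrypted DEK is kept, as metadata next to the ciphertext.
        "encrypted_dek": toy_cipher(kek, key_nonce, dek),
        "key_nonce": key_nonce,
    }
    del dek                                # drop the plaintext DEK reference
    return stored


def envelope_decrypt(kek: bytes, stored: dict) -> bytes:
    # Recover the DEK with the KEK, then use the DEK to recover the data.
    dek = toy_cipher(kek, stored["key_nonce"], stored["encrypted_dek"])
    return toy_cipher(dek, stored["data_nonce"], stored["ciphertext"])


kek = secrets.token_bytes(32)              # held by the service, never stored with the object
obj = envelope_encrypt(kek, b"example object contents")
assert envelope_decrypt(kek, obj) == b"example object contents"
```

The point of the pattern is that the stored record never contains a usable DEK: without the KEK, the encrypted DEK and the ciphertext are both opaque.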

Before diving into the SSE-KMS option, it is important to note that KMS uses the term Customer Master Key (CMK) to describe what would typically be called the Key Encryption Key (KEK). Similarly, AWS uses the term Data Key to describe what would typically be called the Data Encryption Key (DEK). This is not merely a change in terminology: a CMK and a Data Key are logical representations of a KEK and a DEK, respectively. I will provide more details when we talk specifically about key management.

With SSE-KMS, the Data Key is encrypted either by a default CMK that is automatically created when a user chooses to encrypt an S3 object for the first time in an AWS Region or by a pre-existing CMK created by the user. Using a CMK created explicitly by the user provides more flexibility and control over the CMK. The encryption workflow for SSE-KMS is as follows:

  1. Data is uploaded to Amazon S3
  2. The S3 service requests both the plaintext and the ciphertext versions of a Data Key under the context of either the default CMK or the user-created CMK
  3. AWS KMS uses the CMK to generate a new unique one-time Data Key and encrypts the key using the CMK
  4. AWS KMS sends both the plaintext and the ciphertext versions of the Data Key to S3
  5. S3 uses the plaintext Data Key to encrypt the object and deletes the key from memory
  6. The encrypted Data Key is stored in S3 as metadata alongside the encrypted object

The decryption workflow is as follows:

  1. Amazon S3 retrieves the encrypted Data Key for the requested object and sends it to AWS KMS
  2. KMS decrypts the Data Key using the associated CMK and sends the decrypted key to S3
  3. S3 decrypts the object using the decrypted Data Key and then deletes the key from memory
  4. The decrypted object is downloaded to the requesting client or application
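
From the caller's side, the difference between SSE-S3 and SSE-KMS comes down to a couple of request parameters. The sketch below builds the keyword arguments for the S3 PutObject operation as exposed by boto3 (`ServerSideEncryption` and `SSEKMSKeyId`). The helper name `sse_put_params`, the bucket, the object key and the key ID are hypothetical, and the code only constructs the dictionary; it makes no call to AWS.

```python
from typing import Optional


def sse_put_params(bucket: str, key: str, body: bytes,
                   use_kms: bool = False,
                   kms_key_id: Optional[str] = None) -> dict:
    """Build PutObject parameters for SSE-S3 or SSE-KMS."""
    params = {"Bucket": bucket, "Key": key, "Body": body}
    if use_kms:
        params["ServerSideEncryption"] = "aws:kms"      # SSE-KMS
        if kms_key_id:
            # A user-created CMK; leave it out to use the default CMK.
            params["SSEKMSKeyId"] = kms_key_id
    else:
        params["ServerSideEncryption"] = "AES256"       # SSE-S3
    return params


# Hypothetical names, for illustration only.
sse_s3 = sse_put_params("example-bucket", "reports/q1.csv", b"...")
sse_kms = sse_put_params("example-bucket", "reports/q1.csv", b"...",
                         use_kms=True,
                         kms_key_id="1234abcd-12ab-34cd-56ef-1234567890ab")

# With boto3 installed and credentials configured, the upload would be:
#   boto3.client("s3").put_object(**sse_kms)
```

No extra parameters are needed on download for either option, since S3 performs the decryption steps itself.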

The SSE-C option places the burden of key management completely on the user. Amazon S3 still handles the encryption and decryption process, but the customer provides the encryption key, which must be an AES-256 symmetric key. The customer provides this key for every encryption and decryption operation. AWS doesn’t store the actual key but performs a salted Hash-based Message Authentication Code (HMAC) operation against it and stores the resultant hash. Without getting into details, an HMAC hash is essentially a digital signature that can be used to verify the authenticity of the key for future operations without having to store customer-provided keys in AWS. The hash itself cannot be used to decrypt data or to recover a lost key.

The encryption workflow for SSE-C is as follows:

  1. Data is uploaded to Amazon S3 along with the customer-provided encryption key
  2. The data is encrypted by the S3 service
  3. A hash of the encryption key is created and the key itself deleted from memory
  4. The hash and the encrypted object are saved to S3

The decryption workflow is as follows:

  1. The client or application requests an object and provides the symmetric key used for encryption as part of the request
  2. Amazon S3 validates the symmetric key using the hash that was created at encryption
  3. S3 decrypts the object using the symmetric key and then deletes the key from memory
  4. The decrypted object is downloaded to the requesting client or application
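
To make the SSE-C exchange more concrete, here is a small sketch of both sides. `sse_c_headers` builds the three request headers that the S3 REST API uses for SSE-C (algorithm, Base64-encoded key, and Base64-encoded MD5 digest of the key). `key_fingerprint` and `key_matches` are hypothetical helpers showing one plausible way a service could validate a key without storing it. The exact construction AWS uses is not described here, so the salted HMAC-SHA256 below is an assumption.

```python
import base64
import hashlib
import hmac
import secrets


def sse_c_headers(customer_key: bytes) -> dict:
    """Headers a client sends with every SSE-C upload and download."""
    if len(customer_key) != 32:
        raise ValueError("SSE-C expects a 256-bit key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key":
            base64.b64encode(customer_key).decode(),
        "x-amz-server-side-encryption-customer-key-MD5":
            base64.b64encode(hashlib.md5(customer_key).digest()).decode(),
    }


def key_fingerprint(customer_key: bytes, salt: bytes) -> bytes:
    """Assumed service-side step: salted HMAC of the key, stored with the object."""
    return hmac.new(salt, customer_key, hashlib.sha256).digest()


def key_matches(customer_key: bytes, salt: bytes, stored: bytes) -> bool:
    """Check a key supplied on download against the stored fingerprint."""
    return hmac.compare_digest(key_fingerprint(customer_key, salt), stored)


customer_key = secrets.token_bytes(32)
salt = secrets.token_bytes(16)
stored = key_fingerprint(customer_key, salt)   # kept; the key itself is discarded

assert key_matches(customer_key, salt, stored)
assert not key_matches(secrets.token_bytes(32), salt, stored)
```

Because only the fingerprint is stored, a customer who loses the key loses access to the object; nothing on the service side can reconstruct it.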

Moving on to client-side encryption, Amazon S3 supports two options:

  • Using a KMS-managed Customer Master Key (CMK)
  • Using a client-side master key

To assist customers who choose the client-side encryption option, AWS provides an Amazon S3 encryption client which is embedded into the AWS SDK for a number of languages including Java, Go and others. The encryption client handles all data encryption and decryption operations using AES-256 GCM symmetric encryption with a master key (AWS equivalent of a Key Encryption Key) generated in KMS or provided by the user.

With client-side encryption that leverages AWS KMS, the customer creates a CMK in KMS and receives a CMK ID, which is a logical representation of the actual CMK. Users provide the CMK ID when they request an object, ensuring that the actual CMK never leaves KMS. The encryption workflow is as follows:

  1. Data is passed to the AWS encryption client
  2. The encryption client requests a Data Key from KMS using a specified CMK ID
  3. KMS uses the associated CMK to generate a new unique one-time Data Key
  4. KMS passes the plaintext and the ciphertext versions of the Data Key to the encryption client
  5. The encryption client encrypts the data using the plaintext Data Key and then deletes the key from memory
  6. The encryption client returns the encrypted message which includes the encrypted Data Key alongside the encrypted data
  7. The encrypted message is uploaded to S3

The decryption workflow is as follows:

  1. Encrypted data in the form of an encrypted message is downloaded to the user client or application
  2. The user passes the encrypted message to the AWS encryption client
  3. The encryption client retrieves the encrypted Data Key from the encrypted message and sends the encrypted Data Key to KMS
  4. KMS uses the associated CMK to decrypt the ciphertext Data Key
  5. KMS passes the plaintext Data Key to the encryption client
  6. The encryption client decrypts the data using the plaintext Data Key and then deletes the key from memory
  7. The encryption client returns the decrypted data

Using this option, customers are able to encrypt data before it leaves their data center and is uploaded to S3, and they can do so without bearing the responsibility for maintaining a cryptographic system or their own Key Management Infrastructure (KMI).

The final client-side encryption option requires the customer to provide the master key (KEK) that is used to encrypt any Data Keys. This master key can be a symmetric key or an asymmetric public key. The client-side master key is provided to the AWS encryption client, which handles all data encryption and decryption operations. The customer has full responsibility for storing and managing the master key. The encryption workflow is as follows:

  1. Data is passed to the AWS encryption client along with a master key
  2. The encryption client generates a unique one-time only data key and encrypts the data using that key
  3. The encryption client encrypts the plaintext data key using the provided master key and then deletes the plaintext key from memory
  4. The encryption client returns the encrypted message which includes the encrypted data, the encrypted data key and metadata that associates that data key with a master key
  5. The encrypted message is uploaded to S3

The decryption workflow is as follows:

  1. Encrypted data in the form of an encrypted message is downloaded to the user client or application from S3
  2. The user passes the encrypted message to the AWS encryption client
  3. The encryption client retrieves the encrypted data key from the encrypted message and finds the associated master key using the metadata in the encrypted message
  4. The encryption client uses the associated symmetric master key or asymmetric private master key to decrypt the ciphertext Data Key
  5. The encryption client decrypts the data using the plaintext Data Key and then deletes the key from memory
  6. The encryption client returns the decrypted data

Users can also choose to encrypt data prior to being uploaded to S3 using their own cryptographic and key management infrastructure without the AWS encryption client. The encryption process is transparent to S3 and the encrypted data is stored as it would be with unencrypted data.

Key Management

From a key management perspective, AWS primarily offers three options to help customers with encryption of their S3 data:

  1. CloudHSM for customers with their own key management software
  2. Amazon S3-managed keys
  3. AWS KMS

We’ve already spent time on the Amazon S3-managed keys option and explained that the S3 service can store and manage encryption keys on behalf of the customer, including periodic rotation of keys. With this option, users offload all key management responsibilities to AWS.

CloudHSM offers key storage for customers with their own key management software who don’t want to manage their own key storage system. CloudHSM gives customers a dedicated, tamper-proof hardware appliance running in an Amazon data center. The CloudHSM appliance integrates with key management software that users can run on-premises or in AWS. Users are responsible for the full lifecycle of the keys that are stored in CloudHSM.

AWS Key Management Service (KMS) is a fully managed service that is essentially key management software as a service, running on a fleet of hosted HSM appliances. Users can interact with KMS via a web interface that gives them the option of managing the full lifecycle of encryption keys stored in KMS. Users can also choose to leverage KMS for key management through SSE-KMS or Client-Side Encryption with KMS-managed customer keys.

Historically Amazon has called their Hardware Security Module (HSM) appliances Hardened Security Appliances (HSA) but they are essentially the same thing.

Internally, KMS has multiple layers of abstraction, all designed to secure the encryption keys and to simplify key management. This begins with the CMK, which represents the top of a customer’s key hierarchy within KMS. A CMK is not an actual instance of a master key but a logical container for a set of historical keys.

When a user requests a new CMK, the following workflow occurs:

  1. A CMK container is created with an associated CMK ID and Amazon Resource Name (ARN)
  2. An initial AES-256 HSA Backing Key (HBK) will be generated on an HSA that is part of a given KMS domain. Without getting into details, a domain is a collection of trusted KMS entities and each domain has a set of domain keys that are shared among HSAs in that domain.
  3. The HBK is encrypted under a domain key using AES-256 GCM. The sole purpose of a domain key is to encrypt HBKs.
  4. The encrypted HBK is exported to the newly created CMK as an Encrypted Key Token (EKT) backing the CMK

When a CMK is rotated, the following workflow occurs:

  1. A new HBK is generated on an HSA
  2. The new HBK is encrypted under a domain key
  3. The encrypted HBK is exported as an EKT to the CMK that is being rotated
  4. KMS activates the new EKT to become the backing key for the CMK
  5. KMS deactivates and saves the previous EKT within the CMK container so it can be used to decrypt the Data Keys, also known as Customer Data Keys (CDKs), that it previously encrypted

Because of the use of these abstractions, the CMK maintains the same CMK ID and ARN even if the CMK is rotated multiple times. Only the backing key changes, so users can continue using the “same” CMK while avoiding encryption key over-reuse.
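
The container idea is easy to model. The sketch below is a simplified illustration, not the KMS implementation: `CmkContainer` and its methods are hypothetical names, and the backing keys are held as raw random bytes, whereas the workflow above stores them as EKTs encrypted under a domain key.

```python
import secrets
import uuid
from dataclasses import dataclass, field


@dataclass
class CmkContainer:
    """Logical CMK: a stable ID plus every backing key it has ever had."""
    key_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    backing_keys: list = field(default_factory=list)   # oldest first

    def __post_init__(self):
        if not self.backing_keys:
            self.rotate()                # create the initial backing key

    def rotate(self) -> int:
        """Add a new active backing key; earlier ones stay for decryption."""
        self.backing_keys.append(secrets.token_bytes(32))
        return self.active_version

    @property
    def active_version(self) -> int:
        return len(self.backing_keys) - 1

    def key_for_encrypt(self) -> tuple:
        """New Data Keys are always wrapped with the newest backing key."""
        return self.active_version, self.backing_keys[-1]

    def key_for_decrypt(self, version: int) -> bytes:
        """Older Data Keys are unwrapped with the version that wrapped them."""
        return self.backing_keys[version]


cmk = CmkContainer()
cmk_id = cmk.key_id
old_version, old_key = cmk.key_for_encrypt()

cmk.rotate()
new_version, new_key = cmk.key_for_encrypt()

assert cmk.key_id == cmk_id                         # same CMK ID after rotation
assert new_key != old_key                           # different backing key
assert cmk.key_for_decrypt(old_version) == old_key  # old data stays readable
```

Callers only ever reference `key_id`, which is why rotation does not require re-encrypting existing data or changing application configuration.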

Note that during the data encryption process, KMS uses an HSA to generate a DEK which is also known as a Customer Data Key (CDK). The CDK is also a logical key container that holds the actual DEK and other relevant cryptographic materials.

Beyond the creating and rotating of keys, KMS enables management of the full lifecycle of encryption keys.

In terms of KMS integration with S3, under SSE-S3 the KMS is fully managed by AWS and offers the following key management capabilities:

  • CMKs are generated on behalf of the user
  • CMKs are rotated automatically every three years
  • CMKs cannot be deleted
  • Key access policies are managed by AWS

With SSE-KMS and Client-Side Encryption with KMS-managed customer keys, customers manage their own CMKs and have the following key management capabilities:

  • CMKs are customer generated
  • CMKs are rotated automatically once a year or on-demand
  • CMKs can be deleted
  • Key access policies are managed by the customer

Microsoft Azure

Azure Blob Storage is the Microsoft Azure object storage service offering. For the purposes of this post, we will be focusing on Block Blob Storage, which is most similar to Amazon S3 and Google Cloud Storage. While Azure has almost as many options as AWS for data encryption and key management, the details are not always easy to find and are often unclear or missing.

Encryption Methods

Azure supports both server-side and client-side encryption, with users having the option of enabling server-side encryption by default for all uploaded objects. In Azure, server-side encryption is called Storage Service Encryption when it pertains to blob storage. Azure leverages envelope encryption using AES-256 symmetric keys for data or content encryption (Microsoft uses the term Content Encryption Key in place of Data Encryption Key) and supports using either symmetric or asymmetric keys for the Key Encryption Key (KEK), depending on who is generating and managing the keys.

Storage Service Encryption supports using a KEK that is either:

  • Managed by the storage service itself, using Microsoft’s internal key management infrastructure
  • Customer managed and stored in Key Vault, the Azure key management service offering.

The encryption workflow for Storage Service Encryption is as follows:

  1. Data is uploaded to Azure Blob Storage
  2. Azure Blob Storage calls a cryptographic library to generate a unique one-time Content Encryption Key (CEK)
  3. The uploaded data is encrypted using the CEK
  4. The CEK is then encrypted using an RSA public KEK that is either stored and managed by the storage service or stored in Azure Key Vault
  5. The encrypted CEK is stored, as metadata, alongside the ciphertext data while the plaintext version of the CEK is deleted from memory

The decryption workflow is as follows:

  1. When data is requested, Azure Blob Storage retrieves the encrypted CEK and sends it to the storage service’s internal key management service or to Azure Key Vault
  2. The CEK is decrypted using the private key associated with the KEK and sent back to Azure Blob Storage
  3. The data is decrypted using the plaintext CEK
  4. Azure Blob Storage discards the CEK and sends the decrypted data to the client that requested the data

For client-side encryption, Azure supplies a storage client library, written for Java, .NET and Python, that integrates with Azure encryption. With this option, users have the option of storing and managing their own KEKs or using Azure Key Vault. The workflow is as follows:

  1. The storage client library generates a unique one-time Content Encryption Key (CEK)
  2. The data is encrypted using the CEK
  3. The storage client invokes a key wrapping algorithm calling a KEK that is either stored by the user or stored in Azure Key Vault. The KEK can be a symmetric or asymmetric key
  4. The CEK is encrypted using a KEK
  5. The encrypted CEK is stored, as metadata, alongside the ciphertext data while the plaintext version of the CEK is deleted from memory
  6. The encrypted data is uploaded to and stored in Azure Blob Storage

The decryption workflow is as follows:

  1. The encrypted data is retrieved from Azure Blob Storage
  2. The storage client invokes a key unwrapping algorithm calling a KEK that is either stored by the user or stored in Azure Key Vault.
  3. The encrypted CEK is decrypted using the KEK
  4. The data is decrypted using the plaintext CEK which is then deleted from memory

Users can also choose to encrypt data prior to being uploaded to Azure using their own cryptographic and key management infrastructure without the storage client library. The encryption process is transparent to Azure Blob Storage and the encrypted data is stored as it would be with unencrypted data.

Key Management

For server-side encryption, keys are managed via one of two options:

  1. All keys are generated and stored by the Azure Blob Storage service itself. Microsoft handles key storage and management with no customer involvement.
  2. CEKs are generated and stored within Azure Key Vault. KEKs are stored within Azure Key Vault but managed by the customer. The KEK can be generated within Key Vault or imported into Key Vault.

For client-side encryption, keys are managed via one of three options:

  1. CEKs are generated by the Azure storage client library. KEKs are stored within Azure Key Vault but managed by the customer.
  2. CEKs are generated by the Azure storage client library. KEKs are generated, stored and managed by the customer using their own key management infrastructure.
  3. Both the CEK and the KEK are generated, stored and managed by the customer using their own cryptographic system. Azure Blob Storage is unaware that it is storing encrypted data.

For both the service-managed and Azure Key Vault options, keys are stored in a set of Hardware Security Modules (HSM) that are managed by Microsoft.

With the service-managed option, all keys are generated by the Azure Blob Storage Service and managed by Microsoft. This includes storing keys, rotating keys and also managing the lifecycle of older historical keys which are needed for decryption of older data.

Server-Side encryption using service-managed keys can be enabled by default for Azure Blob storage and is a good option for customers who don’t have their own key management infrastructure on-premises or in Azure and don’t want to assume the responsibility for key management.

Azure Key Vault is a service that exposes an HSM-backed key management infrastructure to Azure customers and can be integrated with both server-side and client-side encryption. Users can generate keys via Azure Key Vault or import them to the Key Vault. Responsibility for managing the lifecycle of the keys falls on the user, using Azure Key Vault tools.

Azure Key Vault is a great option for customers who have a requirement for managing their own encryption keys but don’t have their own key management infrastructure installed. Unlike service-managed keys, customers can choose when they want to rotate, revoke or delete keys while relying on Azure Key Vault and the HSM infrastructure the service runs on to store and to protect all sets of historical keys.

Google Cloud Platform

Google Cloud Storage (not the most original name in the world) is the Google Cloud Platform (GCP) object storage service. Probably due to the newness of the service, Google Cloud Storage has the fewest options for encrypting data and managing keys. Those options, however, are clearly explained.

Encryption Methods

Google Cloud Storage performs server-side encryption by default on all uploaded objects. All data is broken into chunks which can be up to several GB in size. Using envelope encryption, each chunk of data is encrypted with a unique Data Encryption Key (DEK) that is also encrypted with a Key Encryption Key (KEK) and the encrypted version of the DEK is then stored alongside the encrypted data. The encrypted chunks of data are then distributed across Google’s storage systems. Both the KEK and the DEK use symmetric AES-256 with Galois Counter Mode (GCM) cipher.

Google Cloud Storage supports server-side encryption with two options:

  • Keys generated and stored in Google’s KMS
  • Customer-supplied Encryption Key

Server-side encryption uses KEKs that are either stored and managed by Google’s internal KMS (not to be confused with Google’s Cloud KMS service) or supplied by the customer. The workflow using keys managed by KMS is as follows:

  1. Data is broken into multiple sub-file chunks after being uploaded to Google Cloud
  2. A Google Cloud Storage system calls a common cryptographic library that Google maintains, called CrunchyCrypt, to generate a unique one-time use DEK
  3. Each data chunk is encrypted using a DEK
  4. The storage system then sends the DEK to Google’s Key Management Service (KMS) to be encrypted using that storage system’s associated Key Encryption Key (KEK)
  5. The encrypted DEK is sent and stored alongside the ciphertext chunk it encrypted in Google Cloud Storage while the plaintext version of the DEK is deleted from memory

The decryption workflow is as follows:

  1. When data is requested, Google Cloud Storage identifies the chunks in which the data is stored and where the chunks reside and retrieves the chunks
  2. For each data chunk, the storage system retrieves the encrypted DEK and sends it to Google’s KMS for decryption
  3. KMS sends the decrypted DEK to the storage system where it is used to decrypt the data
  4. The storage system discards the DEK and sends the decrypted data to the client that requested the data

With the Customer-Supplied Encryption Key (CSEK) option, users have to generate their own AES-256 symmetric key and provide it to Google Cloud Storage for encryption/decryption operations. The CSEK is only stored in storage system memory and never persisted to any Google Cloud device. The encryption workflow is as follows:

  1. The CSEK is provided to Google Cloud Storage along with the data upload
  2. Data is broken into multiple sub-file chunks
  3. A Google Cloud Storage system calls a common cryptographic library that Google maintains, called CrunchyCrypt, to generate a unique one-time use DEK
  4. Each data chunk is encrypted using a DEK
  5. The storage system then uses the CSEK as the KEK and encrypts the DEK
  6. The encrypted DEK is sent and stored alongside the ciphertext chunk it encrypted in Google Cloud Storage while the plaintext version of the DEK is deleted from memory
  7. The customer-supplied encryption key is hashed and then purged from the storage system. The cryptographic hash is used to validate future requests but can’t be used to decrypt data or to reconstruct the key

The decryption workflow is as follows:

  1. The client or application requests data from Google Cloud Storage while supplying the CSEK
  2. Google Cloud Storage identifies the chunks in which the data is stored and where the chunks reside and retrieves the chunks
  3. For each data chunk, the storage system retrieves the encrypted DEK, decrypts it using the CSEK and then uses the DEK to decrypt the chunk
  4. The storage system discards the DEK and sends the decrypted data to the client or application that requested the data
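
On the client side, supplying a CSEK means attaching three headers to each upload and download request: the algorithm, the Base64-encoded key, and the Base64-encoded SHA-256 hash of the key. The sketch below only builds those headers; `csek_headers` is a hypothetical helper name and no request is sent.

```python
import base64
import hashlib
import secrets


def csek_headers(csek: bytes) -> dict:
    """Headers sent with every request that uses a customer-supplied key."""
    if len(csek) != 32:
        raise ValueError("a CSEK must be a 256-bit AES key")
    return {
        "x-goog-encryption-algorithm": "AES256",
        "x-goog-encryption-key": base64.b64encode(csek).decode(),
        "x-goog-encryption-key-sha256":
            base64.b64encode(hashlib.sha256(csek).digest()).decode(),
    }


csek = secrets.token_bytes(32)     # generated and kept by the customer
headers = csek_headers(csek)

# The same key must be presented again to read the object back.
assert csek_headers(csek) == headers
```

The hash header lets the service check that the key arrived intact, and it corresponds to the stored hash that is used to validate later requests.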

Since Google’s KMS is not involved, users are responsible for not only key generation but their own key management.

Google Cloud allows for client-side encryption but does not currently offer any specific integrations such as a client-side library for generating DEKs. The user is responsible for generating the encryption keys and encrypting the data prior to uploading it to Google Cloud Storage. The encryption process is transparent to Google Cloud Storage and the encrypted data is stored as it would be with unencrypted data, which means the client-side encrypted data will actually be encrypted again by the service.

Key Management

The Google Cloud Storage Encryption by Default option leverages Google’s internal Key Management Service (KMS) and should not be confused with GCP’s Cloud KMS offering which is currently NOT supported with Google Cloud Storage. Customer KEKs are generated by and centrally stored in Google’s internal KMS. This KMS is protected using a hierarchy of encryption keys.

  • Each KEK (used to encrypt DEKs) is stored in Google’s KMS, which runs across multiple machines in various data centers globally. These KEKs are encrypted with a KMS Master Key using AES-256.
  • The KMS Master Key is stored in a separate system called the Root KMS that is distributed across multiple smaller dedicated machines. The KMS Master Key is also encrypted, using AES-256, with a Root KMS Master Key.
  • The Root KMS Master Key is stored in a system called the Root KMS Master Key Distributor which replicates the key globally. This Key Distributor system holds the keys in RAM and runs on the same machine that runs the root KMS.
  • The root KMS Master Key is also backed up to secure hardware devices (likely Hardware Security Modules) which are stored in physical safes. These safes are stored separately in highly secured facilities that can be accessed by only a few Google employees.

The root KMS Master Key distributor is a distributed system where each instance will periodically compare its keys with the keys stored in a random peer system and reconcile any differences that are found. This ensures that there is no single point of failure and that KMS maintains high availability.

Users can offload a host of key management functions to Google’s KMS, including key generation and key rotation. KEKs are generated using a random number generator built by Google and seeded from various entropy sources. Key rotation is performed every 90 days and up to 20 versions of a KEK can be stored at a time. This means all the DEKs that are encrypted by a given KEK would need to be re-encrypted at least once every 5 years.
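
The five-year figure follows directly from the two numbers above. A quick check, with hypothetical constant names:

```python
ROTATION_PERIOD_DAYS = 90    # a new KEK version every 90 days
MAX_KEK_VERSIONS = 20        # versions retained at any one time

# A KEK version ages out once 20 newer versions have been created.
retention_days = ROTATION_PERIOD_DAYS * MAX_KEK_VERSIONS
retention_years = retention_days / 365

print(retention_days)               # 1800
print(round(retention_years, 2))    # 4.93
```

So a DEK has to be re-encrypted under a newer KEK version within roughly 1,800 days, which rounds to the five years quoted.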

Summary

Below is a side-by-side comparison of encryption-at-rest across the three providers’ object storage services. As you may expect, the robustness of the service and the diversity of options are strongly correlated with the age of the cloud provider.

It should be clear that security in general and data encryption in particular are key features for the big three public cloud providers. Every user who is considering if they should use the public cloud or which public cloud to choose should factor data encryption into their decision. Hopefully, this blog series will help shed some light and assist readers in their evaluations.

Originally published at cloudarchitectmusings.com on March 9, 2018.


Kenneth Hui

Ken is the Service Solutions Architect Leader for the Amazon Web Services (AWS) Data Protection Team. He is passionate about Cloud Computing and great food.