Beef up your MongoDB security with Client-Side Field-Level Encryption

Shyam Arjarapu
Published in The Startup
8 min read · Jan 23, 2020

Federal compliance standards require organizations to identify Personally Identifiable Information (PII) and Protected Health Information (PHI) and to handle them securely. Companies in the healthcare, health insurance, and finance industries may have to comply with regulations such as the Health Insurance Portability and Accountability Act (HIPAA), the General Data Protection Regulation (GDPR), and the California Consumer Privacy Act (CCPA). If you work for one of these companies, then MongoDB Client-Side Field-Level Encryption is a godsend for you. This article helps you understand:

  • Why do you need Field-Level Encryption?
  • What is Field-Level Encryption and how does it work?
  • Differences between Community Edition vs Enterprise Edition
  • Hands-on lab to implement Field-Level encryption

This is one of the many articles in the multi-part series Mastering MongoDB — One tip a day, created to help you master MongoDB by learning ‘one tip a day’. Over a few articles in this series, I would like to share tips for tightening the security of MongoDB. In this article, I discuss how to beef up your MongoDB security with Client-Side Field-Level Encryption.

Mastering — MongoDB Client-Side Field-Level Encryption

Why do you need Field-Level Encryption?

MongoDB offers various levers to harden the security of your database. The Security-Checklist documentation and the MongoDB Security Architecture whitepaper provide in-depth guidelines on how to build the key security capabilities needed for HIPAA or PCI-DSS compliance. These security hardening recommendations include

While these hardening recommendations really tighten the security of your data, what if someone (for example, your DBA) has credentials with read privileges on a collection containing sensitive information? Nothing stops them from reading that information, as it is stored in cleartext. Had this information been transmitted to and stored in the database as encrypted text rather than plain text, decrypting it would depend on access to the key used to encrypt it (hopefully it’s not saved in the very next field :p).

Typically, application developers use cryptographic libraries available in Java, .NET, Node.js, etc. to encrypt and decrypt the text. What if all the heavy lifting of encrypting and decrypting were done for you automatically and consistently, irrespective of the programming language or client driver you use? This is exactly what MongoDB Client-Side Field-Level Encryption does for you.

What is Field-Level Encryption and how does it work?

Starting with MongoDB v4.2, the compatible MongoDB drivers support client-side Field-Level encryption. Client-side applications can encrypt fields in a document using a data encryption key before transmitting the data over the wire to the server. This way, network eavesdroppers, the MongoDB server, and users with data access see only the encrypted text. Only applications with access to the encryption key can decrypt the sensitive information. Finally, here is the best part:

The California Consumer Privacy Act allows consumers to request that a business delete their personal information. While you could delete the customer’s record/data in the database, deleting the same data from historical backups is very challenging. With Field-Level encryption, when a consumer’s encryption key is deleted, all the encrypted data you ever stored for that consumer becomes permanently unreadable.

How automatic encryption works in MongoDB Client-Side Field-Level Encryption

The above diagram illustrates:

  • MongoDB Data Store, where sensitive information is stored in an encrypted format.
  • MongoDB Key Vault, which stores the data encryption keys used to encrypt and decrypt document fields. Typically this is a separate MongoDB replica set/cluster isolated from your encrypted data store.
  • libmongocrypt, a cryptography library that creates data encryption keys and encrypts/decrypts data on behalf of the supported client-side drivers.
  • mongocryptd, an enterprise feature that supports automatic Field-Level encryption.
  • For write operations, the MongoDB v4.2+ client drivers use a data encryption key from the key vault to encrypt the plain text and transmit the encrypted text to the server.
  • Similarly, for read operations, the MongoDB v4.2+ client drivers receive the encrypted text from the server and automatically decrypt it using the respective data encryption key from the key vault.

Differences between Community Edition vs Enterprise Edition

Contrary to popular belief, client-side Field-Level encryption is available in both the MongoDB Community and Enterprise editions. The major difference between them lies in the available encryption methods.

MongoDB client-side Field-Level encryption only supports encrypting one field at a time. If you have multiple fields, such as ssn and Mobile, then you need multiple calls to the encryption methods. You can, however, send the encrypted values for both fields to the server in a single insert/update operation. Since the Community Edition only supports explicit/manual encryption, each field needs to be encrypted manually, and it's the application's responsibility (imagine multiple teams/application services) to ship only the encrypted text.

With the Enterprise version, the driver invokes the mongocryptd process, which uses the schema definition to perform automatic Field-Level encryption on the above two fields. The mongocryptd process parses the requested write operations and ensures that sensitive information in clear text is always encrypted before it is sent to the server.

Hands-On lab exercises

Now that you have some background on Client-Side Field-Level Encryption, let's set up a lab environment. This lab exercise helps you understand how to use Client-Side Field-Level Encryption on MongoDB Enterprise, with the MongoDB shell as the client, on CentOS 7.5. At the time of this writing, Client-Side Field-Level Encryption supports only the following KMS providers:

  • Amazon Web Services KMS
  • Locally Managed Keyfile

To keep the lab simple, I will use a standalone MongoDB with the locally managed keyfile as the KMS provider.

Running MongoDB Enterprise Server v4.2

The below script lets you download and install the latest version of MongoDB Enterprise server v4.2+. If you already have the binaries available, you may skip this step.

A bash script to download and install MongoDB v4.2.2 enterprise on my Mac

Prepare the client objects to make use of local key encryption

You must have a 96-byte key to use the locally managed keyfile, so let’s create one from random data using openssl. This key will be used to create/fetch your data encryption keys. Please save it in a secure location and don’t lose it; otherwise you will not be able to decrypt your data later.
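The lab itself creates the key with openssl from bash. As a rough equivalent, here is a minimal Node.js sketch that generates and validates a 96-byte key using only the built-in crypto module. The helper names generateLocalMasterKey and decodeLocalMasterKey are hypothetical, chosen for illustration.

```javascript
const crypto = require('crypto');

// Generate 96 random bytes and base64-encode them so the key can be kept
// in an environment variable or a secrets store.
function generateLocalMasterKey() {
  return crypto.randomBytes(96).toString('base64');
}

// Decode the base64 text back into raw bytes and check the length before
// handing the key to a driver as the local KMS key.
function decodeLocalMasterKey(base64Key) {
  const key = Buffer.from(base64Key, 'base64');
  if (key.length !== 96) {
    throw new Error(`Expected a 96-byte key, got ${key.length} bytes`);
  }
  return key;
}

const LOCAL_KEY = generateLocalMasterKey();
console.log(LOCAL_KEY.length); // 128 (96 bytes encode to 128 base64 characters)
console.log(decodeLocalMasterKey(LOCAL_KEY).length); // 96
```

Validating the length up front surfaces a truncated or mis-copied key immediately, rather than later when the driver tries to use it.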

A bash script to create LOCAL_KEY and start the MongoDB shell

Note that in the mongo shell command above you are not connected to any server yet. In the scripts below, I create two MongoDB client objects: one with client-side Field-Level encryption options and the other without.
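To make the "two client objects" idea concrete, here is a minimal sketch of the two option sets. It assumes the Node.js driver's option shape (an autoEncryption block holding keyVaultNamespace and kmsProviders) rather than the mongo shell syntax used in the lab. The namespace encryption.__dataKeys, the helper name buildClientConfigs, and the dummy key are all assumptions for illustration.

```javascript
// Hypothetical key vault namespace in "database.collection" form.
const KEY_VAULT_NAMESPACE = 'encryption.__dataKeys';

// Build the options for an encryption-aware client and for a plain client.
function buildClientConfigs(localMasterKey) {
  if (!Buffer.isBuffer(localMasterKey) || localMasterKey.length !== 96) {
    throw new Error('The local master key must be a 96-byte Buffer');
  }
  const csfleOptions = {
    autoEncryption: {
      keyVaultNamespace: KEY_VAULT_NAMESPACE,
      // "local" selects the locally managed key as the KMS provider.
      kmsProviders: { local: { key: localMasterKey } },
    },
  };
  // The plain client carries no encryption options at all.
  const plainOptions = {};
  return { csfleOptions, plainOptions };
}

// Dummy all-ones key for demonstration only; never use a fixed key in practice.
const { csfleOptions, plainOptions } = buildClientConfigs(Buffer.alloc(96, 1));

// With the driver installed, these would be used roughly as:
//   new MongoClient(uri, csfleOptions)  // encrypts and decrypts fields
//   new MongoClient(uri, plainOptions)  // sees only the encrypted text
console.log(Object.keys(csfleOptions.autoEncryption.kmsProviders)); // [ 'local' ]
```

Keeping both option sets side by side makes it easy to compare what each client sees when querying the same collection.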

A JavaScript to make use of the client-side field-level encryptions, create the MongoDB client objects, and create data encryption keys for SSN and Mobile fields

Manually encrypt data for CRUD operations

For the sake of simplicity, let’s get started with explicit/manual encryption. The encryption methods make use of data encryption keys and an encryption algorithm (Deterministic or Random). With the Deterministic algorithm, encrypting the same plain text always produces the same encrypted text. This is especially useful when performing find operations on encrypted fields such as ssn. With the Random algorithm, encrypting the same plain text produces a different encrypted text each time, so you should never use the Random algorithm on encrypted fields that need find/filter on the server side.

When you use the Deterministic algorithm on low-cardinality fields, such as boolean fields, the encrypted data is prone to interpretation and guesstimation of the actual values based on frequency histogram analysis. So make use of the Random encryption algorithm for such fields.
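The difference between the two modes can be sketched with Node's built-in crypto module. This is a simplified illustration, not MongoDB's actual algorithm: it uses AES-256-CBC and models "deterministic" by deriving the IV from an HMAC of the plain text, it omits the authentication tag a real scheme needs, and the keys and function names are hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical demo keys; in CSFLE these would come from a data encryption key.
const encKey = crypto.randomBytes(32);
const ivKey = crypto.randomBytes(32);

function encrypt(plaintext, algorithm) {
  // Deterministic: the IV depends only on the plain text, so equal inputs
  // give equal outputs. Random: a fresh IV is drawn for every call.
  const iv = algorithm === 'Deterministic'
    ? crypto.createHmac('sha256', ivKey).update(plaintext, 'utf8').digest().subarray(0, 16)
    : crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-256-cbc', encKey, iv);
  const body = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, body]).toString('base64');
}

function decrypt(ciphertext) {
  const raw = Buffer.from(ciphertext, 'base64');
  const decipher = crypto.createDecipheriv('aes-256-cbc', encKey, raw.subarray(0, 16));
  return Buffer.concat([decipher.update(raw.subarray(16)), decipher.final()]).toString('utf8');
}

const ssn = '123-45-6789';
// Equal ciphertexts make server-side equality matches possible.
console.log(encrypt(ssn, 'Deterministic') === encrypt(ssn, 'Deterministic')); // true
// Different ciphertexts each time, so an equality match cannot work.
console.log(encrypt(ssn, 'Random') === encrypt(ssn, 'Random')); // false
console.log(decrypt(encrypt(ssn, 'Random'))); // 123-45-6789

// Low-cardinality field: only two distinct ciphertexts ever appear, so
// their frequencies reveal the distribution of the underlying values.
const flags = ['true', 'false', 'true', 'true'].map((v) => encrypt(v, 'Deterministic'));
console.log(new Set(flags).size); // 2
```

The last lines show why deterministic encryption is a poor fit for boolean-like fields: an observer who counts ciphertexts learns which value is more common without decrypting anything.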

The code below illustrates that the data on the server is stored in encrypted format when queried with plainClient, and that this text is automatically decrypted when you query via csfleClient.

A JavaScript to illustrate insert/find operations while using MongoDB client-side field level encryption feature.

Automatic Field-Level encryption with MongoDB Enterprise

Explicit/manual encryption is the only client-side Field-Level encryption option available in the Community Edition. It requires the applications to use the same algorithm across various client applications to ensure consistency, and it’s each client’s responsibility to ensure that find, update, and insert operations are always sent with the encrypted text. However, with mongocryptd, a MongoDB Enterprise only feature, the same encryption algorithm is guaranteed to be used across all the client applications. Most importantly, the mongocryptd process parses the operation, and the fields marked for encryption in the schema are automatically encrypted before the query is sent to the server.

A JavaScript to illustrate insert/find operations while using MongoDB client-side field level encryption with automatic encryption feature.

Clients may still be able to save unencrypted data

Automatic Field-Level encryption with mongocryptd is a very cool feature, as queries originating from a client object with client-side Field-Level encryption options automatically handle the encryption/decryption for you. However, if a client inserts plain text from a MongoDB client object that is not using the encryption options, the insert operation will still go through.

The code snippet below illustrates the possibility of such scenarios.

A JavaScript to show that clients not using client-side field level encryption may still be able to accidentally insert plain text data.

Enforcing clients to save encrypted data

A recommended way to guard against such accidental mistakes is to define JSON schema validation on the collection. When JSON schema validation is defined, the MongoDB server checks that the respective fields contain encrypted text and either accepts the write or rejects it.
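As a rough sketch of what such a validator could look like, the helper below builds a $jsonSchema document that marks fields with the encrypt keyword. The helper name buildEncryptedFieldValidator, the patients collection, and the placeholder key ids are hypothetical; real key ids are the UUID values of your data encryption keys.

```javascript
// Build a collection validator that requires the listed fields to be encrypted.
// Each entry is { name, keyId, algorithm }; keyId is passed through untouched.
function buildEncryptedFieldValidator(fields) {
  const properties = {};
  for (const { name, keyId, algorithm } of fields) {
    properties[name] = {
      encrypt: { bsonType: 'string', algorithm, keyId: [keyId] },
    };
  }
  return { $jsonSchema: { bsonType: 'object', properties } };
}

// Placeholder strings stand in for the UUIDs of real data encryption keys.
const validator = buildEncryptedFieldValidator([
  { name: 'ssn', keyId: '<ssn-key-uuid>', algorithm: 'AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic' },
  { name: 'mobile', keyId: '<mobile-key-uuid>', algorithm: 'AEAD_AES_256_CBC_HMAC_SHA_512-Random' },
]);

// The validator would then be attached to the collection, for example with:
//   db.runCommand({ collMod: 'patients', validator })
console.log(Object.keys(validator.$jsonSchema.properties)); // [ 'ssn', 'mobile' ]
```

Here ssn uses the Deterministic algorithm because it needs to be searchable, while mobile uses Random; adjust the choice per field to match your query needs.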

A JavaScript to show how JSONSchema can be used to enforce clients to insert encrypted data rather than plain text data.

How Field-Level encryption can help you build CCPA-compliant applications

As mentioned earlier, the California Consumer Privacy Act allows consumers to request that a business delete their personal information. In such scenarios, deleting the user-specific data encryption key makes all of that user's encrypted data in the databases and historical backup snapshots permanently unreadable.

The code snippet below illustrates creating a unique data encryption key for every patient object. Once encryption/decryption has been tested for a patient’s data (_id: 9), the data encryption keys for both fields are deleted. Although the data for the patient with _id: 9 still exists, the encrypted fields can no longer be decrypted.
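The key-deletion idea (sometimes called crypto-shredding) can be sketched without a database. In this minimal Node.js illustration, an in-memory Map stands in for the key vault and AES-256-GCM stands in for the driver's encryption; neither is what MongoDB uses internally, and all names are hypothetical.

```javascript
const crypto = require('crypto');

// In-memory stand-in for the key vault: patient id -> 32-byte key.
const keyVault = new Map();

function createDataKey(patientId) {
  keyVault.set(patientId, crypto.randomBytes(32));
}

function encryptFor(patientId, plaintext) {
  const key = keyVault.get(patientId);
  if (!key) throw new Error(`No data key for patient ${patientId}`);
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const body = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return { iv, body, tag: cipher.getAuthTag() };
}

// Returns the plain text, or null when the patient's key no longer exists.
function decryptFor(patientId, { iv, body, tag }) {
  const key = keyVault.get(patientId);
  if (!key) return null;
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]).toString('utf8');
}

createDataKey(9);
const liveRecord = { _id: 9, ssn: encryptFor(9, '123-45-6789') };
// A copy of the record stands in for a historical backup snapshot.
const backupRecord = { ...liveRecord };

console.log(decryptFor(9, liveRecord.ssn)); // 123-45-6789

// Deleting the key honors the deletion request for every copy at once.
keyVault.delete(9);
console.log(decryptFor(9, liveRecord.ssn)); // null
console.log(decryptFor(9, backupRecord.ssn)); // null
```

The backup copy is never touched, yet it becomes unreadable along with the live record, which is the property that makes per-user keys attractive for deletion requests.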

A JavaScript to show MongoDB client-side field-level encryption can help you implement solutions for California Consumer Privacy Act.

Summary

MongoDB offers various means to harden security at different levels. The Field-Level encryption takes the security hardening to the next level, where a database user can only decrypt the data if they have access to the data encryption keys. When used correctly, you could build solutions to meet GDPR, HIPAA, CCPA, and other regulatory compliance standards. Hopefully, this article shed some light on “How to beef up your MongoDB security with Field-Level Encryption”, and you learned something new today as you scale the path to “Mastering MongoDB — One tip a day”.

Credits: Thanks to Jay Pearson from MongoDB for reviewing the article and sharing his thoughts.
