Beef up your MongoDB security with Client-Side Field-Level Encryption
Federal government compliance standards require organizations to identify PII (personally identifiable information) and PHI (protected health information) and to handle both securely. Companies in the healthcare, health insurance, and finance industries may have to comply with regulations such as the Health Insurance Portability and Accountability Act (HIPAA), the General Data Protection Regulation (GDPR), and the California Consumer Privacy Act (CCPA). If you work for one of these companies, MongoDB Client-Side Field-Level Encryption is a godsend for you. This article helps you understand:
- Why do you need Field-Level Encryption?
- What is Field-Level Encryption and how does it work?
- Differences between Community Edition vs Enterprise Edition
- Hands-on lab to implement Field-Level Encryption
This is one of many articles in the multi-part series Mastering MongoDB - One tip a day, created solely to help you master MongoDB by learning ‘one tip a day’. Over a few articles in this series, I would like to share tips for tightening the security of MongoDB. In this article, I discuss how to beef up your MongoDB security with Client-Side Field-Level Encryption.
Mastering — MongoDB Client-Side Field-Level Encryption
Why do you need Field-Level Encryption?
MongoDB offers various levers to harden the security of your database. The Security Checklist documentation and the MongoDB Security Architecture whitepaper provide in-depth guidelines on building the key security capabilities needed for HIPAA or PCI-DSS compliance. These security hardening recommendations include:
- Enable Authentication to enforce every connection to provide valid credentials for connecting to the database.
- Use Role-Based Access Control to implement the principle of least privilege. Provision each account with only the roles/privileges that are essential to perform its intended function.
- Encrypt communication between your application and MongoDB using TLS/SSL
- Encrypted Volumes to help restrict cloud service providers’ staff from accessing the files on your system.
- Encryption at Rest to prevent system administrators with access to the file system from deciphering any secure data.
- Auditing to perform forensic analysis and allow administrators to track access and changes to database configurations and data.
- Harden Network configuration to allow access to MongoDB only from trusted hosts.
- To prevent sensitive information from being exposed, use Log Redaction on the log files. Also, use Field-Level Redaction to restrict what each user can read, based on application-level access control.
While these hardening recommendations really tighten the security of your data, what if someone (for example, your DBA) has credentials with read privileges on a collection containing sensitive information? Nothing stops them from reading that information, as it is stored in cleartext. Had this information been transmitted to and stored in the database as encrypted text rather than plain text, decrypting it would depend on access to the key used to encrypt it (hopefully it’s not saved in the very next field :p).
Typically, application developers use the cryptographic libraries available in Java, .NET, Node.js, etc. to encrypt and decrypt the text. What if all the heavy lifting of encrypting/decrypting were done for you automatically and consistently, irrespective of the programming language or client driver you use? This is exactly what MongoDB Client-Side Field-Level Encryption does for you.
What is Field-Level Encryption and how does it work?
Starting with MongoDB v4.2, MongoDB-compatible drivers support client-side Field-Level encryption. Client-side applications can encrypt the fields in a document using a data encryption key before transmitting the data over the wire to the server. This way, network eavesdroppers, the MongoDB server, and users with data access only see the encrypted text. Only applications with access to the encryption key can decrypt the sensitive information. Finally, here is the best part:
The California Consumer Privacy Act allows consumers to request a business to delete their personal information. While you could delete the customer’s record/data in the database, deleting the same data from historical backups is very challenging. With Field-Level encryption, once a consumer’s encryption key is deleted, all the encrypted data you ever stored for that consumer becomes permanently unreadable.
The above diagram illustrates
- MongoDB Data Store where sensitive information is stored in an encrypted format.
- MongoDB Key Vault to store data encryption keys to encrypt and decrypt document fields. Typically this is a separate MongoDB replica set/cluster isolated from your encrypted data store.
- libmongocrypt, a cryptography library that creates data encryption keys and encrypts/decrypts data for the supported client-side drivers.
- mongocryptd, an enterprise feature that supports automatic Field-Level encryption.
- For write operations, the MongoDB v4.2+ client drivers use a data encryption key from the key vault to encrypt the plain text and transmit the encrypted text to the server.
- Similarly, for read operations, the MongoDB v4.2+ client drivers receive the encrypted text from the server and automatically decrypt it using the respective data encryption key from the key vault.
Differences between Community Edition vs Enterprise Edition
Contrary to popular belief, client-side Field-Level encryption is available in both the MongoDB Community and Enterprise editions. The major difference between them lies in the encryption methods available.
MongoDB client-side Field-Level encryption encrypts only one field at a time. If you have multiple fields such as ssn and Mobile, you need multiple calls to the encryption methods. You can still send the encrypted values for both fields to the server in a single insert/update operation. Since the Community Edition only supports explicit/manual encryption, each field needs to be encrypted manually, and it's the application's responsibility (imagine multiple teams/application services) to ship only the encrypted text.
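The "one call per field, one write per document" pattern can be sketched in plain Node.js. This is only a toy illustration: it uses Node's built-in `crypto` module with AES-256-GCM rather than the MongoDB driver or its encryption algorithms, and `dataKey`, `encryptField`, `decryptField`, and the sample values are hypothetical names introduced for this sketch.

```javascript
const crypto = require("node:crypto");

// Hypothetical stand-in for a data encryption key; real keys live in the key vault.
const dataKey = crypto.randomBytes(32);

// Encrypt a single value, mirroring how explicit encryption is applied per field.
function encryptField(value) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", dataKey, iv);
  const body = Buffer.concat([cipher.update(String(value), "utf8"), cipher.final()]);
  // Layout: 12-byte IV, 16-byte auth tag, then the ciphertext.
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64");
}

// Reverse of encryptField, for a client that holds the key.
function decryptField(encoded) {
  const raw = Buffer.from(encoded, "base64");
  const decipher = crypto.createDecipheriv("aes-256-gcm", dataKey, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString("utf8");
}

// Two separate encryption calls, but a single document for one insert operation.
const patient = {
  name: "Jane Doe",
  ssn: encryptField("123-45-6789"),
  mobile: encryptField("555-0100"),
};

console.log(patient.ssn !== "123-45-6789"); // true: only ciphertext would be shipped
console.log(decryptField(patient.mobile)); // prints 555-0100
```

Note that nothing in this sketch stops a caller from forgetting `encryptField` on one of the fields, which is exactly the risk of manual encryption described above.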
With the Enterprise version, the driver invokes the mongocryptd process, which uses the schema definition to perform automatic Field-Level encryption on the above two fields. mongocryptd parses the requested write operations and ensures that sensitive information in clear text is always encrypted before it is sent to the server.
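To make the idea of schema-driven encryption concrete, here is a deliberately simplified sketch in plain Node.js. It is not how mongocryptd is implemented: the "schema" is just a hypothetical list of field names, the cipher is AES-256-GCM from Node's `crypto` module, and `autoEncrypt`, `encryptValue`, and `encryptedFields` are names invented for this example.

```javascript
const crypto = require("node:crypto");

// Hypothetical data encryption key and a toy per-value encryption helper.
const dataKey = crypto.randomBytes(32);

function encryptValue(value) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", dataKey, iv);
  const body = Buffer.concat([cipher.update(String(value), "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64");
}

// Hypothetical, simplified schema: the names of fields that must be encrypted.
const encryptedFields = ["ssn", "mobile"];

// Return a copy of the document with every schema-listed field encrypted.
// The caller never calls an encryption method directly.
function autoEncrypt(doc, fields = encryptedFields) {
  const out = { ...doc };
  for (const field of fields) {
    if (field in out) out[field] = encryptValue(out[field]);
  }
  return out;
}

const original = { name: "Jane Doe", ssn: "123-45-6789", mobile: "555-0100" };
const outgoing = autoEncrypt(original);

console.log(outgoing.name); // prints Jane Doe (not listed in the schema, left as-is)
console.log(outgoing.ssn !== original.ssn); // true
```

The design point is that the list of sensitive fields lives in one place, so individual teams cannot accidentally skip a field.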
Hands-On lab exercises
Now that you have some background on Client-Side Field-Level Encryption, let's set up a lab environment. This lab exercise helps you understand how to use Client-Side Field-Level Encryption on MongoDB Enterprise, with the MongoDB shell as the client, on CentOS 7.5. At the time of this writing, Client-Side Field-Level Encryption only supports the following KMS providers:
- Amazon Web Services KMS
- Locally Managed Keyfile
To keep the lab simple, I will use a MongoDB standalone with the locally managed keyfile as the KMS provider.
Running MongoDB Enterprise Server v4.2
The below script lets you download and install the latest version of MongoDB Enterprise server v4.2+. If you already have the binaries available, you may skip this step.
Prepare the client objects to make use of local key encryption
You need a keyfile containing a 96-byte string to use the locally managed keyfile option. So let’s create a keyfile with random text using openssl. This key will be used to create/fetch your data encryption keys. Please save this key in a secure location and don’t lose it; otherwise, you will not be able to decrypt your data later.
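The article uses openssl for this step. As an alternative sketch, a key of the same size can be generated with Node's built-in `crypto` module. The function name `generateLocalMasterKey` is hypothetical, and base64 is assumed here only as a convenient text encoding for storing the key.

```javascript
const crypto = require("node:crypto");

// Generate 96 cryptographically random bytes and encode them as base64 text.
function generateLocalMasterKey() {
  return crypto.randomBytes(96).toString("base64");
}

const localKeyBase64 = generateLocalMasterKey();

// 96 bytes always encode to 128 base64 characters (no padding needed).
console.log(localKeyBase64.length); // prints 128
console.log(Buffer.from(localKeyBase64, "base64").length); // prints 96
```

Whatever tool you use, the output should be written to a file with restrictive permissions and backed up securely.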
Pay attention to the above mongo shell command: you are not connected to any server yet. In the scripts below, I create two MongoDB client objects, one with client-side Field-Level encryption options and the other without.
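As a rough sketch of what the encryption options for the local KMS provider look like, the helper below builds the options object from a base64-encoded key and checks the 96-byte requirement. The helper `buildEncryptionOptions` and the namespace `encryption.__keyVault` are assumptions made for illustration; the `keyVaultNamespace` and `kmsProviders.local.key` option names reflect my understanding of the driver options and should be checked against the documentation for your driver version.

```javascript
// Build client-side encryption options from a base64-encoded local master key.
// Hypothetical helper: the default namespace is an illustrative assumption.
function buildEncryptionOptions(localKeyBase64, keyVaultNamespace = "encryption.__keyVault") {
  const key = Buffer.from(localKeyBase64, "base64");
  if (key.length !== 96) {
    throw new Error(`local master key must be 96 bytes, got ${key.length}`);
  }
  return { keyVaultNamespace, kmsProviders: { local: { key } } };
}

// Example with a throwaway all-zero key (never use a predictable key in practice).
const demoKey = Buffer.alloc(96).toString("base64");
const encryptionOptions = buildEncryptionOptions(demoKey);

console.log(encryptionOptions.keyVaultNamespace); // prints encryption.__keyVault
console.log(encryptionOptions.kmsProviders.local.key.length); // prints 96
```

The client with encryption would be created with an object of this shape, while the plain client is created without it.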
Manually encrypt data for CRUD operations
For the sake of simplicity, let’s start with explicit/manual encryption. The encryption methods use data encryption keys and an encryption algorithm (Deterministic or Random). With the Deterministic algorithm, encrypting the same plain text always produces the same encrypted text. This makes the Deterministic algorithm especially useful for find operations on encrypted fields such as ssn. With the Random algorithm, encrypting the same plain text produces different encrypted text each time, so you should never use the Random algorithm on encrypted fields that need find/filter on the server side. When you use the Deterministic algorithm on low-cardinality fields such as boolean fields, the encrypted data is prone to interpretation and guesstimation of the actual values through frequency histogram analysis. So use the Random algorithm for such fields.
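The difference between the two modes, and the frequency-analysis risk on low-cardinality fields, can be demonstrated with a toy example. This sketch does not use MongoDB's actual algorithms: it uses AES-256-CBC from Node's `crypto` module, deriving the IV from the plaintext for the deterministic case and drawing a fresh IV for the random case. All keys, function names, and sample values are hypothetical.

```javascript
const crypto = require("node:crypto");

// Hypothetical keys standing in for a data encryption key (toy example only).
const encKey = crypto.randomBytes(32);
const ivKey = crypto.randomBytes(32); // used only to derive repeatable IVs

function encryptWithIv(plaintext, iv) {
  const cipher = crypto.createCipheriv("aes-256-cbc", encKey, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, body]).toString("base64");
}

// Deterministic: the IV is derived from the plaintext, so equal inputs give equal outputs.
function encryptDeterministic(plaintext) {
  const iv = crypto.createHmac("sha256", ivKey).update(plaintext).digest().subarray(0, 16);
  return encryptWithIv(plaintext, iv);
}

// Random: a fresh IV per call, so equal inputs give different outputs.
function encryptRandom(plaintext) {
  return encryptWithIv(plaintext, crypto.randomBytes(16));
}

// Equality lookups only work when the stored and queried ciphertexts match.
const storedSsn = encryptDeterministic("123-45-6789");
console.log(storedSsn === encryptDeterministic("123-45-6789")); // true
console.log(encryptRandom("123-45-6789") === encryptRandom("123-45-6789")); // false

// Low-cardinality data: deterministic ciphertexts reveal how values are distributed.
const flags = ["true", "true", "true", "false"];
const distinctDeterministic = new Set(flags.map((f) => encryptDeterministic(f))).size;
const distinctRandom = new Set(flags.map((f) => encryptRandom(f))).size;
console.log(distinctDeterministic, distinctRandom); // 2 4
```

With only two distinct ciphertexts for four rows, an observer can count occurrences and guess which ciphertext means "true", which is why the random mode is the safer choice for such fields.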
The code below illustrates that the data on the server is stored in encrypted format when queried with plainClient, and that this text is automatically decrypted when you query via csfleClient.
Automatic Field-Level encryption with MongoDB Enterprise
Explicit/manual encryption is the only client-side Field-Level encryption option available for the Community Edition. It requires every client application to use the same algorithm to ensure consistency, and it’s the clients’ responsibility to ensure that find, update and insert operations are always sent with the encrypted text. However, with mongocryptd, a MongoDB Enterprise only feature, the same encryption algorithm is assured across all client applications. Most importantly, the mongocryptd process parses the operation and automatically encrypts all the configured fields before the query is sent to the server.
Clients may still be able to save unencrypted data
The automatic Field-Level encryption from mongocryptd is a very cool feature, as queries originating from a client object with client-side Field-Level encryption options automatically handle the encryption/decryption for you. However, if a client inserts plain text from a MongoDB client object that’s not using encryption options, the insert operation will still go through.
The below code snippet illustrates the possibility of such scenarios.
Enforcing clients to save encrypted data
A recommended approach to preventing such accidental mistakes is to define JSON schema validation on the collection. When JSON schema validation is defined, the MongoDB server checks that the respective fields contain encrypted text and either accepts the write or rejects it.
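A validator of this kind is just a document, so its shape can be sketched without a server. The helper below builds a `$jsonSchema` validator that marks each listed field with the `encrypt` keyword. The helper name `buildEncryptedFieldValidator` and the field names are hypothetical, and the sub-keys of `encrypt` are kept to a minimum here; which ones (for example algorithm or key id) you need should be confirmed against the MongoDB documentation for your version.

```javascript
// Build a collection validator that expects the listed fields to hold encrypted values.
// Hypothetical helper: only the bsonType sub-key is set, as a minimal assumption.
function buildEncryptedFieldValidator(fieldNames) {
  const properties = {};
  for (const name of fieldNames) {
    properties[name] = { encrypt: { bsonType: "string" } };
  }
  return { $jsonSchema: { bsonType: "object", properties } };
}

const validator = buildEncryptedFieldValidator(["ssn", "mobile"]);
console.log(JSON.stringify(validator, null, 2));
```

A document like this would be supplied as the `validator` when creating the collection or when modifying it with `collMod`.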
How can Field-Level encryption help you build CCPA-compliant applications?
As mentioned earlier, the California Consumer Privacy Act allows consumers to request a business to delete their personal information. In such scenarios, deleting the user-specific data encryption key makes all the encrypted data in the databases/historical backup snapshots permanently unreadable for that user.
The code snippet below illustrates how a unique data encryption key is created for every patient object. Once encryption/decryption is tested for a patient’s data (_id: 9), the data encryption key for that patient's encrypted fields is deleted. Although the data for the patient with _id: 9 still exists, the encrypted fields can no longer be decrypted.
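The key-deletion idea can be sketched independently of MongoDB. In the toy example below, an in-memory `Map` plays the role of the key vault, each patient id gets its own key, and AES-256-GCM from Node's `crypto` module stands in for the real encryption algorithm. All names (`keyVault`, `encryptFor`, `decryptFor`) and values are hypothetical.

```javascript
const crypto = require("node:crypto");

// Toy key vault: one hypothetical data encryption key per patient id.
const keyVault = new Map();

function encryptFor(patientId, plaintext) {
  // Create the patient's key on first use.
  if (!keyVault.has(patientId)) keyVault.set(patientId, crypto.randomBytes(32));
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", keyVault.get(patientId), iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64");
}

function decryptFor(patientId, encoded) {
  const key = keyVault.get(patientId);
  if (!key) throw new Error(`no data encryption key for patient ${patientId}`);
  const raw = Buffer.from(encoded, "base64");
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString("utf8");
}

// The stored document (and any backup of it) only ever contains ciphertext.
const patient = { _id: 9, ssn: encryptFor(9, "123-45-6789") };
console.log(decryptFor(9, patient.ssn)); // prints 123-45-6789

// "Forget" the patient by deleting only the key; the document itself is untouched.
keyVault.delete(9);
```

After the key is deleted, any call to `decryptFor(9, patient.ssn)` throws, even though the ciphertext still exists in the document and in every backup that contains it.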
Summary
MongoDB offers various means to harden security at different levels. Field-Level encryption takes security hardening to the next level: a database user can only decrypt the data if they have access to the data encryption keys. When used correctly, it lets you build solutions that meet GDPR, HIPAA, CCPA, and other regulatory compliance standards. Hopefully, this article shed some light on “How to beef up your MongoDB security with Field-Level Encryption”, and you learned something new today as you scale the path to “Mastering MongoDB - One tip a day”.
Credits: Thanks to Jay Pearson from MongoDB for reviewing the article and sharing his thoughts.
Previously published articles
- Tip # 006: Configure MongoDB with Kerberos Authentication
  Enterprise authentication with Kerberos can’t get any easy!
- Tip # 005: Getting started with MongoDB Enterprise Operator
  Now you could deploy MongoDB in Kubernetes under a minute
- Tip # 004: Faster elections
  Measures to reduce the election time during the rolling maintenance
- Tip # 003: Transactions
  A long awaited and most requested feature for many, has finally arrived
- Tip # 002: createRole
  Ahhh …! Someone just dropped a collection