MongoDB Storage Engine Journaling
I came across a question the other day about journaling in MongoDB: specifically, how it is handled in the different supported storage engines and whether it is necessary to use. There was an interesting discussion on the topic, so I thought I would share some thoughts and explanations.
To start with, some questions arise. What exactly is a MongoDB journal? Why is journaling important? For the sake of this post, I’m going to focus on 64-bit builds of mongod, based on version 3.4 of the database.
What is Journaling?
Much like one uses a journal to record thoughts and daily events, MongoDB uses a journal to ensure data integrity. This is accomplished through writing data first to the journal files and then to the core data files. In the event of an untimely server shutdown, the data can be restored to a consistent state.
This is how MongoDB backs its write operation durability guarantee. If your mongod process stops in an unexpected manner, data from the journal will be used to re-apply the write operations when it is restarted. When journaling is enabled, MongoDB creates a subdirectory for the journal data called journal. This resides under the dbPath directory and contains the write-ahead logs.
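If you want to peek at what is in that directory, a minimal Python sketch might look like the following. The function name journal_files is my own invention for illustration, and the path you pass in is whatever your dbPath happens to be. It only lists file names and does not try to interpret them.

```python
from pathlib import Path


def journal_files(db_path):
    """Return the sorted file names found in <db_path>/journal.

    Returns an empty list when the journal subdirectory does not exist,
    for example because journaling is disabled.
    """
    journal_dir = Path(db_path) / "journal"
    if not journal_dir.is_dir():
        return []
    return sorted(p.name for p in journal_dir.iterdir() if p.is_file())
```

Calling journal_files with your own dbPath returns the journal file names, or an empty list if there is no journal subdirectory.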
Since each storage engine in MongoDB implements crash resiliency and data persistence slightly differently, let’s see how journaling is utilized in each.
Storage Engine Implementations
There are three storage engines that are predominantly used with MongoDB: MMAPv1, WiredTiger, and In-Memory. They each have their own strengths and weaknesses. Those differences are beyond the scope of this post, but I would like to look at how journaling is implemented in each.
Starting in version 3.2 of MongoDB, MMAPv1 is no longer the default storage engine. However, it is still in use and in certain circumstances is a better option. Therefore, it is still good to understand how journaling works with this storage engine in its default configuration.
In a nutshell, when a write command is issued, the operation is applied to an internal private view, then written to the journal. Once the data has been updated in the journal the changes are applied to an internal shared view and then written to disk.
In MMAPv1, the journal is updated every 100 milliseconds in batches called group commits. The data files themselves, though, are written only every 60 seconds, when the shared view is flushed to disk. Depending on the quantity and availability of system memory, the flushing of data may occur more often.
Where, then, does the importance of the journal come in? Well, in the case of an unexpected shutdown of the mongod process, the journal can be used to restore the data. Without journaling on a standalone server, recovery requires a lengthier and more involved repair process.
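To make the MMAPv1 sequence a bit more concrete, here is a toy Python model of it. This is only a sketch of the idea, not MongoDB's actual implementation. The class ToyMMAPv1 and all of its method names are hypothetical, the timers are replaced by explicit method calls, and clearing the journal on every flush is a simplification.

```python
class ToyMMAPv1:
    """Toy model of the MMAPv1 write path. Not MongoDB's real code."""

    def __init__(self):
        self.private_view = {}  # in memory: receives writes first
        self.pending = []       # in memory: writes awaiting a group commit
        self.shared_view = {}   # in memory: flushed to data files later
        self.journal = []       # on disk: survives a crash
        self.data_files = {}    # on disk: survives a crash

    def write(self, key, value):
        # A write lands in the private view and waits for a group commit.
        self.private_view[key] = value
        self.pending.append((key, value))

    def group_commit(self):
        # Stands in for the 100 ms group commit: journal the batch first,
        # then apply it to the shared view.
        self.journal.extend(self.pending)
        for key, value in self.pending:
            self.shared_view[key] = value
        self.pending.clear()

    def flush(self):
        # Stands in for the 60 s flush of the shared view to the data files.
        self.data_files.update(self.shared_view)
        self.journal.clear()  # simplification: journaled writes are now on disk

    def crash_and_recover(self):
        # Everything in memory is lost; replay the journal over the data files.
        for key, value in self.journal:
            self.data_files[key] = value
        self.journal.clear()
        self.pending = []
        self.private_view = dict(self.data_files)
        self.shared_view = dict(self.data_files)
        return dict(self.data_files)


db = ToyMMAPv1()
db.write("a", 1)
db.group_commit()
db.flush()          # "a" is now in the data files
db.write("b", 2)
db.group_commit()   # "b" is only in the journal
db.write("c", 3)    # "c" has not been journaled yet
print(db.crash_and_recover())  # {'a': 1, 'b': 2}
```

The write to "b" survives the crash because it reached the journal, while "c" is lost because it was still waiting for the next group commit.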
On systems using a properly configured replica set, recovering without a journal may be simpler than running the repair process, since the data can be resynced from another member. It is still not as clean as with journaling enabled, however.
The WiredTiger storage engine takes a different approach to write operation durability. WiredTiger uses checkpoints in conjunction with a journal. These checkpoints allow data to be recovered up to the last checkpoint.
When a write operation starts, a snapshot is taken of the data. When data is written to disk (every 60 seconds by default), the snapshot is written consistently across all data files and becomes durable. This durable data becomes a new checkpoint and can be used as a recovery point.
This allows WiredTiger to recover from the last checkpoint without a journal. Pretty slick. However, if an unexpected shutdown occurs between checkpoints and journaling is disabled, any data written since the last checkpoint will be lost. The journal in WiredTiger, therefore, is a write-ahead log, similar to the one in MMAPv1, that provides data durability between checkpoints.
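The checkpoint-plus-journal idea can be sketched as a toy Python model as well. Again, this is only an illustration under my own assumptions. ToyWiredTiger and its methods are hypothetical names, checkpoints are triggered by hand instead of by a timer, and every journal record is treated as immediately durable, which glosses over how the real engine buffers journal writes.

```python
class ToyWiredTiger:
    """Toy model of checkpoints plus a journal. Not WiredTiger's real code."""

    def __init__(self, journaling=True):
        self.journaling = journaling
        self.cache = {}            # in memory: lost on a crash
        self.checkpoint_data = {}  # on disk: state as of the last checkpoint
        self.journal = []          # on disk: writes since the last checkpoint

    def write(self, key, value):
        # Write-ahead: record the change in the journal before applying it.
        if self.journaling:
            self.journal.append((key, value))
        self.cache[key] = value

    def checkpoint(self):
        # Stands in for the periodic checkpoint: persist a consistent copy
        # of the data, after which the older journal records are not needed.
        self.checkpoint_data = dict(self.cache)
        self.journal.clear()

    def crash_and_recover(self):
        # Start from the last checkpoint, then replay any journaled writes.
        recovered = dict(self.checkpoint_data)
        for key, value in self.journal:
            recovered[key] = value
        self.cache = recovered
        return dict(recovered)


for journaling in (True, False):
    db = ToyWiredTiger(journaling=journaling)
    db.write("a", 1)
    db.checkpoint()
    db.write("b", 2)  # written after the last checkpoint
    print(journaling, db.crash_and_recover())
```

With journaling on, both "a" and "b" come back after the crash. With it off, the model recovers cleanly to the checkpoint but "b" is gone.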
So journaling and replica sets are still important pieces of a server environment when using WiredTiger. Journaling is just implemented in a slightly different way than in MMAPv1.
For those who are running an Enterprise version of MongoDB, there is a storage engine that stores data in memory. Because the data is stored in memory, it is non-persistent. The concept of a journal does not apply in this situation.
I have seen questions similar to “Why is the journal unnecessary for WiredTiger” listed in various study guides. As we have learned, it is indeed not required for data consistency. At least not in the same fashion as it is for MMAPv1. That being said, I might argue that “unnecessary” is a bit of a misleading word. WiredTiger’s data consistency model is just different from MMAPv1’s. Journaling may not be “necessary” perhaps, but I wouldn’t run a system without it.
All of these details of journaling can be a lot to think about and potentially manage. This is one of the great advantages of MongoDB Atlas, in that these internal matters are handled for you. If you are running and/or managing a MongoDB server, it is a best practice to leave journaling on for data integrity. Further, it is recommended to have your system use a replica set at a minimum, as data recovery is often simplified even more.
There are several MongoDB-specific terms in this post. I created a MongoDB Dictionary skill for the Amazon Echo line of products. Check it out and you can say “Alexa, ask MongoDB for the definition of a journal” and get a helpful response.
Follow me on Twitter @kenwalger to get the latest updates on my postings.
Originally published at Blog of Ken W. Alger.