MongoDB Journaling – Potential Update Loss in WiredTiger

mongodb

This is what MongoDB says about journaling:

To provide durability in the event of a failure, MongoDB uses write
ahead logging to on-disk journal files.

But with WiredTiger:

Important

In between write operations, while the journal records remain in the
WiredTiger buffers, updates can be lost following a hard shutdown of
mongod.

If journaling in WiredTiger can't fully guarantee that no updates will be lost, we have to write our application to be aware that some documents may just "disappear". So why enable journaling at all, considering that it slows down writes?

It seems that if you can't afford to lose updates (*), you have to choose MMAPv1 with journaling, and if you can, you may prefer WiredTiger without journaling.

(*) You may still lose updates in a MongoDB cluster due to rollbacks, although you can avoid rollbacks with the w: majority write concern or manually handle the files in the rollback/ folder.

Best Answer

It doesn't matter whether it's WiredTiger or MMAPv1; they are just storage engines. Whether you lose updates or not depends on the write concern.

Your app communicates with the MongoDB server, which writes mostly to memory: a cache of pages that is periodically written to and read from the persistent disk. There is also a journal that logs every write.

When you write to the database, the write goes to both the pages and the journal at the same time. Pages are written to disk depending on memory pressure. By default you don't wait for an acknowledgement from the journal, because the journal may not be written to disk for a while; this is represented by j=false. And by default w=1, which means the server acknowledges only the write to the cached pages.
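As a rough illustration of that write path, here is a toy model in Python. It is not MongoDB code: ToyServer and its methods are made-up names, and a real server flushes the journal on a timer rather than only on demand. The sketch only shows why a write acknowledged with j=false can vanish after a hard shutdown.

```python
class ToyServer:
    """Toy model: an in-memory page cache plus a journal buffer that
    only becomes durable once it is flushed to disk."""

    def __init__(self):
        self.cache = {}           # in-memory pages (lost on a hard shutdown)
        self.journal_buffer = []  # journal records not yet on disk
        self.journal_disk = []    # journal records already written to disk

    def flush_journal(self):
        # Stand-in for the journal being written to disk.
        self.journal_disk.extend(self.journal_buffer)
        self.journal_buffer.clear()

    def write(self, key, value, j=False):
        # Every write lands in the cache and in the journal buffer.
        self.cache[key] = value
        self.journal_buffer.append((key, value))
        if j:
            # j=true: do not acknowledge until the journal is on disk.
            self.flush_journal()
        return {"ok": 1}  # acknowledgement returned to the client

    def hard_shutdown_and_recover(self):
        # A crash discards everything in memory; recovery replays
        # only the journal records that reached the disk.
        self.cache = {}
        self.journal_buffer = []
        for key, value in self.journal_disk:
            self.cache[key] = value
        return self.cache


server = ToyServer()
server.write("a", 1, j=True)   # acknowledged after the journal flush
server.write("b", 2, j=False)  # acknowledged while still only in memory
recovered = server.hard_shutdown_and_recover()
print(recovered)  # {'a': 1}
```

Both writes returned an acknowledgement, but only the one issued with j=True survives the simulated crash.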

Default: w = 1, j = false - this is fast, but if something happens before the journal has been written to disk, the data is lost. What's worse, if you rely on w = 1, the write is reported as successful even though it has not been persisted.

w = 1, j = true - in this case you wait for the journal to acknowledge that it has been written to disk. This is a lot slower, but ensures the data is persisted.
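One way to request either behaviour is through the w and journal options of the MongoDB connection string. The snippet below is only a sketch that builds such a URI with the Python standard library; build_uri and the localhost:27017 host are assumptions for illustration, and nothing here connects to a server.

```python
from urllib.parse import urlencode


def build_uri(host, w=1, journal=False):
    # `w` and `journal` are connection-string options for the write concern;
    # `journal=true` corresponds to j = true in the discussion above.
    options = urlencode({"w": w, "journal": str(journal).lower()})
    return f"mongodb://{host}/?{options}"


fast_uri = build_uri("localhost:27017", w=1, journal=False)
durable_uri = build_uri("localhost:27017", w=1, journal=True)
print(fast_uri)     # mongodb://localhost:27017/?w=1&journal=false
print(durable_uri)  # mongodb://localhost:27017/?w=1&journal=true
```

Drivers generally also let you set the write concern per collection or per operation, which may be preferable if only some writes need the slower, journaled acknowledgement.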
