MongoDB is using most of the disk space for its journal and not cleaning it up

mongodb

I am running MongoDB in production with journaling enabled. I use it to store application data and also use GridFS to store images. Each of my MongoDB instances (3 in total, in a replica set) has a 10GB Elastic Block Store (EBS) volume.

When I checked the disk usage, I was surprised to see that the journal folder is using most of the space. The details are as follows.

bitnami@ip-172-31-25-9:~/stack/mongodb/data/db$ pwd
/home/bitnami/stack/mongodb/data/db
bitnami@ip-172-31-25-9:~/stack/mongodb/data/db$ du -h *
65M admin.0
16M admin.ns
65M bhs.0
129M    bhs.1
257M    bhs.2
513M    bhs.3
16M bhs.ns
3.1G    journal
64M local.0
1.1G    local.1
16M local.ns
4.0K    mongod.lock
4.0K    _tmp
bitnami@ip-172-31-25-9:~/stack/mongodb/data/db$ df -h journal
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      9.8G  7.0G  2.4G  76% /

However, when I ran mongodump on my data, the exported data was only 500MB, i.e. most of the space is being used by the journal. I understand that the journal records every write operation, which makes the files large. What surprises me is that it is not deleting old transactions.

Ideally, I believe the journal should be deleted once a write transaction has been flushed to disk. Is this journal folder size expected, or am I missing some configuration? Should I plan to increase the disk volume soon?

Please advise.

Best Answer

MongoDB (with the MMAP storage engine) allocates 3 journal files by default, at 1GiB each. That is where your journal-related space usage comes from, but it will not grow unless you have a very high insert rate.

You can start mongod with the smallfiles option to reduce the journal to 3 x 128MiB if you wish, but be aware that the size of your data files will also be reduced (to at most 512MiB each), so there will be many more of them and they will need to be allocated more often as you add data.
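
To put rough numbers on that trade-off, here is a minimal Python sketch of the journal arithmetic. It assumes the MMAP figures given above (3 journal files, 1GiB each by default or 128MiB each with smallfiles); the function name and constants are made up for illustration and are not part of MongoDB.

```python
# Illustrative arithmetic only; the sizes are the MMAP figures quoted above.
DEFAULT_JOURNAL_FILE_MIB = 1024   # 1GiB per journal file by default
SMALLFILES_JOURNAL_FILE_MIB = 128  # 128MiB per journal file with smallfiles


def journal_footprint_mib(smallfiles: bool, files: int = 3) -> int:
    """Approximate space taken by preallocated journal files, in MiB."""
    if smallfiles:
        per_file = SMALLFILES_JOURNAL_FILE_MIB
    else:
        per_file = DEFAULT_JOURNAL_FILE_MIB
    return files * per_file


print(journal_footprint_mib(smallfiles=False))  # 3072
print(journal_footprint_mib(smallfiles=True))   # 384
```

The default case works out to 3072MiB, which is in line with the 3.1G that du reports for the journal directory in the question.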

As for whether to increase your storage, that depends on how much data you intend to add to the database - it will need to allocate new data files to store any data you insert, so it really is dependent on your planned usage as to whether 10GiB is enough or not.
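
As a rough way to reason about headroom, the sketch below estimates which new data files would still fit in the remaining free space. It assumes that each new MMAP data file is twice the size of the previous one up to a cap (2GiB by default, 512MiB with smallfiles), which matches the bhs.0 to bhs.3 sizes in the du output; the function name and parameters are hypothetical, and the free-space figure is simply the 2.4G from df converted to MiB.

```python
def next_data_files_mib(last_size_mib: int, free_mib: int, cap_mib: int = 2048) -> list:
    """List the sizes (MiB) of further data files that fit in free_mib.

    Assumes each new file doubles the previous size, limited to cap_mib.
    """
    sizes = []
    size = last_size_mib
    while True:
        size = min(size * 2, cap_mib)
        if size > free_mib:
            break  # the next file would not fit on the volume
        sizes.append(size)
        free_mib -= size
    return sizes


# bhs.3 is 512MiB and about 2457MiB (2.4G) is free on the volume.
print(next_data_files_mib(512, 2457))               # [1024]
print(next_data_files_mib(512, 2457, cap_mib=512))  # [512, 512, 512, 512]
```

Under these assumptions the bhs database has room for one more 1GiB file before the volume fills up, and other databases and the oplog compete for the same space, so the estimate is optimistic.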