MongoDB – Disk Usage Grows Too Fast When Inserting/Deleting Data

mongodb

My MongoDB version is 2.6.9.

I have an application that inserts, reads, and removes data from MongoDB. That all works fine; however, the disk usage grows extremely fast. My HDD is around 120 GB, and MongoDB consumes about 100 GB of it.

I checked which collection was consuming so much space, but I couldn't run a database repair because there wasn't enough free disk space. So I removed my database completely:

mongodump -d db
echo 'db.dropDatabase()' | mongo db
mongorestore /root/bashscript/backups/dump/db

I reimported all the old documents, except for the collection that was consuming so much space. I recreated that collection with compact and usePowerOf2Sizes (because that should do the trick).

Now when I add new data to this collection (around 200,000 docs), the result is the same: the collection uses 80 GB of disk space.

What am I doing wrong? Am I using MongoDB incorrectly? Is MongoDB just not a good fit for my situation?
I don't understand why it consumes so much disk space.

Best Answer

Well, this depends on a few factors: your working set, the padding factor, and fragmentation.

You will need to check which of these is working against you.

1) Padding: as you said you have padding in place, can you let me know the padding factor of your collection?
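
On MMAPv1 the padding factor is reported in the collection stats. A quick way to read it, using the same pipe-into-the-shell style as above (mycollection is a placeholder for your collection name):

# print the padding factor of the collection (MMAPv1 only)
echo 'print(db.mycollection.stats().paddingFactor)' | mongo db

A value close to 1 means documents rarely move; it creeps toward 2 as in-place growth keeps forcing document moves.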

2) Fragmentation: when documents are moved around, holes are left behind in your data files. MongoDB tries to reuse those holes whenever possible, but if a new document is larger than an available hole, it cannot fit there and the hole stays empty. This is where the compact command and the usePowerOf2Sizes option come to the rescue.
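
For example, to switch an existing collection to power-of-2 allocation and then defragment it in place (mycollection is a placeholder; note that compact blocks all other operations on that database while it runs):

# make future record allocations power-of-2 sized
echo 'db.runCommand({ collMod: "mycollection", usePowerOf2Sizes: true })' | mongo db
# rewrite and defragment the collection's existing records
echo 'db.runCommand({ compact: "mycollection" })' | mongo db

Keep in mind that compact defragments the data files but does not return freed space to the operating system; only repairDatabase or a dump/drop/restore does that.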

Can you let me know the fragmentation figures for your collection? Check:

db.<collection>.stats().storageSize

storageSize is the total space allocated for the collection's documents, including unused pre-allocated space; indexes are reported separately under totalIndexSize. Comparing it with size (the actual data size) shows how much of the allocated space is really in use.
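
For example (mycollection is again a placeholder), you can pull the three figures side by side:

# compare actual data size, allocated storage, and index size
echo 'var s = db.mycollection.stats();
printjson({ size: s.size, storageSize: s.storageSize, totalIndexSize: s.totalIndexSize })' | mongo db

A storageSize that is several times larger than size is a strong sign of fragmentation.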

Since you are on 2.6, you are on MMAPv1; you could switch to the WiredTiger storage engine and test how it handles fragmentation. You will need to upgrade to MongoDB 3.0 to be able to use that storage engine.
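
Switching engines requires a dump and restore, because the data files of the two engines are not compatible. A rough sketch, reusing your dump directory from above; the new data directory /var/lib/mongodb-wt is just an example, and mongod would normally be started through your service manager:

# dump everything while the old MMAPv1 mongod is still running
mongodump --out /root/bashscript/backups/dump

# stop the old mongod, then start a 3.0 mongod with WiredTiger on an empty data directory
mongod --storageEngine wiredTiger --dbpath /var/lib/mongodb-wt

# restore the dump into the WiredTiger instance
mongorestore /root/bashscript/backups/dump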

If none of the above options works, we are left with a tougher one: your application has to minimize fragmentation itself by pre-allocating documents at their maximum expected size and making sure document growth stays within that allocation.
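
A common way to do this on MMAPv1 is to insert each document together with a filler field sized to its maximum expected size and immediately unset it; the record keeps its full allocation, so later growth happens in place. A minimal sketch (mycollection and the ~1 KB target size are just examples):

# pre-allocate roughly 1 KB per document, then drop the filler so future updates fit in place
echo 'var filler = new Array(1024).join("x");
db.mycollection.insert({ _id: 1, data: "initial", pad: filler });
db.mycollection.update({ _id: 1 }, { $unset: { pad: "" } });' | mongo db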