MongoDB – How to Manage Backup Sizes in MongoDB


I'm backing up a MongoDB server using (for better or worse) mongodump. After running db.stats() I get a storageSize of 8.9GB and a dataSize of 44.5GB. This makes sense to me as I expect a lot of the data (at least 1/2) to be highly compressible.

My question is this: why would my backups be 45GB in size if the storage size is only 9GB? I've also checked my actual disk usage (the disk is used ONLY for mongo), and it shows slightly over 9GB.

I'm running mongodump with the following options:

/usr/bin/mongodump -h $HOST -d $DBNAME --username=$DBUSER --password=$DBPASS --archive="$OUTPUT"

I'm aware of the gzip option, but if the data is stored compressed, surely it should be backed up compressed too – and gzip should simply be a second level of compression?

Best Answer

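Actually, if you don't use the --gzip parameter, the data is written in BSON format, which is not compressed, and you can inspect it with the bsondump program. What mongodump does is read the data from the database through a normal connection, so the server decompresses it first and the result is of course not compressed, regardless of how it is stored on disk.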
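As a minimal sketch of what that looks like with the command from the question, the same mongodump invocation can be wrapped with --gzip added. The function name dump_gzip and the positional output path are assumptions made for illustration; HOST, DBNAME, DBUSER and DBPASS are the variables already used in the question.

```shell
# Hypothetical helper: same options as the question's command, plus --gzip.
# Usage: dump_gzip /path/to/backup.archive.gz
dump_gzip() {
    output="$1"
    # --gzip compresses the dump as it is written; combined with --archive
    # the result is a single gzip-compressed archive file.
    mongodump -h "$HOST" -d "$DBNAME" \
        --username="$DBUSER" --password="$DBPASS" \
        --gzip --archive="$output"
}
```

An archive written this way should be restored with the matching options, for example mongorestore --gzip --archive="$OUTPUT".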

Here is the difference between the two on my system:

-rw-r--r-- 1 root root 61M Mar 16 06:48 data.bson
-rw-r--r-- 1 root root 5.3M Mar 16 06:48 data.bson.gz

That is a compression ratio of more than tenfold.
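For reference, the ratio can be worked out from the sizes quoted above. This is only a small sketch; the helper name ratio is made up for illustration, and the inputs are the rounded figures from the listing and from the question's db.stats() output.

```shell
# Hypothetical helper: prints uncompressed size divided by compressed size.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio 61 5.3     # data.bson vs data.bson.gz from the listing (sizes in MB)
ratio 44.5 8.9   # dataSize vs storageSize from the question (sizes in GB)
```

The first call gives roughly 11.5 and the second 5.0, so the uncompressed dump in the question being about five times larger than the on-disk storage is consistent with the dataSize figure.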