MongoDB – How to Restore a gzipped Tarball with mongorestore --gzip?

Tags: backup, compression, mongodb, mongodb-3.2, restore

I created a backup with

mongodump -h $RANDOM_SECONDARY_SUP -u $BACKUP_USER -p $BACKUP_PASSWD --out /data/$BACKUP_USER/sup-repl/sup_$DUMPFILE
tar cvzf /data/$BACKUP_USER/dbt/st_$DUMPFILE.tar.gz /data/$BACKUP_USER/st-repl/st_$DUMPFILE

I tried to restore it with --gzip and received an error:

mv st_$DUMPFILE.tar.gz test_restore.bson.gz
mongorestore --gzip test_restore.bson.gz
2016-09-08T18:05:14.975+0200    checking for collection data in tgdmead2_test_restore.bson.gz
2016-09-08T18:05:15.090+0200    restoring test.test_restore from tgdmead2_test_restore.bson.gz
2016-09-08T18:05:15.156+0200    Failed: test.test_restore: error restoring from test_restore.bson.gz: reading bson input: invalid BSONSize: 1635017060 bytes

In the docs I read that the gzipped tarball should be created with --archive.

What does this parameter do? How can I emulate it with tar and gzip?

Which format does mongorestore expect with --gzip?

Best Answer

I think the problem is that the archive format created with --archive is not a tarball (and the docs don't say it is anywhere that I could find). Rather, it is a custom packaging format whose details can be seen in the tools' source code. Based on a quick scan of that code, it looks like a lightweight format: a series of headers and metadata describing the raw BSON. Short of writing a standalone binary that produces a compatible archive file, you can't create one manually.
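The BSONSize in the error message can be decoded as a sanity check. Given a single .bson.gz file, mongorestore --gzip appears to decompress it and read the result as raw BSON, where every document starts with a 4-byte little-endian length. The following is a minimal sketch, assuming the tarball was written by GNU tar (which strips the leading / so the first member name begins with data/). le32 is a hypothetical helper, not part of any MongoDB tool, and the path passed to it is only illustrative:

```shell
# Interpret the first four bytes of a string as a little-endian 32-bit
# integer, the way a BSON document length prefix is encoded.
le32() {
  # od -An -tu1 prints each byte as an unsigned decimal number
  set -- $(printf '%s' "$1" | head -c 4 | od -An -tu1)
  echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# A tar stream starts with the first member's name, here "data/...".
le32 "data/backup/st-repl"   # prints 1635017060
```

The result equals the size reported in the log, which suggests that mongorestore was reading the tar header's file name where it expected a BSON length.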

If you run mongodump without --archive but with --gzip, it gzips the individual files in the dump directory. You can emulate that by doing the dump normally and then gzipping each file in the folder separately. In theory, those compressed files could then be restored with mongorestore --gzip.
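The per-file approach could be sketched as follows. This is an untested illustration, assuming the dump directory contains only the usual .bson and .metadata.json files; gzip_dump_dir and the dump/ path are hypothetical names:

```shell
# gzip_dump_dir is a hypothetical helper: it compresses every .bson and
# .json file under a plain mongodump output directory in place, giving
# a per-file .gz layout like the one mongodump --gzip writes.
gzip_dump_dir() {
  find "$1" -type f \( -name '*.bson' -o -name '*.json' \) -exec gzip {} +
}

# Intended use (dump/ is an assumed path):
#   mongodump --out dump/
#   gzip_dump_dir dump/
#   mongorestore --gzip dump/
```

For a tarball that already exists, the simpler route is probably to extract it with tar xzf and point mongorestore, without --gzip, at the extracted dump directory.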

Overall, I would advise just using --archive in the 3.2 tools rather than trying to recreate its format manually; the --gzip option is as close as you are going to get without a bunch of work.
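For reference, a round trip with the built-in archive format might look like the sketch below. The wrapper names and the file path are hypothetical, and connection and authentication options (-h, -u, -p) are left out for brevity:

```shell
# Hypothetical wrappers around the 3.2 tools' --archive and --gzip options.
dump_archive() {
  # Write one gzip-compressed archive file instead of a dump directory
  mongodump --gzip --archive="$1"
}

restore_archive() {
  # Read the same file back; --gzip must match how the archive was written
  mongorestore --gzip --archive="$1"
}

# Intended use (the path is an assumption):
#   dump_archive /data/backup/st.archive.gz
#   restore_archive /data/backup/st.archive.gz
```

The resulting file is not a tarball, so tar cannot list or extract it; only mongorestore reads it.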