MongoDB – Fixing Mongo Import Error: Exception: Read Error or Input Line Too Long

mongodb sharding

I have a sharded collection which I'm trying to migrate from an existing production cluster running v2.4 to a new one running v2.6.

As both clusters are sharded, I'm using mongoexport from the old one and mongoimport to the new one so the data will go through the config servers.
The way I'm doing it is to call a mongoexport process and pipe its output into a mongoimport process.

I'm guessing that the export | import pipeline affects the size of the exported JSON, which then fails to import into the new cluster.
Is there any way I can avoid this problem, or even skip the documents that cause this error? (I'd rather lose a few than lose all 8 million records.)

Is there a way to work around this?

Thanks in advance

Best Answer

For data migration (or backup) between MongoDB servers, you should be using mongodump and mongorestore (binary backups) rather than mongoimport/mongoexport (text backups).

Backups (and restores) of sharded collections need to be done through a mongos.

There are several reasons to use mongodump/mongorestore:

  • binary backups preserve type fidelity in your BSON documents; mongoexport to a text format can lose type information because JSON can only represent a subset of the BSON types

  • you can dump/restore full databases (rather than being limited to a single collection)

  • information on index definitions is included so they can be recreated by mongorestore
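As a rough illustration of the dump/restore approach described above, here is a minimal Python sketch that builds and runs the two commands, each pointed at a mongos. The host names, database name, collection name, and output directory in the usage note are placeholders (assumptions, not values from the question), and the helper names `dump_cmd`, `restore_cmd`, and `migrate` are hypothetical. It assumes the `mongodump` and `mongorestore` binaries are on the PATH.

```python
import subprocess


def dump_cmd(mongos_host, db, collection, out_dir):
    """Build a mongodump command that reads the collection through a mongos."""
    return ["mongodump", "--host", mongos_host, "--db", db,
            "--collection", collection, "--out", out_dir]


def restore_cmd(mongos_host, db, collection, out_dir):
    """Build a mongorestore command that writes through the target mongos."""
    # mongodump writes its binary output to <out_dir>/<db>/<collection>.bson
    bson_path = "{}/{}/{}.bson".format(out_dir, db, collection)
    return ["mongorestore", "--host", mongos_host, "--db", db,
            "--collection", collection, bson_path]


def migrate(old_mongos, new_mongos, db, collection, out_dir,
            run=subprocess.check_call):
    """Dump from the old cluster, then restore into the new one.

    `run` is injectable so the command lines can be inspected without
    actually launching the tools.
    """
    run(dump_cmd(old_mongos, db, collection, out_dir))
    run(restore_cmd(new_mongos, db, collection, out_dir))
```

A call might look like `migrate("old-mongos:27017", "new-mongos:27017", "mydb", "events", "/backup/dump")`, with every argument replaced by your own values. Unlike the export | import pipe, the dump lands on disk first, so the staging directory needs enough free space for the whole collection.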

If you're trying to do a live migration by dumping/exporting records, this is going to be problematic because your dump will not represent a single point in time if there is write activity on the collections you are dumping.

You definitely want to disable the balancer on the source cluster before taking your mongodump, to ensure there are no active data migrations.
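One way to make sure the balancer is turned back on afterwards is to wrap the dump in a small helper. The sketch below is an assumption-laden example, not a prescribed procedure: it shells out to the `mongo` shell with `--eval` and calls the shell helpers `sh.stopBalancer()` and `sh.startBalancer()` against a mongos. The helper names `balancer_cmd` and `balancer_paused` and the host name in the usage note are hypothetical.

```python
import subprocess
from contextlib import contextmanager


def balancer_cmd(mongos_host, enabled):
    """Build a mongo shell command that starts or stops the balancer."""
    js = "sh.startBalancer()" if enabled else "sh.stopBalancer()"
    return ["mongo", "--host", mongos_host, "--eval", js]


@contextmanager
def balancer_paused(mongos_host, run=subprocess.check_call):
    """Stop the balancer for the duration of the with-block.

    The balancer is re-enabled even if the body raises, so a failed
    dump does not leave the source cluster unbalanced.
    """
    run(balancer_cmd(mongos_host, enabled=False))
    try:
        yield
    finally:
        run(balancer_cmd(mongos_host, enabled=True))
```

Usage would be along the lines of `with balancer_paused("old-mongos:27017"): ...`, with the mongodump invocation placed inside the block.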

You may also want to look into a live migration tool like hydra.