Mongodb – mongoexport not exporting all possible data


I have a collection with documents similar to the following:

{
 _id: 'p_123456',
 id: 123456,
 kind: 'person',
 data: [...]
}

The id field can contain either positive or negative integers.
This collection contains close to 100 million documents, and I have a Python script that I use to process the data from the collection.

I'm trying to export all the data, split across 8 separate processes, using the $mod operator in the following way:
mongoexport -u user -p password -d db -c collection --query "{\$and:[{kind:'person'},{id:{\$mod:[8,$i]}}]}" | python process.py
where $i is a number from 0 to 7.
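The eight commands described above can be generated from a single template. The following is only a minimal sketch: it builds the argument lists and prints them, and it does not run anything. build_export_commands is a hypothetical helper name, and user, password, db and collection are the placeholders from the command above.

```python
import shlex


def build_export_commands(workers=8):
    """Return one mongoexport argument list per worker (hypothetical helper)."""
    commands = []
    for i in range(workers):
        # Same query as in the question, with the remainder filled in per worker.
        query = "{$and:[{kind:'person'},{id:{$mod:[%d,%d]}}]}" % (workers, i)
        argv = [
            "mongoexport",
            "-u", "user",
            "-p", "password",
            "-d", "db",
            "-c", "collection",
            "--query", query,
        ]
        commands.append(argv)
    return commands


commands = build_export_commands()
for argv in commands:
    # shlex.quote makes each argument safe to paste into a shell.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Passing the query as a single list element avoids the shell escaping of $and and $mod that the one-line version needs.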

For some reason, when I use this method with 8 processes, not all the data is exported: only 65 million out of 87 million documents for this specific kind.

If I run a single mongoexport process with only the query {kind:'person'}, all 87 million documents are exported.

Is it possible that running 8 different processes with $mod:[8,0] to $mod:[8,7] isn't enough to export all the data? What am I missing here?

EDIT 1:
Following AdamC's suggestion, I ran the following map-reduce:

var map = function() {
  if (this.kind == 'person'){
      var key = this.id % 8
      if (key < 0){
          key = key * (-1)
      }
      emit( key, 1 );
  }
};


var reduce = function( key, values ) {    
  return Array.sum(values);
}

The results came back as expected, about 87 million records in total:

> db.map_reduce_modulus.find()
{ "_id" : 0, "value" : 10886482 }
{ "_id" : 1, "value" : 10878131 }
{ "_id" : 2, "value" : 10881552 }
{ "_id" : 3, "value" : 10882586 }
{ "_id" : 4, "value" : 10886060 }
{ "_id" : 5, "value" : 10882565 }
{ "_id" : 6, "value" : 10886171 }
{ "_id" : 7, "value" : 10883563 }

Therefore, I don't think this is a type issue.

Thanks in advance.
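One detail in the map function above is worth checking: it flips negative keys to positive before emitting, which means the remainder of a negative id is itself negative there. The sketch below reproduces that arithmetic in Python on a few made-up ids. It rests on an assumption that should be verified against the server in use: that the $mod query operator also keeps the sign of the dividend. If it does, queries for remainders 0 to 7 would never match a negative id whose remainder is non-zero, while the map-reduce would still count it. truncated_remainder, split_by_workers and sample_ids are hypothetical names and data.

```python
def truncated_remainder(value, divisor):
    """Remainder that keeps the sign of the dividend, like JavaScript's %."""
    remainder = abs(value) % divisor
    return -remainder if value < 0 else remainder


def split_by_workers(ids, workers=8):
    """Split ids into those matched by remainders 0..workers-1 and those missed.

    Assumption: the query-side modulo behaves like truncated_remainder.
    """
    matched, missed = [], []
    for value in ids:
        if truncated_remainder(value, workers) in range(workers):
            matched.append(value)
        else:
            missed.append(value)
    return matched, missed


sample_ids = [17, 9, -9, -17, 16, -16]  # made-up values for illustration
matched, missed = split_by_workers(sample_ids)
print("matched:", matched)
print("missed: ", missed)

# The map function folds negative keys back with key * (-1),
# so it counts the missed ids in the positive buckets.
folded = [abs(truncated_remainder(value, 8)) for value in missed]
print("buckets used by the map function for the missed ids:", folded)
```

Under that assumption, each worker would need to match both remainder i and remainder -i (for example with an $or of two $mod clauses) to cover negative ids.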

Best Answer

I can only speculate without having access to the data, but I suspect that you have non-numeric data in the id field. Try using the $type operator to verify that none of the id values have been stored as strings or similar, and that the field is homogeneous in terms of type across all documents.
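The $type check suggested above can be written as a filter that selects documents whose id is not numeric. This is a minimal sketch that only builds the filter document. non_numeric_id_filter is a hypothetical helper, and the commented pymongo usage assumes a reachable server and the db and collection names from the question.

```python
def non_numeric_id_filter(kind="person"):
    """Build a filter for documents of this kind whose id is not a number.

    "number" is the $type alias covering the numeric BSON types.
    Note that $not also matches documents where the id field is missing.
    """
    return {"kind": kind, "id": {"$not": {"$type": "number"}}}


flt = non_numeric_id_filter()
print(flt)

# Possible usage with pymongo (assumes a running server; not executed here):
#   from pymongo import MongoClient
#   collection = MongoClient()["db"]["collection"]
#   print(collection.count_documents(flt))
```

A count of zero would suggest that the id values are all numeric and that the missing documents have another cause.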