MongoDB – mongodump of a sharded collection is smaller after restoring

mongodb, mongodb-3.2, mongodump, mongorestore, sharding

I need to change the shard key of a sharded collection from ranged to hashed, so I need to mongodump and restore the collection. I have 3 shards on 3 replica sets. The MongoDB version is 3.2.11.
I started mongodump from one of the mongos instances:

mongodump --collection Collname --db dbname

The main collection reports "count" : 12925651, "size" : 21233976913.

But mongodump finishes at 75% with:

[##################......] dbname.Collname 9700124/12925651 (75.0%)
done dumping dbname.Collname (9700124 documents)

I restored the dump into another database, and the restored collection is smaller than the main collection.

It seems that mongodump doesn't take data from one of the shards. The shard distribution is:

Totals
data : 19.77GiB docs : 12925651 chunks : 475
Shard shard01 contains 24.98% data, 24.98% docs in cluster, avg obj size on shard : 1KiB
Shard shard02 contains 50.03% data, 50.03% docs in cluster, avg obj size on shard : 1KiB
Shard shard03 contains 24.98% data, 24.98% docs in cluster, avg obj size on shard : 1KiB

What is the problem? And what can I do to replace the shard key?

Best Answer

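If the cluster is healthy, documents with the same shard key do not exist on different shards. If there are "duplicates", they are "orphans": copies left behind on a shard that no longer owns their chunk, typically after an interrupted or failed chunk migration. A count on a sharded collection can include orphans, while a mongodump run through mongos should return only the documents from the shard that owns each chunk, so the dump can contain fewer documents than the reported count. Orphans should be cleaned from the cluster.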
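The numbers in the question are consistent with this reading, although they do not prove it. A minimal sketch of the arithmetic, runnable in Node.js: the variable names are illustrative, and the per-shard figures are only estimates derived from the rounded percentages in the shard distribution output.

```javascript
// Figures copied from the question.
var reportedCount = 12925651; // "count" reported for the collection
var dumpedCount = 9700124;    // documents actually written by mongodump

// Documents counted by the cluster but never dumped.
var missing = reportedCount - dumpedCount;
var missingPct = (100 * missing / reportedCount).toFixed(2);

// Rough per-shard document estimates from the rounded percentages.
var shardPct = { shard01: 24.98, shard02: 50.03, shard03: 24.98 };
var shardDocs = {};
Object.keys(shardPct).forEach(function (name) {
  shardDocs[name] = Math.round(reportedCount * shardPct[name] / 100);
});

console.log("missing documents: " + missing + " (" + missingPct + "%)");
console.log("estimated docs per shard:", shardDocs);
```

About a quarter of the counted documents are missing from the dump, and shard02 reports roughly twice as many documents as each of the other shards. The surplus on shard02 is close to the number of missing documents, which is what you would expect if shard02 still held orphaned copies of chunks that now belong to another shard.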

Here is more information about orphans and how to remove them.
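For reference, MongoDB 3.2 removes orphans with the `cleanupOrphaned` admin command, which is run directly against the primary of each shard's replica set rather than through mongos. A minimal sketch of the usual loop follows. The function name `cleanupAllOrphans` and the wrapper shape are illustrative assumptions, not part of MongoDB, and the namespace is the one from the question. Treat it as a starting point to check against the documentation for your version, not a definitive procedure.

```javascript
// Repeatedly runs cleanupOrphaned until the whole shard key range is covered.
// `adminDb` is assumed to be the admin database of a shard PRIMARY,
// e.g. db.getSiblingDB("admin") in a mongo shell connected to that mongod.
function cleanupAllOrphans(adminDb, namespace) {
  var nextKey = {}; // empty document as the starting key for the first pass
  var passes = 0;
  while (nextKey != null) {
    var result = adminDb.runCommand({
      cleanupOrphaned: namespace,
      startingFromKey: nextKey
    });
    if (result.ok != 1) {
      throw new Error("cleanupOrphaned failed: " + JSON.stringify(result));
    }
    passes += 1;
    // stoppedAtKey is absent once the last range has been processed,
    // which ends the loop.
    nextKey = result.stoppedAtKey;
  }
  return passes;
}
```

In a mongo shell connected to a shard primary, this could be called as `cleanupAllOrphans(db.getSiblingDB("admin"), "dbname.Collname")`, repeated for each shard. Afterwards the collection count and the number of documents written by mongodump should agree, provided no chunk migration is in progress.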