MongoDB: sharded replica set: data distribution

mongodb

  1. How does the data get distributed among the shards?
  2. How do I back up and restore an infrastructure with 2 shards (3 config servers and 1 mongos)?
  3. When should I choose a MongoDB shard with a replica set vs. a MongoDB replica set only?
  4. I have 2 shards. I use mongodump and mongorestore for backup and restore. Is this good practice?
  5. Sometimes the 3 config servers are out of sync. What does it mean? How can I quickly get rid of these issues?

Best Answer

Sometimes the 3 config servers are out of sync. What does it mean? How can I quickly get rid of these issues?

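Out of sync means the config servers no longer hold identical copies of the cluster metadata. The most convenient fix is to copy the contents of the configdb data directory (with cp or rsync) from a consistent config server to the inconsistent one. Stop the mongod process on the server being overwritten before copying, and restart it afterwards.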
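Before copying anything, it helps to know which config server actually differs. The following is a minimal sketch, assuming mirrored config servers that can each be reached directly and that pymongo is installed. It compares the md5 value returned by the dbhash command on the config database. The host names are hypothetical, and the majority rule is only an illustration: if all three hashes differ, there is no meaningful majority and the metadata needs manual inspection.

```python
from collections import Counter


def fetch_config_hash(host):
    """Return the md5 that dbhash reports for the config database on one server."""
    # Imported lazily so out_of_sync() below can be used without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(host, serverSelectionTimeoutMS=5000)
    try:
        return client["config"].command("dbhash")["md5"]
    finally:
        client.close()


def out_of_sync(hashes):
    """Given {host: md5}, return the hosts whose hash differs from the most common one."""
    if not hashes:
        return []
    majority, _ = Counter(hashes.values()).most_common(1)[0]
    return sorted(host for host, md5 in hashes.items() if md5 != majority)


if __name__ == "__main__":
    # Hypothetical host names; replace with your own config servers.
    hosts = ["cfg1.example.net:27019", "cfg2.example.net:27019", "cfg3.example.net:27019"]
    print(out_of_sync({h: fetch_config_hash(h) for h in hosts}))
```

Any host this reports is a candidate for being overwritten from one of the others.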

When should I choose a MongoDB shard with a replica set vs. a MongoDB replica set only?

Replica set ONLY: Choose this when you want your MongoDB data to be highly available and redundant. You can also offload reporting queries to secondary nodes, and the cluster can survive network partitions and support disaster recovery.

Sharding ONLY: Choose this when you want to scale your MongoDB cluster horizontally, balancing load by distributing data and traffic across multiple hosts. For example, enabling sharding on a database lets you partition its collections across the shards.
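To give a feel for how partitioning spreads documents, here is a toy sketch in Python. It is not MongoDB's actual mechanism: MongoDB splits a collection into chunks by shard key and balances those chunks across shards, whereas this example simply hashes each key and takes it modulo the number of shards. The shard names and the helper functions are made up for illustration.

```python
import hashlib
from collections import Counter


def shard_for(key, shards):
    """Pick a shard by hashing the shard key value (toy scheme, not MongoDB's chunk mapping)."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def distribute(keys, shards):
    """Count how many keys land on each shard."""
    return Counter(shard_for(k, shards) for k in keys)


# 10,000 sequential ids spread over two hypothetical shards.
counts = distribute(range(10_000), ["shard0", "shard1"])
print(counts)
```

Even with sequential ids, hashing the key spreads documents roughly evenly, which is the motivation for choosing a hashed shard key when the natural key increases monotonically.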

Sharding with replica set: Choose this when you want all the benefits above, i.e. horizontal scaling to handle large data volumes and balance load, plus high availability (HA) within each shard.