MongoDB Tag-Aware Sharding Workflow

mongodb performance sharding

I would like to understand the workflow of a sharded cluster that uses tag-aware sharding.

--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "version" : 3,
    "minCompatibleVersion" : 3,
    "currentVersion" : 4,
    "clusterId" : ObjectId("546df06d63a15917a8356f4e")
}
  shards:
    {  "_id" : "shard0000",  "host" : "hadoop4:27017",  "tags" : [  "TR" ] }
    {  "_id" : "shard0001",  "host" : "hadoop5:27018",  "tags" : [  "US" ] }
    {  "_id" : "shard0002",  "host" : "hadoop6:27017",  "tags" : [  "OTHER" ] }

  databases:
    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
    {  "_id" : "inventory",  "partitioned" : true,  "primary" : "shard0000" }

With tag-aware sharding, when I run mongorestore through mongos, is all the data first written to the database's primary shard, with the appropriate chunks migrated afterwards?

Or can it write to the appropriate chunk according to the tag range from the beginning of the insert?

For example, when I insert the document {country:"US",keyword:"abc"}, does it go to shard0001 right away, or does it go to the primary shard shard0000 first, with the chunk then migrated to shard0001?
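For reference, the status output above shows the shard tags but not the tag ranges. A minimal sketch of how such ranges might be defined with the mongo shell helper `sh.addTagRange` follows. The namespace `inventory.keywords`, the shard key `{ country: 1 }`, and the function name `addCountryTagRanges` are assumptions for illustration only; none of them appear in the question.

```javascript
// Hypothetical namespace and shard key: inventory.keywords sharded on { country: 1 }.
// The shell's `sh` helper is passed in as a parameter so the intent stays explicit.
function addCountryTagRanges(sh) {
  // In a tag range the lower bound is inclusive and the upper bound is exclusive,
  // so ["TR", "TS") covers the exact value "TR" and ["US", "UT") covers "US".
  sh.addTagRange("inventory.keywords", { country: "TR" }, { country: "TS" }, "TR");
  sh.addTagRange("inventory.keywords", { country: "US" }, { country: "UT" }, "US");
}
```

In the mongo shell, connected to mongos, this would be called as `addCountryTagRanges(sh)`. Ranges for the "OTHER" tag are left out of the sketch.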

Best Answer

Whether migrations occur, and how many, depends on the distribution of your inserts.

1) You start with one chunk {minkey, maxkey}, which lives on the primary shard (let's say shard0000).

2) As you insert data, a split will occur. It might be one of:

2.1) {minkey, US} {US, maxkey} -> the second chunk moves to shard0001

2.2) {minkey, TR} {TR, maxkey} -> the first chunk moves to shard0001

2.3) {minkey, OTHER} {OTHER, maxkey} -> the second chunk moves to shard0002

3) I can't analyze every path, but for 2.1):

{US, maxkey} will not move again, and neither will any chunk later split from it.

Whether {minkey, US} moves depends on the next split.

....
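The steps above can be sketched as a small simulation in plain JavaScript. This is a simplified model, not MongoDB's actual balancer: it assumes a shard key on `country`, a single hypothetical tag range ["US", "UT") for the tag "US", and split points chosen by hand. The names `route`, `splitAt` and `balance` are made up for the sketch.

```javascript
// Simplified model of chunk routing; not MongoDB's actual implementation.
const MIN = Symbol("MinKey"); // stand-ins for MongoDB's MinKey / MaxKey
const MAX = Symbol("MaxKey");

// Order keys so that MIN sorts before, and MAX after, every string.
function cmp(a, b) {
  if (a === b) return 0;
  if (a === MIN || b === MAX) return -1;
  if (a === MAX || b === MIN) return 1;
  return a < b ? -1 : 1;
}

// Shard tags as shown in the sharding status output.
const shardTags = { shard0000: "TR", shard0001: "US", shard0002: "OTHER" };

// Assumed tag range: shard key values in ["US", "UT") carry the tag "US".
const tagRanges = [{ min: "US", max: "UT", tag: "US" }];

// A newly sharded collection starts as one chunk on the primary shard.
let chunks = [{ min: MIN, max: MAX, shard: "shard0000" }];

// mongos sends a write to whichever shard owns the chunk covering the key.
function route(key) {
  return chunks.find(c => cmp(c.min, key) <= 0 && cmp(key, c.max) < 0).shard;
}

// Split the chunk that strictly contains `key` into [min, key) and [key, max).
function splitAt(key) {
  chunks = chunks.flatMap(c =>
    cmp(c.min, key) < 0 && cmp(key, c.max) < 0
      ? [{ ...c, max: key }, { ...c, min: key }]
      : [c]);
}

// Move every chunk lying entirely inside a tag range to a shard with that tag.
function balance() {
  for (const c of chunks) {
    const r = tagRanges.find(t => cmp(t.min, c.min) <= 0 && cmp(c.max, t.max) <= 0);
    if (!r) continue;
    c.shard = Object.keys(shardTags).find(s => shardTags[s] === r.tag);
  }
}

const before = route("US"); // only one chunk exists, and it is on the primary
splitAt("US");
splitAt("UT");
balance();
const after = route("US");  // the ["US", "UT") chunk now lives on the US-tagged shard
console.log(before, after);
```

In this model a {country:"US"} insert lands on shard0000 while the single initial chunk still covers it, and on shard0001 once the chunk has been split and moved.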