MongoDB – sharding issue with chunk split and data transfer

mongodb, sharding

I am getting this error on my primary shard:

warning: moveChunk failed to engage TO-shard in the data transfer: still waiting for a previous migrates data to get cleaned, can't accept new chunks, num threads: 3
Tue Jul 22 11:06:47.639 [conn405736] MigrateFromStatus::done About to acquire global write lock to exit critical section

This is from the mongos log:

Thu Jul 24 11:01:52.056 [conn148389] warning: splitChunk failed - cmd: { splitChunk: "ibeat20140724.visitor", keyPattern: { host: 1 }, min: { host: "eisamay.indiatimes.com" }, max: { host: "timesofindia.indiatimes.com" }, from: "shard0000", splitKeys: [ { host: "navbharattimes.indiatimes.com" } ], shardId: "ibeat20140724.visitor-host_"eisamay.indiatimes.com"", configdb: "192.168.24.192:27017,192.168.24.54:27017,192.168.24.55:27017" } result: { who: { _id: "ibeat20140724.visitor", process: "ibeatdb61:27017:1399428004:822699867", state: 1, ts: ObjectId('53d09a4705eb5bebaa264d1f'), when: new Date(1406179911981), who: "ibeatdb61:27017:1399428004:822699867:conn426252:1780690214", why: "split-{ host: "eisamay.indiatimes.com" }" }, ok: 0.0, errmsg: "the collection's metadata lock is taken" }

This is from the config.changelog collection:

{ "_id" : "ibeatdb61-2014-07-21T23:28:04-53cda20405eb5bebaae92ed5", "server" : "ibeatdb61", "clientAddr" : "192.168.22.106:57441", "time" : ISODate("2014-07-21T23:28:04.308Z"), "what" : "moveChunk.from", "ns" : "ibeat20140721.pageTrendLog", "details" : { "min" : { "host" : { "$minKey" : 1 } }, "max" : { "host" : "beautypageants.indiatimes.com" }, "step1 of 6" : 0, "step2 of 6" : 177, "note" : "aborted" } }

Could you please help me resolve these errors?

Regards
Viren

Best Answer

You really only have two choices when a primary is still processing deletes from previous migrations (which is why you are getting the "failed to engage TO-shard" error):

  1. Wait for the deletes to finish (a way to check their progress is sketched at the end of this answer).
  2. Step down the primary of that shard, assuming it is a replica set (a minimal example follows this list).
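For option 2, the step down itself is a single shell helper. A minimal sketch, run from a mongo shell connected to the primary of the affected shard (not through mongos):

    // Ask the primary to step down and not seek re-election for 60
    // seconds. This drops client connections and terminates the
    // background migration clean-up (delete) threads described below.
    rs.stepDown(60)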

The first option may take a long time if the shard in question is under heavy load, but it is the safest way forward. The second option causes the primary to step down to secondary and terminates the background threads doing the deletes; if the shard is a single node rather than a replica set, your only equivalent is a restart, with the downtime that implies. Either way, the new primary (or restarted node) will then be able to accept migrations, but the interrupted deletes leave behind orphaned documents that you will need to clean up manually later; one way to do that is sketched below.
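For that clean-up, MongoDB 2.6 and later ship a cleanupOrphaned admin command; your log format suggests 2.4, where orphans have to be removed with careful manual remove operations instead, so treat the following as a sketch for after an upgrade. The usual pattern is to re-run the command from the last stoppedAtKey until the whole key space has been scanned, against the primary of each shard (not through mongos); the namespace is taken from your logs:

    // Repeatedly invoke cleanupOrphaned, resuming each time from the
    // key where the previous invocation stopped. stoppedAtKey comes
    // back null once the full range has been scanned.
    var nextKey = {};
    var result;
    while (nextKey != null) {
        result = db.adminCommand({
            cleanupOrphaned: "ibeat20140724.visitor",
            startingFromKey: nextKey
        });
        if (result.ok != 1) {
            // Usually a concurrent migration or an error; inspect and retry.
            printjson(result);
            break;
        }
        nextKey = result.stoppedAtKey;
    }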

If you are in any doubt, then the best thing to do is wait for the deletes to finish.
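As for knowing when they have finished: the clean-up threads generally log their progress in the shard's mongod log and, depending on version, may also surface in currentOp output as remove operations. A sketch assuming they do, run from a shell on the shard's primary (the /^ibeat/ namespace filter is an assumption based on the collection names in your logs):

    // List every in-progress operation, including system and idle
    // ones, and print any removes against the ibeat* databases.
    db.currentOp(true).inprog.forEach(function (op) {
        if (op.op === "remove" && /^ibeat/.test(op.ns || "")) {
            printjson(op);
        }
    });

Once nothing shows up here and the log has gone quiet, the shard should accept new migrations again.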