MongoDB: No shardIdentity document

backup, mongodb, sharding

We have a sharded cluster in production that has been running for a good year now without issues. Everything is on 4.0, classic 3-3-3 (config) + 2 mongos.

While revisiting our backup strategy I stumbled upon this step.

I tried finding that value on every node (mongos, primaries, secondaries) and none of them have a shardIdentity.

mongos> db.system.version.find()
{ "_id" : "featureCompatibilityVersion", "version" : "4.0" }
{ "_id" : "authSchema", "currentVersion" : 5 }
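A minimal sketch of a more targeted check follows. It assumes the shell is connected directly to a shard member's mongod rather than to a mongos, since the shardIdentity document is stored in that node's admin.system.version collection. The helper name findShardIdentity is made up for illustration.

```javascript
// Hypothetical helper: look up the shardIdentity document in the admin
// database of whatever node the given db handle is connected to.
// Returns the document, or null when none exists.
function findShardIdentity(db) {
  return db
    .getSiblingDB("admin")
    .getCollection("system.version")
    .findOne({ _id: "shardIdentity" });
}

// In the mongo shell, connected to a shard member:
//   printjson(findShardIdentity(db))
```

A null result on a shard primary would match the situation described here.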

This looks important, and it is part of the STARTUP2 phase.

For the sake of testing and documenting our backups, I followed all the steps from the "Restore Sharded Cluster" documentation. At the end of it:

  • sh.status() looks good, I can see the shards
  • rs.status() is also good
  • A standalone mongod gives me the proper stats on the restored data
  • mongod output seems fine

But then db.coll.stats() fails with:

{
        "ok" : 0,
        "errmsg" : "Cannot accept sharding commands if sharding state has not been initialized with a shardIdentity document",
        "code" : 203,
        "codeName" : "ShardingStateNotInitialized",
        "operationTime" : Timestamp(1548184216, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1548184216, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}

And I can see this error from mongod:

I CONTROL [LogicalSessionCacheReap] Sessions collection is not set up; waiting until next sessions reap interval: sharding state is not yet initialized

So far I have tried:

  • sh.enableSharding() returns true (1) but doesn't change anything
  • Went through https://github.com/mongodb/mongo/wiki/Sharding-Internals#sharding-component-initialization
  • Double-checked our production cluster: no shardIdentity there either.

So, how does one enforce a shardIdentity or generate one? How do you find out the "shard name"? I couldn't find such a thing. And why would sh.status() give the proper result when the shards cannot be associated with the current data?

Thanks,

Best Answer

At first I didn't want to mess around with the admin database, but it seemed to be the only thing to do.

Gather information

sh.status() on a mongos gives the shard names and the cluster ID that are needed. Something like this:

"clusterId" : ObjectId("3b743bf3134df1f277980ard")
...
shards:
        {  "_id" : "shardname1",  "host" : "[host:port]",  "state" : 1 }
        {  "_id" : "shardname2",  "host" : "[host:port]",  "state" : 1 }

For each replica set

  • Connect to the primary and switch to the admin database

Then:

db.system.version.insert({
    "_id" : "shardIdentity",
    "clusterId" : ObjectId("3b743bf3134df1f277980ard"),
    "shardName" : "REPLICA_SHARD_NAME",
    "configsvrConnectionString" : "conf/[host]"
})

See Shard Aware.

  • Restart all replica set members
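With several shards to repair, the document inserted on each primary can be assembled by a small helper so that only the shard name changes between replica sets. This is a sketch, not a definitive procedure: buildShardIdentity and its parameters are made up for illustration, and it assumes the config server connection string has the form replica set name, a slash, then a comma-separated list of host:port members.

```javascript
// Hypothetical helper: build the shardIdentity document for one shard.
// clusterId should be the ObjectId taken from the cluster itself; this
// function passes it through unchanged.
function buildShardIdentity(clusterId, shardName, configReplSetName, configHosts) {
  return {
    _id: "shardIdentity",
    clusterId: clusterId,
    shardName: shardName,
    // Assumed format: "<replSetName>/<host1:port>,<host2:port>,..."
    configsvrConnectionString: configReplSetName + "/" + configHosts.join(",")
  };
}

// In the mongo shell, on the primary of the shard, with made-up values:
//   var doc = buildShardIdentity(clusterIdFromConfig, "shardname1",
//                                "conf", ["cfg1:27019", "cfg2:27019"]);
//   db.getSiblingDB("admin").getCollection("system.version").insert(doc)
```

Keeping the cluster ID and the config server string in one place reduces the chance of a typo ending up in only one of the shards.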