MongoDB config replica failing connections after running for a few days

mongodb mongodb-3.4

I have set up a MongoDB sharded cluster. It has three shard replica sets, each with three mongod instances, plus one replica set of three config servers and one mongos. It worked fine at the beginning, but the config replica set started failing connections after a few days of running. When I log in to each config server instance, rs.status() gives the output below:

Config server 1:

OTHER> rs.status()
{
    "state" : 10,
    "stateStr" : "REMOVED",
    "uptime" : 121353,
    "optime" : {
        "ts" : Timestamp(1504367995, 1),
        "t" : NumberLong(3)
    },
    "optimeDate" : ISODate("2017-09-02T15:59:55Z"),
    "ok" : 0,
    "errmsg" : "Our replica set config is invalid or we are not a member of it",
    "code" : 93,
    "codeName" : "InvalidReplicaSetConfig"
}

Config server 2:

OTHER> rs.status()
{
    "state" : 10,
    "stateStr" : "REMOVED",
    "uptime" : 121421,
    "optime" : {
        "ts" : Timestamp(1504367995, 1),
        "t" : NumberLong(3)
    },
    "optimeDate" : ISODate("2017-09-02T15:59:55Z"),
    "ok" : 0,
    "errmsg" : "Our replica set config is invalid or we are not a member of it",
    "code" : 93,
    "codeName" : "InvalidReplicaSetConfig"
}

Config server 3:

SECONDARY> rs.status()
{
    "set" : "cnf-serv",
    "date" : ISODate("2017-09-04T01:45:05.842Z"),
    "myState" : 2,
    "term" : NumberLong(3),
    "configsvr" : true,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(0, 0),
            "t" : NumberLong(-1)
        },
        "appliedOpTime" : {
            "ts" : Timestamp(1504367995, 1),
            "t" : NumberLong(3)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1504367995, 1),
            "t" : NumberLong(3)
        }
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "172.19.0.10:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 121454,
            "optime" : {
                "ts" : Timestamp(1504367995, 1),
                "t" : NumberLong(3)
            },
            "optimeDate" : ISODate("2017-09-02T15:59:55Z"),
            "configVersion" : 403866,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "172.19.0.7:27017",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2017-09-04T01:45:02.312Z"),
            "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "Connection refused",
            "configVersion" : -1
        },
        {
            "_id" : 2,
            "name" : "172.19.0.4:27017",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2017-09-04T01:45:02.310Z"),
            "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "Connection refused",
            "configVersion" : -1
        }
    ],
    "ok" : 1
}

It looks like the first two instances are removed and the third config server is a secondary. Based on my understanding, if one instance in a replica set shuts down, one of the remaining healthy instances should be elected primary. Why didn't the third instance become primary in my replica set?

All mongo instances are running version 3.4.4.

Below is the command I used to launch each mongod config server:

mongod --replSet cnf-serv --rest --configsvr --port 27017 --oplogSize 16 --noprealloc --smallfiles
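
For reference, the config server replica set was initiated roughly like the following (a sketch rather than my exact command; the member addresses are the ones that appear in the rs.status() output above):

rs.initiate({
    _id: "cnf-serv",
    configsvr: true,
    members: [
        { _id: 0, host: "172.19.0.10:27017" },
        { _id: 1, host: "172.19.0.7:27017" },
        { _id: 2, host: "172.19.0.4:27017" }
    ]
})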

FYI, the logs of the first two instances show the error message below:

2017-09-04T01:39:23.006+0000 I SHARDING [shard registry reload] Periodic reload of shard registry failed  :: caused by :: 134 could not get updated shard list from config server due to Read concern majority reads are currently not possible.; will retry after 30s
2017-09-04T01:39:53.006+0000 I SHARDING [shard registry reload] Periodic reload of shard registry failed  :: caused by :: 134 could not get updated shard list from config server due to Read concern majority reads are currently not possible.; will retry after 30s
2017-09-04T01:40:23.006+0000 I SHARDING [shard registry reload] Periodic reload of shard registry failed  :: caused by :: 134 could not get updated shard list from config server due to Read concern majority reads are currently not possible.; will retry after 30s
2017-09-04T01:40:53.006+0000 I SHARDING [shard registry reload] Periodic reload of shard registry failed  :: caused by :: 134 could not get updated shard list from config server due to Read concern majority reads are currently not possible.; will retry after 30s

Best Answer

OK. The reason the third server is a secondary and not the primary is the majority requirement. When only one of three servers is up, that single server is not a majority, so it cannot be elected primary. If you lose just one of three servers, the remaining two are fine, because 2/3 is a majority.
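
You can see this directly from the shell on the surviving member (a quick illustration; the exact numbers come from your own rs.conf() and rs.status() output):

// Three voting members are configured, so a primary needs votes from at least two of them.
rs.conf().members.length                                                      // 3
// Only this node is currently healthy, so there is no majority and no election.
rs.status().members.filter(function(m) { return m.health === 1; }).length    // 1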

Back to the underlying problem: it looks like the two servers that are in the "OTHER" (REMOVED) state now have a different address/name from what they had when you added them to the replica set. Check that those two servers still resolve to the same IP address (or DNS name) that is used in the replica set config.
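
If the addresses did change (e.g. the nodes came back up with new IPs), you can compare what the replica set config expects with what the nodes actually have, and if needed force a reconfig from the surviving member. A sketch, assuming you know the new addresses (the NEW_IP values are placeholders, and a forced reconfig should be done with care):

// On the reachable member (172.19.0.10), list the host names stored in the config:
rs.conf().members.map(function(m) { return m.host; })
// [ "172.19.0.10:27017", "172.19.0.7:27017", "172.19.0.4:27017" ]

// If the other two members now have different IPs, update the config and force it through:
cfg = rs.conf()
cfg.members[1].host = "NEW_IP_OF_MEMBER_1:27017"   // placeholder, use the real new address
cfg.members[2].host = "NEW_IP_OF_MEMBER_2:27017"   // placeholder, use the real new address
rs.reconfig(cfg, { force: true })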