MongoDB – Outage due to Replica Set Protocol Version upgrade

mongodb

My production system has a replica set created in MongoDB v3.0, which still has protocol version undefined (i.e. version 0). It is now running v3.4 and I intend to upgrade to replica set protocol version 1, but I'm wary of what will happen while the replica set reconfigures itself.

Is there any risk of an outage at all? Either if all nodes are running normally, or if any nodes have problems during the change? The documentation warns that reconfiguration can trigger the current primary to step down in some situations but doesn't say which situations.

Apart from the protocol version, I am making no other configuration changes; for example, the member priorities will be unchanged so the same priority node will hopefully remain primary.

The current config is:

rsMyApp:PRIMARY> rs.conf()
{
  "_id" : "rsMyApp",
  "version" : 26139,
  "members" : [
    {
      "_id" : 4,
      "host" : "DB2:27017",
      "arbiterOnly" : false,
      "buildIndexes" : true,
      "hidden" : false,
      "priority" : 1,
      "tags" : {},
      "slaveDelay" : NumberLong(0),
      "votes" : 1
    },
    {
      "_id" : 5,
      "host" : "DB3:27017",
      "arbiterOnly" : false,
      "buildIndexes" : true,
      "hidden" : false,
      "priority" : 1,
      "tags" : {},
      "slaveDelay" : NumberLong(0),
      "votes" : 1
    },
    {
      "_id" : 1,
      "host" : "DB1:27017",
      "arbiterOnly" : false,
      "buildIndexes" : true,
      "hidden" : false,
      "priority" : 2,
      "tags" : {},
      "slaveDelay" : NumberLong(0),
      "votes" : 1
    }
  ],
  "settings" : {
    "chainingAllowed" : true,
    "heartbeatIntervalMillis" : 2000,
    "heartbeatTimeoutSecs" : 10,
    "electionTimeoutMillis" : 10000,
    "catchUpTimeoutMillis" : 60000,
    "getLastErrorModes" : {},
    "getLastErrorDefaults" : {
      "w" : "majority",
      "wtimeout" : 0
    }
  }
}

Best Answer

What the protocol change does is alter how the nodes decide when a new primary must be elected. If all SECONDARY nodes are up to date (not lagging behind) and your current PRIMARY has a higher priority than any SECONDARY node, nothing will happen: the system continues as it was.
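
The two conditions above can be checked before touching the configuration. The following is a minimal sketch, not an official procedure: `precheck` and `maxLagSecs` are hypothetical names, and the function only assumes that its inputs are shaped like the output of `rs.status()` (members with `name`, `stateStr`, `optimeDate`) and `rs.conf()` (members with `host`, `priority`).

```javascript
// Hypothetical helper: reports whether the replica set looks "quiet" enough
// that a reconfig should leave the current primary in place.
// `status` is shaped like rs.status(), `config` like rs.conf().
function precheck(status, config, maxLagSecs) {
  var primary = status.members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  })[0];
  if (!primary) {
    return { ok: false, reasons: ["no PRIMARY found"] };
  }

  // Map host name -> configured priority.
  var priorityByHost = {};
  config.members.forEach(function (m) {
    priorityByHost[m.host] = m.priority;
  });

  var reasons = [];
  status.members.forEach(function (m) {
    if (m === primary || m.stateStr === "ARBITER") {
      return; // nothing to compare for the primary itself or for arbiters
    }
    if (m.stateStr !== "SECONDARY") {
      reasons.push(m.name + " is in state " + m.stateStr);
      return;
    }
    // optimeDate values are Dates; subtracting them yields milliseconds.
    var lagSecs = (primary.optimeDate - m.optimeDate) / 1000;
    if (lagSecs > maxLagSecs) {
      reasons.push(m.name + " lags the primary by " + lagSecs + "s");
    }
    if (priorityByHost[m.name] >= priorityByHost[primary.name]) {
      reasons.push(m.name + " has priority >= the current primary");
    }
  });

  return { ok: reasons.length === 0, reasons: reasons };
}
```

In the mongo shell, connected to the primary, this could be called as `precheck(rs.status(), rs.conf(), 5)`; the 5-second lag threshold is an arbitrary illustration, not a recommended value.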

Afterward, however, if your current PRIMARY becomes unavailable, the next primary is elected faster than it was with protocolVersion: 0.
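
For reference, the change itself is a normal reconfiguration in which only `protocolVersion` is set. The sketch below wraps it in a hypothetical helper, `applyProtocolVersion1`, which takes the shell's `rs` object as a parameter so that nothing runs until it is called explicitly. It assumes only the `rs.conf()` and `rs.reconfig()` shell helpers and makes no other configuration changes.

```javascript
// Hypothetical helper: sets protocolVersion to 1 on the current replica set
// configuration and submits it. `replSet` is expected to be the mongo shell's
// `rs` object (anything exposing conf() and reconfig()).
function applyProtocolVersion1(replSet) {
  var cfg = replSet.conf();

  // A set created under 3.0 has no protocolVersion field, which means 0.
  if (cfg.protocolVersion === 1) {
    return { changed: false, config: cfg };
  }

  // Modify the fetched document in place so that shell-specific types
  // (for example NumberLong in slaveDelay) are preserved as they are.
  cfg.protocolVersion = 1;

  // Members, priorities and settings are submitted unchanged.
  var result = replSet.reconfig(cfg);
  return { changed: true, config: cfg, result: result };
}
```

Run from a shell connected to the current primary, the call would be `applyProtocolVersion1(rs)`, after which `rs.conf().protocolVersion` can be inspected to confirm the new value.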