MongoDB – how to change the primary node on a replica set without losing cursors

mongodb

Today, load on the primary of a replica set was extremely high.
Users complained that query/insert operations were very slow.
So I tried to change member priorities to swap the primary and a secondary:

cfg = rs.conf()
cfg.members[0].priority = 1    // favour this member in elections
cfg.members[1].priority = 0.5  // make this member less likely to win
rs.reconfig(cfg)

Then two users complained that their programs threw exceptions because their running cursors were lost.

Is there any way to avoid losing running cursors when the primary changes?

Best Answer

As of MongoDB 3.4, when a primary transitions to a non-primary state (e.g. as the result of an election) all active connections are dropped so they can be re-established on the new primary.

Cursor state is specific to a given mongod, so you cannot resume a cursor on a different member of the replica set.

A recommended area to investigate would be why your primary was heavily loaded and why a change in primary would have reduced the load significantly. Generally electable secondaries in the same replica set should be identically provisioned in terms of hardware resources, so exchanging server roles should have pushed similar load onto the new primary. If the load was coming from suboptimal queries that were terminated on the former primary (or due to other resource contention), you could perhaps have avoided reconfiguring your replica set by finding and addressing the root cause.

The MongoDB manual has some information on how to Evaluate Performance of Current Operations. You should also implement a monitoring solution (if you haven't already) in order to capture a baseline of normal activity and help identify metrics that change significantly when your deployment is under load.
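As a starting point for that investigation, you can inspect in-progress operations via the `currentOp` admin command. A minimal sketch, assuming PyMongo and a reachable deployment (the 5-second threshold and the `report_slow_ops` helper name are arbitrary choices for illustration):

```python
def slow_ops(current_op_doc, min_secs=5):
    """Pick operations that have been running at least `min_secs` seconds
    out of a `currentOp` command result ({"inprog": [...]})."""
    return [op for op in current_op_doc.get("inprog", [])
            if op.get("secs_running", 0) >= min_secs]

def report_slow_ops(client, min_secs=5):
    # `currentOp` is an admin command; PyMongo exposes it via `command`.
    result = client.admin.command("currentOp")
    for op in slow_ops(result, min_secs):
        print(op.get("opid"), op.get("op"), op.get("ns"),
              op.get("secs_running"))
```

Operations surfaced this way can then be examined (and, if genuinely runaway, killed) before resorting to a reconfiguration.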

If you have long running queries which are likely to be interrupted by a restart you could consider:

  • Adding some logic to your application to restart the query using criteria from the last document seen while iterating.
  • Using secondary reads for these queries if your use case can tolerate eventual consistency. Before doing so, it's worth reading Can I use more replica nodes to scale?.
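The first option can be sketched as follows, assuming PyMongo, documents iterated in ascending `_id` order (so the last `_id` seen is a safe resume point), and a hypothetical `handle` callback supplied by your application:

```python
def resume_filter(last_id=None):
    """Build a filter that skips documents already processed."""
    return {} if last_id is None else {"_id": {"$gt": last_id}}

def process_all(collection, handle):
    """Iterate the collection, restarting from the last seen _id if the
    cursor is dropped mid-iteration. PyMongo raises subclasses of
    PyMongoError (e.g. AutoReconnect, CursorNotFound) in that case;
    catching the base class broadly is a simplification for this sketch."""
    from pymongo.errors import PyMongoError
    last_id = None
    while True:
        try:
            for doc in collection.find(resume_filter(last_id)).sort("_id", 1):
                handle(doc)
                last_id = doc["_id"]
            return
        except PyMongoError:
            continue  # re-run the query from the last processed document
```

Any monotonically increasing, indexed field works as the resume key; `_id` is used here because it is indexed by default.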