MongoDB – Primary replica set server goes secondary after secondary fails

mongodb

I have a two-server replica set. When the secondary fails, the primary drops into SECONDARY state while the failed member is in STARTUP2 (recovering). As a result I can't use the collections stored in that replica set; any attempt produces an error like this:

pymongo.errors.OperationFailure: database error: ReplicaSetMonitor no master found for set: rs2

Sometimes, if I restart the mongod instances, the server rs2-1 is primary for a while, but after some time (while the secondary is still recovering) I see this in the logs of rs2-1 (the primary):

Tue May  7 17:43:40.677 [rsHealthPoll] replSet member XXX.XXX.XXX.XXX:27017 is now in state DOWN
Tue May  7 17:43:40.677 [rsMgr] can't see a majority of the set, relinquishing primary
Tue May  7 17:43:40.682 [rsMgr] replSet relinquishing primary state
Tue May  7 17:43:40.682 [rsMgr] replSet SECONDARY
Tue May  7 17:43:40.682 [rsMgr] replSet closing client sockets after relinquishing primary

Is there an easy way to keep the primary as primary after the secondary fails? Am I doing something wrong?

Thanks in advance!

Best Answer

A MongoDB primary must be able to see a majority of the set's voting members in order to remain primary. A majority means strictly more than n/2 members, where n is the number of voting members in the replica set, counting the primary itself. So if n is 1, one machine constitutes a majority. When n is 2 or 3, two machines constitute a majority.
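The arithmetic can be checked with a few lines of Python. This is only an illustrative sketch of the "more than n/2" rule described above, not MongoDB's actual election code; the function names `majority` and `can_hold_primary` are made up for this example.

```python
def majority(voting_members: int) -> int:
    """Smallest member count that is strictly greater than voting_members / 2."""
    return voting_members // 2 + 1


def can_hold_primary(voting_members: int, reachable: int) -> bool:
    """True if a node that can reach `reachable` members (itself included)
    sees a majority of a set with `voting_members` voting members."""
    return reachable >= majority(voting_members)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"n={n}: majority is {majority(n)}")
    # Two-member set, secondary down: the primary reaches only itself.
    print("2 members, 1 reachable:", can_hold_primary(2, 1))
    # Three-member set (two data nodes plus one more voter), one node down.
    print("3 members, 2 reachable:", can_hold_primary(3, 2))
```

With two members, a single surviving node never satisfies the check, which matches the step-down seen in the question's logs.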

Once the secondary is added, the primary must be able to reach two members: itself and the secondary. When that secondary goes down, the primary no longer sees a majority and steps down, which is exactly what the "can't see a majority of the set, relinquishing primary" log line reports.

This scenario is why an arbiter is recommended in addition to the primary and secondary. An arbiter holds no data and only votes in elections, so it is cheap to run and should rarely go down. With two data-bearing machines and one arbiter in the replica set, a primary can exist as long as the arbiter and one data-bearing machine are up, because together they constitute a majority (2 out of 3).
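In the mongo shell the usual way to do this is the rs.addArb() helper, run on the primary. Since the question uses pymongo, here is a rough sketch of the same change expressed as a replica set configuration document. The host names are placeholders, the helper `with_arbiter` is hypothetical, and the starting config is written out by hand rather than read from a server, so treat it as an outline rather than a tested procedure.

```python
import copy


def with_arbiter(config: dict, arbiter_host: str) -> dict:
    """Return a copy of a replica set config with an arbiter member appended
    and the config version incremented (needed for a reconfig)."""
    new_config = copy.deepcopy(config)
    next_id = max(member["_id"] for member in new_config["members"]) + 1
    new_config["members"].append(
        {"_id": next_id, "host": arbiter_host, "arbiterOnly": True}
    )
    new_config["version"] = new_config.get("version", 1) + 1
    return new_config


# Hand-written stand-in for the current config of set "rs2" (hosts are assumed).
current = {
    "_id": "rs2",
    "version": 1,
    "members": [
        {"_id": 0, "host": "rs2-1.example:27017"},
        {"_id": 1, "host": "rs2-2.example:27017"},
    ],
}

proposed = with_arbiter(current, "rs2-arb.example:27017")
print(proposed)

# Applying it would mean sending the document to the primary with the
# replSetReconfig admin command, for example (not run here):
#   pymongo.MongoClient("rs2-1.example", 27017).admin.command("replSetReconfig", proposed)
```

The arbiter itself has to be a running mongod started with the same replica set name (rs2 here) before the reconfiguration is applied, ideally on a third machine so that it does not fail together with either data node.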