MongoDB – Does a replica set with only two machines make sense?

failover high-availability mongodb

A lot of MongoDB tutorials, when they talk about replica sets, give examples of two machines: one originally initialized as primary, and another one originally created as secondary.

As I understand it:

  • If a primary member unexpectedly dies, the secondary one won't elect itself the new primary anyway, since it won't have the majority (which is two in a replica set of two machines).

  • It is impossible to read the data from the database if the primary member is unavailable, even with read_preference=ReadPreference.SECONDARY.

  • Recreating the primary member won't work anyway: it cannot be added to the existing replica set, since that set lacks a primary; nor can it run rs.initiate(), since that would create a different replica set with a different ID.

Therefore, is there a real-life situation where having only two members in a replica set would make any sense? Or is such a situation only theoretical, and one would expect to see at least three members in every replica set in practice?

Best Answer

If a primary member unexpectedly dies, the secondary one won't elect itself the new primary anyway, since it won't have the majority (which is two in a replica set of two machines).

This is correct. In order to elect (and maintain) a primary, a majority of voting members need to be available.
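The arithmetic behind this can be sketched in a few lines of Python. This is only an illustrative model, not MongoDB's election code; the function names `majority` and `can_elect_primary` are made up for the example, and it assumes every member has one vote.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is strictly more than half."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members: int, reachable: int) -> bool:
    """A primary can be elected (and kept) only while a majority is reachable."""
    return reachable >= majority(voting_members)


# Compare a two-member and a three-member set after losing one member.
for total in (2, 3):
    survivors = total - 1
    print(
        f"{total} members: majority={majority(total)}, "
        f"primary possible with {survivors} left: "
        f"{can_elect_primary(total, survivors)}"
    )
```

With two members the majority is two, so a single survivor cannot become primary; with three members the majority is still two, so the set tolerates one failure.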

It is impossible to read the data from the database if the primary member is unavailable, even with read_preference=ReadPreference.SECONDARY.

This is incorrect. Without a primary you cannot write to a replica set, but reads are still possible using non-primary read preferences such as secondary, primaryPreferred, secondaryPreferred, or nearest. The default read preference is primary, which provides strong consistency (versus eventual consistency when reading from a secondary). The primaryPreferred read preference reads from a primary if available, otherwise from a secondary.
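As a rough illustration of how the read preference modes behave when the primary is gone, here is a small Python model. It is a simplification under stated assumptions: `route_read` is a hypothetical helper, not a driver API, and it ignores latency, tags, and staleness settings that a real driver also considers.

```python
from typing import Optional


def route_read(mode: str, primary_up: bool, secondary_up: bool) -> Optional[str]:
    """Return which kind of member would serve a read, or None if none can."""
    if mode == "primary":
        return "primary" if primary_up else None
    if mode == "primaryPreferred":
        if primary_up:
            return "primary"
        return "secondary" if secondary_up else None
    if mode == "secondary":
        return "secondary" if secondary_up else None
    if mode == "secondaryPreferred":
        if secondary_up:
            return "secondary"
        return "primary" if primary_up else None
    if mode == "nearest":
        # A real driver picks by latency; here any available member qualifies.
        if primary_up:
            return "primary"
        return "secondary" if secondary_up else None
    raise ValueError(f"unknown read preference mode: {mode}")


# Two-member set whose primary has failed: only a secondary remains.
for mode in ("primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest"):
    print(mode, "->", route_read(mode, primary_up=False, secondary_up=True))
```

In this model only the default `primary` mode leaves the application unable to read; every other mode falls through to the surviving secondary.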

Recreating the primary member won't work anyway: it cannot be added to the existing replica set, since that set lacks a primary; nor can it run rs.initiate(), since that would create a different replica set with a different ID.

If you lose one of the members in a two-node replica set, you can force a reconfiguration on the surviving member (rs.reconfig() with the force option) to turn it into a single-node replica set, and then add new member(s). You only need to run rs.initiate() once in the lifetime of a replica set.
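The configuration change involved can be sketched as a plain Python transformation. This is only a sketch: `single_member_config` is a hypothetical helper, the hostnames are invented, and the dictionary only mimics the general shape of the document returned by rs.conf(). Applying the result would still be done on the surviving member, for example with rs.reconfig(cfg, {force: true}) in the mongo shell.

```python
import copy


def single_member_config(config: dict, surviving_host: str) -> dict:
    """Return a copy of the config that keeps only the surviving member."""
    new_config = copy.deepcopy(config)
    new_config["members"] = [
        m for m in new_config["members"] if m["host"] == surviving_host
    ]
    if not new_config["members"]:
        raise ValueError(f"{surviving_host} is not in the replica set config")
    # A new configuration needs a higher version than the one it replaces.
    new_config["version"] = config["version"] + 1
    return new_config


# Hypothetical two-member configuration; db1 has been lost.
current = {
    "_id": "rs0",
    "version": 3,
    "members": [
        {"_id": 0, "host": "db1.example.net:27017"},
        {"_id": 1, "host": "db2.example.net:27017"},
    ],
}

print(single_member_config(current, "db2.example.net:27017"))
```

The replica set name (`_id`) is left unchanged, which is why no second rs.initiate() is needed: the surviving member keeps the same replica set identity and new members join it.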

Therefore, is there a real-life situation where having only two members in a replica set would make any sense? Or is such a situation only theoretical, and one would expect to see at least three members in every replica set in practice?

In general, most replica set deployments have enough members to allow automatic failover (i.e. a minimum of three members). If high availability is not a concern (or you prefer manual intervention), a two-member replica set is not disallowed. However, a more typical way to allow failover is to add a third, voting-only member (an arbiter) to your replica set.

An arbiter is a lightweight voting-only member that provides the additional vote needed for a majority when one of the two data-bearing members in a three-member configuration is unavailable. The arbiter helps with availability but is a compromise compared with a third data-bearing member. When a three-member replica set with an arbiter is in a degraded state (i.e. one of your data-bearing members is unavailable) you will still have a primary but no longer have data redundancy or replication. In that degraded state your application also cannot rely on the majority write concern to ensure that data is committed to a majority of replica set members, because the arbiter holds no data and cannot acknowledge writes.
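The trade-off can be made concrete with a small Python model. It is a simplification, not MongoDB's actual logic: the `Member` class and both functions are hypothetical, every member is assumed to have one vote, and the majority write concern is modelled as needing acknowledgement from a majority-sized number of data-bearing members.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Member:
    name: str
    arbiter: bool = False
    up: bool = True


def has_primary(members: List[Member]) -> bool:
    """A primary needs a majority of votes and at least one data-bearing member up."""
    votes_up = sum(1 for m in members if m.up)
    data_up = any(m.up and not m.arbiter for m in members)
    return data_up and votes_up > len(members) // 2


def majority_write_possible(members: List[Member]) -> bool:
    """Majority writes need enough data-bearing members up to acknowledge them."""
    needed = len(members) // 2 + 1
    data_up = sum(1 for m in members if m.up and not m.arbiter)
    return has_primary(members) and data_up >= needed


# Two data-bearing members plus an arbiter, with one data-bearing member down.
with_arbiter = [Member("a"), Member("b", up=False), Member("arb", arbiter=True)]
# Three data-bearing members, with one down.
three_data = [Member("a"), Member("b", up=False), Member("c")]

for label, members in (("with arbiter", with_arbiter), ("three data-bearing", three_data)):
    print(label, has_primary(members), majority_write_possible(members))
```

Both degraded sets keep a primary, but only the set with three data-bearing members can still acknowledge majority writes.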

My personal recommendation would be to deploy three data-bearing members as a minimum for a production replica set; fewer data-bearing members could be considered for a development or non-critical deployment.