MongoDB – Mobile node in a MongoDB replica set

mongodb replication

Since one use case for a MongoDB replica set is as a base for analyses and learning cases (i.e. a read-only database), is it possible/advisable to have an instance in the replica set running on my laptop that syncs when connected to the net and then acts as an offline read-only copy when not?

To put a point on it,

  1. Will this replica set be stable when the laptop is put to sleep/opened/etc.?
  2. Will the intermittent availability of this member jeopardize an otherwise robust replicated database (say, with one primary, two arbiters and always-on secondaries)?

Other questions that touch upon this but don't answer it directly (AFAIK) are

In these cases setting the arbiter correctly seems to do the trick, so maybe my crazy idea would work?

Thanks in advance.

Best Answer

Since one use case for a MongoDB replica set is as a base for analyses and learning cases (i.e. a read-only database), is it possible/advisable to have an instance in the replica set running on my laptop that syncs when connected to the net and then acts as an offline read-only copy when not?

This configuration is possible, but generally not advisable.

1) Will this replica set be stable when the laptop is put to sleep/opened/etc.?

The normal rules for fault tolerance apply: as long as a majority of the configured voting members in your replica set remain available, it's fine for one or more members to go offline.
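As a rough illustration of that majority rule, the sketch below computes how many voting members must be reachable for a primary to be elected or retained. The function names and member counts are made up for the example; this is plain arithmetic, not MongoDB code.

```python
def voting_majority(voting_members: int) -> int:
    """Votes needed to elect (or retain) a primary."""
    return voting_members // 2 + 1


def can_have_primary(voting_members: int, reachable: int) -> bool:
    """True if enough voting members can see each other to form a majority."""
    return reachable >= voting_majority(voting_members)


# Hypothetical 3-member set where the laptop is one of the voting members:
# with the laptop asleep, 2 of 3 are reachable, which is still a majority.
print(can_have_primary(voting_members=3, reachable=2))  # True
# Lose one more member and no primary can be elected.
print(can_have_primary(voting_members=3, reachable=1))  # False
```

If the laptop is configured as a non-voting member instead, it simply drops out of `voting_members` and its availability no longer affects the count.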

The replica set node on your laptop will also still be usable as a read-only secondary (assuming you use a secondary read preference with your driver or rs.slaveOk() in the mongo shell; newer shell versions call this rs.secondaryOk()).

2) Will the intermittent availability of this member jeopardize an otherwise robust replicated database (say, with one primary, two arbiters and always-on secondaries)?

This is where the approach heads into "not advisable" territory. In theory you can get a workable configuration with a core of stable voting members in your replica set configuration plus your laptop configured as a hidden, non-voting secondary.
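For reference, a hidden, non-voting member is described by three fields in the replica set configuration document: `priority`, `hidden` and `votes`. The sketch below just builds such a member document as a Python dict; the helper name, the `_id` value and the hostname are hypothetical placeholders.

```python
import json


def hidden_nonvoting_member(member_id: int, host: str) -> dict:
    """Build a replica set member document for a hidden, non-voting secondary."""
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,   # can never become primary (required for hidden and for non-voting members)
        "hidden": True,  # not advertised to client applications
        "votes": 0,      # does not take part in elections
    }


# Hypothetical member id and hostname for the laptop.
laptop = hidden_nonvoting_member(3, "laptop.example.net:27017")
print(json.dumps(laptop, indent=2))
```

A document of this shape is what you would pass to rs.add() in the mongo shell, or append to the `members` array before calling rs.reconfig().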

However, there are a number of caveats to consider:

  • All replica set members need to be visible to each other via the same details as configured in your replica set configuration (hostnames and ports). You need to ensure your laptop knows what its expected hostname should be (otherwise it may not be able to find itself in the replica set config) and you need to consider how all the members will securely communicate with each other (e.g. via a VPN).
  • Replica set members will be noisy about failed communication with unavailable members so you may have a lot of log noise that could potentially mask real networking problems.
  • If you fudge hostname resolution (e.g. by editing /etc/hosts) it's very easy to introduce typos or forget to update the entries when changing or adding other members in the replica set.
  • If your laptop has incomplete or conflicting views of the network connectivity to other members, this may cause issues with elections/failover depending on your version of MongoDB and how you have configured your replica set.
  • Depending on the write activity for your replica set and how long your laptop remains offline, the laptop rejoining may cause noticeable load when catching up on old data that isn't in the current working set.
  • If your laptop remains offline for too long, it may require a full resync if there are no longer any oplog (operations log) entries in common with another member of the replica set.
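The last caveat comes down to a simple comparison: the laptop can only catch up if the last operation it applied is still present in the oplog of a member it can sync from. A minimal sketch of that check, with invented timestamps and a hypothetical 24-hour oplog window:

```python
from datetime import datetime, timedelta


def needs_full_resync(last_applied: datetime, oldest_oplog_entry: datetime) -> bool:
    """True if the member's last applied operation has already aged out of the
    sync source's oplog, so there is no common point to resume from."""
    return last_applied < oldest_oplog_entry


# Invented values for illustration only.
now = datetime(2024, 1, 10, 12, 0)
oldest_oplog_entry = now - timedelta(hours=24)  # sync source holds about 24h of oplog

offline_three_days = now - timedelta(days=3)
offline_one_hour = now - timedelta(hours=1)

print(needs_full_resync(offline_three_days, oldest_oplog_entry))  # True
print(needs_full_resync(offline_one_hour, oldest_oplog_entry))    # False
```

The real oplog window depends on the configured oplog size and the write volume; rs.printReplicationInfo() in the mongo shell reports the time range currently covered on a given member.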

Recommended approaches

1) Set up a hidden secondary in your hosted environment instead.

2) If you really want a locally replicated copy and your replica set environment is anything other than development, I would suggest an alternative approach that sidesteps some of the potential issues with networking and visibility: you could consider using Mongo Connector. This is not an officially supported tool, but what it provides is replication-like behaviour to one or more target systems. It also doesn't remove caveats about possibly adding additional load or requiring a resync. For more background, see: Mongo Connector - Usage with MongoDB.

an otherwise robust replicated database (say, with one primary, two arbiters and always on secondaries)?

FYI, you should only have zero or one arbiter. The arbiter is a member that only exists to potentially cast a tie-breaking vote where you have an even number of voting replica set members. The presence of one (or especially more than one!) arbiter already compromises robustness in the event of failover. An arbiter helps with voting majority but cannot contribute to write majority -- this leads to asymmetric failover scenarios. For example, in a Primary-Secondary-Arbiter configuration, if either data-bearing node fails you will still have a Primary but will no longer have active replication or the ability to acknowledge more than a w:1 write concern.
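The asymmetry can be made concrete with a small model. This sketch assumes every data-bearing member votes, and that the write majority is the smaller of the voting majority and the number of data-bearing voting members; the function and key names are invented for the example.

```python
def replica_set_status(data_total: int, arbiters_total: int,
                       data_up: int, arbiters_up: int) -> dict:
    """Model election majority vs. write majority for a replica set.

    Assumption: all data-bearing members are voting members.
    """
    voting_members = data_total + arbiters_total
    majority = voting_members // 2 + 1
    # Arbiters vote but hold no data, so they cannot acknowledge writes.
    write_majority = min(majority, data_total)

    has_primary = data_up >= 1 and (data_up + arbiters_up) >= majority
    return {
        "has_primary": has_primary,
        "can_ack_majority_writes": has_primary and data_up >= write_majority,
    }


# Primary-Secondary-Arbiter, everything healthy.
healthy = replica_set_status(data_total=2, arbiters_total=1, data_up=2, arbiters_up=1)
# Same set after one data-bearing node fails.
degraded = replica_set_status(data_total=2, arbiters_total=1, data_up=1, arbiters_up=1)

print(healthy)   # {'has_primary': True, 'can_ack_majority_writes': True}
print(degraded)  # {'has_primary': True, 'can_ack_majority_writes': False}
```

In the degraded case the arbiter's vote keeps a primary elected, but only one member can acknowledge writes, which is the w:1 limit described above.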