MongoDB – Deploy Across Cloud and On-Premises

aws cloud mongodb redhat

We are planning to deploy MongoDB as a stretch cluster with a primary and a secondary in DC 1, two secondaries in DC 2, and an arbiter in the cloud (DC 3). DC 1 and DC 2 are on-premises servers and DC 3 is in the cloud.

My questions are:

1) Do we need to run the same OS in AWS as on our on-premises servers, or can we use Amazon Linux?
Our on-premises servers run RHEL 7.2.

2) If not, will there be a performance impact from using different OSes across replica set members?

3) If yes, do we need to apply patch upgrades as frequently as we do on the on-premises servers?

4) How does failover work here with a write concern of majority?

5) What other trade-offs, if any, does this approach have?

Best Answer

1) Do we need to run the same OS in AWS as on our on-premises servers, or can we use Amazon Linux? Our on-premises servers run RHEL 7.2.

There is no strict requirement to run the same OS on all members of a replica set, but in general it is a good idea to keep the OS consistent so that configuration and performance tuning are similar across replica set members.

However, since your DC3 (cloud) instance appears to be an arbiter (which only votes in elections and holds no data), any OS differences should be irrelevant to performance.

2) If not, will there be a performance impact from using different OSes across replica set members?

Amazon Linux evolved from RHEL, so it isn't an entirely different OS (unlike, for example, Linux vs Windows). However, configuration and tuning defaults may differ between Linux distributions. I wouldn't expect dramatically different performance between Linux distros, but this is something you'd have to test with your own use case and workload.

3) If yes, do we need to apply patch upgrades as frequently as we do on the on-premises servers?

This is up to your own security policy, but I would expect patch upgrades to be applied on the same schedule for on-premises and cloud servers.

4) How does failover work here with a write concern of majority?

Assuming you have an equal number of instances in DC1 and DC2, an arbiter can be useful to ensure a primary can be elected if either DC is unreachable. However, an arbiter cannot acknowledge writes (since it is a voting-only node), so in a Primary-Secondary-Arbiter (PSA) configuration you will not be able to acknowledge majority writes if one of your data-bearing nodes is unavailable.
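The arithmetic behind this can be sketched in a few lines of Python. This is a simplified model rather than MongoDB's actual implementation: it assumes every member has one vote, that a primary needs a strict majority of all voting members, and that a majority write needs acknowledgement from the smaller of (a) that voting majority and (b) the number of data-bearing voting members. The function names are made up for illustration.

```python
def election_majority(voting_members: int) -> int:
    """Votes needed to elect a primary: a strict majority of all voters."""
    return voting_members // 2 + 1


def write_majority(data_bearing: int, arbiters: int) -> int:
    """Acknowledgements needed for a majority write (simplified model).

    Arbiters count toward the voting majority but can never acknowledge
    a write, so the requirement is capped at the data-bearing count.
    """
    return min(election_majority(data_bearing + arbiters), data_bearing)


def survives(data_bearing: int, arbiters: int,
             data_bearing_up: int, arbiters_up: int) -> dict:
    """Report what still works when only some members are reachable."""
    votes_up = data_bearing_up + arbiters_up
    primary = (data_bearing_up >= 1
               and votes_up >= election_majority(data_bearing + arbiters))
    return {
        "primary_electable": primary,
        "majority_writes": primary
        and data_bearing_up >= write_majority(data_bearing, arbiters),
    }


# PSA with one data-bearing node down: a primary exists, majority writes stall.
print(survives(data_bearing=2, arbiters=1, data_bearing_up=1, arbiters_up=1))

# 2 + 2 data-bearing members plus an arbiter, with one whole DC unreachable.
print(survives(data_bearing=4, arbiters=1, data_bearing_up=2, arbiters_up=1))

# PSS with one secondary down: both elections and majority writes continue.
print(survives(data_bearing=3, arbiters=0, data_bearing_up=2, arbiters_up=0))
```

Under these assumptions, a layout with two data-bearing members in each on-premises DC plus a cloud arbiter behaves like PSA when a whole DC is lost: the surviving DC and the arbiter hold 3 of 5 votes, so a primary can be elected, but only 2 of the 3 required data-bearing acknowledgements are available.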

I would strongly recommend using PSS (i.e. no arbiter, with every member data-bearing) to support consistent failover for both elections and majority write concern.
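As a rough illustration, an arbiter-free configuration document for a layout like the one in the question could be built as below. The hostnames, port, and replica set name are hypothetical placeholders, and making the cloud member data-bearing is the assumption being illustrated. The `build_replset_config` helper is made up for this sketch; only the `_id` and `members` fields follow MongoDB's replica set configuration format.

```python
def build_replset_config(set_name: str, hosts: list) -> dict:
    """Build a replica set config in which every member is data-bearing.

    No member sets arbiterOnly, so each one can both vote in elections
    and acknowledge writes.
    """
    return {
        "_id": set_name,
        "members": [{"_id": i, "host": host} for i, host in enumerate(hosts)],
    }


# Hypothetical hosts: two per on-premises DC, one in the cloud (DC 3).
HOSTS = [
    "dc1-a.example.internal:27017",
    "dc1-b.example.internal:27017",
    "dc2-a.example.internal:27017",
    "dc2-b.example.internal:27017",
    "dc3-cloud.example.internal:27017",
]

config = build_replset_config("rs0", HOSTS)
print(config)

# With pymongo installed and a mongod reachable, the document could be
# applied roughly like this (not executed here):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://dc1-a.example.internal:27017",
#                        directConnection=True)
#   client.admin.command("replSetInitiate", config)
```

With five data-bearing voters, losing either on-premises DC still leaves three members that can both elect a primary and acknowledge majority writes, at the cost of storing a full copy of the data in the cloud.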

5) What other trade-offs, if any, does this approach have?

As noted above, arbiters cannot acknowledge writes, so they are not recommended if you want fault tolerance with majority write concern. With a PSA configuration degraded to PsA (one secondary down), you still have write availability (since a primary can be maintained) but no longer have replication or data redundancy (since only one data-bearing node is writing data).