MongoDB replica set: primary and secondary node configuration fails when one slave is down

disk-space, google-cloud-platform, master-slave-replication, mongodb

In my environment, I created a MongoDB replica set configuration for high availability, so that if one server fails, the others should take over. My configuration has 3 DB servers and 1 arbiter. However,

my slave server failed due to disk utilization, and as a result my master node also stopped serving data to the APIs, which disrupted my whole system.

So, does this happen because of a configuration failure, or is it a MongoDB property? Is some configuration beyond replica set creation required to prevent this?

Currently, I am using MongoDB shell version v4.0.3.

I followed this doc; some of its commands were used in the replica set setup:

https://docs.mongodb.com/manual/tutorial/deploy-replica-set/
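
For reference, a minimal sketch of the kind of setup that tutorial describes, run from the mongo shell (all hostnames here are hypothetical):

    // Run on one member after starting each mongod with --replSet rs0.
    rs.initiate({
        _id: "rs0",
        members: [
            { _id: 0, host: "db1.example.net:27017" },
            { _id: 1, host: "db2.example.net:27017" },
            { _id: 2, host: "db3.example.net:27017" }
        ]
    })

    // Add the arbiter on its own host (hypothetical hostname).
    rs.addArb("arbiter.example.net:27017")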

Best Answer

In my environment, I created a MongoDB master-slave configuration for high availability, so that if one server fails, the others should take over. However, my slave server failed due to disk utilization, and as a result my master node also stopped serving data to the APIs, which disrupted my whole system.

As per the MongoDB documentation here, in addition to providing all the functionality of master-slave deployments, replica sets are also more robust for production use. Master-slave replication preceded replica sets and made it possible to have a large number of non-master (i.e. slave) nodes, as well as to restrict replicated operations to only a single database; however, master-slave replication provides less redundancy and does not automate failover.

Master instances store operations in an oplog, which is a capped collection. As a result, if a slave falls too far behind the state of the master, it cannot “catch up” and must re-sync from scratch. A slave may become out of sync with a master if:

  • The slave falls far behind the data updates available from that master.
  • The slave stops (i.e. shuts down) and restarts later, after the master has overwritten the relevant operations in its oplog.

When slaves are out of sync, replication stops. Administrators must intervene manually to restart replication, using the resync command. Alternatively, the --autoresync option allows a slave to restart replication automatically, after a ten second pause, when it falls out of sync with the master. With --autoresync specified, the slave will only attempt to re-sync once in any ten minute period.
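
For a legacy master-slave deployment, the manual resync mentioned above could be issued from the mongo shell on the out-of-sync slave (a sketch; the resync command is deprecated and has been removed in modern MongoDB versions):

    // Run on the out-of-sync slave; drops its data and performs a full resync.
    db.adminCommand({ resync: 1 })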

To prevent these situations you should specify a larger oplog when you start the master instance, by adding the --oplogSize option when starting mongod. If you do not specify --oplogSize, mongod will allocate 5% of available disk space on start up to the oplog, with a minimum of 1 GB for 64-bit machines and 50 MB for 32-bit machines.
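
Since the OP is actually running a 4.0 replica set rather than master-slave, a sketch of checking and resizing the oplog from the mongo shell may be more relevant (replSetResizeOplog requires MongoDB 3.6+ with the WiredTiger storage engine; 16384 MB is an arbitrary example size):

    // Check the current oplog size and the replication time window it covers.
    rs.printReplicationInfo()

    // Resize this member's oplog to 16 GB (size is given in megabytes).
    // Repeat on each member whose oplog you want to grow.
    db.adminCommand({ replSetResizeOplog: 1, size: 16384 })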

So, does this happen because of a configuration failure, or is it a MongoDB property? Is some configuration other than the master-slave creation required to prevent this?

As documented by MongoDB here, MongoDB 4.0 removes support for master-slave replication. Before you can upgrade to MongoDB 4.0, if your deployment uses master-slave replication, you must upgrade to a replica set.

Warning: Deprecated since version 3.2: MongoDB 3.2 deprecates the use of master-slave replication for components of sharded clusters.

Important: Replica sets replace master-slave replication for most use cases. If possible, use replica sets rather than master-slave replication for all new production deployments. This documentation remains to support legacy deployments and for archival purposes only.

After the OP modified the question:

As described here, MongoDB provides two options for performing an initial sync:

Restart the mongod with an empty data directory and let MongoDB’s normal initial syncing feature restore the data. This is the simpler option, but it may take longer to replace the data.

See Automatically Sync a Member.

Restart the machine with a copy of a recent data directory from another member in the replica set. This procedure can replace the data more quickly but requires more manual steps.

See Sync by Copying Data Files from Another Member.
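
Whichever option is used, the progress of the initial sync can be watched from the mongo shell on the primary; the re-syncing member should move through STARTUP2 and RECOVERING to SECONDARY (a minimal sketch):

    // Print each member's name and replication state.
    rs.status().members.forEach(function (m) {
        print(m.name + " : " + m.stateStr);
    });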

WARNING: Avoid reconfiguring replica sets that contain members of different MongoDB versions as validation rules may differ across MongoDB versions.

rs.reconfig() reconfigures an existing replica set, overwriting the existing replica set configuration. To run the method, you must connect to the primary of the replica set.

For example:

rs.reconfig(configuration, force)

To reconfigure an existing replica set, first retrieve the current configuration with rs.conf(), modify the configuration document as needed, and then pass the modified document to rs.reconfig().
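
For instance, a minimal sketch of that workflow in the mongo shell (the member index and priority value here are hypothetical; inspect cfg.members first):

    // Retrieve the current configuration from the primary.
    cfg = rs.conf()

    // Example modification: stop the third member from being elected primary.
    cfg.members[2].priority = 0

    // Apply the modified configuration.
    rs.reconfig(cfg)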

rs.reconfig() provides a wrapper around the replSetReconfig command.

The force parameter allows a reconfiguration command to be issued to a non-primary node.
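
When a majority of members is unreachable and no primary can be elected, a force reconfiguration can be issued from a surviving secondary, for example to drop the dead members (a last-resort sketch; forcing a reconfiguration can cause rollbacks, so use it with caution):

    // On a surviving secondary: keep only the reachable members
    // (which members to keep depends on your deployment).
    cfg = rs.conf()
    cfg.members = [cfg.members[0], cfg.members[1]]
    rs.reconfig(cfg, { force: true })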

For further reference, see here, here and here.