MongoDB – Replica Index Filled Hard Drive Stuck in Fatal

mongo-repair mongodb

I have a three-server replica set deployed with Google's 1 Click Deploy functionality. The problem is that I created an index on a particularly large collection (100,000,000 documents), and it consumed the whole hard drive, causing Mongo to crash and stay crashed.

I have deleted a 30GB log file and freed up some other space, then tried to restart the mongo process on both the primary and secondary servers, but the secondary keeps falling into FATAL even after I get it to roll back.

I've run rs.reconfig(), which seemed to start the rollback process, but the node just fell back into FATAL. Now it won't budge unless I restart the process, and then it goes in circles.

To make matters worse, the data (due to connection/training issues) sits only on this busted instance and has not been replicated to the other members.

I've also tried running rs.slaveOk() and removing the offending index, but every query is refused, and while the node is in ROLLBACK I can't query anything because of the lock.

Is there any way I can remove the offending index, repair and resync to fix this while it's in this state?

Thanks.

Best Answer

Since your data exists on only one node, treat that node as the Primary from now on and the other two as the secondaries.

1) Start the Primary as a standalone (that is, without the replica set option) and drop the index.

2) Take a backup of your data, just in case.

3) Stop the secondaries and delete the contents of their data directories.

4) Start the Primary as a member of the replica set.

5) Start the first secondary. This triggers an initial sync of the data. Once that member has joined the set as a secondary, start the other secondary.
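For reference, the steps above can be sketched as a small Python helper that only builds and prints the commands so you can review them before running anything by hand. This is a rough sketch, not a tested procedure: the data path, ports, replica set name, database, collection, index name and backup directory are all hypothetical placeholders, and it assumes the legacy `mongo` shell (newer installs ship `mongosh` instead). A 1 Click Deploy instance may start `mongod` from a config file or service, in which case you would adjust that file rather than pass flags directly.

```python
import shlex
from typing import List, NamedTuple


class Step(NamedTuple):
    description: str
    argv: List[str]  # an empty list marks a manual action with no single command


def recovery_plan(dbpath: str, standalone_port: int, repl_port: int, repl_set: str,
                  database: str, collection: str, index_name: str,
                  backup_dir: str) -> List[Step]:
    """Build (but never execute) the commands for the node that holds the data."""
    drop_js = f'db.getCollection("{collection}").dropIndex("{index_name}")'
    return [
        # Standalone means mongod is started WITHOUT --replSet.
        Step("1a. Start the data-holding node as a standalone",
             ["mongod", "--dbpath", dbpath, "--port", str(standalone_port)]),
        Step("1b. Drop the oversized index",
             ["mongo", "--port", str(standalone_port), database, "--eval", drop_js]),
        Step("2. Back up the data",
             ["mongodump", "--port", str(standalone_port), "--out", backup_dir]),
        Step("3. Stop both secondaries and empty their data directories", []),
        Step("4. Restart this node as a replica set member",
             ["mongod", "--dbpath", dbpath, "--port", str(repl_port),
              "--replSet", repl_set]),
        Step("5. Start one secondary, wait for SECONDARY state, then start the other", []),
    ]


def format_plan(plan: List[Step]) -> str:
    """Render the plan as text, shell-quoting each command."""
    lines = []
    for step in plan:
        command = shlex.join(step.argv) if step.argv else "(manual step)"
        lines.append(f"{step.description}\n    {command}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Every value below is a made-up placeholder; substitute your own.
    print(format_plan(recovery_plan(
        dbpath="/var/lib/mongodb",
        standalone_port=27218,
        repl_port=27017,
        repl_set="rs0",
        database="mydb",
        collection="events",
        index_name="big_field_1",
        backup_dir="/mnt/backup/dump",  # should live on a disk with free space
    )))
```

Running the standalone on a different port from the one the replica set uses is a deliberate choice here, so the other members and any clients cannot reach the node while you are dropping the index.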