Mongodb – Mongo repairDatabase procedure

I need to run a repairDatabase in mongo in order to reclaim some OS disk space. Can anyone provide feedback on the following procedure? Let me know if it's correct, if I'm missing anything, if the steps are in the right order, etc.

The setup is a primary/secondary replica set on separate servers. Both are nearly out of disk space. My plan is to repair the database on one server, wait for it to complete, and then repair the other. Since I am out of disk space, I will use external block storage from my cloud provider as the repair path. Mongo is currently started via sudo service mongodb start. To complete this it seems that I'll run the following:

  • On the primary I step down using rs.stepDown()
  • I shut down mongo using rs.shutdownServer(). Is this correct? It seems that I shouldn't run the next step while another Mongo instance is already running.
  • On the primary I run mongod --config /etc/mongodb.conf --repair --repairpath PATH_TO_BLOCK_STORAGE
  • When complete, do I need to copy the data in the repair path over to the original (/var/lib/mongodb)?
  • Once back up, I restart Mongo using sudo service mongodb start
  • I then wait for the oplog to catch up and for the two mongod instances to be in sync.
  • Once the primary and secondary are in sync, I go onto mongodb-perf-02 and run rs.stepDown() so that mongodb-perf-01 is the primary again.
  • Once complete I then repeat this on the secondary.
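
Before attaching block storage for the repair path, it may help to estimate how much space the repair needs. MongoDB's documentation for repairDatabase (on the older MMAPv1 storage engine) says it requires free space equal to the current data set plus 2 gigabytes. The sketch below applies that rule to db.stats()-shaped documents. The helper name estimateRepairBytes is hypothetical, and treating "data set" as dataSize + indexSize is my assumption, so read the result as a rough lower bound rather than an exact figure.

```javascript
// Rough estimate of free space needed on the repair path.
// Assumption: "current data set" is approximated as dataSize + indexSize
// summed over all databases; the 2 GiB margin follows the documented rule.
var TWO_GIB = 2 * 1024 * 1024 * 1024;

function estimateRepairBytes(statsList) {
  // statsList: array of objects shaped like db.stats() output
  var dataBytes = statsList.reduce(function (sum, s) {
    return sum + s.dataSize + s.indexSize;
  }, 0);
  return dataBytes + TWO_GIB;
}
```

In the mongo shell, the input could be gathered with something like db.getMongo().getDBNames().map(function (n) { return db.getSiblingDB(n).stats(); }) and the result compared against the free space reported by df for the mounted volume.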

Best Answer

The solution to this issue was as follows: Starting with the secondary:

Inside of Mongo I ran the following:

db.shutdownServer()
exit

On the command line:

rm -rf /var/lib/mongodb/*
sudo service mongodb start

It then takes a couple of hours to resync. Once resynced I did the following on the primary:

rs.status() //To make sure that everything was synced
rs.stepDown(9200)
db.shutdownServer()
exit

On the command line I ran:

rm -rf /var/lib/mongodb/*
sudo service mongodb start

Then in Mongo I ran:

rs.status() //To make sure that everything was synced
rs.freeze(0) //To remove the rs.stepDown(9200) lock

On the secondary I ran:

rs.stepDown()

Note: If possible, back up all of your data first.
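
The "make sure everything was synced" checks above were done by reading rs.status() by eye. As a minimal sketch, the same check can be written as a small helper that takes an rs.status()-shaped document and reports per-member lag behind the primary. The field names (members, name, stateStr, optimeDate) are the ones rs.status() returns; the helper name replicationSummary and the lag threshold are hypothetical choices of mine, not part of the original procedure.

```javascript
// Summarize replica set health from an rs.status()-shaped document.
// A member counts as synced when it is PRIMARY or SECONDARY and its
// optimeDate is within maxLagSeconds of the primary's. Arbiters are skipped.
function replicationSummary(status, maxLagSeconds) {
  var primary = status.members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  })[0];
  if (!primary) {
    return { ok: false, reason: "no primary", lags: {} };
  }
  var ok = true;
  var lags = {};
  status.members.forEach(function (m) {
    if (m.stateStr === "ARBITER") {
      return; // arbiters hold no data, so there is nothing to sync
    }
    if (m.stateStr !== "PRIMARY" && m.stateStr !== "SECONDARY") {
      // e.g. STARTUP2 or RECOVERING while an initial sync is still running
      ok = false;
      lags[m.name] = null;
      return;
    }
    // Subtracting two Date values yields milliseconds
    var lag = (primary.optimeDate - m.optimeDate) / 1000;
    lags[m.name] = lag;
    if (lag > maxLagSeconds) {
      ok = false;
    }
  });
  return { ok: ok, reason: ok ? "in sync" : "member behind or not ready", lags: lags };
}
```

In the mongo shell this could be called as printjson(replicationSummary(rs.status(), 10)) before running rs.stepDown() or wiping the next member's data directory. The 10-second threshold is only an example value.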