Restoring config servers, particularly after some sort of catastrophic event, is tricky but not impossible. Before we go any further, a big bold caveat:
BACK UP EVERYTHING
That means taking a backup of all three config servers. I am going to give you some advice, and it is generally correct, but please, please take a backup of every current config server instance before you overwrite or replace anything.
As a quick explanation, config servers are not configured as a replica set - each config server instance is supposed to be identical (at least for all the collections that matter) to the others. Hence, any healthy config server can be used to replace a non-healthy config server and you can then follow the tutorial you mentioned to get back to a good config.
The key to recovery is to identify the healthy config server and then use that to replace the others - you then end up with 3 identical config servers.
There is more than one way to do this; the approaches basically fall into three categories:
1) Use the error message
The error message that is printed out actually lets you know which config server it believes is healthy, though that is not obvious from the messaging. Here's how to read it generically:
ERROR: config servers not in sync! config servers <healthy-server> and <out-of-sync-server> differ
Basically the first one in the list is the healthy one; in your case that would be mongocfg1.testing.com:27000. That is our first candidate for a healthy config database.
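As a rough illustration of reading that message, here is a minimal Python sketch that pulls the two hosts out of a log line. It assumes the line has exactly the generic shape quoted above (the wording may differ between MongoDB versions), and parse_not_in_sync is a hypothetical helper name, not part of any MongoDB tooling.

```python
import re

# Pattern follows the generic message shape quoted above; treat the exact
# wording as an assumption that may not hold for every MongoDB version.
_NOT_IN_SYNC = re.compile(
    r"config servers not in sync! config servers (\S+) and (\S+) differ"
)


def parse_not_in_sync(message):
    """Return (first_listed, second_listed) hosts, or None if no match.

    Per the reading above, the first listed host is the candidate
    healthy config server.
    """
    match = _NOT_IN_SYNC.search(message)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

The first element of the returned pair is only a candidate; it is still worth confirming it with the other two methods.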
2) Use dbhash to compare all three and pick the ones that agree
On each config server, switch to the config database by running use config, then run db.runCommand("dbhash") and compare the hashes for the collections below:
- chunks
- databases
- settings
- shards
- version
You are looking for two servers that agree, and using that as the basis to determine that the version of the config database on those hosts is basically trustworthy and should be used to seed the rest.
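If you save the dbhash output from each server, the comparison can be sketched in Python as below. This is only an illustration: it assumes each result has a "collections" sub-document mapping collection names to hashes, and the function name and the host names in the usage note are hypothetical.

```python
from collections import defaultdict

# The collections singled out above as the ones that matter.
KEY_COLLECTIONS = ("chunks", "databases", "settings", "shards", "version")


def group_by_fingerprint(dbhash_by_host):
    """Group hosts whose hashes for KEY_COLLECTIONS all match.

    dbhash_by_host maps a host name to the document returned by dbhash;
    only its "collections" sub-document (name -> hash) is read.
    Returns a list of host groups, largest group first.
    """
    groups = defaultdict(list)
    for host, result in dbhash_by_host.items():
        hashes = result.get("collections", {})
        # A missing collection shows up as None in the fingerprint.
        fingerprint = tuple(hashes.get(name) for name in KEY_COLLECTIONS)
        groups[fingerprint].append(host)
    # A leading group of two or more hosts is the "agreeing" set.
    return sorted(groups.values(), key=len, reverse=True)
```

Passing in results keyed by, say, cfg1, cfg2 and cfg3 gives back the agreeing hosts as the first group. If every group has a single host, no two servers agree and you are down to manual inspection.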
3) Manually inspect the collections in the config database
Finally, take a look at the config database, and pay attention to the collections listed in the second option above. This is a straight judgement call based on your familiarity with your data.
Hopefully all three methods point you at the same host (or hosts). That config server should be used to seed the other two (after you have taken backups so you can go back). That is basically your best bet. Should that fail, then you may want to try one of the other versions (from the backups) - always making sure that when you start them, all three are identical.
Finally, always ensure that all mongos processes are using the same config server string, and that all 3 servers are always listed in the same order on every process - not doing so across all mongos processes can lead to (very) odd results.
I tested this: first I installed MongoDB 2.4.6 with brew, and then used launchctl to load and unload it. In my testing, unloading sends a SIGTERM to the mongod process, which then shuts down as expected. Here are the commands I used as well as the logs for the mongod process:
Commands:
launchctl load -w /usr/local/Cellar/mongodb/2.4.6/homebrew.mxcl.mongodb.plist
launchctl unload -w /usr/local/Cellar/mongodb/2.4.6/homebrew.mxcl.mongodb.plist
Logs:
tail -f /usr/local/var/log/mongodb/mongo.log
Tue Oct 22 17:33:32.774 [initandlisten] MongoDB starting : pid=13192 port=27017 dbpath=/usr/local/var/mongodb 64-bit host=adamc-mbp.local
Tue Oct 22 17:33:32.774 [initandlisten] db version v2.4.6
Tue Oct 22 17:33:32.774 [initandlisten] git version: nogitversion
Tue Oct 22 17:33:32.774 [initandlisten] build info: Darwin minimountain.local 12.4.0 Darwin Kernel Version 12.4.0: Wed May 1 17:57:12 PDT 2013; root:xnu-2050.24.15~1/RELEASE_X86_64 x86_64 BOOST_LIB_VERSION=1_49
Tue Oct 22 17:33:32.774 [initandlisten] allocator: tcmalloc
Tue Oct 22 17:33:32.774 [initandlisten] options: { bind_ip: "127.0.0.1", command: [ "run" ], config: "/usr/local/etc/mongod.conf", dbpath: "/usr/local/var/mongodb", logappend: "true", logpath: "/usr/local/var/log/mongodb/mongo.log" }
Tue Oct 22 17:33:32.775 [initandlisten] journal dir=/usr/local/var/mongodb/journal
Tue Oct 22 17:33:32.775 [initandlisten] recover : no journal files present, no recovery needed
Tue Oct 22 17:33:32.806 [websvr] admin web console waiting for connections on port 28017
Tue Oct 22 17:33:32.806 [initandlisten] waiting for connections on port 27017
Tue Oct 22 17:34:21.682 [signalProcessingThread] got signal 15 (Terminated: 15), will terminate after current cmd ends
Tue Oct 22 17:34:21.682 [signalProcessingThread] now exiting
Tue Oct 22 17:34:21.682 dbexit:
Tue Oct 22 17:34:21.682 [signalProcessingThread] shutdown: going to close listening sockets...
Tue Oct 22 17:34:21.682 [signalProcessingThread] closing listening socket: 9
Tue Oct 22 17:34:21.682 [signalProcessingThread] closing listening socket: 10
Tue Oct 22 17:34:21.682 [signalProcessingThread] closing listening socket: 11
Tue Oct 22 17:34:21.682 [signalProcessingThread] removing socket file: /tmp/mongodb-27017.sock
Tue Oct 22 17:34:21.682 [signalProcessingThread] shutdown: going to flush diaglog...
Tue Oct 22 17:34:21.682 [signalProcessingThread] shutdown: going to close sockets...
Tue Oct 22 17:34:21.682 [signalProcessingThread] shutdown: waiting for fs preallocator...
Tue Oct 22 17:34:21.682 [signalProcessingThread] shutdown: lock for final commit...
Tue Oct 22 17:34:21.683 [signalProcessingThread] shutdown: final commit...
Tue Oct 22 17:34:21.692 [signalProcessingThread] shutdown: closing all files...
Tue Oct 22 17:34:21.692 [signalProcessingThread] closeAllFiles() finished
Tue Oct 22 17:34:21.692 [signalProcessingThread] journalCleanup...
Tue Oct 22 17:34:21.692 [signalProcessingThread] removeJournalFiles
Tue Oct 22 17:34:21.692 [signalProcessingThread] shutdown: removing fs lock...
Tue Oct 22 17:34:21.693 dbexit: really exiting now
I did this several times to confirm the behavior. In Chrome at least the status page no longer responds and I receive an error (as expected) once it has been shut down.
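If you want to check your own log for the same two markers, a small Python sketch follows. It assumes the 2.4-style log wording shown in the log above ("got signal 15" and "dbexit: really exiting now"); shutdown_summary is a hypothetical helper name.

```python
def shutdown_summary(log_lines):
    """Report whether a mongod log shows SIGTERM and a completed exit.

    Returns (got_sigterm, exited_cleanly). The marker strings are taken
    from the 2.4.6 log above and may differ in other versions.
    """
    lines = list(log_lines)
    got_sigterm = any("got signal 15" in line for line in lines)
    exited_cleanly = any("dbexit: really exiting now" in line for line in lines)
    return got_sigterm, exited_cleanly
```

You could feed it the lines of /usr/local/var/log/mongodb/mongo.log; a result of (True, False) would suggest the signal arrived but shutdown had not finished when the log was read.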
The only difference between what I am doing and what you have posted is that I am not using sudo (in fact it refuses to load or unload due to dubious ownership of the file). So, I changed the ownership of the plist file to root and tried sudo, with the same results.
The only way I was able to recreate a failure to unload was as follows:
- Start with sudo launchctl (root is the owner of the plist file)
- Change ownership of the plist file back to my regular user
- Try to unload without sudo
This fails with an error however:
launchctl: Error unloading: homebrew.mxcl.mongodb
Note: Subsequently changing the ownership back to the regular user made the unload successful
Similarly, this also produces the same error:
- Start without sudo (regular user owns the plist file)
- Change ownership of the plist file to root
- Try to unload with sudo
I was unable to recreate the silent failure you seem to be having with any of the various combinations I tried.
Some information gathering tips which may give you a clue:
- When you launch with launchctl, what does the output of launchctl list | grep mongodb say? (It should list something like 13340 - homebrew.mxcl.mongodb.)
- If you run this same command after you run unload (without error), it should show the exit status in that middle column (-15)
- Sometimes it can take a while for MongoDB to exit - so tail the log (see example above), see if the TERM signal is being received
- Why is it you are using sudo? Have you installed brew as root? If so this might be at the core of the issue here - generally it is not recommended to run MongoDB as root.
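To make the first two tips concrete, here is a hedged Python sketch that reads captured launchctl list output and picks out the PID and status columns for the brew job. It assumes three whitespace-separated columns (PID, status, label), matching the example line above; find_job is a hypothetical helper name.

```python
def find_job(launchctl_list_output, label="homebrew.mxcl.mongodb"):
    """Find a job in `launchctl list` output by its label.

    Assumes three whitespace-separated columns: PID, status, label.
    A "-" in either numeric column is returned as None.
    Returns a dict, or None if the label is not listed.
    """
    for line in launchctl_list_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[2] == label:
            pid = None if fields[0] == "-" else int(fields[0])
            status = None if fields[1] == "-" else int(fields[1])
            return {"pid": pid, "status": status, "label": label}
    return None
```

A running job should come back with a PID and no status; after a successful unload, the tips above suggest a status of -15 and no PID.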
First, assuming you are actually specifying where to read the file from, make sure that you have permission to read that file as the current user (cat /usr/local/Cellar/mongodb/2.4.6/mongod.conf - or use less/vi/editor of choice). Assuming that works (and if it does not, adjust your permissions), the next thing you need to do is make sure you are actually pointing at the correct file.
However, if you are not specifying where to read the file from, then by default, if you just run mongod using the brew installation, it will attempt to read from:
I verified this by installing 2.4.6 with brew and then checking the logs when it starts up:
You can either modify that file (/usr/local/etc/mongod.conf) to look the way you want in your example, and make sure you have permissions to get to it, or you can run this instead to specify the original file: