Mongodb – how to upgrade the config servers to use WiredTiger in mongodb 3.0.0

mongodb

I am reading the document upgrade config servers to use WiredTiger. In the section of "change config server to use WiredTiger", the second step is "stop the last config server listed in the mongos' configDB setting". But the 5th step is to restart the second config server with WiredTiger. I am confused by the sequence of stopping and restarting. Is there problem on this document?

Another question, why the data in config servers should be exported and uploaded manually? Why the data in replicate set should not be exported and uploaded manually? Instead, the data in replicate set can be imported by "Resync a Member of a Replica Set" please see official doc

Best Answer

The current wording in the documented steps may be somewhat confusing given the relative references to first/last/second config servers.

I've added some context for the documented steps below, but you should consider the manual the definitive source. It's also worth noting that you do not have to upgrade the config servers to use WiredTiger (even if your shards are using WiredTiger). Config servers typically have a small data set and are not under high write load.

Annotated version of the steps from the MongoDB 3.0 Upgrade Guide

Disable the balancer.

Disabling the balancer ensures any active migrations have completed.
Stop the last config server listed in your mongos' configDB setting (will call that config3 for the purpose of these steps):

At this stage you should have:
- config1 (running mmap)
- config2 (running mmap)
- config3 (stopped)
Stopping one of the config servers ensures there are no changes to the metadata in the cluster (chunk splits or migrations cannot be committed without all three config servers available).
Use mongodump to export the config database from config2

After running the mongodump you should have a dump directory with bson files.
Create a new data directory on config2.

The storage format for WiredTiger data is different from the existing mmap data, and cannot use the same dbpath as mmap.
Restart the config2 server with the WiredTiger and appropriate storage options:

mongod --storageEngine wiredTiger --dbpath <newWiredTigerDBPath> ...

At this stage you should have:
- config1 (running mmap)
- config2 (running WiredTiger with no data)
- config3 (stopped)
Note that config2 doesn't have any data yet, because it has just been started up with the new WiredTiger dbpath.
Use mongorestore to load the config database backup you created in step 3.

At this stage you should have:
- config1 (running mmap)
- config2 (running WiredTiger)
- config3 (stopped)
Shut down config2:

At this stage you should have:
- config1 (running mmap)
- config2 (stopped)
- config3 (stopped)
config2 is stopped at this point to ensure no metadata changes can happen when we start config3 up in the next step.
Restart config3.

At this stage you should have:
- config1 (running mmap)
- config2 (stopped)
- config3 (running mmap)

(steps 9-15) These steps are just repeating the same mongodump and mongorestore for each config server. There's a bit of shuffling to ensure you always have at least one config server available, and do not have all three up while you are still migrating data.

Upload the data into config1

At this stage you should have:
- config1 (running WiredTiger)
- config2 (stopped)
- config3 (running WiredTiger)
Start config2

With all three config servers are available & upgraded, changes to the sharded cluster metadata can now resume.
Re-enable the balancer so normal balancing activity & chunk migration can resume.

Related Solutions

Mongodb config servers not in sync

Restoring config servers, particularly if you have had some sort of catastrophic event is tricky, but not impossible. But, before we go any further, a big bold caveat:

BACK UP EVERYTHING

That means taking a back up of all three config servers. I am going to give you some advice, and it is generally correct, but please, please take a back up of every current config server instance before you overwrite/replace anything

As a quick explanation, config servers are not configured as a replica set - each config server instance is supposed to be identical (at least for all the collections that matter) to the others. Hence, any healthy config server can be used to replace a non-healthy config server and you can then follow the tutorial you mentioned to get back to a good config.

The key to recovery is to identify the healthy config server and then use that to replace the others - you then end up with 3 identical config servers.

There is more than one way to do this, they basically fall into three categories:

1) Use the error message

The error message that is printed out actually lets you know which config server it believes is health, though that is not obvious from the messaging. Here's how to read it generically:

ERROR: config servers not in sync! config servers <healthy-server> and <out-of-sync-server> differ

Basically the first one in the list is the healthy one, in your case that would be mongocfg1.testing.com:27000. That is our first candidate for a healthy config database.

2) Use dbhash to compare all three and pick the ones that agree

On each config server switch to the config database using use config, run db.runCommand("dbhash") and compare the hashes for the collections below:

chunks
databases
settings
shards
version

You are looking for two servers that agree, and using that as the basis to determine that the version of the config database on those hosts is basically trustworthy and should be used to seed the rest.

3. Manually inspect the collections in the config database

Finally, take a look at the config database, and pay attention to the collections listed in the second option above. This is a straight judgement call based on your familiarity with your data.

Hopefully all three methods point you at the same host (or hosts). That config server should be used to seed the other two (after you have taken backups so you can go back). That is basically your best bet. Should that fail, then you may want to try one of the other versions (from the backups) - always making sure that when you start them, all three are identical.

Finally, always ensure that all mongos processes are using the same config server string, and that all 3 servers are always listed in the same order on every process - not doing so across all mongos processes can lead to (very) odd results.

Mongodb – When upgrading config servers to use WireTiger, the running config server with wiredtiger does not authenticate the existing pass and user

It looks like the current MongoDB 3.0 upgrade instructions are missing mention of two important parameters for backing up and restoring users and roles:

mongodump --dumpDbUsersAndRoles (see also: Required Access to Backup User Data).
mongorestore --restoreDbUsersAndRoles (see also: Required Access to Restore User Data)

I can think of several approaches to fix:

If you don't have many user accounts on the config servers, recreate the administrator & user accounts. This isn't ideal, but is probably the fastest approach.
Export the users from your mmap database. This is more involved, but saves you recreating the users & roles. I've described steps for this below.
Redo the config server migration with the user & role information included. I expect this is the least desirable option.

Exporting the users

Assuming you have already upgraded all of your config servers to WiredTiger, here are some steps to add the user information:

Disable the balancer
Stop the last config server listed in your mongos' configDB setting (will call that config3 for the purpose of these steps). This will ensure your sharded cluster metadata remains read-only for the following steps.
Re-start config2 using the mmap data directory

At this stage you should have:
- config1 (running WiredTiger)
- config2 (running mmap with user/role data)
- config3 (stopped)
Export the data from config2:

mongodump --db config --dumpDbUsersAndRoles --username .. --password ..

Add any other parameters needed, eg --authenticationDatabase .. if you need to auth against another database.
If you have users in the admin database on your config server, you will also want to dump that as well.
(optional) Remove files from your dump except for the user/role information. If you are certain nothing has changed since you did the original migration from mmap to WiredTiger you could skip this step, however it would be safer to not overwrite any existing data.

Preview the files to remove:

find ./dump -type f -not -name "\$admin.system*"

WARNING: removing files, make sure you have previewed to confirm:

find ./dump -type f -not -name "\$admin.system*" | xargs rm
Re-start config2 using the wiredTiger storage engine
Run: mongorestore --db config --restoreDbUsersAndRoles dump/config/

You should see messages about restoring users & roles, for example:

2015-03-18T02:41:34.887+1100 restoring users from dump/config/$admin.system.users.bson

2015-03-18T02:41:34.887+1100 restoring roles from dump/config/$admin.system.roles.bson
Login to config2 and confirm the users are correctly setup (i.e. auth with admin account, use db.getUsers() to check).

At this stage you should have:
- config1 (running WiredTiger)
- config2 (running WiredTiger with user/role data)
- config3 (stopped)
Copy the dump directory to config1 and repeat the mongorestore step.
Shutdown config2 (to keep the sharded cluster metadata readonly for the next step).

At this stage you should have:
- config1 (running WiredTiger with user/role data)
- config2 (stopped)
- config3 (stopped)
Start config3. Copy the dump directory to config3, and repeat the mongorestore step.

At this stage you should have:
- config1 (running WiredTiger with user/role data)
- config2 (stopped)
- config3 (running WiredTiger with user/role data)
Start config2. At this point all config servers should be online with the user information.
Re-enable the balancer so normal balancing activity & chunk migration can resume.