MongoDB: Migrating away from RocksDB to WiredTiger

mongodb percona wiredtiger

We have been running RocksDB as our storage engine for some time and are now trying to migrate to WiredTiger. Some of our databases are quite large, around 4 to 12 TB of data. Following the process described in the docs, we added a new node with WiredTiger and tried letting it replicate from scratch.

With this much data, replication takes a VERY long time, and on several occasions the WiredTiger node decided to change the node it was replicating from, only to drop ALL its data and start from scratch again. We succeeded in completing the replication only once, and even then the node ended up way behind the oplog.

Again, with this amount of data it is prohibitive to keep an oplog big enough to hold weeks of transactions, and the process itself is flimsy: single-threaded, slow, and prone to failure.

So my questions are the following:

  1. Is there a better way to proceed with this migration?

  2. Is there a way to speed up replication (e.g. multithreaded replication)?

  3. Is there a way to tell the new WiredTiger node to stop dropping all data in case of mishaps?

We are working with 3- and 5-node replica sets of Percona MongoDB version 3.4.13 and are trying to move to Open Source MongoDB 3.4.13 (with the idea of upgrading to 4.x once we are on WiredTiger, and dropping RocksDB and Percona entirely).

Best Answer

If replication can't keep up with the primary and it's not possible to extend the oplog, I'd try to shard the deployment and...

  • let smaller shards replicate faster

    or

  • drain the original shard and then remove it from the deployment - this solves your storage engine problem, but leaves the replication as laggy as before
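
To get a rough feel for how small the shards would need to be, here is a back-of-the-envelope sketch in Python. It is only arithmetic, not anything MongoDB provides: the function names, the 50 MB/s copy throughput, the 24-hour oplog window and the 0.5 safety factor are all assumptions for illustration, and it assumes data spreads evenly across shards. Measure your own throughput and oplog window before trusting any of the numbers.

```python
import math


def initial_sync_hours(data_tb, throughput_mb_s):
    """Hours needed to copy data_tb terabytes at a sustained throughput (MB/s)."""
    data_mb = data_tb * 1024 * 1024
    return data_mb / throughput_mb_s / 3600


def shards_needed(data_tb, throughput_mb_s, oplog_window_hours, safety=0.5):
    """Smallest shard count for which one shard's initial sync fits inside
    safety * oplog_window_hours, assuming an even data distribution."""
    total_hours = initial_sync_hours(data_tb, throughput_mb_s)
    budget_hours = oplog_window_hours * safety
    return max(1, math.ceil(total_hours / budget_hours))


if __name__ == "__main__":
    # Hypothetical inputs: 12 TB of data, 50 MB/s copy rate, 24 h oplog window.
    hours = initial_sync_hours(12, 50)
    print(f"single-node initial sync: about {hours:.1f} hours")
    print(f"shards needed: {shards_needed(12, 50, 24)}")
```

With those assumed inputs a single 12 TB node would need roughly 70 hours to sync, which is well outside a 24-hour oplog window, while six evenly loaded shards would each fit in about half of it. The sketch ignores that each shard also receives only part of the write load, which would stretch its oplog window further.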