MongoDB – Adding WiredTiger nodes to a non-WiredTiger replica set

mongodb, mongodb-3.0, replication

Situation

I have 3 nodes in a replica set (P+S+S), MongoDB version 2.6.9. I would like to migrate all of the nodes to 3.0.4 with WiredTiger.

Since I need to do this without downtime, I use the following workflow:

  1. setup a new server with 3.0.4
  2. add the node to the replica set
  3. remove one of the old nodes
  4. go to step 1
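The replica set reconfiguration steps above can be sketched in the mongo shell (hostnames here are placeholders):

```javascript
// On the primary: add the freshly installed 3.0.4 node to the set.
rs.add("newnode1.example.com:27017")

// Wait until the new member reaches SECONDARY state before continuing:
rs.status()

// Then retire one of the old 2.6.9 members:
rs.remove("oldnode1.example.com:27017")
```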

Problem is that with my data set, adding a node with the rs.add() tactic just doesn't work. The process takes forever and restarts the initial synchronization again and again. To work around this, I normally do a cold backup from one of the old nodes to the new one. That copies the data along with the local database, so the node joins the replica set without an initial synchronization.
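A minimal sketch of the cold-backup seeding described above, assuming both nodes use the same storage engine and the default dbpath (paths and hostnames are placeholders):

```shell
# On the donor secondary: shut down mongod cleanly so the files are consistent.
sudo service mongod stop

# Copy the entire dbpath, including the local database (which holds the oplog),
# to the new node.
rsync -av /var/lib/mongodb/ newnode:/var/lib/mongodb/

# Restart the donor, then start mongod on the new node; it joins the set
# without an initial sync because it already has the data and a valid oplog.
sudo service mongod start
```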

Problem is that I cannot do that here, because the on-disk format is not the same (MMAPv1 vs. WiredTiger). If I could get one node running WiredTiger, I could do the cold backup trick for the remaining ones.

Attempts I made with no luck:

  1. reduce size of my database
  2. reduce the Oplog size
  3. try with a dump backup / restore
  4. rs.add

I know there is a way to add a node with the --fastsync setting. But to do that, I need a copy of the data from the primary, and the on-disk files are not the same. I'm not sure whether that would work with a dump backup / restore. I would also like to know whether, once the replica set accepts the new member as a valid secondary, I can restart the node without --fastsync, or whether I have to keep that setting forever.
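For reference, a --fastsync startup over a seeded data directory could look like the sketch below (set name and paths are placeholders). As far as I understand, --fastsync only affects the startup it is passed to: it tells mongod to trust the seeded files and skip the initial sync, so once the member is a valid secondary it can be restarted without the flag:

```shell
# First startup over a dbpath seeded from an existing member
# (must include the local database so the oplog position is known):
mongod --replSet rs0 --dbpath /data/db --fastsync

# Subsequent restarts do not need the flag:
mongod --replSet rs0 --dbpath /data/db
```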

This is my mongod.log when it fails with the initial sync:

I INDEX    [rsSync]         building index using bulk method
I INDEX    [rsSync] build index done.  scanned 3 total records. 0 secs
I REPL     [rsSync] initial sync cloning db: config
I REPL     [rsSync] initial sync data copy, starting syncup
I REPL     [rsSync] oplog sync 1 of 3
I NETWORK  [rsSync] Socket recv() timeout  x.x.x.x:x
I NETWORK  [rsSync] SocketException: remote: x.x.x.x:x error: 9001 socket exception [RECV_TIMEOUT] server [x.x.x.x:x]
I NETWORK  [rsSync] DBClientCursor::init call() failed
E REPL     [rsSync] 10276 DBClientBase::findN: transport error: x:x ns: local.oplog.rs query: { query: {}, orderby: { $natural: -1 } }
E REPL     [rsSync] initial sync attempt failed, 9 attempts remaining

SOLUTION (see comments)

configure ulimits:

sudo nano /etc/security/limits.conf
# add at the bottom
mongod       soft    nproc   64000
mongod       hard    nproc   64000
mongod       soft    nofile  64000
mongod       hard    nofile  64000
mongod       soft    fsize   unlimited
mongod       hard    fsize   unlimited
mongod       soft    cpu     unlimited
mongod       hard    cpu     unlimited
mongod       soft    as      unlimited
mongod       hard    as      unlimited

configure TCP keepalive:

sudo nano /etc/sysctl.conf
# add at the bottom
net.ipv4.tcp_keepalive_time = 120

Note: I applied these changes on the new member and on the existing ones.
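To apply the keepalive change without a reboot, the new value can be set and verified like this (sketch, assuming a Linux system):

```shell
# Set the value directly in the running kernel (or reload /etc/sysctl.conf):
sudo sysctl -w net.ipv4.tcp_keepalive_time=120

# Verify the running value:
cat /proc/sys/net/ipv4/tcp_keepalive_time
```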

Best Answer

To migrate data between different storage engines in MongoDB 3.0 you will need to perform an initial sync to the new replica set node or use mongorestore to seed a new replica set with your data.

If you want to add a node with a new storage engine to a live production replica set, rs.add() is the correct approach. To ensure a smooth upgrade I would also upgrade all existing nodes to the latest version of MongoDB 3.0.x with MMAP storage (the default) before adding your new WiredTiger node(s).
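A sketch of how the new member might be started with WiredTiger explicitly selected, before running rs.add() against it (set name and dbpath are placeholders):

```shell
# Start the new 3.0.4 member with the WiredTiger storage engine
# (MMAPv1 is still the default in MongoDB 3.0, so it must be set explicitly):
mongod --replSet rs0 --dbpath /var/lib/mongodb --storageEngine wiredTiger
```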

The key information in your question (and comments) is:

  • socket exceptions related to networking timeouts:

    I NETWORK  [rsSync] Socket recv() timeout  x.x.x.x:x
    I NETWORK  [rsSync] SocketException: remote: x.x.x.x:x error: 9001 socket exception [RECV_TIMEOUT] server [x.x.x.x:x]
    
  • deploying on Azure, which has some known issues with timeouts

As per the MongoDB production notes on Azure, you should reduce the TCP keepalive setting to avoid network timeouts:

The TCP keepalive on the Azure load balancer is 240 seconds by default, which can cause it to silently drop connections if the TCP keepalive on your Azure systems is greater than this value. You should set tcp_keepalive_time to 120 to ameliorate this problem.

As further preparation for successful initial sync with WiredTiger you should review the MongoDB production notes for your operating system.

Since you mention using Linux, I would recommend checking:

  • ulimits are appropriately set. WiredTiger creates more files than MMAP (separate data & index files per collection rather than per database) so will use more file handles than MMAP nodes in the same replica set. If you don't increase the ulimits from the default you may find the initial data transfer completes but initial sync fails at the index build stage with a fatal error like Too many open files.

  • the WiredTiger data directory is on a volume using the XFS file system. At the time of writing, there are known performance issues (such as throughput stalls) when running WiredTiger on EXT4.
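The ulimits actually in effect for the running mongod process (as opposed to your shell) can be checked like this (sketch, assuming Linux and a single mongod process):

```shell
# Show all limits applied to the running mongod process:
cat /proc/$(pidof mongod)/limits

# Quick check of just the open-files limit:
grep "Max open files" /proc/$(pidof mongod)/limits
```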

The production notes in the MongoDB manual are updated regularly based on issues and experiences reported by MongoDB users, so should be part of your preflight checklist for production O/S deployments or upgrades.