MongoDB – Converting Standalone Instance to Replica Set and Backing Up

backupmongodbmongodb-3.6replication

I want to convert my MongoDB 3.6 standalone server to a replica set so I can watch a collection using change streams, which requires an oplog. I only want to use change streams so will keep this as a single member replica set.

The problem is that my standalone instance is kind of big (~400 GB) and I do not have a backup of it.

I have a few questions:

Is such conversion considered a safe operation? Are there any caveats considering the time of "conversion", the additional memory needed etc. or does that only apply if I connect more nodes?
What are the ways of backing up such big instances? I have heard of and read about Filesystem Snapshots but I'm not sure what are they and how to handle them.

Best Answer

Is such conversion considered a safe operation?

Converting a standalone node to a replica set is a straightforward procedure, but if you do not have a backup of your deployment (and this data is important) I would definitely prioritise creating and testing a backup. If you only have a single copy of your data (the one being used!) you will have very limited (and possibly painful) recovery options if something unwelcome accidentally (or intentionally) happens to your data.

Are there any caveats considering the time of "conversion", the additional memory needed etc. or does that only apply if I connect more nodes?

A replica set member will use some extra storage space and I/O for the operation log (oplog) which stores a rolling record of operations for use cases like replication and change streams. The conversion process is essentially restarting mongod with a new replSet option and then initialising the oplog and replica set configuration using rs.initate().

What are the ways of backing up such big instances? I have heard of and read about Filesystem Snapshots but I'm not sure what are they and how to handle them.

The MongoDB documentation describes supported Backup Methods including Back Up and Restore with Filesystem Snapshots. Filesystem snapshots use system level tools that vary depending on the O/S and filesystem used for your deployment.

For example, Linux has LVM (Logical Volume Manager) which enables taking a consistent backup of a block device. The initial snapshot will have some more noticeable overhead, but subsequent snapshots are generally quick. However, snapshots typically depend on the same storage infrastructure as the original disk so it is essential that you have a plan for archiving snapshots and saving backups elsewhere. If you are using a cloud provider (Amazon, Google Cloud, Azure, ...) with network data volumes, these also have snapshot APIs.

Related Solutions

MongoDB Sharding – Sharding Between Replica and Standalone Server

Inserts need to be done via the mongos for splits to happen automatically. In fact, all work should be done via the mongos, not directly to the primary on one of the sets. That is why you have not seen any splits happen and your chunks are all still in one place. In terms of details, I've written up some of the details about how splits happen in a previous answer.

Note that you can split manually via the mongos if you want to, you do not have to wait for the automatic splitting to happen.

MongoDB – Converting Standalone to Replica Set Causes Services Down

Basically, the scenario from the comments is what has happened here. You have added a new host to the set (mongo-primary) and this host is not reachable from your original host (kuankr). That means that you have a replica set with 2 hosts, but only one healthy. When that occurs you cannot satisfy the requirement for electing a primary - which is that >50% of the votes (or a strict majority) are required to elect a primary.

In a 2 node set, both nodes must be available and voting to elect a primary. In a 3 node set, you need 2 out of 3, in a 4 node set you need 3 out of 4, in a 5 node set you need 3 out of 5 etc.

This is why it is always recommended to have an odd number of nodes in your set. I would recommend adding an arbiter that can be reached by your original primary so that it can be elected again. Then, with the immediate problem solved, work out why the original primary cannot talk to the new node (most common issues: firewall, routing, incorrect bind IP on new node).

Update based on comments:

To force an addition if the helpers will not work on a secondary, then you can do something like this:

cfg = rs.conf()
// here's what my sample config member array looks like - adjust as necessary
> cfg.members 
[
    {
        "_id" : 0,
        "host" : "mongod_A.example.net:27017"
    },
    {
        "_id" : 1,
        "host" : "mongod_B.example.net:27017"
    }
]
// let's manually add an arbiter
> cfg.members[2] = {
... "_id" : 2,
... "host": "arbiter:27017",
... "arbiter": true
... }
// now our cfg object looks like this
> cfg
{
    "_id" : "rs",
    "version" : 7,
    "members" : [
        {
            "_id" : 0,
            "host" : "mongod_A.example.net:27017"
        },
        {
            "_id" : 1,
            "host" : "mongod_B.example.net:27017"
        },
        {
            "_id" : 2,
            "host" : "arbiter:27017",
            "arbiter" : true
        }
    ]
}
// Finally, reconfigure with force on the secondary
rs.reconfig(cfg, {force : true})

You can also just remove the "bad" node using this similar procedure

Best Answer

Related Solutions

MongoDB Sharding – Sharding Between Replica and Standalone Server

MongoDB – Converting Standalone to Replica Set Causes Services Down

Related Question