MongoDB – Create Single-Server Replica Set for Watching Change Streams

mongodb

I have a MongoDB database which is of relatively low importance. I shovel data from home automation stuff into it (mostly sensor data, sunlight, switch states and so on) and don't need any replication.

I want to subscribe to change streams on that DB, and for that the server must be a member of a replica set. Is it OK to just add the --replSet [name] flag to the already existing server and then call rs.initiate() in it without rs.add()-ing any secondary members?

What drawbacks will this have (database size, performance, …)?

Basically I want devices/software to insert data directly into the database, and then have some middleware watch the change streams and publish the data via RabbitMQ.
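A minimal sketch of such a middleware, assuming the PyMongo and pika client libraries. The replica set name `rs0`, the database and collection names `home` and `sensors`, and the queue name `sensor-events` are all hypothetical placeholders, and only insert events are forwarded:

```python
import json


def change_to_message(change):
    """Turn an insert change event into a JSON string for the broker."""
    # default=str renders values json cannot encode natively (ObjectId, datetime).
    return json.dumps(change["fullDocument"], default=str)


def forward_inserts(collection, publish):
    """Watch `collection` for inserts and hand each one to `publish`."""
    pipeline = [{"$match": {"operationType": "insert"}}]
    with collection.watch(pipeline) as stream:
        for change in stream:  # blocks until the next change arrives
            publish(change_to_message(change))


def main():
    # Third-party imports kept local so the helpers above work without them.
    import pika
    from pymongo import MongoClient

    # Hypothetical names: rs0, home.sensors, sensor-events.
    client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="sensor-events")
    forward_inserts(
        client["home"]["sensors"],
        lambda body: channel.basic_publish(
            exchange="", routing_key="sensor-events", body=body
        ),
    )
```

Calling `main()` would block and forward inserts until interrupted. A real deployment would probably also want to persist the stream's resume token so that a restart does not miss events; that is left out here.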

Currently I'm inserting into RabbitMQ and MongoDB in parallel. Inserting into RabbitMQ and having a middleware insert the data into MongoDB is also an option, but I want to check both ways of doing it.

I'm not that much of a friend of replica sets as I have an important database which holds about 80-90MB of documents (total size when exporting all collections to disk), but each member consumes around 1.5 GB of space (--logpath data excluded).

Best Answer

Is it OK to just add the --replSet [name] flag to the already existing server and then call rs.initiate() in it without rs.add()-ing any secondary members?

Yes, a single node replica set is fine if you don't require data redundancy or fault tolerance.
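For illustration, the equivalent of calling rs.initiate() with an explicit one-member configuration can be sketched from Python, assuming the PyMongo driver and a mongod already restarted with --replSet. The set name `rs0` and the host `localhost:27017` are placeholders, not values from the question:

```python
def single_node_rs_config(name, host):
    """Build a replica set config document with exactly one member."""
    return {"_id": name, "members": [{"_id": 0, "host": host}]}


def initiate_single_node(uri="mongodb://localhost:27017", name="rs0",
                         host="localhost:27017"):
    """Send replSetInitiate to a mongod that was started with --replSet <name>."""
    from pymongo import MongoClient  # third-party; imported lazily

    # directConnection=True talks to this one server without trying to
    # discover a replica set, which does not exist yet before initiation.
    client = MongoClient(uri, directConnection=True)
    return client.admin.command("replSetInitiate", single_node_rs_config(name, host))
```

The `name` passed here has to match the value given to --replSet; no further members are added, so the node should elect itself primary shortly afterwards.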

What drawbacks will this have (database size, performance, ...)?

A replica set node maintains an operation log (oplog) which has some additional storage overhead. You can change the size of the oplog to suit your use case. Since you only need the oplog for change streams, it can perhaps be smaller than the default to save on disk space.

In MongoDB 3.6+ you can also adjust the size at runtime using the replSetResizeOplog command. In earlier server versions the resize procedure is manual; see the "Change the Size of the Oplog" tutorial in the MongoDB 3.4 documentation.
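As a rough sketch, the runtime resize could be issued from Python like this, assuming PyMongo and an already connected `client`. The helper names are made up for this example, and the 990 MB floor reflects the minimum that the replSetResizeOplog documentation states as I understand it:

```python
MIN_OPLOG_MB = 990  # assumed minimum accepted by replSetResizeOplog


def resize_oplog_command(size_mb):
    """Build the replSetResizeOplog command document (size in megabytes)."""
    if size_mb < MIN_OPLOG_MB:
        raise ValueError(f"oplog size must be at least {MIN_OPLOG_MB} MB")
    # The command name has to be the first key; size is sent as a double.
    return {"replSetResizeOplog": 1, "size": float(size_mb)}


def resize_oplog(client, size_mb):
    """Run the command against the admin database of a connected MongoClient."""
    return client.admin.command(resize_oplog_command(size_mb))
```

For example, `resize_oplog(client, 990)` would request the smallest oplog this sketch allows.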

I'm not that much of a friend of replica sets as I have an important database which holds about 80-90MB of documents (total size when exporting all collections to disk), but each member consumes around 1.5 GB of space (--logpath data excluded).

Explaining the excessive storage usage is really a separate question, but this does seem an anomalous ratio.

Some likely causes:

  • Storage usage in the dbPath includes the replication oplog. The default oplog size is determined based on free disk space and O/S (for example, 5% of free disk space on Linux/Windows with a lower bound of 990MB and upper bound of 50GB).
  • You were using an older version of MongoDB (likely 3.2 or earlier) with the MMAPv1 storage engine. MMAPv1 does not compress data on disk, and some use cases will lead to storage fragmentation over time. Modern versions of MongoDB default to the WiredTiger storage engine, which includes data and index compression and significant improvements over MMAPv1.
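To get a feel for the first point, the default oplog sizing rule quoted above (5% of free disk space, clamped between 990MB and 50GB) can be estimated with a few lines of Python. This is only a sketch of that rule as stated here: the function names are made up, and treating 50GB as 50 × 1024 MB is an assumption about units:

```python
import shutil

LOWER_BOUND_MB = 990
UPPER_BOUND_MB = 50 * 1024  # assumption: 50GB expressed in units of 1024 MB


def default_oplog_size_mb(free_disk_mb):
    """Apply the rule: 5% of free disk space, clamped to the two bounds."""
    five_percent = free_disk_mb / 20
    return min(max(five_percent, LOWER_BOUND_MB), UPPER_BOUND_MB)


def estimate_for_path(path="."):
    """Estimate the default oplog size for the volume that holds `path`."""
    free_mb = shutil.disk_usage(path).free / (1024 * 1024)
    return default_oplog_size_mb(free_mb)
```

Under this rule a volume with about 30 GB free would get a default oplog of roughly 1.5 GB, and even a nearly full disk would still get 990MB, so an oplog alone can dwarf 80-90MB of documents.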