You have answered some of your own questions here; specifically, you have a decent idea about the write lock aspect of the equation: 12,000 inserts/sec gets you to ~60% write lock. That's a reasonable level for consistent performance - you will get some contention, and some ops will be a little slower, but the point to really start worrying is around 80%. Like a lot of things, once you exceed 80% of available capacity you will start hitting issues far more often.
In terms of other bottlenecks, and specifically how quickly you can write to disk: this can cause problems, and to track the relevant stats over time I would recommend installing MMS with the munin-node plugin, which gives you hardware and IO stats in addition to the MongoDB stats.
When you have that, the metrics you will want to keep an eye on are:
- The Average Flush time (this is how long MongoDB's periodic sync to disk is taking)
- The IOStats in the hardware tab (IOWait in particular)
- Page Faults (if your disk is busy with writes and you need to read data, they are going to be competing for a scarce resource)
It's a bit complicated, but here's the basic idea:
- When average flush time starts to increase, be worried
- If it gets into the multiple second range, you are probably on the limit (though this depends on the volume of data written and the disk speed)
- If it approaches 60 seconds you will see performance degrade severely (the flush happens every 60 seconds, so they would essentially be queuing up)
- High IOWait is going to hinder performance too, especially if you have to read from disk at any point
- Hence looking at page fault levels will also be important
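To make the rules of thumb above concrete, here is a minimal sketch that evaluates those metrics against the thresholds described. The field names and threshold values are assumptions for illustration (the flush time would come from serverStatus's backgroundFlushing section, the IOWait figure from the MMS hardware tab), not an official check:

```python
# Sketch: apply the heuristics above to a set of sampled metrics.
# Thresholds are illustrative assumptions, not official limits.

FLUSH_WARN_MS = 1000       # multi-second flushes: probably near the limit
FLUSH_CRITICAL_MS = 60000  # flushes queuing behind the 60-second cycle
IOWAIT_WARN_PCT = 20       # assumed warning level for IOWait

def check_write_pressure(avg_flush_ms, iowait_pct, page_faults_per_s):
    """Return a list of warnings based on the heuristics above."""
    warnings = []
    if avg_flush_ms >= FLUSH_CRITICAL_MS:
        warnings.append("flushes are queuing up - severe degradation")
    elif avg_flush_ms >= FLUSH_WARN_MS:
        warnings.append("average flush time in the seconds range")
    if iowait_pct >= IOWAIT_WARN_PCT:
        warnings.append("high IOWait - disk is saturated")
    if page_faults_per_s > 0 and iowait_pct >= IOWAIT_WARN_PCT:
        warnings.append("page faults competing with writes for disk")
    return warnings

print(check_write_pressure(2500, 35, 10))
```

The point is less the exact numbers than the trend: feed this from your baseline monitoring and watch for the warnings starting to appear.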
The other piece of this puzzle, which we have not mentioned yet, is the journal. That will be persisting data to disk as well (by default every 100ms) and so it will be adding to the disk's load if it is on the same volume. Hence if you are seeing high disk utilization, then moving the journal off to another disk would be a good idea.
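The usual way to move the journal to another volume is to relocate the journal directory and symlink it back into the dbPath. The sketch below demonstrates the mechanics using stand-in paths under /tmp (the real paths depend on your deployment, and you must stop mongod before doing this for real):

```python
import os
import shutil

# Demo of relocating the journal directory onto another volume via a
# symlink. The /tmp paths are stand-ins for the real dbPath and a
# second, faster disk; stop mongod first in a real deployment.
DBPATH = "/tmp/demo-dbpath"
FAST_DISK = "/tmp/demo-journal-disk"

for path in (DBPATH, FAST_DISK):
    shutil.rmtree(path, ignore_errors=True)  # clean demo dirs
os.makedirs(os.path.join(DBPATH, "journal"))
os.makedirs(FAST_DISK)

# Move the journal directory onto the fast disk, then symlink it back
# so mongod still finds it at <dbPath>/journal.
shutil.move(os.path.join(DBPATH, "journal"),
            os.path.join(FAST_DISK, "journal"))
os.symlink(os.path.join(FAST_DISK, "journal"),
           os.path.join(DBPATH, "journal"))
```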
There are no real "magic numbers" to stay under, in most cases it's all relative, so get a good baseline for your normal traffic, check to see if things are trending up and maybe load test to see what your limits are and when things start to degrade and you will be in good shape.
After all that preamble, on to some of your questions:
> What happens if there are more inserts per second than mongod is able to save to the hard disk? Will there be any warning or will it simply fail silently?
If you start to stress the disk to the levels described above, eventually everything will slow down, and at some point (depending on time-outs, how beefy your hardware is, and how you handle exceptions) your writes will fail. If you are using a recent version of pymongo then you will be using safe writes by default, and those will then fail. If you wish to be a little more paranoid, you can occasionally do a write with write concern j:true, which will not return OK until the write has made it to the journal (i.e. is on disk). This will, of course, be slower than a normal safe write, but it gives you an immediate indication of disk-capacity issues, and you could use it to block/queue other operations - essentially acting as a throttle to prevent your database from being overwhelmed.
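The throttle idea can be sketched as follows. The insert functions here are stand-ins so the example is self-contained; with pymongo, the journaled probe would be `coll.with_options(write_concern=WriteConcern(j=True)).insert_one(doc)`, which returns only once the write has reached the journal:

```python
import time

# Sketch: use an occasional journaled (j:true) write as a probe, and
# back off when it gets slow. The insert callables are stand-ins for
# real pymongo calls; thresholds are illustrative assumptions.

LATENCY_BUDGET_S = 0.5  # assumed acceptable j:true round-trip
PROBE_EVERY = 1000      # probe once per 1000 normal inserts

def ingest(docs, fast_insert, journaled_insert, backoff_s=1.0):
    """Insert docs, pausing whenever the journaled probe is slow."""
    for i, doc in enumerate(docs):
        if i % PROBE_EVERY == 0:
            start = time.monotonic()
            journaled_insert(doc)  # waits for the journal on disk
            if time.monotonic() - start > LATENCY_BUDGET_S:
                time.sleep(backoff_s)  # disk is struggling: throttle
        else:
            fast_insert(doc)  # normal acknowledged write
```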
> I am thinking about a simple replication setup using one master and one slave. Does the initial sync or a resync process lock the databases?
I think I covered locking overall at the start, but to answer this piece specifically: first, make sure you are using a replica set, not master/slave replication. The master/slave implementation is deprecated and not recommended for general use. As for the initial sync: it will add some load to the primary in terms of reads, but not in terms of writes, so you should be fine as far as locking goes.
> What happens to my data if the write queue increases on long term?
As you can probably tell from the explanation above, the answer very much depends on how you write your application, how you choose to have your writes acknowledged, and how much capacity you have available. You can, essentially, be as safe as you wish when it comes to writing to disk on MongoDB, but there is a performance trade-off, as mentioned in the j:true discussion above.
Generally, you want to figure out your limiting factor - be it locking, disk speed etc. and then track the levels over time and scale out (sharding) or up (better hardware) before you hit a hard limit and see performance problems.
One last thing: db.serverStatus().writeBacksQueued is actually a metric that will only ever be non-zero in a sharded environment. It has to do with making sure that writes to a chunk during a migration are dealt with appropriately (handled by the writeback listener), so it is essentially a red herring here - nothing to do with general write volume.
Best Answer
There are two primary methods for WiredTiger to ensure the durability of your data: journaling and checkpointing.
Checkpoints store the valid and consistent state of the whole database. Should the server crash for any reason, it can restart from the last checkpoint in a consistent state. WiredTiger by default checkpoints the database every 60 seconds.
Journals store the writes that happened in-between checkpoints. If the database crashed between checkpoints, the journal data will be replayed on top of the last known good checkpoint. The journal is persisted to disk every 50 ms. The journaling process is optimized to be a lot more lightweight compared to checkpoints (and it also does a lot less work), so it's typically much faster than a checkpoint.
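The interaction between checkpoints and the journal can be illustrated with a toy model. This is not WiredTiger's implementation - just a sketch of the recovery idea, with a checkpoint every 3 writes standing in for the real 60-second interval:

```python
# Toy model of checkpoint + journal recovery. Real WiredTiger
# checkpoints every 60 seconds and syncs the journal every 50 ms;
# here we checkpoint every 3 writes instead.

CHECKPOINT_EVERY = 3

def run_and_crash(writes, crash_after):
    """Apply (key, value) writes, then simulate a crash midway."""
    checkpoint, journal = {}, []
    for n, (key, value) in enumerate(writes, start=1):
        journal.append((key, value))   # every write hits the journal
        if n % CHECKPOINT_EVERY == 0:
            checkpoint = dict(checkpoint, **dict(journal))
            journal = []               # checkpoint absorbs the journal
        if n == crash_after:
            return checkpoint, journal # crash: only these survive

def recover(checkpoint, journal):
    """Replay the journal on top of the last good checkpoint."""
    state = dict(checkpoint)
    state.update(journal)
    return state
```

With the journal, recovery reproduces every write; without it, only the checkpoint survives and everything since the last checkpoint is lost - which is exactly the trade-off discussed below.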
If journaling is disabled, the behaviour differs slightly between a mongod running in standalone mode and one in a replica set:
- MongoDB 3.6.4 standalone with journaling disabled: only the complete state of the whole database is written to disk, on every checkpoint. That is, if your database crashes between checkpoints, you can potentially lose the last 60 seconds of writes (using the default checkpoint interval).
- MongoDB 3.6.4 replica set with journaling disabled: by design, a replica set requires more durability, since it must record every operation in the oplog and ensure that the oplog is safely persisted. Thus, it will perform a checkpoint on every write instead. This will slow down the replica set tremendously.
The tradeoff you make by disabling journaling:
- In a standalone deployment, you save some disk space and IOPS, but since the main goal of a database is to persist your data, the cost is up to a checkpoint interval's worth of writes (by default, 60 seconds) lost after a crash. Note that running a standalone MongoDB in production is not recommended anyway.
- In a replica set, the write process is considerably slowed down by the need to persist the oplog through per-write checkpoints, and the deployment will actually use more disk IOPS. In future versions of MongoDB, running a replica set with no journal will not be allowed (see SERVER-30347).
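For reference, journaling is controlled in the mongod configuration file under storage.journal. A minimal fragment (path illustrative; the tradeoffs above apply if you disable it) looks like:

```yaml
# mongod.conf fragment - illustrative only
storage:
  dbPath: /var/lib/mongodb   # example path
  journal:
    enabled: false           # default is true; see tradeoffs above
```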
For more information regarding journaling, please see the Journaling Process page.