MongoDB: What does ‘[DataFileSync] flushing mmaps took … ms for … files’ mean

memory, mongodb

I am building an index in MongoDB for a large collection (~300GB) and the db terminal window keeps printing:

[DataFileSync] flushing mmaps took 10920ms for 151 files

Flushing the mmaps usually takes around 10 seconds and sometimes longer. I'm running MongoDB on a high-performance machine, so RAM availability is not an issue.

What does the data file sync message mean?

Is it an issue?

EDIT:

The real problem with the flushing routine is that MongoDB seems to freeze while it runs.

Even simple queries from a terminal window on the same host have to wait for the flush to finish before they complete, and ~10 seconds (up to 150s) is way too long. The routine runs at irregular intervals of 1, 2, or 3 minutes, even when no operations are running on the server and there are zero active connections.

I'm running MongoDB 2.2.1 (the 2008R2+ build) on Windows Server 2008 R2.

Best Answer

MongoDB flushes in-memory data changes to disk from a background thread every 60 seconds (the interval is tunable, see syncdelay). During normal operation those changes come from inserts, updates, deletes and the like (including changes to indexes), so the amount flushed roughly tracks your write activity. The mmaps piece refers to the fact that MongoDB uses memory-mapped files.
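To keep an eye on how these flush times develop during the index build, you could pull the numbers out of the log. The following is a minimal sketch that assumes the message always has the shape quoted in the question; the function names and the 1000 ms threshold are made up for illustration and are not part of MongoDB.

```python
import re

# Matches the message format quoted in the question:
#   [DataFileSync] flushing mmaps took 10920ms for 151 files
FLUSH_RE = re.compile(
    r"\[DataFileSync\] flushing mmaps took (\d+)ms\s+for (\d+) files"
)


def parse_flush_line(line):
    """Return (milliseconds, file_count) for a flush message, else None."""
    match = FLUSH_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def slow_flushes(lines, threshold_ms=1000):
    """Yield (ms, files) for each flush slower than threshold_ms.

    The default threshold is an arbitrary illustration value.
    """
    for line in lines:
        parsed = parse_flush_line(line)
        if parsed is not None and parsed[0] > threshold_ms:
            yield parsed


sample = "[DataFileSync] flushing mmaps took 10920ms for 151 files"
print(parse_flush_line(sample))  # (10920, 151)
```

Feeding the lines of the mongod log through slow_flushes gives a quick list of the flushes worth looking at.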

When you build an index, you create a lot of new in-memory data, and all of it is persisted to disk every 60 seconds. How quickly that flush completes is a function of:

  • How much data has to be flushed to disk (in your case how much index building has gone on)
  • How fast your disk can write that data out

Given that your example shows a flush taking 10 seconds, that would imply the disk is struggling a bit: a flush time in single-digit seconds is high, and double digits is worrying. Of course, it does depend on the nature of your disk. If an SSD were taking 10 seconds (for example), that would have to be a huge amount of data.
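As a rough back-of-the-envelope check on the two factors above, you can turn a flush time into the amount of data it could represent. This is only a sketch: the write rates below are hypothetical round numbers, not measurements, and it assumes the disk writes at its full sustained rate for the whole flush. Real flushes involve scattered writes, so treat the result as an upper bound.

```python
def implied_flush_megabytes(flush_ms, write_mb_per_s):
    """Upper bound on MB written during a flush.

    Assumes the disk sustained write_mb_per_s for the entire flush,
    which is optimistic for the scattered writes a flush produces.
    """
    return flush_ms * write_mb_per_s / 1000.0


# Hypothetical sustained write rates in MB/s, for illustration only.
for label, rate in [("spinning disk", 100), ("SSD", 400)]:
    megabytes = implied_flush_megabytes(10920, rate)
    print(label, megabytes, "MB at most")
```

If the index build cannot plausibly have produced that much changed data in one minute, the disk is writing far below its sustained rate, which points to the storage rather than the data volume as the bottleneck.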

That explains the message. Beyond that, index builds are intensive (and blocking) operations. You can run them in the background on a primary to avoid some of the impact, but they will run in the foreground on secondaries until SERVER-2771 is fixed. As a result, the recommended way to build indexes with a replica set is outlined here:

http://docs.mongodb.org/manual/administration/indexes/#index-building-replica-sets

The impact will depend on how long the build takes, what else you are doing at the time, and so on, but if you follow the instructions above you should be able to keep those effects from reaching your end users.