First, let me say that with the information given it will be hard to get at the root cause - it is usually an iterative process that takes multiple attempts to track down the culprit. In the interest of answering the "what next?" portion of your question rather than identifying the root cause, read on.
To start, a couple of recommendations:
- Get the host into MMS (it's free) - see http://mms.10gen.com - so you can graph stats over time and get a view of the issues without having to be sitting on the box running commands
- Get munin-node installed too, so you can correlate ops etc. with IO (install docs for MMS explain this).
Next, a couple of quick checks for common causes:
- What is your filesystem and kernel? These generally need to be ext4/XFS on a kernel recent enough to have fallocate working (2.6.23 and 2.6.25 respectively) so that new file allocation is not slow
- If you don't get MMS and munin-node installed, capture iostat output to match up with mongostat and determine whether IO is the root cause of the bottleneck
- Do you do any periodic batch updates that grow the documents significantly (i.e. that would cause moves)? Moves are expensive and can cause IO to get backed up
- Is your disk up to the data volume you are writing to it? MongoDB fsyncs to disk every 60 seconds by default; if the volume that needs to be synced after 60 seconds is massive (say, because of an insert spike) then you can also run into issues - see the sketch after this list for a quick way to check flush times
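As a quick sanity check on that last point, here is a minimal sketch using pymongo to look at background flush timings from serverStatus. The host/port are placeholders, and the `backgroundFlushing` section is an assumption - it is only reported by the MMAPv1 storage engine:

```python
# Sketch: check how long the periodic fsync (background flush) is taking.
# Assumes a mongod on localhost:27017 whose serverStatus output includes
# a "backgroundFlushing" section (MMAPv1 storage engine).
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
status = client.admin.command("serverStatus")

flushing = status.get("backgroundFlushing")
if flushing:
    print("flushes:      ", flushing["flushes"])
    print("average ms:   ", flushing["average_ms"])
    print("last flush ms:", flushing["last_ms"])
else:
    print("No backgroundFlushing section - this storage engine may not report it.")
```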
That's not an exhaustive list - I have seen other issues cause this - but it should get you started down the right path.
For the first part, expiring based on a timestamp, you will want to check out TTL collections (requires version 2.2+). They won't cap size like a capped collection, but they let you set the constraints via time as long as you have a BSON Date type field to index on.
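As an illustration, here is a minimal sketch of setting up a TTL index with pymongo - the database, collection, and field names and the 24-hour expiry are placeholders:

```python
# Sketch: expire documents 24 hours after their "createdAt" BSON Date.
# Database/collection/field names are illustrative placeholders.
import datetime
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
coll = client.mydb.events

# TTL index: the background TTL monitor removes documents once
# "createdAt" is older than expireAfterSeconds.
coll.create_index("createdAt", expireAfterSeconds=24 * 60 * 60)

# Documents need a real BSON Date in that field (a datetime in Python).
coll.insert_one({"payload": "example", "createdAt": datetime.datetime.utcnow()})
```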
For the second part, automatically storing them to a new database before expiry, I think what you need is a periodic job (via cron, at, or similar, in the language of your choice) that grabs those documents before they expire. This should just be a simple time-based range query: find the appropriate docs and insert them into the new DB, always at enough of an offset to avoid missing documents.
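A minimal sketch of what that periodic job could look like with pymongo - the names, the 24-hour TTL, and the 1-hour safety offset are all assumptions:

```python
# Sketch: copy documents to an archive database shortly before the TTL
# monitor would delete them. All names and intervals are placeholders.
import datetime
from pymongo import MongoClient

TTL_SECONDS = 24 * 60 * 60   # must match the TTL index
SAFETY_OFFSET = 60 * 60      # archive one hour before expiry

client = MongoClient("localhost", 27017)
source = client.mydb.events
archive = client.archivedb.events

# Anything older than (TTL - offset) is about to expire; grab it now.
cutoff = datetime.datetime.utcnow() - datetime.timedelta(
    seconds=TTL_SECONDS - SAFETY_OFFSET)

for doc in source.find({"createdAt": {"$lt": cutoff}}):
    # Upsert by _id so re-running the job doesn't create duplicates.
    archive.replace_one({"_id": doc["_id"]}, doc, upsert=True)
```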
The other approach I can think of would be to tail the oplog, look for deletes on that collection, and then transcode them into inserts on a different DB.
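Here is a rough sketch of tailing the oplog for deletes with pymongo. Note that a delete entry in the oplog only carries the _id of the removed document, so the namespace below and what you do with each _id (look the document up on a delayed secondary, or in the archive job above) are assumptions:

```python
# Sketch: tail the oplog of a replica set member and watch for deletes
# on one namespace. "mydb.events" is a placeholder.
import time
import pymongo
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
oplog = client.local["oplog.rs"]

cursor = oplog.find(
    {"op": "d", "ns": "mydb.events"},
    cursor_type=pymongo.CursorType.TAILABLE_AWAIT,
)

while cursor.alive:
    for entry in cursor:
        # The "o" field of a delete entry contains only the _id of the
        # removed document; the full document has to come from elsewhere.
        print("deleted _id:", entry["o"]["_id"])
    time.sleep(1)
```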
For extra safety you could run a secondary with slaveDelay set to give yourself an extra window if one of the approaches above fails.
I don't know of anything that will do this by default, but I believe either of the approaches above should work.
In MongoDB, dates are stored as the BSON Date type, which is a 64-bit integer representing the number of milliseconds since the Unix epoch (Jan 1, 1970). This means that date and date-time values are stored in the same format, so there is no difference in indexing on either of them.
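For example, a date-only value and a full date-time both end up as the same BSON Date type when stored through a driver; here is a small pymongo sketch (collection and field names are placeholders):

```python
# Sketch: both a "date" and a "date-time" are stored as BSON Date
# (milliseconds since the Unix epoch), so one index covers both.
import datetime
from pymongo import MongoClient

coll = MongoClient("localhost", 27017).mydb.items

coll.insert_one({"when": datetime.datetime(2012, 10, 1)})             # midnight (date only)
coll.insert_one({"when": datetime.datetime(2012, 10, 1, 14, 30, 5)})  # full date-time

# A single ascending index on the field works for either form.
coll.create_index("when")
```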