I suggest a single time series entry per document. There are some problems with storing multiple entries per document:
- a single document is limited to a certain size (currently 16 MB); this limits how many entries can be stored in a single document
- as more entries are added, the document outgrows its allocated space, and the entire document (and therefore the whole time series) must needlessly be moved and rewritten in a larger region of storage
- queries on sub-documents are limited compared to queries on regular documents
- documents with very wide, flat structures (like one sub-document for each second) do not perform well
- the built-in map-reduce does not work as well on sub-documents
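A minimal sketch of the two layouts, using hypothetical field names (sensor_id, timestamp, value are illustrative, not from any particular schema):

```python
from datetime import datetime, timezone

# Recommended: one time series entry per document.
# Each reading is a small, fixed-size document that never grows.
entry = {
    "sensor_id": "sensor-1",
    "timestamp": datetime(2012, 1, 1, 0, 0, 0, tzinfo=timezone.utc),
    "value": 21.5,
}

# The layout to avoid: one growing document holding many entries as
# sub-documents keyed by second. Every append rewrites (and may
# relocate) the whole document, and querying "values.30" is awkward.
nested = {
    "sensor_id": "sensor-1",
    "minute": datetime(2012, 1, 1, 0, 0, 0, tzinfo=timezone.utc),
    "values": {str(sec): None for sec in range(60)},  # wide, flat structure
}

print(len(nested["values"]))  # 60 slots that churn on every update
```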
Also note that a timestamp is built into the default MongoDB ObjectId; you can use it when one-second precision is sufficient for the time series.
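The first four bytes of an ObjectId are the creation time in seconds since the Unix epoch, so the embedded timestamp can be recovered without any driver support; a minimal sketch:

```python
from datetime import datetime, timezone

def objectid_timestamp(oid_hex: str) -> datetime:
    """Extract the embedded creation time from a 24-character ObjectId
    hex string. The first 4 bytes are seconds since the Unix epoch,
    so the precision is exactly one second."""
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# 0x4cf50e00 == 1291128320 seconds -> 2010-11-30 14:45:20 UTC
print(objectid_timestamp("4cf50e000000000000000000"))
```

(With pymongo installed, `ObjectId.generation_time` gives the same value directly.)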
Here is an example BSON document from an event logging library that uses MongoDB:
{
    'thread': -1216977216,
    'level': 'ERROR',
    'timestamp': Timestamp(1290895671, 63),
    'message': 'test message',
    'fileName': '/var/projects/python/log4mongo-python/tests/test_mongo_handler.py',
    'lineNumber': 38,
    'method': 'test_emit_exception',
    'loggerName': 'testLogger',
    'exception': {
        'stackTrace': 'Traceback (most recent call last):
            File "/var/projects/python/log4mongo-python/tests/test_mongo_handler.py", line 36, in test_emit_exception
                raise Exception(\'exc1\')
            Exception: exc1',
        'message': 'exc1',
        'code': 0
    }
}
Since an event log is similar to a time series, it may be worth studying the rest of the code. There are versions in Java, C#, PHP, and Python.
Here is another similar open source project: Zarkov
[update] In response to @RockScience's comment, I have added some more references:
For the first part, expiring based on a timestamp, you will want to check out TTL collections (requires version 2.2+). They will not cap the collection's size the way a capped collection does, but they let you expire documents by age, as long as you have a BSON Date field to index on.
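In pymongo the TTL index itself is created with `collection.create_index("createdAt", expireAfterSeconds=3600)`. The sketch below does not touch a database; it just mimics the expiry rule the TTL monitor applies (delete documents whose indexed Date is older than the TTL window), with hypothetical document contents:

```python
from datetime import datetime, timedelta, timezone

def expired(docs, ttl_seconds, now):
    # A TTL monitor removes documents whose indexed BSON Date field
    # is older than (now - TTL).
    cutoff = now - timedelta(seconds=ttl_seconds)
    return [d for d in docs if d["createdAt"] < cutoff]

now = datetime(2012, 9, 1, 12, 0, tzinfo=timezone.utc)
docs = [
    {"_id": 1, "createdAt": now - timedelta(hours=2)},      # past the TTL
    {"_id": 2, "createdAt": now - timedelta(minutes=30)},   # still live
]
print([d["_id"] for d in expired(docs, 3600, now)])  # [1]
```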
For the second part, automatically storing them in a new database before expiry, I think what you need is a periodic job (via cron, at, or similar, in the language of your choice) that grabs those documents before they expire. This is just a simple range query over time: find the appropriate documents and insert them into the new DB, always with enough of an offset before the TTL cutoff to avoid missing any.
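The selection logic for that periodic job can be sketched as follows (the field name `createdAt` and the offset value are illustrative assumptions):

```python
from datetime import datetime, timedelta, timezone

def archive_batch(docs, ttl_seconds, offset_seconds, now):
    # Documents older than (TTL - offset) will expire within the next
    # `offset_seconds`, so they should be copied to the archive now.
    cutoff = now - timedelta(seconds=ttl_seconds - offset_seconds)
    return [d for d in docs if d["createdAt"] < cutoff]

now = datetime(2012, 9, 1, 12, 0, tzinfo=timezone.utc)
docs = [
    {"_id": 1, "createdAt": now - timedelta(seconds=3500)},  # expires in 100 s
    {"_id": 2, "createdAt": now - timedelta(seconds=600)},   # plenty of time left
]
# TTL of 3600 s, archive anything within 300 s of expiring:
print([d["_id"] for d in archive_batch(docs, 3600, 300, now)])  # [1]
```

Against a real collection this would be a range query on the indexed date field followed by inserts into the archive database.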
The other approach I can think of would be to tail the oplog, look for deletes on that collection, and then transcode them into inserts on a different DB.
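One wrinkle with the oplog approach: a delete entry in the oplog records only the `_id` of the removed document, so the full document has to come from somewhere else, e.g. a local cache populated while tailing the inserts. A hedged sketch of the transcoding step, using the oplog's `op`/`ns`/`o` fields:

```python
def transcode_deletes(oplog_entries, namespace, local_cache):
    """Convert oplog delete entries ("op": "d") on `namespace` into full
    documents to insert into another DB. The delete entry only carries
    the _id, so full documents come from a cache filled while tailing."""
    out = []
    for entry in oplog_entries:
        if entry["op"] == "d" and entry["ns"] == namespace:
            doc = local_cache.get(entry["o"]["_id"])
            if doc is not None:
                out.append(doc)
    return out

cache = {1: {"_id": 1, "value": 42}}
oplog = [
    {"op": "i", "ns": "events.log", "o": {"_id": 1, "value": 42}},
    {"op": "d", "ns": "events.log", "o": {"_id": 1}},
    {"op": "d", "ns": "other.coll", "o": {"_id": 9}},  # different namespace
]
print(transcode_deletes(oplog, "events.log", cache))  # [{'_id': 1, 'value': 42}]
```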
For extra safety you could run a delayed secondary (slaveDelay) to give yourself an extra window if one of the approaches above fails.
I don't know of anything that will do this by default, but I believe either of the approaches above should work.
Best Answer
You can always insert a field with the BSON Date type (type 9); no special code is needed. Alternatively, you can store it as a string, or construct it with ISODate in the shell.
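With pymongo, a Python `datetime` is converted to a BSON Date (type 9) automatically on insert. A minimal sketch (the insert call in the comment is not executed here):

```python
from datetime import datetime, timezone

# pymongo stores a Python datetime as BSON Date (type 9), e.g.:
#   collection.insert_one({"ts": datetime.now(timezone.utc)})
# A real Date field supports type-aware range queries and TTL indexes;
# a string timestamp does not.
doc_as_date = {"ts": datetime(2012, 1, 1, tzinfo=timezone.utc)}
doc_as_string = {"ts": "2012-01-01T00:00:00Z"}

print(type(doc_as_date["ts"]).__name__)  # datetime
```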