MongoDB – How should I store time series in MongoDB?

mongodb

I need to create a database of time series, and perform the following tasks:

  • create new time series
  • update existing time series
  • query one or several time series at once (for instance, all time series for the same date)

Is MongoDB suited to this and, if so, how should I structure the database? Should one time series be one document? Or should one document be a single entry of the time series, with all these documents together forming a collection that holds the entire time series?

I am a bit lost here, and I am struggling to find any guidance: MongoDB is usually presented as very flexible, which leaves the choice of structure up to the user.

Any link to a tutorial that specifically explains how to manage time series in MongoDB would be very welcome.

Thank you!

Best Answer

I suggest a single time series entry per document. There are some problems with storing multiple entries per document:

  • a single document is limited to a certain size (currently 16 MB); this limits how many entries can be stored in a single document
  • as more entries are added to a document, it outgrows the space allocated to it, so the entire document (and with it the whole time series) has to be moved and reallocated to a larger block of space
  • queries on sub-documents are limited compared to queries on regular documents
  • documents with very flat structures (like one sub-document for each second) are not performant
  • the built-in map-reduce does not work as well on sub-documents
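As a rough illustration of the one-entry-per-document layout, here is a minimal Python sketch. The field names `series`, `timestamp` and `value` and the helper names `make_entry` and `day_filter` are assumptions made up for this example; only the `$gte`, `$lt` and `$in` query operators come from MongoDB itself.

```python
from datetime import datetime, timedelta, timezone


def make_entry(series_id, timestamp, value):
    """Build one document holding a single observation of one series."""
    # Field names are hypothetical; pick whatever suits your data.
    return {"series": series_id, "timestamp": timestamp, "value": value}


def day_filter(day, series_ids=None):
    """Build a query filter matching every entry on the given UTC day."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    query = {"timestamp": {"$gte": start, "$lt": start + timedelta(days=1)}}
    if series_ids is not None:
        # Restrict the match to the requested series only.
        query["series"] = {"$in": list(series_ids)}
    return query


start = datetime(2010, 11, 27, tzinfo=timezone.utc)
# Three hourly observations, each one its own document.
entries = [
    make_entry("temperature", start + timedelta(hours=h), 20.0 + h)
    for h in range(3)
]
# One filter that covers several series for the same date.
query = day_filter(start, ["temperature", "humidity"])
```

With PyMongo, the list could then be passed to `collection.insert_many(entries)` and the filter to `collection.find(query)`. An index on the timestamp field would probably be worth adding for date queries like this one.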

Also note that a timestamp is built into the default MongoDB ObjectId. It has one-second resolution, so you can use it if your time series does not need a resolution finer than one second.
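To show where that timestamp lives, here is a small sketch that reads it using only the Python standard library. It relies on the first 4 bytes of an ObjectId being a big-endian count of seconds since the Unix epoch; the function name `objectid_time` and the example id are made up for illustration. If you use PyMongo, `ObjectId.generation_time` gives the same value.

```python
from datetime import datetime, timezone


def objectid_time(oid_hex):
    """Return the creation time embedded in a 24-character hex ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    # The first 4 bytes (8 hex digits) are big-endian Unix seconds.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Hypothetical id: a known Unix time followed by 8 zero bytes.
example_id = format(1290895671, "08x") + "0" * 16
created = objectid_time(example_id)
```

Because only whole seconds are stored, two entries created within the same second cannot be told apart by this value alone.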

Here is an example of the BSON document format generated by an event logging library that uses MongoDB:
{
    'thread': -1216977216,
    'level': 'ERROR',
    'timestamp': Timestamp(1290895671, 63),
    'message': 'test message',
    'fileName': '/var/projects/python/log4mongo-python/tests/test_mongo_handler.py',
    'lineNumber': 38,
    'method': 'test_emit_exception',
    'loggerName':  'testLogger',
    'exception': {
        'stackTrace': 'Traceback (most recent call last):
                       File "/var/projects/python/log4mongo-python/tests/test_mongo_handler.py", line 36, in test_emit_exception
                       raise Exception(\'exc1\')
                       Exception: exc1',
        'message': 'exc1',
        'code': 0
    }
}

Since an event log is similar to a time series, it may be worth studying the rest of the code. There are versions in Java, C#, PHP, and Python.

Here is another similar open source project: Zarkov


[update] In response to @RockScience's comment, I've added some more references: