Mongodb – How to we ensure security and integrity of data stored in MongoDB

mongodb

Since MongoDB lacks support of transactions how one can ensure consistency of data stored in MongoDB database?

Best Answer

You are mixing concepts here. Transactions are related to atomicity and guarantee that multiple operations are either executed as a whole or not at all.

Consistency, however is (simplified) the guarantee that after each successful transaction or operation applied, the database is in a a valid, usable state.

An example

Note This example is simplified for the sake of understandability.

Often people say that you can not use MongoDB for financial operations such as account balancing, since it lacks transactions. They could not be farer from the truth.

Take a simple example of two users, each with an account:

{
  _id:"act1",
  owner: "Foo Bar"
},
{
  _id: "act2",
  owner: "Bar Baz"
}

And now, we have a collection we call "transfers". Each of our accounts get $100.00 from an outside source:

db.transfers.insert({from:"Scroogey", to:"act1", amount: 100.00, date: new ISODate()})
db.transfers.insert({from:"Scroogey", to:"act2", amount: 100.00 , date: new ISODate()}})

Now, how do we get the balance of our two accounts? Say Foo Bar wants to check his balance. We have to do an aggregation for that:

db.transfers.aggregate([
  { $match:{ $or:[ { from:"act1" },{ to:"act1" } ]}},
  { $sort:{ date:1 }},
  { $group:{
    _id:"act1",
    balance:{
      $sum:{
        $cond:[
          { $eq: [ "$to", "act1" ] },
          "$amount",
          {$multiply:["$amount",-1]}
        ]}
      }
    }
  }
])

Which gives us the correct value when run against transfers:

{ "_id" : "act1", "balance" : 100 }

So far, so good. So how does this relate to atomicity and consistency?

In MongoDB, a write operation is atomic on the level of a single document, even if the operation modifies multiple embedded documents within a single document.

So if we insert a document into the transfers collection, we achieve what we want: atomic transfers from and to accounts. Say Foo Bar wants to send Bar Baz $20.25:

db.transfers.insert({from:"act1",to:"act2", amount: 20.25, date: new ISODate()})

Atomic and consistent. Now, if we run our aggregation again, we get the correct result:

{ "_id" : "act1", "balance" : 79.75 }

If Bar Baz checks his account

db.transfers.aggregate([
  { $match:{ $or:[ { from:"act2" },{ to:"act2" } ]}},
  { $sort:{ date:1 }},
  { $group:{
    _id:"act2",
    balance:{
      $sum:{
        $cond:[
          { $eq: [ "$to", "act2" ] },
          "$amount",
          {$multiply:["$amount",-1]}
        ]}
      }
    }
  }
])

he sees the correct value, too:

{ "_id" : "act2", "balance" : 120.25 }

We have an atomic system to do account balancing without transactions! And still we have a log of all transfers from and to an account which you need anyway. We just use it differently.

There is another way to do it

There is a way to achieve multi-document atomicity from the applications point of view, called Two-Phase-Commits.

Basically, it is an extension of the above. You'd have a transaction collection, in which pending operations are documented. Only if all operations are done successful, the transaction is marked as committed. There are various rollback scenarios. This is definetly more work than doing it like documented above, and should only be considered in case the data model can not be designed for the according use case. That would be the case when dealing with legacy systems, for example, in which you want to add features which would rely on transactions.

Basically you move the transaction logic from the DBMS to the application. This is not necessarily as disadvantage, since control over the various stages of a transaction which is supposed to be committed is much more granular.

Conclusion

As we have seen, there are not transactions. With careful data modeling and a bit of knowledge about the inner workings and tools provided by MongoDB, there is hardly any use case that can not be implemented with MongoDB. It might not be the best tool for any use case, however, and you should choose your persistence system carefully.

If you really need transactions, use the Two-Phase-Commit approach.