Why are NoSQL databases not ACID compliant

nosql

Not having ACID properties means that the database works well on clusters. But ACID is something very fundamental. How can a database work well if there is no atomicity, consistency, isolation and durability (ACID)?

Best Answer

How can a database work well if there is no atomicity, consistency, isolation and durability?

They can't. This is why those features were created.

Sometimes you can work around the lack of these features because you need to scale horizontally, or because scaling becomes your first and primary concern. You may not need consistency now; you may be able to handle it later. You may not need all the data returned in order: you may want the queries executed in parallel over clusters. Or your data may itself be of fairly low value.

Look at major products that implement these types of systems though -- generally, they suck and they have lots of errors. Ever post something to your wall on Facebook just to see it disappear and reappear multiple times? Or to have a conversation you're commenting on vanish for an extended period of time and come back? That's hundreds of people working to manage data using "NoSQL" and compiling PHP to C++. It doesn't especially work well. It works, and for most companies that's further than they can get developing an alternative.

Related Solutions

Can NoSQL databases cause occasional data loss

Though this is a years-old question...

In short, you can understand ACID as a guarantee of data integrity and safety under any expected circumstances. As in general programming, all the headaches come from multi-threading.

The biggest issue with NoSQL is mostly ACI. D(urability) is usually a separate issue.

If your DB is single-threaded, so that only one user can access it at a time, it is natively ACI compliant. But I am sure virtually no server has this luxury.

If your DB needs to be multi-threaded, serving multiple users/clients simultaneously, you need ACI-compliant transactions. Otherwise you will get silent data corruption rather than simple data loss, which is a lot more horrible. This is exactly the same as in general multi-threaded programming: if you don't have a proper mechanism such as a lock, you will get undefined data. In a DB, that mechanism is called full ACID compliance.
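
To make the lock analogy concrete, here is a minimal Python sketch (not taken from any particular database). The names `lost_update_demo`, `Counter` and `run_locked` are made up for illustration. The first function replays by hand one bad interleaving of two clients doing read-modify-write; the second shows the same kind of update protected by a lock.

```python
import threading


def lost_update_demo():
    """Replay one bad interleaving: both clients read before either writes."""
    balance = 100
    read_a = balance        # client A reads 100
    read_b = balance        # client B reads 100
    balance = read_a + 10   # A writes 110
    balance = read_b + 20   # B writes 120, silently discarding A's update
    return balance


class Counter:
    """Shared value whose read-modify-write is guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment_locked(self):
        # The read and the write happen as one isolated step.
        with self._lock:
            current = self.value
            self.value = current + 1


def run_locked(n_threads=8, n_increments=1000):
    """Run several writers concurrently and return the final value."""
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment_locked()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

In the hand-replayed case the final balance is 120 instead of 130, and nothing reports an error. That is the "silent corruption" described above. With the lock, every increment is counted.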

Many YesSQL/NoSQL databases advertise themselves as ACID-compliant, but actually very few of them really are.

No ACID compliance = You will always get undefined results in a multi-user (client) environment. I can't even think of what kind of DB does this.
Single row/key ACID compliance = You will get a guaranteed result if you modify only a single value at a time, but an undefined result (= silent data corruption) for simultaneous multi row/key updates. Most of the currently popular NoSQL DBs fall here, including Cassandra, MongoDB, CouchDB, … These kinds of DBs are safe only for single-row transactions, so you need to guarantee that your DB logic won't touch multiple rows in a transaction.
Multi row/key ACID compliance = You will always get a guaranteed result for any operation. This is the minimal requirement for an RDBMS. In the NoSQL field, very few do this: Spanner, MarkLogic, VoltDB, FoundationDB. I am not even sure there are more. These kinds of DBs are fresh and new, so little is known about their abilities or limitations.
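
As a rough illustration of what a multi-row guarantee buys you, here is a small sketch using Python's built-in `sqlite3` module. SQLite is used only because it is easy to run; the `accounts` table and the helpers `make_db`, `transfer` and `balances` are hypothetical. A transfer touches two rows, and if the second update fails, the first one is rolled back too.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two example accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move money between two rows atomically; return True on success."""
    try:
        # 'with conn' commits if the block succeeds and rolls back on error.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint if funds are short.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

The credit is deliberately applied before the debit, so a failed transfer would leave money created out of nothing if the two updates were not in one transaction. With a store that only guarantees single-row operations, the application has to provide that rollback itself.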

Anyway, this comparison excludes D(urability), so don't forget to check the durability attribute too. It's very hard to compare durability because the range becomes too wide. I don't know this topic well…

No durability. You can lose data at any time.
Safely stored on disk. When you get COMMIT OK, the data is guaranteed to be on disk. You lose data if the disk breaks.

Also, there are differences even among ACID-compliant DBs.

Sometimes ACID compliant / you need configuration / no automatic something... / some components are not ACID-compliant / very fast but you need to turn something off for this... / ACID-compliant if you use a specific module... = We do not bundle data safety by default. It is an add-on, an option, or sold separately. Don't forget to download, assemble, set up and issue the proper commands. Anyway, data safety may be ignored silently. Do it yourself. Check it yourself. Good luck not making any mistakes. Everyone on your team must be a flawless DBA to use this kind of DB safely. MySQL.
Always ACID compliant = We don't trade data safety for performance or anything else. Data safety is a forced bundle with this DB package. Most commercial RDBMSs, PostgreSQL.

The above covers a typical DB's implementation. But still, other hardware failures may corrupt the database, such as memory errors, data channel errors, or any other possible errors. So you need extra redundancy, and a real production-quality DB must offer fault tolerance features.

No redundancy. You lose all data if your data is corrupted.
Backup. You make a snapshot copy and restore from it. You lose the data written after the last backup.
Online backup. You can take a snapshot backup while the database is running.
Asynchronous replication. A backup is taken every second (or at a specified period). If the machine goes down, the DB is guaranteed to get the data back just by rebooting. You lose the data written after the last second.
Synchronous replication. A backup is made immediately for each data update. You always have an exact copy of the original data. Use the copy if the original breaks.
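
The difference between the last two options can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions (no network, no disk, no failures while shipping), and the `Primary` and `Replica` classes are hypothetical, not any real product's API.

```python
class Replica:
    """Holds a copy of the primary's data."""

    def __init__(self):
        self.data = {}


class Primary:
    """Toy primary that ships writes to one replica."""

    def __init__(self, replica, synchronous):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = []  # writes not yet shipped (asynchronous mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # The replica is updated before the write is acknowledged.
            self.replica.data[key] = value
        else:
            # The write is acknowledged now and shipped later.
            self.pending.append((key, value))

    def flush(self):
        """Ship queued writes; stands in for the periodic replication tick."""
        for key, value in self.pending:
            self.replica.data[key] = value
        self.pending.clear()
```

If the primary dies between two `flush` calls in asynchronous mode, whatever is still in `pending` exists nowhere else, which is the "data after the last second" mentioned above. In synchronous mode the replica never lags, at the cost of waiting for it on every write.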

So far, I have seen many DB implementations that lack many of these. And I think that if they lack proper ACID and redundancy support, users will eventually lose data.

NoSQL ACID and consistency for banking

As far as I know there are no "nosql" databases that promise ACID transactions, so for banking purposes they are a non-starter. Referential consistency support is not usually in their key feature sets either.

MySQL claims ACID transactions when using InnoDB tables, but I believe there are some caveats around that which may be show stoppers (for instance, any mix of other table types, including in-memory tables which are sometimes used for intermediate results in complex logic, will break ACID compliance).

If you are looking at these options because they are free, then consider PostgreSQL, which does offer ACID transactions, even including transactions that make schema changes. Please consider your support SLAs though: if this is a system that needs high availability (and the words "realtime" and "banking" suggest it is) then I recommend having a support contract in case something goes wrong that is beyond your understanding or your access to fix. This will be rather expensive, and once you are talking about that sort of money you might as well at least consider commercial database servers too.

One final point not relating to any particular database: make sure you have the right options selected for full ACID behaviour. A lot of systems default to settings that bend the isolation part, because guaranteeing complete isolation often requires serialisation, which can significantly impact the performance of concurrent operations. Whatever DBMS you choose, make sure that you understand its options for isolation/serialisation (like "SET TRANSACTION ISOLATION LEVEL" and related directives in MSSQL) and how they interact with each other.
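
To give a feel for the cost of serialisation, here is a small sketch using Python's built-in `sqlite3` module. This is SQLite, not MSSQL, and it does not use "SET TRANSACTION ISOLATION LEVEL"; it only shows the general trade-off. The function name `second_writer_blocked` and the table `t` are made up for the example, and the sketch assumes SQLite's default journal mode, where only one write transaction can be open at a time.

```python
import os
import sqlite3
import tempfile


def second_writer_blocked():
    """Open two connections and show that write transactions are serialised."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None: we issue BEGIN/COMMIT ourselves.
        # timeout=0: fail immediately instead of waiting for the lock.
        a = sqlite3.connect(path, isolation_level=None, timeout=0)
        b = sqlite3.connect(path, isolation_level=None, timeout=0)
        try:
            a.execute("CREATE TABLE t (x INTEGER)")
            a.execute("BEGIN IMMEDIATE")          # A takes the write lock
            a.execute("INSERT INTO t VALUES (1)")
            try:
                b.execute("BEGIN IMMEDIATE")      # B must wait or fail
                blocked = False
                b.execute("ROLLBACK")
            except sqlite3.OperationalError:
                blocked = True
            a.execute("COMMIT")                   # A releases the lock
            b.execute("BEGIN IMMEDIATE")          # now B can proceed
            b.execute("INSERT INTO t VALUES (2)")
            b.execute("COMMIT")
            rows = [r[0] for r in a.execute("SELECT x FROM t ORDER BY x")]
            return blocked, rows
        finally:
            a.close()
            b.close()
```

The second writer is refused while the first transaction is open, and succeeds once it commits. Both rows end up stored correctly, but the writers ran one after the other rather than concurrently, which is the performance cost that weaker default isolation settings are trying to avoid.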
