The issue with storing your database in memory is that if you have any sort of memory problem, or the server has to be restarted, or anything of that kind, everything held in memory gets flushed.
That is the reason people don't store their database in memory.
Now, there are in-memory caching tools, such as memcached, that can work as a very simple database; that may meet your needs. If you look into tmpfs and ramfs, you can create a folder that exists in memory and move your files into it normally.
So, whether you are working with MongoDB, MySQL, or whatever else, you can have the data folder live in the RAM-backed folder. This will give the database super fast reads and writes; everything will be really fast. You will be limited to the amount of RAM you have, minus what the OS and everything else running consumes.
Also, just be careful: MongoDB likes to hold writes in memory until the disk has a chance to catch up, so you may want to turn that feature off - with the data directory already in RAM, the buffering gains nothing and the speed will be the same either way.
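If you do keep durability in the picture, the acknowledgment behavior is usually tunable per write. Here is a hedged sketch assuming the pymongo driver; the database and collection names are placeholders:

```python
from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient("localhost", 27017)
db = client["mydb"]  # placeholder database name

# j=True: the server acknowledges the write only after it is committed to
# the on-disk journal. With the data directory on tmpfs the "disk" is RAM
# anyway, so this setting buys nothing extra there.
durable = db.get_collection("events", write_concern=WriteConcern(j=True))

# j=False: acknowledge as soon as the write is applied in memory.
fast = db.get_collection("events", write_concern=WriteConcern(j=False))

durable.insert_one({"msg": "journaled before ack"})
fast.insert_one({"msg": "acknowledged from memory"})
```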
My recommendation is to use memcached in combination with a normal database that lives on disk. The same concept is used for PHP sessions on some systems.
http://mickeyben.com/2009/12/30/using-nginx-as-a-load-balancer.html
The basic way it works is: if your record is found in memcached, the database is never checked. If it is not found, then do three things (see the sketch after this list):
- check the db
- send the data to memcached
- send the data to the calling function
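In code, the lookup path looks roughly like this - a minimal sketch assuming the pymemcache client, memcached running on localhost, and a hypothetical fetch_user_from_db() standing in for your real database query:

```python
import json

from pymemcache.client.base import Client

cache = Client(("localhost", 11211))
CACHE_TTL = 300  # seconds; tune to how stale your data is allowed to get

def fetch_user_from_db(user_id):
    """Hypothetical stand-in for the real database query."""
    return {"id": user_id, "name": "example"}

def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: the database is never touched
    record = fetch_user_from_db(user_id)                  # 1. check the db
    cache.set(key, json.dumps(record), expire=CACHE_TTL)  # 2. send it to memcached
    return record                                         # 3. return it to the caller
```

One thing this sketch glosses over is invalidation: whenever the record changes in the database, you must delete or overwrite the cached copy, or readers will keep seeing stale data until the TTL expires.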
:)
Oracle is a SQL DBMS, not a truly relational database: it implements a variant of SQL as its logical model of data. Its architecture was developed in the late 1970s along the same lines as IBM's System R, an early implementation of a DBMS based on the relational model using SQL as the data sub-language. This short background is necessary to understand that SQL and Oracle are not the same thing as "relational". The relational model, as defined by Codd and further developed by researchers like Date, is a purely logical model of data: data is presented as relations, a relational algebra is defined to manipulate those relations, and a data integrity component makes it possible for the DBMS to keep data consistent with its real-world intent. The relational model is mute on how the DBMS implements it. Therefore, when a given SQL DBMS handles a use case poorly purely for performance reasons, the issue lies with the implementation, not with the relational model.
Given this, I suspect NoSQL solutions are sometimes recommended over SQL DBMSs for time series analysis because time series analysis is a very narrow use case, while the SQL DBMS architecture was aimed at more generalized online transaction processing (OLTP) use cases. I know little about time series analysis, but I do recognize that it is purely analytical and not OLTP, so transaction support - a mainstay of SQL DBMSs but orthogonal to the relational model - is pure overhead in such a use case. I recall Michael Stonebraker, a few years ago, discussing time series analysis and arguing that the solution was to store data in arrays, not rows. Since the major SQL DBMSs are row stores, that may be another reason why other solutions are recommended.
I would caution against diving right into a NoSQL solution. First, these systems are much less mature than traditional DBMSs. Second, time series analysis is, I believe, statistics-heavy, and you will likely have to build that yourself on top of a NoSQL solution, whereas a mature SQL DBMS like Oracle may have built-in statistical features that are much easier to use. Third, SQL is, despite its flaws, a complete query language that gives you the power to write queries of arbitrary complexity, while most NoSQL solutions require you to write programs to perform the analysis you want.

Finally, and perhaps most importantly, to get any useful information out of your time series data you will likely need to "enrich" it with other related data. For example, I work for an electric utility, and in this business a huge amount of time series data on how much power was used in an interval of time isn't very useful unless you can correlate it with weather, demographics, and so on. A SQL DBMS, precisely because it is a generalized data management solution, makes that easy: you can place the time series data in the same database as the enriching data and have the full power of SQL to join and analyze it. With a NoSQL solution you will have to perform the enrichment yourself as an additional step - potentially extracting, transforming, and loading the data from the very SQL DBMS that wasn't used to store the time series data in the first place! Writing those ETL programs is a lot of extra work, and you have to decide, at the time you write them, which data will be useful for analysis; if you later discover you missed something, you have to write more programs. If instead you place the time series data in Oracle right alongside all your other data, it is already in place and ready for analysis the moment you discover a need.
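To make the enrichment point concrete, here is a hedged sketch - it uses Python's built-in sqlite3 purely as a stand-in for a full SQL DBMS, and the tables and values are invented - showing that once the time series lives next to the enriching data, enrichment is just a join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for any SQL DBMS
conn.executescript("""
    CREATE TABLE readings (meter_id INTEGER, ts TEXT, kwh REAL);
    CREATE TABLE weather  (ts TEXT, temp_f REAL);
    INSERT INTO readings VALUES (1, '2013-07-01 14:00', 3.2),
                                (1, '2013-07-01 15:00', 4.1);
    INSERT INTO weather  VALUES ('2013-07-01 14:00', 88.0),
                                ('2013-07-01 15:00', 93.0);
""")

# Correlating power usage with weather is a single query - no separate
# ETL step, because everything is in the same database.
rows = conn.execute("""
    SELECT r.ts, r.kwh, w.temp_f
    FROM readings r
    JOIN weather w ON w.ts = r.ts
    WHERE r.meter_id = 1
""").fetchall()
print(rows)  # [('2013-07-01 14:00', 3.2, 88.0), ('2013-07-01 15:00', 4.1, 93.0)]
```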
Bottom line, I would say that unless you can prove you have so much data coming in, so fast, that it exceeds the capacity of your existing SQL DBMS installation - and you have the time and the skill sets necessary to write the additional infrastructure on top of a NoSQL solution (assuming, of course, that the NoSQL solution you choose can actually scale to that data volume and velocity) - you are better off sticking with the SQL DBMS.
Though this is a years-old question...
In short, you can understand ACID as a guarantee of data integrity/safety under any expected circumstances. As in generic programming, all the headaches come from multi-threading.
The biggest issue with NoSQL is mostly ACI; D(urability) is usually a separate issue.
If your DB is single-threaded - so only one user can access it at a time - it is natively ACI compliant. But I am sure virtually no server can afford this luxury.
If your DB needs to be multi-threaded - serving multiple users/clients simultaneously - you need ACI-compliant transactions, or you will get silent data corruption rather than simple data loss, which is a lot more horrible. Simply put, this is exactly the same as generic multi-threaded programming: if you don't have a proper mechanism such as a lock, you will get undefined data. In a DB, that mechanism is full ACID compliance.
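To illustrate what that mechanism buys you, here is a minimal sketch using Python's built-in sqlite3 (the table and the simulated failure are invented for the example): two rows are updated inside one transaction, and a crash between the writes rolls both back instead of leaving the data half-updated:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()

def transfer(conn, amount, fail=False):
    # "with conn" opens a transaction: it commits on success and
    # rolls back if anything raises inside the block.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = 1", (amount,))
        if fail:
            raise RuntimeError("simulated crash between the two writes")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = 2", (amount,))

try:
    transfer(conn, 50, fail=True)
except RuntimeError:
    pass

# Neither write is visible: the failed transfer was rolled back atomically.
print(conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall())
# -> [(1, 100), (2, 0)]
```

A DB with only single-row guarantees has no equivalent of that rollback across rows, which is exactly where the silent corruption comes from.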
Many YesSQL/NoSQL databases advertise themselves as ACID-compliant, but actually very few of them really are.
No ACID compliance = you will always get undefined results in a multi-user (client) environment. I can't even think of a DB that does this.
Single row/key ACID compliance = you will get a guaranteed result if you modify only a single value at a time, but an undefined result (= silent data corruption) for simultaneous multi-row/key updates. This covers most of the currently popular NoSQL DBs, including Cassandra, MongoDB, CouchDB, … These DBs are safe only for single-row transactions, so you need to guarantee your DB logic won't touch multiple rows in one transaction.
Multi-row/key ACID compliance = you will always get a guaranteed result for any operation. This is the minimal requirement for an RDBMS. In the NoSQL field, very few do this: Spanner, MarkLogic, VoltDB, FoundationDB. I am not even sure there are more. These DBs are quite fresh and new, so little is known about their abilities or limitations.
Anyway, the above comparison leaves out D(urability), so don't forget to check the durability attributes too. Durability is very hard to compare because the range is so wide, and I don't know this topic well…
No durability: you can lose data at any time.
Safely stored on disk: when you get COMMIT OK, the data is guaranteed to be on disk. You lose data only if the disk breaks.

Also, there are differences even among ACID-compliant DBs.
Sometimes ACID compliant / you need configuration / no automatic anything / some components are not ACID-compliant / very fast but you need to turn something off first / ACID-compliant only if you use a specific module... = we will not bundle data safety by default. That's an add-on, an option, or sold separately. Don't forget to download, assemble, set up, and issue the proper commands, because data safety may be silently ignored otherwise (see the sketch after this list). Do it yourself; check it yourself; good luck not making any mistakes. Everyone on your team must be a flawless DBA to use this kind of DB safely. MySQL.
Always ACID compliant = we don't trade data safety for performance or anything else; data safety is a forced bundle with this DB package. Most commercial RDBMSs, PostgreSQL.
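As a small sketch of how durability can quietly become a configuration knob, here is SQLite's synchronous pragma (a real setting, chosen purely as an illustration; other DBs expose similar fsync/flush options under different names):

```python
import sqlite3

conn = sqlite3.connect("example.db")

# FULL (the default): SQLite fsyncs at the critical moments, so a COMMIT
# that returns OK really is on disk.
conn.execute("PRAGMA synchronous = FULL")

# The "very fast but you had to turn something off" mode: COMMIT returns
# before the OS flushes to disk, so an OS crash or power failure can
# silently lose recently committed data.
conn.execute("PRAGMA synchronous = OFF")
```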
The above describes a typical DB's implementation. But still, other hardware failures may corrupt the database - memory errors, data channel errors, or any other possible faults - so you need extra redundancy, and a real production-quality DB must offer fault-tolerance features.
No redundancy: you lose all the data if it gets corrupted.
Backup: you make snapshot copies and restore from them. You lose whatever was written after the last backup.
Online backup: you can take a snapshot backup while the database is running.
Asynchronous replication: a copy is updated every second (or at some specified interval), so if the machine goes down the DB is guaranteed to get the data back just by restarting. You lose at most the last second (or interval) of data.
Synchronous replication: the copy is updated immediately on each data update, so you always have an exact copy of the original data and can switch to the copy if the original breaks.
So far, I have seen many DB implementations that lack many of these features. And I think that if they lack proper ACID and redundancy support, their users will lose data eventually.