Looking at the feature availability list at http://dev.mysql.com/doc/refman/5.1/en/memory-storage-engine.html two possible problems jump out:
- No transaction or foreign key support, meaning you will have to manage transactional integrity and referential integrity in your own code where needed (which could end up being far less efficient than letting the DB do this for you, though that depends heavily on your app's expected behaviour patterns).
- Table-level locking only: this could be a significant barrier to scalability if your app needs multiple concurrent writers to the same set of tables, or if your read operations use locks to ensure consistent reads. In such cases a disk-based table that supports much finer lock granularity will perform far better, provided enough of its content is currently cached in RAM.
Other than that, assuming you have enough RAM, a memory-based table should be faster than a disk-based one. You obviously need to factor in taking snapshots to disk to cover the case where the server instance is restarted, and that is likely to negate the performance benefit entirely if the data needs capturing often. (If you can live with losing a day of data in such an event you could just take a backup once per day, but in most cases that would not be acceptable.)
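One way to sketch such a snapshot (table names `session_cache` and `session_cache_snapshot` are hypothetical; run this periodically from cron or the event scheduler):

```sql
-- Persist a MEMORY table into a disk-based InnoDB copy.
-- The MEMORY source itself is non-transactional, but wrapping the
-- InnoDB writes in a transaction keeps the snapshot atomic on disk.
CREATE TABLE IF NOT EXISTS session_cache_snapshot LIKE session_cache;
ALTER TABLE session_cache_snapshot ENGINE = InnoDB;

START TRANSACTION;
DELETE FROM session_cache_snapshot;
INSERT INTO session_cache_snapshot SELECT * FROM session_cache;
COMMIT;
```

How often you run this is exactly the trade-off described above: frequent snapshots erode the speed advantage, infrequent ones widen the data-loss window.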
An alternative might be to:
- Use disk-based tables, but ensure that you have more than enough RAM to hold them all in memory at any given time (and "enough RAM" might be more than you think, as you need to account for any other processes on the machine, OS I/O buffers/cache, and so forth)
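For InnoDB, "holding the tables in RAM" mostly comes down to sizing the buffer pool. A minimal my.cnf sketch (the 12G figure is an arbitrary assumption; pick a value that covers your data and index pages with headroom to spare):

```ini
# my.cnf (sketch): size the InnoDB buffer pool to hold all data and
# index pages, leaving room for the OS and other processes on the box.
[mysqld]
innodb_buffer_pool_size = 12G
```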
- Scan the entire contents (all data and index pages) of each table on startup to preload the content into memory, with

```sql
SELECT * FROM <table> ORDER BY <pkey fields>;
```

for each table, followed by

```sql
SELECT <indexed fields> FROM <table> ORDER BY <index fields>;
```

for each index.
This way all your data is in RAM, and you only have to worry about I/O performance for write operations. If your app's common working set is much smaller than the whole DB (which is usually the case; in most applications most users will only be looking at the most recent data most of the time) you might be better off being more selective about how much you scan to preload into memory, allowing the rest to be loaded from disk on demand.
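A sketch of generating those per-table preload statements automatically from `information_schema` (the schema name `mydb` is an assumption; the generated statements still need to be executed, e.g. by feeding the output back through the `mysql` client):

```sql
-- Emit one full-scan SELECT per base table, ordered by its primary key,
-- matching the preload pattern above.
SELECT CONCAT('SELECT * FROM `', t.table_name, '` ORDER BY ',
              GROUP_CONCAT(CONCAT('`', k.column_name, '`')
                           ORDER BY k.ordinal_position),
              ';') AS preload_stmt
FROM information_schema.tables t
JOIN information_schema.key_column_usage k
  ON k.table_schema    = t.table_schema
 AND k.table_name      = t.table_name
 AND k.constraint_name = 'PRIMARY'
WHERE t.table_schema = 'mydb'
  AND t.table_type   = 'BASE TABLE'
GROUP BY t.table_name;
```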
I find it interesting that Percona says that HandlerSocket is not that popular. In fact, it has bugs where write operations interfere with open HANDLER structures.
What I also find appalling is that the concept behind HandlerSocket (known back then as HANDLER) goes way back to MySQL 4.0.3, for both MyISAM and InnoDB. There were only provisions for reading at that time.
The basic usage of HANDLER is also covered in the book MySQL Reference Manual: Documentation From the Source, Section 6.4.2, pages 512-513 (I have the book right here next to me).
The question remains: what purpose does HandlerSocket serve ???
Imagine Facebook engineers trying to read a value in a MyISAM table (Facebook hates MyISAM, BTW) that is heavily trafficked. We all know that MyISAM performs a full table lock for any DML or SELECT performed. Suppose you want to adjust a single column in a specific row in that table (via SQL), but you need to read that data right now, before SELECT queries lock it down. Do you want to wade through the huge number of DB connections hitting the one table you need just to change a single value? With the HANDLER syntax you can bypass those waiting SELECTs with no locking. You can do the same with InnoDB tables and bypass MVCC (Multiversion Concurrency Control) and row-level locks as well.
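A minimal sketch of that HANDLER interface (the table `users` and its key values are hypothetical):

```sql
-- HANDLER reads bypass the usual locking / MVCC layers, so they can
-- return rows mid-modification -- use with care.
HANDLER users OPEN;
HANDLER users READ `PRIMARY` = (12345);   -- fetch the row with key 12345
HANDLER users READ `PRIMARY` NEXT;        -- step to the next row in index order
HANDLER users CLOSE;
```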
Only in a high-traffic, high-read environment would you need to take such a risk, especially if finding out the data RIGHT NOW was the highest priority.
If you really want to know which production environments actually use it, I would ask Percona directly in the blog comments, begging for that answer. Since I have never seen code examples where HANDLER (today HandlerSocket) does anything other than reading, it would only serve the purposes of big MySQL installations like Facebook.
CAVEAT
When you look at the syntax for opening tables, opening indexes and traversing rows, it resembles some programming languages I coded in back in the late 1980s.
I am referring to dBASE, Clipper, FoxPro and Visual FoxPro. dBASE was the grandfather of all these products. Now, before you start laughing and falling out of your chair, dBASE STILL EXISTS AND SO DOES VISUAL FOXPRO !!! I personally know one person who still codes in FoxPro and Visual FoxPro !!! OK, now you can fall out of your chair, crying in the fetal position. Still don't believe me ??? How about some sample code ???
Enough ranting on old data-handling programming languages that resemble hieroglyphics !!!
Best Answer
If you are learning MySQL, skipping the MEMORY storage engine is not a bad idea. InnoDB is the default storage engine and it is suitable for most cases. Only in edge cases does it make sense to use another one, and you probably shouldn't consider this option before knowing InnoDB quite well.
However, to answer your question, let's first check the pros and cons of MEMORY.
Cons:

- All data is stored in RAM only, so it is lost when the server restarts or crashes.
- `HASH` indexes, so there is no optimisation for queries like `WHERE col > n`, `ORDER BY` or `GROUP BY`.

Pros:

- All data is stored in RAM, so both reads and writes are fast.
So the question becomes, which use cases don't suffer because of the cons and take advantage of the pros? Well, a first obvious generic answer is, avoid all use cases where data loss is a problem. More specifically, you could consider MEMORY for:

- `TEMPORARY` tables, when you don't need `BTREE` indexes; if you do need `BTREE` indexes, this is not a good option.
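As a concrete sketch of that TEMPORARY-table use case (the table and column names are made up):

```sql
-- Hypothetical scratch table: a HASH index suits exact-match lookups,
-- and the data is disposable, so losing it on restart is acceptable.
CREATE TEMPORARY TABLE lookup_cache (
    id  INT NOT NULL,
    val VARCHAR(64),
    PRIMARY KEY (id) USING HASH
) ENGINE = MEMORY;

INSERT INTO lookup_cache VALUES (1, 'alpha'), (2, 'beta');
SELECT val FROM lookup_cache WHERE id = 2;  -- fast exact-match lookup
```

A range query such as `WHERE id > 1` would not benefit from the HASH index, which is exactly the con listed above.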