OBSERVATION #1
Performance problems will quickly become apparent if the index pages of monitor.result are subject to two things:
- 1) Swapping of the RAM that makes up the MyISAM Key Cache
- 2) Rotation in and out of the MyISAM Key Cache (sized by key_buffer_size)
The first is virtually unavoidable. As for the second, needed index pages can be purged out of the MyISAM Key Cache in the face of more recent queries against other MyISAM tables; those pages must then be brought back by querying the corresponding table again. Together, the two can make for slower-than-expected queries even on relatively small tables.
However, you can minimize or neutralize any ill effects of swapping by assigning the index of monitor.result to a dedicated MyISAM Key Cache, a cache that will contain only index pages from monitor.result.
How do you do that ???
First, recall that you mentioned the index usage for monitor.result was 2.29MB. You can create a dedicated key cache of that size with a little headroom, say 2.5MB. Create that key cache like this:
SET GLOBAL monitor_result_private_cache.key_buffer_size = 1024 * 512 * 5;
CACHE INDEX monitor.result IN monitor_result_private_cache;
LOAD INDEX INTO CACHE monitor.result;
This performs the following steps:
- SET GLOBAL ... key_buffer_size creates the Key Cache
- CACHE INDEX assigns the MyISAM table's index to that Key Cache
- LOAD INDEX INTO CACHE preloads the table's index pages into the assigned cache
This conveniently keeps the index pages of that table from ever leaving the cache. The only way index pages would leave is if INSERTs into monitor.result grow the index beyond 2.5MB, so be sure to choose enough headroom to accommodate many INSERTs into monitor.result.
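As a quick sanity check on the arithmetic, here is a short Python sketch (the 2.29MB index size comes from the question; the 2.5MB headroom figure is the suggestion above):

```python
# The key_buffer_size expression used above, evaluated in bytes.
cache_bytes = 1024 * 512 * 5
cache_mb = cache_bytes / (1024 * 1024)

index_mb = 2.29                 # reported index usage of monitor.result
headroom_mb = cache_mb - index_mb

print(cache_bytes)              # 2621440
print(cache_mb)                 # 2.5
print(round(headroom_mb, 2))    # 0.21
```

As the sketch shows, 0.21MB of headroom is not much; size up the multiplier if the table is INSERT-heavy.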
OBSERVATION #2
I also noticed the index you laid out for monitor.result:
PRIMARY KEY (`server`, `request`, `recorded`, `ref_id`)
If any of your queries against monitor.result resemble something like this:
SELECT resolution,value FROM monitor.result
WHERE server = 200 AND ref_id = 50 AND ... ;
you can speed up such queries by reordering the PRIMARY KEY:
CREATE TABLE IF NOT EXISTS `monitor`.`result` (
`server` SMALLINT UNSIGNED NOT NULL ,
`ref_id` SMALLINT UNSIGNED NOT NULL ,
`request` TINYINT UNSIGNED NOT NULL ,
`recorded` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ,
`resolution` TINYINT NOT NULL ,
`value` MEDIUMINT UNSIGNED NOT NULL ,
PRIMARY KEY (`server`, `ref_id`, `request`, `recorded`) )
ENGINE = MyISAM
or by adding a UNIQUE index:
CREATE TABLE IF NOT EXISTS `monitor`.`result` (
`server` SMALLINT UNSIGNED NOT NULL ,
`ref_id` SMALLINT UNSIGNED NOT NULL ,
`request` TINYINT UNSIGNED NOT NULL ,
`recorded` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ,
`resolution` TINYINT NOT NULL ,
`value` MEDIUMINT UNSIGNED NOT NULL ,
PRIMARY KEY (`server`, `request`, `recorded`, `ref_id`),
UNIQUE KEY uniqndx1 (`server`, `ref_id`, `request`, `recorded`) )
ENGINE = MyISAM
If you add the UNIQUE index, you must double the size of the dedicated key cache accordingly (to roughly 5MB, since it must now hold both indexes).
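To see why the column order matters, here is a rough Python sketch of MySQL's leftmost-prefix rule (the column lists mirror the two PRIMARY KEY layouts above; `usable_prefix` is a hypothetical helper for illustration, not a MySQL API):

```python
def usable_prefix(index_cols, where_cols):
    """Leading run of index columns that equality predicates can use."""
    prefix = []
    for col in index_cols:
        if col not in where_cols:
            break           # index usage stops at the first unmatched column
        prefix.append(col)
    return prefix

where = {"server", "ref_id"}   # WHERE server = 200 AND ref_id = 50

original_key = ["server", "request", "recorded", "ref_id"]
reordered_key = ["server", "ref_id", "request", "recorded"]

print(usable_prefix(original_key, where))   # ['server'] - only one column usable
print(usable_prefix(reordered_key, where))  # ['server', 'ref_id'] - both usable
```

With the original key order, the optimizer can narrow the search by server only; with the reordered key, it narrows by server and ref_id.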
OBSERVATION #3
You mentioned a SATA drive. Good choice for archival, low-update historical data. Any MyISAM table on a SATA drive with a dedicated key cache should not be slowed by index lookups, but the data-retrieval time from the drive is something you would have to benchmark to see whether you can live with the running times.
Here is an alternative: create an index that contains every column:
CREATE TABLE IF NOT EXISTS `monitor`.`result` (
`server` SMALLINT UNSIGNED NOT NULL ,
`ref_id` SMALLINT UNSIGNED NOT NULL ,
`request` TINYINT UNSIGNED NOT NULL ,
`recorded` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ,
`resolution` TINYINT NOT NULL ,
`value` MEDIUMINT UNSIGNED NOT NULL ,
PRIMARY KEY (`server`, `ref_id`, `request`, `recorded`, `resolution`, `value`) )
ENGINE = MyISAM
What does this do? It allows entire rows to be retrieved strictly from the index. Combining this with a dedicated key cache, you essentially have the whole table in RAM. All queries would be fulfilled by the index and never touch the table, REGARDLESS of the table being on SAS, SATA, SSD, or even stone.
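The "never touch the table" claim is simply the covering-index property: every column the query references is present in the index. A small Python sketch of that check (`index_covers` is a hypothetical helper, not MySQL code):

```python
def index_covers(index_cols, query_cols):
    """True when every column the query touches is in the index."""
    return set(query_cols) <= set(index_cols)

# The all-columns PRIMARY KEY from the CREATE TABLE above.
all_columns_key = ["server", "ref_id", "request", "recorded",
                   "resolution", "value"]

# SELECT resolution, value ... WHERE server = ... AND ref_id = ...
query_cols = ["server", "ref_id", "resolution", "value"]

print(index_covers(all_columns_key, query_cols))                    # True
print(index_covers(["server", "request", "recorded", "ref_id"],
                   query_cols))                                     # False
```

The four-column key cannot serve resolution and value from the index alone, so those lookups hit the data file on disk; the all-columns key never does.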
UPDATE 2012-01-26 18:18 EDT
Question 1: You may want to look into memcached. I believe there is a version of InnoDB with a memcached plugin. At least, that's what I heard.
Question 2: InnoDB is for transactional tables. If you have archive data, compressed MyISAM tables should fill the bill. In fact, you could also look into the ARCHIVE storage engine.
Question 3: Storing an index on disk (MyISAM, InnoDB) is standard and cannot be changed. You have to use special commands or run special queries to preload caches.
Question 4: RAID-10 for heavy writes, SSD for heavy reads. Watch your disk surface temperatures !!!
Question 5: If the table strictly holds historical info, there is no need for overkill. As long as it is rarely read, it needs no special caching considerations.
I am not very familiar with your needs, but perhaps storing each data point in the database is a bit of overkill. It sounds almost like storing an image library by storing each pixel as a separate record in a relational database.
As a general rule, storing binary data in databases is wrong most of the time. There is usually a better way of solving the problem. While it is not inherently wrong to store binary data in a relational database, oftentimes the disadvantages outweigh the gains. Relational databases, as the name alludes to, are best suited for storing relational data. Binary data is not relational. It adds size (often significantly) to databases, can hurt performance, and may lead to questions about maintaining billion-record MySQL instances. The good news is that there are databases especially well suited for storing binary data. One of them, while not always readily apparent, is your file system! Simply come up with a directory and file naming structure for your binary files, then store those paths in your MySQL DB together with any other data which may yield value through querying.
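As a sketch of that file-system approach (all names and the directory layout are made up for illustration; sqlite3 stands in for MySQL so the example is self-contained):

```python
import hashlib
import os
import sqlite3
import tempfile

root = tempfile.mkdtemp()              # stand-in for the blob store root
db = sqlite3.connect(":memory:")       # stand-in for the MySQL metadata DB
db.execute("CREATE TABLE blobs (id INTEGER PRIMARY KEY, path TEXT, sha256 TEXT)")

def store_blob(blob_id, data):
    """Write binary data to the file system; record only path + hash in the DB."""
    # Fan out into 256 subdirectories so no single directory grows huge.
    sub = os.path.join(root, "%02x" % (blob_id % 256))
    os.makedirs(sub, exist_ok=True)
    path = os.path.join(sub, "%d.bin" % blob_id)
    with open(path, "wb") as f:
        f.write(data)
    db.execute("INSERT INTO blobs VALUES (?, ?, ?)",
               (blob_id, path, hashlib.sha256(data).hexdigest()))
    return path

store_blob(42, b"raw datapoints")
(stored_path,) = db.execute("SELECT path FROM blobs WHERE id = 42").fetchone()
with open(stored_path, "rb") as f:
    print(f.read() == b"raw datapoints")   # True
```

The database stays small and queryable (id, path, checksum, plus whatever metadata you like), while the bulky payloads live where file systems excel.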
Another approach would be using a document-based storage system for your datapoints (and perhaps spectra) data, and using MySQL for the runs (or perhaps putting the runs into the same DB as the others).
Best Answer
The problem here is the "large number of queries", about which you likely have a different idea than I do (I would say: sure, but you need a lot of hardware for 30,000 queries per second).
Generally otherwise - yes, this is possible.
Lots of memory needed ;) keep it all in memory.
Or get a proper disc subsystem. Seriously, your "regular hard disc" is dead slow. Terribly slow. SSDs are a good, low-cost alternative (roughly 550 times as fast as your hard disc in random access). Unless you add enough RAM to keep everything in memory, your disc IO is the typical bottleneck... and SSDs are the cheapest solution. Unless you REALLY up your RAM, you simply have no alternative but to get some decent IO capability; an end-user-level hard disc (i.e. not a SAS 15k high-performance disc) just won't get you far.