All one has to do is run these queries:
MyISAM
SELECT CONCAT(ROUND(KBS/POWER(1024,
       IF(PowerOf1024<0,0,IF(PowerOf1024>3,0,PowerOf1024)))+0.4999),
       SUBSTR(' KMG',IF(PowerOf1024<0,0,
       IF(PowerOf1024>3,0,PowerOf1024))+1,1)) recommended_key_buffer_size
FROM (SELECT LEAST(POWER(2,32),KBS1) KBS
      FROM (SELECT SUM(index_length) KBS1
            FROM information_schema.tables
            WHERE engine='MyISAM'
            AND table_schema NOT IN ('information_schema','mysql')) AA) A,
     (SELECT 2 PowerOf1024) B;
InnoDB
SELECT CONCAT(ROUND(KBS/POWER(1024,
       IF(PowerOf1024<0,0,IF(PowerOf1024>3,0,PowerOf1024)))+0.49999),
       SUBSTR(' KMG',IF(PowerOf1024<0,0,
       IF(PowerOf1024>3,0,PowerOf1024))+1,1)) recommended_innodb_buffer_pool_size
FROM (SELECT SUM(data_length+index_length) KBS
      FROM information_schema.tables
      WHERE engine='InnoDB') A,
     (SELECT 2 PowerOf1024) B;
These will recommend the right sizes for the MyISAM Key Cache (based on existing MyISAM indexes) and the InnoDB Buffer Pool (based on InnoDB data and indexes).
While these SQL queries recommend buffer sizes, the human factor must still kick in. We must plan data usage based on hardware and frequently accessed data. We must also ask: Of the amount of buffer recommended, how much of my working set will actually reside in memory?
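To gauge how much of the working set actually stays resident, one rough check (assuming InnoDB and the standard status counters) is:

```sql
-- Logical reads vs. reads that had to go to disk:
-- a high Innodb_buffer_pool_reads relative to
-- Innodb_buffer_pool_read_requests suggests the working set
-- does not fit in the buffer pool.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```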
If you configure MySQL to use 75% of RAM for InnoDB, even on a dedicated MySQL server, the OS can get busy paging to disk. Just because MySQL can accept huge numbers for buffer sizes does not mean it's OK to push the limits. You have two basic storage engines to consider.
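As a sketch only, the recommended numbers would normally go in my.cnf rather than being pushed to the limit. The values below are illustrative, not the recommendation itself:

```ini
[mysqld]
# Size from the InnoDB query above, leaving headroom for the OS
innodb_buffer_pool_size = 8G
# Size from the MyISAM query above (which caps at 4G)
key_buffer_size = 2G
```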
Use of commodity hardware merely demonstrates MySQL's configurability with noticeable results. Unless you are using 32-bit RHEL (if you are, please stop using it right now, get plenty of coffee, and go upgrade), raising buffer sizes on big storage beasts can generally be trusted.
For example, my employer has a client with 3 DB servers dedicated to MySQL only. Each has 192GB of RAM, of which 162GB is the InnoDB Buffer Pool. The working set is actually 170GB. While there is a little swapping and paging going on, the DB performance is astounding. The OS has about 30GB of RAM to itself (including a 16GB RAM disk for temp tables). I am sure that these servers are just little lizards compared to the komodo dragon-sized DB servers out there. MySQL can be up to the task if configured properly.
Keep in mind that InnoDB puts its own checks and balances in place because the current source code limits the combined size of the InnoDB log files to 4GB. This is the case because InnoDB was originally designed with commodity hardware in mind. In light of this, ACID transaction throughput could possibly bottleneck there regardless of the storage beast's hardware setup and the version of MySQL you choose to run.
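You can check how close a given server sits to that ceiling like this:

```sql
-- Combined redo log size = innodb_log_file_size * innodb_log_files_in_group
SHOW VARIABLES LIKE 'innodb_log_file%';
```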
CONCLUSION
It's OK to push limits on Big Beefy DB Servers. You should always do so with consideration to the OS in terms of Memory, Swap Space, Working Dataset, and number of transactions expected.
OBSERVATION #1
Performance implications should become quickly apparent if the index pages of monitor.result experience two things:
- Experience 1: Swapping of the RAM making up the MyISAM Key Cache
- Experience 2: Rotation in and out of the MyISAM Key Cache (sized by key_buffer_size)
Experience 1 is virtually unavoidable. As for Experience 2, it could result in needed index pages being purged from the MyISAM Key Cache in the face of more recent queries against other MyISAM tables. Those needed index pages must then be brought back by querying the corresponding table. The two experiences together could make for a slower-than-expected query on relatively small tables.
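Experience 2 can be spotted from the key cache counters (this check assumes the default key cache):

```sql
-- Miss ratio = Key_reads / Key_read_requests.
-- If Key_reads climbs quickly, index pages are being rotated
-- out of the key cache and re-read from disk.
SHOW GLOBAL STATUS LIKE 'Key_read%';
```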
However, you could minimize or neutralize any ill effects of swapping by assigning the index of monitor.result to a dedicated MyISAM Key Cache. It will be a cache that contains only index pages from monitor.result.
How do you do that ???
First, recall that you mentioned the index usage for monitor.result was 2.29MB. You can create that dedicated key cache at that size with a little headroom, say 2.5MB. Let us create that key cache like this:
SET GLOBAL monitor_result_private_cache.key_buffer_size = 1024 * 512 * 5; -- 2.5M
CACHE INDEX monitor.result IN monitor_result_private_cache;
LOAD INDEX INTO CACHE monitor.result;
This will perform the following steps:
- Create the Key Cache (SET GLOBAL ... .key_buffer_size)
- Assign the Key Cache to the MyISAM table using CACHE INDEX
- Load the index pages of the assigned table into its cache using LOAD INDEX INTO CACHE
It would conveniently keep the index pages of that table from ever leaving the cache. The only way index pages would leave is if INSERTs into monitor.result grow the index beyond 2.5MB. You must choose enough headroom to accommodate many INSERTs into monitor.result.
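To make the dedicated cache survive a mysqld restart, the same settings can live in my.cnf (the init file path below is illustrative):

```ini
[mysqld]
# 2.5M = 1024 * 512 * 5
monitor_result_private_cache.key_buffer_size = 2621440
# Re-run the assignment and preload at every startup
init_file = /var/lib/mysql/init.sql
```

where /var/lib/mysql/init.sql contains the CACHE INDEX and LOAD INDEX INTO CACHE statements shown earlier.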
OBSERVATION #2
I also noticed the index you laid out for monitor.result:
PRIMARY KEY (`server`, `request`, `recorded`, `ref_id`)
If any of your queries against monitor.result resemble something like this:
SELECT resolution,value FROM monitor.result
WHERE server = 200 AND ref_id = 50 AND ... ;
you can speed up such queries by reordering the PRIMARY KEY:
CREATE TABLE IF NOT EXISTS `monitor`.`result` (
`server` SMALLINT UNSIGNED NOT NULL ,
`ref_id` SMALLINT UNSIGNED NOT NULL ,
`request` TINYINT UNSIGNED NOT NULL ,
`recorded` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ,
`resolution` TINYINT NOT NULL ,
`value` MEDIUMINT UNSIGNED NOT NULL ,
PRIMARY KEY (`server`, `ref_id`, `request`, `recorded`) )
ENGINE = MyISAM
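You can confirm the reordered key is actually being used with EXPLAIN (the literal values here are just the ones from the sample query):

```sql
EXPLAIN SELECT resolution, value
FROM monitor.result
WHERE server = 200 AND ref_id = 50;
-- With (server, ref_id, ...) leading the PRIMARY KEY, the plan
-- should show key: PRIMARY with both columns used for the lookup.
```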
or by adding a UNIQUE index:
CREATE TABLE IF NOT EXISTS `monitor`.`result` (
`server` SMALLINT UNSIGNED NOT NULL ,
`ref_id` SMALLINT UNSIGNED NOT NULL ,
`request` TINYINT UNSIGNED NOT NULL ,
`recorded` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ,
`resolution` TINYINT NOT NULL ,
`value` MEDIUMINT UNSIGNED NOT NULL ,
PRIMARY KEY (`server`, `request`, `recorded`, `ref_id`),
UNIQUE KEY uniqndx1 (`server`, `ref_id`, `request`, `recorded`) )
ENGINE = MyISAM
If you add a UNIQUE index, you must double the dedicated key cache size accordingly.
OBSERVATION #3
You mentioned a SATA drive. Good choice for archival, low-update historical data. Any MyISAM table on a SATA drive that has a dedicated key cache should not be bothered by the index lookup, but the data retrieval time from the drive is something you would have to benchmark to see if you can live with the running times.
Here is an alternative:
Create an index that has every column
CREATE TABLE IF NOT EXISTS `monitor`.`result` (
`server` SMALLINT UNSIGNED NOT NULL ,
`ref_id` SMALLINT UNSIGNED NOT NULL ,
`request` TINYINT UNSIGNED NOT NULL ,
`recorded` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ,
`resolution` TINYINT NOT NULL ,
`value` MEDIUMINT UNSIGNED NOT NULL ,
PRIMARY KEY (`server`, `ref_id`, `request`, `recorded`, `resolution`, `value`) )
ENGINE = MyISAM
What does this do? It provides retrieval of entire rows strictly from the index. Combining this with a dedicated key cache, you would essentially have the whole table in RAM. All queries would be fulfilled by the index and never touch the table, REGARDLESS of whether the table sits on SAS, SATA, SSD, or even stone.
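You can verify the index-only behavior with EXPLAIN; look for "Using index" in the Extra column, which means MyISAM never touched the .MYD data file:

```sql
EXPLAIN SELECT server, recorded, resolution, value
FROM monitor.result
WHERE server = 200;
-- Extra: "Using index" => the entire result came from index pages.
```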
UPDATE 2012-01-26 18:18 EDT
Question 1: You may want to look into memcached. I believe there is a version of InnoDB with a memcached plugin. At least, that's what I heard.
Question 2: InnoDB is for transactional tables. If you have archive data, compressed MyISAM tables should fill the bill. In fact, you could look into the ARCHIVE storage engine.
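A minimal sketch of the ARCHIVE conversion (the table name is hypothetical; note that ARCHIVE supports only INSERT and SELECT and no indexes apart from an AUTO_INCREMENT column, so it suits append-only history):

```sql
-- Convert a history table to the ARCHIVE engine.
-- A table carrying a multi-column PRIMARY KEY would need that
-- key dropped first, since ARCHIVE cannot keep it.
ALTER TABLE monitor.result_history ENGINE=ARCHIVE;
```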
Question 3: Storing indexes on disk (MyISAM, InnoDB) is standard and cannot be changed. You have to use special commands or run special queries to preload caches.
Question 4: RAID-10 for high-writes, SSD for high-reads. Watch your disk surface temperatures !!!
Question 5: If the table is strictly for holding historical info, there is no need for overkill. As long as it is rarely read, it needs no special caching considerations.
Best Answer
How many GB are in the tables? It sounds like the RAM might be big enough to cache everything all the time. If that is the case, the disk speed is irrelevant.
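A quick way to answer the "how many GB" question from information_schema (the same source the sizing queries at the top use):

```sql
SELECT table_schema,
       ROUND(SUM(data_length+index_length)/POWER(1024,3),2) total_gb
FROM information_schema.tables
GROUP BY table_schema;
```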
Even if RAM is not big enough, FULLTEXT does not need to hit the disk that much. Think of a query as being controlled by how many disk hits are needed, minus how many were avoided by caching. A rule of thumb: a disk hit on HDD costs 10ms; on SSD, 1ms. If the query needs only a "few" hits, the response time even on HDD will be adequate.
I'm saying that 5K FULLTEXT searches per minute is probably feasible with your proposed configuration, even with 20 updates per second.
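A back-of-envelope check of that claim, assuming one disk hit per search at the rule-of-thumb latencies above:

```sql
SELECT 5000/60        AS searches_per_sec, -- ~83
       (5000/60) * 10 AS hdd_ms_per_sec,   -- ~833ms of HDD time per second: near saturation
       (5000/60) * 1  AS ssd_ms_per_sec;   -- ~83ms on SSD: plenty of headroom
```

So the 5K/min rate works only if most of those hits are avoided by caching, which is exactly why the RAM size matters.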
Every action nominally involving the disk actually occurs in the "buffer_pool" in RAM. So, if the disk must be hit first, the query is necessarily slower, whatever the disk speed.
On the other hand, NVMe is getting cheap, so why not!