Mysql – partially invalidate the MySQL table cache

cachemyisamMySQL

I want to set a fairly large table to MyISAM mode to keep it cached proactively. The table file is approximately 3GB.

After several hours, MySQL invalidates the entire table from cache, while < 0.1% of the table actually changed in the meantime.

Sometimes we need to do ±100,000 SELECT statements very quickly, which causes a cache stampede. When cached, very nice. When not cached, terrible.

What can I do to ensure the table is kept in memory?

Best Answer

The only thing that is definitively cached for MyISAM is an index.

The key_buffer_size variables sets up how large the MyISAM Key Cache will be.

There are two suggestions I can offer

SUGGESTION #1 : Use a Dedicated Key Cache for the Table

Did you know you could create a keycache dedicated to one or more MyISAM tables? Suppose you have a table called mydb.mytable and you want to load the entire mytable.MYI file into it. Here are the steps to set this up:

STEP 01) Get the physical size of the table's .MYI file

Run this query

SELECT index_length MYISize FROM information_schema.tables
WHERE table_schema='mydb' AND table_name='mytable';

Suppose the query returns 1234567890 for MYISize. That's 1.114978 GB = 1177.3757 MB. Round that up to 1280M (that 1.25 GB).

STEP 02) Create the Buffer with that size

SET GLOBAL mykeycache.key_buffer_size = 1280 * 1024 * 1024;

STEP 03) Associate the cache with the indexes for mydb.mytable

CACHE INDEX mydb.mytable INTO mykeycache;

STEP 04) Load the Index Pages from the .MYI into the Dedicated Cache

LOAD INDEX INTO CACHE mykeycache;

That's all. All indexes for mydb.mytable are now in that cache only. Running these commands also removes the same index entries from the general MyISAM keycache.

You could create a startup script for mysql to preload the indexes.

I have written about this subject is past posts

As a sidenote, I would like to suggestion something optional. You could speed up the access time of MyISAM data by changing the ROW_FORMAT of the MyISAM you want. For the same MyISAM table mydb.mytable, here is what you do:

ALTER TABLE mydb.mytable ROW_FORMAT=Fixed;

That's it. This a good and bad side to changing ROW_FORMAT to Fixed

SUGGESTION #2 : Switch the table's Storage Engine to InnoDB

If you switch that table to InnoDB, you set up the InnoDB Buffer Pool in such a way that data and indexes that enter the Buffer Pool will linger a lot longer. This will simulate what your original question is asking: Partially invalidating a cache. That way, mysqldumps, full table scans, and queries like these will have roadblocks to cache stampedes. The options to tweak are innodb_old_blocks_pct and innodb_old_blocks_time.

According to the Configuration Options section of the MySQL Documentation for the InnoDB Buffer Pool:

Setting innodb_old_blocks_time greater than 0 prevents one-time table scans from flooding the new sublist with blocks used only for the scan. Rows in a block read in for a scan are accessed many times in rapid succession, but the block is unused after that. If innodb_old_blocks_time is set to a value greater than time to process the block, the block remains in the “old” sublist and ages to the tail of the list to be evicted quickly. This way, blocks used only for a one-time scan do not act to the detriment of heavily used blocks in the new sublist.

innodb_old_blocks_time can be set at runtime, so you can change it temporarily while performing operations such as table scans and dumps:

SET GLOBAL innodb_old_blocks_time = 1000;... perform queries that scan tables ...
SET GLOBAL innodb_old_blocks_time = 0;

This strategy does not apply if your intent is to “warm up” the buffer pool by filling it with a table's content. For example, benchmark tests often perform a table or index scan at server startup, because that data would normally be in the buffer pool after a period of normal use. In this case, leave innodb_old_blocks_time set to 0, at least until the warmup phase is complete.