MySQL – Is the Memory engine adequate for a read-intensive lookup table?

mysql, storage-engine

I have an InnoDB table with just two columns, a VARCHAR(20) and a DATETIME, and around 4M rows. This table serves as a blacklist for other tables, and is truncated and recreated from a csv file from time to time. There are no writes involved besides that, it's only used for a SELECT checking if a key exists, which always hits the index for the VARCHAR column.

The problem is, this table has to be consulted for every single operation in all my systems, all the time, because if there's a match, which is very rare, the operation must be aborted immediately. From my application profiling we spend around 10% of the database time reading from it.
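
The check itself is trivial; with placeholder names (not my real schema) it boils down to something like:

SELECT 1
FROM blacklist
WHERE url = 'some-key'   -- the key being checked
LIMIT 1;

If that returns a row, the operation is aborted.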

I'm considering using the Memory engine for this table. The idea is to create a base table with the CSV engine that simply reads the csv file, instead of the whole data-import operation, plus an init script to populate the Memory table. I'm assuming the HASH index on the VARCHAR column would be faster for simple lookups, but I'm not sure how well it performs with the almost 100% miss ratio I have.
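
A rough sketch of what I have in mind (names are placeholders; the CSV engine needs every column NOT NULL and supports no indexes, so the HASH index would live only on the Memory copy):

CREATE TABLE blacklist_csv
(
    url VARCHAR(20) NOT NULL,
    dt  DATETIME    NOT NULL
) ENGINE=CSV;

CREATE TABLE blacklist_mem
(
    url VARCHAR(20) NOT NULL,
    dt  DATETIME    NOT NULL,
    PRIMARY KEY (url) USING HASH
) ENGINE=MEMORY;

-- the init script would then run something like:
INSERT INTO blacklist_mem SELECT url, dt FROM blacklist_csv;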

Is this a good idea to improve my lookup speed?

Best Answer

I would not recommend the MEMORY storage engine

REASON #1 : No Redundancy

Whether the server crashes or is shut down normally, all data in a MEMORY table is lost. All you would have left is the table structure.

REASON #2 : Mild Disk I/O

No matter what Storage Engine you choose, the .frm file of a table is always accessed to check for table existence and availability. That check incurs some disk I/O.

Please read past posts on the pros and cons of the MEMORY Storage Engine

RECOMMENDATION

Given the two reasons for not using the MEMORY Storage Engine, I would recommend the MyISAM Storage Engine over using MEMORY or InnoDB. Why?

Looking back at Reason #1, you can have everything in RAM and have data redundancy on disk if you create the table as follows:

STEP 01) Create the table like this:

CREATE TABLE blacklist
(
    url VARCHAR(20),
    dt DATETIME,
    PRIMARY KEY (url)
) ENGINE=MyISAM ROW_FORMAT=Fixed;

STEP 02) Create a dedicated 16MB MyISAM key cache for that table's index:

cd /var/lib/mysql
echo "SET GLOBAL blacklist_keycache.key_buffer_size = 1024 * 1024 * 16;" > init-file.sql
echo "CACHE INDEX blacklist IN blacklist_keycache; >> init-file.sql
echo "LOAD INDEX INTO CACHE blacklist; >> init-file.sql

STEP 03) Add this to /etc/my.cnf

[mysqld]
init-file=/var/lib/mysql/init-file.sql

STEP 04) Restart MySQL

service mysql restart
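
Once MySQL is back up, one quick sanity check (assuming the cache name and 16MB size used above) is to read the structured system variable for the dedicated key cache:

-- should return 16777216 if the init file ran
SELECT @@GLOBAL.blacklist_keycache.key_buffer_size;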

That's it.

Going forward, every reload of the table will repopulate the dedicated key cache. Please note the ROW_FORMAT=Fixed clause: it speeds up character searches by 20-25% (I wrote about this before).
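
For instance, a reload along these lines (the csv path is just a placeholder for whatever your import process uses) rewrites the table and warms the dedicated cache again:

TRUNCATE TABLE blacklist;
LOAD DATA INFILE '/path/to/blacklist.csv'
INTO TABLE blacklist
FIELDS TERMINATED BY ','
(url, dt);
-- reassign and preload in case TRUNCATE reset the key cache mapping
CACHE INDEX blacklist IN blacklist_keycache;
LOAD INDEX INTO CACHE blacklist;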

Why not use InnoDB?

  • Data pages and index pages would hold the same data twice in RAM.
  • Accessing an InnoDB table introduces additional mild disk I/O via data dictionary checks (see the pictorial representation of ibdata1).
  • Index pages would rotate out if the InnoDB Buffer Pool is too small; conversely, a Buffer Pool that is too big wastes RAM.

Using MyISAM, the data remains on disk while the index is served entirely from the dedicated key cache.