MySQL – Hash index for InnoDB


I saw a database schema that looked like this

CREATE TABLE foo (
    name_hash bigint,
    name varchar(500),
    a varchar(500),
    b varchar(500),
    ...
    PRIMARY KEY (name_hash),
    KEY ...
);

This seems to be an attempt to limit index size by using an 8-byte integer rather than a 100-byte string. When looking up a value by name, the application hashes the name and then uses that hash in the SQL query instead of the name itself.
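For illustration, the application-side step might look roughly like the sketch below. The hash function (BLAKE2b truncated to 8 bytes) is an assumption; the schema does not say which hash the application uses. The query text and the extra `name = %s` comparison are also illustrative, added only to show how a hash collision could be filtered out.

```python
import hashlib

def name_hash(name: str) -> int:
    """Map a name to a signed 64-bit integer that fits a MySQL BIGINT.

    Assumption: BLAKE2b with an 8-byte digest; the real application
    may use a different hash.
    """
    digest = hashlib.blake2b(name.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, byteorder="big", signed=True)

# Hypothetical lookup: search by the short hash, then compare the full
# name so that two names sharing a hash are not confused.
LOOKUP_SQL = "SELECT name, a, b FROM foo WHERE name_hash = %s AND name = %s"

def lookup_params(name: str) -> tuple:
    """Build the parameter tuple for LOOKUP_SQL."""
    return (name_hash(name), name)
```

Note that with `PRIMARY KEY (name_hash)` as in the schema above, two different names that collide on the hash could not both be stored.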

This is rather tedious, and I'm not sure if it is necessary.


Is there a feature of MySQL InnoDB that does something similar, that is, one that looks up a string by its much shorter hash so that the index fits into memory?

Or does it do something like that already?

Best Answer

InnoDB has no tool to do anything like what you describe.

InnoDB has a limit of 767 bytes per column in an index (with the older COMPACT and REDUNDANT row formats). That has room for a VARCHAR(255) in utf8 or a VARCHAR(191) in utf8mb4. Also, InnoDB wants to move long VARCHARs to a different block if the whole record is bigger than about 8KB. Will such long rows be common? (Don't blindly use VARCHAR(500) when you can reasonably state a smaller limit.)
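The 255 and 191 figures follow from dividing the 767-byte limit by the maximum bytes per character of each character set (3 for utf8, 4 for utf8mb4). A small sketch of that arithmetic, with helper names invented for this example:

```python
INDEX_LIMIT_BYTES = 767

# Worst-case bytes per character for MySQL's utf8 (utf8mb3) and utf8mb4.
MAX_BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}

def max_indexable_chars(charset: str, limit: int = INDEX_LIMIT_BYTES) -> int:
    """Largest VARCHAR length whose worst-case size fits in the limit."""
    return limit // MAX_BYTES_PER_CHAR[charset]

def fits_in_index(declared_chars: int, charset: str,
                  limit: int = INDEX_LIMIT_BYTES) -> bool:
    """True if a fully indexed VARCHAR(declared_chars) stays within the limit."""
    return declared_chars * MAX_BYTES_PER_CHAR[charset] <= limit

print(max_indexable_chars("utf8"))      # 255
print(max_indexable_chars("utf8mb4"))   # 191
print(fits_in_index(500, "utf8mb4"))    # False: 2000 bytes in the worst case
```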

Plan A: Compress the name in the client and store it in a VARBINARY(255) column. Assuming it is typical text, the compression ratio will be about 3:1. Use that column instead of the hash.
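A minimal sketch of Plan A, assuming zlib as the compressor (any lossless compressor would do) and a hard failure when the result does not fit. Two caveats: very short strings can come out longer than the input, and for equality lookups the client must produce byte-identical output for the same name every time, so the compressor and its settings should stay fixed.

```python
import zlib

MAX_KEY_BYTES = 255  # size of the VARBINARY(255) column suggested in Plan A

def compress_name(name: str) -> bytes:
    """Compress a name for storage; raise if it does not fit the column."""
    packed = zlib.compress(name.encode("utf-8"), 9)
    if len(packed) > MAX_KEY_BYTES:
        raise ValueError(f"compressed name is {len(packed)} bytes, "
                         f"limit is {MAX_KEY_BYTES}")
    return packed

def decompress_name(blob: bytes) -> str:
    """Recover the original name from the stored bytes."""
    return zlib.decompress(blob).decode("utf-8")
```

The actual ratio depends heavily on the data, so it is worth measuring on a sample of real names before committing to this design.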

Plan B: Split the name into 2 or 3 columns so the index limit is abided by. (An ugly solution!)
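Plan B could be sketched as follows, assuming three hypothetical columns `name_1`, `name_2`, `name_3`, each a VARCHAR(191) in utf8mb4 so that every column stays under the 767-byte limit. The split is by character, not by byte, so a multi-byte character is never cut in half.

```python
def split_name(name: str, width: int = 191, parts: int = 3) -> tuple:
    """Split a name into `parts` chunks of at most `width` characters.

    Unused trailing chunks are empty strings. The chunks map to the
    hypothetical columns name_1, name_2, name_3.
    """
    if len(name) > width * parts:
        raise ValueError(f"name has {len(name)} characters, "
                         f"limit is {width * parts}")
    return tuple(name[i * width:(i + 1) * width] for i in range(parts))

def join_name(chunks: tuple) -> str:
    """Reassemble the original name from its chunks."""
    return "".join(chunks)
```

Every query then has to compare all the chunk columns, which is part of why this plan is ugly.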

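Plan C: Raise the 767 limit. With ROW_FORMAT=DYNAMIC or COMPRESSED the per-column limit is 3072 bytes. That row format is the default in MySQL 5.7 and later; 5.5 and 5.6 also need innodb_large_prefix=ON and the Barracuda file format.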

Something to note: Any decent "hash" is very random. That is, each new row will land in some random place in the table. Once the table exceeds innodb_buffer_pool_size, you will be doing more and more I/O, which slows everything down.
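The effect can be illustrated with a toy model. This is not how InnoDB actually manages its buffer pool; it is a plain LRU cache over page numbers, and all sizes are made up. It compares inserts in key order against inserts that land on a random page, with a table ten times larger than the cache.

```python
import random
from collections import OrderedDict

def count_page_misses(page_ids, cache_pages):
    """Count accesses that miss a simple LRU cache holding `cache_pages` pages."""
    cache = OrderedDict()
    misses = 0
    for page in page_ids:
        if page in cache:
            cache.move_to_end(page)        # mark as most recently used
        else:
            misses += 1                    # would need a disk read
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict least recently used
    return misses

ROWS, ROWS_PER_PAGE, TOTAL_PAGES, CACHE_PAGES = 100_000, 100, 1_000, 100

# Sequential keys: consecutive rows fill the same page before moving on.
sequential = [row // ROWS_PER_PAGE for row in range(ROWS)]

# Hashed keys: each row lands on an effectively random page.
rng = random.Random(0)
hashed = [rng.randrange(TOTAL_PAGES) for _ in range(ROWS)]

sequential_misses = count_page_misses(sequential, CACHE_PAGES)
hashed_misses = count_page_misses(hashed, CACHE_PAGES)
```

In this model the sequential inserts miss once per page, while the hashed inserts miss on most accesses because the page they need has usually been evicted.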

What will the SELECTs be like? Will you have other indexes on this table? Will you JOIN on name_hash? All of these may have a bearing on the 'best' way to design the schema.