I saw a database schema that looked like this:
CREATE TABLE foo (
    name_hash bigint,
    name varchar(500),
    a varchar(500),
    b varchar(500),
    ...
    PRIMARY KEY (name_hash),
    KEY ...
);
This seems to be an attempt to limit index size by using an 8-byte integer rather than a 100-byte string. When looking up a value by name, the application hashes it and then uses that hash in the SQL query instead of the name.
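To make the pattern concrete, here is a sketch of what the application side might look like. The hash scheme (the first 8 bytes of MD5, folded into a signed bigint) is an assumption for illustration, since the question doesn't say which hash is used, and sqlite3 stands in for a MySQL connection; the SQL has the same shape either way.

```python
import hashlib
import sqlite3  # stand-in for a MySQL connection; the queries have the same shape

def name_hash(name: str) -> int:
    # Assumed scheme: first 8 bytes of MD5 as a big-endian integer,
    # folded into the signed 64-bit range of a bigint column.
    h = int.from_bytes(hashlib.md5(name.encode("utf-8")).digest()[:8], "big")
    return h - (1 << 64) if h >= (1 << 63) else h

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (name_hash INTEGER PRIMARY KEY, name TEXT, a TEXT)")

name = "alice"
conn.execute("INSERT INTO foo VALUES (?, ?, ?)", (name_hash(name), name, "x"))

# The application hashes the search key and queries by the hash; re-checking
# the full name guards against two names colliding on the 8-byte hash.
row = conn.execute(
    "SELECT a FROM foo WHERE name_hash = ? AND name = ?",
    (name_hash(name), name),
).fetchone()
print(row)  # -> ('x',)
```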
This is rather tedious, and I'm not sure if it is necessary.
Is there a feature of MySQL InnoDB that does something similar, looking up a string by its much shorter hash so that the index fits in memory?
Or does it do something like that already?
Best Answer
InnoDB has no tool to do anything like what you describe.
InnoDB has a limit of 767 bytes per column in an index. That has room for a VARCHAR(255) in utf8, or a VARCHAR(191) in utf8mb4. Also, InnoDB wants to put long VARCHARs in a different block if the whole record is bigger than about 8KB. Will this be common? (Don't blindly use VARCHAR(500) when you can reasonably state a smaller limit.)

Plan A: Compress (in the client) the name and store it into VARBINARY(255). Assuming it is typical text, the compression will be about 3:1. Use that instead of the hash.

Plan B: Split the name into 2 or 3 columns so the index limit is abided by. (An ugly solution!)
Plan C: Change the 767 limit. (It's possible, but I forget the details at the moment.)
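Plan A's client-side compression might look like the sketch below, using Python's zlib as one possible codec; the 3:1 ratio depends on the actual text, so measure it against real names before committing to VARBINARY(255).

```python
import zlib

def pack_name(name: str) -> bytes:
    # Compress in the client; store the result in a VARBINARY(255) column
    # and index that column instead of a separate hash.
    return zlib.compress(name.encode("utf-8"))

def unpack_name(blob: bytes) -> str:
    return zlib.decompress(blob).decode("utf-8")

name = "a fairly long, repetitive name " * 5
blob = pack_name(name)
print(len(name.encode("utf-8")), "->", len(blob))  # repetitive text shrinks well
assert unpack_name(blob) == name
```

Note that compressed values preserve only equality: the index on the VARBINARY column supports exact-match lookups, not range scans or prefix searches on the original text.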
Something to note: Any decent "hash" is very random. That is, each new row will land in some random place in the table. Once the table exceeds innodb_buffer_pool_size, you will be doing more and more I/O, thereby slowing down.
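The randomness is easy to see. In the sketch below (same assumed MD5-based scheme as above, purely for illustration), consecutive, nearly identical names hash to widely scattered values, so each INSERT lands on an effectively random page of the primary-key index:

```python
import hashlib

def name_hash(name: str) -> int:
    # Hypothetical hash scheme for illustration: first 8 bytes of MD5
    # as a big-endian unsigned integer.
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest()[:8], "big")

# Sequential names, scattered hash values -> scattered index pages.
hashes = [name_hash(f"user{i}") for i in range(10)]
for i, h in enumerate(hashes):
    print(f"user{i}", h)
```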
What will the SELECTs be like? Will you have other indexes on this table? Will you JOIN on name_hash? All of these may have a bearing on the 'best' way to design the schema.