MySQL Unique Index Keylength

foreign keyinnodbMySQLunique-constraint

I am hashing keys from one table that provides a 256 bit (64 char length) alphanumeric string that I intend to use in a second table as a foreign key.

The engine of choice here is InnoDB.

There are about 2 million entries in Table A with my unique-key 256 bit key.

Table B, with 7 million records, is related to table A with the 256 bit foreign key.

My question is, instead of 64 char. length keys, should I opt for 160
bit (40 digit) keys?

Will I see a large performance boost? Or the difference would not be
too large?

My machine has limited computation power and although scripting and programming is quite easy, it would take me hours to compute and re-calculate for 160 bit from 256, which is the reason for my question.

Best Answer

Large PRIMARY KEYs and UNIQUE KEYs are a very bad idea for InnoDB. If you want a real performance boost for subsequent retrieval, you must use smaller PRIMARY KEYs. Why?

The smaller the PRIMARY KEY, the better things will be
Secondary indexes have a primary key associated with each secondary key. Bigger PRIMARY Key, bigger Secondary Index. and Vice-Versa
Caching improves with smaller primary keys

Here is what you should do with 64-char keys

STEP01) Create a Key Table for 64-character keys

Start with a table that will hold the 64-character key and associate it with an auto incremented field

CREATE TABLE big64keytable
(
    bigkey   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    big64key CHAR(64),
    PRIMARY KEY (bigkey),
    UNIQUE KEY (big64key)
) ENGINE=MyISAM;

STEP02) Create a Stored Function to Get numeric ID for the 64-character key

First, create the SQL to INSERT your 64-character key and then retrieve the numeric ID

SET @givenkey = '8g4gbf3g7b5gf4n3gfn8g927534';
INSERT IGNORE INTO big64keytable (big64key) VALUES (@givenkey);
SELECT bigkey FROM big64keytable WHERE big64key = @givenkey;

Now take this simple retrieval algorithm and place it in a Stored Function

DELIMITER $$

DROP FUNCTION IF EXISTS `GetBigKey` $$
CREATE FUNCTION `GetBigKey` (GivenKey CHAR(64)) RETURNS INT UNSIGNED
DETERMINISTIC
BEGIN

    DECLARE rv INT UNSIGNED;

    INSERT IGNORE INTO big64keytable (big64key) VALUES (GivenKey);
    SELECT bigkey INTO rv FROM big64keytable WHERE big64key = GivenKey;
    RETURN rv;

END $$

DELIMITER ;

Going forward, simply generate a numeric unsigned ID (4 bytes ) for each 64-character key.

Here are two of my past posts in the DBA StackExchange on other large PRIMARY KEY issues

Dec 22, 2011 : File name as primary key?
Oct 18, 2012 : MySQL: Unique constraint on large column

Related Solutions

Mysql – Is UNIQUE(), INDEX() different from UNIQUE INDEX()

UNIQUE (column1, column2) implies UNIQUE INDEX (column1, column2) because the INDEX keyword is optional. So an index is created. ~~However, the MySQL 5.5 docs show that the INDEX (or KEY) keyword is mandatory so UNIQUE (column1, column2) should give an error~~

INDEX (column1, column2) does not mean UNIQUE INDEX (column1, column2): it means an index that does allow duplicate (column1, column2) pairs

Mysql – Unique combination key MySQL

Here is before trigger you can use to guarantee that id1 and id2 are inserted in min/max order

DELIMITER $$
CREATE TRIGGER mytable_bi BEFORE INSERT ON mytable
FOR EACH ROW
BEGIN
    DECLARE sum INT;
    SET sum = NEW.id1 + NEW.id2;
    SET NEW.id1 = (NEW.id1 + NEW.id2 - ABS(NEW.id1 - NEW.id2)) / 2;
    SET NEW.id2 = sum - NEW.id1;
END $$
DELIMITER ;

Here is some Sample Data:

DROP DATABASE IF EXISTS ryanzec;
CREATE DATABASE ryanzec;
USE ryanzec
CREATE TABLE mytable
(
    id1 INT NOT NULL,
    id2 INT NOT NULL,
    mapTypeId INT NOT NULL,
    primary key (id1,id2)
);
DELIMITER $$
CREATE TRIGGER mytable_bi BEFORE INSERT ON mytable
FOR EACH ROW
BEGIN
    DECLARE sum INT;
    SET sum = NEW.id1 + NEW.id2;
    SET NEW.id1 = (NEW.id1 + NEW.id2 - ABS(NEW.id1 - NEW.id2)) / 2;
    SET NEW.id2 = sum - NEW.id1;
END $$
DELIMITER ;
INSERT INTO mytable VALUES (1,2,3);
INSERT INTO mytable VALUES (2,1,3);
INSERT INTO mytable VALUES (5,4,6);
INSERT INTO mytable VALUES (4,5,6);
SELECT * FROM mytable;

Here it is the Sample Data Loaded

mysql> DROP DATABASE IF EXISTS ryanzec;
Query OK, 1 row affected (0.03 sec)

mysql> CREATE DATABASE ryanzec;
Query OK, 1 row affected (0.00 sec)

mysql> USE ryanzec
Database changed
mysql> CREATE TABLE mytable
    -> (
    ->     id1 INT NOT NULL,
    ->     id2 INT NOT NULL,
    ->     mapTypeId INT NOT NULL,
    ->     primary key (id1,id2)
    -> );
Query OK, 0 rows affected (0.10 sec)

mysql> DELIMITER $$
mysql> CREATE TRIGGER mytable_bi BEFORE INSERT ON mytable
    -> FOR EACH ROW
    -> BEGIN
    ->     DECLARE sum INT;
    ->     SET sum = NEW.id1 + NEW.id2;
    ->     SET NEW.id1 = (NEW.id1 + NEW.id2 - ABS(NEW.id1 - NEW.id2)) / 2;
    ->     SET NEW.id2 = sum - NEW.id1;
    -> END $$
Query OK, 0 rows affected (0.10 sec)

mysql> DELIMITER ;

Watch What Happens When I Insert the Rows:

mysql> INSERT INTO mytable VALUES (1,2,3);
Query OK, 1 row affected (0.06 sec)

mysql> INSERT INTO mytable VALUES (2,1,3);
ERROR 1062 (23000): Duplicate entry '1-2' for key 'PRIMARY'
mysql> INSERT INTO mytable VALUES (5,4,6);
Query OK, 1 row affected (0.07 sec)

mysql> INSERT INTO mytable VALUES (4,5,6);
ERROR 1062 (23000): Duplicate entry '4-5' for key 'PRIMARY'
mysql> SELECT * FROM mytable;
+-----+-----+-----------+
| id1 | id2 | mapTypeId |
+-----+-----+-----------+
|   1 |   2 |         3 |
|   4 |   5 |         6 |
+-----+-----+-----------+
2 rows in set (0.00 sec)

mysql>

Now, no matter what order id1 and id2 are entered, id1 will always be less than or equal to id2

This may not be contextually what you want because id1 and id2 are unique identifiers.