MySQL – Indexing Integer and String Fields

index-tuningmyisamMySQL

A table has about 300M records. Each has an unique ID that starts with 13 to 14 digits (with the first digit on the left being either 1 or 2), and is optionally followed by a string of up to 7 ASCII chars. Two examples are 1450123410123 and 21347854123024Asd_uhA.

Currently, I'm storing this field as a VARCHAR with a PRIMARY KEY, and am worried about the memory usage of the index. I'm thinking of splitting this field into two columns, a BIGINT and a VARCHAR. Will the PRIMARY KEY created as a combination of these two columns consume less memory?

Update: Here are the two possibilities.

Please note that I care about inserts and updates. I will have only one huge select to export the data after a long time.

CREATE TABLE without_split (
  id varchar(21) CHARSET=ascii COLLATE ascii_bin NOT NULL,
  content text NOT NULL
  PRIMARY KEY id
);

CREATE TABLE with_split (
  id_prefix bigint unsigned NOT NULL,
  id_suffix varchar(7) CHARSET=ascii COLLATE ascii_bin NOT NULL,
  content text NOT NULL
  PRIMARY KEY (id_prefix, id_suffix)
);

Best Answer

MyISAM! Shame. Switch to InnoDB.

Don't worry about that PRIMARY KEY; it is too much hassle to split it up, etc. But do make sure it is the 'appropriate' character set:

VARCHAR(21) CHARACTER SET ascii

If it is case sensitive, then add on COLLATE ascii_bin.

Yes, there is a memory and disk penalty. But there are tradeoffs with speed, simplicity, etc.

(If it were 100 characters long, I might give a different answer.)

Numbers for MyISAM

In MyISAM, assuming the strings are half length on average:
PRIMARY KEY(decimal_14_0, varchar_7) needs 2*(7+(1+3.5)) bytes.
PRIMARY KEY(varchar_21) needs 2*(1+(13.5+3.5))

  • 2* because PK fields stored with data and with index;
  • 7 bytes for DECIMAL(14,0);
  • 1=length field for VARCHAR;
  • 3.5(?)=avg bytes for up-to-7-char suffix;
  • 13.5(?) for avg # digits in prefix.

After cancelling out things, the diff is 2*(13.5-7)=13 bytes smaller per row (data+pk) for DECIMAL+VARCHAR. 300M*13=4GB.

(InnoDB computation is in Comments.)