MySQL Partitions with Partition limit

mysql, partitioning, percona, performance, query-performance

I have a table with 1.5 billion rows, partitioned on its primary key. When I check the .ibd files, only 80 of them are gigabytes in size, whereas the other 120 are about 1 MB each. I would like to know how rows are distributed when I limit the partitioning with PARTITIONS 200. My table structure is below. Both id and sha_checksum are always unique.

CREATE TABLE `mapping` (
  `id` varchar(50) NOT NULL DEFAULT '',
  `sha_checksum` varchar(50) NOT NULL DEFAULT '',
  `path_id` varchar(90) DEFAULT NULL,
  `file_id` varchar(35) DEFAULT NULL,
  `server_id` CHAR(6) DEFAULT NULL,
  PRIMARY KEY (`id`,`sha_checksum`)
) ENGINE=InnoDB 
/*!50100 PARTITION BY KEY (id,sha_checksum)
PARTITIONS 200 */

Please help me understand where I am going wrong.

Best Answer

Rather than trying to explain why most of the partitions seem empty, let me argue against that flavor of PARTITIONing.
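That said, if you want to see the distribution without inspecting .ibd files, a query along these lines should show it. This is only a sketch: it assumes the table lives in the currently selected schema, and for InnoDB the TABLE_ROWS figure is an estimate, not an exact count.

```sql
-- Per-partition row estimates and on-disk data size for the `mapping` table.
-- Assumes the current default schema (DATABASE()) is the one holding the table.
SELECT PARTITION_NAME,
       PARTITION_ORDINAL_POSITION,
       TABLE_ROWS,                                   -- estimate for InnoDB
       ROUND(DATA_LENGTH / 1024 / 1024) AS data_mb
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'mapping'
ORDER BY PARTITION_ORDINAL_POSITION;
```

Partitions that report a few rows or none here should line up with the 1 MB .ibd files.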

To put it bluntly, combining PRIMARY KEY (id, sha_checksum) with PARTITION BY KEY (id, sha_checksum) gains no performance, nor any other benefit that I can imagine.

Note that to get "partition pruning", the WHERE clause has to specify both id and sha_checksum. Even then, performance will be essentially the same as with a non-partitioned table that has the same PRIMARY KEY.
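You can check the pruning yourself with EXPLAIN. The sketch below uses made-up literal values ('abc' and 'def' are placeholders, not real data). EXPLAIN PARTITIONS is the older syntax; from MySQL 5.7 on, plain EXPLAIN includes a partitions column, and the PARTITIONS keyword was removed in 8.0.

```sql
-- Both partitioning columns supplied: the partitions column should name one partition.
EXPLAIN PARTITIONS
SELECT * FROM mapping
WHERE id = 'abc' AND sha_checksum = 'def';   -- placeholder values

-- Only id supplied: the partitions column should list all 200 partitions.
EXPLAIN PARTITIONS
SELECT * FROM mapping
WHERE id = 'abc';                            -- placeholder value
```

If the second query lists every partition, a lookup by id alone has to probe all 200 of them.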

Since the records are randomly scattered around, I can think of no use for "transportable tablespaces".

What version of MySQL are you using?

What character set are you using? (I'm thinking of space considerations, especially for the CHAR column.)
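In case it helps, both questions can probably be answered with something like the following. It assumes the table is in the currently selected schema.

```sql
-- Server version.
SELECT VERSION();

-- Character set and collation of each string column in `mapping`.
SELECT COLUMN_NAME, COLUMN_TYPE, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'mapping';
```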

Please describe id and the application. With that I can either help you devise a better PARTITION scheme or continue to argue against partitioning.

Another thing to note: having more than about 50 partitions slows things down for a variety of reasons, one of which is that all partitions are opened when an INSERT occurs. (Very new versions have 'fixed' this failure to 'prune'.)