MySQL – Indexing Partitioned Column

indexMySQLpartitioningperformanceperformance-tuning

I have a table with no primary key defined, partitioned on a column, using List Partitioning method. It has huge data in one partition. Though partitioned column is used in WHERE clause while querying against this table, performance is very slow.

Is it suggestible to create an index that includes the partitioned column? Here's my sample table structure,

CREATE TABLE inventory
(
   AccountId BIGINT,
   AccountNumber VARCHAR(250),
   DeviceId BIGINT,
   DeviceNumber VARCHAR(250),
   ClientId SMALLINT
   CreatedDate DATETIME,
   CreatedUser VARCHAR(50)
)
PARTITION BY LIST (ClientId)
 (PARTITION p_1 VALUES IN (1),
  PARTITION p_2 VALUES IN (2),
  PARTITION p_3 VALUES IN (3),
  PARTITION p_4 VALUES IN (4),
  PARTITION p_5 VALUES IN (5),
  PARTITION p_6 VALUES IN (6),
  PARTITION p_7 VALUES IN (7),
  PARTITION p_8 VALUES IN (8),
  PARTITION p_9 VALUES IN (9),
  PARTITION p_10 VALUES IN (10),
  PARTITION p_11 VALUES IN (11),
  PARTITION p_12 VALUES IN (12),
  PARTITION p_13 VALUES IN (13),
  PARTITION p_14 VALUES IN (14),
  PARTITION p_15 VALUES IN (15));

Table is partitioned by "ClientId" and I had to make 1024 partitions to this table.

Best Answer

  • 1024 partitions means 1024 "sub-tables". Certain operations will open all 1024.
  • A table (or partition) is implemented using files in the OS. You are at the mercy of the OS speed or sluggishness.
  • If you have WHERE ClientId = 123 that should lead to "partition pruning", which leads to opening the one partition, then scanning the entire partition.
  • INDEX(ClientId), without partitioning, would be almost as good.

This would be better:

id INT UNSIGNED NOT NULL AUTO_INCREMENT,
PRIMARY KEY(ClientId, id),
INDEX(id)

Now the data is clustered on ClientId, there is no extra thing to "open", and the performance will be better.

Please provide the main queries you have so that I can check whether I have messed up other cases.

Other issues...

AccountId BIGINT,
AccountNumber VARCHAR(250),

sounds grossly redundant. Is there a 1:1 mapping between that Id and that Number? Then the pairs should be in another table, and only the Id should be in this table. Ditto for Devices.

Making that change would significantly shrink this table, thereby making operations somewhat faster.

Meanwhile, learn about JOIN in order to get the ...Number from the ...Id.

More on partitioning.