MySQL – Indexing Partitioned Column

Tags: index, MySQL, partitioning, performance, performance-tuning

I have a table with no primary key, partitioned on one column using LIST partitioning. One partition holds a huge amount of data. Although the partitioning column is used in the WHERE clause when querying this table, performance is very slow.

Is it advisable to create an index that includes the partitioning column? Here is my sample table structure:

CREATE TABLE inventory
(
   AccountId BIGINT,
   AccountNumber VARCHAR(250),
   DeviceId BIGINT,
   DeviceNumber VARCHAR(250),
   ClientId SMALLINT,
   CreatedDate DATETIME,
   CreatedUser VARCHAR(50)
)
PARTITION BY LIST (ClientId)
 (PARTITION p_1 VALUES IN (1),
  PARTITION p_2 VALUES IN (2),
  PARTITION p_3 VALUES IN (3),
  PARTITION p_4 VALUES IN (4),
  PARTITION p_5 VALUES IN (5),
  PARTITION p_6 VALUES IN (6),
  PARTITION p_7 VALUES IN (7),
  PARTITION p_8 VALUES IN (8),
  PARTITION p_9 VALUES IN (9),
  PARTITION p_10 VALUES IN (10),
  PARTITION p_11 VALUES IN (11),
  PARTITION p_12 VALUES IN (12),
  PARTITION p_13 VALUES IN (13),
  PARTITION p_14 VALUES IN (14),
  PARTITION p_15 VALUES IN (15));

The table is partitioned by "ClientId", and I had to create 1024 partitions for it.

Best Answer

1024 partitions means 1024 "sub-tables". Certain operations will open all 1024.
A table (or partition) is implemented using files in the OS. You are at the mercy of the OS speed or sluggishness.
If you have WHERE ClientId = 123, that should lead to "partition pruning": only the one matching partition is opened, but with no index the entire partition is then scanned.
INDEX(ClientId), without partitioning, would be almost as good.
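One way to check whether pruning actually happens is EXPLAIN. This is a minimal sketch, assuming MySQL 5.7 or later (where the output includes a partitions column; older versions need EXPLAIN PARTITIONS instead). The client value 3 and the selected columns are picked purely for illustration.

```sql
-- If pruning works, the `partitions` column should list only p_3.
-- With no index on the table, expect `key` to be NULL and `type` to be ALL,
-- i.e. a full scan of that one partition.
EXPLAIN
SELECT AccountId, DeviceId, CreatedDate
FROM inventory
WHERE ClientId = 3;
```

If every partition shows up in the partitions column, the WHERE clause is probably not in a form the optimizer can prune on.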

This would be better:

id INT UNSIGNED NOT NULL AUTO_INCREMENT,
PRIMARY KEY(ClientId, id),
INDEX(id)

Now the data is clustered on ClientId, there is no extra thing to "open", and the performance will be better.
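Put together with the columns from the question, the suggestion might look like the sketch below. It assumes the PARTITION BY clause is dropped and that the engine is InnoDB, where the primary key is the clustered index; neither is spelled out above, so treat both as assumptions.

```sql
-- Sketch only: non-partitioned version of the table, clustered on ClientId.
CREATE TABLE inventory
(
   id INT UNSIGNED NOT NULL AUTO_INCREMENT,
   AccountId BIGINT,
   AccountNumber VARCHAR(250),
   DeviceId BIGINT,
   DeviceNumber VARCHAR(250),
   ClientId SMALLINT NOT NULL,       -- primary key columns must be NOT NULL
   CreatedDate DATETIME,
   CreatedUser VARCHAR(50),
   PRIMARY KEY (ClientId, id),       -- rows for one client are stored together
   INDEX (id)                        -- AUTO_INCREMENT needs an index that starts with id
) ENGINE=InnoDB;
```

A query with WHERE ClientId = 123 can then read one contiguous range of the primary key instead of opening and scanning a partition.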

Please provide the main queries you have so that I can check whether I have messed up other cases.

Other issues...

AccountId BIGINT,
AccountNumber VARCHAR(250),

sounds grossly redundant. Is there a 1:1 mapping between that Id and that Number? Then the pairs should be in another table, and only the Id should be in this table. Ditto for Devices.

Making that change would significantly shrink this table, thereby making operations somewhat faster.

Meanwhile, learn about JOIN in order to get the ...Number from the ...Id.
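As a rough illustration of that lookup, assuming a hypothetical accounts table that holds each Id/Number pair exactly once (the table name and its layout are invented here, not taken from the question):

```sql
-- Hypothetical lookup table: one row per account.
CREATE TABLE accounts
(
   AccountId BIGINT NOT NULL,
   AccountNumber VARCHAR(250) NOT NULL,
   PRIMARY KEY (AccountId)
);

-- inventory keeps only AccountId; the number is fetched through the join.
SELECT i.DeviceId, i.CreatedDate, a.AccountNumber
FROM inventory AS i
JOIN accounts AS a ON a.AccountId = i.AccountId
WHERE i.ClientId = 123;
```

The same pattern would apply to DeviceId and DeviceNumber with a second lookup table.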

More on partitioning.

Related Solutions

MySQL Partitioning

Personally I'd use the date as your partition function, and partition by a hash of the year and month. Maybe splitting the data into 48 or more partitions. I've done this on some large volume databases and had good results.

ALTER TABLE `your_table` 
PARTITION BY HASH(YEAR(`date_field`)*12 + MONTH(`date_field`)) 
  PARTITIONS 48;

This should give a nicely distributed set of data across 48 partitions (you may need to adjust the calculation on the date to get it right for your needs).

I built a model in Excel with all the dates down one column and the partition function in a second column, showing which partition each row would land in. You can then chart the frequency of the second column to see how the data is distributed across the partitions. It is a really useful way of tinkering with your function before you alter your table!
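The same check can probably be done in the database itself rather than in a spreadsheet. A minimal sketch, reusing the placeholder names `your_table` and `date_field` from above and relying on the fact that MySQL HASH partitioning places a row in partition MOD(expr, number_of_partitions):

```sql
-- Preview how rows would spread over 48 hash partitions before altering the table.
SELECT (YEAR(`date_field`) * 12 + MONTH(`date_field`)) MOD 48 AS partition_no,
       COUNT(*) AS row_count
FROM `your_table`
GROUP BY partition_no
ORDER BY partition_no;
```

Partition numbers missing from the result would receive no rows at all, and a few very large counts suggest the expression needs adjusting.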

Hope that helps...

Sql-server – Inefficient queries on partitioned tables

It depends on the query, so if you have a specific problem you'll need to give us details for the query (and possibly table structure) in order to get specific help.

Generally speaking though:

If the filter is not sargable, it is no help to the query planner, just as an index is no help when the clause that references it is not sargable; for instance, when it involves a function or subquery that cannot be simplified down to a single value for all possible rows.
If there are other filters on columns covered by non-partitioned (or unaligned) indexes the query planner may consider searching that way to be more efficient than using the partitioning rule.
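To make the first point concrete, here is a small sketch of a non-sargable filter and a sargable rewrite. The `orders` table and its `order_date` column are hypothetical, and the table is assumed to be partitioned (or indexed) on `order_date`.

```sql
-- Not sargable: the column is wrapped in a function, so the planner
-- generally cannot map the filter onto partition boundaries or an index range.
SELECT COUNT(*)
FROM orders
WHERE YEAR(order_date) = 2023;

-- Sargable: the bare column is compared against constants,
-- so only the matching partitions or index range need to be read.
SELECT COUNT(*)
FROM orders
WHERE order_date >= '2023-01-01'
  AND order_date <  '2024-01-01';
```

Both queries return the same count; only the second gives the planner a range it can use directly.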
