Personally, I'd use the date for your partition function and partition by a hash of the year and month, maybe splitting the data into 48 or more partitions. I've done this on some large-volume databases and had good results.
ALTER TABLE `your_table`
PARTITION BY HASH(YEAR(`date_field`)*12 + MONTH(`date_field`))
PARTITIONS 48;
This should create a nicely distributed set of data across 48 partitions (you may need to fiddle with the date calculation to get it quite right for your needs).
I built a model in Excel with all the dates down one column and the partition function in the second, showing which partition each date would fall into. You can then chart the frequency of the second column to see how the data is spread across the partitions; a really useful way of tinkering with your function before you alter your table!
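The same model can be sketched in a few lines of Python instead of Excel (the four-year date range and the 48-partition count here are just assumptions for illustration), counting how many days land in each partition:

```python
from collections import Counter
from datetime import date, timedelta

def partition_for(d, num_partitions=48):
    # Mirrors HASH(YEAR(date_field)*12 + MONTH(date_field)) PARTITIONS 48:
    # MySQL HASH partitioning takes the expression modulo the partition count.
    return (d.year * 12 + d.month) % num_partitions

# Tally how many days of a sample four-year range fall in each partition.
counts = Counter()
d = date(2020, 1, 1)
while d < date(2024, 1, 1):
    counts[partition_for(d)] += 1
    d += timedelta(days=1)

for p in sorted(counts):
    print(f"partition {p:2d}: {counts[p]} days")
```

With 48 consecutive months and 48 partitions, each month maps to its own partition, so the counts simply track month lengths; shrink the partition count or widen the date range to see how values collide.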
Hope that helps...
First, you should consider solving the problem in another way.
Upgrade to MySQL 5.6, where OPTIMIZE TABLE works without blocking (for an InnoDB table), as it is supported by InnoDB Online DDL.
If you can't upgrade, try using Percona Toolkit's pt-online-schema-change, which can perform the table rebuild without blocking.
$ pt-online-schema-change h=localhost,D=mydatabase,t=mytable \
    --alter="ENGINE=InnoDB" --execute
If you're stuck on using partitioning, then yes, you must make id the partition key in the table you show. You can convert the table to partitioning with ALTER TABLE. If you need the conversion operation to be non-blocking, use pt-online-schema-change.
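A minimal sketch of such a conversion (the partition boundaries here are placeholders; pick them based on your id range). Note that this rebuilds the table, which is why you'd run it through pt-online-schema-change if it must not block:

```sql
ALTER TABLE mytable
PARTITION BY RANGE (id) (
    PARTITION p0 VALUES LESS THAN (1000),
    PARTITION p1 VALUES LESS THAN MAXVALUE
);
```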
There's no way to partition into fixed-size partitions; you have to partition by values. But is it really that important to hit a specific size per partition?
Re your comment about partition size:
When using RANGE partitioning, what I do is set up a schedule to ALTER TABLE and split the last partition from time to time. If you have a regular rate of growth, this is easy; if your growth is irregular, you might instead set up a periodic check that examines the number of rows per partition (query INFORMATION_SCHEMA.PARTITIONS) and emails you if a partition is getting full.
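Such a check might query something like the following (the schema and table names are placeholders; note that TABLE_ROWS is an estimate for InnoDB tables, which is fine for a "getting full" alert):

```sql
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = 'mydatabase'
  AND TABLE_NAME = 'mytable'
ORDER BY PARTITION_ORDINAL_POSITION;
```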
For example, let's set up a table partitioned by RANGE on id.
CREATE TABLE `mytable` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`transactionid` int(11) NOT NULL,
`parent` int(11) NOT NULL,
`headers` longtext,
`creator` int(11) NOT NULL,
`created` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`),
KEY `transactionid` (`transactionid`,`parent`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (id)
(PARTITION p0 VALUES LESS THAN (0) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (1000) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (2000) ENGINE = InnoDB,
PARTITION p3 VALUES LESS THAN (3000) ENGINE = InnoDB,
PARTITION p4 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
As MAX(id) approaches 3000, the table is getting close to filling up p3 and spilling over into p4. So it's time to reorganize. It's best to do this before any data spills over into p4, because then the reorg will affect only the last, empty partition and will therefore be very quick.
ALTER TABLE mytable REORGANIZE PARTITION p4 INTO
(PARTITION p4 VALUES LESS THAN (4000), PARTITION p5 VALUES LESS THAN MAXVALUE);
Even if you miss a day and some data gets into the old p4, chances are it's not much data. But if you neglect this for a month or two and p4 fills up with a lot of data, then the REORGANIZE will take longer.
Best Answer
Partitioning usually decreases ALTER TABLE speed rather than improving it. Index and data are stored in separate files for each partition (a separate .ibd file for InnoDB if innodb_file_per_table=1), so you might be able to run alters in parallel. But partitioning adds an extra processing step to choose the partition for each row and index entry, so the whole operation ends up slower.

As for the shared lock for ALTER TABLE: all partitions are locked at the same time, so partitioning does not help.

Hints: