Mysql – Vertical Partition MySQL table

innodbjoin;MySQLmysql-5.7partitioning

I am reviewing a case, where I have 200+ columns mostly varchar(100). The columns are from several external data sources like CRM / Demographics data etc. I cannot hold them in one MySQL table with ever growing number of columns.

The general query situation may contain columns from one or more vertical partitions.

  • Is it a good idea to vertically split them? and join them on query?
  • What should be the size of each partition (number of columns)? To improve performance.
  • What should be the best JOIN matching condition?

MySQL version: 5.7
Storage Engine: InnoDB

Best Answer

I don't think that someone can give you magic numbers like the acceptable number of columns, or information too strictly related to your workload, like if it is a good idea to split the table. There are too many variables: number/types of existing columns and indexes, number of queries, how many columns you read per query, and so on. Proper tests will give you a good answer. All we can say is that, yes, common sense says that such a table should be split if possible.

But then, every query will need to read from multiple partitions? Every new row will have matches in all partitions? This could slow down your application. Ideally, most of the queries should be able to read from only one partition. So the first suggestion would be to check with developers if some queries can be rewritten so that they will read less columns - possibly by rewriting some SELECT *.

You also ask about the best possible match condition. Here the answer is easy: join by primary key. All matching rows should have the same id. It should be an AUTO_INCREMENT columns on only one table. You first insert new rows into the table, then to the others, in this way:

INSERT INTO t1 (...) VALUES (...);
INSERT INTO t2 (id, ...) VALUES (LAST_INSERT_ID(), ...);
INSERT INTO t3 (id, ...) VALUES (LAST_INSERT_ID(), ...);

Also don't add foreign keys, because they will slow down your writes.