MySQL Partitioning – Deciding Which Columns to Use

MySQLpartitioning

I am having trouble deciding what fields of my table to base the partition. Currently there are a total of 8 indexes on the table out of a total of 14 columns. The problem is the indexes are taking up too much space 2 to 1 compared to the data and we need to reduce the size of the DB, and at least maintain current speed. 3 of the indexes have a significantly lower cardinality (in the teens) than the rest (in the millions). There are over 40 million rows in the table.

The index data type looks something like this:

datetime
smallint(6) (low cardinality)
varchar(20) (low cardinality)
int(15)
int(15)
varchar(100)
varchar30
char(1) (low cardinality)

and a few others. All the indexes are non unique and of type BTREE.

I created a test table and removed the indexes with the lowest cardinality but the queries take a bit longer. I'd like suggestions on two aspects:

  • What indexes should be kept and, which should be removed.

  • What columns should my partitioning be based on.

The table definition:

Create Table: CREATE TABLE `mytable` (
  `date` datetime DEFAULT NULL,
  `time` smallint(6) DEFAULT NULL,
  `source` varchar(20) DEFAULT NULL,
  `b` int(15) DEFAULT NULL,
  `name` varchar(100) DEFAULT NULL,
  `ttype` char(1) DEFAULT NULL,
  `ag` char(1) DEFAULT NULL,
  `direction` char(1) DEFAULT NULL,
  `mode` char(1) DEFAULT NULL,
  `username` varchar(30) DEFAULT NULL,
  `name` varchar(15) DEFAULT NULL,
  `authentication` char(1) DEFAULT NULL,
  `authenticated_id` varchar(75) DEFAULT NULL,
  `status` char(1) DEFAULT NULL,
  KEY `date` (`date`),
  KEY `time` (`time`),
  KEY `b` (`b`),
  KEY `name` (`name`),
  KEY `username` (`username`),
  KEY `source` (`source`),
  KEY `b2` (`b`),
  KEY `status` (`status`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Best Answer

Always have an explicit PRIMARY KEY for any InnoDB table. If practical, use a "natural" key -- a column, or combination of columns, that is unique. Otherwise, add an AUTO_INCREMENT.

Learn about "composite indexes" -- this is using more than one column in an index. It is often a good optimization. It is not the same as having an index on each column.

It is OK for the indexes to take up more space than the data. A quick survey showed 1/4 of my tables have that. Having good indexes improves performance far more than the bulkiness hurts performance.

Each Index is updated for each row INSERTed, so having too many indexes hurts INSERTs (some).

You mentioned partitioning; did you mean "PARTITION BY ..."? Don't do it unless you have a particular purpose. I know of only 4 use cases; see my blog.

Do not bother to index low-cardinality columns (flags, enums, etc). They won't be used. If you have a status flag that is 10/90, then an index on just that column would be used for the 10, but not for the 90. A composite index starting with that column may be always useful when all the columns (in the index) are in the WHERE clause.

The most important performance setting for InnoDB is to set innodb_buffer_pool_size to about 70% of available RAM.

It is usually a bad idea split DATE and TIME into two fields. Look at how messy your SELECTs have to be.

How to create good indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

Show us some of the SELECTs; we can advise further.