MYSQL Query Index Check

indexindex-tuningMySQL

I have a table with the following structure;

CREATE TABLE `operations_copy` (
  `id` bigint(22) NOT NULL AUTO_INCREMENT,
  `date` datetime NOT NULL,
  `zoneId` int(11) unsigned DEFAULT NULL,
  `machineId` int(11) unsigned DEFAULT NULL,
  `shiftId` int(11) unsigned DEFAULT NULL,
  `statusId` int(11) unsigned DEFAULT NULL,
  `actualCycle` float(10,4) NOT NULL,
  `targetCycle` float(10,4) NOT NULL,
  PRIMARY KEY (`id`),
) ENGINE=InnoDB AUTO_INCREMENT=185538661 DEFAULT CHARSET=latin1;

I have nearly 200m records in this table.

I am frequently performing the following queries on this table;

1 – Operations By Zone

SELECT zoneId, statusId, COUNT(*) as intervals 
FROM `operations` 
WHERE (`date` > 'XXXX-XX-XX' AND `date` <= 'XXXX-XX-XX') 
GROUP BY `zoneId`,`statusId` 
ORDER BY NULL;

2 – Operations By Machine

SELECT machineId, statusId, COUNT(*) as intervals 
FROM `operations` 
WHERE (`date` > 'XXXX-XX-XX' AND `date` <= 'XXXX-XX-XX') GROUP BY `machineId`,`statusId` 
ORDER BY NULL;

3 – Operations By Shift

SELECT shiftId, statusId, COUNT(*) as intervals 
FROM `operations` 
WHERE (`date` > 'XXXX-XX-XX' AND `date` <= 'XXXX-XX-XX') 
GROUP BY `shiftId`,`statusId` 
ORDER BY NULL;

I am currently looking at 30s for query #1.

I have enabled caching, which reduces the query time dramatically after the initial query, but I still want to optimise the initial query.

Studying has shown me that I need to make better use of indexes (ref: https://www.slideshare.net/billkarwin/how-to-design-indexes-really, MYSQL docs and more). So I want to check the following indexes will be the best to optimise each query;

#1

KEY `operations_by_zone_query` (`date`,`zoneId`,`statusId`,`id`)
#2

KEY `operations_by_machine_query` (`date`,`machineId`,`statusId`,`id`)
#3

KEY `operations_by_shift_query` (`date`,`shiftId`,`statusId`,`id`)

Any help will be much appreciated. Thanks.

Best Answer

I have an interesting idea for you.

Create the following indexes

ALTER TABLE operations
    ADD INDEX `date_zoneId_statusId_ndx`    (`date`,`zoneId`,`statusId`)
   ,ADD INDEX `date_machineId_statusId_ndx` (`date`,`machineId`,`statusId`)
   ,ADD INDEX `date_shiftId_statusId_ndx`   (`date`,`shiftId`,`statusId`)
   ,ADD INDEX `zoneId_statusId_date_ndx`    (`zoneId`,`statusId`,`date`)
   ,ADD INDEX `machineId_statusId_date_ndx` (`machineId`,`statusId`,`date`)
   ,ADD INDEX `shiftId_statusId_date_ndx`   (`shiftId`,`statusId`,`date`)
;

You could then evaluate your queries by running:

EXPLAIN SELECT zoneId, statusId, COUNT(*) as intervals 
FROM `operations` 
WHERE (`date` > 'XXXX-XX-XX' AND `date` <= 'XXXX-XX-XX') 
GROUP BY `zoneId`,`statusId` 
ORDER BY NULL\G

EXPLAIN SELECT machineId, statusId, COUNT(*) as intervals 
FROM `operations` 
WHERE (`date` > 'XXXX-XX-XX' AND `date` <= 'XXXX-XX-XX') GROUP BY `machineId`,`statusId` 
ORDER BY NULL\G

EXPLAIN SELECT shiftId, statusId, COUNT(*) as intervals 
FROM `operations` 
WHERE (`date` > 'XXXX-XX-XX' AND `date` <= 'XXXX-XX-XX') 
GROUP BY `shiftId`,`statusId` 
ORDER BY NULL\G

The output from each EXPLAIN will plainly tell you which index.

It will also show the least number of steps the Query Optimizer will take.

My guess is that the indexes with the date column last will be selected. An exception may be a small date range which may favor the indexes with the date column first. In that instance, you would want to keep all 6 indexes.

The indexes with the date column last would be used by the GROUP BY so that no temp table sorting would be necessary.

WARNING ON HAVE MANY INDEXES : Bulk INSERT performance would suffer due to managing multiple covering indexes, but SELECT performance would improve greatly. Dropping unnecessary indexes becomes a necessity.