Mysql – Query slower with index on 14M rows table

MySQLoptimizationperformancequery-performance

DB: Amazon RDS MySQL (OS: Linux, 2 vCPU, Memory: 8GB)

I have a table with almost 14M rows of data.

CREATE TABLE `meterreadings` (
 `Id` bigint(20) NOT NULL AUTO_INCREMENT,
 `meterid` varchar(16) DEFAULT NULL,
 `metervalue` int(11) DEFAULT NULL,
 `date_time` timestamp NULL DEFAULT NULL,
 PRIMARY KEY (`Id`),
 KEY `meterid` (`meterid`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;

As you can see, I use an index on meterid.

Another table which stores device IDs (around 100 rows of data)

CREATE TABLE `devices` (
`Id` bigint(20) NOT NULL AUTO_INCREMENT,
`meterid` varchar(16) DEFAULT NULL,
`location` varchar(8) DEFAULT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `meterid_UNIQUE` (`meterid`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;

To get 15 minute aggregated data, I use the below query

SELECT AVG(metervalue) as value
 , DATE_FORMAT(date_time, "%d %b %Y %H:%i") as label 
FROM meterreadings 
WHERE meterid IN (SELECT meterid from devices) 
  AND date_time BETWEEN '2018-07-23' AND '2018-07-24' 
GROUP BY DATE(date_time), HOUR(date_time), MINUTE(date_time) DIV 15 
ORDER BY date_time ASC;

Query performance is very bad – It takes approximately around 12 seconds to execute, and causes a temporary spike in DB server usage as well.

EXPLAIN on this query returned this:

1   SIMPLE  devices index   meterid_UNIQUE  meterid_UNIQUE  19      125  
   Using where; Using index; Using temporary; Using filesort
1   SIMPLE  meterreadings   ref meterid meterid 19  devices.meterid 322  
   Using where

I dropped the index on meterreadings and surprisingly the query performance is better – almost about 6 seconds now. I am still wondering why?

EXPLAIN on the query after dropping the index

1   SIMPLE  meterreadings   ALL                 14580167    Using where; 
Using temporary; Using filesort
1   SIMPLE  devices ref meterid_UNIQUE  meterid_UNIQUE  19   
   meterreadings.meterid    1   Using index

I am currently doing my query operation on the table without index – Is there a way I can optimize the table / query to do the operation faster (like a composite index on two columns?)

[The table is growing approximately by around 40 rows per second]

Best Answer

You should play with it a bit, because it might not be clear beforehand what solution will produce the best results.

A few points to consider

  1. It is very likely that an index on the date column will provide you with better selectivity, as it has a higher selectivity.

  2. Composite indexes are usually a good idea, but please make sure to chose the order correctly date_time, meterid vs. meterid, date_time. In most cases it makes more sense to leave columns with dense values (i.e dates, floats) to the end, as any column in the index following them is unlikely to have any effect. ( try meterid, date_time for an index.)

  3. Subselects might force the optimizer to use a specific plan. Try converting it into a join if possible.