MySQL – Optimize table and query, aggregate by date or date-hour

Tags: innodb, mysql, mysql-5.6, optimization, performance, query-performance

I have a table storing sensor data (currently 100M rows) from stations at different sites. A site may have many stations. The CREATE TABLE statement is as follows:

CREATE TABLE sensor_data (
site INT NOT NULL,
station INT NOT NULL,
time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
reading1 FLOAT DEFAULT NULL,
reading2 FLOAT DEFAULT NULL,
reading3 FLOAT DEFAULT NULL,
reading4 FLOAT DEFAULT NULL,
reading5 FLOAT DEFAULT NULL,
reading6 FLOAT DEFAULT NULL,
reading7 FLOAT DEFAULT NULL,
reading8 FLOAT DEFAULT NULL,
reading9 FLOAT DEFAULT NULL,
reading10 FLOAT DEFAULT NULL,
reading11 FLOAT DEFAULT NULL,
reading12 FLOAT DEFAULT NULL,
reading13 FLOAT DEFAULT NULL,
reading14 FLOAT DEFAULT NULL,
reading15 FLOAT DEFAULT NULL,
reading16 FLOAT DEFAULT NULL,
reading17 FLOAT DEFAULT NULL,
reading18 FLOAT DEFAULT NULL,
reading19 FLOAT DEFAULT NULL,
reading20 FLOAT DEFAULT NULL,
KEY Index_sst (site, station, time)
) ENGINE=InnoDB;

I need to query the readings of a specific station, aggregated by day or by day and hour, over a user-defined interval. The queries are as follows:

Query 1

SELECT AVG(reading1), DATE_FORMAT(time, '%Y-%m-%d 00:00:00') AS daily 
FROM sensor_data
WHERE site=1 AND station=1 AND time>='2010-01-01 00:00:00'
GROUP BY daily 
ORDER BY daily;

Query 2

SELECT AVG(reading1), DATE_FORMAT(time, '%Y-%m-%d %H:00:00') AS hourly
FROM sensor_data
WHERE site=1 AND station=1 AND time>='2010-01-01 00:00:00'
GROUP BY hourly
ORDER BY hourly;

These queries don't use an index; instead they use a temporary table and a filesort.
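For reference, the plan can be inspected by prefixing a query with EXPLAIN. This is a minimal sketch using Query 1 (the date literal is only an example value). When a temporary table and a filesort are involved, the Extra column of the output contains "Using temporary; Using filesort".

```sql
-- Show the execution plan for Query 1 instead of running it.
-- Look at the `key` column (index chosen, if any) and the `Extra` column.
EXPLAIN
SELECT AVG(reading1), DATE_FORMAT(time, '%Y-%m-%d 00:00:00') AS daily
FROM sensor_data
WHERE site = 1 AND station = 1 AND time >= '2010-01-01 00:00:00'
GROUP BY daily
ORDER BY daily;
```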

How can I reduce the long execution time of these queries? Adding a separate date column and indexing (site, station, date) seems to improve Query 1, but I have no idea what to do about Query 2. I am wondering what the optimal approach would be in this case.

SQL Fiddle

Best Answer

Create a new column, populate it with the daily value, and add this column to the index. Here is an example. This simply removes the function-on-column situation, in which an index can't be used. Be patient, it will take a while on 100M rows.

SQL Fiddle
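The same idea can be sketched for both queries, assuming one extra column per granularity. The column names `reading_day` and `reading_hour` and the index names below are hypothetical, not taken from the original fiddle. MySQL 5.6 has no generated columns (they arrived in 5.7), so the values are stored in ordinary columns and backfilled with an UPDATE.

```sql
-- 1. Add plain columns holding the truncated timestamp (names are assumptions).
ALTER TABLE sensor_data
  ADD COLUMN reading_day  DATE     NULL,
  ADD COLUMN reading_hour DATETIME NULL;

-- 2. Backfill existing rows. On 100M rows this is a long-running statement.
--    The result depends on the session time_zone, because `time` is a TIMESTAMP.
UPDATE sensor_data
SET reading_day  = DATE(time),
    reading_hour = DATE_FORMAT(time, '%Y-%m-%d %H:00:00');

-- 3. Index each new column behind (site, station), so rows for one station
--    are read already ordered by day or by hour.
ALTER TABLE sensor_data
  ADD KEY Index_ss_day  (site, station, reading_day),
  ADD KEY Index_ss_hour (site, station, reading_hour);

-- Query 1 rewritten: filter and group on the bare column, no function call.
SELECT AVG(reading1), reading_day AS daily
FROM sensor_data
WHERE site = 1 AND station = 1 AND reading_day >= '2010-01-01'
GROUP BY reading_day
ORDER BY reading_day;

-- Query 2 rewritten the same way for the hourly buckets.
SELECT AVG(reading1), reading_hour AS hourly
FROM sensor_data
WHERE site = 1 AND station = 1 AND reading_hour >= '2010-01-01 00:00:00'
GROUP BY reading_hour
ORDER BY reading_hour;
```

With this layout, new rows also need `reading_day` and `reading_hour` filled in at insert time, for example by the application that writes the readings. Running EXPLAIN on the rewritten queries is a reasonable way to check whether the temporary table and filesort are actually gone on your data.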