I'm using mysql 5.5
Here is my table:
CREATE TABLE `temperature_information` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`device` int(11) NOT NULL,
`temperature` int(11) NOT NULL,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `device` (`device`),
KEY `date` (`date`),
KEY `idx` (`device`, `date`)
) ENGINE=InnoDB AUTO_INCREMENT=25602738 DEFAULT CHARSET=latin1
This table has ~50m. records.
Here is my query:
SELECT date as ValueDate, MAX(temperature)
FROM (
SELECT date, temperature FROM temperature_information WHERE device = 1111
ORDER BY temperature DESC) c
GROUP BY DATE(ValueDate),HOUR(ValueDate)
This query returns Maximum temperature for each day. It's execution time is ~0.9 s. and subquery's time 0.003 sec.
I have separate indexes on the date
and device
columns and a multi-column index idx
on device,date
. The EXPLAIN
says it uses device
index for query, which is good. The sub query is very fast.
But to get MAX
temperature for each day I need to use GROUP BY
. I know that functions on indexed column disables the use of indexes, but I don't know a workaround to make it efficient and produce same results.
Here is EXPLAIN
:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE temperature_information ref device,idx device 4 const 29330 Using where; Using temporary; Using filesort
My question:
Is it possible to write a query which would be more efficient and would produce same results or should I process rows returned by sub query and find MAX
temperature for each day myself (This would be written in c)?
Sub query returns 20-40k rows on average.
PS. I know you can remove sub query, but I left it like this to clarify question
Creating index on those (3) columns improved performance ~20 times. But I have another concern regarding indexes. I provided, not full table. table have another columns like signal
, etc. I'm planning to do same queries to find MAX
for each day on these columns too. Does adding too many indexes make SELECT
slower for other queries. I know that inserts and updates will be slower.
Best Answer
ORDER BY
inside a subquery like this (without aLIMIT
) makes no sense.SELECT date
butGROUP BY DATE(date). HOUR(date)
. This, while allowed in MySQL older versions (i.e. before 5.7), is not valid SQL.I suggest you rewrite:
Regarding performance:
(device, date, temperature)
will make the query more efficient than the current indexes on(device)
alone and on(device, date)
. If you add this 3-column index, you could drop the other two ("device"
and"idx"
) indexes.date
andhour
part in separate columns and add a 4-column index on(device, date_part, hour_part, temperature)
.PRIMARY KEY
to(device, date)
- or(device, date_part, hour_part, min_sec_part)
. This effectively - for an InnoDB table - clusters the data in the way you want them for this query. Of course you should test first this alternative design and how it will affect other queries, too.