Given a simple table with a text and a time column, I want to select X unique values from the text column, where each returned row is the one with the highest time for that text value.
INSERT INTO `test` (`id`, `text`, `time`)
VALUES
(1, 'test1', 1),
(2, 'test1', 3),
(3, 'test1', 2),
(4, 'test2', 1),
(5, 'test2', 100),
(6, 'test2', 20) ;
The query that meets most of my requirements is:
SELECT a.* FROM
test a
INNER JOIN (
SELECT `text`, MAX(`time`) AS `time`
FROM
test
WHERE
`text` LIKE "te%"
GROUP BY
`text`) b
ON
a.`text` = b.`text` AND
a.`time` = b.`time`
ORDER BY
a.`text` ASC
LIMIT 0,50
For small tables, this works perfectly. On my table (300k+ rows), however, it makes MySQL crash because of the subquery.
Is it possible to optimize this query? If it cannot be optimized, would it be possible to select the last inserted unique values for text? (The id and time are theoretically uncorrelated, though in 99% of cases they are correlated: the higher the id, the higher the time.)
Best Answer
The obvious index for this query is a composite index on (text, time). You may also gain efficiency from a version of the query that pushes the LIMIT inside the derived table, so that only the first 50 text groups are joined back to the table.
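As a sketch of both suggestions, assuming MySQL and the test table from the question: the index name text_time_IX is a hypothetical choice, the text column is assumed to be a VARCHAR (a TEXT or BLOB column would need a prefix length in the index definition), and the ORDER BY inside the derived table is an assumption added so that the inner LIMIT picks a well-defined first 50 text values.

```sql
-- Composite index on the grouping column followed by the aggregated column,
-- so MAX(`time`) per `text` can be resolved from the index.
-- The index name text_time_IX is hypothetical.
ALTER TABLE `test`
  ADD INDEX `text_time_IX` (`text`, `time`);

-- Same query as in the question, with the LIMIT pushed inside the derived table.
SELECT a.*
FROM test a
INNER JOIN (
    SELECT `text`, MAX(`time`) AS `time`
    FROM test
    WHERE `text` LIKE 'te%'
    GROUP BY `text`
    ORDER BY `text` ASC   -- assumption: keeps the inner LIMIT deterministic
    LIMIT 0, 50
) b
    ON  a.`text` = b.`text`
    AND a.`time` = b.`time`
ORDER BY a.`text` ASC;
```

Running EXPLAIN on the SELECT before and after adding the index is a reasonable way to check whether the optimizer actually uses it on your data.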
If there are cases where the same maximum time appears on more than one row with the same text, this version will not return exactly the same results and can return more than 50 rows (but I think this version is more appropriate). If you really want the arbitrary choice in case of ties, you can add the LIMIT to the outer SELECT, too.
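A sketch of that tie-handling variant, under the same assumptions as the question's test table (the inner ORDER BY is again an assumption, added to keep the inner LIMIT deterministic). The outer LIMIT caps the result at 50 rows, but it does not remove tied rows, so some text values may still appear more than once.

```sql
SELECT a.*
FROM test a
INNER JOIN (
    SELECT `text`, MAX(`time`) AS `time`
    FROM test
    WHERE `text` LIKE 'te%'
    GROUP BY `text`
    ORDER BY `text` ASC   -- assumption: keeps the inner LIMIT deterministic
    LIMIT 0, 50           -- at most 50 distinct `text` values
) b
    ON  a.`text` = b.`text`
    AND a.`time` = b.`time`
ORDER BY a.`text` ASC
LIMIT 0, 50;              -- at most 50 rows overall, even when ties exist
```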