MySQL – Select unique values where the time is highest, in the most optimal way

mysql-5.5

Given a simple table with a text and a time column, I want to select X unique values from the text column, where each selected row holds the highest time value for its text.

INSERT INTO `test` (`id`, `text`, `time`) 
VALUES
    (1, 'test1', 1),
    (2, 'test1', 3),
    (3, 'test1', 2),
    (4, 'test2', 1),
    (5, 'test2', 100),
    (6, 'test2', 20) ;
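
The table definition is not shown. A minimal sketch that would accept the INSERT above, assuming an auto-increment integer `id`, a short `VARCHAR` for `text` and an integer `time` (all column types are guesses for illustration; only the table and column names come from the question):

```sql
-- Hypothetical schema: the column types and engine are assumptions,
-- only the table name and column names are taken from the question.
CREATE TABLE `test` (
    `id`   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    `text` VARCHAR(100) NOT NULL,
    `time` INT UNSIGNED NOT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB;
```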

The query that meets most of my requirements is:

SELECT a.* FROM
    test a
INNER JOIN (
    SELECT `text`, MAX(`time`) AS `time`
    FROM
        test
    WHERE
        `text` LIKE "te%"
    GROUP BY
        `text`) b
ON
    a.`text` = b.`text` AND
    a.`time` = b.`time`
ORDER BY
    a.`text` ASC
LIMIT 0,50

For small tables, this works perfectly. On my table (300k+ rows), though, it makes MySQL crash because of the subquery.

Is it possible to optimize this query? If it cannot be optimized, would it be possible to select the last inserted unique values for text? (The id and time are theoretically uncorrelated, though in 99% of cases a higher id goes with a higher time.)

Best Answer

The obvious index for this query is on (text, time). Add it with:

ALTER TABLE test
  ADD INDEX text_time_IX          -- choose a name for the index
    (`text`, `time`) ;
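
To check whether the new index is actually picked up, the grouped subquery can be run under EXPLAIN. This is only a sketch: the plan depends on the data, the column types and the server version, so no output is shown here.

```sql
-- Inspect the plan of the inner (grouped) query after adding the index.
-- If text_time_IX is chosen, its name appears in the `key` column of the output.
EXPLAIN
SELECT `text`, MAX(`time`) AS `time`
FROM test
WHERE `text` LIKE 'te%'
GROUP BY `text`;
```

Because the index holds both `text` and `time`, this subquery can in principle be answered from the index alone. A `key` value of NULL would suggest the index is not being used.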

You may also get an efficiency gain if you use this version of the query, which pushes the LIMIT inside the derived table:

SELECT a.* FROM
    test a
INNER JOIN (
    SELECT `text`, MAX(`time`) AS `time`
    FROM
        test
    WHERE
        `text` LIKE "te%"
    GROUP BY
        `text`
    ORDER BY
        `text` ASC
    LIMIT 0,50 ) b
ON
    a.`text` = b.`text` AND
    a.`time` = b.`time`
ORDER BY
    a.`text` ASC ;

If there are cases where the same maximum time appears on more than one row with the same text, this version will not return exactly the same results as the original: it can return more than 50 rows (but I think this behaviour is more appropriate). If you really want the arbitrary choice in case of ties, you can add the LIMIT to the outer select, too.
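
To make the tie case concrete, here is a small sketch against the sample data from the question. The extra row with id 7 is made up for illustration; it repeats the maximum time of 'test1'.

```sql
-- Hypothetical extra row: same text and same maximum time as row 2.
INSERT INTO `test` (`id`, `text`, `time`) VALUES (7, 'test1', 3);

-- The derived table b still yields one (text, max time) pair per text,
-- but the join now matches two rows for 'test1' (ids 2 and 7)
-- and one row for 'test2' (id 5).
SELECT a.*
FROM test a
INNER JOIN (
    SELECT `text`, MAX(`time`) AS `time`
    FROM test
    WHERE `text` LIKE 'te%'
    GROUP BY `text`
    ORDER BY `text` ASC
    LIMIT 0,50
) b
ON  a.`text` = b.`text`
AND a.`time` = b.`time`
ORDER BY a.`text` ASC
LIMIT 0,50;   -- outer LIMIT caps the total at 50 rows; which tied rows survive the cut is arbitrary
```

With only three matching rows the outer LIMIT has no effect here. It only matters once ties push the joined result past 50 rows.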
