MySql – How to improve this query

MySQLperformancequeryquery-performance

I have the following table: http://pastebin.com/XH6FvhZ8

and the following query:

SELECT *
FROM (SELECT *
      FROM `job_logs`
      ORDER BY id DESC) AS job_logs_tmp
WHERE (job_logs_tmp.user_id = ? AND job_logs_tmp.workflow_state = ?)
GROUP BY chain, subchain, job_type

here is the explain of this query: http://pastebin.com/AjXgzRBA

this query usually takes over 4K ms to execute, any ideas how I can speed the query execution? I have no problem adding indexes to table if needed

thanks!

Best Answer

What are you trying to achieve here? I am not a MySQL programmer; however, as I understand SQL in general, every column that is not listed in the GROUP BY clause must be part of an aggregate like

SELECT
    MAX(id) AS max_id,
    chain, subchain, job_type
FROM
    `job_logs`
WHERE
    user_id = ? AND
    workflow_state = ?
GROUP BY
    chain, subchain, job_type 

SELECT * in a grouping query will not work in most (if not all) SQL implementations.


UPDATE

In order to get the row with the highest id for each group you would have to embed the query above in an "outer" query.

SELECT *
FROM `job_logs`
WHERE id IN (
    SELECT
        MAX(id) AS max_id
    FROM
        `job_logs`
    WHERE
        user_id = ? AND
        workflow_state = ?
    GROUP BY
        chain, subchain, job_type 
)

This query is deterministic and should work with most SQL dialects.


Some query engines perform better with joins than with "IN subquery". You can give this a try

SELECT A.*
FROM
    `job_logs` A
    INNER JOIN (
        SELECT
            MAX(id) AS max_id
        FROM
            `job_logs`
        WHERE
            user_id = ? AND
            workflow_state = ?
        GROUP BY
            chain, subchain, job_type) B
    ON A.id = B.max_id;