What are you trying to achieve here? I am not a MySQL programmer; however, as I understand SQL in general, every column that is not listed in the GROUP BY clause must be part of an aggregate like
SELECT
MAX(id) AS max_id,
chain, subchain, job_type
FROM
`job_logs`
WHERE
user_id = ? AND
workflow_state = ?
GROUP BY
chain, subchain, job_type
SELECT *
in a grouping query will not work in most (if not all) SQL implementations.
UPDATE
In order to get the row with the highest id for each group you would have to embed the query above in an "outer" query.
SELECT *
FROM `job_logs`
WHERE id IN (
SELECT
MAX(id) AS max_id
FROM
`job_logs`
WHERE
user_id = ? AND
workflow_state = ?
GROUP BY
chain, subchain, job_type
)
This query is deterministic and should work with most SQL dialects.
Some query engines perform better with joins than with "IN subquery". You can give this a try
SELECT A.*
FROM
`job_logs` A
INNER JOIN (
SELECT
MAX(id) AS max_id
FROM
`job_logs`
WHERE
user_id = ? AND
workflow_state = ?
GROUP BY
chain, subchain, job_type) B
ON A.id = B.max_id;
From the EXPLAIN output, it looks like first table userdata_langue
would benefit from an index that contains both id_langue
and id_inscri
. If there is already an index on that combination, try forcing it.
The second table userdata_sexe
could be partitioned on sexe
(I assume there are only two options here?). This would speed up the scan required to fetch all those.
I am not sure if mySQL would benefit from moving the order of the UNION DISTINCT components around. It might be worth trying to move component 3 and 4 up to the top of the query.
Finally, consider if you can afford to increase the sort buffer size in InnoDb (see: http://www.mysqlperformanceblog.com/2010/10/25/impact-of-the-sort-buffer-size-in-mysql/) to make sure you can hold everything in memory while mySQL calculates the UNION DISTINCT.
Best Answer
Refactoring the query like this gives better performance (from ~7s to ~0.364s on my local): http://pastebin.com/7VpLGdQB
EXPLAIN now shows this: http://pastebin.com/q8zEkXbx
Joining the node table twice was the bottleneck. The subquery with node and content_type_profile tables is much more efficient.
On the Drupal side I'll find a way to change this using hook_query_alter or by creating a custom block.