MySQL Optimization – How to Optimize a Slow SQL Query

MySQLoptimization

I have a slow SQL query on my Drupal website and I don't know how to optimize it.

Here is the query that Views produces: http://pastebin.com/StAEiXbY

SQL fiddle: http://sqlfiddle.com/#!2/20081/1/0

EXPLAIN shows this: http://pastebin.com/uKtte5KR

Amount of data (rows):

+-------+----------------------+------------+----------------+-------------+-----------+
| node  | content_type_profile | userpoints | node_revisions | node_access | term_node |
+-------+----------------------+------------+----------------+-------------+-----------+
| 59805 |                23294 |       2692 |          71284 |      147783 |    225052 |
+-------+----------------------+------------+----------------+-------------+-----------+

This query takes more than 10s to complete. How can I optimize this query?

Best Answer

Refactoring the query like this gives better performance (from ~7s to ~0.364s on my local): http://pastebin.com/7VpLGdQB

EXPLAIN now shows this: http://pastebin.com/q8zEkXbx

Joining the node table twice was the bottleneck. The subquery with node and content_type_profile tables is much more efficient.

On the Drupal side I'll find a way to change this using hook_query_alter or by creating a custom block.

Related Solutions

MySql – How to improve this query

What are you trying to achieve here? I am not a MySQL programmer; however, as I understand SQL in general, every column that is not listed in the GROUP BY clause must be part of an aggregate like

SELECT
    MAX(id) AS max_id,
    chain, subchain, job_type
FROM
    `job_logs`
WHERE
    user_id = ? AND
    workflow_state = ?
GROUP BY
    chain, subchain, job_type

SELECT * in a grouping query will not work in most (if not all) SQL implementations.

UPDATE

In order to get the row with the highest id for each group you would have to embed the query above in an "outer" query.

SELECT *
FROM `job_logs`
WHERE id IN (
    SELECT
        MAX(id) AS max_id
    FROM
        `job_logs`
    WHERE
        user_id = ? AND
        workflow_state = ?
    GROUP BY
        chain, subchain, job_type 
)

This query is deterministic and should work with most SQL dialects.

Some query engines perform better with joins than with "IN subquery". You can give this a try

SELECT A.*
FROM
    `job_logs` A
    INNER JOIN (
        SELECT
            MAX(id) AS max_id
        FROM
            `job_logs`
        WHERE
            user_id = ? AND
            workflow_state = ?
        GROUP BY
            chain, subchain, job_type) B
    ON A.id = B.max_id;

Mysql – Optimize UNION query in MySQL

From the EXPLAIN output, it looks like first table userdata_langue would benefit from an index that contains both id_langue and id_inscri. If there is already an index on that combination, try forcing it.

The second table userdata_sexe could be partitioned on sexe (I assume there are only two options here?). This would speed up the scan required to fetch all those.

I am not sure if mySQL would benefit from moving the order of the UNION DISTINCT components around. It might be worth trying to move component 3 and 4 up to the top of the query.

Finally, consider if you can afford to increase the sort buffer size in InnoDb (see: http://www.mysqlperformanceblog.com/2010/10/25/impact-of-the-sort-buffer-size-in-mysql/) to make sure you can hold everything in memory while mySQL calculates the UNION DISTINCT.

Best Answer

Related Solutions

MySql – How to improve this query

Mysql – Optimize UNION query in MySQL

Related Question