Mysql – Slow MySQL query with Order and Group by

group byMySQLorder-byperformancequery-performance

I have a query that is taking a long time to execute (1.13s). The two tables user_articles and articles have around 25,000 records. This is the query:

SELECT `user_articles`.*
FROM `user_articles`
INNER JOIN `articles` ON (`user_articles`.`article_id` = `articles`.`id`)
GROUP BY `articles`.`id`
ORDER BY `user_articles`.`created_at` DESC

I have found that by removing the ORDER BY statement it speeds it up (0.003s). These are the results from the EXPLAIN

+----+-------------+---------------+--------+---------------+------------+---------+-------------+-------+----------------------------------------------+
| id | select_type | table         | type   | possible_keys | key        | key_len | ref         | rows  | Extra                                        |
+----+-------------+---------------+--------+---------------+------------+---------+-------------+-------+----------------------------------------------+
| 1  | SIMPLE      | articles      | index  | PRIMARY       | PRIMARY    | 4       | NULL        | 22678 | Using index; Using temporary; Using filesort |
| 1  | SIMPLE      | user_articles | ref    | article_id    | article_id | 4       | articles.id | 1     | Using where                                  |
+----+-------------+---------------+--------+---------------+------------+---------+-------------+-------+----------------------------------------------+

Is there any way I can speed up my query?

Best Answer

As @Back in a Flash has suggested, an index on created_at in user_articles will probably help if the optimizer chooses to use it... but the optimizer might not catch on to the viability of that index, because you're doing something else that needs an explanation.

You're using GROUP BY but you're not aggregating anything that I can see. If you're not looking for the MIN(), MAX(), SUM(), COUNT(), or the result of some similar aggregate function, then your use of GROUP BY here is incorrect. By default, MySQL does not throw an error when GROUP BY is used like this.

It looks as if multiple rows in user_articles could reference the same row in articles. Your query, as written, will only return one matching row from user_articles for each referenced row in articles, and the selection of that row will not be deterministic.

That doesn't seem like what you want to be doing. If it is, you shouldn't be doing it this way. :)

If it isn't... or if each row in user_articles references a unique row in articles, then your query is unnecessarily burdening the optimizer by insisting that this is indeed what it wants the server to try to do. As the query is written, the GROUP BY is superfluous at best, and possibly contributing to the performance problem, since, even if you had an index on created_at, that conflicts with the fact that the server also needs to GROUP BY (potentially) a different index.