MySQL – How to Speed Up Slow Query Using GROUP BY and CASE

casegroup byMySQLselect

I'm not very experienced with MySQL and and trying to figure out how to speed up a slow query using GROUP BY and CASE statements.

CASE 1 — GROUP BY and CASE


SELECT SQL_NO_CACHE m.id, m.sku, m.movie_url 
FROM movie.movies m               
WHERE m.s_id = 1                   
GROUP BY m.mg_id, CASE WHEN m.mg_id IS NULL THEN m.id ELSE 0 END
LIMIT 100

Execution times: 1.025s, 1.042s, 0.946s

Explain output:

+----+-------------+-------+------+--------------------+----------+---------+-------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys      | key      | key_len | ref   | rows  | Extra                                        |
+----+-------------+-------+------+--------------------+----------+---------+-------+-------+----------------------------------------------+
|  1 | SIMPLE      | m     | ref  | ind_mg_id,ind_s_id | ind_s_id | 4       | const | 39941 | Using where; Using temporary; Using filesort |
+----+-------------+-------+------+--------------------+----------+---------+-------+-------+----------------------------------------------+

CASE 2 — GROUP BY without CASE


I realized the CASE statement in the GROUP BY might slow things down, so I tried removing it for testing purposes (it has to be there for production use, though):

SELECT SQL_NO_CACHE m.id, m.sku, m.movie_url 
FROM movie.movies m               
WHERE m.s_id = 1                   
GROUP BY m.mg_id#, CASE WHEN m.mg_id IS NULL THEN m.id ELSE 0 END
LIMIT 100

Execution times: 0.258s, 0.149s, 0.193s

Explain output:

+----+-------------+-------+-------+--------------------+-----------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys      | key       | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+--------------------+-----------+---------+------+------+-------------+
|  1 | SIMPLE      | m     | index | ind_mg_id,ind_s_id | ind_mg_id | 9       | NULL | 2200 | Using where |
+----+-------------+-------+-------+--------------------+-----------+---------+------+------+-------------+

Much better, but it still is a pretty slow query.

CASE 3 — Without GROUP BY and without CASE


Without the GROUP BY statement, the above times drop further.

SELECT SQL_NO_CACHE m.id, m.sku, m.movie_url 
FROM movie.movies m               
WHERE m.s_id = 1                   
#GROUP BY m.mg_id, CASE WHEN m.mg_id IS NULL THEN m.id ELSE 0 END
LIMIT 100

Execution times: 0.053s, 0.050s, 0.052s

Explain output:

+----+-------------+-------+------+---------------+----------+---------+-------+-------+-------+
| id | select_type | table | type | possible_keys | key      | key_len | ref   | rows  | Extra |
+----+-------------+-------+------+---------------+----------+---------+-------+-------+-------+
|  1 | SIMPLE      | m     | ref  | ind_s_id      | ind_s_id | 4       | const | 39941 | NULL  |
+----+-------------+-------+------+---------------+----------+---------+-------+-------+-------+

More Information


There are 95,600 rows in the movies table. The table will continue to grow to a much larger size (eventually tens of millions).

Relevant INDEXES:

+----------+------------+-------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name                | Seq_in_index | Column_name         | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+-------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| movies   |          0 | PRIMARY                 |            1 | id                  | A         |       79882 | NULL     | NULL   |      | BTREE      |         |               |
| movies   |          1 | ind_mg_id               |            1 | pg_id               | A         |        7262 | NULL     | NULL   | YES  | BTREE      |         |               |
| movies   |          1 | ind_s_id                |            1 | s_id                | A         |          10 | NULL     | NULL   |      | BTREE      |         |               |
+----------+------------+-------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

How do I increase the performance of the query?

Best Answer

Please create the following index

ALTER TABLE movie.movies ADD INDEX sid_mgid_ndx (s_id,mg_id);

This index will help the WHERE and GROUP BY

Perhaps a different index would help

ALTER TABLE movie.movies
    DROP INDEX sid_mgid_ndx,
    ADD INDEX sid_mgid_id_ndx (s_id,mg_id,id)
;

and adjust the query

SELECT SQL_NO_CACHE m.id, m.sku, m.movie_url 
FROM movie.movies m               
WHERE m.s_id = 1                   
ORDER BY m.mg_id, m.id;
LIMIT 100;

I figured you should change the GROUP BY to ORDER BY because you are not doing any actual aggregation (i.e., you not doing SUM(), COUNT(), AVG(), or any other summations)