I'm not very experienced with MySQL and and trying to figure out how to speed up a slow query using GROUP BY
and CASE
statements.
CASE 1 — GROUP BY
and CASE
SELECT SQL_NO_CACHE m.id, m.sku, m.movie_url
FROM movie.movies m
WHERE m.s_id = 1
GROUP BY m.mg_id, CASE WHEN m.mg_id IS NULL THEN m.id ELSE 0 END
LIMIT 100
Execution times: 1.025s, 1.042s, 0.946s
Explain output:
+----+-------------+-------+------+--------------------+----------+---------+-------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+--------------------+----------+---------+-------+-------+----------------------------------------------+
| 1 | SIMPLE | m | ref | ind_mg_id,ind_s_id | ind_s_id | 4 | const | 39941 | Using where; Using temporary; Using filesort |
+----+-------------+-------+------+--------------------+----------+---------+-------+-------+----------------------------------------------+
CASE 2 — GROUP BY
without CASE
I realized the CASE
statement in the GROUP BY
might slow things down, so I tried removing it for testing purposes (it has to be there for production use, though):
SELECT SQL_NO_CACHE m.id, m.sku, m.movie_url
FROM movie.movies m
WHERE m.s_id = 1
GROUP BY m.mg_id#, CASE WHEN m.mg_id IS NULL THEN m.id ELSE 0 END
LIMIT 100
Execution times: 0.258s, 0.149s, 0.193s
Explain output:
+----+-------------+-------+-------+--------------------+-----------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+--------------------+-----------+---------+------+------+-------------+
| 1 | SIMPLE | m | index | ind_mg_id,ind_s_id | ind_mg_id | 9 | NULL | 2200 | Using where |
+----+-------------+-------+-------+--------------------+-----------+---------+------+------+-------------+
Much better, but it still is a pretty slow query.
CASE 3 — Without GROUP BY
and without CASE
Without the GROUP BY
statement, the above times drop further.
SELECT SQL_NO_CACHE m.id, m.sku, m.movie_url
FROM movie.movies m
WHERE m.s_id = 1
#GROUP BY m.mg_id, CASE WHEN m.mg_id IS NULL THEN m.id ELSE 0 END
LIMIT 100
Execution times: 0.053s, 0.050s, 0.052s
Explain output:
+----+-------------+-------+------+---------------+----------+---------+-------+-------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+----------+---------+-------+-------+-------+
| 1 | SIMPLE | m | ref | ind_s_id | ind_s_id | 4 | const | 39941 | NULL |
+----+-------------+-------+------+---------------+----------+---------+-------+-------+-------+
More Information
There are 95,600 rows in the movies
table. The table will continue to grow to a much larger size (eventually tens of millions).
Relevant INDEXES:
+----------+------------+-------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+-------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| movies | 0 | PRIMARY | 1 | id | A | 79882 | NULL | NULL | | BTREE | | |
| movies | 1 | ind_mg_id | 1 | pg_id | A | 7262 | NULL | NULL | YES | BTREE | | |
| movies | 1 | ind_s_id | 1 | s_id | A | 10 | NULL | NULL | | BTREE | | |
+----------+------------+-------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
How do I increase the performance of the query?
Best Answer
Please create the following index
This index will help the
WHERE
andGROUP BY
Perhaps a different index would help
and adjust the query
I figured you should change the
GROUP BY
toORDER BY
because you are not doing any actual aggregation (i.e., you not doing SUM(), COUNT(), AVG(), or any other summations)