Mysql – Slow query with multiple group by

group byjoin;MySQLmysql-5.5performancequery-performance

I have a query to get all the comments from threads where a given user has commented and not yet seen the comment. So if user 1 commented to a topic, then user 2 commented after his comment, then user 1 gets a notification and the notification is gone when he saw the new comment (p1.date>p2.last_seen) checks this.

It all works fine, but multiple group by is slow. It's better if I have separate queries for each type (tooic,news,pictures), as seen below.

If I have only 1 group by, the query is quick (0,02 sec or so).

    SELECT SQL_NO_CACHE p1.* FROM comment p1 INNER JOIN
      (SELECT max(seen) last_seen, max(id) MaxId,
      user_id, topic_id, picture_id, news_id, date
      FROM comment WHERE user_id='1' and deleted=0
      GROUP BY topic_id) p2
   ON p1.topic_id = p2.topic_id
   WHERE p1.date > p2.last_seen and p1.id > p2.MaxId and deleted=0
   GROUP BY p1.topic_id ORDER BY p1.date DESC

But it's very slow (~1,5 sec) with multiple group by (creating temp table takes ~1,2 sec).

    SELECT SQL_NO_CACHE p1.* FROM comment p1 INNER JOIN
      (SELECT max(seen) last_seen, max(id) MaxId,
      user_id, topic_id, picture_id, news_id, date
      FROM comment WHERE user_id='1' and deleted=0
      GROUP BY topic_id, picture_id, news_id) p2
   ON p1.topic_id = p2.topic_id and p1.picture_id= p2.picture_id
   and p1.news_id= p2.news_id
   WHERE p1.date > p2.last_seen and p1.id > p2.MaxId and deleted=0
   GROUP BY p1.topic_id, p1.picture_id, p1.news_id ORDER BY p1.date DESC

I'm using no cache for testing purposes.

Schema

Id is primary, other indexes are user_id, topic_id, picture_id, news_id, deleted, date and seen

CREATE TABLE `comment` (
`id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`picture_id` int(11) NOT NULL,
`news_id` int(11) NOT NULL,
`text` text COLLATE utf8_hungarian_ci NOT NULL,
`date` datetime NOT NULL,
`reply_id` int(11) NOT NULL,
`comment_vote` int(11) NOT NULL,
`comment_hit` int(11) NOT NULL,
`moderated` int(11) NOT NULL,
`seen` datetime NOT NULL,
`deleted` tinyint(4) NOT NULL,
`ip` text COLLATE utf8_hungarian_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 
COLLATE=utf8_hungarian_ci;
ALTER TABLE `comment`
ADD PRIMARY KEY (`id`),
ADD KEY `topic_id` (`topic_id`),
ADD KEY `picture_id` (`picture_id`),
ADD KEY `news_id` (`news_id`),
ADD KEY `deleted` (`deleted`),
ADD KEY `user_id` (`user_id`),
ADD KEY `date` (`date`),
ADD KEY `seen` (`seen`);

Explain

1(id)   PRIMARY(select_type)    <derived2>(table) ALL(type) NULL(possible keys) NULL(key)   NULL(key_len)   NULL(ref)   105(rows)   Using temporary; Using filesort(extra)
1(id)   PRIMARY(select_type)    p1(table)   ref(type)   PRIMARY,topic_id,deleted,date,picture_id,news_id(possible keys)  topic_id(key)  4(key_len)  p2.topic_id(ref)    5(rows) Using where(extra)
2(id)   DERIVED(select_type)    p2(table)   index_merge(type)   deleted,user_id(possible_keys)  user_id(key)    4,1(key_len)    NULL(ref)   169(rows)   Using intersect(user_id,deleted); Using where; Using filesort(extra)

Best Answer

Using intersect(user_id,deleted) --> Provide INDEX(user_id, deleted) (in either order).

See the tag [greatest-n-per-group]; there are more efficient ways to do this type of query. Also, see [groupwise-max] in stackoverflow.com

Also INDEX(topic_id, picture_id, news_id) (in any order)

(There may be more improvements to make; see how far those tips get you.)