Mysql – Slow query while using DISTINCT in MYSQL with various other conditions (10 million records)

amazon-rdsMySQLoptimization

I am working on a social Network where, as usual there is a feed for users where they are able to see the activities of users a user is following and not the users who are blocked by the users.

This is the query that is being used currently from the Activities table

SELECT DISTINCT `activities`.`post_id` 
FROM `activities` 
WHERE (activities.user_id IN ([followed_user_ids]) AND 
activities.language_id IN ([language_ids]) AND 
activities.id <= ?) AND 
(`activities`.`post_user_id` NOT IN 
([blocked_and_deactivated_user_ids])) 
ORDER BY `activities`.`id` DESC 
LIMIT 10 OFFSET 10

This is the query which is causing issues. Sometimes the Mysql CPU usage goes to 100% which causes issues to the users. I have also used index on the columns being used in the activity table.

There are 10 million records in the Activities table.

What changes can i possibly do to optimise this?

Note – followed_user_ids/blocked_and_deactivated_user_ids – can have an array of thousands of ids.

Update 1

This is the new query that i formed ignoring the blocked users from post_user_id :-

"SELECT  DISTINCT `activities`.`id` FROM `activities`
INNER JOIN `users` ON `users`.`id` = `activities`.`user_id` AND 
`users`.`deleted_at` IS NULL
LEFT OUTER JOIN `follows` ON `follows`.`followable_id` = `users`.`id` 
AND `follows`.`deleted_at` IS NULL AND `follows`.`followable_type` = 
'User' WHERE `activities`.`language_id` IN (19, 21) AND (activities.id 
< 99999999) AND `follows`.`blocked` = 0 AND `follows`.`follower_id` = 1 
AND `follows`.`follower_type` = 'User' AND `follows`.`followable_type` 
= 'User' AND `users`.`deactivated` = 0  
ORDER BY activities.id desc LIMIT 10 OFFSET 10"

This doesn't seem to help either. I need to get all the unique post activities from the users i am following, in a pagination format.

Best Answer

IN ( long list ) often performs badly. Try rewriting using [NOT] EXISTS or [LEFT] JOIN. Example:

SELECT DISTINCT activities.post_id 
FROM activities 
JOIN <table with followed_user_ids>
   ON activities.user_id = ...    
JOIN user_languages
   ON activities.user_id = ...
LEFT JOIN <table with blocked users> bu
   ON activities.user_id = ...
WHERE bu.<attribute> IS NULL -- no match
ORDER BY activities.id DESC 
LIMIT 1 OFFSET 10;

Make sure you have appropriate indexes on JOIN predicates.