MySQL Query running slow when using REPLACE function instead of CASE statement

Is it better to use CASE statements instead of REPLACE functions in MySQL when mapping a comma-separated-string field?

The first query below, which uses REPLACE, runs extremely slowly.
Note that the underlying user_roles table has the format [user_id (bigint), string_of_user_role_ids (varchar(200))].

-- this runs slowly
select      string_of_user_role_ids
            , replace(replace(replace(replace(replace(string_of_user_role_ids,
    '10', 'Scientist'), '9', 'Superhero'), '8', 'Teacher'), '7', 
         'Journalist'), '6', 'Farmer')
            , count(1) 
from        user_roles
group by    1,2 
order by    3 desc

-- this runs quickly, but is more difficult to keep adding in multiple new when clauses whenever a new user role is added
select      string_of_user_role_ids
            , case  when string_of_user_role_ids= "6" then 'Farmer'
                    when string_of_user_role_ids= "7" then 'Journalist'
                    when string_of_user_role_ids= "8" then 'Teacher'    
                    when string_of_user_role_ids= "6,7" then 'Farmer, Journalist'
                    when string_of_user_role_ids= "6,8" then 'Farmer, Teacher'
                    when string_of_user_role_ids= "7,8" then 'Journalist, Teacher'
                    when string_of_user_role_ids= "6,7,8" then 'Farmer, Journalist, Teacher'    
                    -- ... etc.
                    else 'Unknown' end as app_user_type
            , count(1) 
from        user_roles  
group by    1,2 
order by    3 desc

Ideally I would use the REPLACE function rather than a CASE statement, since it seems easier to extend when new roles are added and less error-prone to maintain.

I can't understand why one query runs quickly and the other very slowly (seconds versus minutes; I killed the slow query after a few minutes).
Ideas and questions are welcome.

The EXPLAIN output is the same for both queries.

Best Answer

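It should run faster if you apply REPLACE (or CASE) after the GROUP BY instead of grouping on the computed expression. The subquery collapses the table to one row per distinct string_of_user_role_ids first, so the nested REPLACE is evaluated once per distinct value rather than once per row. For example: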

select      string_of_user_role_ids
            , replace(replace(replace(replace(replace(string_of_user_role_ids,
    '10', 'Scientist'), '9', 'Superhero'), '8', 'Teacher'), '7', 
         'Journalist'), '6', 'Farmer')
            , cnt 
from        (select      string_of_user_role_ids
                        , count(1) as cnt
            from        user_roles
            group by    string_of_user_role_ids 
            ) as ur
order by    cnt desc
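
One caveat with nested REPLACE is that it matches substrings, so a role id such as 16 would have its '6' rewritten to 'Farmer'. As an alternative to both REPLACE and CASE, here is a minimal sketch that maps ids through a lookup table after aggregating. The roles table, its columns, and the app_user_type alias placement are assumptions for illustration, not part of the original schema, and FIND_IN_SET only matches if the stored list has no spaces after the commas.

```sql
-- Hypothetical lookup table: the name `roles` and its columns are assumptions.
CREATE TABLE roles (
    role_id   INT PRIMARY KEY,
    role_name VARCHAR(50) NOT NULL
);

INSERT INTO roles (role_id, role_name) VALUES
    (6, 'Farmer'), (7, 'Journalist'), (8, 'Teacher'),
    (9, 'Superhero'), (10, 'Scientist');

-- Aggregate first (one row per distinct id string), then map ids to names.
-- FIND_IN_SET matches whole list elements, so '6' does not match inside '16'.
SELECT   ur.string_of_user_role_ids,
         GROUP_CONCAT(r.role_name ORDER BY r.role_id SEPARATOR ', ') AS app_user_type,
         ur.cnt
FROM     (SELECT   string_of_user_role_ids,
                   COUNT(*) AS cnt
          FROM     user_roles
          GROUP BY string_of_user_role_ids) AS ur
LEFT JOIN roles AS r
       ON FIND_IN_SET(r.role_id, ur.string_of_user_role_ids) > 0
GROUP BY ur.string_of_user_role_ids, ur.cnt
ORDER BY ur.cnt DESC;
```

With this shape, adding a new role means inserting one row into the lookup table rather than editing the query. Id strings with no matching role come back with a NULL app_user_type because of the LEFT JOIN.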

Related Solutions

Mysql – Slow complex query with group/order

I can see a couple of things that should improve your query performance.

1. As you already found out, there is absolutely no need to join mentioncache. Using EXISTS seems more natural (or IN as you did, but EXISTS may work better from a performance point of view).

2. DATE(m.indexed) BETWEEN "2012-09-16" AND "2012-10-16" can be rewritten as m.indexed BETWEEN "2012-09-16" AND "2012-10-16 23:59:59", so that MySQL can use an index on m.indexed. Wrapping the column in DATE() prevents that.

3. urlinfluranks doesn't seem to be used anywhere except in the LEFT JOIN. Why do you need it?

4. f.foreign_id can be either NULL or m.id, and this is the only reference to the favoureditems table, so I'd rather use a subquery in this case.
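
Point 2 can be sketched in isolation as follows. This assumes m.indexed is a DATETIME or TIMESTAMP column with an index on it, which the question does not confirm. The second form uses a half-open range, a small variation on the BETWEEN rewrite that also covers any fractional seconds after 23:59:59.

```sql
-- Applying DATE() to the column forces MySQL to evaluate it for every row,
-- so an ordinary index on m.indexed cannot be used for a range scan.
SELECT m.id
FROM   mentions AS m
WHERE  DATE(m.indexed) BETWEEN '2012-09-16' AND '2012-10-16';

-- Comparing the bare column against constants keeps the predicate index-friendly.
-- The upper bound is exclusive: everything before midnight on 2012-10-17.
SELECT m.id
FROM   mentions AS m
WHERE  m.indexed >= '2012-09-16'
  AND  m.indexed <  '2012-10-17';
```

Running EXPLAIN on each form should show whether the index on m.indexed is actually chosen for your data.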

Finally, I think you can get the same results without GROUP BY m.id (as far as I understand, mentions.id is a primary key).

SELECT   
m.id, m.title, m.title_text, m.content_text, m.url,m.root_url,m.sub_type,m.indexed,  
CASE 
 WHEN EXISTS 
    (SELECT NULL FROM favoureditems f WHERE f.model = "Mention" 
    AND f.foreign_id = m.id AND f.owner_id = 803) THEN m.id 
END AS foreign_id,  -- stands in for f.foreign_id from the removed join
v.foreign_id, v.created, mfs.score,  
Image.id,Image.model,Image.foreign_key, Image.dirname,Image.basename,  
(REPLACE(REPLACE(m.host_url, 'http://www.', ''), 'http://', '')) AS Mention__plain_url  
FROM mentions AS m  

LEFT JOIN 
(
  SELECT id,model,foreign_key,dirname,basename 
  FROM attachments Image  
  WHERE model = 'Mention'
  GROUP BY foreign_key
 )Image  ON (Image.foreign_key = m.id)      

LEFT JOIN 
(
   SELECT v.foreign_id, v.created 
   FROM visiteditems AS v  
   WHERE (v.model = "Mention"  AND v.owner_id = 803)  
    GROUP BY v.foreign_id
)v ON (v.foreign_id = m.id)
LEFT JOIN 
(
   SELECT mention_id,score
   FROM mentionfeedscores mfs  
   WHERE mfs.feed_id = '474737584865424564398208323289092'
   GROUP BY mention_id
)mfs ON (mfs.mention_id = m.id )

WHERE m.indexed BETWEEN "2012-09-16" AND "2012-10-16 23:59:59"  
   AND EXISTS 
  (
     SELECT NULL FROM mentioncache mc  
      WHERE mc.mention_id = m.id AND mc.profile_id = 803  
   )    
ORDER BY m.indexed DESC  
LIMIT 10

MySQL query extremely slow after server restart

My problem has been solved. Thanks for all your replies and comments.

The success seems to come from a couple of things including:

Indexing visits (excluded, website_id, time)
Upping innodb_buffer_pool_size to 8G
Upping sort_buffer_size, read_buffer_size and join_buffer_size to 8M
Possibly adding thread_cache_size = 4 which was suggested from mysqltuner.pl

The overall server memory usage has gone up about 3G (which probably relates to the 3G increase in innodb_buffer_pool_size), but I will just have to compensate for that.
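
For reference, the changes listed above could be applied roughly like this. It is only a sketch: the index name is hypothetical, resizing innodb_buffer_pool_size at runtime assumes MySQL 5.7.5 or later, and SET GLOBAL values do not survive a restart, so the same settings would also need to go into the server configuration file.

```sql
-- Composite index on the columns named above; the index name is an assumption.
ALTER TABLE visits
    ADD INDEX idx_visits_excluded_site_time (excluded, website_id, `time`);

-- Check the current values before changing anything.
SHOW VARIABLES
WHERE Variable_name IN ('innodb_buffer_pool_size', 'sort_buffer_size',
                        'read_buffer_size', 'join_buffer_size',
                        'thread_cache_size');

-- 8G buffer pool (online resizing assumes MySQL 5.7.5+).
SET GLOBAL innodb_buffer_pool_size = 8 * 1024 * 1024 * 1024;

-- 8M per-session buffers; new global values apply to new connections only.
SET GLOBAL sort_buffer_size = 8 * 1024 * 1024;
SET GLOBAL read_buffer_size = 8 * 1024 * 1024;
SET GLOBAL join_buffer_size = 8 * 1024 * 1024;

-- Value suggested by mysqltuner.pl.
SET GLOBAL thread_cache_size = 4;
```

Note that the sort, read, and join buffers are allocated per connection when needed, so raising them increases potential memory use with the number of concurrent sessions.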
